ECE243

Andreas Moshovos

Feb 2024

 

The Audio Device

 

On our lab board there is also an audio device that can record and generate “audio” signals (we will clarify shortly why we write “audio” in quotes rather than plain audio). The core of the audio device is a pair of converters between the analog and the digital domains. The Analog to Digital Converter (ADC) converts analog “audio” input signals into digital samples (explained shortly) that the CPU can then read and manipulate. The Digital to Analog Converter (DAC) performs the reverse function: it accepts digital samples from the CPU and converts them into analog “audio” signals.

 

What is Audio?

 

Let us first discuss what audio is. For the purposes of our discussion (and to avoid going down a rabbit hole trying to cover every possible nuance of what audio really can be), audio is a back-and-forth movement (displacement) of air (or any other physical medium) that, when it reaches our ears, is perceived by us as sound.

 

The graph below, for example, shows the displacement that would cause our ears to perceive a sound. The x-axis shows time and the y-axis shows physical displacement relative to the resting condition (no movement).

 

 

[Figure: example audio waveform, displacement (y-axis) versus time (x-axis)]

 

Converting Audio to Digital Form and Back

 

Through the use of a sensor (a microphone) and an electronic circuit, we can convert sound into digital information that can then be manipulated by a program. There is not just one way of going about this; we will restrict attention to what is available to us on the lab board. An audio converter accepts as input an electrical waveform whose amplitude (voltage) represents the physical displacement sensed by the microphone or some other source. The conversion produces a digital quantity that represents the input amplitude. In our case, the full input voltage range is mapped onto the 32b signed integer number space; that is, -2^31 corresponds to the most negative voltage and 2^31-1 to the most positive voltage.

 

However, audio itself is a continuous-time signal, meaning it has a value at every time instant. Converting a voltage to an integer takes time, and besides, executing even a single instruction takes some time. Accordingly, the output of the ADC contains samples of the input voltage signal. A sample is a measurement of the input voltage at a particular instant, and the samples are taken at regular intervals by the audio device. For our system the sampling frequency is 8KHz, meaning that in one second the device will produce 8,000 samples, each of 32b. So, what the CPU will be able to “see” is a sequence of values in time order, which are samples of the input waveform:

[Figure: samples of the example waveform, shown as dots]

 

For our example waveform above, the signal repeats every 32 samples. Given that the sampling frequency is 8KHz, the waveform will be perceived as a 250Hz sound (8,000 / 32 = 250).

 

 

Similarly to the input conversion, the audio device is also capable of “producing” sound from data written by the processor. The DAC side accepts a sequence of 32b integers which are converted into voltages at the rate of 8KHz (8,000 samples per second, one per input integer). The resulting voltage waveform is then fed to an actuator or any other suitable device to generate or record the sound. A typical actuator is a speaker, which in its simplest form is a surface (typically cone-like) that can move up and down via an electromagnet.

 

The Board’s Audio Device

 

The actual audio processing device is quite complicated. However, the designers of the board have hidden much of this complexity from the processor via a mediator device starting at base address 0xFF203040. This is the one we get to program. It exposes to the CPU the following “registers” in memory:

 

[Figure: memory-mapped register layout of the audio device (control, fifospace, leftdata, rightdata)]

 

 

The device can process and generate two-channel (aka stereo) sound. It has two channels, left and right, each with separate input and output connections.

 

Data Registers: The leftdata and rightdata “registers” are read and write ports to the inputs and outputs of the respective channels. What gets passed through those registers are 32b integers which are interpreted as audio samples. When we write to these registers we are accessing the output, and when we read we are accessing the input. There is no way to write to the input or to read the output.

 

Internally, these registers are connected to FIFO queues. Writing places one more sample in the output FIFO and reading returns the next available sample from the input FIFO. There are four FIFOs, two per channel, one for input and one for output.

For our system each of those FIFOs can contain up to 128 samples. The conversion from/to the analog side happens independently of the CPU: if there are samples in the output FIFOs (the CPU has written them), the audio device will convert them to voltages, consuming them in the process and thus draining the output FIFOs. Similarly, as long as there is space in the input FIFOs, the audio device will place samples there for both channels. Effectively, the CPU delegates work to the audio device, asking it to convert to/from samples at a rate of 8KHz. Note that the CPU can read and write samples much faster than this conversion rate. To ensure that output audio is perceived as continuous, the CPU must ensure that the output FIFO always has samples to convert. Similarly, to ensure that no samples are lost when converting an input signal, the CPU must read them at an average rate of at least 8,000 samples per second. The fact that the buffers have space for 128 samples makes things easier for the CPU. It does not have to read or write a sample every 1/8,000 seconds. Instead, it has to keep up so that the input FIFOs never overflow and the output FIFOs never run empty. For example, it could read or write 128 samples in one go every (1/8,000)x128 seconds. This is 16ms or, assuming 1 instruction per cycle and a clock frequency of 100MHz, every 1.6M instructions (0.016s x 100,000,000 instructions per second). It is a good idea to double-check these numbers yourself; even when they are right, it is useful to have an understanding of the relative speeds.
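The timing budget above can be double-checked with a few lines of C. This is only a sketch of the arithmetic: the 8KHz rate and the 128-entry FIFO depth come from the text, the 100MHz clock with one instruction per cycle is the same assumption made above, and the function names are made up for illustration.

```c
#define SAMPLE_RATE_HZ 8000u       // sampling frequency stated in the text
#define FIFO_DEPTH     128u        // entries per FIFO stated in the text
#define CPU_CLOCK_HZ   100000000u  // ASSUMED: 100MHz, 1 instruction per cycle

// Time the device needs to fill (or drain) a whole FIFO, in microseconds.
unsigned int fifo_period_us(void) {
    return (unsigned int)((unsigned long long)FIFO_DEPTH * 1000000u / SAMPLE_RATE_HZ);
}

// Number of instructions the CPU can execute in that time.
unsigned int fifo_period_instructions(void) {
    return (unsigned int)((unsigned long long)FIFO_DEPTH * CPU_CLOCK_HZ / SAMPLE_RATE_HZ);
}

// Number of instructions the CPU can execute between two consecutive samples.
unsigned int instructions_per_sample(void) {
    return CPU_CLOCK_HZ / SAMPLE_RATE_HZ;
}
```

Under these assumptions the CPU has 12,500 instructions per sample and 1,600,000 instructions per full FIFO, which is why polling loops spend most of their time waiting.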

 

Fifospace Register: The fifospace register has four subfields, each of 8b. These are counters, one for each of the four FIFOs:

·        WSLC and WSRC are the output FIFO counters for the left and the right channels respectively. They report how many FIFO entries are currently empty, that is, how many more samples the CPU can write. The CPU is supposed to check that there is space (the counter is not zero) in the output FIFO before writing another sample.

·        RALC and RARC are the input FIFO occupancy counters respectively for the left and right channels. They report how many input samples are available for the CPU to read. The CPU is supposed to check that these are non-zero before attempting to read a sample.

 

Control Register: It contains several bit fields that the CPU can use to control the device and get status information.

·        By writing 1 to CW the CPU clears both output FIFOs. This remains in effect until the bit is cleared, so the CPU has to first write 1 and then write 0 for output conversion to “resume”. The audio output always has a voltage, which in the absence of actual samples in the output FIFOs will be at “rest”.

·        By writing 1 to CR the CPU clears the input FIFOs. Again both of them. As with CW, the FIFOs remain empty until the CPU writes a 0 to this bit.

·        WE and RE are interrupt enable bits for output and input respectively, whereas WI and RI are the corresponding interrupt indicators. The CPU can enable interrupts by writing a 1 to WE and/or RE. With WE set, the audio device will request an output interrupt when the output FIFOs are less than 25% full. This is a heads-up for the CPU, which is now racing against time: the DAC continues to convert samples from the output FIFO, so the CPU had better fill it with more samples before it runs out. The 25% threshold gives the CPU some time to react and maintain continuity in the output signal.

 

RE enables input interrupts. When it is set, the audio device will request an interrupt once at least 75% of the input FIFO is filled. The intention is for the CPU to go and process those samples. Why 75% full and not 100% full? Because it may take some time for the CPU to accept the interrupt and start consuming the samples, and in the meantime the ADC keeps producing more samples and placing them in the FIFO. The remaining 25% of the FIFO absorbs those samples so that none are lost.
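The control bits can be given names so that code does not rely on bare constants. The sketch below is hedged: the CR and CW masks match the values 0x4 and 0x8 used by the code in these notes, but the positions chosen for RE, WE, RI and WI are assumptions taken from memory of the usual register map for this audio core and should be checked against the board documentation. The helper names are hypothetical.

```c
// CR and CW match the constants 0x4 and 0x8 used elsewhere in these notes.
#define AUDIO_CR 0x004u // clear the input (read) FIFOs
#define AUDIO_CW 0x008u // clear the output (write) FIFOs
// ASSUMED bit positions: verify against the register map of your board.
#define AUDIO_RE 0x001u // input (read) interrupt enable
#define AUDIO_WE 0x002u // output (write) interrupt enable
#define AUDIO_RI 0x100u // input (read) interrupt pending
#define AUDIO_WI 0x200u // output (write) interrupt pending

// Hypothetical helper: empty all four FIFOs, then release them with the
// requested interrupt enables (any combination of AUDIO_RE and AUDIO_WE).
void audio_reset_and_enable(volatile unsigned int *control, unsigned int enables) {
    *control = AUDIO_CR | AUDIO_CW;             // hold all FIFOs empty
    *control = enables & (AUDIO_RE | AUDIO_WE); // clear CR/CW, set enables
}

// Hypothetical helpers: decode the interrupt indicators from a control value.
int audio_input_irq_pending(unsigned int control_value) {
    return (control_value & AUDIO_RI) != 0;
}

int audio_output_irq_pending(unsigned int control_value) {
    return (control_value & AUDIO_WI) != 0;
}
```

On the board the first helper would be called as audio_reset_and_enable(&audiop->control, AUDIO_RE); taking the register address as a parameter also lets the logic be tried out on an ordinary variable.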

 

Programming the Audio Device in C

 

Let us show how we can program the audio device using C instead of writing assembly directly. We can declare pointers to the various registers and use those to read and write them. Here, we will use a structure definition to make our code more readable and manageable. The following structure definition can be used to access the various registers:

 

struct audio_t {

      volatile unsigned int control;

      volatile unsigned char rarc;

      volatile unsigned char ralc;

      volatile unsigned char wsrc;

      volatile unsigned char wslc;

      volatile unsigned int ldata;

      volatile unsigned int rdata;

};

 

struct audio_t *const audiop = ((struct audio_t *)0xff203040);

 

The control register and the FIFO access registers are all defined as unsigned integers, which we know map to words (32b) in our system. We define the fifospace fields separately using char fields instead of a single int. Since our CPU is little endian, the first (lowest-addressed) byte corresponds to rarc and the last to wslc.

 

The next statement declares audiop as a constant pointer that points to the base address of the audio device. The table below lists the offset, relative to the base address, that an access to each field will generate:

Field               Offset

audiop->control     +0

audiop->rarc        +4

audiop->ralc        +5

audiop->wsrc        +6

audiop->wslc        +7

audiop->ldata       +8

audiop->rdata       +12
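The offsets in the table can be verified by the compiler itself. The following is a minimal sketch that repeats the structure definition so that it is self-contained and uses offsetof with C11 static assertions; it assumes, as the text does, that unsigned int is 32b on the target.

```c
#include <stddef.h> // offsetof

struct audio_t {
      volatile unsigned int control;
      volatile unsigned char rarc;
      volatile unsigned char ralc;
      volatile unsigned char wsrc;
      volatile unsigned char wslc;
      volatile unsigned int ldata;
      volatile unsigned int rdata;
};

// Compilation fails if the layout does not match the table.
_Static_assert(offsetof(struct audio_t, control) == 0,  "control at +0");
_Static_assert(offsetof(struct audio_t, rarc)    == 4,  "rarc at +4");
_Static_assert(offsetof(struct audio_t, ralc)    == 5,  "ralc at +5");
_Static_assert(offsetof(struct audio_t, wsrc)    == 6,  "wsrc at +6");
_Static_assert(offsetof(struct audio_t, wslc)    == 7,  "wslc at +7");
_Static_assert(offsetof(struct audio_t, ldata)   == 8,  "ldata at +8");
_Static_assert(offsetof(struct audio_t, rdata)   == 12, "rdata at +12");
_Static_assert(sizeof(struct audio_t) == 16, "four 32b words in total");
```

The four char fields pack into exactly one word, so no padding is inserted before ldata; the assertions make that assumption explicit instead of leaving it implicit.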

 

Audio Output

 

Let us first see how we can output audio. Here’s a C function that outputs a waveform of n samples on both output channels. It uses polling to check whether there is space available in the output FIFOs. Any time space is available it places a sample value by copying it twice, once per channel. The for loop repeats until all n samples have been sent to the audio device.

 

void

audio_tone(int n) {

            int i, s = 0;

 

            audiop->control = 0x8; // clear the output FIFOs

            audiop->control = 0x0; // resume output conversion

            for (i = 0; i < n; ) // output n samples; i advances only after a write

              // output data if there is space in the output FIFOs

              if ((audiop->wsrc != 0) && (audiop->wslc != 0)) {

                  audiop->ldata = s;

                  audiop->rdata = s;

                  s = sample_next(s); // get next sample of sawtooth waveform

                  i++; // one more sample has been sent

              }

}     

 

The sample_next() helper function generates the samples of a sawtooth waveform. It takes a single parameter, sample_c, which is the current sample value, and returns the next sample in the sequence for the desired waveform. In our case it generates the following sample sequence: 0, S, 2S, 3S, 4S, …, SMAX, SMIN, SMIN+S, SMIN+2S, …, 0, S, 2S, … and so on, where S = SAWTOOTH_STEP, SMIN = SAWTOOTH_MIN, and SMAX = SAWTOOTH_MAX are constants. This is a helper function and could be replaced with any other. Our focus here is on the way we communicate with the audio device.
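One possible way to write sample_next() is sketched below. The constant names come from the description above, but their values are not given in these notes, so the ones used here are purely illustrative assumptions, chosen so that the ramp passes through 0 on every period.

```c
// ASSUMED values, for illustration only.
#define SAWTOOTH_STEP 0x00100000          // S: increment per sample
#define SAWTOOTH_MAX  0x00FFFFFF          // SMAX: top of the ramp
#define SAWTOOTH_MIN  (-SAWTOOTH_MAX - 1) // SMIN: bottom of the ramp (-16 * S)

// Returns the sample that follows sample_c in the sawtooth sequence
// 0, S, 2S, ..., SMAX, SMIN, SMIN+S, ..., 0, S, ...
int sample_next(int sample_c) {
    if (sample_c >= SAWTOOTH_MAX)                // at the top: drop to the bottom
        return SAWTOOTH_MIN;
    if (sample_c > SAWTOOTH_MAX - SAWTOOTH_STEP) // next step would overshoot
        return SAWTOOTH_MAX;
    return sample_c + SAWTOOTH_STEP;             // keep ramping up
}
```

With these values one period is 33 samples long (16 from 0 up to 15S, then SMAX, then 16 from SMIN up to -S), so at 8KHz the tone would be roughly 242Hz. A smaller step gives a lower pitch, and a smaller SMAX gives a quieter sound.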

 

Since the output channels perform conversion in parallel, and since we are placing a value in both of them at the same time, it suffices to check one of the occupancy counters:

 

void

audio_tone(int n) {

            int i, s = 0;

 

            audiop->control = 0x8; // clear the output FIFOs

            audiop->control = 0x0; // resume output conversion

            for (i = 0; i < n; ) // output n samples; i advances only after a write

              // output data if there is space in the output FIFOs

              if (audiop->wsrc != 0) {

                  audiop->ldata = s;

                  audiop->rdata = s;

                  s = sample_next(s); // get next sample of sawtooth waveform

                  i++; // one more sample has been sent

              }

}     

 

The function below plays back the samples stored in an array. It assumes that the array contains samples for monophonic sound and copies the same sample to both channels at the same time.

 

void

audio_playback_mono(int *samples, int n) {

            int i;

 

            audiop->control = 0x8; // clear the output FIFOs

            audiop->control = 0x0; // resume output conversion

            for (i = 0; i < n; i++) { // output a sample per iteration

              // wait until there is space at the output FIFOs

              while (audiop->wsrc == 0);

              audiop->ldata = samples[i];

              audiop->rdata = samples[i];

             }

}     

 

An example of what the samples[] array could be follows. It’s just an array of 32b integers.

 

int samples[] = {

0xfffbeb96, 0xffb2de9b, 0xffd2add6, 0x0030bc7c,

0x002dab2e, 0x0063c21d, 0x004a0e3e, 0xfff5ede4,

0x004ed094, 0x008c65d8, 0x0075a7fc, 0x009e4ffe,

...

};

 

int samples_n = sizeof(samples) / sizeof(samples[0]); // how many samples are in the array

 

 

Record and Playback

We present two functions that respectively record and play back BUF_SIZE audio samples from both channels. The first function, audio_record(), reads the samples from the input channels and copies them into the following two buffers in memory. The second function, audio_playback(), reads the samples from these buffers and copies them to the output FIFOs. Both channels operate in parallel and are being read and written at the “same time”.

 

Recording Input Audio

 

int left_buffer[BUF_SIZE];

int right_buffer[BUF_SIZE];

         

void audio_record(void) {        

            int buffer_index;

 

            audiop->control = 0x4; // clear the input FIFOs

            audiop->control = 0x0; // resume input conversion

            buffer_index = 0;

            while (buffer_index < BUF_SIZE) { 

                // read samples if there are any in the input FIFOs

                if (audiop->rarc) {

                      left_buffer[buffer_index] = audiop->ldata;

                      right_buffer[buffer_index] = audiop->rdata;

                      ++buffer_index;

                }

            }

}

 

The outer while loop repeats until we have read BUF_SIZE samples (buffer_index < BUF_SIZE). Inside it, the if checks whether there are samples to read at the moment (audiop->rarc must be non-zero). The code checks only the right channel counter. This is OK since we are reading from both channels at the same time.

 

It then copies into the left_buffer[] and the right_buffer[] one sample at a time by accessing respectively the audiop->ldata and the audiop->rdata.

 

Playing Back Recorded Audio

Here’s the code that outputs BUF_SIZE samples from the two buffers in memory:

 

void audio_playback(void) {

            int buffer_index = 0;

 

            audiop->control = 0x8; // clear the output FIFOs

            audiop->control = 0x0; // resume output conversion

            while (buffer_index < BUF_SIZE) {

              // output data if there is space in the output FIFOs

              if (audiop->wsrc) {

                  audiop->ldata = left_buffer[buffer_index];

                  audiop->rdata = right_buffer[buffer_index];

                  ++buffer_index;

              }

             }

}

 

The code follows a similar logic to the one we used for the input: the outer while repeats until the CPU has managed to output all samples in the buffers. The if checks that there is space in the output FIFOs and, if so, copies the next pair of samples from the buffers by writing to audiop->ldata and audiop->rdata.

 

 

Note that the above code will spend most of its time busy waiting on the sound device to either accept one more sample as output (when the output FIFOs are full) or to produce one more input sample (when we have drained the input FIFOs). This is because the CPU is much faster than the audio conversion rate (100MHz vs. 8KHz).

 

Alternatively, we may want to write data only after the output FIFOs have drained to 75% empty, and to read data only when the input FIFOs are 75% full.

 

Here’s how we can modify the audio_record()’s loop:

 

#define FIFO_THRESHOLD 96 // 75% of 128

   buffer_index = 0;

            while (buffer_index < BUF_SIZE) {  

              if (audiop->rarc > FIFO_THRESHOLD)

                while ((audiop->rarc) && (buffer_index < BUF_SIZE)) {

                      left_buffer[buffer_index] = audiop->ldata;

                      right_buffer[buffer_index] = audiop->rdata;

                      ++buffer_index;

                }

                // do something else

               

             }

 

The “if (audiop->rarc > FIFO_THRESHOLD)” statement checks that the right input FIFO is at least 75% full before it attempts to copy any data to the memory buffers. What follows is a while loop that reads input samples as long as there are any (audiop->rarc != 0) and as long we have not filled out memory buffers (buffer_index < BUF_SIZE).

 

“Connect” the input to the output

 

The following code reads samples from the audio input queues and immediately places them at the output queues. The effect is that whatever audio is being captured by the microphone is immediately played back from the speakers (assuming the input is connected to a microphone and the output to speakers).

 

 

               

void

audio_parrot(void) {

   audiop->control = 0xC; //clear all queues -- CW & CR

   audiop->control = 0; // resume conversion in & out

   while (1)

      while (audiop->rarc != 0) {

                      audiop->ldata = audiop->ldata; // read left input, write left output

                      audiop->rdata = audiop->rdata;

                   }

    }

 

 

Using words for all audio device registers

 

Thus far we used 8b fields to access the FIFO occupancy counters individually. This is very convenient. However, while this may be possible for the audio device, it may not be possible to access the fields individually for some other device which would require all accesses to be word-wide. In this case, we can declare all four “registers” as 32b words as follows:

 

struct audio_t {

       volatile unsigned int control;

       volatile unsigned int status; // contains wslc, wsrc, ralc, rarc

       volatile unsigned int ldata;

       volatile unsigned int rdata;

};

 

We can then use standard C bitwise operations to access the individual fields. For example the following if statement checks that both output queue counters are non-zero:

 

  unsigned int status = audiop->status; // this becomes a ldwio AUDIO_BASE+4

  unsigned int wslc = (status >> 24) & 0xFF; // extract bits 24-31

  unsigned int wsrc = (status >> 16) & 0xFF; // extract bits 16-23

 

  if ((wsrc !=0) && (wslc != 0)) {

     // CODE GOES HERE

  }
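The same shift-and-mask pattern extends to all four counters. The helpers below are a sketch with made-up names; the field positions follow the byte order described earlier (rarc in the lowest byte, wslc in the highest). They operate on a status value that has already been read from the device with a single word access.

```c
// Extract each 8b counter from a 32b fifospace/status word.
unsigned int fifo_rarc(unsigned int status) { return status & 0xFF; }         // bits 0-7
unsigned int fifo_ralc(unsigned int status) { return (status >> 8) & 0xFF; }  // bits 8-15
unsigned int fifo_wsrc(unsigned int status) { return (status >> 16) & 0xFF; } // bits 16-23
unsigned int fifo_wslc(unsigned int status) { return (status >> 24) & 0xFF; } // bits 24-31

// Non-zero if both output FIFOs have room for at least one more sample.
int audio_can_write(unsigned int status) {
    return fifo_wsrc(status) != 0 && fifo_wslc(status) != 0;
}

// Non-zero if both input FIFOs hold at least one sample.
int audio_can_read(unsigned int status) {
    return fifo_rarc(status) != 0 && fifo_ralc(status) != 0;
}
```

A polling loop would then read audiop->status once per iteration and pass the value to audio_can_write() or audio_can_read(), so that all the counters it tests come from the same snapshot of the register.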