ECE243

Andreas Moshovos

Feb 2024

 

The Video Device

 

The video output device on the lab computer can be used to display images on an appropriate external device such as a monitor or projector. This document focuses on the programmer’s view of the video device. In the lectures we overview the physical setup and briefly touch upon display technologies.

 

The document first presents a simplified, yet fully functional, approach to using the video device to display graphics. This takes for granted the default video device configuration, which is the way the video device is configured when the system boots. In the second part of the document, we will discuss how a programmer can manipulate this configuration through the video device’s control and other registers.

 

Simplified View of What a Video Display Projects

 

Let’s first overview how a typical display device presents itself to a user. What does a user “see” when they look at a display? For our video device, and many video/graphics devices out there, the display is a 2D array, or grid, of pixels (picture elements):

 

[Figure: a 320x240 grid of pixels, with the three color components illustrated for pixel (0,0)]

In the above example there are in total 320x240 (columns x rows) pixels. This is the display’s resolution (called CGA for historical reasons). Each pixel has a (column, row) coordinate that we can use to refer to it. The column coordinates vary from 0 to 319, and the row ones from 0 to 239. Pixel (0,0) is at the upper left corner, while pixel (319,239) is at the bottom right corner.

 

The color displays we will be discussing have pixels that comprise three components (view them as subpixels): a red, a green, and a blue (illustrated for pixel (0,0) in the figure). They use these three color components since our eyes have sensors that are most sensitive to light of the corresponding colors (where color is electromagnetic radiation in the range of 400-700nm as per Wikipedia at the time of this writing).

 

The display can adjust the intensity of these colors per pixel and as a result can create many different colors at various intensities. For example, each of the red, green, and blue components could be controlled using 8b, allowing us to specify 256 levels of intensity per color. In total, we would be able to specify 2^24 colors this way. Depending on the quality of the display, these levels may or may not all be visually discernible. In any case, here we will assume that the display is perfect and will display colors perfectly.

 

The video display is connected to our lab board through a VGA connector and “protocol” (meaning, how information is packaged and communicated over the connection). This is a standard, somewhat older, physical video interface. It transfers display contents using a form of bit-serial communication. Other standard connections are DVI, HDMI, DisplayPort, etc. I am sure more will be developed as the years go by. If interested, you can seek more information. This topic will not be covered here as it is orthogonal to our discussion.

 

In all, the display is external to our board and we will be able to communicate what we wish to draw. More on this below.

 

Introduction to the DE1-SoC video device

 

The video device is a sophisticated one and has several registers that we can use to adjust its configuration and behavior. Fortunately, the NIOS II system designer has made sure that by the time the system boots, the video device is in a default configuration that is usable immediately.

 

We will for the time being assume this configuration and focus instead on how to paint an image on the screen by accessing only those memory addresses that contain pixel values. Later on we will cover a more detailed view of the interface the video device presents to programmers.

 

The Default Configuration of the Video Device – How to “Paint” via the default “Frame Buffer”

 

When the system boots, the video device will be using a portion of memory starting at address 0x0800 0000 for holding the values we wish the pixels to take when displayed on the external display. We will refer to this portion of memory as the frame buffer.

 

What is the frame buffer? It is a 2D array of pixel values. The video device continuously reads the pixels one by one, in row-major order (first all the pixels of row 0 left to right, then the pixels of row 1, and so on), and sends them to the display.

 

The video device reads and displays the pixels from the frame buffer continuously and in parallel with the processor. At any given instant only one pixel is read and sent for display. Once a full frame has been displayed, the video device starts over after a short delay. The number of complete frames displayed per second is the refresh rate.

 

The processor can change the image that is displayed by writing the pixels in the frame buffer.

 

First, let’s see how pixel values are represented.

 

The Pixel Format

 

Our video device represents each pixel as a halfword (16 bits). Inside this 16b there are three integers: one for red, one for green, and one for blue. Since there are 16b, these integers are respectively 5b, 6b, and 5b. That is, there are 32 levels of intensity for red and blue, and 64 for green. Why more for green? Our eyes are more sensitive to green.

 

Here’s the diagram from the designers:

[Figure: the 16b pixel layout: red in bits 15-11, green in bits 10-5, blue in bits 4-0]

 

0 means zero intensity (darkness for that color) and intensity grows with the number.

 

For example, 0x0000 corresponds to a dark pixel, whereas 0xFFFF corresponds to a white pixel.

0xF800 is bright red, and 0x001F is bright blue.
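To make the layout above concrete, here’s a small helper that packs the three components into one 16b pixel value (rgb565 is a made-up name, not part of any provided library; this is just a sketch): r and b take values 0-31, g takes 0-63.

```c
// Pack 5b red, 6b green, 5b blue into a 16b pixel, per the 5-6-5 layout.
unsigned short rgb565(unsigned r, unsigned g, unsigned b) {
    return (unsigned short)(((r & 0x1F) << 11) | ((g & 0x3F) << 5) | (b & 0x1F));
}
```

For example, rgb565(31, 63, 31) yields 0xFFFF (white) and rgb565(31, 0, 0) yields 0xF800 (bright red).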

 

The Frame Buffer Format

 

The default resolution is 320 x 240 (columns x rows). So, conceptually the framebuffer is just a 2D array of 320x240 elements, where each element is a half-word. Since this is stored in memory, we have to somehow map the elements into addresses. To simplify the hardware the designers used a clever trick:

 

1.      They require that the frame buffer starts at an address aligned to 2^18 (the lower 18 bits are 0). Note for the time being that 2^18 bytes is the size of an array of halfwords that has in total 512 x 256 elements. This will become important later in our discussion.

2.      They allocate enough space in memory for a 512x256 frame buffer, out of which they use only 320x240 pixels. Conceptually, the frame buffer is stored inside a much bigger 2D array:

 

[Figure: the 320x240 frame buffer stored inside a 512x256 array, with the extra columns and rows as padding]

 

That is, each of rows 0-239 has an extra 512-320 = 192 columns at its end, and there are 16 additional rows: 240-255. Those are padding: they are reserved but do not hold pixel values that the video device will read.

 

 

The way pixels are stored in this space, starting from the base address (0x0800 0000), is as follows. The notation is a bit confusing since the coordinates used are (column, row) = (x,y):

 

Pixel (0,0) is at offset 0, pixel (1,0) is at offset +2, pixel (2,0) is at offset +4, …, pixel (0,1) is at offset +512*2, pixel (1,1) at offset +512*2 + 2, and so on. That is, the pixel values are stored in row-major order. First are all the pixels of row 0, then all the pixels of row 1, and so on. Within each row, the pixels are stored in column order: first the pixel at column 0, then the pixel at column 1, and so on.

 

Note that there are in total 512 pixels per row (320 real and the rest are padding) and each pixel takes two bytes.

 

From all this, a pattern emerges:

 

Pixel (x,y) is at address: base + y * 512 * 2 + x * 2.

 

That is to get to that pixel, we need to skip over y rows (of 512 elements of 2 bytes each), and then x columns (of two bytes each).

 

The above formula can also be written as base + (y << 10) + (x << 1).

 

Recall that an added constraint is that the buffer has to be aligned at 2^18. So, the + above can become an OR, simplifying the hardware. It’s not important now why this simplifies the hardware. This information is relevant only for us to appreciate why we had to go through all the above information.
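The addressing formula can also be written as code (pixel_address is a made-up name for illustration). Because the base is aligned to 2^18 and the offset fits in 18 bits, the additions never carry into the base bits, which is why hardware can use a bit-wise OR instead:

```c
// Byte address of pixel (x, y) inside the default frame buffer.
unsigned pixel_address(unsigned base, unsigned x, unsigned y) {
    return base + (y << 10) + (x << 1); // base + y*512*2 + x*2
}
```

For instance, pixel (0,1) lands 512*2 = 1024 bytes after the base, exactly one (padded) row later.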

 

Accessing pixels from a program

Given the above arrangement, as programmers we can treat the framebuffer as a two-dimensional array of 256 rows and 512 columns where the elements are halfwords. Here’s a C routine pixel_plot() to set the color of the pixel at (x,y). It uses a globally defined variable fbp for the framebuffer base address:

 

struct fb_t { unsigned short volatile  pixels[256][512]; };

 

struct fb_t *const fbp = ((struct fb_t *) 0x8000000);

 

void pixel_plot (int x, int y, short color) {

   fbp->pixels[y][x] = color ;

}

 

We first define the frame buffer as a structure containing a 2D array with 256 rows and 512 columns of shorts (half words for NIOS II):

struct fb_t { unsigned short volatile  pixels[256][512]; };

 

We then initialize a global variable as a pointer to such a structure in memory starting at 0x0800 0000:

 

struct fb_t *const fbp = ((struct fb_t *) 0x8000000);

 

Then we can simply access the pixels using a standard C array:

 

void pixel_plot (int x, int y, short color) {

   fbp->pixels[y][x] = color ;

}

 

The above routine trusts (read: it does not check) that the x and y coordinates are within valid limits.
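If we want to guard against bad coordinates, here’s one possible bounds-checked variant (pixel_plot_safe is a made-up name, not part of the course code; it takes the frame buffer as a parameter, and struct fb_t is repeated so the sketch is self-contained):

```c
// A bounds-checked pixel write: coordinates outside the visible 320x240
// area are ignored instead of corrupting the padding (or worse).
struct fb_t { unsigned short volatile pixels[256][512]; };

int pixel_plot_safe(struct fb_t *fb, int x, int y, unsigned short color) {
    if (x < 0 || x >= 320 || y < 0 || y >= 240)
        return 0; // outside the visible area: do nothing
    fb->pixels[y][x] = color;
    return 1;     // pixel written
}
```

The return value lets the caller tell whether the write actually happened.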

 

Here’s another example routine that sets all pixels in the frame to a desired color:

 

int xn = 320; int yn = 240;

void solid_color(struct fb_t *const fbp, unsigned short color) {
    int x, y;
    for (x = 0; x < xn; x++)
        for (y = 0; y < yn; y++)
            fbp->pixels[y][x] = color; // set pixel value at x,y
}

 

In some cases, we may wish to set individual color components of a pixel’s value. For that we can use bit-wise operations to set only the bits corresponding to the field we want. For example, here’s a sequence of operations in C that sets the blue field (lower 5b) leaving all other fields (red and green) untouched:

 

unsigned short color;

unsigned short blue;

...

color = color & ~0x001F; // zero out the blue field -- ~ is bit-wise invert

color = color | (blue & 0x1F); // set the lower 5b to the lower 5b of variable blue

 

And here’s how we can set the red field:

 

color = color & ~0xF800; // zero out the red field upper 5b

color = color | ((red & 0x1F) << 11); // keep the lower 5b of red, and shift them into the upper 5b positions of a half word, and then or them into color
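The two sequences above follow the same mask-and-shift pattern, so we could fold them into one hypothetical helper (set_field is a made-up name): red is shift 11 with mask 0x1F, green shift 5 with mask 0x3F, blue shift 0 with mask 0x1F.

```c
// Replace one color field of a 16b pixel, leaving the other fields intact.
unsigned short set_field(unsigned short color, unsigned value,
                         unsigned shift, unsigned mask) {
    color &= (unsigned short) ~(mask << shift);          // zero out the field
    color |= (unsigned short) ((value & mask) << shift); // or in the new bits
    return color;
}
```

For example, set_field(color, red, 11, 0x1F) has the same effect as the two red-field statements above.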

 

If you are familiar with C bitfield declarations (otherwise, please skip this step as it may end up more confusing than useful) then you may also use them to have the compiler do all the bitfield manipulation, by declaring pixels as follows (“b:5” means the field is 5 bits long and the compiler can neatly pack these in a “short” along with any other variables that take less than 16 bits – the net effect is that the three variables will be placed into a single 16b value):

 

typedef struct { short b:5,g:6,r:5; } pixel_rgb_t;

 

struct fb_t { pixel_rgb_t volatile  pixels[256][512]; };

 

struct fb_t *const fbp = ((struct fb_t*) 0x8000000);

 

Then we can access individual color fields as follows:

 

   fbp->pixels[y][x].g = 0x3F; // all fields set to max intensity
   fbp->pixels[y][x].r = 0x1F; // end color will be white
   fbp->pixels[y][x].b = 0x1F;

 

Drawing a Sprite

 

In the lectures we will go over code that draws 16x16 sprites, starting from the left of the screen and ending at the right. A sprite is just a small 2D array of pixel values we can simply copy into the frame buffer anywhere we wish. The code also shows a demo of how we can create the perception of movement. The technique is familiar to all of us. The “trick” is to show images representing the movement at successive moments in time. Maybe you have seen this done, or have done it yourself, with a small notebook where these images are drawn on page after page and flipped through quickly. Our brain perceives the rapid succession of images as motion.

 

Here's the routine that draws a 16x16 sprite. The upper left corner of the sprite is placed at pixel (x,y) and the whole sprite occupies a square portion of the frame. The sprite is a row-major 2D array of 16b color values.

 

void sprite_draw(struct fb_t *const fbp, unsigned short sprite[16][16], int x, int y) {
    int sxi, syi; // sprite coordinates (0-15, 0-15)
    int xi, yi;   // frame coordinates (x to x+15, y to y+15)

    for (sxi = 0; sxi < 16; sxi++)
        for (syi = 0; syi < 16; syi++) {
            xi = x + sxi; // coordinates on the frame
            yi = y + syi;
            fbp->pixels[yi][xi] = sprite[syi][sxi];
        }
}
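Note that the routine above will write outside the visible area if the sprite is placed near a screen edge. Here’s one possible clipped variant (sprite_draw_clip is a made-up name, not part of the course code; struct fb_t is repeated so the sketch is self-contained):

```c
// Draw a 16x16 sprite, skipping any pixels that fall outside the visible
// 320x240 area, so a sprite may slide partially off any screen edge.
struct fb_t { unsigned short volatile pixels[256][512]; };

void sprite_draw_clip(struct fb_t *fb, unsigned short sprite[16][16],
                      int x, int y) {
    int sxi, syi;
    for (syi = 0; syi < 16; syi++)
        for (sxi = 0; sxi < 16; sxi++) {
            int xi = x + sxi; // coordinates on the frame
            int yi = y + syi;
            if (xi >= 0 && xi < 320 && yi >= 0 && yi < 240)
                fb->pixels[yi][xi] = sprite[syi][sxi];
        }
}
```

With this variant a sprite can, for example, be placed at (-8,-8) and only its visible lower-right quadrant is drawn.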

 

At the end of this document we present another way of declaring the pixel datatypes that allow us to access them as either a whole 16b value or to access the r, g, and b bit fields without having to directly do bit manipulation. Look at that only if you are comfortable with unions and bitfield declarations in C.

 


 

The Programmer’s Interface to the Video Device

 

Now let’s take a look at the video device’s programmer interface. As an overview, it suffices to note that as programmers we can control where the framebuffer is in memory, we are given the option of having two buffers (which will be handy when we are trying to change what we want the display to show), plus a few other control and configuration options. The NIOS II processor “sees” the video device as a collection of words in memory that it can read and write:

[Figure: the register map of the video device, as seen by the processor]

 

The video device (“pixel buffer controller” as per the manual – a more apt name for the device as presented to NIOS II, but we will be using “video device” in this document) comprises five 32-bit “registers”. Its base address is 0xFF203020.

 

At offset +0 is the “Buffer” pointer register. This contains the address where the frame buffer is located in memory. By default, this is set automatically when the system boots to 0x0800 0000 (the default frame buffer we were using earlier in these notes). A program can read this register to find out where the buffer currently is. Writing to this register does not change its value but instead swaps its value with that of the “BackBuffer” register below. Why this is the case we will explain after we list the remaining registers. It is functionality that helps when we want to ensure what is displayed is consistent with what we wish to draw (it takes time to draw things and recall, the video device is continuously sending pixels to the display one at a time). Note that in the table above, the “Buffer” register is shown as “read only”, which is not the case.

 

At offset +4 is the “BackBuffer” pointer register. This contains the address where a second frame buffer is located in memory. A program can change the location of this buffer by writing this register. It can also read it to find out where the buffer currently is. This buffer provides functionality for avoiding artifacts when changing what we want the display to show (more on this shortly). For the time being it suffices to know that the video device is not actively using this 2nd buffer unless we explicitly instruct it to.

 

At offset +8 is the “Resolution” register. The processor can read this register to find out the resolution currently in use. For our purposes, we can assume that this is fixed at 240 (y = rows) x 320 (x = columns). It is broken into two halves of 16 bits each. The upper half (bits 31-16) reports y, and the lower half (bits 15-0) reports x. While we assume a fixed resolution, we can always write our program in a way that scales to other resolutions so that it works, for example, on future generations of the hardware.
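Since the Resolution register packs two 16b fields into one word, a scalable program might unpack them with shifts and masks, for example along these lines (res_x and res_y are made-up helper names):

```c
// Unpack the Resolution register word:
// bits 15-0 give the number of columns (x), bits 31-16 the rows (y).
unsigned res_x(unsigned resolution) { return resolution & 0xFFFF; }
unsigned res_y(unsigned resolution) { return resolution >> 16; }
```

On the default configuration these would return 320 and 240 respectively.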

 

At offset +12 there are two registers: “Status” (on read) and “Control” (write). As their name suggests they provide information about the current status of the device and control some of its functionality.

 

In C we can declare the following structure and pointer to access these registers:

 

#define uint32 unsigned int

struct videoout_t {
    struct fb_t *volatile fbp;  // front frame buffer (Buffer)
    struct fb_t *volatile bfbp; // back frame buffer (BackBuffer)
    uint32 volatile resolution; // resolution: 2 fields of 16b each, packed into a 32b word
    uint32 volatile StatusControl;
};

 

struct videoout_t *const vp = ((struct videoout_t *) 0xFF203020);

 

The Control Register & the DMA Engine

 

Let’s first explain what the Control register does. We access this register by writing a word to address base+12. Only bit 2 (3rd bit from the right), referred to as “EN”, is meaningful, and when writing, all other bits should be 0. Which is to say that we can either write 0x4 (set EN to 1) or 0x0 (set EN to 0). When EN is 1 the video device is actively reading pixels from the frame buffer and sending them to the display. That is, the video output is active. When EN is 0, you guessed it, the read-and-display process is disabled.

 

·       EN = 1 → the device is actively reading and displaying pixels

·       EN = 0 → video output disabled, no reading of pixels by the device

 

For the hardware-curious, the video device has a component (engine) called Direct Memory Access (DMA for short). This is a simple hardware unit that can perform memory accesses. In our case, the DMA accesses the frame buffer in row-major order, sending the pixel values to the display. Once it completes a full pass over the whole frame, it wraps around and starts again. The process repeats N times per second, where N is the refresh rate. For example, for N = 60Hz the DMA will read the full frame buffer 60 times per second; it should complete a full pass every 1/60th of a second. Every pass reads 320x240x2 bytes, or 150KB. So, per second the DMA will read 150 x N KB. My understanding is that the actual hardware in our lab works with a 60Hz refresh rate. A typical refresh rate for a low-end display is 30Hz, and for a “decent” display 60Hz. That means the DMA engine reads 4.5MB/sec or 9MB/sec (our hardware) respectively.
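The arithmetic above is easy to double-check in code (numbers only; nothing here touches the device, and the function names are made up):

```c
// Bytes the DMA reads per frame and per second, for a 320x240 frame
// with 2 bytes per pixel at a given refresh rate.
unsigned bytes_per_frame(void) { return 320 * 240 * 2; } // 153600 = 150 KB
unsigned bytes_per_second(unsigned refresh_hz) {
    return bytes_per_frame() * refresh_hz;
}
```

At 60Hz this gives 9,216,000 bytes/sec, i.e., roughly the 9MB/sec quoted above.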

 

Code:

// Disable video out: the screen goes blank (on CPUlator this is emulated
// by making all pixels black). Any changes to the frame buffer will not
// be displayed until we re-enable the DMA.
vp->StatusControl &= ~0x4;

// Enable video out:
vp->StatusControl |= 0x4;

 

 

On CPUlator setting EN to zero manifests as the video device display turning “blank” (all pixels show as black). I suspect in the real hardware the display will go blank as in “I’m not receiving any input from the board”. Our program can still change pixel values in the framebuffer, but the video device will not be reading them. The video device is practically disabled. We can re-activate it by making EN 1 again.

 

A program can check whether the DMA is enabled by reading the EN bit through the Status register.

 

 

Why Two Frame Buffers? – “Double Buffering”

 

Let’s now discuss why there are two frame buffer pointer registers in the device. In the default configuration, both “Buffer” and “BackBuffer” point to the same area in memory at 0x0800 0000. However, a program can change the value of “BackBuffer” to some other area in memory. Because of hardware restrictions, if we wish to change the BackBuffer we must pick an area that physically maps to the SDRAM memory chips on the board. These are the addresses that start at 0x0000 0000 and end at 0x03FF FFFF.

 

Now why would we want a second buffer? Here’s an example. Let’s say we wanted to draw a 64x64 rectangle on the display with its upper left corner at (100,100). In total there are 4K pixels we must change, so we will need a few thousand NIOS II instructions. The effect we want is for the rectangle to appear as a whole:

[Figure: the complete red square on a blue background]

What We Want

 

So, what we would want is for the full rectangle to appear as a whole in one frame. Let’s assume that our program writes the pixels for the rectangle in row-major order starting from the top-most row (the order does not actually matter; we are being specific so we can show how the challenge manifests in one specific case). In practice, we may instead get something like this:

 

[Figure: only the bottom portion of the red square appears]

What We Might Get

 

That is, what we might see for one frame is only a bottom portion of the rectangle, and maybe one row being half drawn. How is this possible and what can we do to avoid it? Let’s discuss these questions in order.

 

First: How is this possible?

The key is appreciating that neither drawing the square nor displaying a frame are instantaneous actions. They both take considerable time.

Unless they are explicitly synchronized, these two actions are proceeding in parallel without any consideration of how much progress the other one has made.

 

Effectively, we have two processes running in the system, one on the processor writing to the frame buffer and another on the video device reading from the frame buffer. In pseudo-code the processes are as follows:

 

CPU side:

    for (y = 0; y < 64; y++)
        for (x = 0; x < 64; x++)
            fbp->pixels[y+100][x+100] = RED;

Device side:

    for (dy = 0; dy < 240; dy++)
        for (dx = 0; dx < 320; dx++)
            send to display: fbp->pixels[dy][dx];

 

Let’s focus on the following moment in time illustrated below:

[Figure: the frame buffer with the device pointer partway through the frame]

 

At this point the device has made some progress displaying part of the frame (up to the pixel noted by the square at the “Device Pointer”). The device has read all preceding pixels (all rows before, and the pixels before it in the same row) and has already sent them to the display, which has shown them to us. The device will continue to read and send pixels in the order noted on the diagram.

 

At this moment, the processor starts executing the instructions that write the first pixel of our square. This pixel is not shown on the display yet. The device will have to complete sending the full frame, and then, after it wraps around, it will start from the beginning of the frame buffer. Eventually, at some later time and during the next frame, it will read the red pixel just written by the CPU.

 

The CPU will proceed to write more pixels, turning them red, and will do that while the device is reading pixels and sending them for display. The device sends pixels at a rate of at most 320x240x60 per second (resolution x refresh rate – actually there are some additional delays that are unimportant to our discussion). The important observation is that the device moves a lot slower than the CPU draws the pixels. At some point the CPU will “catch up” with the device and start changing pixels that the device has yet to access for the current frame. This is illustrated below. The device has moved forward and displayed several more pixels: its pointer has advanced to rows below and it has currently only partially drawn its next row. The CPU has caught up and has drawn much of the square (several of the top rows) and part of its next row. After this moment the CPU will keep drawing, and from this point on the device will be reading the newly updated pixels written by the CPU:

[Figure: the device pointer and the CPU’s drawing position after the CPU has caught up]

 

All of this is happening quite fast and at the end, we will perceive this image being displayed on the screen:

[Figure: the partially drawn red square, as displayed for one frame]

And then, a frame later,  the “correct one”:

 

[Figure: the complete red square, displayed one frame later]

 

Double Buffering to the rescue

 

To avoid such artifacts, the video device supports double buffering, where the device is actively reading from the frame buffer pointed to by “Buffer” while the CPU is updating the second frame buffer, pointed to by “BackBuffer”.

 

The intention here is for the CPU to finish making any updates it wishes on the BackBuffer, and then instruct the video device to start using that for display. Conceptually the process is as follows.

 

Let A and B be two frame buffers.

Buffer = A; BackBuffer = B;

 

Concurrently:

      Device uses Buffer to display frames at a rate of 60Hz

      CPU draws what we want in BackBuffer  (may take any amount of time)

      CPU waits until Device finishes drawing the current frame

          CPU asks device to swaps the pointers in Buffer and BackBuffer

 

In plain words, the device uses the frame buffer pointed to by “Buffer” (initially A) to draw frames on the display. In parallel, the CPU is free to draw whatever it wants in the frame buffer pointed to by “BackBuffer” (initially B). The CPU can take as much time as it needs to do its drawing. Once the CPU is done drawing, it can request that the device swap the two frame pointers so that the next frame is taken from the newly updated buffer. This swap does not necessarily happen immediately. Instead, the device performs it only after it finishes displaying the current frame. At that instant, the device swaps the two pointers instantaneously. The net effect: the next frame will be drawn from the completed contents left by the CPU in B, whereas now the CPU is free to use A (which is now pointed to by BackBuffer) to draw the next frame it wishes to display. This process can repeat as needed, with the two frame buffers in memory alternating roles as the CPU instructs.

 

There are three actions that the CPU must be able to perform, beyond writing to frame buffers (functionality we have already discussed):

 

1.      INIT_BACKFB: CPU changes the value of BackBuffer to point to another buffer in memory.

2.      REQSWAP_FB: CPU requests the device to swap Buffer and BackBuffer pointer values.

3.      WAIT_FRAME: CPU waits for the device to complete drawing the current frame and to swap the frame pointers.

 

#1 INIT_BACKFB: Initially, “BackBuffer” also points to 0x0800 0000, which means that there is no second buffer in use. If we wish to use double buffering, then we can change the BackBuffer to point to an appropriately allocated array in memory:

 

struct fb_t backbuffer; // the back buffer

vp->bfbp = &backbuffer; // point BackBuffer at it

 

#2 REQSWAP_FB is performed by writing 1 to Buffer:

 

vp->fbp = (struct fb_t *) 1;

 

As a side effect, the device sets bit S in the Status register to 1, to indicate that the swap request has been accepted and the swap is pending. It has not been performed yet.  When the current frame is done drawing, the swap is “instantaneous” as far as the CPU can tell.

 

In assembly the above statement would map to a single stwio of the value 1 to the address where Buffer is.  We are not supposed to write any other value to Buffer. We cannot change its value. It is by default set to 0x0800 0000.

 

#3 WAIT_FRAME is done by reading the “Status” register and checking bit 0 (the S bit). As long as this is 1, the device is still displaying pixels for the current frame and has not yet reached its end; thus it has not yet swapped the pointers as we asked. The S bit turns to 0 when the swap has been completed. So, the CPU has to poll the “Status” register until S becomes 0:

 

while ((vp->StatusControl & 1) != 0);

 

Once S becomes 0, the swap has completed and the CPU is free to start drawing the next frame into the (newly swapped) back buffer.
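Putting #2 and #3 together, here’s a sketch of a request-and-wait helper (swap_buffers is a made-up name; the register layout from earlier is repeated so the sketch is self-contained, and on the board vp would be (struct videoout_t *) 0xFF203020):

```c
struct fb_t; // the frame buffer type declared earlier in these notes

struct videoout_t {
    struct fb_t *volatile fbp;           // Buffer (writing requests a swap)
    struct fb_t *volatile bfbp;          // BackBuffer
    unsigned int volatile resolution;
    unsigned int volatile StatusControl; // bit 0 (S) is 1 while a swap is pending
};

// REQSWAP_FB followed by WAIT_FRAME: ask for the swap, then poll S.
void swap_buffers(struct videoout_t *vp) {
    vp->fbp = (struct fb_t *) 1; // writing the value 1 requests the swap
    while (vp->StatusControl & 1)
        ; // busy-wait until the device finishes the current frame
}
```

An animation loop would then alternate: draw into the buffer pointed to by bfbp, call swap_buffers, repeat.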

 

 

CPUlator configuration: On the CPUlator when programming the video device with C we will have to use a variable in memory for the backbuffer. The CPUlator will complain if we use the IO load and stores that bypass the caches. One solution would be to declare the backbuffer as non-volatile. Another is to disable the warning of “Memory: Suspicious use of cache bypass”.  We will be using the second method.

 

 

The other bits in the Status Register

 

For our purposes we do not need to be concerned with the other bits found in the Status register, specifically BS, SB, and A. They control how addressing is done for the frame buffer, whether it uses padding or not, how many bytes are used per pixel, and how many pixels the DMA engine reads (these last two I am guessing based on the manual description). Please refer to the manual for more information if interested.

 

 

“Convenient” way to set pixels either as a whole value or set individual color components.

 

If you are familiar with C unions and bitfields, then here’s a way we can declare the framebuffer array that allows us to either access the whole pixel value as a halfword, or to set the red, green, and blue fields individually. If you are not familiar with unions and bit fields, please skip this section for the time being. Your time would be best spent on other material during the limited time we have available on the course.

 

The following structure declares the pixel as having three fields, b, g, and r, respectively using 5 bits, 6 bits, and 5 bits. The C compiler will tightly pack these using as few bits as it can within an existing datatype. Since 5+6+5 is 16b, it will pack them into a short.

 

typedef struct {

    short b:5,g:6,r:5;

} pixel_rgb_t;

 

The following union declares the pixel as something we can access either as a whole halfword, using the v field, or as the above structure, using the c field. Accessing c.r, for example, will return the 5b for r above. Either of v or c needs 16b, and since this is a union they will share a single 16b short, which can be accessed as v or c. It is the same bits regardless.

 

typedef union  {

    pixel_rgb_t c;

    unsigned short v;

}   pixel_t;

 

The following declares the type of the framebuffer. What is different is that we use the above union datatype for each pixel.

 

struct fb_t {

 pixel_t volatile  pixels[256][512];

};

 

Finally, we declare the framebuffer pointer as before:

 

struct fb_t *const fbp = ((struct fb_t*) 0x8000000);

 

 

Here’s how we can now access pixels:

 

As whole values:

 

  fbp->pixels[y][x].v = 0xFFFF; // white

 

Or their individual r, g, b fields:

 

// also white

  fbp->pixels[y][x].c.r = 0x1F;

  fbp->pixels[y][x].c.g = 0x3F;

  fbp->pixels[y][x].c.b = 0x1F;

 

Using C Bitfields to Access individual Status, Control and Resolution Register Fields Directly from C

 

As before, this is given for future reference for those interested in “conveniences” C provides. The following declaration allows us to then access the various fields of some of the registers directly in C without any bit manipulation on our part.

 

 

struct fb_t {

unsigned short volatile pixels[256][512];

};

 

#define uint32 unsigned int

struct videoout_t {

       struct fb_t  *volatile fbp; // front frame buffer

       struct fb_t  *volatile bfbp; // back frame buffer

       uint32 volatile resX:16,resY:16; // resolution 2 16b fields, packed into a 32b word

       uint32 volatile S:1,A:1,EN:1,u5x3:3,SB:2,BS:4,u15x12:4,n:8,m:8;

};

struct videoout_t volatile *const vp = ((struct videoout_t *) 0xFF203020);

 

For example, the following statement waits until a requested swap of the frame buffer pointers has completed:

 

while (vp->S != 0);

 

 

One shortcoming of this declaration is that it is no longer possible to access all 32b of each of those registers in one go. We could declare them as unions as done for the pixels. This is left as an exercise for the interested reader.