Datatypes in Memory and Loads and Stores

Andreas Moshovos, Jan 2024

Let’s now take another look at some of the intrinsic datatypes that NIOS II supports, how they are laid out in memory, and how we can access them using load and store instructions.

Here’s an example set of assembly statements that initialize a few values in memory:

               .data

tucan:     .byte 0x1F, 0x21, 0xFE, 0xAB

p4:         .hword 0x1234, 0xFFFE

p12:       .word 0x12345678, 0xfedcba98

Foo:       .space 4

 

Before reading further, it would be best to try to figure out on your own how memory will be initialized given the above assembly “code”. Keep in mind that NIOS II is byte addressable and that it stores values in little endian order.

Ok, let’s confirm our expectations. Let’s assume that tucan will be at address 0x100.

The following line initializes addresses 0x100, 0x101, 0x102, and 0x103 to the values listed, in that order. That is, address 0x100 will contain 0x1F, address 0x101 will contain 0x21, and so on.

 tucan:     .byte 0x1F, 0x21, 0xFE, 0xAB

 

The following line allocates two half-words, each of 2B starting at address 0x104. Since these are stored in little-endian order, for each one, first we will store their least significant byte. At the end, in memory, addresses 0x104 to 0x107 will contain respectively 0x34, 0x12, 0xFE, 0xFF.

p4:         .hword 0x1234, 0xFFFE

 

Similarly, the next line allocates 8B in total for two words of 4B each starting at address 0x108. The values will be 0x78, 0x56, 0x34, 0x12, 0x98, 0xba, 0xdc, 0xfe.

p12:       .word 0x12345678, 0xfedcba98

 

Finally, the line below just reserves 4 bytes but does not initialize them to anything. Whatever values happen to be there will be the ones that the program will have access to:

Foo:       .space 4
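The layout described above can also be checked on a host machine. What follows is a minimal sketch in C, not NIOS II code: it models memory as a byte array and mimics what .byte, .hword, and .word do. The names mem, BASE, put_byte, put_hword, put_word, and init_data are made up for this illustration, and the starting address 0x100 is the same assumption used in the text.

```c
#include <assert.h>
#include <stdint.h>

#define BASE 0x100u   /* assumed address of tucan, as in the text */
#define SIZE 20u      /* 4 bytes + 2 half-words + 2 words + 4 reserved bytes */

uint8_t mem[SIZE];    /* mem[i] models the byte at address BASE + i */

/* Model of .byte: place one byte at addr. */
void put_byte(uint32_t addr, uint8_t v) {
    assert(addr >= BASE && addr - BASE < SIZE);
    mem[addr - BASE] = v;
}

/* Model of .hword: least significant byte first (little endian). */
void put_hword(uint32_t addr, uint16_t v) {
    put_byte(addr,     (uint8_t)(v & 0xFF));
    put_byte(addr + 1, (uint8_t)(v >> 8));
}

/* Model of .word: least significant half-word first, each little endian. */
void put_word(uint32_t addr, uint32_t v) {
    put_hword(addr,     (uint16_t)(v & 0xFFFF));
    put_hword(addr + 2, (uint16_t)(v >> 16));
}

/* Lay out the data section from the text. */
void init_data(void) {
    const uint8_t tucan[4] = {0x1F, 0x21, 0xFE, 0xAB};
    for (uint32_t i = 0; i < 4; i++) {
        put_byte(0x100 + i, tucan[i]);   /* tucan: .byte ...  */
    }
    put_hword(0x104, 0x1234);            /* p4:    .hword ... */
    put_hword(0x106, 0xFFFE);
    put_word(0x108, 0x12345678);         /* p12:   .word ...  */
    put_word(0x10C, 0xFEDCBA98);
    /* Foo (0x110..0x113) is reserved but deliberately not written here. */
}
```

After calling init_data(), mem[0x104 - BASE] through mem[0x107 - BASE] hold 0x34, 0x12, 0xFE, 0xFF, matching the half-word layout worked out above.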

 

Let’s now confirm this with the cpulator:

            .org 0x100  # this tells the linker to place the values below starting at address 0x100 (avoid using this – we are using it here to control where cpulator will place the values – using it opens up the possibility of having overlapping declarations, etc.)

               .data

tucan: .byte 0x1F, 0x21, 0xFE, 0xAB

p4: .hword 0x1234, 0xFFFE

p12: .word 0x12345678, 0xfedcba98

Foo: .space 4

 

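Here’s how memory looks (one row per four bytes, lowest address first):

Address    +0     +1     +2     +3
0x100      0x1F   0x21   0xFE   0xAB
0x104      0x34   0x12   0xFE   0xFF
0x108      0x78   0x56   0x34   0x12
0x10c      0x98   0xBA   0xDC   0xFE
0x110      (Foo: 4 reserved bytes, not initialized by our code)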

 

Now consider the code below:

.text

    movia r8, tucan

    ldw r9, 8(r8)

    ldw r10, 12(r8)

    ldw r11, 0xc(r8)

   

    ldh r12, 4(r8)

    ldhu r13, 4(r8)

    ldh r14, 6(r8)

    ldhu r15, 6(r8)

   

    ldb r9, 0(r8)

    ldbu r10, 0(r8)

    ldb r11, 3(r8)

    ldbu r12, 3(r8)

   

    ldw r9, 0(r8)

    ldw r10, 4(r8)

   

    ldh r11, 8(r8)

    ldhu r12, 14(r8)

   

    ldb r13, 15(r8)

 

    stb r13, 16(r8)

    stb r12, 17(r8)

    sth r9, 18(r8)

 

Can you figure out what would be the register or memory address that will change by each instruction? Can you figure out what values will they change to?

Try this first without the help of cpulator and then validate your expectations by executing the code using single-step execution.

The answers are below along with an explanation, but, it would be best to read this after trying things on your own, testing your expectations via cpulator and trying to figure out why the results are the way they are.

movia r8, tucan

r8 = 0x100 because tucan is at address 0x100. The label tucan is a 32b constant which we intend to use as an address.

    ldw r9, 8(r8)

Read 4B starting at address 0x108 and place those in little-endian order in r9. The bytes are 0x78, 0x56, 0x34, 0x12 in memory. So, r9 = 0x12345678

    ldw r10, 12(r8)

 

Read 4B starting at address 0x100+12 = 0x10c. The bytes are 0x98, 0xba, 0xdc, 0xfe. The result goes into r10. So, r10 = 0xfedcba98

    ldw r11, 0xc(r8)

 

Read 4B starting at address 0x100+0xc = 0x10c, same as above. Write those 4B into r11. R11 = 0xfedcba98

    ldh r12, 4(r8)

 

Read 2B starting at address 0x100+4=0x104. The bytes there are 0x34, 0x12. As a half-word these are 0x1234. Write this half-word into r12. r12 is 4B long, so we have to extend the value from 16b to 32b. Since this is ldh and NOT ldhu, it treats the half-word read from memory as a signed 2’s complement number. So, we have to fill in the upper 16b of r12 with the sign bit of 0x1234. Written in binary, 0x1234 is 0001 0010 0011 0100. The sign is the most significant bit, which here is 0. So, r12 = 0x00001234 (which we can write as r12 = 0x1234 for convenience; r12 is always 32b, so it does have the leading 0s). The number is positive if viewed as 2’s complement.

    ldhu r13, 4(r8)

Reads 2B as a half-word from address 0x100 (r8) + 4 = 0x104. Sounds familiar? Look at the previous row of this table :) However, this is the unsigned load half word. So, we fill in the upper 16b of r13 (target register) with 0s no matter what. In this case, r13 = 0x1234, same as above.

    ldh r14, 6(r8)

 

Read 2B starting at address 0x106. The bytes are 0xfe, 0xff, and as a half-word they become 0xfffe. Since this is LDH we have to sign-extend before we write into r14. We look at the MSb (most significant bit) of 0xfffe and we see it is 1 (write 0xfffe in binary if you need to). So, at the end r14=0xfffffffe. R14 is -2 expressed in 4B. The value read from memory was 0xfffe which is -2 expressed in 2B. So, we preserved the numerical meaning of the value when we read it as 2B from memory and store it as 4B in the register.

    ldhu r15, 6(r8)

 

As in the previous row, but now we treat the 0xfffe read from memory as an UNSIGNED number. So, we zero-extend it as we write it into r15 to preserve its meaning. R15 = 0x0000FFFE.
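To see why the two loads "preserve the meaning" of the half-word, it can help to compute the number each one has in mind. The C sketch below is only a host-side model under the assumptions of the text (the four bytes at 0x104 to 0x107); the names p4_bytes, read_hword, hword_unsigned, and hword_signed are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes at addresses 0x104..0x107 from the text (p4: .hword 0x1234, 0xFFFE). */
const uint8_t p4_bytes[4] = {0x34, 0x12, 0xFE, 0xFF};

/* Assemble the little-endian half-word at byte offset off from p4. */
uint16_t read_hword(uint32_t off) {
    assert(off + 2 <= sizeof p4_bytes);
    return (uint16_t)(p4_bytes[off] | (p4_bytes[off + 1] << 8));
}

/* The ldhu view: the 16 bits are a number in 0..65535. */
uint32_t hword_unsigned(uint32_t off) {
    return read_hword(off);
}

/* The ldh view: the 16 bits are a 2's complement number in -32768..32767.
 * If bit 15 is set, the value is the raw pattern minus 2^16. */
int32_t hword_signed(uint32_t off) {
    int32_t raw = read_hword(off);
    return raw >= 0x8000 ? raw - 0x10000 : raw;
}
```

With offset 2 (address 0x106), hword_signed returns -2 and hword_unsigned returns 65534. Written as 32b patterns these are 0xfffffffe and 0x0000fffe, the values of r14 and r15 above. With offset 0 (address 0x104) both return 0x1234, since the sign bit is 0.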

    ldb r9, 0(r8)

 

Read 1B from 0x100+0 and sign-extend it to 32b and write it into r9. The memory at 0x100 contains 0x1f, which if viewed as 2’s complement is positive (look at the MSb), so when written into r9 it becomes r9=0x0000001f. Sign-extension = copy the sign bit into the empty positions.

    ldbu r10, 0(r8)

 

As in the previous row, but we just zero out the upper 3B of r10. We don’t look at the value coming from memory to decide this. We always zero them out since this is the UNSIGNED load. R10=0x0000001f.

    ldb r11, 3(r8)

 

Read 1B from 0x100+3, sign-extend it to 32b and write this into r11. The byte at 0x103 is 0xab and its MSb is 1 (0xab = 1010 1011). So, the sign-extended value written into r11 is 0xffffffab.

    ldbu r12, 3(r8)

 

As in the previous row, but since this is the unsigned load, we zero-extend. R12=0x000000ab
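The rule "copy the sign bit into the empty positions" is the same for bytes and half-words; only the width changes. Here is a small C sketch of that rule at the bit level. It is an illustration, not a description of how the hardware is built, and the function names sign_extend and zero_extend are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of v to 32 bits:
 * every position above the sign bit becomes a copy of the sign bit. */
uint32_t sign_extend(uint32_t v, unsigned bits) {
    assert(bits >= 1 && bits <= 32);
    if (bits == 32) {
        return v;                         /* nothing to fill in */
    }
    uint32_t mask = (1u << bits) - 1u;    /* selects the low `bits` bits */
    uint32_t sign = 1u << (bits - 1);     /* position of the sign bit */
    v &= mask;
    return (v & sign) ? (v | ~mask) : v;  /* fill with 1s or leave the 0s */
}

/* Zero-extend: the upper positions are 0 no matter what the value is. */
uint32_t zero_extend(uint32_t v, unsigned bits) {
    assert(bits >= 1 && bits <= 32);
    return bits == 32 ? v : (v & ((1u << bits) - 1u));
}
```

For the byte at 0x103, sign_extend(0xAB, 8) gives 0xffffffab (what ldb produces) and zero_extend(0xAB, 8) gives 0x000000ab (what ldbu produces). For the byte at 0x100, both give 0x0000001f because the sign bit of 0x1f is 0.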

    ldw r9, 0(r8)

 

Read 4B starting at address 0x100+0=0x100. It does not matter that we initialized these bytes using .byte directives. In the end, memory does not care, nor does it have any information about how it got initialized. It only contains the values and they are all bytes. So, now we read the four bytes (0x1f, 0x21, 0xfe, 0xab) and, in little-endian order, we get the following word: 0xabfe211f. This goes into r9=0xabfe211f. No extension is needed since we did read 4B and our register holds exactly 4B, no more.

    ldw r10, 4(r8)

 

Read 4B starting at address 0x104 and write them into r10 in little-endian order. R10=0xfffe1234

    ldh r11, 8(r8)

 

Similarly, it does not matter that we initialized the memory at address 0x100+8=0x108 using .word. Memory has no information about that. It only contains the byte values. So, we read the two bytes at 0x108 and 0x109 (0x78, 0x56) as a half-word in little-endian order (0x5678), sign-extend to 32b (0x00005678), and write this into r11=0x00005678.
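The point that memory is "just bytes" can be illustrated with a single routine that reads 1, 2, or 4 bytes from the same array, regardless of which directive created them. This is a C sketch under the assumptions of the text (tucan at 0x100); mem, BASE, and load_le are hypothetical names, the result is zero-extended only, and the sketch does not model any alignment requirements of the real load instructions.

```c
#include <assert.h>
#include <stdint.h>

#define BASE 0x100u   /* assumed address of tucan, as in the text */

/* The 16 initialized bytes, exactly as worked out in the text. */
const uint8_t mem[16] = {
    0x1F, 0x21, 0xFE, 0xAB,   /* tucan: .byte 0x1F, 0x21, 0xFE, 0xAB */
    0x34, 0x12, 0xFE, 0xFF,   /* p4:    .hword 0x1234, 0xFFFE        */
    0x78, 0x56, 0x34, 0x12,   /* p12:   .word 0x12345678             */
    0x98, 0xBA, 0xDC, 0xFE    /*        .word 0xfedcba98             */
};

/* Read n bytes (1, 2, or 4) starting at addr; the byte at the lowest
 * address becomes the least significant byte of the result. */
uint32_t load_le(uint32_t addr, unsigned n) {
    assert(n == 1 || n == 2 || n == 4);
    assert(addr >= BASE && addr - BASE + n <= sizeof mem);
    uint32_t v = 0;
    for (unsigned i = 0; i < n; i++) {
        v |= (uint32_t)mem[addr - BASE + i] << (8 * i);
    }
    return v;
}
```

For example, load_le(0x100, 4) returns 0xabfe211f even though those bytes came from .byte, and load_le(0x108, 2) returns 0x5678 even though those bytes came from .word.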

    ldhu r12, 14(r8)

 

Read 2B starting at address 0x100+14=0x10e. The bytes there are 0xdc, 0xfe, so we get 0xfedc, zero-extend it to 32b, and write it into r12=0x0000fedc.

    ldb r13, 15(r8)

 

Read 1B from address 0x100+15=0x10F. We get 0xfe. Sign-extend to 32b, we get 0xfffffffe. Write this into r13=0xfffffffe

    stb r13, 16(r8)

 

Write the least significant byte of r13 to memory at address 0x100+16=0x110. Memory[0x110]=0xfe.

Why don’t we sign-extend? Answer below at (*), but think about this before reading it.

    stb r12, 17(r8)

 

Write the least significant byte of r12 (0xdc) to memory at address 0x100+17=0x111. Memory[0x111]=0xdc.

 

    sth r9, 18(r8)

 

Take the least significant half-word of r9 (r9=0xabfe211f, so this is 0x211f) and write it in little-endian order to memory starting at address 0x100+18=0x112.

Memory[0x112]=0x1f

Memory[0x113]=0x21

  

   

   

(*) Why aren’t there signed and unsigned store instructions? In NIOS II, a store instruction takes a value from a register, which always contains 32b, and stores it to memory as either a word (32b), half-word (16b), or byte (8b). In all cases, the destination (memory) has at most as many bits as the source value. So, there are no missing bits to fill in. If anything, we are “chopping” a value to fit it into fewer bits when we store it as a half-word or byte. It is up to us, the programmers, to ensure that the value does fit in that many bits.
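The “chopping” can be sketched in C as well. This is a host-side model only; foo, FOO_BASE, stb_model, and sth_model are names invented for this illustration, and the address 0x110 for Foo follows from the assumption that tucan is at 0x100.

```c
#include <assert.h>
#include <stdint.h>

#define FOO_BASE 0x110u   /* address of Foo if tucan is at 0x100 */

uint8_t foo[4];           /* models the 4 bytes reserved by Foo: .space 4 */

/* Model of stb: keep only the least significant byte of the register. */
void stb_model(uint32_t reg, uint32_t addr) {
    assert(addr >= FOO_BASE && addr - FOO_BASE < sizeof foo);
    foo[addr - FOO_BASE] = (uint8_t)(reg & 0xFF);
}

/* Model of sth: keep only the least significant half-word,
 * written least significant byte first (little endian). */
void sth_model(uint32_t reg, uint32_t addr) {
    assert(addr >= FOO_BASE && addr - FOO_BASE + 2 <= sizeof foo);
    foo[addr - FOO_BASE]     = (uint8_t)(reg & 0xFF);
    foo[addr - FOO_BASE + 1] = (uint8_t)((reg >> 8) & 0xFF);
}
```

Replaying the three stores from the example, stb_model(0xfffffffe, 0x110), stb_model(0x0000fedc, 0x111), and sth_model(0xabfe211f, 0x112) leave foo holding 0xfe, 0xdc, 0x1f, 0x21. Note that the second store keeps only 0xdc: the 0xfe in the next byte of r12 is simply dropped, which is exactly the case where the value does not fit in a byte.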