Datatypes in Memory and Loads and Stores
Andreas Moshovos, Jan 2024
Let’s now take another look at some of the intrinsic datatypes that NIOS II supports, how they are laid out in memory, and how we can access them using load and store instructions.
Here’s an example set of assembly statements that initialize a few values in memory:
.data
tucan:
.byte 0x1F, 0x21, 0xFE, 0xAB
p4:
.hword 0x1234, 0xFFFE
p12:
.word 0x12345678, 0xfedcba98
Foo:
.space 4
Before reading further, it would be best to try to figure out on your own how memory will be initialized given the above assembly “code”. Keep in mind that NIOS II is byte addressable and that it stores values in little endian order.
Ok, let’s confirm our expectations. Let’s assume that tucan will be at address 0x100.
The following line initializes addresses 0x100, 0x101, 0x102, and 0x103 to the values listed, in that order. That is, address 0x101 will contain 0x21.
tucan:
.byte 0x1F, 0x21, 0xFE, 0xAB
The following line allocates two half-words of 2B each, starting at address 0x104. Since these are stored in little-endian order, the least significant byte of each one is stored first. In the end, addresses 0x104 to 0x107 will contain 0x34, 0x12, 0xFE, 0xFF respectively.
p4:
.hword 0x1234, 0xFFFE
Similarly, the next line allocates 8B in total for two words of 4B each, starting at address 0x108. In memory, addresses 0x108 to 0x10f will contain 0x78, 0x56, 0x34, 0x12, 0x98, 0xba, 0xdc, 0xfe respectively.
p12:
.word 0x12345678, 0xfedcba98
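Finally, the line below just reserves 4 bytes without giving them values of our choosing. The GNU assembler fills bytes reserved with .space with zeros unless a fill value is given, but a program should not count on reserved space holding anything in particular. Whatever values happen to be there will be the ones that the program will have access to: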
Foo:
.space 4
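The layout described above can be reproduced with a short Python sketch. This is only a model of what the assembler produces, not part of the NIOS II toolchain: the names `BASE`, `image`, and `dump` are made up for illustration, the base address 0x100 is the same assumption used in the text, and the 4 bytes of Foo are modelled as zeros.

```python
import struct

BASE = 0x100  # assumed address of tucan, as in the text

# "<" selects little-endian packing; B = 1 byte, H = 2 bytes, I = 4 bytes.
image = b"".join([
    struct.pack("<4B", 0x1F, 0x21, 0xFE, 0xAB),   # tucan: .byte
    struct.pack("<2H", 0x1234, 0xFFFE),           # p4:    .hword
    struct.pack("<2I", 0x12345678, 0xFEDCBA98),   # p12:   .word
    bytes(4),                                     # Foo:   .space 4 (modelled as zeros)
])

def dump(image, base):
    """Return one 'address: byte' line per byte of the image."""
    return [f"0x{base + i:03x}: 0x{b:02x}" for i, b in enumerate(image)]

for line in dump(image, BASE):
    print(line)
```

Each printed line pairs one address with the single byte stored there, which is the same view of memory that a byte-addressable machine gives us.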
Let’s now confirm this with cpulator:
.org 0x100 # this tells the linker to place the values below starting at address 0x100
# (avoid using this: we are using it here to control where cpulator will place the values;
# using it opens up the possibility of having overlapping declarations, etc.)
.data
tucan: .byte 0x1F, 0x21, 0xFE, 0xAB
p4: .hword 0x1234, 0xFFFE
p12: .word 0x12345678, 0xfedcba98
Foo: .space 4
Here’s how memory looks:
[Figure: memory contents as displayed by cpulator]
Now consider the code below:
.text
movia r8, tucan
ldw r9, 8(r8)
ldw r10, 12(r8)
ldw r11, 0xc(r8)
ldh r12, 4(r8)
ldhu r13, 4(r8)
ldh r14, 6(r8)
ldhu r15, 6(r8)
ldb r9, 0(r8)
ldbu r10, 0(r8)
ldb r11, 3(r8)
ldbu r12, 3(r8)
ldw r9, 0(r8)
ldw r10, 4(r8)
ldh r11, 8(r8)
ldhu r12, 14(r8)
ldb r13, 15(r8)
stb r13, 16(r8)
stb r12, 17(r8)
sth r9, 18(r8)
Can you figure out which register or memory location each instruction changes, and what value it changes to?
Try this first without the help of cpulator and then validate your expectations by executing the code using single-step execution.
The answers are below along with an explanation, but it would be best to read them only after trying things on your own, testing your expectations with cpulator, and trying to figure out why the results are the way they are.
| Instruction | What happens |
| --- | --- |
| movia r8, tucan | r8 = 0x100 because tucan is address 0x100. tucan is a 32b constant which we intend to use as an address. |
| ldw r9, 8(r8) | Read 4B starting at address 0x100+8 = 0x108 and place them in r9, interpreting them in little-endian order. The bytes in memory are 0x78, 0x56, 0x34, 0x12. So, r9 = 0x12345678. |
| ldw r10, 12(r8) | Read 4B starting at address 0x100+12 = 0x10c. The bytes are 0x98, 0xba, 0xdc, 0xfe. The result goes into r10. So, r10 = 0xfedcba98. |
| ldw r11, 0xc(r8) | Read 4B starting at address 0x100+0xc = 0x10c, same as above. Write those 4B into r11. r11 = 0xfedcba98. |
| ldh r12, 4(r8) | Read 2B starting at address 0x100+4 = 0x104. The bytes there are 0x34, 0x12. As a half-word these are 0x1234. Write this half-word into r12. r12 is 4B long, so we have to extend the value from 16b to 32b. Since this is ldh and NOT ldhu, it treats the half-word read from memory as a signed 2’s complement number. So, we have to fill in the upper 16b of r12 with the sign bit of 0x1234. We write this in binary and see that it is 0001 0010 0011 0100. The sign is the most significant bit, which is 0 here. So, r12 = 0x00001234 (which we can write as r12 = 0x1234 for convenience; r12 is always 32b so it does have the leading 0s). The number is positive if viewed as 2’s complement. |
| ldhu r13, 4(r8) | Read 2B as a half-word from address 0x100 (r8) + 4 = 0x104. Sounds familiar? Look at the previous row of this table :) However, this is the unsigned load half-word. So, we fill in the upper 16b of r13 (the target register) with 0s no matter what. In this case, r13 = 0x1234, same as above. |
| ldh r14, 6(r8) | Read 2B starting at address 0x106. The bytes are 0xfe, 0xff, and as a half-word they become 0xfffe. Since this is ldh we have to sign-extend before we write into r14. We look at the MSb (most significant bit) of 0xfffe and we see it is 1 (write 0xfffe in binary if you need to). So, at the end r14 = 0xfffffffe. r14 is -2 expressed in 4B. The value read from memory was 0xfffe, which is -2 expressed in 2B. So, we preserved the numerical meaning of the value when we read it as 2B from memory and stored it as 4B in the register. |
| ldhu r15, 6(r8) | As in the previous row, but now we treat the 0xfffe read from memory as an UNSIGNED number. So, we zero-extend it as we write it into r15 to preserve its meaning. r15 = 0x0000fffe. |
| ldb r9, 0(r8) | Read 1B from 0x100+0, sign-extend it to 32b, and write it into r9. The memory at 0x100 contains 0x1f, which if viewed as 2’s complement is positive (look at the MSb), so when written into r9 it becomes r9 = 0x0000001f. Sign-extension = copy the sign bit into the empty positions. |
| ldbu r10, 0(r8) | As in the previous row, but we just zero out the upper 3B of r10. We don’t look at the value coming from memory to decide this. We always zero them out since this is the UNSIGNED load. r10 = 0x0000001f. |
| ldb r11, 3(r8) | Read 1B from 0x100+3 = 0x103, sign-extend it to 32b, and write this into r11. The byte at 0x103 is 0xab and its MSb is 1 (0xab = 1010 1011). So, the sign-extended value written into r11 is 0xffffffab. |
| ldbu r12, 3(r8) | As in the previous row, but since this is the unsigned load, we zero-extend. r12 = 0x000000ab. |
| ldw r9, 0(r8) | Read 4B starting at address 0x100+0 = 0x100. It does not matter that we initialized these bytes using .byte directives. In the end, memory does not care, nor does it have any information about how it got initialized. It only contains the values and they are all bytes. So, now we read the four bytes and we get the following word: 0xabfe211f. This goes into r9, so r9 = 0xabfe211f. No extension is needed since we read 4B and our register holds exactly 4B. |
| ldw r10, 4(r8) | Read 4B starting at address 0x104 and write them into r10, interpreting them in little-endian order. r10 = 0xfffe1234. |
| ldh r11, 8(r8) | Similarly, it does not matter that we initialized the memory at address 0x100+8 = 0x108 using .word. Memory has no information about that. It only contains the byte values. So, we read the two bytes at 0x108 and 0x109 as a half-word in little-endian order (0x5678), sign-extend to 32b (0x00005678), and write this into r11. r11 = 0x00005678. |
| ldhu r12, 14(r8) | Read 2B starting at address 0x100+14 = 0x10e. We get 0xfedc, zero-extend it to 32b, and write it into r12. r12 = 0x0000fedc. |
| ldb r13, 15(r8) | Read 1B from address 0x100+15 = 0x10f. We get 0xfe. Sign-extended to 32b, this is 0xfffffffe. Write this into r13. r13 = 0xfffffffe. |
| stb r13, 16(r8) | Write the least significant byte of r13 to memory at address 0x100+16 = 0x110. Memory[0x110] = 0xfe. Why don’t we sign-extend? Answer below at (*), but think about this before reading it. |
| stb r12, 17(r8) | Write the least significant byte of r12 (0xdc) to memory at address 0x100+17 = 0x111. Memory[0x111] = 0xdc. |
| sth r9, 18(r8) | Take the least significant half-word of r9 (0x211f) and write it in little-endian order to memory starting at address 0x100+18 = 0x112. Memory[0x112] = 0x1f, Memory[0x113] = 0x21. |
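The load results in the table can be replayed with a small Python model. This is a sketch under stated assumptions, not how NIOS II is implemented: the helper names (`load`, `ldw`, `ldh`, `ldhu`, `ldb`, `ldbu`) are hypothetical functions named after the instructions, memory starts at the assumed address 0x100, and the model does not check alignment.

```python
BASE = 0x100  # assumed address of tucan, as in the text

# The 16 initialized bytes at addresses 0x100..0x10f, in address order.
MEM = bytes([0x1F, 0x21, 0xFE, 0xAB,
             0x34, 0x12, 0xFE, 0xFF,
             0x78, 0x56, 0x34, 0x12,
             0x98, 0xBA, 0xDC, 0xFE])

def load(addr, size, signed):
    """Read `size` bytes at addr in little-endian order and extend to 32 bits."""
    off = addr - BASE
    value = int.from_bytes(MEM[off:off + size], "little", signed=signed)
    # Masking gives the 32-bit pattern the register would hold:
    # a negative value keeps its upper bits set (sign extension).
    return value & 0xFFFFFFFF

# Hypothetical helpers, one per load instruction.
def ldw(addr):  return load(addr, 4, signed=False)
def ldh(addr):  return load(addr, 2, signed=True)
def ldhu(addr): return load(addr, 2, signed=False)
def ldb(addr):  return load(addr, 1, signed=True)
def ldbu(addr): return load(addr, 1, signed=False)

r8 = BASE
print(hex(ldh(r8 + 6)))   # 0xfffffffe (sign-extended)
print(hex(ldhu(r8 + 6)))  # 0xfffe     (zero-extended)
print(hex(ldb(r8 + 3)))   # 0xffffffab
print(hex(ldw(r8 + 0)))   # 0xabfe211f
```

The only difference between the signed and unsigned helpers is the `signed` flag, which mirrors the point of the table: both variants read the same bytes and differ only in how the upper bits of the register are filled.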
(*) Why aren’t there signed and unsigned store instructions? In NIOS II, a store instruction takes a value from a register, which always contains 32b, and stores it to memory as either a word (32b), half-word (16b), or byte (8b). In all cases, the destination (memory) has at most as many bits as the source value. So, there are no missing bits to fill in. If anything, we are “chopping” a value to fit it into fewer bits when we store it as a half-word or byte. It is up to us, the programmers, to ensure that the value does fit in this many bits.
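The “chopping” done by stores can be illustrated with a similar Python sketch. As before, this is only a model: `store`, `stb`, `sth`, and `stw` are hypothetical helpers, the memory is a 20-byte array standing in for addresses 0x100 to 0x113 (zeroed here for simplicity), and the register values are the ones the three stores in the example see.

```python
BASE = 0x100          # assumed address of tucan, as in the text
mem = bytearray(20)   # stands in for 0x100..0x113; zeroed for simplicity

def store(addr, value, size):
    """Keep only the low `size` bytes of a 32-bit value and write them little-endian."""
    low = value & ((1 << (8 * size)) - 1)   # chop off the upper bits
    off = addr - BASE
    mem[off:off + size] = low.to_bytes(size, "little")

# Hypothetical helpers, one per store instruction (register value first).
def stb(value, addr): store(addr, value, 1)
def sth(value, addr): store(addr, value, 2)
def stw(value, addr): store(addr, value, 4)

r8, r9, r12, r13 = BASE, 0xABFE211F, 0x0000FEDC, 0xFFFFFFFE
stb(r13, r8 + 16)   # Memory[0x110] = 0xfe
stb(r12, r8 + 17)   # Memory[0x111] = 0xdc
sth(r9, r8 + 18)    # Memory[0x112] = 0x1f, Memory[0x113] = 0x21
print(mem[16:20].hex())   # fedc1f21

# 300 = 0x12c does not fit in one byte, so only 0x2c reaches memory.
stb(300, r8 + 0)
print(hex(mem[0]))        # 0x2c
```

The last two lines show why it is the programmer’s job to check the range: the store succeeds silently, and the byte in memory no longer represents 300.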