Stored Program and Instruction Representation
So far, we have seen several examples of a NIOS II programs. We have intentionally avoided to discuss in detail where this program exists inside the machine, how instructions are represented and how the CPU sequences through them. These will concern us in today’s lecture. We just alluded that instructions are stored in memory just like data is and that the CPU uses the PC to access them from memory. We have also seen one example of how the information is encoded in instructions.
The NIOS II very much like any other processor uses the concept of a stored program. That is the instructions that comprise the program we want to execute are represented using binary numbers and are stored inside the memory. So, inside a computer instructions as expected are too represented using a collection of bits. They are also stored in memory very much like anything else inside the computer. How does the CPU know if something is an instruction or not? It does not. Any memory location may potentially contain data or instructions. What differentiates the two is not to be found in memory itself. The only thing that differentiates an instruction from data is the point in time the CPU references it. Recall that a CPU goes through the following steps repeatedly:
1. Get next instruction
2. Decode (i.e., interpret its meaning) the instruction
3. Read Source Operands
4. Perform Operation (e.g., add two numbers)
5. Write Result
6. Determine which is the next instruction
What makes some memory contents an instruction is that it they are referenced in step 1. Similarly what makes some memory contents data is that they are not referenced in step 1 (in our description such data will be referenced in steps 3 and 5 but depending on the processor and the instruction the number of steps we described may not be a good match with what the processor actually does).
In NIOS II we can refine the description of the CPU execution steps as follows:
1. Read four bytes from the memory location pointed to by the contents of PC:
3. Read Sources
4. Perform Operation
5. Write Result
6. Alter PC so that it now points to the next instruction we want to execute. This is either PC + 4 or some other calculation if the instruction is a control flow one such as a branch or a call.
As you can see now step 1 is described in more detail. Instructions are stored in memory and the contents of the PC register are used as the starting address from where the next to be executed instruction is read. Because the length of a NIOS instruction is always four bytes fetching and decoding are conceptually simple: just read four bytes from memory and look at them; they tell you what the instruction is and hence what are the steps required to executed it.
Step 6 which is “determine what is the next instruction” is implemented by changing the value of the register PC. This way, when we return to step 1 for the next instruction, the PC should be pointing at the address where the bytes representing this next instruction are stored.
It is important to emphasize that any memory location can hold data or instructions and that both are just binary quantities. There is nothing in memory that differentiates instructions from data. If a memory access is done at step 1 then the binary quantity read will be interpreted as an instruction otherwise it will be used as data by the current instruction.
So, what is an instruction? An instruction represents an elemental operation that the processor is capable of performing. All operations the processor is capable of are represented as instructions, and each instruction corresponds to one processor operation.
The next question is “what is contained in the instruction?”. The instruction is an encoding that uniquely represents all the information necessary to perform each one of the six steps of instruction executed detailed before. Here are examples of what needs to be represented by an instruction:
1. The operation that must be performed, e.g., an add or an branch.
2. The source and destination operand names (e.g., register names as in or “r9, r9, r10”) and if necessary values (e.g., the immediate in “addi r9, r9, 10”).
3. The way the PC must be updated at the end. Note that this is an example where information may be represented implicitly. For example, all non-control-flow instructions do not explicitly contain any information on how to update the PC. It is implied that PC ought to become “PC + 4”. Only control-flow instructions contain explicit information such as the offset to the target in “beq r9, r0, label”.
How NIOS II Encodes Instructions.
NIOS II is representative of what are called “LOAD/STORE”
or “RISC” instruction set architectures. Other examples,
include MIPS and
Here are the NIOS II instruction formats:
This format includes two register name fields A and B of 5 bits each (since there are 32 registers), an immediate of 16 bits and a 6-bit opcode field. The number contained in the opcode field (OP) uniquely identifies the particular instruction. That is, there is no other instruction that will have the same bit pattern on the corresponding bits. The way the fields appear in the instruction is as follows:
This format is used by instructions that require two registers, and a 16-bit immediate. These instructions include:
1. Arithmetic and logical operations such as addi and ori.
2. Load and store instructions.
3. And branch instructions.
How do we know whether an instruction uses the I-FORMAT? We simply need to look at the OP field and check whether the value there corresponds to one of these instructions. How do we know which those instructions are? They are listed in the processor manual and are hardwired in the decoding circuits: The processor was build with this knowledge and contains this information internally. Think of it as if the processor has a table that lists all opcodes and whether they use the I-FORMAT or not.
Here are few examples of I-FORMATed instructions:
1. addi r11, r9, 129
B = r11, A = r9, Imm16 = 129 and OP = 0x04. So in binary form the instruction is:
In hexadecimal, this is: 0x4ac02044
2. beq r15, r7, -100
A = r15, B = r7, Imm16 = -100 = 1111111110011100, and OP = 0x26. So, in binary the instruction is:
In hexadecimal, this is: 0x3bffe726
You can check the encoding of instructions by compiling and using nios2-elf-objdump –d prog.elf.
The following table lists the OP encoding of all I-FORMAT instructions. Notice that one OP encoding (0x3A) is reserved for R-TYPE or R-FORMAT instructions. This format is explained next.
Before we talk about the R-FORMAT, let’s discuss the J-FORMAT. This is used by the call instruction only. The J-FORMAT includes a 6-bit OP field which must hold the value 0x00 and a 26-bit immediate field. The format is:
IMMED26 field encodes the target address for the call. The target address is calculated as:
Bits 31 through 28 of the call PC, concatenated with IMMED26, concatenated with two 0 bits.
The R format provides for three register names A, B and C of 5 bits each, and an extended OPX opcode field of 6 bits, plus on more field, X, of 5 bits. The R-format instructions also include the OP field of the I-FORMAT instructions but use the unique value of 0x3A to differentiate from I-FORMAT instructions. So, at decode, NIOS II first looks at the OP field (bits 0 through 5) and if these contain the value 0x3A then it knows that it must also look at the OPX field to determine what the instruction is. The actual format is as follows:
The following table lists all OPX values and the corresponding instructions:
Note that not all possible values for OPX are used. The unused values may be used for future expansion, i.e., for introducing new instructions. Most of the instruction use X = 0. Some use the X field for additional information. The details can be found in the processor manual posted on the course website.
Here’s how “xor r21, r3, r9” is encoded:
C = r21, A = r3 and B = r9, OPX = 0x1E and OP = 0x3A
How the Processor Decodes Instructions
We have seen how we can convert instructions into their binary form. How does the processor “understand” what each instruction is? This process is called decode. Decode is possible because instruction encoding is a one-on-one mapping. The rules of instruction encoding are built-in into the hardware design of the processor. Here are examples of how we can decode instructions:
1. Let’s say that the next instruction is: 0xd6a02004.
First let’s write it in binary: 1101 0110 1010 0000 0010 0000 0000 0100.
First let’s look at the lower six bits. These form the OP field which is present in all instructions. The last six bits are: 00 0100. In hexadecimal this is 0x04. Looking at Table 8-1, 0x04 is the OP encoding for ADDI. Now we know how to decode the rest of the bits. The instruction is ADDI B, A, IMM16 in the I-FORMAT described previously. Now we know that:
A = 11010
B = 11010
Imm16 = 10 0000 0010 0000 00.
So the instruction is ADDI r26, r26, IMM16, where IMM16 = 1000 0000 1000 0000, or 0x8080. Imm16 is to be interpreted as a 2’s complement signed integer. So, it’s actually -0111 1111 1000 0000, or -32640.
2. How about 0x31cd883a, or 0011 0001 1100 1101 1000 1000 0011 1010. Again, let’s start with the OP field, the last six bits: 11 1010, or 0x3a in hex. So, this is an R-FORMAT instruction as per Table 8-1. Now we have to look at the OPX field, bits 16 through 11 110001, or 0x31. Table 8-2 shows that this is an ADD. So, now we know that we have to look at fields A, B, and C, and that the instruction is ADD C, A, B. A = 00110, B = 00111, and C = 00110. So, the instruction is ADD r6, r6, r7.
General Observations about Instruction Encoding
When the NIOS II designers were deciding about the NIOS II instruction encoding they operated under a set of requirements and a set of desirable properties:
The encoding should be such that it can uniquely encode all possible (valid) combinations of:
a. Operation (e.g., add and sub)
b. Datatypes (e.g, word, byte)
c. Operands for source and destination.
The encoding must be reversible, i.e., one-to-one. Equivalently, it should be possible given a textual representation of
an instruction (add r1, r2, r3) to determine its encoding in binary and then it should be possible givenq
a binary encoding to determine what is the instruction this corresponds to.
2. DESIRABLE PROPERTIES:
a. The amount of memory required to represent a program should be as small as possible.
b. It should be as easy as possible to decode an instruction.
c. The instruction set should be as uniform as possible. This is hard to define precisely but one facet of it is that there should be as few special-cases as possible. For example, if addition can use three register operands then all other arithmetic operations should be able to use three register operands. This simplifies code generation by making it easy to consider several alternatives. In addition, uniformity implies that the various fields in the instruction format should appear exactly in position in different
instructions. For example, if the destination operand appears in bits 21 through 17 in add then it would be nice if the destination operand also appeared in the same bits for xor. This property simplifies hardware and allows some optimizations examples of which we will see when we explain how to build a working processor.
Often requirements 2.a and 2.b are at odds. The NIOS II designers gave preference to 2.b instead of 2.a: they designed an instruction set that is very simple to decode at the expense of always using four bytes per instruction. Other instruction sets such as those for x85 machines and ARM (embedded system processor) use a variable length instruction encoding. That is instructions do not use the same number of bytes for their representation. There are at least two bytes but there may be more. NIOS II like are other load/store CPU families uses a fixed length instruction encoding where all instructions use the same number of bits for their representation.
As an example of how encoding can be done assumed that we are asked to come up with encoding for a machine that has: 32 Registers each of 4 bytes, 16 different operations, 1 possible datatype (4 bytes) and instructions of the form dst = src1 OP src2 where dst, src1 and src2 are registers (we would need other instructions for accessing memory and sequencing but for clarity and without loss of generality let us assume that these are the only instructions that this machine has). How can we encode these instructions?
There are many different ways. Here’s one.
1. How many bits do we need to encode the operation? Since there are 16 operations 4 bits should be sufficient. So, let’s use the four most significant bits for this purpose. We then we have to come up with a mapping of four bit binary quantities and operations. For example, we could define that 0000 is for an addition, 1000 is for subtraction and so on. Any encoding is valid.
2. Besides the operation we should be able to specify the datatype. Since there is only one datatype we do not need to directly encode this using bits. This can be implied.
3. Finally, we need to encode the three operands. We assumed that these refer to registers. There are 32 registers, let’s call them X0 through X31. To refer to each one of those 32 registers a five bit name is sufficient. That name can be made to encode the binary form of the register number (n in Xn). Since there are three operands, we will need three five bit names.
Our encoding can then be:
As an example, let’s see how we would then encode ADD X1, X15, X31
The encoding will be:
0000 00001 01111 11111
the first four bits (0000) stand for ADD
the next five are for the destination register X1
the next five are for the first source register X15
and the last five are for the second source register X31.