Lecture 12
Andreas Moshovos
Spring 2005
Subroutines
Continued: Passing Arguments, Returning Values and Allocating Local Variables
Thus far we have seen the mechanisms by which a subroutine is called and returns to its caller. In this lecture we look at the mechanisms used for passing arguments and for returning a value. Please keep in mind that what we present is the calling convention used by the popular GNU gcc compiler for the 68k processor family. There are other possibilities, which we will discuss later on.
In this calling convention all parameters are passed by value on the stack and the return value is returned in register D0. Local variables (in a subroutine) are also kept on the stack (or can be allocated to registers). Let us for the time being ignore local variables and look only at passing parameters and returning a value. The code for performing a call generally consists of the following four sections (pre-call, post-call, prologue and epilogue):
Caller              Callee
…                   prologue
pre-call            main body
jsr callee          epilogue
post-call           rts
…
Pre-Call: Prior to making the call, the caller has to take some actions. In particular, the caller pushes the parameters onto the stack in this section. Parameters are pushed in reverse order: we push the last parameter first and the first parameter last. (Explanation; you may want to revisit this discussion after you have seen an example of how parameters are passed: The net result is that, on the stack, the first parameter appears on top, immediately followed by the second parameter, and so on. This also supports subroutines with a variable number of arguments, such as printf. With the first argument on top, the callee can always find it at a fixed location. If it were at the bottom, the callee would need to know how many arguments there are, which is a problem when the number of arguments varies.)
Post-Call: After the call, the caller must de-allocate the stack space it allocated in the pre-call section for passing arguments.
Prologue: In this section the callee will be allocating space for local variables and will be taking appropriate actions for preserving those register values that it should not change. We will revisit this later on.
Epilogue: In this section, the callee will be reversing all actions that took place in the prologue. We will expand on this later on.
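The remark about printf-style subroutines in the Pre-Call discussion can be illustrated in C. The following is a minimal sketch, not part of the lecture's code: sum_ints is a hypothetical function whose first argument says how many int arguments follow. Because the first argument is always at a known place, the callee can read it first and then decide how many more arguments to fetch.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: adds up the `count` int arguments that follow `count`.
   The callee learns the number of arguments only from its first parameter. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    assert(count >= 0);
    va_start(args, count);              /* start right after the first parameter */
    for (int i = 0; i < count; i++)
        total += va_arg(args, int);     /* fetch the next int argument */
    va_end(args);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) would, under the convention described here, push 3, 2, 1 and finally the count 3, so the count sits nearest the top of the stack.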
Let’s see an example:
int add3 (int a, int b, int c)
{ return a + b + c;
}
int sum = 0;
main()
{
sum += add3 (1, 2, 3);
}
Here’s the code for main:
org $20000
main
; Pre-call section
move.l #3, -(a7) ; push third argument onto the stack
move.l #2, -(a7) ; push second argument onto the stack
move.l #1, -(a7) ; push first argument onto the stack
jsr add3 ; call the subroutine
; this also pushes the return address onto the stack
; post-call section
retadd
adda.l #12, a7 ; deallocate the stack space used for the parameters
; we pushed three long words in the pre-call section (the return address pushed by jsr has already been popped by rts)
; thus by adding 12 to the stack pointer we are essentially removing these three long words off the stack
move.l sum, d1 ; read sum into d1
add.l d0, d1 ; add to d1 the value returned by add3 (in d0)
move.l d1, sum ; write the result back to sum in memory
rts
Assuming that when main starts being executed a7 is $70000 then just after the “jsr add3” instruction is executed the stack looks as follows:
         Address    Long-word
A7 →     $6fff0     retadd
         $6fff4     1
         $6fff8     2
         $6fffc     3
Where retadd is the address of the instruction immediately after the jsr.
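The stack picture above can be checked with a small C model. This is only a sketch under stated assumptions: the array mem, the helpers push_long and read_long, and the RETADD placeholder value are all hypothetical and stand in for memory, the "move.l x, -(a7)" instruction, the "disp(a7)" addressing mode, and the real return address respectively.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_TOP   0x70000u   /* initial a7, as assumed in the text */
#define STACK_LONGS 16u        /* hypothetical size of the modelled region */
#define RETADD      0x20018u   /* placeholder; the real value depends on the assembler */

uint32_t mem[STACK_LONGS];     /* models the long words just below STACK_TOP */
uint32_t a7 = STACK_TOP;

/* Map a 68k address inside the modelled region to its array slot. */
uint32_t *slot(uint32_t addr)
{
    uint32_t base = STACK_TOP - 4u * STACK_LONGS;
    assert(addr >= base && addr < STACK_TOP && addr % 4u == 0);
    return &mem[(addr - base) / 4u];
}

/* move.l value, -(a7): decrement a7 by 4, then store */
void push_long(uint32_t value) { a7 -= 4u; *slot(a7) = value; }

/* move.l disp(a7), ...: read the long word at distance disp from the top */
uint32_t read_long(uint32_t disp) { return *slot(a7 + disp); }

/* Pre-call section of main followed by the push that jsr performs. */
void call_add3_setup(void)
{
    a7 = STACK_TOP;
    push_long(3);        /* third argument */
    push_long(2);        /* second argument */
    push_long(1);        /* first argument */
    push_long(RETADD);   /* jsr pushes the return address */
}
```

After call_add3_setup() the model has a7 equal to $6fff0, and read_long(4), read_long(8) and read_long(12) yield the first, second and third arguments.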
Here’s the code for add3:
add3 move.l 12(a7), d0 ; read parameter c (the third) into d0. This is at distance 12 from the top of the stack (see preceding figure)
add.l 8(a7), d0 ; add the second parameter to d0.
add.l 4(a7), d0 ; add the first parameter to d0
rts ; return to the caller
There are no prologue and epilogue sections in add3 since it has no local variables and since it does not change any registers other than d0.
What happens to registers across calls? In the previous example, the callee (add3) did not change any registers other than d0. The caller expects d0 to change as it is used to return a value. What if add3 used other registers? The convention says that all registers except d0, d1, a0, a1 and a7 should be preserved across a call. That is, the caller expects that when the callee returns, these registers will have the same values they had before the callee was called. If we read this statement carefully we can see that it does not say that the registers must not change value while the callee executes. All we have to guarantee is that, before returning to the caller, the registers are loaded with their original values. There are two ways of achieving this: (1) do not touch a register at all, or (2) allow a register to change its value, but remember what value it had prior to the call and restore that value prior to returning to the caller.
For (2) we can do the following. In the subroutine prologue save on the stack the values of all those registers that the routine will change.
In the epilogue restore the registers to their original values using those stored onto the stack.
While this is a contrived example, let us assume that in our previous code main allocated sum into register d3 and that, for whatever reason, add3 used d3:
org $20000
main
clr.l d3 ; temporary sum = 0
move.l #3, -(a7) ; pass arguments onto the stack
move.l #2, -(a7)
move.l #1, -(a7)
jsr add3
adda.l #12, a7 ; deallocate space for parameters (addq accepts only immediates 1 to 8)
add.l d0, d3 ; add return value to temporary sum
move.l d3, sum ; now write sum to memory
rts
add3
; PROLOGUE
move.l d3, -(a7) ; save d3’s value onto the stack
; MAIN BODY
move.l 8(a7), d3 ; read first parameter from the stack
; the first long word on the stack is d3’s original value
; the second is the return address
; hence the first parameter is the third long word (which is at distance 8)
add.l 12(a7), d3 ; add the second parameter
add.l 16(a7), d3 ; add the third parameter
move.l d3, d0 ; move sum into d0 since the caller expects the return value there
; EPILOGUE
move.l (a7)+, d3 ; restore d3’s original value
rts ; return to caller
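The displacements used in the two versions of add3 follow a simple rule, sketched here in C. The helper param_disp is hypothetical and assumes that every parameter and every saved register is a long word and that no local variables have been allocated on the stack.

```c
#include <assert.h>

#define LONG_BYTES 4

/* Displacement from a7, inside the callee, of parameter `index`
   (0 = first parameter) after the prologue has pushed `saved_longs`
   long words. The extra 1 accounts for the return address pushed by jsr. */
int param_disp(int saved_longs, int index)
{
    assert(saved_longs >= 0 && index >= 0);
    return LONG_BYTES * (saved_longs + 1 + index);
}
```

With no saved registers the three parameters are at 4, 8 and 12; once d3 has been pushed they move to 8, 12 and 16, which matches the displacements in the code above.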
Notice that the prologue and epilogue sections are symmetric. One saves register values onto the stack and the other restores them. Restoring is done in the reverse order, because the stack returns values last-in, first-out. For example, if we needed to save and restore registers a2, a3 and a4 we would use the following prologue and epilogue sections:
prologue move.l a2, -(a7)
move.l a3, -(a7)
move.l a4, -(a7)
…
epilogue move.l (a7)+, a4
move.l (a7)+, a3
move.l (a7)+, a2
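Why the epilogue pops in reverse order can be seen in a few lines of C. This is a sketch only: the variables a2, a3 and a4 stand in for the registers, and stack_mem, sp, push and pop are hypothetical helpers modelling "move.l reg, -(a7)" and "move.l (a7)+, reg".

```c
#include <assert.h>
#include <stdint.h>

#define STACK_LONGS 8

uint32_t a2, a3, a4;               /* stand-ins for the address registers */
uint32_t stack_mem[STACK_LONGS];
int sp = STACK_LONGS;              /* index of the top long word; grows downward */

void push(uint32_t v) { assert(sp > 0); stack_mem[--sp] = v; }
uint32_t pop(void)    { assert(sp < STACK_LONGS); return stack_mem[sp++]; }

/* A callee that freely uses a2-a4 but preserves them for its caller. */
void callee(void)
{
    /* prologue: save a2, a3, a4 in that order */
    push(a2);
    push(a3);
    push(a4);

    /* main body: clobber the registers */
    a2 = a3 = a4 = 0xdeadbeefu;

    /* epilogue: the last value pushed is the first one popped */
    a4 = pop();
    a3 = pop();
    a2 = pop();
}
```

Popping in the same order as the pushes would instead load a4's saved value into a2 and a2's into a4.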
Because saving and restoring registers is done often, the 68k provides an instruction that can save or restore multiple registers. This is the movem instruction. Its syntax is:
movem.datatype D_register-list/A_register_list, -(a7)
This pushes onto the stack the D and A registers specified in the list (read on for specific examples).
movem.datatype (a7)+, D_register-list/A_register_list
This restores from the stack the D and A registers specified in the list (read on for specific examples).
In both cases the datatype can be a word (.w) or a long-word (.l).
For example:
movem.l d0-d3/d7/a0-a2/a4/a6, -(a7)
This pushes onto the stack the registers d0, d1, d2, d3 and d7, and a0, a1, a2, a4 and a6.
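When pushing with -(a7), the order of storing is from A7 down to A0, then from D7 down to D0. When restoring with (a7)+, registers are loaded in the opposite order, from D0 up to D7 and then from A0 up to A7, so a push and a pop with the same list match up. The order in which the registers are pushed onto the stack is independent of the order in which they are specified in the movem list. So, “movem.l a0-a2/a4/a6/d7/d0-d3, -(a7)” is equivalent to the previous movem.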
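The fixed ordering of movem can be modelled in C as follows. This is a sketch under assumptions: the register list is represented as a 16-bit set with one bit per register (a hypothetical encoding chosen for readability, not the actual opcode encoding), and the modelled stack pointer sp is kept separate from the regs array for simplicity.

```c
#include <assert.h>
#include <stdint.h>

enum { D0, D1, D2, D3, D4, D5, D6, D7, A0, A1, A2, A3, A4, A5, A6, A7, NUM_REGS };
#define REG(r) ((uint16_t)(1u << (r)))   /* hypothetical one-bit-per-register list */

#define STACK_LONGS 32
uint32_t regs[NUM_REGS];
uint32_t stack_mem[STACK_LONGS];
int sp = STACK_LONGS;                    /* index of the top long word */

/* movem.l <list>, -(a7): store A7 down to A0, then D7 down to D0 */
void movem_push(uint16_t list)
{
    for (int r = NUM_REGS - 1; r >= 0; r--)
        if (list & (1u << r)) {
            assert(sp > 0);
            stack_mem[--sp] = regs[r];
        }
}

/* movem.l (a7)+, <list>: load D0 up to D7, then A0 up to A7 */
void movem_pop(uint16_t list)
{
    for (int r = 0; r < NUM_REGS; r++)
        if (list & (1u << r)) {
            assert(sp < STACK_LONGS);
            regs[r] = stack_mem[sp++];
        }
}
```

Since the list is a set, REG(A6) | REG(D0) and REG(D0) | REG(A6) are the same value, which mirrors the fact that the written order of the list does not matter. In both cases d0 ends up at the top of the stack.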
Callee-Saved vs. Caller-Saved Registers: In our discussion so far it has always been the responsibility of the callee to save and restore registers, but in general it does not have to be so. Registers whose values the callee must preserve across a call are called callee-saved registers. An alternative is to have the caller save, in the pre-call section, those registers it cares about and then restore them in the post-call section. Such registers are called caller-saved. We will return to this discussion later on.
Here’s what main’s code would look like if d3 were a caller-saved register (in this case add3 does not save and restore d3):
org $20000
main
clr.l d3 ; temporary sum = 0
move.l d3, -(a7) ; main saves d3 since we assume it is a caller saved register,
; thus it may change during the call to add3
move.l #3, -(a7) ; pass arguments onto the stack
move.l #2, -(a7)
move.l #1, -(a7)
jsr add3
adda.l #12, a7 ; deallocate space for parameters (addq accepts only immediates 1 to 8)
move.l (a7)+, d3 ; main restores d3, now d3 has the value it had just before the preceding jsr add3 statement
add.l d0, d3 ; add return value to temporary sum
move.l d3, sum ; now write sum to memory
rts
Local Variables? Local variables can either be allocated in registers or on the stack immediately after the space allocated for preserving register values (that is, at lower addresses, closer to the top of the stack).
Stack Frame: This term is used to refer to the stack space allocated per subroutine invocation. Based on our discussion the layout of a stack frame is as follows:
A7 →     Local variables      Allocated by callee
         Saved registers
         Return address
         First Parameter      Allocated by caller
         Second Parameter
         …
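One way to picture this layout is as a C structure whose first member sits at the address held in a7. The sketch below is an illustration only: struct frame and param_offset are hypothetical, and the frame is assumed to hold exactly one long-word local variable, one saved register and two long-word parameters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame, listed from the top of the stack (lowest address) down. */
struct frame {
    uint32_t local0;        /* a7 points here; allocated by the callee */
    uint32_t saved_d3;      /* register saved in the prologue */
    uint32_t return_addr;   /* pushed by jsr */
    uint32_t param1;        /* pushed last by the caller */
    uint32_t param2;        /* pushed first by the caller */
};

/* Displacement from a7 of parameter `index` (0 = first) in this frame. */
size_t param_offset(int index)
{
    assert(index >= 0 && index < 2);
    return offsetof(struct frame, param1) + (size_t)index * sizeof(uint32_t);
}
```

In this particular frame the first parameter would be read with 12(a7) and the second with 16(a7).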
A complete example: The Ackermann recursive subroutine.
The following C code computes the Ackermann function (the subroutine itself is named Ackerman):
unsigned int
Ackerman(unsigned int x, unsigned int y)
{
    if (x == 0) return y+1;
    if (y == 0) return Ackerman (x-1, 1);
    return Ackerman (x-1, Ackerman(x, y-1));
}
Here’s an implementation in 68k assembly:
org $20000
Ackerman
move.l 4(a7), d0 ; d0 = Mem[[a7]+4] = x
bne Xnot0 ; if x == 0 then fall through
move.l 8(a7), d0 ; return y + 1
addq.l #1, d0
jmp epilogue
Xnot0
move.l 8(a7), d0 ; d0 = y
bne Ynot0 ; if y==0 then fall through
; call Ackerman (x-1, 1)
move.l #1, -(a7) ; push 1 as y parameter
move.l 8(a7), d0 ; read x (now at distance +8 since we just pushed one element)
subq.l #1, d0 ; d0 = x -1
move.l d0, -(a7) ; push x-1 as the x parameter
jsr Ackerman
addq.l #8, a7 ; deallocate the space for the two parameters
jmp epilogue ; jump to epilogue for returning
Ynot0
; call Ackerman (x, y-1)
move.l 8(a7), d0 ; read y into d0
subq.l #1, d0 ; d0 = y - 1
move.l d0, -(a7) ; push y-1 as second parameter
move.l 8(a7), d0 ; d0 = x (at distance 8 now since we just pushed a longword on the stack)
move.l d0, -(a7) ; push x as first parameter
jsr Ackerman
addq.l #8, a7 ; deallocate space for the two parameters
; call Ackerman (x-1, tmp) where tmp is the value returned from Ackerman (x, y-1)
move.l d0, -(a7) ; d0 holds the return value of Ackerman (x, y-1), push it on the stack as second parameter
move.l 8(a7), d0 ; read x
subq.l #1, d0 ; calculate x -1
move.l d0, -(a7) ; push x -1 as first parameter
jsr Ackerman
addq.l #8, a7 ; deallocate the two parameters
epilogue
rts ; return to caller
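To check the displacement bookkeeping in the assembly above, the routine can be mirrored step by step in C. The following is a sketch under assumptions, not a 68k emulator: stack_mem, sp, d0, push, jsr_ackerman and ackerman_call are hypothetical names, the stack is indexed in long words (so 4(a7) becomes stack_mem[sp + 1] and 8(a7) becomes stack_mem[sp + 2]), and the return address is modelled by a dummy long word.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_LONGS 65536
uint32_t stack_mem[STACK_LONGS];
int sp = STACK_LONGS;          /* index of the top long word; grows downward */
uint32_t d0;                   /* stand-in for register d0 */

void push(uint32_t v) { assert(sp > 0); stack_mem[--sp] = v; }

void ackerman_body(void);

/* jsr Ackerman ... rts: push a dummy return address, run the body, pop it */
void jsr_ackerman(void)
{
    push(0);
    ackerman_body();
    sp++;
}

void ackerman_body(void)
{
    d0 = stack_mem[sp + 1];                /* move.l 4(a7), d0 ; x */
    if (d0 == 0) {
        d0 = stack_mem[sp + 2] + 1;        /* return y + 1 */
        return;
    }
    d0 = stack_mem[sp + 2];                /* move.l 8(a7), d0 ; y */
    if (d0 == 0) {
        push(1);                           /* y parameter */
        d0 = stack_mem[sp + 2] - 1;        /* x is now at 8(a7) */
        push(d0);                          /* x - 1 */
        jsr_ackerman();
        sp += 2;                           /* addq.l #8, a7 */
        return;
    }
    d0 = stack_mem[sp + 2] - 1;            /* y - 1 */
    push(d0);
    d0 = stack_mem[sp + 2];                /* x is now at 8(a7) */
    push(d0);
    jsr_ackerman();                        /* Ackerman(x, y-1) */
    sp += 2;
    push(d0);                              /* result becomes second parameter */
    d0 = stack_mem[sp + 2] - 1;            /* x - 1 */
    push(d0);
    jsr_ackerman();                        /* Ackerman(x-1, tmp) */
    sp += 2;
}

/* What a caller does: pre-call pushes, jsr, post-call deallocation. */
uint32_t ackerman_call(uint32_t x, uint32_t y)
{
    push(y);
    push(x);
    jsr_ackerman();
    sp += 2;
    return d0;
}
```

For small inputs the model gives the expected values, for example ackerman_call(2, 3) returns 9, and sp is back at its starting value after each call, which confirms that every push is matched by a deallocation.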