Related articles |
---|
[18 earlier articles] |
Re: Caller/Callee saved Registers zsh@cs.princeton.edu (1994-03-25) |
Re: Caller/Callee saved Registers law@fast.cs.utah.edu (1994-03-26) |
Re: Caller/Callee saved Registers hbaker@netcom.com (1994-03-26) |
Re: Caller/Callee saved Registers hbaker@netcom.com (1994-03-26) |
Re: Caller/Callee saved Registers hbaker@netcom.com (1994-03-26) |
Re: Caller/Callee saved Registers anton@mips.complang.tuwien.ac.at (1994-03-28) |
Re: Caller/Callee saved Registers zsh@cs.princeton.edu (1994-03-27) |
Re: Caller/Callee saved Registers pardo@cs.washington.edu (1994-03-28) |
Re: Caller/Callee saved Registers pardo@cs.washington.edu (1994-03-29) |
Re: Caller/Callee saved Registers bart@cs.uoregon.edu (1994-03-29) |
Re: Caller/Callee saved Registers hbaker@netcom.com (1994-03-29) |
Re: Caller/Callee saved Registers hbaker@netcom.com (1994-03-29) |
Re: Caller/Callee saved Registers pardo@cs.washington.edu (1994-03-31) |
[5 later articles] |
Newsgroups: | comp.compilers |
From: | zsh@cs.princeton.edu (Zhong Shao) |
Keywords: | optimize, functional |
Organization: | Princeton University |
References: | 94-03-054 94-03-151 |
Date: | Sun, 27 Mar 1994 21:30:14 GMT |
|>>>Tail-recursions can be implemented more efficiently in callee-save
|>>>registers than on the stack.
|> hbaker@netcom.com (Henry G. Baker) writes:
|> Thus, so long as you don't run out of registers, the overhead of tail
|> recursion is _independent_ of the number of arguments, and essentially
|> identical to the imperative code.
Actually, tail-recursion (or loop) that makes other function calls inside
its body can benefit even more from good use of callee-save registers. For
example, considering the following C code which iteratively applies
function "f" to "x" until it converges to satisfy the predicate "p":
------------
extern int p(float a, float r);
extern float f(float a);
float iter(float x) {
float a,r;
a = x; r = 1.0;
while (p(a,r)) {
r = a;
a = f(a);
}
return a;
}
-------------
[the difference between "iter" here and "fact1" in Henry's example is
that "fact1" doesn't make any other function calls in its loop body.]
Both "p" and "f" here are functions defined in other modules which you
don't really know how they treat the machine registers. Assuming initially
all variables (e.g., "a","p", and "r") all stay in registers. Then,
inside the above "while" loop, both "a" and "p" must be saved into memory
(i.e., stack frame) before the function call to "p", and later be restored
after the call to "p" returns.
If we establish a "callee-save register" convention: "every function
reserves k callee-save registers (say, r1-r4 for k=4)," then we can do the
following:
(1) Before entering the "while" loop:
(a) save "r1-r3" into the memory by building a 3-element record "CR",
put "CR" into r3.
(b) move "a" into r1, and "p" into r2.
(2) Run the while loop as usual, except now even when you are calling
"p" and "f", no register save/restore are necessary because every
function promises to preserve the callee-save register contents (r1-r4)
(3) At the exit of "while" loop, restore the old contents into r1-r4
(basically, retrieve from the "CR" record)
In Standard ML, the above C program is normally written as the following
tail-recursive function, which benefits from the "callee-save registers"
in the same way:
----------------
fun iter(x) =
let fun h(a,r) = if p(a,r) then a
else h(f(a),a)
in h(x,1.0)
end
-----------------
The algorithm that does the above "callee-save register" allocation is
based on the compile-time variable lifetime information and control-flow
information (and is described in our LFP94 paper).
-Zhong Shao
(zsh@cs.princeton.edu)
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.