Subject: Re: SPARC compiler optimisation
Date: Mon, 9 Mar 1992 13:45:07 GMT
In article 92-03-011 Preston Briggs writes:
> I think of aligned LDD and STD as kind of a mistake for RISC machines,
> since they can't be generated by compilers. I expect they're usually
> justified for their usefulness in hand-coded library routines, especially
> block copies and such, where alignment can be tested.
I don't agree with this. With proper design of both compiler and
architecture, these instructions can be quite useful. Here's an example.
At Berkeley, we ported the Aquarius Prolog compiler to both the VLSI-BAM
chip (a RISC-like processor with extensions for Prolog) and the SPARC.
Both of these processors have LDD and STD instructions, but with
slightly different constraints. It is these constraints that decide how
useful the instructions are to a compiler.
On the VLSI-BAM the LDD and STD instructions can load or store _any_ two
registers. The only condition for a load or store from
register+displacement R+D is that R+D be aligned on a double-word
boundary. This condition is easily satisfied by our compiler; we measured
the performance improvement from the double-word memory port to be about
17% (see article in ISCA 1990; this figure is a lump sum for LDD, STD,
STDC, PUSHD, and PUSHDC).
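
To make the alignment condition concrete, here is a minimal sketch in C of
the test a compiler (or a hand-coded library routine) would apply to R+D.
It assumes 4-byte words, so that a double word is 8 bytes; the name
dword_aligned is made up for this illustration and is not taken from the
Aquarius compiler.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4-byte words, so a double word occupies 8 bytes. */
enum { DWORD_BYTES = 8 };

/* Hypothetical helper: true if the effective address R+D
 * (base register value plus signed displacement) falls on a
 * double-word boundary. */
bool dword_aligned(uint32_t base, int32_t disp)
{
    /* Unsigned addition wraps modulo 2^32, like address arithmetic. */
    uint32_t addr = base + (uint32_t)disp;
    return (addr % DWORD_BYTES) == 0;
}
```

When the compiler controls data layout, R is often known to be aligned at
compile time, and the check reduces to whether D is a multiple of 8.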
There are two additional conditions on the SPARC: the source or
destination registers must be _consecutive_ and the first register number
must be _even_. These conditions are too strong for our compiler; only a
small number of LDD & STD instructions are generated. (However, we did
not show that a smarter compiler could not accommodate these additional
conditions.)
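
As a rough illustration of how restrictive the two SPARC register
conditions are, the sketch below tests them for a candidate pair and counts
the pairs that qualify. The function names are invented for this example,
and the count treats all registers alike (it ignores any special-purpose
registers), so take it as an indication only.

```c
#include <stdbool.h>

/* Hypothetical predicate: the SPARC conditions described above.
 * The first register number must be even and the second register
 * must be the one immediately following it. */
bool sparc_pair_ok(int r1, int r2)
{
    return (r1 % 2 == 0) && (r2 == r1 + 1);
}

/* Count the ordered register pairs (r1, r2) that satisfy the SPARC
 * conditions in a file of nregs registers numbered 0..nregs-1. */
int count_sparc_pairs(int nregs)
{
    int count = 0;
    for (int r1 = 0; r1 < nregs; r1++)
        for (int r2 = 0; r2 < nregs; r2++)
            if (sparc_pair_ok(r1, r2))
                count++;
    return count;
}
```

With 32 registers only 16 of the 992 ordered pairs of distinct registers
qualify, so the register allocator has to place both values deliberately;
on a machine that accepts any two registers, no such cooperation is needed.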
be fruitfully used by a RISC machine, as shown by the VLSI-BAM. However,
I do not know whether this is true for the SPARC as well.
BTW, the VLSI-BAM was built together with a cache board, and the
combination achieves its rated speed.
Peter Van Roy
Digital Equipment Corporation    Net: firstname.lastname@example.org
Paris Research Laboratory        Tel: (1) 47 14 28 65
85, avenue Victor Hugo           Fax: (1) 47 14 28 99
92563 RUEIL MALMAISON CEDEX