Related articles |
---|
Re: Why C is much slower than Fortran gneuner@dyn.com (1999-05-16) |
Re: Why C is much slower than Fortran jhallen@world.std.com (1999-05-29) |
Hardware doing the work (Was Re: Why C is much slower than Fortran) creedy@mitretek.org (Chris Reedy) (1999-06-02) |
Re: Hardware doing the work (Was Re: Why C is much slower than Fortran jhallen@world.std.com (1999-06-03) |
Re: Hardware doing the work (Was Re: Why C is much slower than Fortran sjc@netcom.com (1999-06-06) |
Re: Hardware doing the work (Was Re: Why C is much slower than Fortran zalman@netcom.com (1999-06-06) |
From: | jhallen@world.std.com (Joseph H Allen) |
Newsgroups: | comp.compilers,comp.arch |
Date: | 3 Jun 1999 02:25:28 -0400 |
Organization: | The World Public Access UNIX, Brookline, MA |
References: | <3710584B.1C0F05F5@hotmail.com> 99-05-057 99-05-142 99-06-021 |
Keywords: | C, Fortran, architecture |
Chris Reedy <creedy@mitretek.org> wrote:
>For historical reasons, I find this argument unpersuasive. The problem
>is that having hardware handle these issues leads one down a road that
>ultimately ends in very CISCy designs like the ill-fated Intel
>432. This kind of design fails, IMHO, because there is no mechanism
>for the compiler to communicate knowledge obtained from static
>analysis to the hardware so it can take advantage of it during
>execution. The IA64 continues down the road (started by the original
>RISC processors) of requiring _more_, not less, analysis by the
>compiler in order to make the hardware run efficiently.
I don't think the compiler has to do very much to support this. A C
compiler has to unload registers around subroutine calls anyway, so instead
of emitting:
    ld   r7,16(sp)      ; load r7 from its stack slot
    ...
    st   r7,16(sp)      ; save r7 before the call
    jsr  foo
    ld   r7,16(sp)      ; reload r7 after the call
it emits:
    sld   r7,16(sp)     ; speculative load of r7
    ...
    jsr   foo
    check r7            ; reload r7 if 16(sp) changed
This should be a lot easier than generating two versions of the code
(one that assumes the alias and one that doesn't) plus code to test
ahead of time whether the alias exists, as someone else suggested.
That scheme would have particular trouble with the example above,
where the possible alias sits behind a remote function call.
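
For comparison, here is roughly what the two-version scheme looks like when written out by hand in C. This is only a sketch: `scaled_add` and `ranges_overlap` are made-up names, the loop is an arbitrary example, and comparing addresses through `uintptr_t` assumes a flat address space.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if [a, a+n) and [b, b+m) share any element.
   Assumes a flat address space, so the integer comparison is meaningful. */
static int ranges_overlap(const double *a, size_t n,
                          const double *b, size_t m)
{
    uintptr_t a0 = (uintptr_t)a, a1 = (uintptr_t)(a + n);
    uintptr_t b0 = (uintptr_t)b, b1 = (uintptr_t)(b + m);
    return a0 < b1 && b0 < a1;
}

/* dst[i] += *k * src[i], where k may or may not point into dst. */
void scaled_add(double *dst, const double *src, const double *k, size_t n)
{
    if (!ranges_overlap(dst, n, k, 1)) {
        /* Version 1: no alias, so *k can stay in a register. */
        double kr = *k;
        for (size_t i = 0; i < n; i++)
            dst[i] += kr * src[i];
    } else {
        /* Version 2: a store to dst may change *k, so reload it each time. */
        for (size_t i = 0; i < n; i++)
            dst[i] += *k * src[i];
    }
}
```

The test is possible here because every store is visible in the loop. Across a `jsr foo` the caller generally does not know which addresses foo writes, so there is nothing to test before the call.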
Even a more complicated situation is pretty easy:
    ald    r1,16(sp)
    ald    r2,32(sp)
    ald    r3,48(sp)
    ... f(r1,r2,r3)->r4 ...
    jsr    foo
    bcheck r1,redo      ; Branch if 16(sp) changed.
    bcheck r2,redo
    bcheck r3,redo
    bra    skip
redo:
    ald    r1,16(sp)
    ald    r2,32(sp)
    ald    r3,48(sp)
    ... f(r1,r2,r3)->r4 ...
skip:
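
To make the intended behavior concrete, the listing above can be modeled in software. The following C sketch is an assumption-laden toy, not a description of any real processor: the `watched` table, the `store` hook, the use of a sum for f, and the `clobber` flag on `foo` are all invented for illustration, and real hardware would track address ranges rather than exact pointers.

```c
#include <stddef.h>

#define NREGS 8

/* Hypothetical tracking table: one watched address per register.
   NULL means the entry is invalid (never loaded, or stored to since). */
static const int *watched[NREGS];

/* ald: load the value and remember which address it came from. */
static int ald(int reg, const int *addr)
{
    watched[reg] = addr;
    return *addr;
}

/* All stores go through here so the model can snoop them. */
static void store(int *addr, int value)
{
    *addr = value;
    for (int r = 0; r < NREGS; r++)
        if (watched[r] == addr)
            watched[r] = NULL;      /* register copy is now stale */
}

/* bcheck: nonzero if the address loaded into reg has been stored to. */
static int bcheck(int reg)
{
    return watched[reg] == NULL;
}

/* Stand-in for foo: writes the caller's slot only when asked to. */
static void foo(int *p, int clobber)
{
    if (clobber)
        store(p, 100);
}

/* Mirrors the listing: compute f early, redo it only if a check fails. */
int compute(int *a, int *b, int *c, int clobber)
{
    int r1 = ald(1, a), r2 = ald(2, b), r3 = ald(3, c);
    int r4 = r1 + r2 + r3;          /* f(r1,r2,r3)->r4 */

    foo(b, clobber);                /* jsr foo */

    if (bcheck(1) || bcheck(2) || bcheck(3)) {
        /* redo: reload all three values and recompute */
        r1 = ald(1, a);
        r2 = ald(2, b);
        r3 = ald(3, c);
        r4 = r1 + r2 + r3;
    }
    return r4;                      /* skip: */
}
```

When foo leaves the slots alone, the early result is used as is; when it stores to one of them, the check fails and the redo path runs once.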
The analogy with the 432 is a little silly. The IA64 is a
CDC6600-like supercomputer on a chip, not a multi-cycle,
object-oriented, microcoded CISC without a cache or even a
high-bandwidth bus. We are way out on the diminishing-returns curve,
so there is little harm in applying a little cheap hardware help to a
long-standing compiler problem.
>I believe that the "only true solution" will ultimately be
>improvements in programming languages that improve developers ability
>to communicate their intentions to the compiler so that the compiler
>can do a better job of communicating the programmer's intentions to
>the hardware.
I have not been impressed with the programmer-aware solutions
suggested thus far. You could add a keyword to tell the compiler that
a pointer (or whatever) is not aliased, but this leads to broken code.
If I pass the same array as both arguments of a dot-product
routine, I don't want it to fail because the original programmer
decided that nobody was ever going to do that.
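
Here is a small C sketch of how such a promise can go wrong. A dot product only reads its arguments, so to make the effect visible this uses a made-up update routine instead; both function names are hypothetical, and the second version hand-writes the hoisting that a compiler might perform once told that `k` never aliases `dst`. No actual no-alias keyword is used, so both functions have well-defined behavior.

```c
#include <stddef.h>

/* Safe version: rereads *k on every iteration, so it stays correct
   even when k points into dst. */
void scaled_add_safe(double *dst, const double *src,
                     const double *k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += *k * src[i];
}

/* What the compiler may effectively produce under a "k does not alias
   dst" promise: *k is read once and kept in a register. */
void scaled_add_noalias(double *dst, const double *src,
                        const double *k, size_t n)
{
    double kr = *k;                 /* hoisted out of the loop */
    for (size_t i = 0; i < n; i++)
        dst[i] += kr * src[i];
}
```

Called with `k` pointing at `dst[0]`, the two versions disagree from the second element on, because the first iteration updates the value that `k` refers to. A caller who passes overlapping arguments gets the second behavior without any warning.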
>[I agree that better, probably higher level, languages will make it possible
>to write faster and more reliable programs.]
Fortran compilers still produce much faster code (by at least an order
of magnitude) than any functional language compiler I've seen, so I
have my doubts about this.
--
/* jhallen@world.std.com (192.74.137.5) */ /* Joseph H. Allen */