|Ada vs. C performance, was Possible to write compiler to Java VM? email@example.com (Arch Robison) (1997-01-07)|
|Re: Ada vs. C performance, was Possible to write compiler to Dave_Koogler_at_CCISDAPPS1@ppc-191.putnaminv.com (1997-01-09)|
|Re: Ada vs. C performance, was Possible to write compiler to firstname.lastname@example.org (Craig Burley) (1997-01-12)|
|Re: Ada vs. C performance, was Possible to write compiler to email@example.com (Toon Moene) (1997-01-14)|
|Re: Ada vs. C performance, was Possible to write compiler to firstname.lastname@example.org (Christopher Glaeser) (1997-01-17)|
|From:||Toon Moene <email@example.com>|
|Date:||14 Jan 1997 20:20:46 -0500|
|Organization:||Moene Computational Physics, Maartensdijk, The Netherlands|
|References:||97-01-045 97-01-065 97-01-078|
|Keywords:||Fortran, performance, GCC|
Craig Burley <firstname.lastname@example.org> wrote:
>The good news is that we're beginning to teach the gcc back end about
>these new constructs -- at least, for g77, we're in the
>experimental/development phase for some of them.
>A minor example, that I believe rarely amounts to much in practice
Sorry, Craig, if I'd seen your post earlier, I would have pointed this out:
There is a very simple, convincing example of the sort of optimisation the
current backend does not apply, whereas it will when John Carr's
[email@example.com] new alias analysis code is implemented.
Consider the (dumbed down) saxpy routine from BLAS/LAPACK:
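      subroutine saxpy(n, sa, sx, incx, sy, incy)
      integer n, incx, incy, i, ix, iy
      real sa, sx(*), sy(*)
      if (sa .eq. 0.0) return
      ix = 1
      iy = 1
      if (incx .lt. 0) ix = (-n+1)*incx + 1
      if (incy .lt. 0) iy = (-n+1)*incy + 1
      do 10 i = 1, n
         sy(iy) = sy(iy) + sa*sx(ix)
         ix = ix + incx
         iy = iy + incy
   10 continue
      end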
The problem here, for the gcc backend, is that g77 implements the
arguments to the Fortran routine as "call-by-reference", i.e. pointers
to their values in memory are passed. C semantics promise nothing
about aliasing of pointers 1); therefore, the backend _currently_
assumes that the store into sy invalidates the values of sa, incx and
incy in the loop. Not only does this mean that those values have to be
reloaded from memory on every loop iteration, it also means that the
normal strength reduction on the induction variables (ix, iy) can't
take place, because the backend can't prove that incx and incy are
loop invariants.
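
To make the aliasing problem concrete, here is a rough C rendering of what the backend effectively sees. This is only a sketch: the names saxpy_byref and aliasing_demo are made up for illustration, the indices are zero-based, and it is not the code g77 actually emits. The demo call is legal C but would violate Fortran's argument rules, and it shows why a backend that only knows C semantics cannot keep sa in a register.

```c
#include <assert.h>

/* Hypothetical C rendering of the routine: every Fortran argument arrives
   as a pointer. Indices are zero-based here, unlike the Fortran original. */
static void saxpy_byref(const int *n, const float *sa, const float *sx,
                        const int *incx, float *sy, const int *incy)
{
    const int count = *n;          /* Fortran fixes the trip count on entry */
    int ix = 0, iy = 0;

    if (*sa == 0.0f) return;
    if (*incx < 0) ix = (-count + 1) * *incx;
    if (*incy < 0) iy = (-count + 1) * *incy;
    for (int i = 0; i < count; i++) {
        /* With no alias information, the store to sy[iy] might have
           modified *sa, *incx or *incy, so all three are reloaded. */
        sy[iy] += *sa * sx[ix];
        ix += *incx;
        iy += *incy;
    }
}

/* Legal in C, forbidden in Fortran: sa overlaps the first element of sy. */
static void aliasing_demo(float y[3])
{
    const float x[3] = {1.0f, 1.0f, 1.0f};
    const int n = 3, inc = 1;

    y[0] = 2.0f; y[1] = 1.0f; y[2] = 1.0f;
    saxpy_byref(&n, &y[0], x, &inc, y, &inc);
    /* sa changes from 2 to 4 after the first store, so y ends as {4, 5, 5};
       hoisting the load of *sa would have produced {4, 3, 3}. */
    assert(y[0] == 4.0f && y[1] == 5.0f && y[2] == 5.0f);
}
```

Because a Fortran program may not make such a call, a compiler that knows the Fortran rules is free to load sa, incx and incy once before the loop.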
This is all solved by John's work ...
Hope this makes the discussion somewhat less academic ...
1) About a year ago I proposed to Craig applying copy-in-copy-out semantics
(which the Fortran Standard allows) to scalar arguments. That would have
solved *this* too, but at the expense of complicating the Fortran front
end. There are, however, other optimisations possible due to Fortran's
"overlap" restrictions that *that* approach does not deal with.
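
As a rough illustration of the copy-in idea from this footnote, the sketch below reads each scalar argument once into a local variable, again in C as a stand-in for what the front end would hand to the backend. The names saxpy_copyin and copyin_demo are hypothetical, the indices are zero-based, and this is not how g77 is actually implemented.

```c
#include <assert.h>

/* Hypothetical copy-in variant: scalars are read once into locals, so
   stores through sy can no longer appear to modify them. */
static void saxpy_copyin(const int *n_p, const float *sa_p, const float *sx,
                         const int *incx_p, float *sy, const int *incy_p)
{
    const int n = *n_p, incx = *incx_p, incy = *incy_p;   /* copy-in */
    const float sa = *sa_p;                               /* copy-in */
    int ix = 0, iy = 0;

    if (sa == 0.0f) return;
    if (incx < 0) ix = (-n + 1) * incx;
    if (incy < 0) iy = (-n + 1) * incy;
    for (int i = 0; i < n; i++) {
        /* sa, incx and incy are now plain locals and loop invariant.
           sx and sy may still overlap as far as C is concerned, which is
           the part this approach does not address. */
        sy[iy] += sa * sx[ix];
        ix += incx;
        iy += incy;
    }
    /* Copy-out would store back any scalar the routine had assigned;
       this routine assigns none, so there is nothing to write. */
}

/* Small check with a negative stride on x: x is traversed backwards. */
static void copyin_demo(float y[3])
{
    const float x[3] = {1.0f, 2.0f, 3.0f};
    const int n = 3, incx = -1, incy = 1;
    const float sa = 2.0f;

    y[0] = 10.0f; y[1] = 20.0f; y[2] = 30.0f;
    saxpy_copyin(&n, &sa, x, &incx, y, &incy);
    assert(y[0] == 16.0f && y[1] == 24.0f && y[2] == 32.0f);
}
```

With the scalars held in locals, strength reduction on ix and iy becomes possible without any alias analysis, at the price of the extra copies on entry.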
Toon Moene (mailto:firstname.lastname@example.org)
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: mailto:email@example.com; NWP: http://www.knmi.nl/hirlam