From: Toon Moene <toon@moene.indiv.nluug.nl>
Newsgroups: comp.compilers
Date: 14 Jan 1997 20:20:46 -0500
Organization: Moene Computational Physics, Maartensdijk, The Netherlands
References: 97-01-045 97-01-065 97-01-078
Keywords: Fortran, performance, GCC
Craig Burley <burley@gnu.ai.mit.edu> wrote:
>The good news is that we're beginning to teach the gcc back end about
>these new constructs -- at least, for g77, we're in the
>experimental/development phase for some of them.
>A minor example, that I believe rarely amounts to much in practice
Sorry, Craig, if I'd seen your post earlier, I would have pointed this out:
There is a very simple, convincing example of the sort of optimisation the
current backend does not apply, whereas it will when John Carr's
[jfc@mit.edu] new alias analysis code is implemented.
Consider the (dumbed down) saxpy routine from BLAS/LAPACK:
      subroutine saxpy(n,sa,sx,incx,sy,incy)
      real sx(*),sy(*),sa
      integer i,incx,incy,ix,iy,n
      if(n.le.0)return
      if (sa .eq. 0.0) return
      ix = 1
      iy = 1
      if(incx.lt.0)ix = (-n+1)*incx + 1
      if(incy.lt.0)iy = (-n+1)*incy + 1
      do 10 i = 1,n
         sy(iy) = sy(iy) + sa*sx(ix)
         ix = ix + incx
         iy = iy + incy
   10 continue
      end
The problem here, for the gcc backend, is that the arguments to the
Fortran routine are implemented in g77 as "call-by-reference",
i.e. pointers to their values in memory are passed. C semantics
promise nothing about pointer aliasing 1); therefore, the
backend _currently_ assumes that the store into sy invalidates the
values of sa, incx and incy in the loop. Not only does this mean that
those values have to be reloaded from memory on every loop iteration, it
also means that the normal strength reduction on induction variables
(ix, iy) can't take place, because the backend can't prove that incx
and incy are loop invariant.
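To make the aliasing problem concrete, here is a rough C sketch of what
a call-by-reference translation of the routine above looks like to a
back end that only knows C pointer rules. It is not g77's actual output:
the name saxpy_by_ref, the const qualifiers and the 0-based indexing are
assumptions made for illustration.

```c
/* Hypothetical C rendering of the Fortran saxpy above. Every argument,
   scalars included, arrives as a pointer (call-by-reference). */
void saxpy_by_ref(const int *n, const float *sa, const float *sx,
                  const int *incx, float *sy, const int *incy)
{
    int i, ix = 0, iy = 0;              /* 0-based, unlike the Fortran */

    if (*n <= 0) return;
    if (*sa == 0.0f) return;
    if (*incx < 0) ix = (-*n + 1) * *incx;
    if (*incy < 0) iy = (-*n + 1) * *incy;

    for (i = 0; i < *n; i++) {
        /* The store to sy[iy] may legally change *sa in C, so *sa has
           to be read again on the next iteration. */
        sy[iy] = sy[iy] + *sa * sx[ix];
        ix += *incx;                    /* stride re-read through pointer */
        iy += *incy;
    }
}
```

A C caller is allowed to pass &sy[0] as sa. With three-element arrays of
ones and unit strides, the first store changes the scale factor from 1 to
2, so sy ends up as 2, 3, 3 rather than 2, 2, 2. A Fortran caller may not
do this, but nothing in the C-level view of the routine says so, which
is why the loads cannot be hoisted out of the loop.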
This is all solved by John's work ...
Hope this makes the discussion somewhat less academic ...
1) About a year ago, I proposed to Craig that we apply copy-in-copy-out
semantics (which the Fortran Standard allows) to scalar arguments. That
would have solved *this* too, but at the expense of complicating the
Fortran front end. There are, however, other optimisations possible due
to Fortran's "overlap" restrictions that *that* approach does not deal with.
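A minimal C sketch of the copy-in idea in this footnote, under the same
illustrative assumptions (hypothetical name saxpy_copy_in, 0-based
indexing): the scalar arguments are loaded once into locals whose
addresses are never taken, so no store through sy can touch them. The
routine never assigns to its scalar arguments, so the copy-out half
would be a no-op and is omitted.

```c
/* Hypothetical copy-in variant: scalars are read once on entry. */
void saxpy_copy_in(const int *n, const float *sa, const float *sx,
                   const int *incx, float *sy, const int *incy)
{
    const int   count = *n;             /* copy-in of each scalar */
    const float a     = *sa;
    const int   dx    = *incx;
    const int   dy    = *incy;
    int i, ix = 0, iy = 0;              /* 0-based, unlike the Fortran */

    if (count <= 0) return;
    if (a == 0.0f) return;
    if (dx < 0) ix = (-count + 1) * dx;
    if (dy < 0) iy = (-count + 1) * dy;

    for (i = 0; i < count; i++) {
        /* a, dx and dy are locals: visibly loop invariant, so ix and iy
           are ordinary induction variables open to strength reduction. */
        sy[iy] = sy[iy] + a * sx[ix];
        ix += dx;
        iy += dy;
    }
}
```

The array accesses still go through pointers, so the compiler must
still assume that sx and sy may overlap. That is the part this approach
leaves unsolved.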
--
Toon Moene (mailto:toon@moene.indiv.nluug.nl)
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
Phone: +31 346 214290; Fax: +31 346 214286
g77 Support: mailto:fortran@gnu.ai.mit.edu; NWP: http://www.knmi.nl/hirlam