Re: Using C as an UNCOL

Dave Lloyd <dave@occl-cam.demon.co.uk>
13 Jun 1996 20:07:51 -0400

          From comp.compilers


From: Dave Lloyd <dave@occl-cam.demon.co.uk>
Newsgroups: comp.compilers
Date: 13 Jun 1996 20:07:51 -0400
Organization: Compilers Central
References: 96-05-061 96-05-163 96-06-044
Keywords: C, UNCOL, performance

Toon Moene <toon@moene.indiv.nluug.nl> wrote:


> > The situation is more absurd with Fortran 90 which provides more powerful
> > arrays and extensive array syntax that can be highly optimised with
> > comparatively cheap analysis (or for that matter directly invoke
> > hand-optimised BLAS).
>
> Good Fortran compilers (on systems where the vendor has complete
> control, like Cray) indeed do this: calling BLAS routines when
> pattern recognition determines that they can be substituted.
> Fortran 90 is not a prerequisite for that.


My point is that Fortran 90 allows these substitutions to be determined without
complicated pattern recognition. It is easy for an F77 programmer to write a
set of loops that are hard to transform into equivalent vector algebra without
considerable analysis (see Wolfe's book for the lengths that you sometimes have
to go to for things that are 'obviously equivalent'). Further, many pattern
recognisers can match only a relatively small set of patterns, leading to
missed vectorisation when the loops don't quite conform (induction variables
updated several times within a loop defeat some optimisers). F90, by contrast,
constrains the programmer to write vector expressions which are directly
implementable via BLAS calls - or (as with our compiler) which can have code
generated directly for them with full knowledge of the semantics (giving better
heuristics for unrolling, register allocation, loop nesting, temporary copies
required, cache prefetches, etc).
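
To make the contrast concrete, here is a minimal sketch in C of the same
update, y = y + a*x, written two ways. The function names are made up for
illustration and neither version comes from any particular compiler. The first
is roughly what falls out of the F90 statement Y = Y + A*X; the second is the
kind of hand-unrolled loop, with its induction variable bumped twice per trip,
that a small pattern set can fail to recognise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output for the F90 array statement  Y = Y + A*X :
   one counted loop, unit stride, a single induction variable. */
void axpy_direct(size_t n, double a, const double *x, double *y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* The same computation as it might be hand-written in F77 style:
   k is updated twice per iteration, so a matcher that only knows the
   simple form above needs extra analysis to see that this is still
   y = y + a*x. */
void axpy_obscured(size_t n, double a, const double *x, double *y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    size_t k = 0;
    for (size_t i = 0; i < n / 2; i++) {
        y[k] += a * x[k];
        k++;
        y[k] += a * x[k];
        k++;
    }
    if (k < n)                  /* odd n: one element left over */
        y[k] += a * x[k];
}
```

Both functions produce identical results; the difference is only in how much
work the optimiser must do to discover that.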


But if you transform F90 to C you must either call the BLAS from the C (and
there are only a finite number of them) or generate the loops in C and trust
the C back-end to deduce what was explicit. Of course in generating the C
loops, some of the original semantics can be used (such as determining the
loop nesting), but not all, unless your C back-end has some hooks (via pragmas
or attributes) for the front-end. The game also changes if you allow the
front-end to have significant knowledge of the target platform and generate
different C code for a Cray than for a Pentium - with vector algebra, strategic
decisions about the implementation may have to be made quite early.
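
A small sketch of what "using the original semantics" while generating C loops
might look like. Everything here is illustrative: the function names are
invented, and the restrict qualifier (which arrived later, in C99) merely
stands in for the kind of hook a back-end might offer. The first function
passes on the front-end's knowledge that three arrays are distinct. The second
handles an overlapping F90 assignment by choosing the loop direction up front,
which avoids a temporary copy.

```c
#include <assert.h>
#include <stddef.h>

/* F90:  A = B + C  with three distinct arrays. The front-end knows
   nothing overlaps; restrict is one way to tell the C back-end. */
void add_distinct(size_t n, double *restrict a,
                  const double *restrict b, const double *restrict c)
{
    assert(n == 0 || (a != NULL && b != NULL && c != NULL));
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* F90:  A(2:N) = A(1:N-1) + B(2:N)  (shown here with 0-based indices).
   The right-hand side must use the OLD values of A. A forward loop
   would turn this into a recurrence, so the generator runs the loop
   backwards instead of copying A into a temporary. */
void shift_add(size_t n, double *a, const double *b)
{
    assert(n == 0 || (a != NULL && b != NULL));
    for (size_t i = n; i-- > 1; )   /* i = n-1, n-2, ..., 1 */
        a[i] = a[i - 1] + b[i];
}
```

A C back-end handed only the forward form of the second loop could not know
that the recurrence was unintended; the decision has to be made by whoever
still has the array semantics in hand.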


In summary, C is sufficient as an intermediate language, but it is too poor
for high-performance compilers of larger languages.
----------------------------------------------------------------------
Dave Lloyd Email: Dave@occl-cam.demon.co.uk
Oxford and Cambridge Compilers Ltd Phone: (44) 1223 572074
55 Brampton Rd, Cambridge CB1 3HJ, UK
--

