Re: Folk Theorem: Assemblers are superior to Compilers

Dave Gillespie <synaptx!thymus!daveg@uunet.UU.NET>
Mon, 15 Nov 1993 18:17:54 GMT

From comp.compilers

Newsgroups:	comp.compilers
From:	Dave Gillespie <synaptx!thymus!daveg@uunet.UU.NET>
Keywords:	assembler, optimize, performance
Organization:	Compilers Central
References:	93-10-114 93-11-084
Date:	Mon, 15 Nov 1993 18:17:54 GMT

[I wrote:]
>How many languages have a declaration that
>tells the compiler that a given pointer, or even a given integer, is a
>multiple of 16?

Ron Guilmette writes:
> In the case of the C language, we are (I think) fortunate to have certain
> "industry standards", which, in many cases, go beyond the requirements
> laid down by the international ISO C standard.

We know about that industry standard, and it has saved our bacon: it would
be incredibly painful for the programmer to arrange for proper alignment
if "new" and "malloc" didn't give that guarantee.

I don't think our compiler guarantees arrays on the stack to be
quadword aligned; the documentation certainly doesn't mention any
such guarantee, and we have never needed to check it out.

> In the case of the i860 (in particular) the ps-ABI for this processor does
> indeed require compilers to align all data objects (and members of struct
> and union types) which have type `long double' to 16-byte boundaries.

I think you may have missed my point: It's not that we want to load one
quad-float at once, it's that we want to load *four* single-floats at
once. Say you're doing a vector "a = b*c" operation; for every one-cycle
multiply, you need three load/stores. With a bit of loop unrolling plus
load/store-quad, you can get your three load/stores per cycle with room to
spare.
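
As a rough sketch of the loop in question (the name "vec_mul" and its signature are made up for illustration, not taken from any real library), here is "a = b*c" unrolled by four in plain C. Each group of four floats is 16 bytes, which is the unit a load/store-quad would move, but nothing in the source tells the compiler that the three pointers sit on 16-byte boundaries:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a[i] = b[i] * c[i] for i in [0, n).
 * The main loop handles four floats (16 bytes) per array per iteration.
 * A compiler could use quad loads/stores here only if it knew that
 * a, b and c were 16-byte aligned; C gives it no way to know that. */
void vec_mul(float *a, const float *b, const float *c, size_t n)
{
    size_t i = 0;

    assert(a != NULL && b != NULL && c != NULL);

    /* Unrolled body: four multiplies, three groups of four load/stores. */
    for (; i + 4 <= n; i += 4) {
        a[i]     = b[i]     * c[i];
        a[i + 1] = b[i + 1] * c[i + 1];
        a[i + 2] = b[i + 2] * c[i + 2];
        a[i + 3] = b[i + 3] * c[i + 3];
    }

    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; i++)
        a[i] = b[i] * c[i];
}
```

Whether the unrolled body turns into quad operations depends entirely on what the compiler can prove about the pointers.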

This is really an issue of information at the procedure-call boundary.
(In that sense it's a relative of the infamous "noalias" problem.) Say I
have a function

    double sum_vector(double *p, int n);

At first glance, the ABI might imply that "sum_vector" can assume that "p"
is quadword aligned on an 860. But of course it can't; there's nothing
stopping the programmer from writing

    double array[10];
    double last_five = sum_vector(&array[5], 5);

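The pointer "p" has the wrong alignment now: "&array[5]" is 40 bytes past
the start of the array, so even if the array itself is quadword aligned,
"p" is not. And this is nothing specific to C; even number-friendly
FORTRAN has this problem. The only ways for the compiler to use the fast
instructions safely are exhaustive interprocedural analysis, non-standard
declarations, or having the compiler automatically write "sum_vector" in
the form of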

if (happy(p)) <fast-loop> else <slow-loop>

which is hard to make into a general solution.
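
Written out by hand, the two-loop form might look like the sketch below. It assumes that "happy(p)" simply means "p is on a 16-byte boundary"; the helper "is_aligned16" is a made-up name, and casting a pointer to "uintptr_t" to inspect its low bits is conventional but implementation-defined. The fast path is only a stand-in: it consumes two doubles (one quadword) per step, which is where a compiler or assembly programmer could place a quad load.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for happy(p): is p on a 16-byte boundary? */
static int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

double sum_vector(double *p, int n)
{
    double sum = 0.0;
    int i = 0;

    if (is_aligned16(p)) {
        /* Fast loop: two doubles = one 16-byte quadword per iteration.
         * p + i stays 16-byte aligned because i advances by two. */
        double s0 = 0.0, s1 = 0.0;
        for (; i + 2 <= n; i += 2) {
            assert(is_aligned16(p + i));
            s0 += p[i];
            s1 += p[i + 1];
        }
        sum = s0 + s1;
    }

    /* Slow loop: the whole vector if p is misaligned, else the tail. */
    for (; i < n; i++)
        sum += p[i];

    return sum;
}
```

Note that the two paths add the elements in a different order, so their floating-point results can differ in the last bits. That is one more reason this kind of transformation is awkward for a compiler to apply on its own.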

The compiler we use offers none of these, so the load-quad instruction is
simply out of its reach.

-- Dave