Re: Compilers in six hours

chase@Think.COM (David Chase)
Thu, 19 May 1994 16:33:53 GMT

          From comp.compilers

Related articles
Compilers in six hours macrakis@osf.org (1994-05-12)
Re: Compilers in six hours chase@Think.COM (1994-05-17)
Re: Compilers in six hours grunwald@widget.cs.colorado.edu (1994-05-17)
Re: Compilers in six hours hbaker@netcom.com (1994-05-18)
Compilers in six hours ssimmons@convex.com (1994-05-18)
Compilers in six hours ssimmons@convex.com (1994-05-19)
Re: Compilers in six hours anton@mips.complang.tuwien.ac.at (1994-05-19)
Re: Compilers in six hours chase@Think.COM (1994-05-19)
Re: Compilers in six hours hbaker@netcom.com (1994-05-20)
Re: Compilers in six hours monnier@di.epfl.ch (Stefan Monnier) (1994-05-22)
Re: Compilers in six hours munk@prl.philips.nl (1994-05-24)
Re: Compilers in six hours li@marcus.cs.umn.edu (1994-05-24)
Re: Compilers in six hours cytron@kato.wustl.edu (1994-05-25)
Re: Compilers in six hours hobbs@gemmax.zko.dec.com (1994-05-26)
| List of all articles for this month |

Newsgroups: comp.compilers
From: chase@Think.COM (David Chase)
Keywords: courses
Organization: Thinking Machines Corporation, Cambridge MA, USA
References: 94-05-066
Date: Thu, 19 May 1994 16:33:53 GMT

ssimmons@convex.com (Steve Simmons) writes:
|> Hmmm... I did teach a class recently and allowed C to be the target
|> language (as opposed to assembly) for the compiler project.


|> Drawbacks:
|> - Do not gain an understanding of register allocation problems
|> and spillage.
|> - Do not understand memory allocation and address references.
|> - Do not learn parameter passing well.
|> - Do not learn problems with calling conventions.
|> - Do not get a complete feel for linearization of the parse tree
|> or expansion of any intermediate representation.
|> - Do not need to do array expansion.
|> - Do not need to do type conversions.
|> - Do not learn about runtime libraries. They intuitively use C's.
|> - Forget about instruction scheduling.


Having actually done a compiler-translating-to-C once (Modula-3), I think
you'd be surprised how much you do in fact end up worrying about some of
these issues. One interesting subtask in doing the translation is
figuring out what the costs of various constructs are for the C compilers
of interest (the output is basically portable, but it may have varying
performance characteristics).


So, for example, you might choose to do you own structure parameter
passing and returning, if your language has particular properties. If
your arrays don't happen to fit C's arrays very well, you may well find
yourself getting into rather detailed stuff.


The one place where a difference can be made, even in C, is in scheduling,
software pipelining, and poor man's vectorization. If you've got a loop
over some data type that is smaller than the largest load/store size
(e.g., ldd/std on Sparc), then you can unroll your loop enough to enable
the use of the larger load/store (first, you have to align things, of
course). In places where you have a choice between saying something like


    while (x != 0 && x -> data != key)
          x = x -> next;


versus


    while (x != 0 && ((t = x -> next), x->data) != key)
          x = t;


you might consider it (this second one allows the load of x -> next to
occur before the test, which means it might schedule better). Since you
are generating your own code, you can even choose to use a different NULL
value (this is great for performance, but terrible for interoperability
with C) so that you could rewrite the loop as:




    while (((t = x -> next), x->data) != key && x != NULL)
                x = t;


This gives you somewhat more freedom in scheduling, since you can push the
load up a little more. In this case, NULL ends up being something like a
pointer into the middle of two pages full of NULL -- that way, loads from
NULL just give you NULL. Laurie Hendren and Joe Hummel (and someone
else?) pointed out a nifty trick where (given a NULL with these
properties) you can unroll loops like the one above and defer the test for
NULL, since NULL == *NULL == **NULL == ***NULL etc. They got speedups
doing this -- it was pretty cool.


If you know something about your data types, such as "this array is
allocated all the way out to the Nth element", you get an opportunity to
schedule loads up before an early exit from a loop, which basically means
that you can do pipelining at the source level. C compilers in general
cannot do this optimization, because array bounds are rarely supplied and
often lied about. You may not get optimal pipelining, but in those cases
where I did it experimentally (it's hard work by hand) I got a substantial
speedup.


Of course, the bad outcome of learning all this stuff is that people might
try to do these same tricks while hand-coding C.


David Chase, speaking for myself
Thinking Machines Corp.
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.