Related articles
RE: Two questions about compiler design tom@kednos.com (Tom Linden) (2004-02-26)
Re: Two questions about compiler design cfc@shell01.TheWorld.com (Chris F Clark) (2004-02-27)
RE: Two questions about compiler design tom@kednos.com (Tom Linden) (2004-03-02)
Re: Two questions about compiler design cfc@shell01.TheWorld.com (Chris F Clark) (2004-03-06)
From: "Tom Linden" <tom@kednos.com>
Newsgroups: comp.compilers
Date: 2 Mar 2004 11:03:34 -0500
Organization: Compilers Central
References: 04-02-170
Keywords: design
Posted-Date: 02 Mar 2004 11:03:34 EST
Tom Linden wrote:
> Between the two came the n-tuple design that Freiburghouse developed
> for PL/I which was widely used for a number of languages by Digital,
> Wang, Prime, DG, Honeywell, CDC, Stratus and others I can't recall.
Ah, fond memories rekindled. Freiburghouse's IL was fairly close to a
useful UNCOL. At Prime, we had frontends for it for PL/I (at least 3
dialects), Fortran, Cobol, Pascal, Basic, Modula-2, and RPG that were
shipped to customers and supported. In house, Kevin Cummings wrote an
Algol-60 frontend as a fun project. I'm pretty sure a C frontend was
written also, but was not the compiler that got shipped to customers.
The best part was that most of the backend and scaffolding for all
those projects was common.
Prime's C front-end was written by Conboy and did indeed conform to
the PL/I IL and symbol-table design. I no longer have the sources
around in electronic form, but a while back I found a listing of the
I-mode code generator for the 50 Series. In fact, the code generator
I did for the Alpha was a distant offshoot of this.
At Prime we even built our own improved global optimizer for the IL.
We started to build a third version of the global optimizer based on
Fred Chow's Stanford thesis, but that fell prey to the second system
syndrome and died a death due to management not wanting to fund the
project unless it had no risks and the project engineers adding things
to reduce the risk, but never willing to say that there were none. (I
left when the project plan was over 100 pages with no end in sight.)
Ah, optimization. Interesting from several points of view. When
Cutler et al. did the VAX code gen they basically followed Aho,
Sethi and Ullman. But the interesting thing is that you can get 80%
of the optimizations with a rather simple optimizer, at a fraction of
the cost to develop - and maintain.
In addition to the hardware vendors Tom listed above, I know two
compiler houses made a good living off the technology, TSI and LPI.
Later in my career I did a stint with LPI.
Well, actually, I don't know if you would call it a good living, but
TSI morphed into Kednos, which today does only PL/I for VAX and
Alpha, and next year Itanium.
Having mentioned Fred Chow, it is worth tying this back to P-code.
His thesis used a variant of P-code with register information that he
called U-code. There were several different U-code frontends built
also. I recall C and Fortran at DEC. I was there when they were
retiring the U-code backend, replacing it with the GEM backend.
I believe U-code is what Hennessy used for the MIPS compilers.
Interestingly, most of the GEM staff came out of Digital's PL/I group,
which had developed VCG. Their focus was largely C, so they eliminated
frame pointers, which made stack unwinding a bit of a chore and indeed
required one to use undocumented features.
Having experienced both, I think the U-code IL was better for many
compiler purposes, but not as good an UNCOL. The global optimizer
technology associated with U-code was certainly better, both simpler
to maintain and more sophisticated. In contrast, I think the
Freiburghouse code generator technology was better, especially from an
ease-of-maintenance standpoint. Part of this was due to the fact that
the Freiburghouse IL was not as close to the machine and let more
frontend semantics peer through. For example, when looking at a
"reference" to a variable in the Freiburghouse IL, one had to know
which frontend produced the reference for certain aspects of its
semantics--that made it a better UNCOL, because the frontend didn't
have to bend so much to match some other language's memory model.
Yes, the IL was much like an abstract, overloaded assembly language.
In fact, as an experiment, I generated in the semantic pass of PL/I
the IL code for builtin functions, essentially writing the algorithms
in IL to be inlined rather than emitted as library calls, and this
turned out to be very simple and almost self-documenting.
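To give the flavor of such an experiment, here is a hypothetical sketch in Python; the opcode names and tuple shapes are invented for illustration and are not the actual Freiburghouse IL. It shows a semantic-pass routine emitting an inline n-tuple loop for something like PL/I's INDEX builtin instead of a call to a runtime library routine.

```python
# Hypothetical sketch: the opcodes below are invented, not the real
# Freiburghouse n-tuples; they only illustrate a semantic pass
# emitting inline IL for a builtin rather than a library call.

def emit_index_builtin(haystack, needle):
    """Emit n-tuple IL inlining a PL/I-style INDEX builtin:
    find the 1-based position of `needle` in `haystack`, 0 if absent."""
    il = []
    il.append(("assign", "pos", ("const", 0)))          # pos := 0
    il.append(("assign", "i", ("const", 1)))            # i := 1
    il.append(("label", "loop"))
    il.append(("branch_gt", ("var", "i"),               # while i <= len
               ("length", haystack), "done"))
    il.append(("branch_ne",                             # compare slice
               ("substr", haystack, ("var", "i"), ("length", needle)),
               needle, "next"))
    il.append(("assign", "pos", ("var", "i")))          # found: pos := i
    il.append(("goto", "done"))
    il.append(("label", "next"))
    il.append(("assign", "i", ("add", ("var", "i"), ("const", 1))))
    il.append(("goto", "loop"))
    il.append(("label", "done"))
    return il
```

Written this way, the algorithm reads almost like the self-documenting IL the experiment produced: each tuple is one abstract "instruction" the common backend already knows how to lower.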
The distance from the machine helped at code generation time, because
each IL operator stood for something more or less complete and one
could then factor the cases at code generation time. There was a
simple but useful "language" used by the code generators to implement
those semantics. That made all the difference. It made the code
generator into something one could read easily and understand the code
sequences coming out.
In contrast, with the U-code IL, semantics were composed from more
primitive operations and code generation worked by matching
patterns--think BURG. While one can express all the same decisions
that way and perhaps more, it is much less clear to a code generator
writer how a small change will affect the code generated. Of course,
exposing all of those details in the IL was part of what made the
U-code optimizer good. The optimizer could easily rewrite unnecessary
operations out of the program, because the operations were all exposed
in the IL. In the Freiburghouse IL, many of those things were more
implicit and thus inaccessible to the optimizer.
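To make the BURG-style contrast concrete, here is a tiny illustrative matcher in Python. The patterns and instruction names are invented; this is not the real U-code backend, only the flavor of composing code generation from patterns over primitive operations.

```python
# Illustrative sketch of BURG-flavored instruction selection.
# An IL tree is a tuple (op, *operands); leaves are ("reg", n)
# or ("const", k). Patterns and mnemonics are invented.

PATTERNS = [
    # (pattern, cost, instruction template) -- tried in order
    (("add", ("reg",), ("const",)), 1, "addi {0}, {1}"),  # add-immediate
    (("add", ("reg",), ("reg",)),   1, "add  {0}, {1}"),
    (("load", ("reg",)),            2, "lw   ({0})"),
]

def matches(pattern, tree):
    """Structural match: same opcode, each operand leaf the right kind."""
    if pattern[0] != tree[0] or len(pattern) != len(tree):
        return False
    return all(p[0] == t[0] for p, t in zip(pattern[1:], tree[1:]))

def select(tree):
    """Pick the first matching pattern and fill in its template."""
    for pattern, cost, template in PATTERNS:
        if matches(pattern, tree):
            return template.format(*(str(t[1]) for t in tree[1:]))
    raise ValueError(f"no pattern for {tree!r}")
```

The point of the sketch: reordering or tweaking one entry in the pattern table can silently change which rule wins for a whole class of trees, which is exactly why a small change's effect on generated code is harder to predict than in a case-factored code generator.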
Perhaps the most striking thing about that difference is how it was
actually localized to a small part of the different ILs. If one
looked at the opcode listings for both, they would be mostly
identical. The key difference lies in the memory-access opcodes:
Freiburghouse's IL had only a generic "reference" operation, whereas
U-code had explicit load and store operations that used explicit
arithmetic to calculate the exact memory location. However, that
semantic difference ripples through the IL and completely changes
everything. One could easily implement the Freiburghouse IL on nearly
any architecture--memory-to-memory, a few registers, many registers,
stack based, byte addressable, word addressable--the IL was
architecture neutral. In contrast, U-code was optimized toward
many-register machines with a load-store architecture (with a
preference toward byte-addressable machines). If your machine doesn't
look like that, U-code isn't quite as useful.
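The contrast can be sketched with invented opcodes (these are not the real tuples of either IL). A generic reference keeps the address arithmetic implicit; lowering it to U-code style spells every operation out, which exposes it to the optimizer but bakes in a byte-addressed load-store machine model.

```python
# Hypothetical contrast of the two memory styles for  x = a[i].
# All opcodes here are invented for illustration.

# Freiburghouse-style: one generic "reference" node; the address
# arithmetic is implicit and resolved per target by the code generator.
ref_style = [("assign", ("ref", "x"), ("ref", "a", ("ref", "i")))]

def lower_ref(dest, base, index, elem_size=4):
    """Lower a generic array reference into explicit U-code-style
    load/store tuples. Assumes byte addressing and (hypothetically)
    4-byte elements -- the machine model U-code was tuned for."""
    return [
        ("load",  "t1", index),                  # t1 := index value
        ("muli",  "t2", "t1", elem_size),        # t2 := index * size
        ("add",   "t3", f"addr_of_{base}", "t2"),# t3 := &base + offset
        ("load",  "t4", ("mem", "t3")),          # t4 := base[index]
        ("store", dest, "t4"),                   # dest := t4
    ]
```

Once lowered, the multiply and add are ordinary IL operations the optimizer can strength-reduce or hoist; in the generic-reference form they never exist for it to see.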
The ref operator was specifically designed to efficiently reference a
member of an indexed, based structure; U-code did not have that
capability. This also helped with Fortran COMMON.
Of course, in the current time, we seem to have a paucity of different
architectures, but that won't last forever, I hope. And, that brings
up the final and perhaps most important point.
What gave rise to all the different architectures was essentially the
AMD 2901 bit-slice chips, which made it possible to rather easily put
together a microcoded architecture. The latest Xeon, I believe, has
125 million transistors, each of which is about the size of a DNA
molecule! It takes billions of dollars to create a plant that can
produce those kinds of chips, so I think diversity is not part of the
future.
Freiburghouse designed his IL in a time when many architectures were
around and none were prevalent. Making his IL into an UNCOL,
particularly for different underlying machines, was important. For him
investing a few more hours into developing a code generator for a new
machine/language combination actually meant money in his pocket, so he
wanted that task simple enough that he could effectively do it. The
ability to optimize the code on those machines was relevant but the
opportunities were more limited.
He actually formed most of his ideas while at Multics.
The U-code architecture reflects the great shift toward more of an
architectural monoculture. Different ports were not as different (at
least in terms of basic machine semantics), and what became more
relevant was achieving optimized results on the machines which were
becoming dominant: machines whose architecture is designed for C-like
languages, with byte-addressable uniform memory access and a
reasonably sized register file.
Well, I've rambled on enough on this topic. I hope something in here
was of interest....
Me too.