Re: Translator design decisions

Chris F Clark <cfc@shell01.TheWorld.com>
Mon, 21 Jan 2008 16:00:22 -0500

          From comp.compilers

Related articles
Translator design decisions msuvajac@sfsb.hr (Mario Suvajac) (2008-01-19)
Re: Translator design decisions DrDiettrich1@aol.com (Hans-Peter Diettrich) (2008-01-20)
Re: Translator design decisions cfc@shell01.TheWorld.com (Chris F Clark) (2008-01-21)
Re: Translator design decisions dot@dotat.at (Tony Finch) (2008-01-22)
Re: Translator design decisions rose@acm.org (Ken Rose) (2008-01-22)
Re: Translator design decisions cfc@shell01.TheWorld.com (Chris F Clark) (2008-01-23)
Re: Translator design decisions pertti.kellomaki@tut.fi (=?ISO-8859-1?Q?Pertti_Kellom=E4ki?=) (2008-01-23)
Re: Translator design decisions DrDiettrich1@aol.com (Hans-Peter Diettrich) (2008-01-23)
Re: Translator design decisions idbaxter@semdesigns.com (2008-01-26)

From: Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups: comp.compilers
Date: Mon, 21 Jan 2008 16:00:22 -0500
Organization: The World Public Access UNIX, Brookline, MA
References: 08-01-050 08-01-054
Keywords: UNCOL
Posted-Date: 21 Jan 2008 23:48:57 EST

Our esteemed and wise moderator wrote:
> Just to save you time, the so-far inevitable trajectory of an UNCOL
> project is that they try a couple of semantically similar source
> languages, and a couple of semantically similar targets, it seems to
> work OK, and wild enthusiasm ensues. Then as they add more sources
> and more targets, it becomes apparent that each one requires a bunch
> of new special case hacks in the intermediate language, which rapidly
> overwhelms whatever common stuff they thought they had. After a
> while, the project quietly disappears. Heard from ANDF lately?


I've worked on several semi-successful UNCOL projects, maintaining
them after they had been used to translate a variety of languages
(e.g. the TSI and LPI compiler suites, which are interrelated, the MIPS
suite based on Fred Chow's ucode optimizer, some internal stuff used
at Cadence, and probably some more, but not gcc, ANDF, or SUIF).
There is a lot of compiler infrastructure to leverage, which is what
promotes the initial success and wild enthusiasm.


However, in the end, there is a huge body of implicit assumptions
present in any given language that pervades its semantics, and that is
what eventually kills the commonality. If one ignores the hidden
semantics, the tool fails to capture some important detail, which
results in either incorrect behaviour (if one chooses something overly
unsafe) or abysmal performance (if one chooses something overly
pessimistic).


A semi-successful UNCOL project will allow those assumptions to be
factored out into a language-specific portion. For example, the
TSI-based FORTRAN compiler had special semantics for identifier names
at the IL (intermediate language) level, and that was the right place
to put the special hooks FORTRAN needed for that suite. The FORTRAN
compiler in the MIPS suite put the special hooks elsewhere.
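
To make "factoring out a language-specific portion" concrete, here is a
minimal sketch of one way a shared IL symbol table could take a
per-language hook for identifier names. It is not the TSI or MIPS
design; `ILSymbolTable`, `NamePolicy`, and both policy functions are
hypothetical names invented for illustration. The only language fact
assumed is that FORTRAN identifiers are case-insensitive while C
identifiers are not.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical hook type: maps an identifier as written in the source
# to the canonical key used by the shared IL symbol table.
NamePolicy = Callable[[str], str]


def case_sensitive(name: str) -> str:
    # C-like front ends: identifiers are used exactly as written.
    return name


def case_insensitive(name: str) -> str:
    # FORTRAN-like front ends: names differing only in case are the same.
    return name.lower()


@dataclass
class ILSymbolTable:
    """Shared (language-neutral) table; the policy is the language hook."""
    policy: NamePolicy
    symbols: Dict[str, int] = field(default_factory=dict)

    def intern(self, name: str) -> int:
        # Canonicalize through the language hook, then assign a stable id.
        key = self.policy(name)
        if key not in self.symbols:
            self.symbols[key] = len(self.symbols)
        return self.symbols[key]
```

With this arrangement the shared code never needs to know which
language it is serving; a FORTRAN front end would pass
`case_insensitive` and a C front end `case_sensitive`. Where the hook
lives (in the IL, as here, or in the front end) is exactly the kind of
choice that differed between the two suites.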


A contrasting experience was the work I did at Cadence to merge the
Verilog and VHDL code generators. Both languages are in the hardware
design space; one is C-like, the other more in the Pascal/Ada family.
That didn't seem like a large semantic gap, and initial estimates
showed that about 60% of the IL opcodes were shared (had the same name
and same function). However, as I started looking into the details,
much of the compatibility dissolved, because at the roots the two
languages had essentially different models of operation (one based on
bits, the other on abstract types), and although the IL operators did
the same things, the way they did them was essentially different. As
a result, only about 25% of the code could have been merged at all,
and even that would have almost immediately diverged again.
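
A small sketch of how two opcodes can share a name and a function yet
have nothing mergeable underneath. This is not the Cadence IL; the
function names are hypothetical, and both models are simplified. It
assumes a Verilog-style ADD over fixed-width 4-state bit strings (any
x or z bit makes the whole result x, and the sum wraps to the operand
width) and a VHDL-style ADD over an integer type with a declared
range (going out of range is an error, not a wrap).

```python
def add_bits(a: str, b: str) -> str:
    """ADD on 4-state bit strings (characters 0, 1, x, z)."""
    if len(a) != len(b):
        raise ValueError("operands must have the same width")
    width = len(a)
    # Any unknown or high-impedance bit poisons the entire result.
    if any(ch in "xz" for ch in a + b):
        return "x" * width
    # Otherwise add as unsigned values and truncate to the operand width.
    total = (int(a, 2) + int(b, 2)) % (1 << width)
    return format(total, "0{}b".format(width))


def add_ranged(a: int, b: int, low: int, high: int) -> int:
    """ADD on an abstract integer type constrained to low..high."""
    result = a + b
    # No bit-level wraparound: leaving the declared range is an error.
    if not low <= result <= high:
        raise OverflowError("result outside declared range")
    return result
```

Both are "ADD", and both add, but one is a string operation over a
four-valued alphabet and the other is range-checked arithmetic, so
there is little implementation to share: `add_bits("1111", "0001")`
wraps to `"0000"`, while `add_ranged(15, 1, 0, 15)` raises.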


This effect is not something unique to compilers. If you follow the
OO literature, you will quickly come across "refactoring". This is
the process of [re-]finding commonalities once the "shape" of a
project has changed. Each architecture exposes some commonalities and
makes them easy to solve, but hides others. There is no architecture
that exposes all the commonalities for easy solution. One of my
mentors called this the "conservation of difficulty": once you reach a
local minimum, if you solve a difficulty in one region, the difficulty
is conserved and pops up somewhere else.


There is one nice aspect to the UNCOL projects. They do find ways to
keep plenty of compiler people employed. The initial success is
seductive enough that people get addicted, and the downstream work is
just successful enough to keep people from abandoning it.


If I were starting my compiler career these days, I would probably
learn a popular Java JIT or the .NET CLR. There should be good money
in those for a while.


However, unless I were overly self-confident, I would not try to
create my own new UNCOL unless I had pretty tight bounds on what I
wanted it to do. Even if the Lemming documentaries were faked, they
are still instructive.


Hope this helps,
-Chris


******************************************************************************
Chris Clark Internet: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. or: compres@world.std.com
23 Bailey Rd Web Site: http://world.std.com/~compres
Berlin, MA 01503 voice: (508) 435-5016
USA fax: (978) 838-0263 (24 hours)
------------------------------------------------------------------------------

