Re: Pros and cons of high-level intermediate languages

chased@rbbb.Eng.Sun.COM (David Chase)
Tue, 21 Jul 1992 18:01:44 GMT

          From comp.compilers

Newsgroups: comp.compilers
From: chased@rbbb.Eng.Sun.COM (David Chase)
Organization: Sun Microsystems, Mt. View, Ca.
Date: Tue, 21 Jul 1992 18:01:44 GMT
Keywords: translator, design
References: 92-07-064

ssp@csl36h.csl.ncsu.edu (Santosh Pande) writes:
> I am interested in knowing the pros and cons of using an
>intermediate language (IL) in general. In particular I find 'C' has been
>used extensively as the IL in many situations: Modula, SISAL, AT&T Cfront
>for C++ etc.


> Some of the reasons I can see in favor of such an approach are:
> (1) Ease of portability (a feature of C that made it so popular!),


yes.


> (2) Easy retargettability (one can patch the run-time support with a
>customized library for the given target architecture easily), and,


yes.


> (3) Relative ease of mapping a given intermediate form (IF) to C's
>data structures.


so-so. The closer the original language is to C, the better. As soon as
you have a non-C concept (e.g., continuations, or guaranteed tail-call, or
garbage collection, or threads) things begin to get a bit rougher. Note
that there are existence proofs demonstrating that these things can be
done, but the non-C concepts are a lot harder to translate.


Note well that my experience (Modula-3) and the experience of friends who
have done similar things indicates that you should dive to a fairly low
level in the generated C. In particular, you should consider generating
"cast-ful" code, unless you have an extremely good command of exactly what
happens when you combine types of various flavors in arithmetic.
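
To illustrate the "cast-ful" point, here is a minimal sketch. It assumes a
source language in which the difference of two unsigned 32-bit values is
supposed to be a signed result; the function names are hypothetical and are
not taken from any real translator. On platforms where int is 32 bits wide,
the naive form does the subtraction in unsigned arithmetic and wraps before
the result is widened, while the cast-ful form states the intended type of
each operand explicitly.

```c
#include <stdint.h>

/* Naive translation: relies on C's usual arithmetic conversions.
   With 32-bit int, the subtraction is performed in uint32_t, so
   1 - 2 wraps around to 4294967295 before it is widened to int64_t. */
int64_t diff_naive(uint32_t a, uint32_t b) {
    return a - b;
}

/* "Cast-ful" translation: the generator spells out the type in which
   each operand takes part, so the arithmetic is done in int64_t. */
int64_t diff_castful(uint32_t a, uint32_t b) {
    return (int64_t)a - (int64_t)b;
}
```

Calling diff_castful(1, 2) yields -1; the naive version yields a large
positive number on the platforms described above.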


> However, such an approach might also suffer from:
> (1) Debugging is hellish,


It's not as bad as you might think (for the compiler-writer, that is).
This depends largely on you, and largely on the language that you are
compiling -- for Modula-3, since all pointer types were tagged, we were
able to generate P_<typename> subroutines that did an excellent job of
formatting data structures. We tinkered with insertion of #line
directives (back to the M3 source), and sometimes it succeeded
wonderfully, but too often it would be subtly misleading at just the wrong
time. I tried not to mangle variable names excessively, and that helped.
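
A generated printing routine of the P_<typename> kind might look roughly
like the sketch below. Everything here is an assumption made for
illustration: the Point type, the tag layout, and the P_Point_to helper are
hypothetical, and the actual Modula-3 output is not shown in this post. The
idea is only that a tagged object can be checked and formatted by a routine
that is callable from a debugger.

```c
#include <stdio.h>

/* Hypothetical generated layout: every heap object starts with a type tag. */
enum { TAG_Point = 1 };

struct Point {
    int tag;   /* assumed to be written by the allocator */
    int x;
    int y;
};

/* Format a Point into buf; returns what snprintf returns
   (the number of characters the full text needs). */
int P_Point_to(char *buf, size_t n, const struct Point *p) {
    if (p == NULL)
        return snprintf(buf, n, "Point(NIL)");
    if (p->tag != TAG_Point)
        return snprintf(buf, n, "Point(<bad tag %d>)", p->tag);
    return snprintf(buf, n, "Point{x=%d, y=%d}", p->x, p->y);
}

/* Debugger entry point: print a Point on stdout. */
void P_Point(const struct Point *p) {
    char buf[64];
    P_Point_to(buf, sizeof buf, p);
    puts(buf);
}
```

From a debugger one would then evaluate something like P_Point(p) on a
suspect pointer instead of reading raw memory.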


> (2) Efficiency is dictated to a large degree by the target C
>Compiler,


Yes and no. This depends a lot on the code that you generate. I took the
approach (for Olivetti M3) that we couldn't rely on the optimizer because
we were using garbage collection and exception handling (implemented with
setjmp and longjmp), and so I took great care in the code that I
generated.
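
For readers who have not seen exceptions built on setjmp and longjmp, the
following is a minimal sketch of the general technique, not the Olivetti M3
scheme itself. The handler_frame structure, rt_raise, and try_div are
hypothetical names. It also shows one reason for care: a local that is
modified after setjmp and read after longjmp has an indeterminate value
unless it is volatile-qualified, so a generator may mark such locals
volatile rather than trust the optimizer.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical run-time: a linked stack of active handler frames. */
struct handler_frame {
    jmp_buf env;
    struct handler_frame *prev;
};

static struct handler_frame *handler_top = NULL;

/* RAISE: pop the innermost handler and jump to it with a nonzero code. */
static void rt_raise(int code) {
    struct handler_frame *f = handler_top;
    if (f == NULL)
        abort();                 /* no handler installed */
    handler_top = f->prev;
    longjmp(f->env, code);
}

static int checked_div(int a, int b) {
    if (b == 0)
        rt_raise(1);             /* does not return */
    return a / b;
}

/* Rough translation of:
     TRY r := a DIV b EXCEPT DivZero => r := -1 END */
int try_div(int a, int b) {
    struct handler_frame frame;
    volatile int result = 0;     /* marked volatile conservatively, see above */

    frame.prev = handler_top;    /* push a handler frame */
    handler_top = &frame;
    if (setjmp(frame.env) == 0) {
        result = checked_div(a, b);
        handler_top = frame.prev;   /* normal exit: pop the frame */
    } else {
        result = -1;             /* handler body; rt_raise already popped */
    }
    return result;
}
```

Here try_div(10, 2) returns 5 and try_div(1, 0) returns -1 by way of the
handler.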


> (3) Translation in some situations might involve clumsy data
>structures and thus loss of efficiency.


It's not clear what you mean here. Compile-time efficiency? (It sucks,
generally.) Run-time efficiency? (It depends.)


> Now my questions:
> (1) I am looking for examples in which using C as IL will lead to
>inefficiencies,


Aliasing analysis. You already know about Fortran. Other places where
this can happen include references to pieces of descriptors that you "know"
won't change because they are part of your run-time. You can get around
this by pre-loading everything that won't change into a local variable,
but this could lead to a Big Surprise for the target C compiler. ("Big
Surprise" means potentially slow compilation and/or slow execution, or
(worse) overflowed internal tables.)
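
The pre-loading workaround can be sketched as follows. The long_array
descriptor and both function names are hypothetical. In the direct form the
C compiler has to assume that a store through d->elts might change d->count,
since both are of type long, so it may reload the count on every iteration.
In the pre-loaded form the generator has copied the fields it "knows" are
invariant into locals.

```c
/* Hypothetical run-time descriptor for an open array of long. */
struct long_array {
    long *elts;
    long count;
};

/* Direct translation: d->count is read in the loop condition, and the
   compiler cannot prove that the store to d->elts[i] leaves it unchanged. */
void fill_direct(struct long_array *d, long v) {
    long i;
    for (i = 0; i < d->count; i++)
        d->elts[i] = v;
}

/* Pre-loaded translation: descriptor fields are copied into locals once,
   so the loop touches only the locals and the element storage. */
void fill_preloaded(struct long_array *d, long v) {
    long *const elts = d->elts;
    const long count = d->count;
    long i;
    for (i = 0; i < count; i++)
        elts[i] = v;
}
```

With many descriptors live at once, the pre-loaded form creates a large
number of locals, which is where the surprise for the target compiler can
come from.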


Also, the lack of (portable) register globals, or (portable) lightweight
access to thread-local storage can screw you up. If you want to write a
compacting garbage collector, the intermediate C compiler introduces
uncertainty as to just where the pointers are stored in the activation
records, and what has been done to them. There are other clever tricks
that you might want to try (special code generation for critical sections,
special code generation for exception handling) that are completely
off-limits if you use C as an IL.


> (3) I want to learn about the efforts to evolve efficient ILs
>(just like IFs) for procedural languages.


What's the difference between an IL and an IF?


David Chase
Sun
--

