Re: UNCOL

dmason@scs.ryerson.ca (Dave Mason)
8 Jun 1996 22:11:15 -0400

          From comp.compilers

Related articles
Re: Java virtual machine as target language for C/C++ kik@zia.cray.com (1996-05-08)
Re: Java virtual machine as target language for C/C++ pardo@cs.washington.edu (1996-05-19)
Re: UNCOL pardo@cs.washington.edu (1996-05-21)
Re: UNCOL patrick_d_logan@ccm.hf.intel.com (Patrick Logan) (1996-05-26)
Re: UNCOL dmason@scs.ryerson.ca (1996-06-08)
Re: UNCOL dave@occl-cam.demon.co.uk (Dave Lloyd) (1996-06-13)
| List of all articles for this month |

From: dmason@scs.ryerson.ca (Dave Mason)
Newsgroups: comp.compilers
Date: 8 Jun 1996 22:11:15 -0400
Organization: Ryerson Polytechnic University
References: 96-05-061 96-05-133 96-05-141 96-05-159
Keywords: Java, UNCOL

pardo@cs.washington.edu wrote:
>>[UNCOL says you *can* use a common virtual machine language, but it
>> also says there will be a performance penalty.]


John Levine (the moderator) wrote:
>[UNCOL-ish systems seem to work OK if you have a single source
> language, e.g. Smalltalk, or a single target, e.g. multiple compilers
> sharing a back end. It's when you try NxM that you get heat death.]


I think we're conflating two different things under the general
heading of UNCOL: IR for compiler front-ends and back-ends (the
original idea), and VM architecture for portable executables.


I don't think this conflation is useful. Java VM is a reasonable
design for its intended purpose (although as Shivers and others have
pointed out, it's not ideal for various other languages). It is not a
particularly good intermediate language to use as an UNCOL. It is not
nearly general enough, and discards too much information.


Most of the traditional attempts at UNCOL have been one of:
stack-based, register-transfer, or abstract-syntax-trees. They have
also almost exclusively been restricted to a small class of source
languages -- i.e. they do okay on the COL part, but miss on the
UNiversal part! Most of them share the JavaVM's problems: not general
enough, discard too much information (with some tension between these
two ends. They also assume that the important part of code generation
is expression evaluation, when for languages like ML, Scheme,
Smalltalk, and even C++, it is much more important to minimize the
cost of their large number of function calls.


In article 96-05-141 pardo@cs.washington.edu (David Keppel) writes:
> I believe that it should be workable to use a single IR for several
> similar source languages and several similar back ends. For
> example, one IR for Algol-class languages on 32-bit flat-address
> machines, another IR for Smalltalk/SELF-class languages on
> microcontrollers. Thus, MxN languages is hard, but M/10 x N/10
> languages might be both tractable and give reasonable efficiency.
>[...]
> To summarize, UNCOL is a hard problem; subsets are easier to solve; and
> it is hard to provide both high efficiency and, at the same time, a
> convincing argument that successful translation on one platform will
> imply successful translation on other platforms.


In article 96-05-159 Patrick Logan <patrick_d_logan@ccm.hf.intel.com> wrote:
> Didn't Guy Steele suggest Scheme as a suitable UNCOL-substitute
> almost twenty years ago?
>[...]
> Could it be that MxN UNCOLs/IRs are difficult because of the level
> of abstraction?


I don't think UNCOL *is* that hard a problem, and Patrick has hit the
nail on the head: most of the attempts at UNCOL have come from people
writing compilers for Algol class languages and then trying to
generalize them. Scheme is a *far* better bet for a true UNCOL than any
other I've seen proposed. My PhD thesis -- due out any month now --
suggests a better intermediate representation that supports languages
including C, Modula3, CAML, Scheme, Smalltalk, Haskell (although I
don't have compilers for all of these, I do for several, and I have
encodings of the tricky bits in the other languages).


An interesting point that is often overlooked in discussing UNCOLs is
that producing good code for a range of destination machines is as
hard as supporting the semantics of a range of source languages. We
get seduced somewhat by the relative uniformity of available
processors (the 386 is the register-poorest machine that we usually
consider, and even it has 7 usable registers). I don't know how good
a job even *my* intermediate form would do on say, a Transputer.


(As an aside, thanks to our moderator for providing the search
capability on the archive. I found it quite helpful in composing this
reply.)


../Dave


--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.