Re: Intermediate language

dwight@pentasoft.com (Dwight VandenBerghe)
19 Apr 1999 14:48:55 -0400

          From comp.compilers


From: dwight@pentasoft.com (Dwight VandenBerghe)
Newsgroups: comp.compilers
Date: 19 Apr 1999 14:48:55 -0400
Organization: Compilers Central
References: 99-04-056
Keywords: interpreter, design

On 18 Apr 1999 02:07:26 -0400, Jim Prince <tools2@gate.net> wrote:


>I tried to start a discussion on intermediate languages and was
>unsuccessful. Thank you to the people that replied directly to me.


It's hard to start into an open-ended discussion like that; where do
you begin? It's a big topic. Do you have specific questions? If you
could frame your concern or confusion into a more focused area, I
think more people might want to respond.


I can start off with an overview. You design an intermediate language
for a specific purpose, so there is really no single intermediate
language that is suitable across the board. Here are three very
common intermediate languages:


1. AST (Abstract Syntax Tree). This is a tree structure, typically
an N-ary tree, that mirrors the abstract syntax of the source
language. Each leaf in the tree corresponds to a terminal of some
sort in the language (say, a constant or a variable) and each interior
node corresponds to an operator or, perhaps, a non-terminal (add,
subtract, assign). This type of structure is typically the result of
a parse, and is used for type checking and as input to a later stage
of the compilation process.


2. Triples or Quads. This is a list of what look like pseudo-machine
instructions. The first field of each triple is the op code, and the
second and third fields are the operands. In a triple the result is
implicit: later instructions refer to it by the position of the triple
that computed it. [For quads, the fourth field specifies where the
result is placed.] Triples define how an idealized machine might
execute the program, assuming an infinite number of registers and
other resources. Triples/quads are typically used in an optimizing
compiler; the various passes of the optimizer operate on the lists of
instructions and attempt to reduce them (according to some metric of
space or time). Triples/quads (also known as "tuples") are typically
divided into lists that correspond to the basic blocks of the program,
as the basic block turns out to be the right granularity for most
optimizations.


3. C. Yes, the C language itself is an intermediate language, and a
very successful one at that! The original AT&T cfront "compiler" for
C++ was actually a translator with C as its target; the resulting C
code was then input to a standard C compiler which was used as the
back-end. Many other language implementations also emit C as their
output, which has turned out to be the realization of the 1970s dream
of a universal intermediate language that would be supported on all
architectures. I use C as an intermediate language all the time: by
writing the front end of a compiler and then tuning the C it emits, I
can take advantage of the very effective native code generation that
the vendor's C compiler usually provides. Often I don't have to do
all that much optimization work, because the C compiler is so good!
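
To make the three forms above concrete, here is a minimal sketch in C
that walks one expression through all of them: an AST, a list of
quads, and finally emitted C text. Every name in it (Node, Quad,
QuadList, lower, emit_c, MAX_QUADS) is invented for illustration, the
operator set is limited to add and multiply, and all values are
assumed to be of type int. It is one possible layout under those
assumptions, not how any particular compiler does it.

```c
#include <stdio.h>
#include <string.h>

#define MAX_QUADS 32   /* arbitrary limit for this sketch */

/* AST: leaves are constants or variables, interior nodes are operators. */
typedef enum { N_CONST, N_VAR, N_ADD, N_MUL } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;                        /* used by N_CONST only */
    char name;                        /* used by N_VAR only (one letter) */
    const struct Node *left, *right;  /* used by operator nodes only */
} Node;

/* Quad: op code, two operands, and an explicit result field. */
typedef struct {
    char op;                          /* '+' or '*' */
    char arg1[16], arg2[16], result[16];
} Quad;

typedef struct {
    Quad code[MAX_QUADS];
    int count;                        /* quads emitted so far */
    int next_temp;                    /* unlimited supply of temporaries */
} QuadList;

/* Lower an AST to quads by a post-order walk. On success returns 0 and
   writes to `out` the name (or literal) that holds the node's value. */
int lower(const Node *n, QuadList *q, char out[16])
{
    char l[16], r[16];
    Quad *quad;

    if (n->kind == N_CONST) { snprintf(out, 16, "%d", n->value); return 0; }
    if (n->kind == N_VAR)   { snprintf(out, 16, "%c", n->name);  return 0; }

    if (lower(n->left, q, l) != 0 || lower(n->right, q, r) != 0)
        return -1;
    if (q->count >= MAX_QUADS)
        return -1;                    /* quad list is full */

    quad = &q->code[q->count++];
    quad->op = (n->kind == N_ADD) ? '+' : '*';
    strcpy(quad->arg1, l);
    strcpy(quad->arg2, r);
    snprintf(quad->result, sizeof quad->result, "t%d", q->next_temp++);
    strcpy(out, quad->result);
    return 0;
}

/* Emit each quad as one C statement, so that a C compiler can serve as
   the back end. Output is truncated if `buf` is too small. */
void emit_c(const QuadList *q, char *buf, size_t size)
{
    size_t used = 0;
    int i, n;

    if (size == 0)
        return;
    buf[0] = '\0';
    for (i = 0; i < q->count && used < size; i++) {
        const Quad *c = &q->code[i];
        n = snprintf(buf + used, size - used, "int %s = %s %c %s;\n",
                     c->result, c->arg1, c->op, c->arg2);
        if (n < 0)
            break;
        used += (size_t)n;
    }
}
```

For the expression b + c * 2, built as an N_ADD node whose right child
is an N_MUL node, lower() produces two quads, (*, c, 2, t0) and
(+, b, t0, t1), and emit_c() turns them into the statements
"int t0 = c * 2;" and "int t1 = b + t0;". A real front end would also
carry types and split the quad list into basic blocks; the sketch
leaves both out.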


Any of these topics strike your fancy?


Dwight

