Related articles |
---|
AST's, why are they necessary? Phil@rosl.demon.co.uk (Phil) (1998-08-30) |
Re: AST's, why are they necessary? thetick@magelang.com (Scott Stanchfield) (1998-08-31) |
Re: AST's, why are they necessary? jamz@my-dejanews.com (1998-08-31) |
Re: AST's, why are they necessary? friwi@prosun.first.gmd.de (1998-09-05) |
From: | friwi@prosun.first.gmd.de (F.W. Schroeer) |
Newsgroups: | comp.compilers |
Date: | 5 Sep 1998 01:18:50 -0400 |
Organization: | Compilers Central |
References: | 98-08-200 |
Keywords: | parse |
Phil Royall <Phil@rosl.demon.co.uk> wrote:
> I am writing translators to convert about 20 robot languages into our
> own proprietary language (a derivative of VB), and on advice I have
> been looking into using AST's. However, why would, in this case an
> AST give us an advantage over just hand coding the actions into
> fprintf statements (for example).
The Gentle compiler construction system has been used various times for
that purpose, the generated robot control software is in use at the
automobile industry (e.g at Volkswagen).
This methodology strongly recommends to separate issues
and to decompose the task of translation into subtasks:
* Discover the structure of the source program,
* Process this structure to generate the target program.
The abstract syntax representation serves as the interface between subtasks.
In the ancient days when syntax trees did not fit into 64K memory,
the structure of compilers (and even programming languages) was often
determined by the goal to process a program in a single pass.
Today's machines allow a compiler design in favour of productivity and
ease of maintenance.
There are many advantages of using abstract syntax.
* The compiler has a clean structure, independent subtasks are
described independently. There is a clear, declarative interface
between subtasks.
* Building an intermediate representation allows the reuse
of compiler components: You may exchange the frontend without touching
the backend, hence translating several source languages into the
same target language (which would be hard if parsing and code generation
would be intertwined). You may also provide different backends for the
same source language.
* Semantic analysis and code generation is based on the logical structure,
not on rules dictated by the parsing algorithm (e.g. by the need of
left factorization or conflict resolution).
* Abstract syntax allows normalization: different syntax with the same
meaning is handled only once (e.g A[i][j] and A[i,j] in Pascal).
* Abstract syntax allows simplification: in concrete syntax
you have to deal with operator precedence, abstract syntax has an
unambiguous structure.
* Modern languages (such as Java) allow the use of items that
are not yet declared at the point of application. Such forward references
are no problem in compilers that use an internal program representation.
Algorithms based on abstract syntax trees may visit nodes several times
and in any order. Even if you think that you can process the source
in one pass from left to right, it is likely that you will be confronted
with an unforeseen problem that cannot be solved in this way.
* Abstract syntax allows for automatically checking the consistency
of the translation. Errors are discovered when processing the specification.
Linguistic support for defining and processing abstract syntax includes:
* Type definitions
Abstract syntax describes the hierarchical decomposition of programs.
A given item is constructed according to one of of several alternatives,
an alternative is constructed by certain constituents. Hence abstract syntax
is best described by a system of (recursive) types, where a type is
defined by listing its alternatives and an alternative is defined by listing
its constituents. For example, a type STATEMENT could be defined by
assignment(VARIABLE, EXPRESSION)
while(EXPRESSION, STATEMENT)
/* ... */
A STATEMENT is an assignment, a while statement, ... .
An assignment is given by a VARIABLE and an EXPRESSION.
A while statement is given by an EXPRESSION and a STATEMENT.
* Constructors and Patterns
Elements of the abstract program may be denoted by terms. The term
assignment(V, E)
may be a constructor that builds a statement S from a variable V
and an expression E. It may also be considered as a pattern: matching
a statement S against this pattern succeeds if S is an assignment,
thereafter V and E denote the constituents of that assignment.
* Attribution of concrete syntax grammars
The concrete syntax of a language is described by a context-free grammar.
For example, a concrete assignment statement is described by
Stmt : Var ":=" Expr
The corresponding abstract syntax can be introduced by decorating this rule
with attributes:
Stmt(-> assign(V, E)) : Var(-> V) ":=" Expr(-> E)
which means that if Var has the abstract representation V and Expr is
represented by E then the whole phrase has the abstract syntax
assign(V, E).
* Rule-based traversal of abstract syntax
Traversal of abstract syntax (for semantic analysis or code generation)
may be expressed by rules (similar to grammar rules). Very often these rules
follow the structure of the abstract syntax types: there is a rule for each
alternative, a rule processes the constituents of the alternative.
Rules are selected by pattern matching. For example, the rule
StmntCode(assign(V, E)) : VarCode(V) ExprCode(E) STORE
expresses that the code for assign(V, E) is given by the code for V,
followed by the code for E, followed by the instruction STORE.
For simple source-to-source translation first-fit rule selection
is sufficient. For advanced code generation rules may be augmented with
cost values, and dynamic programming can be used for optimal rule selection.
A book presenting this approach is available online
(and also as PostScript file or HTML archive for downloading):
http://www.first.gmd.de/gccs/
Friedrich Wilhelm Schroeer
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.