Re: Compiler construction in ... and OOD

kalsow@src.dec.com (Bill Kalsow)
Thu, 4 Feb 1993 16:46:46 GMT

          From comp.compilers


Newsgroups: comp.compilers
From: kalsow@src.dec.com (Bill Kalsow)
Keywords: OOP, modula
Organization: DEC Systems Research Center
References: 93-02-032
Date: Thu, 4 Feb 1993 16:46:46 GMT

vivek@sequent.com (Vivek Buzruk) writes:
> But the same article gave me some thoughts about considering object
> oriented design of a compiler. Does anyone know about research done on
> this topic? OR any practical compilers using this methodology, and how
> they do it?


The SRC Modula-3 compiler uses objects throughout the front-end. Several
years ago I observed that the internals of most compilers are a mess and
wondered why. My conjecture is that two factors contribute:


    1) Serious compilers have a long lifetime. Several programmers
       hack on the source. The original authors are gone and their
       inspirations get muddled over time. Of course, any long-lived
       program will suffer these problems -- not just compilers.


    2) Most compilers expose a large, complicated data structure, usually
       a decorated syntax tree, with little or no access control or
       abstraction (e.g. GCC's tree.{def,h,c}).


I decided to attack the second problem. I started building a compiler
where the front-end data structures were hidden whenever possible. The
scanner is a single hand-written module. The parser is a simple recursive
descent parser that's distributed across many modules. There's a separate
module for each language construct: ARRAY type, FOR statement, +
expression, ... Each module is responsible for parsing, type checking,
answering queries, and emitting code for its construct. Of course, the
glue that makes it possible is that each parser yields a tree node that's
a subtype of one of the four general classes: Stmt, Type, Expr, Value.
Every statement has a "type-check" and a "compile" method, but only
ForStmt knows that its node has "index-var", "from", "to", "step" and
"body" components.
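
To make the organization above concrete, here is a minimal sketch in Python
rather than Modula-3. It is not taken from the SRC sources. The names
IntLit, TextLit, type_of, check_and_compile and the pseudo-instructions
("for_begin", "push", ...) are all invented for illustration, and parsing is
left out to keep it short. The point is only that clients see the general
Stmt interface, while the "index-var", "from", "to", "step" and "body"
components stay private to ForStmt.

```python
from __future__ import annotations
from abc import ABC, abstractmethod


class Expr(ABC):
    """General class for expressions; clients see only these methods."""

    @abstractmethod
    def type_of(self) -> str: ...

    @abstractmethod
    def compile(self, out: list[str]) -> None: ...


class Stmt(ABC):
    """General class for statements; clients see only these methods."""

    @abstractmethod
    def type_check(self) -> list[str]: ...

    @abstractmethod
    def compile(self, out: list[str]) -> None: ...


class IntLit(Expr):  # hypothetical helper node
    def __init__(self, value: int) -> None:
        self._value = value

    def type_of(self) -> str:
        return "INTEGER"

    def compile(self, out: list[str]) -> None:
        out.append(f"push {self._value}")


class TextLit(Expr):  # hypothetical helper node
    def __init__(self, value: str) -> None:
        self._value = value

    def type_of(self) -> str:
        return "TEXT"

    def compile(self, out: list[str]) -> None:
        out.append(f"push {self._value!r}")


class ForStmt(Stmt):
    """Only this class knows the shape of a FOR node."""

    def __init__(self, index_var: str, from_: Expr, to: Expr,
                 step: Expr, body: list[Stmt]) -> None:
        self._index_var = index_var
        self._from = from_
        self._to = to
        self._step = step
        self._body = body

    def type_check(self) -> list[str]:
        errors = []
        bounds = (("from", self._from), ("to", self._to), ("step", self._step))
        for name, expr in bounds:
            if expr.type_of() != "INTEGER":
                errors.append(
                    f"FOR: '{name}' must be INTEGER, got {expr.type_of()}")
        for stmt in self._body:  # nested statements check themselves
            errors.extend(stmt.type_check())
        return errors

    def compile(self, out: list[str]) -> None:
        # Invented pseudo-instructions, not a real back-end interface.
        out.append(f"for_begin {self._index_var}")
        for expr in (self._from, self._to, self._step):
            expr.compile(out)
        out.append("for_body")
        for stmt in self._body:
            stmt.compile(out)
        out.append("for_end")


def check_and_compile(stmt: Stmt) -> tuple[list[str], list[str]]:
    """A client: it works on any Stmt without looking inside the node."""
    errors = stmt.type_check()
    code: list[str] = []
    if not errors:
        stmt.compile(code)
    return errors, code
```

A driver such as check_and_compile never needs to change when a new
statement kind is added; the new module supplies its own type_check and
compile.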


I would say the results are mixed. Instead of a few large modules, there
are about a hundred little ones. What might be implemented as a CASE
statement with several labels sharing code turns into a method call with
less shared code. On the plus side, clients almost always report bugs in
terms of a language construct (e.g. "this IF statement doesn't work"), not
in terms of a traditional compiler organization (e.g. "I think the type
checker is broken"). The most telling advantage remains to be seen. How
well will the compiler survive its hackers over time?
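
The CASE-versus-method trade-off can be illustrated with a small sketch,
again in Python and again hypothetical: constant folding, the Node record,
AddExpr, SubExpr and check_range are made up for this example and do not
describe the SRC compiler. In the central style, two labels share one
branch and the range check is written once. In the per-construct style,
each class has its own method, and anything common has to be factored into
a helper that every method remembers to call.

```python
from dataclasses import dataclass

INT_MIN, INT_MAX = -2**31, 2**31 - 1  # assumed 32-bit INTEGER range


def check_range(value: int) -> int:
    """Common code: reject results outside the assumed INTEGER range."""
    if not INT_MIN <= value <= INT_MAX:
        raise OverflowError(f"constant {value} out of range")
    return value


# Style 1: one central routine, a CASE with labels sharing code.
@dataclass
class Node:
    kind: str
    left: int
    right: int


def fold_central(node: Node) -> int:
    if node.kind in ("ADD", "SUB"):  # two labels, one shared branch
        if node.kind == "ADD":
            raw = node.left + node.right
        else:
            raw = node.left - node.right
        return check_range(raw)  # written once for both labels
    raise ValueError(f"unknown node kind: {node.kind}")


# Style 2: one class per construct, dispatch by method call.
@dataclass
class AddExpr:
    left: int
    right: int

    def fold(self) -> int:
        return check_range(self.left + self.right)  # repeated call


@dataclass
class SubExpr:
    left: int
    right: int

    def fold(self) -> int:
        return check_range(self.left - self.right)  # repeated call
```

Both styles compute the same results; the difference is where the code
lives and how easy it is to forget the shared step when adding a new
construct.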


If you're interested, the SRC Modula-3 compiler and its sources are
available for public FTP in /pub/DEC/Modula-3/release/* on
gatekeeper.dec.com.


    - Bill Kalsow
--

