From: | kalsow@winternet.com (Bill Kalsow) |
Newsgroups: | comp.compilers |
Date: | 14 Jun 1996 16:11:17 -0400 |
Organization: | Critical Mass, Inc |
References: | 96-06-010 96-06-026 96-06-047 |
Keywords: | OOP |
: Oliver Plohmann wrote:
: You can even take this idea further by having a NonTerminal root
: class, a Statement class derived from NonTerminal, an IfStatement
: class derived from Statement, etc...
: You can then have semantical analysis and code generation methods
: defined as virtual (deferred) in the intermediate classes, and
: redefined in the leafs.
The SRC Modula-3 compiler (http://www.research.digital.com/SRC/modula-3/html/)
is constructed as you suggested. I built it that way as an experiment.
I was interested in seeing what could be done to make compilers (as
examples of long-lived, evolved, maintained, mutated, and often hacked
programs) more robust over their useful lives. In nearly every compiler
I've seen, the original author's inspiration and design is clouded or
destroyed as the program evolves. We can decide if my experiment was a
success or not about 5 years after I stop messing with the compiler
and someone else takes over...
My initial hypothesis was that compilers tended to chaos because they
contain a single primary data structure, the AST, which is exposed to
and mutated by the rest of the program. The SRC Modula-3 compiler uses
Modula-3's opaque types and partial revelations to hide the details of
the AST nodes in separate compilation units. Only the IfStmt module
has access to the inside of IF statement nodes. That module has access
to other Expr.T and Stmt.T nodes and methods, but it cannot see inside
them.
As pointed out by others, this design is not perfect. Method calls
are used where simple field accesses might suffice. Given that
processors get 1% faster each week and compiling is basically an O(1)
task on a tiny bit of source, this overhead seems quite tolerable.
Another problem with this design (also cited by other posters) is that
it's painful to make some modifications. For example, adding a parameter
to the "typecheck" method of every node requires editing ~100 modules.
In a more conventional design this change would touch only a few
modules. But, on the flip side, when you want to change the internal
representation of an IF statement only one module needs editing. In
a conventional compiler most of the front-end modules would be affected.
A surprising (tiny) benefit of this organization is that people's bug
reports often identify exactly which piece of the compiler is broken.
Bug reports are usually "when I write this IF statement it doesn't work".
It's rare to have someone say, "I think code generation of conditional
branches is broken". The value of this benefit increases as the
experience of the bug fixer decreases.
The SRC Modula-3 compiler does not include an optimizer. If it did,
I expect other weaknesses in the design would be exposed. I don't think
it would be difficult to implement high-level optimizations using
methods for the needed predicates and transformations. Low-level
stuff like instruction scheduling doesn't fit as well. But, by the
time you get to that stuff, the AST is largely irrelavent.
So far the experiment is a success. I still understand what's going on. :-)
- Bill
_____________________________________________________________________
Bill Kalsow E-mail: kalsow@cmass.com
Critical Mass, Inc. WWW: http://www.cmass.com/people/kalsow/
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.