|Time to Write a Compiler (Oberon) Chuck_Lins.SIAC_QMAIL@gateway.qm.apple.com (Chuck Lins) (1990-11-13)|
|Re: Time to Write a Compiler (Oberon) email@example.com (Rob Turner) (1990-11-15)|
|From:||Chuck Lins <Chuck_Lins.SIAC_QMAIL@gateway.qm.apple.com>|
|Date:||13 Nov 90 09:39:31|
I have been working on an Object Oberon compiler in my spare time. While
Object Oberon is a much simplier language than Fortran90, I have kept track
of the hours invested in this project. First some background and a brief
The front-end and the shell for a back-end was obtained from ETH Zurich at
the cost of 1000 Swiss francs. It contained a portable scanner, lexical
analyzer, parser, and abstract syntax tree builder. The front-end covers the
language Oberon. Object Oberon is a superset of Oberon.
The back-end shell consists of an abstract syntax tree traversal module and
stubs for calls to a lower-level code generator. The job of the compiler
writer is to fill in these stubs as well as adding a machine dependent
low-level code generator.
The compiler itself is written in Oberon. The target processor is the
Motorola MC68000 family. Target hardware is the Macintosh (tm). Target
programming environment is MPW (Macintosh Programmers Workshop).
I started with a copy of the MacOberon system from ETH. This is a standalone
application running on the Macintosh. It contains an Oberon compiler in
object form. Thus the first task was to write a cross-compiler running under
MacOberon but producing object code for MPW. Once this was done, I created
another copy of the compiler source code (based on the cross-compiler). This
version was modified to run under MPW and produce code for MPW. The main
task here was replacing calls to the Oberon environment/libraries with calls
to semantically equivalent or similar MPW libraries.
As of today (11/12/90) I have completed work on the cross-compiler and have a
working compiler running under MPW. The compiler compiles itself under MPW.
How Long it Took
To date, I have spent 399.5 hours on both the cross-compiler and the native
compiler. I started on July 24, 1990. After 286 hours of work, on Oct 15 I
got the native compiler to link successfully. After another 44 hours of
debugging I was able to start using the native compiler to compile itself.
35 more hours of debugging and the native compiler was able to compile
itself. Since then I've been using the native compiler to add enhancements
to itself as well as improve its code generation.
What Gave Me Problems
It took me quite a while to get the comparison and branching instructions
correct. Mapping the source language operands onto the hardware instructions
was very time-consuming. I remember spending quite a bit of my time in
MacsBug (the machine level debugger) single-stepping machine instructions
from a known 'good' point in the compiler. You had to do this when you
jumped off into space somewhere. I had one nasty bug for referencing VAR
parameters. Unfortunately, you'd sometimes overwrite part of the return
address (or destroy a pointer) and jump into the twilight zone. The good
part was that I found four other bugs while searching for this one.
Another source of errors was my failure to check for special cases of
variable addressing modes. Specifically things like VAR-parameters and
accessing parameters at outer nesting levels.
What Went Well
When I did the initial (paper) design, I reduced the 68000 addressing modes
to four categories: immediate operand, register value, memory operand, and
condition code. Then I created a table for each AST node (eg integer ADD,
logical AND, procedure CALL, etc). Each AST node has (at most) two operands
and one result operand. I then made N rows covering all the operands as
input and output values. I was influenced here by Horspool's ideas given in
"An Alternative to the Graham-Glanville Code-Generation Method", IEEE
Software, May 1987.
An example table for integer addition would be as follows (where
op = operator, res = result, l = left subtree, r = right subtree,
R = register, O = operand, I = immediate, and CC = condition code):
op res l r
+ R R O
+ R O O
+ R R R
+ R O R
+ R R I
+ R I R
Each row was then filled in with the assembly language source statements
necessary to produce the proper result. This was where I forgot the special
cases. The generic operand sometimes needed object code to produce a
simplier addressing mode. It was easy to fix once I recognised the problem
(even though I had all the code written for the code generator).
I think using trees worked out well. It was relatively easy to generate
correct code from them. It may not be the most highly optimized object code,
but it's not terrible. At present, the compiler compiles the largest of its
source modules (63K) in about 75 seconds.
Improvements remain in the area of array indexing on parameters. My current
code involves a multiply. I can take advantage of type information to
eliminate most of these. This should yield a good performance improvement.
Now I can concentrate on optimizations, adding the object oriented
extensions, and the other capabilities necessary for a production quality
Return to the
Search the comp.compilers archives again.