Coding effort for a simple translator

waite@scotty.cs.colorado.edu (William Waite )
24 Aug 1998 13:20:38 -0400

          From comp.compilers

Related articles
Coding effort for a simple translator waite@scotty.cs.colorado.edu (1998-08-24)
| List of all articles for this month |
From: waite@scotty.cs.colorado.edu (William Waite )
Newsgroups: comp.compilers
Date: 24 Aug 1998 13:20:38 -0400
Organization: University of Colorado, Boulder
Keywords: practice

As part of the support material for the second assignment in the compiler
course, I had built an abstract tree and interpreter along the lines given
by Appel in his book "Modern Compiler Construction in Java", using both
C++ and Eli. (For HTML, PostScript and PDL versions, see


http://ece-www.colorado.edu/~ecen4553//Reference/abstr/straight/
http://ece-www.colorado.edu/~ecen4553//Reference/abstr/straight.ps
http://ece-www.colorado.edu/~ecen4553//Reference/abstr/straight.pdl


repectively.)


This example was in the first chapter of Appel's book, and he didn't
want to get into scanning and parsing. Therefore he built the tree
simply by writing a hairy constructor call. I did the same, because I
wanted to concentrate on the abstraction, as he did.


The third assignment involves the structure of scanners and parsers,
so I decided to provide a document similar to the one for the second
assignment, but covering the scanning and parsing needed to produce
the abstraction described earlier. The students could then connect
the scanner and parser with the previous code to get a complete
compiler/interpreter.


I decided to write the code first, and then weave the document around
it. I first wrote the Eli specification, merged it with the previous
specification, and generated the system. Then I wrote C++, starting
with the Eli specification as a specification, and following the
prescription in "An Introduction to Compiler Construction", by Waite &
Carter (slightly modified because the language was C++ rather than
Pascal). The strategy described there is basically the hand-coded
version of what Eli generates, so I was using the same structure for
both implementations.


Here are the statistics:


Eli: 56 lines, 102 minutes (approximately 1.8 min/line)
C++: 444 lines, 594 minutes (approximately 1.3 min/line)


The times and lines are total, including all writing, testing and
debugging but no documentation except comments. In both cases, I
think that the code is reasonably readable. I've done similar
statistics on writing Eli specifications before, and 2 min/line seems
to be a remarkably consistent rate over a number of different types of
specifications. I was a bit surprised that I was able to write faster
in C++, since I seem to have incredibly frustrating sessions where I
make no progress at all. My debugging seems to consist largely of
putting #ifdef-#endif around some piece of code and then recompiling
-- desk checking is all but useless.


I broke the C++ implementation into scanning and parsing:


Scanner: 301 lines, 390 minutes (approximately 1.3 min/line)
Parser: 143 lines, 204 minutes (approximately 1.4 min/line)


I ran both implementations on my laptop under gcc-2.7.2:


snag% size trycpp
text data bss dec hex filename
123831 1532 24 125387 1e9cb trycpp


-> ast.specs :exe ! size
text data bss dec hex filename
56390 4108 560 61058 ee82 /.../ast.specs.2925553.exe


I suppose that the major difference is that Eli generates C, whereas
C++ using the standard template library is a well known source of code
bloat. I've not measured the speed, but I expect that the
Eli-generated version would win hands down because my C++ source
module uses string concatenation to build up a line one character at a
time (hey, I needed to get the thing running!)


The raw data says that using Eli I built a processor that less than
1/2 the size of the C++ version in less than 1/4 of the time. What
doesn't show is that the Eli version is more flexible and more robust:
The C++ version reads ONLY from standard input, the Eli version will
read either from standard input or from a specified file. The C++
version quits on the first syntax error, the Eli version recovers from
syntax errors and quits only when the error density exceeds a
threshhold. Both versions accept the same input language and produce
the same output.
[I'm not surprised you can write lines of C or C++ faster than lines of
Eli. C/C++ is low enough level that you end up writing the same old
stuff over and over again, and the 400th time you've written a for loop
that runs down a link list, the chances of getting that idiom right are
reasonably good. -John]




--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.