From: | Randall Hyde <rhyde@cs.ucr.edu> |
Newsgroups: | comp.compilers |
Date: | 31 Oct 2000 14:33:16 -0500 |
Organization: | Posted via Supernews, http://www.supernews.com |
References: | 00-10-061 00-10-067 00-10-093 00-10-109 00-10-130 00-10-193 00-10-209 |
Keywords: | parse, tools |
Posted-Date: | 31 Oct 2000 14:33:16 EST |
> I think that writing a compiler of that size in recursive descent would
> be a maintenance nightmare.
So was writing a compiler of that size with Flex/Bison :-)
I predict the C++ R.D. version would be half the size.
> ANTLR generates the recursive descent code
> for you from a grammar specification. However, as you painfully
> discovered, you will be married to the tool, for better or for worse.
ANTLR is at the top of my list of tools to consider. I do have a couple
of concerns I would like answers to though:
(1) I would prefer to write my lexer in assembly language (gasp!).
HLA spends the vast majority of its time in lexical analysis and
due to the interpreter I've integrated into the compiler (producing
what I call the "compile-time language") there is a very incestuous
relationship between the parser and lexer (the parser calls the lexer
which recursively calls the parser, which recursively calls the lexer,
etc.). I'm using this bit of craziness to support on-the-fly expansion
of HLA's macros and other features (a preprocessor will *not* work
here). So how hard is it to ignore ANTLR's lexer generator capabilities
and call one or more different lexers written in another language (e.g.,
assembly)?
(2) Is it easy to create and merge multiple parsers into the same compiler?
Remember, I need a compiler and an interpreter in the same package.
Merging the two grammars together, as I had to do with Flex/Bison
created a lot of problems for me (yes, I realize I could have created
two parsers with Bison, but I was never able to get some simple test
systems working to my satisfaction).
(3) Being written in Java concerns me. What is the performance like?
Granted, it's got to be better than running 100 KLOC through Bison
(which generally takes about a minute on my machine), but I would
really like better turn around on the compiles with the next version.
(4) How large a system can you compile with ANTLR? Although not using
Bison will allow me to reduce my grammar considerably, it's still a
large language. HLA has several hundred statements, each with it's
own syntactical variations. The original versions of FLEX and Bison
that I started with choked early on (first I ran out of table space,
then the unsigned shorts used as indexes into the tables started
overflowing, etc.). I had to modify Flex and Bison to handle the
large grammar I'd produced. Alas, I was left behind as newer versions
of Flex and Bison arrived because I didn't want to have to go in and
make all those changes to the newer versions of the tools. This was
a real problem with FLEX (you should see the ugly algorithm, involving
unget, I use to expand macros on the fly; a problem that was
eliminated in later versions of Flex that I could not use).
>
> I would also point out that a recursive program tends to be much slower
> than an iterative one. A few years back, I compared my table-driven C
> recognizer to the one provided with ANTLR. Mine was about three times
> faster when executed on the ANTLR source code itself.
Parsing consumes a tiny fraction of the time compared with scanning (at
least, in HLA). Even with the truly disgusting lexer I currently have,
compile times are usually quite reasonable. I'm not too worried about
the minor differences in the parse times. Besides, the only place recursion
really kicks in is within complex arithmetic expressions. By using a
bottom-up parser (say, operator precedence) here, you can speed this up
if it turns out to be a problem. Once I fix the problems with HLA's
lexer, performance will be very good.
>
> If you send me a non-flattened, actionless version of your grammar, I
> will run it through my parser-generator to see if the language is strong
> LL(k), for some reasonable value of k.
At various times I've had offers to fix the problems with my grammar.
However, HLA v1.x is a prototype system, it already works (well, sort
of...), and has served its purpose. The grammar issue is just one of
about a 1,000 different problems I need to address in the next version,
so I wouldn't want to waste anyone's time trying to improve the prototype.
My main concern at this point is "should I write it manually or should I
use a tool?" The obvious answer to someone not familiar with HLA is to
learn and use a better tool. As I've said, I'm considering ANTLR and
my browser has links to lots of other compiler tool pages.
However, when I first started writing HLA v1.0 I was *sure* Flex
and Bison were the proper tools. It took me a couple of years to
discover than they were not the appropriate tools to use. I realize
that hand-coding a parser makes the maintenance effort greater and that
using a tool will let me change things around much easier (once I master
the tool, that is). However, the whole purpose of v1.x was to determine
what I would really need in v2.x. While there will definitely be some
maintenance issues that come up, most if the "what-if" scenarios have
already taken place. I don't want to claim that v2.x is just a coding
project, but most of the major design issues are already settled.
That means that I have a pretty good idea of what needs to be done
if I write HLA in a language like C++ or object Pascal (or even HLA,
for that matter). I'm not at all sure what the issues will be if I
switch to a tool like ANTLR. This is a traditional software engineering
risk management problem: do I go with the tried and true mechanism that
is safe, or do I try to use a tool and hope that the tool is appropriate?
On the one hand, ANTLR (or some other tool) could save a lot of effort.
OTOH, the tool may turn out to be inappropriate and I get locked into
using the tool or waste a couple of years of my life and have to start
over using the 3rd generation language approach.
Whatever, the actual work on V2.0 is still about a year away, so
I've got lots of time to mull over this decision (and, perhaps,
play around with some tools in the meantime).
Randy Hyde
Return to the
comp.compilers page.
Search the
comp.compilers archives again.