Related articles |
---|
Is LALR(1) or LL(k) parser better for C++ kikonen@cs.joensuu.fi (1997-01-22) |
Re: Is LALR(1) or LL(k) parser better for C++ jlilley@empathy.com (John Lilley) (1997-01-22) |
Re: Is LALR(1) or LL(k) parser better for C++ dlmoore@ix.netcom.com (David L Moore) (1997-01-25) |
Re: Is LALR(1) or LL(k) parser better for C++ thetick@scruz.net (Scott Stanchfield) (1997-01-26) |
Re: Is LALR(1) or LL(k) parser better for C++ mrs@kithrup.com (1997-01-26) |
Re: Is LALR(1) or LL(k) parser better for C++ dlmoore@ix.netcom.com (David L Moore) (1997-01-29) |
From: | "Scott Stanchfield" <thetick@scruz.net> |
Newsgroups: | comp.compilers |
Date: | 26 Jan 1997 22:29:47 -0500 |
Organization: | scruz-net |
References: | 97-01-163 97-01-181 97-01-206 |
Keywords: | yacc, debug |
> Several issues stand out. First, it is much harder to debug LALR
> grammars. You can get your average debugger to track through
> semantic code embedded in the grammar, but you cannot (at least I
> cannot) get it to print out those variables with names like
> $$. Also, single stepping will insist on stepping through the parser
> code as well. Some of this is a feature of YACC rather than LALR
> parsing in general.
Just a note here -- the "$$ variables" in yacc actually get turned into
references like yypvt[0]. Depends on the yacc you're using. In some, the
$1, $2, $3 are represented by (if you just reduced something like a: b c d)
yypvt[0] -- most recent ($3)
yypvt[-1] -- next back ($2)
yypvt[-2] -- next back from that ($1)
and so on. Basically, yypvt is a pointer into the stack where all the
fun happens. But beware -- this can change based on the yacc you're
using...
That aside, I still think table-driven parsers (whether they are LL,
LR or LALR) are just plain hard to debug. Recursive-descent is much
easier and it's really possible to see a correlation between the code
and the grammar.
> Second, control of parsing in those situations where a symbol must
> be (for example), a type-id in one context or an undefined-id in
> another are hard to handle. Either you have weird productions like
> <var_decl> = <type_id> <type_id>; or you have globals flags flying
> back and forth between the lexer and parser, neither of which is a
> great solution.
This is handled well by a parser-generator such as PCCTS, where you
can add semantic predicates to choose alternatives based on a
condition. Rather than set up a test in the lexer and return
different tokens, you return a single type of token in either case and
predicate the grammar to choose rules based on the token AND a
condition. Glad to see someone who is as dead against parser-->lexer
backtalk as I...
> Third, I see ambiguities on productions that I think should not be
> ambiguous in a full LR grammar. I might be wrong about this since I
> have not tried to run the grammar through a full LR parser nor have
> I sat down and tried to grind out the states by hand but my
> suspicion is that the power you lose with the LALR simplification is
> significant.
> We shall probably replace this parser with a recursive descent
> parser at some time in the fairly near future ("real soon now"). If
> we proceed, it will be interesting to compare the ease of producing
> the two parsers.
Have a look at PCCTS -- look at
http://java.magelang.com/antlr/entry.html
for some great starting info on it. PCCTS (ANTLR/DLG) generates
recursive-descent parsers that are really cool. I also have a tutorial in
progress
http://www.scruz.net/~thetick/pcctstut
As for C++ with PCCTS, see John Lilley's work (via the entry page above).
--
Scott Stanchfield
Santa Cruz, CA
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.