Re: Is LALR(1) or LL(k) parser better for C++

David L Moore <dlmoore@ix.netcom.com>
25 Jan 1997 22:16:07 -0500

          From comp.compilers

Related articles
Is LALR(1) or LL(k) parser better for C++ kikonen@cs.joensuu.fi (1997-01-22)
Re: Is LALR(1) or LL(k) parser better for C++ jlilley@empathy.com (John Lilley) (1997-01-22)
Re: Is LALR(1) or LL(k) parser better for C++ dlmoore@ix.netcom.com (David L Moore) (1997-01-25)
Re: Is LALR(1) or LL(k) parser better for C++ thetick@scruz.net (Scott Stanchfield) (1997-01-26)
Re: Is LALR(1) or LL(k) parser better for C++ mrs@kithrup.com (1997-01-26)
Re: Is LALR(1) or LL(k) parser better for C++ dlmoore@ix.netcom.com (David L Moore) (1997-01-29)
| List of all articles for this month |
From: David L Moore <dlmoore@ix.netcom.com>
Newsgroups: comp.compilers
Date: 25 Jan 1997 22:16:07 -0500
Organization: Netcom
References: 97-01-163 97-01-181
Keywords: C++, parse

Kari Ikonen wrote:
> Is LALR(1) or LL(k) based parser better for parsing C++ code?
> Which one of these is easier to handle?


John Lilley wrote:
> 2) What paradigm are you more comfortable with? I find LL to be much
> easier to grasp and debug than LALR, but that's because LALR makes my
> brain hurt. Someone smarter than I may find LALR easier to work with,
> because fewer ambiguities will be encountered. But an LALR(1)
> parser-generator with no backtracking may prove to be unworkable.


I've been enhancing an LALR (YACC) grammar for a (relatively small)
subset of C++. Enhancing it was often a considerable pain. I inherited
the original parser so one might be able to do better by carefully
crafting a grammar oneself.


I have written parsers for other languages, including Pascal and
Modula-2 in the past using recursive descent. A Pascal Recursive
Descent parser is pretty much a no-brainer - you just generate your
first sets and then follow the rail-road tracks. You can probably
manage without actually generating the first sets.


The C++ subset was the first time I had used YACC for a serious
grammar.


Several issues stand out. First, it is much harder to debug LALR
grammars. You can get your average debugger to track through semantic
code embedded in the grammar, but you cannot (at least I cannot) get
it to print out those variables with names like $$. Also, single
stepping will insist on stepping through the parser code as well. Some
of this is a feature of YACC rather than LALR parsing in general.


Second, control of parsing in those situations where a symbol must be
(for example), a type-id in one context or an undefined-id in another
are hard to handle. Either you have weird productions like <var_decl>
= <type_id> <type_id>; or you have globals flags flying back and forth
between the lexer and parser, neither of which is a great solution.


Third, I see ambiguities on productions that I think should not be
ambiguous in a full LR grammar. I might be wrong about this since I
have not tried to run the grammar through a full LR parser nor have I
sat down and tried to grind out the states by hand but my suspicion is
that the power you lose with the LALR simplification is significant.


We shall probably replace this parser with a recursive descent parser
at some time in the fairly near future ("real soon now"). If we
proceed, it will be interesting to compare the ease of producing the
two parsers.
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.