From: wclodius@los-alamos.net (William Clodius)
Newsgroups: comp.compilers
Date: Fri, 11 Apr 2008 22:09:16 -0600
Organization: Compilers Central
References: 08-03-107 08-03-119 08-04-024
Keywords: tools, parse, C++
Posted-Date: 12 Apr 2008 01:38:58 EDT

Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
><snip>
> In particular, while I know of several tools for instruction
> selection using tree parsing, none of them seems to be widely-used;
> many compilers use hand-written instruction selectors, and of those
> where I have heard that they use generated tree-parsing instruction
> selectors, the generator was developed or extended in-house.
>
> One explanation I have heard is that the compiler writers don't like
> to make themselves dependent on a tool that may go away. OTOH, gcc
> reverted from using bison-generated parsers to hand-written ones (at
> least for C++ and C), and I very much doubt that the future of bison
> was the reason for that.
> <snip>

If I remember correctly, they had two problems with the bison-generated
parsers: poor error reporting and a general mismatch between the
grammar being parsed and the capabilities of the tools.

The minimal grammars needed to define syntax compatible with LR
parsing leave little context for determining the cause of a syntax
error in user code. An LL(k) grammar provides sufficient context for
detailed error reporting, and is compatible with an LR(k) parser. But
for an LR parser generator to use this detail in aiding language
implementers, it would need to add a switch to: recognize when a given
grammar is LR but not LL(k); tell users the problems that prevent the
grammar from being LL(k); and add additional hooks to allow the
grammar developer to incorporate the error reporting. These additions
detract from the compact code size and high processing speed of an LR
parser, and I am unaware of any such parser that has incorporated
sufficient error reporting. Further, some languages that can be
expressed using LR-compatible grammars cannot be rewritten in LL(k)
form. (Note, however, that the LR-compatible grammar may generate a
language that is a subset of a language that is LL compatible, and the
LL-compatible language may be useful in recognizing common coding
errors.) Parser generator developers have on the whole decided that an
LR parser generator should be an LR parser generator, and that if
language developers want to give up the speed and flexibility of an LR
grammar for the error reporting of an LL(k) grammar, then they should
use an LL(k) parser generator such as ANTLR.
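
To illustrate the point about context, here is a minimal sketch of a
top-down (recursive descent) parser for a toy "if" statement. It is
not taken from GCC or from any parser generator; the grammar, the
Parser class, and the helper names are all hypothetical. The only
point is that each parsing function knows which construct it is in the
middle of, so it can say so in the error message.

```python
class ParseError(Exception):
    """Raised with a message that names the construct being parsed."""


def tokenize(text):
    # Crude tokenizer: pad parentheses with spaces, then split on whitespace.
    return text.replace("(", " ( ").replace(")", " ) ").split()


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return "<end of input>"

    def expect(self, token, context):
        # The caller supplies the context, i.e. what it was trying to parse.
        found = self.peek()
        if found != token:
            raise ParseError(f"expected '{token}' {context}, found '{found}'")
        self.pos += 1

    def parse_name(self, context):
        found = self.peek()
        if not found.isidentifier():
            raise ParseError(f"expected a name {context}, found '{found}'")
        self.pos += 1
        return found

    def parse_if(self):
        # Toy rule (an assumption for this sketch):
        #   if_stmt -> 'if' '(' NAME ')' NAME
        self.expect("if", "at the start of an if statement")
        self.expect("(", "after 'if'")
        cond = self.parse_name("as the condition of the if statement")
        self.expect(")", "to close the condition of the if statement")
        body = self.parse_name("as the body of the if statement")
        return ("if", cond, body)


def parse_if_statement(text):
    return Parser(tokenize(text)).parse_if()
```

Calling parse_if_statement("if (x y") raises a ParseError saying that
a ')' was expected to close the condition of the if statement. A
table-driven LR parser, at the same point, typically knows only the
current state and the set of acceptable tokens.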

The other problem is that C++ (and to a lesser extent C) does not have
a true LR grammar (let alone an LL(k) one). Awkward hacks are required
to deal with the context dependencies in C++ using an LR generator.
These context dependencies are easily handled in a recursive descent
parser.
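
A well-known example of such a context dependency in C and C++ is that
"T * x ;" is a declaration of a pointer if T names a type, and a
multiplication otherwise. The sketch below is a deliberately tiny
illustration, not how GCC does it: the function names and the
whitespace-separated token format are assumptions made for the
example. It shows a hand-written parser consulting its own table of
type names at the point where the decision has to be made.

```python
def classify_statement(tokens, type_names):
    """Classify 'A * B ;' as a declaration or an expression.

    type_names is the set of identifiers currently known to name types.
    """
    if len(tokens) == 4 and tokens[1] == "*" and tokens[3] == ";":
        if tokens[0] in type_names:
            # 'T * x ;' with T a type: declare x as pointer to T.
            return ("declaration", {"type": tokens[0] + "*", "name": tokens[2]})
        # Otherwise the same tokens are a multiplication expression.
        return ("expression", ("multiply", tokens[0], tokens[2]))
    raise ValueError("only 'A * B ;' is handled in this sketch")


def parse_program(lines):
    """Parse a list of whitespace-separated statements, one per line."""
    type_names = set()  # the context: grows as typedefs are seen
    results = []
    for line in lines:
        tokens = line.split()
        if len(tokens) == 4 and tokens[0] == "typedef" and tokens[3] == ";":
            # 'typedef <existing type> <new name> ;' introduces a type name.
            type_names.add(tokens[2])
            results.append(("typedef", tokens[2]))
        else:
            results.append(classify_statement(tokens, type_names))
    return results
```

With this sketch, parse_program(["typedef int T ;", "T * x ;"])
classifies the second statement as a declaration, while
parse_program(["T * x ;"]) alone classifies it as an expression. With
a generated LR parser, the usual workaround is to feed the same
information back into the lexer so that it returns a different token
for type names.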
[My understanding is that GCC switched to a hand-written parser
because of the difficulty of parsing the awful C++ grammar with
anything other than hand-written hacks. The new parser may be a
little faster but that wasn't a big issue, since parse time is never a
bottleneck in a compiler. -John]