Related articles |
---|
[7 earlier articles] |
Re: Bottom-up versus Top-down henry@zoo.toronto.edu (Henry Spencer) (1997-12-02) |
Re: Bottom-up versus Top-down thetick@magelang.com (Scott Stanchfield) (1997-12-02) |
Re: Bottom-up versus Top-down dwight@pentasoft.com (1997-12-02) |
Re: Bottom-up versus Top-down neitzel@gaertner.de (1997-12-05) |
Re: Bottom-up versus Top-down jmccarty@sun1307.spd.dsccc.com (1997-12-05) |
Re: Bottom-up versus Top-down sperber@informatik.uni-tuebingen.de (1997-12-07) |
Re: Bottom-up versus Top-down henry@zoo.toronto.edu (Henry Spencer) (1997-12-07) |
Re: Bottom-up versus Top-down bromage@cs.mu.oz.au (1997-12-07) |
Re: Bottom-up versus Top-down jeffj@csm.astate.edu (Jeff Jenness) (1997-12-10) |
Re: Bottom-up versus Top-down thetick@magelang.com (Scott Stanchfield) (1997-12-10) |
Re: Bottom-up versus Top-down torbenm@diku.dk (1997-12-12) |
From: | Henry Spencer <henry@zoo.toronto.edu> |
Newsgroups: | comp.compilers |
Date: | 7 Dec 1997 22:09:09 -0500 |
Organization: | SP Systems, Toronto |
References: | 97-11-123 97-11-155 97-11-178 97-12-024 |
Keywords: | parse, performance |
Mike McCarty <jmccarty@sun1307.spd.dsccc.com> wrote:
>I don't know where this bit of misinformation came from, but it is EASY
>to get an LL grammar for C. Just take the one in the ANSI standard, and
>remove left recursion, replacing it with right recursion.
When I did it, it wasn't quite that simple, alas. There are a few
places in the C grammar where one ends up fudging because two
constructs start out rather similarly.
In particular, if your program begins:
static unsigned long int *foo(blah blah, foo bar, grunt grunt)
where "blah blah" etc. are messy type names and declarators, you
*still* don't know whether this is the beginning of a function's
new-style definition or almost all of a prototyped declaration of it,
not until you see the next token after the closing parenthesis.
There is a similar problem in the declarators of a function prototype,
where you don't know whether it's going to be an abstract declarator
or a normal one until you find out whether there's an identifier
inside it or not.
If t is a typedefed name, a straightforward transform of the grammar
will read "short t;" as two declaration specifiers, when in fact it is
a declaration specifier plus an identifier being declared. Similar
problems occur if a typedefed name is reused as a label in the first
statement of a compound statement -- fudging is necessary to make sure
it gets read as a label rather than the start of a declaration.
It's not fundamentally difficult to derive an LL grammar for C, but
the result is not just the ANSI C grammar with the recursions
reversed. Some careful rewriting is needed here and there.
--
| Henry Spencer
| henry@zoo.toronto.edu
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.