Re: Bottom-up versus Top-down

Henry Spencer <henry@zoo.toronto.edu>
7 Dec 1997 22:09:09 -0500

          From comp.compilers

Related articles
[7 earlier articles]
Re: Bottom-up versus Top-down henry@zoo.toronto.edu (Henry Spencer) (1997-12-02)
Re: Bottom-up versus Top-down thetick@magelang.com (Scott Stanchfield) (1997-12-02)
Re: Bottom-up versus Top-down dwight@pentasoft.com (1997-12-02)
Re: Bottom-up versus Top-down neitzel@gaertner.de (1997-12-05)
Re: Bottom-up versus Top-down jmccarty@sun1307.spd.dsccc.com (1997-12-05)
Re: Bottom-up versus Top-down sperber@informatik.uni-tuebingen.de (1997-12-07)
Re: Bottom-up versus Top-down henry@zoo.toronto.edu (Henry Spencer) (1997-12-07)
Re: Bottom-up versus Top-down bromage@cs.mu.oz.au (1997-12-07)
Re: Bottom-up versus Top-down jeffj@csm.astate.edu (Jeff Jenness) (1997-12-10)
Re: Bottom-up versus Top-down thetick@magelang.com (Scott Stanchfield) (1997-12-10)
Re: Bottom-up versus Top-down torbenm@diku.dk (1997-12-12)
| List of all articles for this month |

From: Henry Spencer <henry@zoo.toronto.edu>
Newsgroups: comp.compilers
Date: 7 Dec 1997 22:09:09 -0500
Organization: SP Systems, Toronto
References: 97-11-123 97-11-155 97-11-178 97-12-024
Keywords: parse, performance

Mike McCarty <jmccarty@sun1307.spd.dsccc.com> wrote:
>I don't know where this bit of misinformation came from, but it is EASY
>to get an LL grammar for C. Just take the one in the ANSI standard, and
>remove left recursion, replacing it with right recursion.


When I did it, it wasn't quite that simple, alas. There are a few
places in the C grammar where one ends up fudging because two
constructs start out rather similarly.


In particular, if your program begins:


static unsigned long int *foo(blah blah, foo bar, grunt grunt)


where "blah blah" etc. are messy type names and declarators, you
*still* don't know whether this is the beginning of a function's
new-style definition or almost all of a prototyped declaration of it,
not until you see the next token after the closing parenthesis.


There is a similar problem in the declarators of a function prototype,
where you don't know whether it's going to be an abstract declarator
or a normal one until you find out whether there's an identifier
inside it or not.


If t is a typedefed name, a straightforward transform of the grammar
will read "short t;" as two declaration specifiers, when in fact it is
a declaration specifier plus an identifier being declared. Similar
problems occur if a typedefed name is reused as a label in the first
statement of a compound statement -- fudging is necessary to make sure
it gets read as a label rather than the start of a declaration.


It's not fundamentally difficult to derive an LL grammar for C, but
the result is not just the ANSI C grammar with the recursions
reversed. Some careful rewriting is needed here and there.
--
| Henry Spencer
| henry@zoo.toronto.edu
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.