Related articles |
---|
Handling the typedef problem with a modifiable grammar bburshte@pyrps5.eng.pyramid.com (1992-01-09) |
Parsing C typedefs drw@math.mit.edu (1992-01-15) |
Newsgroups: | comp.compilers |
From: | drw@math.mit.edu (Dale R. Worley) |
Keywords: | C, parse, yacc |
Organization: | MIT Dept. of Tetrapilotomy, Cambridge, MA, USA |
References: | 92-01-043 |
Date: | Wed, 15 Jan 1992 18:14:39 GMT |
[To the moderator: Even though the topic has been closed, I hope that
this article will be accepted anyway, since it describes how these
problems were solved in a production compiler that handled all the
tricky cases of ANSI C.]
[This is the absolute definite last word on parsing typedefs. Really. -John]
I worked on the parser for the Compass C compiler. The solution we used
to the typedef vs. identifier problem was to have the scanner return two
different token types, based on the symbol table. In order to make this
work correctly, a few things have to be done carefully:
An identifier has to be inserted into the symbol table when the
punctuation following its declarator (',', ';', or '=') is read. This
is not particularly tricky, since an LALR(1) parser never has to look
ahead at that point, and our parser looks ahead only when needed.
Making sure that tricky declarations like
typedef int foo;
foo;
foo x;
volatile foo;
int foo;
are handled correctly is done by having the grammar of declarations
enforce the rule that "if a typedef name is used as a declarator, the
declaration-specifiers of the declaration must contain a type-specifier".
Thus, if "foo" is a typedef, then
volatile foo;
is an empty declaration (and is illegal), while
int foo;
is a redeclaration of "foo" as an integer variable. To do this, the
grammar must keep track of whether a type-specifier has been seen in the
declaration-specifiers as it reads the next token and tries to decide
whether it is another declaration-specifier or a declarator. Thus you
have grammar rules like:
declaration-specifiers-without-type-specifier ::=
declaration-specifiers-without-type-specifier type-modifier
declaration-specifiers-with-type-specifier ::=
declaration-specifiers-with-type-specifier type-modifier
| declaration-specifiers-without-type-specifier type-specifier
| declaration-specifiers-with-type-specifier type-specifier
declaration ::=
declaration-specifiers-without-type-specifier
non-typedef-declarator
| declaration-specifiers-with-type-specifier
non-typedef-declarator
| declaration-specifiers-with-type-specifier
typedef-declarator
If the grammar is aware of these distinctions, it can be made LALR(1).
This multiples the grammar rules, of course, but this is not the only
place that a practical C grammar has several clones of a single rule in
the Standard's grammar.
Dale Worley Dept. of Math., MIT drw@math.mit.edu
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.