|Handling the typedef problem with a modifiable grammar firstname.lastname@example.org (1992-01-09)|
|Parsing C typedefs email@example.com (1992-01-15)|
|From:||firstname.lastname@example.org (Dale R. Worley)|
|Keywords:||C, parse, yacc|
|Organization:||MIT Dept. of Tetrapilotomy, Cambridge, MA, USA|
|Date:||Wed, 15 Jan 1992 18:14:39 GMT|
[To the moderator: Even though the topic has been closed, I hope that
this article will be accepted anyway, since it describes how these
problems were solved in a production compiler that handled all the
tricky cases of ANSI C.]
[This is the absolute definite last word on parsing typedefs. Really. -John]
I worked on the parser for the Compass C compiler. The solution we used
to the typedef vs. identifier problem was to have the scanner return two
different token types, based on the symbol table. In order to make this
work correctly, a few things have to be done carefully:
An identifier has to be inserted into the symbol table when the
punctuation following its declarator (',', ';', or '=') is read. This
is not particularly tricky, since an LALR(1) parser never has to look
ahead at that point, and our parser looks ahead only when needed.
Making sure that tricky declarations like
typedef int foo;
are handled correctly is done by having the grammar of declarations
enforce the rule that "if a typedef name is used as a declarator, the
declaration-specifiers of the declaration must contain a type-specifier".
Thus, if "foo" is a typedef, then
is an empty declaration (and is illegal), while
is a redeclaration of "foo" as an integer variable. To do this, the
grammar must keep track of whether a type-specifier has been seen in the
declaration-specifiers as it reads the next token and tries to decide
whether it is another declaration-specifier or a declarator. Thus you
have grammar rules like:
| declaration-specifiers-without-type-specifier type-specifier
| declaration-specifiers-with-type-specifier type-specifier
If the grammar is aware of these distinctions, it can be made LALR(1).
This multiples the grammar rules, of course, but this is not the only
place that a practical C grammar has several clones of a single rule in
the Standard's grammar.
Dale Worley Dept. of Math., MIT email@example.com
Return to the
Search the comp.compilers archives again.