Handing typedefs in yacc generated parsers email@example.com (Dibyendu Majumdar) (1999-01-11)
Re: Handing typedefs in yacc generated parsers firstname.lastname@example.org (1999-01-15)
Re: Handing typedefs in yacc generated parsers email@example.com (Dibyendu Majumdar) (1999-01-19)
From: Dibyendu Majumdar <firstname.lastname@example.org>
Date: 11 Jan 1999 14:38:26 -0500
I would appreciate your advice on the following:
I am working on the UPS C Interpreter - fixing bugs and improving
compliance to the C Standard. One area where the interpreter was weak
was in the handling of typedefs. I have made changes which seem to
work - but am not sure if my way of handling it was correct.
The original implementation used a simple lookup function to
distinguish between IDENTIFIER and TYPEDEF_NAME. The parser provided a
function for this purpose - the lexical analyzer called the function
when it encountered an IDENTIFIER. If the lookup function found
that the name was a typedef name, it returned TYPEDEF_NAME - and
that's what the lexer returned as the token.
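As a rough illustration of this original scheme, the sketch below shows a lexer hook that asks a parser-side lookup function how to classify a name. The names name_type, declare_typedef and classify_identifier, the fixed-size table, and the token values are all assumptions made for the example; they are not taken from the UPS sources.

```c
#include <string.h>

/* Token codes are placeholders; a real build would take them from the
   yacc-generated header. */
enum token { IDENTIFIER = 258, TYPEDEF_NAME = 259 };

/* Hypothetical stand-in for the parser's symbol table: the names that
   are currently declared with typedef. */
#define MAX_TYPEDEFS 16
static const char *typedef_names[MAX_TYPEDEFS];
static int n_typedef_names = 0;

/* Hypothetical helper: the parser would call this when it has parsed a
   typedef declaration. Names beyond the table size are ignored. */
void declare_typedef(const char *name)
{
    if (n_typedef_names < MAX_TYPEDEFS)
        typedef_names[n_typedef_names++] = name;
}

/* Lookup function provided by the parser (name assumed). */
enum token name_type(const char *name)
{
    int i;

    for (i = 0; i < n_typedef_names; i++)
        if (strcmp(typedef_names[i], name) == 0)
            return TYPEDEF_NAME;
    return IDENTIFIER;
}

/* Lexer side: called once an identifier has been scanned. Whatever the
   lookup says is returned as the token, with no regard to context. */
enum token classify_identifier(const char *text)
{
    return name_type(text);
}
```

Because classify_identifier ignores context, every later occurrence of a typedef'd name comes back as TYPEDEF_NAME.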
The problem with this approach was that a TYPEDEF_NAME could appear
anywhere an IDENTIFIER was expected, causing the yacc parser (grammar
based on K&R2) to fail.
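The fragment below is an illustrative example of the kind of input that shows the problem; it is not taken from the UPS sources or test suite. Every use of T after the first line is legal Standard C, yet a lexer that always reports T as TYPEDEF_NAME would hand the parser the wrong token in each case.

```c
typedef int T;              /* T is a typedef name at file scope */

struct s { int T; };        /* member names have their own name space */

int f(void)
{
    struct s v;

    v.T = 1;                /* after '.', T is a member name */
    {
        int T = 2;          /* 'int' was already seen, so T is a declarator */
        v.T += T;           /* in this block T is an ordinary variable */
    }
    goto T;                 /* after 'goto', T is a label name */
T:
    return v.T;
}
```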
I have solved this problem by adding context sensitivity to the
lexer. I did this as follows:
1. I added two flags to the lexer, in_decl_specifier and
seen_type_specifier.
When the lexer encounters the keywords STRUCT, UNION, or
ENUM, it sets both flags to TRUE.
When the lexer encounters either VOID, CHAR, SHORT, INT,
LONG, FLOAT, DOUBLE, SIGNED or UNSIGNED, it sets both flags
to TRUE.
When the lexer encounters either STATIC, AUTO, REGISTER, EXTERN,
TYPEDEF, CONST or VOLATILE, and the flag in_decl_specifier
is FALSE, it sets seen_type_specifier to FALSE (just to be sure)
and in_decl_specifier to TRUE. Otherwise it does nothing.
When an IDENTIFIER is found, the lexer first calls the
parser function described before. If the parser function
identifies a TYPEDEF_NAME, then the lexer does one of the following:
* If the previous token was GOTO, DOT (.) or
ARROW (->), it returns IDENTIFIER instead of TYPEDEF_NAME.
* Else, if the in_decl_specifier flag is TRUE, the action
taken is one of the following. If seen_type_specifier is also TRUE,
it returns IDENTIFIER, otherwise it sets seen_type_specifier
to TRUE and returns TYPEDEF_NAME.
* Else, if the next token is COLON (:) and the previous token was
either RBRACE (}) or SEMI (;), it calls a parser function
called ci_label_allowed() (described later)
to determine if labels are allowed. If not, it returns
TYPEDEF_NAME, otherwise, IDENTIFIER is returned.
* If none of the above match, the flags in_decl_specifier and
seen_type_specifier are set to TRUE, and TYPEDEF_NAME is
returned.
The lexer resets the flags in_decl_specifier and
seen_type_specifier when it encounters any token not
allowed in a declaration specifier (including IDENTIFIER).
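The rules in this item can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not the UPS code: the token codes, the kw_class grouping, the lex_state struct, and the function names note_keyword and resolve_typedef_name are invented for the example. ci_label_allowed() is named in the text, but its body here is only a stub, and the treatment of the basic type keywords (both flags set to TRUE) is an assumption.

```c
#include <stdbool.h>

/* Placeholder token codes; a real lexer would use the yacc-generated ones. */
enum token { IDENTIFIER, TYPEDEF_NAME, GOTO, DOT, ARROW,
             COLON, RBRACE, SEMI, OTHER };

/* Keyword groups as listed in the text (grouping names are hypothetical). */
enum kw_class {
    KW_TAG,          /* STRUCT, UNION, ENUM */
    KW_TYPE,         /* VOID, CHAR, SHORT, ..., UNSIGNED */
    KW_STORAGE_QUAL  /* STATIC, AUTO, REGISTER, EXTERN, TYPEDEF, CONST, VOLATILE */
};

struct lex_state {
    bool in_decl_specifier;
    bool seen_type_specifier;
};

/* Stub for the parser's ci_label_allowed(); the real one consults parser state. */
bool labels_allowed = true;
bool ci_label_allowed(void) { return labels_allowed; }

/* Flag updates when a declaration-specifier keyword is scanned. */
void note_keyword(struct lex_state *st, enum kw_class k)
{
    if (k == KW_TAG || k == KW_TYPE) {
        /* Assumption: basic type keywords behave like struct/union/enum. */
        st->in_decl_specifier = true;
        st->seen_type_specifier = true;
    } else if (!st->in_decl_specifier) {
        st->seen_type_specifier = false;
        st->in_decl_specifier = true;
    }
}

/* Called only when the parser lookup has said the name is a typedef name.
   prev and next are the surrounding tokens. */
enum token resolve_typedef_name(struct lex_state *st,
                                enum token prev, enum token next)
{
    enum token result;

    if (prev == GOTO || prev == DOT || prev == ARROW) {
        result = IDENTIFIER;
    } else if (st->in_decl_specifier) {
        if (st->seen_type_specifier) {
            result = IDENTIFIER;          /* a type was seen: declarator */
        } else {
            st->seen_type_specifier = true;
            result = TYPEDEF_NAME;
        }
    } else if (next == COLON && (prev == RBRACE || prev == SEMI)) {
        /* Label or bitfield: flags left untouched, as the text does not
           say what happens to them here. */
        result = ci_label_allowed() ? IDENTIFIER : TYPEDEF_NAME;
    } else {
        st->in_decl_specifier = true;
        st->seen_type_specifier = true;
        result = TYPEDEF_NAME;
    }

    if (result == IDENTIFIER) {
        /* IDENTIFIER is not allowed in a declaration specifier: reset. */
        st->in_decl_specifier = false;
        st->seen_type_specifier = false;
    }
    return result;
}
```

For a declaration such as "T T;" where T is a typedef name, the first call sets both flags and returns TYPEDEF_NAME, and the second call sees both flags set and returns IDENTIFIER.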
2. I added two flags to the parser as well. These flags are
set while parsing a) enum constants, and b) struct/union members.
The parser typedef lookup function tests the first flag. If the
flag is set it does not look up the name at all, and returns
IDENTIFIER straight away. (If the name was already defined
as a TYPEDEF_NAME, the redefinition is reported by the
parser later during semantic analysis).
The second flag is used by the lexer to determine if Labels
are allowed (see previous section) when it sees a construct
that looks like either a label or a bitfield.
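The parser-side half might look roughly like this. The flag names parsing_enum_constants and parsing_struct_members, the function name name_type, and the toy is_typedef_name table are assumptions for the example. Only ci_label_allowed() is named in the text, and its body here is a guess at the simplest behaviour consistent with the description (inside struct/union members, "name :" is taken to be a bitfield rather than a label).

```c
#include <stdbool.h>
#include <string.h>

/* Placeholder token codes. */
enum token { IDENTIFIER, TYPEDEF_NAME };

/* Hypothetical parser flags, set and cleared by grammar actions. */
bool parsing_enum_constants = false;   /* inside enum { ... } */
bool parsing_struct_members = false;   /* inside struct/union { ... } */

/* Toy symbol table: pretend only "T" has been declared with typedef. */
static bool is_typedef_name(const char *name)
{
    return strcmp(name, "T") == 0;
}

/* Typedef lookup called by the lexer. While enum constants are being
   parsed no lookup is done; a clash with an existing typedef name is
   left for semantic analysis to report. */
enum token name_type(const char *name)
{
    if (parsing_enum_constants)
        return IDENTIFIER;
    return is_typedef_name(name) ? TYPEDEF_NAME : IDENTIFIER;
}

/* Assumed behaviour: labels are not allowed among struct/union members. */
bool ci_label_allowed(void)
{
    return !parsing_struct_members;
}
```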
With the above changes, the interpreter is able to parse typedef names
correctly. In my tests so far, the namespace/scoping rules of Standard
C are followed correctly.
My question is this:
Is this the right way to deal with this problem in a yacc generated
parser? How have other people dealt with similar problems (without
rewriting the grammar as suggested by Jim Roskind)? My intention is
to avoid changing the grammar, because that would mean many more
changes to the parser, which otherwise works fine.
Any help would be much appreciated.
Thanks and Regards
The website for the UPS C Interpreter is www.concerto.demon.co.uk.
The UPS Debugger/Interpreter was created by Mark Russell. For
more information please check the website.