Subject: | Re: Typedef/identifier problem in C(++) parser. |
Newsgroups: | comp.compilers |
From: | josv@@ursa-major.spdcc.com (Jos Vermeijlen) |
Keywords: | C++, parse |
Organization: | Compilers Central |
References: | 95-05-110 |
Date: | Wed, 20 May 1992 15:07:01 GMT |
Michael Stal writes:
> - Can anybody provide me with an idea or solution how to cope with the
> TYPEDEFname/IDENTIFIER-problem within the scanner?
Here is an idea:
I am writing an ANSI-C compiler (mainly for simulation purposes) and of
course I have the same problem with TYPEDEFNAMEs and IDENTIFIERs as you
have, and you are right: a simple table of TYPEDEFNAMEs is not enough.
I think it is mainly a semantic problem, not a syntactic one: two tokens
that look the same but have different meanings. So IDENTIFIERs and
TYPEDEFNAMEs are recognized by the same rule in the scanner, but the
information to distinguish them has to be provided by the parser.
So here is what I did: I made a table to keep track of the TYPEDEFNAMEs
that have been declared (look out: this table grows and shrinks with
block entries and exits) and a kind of flag, NeedTYPEDEFNAME (NeedTDN for
short), that tells the scanner whether an identifier (i.e. a bunch of
characters) may be interpreted as a TYPEDEFNAME or must be an IDENTIFIER.
In the scanner this is computed as follows:
    NeedTDN == up   &&  identifier in table     --> return a TYPEDEFNAME
    NeedTDN == up   && !(identifier in table)   --> return an IDENTIFIER
    NeedTDN == down                             --> return an IDENTIFIER
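
The three rules above can be sketched in C roughly as follows. This is only
an illustration, not the code from my compiler: the names need_tdn,
classify, declare_typedef, enter_block and exit_block and the fixed table
sizes are made up for the example, and the table stores the caller's
pointers, so the name strings must stay alive as long as they are in it.

```c
#include <assert.h>
#include <string.h>

enum token { IDENTIFIER, TYPEDEFNAME };

#define MAX_TYPEDEFS 64   /* arbitrary limits, chosen for the sketch */
#define MAX_DEPTH    16

static const char *typedef_names[MAX_TYPEDEFS]; /* names declared so far   */
static int typedef_count;                       /* entries currently in use */
static int scope_mark[MAX_DEPTH];               /* table size at block entry */
static int scope_depth;

int need_tdn;   /* the NeedTDN flag: nonzero = "up", maintained by the parser */

/* Block entry: remember how large the table is right now. */
void enter_block(void) {
    assert(scope_depth < MAX_DEPTH);
    scope_mark[scope_depth++] = typedef_count;
}

/* Block exit: drop every typedef name declared inside the block. */
void exit_block(void) {
    assert(scope_depth > 0);
    typedef_count = scope_mark[--scope_depth];
}

/* Called by the parser when a typedef declaration has been recognized. */
void declare_typedef(const char *name) {
    assert(typedef_count < MAX_TYPEDEFS);
    typedef_names[typedef_count++] = name;
}

static int in_table(const char *name) {
    for (int i = typedef_count - 1; i >= 0; i--)
        if (strcmp(typedef_names[i], name) == 0)
            return 1;
    return 0;
}

/* The scanner's decision for a bunch of characters that forms a name. */
enum token classify(const char *name) {
    if (need_tdn && in_table(name))
        return TYPEDEFNAME;   /* flag up and name is in the table */
    return IDENTIFIER;        /* flag up but unknown name, or flag down */
}
```

The sketch does not model an inner declaration that hides an outer typedef
name; it only shows the table growing and shrinking with blocks and the
flag test.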
In the parser I set and reset the flag with actions in the parser rules.
For example, I 'expect' a TYPEDEFNAME at the beginning of a declaration,
but once I have recognized a TypeSpecifier (TYPEDEFNAME included) I know the
following identifiers are IDENTIFIERs, so I reset the NeedTDN flag.
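
To see why the flag matters, take a declaration like "T T;" after
"typedef int T;": the first T is a type, the second one is the name being
declared. The C sketch below plays the role of the parser actions for a
declaration of the form "name name ;". It is a simplification under stated
assumptions: scan_declaration, classify and in_table are invented names,
the table is a stand-in that knows only "T", and lookahead is ignored, so
every action is assumed to take effect before the next token is scanned.

```c
#include <assert.h>
#include <string.h>

enum token { IDENTIFIER, TYPEDEFNAME };

int need_tdn;   /* the NeedTDN flag: nonzero = "up" */

/* Hypothetical stand-in for the typedef table: only "T" has been declared. */
int in_table(const char *name) { return strcmp(name, "T") == 0; }

/* The scanner's decision, as in the three rules of the post. */
enum token classify(const char *name) {
    return (need_tdn && in_table(name)) ? TYPEDEFNAME : IDENTIFIER;
}

/* Classify both names of a declaration "first second ;" while driving
   the flag the way the parser actions would. */
void scan_declaration(const char *first, const char *second,
                      enum token out[2]) {
    assert(first != NULL && second != NULL);
    need_tdn = 1;                /* start of a declaration: expect a type */
    out[0] = classify(first);
    if (out[0] == TYPEDEFNAME)
        need_tdn = 0;            /* TypeSpecifier seen: names follow */
    out[1] = classify(second);
    need_tdn = 1;                /* ready for the next declaration */
}
```

With first and second both "T" this gives TYPEDEFNAME followed by
IDENTIFIER, whereas a scanner that only consults the table (flag always up)
would return TYPEDEFNAME twice.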
Be careful where you put the actions that set and reset the flag. Let's
assume you use an LALR(1) parser:
1) If you recognize a '(' at the beginning of an expression, the parser
cannot decide whether it is the '(' of a parenthesized primary expression
or the '(' of a cast expression; it is all the same to it. So you need to
set NeedTDN. Expressions (other than cast expressions and 'sizeof'
operands) with TYPEDEFNAMEs in them then cause parser errors. (Many more
rules in the ANSI-C grammar have the same kind of difficulty.)
2) With one token of lookahead (the 1 in LALR(1)) the scanner will
generally be one token ahead of the parser, so if you want an action to
take effect BEHIND a token, you have to put the action IN FRONT OF that
token.
3) Try to attach actions to terminals and not to nonterminals. (I think it
is the safest way, because Look Ahead works with terminals. Sue me if I
am wrong.)
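
Point 2 can be made concrete with a toy model in C. It assumes a parser
that always fetches the lookahead token before it runs an action, which is
a simplification of what a generated LALR(1) parser does. The function run
and the parameter reset_pos are invented for the example, and the typedef
table is again a stand-in that knows only "T". Grammar position k means
"in front of names[k]", which is the same place as "behind names[k-1]".

```c
#include <assert.h>
#include <string.h>

enum token { IDENTIFIER, TYPEDEFNAME };

int need_tdn;   /* the NeedTDN flag: nonzero = "up" */

/* Hypothetical stand-in for the typedef table: only "T" has been declared. */
int in_table(const char *name) { return strcmp(name, "T") == 0; }

enum token classify(const char *name) {
    return (need_tdn && in_table(name)) ? TYPEDEFNAME : IDENTIFIER;
}

/* Scan n names with one token of lookahead. The action "NeedTDN = down"
   sits at grammar position reset_pos. The classification of each name,
   made at the moment the scanner reads it, is stored in out[]. */
void run(const char *names[], int n, int reset_pos, enum token out[]) {
    assert(reset_pos >= 0 && reset_pos < n);
    need_tdn = 1;
    for (int k = 0; k < n; k++) {
        out[k] = classify(names[k]);  /* lookahead names[k] is scanned ... */
        if (k == reset_pos)           /* ... before the action at k runs   */
            need_tdn = 0;
    }
}
```

For the names "T", "T": with the reset placed behind the first T
(reset_pos = 1) the second T has already been scanned as a TYPEDEFNAME
when the action runs. With the reset placed in front of the first T
(reset_pos = 0) the first T is still scanned with the flag up, and the
second T comes out as an IDENTIFIER, which is what a declaration needs.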
And last but not least: this is just an idea. I used it myself and it worked
for me (I had to resolve just one reduce/reduce conflict, caused by actions
introducing epsilon rules), but I don't know much about C++, so maybe this
"hack" won't work there. Have fun!
Jos Vermeijlen.
Digital Information Systems Group, Room EH 11.26
Eindhoven University of Technology.
P.O. Box 513
5600 MB Eindhoven tel: +31-40-473394
The Netherlands Email: josv@eb.ele.tue.nl