Re: Typedef/identifier problem in C(++) parser. (Jos Vermeijlen)
Wed, 20 May 1992 15:07:01 GMT

          From comp.compilers


Newsgroups: comp.compilers
From: (Jos Vermeijlen)
Keywords: C++, parse
Organization: Compilers Central
References: 95-05-110
Date: Wed, 20 May 1992 15:07:01 GMT

Michael Stal writes:
> - Can anybody provide me with an idea or solution how to cope with the
> TYPEDEFname/IDENTIFIER-problem within the scanner?

Here is an idea:

I am writing an ANSI-C compiler (mainly for simulation purposes), and of
course I have the same problem with TYPEDEFNAMEs and IDENTIFIERs as you
do. You are right: a simple table of TYPEDEFNAMEs is not enough.

I think it is mainly a semantic problem, not a syntactic one: two tokens
that look the same but have different meanings. So IDENTIFIERs and
TYPEDEFNAMEs are recognized by the same rule in the scanner, but the
information to distinguish them has to be provided by the parser.

So here is what I did: I made a table to keep track of the TYPEDEFNAMEs
that have been declared (look out: this table grows and shrinks with
block entries and exits) and a flag, NeedTYPEDEFNAME (NeedTDN for short),
that tells the scanner whether an identifier (i.e. a bunch of characters)
may be interpreted as a TYPEDEFNAME or must be an IDENTIFIER. In the
scanner this is computed as
      NeedTDN == up && identifier in table    --> return a TYPEDEFNAME
      NeedTDN == up && !(identifier in table) --> return an IDENTIFIER
      NeedTDN == down                         --> return an IDENTIFIER
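
The table and the three scanner rules above could be sketched in C roughly
as follows. This is only an illustration under assumptions: the names
enter_block, exit_block, declare_typedef, in_table and classify, the
need_tdn variable and the fixed array sizes are all invented here and are
not taken from my compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum token { IDENTIFIER, TYPEDEFNAME };

#define MAX_NAMES 256   /* arbitrary capacities chosen for this sketch */
#define MAX_DEPTH 64

static const char *names[MAX_NAMES];  /* declared typedef names, innermost last */
static int name_count;
static int scope_start[MAX_DEPTH];    /* name_count saved at each block entry */
static int depth;
static bool need_tdn;                 /* the NeedTDN flag: true means "up" */

/* Block entry: remember how many names were visible before the block. */
static void enter_block(void) {
    assert(depth < MAX_DEPTH);
    scope_start[depth++] = name_count;
}

/* Block exit: forget every typedef name declared inside the block. */
static void exit_block(void) {
    assert(depth > 0);
    name_count = scope_start[--depth];
}

/* Record a typedef name; the caller must keep the string alive. */
static void declare_typedef(const char *name) {
    assert(name_count < MAX_NAMES);
    names[name_count++] = name;
}

static bool in_table(const char *name) {
    for (int i = name_count - 1; i >= 0; i--)
        if (strcmp(names[i], name) == 0)
            return true;
    return false;
}

/* The three scanner rules: only "flag up AND in table" gives TYPEDEFNAME. */
static enum token classify(const char *name) {
    if (need_tdn && in_table(name))
        return TYPEDEFNAME;
    return IDENTIFIER;
}
```

The scanner would call classify() on every identifier-shaped lexeme, and
the parser actions would call the other functions. Note that this sketch
does not model an ordinary variable in an inner block hiding a typedef
name from an outer block; that needs extra bookkeeping.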

In the parser I set and reset the flag with actions in the grammar rules.
For example, I 'expect' a TYPEDEFNAME at the beginning of a declaration,
but once I have recognized a TypeSpecifier (TYPEDEFNAME included) I know
the following identifiers are IDENTIFIERs, so I reset the NeedTDN flag.
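
To illustrate the set/reset idea, here is a toy model in C that walks
over the words of a few simple declarations. It is a sketch under
assumptions, not a real parser: the typedef table is replaced by a
hypothetical in_table() that knows only the name "T", the only type
keywords are "int" and "char", and lookahead is ignored completely (the
flag is updated right after each word is classified).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum token { IDENTIFIER, TYPEDEFNAME, KEYWORD, PUNCT };

static bool need_tdn;   /* the NeedTDN flag: true means "up" */

/* Hypothetical stand-in for the typedef table: only "T" is declared. */
static bool in_table(const char *w) { return strcmp(w, "T") == 0; }

static bool is_type_keyword(const char *w) {
    return strcmp(w, "int") == 0 || strcmp(w, "char") == 0;
}

static enum token scan(const char *w) {
    if (strcmp(w, ";") == 0 || strcmp(w, ",") == 0)
        return PUNCT;
    if (is_type_keyword(w))
        return KEYWORD;
    return (need_tdn && in_table(w)) ? TYPEDEFNAME : IDENTIFIER;
}

/* Classify n words, toggling the flag the way the parser actions would. */
static void scan_declarations(const char *words[], int n, enum token out[]) {
    assert(words != NULL && out != NULL);
    need_tdn = true;                    /* a declaration may start here */
    for (int i = 0; i < n; i++) {
        out[i] = scan(words[i]);
        if (out[i] == TYPEDEFNAME || out[i] == KEYWORD)
            need_tdn = false;           /* TypeSpecifier seen: names follow */
        else if (strcmp(words[i], ";") == 0)
            need_tdn = true;            /* next declaration may start */
    }
}
```

With the words `T T ; T x ;` the first T of each declaration comes back
as a TYPEDEFNAME and the word after it as an IDENTIFIER, so a
redeclaration such as `T T;` no longer confuses the parser.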

Be careful where you put the actions that set and reset the flag. Let's
assume you use an LALR(1) parser:
1) If you recognize a '(' at the beginning of an expression, the parser cannot
      decide whether it is the '(' of a primary expression or the '(' of a cast
      expression; it is all the same to it. So you need to set NeedTDN.
      Expressions with TYPEDEFNAMEs in them (other than cast expressions and
      'sizeof' operators) then cause parser errors. (Many more rules in the
      ANSI-C grammar have the same kind of difficulties.)

2) With one token of lookahead the scanner is always one step ahead of the
      parser, so if you want an action to be active BEHIND a token, you have
      to put the action IN FRONT OF that token.

3) Try to attach actions to terminals and not to nonterminals. (I think it
      is the safest way, because Look Ahead works with terminals. Sue me if I
      am wrong.)
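
Point 2 can be made concrete with a small C simulation. It is a
simplified model and rests on an assumption: the parser always fetches
the next token immediately after shifting the current one (real
generated parsers sometimes skip the lookahead). The names parse, scan,
in_table and the two placement constants are invented for this sketch,
and the typedef table is again reduced to the single name "T".

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum token { IDENTIFIER, TYPEDEFNAME, PUNCT };
enum placement { ACTION_BEHIND_TOKEN, ACTION_IN_FRONT_OF_TOKEN };

static bool need_tdn;   /* the NeedTDN flag: true means "up" */

/* Hypothetical stand-in for the typedef table: only "T" is declared. */
static bool in_table(const char *w) { return strcmp(w, "T") == 0; }

static enum token scan(const char *w) {
    if (strcmp(w, ";") == 0)
        return PUNCT;
    return (need_tdn && in_table(w)) ? TYPEDEFNAME : IDENTIFIER;
}

/* Shift n words while the scanner stays one token ahead of the parser. */
static void parse(const char *words[], int n, enum placement where,
                  enum token out[]) {
    assert(n > 0);
    need_tdn = true;                           /* start of a declaration */
    enum token lookahead = scan(words[0]);     /* scanner runs ahead */
    for (int i = 0; i < n; i++) {
        enum token current = lookahead;
        bool type_spec = (current == TYPEDEFNAME);
        if (type_spec && where == ACTION_IN_FRONT_OF_TOKEN)
            need_tdn = false;   /* runs while the token is still lookahead */
        out[i] = current;                      /* "shift" the token */
        if (i + 1 < n)
            lookahead = scan(words[i + 1]);    /* next token is classified now */
        if (type_spec && where == ACTION_BEHIND_TOKEN)
            need_tdn = false;   /* too late: next token already classified */
    }
}
```

For the words `T T ;` the ACTION_BEHIND_TOKEN variant classifies both T's
as TYPEDEFNAME, because the second T was scanned before the flag went
down. The ACTION_IN_FRONT_OF_TOKEN variant gives TYPEDEFNAME followed by
IDENTIFIER, which is what the declaration needs.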

And last but not least: this is an idea. I used it myself and it worked for
me (I had to resolve just one reduce/reduce conflict caused by actions
introducing epsilon rules), but I don't know much about C++, and maybe this
"hack" won't work there. Have fun!
Jos Vermeijlen.

Digital Information Systems Group, Room EH 11.26
Eindhoven University of Technology.
P.O. Box 513
5600 MB Eindhoven tel: +31-40-473394
The Netherlands Email:
