Re: Recognition of typedefs in C ?

"Jos A. Horsmeier" <>
26 Jan 2001 17:03:37 -0500

          From comp.compilers

From: "Jos A. Horsmeier" <>
Newsgroups: comp.compilers
Date: 26 Jan 2001 17:03:37 -0500
Organization: AND Software B.V. Rotterdam
References: 01-01-096
Keywords: C, types
Posted-Date: 26 Jan 2001 17:03:37 EST

David Pereira wrote:

> I am writing a C compiler and am at the parser stage. I have run
> into a bit of a problem. I need to enable the lexer to distinguish
> typedef names from ordinary identifiers. At first glance, it seemed
> that building a table of typedef names was enough - the lexer would
> consult this table and return TYPENAME if the name was found in this
> table, otherwise IDENTIFIER. However, this is inadequate since
> *context* matters. How can this context be discerned ?

By hacking ... I once solved this as follows: check the grammar
from which you constructed your parser. At some locations a
typedef'd name is allowed, while at other locations a name can only
be an identifier (variable). Have a look at this:

typedef int t;

<1> t <2> x;

<1> t <2> f(<1> t <2> t) { <3> t <2> i; }

At locations <1>, a name _may_ represent a typedef. The parser can
figure it out by consulting its stacked symbol tables. At locations
<2> an identifier can never represent a typedef'd name. Location
<3> would have been just like the <1> locations, but the symbol table
newly stacked on entering the function parameter list prevents the
name 't' at the start of the function body from being interpreted as
a typedef'd name, i.e. the formal parameter 't' hides the typedef'd
name (so at location 'i' a diagnostic has to be reported by your
compiler).

A simple boolean variable, toggled at the right locations, will do the
trick. If true, a name _may_ be a typedef'd name, and whether it really
is one is settled by consulting the (stacked) symbol tables. If false,
a name simply represents an identifier.
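
A minimal sketch of the flag plus stacked symbol tables might look like
the following. Every name in it (typedef_allowed, classify, declare,
enter_scope, leave_scope, the fixed-size table) is made up for
illustration and is not taken from any particular compiler; the parser
is assumed to set typedef_allowed at the <1>/<3> locations and clear it
at the <2> locations.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* All names below are hypothetical, chosen for this sketch only. */
enum token { IDENTIFIER, TYPENAME };
enum kind  { KIND_OBJECT, KIND_TYPEDEF };

#define MAX_SYMS 64

struct symbol { const char *name; enum kind kind; int depth; };

static struct symbol syms[MAX_SYMS]; /* one stack; inner scopes sit on top */
static int nsyms = 0;
static int depth = 0;

/* Toggled by the parser: true at <1>/<3> locations, false at <2>. */
static bool typedef_allowed = true;

static void enter_scope(void) { depth++; }

/* Drop every symbol declared in the scope being left. */
static void leave_scope(void)
{
    while (nsyms > 0 && syms[nsyms - 1].depth == depth)
        nsyms--;
    depth--;
}

static void declare(const char *name, enum kind kind)
{
    assert(nsyms < MAX_SYMS);
    syms[nsyms++] = (struct symbol){ name, kind, depth };
}

/* The innermost declaration wins, so search from the top of the stack. */
static const struct symbol *lookup(const char *name)
{
    for (int i = nsyms - 1; i >= 0; i--)
        if (strcmp(syms[i].name, name) == 0)
            return &syms[i];
    return NULL;
}

/* What the lexer would return for a name. */
static enum token classify(const char *name)
{
    if (!typedef_allowed)
        return IDENTIFIER;
    const struct symbol *s = lookup(name);
    return (s != NULL && s->kind == KIND_TYPEDEF) ? TYPENAME : IDENTIFIER;
}
```

With 't' declared as a typedef at file scope, classify("t") yields
TYPENAME only while the flag is set. Once a parameter 't' has been
declared in an inner scope, the same call yields IDENTIFIER even with
the flag set, which is the hiding effect at location <3>.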

There's one little pitfall here though. Most parsers and lexers operate
in lock step, i.e. the lexer has already consumed the latest token
while the parser is still figuring out what to do syntactically. If the
semantic action (toggling the boolean switch) runs _after_ that token
has been consumed, you're too late: the token was classified with the
old value of the switch. Carefully hacking your parser/lexer
communication is the only way out ...
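
One possible way to hack that communication, offered only as a sketch
and not necessarily what I would do in every parser generator, is to
let the lexer hand over just the spelling of a name and postpone the
TYPENAME/IDENTIFIER decision until the parser actually inspects the
token. The helpers below (is_typedef_name, read_eager, read_lazy,
inspect) are hypothetical stand-ins; is_typedef_name simply pretends
that 't' is the only typedef in scope.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum token { IDENTIFIER, TYPENAME };

/* Hypothetical stand-ins for the parser's flag and symbol-table query. */
static bool typedef_allowed = true;
static bool is_typedef_name(const char *name) { return strcmp(name, "t") == 0; }

static enum token classify(const char *name)
{
    return (typedef_allowed && is_typedef_name(name)) ? TYPENAME : IDENTIFIER;
}

/* Lookahead slot: holds only the spelling, never a final token code. */
struct lookahead { const char *spelling; };

/* Eager variant: decides at read time, so a later toggle is missed. */
static enum token read_eager(const char *spelling)
{
    return classify(spelling);
}

/* Lazy variant: reading just stores the spelling ... */
static struct lookahead read_lazy(const char *spelling)
{
    struct lookahead la = { spelling };
    return la;
}

/* ... and the decision is made when the parser inspects the token. */
static enum token inspect(const struct lookahead *la)
{
    return classify(la->spelling);
}
```

For the second 't' in 't t;', an eager lexer that read ahead before the
switch was cleared reports TYPENAME, while the lazy variant reports
IDENTIFIER once the parser has cleared the switch and then inspects the
token.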

I _hate_ this typedef kludge ...

kind regards,

Jos aka
