Re: Parsing C: identifier VS typedef name

"Ira Baxter" <idbaxter@semdesigns.com>
17 Apr 2005 15:52:23 -0400

          From comp.compilers

From: "Ira Baxter" <idbaxter@semdesigns.com>
Newsgroups: comp.compilers
Date: 17 Apr 2005 15:52:23 -0400
Organization: http://extra.newsguy.com
References: 05-04-046
Keywords: C, parse
Posted-Date: 17 Apr 2005 15:52:23 EDT

"Igor Baltic" <igor@SB8286.spb.edu> wrote in message
> The problem may seem a bit "old", but it does not really make it easier.
>
> As my utilite needs to parse a C program, I tried to use some of the most
> well-known grammars, ..
> ...[and I found that many such grammars are] ...
> dependent on whether the lexeme is an IDENTIFIER or a TYPEDEFname.
> To manage [this ... it is ]
> recommended to maintain a symbol table for current scope, ...


> But the grammar is dependent on distinguishing IDENTIFIER and
> TYPEDEF name even in declarations. [But for declarations, the
> lexer hasn't ] stored it in any symbol table yet, [!]


> [There's a variety of answers, all kludges. -John]


There are some very non-kludgey answers.


The one we use avoids the distinction between TYPEDEF names and
IDENTIFIER names, and thus avoids the problem of building the symbol
table while lexing/parsing. This is technically easy if you have a
context-free parser that will produce all possible parses, including
ambiguous ones. GLR parsers are ideal for this; they produce a parse
DAG instead of a tree, with subtrees shared under ambiguity nodes.
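
To make the ambiguity concrete: the statement "T * x;" is a pointer
declaration if T names a type, and a multiplication otherwise. The
sketch below is not a GLR parser and is not DMS code; it is a
hand-written Python toy that handles only that one statement shape and
returns what such a parser would hand back for it. All class and
function names (Ident, PointerDecl, MulExprStmt, Ambiguity,
parse_star_statement) are invented for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Ident:
    name: str


@dataclass(frozen=True)
class PointerDecl:
    """`T * x;` read as a declaration of x with type pointer-to-T."""
    type_name: Ident
    var: Ident


@dataclass(frozen=True)
class MulExprStmt:
    """`T * x;` read as an expression statement multiplying T by x."""
    left: Ident
    right: Ident


@dataclass(frozen=True)
class Ambiguity:
    """Node holding every reading the grammar allows for one span."""
    alternatives: Tuple[object, ...]


def parse_star_statement(tokens):
    """Toy stand-in for a GLR parser; handles only `name * name ;`."""
    if len(tokens) != 4 or tokens[1] != "*" or tokens[3] != ";":
        raise ValueError("this sketch only handles `name * name ;`")
    left, right = Ident(tokens[0]), Ident(tokens[2])
    # No symbol table is consulted. Both readings are kept, and both
    # point at the same two Ident leaves, so the result is a DAG.
    return Ambiguity((PointerDecl(left, right), MulExprStmt(left, right)))


node = parse_star_statement(["T", "*", "x", ";"])
```

The point of the design is that the lexer emits one kind of name token
and the choice between the two readings is postponed until after
parsing.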


Igor didn't say what he wanted to do, so this may be enough. However,
if you still want to know which is which after parsing, you can then
do symbol table construction by a tree walk. We implement this as an
attribute grammar with a special value ERROR: if an evaluation
produces ERROR, the subtree that produced it is simply deleted from
the parse DAG.
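
A minimal sketch of the pruning idea, in Python rather than as a real
attribute grammar. It assumes a flat list of statements, a single
global scope, and tuple-shaped nodes ("typedef", "ambig", "ptr_decl",
"mul_expr"); those shapes and the names check and resolve are invented
here and are not how DMS represents anything. An alternative whose
check yields ERROR is dropped from its ambiguity node.

```python
ERROR = object()  # special value marking a reading that cannot be right


def check(node, typedefs):
    """Return node if it is consistent with the typedef set, else ERROR."""
    kind = node[0]
    if kind == "ptr_decl":            # ("ptr_decl", type_name, var_name)
        return node if node[1] in typedefs else ERROR
    if kind == "mul_expr":            # ("mul_expr", left_name, right_name)
        uses_type = node[1] in typedefs or node[2] in typedefs
        return ERROR if uses_type else node
    raise ValueError(f"unknown node kind: {kind}")


def resolve(stmts):
    """Walk statements in order, build the typedef set, prune ambiguities."""
    typedefs = set()                  # assumption: one global scope only
    resolved = []
    for stmt in stmts:
        if stmt[0] == "typedef":      # ("typedef", new_type_name)
            typedefs.add(stmt[1])
            resolved.append(stmt)
        elif stmt[0] == "ambig":      # ("ambig", [alternative, ...])
            survivors = [alt for alt in stmt[1]
                         if check(alt, typedefs) is not ERROR]
            if len(survivors) != 1:
                raise ValueError("statement is still ambiguous or invalid")
            resolved.append(survivors[0])
        else:
            raise ValueError(f"unknown statement kind: {stmt[0]}")
    return resolved


program = [
    ("typedef", "T"),
    ("ambig", [("ptr_decl", "T", "x"), ("mul_expr", "T", "x")]),
    ("ambig", [("ptr_decl", "a", "b"), ("mul_expr", "a", "b")]),
]
result = resolve(program)
# result == [("typedef", "T"), ("ptr_decl", "T", "x"), ("mul_expr", "a", "b")]
```

Real C needs nested scopes and must handle a variable that shadows a
typedef name, both of which this sketch ignores.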


We have implemented full C and C++ parsers + name/type resolvers this
way and use them for production program transformation tasks using
DMS.


--
Ira D. Baxter, Ph.D., CTO 512-250-1018
Semantic Designs, Inc. www.semdesigns.com


