Related articles |
---|
Maintaining scope while parsing C with a YACC grammar eliben@gmail.com (eliben) (2011-04-25) |
Re: Maintaining scope while parsing C with a YACC grammar bobduff@shell01.TheWorld.com (Robert A Duff) (2011-04-26) |
Re: Maintaining scope while parsing C with a YACC grammar bobduff@shell01.TheWorld.com (Robert A Duff) (2011-04-26) |
Re: Maintaining scope while parsing C with a YACC grammar eliben@gmail.com (eliben) (2011-04-28) |
Re: Maintaining scope while parsing C with a YACC grammar bobduff@shell01.TheWorld.com (Robert A Duff) (2011-05-02) |
Re: Maintaining scope while parsing C with a YACC grammar torbenm@diku.dk (2011-05-03) |
Re: Maintaining scope while parsing C with a YACC grammar paul@paulbmann.com (Paul B Mann) (2011-05-06) |
Re: Maintaining scope while parsing C with a YACC grammar idbaxter@semdesigns.com (Ira Baxter) (2011-05-13) |
Maintaining scope while parsing C with a Yacc grammar cfc@shell01.TheWorld.com (Chris F Clark) (2011-06-12) |
From: | torbenm@diku.dk (Torben Ægidius Mogensen) |
Newsgroups: | comp.compilers |
Date: | Tue, 03 May 2011 09:51:14 +0200 |
Organization: | SunSITE.dk - Supporting Open source |
References: | 11-04-036 11-04-038 11-05-003 |
Keywords: | C, parse |
Posted-Date: | 04 May 2011 13:53:09 EDT |
eliben <eliben@gmail.com> writes:
> Since it's parsing of C I'm talking about, this approach will have to
> somehow handle ambiguity of this kind:
>
> T * x;
>
> This can be either a declaration or a multiplication, depending on
> earlier symbol table information (whether T is a type or not).
One technique for handling this is to let the lexer consult the symbol
table to determine whether T is a type name, and to emit a different
token kind in each case. The grammar would then have productions
somewhat like

  Declaration -> Type non-type-id
               | ...

  Type -> type-id
        | Type *
        | ...

  Expression -> Expression * Expression
              | non-type-id
              | ...

It becomes much more complicated for real C, but the idea should be
clear enough.
This requires the parser to keep a symbol table for the current scope
available to the lexer. This table need not contain full information
for each identifier, just enough to distinguish type names from other
names.
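
As a rough illustration of the above, here is a minimal Python sketch of a
lexer that asks a scoped table whether an identifier is a type name. The
class ScopedTypeNames, the function tokenize, and the token kinds are
hypothetical names chosen for this sketch (the kinds reuse type-id and
non-type-id from the grammar above). It assumes the parser calls push, pop
and declare as it enters scopes and processes declarations; a real C front
end would need considerably more than this.

```python
import re

class ScopedTypeNames:
    """Hypothetical minimal symbol table: name -> 'is this a type name?'."""

    def __init__(self):
        self.scopes = [{}]              # innermost scope is last

    def push(self):                     # parser enters a block
        self.scopes.append({})

    def pop(self):                      # parser leaves a block
        self.scopes.pop()

    def declare(self, name, is_type):   # parser has seen a declaration
        self.scopes[-1][name] = is_type

    def is_type(self, name):
        # The innermost declaration wins, so a variable can shadow a typedef.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return False                    # undeclared names are not types

# An identifier, or any single non-blank character as punctuation.
TOKEN_RE = re.compile(r"\s*(?:([A-Za-z_]\w*)|(\S))")

def tokenize(text, table):
    """Return (kind, text) pairs; identifiers are classified via the table."""
    tokens = []
    for ident, punct in TOKEN_RE.findall(text):
        if ident:
            kind = "type-id" if table.is_type(ident) else "non-type-id"
            tokens.append((kind, ident))
        else:
            tokens.append((punct, punct))
    return tokens

table = ScopedTypeNames()
table.declare("T", True)                # as if "typedef int T;" was parsed
outer = tokenize("T * x;", table)       # T arrives as a type-id

table.push()                            # enter an inner block
table.declare("T", False)               # as if "int T;" shadowed the typedef
inner = tokenize("T * x;", table)       # T arrives as a non-type-id
table.pop()
```

With these tokens, the parser sees the first "T * x;" as a declaration and
the second as a multiplication, without any change to the grammar itself.
Note that the sketch tokenizes a whole statement at once for brevity; in
practice the lexer has to be queried token by token, because a declaration
may change the table in the middle of the input.
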
That said, I consider this kind of ambiguity bad language design, as
it is not only hard for a parser to handle, but also hard for a human
reader. Possible fixes are to make declarations and expressions /
statements non-overlapping syntactically (as in Pascal) or to keep
type names syntactically distinct from variable names, e.g. by making
type names start with an upper-case letter and variable names start
with a lower-case letter (as in Haskell).
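
To make the second fix concrete: if the spelling alone determines the kind
of name, the lexer needs no symbol table at all. A small Python sketch,
assuming a hypothetical language with the upper-case / lower-case
convention described above (classify_by_case is a name invented for this
example):

```python
def classify_by_case(name):
    """Decide the token kind from the first character only."""
    return "type-id" if name[:1].isupper() else "non-type-id"

kinds = [classify_by_case(n) for n in ("T", "x", "Tree", "count")]
```

Under such a rule, "T * x;" can only be a declaration, whatever has been
declared earlier.
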
Torben
[As Dennis said, "the ice is thin here." -John]