Newsgroups: | comp.compilers |
From: | bliss@sp64.csrd.uiuc.edu (Brian Bliss) |
Keywords: | parse, C |
Organization: | UIUC Center for Supercomputing Research and Development |
References: | 92-01-032 |
Date: | Wed, 8 Jan 92 17:55:13 GMT |
In article 92-01-032, smk@dcs.edinburgh.ac.uk writes:
|> [Reusing a typedef name] shouldn't be a problem, because this is not really
|> an ambiguous occurrence. You can deal with that by having a production
|>
|> any_ident : ident | type_ident;
|>
|> and using any_ident for the identifier in a declarator (and several other
|> places). This should be possible without introducing any ambiguities.
|>
|> But for some parts of the C syntax this is not so easy, for labels you
|> probably have to expand the any_ident production to allow programs like
|>
|> typedef int foo;
|> main ()
|> { foo: ;
|> }
|>
|> because otherwise there is a shift-reduce conflict
|> (reduce type_ident to any_ident for labels, shift for declarations).
[It's not impossible, but it's tricky and messy to get right. -John]
O.K. I haven't got out the grammar and done the actual table construction
(read: disclaimer), but declarations ARE the one place where you do need
the separate tokens for ident and type_ident. Anywhere else, the
any_ident->ident|type_ident rule works fine. (On labels, for instance, the
: in the lookahead stream resolves the ambiguity. I have also successfully
used the above productions to allow a typedef name to also be a tag name.)
Consider the code fragment:
typedef int z;
main() {
long z;
}
Is z being redeclared as a local variable in main(), or are you just
specifying an empty declaration for a long int type? Which reading you get
depends on which token the lexical analyzer returns when z is
encountered for the second time. The ANSI C grammar in the back of K&RII
is not ambiguous: it assumes that the lexer resolves the ambiguity, not
the parser.
The fix to this problem is much easier than I first thought: just use
lex's right-context sensitivity operator (/) to search ahead in the input
stream for one of [,{;] (preceded by optional whitespace) when an
identifier is encountered. In cases that match, always return the IDENT
token; in cases that don't, look up the name and return TYPE_NAME if the
identifier is a typedef name, and IDENT otherwise.
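The decision rule above can be sketched in plain C, outside of lex, as a function that peeks at the input following the identifier. This is only an illustration under stated assumptions: classify_identifier, is_typedef_name, and the fixed typedef_names table are hypothetical names standing in for a real scanner action and symbol table, and the token values are arbitrary. In an actual lex specification the peek would be expressed with the trailing-context operator (/) rather than a hand-written loop.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token codes; the names mirror the post, the values are arbitrary. */
enum token { IDENT = 1, TYPE_NAME = 2 };

/* Hypothetical stand-in for the compiler's symbol table of typedef names. */
static const char *typedef_names[] = { "z", "foo" };

static int is_typedef_name(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof typedef_names / sizeof typedef_names[0]; i++)
        if (strcmp(typedef_names[i], name) == 0)
            return 1;
    return 0;
}

/*
 * name: the identifier just scanned.
 * rest: the unread input immediately following the identifier.
 *
 * If the next non-whitespace character is one of , { ; the identifier
 * is reported as IDENT regardless of the symbol table. Otherwise the
 * symbol table decides between TYPE_NAME and IDENT.
 */
static enum token classify_identifier(const char *name, const char *rest)
{
    assert(name != NULL && rest != NULL);

    /* Skip the optional whitespace before the right-context character. */
    while (isspace((unsigned char)*rest))
        rest++;

    /* strchr would also match the terminating NUL, so exclude it first. */
    if (*rest != '\0' && strchr(",{;", *rest) != NULL)
        return IDENT;

    return is_typedef_name(name) ? TYPE_NAME : IDENT;
}
```

With this rule, the second z in "long z;" is followed by ; and so comes back as IDENT, which makes the parser treat it as a redeclaration, while the z in "z x;" is not followed by one of the three characters and comes back as TYPE_NAME.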
As for my original statement
>One place where every yacc/lex based C compiler I know of is broken
I knew Sun's cc was broken, and so was any C compiler I had worked on. I
couldn't figure out a way to easily fix the problem, and over-generalized :-)
bb