|The signs of literals email@example.com (Hung-Ta Lin) (1997-12-07)|
|Re: The signs of literals firstname.lastname@example.org (David L Moore) (1997-12-10)|
|Re: The signs of literals email@example.com (Chris Clark USG) (1997-12-10)|
|Re: The signs of literals firstname.lastname@example.org (1997-12-12)|
|Re: The signs of literals tim@wagner.Princeton.EDU (1997-12-12)|
|Re: The signs of literals email@example.com (Matt Timmermans) (1997-12-12)|
|Re: The signs of literals firstname.lastname@example.org (David L Moore) (1997-12-13)|
|From:||tim@wagner.Princeton.EDU (Tim Hollebeek)|
|Date:||12 Dec 1997 14:51:25 -0500|
|Organization:||Chemistry Department, Princeton University|
Hung-Ta Lin wrote:
> > Hi, I am working on a grammar that explicitly treats sign as part of a
> > integer literal:
> > The grammar is ambiguous because [minus could be part of the literal or a
> > part of a unary minus expression]:
David L Moore <email@example.com> writes:
> Two quick points.
The canonical example of lexer ambiguity, I think, is the C construct
'i+++++j', which tokenizes as 'i ++ ++ + j' (which the compiler rejects,
since 'i++' is not an lvalue) and not as 'i ++ + ++ j', which would be
valid. In general, languages have arbitrary rules about tokenization,
and longest match is not uncommon. Sometimes there are really fun ones. I still
maintain a language where '])' can be the right bracket for indexing
followed by a parenthesis, or the close of a '([' construct. Figuring
out which one it is in the lexer would be a real pain, so the lexer
just returns ']' and ')' and lets the grammar figure it out. This has
the unfortunate side effect of allowing whitespace between the two.
That's what happens when you inherit a six-year-old flaw in the language
design and have to support it.
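
To make the longest-match point concrete, here is a minimal sketch of a
longest-match tokenizer in Python. It is not any real C lexer: the
function name tokenize and the three-token vocabulary (identifiers, '++'
and '+') are assumptions made up for illustration.

```python
import re

# Hypothetical toy token set: '++' is listed before '+', so the
# alternation always prefers the longer operator (longest match).
TOKEN_RE = re.compile(r"\s*(\+\+|\+|[A-Za-z_]\w*)")

def tokenize(src):
    """Split src into tokens, taking the longest operator at each step."""
    tokens = []
    pos = 0
    src = src.rstrip()  # so trailing whitespace is not reported as an error
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

print(tokenize("i+++++j"))    # ['i', '++', '++', '+', 'j']
print(tokenize("i++ + ++j"))  # ['i', '++', '+', '++', 'j']
```

The lexer never looks at what the parser could use, so the only way to
get the second token sequence is to put the whitespace in yourself.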