Related articles |
---|
How to parse keywords that can be used as identifiers? mark@research.techforce.nl (Mark Thiehatten) (1996-08-19) |
Re: How to parse keywords that can be used as identifiers? anton@a0.complang.tuwien.ac.at (1996-08-20) |
Re: How to parse keywords that can be used as identifiers? kanze@lts.sel.alcatel.de (1996-08-20) |
Re: How to parse keywords that can be used as identifiers? leichter@smarts.com (Jerry Leichter) (1996-08-21) |
Re: How to parse keywords that can be used as identifiers? ph@anweald.exnet.co.uk (1996-08-24) |
Re: How to parse keywords that can be used as identifiers? grosch@cocolab.sub.com (1996-08-24) |
Re: How to parse keywords that can be used as identifiers? dlmoore@ix.netcom.com (David L Moore) (1996-08-24) |
Re: How to parse keywords that can be used as identifiers? itz@rahul.net (1996-08-24) |
[6 later articles] |
From: | anton@a0.complang.tuwien.ac.at (Anton Ertl) |
Newsgroups: | comp.compilers |
Date: | 20 Aug 1996 23:08:03 -0400 |
Organization: | Institut fuer Computersprachen, Technische Universitaet Wien |
References: | 96-08-058 |
Keywords: | parse |
Mark Thiehatten <mark@research.techforce.nl> writes:
|> I am working on a parser for a language that allows keywords to
|> be used as identifiers. This causes all kinds of problems.
|> I would like to know if somebody has already solved this problem,
|> and, of course, how.
If you allow all keywords as identifiers everywhere, the grammar is
probably ambiguous.
I did something less ambitious 10 years ago. I did it somewhat
intuitively with a hand-written parser, here I'll try to find some
principles that were behind my intuition.
The following method will transform an LL(1) grammar into another
LL(1) grammar that allows keywords instead of identifiers in many
places; and I think you cannot allow more keywords without breaking
the LL(1) property or restructuring the grammar.
First, let me define the first' set:
first' =
if (eps in first) then
first+follow-{eps}
else
first
Associate every instance of ident in the grammar with a set, initially
consisiting of all keywords. For every rule that contains ident in its
first' set, remove first' from the set associated with the
ident. Finally, for all idents, write rules of the form:
ident1: ident | keyword1 | keyword2 | ...
with the keywords coming from the set associated with the instance of
ident; replace the instance of ident with the newly defined
nonterminal (ident1).
E.g.:
stmt: IF expr THEN stmt ELSE stmt ENDIF
| ident Becomes expr
;
expr: ident
| NOT expr
;
This would be transformed into:
ident1: ident|THEN|ELSE|ENDIF|NOT;
ident2: ident|IF|THEN|ELSE|ENDIF;
stmt: IF expr THEN stmt ELSE stmt ENDIF
| ident1 Becomes expr
;
expr: ident2
| NOT expr
;
IIRC, Bison/yacc can handle LL(1) grammars, so this should work with
them, too (although it may be possible to allow even more keywords
with LALR(1)-grammars).
- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/home.html
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.