From: Chris Clark USG <firstname.lastname@example.org>
Date: Mon, 26 Sep 1994 16:41:51 GMT
email@example.com (Elan Feingold) writes:
> I am trying to use lex and yacc to parse the following sorts
> of "sentences":
> "Verb Foo Foo Foo Bar Bar FooBar Foo"
> Where the stream of lexemes returned should be:
> VERB IDENTIFIER FOO IDENTIFIER BAR IDENTIFIER FOOBAR IDENTIFIER
> according to the following grammar:
> Command -> Expression FOO Identifier BAR Identifier FOOBAR Identifier
> Expression -> Identifier | Expression + Expression ... etc ...
> Identifier -> IDENTIFIER | NUMBER
. . .
> And return the keyword the first time around and IDENTIFIER after that.
> However, I am a slave of the existing grammar :(
If you mean you cannot change the grammar in any way (i.e. it is
pre-compiled into a library which you don't have the source for), then you
have, as the moderator points out, a horrendous problem. [In fact, the
only "true" solution is to parse the text using a changed grammar and then
have it decide which tokens to send to the original parser.] However, if
you are allowed to fix the grammar as long as you don't change what it is
intended to parse, then you have a very simple problem.
In your description, it is clear that you intend for the grammar to treat
all keywords as identifiers when they don't occur in their reserved
context. [This was originally the distinction between keywords (as in
PL/I) and reserved words (as in Pascal). A keyword was just an identifier
which could have context sensitive special meanings, while a reserved word
always has the context free special meaning and cannot be used as an
identifier.] If that is what you mean, rewrite your identifier production
to state that fact.
Identifier -> IDENTIFIER | NUMBER | FOO | BAR | FOOBAR
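To see why that one production is enough, here is a rough sketch in Python
rather than lex and yacc. A hand-written recursive-descent parser stands in
for the generated one, so it will not show the conflicts an (LA)LR generator
might report. All of the names (lex, Parser, parse_command, KEYWORDS) are
made up for the illustration. The leading VERB in the command rule and the
restriction to "+" as the only operator are assumptions, since the quoted
grammar elides those parts. The point is that the lexer stays context free
and always returns the keyword token; only the Identifier rule decides when
a keyword is acting as a plain identifier.

```python
# Context-free lexer table: a keyword spelling ALWAYS yields its keyword token.
KEYWORDS = {"Verb": "VERB", "Foo": "FOO", "Bar": "BAR", "FooBar": "FOOBAR"}

# Identifier -> IDENTIFIER | NUMBER | FOO | BAR | FOOBAR
IDENTIFIER_TOKENS = {"IDENTIFIER", "NUMBER", "FOO", "BAR", "FOOBAR"}


def lex(text):
    """Split on whitespace and classify each lexeme without any context."""
    tokens = []
    for word in text.split():
        if word in KEYWORDS:
            tokens.append((KEYWORDS[word], word))
        elif word.isdigit():
            tokens.append(("NUMBER", word))
        elif word == "+":
            tokens.append(("+", word))
        else:
            tokens.append(("IDENTIFIER", word))
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        """Return the kind of the next token, or "EOF" at end of input."""
        if self.pos < len(self.tokens):
            return self.tokens[self.pos][0]
        return "EOF"

    def expect(self, kind):
        """Consume a token of exactly this kind (the reserved context)."""
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, got {self.peek()}")
        text = self.tokens[self.pos][1]
        self.pos += 1
        return text

    def identifier(self):
        """Accept a plain identifier, a number, or a keyword used as a name."""
        if self.peek() not in IDENTIFIER_TOKENS:
            raise SyntaxError(f"expected an identifier, got {self.peek()}")
        text = self.tokens[self.pos][1]
        self.pos += 1
        return text

    def expression(self):
        # Expression -> Identifier | Expression + Expression
        # (treated as left associative here; that choice is an assumption)
        node = self.identifier()
        while self.peek() == "+":
            self.expect("+")
            node = ("+", node, self.identifier())
        return node

    def command(self):
        # Assumed shape: VERB Expression FOO Identifier BAR Identifier
        #                FOOBAR Identifier
        verb = self.expect("VERB")
        expr = self.expression()
        self.expect("FOO")
        first = self.identifier()
        self.expect("BAR")
        second = self.identifier()
        self.expect("FOOBAR")
        third = self.identifier()
        if self.peek() != "EOF":
            raise SyntaxError(f"unexpected trailing {self.peek()}")
        return (verb, expr, first, second, third)


def parse_command(text):
    return Parser(lex(text)).command()
```

Calling parse_command("Verb Foo Foo Foo Bar Bar FooBar Foo") accepts the
example sentence even though the lexer handed back FOO and BAR every time:
the keyword positions are consumed by expect(), and every other position
goes through identifier().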
The only problems this will cause are in places where your grammar is
ambiguous (with respect to the parsing method you are using). In (LA)LR
parsers, these will manifest themselves as conflicts. By careful use of
precedence and careful ordering of the productions, you can often resolve
most of them. Other conflicts may require a more powerful parsing method or more
lookahead. [Of course, theoretically any LR(k) language has an LR(1)
grammar. But if that isn't the grammar you have, it may only be a small