|A big easy subset of the CFGs email@example.com (Matt Timmermans) (1998-06-18)|
|Re: A big easy subset of the CFGs firstname.lastname@example.org (1998-06-19)|
|Re: A big easy subset of the CFGs email@example.com (Chris F Clark) (1998-06-19)|
|Re: A big easy subset of the CFGs firstname.lastname@example.org (Matt Timmermans) (1998-06-24)|
|Re: A big easy subset of the CFGs email@example.com (Salvador Valerio Cavadini) (1998-06-24)|
|Re: A big easy subset of the CFGs firstname.lastname@example.org (Chris Clark USG) (1998-07-01)|
|From:||"Matt Timmermans" <email@example.com>|
|Date:||24 Jun 1998 00:11:35 -0400|
|Organization:||IGS - Information Gateway Services|
Thanks, Dennise and Chris, for your excellent replies.
Chris F Clark wrote...
>Given what your desires appear to be, my guess is that your goal is to
>try to have a parser which recovers gracefully from single-character
>errors (e.g. missing closing quotes).
Good guess, but no.
>The task is harder than it looks.
Gee, I don't know -- it looks pretty hard.
What I'm doing is making a sort of syntax-directed translation package
for translator writers who understand grammars but not parsing theory,
and users (those who do translations) who understand neither. For the
translator writers, I need to provide a simple and consistent model of
their language, and error messages that make sense to people who
understand only grammars -- hence the need for "non-bizarre" ambiguity
I do have to include regular expression operators in the grammar, but
this is a non-issue, since I will require optional and repeating parts
to have names. This is necessary to give nice error messages to the
user, and makes them into non-terminals.
>However, a second reason why we haven't pursued non-canonical parsing
>more actively is that it actually complicates the conflict reporting
>process. The types of conflicts which can occur in a non-canonical
>parser are more subtle than in a simple LR parser. Our experiments
>suggest that writting and debugging a non-canonical grammar may
>actually be more difficult (once you get past trivial grammars).
This is the part that bothers me. How sure are you that this is a
product of non-cannonical parsing, rather than the conflict report
itself? Thinking of what you have to do sometimes to resolve an
LALR(1) conflict (ACK! how is it that an expression can be followed by
an identifier ?!?!), I can see how conflict reporting similar to
YACC's would get much more obtuse, mainly because states get much more
complicated and effects become less local.
Non-bizarre ambiguity rules, however, should make it possible to do to
much friendlier conflict reporting, like "In an If-statement, IF x
THEN IF y THEN z ELSE ... could mean IF x THEN (IF y THEN z ELSE ...)
or IF x THEN ( IF y THEN z ) ELSE ...". That is to say, conflict
reporting by smallest possible example of alternate sentiential forms.
There is no recourse, really, if translator writers don't have to
My thinking is that something like LR(term) (I haven't checked the Tai
references yet, so I'm not calling it non-cannonical SLR) would allow
me to have the same sort of conflict messages that I would get with
LR(1), except that some examples would be given in terms of
non-terminals rather than terminals, e.g., " 'x string' conflicts with
'x A string' " instead of " 'x QUOTE' conflicts with 'x A QUOTE' ".
Return to the
Search the comp.compilers archives again.