Re: A big easy subset of the CFGs

"Matt Timmermans" <mtimmerm@microstar.no-spam.com>
24 Jun 1998 00:11:35 -0400

From comp.compilers

Related articles
A big easy subset of the CFGs mtimmerm@microstara.com (Matt Timmermans) (1998-06-18)
Re: A big easy subset of the CFGs mickunas@cs.uiuc.edu (1998-06-19)
Re: A big easy subset of the CFGs cfc@world.std.com (Chris F Clark) (1998-06-19)
*Re: A big easy subset of the CFGs mtimmerm@microstar.no-spam.com (Matt Timmermans)* (1998-06-24)**
Re: A big easy subset of the CFGs scavadini@hotmail.com (Salvador Valerio Cavadini) (1998-06-24)
Re: A big easy subset of the CFGs clark@quarry.zk3.dec.com (Chris Clark USG) (1998-07-01)

| List of all articles for this month |

From:	"Matt Timmermans" <mtimmerm@microstar.no-spam.com>
Newsgroups:	comp.compilers
Date:	24 Jun 1998 00:11:35 -0400
Organization:	IGS - Information Gateway Services
References:	98-06-104 98-06-121
Keywords:	parse

Thanks, Dennise and Chris, for your excellent replies.

Chris F Clark wrote...
>Given what your desires appear to be, my guess is that your goal is to
>try to have a parser which recovers gracefully from single-character
>errors (e.g. missing closing quotes).

Good guess, but no.

>The task is harder than it looks.

Gee, I don't know -- it looks pretty hard.

What I'm doing is making a sort of syntax-directed translation package
for translator writers who understand grammars but not parsing theory,
and users (those who do translations) who understand neither. For the
translator writers, I need to provide a simple and consistent model of
their language, and error messages that make sense to people who
understand only grammars -- hence the need for "non-bizarre" ambiguity
rules.

I do have to include regular expression operators in the grammar, but
this is a non-issue, since I will require optional and repeating parts
to have names. This is necessary to give nice error messages to the
user, and makes them into non-terminals.

>However, a second reason why we haven't pursued non-canonical parsing
>more actively is that it actually complicates the conflict reporting
>process. The types of conflicts which can occur in a non-canonical
>parser are more subtle than in a simple LR parser. Our experiments
>suggest that writting and debugging a non-canonical grammar may
>actually be more difficult (once you get past trivial grammars).

This is the part that bothers me. How sure are you that this is a
product of non-cannonical parsing, rather than the conflict report
itself? Thinking of what you have to do sometimes to resolve an
LALR(1) conflict (ACK! how is it that an expression can be followed by
an identifier ?!?!), I can see how conflict reporting similar to
YACC's would get much more obtuse, mainly because states get much more
complicated and effects become less local.

Non-bizarre ambiguity rules, however, should make it possible to do to
much friendlier conflict reporting, like "In an If-statement, IF x
THEN IF y THEN z ELSE ... could mean IF x THEN (IF y THEN z ELSE ...)
or IF x THEN ( IF y THEN z ) ELSE ...". That is to say, conflict
reporting by smallest possible example of alternate sentiential forms.
There is no recourse, really, if translator writers don't have to
understand parsing.

My thinking is that something like LR(term) (I haven't checked the Tai
references yet, so I'm not calling it non-cannonical SLR) would allow
me to have the same sort of conflict messages that I would get with
LR(1), except that some examples would be given in terms of
non-terminals rather than terminals, e.g., " 'x string' conflicts with
'x A string' " instead of " 'x QUOTE' conflicts with 'x A QUOTE' ".

--

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: A big easy subset of the CFGs

"Matt Timmermans" <mtimmerm@microstar.no-spam.com>24 Jun 1998 00:11:35 -0400

"Matt Timmermans" <mtimmerm@microstar.no-spam.com>
24 Jun 1998 00:11:35 -0400