Subject: Re: Error productions in YACC
From: Gary Merrill <firstname.lastname@example.org>
Date: Mon, 24 Aug 1992 20:17:19 GMT
Well, to me this is the most frustrating and arcane part of writing
parsers in yacc. I think I can insert error productions with pretty good
results now, but I'm not sure that I can explain in any coherent way how I
do it -- and often I am still surprised at the bizarre results!
First, a good understanding of how an LR parser works is a rather
essential bit of background. In terms of books, the most complete
treatment I've seen (and "complete" is hardly the correct description) is
in _Introduction_to_Compiler_Construction_with_UNIX_ by A. T. Schreiner
and H. G. Friedman, Jr. (Prentice-Hall, 1985). In general, and in a
number of ways, this book is not as good as the O'Reilly book, but it does
contain a whole (15-page) chapter on error recovery. Examples using the
techniques discussed then occur throughout the remainder of the book.
The following general guidelines are given for the placement of error
symbols:
1. as close as possible to the start symbol of the grammar
2. as close as possible to each terminal symbol
3. without introducing conflicts
Ha! Ha! The authors do point out that these goals conflict. They then go
on to offer some more specific suggestions for certain common language
constructs (which I shall not reproduce here). They also discuss the use
of yyerrok. Even though the techniques are employed in several examples,
these are still "toy" grammars. But you probably would benefit from
looking at this material.
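A minimal sketch of how the three placement guidelines and yyerrok might look in practice. This is not taken from the Schreiner and Friedman book; the grammar, the token names NUMBER and IDENT, and the messages are all made up for illustration.

```yacc
%{
/* Sketch only: grammar, token names, and messages are illustrative. */
#include <ctype.h>
#include <stdio.h>

int yylex(void);
int yyparse(void);
void yyerror(const char *msg);
%}

%token NUMBER IDENT

%%

program : /* empty */
        | program stmt
        | program error ';'     /* guideline 1: close to the start symbol */
            { yyerrok; fprintf(stderr, "skipped a malformed statement\n"); }
        ;

stmt    : IDENT '=' expr ';'
        | IDENT '=' error ';'   /* guideline 2: next to the terminal ';' */
            { yyerrok; fprintf(stderr, "malformed expression in assignment\n"); }
        ;

expr    : term
        | expr '+' term
        ;

term    : NUMBER
        | IDENT
        | '(' expr ')'
        | '(' error ')'         /* guideline 2: next to the terminal ')' */
            { yyerrok; fprintf(stderr, "malformed parenthesized expression\n"); }
        ;

%%

/* Tiny hand-written scanner: numbers, identifiers, single-character tokens. */
int yylex(void)
{
    int c;

    while ((c = getchar()) == ' ' || c == '\t' || c == '\n')
        ;
    if (c == EOF)
        return 0;
    if (isdigit(c)) {
        while (isdigit(c = getchar()))
            ;
        ungetc(c, stdin);
        return NUMBER;
    }
    if (isalpha(c)) {
        while (isalnum(c = getchar()))
            ;
        ungetc(c, stdin);
        return IDENT;
    }
    return c;
}

/* Still prints the generic message; the error actions add detail. */
void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

int main(void)
{
    return yyparse();
}
```

Each error rule ends in a terminal that serves as the resynchronization point, and yyerrok in the action tells the parser that recovery is complete so the next error is reported rather than silently absorbed. Guideline 3 is the one to check with the generator itself: run yacc on the grammar after adding each error rule and confirm that no new conflicts are reported.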
I like to catch errors at as low a level as possible, but frequently this
tends to inhibit error recovery because you don't have enough context to
know what is best to do. I also don't like the generic "syntax error"
message emitted by the yacc-generated code, and so I suppress it. If this
is done, care must be taken to ensure that you don't fail to diagnose a
syntax error for which you do not have an error production (and explicit
diagnostic).
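One possible way to suppress the generic message without losing errors, sketched under assumptions: the names pending_error, error_count, and diagnose() are hypothetical helpers, and the one-number-per-line grammar is only a stand-in for a real one.

```yacc
%{
/* Sketch only: pending_error, diagnose(), and the grammar are illustrative. */
#include <ctype.h>
#include <stdio.h>

int yylex(void);
int yyparse(void);

static int pending_error = 0;   /* an error was detected but not yet reported */
static int error_count = 0;

/* Swallow the generic "syntax error" text, but remember that it happened. */
void yyerror(const char *msg)
{
    (void)msg;
    pending_error = 1;
}

/* Every error production reports through here. */
static void diagnose(const char *text)
{
    fprintf(stderr, "error: %s\n", text);
    error_count++;
    pending_error = 0;
}
%}

%token NUMBER

%%

input   : /* empty */
        | input line
        ;

line    : NUMBER '\n'
        | error '\n'    { yyerrok; diagnose("expected a number on this line"); }
        ;

%%

/* Tiny scanner: numbers, newlines, and any other character as itself. */
int yylex(void)
{
    int c;

    while ((c = getchar()) == ' ' || c == '\t')
        ;
    if (c == EOF)
        return 0;
    if (isdigit(c)) {
        while (isdigit(c = getchar()))
            ;
        ungetc(c, stdin);
        return NUMBER;
    }
    return c;
}

int main(void)
{
    int failed = yyparse();

    /* Safety net: the parser gave up on an error no production diagnosed. */
    if (failed && pending_error)
        diagnose("syntax error");
    return error_count != 0;
}
```

The flag is what guards against the silent failure: if input ends in the middle of a bad line, no error rule is ever reduced, yyparse() returns nonzero with pending_error still set, and the fallback message is printed instead of nothing at all.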
In addition to these issues you need to deal with choices of when you want
to scan to a specific token (a ';', ')', or '}', for example) as part of
your recovery. Finally, one alternative that is quite effective in some
circumstances is to actually have your grammar *accept* what you know is
bad syntax, but diagnose it as an error. This can facilitate "recovery"
since you have in fact not entered a parser error state.
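A small sketch contrasting the two approaches, assuming a made-up list syntax such as "(1, 2, 3);". The rule with the trailing comma is an illustrative choice of "known bad" syntax, not something from a particular grammar.

```yacc
%{
/* Sketch only: the list syntax and the messages are illustrative. */
#include <ctype.h>
#include <stdio.h>

int yylex(void);
int yyparse(void);
void yyerror(const char *msg);

static int error_count = 0;
%}

%token NUMBER

%%

input   : /* empty */
        | input list ';'
        | input error ';'       /* ordinary recovery: scan ahead to ';' */
            { yyerrok; }
        ;

list    : '(' items ')'
        | '(' items ',' ')'     /* known-bad syntax, accepted on purpose */
            { error_count++;
              fprintf(stderr, "error: trailing comma in list\n"); }
        ;

items   : NUMBER
        | items ',' NUMBER
        ;

%%

/* Tiny scanner: numbers and single-character tokens. */
int yylex(void)
{
    int c;

    while ((c = getchar()) == ' ' || c == '\t' || c == '\n')
        ;
    if (c == EOF)
        return 0;
    if (isdigit(c)) {
        while (isdigit(c = getchar()))
            ;
        ungetc(c, stdin);
        return NUMBER;
    }
    return c;
}

/* Called only for errors the grammar does not explicitly accept. */
void yyerror(const char *msg)
{
    error_count++;
    fprintf(stderr, "error: %s\n", msg);
}

int main(void)
{
    int failed = yyparse();

    return failed || error_count != 0;
}
```

For "(1, 2,);" the parser reduces the second list rule like any other production, so yyerror() is never called, no tokens are discarded, and the message can say exactly what is wrong. Anything else that is malformed falls through to the error rule and is skipped up to the next ';'.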
I still view error reporting and recovery in yacc as a black art and would
be happy to learn from those who are more experienced.
Gary H. Merrill [Principal Systems Developer, C Compiler Development]
SAS Institute Inc. / SAS Campus Dr. / Cary, NC 27513 / (919) 677-8000
email@example.com ... !mcnc!sas!sasghm