Re: Error productions in YACC

Gary Merrill <sasghm@unx.sas.com>
Mon, 24 Aug 1992 20:17:19 GMT

          From comp.compilers

Newsgroups: comp.compilers
Organization: Compilers Central
Keywords: yacc, errors

Well, to me this is the most frustrating and arcane part of writing
parsers in yacc. I think I can insert error productions with pretty good
results now, but I'm not sure that I can explain in any coherent way how I
do it -- and often I am still surprised at the bizarre results!


First, a good understanding of how an LR parser works is essential
background. In terms of books, the most complete treatment I've seen (and
"complete" is hardly the correct description) is in
_Introduction_to_Compiler_Construction_with_UNIX_ by A. T. Schreiner and
H. G. Friedman, Jr. (Prentice-Hall, 1985). In a number of ways this book
is not as good as the O'Reilly book, but it does contain a whole (15 page)
chapter on error recovery. Examples using the techniques discussed then
occur throughout the remainder of the book.


The following general guidelines are given for the placement of error
symbols:


1. as close as possible to the start symbol of the grammar


2. as close as possible to each terminal symbol


3. without introducing conflicts


Ha! Ha! The authors do point out that these goals conflict. They then go
on to offer some more specific suggestions for certain common language
constructs (which I shall not reproduce here). They also discuss the use
of yyerrok. Even though the techniques are employed in several examples,
these are still "toy" grammars. But you probably would benefit from
looking at this material.
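
To make the placement guidelines concrete, here is a minimal sketch of a
small grammar with one error rule high up (at statement level, just below
the start symbol) and one right next to a terminal (the closing
parenthesis). It is not taken from the book; the grammar, the NUMBER token
and the toy lexer are all assumptions made up for illustration. Each error
rule calls yyerrok once its synchronizing token has been seen, so that the
parser resumes normal error reporting immediately.

```yacc
%{
#include <stdio.h>
#include <ctype.h>
int yylex(void);
void yyerror(const char *msg);
%}

%token NUMBER

%%

program : /* empty */
        | program stmt
        ;

stmt    : expr ';'
        | error ';'         /* high in the grammar: skip to the next ';' */
            { yyerrok; }
        ;

expr    : NUMBER
        | expr '+' NUMBER
        | '(' expr ')'
        | '(' error ')'     /* close to a terminal: skip to the ')' */
            { yyerrok; }
        ;

%%

/* Toy lexer (hypothetical): integers are NUMBER, whitespace is skipped,
   every other character is returned as itself. */
int yylex(void)
{
    int c;

    while ((c = getchar()) == ' ' || c == '\t' || c == '\n')
        ;
    if (c == EOF)
        return 0;
    if (isdigit(c)) {
        while (isdigit(c = getchar()))
            ;
        ungetc(c, stdin);
        return NUMBER;
    }
    return c;
}

void yyerror(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
}

int main(void)
{
    return yyparse();
}
```

Something like "yacc sketch.y && cc y.tab.c" should build it. An error
inside parentheses is caught by the inner rule, so the rest of the
statement still parses; anything else falls through to the statement-level
rule and costs the whole statement.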


I like to catch errors at as low a level as possible, but this frequently
inhibits error recovery because you don't have enough context to know what
is best to do. I also don't like the generic "syntax error" message
emitted by the yacc-generated code, and so I suppress it. If you do this,
you must take care that every syntax error still gets diagnosed, including
those for which you have no error production (and so no explicit
diagnostic).
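
One possible way to do the suppression safely is sketched below. This is
an assumption about how it might be arranged, not a description of any
particular compiler: yyerror() stays silent and only records that an
error occurred, each error production reports its own message through a
hypothetical helper diagnose(), and main() emits a fallback message if the
parser gave up before any error production was reduced. The grammar and
lexer are again invented for the example.

```yacc
%{
#include <stdio.h>
#include <ctype.h>
int yylex(void);
void yyerror(const char *msg);
static void diagnose(const char *msg);
static int undiagnosed = 0;  /* errors seen by yyerror() but not reported */
%}

%token NUMBER

%%

program : /* empty */
        | program stmt
        ;

stmt    : expr ';'
        | error ';'
            { diagnose("malformed statement skipped"); yyerrok; }
        ;

expr    : NUMBER
        | expr '+' NUMBER
        ;

%%

/* Called by the generated parser. Print nothing; just remember it. */
void yyerror(const char *msg)
{
    (void)msg;
    undiagnosed++;
}

/* Hypothetical helper: report a specific message and mark the pending
   error as handled. */
static void diagnose(const char *msg)
{
    fprintf(stderr, "error: %s\n", msg);
    undiagnosed = 0;
}

int yylex(void)
{
    int c;

    while ((c = getchar()) == ' ' || c == '\t' || c == '\n')
        ;
    if (c == EOF)
        return 0;
    if (isdigit(c)) {
        while (isdigit(c = getchar()))
            ;
        ungetc(c, stdin);
        return NUMBER;
    }
    return c;
}

int main(void)
{
    int rc = yyparse();

    /* Safety net: the parser hit an error that no error production
       ever reported (for example, end of input before the next ';'). */
    if (undiagnosed > 0)
        fprintf(stderr, "error: syntax error\n");
    return rc;
}
```

The counter is what keeps the suppression honest: if input ends in the
middle of a bad statement, the "error ';'" rule is never reduced, and
without the check in main() the user would see no message at all.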


In addition to these issues you need to decide when you want to scan
ahead to a specific token (a ';', ')', or '}', for example) as part of
your recovery. Finally, one alternative that is quite effective in some
circumstances is to have your grammar actually *accept* what you know is
bad syntax, but diagnose it as an error. This can facilitate "recovery"
since the parser never in fact enters an error state.
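
A minimal sketch of the "accept it, then complain" approach, under the
assumption of a toy expression grammar: empty parentheses are not a valid
expression here, but the grammar has an ordinary production for them
whose action issues the diagnostic. No error token is involved, so
yyerror() is not called for this case and no input is discarded. The
nerrors counter and the lexer are hypothetical names introduced for the
example.

```yacc
%{
#include <stdio.h>
#include <ctype.h>
int yylex(void);
void yyerror(const char *msg);
static int nerrors = 0;   /* errors diagnosed by actions or by yyerror() */
%}

%token NUMBER

%%

program : /* empty */
        | program expr ';'
        ;

expr    : NUMBER
        | expr '+' NUMBER
        | '(' expr ')'
        | '(' ')'           /* known-bad syntax, accepted on purpose */
            {
                fprintf(stderr,
                        "error: expression expected inside parentheses\n");
                nerrors++;
            }
        ;

%%

int yylex(void)
{
    int c;

    while ((c = getchar()) == ' ' || c == '\t' || c == '\n')
        ;
    if (c == EOF)
        return 0;
    if (isdigit(c)) {
        while (isdigit(c = getchar()))
            ;
        ungetc(c, stdin);
        return NUMBER;
    }
    return c;
}

void yyerror(const char *msg)
{
    fprintf(stderr, "error: %s\n", msg);
    nerrors++;
}

int main(void)
{
    int rc = yyparse();

    /* yyparse() succeeds on "( ) ;", so the exit status has to come
       from our own count as well. */
    return (rc != 0 || nerrors > 0) ? 1 : 0;
}
```

Because the parser treats the bad construct as a normal reduction, the
exit status and any later compilation phases must consult the error count
rather than the return value of yyparse() alone.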


I still view error reporting and recovery in yacc as a black art and would
be happy to learn from those who are more experienced.
---
Gary H. Merrill [Principal Systems Developer, C Compiler Development]
SAS Institute Inc. / SAS Campus Dr. / Cary, NC 27513 / (919) 677-8000
sasghm@theseus.unx.sas.com ... !mcnc!sas!sasghm