Unusual error recovery in Yacc parser

Stephan Mueller <smueller@MICROSOFT.com>
23 Feb 1996 00:07:33 -0500

          From comp.compilers

Related articles
Unusual error recovery in Yacc parser smueller@MICROSOFT.com (Stephan Mueller) (1996-02-23)
Re: Unusual error recovery in Yacc parser kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763) (1996-02-27)
| List of all articles for this month |

From: Stephan Mueller <smueller@MICROSOFT.com>
Newsgroups: comp.compilers
Date: 23 Feb 1996 00:07:33 -0500
Organization: Compilers Central
Keywords: yacc, question

Suppose we have a Yacc (actually Bison) parser for a language
consisting of, in general, arbitrary keywords in arbitrary order. We
have one rule for handling syntax errors, which is to completely
ignore any keyword found at an inappropriate time. How do I do this

Getting more concrete: I have a Bison-based parser for RTF. RTF (Rich
Text Format) text consists of plain text, easily identifiable
keywords, and groups delimited by '{' and '}' containing RTF. Some
groups are 'special' in that they may contain only a subset of the
roughly 800 keywords currently defined in Rtf. There are numerous
problems in parsing RTF, but I've overcome most of these with a rather
smart lexer. The lexer returns a separate token for each possible
keyword. Things that look like keywords but are unknown are discarded
by the lexer, so every keyword seen by the grammar is legal in some

The principal problem is that while in general, any keyword can occur
at any time, there are just enough special '{' and '}' delimited
groups that littering the grammar with error productions to handle
keywords seen in inappropriate groups is messy, requires some
restructuring of the grammar to avoid shift/reduce conflicts, and
leaves me with an uneasy feeling that there will be cases I miss.

I'm currently toying with the idea of futzing with the internals of
bison.simple. I'm thinking of snapshoting all relevant state whenever
YYLEX is called, and thereafter maintaining an 'undo list' of
shifts/reductions. Then, if we hit a syntax error, I'll undo the
shifts/reductions, restore the relevant state and discard the token
that caused us grief. If all proceeds well to the point that YYLEX is
called again, the undo list can be discarded and a new one started.
While this could be fun, it also leaves me with an uneasy feeling that
I've taken the first big step to the Dark Side of the Force.

Thanks in advance for any wisdom on the subject, as I'm confident I
won't live long enough to try all possible mistakes myself.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.