Re: Error detection under Yacc

"Srik" <>
21 Aug 1999 02:03:39 -0400

          From comp.compilers


From: "Srik" <>
Newsgroups: comp.compilers
Date: 21 Aug 1999 02:03:39 -0400
Organization:! at Internet America
References: 99-08-044
Keywords: yacc, errors, comment

patrick cohen wrote in message 99-08-044...
>Hello every body,
> What is the best way to detect an error under Yacc?


I have employed, with pretty good success, one scheme which is a
pretty generic solution -- no need to do anything special with your
grammar. I have never seen this done anywhere (I swear!) and it is
something I came up with in the past.

This solution would also work if you use error productions. All it
assumes is that when the parser meets an error, you are given control
and at that single point, you have access to the current state of the
parser. What you do with the state is up to you. I don't know what
version of yacc you are using, so I leave it up to you to figure out
the actual variable in the yacc-generated C code that contains the
current state of the parser.

So what I did is this (it's pretty painful, but I promise, it's worth
it):

First, Yacc versions usually have a command line option (-v in most
versions) which produces a y.out file (most versions name it
y.output). This is the key to the solution. Generate the y.out file
for your grammar. The y.out file contains information about every
state the parser can get into. Also, for each state, it produces a
line from the production rule like:

<tokens met> . <tokens expected>

The "." identifies where the parser is right now.

You may be getting a clue as to where I'm heading. I wrote a
simple awk script which reads the y.out file produced by yacc for a
given grammar and writes out a C compilable table of the form:

{ <int>, "<token expected>" },
{ <int>, "<token expected>" },

where <int> is the state and <token expected> is the first thing on
the right hand side of the "." in y.out. This table is then compiled
as part of your compiler.
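
The original script was awk and is not shown here; as a rough
illustration of the same extraction step, here is a minimal C sketch.
It assumes item lines where symbols are separated by spaces and the
position marker appears as " . ", and that a dot at the end of a rule
may be followed by a rule number in parentheses. The exact layout of
the report differs between yacc versions, so check your own y.out
first. The names expected_symbol and format_entry are made up for this
example.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Copy the symbol that follows the " . " marker in one item line into
 * `out`. Returns 1 on success, 0 if the line has no marker or the dot
 * sits at the end of the rule.
 * Assumed layout (varies by yacc version):
 *     stmt : IF '(' expr . ')' stmt
 */
int expected_symbol(const char *item, char *out, size_t outsize)
{
    const char *p = strstr(item, " . ");
    size_t len;

    if (p == NULL || outsize == 0)
        return 0;
    p += 3;                             /* skip the " . " marker */
    while (*p == ' ' || *p == '\t')
        p++;
    len = strcspn(p, " \t\r\n");
    /* Nothing after the dot, or only a trailing rule number such as
     * "(3)": the rule is complete, so there is no expected symbol. */
    if (len == 0 || *p == '(')
        return 0;
    if (len >= outsize)
        len = outsize - 1;
    memcpy(out, p, len);
    out[len] = '\0';
    return 1;
}

/* Write one table row of the form { <int>, "<token expected>" }, into
 * `out`, escaping characters that would break a C string literal.
 * Returns the value of snprintf. */
int format_entry(int state, const char *sym, char *out, size_t outsize)
{
    char esc[128];
    size_t j = 0;

    for (; *sym != '\0' && j + 2 < sizeof esc; sym++) {
        if (*sym == '"' || *sym == '\\')
            esc[j++] = '\\';
        esc[j++] = *sym;
    }
    esc[j] = '\0';
    return snprintf(out, outsize, "{ %d, \"%s\" },", state, esc);
}
```

A driver would read y.out line by line, remember the number from the
most recent state header, and call these two functions on each item
line that follows it.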

The rest of the solution is simple. You do have to modify the C code
that yacc generated a bit to do this, which I find extremely annoying,
but then it works. When a syntax error occurs, you have both the
state of the parser and the text of the current token. All you have
to do is search through the C table for the state you are currently
in, then print a message of the form:

Expected <value from above table> but got <current token text>
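
The lookup step might look roughly like the C sketch below. The table
contents are invented sample rows, and the names expect_entry,
expect_table and format_syntax_error are hypothetical. How you obtain
the state number depends on your yacc version, so it is simply passed
in as a parameter here. Since a state can have several rows in the
table, this sketch joins all of them with "or"; the fallback message
for a state with no rows is also an assumption of the sketch.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct expect_entry {
    int state;              /* parser state number from y.out */
    const char *expected;   /* first symbol to the right of the "." */
};

/* Hypothetical sample rows; in practice this table is generated from
 * the y.out file for your grammar. */
static const struct expect_entry expect_table[] = {
    { 3, "')'" },
    { 5, "NUMBER" },
    { 5, "'('" },
};

/* Append `s` to the NUL-terminated string in `out`, truncating if the
 * buffer is too small. */
static void append(char *out, size_t outsize, const char *s)
{
    size_t used = strlen(out);

    if (used + 1 < outsize)
        snprintf(out + used, outsize - used, "%s", s);
}

/* Build "Expected A or B but got <token>" for the given state.
 * Returns the number of table rows that matched the state. */
int format_syntax_error(int state, const char *token_text,
                        char *out, size_t outsize)
{
    size_t n = sizeof expect_table / sizeof expect_table[0];
    size_t i;
    int found = 0;

    if (outsize == 0)
        return 0;
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        if (expect_table[i].state != state)
            continue;
        append(out, outsize, found ? " or " : "Expected ");
        append(out, outsize, expect_table[i].expected);
        found++;
    }
    /* No rows for this state: fall back to a generic message. */
    append(out, outsize, found ? " but got " : "Syntax error near ");
    append(out, outsize, token_text);
    return found;
}
```

The error hook in the generated parser would call
format_syntax_error with the current state and the token text, then
print the resulting buffer.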

This method is elegant, works for all grammars, is marginally painful
to develop but is well worth the effort. I have used it in two
completely different compilers so far.

Again, the key to the whole thing is the y.out file. Generate it first
and you will immediately see what I'm talking about.

Hope this helps.

/Srikanth Subramanian
[I don't know if anyone's done it by postprocessing y.output, but lots
of people have written LALR error routines that report what the valid
tokens in the error state are. It's not clear to me that this is
particularly useful, both because yacc's default reductions can put
the parser in a state fairly far from the one you'd expect to be in at
error time, and some simple errors like a mismatched close brace can
completely confuse a parser so what it's expecting bears no relation
to what the programmer intended. -John]
