Related articles: Error reporting, drw@kronecker.mit.edu (1992-06-04)
Newsgroups: comp.compilers
From: drw@kronecker.mit.edu (Dale R. Worley)
Keywords: parse, errors
Organization: MIT Dept. of Tetrapilotomy, Cambridge, MA, USA
Date: Thu, 4 Jun 1992 03:23:06 GMT

I don't know if these ideas are new, but I found them extremely useful in
the compiler I implemented them in:
When a syntax error is found, dump the parse stack (this was an LALR
parser) in terms of the syntax categories in the language manual.
Generally, each item on the stack can be easily represented by some piece
of the syntax notation presented in the manual. (You need some mechanism
for limiting the number of items dumped.)
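
A minimal sketch of such a stack dump, in Python rather than the language of the original compiler. The table MANUAL_NAMES, the function dump_parse_stack, and the internal symbol names are all hypothetical; a real LALR parser would supply its own symbol table and stack representation.

```python
# Hypothetical mapping from internal grammar symbols to the notation
# a language manual might use. Not taken from any real compiler.
MANUAL_NAMES = {
    "for_kw": "FOR",
    "lparen": "(",
    "expr": "EXPRESSION",
    "semi": ";",
}


def dump_parse_stack(stack, limit=8):
    """Render the newest `limit` stack symbols, eliding older ones with '...'."""
    names = [MANUAL_NAMES.get(sym, sym.upper()) for sym in stack]
    if len(names) > limit:
        names = ["..."] + names[-limit:]
    return " ".join(names)


# Assumed stack contents at the point of the error (bottom of stack first).
stack = ["stmt_list", "for_kw", "lparen", "expr", "semi", "expr", "semi", "expr"]
print("Parse so far:", dump_parse_stack(stack, limit=7))
```

Truncating from the bottom of the stack keeps the symbols nearest the error, which are usually the ones the user needs to see.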
Also, print the complete list of terminals that can be accepted at that
point. To keep this list to a reasonable length, certain 'classes' are
defined, each standing for a group of terminals that are all acceptable
in many states (e.g., "operator", "operand", "statement keyword").
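
One way the class mechanism might work, sketched in Python. The class table TERMINAL_CLASSES and the function summarize_expected are assumptions made for illustration, and the set of acceptable terminals is assumed to have been read off the parser's action table for the current state.

```python
# Hypothetical terminal classes; a real compiler would choose these
# to match the terminology of its own language manual.
TERMINAL_CLASSES = {
    "OPERATOR": {"+", "-", "*", "/", "<", ">", "="},
    "OPERAND": {"IDENTIFIER", "NUMBER", "STRING"},
}


def summarize_expected(acceptable):
    """Replace each fully covered class by its name; list the rest individually."""
    remaining = set(acceptable)
    parts = []
    for name, members in TERMINAL_CLASSES.items():
        # Only collapse a class when every one of its members is acceptable.
        if members <= remaining:
            parts.append(name)
            remaining -= members
    parts.extend(sorted(remaining))
    return " ".join(parts)


# Assumed acceptable terminals after a complete expression inside parentheses.
acceptable = {"+", "-", "*", "/", "<", ">", "=", ")"}
print("Expecting:", summarize_expected(acceptable))
```

Collapsing a class only when all of its members are acceptable keeps the list truthful: the message never claims a terminal is legal when it is not.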

For example, erroneous C code produces reports like the following:

    27: for (i = 1; i < 20; i++ {
    Syntax error                ^
    Parse so far: ... FOR ( EXPRESSION ; EXPRESSION ; EXPRESSION
    Seen: {
    Expecting: OPERATOR )

Or:

    31: a = 2 * ( b + c ;
    Syntax error        ^
    Parse so far: ... EXPRESSION = EXPRESSION * ( EXPRESSION + EXPRESSION
    Seen: ;
    Expecting: OPERATOR )

In either case, this simple, systematic scheme is more useful than the
usual "semicolon expected" sort of syntax error message, because it
doesn't try to second-guess what might have gone wrong, but rather
describes, in a reasonably user-friendly way, why the input was wrong.
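
The report layout itself could be assembled along these lines. This is a sketch in Python under stated assumptions: the function format_syntax_error and its parameters are hypothetical, and the parser is assumed to know the 0-based column of the offending token.

```python
def format_syntax_error(line_no, source_line, column, parse_so_far, seen, expecting):
    """Build a five-line report; `column` is the 0-based offset of the bad token."""
    prefix = f"{line_no}: "
    label = "Syntax error"
    # Pad so the caret lands under the offending token in the echoed line.
    # If the token sits too far left for that, fall back to a single space.
    pad = max(1, len(prefix) + column - len(label))
    return "\n".join([
        prefix + source_line,
        label + " " * pad + "^",
        "Parse so far: " + parse_so_far,
        "Seen: " + seen,
        "Expecting: " + expecting,
    ])


src = "for (i = 1; i < 20; i++ {"
report = format_syntax_error(
    27, src, src.index("{"),
    "... FOR ( EXPRESSION ; EXPRESSION ; EXPRESSION", "{", "OPERATOR )",
)
print(report)
```

Nothing in the report depends on guessing the user's intent; every field is read directly from the parser's state at the moment of failure.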
Together, these features tell the user what the compiler thought was going
on at that point, and what can validly follow. This makes it much easier
to diagnose many errors. For example: (1) When a minor punctuation error
causes the parser to mis-interpret several following tokens before hitting
a syntax error, the stack dump immediately shows that the compiler had an
entirely different view of what was going on than the user did. (2) When a
token was classified differently than the user intended, the dump shows
how the compiler actually classified it. (3) When the
user has poor knowledge of the syntax of the language, the "expected"
display will often correct him more quickly than referring to the manual.
Dale Worley Dept. of Math., MIT drw@math.mit.edu