Re: Error reporting, was Infinite look ahead required by C++?

"Ira Baxter" <idbaxter@semdesigns.com>
Mon, 15 Feb 2010 15:33:02 -0600

          From comp.compilers


From: "Ira Baxter" <idbaxter@semdesigns.com>
Newsgroups: comp.compilers
Date: Mon, 15 Feb 2010 15:33:02 -0600
Organization: Compilers Central
References: 10-02-024 10-02-029 10-02-047 10-02-055 10-02-062 10-02-064
Keywords: errors, parse
Posted-Date: 16 Feb 2010 10:27:47 EST

"Stephen Horne" <sh006d3592@blueyonder.co.uk> wrote in message
> On Sat, 13 Feb 2010 18:24:28 -0700, wclodius@los-alamos.net (William
> Clodius) wrote:
>
> In LR(1), it is *easy* to give a message of the form "expected one of
> <token list>, but <token> was found." -
>
> Yacc and Bison don't support reporting errors in this form AFAIK, but
> the tool isn't the same as the algorithm the tool uses.


One more reason not to use these tools, or at least to build a groundswell
for some open source contributor to integrate such error reporting.


> For generalised LR, of course, the picture is a bit more complex -
> but something like this could still be done - just report the
> rightmost location where your last parses were exhausted and base
> your message on the set of states that occurred at that point
> (either the cases where backtracking was triggered at that point, or
> the cases for the last few rejected stack-duplicates).


We do the following with our GLR parsers: when a syntax error is
found, we attempt the deletion of the input token, and the insertion
of every other possible token (done cleverly by inspecting
follow-states), and report as "expected" those tokens that enable a
transition into a state where a next token is required. This produces
fairly good error reports, certainly picking up missing punctuation
where needed. We haven't tried backing up and deleting the previous
token.
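
A minimal sketch of the delete/insert probing described above, in
Python. Everything here is an assumption made for illustration: the
toy grammar (a sequence of "ID = expr ;" statements), the token names,
and the functions error_position and suggest_repairs are hypothetical,
and the parser is a small recursive-descent recognizer rather than a
GLR engine. It also tries every token by brute force instead of
inspecting follow-states.

```python
TOKENS = ["ID", "NUMBER", "=", "+", "(", ")", ";"]  # toy vocabulary (assumed)

class ParseError(Exception):
    def __init__(self, pos):
        super().__init__(f"syntax error at token {pos}")
        self.pos = pos

class _OutOfInput(Exception):
    """Raised when the parser needs another token and none is left."""

def error_position(tokens):
    """Index of the first token the toy parser rejects, or None if
    `tokens` is a viable prefix of a program."""
    pos = 0

    def peek():
        if pos >= len(tokens):
            raise _OutOfInput
        return tokens[pos]

    def expect(kind):
        nonlocal pos
        if peek() != kind:
            raise ParseError(pos)
        pos += 1

    def term():
        nonlocal pos
        tok = peek()
        if tok in ("ID", "NUMBER"):
            pos += 1
        elif tok == "(":
            pos += 1
            expr()
            expect(")")
        else:
            raise ParseError(pos)

    def expr():
        nonlocal pos
        term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            term()

    def statement():
        expect("ID")
        expect("=")
        expr()
        expect(";")

    try:
        while pos < len(tokens):
            statement()
    except _OutOfInput:
        return None          # ran out of input: still a viable prefix
    except ParseError as err:
        return err.pos
    return None

def suggest_repairs(tokens):
    """Probe single-token deletion and insertion at the error point."""
    bad = error_position(tokens)
    if bad is None:
        return None

    def gets_past(candidate, index):
        # The repair helps if the parser no longer fails at or before `index`.
        p = error_position(candidate)
        return p is None or p > index

    # Deletion: drop the offending token and see whether parsing continues.
    delete_ok = gets_past(tokens[:bad] + tokens[bad + 1:], bad - 1)
    # Insertion: a token is "expected" if, once inserted, the parser
    # accepts both it and the original offending token.
    expected = [t for t in TOKENS
                if gets_past(tokens[:bad] + [t] + tokens[bad:], bad + 1)]
    return {"position": bad, "found": tokens[bad],
            "expected": expected, "delete_ok": delete_ok}
```

For the token sequence ID = ; this sketch reports ID and NUMBER as
expected. It does not report "(" because inserting "(" alone does not
let the toy parser accept the following ";". For ID = NUMBER ID it
reports "+" and ";", which is the missing-punctuation case.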


There's always the question of how big a patch are you willing to make
to do error recovery. The most interesting paper I read on this topic
(sorry, don't recall author) suggested running the real parser some N
tokens (for fixed N) behind the error checking parser, and allowing up
to N error-checking parser tokens to be revised. Don't know if this
would be really effective, but it sounded reasonable.
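
A rough Python sketch of the lagging idea, under stated assumptions:
the function lagged_check, the toy viable-prefix test parens_viable,
and the return shape are all hypothetical and are not taken from the
paper mentioned above. The checker accepts tokens one at a time, and
only tokens more than N positions behind it are handed over as
committed. When the checker hits an error, the last (up to) N tokens
are still uncommitted and so remain open to revision.

```python
from collections import deque

def parens_viable(tokens):
    """Toy viable-prefix test (assumed): parentheses never close too early."""
    depth = 0
    for tok in tokens:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:
                return False
    return True

def lagged_check(tokens, n, is_viable_prefix):
    """Return (committed, revisable, offending_token).

    `committed` holds tokens the real parser may act on; `revisable`
    holds at most `n` tokens the checker accepted but that a repair
    may still change. Re-checking the whole prefix each step is
    quadratic; a real parser would keep its state instead.
    """
    committed = []
    window = deque()
    for tok in tokens:
        if not is_viable_prefix(committed + list(window) + [tok]):
            return committed, list(window), tok
        window.append(tok)
        if len(window) > n:
            committed.append(window.popleft())
    # No error: everything can be committed.
    committed.extend(window)
    return committed, [], None
```

With n = 2 and the input ( ( ) ) ), the first two tokens are committed
and the two ")" tokens before the offending one are still revisable.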


There's also the possibility of reporting not individual tokens,
but rather the largest-covering nonterminal. When you have
a syntax error like the following:
            x = ;
an error report of the form of "Inserted NUMBER" is OK,
but one that says "Inserted EXPRESSION" is lots better.
This seems like it should be pretty easy to implement at least
with the infrastructure we have, but we just haven't gotten around to it.
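
One way this might be approximated, sketched in Python. This is a
heuristic of my own for illustration, not the implementation the post
has in mind: the FIRST table is hand-written for a hypothetical toy
grammar, and describe_expected is a made-up name. It replaces a set of
expected tokens with the outermost nonterminal whose FIRST set is
fully contained in that set. A real parser would also have to confirm
that the nonterminal is actually valid in the error state.

```python
# Hypothetical FIRST sets, listed outermost nonterminal first.
FIRST = {
    "EXPRESSION": {"ID", "NUMBER", "("},
    "TERM": {"ID", "NUMBER", "("},
}

def describe_expected(expected, first_sets=FIRST):
    """Collapse expected tokens into a covering nonterminal where possible.

    Picks the nonterminal with the largest FIRST set contained in
    `expected`; ties go to the one listed first (the outermost).
    Tokens not covered are reported individually after it.
    """
    expected = set(expected)
    best = None
    for name, first in first_sets.items():
        if first and first <= expected:
            if best is None or len(first) > len(first_sets[best]):
                best = name
    if best is None:
        return sorted(expected)
    return [best] + sorted(expected - first_sets[best])
```

Given the expected set {ID, NUMBER, "("}, this returns ["EXPRESSION"],
so the report for "x = ;" could read "Inserted EXPRESSION" rather than
naming one token. A set such as {"+", ";"} is returned unchanged.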


> The problem isn't the parsing technology, but the language
> itself. If the language is ambiguous enough to need generalised LR,
> then at the point of the *real* error, no error may be detected
> because there are alternative parses yet to be rejected. When those
> last possibilities are exhausted so that a syntax error is proved,
> the parser may have gone quite some distance past the real problem,
> making the cause difficult to trace irrespective of how the
> diagnostic is worded.


Agree that the language can make it harder.


-- IDB

