Related articles |
---|
LR(1) Parsing : Error Handling & Recovery seimarao@gmail.com (Seima Rao) (2014-07-10) |
Re: LR(1) Parsing : Error Handling & Recovery ivan@ootbcomp.com (Ivan Godard) (2014-07-16) |
Re: LR(1) Parsing : Error Handling & Recovery news@fx29.iad.highwinds-media.com (Eric) (2014-07-16) |
Re: LR(1) Parsing : Error Handling & Recovery drikosev@otenet.gr (Evangelos Drikos) (2014-07-17) |
Re: LR(1) Parsing : Error Handling & Recovery ivan@ootbcomp.com (Ivan Godard) (2014-07-17) |
Re: LR(1) Parsing : Error Handling & Recovery ivan@ootbcomp.com (Ivan Godard) (2014-07-17) |
Re: LR(1) Parsing : Error Handling & Recovery gneuner2@comcast.net (George Neuner) (2014-07-17) |
Re: LR(1) Parsing : Error Handling & Recovery wclodius@earthlink.net (2014-07-18) |
Re: LR(1) Parsing : Error Handling & Recovery wclodius@earthlink.net (2014-07-18) |
[22 later articles] |
From: | Eric <news@fx29.iad.highwinds-media.com> |
Newsgroups: | comp.compilers |
Date: | Wed, 16 Jul 2014 16:40:01 -0400 (EDT) |
Organization: | Compilers Central |
References: | 14-07-023 |
Keywords: | parse, errors |
Posted-Date: | 16 Jul 2014 16:40:01 EDT |
Seima Rao wrote:
> How far is impressive error handling & recovery possible by LR(1) parsers ?
>
> What would be good references for the subject?
>
> As an aside, do the famous compilers use LR(1) parsing ?
I can take a crack at answering some of this. I built my own LALR1
parse generator and parser about 20 years ago. The parser had both
Error Repair and Error Handler. I can describe how I implemented
these.
There are two mechanisms:
Error Repair tries to fix the input token stream when a syntax error
is detected. This was popular back when compiling was much more
expensive and done as submitted batch jobs. The desire was to not
abort a compile job or toss away source just because of some syntax
error. Error Repair attempts to keep as much of the original input
stream and find as many errors as possible in one pass.
The basic repair mechanism tries to either Keep, Insert or Delete
input stream tokens according to their cost. The "best" repair is the
one with the minimum cost.
Error Handling is a simpler mechanism and essentially works by tossing
away tokens until a recognized state is reached. It can be used when
Error Repair is not available or has failed. Error Handling is
conceptually like an exception handler (this is just conceptual, it
has nothing to do with C++ or operating system exceptions).
The LALR1 parser pushes items on the stack. If it reaches a state
where the next token cannot be consumed because there is no next state
for it (a syntax error), then the parser does the equivalent of
"throwing an exception". It scans backwards through the parse stack
checking the state for a special terminal called ERROR. If it finds
one it rolls back the parse stack to that point, pushes the ERROR
terminal on the stack and calls the associated reduction routine as
though a match on the input terminal ERROR had been found. That
reduction action routine is expected to do clean up.
For example
statement
: assignment_statement
| if_statement
| loop_statement
| ERROR { action_error(); };
Say we are parsing "foo = a * b / / c;".
At the second '/' token a syntax error is detected.
The parse stack is scanned backwards until we find a state
that can shift the ERROR token. That matches a production
for "assignment" so the action_error() routine is called
which reads input token stream until it consumes a ';'.
(I'm glossing over grotty details like who cleans up
the parse stack and how is memory recovered.)
Finally the assignment production is pushed on the
parse stack and we continue as if nothing had gone wrong.
Anyway, that is the basic idea.
I found Error Repair more interesting but is much more difficult.
Most parsers seem to use Error Handling because it is much easier
to implement and recompiling is cheap.
Eric
Return to the
comp.compilers page.
Search the
comp.compilers archives again.