Related articles |
---|
recursive-descent error recovery decvax!utzoo!henry (1987-08-13) |
Re: recursive-descent error recovery chuck@amdahl.amdahl.com (1987-08-15) |
Re: recursive-descent error recovery decvax!utzoo!henry (1987-08-17) |
Re: recursive-descent error recovery harvard!seismo!cs.rochester.edu!scott (1987-08-16) |
recursive-descent error recovery cullvax!drw@EDDIE.MIT.EDU (1987-08-17) |
Re: recursive-descent error recovery decvax!utzoo!henry (1987-08-22) |
From: | decvax!utzoo!henry |
Date: | Thu, 13 Aug 87 20:05:29 edt |
References: | <634@ima.ISC.COM>, <642@ima.ISC.COM> |
> [I'd be interested in hearing what the recursive descent lexer feeds to the
> parser when the input is something like a = b + ( ; -John]
The basic approach in the lexer is (a) give the customer what he wants, and
(b) if what he wants doesn't match what's in the input, apply some simple
heuristics to try to resynchronize. Say we've just parsed the "b", so the
parser knows we're in an expression that is part of a statement; here is
the dialogue, slightly simplified:
Parser: "Do you have any of <several expression continuations like '+'>?"
Lexer: "Yes, I have '+'."
P (to itself): "Aha, there's more to this expression."
P: "Do you have any of <things that might start an operand, like '('>?"
L: "Yes, I have '('."
P (to itself): "Aha, a parenthesized expression."
P: "Give me one of <things that would start an expression>."
L (to itself): "<gurgling noises> That's not in my input! I'm going to
have to fake it. Let's see... what I see is a line terminator,
and he's not asking for a line terminator, so I will try to fake
the rest of the line to resynchronize. What's the most harmless
thing I can give him, out of the list he asked for?..."
L: "I have '0'."
P: "Do you have any of <things that might continue the expression>?"
L (to itself): "To avoid infinite loops, I should try to cut this short."
L: "No."
P (to itself): "Must be the end of the expression."
P: "Give me ')'."
L (to itself): "Keep on faking it..."
L: "Okay, I have ')'."
P: "Do you have any of <things that might continue the outer expression>?"
L (to itself): "Again, try to cut it short."
L: "No."
P (to itself): "Must be the end of the expression."
P: "Give me ';'."
L (to itself): "<sigh of relief> Resynchronized! Now I can stop lying."
L: "I have ';'."
... and so forth. The details of resynchronization heuristics can get messy,
especially in a language like C where clear-cut line-terminator tokens are
hard to find (e.g. ';' also appears inside "for"), but the basic idea is
simple and works well. It's not original with me; I think the first place
where I saw it was in Dave Barnard's M.Sc. thesis, describing the SP/k
parser and scanner.
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,decvax,pyramid}!utzoo!henry
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.