|Collecting look ahead set email@example.com (Craig) (1989-02-21)|
|Re: Collecting look ahead set firstname.lastname@example.org (1989-02-23)|
|Re: Collecting look ahead set email@example.com (1989-03-07)|
|Re: Collecting look ahead set firstname.lastname@example.org (1989-03-09)|
|Re: Collecting look ahead set email@example.com (1989-03-10)|
|Date:||Tue, 7 Mar 89 18:58:20 EST|
|Posted-From:||The MITRE Corp., Bedford, MA|
|Organization:||The MITRE Corporation, Bedford, Mass.|
In article <firstname.lastname@example.org> email@example.com writes (lots deleted):
>So, what to do? How's this: At the start, and every time any action
>other than a default reduction or error action gets done, we empty a
>set called "yy_default_reduction_states", er, excuse me, "yydfrs".
>Every time a default reduction or error-action takes place, we add the
>state on top of the stack to the set.
>After the syntax error is detected, we go through the set of
>states, making a set of all tokens which could have caused
>a shift *or* a reduce in any of those states, but didn't. Those are
>the ones which would be legal next.
>The trick would be to encode the yydfrs set efficiently enough so as
>not to slow the parser significantly.
>Anybody care to volunteer a go at it?
There is an easier way to do this and I and several others have
done it. (I was working on LALR on Multics, I believe several people
working under Gerry Fisher when he was at NYU used the same technique.
See the SIGPLAN 82 Proceedings for their paper and others.)
The technique is to delay all reductions until a legal shift is
found. Since look-aheads and reductions do not change the contents of
the stack, just the position on the stack, there is no need to store
anything. (But you do need two different stack pointers, and to keep
the state immediately following the last shift.) When the parser
decides to do a shift, it first does the necessary reductions.
A second stack of reductions to perform in a given state (you
never remember a shift) can help--and is just one store per reduction
since you use the same stack index. However, this is a case where C
style programming hurts, because this optimization is VERY non-obvious
in C, although the correct code can be written. (Compute the
difference between pointers to the tops of the two stacks, then
address the second by pointer plus offset. If you write the code
correctly your compiler should notice that the offset is a constant
and fold it. The stacks must, of course have identically sized
elements, and should be adjacent. The easiest way to insure this is
to make them two sections of the same array.
Of course, going to all this effort for yacc is probably not
worth it unless a lot of other bugs are fixed. A better idea would be
to take a true LALR grammar generator (such as Wetherell) and host it
on Unix, then add error correction.
Robert I. Eachus
[From firstname.lastname@example.org (Eachus)]
Return to the
Search the comp.compilers archives again.