|Left Recursive Examples for use with LL Parser Generator firstname.lastname@example.org (Alexander Morou) (2015-07-19)|
|From:||Alexander Morou <email@example.com>|
|Date:||Sun, 19 Jul 2015 04:20:14 -0500|
|Keywords:||parse, LL(1), question|
|Posted-Date:||19 Jul 2015 17:16:25 EDT|
After further evaluating the very early release of Oilexer, I've
discovered a few things:
1. Handling Left Recursion in a LL(*) context is possible.
2. Handling Left Recursive Predictions is the hard part
3. Chicken Before Egg type issues are annoying, but there seems to
be a finite set of variations to handling them (I think :)
On some levels I found early assumptions about the result decisions
in predictions were incorrect, but I'm slowly overcoming those.
*If I may ask the assistance of others here for some examples of left
recursive grammars that would require a prediction, it would be most
appreciated. I could use these as a means to investigate the surrounding
In situations where it is easily determined that a rule is 'x', where 'x'
is left recursive, it appears that the left recursive mechanics I have in
place handle this case without issue; however, in cases where the decision
requires a prediction, the system falls apart, too quickly it tries to
say which rule it is when left recursion requires much longer look-ahead.
Here's an example that is solved easily:
A ::= B C 'd' | E C 'f' ;
B ::= 'x' 'y' ;
E ::= 'x' 'y' ;
C ::= 'c' | C 'c' ;
In this example (sourced from
E and B are identical productions, and you can't solve for which it is
until you move past the 'x' and 'y' terminals, and past the C production.
The prediction for A would look for x, then y, then it is deterministically
in the left-recursive production C, parse C and put it on the stack, if you
see d then the prediction says parse B within A, if you see f parse E. Both
variations will skip parsing C because it's already on the symbol stack from
the prediction on A.
Here's an example that isn't as easy:
A ::=> B | C | D;
B ::= A 'b';
D ::= A 'd';
C ::= 'c';
T ::= A 'a' | B | C | D;
In the above example, everything goes great until you hit T, which requires
knowledge of which variation is valid to parse correctly. Right now it
parses C (since A, B, C and D of T reduce to C after 'c') then it checks
for 'a', 'b', or 'd', and only does so once. I can easily look at the state
machine and introduce a repetition in the prediction to repeat the check
until we hit either EOF, or 'a'. I just need more realistic examples to
ensure the logic I put in place is sound.
Return to the
Search the comp.compilers archives again.