Re: LL vs LR, strengths and weaknesses (Jos Horsmeier)
Sat, 16 May 1992 13:29:47 GMT

          From comp.compilers

Related articles
[2 earlier articles]
Re: LL vs LR, strengths and weaknesses (1992-05-13)
Re: LL vs LR, strengths and weaknesses (1992-05-13)
Re: LL vs LR, strengths and weaknesses (1992-05-14)
Re: LL vs LR, strengths and weaknesses (1992-05-14)
Re: LL vs LR, strengths and weaknesses (1992-05-15)
Re: LL vs LR, strengths and weaknesses (1992-05-19)
Re: LL vs LR, strengths and weaknesses (1992-05-16)
| List of all articles for this month |

Newsgroups: comp.compilers
From: (Jos Horsmeier)
Keywords: LL(1), parse
Organization: AND Software BV Rotterdam
References: 92-05-059 92-05-092
Date: Sat, 16 May 1992 13:29:47 GMT (Gene Ressler) writes:
|I've been surprised that no one has mentioned the inherent problem that LL
|has with many precedence levels, i.e. to parse ((x*x)+(y*y)) in C
|requires perhaps 70 recursive descent calls (or stack pushes in a
|table-driven LL parser) with the stack up to 30 deep, whereas an LALR
|parser does 13 stack ops and never gets deeper than 6. (All these subject
|to the correctness of my hand parses, but I think they're in the
|Comments on the practical effects of this observation, especially from the
|LL advocates? (For starters, I believe this is why at least some RD parsers
|have embedded operator precedence parsers to do arithmetic.)


I don't want to advocate top-down parsing at all, and especially not LL(1)
parsing, but I think you're exaggerating a bit. Consider the following
(boring) two grammars:

E -> E + T | T
T -> T * F | F
F -> ( E ) | ident

E -> T E'
E'-> + T E' |
T -> F T'
T'-> * F T' |
F -> ( E ) | ident

both grammars can easily be identified as _the_ grammars (LR(1) vs LL(1))
to do the job.

When LL(1) parsing, the deepest stacklevel is reached here:

      ^______________ current token

The stack contains the following suffix of the current sentential form:

*, F, T', E', ), T', E', ), T', E'

This is just 10 elements deep, not 30. And it didn't take 70 calls or
table lookup actions to get here. I agree with you that the stack contains
more elements than a LR(1)'s stack would contain at this stage. The
latter contains just 3 elements at most here:

    ^______________ current token

[5 i], [5 (], [0 (]

BTW if you implement a table driven LL(1) parser, don't push the entire
rhs of the current rule on the stack. That is naive. Simply push a
reference of the current production rule and the position therein on the
stack. In this example, this reduces the logical stack depth to 8, but the
actual amount of bytes needed is far less ...

But there's also a positive thing about this stack, it is very easy to
implement inherited attributes on the fly. The attribute stack simply
mimics the behavior of the parse stack. Every token of the sentential form
generated so far, owns a slot on the attribute stack. IMHO, implementing
inherited attributes in a LR(1) parser is a hack. When I come to think of
it: I've implemented a lot of LL(1) parsers and I simply used a stack of
no more than, say, 200 elements. I never noticed any stack overflow. And
who cares about a stack with just 200 elements?

Concluding: I really _hate_ writing an LL(1) grammar, but when the job is
done, sticking in the semantic actions (or `output tokens' if you prefer)
is very easy. A lot easier compared to LR(1) parsers when these inherited
attributes come in. I think it's just a matter of `give a bit, take a
bit'. LL(1) grammars are a mess to work with. LR(1) grammars are a lot
nicer. LL(1) parsers can do some more for you than LR(1) parsers. BTW did
you ever have a look at the table sizes for both types of generated

Let me stress this again: I`m not advocating LL(1) grammars and parsers at
all. I simply think that they can be nice tools when appropriate. The same
reasoning applies to LR(1) parsers and grammars too ...

kind regards,

Jos aka

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.