Re: LL(1) vs. LR(1)

Chris F Clark <cfc@world.std.com>
13 Oct 2001 23:12:40 -0400

From comp.compilers

Related articles
LL(1) vs. LR(1) willverine@hotpop.com (Will) (2001-10-12)
Re: LL(1) vs. LR(1) gzw@home.com (Geoff Wozniak) (2001-10-13)
Re: LL(1) vs. LR(1) joachim_d@gmx.de (Joachim Durchholz) (2001-10-13)
*Re: LL(1) vs. LR(1) cfc@world.std.com (Chris F Clark)* (2001-10-13)**

| List of all articles for this month |

From:	Chris F Clark <cfc@world.std.com>
Newsgroups:	comp.compilers
Date:	13 Oct 2001 23:12:40 -0400
Organization:	Compilers Central
References:	01-10-042
Keywords:	parse, theory, LL(1), LR(1)
Cc:	compilers@iecc.com
Posted-Date:	13 Oct 2001 23:12:40 EDT

Your message contains several small but important errors in
understanding that are probably impeding your understanding of the
difference between LL(1) and LR(1) parsers. I will try to correct
this misunderstandings without inserting new misunderstandings of my
own. (That is always a difficult task, because there is a lot about
parsing which is counter-intuitive, and thus sometimes my memory
substitutes a simpler but incorrect fact in place of the correct one.)

> I was told that LR(1) parsers are much more powerful than the LL(1)
> parsers because they recognize more grammars.

This is true. Every LL(1) grammar is an LR(1) grammar, although there
are LL(1) grammars that are not LALR(1). However, any LR(1) grammar
with left recursion is not an LL(1) grammar. Any LR(1) grammar that
is not left-factored is not an LL(1) grammar.

This is strictly at the grammar level. If you start allowing yourself
to modify the grammar, you are no longer at the grammar level, you are
at the language level. A language is simply a set of strings
(sentences). If you have a grammar, it describes some set of strings.
Thus, a grammar defines a language. The reverse is not true.

There may or may not be a grammar acceptable to your favorite parsing
method that describes that set of strings. Thus, if you have an LR(1)
that contains left-recursion and you can successfully remove that left
recursion, you may have an LL(1) grammar for than language. If so,
the language is LL(1). If you weren't successful, there still maybe
another grammar for the language that is LL(1) and you just don't know
what it is. This makes the statements between languages (and not
grammars) more complex. However, any LL(1) language (i.e. that there
exists an LL(1) grammar for) is also an LR(1) language.
Counter-intuitively, any LR(k) language is also an LR(1) language,
even though there are LR(k) grammars that are not LR(1).

> I can see that LR(1) parsers are probably more powerful because to
> create the parsing table, a stack is used.

Not true. LL(1) parsers use a stack also. The reason LR(1) parsers
are more powerful is because the perform the closure operation you
mention later and they defer the recognition of which production is
being chosen until the production is complete (and to be precise until
1 token of lookahead past the end of the production has been
inspected). LL(1) parsers must chose which production to apply upon
seeing the first token of the production.

> For LL(1) you just look at the next input and do a trial and error
> approach (of course you can left-factor, and most certainly you have
> to remove left-recursions otherwise you'll go into an infinite loop
> - all this may result in less backtracking although it's not
> guaranteed).

Not true. The process is NOT trial and error and there is NO
backtracking. The parser simply looks at the first token and choses
the one (and only) production that given the state of the parse has at
that token as its first token. (A grammar is LL(1) if and only if
that choice is unique, which is why left recursion and
non-left-factored grammars are not allowed). If you allow
backtracking (or trial and error), you are in a completely different
class of parsers (a class that is much more powerful, i.e. all LR(1)
grammars and many non-LR(k) grammars can be solved by backtracking
parsers).

> Another curiosity is the efficiency of LL(1) and LR(1) algorithms.

Both parsing algorithms run in linear time (i.e. for n tokens of
input, k*n units of time are required to perform the parse).

> Due to backtracking, LL(1) may end up being exhaustive in that it
> may backtrack to the point that it reaches the last production in
> which it will try to apply. This exhaustive search means LL(1) is
> bounded by an exponential (am I correct?).

Although backtracking parsers are not LL(1) parsers, backtracking
parsers do sometimes exhibit exponential run-time performance.

> Because of this fan-out, it seems LR(1) algorithm for creating
> parsing table (btw: I'm talking about the algorithm presented in the
> new dragon book) is also bounded exponentially (am I correct?).

I have not seen a worst-case analysis of the space requirements for an
LR parsing table. However, the following argument should hold. Each
state in an LR(1) parsing table consists of a set of items which
correspond to the positions the parser can be in each of the
productions (the so called dotted rules) (*note below). Each rule has
a fixed, finite number of positions. Since each parser state is a
collection of a unique set (note set, not bag, no position is ever
repeated in a state) of those positions, the set of all parser states
is simply a subset of the power set of the positions. Now, this set
is potentially exponential in size in terms of the number of positions
(i.e. the size of the grammar), it is a fixed constant amount for any
grammar and does not vary with the size of input string being
recognized.

In practice, most human written grammars have a number of states that
is less than the square of the number of productions (and are nearly
linear in the number of positions).

*note: For LR(1) (but not LR(0) or LALR(1)), the items also include
  the lookahead set, which is a finite set of tokens that might follow
  the production. This is again a powerset subset, but again fixed
  given the grammar.

Hope this helps,
-Chris

*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
3 Proctor Street voice : (508) 435-5016
Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: LL(1) vs. LR(1)

Chris F Clark <cfc@world.std.com>13 Oct 2001 23:12:40 -0400

Chris F Clark <cfc@world.std.com>
13 Oct 2001 23:12:40 -0400