13 Oct 2001 23:12:40 -0400

Related articles |
---|

LL(1) vs. LR(1) willverine@hotpop.com (Will) (2001-10-12) |

Re: LL(1) vs. LR(1) gzw@home.com (Geoff Wozniak) (2001-10-13) |

Re: LL(1) vs. LR(1) joachim_d@gmx.de (Joachim Durchholz) (2001-10-13) |

Re: LL(1) vs. LR(1) cfc@world.std.com (Chris F Clark) (2001-10-13) |

From: | Chris F Clark <cfc@world.std.com> |

Newsgroups: | comp.compilers |

Date: | 13 Oct 2001 23:12:40 -0400 |

Organization: | Compilers Central |

References: | 01-10-042 |

Keywords: | parse, theory, LL(1), LR(1) |

Cc: | compilers@iecc.com |

Posted-Date: | 13 Oct 2001 23:12:40 EDT |

Your message contains several small but important errors in

understanding that are probably impeding your understanding of the

difference between LL(1) and LR(1) parsers. I will try to correct

this misunderstandings without inserting new misunderstandings of my

own. (That is always a difficult task, because there is a lot about

parsing which is counter-intuitive, and thus sometimes my memory

substitutes a simpler but incorrect fact in place of the correct one.)

*> I was told that LR(1) parsers are much more powerful than the LL(1)*

*> parsers because they recognize more grammars.*

This is true. Every LL(1) grammar is an LR(1) grammar, although there

are LL(1) grammars that are not LALR(1). However, any LR(1) grammar

with left recursion is not an LL(1) grammar. Any LR(1) grammar that

is not left-factored is not an LL(1) grammar.

This is strictly at the grammar level. If you start allowing yourself

to modify the grammar, you are no longer at the grammar level, you are

at the language level. A language is simply a set of strings

(sentences). If you have a grammar, it describes some set of strings.

Thus, a grammar defines a language. The reverse is not true.

There may or may not be a grammar acceptable to your favorite parsing

method that describes that set of strings. Thus, if you have an LR(1)

that contains left-recursion and you can successfully remove that left

recursion, you may have an LL(1) grammar for than language. If so,

the language is LL(1). If you weren't successful, there still maybe

another grammar for the language that is LL(1) and you just don't know

what it is. This makes the statements between languages (and not

grammars) more complex. However, any LL(1) language (i.e. that there

exists an LL(1) grammar for) is also an LR(1) language.

Counter-intuitively, any LR(k) language is also an LR(1) language,

even though there are LR(k) grammars that are not LR(1).

*> I can see that LR(1) parsers are probably more powerful because to*

*> create the parsing table, a stack is used.*

Not true. LL(1) parsers use a stack also. The reason LR(1) parsers

are more powerful is because the perform the closure operation you

mention later and they defer the recognition of which production is

being chosen until the production is complete (and to be precise until

1 token of lookahead past the end of the production has been

inspected). LL(1) parsers must chose which production to apply upon

seeing the first token of the production.

*> For LL(1) you just look at the next input and do a trial and error*

*> approach (of course you can left-factor, and most certainly you have*

*> to remove left-recursions otherwise you'll go into an infinite loop*

*> - all this may result in less backtracking although it's not*

*> guaranteed).*

Not true. The process is NOT trial and error and there is NO

backtracking. The parser simply looks at the first token and choses

the one (and only) production that given the state of the parse has at

that token as its first token. (A grammar is LL(1) if and only if

that choice is unique, which is why left recursion and

non-left-factored grammars are not allowed). If you allow

backtracking (or trial and error), you are in a completely different

class of parsers (a class that is much more powerful, i.e. all LR(1)

grammars and many non-LR(k) grammars can be solved by backtracking

parsers).

*> Another curiosity is the efficiency of LL(1) and LR(1) algorithms.*

Both parsing algorithms run in linear time (i.e. for n tokens of

input, k*n units of time are required to perform the parse).

*> Due to backtracking, LL(1) may end up being exhaustive in that it*

*> may backtrack to the point that it reaches the last production in*

*> which it will try to apply. This exhaustive search means LL(1) is*

*> bounded by an exponential (am I correct?).*

Although backtracking parsers are not LL(1) parsers, backtracking

parsers do sometimes exhibit exponential run-time performance.

*> Because of this fan-out, it seems LR(1) algorithm for creating*

*> parsing table (btw: I'm talking about the algorithm presented in the*

*> new dragon book) is also bounded exponentially (am I correct?).*

I have not seen a worst-case analysis of the space requirements for an

LR parsing table. However, the following argument should hold. Each

state in an LR(1) parsing table consists of a set of items which

correspond to the positions the parser can be in each of the

productions (the so called dotted rules) (*note below). Each rule has

a fixed, finite number of positions. Since each parser state is a

collection of a unique set (note set, not bag, no position is ever

repeated in a state) of those positions, the set of all parser states

is simply a subset of the power set of the positions. Now, this set

is potentially exponential in size in terms of the number of positions

(i.e. the size of the grammar), it is a fixed constant amount for any

grammar and does not vary with the size of input string being

recognized.

In practice, most human written grammars have a number of states that

is less than the square of the number of productions (and are nearly

linear in the number of positions).

*note: For LR(1) (but not LR(0) or LALR(1)), the items also include

the lookahead set, which is a finite set of tokens that might follow

the production. This is again a powerset subset, but again fixed

given the grammar.

Hope this helps,

-Chris

*****************************************************************************

Chris Clark Internet : compres@world.std.com

Compiler Resources, Inc. Web Site : http://world.std.com/~compres

3 Proctor Street voice : (508) 435-5016

Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.