23 Apr 2006 10:01:04 -0400

Related articles |
---|

understanding the intuition behind LL(k) parsers and LR(k) parsers Mark.Felzer@gmail.com (Mark F.) (2006-04-21) |

Re: understanding the intuition behind LL(k) parsers and LR(k) parsers tom@infoether.com (Tom Copeland) (2006-04-22) |

Re: understanding the intuition behind LL(k) parsers and LR(k) parsers cfc@shell01.TheWorld.com (Chris F Clark) (2006-04-23) |

Re: understanding the intuition behind LL(k) parsers and LR(k) parsers max@gustavus.edu (Max Hailperin) (2006-04-23) |

Re: understanding the intuition behind LL(k) parsers and LR(k) parsers DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2006-04-23) |

Re: understanding the intuition behind LL(k) parsers and LR(k) parsers pbmann@gmail.com (2006-04-28) |

Re: understanding the intuition behind LL(k) parsers and LR(k) parsers pbmann@gmail.com (2006-05-01) |

From: | Max Hailperin <max@gustavus.edu> |

Newsgroups: | comp.compilers |

Date: | 23 Apr 2006 10:01:04 -0400 |

Organization: | Compilers Central |

References: | 06-04-124 |

Keywords: | parse, comment |

"Mark F." <Mark.Felzer@gmail.com> writes:

*> ...*

*> Maybe you can help me visualize the basic idea behind the two*

*> approaches to AST tree generation....*

Conceptually, both an LL parser and an LR parser maintain a stack of

grammar symbols, that is, a stack where each element is either a

nonterminal or a terminal. However, the meaning of that stack is

completely opposite in the two styles of parser, and so is the

combination of two operations used to update the stack.

In an LR parser, if you read the stack from bottom to top, you get a

summary of the input that has already been read. In an LL parser, if

you read the stack from top to bottom, you get a prediction of the

input that remains to be read. In both cases, when I say the stack is

a "summary," I mean that it may be at a relatively abstract level

because it can contain nonterminals rather than only terminals. For

example, if an LR stack contains

+

Expr

(

it means that the input that was already read consists of a left

parenthesis, some sort of Expr, and then a plus sign. This is an

abstract summary, because it doesn't specify which particular terminal

symbols were read that constituted the Expr. Similarly, if an LL

stack contains

+

Expr

)

it means that the remaining input, which has not yet been read, is

expected to contain a plus sign, some sort of Expr, and then a right

parenthesis. Again, there is no specificity regarding what form the

Expr will take.

This is just the conceptual view; LR stacks in paricular actually

contain numeric "states", which are critical for understanding how the

parser efficiently decides on the next action to take, but which are

not critical for understanding the fundamental contrast between LL and

LR. Instead, a better next step would be to move from considering how

to read the stack to considering what the two actions available to the

parser are. For appreciating the LL/LR contrast, you can just assume

the appropriate action is magically chosen.

In an LR parser, the two possible actions are:

(1) Read in one more terminal symbol from the input; to maintain the

stack as a summary of the input that has been read, the terminal

symbol needs to be pushed onto the stack. This is called

"shifting" a symbol onto the stack.

(2) Reduce the summary that is on the stack to a more abstract one by

replacing the right-hand side of a production by the left-hand

side. For example, if there is a production Factor -> ( Expr )

and the stack is

)

Expr

(

(

then a reduction would summarize the "( Expr )" part more

concisely as being a Factor; this entails popping the top three

symbols off the stack and pushing the Factor on, yielding

Factor

(

which is still a summary of the same already-consumed input, just

a more abstract one.

Because of these two actions, the LR parser is also called a

shift/reduce parser.

In an LL parser, on the other hand, the actions are the opposite,

reflecting the oppositeness of what its stack means:

(1) If a terminal symbol is read in from the input, then the same

terminal symbol must be popped off from the top of the stack,

because the stack is a prediction of the remaining input. The

action here consists of confirming one symbol worth of that

prediction.

(2) If instead the prediction is changed without reading input, this

happens by expanding out the prediction to a less abstract one

by replacing the left-hand side of a production by the

right-hand side. For example, considering again the production

Factor -> ( Expr ), an action could change the stack from just

Factor

)

to

(

Expr

)

)

so as to predict not just some sort of Factor followed by a

right parenthesis, but rather more specifically a left

parenthesis, Expr, and right parenthesis followed by a (further)

right parenthesis.

I like to think of an LL parser as a "confirm/expand" parser by

analogy with the "shift/reduce", though this nomenclature doesn't seem

to be standard.

I hope seeing the duality will help you understand the relationship

between these two parsing techniques.

-Max Hailperin

Professor of Computer Science

Chair, Mathematics and Computer Science Department

Gustavus Adolphus College

800 W. College Ave.

St. Peter, MN 56082

USA

http://www.gustavus.edu/+max/

[Nice explanation. Thanks! -John]

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.