8 Nov 2003 01:35:25 -0500

Related articles |
---|

Grammar LR(1) but not LALR(1) jm@bourguet.org (Jean-Marc Bourguet) (2003-10-31) |

Re: Grammar LR(1) but not LALR(1) oliver@zeigermann.de (Oliver Zeigermann) (2003-11-02) |

Re: Grammar LR(1) but not LALR(1) cfc@shell01.TheWorld.com (Chris F Clark) (2003-11-02) |

Re: Grammar LR(1) but not LALR(1) stefan@infoiasi.ro (ANDREI Stefan) (2003-11-08) |

Re: Grammar LR(1) but not LALR(1) jm@bourguet.org (Jean-Marc Bourguet) (2003-11-08) |

Re: Grammar LR(1) but not LALR(1) RLake@oxfam.org.uk (2003-11-11) |

From: | ANDREI Stefan <stefan@infoiasi.ro> |

Newsgroups: | comp.compilers |

Date: | 8 Nov 2003 01:35:25 -0500 |

Organization: | Compilers Central |

Keywords: | LALR, parse, question |

Posted-Date: | 08 Nov 2003 01:35:25 EST |

*> It is well known that there are grammars which are LR(1) but not*

*> LALR(1). For exemple:*

*> S -> aEa | bEb | aFb | bFa*

*> E -> e*

*> F -> e*

*> It is possible to transform this grammar to make it LALR(1) by*

*> duplicating productions. For exemple:*

*> S -> a E a | b E' a | a F b | b F' a*

*> E -> e*

*> E' -> e*

*> F -> e*

*> F' -> e*

*> As a matter of fact, only one duplication is needed to make the*

*> grammar LALR(1).*

*> I'm convinced that such a transformation (contructing an LALR(1)*

*> grammar by duplicating productions) is possible for every LR(1)*

*> grammars, such duplication preventing the merge of the states LR(1)*

*> automaton.*

*> A collegue is not convinced and try to come up -- without success*

*> until now -- with a language which would have a LR(1) grammar but no*

*> LALR(1) one.*

*> Could someone point me to a counter example for my claim or a proof of*

*> its correctness?*

Dear Jean-Marc,

Your question is interesting. I have a good news and a bad news. The

good news is that your construction is correct! This is due to the

fact that this "non-terminal duplication" eliminates the reduce

conflicts in the new CFG grammar since the kernel of the items which

previously were generated such conflicts are no longer in the prefix

automaton. In other words, the states which previously merged together

(and generate the reduce conflict in the LALR(1) automaton) cannot be

merged after the "non-terminal duplication" since they have different

kernels!

Now the bad news. The LALR(1) grammars were invented in 1969 in an

effort to do practical compilation since the prefix automaton for a

LR(1) CFG may be quite huge! Hence many today compiler-compilers are

using LALR(1) grammars instead of LR(1). That is, the LALR(1) prefix

automaton has the same number of states like LR(0) prefix automaton,

and almost the same power as LR(1) grammars, too. But the

"non-terminal duplication" will not have this advantage of reducing

the size of the prefix automaton (this will the same as the initial

LR(1) automaton). A more even worst news is that you need a bigger

GoTo table to specify the reductions for the new generated

non-terminal symbols. For example, in your proposed CFG (by the way,

there is a typo: S -> b E' a should be S -> b E' b) the initial LR(1)

grammar generates a prefix automaton with 11 states. By merging the

states {(E->e.,{a}), (F->e.,{b})} and {(E->e.,{b}), (F->e.,{a})} into

{(E->e.,{a,b}), (F->e.,{a,b})}, the LALR(1) prefix automaton has only

10 states, but it generates a reduce-reduce conflict, so the CFG

grammar is not LALR(1). As we mention earlier, the "non-terminal

duplication" technique transforms the old non-LALR(1) CFG into an

equivalent LALR(1) grammar, but the prefix automaton will have 11

states and two (although one suffices) new non- variables. So, in

general, I am afraid to tell you that there is no practical advantage

of such technique because: 1. the prefix automaton of the new LALR(1)

grammar will have the same number of states as the initial LR(1)

automaton 2. the new GoTo table will be bigger since it has to store

the new non-variable symbols 3. it is known that LR(1) parsing is

faster than LALR(1) parsing. Here I mean that in case of

non-acceptance of the input word, the LR(1) parser may detect this

with no reductions, whereas the LALR(1) parser can do many reductions

until it gets the same answer!

Here are some useful references:

[DR69] DeRemer, F.L.: Practical translators for LR(k) languages. Ph.D.

Thesis, MIT, Cambridge, 1969

[DRP82] DeRemer, F.L., Pennello, T.: Efficient Computation of LALR(1)

Look-Ahead Set. TOPLAS, vol. 4, no. 4, october 1982

Let me conclude by thanking for the question. If you want to know

more about these things (proofs of 1, 2, 3) contact me directly or via

compilers@iecc.com

Best wishes,

Stefan

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.