Sat, 20 Dec 2008 22:19:23 -0500

Related articles |
---|

look for LR(k) grammar txchen@gmail.com (Tom) (2008-12-15) |

Re: look for LR(k) grammar felipeangriman@gmail.com (Felipe Angriman) (2008-12-15) |

Re: look for LR(k) grammar pjj@cs.man.ac.uk (Pete Jinks) (2008-12-17) |

Re: look for LR(k) grammar txchen@gmail.com (Tom) (2008-12-18) |

Re: look for LR(k) grammar cfc@shell01.TheWorld.com (Chris F Clark) (2008-12-20) |

Re: look for LR(k) grammar txchen@gmail.com (Tom) (2009-01-15) |

From: | Chris F Clark <cfc@shell01.TheWorld.com> |

Newsgroups: | comp.compilers |

Date: | Sat, 20 Dec 2008 22:19:23 -0500 |

Organization: | The World Public Access UNIX, Brookline, MA |

References: | 08-12-086 08-12-087 08-12-099 |

Keywords: | LR(1) |

Posted-Date: | 21 Dec 2008 10:43:30 EST |

Tom <txchen@gmail.com> writes:

*> Yes I think this is a LR(2) grammar.*

*>*

*> It is similar to a pattern of LR(k) grammars that I know:*

*>*

*> S : a A D a ;*

*> S : b A D b ;*

*> S : a B D b ;*

*> S : b B D a ;*

*>*

*> A : a ;*

*> B : a ;*

*> D : c^k ;*

*>*

*> Here c^k means the concatenation of k letters 'c'.*

*> This is a LR(k+1) grammar, depending on the value of k.*

*>*

*> I'm looking for LR(k) grammars that are a little more complicated in*

*> the following way:*

*>*

*> There is an approach to solve the reduce/reduce conflict in a LR(1)*

*> parsing machine by splitting the states involved, for example,*

*> discussed in D. Spector (81,88) or D. Pager (77, lane-tracing). For*

*> both grammars above, there are only 2 states involved: the state a*

*> that contains the conflict, and one of its immediate parent state b*

*> that leads to the conflict in a. I'm hoping to get LR(k) grammars that*

*> contains more levels of states involved, not just 2 levels as*

*> discussed here.*

*>> On Mon, Dec 15, 2008 at 8:17 AM, Tom <txc...@gmail.com> wrote:*

*>> > I'm looking for LR(k) grammars where k > 1.*

*>>*

*>> S : B a b*

*>> S : C a c*

*>> B : 1*

*>> C : 1*

*>>*

*>> I think that will do for k = 2. (I haven't tested it)*

Ah, state splitting.

First, as far as I know, state splitting is no help in increasing

k. State splitting is about restoring *left* context that gets lost in

the SLR/LALR algorithms. If you do canonical LR parsing, you never

merge the states, and thus do not need to split them. If you are

doing some kind of minimal state LR processing, (i.e. where one

computes the LR(0) tables first, as one does in both SLR and LALR

parsing), then you have to deal with state splitting.

So, borrowing from above (and simplifying): the following grammar

involves state splitting:

S : a A a ;

S : b A b ;

S : a B b ;

S : b B a ;

A : a a ;

B : a a ;

Let's look at the LR(0) tables, with the LALR lookaheads:

State 0:

S : .a A a ;

S : .b A b ;

S : .a B b ;

S : .b B a ;

a -> 1

b -> 2

State 1:

S : a .A a ;

S : a .B b ;

A : .a a ;

B : .a a ;

a -> 3 // state 3 needs to split

A -> 5

B -> 6

State 2:

S : b .A b ;

S : b .B a ;

A : .a a ;

B : .a a ;

a -> 3 // state 3 needs to split

A -> 7

B -> 8

State 3:

A : a .a ;

B : a .a ;

a -> 4

State 4: // reduce-reduce conflict

A : a a .; a (from 5) | b (from 6)

B : a a .; a (from 8) | b (from 7)

State 5:

S : a A .a ;

State 6:

S : b A .b ;

State 7:

S : a B .b ;

State 8:

S : b B .a ;

Now, if you split state 3 & 4into two new states 3x, 3y, 4x, and 4y,

let's see the machine: (I'll just show the changed states):

State 1:

S : a .A a ;

S : a .B b ;

A : .a a ;

B : .a a ;

a -> 3x

A -> 5

B -> 6

State 2:

S : b .A b ;

S : b .B a ;

A : .a a ;

B : .a a ;

a -> 3y

A -> 7

B -> 8

State 3x:

A : a .a ;

B : a .a ;

a -> 4x

State 4x: // reduce-reduce conflict

A : a a .; a (from 5)

B : a a .; b (from 7)

State 3y:

A : a .a ;

B : a .a ;

a -> 4y

State 4y:

A : a a .; b (from 6)

B : a a .; a (from 8)

That is the essence of state splitting. We had different LHS side

contexts for the two uses of state 3, whether we were coming from

state 1 or state 2. This context is lost in the LR(0) machine. Thus,

when we got to state 4 and need to know what to reduce, we couldn't

distunguish the two cases and the lookaheads all get merged together

and we get the conflict.

Now, it sound like you want to understand how to get more rules

involved. (You wrote more states, but I think you mean more rules, as

they give you more states on the tail end you have to search through.)

One way to get more rules involved, is to nest the rules for A anb B

in other rules.

For example, work through this grammar for the same language:

S : a A a ;

S : b A b ;

S : a B b ;

S : b B a ;

A : a C ;

B : a D ;

C : a ;

D : a ;

Here is an LR(2) grammar for the same language:

S : a A a ;

S : b A b ;

S : a B b ;

S : b B a ;

A : C a ;

B : D a ;

C : a ;

D : a ;

Here is an LR(3) grammar:

S : a A a a ;

S : b A a b ;

S : a B a b ;

S : b B a a ;

A : C a ;

B : D a ;

C : a ;

D : a ;

Here is a grammar that isn't LR(k) for any k, for a language that

could have an LR(1) grammar for it--in fact the language is regular.

The language is (a|b)a+(a|b).

S : a A a ;

S : b A b ;

S : a B b ;

S : b B a ;

A : A a ;

B : B a ;

A : a ;

B : a ;

Note, this grammar is particularly interesting to me as it is part of

what we tried to solve in Yacc++, when we did LR(infinite) (or what I

sometimes call LR(closed) parsing. If you look carefully, you will

see that you can parse it deterministically, However, you have to push

an indefinite amount of rules on the "stack". Still, when you get

done, and you see the EOF marker, you can then deterministically

resolve the rules you had pushed. I believe that every unambiguous

CFG has such a deterministic parse, and I believe the computing the

"closure" is the method to find it (if it exists), which is why I

sometimes refer to the method as LR(closed). Unfortunately, due to

the halting probloem, there are grammars that attempting to compute

the closure on will never terminate.

In any case, I hope the supplied grammars at least partially answer

your original question (and perhaps shed some insight into what is

going on).

Hope this helps,

-Chris

******************************************************************************

Chris Clark Internet: christopher.f.clark@compiler-resources.com

Compiler Resources, Inc. or: compres@world.std.com

23 Bailey Rd Web Site: http://world.std.com/~compres

Berlin, MA 01503 voice: (508) 435-5016

USA fax: (978) 838-0263 (24 hours)

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.