Re: Where reductions in LALR?

Cleo Saulnier <cleos@nb.sympatico-dot-ca.remove>
7 Sep 2005 13:11:44 -0400

          From comp.compilers

Related articles
Where reductions in LALR? (tubylerczyk) (2005-09-03)
Re: Where reductions in LALR? (Sylvain Schmitz) (2005-09-04)
Re: Where reductions in LALR? (Karsten Nyblad) (2005-09-04)
Re: Where reductions in LALR? cleos@nb.sympatico-dot-ca.remove (Cleo Saulnier) (2005-09-07)
| List of all articles for this month |

From: Cleo Saulnier <cleos@nb.sympatico-dot-ca.remove>
Newsgroups: comp.compilers
Date: 7 Sep 2005 13:11:44 -0400
Organization: Aliant Internet
References: 05-09-016
Keywords: LALR
Posted-Date: 07 Sep 2005 13:11:44 EDT

tubylerczyk wrote:
> I have sample from "Compilers constructions" by Waite and Goos
> In chapter 7 is grammar LALR(1) but not SLR(1):
> (1) Z -> A
> (2) A -> aBb
> (3) A -> adc
> (4) A -> bBc
> (5) A -> bdd
> (6) B -> d
> In book table is:
> a b c d # A B
> -------------------------------
> 0 2 3 . . . 1
> 1 . . . . *
> 2 . . . 5 . 4
> 3 . . . 7 . 6
> 4 8 .
> 5 . +6 9 . .
> 6 . 10
> 7 . . +6 11 .
> 8 +2 +2 +2 +2 +2
> 9 +3 +3 +3 +3 +3
> 10 +4 +4 +4 +4 +4
> 11 +5 +5 +5 +5 +5
> number means Shift and Goto, +number means reduction
> Lookaheads for Z->.A in state 0 is # - end of stream,
> how will for A->,aBb ? In my opinion also # because
> First(epsilon+#) = #
> This way I find lookaaheads for items in state 8,9,10 and 11
> I find all are #, but in table in book reduction in this state are for
> any terminal symbol, not only end of stream
> What is mean? If lookaheads is # I must reduce for ALL symbols when
> state <> 1 ? Or maybe in state 8,9,10,11 lookaheads are set of all
> symbols? How is would created?
> Note that LALR tables (differs to SLR) often have state with the same
> reduction for all terminal symbols

I'm not exactly sure what you're asking. It seems like your confused
between lookaheads and terminals symbols that dedice shift/reduce. They
are somewhat similar, but the lookahead symbols you speak of are what
symbols can appear after the production. These will result in a
reduction of the production. So I'm guessing you're wondering why it's
reducing on all symbols, right?

I ran the grammar through my canonical LR(1) parser generator and it
only produced 10 states vs. your 11 which is strange because LR(1) is
supposed to create a larger amount of states. My parser generator
substitutes single token productions so that might explain it.
Regardless, you are correct that only the # (end of input) symbol is the
only lookahead. So it's impossible to reduce on other symbols.

So the question is "why is the table reducing on all symbols?"

If you read in your book, perhaps it mentions table compression. Notice
how most entries are empty. Especially if the EXTRA reduce table
entries are removed as they are not needed. So it'd be nice to only
store the entries which are valid, this compressing the table to 15
entries vs. 55. Even if you need to store the symbol with the action,
it's still 30 vs. 55 entries. A 50% reduction in space.

But explaining why all entries use a reduction takes a little
explaining. Usually, when you compress a table, you only store the
symbol and the action (negative for reduction and positive for shifts or
whatever scheme you wish). But it's also common to store a default
reduction action. What I mean by this is that after you have stored all
the shift actions (and all but one reduce) and all that's left is one
reduce action, then store it as a default reduce action no matter the
symbol. In your example, it seems they used a specific condition that
the ONLY possibility has to be a reduction. If any shift actions are
used, the state is not modified.

The reason you can reduce on symbols that aren't valid is because
reducing doesn't actually consume a terminal. That only happens during
a shift, and when it eventually does come to that time after reducing,
the error will be caught. ie. error detection is deferred until no
reductions are possible.

It seems that in your example, they decided to catch the error right
away if it's possible that even 1 token can be shifted. It's also
possible that they did it for speed. For states 8-11 in your example,
the next terminal doesn't need to be consulted. Just reduce immediately.

But I'm a little confused too why they didn't do the same thing to
states 5 and 7.

Here's a compressed version of your table. Perhaps this will explain
things better.

0: {a,2},{b,3}
1: {#,acc}
2: {d,5}
3: {d,7}
4: {b,8}
5: {b,+6},{c,9}
6: {c,10}
7: {c,+6},{d,11}
8: {all,+2}
9: {all,+3}
10: {all,+4}
11: {all,+5}

15 entries with 2 values = 30 values vs. 55 for the full table of

For this table, putting a reduce value for all entries isn't necessary,
but maybe they're talking about table compression and it would be a good
place to start just for the sake of an example.

Short answer to your question is that it's not necessary to reduce on
all terminals and no, it doesn't really come from anywhere unless you're
doing table compression.

If you only reduce on #, it will work fine.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.