Re: Regular LALR grammars

markh@csd4.csd.uwm.edu (Mark)
Wed, 26 May 1993 23:01:23 GMT

From comp.compilers

Related articles
Regular LALR grammars simonh@swidev.demon.co.uk (1993-05-25)
*Re: Regular LALR grammars markh@csd4.csd.uwm.edu* (1993-05-26)**

| List of all articles for this month |

Newsgroups:	comp.compilers
From:	markh@csd4.csd.uwm.edu (Mark)
Keywords:	LALR, parse
Organization:	Computing Services Division, University of Wisconsin - Milwaukee
References:	93-05-125
Date:	Wed, 26 May 1993 23:01:23 GMT

>How are regular parser grammars handled by the parser generator?

The right hand side of an item is now a DFA.

Take the grammar (based on one presented on a prior post about the Tomita
parser):

                                      S: NP VP.
                                      NP: [D] N PP*.
                                      VP: V NP PP*.
                                      PP: P NP.

Rewriting these in equational DFA form, you get:

                                            Q0 = S Q1
                                            Q1 = 1.
                                      S: Q2 = NP Q3.
                                            Q3 = VP Q4.
                                            Q4 = 1.
                                    NP: Q5 = D Q6 | N Q7.
                                            Q6 = N Q7.
                                            Q7 = PP Q7 | 1.
                                    VP: Q8 = V Q9.
                                            Q9 = NP Q10.
                                            Q10 = PP Q10 | 1.
                                    PP: Q11 = P Q12.
                                            Q12 = NP Q13.
                                            Q13 = 1.

State 0 starts out with the item: [accept: Q0 ]. On closure, you get:

                                        [accept: Q0 ].
                                        [S: Q2].
                                        [NP: Q5].

If any of the states (Q0, Q2, Q5) are final states in their respective
DFAs then the corresponding reduction is listed. For each transition on
each of the states (Q0, Q2, and Q5) a corresponding contribution to a
shift action is recorded:
                                    shift symbol new state
                                              S 1: [accept: Q1].
                                            NP 2: [S: Q3].
                                              D 3: [NP: Q6].
                                              N 4: [NP: Q7].

State 1 has only one item with Q1 listed. Since Q1 is a final state in
the DFA corresponding to "accept", then only the reduction: [accept -> S]
is listed in state 1.

The final list of closures looks like this:
0: [accept: Q0] [S: Q2] [NP: Q5]
1: [accept: Q1]
2: [S: Q3] [VP: Q8]
3: [NP: Q6]
4: [NP: Q7] [PP: Q11]
5: [S: Q4]
6: [VP: Q9] [NP: Q5]
7: [PP: Q12] [NP: Q5]
8: [VP: Q10] [PP: Q11]
9: [PP: Q13]

and the final state table like this:

0: S => 1, NP => 2, D => 3, N => 4
1: <accept>
2: VP => 5, V => 6
3: N => 4
4: PP => 4, P => 7, <reduce NP: D N PP*>
5: <reduce S: NP VP>
6: NP => 8, D => 3, N => 4
7: NP => 9, D => 3, N => 4
8: PP => 8, P => 7, <reduce VP: V NP PP*>
9: <reduce PP: P NP>

with shift-reduce conflicts on states 4 and 8 on P.

Two things are true here that are not true of the BNF-based LR(k) parser:

                          (P0) The tables tend to be more compact
                          (P1) A state no longer necessarily has a unique predecessor
symbol (e.g. both N and PP have shifts/gotos into state 4).

A reduce action is carried out by following the corresponding DFA
backwards until you get back to the start state. For example, for the
reduction <VP = V NP PP*>, you first reverse the DFA:

                                    VP: Q8 = V Q9.
                                            Q9 = NP Q10.
                                            Q10 = PP Q10 | 1.

by reversing all the arrows and reversing final and start states to get:

  <reduce VP: V NP PP*>: Q10 = PP Q10 | NP Q9.
                                                Q9 = V Q8.
                                                Q8 = 1.

Since there may be more than one final state in the DFA, the reverse
finite state automaton won't necessarily be deterministic and might need
to be reduced to a DFA after the reversal. This automaton will reduce any
sequence of symobols on the stack of the form (reading backwards): PP* NP
V into a VP node.

Property P1 has the consequences that:
                        (a) a state in the reversed DFA
                                may have more than one transition
                and (b) it's therefore necessary to actually look at the symbol on
                                the stack when making the reduction.
--

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Regular LALR grammars

markh@csd4.csd.uwm.edu (Mark)Wed, 26 May 1993 23:01:23 GMT

markh@csd4.csd.uwm.edu (Mark)
Wed, 26 May 1993 23:01:23 GMT