Wed, 26 May 1993 23:01:23 GMT

Related articles |
---|

Regular LALR grammars simonh@swidev.demon.co.uk (1993-05-25) |

Re: Regular LALR grammars markh@csd4.csd.uwm.edu (1993-05-26) |

Newsgroups: | comp.compilers |

From: | markh@csd4.csd.uwm.edu (Mark) |

Keywords: | LALR, parse |

Organization: | Computing Services Division, University of Wisconsin - Milwaukee |

References: | 93-05-125 |

Date: | Wed, 26 May 1993 23:01:23 GMT |

*>How are regular parser grammars handled by the parser generator?*

The right hand side of an item is now a DFA.

Take the grammar (based on one presented on a prior post about the Tomita

parser):

S: NP VP.

NP: [D] N PP*.

VP: V NP PP*.

PP: P NP.

Rewriting these in equational DFA form, you get:

Q0 = S Q1

Q1 = 1.

S: Q2 = NP Q3.

Q3 = VP Q4.

Q4 = 1.

NP: Q5 = D Q6 | N Q7.

Q6 = N Q7.

Q7 = PP Q7 | 1.

VP: Q8 = V Q9.

Q9 = NP Q10.

Q10 = PP Q10 | 1.

PP: Q11 = P Q12.

Q12 = NP Q13.

Q13 = 1.

State 0 starts out with the item: [accept: Q0 ]. On closure, you get:

[accept: Q0 ].

[S: Q2].

[NP: Q5].

If any of the states (Q0, Q2, Q5) are final states in their respective

DFAs then the corresponding reduction is listed. For each transition on

each of the states (Q0, Q2, and Q5) a corresponding contribution to a

shift action is recorded:

shift symbol new state

S 1: [accept: Q1].

NP 2: [S: Q3].

D 3: [NP: Q6].

N 4: [NP: Q7].

State 1 has only one item with Q1 listed. Since Q1 is a final state in

the DFA corresponding to "accept", then only the reduction: [accept -> S]

is listed in state 1.

The final list of closures looks like this:

0: [accept: Q0] [S: Q2] [NP: Q5]

1: [accept: Q1]

2: [S: Q3] [VP: Q8]

3: [NP: Q6]

4: [NP: Q7] [PP: Q11]

5: [S: Q4]

6: [VP: Q9] [NP: Q5]

7: [PP: Q12] [NP: Q5]

8: [VP: Q10] [PP: Q11]

9: [PP: Q13]

and the final state table like this:

0: S => 1, NP => 2, D => 3, N => 4

1: <accept>

2: VP => 5, V => 6

3: N => 4

4: PP => 4, P => 7, <reduce NP: D N PP*>

5: <reduce S: NP VP>

6: NP => 8, D => 3, N => 4

7: NP => 9, D => 3, N => 4

8: PP => 8, P => 7, <reduce VP: V NP PP*>

9: <reduce PP: P NP>

with shift-reduce conflicts on states 4 and 8 on P.

Two things are true here that are not true of the BNF-based LR(k) parser:

(P0) The tables tend to be more compact

(P1) A state no longer necessarily has a unique predecessor

symbol (e.g. both N and PP have shifts/gotos into state 4).

A reduce action is carried out by following the corresponding DFA

backwards until you get back to the start state. For example, for the

reduction <VP = V NP PP*>, you first reverse the DFA:

VP: Q8 = V Q9.

Q9 = NP Q10.

Q10 = PP Q10 | 1.

by reversing all the arrows and reversing final and start states to get:

<reduce VP: V NP PP*>: Q10 = PP Q10 | NP Q9.

Q9 = V Q8.

Q8 = 1.

Since there may be more than one final state in the DFA, the reverse

finite state automaton won't necessarily be deterministic and might need

to be reduced to a DFA after the reversal. This automaton will reduce any

sequence of symobols on the stack of the form (reading backwards): PP* NP

V into a VP node.

Property P1 has the consequences that:

(a) a state in the reversed DFA

may have more than one transition

and (b) it's therefore necessary to actually look at the symbol on

the stack when making the reduction.

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.