# Re: DeRemer Pennello LALR(1) Construction Algo

## =?iso-8859-1?Q?Fran=E7ois=20D=E9sarm=E9nien?= <francois@fdesar.net>8 May 2000 00:50:07 -0400

From comp.compilers

Related articles
DeRemer Pennello LALR(1) Construction Algo johnston.p@worldnet.att.net (Paul Johnston) (2000-05-04)
Re: DeRemer Pennello LALR(1) Construction Algo francois@fdesar.net (=?iso-8859-1?Q?Fran=E7ois=20D=E9sarm=E9nien?=) (2000-05-08)
| List of all articles for this month |

 From: =?iso-8859-1?Q?Fran=E7ois=20D=E9sarm=E9nien?= Newsgroups: comp.compilers Date: 8 May 2000 00:50:07 -0400 Organization: Compilers Central References: 00-05-013 Keywords: parse

Hello,

I've implemented such a beast in Perl, not so long ago, so I'm trying to
help...

Paul Johnston wrote:
>
> This is a rather directed question: I am currently implementing an LALR(1)
> constructor which uses DeRemer and Pennello's Lookahead set algorithm
> described in ACM TOPLAS Vol.4 No.4 Oct1982 615-649 and explained using an
> example in 'The Theory and Practice of Compiler Writing" by Tremblay and
> Sorenson.

Could you mail me the example by Tremblay and Sorenson ? I'd be very interested
in it, and maybe it would make it easier for me to tell you hows and whys.

In fact, I've been using a little different implementation than DP's one, using
the digraph algorythm with the Knuth's left-dependency relation to build the first set.

> This algorithm defines several binary relations to assist in the LA set
> generation, one of which 'LOOKBACK' relates which FOLLOW sets contribute to
> a LA set.

Uhm, yes. One can say that :-)

LA(q, A -> omega .) = Union {FOLLOW(p,A) | p in PRED(q, omega)}

where PRED(q, omega) = {p | GOTO(p, omega)=q}

> The problem I'm having is in regards to the example grammar in Tremblay ---
> my implementation is emitting one extra LOOKAHEAD relation which I can't for
> the life of me reason why it is incorrect. It is:
>
> (12, T -> T * f) LOOKBACK (8, T)
>
> Which implies that the following statement is true: "when the top of the
> stack reads state 12 and spells out T * f, reduction by T -> T * f will pop
> three symbols off the stack and reveal state 8 (which has a nonterminal
> transition on T)".

I can' help on that without the grammar specification.

> Given the example grammar, I don't see why this statement is false. I've
> analysed the possible configurations of the stack and that one seems
> perfectly valid to me, yet it is conspicously absent from Tremblay.

But maybe it could on other grammars...maybe Tremblay is wrong...without
the paper, I cannot say.

> This is how I am identifying LOOKBACK relations:
>
> foreach state p in the LR(0) machine
> do
> foreach item in p of the form [ A -> dot alpha ] /* an item with the dot
> at the front */
> do
> trace through the LR(0) machine along the path given by alpha
> until you hit the item with the dot at the end [ A -> alpha dot ] to
> get state q.
> add a relation (q, A -> alpha) LOOKBACK (p, A)
> done
> done
>

Looks like you're computing PRED(p, alpha) = {q | GOTO(q, alpha)=p}, and I
don't see any mistake...

I do it a very different way: while computing the LR(0) states, I keep a back
pointer to all the states which leads to one state. Then, knowing the length
of rule A -> alpha, I can build PRED(p, alpha) set just by finding states which
are length(alpha) away "back" from p. But that should lead to the same result.

> I expect that people who have implemented this algorithm or people who have
> studied the example in detail may be able to offer insight why my approach
> is flawed.

I'll try to implement your algorithm to compare results with mine which I know
is good (I compiled correctly a C++ grammar, with 825 rules, 1611 states, amongst
others, and I'm not the only one using it :)

If you give me the example grammar, I could give you a listing of the PRED set
I find...