Re: Beginner help with LALR(1) closure

grout@sp55.csrd.uiuc.edu (John R. Grout)
14 Nov 1996 21:52:12 -0500

          From comp.compilers

Related articles
Beginner help with LALR(1) closure kentr@rollinssoft.com (Kent Rollins) (1996-11-12)
Re: Beginner help with LALR(1) closure dlester@cs.man.ac.uk (1996-11-14)
Re: Beginner help with LALR(1) closure grout@sp55.csrd.uiuc.edu (1996-11-14)
Re: Beginner help with LALR(1) closure compres@world.std.com (1996-11-14)
Re: Beginner help with LALR(1) closure farzu@cs.tamu.edu (Francisco Arzu) (1996-11-14)
Re: Beginner help with LALR(1) closure adrian@dcs.rhbnc.ac.uk (1996-11-19)
Re: Beginner help with LALR(1) closure salomon@silver.cs.umanitoba.ca (1996-11-19)
Re: Beginner help with LALR(1) closure gianni@engr.sgi.com (Gianni Mariani) (1996-12-03)
Re: Beginner help with LALR(1) closure icedancer@ibm.net (1996-12-07)
[1 later articles]

From: grout@sp55.csrd.uiuc.edu (John R. Grout)
Newsgroups: comp.compilers
Date: 14 Nov 1996 21:52:12 -0500
Organization: Center for Supercomputing R and D, UIUC
References: 96-11-080
Keywords: LALR
In-reply-to: Kent Rollins's message of 12 Nov 1996 22:04:02 -0500

Kent Rollins <kentr@rollinssoft.com> writes:


> I'm working on writing my own LALR(1) parser table generator and
> I'm a little stuck. I'm doing this because I want to understand
> the parsing process. Any tips would be greatly appreciated.
>
> I'm using 'Compiler Design in C' [Holub]. I'm currently working
> on performing LALR(1) closure on LALR(1) states. I've got 2
> questions:
>
> 1. I've downloaded a few YACC-able grammars and I've noticed that
> they all have left-recursive grammars like (A) below.


Because, with an LR parser, left-recursive rules are more efficient:
the parsing stack stays shorter.  With a left-recursive list rule
(say, list: list item | item) the parser can reduce after each
element, so the stack stays a few entries deep; with the
right-recursive version (list: item list | item) every element has to
be shifted before the first reduction, so the stack grows with the
length of the input.


> When I run this thru my generator, these cause infinite recursion.


Sounds like a bug... closure is a fixed-point computation over a
_set_ of items, so an item that is already in the set is never added
(or expanded) a second time; done that way, a left-recursive rule is
expanded exactly once and the recursion bottoms out.
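
A rough sketch of that idea, assuming an item representation of my
own invention (lookaheads omitted for brevity; the LALR(1) flavor
works the same way except that each item also carries a lookahead set
and is re-examined only when that set actually grows):

    def closure(items, grammar):
        """A sketch of item-set closure.  An item is (lhs, rhs, dot) with
        rhs a tuple of symbols; grammar maps each nonterminal to a list of
        right-hand sides.  Because an item already in the set is never
        added (or queued) again, a left-recursive rule such as E -> E + T
        is expanded once and the loop terminates."""
        result = set(items)
        worklist = list(items)
        while worklist:
            lhs, rhs, dot = worklist.pop()
            if dot < len(rhs) and rhs[dot] in grammar:   # dot sits before a nonterminal B
                for prod in grammar[rhs[dot]]:           # add the item B -> . gamma
                    item = (rhs[dot], tuple(prod), 0)
                    if item not in result:               # the membership test that matters
                        result.add(item)
                        worklist.append(item)
        return result

For instance, closure({("S'", ("E",), 0)}, {"E": [["E", "+", "T"],
["T"]], "T": [["id"]]}) terminates with four items even though E is
left-recursive.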


> I can't find anything in Holub about how the LALR(1) generator should
> handle left-recursion. In order to get around the problem, I used a
> technique from a previous chapter which eliminates left-recursion (B)
> from LL grammars. I would like to know how the LALR(1) closure process
> avoids left recursion.


When in doubt, go back to the source... a classic paper on computing
LALR(1) lookahead sets is by Frank DeRemer and Thomas Pennello,
"Efficient Computation of LALR(1) Look-Ahead Sets," ACM Transactions on
Programming Languages and Systems, Volume 4, No. 4 (October 1982),
pp. 615-649... they were among the first to give a constructive
definition which described _how_ to build LALR(1) lookahead sets
efficiently... before their work (and some similar work done at the
same time), algorithms either used LALR subsets or (as did YACC-like
critters) performed potentially _very_ time-consuming and bug-prone
iteration to a fixed point (program the iteration slightly
incorrectly, and you could easily loop forever).


My bachelor's thesis parser generator (back in 1981... sigh) was
orders of magnitude faster than the YACC-like critters of the day on
complicated language grammars because it used DeRemer and Pennello's
algorithm (as described in conference papers... the "dead tree"
version cited above was printed after I graduated).


DeRemer and Pennello's algorithm computes LALR(1) lookahead sets in
three simple steps which break down the ways terminal symbols can be
read after a reduction is performed:


1. Immediately (those read in the first parser state visited after the
        reduction).


2. Immediately after one or more trivial reductions (to epsilon) are
        performed, with no intervening reads.


3. Immediately after one or more general reductions (to epsilon or
        not) are performed, with no intervening reads (the corresponding
        set equations are sketched just below).
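
In the paper's notation, the three layers stack up roughly like this
(a sketch from memory, where p, q and r are LR(0) states, A, B and C
are nonterminals, and "nullable" means derives the empty string):

    DR(p,A)     = { t : the A-transition from p reaches a state that
                        shifts the terminal t }
    Read(p,A)   = DR(p,A)   U  union{ Read(r,C)   : (p,A) reads (r,C) }
    Follow(p,A) = Read(p,A) U  union{ Follow(q,B) : (p,A) includes (q,B) }
    LA(q, A->w) = union{ Follow(p,A) : (q, A->w) lookback (p,A) }

where (p,A) reads (r,C) when the A-transition from p lands in r and r
has a transition on a nullable nonterminal C; (p,A) includes (q,B)
when there is a production B -> beta A gamma with gamma nullable and
q reaches p on beta; and (q, A->w) lookback (p,A) when p reaches q
on w.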


The "direct read" sets associated with the first step are formed by
inspection of the LR(0) parser, while the sets associated with the
second (the reads sets) and third (the final lookahead sets) steps are
performed using a algorithm by Tarjan (as modified by Eve and
Kurki-Suonio) to compute the transitive closure of a relation (and
associated set unions) which _can't_ get into an infinite loop, no
matter how tangled the underlying relation... the algorithm simply
detects a non-trivial strongly connected component in the underlying
relation and unions together the associated sets (note that there
_can_ be cycles in the relation underlying the third step for
unambiguous grammars... but, as the authors demonstrate, a cycle in
the relation underlying the second step means the grammar is not LR(k)
for any k).
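
For the curious, here is a minimal sketch in Python (names and data
representation are my own, not anything from the paper verbatim) of
that set-union transitive-closure traversal.  The point to notice is
that every node in a non-trivial strongly connected component ends up
sharing one merged set, so a cycle in the relation terminates instead
of looping:

    def digraph(nodes, R, F0):
        """Compute F(x) = F0(x) unioned with F(y) for every y such that
        x R y, using the SCC-detecting traversal DeRemer and Pennello
        adapted from Tarjan (via Eve and Kurki-Suonio).  R maps a node to
        its successors, F0 maps a node to its initial set."""
        INFINITY = float("inf")
        stack = []                                # traversal in progress
        N = {x: 0 for x in nodes}                 # 0 means "not yet visited"
        F = {x: set(F0.get(x, ())) for x in nodes}

        def traverse(x):
            stack.append(x)
            depth = len(stack)
            N[x] = depth
            for y in R.get(x, ()):
                if N[y] == 0:
                    traverse(y)
                N[x] = min(N[x], N[y])            # lowest stack depth reachable from x
                F[x] |= F[y]                      # propagate the union upward
            if N[x] == depth:                     # x is the root of its SCC
                while True:
                    top = stack.pop()
                    N[top] = INFINITY             # mark as finished
                    F[top] = set(F[x])            # the whole SCC shares one merged set
                    if top == x:
                        break

        for x in nodes:
            if N[x] == 0:
                traverse(x)
        return F

Running it on a deliberately cyclic relation, e.g. R = {"a": ["b"],
"b": ["c"], "c": ["a"]} with F0 = {"a": {"x"}, "b": {"y"}, "c": set()},
terminates and gives every node the merged set {"x", "y"}.  One call
with the reads relation and the direct read sets yields the Read
sets, a second call with the includes relation and the Read sets
yields the Follow sets, and the LA set for each reduction is then the
union of the Follow sets of its lookback pairs.
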
--
John R. Grout Center for Supercomputing R & D j-grout@uiuc.edu
Coordinated Science Laboratory University of Illinois at Urbana-Champaign
--

