New LR parser generation algorithm

"David R Tribble" <david@tribble.com>
30 Apr 2006 18:35:24 -0400

          From comp.compilers

Related articles
New LR parser generation algorithm david@tribble.com (David R Tribble) (2006-04-30)
Re: New LR parser generation algorithm schmitz@i3s.unice.fr (Sylvain Schmitz) (2006-05-01)
Re: New LR parser generation algorithm schmitz@i3s.unice.fr (Sylvain Schmitz) (2006-05-01)
Re: New LR parser generation algorithm Francois.Pottier@inria.fr (Francois Pottier) (2006-05-09)
| List of all articles for this month |
From: "David R Tribble" <david@tribble.com>
Newsgroups: comp.compilers
Date: 30 Apr 2006 18:35:24 -0400
Organization: Compilers Central
Keywords: LR(1), parse, theory
Posted-Date: 30 Apr 2006 18:35:24 EDT

I wish to introduce a new LR parser generation algorithm. I call this
the Honalee Algorithm. It produces the minimal collection of
canonical LR(k) item sets for an LR(k) grammar.


The algorithm uses a "merge-as-you-go" approach to merging similar
item sets as they are generated, as opposed to generating an entire
collection of item sets and then merging them after the fact. This
approach conserves total memory space.


The output of the algorithm is the minimal collection of item sets
necessary for recognizing the input grammar with an LR parser. (The
item sets translate directly into parser states for an LR parser.) If
the input grammar is SLR or LALR, the resulting collection of item
sets is exactly the same as the item sets that are produced by a
comparable LALR parser generator. If the grammar is LR but is not SLR
or LALR, the collection of item sets produced contains additional item
sets that are necessary for an LR parser to recognize the complete
grammar. In this sense, the output is the minimal collection of item
sets, containing no unnecessary redundant or duplicate item sets.


A brief sketch of the algorithm, which is structured as a two-phase
loop:


    create initial item set and add it to toDoList;
    while incList is not empty,
        // Phase 1
        if incList is not empty,
            pop first incomplete item set in incList;
            generate new transition item sets from shift items in set,
            adding them to the toDoList;
            mark set as complete and move set to done list;


        // Phase 2
        for each set in the toDoList,
            generate closure items and mark all reduction items in set;
            if set can be merged with existing gSet,
                merge set into gSet;
                if gSet is complete and merging added shift items to gSet,
                    reset shift actions in gSet
                    mark gSet as incomplete and move gSet to incList;
                discard set;
            if set was not discarded,
                move set to incList;
    end while;


A more thorough treatment of the algorithm can be found at:
    http://david.tribble.com/text/honalee.html
    http://david.tribble.com/text/honalee_hist.html


The final version of the algorithm was developed in Dec 2005, while
developing a YACC-like parser genenerator (compiler-compiler) project
that I began in Mar 2001. The project itself (called "YACC/M for
Java") is about 90% complete at this point.


  --
David R. Tribble (david@tribble.com), 30 Apr 2006



Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.