Generating LR parsers from EBNF?

Stephen Adams <S.R.Adams@ecs.southampton.ac.uk>
Tue, 7 May 91 21:33:42 BST

          From comp.compilers

Related articles
Generating LR parsers from EBNF? dexter314159@qwest.net (Gary F. Goodman) (2006-04-03)
Re: Generating LR parsers from EBNF? gene@abhost.us (Gene Wirchenko) (2006-04-17)
Generating LR parsers from EBNF? S.R.Adams@ecs.southampton.ac.uk (Stephen Adams) (1991-05-07)
Re: Generating LR parsers from EBNF? grass@ulysses.att.com (1991-05-09)
Re: Generating LR parsers from EBNF? KARSTEN@tfl.dk (Karsten Nyblad, TFL, Denmark) (1991-05-10)
| List of all articles for this month |

Newsgroups: comp.compilers
From: Stephen Adams <S.R.Adams@ecs.southampton.ac.uk>
Keywords: LR(0), parse, question
Organization: Southampton University Computer Science
Date: Tue, 7 May 91 21:33:42 BST

A recent posting (sorry, I can't remember exactly which) mentioned that
someone's parser generator used an extended BNF notation directly for
generating the LR automaton. I had always been lead to believe that an
EBNF grammar is first translated into a context free grammar (CFG) and
then that is used for parse table generation.


I thought about the idea and tried a few tiny examples by hand. The main
trick is to reason about the placing of the `dot' in the EBNF items.
Shifting the dot in an item may produce more than one derived item. For
example, consider the simple grammar for nested parenthesized lists of
numbers:


s -> ( {s} )
s -> number


`{s}' menas 0 or more s's. This grammar might generate `5' or
`(1 2 (3 4) 5)'.


The item `s -> . ( {s} )' shifted to two items, `s -> ( { . s} )' and
`s -> ( {s} . )' The state containing these two items would usually be two
states in a CFG generated automaton.


The parse tables came out smaller. The EBNF generated table was smaller
because some table slots in the CFG generated table were being replaced
with sequences of operations, for example a reduction of a null production
might be replaced by a compound operation `reduce and goto'. After these
changes some states and some columns of the goto table are never used and
may be removed.


I have looked in several books and a couple of conference proceedings but
I have failed to find any references on generating LR parse tables
directly from an EBNF grammar. What I would like to know is:


    1. Are there any references?


    2. Is the EBNF approach equivalent to the CFG one? Is
            the difference in the tables always a due to small set
            of `optimizations'?


    3. Which is faster? The EBNF item sets are smaller but
            more complex to handle. Using the EBNF approach seems
            to reduce the need for `optimizing' the generated
            table.


    4. I only generated the LR(0) automaton by hand. I have
            not thought about `higher' grammar categories like
            LALR(1), LR(2), regular-LR etc. Are these kinds
            easily generated using the EBNF method?




Stephen Adams S.R.Adams@ecs.soton.ac.uk
Computer Science S.R.Adams@sot-ecs.uucp
Southampton University
Southampton SO9 5NH, UK
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.