Re: epsilon-free grammar

torbenm@gna.diku.dk (Torben Ęgidius Mogensen)
29 Sep 2001 10:55:53 -0400

          From comp.compilers

Related articles
epsilon-free grammar no_email@gmx.at (Jan Horner) (2001-09-26)
Re: epsilon-free grammar torbenm@gna.diku.dk (2001-09-29)
Re: epsilon-free grammar eernst@cs.auc.dk (Erik Ernst) (2001-09-29)
Re: epsilon-free grammar christian.bau@isltd.insignia.com (Christian Bau) (2001-09-29)
Re: epsilon-free grammar gzw@home.com (Geoff Wozniak) (2001-09-29)
Re: epsilon-free grammar adrian@sartre.cs.rhbnc.ac.uk (A Johnstone) (2001-09-30)
| List of all articles for this month |

From: torbenm@gna.diku.dk (Torben Ęgidius Mogensen)
Newsgroups: comp.compilers
Date: 29 Sep 2001 10:55:53 -0400
Organization: Department of Computer Science, University of Copenhagen
References: 01-09-113
Keywords: parse, theory
Posted-Date: 29 Sep 2001 10:55:53 EDT

"Jan Horner" <no_email@gmx.at> wrote:


>How can I get a EPSILON-free grammar (from a grammar)? I want to
>left-factorize that grammar. I'm reading the Aho-Sethi book but can
>find any solution.


You don't need an epsilon-free grammar to do left factorisation.
Nevertheless, here is how you remove epsilon productions:


  1) Pick a nonterminal N that has an epsilon production.


  2) Remove that production.


  3) All productions that have N once on the right-hand side are
        duplicated and the N is removed from one of the copies. Example:


A -> a N b


        becomes


A -> a N b
| a b


        In the general case, all productions that have N k times on the
        right-hand side are duplicated in 2^k copies such that all
        combinations of occurences of N being there or not are present.
        Example:


A -> a N b N c N d


        becomes


A -> a N b N c N d
| a N b N c d
| a N b c N d
| a N b c d
| a b N c N d
| a b N c d
| a b c N d
| a b c d


  4) If there are still epsilon-productions, go to step 1.




Note that, if the language accepts the empty string, this
transformation will remove this property. I.e., the transformed
grammer will accept L(G)\{epsilon}.


Note that the transformation will introduce productions that share a
prefix, causing more cases where left-facorisation is required. And
once you do left factorisation, you may reintroduce
epsilon-productions. Example:


    A -> a
        | a b


left-factorises to


    A -> a B
    B -> b | epsilon


which after epsilon-removal becomes


    A -> a B
        | a


    B -> b


There is no way to avoid both epsilon productions and overlapping
FIRST-sets for this language.


Torben Mogensen (torbenm@diku.dk)


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.