Re: Conflict resolution in EBNF

friwi@prosun.first.gmd.de (F.W. Schroeer)
20 Oct 2001 21:50:58 -0400

          From comp.compilers

Related articles
Re: Conflict resolution in EBNF csanna@cs.Technion.AC.IL (Anna Nikitin) (2001-10-16)
Re: Conflict resolution in EBNF Soenke.Kannapinn@wincor-nixdorf.com (Sönke Kannapinn) (2001-10-20)
Re: Conflict resolution in EBNF alexc@world.std.com (2001-10-20)
Re: Conflict resolution in EBNF knyblad@baan.com (2001-10-20)
Re: Conflict resolution in EBNF cfc@world.std.com (Chris F Clark) (2001-10-20)
Re: Conflict resolution in EBNF friwi@prosun.first.gmd.de (2001-10-20)
Re: Conflict resolution in EBNF dr_feriozi@prodigy.net (2001-10-27)
| List of all articles for this month |

From: friwi@prosun.first.gmd.de (F.W. Schroeer)
Newsgroups: comp.compilers
Date: 20 Oct 2001 21:50:58 -0400
Organization: Compilers Central
References: 01-10-077
Keywords: parse
Posted-Date: 20 Oct 2001 21:50:58 EDT

Anna Nikitin wrote:


> I'm working on a new compiler-compiler language which allows using
> regular expressions [...]
> Therefore two new kinds of Reduce/Reduce conflict may appear:
>
> 1. Parser doesn't know which alternative to choose (when popping the
> symbols from the stack). The conflict appears in the following grammar:
> A -> {a | aa}B
> B -> {ab | b}
> 2. Parser doesn't know whether to stop or to continue popping symbols from
> the stack. The conflict appears in the following grammar:
> A -> {a | ab}{b}*
>
> In the first case we can use precedence rules (as in Yacc), i.e. give
> priorities to the alternatives. If the conflict occurs we simply
> choose the alternative with the highest priority. But in the second
> case it's not so clear what to do. [...]


and John added:


> [She knows the grammar is ambiguous, the question is whether there's some
> reasonable trick like yacc's %left and %right to tell the parser generator
> how to deal with it. -John]


I have developed such a notation for Accent.


The Accent compiler compiler is a production quality parser generator
that can process all grammars without the need to adapt them to any
particular parsing technique. It also supports regular expressions on
the right hand sides of rules. Accent can process ambiguous
grammars. It has a new framework to resolve ambiguities without the
need to modify the grammar. This framework is complete in the sense
that it can deal with all ambiguities.


The tutorial for Accent includes a discussion of this notation.
It is available at


      http://accent.compilertools.net


from where you may also download Accent.


Accent has %prio annotations that allow you to select between
conflicting alternatives.


Most people think that ambiguities are introduced in the following
way: In a derivation tree there is subtree that we can replace by a
tree with a different rule at the top such that the new subtree
generates the same part of the source text as the old subtree.


But there is also a second source of ambiguity which in its general
form is overlooked even in papers devoted to the topic (e.g. in
Earley's Acta Informatica paper where he proposes a notation to
resolve ambiguities, 1975, 183-192).


In this case you don't exchange the rule at the root of the subtree
that produces the ambiguity but you exchange more than one of the
members inside the same rule.


A nonterminal in this case covers a shorter or a longer piece of the
source text. You use the %short or the %long annotation to select what
you prefer.


It is interesting that in both grammars this second case occurs.


Please note that the annotation is on the abstract level of grammars,
not at the level a particular parser implementation.


There is also a tool called Amber that checks grammars for ambiguities and
explains how to resolve them.


I translated Anna's grammars into the Accent notation and used Amber
to analize them.


Grammar 1
---------


A : ( 'a' | 'a' 'a' ) B ;
B : ( 'a' 'b' | 'b' ) ;


Amber generates an example that indicates the ambiguity, prints the possible
derivation trees and then gives a hint how to resolve the ambiguity.


Here is the Amber output:


GRAMMAR DEBUG INFORMATION


Grammar ambiguity detected.
There are two different parses
for the beginning of ``A'', alternative at line 1, col 5 of grammar,
up to and containing ``B'' at line 1, col 23 of grammar.


PARSE 1
-------


A alternative at line 1, col 5 of grammar {
    Subphrase alternative at line 1, col 7 of grammar {
        'a'
    }
    B alternative at line 2, col 5 of grammar {
        Subphrase alternative at line 2, col 7 of grammar {
            'a'
            'b'
        }
    }
}


PARSE 2
-------


A alternative at line 1, col 5 of grammar {
    Subphrase alternative at line 1, col 13 of grammar {
        'a'
        'a'
    }
    B alternative at line 2, col 5 of grammar {
        Subphrase alternative at line 2, col 17 of grammar {
            'b'
        }
    }
}


For ``B'' at line 1, col 23 of grammar,
use %long annotation to select first parse,
use %short annotation to select second parse.


END OF GRAMMAR DEBUG INFORMATION


Grammar 2
---------


A : ( 'a' | 'a' 'b' )( 'b' )* ;


GRAMMAR DEBUG INFORMATION


Grammar ambiguity detected.
There are two different parses
for the beginning of ``A'', alternative at line 1, col 5 of grammar,
up to and containing ``Subphrase'' at line 1, col 22 of grammar.


PARSE 1
-------


A alternative at line 1, col 5 of grammar {
    Subphrase alternative at line 1, col 13 of grammar {
        'a'
        'b'
    }
    Subphrase alternative at line 1, col 24 of grammar {
    }
}


PARSE 2
-------


A alternative at line 1, col 5 of grammar {
    Subphrase alternative at line 1, col 7 of grammar {
        'a'
    }
    Subphrase alternative at line 1, col 24 of grammar {
        'b'
        Subphrase alternative at line 1, col 24 of grammar {
        }
    }
}


For ``Subphrase'' at line 1, col 22 of grammar,
use %short annotation to select first parse,
use %long annotation to select second parse.


END OF GRAMMAR DEBUG INFORMATION


Here is an annotated version of the first grammar :


A : ( 'a' | 'a' 'a' ) %long B ;
B : ( 'a' 'b' | 'b' ) ;


Note that the subgrammar for B is not ambiguous
because the two alternatives for B produce different strings.
The second rule is OK, a %prio annotation would not help
because there is no conflict.
The problem is introduced in the rule for A, other uses of B
might be harmless.


Here is the annotated second grammar :


A : ( 'a' | 'a' 'b' ) %short ( 'b' )* ;


Regards,


F.W. Schroeer
Fraunhofer FIRST


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.