(long) conflict resolution in action table

Tim Newsham <newsham@lava.net>
4 Sep 2003 22:36:27 -0400

          From comp.compilers

Related articles
(long) conflict resolution in action table newsham@lava.net (Tim Newsham) (2003-09-04)
Re: (long) conflict resolution in action table derkgwen@HotPOP.com (Derk Gwen) (2003-09-09)
Re: (long) conflict resolution in action table haberg@matematik.su.se (2003-09-09)
Re: (long) conflict resolution in action table newsham@lava.net (Tim Newsham) (2003-09-09)
Re: (long) conflict resolution in action table newsham@lava.net (Tim Newsham) (2003-09-14)
| List of all articles for this month |
From: Tim Newsham <newsham@lava.net>
Newsgroups: comp.compilers
Date: 4 Sep 2003 22:36:27 -0400
Organization: Compilers Central
Keywords: LALR, parse, question
Posted-Date: 04 Sep 2003 22:36:27 EDT

Hi,
      semi-long question and comment...


OVERVIEW:


      I have a comment about using precedence to disambiguate the
actions in a shift-reduce parser. I've found two approaches so
far:


      - Give priorities and associativities to terminals. When
          encountering a shift-reduce conflict, give the shift the
          precedence of the symbol to be shifted, and the reduce the
          precedence of the highest precedence symbol in the right
          hand side of the production to be reduced. Pick the action
          with the higher precedence.


      - Use a set of conflict constraints when generating the LR0
          goto table and the follow set of each production (note:
          each production has its own followset rather than computing
          followsets of symbols). This method is outlined by Visser
          in the following two papers:


              @misc{ visser-scannerless,
                  author = "E. Visser",
                  title = "Scannerless generalized-LR parsing",
                  text = "Visser, E. (1997b). Scannerless generalized-LR parsing. Technical Report
                      P9707, Programming Research Group, University of Amsterdam.",
                  url = "citeseer.nj.nec.com/visser97scannerles.html" }


              @misc{ visser-case,
                  author = "E. Visser",
                  title = "A case study in optimizing parsing schemata by disambiguation filters",
                  text = "E. Visser. A case study in optimizing parsing schemata by disambiguation
                      filters.",
                  url = "citeseer.nj.nec.com/164096.html" }


I'd be interested in hearing of other approaches.








BACKGROUND:


Anyway, I started to use Visser's method of disambiguation. Consider
the following example:


          E -> E + E {1}
                | E * E {2}
                | id {3}
                ;


          with relationships:
              {2} > {1}
              {1} left {1}
              {2} left {2}


the disambiguation rules mean that the following subtrees contain
conflicts and should not be generated:


        E -> [E -> E + E] * E because {2} > {1}
        E -> E * [E -> E + E] because {2} > {1}
        E -> E + [E -> E + E] because {1} left {1}
        E -> E * [E -> E * E] because {2} left {2}


During the construction of the LR0 goto table, these restrictions
prevent some predictions that might normally occur when computing
the closure of the item sets, ie:


        E -> . E * E does not predict E -> . E + E
        E -> E * . E does not predict E -> . E + E
        E -> E + . E does not predict E -> . E + E
        E -> E * . E does not predict E -> . E * E


which has the net effect of eliminating some of the shift operations
that otherwise would have been possible at a later point. During
computation of the follows sets of each production these restrictions
prevent some symbols from being part of the follows set. For example
the production:


        E -> E + E follows = { EOF, + }


never receives a '*' in its follows set because the only time '*' could
follow the production:


        E -> [E -> E + E] * E


cannot occur.


So with these disambiguation rules, there are no conflicts in the SLR
action table for the grammar.






THE PROBLEM


That's straightforward enough. Now my comment/complaint/question.
Consider when you add one more production to the grammar but do not
specify its precedence or associativity:


          E -> E + E {1}
                | E * E {2}
                | id {3}
                | E ^ E {4}
                ;


          with relationships:
              {2} > {1}
              {1} left {1}
              {2} left {2}


Now when we compute follow sets we get:


        E -> E ^ E follows = { EOF, +, *, ^ }


because there are no restrictions on this production. This leads to


        E -> E + E follows = { EOF, +, *, ^ }


because


        E -> E + [E -> E ^ E]


is allowed, and any values that follow [E -> E ^ E] may also follow
[E -> E + E].


The end effect is that the resulting action tables will have many
shift/reduce conflicts that the precedence rules imply should have
been elimianted. As an example consider the following itemset and
actions for one of the states in the SLR parser:


    State 7:
          E -> E + E . follows: ['#EOF#', '*', '+', '^']
          E -> E . + E
          E -> E . * E
          E -> E . ^ E
                on #EOF#: reduce E -> E + E
                *** on *: shift 3
                *** on *: reduce E -> E + E
                *** on +: shift 4
                *** on +: reduce E -> E + E
                *** on ^: shift 5
                *** on ^: reduce E -> E + E


If more traditional conflict resolution were used, we should be
able to resolve the action when the "*" lookahead is seen in
favor of the shift since a shift of a "*" would have a higher
precedence than a reduce of a production with a "+".


My comment: Visser's method seems very elegant for completely
disambiguating conflicts in a set of productions, but if you wish
to partially disambiguate the productions, it may not do as much
as you would hope.


My question: Is there anyway to augment Visser's conflict resolution
so that it can be more effective when resolving a subset of the
conflicts that occur? For example, is there a method that could
produce a parser for my simple expression grammar that only encountered
a ambiguity when the "^" operator was present in an input sentance?


Tim N.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.