Related articles |
---|
Yacc precedence bug not a bug David.Tarditi@B.GP.CS.CMU.EDU (1991-01-11) |
Newsgroups: | comp.compilers |
From: | David.Tarditi@B.GP.CS.CMU.EDU |
Keywords: | yacc, parse, debug |
Organization: | Compilers Central |
References: | <7993.9101101030@guava.seg.npl.co.uk> |
Date: | Fri, 11 Jan 91 14:36:36 EST |
Simon Voogd writes:
> You can give a token a precedence and associativity by
> declaring this in the declaration-part of a YACC-file
> According to the manual you can also change the precedence
> and associativity of a token in the grammar-rules by
> using the '%prec'-mechanism.
>
> But it seems to us that this mechanism doesn't work!!!
>
> We enclose a simple desk calculator and reverse the usual
> precedence of '+' with '*' using the '%prec'-mechanism.
> It seems that the associativity of tokens can be changed.(?)
> But their precedence levels are now the same.
I think you misunderstood the %prec mechanism. It changes the precedence
associated with a rule, not with any particular token.
Here is the Yacc precedence scheme. Each terminal may be assigned a
precedence and associativity. Each rule is then assigned the precedence of
its rightmost terminal. The %prec may be used to change the precedence
assigned to the rule to the precedence of that of the terminal used in the
%prec.
If a shift/reduce conflict occurs, the conflict is resolved silently if the
terminal and the rule in the conflict have precedences. If the terminal has
the higher precedence, shift is chosen. If the rule has the higher
precedence, the reduction is chosen. If both the terminal and the rule have
the same precedence, then the associativity of the terminal (to shift, that
is) is used to resolve the conflict. If the terminal is left associative,
the reduction is chosen. If the terminal is right associative, the shift is
chosen. If the terminal is nonassociative, that is an error in the
specification.
If a terminal or a rule in a shift/reduce conflict does not have a
precedence, then an error message is produced and the shift is chosen.
With this in mind, let us take a look at what happens. I have simplified the
grammar and changed the specification to:
%token NUMBER PLUS MUL
%%
start : expr
expr : expr PLUS expr %prec MUL
| expr MUL expr %prec PLUS
| NUMBER
;
%%
We can see the shift/reduce conflicts that result by running yacc on
the specification with -v option. Here are the relevant portions of
the y.output file:
6: shift/reduce conflict (shift 4, red'n 2) on PLUS
6: shift/reduce conflict (shift 5, red'n 2) on MUL
state 6
expr : expr_PLUS expr
expr : expr PLUS expr_ (2)
expr : expr_MUL expr
PLUS shift 4
MUL shift 5
. reduce 2
7: shift/reduce conflict (shift 4, red'n 3) on PLUS
7: shift/reduce conflict (shift 5, red'n 3) on MUL
state 7
expr : expr_PLUS expr
expr : expr_MUL expr
expr : expr MUL expr_ (3)
PLUS shift 4
MUL shift 5
. reduce 3
Now, suppose we add these precedence declarations to the specification:
%left PLUS
%left MUL
MUL has higher precedence than PLUS, as is traditional. How are the
shift/reduce conflicts resolved ? Rule 2 (expr : expr PLUS expr %prec MUL)
will have the same precedence as the terminal MUL. Rule 3
(expr : expr MUL expr %prec PLUS) will have the same precedence as PLUS.
The precedences of MUL and PLUS have not changed!
Thus, for state 6:
shift PLUS, reduce rule 2: Since rule 2 has higher precedence (that of MUL)
than PLUS, the reduction will be chosen.
shift MUL, reduce rule 2: The rule and the terminal have the same precedence.
Since MUL is left associative, the reduction will
be chosen.
For state 7:
shift PLUS, reduce rule 3: The rule and the terminal have the same
precedence. Since PLUS is left associative,
the reduction is chosen.
shift MUL, reduce rule 3: The rule has lower precedence, so the shift
is chosen.
The end result for something of the form
NUMBER op1 NUMBER op2 NUMBER
can be summarized as:
1. if op2 = PLUS, always reduce
2. if op2 = MUL, reduce if state=6 and shift if state=7. To make
sense of this, let us examine the states in the parser from whichp
we can make a transition to state 6 or state 7. There turn out to be
two states:
------
state 4
expr : expr PLUS_expr
NUMBER shift 3
. error
expr goto 6
state 5
expr : expr MUL_expr
NUMBER shift 3
. error
expr goto 7
-------
Thus, when parsing
expr PLUS expr MUL expr
we will be in state 6 after seeing
expr PLUS expr
causing us to reduce on MUL. When parsing
expr MUL expr MUL expr
we will be in state 7 after seeing
expr MUL expr
causing us to shift on MUL.
To summarize:
1. expr PLUS expr PLUS expr: reduce when we see the second PLUS
2. expr PLUS expr MUL expr: reduce when we see MUL
3. expr MUL expr PLUS expr: reduce when we see the PLUS
4. expr MUL exp MUL expr: shift when we see the second MUL
We cannot say that PLUS has "higher" precedence than MUL because
of (2). If this were true, we would shift there. Also, MUL
acts in (4) as though it were right associative!
Now we see why this happened:
> the enclosed calculator gives the following results :
> 4+7*2 = 22 [ = (4+7)*2 ]
> 2*7+4 = 18 [ = (2*7)+4 ] (surely should be 22 !!!!!)
2*7+4 is being parsed correctly according to Yacc's precedence rules.
The moral of the story is that one should use %prec with great caution,
and not to try to push the Yacc precedence scheme beyond what it is
designed for, mainly mathematical expressions involving binary operators.
If you find yourself trying to alter the precedences of your rules with
%prec, I suggest that you redesign your grammar instead.
David Tarditi School of Computer Science
tarditi@cs.cmu.edu Carnegie Mellon University
Pittsburgh, PA 15213
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.