Re: Simple Expression Recursion vs Expression/Term/Factor

"Joachim Durchholz" <joachim_d@gmx.de>
8 Aug 2001 01:02:05 -0400

From comp.compilers

Related articles
Simple Expression Recursion vs Expression/Term/Factor stoggers@uniquest.demon.co.uk (Mike Stogden) (2001-08-02)
Re: Simple Expression Recursion vs Expression/Term/Factor Mark.van.den.Brand@cwi.nl (M.G.J. van den Brand) (2001-08-06)
Re: Simple Expression Recursion vs Expression/Term/Factor lucadesantis@infinito.it (luca) (2001-08-06)
Re: Simple Expression Recursion vs Expression/Term/Factor gregod@cs.rpi.edu (Douglas Gregor) (2001-08-06)
*Re: Simple Expression Recursion vs Expression/Term/Factor joachim_d@gmx.de (Joachim Durchholz)* (2001-08-08)**
Re: Simple Expression Recursion vs Expression/Term/Factor lucads@xoommail.xoom.it (Luca) (2001-08-08)
Re: Simple Expression Recursion vs Expression/Term/Factor andi@diagonal.ch (Andreas Gieriet) (2001-08-15)

| List of all articles for this month |

From:	"Joachim Durchholz" <joachim_d@gmx.de>
Newsgroups:	comp.compilers
Date:	8 Aug 2001 01:02:05 -0400
Organization:	Compilers Central
References:	01-08-016 01-08-031
Keywords:	parse
Posted-Date:	08 Aug 2001 01:02:04 EDT

luca <lucadesantis@infinito.it> wrote:
> Mike Stogden wrote in message 01-08-016...
> >I have seen expressions for the same language described as...
> >
> >expr -> expr + expr
> >expr -> expr * expr
> >etc...
> >
> >alternatively...
> >
> >expr -> expr + term
> >term -> factor | term * factor
> >factor -> number | identifier
> >etc...
> >
>
> I think the second approach allows to manage priority of operations
> is a simpler way.

However, there's a drawback, in particular if the grammar has more than
just three precedence levels. E.g. assuming you have 9 levels, and the
grammar looks like this:
    E0 -> E0 op0 E1 | E1
    E1 -> E1 op1 E2 | E2
    ...
    E9 -> number | identifier | "(" E0 ")"

Now assume that "+" is an op5, the expression "a + b" will parse as

    E0
    |
    E2
    |
    E3
    |
    E4
    |
    +
  / \
E5 E4
| |
E4 E3
| |
E3 E2
| |
E2 E1
| |
E1 b
|
a

(no guarantees for fencepost errors here) which should have just been

    +
  / \
a b

The unnecessary nodes will make tree traversal slower (since expressions
are in almost any production, this will blow up the total tree size by a
factor). Whether that's a problem depends on how much time the compiler
actually spends traversing the tree.

There are several solutions here:
1) If you're using a parser generator that doesn't automatically produce
a parse tree, write the semantic actions so that the unnecessary
intermediate nodes aren't created in the first place.
2) After the parse, do a pass through the tree that throws the
unnecessary nodes away.
3) Use the ambiguous grammar, but add precedence annotations. Details
depend on the parser generator you're using.

Alternative 3 is not without risk; if the grammar does "funny" things,
your parser might end up accepting stuff you didn't want accepted.
There are rules that define when it is safe to do that, but it's not
something that's easily followed (though the algorithm to check operator
precedence safety for a grammar is straightforward enough). In practice,
one selects a stricter-than-necessary but easy-to-verify set of rules
that the expressions grammar must follow; one that I have used in the
past is this:

Every precedence level is represented by one grammar rule; grammar rules
are built according to one of the following patterns (assume m = n + 1):
    -- right-associative, prefix operators
    En -> Em op_n En | op_n En | Em
    -- left-associative, postfix operators
    En -> En op_n Em | En op_n | Em
    -- nonassociative, prefix
    En -> Em op_n Em | op_n Em | Em
    -- nonassociative, postfix
    En -> Em op_n Em | Em op_n | Em
You can mix nonassociative operators into the associative cases if the
operator symbols are different. You'll get
    En -> Em op_n En | op_n En | Em nonassoc_op_n Em | Em
    En -> En op_n Em | En op_n | Em nonassoc_op_n Em | Em

Please note that this is off the top of my head, no guarantee for
mix-ups here.

Regards,
Joachim

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Simple Expression Recursion vs Expression/Term/Factor

"Joachim Durchholz" <joachim_d@gmx.de>8 Aug 2001 01:02:05 -0400

"Joachim Durchholz" <joachim_d@gmx.de>
8 Aug 2001 01:02:05 -0400