Re: if then else shift/reduce Syndrome

bakul@netcom.com (Bakul Shah)
3 Mar 1996 19:44:36 -0500

          From comp.compilers

Related articles
[5 earlier articles]
Re: if then else shift/reduce Syndrome meissner@cygnus.com (Michael Meissner) (1996-02-16)
Re: if then else shift/reduce Syndrome solution@gate.net (1996-02-16)
Re: if then else shift/reduce Syndrome tarnwb@holly.colostate.edu (Tarn Burton) (1996-02-16)
Re: if then else shift/reduce Syndrome meissner@cygnus.com (Michael Meissner) (1996-02-21)
Re: if then else shift/reduce Syndrome henry@zoo.toronto.edu (Henry Spencer) (1996-02-27)
Re: if then else shift/reduce Syndrome neitzel@gaertner.de (1996-03-01)
Re: if then else shift/reduce Syndrome bakul@netcom.com (1996-03-03)
Re: if then else shift/reduce Syndrome (Now expression parsing) theo@engr.mun.ca (1996-03-14)
| List of all articles for this month |
From: bakul@netcom.com (Bakul Shah)
Newsgroups: comp.compilers
Date: 3 Mar 1996 19:44:36 -0500
Organization: NETCOM On-line Communication Services (408 261-4700 guest)
References: 96-02-139 96-02-192 96-02-243 96-02-309
Keywords: yacc, parse

Michael Meissner <meissner@cygnus.com> writes:
>...if you don't have features like %left, %right and/or
>%nonassoc, you need to specify separate rules for each level of
>precedence. Thus a simple statement:
> a = 1;
>Can generate 13 or more reductions in C...


Henry Spencer <henry@zoo.toronto.edu> writes (among other things):
>The way to do it is to short-circuit the common cases, and invoke the
>full machinery only if (for example) that `1' is followed by an
>operator. This does make things messy -- you've got to do some
>backing and filling to cope with that unwelcome operator -- but if
>you're patient and careful, it can be done. The resulting grammar is
>not as clean and elegant as the one in the language spec, especially
>for a language with many precedence levels, but it parses real
>programs much more rapidly.


The grammar can be kept elegant, *and* the implementation clean and
efficient, if one employs a grammar description notation that is
_slightly different_ from the usual BNF (but which has exactly the
same expressive power). The basic idea is to separate OR rules from
AND rules. For instance, here is the grammar for a simple infix
expression language:


    /* OR rules */
    expressions = expression | expression-list
    expression = sum | term
    term = product | factor
    factor = ID | NUMBER | call | nested-expression | negation
    addop = '+' | '-'
    mulop = '*' | '/'


    /* AND rules */
    sum: expression addop term
    product: term mulop factor
    negation: '-' factor
    call: ID '(' expressions ')'
    nested-expression: '(' expression ')'
    expression-list: expressions ',' expression


Informally, any of the elements on the RHS of an OR rule is also the
element on the LHS (e.g. a sum or a term is an expression). An AND
rule says the element on the LHS is the *sequence* on the RHS (e.g. a
sum is an expression followed by an addop followed by a term).


During actual parsing there are *no* reductions for OR rules.
Rudections and transitions are only done for AND rules. For instance,
given a position like


    sum: expression addop term
            ^
transition to position


    sum: expression addop term
                                  ^ is made when any element that eventually matches
the OR rule for an expression is seen (that is, a sum or a term or
product or factor or ...).


A parser generator has to do a little more work but it generates much
more compact tables than Yacc would for an equivalent EBNF grammar.
Computation of the first set, follow set, lookahead set etc. are very
similar to what you would in a normal LALR parser generator. Since
the generator does all the hard work, code like ``a = 1;'' in Michael
Meissner's article will see exactly *one* reduction.


Note that you can do a similar `optimization' in a normal parser
generator by noting which rules have exactly one element on the RHS
but I find the above notation allows one to write cleaner language
descriptions.


I first wrote a parser generator using this notation back in '79 and
later wrote a paper about it but never got around to polishing or
publishing it (but the paper does formally describe things I handwave
here).


-- bakul shah <bakul@netcom.com>
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.