Re: Deep expression chain performance

Kaz Kylheku <864-117-4973@kylheku.com>
Thu, 27 Apr 2023 00:28:55 -0000 (UTC)

          From comp.compilers

Related articles
Deep expression chain performance borucki.andrzej@gmail.com (Andy) (2023-04-25)
Re: Deep expression chain performance anton@mips.complang.tuwien.ac.at (2023-04-26)
Re: Deep expression chain performance gah4@u.washington.edu (gah4) (2023-04-26)
Re: Deep expression chain performance 864-117-4973@kylheku.com (Kaz Kylheku) (2023-04-27)
Re: Deep expression chain performance anton@mips.complang.tuwien.ac.at (2023-04-27)
| List of all articles for this month |

From: Kaz Kylheku <864-117-4973@kylheku.com>
Newsgroups: comp.compilers
Date: Thu, 27 Apr 2023 00:28:55 -0000 (UTC)
Organization: A noiseless patient Spider
References: 23-04-012
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="23892"; mail-complaints-to="abuse@iecc.com"
Keywords: parse, performance
Posted-Date: 26 Apr 2023 20:42:41 EDT

On 2023-04-25, Andy <borucki.andrzej@gmail.com> wrote:
> I am writing C grammar.
> Grammar speed may be down caused by deep expression chain.
> For example, simple "n=0" has 20 levels:
> assignmentExpression
> conditionalExpression
> logicalOrExpression
> logicalAndExpression
> inclusiveOrExpression
> exclusiveOrExpression
> andExpression
> eqaulityExpression
> relationalExpression
> shiftExpression
> additiveExpression
> multiplicativeExpression
> castExpression
> unaryExpression
> postfixExpression
> postfixExpressionLeft
> atom
> literal
> IntegerConstant
> DeciamlConstant


ISO C specifies its grammar in an obtusely verbose way, similar to the
above. The reason is that it makes the grammar unambiguous, meaning
that it does not require any annotation or external remarks about
ambiguities having to be resolved by a certain associativity or
precedence.


If you have some grammar processor which literally follows all those
productions, doing all of their reductions, then it will likely
not perform well.


I suspect even a LALR(1) compiled grammar will have an issue
there; I'm not aware that Yacc-like parser generators do any
collapsing of cascaded reductions which do nothing but { $$ = $1; }.


Here is a concrete example. Attached below is over 200 lines of
y.output from a Yacc implementation (Bison). In that y.output
we can trace the actions of the state machine and see what happens
when a the simple token LIT is reduced to an expression.


In State 0, when the parser sees a LIT, it will shift the token
and go to State 1. There, a reduction takes place via Rule 9,
and so we have a parser_expression on the top of the stack.


Then, back to State 0, where now have a primary_expression,
and so we go to state 8.


State 8, similarly to State 1, performs a reduction:
primary expression to mult_expression.


Back to State 0, where we have to dispatch a similar
thing again for multi_expression, then additive_expression,
then expression and finally top.


Thanks for raising this issue; I suspect it's a real performance
problem. Not only is a factored-out grammar hard to read but the
performance is bad due to useless back and forth transitions between
superfluous states to propagate reductions.


Above-referenced material follows:


Grammar


        0 $accept: top $end


        1 top: expression ';'
        2 | expression ';' expression


        3 expression: additive_expression


        4 additive_expression: mult_expression
        5 | mult_expression '+' additive_expression


        6 mult_expression: primary_expression
        7 | mult_expression '*' primary_expression


        8 primary_expression: '(' expression ')'
        9 | LIT
      10 | IDENT




Terminals, with rules where they appear


$end (0) 0
'(' (40) 8
')' (41) 8
'*' (42) 7
'+' (43) 5
';' (59) 1 2
error (256)
LIT (258) 9
IDENT (259) 10




Nonterminals, with rules where they appear


$accept (10)
        on left: 0
top (11)
        on left: 1 2, on right: 0
expression (12)
        on left: 3, on right: 1 2 8
additive_expression (13)
        on left: 4 5, on right: 3 5
mult_expression (14)
        on left: 6 7, on right: 4 5 7
primary_expression (15)
        on left: 8 9 10, on right: 6 7




state 0


        0 $accept: . top $end


        LIT shift, and go to state 1
        IDENT shift, and go to state 2
        '(' shift, and go to state 3


        top go to state 4
        expression go to state 5
        additive_expression go to state 6
        mult_expression go to state 7
        primary_expression go to state 8




state 1


        9 primary_expression: LIT .


        $default reduce using rule 9 (primary_expression)




state 2


      10 primary_expression: IDENT .


        $default reduce using rule 10 (primary_expression)




state 3


        8 primary_expression: '(' . expression ')'


        LIT shift, and go to state 1
        IDENT shift, and go to state 2
        '(' shift, and go to state 3


        expression go to state 9
        additive_expression go to state 6
        mult_expression go to state 7
        primary_expression go to state 8




state 4


        0 $accept: top . $end


        $end shift, and go to state 10




state 5


        1 top: expression . ';'
        2 | expression . ';' expression


        ';' shift, and go to state 11




state 6


        3 expression: additive_expression .


        $default reduce using rule 3 (expression)




state 7


        4 additive_expression: mult_expression .
        5 | mult_expression . '+' additive_expression
        7 mult_expression: mult_expression . '*' primary_expression


        '+' shift, and go to state 12
        '*' shift, and go to state 13


        $default reduce using rule 4 (additive_expression)




state 8


        6 mult_expression: primary_expression .


        $default reduce using rule 6 (mult_expression)




state 9


        8 primary_expression: '(' expression . ')'


        ')' shift, and go to state 14




state 10


        0 $accept: top $end .


        $default accept




state 11


        1 top: expression ';' .
        2 | expression ';' . expression


        LIT shift, and go to state 1
        IDENT shift, and go to state 2
        '(' shift, and go to state 3


        $default reduce using rule 1 (top)


        expression go to state 15
        additive_expression go to state 6
        mult_expression go to state 7
        primary_expression go to state 8




state 12


        5 additive_expression: mult_expression '+' . additive_expression


        LIT shift, and go to state 1
        IDENT shift, and go to state 2
        '(' shift, and go to state 3


        additive_expression go to state 16
        mult_expression go to state 7
        primary_expression go to state 8




state 13


        7 mult_expression: mult_expression '*' . primary_expression


        LIT shift, and go to state 1
        IDENT shift, and go to state 2
        '(' shift, and go to state 3


        primary_expression go to state 17




state 14


        8 primary_expression: '(' expression ')' .


        $default reduce using rule 8 (primary_expression)




state 15


        2 top: expression ';' expression .


        $default reduce using rule 2 (top)




state 16


        5 additive_expression: mult_expression '+' additive_expression .


        $default reduce using rule 5 (additive_expression)




state 17


        7 mult_expression: mult_expression '*' primary_expression .


        $default reduce using rule 7 (mult_expression)




--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.