Nice trick for "superfix" relational operators operator precedence parsing.

Kaz Kylheku <643-408-1753@kylheku.com>
Wed, 07 May 2025 02:44:45 -0000

From comp.compilers

Related articles
*Nice trick for "superfix" relational operators operator precedence parsing. 643-408-1753@kylheku.com (Kaz Kylheku)* (2025-05-07)**

| List of all articles for this month |

From:	Kaz Kylheku <643-408-1753@kylheku.com>
Newsgroups:	comp.compilers
Date:	Wed, 07 May 2025 02:44:45 -0000
Organization:	Compilers Central
Injection-Info:	gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="55436"; mail-complaints-to="abuse@iecc.com"
Keywords:	syntax, history, comment
Posted-Date:	07 May 2025 12:23:49 EDT

I wanted to have compounding relational operators (which I call
"superfix") in an operator precedence parser. This means that
syntax like:

    1 < j < k <= N

behaves as one comparison operation meaning
(1 < j) and (j < k) and (k <= N), except that the j and k
terms are not evaluated twice. There may be short-circuiting
of the logical and.

We want parentheses to override this, so that

    1 < (j < k) <= N

is treated as

    1 < T < N

where T is a single term consisting of (j < k).

Anyway, I came up with a nice solution for bringing this about
without complicating the Shunting-Yard-based operator precedence
parser.

1. I put all the relational operators into a single level of
      precedence, and made them right-associative.

2. A pass is made through the resulting syntax tree, whic
      recognizes the right-associative tree patterns involving these
      operators and rewrites them to the "and" terms.

We can see this live:

1> (parse-infix '(1 < i < j <= N))
(< 1 (< i (<= j N)))

2> (sys:finish-infix *1)
(and (< 1 i j)
    (<= j N))

(In my actual implementation, the < function is n-ary so we reduce
to a single (< 1 i j) term. Because this is a strictly evaluated
function call, it means that even if (< 1 i) is false, j is
still evaluated, unlike with (and (< 1 i) (< i j))).

Tree pattern matching matchin and rewriting is used to handle patterns
like (relop1 A (relop2 B C)) into (and (relop1 A B) (relop2 B C)).

Oh, about multiple evaluation, let's replace i with (inc i),
an expression wth a side-effect of pre-incrementing i (returning
the new value) which must happen once:

1> (parse-infix '(1 < (inc i) < j <= N))
(< 1 (< (inc i) (<= j N)))
2> (sys:finish-infix *1)
(let (#:g0014)
    (and (< 1 (set #:g0014
                            (inc i)))
        (< #:g0014 j)
        (<= j N)))

Code is emitted to bind a temporary hidden variable, which receives
the result of the first (inc i). The subsequent term that would
be an occurrence of (inc i) under the < function instead refers
to that variable. The strategy both preserves left-to-right evaluation
of all terms, and also depends on it or correctness.

Other than adjusting the declarations of associativity and precedence, nothing
had to be done in the parser; all the work in in the separate "finish infix"
pass.

It is then easy to document that way, too; the parsing model is very clear,
requiring no concepts beyond binary operator precedence, and the heuristics of
the "superfix" transformation are easy to clearly document also.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
[COBOL had roughly that shortcut comparison syntax in 1960 so a lot of people
must have looked at this. -John]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Nice trick for "superfix" relational operators operator precedence parsing.

Kaz Kylheku <643-408-1753@kylheku.com>Wed, 07 May 2025 02:44:45 -0000

Kaz Kylheku <643-408-1753@kylheku.com>
Wed, 07 May 2025 02:44:45 -0000