Re: Prior art for operator parsing trick

Kaz Kylheku <643-408-1753@kylheku.com>
Sun, 06 Apr 2025 17:18:42 -0000

          From comp.compilers

Related articles
Prior art for operator parsing trick 643-408-1753@kylheku.com (Kaz Kylheku) (2025-04-05)
Re: Prior art for operator parsing trick antispam@fricas.org (2025-04-06)
Re: Prior art for operator parsing trick 643-408-1753@kylheku.com (Kaz Kylheku) (2025-04-06)
| List of all articles for this month |
From: Kaz Kylheku <643-408-1753@kylheku.com>
Newsgroups: comp.compilers
Date: Sun, 06 Apr 2025 17:18:42 -0000
Organization: Compilers Central
References: 25-04-002 25-04-003
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="92507"; mail-complaints-to="abuse@iecc.com"
Keywords: parse
Posted-Date: 06 Apr 2025 14:14:06 EDT

On 2025-04-06, antispam@fricas.org <antispam@fricas.org> wrote:
>> Someone must have come up with exactly the same thing before in all the
>> decades we have been doing this; but where have they written it up?
>
> Do you mean that
>
> sin x + y + - - sin z
>
> should have the same meaning as
>
> sin(x + y) + (- (- sin(z)))
>
> but
>
> sin x + y + - - z
>
> should have the same meaning as
>
> sin(x + y + (- (- z)))


Your understanding is correct. I have assigned (as I mentioned) a
precedence to the prefix - (a.k.a unary minus) which is higher than
that of the binary +. Therefore, no precedence demotion of +
takes place in the second example.


1> (parse-infix '(sin x + y + - - sin z))
(+ (sin (+ x y))
      (- (- (sin z))))


2> (parse-infix '(sin x + y + - - z))
(sin (+ (+ x y) (- (- z))))


> and similarly for longer chains of '-' signs? If yes, this needs
> unbounded lookahead,


That is correct. If we have a chain of 10,000 consecutive prefix
operators immediately after an infix operator, they all have to be
examined to find the lowest precedence.


> which make any method to handle it rather
> nonstandard.


I think, say, LALR(1) can't handle this even if we just look ahead at
one prefix operator, because of the dynamic precedence reassignment.
LALR(1) wants to reduce its stack based on the identity of the lookahead
token. The unmodified algorithm has no provision for conveying some
property of the loakahead token such that it temporarily rearranges some
aspect of the grammar.


> And frankly, such parsing looks rather obscure,
> so is is possible that there are no prior art in this specific case.
>
> There is paper "Noncanonical Extensions of LR Parsing Methods" by
> M. Hutton:
>
> https://www.eecg.toronto.edu/~mdhutton/papers/noncan.pdf
>
> and I think that parsing expressions in your way is an example
> of "LR-Regular Parsing". However, what Hutton presents is
> much more general and described in terms of LR-automata and
> not in terms of precedence.


Thank you very much; I will have a look.


--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.