Re: additional regular expression operators

zaimoni@zaimoni.com
Mon, 30 Mar 2009 09:19:16 -0700 (PDT)

          From comp.compilers

From: zaimoni@zaimoni.com
Newsgroups: comp.compilers
Date: Mon, 30 Mar 2009 09:19:16 -0700 (PDT)
Organization: Compilers Central
References: 09-03-111
Keywords: lex
Posted-Date: 30 Mar 2009 14:00:37 EDT

On Mar 30, 1:49 am, Ralph Boland <rpbol...@gmail.com> wrote:
> I am building a tool for translating regular expressions
> into finite state machines and will eventually be building
> a parser generator tool that will use
> regular expressions for the scanner generator and
> also on the right side of productions.
> I plan to provide support for at least three additional
> unary operators to the standard three (?,*, and +)
> (plus new binary operators but that's a topic for another day).
> For a regular expression R they are:
>
> R! :does not accept the empty string but otherwise
> accepts every expression that R does. Very useful.
>
> R~ :accepts exactly the set of strings that R does not accept.
> note that R = R~~.
>
> R% : equivalent to the string (R~)R
>
> I have put these operators after the expression but in fact I prefer
> to have them in front
> because:
> 1) I prefer to have unary operators in front of their expressions
> as in "-5" rather than "5-".
> Yes, I know, what about 5! (5 factorial).
> 2) Parsing is easier and faster if the unary operator precedes
> the expression.


Not in my experience when hand-writing parsers. The difference
between prefix and postfix operators is negligible when all tokens are
available at once (one linear scan per precedence level "works" then),
but that is not how a typical lexer is most easily used: it returns one
token at a time, left to right.
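
To illustrate the "all tokens available at once" case, here is a minimal
Python sketch. The operator sets and function names are hypothetical, and
it handles only atoms and unary operators. With the whole token list in
hand, postfix operators fold in one left-to-right scan and prefix
operators fold in one right-to-left scan, so neither is harder than the
other.

```python
# Hypothetical operator sets, chosen only for illustration.
POSTFIX = {"?", "*", "+"}
PREFIX = {"!", "~"}

def fold_postfix(tokens):
    """Left-to-right scan: each postfix operator wraps the node before it."""
    out = []
    for tok in tokens:
        if tok in POSTFIX:
            out.append((tok, out.pop()))
        else:
            out.append(tok)
    return out

def fold_prefix(tokens):
    """Right-to-left scan: each prefix operator wraps the node after it."""
    out = []
    for tok in reversed(tokens):
        if tok in PREFIX:
            out.append((tok, out.pop()))
        else:
            out.append(tok)
    out.reverse()
    return out
```

The right-to-left scan relies on reversed(), which needs the complete
token list.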


In that case, unambiguous postfix unary operators with very high
precedence can be applied the instant they're seen, because their operand
is already complete. It's very hard to be faster or more efficient than
that.
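
A rough Python sketch of the one-token-at-a-time case follows. The lexer,
the operator set, and the function names are assumptions made for
illustration, and grouping and binary operators are left out. The point
is that the parser keeps no lookahead and no pending-operator state: when
a postfix operator arrives, its operand is the last item built.

```python
# Hypothetical set of high-precedence postfix operators.
POSTFIX = {"?", "*", "+"}

def lex(text):
    """Hypothetical lexer: yields single-character tokens one at a time."""
    for ch in text:
        if not ch.isspace():
            yield ch

def parse_sequence(tokens):
    """Consume tokens strictly left to right, without lookahead.

    A postfix operator is applied as soon as it is seen, to the most
    recently completed item.
    """
    items = []
    for tok in tokens:
        if tok in POSTFIX:
            if not items:
                raise SyntaxError(f"postfix operator {tok!r} has no operand")
            items[-1] = (tok, items[-1])
        else:
            items.append(tok)
    return items
```

For example, parse_sequence(lex("ab*c+?")) builds the items a, (b)*, and
((c)+)? in a single pass over the token stream.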


> But the standard unary operator always follow the expression
> in every example I have seen (OK, I have seen one exception).


See above for what strongly motivates this.


