additional regular expression operators

Ralph Boland <rpboland@gmail.com>
Sun, 29 Mar 2009 23:49:30 -0700 (PDT)

          From comp.compilers


From: Ralph Boland <rpboland@gmail.com>
Newsgroups: comp.compilers
Date: Sun, 29 Mar 2009 23:49:30 -0700 (PDT)
Organization: Compilers Central
Keywords: lex, question
Posted-Date: 30 Mar 2009 08:44:58 EDT

I am building a tool for translating regular expressions
into finite state machines and will eventually be building
a parser generator tool that will use
regular expressions for the scanner generator and
also on the right side of productions.
I plan to provide support for at least three additional
unary operators to the standard three (?, *, and +)
(plus new binary operators but that's a topic for another day).
For a regular expression R they are:


        R! : does not accept the empty string but otherwise
             accepts every string that R does. Very useful.


        R~ : accepts exactly the set of strings that R does not accept.
             Note that R = R~~.


        R% : equivalent to the expression (R~)R
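
The three definitions above can be sketched by treating a language as a
membership test on strings. This is only an illustration of the intended
semantics, not how the tool builds its finite state machines; the names
regex, nonempty, complement, concat and percent are made up for this
sketch, and Python's re module stands in for the base expression R.

```python
import re
from typing import Callable

# A language is modelled as a membership test: string -> bool.
Lang = Callable[[str], bool]


def regex(pattern: str) -> Lang:
    """Wrap a standard Python regex as a whole-string membership test."""
    compiled = re.compile(pattern)
    return lambda s: compiled.fullmatch(s) is not None


def nonempty(r: Lang) -> Lang:
    """R! : everything R accepts except the empty string."""
    return lambda s: s != "" and r(s)


def complement(r: Lang) -> Lang:
    """R~ : exactly the strings R does not accept."""
    return lambda s: not r(s)


def concat(a: Lang, b: Lang) -> Lang:
    """AB : s is accepted if some split s = uv has u in A and v in B."""
    return lambda s: any(a(s[:i]) and b(s[i:]) for i in range(len(s) + 1))


def percent(r: Lang) -> Lang:
    """R% : defined in the post as (R~)R."""
    return concat(complement(r), r)
```

For example, with r = regex("a*"), nonempty(r) rejects "" but accepts
"aa", complement(r) accepts "b", and percent(r) accepts "ba" (split as
"b" followed by "a") but rejects "aaa", since every prefix of "aaa" is
itself in a*.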


I have put these operators after the expression but in fact I prefer
to have them in front
because:
        1) I prefer to have unary operators in front of their expressions
as in "-5" rather than "5-".
                  Yes, I know, what about 5! (5 factorial).
        2) Parsing is easier and faster if the unary operator precedes
the expression.


But the standard unary operators always follow the expression
in every example I have seen (OK, I have seen one exception).


I have four choices:
        a) Have all unary operators follow their expressions.
        b) Have all unary operators precede their expressions.
        c) Leave the original unary operators alone but have the
                new unary operators precede their expressions.
                The original unary operators take precedence.
        d) Give the user the ability to choose which to use.


      a) frustrates me because I am then extending (in my mind) a bad
decision.
      b) worries me because users may not like it.
      c) is just weird.
      d) is too complex and I don't think the user wants this feature.
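
To make choice c) concrete, here is a minimal recursive-descent sketch in
which the new operators (!, ~, %) are written in front and the standard
ones (?, *, +) behind, with the standard ones binding tighter. The
grammar, the function name parse, the use of | for alternation, and the
nested-tuple tree are all assumptions made for this sketch; operator
characters cannot be escaped here.

```python
PREFIX_OPS = "!~%"    # the new operators, written in front (choice c)
POSTFIX_OPS = "?*+"   # the standard operators, written behind


def parse(text: str):
    """Parse text into a nested-tuple tree; raise ValueError on bad input."""
    pos = 0

    def peek():
        return text[pos] if pos < len(text) else None

    def take():
        nonlocal pos
        ch = text[pos]
        pos += 1
        return ch

    def alternation():
        node = concatenation()
        while peek() == "|":
            take()
            node = ("|", node, concatenation())
        return node

    def concatenation():
        node = prefixed()
        while peek() is not None and peek() not in "|)":
            node = ("cat", node, prefixed())
        return node

    def prefixed():
        # A prefix operator is seen before its operand is parsed.
        if peek() is not None and peek() in PREFIX_OPS:
            op = take()
            return (op, prefixed())
        return postfixed()

    def postfixed():
        node = atom()
        # A postfix operator is only seen after its operand is parsed,
        # and is applied before any enclosing prefix operator.
        while peek() is not None and peek() in POSTFIX_OPS:
            node = (take(), node)
        return node

    def atom():
        ch = peek()
        if ch is None or ch in "|)" + PREFIX_OPS + POSTFIX_OPS:
            raise ValueError(f"unexpected {ch!r} at position {pos}")
        take()
        if ch == "(":
            node = alternation()
            if peek() != ")":
                raise ValueError(f"expected ')' at position {pos}")
            take()
            return node
        return ("lit", ch)

    tree = alternation()
    if pos != len(text):
        raise ValueError(f"unexpected {text[pos]!r} at position {pos}")
    return tree
```

Under these rules parse("~a*") yields ("~", ("*", ("lit", "a"))), that is
the complement of a*, while parse("(~a)*") yields
("*", ("~", ("lit", "a"))). Choices a) and b) would amend only
prefixed() and postfixed().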


Before I decide I would like to hear opinions on the matter.


Suggestions as to further operators to support are welcome, as are
suggestions for the best symbols to use for the new operators.


Ralph Boland


