Mon, 30 Mar 2009 09:19:16 -0700 (PDT)

From: | zaimoni@zaimoni.com |

Newsgroups: | comp.compilers |

Date: | Mon, 30 Mar 2009 09:19:16 -0700 (PDT) |

Organization: | Compilers Central |

References: | 09-03-111 |

Keywords: | lex |

Posted-Date: | 30 Mar 2009 14:00:37 EDT |

On Mar 30, 1:49 am, Ralph Boland <rpbol...@gmail.com> wrote:

*> I am building a tool for translating regular expressions*

*> into finite state machines and will eventually be building*

*> a parser generator tool that will use*

*> regular expressions for the scanner generator and*

*> also on the right side of productions.*

*> I plan to provide support for at least three additional*

*> unary operators to the standard three (?,*, and +)*

*> (plus new binary operators but that's a topic for another day).*

*> For a regular expression R they are:*

*>*

*> R! :does not accept the empty string but otherwise*

*> accepts every expression that R does. Very useful.*

*>*

*> R~ :accepts exactly the set of strings that R does not accept.*

*> note that R = R~~.*

*>*

*> R% : equivalent to the string (R~)R*

*>*

*> I have put these operators after the expression but in fact I prefer*

*> to have them in front*

*> because:*

*> 1) I prefer to have unary operators in front of their expressions*

*> as in "-5" rather than "5-".*

*> Yes, I know, what about 5! (5 factorial).*

*> 2) Parsing is easier and faster if the unary operator precedes*

*> the expression.*

Not in my experience when hand-writing parsers. The difference between prefix and postfix operators is negligible when all tokens are available at once (one linear scan per precedence level "works" then), but that is not the easy way to use a typical lexer, which returns one token at a time, left to right.

In that case, an unambiguous postfix unary operator with very high precedence can be applied to its operand the instant it is seen. It is very hard to be faster or more efficient than that.
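To make the point concrete, here is a minimal sketch of a hand-written recursive-descent regex parser that pulls one token at a time and wraps the operand in a postfix operator as soon as that operator shows up as the lookahead. Everything named here is an assumption for illustration only: the `Parser` class, the `POSTFIX_OPS` set, the tuple-shaped syntax tree, and the one-character-per-token stand-in lexer are not taken from any tool mentioned in this thread.

```python
from typing import Iterator, Optional, Union

# Hypothetical operator set: the standard ?, *, + plus the proposed !, ~, %.
POSTFIX_OPS = set("?*+!~%")

# Hypothetical tree shape: a literal is a str, everything else is a tuple.
Node = Union[str, tuple]


class Parser:
    def __init__(self, text: str) -> None:
        # Stand-in lexer: each character is one token, delivered left to right.
        self._tokens: Iterator[str] = iter(text)
        self._look: Optional[str] = next(self._tokens, None)

    def _advance(self) -> Optional[str]:
        # Return the current token and pull exactly one more from the lexer.
        tok = self._look
        self._look = next(self._tokens, None)
        return tok

    def parse(self) -> Node:
        node = self._alternation()
        if self._look is not None:
            raise SyntaxError(f"unexpected token {self._look!r}")
        return node

    def _alternation(self) -> Node:
        node = self._concatenation()
        while self._look == "|":
            self._advance()
            node = ("alt", node, self._concatenation())
        return node

    def _concatenation(self) -> Node:
        node = self._postfix()
        while self._look is not None and self._look not in "|)":
            node = ("cat", node, self._postfix())
        return node

    def _postfix(self) -> Node:
        node = self._atom()
        # The operand is already complete, so each postfix operator is
        # applied the moment it is seen: no backtracking, no deferred work.
        while self._look is not None and self._look in POSTFIX_OPS:
            node = (self._advance(), node)
        return node

    def _atom(self) -> Node:
        tok = self._advance()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            node = self._alternation()
            if self._advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok in POSTFIX_OPS or tok in "|)":
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok
```

For example, `Parser("(a|b)~!").parse()` builds the tree for `(a|b)` first and then wraps it once per operator in the order the operators arrive, with a single token of lookahead throughout.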

*> But the standard unary operator always follow the expression*

*> in every example I have seen (OK, I have seen one exception).*

See above for what strongly motivates this.
