Re: Regular expression grammar?

terryg@uswest.net (Terry Greyzck)
20 Sep 1999 12:00:07 -0400

          From comp.compilers

Related articles
Regular expression grammar? bediger@teal.csn.net (Bruce Ediger) (1999-09-16)
Re: Regular expression grammar? jjan@cs.rug.nl (J.H.Jongejan) (1999-09-20)
Re: Regular expression grammar? terryg@uswest.net (1999-09-20)
Re: Regular expression grammar? lex@cc.gatech.edu (1999-09-20)
Re: Regular expression grammar? cbarron3@ix.netcom.com (1999-09-20)
Re: Regular expression grammar? zalman@netcom18.netcom.com (Zalman Stern) (1999-09-24)
Re: Regular expression grammar? zalman@netcom18.netcom.com (Zalman Stern) (1999-09-24)
Re: Regular expression grammar? zalman@netcom18.netcom.com (Zalman Stern) (1999-09-24)
Re: Regular expression grammar? cbarron3@ix.netcom.com (1999-09-27)
| List of all articles for this month |

From: terryg@uswest.net (Terry Greyzck)
Newsgroups: comp.compilers
Date: 20 Sep 1999 12:00:07 -0400
Organization: Posted via Supernews, http://www.supernews.com
References: 99-09-051
Keywords: lex

It's much easier if you treat regular expressions as Finite State
Automata, rather than trying to treat them as a high level language.
Certainly that is the approach I have taken in the past; I simply
built a dynamic FSA from the regular expression and used the FSA to do
the matching.


You would be better off using lex than yacc, although I'm not certain
an unaugumented lex is quite up to the task, since lex is really best
suited for static analysis. I've always had to build my own regular
expression "parser".. As indicated, alternatives pose an interesting
problem, which are straight forward to process using nested FSA's.


1. Define and build an extended FSA interpreter that is sufficient to process
          the necessary regular expressions
2. Build a translator ("parser") to convert the regular expression
          into a compatible extended FSA


You can also generate code directly from the translator if you are after the
utmost in execution speed, but that limits portability. If execution speed
isn't an issue, it may be interesting to attempt dynamically generating a lex
specification from the regular expression, as an academic exercise.


FSA, Finite State Automata, is sometimes referred to as Finite State Machines,
or FSM.


-- Terry Greyzck


Bruce Ediger <bediger@teal.csn.net> wrote:


>Dr Dobb's Journal, April 1999 carried an article by Brian Kernighan
>and Rob Pike titled "Regular Expressions". I read the article, and
>I was struck by the beginning sentences of the second-to-last paragraph:
>
> For regular expressions with parentheses and alternatives,
> the implementation must be more sophisticated. One approach
> is to compile the regular expression into a parse tree that
> captures its grammatical structure.
>
>How the heck do they do that compilation? I've tried to come up with
>a "yacc" grammar for regular expressions based on the definition of
>"regular expression" in "Automata and Formal Languages" by Dean
>Kelley, ISBN 0-13-497777-7, Prentice Hall, 1995.


terryg@uswest.net


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.