Related articles
How to rewrite a regexp without word boundaries? dave_140390@hotmail.com (2009-07-05)
Re: How to rewrite a regexp without word boundaries? h.b.furuseth@usit.uio.no (Hallvard B Furuseth) (2009-07-05)
Re: How to rewrite a regexp without word boundaries? dave_140390@hotmail.com (2009-07-06)
Re: How to rewrite a regexp without word boundaries? haberg_20080406@math.su.se (Hans Aberg) (2009-07-07)
Re: How to rewrite a regexp without word boundaries? h.b.furuseth@usit.uio.no (Hallvard B Furuseth) (2009-07-07)
Re: How to rewrite a regexp without word boundaries? andrew@tomazos.com (Andrew Tomazos) (2009-07-07)
Re: How to rewrite a regexp without word boundaries? cfc@shell01.TheWorld.com (Chris F Clark) (2009-07-13)
Re: How to rewrite a regexp without word boundaries? hu47121@usenet.kitty.sub.org (2009-08-16)
Re: How to rewrite a regexp without word boundaries? dot@dotat.at (Tony Finch) (2009-08-16)
From: Andrew Tomazos <andrew@tomazos.com>
Newsgroups: comp.compilers,comp.theory
Date: Tue, 7 Jul 2009 09:33:45 -0700 (PDT)
Organization: Compilers Central
References: 09-07-003 09-07-004 09-07-008
Keywords: lex
Posted-Date: 10 Jul 2009 18:39:06 EDT
On Jul 6, 10:43 pm, dave_140...@hotmail.com wrote:
> > > I have been wondering, with limited success, how to rewrite a regexp
> > > without word boundaries.
>
> > Why do you want to? Most likely, the answer is that your regexps are
> > getting too clever and thus too unreadable/bug-prone, so you should
> > break them up and use more ordinary programming instead.
>
> The regexps are not mine... Sorry, I should have explained. I am
> actually writing a tool that takes regexps as input and transforms
> them internally into NFAs/DFAs. Since the regexps are not really in my
> hands, I should be ready for weird regexps - for example, regexps with
> "\b" preceded or followed by other regexps. And I don't know how to
> transform a regexp that contains "\b" at an arbitrary position into an
> equivalent NFA/DFA.
Why don't you study Perl's regex engine to see how they implement it?
It is all open source. I did this a while ago, and it is very
interesting to look at. Perl has about the most advanced and heavily
used regex engine of anything out there.
Also, one thing to note is that in formal language theory "regular
expression" has a well-defined meaning. See Chomsky. Perl's regular
expressions do *not* qualify as regular expressions under the formal
definition: features such as backreferences let them match languages
that are not regular.
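
For what it is worth, "\b" on its own does not seem to leave the formal
definition: whether a position is a word boundary depends only on the
character just before it and the character just after it, which is a
finite amount of information. Below is a rough Python sketch of that
idea, not how Perl or any real tool does it. The edge-list NFA
encoding, the BOUNDARY marker, and the names is_word, closure and
full_match are all made up for illustration, and "word character" is
assumed to mean whatever str.isalnum accepts, plus underscore.

```python
# Hypothetical mini-NFA: states are ints, edges are (src, label, dst).
# A label is a single character, or BOUNDARY for a zero-width "\b".
BOUNDARY = object()

def is_word(ch):
    # Assumption: word characters are what str.isalnum accepts, plus "_".
    # None stands for "before the start" or "after the end" of the text.
    return ch is not None and (ch.isalnum() or ch == "_")

def closure(states, edges, prev, nxt):
    # A "\b" edge may be followed only where exactly one neighbour
    # of the current position is a word character.
    at_boundary = is_word(prev) != is_word(nxt)
    seen = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for src, label, dst in edges:
            if src == s and label is BOUNDARY and at_boundary and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def full_match(edges, start, accept, text):
    # Simulate the NFA over the whole text, remembering only the
    # previous character (one extra piece of finite state).
    prev = None
    current = {start}
    for ch in text:
        current = closure(current, edges, prev, ch)
        current = {dst for src, label, dst in edges
                   if src in current and label == ch}
        prev = ch
    current = closure(current, edges, prev, None)
    return accept in current

# Hand-built NFA for a\b- : 'a', then a word boundary, then '-'.
edges_a_b_dash = [(0, "a", 1), (1, BOUNDARY, 2), (2, "-", 3)]
# Hand-built NFA for a\bb : can never match, since 'a' and 'b' are both word characters.
edges_a_b_b = [(0, "a", 1), (1, BOUNDARY, 2), (2, "b", 3)]
```

With these definitions, full_match(edges_a_b_dash, 0, 3, "a-") should
succeed and full_match(edges_a_b_b, 0, 3, "ab") should fail. Because
the closure step needs only the previous character's class and the next
input character, a subset construction over (state set, previous
character was a word character) pairs ought to yield an ordinary DFA.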
-Andrew.
[Perl's regex engine is swell, but if performance is an issue it's nowhere
near as fast as a DFA generated by flex or re2c. -John]