|Regular expression search algorithm firstname.lastname@example.org (Harry) (2002-07-15)|
|Re: Regular expression search algorithm email@example.com (Joachim Durchholz) (2002-07-21)|
|Re: Regular expression search algorithm firstname.lastname@example.org (Ralph Corderoy) (2002-07-21)|
|Re: Regular expression search algorithm email@example.com (Peter S Tillier) (2002-07-24)|
|Re: Regular expression search algorithm firstname.lastname@example.org (Aharon Robbins) (2002-09-03)|
|regular expression search algorithm email@example.com (1993-03-18)|
|Re: regular expression search algorithm firstname.lastname@example.org (1993-03-19)|
|From:||"Joachim Durchholz" <email@example.com>|
|Date:||21 Jul 2002 01:52:37 -0400|
|Posted-Date:||21 Jul 2002 01:52:37 EDT|
> 1. I have read the boyer moore algo in the intro to algo by
> corman, rivest, leiserson.... however they taked about only
> the simple patterns ...
> How to acomodate the regular expression in the pattern ?
Regex searches work very differently than Boyer-Moore. The effect that
makes the Boyer-Moore algorithm fast is implicitly part of the regex
Regex searching is usually done by compiling the regex into a
deterministic finite automaton, then feeding the input to the regex.
(See Aho/Hopcroft/Ullman: Compiler Construction for a hands-on
introduction into DFAs. The book is also known as the "Dragon book"
since is sports a dragon and a knight on the front title.)
> Is there any varation of the algorithm for the same ?...
There are optimizations for various common cases (searching for any of a
number of words and such). Under Unix, they have been compiled into the
*grep commands (egrep, ngrep (sp?) etc.).
In practice, you need these only if speed is of utmost importance; a
good regexp library tends to be fast enough for most practical applications.
> and is there anyplace I can get the code.
Take a look at Henry Spencer's regex library. It's tested, fast, and
flexible. (It's part of the bowels of the run-time libraries that come
with my employer's programming languages, and we never had any real
problems with it.)
The only reasons why one might want to look further might be a different
regexp syntax. That would probably not be a good idea though - there are
already too many variations of the regexp syntax around.
Just my 5c.
Return to the
Search the comp.compilers archives again.