How to do this odd kind of regex match? email@example.com (Tony Finch) (2002-07-15)
Re: How to do this odd kind of regex match? firstname.lastname@example.org (Michael Parker) (2002-07-21)
Re: How to do this odd kind of regex match? email@example.com (Joachim Durchholz) (2002-07-21)
Re: How to do this odd kind of regex match? Martin.Ward@durham.ac.uk (Martin Ward) (2002-07-21)
Re: How to do this odd kind of regex match? firstname.lastname@example.org (Simon Cozens) (2002-07-24)
From: "Martin Ward" <Martin.Ward@durham.ac.uk>
Date: 21 Jul 2002 02:08:14 -0400
Posted-Date: 21 Jul 2002 02:08:14 EDT
"Tony Finch" <email@example.com> writes:
> I'd also like to be able to match several regexes against the same
> text in parallel,
> (The aim is to speed up heuristic spam detection such as SpamAssassin.)
If you are matching a text against a huge number of regexps,
most of which contain words or phrases, then you might get
more benefit from preprocessing the text. Build a hash table
with the locations of all the 2-, 3-, 4- (or more) letter sequences.
Then, to match against a regexp containing the word "porn"
(say), you look up "porn" in the table and get the list of character
offsets at which that 4-character string occurs in the text.
If the regexp cannot match without that word and the list is empty,
the regexp can be skipped without running it at all.
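
A minimal sketch of this idea in Python, for illustration only. The
function names (build_ngram_index, match_with_prefilter) and the sample
text are made up for this example, and the caller is assumed to supply a
literal that every match of the regexp must contain; picking that
literal out of the regexp automatically is not shown.

```python
import re
from collections import defaultdict

def build_ngram_index(text, sizes=(2, 3, 4)):
    """Map every substring of the given lengths to the offsets where it starts."""
    index = defaultdict(list)
    for n in sizes:
        for i in range(len(text) - n + 1):
            index[text[i:i + n]].append(i)
    return index

def match_with_prefilter(index, text, literal, pattern):
    """Run `pattern` only if its required `literal` occurs in the text.

    Assumption: every match of `pattern` contains `literal`, and
    len(literal) is one of the sizes used to build the index.
    """
    offsets = index.get(literal, [])
    if not offsets:
        return None  # literal absent, so the regexp cannot match
    return re.search(pattern, text)

# Hypothetical sample text; index it once, then test many regexps against it.
text = "cheap pills and free porn, click here for more porn"
index = build_ngram_index(text)

offsets = index.get("porn", [])
hit = match_with_prefilter(index, text, "porn", r"free\s+porn")
miss = match_with_prefilter(index, text, "loan", r"cheap\s+loans?")
```

The index is built once per message, so its cost is shared across all
the regexps; each regexp whose literal is missing is rejected with a
single hash lookup. A further refinement would be to start the regexp
search near the returned offsets instead of scanning the whole text.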
Martin.Ward@durham.ac.uk http://www.cse.dmu.ac.uk/~mward/ Erdos number: 4