Related articles |
---|
How to do this odd kind of regex match? dot@dotat.at (Tony Finch) (2002-07-15) |
Re: How to do this odd kind of regex match? michaelparker@earthlink.net (Michael Parker) (2002-07-21) |
Re: How to do this odd kind of regex match? joachim_d@gmx.de (Joachim Durchholz) (2002-07-21) |
Re: How to do this odd kind of regex match? Martin.Ward@durham.ac.uk (Martin Ward) (2002-07-21) |
Re: How to do this odd kind of regex match? simon.cozens@computing-services.oxford.ac.uk (Simon Cozens) (2002-07-24) |
From: | "Simon Cozens" <simon.cozens@computing-services.oxford.ac.uk> |
Newsgroups: | comp.compilers |
Date: | 24 Jul 2002 01:48:32 -0400 |
Organization: | Bethnal Green is PEOPLE! |
References: | 02-07-078 |
Keywords: | lex |
Posted-Date: | 24 Jul 2002 01:48:32 EDT |
"Martin Ward" <Martin.Ward@durham.ac.uk> writes:
> more benefit from preprocessing the text. Build a hash table
> with the locations of all the 2, 3, 4 (or more) letter sequences.
> Then, to match against a regexp containing the word "porn"
> (say), you look up "porn" in the table and get the list of character
> offsets of locations of that 4 character string in the text.
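A minimal sketch of the preprocessing described above may make it concrete. The function names and the choice of 2, 3 and 4 character sequences are assumptions for illustration only, not anything taken from Martin's code:

```python
from collections import defaultdict

def build_ngram_index(text, sizes=(2, 3, 4)):
    """Map every substring of the given lengths to the offsets where it starts."""
    index = defaultdict(list)
    for n in sizes:
        for i in range(len(text) - n + 1):
            index[text[i:i + n]].append(i)
    return index

def candidate_offsets(index, literal):
    """Offsets at which `literal` occurs; only meaningful for indexed lengths."""
    return index.get(literal, [])

text = "xxxx abcdefoo"
index = build_ngram_index(text)
print(candidate_offsets(index, "foo"))  # prints [10]
```

The index is built once per text, so its cost is only repaid when many regexps are matched against the same text.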
This is essentially what the Perl RE engine does: it performs an FBM
(Fast Boyer-Moore) analysis on the text, and then anchors parts of an
RE by doing an FBM search for portions of the text. For instance, given
/\w{3,5}foo/ and "xxxx abcdefoo", Perl does this:
    floating `foo' at 3..5 (checking floating) stclass `ALNUM' minlen 6
    Guessing start of match, REx `\w{3,5}foo' against `xxxx abcdefoo'...
    Found floating substr `foo' at offset 10...
    Starting position does not contradict /^/m...
    Does not contradict STCLASS...
    Guessed: match at offset 5
and starts hunting at the "a" (offset 5).
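As a rough illustration of the idea in that trace (not of Perl's actual implementation), here is a sketch that looks for the literal first and only then tries the full pattern at the start positions the literal's offset allows. The function name and its parameters are hypothetical, `str.find` stands in for the FBM search, and the STCLASS check is not modelled:

```python
import re

def guess_and_match(text, literal="foo", min_gap=3, max_gap=5,
                    pattern=r"\w{3,5}foo"):
    """Locate `literal`, then try `pattern` only at start positions
    lying min_gap..max_gap characters before it."""
    rx = re.compile(pattern)
    pos = text.find(literal)          # stand-in for the FBM search
    while pos != -1:
        lo = max(0, pos - max_gap)    # earliest start the literal permits
        hi = pos - min_gap            # latest start the literal permits
        for start in range(lo, hi + 1):
            m = rx.match(text, start)
            if m:
                return m
        pos = text.find(literal, pos + 1)
    return None

m = guess_and_match("xxxx abcdefoo")
print(m.start(), m.group())  # prints: 5 abcdefoo
```

The point is that the pattern is never tried at offsets 0 to 4: the position of "foo" rules them out before the matcher proper runs.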
But I don't think that's what Tony's question was; as I
understand it, that was "how do I store (potentially multiple levels
of) bracket-captured text when running multiple regexes in
parallel". I'm also working on a fast RE engine, and bracketed text is
where I'm coming unstuck as well, so if anyone's got a decent answer,
I'd love to hear it...
--
The debate rages on: Is Perl Bactrian or Dromedary?