Related articles |
---|
Regexp optimization? stefan@waldherr.org (Stefan Waldherr) (1999-01-02) |
Re: Regexp optimization? zalman@netcom.com (1999-01-06) |
From: | zalman@netcom.com (Zalman Stern) |
Newsgroups: | comp.compilers |
Date: | 6 Jan 1999 04:05:54 -0500 |
Organization: | ICGNetcom |
References: | 99-01-004 |
Keywords: | lex, comment |
Stefan Waldherr (stefan@waldherr.org) wrote:
: I have a problem with a huge file that contains (line per line) regular
: expressions. Thing is, that many of the regexps may be redundant and hence
: matching is pretty slow.
[Our moderator suggests flex.]
Or glue the regexps together with an or (|) and feed it to GNU grep
for that matter. I'm not familiar with the algorithm flex uses, but
GNU grep gives the possibility of Boyer-Moore-Gosper combined with DFA
based matching. It will be fast but memory intensive.
Another possibility is to use an NFA construction technique combined
with a lazy DFA builder. I believe the GNU rx pacakge does
this. (Haven't looked at the details.) There are probably other
optimization techniques since the match can be fuzzy. (I.e. you
probably don't have to stick to the exact specification given by the
regexps.)
-Z-
[Flex builds a fixed DFA and interprets it. I haven't done any
experiments but I believe that if you look at runtimes and disregard
all the time it takes to run lex and compile the generated C code,
it's faster than egrep. -John]
Return to the
comp.compilers page.
Search the
comp.compilers archives again.