Related articles |
---|
Is there a paper which merges the Aho-Corasik and Boyer-Moore algorith cfc@shell01.TheWorld.com (Chris F Clark) (2006-09-16) |
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo bonzini@gnu.org (Paolo Bonzini) (2006-09-18) |
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo gah@ugcs.caltech.edu (glen herrmannsfeldt) (2006-09-18) |
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo henry@spsystems.net (2006-09-21) |
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo bonzini@gnu.org (Paolo Bonzini) (2006-09-21) |
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo cfc@shell01.TheWorld.com (Chris F Clark) (2006-09-21) |
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo loekcleophas@gmail.com (loekcleophas@gmail.com) (2006-09-30) |
From: | glen herrmannsfeldt <gah@ugcs.caltech.edu> |
Newsgroups: | comp.compilers |
Date: | 18 Sep 2006 22:44:02 -0400 |
Organization: | Compilers Central |
References: | 06-09-089 06-09-099 |
Keywords: | theory |
Posted-Date: | 18 Sep 2006 22:44:02 EDT |
Paolo Bonzini wrote:
>>Specifically, I've been working on string matching problems recently,
>>and thus the classic Aho-Corasik (AC) and Boyer-Moore (BM) algorithms.
>>In addition, I've known for a "long time" that the trick of working
>>backwards ala BM could be applied to a AC style matcher to obtain the
>>same potential sub-linear performance.
One of my favorite places to look for string algorithms is the
CPM (Combinatorial Pattern Matching) conference proceedings.
As well as I remember it, the important part of Aho-Corasik is
a finite-state machine. FSA should be very efficient when you are
looking for one of a large number of different strings.
You might also look at the documentation for BLAST, a string search
program used for protein and DNA sequences from NCBI.
> I think GNU fgrep is doing something very similar.
As well as I remember, the GNU grep use BM on the longest
fixed string in the query, and then apply the regexp once that
is found. For the large number of grep patterns that are fixed
strings, this is especially useful. I haven't looked at the -F
option, though.
-- glen
Return to the
comp.compilers page.
Search the
comp.compilers archives again.