|Is there a paper which merges the Aho-Corasik and Boyer-Moore algorith cfc@shell01.TheWorld.com (Chris F Clark) (2006-09-16)|
|Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo firstname.lastname@example.org (Paolo Bonzini) (2006-09-18)|
|Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo email@example.com (glen herrmannsfeldt) (2006-09-18)|
|Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo firstname.lastname@example.org (2006-09-21)|
|Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo email@example.com (Paolo Bonzini) (2006-09-21)|
|Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo cfc@shell01.TheWorld.com (Chris F Clark) (2006-09-21)|
|Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo firstname.lastname@example.org (email@example.com) (2006-09-30)|
|Date:||30 Sep 2006 17:42:20 -0400|
Henry Spencer wrote:
> Chris F Clark <cfc@shell01.TheWorld.com> wrote:
> >...thus the classic Aho-Corasik (AC) and Boyer-Moore (BM) algorithms.
> >In addition, I've known for a "long time" that the trick of working
> >backwards ala BM could be applied to a AC style matcher..
> >Still, it seems to me to be an obvious thing to do, so I would be
> >surprised if someone else hasn't already solved this problem...
> It's the Commentz-Walter algorithm, published (although rather obscurely)
> in 1979. CW is to AC as BM is to Knuth-Morris-Pratt, roughly speaking, a
> backwards-first optimization. Or alternatively :-), CW is to BM as AC is
> to KMP, a multi-string generalization.
> Commentz-Walter, B., "A string matching algorithm fast on the average",
> in Maurer ed., Proc 6th Internat. Coll. on Automata, Languages, and
> Programming, Springer-Verlag 1979, pp118-132. The inaccessibility of this
> paper (and the internal IBM Germany tech report it was based on) are a
> large part of why CW is little known.
Someone recently managed to track down Beate Commentz-Walter (she's a
professor at a German Hochschule), and she since put a scanned version
of the paper and abstract on her website. See
http://www.fh-albsig.de/win/personen/professoren.php?RID=36; most of
the page is in German, but the references to the PDF files containing
the paper and abstract can be found at the end of the page.
> Bruce Watson's 1995 Eindhoven PhD thesis, "Taxonomies and Toolkits of
> Regular Language Algorithms", is a much more readable and far more
> definitive reference -- including extensions of BM to full regular
> expressions -- but as far as I know, it suffers from the same problem of
> relative inaccessibility.
No it doesn't :-) Bruce Watson's thesis has been available in digital
form on his University of Pretoria and FASTAR Research Group website
for years: On fastar.org, you can find it under Publications ->
Theses/Internship Reports at the end of the list.
Software Engineering and Technology Group
Dept. Math & CS, Technische Universiteit Eindhoven
Main Building/HG 7.42, Den Dolech 2,
NL-5612 AZ Eindhoven, The Netherlands
Return to the
Search the comp.compilers archives again.