Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algorithms?

"loekcleophas@gmail.com" <loekcleophas@gmail.com>
30 Sep 2006 17:42:20 -0400

          From comp.compilers

Related articles
Is there a paper which merges the Aho-Corasik and Boyer-Moore algorith cfc@shell01.TheWorld.com (Chris F Clark) (2006-09-16)
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo bonzini@gnu.org (Paolo Bonzini) (2006-09-18)
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo gah@ugcs.caltech.edu (glen herrmannsfeldt) (2006-09-18)
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo henry@spsystems.net (2006-09-21)
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo bonzini@gnu.org (Paolo Bonzini) (2006-09-21)
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo cfc@shell01.TheWorld.com (Chris F Clark) (2006-09-21)
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo loekcleophas@gmail.com (loekcleophas@gmail.com) (2006-09-30)
| List of all articles for this month |

From: "loekcleophas@gmail.com" <loekcleophas@gmail.com>
Newsgroups: comp.compilers,comp.theory
Date: 30 Sep 2006 17:42:20 -0400
Organization: Compilers Central
References: 06-09-08906-09-105
Keywords: bibliography

Henry Spencer wrote:
> Chris F Clark <cfc@shell01.TheWorld.com> wrote:
> >...thus the classic Aho-Corasik (AC) and Boyer-Moore (BM) algorithms.
> >In addition, I've known for a "long time" that the trick of working
> >backwards ala BM could be applied to a AC style matcher..
> >Still, it seems to me to be an obvious thing to do, so I would be
> >surprised if someone else hasn't already solved this problem...
>
> It's the Commentz-Walter algorithm, published (although rather obscurely)
> in 1979. CW is to AC as BM is to Knuth-Morris-Pratt, roughly speaking, a
> backwards-first optimization. Or alternatively :-), CW is to BM as AC is
> to KMP, a multi-string generalization.
>
> Commentz-Walter, B., "A string matching algorithm fast on the average",
> in Maurer ed., Proc 6th Internat. Coll. on Automata, Languages, and
> Programming, Springer-Verlag 1979, pp118-132. The inaccessibility of this
> paper (and the internal IBM Germany tech report it was based on) are a
> large part of why CW is little known.


Someone recently managed to track down Beate Commentz-Walter (she's a
professor at a German Hochschule), and she since put a scanned version
of the paper and abstract on her website. See
http://www.fh-albsig.de/win/personen/professoren.php?RID=36; most of
the page is in German, but the references to the PDF files containing
the paper and abstract can be found at the end of the page.


>
> Bruce Watson's 1995 Eindhoven PhD thesis, "Taxonomies and Toolkits of
> Regular Language Algorithms", is a much more readable and far more
> definitive reference -- including extensions of BM to full regular
> expressions -- but as far as I know, it suffers from the same problem of
> relative inaccessibility.


No it doesn't :-) Bruce Watson's thesis has been available in digital
form on his University of Pretoria and FASTAR Research Group website
for years: On fastar.org, you can find it under Publications ->
Theses/Internship Reports at the end of the list.




--
Loek Cleophas
Software Engineering and Technology Group
Dept. Math & CS, Technische Universiteit Eindhoven
Main Building/HG 7.42, Den Dolech 2,
NL-5612 AZ Eindhoven, The Netherlands


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.