Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algorithms?

Chris F Clark <cfc@shell01.TheWorld.com>
21 Sep 2006 10:34:50 -0400

          From comp.compilers

Related articles
Is there a paper which merges the Aho-Corasik and Boyer-Moore algorith cfc@shell01.TheWorld.com (Chris F Clark) (2006-09-16)
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo bonzini@gnu.org (Paolo Bonzini) (2006-09-18)
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo gah@ugcs.caltech.edu (glen herrmannsfeldt) (2006-09-18)
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo henry@spsystems.net (2006-09-21)
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo bonzini@gnu.org (Paolo Bonzini) (2006-09-21)
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo cfc@shell01.TheWorld.com (Chris F Clark) (2006-09-21)
Re: Is there a paper which merges the Aho-Corasik and Boyer-Moore algo loekcleophas@gmail.com (loekcleophas@gmail.com) (2006-09-30)
| List of all articles for this month |

From: Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups: comp.compilers
Date: 21 Sep 2006 10:34:50 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 06-09-089 06-09-099
Keywords: bibliography, lex
Posted-Date: 21 Sep 2006 10:34:50 EDT

Thank you, Paolo.


Your lead got me what looks like the exact article I wanted:


@article{947951,
  author = {Bruce W. Watson and Richard E. Watson},
  title =
        {A Boyer--Moore-style algorithm for regular expression pattern matching},
  journal = {Sci. Comput. Program.},
  volume = {48},
  number = {2-3},
  year = {2003},
  issn = {0167-6423},
  pages = {99--117},
  doi = {http://dx.doi.org/10.1016/S0167-6423(03)00013-3},
  publisher = {Elsevier North-Holland, Inc.},
  address = {Amsterdam, The Netherlands, The Netherlands},
  }


The algorithm improves on the Commentz-Walter algorithm by handling
DFAs (REs), which is what I was looking for. It is presented fairly
abstractly, so I've got some implementation details to fill in, but
the essential concepts seem to be there.


Abrstact:


This paper presents a Boyer-Moore-type algorithm for regular
expression pattern matching, answering an open problem posed by Aho in
1980 (Pattern Matching in Strings, Academic Press, New York, 1980,
p. 342). The new algorithm handles patterns specified by regular
expressions-- a generalization of the Boyer-Moore and Commentz-Walter
algorithms.Like the Boyer-Moore and Commentz-Walter algorithms, the
new algorithm makes use of shift functions which can be precomputed
and tabulated. The precomputation algorithms are derived, and it is
shown that the required shift functions can be precomputed from
Commentz-Walter's d1 and d2 shift functions.In certain cases, the
Boyer-Moore (respectively Commentz-Walter) algorithm has greatly
outperformed the Knuth-Morris-Pratt (respectively Aho-Corasick)
algorithm (as discussed by Watson in his Ph.D. Thesis, Eindhoven
University of Technology, September 1995, and in: N. Ziviani,
R. Baeza-Yates, K. Guimaraes (Eds.), Proc. Third South American
Workshop on String Processing, International Informatics Series,
vol. 4, Carleton University Press, Recife, Brazil, 1996,
pp. 280-294). In testing, the algorithm presented in this paper also
frequently outperforms the regular expression generalization of the
Aho-Corasick algorithm.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.