Re: DotStar regex package

Chris F Clark <cfc@shell01.TheWorld.com>
Tue, 11 May 2010 13:44:20 -0400

          From comp.compilers

Related articles
DotStar regex package johnl@iecc.com (John R. Levine) (2010-05-11)
Re: DotStar regex package dot@dotat.at (Tony Finch) (2010-05-11)
Re: DotStar regex package cfc@shell01.TheWorld.com (Chris F Clark) (2010-05-11)
Re: DotStar regex package cfc@shell01.TheWorld.com (Chris F Clark) (2010-05-11)
Re: DotStar regex package kym@sdf.lonestar.org (russell kym horsell) (2010-05-12)
Re: DotStar regex package cfc@shell01.TheWorld.com (Chris F Clark) (2010-05-12)
| List of all articles for this month |
From: Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups: comp.compilers
Date: Tue, 11 May 2010 13:44:20 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 10-05-054
Keywords: lex, performance
Posted-Date: 12 May 2010 00:57:21 EDT

"John R. Levine" <johnl@iecc.com> writes:


> The March issue of IEEE Computer has an article called "Tools for Very
> Fast Regular Expression Matching" which describes a regex package
> called DotStar, written by three guys at IBM.
...
> I don't know enough about the nitty-gritty of current DFA pattern matchers
> to know if this really is a signficant improvement. Anyone else have
> opinions?


I've read the article and although a few details are a little
fragmentary, the work is sound and is part of a new trend in trying to
develop better (i.e. faster) regular expression processors. Similar
results can be seen in the work of Michella Becchi, who worked with
Patrick Crowley at Washington University (and interned with me here at
Intel). The underlying trick is converting the NFA into a DFA (and
the obvious generalization of the Aho-Corasik algorithm is a good way
to do this) except where doing so causes a state explosion and using
"other techniques" for dealing with the transitions that would cause
the explosion.


Hope this helps,
-Chris


******************************************************************************
Chris Clark email: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
------------------------------------------------------------------------------


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.