Re: DotStar regex package

russell kym horsell <kym@sdf.lonestar.org>
Wed, 12 May 2010 04:06:21 +0000 (UTC)

          From comp.compilers

Related articles
DotStar regex package johnl@iecc.com (John R. Levine) (2010-05-11)
Re: DotStar regex package dot@dotat.at (Tony Finch) (2010-05-11)
Re: DotStar regex package cfc@shell01.TheWorld.com (Chris F Clark) (2010-05-11)
Re: DotStar regex package cfc@shell01.TheWorld.com (Chris F Clark) (2010-05-11)
Re: DotStar regex package kym@sdf.lonestar.org (russell kym horsell) (2010-05-12)
Re: DotStar regex package cfc@shell01.TheWorld.com (Chris F Clark) (2010-05-12)
| List of all articles for this month |
From: russell kym horsell <kym@sdf.lonestar.org>
Newsgroups: comp.compilers
Date: Wed, 12 May 2010 04:06:21 +0000 (UTC)
Organization: Netfront http://www.netfront.net/
References: 10-05-054
Keywords: lex, performance, comment
Posted-Date: 12 May 2010 01:00:11 EDT

John R. Levine <johnl@iecc.com> wrote:
> The March issue of IEEE Computer has an article called "Tools for Very
> Fast Regular Expression Matching" which describes a regex package
> called DotStar, written by three guys at IBM. It turns regular
> expressions into NFAs and then DFAs, but does so in a way that they
> say runs much faster than other packages on modern CPUs. They also
> say they deal with the state explosions that NFA->DFA translations can
> cause.
[...]


This has been a moderately warm topic over the past 5 years as that
"ubiquitous parallelism" has broken out from 4 cores on an x86 to
256 cores on a GPU. :)


A few people have published work that compiles "a" RE or a set of RE
into machine code (e.g. MMX) rather than some HLL, but this seems to be
among the early s/w for relatively large-scale SIMD and more specifically
for large sets of RE.


I *think* any claim about limiting state explosion relates more to
combinging tables for multiple RE rather than something related to build
of a single DFA.


Maybe some of the hype is attached to the tool because it's being used
by IBM (and maybe I've heard Microsoft mentioned as well) in an online
trading s/w.
[I got the impression that the main application was scanning stuff for
malware signatures. -John]


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.