Regular expressions speedup

Cleo Saulnier <cleos@nb.sympatico-dot-ca.remove>
5 Aug 2005 19:07:34 -0400

From comp.compilers

Related articles
Regular expressions speedup cleos@nb.sympatico-dot-ca.remove (Cleo Saulnier) (2005-08-05)
Re: Regular expressions speedup haberg@math.su.se (2005-08-07)
Re: Regular expressions speedup cleos@nb.sympatico.ca (Cleo Saulnier) (2005-08-07)
Re: Regular expressions speedup haberg@math.su.se (2005-08-10)
Re: Regular expressions speedup bonzini@gnu.org (Paolo Bonzini) (2005-08-10)
Re: Regular expressions speedup dot@dotat.at (Tony Finch) (2005-08-10)
Re: Regular expressions speedup kszabo@bcml120x.ca.nortel.com (2005-08-10)
[3 later articles]

From: Cleo Saulnier <cleos@nb.sympatico-dot-ca.remove>
Newsgroups: comp.compilers
Date: 5 Aug 2005 19:07:34 -0400
Organization: Aliant Internet
Keywords: DFA, lex, question, comment
Posted-Date: 05 Aug 2005 19:07:34 EDT

I wrote my own regular expressions parser, CSRegEx, for C++ (all OSes),
which is now on SourceForge as public domain. You can access
backreferences, and it also supports Unicode. I wrote it for use in my
LR(1) parser, which I will release too. The regular expressions parser
converts every pattern into a binary format (still used as a string).
The matching algorithm is non-recursive and backtracking. Are there any
tips on how to speed up the matching process? I was thinking that, for
REs that aren't anchored to the start, computing the *FIRST* set of
characters (as in LL and LR parsers) might simplify the initial
scanning. Are there any other obvious things that can be done on a
backtracking engine?
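
To make the FIRST-set idea concrete, here is a rough sketch of what I
have in mind. The names (FirstSet, match_at, search) are made up for
the example and are not CSRegEx's actual API, and the table is filled
by hand for the pattern (foo|bar); a real engine would derive it from
the compiled pattern.

#include <array>
#include <cstddef>
#include <functional>
#include <iostream>
#include <string>

// A 256-entry table marking which bytes can begin a match.  In a real
// engine this would be derived from the compiled pattern (the union of
// the first character classes of every alternative); here it is filled
// by hand for (foo|bar).
struct FirstSet {
    std::array<bool, 256> can_start{};   // value-initialised to false
    void add(unsigned char c) { can_start[c] = true; }
    bool contains(unsigned char c) const { return can_start[c]; }
};

// Unanchored search: only call the (expensive) anchored matcher at
// positions whose byte is in the FIRST set, instead of at every
// position.  match_at stands in for the backtracking matcher anchored
// at a given position.
std::size_t search(const std::string& text, const FirstSet& first,
                   const std::function<bool(const std::string&,
                                            std::size_t)>& match_at) {
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (first.contains(static_cast<unsigned char>(text[i])) &&
            match_at(text, i))
            return i;                    // leftmost match position
    }
    return std::string::npos;
}

int main() {
    FirstSet first;
    first.add('f');                      // FIRST(foo|bar) = {f, b}
    first.add('b');

    // Toy stand-in for the real matcher: (foo|bar) anchored at pos.
    auto match_at = [](const std::string& t, std::size_t pos) {
        return t.compare(pos, 3, "foo") == 0 ||
               t.compare(pos, 3, "bar") == 0;
    };

    std::string text = "xxxx foo yyyy";
    std::cout << search(text, first, match_at) << "\n";   // prints 5
    return 0;
}

The payoff is that the expensive backtracking matcher is only invoked
at positions where a match could possibly begin.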


http://csregex.sourceforge.net/
http://sourceforge.net/projects/csregex
[If you really care about speed, why not turn the NFA into a DFA so
you don't have to do multiple states and backtracking? -John]
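
To show what the DFA approach buys, here is a minimal, hand-written
sketch: the transition table below stands in for what subset
construction would produce from the NFA for (foo|bar), and the matching
loop visits each input byte exactly once, with a single current state
and no backtracking. One caveat: a pure DFA cannot handle
backreferences, so it could only be used for patterns that don't need
them.

#include <array>
#include <iostream>
#include <string>
#include <vector>

int main() {
    constexpr int kDead = -1;            // no continuation possible
    constexpr int kAccept = 5;           // accepting state

    // transition[state][byte] -> next state; rows start all-dead.
    std::vector<std::array<int, 256>> transition(6);
    for (auto& row : transition) row.fill(kDead);

    // Hand-built table for (foo|bar): state 0 is the start state.
    transition[0]['f'] = 1;  transition[1]['o'] = 2;
    transition[2]['o'] = kAccept;
    transition[0]['b'] = 3;  transition[3]['a'] = 4;
    transition[4]['r'] = kAccept;

    std::string input = "bar";
    int state = 0;
    for (unsigned char c : input) {
        state = transition[state][c];    // one lookup per input byte
        if (state == kDead) break;
    }
    std::cout << (state == kAccept ? "match\n" : "no match\n");
    return 0;
}

As written the loop does anchored, whole-string matching and accepts
exactly the strings foo and bar; an unanchored search would need the
usual extra handling at the start state.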

