Re: Regular expressions speedup

torbenm@diku.dk (=?iso-8859-1?q?Torben_=C6gidius_Mogensen?=)
13 Aug 2005 00:24:50 -0400

          From comp.compilers

Related articles
[2 earlier articles]
Re: Regular expressions speedup cleos@nb.sympatico.ca (Cleo Saulnier) (2005-08-07)
Re: Regular expressions speedup haberg@math.su.se (2005-08-10)
Re: Regular expressions speedup bonzini@gnu.org (Paolo Bonzini) (2005-08-10)
Re: Regular expressions speedup dot@dotat.at (Tony Finch) (2005-08-10)
Re: Regular expressions speedup kszabo@bcml120x.ca.nortel.com (2005-08-10)
Re: Regular expressions speedup jburgy@gmail.com (2005-08-10)
Re: Regular expressions speedup torbenm@diku.dk (2005-08-13)
Re: Regular expressions speedup cleos@nb.sympatico-dot-ca.remove (Cleo Saulnier) (2005-08-13)
| List of all articles for this month |
From: torbenm@diku.dk (=?iso-8859-1?q?Torben_=C6gidius_Mogensen?=)
Newsgroups: comp.compilers
Date: 13 Aug 2005 00:24:50 -0400
Organization: Department of Computer Science, University of Copenhagen
References: 05-08-023 05-08-028 05-08-036
Keywords: lex
Posted-Date: 13 Aug 2005 00:24:50 EDT

haberg@math.su.se (Hans Aberg) writes:


> John Levine <johnl@iecc.com> wrote:
>
> > > The book by Aho, Sethi & Ullman, "Compilers...", end of sec. 3.7, says
> > > that best for both space and time is a "lazy transition evaluation"
> > > technique.
>
> > [I wonder if they feel differently about space tradeoffs now than they
> > did 30 years ago. At that point, programs had to fit into 16 bit address
> > spaces. -John]
>
> A NFA to DFA translation may, in bad cases, result in an exponential
> expansion, in which case no computer will suffice, no matter how much
> memory it has. So in order to avoid that, one might use the suggestion
> above. That is, translating the NFA to DFA transitions at need, caching
> the partial DFA. The quote above also says that this lazy technique may be
> faster than a full runtime NFA to DFA translation, in view of that unused
> DFA transitions need not be computed.


At one point I ran into a regular expression that generated an
enourmous DFA. My solution was to rewrite it into a context-free
grammar and then use an LR parser to generate a parser table. While
these tables can also be exponential, it worked in this case because
the parse stack can remember things that would otherwise have to be
remembered in the DFA. Translating regular expressions to grammar
rules is quite simple.


                Torben


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.