|regular expressions (bug-report) email@example.com (1993-03-14)|
|Re: regular expressions (bug-report) Paul.Klint@cwi.nl (1993-03-15)|
|Re: regular expressions (bug-report) firstname.lastname@example.org (1993-03-15)|
|regular expressions (bug-report) email@example.com (Jos Horsmeier) (1993-03-16)|
|Re: regular expressions (bug-report) firstname.lastname@example.org (1993-03-19)|
|Re: regular expressions (bug-report) email@example.com (1993-03-25)|
|Re: regular expressions (bug-report) firstname.lastname@example.org (1993-03-26)|
|From:||email@example.com (Mark A Biggar)|
|Organization:||Loral Western Development Labs|
|Date:||Mon, 15 Mar 1993 17:06:03 GMT|
firstname.lastname@example.org (Dave Jones) writes:
> <<discussion of problems with NFA based RE packages including  >>
>Is there anything published on this? Does anyone know how the various
>reg-exp packages in existence do it? Do they work correctly all the time?
>(I've found that most reg-exp packages reject some strings that should
Most of the Unix based RE packages don't use the NFA or DFA methods (a
notable exception is egrep). They use an interpretive backtracking
algorithm using a stack of backtrack points. Prime examples of this are
grep, Henry Spenser's RE package and Perl (which is a heavly modified
version of Henry's package). There are three reasons why NFA/DFA based
methods are not used:
1) the NFA to DFA algorithm produces really big tables even for simple
2) DAF based pattern matches don't handle [A-Z] well (a major resaon why
they get so big).
3) Unix RE packages like to support the () \1 substring and remembered
stuff, which can't be implemented using DFA methods at all.
For example the pattern /^(a*)b\1b\1$/ matchs strings in the
language a^nba^nba^n which is a context sensitive language and
so is not a "regular" expression at all.
In addition interpretive methods allow for optimizations such as using
boyer-moore table driven matching methods for parts of a pattern that
contain no metacharacters.
Return to the
Search the comp.compilers archives again.