|Trie algorithms? email@example.com (Tim McDaniel) (1996-04-29)|
|Re: Trie algorithms? firstname.lastname@example.org (1996-05-04)|
|Re: Trie algorithms? email@example.com (1996-05-06)|
|From:||Tim McDaniel <firstname.lastname@example.org>|
|Date:||29 Apr 1996 23:29:04 -0400|
|Keywords:||lex, question, comment|
Since this is a kind of lexical matching problem, I thought I'd try
comp.compilers. Request 0: please tell me if there is a more
appropriate group for these questions.
I'm looking for algorithms for quick matching of one of a set of fixed
strings. "Trie"s look interesting, to judge by pp. 151-155 of the 2nd
edition "dragon book" (Aho, Sethi, and Ullman; they are exercises 3.26
thru 3.33). I could just do a simple DFA with backtracking, of
course, but a trie uses at most 2N state transitions in processing an
input of length N (3.32.c), and I've got a *really* *big* *N* in this
The problem is that the algorithms in AS&U just reply "yes" or "no" to
whether at least one keyword exists somewhere in the string.
What I'd like to do, though, is do something on each acceptance state.
It's token replacement, where an N-character keyword must be replaced
by an N-character replacement. Single unmatched characters are left
untranslated (which is what those trie algorithms assume anyway).
Furthermore, I want left-most greedy matching. For keywords
he -> qi
hers -> nzzy
erg -> btu
and input "herg", I want to match "he" (and find its replacement
value), "r", and "g", and produce "qirg". I don't want to match "h"
and then "ergs", which I think the trie algorithms there would do.
I could try to adapt the dragon book algorithms, but perhaps someone
out there has already developed code for this. Does anyone have any
pointers to code, or indications of how I can adapt the dragon book
(And can you tell me whether I should be capitalizing "dragon book"?)
email@example.com is Tim McDaniel at work: Customer Potential Management Corp.,
East Peoria, IL, USA. The phone number is +1 309 698-1037.
For personal mail please use "firstname.lastname@example.org".
[I'd just use flex, following some of the hints in the documentation to make
lexers work faster. -John]
Return to the
Search the comp.compilers archives again.