|Regex to DFA email@example.com (2006-09-25)|
|Re: Regex to DFA firstname.lastname@example.org (Wolfram Fenske) (2006-09-30)|
|Re: Regex to DFA email@example.com (2006-10-06)|
|Date:||25 Sep 2006 01:15:58 -0400|
|Keywords:||lex, DFA, question|
I am writing a lexer generator library in C++
(http://www.benhanson.net/lexertl.html). I am half way through the new
version which uses the equivalence class technique from flex to speed
up DFA creation. So far I have generated the regex syntax tree,
partitioned all the regex tokens and built the lookup vector to map
input characters to character sets.
I'm wondering what the fastest technique is for implementing the syntax
tree to DFA conversion as described in the Dragon Book. In previous
versions I have done it by creating
std::map<std::set<node *>, size_t>
This means that as a set of nodes representing the next DFA state is
built I can lookup in the map fairly quickly to see if it has already
been seen. However, the more containers I use, the more over-head
there is when these temporary data structures are destroyed. Is there
a much simpler C language approach that I could use that would be more
efficient? The source code to flex hasn't been particularly
illuminating and I found Vern Paxson's explanations on this group much
more helpful, but these were from the late eighties/early nineties!
Does anyone have any suggestions?
Return to the
Search the comp.compilers archives again.