Regex to DFA

benhanson2@icqmail.com
25 Sep 2006 01:15:58 -0400

From comp.compilers

Related articles
*Regex to DFA benhanson2@icqmail.com* (2006-09-25)**
Re: Regex to DFA int2k@gmx.net (Wolfram Fenske) (2006-09-30)
Re: Regex to DFA benhanson2@icqmail.com (2006-10-06)

| List of all articles for this month |

From:	benhanson2@icqmail.com
Newsgroups:	comp.compilers
Date:	25 Sep 2006 01:15:58 -0400
Organization:	http://groups.google.com
Keywords:	lex, DFA, question
Posted-Date:	25 Sep 2006 01:15:58 EDT

Hello,

I am writing a lexer generator library in C++
(http://www.benhanson.net/lexertl.html). I am half way through the new
version which uses the equivalence class technique from flex to speed
up DFA creation. So far I have generated the regex syntax tree,
partitioned all the regex tokens and built the lookup vector to map
input characters to character sets.

I'm wondering what the fastest technique is for implementing the syntax
tree to DFA conversion as described in the Dragon Book. In previous
versions I have done it by creating

std::map<std::set<node *>, size_t>

This means that as a set of nodes representing the next DFA state is
built I can lookup in the map fairly quickly to see if it has already
been seen. However, the more containers I use, the more over-head
there is when these temporary data structures are destroyed. Is there
a much simpler C language approach that I could use that would be more
efficient? The source code to flex hasn't been particularly
illuminating and I found Vern Paxson's explanations on this group much
more helpful, but these were from the late eighties/early nineties!

Does anyone have any suggestions?

Thanks,

Ben Hanson

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Regex to DFA

benhanson2@icqmail.com25 Sep 2006 01:15:58 -0400

benhanson2@icqmail.com
25 Sep 2006 01:15:58 -0400