|NFA->DFA - Why use epsilon transitions? firstname.lastname@example.org (Alexander Morou) (2009-04-07)|
|Re: NFA->DFA - Why use epsilon transitions? email@example.com (Stephen Horne) (2009-04-13)|
|Re: NFA->DFA - Why use epsilon transitions? firstname.lastname@example.org (2009-04-14)|
|Re: NFA->DFA - Why use epsilon transitions? email@example.com (2009-04-14)|
|Re: NFA->DFA - Why use epsilon transitions? firstname.lastname@example.org (Vimal) (2009-04-16)|
|From:||Alexander Morou <email@example.com>|
|Date:||Tue, 7 Apr 2009 12:36:47 -0500|
|Posted-Date:||08 Apr 2009 16:40:21 EDT|
I have a question about epsilon transitions in standard NFA
implementations: What exactly are they for?
I've never really understood their purpose, other than to confuse
people trying to understand FSA and regular languages (with respect
to language scanner generation).
This is a standard overview of NFA, DFA, and translation from one
to the other (by I guess Donald R. Biggar, circa '04; using the
small grammar "ab*(bc)*"):
Since I'm useless with formal theory, I figured I'd follow his
pattern and describe mine using a visual representation:
Basically all I needed was a small lookup that contained the
transition requirement (ie. the character range) and the state
set that was transitioned to by that requirement. When
rebuilding the expression in DFA form, if a DFA node for the
NFA set and requirement doesn't exist, one is created; otherwise,
the node is retrieved from cache. The actual transitions
for each DFA node is the union of the transitions used to
create that node. This can potentially create new overlaps
so the node is created and inserted into cache before its individual
transitions are evaluated.
The code for DFA conversion is surprisingly small, and once I understood
the basics, the rest was easy. Is there something I missed in this
implementation, or is that really all there is?
I'm wondering if the entire reason I didn't need epsilon transitions
was the method I used to build the NFA structure in the first place.
I used the Kleene operators, and kept tabs on the last required element
relative to the current element, and merged the transitions backwards
through the state edges for every discrete node in the expression,
setting and clearing the edge status of nodes based upon whether the
next node element in the expression was required (ie. ac?d?e?b would
have the transition set of 'c','d','e' and 'b' after 'a' from the start.)
Naturally that doesn't get into state reduction, which was easy
once the other was done, but that's another matter.
Chris F Clark was nice enough to explain the difference between
throwing too much away for simple recognition, and just enough
for a transducer.
Thanks in advance,
Return to the
Search the comp.compilers archives again.