Tue, 7 Apr 2009 12:36:47 -0500

Related articles |
---|

NFA->DFA - Why use epsilon transitions? alexander.morou@gmail.com (Alexander Morou) (2009-04-07) |

Re: NFA->DFA - Why use epsilon transitions? sh006d3592@blueyonder.co.uk (Stephen Horne) (2009-04-13) |

Re: NFA->DFA - Why use epsilon transitions? alexander.morou@gmail.com (2009-04-14) |

Re: NFA->DFA - Why use epsilon transitions? torbenm@pc-003.diku.dk (2009-04-14) |

Re: NFA->DFA - Why use epsilon transitions? j.vimal@gmail.com (Vimal) (2009-04-16) |

From: | Alexander Morou <alexander.morou@gmail.com> |

Newsgroups: | comp.compilers |

Date: | Tue, 7 Apr 2009 12:36:47 -0500 |

Organization: | Compilers Central |

Keywords: | lex, DFA |

Posted-Date: | 08 Apr 2009 16:40:21 EDT |

I have a question about epsilon transitions in standard NFA

implementations: What exactly are they for?

I've never really understood their purpose, other than to confuse

people trying to understand FSA and regular languages (with respect

to language scanner generation).

This is a standard overview of NFA, DFA, and translation from one

to the other (by I guess Donald R. Biggar, circa '04; using the

small grammar "ab*(bc)*"):

http://www3.sympatico.ca/dbiggar/FA.home.html

Since I'm useless with formal theory, I figured I'd follow his

pattern and describe mine using a visual representation:

http://alexandermorou.com/images/nfa_to_dfa.jpg

Basically all I needed was a small lookup that contained the

transition requirement (ie. the character range) and the state

set that was transitioned to by that requirement. When

rebuilding the expression in DFA form, if a DFA node for the

NFA set and requirement doesn't exist, one is created; otherwise,

the node is retrieved from cache. The actual transitions

for each DFA node is the union of the transitions used to

create that node. This can potentially create new overlaps

so the node is created and inserted into cache before its individual

transitions are evaluated.

The code for DFA conversion is surprisingly small, and once I understood

the basics, the rest was easy. Is there something I missed in this

implementation, or is that really all there is?

I'm wondering if the entire reason I didn't need epsilon transitions

was the method I used to build the NFA structure in the first place.

I used the Kleene operators, and kept tabs on the last required element

relative to the current element, and merged the transitions backwards

through the state edges for every discrete node in the expression,

setting and clearing the edge status of nodes based upon whether the

next node element in the expression was required (ie. ac?d?e?b would

have the transition set of 'c','d','e' and 'b' after 'a' from the start.)

Naturally that doesn't get into state reduction, which was easy

once the other was done, but that's another matter.

http://compilers.iecc.com/comparch/article/09-02-146

Chris F Clark was nice enough to explain the difference between

throwing too much away for simple recognition, and just enough

for a transducer.

Thanks in advance,

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.