Re: Regular Expression Minimization (Ben Abernathy)
20 Apr 2003 17:57:14 -0400

          From comp.compilers

Related articles
Regular Expression Minimization (Jose Joao Morais) (2003-04-13)
Re: Regular Expression Minimization (David Butler) (2003-04-20)
Re: Regular Expression Minimization (2003-04-20)
| List of all articles for this month |

From: (Ben Abernathy)
Newsgroups: comp.compilers
Date: 20 Apr 2003 17:57:14 -0400
References: 03-04-042
Keywords: lex
Posted-Date: 20 Apr 2003 17:57:13 EDT

You have encountered the main disadvantage to state removal:
ambiguity. I have never addressed the problem of finding the so called
"best choice" for state removal. However, there is another algorithm
you can use in order to take a DFA to a regular expression. It is
called Kleen's Theorem. It is a very easy to follow algorithm and
will work on any DFA operating on a regular language. Be warned,
however, that it is recursive in nature and I have no idea of the
complexity. The main reason I use it is because it is very simple to
write a program to do the algorithm for you and in my opinion, is much
better than the state removal method. Granted, it may take more
iterations to use Kleen's theorem, but if you're very concerned about
time, it should still be far shorter than doing a shortest path
problem inside your dfa to regular expression conversion. There is one
thing to keep in mind however. As far as I know and it has been my
experience that Kleen's theorem will not give you the simplest regular
expression. The regular expressions resulting from Kleen's tend to be
fairly large and complex. Another problem for you might be (and I
don't remember if this is true all the time) that Kleen's has a
tendency to introduce lambdas into your regular expression. So,
depending on what you actually want to do with your resulting regular
expression, Kleen's might work for you and it might not. But just be
aware that there are other algorithms outside of state removal and it
might be worth it to take a look at those.

I learned about Kleen's theorem in "Introduction to Languages and the
Theory of Computation" by John C. Martin. The book isn't the greatest
in the world, but it is a good place to start. You also might want to
try ACM's digital library, if you are a subscriber. I'm sure you could
find something in there.

I know this doesn't really solve your problem, but I hope it helps.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.