Mon, 11 Jan 2010 20:10:25 -0500

Related articles |
---|

[2 earlier articles] |

Re: How to compute a regex which is the difference between two regexes egagnon@j-meg.com (Etienne M. Gagnon) (2010-01-03) |

Re: How to compute a regex which is the difference between two regexes gene.ressler@gmail.com (Gene) (2010-01-03) |

Re: How to compute a regex which is the difference between two regexes mciura@gmail.com (Marcin Ciura) (2010-01-04) |

Re: How to compute a regex which is the difference between two regexes max@gustavus.edu (Max Hailperin) (2010-01-04) |

Re: How to compute a regex which is the difference between two regexes kkylheku@gmail.com (Kaz Kylheku) (2010-01-05) |

Re: How to compute a regex which is the difference between two regexes kkylheku@gmail.com (Kaz Kylheku) (2010-01-06) |

Re: How to compute a regex which is the difference between two regexes cfc@shell01.TheWorld.com (Chris F Clark) (2010-01-11) |

From: | Chris F Clark <cfc@shell01.TheWorld.com> |

Newsgroups: | comp.compilers |

Date: | Mon, 11 Jan 2010 20:10:25 -0500 |

Organization: | The World Public Access UNIX, Brookline, MA |

References: | 10-01-013 10-01-033 |

Keywords: | DFA, lex |

Posted-Date: | 13 Jan 2010 22:39:48 EST |

Kaz Kylheku <kkylheku@gmail.com> writes:

....

*> Using a four letter {a, b, c, d} alphabet, I computed the following*

*> example complement:*

*>*

*> complement(a(abc)*d) -> (|[bcd]|a(abc)*[bc]|a(abc)*d[abcd]+|*

*> aab[abd][abcd]*|aa[acd][abcd]*)*

*>*

*> On informal inspection this seems right (do you see any mistakes).*

*> I.e. an empty string mismatches, since the regex derives only nonempty*

*> strings, so we include it. The the one-letter strings b, c and d do*

*> not match, because the regex has a null derivative with respect to*

*> these. An a followed by by zero or more repetitions of "abc",*

*> followed by something other than an a or d (i.e. [bc]) does not match*

*> and so is a branch of of the complement. And so forth.*

Your algorithm which works directly on "regular expressions" is

actually minimicing the state traversal that one does in the finite

automaton versions of the algorithm. Not, that that is a bad thing.

In fact, it is one way of establishing the correctness of the

algorithm, by showing how it does exactly the same thing just using a

different notation.

In fact, it is possible to represent the states of a finite automaton,

as a series of sets of "dotted items" (each state being one set of

dotted items). Each dotted item, having a dot (cursor) in the regular

expression showing where in the regular expression that state

represents.

I believe you will find a correspondence between these sets of dotted

items and derivatives.

So, in that sense, there is an affirmative answer to the original

question. There is an algorithm that works directly with regular

expressions that generates the difference between two regular

expressions. However, the mechanics of the algorithm recapitulate

transforming the regular expressions into finite automatons and

working with the states to find taversals to all the states that are

final in the first automaton (subtrahend?) but not final in the second

(minuend?).

It is also possible to do the same thing with algebraic manipulations,

if you allow yourself to write an expression the with regular

expressions augmented by the set theory operators (union,

intersection, complement) and then algebraicly transform it into an

expression without those operators. My intuition tells me that the

algebra will effectively perform the same operations as the traversal

algorithm. However, I leave such as an exercise to the motivated

reader.

Hope this helps,

-Chris

******************************************************************************

Chris Clark email: christopher.f.clark@compiler-resources.com

Compiler Resources, Inc. Web Site: http://world.std.com/~compres

23 Bailey Rd voice: (508) 435-5016

Berlin, MA 01503 USA twitter: @intel_chris

------------------------------------------------------------------------------

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.