Thu, 27 Mar 2008 00:49:58 +0100

Related articles |
---|

Generate text, given a regex midhatali@gmail.com (Midhat) (2008-03-23) |

Re: Generate text, given a regex m.helvensteijn@gmail.com (2008-03-27) |

Re: Generate text, given a regex gene.ressler@gmail.com (Gene) (2008-03-26) |

Generate text, given a regex domenico.bianculli@lu.unisi.ch (Domenico Bianculli) (2008-03-27) |

Re: Generate text, given a regex gene.ressler@gmail.com (Gene) (2008-04-11) |

Re: Generate text, given a regex rsc@swtch.com (Russ Cox) (2008-04-11) |

From: | m.helvensteijn@gmail.com |

Newsgroups: | comp.compilers |

Date: | Thu, 27 Mar 2008 00:49:58 +0100 |

Organization: | Wanadoo |

References: | 08-03-095 |

Keywords: | lex |

Posted-Date: | 26 Mar 2008 23:09:29 EDT |

Midhat wrote:

*> Hi. I want to generate text based on a given regex. just any text that*

*> satisifies the regex. Are there any existing tools/libraries to do*

*> that.*

I don't know any tools/libraries, but it seems like this problem isn't so

difficult.

* Create a syntax tree of the regex.

* Generate text (T(r)) for the leafs of the tree (r). The atomic regexes:

- If r = a character, T(r) = that character

- If r = a character class, T(r) = any character satisfying that class

- If r = . , T(r) = any character (except perhaps \n)

* Recursively generate text (T(R)) for the non-atomic sub-regexes (R) of the

tree, assuming text has already been generated for all the immediate

sub-regexes of R:

- If R = x*, T(R) = a concatenation of any number of T(x).

- If R = x+, T(R) = a concatenation of 1 or more of T(x).

- If R = x?, T(R) = the empty string or T(x).

- If R = xy, T(R) = the concatenation of T(x) and T(y).

- If R = x|y, T(R) = T(x) or T(y)

- If R = (x), T(R) = T(x)

Any choices you make here will lead to T(regex), the text you want. Of

course, if any text will do, you can immediately return empty strings for

x* and x?, without processing x.

And I am assuming here that we are talking about regular expressions in the

classical sense. So without recalling previously captured sub-expressions

and such. However, if you want to have those anyway, there are still

relatively easy adaptations you can make to this algorithm to make it work.

Good luck!

--

Michiel Helvensteijn

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.