|Generate text, given a regex email@example.com (Midhat) (2008-03-23)|
|Re: Generate text, given a regex firstname.lastname@example.org (2008-03-27)|
|Re: Generate text, given a regex email@example.com (Gene) (2008-03-26)|
|Generate text, given a regex firstname.lastname@example.org (Domenico Bianculli) (2008-03-27)|
|Re: Generate text, given a regex email@example.com (Gene) (2008-04-11)|
|Re: Generate text, given a regex firstname.lastname@example.org (Russ Cox) (2008-04-11)|
|Date:||Thu, 27 Mar 2008 00:49:58 +0100|
|Posted-Date:||26 Mar 2008 23:09:29 EDT|
> Hi. I want to generate text based on a given regex. just any text that
> satisifies the regex. Are there any existing tools/libraries to do
I don't know any tools/libraries, but it seems like this problem isn't so
* Create a syntax tree of the regex.
* Generate text (T(r)) for the leafs of the tree (r). The atomic regexes:
- If r = a character, T(r) = that character
- If r = a character class, T(r) = any character satisfying that class
- If r = . , T(r) = any character (except perhaps \n)
* Recursively generate text (T(R)) for the non-atomic sub-regexes (R) of the
tree, assuming text has already been generated for all the immediate
sub-regexes of R:
- If R = x*, T(R) = a concatenation of any number of T(x).
- If R = x+, T(R) = a concatenation of 1 or more of T(x).
- If R = x?, T(R) = the empty string or T(x).
- If R = xy, T(R) = the concatenation of T(x) and T(y).
- If R = x|y, T(R) = T(x) or T(y)
- If R = (x), T(R) = T(x)
Any choices you make here will lead to T(regex), the text you want. Of
course, if any text will do, you can immediately return empty strings for
x* and x?, without processing x.
And I am assuming here that we are talking about regular expressions in the
classical sense. So without recalling previously captured sub-expressions
and such. However, if you want to have those anyway, there are still
relatively easy adaptations you can make to this algorithm to make it work.
Return to the
Search the comp.compilers archives again.