Re: regular expression question

Chris F Clark <cfc@shell01.TheWorld.com>
9 Jun 2005 15:46:05 -0400

          From comp.compilers

Related articles
regular expression question gvheurn@gmail.com (Gijs) (2005-06-08)
Re: regular expression question gah@ugcs.caltech.edu (glen herrmannsfeldt) (2005-06-08)
Re: regular expression question 148f3wg02@sneakemail.com (Karsten Nyblad) (2005-06-08)
Re: regular expression question daw@taverner.cs.berkeley.edu (2005-06-08)
Re: regular expression question gvheurn@gmail.com (Gijs) (2005-06-09)
Re: regular expression question nicola.musatti@gmail.com (Nicola Musatti) (2005-06-09)
Re: regular expression question cfc@shell01.TheWorld.com (Chris F Clark) (2005-06-09)
Re: regular expression question snicol@apk.net (Scott Nicol) (2005-06-10)
Re: regular expression question snicol@apk.net (Scott Nicol) (2005-06-10)
Re: regular expression question d148f3wg02@sneakemail.com (Karsten Nyblad) (2005-06-10)
Re: regular expression question torbenm@diku.dk (2005-06-10)
Re: regular expression question skandgoe@gwdg.de (Skandinavisches Seminar) (2005-06-10)
Re: regular expression question mailbox@dmitry-kazakov.de (Dmitry A. Kazakov) (2005-06-12)
| List of all articles for this month |

From: Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups: comp.compilers
Date: 9 Jun 2005 15:46:05 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 05-06-045 05-06-050
Keywords: lex, DFA

David Wagner's response is correct. The easiest way to construct
complement RE is to build the FSA and reverse the sense of accept and
fail. (We use that trick in Yacc++, where we have predicates of both
the true and false form, we simply change where the code jumps to for
the different senses.)


[Can you put any limits on the size of the resulting RE, or might it
be like LR(k) to LR(1), too ugly to use? -John]


From that message, one can recognize that a RE and its complement
take the same number of states. However, as one can see from the
notation David had to write that the RE expression of the complement
of some RE can be significantly more (or less) complicated that the
original RE. Even in RE form one can bound the size differential, I
think it is roughly n**2, but I could be mistaken and it could be
2**n. As to whether it is too ugly to use, I think the results speak
for themselves.


Finally, some "lexical" level tools do implement the complement
operator for regular expressions directly in their notation. I think
- (as in negation) was the choice of at least a few, with infix minus
(RE - RE) representing match all that match the 1st but not the 2nd.
One can prove that that is also an RE.


Hope this helps,
-Chris


*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
23 Bailey Rd voice : (508) 435-5016
Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)
------------------------------------------------------------------------------


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.