From:  "Etienne M. Gagnon" <egagnon@jmeg.com> 
Newsgroups:  comp.compilers 
Date:  Sun, 03 Jan 2010 16:08:28 0500 
Organization:  Compilers Central 
References:  1001013 
Keywords:  lex 
Cc:  Peng Yu <pengyu.ut@gmail.com> 
PostedDate:  04 Jan 2010 11:19:58 EST 
Hi Peng,
See my answer below.
> Suppose that I have two regular sets A and B that are generated by two
> regular expressions. I'm wondering if there is always a regular
> expression that can generated the difference between the two sets,
> i.e., AB. If there is such regular expression, is there a mechanical
> way to generated such regular expression.
Yes, the difference between two regular sets is a regular set. Here is a
mechanical way to produce it:
1 Transform each of regular expressions A and B into automatons AUTA
and AUTB.
2 Create the following NFA:
StartState

+(epsilon)AUTA

\(epsilon)AUTB
3 Transform the NFA into a DFA, but determine its accepting state as
follows: A state is an accepting state if and only if it corresponds to
accepting state of AUTA and not to an accepting state of AUTB.
4 You can then use wellknown DFAtoregularexpression transformations
to get a (possibly ugly) equivalent regular expression.
It should be fairly obvious that this is theoretically sound. We are
simply computing a DFA that walks A and B in parallel, and that accepts
all elements of A that are not elements of B.
The beta version of SableCC 4.0 ( http://sablecc.org/ ) implements this
difference (called Except) operator for lexer specifications. You would
express it as follows:
Language my_test_language;
Lexer
// regular definitions
a = Any+;
b = 'Etienne';
a_difference_b = a Except b;
Token
a_difference_b; // list of tokens to be recognized by lexer
Etienne

Etienne M. Gagnon, Ph.D.
SableCC: http://sablecc.org
Return to the
comp.compilers page.
Search the
comp.compilers archives again.