# Re: Intersection of regular expression languages

## Neelakantan Krishnaswami <neelk@cs.cmu.edu>Sun, 21 Oct 2007 20:54:33 +0000 (UTC)

From comp.compilers

Related articles
Intersection of regular expression languages haberg@math.su.se (2007-10-19)
Re: Intersection of regular expression languages neelk@cs.cmu.edu (Neelakantan Krishnaswami) (2007-10-21)
Re: Intersection of regular expression languages cfc@shell01.TheWorld.com (Chris F Clark) (2007-10-21)
Re: Intersection of regular expression languages gene.ressler@gmail.com (Gene) (2007-10-21)
Re: Intersection of regular expression languages wyrmwif@tsoft.org (SM Ryan) (2007-10-21)
Re: Intersection of regular expression languages haberg@math.su.se (2007-10-22)
| List of all articles for this month |

 From: Neelakantan Krishnaswami Newsgroups: comp.compilers Date: Sun, 21 Oct 2007 20:54:33 +0000 (UTC) Organization: Carnegie Mellon Univ. -- Computer Science Dept. References: 07-10-063 Keywords: lex, theory Posted-Date: 22 Oct 2007 00:23:22 EDT

Hans Aberg <haberg@math.su.se> wrote:
> Can the intersection of two regular expression languages be
> constructed as a regular expression language?

Yes.

Here's an Ocaml program that tests whether a string is recognized by a
regular expression language including intersection, which uses
Brzozowski's method of derivatives.

This code is not terribly efficient, which is not a problem with the
method of derivatives, but rather with my toy implementation.

(* The datatype of regular expressions. Intersection and union are
And and Or respectively. *)

type re =
| Char of char
| Epsilon
| Seq of re * re
| Star of re
| False
| Or of re * re
| True
| And of re * re

(* Check whether a regular expression matches the empty string. *)

let rec nullable re =
match re with
| Char _ -> false
| Epsilon -> true
| Seq(r1, r2) -> (nullable r1) && (nullable r2)
| Star r -> true
| False -> false
| Or(r1, r2) -> (nullable r1) || (nullable r2)
| True -> true
| And(r1, r2) -> (nullable r1) && (nullable r2)

(* deriv c r computes the c-th "derivative" of r. What this
means is that we compute a regular expression r' from r, such
that if (c ^ s) is recognized by r, then r' recognizes s. *)

let rec deriv c re =
match re with
| Char c' -> if c = c' then Epsilon else False
| Epsilon -> False
| Seq(r1, r2) ->
let r1' = deriv c r1 in
if nullable r1 then
Or(Seq(r1', r2), deriv c r2)
else
Seq(r1', r2)
| Star r -> Seq(deriv c r, (Star r))
| False -> False
| Or(r1, r2) -> Or(deriv c r1, deriv c r2)
| True -> True
| And(r1, r2) -> And(deriv c r1, deriv c r2)

(* check uses derivatives to match the substring of s starting at
position at i. If the i-th character of s is c, then we compute
the c'th derivative and look at position i+1. *)

let rec check s re i =
if i = String.length s then
if nullable re then Some i else None
else
check s (deriv s.[i] re) (i+1)

--
Neel R. Krishnaswami
neelk@cs.cmu.edu

Post a followup to this message