Re: Intersection of regular expression languages

Neelakantan Krishnaswami <neelk@cs.cmu.edu>
Sun, 21 Oct 2007 20:54:33 +0000 (UTC)

          From comp.compilers

Related articles
Intersection of regular expression languages haberg@math.su.se (2007-10-19)
Re: Intersection of regular expression languages neelk@cs.cmu.edu (Neelakantan Krishnaswami) (2007-10-21)
Re: Intersection of regular expression languages cfc@shell01.TheWorld.com (Chris F Clark) (2007-10-21)
Re: Intersection of regular expression languages gene.ressler@gmail.com (Gene) (2007-10-21)
Re: Intersection of regular expression languages wyrmwif@tsoft.org (SM Ryan) (2007-10-21)
Re: Intersection of regular expression languages haberg@math.su.se (2007-10-22)
| List of all articles for this month |

From: Neelakantan Krishnaswami <neelk@cs.cmu.edu>
Newsgroups: comp.compilers
Date: Sun, 21 Oct 2007 20:54:33 +0000 (UTC)
Organization: Carnegie Mellon Univ. -- Computer Science Dept.
References: 07-10-063
Keywords: lex, theory
Posted-Date: 22 Oct 2007 00:23:22 EDT

Hans Aberg <haberg@math.su.se> wrote:
> Can the intersection of two regular expression languages be
> constructed as a regular expression language?


Yes.


Here's an Ocaml program that tests whether a string is recognized by a
regular expression language including intersection, which uses
Brzozowski's method of derivatives.


This code is not terribly efficient, which is not a problem with the
method of derivatives, but rather with my toy implementation.


(* The datatype of regular expressions. Intersection and union are
      And and Or respectively. *)


type re =
    | Char of char
    | Epsilon
    | Seq of re * re
    | Star of re
    | False
    | Or of re * re
    | True
    | And of re * re


(* Check whether a regular expression matches the empty string. *)


let rec nullable re =
    match re with
    | Char _ -> false
    | Epsilon -> true
    | Seq(r1, r2) -> (nullable r1) && (nullable r2)
    | Star r -> true
    | False -> false
    | Or(r1, r2) -> (nullable r1) || (nullable r2)
    | True -> true
    | And(r1, r2) -> (nullable r1) && (nullable r2)


(* deriv c r computes the c-th "derivative" of r. What this
      means is that we compute a regular expression r' from r, such
      that if (c ^ s) is recognized by r, then r' recognizes s. *)


let rec deriv c re =
    match re with
    | Char c' -> if c = c' then Epsilon else False
    | Epsilon -> False
    | Seq(r1, r2) ->
            let r1' = deriv c r1 in
            if nullable r1 then
Or(Seq(r1', r2), deriv c r2)
            else
Seq(r1', r2)
    | Star r -> Seq(deriv c r, (Star r))
    | False -> False
    | Or(r1, r2) -> Or(deriv c r1, deriv c r2)
    | True -> True
    | And(r1, r2) -> And(deriv c r1, deriv c r2)


(* check uses derivatives to match the substring of s starting at
      position at i. If the i-th character of s is c, then we compute
      the c'th derivative and look at position i+1. *)


let rec check s re i =
    if i = String.length s then
        if nullable re then Some i else None
    else
        check s (deriv s.[i] re) (i+1)


--
Neel R. Krishnaswami
neelk@cs.cmu.edu


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.