Sun, 21 Oct 2007 20:54:33 +0000 (UTC)

Related articles |
---|

Intersection of regular expression languages haberg@math.su.se (2007-10-19) |

Re: Intersection of regular expression languages neelk@cs.cmu.edu (Neelakantan Krishnaswami) (2007-10-21) |

Re: Intersection of regular expression languages cfc@shell01.TheWorld.com (Chris F Clark) (2007-10-21) |

Re: Intersection of regular expression languages gene.ressler@gmail.com (Gene) (2007-10-21) |

Re: Intersection of regular expression languages wyrmwif@tsoft.org (SM Ryan) (2007-10-21) |

Re: Intersection of regular expression languages haberg@math.su.se (2007-10-22) |

From: | Neelakantan Krishnaswami <neelk@cs.cmu.edu> |

Newsgroups: | comp.compilers |

Date: | Sun, 21 Oct 2007 20:54:33 +0000 (UTC) |

Organization: | Carnegie Mellon Univ. -- Computer Science Dept. |

References: | 07-10-063 |

Keywords: | lex, theory |

Posted-Date: | 22 Oct 2007 00:23:22 EDT |

Hans Aberg <haberg@math.su.se> wrote:

*> Can the intersection of two regular expression languages be*

*> constructed as a regular expression language?*

Yes.

Here's an Ocaml program that tests whether a string is recognized by a

regular expression language including intersection, which uses

Brzozowski's method of derivatives.

This code is not terribly efficient, which is not a problem with the

method of derivatives, but rather with my toy implementation.

(* The datatype of regular expressions. Intersection and union are

And and Or respectively. *)

type re =

| Char of char

| Epsilon

| Seq of re * re

| Star of re

| False

| Or of re * re

| True

| And of re * re

(* Check whether a regular expression matches the empty string. *)

let rec nullable re =

match re with

| Char _ -> false

| Epsilon -> true

| Seq(r1, r2) -> (nullable r1) && (nullable r2)

| Star r -> true

| False -> false

| Or(r1, r2) -> (nullable r1) || (nullable r2)

| True -> true

| And(r1, r2) -> (nullable r1) && (nullable r2)

(* deriv c r computes the c-th "derivative" of r. What this

means is that we compute a regular expression r' from r, such

that if (c ^ s) is recognized by r, then r' recognizes s. *)

let rec deriv c re =

match re with

| Char c' -> if c = c' then Epsilon else False

| Epsilon -> False

| Seq(r1, r2) ->

let r1' = deriv c r1 in

if nullable r1 then

Or(Seq(r1', r2), deriv c r2)

else

Seq(r1', r2)

| Star r -> Seq(deriv c r, (Star r))

| False -> False

| Or(r1, r2) -> Or(deriv c r1, deriv c r2)

| True -> True

| And(r1, r2) -> And(deriv c r1, deriv c r2)

(* check uses derivatives to match the substring of s starting at

position at i. If the i-th character of s is c, then we compute

the c'th derivative and look at position i+1. *)

let rec check s re i =

if i = String.length s then

if nullable re then Some i else None

else

check s (deriv s.[i] re) (i+1)

--

Neel R. Krishnaswami

neelk@cs.cmu.edu

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.