Re: Alternatives to Regexps

Brian Rogoff <bpr@best.com>
10 Jul 1998 21:02:53 -0400

          From comp.compilers

Related articles
Alternatives to Regexps john@dwaf-hri.pwv.gov.za (John Carter) (1998-07-08)
Re: Alternatives to Regexps ak@muc.de (1998-07-10)
Re: Alternatives to Regexps torbenm@diku.dk (Torben Mogensen) (1998-07-10)
Re: Alternatives to Regexps jamz@cdsnet.net (1998-07-10)
Re: Alternatives to Regexps d.rourke@arpc.com (Daniel Rourke) (1998-07-10)
Re: Alternatives to Regexps bear@sonic.net (Ray Dillinger) (1998-07-10)
Re: Alternatives to Regexps bpr@best.com (Brian Rogoff) (1998-07-10)
Re: Alternatives to Regexps mav@naxos.esat.kuleuven.ac.be (Maurizio Vitale) (1998-07-10)
Re: Alternatives to Regexps lord@emf.emf.net (1998-07-11)
Re: Alternatives to Regexps bromage@cs.mu.OZ.AU (1998-07-11)
Re: Alternatives to Regexps cfc@world.std.com (Chris F Clark) (1998-07-13)
Re: Alternatives to Regexps torbenm@diku.dk (Torben Mogensen) (1998-07-13)
Re: Alternatives to Regexps thetick@magelang.com (Scott Stanchfield) (1998-07-20)
| List of all articles for this month |

From: Brian Rogoff <bpr@best.com>
Newsgroups: comp.compilers
Date: 10 Jul 1998 21:02:53 -0400
Organization: Compilers Central
References: 98-07-057
Keywords: DFA

On 8 Jul 1998, John Carter wrote:
> I'm seeking pointers to alternatives to standard regular expression
> languages.


Check out SNOBOL4, a real "blast from the past", or its modern descendant,
Icon (at http://www.cs.arizona.edu/icon/). The pattern language of SNOBOL4
provides a lot more power than regexps (at an admittedly greater cost)
and may be what you're looking for.


The free Ada compiler GNAT comes with a set of packages which provide some
SNOBOL like capabilities (the author of these packages is the author of
the SPITBOL compiler too); I'll include a snippet from its documentation
to give you a flavor of what it looks like.


-- Brian


... snip...


-- For an example of a recursive pattern, let's define a pattern
-- that is like the built in Bal, but the string matched is balanced
-- with respect to square brackets or curly brackets.


-- The language for such strings might be defined in extended BNF as
--
-- ELEMENT ::= <any character other than [] or {}>
-- | '[' BALANCED_STRING ']'
-- | '{' BALANCED_STRING '}'
--
-- BALANCED_STRING ::= ELEMENT {ELEMENT}
--
-- Here we use {} to indicate zero or more occurrences of a term, as
-- is common practice in extended BNF. Now we can translate the above
-- BNF into recursive patterns as follows:
--
-- Element, Balanced_String : aliased Pattern;
-- .
-- .
-- .
-- Element := NotAny ("[]{}")
-- or
-- ('[' & (+Balanced_String) & ']')
-- or
-- ('{' & (+Balanced_String) & '}');
--
-- Balanced_String := Element & Arbno (Element);
--
-- Note the important use of + here to refer to a pattern not yet
-- defined. Note also that we use assignments precisely because we
-- cannot refer to as yet undeclared variables in initializations.
--
-- Now that this pattern is constructed, we can use it as though it
-- were a new primitive pattern element, and for example, the match:
--
-- Match ("xy[ab{cd}]", Balanced_String * Current_Output & Fail);
--
-- will generate the output:
--
-- x
-- xy
-- xy[ab{cd}]
-- y
-- y[ab{cd}]
-- [ab{cd}]
-- a
-- ab
-- ab{cd}
-- b
-- b{cd}
-- {cd}
-- c
-- cd
-- d
--
-- Note that the function of the fail here is simply to force the
-- pattern Balanced_String to match all possible alternatives. Studying
-- the operation of this pattern in detail is highly instructive.
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.