Re: Regular Expressions

Martin Ward <Martin.Ward@durham.ac.uk>
17 Oct 2004 16:03:39 -0400

          From comp.compilers


From: Martin Ward <Martin.Ward@durham.ac.uk>
Newsgroups: comp.compilers
Date: 17 Oct 2004 16:03:39 -0400
Organization: Compilers Central
References: 04-10-069 04-10-100
Keywords: lex
Posted-Date: 17 Oct 2004 16:03:39 EDT

On Tuesday 12 Oct 2004 5:56 am, Torben Ægidius Mogensen wrote:
> A regular expression cannot
> handle arbitrary nesting depths, so you would either need to use a
> counter in the action of the regular expression or limit yourself to a
> fixed limit on the number of nested tables and write a regular
> expression for each level of nesting.
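
For comparison, the counter approach might be sketched in Perl roughly as
follows. This is only an illustration under my own assumptions: the helper
name matching_close is hypothetical, and a plain loop stands in for the
"action" that a lex-style scanner would run on each parenthesis token.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the quoted post): given a string and the
# offset of a '(', return the offset just past the matching ')', or
# nothing if the parentheses never balance.
sub matching_close {
    my ($text, $start) = @_;
    return unless substr($text, $start, 1) eq '(';

    my $depth = 0;              # the counter kept by the "action"
    pos($text) = $start;

    # Each iteration skips a run of non-parentheses, then consumes
    # exactly one parenthesis and adjusts the counter.
    while ($text =~ /\G [^()]* ([()])/gcx) {
        $depth += $1 eq '(' ? 1 : -1;
        return pos($text) if $depth == 0;
    }
    return;                     # ran out of input while still nested
}

my $s   = 'f(x, g(y)) + 1';
my $end = matching_close($s, 1);
print substr($s, 1, $end - 1), "\n" if defined $end;   # prints (x, g(y))
```

The regular expression here only recognises one token at a time; all of
the nesting logic lives in $depth.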


If you are using Perl, there is another option for solving the nested
parentheses problem, which is to write a recursive regular expression:


my $np;
$np = qr{
          \(
          (?:                  # Non-capture group
              (?> [^()]+ )     # Non-parens, matched w/o backtracking
            |
              (??{ $np })      # Group with matching parens
          )*
          \)
        }x;


This defines a regexp $np which matches:
'(' followed by zero or more iterations of:
      either a string with no parentheses
      or something which matches $np,
followed by ')'.


In effect you have a lazy description of an infinite regexp: the
(??{ $np }) construct is only expanded when the matcher reaches it, so no
counter is needed, nor is there a limit on the depth of parentheses.
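
As a rough check of how the pattern behaves, here is a small
self-contained script. The helper names is_balanced_group and first_group
are my own, purely for illustration; the pattern itself is the one given
above.

```perl
use strict;
use warnings;

# The recursive pattern: one balanced, parenthesised group.
my $np;
$np = qr{
    \(
    (?:
        (?> [^()]+ )    # run of non-parenthesis characters, no backtracking
      |
        (??{ $np })     # nested balanced group, expanded at match time
    )*
    \)
}x;

# Hypothetical helper: true if the whole string is one balanced group.
sub is_balanced_group {
    my ($text) = @_;
    return $text =~ /\A $np \z/x ? 1 : 0;
}

# Hypothetical helper: the first balanced group found in the string,
# or undef if there is none.
sub first_group {
    my ($text) = @_;
    return $text =~ /($np)/ ? $1 : undef;
}

print is_balanced_group('(a (b c) (d (e)))'), "\n";   # prints 1
print is_balanced_group('(a (b c)'), "\n";            # prints 0
print first_group('f(x, g(y)) + 1'), "\n";            # prints (x, g(y))
```

Note that $np has to be declared before it is assigned, so that the code
block inside (??{ ... }) can see the variable it refers to.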


--
Martin


Martin.Ward@durham.ac.uk http://www.cse.dmu.ac.uk/~mward/ Erdos number: 4
G.K.Chesterton web site: http://www.cse.dmu.ac.uk/~mward/gkc/

