Re: Regexps from shell wildcards

gnb@leo.bby.com.au (Gregory N. Bond)
Tue, 6 Apr 1993 23:23:09 GMT

          From comp.compilers

Related articles
Regexps from shell wilcards colas@opossum.inria.fr (1993-04-02)
Re: Regexps from shell wildcards imp@Boulder.ParcPlace.COM (Warner Losh) (1993-04-05)
Re: Regexps from shell wildcards kanze@us-es.sel.de (1993-04-05)
Re: Regexps from shell wildcards macrakis@osf.org (1993-04-05)
Re: Regexps from shell wildcards gnb@leo.bby.com.au (1993-04-06)

Newsgroups: comp.compilers
From: gnb@leo.bby.com.au (Gregory N. Bond)
Keywords: lex
Organization: Burdett, Buckeridge & Young, Melbourne, Australia
References: 93-04-012 93-04-018
Date: Tue, 6 Apr 1993 23:23:09 GMT

Warner Losh <imp@Boulder.ParcPlace.COM> writes:
      if you wanted to do /bin/csh shell expressions, then you'll find that
      things like "*.{c,C,H,h,cf}" cause problems and cause the output string
      length to grow wildly.


Worse than that, the csh {foo,bar} construct is not a file glob, and
in general it has semantics that cannot be duplicated with REs:
  - Order is preserved, so *.{h,c} (all the .h names, then all the .c
      names) is NOT the same as *.[hc].
  - It is expanded regardless of matches, so "echo {foo,bar}.c" will work
      whether or not foo.c or bar.c exist.
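
To make the second point concrete, here is a minimal sketch (Perl 5) of brace expansion as pure string rewriting. The helper name expand_braces is made up for illustration. It handles only non-nested, non-empty groups, and it is not how csh itself is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: expand the leftmost {a,b,...} group in a word,
# then recurse on each result. No file system access is involved, so
# the output is the same whether or not any of the names exist.
# Assumption: groups are not nested and not empty.
sub expand_braces {
    my ($word) = @_;
    if ($word =~ /^(.*?)\{([^{}]+)\}(.*)$/s) {
        my ($pre, $alts, $post) = ($1, $2, $3);
        # Limit -1 keeps a trailing empty alternative, as in {a,}
        return map { expand_braces($pre . $_ . $post) } split /,/, $alts, -1;
    }
    return ($word);
}

print join(' ', expand_braces('{foo,bar}.c')), "\n";   # foo.c bar.c
print join(' ', expand_braces('*.{h,c}')), "\n";       # *.h *.c
```

The alternatives come out in the order they were written, and any globbing would happen afterwards on each resulting word separately.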


Of course, in any one application these may not be a problem, and
more-or-less mechanical conversion to (foo|bar) might be acceptable.
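
One way such a mechanical conversion might look, as a rough sketch in Perl 5. The names braces_to_alternation and alternation are invented for this example. It assumes non-nested groups whose alternatives are literal text, and the rest of the regex is written by hand here rather than generated.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "a,b,c" into "(a|b|c)", quoting each
# alternative so that it matches literally.
sub alternation {
    my ($alts) = @_;
    return '(' . join('|', map { quotemeta($_) } split(/,/, $alts, -1)) . ')';
}

# Hypothetical helper: rewrite every non-nested {a,b,...} group.
# Text outside the groups is left untouched.
sub braces_to_alternation {
    my ($pat) = @_;
    $pat =~ s!\{([^{}]+)\}!alternation($1)!ge;
    return $pat;
}

my $alt = braces_to_alternation('{h,c}');
print "$alt\n";                                  # (h|c)

# The RE only filters; names come back in list order, not in h-then-c order.
my @names = ('main.c', 'defs.h', 'notes.txt');
my @hits  = grep { /^[^.].*\.$alt\z/ } @names;
print "@hits\n";                                 # main.c defs.h
```

The result accepts the same set of names as the brace pattern would, but both of the csh properties listed above are lost.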


Just as a hint, here is some Perl code I use to convert sh-type globs
to REs in a Perl package. The input glob pattern is known to contain
no '/' characters (handling those takes some "interesting" recursion).


I make no promises about this, but it hasn't failed me yet.


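    # Convert shell-style glob pattern to regex.
    # Backslash-escape characters to be matched literally.  Note that '-'
    # is escaped too, so ranges such as [a-z] are not supported.
    $pat =~ s/[.=<>+_\\-]/\\$&/g;
    # Glob wildcards: ? is any one character, * is any run of characters
    $pat =~ s/\?/./g;
    $pat =~ s/\*/.*/g;
    # Hide leading . from wildcards
    $pat =~ s/^\.\*/[^.].*/; # .* -> [^.].*
    $pat =~ s/^\.([^\*])/[^.]$1/; # .x -> [^.]x
    $pat =~ s/^\*/[^.]*/; # * -> [^.]* (every * is already .* by now)
    # Anchor the pattern
    $pat = "^$pat\$";
    # could do some optimising here, but leave it to perl!
    # e.g. "^.*" => ""
    # ".*$" => ""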
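
For anyone who wants to try it, the substitutions above can be wrapped in a sub and run against a few names. This is only a sketch in Perl 5. The name glob_to_re is made up for illustration, and the same caveats apply as for the code it copies.

```perl
use strict;
use warnings;

# Hypothetical wrapper: the same substitutions as posted, in the same order.
sub glob_to_re {
    my ($pat) = @_;
    $pat =~ s/[.=<>+_\\-]/\\$&/g;   # escape literal characters
    $pat =~ s/\?/./g;               # ? -> .
    $pat =~ s/\*/.*/g;              # * -> .*
    $pat =~ s/^\.\*/[^.].*/;        # leading wildcard must not match a dot
    $pat =~ s/^\.([^\*])/[^.]$1/;
    $pat =~ s/^\*/[^.]*/;
    return "^$pat\$";               # anchor at both ends
}

my $re = glob_to_re('*.c');
print "$re\n";                      # ^[^.].*\.c$

for my $name ('main.c', '.hidden.c', 'main.h') {
    printf "%-10s %s\n", $name, ($name =~ /$re/ ? 'match' : 'no match');
}
```

Only main.c matches: .hidden.c is rejected by the leading [^.], and main.h fails on the suffix.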
--
Gregory Bond <gnb@bby.com.au>
Burdett Buckeridge & Young Ltd Melbourne Australia
--

