Re: perl regular expression grammar

Ilmari Karonen <usenet11522@itz.pp.sci.fi>
23 Jul 2001 02:23:47 -0400

          From comp.compilers

From: Ilmari Karonen <usenet11522@itz.pp.sci.fi>
Newsgroups: comp.lang.perl.misc,comp.compilers
Date: 23 Jul 2001 02:23:47 -0400
Organization: (dis)Order of the Holy Spoon (or whatever)
References: 01-07-080
Keywords: parse
Posted-Date: 23 Jul 2001 02:23:46 EDT

Alan Oursland wrote:
>
>Here is the grammar:


Here are some comments:


><term> ::= "." | "$" | "^" | <char> | <set>


The "$" is handled as a special case by the regex parser. It's a tail
anchor only when followed by "(", ")", "|", the end of the regex, or a
whitespace character. Otherwise it's assumed to start a variable that
should be interpolated.


I may be misremembering the finer details, so I suggest you check the
code in toke.c: search for the string "tail anchor", and note that, as
far as I can remember, the code and the comments disagree on certain
minor details.
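
To make the rule concrete, here is a minimal sketch in Python (not Perl's
actual code) of the "$" decision as described above. The function name
classify_dollar and the exact whitespace set are assumptions made for
illustration; the real list lives in toke.c and may differ in detail.

```python
# Characters that, when they follow "$", make it a tail anchor according to
# the rule as remembered in this post. The whitespace set is an assumption.
ANCHOR_FOLLOWERS = set("()|") | set(" \t\r\n")


def classify_dollar(pattern: str, i: int) -> str:
    """Classify the "$" at index i as 'anchor' or 'interpolation'.

    Hypothetical helper: it does not handle a backslash-escaped "$",
    which a real lexer would have to consume before reaching this point.
    """
    if pattern[i] != "$":
        raise ValueError("no '$' at index %d" % i)
    # "$" at the very end of the regex is a tail anchor.
    if i + 1 == len(pattern):
        return "anchor"
    # Otherwise it is an anchor only before "(", ")", "|" or whitespace.
    if pattern[i + 1] in ANCHOR_FOLLOWERS:
        return "anchor"
    # Anything else is taken to start a variable to be interpolated.
    return "interpolation"
```

With this sketch, classify_dollar("foo$", 3) and classify_dollar("a$|b", 1)
both give 'anchor', while classify_dollar("$foo", 0) gives 'interpolation'.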




><char> ::= <non-meta> | "\"<escaped>
><non-meta> ::= any non-meta char
><escaped> ::= <meta>|<control>|<special>|<assert>


You might want to take advantage of the fact that "\" followed by any
non-word character always represents that literal character. This is
guaranteed, as is the fact that no non-backslashed word character can
ever be a metacharacter. (This is how quotemeta() can work.)
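
As a rough illustration of why that guarantee is enough, here is a Python
sketch of a quotemeta-like function. It is an approximation for ASCII input
only, not Perl's implementation: it backslashes every character that is
not in [A-Za-z0-9_] and leaves word characters alone.

```python
import re


def quotemeta(text: str) -> str:
    """Backslash every ASCII character that is not a word character.

    Sketch only: assumes ASCII input and ignores Perl's Unicode handling.
    Word characters are never metacharacters unless backslashed, so they
    are safe as they are; any other character is safe once backslashed.
    """
    return re.sub(r"([^A-Za-z0-9_])", r"\\\1", text)


print(quotemeta("a.b*c"))     # prints a\.b\*c
print(quotemeta("foo_bar1"))  # prints foo_bar1
```

The function needs no table of metacharacters, which is the point: a
grammar can likewise treat "\" plus any non-word character as one literal.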


Use the source. It will help you, assuming it doesn't make you give up
the task in disgust.


--
Ilmari Karonen -- http://www.sci.fi/~iltzu/

