perl regular expression grammar

alan@oursland.net (Alan Oursland)
17 Jul 2001 23:28:53 -0400

          From comp.compilers


From: alan@oursland.net (Alan Oursland)
Newsgroups: comp.lang.perl.misc,comp.compilers
Date: 17 Jul 2001 23:28:53 -0400
Organization: SBC Internet Services
Keywords: syntax
Posted-Date: 17 Jul 2001 23:28:53 EDT

I've been looking for a complete Perl 5 regular expression grammar
and, having been unsuccessful in my search, have attempted to write
one myself. I was wondering if anyone could help me find any errors in
it (excluding grammar syntax errors). I've left embedded modifiers out
of the grammar because I'm not sure how they fit in. I've also skimmed
over the non-meta character production. One area I am confused about
is the "\c[" control character (described at
http://www.perldoc.com/perl5.6/pod/perlre.html). How does this work?
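
As a point of reference for the "\c[" question: in the Perl documentation, "\c[" appears to be one example of the general "\cX" form, where X is a single character and the result is the corresponding control character ("\c[" being ESC). The sketch below is in Python rather than Perl and assumes an ASCII platform; the function name control_char is made up for illustration. It shows the usual description of the mapping (uppercase X, then flip the 0x40 bit), which would suggest that <controlchar> is "c" followed by one character rather than the literal "c[".

```python
def control_char(x: str) -> str:
    """Return the character that \\cX would denote, assuming ASCII.

    Hypothetical helper for illustration only: uppercases X and flips
    bit 0x40, which is how the \\cX mapping is commonly described.
    """
    if len(x) != 1 or ord(x) > 127:
        raise ValueError("expected a single ASCII character")
    return chr(ord(x.upper()) ^ 0x40)


if __name__ == "__main__":
    # Print the code point produced for a few sample characters.
    for sample in "[aG?":
        print(repr(sample), hex(ord(control_char(sample))))
```

Under that reading, "\c[" and "\e" would name the same character, and "\cA" and "\ca" would both give code point 1.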


Alan Oursland


Here is the grammar:
<re> ::= <union>
<union> ::= <concat>"|"<union> | <concat>
<concat> ::= <quant><concat> | <quant>
<quant> ::= <group>"*" | <group>"+" | <group>"?" | <group>"{"<bound>"}" | <group>
<group> ::= "("<re>")" | <term>
<term> ::= "." | "$" | "^" | <char> | <set>
<bound> ::= <num> | <num>"," | <num>","<num>
<char> ::= <non-meta> | "\"<escaped>
<non-meta> ::= any non-meta char
<escaped> ::= <meta>|<control>|<special>|<assert>
<meta> ::= "."|"^"|"$"|"?"|"*"|"+"|"|"|"["|"("|")"|"\"|"{"
<control> ::= "t"|"n"|"r"|"f"|"a"|"e"|"l"|"u"|"L"|"U"|"E"|"Q"
<special> ::= <backoctal>|<hexchar>|<controlchar>|<class>
<assert> ::= "b"|"B"|"A"|"z"|"Z"|"G"
<backoctal> ::= <digit> | <digit><digit> | "0"<oct><oct> | "+" | "&" | "`" | "'"
<hexchar> ::= "x"<hex><hex> | "x{"<hex><hex><hex><hex>"}"
<controlchar> ::= "c["
<namedchar> ::= "N{"<name>"}"
<class> ::= "w"|"W"|"s"|"S"|"d"|"D"|"X"|"C" |"p"<name>|"P"<name>|"[:"<posixclass>":]"|"[:^"<posixclass>":]"
<posixclass> ::= "alpha"|"alnum"|"ascii"|"cntrl"|"digit"|"graph"|"lower"|"print"|"punct"|"space"|"upper"|"word"|"xdigit"
<name> ::= <unicodeclass>
<unicodeclass> ::= "IsAlpha"|"IsAlnum"|"IsASCII"|"IsCntrl"|"IsDigit"|"IsGraph"|"IsLower"|"IsPrint"|"IsPunct"|"IsSpace"|"IsUpper"|"IsWord"|"IsXDigit"
<set> ::= "[" <set-items> "]" | "[^" <set-items> "]"
<set-items> ::= <set-item> | <set-item> <set-items>
<set-item> ::= <range> | <char>
<range> ::= <char> "-" <char>
<num> ::= <digit><num> | <digit>
<oct> ::= "0"|"1"|"2"|"3"|"4"|"5"|"6"|"7"
<digit> ::= "0"|"1"|"2"|"3"|"4"|"5"|"6"|"7"|"8"|"9"
<hex> ::= "0"|"1"|"2"|"3"|"4"|"5"|"6"|"7"|"8"|"9"|"a"|"b"|"c"|"d"|"e"|"f"|"A"|"B"|"C"|"D"|"E"|"F"
<mod> ::= "\i"|"\m"|"\s"|"\x"
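
One way to look for errors in a grammar like the one above is to turn its core productions into a small recognizer and feed it patterns that Perl is known to accept. The following is only a sketch, in Python, under several assumptions: it covers <union>, <concat>, <quant>, <group>, <term>, <bound>, <set> and <num>; it treats "non-meta" as any character not listed in <meta>; and it accepts any single character after a backslash instead of expanding <escaped>. The names Recognizer and accepts are invented for this example.

```python
META = set(".^$?*+|[()\\{")   # the characters listed in <meta>
DIGITS = set("0123456789")


class Recognizer:
    """Recursive-descent recognizer for a simplified subset of the grammar."""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def eat(self, ch):
        if self.peek() == ch:
            self.pos += 1
            return True
        return False

    def union(self):                      # <union> (also stands in for <re>)
        if not self.concat():
            return False
        while self.eat("|"):
            if not self.concat():
                return False
        return True

    def concat(self):                     # <concat>: one or more <quant>
        if not self.quant():
            return False
        while self.peek() not in ("", "|", ")"):
            if not self.quant():
                return False
        return True

    def quant(self):                      # <quant>: at most one quantifier
        if not self.group():
            return False
        if self.eat("*") or self.eat("+") or self.eat("?"):
            return True
        if self.eat("{"):
            return self.bound() and self.eat("}")
        return True

    def group(self):                      # <group>
        if self.eat("("):
            return self.union() and self.eat(")")
        return self.term()

    def term(self):                       # <term>
        if self.eat(".") or self.eat("$") or self.eat("^"):
            return True
        if self.peek() == "[":
            return self.char_set()
        return self.char()

    def char(self):                       # <char>, with <escaped> simplified
        if self.eat("\\"):
            if self.peek() == "":
                return False
            self.pos += 1                 # assumption: any escaped char is fine
            return True
        if self.peek() == "" or self.peek() in META:
            return False
        self.pos += 1
        return True

    def char_set(self):                   # <set>, <set-items>, <range>
        self.eat("[")
        self.eat("^")
        items = 0
        while self.peek() not in ("", "]"):
            if not self.char():
                return False
            nxt = self.text[self.pos + 1:self.pos + 2]
            if self.peek() == "-" and nxt not in ("", "]"):
                self.pos += 1             # consume "-" of a <range>
                if not self.char():
                    return False
            items += 1
        return items > 0 and self.eat("]")

    def num(self):                        # <num>: one or more digits
        start = self.pos
        while self.peek() in DIGITS:
            self.pos += 1
        return self.pos > start

    def bound(self):                      # <bound>: n | n, | n,m
        if not self.num():
            return False
        if self.eat(","):
            self.num()                    # upper bound is optional
        return True


def accepts(pattern):
    """True if the whole pattern is derivable from the simplified grammar."""
    r = Recognizer(pattern)
    return r.union() and r.pos == len(pattern)
```

Trying a few patterns with accepts gives a rough idea of where the grammar and Perl may disagree. For instance, this sketch rejects "a*?" because <quant> allows only a single quantifier, whereas Perl accepts a trailing "?" for minimal matching; it also rejects "[.]" because <set-item> reuses <char>, which excludes unescaped metacharacters.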


