|A BNF grammar for a simplistic regex language email@example.com (2004-05-30)|
|Re: A BNF grammar for a simplistic regex language firstname.lastname@example.org (SM Ryan) (2004-06-06)|
|Date:||30 May 2004 13:27:16 -0400|
|Posted-Date:||30 May 2004 13:27:16 EDT|
For educational purposes I'm writing a regular expression engine. My
regular expressions are restricted to a simplistic subset: the
operators are * (highest precedence), . (concatenation) and | (lowest
Parentheses are allowed, and only single chars a..z act as atoms. For
instance, the following is a legal regex in this language:
(a|b)*.a.b.b (which means "0 or more of a or b, and then abb")
The parse tree (rotated 90 degrees counterclockwise) for this regex
I want to parse these regexes into a parse tree with a recursive
descent parser. For this purpose, I want to define a BNF grammar for
them first. For now, I've came up with the following:
expr ::= concat '|' concat
concat ::= rep '.' rep
rep ::= atom '*'
atom ::= '(' expr ')'
char ::= a..z
Does this look OK to you ? Does this syntax correctly specifies the
simple regexes I described above ?
To me the grammar looks suitable for RD parsing (i.e. not left
recursive), am I right ?
Thanks a lot in advance
Return to the
Search the comp.compilers archives again.