Related articles |
---|
A BNF grammar for a simplistic regex language spur4444@yahoo.com (2004-05-30) |
Re: A BNF grammar for a simplistic regex language wyrmwif@tsoft.com (SM Ryan) (2004-06-06) |
From: | spur4444@yahoo.com (Spur) |
Newsgroups: | comp.compilers |
Date: | 30 May 2004 13:27:16 -0400 |
Organization: | http://groups.google.com |
Keywords: | parse, question |
Posted-Date: | 30 May 2004 13:27:16 EDT |
Hello All,
For educational purposes I'm writing a regular expression engine. My
regular expressions are restricted to a simplistic subset: the
operators are * (highest precedence), . (concatenation) and | (lowest
precedence).
Parentheses are allowed, and only single chars a..z act as atoms. For
instance, the following is a legal regex in this language:
(a|b)*.a.b.b (which means "0 or more of a or b, and then abb")
The parse tree (rotated 90 degrees counterclockwise) for this regex
should be:
b
. b
. a
.
* a
|
b
I want to parse these regexes into a parse tree with a recursive
descent parser. For this purpose, I want to define a BNF grammar for
them first. For now, I've came up with the following:
expr ::= concat '|' concat
concat
concat ::= rep '.' rep
| rep
rep ::= atom '*'
| atom
atom ::= '(' expr ')'
| char
char ::= a..z
Does this look OK to you ? Does this syntax correctly specifies the
simple regexes I described above ?
To me the grammar looks suitable for RD parsing (i.e. not left
recursive), am I right ?
Thanks a lot in advance
Eli
Return to the
comp.compilers page.
Search the
comp.compilers archives again.