Related articles |
---|
Regular expression grammar? bediger@teal.csn.net (Bruce Ediger) (1999-09-16) |
Re: Regular expression grammar? jjan@cs.rug.nl (J.H.Jongejan) (1999-09-20) |
Re: Regular expression grammar? terryg@uswest.net (1999-09-20) |
Re: Regular expression grammar? lex@cc.gatech.edu (1999-09-20) |
Re: Regular expression grammar? cbarron3@ix.netcom.com (1999-09-20) |
Re: Regular expression grammar? zalman@netcom18.netcom.com (Zalman Stern) (1999-09-24) |
Re: Regular expression grammar? zalman@netcom18.netcom.com (Zalman Stern) (1999-09-24) |
Re: Regular expression grammar? zalman@netcom18.netcom.com (Zalman Stern) (1999-09-24) |
Re: Regular expression grammar? cbarron3@ix.netcom.com (1999-09-27) |
From: | lex@cc.gatech.edu |
Newsgroups: | comp.compilers |
Date: | 20 Sep 1999 12:00:25 -0400 |
Organization: | Georgia Institute of Technology, Atlanta GA, USA |
References: | 99-09-051 |
Keywords: | lex |
Bruce Ediger <bediger@teal.csn.net> writes:
[ grammers for regular expressions ]
> --------
> %token SYMBOL OR STAR LPAREN RPAREN
> %%
> regexp
> : SYMBOL /* any letter of the alphabet (c) */
> | regexp OR regexp /* alternation (d) */
> | regexp regexp /* concatenation (d) */
> | regexp STAR /* Kleene Closure (d) */
> | RPAREN regexp LPAREN /* grouping */
> ;
> %%
> --------
> This gives a bunch of shift/reduce conflicts. I can't seem to get around
> them by putting in operator precedences, or the analogy of algebraic
> terms and factors.
The simplest approach would probably be to play with the precedence
and associativity features that bison and perhaps yacc have. However,
you can do it with a straight BNF description, as well. To this end,
your idea of algebraic terms and factors actually seems to be the key.
For regular expressions, the operators are *, |, (), and
concatenation. Their precendense goes, I think, like this:
|
concatenation
*
()
(I might have | and concatenation swapped -- it's easy to tweak if
this is wrong). Assign the following precedence levels:
| 1
conc. 2
* 3
() 4 ("primary")
Let "r3" be an expression with only level 3 and higher operators, "r2"
be one with only level 2 and higher, etc. Then the following grammer
produces no conflicts:
%token SYMBOL OR STAR LPAREN RPAREN
%%
regexp : r1 ;
r1 : r2 | r1 OR r2 ;
r2 : r3 | r2 r3 ;
r3 : primary | r3 STAR ;
primary : parenthesized | SYMBOL ;
parenthesized : LPAREN regexp RPAREN ;
%%
-Lex Spoon
Return to the
comp.compilers page.
Search the
comp.compilers archives again.