|Compiler books for beginners? firstname.lastname@example.org (2002-05-27)|
|Re: Compiler books for beginners? email@example.com (Ralph Boland) (2002-05-27)|
|Re: regular expression operators in CF grammars firstname.lastname@example.org (SLK Parsers) (2002-06-13)|
|Re: regular expression operators in CF grammars email@example.com (Neelakantan Krishnaswami) (2002-06-14)|
|Re: regular expression operators in CF grammars firstname.lastname@example.org (Ralph Boland) (2002-06-17)|
|Re: regular expression operators in CF grammars email@example.com (Chris F Clark) (2002-06-17)|
|Re: regular expression operators in CF grammars firstname.lastname@example.org (2002-06-20)|
|Re: regular expression operators in CF grammars email@example.com (2002-06-28)|
|Re: regular expression operators in CF grammars firstname.lastname@example.org (SLK Parsers) (2002-06-28)|
|[17 later articles]|
|From:||"SLK Parsers" <email@example.com>|
|Date:||13 Jun 2002 21:00:19 -0400|
|Posted-Date:||13 Jun 2002 21:00:19 EDT|
> While this question is being asked I would like to know if any
> compiler/parser textbooks deal with grammars that use regular
> expression operators. This is particularly relavent to LL based
> parsing because most LL based parsing tools now support extended CFGs.
> Why has LR based parser generator tools that directly support (that do
> not translate the grammar into non-extended form) regular expression
> operators so uncommon anyway; other than that they are fairly complex?
The classical, and proven grammar analysis algorithms in parsing theory are
based on definitions that do not include a notion of regular expressions as
in EBNF. Adhoc programming methods that internally translate EBNF run the
risk of introducing incorrectness.
The free-form notation of SLK allows the use of RE operators as
documentation, but not as non-BNF shorthand. I generally append _* or _+ to
a nonterminal to indicate zero-or-more and one-or-more as in
A --> B_* C_+
However, this still requires the defining productions
B_* --> B B_* | epsilon
C_+ --> C C_* ...
Reference grammars often use the convention of appending opt or _opt to a
nonterminal to handle the zero-or-one case. SLK has an option that will
generate the productions for any nonterminal that ends in opt, if it is not
already defined. (Probably should add an option to generate the _* and _+
productions as well.) SLK also has several other grammar transformation
options to help convert a grammar to LL form.
As a matter of personal taste, I prefer symbolic names to EBNF notation. I
find that the production
A --> expression_opt term_opt factor_opt E
is easier to read than
A --> [ K L M N O P Q R S T ] [ X Y Z ] [ a '[' b c d e ']' ] E
though this is obviously an extreme example.
Return to the
Search the comp.compilers archives again.