Re: regular expression operators in CF grammars

"SLK Parsers" <parsersinc@yahoo.com>
13 Jun 2002 21:00:19 -0400

          From comp.compilers

Related articles
Compiler books for beginners? bart@dynarec.com (2002-05-27)
Re: Compiler books for beginners? rboland@unb.ca (Ralph Boland) (2002-05-27)
Re: regular expression operators in CF grammars parsersinc@yahoo.com (SLK Parsers) (2002-06-13)
Re: regular expression operators in CF grammars neelk@alum.mit.edu (Neelakantan Krishnaswami) (2002-06-14)
Re: regular expression operators in CF grammars rboland@unb.ca (Ralph Boland) (2002-06-17)
Re: regular expression operators in CF grammars cfc@world.std.com (Chris F Clark) (2002-06-17)
Re: regular expression operators in CF grammars kgw-news@stiscan.com (2002-06-20)
Re: regular expression operators in CF grammars kgw-news@stiscan.com (2002-06-28)
Re: regular expression operators in CF grammars parsersinc@yahoo.com (SLK Parsers) (2002-06-28)
[17 later articles]
| List of all articles for this month |

From: "SLK Parsers" <parsersinc@yahoo.com>
Newsgroups: comp.compilers
Date: 13 Jun 2002 21:00:19 -0400
Organization: Parsers Inc.
References: 02-05-142 02-05-151
Keywords: parse
Posted-Date: 13 Jun 2002 21:00:19 EDT

> While this question is being asked I would like to know if any
> compiler/parser textbooks deal with grammars that use regular
> expression operators. This is particularly relavent to LL based
> parsing because most LL based parsing tools now support extended CFGs.
> Why has LR based parser generator tools that directly support (that do
> not translate the grammar into non-extended form) regular expression
> operators so uncommon anyway; other than that they are fairly complex?


The classical, and proven grammar analysis algorithms in parsing theory are
based on definitions that do not include a notion of regular expressions as
in EBNF. Adhoc programming methods that internally translate EBNF run the
risk of introducing incorrectness.


The free-form notation of SLK allows the use of RE operators as
documentation, but not as non-BNF shorthand. I generally append _* or _+ to
a nonterminal to indicate zero-or-more and one-or-more as in


A --> B_* C_+


However, this still requires the defining productions


B_* --> B B_* | epsilon


C_+ --> C C_* ...


Reference grammars often use the convention of appending opt or _opt to a
nonterminal to handle the zero-or-one case. SLK has an option that will
generate the productions for any nonterminal that ends in opt, if it is not
already defined. (Probably should add an option to generate the _* and _+
productions as well.) SLK also has several other grammar transformation
options to help convert a grammar to LL form.


As a matter of personal taste, I prefer symbolic names to EBNF notation. I
find that the production


A --> expression_opt term_opt factor_opt E


is easier to read than


A --> [ K L M N O P Q R S T ] [ X Y Z ] [ a '[' b c d e ']' ] E


though this is obviously an extreme example.


http://parsers.org/slk


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.