RE: parsing EBNF< was Compiler books for beginners?

"Quinn Tyler Jackson" <qjackson@shaw.ca>
2 Jun 2002 01:39:11 -0400

          From comp.compilers

Related articles
RE: parsing EBNF< was Compiler books for beginners? qjackson@shaw.ca (Quinn Tyler Jackson) (2002-06-02)
| List of all articles for this month |

From: "Quinn Tyler Jackson" <qjackson@shaw.ca>
Newsgroups: comp.compilers
Date: 2 Jun 2002 01:39:11 -0400
Organization: Compilers Central
Keywords: parse
Posted-Date: 02 Jun 2002 01:39:11 EDT
In-reply-to: 02-05-151

> While this question is being asked I would like to know if any
> compiler/parser textbooks deal with grammars that use regular
> expression operators. This is particularly relavent to LL based
> parsing because most LL based parsing tools now support extended CFGs.
> Note that LR based parsing algorithms can also be modified to support
> extended CFGs but this is more difficult and less likely to be found
> in a textbook.
>
> Why has LR based parser generator tools that directly support (that do
> not translate the grammar into non-extended form) regular expression
> operators so uncommon anyway; other than that they are fairly complex?


Chris Clark and Joachim Durchholz have already provided two very good
replies to this, but I'll throw in my two bits anyway.


Meta-S now accepts grammars in A-BNF, which is very much like EBNF,
and it does this by translating to LPM (Language for Pattern
Matching). Rules in the form:


A ::= [B] [C]


Are somewhat more difficult to manipulate in tree traversal code and in
reductions.


I simplified this by allowing nodes to be accessed by their production name,
such that it becomes possible to simply return NULL if a given node was not
included in the tree. A reduction event handler for A might read:


if($MATCHED())
{
if($NCHILD("B") != NULL)
{
// we know that [B] did not derive lambda
}
}


$CONTINUE();


This simplifies the question of reductions. In tree traversal, rather than use
positions in the $1 form, Meta-S uses named nodes again, such that:


if(node.GetChild("B") != NULL)
{
}


For the most part, this method suffices.


The primary advantage I have found in allowing RE style syntax in
grammars without conversion is that recursion depth is reduced for
long lists, since iteration is used instead of the
right-recursive-terminated-by-epsilon style. This also simplifies
reduction code and tree traversal code, since handling lists of things
such as parameters is done in a loop, rather than by traversing a
chain of recursive children nodes. (And the question of what order the
children are in also becomes moot.)


This means that


A ::= C<+>


Will always have the tree:


A
C
C
C


Rather than something more convoluted.


--
Quinn Tyler Jackson
http://QuinnTylerJackson.n3.net/


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.