29 Mar 2007 00:57:34 -0400

Related articles |
---|

Bison Parser negate production rules royzlife@gmail.com (2007-03-23) |

Re: Bison Parser negate production rules gneuner2@comcast.net (George Neuner) (2007-03-26) |

Re: Bison Parser negate production rules cfc@shell01.TheWorld.com (Chris F Clark) (2007-03-26) |

Re: Bison Parser negate production rules mco333@sympatico.ca (mco333) (2007-03-27) |

Re: Bison Parser negate production rules gneuner2@comcast.net (George Neuner) (2007-03-27) |

Re: Bison Parser negate production rules cfc@shell01.TheWorld.com (Chris F Clark) (2007-03-29) |

Re: Bison Parser negate production rules Meyer-Eltz@t-online.de (Detlef Meyer-Eltz) (2007-03-29) |

Re: Bison Parser negate production rules gneuner2@comcast.net (George Neuner) (2007-03-29) |

Re: Bison Parser negate production rules cfc@shell01.TheWorld.com (Chris F Clark) (2007-04-02) |

From: | Chris F Clark <cfc@shell01.TheWorld.com> |

Newsgroups: | comp.compilers |

Date: | 29 Mar 2007 00:57:34 -0400 |

Organization: | The World Public Access UNIX, Brookline, MA |

References: | 07-03-084 07-03-090 07-03-094 07-03-097 |

Keywords: | parse, theory |

Posted-Date: | 29 Mar 2007 00:57:34 EDT |

George Neuner <gneuner2@comcast.net> writes:

*> On 26 Mar 2007 14:26:17 -0400, Chris F Clark*

*> <cfc@shell01.TheWorld.com> wrote:*

*>*

*>>Now, the "modern" parsing methodologies like GLR, PEGs, and predicated*

*>>grammars are closed under complement, so if you use a technique like*

*>>that, you can add a NOT rule and still create a parser.*

*>*

*> I'd appreciate some references.*

Try this link found on google:

http://research.microsoft.com/conferences/pldi06/presentations/rats_pldi06.pdf

If you are looking for tools that support the various methodologies:

For GLR, look into ADF+SDF and DMS Toolkit

For PEG, look into RATS

For predicated parsers, look at ANTLR, Yacc++, GrammarForge (aka Meta-S)

*> It seems to me that the complement of a rule can't be concretely*

*> defined without limiting the universe to those token strings which*

*> would be legal in the language defined by the complete set of rules.*

Yes, all closures are aginst Sigma* (sigma being the vocbulary).

*> It also seems that with such limitation, the complemented rule should*

*> be parsable using the same model given sufficient lookahead. That is,*

*> the restriction means the complement subset is just the language minus*

*> the subset which is the rule.*

Yes, but there is no complement, intersect, or difference operator in

BNF. Thus, there is no way to say "minus" in a context-free grammar.

Moreover, it can be proven, that you is no algorithm that can

construct the complement of an arbitrary context-free grammar. So,

although you can do "minus" is some simple cases, you can't do it

generally.

However, the technique you just described is the essential way one

specifies the solution in predicated grammars (or PEG), although some

notations offer direct operators (e.g. we do in Yacc++) to make the

expression more terse. And, this gets to the heart of the question

that started this thread. In a predicated (or PEG) grammar, one can

say do this rule if it applies, and if not do the next rule. This

implies an ordering of the rules.

So, taking the original example:

A: B;

B: C

| D

| E

| Sigma* // parse anything

;

Using a predicated grammar, you can force the rules to be ordered and

tested in order. In PEGs, the rules are always ordered, so this

always happens. (More on that later.)

So, what does one do to make the rules ordered in a predicated

grammar, add predicates. I'll use Yacc++ notation, which is "x >> y",

for check predicate x and if it matches, apply y. The ANTLR notation

is essentially to replace the >> with ?, but that is a normal regular

expression operator, so we wanted something unique.

In each case, we add a predicate that matches exactly the same string

we want to reduce, as in:

B: C >> C

| D >> D

| E >> E

| Sigma* // parse anything

; // in a PEG, the first rule is always treated like this rule.

If when trying to parse a B you see a C, then parse the B as a C.

If you don't see a C, but instead see a D, then parse the B as a D.

If you don't see a C or D, but see an E, then parse the B as an E.

If you see none, of the above, parse a B as whatever you see.

Now, the one gotcha is that predicates (or PEGs, which are essentially

a use of predicates throughout a grammar) are not a panacea. In

particular, they hide ambiguities. If in the above example, C and D

were ambiguous, that is there is some string that is both a C and a D,

then no error message is generated because the parser will always

treat it as a C. However, if the case is subtle enough, someone

writing the string might intend it to be a D and not realize that it

is also a C. There is no algorithm to tell you (in all cases) if the

intersection of two arbitrary context-free grammars (i.e. C and D) is

empty (they are unambiguous). The LR (and LL) restrictions guarantee

that for the cases they can handle, that the rules are unambiguous,

but the LR restrictions are what predicated grammars (and PEGs) are

trying to get-around (loosen).

*>>In fact, Yacc++ can be thought of as taking Yacc/Bison and adding in*

*>>regular expressions and predicates (then cleaning up the syntax a*

*>>little). In it we have a not predicate that allows exactly what you*

*>>are asking for. You will find similar functionality in PCCTS/ANTLR*

*>>although the underlying technology is LL not LR for them, so you*

*>>cannot use left recursion (or precedence rules).*

*>*

*> I'm not familiar with Yacc++, but I do use ANTLR and I was unaware*

*> that it has this capability and I don't see any (obvious) syntax for*

*> it in the manual. Do you have a cite or an example?*

Here is a wikipedia pointer that was on my first page of a google

search for: ANTLR predicate.

http://en.wikipedia.org/wiki/Syntactic_predicate

Hope this helps,

-Chris

*****************************************************************************

Chris Clark Internet : compres@world.std.com

Compiler Resources, Inc. Web Site : http://world.std.com/~compres

23 Bailey Rd voice : (508) 435-5016

Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)

------------------------------------------------------------------------------

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.