Related articles |
---|
[17 earlier articles] |
Re: Parsing Expression Grammar cleos@nb.sympatico.ca (Cleo Saulnier) (2005-09-17) |
Re: Parsing Expression Grammar DrDiettrich@compuserve.de (Hans-Peter Diettrich) (2005-09-17) |
Re: Parsing Expression Grammar Meyer-Eltz@t-online.de (Detlef Meyer-Eltz) (2005-09-18) |
Re: Parsing Expression Grammar cleos@nb.sympatico.ca (Cleo Saulnier) (2005-09-22) |
Re: Parsing Expression Grammar schmitz@i3s.unice.fr (Sylvain Schmitz) (2005-09-22) |
Re: Parsing Expression Grammar schmitz@i3s.unice.fr (Sylvain Schmitz) (2005-09-22) |
Re: Parsing Expression Grammar rsc@swtch.com (Russ Cox) (2005-09-22) |
Re: Parsing Expression Grammar cfc@shell01.TheWorld.com (Chris F Clark) (2005-09-22) |
Re: Parsing Expression Grammar schmitz@i3s.unice.fr (Sylvain Schmitz) (2005-09-22) |
Re: Parsing Expression Grammar paul@parsetec.com (Paul Mann) (2005-09-22) |
Re: Parsing Expression Grammar cfc@shell01.TheWorld.com (Chris F Clark) (2005-09-23) |
Re: Parsing Expression Grammar gneuner2@comcast.net (George Neuner) (2005-09-25) |
Re: Parsing Expression Grammar cfc@world.std.com (Chris F Clark) (2005-09-25) |
[4 later articles] |
From: | Russ Cox <rsc@swtch.com> |
Newsgroups: | comp.compilers |
Date: | 22 Sep 2005 23:49:16 -0400 |
Organization: | Compilers Central |
References: | <432FE722.1070801@i3s.unice.fr> |
Keywords: | parse |
Posted-Date: | 22 Sep 2005 23:49:15 EDT |
Forwarded with permission.
From: Bryan Ford <baford@mit.edu>
Date: Sep 20, 2005 5:43 PM
Subject: Re: Fwd: Parsing Expression Grammar
> All this thread on PEGs has convinced me of giving them a better look.
> But something isn't quite clear; in his ICFP'03 paper, Bryan Ford
> presented an unambiguous CFG that could not be recognized by his packrat
> parser:
>
> P -> x P x | x
>
> (this CFG is especially interesting for it has a quadratic parsing time
> complexity using an Earley parser).
> I have tried "P <- x P x / x" to see what would happen, and indeed,
> the packrat parser is only able to parse sequences of 2^n - 1 "x"
> symbols, n > 0.
>
> This problem does not seem to be addressed in his POPL'04 paper; the
> well-formed definition, if I understand it correctly, is only concerned
> about left recursion and does not reject the above grammar.
The fact that it's possible to transform most CFGs into a syntactically
correct PEG by replacing "->" with "<-" and "|" with "/" does not by any
means imply that the PEG so generated will actually represent the same
language as the CFG did. CFGs and PEGs are different syntactic
meta-languages whose expressions have substantially different
interpretations, and I've never yet even tried to define a "direct" formal
relationship directly between similar-looking PEGs and CFGs. In other words,
the fact that the CFG "P -> x P x | x" looks a lot like the PEG "P <- x P x /
x" means absolutely nothing formally.
It's probably possible to define some such formal correspondence between a
very restrictive subset of CFGs, and a similar very restrictive subset of
PEGs, and prove that the corresponding grammars represent the same language.
The LL(0) CFGs might be a good starting point for building such a
correspondence. I haven't done so yet, however, nor has anyone else that I
know of. Needless to say, the CFG "P -> x P x | x" would probably not be
covered by this correspondence.
> Now, strictly speaking, the language recognized by "P <- x P x / x"
> *is* {x^(2^n - 1) | n > 0} (and that's quite an achievement in itself),
Actually, if you analyze the behavior of the above PEG further I think you'll
find that it represents the language {"x", "xxx"}. In other words, the
PEG-based parser fails to behave in the way the similar-looking CFG does for
input strings of 5 or more x's, because it deterministically makes the
"wrong" parsing choices in the middle of the string and never goes back to
explore different choices as a fully backtracking CFG-based parser might.
It's easy in this case to write a _different_ PEG that recognizes the same
language {x^(2^n - 1) | n > 0}, e.g., "P <- x x P | x", but that fact doesn't
seem to generalize to CFGs in general.
> Looking at this, I also wonder whether there exist a PEG for the CFL
> {ww^R | w \in {a,b}^*}.
My guess is no. The basic problem seems to be the lack of syntactic markers
in the middle of the string from which the PEG-based parser might start
building correct intermediate results.
> All this to get to the conclusion that the immediate translation of a
> CFG into a PEG by replacing "|" with "/" will not necesseraly yield the
> expected result. Even the simple grammar "S -> Ac, A -> a| ab" should
> not be translated "S <- Ac, A <- a / ab" but "S <- Ac, A <- ab / a",
> otherwise input "abc" would not be accepted.
Correct.
Cheers,
Bryan
Return to the
comp.compilers page.
Search the
comp.compilers archives again.