Re: Parsing Expression Grammar

Chris F Clark <cfc@shell01.TheWorld.com>
2 Sep 2005 14:20:42 -0400

From comp.compilers

Related articles
Parsing Expression Grammar skjaero@gmail.com (2005-08-31)
Re: Parsing Expression Grammar lfinsto1@gwdg.de (Laurence Finston) (2005-09-02)
Re: Parsing Expression Grammar rsc@swtch.com (Russ Cox) (2005-09-02)
*Re: Parsing Expression Grammar cfc@shell01.TheWorld.com (Chris F Clark)* (2005-09-02)**
Re: Parsing Expression Grammar pohjalai@cc.helsinki.fi (A Pietu Pohjalainen) (2005-09-02)
Re: Parsing Expression Grammar owong@castortech.com (Oliver Wong) (2005-09-03)
Re: Parsing Expression Grammar pohjalai@cc.helsinki.fi (A Pietu Pohjalainen) (2005-09-07)
Re: Parsing Expression Grammar paul@highpersoft.com (Paul Mann) (2005-09-07)
Re: Parsing Expression Grammar wclodius@lanl.gov (2005-09-10)
Re: Parsing Expression Grammar cfc@shell01.TheWorld.com (Chris F Clark) (2005-09-10)
[24 later articles]

| List of all articles for this month |

From:	Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups:	comp.compilers
Date:	2 Sep 2005 14:20:42 -0400
Organization:	The World Public Access UNIX, Brookline, MA
References:	05-08-115
Keywords:	parse
Posted-Date:	02 Sep 2005 14:20:42 EDT

Almost any grammar suitable to ANTLR (, PCCTS, or JavaCC) will be a
Parsing Expression Grammar, because any LL(1) grammar is also a
Parsing Expression Grammar (modulo some slight syntax issues). So, if
you want a "PEG" for Java. Find a Java grammar for ANTLR (or PCCTS)
and change the | operators into / operators and you are done.

A pure (unpredicated) LL(1) grammar, is simply a PEG where the
predicate operators are not used (and the alternatives are separated
by | rather than /).

In addition, any recursive descent parser generated from an LL(1)
grammar will be (atleast) as efficient as the corresponding packrat
parser, because there is no backtracking involved in a "pure" LL(1)
grammar and the memoization of a packrat parser is only to preserve
linear run time's in the presence of backtracking. thus, a packrat
parser and a recursive descent LL(1) parser do exactly the same steps
(omitting the memoization, which has no benefit) for an LL(1) grammar.

Moreover, the benefits listed for PEGs (unambiguity and linear time
requirements) are shared by LL(1) grammars.

Now, if you have an LR grammar for Java, say one for CUP or Yacc++,
there is more to do, because LR grammars are not subsets of PEGs.
However, almost all LL(1) grammars are LR(1) also. Thus, if you have
an LL(1) grammar, you can use your choice of parser generators: LL,
LR, or PEG.

There are also PEGs that are not LL grammars, but you don't need that
to parse Java. Therefore, the PEG technology is not a loss. In fact,
it is a good solution to some problems. Moreover, the algorithm for
processing PEGs is useful and provides some insight that wouldn't have
been seen if the technology hadn't been developed.

However, it is not necessary to rewrite all our grammars into PEG
form, LL(1) is perfectly adequate. The interesting case might come
from languages which don't have an LL(1) grammar (e.g. C++). For
them, it might be useful to have a PEG.

But, as a potential language designer, you need to think long and hard
about why you want to create a language which has no LL(1) grammar.

Hope this helps,
-Chris

*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
23 Bailey Rd voice : (508) 435-5016
Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.

Re: Parsing Expression Grammar

Chris F Clark <cfc@shell01.TheWorld.com>2 Sep 2005 14:20:42 -0400

Chris F Clark <cfc@shell01.TheWorld.com>
2 Sep 2005 14:20:42 -0400