predicate parsing

Ariel Meir Tamches <tamches@wam.umd.edu>
Wed, 21 Apr 1993 05:33:45 GMT

          From comp.compilers

Related articles
Re: A C++ Parser toolkit parrt@ecn.purdue.edu (1993-04-12)
Semantic predicates into grammar specifications xorian@solomon.technet.sg (1993-04-19)
predicate parsing tamches@wam.umd.edu (Ariel Meir Tamches) (1993-04-21)
Re: predicate parsing dave@cs.arizona.edu (1993-04-22)
predicate parsing bevan@computer-science.manchester.ac.uk (Stephen J Bevan) (1993-04-22)
Re: predicate parsing isckbk@nuscc.nus.sg (1993-04-23)
Re: predicate parsing simonh@swidev.demon.co.uk (1993-04-23)
Re: predicate parsing mauney@csljon.csl.ncsu.edu (1993-04-27)
Re: predicate parsing isckbk@nuscc.nus.sg (1993-04-28)
[4 later articles]
| List of all articles for this month |
Newsgroups: comp.compilers
From: Ariel Meir Tamches <tamches@wam.umd.edu>
Keywords: parse, tools
Organization: Compilers Central
References: 93-04-044 93-04-066
Date: Wed, 21 Apr 1993 05:33:45 GMT

Just would like to add my two cents worth to the predicate parsing
discussion that has been taking place among Terence Parr
(parrt@ecn.purdue.edu) and Xorian technology (xorian@solomon.technet.sg)
with their new PCCTS and LADE parsing tools.


One thing that I haven't seen noted is that it's just about impossible to
add predicate parsing to any type of bottom-up parsing engine. A formal
argument could be made by looking at LR handle generation (it depends on
the PDA stack, not leaving much room for predicates) but a more intuitive
one would simply note that top-down parsers, such as LL, have a control
flow completely analagous to "real-world" programming languages, such as C
(think "recursive descent"). If you have any version of PCCTS, examine
the C code it produces; it will look suspiciously like that which you
would have created had you been writing a recursive-descent parser from
scratch in C. Each grammar rule is represented by exactly one C function,
which among other things made it easy even in PCCTS 1.00 to effortlessly
have inherited and synthesized attributes. Inherited attributes are
call-by-value parameters to that procedure; synhesized variables are
call-by-reference.


Sure, Yacc has synthesized attributes, and there is a proof that inherited
attributes can be coaxed with the addition of extra temporary rules. The
dragon book states (page 341) that "The impression that top-down parsing
allows more flexibility for translation is shown to be false by a proof in
Brosgol [1974] that a translation scheme based on an LL(1) grammar can be
simulated during LR(1) parsing. Independently, Watt [1977] used marker
nonterminals to ensure that the values of inherited attributes appear on a
stack during parsing." I would have to contest this assertion, especially
in the light of semantic predicates, which the book seems never to have
heard of. The Dragon book may be right about being able to coax inherited
attributes in Yacc, but I don't think it can take the next big step:
having such attributes take an active role in parsing decisions.


To clarify, consider a conceptually simple but extremely powerful addition
to PCCTS: Taking a look at the beautiful recursive-descent C code produced
by PCCTS, one can't help but wonder if we can tap into the power of C from
our parser. Sure, Yacc has C actions, but they are a "scam" when it comes
to making parsing decisions. They simply can't take part in parsing,
unless one resorts to horrifying hacks such as tinkering with Yacc's PDA
stack from within a Yacc action. Predicate parsing was relatively simple
to add to pccts 1.06 (you can clarify me on this if I'm wrong, Terence)
because all it has to do is copy the predicate right into the C code,
overriding the normal "default predicate" which is simply good 'ole LL(k)
tests of FIRST(), etc.


It's hard to imagine how one can insert arbitrary predicates in a
bottom-up parser; for the same reason it's hard for bottom-up parsers to
have inherited attributes - their control flow is "screwy" - top-down is
the way to go. People used to (and maybe still do) laugh at LL parsers;
after all, LR(k) is a proper superset of LL(k). But when we consider how
easy it is to add predicates and inherited attributes to LL, we see that
"conventional wisdom" has been wrong.


On a theoretical note, it is very easy to prove that the latest version of
PCCTS (1.06 - 1.07 [as long as it has predicates]) is Turing-equivalent.
(I did it by writing a TM simulator in PCCTS, which I believe is a
sufficient condition by Church's Thesis.)


>From another angle, consider hacks used by Yacc to get complex
(non-LALR(1)) grammars to parse: I'm still trying to decipher the tricks
and hacks used in GNU C++ and Jim Roskind's Yacc grammars. Bottom-up
parsing is simply not the answer; with predicates used in the new breed of
top-down parsers, we now have a superior alternative.


Ariel Tamches
University of Maryland, College Park
tamches@cs.umd.edu
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.