Re: A C++ Parser toolkit

parrt@ecn.purdue.edu (Terence J Parr)
Mon, 12 Apr 1993 15:25:55 GMT

          From comp.compilers

Related articles
A C++ Parser toolkit moudgill@cs.cornell.EDU (Mayan Moudgill) (1993-04-11)
Re: A C++ Parser toolkit parrt@ecn.purdue.edu (1993-04-12)
Semantic predicates into grammar specifications xorian@solomon.technet.sg (1993-04-19)
Re: Semantic predicates into grammar specifications parrt@ecn.purdue.edu (1993-04-19)
predicate parsing tamches@wam.umd.edu (Ariel Meir Tamches) (1993-04-21)
Re: predicate parsing dave@cs.arizona.edu (1993-04-22)
Newsgroups: comp.compilers
From: parrt@ecn.purdue.edu (Terence J Parr)
Keywords: tools, PCCTS
Organization: Compilers Central
References: 93-04-042
Date: Mon, 12 Apr 1993 15:25:55 GMT

I'm very pleased by the posting of Mayan Moudgill
<moudgill@cs.cornell.EDU>; people are beginning to see that semantic
predicates are the way to recognize context-sensitive constructs rather
than having the lexer change the token type (ack!). Mayan writes:


> For instance, the following code:
>
>     int name(Parse& P)
>     {
>         Token t;
>
>         P, IDENT(t);
>         if( P && StbFind(t) ) {
>             return 1;
>         }
>         return 0;
>     }
>
>     int stmt(Parse & P)
>     {
>         Token t;
>
>         P, MATCH(name), "=", NUMBER(val);
>     }
>
> matches an identifier (i.e. [a-zA-Z_][a-zA-Z_0-9]*), '=', number string,
> but only if identifier is already in the symbol-table.


In PCCTS, we would write something akin to:


name : << IsVAR(LATEXT(1)) >>? IDENT
     ;

stat : name "=" NUMBER
     ;


where <<IsVAR(LATEXT(1))>>? is a semantic predicate; IsVAR is some
user-defined function and LATEXT(1) is the text of the first token of
lookahead. This example behaves exactly as Mayan outlines: IDENT is
matched as a name only if the predicate succeeds. We call this a
*validation* semantic predicate (syntactic predicates will arrive in the
next release of PCCTS). Predicates can also be used to distinguish
between two syntactically ambiguous productions (*disambiguating* semantic
predicates). E.g., let's add a production to stat to match a type name
followed by a declarator:


name : << IsVAR(LATEXT(1)) >>? IDENT
     ;

type : << IsTYPE(LATEXT(1)) >>? IDENT
     ;

stat : name "=" NUMBER
     | type declarator
     ;


In this case, IDENT predicts both productions of stat, so k=1 lookahead
alone cannot choose between them. However, ANTLR (the parser generator
of PCCTS) finds two *visible* predicates (one in name and the other in
type) that can be used to semantically disambiguate the productions of
stat. Hence, it *hoists* the predicates into the prediction expressions
for stat, thus resolving the conflict. Note that, with k=2, ANTLR could
uniquely predict stat's productions without predicates and would not
hoist the visible predicates.
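
To make the hoisting idea concrete, here is a rough hand-written C++
sketch of what a k=1 recursive-descent parser for the grammar above might
do. It is not ANTLR's generated output. The names Parser, Symbols, Kind,
Alt and classify are hypothetical, the symbol-table lookups merely stand in
for IsVAR and IsTYPE, and declarator is assumed to be a single IDENT
because the grammar above does not define it.

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token representation (not the PCCTS one).
enum class Kind { Ident, Assign, Number, End };
struct Token { Kind kind; std::string text; };

// Hypothetical symbol tables standing in for IsVAR / IsTYPE.
struct Symbols {
    std::set<std::string> vars, types;
    bool isVar(const std::string& s) const { return vars.count(s) != 0; }
    bool isType(const std::string& s) const { return types.count(s) != 0; }
};

enum class Alt { Assignment, Declaration };

class Parser {
public:
    Parser(std::vector<Token> toks, Symbols syms)
        : toks_(std::move(toks)), syms_(std::move(syms)) {}

    // stat : name "=" NUMBER | type declarator ;
    Alt stat() {
        // With k=1 both alternatives begin with IDENT, so the predicates
        // from name and type are evaluated here ("hoisted") to choose.
        const Token& la = LA(1);
        if (la.kind == Kind::Ident && syms_.isVar(la.text)) {
            name(); match(Kind::Assign); match(Kind::Number);
            return Alt::Assignment;
        }
        if (la.kind == Kind::Ident && syms_.isType(la.text)) {
            type(); declarator();
            return Alt::Declaration;
        }
        throw std::runtime_error("no viable alternative for stat");
    }

private:
    // name : << IsVAR(LATEXT(1)) >>? IDENT ;
    void name() {
        // Validation predicate: checked inside the rule before matching.
        if (!syms_.isVar(LA(1).text))
            throw std::runtime_error("predicate IsVAR failed");
        match(Kind::Ident);
    }

    // type : << IsTYPE(LATEXT(1)) >>? IDENT ;
    void type() {
        if (!syms_.isType(LA(1).text))
            throw std::runtime_error("predicate IsTYPE failed");
        match(Kind::Ident);
    }

    // Assumption: declarator is just one IDENT in this sketch.
    void declarator() { match(Kind::Ident); }

    void match(Kind k) {
        if (LA(1).kind != k) throw std::runtime_error("unexpected token");
        ++pos_;
    }

    // k-th token of lookahead (1-based); End once the input is exhausted.
    const Token& LA(std::size_t k) const {
        static const Token end{Kind::End, ""};
        std::size_t i = pos_ + k - 1;
        return i < toks_.size() ? toks_[i] : end;
    }

    std::vector<Token> toks_;
    Symbols syms_;
    std::size_t pos_ = 0;
};

// Convenience wrapper: report which production of stat was taken.
Alt classify(const std::vector<Token>& toks, const Symbols& syms) {
    Parser p(toks, syms);
    return p.stat();
}
```

With "x" registered as a variable and "T" as a type, classify would take
the first production for x = 42 and the second for T y, while an IDENT
found in neither table predicts no production and raises an error. A k=2
version could instead test whether the second token is "=", in which case
the predicates would be needed only for validation inside name and type.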


PCCTS is in the public domain and may be obtained by sending email to
pccts@ecn.purdue.edu with a blank "Subject:" line.


Terence Parr
Purdue University
School of Electrical Engineering