Newsgroups: comp.compilers
From: parrt@ecn.purdue.edu (Terence J Parr)
Keywords: tools, PCCTS
Organization: Compilers Central
References: 93-04-042
Date: Mon, 12 Apr 1993 15:25:55 GMT

I'm very pleased by the posting of Mayan Moudgill
<moudgill@cs.cornell.EDU>; people are beginning to see that semantic
predicates are the way to recognize context-sensitive constructs rather
than having the lexer change the token type (ack!). Mayan writes:
> For instance, the following code:
>
> int name(Parse& P)
> {
>     Token t;
>
>     P, IDENT(t);
>     if( P && StbFind(t) ) {
>         return 1;
>     }
>     return 0;
> }
>
> int stmt(Parse & P)
> {
>     Token t;
>
>     P, MATCH(name), "=", NUMBER(val);
> }
>
> matches an identifier (i.e. [a-zA-Z_][a-zA-Z_0-9]*), '=', number string,
> but only if identifier is already in the symbol-table.
In PCCTS, we would write something akin to:

    name : << IsVAR(LATEXT(1)) >>? IDENT
         ;
    stat : name "=" NUMBER
         ;

where <<IsVAR(LATEXT(1))>>? is a semantic predicate; IsVAR is some
user-defined function and LATEXT(1) is the text of the first token of
lookahead. This example behaves exactly as Mayan outlines. We call this
a *validation* semantic predicate (syntactic predicates are coming in the
next release of PCCTS). Predicates can also be used to distinguish
between two syntactically ambiguous productions (*disambiguating* semantic
predicates). For example, let's add a production to stat that matches a
type name followed by a declarator:

    name : << IsVAR(LATEXT(1)) >>? IDENT
         ;
    type : << IsTYPE(LATEXT(1)) >>? IDENT
         ;
    stat : name "=" NUMBER
         | type declarator
         ;

In this case, IDENT predicts both productions of stat, so k=1 lookahead is
syntactically insufficient. However, ANTLR (the parser generator of
PCCTS) finds two *visible* predicates (one in name and the other in type)
that can be used to semantically disambiguate the productions of stat.
Hence, it *hoists* the predicates into the prediction expressions
for stat, thus resolving the conflict. Note that with k=2, ANTLR could
uniquely predict stat's productions without predicates and would not hoist
the visible predicates.
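
To make the hoisting idea concrete, here is a rough hand-written C++
sketch of what a k=1 recursive-descent parser for the grammar above might
look like. It is not ANTLR's generated code. The Token, Kind, and Parser
types, the la() helper, the two symbol sets, and parseStat() are all
hypothetical names invented for this illustration, and declarator is
simplified to a single IDENT purely as an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical token kinds and token record (not PCCTS types).
enum class Kind { IDENT, ASSIGN, NUMBER, END };

struct Token {
    Kind kind;
    std::string text;
};

struct Parser {
    std::vector<Token> toks;       // token stream, terminated by Kind::END
    std::size_t pos = 0;           // index of the next unconsumed token
    std::set<std::string> vars;    // stand-in symbol table: variable names
    std::set<std::string> types;   // stand-in symbol table: type names

    // i-th token of lookahead (1-based); past the end, yields the END token.
    const Token& la(std::size_t i) const {
        assert(!toks.empty());
        std::size_t idx = pos + i - 1;
        return idx < toks.size() ? toks[idx] : toks.back();
    }

    // Stand-ins for the user-defined IsVAR / IsTYPE functions.
    bool isVar(const std::string& s) const { return vars.count(s) != 0; }
    bool isType(const std::string& s) const { return types.count(s) != 0; }

    // Consume one token of kind k, or fail without consuming anything.
    bool match(Kind k) {
        if (la(1).kind != k) return false;
        ++pos;
        return true;
    }

    // name : << IsVAR(LATEXT(1)) >>? IDENT ;   (validation predicate)
    bool name() { return isVar(la(1).text) && match(Kind::IDENT); }

    // type : << IsTYPE(LATEXT(1)) >>? IDENT ;  (validation predicate)
    bool type() { return isType(la(1).text) && match(Kind::IDENT); }

    // declarator : IDENT ;   (simplified for this sketch only)
    bool declarator() { return match(Kind::IDENT); }

    // stat : name "=" NUMBER | type declarator ;
    // Returns 1 or 2 for the alternative that matched, 0 on failure.
    int stat() {
        // Syntactic prediction: IDENT starts both alternatives.
        if (la(1).kind != Kind::IDENT) return 0;
        // Hoisted predicates: the tests written inside name and type are
        // copied up here, so the choice is made on semantic grounds.
        if (isVar(la(1).text))
            return (name() && match(Kind::ASSIGN) && match(Kind::NUMBER)) ? 1 : 0;
        if (isType(la(1).text))
            return (type() && declarator()) ? 2 : 0;
        return 0;  // neither predicate holds: no viable alternative
    }
};

// Convenience driver: "x" is a known variable, "T" a known type.
int parseStat(std::vector<Token> toks) {
    Parser p;
    p.vars = {"x"};
    p.types = {"T"};
    toks.push_back({Kind::END, ""});
    p.toks = std::move(toks);
    return p.stat();
}
```

With these assumptions, parseStat on the tokens for `x = 1` selects the
first alternative and on `T y` selects the second, while an identifier
found in neither set is rejected. If a name were in both sets, this sketch
would simply pick the first alternative; how a real tool reports or
resolves that case is outside what the sketch shows.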
PCCTS is in the public domain and may be obtained by sending email to
pccts@ecn.purdue.edu with a blank "Subject:" line.
Terence Parr
Purdue University
School of Electrical Engineering