RE: regular expression operators in CF grammars

"Quinn Tyler Jackson" <qjackson@shaw.ca>
14 Sep 2002 16:18:57 -0400

          From comp.compilers

Related articles
Re: regular expression operators in CF grammars cfc@TheWorld.com (Chris F Clark) (2002-09-14)
RE: regular expression operators in CF grammars qjackson@shaw.ca (Quinn Tyler Jackson) (2002-09-14)
RE: regular expression operators in CF grammars qjackson@shaw.ca (Quinn Tyler Jackson) (2002-09-14)
| List of all articles for this month |

From: "Quinn Tyler Jackson" <qjackson@shaw.ca>
Newsgroups: comp.compilers
Date: 14 Sep 2002 16:18:57 -0400
Organization: Compilers Central
References: 02-09-094
Keywords: parse, design
Posted-Date: 14 Sep 2002 16:18:55 EDT

Chris Clark said:


> Yes, I mean roughly the same thing as Terence meant (see
> below --- for explanation). Quinn Tyler Jackson is in the process
of
> working out a new parser generator, Meta-S (and the theoretical
> semantics behind it). It incorporates an extension of predicated
> grammars (actually more than a couple of different extensions).
> While some of those extensions are of the van Wijngaarden
> (two-level) grammar type, there are several extensions that are
> purely predicated forms (in Terence's sense).


I originally borrowed the concept from what Françoise Balmas of
University Paris-8 implemented in her Augmented Pattern Matcher as
"multistep matching", which is probably why my variation of
predication works on marked substrings.


In its minimal formal, a Meta-S predicate qualifies the string that
matched the expression immediately to the left of it.


S ::= A<predicate_A> ":" B<predicate_B>;
A ::= '[a-z]+';
B ::= '[a-z]+';
predicate_A ::= "apple";
predicate_B ::= "pie";


The above $-grammar accepts one and only one language: "apple:pie",
because of the constraints placed upon rules A and B by the
predicates.


The delayed predicate form allows predication checks off until some
trigger lexeme has been seen, thereby reducing the number of
unnecessary predicate checks are necessary if two rules have similar
starting but different ending strings:


If we change S above to:


S ::= ($x(A) ":" B<predicate_B>)<x=predicate_A>;


The above rule puts of the checking of A against predicate_A until all
other text has matched. This is useful if the predicate_A happens to
be time complex and we don't want any wasted effort.


Since Meta-S is an adaptive formalism, predicates can modify the
grammar while their doing their thing, and this has resulted in some
interesting, powerful effects, at least in terms of how compactly one
can express funky context-sensitive grammars.


For instance, the language a^n b^m c^mn. In other words aabbbcccccc or
abbbccc, and so on.


The $-grammar, using adaptive predicates, for that language is:


grammar AnBmCmn
{
S ::= #phi(D.X<-"") (ax bx) #psi(A.X=mult) cx<D.X C.X=dxcx>;
ax ::= $A.X<-('a+');
bx ::= $B.X<-('b+');
cx ::= $C.X<-('c+');
mult ::= ("a" #phi(D.X<-D.X B.X))<+>;
dxcx ::= "b" [dxcx] "c";
};


In the above $-grammar, the adaptive predicate is #psi(A.X=mult),
since when A.Z is checked against mult, it has the side effect of
creating a string D.X which has one B.X for every a in A.X, and this
string is later used in the <D.X C.X=dxcx> predicate check.


It's compact, but more importantly, the process by which it accepts
that languages can be conceptualized by mere programming mortals (as
opposed to Theory Demigods), which cannot be said of the Chomsky form
grammar for that same language. ;-)


--
Quinn Tyler Jackson
http://members.shaw.ca/qjackson/


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.