Re: Types of grammars.

Chris F Clark <cfc@world.std.com>
20 Aug 1998 22:53:37 -0400

          From comp.compilers

Related articles
Types of grammars. Luke@comodo.techi.com.force9.net (Luke) (1998-08-19)
Re: Types of grammars. cfc@world.std.com (Chris F Clark) (1998-08-19)
Re: Types of grammars. cfc@world.std.com (Chris F Clark) (1998-08-20)
Re: Types of grammars. adrian@dcs.rhbnc.ac.uk (1998-08-24)
Re: Types of grammars. sergenth@tin.it (1999-05-20)
Re: Types of grammars. hunk@alpha1.csd.uwm.edu (1999-06-02)
| List of all articles for this month |

From: Chris F Clark <cfc@world.std.com>
Newsgroups: comp.compilers
Date: 20 Aug 1998 22:53:37 -0400
Organization: Compilers Central
Keywords: LL(1), parse

dwight@pentasoft.com (Dwight VandenBerghe) asked me off-line (however,
I think the question is worthy of wider readership):
> Is this right, Chris? I thought that LL(infinite) grammars,
> like those accepted by PRECC, were a superset of
> LR(1).


Interesting question. It is not clear to me what the definition of
LL(infinite) is. If somehow, the infinite in LL(infinite) allows one
to look past the end of a rule which is being matched for tokens that
determine whether the rule is to be applied or not, then LL(infinite)
isn't normal LL at all, and thus doesn't fit the scheme. However, on
the other hand if the infinite only allows the match to consider an
infinite number of tokens within one rule for matching the rule, then
the subsetting holds.


To make this precise consider the following toy grammars:


grammar1: a "1" | b "2";
a: "0";
b: "0";


grammar1 is not LL(1), because the first token of a|b is not
sufficient to determine which of the two non-terminals to use. It is
LR(1), because 1 token after the a|b rule has ended is sufficient to
determine whether a or b should be chosen.


The interesting question is whether it is LL(2) or not. Neither of
the non-terminals a or b have two tokens in them, so the first 2
tokens of a|b is not well defined. However, it is worth arguing that
<i>in the context of the grammar1 rule</i> the a|b decision has two
tokens (using the tokens from the follow set for a|b), and thus the
decision can be made with two token lookahead, and thus is LL(2).


If LL grammars are defined that way, where the follow sets are part of
the count, then it is possible to construct LL(k) grammars which are
not LR(k-1). You make the choice be between two empty non-terminals
and have it require k tokens after the empty non-terminals to
distinguish which empty non-terminal applies. In that counting scheme
and LL(k) grammar is an LR(k) grammar, but not necesarily an LR(k-1)
grammar. An LL(infinite) grammar in this counting scheme might not be
an LR(k) grammar for any finite k.


The other possible counting scheme for LL(infinite) grammars is
illustrated in grammar2.


grammar2: a "1" | b "2";
a: "0"+ "1";
b: "0"+ "2";


In grammar2, there is no finite k number of tokens which will resolve
the conflict between a|b, because of the regular expression + operator
(positive closure) which allows an a (or a b) to have an indefinite
number of 0's as a prefix. Thus, the grammar is not LL(k) for any
finite k. However, it can be resolved by "infinite k" (that is
looking past all the 0's).


It is also true that grammar2 is LR(0), as it requires 0 tokens
<i>after</i> the end of the a|b non-terminals to distinguish them. It
does require an indefinite number of tokens within the rules, but that
is not a problem for LR(anything).


If LL(k) only allows counting of tokens within a rule for
disambiguating conflicts between two rules, then an LL(k) grammar is
LR(1) as I stated previously.


Three points are worth making on this topic.


1) For most practical languages, none of this matters very much.
Almost all languages have a grammar which is both LL(1) and LALR(1).


2) If you have an LL(finite k) or LR(finite k) grammar for a language,
there exists an LR(1) grammar for that language.


3) You can use syntactic predicates to extend the "k" of your parser
generator (effectively to infinity), if your parser generator supports
them.


Hope this makes things more clear,
-Chris


*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. CompuServe : 74252,1375
3 Proctor Street voice : (508) 435-5016
Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)
------------------------------------------------------------------------------
Web Site in Progress: Web Site : http://world.std.com/~compres
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.