20 Aug 1998 22:53:37 -0400

Related articles |
---|

Types of grammars. Luke@comodo.techi.com.force9.net (Luke) (1998-08-19) |

Re: Types of grammars. cfc@world.std.com (Chris F Clark) (1998-08-19) |

Re: Types of grammars. cfc@world.std.com (Chris F Clark) (1998-08-20) |

Re: Types of grammars. adrian@dcs.rhbnc.ac.uk (1998-08-24) |

Re: Types of grammars. sergenth@tin.it (1999-05-20) |

Re: Types of grammars. hunk@alpha1.csd.uwm.edu (1999-06-02) |

From: | Chris F Clark <cfc@world.std.com> |

Newsgroups: | comp.compilers |

Date: | 20 Aug 1998 22:53:37 -0400 |

Organization: | Compilers Central |

Keywords: | LL(1), parse |

dwight@pentasoft.com (Dwight VandenBerghe) asked me off-line (however,

I think the question is worthy of wider readership):

*> Is this right, Chris? I thought that LL(infinite) grammars,*

*> like those accepted by PRECC, were a superset of*

*> LR(1).*

Interesting question. It is not clear to me what the definition of

LL(infinite) is. If somehow, the infinite in LL(infinite) allows one

to look past the end of a rule which is being matched for tokens that

determine whether the rule is to be applied or not, then LL(infinite)

isn't normal LL at all, and thus doesn't fit the scheme. However, on

the other hand if the infinite only allows the match to consider an

infinite number of tokens within one rule for matching the rule, then

the subsetting holds.

To make this precise consider the following toy grammars:

grammar1: a "1" | b "2";

a: "0";

b: "0";

grammar1 is not LL(1), because the first token of a|b is not

sufficient to determine which of the two non-terminals to use. It is

LR(1), because 1 token after the a|b rule has ended is sufficient to

determine whether a or b should be chosen.

The interesting question is whether it is LL(2) or not. Neither of

the non-terminals a or b have two tokens in them, so the first 2

tokens of a|b is not well defined. However, it is worth arguing that

<i>in the context of the grammar1 rule</i> the a|b decision has two

tokens (using the tokens from the follow set for a|b), and thus the

decision can be made with two token lookahead, and thus is LL(2).

If LL grammars are defined that way, where the follow sets are part of

the count, then it is possible to construct LL(k) grammars which are

not LR(k-1). You make the choice be between two empty non-terminals

and have it require k tokens after the empty non-terminals to

distinguish which empty non-terminal applies. In that counting scheme

and LL(k) grammar is an LR(k) grammar, but not necesarily an LR(k-1)

grammar. An LL(infinite) grammar in this counting scheme might not be

an LR(k) grammar for any finite k.

The other possible counting scheme for LL(infinite) grammars is

illustrated in grammar2.

grammar2: a "1" | b "2";

a: "0"+ "1";

b: "0"+ "2";

In grammar2, there is no finite k number of tokens which will resolve

the conflict between a|b, because of the regular expression + operator

(positive closure) which allows an a (or a b) to have an indefinite

number of 0's as a prefix. Thus, the grammar is not LL(k) for any

finite k. However, it can be resolved by "infinite k" (that is

looking past all the 0's).

It is also true that grammar2 is LR(0), as it requires 0 tokens

<i>after</i> the end of the a|b non-terminals to distinguish them. It

does require an indefinite number of tokens within the rules, but that

is not a problem for LR(anything).

If LL(k) only allows counting of tokens within a rule for

disambiguating conflicts between two rules, then an LL(k) grammar is

LR(1) as I stated previously.

Three points are worth making on this topic.

1) For most practical languages, none of this matters very much.

Almost all languages have a grammar which is both LL(1) and LALR(1).

2) If you have an LL(finite k) or LR(finite k) grammar for a language,

there exists an LR(1) grammar for that language.

3) You can use syntactic predicates to extend the "k" of your parser

generator (effectively to infinity), if your parser generator supports

them.

Hope this makes things more clear,

-Chris

*****************************************************************************

Chris Clark Internet : compres@world.std.com

Compiler Resources, Inc. CompuServe : 74252,1375

3 Proctor Street voice : (508) 435-5016

Hopkinton, MA 01748 USA fax : (508) 435-4847 (24 hours)

------------------------------------------------------------------------------

Web Site in Progress: Web Site : http://world.std.com/~compres

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.