Sun, 20 Feb 1994 18:34:31 GMT

Related articles |
---|

LR(k) k >= 1, using Quantum Grammars mark@freenet.uwm.edu (1994-02-18) |

Re: LR(k) k >= 1, using Quantum Grammars parrt@s1.arc.umn.edu (Terence Parr) (1994-02-20) |

Newsgroups: | comp.compilers |

From: | "Terence Parr" <parrt@s1.arc.umn.edu> |

Keywords: | parse, LR(1) |

Organization: | Compilers Central |

References: | 94-02-127 |

Date: | Sun, 20 Feb 1994 18:34:31 GMT |

From: "Terence Parr" <parrt@s1.arc.umn.edu>

*> This is because computing k-token lookahead is an exponential problem in*

*> theory. As usual, much can be done in practice, but requires what can be*

*> termed "polynomial approximations to exponential k-token lookahead" to get*

*> anything beyond 1-token lookahead. My Ph.D. thesis describes how this can*

*> be done (paper coming soon to a theatre near you). Also, ANTLR applies*

*> this theory to build LL(k) for k>1 if you want to take a look (send mail*

*> to pccts@ecn.purdue.edu).*

mark@freenet.uwm.edu (Mark Hopkins) writes:

*> I believe that this problem can be solved in such a way that not only will*

*> the resulting parsers be efficient in size in terms of k, but actually in*

*> most cases, smaller than the corresponding LR(0) parser (!).*

Hmm...There are two sources of exponential behavior in LR(k) parsers (and

LL(k) parsers) for k>1: (1) the number of parser states and (2) the

lookahead sets themselves.

Mark has done some nice work with his "quantum grammars", but has only

addressed (1)--he has removed the states that provide redundant

parser-context information. However, unless (2) is attacked also, LR(k)

for k>1 (quantum or not) is still a problem. In the worst-case scenario

every lookahead set is the set of all possible permutations of |T| symbols

k in length for a space complexity of O(|T|^k) where |T| is the size of

the terminal space. The worst case is hard to find even on purpose ;),

but it only takes a few lookahead sets just short of O(|T|^k) to kill you.

I have suggested in my work and implemented in ANTLR a few "tricks" that

you may want to add to your LR(k) work; many have been known in theory for

many moons. For example, most decisions will be LR(0) or LR(1) and,

hence, the lookahead will still be linear for the majority of the grammar.

When LR(1) fails, however, you must attempt LR(k>=2), but you can still do

quite a bit in this case.

Let me know if you are interested in my work (with R. Quong and H. Dietz

at Purdue) on "polynomial approximations to lookahead for k>1."

Terence Parr

U of MN

PS Glad to see people are not satisfied with the status quo in parsing

theory.

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.