Re: LR(k) k >= 1, using Quantum Grammars

"Terence Parr" <parrt@s1.arc.umn.edu>
Sun, 20 Feb 1994 18:34:31 GMT

          From comp.compilers

Related articles
LR(k) k >= 1, using Quantum Grammars mark@freenet.uwm.edu (1994-02-18)
Re: LR(k) k >= 1, using Quantum Grammars parrt@s1.arc.umn.edu (Terence Parr) (1994-02-20)
| List of all articles for this month |

Newsgroups: comp.compilers
From: "Terence Parr" <parrt@s1.arc.umn.edu>
Keywords: parse, LR(1)
Organization: Compilers Central
References: 94-02-127
Date: Sun, 20 Feb 1994 18:34:31 GMT



From: "Terence Parr" <parrt@s1.arc.umn.edu>
> This is because computing k-token lookahead is an exponential problem in
> theory. As usual, much can be done in practice, but requires what can be
> termed "polynomial approximations to exponential k-token lookahead" to get
> anything beyond 1-token lookahead. My Ph.D. thesis describes how this can
> be done (paper coming soon to a theatre near you). Also, ANTLR applies
> this theory to build LL(k) for k>1 if you want to take a look (send mail
> to pccts@ecn.purdue.edu).


mark@freenet.uwm.edu (Mark Hopkins) writes:
> I believe that this problem can be solved in such a way that not only will
> the resulting parsers be efficient in size in terms of k, but actually in
> most cases, smaller than the corresponding LR(0) parser (!).


Hmm...There are two sources of exponential behavior in LR(k) parsers (and
LL(k) parsers) for k>1: (1) the number of parser states and (2) the
lookahead sets themselves.


Mark has done some nice work with his "quantum grammars", but has only
addressed (1)--he has removed the states that provide redundant
parser-context information. However, unless (2) is attacked also, LR(k)
for k>1 (quantum or not) is still a problem. In the worst-case scenario
every lookahead set is the set of all possible permutations of |T| symbols
k in length for a space complexity of O(|T|^k) where |T| is the size of
the terminal space. The worst case is hard to find even on purpose ;),
but it only takes a few lookahead sets just short of O(|T|^k) to kill you.


I have suggested in my work and implemented in ANTLR a few "tricks" that
you may want to add to your LR(k) work; many have been known in theory for
many moons. For example, most decisions will be LR(0) or LR(1) and,
hence, the lookahead will still be linear for the majority of the grammar.
When LR(1) fails, however, you must attempt LR(k>=2), but you can still do
quite a bit in this case.


Let me know if you are interested in my work (with R. Quong and H. Dietz
at Purdue) on "polynomial approximations to lookahead for k>1."


Terence Parr
U of MN


PS Glad to see people are not satisfied with the status quo in parsing
theory.
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.