Re: Types of grammars. (A Johnstone)
24 Aug 1998 13:22:39 -0400

          From comp.compilers

Related articles
Types of grammars. (Luke) (1998-08-19)
Re: Types of grammars. (Chris F Clark) (1998-08-19)
Re: Types of grammars. (Chris F Clark) (1998-08-20)
Re: Types of grammars. (1998-08-24)
Re: Types of grammars. (1999-05-20)
Re: Types of grammars. (1999-06-02)
| List of all articles for this month |

From: (A Johnstone)
Newsgroups: comp.compilers
Date: 24 Aug 1998 13:22:39 -0400
Organization: Royal Holloway, University of London
References: 98-08-149
Keywords: parse, LL(1)

Chris F Clark ( wrote:
: (Dwight VandenBerghe) asked me off-line (however,
: I think the question is worthy of wider readership):
: > Is this right, Chris? I thought that LL(infinite) grammars,
: > like those accepted by PRECC, were a superset of
: > LR(1).

PRECC, and related tools (such as the GRDP tool written by Liz Scott
and myself) do not handle left recursion. LR does. Both PRECC and
GRDP can handle grammars that are not LR(1). Therefore generalised
recursive descent parsers are neither a superset nor a subset of
LR. However, (careful) application of left recursion removal
algorithms to LR grammars can allow them to be transformed into
grammars that can be handled by tools such as GRDP and sometimes
PRECC, so in that sense they are more powerful...

Be very careful concerning strong claims surrounding these
tools. There is a lot of devil hiding in the details of what
constitutes an acceptable grammar. Some tools (eg PRECC) actually
process _ordered_ grammars wherein, for instance, the production
matching the longest substring comes first in the
specification. However, there are grammars for which the longest
substring matching production is not unique so this constraint can not
be met in which case the parser algorithm quietly goes wrong. Chris is
quite right to query the definition of LL(oo). It turns out that the
strategy that is used to select a matching production within a rule is
both critical and non-obvious. We use a condition called
follow-determinism for singleton matching and a have a version of our
algorithm which side-steps the problem by returning a set of matches.

As you might be able to tell from the handwaving nature of the above
paragraph, I am eliding a lot of technical detail. You can read the
detail in compressed form in our paper to this year's Compiler
Construction conference (`Generalised recursive descent parsing and
follow-determinism', Adrian Johnstone and Elizabeth Scott, in Lecture
Notes in Computer Science Vol 1383 pp 16--30). Extended detail may be
found in two technical reports that you can pick up from our Web site.

We're presently writing a paper that attempts to formalise and
categorise the behaviour of tools such as PRECC, GRDP and PCCTS in
lookahead mode.

Having issued these caveats I'd better mention the other big one:
lookahead parsers go exponential if you give them suitably
pathological grammars. Remember that the whole point of LR(1) and LL(1)
parsers is that they are linear time, that is the amount of time taken
to complete a parse is proportional to the length of the string being
parsed. There is considerable empirical evidence that a little
judicious use of lookahead doesn't hurt too much but we are a long way
from having practical general parsers that will parse any old grammar
you happen to write with usable speed for production applications.

On the other hand... lookahead recursive descent parsers are _much_
easier to debug, read and generaly hack with. Inherited and synthesized
attributes are easy too. A colleague of ours likes to call them
`recursively decent' parsers which pretty much sums up the feelings of
anybody who has attempted to hand trace a large table driven parser.

Dr Adrian Johnstone, Senior Lecturer in Computing, Computer Science Dep,
Royal Holloway, University of London, Egham, Surrey, TW20 0EX, England.
Email Tel:+44(0)1784 443425 Fax:+44(0)1784 439786

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.