Re: Why LL(1) Parsers do not support left recursion?

Chris F Clark <cfc@shell01.TheWorld.com>
28 Jul 2006 18:43:08 -0400

          From comp.compilers

Related articles
[15 earlier articles]
Re: Why LL(1) Parsers do not support left recursion? DrDiettrich1@aol.com (Hans-Peter Diettrich) (2006-07-25)
Re: Why LL(1) Parsers do not support left recursion? DrDiettrich1@aol.com (Hans-Peter Diettrich) (2006-07-25)
Re: Why LL(1) Parsers do not support left recursion? mailbox@dmitry-kazakov.de (Dmitry A. Kazakov) (2006-07-25)
Re: Why LL(1) Parsers do not support left recursion? DrDiettrich1@aol.com (Hans-Peter Diettrich) (2006-07-25)
Re: Why LL(1) Parsers do not support left recursion? ajonospam@andrew.cmu.edu (Arthur J. O'Dwyer) (2006-07-25)
Re: Why LL(1) Parsers do not support left recursion? wyrmwif@tsoft.org (SM Ryan) (2006-07-28)
Re: Why LL(1) Parsers do not support left recursion? cfc@shell01.TheWorld.com (Chris F Clark) (2006-07-28)
Re: Why LL(1) Parsers do not support left recursion? cfc@shell01.TheWorld.com (Chris F Clark) (2006-07-28)
Re: Why LL(1) Parsers do not support left recursion? cfc@shell01.TheWorld.com (Chris F Clark) (2006-07-28)
Re: Why LL(1) Parsers do not support left recursion? find@my.address.elsewhere (Matthias Blume) (2006-07-28)
Re: Why LL(1) Parsers do not support left recursion? DrDiettrich1@aol.com (Hans-Peter Diettrich) (2006-07-28)
Re: Why LL(1) Parsers do not support left recursion? DrDiettrich1@aol.com (Hans-Peter Diettrich) (2006-07-28)
Re: Why LL(1) Parsers do not support left recursion? DrDiettrich1@aol.com (Hans-Peter Diettrich) (2006-07-28)
[8 later articles]
| List of all articles for this month |

From: Chris F Clark <cfc@shell01.TheWorld.com>
Newsgroups: comp.compilers
Date: 28 Jul 2006 18:43:08 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 06-07-059 06-07-065 06-07-071
Keywords: parse

SM Ryan schrieb:
> > Only poorly designed programming languages are ambiguous. (Natural
> > languages are ambiguous and don't use deterministic language theory.)


Hans-Peter Diettrich <DrDiettrich1@aol.com> (Dodi) writes:
> Counterexample:
> The argument list of a subroutine is nothing but a list, which can be
> expressed using either left or right recursion in a grammar.


Or with iteration (e.g. one of the closure operators, such as + and
*). I think Dodi's point here is well-taken. The notation in a
grammar should make the problem being solved clearer and not introduce
any extraneous details. If there is no specific associativity in the
language, I would prefer not to introduce artificial associativity
into the grammar. (It's pretty ovbvious that SM Ryan disagrees in
that he doesn't think a grammar shoulod be ambiguous, and he makes
that point several times in his posting. Whether he would consider
the rest of what I said implying accepting ambiguity, I will have to
wait for his judgement.)


To my mind, one specifically choses to write:


                expr: expr "+" term
                        | term;


                or:


                right "+", "-";
                right "*"; "/";
                left "**";
                expr: expr "+" expr;


                or:


                expr: term ("+" term)*


depending on what one intends to say.


I would write explicit precedence if I had a specific parsing problem
and I wanted the fact that I was expressing that problem explicitly in
the grammar to emphasize the fact.


I would write ambiguous rules with precedence and associativity
declarations if I wanted to emphasize the fact that the items were
simply expressions and that the precendence and associativity were
associated with the operators.


I would write regular expressions if I wanted to emphasize the fact
that the item corresponds to a "list" and the operators have no
precedence at all are are merely separators between the items of
interest. In fact, we have a pair of binary closure operators &* and
&+ that we specifically support in Yacc++ to make that even more
clear.


                expr: term &* "+"; // same as above, but more terse


Dodi again:
> Perhaps I misused the term "ambiguous" here, when I meant that different
> parse trees can be constructed for a sentence of a language?


That is the definition of ambiguous. An ambiguous graqmmar admits to
or more parses (i.e. parse trees) for a given sentence. An unambiguous
grammar admits only one parse.


SM Ryan again:
> > Many programming language grammars are ambiguous because the people
> > writing the grammars are lazy and/or uneducated in these matters. The
> > dangling else nonproblem was solved some 40 years ago. Anyone who
> > thinks this is still a problem is just too lazy to write a well known
> > solution.


However, I don't believe there is an LL grammar (in the strict
technical sense) that solves the dangling else problem. One can only
solve the dangling else problem with a "hack" to the LL solution.
Moreover, if one wants to do that, the "best" grammar to give to the
LL generator (that supports the hack) is the ambiguous one.


Note, that I have nothing against that "hack". It is hack in a good
sense of the word (i.e. an elegant solution that takes a non-obvious
short-cut that correctly handles the cases of interest).


Dodi again:
> I'm not sure what you want to tell me. AFAIR LR(k) (languages?
> grammars?) can be transformed into LR(1), for which a finite state
> machine can be constructed easily. I assume that a transformation from
> LL(k) into LL(1) would be possible as well, using essentially the same
> transformation algorithms.


Unfortunately not true. Moreover, I don't believe there is an
algorithm for transforming an LR(k) grammar into an LR(1) grammar,
just a (non-constructive) proof that such a grammar exists. Thus, if
you have an LR(k) grammar and only an LR(1) tool, you are also
out-of-luck.


> My point is that table driven parsers are unreadable, due to the lost
> relationship between the parser code and the implemented language or
> grammar. Do there exist LR parsers or parser generators at all, which
> preserve the relationship between the parser code and the grammar?


Table driven parsers (at least LR ones) are more unreadable than
recursive descent ones in the same sense that the drawing (or a table
describing the drawing) of an DFA is less-readable than the
corresponding regular expression. I think there is some cognitive
effects that allow a person's mind to ignore the parallelism in a
linear presentation (e.g. a recursive descent routine or a regular
expression) and to focus on the depth-first properities which are
brought to the surface in such representations. People do not seem to
think as well in breath-first representations. I think this has to do
with some anthropomorphising where one inserts oneself into the
current location in the code as it progresses through one case. This
works the same way people often single-step through code in a
debugger.


That being said, one can learn to deal quite effectively with the DFAs
of an LR machine. Moreover, I have find that the single-step approach
often blinds one to things happening in other paths that one "should"
be considering. In fact, one of my biggest complaints about
hand-generated recursive descent is that it lends itself to ad-hack
approaches (hack used pejoratively here), where one twiddles with
local parts of the parser to get it to accept the case one is
interested in, without considering whether one has a consistent
grammar (i.e. that the same code works in another context).


Hope this doesn't confuse the issues any more,
-Chris


*****************************************************************************
Chris Clark Internet : compres@world.std.com
Compiler Resources, Inc. Web Site : http://world.std.com/~compres
23 Bailey Rd voice : (508) 435-5016
Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)
------------------------------------------------------------------------------



Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.