Fri, 23 Jun 1995 03:58:37 GMT

Related articles |
---|

LL(1) grammar for dangling else? MLF@VM.CPD.UA.ES (Mikel L. Forcada) (1995-05-28) |

Re: LL(1) grammar for dangling else? lumpe@aragorn.unibe.ch (Markus Lumpe) (1995-05-30) |

Re: LL(1) grammar for dangling else? leichter@zodiac.rutgers.edu (1995-06-23) |

Newsgroups: | comp.compilers |

From: | leichter@zodiac.rutgers.edu |

Keywords: | LL(1), parse |

Organization: | Rutgers University Department of Computer Science |

References: | 95-05-133 |

Date: | Fri, 23 Jun 1995 03:58:37 GMT |

Status: | RO |

| Mikel L. Forcada" <MLF@VM.CPD.UA.ES> writes:

| the dangling-else problem is a classical one. In every book,

| a non-ambiguous grammar based on "matched" and "unmatched"

| statements is proposed, but this grammar is not LL(1).

| Is it true that there doesn't exist any LL(k) grammar for

| this language?

Yes, it's true. The dangling-else problem can be modeled by the following

language:

{a^n b^m | n >= m}

(Think of "a" as "if" and "b" as "else". The rest of the stuff in the state-

ment is irrelevant to this question.)

The following is a very informal argument: Consider parsing a string in this

language. Initially, we are looking for the start symbol S. If n is large

enough (n > k) the lookahead is always just a^k. There can, of course, be

only one production for S chosen with lookahead a^k. What symbol does it

leave us looking for? We might as well assume S: If it instead leaves us

with S1, add another a to the input string and repeat. There are only

finitely many non-terminals, so eventually we loop. (The loop might involve

going from S to S1 and back to S, or through any finite string, but the

argument is the same.) Eventually, we "chew off" enough a's that a b enters

the lookahead set. But by then we "don't know how many a's" we saw, because

we don't know how many times we looped. This makes it impossible to check

the number of b's.

This is essentially the same argument as one uses to show that a^n b^n isn't

regular. Of course, it's not quite equivalent; you have to look at the

context-free alternative of generating a "b" to match the "a" - i.e., the

first production could have been:

S <- a S b

But this can't help! If we have lookahead a^k, the string *could* in fact, be

a^n - so we can't possibly choose this production. We have to put the choice

off, with something like:

S <- a S B

B <- b

B <- \epsilon

But then "aab" has two parses, depending on which B generates the "b".

In essense, what the argument comes down to is this: With enough leading a's,

we have to commit ourselves to productions before seeing any b's. Since

there may be no b's at all, or as many as there are a's, the production we

commit ourselves to must eventually generate either zero b's, or at least

one. But this happens independently for each leading a. Once we have more

than one of these "eventual choices", the grammar becomes ambiguous.

-- Jerry

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.