Re: Generating LALR(1) Grammar from an arbitrary CFG. (John Rickard)
Sun, 23 Aug 1992 16:24:47 GMT

          From comp.compilers

Related articles
Generating LALR(1) Grammar from an arbitrary CFG. (1992-08-20)
Re: Generating LALR(1) Grammar from an arbitrary CFG. (1992-08-23)
Context dependent grammars? (1992-08-25)
| List of all articles for this month |

Newsgroups: comp.compilers
From: (John Rickard)
Organization: Compilers Central
Date: Sun, 23 Aug 1992 16:24:47 GMT
Keywords: LALR, parse, theory
References: 92-08-118 (Sundeep Kumar) writes:
: Is there a utility that accepts an abitrary CFG and either gives an
: equivalent LALR(1) grammar or decides that the CFG has no equivalent
: LALR(1) grammar ? Is this problem in general, solvable ?

No, it's not soluble. (Which doesn't exclude the possibility of such a
utility that works in the cases one's interested in.)

Sketch proof:

I assume the lemma: There is no algorithm to decide whether a CFG
generates the language consisting of every word of its alphabet. (This
result can be found in books on languages and automata, and I give a very
sketchy proof below.)

Given any language L on alphabet A with a CFG, take a context-free
language M on a disjoint alphabet B such that M has no LALR(1) grammar.
Consider the language N on the alphabet A U B consisting of all words S
such that either discarding all letters in B from S yields a word in L or
discarding all letters in A from S yields a word in M. If L contains all
words on A, then N contains all words on A U B and so N has a LALR(1)
grammar; if L does not contain all words on A, then it is not hard to see
that N does not have a LALR(1) grammar (since if it did then so would M).
So if it were decidable whether N had a LALR(1) grammar, then it would be
decidable whether L contained all words on A, which contradicts the lemma.

Very sketchy proof of lemma:

This uses the well-known fact that it is not decidable whether an
arbitrary Post correspondence problem is soluble. A Post correspondence
problem is defined by a finite set of ordered pairs of words over a given
alphabet; a solution is a finite non-empty sequence of pairs of words,
each from the given set, such that the concatenation of the first word
from each pair equals the concatenation of the second word from each pair.
(For example, is there an English "sentence" with a word-for-word French
translation that consists of the same letters in the same order, with only
the positioning of the spaces different?)

Now given a Post correspondence problem, we add two characters "," and ":"
to its alphabet and consider the language consisting of all words on the
extended alphabet except those that are generated from a solution of the
Post correspondence problem by concatenating (to use the terms of the
example above): the English words, separated by commas; a colon; the
reverse of the string formed by the French words, separated by commas.
This language has a CFG, and it contains all words on its alphabet if and
only if the Post correspondence problem is not soluble; this proves the
lemma. (Why does the language have a CFG? Well, it's the union of these
three languages, and it's not hard to find a CFG for each of them: (a)
words not containing exactly one colon; (b) words containing exactly one
colon that do not form palindromes if the commas are omitted; (c) words
containing exactly one colon where the substrings between commas do not
match up around the colon as English words and reversals of French

-- John Rickard

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.