Sun, 23 Aug 1992 16:24:47 GMT

Related articles |
---|

Generating LALR(1) Grammar from an arbitrary CFG. kumar@ra.csc.ti.com (1992-08-20) |

Re: Generating LALR(1) Grammar from an arbitrary CFG. jrickard@eoe.co.uk (1992-08-23) |

Context dependent grammars? tchannon@black.demon.co.uk (1992-08-25) |

Newsgroups: | comp.compilers |

From: | jrickard@eoe.co.uk (John Rickard) |

Organization: | Compilers Central |

Date: | Sun, 23 Aug 1992 16:24:47 GMT |

Keywords: | LALR, parse, theory |

References: | 92-08-118 |

kumar@ra.csc.ti.com (Sundeep Kumar) writes:

*: Is there a utility that accepts an abitrary CFG and either gives an*

*: equivalent LALR(1) grammar or decides that the CFG has no equivalent*

*: LALR(1) grammar ? Is this problem in general, solvable ?*

No, it's not soluble. (Which doesn't exclude the possibility of such a

utility that works in the cases one's interested in.)

Sketch proof:

I assume the lemma: There is no algorithm to decide whether a CFG

generates the language consisting of every word of its alphabet. (This

result can be found in books on languages and automata, and I give a very

sketchy proof below.)

Given any language L on alphabet A with a CFG, take a context-free

language M on a disjoint alphabet B such that M has no LALR(1) grammar.

Consider the language N on the alphabet A U B consisting of all words S

such that either discarding all letters in B from S yields a word in L or

discarding all letters in A from S yields a word in M. If L contains all

words on A, then N contains all words on A U B and so N has a LALR(1)

grammar; if L does not contain all words on A, then it is not hard to see

that N does not have a LALR(1) grammar (since if it did then so would M).

So if it were decidable whether N had a LALR(1) grammar, then it would be

decidable whether L contained all words on A, which contradicts the lemma.

Very sketchy proof of lemma:

This uses the well-known fact that it is not decidable whether an

arbitrary Post correspondence problem is soluble. A Post correspondence

problem is defined by a finite set of ordered pairs of words over a given

alphabet; a solution is a finite non-empty sequence of pairs of words,

each from the given set, such that the concatenation of the first word

from each pair equals the concatenation of the second word from each pair.

(For example, is there an English "sentence" with a word-for-word French

translation that consists of the same letters in the same order, with only

the positioning of the spaces different?)

Now given a Post correspondence problem, we add two characters "," and ":"

to its alphabet and consider the language consisting of all words on the

extended alphabet except those that are generated from a solution of the

Post correspondence problem by concatenating (to use the terms of the

example above): the English words, separated by commas; a colon; the

reverse of the string formed by the French words, separated by commas.

This language has a CFG, and it contains all words on its alphabet if and

only if the Post correspondence problem is not soluble; this proves the

lemma. (Why does the language have a CFG? Well, it's the union of these

three languages, and it's not hard to find a CFG for each of them: (a)

words not containing exactly one colon; (b) words containing exactly one

colon that do not form palindromes if the commas are omitted; (c) words

containing exactly one colon where the substrings between commas do not

match up around the colon as English words and reversals of French

translations.)

-- John Rickard

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.