6 Oct 2001 16:31:45 -0400

Related articles |
---|

Re: testing epsilon free CFGs for ambiguity cfc@world.std.com (Chris F Clark) (2001-10-06) |

From: | Chris F Clark <cfc@world.std.com> |

Newsgroups: | comp.theory,comp.compilers |

Date: | 6 Oct 2001 16:31:45 -0400 |

Organization: | Compilers Central |

References: | <mktgadm-1A7F9C.14261304102001@reader.news.uu.net> <3BBE007B.3532637C@unb.ca> |

Keywords: | parse, theory |

Posted-Date: | 06 Oct 2001 16:31:45 EDT |

Ralph Boland asked why CFG ambiguity is non-computable and whether the

following statement is true:

*> 2) The reason that the ambiguity problem is not generally*

*> computable is because we cannot place a bound on the*

*> size of the smallest example sentence that is ambiguous.*

*> FURTHERMORE, THIS IS TRUE EVEN IF THE GRAMMAR*

*> IS EPSILON FREE!!!*

While I am not an expert on grammar computability, I have worked on

the problem a small amount and I believe the above statement is true.

I will detail why in a minute.

Digression;

Note, the only proof I have read on the non-computability of

ambiguity was in Aho and Ullman's _The Theory of Parsing,

Translation, and Compiling_ Vol I. pp 203-204. ISBN

0-13-914556-7. The proof shows how to construct a grammar whose

ambiguity solves the Post Corespondence problem, which in turn can

be shown to solve the Turing Machine halting problem. Thus, if

there existed an algorithm, we would have an algorithm that solved

the halting problem.

Ok, now, why minimal length examples of ambiguity are aribitrarily

large:

Consider a(n epsilon free) grammar of the form:

(In "yacc" notation, with nonterminal in lower case, terminals in

upper, g is the goal non-terminal)

g: a | b;

a: a1 | a1 a;

a1: X a2;

a2: X a3;

...

an-1: X an;

an: X;

b: b1 | b1 b1 b;

b1: X b2;

b2: X b3;

...

bm-1: X bm;

bm: X;

This grammar matches an "a" non-terminal, if the sentence is of the

form X**(i*n), and a "b" non-terminal, if the sentence is of the form

X**m(2*j-1). The grammar is ambiguous if and only if there is a

solution to i*n=m*(2*j-1) in the positive integers. Thus, if n is

even and not a multiple of m, the grammar is unambiguous and

otherwise, it is ambiguous. (I hope I got my elementary algebra

right.) Note, by playing around with the "a" and "b" sets of rules,

one can form also sorts of arbitrary mathematical puzzles. For

example, what happens if the b rules are of the form:

b1: X b2 b2;

b2: X b3 b3;

...

Question to other comp.theory and comp.compilers readers is it

possible to make a series of "b" rules that exhibits faster than

exponential growth? If I understand the preceding example, it seems

trivial to produce rules that grow according to any exponential curve,

but I'm not sure how to grow faster than that. Moreover, to be

equivalent to the halting problem, I would think a much faster growth

would be required.

Note, this problem interests me a lot, because I have a class of

grammars that I call LR(infinite), which essentially use a clever form

of unbounded lookahead to determine grammar membership. All grammars

within the set are unambiguous, but generating a parser for a grammar

in the set involves solving the ambiguity problem in a similar manner

to the question posed above. Now, I believe that for all unambiguous

grammars the algorithm eventual terminates in a finite (but

arbitrarily large) number of steps. I believe the problem is that

there is no computable bound that can be put on the length of the

number of steps the algorithm will take given a grammar of a

particular size (e.g. number of terminals, non-terminals, length of

right-hand-sides, and number of rules). Thus, for certain ambiguous

grammars, the program simply fails to terminate, always thinking that

some more lookahead with resolve the conflict. (And it seems obvious

that if I could restrict the ambiguity problem to a fixed amount of

lookahead based on an exponential equation from the grammar size, the

problem is solvable, simply prcompute that number and run for only

that many lookahead steps.)

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.