Tue, 23 Feb 1993 17:17:35 GMT

Related articles |
---|

Chomsky Normal Form and parsing ebcrmb@ebc.ericsson.se (1993-02-23) |

Re: Chomsky Normal Form and parsing hdev@dutiag.twi.tudelft.nl (1993-02-23) |

Re: Chomsky Normal Form and parsing eifrig@beanworld.cs.jhu.edu (1993-02-24) |

Re: Chomsky Normal Form and parsing hdev@dutiak.twi.tudelft.nl (1993-02-24) |

Newsgroups: | comp.compilers |

From: | ebcrmb@ebc.ericsson.se (BO/EBC/KX/ZG Michael Bergman 682 4958) |

Keywords: | parse, question |

Organization: | Compilers Central |

Date: | Tue, 23 Feb 1993 17:17:35 GMT |

Hi everyone,

I have some more entangled(?) questions on CFG's.

This time it's Chomsky and Greibach normal forms I'm interested in.

"Languages and Machines" by Tomas A. Sudkamp [1988]

ISBN 0-201-15768-3

Addison-Wesley

contains some interesting stuff. A friend of mine had this at hand, so I

cite him:

Definition Chomsky Normal Form:

Grammar G = (V,Sigma,P,S) (Variables, alphabet, productions, start) All

productions are in one of the forms:

1. A -> BC

2. A -> a

3. S -> epsilon

B and C != S, A is arbitrary (specifically, it can be S)

(Note that S is the only production that can have an epsilon.)

Theorem:

"Let G =(V,Sigma,P,S) be a CFG. There is an algorithm to construct a

grammar G' = (V,Sigma,P',S) in Chomsky NF that is equivalent to G."

Note that there are no other restrictions here, according to Sudkamp this

works on all CFG's.

These steps are used in the conversion:

1. Make S non-recursive

2. Eliminate all epsilon except the one in S (if there is one)

3. Eliminate all chain rules

4. Remove useless symbols (the ones not used in any production).

(After this, one can eliminate left-recursion to make it predictive

parseable.)

The question is: what's the practical use of this?

Can we parse this Chomsky Normal Form for sure using some parsing method?

Which one in that case? (I do suspect the answer is no... :-(

If not, are there some "well" defined restrictions one can put on the CFG

(e.g. unamiguity) to make the parsing possible or even more feasible?

There is also the Greibach NF which probably is better for LL/predictive

parsing, it is inherently non-left recursive for example. The problem is

that the conversion is *nasty*, having exponential behaviour in the worst

case. Is there a theorem stating that it is always possible to convert to

Greibach NF? Apparently, the theorem above says that for Chomsky NF.

[added later -John]

I just found a comment on this in a comp.compilers article from May 1992:

Jonathan Eifrig (eifrig@blaze.cs.jhu.edu) writes:

*> We want don't so much want to recognize a*

*> *language* so much as generate a parse tree of the input for a particular*

*> grammar. The *grammar* itself is important; if one has to contort the*

*> "natural" grammar (i.e., the one that matches the semantic grouping) of*

*> language in order to accomodate our parsing scheme, this is a serious dis-*

*> advantage. This is why, for example, we don't just convert our ambiguous*

*> grammars into Chomsky normal form and apply the obvious dynamic*

*> programming parsing scheme. We'll see in a minute that both LL(k) and*

*> LR(1) parsing schemes suffer from this problem.*

I guess that says it all unless someone else has any comments.

However, since it is possible to do the conversion at least

semi-automatically, maybe one could try it and see if the result is

acceptable or not. I haven't tried this myself so I don't know what it

looks like (but I can imagine it will be pretty difficult to read).

So - what's the "dynamic programming parsing scheme"? I'll be happy with

an overview of the method + a pointer to any refs if there are any.

M

--

Michael Bergman Email: ebcrmb@ebc.ericsson.se

KX/ZGE Tfn: +46 8 6824958

Ericsson Business Networks

S-135 83 Tyreso, Sweden

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.