|Chomsky Normal Form and parsing email@example.com (1993-02-23)|
|Re: Chomsky Normal Form and parsing firstname.lastname@example.org (1993-02-23)|
|Re: Chomsky Normal Form and parsing email@example.com (1993-02-24)|
|Re: Chomsky Normal Form and parsing firstname.lastname@example.org (1993-02-24)|
|From:||email@example.com (BO/EBC/KX/ZG Michael Bergman 682 4958)|
|Date:||Tue, 23 Feb 1993 17:17:35 GMT|
I have some more entangled(?) questions on CFG's.
This time it's Chomsky and Greibach normal forms I'm interested in.
"Languages and Machines" by Tomas A. Sudkamp 
contains some interesting stuff. A friend of mine had this at hand, so I
Definition Chomsky Normal Form:
Grammar G = (V,Sigma,P,S) (Variables, alphabet, productions, start) All
productions are in one of the forms:
1. A -> BC
2. A -> a
3. S -> epsilon
B and C != S, A is arbitrary (specifically, it can be S)
(Note that S is the only production that can have an epsilon.)
"Let G =(V,Sigma,P,S) be a CFG. There is an algorithm to construct a
grammar G' = (V,Sigma,P',S) in Chomsky NF that is equivalent to G."
Note that there are no other restrictions here, according to Sudkamp this
works on all CFG's.
These steps are used in the conversion:
1. Make S non-recursive
2. Eliminate all epsilon except the one in S (if there is one)
3. Eliminate all chain rules
4. Remove useless symbols (the ones not used in any production).
(After this, one can eliminate left-recursion to make it predictive
The question is: what's the practical use of this?
Can we parse this Chomsky Normal Form for sure using some parsing method?
Which one in that case? (I do suspect the answer is no... :-(
If not, are there some "well" defined restrictions one can put on the CFG
(e.g. unamiguity) to make the parsing possible or even more feasible?
There is also the Greibach NF which probably is better for LL/predictive
parsing, it is inherently non-left recursive for example. The problem is
that the conversion is *nasty*, having exponential behaviour in the worst
case. Is there a theorem stating that it is always possible to convert to
Greibach NF? Apparently, the theorem above says that for Chomsky NF.
[added later -John]
I just found a comment on this in a comp.compilers article from May 1992:
Jonathan Eifrig (firstname.lastname@example.org) writes:
> We want don't so much want to recognize a
> *language* so much as generate a parse tree of the input for a particular
> grammar. The *grammar* itself is important; if one has to contort the
> "natural" grammar (i.e., the one that matches the semantic grouping) of
> language in order to accomodate our parsing scheme, this is a serious dis-
> advantage. This is why, for example, we don't just convert our ambiguous
> grammars into Chomsky normal form and apply the obvious dynamic
> programming parsing scheme. We'll see in a minute that both LL(k) and
> LR(1) parsing schemes suffer from this problem.
I guess that says it all unless someone else has any comments.
However, since it is possible to do the conversion at least
semi-automatically, maybe one could try it and see if the result is
acceptable or not. I haven't tried this myself so I don't know what it
looks like (but I can imagine it will be pretty difficult to read).
So - what's the "dynamic programming parsing scheme"? I'll be happy with
an overview of the method + a pointer to any refs if there are any.
Michael Bergman Email: email@example.com
KX/ZGE Tfn: +46 8 6824958
Ericsson Business Networks
S-135 83 Tyreso, Sweden
Return to the
Search the comp.compilers archives again.