java.lang.StackOverflowError on a type 1 grammar parser

xareon@gmail.com
28 Jan 2007 01:38:59 -0500

          From comp.compilers

Related articles
java.lang.StackOverflowError on a type 1 grammar parser xareon@gmail.com (2007-01-28)
| List of all articles for this month |

From: xareon@gmail.com
Newsgroups: comp.compilers
Date: 28 Jan 2007 01:38:59 -0500
Organization: Compilers Central
Keywords: Java, parse, question
Posted-Date: 28 Jan 2007 01:38:59 EST

hi all, after someone in comp.lang.java.programmer told me to post
here my issue, i'll tell you my problem. i'm writing a sort of
universal parser for type 1 grammars, that given a grammar and a
string, check whether or not that string belongs to the language
generated by the given grammar. let me explain you what did i do with
the following java code:




public boolean check(int scanIndex)
                {
                                productionSet = givenGrammar.getProductionSet();
                                startSymbol = givenGrammar.getStartSymbol();


                                if(sententialFormStack.isEmpty())
                                {


                                                for (i = scanIndex; i < productionSet.size();
i++)
                                                {
                                                                currentLeftSide =
productionSet.elementAt(i).getLeftSide();
                                                                currentRightSide =
productionSet.elementAt(i).getRightSide();


                                                                if (currentLeftSide.equals(startSymbol)
&&
currentRightSide.length() <= stringToCheck.length())
                                                                {


if(currentLeftSide.equals(stringToCheck))
                                                                                {
                                                                                                return true;
                                                                                }


                                                                                sententialFormStack.push(new
Ancestor(startSymbol, i));
                                                                                return check(0);
                                                                }
                                                }


                                                return false;
                                }


                                else
                                {
                                                currentSententialForm =
sententialFormStack.peek().getSententialForm();
                                                currentScanIndex =
sententialFormStack.peek().getProductionIndexUsedToUnfold();


                                                for(i = scanIndex; i < productionSet.size();
i++)
                                                {
                                                                currentLeftSide =
productionSet.elementAt(i).getLeftSide();
                                                                currentRightSide =
productionSet.elementAt(i).getRightSide();


                                                                totalLength =
currentSententialForm.length() +
currentRightSide.length() - currentLeftSide.length();




if(currentSententialForm.contains(currentLeftSide) && totalLength
<= stringToCheck.length())
                                                                {
                                                                                newSententialForm =
currentSententialForm.replaceFirst(currentLeftSide, currentRightSide);




if(newSententialForm.equals(stringToCheck))
                                                                                {
                                                                                                return true;
                                                                                }




sententialFormStack.peek().setProductionIndexUsedToUnfold(i);
                                                                                sententialFormStack.push(new
Ancestor(newSententialForm, -1));


                                                                                return check(0);
                                                                }
                                                }


                                                if(currentSententialForm.equals(startSymbol))
                                                {


                                                                return false;
                                                }




sententialFormStack.removeElementAt(sententialFormStack.size()-1);
                                                currentScanIndex =
sententialFormStack.peek().getProductionIndexUsedToUnfold();


sententialFormStack.peek().setProductionIndexUsedToUnfold(-1);


                                                return check(currentScanIndex+1);


                                }
                }






i have some object types that i defined in some classes, let me show
you which they are:
Production, that's made up of a couple of strings, (leftSide,
rightSide), each representing the right or left side of a production A
-> B
Grammar, that's made up of a string (startSymbol) and a
Vector<Production> (productionSet)
Ancestor, that's made up of a string (sententialForm) and an int
(productionIndexUsedToUnfold).


then i defined in my class a sententialFormStack, that's a
Stack<Ancestor>, and it's the stack of the sentential forms for which i
may have some other unfloding possibilities that i have to analyze to
check whether my stringToCheck belongs to the language generated from a
given grammar.


the method parse, takes as input an integer (scanIndex), and that's the
pointer to the productionSet that defines where do i have to start my
research for unfolding possibilities.


that's what i do: if the stack is empty, i search in the productionSet
a production like startSymbol => sententialForm, where |sententialForm|
<= |stringToCheck| (because we know that type 1 grammars generate
bigger or same-length words each unfolding step). as soon as i find a
"good" production, i push into the stack the ancestor(startSymbol, i),
where i is the index of the production used to unfold the axiom
startSymbol. then starts a recursive calling of check.


if the stack is not empty, i'm in "the middle" of the unfolding tree:
then i look at the element at the top of the stack, and search in the
productionSet some left sides that are into the sententialForm where i
got into. where i find them, i continue the unfolding, checking each
time if the stringToCheck matches the currentSententialForm. if i find
it, i push it into the stack with the index of the production used, if
i cannot find no more productions to unfold currentStringToCheck, then
i remove the element at the top of the stack and continue my parsing
from the index of the production i used to get the old sententialForm.


the algorithm is working correct, as i could check trying to run it
with some stringToChecks and some givenGrammars, only for inputs that
don't have to make to much pushes/pop into the stack. otherwise i get a
stackoverflow error (i think it's due to too many recursive callings).
i hope i won't have to change algorithm, i've been working on it for a
while :s


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.