Related articles |
---|
java.lang.StackOverflowError on a type 1 grammar parser xareon@gmail.com (2007-01-28) |
From: | xareon@gmail.com |
Newsgroups: | comp.compilers |
Date: | 28 Jan 2007 01:38:59 -0500 |
Organization: | Compilers Central |
Keywords: | Java, parse, question |
Posted-Date: | 28 Jan 2007 01:38:59 EST |
hi all, after someone in comp.lang.java.programmer told me to post
here my issue, i'll tell you my problem. i'm writing a sort of
universal parser for type 1 grammars, that given a grammar and a
string, check whether or not that string belongs to the language
generated by the given grammar. let me explain you what did i do with
the following java code:
public boolean check(int scanIndex)
{
productionSet = givenGrammar.getProductionSet();
startSymbol = givenGrammar.getStartSymbol();
if(sententialFormStack.isEmpty())
{
for (i = scanIndex; i < productionSet.size();
i++)
{
currentLeftSide =
productionSet.elementAt(i).getLeftSide();
currentRightSide =
productionSet.elementAt(i).getRightSide();
if (currentLeftSide.equals(startSymbol)
&&
currentRightSide.length() <= stringToCheck.length())
{
if(currentLeftSide.equals(stringToCheck))
{
return true;
}
sententialFormStack.push(new
Ancestor(startSymbol, i));
return check(0);
}
}
return false;
}
else
{
currentSententialForm =
sententialFormStack.peek().getSententialForm();
currentScanIndex =
sententialFormStack.peek().getProductionIndexUsedToUnfold();
for(i = scanIndex; i < productionSet.size();
i++)
{
currentLeftSide =
productionSet.elementAt(i).getLeftSide();
currentRightSide =
productionSet.elementAt(i).getRightSide();
totalLength =
currentSententialForm.length() +
currentRightSide.length() - currentLeftSide.length();
if(currentSententialForm.contains(currentLeftSide) && totalLength
<= stringToCheck.length())
{
newSententialForm =
currentSententialForm.replaceFirst(currentLeftSide, currentRightSide);
if(newSententialForm.equals(stringToCheck))
{
return true;
}
sententialFormStack.peek().setProductionIndexUsedToUnfold(i);
sententialFormStack.push(new
Ancestor(newSententialForm, -1));
return check(0);
}
}
if(currentSententialForm.equals(startSymbol))
{
return false;
}
sententialFormStack.removeElementAt(sententialFormStack.size()-1);
currentScanIndex =
sententialFormStack.peek().getProductionIndexUsedToUnfold();
sententialFormStack.peek().setProductionIndexUsedToUnfold(-1);
return check(currentScanIndex+1);
}
}
i have some object types that i defined in some classes, let me show
you which they are:
Production, that's made up of a couple of strings, (leftSide,
rightSide), each representing the right or left side of a production A
-> B
Grammar, that's made up of a string (startSymbol) and a
Vector<Production> (productionSet)
Ancestor, that's made up of a string (sententialForm) and an int
(productionIndexUsedToUnfold).
then i defined in my class a sententialFormStack, that's a
Stack<Ancestor>, and it's the stack of the sentential forms for which i
may have some other unfloding possibilities that i have to analyze to
check whether my stringToCheck belongs to the language generated from a
given grammar.
the method parse, takes as input an integer (scanIndex), and that's the
pointer to the productionSet that defines where do i have to start my
research for unfolding possibilities.
that's what i do: if the stack is empty, i search in the productionSet
a production like startSymbol => sententialForm, where |sententialForm|
<= |stringToCheck| (because we know that type 1 grammars generate
bigger or same-length words each unfolding step). as soon as i find a
"good" production, i push into the stack the ancestor(startSymbol, i),
where i is the index of the production used to unfold the axiom
startSymbol. then starts a recursive calling of check.
if the stack is not empty, i'm in "the middle" of the unfolding tree:
then i look at the element at the top of the stack, and search in the
productionSet some left sides that are into the sententialForm where i
got into. where i find them, i continue the unfolding, checking each
time if the stringToCheck matches the currentSententialForm. if i find
it, i push it into the stack with the index of the production used, if
i cannot find no more productions to unfold currentStringToCheck, then
i remove the element at the top of the stack and continue my parsing
from the index of the production i used to get the old sententialForm.
the algorithm is working correct, as i could check trying to run it
with some stringToChecks and some givenGrammars, only for inputs that
don't have to make to much pushes/pop into the stack. otherwise i get a
stackoverflow error (i think it's due to too many recursive callings).
i hope i won't have to change algorithm, i've been working on it for a
while :s
Return to the
comp.compilers page.
Search the
comp.compilers archives again.