Related articles |
---|
help: using yacc, and my grammar needs to count items. madings@baladi.nmrfam.wisc.edu (1999-04-18) |
Re: help: using yacc, and my grammar needs to count items. gneuner@dyn.com (1999-04-19) |
Re: help: using yacc, and my grammar needs to count items. debray@CS.Arizona.EDU (1999-04-26) |
From: | madings@baladi.nmrfam.wisc.edu (Steve Mading) |
Newsgroups: | comp.compilers |
Date: | 18 Apr 1999 02:16:07 -0400 |
Organization: | University of Wisconsin, Madison |
Keywords: | parse, question, comment |
I am running into something I can't express in BNF, and I can't figure
out how to make lex/yacc parse this correctly. Can anyone offer
advice on this? My problem is that my grammer needs to count items
and group them into 'rows' in a table even though there are no
delimiters.
Here's an example using this language's syntax:
loop_
_name1
_name2
_name3
value1-1
value2-1
value3-1
value1-2
value2-2
value3-2
value1-3
value2-3
value3-3
stop_
The idea is that this represents a table of data like so:
name1 name2 name3
-----------------------------------
value1-1 value2-1 value2-1
value1-2 value2-2 value2-2
value1-3 value2-3 value2-3
But, as you can see, the problem is that until the program finishes
parsing the list of names, it doesn't know how many values exist in a
row. (a row could be 1 column, or 100, it depends on the size of the
name list). This is not expressable in a BNF-like grammer without
making seperate rules for each possible number of tags like so:
Loop :== onename onevalrows |
twonames twovalrows |
threenames threevalrows |
... etc ...
onename :== name
onevalrows :== value | onevalRows value
twonames :== name name
twovalrows :== value value | twovalRows value value
...etc...
This is obviously not a practical solution when the list of names can be
any large size.
What I need to do is to be able to say "A foo consists of N bars, where
N is to be determined at runtime."
Does anyone know how to solve this?
(Before you suggest it, no I can't change the grammar - I'm working off
a syntax spec created by biochemists that has become a de-facto standard,
unfortunately.)
What I did previously to solve the problem was to simply read the
whole list of values, and then break them up into rows later on at the
end in C code. But this is becoming impractical as the loops get
bigger and bigger. (Yacc is eating up too much memory trying to parse
such a huge list without reaching the end of it.)
[I think your original solution is still the best. If yacc is eating up
too much space, that generally means that you have right-recursive rather
than left-recursive rules. Left recursive rules can loop indefinitely
without the parser using any more space. Another solution is to use a
Gross Lexical Hack, e.g.:
loop: names rows ;
names: NAME | names NAME ;
rows: row | rows row ;
row: values EOR ;
values: VALUE | values VALUE ;
Now each row is parsed as a "row" rule, nice and tidy. But, you ask,
where does the EOR come from? From the lexer, of course. Once you've
parsed the names, you can count them and tell that N values go in a
row. You have your lexer count the VALUEs it returns and after every
N values, return an EOR. This is disgustingly ad-hoc but will
probably add only six lines of code to your lexer. -John]
Return to the
comp.compilers page.
Search the
comp.compilers archives again.