help: using yacc, and my grammar needs to count items. (Steve Mading)
18 Apr 1999 02:16:07 -0400

          From comp.compilers

Related articles
help: using yacc, and my grammar needs to count items. (1999-04-18)
Re: help: using yacc, and my grammar needs to count items. (1999-04-19)
Re: help: using yacc, and my grammar needs to count items. debray@CS.Arizona.EDU (1999-04-26)
| List of all articles for this month |

From: (Steve Mading)
Newsgroups: comp.compilers
Date: 18 Apr 1999 02:16:07 -0400
Organization: University of Wisconsin, Madison
Keywords: parse, question, comment

I am running into something I can't express in BNF, and I can't figure
out how to make lex/yacc parse this correctly. Can anyone offer
advice on this? My problem is that my grammer needs to count items
and group them into 'rows' in a table even though there are no

Here's an example using this language's syntax:



The idea is that this represents a table of data like so:

            name1 name2 name3
            value1-1 value2-1 value2-1
            value1-2 value2-2 value2-2
            value1-3 value2-3 value2-3

But, as you can see, the problem is that until the program finishes
parsing the list of names, it doesn't know how many values exist in a
row. (a row could be 1 column, or 100, it depends on the size of the
name list). This is not expressable in a BNF-like grammer without
making seperate rules for each possible number of tags like so:

        Loop :== onename onevalrows |
                          twonames twovalrows |
threenames threevalrows |
... etc ...
        onename :== name
        onevalrows :== value | onevalRows value
        twonames :== name name
        twovalrows :== value value | twovalRows value value

This is obviously not a practical solution when the list of names can be
any large size.

What I need to do is to be able to say "A foo consists of N bars, where
N is to be determined at runtime."

Does anyone know how to solve this?

(Before you suggest it, no I can't change the grammar - I'm working off
a syntax spec created by biochemists that has become a de-facto standard,

What I did previously to solve the problem was to simply read the
whole list of values, and then break them up into rows later on at the
end in C code. But this is becoming impractical as the loops get
bigger and bigger. (Yacc is eating up too much memory trying to parse
such a huge list without reaching the end of it.)

[I think your original solution is still the best. If yacc is eating up
too much space, that generally means that you have right-recursive rather
than left-recursive rules. Left recursive rules can loop indefinitely
without the parser using any more space. Another solution is to use a
Gross Lexical Hack, e.g.:

loop: names rows ;

names: NAME | names NAME ;

rows: row | rows row ;

row: values EOR ;

values: VALUE | values VALUE ;

Now each row is parsed as a "row" rule, nice and tidy. But, you ask,
where does the EOR come from? From the lexer, of course. Once you've
parsed the names, you can count them and tell that N values go in a
row. You have your lexer count the VALUEs it returns and after every
N values, return an EOR. This is disgustingly ad-hoc but will
probably add only six lines of code to your lexer. -John]

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.