Re: An Odd Grammar Question

Vadim Maslov <vadik@siber.com>
17 May 1998 00:14:47 -0400

          From comp.compilers

From: Vadim Maslov <vadik@siber.com>
Newsgroups: comp.compilers
Date: 17 May 1998 00:14:47 -0400
Organization: Siber Systems
References: 98-05-088
Keywords: parse, Cobol, comment

William Moran wrote:
> I have the following odd Yacc problem. Assume that we have three kinds of
> productions (key words are in caps):
>
> read -> READ some_stuff END-READ
> if -> IF some-stuff END-IF
> foo -> END-ALL FOO some-stuff END-FOO
>
> the problem is that the END-ALL terminates and replaces all of the other
> ENDs one would expect. So,
>
> READ ...
> READ ...
> READ ...
> END-ALL
>
> i.e. we would have been expecting 3 END-READs, and instead we got 1 END-ALL,
> and this is legal. Anyone know of an easy way to express this sort of
> thing in Yacc (I have thought about pushing the token back on the stream
> the appropriate number of times).


Most likely, you are trying to write a Cobol grammar in Yacc.
This is possible but not easy.
We at Siber Systems use the following trick in our Cobol grammar:


statement: RW_IF condition _then
                        { /* mid-rule action; its value is $4 below */
                          $$ = new SctExprOper(STMT_IF);
                          $$->SetArg(IF_CONDITION,$2);
                        }
                        statements_inside
                        if_else
                        end_if
                        { $$ = $4;
                          $$->SetArg(IF_THEN,$5);
                          $$->SetArg(IF_ELSE,$6);
                          $$->SetArg(IF_END_IF,$7);
                        }
            ;


end_if:       { $$ = new SctNullNode(); } %prec PRTY_LOW
            | RW_END_IF
              { $$ = new SctExprOper(EX_END_IF); }
            ;


Here SEP_DOT (your END-ALL) may end an IF without END_IF being present,
because the end_if rule also has an empty alternative.
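To make the idea concrete outside of Yacc, here is a minimal hand-written sketch of the same "optional closer" trick. It is not our grammar or our code: the token names ("IF", "READ", "STMT", "END-ALL"), the Result struct and the parse() function are all assumptions made up for illustration. It is also stricter than the grammar above, since a block may omit its own closer only when the next token is the super close. The super close itself is never consumed by a block, so every enclosing block sees it in turn and ends as well.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token names: "END-ALL" stands in for the super close
// (SEP_DOT in the grammar above), "STMT" for any simple statement.
struct Result {
    bool ok;              // whole input accepted
    int explicit_closes;  // blocks ended by their own END-xxx
    int implicit_closes;  // blocks ended by the super close
};

class Parser {
public:
    explicit Parser(const std::vector<std::string>& toks) : toks_(toks) {}

    // sentence: statements [END-ALL]
    Result parse_sentence() {
        bool ok = parse_statements();
        if (ok && peek() == "END-ALL") ++pos_;  // consumed once, at the top
        ok = ok && pos_ == toks_.size();
        return Result{ok, explicit_closes_, implicit_closes_};
    }

private:
    const std::string& peek() const {
        static const std::string eof = "<eof>";
        return pos_ < toks_.size() ? toks_[pos_] : eof;
    }

    // statements: zero or more simple statements or nested blocks
    bool parse_statements() {
        for (;;) {
            const std::string& t = peek();
            if (t == "IF") {
                if (!parse_block("END-IF")) return false;
            } else if (t == "READ") {
                if (!parse_block("END-READ")) return false;
            } else if (t == "STMT") {
                ++pos_;
            } else {
                return true;  // a closer, END-ALL, or end of input
            }
        }
    }

    // block: opener statements [closer]
    // The closer may be missing only if the lookahead is END-ALL.
    bool parse_block(const std::string& closer) {
        ++pos_;  // consume the opener
        if (!parse_statements()) return false;
        if (peek() == closer) { ++pos_; ++explicit_closes_; return true; }
        if (peek() == "END-ALL") { ++implicit_closes_; return true; }
        return false;  // neither closer nor super close: syntax error
    }

    const std::vector<std::string>& toks_;
    std::size_t pos_ = 0;
    int explicit_closes_ = 0;
    int implicit_closes_ = 0;
};

Result parse(const std::vector<std::string>& toks) {
    return Parser(toks).parse_sentence();
}
```

With this sketch, three nested READs followed by a single END-ALL are accepted with three implicit closes and no pushing of tokens back onto the stream. The decision is made on one token of lookahead, which is why the same shape fits into an LALR(1) grammar as an empty alternative.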


This actually works without backtracking. But once you really get into
implementing a Cobol grammar, you will see that Yacc cannot really parse
Cobol, because the Cobol grammar requires more than one token of lookahead.


So then you would need to get something like BtYacc -- Backtracking
Yacc. It is available at http://www.siber.com/btyacc/ for free.


And finally, if you are really writing a product that analyzes and/or
converts Cobol code, you will save a lot of time and money just by
licensing the Cobol lexer, parser, grammar, AST and code generator from
Siber Systems. My prediction is that a quality Cobol parser may well
take you 2-3 years to implement (the problem you encountered is
an easy one; the other ones are much harder).


More info is available at http://www.siber.com/sct/.


Regards,
Vadim Maslov
[There are other languages than Cobol with a "super close". Various
cruddy old Basics come to mind. -John]