|COBOL Parsers email@example.com (Mark Rickan) (2000-04-15)|
|Re: COBOL Parsers firstname.lastname@example.org (Ken Foskey) (2000-04-16)|
|Re: COBOL Parsers email@example.com (Vadim Maslov) (2000-04-16)|
|Re: COBOL Parsers firstname.lastname@example.org (John H. Lindsay) (2000-04-17)|
|Re: COBOL Parsers email@example.com (Tim Josling) (2000-04-20)|
|Re: COBOL Parsers thaneH@softwaresimple.com (2000-04-25)|
|From:||Vadim Maslov <firstname.lastname@example.org>|
|Date:||16 Apr 2000 20:15:08 -0400|
Mark Rickan wrote:
> Does anyone have any insights/experience on options for parsing COBOL?
> I am working on a project where we will need to extract data file
> declarations and access these files using other applications using
> multiplatform C/C++.
In fact, the grammar of Cobol is fairly tricky at times.
Example: the Cobol grammar is not LALR(1); that is, it requires more
than one token of lookahead.
Example: to distinguish between the two forms of the PERFORM statement
PERFORM A OF B TIMES COMPUTE X=Y+Z END-PERFORM and
PERFORM A OF B COMPUTE X=Y+Z
we need a lookahead of 4 tokens.
In the first form, A OF B is a data item name
that holds the counter for the PERFORM ... TIMES ... END-PERFORM statement.
In the second form, A OF B is the name of the paragraph performed
by the PERFORM statement.
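To make the lookahead concrete, here is a minimal Python sketch of the
decision for just the two forms above. It is an illustration only, not
how any particular Cobol parser does it: the function name
classify_perform, the result labels, and the whitespace split standing
in for a real lexer are all assumptions made for this example.

```python
def classify_perform(tokens):
    """Decide which of the two PERFORM forms a token list starts with.

    Returns (kind, lookahead), where lookahead is the number of tokens
    after PERFORM that had to be examined before the forms diverge.
    Hypothetical helper: it only covers the two forms in the example.
    """
    if not tokens or tokens[0].upper() != "PERFORM":
        raise ValueError("expected a PERFORM statement")
    if len(tokens) < 2:
        raise ValueError("PERFORM without an operand")

    i = 2  # index of the token right after the first name
    # Skip the qualification: each OF/IN is followed by another name.
    while i + 1 < len(tokens) and tokens[i].upper() in ("OF", "IN"):
        i += 2

    # Only now is the deciding token visible.
    if i < len(tokens) and tokens[i].upper() == "TIMES":
        return "times-loop", i       # A OF B is a data item (the counter)
    return "procedure-call", i       # A OF B is a paragraph name


# str.split() is a stand-in for a real Cobol lexer.
form1 = "PERFORM A OF B TIMES COMPUTE X=Y+Z END-PERFORM".split()
form2 = "PERFORM A OF B COMPUTE X=Y+Z".split()
print(classify_perform(form1))
print(classify_perform(form2))
```

In both cases the deciding token (TIMES or COMPUTE) is the fourth one
after PERFORM, which is why a single token of lookahead is not enough.
The sketch scans the whole qualified name rather than a fixed four
tokens, so it would also cope with a longer qualification.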
This could probably be handled with some heavy symbol-table-based
trickery, but that does not really work here, because in some dialects
A OF B can be both a paragraph name and a data item name (two different
objects can have the same name!).
So writing your own Cobol grammar -- one that actually works -- is too
expensive, as there are many pitfalls along the way. It took us 3 years
to get it right, which I would not call easy.