parsing C++ headers, a partial survey of what's out there

mullin@taligent.com
13 Apr 1996 23:03:39 -0400

          From comp.compilers


From: mullin@taligent.com
Newsgroups: comp.compilers
Date: 13 Apr 1996 23:03:39 -0400
Organization: Compilers Central
Keywords: C++, parse

In reference to the comments on parsing C++ headers, and given that
1) the target is modern C++, missing at most namespaces and RTTI
2) there is some level of interest in handling preprocessor stuff


SUMMARIZING
1) ANTLR is cool, but the NeXT C++ grammar for it is convoluted & a bit buggy
2) Roskind doesn't do templates and other such stuff
3) Sage is at least worth a look, and it includes a Yacc/Bison C++
grammar that handles templates


A) ANTLR (of PCCTS) @
http://www.igs.net/~mtr/software-development/pccts.htm
ANTLR is very nice and someday will be used to make a really cool
parser. The NeXT C++ grammar for it is not that parser, and needs a
great deal of rework to be useful. Its problems include: functions
declared without a type are reported as syntax errors, operator~ is
mistaken for a malformed destructor, and what little has been done
to pass syntactic information up to higher rules gets in the way when
trying to do real work. But those are problems with that _parser_, not
with ANTLR, which may well become the new standard for writing parsers.
It is "cool", and the author, Terence Parr, is helpful, as are others
on comp.compilers.tools.pccts.
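
To make the operator~ complaint concrete, here is a small sketch of the
kind of header text involved. The class Mask and the helper check_mask
are made up for illustration and are not taken from the NeXT grammar or
from any real header. The point is only that a destructor and a
complement operator both put a '~' inside a class body, so a grammar has
to look at what surrounds the '~' to tell them apart.

```cpp
#include <cassert>

// Hypothetical class, invented for this example.
class Mask {
public:
    explicit Mask(unsigned bits) : bits_(bits) {}

    // Destructor: '~' immediately followed by the class name.
    ~Mask() {}

    // Complement operator: the keyword 'operator' immediately followed by '~'.
    Mask operator~() const { return Mask(~bits_); }

    unsigned bits() const { return bits_; }

private:
    unsigned bits_;
};

// Small self-check showing that both members are usable side by side.
inline void check_mask() {
    Mask m(0x0Fu);
    assert((~m).bits() == ~0x0Fu);
}
```

Any C++ compiler accepts this, so a header parser that reports the
second member as a broken destructor is disagreeing with the language,
not with the header.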


B) ROSKIND GRAMMAR @
ftp://ftp.funet.fi/pub/languages/c++/c++grammar2.0.tar.gz
The Roskind grammar is old, doesn't know templates, and can't really be
considered a candidate unless you are effectively planning to write
the whole thing yourself and just use it as a guide. In that case, I'd
first port it to ANTLR and then fix it up. Many of the problems with the
yacc environment that Roskind discusses are moot in the ANTLR environment.


C) Sage @
http://www.extreme.indiana.edu/sage/overview.html
This looks very interesting. It includes two tools, pC++2dep and dep2C++,
which convert C++ to an AST (I think) and back again after some
munging. It contains a reasonably clear yacc specification for
C++ which, on a quick scan, I have at least determined handles template
classes and functions. Overall it looks pretty and well organized,
but I have only given it a cursory look so far.


FWIW, I think that parsing class definitions (which are often found in
headers ;-) ) is the hardest part of parsing C++; statements (minus the
class and type bits) are no harder than they ever were. So if you
want to write something that can parse a header, especially if you
want to handle preprocessor stuff, you are dealing with the majority of the
problem of parsing C++. That assumes you have already proved to yourself
that you can't get away with a simpler heuristic scanner.
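
For anyone wondering what a "simpler heuristic scanner" might look like,
here is a minimal sketch. Everything in it (the names scan_class_names
and demo_scan, and the rule it applies) is an assumption made up for
illustration, not something taken from ANTLR, Roskind or Sage. It just
reports whatever identifier follows the word "class" or "struct", with
no preprocessing and no handling of comments or strings.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: collect the identifier that directly follows the
// keyword 'class' or 'struct'. Purely lexical; it does not parse anything.
std::vector<std::string> scan_class_names(const std::string& text) {
    std::vector<std::string> names;
    std::string prev;  // the previous identifier-like word, if still adjacent
    std::size_t i = 0;
    while (i < text.size()) {
        const unsigned char c = static_cast<unsigned char>(text[i]);
        if (std::isalpha(c) || c == '_') {
            // Consume one identifier-like word.
            const std::size_t start = i;
            while (i < text.size() &&
                   (std::isalnum(static_cast<unsigned char>(text[i])) ||
                    text[i] == '_')) {
                ++i;
            }
            const std::string word = text.substr(start, i - start);
            if (prev == "class" || prev == "struct") {
                names.push_back(word);
            }
            prev = word;
        } else {
            // Anything other than whitespace separates a keyword from a name.
            if (!std::isspace(c)) {
                prev.clear();
            }
            ++i;
        }
    }
    return names;
}

// Usage: the template parameter T is reported as if it were a class,
// which is exactly the kind of mistake a heuristic makes.
inline void demo_scan() {
    const std::vector<std::string> got =
        scan_class_names("class Foo {};\ntemplate <class T> class Baz {};");
    const std::vector<std::string> want = {"Foo", "T", "Baz"};
    assert(got == want);
}
```

If false positives like T, or class names hidden behind macros, are
acceptable for your purposes, something of this size may be enough. If
not, you are back to needing a real grammar.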


Anyway, ANTLR is great, but the NeXT grammar for it is too problematic to
be seen as anything beyond a guide. The Roskind grammar is a little
long in the tooth to be useful with modern C++, and Sage looks like it
might have some real potential to be of use (even though this isn't
what it was designed to do!). Daze and daze of bothering search
and metasearch engines haven't found much else.


Mark Mullin, hacking headers at Taligent
--

