SUMMARY: parsing C++ header files

John Mitchell <104316.1514@CompuServe.COM>
11 Apr 1996 23:50:57 -0400

          From comp.compilers

Related articles
Parsing C++ headers? 104316.1514@CompuServe.COM (John Mitchell) (1996-04-06)
SUMMARY: parsing C++ header files 104316.1514@CompuServe.COM (John Mitchell) (1996-04-11)

From: John Mitchell <104316.1514@CompuServe.COM>
Newsgroups: comp.compilers,comp.lang.c++
Date: 11 Apr 1996 23:50:57 -0400
Organization: Claremont Technologies
References: 96-04-033
Keywords: C++, parse, summary

Hi Everyone,

A few days ago I posted a query for information about parsing C++
header files. In response to a couple of requests, here's a summary of
the responses.

Firstly, another request: Most of the feedback related to full-scale
C++ compilation rather than header file parsing, so if you've any
insights on:

- any tools available that could be applied to the problem of
extracting C++ class interface information (methods, public data
members, inheritance)

- the specific difficulties (lookahead problems, etc.) of C++ interface
extraction from header files, as opposed to full C++ parsing

- whether templates add extra ambiguities to header file parsing in
addition to those discussed by Roskind (see below)

--I'd be interested to hear your comments.

The original request was:

>I want to parse C++ header files, and extract information about the
>contained classes and their interface. A yacc-able grammar for this
>type of problem ( or superset of it, as for compilation) would be a
>great help. I'm aware of a grammar for C++ 2.0 out there - is there a
>later version supporting templates?
>Alternatively, any suggestions on shareware/commercial parsers
>(Windows/UNIX) that would be up to the job would be appreciated.
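
To make the goal concrete, here is a minimal sketch of the kind of record an interface extractor might produce, with a deliberately naive regex scan that finds class definitions and their base clauses. The names ClassInfo and findClasses are hypothetical, invented for this illustration; nothing here comes from Roskind's grammar or from any of the tools mentioned in this post. It ignores templates, macros, comments and nested classes, which is where a real grammar becomes necessary.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical record for what an interface extractor might report.
struct ClassInfo {
    std::string name;   // class or struct name
    std::string bases;  // raw text of the base clause, e.g. "public Shape"
};

// Naive scan for `class X : bases {` and `struct X {` definitions.
// Forward declarations such as `class X;` are skipped because the
// pattern insists on an opening brace. Illustrative only: it will be
// fooled by `template <class T>`, comments, macros and string literals.
std::vector<ClassInfo> findClasses(const std::string& header) {
    static const std::regex pattern(
        R"(\b(?:class|struct)\s+([A-Za-z_]\w*)\s*(?::\s*([^{;]*?))?\s*\{)");
    std::vector<ClassInfo> found;
    for (std::sregex_iterator it(header.begin(), header.end(), pattern), end;
         it != end; ++it) {
        // Group 1 is the class name; group 2 is empty when there is no base clause.
        found.push_back({(*it)[1].str(), (*it)[2].str()});
    }
    return found;
}
```

Calling findClasses on a header that defines Shape and a Circle derived from it would yield two records, the second carrying the base clause text "public Shape". Methods and public data members are left out on purpose: picking those out reliably is the part that needs a proper parser.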

Firstly, the grammar I name-dropped above is available at lots of repositories

Some comments:

from "John H. Spicer",

>Edison Design Group (EDG) sells a C++ front end that can certainly do
>the job. It isn't cheap, but doing a parser that understands all of
>C++ is no small task (particularly if you want to handle the full
>language including templates, namespaces,

from Mary Fernandez,
I would suggest looking at two tools:
Sage++ at, and
the EDG C++ frontend at

Sage is in the public domain, the EDG front end is not. However, EDG
may have some fair-use agreement for academics. So you'd have to
contact them to determine if it's free. I have only passing knowledge
of Sage, but have looked at EDG's internal rep quite a bit. The
benefit of EDG's frontend is that it is a commercial, full-fledged C++
front end (i.e., it covers the complete ARM) and its internal rep
(IR) is very well documented (this is rare). Sage++ also appears to
be well documented and there is an API for accessing its IR. Also,
you might find the EDG IR easier to manipulate than having to hack a
C++ grammar.

from me:


Roskind's C++ grammar (version 2.0, parses C++ 2.1) is the latest
shareware grammar that I have been able to locate. There's an excellent
discussion in the documentation of C++-specific parsing problems.
(Most importantly, deciding whether something is a declaration or an
expression can require more lookahead than Yacc provides, so you need a
"smart lexer" that can look ahead and discriminate.)
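
As a rough illustration of the problem: in C++ the statement `T(a);` declares a variable `a` if `T` names a type, but is a function call if `T` names a function. One common form of "smart lexer" resolves this by consulting a table of known type names and handing the parser a different token kind for each case. The sketch below assumes that approach; the names Tok, Token, lex and looksLikeDeclaration are hypothetical, and this is not code from Roskind's distribution.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Token kinds a yacc grammar might be fed; the names are illustrative.
enum class Tok { TypeName, Identifier, Punct };

struct Token {
    Tok kind;
    std::string text;
};

// Hypothetical "smart lexer": words found in typeNames are returned as
// TypeName tokens, all other words as Identifier. Every other
// non-space character (digits included) becomes a one-character Punct
// token, which is enough for this sketch.
std::vector<Token> lex(const std::string& src,
                       const std::set<std::string>& typeNames) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalpha(c) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) ||
                    src[i] == '_'))
                ++i;
            std::string word = src.substr(start, i - start);
            Tok kind = typeNames.count(word) ? Tok::TypeName : Tok::Identifier;
            out.push_back({kind, word});
        } else {
            out.push_back({Tok::Punct, std::string(1, src[i])});
            ++i;
        }
    }
    return out;
}

// Crude stand-in for the grammar's decision on a `name ( ... ) ;`
// statement: it is treated as a declaration when it starts with a type name.
bool looksLikeDeclaration(const std::vector<Token>& toks) {
    assert(!toks.empty());
    return toks.front().kind == Tok::TypeName;
}
```

With `T` registered as a type name, lexing `T(a);` yields a leading TypeName token and the statement is classed as a declaration, while `f(a);` yields a leading Identifier and is classed as an expression. In a real front end the type-name table has to be maintained by the parser as it processes declarations, which is where most of the difficulty lies.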

(Note: Roskind mentioned lack of consensus on what constitutes the C++
language as the reason for not providing templates, exception
handling, etc. Since then C++ vendors have gone off implementing these
features in different ways, so the definition of what constitutes a
"C++ grammar" depends on who you're talking to.)

pcyacc from Abraxas also has a grammar based on Roskind's, again
pre-templates (primarily, they say, because of lack of consensus on the
language rather than technical difficulties; they can offer support on
adding them). They support a more specific and current range of
compilers in their codecheck product (using codecheck rules rather
than yacc).

VisualParse++ from Sandstone, (503) 244 5253, has a C++ grammar based
on Roskind's, and again pre-templates. They claim their parser can
represent any LALR(k) language (using "InfiLook", not Yacc).

I don't have any more info on how good these products and their internal
representations are (though representatives from both impressed me as
being extremely knowledgeable).

So that's it - I'd recommend Roskind's discussion notes as a first
(and maybe last) step.

John Mitchell

