Related articles |
---|
Parsing C++ headers? 104316.1514@CompuServe.COM (John Mitchell) (1996-04-06) |
SUMMARY: parsing C++ header files 104316.1514@CompuServe.COM (John Mitchell) (1996-04-11) |
From: | John Mitchell <104316.1514@CompuServe.COM> |
Newsgroups: | comp.compilers,comp.lang.c++ |
Date: | 11 Apr 1996 23:50:57 -0400 |
Organization: | Claremont Technologies |
References: | 96-04-033 |
Keywords: | C++, parse, summary |
Hi Everyone,
A few days ago I posted a query for information about parsing C++
header files. In response to a couple of requests, here's a summary of
responses.
Firstly, another request: Most of the feedback related to full-scale
C++ compilation rather than header file parsing, so if you've any
insights on:
- any tools available that could be applied to the problem of
extracting C++ class interface information (methods, public data
members, inheritance)
- the specific difficulties (lookahead problems etc.) of C++ interface
extraction from header files, as opposed to full C++ parsing
- whether templates add extra ambiguities to header file parsing in
addition to those discussed by Roskind (see below)
--I'd be interested to hear your comments.
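To make "class interface information" concrete, here is a minimal sketch of the kind of record such a tool might emit for one class. The struct, its field names and the Circle example are hypothetical, chosen only for illustration; they are not taken from any of the tools mentioned in this post.

```cpp
#include <string>
#include <vector>

// Hypothetical record for one class found in a header file.
// All names here are illustrative assumptions.
struct ClassInterface {
    std::string name;
    std::vector<std::string> bases;        // inheritance, e.g. "public Shape"
    std::vector<std::string> methods;      // public member function signatures
    std::vector<std::string> public_data;  // public data members
};

// What an extractor might produce for this header fragment:
//
//   class Circle : public Shape {
//   public:
//       double radius;
//       double area() const;
//   private:
//       int id_;
//   };
ClassInterface example_circle() {
    ClassInterface c;
    c.name = "Circle";
    c.bases.push_back("public Shape");
    c.public_data.push_back("double radius");
    c.methods.push_back("double area() const");
    return c;  // the private member id_ is deliberately left out
}
```

Note that the record is filled in by hand here; producing it automatically from arbitrary headers is exactly the parsing problem under discussion.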
The original request was:
>I want to parse C++ header files, and extract information about the
>contained classes and their interface. A yacc-able grammar for this
>type of problem ( or superset of it, as for compilation) would be a
>great help. I'm aware of a grammar for C++ 2.0 out there - is there a
>later version supporting templates?
>
>Alternatively, any suggestions on shareware/commercial parsers
>(Windows/UNIX) that would be up to the job would be appreciated.
>
Firstly, the grammar I mentioned in the above request is available at lots of repositories,
e.g. ftp://ftp.funet.fi/pub/languages/c++/c++grammar2.0.tar.gz
Some comments:
from "John H. Spicer", INTERNET:jhs@edg.com:
>Edison Design Group (EDG) sells a C++ front end that can certainly do
>the job. It isn't cheap, but doing a parser that understands all of
>C++ is no small task (particularly if you want to handle the full
>language including templates, namespaces,
from Mary Fernandez, INTERNET:mff@research.att.com
----
I would suggest looking at two tools:
Sage++ at http://www.extreme.indiana.edu/sage/overview.html, and
the EDG C++ frontend at http://www.edg.com
Sage is in the public domain, the EDG front end is not. However, EDG
may have some fair-use agreement for academics. So you'd have to
contact them to determine if it's free. I have only passing knowledge
of Sage, but have looked at EDG's internal rep quite a bit. The
benefit of EDG's frontend is that it is a commercial, full-fledged C++
front end (i.e., it covers the complete ARM) and its internal rep
(IR) is very well documented (this is rare). Sage++ also appears to
be well documented and there is an API for accessing its IR. Also,
you might find the EDG IR easier to manipulate than having to hack a
C++ grammar.
-----
from me:
Shareware:
Roskind's C++ grammar (version 2.0, parses C++ 2.1) is the latest
shareware grammar that I have been able to locate. There's an excellent
discussion in the documentation of C++-specific parsing problems.
(Most importantly, deciding whether something is a declaration or an
expression statement can require arbitrary lookahead, which Yacc's
single token of lookahead can't provide; you need a "smart lexer" that
can look ahead and discriminate.)
(Note: Roskind mentioned lack of consensus on what constitutes the C++
language as the reason for not providing templates, exception
handling etc. Since then C++ vendors have gone off implementing these
features in different ways - so the definition of what constitutes a
"C++ grammar" depends on who you're talking to)
Commercial:
pcyacc from Abraxas also has a grammar based on Roskind's, pre-templates
(primarily, they say, because of the lack of consensus on the language
rather than technical difficulties; they can offer support on adding them).
They support a more specific and current range of compilers in their
codecheck product (using codecheck rules rather than yacc).
VisualParse++ from Sandstone, (503) 244 5253, has a C++ grammar based
on Roskind's, and again pre-templates. They claim their parser can
represent any LALR(k) language (using "InfiLook", not Yacc).
I don't have any more info on how good these products and their internal
representations are (though representatives from both impressed me as
being extremely knowledgeable).
So that's it - I'd recommend Roskind's discussion notes as a first
(and maybe last) step.
John Mitchell
--