Automatic 'C' documentation compilation. (C++ too?) (Graham Stoney)
Thu, 14 Oct 1993 03:14:16 GMT

          From comp.compilers

Newsgroups: comp.compilers
From: (Graham Stoney)
Summary: Anyone willing to add C++ support to c2man?
Keywords: C, C++, tools, design
Organization: Canon Information Systems Research Australia
Date: Thu, 14 Oct 1993 03:14:16 GMT

Writing and maintaining documentation has often been a thorn in the side
of the Software Engineer and Programmer. After spending a great deal of
time and effort writing documentation about a program or software system,
the code invariably changes, quickly rendering the documentation out of
date. The documentation becomes misleading, gets neglected, and quickly
becomes useless.

"Literate Programming" is one approach to solving this problem. It
effectively introduces a whole new (typesetting) language, requires a
quite radical shift on the part of the "non-literate" programmer and still
requires a good deal of effort on the part of the programmer[1].

I'd like to suggest a different approach which lies considerably closer to
more traditional programming practices, and can offer quite immediate
benefits when functional interface documentation is the main documentation

The primary philosophy here is to use the programming language as far as
possible to express the programmer's intentions, and to use comments only
when the programming language is not sufficiently expressive. A comment
can then become part of the language grammar which is recognised by a
"documentation compiler". This tool parses a superset of the programming
language and can automatically generate documentation in human-readable
form by associating the programmer's comments with the objects in the code
by their context.
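
As a rough illustration of associating a comment with an object "by its
context", the sketch below splits one parameter declaration into the part
the compiler reads and the description that follows it. It only scans
strings, whereas a real documentation compiler would do this inside its
parser. The names split_param and copy_trimmed are hypothetical and are
not taken from any existing tool.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Copy src[0..len) into dst (capacity cap), trimming surrounding
 * whitespace and always NUL-terminating the result. */
static void copy_trimmed(char *dst, size_t cap, const char *src, size_t len)
{
    assert(cap > 0);
    while (len > 0 && isspace((unsigned char)src[0])) { src++; len--; }
    while (len > 0 && isspace((unsigned char)src[len - 1])) len--;
    if (len >= cap) len = cap - 1;      /* truncate rather than overflow */
    memcpy(dst, src, len);
    dst[len] = '\0';
}

/* Split one parameter declaration that may carry a trailing comment into
 * the declaration text and the comment text. Returns 1 if a complete
 * comment was found, 0 otherwise (decl then receives the whole input and
 * doc is set to the empty string). */
int split_param(const char *param,
                char *decl, size_t decl_cap,
                char *doc, size_t doc_cap)
{
    const char *start = strstr(param, "/*");
    const char *stop = start ? strstr(start + 2, "*/") : NULL;

    if (start == NULL || stop == NULL) {
        copy_trimmed(decl, decl_cap, param, strlen(param));
        copy_trimmed(doc, doc_cap, "", 0);
        return 0;
    }
    /* Everything before the comment is the declaration itself. */
    copy_trimmed(decl, decl_cap, param, (size_t)(start - param));
    /* Everything between the comment delimiters is its description. */
    copy_trimmed(doc, doc_cap, start + 2, (size_t)(stop - (start + 2)));
    return 1;
}
```

Given the parameter text "int page /* page it appears on */", split_param
stores "int page" as the declaration and "page it appears on" as its
description, so the name and type never need to be repeated in the comment.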

Whilst the idea of extracting documentation from comments in source code
is by no means new, the difference here is that the comments actually form
part of the grammar of the language recognised by the documentation
compiler[2].

Comments should not repeat information that is already represented in the
program code; for instance, a comment describing a function argument
should not repeat the name and type of that argument (since that
information has already been included, for the compiler), but should
appear near the argument.

For example, in C, the programmer should write this:

/* include an example in the article */
enum Result example(int page /* page it appears on */);

Rather than this:

/* include an example in the article
 * int page             page it appears on
 * RESULT_YES           The readers agreed
 * RESULT_NO            The readers disagreed
 * RESULT_YOURE_JOKING  The readers disagreed strongly
 * RESULT_BLANK_LOOKS   The readers didn't understand
 */
enum Result example(int page);

Also in this example, the documentation compiler knows the possible
enumerated values that the function can return (as does the "real"
compiler), so it is unnecessary to restate them. The comments need simply
be included in the definition for "enum Result" for the "RETURNS"
information to be generated automatically:

enum Result {
    RESULT_YES,             /* The readers agreed */
    RESULT_NO,              /* The readers disagreed */
    RESULT_YOURE_JOKING,    /* The readers disagreed strongly */
    RESULT_BLANK_LOOKS      /* The readers didn't understand */
};
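
To make the "RETURNS" idea concrete, here is a minimal sketch that walks
the body of an enum definition and pairs each enumerator with the comment
that follows it. It is a string scan, not a parser, and it assumes each
comment comes after its enumerator. The function name format_returns and
the tab-separated output layout are assumptions made for illustration;
they are not c2man's actual code or output format.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Walk the text between the braces of an enum definition and append one
 * "NAME<TAB>description" line to out for each enumerator that is followed
 * by a comment. Returns the number of lines written. */
int format_returns(const char *body, char *out, size_t cap)
{
    const char *p = body;
    size_t used = 0;
    int count = 0;

    assert(body != NULL && out != NULL);
    if (cap > 0) out[0] = '\0';
    while (*p != '\0') {
        /* Skip separators before the enumerator name. */
        while (*p != '\0' && (isspace((unsigned char)*p) || *p == ',')) p++;
        const char *name = p;
        while (isalnum((unsigned char)*p) || *p == '_') p++;
        size_t name_len = (size_t)(p - name);
        if (name_len == 0) {            /* unexpected character: skip it */
            if (*p != '\0') p++;
            continue;
        }
        /* Skip anything such as "= 1" up to the comma or the comment. */
        while (*p != '\0' && *p != ',' && strncmp(p, "/*", 2) != 0) p++;
        while (*p != '\0' && (isspace((unsigned char)*p) || *p == ',')) p++;
        if (strncmp(p, "/*", 2) != 0) continue;   /* undocumented value */

        const char *text = p + 2;
        const char *end = strstr(text, "*/");
        if (end == NULL) break;         /* unterminated comment: stop */
        p = end + 2;
        /* Trim whitespace just inside the comment delimiters. */
        while (text < end && isspace((unsigned char)*text)) text++;
        while (end > text && isspace((unsigned char)end[-1])) end--;

        int n = snprintf(out + used, cap - used, "%.*s\t%.*s\n",
                         (int)name_len, name, (int)(end - text), text);
        if (n < 0 || (size_t)n >= cap - used) {   /* buffer full */
            if (used < cap) out[used] = '\0';     /* drop the partial line */
            break;
        }
        used += (size_t)n;
        count++;
    }
    return count;
}
```

Run over the body of "enum Result", this produces one line per enumerated
value, which is the information a generated "RETURNS" section needs for any
function declared to return that type.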

Critics have suggested that the latter style in the example is easier to
read for someone wishing to call the function in question. Of course, this
is a style question which depends on each person's tastes; but the
criticism is tied to the notion that the source code needs to look
"beautiful" because it is the primary reference for someone wishing to use
that function. This becomes much less significant once documentation is
available which is known to _always_ be up to date. Of course, the latter
style takes longer to write and maintain, and can become out of date
should the name or type of the parameter be changed, yet the comment get

I have implemented one such documentation compiler (for the C language)
called "c2man", which is freely available[3]. The response from users has
been extremely encouraging; I suspect this is partly because of the wide
variety of styles of comment placement that are recognised: it often
correctly recognises comments that weren't written with c2man in mind at
all. While its use is focused solely on functional interface
documentation and it doesn't have anywhere near the power of a full
Literate Programming system, the focus is on reducing the effort required
by the programmer to the absolute minimum, and seeing how much
documentation we can get essentially "for free".

Many people have requested C++ support be added to c2man, and I suspect
that this philosophy would be even more suitable and powerful for
documenting interfaces to C++ classes automatically. This would certainly
find wide appreciation and acceptance, but unfortunately at present I do
not have sufficient time to make the required additions. If someone
involved in C++ work were to add this support and release the result, it
would be a great contribution to the C++ community, not to mention the
documentation time they would save themselves. It could also make an ideal
Computer
Science student compiler project. Please contact me via E-mail if you are

Graham Stoney

1. Advocates of Literate Programming would argue that Literate Programming is
   much more than snazzy documents and that it encourages this extra effort to
   focus early on in the design of the software, which pays off later.

2. To get a better idea, see the file grammar.y in the c2man distribution.

3. c2man has been posted on comp.sources.reviewed. It should be available from
   the comp.sources.reviewed archive, volume 3, or ask archie.
Graham Stoney, Hardware/Software Engineer
Canon Information Systems Research Australia
Ph: + 61 2 805 2909 Fax: + 61 2 805 2929
