Re: what parser generator?

"Mike Dimmick" <mike@dimmick.demon.co.uk>
19 Dec 2000 17:02:19 -0500

          From comp.compilers


From: "Mike Dimmick" <mike@dimmick.demon.co.uk>
Newsgroups: comp.compilers
Date: 19 Dec 2000 17:02:19 -0500
Organization: Compilers Central
References: 00-12-079
Keywords: parse
Posted-Date: 19 Dec 2000 17:02:19 EST



"Paul Drummond" <Drum.Sefex@btinternet.com> wrote in message
> I am writing a C++ DocTool for my 3yr uni project and I have been looking
> at different generators.


Sounds horribly like mine... I'm at Aston University and my Final Year
Project is to produce a documentation tool for C++ programs, producing
UML class diagrams of the static structure of the program.


> COCO/R was the first choice because we are learning it at uni, but
> my lecturer says it would be very difficult to extract comments
> using this. Does anyone disagree with this?


I haven't looked at it. My course on Programming Language Implementation
doesn't use a parser generator. We're hand-implementing a _very_ simple
language using recursive-descent techniques, writing in Ada(95), which
was the language taught in first year.


> He suggested ANTLR, but it is supposed to be difficult and it uses
> Java, which puts me off for some reason! If it uses Java as the
> implementation language then surely the C++ output isn't as good as
> the original Java output that it was designed for.


Look at PCCTS (ANTLR) version 1.x, available at www.polhode.com. This
uses (or can use) C++ throughout. ANTLR was originally written in C
to produce C code, but the -CC option has existed for some time, so
far as I can see. You can use the full power of C++ with it.


No, I have no idea why it was reimplemented in Java, besides the
author's obvious enthusiasm for the language. My personal opinion is
that Java is generally too restrictive, but this isn't the place for a
language war.


ANTLR looks to be very powerful. The use of predicates in the parse
looks very good, and could definitely help with the parsing of more
complicated / ambiguous constructs in C++.
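
To make the idea concrete, here is a rough hand-written analogue of a
semantic predicate. It is Python rather than ANTLR grammar syntax, and
the function name and the type_names set are hypothetical, invented
purely for illustration. The classic C++ case is "A * b;", which is a
declaration if A names a type and an expression otherwise; the
predicate consults the symbol information before committing to an
alternative.

```python
def classify_statement(tokens, type_names):
    """Classify a statement of the form  A * b ;  using a semantic predicate.

    tokens     -- list of token strings for one statement (assumed non-empty)
    type_names -- set of identifiers currently known to name types
                  (a stand-in for a real symbol table)
    """
    # The predicate: the "declaration" alternative is only viable
    # when the leading identifier is a known type name.
    if tokens[0] in type_names:
        return "declaration"   # e.g. pointer declaration  T * x;
    return "expression"        # e.g. multiplication       a * b;
```

A generated parser would evaluate a test like this at the point where
syntax alone cannot decide between the two alternatives.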


> The alternative is to write my own parser. I don't think it would be
> IMPOSSIBLE because I never enter function bodies, so I don't need to
> look for expressions, loops or anything. All I do is find classes
> and function headers, then extract the surrounding comments!


You could write an RDSA (recursive-descent syntax analyser) yourself,
but I would suggest that it's not a very sensible thing to do for C++,
simply because the language is so large. ANTLR generates RDSAs, in
any case (as opposed to table-driven LL(k) parsing).
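
For a feel of what the hand-written route involves, here is a minimal
sketch of a recursive-descent analyser that only finds class
definitions and skips every other brace-delimited body. It is written
in Python for brevity, and all names (tokenize, HeaderParser and its
methods) are made up for the example. It assumes comments have already
been stripped, and it deliberately ignores templates, namespaces,
structs, string literals and the preprocessor, which is exactly where
the real size of C++ starts to bite.

```python
import re


def tokenize(source):
    # Identifiers or single non-blank characters; comments assumed already removed.
    return re.findall(r"[A-Za-z_]\w*|\S", source)


class HeaderParser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0
        self.classes = []          # qualified names of class definitions found

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse_unit(self):
        # unit := item*
        while self.peek() is not None:
            self.parse_item("")
        return self.classes

    def parse_item(self, scope):
        # item := class_decl | '{' ... '}' (skipped) | any other token
        if self.peek() == "class":
            self.parse_class(scope)
        elif self.peek() == "{":
            self.skip_block()      # function body: never entered
        else:
            self.advance()

    def parse_class(self, scope):
        # class_decl := 'class' IDENT base_list? ( '{' item* '}' )?
        self.advance()                         # consume 'class'
        name = self.advance()
        while self.peek() not in ("{", ";", None):
            self.advance()                     # base-class list, ignored here
        if self.peek() != "{":
            return                             # forward declaration or truncated input
        self.classes.append(scope + name)
        self.advance()                         # consume '{'
        while self.peek() not in ("}", None):
            self.parse_item(scope + name + "::")   # recursion handles nested classes
        self.advance()                         # consume '}'

    def skip_block(self):
        # Consume a balanced '{' ... '}' group without looking inside it.
        depth = 0
        while self.peek() is not None:
            tok = self.advance()
            if tok == "{":
                depth += 1
            elif tok == "}":
                depth -= 1
                if depth == 0:
                    return
```

Calling HeaderParser(tokenize(text)).parse_unit() returns the list of
class names, with nested classes qualified by their enclosing class.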


While there _is_ a C++ grammar available, I'm avoiding it because a) I
don't understand it, b) it isn't very well documented (which is what
led to a) ), and c) it relies on a hacked version of an old release of
ANTLR. Re-writing the tool for your own grammar strikes me as not
making a lot of sense.


As for comments, it's probably best to handle them in the lexical
analyser, and keep a store of the most recent comment text. If a line
containing a single-line comment is followed by another, with no
intervening symbols, you should probably concatenate the comments
together. Just my thoughts on the subject; I realise my tool is going
to be doing something quite different!
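
The comment handling described above could be sketched roughly as
follows. This is Python rather than a generated lexer, and the TOKEN
pattern and the tokens_with_comments name are hypothetical. It keeps a
store of pending comment text, joins consecutive "//" lines, and hands
the stored text to the first symbol that follows. It makes no attempt
to cope with comment markers inside string literals.

```python
import re

# Hypothetical token pattern: comments, identifiers, or any other single character.
TOKEN = re.compile(r"""
    (?P<line>//[^\n]*)          # single-line comment
  | (?P<block>/\*.*?\*/)        # block comment
  | (?P<ident>[A-Za-z_]\w*)     # identifier or keyword
  | (?P<other>\S)               # any other non-blank character
""", re.VERBOSE | re.DOTALL)


def tokens_with_comments(source):
    """Yield (token, comment) pairs; comment is the text stored just before token."""
    pending = []                # comment text not yet claimed by a symbol
    for match in TOKEN.finditer(source):
        kind = match.lastgroup
        if kind == "line":
            # Consecutive // comments with no symbol in between are concatenated.
            pending.append(match.group()[2:].strip())
        elif kind == "block":
            # A block comment replaces whatever was stored before it.
            pending = [match.group()[2:-2].strip()]
        else:
            yield match.group(), " ".join(pending)
            pending = []        # a symbol intervened, so the store is emptied
```

The parser can then pick up the comment attached to a "class" keyword
or to the first token of a function header, and ignore the empty
strings attached to everything else.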


--
Mike Dimmick

