Re: C++ Grammar

"Mike Dimmick" <mike@dimmick.demon.co.uk>
8 Aug 2001 01:09:12 -0400

          From comp.compilers


From: "Mike Dimmick" <mike@dimmick.demon.co.uk>
Newsgroups: comp.compilers
Date: 8 Aug 2001 01:09:12 -0400
Organization: Compilers Central
References: 01-08-037
Keywords: C++, parse
Posted-Date: 08 Aug 2001 01:09:11 EDT

"Aaron Gray" <aarongray@beeb.net> wrote in message
> Dear All,
> does anyone happen to know or have a more up to date C++ grammar than
> the old comp.compilers one :-
>
> ftp://ftp.iecc.com/pub/file/c++grammar
>
> I think it is dated 1991, and has no template support.
> If anyone has a more uptodate grammar, for any parser generator, other
> than yacc, I would be interested in that as well,


The two most up-to-date grammars that I'm aware of are John Lilley's,
for a modified version of PCCTS (which can be found via
http://www.polhode.com/pccts.html), and Ed Willink's, part of his
Flexible Object Generator (via
http://www.computing.surrey.ac.uk/research/dsrg/fog/).


John's grammar uses LL(k), with a large amount of backtracking. It's
moderately easy to understand, although I found its organisation quite
confusing. He hasn't documented his modifications to PCCTS terribly
well; I found the only way to understand it was to write my own...[1]


Willink's method is to parse a superset of the grammar, which means
that the AST must subsequently be checked to reject any erroneously
accepted text, and modified where the structure has been incorrectly
recognised. This was done because it was necessary to read in
segments of code completely separated from the appropriate type and
variable declarations. He's using Bison with some very ad-hoc methods
to perform backtracking and process templates.
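
A minimal sketch of the "accept a superset, fix the tree later" idea,
for the statement "T * p;". This is not taken from FOG: the names
Node, Kind, parse_star_statement and resolve are hypothetical, and
only the re-labelling step is shown, not the rejection of invalid
text.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical node kinds for illustration; not FOG's actual AST.
enum Kind { Unresolved, Declaration, Multiplication };

struct Node {
    Kind kind;
    std::string lhs;  // "T" in "T * p;"
    std::string rhs;  // "p" in "T * p;"
};

// Pass 1: the permissive grammar accepts "lhs * rhs ;" without
// knowing whether lhs names a type or a variable.
Node parse_star_statement(const std::string& lhs, const std::string& rhs) {
    Node n = { Unresolved, lhs, rhs };
    return n;
}

// Pass 2: once the type names are known, correct the node.
Node resolve(Node n, const std::set<std::string>& type_names) {
    n.kind = type_names.count(n.lhs) ? Declaration : Multiplication;
    return n;
}
```

If "T" is in the set of type names, the node becomes a pointer
declaration; otherwise it is treated as a multiplication.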


My personal suggestion would be to follow the example of Semantic
Designs, who use a Tomita parser (which pursues all possible parses
up to the point where the input becomes unambiguous; for
declaration/expression ambiguities you then pick the right one). I
now think that using LL(k) is a false lead, because a lot of the
interesting (and disambiguating) material appears at the right-hand
end of productions, not at the left. LR(k) looks more promising,
though I think C++ is naturally LR(2) or LR(3). This probably needs
more investigation!
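
A small illustration of a declaration/expression ambiguity that is
only settled at the right-hand end of the statement. The types T and
Cell and the function demo are made up for this sketch; both
statements begin with the same tokens "T(a)".

```cpp
#include <cassert>

struct Cell { int m; };
static Cell g_cell = { 0 };

// Hypothetical type: constructible from int, with operator-> so that
// "T(a)->m" is a valid expression.
struct T {
    T() {}
    explicit T(int) {}
    Cell* operator->() { return &g_cell; }
};

int demo() {
    int a = 1;
    {
        // Declaration: same as "T a;", a new local that shadows the
        // outer int a within this block.
        T(a);
        (void)a;
    }
    assert(a == 1);  // the outer int a is untouched

    // Expression: the trailing "->m = 7" cannot be part of a
    // declaration, so T(a) is a function-style cast of the int a.
    T(a)->m = 7;
    return g_cell.m;
}
```

A parser reading left to right sees "T ( a )" in both cases and has
to look past the closing parenthesis before it can choose.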


--
Mike Dimmick


[1] Not totally successful. I had a number of outstanding warnings I
couldn't get rid of. I now understand why you need to backtrack,
though, and why he's made the modifications he has.

