Re: C++ Grammar - Update

"Ira D. Baxter" <idbaxter@semdesigns.com>
3 May 2001 13:38:19 -0400

          From comp.compilers

Related articles
C++ Grammar - Update mike@dimmick.demon.co.uk (Mike Dimmick) (2001-04-26)
Re: C++ Grammar - Update loewis@informatik.hu-berlin.de (Martin von Loewis) (2001-04-30)
Re: C++ Grammar - Update idbaxter@semdesigns.com (Ira D. Baxter) (2001-05-03)
Re: C++ Grammar - Update mike@dimmick.demon.co.uk (Mike Dimmick) (2001-05-03)
Re: C++ Grammar - Update gahide@lil.univ-littoral.fr (Patrice Gahide) (2001-05-03)
Re: C++ Grammar - Update michael_spencer@btclick.com (Michael Spencer) (2001-05-07)
Re: C++ Grammar - Update michael_spencer@btclick.com (Michael Spencer) (2001-05-13)
Re: C++ Grammar - Update loewis@informatik.hu-berlin.de (Martin von Loewis) (2001-05-13)

From: "Ira D. Baxter" <idbaxter@semdesigns.com>
Newsgroups: comp.compilers,comp.compilers.tools.pccts
Date: 3 May 2001 13:38:19 -0400
Organization: Posted via Supernews, http://www.supernews.com
References: 01-04-141 01-04-155
Keywords: C++, parse
Posted-Date: 03 May 2001 13:38:19 EDT

> > The major reported problem with the C++ syntax is that it requires
> > semantic information to parse correctly. This isn't strictly true,
> > one can follow the technique of Ed Willink
> > (http://www.computing.surrey.ac.uk/research/dsrg/fog/FogThesis.html)


> Please note that the goal of that parser is restricted to parsing
> declarations only (see 4.4, Ambiguity resolution).


Or, you can implement any parsing engine that is willing to enumerate
locally ambiguous parses, such as GLR (aka Tomita) parsers. These
allow you to parse the entire language, and build complete syntax
trees. The trick is to realize that you can use semantic information
collected *later* to eliminate the ambiguities you don't want.
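
As a rough illustration of that idea (a sketch only, not how DMS or
any particular GLR engine is implemented), consider the statement
"a * b;", which is a declaration if "a" names a type and an expression
otherwise. All names below (Reading, Alternative, AmbiguityNode,
parseStarStatement, resolve) are hypothetical, and the "parser" is
hard-wired to this one statement:

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified model of the two readings of "a * b;".
enum class Reading { Declaration, Expression };

struct Alternative {
    Reading reading;       // how this subtree interprets the statement
    std::string leadName;  // the identifier that may or may not name a type
};

// An ambiguity node as a GLR-style parser might leave it in the tree:
// every grammatical reading is kept side by side.
struct AmbiguityNode {
    std::vector<Alternative> alternatives;
};

// Parsing step: no symbol table is consulted, so both readings survive.
AmbiguityNode parseStarStatement(const std::string& lhs) {
    AmbiguityNode node;
    node.alternatives.push_back(Alternative{Reading::Declaration, lhs});
    node.alternatives.push_back(Alternative{Reading::Expression, lhs});
    return node;
}

// Later semantic step: keep only the readings that agree with the set of
// known type names (assumed to have been collected after parsing).
std::vector<Alternative> resolve(const AmbiguityNode& node,
                                 const std::set<std::string>& typeNames) {
    std::vector<Alternative> kept;
    for (const Alternative& alt : node.alternatives) {
        const bool isType = typeNames.count(alt.leadName) > 0;
        const bool wantsType = (alt.reading == Reading::Declaration);
        if (isType == wantsType) kept.push_back(alt);
    }
    return kept;
}
```

Calling resolve() with a type set containing "a" leaves only the
declaration reading; with an empty set, only the expression reading
remains. The point is that the parser itself never needs to ask.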


Our DMS Reengineering Toolkit uses this technology. It parses C++ just
fine (if you leave out dark-corner complications caused by macros and
preprocessor conditionals; if you simply expand those, it parses the
complete language).


> It seems that the parser accepts a *very* large superset of C++,
> e.g. the provided Solaris binary accepts
>
> void foo(){
> +
> }
>
> without complaints. So I still doubt that you can do meaningful C++
> parsing w/o semantic analysis in the lexer.


Most parsing engines accept "syntax" that is not legal for the
language at hand, and simply eliminate the illegal combinations using
information outside the scope of the parser.
[I'll admit this case looks worse than normal.]


> > One must know whether a construct names a type in order to correctly
> > parse in some circumstances.


No, you just get multiple local parses.


> There are actually ambiguities in this area, consider
>
> class X{
> friend A::B::C();
> };
>
> Is this ::C, returning A::B, or is it ::B::C, returning A? This is
> currently an ambiguity in C++, which is not resolved in the '98
> edition of the standard.


We get both parses, if both are legal. What you do about ambiguities
that the language still allows after semantic processing is a matter of
taste. We presently don't do any name/type resolution on C++, but
what we typically do when this happens (it does occur in other legacy
languages that are not so well defined) is to simply discard one of
the parse trees with a warning message.
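
A minimal sketch of such a "discard with a warning" policy might look
like the following. The types and the function name are hypothetical,
and keeping the first alternative is an arbitrary choice made for the
example, not a rule our tools follow:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a parse tree: just a label for the reading.
struct ParseTree {
    std::string description;
};

// The surviving tree plus one warning per discarded alternative.
struct Resolution {
    ParseTree kept;
    std::vector<std::string> warnings;
};

// Keep the first alternative and record a warning for every other one.
// Which tree to keep is an assumption of this sketch.
Resolution discardExtras(const std::vector<ParseTree>& alternatives) {
    assert(!alternatives.empty());  // caller must supply at least one parse
    Resolution result;
    result.kept = alternatives.front();
    for (std::size_t i = 1; i < alternatives.size(); ++i) {
        result.warnings.push_back(
            "warning: ambiguous parse, discarding \"" +
            alternatives[i].description + "\"");
    }
    return result;
}
```

For the friend declaration above, passing the two readings ("::C
returning A::B" and "::B::C returning A") would keep the first and
produce a single warning naming the second.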


> > I conclude that C++ requires some very strong parsing methods if one
> > is to be successful.


I'd agree.


--
Ira D. Baxter, Ph.D. CTO Semantic Designs, Inc.
http://www.semdesigns.com

