From: | Martin von Loewis <loewis@informatik.hu-berlin.de> |
Newsgroups: | comp.compilers,comp.compilers.tools.pccts |
Date: | 30 Apr 2001 22:21:02 -0400 |
Organization: | Humboldt University Berlin, Department of Computer Science |
References: | 01-04-141 |
Keywords: | C++, parse |
Posted-Date: | 30 Apr 2001 22:21:02 EDT |
"Mike Dimmick" <mike@dimmick.demon.co.uk> writes:
> The major reported problem with the C++ syntax is that it requires
> semantic information to parse correctly. This isn't strictly true,
> one can follow the technique of Ed Willink
> (http://www.computing.surrey.ac.uk/research/dsrg/fog/FogThesis.html)
Please note that the goal of that parser is restricted to parsing
declarations only (see 4.4, Ambiguity resolution).
It seems that the parser accepts a *very* large superset of C++,
e.g. the provided Solaris binary accepts
    void foo(){
      +
    }
without complaints. So I still doubt that you can do meaningful C++
parsing without semantic analysis in the lexer.
> One must know whether a construct names a type in order to correctly
> parse in some circumstances.
Indeed, this is the major reason why people claim that you need
semantic information in the lexer.
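To make the point concrete, here is a minimal sketch of that lexer feedback, applied to the classic "x * y;" case. The names SymbolTable, classify and interpret_star_statement are hypothetical and only serve this illustration; they are not taken from any real parser.

```cpp
#include <set>
#include <string>

// Hypothetical token kinds for this sketch only.
enum class TokenKind { TypeName, Identifier };

// Hypothetical symbol table: the names currently declared as types.
struct SymbolTable {
    std::set<std::string> type_names;

    bool is_type(const std::string& name) const {
        return type_names.count(name) != 0;
    }
};

// The feedback step: the same spelling yields a different token kind
// depending on what semantic analysis has recorded so far.
TokenKind classify(const std::string& name, const SymbolTable& symbols) {
    return symbols.is_type(name) ? TokenKind::TypeName
                                 : TokenKind::Identifier;
}

// Decide how a statement of the form "a * b;" has to be read.
std::string interpret_star_statement(const std::string& a,
                                     const SymbolTable& symbols) {
    return classify(a, symbols) == TokenKind::TypeName
               ? "declaration of a pointer"
               : "multiplication expression";
}
```

With an empty table, "x * y;" is read as a multiplication; once "x" has been recorded as a type (as if "typedef int x;" had been seen), the same text is read as a pointer declaration.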
> Qualified names are another circumstance which require unlimited
> semantic lookahead. This is due to template names with attached
> argument lists being permitted in a qualified name.
There are actually ambiguities in this area, consider
    class X{
      friend A::B::C();
    };
Is this ::C, returning A::B, or is it ::B::C, returning A? This is
currently an ambiguity in C++, which is not resolved in the '98
edition of the standard.
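The two readings can be listed mechanically. The following is only a sketch that enumerates where the return type could end and the qualified name could begin; candidate_readings is a hypothetical helper, and a real parser would of course work on tokens rather than strings.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One candidate reading: (return type, globally qualified declared name).
using Reading = std::pair<std::string, std::string>;

// Hypothetical helper: given the components of "A::B::C", list every way
// to split them into a non-empty return type and a name starting with "::".
std::vector<Reading> candidate_readings(const std::vector<std::string>& parts) {
    std::vector<Reading> readings;
    for (std::size_t split = 1; split < parts.size(); ++split) {
        // Components before the split form the return type.
        std::string return_type;
        for (std::size_t i = 0; i < split; ++i) {
            if (i != 0) return_type += "::";
            return_type += parts[i];
        }
        // Components from the split onwards form the declared name.
        std::string name;
        for (std::size_t i = split; i < parts.size(); ++i) {
            name += "::" + parts[i];
        }
        readings.emplace_back(return_type, name);
    }
    return readings;
}
```

For the components A, B and C this yields exactly the two candidates discussed above: return type A with name ::B::C, and return type A::B with name ::C.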
> It is necessary to resolve the exact instantiation of the template
> to determine whether the contents of the template themselves name a
> class (in which case a following "::" should continue the qualified
> name).
You mean, to see whether
    A<k>::B
is a typename or not? In a template, where A<k> depends on a template
parameter, it is never taken to be a typename; to make it a typename,
you have to write
    typename A<k>::B
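A small sketch of that rule, assuming a template A whose member B is a type in the primary template and a value in one specialization (the names zero and twice are made up for this example). How each function body is parsed is fixed by the presence or absence of the keyword, before any instantiation happens.

```cpp
// Primary template: B names a type.
template <int K>
struct A {
    using B = long;
};

// Explicit specialization: here B names a static data member instead.
template <>
struct A<0> {
    static constexpr int B = 42;
};

// With "typename", A<K>::B is parsed as a type.
template <int K>
typename A<K>::B zero() {
    return typename A<K>::B();  // value-initialized object of that type
}

// Without "typename", A<K>::B is parsed as a value, so this is a
// multiplication and never a pointer declaration.
template <int K>
int twice() {
    return A<K>::B * 2;
}
```

Here zero<1>() and twice<0>() compile, while zero<0>() and twice<1>() would be rejected at instantiation time, because the member does not match the way the template was already parsed.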
> I believe I have previously posted on at least one of these two
> newsgroups regarding the rule in the standard which requires this
> behaviour; it can be summarised as "the members of one instantiation
> of a template need bear no relation to any other instantiation of a
> template." This leaves us in the ridiculous situation of requiring
> full template instantiation and expression evaluation in order to
> produce an AST.
That is surely not the case. Whether something is a typename or not
can be determined without instantiation.
> C++ name resolution is complicated by the fact that the global
> namespace has no name; it is referred to by prefixing a name
> (qualified or not) with the scope resolution operator "::". This
> causes more ambiguities resolvable by left-factoring the grammar.
So out of curiosity: What does your parser do with my friend example
above?
> The "declaration specifiers" rule (decl-specifiers) has been modified
> to accommodate only one user-defined type or a sequence of built in
> types. This is slightly complicated by the fact that modifiers may be
> interspersed between the built-in types (e.g. "unsigned const long
> static int") but this removes the problem of whether a name in a
> declaration is the type or the declarator. This decision was taken
> because the C++ standard has now disallowed implicit 'int' - and
> therefore all declarations must be "type-name declarator-list;".
I think this is also an error in the FOG thesis: The only cases where
the decl-specifier-seq can be omitted are constructors, destructors
and conversion functions; so I can't see why "i=0;" is ambiguous.
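For illustration, a short sketch of the declarations that may legitimately lack a decl-specifier-seq. Besides the constructor and destructor, the standard also lists conversion functions such as operator int(). Counter and reset are hypothetical names used only here.

```cpp
struct Counter {
    int value;

    Counter() : value(0) {}                  // constructor: no decl-specifier-seq
    ~Counter() {}                            // destructor: none either
    operator int() const { return value; }   // conversion function: none either

    int next() { return ++value; }           // ordinary member: "int" is required
};

int i;  // every other declaration has to name a type

int reset() {
    i = 0;  // without implicit int, this can only be an expression statement
    return i;
}
```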
> I conclude that C++ requires some very strong parsing methods if one
> is to be successful.
In any case, a very interesting posting. I hope you can post your
grammar, together with this elaboration, somewhere on the 'net.
Regards,
Martin