Re: parsing C and C++, was Compiler Compiler Compiler

Martin von Loewis <loewis@informatik.hu-berlin.de>
31 Mar 2001 02:45:19 -0500

          From comp.compilers


From: Martin von Loewis <loewis@informatik.hu-berlin.de>
Newsgroups: comp.compilers
Date: 31 Mar 2001 02:45:19 -0500
Organization: Humboldt University Berlin, Department of Computer Science
References: 01-03-095 01-03-122 01-03-133
Keywords: parse, C, C++
Posted-Date: 31 Mar 2001 02:45:19 EST

"Kevin Szabo" <kszabo@nortelnetworks.com> writes:


> I've never tried to parse C/C++. Could you give an example or two
> of the problems (or point me to a reference).


Essentially, the lexical analyzer needs access to the symbol table in
order to classify identifiers. Consider


    a*b;


What is that: an expression statement or a declaration? It depends.
If a is a type-name, then it is a declaration (of a variable b whose
type is pointer to a). If a is not a type-name, then it is a
multiplication expression whose result is discarded. This is already
a problem in C.
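
To make the two readings concrete, here is a small sketch in which the
same statement text compiles both ways. The namespace and function names
(typedef_reading, variable_reading, run) are made up for illustration;
only the statement "a*b;" comes from the example above.

```cpp
#include <cassert>  // only needed for the checks in the note below

// Reading 1: `a` names a type, so `a*b;` is a declaration.
namespace typedef_reading {
    typedef int a;

    inline bool run() {
        a*b;            // declares b with type "pointer to a", i.e. int*
        b = nullptr;    // legal only because b is a pointer variable
        return b == nullptr;
    }
}

// Reading 2: `a` names a variable, so `a*b;` is an expression statement.
namespace variable_reading {
    inline int run() {
        int a = 6, b = 7;
        a*b;            // multiplies a by b and discards the result
        return a * b;
    }
}
```

Called from any function, assert(typedef_reading::run()) and
assert(variable_reading::run() == 42) should both hold. A compiler will
probably warn about the unused value in the second reading, which is a
fair hint that the statement is legal but pointless there.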


C++ adds a new version of this problem; consider


    A<B,C>D;


If A is a template, then A<B,C> is a type name, and the entire thing
is a declaration (of a variable D). Otherwise, it is an expression:
the comma expression (A<B),(C>D), built from two comparisons.
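
The same kind of sketch works for the template case. Again the
namespace and function names (template_reading, comparison_reading,
run) and the member "value" are invented for the example; only the
statement "A<B,C>D;" is taken from the text above.

```cpp
#include <cassert>  // only needed for the checks in the note below

// Reading 1: A is a class template, so `A<B,C>D;` declares D.
namespace template_reading {
    template <typename T, typename U>
    struct A { int value = 0; };
    struct B {};
    struct C {};

    inline int run() {
        A<B,C>D;        // declaration: D has type A<B,C>
        D.value = 42;   // legal only because D is an object of that type
        return D.value;
    }
}

// Reading 2: A, B, C and D are variables, so `A<B,C>D;` is an expression.
namespace comparison_reading {
    inline bool run() {
        int A = 1, B = 2, C = 4, D = 3;
        A<B,C>D;        // comma expression: (A < B), (C > D), result discarded
        return ((A < B), (C > D));  // value of a comma expression is its right operand
    }
}
```

Here assert(template_reading::run() == 42) and
assert(comparison_reading::run()) should both hold. The token sequence
is identical in both functions; what differs is only what name lookup
finds for A.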


There are other problems which are outright ambiguities in the syntax.


> The problem I have seen with some parsing strategies is having the
> lexer bind a symbol before it gets to the parser, that is, trying
> to look it up in the symbol tables and resolve whether a token is a
> variable/type/unbound before it hits the parser.


That is what the C++ standard mandates.
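
For what it is worth, the usual way to implement this is to let the
lexer consult the symbol table when it sees an identifier. The
following is only a toy sketch of that idea, not how any particular
compiler does it: Kind, Token, TypeNames, lex and looks_like_declaration
are all hypothetical names, the "symbol table" is just a set of type
names, and scoping is ignored entirely.

```cpp
#include <cassert>  // only needed for the checks in the note below
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

enum class Kind { TypeName, Identifier, Punct };
struct Token { Kind kind; std::string text; };

// Stand-in for a real symbol table: the names currently known to be types.
using TypeNames = std::set<std::string>;

// Split src into tokens, classifying each identifier by looking it up in `types`.
inline std::vector<Token> lex(const std::string& src, const TypeNames& types) {
    std::vector<Token> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalpha(c) || c == '_') {
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_'))
                ++i;
            std::string word = src.substr(start, i - start);
            Kind k = types.count(word) ? Kind::TypeName : Kind::Identifier;
            out.push_back({k, word});
        } else {
            out.push_back({Kind::Punct, std::string(1, src[i])});  // one-character token
            ++i;
        }
    }
    return out;
}

// Recognizes only the shape "type-name * identifier ;".
inline bool looks_like_declaration(const std::vector<Token>& t) {
    return t.size() == 4 &&
           t[0].kind == Kind::TypeName && t[1].text == "*" &&
           t[2].kind == Kind::Identifier && t[3].text == ";";
}
```

With this, looks_like_declaration(lex("a*b;", TypeNames{"a"})) is true,
while the same text lexed with an empty TypeNames is not recognized as a
declaration. The parser never has to guess, because the distinction
arrives already encoded in the token kind.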


Regards,
Martin

