From: | "KNAPEN, GREGORY" <gregory.knapen@bell.ca> |
Newsgroups: | comp.compilers |
Date: | 27 Jul 1998 11:46:18 -0400 |
Organization: | Bell Canada / Bell Sygma |
Keywords: | C++, parse |
Hi,
I am building a C++ parser that recognizes C++ using the syntax only.
It uses no semantic information, i.e. it needs no symbol table. Of
course, this parser cannot be used as a compiler, because the language
contains ambiguities that syntax alone cannot resolve. The parser is
intended to gather metrics from source code.
While doing this project, I found that most of C++ can be parsed using
the syntax alone, except for three cases:
1. Ambiguity between a function call and a variable declaration
ex: T(a); or T(*a); etc.
This is a variable declaration if T is a type, or a function call
if T is a function.
2. Ambiguity between a function declaration and a variable declaration
ex: int X(A);
If A is a type, X is a function declaration.
If A is a variable, X is a variable initialized with A.
3. Ambiguous parameter
ex: int F(T(C));
If C is a type, the declaration becomes int F(T(*fp)(C c));
If C is a new identifier, it becomes int F(T C);
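To make the three cases concrete, here is a small sketch in which each ambiguous token sequence appears twice, once under each reading. Only the tokens T(a);, int X(A); and int F(T(C)); come from the list above; the namespace names, the run() helpers, the make() function and the stored values are made up for illustration.

```cpp
// Case 1: "T(a);"
namespace case1_decl {
    struct T { int v; T() : v(7) {} };
    int run() {
        T(a);            // T is a type: declares a variable a of type T
        return a.v;
    }
}
namespace case1_call {
    int a = 5;
    int T(int x) { return x + 1; }
    int run() {
        T(a);            // T is a function: calls T(a), result discarded
        return T(a);
    }
}

// Case 2: "int X(A);"
namespace case2_func {
    struct A {};
    int X(A);                  // A is a type: declares a function X taking an A
    int X(A) { return 1; }     // definition, so the declaration can be used
    int run() { return X(A()); }
}
namespace case2_var {
    int A = 42;
    int X(A);                  // A is a variable: defines an int X initialized with A
    int run() { return X; }
}

// Case 3: "int F(T(C));"
namespace case3_fnptr {
    struct T { int v; };
    struct C {};
    int F(T(C));               // C is a type: the parameter is a function taking C
                               // and returning T, adjusted to the pointer T(*)(C)
    T make(C) { T t; t.v = 3; return t; }
    int F(T (*fp)(C)) { return fp(C()).v; }   // defines the F declared above
    int run() { return F(make); }
}
namespace case3_plain {
    struct T { int v; };
    int F(T(C));               // C is a new identifier: a parameter named C of type T
    int F(T C) { return C.v; } // defines the F declared above
    int run() { T t; t.v = 9; return F(t); }
}
```

In each pair the ambiguous line is token-for-token identical; only the declarations visible before it differ, which is exactly the information a symbol table would supply.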
I was wondering if there are other such cases where a sentence needs
semantic information to be made unambiguous. Any case that can be
recognized by the syntax alone does not qualify. I assume that I have
infinite lookahead (backtracking).
For example, a C-style type cast is usually recognized by checking
whether the identifier between the parentheses is a type. It is possible
to find a type cast by the syntax alone:
var = (Type1)(Type2)...(TypeN)(expression);
A parenthesized expression is a type cast if and only if it is followed
by another type cast or by an expression. This requires a lot of
backtracking (inefficient), but it illustrates the point that the
sentence can be recognized without using semantic information.
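A rough sketch of that rule, under heavy simplifying assumptions: the input is already one statement, tokens are only identifiers, numbers and single-character punctuation, and "followed by an expression" is approximated by "the next token can start an operand". The helper names (tokenize, skip_group, starts_operand, count_casts) are hypothetical, and the sketch only applies the rule as stated; it makes no claim that the rule is sufficient for all of C++.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Split a line into identifier/number tokens and single-character punctuation.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        if (std::isalnum(c) || c == '_') {
            std::size_t j = i;
            while (j < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[j])) || src[j] == '_'))
                ++j;
            out.push_back(src.substr(i, j - i));
            i = j;
        } else {
            out.push_back(std::string(1, src[i]));
            ++i;
        }
    }
    return out;
}

// Index just past the ')' matching the '(' at `open`, or npos if unbalanced.
std::size_t skip_group(const std::vector<std::string>& t, std::size_t open) {
    int depth = 0;
    for (std::size_t i = open; i < t.size(); ++i) {
        if (t[i] == "(") ++depth;
        else if (t[i] == ")" && --depth == 0) return i + 1;
    }
    return std::string::npos;
}

// Approximation: an operand starts with '(', an identifier, or a number.
bool starts_operand(const std::string& tok) {
    unsigned char c = static_cast<unsigned char>(tok[0]);
    return tok == "(" || std::isalnum(c) || c == '_';
}

// Count the leading parenthesized groups, starting at `pos`, that the rule
// reads as casts: a group is a cast only if something that can start an
// operand (another group or an expression) follows it.
int count_casts(const std::vector<std::string>& t, std::size_t pos) {
    int casts = 0;
    while (pos < t.size() && t[pos] == "(") {
        std::size_t next = skip_group(t, pos);
        if (next == std::string::npos || next >= t.size() || !starts_operand(t[next]))
            break;                 // last group: the operand itself, not a cast
        ++casts;
        pos = next;
    }
    return casts;
}
```

For instance, count_casts(tokenize("(Type1)(Type2)(a + b);"), 0) reads the first two groups as casts and the last one as the operand. A real parser would also have to check that each group parses as a type and that the tail parses as a complete expression, which is where the backtracking comes in.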
So, are there other families of sentences besides the ones listed that
require semantic information to be made unambiguous?
Greg Knapen