From: George Neuner <gneuner2/@comcast.net>
Newsgroups: comp.compilers
Date: 21 Dec 2006 10:24:31 -0500
Organization: Compilers Central
References: 06-09-029 06-09-042 06-09-048 06-09-060 06-09-078 06-09-093 06-12-064 06-12-066 06-12-076
Keywords: parse, C, C++
Posted-Date: 21 Dec 2006 10:24:31 EST
On 19 Dec 2006 11:24:42 -0500, Chris F Clark
<cfc@shell01.TheWorld.com> wrote:
>1) James Roskind did build a C grammar which attempts to eliminate the
> need for feedback. (We were consultants to Honeywell building a C
> compiler at the time.) He analyzed all the cases where the use of
> an identifier and a typename could be confusing (many of them had
> to do with function prototypes). As I recall, he had some success.
> However, the truly ambiguous examples being discussed in this
> thread, make me doubt exactly what I am remembering. Perhaps, it
> was just that he had success identifying places, where a new type
> or variable *could* be introduced.
>
> In any case, he then tried to apply the same analysis to C++, and
> successfully proved that the same technique could not be used to
> distinguish the ambiguities in C++. All that could be done in C++
> is to push the ambiguities far-enough away that the amount of
> lookahead required made them impractical to distinguish.
Do you recall some of the problem cases?
I would think that it is always possible to delay categorizing an
identifier until after parsing. The difficulty lies in designing an
initial IR that can represent the ambiguous constructs alongside the
unambiguous, canonical ones.
For example, in the case of "a = (b)-c", the parser could construct an
AST like the following:

    OP:=
        IDENT:a
        OP:-
            EXPR
                IDENT:b
            IDENT:c

and after qualification of 'b' as a type expression or a variable the
AST can be rewritten to reflect the cast or variable access.
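
The rewrite step for "a = (b)-c" might be sketched roughly as follows. This is only an illustration in Python, not code from any real compiler: the node classes (Ident, Paren, Op, Cast, Neg) and the resolve function are hypothetical names, and the set of known type names is assumed to be available once parsing has finished. Paren plays the role of the EXPR node in the tree above.

```python
from dataclasses import dataclass

# Hypothetical node types for the provisional (ambiguous) AST.
@dataclass
class Ident:
    name: str

@dataclass
class Paren:          # corresponds to the EXPR node: a parenthesized sub-tree
    inner: object

@dataclass
class Op:
    op: str
    left: object
    right: object

# Hypothetical node types that only exist after resolution.
@dataclass
class Cast:
    type_name: str
    operand: object

@dataclass
class Neg:
    operand: object

def resolve(node, type_names):
    """Rewrite provisional nodes once the set of type names is known."""
    if isinstance(node, Op):
        left, right = node.left, node.right
        # "(" IDENT ")" "-" operand, where IDENT names a type,
        # is really a cast applied to a unary minus.
        if (node.op == "-" and isinstance(left, Paren)
                and isinstance(left.inner, Ident)
                and left.inner.name in type_names):
            return Cast(left.inner.name, Neg(resolve(right, type_names)))
        return Op(node.op, resolve(left, type_names), resolve(right, type_names))
    if isinstance(node, Paren):
        # Otherwise the parentheses are just grouping and can be dropped.
        return resolve(node.inner, type_names)
    return node

# Provisional tree for "a = (b)-c", as the parser would build it.
tree = Op("=", Ident("a"), Op("-", Paren(Ident("b")), Ident("c")))

as_cast = resolve(tree, {"b"})   # 'b' turned out to be a type name
as_sub = resolve(tree, set())    # 'b' turned out to be a variable
```

With 'b' in the set of type names the right-hand side becomes a cast of "-c"; with an empty set it stays an ordinary subtraction of two variables.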
Similarly "x * y" can be parsed as simply

    OP:*
        IDENT:x
        IDENT:y

and figured out afterward.
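
A comparable sketch for "x * y", again in Python and again with hypothetical names (Ident, Op, PointerDecl, resolve_statement). It assumes the construct appears at statement level, where in C it is either a declaration of 'y' as a pointer to type 'x' or a multiplication whose result is discarded, and that the set of type names is known by the time the resolution pass runs.

```python
from dataclasses import dataclass

# Hypothetical node types for the provisional AST.
@dataclass
class Ident:
    name: str

@dataclass
class Op:
    op: str
    left: object
    right: object

# Hypothetical node type produced only by the resolution pass.
@dataclass
class PointerDecl:
    type_name: str
    var_name: str

def resolve_statement(node, type_names):
    """Decide whether a statement-level 'x * y' is a declaration or a product."""
    if (isinstance(node, Op) and node.op == "*"
            and isinstance(node.left, Ident)
            and isinstance(node.right, Ident)
            and node.left.name in type_names):
        # 'x' names a type, so this declares 'y' as a pointer to 'x'.
        return PointerDecl(node.left.name, node.right.name)
    # Otherwise the provisional tree already describes the multiplication.
    return node

provisional = Op("*", Ident("x"), Ident("y"))

as_decl = resolve_statement(provisional, {"x"})   # 'x' is a type name
as_mult = resolve_statement(provisional, set())   # 'x' is a variable
```

In the multiplication case nothing needs rewriting, which is part of the appeal of parsing to the simpler form first.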
Of course it is much more work to deliberately construct an ambiguous
IR, and then analyze and rewrite it as the identifiers are qualified
and the ambiguities are resolved. It also delays issuing syntax
errors until the resolution pass.
Note that I am in no way championing this style of compiler design,
and I have never written a C or C++ compiler ... most of my work has
been in scripting DSLs, Lisp and Pascal derivatives ... but, to date,
I haven't encountered any language syntax ambiguities that I believed
could not be resolved after parsing was complete.
George