C++ and draft ANSI C grammars: reviewers wanted firstname.lastname@example.org (1989-12-26)
Date: Tue, 26 Dec 89 13:45:53 EST
From: email@example.com (Jim Roskind x3266)
I have a reasonably complete C++ grammar, and I was planning to make
it fairly public (copyrighted, but available at no charge). I was
hoping to find a few reviewers to look at the grammars ahead of time
and try to catch any errors I might have made. My privately gathered
reviewers have been slow to respond, and so I'm looking for "motivated
reviewers". I would define such a reviewer as someone who is actively
trying to work with parsing C++, and is VERY interested in looking at
clean methods of resolving some of the more bothersome ambiguities.
WHAT IS SPECIAL ABOUT THE GRAMMARS
1) the grammars are CLEAN.
Both are YACC grammars, but neither uses any %prec, %left, %right, etc.
directives. The result is a clear exposition of the ambiguities (and
complexities) of each language. In contrast, the grammar provided in
the draft ANSI C standard is not YACC-able (it has many conflicts),
and the semantics of the language do not match the syntax provided (the
ANSI syntax's binding together of an init-decl-list BEFORE combining
with declaration-specifiers is the simplest example). (Note that,
IMHO, the ANSI C committee did a great job, but producing a machine
readable grammar was not part of the job.) Similarly, the publicly
available C grammars (that are nearly YACC-able) that I have seen
typically avoid the real problems by getting the grammar wrong (a simple
test here is to check that typedefnames can be redeclared in an inner
scope). Similar problems exist with the C++ grammar supplied in
Stroustrup's C++ text, and in the C++ 2.0 ref manual.
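
To make the "simple test" above concrete, here is a minimal sketch in C of
the kind of input a correct grammar has to accept. The names T,
inner_scope_size and member_named_T are invented for illustration and are
not taken from the grammars being announced.

```c
#include <assert.h>
#include <stddef.h>

typedef int T;                  /* T is a typedef-name at file scope */

/* Redeclares T as an ordinary variable in an inner block. */
size_t inner_scope_size(void)
{
    assert(sizeof(T) == sizeof(int));   /* here T still names the type */
    {
        char T = 'x';           /* legal C: T is now an object, not a type */
        (void)T;
        return sizeof T;        /* sizeof applied to the variable T */
    }
}

/* Reuses the name T for a struct member, which has its own name space. */
int member_named_T(void)
{
    struct S { int T; } s;
    s.T = 7;
    return s.T;
}
```

A grammar that hard-wires "typedef-name can never appear as a declarator"
would reject both functions, even though a conforming C compiler accepts them.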
The attempted dpANSI C grammar that I have written has only 1 s-r
conflict (I chose to leave in the if-if-else conflict). This grammar
served as a base to distinguish complexities of C++ from those of C.
The C grammar only requires a lexer that uses symbol table context to
distinguish a typedef-name from an identifier (but of course this
grammar ALLOWS redefinition of typedefnames). This grammar also
demonstrates many cute techniques for satisfying an LALR(1) parser
generator when the weak of heart would often shout for an LR(1),
LALR(2), or a lex-hack.
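
The symbol table feedback described above can be sketched roughly as
follows. This is only an illustration under assumptions: the token names,
the fixed-size table, and the functions enter_scope, leave_scope, declare
and classify are all hypothetical and are not part of the grammars
themselves.

```c
#include <assert.h>
#include <string.h>

enum token_kind { TOK_IDENTIFIER, TOK_TYPEDEF_NAME };

#define MAX_SYMBOLS 64

struct symbol {
    const char *name;
    int is_typedef;   /* 1 if declared via typedef, 0 for ordinary names */
    int depth;        /* scope nesting level where the name was declared */
};

static struct symbol table[MAX_SYMBOLS];
static int n_symbols = 0;
static int depth = 0;

void enter_scope(void) { depth++; }

/* Leaving a scope discards every name declared inside it. */
void leave_scope(void)
{
    while (n_symbols > 0 && table[n_symbols - 1].depth == depth)
        n_symbols--;
    depth--;
}

/* Would be called from a parser action once a declarator is complete. */
void declare(const char *name, int is_typedef)
{
    assert(n_symbols < MAX_SYMBOLS);
    table[n_symbols].name = name;
    table[n_symbols].is_typedef = is_typedef;
    table[n_symbols].depth = depth;
    n_symbols++;
}

/* Would be called from the lexer: the innermost declaration wins. */
enum token_kind classify(const char *name)
{
    int i;
    for (i = n_symbols - 1; i >= 0; i--)
        if (strcmp(table[i].name, name) == 0)
            return table[i].is_typedef ? TOK_TYPEDEF_NAME : TOK_IDENTIFIER;
    return TOK_IDENTIFIER;      /* undeclared names are plain identifiers */
}
```

Because an inner, non-typedef declaration hides the outer typedef, the
lexer hands the parser an identifier token inside that scope and a
typedef-name token again once the scope is left.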
Since the C++ grammar is based on the C grammar, my C++ grammar still
supports old-style function definitions (a feature that I believe gcc and
cfront 2.0 have at least temporarily given up on). The bad news is
that there are some very subtle ambiguities remaining in the
definition of C++, and YACC is VERY good at bringing these items to
the surface. In addition, some of the disambiguating techniques that
I have developed require "inline expansion" of rules in order to defer
a reduction until the choice IS unambiguous. This last fact provides a
confusing multiplier, which leaves a grand total of 29 s-r conflicts,
and 7 r-r conflicts in my current C++ grammar. Combining these
ambiguities into equivalence classes (to remove the confusing
multiplier) gives a total of 9 classes of ambiguities (one of which is
the if-if-else conflict). I believe I resolve 6 of these classes
CLEARLY correctly, and 3 of them "reasonably". I define "reasonably"
to mean that by the time I disambiguate, most human parsers are
thoroughly lost, and the language definition is beginning to stretch.
(I believe there is a recursive descent parser hiding between lex and
YACC in cfront, and so it is hard to compete :-).
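
For readers unfamiliar with the if-if-else conflict mentioned above, a
small C sketch shows what is at stake. The function name which_branch is
invented for illustration. YACC's default resolution of a shift-reduce
conflict is to shift, which matches the language rule that an else binds
to the nearest unmatched if.

```c
#include <assert.h>

/* Returns 0 if no branch ran, 1 for the inner "then", 2 for the "else". */
int which_branch(int a, int b)
{
    int result = 0;
    assert(result == 0);
    if (a)
        if (b)
            result = 1;
        else            /* binds to "if (b)", the nearest unmatched if */
            result = 2;
    return result;
}
```

If the else were attached to the outer if instead, which_branch(0, 1)
would return 2 rather than 0.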
As another point of comparison, several postings to comp.lang.c++ that
reported parsing difficulties in Zortech C++ v1.07 are disambiguated
properly by my grammar.
Aside from the sparse commentary in the grammar (which uses long
descriptive names for nonterminals), I will also include some prose
discussing the remaining ambiguities.
If you want to have an early look at these grammars and make some
comments, please drop me an EMAIL line. Please include some hint of
what you do so I can have a feel for who could do the best job of
reviewing the docs. Thanks.
firstname.lastname@example.org, ...!eddie.mit.edu!ileaf!jar, (before 1/6/90) 617-577-9813 x5570
516 Latania Palm Drive
Indialantic FL 32903