Re: parsing C and C++, Generating a simple hand-coded like

George Neuner <gneuner2/@comcast.net>
21 Dec 2006 10:24:31 -0500

          From comp.compilers

From: George Neuner <gneuner2/@comcast.net>
Newsgroups: comp.compilers
Date: 21 Dec 2006 10:24:31 -0500
Organization: Compilers Central
References: 06-09-029 06-09-042 06-09-048 06-09-060 06-09-078 06-09-093 06-12-064 06-12-066 06-12-076
Keywords: parse, C, C++
Posted-Date: 21 Dec 2006 10:24:31 EST

On 19 Dec 2006 11:24:42 -0500, Chris F Clark
<cfc@shell01.TheWorld.com> wrote:


>1) James Roskind did build a C grammar which attempts to eliminate the
> need for feedback. (We were consultants to Honeywell building a C
> compiler at the time.) He analyzed all the cases where the use of
> an identifier and a typename could be confusing (many of them had
> to do with function prototypes). As I recall, he had some success.
> However, the truly ambiguous examples being discussed in this
> thread, make me doubt exactly what I am remembering. Perhaps, it
> was just that he had success identifying places, where a new type
> or variable *could* be introduced.
>
> In any case, he then tried to apply the same analysis to C++, and
> successfully proved that the same technique could not be used to
> distinguish the ambiguities in C++. All that could be done in C++
> is to push the ambiguities far-enough away that the amount of
> lookahead required made them impractical to distinguish.


Do you recall some of the problem cases?




I would think that it is always possible to delay categorizing an
identifier until after parsing. The difficulty lies in designing an
initial IR that can represent the ambiguous constructs alongside the
unambiguous, canonical IR.


For example, in the case of "a = (b)-c", the parser could construct an
AST like the following:


OP:=
    IDENT:a
    OP:-
        EXPR
            IDENT:b
        IDENT:c


and, once 'b' has been qualified as either a type name or a variable,
the AST can be rewritten to reflect either the cast (applied to "-c")
or the variable access (a subtraction).
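
As a rough illustration of that rewrite step, here is a minimal Python
sketch. The node classes (Ident, Paren, BinOp, Cast, Neg) and the
resolve() function are hypothetical names invented for this example,
and the set of known type names is assumed to come from an earlier
declaration pass. It is not taken from any real compiler.

```python
from dataclasses import dataclass

# Hypothetical node types for the ambiguous and the resolved AST.
@dataclass
class Ident:
    name: str

@dataclass
class Paren:            # the EXPR node: a parenthesized sub-expression
    inner: object

@dataclass
class BinOp:
    op: str
    left: object
    right: object

@dataclass
class Cast:
    type_name: str
    operand: object

@dataclass
class Neg:              # unary minus
    operand: object

def resolve(node, type_names):
    """Rewrite the ambiguous tree once identifiers have been qualified."""
    if isinstance(node, BinOp):
        left = node.left
        # "(b) - c" where b names a type: a cast applied to "-c".
        if (node.op == "-" and isinstance(left, Paren)
                and isinstance(left.inner, Ident)
                and left.inner.name in type_names):
            return Cast(left.inner.name, Neg(resolve(node.right, type_names)))
        return BinOp(node.op,
                     resolve(left, type_names),
                     resolve(node.right, type_names))
    if isinstance(node, Paren):
        # Parentheses carry no meaning once the ambiguity is settled.
        return resolve(node.inner, type_names)
    return node

# The ambiguous tree for "a = (b)-c" as drawn above.
ambiguous = BinOp("=", Ident("a"), BinOp("-", Paren(Ident("b")), Ident("c")))

as_cast = resolve(ambiguous, {"b"})   # b was declared as a type
as_sub = resolve(ambiguous, set())    # b is an ordinary variable
```

The same input tree yields a cast of a negation in the first case and
a plain subtraction in the second. Only the symbol information
differs.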


Similarly, "x * y" can be parsed simply as


OP:*
    IDENT:x
    IDENT:y


and resolved afterward as either a multiplication or the declaration
of a pointer 'y'.


Of course it is much more work to deliberately construct an ambiguous
IR, and then analyze and rewrite it as the identifiers are qualified
and the ambiguities are resolved. It also delays issuing syntax
errors until the resolution pass.
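
To make the point about delayed errors concrete, here is a small
Python sketch of a resolution step for the "x * y" case. All names in
it (Star, Mul, PtrDecl, DeferredSyntaxError, resolve_star) are
hypothetical, and it assumes the set of type names has already been
collected. The parser accepts "x * 3" as a generic product, so only
the resolution pass can notice that it is malformed when 'x' turns out
to be a type.

```python
from dataclasses import dataclass

# Hypothetical node types; none of these come from a real compiler.
@dataclass
class Ident:
    name: str

@dataclass
class Num:
    value: int

@dataclass
class Star:             # ambiguous "left * right" produced by the parser
    left: object
    right: object

@dataclass
class Mul:              # resolved: multiplication expression
    left: object
    right: object

@dataclass
class PtrDecl:          # resolved: declares var_name as pointer to base_type
    base_type: str
    var_name: str

class DeferredSyntaxError(Exception):
    """A syntax error that can only be detected after qualification."""

def resolve_star(node, type_names):
    if isinstance(node.left, Ident) and node.left.name in type_names:
        # Left operand is a type, so this must be a pointer declaration.
        if not isinstance(node.right, Ident):
            raise DeferredSyntaxError(
                f"expected a declarator after type '{node.left.name}'")
        return PtrDecl(node.left.name, node.right.name)
    # Otherwise it is an ordinary product.
    return Mul(node.left, node.right)

decl = resolve_star(Star(Ident("x"), Ident("y")), {"x"})
product = resolve_star(Star(Ident("x"), Ident("y")), set())
```

Calling resolve_star(Star(Ident("x"), Num(3)), {"x"}) raises
DeferredSyntaxError, even though the parse itself succeeded.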


Note that I am in no way championing this style of compiler design,
and I have never written a C or C++ compiler ... most of my work has
been in scripting DSLs, Lisp and Pascal derivatives ... but, to date,
I haven't encountered any language syntax ambiguities that I believed
could not be resolved after parsing was complete.


George


