|Building a parse tree that reflects C Semantics email@example.com (VBDis) (2002-10-13)|
|Re: Building a parse tree that reflects C Semantics firstname.lastname@example.org (Josef Grosch) (2002-10-18)|
|Re: Building a parse tree that reflects C Semantics email@example.com (Ira Baxter) (2002-10-20)|
|Re: Building a parse tree that reflects C Semantics firstname.lastname@example.org (Mark) (2002-10-20)|
|Re: Building a parse tree that reflects C Semantics email@example.com (Rodney M. Bates) (2002-10-20)|
|Re: Building a parse tree that reflects C Semantics firstname.lastname@example.org (David Thompson) (2002-10-25)|
|Re: Building a parse tree that reflects C Semantics email@example.com (Rodney M. Bates) (2002-11-06)|
|Re: Building a parse tree that reflects C Semantics firstname.lastname@example.org (Ira Baxter) (2002-11-07)|
|From:||"Rodney M. Bates" <email@example.com>|
|Date:||20 Oct 2002 22:53:39 -0400|
|Organization:||EarthLink Inc. -- http://www.EarthLink.net|
|Keywords:||C, parse, semantics|
|Posted-Date:||20 Oct 2002 22:53:39 EDT|
Josef Grosch wrote:
> > How to create an parse tree in general?
> I would build an abstract syntax tree that reflects the semantics of a
> language and not its concrete syntax.
I once did this for C (using an early version of Josef Grosch's
scanner, parser, and AST generators), and did more or less what he
recommends, i.e. unravelling the convoluted surface syntax into an
abstract syntax that better reflects the meaning. Here is my way of
thinking about what this unravelling amounts to.
In most languages, a programmer-defined type is built by what is
sometimes called a "type constructor", that builds a new type from
some set of already-known constituents, usually types and constants.
C's declarators are upside down in that their syntax looks like they
are instead deconstructing a previously known constituent type from
the new type they are defining. Whether this is really upside down is
a value judgement (though there seems to be good evidence that it is
More fundamentally, the deconstruction idea simply can't work for any
type that has more than one constituent, because a tree node can have
only one parent, and the constituent of a declarator-constructed type
is an ancestor, not a descendent, in the syntax. Only pointer declarators
follow this deconstruction syntax consistently. A function declarator is
upside down with respect to its result type but right side up WRT its
formal parameters. An array declarator is upside down WRT its element
type, but right side up WRT its bounds. Structs, unions, and enumerations
all have a variable number of constituents, none of them particularly
distinct, so they are built with type specifiers, which are entirely
right side up.
So, I agree with Josef that it is best to define an abstract syntax
where all type constructors have all their constituents as children,
not ancestors. As I recall, building the AST was somewhat confusing,
but this approach gets all the confusion out of the way early. Then
later stages of whatever you are writing will be protected from it.
I could provide details of my implementation, but it uses LALR parsing,
so I think you would probably need to do it somewhat differently.
Rodney M. Bates
Return to the
Search the comp.compilers archives again.