ASTs: can they be "standardized"? firstname.lastname@example.org (Tony) (2008-10-31)
Re: can they be "standardized"? email@example.com (Ira Baxter) (2008-11-01)
From: "Ira Baxter" <firstname.lastname@example.org>
Date: Sat, 1 Nov 2008 12:57:27 -0500
Posted-Date: 01 Nov 2008 20:41:15 EDT
"Tony" <email@example.com> wrote in message news:firstname.lastname@example.org...
> What is the best "main" form for representation of a software program
> (or lib etc)? Human-readable text? The Abstract Syntax Tree perhaps?
> Something in-between?
As with all questions regarding representation, there's only a best
if you define the set of questions you want to answer easily. Since
there are lots of different questions to ask, no one representation
is best for all of them.
Having said that, the compiler community has pretty much learned that
mapping text to trees to control flow to data flow enables a lot of
standard questions to get answered in ways in which you can afford the
engineering and still get results. But I think the lesson is, be
prepared to change representations if you want to answer a question.
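A minimal sketch of that lesson, assuming Python and its standard ast
module purely for illustration (the sample program and the helper names
count_lines_mentioning and count_calls_to are made up for this example).
The same program is held as text and as a tree, and each form makes a
different question easy:

```python
import ast

# A tiny sample program, invented for this illustration.
SOURCE = """
def area(w, h):
    return w * h

print(area(2, 3))
print(area(4, 5))
"""

def count_lines_mentioning(text, word):
    # Text-level question: how many lines contain a given word?
    return sum(1 for line in text.splitlines() if word in line)

def count_calls_to(tree, name):
    # Tree-level question: how many call sites invoke the function `name`?
    return sum(
        1
        for node in ast.walk(tree)
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == name
    )

tree = ast.parse(SOURCE)                 # text -> tree
text_hits = count_lines_mentioning(SOURCE, "area")
call_sites = count_calls_to(tree, "area")
```

The text search also counts the definition line, while the tree query
counts only the actual calls. Questions about execution order or value
propagation would need yet another representation (control flow, data
flow), which this sketch does not attempt.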
> The reason I asked, in the subject of this post, if ASTs could be
> standardized, is because I am kinda thinking that the AST may be the
> "center stage" of code representation (?). ...
> [People have been looking for a common intermediate representation for
> over 50 years, and it's a famous swamp from which nobody returns. Google
> UNCOL for some of the older failures. The basic problem is that we have
> no good way to describe the semantics of real computer languages, and the
> accumulation of little differences around the edges kill you. -John]
As the moderator said...
The latest incarnation of this is the OMG's attempt to standardize
ASTs in the so-called "Abstract Syntax Tree Metamodel" (ASTM). [You
may note that everything OMG does is a "metamodel"].
A flavor of what this is like (now a bit dated) can be found at:
It has a "general abstract syntax tree model" which looks to me like
they threw in everything including the kitchen sink, since it is supposed
to cover a broad range of languages; this is the usual place the UNCOL
stuff ends up.
And of course there aren't any formal semantics [that I know of] defined
for GASTM tree nodes. [That's not their fault; we don't know how
to do this well, frankly, but it does suggest the flaw in attempting
to standardize stuff before it is really well understood]. ASTM
also suggests that there may be Specific Abstract Syntax Tree (SASTM) models
for specific languages, but I can't tell whether they intend to standardize
these for specific languages (or even specific dialects; how many versions
of COBOL are there?), or simply accept whatever some vendor says
his parser produces.
I'm waiting to see how this turns out, but I'm not very optimistic.
Having said this, we build a system that uses ASTs as a key
representation. Each AST is derived almost directly from the
context-free grammar defining the language the AST represents. Our
solution looks sort of like the SASTM part of the ASTM proposal,
except that we aren't insisting this is an exchange standard.
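To illustrate what "derived almost directly from the context-free
grammar" can mean, here is a minimal sketch in Python. It is not the
system described above: the two-rule grammar, the Node class, and the
Parser are all hypothetical, chosen only to show tree nodes that are
labeled with the grammar rule that produced them.

```python
import re
from dataclasses import dataclass, field

# Hypothetical toy grammar (an assumption for this sketch):
#   expr : term ('+' term)*
#   term : NUMBER | '(' expr ')'

@dataclass
class Node:
    rule: str                                  # grammar rule (or token type) name
    children: list = field(default_factory=list)
    text: str = ""                             # token text, set on leaves only

def tokenize(src):
    # Split into integer literals and the three punctuation tokens.
    return re.findall(r"\d+|[+()]", src)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected):
        tok = self.peek()
        if tok != expected:
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1

    def expr(self):
        # expr : term ('+' term)*
        node = Node("expr", [self.term()])
        while self.peek() == "+":
            self.take("+")
            node.children.append(self.term())
        return node

    def term(self):
        # term : NUMBER | '(' expr ')'
        tok = self.peek()
        if tok == "(":
            self.take("(")
            inner = self.expr()
            self.take(")")
            return Node("term", [inner])
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"expected NUMBER or '(', got {tok!r}")
        self.pos += 1
        return Node("term", [Node("NUMBER", text=tok)])

def show(node):
    # Render the tree as an S-expression labeled with rule names.
    if node.text:
        return f"{node.rule}:{node.text}"
    return "(" + " ".join([node.rule] + [show(c) for c in node.children]) + ")"

tree = Parser(tokenize("1+(2+3)")).expr()
rendered = show(tree)
```

Because every node type mirrors a grammar rule, the tree shape is tied to
this one grammar. A different grammar for the same language would yield
different trees, which is part of why such trees are awkward as an
exchange standard.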
Ira Baxter, CTO