|Generic AST in XML for any language firstname.lastname@example.org (Kalahan) (2010-03-11)|
|Re: Generic AST in XML for any language email@example.com (Ira Baxter) (2010-03-13)|
|Re: Generic AST in XML for any language firstname.lastname@example.org (BGB / cr88192) (2010-03-13)|
|Re: Generic AST in XML for any language email@example.com (2010-03-14)|
|Re: Generic AST in XML for any language firstname.lastname@example.org (Manuel Collado) (2010-03-14)|
|Re: Generic AST in XML for any language email@example.com (Olaf Krzikalla) (2010-03-15)|
|Re: Generic AST in XML for any language firstname.lastname@example.org (Nikolaos Kavvadias) (2010-03-18)|
|Re: Generic AST in XML for any language DrDiettrich1@aol.com (Hans-Peter Diettrich) (2010-03-20)|
|From:||"BGB / cr88192" <email@example.com>|
|Date:||Sat, 13 Mar 2010 12:37:52 -0700|
|Keywords:||analysis, XML, UNCOL|
|Posted-Date:||13 Mar 2010 15:05:02 EST|
"Kalahan" <firstname.lastname@example.org> wrote in message
> Does anyone knows if there is such thing as an standard to represent
> the basic elements of a language (functions, variables, classes)? And
> generated in XML?
> I know that the title might be misleading about the meaning of an AST
> but I have a project in mind and I don't want to replycate work. Also
> that might be aiming too high if we start adding functional languages,
> aspect oriented programming, etc
> Also I would appreciate if you could point me to projects where I can
> get a good XML representation of a source file.
I use XML internally for several of my frontends.
but, alas, there is nothing really standard about it, nor does it
extend to "any language". usually, one will have to live with a
situation where many pieces of the syntax and semantics vary from
one language to the next, and so different frontends would necessarily
produce ASTs with differing contents and differing meanings.
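As a rough illustration of what a frontend-specific XML AST might look like, here is a minimal Python sketch that builds a tree for the statement `x = a + 1`. The element and attribute names (`assign`, `binary`, `ref`, `int`, `op`, `name`, `value`) are invented for this example and are not taken from any of the frontends described in this post; another frontend could reasonably pick different names and shapes, which is exactly the problem.

```python
import xml.etree.ElementTree as ET

# All element/attribute names below are hypothetical, chosen for illustration.

def ref(name):
    """A reference to a named variable."""
    return ET.Element("ref", {"name": name})

def int_lit(value):
    """An integer literal."""
    return ET.Element("int", {"value": str(value)})

def binary(op, left, right):
    """A binary operation with two operand subtrees."""
    node = ET.Element("binary", {"op": op})
    node.append(left)
    node.append(right)
    return node

def assign(target, value):
    """Assignment of an expression subtree to a named variable."""
    node = ET.Element("assign", {"name": target})
    node.append(value)
    return node

# AST for: x = a + 1
tree = assign("x", binary("+", ref("a"), int_lit(1)))
xml_text = ET.tostring(tree, encoding="unicode")
print(xml_text)
```

Even for something this small, choices such as whether the assignment target is an attribute or a child node are arbitrary, so two independently written frontends are unlikely to agree without a shared convention.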
admittedly, within a narrow family of languages there is a lot of overlap, so
more can be similar than different:
ActionScript) could all use an essentially very similar AST structure.
however, once it comes down to the specific languages, the
potentially drastic semantic differences come up.
for example, if the C is still to be valid C, the Java still valid Java, and
the JS valid JS, then some pain begins, as these languages each manage
things like types, memory references, ... very differently, and eventually
these issues will need to be addressed.
in many cases, common ground can be found, and one can address some issues
via simple internal translation, but in many other cases it is less trivial,
and one ends up having to use a "common superset" strategy for many parts of
for example, one may end up dealing with maybe around 8+ different basic
array types, several different variations as to how to manage OO features
(C++ vs Java vs C# vs JS).
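One way to picture the "common superset" idea for arrays is a single node type that carries a tag saying which flavor of array it is, so later stages can tell them apart. The sketch below is only an assumption about how this could be laid out: the `array_type` element, the `kind`/`elem`/`length` attributes, and the four kinds listed are all hypothetical, and the post does not enumerate the actual array types involved.

```python
import xml.etree.ElementTree as ET

# Hypothetical array kinds, for illustration only.
ARRAY_KINDS = {
    "c_fixed",      # C-style T[N]: size known at compile time
    "c_pointer",    # C-style T* used as an array: length unknown
    "java_object",  # Java/C#-style heap array that stores its own length
    "js_dynamic",   # JS-style growable array of dynamically typed values
}

def array_type(kind, elem, length=None):
    """Build a superset array-type node; 'kind' records the source flavor."""
    if kind not in ARRAY_KINDS:
        raise ValueError(f"unknown array kind: {kind}")
    attrs = {"kind": kind, "elem": elem}
    if length is not None:
        attrs["length"] = str(length)
    return ET.Element("array_type", attrs)

def carries_length(node):
    """True if values of this array type know their own length at run time."""
    return node.get("kind") in {"java_object", "js_dynamic"}

c_array = array_type("c_fixed", "int", 16)
java_array = array_type("java_object", "int")
print(carries_length(c_array), carries_length(java_array))
```

The cost of this approach is that every later stage has to handle each kind explicitly, which is where much of the extra complexity comes from.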
there may be cases where there is no single good way to do something,
leading to open-ended problems (this is an extra issue with signature
strings, since it may lead to issues like inconsistent name-mangling
behavior, extra code complexity, ...). one may also find cases of mutual
incompatibility, where neither language can directly map their data to the
in other cases, things may need to be left as context dependent or ambiguous
(for example, signature strings may have some context-dependent types and
something trivial in one place may also be a terrible pain in another, ...
often, the best option available is to try to be generic (keep one thing
from depending on the specifics of another, and allow things to be passed
along cleanly and easily when possible).
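A small sketch of what "passing things along cleanly" could mean in practice: a tree pass that only touches the nodes it understands and copies everything else through unchanged, including nodes from a language it has never heard of. The node names here (`binary`, `int`, `java_sync`, `block`) are hypothetical, and the pass assumes that `int` nodes always carry a numeric `value` attribute.

```python
import xml.etree.ElementTree as ET

def fold_constants(node):
    """Return a copy of the tree with int + int additions folded.

    Any node this pass does not recognize is copied through as-is,
    with its tag, attributes, and (recursively processed) children.
    """
    out = ET.Element(node.tag, dict(node.attrib))
    out.text = node.text
    for child in node:
        out.append(fold_constants(child))

    # The only case this pass knows about: <binary op="+"> over two <int> nodes.
    if out.tag == "binary" and out.get("op") == "+" and len(out) == 2:
        lhs, rhs = out[0], out[1]
        if lhs.tag == "int" and rhs.tag == "int":
            total = int(lhs.get("value")) + int(rhs.get("value"))
            return ET.Element("int", {"value": str(total)})
    return out

# <java_sync> is unknown to the pass, yet it survives with its attributes intact.
src = ET.fromstring(
    '<block><java_sync lock="m">'
    '<binary op="+"><int value="2"/><int value="3"/></binary>'
    '</java_sync></block>'
)
result = fold_constants(src)
print(ET.tostring(result, encoding="unicode"))
```

Because the pass never rejects an unfamiliar tag, a frontend can add language-specific nodes without every other stage needing to be updated at the same time.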
but, anyways, here is a current compiler dump:
it is currently (mostly) under a mix of Public Domain and MIT licensing (and
is now GPL-free), but a few parts come from Apache (mostly the Java
classlib, but I have partly started on attempting my own implementation of
the classlib). (note: Java support is not particularly tested or
a lot is still needed WRT documenting the thing, ...