Re: xml as intermediate representation

"Alexey Demakov" <>
27 Sep 2005 09:41:42 -0400

          From comp.compilers

Related articles
xml as intermediate representation (tanuj) (2005-09-17)
Re: xml as intermediate representation (Jürgen Kahrs) (2005-09-17)
Re: xml as intermediate representation (Jeff Kenton) (2005-09-22)
Re: xml as intermediate representation (TOUATI Sid) (2005-09-22)
Re: xml as intermediate representation (Chris Dollin) (2005-09-23)
Re: xml as intermediate representation (Vidar Hokstad) (2005-09-23)
Re: xml as intermediate representation (Alexey Demakov) (2005-09-27)
Re: xml as intermediate representation (Vidar Hokstad) (2005-09-27)
Re: xml as intermediate representation (TOUATI Sid) (2005-09-30)

From: "Alexey Demakov" <>
Newsgroups: comp.compilers
Date: 27 Sep 2005 09:41:42 -0400
Organization: Compilers Central
References: 05-09-078
Keywords: analysis, practice
Posted-Date: 27 Sep 2005 09:41:42 EDT


From: "tanuj" <>
> I am writing a compiler as a part of my course project and wanted to
> use xml as an intermediate representation language.All the coding is
> being done in C. Can u suggest me how to go about it.It would be great
> if u could mention some online resources available.
> Regards
> Tanuj
> [XML is a general purpose framework that can represent anything. A
> more relevant question is if someone's defined a DTD for this purpose
> and written tools to do interesting things with it. -John]

An abstract syntax tree (AST) has both internal and external
representations. The external representation is used for long-term
storage of the AST, for example to pass it between different tools
(such as a parser and a semantic analyzer). I agree that XML with a
proper DTD is a very good choice for this task because it lets you use
existing tools.
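As a rough illustration of the external form, here is a minimal sketch
in C that dumps a tree as XML text. The Node layout and the names
ast_to_xml and append are my own assumptions for this example, not part
of any existing tool, and the sketch does no escaping of special
characters and uses no DTD.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical generic node, used only for this sketch. */
typedef struct Node {
    const char   *tag;          /* element name, e.g. "if" */
    const char   *text;         /* leaf text, or NULL */
    struct Node **children;     /* child nodes, or NULL */
    size_t        child_count;
} Node;

/* Append string s to buf at offset len and return the new length.
   Sketch only: the caller must supply a large enough buffer. */
size_t append(char *buf, size_t cap, size_t len, const char *s)
{
    size_t n = strlen(s);
    assert(len + n < cap);
    memcpy(buf + len, s, n + 1);   /* copies the terminating NUL too */
    return len + n;
}

/* Write n and its subtree as XML into buf, starting at offset len.
   Assumes tags and text contain no characters that need escaping. */
size_t ast_to_xml(const Node *n, char *buf, size_t cap, size_t len)
{
    len = append(buf, cap, len, "<");
    len = append(buf, cap, len, n->tag);
    len = append(buf, cap, len, ">");
    if (n->text)
        len = append(buf, cap, len, n->text);
    for (size_t i = 0; i < n->child_count; i++)
        len = ast_to_xml(n->children[i], buf, cap, len);
    len = append(buf, cap, len, "</");
    len = append(buf, cap, len, n->tag);
    len = append(buf, cap, len, ">");
    return len;
}
```

Calling ast_to_xml( root, buf, sizeof buf, 0 ) fills buf with the whole
document; a real tool would also escape text and emit a DOCTYPE.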

But the internal representation of an AST (when the parser and the
other parts of the compiler/translator run in one process) needs to be
data structures of the implementation language. XML provides a
universal DOM tree, which is excessive when you only need to represent
one particular class of trees. For example, it seems more convenient to
define

struct IfStmt
{
        Expr* condition;
        Stmt* thenStmt;
        Stmt* elseStmt;
};

and use it as

IfStmt* if_node;
process( if_node->condition );

than

struct Node
{
        Node** children;
};

Node* node;
process( node->children[0] );

or even

process( condition( node ) );
Node* condition( Node* if_node ) { return if_node->children[0]; }

This is the usual heterogeneous vs. homogeneous tree problem. I suggest
using homogeneous trees only when the tree structure is defined
dynamically (as in XML tools).

A heterogeneous tree can also provide a homogeneous interface.
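One way this might look in C is sketched below: every concrete node
type starts with a common header, and two generic accessors expose the
typed fields as numbered children. The names NodeKind, Var, child_count
and child_at are assumptions made up for this example, and the fields
are typed as Node* rather than Expr*/Stmt* to keep it short.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NODE_VAR, NODE_IF } NodeKind;

/* Common header: must be the first member of every concrete node. */
typedef struct Node { NodeKind kind; } Node;

typedef struct Var {
    Node        base;
    const char *name;
} Var;

typedef struct IfStmt {
    Node  base;
    Node *condition;
    Node *thenStmt;
    Node *elseStmt;     /* may be NULL */
} IfStmt;

/* Homogeneous interface: number of child slots for any node. */
size_t child_count(const Node *n)
{
    switch (n->kind) {
    case NODE_IF:  return 3;
    case NODE_VAR: return 0;
    }
    return 0;
}

/* Homogeneous interface: i-th child of any node. */
Node *child_at(const Node *n, size_t i)
{
    assert(i < child_count(n));
    /* Only NODE_IF has children in this sketch, so the cast is safe
       once the assertion above has passed. */
    const IfStmt *s = (const IfStmt *)n;
    switch (i) {
    case 0:  return s->condition;
    case 1:  return s->thenStmt;
    default: return s->elseStmt;
    }
}
```

Code that knows it has an IfStmt still writes if_node->condition, while
a generic walker can loop over child_at( node, i ) without knowing the
node type.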

So I think the better solution for the internal representation of an
AST (the intermediate representation) is a heterogeneous tree structure
generated from a DTD or an equivalent tree structure description.
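To give a feel for generating typed structures from a single
description, here is a small sketch that uses the C preprocessor
("X-macros") as a stand-in for a real description language. This is not
how any particular tree description tool works; the macro names
IF_STMT_FIELDS, DECLARE_FIELD and FIELD_NAME are hypothetical.

```c
#include <stddef.h>

typedef struct Expr Expr;   /* left opaque in this sketch */
typedef struct Stmt Stmt;

/* The "description": one list of (type, name) pairs for IfStmt. */
#define IF_STMT_FIELDS(F)  \
    F(Expr *, condition)   \
    F(Stmt *, thenStmt)    \
    F(Stmt *, elseStmt)

/* Generated artifact 1: the heterogeneous struct itself. */
#define DECLARE_FIELD(type, name) type name;
typedef struct IfStmt { IF_STMT_FIELDS(DECLARE_FIELD) } IfStmt;

/* Generated artifact 2: field names, e.g. for XML element names. */
#define FIELD_NAME(type, name) #name,
const char *const if_stmt_field_names[] = { IF_STMT_FIELDS(FIELD_NAME) };

enum {
    IF_STMT_FIELD_COUNT =
        sizeof if_stmt_field_names / sizeof if_stmt_field_names[0]
};
```

Adding a field to IF_STMT_FIELDS updates both the struct and the name
table, which is the property a dedicated generator gives you on a
larger scale.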

This tree structure description can also serve as documentation of the
tree structure in use. This is important when there is more than one

Several tools offer notations for describing tree structure.
Some that can generate C code:
Zephyr's ASDL
Cocktail's AST

My own tool, TreeDL, can currently generate only Java and C# code. But
it has one important advantage: you can write your own plugin that
generates whatever you want. This is useful when you need a custom
tree walker, visitor, tree read/write routines, or anything else that
can be generated automatically from the tree structure description.


Alexey Demakov
TreeDL: Tree Description Language:
RedVerst Group:
