Related articles |
---|
Converting C to C++ jhall@whale.WPI.EDU (1993-11-14) |
Converting C to C++ heinrich@gazoo.dvs.com (1993-11-16) |
Re: Converting C to C++ apardon@rc1.vub.ac.be (1993-11-17) |
Re: Converting C to C++ pkl@mundil.cs.mu.OZ.AU (1993-11-22) |
Newsgroups: comp.compilers
From: apardon@rc1.vub.ac.be (Antoon Pardon)
Keywords: C, parse, translator
Organization: Brussels Free Universities (VUB/ULB), Belgium
References: 93-11-089
Date: Wed, 17 Nov 1993 12:35:47 GMT
John Clinton Hall (jhall@whale.WPI.EDU) wrote:
: For my senior project, I am developing a program to convert C to C++. I
: have a working C parser, and I am adding on to it code to build an
: intermediate representation of the input source. However, I am wondering
: if a "traditional" AST is the way for me to go.
[cut]
: I'm wondering what to use for my intermediate representation: a
: "traditional" AST or a flat list of tokens? An expression such as "x1 =
: (a + bb) * 12;" is translated to the following AST:
:        =
:       / \
:     x1   *
:         / \
:        +   12
:       / \
:      a   b
: Although this form is great if you want to generate assembly code, I don't
: think it is the best form for me to use to convert C to C++, basically
: because information is lost. Not only have we omitted the semicolon at
: the end of the expression (which would not be that hard to regenerate),
: but the parentheses are gone. In order to regenerate the code, I would
: have to have some algorithm that compares the precedence of operators and
: decide whether or not to insert parentheses. It would also make iterating
: through the function's code more difficult.
Some years ago I was involved in a project that used an AST to do something
similar. It was for a student environment in which we wanted a very short
edit-execute cycle. The idea was to give the user a "language editor"
that parsed the program into an AST as it was typed in. The AST was
interpreted when the user wanted a quick test execution, and was unparsed
to let the student view his code. This meant the AST had to reflect the
textual code as closely as possible. The statement above would then be
translated into something like this:
      assign
      /    \
    x1      *
           / \
        ( )   12
         |
         +
        / \
       a   b
Each kind of node had three strings associated with it. Those for the
assign node would be:
"", "=", ";"
Now, to unparse (part of) an AST, you printed the "first" string, unparsed
the left subtree, printed the "middle" string, unparsed the right subtree,
and printed the "last" string. There were of course some extra difficulties,
such as where to place line breaks, but basically you could very easily get
the text representation back without too much hassle.
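
The unparsing scheme described above can be sketched roughly as follows.
This is only an illustration in Python, not the code from that project: the
Node class, the helper names (leaf, paren, binary, assign), the choice to
store an identifier's text in its "first" string, and the spaces around the
operators are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # The three strings attached to each node (names are hypothetical).
    first: str = ""
    middle: str = ""
    last: str = ""
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def unparse(node: Optional[Node]) -> str:
    """Emit first, left subtree, middle, right subtree, last."""
    if node is None:
        return ""
    return (node.first + unparse(node.left) + node.middle
            + unparse(node.right) + node.last)


# Hypothetical constructors for the node kinds used in the example.
def leaf(text: str) -> Node:
    # Assumption: a leaf keeps its source text in "first".
    return Node(first=text)


def paren(inner: Node) -> Node:
    # The explicit "( )" node, so parentheses survive the round trip.
    return Node(first="(", last=")", left=inner)


def binary(op: str, lhs: Node, rhs: Node) -> Node:
    return Node(middle=f" {op} ", left=lhs, right=rhs)


def assign(target: Node, value: Node) -> Node:
    # Corresponds to the strings "", "=", ";" (spacing added here).
    return Node(middle=" = ", last=";", left=target, right=value)


tree = assign(
    leaf("x1"),
    binary("*", paren(binary("+", leaf("a"), leaf("bb"))), leaf("12")),
)
text = unparse(tree)
print(text)  # x1 = (a + bb) * 12;
```

Because the parentheses are a node of their own, no precedence comparison
is needed to regenerate them; the cost is that an interpreter walking the
tree has to step through the extra "( )" node.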
--
Antoon Pardon <apardon@vub.ac.be>
Brussels Free University Computing Centre 02/650.37.16