Re: Source-to-source transformation: best approach?

Chris F Clark <>
Sun, 12 Aug 2007 13:23:57 -0400

          From comp.compilers


From: Chris F Clark <>
Newsgroups: comp.compilers
Date: Sun, 12 Aug 2007 13:23:57 -0400
Organization: The World Public Access UNIX, Brookline, MA
References: 07-08-013 07-08-033
Keywords: tools, design, history, translator, comment
Posted-Date: 12 Aug 2007 13:54:20 EDT

It was asked: (cited as >>)

>> I gather that TXL is being used by many for
>> source to source transformations, and though I'm now somewhat
>> comfortable with the language, as primarily a non-compilers person,
>> I don't know why TXL would be especially suited for source to source
>> transformations.

I think part of what you are asking here depends on what you mean by
source-to-source transformations. There are lots of tools that can be
successful for one-off quick-and-dirty projects. However, they
probably don't hold up to industrial use, which is the area I believe
Ira Baxter was addressing in his reply (cited as >).

My first experience with such a tool was a cross-compiler from Jovial
to PL/I written in PL/I. It worked because none of us users cared
about the corners of the language. We were more than willing to write
our Jovial to the dialect that the tool supported.

For industrial use, I've personally known of only two successful
source-to-source transformation projects (done in Yacc++). Relativity
Technologies has a COBOL reengineering tool (rescueware) written in
Yacc++, which was done in cooperation with St Petersburg State
University. Intel has an in-house Verilog to C++ converter (vmod)
written in Yacc++. Both were quite significant undertakings, and much
of what Ira points out here held true in them. From what I've seen,
to get an industrial-level source-to-source compiler one essentially
needs to write a compiler for the input source language. That is hard
enough for one language; if you are dealing with multiple languages or
multiple dialects, the problem is significantly worse.

>> - Can't we write 'agile' grammars in yacc/bison/antlr (by ignoring
>> tokens/constructs that are outside the scope of the task at hand)?
> Nope. You're thinking "island grammars", e.g, a grammar which ignores
> the part of the language you claim you don't care about. To process
> real languages, you'll eventually have to do name/type resolution, and
> you'll pretty much find this impossible if you skip over part of the
> code.

This is the heart of the problem. If you are doing a one-off tool or
the detailed semantics aren't important, you can probably get away
with an island grammar. This actually covers most uses. In the
Jovial to PL/I case we weren't interested in the parts of the
languages where the semantics of Jovial and PL/I were misaligned. In
addition, if that tool accepted some invalid Jovial that happened to
be valid PL/I, that was fine too, so even certain syntax problems
could be ignored.
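
To make the island-grammar idea concrete, here is a minimal sketch in
Python. Everything in it is an assumption made for illustration: the
toy `proc NAME(params)` syntax, and the hypothetical helpers `tokens`
and `find_procs`. It is not how any of the tools mentioned here work.
It only shows the general shape: recognize the constructs you care
about (the islands) and skip everything else (the water) unparsed.

```python
import re

# One token per match: an identifier, or any other single non-space character.
TOKEN = re.compile(r"\s*(?:(?P<ident>[A-Za-z_]\w*)|(?P<other>\S))")

def tokens(text):
    """Yield ("ident", name) or ("other", char) pairs; whitespace is dropped."""
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:          # only trailing whitespace is left
            break
        pos = m.end()
        if m.group("ident"):
            yield ("ident", m.group("ident"))
        else:
            yield ("other", m.group("other"))

def find_procs(text):
    """Return [(name, [params])] for each 'proc NAME(params)' island."""
    toks = list(tokens(text))
    found = []
    i = 0
    while i < len(toks):
        # Island start: keyword 'proc', then an identifier, then '('.
        if (toks[i] == ("ident", "proc") and i + 2 < len(toks)
                and toks[i + 1][0] == "ident"
                and toks[i + 2] == ("other", "(")):
            name = toks[i + 1][1]
            i += 3
            params = []
            # Collect identifiers up to the closing ')'.
            while i < len(toks) and toks[i] != ("other", ")"):
                if toks[i][0] == "ident":
                    params.append(toks[i][1])
                i += 1
            found.append((name, params))
        i += 1  # water (or the closing ')'): skip without parsing
    return found

source = "x := 1; proc add(a, b) begin weird $$ stuff end; proc noop() begin end"
print(find_procs(source))
```

Note what the sketch cannot do: the procedure bodies are never parsed,
so it has no idea what `a` or `b` refer to inside them. That is fine
for listing procedures and hopeless for name/type resolution, which is
exactly the trade-off discussed above.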

I think a more significant and well-known use of this technique was
Sniff++ with its "fuzzy parsing". As I understood it, instead of
trying to get all the different dialects of C++ into one grammar,
Sniff++ simply didn't worry about the cases that the distinct
dialects treated differently. As long as one doesn't actually need to
understand exactly what those cases meant, one can get away with that.

I think that, to a certain extent, this helps the GLR grammar users.
One isn't so much concerned with an unambiguous parse of the input,
since an ambiguous one is fine provided one isn't asking the questions
that require that particular ambiguity to be resolved. (I would never
want language standards to use GLR grammars; they are too imprecise.
But for practical use in cases like this, they are almost a necessary
evil. I will leave reading the papers on rescueware to interested
people and allow them to draw their own conclusions as to whether
using Yacc++ was a good idea for them or not.)
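
A small illustration of living with an ambiguous parse, again as a
hedged Python sketch. This is not a GLR parser. It hand-enumerates the
two readings of the classic C-style statement `a * b;` that a GLR
parser would keep side by side, and then shows one question that can
be answered without resolving the ambiguity and one that cannot. The
names `Decl`, `Mul`, `parse_star_statement`, `identifiers` and
`is_declaration` are all made up for this example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Decl:
    """'T * x;' read as: declare x with type pointer-to-T."""
    type_name: str
    var_name: str

@dataclass(frozen=True)
class Mul:
    """'T * x;' read as: multiply T by x and discard the result."""
    left: str
    right: str

def parse_star_statement(stmt):
    """Return every reading of 'A * B;'. No symbol table is consulted,
    so both readings are kept."""
    a, star, b = stmt.rstrip(";").split()
    if star != "*":
        raise ValueError("expected a statement of the form 'A * B;'")
    return [Decl(a, b), Mul(a, b)]

def identifiers(readings):
    """A question whose answer is the same under every reading."""
    names = set()
    for reading in readings:
        names |= set(vars(reading).values())
    return names

def is_declaration(readings):
    """A question that forces the ambiguity to be resolved."""
    answers = {isinstance(reading, Decl) for reading in readings}
    if len(answers) > 1:
        raise LookupError("ambiguous: needs name/type resolution")
    return answers.pop()

readings = parse_star_statement("a * b;")
print(sorted(identifiers(readings)))   # same answer for both readings
```

Calling `is_declaration(readings)` on the same input raises
`LookupError`: that is the kind of question for which one needs the
full front end after all.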

Hope this helps,

Chris Clark Internet :
Compiler Resources, Inc. Web Site :
23 Bailey Rd voice : (508) 435-5016
Berlin, MA 01503 USA fax : (978) 838-0263 (24 hours)
[I used IBM's Fortran to PL/I translator in about 1970. It worked
pretty well, but the code bloat to work around minor semantic
incompatibilities, e.g., the way format statements matched up values,
was quite amazing. It desperately needed a sloppy mode like yours
had. -John]
