I'm rewriting Indent. (Or: automated code synthesis/decompilation)

federation2005@netzero.com
Tue, 29 Sep 2015 17:56:00 -0700 (PDT)

          From comp.compilers


From: federation2005@netzero.com
Newsgroups: comp.compilers
Date: Tue, 29 Sep 2015 17:56:00 -0700 (PDT)
Organization: Compilers Central
Injection-Date: Wed, 30 Sep 2015 00:56:00 +0000
Keywords: code , tools
Posted-Date: 29 Sep 2015 21:51:15 EDT

Alongside the parser generator project I mentioned a short while back, this is
another utility I've taken through the first stage of a rewrite. Again, as in
the case of Yacc/Bison, everything up to this stage is equivalence-preserving,
so that the provided regression tests still pass. (That includes adding the
latest Linux/Ubuntu fixes for problems that are still present in the latest
GNU version.)


Here, the normalization involved in the rewrite brought the source down to
about 4000 lines, in a form that will greatly facilitate future development
and expansion. Actually, it was my intent to embody the procedures themselves
in a repurposed "indent" and make *it* do all the work (both for this and for
the other projects on my list -- about 100 in all, ranging from 10000 to
750000 lines each).


A few words about this utility, particularly its relation to both compilation
and the *inverse problem*, decompilation: indent is a code synthesis tool at
its root. As it stands at present, it is the only tool I'm aware of that
makes use of the THIRD major field of language theory. While compilers are
almost exclusively concerned with semantics (the back end) and syntax (the
front end), almost never do you see anything that concerns itself with
PRAGMATICS! Indeed, the only mention you ever see of this field is in the
guise of "whose style is the best" debates. Well, indent is a tool for what we
may call Applied Pragmatics. And this is really the root of another field,
which we may call Code Synthesis.


Although Indent keeps close to the original source in the reworking it does
(thus identifying it as a literal translator rather than one that paraphrases
or takes liberties), if it is retargeted to different output languages,
re-sourced to different input languages and expanded to do analysis, it could
just as well serve as a HLL - HLL translator, or even a Binary or ASM - HLL
translator if a few more elements (that lie at the core of compilation theory
and control flow analysis) are included.
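
To make "literal translator" concrete, here is a minimal sketch (in Python,
not taken from Indent's source) of a re-layout pass that changes only
whitespace. The tokenizer is a hypothetical toy: it ignores string literals,
comments and the preprocessor, and knows only a handful of multi-character
operators, all of which a real tool must handle properly. The names `tokens`
and `relayout` are made up for the illustration.

```python
import re

# Toy lexer (assumption): identifiers, integers, a few multi-character
# operators, or any other single non-space character.
TOKEN = re.compile(
    r"[A-Za-z_]\w*|\d+|->|\+\+|--|&&|\|\||<<|>>|[-+*/%&|^<>=!]=|\S"
)

def tokens(source):
    """Return the whitespace-free token sequence of a C-like fragment."""
    return TOKEN.findall(source)

def relayout(source, indent="    "):
    """Re-lay out brace-delimited code, one statement per line."""
    lines, current = [], []
    depth = 0    # brace nesting, drives the indentation
    parens = 0   # parenthesis nesting, so 'for (;;)' is not split

    def flush():
        if current:
            lines.append(indent * depth + " ".join(current))
            current.clear()

    for tok in tokens(source):
        if tok == "(":
            parens += 1
        elif tok == ")":
            parens -= 1
        if tok == "{":
            current.append(tok)
            flush()
            depth += 1
        elif tok == "}":
            flush()
            depth -= 1
            lines.append(indent * depth + tok)
        elif tok == ";" and parens == 0:
            if current:
                current[-1] += tok   # glue ';' onto the last token
            else:
                current.append(tok)  # empty statement
            flush()
        else:
            current.append(tok)
    flush()
    return "\n".join(lines)

src = "int f(int n){int s=0;while(n>0){s+=n;n--;}return s;}"
out = relayout(src)
print(out)
# The translation is "literal": only the layout changed.
print(tokens(out) == tokens(src))
```

The final check is the point of the sketch: the token sequence before and
after is identical, which is the sense in which such a pass is
equivalence-preserving. Anything that paraphrases (reordering, renaming,
restructuring control flow) would fail that check and needs real analysis to
justify.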


Thus, I'm aiming to repurpose this not merely as an "automated coding as I do"
tool (since the standard "indent" can only automate about 80% of what I do)
but as a bona fide code synthesis tool. Some of the most important places
where you would like to see such an application or routine are Yacc/Bison
itself (to take the place of the "skeleton" routines used in the parser
generator) and code like FFTW (where the Fourier-transform related codelets
are synthesized for a given host environment).
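
As a rough illustration of what "synthesizing a codelet" means, the sketch
below emits straight-line C for an n-point DFT, specialized for a fixed n so
that multiplications by 0 and +/-1 disappear from the output. This is a naive
O(n^2) unrolling written for this example only; it is not how FFTW's
generator works, and the names `snap`, `term`, `synthesize_dft` and the
emitted `dftN` signature are all assumptions.

```python
import cmath

def snap(x, eps=1e-12):
    """Round values within eps of an integer, so 0 and +/-1 are exact."""
    r = round(x)
    return float(r) if abs(x - r) < eps else x

def term(coef, name):
    """C expression for coef * name, or None when the coefficient is 0."""
    if coef == 0.0:
        return None
    if coef == 1.0:
        return name
    if coef == -1.0:
        return "-" + name
    return f"{coef!r} * {name}"

def synthesize_dft(n):
    """Emit C source for an n-point DFT on split real/imaginary arrays."""
    body = []
    for k in range(n):
        re_terms, im_terms = [], []
        for j in range(n):
            w = cmath.exp(-2j * cmath.pi * j * k / n)  # twiddle factor
            c, s = snap(w.real), snap(w.imag)
            # (c + i s)(xr + i xi) = (c xr - s xi) + i (s xr + c xi)
            re_terms += [term(c, f"xr[{j}]"), term(-s, f"xi[{j}]")]
            im_terms += [term(s, f"xr[{j}]"), term(c, f"xi[{j}]")]
        for out, terms in ((f"yr[{k}]", re_terms), (f"yi[{k}]", im_terms)):
            kept = [t for t in terms if t is not None]
            body.append(f"    {out} = {' + '.join(kept) or '0.0'};")
    header = (f"void dft{n}(const double *xr, const double *xi, "
              f"double *yr, double *yi)")
    return header + "\n{\n" + "\n".join(body) + "\n}\n"

print(synthesize_dft(4))
```

For n = 2 and n = 4 every twiddle factor is one of 1, -1, i, -i, so the
emitted body contains only additions and negations. For n = 8 the constant
0.7071... starts to appear. A serious synthesizer would go on to share common
subexpressions and schedule the result for the target machine, which is
exactly where the compiler-style analysis comes in.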


It would also come in handy as a device that would allow you to completely gut
the huge legacy nightmare that's accrued in sources like GNU's (with its heady
make, automake, configure regimen). For instance, instead of adding huge
conditional sets of fixes in *every distribution* to cater to the few, you
should be providing a single distribution that accords with the latest
standards and providing a (re)synthesis tool separately for any who need to
adapt to older or non-standard architectures.


Also included are the kinds of things that CFront does (which is *still* in
active development) or F2C (Fortran to C).


The key difference between synthesis and simply re-laying out code, or even
the extra steps taken to "normalize" it (which may include updating the
language to more recent standards or taking full advantage of recent or
current features that few people make full use of), is that with synthesis
there is also control flow analysis going on, and variable live/dead or in/out
analysis ... the same as when designing a good compiler. This is best seen,
for instance, in the kinds of analysis that would be required to turn Fortran
into idiomatic C or C++. The F2C converter is okay, but it basically drops the
ball on handling Fortran subroutine parameters in a smart way.
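
The kind of analysis meant here can be sketched as a textbook backward
liveness pass over a control flow graph, used to classify each subroutine
parameter as in, out or inout. The representation below (basic blocks as
lists of (reads, writes) pairs, a successor map, and the subroutine itself)
is a made-up example for illustration; arrays, aliasing, COMMON blocks and
calls are all ignored.

```python
def live_in(blocks, succ):
    """Backward dataflow: live_in[b] = use[b] | (live_out[b] - def[b])."""
    use, define = {}, {}
    for name, stmts in blocks.items():
        u, d = set(), set()
        for reads, writes in stmts:
            u |= set(reads) - d   # read before any write in this block
            d |= set(writes)
        use[name], define[name] = u, d
    live = {name: set() for name in blocks}
    changed = True
    while changed:                # iterate to a fixed point
        changed = False
        for name in blocks:
            out = set().union(*(live[s] for s in succ.get(name, ())))
            new = use[name] | (out - define[name])
            if new != live[name]:
                live[name], changed = new, True
    return live

def classify(params, blocks, succ, entry):
    """Label each parameter 'in', 'out', 'inout' or 'unused'."""
    live = live_in(blocks, succ)[entry]
    written = {v for stmts in blocks.values()
               for _, writes in stmts for v in writes}
    result = {}
    for p in params:
        if p in written:
            result[p] = "inout" if p in live else "out"
        else:
            result[p] = "in" if p in live else "unused"
    return result

# Hypothetical subroutine ACC(N, A, T, S):
#   S = 0; I = 1
#   while I <= N:  T = A * I;  S = S + T;  I = I + 1
blocks = {
    "entry": [((), ("S",)), ((), ("I",))],
    "test":  [(("I", "N"), ())],
    "body":  [(("A", "I"), ("T",)), (("S", "T"), ("S",)), (("I",), ("I",))],
    "exit":  [],
}
succ = {"entry": ["test"], "test": ["body", "exit"], "body": ["test"]}
kinds = classify(["N", "A", "T", "S"], blocks, succ, "entry")
print(kinds)
```

Here N and A come out as "in" while T and S come out as "out", so a
translator could pass N and A by value in the generated C and keep pointers
only for T and S. F2C passes every argument by pointer regardless, which is
faithful to Fortran's calling convention but not what anyone would write by
hand in C.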


To this end I will probably be making use of the "Magic Algebra", which I
posted a description of here earlier last decade. In addition, I'm also
doing a comparative assessment with the code synthesis facilities present in
places like FFTW (which are actually carried out in CAML, BTW).

