Re: Compiler Optimizing assembler
5 Nov 1996

          From comp.compilers

From: (M.P.Ward)
Newsgroups: comp.compilers
Date: 5 Nov 1996
Organization: University of Durham, Durham, UK
References: 96-11-029 96-11-033
Keywords: assembler, optimize

John R. Grout <> wrote:
> However, there has
> been some research in a closely related area... many of the
> algorithms developed to perform more sophisticated kinds of binary
> translation analyze object files and recreate some of this extra
> intermediate-language level information, and it would seem to me
> that these algorithms would also be usable for assembler language
> source [a fair amount of work in this area was done by people at
> Digital's Western Research Lab (WRL), including David Wall, who is
> now at Silicon Graphics].

I have been working in this area for a number of years now, using
formal program transformations to "abstract" from low-level programs
(including Assembler programs) to high-level language equivalents, and
even to very high-level abstract specifications.

See my home page <> for more information
on this research, including copies of published papers.

Software Migrations Ltd in Durham, England, are currently exploiting
this research: we have developed a program transformation system,
called FermaT, which applies correctness-preserving transformations to
programs written in a Wide Spectrum Language (called WSL). We have
developed translators from IBM 370 Assembler to WSL, from a
proprietary 16-bit assembler to WSL and from WSL to C and COBOL. We
specialise in migrating legacy Assembler programs to maintainable HLL
code, and analysing Assembler programs for "Year 2000" problems by
tracing date dataflows through the translated and restructured code.

>A few of the special problems involved in analyzing both assembler
>language source and binary object files:
>1. Determining branch targets (assembler programs can branch
>_anywhere_... even unstructured GOTO-laden higher-level code can only
>jump to the beginnings of statements, and usually only to those which
>have explicit labels).
>2. Tracing the lifetimes of scalar values in their passage through
>various registers and in and out of storage.
>3. Dealing with self-modifying code (an abomination in x86
>land... thankfully, it is much less common in other architectures).

We have developed some pretty good (but obviously not _complete_)
solutions to these problems:

1. Our system can usually analyse an Assembler program into a collection
of self-contained single-entry single-exit procedures.

2. This is of course vital for Year 2000 analysis: tracing date data
in and out of registers and memory locations.

3. We are implementing solutions for the commonest forms of self-modifying
code in 370 Assembler.
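
To give a rough feel for what point 2 involves, here is a minimal
sketch in Python. It is not how FermaT works (FermaT applies
transformations to WSL); it is only a toy forward propagation over
straight-line code. The instruction form (opcode, destination, source),
the opcodes MOVE and CONST, and the field names are all hypothetical,
chosen for illustration only.

```python
from typing import Iterable, Set, Tuple

# Hypothetical, simplified instruction form: (opcode, destination, source).
# "MOVE" copies source to destination; "CONST" loads a literal into destination.
Instr = Tuple[str, str, str]


def trace_dates(program: Iterable[Instr], date_fields: Set[str]) -> Set[str]:
    """Return the registers and storage locations that hold date-derived
    data at the end of the instruction sequence.

    Straight-line code only: each instruction is visited once, in order.
    """
    tainted = set(date_fields)
    for op, dst, src in program:
        if op == "MOVE":
            if src in tainted:
                tainted.add(dst)      # date value flows into dst
            else:
                tainted.discard(dst)  # dst overwritten with non-date data
        elif op == "CONST":
            tainted.discard(dst)      # literal overwrites whatever was there
    return tainted


program = [
    ("MOVE", "R1", "INVDATE"),   # load a date field into a register
    ("MOVE", "R2", "R1"),        # register-to-register copy
    ("MOVE", "DUEDATE", "R2"),   # store the register back to storage
    ("CONST", "R1", "0"),        # R1 is reused for something else
    ("MOVE", "COUNT", "R1"),     # so COUNT does not receive date data
]

result = trace_dates(program, {"INVDATE"})
print(sorted(result))  # ['DUEDATE', 'INVDATE', 'R2']
```

A real analysis also has to cope with branches, loops, partial-field
moves and address arithmetic, which is why restructuring the program
into single-entry single-exit procedures first (point 1) matters.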

See SML's home page <> for more information.

