Re: SSA Bibliography

David Chase <chase@world.std.com>
9 Mar 2002 03:02:04 -0500


From: David Chase <chase@world.std.com>
Newsgroups: comp.compilers
Date: 9 Mar 2002 03:02:04 -0500
Organization: The World : www.TheWorld.com : Since 1989
References: 02-02-072
Keywords: analysis, bibliography
Posted-Date: 09 Mar 2002 03:02:04 EST

QuantumG wrote:


> Can anyone direct me to where I might find a comprehensive
> bibliography of papers or texts on SSA form? In particular I'm
> interested in what optimisation techniques can be done in SSA form as
> I am trying to minimize translation passes to/from SSA form.


> Preferably one would like to do everything in SSA form and, although
> at this point I have no idea how to do incremental updates of an SSA
> intermediate form, I'm at a loss as to whether this is possible or
> not. As such further reading is duly required.


There are some tricky things about SSA, at least in the original
presentations. One is that they do not adequately explain how to
"connect" the definitions reaching a phi-function to the
phi-function itself. Another annoying problem is that the
most obvious rendering of dead code elimination removes
infinite loops -- more than once, I have found it useful to
insert bogus edges from non-exiting loops to a mythical exit
node (the bogus edges make it easier to avoid corner cases
in the algorithms, at the expense of reduced optimization of
infinite loops, which is not usually considered a problem).
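

For concreteness, here is a rough sketch of the bogus-edge trick.
The CFG representation and the names are just illustrative
assumptions, not taken from any particular compiler:

    # Sketch: cfg maps each node to a list of successor nodes, and
    # exit_node is the routine's exit.  Any node from which the exit is
    # unreachable is caught in a non-exiting region; giving such nodes
    # a bogus edge to the exit keeps control-dependence-based dead code
    # elimination from discarding the infinite loop.
    def add_bogus_exit_edges(cfg, exit_node):
        preds = {}
        for n, succs in cfg.items():
            for s in succs:
                preds.setdefault(s, []).append(n)
        # Walk backwards from the exit to find everything that can
        # still reach it.
        can_exit, work = {exit_node}, [exit_node]
        while work:
            for p in preds.get(work.pop(), []):
                if p not in can_exit:
                    can_exit.add(p)
                    work.append(p)
        # The blunt version: one bogus edge per trapped node.  One edge
        # per non-exiting loop would do just as well.
        for n in cfg:
            if n not in can_exit:
                cfg[n].append(exit_node)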


It is also possible to play games with the values/uses propagated
around by treating them as bit vectors. This can be handy if you
are trying to determine that things that got widened to int (for
instance, while compiling C) never really needed all those extra
bits.
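

As a rough illustration of the bit-vector idea (the tiny operator set
below is my own invention), you can push a mask of "bits actually
demanded of the result" backwards through each operation:

    # Sketch: given the bits demanded of an operation's result, return
    # the bits demanded of its (first) operand; k is a constant shift
    # amount where relevant.
    MASK32 = 0xFFFFFFFF

    def operand_demand(opcode, k, result_mask):
        if opcode in ("and", "or", "xor"):
            return result_mask                 # conservatively, same bits
        if opcode == "shl":                    # x << k
            return (result_mask >> k) & MASK32
        if opcode == "lshr":                   # x >> k, logical
            return (result_mask << k) & MASK32
        if opcode == "trunc8":                 # (uint8_t) x
            return result_mask & 0xFF
        return MASK32                          # unknown op: demand all bits

    # A value used only as (uint8_t)(x >> 4) demands just bits 4..11 of x:
    #   operand_demand("lshr", 4, operand_demand("trunc8", 0, MASK32)) == 0xFF0
    # so the widening to a full int never mattered.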


In terms of papers or textbooks, I expect that any of the Muchnick,
Appel, or Morgan books contains an adequate discussion (I skimmed
Muchnick's description, and it looked fine to me). Here's a selection
of the early papers, which I found quite useful the first time I had
to implement this.


@inproceedings{CFRWZ:EMCSSA,
  AUTHOR    = "Ron Cytron and Jeanne Ferrante and Barry K. Rosen and
               Mark N. Wegman and F. Kenneth Zadeck",
  TITLE     = "An Efficient Method of Computing Static Single Assignment Form",
  BOOKTITLE = POPL89, YEAR = 1989, PAGES = {25--35}
}


@inproceedings{AWZ:DEVP,
  AUTHOR    = "Bowen Alpern and Mark N. Wegman and F. Kenneth Zadeck",
  TITLE     = "Detecting Equality of Variables in Programs",
  BOOKTITLE = POPL88, YEAR = 1988, PAGES = {1--11}
}


@inproceedings{WZ:CPCB,
  AUTHOR    = "Mark N. Wegman and F. Kenneth Zadeck",
  TITLE     = "Constant Propagation with Conditional Branches",
  BOOKTITLE = POPL85, YEAR = 1985, PAGES = {291--299}
}


@inproceedings{RWZ:GVNRC,
  AUTHOR    = "Barry K. Rosen and Mark N. Wegman and F. Kenneth Zadeck",
  TITLE     = "Global Value Numbers and Redundant Computations",
  BOOKTITLE = POPL88, YEAR = 1988, PAGES = {12--27}
}


@inproceedings{CLZ:CMCSHLL,
  AUTHOR    = "Ron Cytron and Andy Lowry and F. Kenneth Zadeck",
  TITLE     = "Code Motion of Control Structures in High-Level Languages",
  BOOKTITLE = POPL86, YEAR = 1986, PAGES = {70--85}
}


As far as what optimizations can be done in SSA form -- plenty.
The easy ones are dead code elimination, constant propagation,
and global value numbering. Update of SSA depends somewhat on
the transformations made. There are also some modifications of
SSA (e.g., "gated SSA") that allow you to propagate properties
after conditionals, and I've also worked on a compiler that
inserted "pseudo-assignments" after conditionals. (You can
also play with a paired model of value approximations -- one
modeling only the actual assignments, the other also modeling
the effects of pseudo-assignments. Otherwise, the pseudo-assignments
can mess up things like value-numbering, and make life really
annoying at register allocation. You also have to be slightly
careful of exception-causing operations and hoisting -- for
instance, if you divide by a pseudo-assigned non-zero quantity,
it might be a bad idea to hoist that operation above the
conditional. Or, you might hoist a dereference of a pointer that is
pseudo-assigned non-null; also not good, unless you can use an
operation that won't fault on a null dereference.)
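

To make the hazard concrete, here is a small made-up example in
source-level terms, with a hypothetical pi() standing in for the
pseudo-assignment (here it is just the identity, but the compiler
treats its result as "same value, known non-zero"):

    def pi(v):
        return v

    def f(x, d0):
        if d0 != 0:
            d1 = pi(d0)      # pseudo-assignment: d1 == d0 and d1 != 0
            return x / d1    # safe here, guarded by the test
        return 0

    # Hoisting the division above the "if" (because d1 "is non-zero")
    # would re-introduce exactly the divide-by-zero the guard prevents.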


I don't recall a paper discussing incremental update
post-transformation. The approaches I know of are pretty
much brute-force, though informed by knowledge of what
transformations were made. Compared to things like register
allocation, recomputing SSA from scratch is not usually
all that expensive. There have been a couple of papers
written on SSA update in the presence of pointers and aliases.
I wrote one with Ken Zadeck and Mark Wegman (about ten years
ago), and Ron Cytron described a different approach a year
or so later (it appeared in a PLDI, I think). The approach
we took used lookup of the nearest defining ancestor in the
(fixed) dominator tree; this allowed easy insertion of new
definitions.


The lookup trick is not original to us, though it was apparently
not well known at the time we figured out how to use it. If
you number the tree (any tree -- this is particularly useful
in the parse tree of a lexically scoped language) with "enter"
and "leave" (or "in" and "out") numbers, then you have the
property that "X ancestor Y" precisely when


      in(X) < in(Y) < out(Y) < out(X)
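

Here is a small sketch of that numbering and the ancestor test,
assuming the tree is given as a map from node to children (my own
names, not anyone's published code):

    # Assign "enter"/"leave" numbers with a single depth-first walk
    # (iterative, to avoid recursion limits on deep trees).
    def number_tree(root, children):
        nin, nout, clock = {}, {}, 0
        stack = [(root, iter(children.get(root, ())))]
        nin[root] = clock; clock += 1
        while stack:
            node, kids = stack[-1]
            child = next(kids, None)
            if child is None:
                nout[node] = clock; clock += 1
                stack.pop()
            else:
                nin[child] = clock; clock += 1
                stack.append((child, iter(children.get(child, ()))))
        return nin, nout

    # "x ancestor y" exactly as in the inequality above (strict, so a
    # node does not count as its own ancestor).
    def is_ancestor(nin, nout, x, y):
        return nin[x] < nin[y] and nout[y] < nout[x]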


The lookup of the nearest defining ancestor is accomplished by
storing all the in and out numbers for nodes defining X in a
sorted list, noting which ones are "in" and which are "out",
back-linking outs to ins, and back-linking ins to their parents
(that is, to the nearest enclosing defining node). A binary
search puts you between two entries, and one of the following
four cases holds:


    in(A)   X   in(B)      parent(B) (== A)  nearest-defines X
    out(A)  X   in(B)      parent(B)         nearest-defines X
    in(A)   X   out(A)     A                 nearest-defines X
    out(A)  X   out(B)     B                 nearest-defines X
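

Here is one way that lookup might be coded, continuing the sketch
above; the names and the exact data structure are my own guess at
what is described, not a transcription of our implementation:

    import bisect

    # Build the sorted list of in/out numbers for the nodes defining X,
    # remembering for each "in" entry the nearest enclosing defining
    # node (the "parent" back-link).
    def build_def_index(def_nodes, nin, nout):
        entries = sorted([(nin[d], "in", d) for d in def_nodes] +
                         [(nout[d], "out", d) for d in def_nodes])
        parent, open_defs = {}, []
        for _, kind, d in entries:
            if kind == "in":
                parent[d] = open_defs[-1] if open_defs else None
                open_defs.append(d)
            else:
                open_defs.pop()
        keys = [num for num, _, _ in entries]
        return keys, entries, parent

    # Binary search drops in(node) between two entries; the four cases
    # above collapse into these tests.
    def nearest_defining_ancestor(node, keys, entries, parent, nin):
        i = bisect.bisect_right(keys, nin[node])
        before = entries[i - 1] if i > 0 else None
        after  = entries[i] if i < len(entries) else None
        if before is not None and before[1] == "in":   # in(A) X ...     -> A
            return before[2]
        if after is not None and after[1] == "in":     # out(A) X in(B)  -> parent(B)
            return parent[after[2]]
        if after is not None and after[1] == "out":    # out(A) X out(B) -> B
            return after[2]
        return None                                    # no definition encloses X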


However, once you start mutilating the tree, you have
to worry about keeping the numbering and the sorted sets
up to date. In addition, this representation is a pain
when you start moving code around, since the definition
seen is dependent on the location (in the tree) of the
node.


Cliff Click wrote an interesting paper a few years back (in the
San Diego/La Jolla PLDI, I think) on optimizing from SSA, and
later some people at DEC Research wrote a paper on a mostly-static
Java compiler that used these techniques. The DEC paper is a
necessary complement, because, at least under Java semantics, a
naive application of Cliff's work is "too good", meaning that it
does not capture all the semantic restrictions that actually apply
in a "real" language (or at least, in Java). (I'm sorry that I'm
blanking on the names of the papers.)


David Chase

