Re: Common subexpression elimination (CSE) using value numbering?

Steven Bosscher <stevenb@suse.de>26 May 2005 23:12:28 -0400

From comp.compilers

Related articles
Common subexpression elimination (CSE) using value numbering? Hu.YuehWei@gmail.com (2005-05-26)
Re: Common subexpression elimination (CSE) using value numbering? torbenm@diku.dk (2005-05-26)
Re: Common subexpression elimination (CSE) using value numbering? vidar.hokstad@gmail.com (2005-05-26)
Re: Common subexpression elimination (CSE) using value numbering? stevenb@suse.de (Steven Bosscher) (2005-05-26)
| List of all articles for this month |

 From: Steven Bosscher Newsgroups: comp.compilers Date: 26 May 2005 23:12:28 -0400 Organization: SUSE Labs Keywords: analysis, optimize Posted-Date: 26 May 2005 23:12:28 EDT

> 1) Li Xu: Program Redundancy Analysis and Optimization to Improve
> Memory Performance
> 2) Taylor Simpson: Value-Driven Redundancy Elimination
>
> and implemented this technology into my dynamic compiler.

For kicks, you could add Thomas VanDrunen's "Value-based Partial
Redundancy Elimination" (http://cs.wheaton.edu/~tvandrun/writings/cc.ps)
which is the PRE algorithm implemented in GCC.

> All things are perfect except one condition. I don't know how to
> handle this condition in your optimization framework. The condition I
> can't handle is as following:
>
> A = B + C
> A = D <-- this expression must define the same variable as the previous one
> E = B + C
>
> After transfering to the SSA form:
>
> A0 = B0 + C0
> A1 = D0
> E0 = B0 + C0
>
> After applying the value numbering, I can say that A0 and E0 have the
> same value.

Only if A does not have to live in memory, like you say below.

> After applying AVAIL common subexpression elimination, I
> can replace the last expression with "E0 = A0". Thus, the whole codes
> become this one:
>
> A0 = B0 + C0
> A1 = D0
> E0 = A0 <--
>
> However, because both "A0" and "A1" have the same memory address, so
> that I will get a wrong value when I execute the last expression after
> the second expression.

So...

If A is some location memory, you have just created an overlapping live
range of A0 and A1. You can't do that for memory locations ;-)

If A does not have to live in memory, then this is something that you
would fix with live range splitting while you are rewriting out of SSA
form (in your register allocator, or somewhere else). In that case,
look up the paper from Rastello and Guillon, "Optimizing Translation
Out of SSA Using Renaming Constraints"
(http://www.cgo.org/cgo2004/papers/21_14_rastello_f_revised.pdf)

> The procedure described in these papers is:
>
> 1) When seeing the first expression, add A0 to the AVIN bitset of this
> basic block.
> 2) When seeing the second expression, add A1 to the AVIN bitset of this
> basic block, too.

If A is a memory address, then the set of A1 actually kills A0, so you
should remove it from the AVAIL set at this point. Which is, of course,
not really in the spirit of SSA form. Fortunately there are ways to
sort-of "hide" A0 without killing it.

> 3) When seeing the third expression, value numbering determines that
> the value of E0 is equal to A0,
> and A0 has already been in AVIN of this basic block,
> thus the third expression is redundant and can be removed.
> However, if I remove the third expression, then the final value of
> E will be wrong.

So you need a way to explain to the compiler that E0 and A0 in fact do
not have the same value (again, iff A really must live in memory!).
In GCC we do this by assigning "virtual operands" to memory loads and
stores. This idea is probably best explained by Chow et. al. in
"Effective Representation of Aliases and Indirect Memory Operations in
SSA Form" (http://citeseer.ist.psu.edu/chow96effective.html).

Hope this helps.

Gr.
Steven

Post a followup to this message