Mon, 17 May 1993 14:33:43 GMT

Related articles:
  Questions about partial redundancy elimination  baxter@Austin.slcs.slb.com (1993-04-30)
  Re: Questions about partial redundancy elimination  preston@dawn.cs.rice.edu (1993-05-01)
  Re: Questions about partial redundancy elimination  bill@amber.csd.harris.com (1993-05-03)
  Re: Questions about partial redundancy elimination  max@nic.gac.edu (1993-05-04)
  Re: Questions about partial redundancy elimination  preston@dawn.cs.rice.edu (1993-05-05)
  Questions about partial redundancy elimination  jk@informatik.uni-kiel.dbp.de (1993-05-17)
  Re: Questions about partial redundancy elimination  uday@cse.iitb.ernet.in (1993-05-18)

Newsgroups: comp.compilers
From: jk@informatik.uni-kiel.dbp.de (Jens Knoop)
Keywords: optimize, bibliography
Organization: Compilers Central
References: 93-05-008 93-05-019
Date: Mon, 17 May 1993 14:33:43 GMT

baxter@Austin.slcs.slb.com (Ira Baxter) writes:

>There's an interesting paper in the most recent SIGPLAN on elimination of
>partial redundancies ...
>@article(Drechsler93a:efficient-lazy-code-motion,
> author = "Karl-Heinz Drechsler and Manfred P. Stadel",
> title = "{A Variation of Knoop, Ruthing and Steffen's LAZY CODE MOTION}",

>This seems to imply that common subexpression *detection*, (if not
>elimination), complete with verification of identical control dependences
>has already taken place, or multiple instances of the same expression
>would have no such guarantee. That seems rather circular ...

No, the definition isn't circular. The point here is that partial
redundancy elimination (PRE) in the style of Morel and Renvoise [MR79]
(cf. [Chowphd,Dh83,Dh88,Dh89,Dhcorr,Dh91,DRZpldi92,DS88,DS93,KRSpldi92,
KRStoplas93,KRSjpl93,JD82a,JD82b,MR79,So89]) is purely *syntactic*,
i.e. it stands for the elimination of (partially) redundant
computations of syntactically identical terms within a complete
program. In contrast, common subexpression elimination (CSE) is
*semantic* in that it usually stands for the elimination of (totally)
redundant computations between (syntactically different) terms within
basic blocks. CSE was pioneered by the value numbering technique of
Cocke and Schwartz [CS70], which mimics symbolic execution by
associating a symbolic value with each term occurrence in such a way
that equality of the symbolic values implies equality at execution
time. PRE algorithms working on flow graphs composed of basic blocks
usually assume that CSE has been done before PRE starts. In
particular, this guarantees that every basic block contains at most
one computation of a term t before the first and after the last
modification of a variable occurring in t, respectively. Often, this
assumption is not explicitly stated. Hence, "computations of the same
expression" in the paper of Drechsler and Stadel refers to
syntactically identical program terms, and thus there is no cycle.
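The value-numbering idea above can be sketched in a few lines. The
following is a hypothetical minimal reconstruction for a single basic
block, not the Cocke/Schwartz algorithm itself; the tuple-based IR and
function name are my own:

```python
def value_number_block(block):
    """block: list of (dst, op, src1, src2) tuples for one basic block.
    Returns the set of destinations whose right-hand side is redundant,
    i.e. is guaranteed to compute a value already computed earlier."""
    vn = {}          # variable name -> current value number
    table = {}       # (op, vn1, vn2) -> value number of the result
    counter = [0]    # next fresh value number
    redundant = set()

    def number_of(name):
        # A variable seen for the first time gets a fresh symbolic value.
        if name not in vn:
            vn[name] = counter[0]
            counter[0] += 1
        return vn[name]

    for dst, op, a, b in block:
        key = (op, number_of(a), number_of(b))
        if key in table:
            # Same operator on the same symbolic values: the run-time
            # values are provably equal, so this computation is redundant.
            redundant.add(dst)
            vn[dst] = table[key]
        else:
            table[key] = counter[0]
            vn[dst] = counter[0]   # assignment kills dst's old value
            counter[0] += 1
    return redundant
```

Note how an intervening assignment to an operand gives it a new value
number, so later occurrences of the "same" expression are correctly
treated as different values.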

>Last, but not least, how are commutative operators handled with such a
>technique? ...

Conceptually, commutative operators do not cause any problems. Central
to PRE algorithms is a local predicate Comp(n) indicating whether a
term t is computed in node n or not. If, e.g., t equals t1 + t2 and
you want to exploit the commutativity of addition, just define Comp(n)
to be true if n contains an occurrence of t1 + t2 or of t2 + t1. Note
that this is a step from purely syntactically based PRE, which would
consider t1 + t2 different from t2 + t1, towards semantically based
PRE (= CSE).
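The usual way to implement this is to normalize operand order before
comparing terms. A small sketch (the triple IR and names are
hypothetical, for illustration only):

```python
COMMUTATIVE = {"+", "*"}

def normalize(term):
    """term: (op, left, right). Put operands of commutative operators
    into a canonical order so t1 + t2 and t2 + t1 compare equal."""
    op, a, b = term
    if op in COMMUTATIVE and b < a:
        a, b = b, a
    return (op, a, b)

def comp(node_terms, t):
    """Comp(n): does node n (given as a list of terms) contain a
    computation of t, modulo commutativity?"""
    want = normalize(t)
    return any(normalize(u) == want for u in node_terms)
```

With `normalize` applied consistently, the rest of the PRE machinery
needs no change at all.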

baxter@Austin.slcs.slb.com (Ira Baxter) writes:
>There's an interesting paper in the most recent SIGPLAN on elimination of ...
>It shows how to compute "partial-redundancy" information efficiently, ...

preston@dawn.cs.rice.edu (Preston Briggs) writes:
>That's right, it's yet another paper on partial redundancy elimination!

I don't think that [DS93] is "yet another paper on partial redundancy
elimination". On the contrary: a practically important feature of this
algorithm, inherited from its underlying algorithm for lazy code
motion of [KRSpldi92,KRStoplas93], is that PRE is performed by purely
unidirectional analyses. In fact, except for the algorithm of [DS88]
(which, however, requires more analyses than the one of [KRSpldi92]),
all previous PRE algorithms
[Chowphd,Dh83,Dh88,Dh91,JD82a,JD82b,MR79,So89] require a bidirectional
data flow analysis ([DRZpldi92] is conceptually based on an underlying
bidirectional analysis). Bidirectional analyses, however, are
conceptually and computationally more complex than unidirectional
ones: in contrast to the unidirectional case, where reducible programs
can be dealt with in O(n*log(n)) time, where n characterizes the size
of the argument program (e.g. number of statements), the best known
estimate for bidirectional analyses is O(n^2) (detailed references can
be found in [KRSpldi92]). Thus, the decomposition of the originally
bidirectional data flow in PRE into (straightforward) unidirectional
components drastically improves on the complexity of the previous
algorithms for PRE. Moreover, it is much easier to see what is going
on in PRE (for example, after the decomposition it was straightforward
to extend the algorithm of [KRSpldi92] to uniformly capture PRE and
strength reduction [KRSjpl93]):

1) Placing computations *as early as possible* in a program while
   maintaining safety results in computationally optimal programs
   ---> busy code motion (safe-earliest transformation in [KRSpldi92])

2) Placing computations *as late as possible* while maintaining
   computational optimality results in computationally and lifetime
   optimal programs
   ---> lazy code motion

Intuitively, *lifetime optimal* means that no computation is
unnecessarily moved, i.e. moved without runtime gain. This is
discussed in more detail below.
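To illustrate what "purely unidirectional" buys, here is a toy
round-robin solver for a backward bit-vector analysis in the spirit of
the down-safety (anticipability) component of busy code motion. This
is a deliberately simplified sketch: the CFG encoding, the predicate
names, and the equations are my own abstraction, not the exact system
of [KRSpldi92].

```python
def anticipable(nodes, succ, comp, transp):
    """Backward bit-vector analysis for one term t:
    t is anticipable at node n if every path leaving n computes t
    before any operand of t is modified.
      succ[n]   -- list of successor nodes
      comp[n]   -- t is computed in n (before any modification in n)
      transp[n] -- n contains no modification of t's operands
    Greatest fixed point, computed by round-robin iteration."""
    ant = {n: True for n in nodes}        # optimistic initialization
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if succ[n]:
                new = comp[n] or (transp[n] and all(ant[s] for s in succ[n]))
            else:
                new = comp[n]             # exit node: no paths to rescue it
            if new != ant[n]:
                ant[n] = new
                changed = True
    return ant
```

On a diamond CFG 1 -> {2, 3} -> 4 where the term is computed in both
branches 2 and 3, the solver reports the term anticipable at node 1,
which is exactly where busy code motion would hoist it. One such
analysis direction at a time is what keeps the whole decomposition in
the cheap unidirectional complexity class.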

Baxter:
>... Is there a reason that such techniques aren't used to
>eliminate redundant *effects*? (i.e., "Common Procedural Elimination"?)

Preston:
>... The only paper that talks about this is
> title="Code Motion of Control Structures in High-Level Languages",
> author="Ron Cytron and Andy Lowry and F. Kenneth Zadeck",
> booktitle=popl13, (Second reference this week!)

In

Knoop, J., and Steffen, B. Efficient and optimal bit-vector data flow
analyses: A uniform interprocedural framework. Tech. Rep. 9309, Dept. of
Comp. Science, University of Kiel, Germany, 1993, 22 pages.

the interprocedural extension of the busy code motion (safe-earliest)
transformation of [KRSpldi92] is presented, which works for programs
with mutually recursive procedures, global and local variables, and
formal (value) parameters. This algorithm is unique in eliminating
interprocedural partial redundancies in a computationally optimal
manner. Moreover, it is straightforward to generalize this algorithm
to its `lazy' variant and to also capture formal reference (and even
formal procedure) parameters.

>Back to PRE for a brief parting shot. The recent papers that try to
>minimize register pressure by placing common subexpressions as late in the
>code as possible don't work ...
>    r10 = foo()
>    r11 = bar()
>    ...

PRE for a term t proceeds by inserting at some program points
initializations of the form h := t, where h is a new temporary
associated with t, and by replacing some of the original occurrences
of t by h. As the example given by Preston Briggs illustrates, this
does not uniquely determine the resulting computationally optimal
program. A major achievement of the algorithm of [KRSpldi92],
therefore, is to identify the two extreme placement strategies
resulting in computationally optimal programs: first, placing
computations as early as possible (busy code motion (bcm)); second,
placing computations as late as possible (lazy code motion (lcm)).
Every computationally optimal placement inserts computations somewhere
between the earliest possible and the latest possible computation
points. Thus, the earliest and the latest computation points exactly
characterize the program regions where to look for computation points
of computationally optimal placements. Due to their built-in
heuristics for avoiding unnecessary code motion, previous PRE
algorithms deliver computationally optimal placements which
(unpredictably) insert somewhere between the bcm-placement and the
lcm-placement. The merit of the lcm-placement here is that it
minimizes the lifetime of the temporary h introduced for t (in
particular, computations are only moved if the move yields a runtime
gain). Clearly, this "local" lifetime optimality does not
automatically result in "globally lifetime optimal" programs, which
would require simultaneously considering the lifetimes of *all*
variables in a program and respecting the interdependencies among
them. Like Preston Briggs, I am also not aware of anyone having
attacked this "global" problem. In fact, all reported approaches
aiming to avoid register pressure treat the lifetimes of registers
separately, without considering the complex interdependencies. The
lcm-transformation was the first one to solve the "local" problem
optimally.

Finally, why do you expect that the global lifetime optimality problem
is NP-hard (unless some kind of register subsumption technique is
taken into account)? Here, I would be interested in a more detailed
comment.

>I've perhaps been harsh on PRE. However, you should understand that we
>use it in our optimizer and I don't really see a way to replace it with
>another technique. I had hopes for the method described in
> title="Global Value Numbers and Redundant Computations",
> author="Barry K. Rosen and Mark N. Wegman and F. Kenneth Zadeck",
> booktitle=popl15,
>but I've never been able to build it. I believe it's fatally flawed in
>its inability to preserve ordering of array stores and loads.

It is worth noting that the algorithm of Rosen, Wegman and Zadeck is a
step towards an algorithm for "global common subexpression
elimination" extending the value numbering technique of Cocke and
Schwartz [CS70] (rather than extending the PRE technique of Morel and
Renvoise [MR79]!). In fact, a semantically based algorithm for PRE
such as the one of [RWZpopl88] is inherently more powerful (though
computationally more complex) than a syntactically based PRE
algorithm. Concerning semantically based PRE you may be interested in
the two algorithms of [SKResop90] and [SKRtapsoft91], which besides
the one of [RWZpopl88] are the only algorithms for global common
subexpression elimination I am aware of. The algorithm of
[SKRtapsoft91] achieves, for programs with arbitrary (even
irreducible) flow structure, the optimality result which the algorithm
of [RWZpopl88] obtains for acyclic flow graphs. The algorithm of
[SKRtapsoft91] extends the algorithm of [SKResop90] to uniformly cover
global common subexpression elimination and strength reduction.

-- Jens Knoop
Department of Computer Science, University of Kiel, Preusserstr. 1-9,
W-2300 Kiel 1, Federal Republic of Germany.
Email: jk@informatik.uni-kiel.dbp.de

===============

@STRING{cacm="Communications of the ACM"}

@STRING{complang="Journal of Computer Languages"}

@STRING{ijcm="International Journal of Computer Mathematics"}

@STRING{jpl="Journal of Programming Languages"}

@STRING{lncs="Lecture Notes in Computer Science"}

@STRING{sigplannot="{ACM SIGPLAN} Notices"}

@STRING{toplas="ACM Transactions on Programming Languages and Systems"}

@STRING{esop="European Symposium on Programming (ESOP)"}

@STRING{pldi="{ACM SIGPLAN} Conference on Programming Language Design and

Implementation"}

@STRING{popl="{ACM} Symposium on the Principles of Programming Languages"}

@STRING{tapsoft="International Joint Conference on Theory and

Practice of Software Development (TAPSOFT)"}

@phdthesis{Chowphd,

author = {Chow, F.},

title = {A portable machine independent optimizer -- {D}esign and

measurements},

school = {Stanford University, Dept.\ of Electrical Engineering},

address = {Stanford, CA},

note = {{P}ublished as {T}ech.\ {R}ep.\ 83-254, {C}omputer {S}ystems

{L}ab., {S}tanford {U}niversity},

year = {1983}

}

@booklet{CS70,

author = {Cocke, J. and Schwartz, J. T.},

title = {Programming Languages and Their Compilers},

howpublished = {Courant Institute of Mathematical Sciences, NY},

year = {1970}

}

@article{Dh83,

author = {Dhamdhere, D. M.},

title = {Characterization of program loops in code optimization},

journal = complang,

volume = {8},

number = {2},

pages = {69 - 76},

year = {1983}

}

@article{Dh88,

author = {Dhamdhere, D. M.},

journal = sigplannot,

number = {10},

pages = {172 - 180},

title = {A fast Algorithm for Code Movement Optimization},

volume = {23},

year = {1988}

}

@article{Dh89,

author = {Dhamdhere, D. M.},

journal = ijcm,

pages = {1 - 14},

title = {A new Algorithm for Composite Hoisting and Strength

Reduction Optimisation},

volume = {27},

year = {1989}

}

@article{Dhcorr,

author = {Dhamdhere, D. M.},

journal = ijcm,

title = {Corrigendum: {A} new Algorithm for Composite Hoisting and Strength

Reduction Optimisation}

}

@article{Dh91,

author = {Dhamdhere, D. M.},

journal = toplas,

number = {2},

pages = {291 - 294},

title = {Practical Adaptation of the Global Optimization

Algorithm of {M}orel and {R}envoise},

volume = {13},

year = {1991},

note = {Technical Correspondence}

}

@inproceedings{DRZpldi92,

author = {Dhamdhere, D. M. and Rosen, B. K. and Zadeck, F. K.},

address = {San Francisco, CA},

booktitle = "Proc.~" # pldi# "'92",

pages = {212 - 223},

volume = {27},

number = {7},

series = sigplannot,

title = {How to Analyze Large Programs Efficiently and Informatively},

month = jun,

year = {1992}

}

@article{DS88,

author = {Drechsler, K.-H. and Stadel, M. P.},

journal = toplas,

number = {4},

pages = {635 - 640},

title = {A solution to a Problem with {M}orel and {R}envoise's

``{G}lobal Optimization by Suppression of Partial

Redundancies''},

volume = {10},

note = {Technical Correspondence},

year = {1988}

}

@article{DS93,

author = {Drechsler, K.-H. and Stadel, M. P.},

journal = sigplannot,

volume = {28},

number = {5},

pages = {29 - 38},

title = {A variation of Knoop, R\"uthing and Steffen's Lazy Code Motion},

year = {1993}

}

@article{JD82a,

author = {Joshi, S. M. and Dhamdhere, D. M.},

journal = ijcm,

pages = {21 - 41},

title = {A Composite Hoisting-Strength Reduction Transformation for

Global Program Optimization -- Part {I}},

volume = {11},

year = {1982}

}

@article{JD82b,

author = {Joshi, S. M. and Dhamdhere, D. M.},

journal = ijcm,

pages = {111 - 126},

title = {A Composite Hoisting-Strength Reduction Transformation

for Global Program Optimization -- Part {II}},

volume = {11},

year = {1982}

}

@inproceedings{KRSpldi92,

author = {Knoop, J. and R\"uthing, O. and Steffen, B.},

address = {San Francisco, CA},

booktitle = "Proc.~" # pldi #"'92",

pages = {224 - 234},

series = sigplannot,

volume = {27},

number = {7},

title = {Lazy code motion},

month = jun,

year = {1992}

}

@unpublished{KRStoplas93,

author = {Knoop, J. and R\"uthing, O. and Steffen, B.},

note = "Submitted to " # toplas # ". Extended and revised version of
\cite{KRSpldi92}",

title = {Optimal Code Motion: {Theory} and Practice},

year = {1993}

}

@article{KRSjpl93,

author = {Knoop, J. and R\"uthing, O. and Steffen, B.},

journal = jpl,

title = {Lazy Strength Reduction},

year = {1993},

volume = {1},

number = {1},

pages = {71-91},

publisher = {Chapman \& Hall},

address = {London, UK},

}

@article{MR79,

author = {Morel, E. and Renvoise, C.},

journal = cacm,

number = {2},

pages = {96 - 103},

title = {Global Optimization by Suppression of Partial Redundancies},

volume = {22},

year = {1979}
}

@inproceedings{RWZpopl88,

author = {Rosen, B. K. and Wegman, M. N. and Zadeck, F. K.},

address = {San Diego, CA},

booktitle = "Proc. 15$^{th}$~" # popl,

pages = {12 - 27},

title = {Global value numbers and redundant computations},

year = {1988}

}

@inproceedings{SKResop90,

author = {Steffen, B. and Knoop, J. and R\"uthing, O.},

address = {Copenhagen, Denmark},

booktitle = "Proc. 3$^{rd}$~" # esop,

pages = {389 - 405},

publisher = {Springer-Verlag},

series = lncs # "~432",

title = {The Value Flow Graph: {A} Program Representation

for Optimal Program Transformations},

year = {1990}

}

@inproceedings{SKRtapsoft91,

author = {Steffen, B. and Knoop, J. and R\"uthing, O.},

address = {Brighton, UK},

booktitle = "Proc. 4$^{th}$~" # tapsoft,

pages = {394 - 415},

publisher = {Springer-Verlag},

series = lncs # "~494",

title = {Efficient Code Motion and an Adaption to Strength Reduction},

year = {1991}

}

@article{So89,

author = {Sorkin, A.},

title = {Some Comments on a Solution to a Problem with

{M}orel and {R}envoise's ``{G}lobal

Optimization by Suppression of Partial Redundancies''},

journal = toplas,

volume = {11},

number = {4},

pages = {666 - 668},

year = {1989},

note = {Technical Correspondence}

}
