Related articles |
---|
Questions about partial redundancy elimination baxter@Austin.slcs.slb.com (1993-04-30) |
Re: Questions about partial redundancy elimination preston@dawn.cs.rice.edu (1993-05-01) |
Re: Questions about partial redundancy elimination bill@amber.csd.harris.com (1993-05-03) |
Re: Questions about partial redundancy elimination max@nic.gac.edu (1993-05-04) |
Re: Questions about partial redundancy elimination preston@dawn.cs.rice.edu (1993-05-05) |
Questions about partial redundancy elimination jk@informatik.uni-kiel.dbp.de (1993-05-17) |
Re: Questions about partial redundancy elimination uday@cse.iitb.ernet.in (1993-05-18) |
Newsgroups: | comp.compilers |
From: | jk@informatik.uni-kiel.dbp.de (Jens Knoop) |
Keywords: | optimize, bibliography |
Organization: | Compilers Central |
References: | 93-05-008 93-05-019 |
Date: | Mon, 17 May 1993 14:33:43 GMT |
baxter@Austin.slcs.slb.com (Ira Baxter) writes:
>There's an interesting paper in the most recent SIGPLAN on elimination of
>partial redundancies ...
>@article(Drechsler93a:efficient-lazy-code-motion,
> author = "Karl-Heinz Drechsler and Manfred P. Stadel",
> title = "{A Variation of Knoop, Ruthing and Steffen's LAZY CODE MOTION}",
>This seems to imply that common subexpression *detection*, (if not
>elimination), complete with verification of identical control dependences
>has already taken place, or multiple instances of the same expression
>would have no such gaurantee. That seems rather circular ...
No, the definition isn't circular. The point here is that partial
redundancy elimination (PRE) in the style of Morel and Renvoise [MR]
(cf.
[Chowphd,Dh83,Dh88,Dh89,Dhcorr,Dh91,DRZpldi92,DS88,DS93,KRSpldi92,
KRStoplas93,KRSjpl93,JD82a,JD82b,MR79,So89])
is purely *syntactical*, i.e. it stands for the elimination of
(partially) redundant computations of syntactically identical terms
within a complete program. In contrast, common subexpression
elimination (CSE) is *semantically* in that it usually stands for the
elimination of (totally) redundant computations between (syntactically
different) terms within basic blocks. CSE was pioneered by the value
numbering technique of Cocke and Schwartz [CS70], which somehow mimics
symbolic execution by associating a symbolic value with each term
occurrence in such a way that equality of the symbolic values implies
equality at execution time. PRE-algorithms working on flow graphs
composed of basic blocks usually assume that CSE has been done before
PRE starts. In particular, this guarantees that every basic block
contains at most one computation of a term t before the first and after
the last modification of a variable occurring in t, respectively.
Often, this assumption is not explicitly stated. Hence, "computations
of the same expression" in the paper of Drechsler and Stadel concern
syntactically identical program terms, and thus, there is no cycle.
>Last, but not least, how are commutative operators handled with such a
>technique? ...
Conceptually, commutative operators do not cause any problems. Central
for PRE-algorithms is a local predicate Comp(n) indicating whether a
term t is computed in node n or not. If e.g. t equals t1 + t2 and you
want to exploit the commutativity of addition just define Comp(n) to be
true if n contains an occurrence of t1 + t2 or of t2 + t1. Note that
this is a step from purely syntactically based PRE, which would
consider t1 + t2 different from t2 + t1, towards semantically based
PRE (= CSE).
baxter@Austin.slcs.slb.com (Ira Baxter) writes:
>There's an interesting paper in the most recent SIGPLAN on elimination of ...
>It shows how to compute "partial-redundancy" information efficiently, ...
preston@dawn.cs.rice.edu (Preston Briggs) writes:
>That's right, it's yet another paper on partial redundancy elimination!
I don't think that [DS93] is "yet another paper on partial redundancy
elimation". In contrast, a practically important feature of this
algorithm, which is inherited from its underlying algorithm for lazy
code motion of [KRSpldi92,KRStoplas93], is that PRE is performed by
purely unidirectional analyses. In fact, except for the algorithm of
[DS88] (which, however, requires more analyses than the one of
[KRSpldi92]), all previous PRE-algorithms
[Chowphd,Dh83,Dh88,Dh91,JD82a,JD82b,MR79,So89] require a bidirectional
data flow analysis ([DRZpldi92] is conceptually based on an underlying
bidirectional analysis). Bidirectional analyses, however, are
conceptually and computationally more complex than unidirectional
ones: In contrast to the unidirectional case, where reducible programs
can be dealt with in O(n*log(n)) time, where n characterizes the size
of the argument program (e.g. number of statements), the best known
estimation for bidirectional analyses is O(n^2) (detailed references
can be found in [KRSpldi92]). Thus, the decomposition of the
originally bidirectional data flow in PRE into (straightforward)
unidirectional components drastically improves on the complexity of the
previous algorithms for PRE. Moreover, it is much easier to see of
what is going on in PRE (E.g., after the decomposition it was
straightforward to extend the algorithm of [KRSpldi92] to uniformly
capture PRE and strength reduction [KRSjpl93]):
1) Placing computations *as early as possible* in a program while maintaining
safety results in computionally optimal programs
---> busy code motion (safe-earliest transformation in [KRSpldi92])
2) Placing computations *as late as possible* while maintaining computational
optimality results in computationally and lifetime optimal programs
---> lazy code motion
Intuitively, *lifetime optimal* means that no computation is
unnecessarily moved, i.e. without runtime gain. This is discussed in
more detail below.
Baxter:
> ... Is there a reason that such techniques aren't used to
>eliminate redundant *effects*? (i.e., "Common Procedural Elimination"?)
Preston:
> ... The only paper that talks about this is
> title="Code Motion of Control Structures in High-Level Languages",
> author="Ron Cytron and Andy Lowry and F. Kenneth Zadeck",
> booktitle=popl13, (Second reference this week!)
In
Knoop, J., and Steffen, B. Efficient and optimal bit-vector data flow
analyses: A uniform interprocedural framework. Tech. Rep. 9309, Dept. of
Comp. Science, University of Kiel, Germany, 1993, 22 pages.
the interprocedural extension of the busy code motion (safe-earliest)
transformation of [KRSpldi92] is presented, which works for programs with
mutually recursive procedures, global and local variables, and formal
(value) parameters. This algorithm is unique to eliminate
interprocedural partial redundancies computationally optimal. Moreover,
it is straightforward to generalize this algorithm to its `lazy'
variant and to capture also formal reference (and even formal
procedure) parameters.
>Back to PRE for a brief parting shot. The recent papers that try to
>minimize register pressure by placing common subexpressions as late in the
>code as possible don't work ...
> r10 = foo()
> r11 = bar()
> ...
PRE for a term t proceeds by inserting at some program points
initializations of the form h := t, where h is a new temporary associated
with t, and by replacing some of the original occurrences of t by h. As
the example given by Preston Briggs illustrates this does not uniquely
determine the resulting computationally optimal program. A major
achievement of the algorithm of [KRSpldi92], therefore, is to identify the
two extreme placing strategies resulting in computationally optimal
programs. First, placing computations as early as possible (busy code
motion (bcm)); second, placing computations as late as possible (lazy code
motion (lcm)): Every computationally optimal placement inserts
computations somewhere between the earliest possible and the latest
possible computation points. Thus, the earliest and the latest computation
points exactly characterize the program regions, where to look for
computation points of computationally optimal placements. Due to their
build-in heuristics for avoiding unnecessary code motion previous
PRE-algorithms deliver computationally optimal placements, which
(inpredictably) insert somewhere between the bcm-placement and the
lcm-placement. The merit of the lcm-placement here is that it minimizes
the lifetime of the temporary h introduced for t (In particular,
computations are only moved, if they are moved with runtime gain).
Clearly, this "local" lifetime optimality does not automatically result in
"globally lifetime optimal" programs, which would require to
simultaneously consider the lifetimes of *all* variables in a program, and
to respect the interdependencies among them. Like Preston Briggs, I am
also not aware of anyone having attacked this "global" problem. In fact,
all reported approaches aiming to avoid register pressure treat the
lifetimes of registers separately without considering the complex
interdependencies. The lcm-transformation was the first one to solve the
"local" problem optimally.
Finally, why do you expect that the global lifetime optimality problem
is NP-hard (unless some kind of register subsumption technique is taken
into account)? Here, I would be interested in a more detailed
comment.
>I've perhaps been harsh on PRE. However, you should understand that we
>use it in our optimizer and I don't really see a way to replace it with
>another technique. I had hopes for the method described in
> title="Global Value Numbers and Redundant Computations",
> author="Barry K. Rosen and Mark N. Wegman and F. Kenneth Zadeck",
> booktitle=popl15,
>but I've never been able to build it. I believe it's fatally flawed in
>its inability to preserve ordering of array stores and loads.
It is worth noting that the algorithm of Rosen, Wegman and Zadeck is a
step towards an algorithm for "global common subexpression elimination"
extending the value numbering technique of Cocke and Schwartz [CS70]
(rather than extending the PRE-technique of Morel and Renvoise [MR] !).
In fact, a semantically based algorithm for PRE as the one of [RWZpopl88]
is inherently more powerful (though computationally more complex) than a
syntactically based PRE-algorithm. Concerning semantically based PRE you
may be interested in the two algorithms of [SKResop90] and [SKRtapsoft91],
which besides the one of [RWZpopl88] are the only algorithms for global
common subexpression elimination I am aware of. The algorithm of
[SKRtapsoft91] achieves the optimality result, which the algorithm of
[RWZpopl88] obtains for acyclic flow graphs, for programs with arbitrary
(even irreducible) flow structure. The algorithm of [SKRtapsoft91]
extends the algorithm of [SKResop90] to uniformly cover global common
subexpression elimination and strength reduction.
-- Jens Knoop
Department of Computer Science, University of Kiel, Preusserstr. 1-9,
W-2300 Kiel 1, Federal Republic of Germany.
Email: jk@informatik.uni-kiel.dbp.de
===============
@STRING{cacm="Communications of the ACM"}
@STRING{complang="Journal of Computer Languages"}
@STRING{ijcm="International Journal of Computer Mathematics"}
@STRING{jpl="Journal of Programming Languages"}
@STRING{lncs="Lecture Notes in Computer Science"}
@STRING{sigplannot="{ACM SIGPLAN} Notices"}
@STRING{toplas="ACM Transactions on Programming Languages and Systems"}
@STRING{esop="European Symposium on Programming (ESOP)"}
@STRING{pldi="{ACM SIGPLAN} Conference on Programming Language Design and
Implementation"}
@STRING{popl="{ACM} Symposium on the Principles of Programming Languages"}
@STRING{tapsoft="International Joint Conference on Theory and
Practice of Software Development (TAPSOFT)"}
@phdthesis{Chowphd,
author = {Chow, F.},
title = {A portable machine independent optimizer -- {D}esign and
measurements},
school = {Stanford University, Dept.\ of Electrical Engineering},
address = {Stanford, CA},
note = {{P}ublished as {T}ech.\ {R}ep.\ 83-254, {C}omputer {S}ystems
{L}ab., {S}tanford {U}niversity},
year = {1983}
}
@booklet{CS70,
author = {Cocke, J. and Schwartz, J. T.},
title = {Programming Languages and Their Compilers},
howpublished = {Courant Institute of Mathematical Sciences, NY},
year = {1970}
}
@article{Dh83,
author = {Dhamdhere, D. M.},
title = {Characterization of program loops in code optimization},
journal = complang,
volume = {8},
number = {2},
pages = {69 - 76},
year = {1983}
}
@article{Dh88,
author = {Dhamdhere, D. M.},
journal = sigplannot,
number = {10},
pages = {172 - 180},
title = {A fast Algorithm for Code Movement Optimization},
volume = {23},
year = {1988}
}
@article{Dh89,
author = {Dhamdhere, D. M.},
journal = ijcm,
pages = {1 - 14},
title = {A new Algorithm for Composite Hoisting and Strength
Reduction Optimisation},
volume = {27},
year = {1989}
}
@article{Dhcorr,
author = {Dhamdhere, D. M.},
journal = ijcm,
title = {Corrigendum: {A} new Algorithm for Composite Hoisting and Strength
Reduction Optimisation}
}
@article{Dh91,
author = {Dhamdhere, D. M.},
journal = toplas,
number = {2},
pages = {291 - 294},
title = {Practical Adaptation of the Global Optimization
Algorithm of {M}orel and {R}envoise},
volume = {13},
year = {1991},
note = {Technical Correspondence}
}
@inproceedings{DRZpldi92,
author = {Dhamdhere, D. M. and Rosen, B. K. and Zadeck, F. K.},
address = {San Francisco, CA},
booktitle = "Proc.~" # pldi# "'92",
pages = {212 - 223},
volume = {27},
number = {7},
series = sigplannot,
title = {How to Analyze Large Programs Efficiently and Informatively},
month = jun,
year = {1992}
}
@article{DS88,
author = {Drechsler, K.-H. and Stadel, M. P.},
journal = toplas,
number = {4},
pages = {635 - 640},
title = {A solution to a Problem with {M}orel and {R}envoise's
``{G}lobal Optimization by Suppression of Partial
Redundancies''},
volume = {10},
note = {Technical Correspondence},
year = {1988}
}
@article{DS93,
author = {Drechsler, K.-H. and Stadel, M. P.},
journal = sigplannot,
volume = {28},
number = {5},
pages = {29 - 38},
title = {A variation of Knoop, R\"uthing and Steffen's Lazy Code Motion},
year = {1993}
}
@article{JD82a,
author = {Joshi, S. M. and Dhamdhere, D. M.},
journal = ijcm,
pages = {21 - 41},
title = {A Composite Hoisting-Strength Reduction Transformation for
Global Program Optimization -- Part {I}},
volume = {11},
year = {1982}
}
@article{JD82b,
author = {Joshi, S. M. and Dhamdhere, D. M.},
journal = ijcm,
pages = {111 - 126},
title = {A Composite Hoisting-Strength Reduction Transformation
for Global Program Optimization -- Part {II}},
volume = {11},
year = {1982}
}
@inproceedings{KRSpldi92,
author = {Knoop, J. and R\"uthing, O. and Steffen, B.},
address = {San Francisco, CA},
booktitle = "Proc.~" # pldi #"'92",
pages = {224 - 234},
series = sigplannot,
volume = {27},
number = {7},
title = {Lazy code motion},
month = jun,
year = {1992}
}
@unpublished{KRStoplas93,
author = {Knoop, J. and R\"uthing, O. and Steffen, B.},
note = "Submitted to ~" # toplas #". Extended and revised version of
\cite{SKRpldi92}",
title = {Optimal Code Motion: {Theory} and Practice},
year = {1993}
}
@article{KRSjpl93,
author = {Knoop, J. and R\"uthing, O. and Steffen, B.},
journal = jpl,
title = {Lazy Strength Reduction},
year = {1993},
volume = {1},
number = {1},
pages = {71-91},
publisher = {Chapman \& Hall},
address = {London, UK},
}
@article{MR79,
author = {Morel, E. and Renvoise, C.},
journal = cacm,
number = {2},
pages = {96 - 103},
title = {Global Optimization by Suppression of Partial Redundancies},
volume = {22},
year = {1979}
@inproceedings{RWZpopl88,
author = {Rosen, B. K. and Wegman, M. N. and Zadeck, F. K.},
address = {San Diego, CA},
booktitle = "Proc. 15$^{th}$~" # popl,
pages = {12 - 27},
title = {Global value numbers and redundant computations},
year = {1988}
}
@inproceedings{SKResop90,
author = {Steffen, B. and Knoop, J. and R\"uthing, O.},
address = {Copenhagen, Denmark},
booktitle = "Proc. 3$^{rd}$~" # esop,
pages = {389 - 405},
publisher = {Springer-Verlag},
series = LNCS # "~ 432",
title = {The Value Flow Graph: {A} Program Representation
for Optimal Program Transformations},
year = {1990}
}
@inproceedings{SKRtapsoft91,
author = {Steffen, B. and Knoop, J. and R\"uthing, O.},
address = {Brighton, UK},
booktitle = "Proc. 4$^{th}$~" # tapsoft,
pages = {394 - 415},
publisher = {Springer-Verlag},
series = LNCS # "~494",
title = { Efficient Code Motion and an Adaption to Strength Reduction},
year = {1991}
}
@article{So89,
author = {Sorkin, A.},
title = {Some Comments on a Solution to a Problem with
{M}orel and {R}envoise's ``{G}lobal
Optimization by Suppression of Partial Redundancies''},
journal = toplas,
volume = {11},
number = {4},
pages = {666 - 668},
year = {1989},
note = {Technical Correspondence}
}
--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.