WANTED: Rating of common optimization routines. email@example.com (1996-01-12)
From: email@example.com (Cliff Click)
Date: 15 Jan 1996 21:00:25 -0500
firstname.lastname@example.org (Mohd Hanafiah Abdullah) writes:
> I wonder if there's any published report on the commonly employed
> optimization routines/procedures targeted for scalar-type RISC based
> applications such as:
> o register allocation
> o copy propagation
> o common subexpression optimization
> o peephole optimization
> o strength reduction
> o code motion of loop invariants
> o ...
> that is, regarding each one's contribution to the percentage of
> improvement on the target execution speed when turned on. It would be
> interesting to know which ones provide the highest improvement so
> that emphasis can be put on them. I tend to think in general that
> register allocation rates the highest, followed by peephole
> optimization and strength reduction.
Wow! If only things were that simple!
*EVERYTHING* interacts with everything else.
That is, the CPU/memory-subsystem/I-O system/OS/application ALL
interact with the above optimizations, which (of course) all
interact with each other. These interactions make compilers fun...
Fer instance: you are optimizing the inner kernel of a matrix multiply
on a modern RISC. Your compiler (at first) does NO optimizations,
just lives with naive code produced by the parser. You decide to do
strength reduction - some addressing multiplies become adds or shifts.
Did you dead-code-eliminate the adds? Your loop speeds up some, but
your loop is still full of loop-invariant and dead code. Suppose you
see a 2% gain.
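A tiny C sketch of the kind of strength reduction being described here (the function names and 2x2 test data are mine, purely for illustration): the per-iteration addressing multiplies `i*n+k` and `k*n+j` become running adds on induction variables.

```c
/* Naive addressing: two multiplies per iteration to form the addresses. */
int dot_naive(const int *a, const int *b, int n, int i, int j) {
    int sum = 0;
    for (int k = 0; k < n; k++)
        sum += a[i * n + k] * b[k * n + j];
    return sum;
}

/* Strength-reduced: the multiplies become adds on induction variables. */
int dot_sr(const int *a, const int *b, int n, int i, int j) {
    int sum = 0;
    int ai = i * n;   /* tracks i*n + k */
    int bj = j;       /* tracks k*n + j */
    for (int k = 0; k < n; k++) {
        sum += a[ai] * b[bj];
        ai += 1;      /* add replaces the recomputed multiply */
        bj += n;
    }
    return sum;
}
```

Both versions compute the same dot product; only the addressing arithmetic inside the loop changes.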
Instead you try hoisting loop invariants (no DCE yet). Maybe you get
10%. A lot of multiplies get hoisted. Now you strength reduce again,
but the multiplies got hoisted - little gain from SR.
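A minimal sketch of the hoisting step, with hypothetical names of my own choosing: the multiply `i * n` does not depend on the loop variable, so it moves out of the loop and is computed once.

```c
/* i*n is loop-invariant but recomputed on every iteration. */
int row_sum_naive(const int *a, int n, int i) {
    int sum = 0;
    for (int k = 0; k < n; k++)
        sum += a[i * n + k];
    return sum;
}

/* The invariant multiply is hoisted above the loop. */
int row_sum_hoisted(const int *a, int n, int i) {
    int base = i * n;   /* computed once, not n times */
    int sum = 0;
    for (int k = 0; k < n; k++)
        sum += a[base + k];
    return sum;
}
```

After hoisting, the loop body contains no multiply at all, which is why a later strength-reduction pass finds so little left to do.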
Ok, start with CSE. Lots of common addressing code, so some 10% gain.
Combine it with invariant hoisting - what happens? A bunch of CSE's
get hoisted, so the total gain isn't 20% - except that the CSE's that
stay in the inner loop represent a larger fraction of dynamically
executed instructions. Whammo - another big gain.
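For concreteness, here is a small illustration (function names are mine, not from the post) of the "common addressing code" that CSE picks up: the same address expression appears twice, and CSE computes it once.

```c
/* The address a[i*n + j] is computed twice: once for the load,
   once for the store. */
void scale_naive(int *a, int n, int i, int j, int s) {
    a[i * n + j] = a[i * n + j] * s;
}

/* CSE: the common address subexpression is computed once. */
void scale_cse(int *a, int n, int i, int j, int s) {
    int *p = &a[i * n + j];   /* shared address, computed a single time */
    *p = *p * s;
}
```

In a real inner loop these shared address computations are everywhere, which is where the quoted 10% comes from.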
You look at the inner loop, and see a darned integer multiply - after
hoisting and CSE the inner loop is getting kinda sparse. NOW that
multiply represents a large fraction of all executed instructions.
You try SR again and *bing* suddenly get another 20%.
In short, optimizations interact. Applied standalone you'll get one
number (perhaps a disappointing one). Applied in combinations, the
same optimization can be crucial in getting a 2x speedup.
Cliff Click Compiler Researcher & Designer
RISC Software, Motorola PowerPC Compilers
email@example.com (512) 891-7240