|Basic-Block Profiling Isn't Always Accurate email@example.com (1993-03-08)|
|Re: Basic-Block Profiling Isn't Always Accurate firstname.lastname@example.org (1993-03-09)|
|Re: Basic-Block Profiling Isn't Always Accurate email@example.com (1993-03-11)|
|Re: Basic-Block Profiling Isn't Always Accurate firstname.lastname@example.org (1993-03-12)|
|Re: Basic-Block Profiling Isn't Always Accurate email@example.com (1993-03-14)|
|From:||firstname.lastname@example.org (Sadun Anik)|
|Organization:||Center for Reliable and High-Performance Computing|
|Date:||Tue, 9 Mar 1993 20:46:11 GMT|
email@example.com (James Larus) writes:
>In porting QPT to the SPARC, we found a limitation on the accuracy of
>basic-block profiling (which is the type performed by most profilers).
>The problem is that blocks that end with an annulled conditional branch do
>not always execute their last instruction. Assuming that the instruction
>executes (as it would with a non-annulled conditional) leads to profiles
>that are up to 5-10% high (on SPEC92 integer benchmarks). The only
>general solution is to profile edges, not blocks, in the control-flow
I always assumed that profiling edges in addition to basic blocks was
standard procedure in optimizing compilers. For example, "Trace Selection
for Compiling Large C Application Programs to Microcode" by P. P. Chang
and W. W. Hwu (MICRO-21, 1988) discusses the benefit of using edge
profile information in trace selection. Edge profiling also makes branch
prediction simple. By the way, this paper doesn't claim any credit for edge
profiling; it simply uses it.
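As a rough illustration of why edge counts are the more general profile, the sketch below derives block counts and branch probabilities from an edge profile. The block names, the counts, and the helper functions are all made up for this example; they are not taken from QPT or from the Chang and Hwu paper.

```python
from collections import defaultdict

# Hypothetical edge profile: (source block, destination block) -> count.
# Block names and counts are invented for illustration.
EDGE_COUNTS = {
    ("entry", "B1"): 100,
    ("B1", "B2"): 70,    # taken edge of the branch ending B1
    ("B1", "B3"): 30,    # fall-through edge
    ("B2", "exit"): 70,
    ("B3", "exit"): 30,
}


def block_counts(edges):
    """Count each block's executions as the sum of its incoming edge counts.

    A block with no incoming edges (here "entry") does not appear in the result.
    """
    counts = defaultdict(int)
    for (_src, dst), n in edges.items():
        counts[dst] += n
    return dict(counts)


def branch_probabilities(edges, src):
    """Return {successor: probability} for the edges leaving block src."""
    out = {dst: n for (s, dst), n in edges.items() if s == src}
    total = sum(out.values())
    return {dst: n / total for dst, n in out.items()} if total else {}


def predicted_successor(edges, src):
    """Static prediction: pick the most frequently followed outgoing edge."""
    probs = branch_probabilities(edges, src)
    return max(probs, key=probs.get) if probs else None


if __name__ == "__main__":
    print(block_counts(EDGE_COUNTS))
    print(branch_probabilities(EDGE_COUNTS, "B1"))
    print(predicted_successor(EDGE_COUNTS, "B1"))
```

Block counts can be recovered from edge counts this way, but the reverse does not hold in general: a block profile alone cannot say how B1's 100 executions split between its two successors.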
>The effect of this problem on an instruction profile depends on the
>frequency of annulled branches and the size of basic blocks. In
>non-numeric programs, with small blocks, the effect can be surprisingly
I am not clear on what this effect is. For performance evaluation, it
hardly matters if an instruction is squashed or not. It will be issued no
matter what and take an instruction slot in the pipeline. This bit of
knowledge may be important if there is dynamic scheduling of instructions
(such as out-of-order execution). But since the profile information is
approximate, edge profile information won't improve the accuracy much.
When the branch prediction is incorrect during execution, squashing the
instruction in the delay slot doesn't affect performance directly. The
performance degradation is due to the bubble in the pipeline. This is the
reason why speculative instruction issue/execution is gaining popularity.
It doesn't matter if a particular instruction is executed or not. What
matters is that you don't want to waste resources on a squashed
instruction that could have been used better otherwise.
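
The distinction between instructions executed and issue slots consumed can be made concrete with a small sketch. It assumes a single-issue, in-order pipeline in which every issued instruction costs one slot, and a SPARC-style annulled conditional branch whose delay-slot instruction executes only when the branch is taken. The function names, block size, counts, and bubble penalty are all hypothetical.

```python
def executed_instructions(block_size, block_count, taken_count):
    """Instructions that actually execute in a block ending with an
    annulled conditional branch: the delay-slot instruction (counted
    in block_size) executes only on the taken path."""
    return (block_size - 1) * block_count + taken_count


def issue_slots(block_size, block_count):
    """Issue slots consumed: the delay-slot instruction is issued on
    every execution of the block, whether or not it is squashed."""
    return block_size * block_count


def estimated_cycles(slots, mispredictions, bubble_cycles):
    """Crude cycle estimate: one cycle per issue slot plus a fixed
    bubble penalty (an assumed parameter) per mispredicted branch."""
    return slots + mispredictions * bubble_cycles


if __name__ == "__main__":
    # A 4-instruction block executed 100 times, branch taken 70 times.
    print(executed_instructions(4, 100, 70))   # 370
    print(issue_slots(4, 100))                 # 400
    # Predicting "taken" mispredicts 30 times; assume a 2-cycle bubble.
    print(estimated_cycles(400, 30, 2))        # 460
```

Under these assumptions a block profile overstates the executed-instruction count (400 versus 370) yet matches the issue-slot count, and the cycle estimate is driven by the misprediction bubbles rather than by whether the delay-slot instruction was squashed.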
Sadun Anik, U of Illinois at Urbana-Champaign
Center for Reliable and High-performance Computing