Re: performance measurement and caches

Mary Fernandez <mff@research.att.com>
16 Feb 1996 23:43:40 -0500

From comp.compilers

From:	Mary Fernandez <mff@research.att.com>
Newsgroups:	comp.compilers,comp.arch
Date:	16 Feb 1996 23:43:40 -0500
Organization:	AT&T Bell Labs, Murray Hill, NJ
References:	96-02-165
Keywords:	performance, architecture

Hans Boehm (boehm@parc.xerox.com) wrote:
> I'm trying to obtain performance measurements of some small to medium
> sized applications for a conference paper. Being somewhat experienced
> at such things, I cautiously start with a smallish (2000 line or so)
> test. I repeatedly time two executables, getting approximately
> repeatable times for each:
>
> tweety% time ./slowtest
> SUCCEEDED
> real 10.5
> tweety% time ./fasttest
> SUCCEEDED
> real 5.5
>
> So what's wrong? Slowtest and fasttest were bitwise identical executables!

I observed similar effects of procedure placement on cache performance
on both MIPS and Intel 486. For a small set of large Modula-3
programs, procedure placement alone affected run time by up to 15%; on
the Intel, by up to 10%. These are smaller effects than those you
observed, but still confounding if you're trying to measure performance
differences in that neighborhood. Just injecting NOPs at procedure
boundaries perturbs alignment enough to produce measurable (>10%)
differences in elapsed time.
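
To make the placement effect concrete, here is a minimal sketch of how
two procedures can collide in a direct-mapped instruction cache and how
padding moves them apart. Everything in it is an assumption for
illustration: the cache geometry (8 KB, 32-byte lines), the procedure
addresses and sizes, and the helper names are hypothetical, not
measurements from the programs described above.

```python
CACHE_BYTES = 8 * 1024              # assumed cache size (hypothetical)
LINE_BYTES = 32                     # assumed line size (hypothetical)
NUM_LINES = CACHE_BYTES // LINE_BYTES


def cache_lines(start, size):
    """Cache line slots occupied by code in [start, start + size)."""
    first = start // LINE_BYTES
    last = (start + size - 1) // LINE_BYTES
    return {block % NUM_LINES for block in range(first, last + 1)}


def conflicts(proc_a, proc_b):
    """Number of cache line slots that two procedures compete for."""
    return len(cache_lines(*proc_a) & cache_lines(*proc_b))


# Hypothetical (start address, size in bytes) for two procedures that
# call each other in a hot loop.
caller = (0x0000, 256)
callee = (0x2000, 256)              # exactly one cache size away from caller

# Same callee after 256 bytes of padding (e.g. NOPs) are placed before it.
callee_padded = (0x2000 + 256, 256)

print("conflicting lines, original layout:", conflicts(caller, callee))
print("conflicting lines, padded layout:  ", conflicts(caller, callee_padded))
```

In the original layout every line of the callee maps onto a line of the
caller, so each call evicts the other procedure; after padding the two
no longer overlap. The code is identical in both cases; only the
addresses differ.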

None of this is new. Urs Holzle observed similar cache effects using
Self on a SPARCstation, and Andrew Appel using ML on a MIPS (references
below).

> Since I'm trying to measure performance differences much less than a
> factor of 2, where does this leave me? Is there any reason to believe
> any of the performance measurements that have appeared in research
> papers, or benchmarks published in magazines? How does someone trying
> to performance tune a product determine whether they're making
> progress?

First-hand experience with cache effects has made me even more
skeptical of elapsed-time results than I was before. It seems like it
would be useful to have a tool that takes one or more traces of a
program and the cache characteristics of the target machine and
perturbs the executable to elicit the best (or worst) cache
performance. I'm not even sure optimal cache performance is possible,
but maybe "better" is. The MIPS linker can do this already, but only for
that machine. Such a tool would reduce instruction-cache effects on
elapsed time and give you more confidence in real performance
differences due to optimization. I know memory hierarchy/cache
simulators exist, but I have been unsuccessful using them on large M3
programs. Personally, I want to measure and to read about real
elapsed times, not simulated results.
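
The tool described above might look roughly like the following toy
sketch. It takes a call trace and the parameters of a direct-mapped
instruction cache, tries every procedure ordering, and reports the
layouts with the fewest and the most misses. The cache geometry, the
procedure names and sizes, the trace, and the simplification that each
call executes every line of the procedure are all assumptions made for
illustration. Exhaustive search is only feasible for a handful of
procedures; a real tool would need a heuristic.

```python
from itertools import permutations

LINE_BYTES = 32     # assumed line size (hypothetical)
NUM_LINES = 64      # assumed 2 KB direct-mapped cache (hypothetical)


def layout(order, sizes):
    """Pack procedures contiguously in the given order; return start addresses."""
    addr, starts = 0, {}
    for name in order:
        starts[name] = addr
        addr += sizes[name]
    return starts


def misses(trace, starts, sizes):
    """Count misses, assuming each call touches every line of the procedure."""
    cache = [None] * NUM_LINES
    count = 0
    for name in trace:
        first = starts[name] // LINE_BYTES
        last = (starts[name] + sizes[name] - 1) // LINE_BYTES
        for block in range(first, last + 1):
            slot = block % NUM_LINES
            if cache[slot] != block:    # slot holds some other block: a miss
                cache[slot] = block
                count += 1
    return count


def best_and_worst(trace, sizes):
    """Try every ordering; return (misses, order) for the best and worst layouts."""
    scored = [(misses(trace, layout(order, sizes), sizes), order)
              for order in permutations(sorted(sizes))]
    return min(scored), max(scored)


# Hypothetical program: one cold procedure, two hot ones that alternate.
sizes = {"cold": 1024, "hot1": 1024, "hot2": 1024}
trace = ["cold"] + ["hot1", "hot2"] * 10

best, worst = best_and_worst(trace, sizes)
print("best layout: ", best)
print("worst layout:", worst)
```

Reporting both extremes for a benchmark gives a rough bound on how much
of a measured difference could be due to placement alone.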

@PHDTHESIS{holzle:94,
month = "August",
author = "Urs Holzle",
title = "Adaptive optimization for Self: Reconciling High Performance
with Exploratory Programming",
school = "Stanford University",
year = "1994"
}

@book{appel92,
author="Andrew W. Appel",
title="Compiling with Continuations",
publisher=CAMBRIDGE, address="Cambridge",
year=1992,
pages=194,
note="Info on MIPS cache behavior"
}

----------------------------------------------
Mary Fernandez AT&T Research
mff@research.att.com 600 Mountain Ave, 2C-147A
908-582-6567 Murray Hill, NJ 07974
--
