Re: Results from Cache simulator

George Neuner <>
Sat, 05 Mar 2011 06:25:21 -0500

          From comp.compilers

Related articles
Results from Cache simulator (Wei Li) (2011-03-04)
Re: Results from Cache simulator (George Neuner) (2011-03-05)
Re: Results from Cache simulator (glen herrmannsfeldt) (2011-03-06)

From: George Neuner <>
Newsgroups: comp.compilers
Date: Sat, 05 Mar 2011 06:25:21 -0500
Organization: A noiseless patient Spider
References: 11-03-007
Keywords: architecture, debug
Posted-Date: 05 Mar 2011 10:35:25 EST

On Fri, 4 Mar 2011 09:04:26 -0600, Wei Li <> wrote:

>I am using a cache simulator for my research on loop tile selection
>problem. My understanding is that for a given input, I should have the
>same result.

Maybe ... maybe not. Simulation and profiling can be tricky.

What software are you using? Is it a library linked to the program, or
do you start a simulator shell first and then run your program within
it?

Some things to think about:
- Does the test program use dynamic allocation?
- Does the test program use GC?
- Is the test program multithreaded?
- Does the software simulate an OS, VMM, or multiprocessor?

>However, I am getting a slightly different result. For example, in one
>run I got 153265565 read misses and in another run with the same data
>I got 153266725. What is the reason for this difference?
>[ A) Maybe it uses a real clock and is subject to minor timing
> differences
> B) Maybe the simulator is buggy.
> - John]

If you are simulating a real CPU/memory system, then, as John said,
the simulator is likely buggy. But if you are running it on a
multiprocessor, you might try restricting the simulator to a single
CPU.
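
As a rough sketch of pinning a run to one CPU: the snippet below
assumes a Linux host (os.sched_setaffinity is not available
everywhere), and run_pinned and the command you pass it are
hypothetical names, not part of any particular simulator.

```python
import os
import subprocess

def run_pinned(cmd):
    """Run cmd with this process (and so its child) limited to one CPU.

    Assumes Linux: os.sched_getaffinity/sched_setaffinity are not
    available on every platform.
    """
    allowed = os.sched_getaffinity(0)       # CPUs we may currently use
    cpu = min(allowed)                      # pick one of them
    os.sched_setaffinity(0, {cpu})          # the child inherits this mask
    try:
        return subprocess.run(cmd, capture_output=True, text=True, check=True)
    finally:
        os.sched_setaffinity(0, allowed)    # restore the original mask
```

If the counts still vary with the simulator pinned like this,
scheduling across processors is probably not the cause.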

Apart from that, you will definitely see variation from any test
program which multitasks or uses dynamic allocation (with or without
GC).

Memory allocators interact with VMM in ways that profoundly affect
caching. Even if repeated runs of a program reproduce identical
virtual addresses, under VMM the physical memory pages underlying the
virtual addresses may be different from run to run ... they may even
be different between accesses within a run if the computer is heavily
loaded with other processes. Usually (though not always) it is the
physical memory addresses which determine cache placement.
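
To make the placement point concrete, here is a minimal sketch. The
cache geometry (64-byte lines, 4096 sets), the 4 KiB page size, and
the frame numbers are all made up for illustration; they do not
describe any particular CPU or simulator.

```python
LINE_SIZE = 64      # bytes per cache line (assumed)
NUM_SETS = 4096     # number of cache sets (assumed)
PAGE_SIZE = 4096    # bytes per page (assumed)

def set_index(addr):
    """Cache set selected by an address in a simple modulo-indexed cache."""
    return (addr // LINE_SIZE) % NUM_SETS

def translate(vaddr, page_table):
    """Map a virtual address to a physical one via a {vpn: frame} table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    return page_table[vpn] * PAGE_SIZE + offset

vaddr = 0x7F0000001040              # same virtual address in both runs
vpn = vaddr // PAGE_SIZE

run_a = {vpn: 0x12345}              # hypothetical frame chosen in run A
run_b = {vpn: 0x2468A}              # hypothetical frame chosen in run B

set_a = set_index(translate(vaddr, run_a))
set_b = set_index(translate(vaddr, run_b))
print(set_a, set_b)                 # the two runs land in different sets
```

The index here covers more bits than the page offset, so the frame
number helps choose the set. That is how an identical virtual address
can compete with different lines from one run to the next.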

Apart from VMM, the memory allocator itself may produce varying
behavior from run to run if it is not reset. Usually, restarting a
process does this, but when a program is run under a simulator you may
need to restart the simulator as well because the allocator in the
program gets its working memory from the simulator.
(It's the simulator's allocator that may need resetting.)

If the program additionally uses copying GC, then data may be
relocated in different ways from run to run.

If results aren't exactly reproducible, you need to run your tests a
number of times and average the results. For your problem - tile
selection - you'll be looking for obvious large improvements that will
stand out above any noise in the measurements.
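
One rough way to do that comparison is sketched below. The baseline
counts are the two read-miss numbers quoted above; the candidate
counts, the function names, and the threshold k=3 are invented for
illustration. With only a couple of runs the spread estimate is crude,
so treat this as a sanity check, not a statistical test.

```python
from statistics import mean, stdev

def summarize(counts):
    """Return (mean, sample standard deviation) of repeated measurements."""
    return mean(counts), stdev(counts)

def clearly_different(a, b, k=3.0):
    """True if the means differ by more than k times the larger spread."""
    mean_a, sd_a = summarize(a)
    mean_b, sd_b = summarize(b)
    return abs(mean_a - mean_b) > k * max(sd_a, sd_b)

baseline = [153265565, 153266725]     # the two counts from the question
candidate = [121000400, 121001900]    # hypothetical counts for another tiling

print(summarize(baseline))
print(clearly_different(baseline, candidate))
```

The two quoted runs differ by 1160 misses out of roughly 153 million,
so any tiling worth choosing should beat that margin many times over.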

