Profiling on MIPS R4K and R10K hyang@nemawarkar.capsl.udel.edu (Hongbo Yang) (2000-04-25) |
From: | Hongbo Yang <hyang@nemawarkar.capsl.udel.edu> |
Newsgroups: | comp.compilers |
Date: | 25 Apr 2000 02:27:15 -0400 |
Organization: | Compilers Central |
Keywords: | architecture, question, performance |
I am currently working on a new local register allocation
algorithm. To measure how many memory accesses my algorithm
eliminates, I implemented it on MIPS and tested it with the SPEC95
benchmarks. I have two ways to measure the number of memory
accesses at run time:
1. Use the hardware counters on the R10K. No extra effort is needed: just
run the executable under "perfex", which can count the number of load
instructions and store instructions executed.
2. Use the SpeedShop software, namely "pixie". It instruments the
executable and uses a technique called "sampling" to gather run-time
information. It can be used on both the R4K and the R10K. "prof" then
displays the result.
On the R10K I used both of these methods, and their results are
consistent.
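As a rough illustration of the comparison described above, here is a minimal Python sketch. It assumes the load and store counts from each tool have already been read off by hand; the function names and every number in it are hypothetical and are not measurements from these experiments.

```python
def memory_accesses(loads: int, stores: int) -> int:
    """Total dynamic memory accesses: executed loads plus executed stores."""
    return loads + stores


def relative_difference(a: int, b: int) -> float:
    """Relative gap between two counts; 0.0 means the tools agree exactly."""
    if a == 0 and b == 0:
        return 0.0
    return abs(a - b) / max(a, b)


def reduction(baseline: int, optimized: int) -> float:
    """Fraction of the baseline memory accesses removed by the new allocator."""
    if baseline <= 0:
        raise ValueError("baseline count must be positive")
    return (baseline - optimized) / baseline


if __name__ == "__main__":
    # Hypothetical counts for one benchmark, as reported by each method.
    counter_total = memory_accesses(loads=1_000_000, stores=400_000)
    instrumented_total = memory_accesses(loads=1_002_000, stores=401_000)
    print(f"disagreement between methods: "
          f"{relative_difference(counter_total, instrumented_total):.4%}")

    # Hypothetical totals before and after applying the new allocator.
    print(f"memory accesses removed: {reduction(1_400_000, 1_260_000):.1%}")
```

A small relative difference between the two methods is what "consistent" means here; how small is small enough is a judgment call that depends on the size of the reduction being claimed.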
However, on the R4K I found that the numbers reported by SpeedShop are
very imprecise, and the R4K has no hardware counter mechanism.
Could anyone explain how "sampling" works and tell me the possible
reason? Is it related to the frequency or ...? Thanks a lot.
If anyone is interested in this research, you can get the tech memo at
http://www.capsl.udel.edu/DOCUMENTS/, Tech Memo #36.
--------------------------------
Hongbo Yang ( hyang@ee.udel.edu )
Electrical and Computer Engineering
Univ of Delaware