SPARC Performance Analysis (Gordon Irlam)
21 Oct 91 09:44:38 GMT

          From comp.compilers

Newsgroups: alt.sys.sun,comp.arch,comp.compilers
From: (Gordon Irlam)
Followup-To: alt.sys.sun
Keywords: sparc, performance, experiment
Organization: Comp Sci, Uni of Adelaide, Australia
Date: 21 Oct 91 09:44:38 GMT


Spa is a set of tools used to analyze the performance of SPARC binaries on
the SPARCstation 1 and the SPARCstation 2. The tools can be used on any
sun4 architecture machine running a SunOS 4 operating system.

The tools include spy, a program that traces the execution of a command and
can generate address traces; spanner, a tool that converts an address trace
into an instruction count; splice, a tool that combines instruction count
files; and spout, a tool that displays an instruction count file.
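
As a rough illustration of how the count, combine, and display steps relate,
here is a toy model in Python. It is not the Spa tools' code, and it does not
use their file formats (which are not described here): the function names and
the idea of a trace as a list of integer addresses are assumptions made only
for this sketch.

```python
from collections import Counter

def count_instructions(address_trace):
    """Toy stand-in for the trace-to-count step: tally executions per address.

    `address_trace` is assumed to be an iterable of integer addresses.
    """
    return Counter(address_trace)

def combine_counts(*counts):
    """Toy stand-in for the combining step: sum several per-address tallies."""
    total = Counter()
    for c in counts:
        total.update(c)
    return total

def display(counts):
    """Toy stand-in for the display step: one 'address count' line per entry."""
    return [f"{addr:#010x} {n}" for addr, n in sorted(counts.items())]

if __name__ == "__main__":
    run_a = count_instructions([0x2000, 0x2004, 0x2000])
    run_b = count_instructions([0x2000, 0x2008])
    for line in display(combine_counts(run_a, run_b)):
        print(line)
```

Keeping the counts as a simple mapping is what makes combining several runs
a plain per-address sum in this sketch.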

No modification of the binary to be analyzed is required. Dynamic
libraries, job control, and fork/exec are all handled correctly. Some
signal handlers may cause problems.

The Spa package does not predict overall system performance. I/O latency
is not taken into consideration, nor is the effect of more than one
process being active at any one time. spy does not provide any
information about the time spent handling traps, interrupts, or system
calls, nor the effects of these on the cache.

The floating point queue is not currently simulated by the Spa package,
and all floating point instructions are assumed to complete in a single
cycle. This differs significantly from reality.

The main component of the Spa package is a SPARC simulator; a basic block
style profiling tool is not currently included. Consequently, programs run
roughly 600 times slower under this package than in normal execution.


The following tables provide an example of the figures obtainable using

OVERALL                  overall (%)   category (%)                     raw
                       cycles  inst.  cycles  count      cycles       count
instructions             76.9  100.0    76.9      -  1436530115  1195627054
annulled delay slots      1.5    2.3     1.5      -    27385209    27385209
load-use stalls           5.9    9.3     5.9      -   110611716   110611716
trap cycles               0.0    0.0     0.0      -        8592        2148
window handlers           0.0    0.0     0.0      -      240990        1719
cache cycles             15.7    1.7    15.7      -   292469972    20620389
total                   100.0      -   100.0      -  1867246594           -
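
The percentage columns in the OVERALL table can be rederived from the raw
cycle counts, and the same numbers give an overall cycles-per-instruction
figure. The short Python sketch below does that arithmetic; the variable and
function names are made up for illustration, and the CPI figure is a derived
value that does not appear in the table itself.

```python
# Raw cycle counts copied from the OVERALL table.
raw_cycles = {
    "instructions": 1436530115,
    "annulled delay slots": 27385209,
    "load-use stalls": 110611716,
    "trap cycles": 8592,
    "window handlers": 240990,
    "cache cycles": 292469972,
}

# Raw count from the "instructions" row: instructions actually executed.
instructions_executed = 1195627054

total_cycles = sum(raw_cycles.values())

# Overall cycles per instruction, including stalls and cache cycles.
cpi = total_cycles / instructions_executed

def overall_percent(category):
    """Share of all cycles spent in `category`, rounded as in the table."""
    return round(100.0 * raw_cycles[category] / total_cycles, 1)

if __name__ == "__main__":
    for name in raw_cycles:
        print(f"{name:22s}{overall_percent(name):6.1f}")
    print(f"total cycles: {total_cycles}")
    print(f"cycles per instruction: {cpi:.2f}")
```

The categories sum exactly to the total row, so the table is internally
consistent.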

INSTRUCTIONS             overall (%)   category (%)                     raw
                       cycles  inst.  cycles  count      cycles       count
memory access            24.1   18.2    31.4   18.2   450826364   217836374
alu                      33.0   51.6    43.0   51.6   616997806   616997806
floating point            0.0    0.0     0.0    0.0           0           0
control transfer         19.3   29.5    25.1   29.5   360711989   352798918
other instructions        0.4    0.7     0.6    0.7     7993956     7993956
total                    76.9  100.0   100.0  100.0  1436530115  1195627054


COND. BR.: CY7C601       overall (%)   category (%)                     raw
                       cycles  inst.  cycles  count      cycles       count
backward taken            4.7    7.3    25.6   25.6    86888405    86888405
backward untaken          0.1    0.2     0.7    0.7     2542082     2542082
forward taken             8.6   13.4    47.3   47.3   160751512   160751512
forward untaken           4.8    7.5    26.4   26.4    89518753    89518753
total                    18.2   28.4   100.0  100.0   339700752   339700752


CACHE CYCLES: SS2        overall (%)   category (%)                     raw
                       cycles  inst.  cycles  count      cycles       count
I-read miss               0.4    0.0     2.7    1.5     7796554      319176
D-read miss              12.6    0.8    80.5   46.9   235359634     9667295
D-write miss              2.5    0.8    16.1   46.7    47008901     9638714
write buffer stalls       0.1    0.1     0.8    4.8     2304883      995204
total                    15.7    1.7   100.0  100.0   292469972    20620389
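
Dividing the raw cycles by the raw count in each row of the cache table gives
the average number of cycles charged per event. The Python sketch below does
this division; reading the result as an average per-miss cost is an
interpretation of the table rather than something the tools report, and the
names used are illustrative.

```python
# (raw cycles, raw count) pairs copied from the CACHE CYCLES table.
cache_rows = {
    "I-read miss": (7796554, 319176),
    "D-read miss": (235359634, 9667295),
    "D-write miss": (47008901, 9638714),
    "write buffer stalls": (2304883, 995204),
}

def cycles_per_event(row):
    """Average cycles charged per event for one table row."""
    cycles, count = cache_rows[row]
    return cycles / count

if __name__ == "__main__":
    for name in cache_rows:
        print(f"{name:22s}{cycles_per_event(name):8.2f}")
```

On these figures, read misses cost several times more cycles per event than
write misses.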


The Spa package is a copyrighted work which comes with absolutely no
warranty. It may be redistributed and/or modified under the terms of the
GNU General Public License Version 2 as published by the Free Software
Foundation.

The Spa package is available from the following locations:

        U.S. FTP:
        U.S. UUCP: uunet!~/systems/sun/spa-1.0.tar.Z
        Australia FTP:

Compiling the tools using currently available compilers limits them to
using 32 bits for instruction counts and so on. The distribution also
includes a set of specially built binaries that use 64 bit integers.

Because of the potential risks associated with binary distribution, I
have calculated the MD5 digital signature for each file -- ftp pub/md5.doc
from for details of MD5. The MD5 master signature is given below:

05141bfdfff92a6df4f226888ca03105 signature.md5
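
One way to check a downloaded file against a published MD5 value is with
Python's standard hashlib module, as sketched below. The function names are
illustrative, and the layout of signature.md5 itself is not described here,
so this only compares a single file against a single hexadecimal digest.

```python
import hashlib

def md5_hex(path, chunk_size=65536):
    """Return the MD5 digest of the file at `path` as a hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large binaries need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches(path, expected_hex):
    """True if the file's MD5 digest equals the published value."""
    return md5_hex(path) == expected_hex.strip().lower()
```

For example, matches("signature.md5", "05141bfdfff92a6df4f226888ca03105")
would check the master signature file against the value given above.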

[This is not the most secure system for distributing binaries.
Unfortunately I am not able to employ a public key security system until
1993. (Most people in the U.S. will not be able to check public key
signatures until 1997.) If the fact that software patents are limiting
the security of computer networks concerns you, you might like to consider
joining the League for Programming Freedom -- mail]


I have been using this package to analyze the performance of the SPARC
architecture on the SPECint benchmarks. Anyone else who has performed
similar work, or is interested in performing other similar serious
analysis of the SPARC architecture should get in touch.

I am continuing to work on this package. Possible future directions
include the development of a pixie style instruction analyzer, the
development of an architecture independent simulator, a graphical front
end, and the direct feedback of simulation data to improve the code
generated by compilers. Anybody who is interested in this or similar work
should also get in touch.

While I am sympathetic to the needs of FP heads, I don't share their
interests. If someone wants to provide me with a description of the
behavior of the floating point queue on the SPARC, I will attempt to
simulate it, but I am not attempting to obtain such a description myself.

                                                                                      Gordon Irlam

