Instruction Scheduling for UltraSPARC kuriakose.kuruvilla@wipro.com (Kuriakose Kuruvilla) (2000-12-11) |
From: | Kuriakose Kuruvilla <kuriakose.kuruvilla@wipro.com> |
Newsgroups: | comp.compilers |
Date: | 11 Dec 2000 01:59:12 -0500 |
Organization: | Wipro |
Keywords: | architecture, optimize |
Posted-Date: | 11 Dec 2000 01:59:12 EST |
Hi People
I am trying to analyse the performance improvements of an instruction
scheduler for instructions generated on-the-fly targeting the SPARC-V9
compliant UltraSPARC-IIi processor.
This processor can issue up to 4 instructions per cycle, subject to the
instruction grouping rules described in "Chapter 22: Grouping Rules and
Stalls" of the "UltraSPARC-IIi User's Manual".
For example, SLLX uses IEU0 and ADD is a non-specific IEU instruction.
Hence...
sllx %i2, 2, %i2 ! Group1
sllx %i3, 2, %i3 ! Group2
sllx %i4, 2, %i4 ! Group3
add %l4, 2, %l4 ! Group4
add %l5, 2, %l5 ! Group4
add %l6, 2, %l6 ! Group5
would be better scheduled as...
sllx %i2, 2, %i2 ! Group1
add %l4, 2, %l4 ! Group1
sllx %i3, 2, %i3 ! Group2
add %l5, 2, %l5 ! Group2
sllx %i4, 2, %i4 ! Group3
add %l6, 2, %l6 ! Group3
thereby giving an improvement of 2 cycles.
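
The grouping effect in the example above can be explored with a small model. This is only a sketch under stated assumptions: it enforces just two constraints (at most 4 instructions per group, and at most two integer instructions per group of which only one may need IEU0), it assumes the instructions are independent, and it ignores every other rule in Chapter 22. The `UNIT` table and `count_groups` function are hypothetical names made up for illustration.

```python
# Simplified in-order grouping model. ASSUMPTION: only the two limits
# below apply; data dependencies and all other grouping rules are ignored.
MAX_GROUP = 4   # up to 4 instructions issued per cycle (from the post)
MAX_IEU = 2     # assumed: two integer units, IEU0 and IEU1

# Hypothetical classification: "IEU0" must use IEU0, "IEU" can use either.
UNIT = {"sllx": "IEU0", "add": "IEU"}


def count_groups(opcodes):
    """Count issue groups for a sequence of independent integer opcodes."""
    groups = 0
    size = 0            # instructions in the current group
    ieu_used = 0        # integer instructions in the current group
    ieu0_busy = False   # True once an IEU0-only instruction is in the group
    for op in opcodes:
        needs_ieu0 = UNIT[op] == "IEU0"
        fits = (
            groups > 0
            and size < MAX_GROUP
            and ieu_used < MAX_IEU
            and not (needs_ieu0 and ieu0_busy)
        )
        if not fits:
            # In-order issue: start a new group with this instruction.
            groups += 1
            size = 0
            ieu_used = 0
            ieu0_busy = False
        size += 1
        ieu_used += 1
        ieu0_busy = ieu0_busy or needs_ieu0
    return groups


unscheduled = ["sllx", "sllx", "sllx", "add", "add", "add"]
scheduled = ["sllx", "add", "sllx", "add", "sllx", "add"]
print(count_groups(unscheduled), count_groups(scheduled))
```

Under these two limits alone the interleaved order forms 3 groups, matching the annotations. The original order forms 4 rather than 5 in this model, because the first add is allowed to share a group with the last sllx; the annotations in the post may reflect constraints that this sketch does not capture.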
The instructions seem to be properly reordered based on these rules for
small instances of code I looked at. But the standard test suites do
not show the expected improvements.
So I tried using the Performance Control Register (PCR) and the
Performance Instrumentation Counters (PICs) provided by the processor.
These I accessed using a freeware perfmon driver.
But even the PICs did not show the expected results when I tried
determining the number of instruction cycles for a small piece of
code. Also, the number of instructions the PIC reports as executed is
not exactly the number of instructions that were timed, but also
depends on where the instructions are located in the address space (if
the first instruction timed is the last instruction of a block of 8
instructions aligned on a 32-byte boundary, 7 more instructions are added
to the counter value corresponding to "instructions executed").
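
One way to check whether the extra counts track alignment is to compute where the first timed instruction sits within its 32-byte block, and how many nops would push it to a block start. The sketch below is just that arithmetic; the function names and the example address are hypothetical, and it assumes 4-byte instructions in blocks of 8, as described above.

```python
INSN_BYTES = 4    # SPARC instructions are 4 bytes each
BLOCK_BYTES = 32  # 8 instructions per aligned block, as in the post


def slot_in_block(addr):
    """0-based position (0..7) of the instruction at addr in its 32-byte block."""
    return (addr % BLOCK_BYTES) // INSN_BYTES


def pad_to_block_start(addr):
    """Number of nops to emit at addr so the next instruction starts a block."""
    return (-addr % BLOCK_BYTES) // INSN_BYTES


first_timed = 0x1001C  # hypothetical address of the first timed instruction
print(slot_in_block(first_timed), pad_to_block_start(first_timed))
```

Repeating the measurement with the timed code starting at each of the 8 slots would show whether the surplus in "instructions executed" varies with the slot or only appears in the last one.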
Can someone help me out on this? Are the PIC registers inaccurate
in some way, or do they not do what the manual says? What about the
implementation of the grouping logic? Or am I missing out on something?
Anyone have prior experience with scheduling based on grouping rules on
the UltraSPARC? Or experience with using the PIC/PCR registers?
Thanks
Kuriakose