Suggestion for latency times

donawa@duke.cs.mcgill.ca (Chris DONAWA)
Tue, 10 Aug 1993 18:52:33 GMT

          From comp.compilers

Newsgroups: comp.compilers
From: donawa@duke.cs.mcgill.ca (Chris DONAWA)
Keywords: performance, architecture
Organization: Compilers Central
Date: Tue, 10 Aug 1993 18:52:33 GMT

Can anyone suggest a reasonable "average" value for long latency
operations on RISC machines? Specifically for multiplication and division
operations? My motivation is that I'm writing an instruction list
scheduler that works on an intermediate representation. The IR is
designed assuming RISC architectures, where I consider a RISC architecture
to have the following characteristics:


    1) load/store architecture
    2) relatively few instruction choices for a particular operation
    3) pipelined architecture
    4) large number of general purpose registers
    5) most integer operations take the same amount of time to execute
          (e.g. bit shift == integer add == integer subtract), with a few
          exceptions, notably division and multiplication. This assumption
          is used, for example, to replace integer multiplications of
          variables by constants with a series of shifts, adds and
          subtractions.
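
To illustrate the transformation mentioned in point 5, here is a minimal
sketch in Python (not the code from my compiler) that rewrites a
multiplication by a non-negative constant as shifts, adds and subtractions.
It uses a signed-digit (non-adjacent form) decomposition of the constant; the
names shift_add_plan and multiply_by_constant are hypothetical and chosen
only for this example.

```python
def shift_add_plan(c):
    """Decompose a non-negative constant c into signed powers of two.

    Returns a list of (sign, shift) pairs with sign in {+1, -1} such that
    c == sum(sign * 2**shift for sign, shift in plan).
    """
    if c < 0:
        raise ValueError("this sketch handles non-negative constants only")
    plan = []
    shift = 0
    while c != 0:
        if c & 1:
            # Choose digit +1 when c mod 4 == 1 and -1 when c mod 4 == 3,
            # so that a run of 1 bits collapses into a single subtraction.
            digit = 2 - (c & 3)
            plan.append((digit, shift))
            c -= digit
        c >>= 1
        shift += 1
    return plan


def multiply_by_constant(x, c):
    """Compute x * c using only shifts, adds and subtractions."""
    result = 0
    for sign, shift in shift_add_plan(c):
        term = x << shift
        result = result + term if sign > 0 else result - term
    return result
```

For example, shift_add_plan(7) yields one subtraction and one shifted add
(8*x - x) instead of the three terms a plain binary expansion would need.
Whether such a sequence beats a hardware multiply depends on the multiply
latency assumed, which is the question of this post.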




Currently, we can generate code for SPARC, RS/6000 and DLX (a
pedagogical architecture influenced by the MIPS R2000, described in
Patterson & Hennessy), as well as for a "pseudo" architecture, which prints
assembly-language-like code but gives the names of the variables
rather than their assigned registers.


What I'd like to do is demonstrate the effectiveness of my scheduler
using my pseudo assembly code, and thus not tie the results to one
particular architecture. The problem, however, is to choose latencies
that are "realistic", i.e. such that an improvement measured with them
implies improved performance on real machines. I've heard of latencies of
between 6 and 14 cycles for division, and anywhere between 3 and 14 for
multiplication. Currently, I'm using the following values:


Op        Latency in cycles
--------  -----------------
load      2 (be optimistic and assume a cache hit)
store     1
mult      6
div       6
branches  2 (assume a branch delay slot; this value is flexible)
others    1
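
To show how such a table feeds into scheduling, here is a minimal sketch in
Python of a greedy list scheduler for a single-issue machine. It is only an
illustration under stated assumptions, not my actual scheduler: the
dependence graph is assumed to be acyclic, priorities are latency-weighted
path lengths to the end of the block, and the names LATENCY, latency and
list_schedule are hypothetical.

```python
# Latencies in cycles, taken from the table above; anything else costs 1.
LATENCY = {"load": 2, "store": 1, "mult": 6, "div": 6, "branch": 2}
DEFAULT_LATENCY = 1


def latency(op):
    return LATENCY.get(op, DEFAULT_LATENCY)


def list_schedule(instrs):
    """Greedy list scheduling, one instruction issued per cycle.

    instrs maps an instruction name to (op, [names it depends on]).
    Returns (order, issue), where issue[name] is the issue cycle.
    Assumes the dependence graph is acyclic.
    """
    succs = {n: [] for n in instrs}
    for n, (_, deps) in instrs.items():
        for d in deps:
            succs[d].append(n)

    # Priority = longest latency-weighted path from the instruction to the
    # end of the block (a common critical-path heuristic).
    priority = {}

    def height(n):
        if n not in priority:
            tail = max((height(s) for s in succs[n]), default=0)
            priority[n] = latency(instrs[n][0]) + tail
        return priority[n]

    for n in instrs:
        height(n)

    issue, order, cycle = {}, [], 0
    remaining = set(instrs)
    while remaining:
        # An instruction is ready once every operand's result is available.
        ready = [
            n for n in remaining
            if all(d in issue and issue[d] + latency(instrs[d][0]) <= cycle
                   for d in instrs[n][1])
        ]
        if ready:
            # Highest priority first; ties broken by name for determinism.
            pick = min(ready, key=lambda n: (-priority[n], n))
            issue[pick] = cycle
            order.append(pick)
            remaining.remove(pick)
        cycle += 1  # an empty ready list models a stall cycle
    return order, issue


# Hypothetical block: t = (x * y) + (z + 1); store t
example = {
    "a": ("load", []),
    "b": ("load", []),
    "m": ("mult", ["a", "b"]),
    "c": ("load", []),
    "s": ("add", ["c"]),
    "t": ("add", ["m", "s"]),
    "st": ("store", ["t"]),
}
order, issue = list_schedule(example)
```

Changing the "mult" entry in LATENCY and rerunning shows how sensitive the
resulting schedule length is to the assumed multiply latency.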




Note that I assume the latency of the integer functional units is the
same as that of the floating-point units.


Is the value of 6 "realistic"? Any comments on the other values?


Thank you very much for your comments.


Chris Donawa
donawa@acaps.cs.mcgill.ca
--

