|200 way issue? email@example.com (1993-09-29)|
|Re: 200 way issue? firstname.lastname@example.org (1993-09-30)|
|Re: 200 way issue? email@example.com (1993-09-30)|
|Re: 200 way issue? grover@brahmand.Eng.Sun.COM (1993-09-30)|
|Re: 200 way issue? firstname.lastname@example.org (1993-09-30)|
|Re: 200 way issue? email@example.com (1993-10-01)|
|Re: 200 way issue? firstname.lastname@example.org (1993-10-01)|
|Re: 200 way issue? email@example.com (Dave Gillespie) (1993-10-04)|
|From:||firstname.lastname@example.org (Dave Poplawski)|
|Organization:||Michigan Technological University|
|Date:||Thu, 30 Sep 1993 13:49:04 GMT|
email@example.com (David Moore) writes:
> HOWEVER, the question I want to raise is this: How many way issue can one
> actually use on real code.
> Suppose, for example, we took the spec benchmarks and optimized for an
> infinite issue machine. Now suppose we built a histogram of actual number
> of instructions issued per machine cycle. Has anyone published a paper on
> what this histogram would look like?
It has been looked at quite a bit. Check out
Jouppi and Wall, "Available Instruction Level Parallelism for Superscalar
and Superpipelined Machines", 3rd ASPLOS, 1989
Nicolau and Fisher, "Measuring the Parallelism Available for Very Long
Instruction Word Architectures", IEEE Trans on Comp., November 1984.
Smith, Johnson and Horowitz, "Limits on Multiple Instruction Issue", 3rd
Wall, "Limits of Instruction Level Parallelism", 4th ASPLOS, 1991.
Peterman, "An Analysis of Instruction Level Parallelism in Several Common
Benchmarks", MS Thesis, Michigan Technological University, 1993 (email to
firstname.lastname@example.org for a copy).
The bottom line is that if you can't issue around basic blocks, then you
don't have much to work with - generally many fewer than 10 instructions
can be issued in parallel. If you can do "perfect" branch prediction
(never wrong), then the number of instructions you can issue is pretty
much unbounded (a lot of other things must be idealized, like unlimited
register renaming, unbounded issue window size, etc.). Peterman has
exactly the histograms you mention for unlimited resources and perfect
branch prediction, they look like graphs of y=1/x - very high in the first
few cycles, then dropping off to just a few in later cycles. In
Theobald, "On the Limits of Program Parallelism and its Smoothability", 25th
Int. Symp. on Microarchitecture, 1992.
he shows that by limiting the number of instructions you can issue per
cycle, you can still achieve almost the same speedup. Essentially the
large number of instructions issued in the first few cycles are postponed
until later. The peak in the initial part of the graph is squashed down
and spread out into the valley later on, without affecting the maximum x
value (total number of execution cycles) by much.
Dave Poplawski (906) 487-2331 - my office
Department of Computer Science (906) 487-2209 - secretary
Michigan Technological University email@example.com
Houghton, MI 49931
Return to the
Search the comp.compilers archives again.