|Optimizations for Pipelined Processors email@example.com (1993-01-21)|
|Re: Optimizations for Pipelined Processors firstname.lastname@example.org (1993-01-22)|
|Re: Optimizations for Pipelined Processors email@example.com (1993-01-22)|
|Re: Optimizations for Pipelined Processors firstname.lastname@example.org (1993-01-22)|
From: email@example.com (David Moore)
Date: Fri, 22 Jan 1993 22:58:25 GMT
firstname.lastname@example.org (Alon Ziv) writes:
> Is any work going on into optimisations using loop unrolling
> and loop merging
Machines on which these optimizations are valuable are by no means new.
The Control Data 6600 was an early (circa 1964) example.
This machine had a number of independent functional units which could
work on several operations simultaneously. Modern processors achieve the
same thing by pipelining, but the effect on the compiler writer is the
same. However, the time taken to complete one operation was typically 10
times the interval between instruction issues, a ratio 2-3 times worse
than on modern machines, and that made the optimizer's task more difficult.
One oddity caused by this was that a ternary chop would in theory run
faster than a binary search!
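To make the ternary chop concrete, here is a minimal sketch in C (my own
illustration, not code from the 6600 era; the function names are made up).
The presumed reason for the oddity is that the two probes in each ternary
step do not depend on each other, so they can be issued back to back and
overlap, while the number of steps drops from about log2(n) to about
log3(n). Whether that wins on a given machine depends on its latencies.

```c
#include <stddef.h>

/* Classic binary search over a sorted array; returns index or -1.
 * Each step has one probe, and the next probe depends on its result. */
long binary_search(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;              /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (a[mid] == key)
            return (long)mid;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}

/* Ternary chop: two probes per step, cutting the range to about a third.
 * The two loads are independent, so a machine with several functional
 * units (or a pipeline) can overlap them. */
long ternary_search(const int *a, size_t n, int key)
{
    size_t lo = 0, hi = n;              /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t third = (hi - lo) / 3;
        size_t m1 = lo + third;         /* lower probe */
        size_t m2 = hi - 1 - third;     /* upper probe, always >= m1 */
        int v1 = a[m1];
        int v2 = a[m2];                 /* independent of the load of v1 */
        if (v1 == key)
            return (long)m1;
        if (v2 == key)
            return (long)m2;
        if (key < v1) {
            hi = m1;
        } else if (key > v2) {
            lo = m2 + 1;
        } else {                        /* v1 < key < v2 */
            lo = m1 + 1;
            hi = m2;
        }
    }
    return -1;
}
```

Both functions return the same answer for an array of distinct sorted
keys; only the shape of the dependence chain differs.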
However, this machine had a 7-word (14 to 28 instruction) cache which
avoided instruction fetches from memory if you could stay within it.
Hence, stuffing loops was often not worthwhile if the individual loops
would fit in the cache.
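As an illustration of the trade-off, here is a hedged sketch in C of two
small loops and their merged form (function names and the choice of loop
bodies are mine, purely for illustration). The merged version pays the
loop overhead once instead of twice, but its body is larger, so on a
machine with a very small instruction buffer it may no longer fit where
each separate loop would. The sketch assumes x and y do not overlap.

```c
#include <stddef.h>

/* Two separate loops: each body is tiny and is more likely to stay
 * resident in a small instruction buffer. */
void scale_offset_separate(double *x, double *y, size_t n,
                           double k, double c)
{
    for (size_t i = 0; i < n; i++)
        x[i] *= k;
    for (size_t i = 0; i < n; i++)
        y[i] += c;
}

/* Merged loop: one increment, test and branch per element instead of
 * two, at the cost of a larger loop body.
 * Assumes x and y do not overlap. */
void scale_offset_merged(double *x, double *y, size_t n,
                         double k, double c)
{
    for (size_t i = 0; i < n; i++) {
        x[i] *= k;
        y[i] += c;
    }
}
```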
A similar situation occurs on modern processors. On a RISC chip,
instructions are often fetched using static column mode. Unrolling a loop
so that it crosses a page boundary will slow rather than speed execution
if the only saving is the loop jump. (If the loop has dead time, you may
still gain more than you lose, because of instruction overlapping.)
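For readers who want to see what "gaining from instruction overlapping"
can look like, here is a minimal sketch in C (my own example, with
made-up function names). The simple loop has dead time because each add
must wait for the previous one. The unrolled loop keeps four independent
partial sums, so a pipelined machine can overlap the adds as well as
saving three out of four loop jumps. Note that for floating point this
reorders the additions, so results can differ in the last bits for
general data.

```c
#include <stddef.h>

/* Simple loop: every add depends on the previous add, so the adder's
 * latency is exposed on each iteration. */
double sum_simple(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Unrolled by 4 with independent partial sums: the four adds in the
 * body do not depend on each other and can overlap in the pipeline. */
double sum_unrolled4(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }

    double s = (s0 + s1) + (s2 + s3);

    /* Clean-up loop for the 0 to 3 leftover elements. */
    for (; i < n; i++)
        s += a[i];
    return s;
}
```

The unrolled body is roughly four times the size of the original, which
is exactly the code growth that can push a loop across a page boundary.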
I just pulled down SIGPLAN Notices Vol 27 No 7, which is the proceedings
of the '92 SIGPLAN conference, and found:
Delinearization: An Efficient Way to Break Multiloop Dependence Equations,
Vadim Maslov (email@example.com)
Beyond Induction Variables, Michael Wolfe
The bibliographies of these, and of previous years' proceedings,
are a good place to start.