Re: Loop ROLLING (Steve Simmons)
Mon, 28 Oct 1991 15:01:12 GMT

          From comp.compilers

Newsgroups: comp.compilers
From: (Steve Simmons)
Keywords: optimize
Organization: CONVEX Computer Corporation, Richardson, Tx., USA
References: 91-10-103
Date: Mon, 28 Oct 1991 15:01:12 GMT

Loop unrolling is accomplished by replicating the body of the loop n times,
where n is a compile-time constant. This replication may occur in the
source code, the parse tree, or the IL graph: wherever a loop can be
identified and its iteration count is constant. If the iteration count is
not a compile-time constant, partial unrolling may be accomplished instead:
the body is replicated n times and the iteration count is divided by the
constant n, with any remainder iterations handled separately.
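
As a rough illustration of partial unrolling, here is a sketch in C with
n = 4. The function names and the saxpy-style body are made up for this
example; the cleanup loop covers the iterations left over when the count
is not a multiple of 4.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example loop: y[i] += a * x[i], written the ordinary way. */
void saxpy_rolled(size_t count, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < count; i++)
        y[i] += a * x[i];
}

/* The same computation, partially unrolled by n = 4. */
void saxpy_unrolled4(size_t count, float a, const float *x, float *y)
{
    assert(x != NULL && y != NULL);
    size_t i = 0;
    size_t main_end = count - (count % 4);  /* largest multiple of 4 <= count */

    /* Main loop: one test and branch per four original iterations. */
    for (; i < main_end; i += 4) {
        y[i]     += a * x[i];
        y[i + 1] += a * x[i + 1];
        y[i + 2] += a * x[i + 2];
        y[i + 3] += a * x[i + 3];
    }

    /* Cleanup loop: the remaining count % 4 iterations. */
    for (; i < count; i++)
        y[i] += a * x[i];
}
```

Both functions should produce the same results for any count; only the
number of tests and branches executed differs.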

The benefit of loop unrolling is that the end-of-loop test and branch are
executed once per n original iterations rather than once per iteration.
Also, once the branches are removed, instruction scheduling may occur
across the different iteration bodies of the loop.
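
To make the scheduling point concrete, here is a hand-written sketch in C
of what a scheduler might be free to do once four iteration bodies sit in
one basic block: issue all the loads first, then the multiplies, then the
stores. The function name is invented for illustration, and the restrict
qualifiers are an assumption that the two arrays do not overlap; without
that, the reordering would not be legal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: dst[i] = src[i] * k, unrolled by 4 and hand-scheduled. */
void scale_unrolled4_scheduled(size_t count, int k,
                               const int *restrict src, int *restrict dst)
{
    assert(src != NULL && dst != NULL);
    size_t i = 0;

    for (; i + 4 <= count; i += 4) {
        /* Loads from four different iterations, grouped together. */
        int t0 = src[i];
        int t1 = src[i + 1];
        int t2 = src[i + 2];
        int t3 = src[i + 3];

        /* Independent multiplies; none depends on another's result. */
        t0 *= k;
        t1 *= k;
        t2 *= k;
        t3 *= k;

        /* Stores, also grouped. */
        dst[i]     = t0;
        dst[i + 1] = t1;
        dst[i + 2] = t2;
        dst[i + 3] = t3;
    }

    /* Cleanup loop for the leftover iterations. */
    for (; i < count; i++)
        dst[i] = src[i] * k;
}
```

In the rolled loop, the branch at the end of each iteration separates the
load of one element from the store of the previous one, so this kind of
interleaving is not available within a single basic block.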

> It is conceivable that the performance for the
> application (due to compiler, architecture, and hand-unrolling) could be
> WORSE on the new machine than it was on the old machine. Which means that
> to tune the source for the new machine, the new victim (oops, programmer)
> would need to become familiar with the compiler used, and the architecture
> in question.

Why certainly... the most likely reason would be the change in the cache
access patterns. For partial unrolling, the original body of the loop may
have fit within the cache while the replicated body does not. Therefore, a
cache miss may now occur at the top of the loop. In addition, it is hard to
either vectorize or parallelize a loop once it has been unrolled.

Can loop rolling be done? Well, it is a matter of programming. It
requires that the compiler recognize loop bodies in code where no loop
was written. That is, some form of pattern recognition must take place.
Certainly, it can be done... but is it worth it? What other performance
problems can be fixed by loop rolling? Not many, unless you need more
compact code; for the last ten years, the time-space tradeoff has
consistently favored time, since memory is cheap.
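
For a feel of the pattern recognition involved, here is a toy sketch in C.
The stmt structure and rollable_prefix function are hypothetical and far
simpler than any real IL: each statement is assumed to have the form
dst[dst_index] = src[src_index] op constant, and the function counts how
many leading statements repeat the first one with indices advancing by a
fixed stride. A real compiler would also have to check dependences and
aliasing before rolling anything.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified statement: dst[dst_index] = src[src_index] op constant. */
struct stmt {
    char op;         /* operator, e.g. '+' or '*' */
    int  dst_index;
    int  src_index;
    int  constant;
};

/*
 * Returns how many leading statements could be rolled into a single loop:
 * each must use the same operator and constant as the first statement,
 * and its indices must advance by the stride seen between statements 0
 * and 1. A result of 0 or 1 means there is nothing to roll.
 */
size_t rollable_prefix(const struct stmt *s, size_t n)
{
    assert(s != NULL || n == 0);
    if (n < 2)
        return n;

    int dst_stride = s[1].dst_index - s[0].dst_index;
    int src_stride = s[1].src_index - s[0].src_index;

    size_t k = 1;
    while (k < n
           && s[k].op == s[0].op
           && s[k].constant == s[0].constant
           && s[k].dst_index - s[k - 1].dst_index == dst_stride
           && s[k].src_index - s[k - 1].src_index == src_stride)
        k++;
    return k;
}
```

Given four statements multiplying elements 0 through 3 by the same
constant, the function reports a prefix of 4, which could then be replaced
by a loop of four iterations over one copy of the body.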

Thank you.

Steve Simmons
