Subject: Looking for Program Transformation Examples
Newsgroups: comp.compilers
From: tseng@rice.edu (Chau-Wen Tseng)
Keywords: optimize, question
Organization: Rice University
Date: Fri, 24 Apr 1992 03:49:42 GMT
Hello netters,
I am looking for example program kernels that can benefit from
the following compiler optimizations.
(1) Unimodular Loop Transformations
Unimodular loop transformations use matrix calculations to guide
some combination of loop interchange, skewing, and reversal to
parallelize perfectly nested loops. A simple example
is Successive Over-Relaxation (SOR),
    do j = 2,N-1
      do i = 2,N-1
        A(i,j) = F(...A(i-1,j),A(i,j-1),A(i+1,j),A(i,j+1)...)
      enddo
    enddo
where loop skew and interchange can parallelize the inner loop:
    do i = 4,2*(N-1)
      doall j = max(2,i-N+1),min(N-1,i-2)
        A(i-j,j) = F(...A(i-j-1,j),A(i-j,j-1),A(i-j+1,j),A(i-j,j+1)...)
      enddo
    enddo
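In matrix terms (my own notation, not part of the original example): writing the iteration vector as (j, i) with j the outer loop, the skew S and the interchange P above compose into a single unimodular matrix T,

$$
T = P\,S
  = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
    \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}
  = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix},
\qquad
T \begin{pmatrix} j \\ i \end{pmatrix}
  = \begin{pmatrix} i + j \\ j \end{pmatrix},
\qquad
\det T = -1 .
$$

Both dependence distances of the stencil, (1,0) and (0,1) in (j,i) space, map under T to (1,1) and (1,0), so every dependence is carried by the new outer loop and the inner loop can run as a doall.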
Another example comes from tridiagonal solves, where loop
interchange can be used to move a parallel loop outwards. Before
interchange, the nest looks like
    do j
      doall i
        A(i,j) = F(...A(i,j-1)...)
      enddo
    enddo
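After interchange, the parallel i loop becomes the outermost loop and each i carries its own sequential recurrence over j. A minimal runnable sketch in Fortran (my own illustration, not from the post; the bounds, the OpenMP directive standing in for doall, and the simple recurrence standing in for F are all assumptions):

    program tridiag_interchanged
      implicit none
      integer, parameter :: n = 100
      real :: a(n, n)
      integer :: i, j

      a(:, 1) = 1.0
      ! After interchange the (formerly inner) parallel i loop is outermost;
      ! the sequential recurrence over j now runs independently for each i.
    !$omp parallel do
      do i = 1, n
         do j = 2, n
            a(i, j) = 0.5 * a(i, j - 1) + 1.0   ! assumed stand-in for F(...A(i,j-1)...)
         end do
      end do
    !$omp end parallel do

      print *, a(1, n)
    end program tridiag_interchanged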
However, I can't seem to find much else. Are there common
computations that can utilize unimodular loop transformations?
(2) Loop Distribution
Loop distribution creates multiple loops from a single loop. It is
pretty essential for vectorizing compilers. The most common use
for loop distribution is to separate a sequential reduction from a
parallel loop. For instance,
    do j
      A(j) = F(...)
      t = t + A(j)
    enddo
can be distributed to yield the following loops:
    doall j
      A(j) = F(...)
    enddo
    do j
      t = t + A(j)
    enddo
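A minimal runnable sketch of the distributed form in Fortran (my own illustration; the assignment standing in for F, the bounds, and the OpenMP directive standing in for doall are all assumptions):

    program distributed_reduction
      implicit none
      integer, parameter :: n = 1000
      real :: a(n), t
      integer :: j

      ! First loop after distribution: fully parallel.
    !$omp parallel do
      do j = 1, n
         a(j) = 2.0 * real(j)      ! assumed stand-in for F(...)
      end do
    !$omp end parallel do

      ! Second loop after distribution: the sequential reduction
      ! (or an OpenMP reduction(+:t) clause, if one prefers).
      t = 0.0
      do j = 1, n
         t = t + a(j)
      end do

      print *, 't =', t
    end program distributed_reduction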
I'm having a difficult time discovering other cases where loop
distribution can be used to increase parallelism.
I would appreciate pointers to computations that can benefit from
either of these techniques. I'll summarize responses, if I get any :-)
Thanks,
Chau-Wen Tseng
Grad Student
Rice University
--