|Irregular Data Access Pattern Loops on Parallel Machines Mounir.Hahad@irisa.fr (1993-10-05)|
|Re: Irregular Data Access Pattern Loops on Parallel Machines firstname.lastname@example.org (1993-10-13)|
|Re: Irregular Data Access Pattern Loops on Parallel Machines email@example.com (1993-10-13)|
|From:||firstname.lastname@example.org (Preston Briggs)|
|Organization:||Rice University, Houston|
|Date:||Wed, 13 Oct 1993 03:04:30 GMT|
Mounir.Hahad@irisa.fr (Mounir Hahad) writes:
>I'm looking for references to articles dealing with the problem of
>compiling loops with irregular data access patterns on parallel
>computers for minimum communication/maximum locality. [e.g.]
> A(B(I)) = f
>where B is not known at compile time.
This is a bit out of my area, but I haven't seen any other replies, so
I'll take a crack at it.
There are probably some papers mentioning the problem; however, it's
generally quite difficult and I know of no good compile-time approach.
In the past, there's been some discussion around here about doing
dependence analysis on such things if we could determine that B was a
"permutation vector"; that is, if we could be certain that
B[i] != B[j] when i != j
Such things tend to arise when using Gaussian elimination with partial
pivoting, for example. One approach to the detection of a permutation
vector might be to notice initialization of the form
do i = 1, n
B[i] = i
followed exclusively by references to B or swap operations between
elements of B. This seems a little out of reach for most Fortran.
Alternatively, a directive of some sort might be used.
The common current approach is to use some sort of "inspector/executer"
code at run time. The idea is to first examine the elements of B to see
which elements of A will be referenced or defined. Then, given that
knowledge, plan how to block up the communication into efficient chunks.
A similar approach can be used when blocking sparse matrix operations for
cache. I don't know that people are able to insert all this
inspector/executer code at compile time, but Joel Saltz has done a fair
amount of work on subroutines to support irregular computations of this
sort on distributed memory machines. Perhaps someone else will know the
references. I believe the package is called PARTI.
Return to the
Search the comp.compilers archives again.