Re: Irregular Data Access Pattern Loops on Parallel Machines


Related articles
Irregular Data Access Pattern Loops on Parallel Machines Mounir.Hahad@irisa.fr (1993-10-05)
Re: Irregular Data Access Pattern Loops on Parallel Machines preston@dawn.cs.rice.edu (1993-10-13)
Re: Irregular Data Access Pattern Loops on Parallel Machines moyer@mathcs.emory.edu (1993-10-13)

Newsgroups: comp.compilers
From: preston@dawn.cs.rice.edu (Preston Briggs)
Keywords: parallel
Organization: Rice University, Houston
References: 93-10-031
Date: Wed, 13 Oct 1993 03:04:30 GMT

Mounir.Hahad@irisa.fr (Mounir Hahad) writes:
>I'm looking for references to articles dealing with the problem of
>compiling loops with irregular data access patterns on parallel
>computers for minimum communication/maximum locality. [e.g.]
>
>DO I=1,N
> A(B(I)) = f
>ENDDO
>
>where B is not known at compile time.


This is a bit out of my area, but I haven't seen any other replies, so
I'll take a crack at it.


There are probably some papers mentioning the problem; however, it's
generally quite difficult and I know of no good compile-time approach.


In the past, there's been some discussion around here about doing
dependence analysis on such things if we could determine that B was a
"permutation vector"; that is, if we could be certain that


B(i) != B(j) when i != j


Such things tend to arise when using Gaussian elimination with partial
pivoting, for example. One approach to detecting a permutation
vector might be to notice an initialization of the form


do i = 1, n
   B(i) = i
end do


followed exclusively by references to B or swap operations between
elements of B. Recognizing that seems a little out of reach for most
Fortran compilers.
Alternatively, a directive of some sort might be used.
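
Either way, the property itself is cheap to check at run time. Here's a
minimal sketch of such a check (mine, not something any compiler emits
today); if it succeeds, no two iterations of the loop write the same
element of A, so the iterations can be reordered or run in parallel:

function is_permutation(B, n) result(ok)
  integer, intent(in) :: n
  integer, intent(in) :: B(n)
  logical :: ok
  logical :: seen(n)
  integer :: i

  seen = .false.
  ok = .true.
  do i = 1, n
     ! an out-of-range or repeated index means two iterations
     ! could touch the same element of A
     if (B(i) < 1 .or. B(i) > n) then
        ok = .false.
        return
     end if
     if (seen(B(i))) then
        ok = .false.
        return
     end if
     seen(B(i)) = .true.
  end do
end function is_permutation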


The common current approach is to use some sort of "inspector/executor"
code at run time. The idea is to first examine the elements of B to see
which elements of A will be referenced or defined. Then, given that
knowledge, plan how to block up the communication into efficient chunks.
A similar approach can be used when blocking sparse matrix operations for
cache. I don't know that people are able to insert all this
inspector/executor code at compile time, but Joel Saltz has done a fair
amount of work on subroutines to support irregular computations of this
sort on distributed memory machines. Perhaps someone else will know the
references. I believe the package is called PARTI.
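
To give the flavor of it, here's a rough sequential sketch of the
inspector/executor structure (my sketch, not the PARTI interface). It
assumes A is split into NB contiguous blocks, one per processor or per
cache-sized tile, that every B(I) falls in 1..N, and that f is just a
scalar:

subroutine irregular_update(A, B, f, n, nb)
  integer, intent(in)    :: n, nb
  integer, intent(in)    :: B(n)
  real,    intent(in)    :: f
  real,    intent(inout) :: A(n)

  integer :: cnt(nb), start(nb), order(n)
  integer :: i, k, blk, blksize

  blksize = (n + nb - 1) / nb     ! elements of A per block

  ! inspector: examine B once and bucket the iterations by the
  ! block of A they will write, so all references to one block
  ! end up contiguous in ORDER
  cnt = 0
  do i = 1, n
     blk = (B(i) - 1) / blksize + 1
     cnt(blk) = cnt(blk) + 1
  end do
  start(1) = 1
  do k = 2, nb
     start(k) = start(k-1) + cnt(k-1)
  end do
  cnt = 0
  do i = 1, n
     blk = (B(i) - 1) / blksize + 1
     order(start(blk) + cnt(blk)) = i
     cnt(blk) = cnt(blk) + 1
  end do

  ! executor: run the original loop in the planned order, block by
  ! block.  On a distributed-memory machine this is where each
  ! block's communication would be issued as one aggregate message.
  do k = 1, n
     i = order(k)
     A(B(i)) = f
  end do
end subroutine irregular_update

The inspector walks B exactly once and groups the iterations by the block
of A they touch; on a distributed-memory machine, those per-block index
lists are what you would use to aggregate the messages.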


Preston Briggs
--

