|compiling for multithreaded architectures? firstname.lastname@example.org (1995-11-30)|
|Re: compiling for multithreaded architectures? email@example.com (1995-11-30)|
|From:||firstname.lastname@example.org (Preston Briggs)|
|Date:||Thu, 30 Nov 1995 23:47:18 GMT|
>The title says it all. Could anyone recommend any good papers on the topic?
Not sure what you're looking for here (I know, I know: Papers on
compiling for... :-) It's just that it's not particularly interesting.
You need to do a nice job of classical optimization and you look for
parallelism, just like everyone else. Unlike everyone else, you don't
worry about cache management, short stride data access, data
distribution, or message blocking.
So we read the same papers as everyone else, but we get to ignore the
ugly "practical" parts and use all the pretty "theoretical" results.
Interesting problems that don't go away include choosing the
appropriate form of parallelism to use for a particular loop.
For our machine, there are several choices:
1. dumb ol' scalar code, maybe unrolled a bit
2. software pipelined to take advantage of the wide instructions
3. spreading iterations of a loop across many threads of 1 processor
4. spreading iterations of a loop across many threads of many processors
The higher-level schemes have higher startup overhead and are more
appropriate for loops that run for more iterations and/or have more
work per iteration.
Loop nests are naturally more complicated, offering more possibilities.
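A minimal sketch of that tradeoff, assuming a toy cost model: estimated time is a one-time startup cost plus total work divided by an idealised speedup. The scheme names follow the list above; the `Scheme` type, the function names, and every overhead and speedup number are invented for illustration and do not describe any real machine or compiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scheme:
    name: str
    startup: float  # one-time cost in cycles (hypothetical figure)
    speedup: float  # idealised steady-state speedup over scalar (hypothetical)


# Made-up numbers, ordered like the four choices in the list above.
SCHEMES = [
    Scheme("scalar", 0.0, 1.0),
    Scheme("software-pipelined", 20.0, 3.0),
    Scheme("threads, one processor", 500.0, 30.0),
    Scheme("threads, many processors", 5000.0, 200.0),
]


def estimated_cycles(scheme, iterations, work_per_iteration):
    """Startup overhead plus the loop's total work scaled by the speedup."""
    return scheme.startup + iterations * work_per_iteration / scheme.speedup


def choose_scheme(iterations, work_per_iteration):
    """Pick the scheme with the lowest estimated cost for this loop."""
    return min(
        SCHEMES,
        key=lambda s: estimated_cycles(s, iterations, work_per_iteration),
    )


if __name__ == "__main__":
    for iters, work in [(1, 10), (100, 10), (1000, 10), (1_000_000, 10)]:
        print(iters, work, choose_scheme(iters, work).name)
```

Under these assumed numbers, short loops stay scalar or software pipelined, and the threaded schemes only win once the iteration count or the work per iteration is large enough to amortize their startup cost. A real compiler would rarely know the trip count exactly, so the same comparison would have to run on estimates or be deferred to run time.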
But none of this is specific to multithreaded machines. Seems a
basic problem for any programmer (or compiler) targeting a parallel
machine: Choosing the appropriate level of parallelism.