Fri, 26 Jul 91 15:21:35 GMT

Related articles |
---|

Access pattern optimizations sam2y@koa.cs.Virginia.EDU (1991-07-26) |

Re: Access pattern optimizations moss@cs.umass.edu (1991-07-27) |

Newsgroups: | comp.compilers |

From: | sam2y@koa.cs.Virginia.EDU (Steven A. Moyer) |

Keywords: | optimize |

Organization: | Dept. of Computer Science, University of Virginia |

Date: | Fri, 26 Jul 91 15:21:35 GMT |

Does anyone know of any work on optimizations/transformations of

access patterns for scalar processors? In particular, given a

loop:

do 1 k = 1, n

1 x(k) = y(k+1) + y(k)

a straight-forward translation would generate the accesses:

load y(2), load y(1), store x(1)

load y(3), load y(2), store x(2)

etc.

with 2 loads and 1 store per iteration. It would be preferable to

generate the access pattern:

load y(1)

load y(2), store x(1)

load y(3), store x(2)

etc.

which re-uses a previously loaded value so that only 1 load and 1 store

is required per iteration. Furthermore, the later access pattern loads

y values and stores x values as vectors with stride 1; given an

interleaved memory system, if the loop which generates this access pattern

were unrolled and accesses to each vector grouped then memory bandwidth

would be improved considerably.

It is easy to imagine more complex loops which would also benefit

from this type of transformation.

Maybe this could be handled by some form of global register coloring

scheme, I'm not sure.

If you know of any published work in this area I would greatly

appreciate references. Please send responses to: sam2y@virginia.edu

Thanks.

--Steve

Steve Moyer

Dept. of Computer Science

University of Virginia

sam2y@virginia.edu

[This is pretty similar to software pipelining, a well-known optimization

used in unrolled loops. -John]

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.