Tue, 26 Jan 1993 03:01:16 GMT

Newsgroups: | comp.sys.super,comp.arch,comp.compilers |

From: | pmontgom@math.orst.edu (Peter Montgomery) |

Organization: | Oregon State University Math Department |

Date: | Tue, 26 Jan 1993 03:01:16 GMT |

Keywords: | architecture, question |

References: | 93-01-174 |

kirchner@uklira.informatik.uni-kl.de (Reinhard Kirchner) writes:

*>In discussing the various merits of different vector machines we came upon*

*>the issue of register architectures. On one side there are the Cray and*

*>Convex with 8 vector registers of 64 or 128 words each, and on the other side*

*>*

*>the Fujitsu machines with their reconfigurable register file of 32 or*

*>64 KB, which is 4K or 8K words, configurable from 256 registers of 16/32*

*>words each to 8 registers of 512/1024 words each.*

*>*

*>Now there is the question: is such a large register file useful at all?*

I used an Alliant FX/80 while at UCLA. It had eight vector

registers, each of length 32, which could hold integer or floating point

operands.

One time-critical routine in my program was multiple-precision

modular multiplication. The assembly language loop which multiplied one

vector of length <= 32 by another such vector had enough vector registers,

but there were insufficient vector registers for another loop which

multiplied two vectors of length <= 64. These loops also faced a shortage

of scalar integer registers (Motorola 68020 has 8 address and 8 data

registers), requiring me to use a floating point register for one loop

control variable. I guess that 16 or 32 vector registers will be

adequate for most applications.

*>But how is this on vector machines? The register creates a speedup only*

*>when it can hold an entire vector, which can be used again later. This*

*>requires a register long enough to do so. That means vectors of e.g. a*

*>length of 5000 can not be held anyway, every machine must load, process,*

*>and store it in pieces, and only a lot of memory bandwidth helps.*

It is important to strip mine and re-use vectors. Consider

evaluating a polynomial at 5000 points:

    do i = 1, 5000
        pvalue = p(degree)                  ! Leading coefficient
        do j = degree-1, 0, -1
            pvalue = pvalue*x(i) + p(j)     ! Horner's rule
        end do
        value(i) = pvalue
    end do

On a machine with vector length at most 64, the code can be

    do ibeg = 1, 5000, 64
        iend = MIN(ibeg + 63, 5000)         ! Last strip may be shorter than 64
        lng = iend - ibeg + 1
        pvalue(1:lng) = p(degree)
        do j = degree-1, 0, -1
            pvalue(1:lng) = pvalue(1:lng)*x(ibeg:iend) + p(j)
        end do
        value(ibeg:iend) = pvalue(1:lng)
    end do

If pvalue(1:lng) and x(ibeg:iend) are assigned to vector registers across

the j loop, then the only memory reference in that loop is the load of

p(j). Loops like this (where I operate several times on one temporary

vector, here pvalue) occurred in many parts of my code. Alas, the

compiler installed at UCLA did not perform these optimizations.
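For readers who want to experiment with the strip-mined loop, here is a minimal sketch in Python rather than the Fortran of the post. It is an illustration only, not the original Alliant code: the function names, the `vlen` parameter, and the load/store counters are assumptions introduced here, and Python lists merely stand in for vector registers. The counters tally one vector load of `x` and one vector store of the result per strip, on the assumption that both operands stay in registers across the `j` loop.

```python
def horner_scalar(p, x):
    """Reference version: Horner's rule applied to one point at a time.

    p[j] is the coefficient of x**j, so p[-1] is the leading coefficient.
    """
    values = []
    for xi in x:
        pvalue = p[-1]                          # leading coefficient
        for j in range(len(p) - 2, -1, -1):
            pvalue = pvalue * xi + p[j]
        values.append(pvalue)
    return values


def horner_strip_mined(p, x, vlen=64):
    """Strip-mined version: process x in pieces of at most vlen elements.

    Returns (values, vector_loads, vector_stores). The two counters model
    the memory traffic if xreg and preg live in vector registers for the
    whole inner loop (a hypothetical machine model, not a measurement).
    """
    n = len(x)
    values = [0.0] * n
    vector_loads = 0
    vector_stores = 0
    for ibeg in range(0, n, vlen):
        iend = min(ibeg + vlen, n)              # exclusive end; last strip may be short
        xreg = x[ibeg:iend]                     # one vector load per strip
        vector_loads += 1
        preg = [p[-1]] * (iend - ibeg)          # scalar broadcast, no vector load
        for j in range(len(p) - 2, -1, -1):
            pj = p[j]                           # scalar load: the only memory reference here
            preg = [pv * xv + pj for pv, xv in zip(preg, xreg)]
        values[ibeg:iend] = preg                # one vector store per strip
        vector_stores += 1
    return values, vector_loads, vector_stores
```

With 5000 points and `vlen=64` this makes 79 strips (78 full ones and a final strip of 8 elements), so under the stated model the vector memory traffic is 79 loads and 79 stores regardless of the polynomial's degree. Only the scalar loads of `p[j]` grow with the degree.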

--

Peter L. Montgomery Internet: pmontgom@math.orst.edu

Dept. of Mathematics, Oregon State Univ, Corvallis, OR 97331-4605 USA
