27 Oct 1999 14:06:41 -0400

From: Peter-Lawrence.Montgomery@cwi.nl (Peter L. Montgomery)

Newsgroups: comp.compilers

Date: 27 Oct 1999 14:06:41 -0400

Organization: CWI, Amsterdam

References: 99-10-017 99-10-093

Keywords: arithmetic, performance

*>Yes, but unfortunately not all divisions by a constant can be treated*

*>like that... A problem occurs when the error after multiplication*

*>could be greater than the pre-computed value you are multiplying with,*

*>forcing a correction step.*

Torbjörn Granlund and I published

`Division by Invariant Integers using Multiplication'

in the 1994 PLDI (Proceedings of the Conference on Programming Language

Design and Implementation). See SIGPLAN Notices, June 1994.

Yes, unsigned division by 3 needs a 33-bit constant,

c3 = (2^34 + 2)/3 = 0x155555556 (hexadecimal). If 0 <= x < 2^32, then

FLOOR(x/3) = FLOOR(c3 * x / 2^34)
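
This identity can be spot-checked with arbitrary-precision integers. The Python sketch below is only a sanity check and is not taken from the paper; the names C3 and div3_wide are made up for illustration, and the choice of sample points is arbitrary.

```python
import random

# 33-bit multiplier from the text: c3 = (2^34 + 2) / 3
C3 = (2**34 + 2) // 3


def div3_wide(x):
    """FLOOR(c3 * x / 2^34), computed with unbounded Python integers."""
    return (C3 * x) >> 34


# Edge cases plus a random sample of 32-bit inputs.
# (Checking all 2^32 inputs is possible but slow in pure Python.)
samples = [0, 1, 2, 3, 2**31 - 1, 2**31, 2**32 - 2, 2**32 - 1]
samples += [random.randrange(2**32) for _ in range(10000)]
for x in samples:
    assert div3_wide(x) == x // 3
```

Note that the product c3 * x can need up to 65 bits, which is why the check relies on Python's unbounded integers rather than a 64-bit type.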

In this case the multiplier c3 is even.

Reduce the fraction c3 / 2^34 to (c3/2) / 2^33.

The revised multiplier c3/2 fits in 32 bits.

Use the upper half of the unsigned integer multiply,

which I'll call MULUH, to multiply by (c3/2) / 2^32.

Shift right to divide by 2 again:

FLOOR(x/3) = MULUH(c3/2, x) >> 1

This is two instructions once the constant c3/2 is in a register.

There are no branches.
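
A minimal Python model of this two-instruction sequence, assuming MULUH returns the upper 32 bits of the 64-bit unsigned product. The helper names muluh, C3_HALF, and div3 are illustrative, not from the paper.

```python
MASK32 = 0xFFFFFFFF


def muluh(a, b):
    """Upper 32 bits of the 64-bit product of two unsigned 32-bit values."""
    assert 0 <= a <= MASK32 and 0 <= b <= MASK32
    return (a * b) >> 32


# c3/2 = 0xAAAAAAAB, which fits in a 32-bit register
C3_HALF = ((2**34 + 2) // 3) // 2


def div3(x):
    """FLOOR(x/3) for 0 <= x < 2^32: one MULUH followed by one right shift."""
    return muluh(C3_HALF, x) >> 1
```

Here the product never exceeds 64 bits, so the same sequence maps directly onto a 32x32 -> 64 multiply on real hardware.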

When the 33-bit multiplier is odd, the code is

longer. We first rewrite

FLOOR(c3 * x / 2^34) = FLOOR(FLOOR(c3 * x / 2^32) / 4)

The product FLOOR(c3 * x / 2^32) can exceed 2^32, so it need not fit in a 32-bit register. We replace it by

x + FLOOR((c3 - 2^32) * x / 2^32)

The constant c3 - 2^32 fits in 32 bits. Hence

FLOOR(x/3) = FLOOR((x + t1) / 4)

where t1 = MULUH(c3 - 2^32, x). To avoid overflow in the

(x + t1) / 4 computation, observe 0 <= t1 <= x. Hence

FLOOR((x + t1)/4) = FLOOR(((x - t1) + 2*t1) / 4)

= FLOOR((FLOOR((x - t1)/2) + t1) / 2)

Besides the MULUH, we need two shifts, an addition, and a subtraction.
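
The longer sequence can be modelled the same way. The Python sketch below is an illustration under stated assumptions, not code from the paper: div_long, C3_LOW, M7_LOW, div3_long, and div7 are hypothetical names, and the divisor 7 (whose 33-bit multiplier CEIL(2^35/7) = 0x124924925 is odd) is chosen here only as an example of the odd case. The assertion inside div_long checks that no intermediate value leaves the 32-bit range.

```python
MASK32 = 0xFFFFFFFF


def muluh(a, b):
    """Upper 32 bits of the 64-bit product of two unsigned 32-bit values."""
    return (a * b) >> 32


def div_long(x, m_low, shift):
    """FLOOR(m * x / 2^(32 + shift)) for a 33-bit multiplier m = 2^32 + m_low.

    Uses one MULUH, a subtraction, an addition, and two shifts.
    """
    t1 = muluh(m_low, x)          # 0 <= t1 <= x because m_low < 2^32
    half = (x - t1) >> 1          # no underflow, since t1 <= x
    s = half + t1                 # equals FLOOR((x + t1) / 2), at most x
    assert 0 <= t1 <= x and s <= MASK32
    return s >> (shift - 1)


# Divisor 3, following the text: c3 - 2^32 = 0x55555556, total shift 2
C3_LOW = (2**34 + 2) // 3 - 2**32


def div3_long(x):
    return div_long(x, C3_LOW, 2)


# Divisor 7 (example chosen for this sketch): CEIL(2^35 / 7) - 2^32
M7_LOW = (2**35 + 6) // 7 - 2**32


def div7(x):
    return div_long(x, M7_LOW, 3)
```

For divisor 3 the shorter MULUH-and-shift form is preferable; the long form is shown for it only because the text derives it with c3.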

The paper covers more topics, such as two forms of signed division

(round towards zero, round towards -infinity).

It is available in ftp.cwi.nl:/pub/pmontgom/divcnst.ps{a4,l}.gz .

--

Peter-Lawrence.Montgomery@cwi.nl Home: San Rafael, California

Microsoft Research and CWI
