# Re:Modulo optimizations

## Wilco Dijkstra <Wilco.Dijkstra@arm.com>19 Oct 1999 10:57:43 -0400

From comp.compilers

Related articles
Modulo optimizations gmarshal@globalnet.co.uk (Graham Marshall) (1999-10-04)
Re: Modulo optimizations jgk@jgk.org (Joe Keane) (1999-10-11)
Re: Modulo optimizations torbenm@diku.dk (1999-10-13)
Re: Modulo optimizations chase@world.std.com (David Chase) (1999-10-13)
Re: Modulo optimizations ger@informatik.uni-bremen.de (George Russell) (1999-10-14)
Re: Modulo optimizations harley@corton.inria.fr (Robert Harley) (1999-10-14)
Re:Modulo optimizations Wilco.Dijkstra@arm.com (Wilco Dijkstra) (1999-10-19)
Re: Modulo optimizations harley@corton.inria.fr (Robert Harley) (1999-10-21)
Re: Modulo optimizations Peter-Lawrence.Montgomery@cwi.nl (1999-10-27)
| List of all articles for this month |

 From: Wilco Dijkstra Newsgroups: comp.compilers Date: 19 Oct 1999 10:57:43 -0400 Organization: Compilers Central References: 99-10-017 Keywords: arithmetic, performance

George Russell wrote:
>The Alpha does not have integer divide. However as one of the manuals
>points out (I forget the exact URL), you can for unsigned integers
>implement division by any constant as an integer multiplication
>followed by a shift. For example, to divide a 32 bit integer by a 32
>bit integer you do a multiplication by a pre-computed number
>(producing a 64 bit result) followed by a right shift.

Yes, but unfortunately not all divisions by a constant can be treated
like that... A problem occurs when the error after multiplication
could be greater than the pre-computed value you are multiplying with,
forcing a correction step.

Interestingly there exists an algorithm to calculate modulos without
doing 64 bit multiplies (posted a few years ago on comp.sys.arm):

typedef unsigned int u32;

u32 mod3(u32 x)
{
x *= 0xAAAAAAAB;
if ((x - 0x55555556) < 0x55555555) return 2;
return x >> 31;
}

It starts out like a normal division by a constant. Since 0xAAAAAAAB *
3 mod 2 ^ 32 = 1, any multiple of 3 will give a result between 0 and
0xFFFFFFFF / 3 = 0x55555555. Any x with x % 3 = 1 lies in the range
0xAAAAAAAB..0xFFFFFFFF, x % 3 = 2 in 0x55555556..0xAAAAAAAA. A range
test filters out modulo 2 results, while a shift is used to
distinguish between modulo 1 (bit 31 set) and 0 (bit 31 clear)
results.

The drawback of this method is that it is only really useful for small
constants, for larger ones the range search effectively becomes a
division itself!

Wilco

Post a followup to this message