# Re: Modulo optimizations

## George Russell <ger@informatik.uni-bremen.de>14 Oct 1999 01:21:21 -0400

From comp.compilers

Related articles
Modulo optimizations gmarshal@globalnet.co.uk (Graham Marshall) (1999-10-04)
Re: Modulo optimizations jgk@jgk.org (Joe Keane) (1999-10-11)
Re: Modulo optimizations torbenm@diku.dk (1999-10-13)
Re: Modulo optimizations chase@world.std.com (David Chase) (1999-10-13)
Re: Modulo optimizations ger@informatik.uni-bremen.de (George Russell) (1999-10-14)
Re: Modulo optimizations harley@corton.inria.fr (Robert Harley) (1999-10-14)
Re:Modulo optimizations Wilco.Dijkstra@arm.com (Wilco Dijkstra) (1999-10-19)
Re: Modulo optimizations harley@corton.inria.fr (Robert Harley) (1999-10-21)
Re: Modulo optimizations Peter-Lawrence.Montgomery@cwi.nl (1999-10-27)
| List of all articles for this month |

 From: George Russell Newsgroups: comp.compilers Date: 14 Oct 1999 01:21:21 -0400 Organization: Universitaet Bremen, Germany References: 99-10-017 99-10-056 Keywords: arithmetic, optimize

Torben AEgidius Mogensen wrote:
[snip]
> For machines without division hardware, you can also do fast(ish)
> modulo (2^n-1), using a method similar to the way we find modulo 3 or
> 9 of decimal numbers by summing digits. This also is for unsigned
> numbers only. For x%3, you add all bit-pairs and reduce this sum again
> in the same way until you are down to two bits. You can do some of the
> summing in parallel, i.e.: ...

The Alpha does not have integer divide. However as one of the manuals
points out (I forget the exact URL), you can for unsigned integers
implement division by any constant as an integer multiplication
followed by a shift. For example, to divide a 32 bit integer by a 32
bit integer you do a multiplication by a pre-computed number
(producing a 64 bit result) followed by a right shift.

You don't even have to do the multiplication, because there is also a
way of replacing a multiplication of a constant by a pre-computed
sequence of a few (maybe 4 or 5) shift-and-add or shift-and-subtract
operators.

I expect the same goes for many RISC processors. Try getting gcc to
compile a (x%N) for a few random constant N's, with x unsigned, and
look at the output, and maybe you'll see what I mean . . .

Post a followup to this message