# Re: Modulo optimizations

## Joe Keane <jgk@jgk.org>11 Oct 1999 02:36:15 -0400

From comp.compilers

Related articles
Modulo optimizations gmarshal@globalnet.co.uk (Graham Marshall) (1999-10-04)
Re: Modulo optimizations jgk@jgk.org (Joe Keane) (1999-10-11)
Re: Modulo optimizations torbenm@diku.dk (1999-10-13)
Re: Modulo optimizations chase@world.std.com (David Chase) (1999-10-13)
Re: Modulo optimizations ger@informatik.uni-bremen.de (George Russell) (1999-10-14)
Re: Modulo optimizations harley@corton.inria.fr (Robert Harley) (1999-10-14)
Re:Modulo optimizations Wilco.Dijkstra@arm.com (Wilco Dijkstra) (1999-10-19)
Re: Modulo optimizations harley@corton.inria.fr (Robert Harley) (1999-10-21)
[1 later articles]
| List of all articles for this month |

 From: Joe Keane Newsgroups: comp.compilers Date: 11 Oct 1999 02:36:15 -0400 Organization: none References: 99-10-017 Keywords: arithmetic, performance

Graham Marshall <gmarshal@globalnet.co.uk> writes:
>I have a computation in my code which does a modulo divide:
>
> x = y % z;

First, no, the compiler doesn't optimize it.

I haven't seen it done, and frankly, i'd stay away from any compiler
that attempted it. That is getting too smart for me.

If the nothing is known about the divisor `z', the best thing is the
usual remainder operation, whatever that is. Without some knowledge
of the distribution, trying to optimize it only slows things down.

Second, if you have some knowledge, it is worth doing yourself.

In the past, i've gained speed-up from optimizing multiply, like so:

x = y == 1 ? z : y * z;

That looks dumb, but it's saying something important, that is,
`y == 1' is likely enough that it's worth a separate case. Again I don't
want the compiler to take any multiply and do this, it would be wrong.

I go `by the numbers' and that change has produced some noticeable
speed-ups, but it's mainly for special cases.

Now on recent machines multiply has gotten fairly fast, to the point
where i probably don't bother with something like that.

However, to disagree a bit with our moderator, divide and remainder
are still quite slow and worth paying attention to. I think people
don't understand this enough and they tend to miss it entirely.

I say you can do several lookups in a hash table, if it's well
written, hitting in cache, and so on, in the time it takes to do that
one remainder operation.

I can't tell you what to do exactly, it depends on what you know.

Maybe you want something like this:

x = z == 4 ? y % 4 : y % z;

Or maybe this is the right thing:

x = ISPOW(z) ? y & (z - 1) : y % z;

The cost of missed branches is something to keep in mind. Maybe this
code is run with the same value of z many times, then we expect that
prediction works well. Or maybe which case is taken is more or less
random, then it might just confuse the machine.

I think most readers understand that this is only for special cases,
but to be clear, this is only for special cases.

First, it has to be critical code, some function that you know takes a
decent chunk of CPU, where optimizing it produces noticeable benefit
to the application.

Second, you have to know that certain values are common for some
reason. You don't say `you never know, maybe z could be 4, so let's
make that faster'. There has to be some reason.

And further, i don't trust just thinking about it. If i think adding a
special case may be useful, i add some code that counts various cases,
and run a benchmark. Often the results are unexpected. Anyway, you
have some numbers, some real data on the frequency of various cases.

I think these comments emphasize why compilers don't do such things
automatically, and why i don't want them to either.

--
Joe Keane, amateur mathematician

Post a followup to this message