Re: Is there a way of generating code for x/const without division?

suitti@ima.isc.com (Stephen Uitti)
Fri, 3 Apr 1992 00:58:21 GMT

          From comp.compilers

Related articles
Is there a way of generating code for x/const without division? jeremy@sw.oz.au (1992-04-02)
Re: Is there a way of generating code for x/const without division? suitti@ima.isc.com (1992-04-03)
Re: Is there a way of generating code for x/const without division? moss@cs.umass.edu (1992-04-03)
Re: Is there a way of generating code for x/const without division? markt@harlqn.co.uk (1992-04-08)
Re: Is there a way of generating code for x/const without division? haddad@pa.dec.com (1992-04-07)
| List of all articles for this month |
Newsgroups: comp.compilers,comp.programming,comp.lang.c
From: suitti@ima.isc.com (Stephen Uitti)
Keywords: code, arithmetic, optimize
Organization: Interactive Systems, Cambridge, MA 02138-5302
References: 92-04-018
Date: Fri, 3 Apr 1992 00:58:21 GMT

In article 92-04-018 jeremy@sw.oz.au (Jeremy Fitzhardinge) writes:
>I've been playing with a simple code generator for RISC CPUs. One of the
>simple peephole optimisations it does is to replace multiplies by
>constants with a series of shift/adds rather than using the general
>multiplication.
>
>I'm wondering whether a similar thing can be done for divide and modulo,
>such that they are implemented by the code generator as a sequence of
>simple instructions that is more efficient (in time) than the more general
>way of performing divide/modulo.


The code to use will vary from CPU to CPU. For a z80, divide is not
supplied. Neither is multiply. Divide can be very painful. Ten years
ago, I implemented a 32 bit add, subtract, multiply & divide subroutine
set. Later, I used it for printing decimal numbers. Divide by ten using
the divide routine was so slow (on a 2 MHz z80) that it couldn't keep
ahead of a 180 cps printer. I resorted to a table of constants: 10, 100,
1000, ..., 1000000000, and did adds and subtracts of these things,
counting the number of times needed to cross zero. I even switched to 16
bit arithmetic when the result became 10,000 or less. This was lots
faster, and certainly fast enough. In my case it had the aditional
advantage of yielding digits sooner (left to right), so that they could be
entered into the output queue before the computation of the entire number
was completed. However, with a CPU that has integer divide, even if
divide takes 30 times as long as add, it is still probably faster to do
the divide.


For unsigned integers, divide by constant-power-of-two can use right
shift. This will probably be fastest on any CPU, even a z80. One could
imagine, say, a vector computer that has vector multiply, but not vector
shift...


For Dec's Alpha, there is a double precision multiply that can be used for
divide-by-constant. You multiply by some sort of reciprocal and the
answer is in half of the multiply's answer. This might be handy for 16
bit divides where 32 bit multiplies are available. I haven't seen anyone
do this for common workstation class CPUs. It may be quicker even if both
multiply and divide are available in hardware, since multiply tends to be
faster than divide. It may be a real win on, say, Sun's Spark machines,
since divides are "helped" by a "divide step" instruction, which must be
repeated multiple times.


If you can do a divide with multiply, you should be able to use another
multiply and subtract to get modulus. Again, this may or may not be worth
it.


Stephen.
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.