Fri, 3 Apr 1992 00:58:21 GMT

Related articles |
---|

Is there a way of generating code for x/const without division? jeremy@sw.oz.au (1992-04-02) |

Re: Is there a way of generating code for x/const without division? suitti@ima.isc.com (1992-04-03) |

Re: Is there a way of generating code for x/const without division? moss@cs.umass.edu (1992-04-03) |

Re: Is there a way of generating code for x/const without division? markt@harlqn.co.uk (1992-04-08) |

Re: Is there a way of generating code for x/const without division? haddad@pa.dec.com (1992-04-07) |

Newsgroups: | comp.compilers,comp.programming,comp.lang.c |

From: | suitti@ima.isc.com (Stephen Uitti) |

Keywords: | code, arithmetic, optimize |

Organization: | Interactive Systems, Cambridge, MA 02138-5302 |

References: | 92-04-018 |

Date: | Fri, 3 Apr 1992 00:58:21 GMT |

In article 92-04-018 jeremy@sw.oz.au (Jeremy Fitzhardinge) writes:

*>I've been playing with a simple code generator for RISC CPUs. One of the*

*>simple peephole optimisations it does is to replace multiplies by*

*>constants with a series of shift/adds rather than using the general*

*>multiplication.*

*>*

*>I'm wondering whether a similar thing can be done for divide and modulo,*

*>such that they are implemented by the code generator as a sequence of*

*>simple instructions that is more efficient (in time) than the more general*

*>way of performing divide/modulo.*

The code to use will vary from CPU to CPU. For a z80, divide is not

supplied. Neither is multiply. Divide can be very painful. Ten years

ago, I implemented a 32 bit add, subtract, multiply & divide subroutine

set. Later, I used it for printing decimal numbers. Divide by ten using

the divide routine was so slow (on a 2 MHz z80) that it couldn't keep

ahead of a 180 cps printer. I resorted to a table of constants: 10, 100,

1000, ..., 1000000000, and did adds and subtracts of these things,

counting the number of times needed to cross zero. I even switched to 16

bit arithmetic when the result became 10,000 or less. This was lots

faster, and certainly fast enough. In my case it had the aditional

advantage of yielding digits sooner (left to right), so that they could be

entered into the output queue before the computation of the entire number

was completed. However, with a CPU that has integer divide, even if

divide takes 30 times as long as add, it is still probably faster to do

the divide.

For unsigned integers, divide by constant-power-of-two can use right

shift. This will probably be fastest on any CPU, even a z80. One could

imagine, say, a vector computer that has vector multiply, but not vector

shift...

For Dec's Alpha, there is a double precision multiply that can be used for

divide-by-constant. You multiply by some sort of reciprocal and the

answer is in half of the multiply's answer. This might be handy for 16

bit divides where 32 bit multiplies are available. I haven't seen anyone

do this for common workstation class CPUs. It may be quicker even if both

multiply and divide are available in hardware, since multiply tends to be

faster than divide. It may be a real win on, say, Sun's Spark machines,

since divides are "helped" by a "divide step" instruction, which must be

repeated multiple times.

If you can do a divide with multiply, you should be able to use another

multiply and subtract to get modulus. Again, this may or may not be worth

it.

Stephen.

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.