Tue, 27 Oct 1992 20:46:18 GMT

Related articles |
---|

multiplication by constant - quick comparison of algorithms wu@sequent.com (Youfeng Wu) (1992-10-21) |

Re: multiplication by constant - quick comparison of algorithms preston@helena.cs.rice.edu (1992-10-21) |

Re: multiplication by constant - quick comparison of algorithms bgb@iexist.att.com (1992-10-25) |

Re: multiplication by constant - quick comparison of algorithms chased@rbbb.Eng.Sun.COM (1992-10-27) |

Re: multiplication by constant - quick comparison of algorithms joe@babel.ho.att.com (1992-10-28) |

Newsgroups: | comp.compilers |

From: | chased@rbbb.Eng.Sun.COM (David Chase) |

Organization: | Sun Microsystems, Mt. View, Ca. |

Date: | Tue, 27 Oct 1992 20:46:18 GMT |

Keywords: | arithmetic, optimize |

References: | 92-10-079 92-10-095 |

wu@sequent.com (Youfeng Wu):

*> For all numbers in [2, 10000], there is a total number of 64612 one's in*

*> their binary represenations, and the following compares the total number*

*> of add/sub/shift instructions used to perform all of the multiplications*

bgb@iexist.att.com (Brian G Beuning) writes:

*>Isn't it a little dangerous to use sub? Won't it overflow sometimes when*

*>the multiply would not.*

It all depends on the machine. Some machines (SPARC and RS6000 and ARM,

e.g.) have non cc-setting variants of their arithmetic instructions. Even

if an overflow condition code is set, you're free to ignore it. The only

place this is a problem is when "overflow" causes a trap.

The miracle of two's complement arithmetic lets you come up with some

entertaining sequences for some multiplications. It depends a lot on your

heuristics for guiding the search, too. For instance, my multiply

converter persists in choosing to take a first step of

106 = 1 - (-105)

when processing 106*x. I think the reason is that -105 has fewer

transitions between 1 and 0 in it than 105, and thus it is preferred as a

first step.

One reason for avoiding some of the more bizarre sequences is that they

lead the search algorithm far afield, and make it take too long. It's

also very tricky to get just the right search algorithm when exploiting

these tricky overflow properties, because you need to watch out for both

signed and unsigned products when factoring. For example, one algorithm

that I tinkered with years past at another job would reduce (-180229)*x in

3 steps:

Rb = Ra + (Ra << 16); /* 65537 <- 1, step 4, shift 16 */

Rb = Ra + (Rb << 2); /* 262149 <- 65537, step 4, shift 2 */

Rb = (Rb << 14) - Rb; /* -180229 <- 262149, step 2, shift 14 */

but only found 4 steps for 180229*x

Rb = Ra + (Ra << 2); /* 5 <- 1, step 4, shift 2 */

Rb = Ra + (Rb << 1); /* 11 <- 5, step 4, shift 1 */

Rb = Ra + (Rb << 12); /* 45057 <- 11, step 4, shift 12 */

Rb = Ra + (Rb << 2); /* 180229 <- 45057, step 4, shift 2 */

missing the obvious reversal of the last step in the negative

multiplication. (The number of cases where this happens is not large.)

Note the factorizations required to do the steps:

-180229 tc= 4294787067 = 262149 * 16383 = 0xfffd3ffb

(use an unsigned factorization)

180229 = 262149 * (-16383) = 0x0002c005

(overflow city - the sign bits are gone)

In this case, when searching factors of 180229, it makes sense to search

for factors of -180229, and a factor is of the form E1 - E2, then that

means that E2 - E1 can be used instead.

This makes the search even more expensive, since you are considering more

cases at each step, and each step is more expensive. In practice, I don't

think it is worth it in this case (that number didn't drop out of the air

-- I constructed it so that -180229 was in fact equal to (65537 * 4 + 1) *

16383, and also negative.)

A more interesting negative number (for students of such trivia) is

-2147270651. It reduces in three steps, with no negatable steps involved:

I = -2147270651 (10000000000000110100000000000101)

Rb = Ra + (Ra << 15); /* 32769 <- 1, step 4, shift 15 */

Rb = Ra + (Rb << 2); /* 131077 <- 32769, step 4, shift 2 */

Rb = (Rb << 14) + Rb; /* -2147270651 <- 131077, step 1, shift 14 */

I have not yet discovered a 3-step sequence for 2147270651. Again, this

number was found by construction: (32769 * 4 + 1) * 16385.

Disclaimer: Note well that these sequences may only work on 32-bit two's

complement machines.

David Chase

Sun

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.