Wed, 21 Apr 1993 13:49:45 GMT

Related articles |
---|

Optimal code sequences for signed int div/rem by powers of 2 S_JUFFA@IRAV1.ira.uka.de (1993-04-21) |

Re: Optimal code sequences for signed int div/rem by powers of 2 markt@harlqn.co.uk (1993-04-23) |

Newsgroups: | comp.arch,comp.compilers |

From: | S_JUFFA@IRAV1.ira.uka.de (|S| Norbert Juffa) |

Keywords: | arithmetic, optimize |

Organization: | University of Karlsruhe, FRG |

Date: | Wed, 21 Apr 1993 13:49:45 GMT |

This article is somewhat of a followup to the recent discussion about

integer division here comp.arch. I am also cross-posting to comp.compilers

because this stuff may be relevant for people involved with compiler code

generation.

Many modern RISC CPUs do not have an integer division instruction (SPARC

V7, Alpha), have only a division step instruction (AMD 29K), or have a

rather slow HW-division (microSPARC). Even if there is a division

instruction, it may fail to produce a remainder (SPARC V8).

Therefore, people are looking for alternatives to using division. Fast

alternatives are especially feasible if the divisor is known at compile

time (e.g. multiplication by reciprocal of divisor). Quite a few posts in

the recent discussion were involved with speeding up division by powers of

two known at compile time. This is really easy when dealing with unsigned

integers (-> shift right), but requires some correction steps for signed

integers.

Checking compiler output for 80x86 produced by Microsoft C 7.0 and for

SPARC by gcc 2.2.3 and the Solaris cc compiler, I find they don't use all

the optimizations possible for signed integers when the divisor is a power

of two. Especially, when the divisor is a negated power of two (i.e.

-2^i), all of these compilers will resort to divide and remainder

instructions or subroutines, although the shifting and masking approach

used for unsigned integers could be used with a few correction steps. The

Microsoft C 7.0 compiler doesn't optimize any division/remainder by 2^i or

-2^i for signed integers.

I wonder what the fastest instruction sequences for signed integer

division by +/- 2^n are for different machines. I know from the discussion

there are even machines that have signed integer division by powers of two

in HW (i960).

I am including the routines I came up with for 80x86 and SPARC for doing

signed integer division/remainder by +/- 2^i.

Norbert Juffa (s_juffa@iravcl.ira.uka.de)

CODE FOR Intel 80x86 (can easily be changed to 32-bit version for >= 386)

=========================================================================

/ 2: CMP AX, 8000h ; CY = 1, if dividend >= 0

SBB AX, -1 ; inc AX, if dividend < 0

SAR AX, 1 ; now do right shift

/ 2^n: CWD ; DX = FFFFh if dividend < 0

AND DX, 2^n-1 ; mask correction

ADD AX, DX ; apply correction if necessary

SAR AX, n ; now do right shift

/ -2: CMP AX, 8000h ; CY = 1, if dividend >= 0

SBB AX, -1 ; inc AX, if dividend < 0

SAR AX, 1 ; now do right shift

NEG AX ; use (x div -2) = - (x div 2)

/ -2^n: CWD ; DX = FFFFh if dividend < 0

AND DX, 2^n-1 ; mask correction

ADD AX, DX ; apply correction if necessary

SAR AX, n ; now do right shift

NEG AX ; use (x div -2^n) = - (x div 2^n)

% 2, % -2: CWD ; generate flag, FFFFh id divd < 0, else 0

AND AX, 1 ; mask out remainder

XOR AX, DX ; negate

SUB AX, DX ; remainder if dividend < 0

% 2^n, % -2^n: CWD ; generate mask, FFFFh if divd < 0, else 0

AND DX, 2^n-1 ; mask correction

ADD AX, DX ; apply pre-correction if necessary

AND AX, 2^n-1 ; mask out remainder

SUB AX, DX ; apply post-correction if necessary

CODE FOR SPARC

==============

Dividing a signed integer by a positive power of two (1<<n) known at

compile time.

1) for i/2: addcc %o1,%o1,%g0 ! carry if %o1 < 0

addx %o1,%g0,%o1 ! inc %o1, if %o1 < 0

sra %o1,1,%o1 ! do the shift

Although this code sequence uses three instructions, just like what is

produced by gcc now, it has the advantage that it doesn't need an

additional register. It has the disadvantage of destroying the

condition codes.

2) for i/(1<<n): sra %o1,31,%o2 ! %o2 = 0xffffffff if %o1 < 0

srl %o2,32-n,%o2 ! (1<<n)-1, if %o1<0, else 0

add %o1,%o2,%o1 ! apply correction

sra %o1,n,%o1 ! do the shift

The advantage of this sequence is that it doesn't use a branch and has

no problem with n>=13 since it doesn't use immediate constants for the

correction step, also it doesn't destroy the condition codes.

Dividing a signed integer by a negated positive power of two (-(1<<n))

should make use of the identity x / (-(1<<n)) = -(x / (1<<n)) and then

apply the code given above. Currently, gcc 2.2.3 generates calls to .div

1) for i/-2: addcc %o1,%o1,%g0 ! carry if %o1 < 0

addx %o1,%g0,%o1 ! inc %o1, if %o1 < 0

sra %o1,1,%o1 ! i / 2

neg %o1 ! -(i / 2)

2) for i/(-(1<<n)):sra %o1,31,%o2 ! %o2 = 0xffffffff if %o1 < 0

srl %o2,32-n,%o2 ! (1<<n)-1, if %o1<0, else 0

add %o1,%o2,%o1 ! apply correction

sra %o1,n,%o1 ! i / (1<<n)

neg %o1 ! -(i / (1<<n))

Computing the remainder (% operator) of a division of a signed integer by

a positive power of two (1<<n) known at compile time. Currently, gcc uses

calls to .rem.

1) for i%2, i%(/-2):

sra %o1,31,%o2 ! 0xFFFFFFFF, if %o < 0, else 0

and %o1,1,%o1 ! mask out remainder

xor %o1,%o2,%o1 ! negate remainder if quotient

sub %o1,%o2,%o1 ! negative (sign(quot)=sign(rem)!)

2) for i%(1<<n), i%(-(1<<n)):

sub %g0,1,%o2 ! 0xffffffff

srl %o2,32-n,%o2 ! (1<<n)-1

sra %o1,31,%o3 ! 0xffffffff, if %o1 < 0, else 0

and %o3,%o2,%o3 ! (1<<n)-1, if %o1 < 0, else 0

add %o1,%o3,%o1 ! apply correction if necessary

and %o1,%o2,%o1 ! mask out remainder bits

sub %o1,%o3,%o1 ! apply correction if necessary

This instruction sequence is branch free, doesn't destroy the condition

codes and has no problem with n>=13, since it doesn't use immediate

constants in the correction step.

--

Post a followup to this message

Return to the
comp.compilers page.

Search the
comp.compilers archives again.