Related articles |
---|
Rounding with Div and Mod operators - a summary william.rayer@virgin.net (William Rayer) (1999-07-05) |
From: | "William Rayer" <william.rayer@virgin.net> |
Newsgroups: | comp.compilers |
Date: | 5 Jul 1999 13:43:26 -0400 |
Organization: | Virgin News Service |
Keywords: | arithmetic, summary |
Dear Newsgroup
In May I posted a question about the best way of rounding when using
the Div and Mod operators. I want to thank everyone who responded,
particularly Jonathan Barker who explained the expected mathematical
behaviour of these operators.
I'm submitting these notes to the newsgroup in case it helps anyone
else, and I'm happy for them to be added to the FAQ if the moderator
thinks this would be useful. These notes will also be added to the
"Design Decisions" appendix in the Reference Manual for the new
language, which explains why they go into a lot of detail.
In brief (for those too busy to read beyond this paragraph!) Div and
Mod should always round down as this is the correct mathematical
behaviour, however this incurs a performance penalty on the Intel
80x86 CPU. To reduce the cost assembler routines are provided. Using
these routines Div and Mod cost ~80 cycles instead of ~40.
Appendix 8.2 - Design Decisions - Div and Mod operators
-------------------------------------------------------
The Div and Mod operators are used for integer division and integer
remainder (modulus). Existing languages have several interpretations
of these operators so any new language must fully define its operators
for positive and negative operands.
Existing Systems
----------------
For an integer numerator (n) and integer divisor (d) with
d <> 0, quotient (q) and remainder (r) all implementations
agree on the following:
q = n Div d
r = n Mod d
n = q*d + r
|r| < |d|
and by rearrangement:
n Mod d = n - (n Div d) * d
These rules state Div and Mod produce a quotient and a
remainder, which can be put back into the formula q*d + r
to generate the original number. Also the absolute value of
the remainder is less than the absolute value of the divisor.
These rules do not fully define the behaviour of Div and
Mod as there are several rounding systems compatible with
the rules:
Div rounds down Div rounds to 0 n Mod d >= 0
<--------------> <--------------> <-------------->
n d n Div d n Mod d n Div d n Mod d n Div d n Mod d
----------------------------------------------------------------
6 3 2 0 2 0 2 0
5 3 1 2 1 2 1 2
4 3 1 1 1 1 1 1
3 3 1 0 1 0 1 0
2 3 0 2 0 2 0 2
1 3 0 1 0 1 0 1
0 3 0 0 0 0 0 0
-1 3 -1 2 0 -1 -1 2
-2 3 -1 1 0 -2 -1 1
-3 3 -1 0 -1 0 -1 0
-4 3 -2 2 -1 -1 -2 2
-5 3 -2 1 -1 -2 -2 1
-6 3 -2 0 -2 0 -2 0
6 -3 -2 0 -2 0 -2 0
5 -3 -2 -1 -1 2 -1 2
4 -3 -2 -2 -1 1 -1 1
3 -3 -1 0 -1 0 -1 0
2 -3 -1 -1 0 2 0 2
1 -3 -1 -2 0 1 0 1
0 -3 0 0 0 0 0 0
-1 -3 0 -1 0 -1 1 2
-2 -3 0 -2 0 -2 1 1
-3 -3 1 0 1 0 1 0
-4 -3 1 -1 1 -1 2 2
-5 -3 1 -2 1 -2 2 1
-6 -3 2 0 2 0 2 0
----------------------------------------------------------------
Div rounds down. In this system Div always rounds in a negative
direction, unless the result is exact. This is known as "floored
division". The sign of n Mod d equals the sign of d. If there is
a floor() or int() operator which rounds real numbers in a
negative direction, n Div d = Int(n/d) = floor(n/d). This
rounding system is used by Ada with its "divide out completely"
operator and Mod operator, and by Modula 3.
Div rounds to zero. In this system Div rounds towards zero,
unless the result is exact. This is known as "symmetric division".
The sign of n Mod d equals the sign of n. Also -n Div d = n Div -d
= -(n Div d). This rounding system is used by Ada with its integer
division operator "/" and Rem operator. C and Pascal also use
this rounding system on the Intel 80x86, although this behaviour
is not formally defined for these languages. The signed division
operator in the Intel 80x86 instruction set (idiv) follows this
system.
n Mod d >= 0. In this system the modulus is always >= zero.
Although equally valid, this system is not used in any
languages known to the author.
Mathematical definitions of Div and Mod
---------------------------------------
We have described several different rounding systems. Which of
these systems are in agreement with mathematical number theory?
Number theorists define integer division and remainder operators
for any n and d > 0 as follows:
quotient q = n Div d
remainder r = n Mod d
n = q*d + r
0 <= r <= d-1
Mathematically speaking the Mod operator defines which of d
classes an integer n falls into. Class [0] denotes integers in
the form n = qd, Class [1] denotes integers n = qd + 1, and
so on up to Class [d-1] for integers n = qd + (d-1). For
example -d, 0 and d are in Class [0], -d+1, 1, d+1 are in
Class [1], and -1, d-1 are in Class [d-1]. Since in this
definition the classes are numbered 0 to d-1, and since the
Mod operator denotes a class, Mod must return an integer
0 to d-1. Thus n Mod d, for any integer n, is always >= 0.
Therefore a rounding system conformant with number theory
should observe the following:
quotient q = n Div d
remainder r = n Mod d
n = q*d + r
If d > 0 then 0 <= r <= d-1
If d < 0 then 0 >= r >= d+1
The last rule (for d<0) is added for symmetry. Number theory
does not cover values of d < 0, so for these values an
implementation is free to return any remainder r consistent
with n = qd + r and |r| < |d|.
Also in number theory the identity -n Div d = n Div -d =
-(n Div d) does not apply because the integers do not form
a field under division. Since integers do not have
multiplicative inverses the division operator is not
fully defined for integers. Therefore the mathematically
correct rules of Div and Mod do not conform to this
identity.
Proposed Implementation
-----------------------
The new language follows the rounding system defined by
the mathematical rules, since this is the behaviour expected
by number theory and there is no compelling reason to
adopt an alternate system. The rules specify r >= 0
for any value of n and for d > 0, which is equivalent
to Mod returning values >= 0 for d > 0, and <= 0 for
d < 0. In this rule the sign of n Mod d = the sign of d
and Div always rounds down.
This differs from C and Pascal on the Intel 80x86 which
round to zero. Although most languages round to zero, and
the signed division opcode on most CPUs round to zero,
this behaviour does not conform to number theory and
need not be followed by a new language. The proposed
implementation therefore follows the mathematical
definitions outlined previously.
The cost of this decision is Div and Mod can no longer
be implemented with a single opcode in the 80x86 instruction
set, and an assembler routine is required instead:
;Implementation of Div and Mod rounding downwards.
;Inputs n=numerator, d=divisor
;Calculates q=quotient, r=remainder
;Uses eax, ecx and edx registers
;if d<0 n=n-1 and make d positive
mov eax,n
mov ecx,d
cmp ecx,0
jge @d_ge_zero
dec eax
neg ecx
@d_ge_zero:
;if (adjusted) n<0 make n positive
cmp eax,0
jge @adj_n_ge_zero
inc eax
neg eax
@adj_n_ge_zero:
;numerator (>=0) in edx:eax, divisor (>0) in ecx, quotient -> eax
mov edx,0
idiv ecx
;if n=0 div is done
mov edx,n
cmp edx,0
je @div_done
;if n,d have opposite signs make q negative
xor edx,d
cmp edx,0
jge @div_done
neg eax
dec eax
;save q
@div_done:
mov ecx,q
mov [ecx],eax
;Now calculate the modulus (quotient is in eax)
imul d
mov ecx,n
sub ecx,eax
mov eax,ecx
mov edx,r
mov [edx],eax
These routines may seem a heavy penalty to pay for a
mathematically correct definition. However after examining
the instruction timings idiv dwarfs all the other opcodes.
On the 486 or Pentium idiv is about 40 cycles, imul is
14, and mov, neg, inc, and sub are 1 cycle each. Thus
although the amount of code has increased very considerably,
the execution time is approximately 80 cycles instead of 40.
Compilable implementations using Borland Delphi inline
assembler for 32 bit Windows are at the end of this
document.
Other comments
--------------
While deciding the best implementation of Div and Mod,
some other interesting questions were addressed.
Under downward rounding the identity -n Div d = n Div -d =
-(n Div d) has no basis in number theory and does not apply,
and the identity -n/d = n/-d = -(n/d) holds only for reals
and not integers. This means a language using "/" for
integer division is more or less obliged to use symmetric
rounding, as this is the only system for which the identity
-n/d = n/-d = -(n/d) applies to integers and reals. (It
would be confusing if a language used downwards rounding
and "/" for integer division as the identity would hold
for reals only). This is probably the reason why C, which
overloads "/" for integers and reals, uses symmetric
rounding.
Should a language provide several pairs of operators,
allowing both downwards and symmetric rounding? Ada does
this, using "divide out completely" and Mod for downwards
rounding, and integer division "/" and Rem for symmetric
rounding. Although superficially attractive, this option
is rejected for the following reasons:
1. The principle of clarity in language design requires
tasks to be conceptually distinct and to be carried out
using clearly differentiated commands. It is hard to
think of a greater violation of this rule than having
two similar pairs of operators, especially considering
their behaviour is identical in the 1st and 3rd quadrants
and only differs in the 2nd and 4th.
2. There is no consensus among language designers which
rounding system is preferred. So what chance has the
average programmer got of making the correct choice?
3. If language designers side step difficult choices by
passing them onto the user, the result is a language that
is more complex and less clear than needed. A new language
should make a bold decision as to which is the correct
choice, and should move forwards through simplicity and
power, rather that trying to be all things to all persons.
Define MININT as the lowest possible signed integer, which
on the Intel architecture has the sign bit set and all other
bits clear. For 32 bit integers this is 0x80000000 or
-2,147,483,648. Are -1 Mod MININT and MININT Mod -1 defined?
-1 Mod MININT. Under downwards and symmetric rounding, -1
Mod (any negative number) should = -1. -1 Mod MAXINT
(0x7fffffff) is also interesting as it should equal MAXINT-1.
MININT Mod -1. Under downwards and symmetric rounding, anything
Mod -1 should equal zero.
If a language has an operator for rounding reals to integers
the rounding should be consistent with the rounding used
for Div and Mod. For example if a language adopted downwards
rounding and used Int to convert real to integer, Int should
also round down. Then Int(n/d) = n Div d. This means Mod
can be equally defined:
n Mod d = n - (n Div d)*d
n Mod d = n - Int(n/d)*d
Implementation
--------------
function IntMod (const n,d:longint):longint;
(*
We can modify eax,ecx and edx. All other registers must be preserved.
*)
assembler; pascal;
asm
(* if d<0 n=n-1 and make d positive *)
mov eax,n
mov ecx,d
cmp ecx,0
jge @d_ge_zero
dec eax
neg ecx
@d_ge_zero:
(* if (adjusted) n<0 make n positive *)
cmp eax,0
jge @adj_n_ge_zero
inc eax
neg eax
@adj_n_ge_zero:
(* numerator (>=0) in edx:eax, divisor (>0) in ecx, quotient -> eax *)
mov edx,0
idiv ecx
(* if n=0 div is done *)
mov edx,n
cmp edx,0
je @div_done
(* if n,d have opposite signs make q negative *)
xor edx,d
cmp edx,0
jge @div_done
neg eax
dec eax
(* Now calculate the modulus and return in eax *)
@div_done:
imul d
mov ecx,n
sub ecx,eax
mov eax,ecx
end;
function IntDiv (const n,d:longint):longint;
(*
Implementation of integer division using downwards rounding.
n,d,q,r - Numerator, Divisor, Quotient and Remainder as signed integers.
The routine works by making n,d positive, doing a signed division,
then adjusting the results. We can modify eax,ecx and edx and the
function return value is stored in eax. All other registers must
be preserved.
*)
assembler; pascal;
asm
(* if d<0 n=n-1 and make d positive *)
mov eax,n
mov ecx,d
cmp ecx,0
jge @d_ge_zero
dec eax
neg ecx
@d_ge_zero:
(* if (adjusted) n<0 make n positive *)
cmp eax,0
jge @adj_n_ge_zero
inc eax
neg eax
@adj_n_ge_zero:
(* numerator (>=0) in edx:eax, divisor (>0) in ecx, quotient -> eax *)
mov edx,0
idiv ecx
(* if n=0 div is done *)
mov edx,n
cmp edx,0
je @div_done
(* if n,d have opposite signs make q negative *)
xor edx,d
cmp edx,0
jge @div_done
neg eax
dec eax
(* q is returned in eax *)
@div_done:
end;
--end--
Return to the
comp.compilers page.
Search the
comp.compilers archives again.