Related articles |
---|
Subtraction + comparison in one asm instruction? vincent+news@vinc17.org (Vincent Lefevre) (2002-08-10) |
Re: Subtraction + comparison in one asm instruction? Peter-Lawrence.Montgomery@cwi.nl (Peter L. Montgomery) (2002-08-14) |
Re: Subtraction + comparison in one asm instruction? iddw@hotmail.com (Dave Hansen) (2002-08-14) |
Re: Subtraction + comparison in one asm instruction? walter@bytecraft.com (Walter Banks) (2002-08-23) |
Re: Subtraction + comparison in one asm instruction? vincent+news@vinc17.org (Vincent Lefevre) (2002-08-23) |
Re: Subtraction + comparison in one asm instruction? gdr@soliton.integrable-solutions.net (Gabriel Dos Reis) (2002-09-03) |
Re: Subtraction + comparison in one asm instruction? vincent+news@vinc17.org (Vincent Lefevre) (2002-09-08) |
Re: Subtraction + comparison in one asm instruction? gdr@integrable-solutions.net (Gabriel Dos Reis) (2002-09-12) |
[11 later articles] |
From: | "Peter L. Montgomery" <Peter-Lawrence.Montgomery@cwi.nl> |
Newsgroups: | comp.compilers |
Date: | 14 Aug 2002 02:15:33 -0400 |
Organization: | CWI, Amsterdam |
References: | 02-08-033 |
Keywords: | architecture |
Posted-Date: | 14 Aug 2002 02:15:33 EDT |
"Vincent Lefevre" <vincent+news@vinc17.org> writes:
>Several processors (e.g. ARM, Sparc) can do a subtraction and the
>associated comparison in only one instruction. Basically, this is done
>by performing the subtraction and setting the flags.
>
>However, it seems that some/most C compilers don't take advantage of
>this and do both a subtraction and a comparison. This can be seen on
>the code
>
>int d;
>int f(int c)
>{
> d = c - 1;
> return c > 1;
>}
>
>or when writing c - 1 > 0 instead of c > 1. Each time, a comparison
>is performed separately. A practical use would be a loop of the form:
>
> while (--c > 0)
> { ... }
>
>(where c is a signed integer).
>
>Are there compilers that do the subtraction and the comparison in
>the same asm instruction (for the processors for which there is a
>difference)? For the other compilers, have there been discussions,
>plans to add such an optimization, etc?
/*
Be careful while implementing this optimization.
While trying to do the modular multiplication (a * b) mod p,
where a, b, p are close to the machine word size,
I have experienced bad code for the following sequence,
using one Fortran and one C compiler:
Assume we have a 32-bit machine.
*/
unsigned int a, b, p, q;
signed int rem;
(initialize a, b, p, with 0 <= a,b < p < 2^31);
q = (estimate of a*b/p)
rem = a*b - q*p;
if (rem < 0) rem += p;
return (unsigned)rem;
/*
Somehow (not relevant to this discussion) we determine an
approximate quotient q satisfying q - 1 <= a*b/p < q + 1
(where a*b/p denotes the exact rational quotient).
This implies -p <= a*b - q*p < p, so rem = a*b - q*p
satisfies -2^31 < -p <= rem < p < 2^31.
We use arithmetic modulo 2^32 to compute this rem exactly
(result cast from unsigned to signed).
The true (a*b) mod p is either rem or rem+p.
The mistake has been to evaluate (rem < 0) by
looking at the borrow from a*b - q*p. If, for example,
a*b ends in 0x00000001
q*p ends in 0xffffffff
then a*b - q*p ends in 0x00000002 and should be considered positive
despite the borrow during the unsigned subtract.
Looking at the negative flag (rather than the borrow flag)
gives correct code. I changed my test (rem < 0)
to (rem <= -1) to circumvent these bugs.
Return to the
comp.compilers page.
Search the
comp.compilers archives again.