From: hbaker@netcom.com (Henry Baker)
Newsgroups: comp.compilers,comp.dsp
Date: 8 Mar 1996 19:18:21 -0500
Organization: nil organization
References: 96-03-006 96-03-034 96-03-044
Keywords: optimize, architecture
In article 96-03-044, max@gac.edu (Max Hailperin) wrote:
> An even bigger problem, in my experience is that C/C++ are built on
> the strange notion that when you operate on two n-bit numbers, you
> get an n-bit result, even when multiplying. This doesn't make a
> whole lot of mathematical sense, and moreover the processor
> architects have in my experience always gotten it right -- they have
> instructions for multiplying two 16-bit numbers and getting a 32-bit
> product, or two 32-bit numbers and getting a 64-bit product, or
> whatever. The C compiler "hides" these from you, which can cause a
> major slowdown. In multiple-precision multiplication, which is
> generally implemented roughly as the normal grammar-school
> multiplication algorithm but with each "digit" being n bits wide
> (i.e., radix 2^n), you typically wind up with having to use half as
> large a value of n, hence twice as many "digits" in each of the
> multiplier and multiplicand, hence a four-fold slowdown (2 squared).
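To make the quoted point concrete, here is a minimal sketch of the grammar-school algorithm in C. It assumes a C99 compiler with <stdint.h>, so that a 64-bit type is available to hold the full product of two 32-bit "digits"; the name mp_mul and the little-endian digit layout are illustrative choices, not anything from the quoted article. Without access to the double-width product, the same loop would have to use 16-bit digits, doubling both na and nb and so quadrupling the number of inner-loop steps.

```c
#include <stddef.h>
#include <stdint.h>

/* Schoolbook multiplication in radix 2^32 (hypothetical helper).
 * a has na digits, b has nb digits, least significant digit first.
 * r must have room for na + nb digits. */
void mp_mul(uint32_t *r, const uint32_t *a, size_t na,
            const uint32_t *b, size_t nb)
{
    for (size_t i = 0; i < na + nb; i++)
        r[i] = 0;

    for (size_t i = 0; i < na; i++) {
        uint32_t carry = 0;
        for (size_t j = 0; j < nb; j++) {
            /* 32x32 -> 64-bit product plus two 32-bit addends;
             * the largest possible value is exactly 2^64 - 1,
             * so t cannot overflow. */
            uint64_t t = (uint64_t)a[i] * b[j] + r[i + j] + carry;
            r[i + j] = (uint32_t)t;          /* low-order digit  */
            carry = (uint32_t)(t >> 32);     /* high-order digit */
        }
        r[i + nb] = carry;                   /* not yet written by earlier rows */
    }
}
```

The cast to uint64_t before the multiply is what asks the compiler for the widening multiply instruction; whether it actually emits one depends on the compiler and target.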
I agree with you here. I've been fighting this one for years. It's
utterly amazing that a language like C, which has been used for so many
'embedded' programs, has never standardized a way to do efficient
multiple-precision arithmetic.
The compiler technology has now matured to the point where even a
relatively kludgey interface involving passing C++ references to
results can be open coded to produce excellent machine-level
instructions.
I'd say that the time has now come to standardize access to carry
information on +/- and high-order bits of *.
Perhaps the answer can be found in the previous thread about sincos,
divrem, etc.
--- Proposal (the names of the functions probably need some work):
sum(x,y,c) and difference(x,y,c) produce a structure consisting of a
low-order value and a 'carry'. These can be extracted by 'sumlo' and
'carryhi' -- e.g., sumlo(sum(x,y,c)) produces the low-order 'sum' bits
of x+y+c and carryhi(sum(x,y,c)) produces the high-order 'carry' bit.
Similarly for multiplication:
prodhi(times(x,y,c)) is the high-order bits of x*y+c and
prodlo(times(x,y,c)) is the low-order bits of x*y+c, where
x,y,c are single-precision numbers.
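As a rough sketch only, the proposed functions might look like this in C if one assumes 32-bit single-precision words and a C99 compiler that supplies a 64-bit type through <stdint.h>. The function names are the ones proposed above; the struct name wordpair, the 32-bit word size, and the choice to have carryhi(difference(x,y,c)) return the borrow bit are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical result type: low-order word plus high-order word (or carry). */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} wordpair;

/* x + y + c; hi is the carry out (0 or 1 when c is a carry bit). */
wordpair sum(uint32_t x, uint32_t y, uint32_t c)
{
    uint64_t t = (uint64_t)x + y + c;
    wordpair r = { (uint32_t)t, (uint32_t)(t >> 32) };
    return r;
}

/* x - y - c; hi is 1 if a borrow occurred, else 0 (assumed convention). */
wordpair difference(uint32_t x, uint32_t y, uint32_t c)
{
    uint64_t t = (uint64_t)x - y - c;   /* wraps modulo 2^64 on borrow */
    wordpair r = { (uint32_t)t, (uint32_t)(t >> 63) };
    return r;
}

/* x * y + c; the full result always fits in 64 bits. */
wordpair times(uint32_t x, uint32_t y, uint32_t c)
{
    uint64_t t = (uint64_t)x * y + c;
    wordpair r = { (uint32_t)t, (uint32_t)(t >> 32) };
    return r;
}

/* Accessors, named as in the proposal. */
uint32_t sumlo(wordpair p)   { return p.lo; }
uint32_t carryhi(wordpair p) { return p.hi; }
uint32_t prodlo(wordpair p)  { return p.lo; }
uint32_t prodhi(wordpair p)  { return p.hi; }
```

With functions this small, one would hope a compiler that inlines them reduces carryhi(sum(x,y,c)) to an add followed by a read of the carry flag, and prodhi(times(x,y,c)) to a single widening multiply, but that depends on the compiler and the target.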
--
www/ftp directory:
ftp://ftp.netcom.com/pub/hb/hbaker/home.html