Re: floating point

"Joseph D. Darcy" <darcy@CS.Berkeley.EDU>
19 Oct 1998 01:26:15 -0400

          From comp.compilers

From: "Joseph D. Darcy" <darcy@CS.Berkeley.EDU>
Newsgroups: comp.compilers
Date: 19 Oct 1998 01:26:15 -0400
Organization: Compilers Central
References: 98-09-164 98-10-018 98-10-040
Keywords: arithmetic, comment

William D Clinger wrote:
>> The basic problem is that the IEEE standard was conceived as a
>> standard for hardware, and says scarcely a word about high level
>> languages or compilers.

Providing language bindings for IEEE 754 features was recognized as an
issue during the standardization process. Members of the standards
committee felt that developing a language binding would further delay
the standardization process and perhaps jeopardize adoption of the
standard.

However, even before IEEE 754 was finally adopted, there were both
published papers on IEEE 754/language interaction [Fat, Fel] and an
implementation of a language binding, namely SANE [App]. SANE
(Standard Apple Numerics Environment) provides access to essentially all
IEEE 754 features, including rounding modes, sticky flags, and the
double extended format. The SANE specification concentrates on a Pascal
binding, but C and Fortran are also discussed.

Therefore, if previous language designers and compiler writers wanted
to provide better IEEE 754 support, there was prior art to reference
and enhance.

>> IEEE Std 754-1985 goes to great lengths to
>> ensure that IEEE floating point arithmetic is predictable at the
>> hardware level, but Kahan himself has urged compiler writers to use
>> extended precision for intermediate results, without seeming to
>> appreciate how this leads to unpredictable floating point arithmetic
>> at the language level, where almost all programmers live.

Using extended precision for intermediate results does *not*
necessarily lead to unpredictable programs. In fact, Kahan decried
the Sun III compilers whose arbitrary use of extended precision made
programs unpredictable. He has also argued strongly against Sun's
recent "Proposal for Extension of Java(TM) Floating Point Semantics"
(PEJFPS) which would introduce the same sort of floating point
anomalies into Java.

When using extended precision, it is crucial to have a language type
which corresponds to the extended format. This is necessary to be
able to preserve referential transparency. For example, if extended
precision is used for expression evaluation and there is no way for
the programmer to store an extended value, breaking a long expression
into pieces can change the computed result.
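
A minimal sketch of this effect follows. It is an illustration under
stated assumptions, not code from any of the cited sources: float
stands in for the ordinary storage format, double stands in for the
extended format, and the function names and test values are made up.

```c
/* Assumption: float plays the role of the "ordinary" format and double
   the role of the extended format. Casts make each rounding explicit. */

/* Whole expression a*b - c evaluated in the wide format, rounded once. */
float whole_expression(float a, float b, float c)
{
    return (float)((double)a * (double)b - (double)c);
}

/* Same expression split in two; the temporary can only be narrow. */
float split_with_narrow_temp(float a, float b, float c)
{
    float t = (float)((double)a * (double)b);  /* extra bits are lost here */
    return (float)((double)t - (double)c);
}

/* Split again, but the language offers a type for the wide format. */
float split_with_wide_temp(float a, float b, float c)
{
    double t = (double)a * (double)b;          /* nothing is lost */
    return (float)(t - (double)c);
}
```

With a = b = 1 + 2^-12 and c = 1 + 2^-11, the exact product is
1 + 2^-11 + 2^-24. The whole expression and the split with a wide
temporary both return 2^-24, while the split with a narrow temporary
returns 0.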

Extended precision is not solely a way to make programs run faster
on certain architectures; it can help programmers implement better
algorithms [Kah1] and help protect programmers from unknown numerical
instabilities [Kah2]. If compilers and languages use reasonable rules,
extended precision computations can be predictable; see the report
from the Java Grande numerics working group for one proposal [JaG].

As mentioned previously in this thread, in his comments on [Gol], Doug
Priest discusses various reasonable floating point expression
evaluation policies that languages should provide [Pri]. The policies
have different benefits and certain policies are more appropriate (or
necessary) in some circumstances than others. When designing a
language (or if given enough leeway by the language, when writing a
compiler) the question arises as to which of five or so policies is
best for the programmers using the language. Kahan's position is that
everyday programmers benefit from having their floating point
expressions evaluated in the widest format having hardware support,
double extended on the x86, double most everywhere else. Kahan feels
this way because the extra precision of double extended protects
programmers unknowingly using numerically unstable formulas. See [Kah1]
for an extended discussion of some numerical problems with Heron's
formula for calculating the area of certain triangles.
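
As a rough illustration of the kind of protection meant here (the
triangle and the function names below are my own assumptions, not an
example taken from [Kah1]), the sketch evaluates Heron's formula once
with every operation rounded to float and once with double
intermediates. To avoid any library calls it returns the squared area
s(s-a)(s-b)(s-c); the area is the square root of that value.

```c
/* Assumption: float stands in for the program's working format and
   double for the wider format with hardware support. */

/* Every operation is rounded to float before the next one starts. */
float heron_squared_narrow(float a, float b, float c)
{
    float ab  = a + b;
    float sum = ab + c;
    float s   = sum / 2.0f;          /* half the perimeter */
    float sa  = s - a;
    float sb  = s - b;
    float sc  = s - c;
    float p1  = s * sa;
    float p2  = p1 * sb;
    float p3  = p2 * sc;
    return p3;
}

/* Same formula, intermediates kept in double, one rounding at the end. */
float heron_squared_wide(float a, float b, float c)
{
    double s = ((double)a + b + c) / 2.0;
    return (float)(s * (s - a) * (s - b) * (s - c));
}
```

For the needle-like triangle a = 1, b = c = 0.5 + 2^-24, the true
squared area is 2^-26 + 2^-50. The narrow version rounds s to exactly
1, so s - a is 0 and the computed area is 0; the wide version returns
2^-26.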

Bruce Dawson said:
>Kahan does like to push the idea of using extended precision for
>temporaries which, as you say, ignores the problem of how to specify
>what is a temporary.

Specifying what a temporary or anonymous value is isn't that
difficult. If the value doesn't have a name, it is anonymous. All
explicit stores must be respected in both precision and range. For
example, in

a = b*c + d;

b*c could be calculated to extended precision, as could (b*c)+d.
However, when that quantity is assigned to a, the value must be
rounded accordingly. This style of floating point evaluation was used
in pre-ANSI C.
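
A small sketch of this policy, with float standing in for the declared
type and double standing in for the extended format (both stand-ins,
and the function names, are assumptions made for illustration):

```c
/* Assumption: float is the declared type of a, b, c and d; double
   stands in for the extended format used for anonymous values. */

/* Every intermediate result is rounded to float: b*c can overflow. */
float strict_narrow(float b, float c, float d)
{
    float product = b * c;
    float a = product + d;
    return a;
}

/* b*c and (b*c)+d are anonymous and kept wide; only the explicit
   store to a rounds to float precision and range. */
float wide_anonymous(float b, float c, float d)
{
    float a = (float)((double)b * (double)c + (double)d);
    return a;
}
```

With b = c = 2^64 and d = -2^127, the strictly narrow version
overflows to infinity at b*c, while the wide version delivers 2^127,
which fits in float. Had the final sum itself been out of float's
range, the store to a would still have had to overflow.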

>However where Kahan's creation (the 80x87 and its successors) really
>fall down is that they make pure double precision impossible _and_
>they make pure long double precision extremely difficult.

>Writing pure double precision for the 80x87 is impossible because the
>round_to_double flag doesn't quite work.

Setting the rounding precision to double and having extended exponent
range is explicitly allowed by IEEE 754. (However, if such values
need to be spilled to memory, to preserve the extended exponent range
they must be spilled as 80 bit values).

Until recently, known techniques to make the x86 round *exactly* to
"pure double" entailed about a 10X performance hit on the x86.

Roger Golliver of Intel has developed an elegant refinement of
existing practice that gives exactly pure double rounding on the x86
for about a 2X to 3X speed penalty. A 2X to 3X slowdown is
approximately the same performance as the approximations to pure
double used by current Java JITs on the x86. [JaG] describes
Golliver's technique in more detail.

> It can introduce double rounding, and it doesn't clamp the
> exponent. The exponent clamping can be forced by writing to memory,
> but the double rounding is inevitable (but rare?)

The double rounding can only occur for subnormal values, which are
very rare in practice. In the context of Java, while x86
implementations may exhibit double rounding on underflow, this is a
trivial exact reproducibility issue compared to faulty decimal <-->
binary conversion (as found in Sun's JDK 1.0) and non-conformant
transcendental functions (present in at least early iterations of JDK
1.1.x). As discussed by Doug Priest, such double rounding on
underflow "is highly unlikely to affect any practical program
adversely" [Pri].
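
The mechanism can be sketched by analogy. In the code below, float
stands in for double, and double stands in for an x87 register whose
rounding precision is set to the narrow format while its exponent
range stays wide. The helper names and the test value are assumptions
chosen for illustration.

```c
/* First rounding: keep 24 significant bits but keep the wide exponent
   range. scale must be a power of two (so scaling is exact) that
   brings v into float's normal range. */
double round_to_24_bits(double v, double scale)
{
    float f = (float)(v * scale);
    return (double)f / scale;
}

/* Two roundings: the register rounding, then the store to the narrow
   format, which rounds again if the value is subnormal there. */
float store_after_register_rounding(double v, double scale)
{
    return (float)round_to_24_bits(v, scale);
}

/* One rounding: the exact value converted straight to the narrow
   format. */
float store_directly(double v)
{
    return (float)v;
}
```

Take v = (1 + 2^-10 + 2^-30) * 2^-140, which is subnormal as a float.
The first rounding drops the 2^-30 term, leaving a value exactly
halfway between two subnormal floats, so the second rounding goes to
the even neighbor 2^-140. Rounding once gives 2^-140 + 2^-149. The two
results differ by one unit in the last place of a subnormal number.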

The 680x0 restricts the exponent and the significand when the rounding
precision is set to float or double. In retrospect, this is a better
design decision than allowing the extended exponent range as on the
x86. However, the x86 design had good intentions. The extended
exponent range reduces the occurrence of overflow and underflow

>Simultaneously, the 80x87 makes it extremely difficult to write pure
>extended precision math. Because most of the FPU instructions that
>reference memory only support float or double precision, compiler
>vendors have to write an entirely different code generator if they
>want to support extended precision.

The x86's floating point load instruction can load float, double, or
double extended values (all floating point values are converted to
double extended when brought into the floating point register stack).
By setting the rounding precision, the same arithmetic instructions
act on all three formats (with extended exponent range). The FST
instruction can only store float or double values. The FSTP
instruction can store and pop float, double, or double extended. This
difference between the store instructions is certainly annoying for code
generation, but it is not an insurmountable problem.

As long as the language has a type corresponding to double extended,
it is not hard to write code that uses double extended. Writing out
80 bit values to memory is somewhat slower than writing out 64 bit
values (3 cycles versus 2 on recent x86 processors).

> Presumably that is why VisualC++
>dropped support for long double some time ago.

My understanding is that MS VC++ dropped support for 80 bit long
double to limit differences between NT on x86 and NT on Alpha since
the only IEEE formats the Alpha supports are float and double.

>So what are we left with? Who is happy?

>1) The speed demons are moderately happy, because the latest
>incarnations of the 80x87 are fairly fast. But they're not ecstatic,
>because the bizarre tiny-stack architecture makes fast code beastly
>complicated to write and debug, and still isn't as fast as it could

To be fair, the design constraints from about twenty years ago are
very different from the design constraints today. However, there are
some unintended problems with how the x86 floating point stack is
implemented; it is very difficult to discriminate between stack
over/underflow and "invalid" floating point operations. This design
oversight makes generating fast floating point code unnecessarily
hard. Internally, the recent x86 chips have many more registers than
are visible to the programmer.

>2) Those who want predictable double precision results aren't happy
>because the results they need are impossible to get all the time.

What do you mean by predictable? There are degrees of predictability;
predictable on the same machine with the same compiler, on the same
machine with different compilers, all the way to a different
architecture and a different compiler. Java promises, but does not
deliver, cross-architecture exact reproducibility. Sun's PEJFPS would
remove predictability even on the same machine with the same compiler
and the same input data.

Predictability is not equivalent to always getting the same answer

>Although, with rounding set to double precision they probably do get
>them 99.999% of the time - any other guesses?

These differences arise from double rounding on underflow. Beyond a
small numerical difference in the last bit (around 10^-324), a more
general concern is whether a program behaves sensibly when underflow
occurs: does it still compute an accurate answer? Underflow and overflow
can violate the assumptions of numerical programs, invalidating their
results. Therefore, detecting and handling such events is necessary
for robust programs. In the context of Java, the language's refusal
to grant access to the IEEE sticky flags, features of IEEE 754
designed to allow detection of such events, unnecessarily complicates
the development of robust numerical libraries.

>In short, if you want to design an FPU that has an ultra-fast or
>ultra-precise mode you have to make sure it can be turned off
>completely, for those who want predictability.

As Motorola did for the 680x0, Intel could provide a rounding
precision mode where the exponent was restricted as well. Perhaps
Merced has such a mode. This kind of mode would eliminate much of the
complexity of strictly implementing Java floating point on the x86.

> And, you have to make
>sure that turning it off is trivial - forcing compiler writers to do
>anything more than set a bit is unacceptable - they won't do it.

Running pure float and pure double code on the x86 certainly could be
easier. However, I don't see why compiler writers should expect to
get off the hook just because floating point code generation may be
more subtle than they would like. Compiler writers may not want to
think about floating point but that doesn't mean they shouldn't think
about floating point. Compiler backends are responsible for a
significant fraction of the performance benefit of recent processors.
If Merced catches on, this trend will not abate anytime soon.
Compilers are expected to generate and schedule code that runs well
and is correct with respect to language and processor semantics.
Floating point should be no exception. Processors shouldn't have
perverse floating point, but some adversity does not license compiler

Too often compilers are overly concerned with "optimizing" floating
point expressions. Transformations are used that make the program run
faster but disregard the underlying floating point semantics and
sometimes the intentions of the programmer.
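
One familiar instance is reassociation of a sum. The sketch below
(function names and values are illustrative assumptions) spells out
both orders by hand, assuming each operation is rounded to double:

```c
/* The sum evaluated in the order the programmer wrote: (a + b) + c. */
double sum_as_written(double a, double b, double c)
{
    double t = a + b;
    return t + c;
}

/* What a reassociating "optimization" computes instead: a + (b + c). */
double sum_reassociated(double a, double b, double c)
{
    double t = b + c;
    return a + t;
}
```

With a = 1, b = 2^53 and c = -2^53, the sum as written is 0, because
1 + 2^53 rounds back to 2^53, while the reassociated sum is 1. Neither
order is wrong in itself, but a compiler that silently switches
between them changes the program's result.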

-Joe Darcy


[App] Apple Numerics Manual, Second Edition, Apple, Addison-Wesley
Publishing Company, Inc., 1988.

[Fat] Richard J. Fateman, "High-Level Language Implications for the Proposed
IEEE Floating-Point Standard," ACM Transactions on Programming
Languages and Systems, vol. 4, no. 2, April 1982, pp. 239-257.

[Fel] Stuart Feldman, "Language Support for Floating Point," IFIP TC2
Working Conference on the Relationship between Numerical Computation
and Programming Languages, J.K. Reid ed., 1982, pp. 263-273.

[Gol] David Goldberg, "What Every Computer Scientist Should Know About
Floating-Point Arithmetic," Computing Surveys, vol. 23, no. 1, March
1991, pp. 5-24, also available online from

[JaG] Numerics Working Group, Java Grande Forum, "Improving Java for
Numerical Computation," sometime soon

[Kah1] W. Kahan, "Miscalculating Area and Angles of a Needle-like
Triangle,"

[Kah2] W. Kahan, "Roundoff Degrades an Idealized Cantilever,"

[Pri] Douglas Priest, "Differences Among IEEE 754 Implementations",

[The C9X committee is currently wrangling about precision rules, and they're
a stinker to write in a way that is both useful and consistent. In cases
like a = 3. + (b = 1./17.); does the 1/17 in the expression get narrowed
to b's width? -John]
