# Low-order uncertainty in 87 math processing

## solmaker@olympus.net (Stephen Schumacher) 24 Sep 1997 00:05:42 -0400

From comp.compilers


From: solmaker@olympus.net (Stephen Schumacher)
Newsgroups: comp.compilers,comp.os.msdos.programmer
Date: 24 Sep 1997 00:05:42 -0400
Organization: solmaker
Keywords: arithmetic, architecture, question

I've been trying to debug a perplexing 87 math crunching problem. The
code is generated from TopSpeed 3.1 Modula2 for DOS. The basic logic is:

VAR x,y,q: REAL;
...
x:=y;
... (* a little bit of code that sometimes resets x *)
q:=x-y;
IF q>=0 THEN
  IF q=0 THEN ...
  ELSE ... END;
END;

What's happening is that once in a while during a large number of
trials, when x had not been reset (so should still equal y), q
nevertheless does NOT equal zero, but instead equals some
infinitesimal, sometimes negative value. As a result, the IF
statement fails, and essential code does not get executed, causing a
floating-point run-time error later on.

I've been able to get this code fragment to work as expected by using
an epsilon such as e:=x*1E-6 and comparing q to this e, instead of to
0. But I'm greatly concerned that similar undesired behavior may be
happening elsewhere in my code that I don't know about. Also, I
really want to understand what's going on!
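
The epsilon workaround described above can be sketched as follows. This
is only an illustration, written in Python rather than TopSpeed
Modula-2; the function name `classify` and the use of `abs(x)` to keep
the tolerance non-negative are assumptions added for the example, not
part of the original program.

```python
def classify(x: float, y: float, rel_eps: float = 1e-6) -> str:
    """Classify x relative to y, treating tiny differences as equality.

    The tolerance is scaled from x, as in e := x*1E-6 from the post.
    abs() is an added assumption so a negative x still gives a usable bound.
    """
    q = x - y
    e = abs(x) * rel_eps
    if abs(q) <= e:
        return "equal"      # difference is within the tolerance
    return "greater" if q > 0 else "less"
```

With this shape, a residue of either sign in the low-order bits lands in
the "equal" branch instead of silently skipping it.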

Here's what I've learned so far:

(1) TopSpeed runs its 87 floating-point operations using the Clip
(Truncate to Zero) rounding option, instead of the normal Round to
Nearest (Even). I thought this might cause occasional low-order bit
inaccuracies, if 80-bit tempreal data in an 87 register is stored into
a 32-bit real memory location, then compared with an 80-bit tempreal
register. But changing the rounding option around this code has no
effect. (Changing the rounding option for the whole program triggers
errors elsewhere.)

(2) Otherwise, disassembling the compiled code results in 86/87
machine code that looks valid to my scrutiny. Compiling the code
fragment in a simplified test program doesn't seem to generate the
error when force-fed input data, though the test is not necessarily
reproducing the exact input as in the full program because of the
number of input parameters.

(3) Not Windows related - the same error happens when running in DOS
mode. Happens the same way on multiple different machines (all
Pentiums).

(4) It does not seem to be caused by memory smashing - saving the
suspect variables into different variables immediately before the
comparison and printing them out shows them to be in fact equal, bit
by bit, even though the comparison (involving a mirroring 87 register)
fails.

(5) Here's the weirdest part: the behavior is consistent throughout
testing, happening predictably at the same place with the same input
values, except that occasionally it will go away and the program will
run fine, consistently - until the next system reboot! Then the
behavior will come back in the same place as originally. Several
times I thought I was uncovering fixes that would avoid the behavior,
then working to isolate the minimum fix, only to discover that the fix
was irrelevant and somehow the system state (?!?) had changed, which
got reversed by rebooting (though not by EXITing a DOS window and
opening a new one).
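
The hypothesis in point (1), that a value narrowed to a 32-bit memory
location no longer matches the wider copy still held in a register, can
be illustrated with the sketch below. It is an analogy only: Python's
64-bit floats stand in for the 80-bit tempreal register, `struct` does
the narrowing with round-to-nearest rather than truncation, and the
helper names `to_float32` and `residual` are hypothetical.

```python
import struct


def to_float32(value: float) -> float:
    """Narrow a 64-bit float to 32-bit storage, then widen it back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def residual(value: float) -> float:
    """Difference between the narrowed copy and the wide original.

    Plays the role of q := x - y when x was stored to 32-bit memory
    and y is still the wide value.
    """
    return to_float32(value) - value
```

For a value exactly representable in 32 bits, such as 0.5, the residual
is zero; for most others it is a tiny nonzero amount that can have
either sign, which matches the shape of the symptom described here.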

I would appreciate any insights about the 87 processor that might
explain this inexplicable behavior, or suggestions about what I might
be missing.

Thanks,
Stephen Schumacher (solmaker@olympus.net)
[Any possibility that the precision flags are getting twiddled, or that
you're comparing NaNs? The precision flags would explain the
works-OK-until-reboot behavior. -John]
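
The NaN possibility raised in the note above can be checked against the
shape of the original IF statement. The sketch below is in Python, not
Modula-2, and the function name `branch_taken` is hypothetical; it only
mirrors the nesting of the two comparisons from the post.

```python
import math


def branch_taken(q: float) -> str:
    """Report which arm of the nested comparison a given q reaches."""
    if q >= 0:
        if q == 0:
            return "zero"       # the arm the poster expects when x = y
        return "positive"
    return "skipped"            # the outer IF fails, nothing runs


# Both a NaN and a tiny negative residue skip the outer IF entirely,
# because every ordered comparison with NaN is false.
nan_result = branch_taken(math.nan)
negative_result = branch_taken(-1e-12)
```

Since both inputs produce the same outcome, printing q (or testing
whether q = q, which is false only for a NaN) is one way to tell the two
explanations apart.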