Re: x86 global floating point register allocation

"John Dallman" <jgd@cix.co.uk>
3 Sep 2002 00:10:43 -0400

From comp.compilers

Related articles
x86 global floating point register allocation sverker.is@home.se (Sverker Nilsson) (2002-08-24)
Re: x86 global floating point register allocation jacob@jacob.remcomp.fr (jacob navia) (2002-09-03)
Re: x86 global floating point register allocation jgd@cix.co.uk (John Dallman) (2002-09-03)
Re: x86 global floating point register allocation reig@tenerife.ics.uci.edu (Fermin Reig) (2002-09-03)
Re: x86 global floating point register allocation ceco@jupiter.com (Tzvetan Mikov) (2002-09-08)

From:	"John Dallman" <jgd@cix.co.uk>
Newsgroups:	comp.compilers
Date:	3 Sep 2002 00:10:43 -0400
Organization:	Nextra UK
References:	02-08-087
Keywords:	arithmetic, 386
Posted-Date:	03 Sep 2002 00:10:43 EDT

sverker.is@home.se (Sverker Nilsson) wrote:

> The special problem with the x86 floating point register stack is,
> well, that it is like a stack (8 entries, with a wrap around stack
> pointer) and not a general register file. Operations generally require
> one operand to be the top of the stack. If there is another operand,
> commonly it can be anywhere in the stack including the top, and the
> destination can be freely chosen to be the operand on the stack or the
> other one.

Yes. The initial idea for that stack seems to have been that it was
easy to use in a primitive expression evaluator, and would be empty at
the end of each C language statement or equivalent. This had an
implicit assumption that memory was nearly as fast as registers, which
was true on the Intel x86 chips of the late seventies when this
architecture was designed.
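
To make the "primitive expression evaluator" idea concrete, here is a
minimal sketch in Python of a post-order walk that emits x87-style
instructions for one assignment statement and tracks the stack depth.
The tuple-based expression format and the names emit and
compile_statement are hypothetical, invented for this illustration;
only the mnemonics (fld, faddp, fsubp, fmulp, fdivp, fstp) are real
x87 instructions, written here in Intel operand order.

```python
# Toy code generator: an expression is either a variable name (str)
# or a tuple (op, lhs, rhs). This is an illustrative model only.
OPS = {"+": "faddp", "-": "fsubp", "*": "fmulp", "/": "fdivp"}

def emit(expr, code, depth=0):
    """Append instructions for expr to code; return the new stack depth."""
    if isinstance(expr, str):
        if depth + 1 > 8:
            raise OverflowError("expression needs more than 8 stack slots")
        code.append(f"fld qword ptr [{expr}]")    # push operand from memory
        return depth + 1
    op, lhs, rhs = expr
    depth = emit(lhs, code, depth)                # lhs ends up in st(1)
    depth = emit(rhs, code, depth)                # rhs ends up in st(0)
    code.append(f"{OPS[op]} st(1), st")           # st(1) = st(1) op st(0), pop
    return depth - 1

def compile_statement(target, expr):
    """Compile 'target = expr'; the stack is empty again afterwards."""
    code = []
    depth = emit(expr, code)
    code.append(f"fstp qword ptr [{target}]")     # store result and pop
    return code, depth - 1

code, depth = compile_statement("x", ("+", "a", ("*", "b", "c")))
print("\n".join(code))
print("depth after statement:", depth)
```

Every operand is loaded from memory and the result is stored straight
back, which is why this scheme only pays off when memory is nearly as
fast as registers.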

((re-ordering))

> (**) There is a concept of 'undefined' register tags, which
> raises exceptions when operated on. I was a bit surprised when
> discovering that even the exchange operation, when one operand
> is defined and the other undefined, raises an exception. This
> further limits how the register stack can be used.

This is fairly consistent with the idea of a primitive expression
evaluator: undefined values are ones that have never been set, and the
architecture tries to be easy to use for the assembly-language
programmer who doesn't know much about floating point or high-level
languages.
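
The tag behaviour discussed above can be sketched with a toy model in
Python. The class ToyX87 and the exception StackFault are hypothetical
names made up for this example; the model assumes the invalid-operation
exception is unmasked, and it ignores everything except the empty/valid
tags and the wrapping top-of-stack pointer.

```python
class StackFault(Exception):
    """Stands in for the x87 invalid-operation (stack) exception."""

class ToyX87:
    def __init__(self):
        self.regs = [None] * 8    # None models an 'empty' (undefined) tag
        self.top = 0              # physical index of st(0), wraps modulo 8

    def _phys(self, i):
        return (self.top + i) % 8

    def fld(self, value):
        """Push: faults if the slot being pushed into is not empty."""
        new_top = (self.top - 1) % 8
        if self.regs[new_top] is not None:
            raise StackFault("stack overflow")
        self.top = new_top
        self.regs[new_top] = value

    def fstp(self):
        """Pop st(0): faults if it is empty."""
        p = self._phys(0)
        if self.regs[p] is None:
            raise StackFault("stack underflow")
        value, self.regs[p] = self.regs[p], None
        self.top = (self.top + 1) % 8
        return value

    def fxch(self, i):
        """Exchange st(0) with st(i): faults if either one is empty."""
        a, b = self._phys(0), self._phys(i)
        if self.regs[a] is None or self.regs[b] is None:
            raise StackFault("fxch with an empty register")
        self.regs[a], self.regs[b] = self.regs[b], self.regs[a]

fpu = ToyX87()
fpu.fld(1.0)
try:
    fpu.fxch(1)                   # st(1) was never set
except StackFault as e:
    print("fault:", e)
```

In this model, as in the quoted observation, an exchange cannot be used
to shuffle a live value into a slot that has not been loaded yet.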

> That would be no problem if no registers were live out from the
> basic block.

That is exactly how the dominant commercial compiler for the
architecture, Microsoft Visual C++, works. The FP stack is usually
clear after each statement; it is always clear at the end of a basic
block. Intel's C/C++ compiler for Windows and Linux works in much the
same way, if it is using the x87 registers.

> If there are any such registers live, they should have to be at
> agreed-upon positions in the register stack.

True, although it's an interesting question whether this will actually
improve overall throughput. Naturally, this is likely to depend on the
code that you use it on. You really do want to make sure that the FP
stack is clear at the end of each function, though! I have had to
debug compiler bugs where an entry was accidentally being left behind
on the FP stack, and it is no fun at all.
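
As an illustration of the kind of check that catches a leftover stack
entry, here is a small Python sketch that sums the net stack effect of
a straight-line instruction listing. The table STACK_EFFECT and the
function final_depth are hypothetical helpers written for this example,
the table covers only a handful of mnemonics, and control flow is not
handled.

```python
# Net change in x87 stack depth for a few common instructions.
# Illustrative subset only; a real checker would need the full set.
STACK_EFFECT = {
    "fld": +1, "fild": +1, "fldz": +1, "fld1": +1,
    "fst": 0, "fadd": 0, "fsub": 0, "fmul": 0, "fdiv": 0, "fxch": 0,
    "fstp": -1, "fistp": -1, "fcomp": -1,
    "faddp": -1, "fsubp": -1, "fmulp": -1, "fdivp": -1,
    "fcompp": -2,
}

def final_depth(instructions, entry_depth=0):
    """Return the stack depth after running a straight-line listing."""
    depth = entry_depth
    for line in instructions:
        if not line.strip():
            continue                               # skip blank lines
        mnemonic = line.split()[0].lower()
        depth += STACK_EFFECT[mnemonic]
        if not 0 <= depth <= 8:
            raise ValueError(f"stack depth {depth} after: {line}")
    return depth

leaky = [
    "fld qword ptr [a]",
    "fld qword ptr [b]",
    "faddp st(1), st",
    "fld qword ptr [c]",      # loaded but never popped
    "fstp qword ptr [x]",
]
print("entries left on the FP stack:", final_depth(leaky))
```

Under the usual 32-bit x86 calling conventions a floating-point result
is returned in st(0), so for such a function the expected depth at
return is 1 rather than 0.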

This is an interesting exercise, but developments in hardware are
gradually making it obsolete. The SSE2 floating-point registers of the
Pentium 4 are more useful than the x87 ones, because they don't have
the form of a stack. AMD will be supporting these registers in their
x86-64 chips, and they're usable - and effective - in 32-bit mode.

---
John Dallman jgd@cix.co.uk
"C++ - the FORTRAN of the early 21st century."
