Re: MMX/3Dnow!/SSE/SSE2 compilers

Andrew Richards <a.richards@codeplay.com>
8 May 2002 00:20:22 -0400

          From comp.compilers


From: Andrew Richards <a.richards@codeplay.com>
Newsgroups: comp.compilers
Date: 8 May 2002 00:20:22 -0400
Organization: blueyonder (post doesn't reflect views of blueyonder)
References: 02-04-126 02-04-161 02-05-004
Keywords: arithmetic, architecture
Posted-Date: 08 May 2002 00:20:22 EDT

jacob navia wrote:
> The problem I see with this is that the results from SSE2 are
> different from the results the FPU obtains. Maybe you get a
> performance increase but tell me:
>
> 1) How did you solve the incompatibility between FPU and SSE2?


For games/graphics and audio programming, this is generally not a
problem: bit-exact results are not that important there, and
floating-point is always approximate anyway. For scientific work it
may cause problems, so programmers do need to be aware that SSE,
SSE2, 3DNow! and the old x86 FPU do not give exactly the same
results. The x87 FPU can hold intermediates in 80-bit extended
precision, whereas SSE and SSE2 round every result to 32 or 64 bits.
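
To make the difference concrete, here is a minimal sketch in portable
C. It does not touch the x87 or SSE units directly. As an assumption
for illustration, it uses double as a stand-in for the wider FPU
intermediate and a volatile float store as a stand-in for per-operation
rounding. The function names are hypothetical.

```c
/* Sum three floats, rounding to single precision after every addition.
   The volatile stores force each intermediate to be rounded to float,
   which is how scalar single-precision SSE arithmetic behaves. */
float sum_rounded_each_step(float a, float b, float c)
{
    volatile float t = a + b;   /* rounded to 24 significant bits here */
    volatile float r = t + c;   /* and rounded again here */
    return r;
}

/* Sum the same three floats in a wider type and round to float once at
   the end. The double here only stands in for a wider intermediate such
   as an x87 register; it is not the 80-bit extended format. */
float sum_rounded_once(float a, float b, float c)
{
    volatile double wide = (double)a + (double)b + (double)c;
    return (float)wide;         /* single rounding at the end */
}
```

Called with 16777216.0f, 1.0f and 1.0f, the first function returns
16777216 and the second returns 16777218: the same source expression,
two different answers, purely because of where the rounding happens.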


> 2) How do you maintain data in SSE2 registers across calls? Do you save all
> the SSE2 registers?


VectorC defines some new calling conventions for processors with MMX,
SSE, SSE2 and 3DNow! that pass parameters in SSE and MMX registers.
It's generally not such a big deal: saving registers across calls is
not that slow on x86, and the FPU registers had to be saved anyway,
so you should still get a performance boost.


> I implemented all floating point in SSE2 in my compiler system
> (lcc-win32) but failed at the above points. I could not guarantee that
> the results of a+b would be the same and that would have led to
> incredible problems with floating point code since I set up the FPU to
> use all precision.
>
> Besides if I save some value in an SSE2 register I have to save them all at
> each function call, which means a BIG i/o overhead.


Surely you only have to save out the registers that have live values
in them?
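
As a rough sketch of that idea (not VectorC's or lcc-win32's actual
code), the set of registers to save around a call can be computed as
the intersection of two bit masks. The function names and the mask
encoding are assumptions made up for this illustration: bit i stands
for XMMi, and 8 bits cover XMM0..XMM7 as available in 32-bit mode.

```c
#include <stdint.h>

/* Registers that must be saved around a call: those that hold a value
   still needed after the call AND that the callee is allowed to
   overwrite. Bit i of each mask represents XMMi (hypothetical encoding). */
uint8_t xmm_regs_to_save(uint8_t live_across_call, uint8_t clobbered_by_callee)
{
    return live_across_call & clobbered_by_callee;
}

/* Number of registers in a mask, i.e. how many stores (and later
   reloads) the call site will need. */
int count_regs(uint8_t mask)
{
    int n = 0;
    while (mask) {
        n += mask & 1u;
        mask >>= 1;
    }
    return n;
}
```

With only XMM0 and XMM2 live (mask 0x05) and a callee that may clobber
everything (mask 0xFF), this saves two registers rather than all eight.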


The really big problem is register allocation (well, vectorization is
a big problem, too).


Because you can't do all floating-point operations on SSE registers
(sometimes you have to use the FPU) and because you can't transfer
between FPU and SSE registers without going through memory, your
register allocator needs to be careful not to split the values of a
single expression between FPU and SSE registers.
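
One very simple policy that respects this constraint, shown here only
as a sketch and not as a description of any real allocator, is to pick
a single register class per expression: if any operation in it has no
SSE form, keep the whole expression on the FPU. The types, the operator
list and the function names below are hypothetical. Sine is used as the
FPU-only example because x87 has an FSIN instruction and SSE/SSE2 have
no counterpart.

```c
#include <stddef.h>

typedef enum { REGCLASS_SSE, REGCLASS_FPU } RegClass;

/* Hypothetical operator set for a tiny expression representation. */
typedef enum { OP_ADD, OP_MUL, OP_DIV, OP_SQRT, OP_SIN } Op;

/* Nonzero if the operation has no SSE/SSE2 instruction and therefore
   has to run on the x87 FPU. Only sine qualifies in this toy set. */
static int needs_fpu(Op op)
{
    return op == OP_SIN;
}

/* Choose one register class for all values of an expression, so that
   no intermediate has to travel between FPU and SSE through memory. */
RegClass class_for_expression(const Op *ops, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (needs_fpu(ops[i]))
            return REGCLASS_FPU;   /* one FPU-only op pins the whole expression */
    }
    return REGCLASS_SSE;
}
```

This all-or-nothing rule is deliberately crude. A real allocator would
more likely weigh the cost of a memory round trip against the benefit
of keeping the rest of the expression in SSE registers.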


--
Andrew Richards
Codeplay


Tel: +44 (0)20 7482 3382
140-142 Kentish Town Rd, London, NW1 9QB
http://www.codeplay.com

