Re: MMX/3Dnow!/SSE/SSE2 compilers

"jacob navia" <jacob@jacob.remcomp.fr>
1 May 2002 23:10:28 -0400

          From comp.compilers


From: "jacob navia" <jacob@jacob.remcomp.fr>
Newsgroups: comp.compilers
Date: 1 May 2002 23:10:28 -0400
Organization: Wanadoo, l'internet avec France Telecom
References: 02-04-126 02-04-161
Keywords: architecture, optimize
Posted-Date: 01 May 2002 23:10:28 EDT

> > What compilers support any of the MMX/3Dnow!/SSE/SSE2 instruction sets
> > (and optimize code for them)? Do you know of any published
> > comparisons of such compilers?
>
> Have an off-the-cuff review:
>
> I have built (and my employers are shipping) a commercial product with
> Intel's C/C++ compiler, version 5.0.1, targeting SSE2
> unconditionally. The compiler offers a
> test-at-run-time-and-select-alternative-code-paths option, but I
> didn't want to make it larger, or lose any speed at all.
>
> For a C library that does a lot of floating-point work, but no large
> matrix crunches, I get about 30% better throughput than the generic
> x86 build, which is compiled with MS VC++v6, but the Intel-compiled
> DLL is about 50% bigger. This performance figure is definitely an
> average - nothing gets significantly slower for me, but some
> operations have up to 60% better throughput.


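The problem I see with this is that the results from SSE2 are
different from the results the x87 FPU obtains: SSE2 rounds every
operation to 64-bit double, while the FPU can carry 80-bit extended
precision internally. Maybe you get a performance increase, but tell me: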


1) How did you solve the incompatibility between FPU and SSE2?
2) How do you maintain data in SSE2 registers across calls? Do you save all
the SSE2 registers?


I implemented all floating point in SSE2 in my compiler system
(lcc-win32) but failed at the above points. I could not guarantee that
the result of a+b would be the same, and that would have led to
incredible problems with floating-point code, since I set up the FPU to
use full precision.
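
To make the a+b problem concrete, here is a minimal sketch in C. It is
an illustration only, not code from lcc-win32. The function names are
made up for this example, and it assumes that long double maps to the
x87 80-bit extended format, which holds for GCC and Clang on x86 but
not for every compiler. The volatile stores force each result to be
rounded to its declared type.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Sum rounded to a 53-bit significand, which is what an SSE2 scalar
   double add produces. The volatile store forces the rounding even
   when the expression itself is evaluated on the x87 stack. */
double sum_as_double(double a, double b)
{
    volatile double r = a + b;
    return r;
}

/* Sum kept in long double. ASSUMPTION: long double is the x87 80-bit
   format with a 64-bit significand; on some ABIs it is just double. */
long double sum_as_extended(double a, double b)
{
    volatile long double r = (long double)a + (long double)b;
    return r;
}

/* Hypothetical helper: prints both sums for 1 + 2^-60 and returns 1
   if they differ on this platform, 0 otherwise. */
int report_precision_gap(void)
{
    double a = 1.0;
    double b = 0x1p-60;   /* 2^-60, far below half an ulp of 1.0 in double */
    double d = sum_as_double(a, b);
    long double e = sum_as_extended(a, b);

    assert(d == 1.0);     /* the small term is rounded away in double */
    printf("double:   %.20g\n", d);
    printf("extended: %.20Lg\n", e);
    return (long double)d != e;
}
```

Calling report_precision_gap() from main shows whether the two sums
differ on a given platform. Where long double really is the extended
format, the second sum keeps the 2^-60 term and the first does not.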


Besides, if I keep some value in an SSE2 register I have to save them all at
each function call, which means a BIG memory-traffic overhead.


Thanks for your attention

