From: | "jacob navia" <jacob@jacob.remcomp.fr> |
Newsgroups: | comp.compilers |
Date: | 1 May 2002 23:10:28 -0400 |
Organization: | Wanadoo, l'internet avec France Telecom |
References: | 02-04-126 02-04-161 |
Keywords: | architecture, optimize |
Posted-Date: | 01 May 2002 23:10:28 EDT |
> > What compilers support any of the MMX/3Dnow!/SSE/SSE2 instruction sets
> > (and optimize code for them)? Do you know of any published
> > comparisons of such compilers?
>
> Have an off-the-cuff review:
>
> I have built (and my employers are shipping) a commercial product with
> Intel's C/C++ compiler, version 5.0.1, targeting SSE2
> unconditionally. The compiler offers a
> test-at-run-time-and-select-alternative-code-paths option, but I
> didn't want to make it larger, or lose any speed at all.
>
> For a C library that does a lot of floating-point work, but no large
> matrix crunches, I get about 30% better throughput than the generic
> x86 build, which is compiled with MS VC++v6, but the Intel-compiled
> DLL is about 50% bigger. This performance figure is definitely an
> average - nothing gets significantly slower for me, but some
> operations have up to 60% better throughput.
The problem I see with this is that the results from SSE2 are
different from the results the FPU obtains. Maybe you get a
performance increase but tell me:
1) How did you solve the incompatibility between FPU and SSE2?
2) How do you maintain data in SSE2 registers across calls? Do you save all
the SSE2 registers?
I implemented all floating point in SSE2 in my compiler system
(lcc-win32) but failed at the above points. I could not guarantee that
the result of a+b would be the same in both units, and that would have
led to incredible problems with floating point code, since I set up the
FPU to use its full (extended) precision while SSE2 rounds every
operation to 64-bit double.
Besides, if I keep some value in an SSE2 register I have to save all of
them at each function call, which means a BIG I/O overhead.
Thanks for your attention