From: kf@iki.fi
Newsgroups: comp.compilers
Date: 11 Dec 2002 22:22:44 -0500
Organization: -
References: 02-12-049
Keywords: parallel
Posted-Date: 11 Dec 2002 22:22:44 EST
Thanks!
Now it runs fast!
Some performance measurements (lm is of type char):
Case 1: This runs in 1.44 s:
    #pragma ivdep
    #pragma vector aligned
    for( j = 0; j < 16; j++ )
    {
        d[ j ] = d[ j ] + d[ j ];
        d[ j ] = d[ j ] | B[ j ];
        dm[ j ] = d[ j ] & mm[ j ];
    }
    lm = 0;
    for( j = 0; j < 16; j++ )
        if( !dm[ j ] ) lm++;
    m += lm;
Case 2: With #pragma novector added to the second loop, it runs in 3.57 s.
Case 3: With #pragma novector added to the first loop, it runs in 7.67 s.
Case 4: With #pragma novector added to both loops, it runs in 3.48 s.
All in all, the performance boost is nice, almost 2.5x (3.48 s in Case 4
versus 1.44 s in Case 1), but I still don't get why Case 3 is so slow,
and especially why it is slower than Case 4, where neither loop is
vectorized. It doesn't really matter, however.
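For anyone who wants to try the four cases themselves, here is a minimal
sketch of the two loops wrapped into functions. Several things in it are
assumptions and not taken from the measurements above: the element type
(unsigned char, chosen so that 16 elements fill one 128-bit register), the
names elem_t, NLANES, update and count_zero, and the __INTEL_COMPILER guard
that hides the icc-specific pragmas from other compilers. Adding
#pragma novector in front of either loop would give the other cases.

```c
/* Assumption: the element type of d, B, mm and dm is not stated in the
   post; unsigned char is a guess that makes 16 elements span 128 bits. */
typedef unsigned char elem_t;

#define NLANES 16   /* loop trip count used in the post */

/* First loop: double each element, OR in B, then mask with mm into dm.
   With icc the caller must pass 16-byte aligned arrays that do not
   overlap, because the pragmas below promise exactly that. */
void update(elem_t *d, const elem_t *B, const elem_t *mm, elem_t *dm)
{
    int j;
#if defined(__INTEL_COMPILER)
#pragma ivdep
#pragma vector aligned
#endif
    for (j = 0; j < NLANES; j++) {
        d[j] = (elem_t)(d[j] + d[j]);   /* wraps modulo 256 for unsigned char */
        d[j] = d[j] | B[j];
        dm[j] = d[j] & mm[j];
    }
}

/* Second loop: count the elements of dm that are zero.
   The post keeps this count in a char (lm); int is used here. */
int count_zero(const elem_t *dm)
{
    int j, lm = 0;
    for (j = 0; j < NLANES; j++)
        if (!dm[j])
            lm++;
    return lm;
}
```

To time it, call update() followed by m += count_zero(dm) inside a long
outer loop; the iteration count behind the timings above is not given, so
any number used for that is a guess too.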
Thanks again,
Kimmo.