From: kf@iki.fi
Newsgroups: comp.compilers
Date: 26 Nov 2002 22:16:29 -0500
Organization: -
Keywords: C, optimize, question
Posted-Date: 26 Nov 2002 22:16:29 EST
Hi,
I've been experimenting with the Intel C/C++ compiler for Linux, in
particular with its automatic vectorization.
I have the following piece of code (all the arrays are of type char):
    for( j = 0; j < 16; j++ ) {
        d[ j ] = d[ j ] + d[ j ];
        d[ j ] = d[ j ] | B[ j ];
        dm[ j ] = d[ j ] & mm[ j ];
    }
which compiles to the following, if I disable the vectorization:
..B3.16: # Preds ..B3.16 ..B3.15
movb 4656(%esp,%ecx), %al #178.23
movb 4672(%esp,%ecx), %dl #181.23
addb %al, %al #178.23
orb (%edi,%ecx), %al #179.23
movb %al, 4656(%esp,%ecx) #179.4
andb %dl, %al #181.23
movb %al, 4688(%esp,%ecx) #181.4
addl $1, %ecx #176.23
cmpl $16, %ecx #176.3
jl ..B3.16 # Prob 93% #176.3
With vectorization enabled, I get the following, i.e. the loop is
eliminated by using SSE2 instructions:
paddb %xmm1, %xmm1 #177.14
por 80(%esp,%ecx,8), %xmm1 #178.14
movdqa %xmm1, %xmm3 #180.14
lea 1(%edi), %eax #183.44
addl $1, %esi #169.21
pand %xmm0, %xmm3 #180.14
movdqa %xmm3, 4720(%esp) #180.4
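
For readers less familiar with the mnemonics, the SSE2 sequence above can be sketched in C with intrinsics from emmintrin.h. This is only an illustration of what paddb, por and pand compute over the 16 bytes, not a reconstruction of what icc does internally. The function names are hypothetical, unsigned char is assumed so that the byte addition wraps modulo 256, and unaligned loads and stores are used as an assumption (the listing uses movdqa, which requires 16-byte alignment). A scalar fallback is compiled in when SSE2 is not available.

```c
#include <assert.h>
#include <string.h>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  /* SSE2 intrinsics */
#define HAVE_SSE2 1
#endif

/* Scalar reference: the loop from the post, fixed at 16 bytes.
   The function name is hypothetical; d, B, mm, dm follow the post. */
static void update16_scalar(unsigned char *d, const unsigned char *B,
                            const unsigned char *mm, unsigned char *dm)
{
    for (int j = 0; j < 16; j++) {
        d[j] = (unsigned char)(d[j] + d[j]);  /* wraps modulo 256 */
        d[j] = d[j] | B[j];
        dm[j] = d[j] & mm[j];
    }
}

/* Sketch of the vectorized form: one 128-bit operation per statement.
   Assumes the four arrays do not overlap. */
static void update16_sse2(unsigned char *d, const unsigned char *B,
                          const unsigned char *mm, unsigned char *dm)
{
#ifdef HAVE_SSE2
    __m128i vd  = _mm_loadu_si128((const __m128i *)d);
    __m128i vB  = _mm_loadu_si128((const __m128i *)B);
    __m128i vmm = _mm_loadu_si128((const __m128i *)mm);

    vd = _mm_add_epi8(vd, vd);            /* paddb: d + d, per byte */
    vd = _mm_or_si128(vd, vB);            /* por:   d | B           */
    _mm_storeu_si128((__m128i *)d, vd);
    _mm_storeu_si128((__m128i *)dm,
                     _mm_and_si128(vd, vmm));  /* pand: d & mm */
#else
    update16_scalar(d, B, mm, dm);        /* no SSE2: fall back */
#endif
}

/* Hypothetical self-check: both versions must produce identical bytes. */
static void check_agree(void)
{
    unsigned char d1[16], d2[16], B[16], mm[16], dm1[16], dm2[16];
    for (int j = 0; j < 16; j++) {
        d1[j] = d2[j] = (unsigned char)(j * 17 + 200);
        B[j]  = (unsigned char)(j & 1);
        mm[j] = 0x0F;
    }
    update16_scalar(d1, B, mm, dm1);
    update16_sse2(d2, B, mm, dm2);
    assert(memcmp(d1, d2, 16) == 0);
    assert(memcmp(dm1, dm2, 16) == 0);
}
```

Calling check_agree() from a test driver confirms that the two forms agree on the sample input; it says nothing about their relative speed, which is the open question here.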
Both work just fine, but the vectorized code is significantly slower!
I certainly expected the vectorized code to be much faster.
What's going on? Are the SSE2 instructions really so slow compared to
the standard integer instructions? If so, what's the point of the
vectorization anyway?
Thanks.