Related articles |
---|
ABI & alignement: IA32 guerby@acm.org (Laurent Guerby) (2000-11-30) |
Re: ABI & alignement: IA32 rkrayhawk@aol.com (2000-12-20) |
From: | Laurent Guerby <guerby@acm.org> |
Newsgroups: | comp.compilers |
Date: | 30 Nov 2000 12:06:35 -0500 |
Organization: | Club-Internet (France) |
Keywords: | architecture, performance |
Posted-Date: | 30 Nov 2000 12:06:35 EST |
On modern IA32 implementations, it is very important that doubles
(8-byte floats) are 8-byte aligned for performance reasons, as the
following C code shows:
$ cat dbl.c
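#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#define N 10000
int main (int argc, char **argv) {
  double *x;
  int i, j;
  /* One spare element leaves room to shift the array start by 4 bytes. */
  x=(double*)malloc((N+1)*sizeof(double));
  if(x==NULL) return 1;
  /* Any single argument selects the deliberately misaligned run. */
  if(argc==2) x=(double*)((char*)x+4);
  printf("%d\n", (int)((uintptr_t)x%8));
  for(i=0;i<N;i++) x[i]=(double)i;
  for(i=0;i<N;i++) for (j=0;j<N;j++) x[i]=0.5*(x[j]+x[i]);
  printf("%f\n", x[N-1]);
  exit(0);
}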
$ gcc -O2 dbl.c; time ./a.out; time ./a.out unaligned
0
9998.574083
3.37user 0.00system 0:03.79elapsed 88%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (97major+29minor)pagefaults 0swaps
4
9998.574083
5.67user 0.00system 0:06.38elapsed 88%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (97major+29minor)pagefaults 0swaps
$
Run time is 3.37s on a 350MHz Pentium II when the array is 8-byte
aligned, and 5.67s when it is not, which is roughly 1.7 times slower.
However, the IA32 ABI says that for a double on the stack, the alignment
should be 4 bytes. My question is: does such a requirement imply
that a compiler that pads the stack (or whatever) to get 8-byte
alignment is not ABI compliant (an 8-byte aligned address is obviously
also 4-byte aligned ;-)?
The GCC manual says in its i386 section:
<<
`-malign-double'
`-mno-align-double'
Control whether GNU CC aligns `double', `long double', and `long
long' variables on a two word boundary or a one word boundary.
Aligning `double' variables on a two word boundary will produce
code that runs somewhat faster on a `Pentium' at the expense of
more memory.
*Warning:* if you use the `-malign-double' switch, structures
containing the above types will be aligned differently than the
published application binary interface specifications for the 386.
>>
I'm curious about what people think of this ABI-conformance issue,
and in particular: what does the Intel compiler do by default in
optimizing mode when it encounters a stack array of doubles?
Thanks for any information,
--
Laurent Guerby <guerby@acm.org>
[I looked it up, I was mistaken when I said that you only need 4 byte
alignment -- the Pentium manuals more or less say that each type should
be aligned on a natural 2^n boundary. -John]