Related articles |
---|
Instruction scheduling with gcc on alpha denk@obelix.cica.es (1997-05-13) |
Re: Instruction scheduling with gcc on alpha jch@hazel.pwd.hp.com (John Haxby) (1997-05-22) |
Re: Instruction scheduling with gcc on alpha Robert.Harley@inria.fr (1997-06-13) |
Re: Instruction scheduling with gcc on alpha toon@moene.indiv.nluug.nl (Toon Moene) (1997-06-24) |
From: | John Haxby <jch@hazel.pwd.hp.com> |
Newsgroups: | comp.compilers |
Date: | 22 May 1997 22:34:47 -0400 |
Organization: | Hewlett-Packard |
References: | 97-05-161 |
Keywords: | optimize, architecture |
Claus Denk wrote:
>
> I have been posting this to gcc.help, but no answer so far. Maybe this
> is the right group for this question ?
>
> I am just looking at the machine code created by gcc. I am interested
> in simple floating vector operations, as for example:
>
> for (i = 0; i< n; i++)
> dy[i] = da*dx[i];
>
> For pipelined architectures like the alpha, loop unrolling is
> essential. [snip]
It's probably quite important to get the branch at the end of the loop
to be the right flavour. Conditional branches with a negative
displacement (it says here) are predicted to be taken. Given
something like this:
loop:
    <assignments>
    r0 <- n-i
    BGT loop
the pipeline will do the right thing (assuming I've got the test the
right way around :-). However, there's that extra bit of arithmetic
involved so C code like this
    for (i = n-1; i >= 0; i--)
        dy[i] = da*dx[i];
should be faster since the test instruction need do no extra arithmetic.
It would be interesting to see what this change to the C does to the
execution time.
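For anyone who wants to try the comparison, here is a minimal sketch of the two loop forms as separate functions, so each can be compiled to assembly and timed on its own. The function names (scale_up, scale_down, scale_forms_agree) are made up for illustration and are not from the original question. An optimizing compiler may well rewrite either loop, so the generated code should be inspected rather than assumed.

```c
#include <assert.h>

/* Count-up form, as in the original question: the loop test
   compares i against n on every iteration. */
void scale_up(double *dy, const double *dx, double da, int n)
{
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++)
        dy[i] = da * dx[i];
}

/* Count-down form: the loop test compares i against zero, so the
   decrement result can feed the branch directly. */
void scale_down(double *dy, const double *dx, double da, int n)
{
    int i;
    assert(n >= 0);
    for (i = n - 1; i >= 0; i--)
        dy[i] = da * dx[i];
}

/* Sanity check: both forms must store identical values.
   Returns 1 if they agree, 0 otherwise. */
int scale_forms_agree(void)
{
    enum { N = 8 };
    double dx[N], up[N], down[N];
    int i;

    for (i = 0; i < N; i++)
        dx[i] = i + 0.5;
    scale_up(up, dx, 3.0, N);
    scale_down(down, dx, 3.0, N);
    for (i = 0; i < N; i++)
        if (up[i] != down[i])
            return 0;
    return 1;
}
```

Compiling this with gcc -O2 -S and comparing the two loop bodies should show whether the count-down form actually saves the extra arithmetic on a given compiler version.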
--
John Haxby
jch@pwd.hp.com http://www.ice.hp.com/