From: "Harold Aptroot" <harold.aptroot@gmail.com>
Newsgroups: comp.compilers
Date: Fri, 9 Jan 2009 13:51:33 +0100
Organization: A noiseless patient Spider
References: 09-01-010
Keywords: architecture, performance
Posted-Date: 09 Jan 2009 08:35:43 EST
"Stephan Ceram" <linuxkaffee_@_gmx.net> wrote in message
> I've made the experience that for some DSPs it's better to unroll
> loops as much as possible without taking care of the instruction
> cache. ...
>
> My feeling is that modern processors have sophisticated features (like
> prefetching, fast memories ...) that heavily help to hide/avoid
> instruction cache misses, thus they rarely occur even if a frequently
> executed loop exceeds the cache capacity. In contrast, aggressive
> unrolling reduced the expensive execution of branches (especially
> mispredicted) in the loop header and produced more optimization
> potential. In total, this pays off even at the cost of some more cache
> misses. So my first conclusion is that the commonly found restriction
> of unrolling factors to avoid too large loops not fitting in the cache
> is obsolete and does not hold for modern processors and compilers.
I have a strong feeling that it depends very much on the platform
you're targeting, and maybe also on how much memory the loop itself
accesses as data. Those data accesses obviously don't put any more
pressure on the code cache, but if the code does not fit in that
cache, then the data and the code will both be fighting for the same
resources (secondary cache, main memory). I haven't tested this at
all, but it could matter, right?
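
As a rough illustration of where that contention could start, here is a
back-of-the-envelope sketch. It assumes the unrolled loop's code size is
about the body size times the unroll factor plus a fixed overhead, which
ignores whatever the compiler manages to simplify after unrolling. The
function names and the numbers (a 96-byte body, a 16 KiB instruction
cache, 32 bytes of overhead) are made up for the example and are not
measurements from any real DSP.

```python
def unrolled_footprint(body_bytes: int, factor: int, overhead_bytes: int = 0) -> int:
    """Approximate code size of a loop unrolled `factor` times (assumed linear)."""
    return body_bytes * factor + overhead_bytes


def largest_fitting_factor(body_bytes: int, icache_bytes: int, overhead_bytes: int = 0) -> int:
    """Largest unroll factor whose estimated footprint still fits in the
    instruction cache. Returns at least 1, i.e. the loop left as it is."""
    if body_bytes <= 0:
        raise ValueError("body_bytes must be positive")
    return max(1, (icache_bytes - overhead_bytes) // body_bytes)


def spills_to_shared_levels(body_bytes: int, factor: int, icache_bytes: int,
                            overhead_bytes: int = 0) -> bool:
    """True if the estimated footprint exceeds the instruction cache, so that
    code fetches would start competing with data for the secondary cache
    and main memory."""
    return unrolled_footprint(body_bytes, factor, overhead_bytes) > icache_bytes


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to show the arithmetic.
    body, icache, overhead = 96, 16 * 1024, 32
    limit = largest_fitting_factor(body, icache, overhead)
    print("largest fitting factor:", limit)
    print("spills at limit + 1:", spills_to_shared_levels(body, limit + 1, icache, overhead))
```

Whether going past that limit actually hurts is exactly the open
question: it would depend on the platform and on how much data traffic
the loop generates, so this only estimates the point where code and data
could begin to share the lower levels.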