From: | mschmit@ix.netcom.com (Mike Schmit) |
Newsgroups: | comp.compilers,comp.arch |
Date: | 21 Feb 1996 00:07:11 -0500 |
Organization: | Netcom |
References: | 96-02-165 96-02-195 96-02-221 |
Keywords: | architecture, performance, benchmarks |
Terje Mathisen <Terje.Mathisen@hda.hydro.com> writes:
>I believe this is mostly due to the way 486, Pentium and PPro handles
>code prefetching, both 486 and PPro really likes to have the top of
>busy loops aligned near the beginning of a cache line, i.e. you can
>get this effect in single-tasking mode, with no system traffic at all.
>
>The Pentium relaxed the branch target requirement to 32-bit boundaries.
Actually, this is not true, even though I think I said it in my
book and I think Intel has said it too. If an instruction straddles
a cache line boundary and it is, for example, the top instruction in a
loop, you will lose performance. Usually, a 4-byte boundary will work
OK. But if the instruction is 5 bytes long, then at 1 of the 8
4-byte-aligned positions in a 32-byte cache line (the one starting at
offset 28) it will straddle the boundary and there will be a delay.
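
To make the 1-in-8 figure concrete, here is a minimal sketch that just
counts which aligned start offsets make an instruction cross a line
boundary. It assumes a 32-byte cache line; the function names and the
constant are made up for illustration, and it says nothing about how
many cycles the delay actually costs.

```python
# Assumption: 32-byte cache line, as on the Pentium.
CACHE_LINE_BYTES = 32


def straddles_line(offset, length, line_bytes=CACHE_LINE_BYTES):
    """Return True if an instruction of `length` bytes starting at
    byte `offset` has its first and last bytes in different lines."""
    first_line = offset // line_bytes
    last_line = (offset + length - 1) // line_bytes
    return first_line != last_line


def straddling_offsets(length, alignment, line_bytes=CACHE_LINE_BYTES):
    """List the `alignment`-aligned start offsets within one cache line
    at which an instruction of `length` bytes crosses into the next."""
    return [
        off
        for off in range(0, line_bytes, alignment)
        if straddles_line(off, length, line_bytes)
    ]


if __name__ == "__main__":
    bad = straddling_offsets(length=5, alignment=4)
    total = CACHE_LINE_BYTES // 4
    # Hypothetical case from the text: 5-byte instruction, 4-byte alignment.
    print(f"{len(bad)} of {total} aligned offsets straddle: {bad}")
```

For a 5-byte instruction on 4-byte boundaries this reports only offset
28, while an instruction of 4 bytes or less never straddles at that
alignment.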
Mike Schmit
-------------------------------------------------------------------
mschmit@ix.netcom.com author:
408-244-6826 Pentium Processor Programming Tools
800-765-8086 ISBN: 0-12-627230-1