|From:||jbuck@Synopsys.COM (Joe Buck)|
|Date:||20 Dec 1995 13:29:44 -0500|
|Organization:||Synopsys Inc., Mountain View, CA 94043-4033|
|References:||95-12-063 95-12-077 95-12-103|
email@example.com (David Keppel) writes:
>The article reported that a 68K *interpreter* written in RISC code had
>a theoretical cost of 68 cycles, not including cache misses, and that
>a dynamic cross-compiler for the same function produced code that
>theoretically ran in 18 cycles. Indeed, for small benchmarks, good
>speedups were observed using the dynamic cross-compiler. However,
>when run on one real x86 application (unnamed, code size not
>specified), the dramatically larger code generated by the dynamic
>cross-compiler ran *slower* than the interpreter code, because the
>instruction cache miss rates were terrible with the larger code.
Good so far. The problem is with the conclusion that this has
anything at all to do with RISC or "the RISC penalty".
PowerPC code is on the order of 1.5 times bigger than 68k code. Let's
be generous to Pittman and say it's twice as big. Now, let's say
that instead of emulating a 68k on a PowerPC we did the reverse. The
68k processor would also have a cache. The interpreter would fit in
the cache; a large-enough dynamically recompiled program would not
fit. Would Pittman then conclude that there is a "CISC Penalty"?
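To put rough numbers on this, here is a minimal sketch of the trade-off using the two cycle counts quoted above (68 for the interpreter, 18 for the cross-compiled code). The miss penalty of 20 cycles and the function names are assumptions made up for illustration, not figures from the article. Nothing in the model depends on whether the host is RISC or CISC.

```python
# Cycle counts as quoted from the article; both exclude cache misses.
INTERP_CYCLES = 68    # interpreter, assumed to stay resident in the cache
COMPILED_CYCLES = 18  # dynamically cross-compiled code

def effective_cycles(base_cycles, misses, miss_penalty):
    """Base cost plus stall cycles spent on instruction-cache misses."""
    return base_cycles + misses * miss_penalty

def break_even_misses(miss_penalty):
    """Misses at which compiled code costs as much as a miss-free interpreter."""
    return (INTERP_CYCLES - COMPILED_CYCLES) / miss_penalty

MISS_PENALTY = 20  # hypothetical: cycles lost per instruction-cache miss

if __name__ == "__main__":
    print("break-even misses:", break_even_misses(MISS_PENALTY))
    for misses in (0, 1, 2, 3):
        print(misses, "misses:", effective_cycles(COMPILED_CYCLES, misses, MISS_PENALTY), "cycles")
```

Under these assumed numbers the compiled code loses its 50-cycle advantage once it takes more than 2.5 misses over the same stretch of emulated work, which is what a working set larger than the cache will do on any host.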
>I believe that one of Pittman's conclusions was that code-expanding
>transformations (``optimizations'') are less likely to be successful
>with a RISC because they're already running ``close to saturation'' of
>the instruction memory bandwidth.
Again, CISC processors with caches can have the same problem:
code-expanding transformations may cause loops that used to fit in the
L1 cache to no longer fit. All we can conclude is that a processor
with a less-densely-coded instruction set may need a somewhat larger
cache than a processor with a more-densely-coded instruction set, and
that code-expanding transformations may not always be a good idea.
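The same point as a small capacity check. This is only a sketch: the 6 KB loop, the 8 KB cache, and the function names are hypothetical, and the density factors 1.5 and 2 are the ones mentioned earlier in this post. A real cache also has associativity and line-size effects that this ignores.

```python
def code_bytes(dense_bytes, isa_factor=1.0, transform_factor=1.0):
    """Size of a loop after ISA encoding overhead and any code-expanding transformation."""
    return dense_bytes * isa_factor * transform_factor

def fits_in_cache(dense_bytes, cache_bytes, isa_factor=1.0, transform_factor=1.0):
    """True if the whole loop fits in an instruction cache of cache_bytes."""
    return code_bytes(dense_bytes, isa_factor, transform_factor) <= cache_bytes

KB = 1024
LOOP = 6 * KB    # hypothetical loop size in the denser encoding
CACHE = 8 * KB   # hypothetical L1 instruction cache

if __name__ == "__main__":
    # Denser encoding, no transformation: fits.
    print(fits_in_cache(LOOP, CACHE))
    # Less dense encoding (factor 1.5): needs a somewhat larger cache.
    print(fits_in_cache(LOOP, CACHE, isa_factor=1.5))
    # Denser encoding, but a transformation doubles the code: same problem.
    print(fits_in_cache(LOOP, CACHE, transform_factor=2.0))
```

The second and third cases fail for the same reason, the code no longer fits, regardless of which instruction set produced the extra bytes.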
-- Joe Buck <firstname.lastname@example.org> (not speaking for Synopsys, Inc)