Re: Branch prediction firstname.lastname@example.org (2000-05-20)
Re: Branch prediction email@example.com (2000-05-21)
Re: Branch prediction firstname.lastname@example.org (2000-05-21)
Re: Branch prediction email@example.com (Andi Kleen) (2000-05-21)
Re: Branch prediction firstname.lastname@example.org (2000-05-28)
Re: Branch prediction email@example.com (2000-05-31)
Re: Inline caching (was Re: Branch prediction) firstname.lastname@example.org (2000-06-01)
Re: Branch prediction email@example.com (2000-06-03)
Re: Branch prediction firstname.lastname@example.org (2000-06-20)
Date: 28 May 2000 21:05:36 -0400
Organization: Mailgate.ORG Server - http://www.Mailgate.ORG
>In virtual machine (VM) interpreters BTBs have only 0%-20% prediction
>accuracy if the interpreter uses a central dispatch routine, but they
>give about 50% prediction accuracy if every VM instruction has its own
>dispatch routine.
This is possible with GCC's labels-as-values extension. Another good
reason to use GCC.
There is an interesting aspect of the BTB that I just ran into. If you
implement inline caches with an indirect jump (instead of patching the
code), you pay no penalty, because the jump through the inline cache is
always predicted correctly (by the very definition of inline caching).
>I believe this can be improved even more by combining common sequences of
>VM instructions into one VM instruction.
An easier way is to combine similar adjacent bytecodes into a single
routine. For example (I use a switch statement syntax here):
case 0: case 1: ... pushOOP(instanceVariable(*ip++ & 15));
But watch out for decode penalties!
GCC is really wonderful in this respect. The && (address-of-label)
operator is a great addition. I put a lot of optimization effort into
GNU Smalltalk, and the payoff was good: it now runs about as fast as
Dolphin Smalltalk, which is written in assembly language... I could not
have done it without GCC.