Re: JIT Help... (Torben AEgidius Mogensen)
25 Feb 2001 10:54:07 -0500

          From comp.compilers

Related articles
JIT Help... (J. Akmal) (2001-02-23)
Re: JIT Help... (Daniel C. Wang) (2001-02-25)
Re: JIT Help... (2001-02-25)
Re: JIT Help... (2001-03-01)
Re: JIT Help... (Peter Bertelsen) (2001-03-01)
Re: JIT Help... (J. Akmal) (2001-03-01)
Re: JIT Help... (2001-03-04)
| List of all articles for this month |

From: (Torben AEgidius Mogensen)
Newsgroups: comp.compilers
Date: 25 Feb 2001 10:54:07 -0500
Organization: Department of Computer Science, U of Copenhagen
References: 01-02-121
Keywords: tools, Java
Posted-Date: 25 Feb 2001 10:54:07 EST

"J. Akmal" <> writes:

>I just got done writing a compiler and VM (stack based) for a custom
>language, and now I need to attack the JIT. I only need to support the
>Pentium and Mac PowerPC. I don't need to squeeze every once of speed out

The first thing you do is to determine the memory model you want to
use. This includes which registers you want to use for stack pointers,
if the stack grows up and down, if the SP points to the last used
stack element or the first unused, where the stack limit is stored and
what to do when tehre is overflow etc. A good start for this is to
write an interpreter for your VM in assembly language.

Once you have done this, it will probably be a simple manner to
describe each of your VM instructions as a sequence of simple
operations on this state, including which registers are used etc.
These simple operations can typically easily be expressed as short
machine-code segments. You now basically have a VM to machine-code
translator. You can write your translator in a medium-level language
(like C) which allows low-level access to memory and calling
machine-code at arbitrary addresses (by type-casting to function
pointers, label pointers or similar). If you can avoid writing your
translator directly in assembly, you will have done yourself a great

The resulting JIT will be fast at generating code, but the code
performance won't be stellar - probably 4-10x the speed of interpreted
VM code. If that is fast enough for you (and the size of the generated
code isn't a problem), stop, because the next step becomes more

If your performance is bad, it is probably because every VM
instruction is translated into a sequence of instructions that read
from memory, operate in registers and write to memory. If you can do
more in registers, that will help you. A standard way of doing this is
to do a compile-time emulation of the stack movements and keep values
in registers. This is done the following way:

At entry into a function/method/basic-block/whatever, you assume
everything is on the stack or in special registers according to the
memory model above. In between these, some of the stack elements are
in registers. You do this by, at compile time, emulating movements on
the runtime stack: Instead of moving a value to the stack you record
(at compile time) where it is stored instead (typically, in a
register). Instead of taking a value from the stack, you consult this
record and take it from where it is actually stored. When you get to a
point where you have assumed the normal memory model is followed, you
do the necessary moves to achieve this state and reset your record

In addition to the compile-time record of the run-time stack, you also
need to keep a record of the available registers, so you can select
one when you need a slot for a new stack element. When the stack is
popped, the register used to store the top element is marked as
available. If no registers are available, the contents of a register
is pushed on the run-time stack and the record in the compile-time
stack is updated. It is typically a good idea to store the register
that is used to store the bottommost element of the compile-time
stack, as that makes it easier to restore the stack to its modelled

The above has a problem if the stack can grow unboundedly between two
points where the normal memory model is used. So, if your VM allows
arbitrary stack growth in a loop, you must enforce the memory model at
every loop entry (or, simpler, at every basic block).

A related problem occurs when two paths join. If they have used
different registers to hold the stack values, the state needs to be
made consistent. JVM (AFAIR) requires stacks to be empty at
join-points, so this neatly solves/sidesteps the issue.

This method will allow you to get a good speedup over the simple macro
expansion method without having to resort to global analysis or other
costly measures.

Torben Mogensen (

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.