Avoiding Branch Misprediction in Virtual Machine (VM) Portably

moron451-compiler1@yahoo.com
12 May 2007 20:06:08 -0700

          From comp.compilers

Related articles
Avoiding Branch Misprediction in Virtual Machine (VM) Portably moron451-compiler1@yahoo.com (2007-05-12)
Re: Avoiding Branch Misprediction in Virtual Machine (VM) Portably anton@mips.complang.tuwien.ac.at (2007-05-13)
Re: Avoiding Branch Misprediction in Virtual Machine (VM) Portably moron451-compiler1@yahoo.com (2007-05-13)
Re: Avoiding Branch Misprediction in Virtual Machine (VM) Portably eliotm@pacbell.net (Eliot Miranda) (2007-05-14)
| List of all articles for this month |

From: moron451-compiler1@yahoo.com
Newsgroups: comp.compilers
Date: 12 May 2007 20:06:08 -0700
Organization: Compilers Central
Keywords: VM, design, question
Posted-Date: 12 May 2007 23:35:49 EDT

Hello,


I've read several web sites and papers about coding virtual machines
using a threaded code technique (Anton Ertl's papers for example). He
recommends the use of GNU C's labels-as-values as a somewhat portable
way to do threaded coding. However, I don't want to rely on a non-
portable method like that. I want my VM to be 100% ANSI C. That
seems to only leave me with the "Giant Switch" method. However... I
recently thought about a different method that still uses the switch
statement, but instead of one giant switch, it uses one for each
instruction (see below). I was wondering if anyone can tell me if
this is a good idea or not.


It isn't exactly the most pristine example of code-reuse, and it might
grow quite large since each instruction has to have its own copy-and-
paste of the switch statement, but setting that aside, I think it
looks like it could be a winner. I'm a bit worried that I haven't
found anything advocating this in "the literature" however, so maybe I
am missing an important problem with it. Any comments?


label _START:


i = nextInstruction();


switch(i)
{
case ADD: goto _ADD;
case SUB: goto _SUB;
case MUL: goto _MUL;
case DIV: goto _DIV;
case PUSH: goto _PUSH;
case POP: goto _POP;
}


label _ADD:


add();
i = nextInstruction();


switch(i)
{
case ADD: goto _ADD;
case SUB: goto _SUB;
case MUL: goto _MUL;
case DIV: goto _DIV;
case PUSH: goto _PUSH;
case POP: goto _POP;
}


label _SUB:


sub();
i = nextInstruction();


switch(i)
{
case ADD: goto _ADD;
case SUB: goto _SUB;
case MUL: goto _MUL;
case DIV: goto _DIV;
case PUSH: goto _PUSH;
case POP: goto _POP;
}


label _MUL:


mul();
i = nextInstruction();


switch(i)
{
case ADD: goto _ADD;
case SUB: goto _SUB;
case MUL: goto _MUL;
case DIV: goto _DIV;
case PUSH: goto _PUSH;
case POP: goto _POP;
}


label _DIV:


div();
i = nextInstruction();


switch(i)
{
case ADD: goto _ADD;
case SUB: goto _SUB;
case MUL: goto _MUL;
case DIV: goto _DIV;
case PUSH: goto _PUSH;
case POP: goto _POP;
}


label _PUSH:


push();
i = nextInstruction();


switch(i)
{
case ADD: goto _ADD;
case SUB: goto _SUB;
case MUL: goto _MUL;
case DIV: goto _DIV;
case PUSH: goto _PUSH;
case POP: goto _POP;
}


label _POP:


pop();
i = nextInstruction();


switch(i)
{
case ADD: goto _ADD;
case SUB: goto _SUB;
case MUL: goto _MUL;
case DIV: goto _DIV;
case PUSH: goto _PUSH;
case POP: goto _POP;
}


etc. etc.
[Ignoring the detail that there's no "label" keyword in C, I don't see
the point. A reasonable compiler will recognize collapse the common code
sequences and jumps to jumps so you'll end up with the same object code
you would have from the giant switch. Also, if you're going to call a
routine for each operator, you can use a much smaller loop approximately
like this:


extern int *pc:


void (*op)()[] = { &add, &sub, &mul, ... };


for(;;) (op[*pc++])();


Or make the code a list of function pointers, so it's


for(;;) (*op++)();


-John]





Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.