Re: A novice in compiler construction wants feedback

Bart <bc@freeuk.com>
Sat, 5 Sep 2009 11:35:11 -0700 (PDT)

          From comp.compilers

Related articles
A novice in compiler construction wants feedback programmer@telia.com (Marcus Johansson) (2009-09-03)
Re: A novice in compiler construction wants feedback derekrss@yahoo.ca (Derek) (2009-09-04)
Re: A novice in compiler construction wants feedback programmer@telia.com (Marcus Johansson) (2009-09-05)
Re: A novice in compiler construction wants feedback bc@freeuk.com (Bart) (2009-09-05)
| List of all articles for this month |

From: Bart <bc@freeuk.com>
Newsgroups: comp.compilers
Date: Sat, 5 Sep 2009 11:35:11 -0700 (PDT)
Organization: Compilers Central
References: 09-09-035
Keywords: interpreter, comment
Posted-Date: 05 Sep 2009 14:56:31 EDT

On Sep 5, 9:35 am, "Marcus Johansson" <program...@telia.com> wrote:


> My intepreter is written in C. It has got 8 registers and a stack. They're
> of a union type, something like:
> typedef union {
> int i;
> float f;
> DString *s;
> Array *a;
> }AnyType;
>
> The compiled instructions are loaded into an array of, let's say:
> typedef struct {
> int command;
> int dst;
> int src;
> } Instruction;
>
> I've set up an array with function pointers matching the command values of
> the instructions:
> void CMD_MOVE_RR(void);
> void CMD_MOVE_RC(void);
> ...
> gCommands[MOVE_RR] = CMD_MOVE_RR;
> gCommands[MOVE_RC] = CMD_MOVE_RC;
> ...
>
> I then interpret the program with:
>
> while (gProgramRunning) {
> gCommands[gInstructions[gCurrentInstruction].command]();
> gCurrentInstruction++;
> }
>
> Is this the usual way of doing it? And, above all, is there a smarter and
> faster way of interpreting the instructions? Can I somehow get rid of all
> those function calls above?


That's quite a tidy way of doing things, if you're using C.


However I couldn't manage to get things fast enough using C, or even
my own compiled language. The core (the main dispatch 'loop') of my
interpreter is in x86 assembler. I've put 'loop' in quotes because
actually it uses threaded code (jumps directly to the next command
handler).


I've managed to get down to just a 2 instruction overhead per command
(add ebx,offset; jmp [ebx]), although this relies on one register
being the 'program counter' (your gcurrentinstruction).


But overall, as the moderator says, this is very kludgy.


With C, you might look at Gcc which has a few helpful extensions, such
label variables (allowing threaded code using goto *command, although
I could never get this to work to my satisfaction.


A couple of things about your outline:


You're using the .command field of your gInstruction to index
gCommands; just put the gCommand address directly into the
gInstruction! (If you need the command index for error reports, you
can easily obtain this). And pointer step a pointer instead of using
an index (although your C compiler might take care of this).


You're also testing gProgramRunning on every single command, and of
course it is only true on the very last one. So perhaps use an
infinite loop, and detect a stop condition in the Stop command (I
assume you have a Stop command). Then you just have a small problem,
in C, of jumping out of the interpreter loop.


Perhaps use setjmp/longjmp (I've never used them), or use the
moderator's idea of putting all the code in one function; then it
might just be a goto (unless you've got stuff on the stack to take
care of).


--
Bartc
[Another thought is to make your VM instructions higher level so they do more
work between dispatches. I once did a VM for a version of the matlab math language,
and when a singler operator is "multiply large matrices" the interpreter overhead
is lost in the noise. -John]


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.