Re: Virtual Machine - Tiny Basic

stephan@pcrm.win.tue.nl (Stephan Houben)
9 Sep 2000 13:17:26 -0400

          From comp.compilers

Related articles
Virtual Machine - Tiny Basic hrm@terra.com.pe (Hugo Rozas M.) (2000-08-20)
Re: Virtual Machine - Tiny Basic Brent.Benson@oracle.com (Brent Benson) (2000-08-27)
Re: Virtual Machine - Tiny Basic ceco@jupiter.com (Tzvetan Mikov) (2000-09-08)
Re: Virtual Machine - Tiny Basic stephan@pcrm.win.tue.nl (2000-09-09)
| List of all articles for this month |
From: stephan@pcrm.win.tue.nl (Stephan Houben)
Newsgroups: comp.compilers
Date: 9 Sep 2000 13:17:26 -0400
Organization: Eindhoven University of Technology, The Netherlands
References: 00-08-100 00-08-121 00-09-022
Keywords: interpreter

On 8 Sep 2000 01:57:33 -0400, Tzvetan Mikov <ceco@jupiter.com> wrote:
>
>Brent Benson <Brent.Benson@oracle.com> wrote
>
>>[...] The reason for this is that much of the time spent interpreting code
>> is spent scanning and parsing the input. In a direct interpreter the
>> scanning and parsing needs to be performed every time a piece of code
>> is executed.
>
>A direct interpreter doesn't have to parse every time. As Hugo
>suggested the source can be tokenized only once (or even converted to
>AST) and then interpreted very efficiently.


Yes, and for increased efficiency you store the AST in reverse-
polished notation, so that we can store it linearly in an array and
don't have to do pointer-chasing all the time... wait, we just
re-invented a byte-code interpreter.


> Is there any evidence showing that this is inherently slower than
> interpreting VM instructions?


Well, you discuss two different strategies:


1. Storing the lexed tokens, reparse while executing


  This is obviously bit slower than parsing the input only once,
  since you do the parsing work over and over again. Unless your
  language has a /really/ trivial syntax, like Forth for example.


2. Storing the AST


    If you store the AST in Reverse Polish notation, then we have
    something that looks an awful lot like a Virtual Machine.
    If we store the AST in a tree-like structure, then we have to
    chase pointers and maintain our position in the tree so that
    we can back-track. That /sounds/ less efficient, but I'm pretty
    sure clever implementation techniques exist to make this fast.
    IIRC, Perl stores code as an AST.


>Actually it seems to me that the VM approach itself
>should be slower: instead of operating directly on high level
>constructs and executing them at once the interpreter has to process
>lots of simple VM instructions - more of the time will be spent in
>decoding them.


That's why most VM instruction sets are rather CISC-like in nature;
every language construct maps pretty directly on a few, or even a
single, VM instruction.


> Also it has to maintain the VM state - registers, stack, etc. which
> is totally redundant with respect to the interpreted program.


Totally redundant? A simple VM might only need an instruction pointer
register and a value stack. If you think about it, you /also/ need a
stack if you want to walk an AST, and you definitely need to store the
current syntax node under consideration. I.e., you need exactly the
same info. It's just that a naive implementation of an AST evaluator
could use recursive function calls and thus use the C (or whatever)
stack for its own purposes. But function call might be too slow to be
acceptable in the inner loop of your interpreter.


Stephan


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.