Re: Bytecode Compiler

dido@imperium.ph
28 Apr 2004 14:41:08 -0400

          From comp.compilers

From: dido@imperium.ph
Newsgroups: comp.compilers
Date: 28 Apr 2004 14:41:08 -0400
Organization: Compilers Central
References: 04-04-064
Keywords: code, interpreter, comment
Posted-Date: 28 Apr 2004 14:41:08 EDT

On Wed, Apr 21, 2004 at 12:47:42AM -0400, Chris Cranford wrote:
> But when we begin tossing in the concept of variables and strings, things
> begin to get complicated and hard to follow. Lets assume the following
> example:
>
> C/C++                                BASIC
> void main(int argc, char* argv[])
> {                                    Dim X as Integer = 3
>   int x = 3, y = 2;                  Dim Y as Integer = 2
>   printf("Sum is %i\n", (x+y));      Print "Sum is "; (x+y)
> }
>
> How should I generate bytecode to reference variables X and Y? Then to
> assign the values of 3 and 2 to each variable respectively?
>


That depends on how you want to organize storage for local variables.
If you follow the C/C++ paradigm, you need something like a call frame
stack, plus special instructions for reading from and writing to it.
The initial code you generate might then look something like this:


ENTER 2      ; reserve two local variable slots on the frame stack
PUSH 3       ; put 3 on top of the register stack
POP (fsp-2)  ; store it in x, second from the top of the frame stack
PUSH 2       ; put 2 on top of the register stack
POP (fsp-1)  ; store it in y, at the top of the frame stack
PUSH (fsp-1) ; reread y
PUSH (fsp-2) ; reread x
ADD          ; add x and y, sum is at top of regstack


Function arguments would also be stored on the call frame stack. Note:
the 'ENTER' instruction there is taken from x86 assembly language. The
real ENTER instruction has some complications attached to it but you
should get the picture that it's supposed to reserve space on the call
frame stack.
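
To make the "how do I generate this" part more concrete, here is a
minimal Python sketch of a code generator that assigns each local
variable a frame stack slot and emits the instructions shown above. It
is only an illustration under stated assumptions: the names FrameLayout
and gen_function are made up for this example, the instruction set is
the hypothetical one from the listing, and the slot order (first
declared local deepest, so x is at fsp-2 and y at fsp-1) is simply
chosen to match it. The values 3 and 2 come from the quoted example.

```python
class FrameLayout:
    """Maps local variable names to offsets below the frame stack pointer."""

    def __init__(self, names):
        # Assumption: the first declared local sits deepest in the frame,
        # so with two locals x gets (fsp-2) and y gets (fsp-1).
        self.count = len(names)
        self.slots = {name: self.count - i for i, name in enumerate(names)}

    def operand(self, name):
        """Return the textual operand that addresses a local variable."""
        return f"(fsp-{self.slots[name]})"


def gen_function(layout, assignments, add_operands):
    """Emit bytecode text: store constants into locals, then add two locals."""
    code = [f"ENTER {layout.count}"]          # reserve one slot per local
    for name, value in assignments:
        code.append(f"PUSH {value}")          # constant onto register stack
        code.append(f"POP {layout.operand(name)}")  # store into the local
    left, right = add_operands
    # The right operand is pushed first to mirror the hand-written listing;
    # ADD is commutative, so the order does not change the sum.
    code.append(f"PUSH {layout.operand(right)}")
    code.append(f"PUSH {layout.operand(left)}")
    code.append("ADD")
    return code


layout = FrameLayout(["x", "y"])
program = gen_function(layout, [("x", 3), ("y", 2)], ("x", "y"))
print("\n".join(program))
```

The point of the sketch is that variable names never appear in the
bytecode. The compiler's symbol table turns each name into a frame
offset once, and every later reference just reuses that offset.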


> And finally, when the print/printf statement is executed, there is a string
> involved that has to come from some place. Someone has mentioned in the past
> to encode it as part of the opcode stream and another option is to use a
> symbol table and reference it from there.
>


Again, this has to do with storage and memory management. You would
need a heap or a constant storage region that is initialized with the
value of the string. If the string is stored at offset 0 from the start
of the heap, the whole code for the body of the function might be:


ENTER 2      ; reserve space for two local variables
PUSH 3       ; put 3 on top of the register stack
POP (fsp-2)  ; store it in x, second from the top of the frame stack
PUSH 2       ; put 2 on top of the register stack
POP (fsp-1)  ; store it in y, at the top of the frame stack
PUSH (fsp-1) ; reread y
PUSH (fsp-2) ; reread x
ADD          ; add x and y, sum is at top of regstack
FPUSH        ; pop the top of the stack and push it into the CFS
PUSH 0       ; put the offset of the string on the heap on the register stack
FPUSH        ; move the string offset into the CFS as well
CALL printf
FPOP 2       ; discard two arguments from the frame stack
LEAVE        ; discard our local variables
RET          ; return to the caller
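
For what it's worth, a sequence like this can be run by a very small
interpreter. The following Python sketch is one possible reading of the
instruction set, not a definitive design. Everything in it is an
assumption made for illustration: the run function, the tuple encoding
of instructions, the heap modeled as a dict from offset to string, and
printf treated as a built-in that takes the format offset from the top
of the call frame stack and exactly one integer below it. The sketch
also moves the string offset to the frame stack with an explicit FPUSH,
so that FPOP 2 finds both arguments there.

```python
def run(program, heap):
    """Interpret (opcode, operand) pairs; return (output, frame, regs)."""
    regs = []     # register (operand) stack
    frame = []    # call frame stack (CFS): locals and call arguments
    bases = []    # frame sizes saved by ENTER and restored by LEAVE
    output = []

    def cell(operand):
        # An operand ("fsp", k) names the k-th cell below the frame top.
        _, depth = operand
        return len(frame) - depth

    for op, arg in program:
        if op == "ENTER":
            bases.append(len(frame))
            frame.extend([0] * arg)
        elif op == "PUSH":
            if isinstance(arg, tuple):
                regs.append(frame[cell(arg)])   # read a local
            else:
                regs.append(arg)                # immediate constant
        elif op == "POP":
            frame[cell(arg)] = regs.pop()       # write a local
        elif op == "ADD":
            regs.append(regs.pop() + regs.pop())
        elif op == "FPUSH":
            frame.append(regs.pop())            # register stack -> CFS
        elif op == "FPOP":
            del frame[len(frame) - arg:]        # discard call arguments
        elif op == "CALL":
            if arg != "printf":
                raise ValueError(f"unknown function: {arg}")
            fmt = heap[frame[-1]]               # format string by heap offset
            output.append(fmt % frame[-2])      # one integer argument
        elif op == "LEAVE":
            del frame[bases.pop():]             # discard the locals
        elif op == "RET":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return "".join(output), frame, regs


HEAP = {0: "Sum is %i\n"}
PROGRAM = [
    ("ENTER", 2),
    ("PUSH", 3), ("POP", ("fsp", 2)),    # x = 3
    ("PUSH", 2), ("POP", ("fsp", 1)),    # y = 2
    ("PUSH", ("fsp", 1)), ("PUSH", ("fsp", 2)),
    ("ADD", None),
    ("FPUSH", None),                     # sum becomes a call argument
    ("PUSH", 0), ("FPUSH", None),        # string offset becomes one too
    ("CALL", "printf"),
    ("FPOP", 2),
    ("LEAVE", None),
    ("RET", None),
]

text, frame_after, regs_after = run(PROGRAM, HEAP)
print(text, end="")
```

Note that the (fsp-k) operands are relative to the current top of the
frame stack, so they only stay valid until the next FPUSH. A real design
would more likely address locals relative to a fixed frame base, much as
the x86 code does with BP.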


16-bit x86 assembly code to do the same thing might look like this:


STR: "Sum is %i\n"


ENTER 4,0        ; reserve 4 bytes: two 16-bit local variables
MOV AX,3
MOV [BP-2],AX    ; first local variable, x
MOV AX,2
MOV [BP-4],AX    ; second local variable, y
MOV AX,[BP-2]    ; load x back
ADD AX,[BP-4]    ; add y to that
PUSH AX          ; save on stack
PUSH OFFSET STR  ; push address of string
CALL _printf
ADD SP,4         ; discard the two arguments
LEAVE
RET


> Could someone help me put together a quick opcode stream that would use
> variables and strings like the above to help me grasp how I should generate
> opcode sequences for a virtual machine?


Storage management is one of the harder parts of designing a virtual
machine environment, so I think it would help to look at the design of
some real interpreters and virtual machines out there to see how they
do it.


[This would be a good question for the virtmach list, a spinoff from
comp.compilers. To subscribe, send "subscribe" to
virtmach-request@lists.iecc.com. -John]

