Formula compilation to custom bytecode

kt@mbmc.demon.co.uk (kt)
12 Jan 2003 18:01:17 -0500

          From comp.compilers

Related articles
Formula compilation to custom bytecode kt@mbmc.demon.co.uk (2003-01-12)
| List of all articles for this month |

From: kt@mbmc.demon.co.uk (kt)
Newsgroups: comp.compilers
Date: 12 Jan 2003 18:01:17 -0500
Organization: I'm all for it.
Keywords: code, question
Posted-Date: 12 Jan 2003 18:01:17 EST

I'm writing a formula compiler for a Java spreadsheet engine. The
formulas are compiled down to a small instruction set which is
then executed by an internal interpreter. The compiler is still
at a very early stage of development, and the interpreter is not
yet written.


Below you will see two files. The first one is the input to the
compiler and the second is the assembled output, followed by the
bytecode generation and, as a check, the disassembly from the
generated bytecode. Underneath at the very bottom, are a couple of
questions. Can anyone help out with answers, advice, or tips? Thanks.


Input file (c.kt):
------------------
a1=1+(b1*2)+a2;
a2=2+b1;
c2=a1*2;


Compiler output:
----------------
; Assembled by Kt's Formula Compiler V0.1
; Source: c.kt
; Date: 1/11/03 12:26 AM
;
a1:
0000 17 001 push 1
0002 168 [2:1] jsr b1
0007 17 002 push 2
0009 104 mul
0010 96 add
0011 168 [1:2] jsr a2
0016 96 add
0017 54 [1:1] store a1
0022 169 ret
;
; a1=1+(b1*2)+a2;


a2:
0023 17 002 push 2
0025 168 [2:1] jsr b1
0030 96 add
0031 54 [1:2] store a2
0036 169 ret
;
; a2=2+b1;


c2:
0037 168 [1:1] jsr a1
0042 17 002 push 2
0044 104 mul
0045 54 [3:2] store c2
0050 169 ret
;
; c2=a1*2;
; Address map
c2: 00000037
b1: -00000001
a2: 00000023
a1: 00000000
; Replacing JSRs
; Dumping bytecode
017 001 168 000 000 000 023 017
002 104 096 168 255 255 255 255
096 054 000 001 000 001 169 017
002 168 000 000 000 023 096 054
000 001 000 002 169 168 000 000
000 000 017 002 104 054 000 003
000 002 169 ; 51 bytes written.


; Disassembled bytecode
0000 push 1
0002 jsr 23
0007 push 2
0009 mul
0010 add
0011 jsr -1
0016 add
0017 store 1:1
0022 ret
0023 push 2
0025 jsr 23
0030 add
0031 store 1:2
0036 ret
0037 jsr 0
0042 push 2
0044 mul
0045 store 3:2
0050 ret


As you can see I have unresolved subroutine addresses (those that
appear as jsr -1). I've thought of shifting the origin of the code
down and including a catch-all subroutine to which unresolved
addresses would be pointed e.g:


0000 push 0
0002 ret
0003 ...


and this seems the easiest to implement. The other idea is to do
another pass over the byte code and simply replace unresolved
addresses with 'push 0', this too is quite easy, but which would be
the best method and is their another easier way to resolve this issue?


Finally, I have one more brain-burning problem for which I have no
solution as yet and it's this: I'm storing results into memory with a
column/row setup, which is roughly analogous to a paged memory
model. The 'store' instruction pops the stack and saves the result at
the given row/column address in memory. But what I'd like to do is
make the storeage memory look like one flat contiguous block, can
anyone suggest a way to do this? Oh, I should mention that I have a
16bit address space.


Thanks.


--Kt


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.