Related articles |
---|
ANN: TXR 191: Now with compiler. 157-073-9834@kylheku.com (Kaz Kylheku) (2018-04-12) |
From: | Kaz Kylheku <157-073-9834@kylheku.com> |
Newsgroups: | comp.compilers |
Date: | Thu, 12 Apr 2018 00:37:27 +0000 (UTC) |
Organization: | Aioe.org NNTP Server |
Injection-Info: | gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="55792"; mail-complaints-to="abuse@iecc.com" |
Keywords: | available, Lisp |
Posted-Date: | 11 Apr 2018 20:48:31 EDT |
http://nongnu.org/txr
http://www.kylheku.com/cgit/txr
I'm pleased to announce that after 190 public releases of the TXR
language over the past 8.5 years, there is finally a compiler for the
TXR Lisp dialect, which produces code for a virtual machine.
TXR's Lisp library is now compiled, along with the compiler and assembler,
making it self-hosting.
The virtual machine is register-based rather than a stack machine, and
uses displays for representing nested environments. Display frames are
allocated on the native stack. When lexical closures are created with
the "close" instruction, frames are relocated to the heap,
transparently to the code.
Non-local control transfers and exception handling all works, as do
delimited continuations: some minor fixing was required to integrate
the VM with the existing native-stack-capture-based delimited
continuation implementation.
Examples:
compile-toplevel:
Compile a form in a null lexical enviroment. Then invoke the resulting
VM description as if it were a nullary function (though it is not of
function type).
1> (compile-toplevel '(list (succ x) (succ y)))
** warning: (expr-1:1) unbound variable x
** warning: (expr-1:1) unbound variable y
#<sys:vm-desc: 849c150>
2> (disassemble *1)
data:
d00: x
d01: y
funs:
0: list
1: succ
code:
0: 79000003 getv t03 d00
1: 24010002 gcall t02 1 t03
2: 00030001
3: 79010004 getv t04 d01
4: 24010003 gcall t03 1 t04
5: 00040001
6: 24020001 gcall t01 0 t02 t03
7: 00020000
8: 00000003
9: 10000001 end t01
instruction count:
6
#<sys:vm-desc: 849c150>
3> (progn (defvar x 1) (defvar y 2))
y
4> (call *1)
(2 3)
compile:
Compile a function by name, or function object:
1> (defun add (x y) (+ x y))
add
2> (compile 'add)
#<vm fun: 2 param>
3> (disassemble 'add)
data:
d00: add
funs:
0: +
code:
0: 9C00000B close t01 2 11 2 2 nil v0000 v0001
1: 00020001
2: 00020002
3: 02010200
4: 5C00000A block t02 d00 10
5: 00020100
6: 24020002 gcall t02 0 v0000 v0001
7: 02000000
8: 00000201
9: 10000002 end t02
10: 10000002 end t02
11: 10000001 end t01
instruction count:
6
entry point:
3
add
Here, what was compiled is a (lambda (x y) (+ x y)) form which was then
stored back into the add function binding. We get here the disassembly
of that entire form; the entry point of the function itself is at
offset 3.
The compile function first compiled the lambda form and then invoked
it, to obtain the function out of it. This was done by the close
instruction. The close instruction created the closure, put it
into register t01, and then branched to the instruction at 11 which
terminates the machine, indicating t01 as the result value.
At offset 3 is the map of argument locations v0001 and v0001, followed
by the code of the function body.
The gcall instruction is a variant of the function call which works
with a table of cached global function bindings. It's invoking the
function at index 0 (+), with args v0001 and v0002 (the function's
own arguments).
compile-file: compile a Lisp file.
If we put the add function into a file add.tl, and invoke
(compile-file "add"), we get an add.tlo file containing this:
(0 0)
((3 4 #b'0b00009c02000200 0200020000020102 0a00005c00010300 0300022401000002
0102000003000010 0300001001000224 0000000102000000 01000010'
#(usr:add) #(sys:rt-defun usr:+)))
The first expression gives major-minor version information.
Then the compiled image follows: a list of compiled top-level forms.
There is only one. The 3 4 numbers indicate number of registers and
levels. Then there is a buffer object containing the virtual machine code.
Then two vectors: the literal data vector and function table.
The representation here is a little different because an entire
(defun ...) form was compiled, not just a lambda. We can load it
and disassemble it:
1> (load "add")
nil
2> (disassemble 'add)
data:
d00: add
funs:
0: sys:rt-defun
1: +
code:
0: 9C00000B close t02 2 11 2 2 nil v0000 v0001
1: 00020002
2: 00020002
3: 02010200
4: 5C00000A block t03 d00 10
5: 00030100
6: 24020003 gcall t03 1 v0000 v0001
7: 02000001
8: 00000201
9: 10000003 end t03
10: 10000003 end t03
11: 24020001 gcall t01 0 d00 t02
12: 01000000
13: 00000002
14: 10000001 end t01
instruction count:
7
entry point:
3
add
The situation is similar in that the form's logic begins by
taking a closure, this time into register t02. It branches to
11 like before. Here, function 0 is called, which is now sys:rt-defun,
the run-time support routine for a compiled defun.
Its arguments are d00 t02: the symbol add and the closure.
The effect, of course, is that the function is defined.
The return value of sys:rt-defun is captured into register t01
whose value is returned. (That value is the function name.)
Again, the disassembly listing for add shows this whole thing,
indicating add's specific entry point into it.
So there we have a kind of high level walkthrough of what has been brewing in
TXR development lately.
I must pat myself on the back: good four weeks of spare-time work. I'm taking
a bit of a break now from the late nights and early mornings.
--
TXR Programming Lanuage: http://nongnu.org/txr
Music DIY Mailing List: http://www.kylheku.com/diy
ADA MP-1 Mailing List: http://www.kylheku.com/mp1
Return to the
comp.compilers page.
Search the
comp.compilers archives again.