Re: Jit Implementation

"bartc" <bartc@freeuk.com>
Tue, 23 Mar 2010 11:51:07 -0000

          From comp.compilers


From: "bartc" <bartc@freeuk.com>
Newsgroups: comp.compilers
Date: Tue, 23 Mar 2010 11:51:07 -0000
Organization: Netfront http://www.netfront.net/
References: 10-03-070
Keywords: code
Posted-Date: 23 Mar 2010 23:51:44 EDT

"BGB / cr88192" <cr88192@hotmail.com> wrote in message
> "bartc" <bartc@freeuk.com> wrote in message


>> I have a project which works roughly this way:


>> => P Pseudo-code (my IL)
>> => M Target-code (representation of x86 instructions)
>> => x86 Actual binary code


> yes, ok.
> but, it is confusing here as "M" is said to represent x86 instructions,
> whereas normally one would send out code at the IL level?...


Yes, I was originally writing 'P' code to disk, but then it was interpreted.
But I needed something faster, and JITing from the then P-code was too much
of a leap.


I didn't want to go the route of assembler output, object code, linking and
executable files either. I wanted to maintain the dynamism of an
interpreted language.


I'm also now using a dynamic language to implement both the compiler and
loader parts (previously the loader, or interpreter as it then was, was
written in a hard-compiled language), so the whole thing is flexible to
some extent.
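
To illustrate the last step of the pipeline quoted above (M => x86), here is
a minimal sketch of turning a list of M-level records into raw bytes in
memory, with no assembler or linker involved. The record names
(`mov_eax_imm`, `add_eax_imm`, `ret`) and the `encode` function are
hypothetical and not taken from the project; only the three opcode bytes
(B8, 05, C3) are standard 32-bit x86 encodings.

```python
import struct

def encode(m_code):
    """Encode hypothetical M-level records as 32-bit x86 machine code."""
    out = bytearray()
    for op, *args in m_code:
        if op == "mov_eax_imm":      # mov eax, imm32  -> B8 id
            out += b"\xB8" + struct.pack("<i", args[0])
        elif op == "add_eax_imm":    # add eax, imm32  -> 05 id
            out += b"\x05" + struct.pack("<i", args[0])
        elif op == "ret":            # ret             -> C3
            out += b"\xC3"
        else:
            raise ValueError(f"unknown M instruction: {op}")
    return bytes(out)

code = encode([("mov_eax_imm", 42), ("add_eax_imm", 1), ("ret",)])
```

A loader would then copy `code` into executable memory and call it; that
step is platform-specific and is left out here.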


>> In this project, I stay well clear of things such as actual assemblers,
>> link-loaders, dynamic libraries and executable files. They just make
>> things slow, cumbersome and generally less dynamic than I would like.
>> (To access OS and other external services, it does use external DLLs.)


> I have a dynamic assembler and linker, which function about like that of a
> normal (stand-alone) assembler and linker, but which operate entirely in
> memory.
...


> generally, I find performance to be acceptably good, even in most cases
> when generating code with time-constraints or in loops, and even with all
> the extra crap I ended up adding on eventually.


I wanted to avoid assemblers, and all that goes with them, completely. Then
I remembered that my language allows inline assembler, and to translate it
to my 'M' format, I would have to write an assembler of sorts...


But in the dynamic language I was using, this added less than 300 lines to
the compiler, plus various tables. (Because the assembler source exists in
the framework of a high-level language, I could cut a few corners.)
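
As a rough idea of why a table-driven assembler of this kind can stay small,
here is a sketch that translates one line of inline assembler text into an
M-style record. Everything here (the `OPCODES` and `REGISTERS` tables, the
`MInstr` record, the operand tags) is invented for illustration; the actual
tables and M format are not shown in this post.

```python
from typing import NamedTuple

# Hypothetical tables: mnemonic -> operand count, and known register names.
OPCODES = {"mov": 2, "add": 2, "sub": 2, "push": 1, "pop": 1, "ret": 0}
REGISTERS = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}

class MInstr(NamedTuple):
    op: str
    operands: tuple

def parse_operand(text):
    """Classify one operand as register, memory, immediate or symbol."""
    text = text.strip().lower()
    if text in REGISTERS:
        return ("reg", text)
    if text.startswith("[") and text.endswith("]"):
        return ("mem", text[1:-1].strip())
    try:
        return ("imm", int(text, 0))   # accepts 10, 0x1f, ...
    except ValueError:
        return ("name", text)          # left for the host compiler to resolve

def assemble_line(line):
    """Translate one source line to an MInstr, or None if it is blank."""
    line = line.split(";", 1)[0].strip()   # drop ';' comments
    if not line:
        return None
    parts = line.split(None, 1)
    mnemonic = parts[0].lower()
    if mnemonic not in OPCODES:
        raise ValueError(f"unknown mnemonic: {mnemonic}")
    raw = parts[1].split(",") if len(parts) > 1 else []
    operands = tuple(parse_operand(r) for r in raw)
    if len(operands) != OPCODES[mnemonic]:
        raise ValueError(f"{mnemonic}: wrong operand count")
    return MInstr(mnemonic, operands)
```

Symbols tagged `name` are where the surrounding high-level compiler can do
the work (looking up variables and labels), which is one way such an
assembler can cut corners.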


> there is a slight complexity that can result from dynamic COFF linking,
> which is dealing with matters of dependency order/resolution, but I had
> dealt with this via a hack:


A 'slight' complexity...? I've always tried to design around formal
link/loaders where possible.


>> (However, I plan to have also conventional .asm output, so that I
>> can create the occasional executable, such as those two programs
>> above.)
>>
>
> ok.


> the question though is, once again, why to emit a representation at
> this level: if it is at the same level as ASM, then it is not
> portable between systems; but, at the same time, one has to
> translate it to make it runnable.


For distribution purposes, the choice was to use Source code, P code, or M
code. I've chosen the latter for the present.


> with a traditional IL, it can be compiled to use different CPU
> architectures; with raw machine code, it can be run on the processor
> directly.
>
> this is, unless I have misunderstood what is being done here.


Nothing much, except that this is an experimental project and I don't care
about other processors at the minute...


> granted, an IL can be very low-level, essentially representing a
> glossed-over version of the target architectures, or simply a virtual
> processor (where the IL opcodes are fairly directly mapped to their native
> analogues in most cases).


My original 'P' IL mapped to byte-code and needed to be interpreted.


That imposed many restrictions, but since I no longer have the headache of
dealing with it efficiently at runtime, it's now fairly high-level: it has a
stack, registers and operands in any combination, but is still just a linear
sequence of instructions.


It's a little higher level than x86, my main target, but not too much:


* 3 fundamental types against x86's one main 32-bit type
* 2-address instructions against x86's one-address form


However, it has fewer registers: just one main register, plus one or two
auxiliary ones (I've never figured out how to compile for multiple
registers).
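
A minimal sketch of what compiling for a single main register can look like,
under some loud assumptions: that "2-address" means an IL instruction such
as `add a, b` may name two memory variables, that the main register maps to
`eax`, and that the IL is a list of `(op, dst, src)` tuples. None of this is
the project's actual IL or code generator; the names `lower` and `operand`
are made up.

```python
def operand(x):
    """Render an IL operand: ints are immediates, strings are variables."""
    return str(x) if isinstance(x, int) else f"[{x}]"

def lower(il):
    """Lower 2-address IL tuples to x86-style text using only eax."""
    out = []
    for op, dst, src in il:
        if isinstance(dst, int):
            raise ValueError("destination must be a variable")
        if op == "mov":
            out.append(f"mov eax, {operand(src)}")
            out.append(f"mov {operand(dst)}, eax")
        elif op in ("add", "sub"):
            # load destination, combine with source, store back
            out.append(f"mov eax, {operand(dst)}")
            out.append(f"{op} eax, {operand(src)}")
            out.append(f"mov {operand(dst)}, eax")
        else:
            raise ValueError(f"unknown IL op: {op}")
    return out

asm = lower([("mov", "x", 5), ("add", "x", "y")])
```

Every IL instruction goes through the one register and back to memory, so
no register allocation is needed, at the cost of redundant loads and stores
between consecutive instructions.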


--
Bartc

