Re: Jit Implementation

"bartc" <bartc@freeuk.com>
Sat, 20 Mar 2010 15:06:21 GMT

          From comp.compilers

From: "bartc" <bartc@freeuk.com>
Newsgroups: comp.compilers
Date: Sat, 20 Mar 2010 15:06:21 GMT
Organization: Compilers Central
References: 10-03-054
Keywords: code
Posted-Date: 21 Mar 2010 11:37:58 EDT

"Philip Herron" <herron.philip@googlemail.com> wrote in message


> So ok if we have successfully output target code for the processor
> would you use simply a system assembler or what i see some people
> have runtime assemblers to assemble this code, then where would this
> target code go? Then how do you link it in a useful way? And
> ultimately how would you execute this but that depends on how you
> link it or where you link it. What might work but i doubt seriously
> if it works like this, to generate a dynamic library .so and use
> dlsym to execute them from your runtime. That is if you were to
> have a nice dynamic jit rather than a fully AOT jit which generates
> code fully so you really compile your language.


I have a project which works roughly this way. Perhaps comparing this to
other systems might be useful:


It's in two parts, a compiler:


      S       Source code
   => U       AST
   => P       Pseudo-code (my IL)
   => M       Target code (representation of x86 instructions)
   => M-file  Distribution form


And a, er, Loader (I haven't found a good name for this part yet):


      M-file  Distribution form
   => M       Target code repr
   => x86     Actual binary code


The code can then be run directly in memory, from the Loader.


Libraries in the same language are also distributed in this form, and
dealt with by the Loader.
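

As a rough illustration of the Loader side, here is a minimal, self-contained
sketch. The M record format, the opcode names and the translate() function are
all invented for the example, not the real ones: it takes a few 'M' records,
translates them into x86 bytes in a buffer, and calls the result directly,
with the same executable-memory caveat as the example further down.

#include <stdio.h>
#include <stdlib.h>

enum m_op { M_LOADCONST, M_RETURN };     /* invented M opcodes */

struct m_ins {                           /* one record of the M-file */
    enum m_op op;
    int operand;
};

/* Translate M records into 32-bit x86 code at 'out'; returns byte count. */
static int translate(const struct m_ins *m, int count, unsigned char *out)
{
    int i, n = 0;
    for (i = 0; i < count; i++) {
        switch (m[i].op) {
        case M_LOADCONST:                /* mov eax, imm32 */
            out[n++] = 0xB8;
            out[n++] = (unsigned char)(m[i].operand);
            out[n++] = (unsigned char)(m[i].operand >> 8);
            out[n++] = (unsigned char)(m[i].operand >> 16);
            out[n++] = (unsigned char)(m[i].operand >> 24);
            break;
        case M_RETURN:                   /* ret */
            out[n++] = 0xC3;
            break;
        }
    }
    return n;
}

int main(void)
{
    /* Stand-in for an M-file that has already been read into memory */
    struct m_ins mfile[] = { { M_LOADCONST, 0x1234 }, { M_RETURN, 0 } };

    unsigned char *code = malloc(1000);
    int (*entry)(void);

    translate(mfile, 2, code);

    entry = (int (*)(void))code;         /* run it straight from memory */
    printf("Result = %X\n", entry());    /* expect 1234 */

    free(code);
    return 0;
}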


In this project, I stay well clear of things such as actual assemblers,
link-loaders, dynamic libraries and executable files. They just make
things slow, cumbersome and generally less dynamic than I would like.
(To access OS and other external services, it does use external DLLs.)


The 'M' target code corresponds to assembly code, but uses an internal
form which is quicker than writing out .asm files that would then require a
custom runtime assembler (a standard assembler produces object files, which
are no good here).
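

To make the contrast concrete, here is a small sketch with invented names
(not the project's real representation): the same internal instruction record
can either be encoded straight to machine-code bytes, or printed as .asm text
for a conventional assembler, which is the planned route for the occasional
executable mentioned just below.

#include <stdio.h>

struct m_ins {              /* hypothetical internal ('M') instruction */
    const char *mnemonic;   /* e.g. "mov" */
    int opcode;             /* x86 encoding, e.g. 0xB8 = mov eax,imm32 */
    int imm;                /* immediate operand */
};

/* Path 1: encode directly to machine code in memory (no assembler). */
static int emit_binary(const struct m_ins *i, unsigned char *out)
{
    out[0] = (unsigned char)i->opcode;
    out[1] = (unsigned char)(i->imm);
    out[2] = (unsigned char)(i->imm >> 8);
    out[3] = (unsigned char)(i->imm >> 16);
    out[4] = (unsigned char)(i->imm >> 24);
    return 5;
}

/* Path 2: write conventional assembly text for an external assembler. */
static void emit_asm(const struct m_ins *i, FILE *f)
{
    fprintf(f, "\t%s eax, %Xh\n", i->mnemonic, i->imm);
}

int main(void)
{
    struct m_ins ins = { "mov", 0xB8, 0x1234 };
    unsigned char buf[16];
    int k, n = emit_binary(&ins, buf);

    printf("binary: ");
    for (k = 0; k < n; k++) printf("%02X ", (unsigned)buf[k]);
    printf("\nasm   : ");
    emit_asm(&ins, stdout);
    return 0;
}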


(However, I plan to also have conventional .asm output, so that I
can create the occasional executable, such as the two programs above
(the compiler and the Loader).)


> I've been peeking at libjit a good bit, but most jit implementations
> (although i haven't obviously looked deep enough), have lots of static
> pre-processor CPU specifications and ELF specifications i guess for
> the runtime assembler and linker for the Jit. But i just don't
> understand where the code might be output or what way its compiled and
> used. Since it is definitely not helpful if you were to output code and
> link it into an executable in /tmp and execute the binary as is with
> execv or something.


I find a lot of other people's code, especially that of large, complex
working projects, to be unreadable.


> I know i am missing something or not seeing something but its bugging
> me so i hope someone can fill in the gaps for me. Its just annoying
> I've been toying with some small programs to generate code at runtime
> on a basic language, but all i do is output to a file descriptor i
> specify; simply to see if the code looks good so far, it would be nice
> what do you do next.


I started off with experiments such as the following, for the x86 (though
note that, as you appear to be on something like Linux, malloc() memory may
not be executable, as the moderator notes):


#include <stdio.h>
#include <stdlib.h>

int main(void){
    unsigned char *program;      /* unsigned, so byte values like 0xB8 fit cleanly */
    int (*fnptr)(void);
    int a;

    program = malloc(1000);      /* space for the code */

    program[0] = 0xB8;           /* mov eax,1234h */
    program[1] = 0x34;
    program[2] = 0x12;
    program[3] = 0x00;
    program[4] = 0x00;
    program[5] = 0xC3;           /* ret */

    fnptr = (int (*)(void)) program;

    a = fnptr();                 /* call the code */

    printf("Result = %X\n", a);  /* show result */

    free(program);
    return 0;
}


(If this shows "1234", then you're past the main hurdle.)


> [... on recent x86
> where data pages are usually marked no-execute. -John]


Fortunately, that doesn't seem to be the case with XP and Vista, at least.
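

For systems that do enforce no-execute on data pages, a minimal variation of
the example above (my sketch, not from the original post) is to request
executable memory explicitly: mmap() with PROT_EXEC on POSIX systems;
VirtualAlloc() with PAGE_EXECUTE_READWRITE plays the same role on Windows.

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void)
{
    unsigned char code[] = {
        0xB8, 0x34, 0x12, 0x00, 0x00,   /* mov eax,1234h */
        0xC3                            /* ret */
    };
    int (*fnptr)(void);

    /* Readable, writable and executable anonymous mapping */
    void *mem = mmap(NULL, sizeof code,
                     PROT_READ | PROT_WRITE | PROT_EXEC,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) { perror("mmap"); return 1; }

    memcpy(mem, code, sizeof code);     /* copy the generated code in */
    fnptr = (int (*)(void))mem;

    printf("Result = %X\n", fnptr());   /* expect 1234 */

    munmap(mem, sizeof code);
    return 0;
}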


--
bartc
[This sounds like a project from about a decade ago called slim binaries,
which shipped programs for the Mac in a compact low-level intermediate code
and generated 68K or PPC code as the program was loaded. They claimed that
the intermediate form was sufficiently smaller than native code that the
savings in disk access more than made up for the time spent translating.


You might also want to look at mainframe sort programs. Since the mid
1960s (maybe earlier) they've generated the code for the inner comparison
loop, run the linker and loader to bring it into RAM, and then done the sort.
-John]


