Re: Managing the JIT

"BGB / cr88192" <cr88192@hotmail.com>
Thu, 30 Jul 2009 00:01:07 -0700

          From comp.compilers


From: "BGB / cr88192" <cr88192@hotmail.com>
Newsgroups: comp.compilers
Date: Thu, 30 Jul 2009 00:01:07 -0700
Organization: albasani.net
References: 09-07-079 09-07-093 09-07-108 09-07-113
Keywords: code, incremental
Posted-Date: 30 Jul 2009 23:14:26 EDT

"Armel" <armelasselin@hotmail.com> wrote in message
news:09-07-113@comp.compilers...
> "BGB / cr88192" <cr88192@hotmail.com> wrote in message news:
>> "Armel" <armelasselin@hotmail.com> wrote in message
>>> "Philip Herron" <herron.philip@googlemail.com> wrote in message
>>>> [...]
>>>> something. I am having trouble finding more stuff on how this works
>>>> would be great if you could point me in the right direction?! :)
>>>
>>> I just stumbled on the asmjit project: http://code.google.com/p/asmjit/
>>> it does not seem to contain megatons of code; you may find it
>>> interesting to read.
>>
>> [...]
>> OTOH, my assembler uses a textual interface (with a NASM / YASM style
>> syntax).
>> [...]
>
> in fact, the function call based API could be considered as a lower
> access level to the same JIT. inside your assembler with textual
> interface, you probably already have this code, not providing it to
> who needs it is just a loss.


I am not sure how it would be a loss...


basically, pretty much any capability of the assembler is available from the
textual interface.


however, the textual interface provides capabilities that would not be
available through direct function calls, such as multi-pass compaction
(that is: the first pass assumes all jumps/... to be full length, and
additional passes then safely compact the jumps).
...
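
A rough sketch of the multi-pass idea, not BASM's actual code: every jump starts out in its long form, and each pass shrinks any jump whose displacement would fit in a signed byte, until a pass changes nothing. The names Jump and relax_jumps are made up for illustration, and the sizes assume the x86 "jmp rel32" (5 bytes) and "jmp rel8" (2 bytes) forms. Since shrinking a jump only ever moves things closer together, a jump that has been made short stays valid.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes: x86 "jmp rel32" is 5 bytes, "jmp rel8" is 2 bytes. */
enum { LONG_JMP = 5, SHORT_JMP = 2 };

/* Hypothetical record for one jump; offsets refer to the layout in
   which every jump is still in its long form. */
typedef struct {
    long pos;       /* offset of the jump instruction */
    long target;    /* offset of the instruction it jumps to */
    int  is_short;  /* set by relax_jumps() */
} Jump;

/* Map an all-long offset to its offset in the current layout. */
static long adjust(const Jump *j, size_t n, long off)
{
    long adj = off;
    for (size_t i = 0; i < n; i++)
        if (j[i].is_short && j[i].pos < off)
            adj -= LONG_JMP - SHORT_JMP;
    return adj;
}

/* Shrink jumps until a pass changes nothing; returns the pass count. */
int relax_jumps(Jump *j, size_t n)
{
    int passes = 0, changed;

    assert(j != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        j[i].is_short = 0;              /* first layout: all full length */

    do {
        changed = 0;
        passes++;
        for (size_t i = 0; i < n; i++) {
            if (j[i].is_short)
                continue;
            j[i].is_short = 1;          /* tentatively shrink */
            long end  = adjust(j, n, j[i].pos) + SHORT_JMP;
            long disp = adjust(j, n, j[i].target) - end;
            if (disp >= -128 && disp <= 127)
                changed = 1;            /* fits in rel8, keep it short */
            else
                j[i].is_short = 0;      /* still too far, stay long */
        }
    } while (changed);
    return passes;
}
```

With direct per-instruction calls that write bytes immediately, the first jump would already have been emitted before the later ones are known, which is why this sort of compaction is easier when the assembler sees the whole text first.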


granted, my internal code does not go about attempting nearly so nice an
interface as asmjit (with a single function per instruction, ...), rather,
the interface is a good deal more terrible...




want to see?...


void BASM_OutOpGeneric2(BASM_Context *ctx, int op, int w,
  char *lbl0, int breg0, int ireg0, int sc0, long long disp0,
  char *lbl1, int breg1, int ireg1, int sc1, long long disp1);


void BASM_OutOpGeneric3(BASM_Context *ctx, int op, int w,
  char *lbl0, int breg0, int ireg0, int sc0, long long disp0,
  char *lbl1, int breg1, int ireg1, int sc1, long long disp1,
  char *lbl2, int breg2, int ireg2, int sc2, long long disp2);


...


this, in turn, dispatches to a tower of logic to figure out how to go about
assembling each instruction...




at the lower levels, it is stepping along strings, processing specific
embedded commands (mostly to emit specific hex-bytes or prefixes, encode the
SIB and ModR/M bytes, ...).


at its core, I derived part of the idea from the Quake3 JIT, which was
basically producing JIT'ed code mostly via the powers of command strings,
typically containing hex-bytes...
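
To give a feel for the command-string approach, here is a minimal sketch. The notation is invented for this example (it is neither BASM's nor Quake3's real command set): pairs of hex digits are emitted as literal bytes, and "/r" emits a ModR/M byte in register-direct form (mod=11) built from the given reg and rm fields. The function name emit_pattern is likewise hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Value of one hex digit; the caller guarantees isxdigit(c). */
static int hexval(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    return tolower(c) - 'a' + 10;
}

/* Step along a pattern string and emit bytes into out[].
   Returns the number of bytes written, or 0 on an unknown command
   or when out[] is too small. */
size_t emit_pattern(const char *pat, int reg, int rm,
                    unsigned char *out, size_t cap)
{
    size_t n = 0;

    assert(reg >= 0 && reg < 8 && rm >= 0 && rm < 8);
    while (*pat) {
        unsigned char byte;

        if (*pat == ' ') {              /* spaces only separate commands */
            pat++;
            continue;
        }
        if (pat[0] == '/' && pat[1] == 'r') {
            /* ModR/M: mod=11 (register direct), then reg, then rm */
            byte = (unsigned char)(0xC0 | (reg << 3) | rm);
            pat += 2;
        } else if (isxdigit((unsigned char)pat[0]) &&
                   isxdigit((unsigned char)pat[1])) {
            byte = (unsigned char)((hexval((unsigned char)pat[0]) << 4) |
                                   hexval((unsigned char)pat[1]));
            pat += 2;
        } else {
            return 0;                   /* unknown command */
        }
        if (n >= cap)
            return 0;
        out[n++] = byte;
    }
    return n;
}
```

For instance, emit_pattern("89 /r", 1, 0, buf, sizeof buf) writes the two bytes 89 C8, one valid encoding of "mov eax, ecx" (reg=ecx is the source, rm=eax the destination).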




go much lower than these generic dispatch functions, and your frontend
will likely be left dealing with issues like figuring out which bits go
into a REX prefix, ...
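
As an illustration of the kind of detail meant here, the following sketch computes a REX prefix from register numbers 0..15. The bit layout (0100WRXB) is the standard x86-64 one; the function name, the use of -1 for "no such register", and the convention of returning 0 for "no prefix needed" are assumptions of this example. It also ignores the special case where a bare 0x40 prefix is needed to reach SPL/BPL/SIL/DIL.

```c
#include <assert.h>

/* Build a REX prefix (binary 0100WRXB) for x86-64.
   wide  - nonzero for a 64-bit operand size (REX.W)
   reg   - register in ModR/M.reg, or -1 if unused
   index - SIB index register, or -1 if unused
   base  - register in ModR/M.rm or SIB.base, or -1 if unused
   Returns 0 when no prefix is needed. */
unsigned char rex_prefix(int wide, int reg, int index, int base)
{
    unsigned char rex = 0x40;

    assert(reg < 16 && index < 16 && base < 16);
    if (wide)       rex |= 0x08;    /* W: 64-bit operand size     */
    if (reg >= 8)   rex |= 0x04;    /* R: extends ModR/M.reg      */
    if (index >= 8) rex |= 0x02;    /* X: extends SIB.index       */
    if (base >= 8)  rex |= 0x01;    /* B: extends ModR/M.rm/base  */

    return rex == 0x40 ? 0 : rex;
}
```

For "mov rax, r9" in the 89 /r form (reg=r9, rm=rax) this gives 0x4C, so the full encoding is 4C 89 C8.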




> by the way, one could well have the generation code behind each
> function, e.g. jit.move (ax, 25);, generate either the binary or
> assembler text for dump purposes. and the upper level interface could
> understand this dumped ASM and generate the calls!
>


I don't see the purpose...


FWIW, I don't have any such "mov" function.


if you want "mov", you first have to look up the correct opcode number for
"mov" (currently done via a loop), then call one of the dispatch functions
with that number, the correct width, any appropriate argument fields filled
in, ...


and, BTW, registers are passed in as integer constants, ...
(ok, granted, internally there is a nicety for this: for my own uses I at
least made macros for them...).
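
The lookup step described above might look roughly like this. It is only a sketch: the table contents, the opcode numbers (here just table indices), the macro names, and lookup_opcode itself are all hypothetical and not taken from BASM. The register macros use the hardware encoding numbers, which may well differ from the constants the real assembler uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical register constants (x86-64 hardware encodings). */
#define REG_RAX 0
#define REG_RCX 1
#define REG_RDX 2
#define REG_RBX 3

/* Hypothetical mnemonic table; the index doubles as the opcode number. */
static const char *const op_names[] = { "add", "mov", "sub", "xor" };

/* Linear search for a mnemonic; returns its number, or -1 if unknown. */
int lookup_opcode(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof op_names / sizeof op_names[0]; i++)
        if (strcmp(op_names[i], name) == 0)
            return (int)i;
    return -1;
}
```

The number returned would then be passed as the "op" argument of one of the generic dispatch functions, together with the width and the operand fields.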




> i'd personally be more likely to use a direct-to-binary JIT rather
> than a textual interface (a JIT could be of interest to our ECMAScript
> engine) but indeed a dump facility could really interest me as well,
> though I see that as a debugging feature, not the cornerstone of the
> JIT engine.
>


what does the binary interface buy you?...


i=3;
basm_print("mov rax, %d\n", i);


or similar, is a whole hell of a lot less work than composing the correct
function calls...


note that wrapping every single opcode with a function would likely be far
more work than writing most of the assembler.


internally, some of the code is auto-generated, a lot is generic parsing
and table-processing code, and a lot is string-processing code and predicate
functions (used to help identify the correct opcode form for a given
mnemonic, ...).




granted, maybe one "could" autogenerate a huge source file with a function
call for every form of every instruction, but personally, I fail to see the
value of this (just a big mass of stuff clogging up the DLL's export
table...).




personally, I have found an interface based on begin/end pairs plus a
printf-like call to be a very convenient and usable way to drive the
assembler...


the overall performance difference either way is likely to be small, as in
this case the internal processing (figuring out which opcode to use, ...)
is likely to outweigh the cost of parsing.




granted, my compiler does not typically print every instruction directly,
but usually uses its own internal printf-like statements to build a buffer,
and this buffer is then sent to the assembler all at once (via the function
"basm_puts()", which differs from "basm_printf()" mainly in that it is much
more direct, and can handle much bigger data...).
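
The buffer-building step can be sketched as below. This is an assumed illustration, not the compiler's real code: AsmBuf and asmbuf_printf are hypothetical names, and the exact signature of basm_puts() is not given in this post, so the final hand-off is left to the caller.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable text buffer for accumulating assembler source. */
typedef struct {
    char  *text;    /* NUL-terminated once anything has been appended */
    size_t len;     /* characters stored, not counting the NUL */
    size_t cap;     /* bytes allocated */
} AsmBuf;

/* Append printf-style formatted text; returns the number of characters
   added, or -1 on failure (the buffer is left unchanged). */
int asmbuf_printf(AsmBuf *b, const char *fmt, ...)
{
    va_list ap;
    int need;

    assert(b != NULL && fmt != NULL);

    va_start(ap, fmt);
    need = vsnprintf(NULL, 0, fmt, ap);     /* measure only */
    va_end(ap);
    if (need < 0)
        return -1;

    if (b->len + (size_t)need + 1 > b->cap) {
        size_t cap = b->cap ? b->cap : 256;
        while (cap < b->len + (size_t)need + 1)
            cap *= 2;
        char *p = realloc(b->text, cap);
        if (p == NULL)
            return -1;
        b->text = p;
        b->cap = cap;
    }

    va_start(ap, fmt);
    vsnprintf(b->text + b->len, b->cap - b->len, fmt, ap);
    va_end(ap);
    b->len += (size_t)need;
    return need;
}
```

A caller would start from "AsmBuf b = {0};", append lines such as asmbuf_printf(&b, "mov rax, %d\n", 3), hand b.text to the assembler in one go, and then free(b.text).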


