From: | "BGB" <cr88192@hotmail.com> |
Newsgroups: | comp.compilers |
Date: | Wed, 2 Sep 2009 09:14:06 -0700 |
Organization: | Compilers Central |
References: | 09-07-074 09-07-095 09-07-105 09-08-050 09-08-056 09-09-005 <af54caaf0909020734p185e03a6hb95570b55dcbed9a@mail.gmail.com> |
Keywords: | code |
Posted-Date: | 02 Sep 2009 23:41:25 EDT |
> On Tue, Sep 1, 2009 at 12:12 AM, BGB / cr88192<cr88192@hotmail.com> wrote:
>>>> more recently, my project has taken a very different approach:
>>>> both the VM and native code are C.
>>> <snip />
>>>> however, all this effort does have a payoff:
>>>> plain C to plain C integration.
>>>
>>> If I understand you correctly, it seems that this interface wouldn't
>>> be portable to other VMs, and that the standard libraries could not be
>>> shared with another implementation which did not use a C virtual
>>> machine?
>>
>> another VM could use a similar strategy, but granted, to use plain C<->C
>> interfacing, the VM in question would need to be able to support C.
>>
>> the great problem is that this quickly rules out most simple /
>> language-specific VMs...
>>
>> basically, to handle both a dynamic language, and C, the VM would need a
>> similar level of complexity to that of .NET ...
>
>
> But only if you need to 'handle' C in your VM. I am at a loss as to
> why this might be useful. To make a dynamic VM, with a useful FFI, you
> don't need to handle C in any meaningful way, except:
> - it might be useful to parse some C subset to generate glue code
> (if using the ideas I outlined in my first mail)
> - to link to C libraries, which isn't hard (no dynamic linking
> requirements).
one major advantage:
you can run C in the VM.
C is a very useful language to be able to run as a scripting language, since
it is a fairly powerful language.
> You certainly don't need a JIT, an 'unsafe' environment, to process C
> as bytecode (does .NET even do this?) or any of the rest of the
> complexity of .NET.
>
limiting oneself to an interpreter would limit a lot of what one could do.
a JIT compiler is a very powerful and capable tool.
for example, Quake3 used bytecoded C, and one of the first things it did was
to JIT compile it...
granted, Q3's JIT was a bit simpler than my JIT, namely in that it mostly
consisted of direct bytecode/machinecode replacement (rather than a full
compiler stage plus an assembler+linker), but oh well...
actually, I will note that early on, I borrowed a few misc ideas from Q3's
JIT, among other things...
>
>
>> I can already more-or-less glue dynamic typing to C-style data
>> representations, ...
>
> Maybe, but why would you want to? In my opinion, all you do here is
> prevent people from reimplementing your language, which is the same
> mistake that all existing scripting languages have made.
>
the main advantage is that it allows many places to internally use static
typing, which allows for greater performance in many cases...
another advantage is that it can save memory...
for example, typical dynamic typesystems use a pointer-sized value for every
possible value, and incur extra performance-eating costs (such as the need
to fiddle with type-tags in the pointer bits, or to deal with ranges or
displacements under another tagging scheme, ...).
eliminating maybe 75%-90% of the internal dynamically-typed operations means
that much more speed.
and, eliminating pointer-based references for structural data types means
that much more saved space (at least on x86-64, where pointers are 8
bytes each...).
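
as a rough illustration of the two costs above (not code from my project), here is a minimal C sketch. the low-bit tagging scheme and all names in it (dynval, dyn_add, point_static, ...) are hypothetical, and it assumes the compiler does an arithmetic right shift on negative values, as the usual x86 compilers do.

```c
#include <stdint.h>

/* hypothetical scheme: a pointer-sized word whose low 2 bits hold a type tag */
typedef uintptr_t dynval;
enum { TAG_FIXNUM = 1, TAG_BITS = 2, TAG_MASK = 3 };

dynval dyn_from_int(intptr_t i)
{
    return ((uintptr_t)i << TAG_BITS) | TAG_FIXNUM;
}

int dyn_is_int(dynval v)
{
    return (v & TAG_MASK) == TAG_FIXNUM;
}

/* assumes arithmetic right shift for negative values (implementation-defined in C) */
intptr_t dyn_to_int(dynval v)
{
    return (intptr_t)v >> TAG_BITS;
}

/* dynamically typed add: tag checks, untagging and retagging on every call */
dynval dyn_add(dynval a, dynval b)
{
    if (dyn_is_int(a) && dyn_is_int(b))
        return dyn_from_int(dyn_to_int(a) + dyn_to_int(b));
    return 0; /* non-integer cases are omitted in this sketch */
}

/* statically typed add: no tag work at all */
int32_t static_add(int32_t a, int32_t b)
{
    return a + b;
}

/* the same two-field record, stored flat vs. as pointer-sized dynamic values */
struct point_static  { int32_t x, y; };
struct point_dynamic { dynval  x, y; };
```

on a typical x86-64 target, sizeof(struct point_static) is 8 bytes while sizeof(struct point_dynamic) is 16, and dyn_add does several extra operations that static_add never needs.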
another advantage is that it works well in a world where most things are
"good old C", since it is no longer necessary to have lots of hairy
glue-code in order to marshal data (in many cases).
>
>
>> thus, my ideas for how to do JS in my present framework...
>> if done well, JS "should" be able to achieve close to 1:1 performance
>> with C
>> (if compiled to statically-typed native code via lots of internal
>> trickery...).
>
> I wouldn't assume this for a second. You can't statically compile
> Javascript to native code at all, primarily because of the existence
> of AJAX, which fetches code at run-time from the server, and eval()s
> it. In the absence of this, and other means of run-time code
> generation, Jensen's work (http://www.cs.au.dk/~amoeller/papers/tajs/)
> shows that, yes, JS is very statically type-able. But the application
> is not compilation - none of the recent Javascript implementations
> (SquirrelFish, V8 and TraceMonkey) are statically compiled, nor could
> they be if they wanted to process real-life Javascript.
>
who ever said anything about statically compiling the JS?...
I simply said it would be compiled to statically-typed native code.
this means:
A, it is compiled to dynamically typed bytecode;
B, the JIT does all the trickery to lower it to static typing.
it is also worth noting that eval would not necessarily be broken, since
eval would use the same process...
the main idea behind the lowering is that it would function vaguely
similarly to C++ templates (although, JITing dynamic types as dynamic types
is also an option, and probably what would be done initially).
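
to make the template analogy a bit more concrete, here is a small C sketch of the idea, written by hand rather than generated by a JIT. everything in it (the value struct, add_int, add_double, select_add, ...) is hypothetical and only stands in for what a specializing JIT might emit: one statically-typed body per type signature, plus a dynamically-typed fallback for when the types are not known.

```c
/* hypothetical value representation: a type tag plus a payload */
typedef enum { T_INT, T_DOUBLE } vtype;

typedef struct {
    vtype type;
    union { int i; double d; } u;
} value;

value make_int(int i)
{
    value v;
    v.type = T_INT;
    v.u.i = i;
    return v;
}

/* statically typed bodies: one instance per type signature,
   much like template instantiations */
int    add_int(int a, int b)          { return a + b; }
double add_double(double a, double b) { return a + b; }

/* dynamically typed fallback: inspects the tags on every call */
value add_dynamic(value a, value b)
{
    value r;
    if (a.type == T_INT && b.type == T_INT) {
        r.type = T_INT;
        r.u.i = add_int(a.u.i, b.u.i);
    } else {
        double x = (a.type == T_INT) ? (double)a.u.i : a.u.d;
        double y = (b.type == T_INT) ? (double)b.u.i : b.u.d;
        r.type = T_DOUBLE;
        r.u.d = add_double(x, y);
    }
    return r;
}

/* which instance a call site would be bound to, given what is
   known about the operand types when the code is compiled */
typedef enum { USE_ADD_INT, USE_ADD_DOUBLE, USE_ADD_DYNAMIC } add_impl;

add_impl select_add(int types_known, vtype a, vtype b)
{
    if (!types_known)
        return USE_ADD_DYNAMIC;  /* JIT dynamic types as dynamic types */
    if (a == T_INT && b == T_INT)
        return USE_ADD_INT;
    return USE_ADD_DOUBLE;       /* mixed operands are widened to double */
}
```

a call site bound to add_int or add_double does no tag checks at run time; only sites where the types could not be worked out pay for add_dynamic.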
the main drawback would be that eval-heavy code could have reduced
performance if every eval operation involved a pass through the JIT compiler
(granted, one could also add an interpreter, which would be used for eval
and friends...).
the main complexity with JIT and JS is part of the typesystem:
there is no clear distinction between integer and real types;
nearly all overflows are defined as automatically going to double;
double is expensive (especially when using SSE, as apparently SSE's double
is not quite as fast as x87's double...).
this would likely force all of the numeric code to use doubles, which would
be kind of lame...
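
one way around falling back to doubles everywhere would be to keep values as integers for as long as they fit, and only promote on overflow. a minimal C sketch of that check, assuming 32-bit integers and a hypothetical jsnum result type (neither is from my codebase):

```c
#include <stdint.h>

/* hypothetical result type: either a 32-bit integer or a double */
typedef struct {
    int is_double;
    union { int32_t i; double d; } u;
} jsnum;

/* integer add that promotes to double when the result no longer fits */
jsnum js_add_int(int32_t a, int32_t b)
{
    /* do the add in 64 bits so the check itself cannot overflow */
    int64_t wide = (int64_t)a + (int64_t)b;
    jsnum r;

    if (wide >= INT32_MIN && wide <= INT32_MAX) {
        r.is_double = 0;
        r.u.i = (int32_t)wide;
    } else {
        r.is_double = 1;
        r.u.d = (double)wide;
    }
    return r;
}
```

the catch is that every integer operation now carries a range check and a possible change of type, which is exactly the kind of thing that makes static lowering awkward.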
it is worth noting that I can also "eval" C, but with minor restrictions.
namely, that at present, the C needs to be in the form of a valid
"compilation unit", which means including headers (which kills performance),
and one has to fetch a function pointer to actually do the "eval" part...
sadly, this means plain C is not a very ideal language for "eval", but oh
well...
actually, this particular cost has caused me to do most of my dynamic code
generation in assembler...
one thing I have thus far failed at making is a platform-neutral assembler
(well, apart from my RPNIL language, which is my main IL, but I
typically don't use it as an assembler as such...).
partly, though, it is that most of my low-level codegen tasks are very
low-level, and so typically require ASM-like powers...
or such...