Re: VM as target, was Is Assembler Language essential

George Neuner <gneuner2@comcast.net>
Sat, 28 Feb 2009 04:42:58 -0500

          From comp.compilers

From: George Neuner <gneuner2@comcast.net>
Newsgroups: comp.compilers
Date: Sat, 28 Feb 2009 04:42:58 -0500
Organization: A noiseless patient Spider
References: 09-02-021 09-02-037 09-02-076 09-02-082 09-02-089 09-02-095 09-02-103 09-02-109 09-02-114 09-02-122 09-02-124 09-02-133 09-02-143
Keywords: VM, GC
Posted-Date: 28 Feb 2009 06:36:49 EST

On Fri, 27 Feb 2009 21:16:46 +1000, "cr88192" <cr88192@hotmail.com>
wrote:


>"George Neuner" <gneuner2@comcast.net> wrote:
>
>> As for GC, there are ways to implement precise GC of the stack that
>> don't need a frame pointer. The most obvious is tagged data, but that
>> isn't necessary either. Another possibility is to make it IP-based
>> (like exceptions). Since the compiler knows the frame layout, it can
>> associate with each user function a small GC function that processes
>> the frame, identifies the caller from the return address and
>> progressively calls the caller's GC function, etc.
>>
>the problem though is that the compiler may not know the frame layout...
>for example, consider one calls into MSVC produced code (such as the Win32
>API), which calls back into the app via a callback. then, we have a mess...


Dealing with foreign code is always a problem. It's a particular
problem when the stack contains an interleaved mix of your code and
foreign code.


However, there are ways to deal with it. Since foreign code
frequently uses a different calling convention, one reasonable method
is to provide syntax for designating foreign function calls and
functions that may be called from foreign code (for example,
LispWorks has define-foreign-function and define-foreign-callable).
Knowing how each function will be used, the compiler can then place
identifiable markers on the stack that label each section as
belonging to native or foreign code:


An FFI call would look like:
    :
        GC'd section
    IN_FOREIGN
        foreign section
    IN_NATIVE
        GC'd section
    :


An FFI callback would look like:
    :
        foreign section
    IN_NATIVE
        GC'd section
    IN_FOREIGN
        foreign section
    :


As long as foreign functions are marked one way and foreign callbacks
are marked the opposite way, their execution can be interleaved in any
order. When a stack scan begins, the IP can be used to determine
whether it is starting in a native or foreign section.
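
A toy model of that scan, as a sketch only. The assumptions here are
mine and not part of the scheme above: the stack is a Python list in
push order, each marker is pushed at the moment of the transition it
names, and the scanner can find markers without parsing foreign frames
(a real runtime would probably have to chain the markers together).
The name native_slots is hypothetical.

```python
# Assumption: a marker is pushed at the transition it names.
IN_FOREIGN = object()  # pushed when native code calls out to foreign code
IN_NATIVE = object()   # pushed when foreign code calls back into native code


def native_slots(stack, ip_in_native):
    """Collect entries from GC'd sections, walking newest to oldest.

    stack        -- entries in push order (oldest first), with markers
    ip_in_native -- True if the stopped thread's IP is in native code
    """
    slots = []
    in_native = ip_in_native
    for entry in reversed(stack):
        if entry is IN_FOREIGN:
            in_native = True    # entries older than this marker are native
        elif entry is IN_NATIVE:
            in_native = False   # entries older than this marker are foreign
        elif in_native:
            slots.append(entry)
    return slots


# native -> foreign -> native callback, stopped inside the callback
stack = ["n1", "n2", IN_FOREIGN, "f1", "f2", IN_NATIVE, "n3"]
print(native_slots(stack, ip_in_native=True))
```

Only the initial state comes from the IP; every later section boundary
is decided by the markers alone, which is why calls and callbacks can
nest to any depth.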


Another way of dealing with the problem is to use multiple stacks.




>> [I'm actually a fan of making GC based on special purpose functions as
>> much as possible rather than embedding pointer maps everywhere and
>> having generic code interpret them. IME using special functions
>> simplifies the GC and its interface to the runtime (GC only needs to
>> know how to call a function given a data object) and results in better
>> data locality and cache behavior.]
>
>possibly, but I tend to use a "generic plug-in" approach, where some code,
>such as part of the runtime, registers callbacks with the garbage collector
>to allow it to GC certain kinds of things.


Terminology ... "plug-ins" was exactly what I was getting at. Keep
the collector itself as generic as possible and confine all knowledge
of language data structures to the program and runtime. The collector
should only have to know how to find and call an associated function
given a pointer to an object (or handle to a thread, etc.).
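
To make that division of labor concrete, here is a minimal sketch
using a Python model in place of real object headers. The Collector
class, its register/mark methods, and the Pair type are hypothetical
names invented for illustration; looking the function up by type()
stands in for whatever lookup a real runtime would use.

```python
class Collector:
    """Generic marker: it knows nothing about object layouts."""

    def __init__(self):
        self._tracers = {}  # object kind -> function returning its children

    def register(self, kind, tracer):
        self._tracers[kind] = tracer

    def mark(self, roots):
        """Return every object reachable from roots."""
        marked = {}
        pending = list(roots)
        while pending:
            obj = pending.pop()
            if id(obj) in marked:
                continue
            marked[id(obj)] = obj
            tracer = self._tracers.get(type(obj))
            if tracer is not None:          # no tracer: treat as a leaf
                pending.extend(tracer(obj))
        return list(marked.values())


# Runtime side: all knowledge of Pair's layout lives here.
class Pair:
    def __init__(self, car, cdr):
        self.car, self.cdr = car, cdr


def trace_pair(p):
    return [x for x in (p.car, p.cdr) if x is not None]


collector = Collector()
collector.register(Pair, trace_pair)
live = collector.mark([Pair("a", Pair("b", None))])
print(len(live))
```

Adding a new data type means registering one more function; the
collector itself does not change.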




>now, the runtime would then probably use generic code to unwind the stack
>for each thread (but, then again, there is the issue that this could still
>be problematic if the thread is still running at the time...).


In all concurrent GC systems I am aware of, threads scan their own
registers and stacks: the GC freezes the thread, emulates a call to a
(void, parameterless) scanning function and then unfreezes the thread.
The function scans the root registers and the stack, signals the GC
somehow that it is done, and then returns leaving the thread to
continue from where it left off.


You emulate a call to the scan function by pushing the proper
registers (taken from the stopped thread context) onto the thread's
stack and then changing the IP in the saved context to point to the
start of your scan function. When the thread is unfrozen it will wake
up in the scan function instead of where it was stopped. When the
scan is complete, the scan function returns normally via the faked
call on the stack and the thread continues from the point the GC
originally stopped it. It helps if the stopped thread context is
still available to the scan function so it can avoid saving registers
and simply restore anything it overwrites directly from the saved
context.
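
The bookkeeping can be sketched with a toy thread context. This is a
Python model under my own assumptions, not code for any real OS or
CPU: Context, inject_call and return_from_call are hypothetical names,
the IP is a plain string label, and the whole register set is pushed
as one entry.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    ip: str                      # where the frozen thread will resume
    regs: dict                   # register name -> value
    stack: list = field(default_factory=list)


def inject_call(ctx, target):
    """Edit a frozen thread's context so it wakes up inside target."""
    ctx.stack.append(dict(ctx.regs))  # registers the callee may clobber
    ctx.stack.append(ctx.ip)          # faked return address
    ctx.ip = target


def return_from_call(ctx):
    """What the injected function does on exit: undo the faked call."""
    ctx.ip = ctx.stack.pop()
    ctx.regs = ctx.stack.pop()


ctx = Context(ip="user_code+0x40", regs={"r0": 7})
inject_call(ctx, "scan_roots")
resumed_at = ctx.ip               # the thread wakes up in the scan function
ctx.regs["r0"] = 0                # the scan clobbers a register
return_from_call(ctx)
print(resumed_at, ctx.ip, ctx.regs)
```

After the return the context is exactly as the GC found it, so the
interrupted code never sees that the scan ran.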


George


