Re: Fat references

"BGB / cr88192" <cr88192@hotmail.com>
Wed, 30 Dec 2009 11:10:16 -0700

          From comp.compilers


From: "BGB / cr88192" <cr88192@hotmail.com>
Newsgroups: comp.compilers
Date: Wed, 30 Dec 2009 11:10:16 -0700
Organization: albasani.net
References: 09-12-045
Keywords: storage, GC
Posted-Date: 30 Dec 2009 23:31:13 EST

"Jon Harrop" <jon@ffconsultancy.com> wrote in message
> I've been working on a project called HLVM in my spare time:
>
> http://forge.ocamlcore.org/projects/hlvm
>
> One goal was to have fast interop with C, so I didn't want to copy the
> traditional style of placing a header with GC metadata before every value
> in the heap because that would require C arrays to be copied just to add
> this header. I couldn't be bothered to allocate a separate header so,
> instead, I pulled the GC metadata into the reference. So my references are
> now "fat": a quadword of pointer to run-time type, array length or union
> type tag, pointer to mark state and pointer to the actual data itself.
>
> This actually works rather well except I sacrificed atomic read/write of
> references. Has it been done before?
>


would it be possible to do it this way instead?
GC objects are managed on a separate GC heap, where each GC object carries
this header itself, but the header is otherwise invisible to C land (the
pointer C code sees points just past the GC-internal header).
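
a minimal sketch of the hidden-header idea, in C. the names (gc_header,
gc_alloc, gc_header_of, gc_free_raw) and the two header fields are made up for
illustration, and malloc stands in for a real GC heap:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical GC-internal header; the fields are illustrative only. */
typedef struct {
    uint32_t size;     /* payload size in bytes */
    uint32_t type_id;  /* opaque type identifier */
} gc_header;

/* Allocate a payload preceded by a hidden header.
   The caller (C land) only ever sees the payload pointer.
   Note: with an 8-byte header the payload is only 8-byte aligned;
   a real allocator would pad the header if 16-byte alignment is needed. */
void *gc_alloc(uint32_t size, uint32_t type_id)
{
    gc_header *h = malloc(sizeof(gc_header) + size);
    if (!h)
        return NULL;
    h->size = size;
    h->type_id = type_id;
    return h + 1;               /* first byte after the header */
}

/* Step back from a payload pointer to the header in front of it. */
gc_header *gc_header_of(void *payload)
{
    assert(payload != NULL);
    return (gc_header *)payload - 1;
}

/* Release the whole block (header plus payload). */
void gc_free_raw(void *payload)
{
    free(gc_header_of(payload));
}
```

the pointer returned by gc_alloc can be handed to ordinary C code as a plain
array; only the GC side ever calls gc_header_of.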


this is closer to how my GC works, and I use it mostly with C.
the GC itself also keeps track of object size, type, ref count, ...
actually, this whole header is bit-packed and fits into 64 bits. a small
hash value is also kept in the header so that the GC can detect if the header
has been overwritten (heuristic measures can also be used to aid in locating
the origin of the offending object, and help track down the offending code).
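
roughly, a bit-packed header with a check value could look like the sketch
below. the field widths (32-bit size, 16-bit type key, 8-bit ref count, 8-bit
check byte), the mixing constant, and the hdr_* names are all assumptions made
for the example, not my GC's actual layout:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (illustrative only):
   bits  0..31  payload size
   bits 32..47  type key
   bits 48..55  ref count
   bits 56..63  check byte derived from the low 56 bits */

/* Fold the low 56 bits into one byte. The XOR with 0xA5 makes an
   all-zero header fail the check. */
static uint8_t hdr_check(uint64_t low56)
{
    uint64_t x = low56 * 0x9E3779B97F4A7C15ull;
    return (uint8_t)(x >> 56) ^ 0xA5u;
}

uint64_t hdr_pack(uint32_t size, uint16_t type_key, uint8_t refs)
{
    uint64_t low = (uint64_t)size
                 | ((uint64_t)type_key << 32)
                 | ((uint64_t)refs << 48);
    return low | ((uint64_t)hdr_check(low) << 56);
}

uint32_t hdr_size(uint64_t h) { return (uint32_t)h; }
uint16_t hdr_type(uint64_t h) { return (uint16_t)(h >> 32); }
uint8_t  hdr_refs(uint64_t h) { return (uint8_t)(h >> 48); }

/* Nonzero if the stored check byte matches the rest of the header. */
int hdr_valid(uint64_t h)
{
    uint64_t low = h & 0x00FFFFFFFFFFFFFFull;
    return hdr_check(low) == (uint8_t)(h >> 56);
}
```

an 8-bit check is only a heuristic: a random overwrite still has about a 1 in
256 chance of passing.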


the type stored in this header is actually a hash key into the hash table of
known object types (sadly, the keys are assigned dynamically, so this doesn't
allow switch-based type dispatch, but it does still allow the "table of
function pointers" strategy).
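
as a sketch of that strategy: types get registered at run time in a small
open-addressing hash table, the slot index serves as the key stored in the
header, and dispatch goes through a function pointer in the slot. everything
here (type_info, type_register, type_dispatch_mark, the FNV-1a hash, the fixed
table size) is assumed for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TYPE_SLOTS 64   /* power of two; assumed large enough for the sketch */

typedef struct {
    const char *name;            /* NULL marks an empty slot */
    void (*mark)(void *obj);     /* per-type callback */
} type_info;

static type_info type_table[TYPE_SLOTS];

/* FNV-1a string hash. */
static unsigned hash_name(const char *s)
{
    unsigned h = 2166136261u;
    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

/* Register a type by name and return its slot index, which is the key
   an object header would store. Returns -1 if the table is full. */
int type_register(const char *name, void (*mark)(void *))
{
    unsigned i = hash_name(name) & (TYPE_SLOTS - 1);
    for (unsigned n = 0; n < TYPE_SLOTS; n++) {
        if (type_table[i].name == NULL) {
            type_table[i].name = name;
            type_table[i].mark = mark;
            return (int)i;
        }
        if (strcmp(type_table[i].name, name) == 0)
            return (int)i;                  /* already registered */
        i = (i + 1) & (TYPE_SLOTS - 1);     /* linear probing */
    }
    return -1;
}

/* Dispatch through the function pointer; a switch is not possible
   because the keys are only known at run time. */
void type_dispatch_mark(int key, void *obj)
{
    if (key >= 0 && key < TYPE_SLOTS && type_table[key].mark)
        type_table[key].mark(obj);
}

/* Example callback: treats obj as an int counter and bumps it. */
void mark_counter(void *obj)
{
    ++*(int *)obj;
}
```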


under this strategy, raw C data (raw pointers, malloc'ed data, ...) is,
simply, not GC'ed.


note that ref-counting is supported, but not usually enabled (this is
per-object), as most of my C code is not exactly ref-count safe...




or, if continuing to use fat pointers:
have you considered the possibility of using SIMD operations for this (AKA:
SSE/SSE2)?
I have used this both for 128-bit integers, and for an analogous "wide
pointers" system (they were 128 bits, but most of this went into a massively
expanded address space, rather than anything related to object management).
something analogous should work here, though.


SSE also allows getting/setting the high qword separately from the low qword,
which could allow, for example, using the low qword for the pointer (movd on
32-bit systems and movq on 64-bit systems move it between these references
and GPRs), and the high qword for type and GC info. I do have doubts about
storing this info in the reference, though, as it seems like it would make GC
a horrible mess: every copy of a reference would have to be kept in sync with
all the others.
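
a sketch of the low-qword/high-qword split using SSE2 intrinsics from C,
assuming an x86-64 target (the 64-bit forms _mm_cvtsi128_si64 and
_mm_cvtsi64_si128 are not available on 32-bit x86). the fat_* names are made
up, and this says nothing about whether such loads and stores are atomic:

```c
#include <assert.h>
#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stdint.h>

/* One 128-bit reference: low qword = data pointer, high qword = metadata. */
typedef __m128i fat_ref;

fat_ref fat_make(void *ptr, uint64_t meta)
{
    /* _mm_set_epi64x takes (high, low). */
    return _mm_set_epi64x((long long)meta, (long long)(uintptr_t)ptr);
}

/* Low qword to a general-purpose register (a movq on x86-64). */
void *fat_ptr(fat_ref r)
{
    return (void *)(uintptr_t)_mm_cvtsi128_si64(r);
}

/* Bring the high qword down to the low half, then extract it. */
uint64_t fat_meta(fat_ref r)
{
    return (uint64_t)_mm_cvtsi128_si64(_mm_unpackhi_epi64(r, r));
}

/* Replace only the high qword, keeping the pointer in the low qword:
   unpacklo(a, b) yields [low(a), low(b)]. */
fat_ref fat_with_meta(fat_ref r, uint64_t meta)
{
    return _mm_unpacklo_epi64(r, _mm_cvtsi64_si128((long long)meta));
}
```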


for example, if the pointers were to contain a ref count, how would this be
kept in sync from one place to another?
...


sadly, I don't know how much control the LLVM core (presumably in use here)
gives over these things.




also possible:
doing as PHP does, and storing a pointer to a header, which in turn points to
the object?
granted, this would add a possibly notable per-object overhead, but it would
likely be easier to keep synchronized, and could allow the actual backing
memory to be managed externally.
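
a sketch of that indirection in C. the gc_handle layout and the handle_*
functions are assumptions for illustration (a plain ref count stands in for
whatever the GC would actually track), not PHP's real data structures:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Separately allocated header; references point here, not at the data. */
typedef struct {
    void    *data;       /* backing memory, possibly owned by C code */
    uint32_t size;       /* size of the backing memory in bytes */
    uint32_t refs;       /* shared by every reference to this handle */
    int      owns_data;  /* nonzero if data came from malloc and is ours */
} gc_handle;

/* Wrap existing memory without copying it or adding a header in front. */
gc_handle *handle_wrap(void *data, uint32_t size, int owns_data)
{
    gc_handle *h = malloc(sizeof *h);
    if (!h)
        return NULL;
    h->data = data;
    h->size = size;
    h->refs = 1;
    h->owns_data = owns_data;
    return h;
}

/* Every reference shares the one count stored in the handle. */
gc_handle *handle_retain(gc_handle *h)
{
    h->refs++;
    return h;
}

/* Drop one reference; returns 1 if the handle was destroyed. */
int handle_release(gc_handle *h)
{
    assert(h->refs > 0);
    if (--h->refs > 0)
        return 0;
    if (h->owns_data)
        free(h->data);
    free(h);
    return 1;
}
```

the cost is one extra allocation and one extra pointer hop per object, but a
C array can be wrapped as-is and references stay a single machine word.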




