Re: Fat references

"BGB / cr88192" <>
Fri, 1 Jan 2010 02:11:04 -0700

          From comp.compilers

Related articles
[4 earlier articles]
Re: Fat references (glen herrmannsfeldt) (2009-12-30)
Re: Fat references (Jon Harrop) (2009-12-30)
Re: Fat references (Kaz Kylheku) (2009-12-30)
Re: Fat references (Jon Harrop) (2009-12-30)
Re: Fat references (glen herrmannsfeldt) (2009-12-31)
Re: Fat references (Jon Harrop) (2010-01-01)
Re: Fat references (BGB / cr88192) (2010-01-01)
Re: Fat references (BGB / cr88192) (2010-01-01)
Re: Fat references (2010-01-02)
Re: Fat references (glen herrmannsfeldt) (2010-01-02)
Re: Fat references (Robert A Duff) (2010-01-02)
Re: Fat references (Jon Harrop) (2010-01-03)
Re: Fat references (Hans-Peter Diettrich) (2010-01-03)
[18 later articles]
| List of all articles for this month |

From: "BGB / cr88192" <>
Newsgroups: comp.compilers
Date: Fri, 1 Jan 2010 02:11:04 -0700
References: 09-12-045 09-12-050 09-12-056
Keywords: storage, GC
Posted-Date: 01 Jan 2010 11:45:37 EST

"Jon Harrop" <> wrote in message
> Robert A Duff wrote:

> Right. With my approach, the pointer to the run-time type can be used to
> fetch a GC-specific function that traverses the references in a given
> type,
> or a user-function to pretty print a value of that type.

ok, so type-with-pointer rather than type-with-object or

>> I don't fully understand the problem you're trying to solve, nor how
>> fat pointers are the solution.
> My original motivation was to evade typeless null pointers that afflict
> many
> VMs including the JVM and CLR. Various operations hit a dead wall with
> these nulls because they don't provide any type information. For example,
> pretty printing null is a serious problem in F# because lists, sets,
> option
> types and other types all want to use null as an efficient representation
> of "empty" but the run-time system just sees a typeless null and can only
> print "null". So F# wraps the empty list in a heap allocated object,
> massively degrading the performance of most uses of lists.

well, two solutions come to mind here:
NULL is context dependent, so how it is printed depends on where it was seen
from (AKA: the logic for printing a list would see the NULL differently than
would a generic print function, ...).

extending the "basic value set" to include values beyond NULL.
for example, I often use a value "UNDEFINED" in addition to NULL, as well as
a few other lesser-used possibilities.

granted, these are still "generic" untyped values.

EOL (End Of List), ... could easily enough exist, but in my (current) common
uses I don't use them.

> However, with the benefit of hindsight, there is a simpler but still
> efficient solution to this problem on the CLR: wrap the null in a
> value type instead of an object. That way there is no heap
> allocation and (assuming my performance results from HLVM carry) you
> don't see any slowdown at all. Unfortunately, this would work
> perfectly in theory but, in practice, the CLR has problems with
> structs inhibiting TCO. HLVM doesn't have that problem though
> (thanks to LLVM!).

personally, for my uses, I use a "trick" for implementing things like
"fixnum" and "flonum" with an otherwise "C-like" GC setup.

I dedicate that regions of the address space that are not actually
used for valid pointers to hold dynamically-typed regions. (normally,
dynamic types would be per-object).

so, a big chunk of address space goes into fixnum, and another big chunk
into flonum, as well as other types.

on Win32, for example, there is the whole region from
0xC0000000-0xFFFFFFFF which does not hold validly addressable memory,
and so I can "safely" allocate space from this region without risking

on Win64, I have a much bigger space in a different location
(71000000:00000000 - 71FFFFFF:FFFFFFFF), which I can similarly allocate out
as needed.

now, there is a little bit of an edge case:
right near the end of this region in 32-bit mode, I have usually placed
certain magic constants, for example, UNDEFINED==((void *)(-1)), ...

it also mixes much better with C than with tagged references, since for the
most part the C code need not actually know or care that there is no
"object" behind the pointer (IOW: no real need for special handling, for the
most part).

granted, with fat pointers, there is no real need for this trick either,
since I guess one can shove whole floats or doubles into the pointers if
they want (rather than devising 28 and 48 bit float formats to shove into
their pointers...).

actually, in recent times, the use of pure dynamic types has been existing
in a state of contention with "signature typing", where dynamic typing works
via identifying pointers: "oh yeah, that is a cons", "that one there is a
fixnum", ... and signature typing via a sideband "magic string" that ends up
getting moved around all over the place (the string itself representing the
type as a much more elaborate "mini-language", vs the simplistic "type name"
system used by the dynamic types), and where each has a different set of

on one hand, one system uses simple pointers which are self-identifying;
on the other, the other can specify nearly arbitrary data structures, and
can deal with relatively "dense" memory packing, raw C data, ..., as well as
often being able to produce efficient machine code, ... but, at the cost of
generally being more awkward and somewhat more complex.

>> Are you saying all pointers are fat (4
>> words per pointer!?)? Or just pointers to arrays?
> Yes, all pointers. Every single reference in the heap now occupies a
> quadword. I even blindly load and store entire quadwords when reading and
> writing references.

you say quadword but it sounds like octword.

quadword = 64-bits.
octword = 128-bits.

unless of course you are using that bizarre ELF-ish language where
word=32-bits, dword=64-bits, and qword=128-bits...


but, anyways, I don't personally like this strategy as it would somewhat
inflate memory use, and ultimately performance will pay...

it may not be drastic or immediate, but with the working set not fitting
nicely in cache, this much will be felt...

it is bad enough with 64-bit code, as one can notice how, even though it
"should" be faster in a theoretical sense, 64-bit code generally drags
behind 32-bit code in terms of raw performance...

so, I advise at least "some" caution, as it may well make sense to be able
to "optimize" things down to narrow pointers in many cases.

>> Under what circumstances are you worried about an extra copy?
> Extra copy of what?
>> If the array is allocated on the C side, you have to concoct the meta
>> data
>> somehow...
> That's no problem. If the array came from the C side then I must JIT
> compile
> a stub function that accepts HLVM's fast calling convention and invokes
> the
> external function using C's calling convention anyway, so I can put the
> extra code to convert the C array into a fat reference in that stub,
> injecting type information into the typed fat reference derived from the
> FFI definition of the external function.

calling convention transitions are another thing I don't personally
like too much. IME, this point has often become a bit of a mess.

then again, "dynamic" may eventually force one to break away, even if for
such trivial reasons that the C calling convention does not exactly mix well
with continuations, makes things like closures a bit of a mess (of the sort
involving generating a new stub for every closed-over function), ...

and, yes, like seemingly everything else, a lot of this stuff gets a slight
bit nastier when working around the SysV calling convention.

I once considered (for some uses) going over to a calling convention based
instead around the use of "frames", where frames were essentially allocated
from a special purpose heap.

but I had not done much with this idea.

the partial downside was considering that any calls through a normal C
function would essentially break the ability to implement continuations,
limiting much of the appeal of this particular idea.

>> Why not just say that arrays allocated on the C side are not
>> subject to GC?
> Simplicity is *very* important for me. I want to create a useful language
> implementation with the minimum of effort (where useful means
> high-performance parallel numerics to me) by drawing upon research without
> doing anything new myself. My entire VM is still well under 2kLOC of OCaml
> code. If I started adding special cases like different kinds of arrays
> then
> it would complicate everything from the type system to the GC and shadow
> stack.

oddly, I think my strategies are fairly conservative as well...

granted, my project is C, and is a bit bigger than 2kloc...

but, I don't think this issue is 2 complicated in my case:
GC'ed data is allocated via the GC, so it is the GC which decides, not
anything built on top of it.

granted, my GC works primarily with C.

I guess, in a way, it is sort of like Boehm, but I guess with a few
differences (errm, that the newly written GC spec ended up going over the
encoding rules for flonums, briefly describes how cons cells work, the GC
allocating many objects via the use of bitmaps, ...). I guess it really is a
different piece of technology...

some things I take for granted, but was then reminded of:
the C I use is not exactly the same as the C many others use, as there are
many influences from many languages, and in this case, I am left realizing
that the Scheme influences have far from completely died off...

after all, had Scheme influences completely died, I probably would not be
around documenting cons cells, fixnums, and flonums?...

hell, I almost may as well just add a scheme interpreter in the mix
somewhere (vs, say, using C in many cases with a bunch of what are,
essentially, Scheme-derived data types and features).

well, unless I guess most people think it is common "C" practice, say, to
append a few lists and use a "generic apply" API call with this list and a
function pointer... (and, hell, my "architecture" really is a chimerra...).

but, it is sort of an odd thought, thinking of how much one can get burried
in all of this, and then go and remember how a lot of this got started. what
all does this imply exactly? I am not sure...

>> By the way, if you're interested in GC, you should join the
>> GC list, which our esteemed moderator set up some years ago.
> I'd love to. Where is it?
> HLVM has another unusual characteristic in that it defines a high-level
> language (e.g. with tuples and tail call elimination) that is not only the
> target language but also the language the GC is written in. This has the
> advantage that I can easily JIT compile type-specialized GC code to
> improve
> performance. The approach seems to work very well because it makes
> metaprogramming trivial (e.g. instantiating generic definitions per type
> is
> trivial and very efficient) and was easy to implement.

well, I guess I will take your word for it...
but, admittedly, this particular approach doesn't reallt sound like
something I would likely do.

granted, in my case, my GC is a more-or-less disjoint component written in

I guess I still have my reservations as to how all this will turn out, but
best of luck I guess...

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.