Re: Best Ref-counting algorithms?

George Neuner <>
Fri, 17 Jul 2009 22:33:54 -0400

          From comp.compilers

Related articles
[21 earlier articles]
Re: Best Ref-counting algorithms? (Hans Aberg) (2009-07-17)
Re: Best Ref-counting algorithms? (Hans Aberg) (2009-07-17)
Re: Best Ref-counting algorithms? (Larry) (2009-07-17)
Re: Best Ref-counting algorithms? (Christoffer Lernö) (2009-07-17)
Re: Best Ref-counting algorithms? (glen herrmannsfeldt) (2009-07-17)
Re: Best Ref-counting algorithms? (George Neuner) (2009-07-17)
Re: Best Ref-counting algorithms? (George Neuner) (2009-07-17)
Re: Best Ref-counting algorithms? (Hans-Peter Diettrich) (2009-07-18)
Re: Best Ref-counting algorithms? (Christoffer Lernö) (2009-07-18)
Re: Best Ref-counting algorithms? (glen herrmannsfeldt) (2009-07-18)
Re: Best Ref-counting algorithms? (Christoffer Lernö) (2009-07-18)
Re: Best Ref-counting algorithms? (George Neuner) (2009-07-21)
Re: Best Ref-counting algorithms? (Christoffer Lernö) (2009-07-22)
[6 later articles]

From: George Neuner <>
Newsgroups: comp.compilers
Date: Fri, 17 Jul 2009 22:33:54 -0400
Organization: A noiseless patient Spider
References: 09-07-018 09-07-039 09-07-043 09-07-054 09-07-060
Keywords: GC
Posted-Date: 18 Jul 2009 08:07:20 EDT

On Fri, 17 Jul 2009 11:14:27 -0700 (PDT), Christoffer Lernö
<> wrote:

>On Jul 16, 6:35 pm, George Neuner <> wrote:
>> On Wed, 15 Jul 2009 00:05:12 -0700 (PDT), Christoffer Lernö
>> <> wrote:
>>> On Jul 14, 8:27 pm, George Neuner <> wrote:
>>>> On Sun, 12 Jul 2009 13:41:59 -0700 (PDT), Christoffer Lernö
>>> ... what I find interesting is that the scanning in the case of
>>> ref-counting is on a small subset of all objects. For more advanced
>>> tracing collectors, this is also the case. There is a nice paper on
>>> the duality between ref-counting and tracing collectors by Bacon,
>>> Cheng and Rajan.
>> You can only safely restrict scanning if you can segregate objects
>> that are shared (or could be). Most programming languages don't allow
>> for that distinction up front, so the compiler or runtime environment
>> has to handle it behind the scenes.
>You are talking about restricting scanning for a tracing GC?

No, I'm talking about segregating shared objects in a ref-counting GC
so that only the shared-object heap needs to be scanned for breaking
cycles.

>> >Bacon also claims that their collector got really good results when it
>> >comes to GC cost, quite competitive with tracing collectors.
>> I have no doubt in Bacon's claims for the applications and conditions
>> under which he tested.
>Most of it comes from the implementation of the Jalapeno Java VM. From
>what I can gather, quite a lot of different GCs were implemented and
>benchmarked on that VM.
>>> I guess I simply have to implement a few different collectors (tracing
>>> and ref-counted) and see which type I like best.
>> If you're after very high performance (a hobby of mine due to a stint
>> in real time programming), you should look into Baker's treadmill
>> collector. Treadmills are expensive space-wise, but they have fixed
>> processing costs and worst case behavior that is completely
>> predictable (a critical requirement for RT).
>I saw that treadmills are supposedly covered in Lins' book mentioned
>earlier (which I ordered); I'll do some googling to find out more too.
>But if you have any direct links to papers I'd be happy if you would
>share them.

Baker's classic paper is at

Be aware that it describes only the basic structures and operation ...
it does not offer hints as to how to adapt the system for any
particular purpose. [The same can be said for most GC papers. Most
successful GC implementations are hybrids.]

Lins' book is the bible of GC. Don't be concerned by the old
publication date (early 90's IIRC) - there are no truly "new" ideas in
GC, only new combinations of basic ideas that are all covered in that
book.

>> Every GC design has weak spots and most have worst case performance
>> that is extremely bad (treadmills are an exception to bad worst case
>> performance, but their weak spot is fixed size allocation blocks).
>> The trick is to anticipate the use - which is all but impossible if
>> the GC is intended to support general programming - and design for
>> either the best average case or the worst possible case. Most GC
>> systems today are hybrids designed for the best average case.
>My problem with tracing GCs is that once the heap grows in excess of a
>few hundred MB, most exhibit noticeable pauses. I want a GC that is
>suitable for game programming, and I'd much rather have a steady lower
>throughput that is predictable than one that occasionally pauses but
>has better average performance.

That's not a failing of tracing GC, it's a failing of the particular
implementation. There are lazy tracers that have near zero impact on
the mutator - the downside is that they need a (helluva) lot of extra
memory to make sure the mutator doesn't starve.
[don't know if this one is available elsewhere - I can send it to you
if need be.]

Generational GC solves a lot of the pause complaints because most
collection effort is concentrated on the nursery area which is
typically 1 to a few megabytes or on the first generation which is
typically 10 to 20 megabytes. Older generations are collected less
and less frequently.

>For a game, you'd want 30 or 60 FPS, meaning that the total time
>between frames is ~33 ms or ~17 ms respectively.

>It's far preferable for the GC to steal 5 ms every frame
>than 100 ms every 600 frames.

>(And that would be a rather well-behaved GC - I've experienced Java
>applications freezing for tens of seconds on my new MacBook Pro when
>collecting around 100 MB out of 250 MB.)

Not that this will solve your problem, but the server version of the
JVM has much better GC behavior.

>So anything that has a constant cost is much preferable to something
>that causes you to pay chunks of time for GC.

Reference counting in general does _NOT_ have constant costs ... RC's
costs depend greatly on whether linked structures are eagerly or
lazily freed. Sweeping a block for pointers and freeing the
associated blocks takes time proportional to the size of the
allocation. You can bound concurrent lazy sweeping, but you can't
reuse a block until it has been completely swept, so if you allow
variably sized allocations there is always a lurking pathological
case where a large array of pointers must be swept before reuse.

Treadmills, because they deal only with fixed-size blocks, have
constant operation costs and predictable behavior. They are suitable
for use in real-time programming if the worst-case operation meets the
system's timing constraints. Because they are non-moving collectors,
tracing can be done in parallel with the mutator. Treadmills need to
interrupt the mutator only during sweeping, which can be done lazily
and involves only strictly bounded, constant-cost operations.

