Re: Best Ref-counting algorithms?

GPS <>
Tue, 14 Jul 2009 12:31:41 -0600

          From comp.compilers

Related articles
Best Ref-counting algorithms? (=?ISO-8859-1?Q?Christoffer_Lern=F6?=) (2009-07-12)
Re: Best Ref-counting algorithms? (Hans Aberg) (2009-07-13)
Re: Best Ref-counting algorithms? (=?ISO-8859-1?Q?Christoffer_Lern=F6?=) (2009-07-13)
Re: Best Ref-counting algorithms? (BGB / cr88192) (2009-07-13)
Re: Best Ref-counting algorithms? (2009-07-14)
Re: Best Ref-counting algorithms? (=?ISO-8859-1?Q?Christoffer_Lern=F6?=) (2009-07-14)
Re: Best Ref-counting algorithms? (Hans Aberg) (2009-07-14)
Re: Best Ref-counting algorithms? (GPS) (2009-07-14)
Re: Best Ref-counting algorithms? (George Neuner) (2009-07-14)
Re: Best Ref-counting algorithms? (Hans-Peter Diettrich) (2009-07-15)
Re: Best Ref-counting algorithms? (=?ISO-8859-1?Q?Christoffer_Lern=F6?=) (2009-07-14)
Re: Best Ref-counting algorithms? (=?ISO-8859-1?Q?Christoffer_Lern=F6?=) (2009-07-15)
Re: Best Ref-counting algorithms? (2009-07-15)
Re: Best Ref-counting algorithms? (BGB / cr88192) (2009-07-15)
[26 later articles]
| List of all articles for this month |

From: GPS <>
Newsgroups: comp.compilers
Date: Tue, 14 Jul 2009 12:31:41 -0600
Organization: XMission
References: 09-07-018 09-07-032
Keywords: GC
Posted-Date: 14 Jul 2009 16:45:09 EDT

Torben Cgidius Mogensen wrote:

> Christoffer Lernv <> writes:
>> If one would like to implement a fast GC using refcounting, should I
>> implement something like in those papers, or are there better
>> algorithms out there?
> If you want to implement a fast GC, you should not use refcounting. The
> main problem with refcounting is that an assignment to a pointer
> variable (which would otherwise just be a register-to-register transfer)
> requires two memory reads, two memory writes and a conditional jump, so
> it stresses the memory system far more than most other GC algorithms.
> Refcounts should, IMO, only be used in the following circumstances:
> - Operations on heap pointers are very rare, so even a slow GC doesn't
> matter in the overall picture.
> - You need hard real-time guarantees.
> - The heap is shared between processes/processors with no central
> control.
> So refcounts are fine for such things as shared libraries (you don't
> very often add or delete references to these), but for a functional or
> OO language where you have many small heap structures with short
> lifetimes, it will give massive overhead.

I think that there may be some other aspects of this problem to consider, or
expand on, to give a more complete description.

Reference counting is often more efficient in general than other GC,
especially with modern computer systems, and this statement is dependent on
a variety of factors.

If a GC results in page transfers to/from the disk, then it will drastically
slow down the program, and system.

If a GC follows the roots, any pages that are stored in virtual memory on a
disk will be transferred back into RAM, so as the program progresses it may
experience significant pauses from transfers at unexpected times.

Reference counting does not require scanning all roots. It is a fine-
grained approach to the problem, but as many have noted, it's much more
costly and difficult in general to handle circular references, and things of
that nature.

Reference counting (RC) in and of itself does not eliminate leaks. RC may
require testing constructors and destructors in a loop to verify that leaks
do not occur in some situations.

Reference counting usually requires more work from the programmer. In my
own work, I have found a shadow of doubt forms about all of my usage of
reference counting, and thus I do not feel confident about code, until it
has been tested, and even then, I have doubts.

Reference counting can require more memory than other types of GC, because a
container is usually used that contains a refcount, and a union of some
primitive storage for type(s). This means that accessing data requires more
indirection, and allocation of the container is required. One alternative
for some situations is to pass register-sized quantities directly, and rely
on a container or object structure for more complex types.

With a mark-and-sweep GC you can use a tag bit, with some extra indirection,
and thus may avoid the need for a container.

Without reference counting it is not uncommon for another type of GC to
leak. Most operating systems use a C API, and when using threads, and the
native API of the system, there can be races at times that are unexpected,
or unexpected false references. Premature reuse of memory is often just as
bad as a leak, and they both occur with most systems I know of that have
used a form of GC.

Time is a difficult aspect for all allocators to clearly define and solve.
Every allocator that exists today has a weakness, whether it is a disk
allocator, or a general memory allocator. The optimal patterns for reducing
fragmentation, metadata overhead, and having the optimal speed, are not set
in stone. It seems that dynamic algorithms that adapt to the situation
might be a solution, but even those have a cost, and thus we do what works
best for now, with our current understanding.

As some Smalltalk developers IIRC have said (paraphrase) "operating systems
are for people that don't know how to design a programming language." That
might fit a LISP developer too.

It's generally much easier to build a system from the ground up with GC,
than to adapt another system to work with it. On the other hand though,
device drivers, and the other constraints of modern systems impose their
will upon us.

BTW Knuth's TAOCP gives a reference to a paper about a solution for a Scheme
implementation that uses reference counting, and handles circular patterns.


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.