copying vs. mark-sweep gc's; more gc biblio; funny (Paul Wilson)
Mon, 10 Feb 1992 19:15:57 -0500

          From comp.compilers

Related articles
Re: gc's and locality (with ref. list) (1992-02-10)
copying vs. mark-sweep gc's; more gc biblio; funny (1992-02-10)
Re: copying vs. mark-sweep gc's; more gc biblio; funny (1992-02-12)
gc locality and the gc toolkit (1992-02-13)
copying vs. mark-sweep gc's; more gc biblio; funny gadbois@MCC.COM (David Gadbois) (1992-02-15)
| List of all articles for this month |

Newsgroups: comp.compilers
From: (Paul Wilson)
Keywords: storage, bibliography
Organization: Compilers Central
References: 92-02-044
Date: Mon, 10 Feb 1992 19:15:57 -0500 (Ulrich Neumerkel) writes:
|>In article 92-02-040,Paul Wilson writes:
|>|> Reference counting has better locality than simple copying or mark-sweep,
|>|> but I don't think anybody's ever properly compared it to a generational
|>|> GC.
|>The paper below suggests that it has been done. I was rather surprised
|>that nobody seems to reference that paper. Are the results not applicable
|>to other systems?

My statement was that nobody has really compared reference counting to
either kind of tracing collector (mark-sweep or copying). I wasn't really
saying anything about the mark-sweep vs. copying issue.

(By the way, John DeTreville of DEC SRC has in fact compared reference
counting with simple non-generational gc's and shown that reference
counting has better locality; I don't think it has an edge on a good
generational gc, but can't say for sure, and I'm pretty sure it's a lose
in terms of overall efficiency for reasons that people have gone over lots
of times.)

At the end of this note are more bibtex references on these topics. And
at the end of that there's a funny story about one of them.

Ulrich again:
|>author = "Benjamin Zorn",
|>title = "Comparing Mark-and-sweep and Stop-and-copy Garbage Collection",
|>booktitle = "Proceedings of the 1990 ACM Conference on Lisp and functional
|>abstract = "Stop-and-copy garbage collection has been preferred to
|>mark-and-sweep collection in the last decade because its collection time
|>is proportional to the size of reachable data and not the memory size.
|>This paper compares the CPU overhead and the memory requirements of the
|>two collection algorithms extended with generations, and finds that
|>mark-and-sweep collection requires at most a small amount of additional
|>CPU overhead (3-6%) but requires an average of 20% (and up to 40%) less
|>memory to achieve the same page fault rate. The comparison is based on
|>results obtained using trace-driven simulation with large Common Lisp
|>month = june,
|>year = 1990 }

You have to be really careful when comparing garbage collectors, because
the space of possible garbage collector designs is enormous. My SIGPLAN
'91 paper explains how you can avoid screwing up your locality (and maybe
even improve it) with a good copy collector that groups things cleverly.
I didn't compare it to mark-sweep, but it clearly outperforms the usual
breadth-first copying algorithms, so the apparent superiority of
mark-sweep at the VM level is likely due to problems with the particular
copying traversal used.

My forthcoming LFP '92 paper deals with cache-level locality. I show that
if you avoid using a pair of semispaces in the youngest generation, and
instead use either a single space (like the Lisp Machine GC's or three
spaces (like David Ungar's GC and mine), you reduce cache memory
requirements by an amount similar to the one Zorn got for mark-sweep.

The basic problem of cache-level locality is the same for a generational
garbage collector of either variety---you use up some amount of memory
between garbage collections, and don't reuse any of it until the next
allocation-and-collection cycle. This deferred reclamation is what makes
gc's efficient---you wait until most stuff has died, to reduce the costs
incurred in tracing the live objects. Unfortunately, that also means you
can't reclaim and REUSE the memory as soon---there's a tension between
basic efficiency and good locality.

The problem with a simple semispace configuration (copying back and forth
between two spaces in each generation) is that you only reuse a given
piece of memory every TWO allocation and reclamation cycles.

At the level of cache memories, the reuse of the youngest generation is
the biggest determiner of locality. So using semispaces nearly doubles
your cache requirements right off the bat. To avoid this, I use David
Ungar's trick of allocating everything in a special space in the youngest
generation, which is completely emptied and reused at every cycle. That
means that the main memory reuse cycle is half as big as with a simple
pair of semispaces.

|>|>Another interesting citation from the paper is:
|>Many recent papers on copying gc algorithms have mentioned mark-and-sweep
|>collection only in passing, noting that because the cost is proportional to
|>the size of memory, mark-and-sweep collection is less efficient that copying
|>collection [....]. Appel, Ellis, and Li note that the cost of mark-and-sweep
|>collection is probably somewhat higher than the cost of copying collection,
|>but concede that other costs (allocation, barriers, vm overhead) effect
|>performance enough that copying collectos may not necessarily be the most
|>I note that the cost of sweeping is just an extension of the cost of
|>allocation, and quantify that cost to be up to 5% in allocation intensive

I think Zorn makes a very important point that mark-sweep gc's needn't be
much less efficient than copying gc's. As Appel has pointed out, copying
collectors don't have to expend any CPU cycles dealing with
garbage---garbage is reclaimed implicitly by moving the LIVE data
somewhere else. So as memory gets bigger, it gets more efficient... in

Even though copying is asymptotically more efficient, the actual constants
involved may dominate. The cost of maintaining the "free list" of
available space can actually be quite small for a non-copying gc. (Keep
in mind that if you use a bitmap instead of an actual free list, you can
put 32 words "on the list" by storing a single zero into a 32-bit word of
the bitmap.)

So even though copying collection is "more efficient," it's not more
efficient by very much, if you pay careful attention to the implementation
details. Not only that, but there's an important class of objects for
which mark-sweep collection is clearly more efficient---BIG objects,
especially big nonpointer objects like Smalltalk bitmaps or homogenous
numeric arrays. You don't want to move these big objects around, because
the copying cost may swamp the cost of tracing through them. That's why
state-of-the-art copy collectors are really *hybrids* between copying and
mark-sweep. Big objects are kept in a specially managed area and not
moved around. (This separate "large object area" was introduced by
Caudill and Wirfs-Brock for Tektronix Smalltalk, and also used by Ungar
and Jackson in ParcPlace Smalltalk and by Ungar, Hoelzle, and Chambers in
Self; eventually I'll get around to putting one in my gc. This should
improve copying collection's locality some as well.)

It's not very widely recognized that current state-of-the art copy
collectors ARE copying/mark-sweep hybrids, of a different sort than Zorn's.

I really don't think there's any reason to prefer mark-sweep collectors on
the basis of either efficiency or locality; on the other hand, there's no
clear reason to reject them on those grounds either. My own belief is
that copy collectors have an edge on mark-sweep in both areas, but not by
nearly as much as is widely believed.

(One of the reasons that mark-sweep collectors don't have the horrible
locality is that they don't cause as much fragmentation as you might
expect. If you allocate objects linearly, or nearly- linearly within
pages, you often find that most or all of the objects die at about the
same time, freeing up big contiguous hunks of memory rather than little
smatterings here and there. Barry Hayes' paper from the last OOPSLA---and
the thesis he's writing---deal with these kinds of patterns in objects
births and deaths, and nifty ways to exploit them.)

The real advantage of mark-sweep techniques is the ability to deal with
gc-ignorant compilers (like your average C compiler) that don't spit out
any information about how to find the pointers in the stack so you can
update them if you move objects around. That's why the PARC folks'
conservative gc's are mark-sweep, and the DEC folks (Joel Bartlett, David
Detlefs, John Detreville) have developed "mostly copying" gc's, which
still treat the stack conservatively. (This entails "pinning" some
objects and not moving them because they MAY be pointed to from the stack,
but you can't tell for sure.)

------ MORE CITATIONS --------------------

author = "Wilson, Paul R. and Barry Hayes",
title = "Report on the 1991 Workshop on Garbage Collection in Object-Oriented
note = "Phoenix, Arizona. (Also distributed as a special issue of ACM
SIGPLAN Notices, to appear.)",
booktitle = "Addendum to the Proceedings of ACM OOPSLA '89" }

author = "Caudill, Patrick J. and Allen Wirfs-Brock",
title = "A Third-Generation {S}malltalk-80 Implementation",
booktitle = OOPSLA86,
editor = "Norman Meyrowitz",
pages = "119--130",
month = sep,
year = 1986,
note = "Also published as {\em ACM SIGPLAN Notices 21}(11):119-130, November,

author = "Hans-Juergen Boehm",
title = "Simple GC-safe compilation",
booktitle = "OOPSLA '91 Workshop on Garbage Collection in Object-Oriented
month = August,
year = 1991,
note = "Available via anonymous internet ftp ("

author = "Hans-Juergen Boehm",
title = "Panelist's position statement on conservative vs. accurate GC",
booktitle = "OOPSLA '91 Workshop on Garbage Collection in Object-Oriented
month = August,
year = 1991,
note = "To appear in addendum to the proceedings of OOPSLA '91" }

author = "Bartlett, Joel F.",
title = "Compacting Garbage Collection With Ambiguous Roots",
type = "Technical Report",
number = "88/2",
institution = DECWRL,
address = "Palo Alto, California",
month = "February",
COMMENT="Excellent trick here---make newness a page property, not an
    address-range property",
year = 1988 }

author = Bartlett, Joel F.",
title = "Panelist's position statement on conservative vs. accurate GC",
booktitle = "OOPSLA '91 Workshop on Garbage Collection in Object-Oriented
month = August,
year = 1991,
note = "To appear in addendum to the proceedings of OOPSLA '91" }

author = "Demers, Alan and Mark Weiser and Barry Hayes and Daniel Bobrow
and Scott Shenker",
title = "Combining Generational and Conservative Garbage Collection:
Framework and Implementations",
booktitle = POPL90,
month = "January",
year = 1990,
note = "Las Vegas, Nevada",
pages = "261--269" }

author = "David L. Detlefs",
title = "Concurrent Garbage Collection for C++",
institution = CMU,
address = "Pittsburgh, PA",
number = "CMU-CS-90-119",
month = May, year = 1990,

author = "John DeTreville",
title = "Experience with Concurrent Garbage Collectors for Modula-2+",
institution = DECSRC,
address = "Palo Alto, California",
number = 64,
month = Aug, year = 1990

author = "John DeTreville",
title = "Heap Usage in the {Topaz} Environment",
institution = DECSRC,
address = "Palo Alto, California",
number = 63,
month = Aug, year = 1990

AUTHOR="Daniel Ross Edelson",
TITLE="Dynamic Storage Reclamation in C++",

author = "Barry Hayes",
title = "Using Key Object Opportunism to Collect Old Objects",
booktitle = OOPSLA91,
month = Oct,
year = 1991,
pages = {33-46},
note = "Phoenix, Arizona",
publisher = "ACM Press",

author = "Ungar, David and Frank Jackson",
title = "Tenuring Policies For Generation-Based Storage Reclamation",
booktitle = OOPSLA88,
month = sep,
publisher = {ACM},
year = 1988,
pages = "1--17",
note = "Also published as {\em ACM SIGPLAN Notices 23}(11):1--17, November,

author = "Moon, David",
title = "Garbage Collection in a Large {L}isp System",
booktitle = "Conference Record of the 1984 {ACM} Symposium on {L}isp and
Functional Programming",
year = 1984,
month = aug,
pages = "235--246",
note = {Austin, Texas} }

author = "R. G. Larson",
title = "Minimizing Garbage Collection as a Function of Region Size",
journal = "SIAM Journal on Computing",
volume = 6,
number = 4,
month = dec,
year = 1977,
pages = "663--667" }


Richard Larson's 1977 paper dealt with increasing the efficiency of a copy
collector by simply increasing the semispace ("region") sizes, and
therefore decreasing the frequency with which you must copy all of the
live objects from one region of memory to another. This was 10 years
before Appel's much-better-known paper; not only that, but Larson's paper
gives a mathematical treatment of the tradeoff between basic efficiency
and locality issues. (He overlooked the pathological behavior of LRU
replacement combined with a simple gc's cyclic memory reuse, though.)

The funny thing is that he published the paper, titled "Minimizing Garbage
Collection as a Function of Region Size," and it went almost completely
unnoticed (I'm apparently the only person who's ever cited it)...

...and the sole request for a reprint he got was from some guy in Urban

(The only reason *I* know this is that Larson is a mathematician at the U.
of Illinois at Chicago, where I did my Ph.D.)

      -- Paul

Paul R. Wilson, runtime environmentalist email:
U. of Texas Computer Sciences Dept. voice: (512) 471-9555
Taylor Hall 2.124, Austin, TX 78712-1188 fax: (512) 471-8885

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.