garbage collection

Lex Spoon <lex@cc.gatech.edu>
13 Jul 2003 23:52:53 -0400

          From comp.compilers

Related articles
[2 earlier articles]
Re: Compiler writers will love this language ericmuttta@email.com (2003-06-05)
Re: Compiler writers will love this language mwotton@cse.unsw.edu.au (2003-06-08)
Re: Compiler writers will love this language ericmuttta@email.com (2003-06-20)
Re: Compiler writers will love this language mwotton@cse.unsw.edu.au (2003-06-22)
Re: Compiler writers will love this language ericmuttta@email.com (2003-07-02)
Re: Compiler writers will love this language mwotton@cse.unsw.edu.au (2003-07-03)
garbage collection lex@cc.gatech.edu (Lex Spoon) (2003-07-13)
Garbage collection wmccabe@hotmail.com (2004-07-28)
Re: Garbage collection nmm1@cus.cam.ac.uk (2004-08-04)
Re: Garbage collection sk@z.pl (Sebastian) (2004-08-04)
Re: Garbage collection tmk@netvision.net.il (2004-08-05)
Re: Garbage collection basile-news@starynkevitch.net (Basile Starynkevitch \[news\]) (2004-08-05)
Re: Garbage collection nick.roberts@acm.org (Nick Roberts) (2004-08-09)
[18 later articles]
| List of all articles for this month |

From: Lex Spoon <lex@cc.gatech.edu>
Newsgroups: comp.compilers
Date: 13 Jul 2003 23:52:53 -0400
Organization: Georgia Institute of Technology
References: 03-05-211 03-06-015 03-06-054 03-06-057 03-06-078 03-06-106 03-07-023 03-07-037
Keywords: GC
Posted-Date: 13 Jul 2003 23:52:53 EDT

> My understanding is that generational collection is essentially an
> optimisation on mark and sweep, which has no problems with circular
> references.


The topic is fascinating but let's try to get our terminology straight.




"garbage collection" is simply "automatic memory management". It's a
general term. To contrast, malloc/free are "manual memory
management".


"reference counting" is garbage collection where you keep track of how
many references there are to an object, and then you free objects when
there are no more references. This approach has problems with cycles.


"mark and sweep" is where you do a graph traversal starting at certain
"root" objects (including things like the current register set), and
then follow object references. Free everything that you did not
reach. This approach has problems with fragmentation.


Additionally, "stop and copy" has not been mentioned yet, perhaps
because it would seem really strange to people in this group. In stop
and copy, the basic approach is to have two separate banks of memory.
You are always working within one of those banks. When you do a
garbage collection, you copy all the live data from one bank to the
other, updating pointers as you go, and compacting all the objects so
that they are next to each other. You still do the "sweep" part of
mark and sweep, but not the marking. This approach has a lot of
memory overhead, but it avoids memory fragmentation.


Finally, "generational" collection means that you have multiple
sections of memory which are collected one at a time. Objects start
out in the newest generation, and gradually move back to older
generations if they survive long enough. Most objects do not, and
most objects which do, live for a very long time -- thus you can
usually just collect within the newest "eden" generation and thus save
a lot of time. The main trick is that you have to keep track of
pointers from older generations into newer generations. This
typically requires a "write barrier".




There are massive flame wars about which kind of garbage collector is
the best. Further, "the best" depends on the particular application,
a lot of times. Finally, garbage collectors are extremely amenable to
tweaking, and people do so....




Phew. Here are two links with more info:




    http://www.iecc.com/gclist/GC-faq.html
    http://dmoz.org/Computers/Programming/Memory_Management/




Also, there is a garbage collection mailing list, for anyone who
really wants to get into it. It has quite low traffic.


Lex


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.