Re: Garbage Collection

boehm@parc.xerox.com (Hans Boehm)
Tue, 11 Aug 1992 18:37:54 GMT

          From comp.compilers

Related articles
[15 earlier articles]
Re: Garbage collection sk@z.pl (Sebastian) (2004-08-23)
Re: Garbage collection nick.roberts@acm.org (Nick Roberts) (2004-09-03)
Re: Garbage collection sk@bez.spamu.z.pl (Sebastian) (2004-09-07)
Re: Garbage collection usenet@leapheap.co.uk (2004-09-13)
Garbage Collection and Interactive Languages eifrig@beanworld.cs.jhu.edu (1992-08-04)
Re: Garbage Collection eifrig@blaze.cs.jhu.edu (1992-08-09)
Re: Garbage Collection boehm@parc.xerox.com (1992-08-11)
Re: Garbage Collection eifrig@beanworld.cs.jhu.edu (1992-08-12)
Re: Garbage collectable C meissner@osf.org (1992-08-12)
Re: Garbage Collection David.Chase@Eng.Sun.COM (1992-08-13)
Re: Garbage Collection boehm@parc.xerox.com (1992-08-14)
GC for C maxtal@extro.ucc.su.OZ.AU (1992-08-16)
Garbage collection Olin.Shivers@cs.cmu.edu (1992-11-24)
| List of all articles for this month |

Newsgroups: comp.compilers
From: boehm@parc.xerox.com (Hans Boehm)
Organization: Xerox PARC
Date: Tue, 11 Aug 1992 18:37:54 GMT
References: 92-08-015 92-08-037
Keywords: storage, translator, design

eifrig@blaze.cs.jhu.edu (Jonathan Eifrig) writes:


>...
>I have spent some time thinking about practical implementations of
>functional (i.e., garbage collecting) programs and their operating system
>support, and have come to the following conclusions:


I'm not sure what "functional" means here or why it matters.


>(1)...off-the-shelf, general-purpose hardware...


Agreed.


>(2)...The UNIX process model is a good one, and should be preserved...


Sometimes. Sometimes not. See below.


>(3) A general-purpose garbage collector will probably be used...


Agreed. In fact, I like to carry this a lot further and argue that at
least in the short term, the garbage collector should support a variety of
programming languages, including C, whenever possible.


>(4) A stack-based model of activation records may not necessarily be
>appropriate.


I think the jury is still out. On the other hand, I completely agree that
register usage is crucial, so I think this is largely irrelevant to the
discussion.


>(5) Asynchronous process interrupts [can be delayed] ...


I agree this can be handled, assuming (2).


>... Since basic blocks can only (a) do a finite abount
>of cons-ing and (b) run for a finite about of time, it makes sense to
>check for interrupts and heap overflow at the same time, at the start of
>each basic block. Since the check can (in most systems) be done in a
>single instruction, and the average basic block is about 20 instructions
>long, we're only incurring a 5% penalty. For this cost, we get the
>ability to use registers however we wish.


> Basically, I don't see any reason to jump through hoops to support
>the "garbage collection at any time" model, when the "garbage collection
>when I say so" model is (a) inexpensive to implement and (b) permits so
>much more lattitude in optimizations. The cost of the garbage collection
>check will be more than cancelled if I can use registers to optimize out
>just _one_ memory reference, and I'll win big if I can pass a single
>parameter in a register. What do I have to lose?


I think we're making very different assumptions about both the garbage
collector, and the environment in which it will be used. For many kinds
of problems, multiple threads of control in the same address space make
life much easier. For example, any application that can accept more than
one asynchronous input stream (e.g. a terminal emulator) is easier to
write with threads. It's hard to take advantage of shared memory
multiprocessors without multiple threads in the same address space. All
recently designed operating systems provide such a facility (including
UNIX derivatives like IRIX, Solaris 2.0, etc.) There is an emerging POSIX
standard for threads.


Our main difference is that I usually assume a garbage collector that
scans the stack and registers conservatively. This means that objects
reachable only from the stack or registers must have an address inside the
object located somewhere on the stack or in the registers. That's all
that's required. This is a very cheap invariant to maintain at all
program points. Empirically, optimizers designed without knowledge of the
requirement preserve it in at least 99,999 out of 100,000 basic blocks
without interfering with optimization AT ALL. Since any real attempt to
guarantee that the requirement will be satisfied will have to be
conservative, it will probably lose more performance than that. But I
can't believe that it would cost anywhere near 5%. (And if I use C as
intermediate code, the cost of GC safe points can be greater since its
hard to keep globals in registers.)


Another difference in perspective is that I'm more interested in "systems
programming" languages like Modula-3. I doubt that the average basic
block size is anywhere near 20 instructions there. (I vaguely remember
numbers like 4 instructions for C. The moderator can probably correct
me.) Since there is typically much less than one allocation per basic
block, I admittedly also want to use a different strategy for placing my
GC safe points.


Finally, I would like my nice new garbage collected code to interoperate
with a lot of old grubby C code, which may expect pointer arguments, but
doesn't obey your conventions.


Explicit safe points in garbage collected code are probably a good idea
for many systems. But they're inappropriate for other systems.


Hans-J. Boehm
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.