Re: Garbage Collection (Jonathan Eifrig)
Wed, 12 Aug 1992 15:04:23 GMT

          From comp.compilers

Related articles
[16 earlier articles]
Re: Garbage collection (Nick Roberts) (2004-09-03)
Re: Garbage collection (Sebastian) (2004-09-07)
Re: Garbage collection (2004-09-13)
Garbage Collection and Interactive Languages (1992-08-04)
Re: Garbage Collection (1992-08-09)
Re: Garbage Collection (1992-08-11)
Re: Garbage Collection (1992-08-12)
Re: Garbage Collection David.Chase@Eng.Sun.COM (1992-08-13)
Re: Garbage Collection (1992-08-14)
GC for C (1992-08-16)
Garbage collection (1992-11-24)
| List of all articles for this month |

Newsgroups: comp.compilers
From: (Jonathan Eifrig)
Organization: The Johns Hopkins University CS Department
Date: Wed, 12 Aug 1992 15:04:23 GMT
References: 92-08-015 92-08-045
Keywords: storage, GC (Hans Boehm) writes:
>>(3) A general-purpose garbage collector will probably be used...
>Agreed. In fact, I like to carry this a lot further and argue that at
>least in the short term, the garbage collector should support a variety of
>programming languages, including C, whenever possible.

Here's where our approaches begin to diverge. I don't think that
a single garbage collector, perhaps as part of the kernel, is appropriate
for all source languages. For example, you claim that you want a
collector to work with your (presumably stack-oriented) C programs. Fine.
But what about _my_ non-stack-oriented programs? Since I don't use a
stack, I may be using the "stack pointer" register as a general-purpose
way. How is the collector going to know _not_ to try to collect assuming
this register is a pointer to a stack of activation records?

In short, I don't think that it's desirable to shoehorn all
possible modes of operation into a single garbage-collection model.
Different languages have different needs, and will use the heap

>> Basically, I don't see any reason to jump through hoops to support
>>the "garbage collection at any time" model, when the "garbage collection
>>when I say so" model is (a) inexpensive to implement and (b) permits so
>>much more lattitude in optimizations. The cost of the garbage collection
>>check will be more than cancelled if I can use registers to optimize out
>>just _one_ memory reference, and I'll win big if I can pass a single
>>parameter in a register. What do I have to lose?
>I think we're making very different assumptions about both the garbage
>collector, and the environment in which it will be used. For many kinds
>of problems, multiple threads of control in the same address space make
>life much easier. For example, any application that can accept more than
>one asynchronous input stream (e.g. a terminal emulator) is easier to
>write with threads. It's hard to take advantage of shared memory
>multiprocessors without multiple threads in the same address space. All
>recently designed operating systems provide such a facility (including
>UNIX derivatives like IRIX, Solaris 2.0, etc.) There is an emerging POSIX
>standard for threads.

Sure, threads are important. Sure, multiple threads in a single
process should share the same heap space. All I'm saying is that there is
no need to use some external "alarm" interrupt to slavishly switch
contexts immediately upon receipt of such a signal. We can, as I
explained earlier, switch contexts _only_ at the beginning of basic
blocks. What's a few more instructions between context switches?

>Our main difference is that I usually assume a garbage collector that
>scans the stack and registers conservatively. This means that objects
>reachable only from the stack or registers must have an address inside the
>object located somewhere on the stack or in the registers. That's all
>that's required. This is a very cheap invariant to maintain at all
>program points.

Correct me if I'm wrong, but you seem to be saying that you're
using the stack and the registers as roots of garbage collection. This
will, of course, work (modulo some details concerning computed addresses
of heap objects) _provided_that_one_can_recognize_the_format_of_stack_
_and_registers_. Sure all the pointers are there, but which data elements
of the stack are pointers, meaning that we should follow them into the
heap to find things to collect, and which are just integers, which we
should ignore? If we decide, in an attempt to be _really_conservative_,
to assume everything is a pointer, then we pay a double penalty: since
we're wasting time moving junk around, garbage collection takes longer,
and since we're not reclaiming storage that is actually free, we collect
more often.

It seems clear that this naive approach won't be satisfactory, so
what can we do? We could label our stack frames with some tag
information, indicating which elements of the frame are pointers and which
integers, _but_what_do_we_do_about_the_registers_? What if we garbage
collect while we're in the middle of writing one of these tags (do to a
context switch into a thread which triggers collection)? Our stack is in
an inconsistent state when the collector starts, and we're screwed.

>Explicit safe points in garbage collected code are probably a good idea
>for many systems. But they're inappropriate for other systems.

I just don't see how you're going to allow garbage collection to
occur _anywhere_ and handle any sort of tagged heap. Are you assuming
that procedure calls and heap allocations are atomic? What happens if I
have to garbage collect while I'm in the middle of allocating a record,
and haven't written the length of the record to the heap yet?
Jack Eifrig ( The Johns Hopkins University, C.S. Dept.

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.