Garbage Collection and Interactive Languages (Jonathan Eifrig)
Tue, 4 Aug 1992 18:39:09 GMT

          From comp.compilers

Related articles
Garbage Collection and Interactive Languages (1992-08-04)
Re: Garbage Collection and Interactive Languages (1992-08-05)
Re: Garbage Collection (1992-08-09)
Shared memory design, was garbage collection (1992-08-11)
Re: Garbage Collection (1992-08-11)
Re: Shared memory design, was garbage collection (1992-08-12)
Re: Garbage Collection (1992-08-12)
[3 later articles]
| List of all articles for this month |

Newsgroups: comp.compilers
From: (Jonathan Eifrig)
Organization: The Johns Hopkins University CS Department
Date: Tue, 4 Aug 1992 18:39:09 GMT
Keywords: translator, design

In article 92-08-004 the Moderator writes:
>[Somebody suggested that it might be possible to define some extensions to
>C that would make it easier to garbage collect. Any thoughts? -John]

I think there is some confusion resulting from differing models of
garbage collection being implicitly assumed.

One model views garbage collection similar to an interrupt,
initiated "out of the blue", and uncontrollable by user code. In this
model, generating correct code is very difficult, since one can't put
pointers to heap objects in registers, since garbage collection might
occur between the address-loading instruction and the address use. Under
this model, one usually has to divide the registers into two groups, one
holding only pointers to heap objects and one holding only integers; this
way, the collector can examine and modify the machine registers knowing
how to interpret their contents. This is an appropriate model to use
where the collector is, say, part of the operating system and is running
in parallel with the mutator. (The Symbolics 3600, for example.)

A second model views the heap as under the management of the user
code, with the collector as part of the application. Here, we can
generate code that collects _when_we_want_to_, making life much simpler.
We no longer have to worry about garbage collection occuring between
address loads and uses, because we don't do garbage collection then!
("Doctor, it hurts when I do this." "Well, don't _do_ that!") Under this
model, a good strategy is to do garbage collection only at the beginning
of each basic block of code. Since each block is straight-line, it is
straightforward to add up the amount of heap space it will use (at compile
time). Then we generate code that, upon block entry, checks to see if
there is enough heap space left to run this block without running out of
heap space. If not, we call the garbage collector _now_, rather than

The advantage of this scheme is that we can allocate registers
between addresses and integers on a per-block basis. When we call the
collector, we pass it a bit-mask indicating which registers currently hold
addresses. The collector then uses these as roots of collection, and
updates these registers with the new addresses before returning.

This is the basic scheme used by the Standard ML/NJ compiler on
UNIX systems. In practice, the check-per-block scheme is used for more
than just garbage collection; asynchronous exceptions are handled by
having the (UNIX) interrupt handler just save the interrupt and fake a
heap overflow condition, and then return. At the start of the next block,
the user code thinks its out of heap space and calls the garbage
collector, which then calls the (ML) exception handler. In this way,
interrupts get handled in a known state, and the ML exception handler for
this interrupt, whose execution might involve garbage collection, won't
screw up the user code. Thread-switching in languages supporting
lightweight processes could be handled in this way too.

See Appel's book, "Compiling with Continuations," for more info on
this scheme.

Jack Eifrig ( The Johns Hopkins University, C.S. Dept.

Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.