Re: ...more GC Stuff (hardware support) (Kelvin Nilsen)
13 Feb 1996 00:16:03 -0500

          From comp.compilers


From: (Kelvin Nilsen)
Newsgroups: comp.compilers
Date: 13 Feb 1996 00:16:03 -0500
Organization: Iowa State University, Ames, Iowa
References: 96-01-037 96-02-031 96-02-090
Keywords: GC, architecture

Larry Rau <larryr@CyberGate.COM> writes:

>All this recent talk of GC and Java Chips, etc has me curious about
>some things which I hope the GC "experts" can offer their opinions.

>If you have direct hardware support for GC what benefits can you see?
>Better performance? More reliability? both?

I have spent several years working in this area, so I consider myself
an "expert." On the other hand, like most experts, I am subject to a
certain amount of tunnel vision... If any of you would like to seek
further information regarding our own work in this area, feel free to
peruse my web site, listed in the signature below.

My work has focused on providing portable hardware support for
hard-real-time garbage collection. Our approach has been to place
special circuitry in the expansion memory module rather than in the
CPU. For best performance, there are a number of architectural
enhancements that could be integrated into the CPU design, but these
are not necessary. Some examples (I'm really assuming that these CPU
enhancements are used in combination with our special memory module):

  1. The first-level cache should support sub-block allocate-on-write.

  2. The first-level cache should provide efficient support for subrange
        flushing (both with and without accompanying memory updates).

  3. If the CPU itself supported tagging of pointers (e.g. there is a 33rd
        bit in every register that distinguishes pointers from non-pointers
        and this tag is represented by the caches and communicated to the memory
        system), then the code generated for a garbage-collected environment would
        be every bit as efficient as the code generated for non-garbage-collected
        environments. As it is, there are several concessions made in order
        to track pointers: we partition registers between those holding pointers
        and those not holding pointers, we partition the stack activation frame
        similarly, and we use I/O operations to identify pointer locations within
        each newly allocated object.
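
To make point 3 concrete, here is a toy model of the two ways a collector can find root pointers in the register file. It is only a sketch under stated assumptions: the `Word` class, the four-register machine, and the convention that registers 0-1 hold pointers are all hypothetical and are not taken from any real CPU or from our memory module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A 32-bit value plus a hypothetical 33rd bit marking it as a pointer."""
    bits: int
    is_pointer: bool = False


def find_roots_tagged(registers):
    """With a tag bit, any register may hold a pointer; the tag identifies it."""
    return [w.bits for w in registers if w.is_pointer]


def find_roots_partitioned(registers, pointer_regs):
    """Without tags, only a fixed, agreed-upon subset of registers holds pointers."""
    return [registers[i] for i in pointer_regs]


# Tagged machine: pointers and data can be mixed freely across registers.
regs_tagged = [Word(7), Word(0x1000, True), Word(42), Word(0x2000, True)]
tagged_roots = find_roots_tagged(regs_tagged)

# Untagged machine: by convention (assumed here) registers 0-1 hold pointers
# and registers 2-3 hold non-pointer data.
regs_plain = [0x1000, 0x2000, 7, 42]
partitioned_roots = find_roots_partitioned(regs_plain, pointer_regs=range(2))
```

Both calls find the same two roots. The difference is that the partitioned scheme constrains the code generator's register allocation, while the tagged scheme leaves it free to place values wherever it likes.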

Given the market failure of Lisp machines and the recent bias of the
research community, we considered the above enhancements (especially
point 3) to be out of our grasp. Now that Sun has announced silicon
support for Java, these capabilities may be closer than we had
imagined. Nevertheless, our emphasis has been on the memory
subsystem. Here is my summary of its benefits:

  1. Higher performance coordination between garbage collection and application
        processing. Without hardware support, extra instructions must be
        associated with each memory write or read operation, or the memory
        management unit must be configured to trap accesses to particular pages.
        With hardware support, the coordination is performed by snooping on the
        memory bus and inserting pipeline stalls whenever fixup is necessary
        (simulations suggest that the need for fixup is very rare: less than
        1% of memory accesses that miss the caches).

  2. Lower latency response to "exceptional" circumstances. In defragmenting
        garbage collectors, it is occasionally necessary to suspend application
        processing while particular critical garbage collection operations are
        being performed. In the hardware-supported system, the maximum time of
        suspension is less than 2 microseconds. Software implementations of
        similar "real-time" algorithms advertise suspension times ranging between
        500 microseconds and 100 milliseconds.

  3. Lower latency handling of garbage collection startup. The time required
        to register the CPU's root pointers is approximately 2 microseconds per
        root. Additional time may be required to make memory coherent with the
        current contents of the CPU's caches (50 - 500 microseconds, depending on
        cache organization and design).

  4. Improved memory utilization efficiency. In a limited-memory embedded
        real-time system, it is very important to quickly reclaim dead memory.
        The amount of memory required to support a particular application depends
        directly on how quickly garbage can be recycled. Since the garbage-collected
        memory module is capable of parallelizing the garbage collection
        effort, it makes possible reliable execution of certain important real-time
        workloads in less than half the memory that would be required without the
        use of the special hardware accelerator.

  5. Improved reliability. In mixed-world systems that attempt to provide
        both support for high-level development in a garbage-collected environment
        and low-level development in a C-like language, it is possible for
        programming errors to violate the garbage collector's data structures.
        Our hardware support write-protects these data structures.
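
For readers unfamiliar with the "extra instructions" mentioned in benefit 1, the following is a minimal sketch of a software barrier of the kind used by incremental copying collectors. It is a generic illustration, not a description of our hardware or of any particular software collector: the `Heap` class, its dictionaries standing in for the two memory spaces, and the addresses are all hypothetical. The check runs before every object access, even though the fixup path is taken only rarely.

```python
class Heap:
    """Toy two-space heap; dictionaries stand in for memory regions."""

    def __init__(self):
        self.from_space = {}        # address -> list of fields (not yet copied)
        self.to_space = {}          # address -> list of fields (already copied)
        self.forwarding = {}        # from-space address -> to-space address
        self.next_to_addr = 0x9000  # next free to-space address (assumed)
        self.barrier_checks = 0     # how often the barrier code executed
        self.fixups = 0             # how often it actually had work to do

    def read_barrier(self, addr):
        """Return an address that is guaranteed to refer to to-space."""
        self.barrier_checks += 1            # cost paid on every access
        if addr not in self.from_space:
            return addr                     # common case: nothing to fix up
        if addr not in self.forwarding:     # rare case: copy the object now
            new_addr = self.next_to_addr
            self.next_to_addr += 1
            self.to_space[new_addr] = list(self.from_space[addr])
            self.forwarding[addr] = new_addr
            self.fixups += 1
        return self.forwarding[addr]

    def load_field(self, obj_addr, index):
        """Every object access goes through the barrier first."""
        obj_addr = self.read_barrier(obj_addr)
        return self.to_space[obj_addr][index]


heap = Heap()
heap.from_space[0x100] = [1, 2, 3]   # object the collector has not yet copied
heap.to_space[0x8000] = [10, 20]     # object already in to-space

first = heap.load_field(0x100, 1)    # barrier copies the object, then reads
second = heap.load_field(0x100, 2)   # already forwarded, no second copy
third = heap.load_field(0x8000, 0)   # to-space object, barrier is a no-op
```

In this run the barrier executes three times but performs only one fixup. With hardware support the check is done by the memory module snooping the bus, so the application pays only when a fixup is actually needed.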

As a disclaimer, I'll point out that it is still fashionable in the
computer science community to bandy the claim that hardware support is
not required for efficient garbage collection. Paul Wilson seems to
be one of the main proponents of this view. While I would agree that
there are many efficient garbage collection techniques that do not
require hardware support, it is my opinion that this statement alone
oversimplifies the current state of the art. Building cost-efficient
high-performance reliable garbage collection for tomorrow's consumer
electronics devices is a difficult challenge, and there are many
interacting issues that must be addressed.

Kelvin Nilsen, Research Scientist voice: (515) 294-5143
151 ASC II, CATD fax: (515) 294-9519
Iowa State University internet:
Ames, IA 50011
