|Possible to write compiler to Java VM? (I volunteer to summarize) email@example.com (Peter Seibel) (1996-01-17)|
|Re: Ada GC and a bunch of other stuff firstname.lastname@example.org (1996-02-03)|
|Re: ...more GC Stuff larryr@CyberGate.COM (Larry Rau) (1996-02-09)|
|Re: ...more GC Stuff (hardware support) email@example.com (1996-02-13)|
|From:||firstname.lastname@example.org (Kelvin Nilsen)|
|Date:||13 Feb 1996 00:16:03 -0500|
|Organization:||Iowa State University, Ames, Iowa|
|References:||96-01-037 96-02-031 96-02-090|
Larry Rau <larryr@CyberGate.COM> writes:
>All this recent talk of GC and Java Chips, etc has me curious about
>some things which I hope the GC "experts" can offer their opinions.
>If you have direct hardware support for GC what benefits can you see?
>Better performance? More reliability? Both?
I have spent several years working in this area, so I consider myself
an "expert." On the other hand, like most experts, I am subject to a
certain amount of tunnel vision... If any of you would like to seek
further information regarding our own work in this area, feel free to
peruse my web site, listed in the signature below.
My work has focused on providing portable hardware support for
hard-real-time garbage collection. Our approach has been to place
special circuitry in the expansion memory module rather than in the
CPU. For best performance, there are a number of architectural
enhancements that could be integrated into the CPU design, but these
are not necessary. Some examples (I'm really assuming that these CPU
enhancements are used in combination with our special memory module):
1. The first-level cache should support sub-block allocate-on-write.
2. The first-level cache should provide efficient support for subrange
flushing (both with and without accompanying memory updates).
3. If the CPU itself supported tagging of pointers (e.g. there is a 33rd
bit in every register that distinguishes pointers from non-pointers
and this tag is represented by the caches and communicated to the memory
system), then the code generated for a garbage-collected environment would
be every bit as efficient as the code generated for non-garbage-collected
environments. As it is, several concessions are made in order to track
pointers: we partition registers between those holding pointers and those
not holding pointers, we partition the stack activation frame similarly,
and we use I/O operations to identify pointer locations within each newly
allocated object.
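A minimal Python sketch of the contrast drawn in point 3. It is only an illustration, not code from the system described here: the names Word, find_roots_tagged, and find_roots_partitioned are hypothetical, and a real implementation would live in the register file and compiler conventions rather than in software. With a tag bit the collector can ask each word whether it is a pointer; without one it has to rely on a fixed convention about where pointers are kept.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    """A machine word plus a hypothetical 33rd bit marking it as a pointer."""
    bits: int
    is_pointer: bool


def find_roots_tagged(registers: List[Word]) -> List[int]:
    # With hardware tags, any register may hold a pointer or a non-pointer;
    # the collector simply inspects the tag on each word.
    return [w.bits for w in registers if w.is_pointer]


def find_roots_partitioned(registers: List[int], num_pointer_regs: int) -> List[int]:
    # Without tags, the compiler must follow a convention: here we assume
    # the first num_pointer_regs registers are reserved for pointers.
    return registers[:num_pointer_regs]


if __name__ == "__main__":
    tagged = [Word(0x1000, True), Word(42, False), Word(0x2000, True)]
    print(find_roots_tagged(tagged))
    print(find_roots_partitioned([0x1000, 0x2000, 42], num_pointer_regs=2))
```

The partitioned version finds the same roots only because the compiler agreed to place pointers in the reserved registers, which is the kind of concession that constrains code generation.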
Given the market failure of Lisp machines and the recent bias of the
research community, we considered the above enhancements (especially
point 3) to be out of our grasp. Now that Sun has announced silicon
support for Java, these capabilities may be closer than we had
imagined. Nevertheless, our emphasis has been on the memory
subsystem. Here is my summary of its benefits:
1. Higher-performance coordination between garbage collection and application
processing. Without hardware support, extra instructions must be
associated with each memory write or read operation, or the memory
management unit must be configured to trap accesses to particular pages.
With hardware support, the coordination is performed by snooping on the
memory bus and inserting pipeline stalls whenever fixup is necessary
(simulations suggest that the need for fixup is very rare: less than
1% of memory accesses that miss the caches).
2. Lower-latency response to "exceptional" circumstances. In defragmenting
garbage collectors, it is occasionally necessary to suspend application
processing while particular critical garbage collection operations are
being performed. In the hardware-supported system, the maximum time of
suspension is less than 2 microseconds. Software implementations of
similar "real-time" algorithms advertise suspension times ranging between
500 microseconds and 100 milliseconds.
3. Lower-latency handling of garbage collection startup. The time required
to register the CPU's root pointers is approximately 2 microseconds per
root. Additional time may be required to make memory coherent with the
current contents of the CPU's caches (50 - 500 microseconds, depending on
cache organization and design).
4. Improved memory utilization efficiency. In a limited-memory embedded
real-time system, it is very important to quickly reclaim dead memory.
The amount of memory required to support a particular application depends
directly on how quickly garbage can be recycled. Since the garbage-collected
memory module is capable of parallelizing the garbage collection
effort, it makes possible reliable execution of certain important real-time
workloads in less than half the memory that would be required without the
use of the special hardware accelerator.
5. Improved reliability. In mixed-world systems that attempt to provide
both support for high-level development in a garbage-collected environment
and low-level development in a C-like language, it is possible for
programming errors to violate the garbage collector's data structures.
Our hardware support write-protects these data structures.
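To make the "extra instructions" in benefit 1 concrete, here is a minimal Python sketch of a software read barrier for a relocating collector. It is a generic illustration under stated assumptions, not the algorithm used by the memory module described above: the names Obj, SoftwareBarrier, and relocate are hypothetical, and the forwarding-pointer scheme is one common way to implement such a barrier.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Obj:
    payload: str
    # Set by the collector when it has moved this object elsewhere.
    forward: Optional["Obj"] = None


class SoftwareBarrier:
    """Counts the work a software-only scheme adds to every pointer read."""

    def __init__(self) -> None:
        self.checks = 0   # barrier executions (one per read)
        self.fixups = 0   # reads that actually needed redirecting

    def read(self, ref: Obj) -> Obj:
        # The check runs on every read, even though fixup is rarely needed.
        self.checks += 1
        if ref.forward is not None:
            self.fixups += 1
            return ref.forward
        return ref


def relocate(old: Obj) -> Obj:
    # The collector copies the object and leaves a forwarding pointer behind.
    new = Obj(old.payload)
    old.forward = new
    return new


if __name__ == "__main__":
    barrier = SoftwareBarrier()
    a = Obj("a")
    barrier.read(a)       # no fixup needed, but the check still runs
    relocate(a)
    barrier.read(a)       # fixup: redirected to the new copy
    print(barrier.checks, barrier.fixups)
```

In this software model the check is paid on every read. The hardware approach described in benefit 1 moves the equivalent test onto the memory bus, so the application stalls only in the rare case where fixup is actually required.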
As a disclaimer, I'll point out that it is still fashionable in the
computer science community to bandy about the claim that hardware support is
not required for efficient garbage collection. Paul Wilson seems to
be one of the main proponents of this view (see
http://www.cs.utexas.edu/users/wilson). While I would agree that
there are many efficient garbage collection techniques that do not
require hardware support, it is my opinion that this statement alone
oversimplifies the current state of the art. Building cost-efficient,
high-performance, reliable garbage collection for tomorrow's consumer
electronics devices is a difficult challenge, and there are many
interacting issues that must be addressed.
Kelvin Nilsen, Research Scientist voice: (515) 294-5143
151 ASC II, CATD fax: (515) 294-9519
Iowa State University internet: email@example.com
Ames, IA 50011 http://kickapoo.catd.iastate.edu/index.html