Re: Ada GC

bobduff@world.std.com (Robert A Duff)
2 Feb 1996 11:03:50 -0500

          From comp.compilers

Related articles
Possible to write compiler to Java VM? (I volunteer to summarize) seibel@sirius.com (Peter Seibel) (1996-01-17)
Re: Ada GC hbaker@netcom.com (1996-01-29)
Re: Ada GC (was about Java VM) boehm@parc.xerox.com (1996-01-31)
Re: Ada GC (was about Java VM) dewar@cs.nyu.edu (1996-02-01)
Re: Ada GC bobduff@world.std.com (1996-02-02)
Re: Ada GC jmartin@cs.ucla.edu (1996-02-02)
Re: Ada GC hbaker@netcom.com (1996-02-03)
Re: Ada GC and a bunch of other stuff hbaker@netcom.com (1996-02-03)
Re: Ada GC boehm@parc.xerox.com (1996-02-04)
Re: Ada GC dewar@cs.nyu.edu (1996-02-04)
Re: Ada GC dewar@cs.nyu.edu (1996-02-04)
[21 later articles]
| List of all articles for this month |
From: bobduff@world.std.com (Robert A Duff)
Newsgroups: comp.compilers,comp.lang.ada
Date: 2 Feb 1996 11:03:50 -0500
Organization: The World Public Access UNIX, Brookline, MA
References: 96-01-037 96-01-146 96-02-003
Keywords: Ada, GC, realtime

A few points on the "Ada sucks for real-time because it doesn't have
garbage collection" issue:


Hans Boehm wonders how real-time programs avoid storage leaks without
using GC. First of all, Ada 95 allows the programmer to replace the
underlying implementation of "new" and "Unchecked_Deallocation", by
writing "for Some_Pointer_Type'Storage_Pool use My_Storage_Pool;"
Therefore, it is irrelevant that general-purpose storage allocation
algorithms have trouble with fragmentation. One typical technique
I've seen is to use a separate storage pool for each type of
heap-allocated object, and make sure that these are all known-size
types (i.e. not dynamic strings and such). It is then trivial to
write an allocator that has predictable behavior. Another technique
some have mentioned is to simply not use heap-allocated objects. That
makes sense for some real-time applications. Mark/release is another
technique used in practise. These techniques are used in many
languages, not just Ada, for real-time work.


Ada 83 had a more limited version of the Storage_Pool feature -- the
Storage_Size clause.


Note that some GC algorithms also have trouble with fragmentation.
I've even seen some such that claim to be "real-time", which I don't
understand. In any case, in a real-time program, one has to use
predictable storage management algorithms, whether they be GC or
"by-hand".


Henry Baker decries the fact that real-time GC has been around for 20
years, and yet Ada still doesn't require its use. Well, it may have
been around for 20 years, but the fact is that it is not in widespread
use among real-time folks (though there are exceptions). Perhaps
those folks are being foolish -- I don't know. There's a lot of
real-time work being done in C and C++, and those languages don't
require GC either.


Initially, Henry admonished the designers of both Ada 83 and Ada 95
for not requiring GC in the language standard. He later admitted that
de-facto standards are more important anyway. I think the latter
attitude is correct. When we say "language X has garbage collection",
we don't mean that there's something in the standard about it. (Or in
the many standards, in the case of some garbage-collected languages
like Lisp and Smalltalk!) We mean that one can reasonaby expect that
all implementations of that language will support GC. If you tried to
sell (or give away) a Lisp compiler without GC, people would laugh at
you. Therefore, it's fair to say that Lisp has GC.


You cannot say that of Ada or C or C++. However, that may change.
The Intermetrics Ada compiler supports GC, and somebody is working on
GC for GNAT. Perhaps someday people will be embarrassed to sell Ada
compilers without GC support. That would be a good thing, so long as
compilers continue to support the *option* of using by-hand
deallocation instead, for those cases where GC is the wrong solution.
It's best to let the people designing real-time systems make that
decision, even if some of those designers might make the wrong
decision sometimes.


Some have pointed out that is difficult to *require* GC in a language
standard, because it's difficult to *formally* specify what it means
to have GC. If you run out of storage, does that mean there's no GC,
or simply that the non-GC-able objects are taking up too much memory?
To specify it formally, one would have to have a much more detailed
model of storage usage than is typical. (Meyer apparently takes this
view. "Eiffel, The Language", says on page 335, "Authors of Eiffel
implementation are encouraged (although not required) to provide a
garbage collection mechanism..." And yet we can still say that Eiffel
is a garbage-collected language, because it's reasonable to expect all
Eiffel implementations to have GC.)


Note that in *some* cases, the job of by-hand storage management is
eased due to Ada 95's support for user-defined assignment and
finalization. For example, one can automate a reference-counting
scheme. The same is true in C++. (Yes, I realize that GC can be more
efficient than reference counting.)


I agree with those who complain about storage leaks in
compiler-generated code, or in the Ada run-time system. Clearly
unacceptable.


Henry is unfair, I think, to criticize the designers of Ada 83 for
ignoring GC. In fact, it is clear from the Ada 83 Reference Manual,
that it was intended that GC be supported -- in the same sense as in
Eiffel -- an intent, but not a formal requirement. However, the
compiler writers did what their customers wanted, which was not to
support GC. (Again, Henry may think those customers were foolish, but
that's beside the point.) Why do I say Ada 83 intended GC? Well,
RM83-4.8(7) explains that storage may be reclaimed when there are no
remaining pointers to it -- sounds like GC, although the RM calls it
"automatic storage reclamation". And the deallocator is hidden away
with the low-level junk in chapter 13. And it's called
Unchecked_Deallocation (a hint that it should be rarely used, like
Unchecked_Conversion) rather than a more neutral name like Pascal's
Dispose or C's free. And there's a pragma (pragma Controlled) for
turning *off* garbage collection -- to me, that implies that the
designers intended garbage collection to be *on* by default.


On the other hand, the same criticism of Ada 95 is fair -- the RM
explicitly *allows* GC, but is written in a way that implies that GC
support is not intended to be the default. In this sense, Henry is
right that Ada 95 is inferior to Java. I wrote those parts of the Ada
95 RM, and my attitude at the time was one of practicality -- people
don't seem to be demanding GC, and existing Ada 83 compilers don't
support it, so let's not pretend that it's the default. But times
change, and we may very well see Ada turn into a garbage collected
language, by de-facto standard. (We may well see the same thing
happen to C++ someday, although the technical problems are a bit more
difficult for C++. I don't think much of the so-called "conservative"
GC schemes -- especially for real-time.)


Note that GC is not a panacea. You can still have dangling pointers
and storage leaks. An example of the former: I've seen Lisp code that
keeps a free list and reuses some objects (presumably for efficiency).
It's possible to put something on the free list, and reuse it
incorrectly, when somebody else is still pointing at it for its
previous use. An example of the latter: I've seen Lisp and Smalltalk
code that incorrectly hangs on to a pointer to a data structure that
is no longer in use; since it doesn't get collected, the system
eventually runs out of memory. The advantage of GC is not that it
absolutely prevents these kinds of bugs -- the advantage is that it
makes them much less likely in practise.


Henry mentioned the Symbolics Lisp machines as an example -- he said
they would stay up for years on end. Well, my experience was a bit
different. I complained about certain programs being horribly slow,
due to GC overhead (the GC would page in huge amounts of stuff from
the disk (sometimes across the network) in order to collect that
stuff!). The folks at Symbolics told me to do what they do -- turn
off the GC, and when the machine starts getting close to running out
of memory, reboot it. That meant rebooting *at least* once a day.
(To be fair, the GC algorithm improved in later years, to the point
where it may have been practical to use routinely, but I stopped using
Symbolics machines around that time.)


By the way, I was one of the plastic surgeons Henry Baker mentioned in
a previous post on this subject. I'd like to think we did a little
bit more than cosmetic surgery on Ada. ;-)


- Bob
--


Post a followup to this message

Return to the comp.compilers page.
Search the comp.compilers archives again.