From: | ericmuttta@email.com (Eric) |
Newsgroups: | comp.compilers |
Date: | 2 Jul 2003 00:44:39 -0400 |
Organization: | http://groups.google.com/ |
References: | 03-05-211 03-06-015 03-06-054 03-06-057 03-06-078 03-06-106 |
Keywords: | design, storage |
Posted-Date: | 02 Jul 2003 00:44:39 EDT |
mwotton@cse.unsw.edu.au (Mark Alexander Wotton) wrote in message news:03-06-106...
> On 20 Jun 2003 00:02:55 -0400, Eric posted:
> > mwotton@cse.unsw.edu.au (Mark Alexander Wotton) wrote
> >>
> >> Either that, or only released when all references to it have passed
> >> out of scope. This is how many modern garbage-collected language
> >> implementations work.
> >
> > This scheme (reference-counting) is used in VB6 for instance.
>
> Reference counting and mark-and-sweep are just ways of implementing the basic
> idea of deallocating when references are out of scope.
Just out of curiosity, are these the only major automatic memory
management techniques out there?
> > VB7 (and every other .NET language?) uses generational,
> > mark-and-sweep GC (hope that's the right term for it). When I read
> > its description, it sounded like a rather elaborate scheme and one
> > that would need a good deal of implementation effort. I prefer plain
> > vanilla automatic reference-counting. Its deterministic (hence
> > useful in real-time systems), simple to explain and understand, and
> > a lot easier to implement and get right.
>
> You do have the problem with reference-counting that it doesn't cope
> very well with two-way links.
This problem really bugs, because the standard library is going to be
full of data structures the utilise two way links (eg. doubly linked
lists). Is there a general solution (other than using a completely
different scheme like mark-and-sweep) for this? One of the reasons I
really fancy reference-counting is its low overhead, but I suspect
that with a few adjustments to the algorithm, one can add in logic
that will detect circular links during deallocation. Its seems to be a
general graph traversal/tracing problem, where one wants to know "have
I visited this node before?" and if the answer is "yes" then we know a
circular link exists somewhere and act accordingly inorder to
deallocate the objects involved.
> Generational collection is nice partly
> because it's complete (all unused allocations are reclaimed)
How, by the way, does generational collection deal with circular
references? Could the algorithm used be adapted for use in reference
counting, to form a sort of new hybrid scheme?
> and also
> because it takes advantage of common memory use patterns: it's
> optimised for the case of new objects pointing to old objects, which
> is the usual case.
Thanks for that point. Up until now I had always wondered why in the
world someone would choose generational GC. It is more complex to
implement, requires a completely separate entity (the collector), it
is harder to explain and understand conceptually, plus its unideal for
use in real-time environments. What other advantages are there for
this scheme? (other than the ones mentioned above) and in the long
run, are they worth the disadvantages?
> > I admit, reference-counting has its problems and I have been toying
> > around with an idea for what I call "compile-time
> > reference-counting". Basically, I am looking for a set of rules,
> > that will allow the compiler to figure out when to release memory
> > for an object, *without* having to allocate memory space for a
> > reference counter value. Instead, using these rules, the compiler
> > can analyse code at compile time and statically work out the
> > life-time of an object. Its a bit wishy washy at the moment but if
> > its at all possible, I will write a paper on it :)
>
> It smells a bit undecidable to me. At least for the complete case, I think
> this reduces to the halting problem.
>
> Object foo(Object x, Object y) {
> if (halts(x)) {
> return y;
> } else {
> return x;
> }
> }
>
> When do you release x's memory?
>
> You may be able to find some decent heuristics, but I suspect it's a lot
> harder than it looks.
The scheme as it stands now actually seems to work fine, when you do
not have any conditional statements (eg If..End If statements). One
rule that I came up with that allows this is "references can only be
'nulled out' or reassigned to another instance, in the scope of their
declaration"
Sub Blah
Dim f As New Foo
...
Call BarProc(f)
...
End Sub
with that rule, the compiler knows that the only scope in which the
reference "f" can become null, is within procedure Blah. Within
BarProc, "f" can be manipulated as desired, but BarProc is not allowed
to set "f" to null or reassign it to refer to another instance.
Then using static analysis, object life time can be worked out:
Dim f As Foo, g as Foo
Set f = New Foo
g = f
(at this point, the compiler statically knows that the instance of Foo
is being referred to twice)
Set f = Nothing 'null out the reference
(at this point, the compiler statically knows that the instance of Foo
is being referred to once because f has been set to Nothing)
g = New Foo
(now the compiler statically knows there are no more references to
that first instance of Foo, so it generates code to deallocate it.)
The rule I mentioned above is important and needed because it allows
the compiler to "see" every point at which references are set or
removed. This partly breaks down when you have conditional execution
because the compiler cannot know at compile-time whether the code will
execute (and hence adjus the reference count) or not. Eg if the above
code had been:
Dim f As Foo, g as Foo
Set f = New Foo
g = f
If Something = True
Set f = Nothing
End If
g = New Foo
the compiler cant statically tell whether "f" was actually Nulled out,
so after the "If" block, it doesnt know whether the refence count is 1
or 2. Extra code would be needed to work that out, but then we might
as well go back to the run-time reference counting scheme if we are
going to add some overhead.
Well, as I said, its still a bit wishy washy but its showing signs of
potential. If it ever works then it could really help save on memory
use (eg if one would normally use a 4 byte (32bit) reference counter,
and you had a collection of 1,000,000 objects, with this scheme you'd
save 4MB!!)
> mrak
You know, i have been calling you "mrak" all this time, is that
correct or just a typo for "Mark"? In any case, thanx for your
comments.
Cheers,E.
Return to the
comp.compilers page.
Search the
comp.compilers archives again.