News: Stay up to date

The Étoilé community is an active group of developers, designers, testers and users. New work is being done every day. Visit often to find out what we've been up to.

News

Building a Better Garbage Collector

Posted on 6 July 2008 by David Chisnall

One day a student came to Moon and said, "I understand how to make a better garbage collector. We must keep a reference count of the pointers to each of the cans." Moon patiently told the student the following story:

"One day a student came to Moon and said, "I understand how to make a better garbage collector...

I am not a fan of many of the things Apple has done to Objective-C recently. The one thing I liked the idea of in principle was garbage collection. Unfortunately, they seem to have done this very badly, so I set about seeing if there was a better way. First, some background:

There are, generally speaking, two kind of garbage collection: reference counting and tracing. With reference counting, every assignment increments the reference count of the new value and decrements the reference count of the old value. When an object's reference count hits zero, it is freed. This is what is traditionally done with OpenStep, via the -retain and -release methods.

The other alternative is tracing. This requires every object to be known to the garbage collector. Globals are identified as 'roots' and periodically the collector attempts to navigate from the roots to every reachable object. Those that can not be reached are freed.

In 2004, some very bright guys at IBM's T.J. Watson Research Center (a nice place to visit, by the way - it's on top of a hill, with huge windows and overlooks some gorgeous scenery) came up with a Unified Theory of Garbage collection in which they propose that these are really equivalent. A tracing garbage collector needs to set a flag indicating that an object has been reached, and this can be seen as a special case of a reference count (one capped at one). Reference counting garbage collectors need some extra mechanism for detecting loops, and this is equivalent to the tracing operation.

When Apple added tracing GC to Cocoa, they threw away the reference counting mechanism. This was a shame, since all that is needed to turn reference counting into full GC is the addition of a cycle detecting algorithm. If a has a reference to b and b has a reference to a then, with pure reference counting, both a and b will leak. The rôle of the cycle detector is to run periodically and make sure this does not happen.

Fortunately, the two of the same guys who came up with the unified theory had, a few years earlier, published another paper in which they describe a mechanism for adding an efficient (i.e. fast) cycle detector to a reference counting system.

I have implemented this for GNUstep, and it shows promise. I've made a few modifications to NSObject - retain counts are now stored in a 16-bit value (if you have more than 65535 references to a single object, you probably have a bug) and the other 16 bits are now used to store flags, including a colour. The colour is set by the cycle detection algorithm, which is invoked periodically when a buffer of objects which have been released but not freed becomes full.

One minor problem with this is that objects can now exist safely in loops and -retain can be called on an object which is currently executing -dealloc. This can lead to an infinite loop, and some careful juggling is required to ensure that no objects deallocate themselves while freeing loops.

The code works, although it has a few (easily fixable) limitations. I created a small graph of five Pair objects. Each one has a reference to itself and to the next one in the ring. The code correctly determines that this contains a loop, and destroys all five objects when the autorelease pool is destroyed.

The only major limitation is that I've only written code for atomically accessing the colours on x86. This can be fixed trivially by simply writing these functions. A smaller issue is that a number of GNUstep classes indulge in premature optimisation by calling NSDeallocateObject() in their -dealloc method, rather than calling [super dealloc].

I currently use a modified version of the algorithm in the paper which uses an NSHashTable to store pointers to objects that might contain loops. Since there's space in the flags field for a 'buffered' flag, I can easily extend it to use this, and replace the hash table with a static array. This is better for two reasons: it should be faster, and it means that we can use thread-local storage without having to worry about explicit destructors (which are currently called by listening for a thread terminating notification, which is slightly fragile and will potentially fail to catch things released when a thread is dying).

Since some code already handles loops via unretained references, the current code has problems. To avoid this, I introduced an extra colour (transparent), and any objects with this colour are assumed to always be acyclic. This allows automatic loop detection to be turned on on a per-object, or per-class basis. In future, this can be used in reverse: to turn off loop detection for intrinsically acyclic data structures (e.g. trees).

My main motivation for this is for the Smalltalk compiler I am currently working on. Since Smalltalk expects garbage collection, and Objective-C does not provide it, this presents a small problem. A tracing collector can be used, however this is very tricky when integrating with a C-like language, since it means that everything which may potentially contain pointers has to be checked, including integers and untyped buffers.