News: Stay up to date

The Étoilé community is an active group of developers, designers, testers and users. New work is being done every day. Visit often to find out what we've been up to.


Futures in Objective-C

Posted on 23 September 2007 by David Chisnall

I like concurrent programming, but I don't like threads. Like files, they're a nice abstraction for operating system designers but not so much for userspace hippies.

In functional languages, you can often get concurrency for free by having a clever compiler, instead of a clever human. This is good, because clever humans are expensive. Clever compilers are too, but they're easier to copy than clever humans.

Consider the following bit of Objective-C:

id foo = [anObject doSomething];
[foo doSomethingElse];

In Smalltalk, objects were regarded as simple computers that communicated by message passing. The fact that this message passing was implemented with a stack was hidden. To the Smalltalk way of thinking, the objects were independent. This bit of code sends a doSomething message to anObject, and blocks until it sends a return message.

From here, it doesn't take long to realise that the caller doesn't actually need to block until the [foo doSomethingElse] line. So, can we implement this in Objective-C in the general case? The answer is yes, and that's what the EtoileThread framework (soon to be in EtoileFoundation) does. It works on OS X too.

I actually wrote EtoileThread in a hotel in Dublin in June last year, but I recently rewrote a lot of it to be more efficient. I'd like to give a little overview of how it works.

There are three core components. The first is the ETThreadedObject class, which encapsulates an object in its own thread. It's an NSProxy subclass that forwards messages to the real object. You typically create one via an NSObject category, which adds +threadedNew and -inNewThread methods. When you send a message to the object returned by either of these, the following sequence happens:

  1. The invocation is caught by the ETThreadedObject and put into a ring buffer.
  2. The forwardInvocation: method returns, and the calling code receives an ETThreadProxyReturn object.
  3. The second thread retrieves the invocation from the ring buffer and executes it.
  4. The second thread passes the real return value to the previously returned ETThreadProxyReturn.
  5. Any calls to methods in the returned proxy block until this point.

What does that look like in practice? Well, we'll look at the simple example program included with the framework (ETThreadTest.m for those following in svn) and see. First, we define a simple class that has some trivial methods:

@implementation ThreadTest
- (void) log:(NSString*)aString
{
    sleep(2);
    NSLog(@"%@", aString);
}
- (id) getFoo
{
    sleep(2);
    return @"foo";
}
@end
The first just NSLogs whatever is passed to it, and the second returns a constant string. Next, in the main body, we create an instance of this in its own thread:

    id proxy = [ThreadTest threadedNew];

We then send this a log message:

    [proxy log:@"1) Logging in another thread"];

And then a getFoo message. Recall that the implementations of both of these methods have a two-second delay built into them. This was introduced to make it obvious which order everything was being executed in.

    NSString * foo = [proxy getFoo];

Next, we NSLog something from the main thread, just to show where we are.

    NSLog(@"2) [proxy getFoo] called.  Attempting to capitalize the return...");

Then, we NSLog the return value from the getFoo method (for good measure, we'll send it a message and NSLog the result, rather than NSLogging it directly):

    NSLog(@"3) [proxy getFoo] is capitalized as %@", [foo capitalizedString]);

Finally, since we know we are calling a future, we get the real object out and NSLog it.

    if([foo isFuture])
        NSLog(@"4) Real object returned by future: %@",
                [(ETThreadProxyReturn*)foo value]);

What happens when we run this? Take a look:

$ ./ETThreadTest 
2007-09-23 21:59:39.718 ETThreadTest[25196] 2) [proxy getFoo] called.  Attempting to capitalize the return...
2007-09-23 21:59:41.695 ETThreadTest[25196] 1) Logging in another thread
2007-09-23 21:59:43.695 ETThreadTest[25196] 3) [proxy getFoo] is capitalized as Foo
2007-09-23 21:59:43.695 ETThreadTest[25196] 4) Real object returned by future: foo

Note that the NSLog from the main thread completes first. Note also the two-second delays. Finally, note that the third and fourth log statements don't complete until after the getFoo method has run, since they depend on the returned value.

So what has improved in the implementation over the last week? My first version was quite experimental: it used an NSMutableArray to store the invocation queue, which meant that every message going into the queue required these steps:

  1. Acquiring a mutex.
  2. Inserting an object into an NSMutableArray.
  3. Signalling a condition variable (if the array was empty).
  4. Releasing a mutex.

On the receiving end, you needed the following:

  1. Acquiring a mutex.
  2. Sleeping on a condition variable (if the array is empty).
  3. Removing the first object from an NSMutableArray (quite expensive).
  4. Releasing a mutex.

This is a minimum of four system calls (a maximum of six) and at least one expensive array operation. That isn't too bad if you only occasionally send messages to your threaded objects and expect them to take a long time to complete. If you are sending a lot of messages, however, the overhead quickly cancels out any benefit from the second thread.

The new implementation uses a lockless ring buffer instead. Inserting an object involves a subtraction and a comparison to see if the buffer is full (we just spin using sched_yield() if it is, but with space for 128 invocations in the buffer that should be rare), inserting two objects into a C array (the invocation and the proxy return), an addition, and a memory barrier (on weakly-ordered platforms; x86 doesn't need one). Removing an object switches back to the old-style locking mode if the queue is empty; we don't want the consumer thread to spin when there is no work to do, because that could last a long time. Otherwise, taking the objects out of the buffer is just copying their pointers out of the array, plus another addition.

The ring buffers, like those in Xen, use free-running counters and mask off the low bits to translate them into array indices.

There are two special cases, because the futures code only works for methods that return objects. For methods that return non-object values, the caller has to block until the invocation has completed. For methods that return void, the call still completes asynchronously, but no proxy object is returned.

If you want a more detailed overview of how it all works, there are lots of comments in the code. Have fun, and file helpful bug reports.