Runtime Improvements

Posted on 10 November 2011 by David Chisnall

This blog hasn't been updated for a while. Aside from chasing some LanguageKit bugs, I haven't done much on the Étoilé side recently: I've been busy with some paid work (shocking, but it does happen) and with getting ready for the LLVM / clang 3.0 release. I've also been adding some new things to the GNUstep Objective-C runtime (libobjc2), which is now close to the 1.6 release.

The new version of the runtime fixes some bugs. This is boring, but important. OpenBSD can now ship libobjc2 instead of GCC libobjc, because it now passes more tests than the GCC version even on architectures like SPARC where I have done no testing.

More interestingly, I've now added support for all of the new APIs that OS X 10.7 introduced. Most of these are quite boring, but one is particularly fun:

IMP imp_implementationWithBlock(void*)

This function takes a block as an argument and returns a function pointer that can be used as an Objective-C method. Block functions take one hidden argument, the block pointer itself. Blocks that are passed to this function must take self as their first explicit argument, which makes it the second argument of the underlying function, right after the block pointer.

Methods also take two hidden arguments, the receiver and the selector. This function has to return a trampoline that maps a call like this:

someMethod(object, selector, ...)

to something like this:

someBlock->invoke(someBlock, object, ...)

Unfortunately, this is impossible in C. You cannot write a C function that calls another function with the first two arguments modified but all others preserved. You could probably use libffi, but that would be quite slow. NSInvocation uses libffi, and a call via -forwardInvocation: costs around three hundred times as much as a direct message send - far from ideal. Instead, I've implemented this in assembly for ARM, x86 and x86-64 (MIPS and PowerPC versions are planned, but probably won't be done in time). Each version moves the self parameter over the _cmd parameter, loads the block pointer from one word before the trampoline into the first parameter, loads the function pointer from two words before the trampoline, and jumps to it.
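To make the argument shuffle concrete, here is a minimal C sketch of the same mapping for one fixed signature. It is not the libobjc2 code: the toy_* types, the simplified block layout and the global bound_block are all stand-ins invented for illustration. The real trampoline reads the block pointer from the words just before its own code and works for any argument list, which is exactly the part that C cannot express.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the runtime's object and selector types (hypothetical). */
typedef struct toy_object { int value; } toy_object;
typedef const char *toy_selector;

/* Simplified block layout; real blocks carry more header fields. */
typedef struct toy_block {
    int (*invoke)(struct toy_block *block, toy_object *self, int amount);
    int captured;
} toy_block;

/* The block body: hidden block pointer first, then self, then the arguments. */
int add_captured(toy_block *block, toy_object *self, int amount)
{
    return self->value + amount + block->captured;
}

/* Stand-in for the block pointer that the real trampoline stores next to
 * its code. A global limits this sketch to one bound block at a time. */
toy_block *bound_block;

/* Fixed-signature trampoline: called like a method (self, selector, args),
 * it drops the selector and calls the block as (block, self, args). */
int trampoline_int(toy_object *self, toy_selector cmd, int amount)
{
    (void)cmd; /* the selector has no counterpart in the block signature */
    assert(bound_block != NULL);
    return bound_block->invoke(bound_block, self, amount);
}

/* Binds a block, then calls it through a method-shaped function pointer. */
int trampoline_demo(void)
{
    toy_block block = { add_captured, 100 };
    toy_object object = { 20 };
    int (*imp)(toy_object *, toy_selector, int) = trampoline_int;

    bound_block = &block;
    return imp(&object, "add:", 3);
}
```

Calling trampoline_demo() sends the equivalent of [object add: 3] through the trampoline and returns 20 + 3 + 100. Every new signature would need another hand-written C function like trampoline_int, whereas the assembly version never looks at the remaining arguments at all.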

The other complication of this approach is that each returned IMP must be writable (so that I can bind the block to it), but must also be executable. Modern operating systems do not allow a page to be both writable and executable at the same time, for a very good reason. The workaround is to map the same pages into memory at two independent locations, once executable and once writable. This all seems to work nicely. To make it a bit more friendly, I've also added this function:
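The double mapping can be sketched with plain POSIX calls. This is only an assumption about one way to do it (a temporary file from mkstemp() mapped twice with mmap()), not necessarily how libobjc2 sets up its trampoline pages, and the dual_mapping names are invented here. The sketch never executes anything from the executable view; it only checks that bytes written through one view show up in the other.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Two views of the same physical pages. */
struct dual_mapping {
    unsigned char *writable;   /* PROT_READ | PROT_WRITE view */
    unsigned char *executable; /* PROT_READ | PROT_EXEC view  */
    size_t length;
};

/* Returns 0 on success, -1 if the platform refuses any of the steps. */
int dual_mapping_create(struct dual_mapping *m, size_t length)
{
    char path[] = "/tmp/dualmapXXXXXX";
    int fd;
    void *w, *x;

    assert(m != NULL);
    fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path); /* the file now lives on only through fd and the mappings */
    if (ftruncate(fd, (off_t)length) != 0) { close(fd); return -1; }

    /* Map the same file twice, with different permissions on each view. */
    w = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    x = mmap(NULL, length, PROT_READ | PROT_EXEC, MAP_SHARED, fd, 0);
    close(fd);
    if (w == MAP_FAILED || x == MAP_FAILED) {
        if (w != MAP_FAILED) munmap(w, length);
        if (x != MAP_FAILED) munmap(x, length);
        return -1;
    }
    m->writable = w;
    m->executable = x;
    m->length = length;
    return 0;
}

void dual_mapping_destroy(struct dual_mapping *m)
{
    munmap(m->writable, m->length);
    munmap(m->executable, m->length);
}

/* Returns 1 if a write through the writable view is visible through the
 * executable view at a different address, 0 if it is not, and -1 if the
 * double mapping could not be created (for example on a noexec mount). */
int dual_mapping_demo(void)
{
    static const unsigned char pattern[4] = { 0xde, 0xad, 0xbe, 0xef };
    struct dual_mapping m;
    long page = sysconf(_SC_PAGESIZE);
    int visible, distinct;

    if (page <= 0 || dual_mapping_create(&m, (size_t)page) != 0) return -1;
    memcpy(m.writable, pattern, sizeof pattern);
    visible = memcmp(m.executable, pattern, sizeof pattern) == 0;
    distinct = m.writable != m.executable;
    dual_mapping_destroy(&m);
    return visible && distinct;
}
```

At no point is any single address both writable and executable, so the protection that the operating system enforces stays intact for each view.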

char *block_copyIMPTypeEncoding_np(void*)

This returns the type encoding for the IMP returned by the other function. It has the _np suffix because it is an extension that isn't present in the Apple runtime. That absence is a shame, because the function makes using the block-to-IMP code a lot easier. The new implementation of ETPrototypes in EtoileFoundation uses it extensively.
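As an illustration of what such a conversion involves, here is a simplified sketch. It assumes that a block's type encoding lists the return type, then the block itself as @?, then the explicit arguments, each optionally followed by a frame offset. The helper name imp_encoding_from_block_encoding is hypothetical, the sketch only handles single-character return types, and the sample encodings in the note below are made up for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: turns a block type encoding into the encoding of the
 * matching method. The block argument "@?" becomes the receiver "@", and the
 * block's self argument "@" becomes the selector ":". Returns a malloc()ed
 * string that the caller frees, or NULL if the input has the wrong shape. */
char *imp_encoding_from_block_encoding(const char *block_encoding)
{
    size_t length;
    char *buffer, *cursor;

    assert(block_encoding != NULL);
    length = strlen(block_encoding);
    if (length == 0) return NULL;
    buffer = malloc(length + 1);
    if (buffer == NULL) return NULL;
    memcpy(buffer, block_encoding, length + 1);

    cursor = buffer + 1; /* skip the single-character return type */
    while (isdigit((unsigned char)*cursor)) cursor++; /* and the frame size */

    /* The first argument must be the block itself. */
    if (cursor[0] != '@' || cursor[1] != '?') { free(buffer); return NULL; }
    cursor++; /* now pointing at the '?' */
    /* Delete the '?'; strlen(cursor) bytes also moves the terminating NUL. */
    memmove(cursor, cursor + 1, strlen(cursor));

    while (isdigit((unsigned char)*cursor)) cursor++; /* skip the offset */

    /* The next argument must be an object (self); it becomes the selector. */
    if (*cursor != '@') { free(buffer); return NULL; }
    *cursor = ':';
    return buffer;
}
```

With this sketch, the made-up encoding "i20@?0@8i16" comes out as "i20@0:8i16", and a block whose first explicit argument is not an object, such as "v@?i", is rejected with NULL.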

The other new feature is something that I've been pondering implementing for a while: the objc_msgSend() function. The NeXT / Mac runtime uses this interface for message sending. The compiler does a translation roughly like this when it encounters a message send:

[object message: argument];
// NeXT / Mac:
objc_msgSend(object, @selector(message:), argument);
// GNU:
IMP imp = objc_msg_lookup(object, @selector(message:));
imp(object, @selector(message:), argument);

The GNUstep runtime also implements a slightly different message lookup function, which allows the compiler to insert automatic caching, but the core idea is the same. The NeXT version has the advantage that each message send just needs to be a single function call - the compiler emits code for setting up the call frame once. With the GNU version, the compiler needs to make one call, then set up the argument frame again, and then make another. This is also potentially slower, because even if the lookup function is infinitely fast, it still requires an extra function call.
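The difference between the two call shapes can be modelled in plain C with a toy runtime. Everything named toy_* or number_* below is invented for the sketch, the linear method search merely stands in for a real dispatch table, and the combined function only works for one fixed signature. The real objc_msgSend() jumps to the method rather than calling it, so it handles any argument list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy runtime types (hypothetical, for illustration only). */
typedef struct toy_object toy_object;
typedef const char *toy_selector;
typedef int (*toy_imp)(toy_object *self, toy_selector cmd, int argument);

struct toy_method { toy_selector name; toy_imp imp; };
struct toy_class { const struct toy_method *methods; size_t method_count; };
struct toy_object { const struct toy_class *isa; int value; };

/* GNU style, step one: find the implementation. The caller then has to set
 * up the arguments a second time to call it. */
toy_imp toy_msg_lookup(toy_object *receiver, toy_selector selector)
{
    const struct toy_class *cls = receiver->isa;
    size_t i;

    for (i = 0; i < cls->method_count; i++) {
        if (strcmp(cls->methods[i].name, selector) == 0) {
            return cls->methods[i].imp;
        }
    }
    return NULL;
}

/* NeXT style: lookup and call behind a single entry point. */
int toy_msg_send(toy_object *receiver, toy_selector selector, int argument)
{
    toy_imp imp = toy_msg_lookup(receiver, selector);

    assert(imp != NULL);
    return imp(receiver, selector, argument);
}

/* One method, "scale:", on one class. */
int number_scale(toy_object *self, toy_selector cmd, int factor)
{
    (void)cmd;
    return self->value * factor;
}

const struct toy_method number_methods[] = { { "scale:", number_scale } };
const struct toy_class number_class = { number_methods, 1 };

/* What the compiler emits for the GNU ABI: two calls per message send. */
int dispatch_two_step(int value, int factor)
{
    toy_object object = { &number_class, value };
    toy_imp imp = toy_msg_lookup(&object, "scale:");

    return imp(&object, "scale:", factor);
}

/* What the compiler emits for the NeXT ABI: one call per message send. */
int dispatch_single_call(int value, int factor)
{
    toy_object object = { &number_class, value };

    return toy_msg_send(&object, "scale:", factor);
}
```

Both functions produce the same result; the difference is only in how much code sits at every call site and how many calls are made.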

I missed the 3.0 deadline for getting this support into clang, but if you want to play with it you can download LLVM and clang trunk and add -fno-objc-legacy-dispatch to your Objective-C flags. I've compiled GNUstep-base with this flag on x86 and x86-64. The resulting binary is about 10% smaller than without it, and it passes all of the tests.

The assembly versions are not as optimised as they could be, but they are still a bit better than the C version on the fast paths. They contain almost no branching and simply look up the method in the dispatch table and jump to it, leaving the call frame intact. I benchmarked message sending on all three architectures. On each of them, the objc_msgSend() approach took half the time of the old dispatch mechanism.

The old mechanism isn't going away. It has the advantage of being portable to any architecture, and the GNUstep runtime implementation allows caching, which in turn allows speculative inlining, which can be even faster. On platforms where it is supported, though, the Apple-compatible version is faster for an ordinary message send. In fact, in microbenchmarks (and therefore to be taken with a big pinch of salt), the libobjc2 version in its average case performed as well as the Apple version in its best case. This is a bit misleading, because the libobjc2 version touches more memory, so cache pressure may make it slightly worse in real-world usage.

LanguageKit already requires the new version of the runtime for some other things (ARC, mainly), so I am probably going to enable the new dispatch mechanism by default before the next release.

I was a bit surprised at how much of a difference this made. The assembly version is still slower than the cached version (so in loops I will still want to emit the old-style lookup and cache the IMP), but it is sufficiently fast that you are unlikely to see message send overhead as the bottleneck.
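The loop case looks roughly like this in C. The loop_lookup() function is a counting stand-in for the runtime's lookup, not a real API, and the other names are equally hypothetical. The point is only that the cached form performs one lookup for the whole loop instead of one per iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Method-shaped function pointer for this sketch (hypothetical). */
typedef long (*loop_imp)(void *self, const char *cmd, long argument);

/* Counts how many lookups were performed. */
unsigned long lookup_count;

/* The "method": adds its argument to the receiver, a plain long here. */
long add_to_total(void *self, const char *cmd, long argument)
{
    long *total = self;

    (void)cmd;
    *total += argument;
    return *total;
}

/* Stand-in for the runtime's method lookup; always finds add_to_total. */
loop_imp loop_lookup(void *receiver, const char *selector)
{
    (void)receiver;
    (void)selector;
    lookup_count++;
    return add_to_total;
}

/* One lookup per message send. */
long sum_with_lookup_each_time(long n)
{
    long total = 0;
    long i;

    for (i = 1; i <= n; i++) {
        loop_imp imp = loop_lookup(&total, "add:");
        imp(&total, "add:", i);
    }
    return total;
}

/* One lookup for the whole loop, then direct calls through the cached IMP. */
long sum_with_cached_imp(long n)
{
    long total = 0;
    long i;
    loop_imp imp = loop_lookup(&total, "add:");

    assert(imp != NULL);
    for (i = 1; i <= n; i++) {
        imp(&total, "add:", i);
    }
    return total;
}
```

A cached IMP is only valid while the method it came from is not replaced, which is why this kind of caching is something for the compiler or runtime to manage rather than something to sprinkle through application code.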

Oh, and one more thing: the latest version also supports small objects. These are objects that are hidden inside a pointer, like Smalltalk's SmallInt. The current version of GNUstep-Base creates an NSSmallInt subclass of NSNumber on all platforms that stores a 31-bit or 61-bit number inside the pointer, depending on the pointer size. This means that LanguageKit objects and Objective-C objects are now the same thing. You can send messages to small integers from Objective-C, and it Just Works™.
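For readers who haven't met the trick before, here is a minimal sketch of hiding an integer inside a pointer. The layout is an assumption chosen for the example (three low tag bits, with the tag value 1 meaning "small integer"), which leaves 61 bits of payload on a 64-bit platform. The actual tag assignment in libobjc2 and GNUstep-Base may differ, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tag layout: the low three bits of a pointer hold a tag. */
#define SMALL_TAG_BITS 3
#define SMALL_TAG_MASK ((uintptr_t)7)
#define SMALL_INT_TAG  ((uintptr_t)1)

/* A pointer is a small integer if its low bits carry the small-int tag.
 * Real objects are aligned, so their low bits are zero. */
int is_small_int(const void *object)
{
    return ((uintptr_t)object & SMALL_TAG_MASK) == SMALL_INT_TAG;
}

/* Packs value into a pointer-sized word; no memory is allocated. */
void *make_small_int(intptr_t value)
{
    /* The value must fit in the bits left over after the tag. */
    intptr_t limit = INTPTR_MAX >> SMALL_TAG_BITS;

    assert(value <= limit && value >= -limit - 1);
    return (void *)((((uintptr_t)value) << SMALL_TAG_BITS) | SMALL_INT_TAG);
}

/* Recovers the integer. Clearing the tag leaves value * 8, so dividing by 8
 * is exact and avoids relying on right shifts of negative numbers. */
intptr_t small_int_value(const void *object)
{
    uintptr_t bits;

    assert(is_small_int(object));
    bits = (uintptr_t)object & ~SMALL_TAG_MASK;
    return (intptr_t)bits / ((intptr_t)1 << SMALL_TAG_BITS);
}
```

A message send to such a value only needs one extra check: if the tag bits are set, the runtime picks the class from the tag instead of loading it from the object, and the method finds the number in its self pointer.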