Playing with the Runtime Again

Posted on 9 October 2007 by David Chisnall

Everyone should have a hobby, and at the moment mine seems to be abusing the GNU Objective-C runtime. One of my spare time projects is writing a Smalltalk JIT that uses the GNU runtime to provide the object model (allowing you to subclass Objective-C objects in Smalltalk).

Last night, Quentin demonstrated the old maxim that the way to get anything done in an open source project is to tell the developers that it sucks because it can't do X; pretty soon they'll have it doing X, for any value of X. Quentin's criticism was that it wasn't possible to subvert the message dispatch mechanism very easily.

What does that mean? Let's take a look at what happens when you send a message in Objective-C. First, you write something like this:

[object doSomethingWith:aParameter];

The compiler then converts this into something like this:

SEL sel = sel_get_any_uid("doSomethingWith:");
IMP imp = objc_msg_lookup(object, sel);
imp(object, sel, aParameter);

Note that this is a simplification, and the selector will typically be cached somewhere. The important function is objc_msg_lookup, which returns the function that implements the method. These functions always take the object and selector as the first two arguments, and may take others.
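To make the lookup-then-call sequence concrete, here is a minimal sketch in plain C. It is a toy model, not the real libobjc: every `toy_` name is hypothetical, selectors are plain strings, and a linear search stands in for the runtime's actual lookup structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for the runtime types; NOT the real libobjc definitions. */
typedef struct toy_object *toy_id;
typedef const char *toy_SEL;  /* assumption: a selector is just its name */
typedef int (*toy_IMP)(toy_id self, toy_SEL _cmd, int arg);

struct toy_method { const char *name; toy_IMP imp; };
struct toy_class  { const struct toy_method *methods; size_t count; };
struct toy_object { const struct toy_class *isa; int value; };

/* A method body: receiver and selector arrive as the first two arguments. */
static int add_to_value(toy_id self, toy_SEL _cmd, int arg) {
    (void)_cmd;
    return self->value + arg;
}

static const struct toy_method toy_methods[] = {
    { "doSomethingWith:", add_to_value }
};
static const struct toy_class toy_cls = { toy_methods, 1 };

/* Linear search standing in for the runtime's per-class lookup. */
static toy_IMP toy_msg_lookup(toy_id receiver, toy_SEL sel) {
    assert(receiver != NULL);
    for (size_t i = 0; i < receiver->isa->count; i++) {
        if (strcmp(receiver->isa->methods[i].name, sel) == 0)
            return receiver->isa->methods[i].imp;
    }
    return NULL;  /* this toy model simply reports a miss */
}

/* Roughly what [object doSomethingWith:arg] becomes in this model. */
static int toy_send(toy_id object, int arg) {
    toy_SEL sel = "doSomethingWith:";
    toy_IMP imp = toy_msg_lookup(object, sel);
    return imp ? imp(object, sel, arg) : -1;
}
```

Calling `toy_send` on an object whose `isa` points at `toy_cls` performs the same two steps as the compiler output above: find the function, then call it with the receiver and selector in front of the real arguments.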

For a language like Smalltalk or Objective-C, this mechanism makes sense. For something like Io, it almost does. The problem, for Io, is the implementation of objc_msg_lookup(). This looks at a sparse array structure in the class structure to find the mapping. This isn't helpful for a prototype-based language like Io, where instances might have different methods from their classes. For Io, you want to be able to alter the behaviour of the objc_msg_lookup() function on a per-object basis. For bonus points, you want to do this without breaking binary compatibility (the GNU C++ standard library people were very unpopular when they did this).

Fortunately, the Objective-C class structure has a field called info, which is a bitfield. Actually, it's half a bitfield; the upper half is used to store the id of the class in the system, limiting you to 64K different classes, and the lower half stores flags. These flags are used for various purposes, including indicating whether a class has an +initialize method that needs to be called. Not all of them are used, so I added a new one. I then modified the objc_msg_lookup function to include a special case if this flag is set.
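The split of info into a class id and a set of flags can be sketched in C as below. This assumes a 32-bit field with 16 bits for each half, which matches the 64K limit mentioned above. The flag values and all `toy_`/`TOY_` names are hypothetical and are not taken from the runtime headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits; the real runtime's values may differ. */
#define TOY_CLS_INITIALIZED     0x4u
#define TOY_CLS_OBJECT_DISPATCH 0x20u

#define TOY_FLAG_BITS 16       /* assumption: lower 16 bits hold the flags */
#define TOY_FLAG_MASK 0xFFFFu

/* Pack a class id (upper half) and flags (lower half) into one word. */
static uint32_t toy_info_make(uint16_t class_id, uint16_t flags) {
    return ((uint32_t)class_id << TOY_FLAG_BITS) | flags;
}

/* Recover the class id from the upper half. */
static uint16_t toy_info_class_id(uint32_t info) {
    return (uint16_t)(info >> TOY_FLAG_BITS);
}

/* Test a flag in the lower half only. */
static int toy_info_has_flag(uint32_t info, uint32_t flag) {
    return (info & TOY_FLAG_MASK & flag) == flag;
}

/* Set a flag without disturbing the class id in the upper half. */
static uint32_t toy_info_set_flag(uint32_t info, uint32_t flag) {
    assert((flag & ~TOY_FLAG_MASK) == 0);  /* flags must fit the lower half */
    return info | flag;
}
```

Because the new flag lives in a bit that was previously unused, setting it changes neither the size of the class structure nor the meaning of the existing bits, which is why this approach can preserve binary compatibility.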

Now, if you set the flag on your class, the runtime will know that the class wants to handle message lookup itself:

+ (void) initialize
{
    CLS_SETOBJECTMESSAGEDISPATCH(self);
    [super initialize];
}

You then implement a method like this for your class:

+ (IMP) messageLookupForObject:(id)anObject selector:(SEL)aSelector

This lets you store your own version of a dispatch table in an instance variable, so you can add methods to a specific object at runtime. Objects which are extended in Io can have this flag set on their classes at runtime, and use a separate dispatch mechanism for the Io methods.
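The following C sketch shows one way the special case could fit together: the lookup function checks the flag and, if it is set, defers to a hook supplied by the class, which consults a per-object table before falling back to the class methods. It is a toy model under stated assumptions, not the actual patch: the `toy_` names, the flag value, the fixed-size per-object table, and the fallback order are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct toy_object *toy_id;
typedef const char *toy_SEL;  /* assumption: a selector is just its name */
typedef int (*toy_IMP)(toy_id self, toy_SEL _cmd);

struct toy_method { const char *name; toy_IMP imp; };

#define TOY_CLS_OBJECT_DISPATCH 0x1u  /* hypothetical flag value */
#define TOY_MAX_OWN 4                 /* hypothetical per-object capacity */

struct toy_class {
    unsigned info;
    const struct toy_method *methods;
    size_t count;
    /* Stands in for +messageLookupForObject:selector: */
    toy_IMP (*lookup_hook)(toy_id, toy_SEL);
};

struct toy_object {
    const struct toy_class *isa;
    struct toy_method own[TOY_MAX_OWN];  /* methods added to this object */
    size_t own_count;
};

static toy_IMP toy_find(const struct toy_method *m, size_t n, toy_SEL sel) {
    for (size_t i = 0; i < n; i++)
        if (strcmp(m[i].name, sel) == 0) return m[i].imp;
    return NULL;
}

/* Lookup with the special case: flagged classes handle lookup themselves. */
static toy_IMP toy_msg_lookup(toy_id obj, toy_SEL sel) {
    assert(obj != NULL);
    const struct toy_class *cls = obj->isa;
    if ((cls->info & TOY_CLS_OBJECT_DISPATCH) && cls->lookup_hook)
        return cls->lookup_hook(obj, sel);
    return toy_find(cls->methods, cls->count, sel);
}

/* Hook: per-object methods win, class methods are the fallback. */
static toy_IMP per_object_lookup(toy_id obj, toy_SEL sel) {
    toy_IMP imp = toy_find(obj->own, obj->own_count, sel);
    return imp ? imp : toy_find(obj->isa->methods, obj->isa->count, sel);
}

/* Add a method to one object only; returns 0 if its table is full. */
static int toy_add_method(toy_id obj, toy_SEL sel, toy_IMP imp) {
    if (obj->own_count == TOY_MAX_OWN) return 0;
    obj->own[obj->own_count].name = sel;
    obj->own[obj->own_count].imp = imp;
    obj->own_count++;
    return 1;
}

static int class_describe(toy_id self, toy_SEL _cmd) { (void)self; (void)_cmd; return 1; }
static int object_describe(toy_id self, toy_SEL _cmd) { (void)self; (void)_cmd; return 2; }

static const struct toy_method cls_methods[] = { { "describe", class_describe } };
static const struct toy_class toy_cls = {
    TOY_CLS_OBJECT_DISPATCH, cls_methods, 1, per_object_lookup
};
```

In this model, two instances of `toy_cls` answer the same selector differently once `toy_add_method` has been called on one of them, which is the per-object behaviour a prototype-based language needs. Classes without the flag take the ordinary path and pay only for one extra bit test.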

It can also be used for more efficient proxying of local objects. The CoreObject proxy object wants to pass messages that don't change object state straight through, without logging them. With this mechanism, the proxy can store an NSMapTable of selector-to-IMP mappings for the proxied object in a COProxy ivar. Messages that are passed through then need only a hashmap lookup, with the same sort of cost as a normal message lookup, and for all other messages the lookup returns the message forwarding IMP.

Another, potentially interesting, option would be to combine this with some runtime code generation to dynamically construct proxy methods that would log their arguments and then pass them on without needing to construct an NSInvocation.

Hopefully, this patch will make it upstream as far as the GNUstep version of libobjc, even if it doesn't make it all the way into GCC. Anyone who wants to play with it themselves can find the diff in this mailing list post.