LanguageKit SmallInt Improvements

Posted on 26 January 2010 by David Chisnall

When LanguageKit code uses a selector name or instance variable that was declared in Objective-C, it inherits the declared types. This is very important for interoperability. If you wrote an object pointer into an instance variable declared as int, or wrote a small integer value into an instance variable declared as id, then the next time Objective-C code tried to access this value you'd get a crash.

When I started working on LanguageKit, I followed the Objective-C rule that everything is an id until proven otherwise. SmallInts were stored on the stack, but never passed as arguments to methods, never returned from methods, and never stored in instance variables.

This condition is now relaxed somewhat. Instance variables declared in LanguageKit code are now allowed to contain SmallInts. Methods that are not present in Objective-C may take SmallInts as arguments and may return them. This means that LanguageKit is now doing a lot less pointless BigInt creation.
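To make the idea of a SmallInt concrete, here is a minimal sketch of one common way to hide a small integer inside a pointer-sized word. The low-bit tag, the type name value_t and the function names are assumptions for illustration; the encoding LanguageKit actually uses may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical word-sized value: either a real object pointer or a tagged integer. */
typedef uintptr_t value_t;

/* One bit is spent on the tag, so a small integer has one bit less than a word. */
#define SMALL_INT_MAX (INTPTR_MAX / 2)
#define SMALL_INT_MIN (INTPTR_MIN / 2)

/* Assumption: object pointers are at least 2-byte aligned, so bit 0 of a
 * real pointer is always 0 and can be used as the "this is an integer" tag. */
bool is_small_int(value_t v)
{
    return (v & 1u) != 0;
}

/* Box: shift the integer left by one bit and set the tag bit. */
value_t box_small_int(intptr_t i)
{
    assert(i >= SMALL_INT_MIN && i <= SMALL_INT_MAX);
    return ((uintptr_t)i << 1) | 1u;
}

/* Unbox: shift the tag bit back out. Right-shifting a negative signed value
 * is implementation-defined in C; mainstream compilers do an arithmetic
 * shift, which is what this sketch relies on. */
intptr_t unbox_small_int(value_t v)
{
    assert(is_small_int(v));
    return (intptr_t)v >> 1;
}
```

No allocation happens in either direction, which is why keeping values in this form avoids creating an object for every intermediate integer.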

What does this mean in terms of performance? Let's return to the trusty Fibonacci benchmark to find out. This calculates the 30th Fibonacci number, 100 times in a loop. I've implemented this in C, Objective-C, and Smalltalk. The Smalltalk version comes in two flavours. The first always returns an int, so values that won't fit in an int will be truncated (as with Objective-C). The second returns an LKObject, so small integers will be hidden inside a pointer and big integers will be returned in BigInt instances. The performance numbers are:
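For reference, the C side of a benchmark like this might look like the sketch below. The original benchmark source is not shown in this post, so the function names, the use of clock() for timing, and the indexing convention (fib(0) = 0, fib(1) = 1) are all assumptions.

```c
#include <assert.h>
#include <time.h>

/* Deliberately naive, doubly recursive Fibonacci. The number of calls grows
 * exponentially with n, which is what makes it a call-overhead benchmark. */
int fib(int n)
{
    return n < 2 ? n : fib(n - 1) + fib(n - 2);
}

/* Runs fib(n) `reps` times, stores the last result in *out, and returns the
 * CPU time used in seconds. */
double time_fib(int n, int reps, int *out)
{
    assert(n >= 0 && reps > 0 && out != NULL);
    clock_t start = clock();
    volatile int result = 0;  /* volatile so the loop is not optimised away */
    for (int i = 0; i < reps; i++)
        result = fib(n);
    *out = result;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_fib(30, 100, &result) would correspond to the configuration described above, under those assumptions.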

C fibonacci execution took 2.351562 seconds.  
ObjC fibonacci execution took 6.601562 seconds.  
Ratio: 2.807309 (to C)
Smalltalk fibonacci execution took 8.750000 seconds.  
Ratio: 1.325444 (to Objective-C)
Smalltalk fibonacci SmallInt version execution took 5.687500 seconds.  
Ratio: 0.861538 (to Objective-C)

Note that these were done in a VM, and the timing results are slightly wonky. On subsequent runs, the ratios remained roughly constant, but the absolute times varied by up to 50% in either direction. Although the last line looks like the Smalltalk code is faster than Objective-C (which would be very nice), it's quite unlikely that this is really the case. Please don't start citing this blog as claiming that Smalltalk is faster than Objective-C.

The take-home message, however, is that LanguageKit SmallInts are close enough in performance to C ints that the difference is difficult to measure accurately. Using SmallInts also makes you safe from overflow. The Smalltalk version will just get really slow when you can no longer fit the value in a SmallInt. The C and Objective-C versions will start giving you the wrong answer.
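The overflow behaviour can be sketched as an addition that reports when the result no longer fits in the small-integer range, at which point a runtime would fall back to the slower BigInt path. This is an illustration only: small_int_add and the range macros are hypothetical names, and the range assumes one bit of the word is reserved as a tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed small-integer range: one bit of the machine word is a tag. */
#define SMALL_INT_MAX (INTPTR_MAX / 2)
#define SMALL_INT_MIN (INTPTR_MIN / 2)

/* Adds two small integers. Returns true and stores the sum if it still fits
 * in the small-integer range; returns false if the caller would need to
 * promote the operands to a big-integer representation instead. */
bool small_int_add(intptr_t a, intptr_t b, intptr_t *sum)
{
    assert(a >= SMALL_INT_MIN && a <= SMALL_INT_MAX);
    assert(b >= SMALL_INT_MIN && b <= SMALL_INT_MAX);
    /* Both operands use at most (word size - 1) bits, so this addition
     * cannot overflow intptr_t itself. */
    intptr_t s = a + b;
    if (s < SMALL_INT_MIN || s > SMALL_INT_MAX)
        return false;
    *sum = s;
    return true;
}
```

A plain C int addition has no such check, which is why the C and Objective-C versions silently produce a wrong value once the result is too large.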

Perhaps more interesting is the fact that the Smalltalk version that was shoehorned into the Objective-C type system was slower. The extra overhead of converting to and from ints was noticeable.

Running it several times, I got one result where the SmallInt version took 20% longer than the Objective-C version. This was the worst-case result for LanguageKit, but is probably the most representative of real-world performance. If you can afford to use 20% more cycles, and aren't doing anything floating-point intensive (LanguageKit's floating point performance still sucks), then there isn't much reason to choose Objective-C over Smalltalk.

It's worth noting that the C version was much faster than either. This kind of recursive call is the absolute best case for polymorphic inline caching. The new runtime supports this optimisation, but I haven't added it to the compiler yet. I'm planning on doing it via an LLVM optimisation pass, so both Objective-C (compiled with clang) and Smalltalk will benefit. This should remove most of the cost of the method lookup, and even allow speculative inlining of calls. I expect that we can get Smalltalk performance closer to 5 seconds for this benchmark.
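As a rough illustration of what polymorphic inline caching does, the sketch below keeps a small per-call-site table of (class, method) pairs and only falls back to the full lookup on a miss. It is a toy model in plain C, not the runtime's or the compiler's actual mechanism: every type and function name here is hypothetical, and each toy class has a single method.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct object object_t;
typedef int (*method_t)(object_t *self, int arg);

/* Toy class: responds to exactly one selector. */
typedef struct klass {
    const char *selector;
    method_t    method;
} klass_t;

struct object { const klass_t *isa; };

int slow_lookups = 0;  /* counts how often the expensive path is taken */

/* Stand-in for the full (expensive) method lookup. */
method_t slow_lookup(const klass_t *cls, const char *selector)
{
    slow_lookups++;
    return strcmp(cls->selector, selector) == 0 ? cls->method : NULL;
}

/* One cache per call site, holding a few receiver classes seen there. */
#define PIC_ENTRIES 4
typedef struct {
    const klass_t *cls[PIC_ENTRIES];
    method_t       method[PIC_ENTRIES];
    int            used;
} call_site_cache_t;

/* Sends `selector` to `receiver`, consulting the call site's cache first. */
int send_cached(call_site_cache_t *site, object_t *receiver,
                const char *selector, int arg)
{
    const klass_t *cls = receiver->isa;
    for (int i = 0; i < site->used; i++)
        if (site->cls[i] == cls)
            return site->method[i](receiver, arg);  /* hit: no lookup */

    method_t m = slow_lookup(cls, selector);         /* miss: full lookup */
    assert(m != NULL);
    if (site->used < PIC_ENTRIES) {                  /* remember for next time */
        site->cls[site->used] = cls;
        site->method[site->used] = m;
        site->used++;
    }
    return m(receiver, arg);
}

/* Two example classes that implement the same selector differently. */
int add_one(object_t *self, int x) { (void)self; return x + 1; }
int twice(object_t *self, int x)   { (void)self; return x * 2; }
const klass_t IncClass    = { "apply:", add_one };
const klass_t DoubleClass = { "apply:", twice };
```

In a recursive benchmark the receiver class at the call site never changes, so after the first call every send takes the cached path, which is why this case benefits so much.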

As always, this benchmark needs to come with a reminder that using a less stupid algorithm is orders of magnitude faster. A sensible Fibonacci implementation is faster, even in the LanguageKit interpreter, than a stupid algorithm in C. Good algorithms with bad compilers always beat good compilers with bad algorithms.
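One example of a less stupid algorithm is a simple loop that does n additions instead of an exponential number of recursive calls. This is a sketch, not the implementation used for any measurement in this post; the function name and the fib(0) = 0 indexing are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Linear-time Fibonacci: after i iterations, a == fib(i) and b == fib(i + 1). */
uint64_t fib_iterative(unsigned n)
{
    assert(n <= 93);  /* fib(93) is the largest value that fits in 64 bits */
    uint64_t a = 0, b = 1;
    for (unsigned i = 0; i < n; i++) {
        /* For n == 93 the final `next` wraps around, which is well defined
         * for unsigned types and never returned. */
        uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```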