OMeta and Benchmarks

Posted on 9 January 2010 by David Chisnall

This week, Günther committed his implementation of OMeta. OMeta, from the Viewpoints Research Institute, is a shiny way of writing parsers. It's a really simple way of writing domain-specific languages and it's great to have an implementation in Étoilé. It will be fun to see what people will use it for. Eric is planning on writing a Smalltalk parser in OMeta, so we can have a completely self-hosting parser.

Glancing over the code, I noticed that Günther had implemented quite a few things in a category on BigInt. This is how you add operations to integers with LanguageKit, and all of the ones he'd added looked useful. I moved this into the SmallInt and BigInt implementations supplied with the compiler, so now they execute in (very fast) inlined C functions if the receiver is a small integer, rather than in (slower) Smalltalk.

While I was hacking on BigInt, I also added an implementation of a class that boxes floating point values and Eric added support in the parser for floating point literals. You can now use floating point values in Smalltalk, although they are quite slow. I'll probably work on optimising this a bit later, but I can't really be bothered now, because you can just write performance-critical floating point code in an Objective-C method if it's not fast enough in Smalltalk.

Since OMeta is a fairly complex piece of code, it seemed like a good thing to use for some real-world benchmarking of LanguageKit. Unfortunately, this is where I started to hit problems. It ran fine with the JIT compiler, but not with the JTL compiler or the interpreter.

I spent quite a while hunting bugs in the interpreter - you can see the svn log for details if you care about them - and then moved on to the JTL. It turned out that the problem with the JTL was an LLVM bug, not a LanguageKit problem. Linking together LLVM bitcode modules containing global aliases that pointed to bitcasts was broken. I've now fixed that, so it's worth upgrading LLVM to r93052 or later if you want LanguageKit to work properly.

After that, I could run Günther's OMeta tests. You can see a summary of the results in this table:

Measurement (seconds)   Interpreter   JIT Compiler (Debug)   JIT Compiler (Release)   JTL Compiler
Wall clock time         1.7           22                     3.0                      0.44
User time               1.0           16                     2.0                      0.22
Smalltalk time          0.96          0.023                  0.023                    ?

The wall clock time and user time are reported by the time utility when running edlc. The Smalltalk time is the time reported by edlc as the time taken for the SmalltalkTool class's -run method to complete. This is not reported by the JTL because the code is run by the bundle loader in LanguageKit and can't be separated out by the tool.

As you can see from the Smalltalk time row, JIT-compiled code is about 41 times faster than interpreted code. We can probably make the interpreter a bit faster, because it's currently quite a naïve implementation, but given how large a performance benefit we get just from using the JIT, it might not be worth it.
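For anyone who wants to check the arithmetic, the ratios can be reproduced directly from the table. This is only a small sketch: the numbers are copied from the table and the variable names are made up for illustration.

```python
# Figures copied from the results table, in seconds.
smalltalk_time = {"interpreter": 0.96, "jit": 0.023}
user_time = {"jit_debug": 16.0, "jit_release": 2.0}

# How much faster the JIT-compiled code runs once it has been compiled.
jit_speedup = smalltalk_time["interpreter"] / smalltalk_time["jit"]

# How much CPU time a release build of LLVM saves over a debug build.
release_gain = user_time["jit_debug"] / user_time["jit_release"]

print(f"JIT vs interpreter (Smalltalk time): {jit_speedup:.1f}x")
print(f"Release vs debug LLVM (user time): {release_gain:.1f}x")
```

The first ratio comes out between 41 and 42, and the second is a factor of eight.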

The JIT has a long start-up time, and most of it is spent in LLVM optimisation passes. Note the difference between the debug and release LLVM builds: disabling all of the assert() statements in LLVM and enabling compiler optimisations when building the JIT makes a huge difference to the performance, taking CPU time from 16 seconds down to two. I didn't do this when I originally posted this entry because I always forget that I'm using a debug build (I hack on LLVM as well as Étoilé). With a release build, it's significantly faster. It is still slightly slower than the interpreter overall, but for longer-running programs this difference will go away.
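To get a feel for how much longer a program has to run before the JIT wins, here is a back-of-the-envelope model. It rests on assumptions that are not measured anywhere in this post: that wall clock time splits into a fixed overhead plus the Smalltalk time, that the overhead stays constant, and that Smalltalk time scales linearly with the amount of work. Treat the result as a rough guide only.

```python
# Figures copied from the results table, in seconds (release build of the JIT).
interp_wall, interp_smalltalk = 1.7, 0.96
jit_wall, jit_smalltalk = 3.0, 0.023

# Assumption: everything that is not Smalltalk time is fixed overhead
# (start-up, parsing, and for the JIT, the LLVM passes).
interp_overhead = interp_wall - interp_smalltalk
jit_overhead = jit_wall - jit_smalltalk

# Solve interp_overhead + k * interp_smalltalk == jit_overhead + k * jit_smalltalk
# for k, the workload expressed as a multiple of the OMeta test run.
break_even = (jit_overhead - interp_overhead) / (interp_smalltalk - jit_smalltalk)

print(f"Break-even workload: about {break_even:.1f} times the test run")
```

Under these assumptions the JIT catches up once the program does a little more than twice the work of the test run.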

One of the things we can do now is use the interpreter by default, and compile and install methods in the background only once they have been run a few times.
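The idea can be sketched in a few lines of Python. This is not how LanguageKit does it or will do it. The class name, the threshold of three calls and the stand-in "compiler" are all hypothetical, chosen only to show the shape of the technique: count calls, start compiling on a background thread once a method looks hot, and swap in the compiled version when it is ready.

```python
import threading

HOT_THRESHOLD = 3  # hypothetical: number of calls before a method counts as hot


class TieredMethod:
    """Runs an interpreted body until a compiled one has been installed."""

    def __init__(self, interpreted, compile_fn):
        self._impl = interpreted        # implementation currently in use
        self._compile_fn = compile_fn   # stand-in for the JIT; returns a callable
        self._calls = 0
        self._lock = threading.Lock()
        self._worker = None
        self.compiled = False

    def __call__(self, *args):
        with self._lock:
            self._calls += 1
            if self._calls == HOT_THRESHOLD and self._worker is None:
                # The method is hot: compile it off the calling thread.
                self._worker = threading.Thread(target=self._compile_and_install)
                self._worker.start()
            impl = self._impl
        return impl(*args)

    def _compile_and_install(self):
        fast = self._compile_fn()       # the slow step
        with self._lock:
            self._impl = fast           # later calls use the compiled version
            self.compiled = True

    def wait_for_compiler(self):
        """Block until any background compilation has finished."""
        if self._worker is not None:
            self._worker.join()


def interpreted_square(x):
    return x * x  # imagine an AST interpreter walking the method here


def compile_square():
    return lambda x: x * x  # imagine LLVM producing native code here


square = TieredMethod(interpreted_square, compile_square)
results = [square(n) for n in range(5)]
square.wait_for_compiler()
```

Callers never wait for the compiler in this sketch. They keep getting the interpreted version until the compiled one is installed, so the start-up cost is only paid for methods that are actually run repeatedly.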

Günther left a comment saying that it might be worth rewriting the OMeta stuff in Objective-C, but given that the tests take a fraction of a second to run I don't think that's particularly worthwhile.