News
CoreObject Preview Release
CoreObject is a version-controlled object database for Objective-C that supports powerful undo, semantic merging, and real-time collaborative editing. It’s part of the Étoilé project.
Some of the features include:
- a DVCS (focusing on personal version control right now) with branching and cheap copying, designed to work with very fine-grained commits (i.e. applications where every UI action creates a commit)
- selective undo, based on diff/merge of object graphs (see http://coreobject.org/technotes/#diffmerge)
- robust real-time collaborative editing, including sharing of revision history, per-user undo, all without operational transform
- powerful undo system that is persistent (survives app restarts), branching, and can record all possible changes to a CoreObject store. You can undo things like reverting to past states, switching branches, deleting branches, deleting documents, etc.
- automatic, metamodel-driven copying of subsets of an object graph
More at: http://coreobject.org/index.html#features and https://github.com/etoile/CoreObject/blob/master/README.md
There are several short videos at http://coreobject.org and some technical notes on how the system works at: http://coreobject.org/technotes/
Get it
CoreObject is MIT-licensed. It has an extensive test suite which is passing on GNUstep and Mac OS X 10.7+, but our demo applications only work on Mac OS X 10.7+.
The source code, including dependencies, can be downloaded here: http://download.gna.org/etoile/coreobject/CoreObject-0.5.tgz
You can also grab the source code directly from GitHub, but without the dependencies: https://github.com/etoile/CoreObject
For bug reports, see: https://github.com/etoile/CoreObject/issues
Pragmatic Smalltalk and C
Next week, I'm presenting a paper at IWST with the title Smalltalk in a C World, describing LanguageKit / Pragmatic Smalltalk. This seemed like a good time to polish the C interoperability stuff and show one of the examples from the paper here. This is now in svn as Languages/Compiler/examples/dispatch.st
<< headers='("dispatch/dispatch.h", "unistd.h")' >>
NSObject subclass: SmalltalkTool [
    run [
        | queue count |
        queue := C dispatch_get_global_queue: { 0. 0 }.
        ETTranscript show: 'main thread' ; cr .
        count := 1.
        1 to: 10 do: [
            C dispatch_async: { queue .
                [
                    ETTranscript show: 'Another thread' ;
                                 show: count ; cr .
                    count := count + 1
                ] } ] .
        C sleep: 1.
        ETTranscript show: 'Threads used: ' ; show: count ; cr.
    ]
]
The first thing to notice here is the headers pragma. This lets you specify C headers inside a Smalltalk source file. If SourceCodeKit is installed, then this pragma instructs the compiler to use it to parse those headers and determine the types of the functions they declare. If it isn't installed, then you'll get an error something like this:
$ edlc -f dispatch.st
2012-08-19 13:13:08.670 edlc[99020] WARNING: Unable to load header: dispatch/dispatch.h
2012-08-19 13:13:08.671 edlc[99020] WARNING: Unable to load header: unistd.h
2012-08-19 13:13:08.671 edlc[99020] ERROR: Can not determine type for dispatch_get_global_queue
2012-08-19 13:13:08.672 edlc[99020] ERROR: Can not determine type for dispatch_async
2012-08-19 13:13:08.672 edlc[99020] ERROR: Can not determine type for sleep
2012-08-19 13:13:08.672 edlc[99020] Failed to compile input.
Note the warnings that it can't find the headers. These are only warnings, because it may still be able to run the code, but in this case it can't: it fails to find the type information for the C functions that we're calling. Install SourceCodeKit and you'll see something more like this:
$ edlc -f dispatch.st
LIBCLANG FATAL ERROR: Program used external function 'dispatch_get_global_queue' which could not be resolved!
Abort trap: 6 (core dumped)
It is now correctly compiling the Smalltalk code, but it fails when it tries to run it, because the dispatch_get_global_queue() function isn't present. You need to explicitly tell it to load the libdispatch library:
$ edlc -f dispatch.st -l dispatch
main thread
Another thread1
Another thread2
Another thread3
Another thread4
Another thread5
Another thread6
Another thread7
Another thread8
Another thread9
Another thread10
Threads used: 11
Note that we just call it dispatch. You can also specify dispatch.so, libdispatch.so, or a full path and it will try to find it. In the latest version of edlc, you can also specify multiple libraries to load.
Now that it's working, let's look at what's happening. Each of the message sends to the fake class C is translated into a function call. There are three of these, two in libdispatch and one in libc. The dispatch_async() function takes two arguments: a queue and a block. LanguageKit is boxing and unboxing the queue, which is just a C opaque type (i.e. a pointer). We use the same blocks ABI as C, so Smalltalk blocks passed to libdispatch Just Work.
This is spawning 11 tasks, which will be run in either separate threads or be multiplexed onto a smaller number of OS threads, depending on system load. This shows Smalltalk using real threads. If you add a sleep into the start of the block, this becomes very obvious, as you see the race condition in this program (the counter is not updated atomically).
The way to the new XMPPKit and StepChat
Today I will talk about the road to the XMPPKit and StepChat releases, both of which are heading for version 0.2. The work can be split into a few steps, so I will go through the news step by step.
The first step began when I joined the Étoilé project. I started working on XMPPKit to make it build again; after that, the goal was to make StepChat and XMPPKit work again. A lot of debugging (and good training) was done, and after some weeks both worked again.
The second step was to make the XMPPKit classes and API comply with Objective-C coding conventions: all classes now carry the XMPP prefix, and parameter and variable names are more readable. After these changes, debugging continued to make XMPPKit/StepChat more stable on both of our main development platforms, Linux and FreeBSD. Initially the work was hard for a beginner like me, but with time and David's help I learned, gained a few “level ups”, and a lot of bugs were squashed. There was a period when StepChat, which strictly depends on XMPPKit, worked on both Linux and FreeBSD, but there were some “known memory leaks”.
The third step: at this point the time had come to fix the “known memory leaks”. David added new APIs to EtoileXML that fixed and improved the memory handling of XMPPKit, but when I made XMPPKit use the new APIs I broke it, and a second squad of bugs was ready to counterattack. This time the bug squashing was heavier, but finally we won! With the new APIs, XMPPKit and StepChat both gained a significant improvement in memory usage and performance, and stability was good too. Everything worked on my Linux and FreeBSD i386 systems (i686 is what I use), so I decided to focus on x86_64, because I got a new computer and reinstalled both of my systems as x86_64. With the transition to x86_64 other bugs came out, this time easy to solve, and they did not come from XMPPKit/StepChat. I implemented a new class in StepChat called SCAccountInfoManager to solve that bug. This class can read and write your JID in a very simple way using GNUstep -base classes, but a great new framework is coming that will make contact handling more advanced and comfortable. At this point we had XMPPKit and StepChat working on both Linux and FreeBSD, on i386 and x86_64, with good performance and stability!
The last step, and the most important: this step was a lot of fun. I learned new things because I ported XMPPKit and StepChat to ARC. ARC, Automatic Reference Counting, helps you avoid a lot of problems with Objective-C memory management because it does the work for you! ARC is not a garbage collector; you can read the rest of the blog to understand what it is. The port to ARC took some hours, but when it was finished we saw better performance and, again, better memory usage in StepChat. I tested the ARCified StepChat by chatting with friends while it was running under valgrind! Memory usage was very stable and lower than before, and performance was very good considering that it was running under valgrind.
At this point we have a much cleaner XMPPKit and StepChat.
The next step: the next step is to implement MUC in XMPPKit, so you can talk with all your friends in a conference!
You can find them in the Étoilé repository at:
svn co svn://svn.gna.org/viewcvs/etoile/trunk/Etoile
You need EtoileFoundation to build them.
(I have to apologise for my current Bonobo English)
Étoilé 0.4.2 Announcement (yes, the new release promised a long time ago…)
Étoilé intends to be an innovative, GNUstep-based, user environment built from the ground up on highly modular and light components. It is created with project and document orientation in mind, in order to allow users to create their own workflow by reshaping or recombining provided Services (aka Applications) and Components. Flexibility and modularity on both User Interface and code level should allow us to scale from handheld to desktop environments.
0.4.2 is a developer-targeted release on its way towards this goal. As a developer-focussed release, this predominantly consists of frameworks. A few demonstration applications are also included. More will be added during the 0.4.x release series, leading to a user-focused 0.5 release.
Starting with 0.4.2, each new Étoilé release in the 0.4.x series is a modular release: frameworks and applications are released individually, once they are ready, over the release timespan. For now 0.4.2 is limited to EtoileFoundation 0.5, but in the next months, other module releases such as LanguageKit or CoreObject are expected. Until we move to 0.4.3, every module we release will belong to this new 0.4.2 release.
Released Modules
EtoileFoundation 0.5
EtoileFoundation now comes with a minimalistic but very flexible Metamodel inspired by Smalltalk's FAME. This new release adds many Higher-Order Messaging and Blocks related extensions that make it much easier to manipulate collections. Both Prototype and Trait support have been rewritten from scratch. The Trait implementation now supports the full Trait semantics rather than a limited subset. The Reflection support has also been rewritten and extended; it now follows the Mirror model used in languages such as Self or Newspeak. Among the numerous other additions detailed in the EtoileFoundation NEWS file, you can find a new socket API, a much more random UUID generation (especially on Linux), a richer Collection protocol and reusable Collection-oriented Traits.
Highlights
EtoileFoundation is the core framework for all Étoilé projects, providing numerous convenience methods on top of the OpenStep foundation and significantly better support for reflection and metamodel. It also includes a number of extensions to the Objective-C object model, allowing traits and prototypes. This framework is used by most of the rest of Étoilé and provides a number of core functions, such as UUID and XML handling.
Availability
Étoilé 0.4.2 is currently available in source code form only and can be downloaded as several independent module releases:
http://download.gna.org/etoile/etoile-0.4.2
It may also be obtained from Subversion with the following command:
svn co svn://svn.gna.org/svn/etoile/tags/Etoile-0.4.2
More Information
Visit our website: http://www.etoileos.com/ and blog: http://etoileos.com/news/
Or subscribe to our mailing lists: https://gna.org/mail/?group=etoile
Or join our SILC channel:
Autorelease Performance Improvements
As you may be aware, recent versions of Objective-C added a new @autoreleasepool keyword that defines a new scope bracketed by an autorelease pool. There are two reasons for this. The first is so that, in ARC mode, the compiler can reason more accurately about object lifetimes. The second is to eliminate the need for creating a new object for every scope.
The latter is important because long-lived autorelease pools can collect a lot of objects and defer their destruction. This matters most on devices like an iPhone, where RAM is limited, but it also slows things down everywhere: objects persist long after their last use, so the program cannot reuse their memory. That increases cache churn and means more system calls to get new memory pages from the OS and return them, rather than just reusing them.
This means that cheap autorelease pool creation can have a lot of performance advantages beyond those apparent in microbenchmarks. GNUstep already tries to make autorelease pools quite cheap to create by creating a per-thread cache of freed autorelease pools and reallocating them as required. If you're using ARC mode, however, you don't get autorelease pool objects. The autoreleasepool scope in ARC mode is implemented by bracketing it in calls to objc_autoreleasePoolPush() and objc_autoreleasePoolPop(). These return a void* pointer. In the current release of GNUstep, these just create a new NSAutoreleasePool, so there's no difference between them and explicitly creating the pool.
The current release of the GNUstep Objective-C runtime includes its own implementation which it will use if the NSAutoreleasePool doesn't opt in to supporting ARC mode. With the current svn trunk code, this is now enabled by default. This just creates a linked list of page-sized buffers and returns a pointer into the current buffer with the push and pop functions. This means that creating a new pool scope is very cheap - it's just returning a marker, not creating a new object.
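The marker idea described above can be sketched in plain C. This is only an illustration under stated assumptions, not the libobjc2 implementation: every name here (pool_push, pool_add, pool_pop, counting_release, the 510-slot page size) is hypothetical, and the single global list stands in for what would be per-thread state in a real runtime.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is not the libobjc2 source. */
enum { POOL_PAGE_SLOTS = 510 };   /* about one 4 KiB page of pointers on 64-bit */

typedef void (*release_fn)(void *object);

struct pool_page
{
    struct pool_page *previous;   /* older page in the linked list */
    void **insert;                /* next free slot in this page */
    void *slots[POOL_PAGE_SLOTS];
};

static struct pool_page *current_page;   /* would be per-thread in a real runtime */
static int released_count;               /* only used by the demo release function */

static struct pool_page *new_page(struct pool_page *previous)
{
    struct pool_page *page = malloc(sizeof *page);
    assert(page != NULL);
    page->previous = previous;
    page->insert = page->slots;
    return page;
}

/* Push: no pool object is created; the marker is just the current insert position. */
void *pool_push(void)
{
    if (current_page == NULL) { current_page = new_page(NULL); }
    return current_page->insert;
}

/* Autorelease: record the object so that a later pop can release it. */
void pool_add(void *object)
{
    if (current_page == NULL || current_page->insert == current_page->slots + POOL_PAGE_SLOTS)
    {
        current_page = new_page(current_page);
    }
    *current_page->insert++ = object;
}

/* Pop: release everything recorded since the marker, newest first. */
void pool_pop(void *marker, release_fn release)
{
    while (current_page != NULL)
    {
        uintptr_t m = (uintptr_t)marker;
        uintptr_t low = (uintptr_t)current_page->slots;
        uintptr_t high = (uintptr_t)(current_page->slots + POOL_PAGE_SLOTS);
        int marker_here = (m >= low && m <= high);
        void **stop = marker_here ? (void **)marker : current_page->slots;
        while (current_page->insert > stop) { release(*--current_page->insert); }
        if (marker_here) { return; }
        /* The marker is in an older page: drop this one and keep unwinding. */
        struct pool_page *empty = current_page;
        current_page = empty->previous;
        free(empty);
    }
}

/* Demo release function: counts calls instead of destroying anything. */
void counting_release(void *object)
{
    (void)object;
    released_count++;
}
```

Nested scopes fall out naturally from this shape: an inner pool_push() returns a later marker, and popping it releases only the objects recorded after that point, leaving the outer scope's objects in place.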
To see how these compare in terms of performance, I used this little microbenchmark:
#import <Foundation/Foundation.h>

@interface Foo : NSObject
{
    id foo;
}
@property(readonly, nonatomic) id foo;
@end

@implementation Foo
- (id)init
{
    foo = [NSObject new];
    return self;
}
- (id)foo
{
#if __has_feature(objc_arc)
    return foo;
#else
    return [[foo retain] autorelease];
#endif
}
@end

int main(void)
{
    id x = [Foo new];
    for (unsigned int i=0 ; i<1000 ; i++)
    {
        @autoreleasepool
        {
            id f;
            for (unsigned int j=0 ; j<100000 ; j++)
            {
                f = [x foo];
            }
        }
    }
    return 0;
}
This performs a total of 100,000,000 autoreleases, 100,000 per autorelease pool. So, how long does it take to run?
- With the old implementation, 6.9 seconds.
- With the new implementation, 4.5 seconds.
- With the new implementation, and the benchmark compiled in ARC mode, 3.5 seconds.
Now I think I need to find something else to optimise.
Runtime Improvements
This blog hasn't been updated for a while. Aside from chasing some LanguageKit bugs, I haven't done much on the Étoilé side for a little while: I've been busy with some paid work (shocking, but it does happen) and with getting ready for the LLVM / clang 3.0 release. I've also been adding some new things to the GNUstep Objective-C runtime (libobjc2), which is now close to the 1.6 release.
The new version of the runtime fixes some bugs. This is boring, but important. OpenBSD can now ship libobjc2 instead of GCC libobjc, because it now passes more tests than the GCC version even on architectures like SPARC where I have done no testing.
More interestingly, I've now added support for all of the new APIs that OS X 10.7 introduced. These are usually quite boring, but there was one that is particularly fun:
IMP imp_implementationWithBlock(void*)
This function takes a block as an argument and returns a function pointer that can be used as an Objective-C method. Block functions take one hidden argument, the block pointer itself. Blocks that are passed to this function must take self as the second argument, that is, as the first argument after the hidden block pointer.
Methods also take two hidden arguments, the receiver and the selector. This function has to return a trampoline that maps a call like this:
someMethod(object, selector, ...)
to something like this:
someBlock->invoke(someBlock, object, ...)
Unfortunately, this is impossible in C. You can not write a C function that calls another function with the first two arguments modified but all others preserved. You could probably use libffi, but that would be quite slow. NSInvocation uses libffi, and a call via -forwardInvocation: costs around three hundred times as much as a direct message send - far from ideal. I've implemented this in assembly for ARM, x86 and x86-64 (MIPS and PowerPC versions are planned, but probably won't be done in time). Each version moves the self parameter over the _cmd parameter, loads the block pointer from one word before the trampoline into the first parameter, loads the function pointer from two words before and jumps to it.
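To make the argument shuffle concrete, here is a minimal C sketch of what the trampoline does for one fixed signature. It is deliberately not the general mechanism: as noted above, C cannot forward an arbitrary argument list, which is why the real version is written in assembly. All names (toy_block, toy_imp, toy_trampoline, toy_imp_with_block, add_captured) are hypothetical, and a single global stands in for the block pointer that the real trampoline loads from the word before its own code.

```c
#include <assert.h>
#include <stddef.h>

/* All names here are illustrative; none of them come from libobjc2. */
typedef struct toy_block toy_block;
struct toy_block
{
    /* Like a real block, invoke takes the block itself as a hidden first argument. */
    int (*invoke)(toy_block *self_block, void *receiver, int argument);
    int captured;   /* stands in for the variables a block captures */
};

/* What a method implementation looks like to its caller: receiver, selector, args. */
typedef int (*toy_imp)(void *receiver, const char *selector, int argument);

/* The real trampoline reads this from memory just before its own code. */
static toy_block *bound_block;

/* Fixed-signature stand-in for the assembly trampoline:
 * the receiver moves into the selector's position, the block takes the first slot. */
int toy_trampoline(void *receiver, const char *selector, int argument)
{
    (void)selector;   /* the selector is dropped, exactly as described above */
    assert(bound_block != NULL);
    return bound_block->invoke(bound_block, receiver, argument);
}

/* Bind a block and hand back something callable as a method. */
toy_imp toy_imp_with_block(toy_block *block)
{
    bound_block = block;
    return toy_trampoline;
}

/* Example block body: adds its captured value to the argument. */
int add_captured(toy_block *self_block, void *receiver, int argument)
{
    (void)receiver;
    return self_block->captured + argument;
}
```

Binding a block whose captured value is 40 and calling the returned pointer as `imp(receiver, selector, 2)` reaches add_captured with the block in the first slot and yields 42. Because this sketch keeps one global binding, only one block can be bound at a time; the per-trampoline storage described above is what removes that limit.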
The other complication of this approach is that each returned IMP must be writable (so that I can bind the block to it), but must also be executable. Modern operating systems do not allow this, for a very good reason. The workaround is to map pages into memory in two independent locations, once executable and once writable. This all seems to work nicely. To make it a bit more friendly, I've also added this function:
char *block_copyIMPTypeEncoding_np(void*)
This returns the type encoding for the IMP returned by the other function. This has the _np suffix because it is an extension that isn't present in the Apple runtime. This is a shame, because it makes using the block-to-IMP code a lot easier. The new implementation of ETPrototypes in EtoileFoundation uses it extensively.
The other new feature is something that I've been pondering implementing for a while, the objc_msgSend() function. The NeXT / Mac runtime used this interface for message sending. The compiler does a translation roughly like this when it encounters a message send:
[object message: argument];
// NeXT / Mac:
objc_msgSend(object, @selector(message:), argument);
// GNU
IMP imp = objc_msg_lookup(object, @selector(message:));
imp(object, @selector(message:), argument);
The GNUstep runtime also implements a slightly different message lookup function, which allows the compiler to insert automatic caching, but the core idea is the same. The NeXT version has the advantage that each message send just needs to be a single function call - the compiler emits code for setting up the call frame once. With the GNU version, the compiler needs to make one call, then set up the argument frame again, and then make another. This is also potentially slower, because even if the lookup function is infinitely fast, it still requires an extra function call.
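The difference between the two calling conventions can be sketched with a toy dispatch table in C. This is an illustration only: the names (toy_msg_lookup, toy_msg_send, counter_class and so on) are hypothetical, selectors are plain strings compared with strcmp rather than real selectors, and the send function handles a single signature and makes an ordinary call, where the runtime's assembly version jumps to the method and leaves the call frame intact.

```c
#include <assert.h>
#include <string.h>

/* Toy object model; every name here is an assumption made for illustration. */
typedef struct toy_object toy_object;
typedef int (*toy_method)(toy_object *self, const char *selector, int argument);

struct toy_entry { const char *selector; toy_method method; };
struct toy_class { const struct toy_entry *methods; size_t count; };
struct toy_object { const struct toy_class *isa; int value; };

/* GNU style, step one: the lookup only returns the function pointer.
 * The caller then sets up the arguments again and makes a second call. */
toy_method toy_msg_lookup(toy_object *receiver, const char *selector)
{
    const struct toy_class *cls = receiver->isa;
    for (size_t i = 0; i < cls->count; i++)
    {
        if (strcmp(cls->methods[i].selector, selector) == 0)
        {
            return cls->methods[i].method;
        }
    }
    return NULL;
}

/* NeXT style: one call from the caller's point of view. Lookup and
 * invocation both happen behind this single entry point. */
int toy_msg_send(toy_object *receiver, const char *selector, int argument)
{
    toy_method method = toy_msg_lookup(receiver, selector);
    assert(method != NULL);
    return method(receiver, selector, argument);
}

/* A sample method and the class that holds it. */
static int add_to_value(toy_object *self, const char *selector, int argument)
{
    (void)selector;
    return self->value + argument;
}

static const struct toy_entry counter_methods[] = { { "add:", add_to_value } };
const struct toy_class counter_class = { counter_methods, 1 };
```

With an object whose value is 40, both `toy_msg_lookup(&obj, "add:")(&obj, "add:", 2)` and `toy_msg_send(&obj, "add:", 2)` give 42. The first form is what lets a caller keep the returned pointer around and reuse it in a loop; the second is what keeps each call site down to a single call.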
I missed the 3.0 deadline for getting this support into clang, but if you want to play with it you can download LLVM and clang trunk and add -fno-objc-legacy-dispatch to your Objective-C flags. I've compiled GNUstep-base with this flag on x86 and x86-64. The resulting binary is about 10% smaller than without it, and it passes all of the tests.
The assembly versions are not as optimised as they could be, but they are still a bit better than the C version on the fast paths. They contain almost no branching and simply look up the method in the dispatch table and jump to it, leaving the call frame intact. I benchmarked message sending on all three architectures. On all of them, the objc_msgSend() approach took half the time of the old dispatch mechanism.
The old mechanism isn't going away - it has the advantage of being portable to any architecture and the GNUstep runtime implementation allows caching, which allows speculative inlining, which can be even faster, but on platforms where it is supported the Apple-compatible version is faster. In fact, in microbenchmarks (and therefore to be taken with a big pinch of salt), the libobjc2 version in its average case performed as well as the Apple version in its best case. This is a bit misleading, because the libobjc2 version touches more memory, so cache pressure may make it slightly worse in real-world usage.
LanguageKit already requires the new version of the runtime for some other things (ARC, mainly), so I am probably going to enable the new dispatch mechanism by default before the next release.
I was a bit surprised at how much of a difference this made. The assembly version is still slower than the cached version (so, for loops I will still want to emit the old-style lookup and cache the IMP), but it is sufficiently fast that you are unlikely to see message send overhead as the bottleneck.
Oh, and one more thing: the latest version also supports small objects. These are objects that are hidden inside a pointer, like Smalltalk's SmallInt. The current version of GNUstep-Base creates an NSSmallInt subclass of NSNumber on all platforms that stores a 31 or 61-bit number inside the pointer. This means that LanguageKit objects and Objective-C objects are now the same thing. You can send messages to small integers from Objective-C, and it Just Works™.
GNUstep and Étoilé at the GNU hackers meeting
While David was at the ESUG last week, on my side I went to the GNU hackers meeting that was located in Paris this year. I happen to live here, so I took the lazy option… It was much easier to take a subway to meet the GNU hackers rather than a plane to join the Smalltalkers in Scotland !
Ludovic Courtès, a Guile maintainer, did a great job organizing the event, so everything went smoothly. Among the numerous presentations that were scheduled over four days, I did one about GNUstep and Étoilé. I was more or less representing GNUstep since Fred Kiefer, the AppKit maintainer, couldn't come.
The first part was a short introduction to Objective-C and some patterns in GNUstep. For example, I tried to present the class transform idea, which is widely used but almost never discussed. The second part was the heretic one given that Étoilé is BSD-licensed ;-)… At this point, the talk was centered around LanguageKit and the various optimizations that allow Objective-C and Pragmatic Smalltalk to be truly fast. At the end, I was running out of time, but I briefly covered EtoileUI to illustrate how extensible things are in Étoilé at a higher level.
You can read the PDF slides at Extensibility in GNUstep & Étoilé.
Étoilé at ESUG
Last week was the 2011 International Smalltalk Conference. I was invited to give a talk about `something of interest to ESUG members.' Hopefully I succeeded.
I've put the slides for my talk, Étoilé: Pragmatic Smalltalk, online for anyone who is interested. The PDF is annotated with roughly what I thought I was going to say on each one (or, in the case of the ones that I annotated after the talk, with what I thought I'd said), so make sure you read them with something that understands PDF annotations.
Installing Étoilé on FreeBSD
I just installed Étoilé in a new FreeBSD (9, BETA-1) VM, so I thought I'd document exactly what I did. A copy of these instructions (which I'll try to keep up to date as dependencies change) is in subversion as INSTALL.FreeBSD.
First, install subversion, if it isn't already installed. You'll need this to actually get the code.
With FreeBSD 9, the system libobjc is gone. This makes life easier, because we can't accidentally use the wrong one. Both clang and GCC are installed, and the version of clang should be recent enough to use for everything. With FreeBSD 10, GCC will be gone.
If you are using FreeBSD 8, then install LLVM/clang from source (see llvm.org for instructions).
Installing GNUstep
Make a directory to use for building all of the GNUstep stuff:
$ mkdir gs
$ cd gs
First we need to build and install the Objective-C runtime. This has to be done before installing GNUstep Make.
$ svn co svn://svn.gna.org/svn/gnustep/libs/libobjc2/trunk libobjc
$ cd libobjc
We're going to build with the simple Makefile, rather than with the GNUmakefile. This will use the system C/C++ compilers, which are still gcc/g++ on FreeBSD 9, so we need to specify clang.
$ CC=clang CXX=clang++ make
$ sudo make install
For the rest, we need to use GNUstep Make, which requires GNU make. Install the gmake package if it isn't already installed. First we need GNUstep Make. Unfortunately, GNUstep Make does not follow the 'sane defaults' philosophy, so we need to specify a huge number of additional options when configuring it:
$ svn co svn://svn.gna.org/svn/gnustep/tools/make/trunk make
$ cd make
$ ./configure --prefix=/ --enable-objc-nonfragile-abi --enable-native-objc-exceptions --with-layout=gnustep --enable-debug-by-default CC=clang CXX=clang++
Note that I used --prefix=/. This gives you a clean GNUstep install, with /Local and /System directories, but it is only sensible if you have a single system partition. If you have a small / and a large /usr, then you would be better off using /usr/GNUstep or similar. When I do this, I then add symbolic links, so /Local -> /usr/local/GNUstep/Local, but that's optional.
I used --enable-debug-by-default so that everything is compiled in debug mode in the future.
$ sudo -E gmake install
Note that now we use gmake instead of make. Most GNUstep things require GNU make. If you forget to use it and use make, you'll get some errors. With the GNUstep layout, you need to source the GNUstep.sh file. This will do it now and make sure it's done every time you log in:
$ . /System/Library/Makefiles/GNUstep.sh
$ echo . /System/Library/Makefiles/GNUstep.sh >> ~/.profile
Now we need to install the GNUstep Base library. This provides the Foundation framework. You need to make sure that you have libffi and libxml2 installed from ports for this to work. You should also install icu. This is not technically required, but lots of stuff in -base won't work properly without it.
Base doesn't use the CC and CXX options we set for -make, for some reason.
$ svn co svn://svn.gna.org/svn/gnustep/libs/base/trunk base
$ cd base
$ ./configure --disable-mixedabi CC=clang CXX=clang++
$ gmake -j 8 && sudo -E gmake install
The -j8 here is for parallel building. If you don't have a multicore machine, feel free to reduce this. The --disable-mixedabi flag is optional. If you use it, then it assumes that everything will be built using the non-fragile ABI (a safe assumption for Étoilé, since LanguageKit requires it) and avoids some indirection and padding every class with a spare class. This makes the code slightly smaller and faster, but breaks compatibility with code compiled with the legacy ABI (including everything compiled with GCC).
Now we can install GNUstep-gui (AppKit). This requires a few libraries to handle images, so make sure you install these ports: jpeg, tiff, png. For spell checking, install aspell. For speech synthesis, install flite.
$ svn co svn://svn.gna.org/svn/gnustep/libs/gui/trunk gui
$ cd gui
$ gmake -j 8 && sudo -E gmake install
-gui is only half of the puzzle. It uses the GNUstep back library to handle all of the windowing-system-specific details. Here you'll need the libXt and cairo ports:
$ svn co svn://svn.gna.org/svn/gnustep/libs/back/trunk back
$ cd back
$ gmake -j 8 && sudo -E gmake install
You should now have a working GNUstep install and you can start with installing Étoilé.
Installing Étoilé
First, you need to make sure that all of the Étoilé dependencies are satisfied.
- LanguageKit: llvm, clang (unfortunately, you can't use the one in the base system, because it doesn't export all of the required headers)
- PopplerKit: poppler
- CoreObject: postgresql-server, postgresql-client
- LanguageKit: gmp
- OgreKit: oniguruma5
- MediaKit (not currently enabled in the default build, but should be soon): ffmpeg
- Azalea: libXft
- etoile_system: dbus
- Corner: libXScrnSaver
- DocGenerator: graphviz
Now, you should just be able to check out Étoilé from subversion and build it:
$ svn co svn://svn.gna.org/svn/etoile/trunk/Etoile
$ cd Etoile
$ gmake && sudo -E gmake install
LanguageKit, The Next Generation (Or Something)
Anyone following the svn logs will have spotted a ludicrous number of changes in LanguageKit appearing recently. A lot of this has been simple code cleanup. For example, code generation for assignments now all goes via a delegate object, so we can easily plug in different memory management strategies. Currently, LanguageKit supports emitting either the automatic reference counting (ARC) or garbage collection (GC) read and write barriers.
Using ARC, instead of the old retain / release code, means that LanguageKit will be able to benefit from the ARC optimisers written for clang. These remove redundant retain / release pairs and do a number of other clever tricks to reduce the number of operations required.
The interface to the code generation part of LanguageKit had two design decisions that are no longer applicable. Originally, I intended to share the runtime-specific code with clang. Since then, the clang and LanguageKit versions of CGObjCGNU.cpp have diverged a lot, so that's no longer an issue. There was also no Objective-C++ support (GCC's Objective-C++ support is terrible and clang had no C++ support at all), so there was a lot of conversion from Objective-C types to C types and then to C++ types. Now, the entire back end is written in Objective-C++, so we can pass objects right down. This simplifies the code a lot.
LanguageKit now requires the GNUstep Objective-C Runtime (libobjc2), and is no longer compatible with the legacy GCC runtime. This gives us a lot of interesting features.
The polymorphic selector problem is now more or less solved. Libobjc2 introduced type-dependent dispatch a while ago. This means that the mapping from selectors to methods now depends on the types, as well as the names, of the selector. This is important, because Objective-C permits you to define two methods in different parts of the class hierarchy with different types. You can then cast one of these objects to id (an untyped object), cast it to the other type, call the method, and have undefined behaviour. This is particularly problematic for Smalltalk code, because we have no type info in the source code, so we can't disambiguate these cases at compile time.
There are two halves to this problem. One is defining methods, the other is sending messages. When you define a #count method in Smalltalk, should that be the version that returns an integer (like NSArray) or the version that returns an object? With the latest version of LanguageKit, both are now emitted. The runtime will automagically select the correct one based on the type info in the selector.
One of the other changes that libobjc2 made was the modification of the method lookup function to return a cacheable slot pointer. As a side effect, this also means that we can look up the type encoding of a method very quickly. Now, when you send a message with an ambiguous type signature, LanguageKit generates code that first gets the type, then branches based on which type encoding is used. This is slower than a normal message send, but not by much.
LanguageKit now generates blocks that use the same ABI as Objective-C blocks. This has several advantages. First, lots of people care about performance of Objective-C blocks, so they'll be working on improving the LLVM optimisers to make them faster (e.g. inlining them). Second, it means that we can now pass LanguageKit blocks to functions or methods that expect Objective-C blocks. This is not completely true yet, because currently blocks always take objects as arguments and return an object, while a lot of Objective-C code expects blocks with different argument types, but it's a start. Finally, it means that we have less code in LanguageKit to maintain, which is always good for reliability.
LanguageKit has always had an LKObject type. This is a pointer that either has a small integer hidden inside it and the low bit set to 1, or a pointer to a real object. Before any message send, we checked the low bit and only did a real message send if it was zero.
Now, support for small objects is part of the runtime. If the low bits in an object are not 0, the runtime does a side lookup of the class from a small table. This means that LanguageKit can skip the special cases, generating much smaller code. It still does generate the special cases for a small selection of methods, but only ones that we'll get a significant benefit from inlining in the small integer case, such as arithmetic. This also has the advantage that we can return small integers from methods - without needing to box them - and Objective-C code can use them as if they were real objects. On 64-bit, we'll eventually start storing 32-bit floats in pointers too.
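The low-bit tagging scheme is easy to sketch in C. The following is a minimal illustration, assuming a single tag bit as in the LKObject description; the names (toy_ref, toy_box_small_int and friends) are hypothetical, and the real runtime's choice of tag bits and its class table are not shown.

```c
#include <assert.h>
#include <stdint.h>

/* Either a real (aligned) pointer or a small integer with the low bit set.
 * Hypothetical type and function names, for illustration only. */
typedef uintptr_t toy_ref;

/* Real object pointers are at least 2-byte aligned, so their low bit is 0. */
int toy_is_small_int(toy_ref ref)
{
    return (ref & 1u) != 0;
}

/* Box: shift the value up by one bit and set the tag. This leaves one bit
 * fewer than a full word for the integer itself. */
toy_ref toy_box_small_int(intptr_t value)
{
    return ((uintptr_t)value << 1) | 1u;
}

/* Unbox: clear the tag and halve. Division is used instead of a right shift
 * so that negative values do not depend on how the compiler shifts them. */
intptr_t toy_unbox_small_int(toy_ref ref)
{
    assert(toy_is_small_int(ref));
    return ((intptr_t)ref - 1) / 2;
}

/* The kind of special case worth inlining: arithmetic on two small integers
 * never needs a message send. Overflow handling is omitted from this sketch. */
toy_ref toy_add_small_ints(toy_ref left, toy_ref right)
{
    assert(toy_is_small_int(left) && toy_is_small_int(right));
    return toy_box_small_int(toy_unbox_small_int(left) + toy_unbox_small_int(right));
}
```

A caller checks toy_is_small_int() first and only falls back to a normal message send when the low bit is clear; boxing 5 and -7 and adding them this way unboxes to -2 without allocating anything.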
Oh, on the subject of boxing and unboxing, that code is improved too. We can now box and unbox more complex structures fairly reliably. The new code still needs testing on more platforms, but it's looking promising.
My favourite new feature, however, was something I almost finished over Christmas and then left to bitrot. It's now finished, and we have transparent bridging for C functions. When you specify a framework to load, LanguageKit loads the relevant library. If you have SourceCodeKit installed, then it will also use libclang to parse the framework header (e.g. FooKit.h, for FooKit) and find all of the functions that it declares and their types.
In the Smalltalk front end, this is exposed via the C pseudoclass. When you send messages to C, they are not really message sends: they are compiled as direct function calls. For example, you can use the standard math library function sqrt() like this:
C sqrt: 42
This doesn't go via any kind of foreign function interface. There's no sending a message, deconstructing the call frame, and then generating the new call. It's the direct equivalent of writing sqrt(42) in C.
For functions that take more than one argument, we have two options in terms of syntax. You can use a C-like syntax, where the function looks like a single-argument message that takes an array as its argument, like this:
C fdim: {60. 12}
This is equivalent to writing fdim(60, 12) in C. Alternatively, you can use a more Smalltalk-like syntax, and split the function into different parts. For example, to call NSLocationInRange(), you might write:
C NSLocation: l InRange: r.
The parser strips the colons and combines the message parts, so this is equivalent to writing NSLocationInRange(l, r) in C.
Prototypes (Again!)
In 2007, while sitting in the lobby of Google Zurich, I wrote an implementation of prototypes and traits / mixins in Objective-C. Since then, Quentin has largely rewritten the traits support (see the last blog entry). I rewrote the prototypes support once to better support the new runtime - using runtime functions rather than manipulating the runtime data structures directly.
When I added support for associative references to the new runtime, I borrowed a lot of the code from the old prototypes implementation. The objc_setAssociatedObject() and objc_getAssociatedObject() runtime functions are effectively implementing the ability to store objects in slots, as with JavaScript objects.
There are a few other things required to completely implement the JavaScript object model in Objective-C:
- The ability to add methods to a single object, not just store values. This isn't strictly required, because JavaScript doesn't really have methods. Slots just return closures, which you then call.
- The ability to clone an object, so that slot lookups (including methods) are satisfied by the prototype if they can't be satisfied by the object.
These are now both supported in the GNUstep runtime, and will be part of the 1.6 release. We now have an object_addMethod_np() function, which is the counterpart of class_addMethod(). The arguments are almost identical, but it takes a single object instead of the class and just modifies that object.
We also have object_clone_np() and object_getPrototype_np(). These clone an object and return its prototype, respectively. The prototypes support in EtoileFoundation is now just a thin wrapper around these.
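The lookup rule behind cloning (check the object's own slots first, then defer to its prototype) can be sketched in plain C. This is a toy model under stated assumptions, not the runtime's implementation: struct proto_object and the proto_* functions are hypothetical names, slots live in a small fixed-size array, and values are opaque pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_SLOTS = 8 };

/* Hypothetical object layout: a few named slots plus a prototype link. */
struct proto_object {
    const struct proto_object *prototype;
    const char *keys[MAX_SLOTS];
    const void *values[MAX_SLOTS];
    size_t count;
};

/* Creates a new, empty object whose lookups fall back to `original`. */
static struct proto_object proto_clone(const struct proto_object *original)
{
    struct proto_object clone = { .prototype = original };
    return clone;
}

/* Stores a value in the object's own slots; returns -1 if they are full. */
static int proto_set_slot(struct proto_object *obj, const char *key,
                          const void *value)
{
    assert(obj != NULL && key != NULL);
    for (size_t i = 0; i < obj->count; i++) {
        if (strcmp(obj->keys[i], key) == 0) {
            obj->values[i] = value;
            return 0;
        }
    }
    if (obj->count == MAX_SLOTS)
        return -1;
    obj->keys[obj->count] = key;
    obj->values[obj->count] = value;
    obj->count++;
    return 0;
}

/* Looks in the object first, then walks up the prototype chain. */
static const void *proto_get_slot(const struct proto_object *obj,
                                  const char *key)
{
    for (; obj != NULL; obj = obj->prototype) {
        for (size_t i = 0; i < obj->count; i++) {
            if (strcmp(obj->keys[i], key) == 0)
                return obj->values[i];
        }
    }
    return NULL;
}
```

Setting a slot on a clone only changes the clone, while a slot set on the original is visible through every clone that has not shadowed it.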
Because this stuff works in Objective-C, we can also use it in languages that LanguageKit supports. For example, you can run this little Smalltalk program:
NSObject subclass: SmalltalkTool
[
run
[ |a b |
a := NSObject new.
b := a clone.
a setValue: [ :this |
ETTranscript show: 'self: ';
show: this;
show: ' prototype: ';
show: this prototype;
show: ' foo: ';
show: (this slotValueForKey: 'Foo');
cr ]
forKey: 'print'.
a print.
b print.
b setValue: 'A fish!' forKey: 'Foo'.
b print.
]
]
The output is:
self: <NSObject: 0x2b3b081c> prototype: (null) foo: (null)
self: <NSObject: 0x2b2d9efc> prototype: <NSObject: 0x2b3b081c> foo: (null)
self: <NSObject: 0x2b2d9efc> prototype: <NSObject: 0x2b3b081c> foo: A fish!
If you've been following LanguageKit development, then you may have noticed that there is a front end for a language called EScript. This is a toy language that is intended to demonstrate that it's possible to implement JavaScript-like languages using LanguageKit. There's a simple test program in svn that shows some of this working. Note the JavaScript-like control structure, and the ability to create new objects by cloning existing Objective-C (or Smalltalk) objects.
As always, this is lowered to the same sort of code that you'll get from Objective-C. Prototypes are generally slower than classes in the current implementation, but they are more flexible, and the advantage of LanguageKit is that you can combine the two easily. A language like EScript would be great for prototyping, while a language like Smalltalk is better for the real implementation. In the future, we'll be able to move between them completely fluidly.
Oh, and while I'm talking about EScript, the current code in svn is the result of a big refactoring effort. I've cleaned it up in a lot of places, and it now uses ARC or GC, not its own ad-hoc retain / release code. This should make it a bit more reliable, and means that it can benefit from the same ARC optimiser that clang uses. Expect a new release soon...
Full Trait Support for Objective-C (or almost)
While working on the next EtoileFoundation release, I recently rewrote the Trait support that David wrote in 2007 among various Objective-C improvements detailed in this report.
In the process, Mixin support has been removed in EtoileFoundation. For Objective-C, the class targeted by the super keyword is hardwired at compilation-time, in other words you cannot use super in a method that belongs to a mixin. As a result, mixins which heavily rely on super are pretty much useless. Mixins tend to heavily use super because every mixin application inserts a new implicit subclass. Hence multiple mixin applications on the same target class create a class hierarchy.
Finally, a last issue with the Mixin support was that the new Objective-C runtime API had no class_removeIvar() and class_removeMethod() functions to play the tricks that make the implicit subclass creation possible.
As explained in the Trait papers, Traits do the same as Mixins but in a cleaner and more predictable way.
Let's come back to the Trait support…
The idea behind traits is to share methods between unrelated classes. When inheritance isn't possible or is not the best design choice, traits enable reuse across classes. Each trait is a collection of methods. In that sense, a trait is similar to a protocol, but unlike a protocol it comes with an implementation.
To give a little background, the Objective-C trait support in EtoileFoundation is based on:
- Traits — Composable Units of Behavior (the original paper, quite short)
- Traits: A Mechanism for Fine-grained Reuse (the latest paper, quite lengthy)
For Objective-C, David made the decision to implement Traits as Classes, rather than a separate construct as Squeak does. This leads to a very simple implementation that requires no special Objective-C compiler or runtime support. It also means any class can be applied as a trait to another target class. The downside is that traits are applied to classes at run-time rather than compilation-time, so a class declaration doesn't list the traits that apply to it in a visible way, and we have to play hide-and-seek with the Objective-C type checking a bit.
It's worth mentioning that GNUstep has provided a Trait-like ability named Behavior (see GSObjCAddClassBehavior()) that uses the same overriding rule, and Behaviors were introduced a long time before Traits were devised. Back in March 1995 :-) According to the GNUstep Base ChangeLog, Andrew McCallum was the one who implemented the idea.
In EtoileFoundation, Trait support was previously minimalistic, limited to adding the methods that belong to a class to another class. So an Objective-C Trait was roughly the same as a GNUstep Behavior. Now the interesting thing about Traits is the whole toolbox that comes with them. It's probably why GNUstep Behavior use has remained very limited. The motivation behind rewriting the Trait support was to support the whole toolbox:
- trait operators (exclusion, aliasing)
- composite trait (a trait with subtraits) and the flattening property that goes along
In our implementation, where everything happens at run-time, trait applications are memorized to support composite traits and multiple trait applications to the same target class. Each time a trait is applied, it gets validated against the trait tree already bound to the target class. This ensures that the operators, the overriding rule and the flattening property remain valid in the new trait tree. Unlike Squeak trait support, a trait can be applied at any time, and the way the trait applications are memorized would make it relatively trivial to unapply traits at runtime.
The basic API to apply a trait is
+[NSObject applyTraitFromClass:excludedMethodNames:aliasedMethodNames:]
where the receiver class is the class to which the trait is applied. You would usually invoke this method in +initialize.
To prevent the compiler from warning you about trait methods that are provided dynamically, you need to declare the trait methods in a category on each target class, or use a pragma to disable the protocol checking, such as #pragma GCC diagnostic ignored "-Wprotocol", in case the trait corresponds to a protocol.
Here is a small example that applies two subtraits BasicTrait and ComplexTrait, to another trait CompositeTrait, then the resulting trait is applied to the receiver class.
NSDictionary *aliasedMethods =
[NSDictionary dictionaryWithObjectsAndKeys: @"lost:", @"wanderWhere:", nil];
[[CompositeTrait class] applyTraitFromClass: [BasicTrait class]
excludedMethodNames: [NSSet setWithObject: @"isOrdered"]
aliasedMethodNames: aliasedMethods];
[[CompositeTrait class] applyTraitFromClass: [ComplexTrait class]];
[[self class] applyTraitFromClass: [CompositeTrait class]];
aliasedMethods means -[BasicTrait wanderWhere:] is going to appear as -lost: in CompositeTrait.
In addition, it's possible to apply a trait without the overriding rule (which states that the target class overrides trait methods), which means methods in the target class can be replaced by methods from a trait. This is a bit closer to a mixin application and kinda similar to GSObjCAddClassOverride(), but its use should be restricted to clever hacks imo :-)
NSDictionary *aliasedMethods =
[NSDictionary dictionaryWithObjectsAndKeys: @"lost:", @"wanderWhere:", nil];
[[self class] applyTraitFromClass: [BasicTrait class]
excludedMethodNames: [NSSet setWithObject: @"isOrdered"]
aliasedMethodNames: aliasedMethods
allowsOverride: YES];
allowsOverride: YES means we allow the trait to override/replace methods in the target class.
Trait applications are commutative, so the ordering in which you apply traits doesn't matter… but when this mixin-style composition is used, traits are not commutative and the ordering matters. That's why I'd rather discourage its use.
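The interaction between exclusion, aliasing and the overriding rule is easier to see on a toy method table. The C sketch below is not the EtoileFoundation implementation: struct method_table, struct alias and apply_trait are hypothetical names, implementations are opaque pointers rather than real IMPs, and aliasing is modelled as a plain rename, which is an assumption based on the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { MAX_METHODS = 16 };

/* Stand-in for a method: a selector name and an opaque implementation. */
struct method { const char *name; const void *imp; };
struct method_table { struct method methods[MAX_METHODS]; size_t count; };
struct alias { const char *original; const char *alias; };

static struct method *find_method(struct method_table *table, const char *name)
{
    for (size_t i = 0; i < table->count; i++) {
        if (strcmp(table->methods[i].name, name) == 0)
            return &table->methods[i];
    }
    return NULL;
}

static bool name_in(const char *name, const char *const *names, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(names[i], name) == 0)
            return true;
    }
    return false;
}

/* Copies the trait's methods into the target, honouring the operators. */
static void apply_trait(struct method_table *target,
                        const struct method_table *trait,
                        const char *const *excluded, size_t n_excluded,
                        const struct alias *aliases, size_t n_aliases,
                        bool allows_override)
{
    for (size_t i = 0; i < trait->count; i++) {
        const char *name = trait->methods[i].name;
        if (name_in(name, excluded, n_excluded))
            continue;                        /* exclusion operator */
        for (size_t a = 0; a < n_aliases; a++) {
            if (strcmp(aliases[a].original, name) == 0) {
                name = aliases[a].alias;     /* aliasing operator */
                break;
            }
        }
        struct method *existing = find_method(target, name);
        if (existing != NULL) {
            /* Overriding rule: the target's own method wins unless the
             * mixin-style override was requested explicitly. */
            if (allows_override)
                existing->imp = trait->methods[i].imp;
            continue;
        }
        assert(target->count < MAX_METHODS);
        target->methods[target->count++] =
            (struct method){ name, trait->methods[i].imp };
    }
}
```

With allows_override set to false, the result does not depend on which trait touched a name first, because an existing target method is never replaced. With it set to true, the last trait applied wins, which is why the order starts to matter.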
Unlike in Squeak, you cannot send messages to super in trait methods (the same problem as the one mentioned at the beginning about the Mixin support). It probably won't change in ObjC in the short term, because the possible solutions are heavy:
- a new IMP() function that takes an extra argument that allows the value of super to be evaluated when it's late-bound (Objective-C methods are compiled into C functions, and an IMP is a function pointer that can be used to access the C function related to a method)
- trait methods must be recompiled per target class
The last solution shouldn't be too hard to implement in Pragmatic Smalltalk (or rather LanguageKit). For declaring traits, it would be interesting to extend our Smalltalk parser to support the Squeak syntax.
For now, three other limitations exist:
- trait applications don't take into account class methods
- no mechanism to declare and check non-trait methods required by trait methods (so you get a runtime exception instead)
- traits must be stateless (no ivar access is allowed)
For the code and documentation, take a look at NSObject+Trait.
Automatic Reference Counting
At FOSDEM this year, both Chris Lattner, the head of Apple's compiler group, and I were invited speakers, so we were put in the same hotel. I had a chance to chat with him about the future of Objective-C over breakfast one morning. I explained that I thought that the way Apple had implemented garbage collection was a disaster, and outlined how I thought it should have been done. Chris' reply was 'wait until the summer'.
I waited until the summer, and on my birthday this year I got a present from Apple: open sourcing their implementation of automatic reference counting (ARC) for Clang and LLVM. Note that I say Clang and LLVM. Clang inserts some quite naïve reference counting calls into the IR, but then an LLVM optimisation pass improves them. This is especially interesting, because it means that LanguageKit will be able to benefit from the same optimisations when it switches to using ARC.
ARC doesn't just provide you with automatic -retain and -release calls. It also tidies up the Objective-C memory model, making a clearer distinction between the C and Smalltalk parts of the language. Objective-C that doesn't do any low-level C things now behaves almost like Smalltalk, but there is a clear distinction between what the compiler and runtime track and what you track. If you want to store object pointers anywhere other than on the stack and in instance variables, then you are responsible for memory management. This is now formalised in the language.
There is another nice tweak to the language in ARC mode. You no longer need to write a -dealloc method if all of your instance variables are primitive or object types. ARC creates a -.cxx_destruct method. This has been around for a while to call C++ object destructors in Objective-C++ objects. It now calls Objective-C destructors too.
ARC should also offer a speed benefit. I've implemented it in the GNUstep runtime, via two mechanisms. If your class implements or inherits memory management methods and does not explicitly opt in to ARC, then it will be sent -retain, -release and -autorelease messages as usual. If, on the other hand, it opts in, then the ARC functions will skip the message sends and manipulate the reference counts directly. This is a lot faster.
There is also a special case, as with Apple's implementation in Lion / iOS 5, where objects that are autoreleased, returned, and then retained, never actually have their reference counts modified. They are first stored in thread-local storage, with a cleanup function that will send them a -release message when the thread is destroyed. If they are retained, then they are removed from thread-local storage and returned. If something else is autoreleased as a return value, then the original value is replaced and is autoreleased at that point.
ARC is supported in LLVM/Clang svn and by the upcoming libobjc2 release. It requires a little bit of tweaking for existing code, but for new code it's trivial to use. Oh, and unlike OS X 10.6, we do support __weak references with GNUstep.
Garbage Collection and Pragmatic Smalltalk
I got a bit side-tracked yesterday, talking about the Cocoa APIs for garbage collection, when I meant to talk a bit more about what it means for Étoilé, so here's a second attempt:
It's very easy in Objective-C code to forget to send some required retain or release messages, and either end up with a use-after-free bug or a memory leak. Experienced Objective-C programmers add these without thinking, but getting to that stage takes a long time.
With Pragmatic Smalltalk, the compiler automatically inserts retain and release calls for you. It doesn't do it in quite an optimal way, because this would require complex dataflow analysis, but it does generate working code.
People expect garbage collection from Smalltalk, and the automatic reference counting is 'good enough' in about 99% of cases. The remaining 1% involves object graphs with cycles. I wrote an automatic cycle detector for use with LanguageKit, but I've never got around to enabling it.
With the latest code, Smalltalk enjoys exactly the same garbage collection as everything else. This means that:
- Globals are marked as roots as soon as they are assigned (not relevant to Smalltalk)
- Instance variables are treated as object pointers
- Memory allocated with NSAllocateCollectable() and NSScannedOption is conservatively scanned
- The stack is conservatively scanned
This isn't quite as good as the accurate collection that you'll typically find in a Smalltalk implementation, but it's not far off. In pure Pragmatic Smalltalk code, the only pointers are instance variables, which are accurately scanned, or on the stack. Most Smalltalk implementations do accurate scanning of the stack, but this doesn't add much benefit in terms of performance - the overhead of computing and parsing the stack maps is quite large.
The big advantage of accurate garbage collection, as found in most Smalltalk, Java, and so on implementations, is that it allows the collector to be completely aware of the locations of all pointers. This means that it can move objects, by just updating their pointers. The Boehm GC can't do this yet, although it's possible that it could flag that an object is not referenced by any possible-pointers in the conservatively scanned regions, and then mark it as eligible for moving.
Because Smalltalk and Objective-C code are using exactly the same barrier calls, we get exactly the same performance from Smalltalk and Objective-C. We may even get better garbage collection performance from Smalltalk, because every SmallInt object has its low bit set to 1, so will be ignored by the collector. In contrast, C integer types are easily confused with pointers, if they happen to contain even numbers.
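As a rough illustration of why tagged integers are cheap for a conservative scanner, the following C sketch counts the words in a region that would have to be treated as possible pointers. It is a simplification and not the Boehm collector's actual test: count_possible_pointers is a hypothetical name, the heap is modelled as a single address range, and the assumption that odd words are rejected outright depends on how a real collector is configured.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts the words in a scanned region that a conservative collector would
 * have to treat as possible pointers into [heap_start, heap_end). */
static size_t count_possible_pointers(const uintptr_t *words, size_t n,
                                      uintptr_t heap_start, uintptr_t heap_end)
{
    assert(heap_start <= heap_end);
    size_t candidates = 0;
    for (size_t i = 0; i < n; i++) {
        uintptr_t w = words[i];
        if ((w & 1) != 0)
            continue;   /* tagged small integer: never an object pointer */
        if (w >= heap_start && w < heap_end)
            candidates++;   /* could be a pointer, so its target stays alive */
    }
    return candidates;
}
```

An untagged C integer that happens to hold an even value inside the heap range is indistinguishable from a pointer here, and would keep whatever it appears to point at alive.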
From the point of view of someone writing Smalltalk code, GC in Objective-C should not require any changes to your code. It just means that now you can be sure that garbage cycles are freed. Oh, and you'll probably see memory usage go down, because LanguageKit is quite conservative about autoreleasing instead of retaining.
You can, of course, force the garbage collector to run, in exactly the same way that Objective-C programmers do. If you look in Languages/Compiler/test.st in subversion, you'll see this line a few times:
NSGarbageCollector defaultCollector collectExhaustively.
If you're not in garbage collected mode, then GNUstep will return nil to the first message, so the second will be silently ignored. If you are using GC, then the second call will tell the collector to keep trying to collect until it can't find any more objects to free. This line is largely here for testing that the collector isn't freeing anything that it shouldn't be. In normal code, you're more likely to want to write this:
NSGarbageCollector defaultCollector collectIfNeeded.
This will poke the collector to say that now is a good time to run. It's not required, but it is a good idea. The collector will be triggered to run at the end of each run loop, if you're using `NSRunLoop`, but otherwise it will run occasionally as a result of allocations, which may be at an inconvenient time. If you've got some code that can't be interrupted by the collector, you can do this:
gc := NSGarbageCollector defaultCollector.
gc disable.
self doRealtimeStuff.
gc enable.
gc collectIfNeeded.
This will ensure that the collector does not run during the -doRealtimeStuff method invocation. Note that this may not actually improve response times. If you're allocating a lot of objects, then the allocation may become slower as you start swapping or even as you start requesting more pages from the OS - turning on the collector may actually make performance worse. If you're just allocating a few objects, however, then this block lets you ensure that the collector will run after your block of code, not during.
Garbage Collection
On Sunday, I thought I'd have a go at implementing Apple's APIs for Objective-C garbage collection in the GNUstep Objective-C runtime, using the Boehm collector. As of today (Thursday), it's working well enough that I can run complex applications like Gorm. I've also modified LanguageKit to insert the required write barriers, so you can now write full garbage collected Smalltalk code that integrates with Objective-C code.
I initially started working on this because I got bored with people saying 'GNUstep sucks, it doesn't support garbage collection like Cocoa' and never expected to actually use it. After playing with it, however, I'm starting to change my mind. Gorm using GC uses somewhere between 5 and 10% less RAM than Gorm using explicit reference counting. I see similar low memory usage with everything else that I've tried. There may be some CPU usage overheads, but nothing I've run has been perceptibly slower, so if there are then they're not very important.
Apple spent something on the order of 25 man-years developing their garbage collector for OS X, so don't be surprised if three days of my effort doesn't perform as well. Autozone (Apple's GC) has two main advantages over Boehm:
- The Boehm mark phase can run concurrently, but it's still a stop-the-world collector. Autozone is fully concurrent.
- The Boehm collector is portable, while Autozone is so deeply wedged into the Mach virtual memory subsystem that there's no hope of ever disentangling it, meaning that the Boehm collector doesn't get nearly as much feedback from the VM subsystem as Autozone. As a simple example, Autozone can read the dirty flag from page table entries directly, so it knows if a page has been modified without scanning it. Boehm can do something similar using mprotect(), but this incurs significant system call overhead.
That said, the Boehm collector is actively developed, and is used in a lot of projects. Performance is already good, and should continue to improve.
Why is garbage collection important for Étoilé? For one thing, it's pretty much an expected feature of languages these days. If you've been using Objective-C for a while, then you probably do the -retain/-release dance without even thinking about it. For new developers, it represents a fairly significant barrier to entry. For example, consider the following method:
- (void)setFoo: (id)newFoo
{
id tmp = [newFoo retain];
[foo release];
foo = tmp;
}
That's the simple form of a set method in Objective-C. Oh, and that isn't thread-safe. Here's the thread-safe version:
- (void)setFoo: (id)newFoo
{
id tmp = [newFoo retain];
tmp = __sync_swap(&foo, tmp);
[tmp release];
}
Well, I think it is, anyway. I probably made an error somewhere, because concurrency is hard. Now here's the thread-safe version in GC mode:
- (void)setFoo: (id)newFoo
{
foo = newFoo;
}
See the improvement? If you saw those two versions, which would you find easier to understand? Now for the really important question: which one would you expect to be faster? This is where it gets a little bit more complicated. The second version actually has some compiler trickery involved. It's more like this, if you had to write it with a compiler that didn't insert the write barriers explicitly:
- (void)setFoo: (id)newFoo
{
objc_assign_ivar(newFoo, self, offset_of_foo);
}
So this version does have the overhead of the function call - it's not a straight assignment. The reference counted version, however, has 4 function calls - two to the runtime function to look up the retain and release methods, and two to actually call those methods. With my latest LLVM optimisations, the lookups will probably be cached, but you still have to do method calls.
What goes on in those method calls? Well, retain and release both do atomic increment / decrement operations. On a multicore system, these mean locking the bus, which can have an overhead on the order of a hundred cycles. Oh, and there's a third atomic operation in the middle.
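To show where the three atomic operations come from, here is a minimal sketch of the swap-based setter using C11 atomics in place of the runtime's retain and release. The types and function names (struct counted, struct holder, holder_set_foo) are hypothetical, and returning the old value stands in for deallocating it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct counted { atomic_uint refcount; };
struct holder { _Atomic(struct counted *) foo; };

static void counted_retain(struct counted *obj)
{
    atomic_fetch_add(&obj->refcount, 1);
}

/* Returns true when this call dropped the last reference. */
static bool counted_release(struct counted *obj)
{
    unsigned previous = atomic_fetch_sub(&obj->refcount, 1);
    assert(previous > 0);
    return previous == 1;
}

/* Mirrors the swap-based setter. Returns the old value if the caller should
 * now free it, or NULL if nothing needs freeing. */
static struct counted *holder_set_foo(struct holder *h, struct counted *new_foo)
{
    if (new_foo != NULL)
        counted_retain(new_foo);                              /* atomic op 1 */
    struct counted *old = atomic_exchange(&h->foo, new_foo);  /* atomic op 2 */
    if (old != NULL && counted_release(old))                  /* atomic op 3 */
        return old;
    return NULL;
}
```

Each of the three marked lines is a read-modify-write on shared memory, which is the cost that a plain pointer store plus write barrier avoids.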
In contrast, the objc_assign_ivar() function does little more than the assignment. It's very cheap. Of course, that's not the whole story. In the traditional mode, if the reference count hits 0, then the object is deleted immediately. In GC mode, the collector must periodically find objects that are no longer referenced and delete them, which adds some overhead.
The other complication is autorelease pools. With pure reference counting, you have a problem returning temporary objects. You want to return them with a reference count of 0 (because you no longer have a reference to them), but you don't want the caller to have to remember to release them. The OpenStep solution to this is to add an -autorelease method, which adds the object to a pool. It is then sent a -release message when the pool is destroyed - typically at the end of the run loop iteration.
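The mechanism can be modelled in a few lines of C. This is a toy sketch rather than the OpenStep or GNUstep implementation: struct rc_object, struct autorelease_pool and the pool_* functions are hypothetical, the pool has a fixed capacity, and setting a flag stands in for freeing the object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { POOL_CAPACITY = 64 };

struct rc_object { unsigned refcount; bool deallocated; };
struct autorelease_pool {
    struct rc_object *pending[POOL_CAPACITY];
    size_t count;
};

static void rc_retain(struct rc_object *obj)
{
    obj->refcount++;
}

static void rc_release(struct rc_object *obj)
{
    assert(obj->refcount > 0);
    if (--obj->refcount == 0)
        obj->deallocated = true;   /* stand-in for actually freeing it */
}

/* Defers one release until the pool is drained, and returns the object so
 * that it can be handed straight back to a caller. */
static struct rc_object *pool_autorelease(struct autorelease_pool *pool,
                                          struct rc_object *obj)
{
    assert(pool->count < POOL_CAPACITY);
    pool->pending[pool->count++] = obj;
    return obj;
}

/* Sends the deferred releases, typically at the end of a run loop iteration. */
static void pool_drain(struct autorelease_pool *pool)
{
    for (size_t i = 0; i < pool->count; i++)
        rc_release(pool->pending[i]);
    pool->count = 0;
}
```

Nothing in the pool can be freed before pool_drain() runs, however long ago its last real user went away.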
This means that temporary objects can exist for a very long time. I had some code a few months ago that was allocating and autoreleasing about 500MB of temporary objects, but only about 5MB of them were live. On my machine, this meant that objects that were no longer live were first being swapped out, then were being swapped back in when the autorelease pool sent them a release message, then finally being completely freed.
In a garbage collected environment, this would not have happened. The collector would have been periodically run and would have freed some of the unreferenced objects before they got swapped out.
Large autorelease pools are a significant problem with a number of defensive programming patterns. One example is this kind of set method:
- (void)setFoo: (id)newFoo
{
[foo autorelease];
foo = [newFoo retain];
}
This means that the old value of foo won't be freed until the end of the current run loop iteration. More importantly, the synthesized property accessor methods are implemented something like this:
- (id)foo
{
return [[foo retain] autorelease];
}
Both of these are intended to allow you to hold a reference to the value of foo on the stack without finding it suddenly turning into a dangling pointer. The second method actually works, the first just gives you some bugs that are insanely hard to debug in multithreaded code (Google recommends the former, but like almost everything else in their Objective-C style guide it's a really good way of writing unmaintainable code).
This kind of defensive programming means that you don't have to spend so much time thinking, but means that you're writing highly suboptimal code. This puts garbage collection in the same category as most high-level language features: if you look at any specific point in your program, you can probably write it more efficiently using low-level techniques, but doing that for the entire program is probably impossible. If you're in garbage collected mode, the synthesized property accessor method looks like this:
- (id)foo
{
return foo;
}
No message sends. Nothing added to the autorelease pool. The object is on the stack, so it's treated as a root (i.e. it can't be collected, and neither can anything else that it references). You can write code this simple anywhere where you are dealing with objects.
Since it wasn't working, I'd imagine that most GNUstep / Étoilé programmers have not really looked at Apple's garbage collection APIs in detail. For the most part, you can keep the main advantage of Objective-C: the ease with which you can drop into low-level mode for the few bits of your code that really are performance critical. You can still allocate memory with malloc() and free it with free() - the collector will ignore this memory completely.
This is useful for things like images, large buffers of data, and so on. In a typical Objective-C program, under 20% of the heap will be managed by the garbage collector directly. There are some halfway steps. For example, if you allocate memory with NSAllocateCollectable(), with 0 as the second argument, then you get memory that the collector will free for you when it is no longer referenced, but which is not scanned. If you store a pointer to such data in an instance variable, then it will not be freed as long as the object referencing it is live.
One very convenient feature is the addition of zeroing weak references. The canonical use case for these is NSNotificationCenter. You typically add a line in your -dealloc method removing yourself as a notification observer. In a garbage collected program, you have a -finalize method instead of -dealloc, which is called when the object is collected. You can't unregister from a notification center here, because while any object has a reference to your object, it won't be eligible for finalisation. The notification center solves this by storing a weak reference to your object. This doesn't prevent it from being destroyed. As an added bonus, it means that you don't need to unregister for notifications - the pointer held by NSNotificationCenter will become nil with no interaction on your part.
Most Objective-C code uses some C libraries, and you often want to pass object pointers into such libraries. It's a common C idiom to implement callbacks as a function pointer and a (void*) data pointer. In Objective-C, you typically pass a small trampoline function and a retained object into such functions. The trampoline then sends a message to the object when it is called. With garbage collection, you can call either CFRetain() or -[NSGarbageCollector disableCollectorForPointer:] to tell the collector not to free the object.
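For readers who have not met the idiom, here is a small C sketch of a callback taking a function pointer plus an opaque context pointer, with a trampoline that recovers the receiver. Everything in it is hypothetical (for_each_value stands in for a C library entry point, and struct accumulator for the object that would receive the message).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C library entry point using the callback-plus-context idiom. */
typedef void (*visit_fn)(int value, void *context);

static void for_each_value(const int *values, size_t n,
                           visit_fn visit, void *context)
{
    assert(visit != NULL);
    for (size_t i = 0; i < n; i++)
        visit(values[i], context);
}

/* Stand-in for the object that should receive the callback. */
struct accumulator { long total; };

static void accumulator_add(struct accumulator *self, int value)
{
    self->total += value;
}

/* Trampoline: recovers the receiver from the opaque pointer and forwards,
 * much as an Objective-C trampoline would send a message to the object. */
static void accumulator_trampoline(int value, void *context)
{
    accumulator_add((struct accumulator *)context, value);
}
```

If a library stores the context pointer in memory that the collector does not scan, that pointer alone will not keep the object alive, which is the case the two calls above are meant to cover.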
Most Objective-C code should run unmodified in this mode. Clang will strip out memory management message sends (and if it misses any, they're implemented as no-ops) and insert the required barrier functions.
If you want to play with it, you will need to get trunk versions of clang, the GNUstep Objective-C runtime, and the GNUstep base and back libraries from subversion. You will then need to recompile everything with the -fobjc-gc-only command-line option. Hopefully everything will work. I didn't have to make any changes to the GNUstep AppKit implementation nor to Gorm. Similarly, the EtoileFoundation test suite worked fine without any changes. I did have to modify LanguageKit, but it's a compiler so that's expected. I had to make one change to GNUstep-back, because it was storing object pointers in some memory allocated with malloc(). If you're doing that, then expect some small problems. Typically, these are fixed by turning malloc(size) calls into calls to NSAllocateCollectable(size, NSScannedOption) and deleting the corresponding free() - hardly a major change.
Hopefully, once it's a bit better tested, we will enable garbage collection by default for Étoilé, and get to write much simpler code.
Coming Soon to a Compiler Near You
I've talked a bit about some of the optimisations and potential optimisations for Smalltalk and Objective-C that are possible with LLVM. The GNUstep runtime has a directory of optimisations, but previously they've been somewhat cumbersome to run. You had to add -emit-llvm to your clang command line, then run opt on all of the emitted bitcode files, then use llc to convert them to native binaries. Persuading GNUstep Make to do this was basically impossible.
The problem is that clang, like most other LLVM front ends, requests a set of default optimisations to run, when generating code. This set is hard-coded in an LLVM header. If you want to make clang run the optimisations, you need to modify this header before compiling clang - not ideal.
Last week, I spent some time hacking on LLVM, and rewriting that code to allow plugins (plugs-in?) to modify the default set of passes. This is still pending review before being added to LLVM, but the code using it is in the GNUstep runtime tree already, so once the patch is committed you can use it immediately.
Currently, you still need to specify the path to the plugin, which is not ideal: it should be loaded automatically. Worse, because it's an LLVM plugin, you actually need to pass it to clang's cc1 equivalent. This means that you need to add something like this to your CFLAGS: -Xclang -load -Xclang {llvm/install/path}/libGNUObjCRuntime.so
Once you've done this, the plugin is loaded, and the various passes will be added depending on the optimisation level. At -O2, they'll all be run (except the profile-driven ones). This means:
- Instance variable references will be made fragile, if doing so will not break the public ABI.
- Class lookups will be cached
- Class message lookups will be cached
- Class methods will be inlined, if possible
- Message sends in loops will be cached
This kind of list is meaningless without benchmarks, so here's a simple one. This contains a couple of loops, one sending class messages and one sending instance messages. It uses clock() to record the amount of CPU time taken for the entire microbenchmark. Here you can see the results from compiling the program with GCC, with Clang, and with Clang and the plugin:
$ gcc -O3 -std=c99 loop.m -L /Local/Library/Libraries/ -lobjc && ./a.out
16.648438 seconds.
$ clang -O3 -fobjc-nonfragile-abi loop.m -L /Local/Library/Libraries/ -lobjc && ./a.out
15.312500 seconds.
$ clang -O3 -fobjc-nonfragile-abi -Xclang -load -Xclang `llvm-config --libdir`/libGNUObjCRuntime.so loop.m -L /Local/Library/Libraries/ -lobjc && ./a.out
3.539062 seconds.
Don't read too much into the difference between the first two. I just pasted in the results of running each command. Because this was done in a VM, the timing is not 100% accurate, and the jitter between results was about as big as the difference between the clang and the gcc results here.
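The source of loop.m is not reproduced here, so the following is only a sketch of the timing technique in plain C, not the benchmark itself. The names time_loop and do_work are made up for illustration, and an ordinary function call stands in for a message send.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for the work done by one message send. */
static long counter;
static void do_work(void) { counter++; }

/* Calls fn the given number of times and returns the CPU time used,
 * in seconds. clock() measures processor time, not wall-clock time,
 * which is why the result is divided by CLOCKS_PER_SEC. */
double time_loop(void (*fn)(void), long iterations)
{
    assert(fn != NULL);
    clock_t start = clock();
    for (long i = 0; i < iterations; i++) {
        fn();
    }
    clock_t end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling time_loop(do_work, 100000000) and printing the result with "%f seconds." would give output in the same shape as the figures above, with the same caveat about jitter when run in a VM.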
The big difference, of course, is the final result - less than a quarter of the time taken for the gcc-compiled version to run. This sent 5 times as many class messages as instance messages, and with the first two results the amount of time spent sending each was the same. This was due to the large overhead of calling objc_lookup_class() for every class message. You can see evidence of this in the GNUstep code, which is littered with static variables that cache classes to avoid the lookup overhead.
One of the optimisations cached this lookup automatically, so that overhead was negligible. This drops the cost of class messages to approximately the same cost as instance messaging. Class messages are automatically cached, even if they're not in loops, because the mapping from class message to method rarely changes. We also cached instance method lookups in loops, so the overhead of the message sends was quite low as well. Comparing just the class messages, we have about 7.5 seconds for GCC and about 0.5 seconds with these extra optimisations.
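The hand-written idiom that this optimisation replaces can be sketched in plain C. This is an illustration of the caching pattern only: the class table, lookup_class() and cached_bigint_class() are hypothetical stand-ins, not runtime APIs, and the pass itself works on LLVM IR rather than on source like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical class table; a real runtime would use a hash table. */
struct class_entry { const char *name; int id; };
static const struct class_entry class_table[] = {
    { "NSObject", 1 }, { "NSString", 2 }, { "BigInt", 3 },
};

/* Counts how many slow lookups were performed, for illustration. */
int slow_lookups;

/* Stand-in for an expensive by-name lookup such as objc_lookup_class(). */
const struct class_entry *lookup_class(const char *name)
{
    assert(name != NULL);
    slow_lookups++;
    for (size_t i = 0; i < sizeof(class_table) / sizeof(class_table[0]); i++) {
        if (strcmp(class_table[i].name, name) == 0) {
            return &class_table[i];
        }
    }
    return NULL;
}

/* The manual idiom: cache the result in a static variable, so the slow
 * lookup only happens the first time this call site runs. */
const struct class_entry *cached_bigint_class(void)
{
    static const struct class_entry *cache;
    if (cache == NULL) {
        cache = lookup_class("BigInt");
    }
    return cache;
}
```

However many times cached_bigint_class() is called, slow_lookups only ever reaches 1, which is the effect the automatic caching gives without the boilerplate.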
Hopefully, by the time LLVM 3.0 is released, if you use clang and have the GNUstep runtime installed, then this should all happen automatically, and you'll get nice fast code without having to do anything.
Clang: Objective-C and JavaScript
There haven't been any updates here for ages, so I first wanted to reassure everyone that we're not dead. I've been focussing a bit more on clang and the GNUstep Objective-C runtime than on Étoilé recently, in the run up to the LLVM 2.9 release. The majority of this has just been tidying and fixing some bugs in some corner cases.
I've also tidied up and simplified the GNU runtime code in clang. This doesn't change the functionality (the code is now simpler, so it should be a tiny bit faster, though I doubt that the difference is noticeable), but it should make it easier for other people to get involved. I've also refactored some of the code so that we are using the same code paths as Apple, which means that any bugs are more likely to be found and fixed. The exception handling code, in particular, has gone from being about 300 lines of code specific to the GNU runtimes to being about half a dozen, calling a generic implementation that we share with the Apple Modern runtime.
Clang 2.9 And Objective-C++
Exception handling, by the way, is something that's been reworked in clang 2.9 and the upcoming libobjc2 release. We now have proper interoperability with Objective-C++. This means that:
- C++ and Objective-C exceptions can correctly propagate through each others' stack frames.
- C++ catch statements can catch Objective-C objects thrown with @throw
- Objective-C @catch statements can now catch Objective-C objects thrown with throw
This is semantically equivalent to the Apple unified exception model, and does not break the ABI - existing C++ and Objective-C code will still work when linked with Objective-C++ code compiled with this model. To enable this, you must specify -fobjc-nonfragile-abi when compiling Objective-C++, or you will get the old GCC-compatible (i.e. completely broken in most cases) mode.
That's not really what I wanted to talk about today though. One thing that we've talked about a few times over the past few years is adding something to GNUstep that sends PostScript-like commands to a browser, to be drawn in a canvas tag, and gets events back.
The GTK guys have now added a similar shim, and we still haven't got around to it. Part of the reason is that I'm still not convinced that it's a good idea (in spite of the fact that it was my idea to start with).
Good NeWS!
To understand my objection, it's worth going back almost three decades, to a time when different UNIX vendors had their own windowing systems. Two of the competing systems at the time were the X Window System, from MIT, and NeWS, from Sun. Both used a client-server abstraction, where applications communicated with a display server.
In the case of X, this communication was very simple. The clients sent drawing commands, and the server sent events, like mouse clicks or keyboard button presses. With NeWS, the client sent PostScript programs, which handled drawing and low-level event handling.
To understand the difference, compare how a button would be implemented in the two approaches. With X, the client would send commands to draw the button to the server. When the user clicked on the button, the server sent a message to the client and the client sent a reply including instructions for drawing a pressed button.
In the NeWS model, the client sent a PostScript program representing a button. This drew a button, registered for events, and then waited. When the user clicked on the button, this program drew a pressed button and sent a 'button pressed' message back to the client.
On a single machine, or even a local network, there isn't much difference between these two models. A difference appears, however, when you start using larger networks, such as the Internet.
When I'm using a remote machine over the Internet, it's not uncommon to get round trip times of 200ms or more. This means, with the X model, and assuming an infinitely fast server, it takes 200ms between clicking on the button and seeing the effect. Not a huge amount of time, but a noticeable lag. This is even worse on mobile networks: with GPRS, 2 second round trip times are not uncommon.
In contrast, the NeWS model meant that the response was always immediate. The view objects ran on the local machine; only the models ran on the remote machine. Network latency was largely hidden from the end user. This is apparent in things like tree views: in the NeWS model, you only need to go over the network if the view is informed that new items have been added to the model. You can always expand and collapse tree nodes on the local machine. With the X model, every click-redraw cycle needs to go via the network.
Reinventing NeWS
One of the advantages of web applications over remote X11 is that they can run some client-side JavaScript code, effectively reimplementing the NeWS model. A tree view in a web app probably caches some local data and uses JavaScript to expand and collapse nodes, only fetching data from the server the first time that it's displayed. This makes web apps a lot more responsive over high-latency links than remote X11.
Unfortunately, streaming canvas drawing commands brings us straight back to the X11 model. It has the advantage of being simple (we estimated it would be under 5000 lines of code to add this support to GNUstep), but it's not very sensible.
Ideally, what we'd like to do is move the view objects into the browser, and maybe some of the controller objects, but leave the models on the server. This is what NeWS did, but it had the disadvantage that you had to write the views in PostScript. The web has a similar disadvantage: you can write the server-side logic in any language, but you have to write the views in JavaScript. We have a load of view and controller classes written in Objective-C already, and we'd like to be able to use them in a web browser.
Compiling Objective-C to JavaScript
One of the really nice things about clang is that it is modular. We're using it as a compiler front end, generating LLVM bitcode, which LLVM then compiles to native code. We're also using it for syntax highlighting and code indexing. The clang abstract syntax tree (AST) is exposed via a plugin interface, so it's easy to use clang for other things.
This has been my latest project: a plugin that walks an Objective-C AST and emits JavaScript code, along with a smallish supporting library. You can find the code in svn, as usual, along with some test cases.
To give a quick example of what works already, here's one of my test cases:
int *array = malloc(1024);
float *alias;
alias = (float*)array;
alias[2] = 1;
array[1] = alias[2];
((id*)array)[12] = [NSObject new];
jsalert(sizeof(array));
jsalert(array[1]);
jsalert(array[2]);
jsalert(array[12]);
[((id*)alias)[12] alert];
The jsalert() function and the -alert method on NSObject are implemented in JavaScript (as is the NSObject class itself), and just call the JavaScript alert() function to pop up a dialog, for testing whether the code actually worked. This example shows several features of the compiler:
- Pointers
- Aliasing memory allocations via pointer
- Storing and retrieving object pointers in memory of a different type
- Sending messages to Objective-C objects (and classes)
The core of the C language works - including things like arrays of structures, and structures containing unions containing pointers - and so does the core of Objective-C.
The biggest omission, which will probably never be fixed, is integer-to-pointer casts. You can cast pointers to integers, and you can do arithmetic on pointers, but if you try to dereference a pointer that was created by casting from an integer then you will get a run-time error. On the plus side, you do get array bounds checking and full garbage collection for free...
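One general technique that gives exactly this combination of behaviour (working pointer arithmetic, checked dereferences, and no way back from an integer) is to represent a pointer as a reference to an allocation plus an offset. The post does not say how the plugin represents pointers, so the C sketch below is an assumption about the technique, not a description of the plugin, and every name in it is made up.

```c
#include <assert.h>
#include <stddef.h>

/* An allocation that knows its own size. */
struct allocation {
    unsigned char *bytes;
    size_t size;
};

/* A "fat" pointer: which allocation it refers to, plus a byte offset.
 * A pointer made from a bare integer has no allocation (base == NULL). */
struct fat_ptr {
    struct allocation *base;
    ptrdiff_t offset;
};

/* Pointer arithmetic only moves the offset; the check happens on use. */
struct fat_ptr fat_ptr_add(struct fat_ptr p, ptrdiff_t delta)
{
    p.offset += delta;
    return p;
}

/* Pointer-to-integer casts are fine: just expose the offset. */
ptrdiff_t fat_ptr_to_int(struct fat_ptr p)
{
    return p.offset;
}

/* Integer-to-pointer casts lose the allocation, so the result can be
 * created and compared, but never dereferenced. */
struct fat_ptr fat_ptr_from_int(ptrdiff_t value)
{
    struct fat_ptr p = { NULL, value };
    return p;
}

/* Checked load: returns 1 and stores the byte in *out on success, or 0
 * if the pointer has no allocation or the offset is out of bounds. */
int fat_ptr_load(struct fat_ptr p, unsigned char *out)
{
    assert(out != NULL);
    if (p.base == NULL || p.offset < 0 || (size_t)p.offset >= p.base->size) {
        return 0;
    }
    *out = p.base->bytes[p.offset];
    return 1;
}
```

In this model bounds checking falls out of the representation, and the failure on dereferencing an integer-derived pointer is simply the base == NULL case.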
What's still to do? Currently, only the core of Objective-C works. Exceptions don't (although they're easy to add), and most binary and unary operations on declared properties are broken. The majority of the remaining work is in the support library. We need to add JavaScript implementations of the common C library functions and reimplementations of some of the core classes in terms of their JavaScript equivalents (NSArray, NSString, NSView, NSBezierPath). With these done, we should be able to use GNUstep classes that depend on them after a simple recompile.
Property Value Coding and EtoileUI
I'm a relatively new developer to Étoilé working on parts of the ProjectManager service. Last year I put together a basic window manager from something started by David Chisnall, and now I'm trying to get some of the other architectural bits and pieces prototyped and running, such as the task bar and project management components.
This led me to start working with EtoileUI, our user-interface framework. I was doing some stuff with Property-Value Coding, which Quentin Mathé was kind enough to help me out with. He gave me the following detailed but useful explanation about why it was created and how it interacts with EtoileUI.
Property Value Coding is just a small addition to support reading/writing properties independently of the Key Value Coding (KVC) semantic that GNUstep already supports.
There are two methods used to retrieve and set a property: -valueForProperty: and -setValue:forProperty:. Both of these use the primitive KVC implementation that NSObject supports, and -setValue:forProperty: checks the property exists before invoking -[NSObject(Model) setPrimitiveValue:forKey:]. If it doesn't, NO is returned, rather than raising an undefined key exception. You declare the properties that you implement by overriding the -properties method in your model class and returning an array of strings containing the property names.
Property Value Coding lets us access values that aren't normally KVC-compliant. For example, for an NSDictionary, [dict valueForKey: @"count"] won't return [dict count], while [dict valueForProperty: @"count"] will return it. This way it's possible to open an inspector that lets you browse/edit any object properties even if their KVC implementation doesn't support it.
In addition, ETLayoutItem overrides -valueForProperty: and -setValue:forProperty: to look up the value on the represented object when one is provided, otherwise on the layout item itself (i.e. when -representedObject returns nil). This makes it possible to set/unset a represented object and have the layout item continue to return valid values (at least for basic NSObject properties).
When the layout item has no represented object, -[ETLayoutItem setValue:forProperty:] automatically adds the given value to a dictionary in case the property doesn't exist yet (instead of returning NO). This allows ETLayoutItem to be used as an extensible model object when it makes sense. For example, in a compound document, the layout item tree is the model, so we can just store the property values on the layout item itself.
You can also use a layout item as a represented object of another layout item. This is how EtoileUI creates the "meta" UI representation visible when you call -inspect: on a layout item.
The general rule is that you don't have to call -valueForProperty: and -setValue:forProperty: from within your model objects; you should just ensure these methods return the right values for the properties you declare with -properties. The NSObject(Model) implementation doesn't have to be overridden if the model objects have accessors for each property.
The only common case where you call -valueForProperty: and -setValue:forProperty: is when you write an EtoileUI aspect subclass (style, layout etc.) that wants to read/write model properties. In this case, you are expected to always invoke these methods on the layout item and never directly on the represented object.
Model objects which support basic Key Value Observing (KVO) compliance (posting a KVO notification every time a property changes) can be used as represented objects that trigger automatic UI updates. To do so, your model objects must override -[NSObject(Model) observableKeyPaths]. ETLayoutItem can then intercept the KVO notification and trigger a redisplay or update its widget view.
As mentioned before, the method -[NSObject(Model) properties] declares the object properties that can be accessed through Property Value Coding. This method is going to be renamed -propertyNames soon since it conflicts with various Cocoa APIs. In future, the NSObject(Model) implementation will be changed to return the property names from an object model description bound to it rather than directly as it does now.
Syntax Highlighting with Clang
One of the reasons that I got involved with clang originally was the promise that the same front-end code could be used for other things. Since then, the only things that I've used clang for are compiling and as a static analyser.
More recently, the clang team has produced a new interface, libclang. This is a set of C APIs that expose the functionality that an IDE might want. I've started wrapping these in IDEKit (which Quentin informs me is a name that has already been used by someone else, so expect to see it renamed soon, probably to SourceCodeKit).
The libclang APIs let you do a lot of things, including reporting diagnostics (errors, warnings, and so on) in an editor, code completion, and so on. The first thing that I decided to work on was syntax highlighting.
Most code editors claim to perform syntax highlighting, but a lot really just do lexical highlighting. Vim is an example of this; it highlights by simply tokenising the input buffer and pattern matching. You can see the difference between vim's lexical highlighting and real syntax highlighting in this image:
The top window is a modified version of Typewriter that uses IDEKit to perform syntax highlighting. The bottom window is the same file (MsgSendSmallInt.m from LanguageKit) opened in Vim. There are a few things to notice.
First, Vim doesn't know that COMPARE is a macro instantiation, so it doesn't highlight it at all. True syntax highlighting does. Second, look at the message sends. This code shows two class messages, both sent to BigInt. The syntax highlighter can tell that these are message sends (so it highlights the selector component) and that BigInt is a class, so it makes it purple. In contrast, Vim's lexical highlighter doesn't have patterns for the class or selector names, so it ignores them.
Another example is the handling of intptr_t. Vim treats this as a built-in type name because it's one of the C99-specified types. The syntax highlighter, in contrast, knows that it is a typedef, so highlights it in a different colour to real keywords like int and void.
You can find the modified version of Typewriter in Developer/Examples/CodeEditor. It's just a simple demo - the real code will be integrated into CodeMonkey later. It works fast enough in the files that I tested that you can type without noticing any delay. It's currently re-highlighting the selected line after every character press. This needs a bit of tuning.
For example, it's only really worth running the highlighter at all when the user has typed a whitespace or punctuation character; anything else will probably be the middle of a keyword or identifier, so won't provide any new interesting highlights.
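That tuning rule is small enough to sketch in C. This is not code from the Typewriter demo; should_rehighlight is a made-up name, and treating '_' as part of an identifier is an extra assumption on top of the rule described above.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

/* Returns 1 if typing the character ch has probably just completed a
 * token, so that re-running the highlighter is worthwhile, else 0. */
int should_rehighlight(int ch)
{
    /* The <ctype.h> functions require a value in unsigned char range. */
    assert(ch >= 0 && ch <= UCHAR_MAX);
    /* ispunct() counts '_' as punctuation, but in C it can appear in
     * the middle of an identifier, so it is treated like a letter. */
    if (ch == '_') {
        return 0;
    }
    return isspace(ch) || ispunct(ch);
}
```

An editor would call this from its key handler and skip the highlighting pass whenever it returns 0, for example while the user is halfway through typing an identifier.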
Oh, and one more thing: The highlighter runs in two passes. In the first pass, it tags ranges in the source (an attributed string) with semantic attributes. It then goes through and replaces these with presentation attributes. You don't have to run the second step; you can also use some other transform on the result, such as generating HTML with attributes containing the semantic information and
LanguageKit Interpreter
A little while ago, Eric had a go at writing an interpreter for LanguageKit. The point was to be able to use it for debugging, use it on OS X, and use it when the overhead of a full compiler is too much.
There were a few bugs in the interpreter and Eric got distracted implementing CoreGraphics and doing a few other things. This weekend, I picked up the interpreter and fixed some bugs. It turned out not to require much work; Eric had already done the difficult bits. After a little while, the interpreter was passing more tests than the compiler. This was slightly embarrassing, so I started fixing bugs in the compiler too.
Now, the entire Smalltalk test suite (thanks again Günther!) passes with both the interpreter and the compiler. It's quite nice to see that the interpreter has reasonable performance here. Running the test suite with the compiler:
$ time sh runall.sh -q
....................................
36 tests run. 36 passed, 0 failed.
real 0m37.538s
user 0m9.048s
sys 0m4.674s
And with the interpreter:
$ time sh runall.sh -q
....................................
36 tests run. 36 passed, 0 failed.
real 0m28.365s
user 0m6.463s
sys 0m3.935s
As you can see, the interpreter takes less time to run the test suite, both in terms of wall and CPU time. This might seem surprising - after all, the entire point of the compiler is speed - but it makes sense once you remember how small the tests are. For most of the tests, if you enable timing in edlc, you get a message like this at the end:
Smalltalk execution took 0.000000 seconds. Peak used 32592KB.
The amount of time spent running the code is so small that it's lost down in rounding errors when you convert it to seconds. You get something similar from the interpreter. The time spent compiling and optimising the code is quite a bit more than the time spent actually running it.
This means that the interpreter is a good choice for short-lived scripts. Now that it's working properly, we can start thinking about lazy compilation, where we only compile the methods that are called frequently, or, better, feedback-driven optimisation, where we collect profiling information in the interpreter and then compile an optimised version later.
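The lazy compilation idea can be sketched in a few lines of C. Nothing here is LanguageKit code: the struct, the threshold and the two stand-in functions are hypothetical, and the "compiler" simply hands back a native function that computes the same result as the interpreted path.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*compiled_fn)(int);

struct method {
    unsigned calls;        /* interpreted invocations so far */
    compiled_fn compiled;  /* NULL until the method has been compiled */
};

enum { HOT_THRESHOLD = 3 }; /* arbitrary value, for illustration only */

/* Stand-ins for the two execution paths; both double their argument. */
static int interpret_double(int x) { return x + x; }
static int compiled_double(int x)  { return x * 2; }

/* Hypothetical compiler entry point. */
static compiled_fn compile_method(struct method *m)
{
    (void)m;
    return compiled_double;
}

/* Runs the method, compiling it once it has been called often enough. */
int invoke(struct method *m, int arg)
{
    assert(m != NULL);
    if (m->compiled != NULL) {
        return m->compiled(arg);      /* fast path: already compiled */
    }
    m->calls++;
    if (m->calls >= HOT_THRESHOLD) {
        m->compiled = compile_method(m);
    }
    return interpret_double(arg);     /* slow path: interpret this call */
}
```

The call counter doubles as the simplest possible profile. A feedback-driven version would record more than a count (argument types, for instance) and pass that to the compiler.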
Opal Summer of Code Report
I spent this summer as a participant in the Google Summer of Code program, working on Opal, a graphics library which implements an API compatible with CoreGraphics. Opal was started in 2006 by BALATON Zoltan, and was mostly dormant for the last few years until early 2010 when I started playing with the code and added a few features. Today, it is a reasonably complete implementation with most major features working. I’ll discuss in more detail what I did and some of the neat features of the library.
API notes
One of the challenges with implementing this API was its relationship with the CoreFoundation library. CoreGraphics objects are supposed to be CoreFoundation types, which means they are “toll-free bridged” with NSObject. For instance, if you create a CGImage, the following are all valid ways of retaining it (incrementing its reference count):
CGImageRef anImage = CGImageCreate(...);
CGImageRetain(anImage);
[anImage retain];
CFRetain(anImage);
GNUstep’s history predates Apple’s creation of the CoreFoundation library, and we don’t really have a need for it in GNUstep, other than for porting Mac applications to GNUstep. (There is a library in development called CoreBase, available at svn://svn.gna.org/svn/gnustep/libs/corebase/trunk, which implements the CF functions on top of GNUstep base.) However, to implement Opal, I decided to make it an Objective-C library internally, depending on GNUstep base. For example, CGImage is a regular Objective-C class, inheriting from NSObject. The end result is that Opal is structured quite similarly to the existing GNUstep back library, except that the Objective-C API is private to the library, and only a C API is exposed.
If you look at the Opal headers, they appear to use CF types (such as CFArray, CFString, etc.), but these are just typedef'ed in CGBase.h to the respective Objective-C types (NSArray *, NSString *, etc.).
Text
Currently, GNUstep’s most featureful backend, which uses the cairo graphics library, uses cairo’s “toy” text API. This API is designed for users of cairo who lack a text layout library to do their own character to glyph mapping and glyph positioning. With the toy API, the user passes cairo a UTF-8 string and lets cairo perform character to glyph mapping and glyph positioning. It’s called the “toy” API because this is a non-trivial problem and cairo only does it correctly in simple cases.
My hope for the long term is to add full support to Opal for doing complex text layout, and support for using advanced typographic features in OpenType fonts. I started on this path by writing a draft implementation of the font descriptor part of the CoreText API.
Right now, Opal implements most of the CGFont functions on both Windows and X11, which allows accessing various font metrics, as well as drawing specific glyphs at specified locations.
As an example, the following code snippet draws the "AE" and "ae" ligatures at (10, 100):
CGGlyph AEligatures[2] = {CGFontGetGlyphWithGlyphName(f, @"AE"),
CGFontGetGlyphWithGlyphName(f2, @"ae")};
CGAffineTransform xform = CGAffineTransformIdentity;
xform = CGAffineTransformTranslate(xform, 10, 100);
CGContextSetTextMatrix(ctx, xform);
CGContextShowGlyphs(ctx, AEligatures, 2);
The full code is in Tests/texttest.m. Here is the output:
Color management
I also implemented the foundations for making Opal handle color management. There are some limitations to what we can do because cairo is not color-management aware; however, this mostly affects PDF export. Right now, Opal assumes surfaces are in the sRGB color space, and converts the colors you draw with to that space. So, if you draw a rectangle with a shade of green in the AdobeRGB color space, that color should be transformed to sRGB before being displayed on screen. This will be easy to support for images as well; I am just missing the code for using the libjpeg/libpng/libtiff libraries to check for colorspace metadata or embedded ICC profiles.
Here's an example (from Tests/colorspace.m):
The top row is 100% green, in AdobeRGB on the left half, and sRGB on the right. These get mapped to the same color in sRGB, so you can't see the difference. The bottom row is 75% green, also in AdobeRGB on the left, and sRGB on the right. Here the left side is mapped to a slightly brighter green in sRGB.
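The reason the two 100% greens end up identical can be checked with a little arithmetic. The C sketch below is not Opal's code and makes several simplifying assumptions: it works in linear light only (the AdobeRGB and sRGB transfer curves are left out), it uses the commonly published D65 matrices for Adobe RGB (1998) and sRGB, and it handles out-of-gamut values by plain clamping. The names are made up.

```c
#include <assert.h>

/* A linear-light RGB triple with components in [0, 1]. */
struct rgb { double r, g, b; };

static double clamp01(double v)
{
    return v < 0.0 ? 0.0 : (v > 1.0 ? 1.0 : v);
}

/* Converts linear Adobe RGB (1998) to linear sRGB by going through
 * CIE XYZ (D65 white point), clamping components that fall outside
 * the sRGB gamut. */
struct rgb adobe_to_srgb_linear(struct rgb in)
{
    assert(in.r >= 0.0 && in.r <= 1.0);
    assert(in.g >= 0.0 && in.g <= 1.0);
    assert(in.b >= 0.0 && in.b <= 1.0);

    /* Adobe RGB (1998) -> XYZ */
    double x = 0.5767309 * in.r + 0.1855540 * in.g + 0.1881852 * in.b;
    double y = 0.2973769 * in.r + 0.6273491 * in.g + 0.0752741 * in.b;
    double z = 0.0270343 * in.r + 0.0706872 * in.g + 0.9911085 * in.b;

    /* XYZ -> sRGB */
    struct rgb out;
    out.r = clamp01( 3.2404542 * x - 1.5371385 * y - 0.4985314 * z);
    out.g = clamp01(-0.9692660 * x + 1.8760108 * y + 0.0415560 * z);
    out.b = clamp01( 0.0556434 * x - 0.2040259 * y + 1.0572252 * z);
    return out;
}
```

Feeding in pure AdobeRGB green (0, 1, 0) produces negative red and blue components before clamping, because that green lies outside the sRGB gamut. After clamping, the result is essentially pure sRGB green, which is why the two halves of the top row look the same.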
The other bit of work that needs to be done is to investigate APIs for asking the windowing system what color space it expects windows to be drawn in. I confess that I have no idea how this works, or if it is supported at all on Windows or X11. Wide-gamut LCD displays whose native color space is AdobeRGB are starting to become more common, and it would be nice to be able to display an AdobeRGB image from a digital camera without an intermediate conversion to sRGB, losing precision.
Images
I wrote the image handling in Opal from scratch, and while it was a bit of work, the result is quite nice. The design of CGImage handles a wide variety of formats (e.g. up to 32 bits per component, integer or floating point samples, arbitrary color spaces). CGImages are immutable, so we can cache a copy of the image converted to a format appropriate for the display device. This enables both high performance for drawing, and the ability to do things like editing an image at its full bit-depth, or doing format conversions without an intermediate conversion to 8-bit ARGB.
Conclusion
You can check out a copy of Opal at the URL svn://svn.gna.org/svn/gnustep/libs/opal/trunk. The README file contains installation instructions, and there is a detailed to-do list in the TODO file.
More fun with TDD
In my last post, I mentioned a few of the fun things that you would soon be able to do with the type-dependent dispatch stuff. It turns out that one of these was easier to implement than I expected. Consider the following line from the class in the last example:
[(B*)a foo: 12];
The object pointed to by a implements a -foo: method, but this method takes an int as the argument, while the one declared in B takes a float. This explicit cast means that the compiler will definitely generate the wrong kind of call frame for this method. This is a contrived example, but it's the easiest way of demonstrating this problem.
When I run this line on OS X, I get this:
int: 1891656480
The value is just whatever nonsense happened to be in the register or stack slot used for passing the third integer argument. Obviously, this is bad. How about libobjc2 with the stuff from my last post?
Calling [A -foo:] with incorrect signature.
Method has v12@0:4i8, selector has v12@0:4f8
int: 1094713344
This is a bit better. It still does the wrong thing, but at least it tells you it's doing it. You can add a breakpoint there and find where the problem is. Even without the debugger, you know that something is sending a -foo: message to an instance of class A with a floating point value as the first explicit argument instead of an integer.
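The number printed after the warning is not random. The offsets in the v12@0:4f8 encoding are consistent with a 32-bit target where arguments are passed on the stack, and 1094713344 is exactly the IEEE 754 single-precision bit pattern of 12.0 (0x41400000) read back as an integer. A minimal C check of that arithmetic, with a made-up function name and assuming a 32-bit IEEE 754 float:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* This sketch assumes that float is a 32-bit IEEE 754 type. */
static_assert(sizeof(float) == sizeof(int32_t), "float must be 32 bits");

/* Returns the bits of a float reinterpreted as a 32-bit integer.
 * memcpy is the well-defined way to do this in C; it mimics what the
 * callee does when it reads a float argument slot as an int. */
int32_t float_bits_as_int(float value)
{
    int32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Calling float_bits_as_int(12.0f) gives 1094713344: the caller stored 12.0 as a float, and the method read the same four bytes as an int.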
Now, what happens if you link the program against LanguageKit? Now the results are a lot better:
int: 12
So what's really going on there? LanguageKit, when it loads, installs a handler for mismatched method invocations. When this is invoked, it constructs a new method that takes the arguments that the selector defined. This method is simple. It either boxes or unboxes the arguments, as required, and then calls the correct method. The handler then does the lookup again and gets the slot for the new method, which it returns.
Any future message send to this class with this type signature will result in the new method being called. This adds some overhead (potentially quite a lot of overhead), but it is probably a lot better than stack corruption.
With this in place, you can do some fun things. For example, suppose we define the following interface:
@protocol IntMap
- (void)setObject: (float)a forKey: (int)b;
- (float)objectForKey: (int)a;
@end
Hopefully you'll recognise these method names as being those used for manipulating a mutable dictionary. The declarations of these methods in the dictionary class take only objects as arguments - dictionaries are maps between objects and objects, not between primitive types. With this fixup enabled, we can do this:
id<IntMap> d = (id<IntMap>)[NSMutableDictionary new];
[d setObject: 42 forKey: 100];
printf("Dictionary contained %f\n", [d objectForKey: 100]);
This creates an NSMutableDictionary instance and then casts it to this protocol. Now, the compiler will use the types for the methods declared in the protocol, rather than the types declared in the class, when constructing the message send. So, we're intentionally doing the wrong thing, calling these methods with the wrong signature. What happens?
Dictionary contained 42.000000
The runtime, in conjunction with LanguageKit, does what we wanted. If you inspect the classes used, you'll find that the key is being turned into a BigInt and the value into a BoxedFloat. These are the two types that LanguageKit uses for auto-boxing integer and floating point types internally.
The end result is that we can (almost) pretend that Objective-C is a pure object-oriented language, just like Pragmatic Smalltalk. In practice, we still need to be explicit about types, but the language is now a lot more forgiving when we make mistakes.
Type Dependent Dispatch
Every language contains some painfully stupid design decisions and Objective-C is no exception. The one that I find the most irritating is the definition of a selector. Selectors are an abstraction of method identifiers. In the original StepStone compiler, and with the NeXT and Apple runtimes, these are represented by uniqued strings.
This representation was inherited from Smalltalk. In Smalltalk, it made sense. Every method was uniquely identified by a string. Every method returned an object (self if no explicit return was specified) and took objects as arguments.
Unfortunately, Objective-C inherited the C structural type system as well as the Smalltalk algebraic type system. This means that methods have a name and a set of parameter types, but selectors don't. To give you some idea of why this is a problem, consider this trivial program:
#import <Foundation/Foundation.h>
@interface A : NSObject
- (void)foo: (int)a;
@end
@interface B : A
- (void)foo: (float)a;
@end
@implementation A
- (void)foo: (int)a
{
printf("int: %d\n", a);
}
@end
@implementation B : A
- (void)foo: (float)a
{
printf("float: %f\n", a);
}
@end
int main(void)
{
A *a = [A new];
B *b = [B new];
[a foo: 12];
[b foo: 12];
a = b;
[a foo: 12];
return 0;
}
If you compile this - with GCC or clang - on OS X, it gives no errors, no warnings. When you run it, you get this output:
int: 12
float: 12.000000
float: 0.000000
The first two look sensible. The correct method is being called with the correct parameter. What about the third one? Whether the correct method is being called depends on how you interpret the Objective-C language, but the parameter is definitely wrong.
Why does this happen? The answer is quite simple. The compiler needs to know the types of the method to be able to construct the call frame correctly. You've told it that the receiver is an instance of class A, so it looks up the method in this class and finds the one that takes an integer as an argument. It therefore puts 12 into an integer register.
The called method, however, expects a float and so it looks in the first floating point register to find it. Nothing has touched these since the program started, so it finds 0. In this case, the result is quite benign. The program does the wrong thing, but not catastrophically. If one of the arguments had been a structure, or they differed in return types, you might have the program dereferencing an invalid pointer or simply corrupting the stack. Because this can lead to stack corruption, it has the potential to expose some fun security holes in any Objective-C program using an implementation that inherits this behaviour.
The problem here is that the compiler is making certain assumptions, but is not enforcing a check at run time that they are correct. When they are not, bad things happen.
This looks like a fairly contrived example, but it's actually relatively common. Remember that methods in Objective-C do not have to be declared publicly. If -foo: in A was not in the interface, someone might easily create a definition of B as a subclass of A which added a new -foo: method with a different set of types. They might then pass a pointer to an instance of B to something expecting an instance of A (which, after all, is one of the things that inheritance is meant to let you do). This object might then send a -foo: message to the object it receives and suddenly the stack is corrupted.
With libobjc2, I've been working on a solution to this, which is now working correctly. Before I explain how it works, a little bit of background:
The GNUstep runtime (libobjc2) is designed to be a drop-in replacement for the old GNU runtime. The GNU runtime did not copy this design decision. Selectors in the GNU runtime contain both a name and a type encoding. This was done to make distributed objects faster - the caller knows the types of the method (in theory, at least), so you don't need a round-trip over the network to look them up. Unfortunately, for compatibility with NeXT, the types were not used for message lookup.
The latest version of libobjc2 now supports type dependent dispatch. With this enabled (it isn't by default, but it will be soon), the types are also used for message lookup. This means that, when we run the example program, we get this output:
int: 12
float: 12.000000
int: 12
The method that has a different type signature does not override the old one. This interoperates with old code safely too - if you call a method using a selector with no type encoding, it looks up the same method that the old runtime would return. This change does not require any modification to the ABI - you can still use it with code compiled with GCC, targeting the old GNU runtime.
Currently, if you call a method with an incorrect signature, it logs a warning. I've fixed a few subtle bugs in Étoilé and GNUstep that this has uncovered. The next thing on my TODO list is to add a callback that runs whenever the lookup returns mismatched selector types. This will allow us to do some very clever things with LanguageKit. For example, we can dynamically fix up these errors at run time by adding a method to the class that performs auto boxing or unboxing and then calls the correct method.
It's also worth noting that, when the types don't match, this uses the same lookup mechanism as type-independent dispatch. Methods are still looked up as a single sparse array lookup with the selector as the index. There is a slight space penalty, because both the typed and untyped version of the selector are added to the dispatch table as keys whenever a method is added, and registering selectors is slightly more expensive, but that's all.
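To make the lookup rule concrete, here is a minimal sketch in plain C. It is not libobjc2's code: every name in it is hypothetical, the type encodings are simplified, and a small linear table stands in for the sparse array. It only shows the idea that adding a method registers both a typed and an untyped key, and that a lookup whose types don't match falls back to the untyped key.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: none of these names come from libobjc2. */
typedef const char *(*method_fn)(void);

struct entry {
    const char *name;   /* selector name, e.g. "foo:" */
    const char *types;  /* type encoding, or NULL for the untyped key */
    method_fn method;
};

#define MAX_ENTRIES 16

struct dispatch_table {
    struct entry entries[MAX_ENTRIES];
    int count;
};

static int same_key(const struct entry *e, const char *name, const char *types)
{
    if (strcmp(e->name, name) != 0) return 0;
    if (e->types == NULL || types == NULL) return e->types == types;
    return strcmp(e->types, types) == 0;
}

/* Insert a key, or replace the method if the exact key already exists. */
static void set_entry(struct dispatch_table *t, const char *name,
                      const char *types, method_fn method)
{
    for (int i = 0; i < t->count; i++) {
        if (same_key(&t->entries[i], name, types)) {
            t->entries[i].method = method;
            return;
        }
    }
    assert(t->count < MAX_ENTRIES);
    t->entries[t->count].name = name;
    t->entries[t->count].types = types;
    t->entries[t->count].method = method;
    t->count++;
}

/* Adding a method registers both the typed and the untyped selector. */
static void add_method(struct dispatch_table *t, const char *name,
                       const char *types, method_fn method)
{
    set_entry(t, name, types, method);
    set_entry(t, name, NULL, method);
}

static method_fn find(const struct dispatch_table *t, const char *name,
                      const char *types)
{
    for (int i = 0; i < t->count; i++) {
        if (same_key(&t->entries[i], name, types)) return t->entries[i].method;
    }
    return NULL;
}

/* Typed lookup first; on a type mismatch, fall back to the untyped key. */
static method_fn lookup(const struct dispatch_table *t, const char *name,
                        const char *types)
{
    method_fn m = find(t, name, types);
    if (m == NULL && types != NULL) m = find(t, name, NULL);
    return m;
}

static const char *a_foo(void) { return "int"; }
static const char *b_foo(void) { return "float"; }

/* Table for class B from the example: A's method first, then B's. */
static struct dispatch_table make_b_table(void)
{
    struct dispatch_table t = { .count = 0 };
    add_method(&t, "foo:", "v@:i", a_foo);
    add_method(&t, "foo:", "v@:f", b_foo);
    return t;
}
```

With this table, a lookup of foo: with the int encoding still finds A's method, while an untyped lookup finds B's, which is the method an untyped runtime would return.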
GSoC Progress: DBusKit
As David pointed out earlier, both Eric and I got the opportunity to participate in Google's Summer of Code programme with projects for GNUstep. Since GSoC "midterms" are just over, I'd like to talk a bit about the progress I've been making.
D-Bus and Distributed Objects
My project for this summer is to bring D-Bus support to GNUstep. D-Bus is, as you might know, an IPC mechanism that has been widely adopted on the *nix desktop: If you are sitting in front of a Gnome or KDE desktop, you most certainly also have D-Bus installed. GNUstep has traditionally had a different, but very powerful means of doing IPC that is called "Distributed Objects" (DO, which in fact dates back to the OpenStep spec). It allows you to refer to a distant object (running in a different process) just like you would to a local object. You can just get a proxy to the remote object by calling the appropriate method on NSConnection and just send messages to it as usual:
id thermometer = [NSConnection rootProxyForConnectionWithRegisteredName: @"TemperatureServer"
host: @""];
temp = [thermometer temperature];
That's just it. You can even use every return value from the root proxy just as you would normally. Actually the system is very intelligent about that: If process A accesses an object from process B and that object returns a remote object it got from A, DO will be smart enough not to do a round trip A→B→A, but will return the local object instead. Thus, for native IPC needs, GNUstep already has everything one could wish for.
But D-Bus support is still a useful thing for GNUstep because many services on a modern *nix desktop are exposed via D-Bus: If you re-configure your WiFi card on the fly, it's probably through NetworkManager, and if your media player inhibits the power saving functions of your laptop when watching a movie, this is probably done through upower (or HAL). All these services use D-Bus to provide their functions to other applications.
Using D-Bus from Objective-C
The goal of my project is to make accessing and providing D-Bus services from within an Objective-C application as easy as using Distributed Objects. So you would just do
id thermometer = [NSConnection rootProxyForDBusConnectionWithRegisteredName: @"org.foo.temperature"
bus: DKDBusSystemBus];
And while I'm not quite there yet, the DBusKit framework that provides D-Bus support is starting to get useful. It already manages basic method invocations on D-Bus objects, although that support is at present limited to the object at the root of an object graph. For example, you can get a list of all names registered on D-Bus with the following lines of code:
NSConnection *conn = [NSConnection connectionWithReceivePort: [DKPort port]
sendPort: [[DKPort alloc] initWithRemote: @"org.freedesktop.DBus"]];
id dbus = [conn rootProxy];
NSArray *names = [dbus ListNames];
The present implementation is very flexible in the way it interacts with D-Bus. For example, all the following method declarations are valid ways to invoke the D-Bus method NameHasOwner() on the org.freedesktop.DBus object:
- (NSNumber*)NameHasOwner: (NSString*)name;
- (NSNumber*)NameHasOwner: (char*)name;
- (BOOL)NameHasOwner: (NSString*)name;
- (BOOL)NameHasOwner: (char*)name;
So you are free to choose whether you want plain C types or Objective-C objects (though this isn't possible for collection D-Bus values yet: these will always be returned as objects). Of course you have to provide the method signatures for the D-Bus methods you want to call (e.g. in a protocol declaration), but in the future I will provide a little tool that will do that for you.
Outlook
So what remains to do for the second half of GSoC? Apart from the better integration, I will tackle the task of exposing Objective-C objects on D-Bus. This will involve designing a tool that allows you to generate annotated D-Bus introspection data from Objective-C interface or protocol declarations so that you can easily customise what methods an object will expose via D-Bus. Also D-Bus supports not only method calls, but also signals and properties, both of which can easily be mapped to notifications and properties on the Objective-C side. So stay tuned for updates on D-Bus support in GNUstep.
Further reading
If you want to play around with the code, you can find it in the modules/dev-libs/dbuskit directory of the GNUstep SVN tree. This also includes an example that exposes the Apertium machine-translation system (which has a D-Bus interface) as a GNUstep service. This is accessible from every GNUstep application and allows you to translate selected text on the fly. Information about the Distributed Objects system can be found in the GNUstep Base Programming Manual, the Apple documentation or a tutorial by Nicola Pero. If you want to learn more about D-Bus, Dan Williams has an excellent introductory post about it.
More Optimization
One of the things that is traditionally very slow in Objective-C is sending messages to classes. When you do something like:
[NSMutableArray new];
The compiler expands it to something roughly like this:
Class receiver = objc_lookup_class("NSMutableArray");
SEL new = @selector(new);
IMP method = objc_msg_lookup(receiver, new);
method(receiver, new);
There are several causes of overhead here. The first is the class lookup. In the new runtime, the class table is implemented as a hopscotch hash, which is relatively fast, but a lookup still requires hashing the string, and looking it up in the table. This accounts for the majority of the cost of a class message send.
The second bit of overhead is the message lookup. The new runtime uses the objc_msg_lookup_sender() function, which has a slightly different signature. As I wrote yesterday, you can cache the return value from this call, so we can save a lot of the overhead involved.
The final part is the overhead involved in constructing the call frame for the method and jumping there. This is present even when calling C functions. Overall, this adds up. Sending a million class messages took 56 seconds of CPU time on my machine.
With the new ABI, classes are exported as a public symbol. This means that we can save the cost of the class lookup, as long as that symbol is available. One of the optimizations I committed to the libobjc2 tree today substitutes that symbol for the call to objc_lookup_class(). With that, the cost of a million class messages drops to a bit over 10 seconds. Not bad.
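As a rough illustration of what that substitution buys, the sketch below compares finding a class by name with referring to its symbol directly. All of the names are hypothetical stand-ins, and a linear scan takes the place of the runtime's hash table; the point is only that the second form needs no string work at run time.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a class structure. */
struct class_t {
    const char *name;
};

/* Stands in for a class exported as a public symbol. */
struct class_t NSMutableArray_class = { "NSMutableArray" };
static struct class_t NSObject_class = { "NSObject" };

static struct class_t *class_table[] = {
    &NSObject_class,
    &NSMutableArray_class,
};

/* Slow path: search by name (the real table hashes the string instead). */
static struct class_t *lookup_class(const char *name)
{
    size_t n = sizeof class_table / sizeof class_table[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(class_table[i]->name, name) == 0) return class_table[i];
    }
    return NULL;
}

/* What the unoptimized code does for every class message. */
static struct class_t *receiver_by_lookup(void)
{
    return lookup_class("NSMutableArray");
}

/* What the optimized code does: the linker has already resolved the address. */
static struct class_t *receiver_by_symbol(void)
{
    return &NSMutableArray_class;
}
```

Both functions yield the same receiver; the second one has no run-time cost beyond loading an address.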
Yesterday, I talked about caching the message lookup for message sends in loops. Class messages always have the same receiver, so they're also a good choice for caching. Another pass that I added today caches all class message sends. Now we're down to only a bit over 4 seconds for a million message sends.
What about the cost of the function? In C, for a small function, we could inline it, but this isn't an option for Objective-C because of the dynamic dispatch. Or is it? The final pass that I added does speculative inlining. This means that it inlines the function that it guesses will be called, and wraps it in a test. If the (cached) lookup returns the function that we are expecting, we go down the inlined path. If not, we call the returned function pointer. The current pass always inlines class methods if possible, but I'll change that soon so that it only inlines them if it's also sensible.
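In C terms, the transformation looks roughly like the sketch below. The names are made up for illustration, and the real pass works on LLVM IR rather than source, but the shape is the same: compare the looked-up pointer with the method we guessed, run the inlined copy if they match, and fall back to an ordinary indirect call if they don't.

```c
/* Hypothetical sketch of speculative inlining at one call site. */
typedef int (*imp_t)(int);

static int twice(int x) { return x * 2; }   /* the method we guess will be called */
static int thrice(int x) { return x * 3; }  /* a replacement installed later */

/* Stands in for the (cached) result of the method lookup. */
static imp_t current_imp = twice;

/* Counts how often the inlined path was taken, for illustration only. */
static int inlined_hits = 0;

static int send_speculative(int arg)
{
    imp_t imp = current_imp;
    if (imp == twice) {
        /* Guess was right: run the inlined body of twice(), no call needed. */
        inlined_hits++;
        return arg * 2;
    }
    /* Guess was wrong: call whatever the lookup returned. */
    return imp(arg);
}
```

If something later replaces the method (here, by assigning thrice to current_imp), the test fails and the call site still behaves correctly, just without the inlining benefit.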
With speculative inlining, we're now down to 2 seconds for a million class messages. For comparison, a million C function calls took 3 seconds on the same machine.
That's the sort of performance I'm aiming for. And, because these optimizations are all done at the LLVM layer, they will work with both Objective-C and Smalltalk. They depend on libobjc2, although it should be possible to implement something similar for Apple's runtime (but not for the old GNU runtime).
Google Summer of Code
To start with, the bad news (which is not really news anymore) is that Étoilé was, once again, not accepted for the GSoC this year and, unfortunately, neither was GNUstep. The good news is that the GNU Project was, and GNUstep was allowed to participate under this umbrella.
This is really great for Étoilé: both of the students accepted to work on GNUstep this summer are active Étoilé contributors.
Niels is going to be mentored by Fred Kiefer, the GNUstep AppKit maintainer, and will be working on the DBUS to Distributed Objects bridge. For those unfamiliar with DBUS, it's an ugly, badly designed, copy of Distributed Objects, created about a decade and a half later. At the protocol level, it contains roughly the same information as Distributed Objects. With a working bridge, we'll be able to expose our objects as DBUS services and use DBUS services as objects, without any extra code.
The other project is the one that I get to mentor. Eric is continuing his excellent work on Opal, which is the CoreGraphics implementation that sat unloved in Étoilé svn for a year or two before being moved to GNUstep. Eric's already implemented shadows and layers, and the next step is improving the text rendering support. Shadows are not essential, although they do make lots of things look nicer.
Layers are essential if we want to implement CoreAnimation, which (as well as making things look pretty) has a rendering model designed to be efficient on modern hardware. NeXT-era graphics hardware had very little RAM and caching rendered images was not really feasible. Now, even a handheld has more video RAM than a NeXT workstation had total RAM (and a much smaller screen), but redrawing things consumes CPU or GPU power and shortens the battery life, so it makes more sense to keep as much cached as possible. The layer model in CoreGraphics and CoreAnimation lets each view (potentially) have its own buffer, so redrawing just means compositing that layer onto the window; something a relatively recent GPU can do with no effort.
While Eric's working on CoreGraphics, I plan on tidying up some of the typography code that I've been playing with over the last few years and working on a CoreText implementation on top of Opal.
These two projects have a lot of potential to improve the foundations that Étoilé is built upon.
Smalltalk and Objective-C Performance
If you find yourself optimizing your code, then it means that the author of your compiler has failed. In Étoilé, we use two languages: Smalltalk and Objective-C. In theory, we can also use EScript (a dialect of ECMAScript using LanguageKit), but I don't think anyone ever has, aside from a couple of examples.
Clang, which compiles Objective-C, and LanguageKit, which compiles Smalltalk, both use LLVM for code generation. This means that they both produce an intermediate representation in the same form. LLVM provides a lot of infrastructure for transforming this intermediate representation, which is how you implement compiler optimizations.
As part of the libobjc2 project, I've been writing a few of these that speed up code targeting the new runtime (which both Clang and LanguageKit do). The first of these passes is very simple. The new runtime adds support for non-fragile instance variables. With older Objective-C implementations, instance variables were accessed via a fixed offset. This is nice and fast, but it means that, if you modify one class's instance variable layout (including just adding an ivar), then you must recompile all subclasses.
With non-fragile ivars, you access all instance variables via an indirection variable. This records the offset of the ivar, and is fixed up by the runtime when the class is loaded. This means that you always get the right offset, even if other ivars are rearranged. This is great, but sometimes you don't actually need it. If your class inherits directly from NSObject, for example, or from intermediate classes declared in the same library (which, it turns out, includes about 90% of classes), this extra overhead is for no benefit, because you will be recompiling the subclasses when you recompile the class anyway.
The ivar lowering pass reverses this. It introduces hard-coded ivar offsets if it is safe to do so without increasing the fragility of the library. This means that you only get the performance penalty from non-fragile ivars if you actually need them, for example when you subclass something from a third-party library.
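The difference between the two access styles can be sketched in plain C. This is only an analogy with hypothetical names: structs stand in for object layouts, and a function called by hand stands in for the fix-up the runtime performs at class load time.

```c
#include <stddef.h>

/* Hypothetical layouts: Base is the superclass, Derived adds one ivar. */
struct Base {
    void *isa;
    int a;
};

struct Derived {
    struct Base base;
    int value;
};

/* Indirection variable: holds the ivar's offset, filled in at load time. */
static ptrdiff_t Derived_value_offset;

/* Stands in for the runtime fixing up offsets when the class is loaded. */
static void fix_up_offsets(void)
{
    Derived_value_offset = offsetof(struct Derived, value);
}

/* Non-fragile access: load the offset from the variable, then the ivar. */
static int read_value_indirect(const struct Derived *obj)
{
    return *(const int *)((const char *)obj + Derived_value_offset);
}

/* Lowered access: the offset is a compile-time constant in the code. */
static int read_value_direct(const struct Derived *obj)
{
    return *(const int *)((const char *)obj + offsetof(struct Derived, value));
}
```

The indirect version costs one extra load per access; the lowering pass is what turns it into the direct version when the superclass layout cannot change behind the library's back.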
This isn't especially interesting. The extra cost of indirect ivar accesses is really small. You're only likely to notice it if you are doing a huge number of ivar accesses in different objects and your cache is full.
Message sends, on the other hand, have a big impact on performance. Every time you send a message in Objective-C or Smalltalk, you need to do a dynamic lookup to find the method to call, then you call the method in the same way that you call a C function. Both of these have some cost.
One way you can work around this in code is IMP caching, where you perform the lookup yourself, store the result, then call it as a function pointer. The compiler couldn't do this itself, because caching the IMP (instance method pointer - the function pointer for the method) broke some of the dynamic features of Objective-C. If you changed the selector-to-method mapping (either explicitly via runtime functions or by loading a category) then the cache became invalid and there was no mechanism for the runtime to invalidate it.
This changed with the new runtime. Now, the method lookup function returns a slot, which can be safely cached and invalidated later. The new pass makes use of this to automatically cache slots for message sends that happen in loops. To test it, I wrote a simple benchmark program that sends a message 1,000,000,000 times. The method does nothing, it just returns immediately, so all of the time is spent in the message lookup and sending.
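One way to picture a cache that can be invalidated is a slot carrying a version number that the call site compares before reusing its cached pointer. The sketch below is an assumption-laden illustration in plain C, not the runtime's actual structures or API; every name in it is hypothetical.

```c
#include <stddef.h>

typedef int (*imp_t)(void);

/* Hypothetical slot: a method pointer plus a version bumped on every change. */
struct slot {
    int version;
    imp_t method;
};

static int first_impl(void) { return 1; }
static int second_impl(void) { return 2; }

static struct slot the_slot = { 0, first_impl };

/* Counts full lookups so the effect of caching is visible. */
static int lookups = 0;

/* Stands in for the (expensive) dynamic lookup. */
static struct slot *lookup_slot(void)
{
    lookups++;
    return &the_slot;
}

/* Replacing a method bumps the version, which makes cached copies stale. */
static void replace_method(imp_t new_method)
{
    the_slot.method = new_method;
    the_slot.version++;
}

/* Per-call-site cache, as the pass might create for a send inside a loop. */
struct call_site_cache {
    struct slot *slot;
    int version;
    imp_t method;
};

static int send_cached(struct call_site_cache *c)
{
    if (c->slot == NULL || c->version != c->slot->version) {
        /* Cache is empty or stale: do the full lookup again. */
        c->slot = lookup_slot();
        c->version = c->slot->version;
        c->method = c->slot->method;
    }
    return c->method();
}
```

Repeated sends through the same cache perform a single lookup until replace_method() is called, at which point the next send notices the version change and looks the method up again.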
Unoptimized, this takes 10 seconds. With the auto-caching pass, it takes around 4.6 seconds. Adding in the normal set of optimizations, these times drop to 8 and 3 seconds, respectively. For reference, replacing the message send with a C function call makes the time 3.5 seconds, so Objective-C is very, very close to raw C performance in this case. Note that if the function is in a different library, as it often is, you have to go via a relocation table, which brings the speeds much closer, and can even make the cached Objective-C version faster.
This isn't the end, of course. One thing that you can do easily in C/C++, but not Objective-C, is inline a function. This involves replacing a call to a function with a copy of the function body. This eliminates the call overhead and lets you do some other optimizations easily, like constant and subexpression propagation between functions.
We can do inlining with Objective-C too, in theory, but it has to be speculative. We can inline methods, then wrap the inlined version in a test that checks that it really is the correct one. That's next on the list.
We probably can't get Smalltalk quite as fast as C, but if it's within 10-20%, there's very little reason not to use it.
EtoileText, LaTeX, and HTML
In a dramatic break with tradition, I've recently been working on some things in Étoilé that are actually useful. When I'm not hacking on Étoilé, the thing I'm actually paid for is writing. I've had two books published, and the third one is currently undergoing technical review.
When I write, I like to use a custom form of semantic markup, using LaTeX syntax. LaTeX, for those who have not encountered it, is the abomination that caused me to lose all respect for Donald Knuth. There is simply no excuse for anyone to design a programming language that does not include concepts like scoping, or any support for structured programming. LaTeX is a programming language that looks like someone (Knuth) thought a Turing Machine was actually a sane programming model, rather than a useful theoretical tool.
So why do I use LaTeX? Two reasons. One is that TeX-style markup is easy to type. Entering XML markup is too distracting, but TeX is very simple and quick to enter, so doesn't interrupt my flow. The other is that the output is really beautiful.
I don't really use LaTeX though, I use a set of custom macros built on top of it, just as LaTeX itself is built on top of TeX. For example, if I write \keyword{Smalltalk}, then the resulting file will have Smalltalk written in the keyword style and also added to the index. If I write \code{NSObject} then NSObject is emitted as syntax-highlighted Objective-C code.
I have a few hundred lines of LaTeX code that does this translation. It's the only way to use LaTeX and remain (moderately) sane. Like Lisp, LaTeX is a language for writing languages, rather than a language for using directly.
For my next book, however, the rise of the ePub format (another horrible format, but that's another issue) means that the publisher wants an HTML version of the text.
There are some good tools for mapping LaTeX to XHTML. The best of these is tex4ht, which runs a full TeX virtual machine and then transforms the output into HTML. The problem with this is that it loses all of my nice semantic markup, and styling it is a problem.
I want to be able to define a mapping from my TeX-style semantic markup to XHTML semantic markup. For example, I want \code{NSObject} to become <span class="code">NSObject</span>.
This is where EtoileText fits in. EtoileText is the framework that I've been working on a bit over the last couple of months for editing structured, semantically tagged, rich text. It maintains something conceptually similar to a DOM tree. Text is only stored in leaf nodes of the tree. Each parent node may contain a semantic type and custom presentation attributes.
You can define translators that map the semantic types to AppKit presentation attributes, and plug it directly into NSTextView. Alternatively, you can use the visitor API to generate some other output. The latter is exactly what I've been doing.
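To give a feel for the tree-plus-visitor idea, here is a small sketch in plain C. It is not the EtoileText API: the node structure and function names are invented for illustration, every semantic type is mapped to a span, and no XHTML escaping is attempted.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical node: text lives only in leaves, parents carry a semantic type. */
struct node {
    const char *type;               /* semantic type, NULL for a leaf */
    const char *text;               /* text, NULL for a parent */
    const struct node *children[4];
    int child_count;
};

/* Visit the tree depth-first, appending XHTML to the string in out. */
static void emit_xhtml(const struct node *n, char *out, size_t cap)
{
    size_t used = strlen(out);
    if (n->text != NULL) {
        snprintf(out + used, cap - used, "%s", n->text);
        return;
    }
    snprintf(out + used, cap - used, "<span class=\"%s\">", n->type);
    for (int i = 0; i < n->child_count; i++) {
        emit_xhtml(n->children[i], out, cap);
    }
    used = strlen(out);
    snprintf(out + used, cap - used, "</span>");
}

/* Builds the tree for "Subclass \code{NSObject}" and renders it into out. */
static void render_example(char *out, size_t cap)
{
    struct node prefix = { NULL, "Subclass ", { NULL }, 0 };
    struct node name = { NULL, "NSObject", { NULL }, 0 };
    struct node code = { "code", NULL, { &name }, 1 };
    struct node para = { "paragraph", NULL, { &prefix, &code }, 2 };

    if (cap == 0) return;
    out[0] = '\0';
    emit_xhtml(&para, out, cap);
}
```

Swapping emit_xhtml() for a different visitor would produce a different output format from the same tree, which is the property that makes the semantic markup worth preserving.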
I've added a simple (and incomplete) parser for TeX-style markup, which constructs an EtoileText tree from my LaTeX sources, preserving all of my semantic tagging. Then, a visitor walks this tree and emits semantic XHTML. Add to this about half a dozen lines of CSS, and you end up with an XHTML version of the LaTeX sources that looks correct: Not exactly the same as the PDF, but how the markup would look if it were typeset for a browser window instead of a printed page. This uses the ETXMLWriter class from
The TeX parsing classes in EtoileText are not meant to be a general TeX parser (this is actually a non-computable problem; reason #135 why I hate LaTeX). They are intended as a set of tools for building special-purpose parsers for semantic markup languages implemented in TeX. The ones that start with the ETTeX prefix correspond to standard LaTeX commands, the ones with the TRTeX prefix correspond to my own set of macros.
The code is very rough around the edges at the moment. When it's a little more complete, I'll split the TR* classes out into a separate example program.
All of these classes are designed to be usable easily from Smalltalk, which means that you can use Smalltalk to extend it, just as you can extend LaTeX with macros.
Over the summer, I hope to spend some time working on a CoreText implementation and improvements to the GNUstep text system, so hopefully we'll be able to produce typeset PDFs from EtoileText trees that look as beautiful as anything that LaTeX produces by the end of the year. Nicolas keeps promising to work on a structured text editing component for Étoilé too, so hopefully I'll be able to work entirely inside Étoilé soon...
LanguageKit SmallInt Improvements
When you use a selector name or instance variable that was declared in Objective-C from LanguageKit, you inherit its types. This is very important for interoperability. If you wrote an object pointer into an instance variable declared as int, or wrote a small integer value into an instance variable declared as an id, then the next time Objective-C code tried to access this value you'd get a crash.
When I started working on LanguageKit, I followed the Objective-C rules that everything was an id until proven otherwise. SmallInts were stored on the stack, but never passed as arguments to methods, never returned from methods, and never stored in instance variables.
This condition is now relaxed somewhat. Instance variables declared in LanguageKit code are now allowed to contain SmallInts. Methods that are not present in Objective-C may take SmallInts as arguments and may return them. This means that LanguageKit is now doing a lot less pointless BigInt creation.
What does this mean in terms of performance? Let's return to the trusty Fibonacci benchmark to find out. This calculates the 30th Fibonacci number, 100 times in a loop. I've implemented this in C, Objective-C, and Smalltalk. Now, the Smalltalk version comes in two flavours. The first always returns an int, so values that won't fit in an int will be truncated (as with Objective-C). The second returns an LKObject, so small integers will be hidden inside a pointer, big integers will be returned in BigInt instances. The performance numbers are:
C fibonacci execution took 2.351562 seconds.
ObjC fibonacci execution took 6.601562 seconds.
Ratio: 2.807309 (to C)
Smalltalk fibonacci execution took 8.750000 seconds.
Ratio: 1.325444 (to Objective-C)
Smalltalk fibonacci SmallInt version execution took 5.687500 seconds.
Ratio: 0.861538 (to Objective-C)
Note that these were done in a VM, and the timing results are slightly wonky. On subsequent runs, the ratios remained roughly constant, but the absolute times varied by up to 50% in either direction. Although the last line looks like the Smalltalk code is faster than Objective-C (which would be very nice), it's quite unlikely that this is really the case. Please don't start citing this blog as claiming that Smalltalk is faster than Objective-C.
The take home message, however, is that using LanguageKit SmallInts is sufficiently close in terms of performance to using C ints that it's difficult to accurately measure. Using SmallInts also makes you safe from overflow. The Smalltalk version will just get really slow when you can no longer fit the value in a SmallInt. The C and Objective-C versions will start giving you the wrong answer.
Perhaps more interesting is the fact that the Smalltalk version that was shoehorned into the Objective-C type system was slower. The extra overhead of converting to and from ints was noticeable.
Running it several times, I got one result where the SmallInt version took 20% longer than the Objective-C version. This was the worst case result for LanguageKit, but is probably the most representative of real-world performance. If you can afford using 20% more cycles, and aren't doing anything floating-point intensive (LanguageKit's floating point performance still sucks), then there isn't much reason to choose Objective-C over Smalltalk.
It's worth noting that the C version was much faster than either. This kind of recursive call is the absolute best case for polymorphic inline caching. The new runtime supports this optimisation, but I haven't added it to the compiler yet. I'm planning on doing it via an LLVM optimisation pass, so both Objective-C (compiled with clang) and Smalltalk will benefit. This should remove most of the cost of the method lookup, and even allow speculative inlining of calls. I expect that we can get Smalltalk performance closer to 5 seconds for this benchmark.
As always, this benchmark needs to come with a reminder that using a less stupid algorithm is orders of magnitude faster and a sensible Fibonacci implementation is faster even in the LanguageKit interpreter than a stupid algorithm in C. Good algorithms with bad compilers always beat good compilers with bad algorithms.
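For reference, the two kinds of algorithm look something like this in C. The benchmark's actual source isn't shown here, so treat these as generic textbook versions: the naive one makes an exponential number of calls (which is exactly why it is a good stress test for call overhead), while the iterative one needs only n additions.

```c
#include <stdint.h>

/* The "stupid" algorithm: exponential number of recursive calls. */
static uint64_t fib_naive(unsigned n)
{
    return n < 2 ? n : fib_naive(n - 1) + fib_naive(n - 2);
}

/* A sensible algorithm: one addition per step, no recursion. */
static uint64_t fib_iterative(unsigned n)
{
    uint64_t a = 0; /* F(i)   */
    uint64_t b = 1; /* F(i+1) */
    for (unsigned i = 0; i < n; i++) {
        uint64_t next = a + b;
        a = b;
        b = next;
    }
    return a;
}
```

Both return the same values; only the amount of work differs, and that gap grows far faster than any constant factor a compiler can win back.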
OMeta and Benchmarks
This week, Günther committed his implementation of OMeta. OMeta is a shiny way of writing parsers from the Viewpoints Research Institute. It's a really simple way of writing domain specific languages and it's great to have an implementation in Étoilé. It will be fun to see what people will use it for. Eric is planning on writing a Smalltalk parser in OMeta, so we can have a completely self-hosting parser.
Glancing over the code, I noticed that Günther had implemented quite a few things in a category on BigInt. This is how you add operations to integers with LanguageKit, and all of the ones he'd added looked useful. I moved this into the SmallInt and BigInt implementations supplied with the compiler, so now they execute in (very fast) inlined C functions if the receiver is a small integer, rather than in (slower) Smalltalk.
While I was hacking on BigInt, I also added an implementation of a class that boxes floating point values and Eric added support in the parser for floating point literals. You can now use floating point values in Smalltalk, although they are quite slow. I'll probably work on optimising this a bit later, but I can't really be bothered now, because you can just write performance-critical floating point code in an Objective-C method if it's not fast enough in Smalltalk.
Since OMeta is a fairly complex piece of code, it seemed like a good thing to use for some real-world benchmarking of LanguageKit. Unfortunately, this is where I started to hit problems. It ran fine with the JIT compiler, but not the JTL or interpreter.
I spent quite a while hunting bugs in the interpreter - you can see the svn log for details if you care about them - and then moved on to the JTL. It turned out that the problem with the JTL was an LLVM bug, not a LanguageKit problem. Linking together LLVM bitcode modules containing global aliases that pointed to bitcasts was broken. I've now fixed that, so it's worth upgrading LLVM to r93052 or later if you want LanguageKit to work properly.
After that, I could run Günther's OMeta tests. You can see a summary of the results in this table:
Measurement (seconds) | Interpreter | JIT Compiler (Debug) | JIT Compiler (Release) | JTL Compiler |
Wall clock time | 1.7 | 22 | 3.0 | 0.44 |
User time | 1.0 | 16 | 2.0 | 0.22 |
Smalltalk time | 0.96 | 0.023 | 0.023 | ? |
The wall clock time and user time are reported by the time utility when running edlc. The Smalltalk time is the time reported by edlc as the time taken for the SmalltalkTool class's -run method to complete. This is not reported by the JTL because the code is run by the bundle loader in LanguageKit and can't be separated out by the tool.
As you can see, JIT-compiled code is about 41 times faster than interpreted code. We can probably make the interpreter a bit faster, because it's currently quite a naïve implementation, but given that we can get a big performance benefit just from using the JIT, it might not be worth it.
The JIT has a long start-up time. Most of this time is spent by LLVM optimisation passes. Note the difference between the debug and release LLVM builds. Disabling all of the assert() statements in LLVM and enabling compiler optimisations when building the JIT makes a huge difference to the performance; going from 16 to two seconds of CPU time. I didn't do this when I originally posted this entry because I always forget that I'm using a debug build (I hack on LLVM as well as Étoilé). With a release build, it's significantly faster. Slightly slower than the interpreter overall, but for longer-running programs this will go away.
One of the things we can do now is use the interpreter by default and then compile and install methods in the background, and only once they have been run a few times.
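The policy could look something like the following sketch, written in plain C with invented names. A call counter decides when a method is "hot"; here the switch to compiled code happens immediately on reaching the threshold, whereas the real thing would hand the method to a background thread and keep interpreting until the compiled version is installed.

```c
#include <stddef.h>

typedef int (*compiled_fn)(int);

/* Hypothetical per-method state. */
struct method {
    int call_count;        /* how many times it has been interpreted */
    compiled_fn compiled;  /* NULL until a compiled version is installed */
};

/* Assumed threshold: compile after this many interpreted runs. */
#define COMPILE_THRESHOLD 3

/* Stand-ins for interpreting the method and for its compiled form. */
static int interpret_square(int x) { return x * x; }
static int compiled_square(int x) { return x * x; }

/* Counters so the switch-over is observable. */
static int interpreted_runs = 0;
static int compiled_runs = 0;

static int call_method(struct method *m, int arg)
{
    if (m->compiled != NULL) {
        compiled_runs++;
        return m->compiled(arg);
    }
    interpreted_runs++;
    if (++m->call_count >= COMPILE_THRESHOLD) {
        /* Hot enough: install the compiled version for later calls. */
        m->compiled = compiled_square;
    }
    return interpret_square(arg);
}
```

Code that runs once never pays for compilation, and code that runs often stops paying for interpretation after the first few calls.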
Günther left a comment saying that it might be worth rewriting the OMeta stuff in Objective-C, but given that the tests take a fraction of a second to run I don't think that's particularly worthwhile.
Smalltalk Workspace and other LanguageKit News
It's been a while since I talked about LanguageKit, so I'd like to take this opportunity to introduce a few new features. One is an improvement to error reporting. Previously, warnings were sent to the standard error, while errors were reported via an exception.
These were fine for the command-line compiler tool, but not very flexible. One thing that I've been meaning to do for ages, and which I finally did just before Christmas, was to factor this out into a delegate. All errors and warnings are now sent to the delegate. This lets you do some quite nice things, including repair the AST without reparsing. More on this later.
The other nice new feature is something that I did almost none of the work for: an interpreter. LanguageKit includes JIT and static compilation already, but interpreting has a few advantages over these approaches. If you only run a statement once, a slow interpreter is still faster than compiling and optimising the statement. The interpreter is very useful for development, because you can trivially change the code in an interpreter, for example. Eric wrote most of the interpreter code. Unlike the compiler, the interpreter runs on Cocoa, as well as GNUstep.
In future, I plan on incorporating profiling into the interpreter and JIT compiler. We can then transparently recompile code based on profiling information. This includes things like speculative inlining, which the new runtime makes possible.
Eric's combined these two features to create something new: a Smalltalk Workspace. This is a feature that most Smalltalk implementations have. You can type Smalltalk statements into a window, select them, and run them. You can now do that in Étoilé. Currently, this uses the interpreter. In future, things like blocks will automatically be JIT compiled if they are executed more than a few times.
If you want to play with it, you'll need the latest LanguageKit svn and the SmalltalkWorkspace from Developer/Services in svn. If you use ETTranscript for output, as most of the example scripts do, then you'll notice that it works correctly with the workspace, redirecting the output to the window, rather than sending it to the terminal. The -log method on objects still goes to the standard error.
Gratuitous Book Plug!
Today's blog post isn't really about Étoilé, so feel free not to read it. Instead, I'm going to talk about my new book: Cocoa Programming Developer's Handbook.
Étoilé is based on GNUstep, which implements the same set of core APIs as Apple's Cocoa. Both implement the OpenStep specification and both extend it in various ways. GNUstep tries to follow Cocoa's extensions and a number of GNUstep's extensions are available on OS X via the GNUstep Additions framework.
Apart from one appendix, covering porting Cocoa applications, the book does not contain much specific to GNUstep (although a few of the examples are based on Étoilé code). That doesn't mean that it's completely useless to GNUstep and Étoilé developers, however. Most chapters contain something relevant to people on other platforms:
1 Cocoa and Mac OS X talks about the history of OpenStep and Cocoa and where it fits into OS X.
2 Cocoa Language Options introduces Objective-C and discusses the choice of compilers. Most of this is applicable to GNUstep. I don't think there are Python bindings for GNUstep, but there are Java and Ruby bridges. For Étoilé, of course, you can use LanguageKit and write Smalltalk code.
3 Using Apple's Developer Tools introduces Xcode and Interface Builder, which are specific to OS X. The introduction to Objective-C and the conventions in Cocoa, which account for most of this chapter, are relevant to GNUstep developers.
4 Foundation: The Objective-C Standard Library introduces the Foundation framework. Aside from the section on Core Foundation, everything here should be relevant to GNUstep - I finished the NSCache implementation before the book was published - and support for Core Foundation APIs in GNUstep is under development.
5 Application Concepts introduces the core concepts in the AppKit. Everything here should apply to GNUstep-gui.
6 Creating Graphical User Interfaces covers more of AppKit. I've not tried using Cocoa Bindings, which this chapter explains, with GNUstep. The code looks as if it would work, but there is no interface in GORM for connecting them so you'd need to do that in code (or improve GORM).
7 Windows and Menus also mainly covers things that will work in GNUstep. Note that GNUstep doesn't support sheets (yet). The same APIs work, but you get window-modal dialog boxes instead.
8 Text in Cocoa talks about the Cocoa text system. Most of this is unchanged from OpenStep and should work with GNUstep. There are a few bits that won't. Rather embarrassingly, one of the examples didn't compile with GNUstep because of a typo in a GNUstep header (fixed now). Fred is doing some great work on the text system at the moment, so if any of it doesn't work then it probably will soon.
9 Creating Document-Driven Applications introduces the NSDocument family, which is very well supported by GNUstep. You can take the Apple developer examples that explore this part of the system and compile them with pbxbuild without any problems.
10 Core Data introduces the Core Data framework. GNUstep provides the gscoredata framework, which is intended to be a drop-in replacement. I've not used this (Étoilé's CoreObject is nicer), but apparently it works.
11 Working with Structured Data describes the more complex view classes in AppKit. NSCollectionView doesn't yet exist in GNUstep, but EtoileUI provides a nicer way of producing this kind of layout (and works on both GNUstep and Cocoa).
12 Dynamic Views talks about modifying the view hierarchy at run time. All of this is standard OpenStep stuff with the exception of the part talking about full-screen applications. This uses some low-level Quartz calls, which won't work with GNUstep.
13 Custom Views teaches you how to write your own views. The CoreGraphics stuff won't work with GNUstep yet, although we have a partial implementation of CoreGraphics in Étoilé svn and are trying to get it moved into GNUstep. The section on creating Interface Builder palettes may also not be directly applicable to GNUstep. Gorm supports palettes in a similar way, but I don't know how similar the interfaces are. This is only important if you are packaging a framework with new view objects for third-party developers and want to add a little polish.
14 Sound and Video contains a lot that is specific to OS X. None of the QuickTime stuff will work with GNUstep, although Étoilé's MediaKit fills the same rôle (and might work on OS X). I committed support for speech synthesis to GNUstep a while ago, but speech recognition is still missing. If anyone wants to wrap (Pocket)Sphinx and implement NSSpeechRecognizer before I get around to it, patches are most welcome!
15 Advanced Visual Effects covers a lot of stuff that is only available on OS X. CoreAnimation, CoreImage, and Quartz Composer only work on OS X. OpenGL works on any platform and there have been some patches added to GNUstep's NSOpenGLView recently by people porting Cocoa apps or writing OpenGL apps from scratch with GNUstep.
16 Supporting PDF and HTML covers PDFKit and WebKit. Gregory is currently working on porting WebKit to GNUstep, so these examples should work in the next few months. PopplerKit in Étoilé svn provides a lot of the same features as PDFKit, but is GPL'd. Writing a new PDF framework is on my TODO list, but it's a very long way down so don't hold your breath...
17 Searching and Filtering covers SearchKit and Spotlight, which are not available off OS X. Étoilé's LuceneKit, based on Apache Lucene, is very similar to SearchKit. The NSPredicate stuff in this chapter is also relevant to GNUstep, and Étoilé uses predicates in a few places.
18 Contacts, Calendars, and Secrets covers the Address Book, Calendar Store and Keychain APIs. We have an implementation of the Address Book APIs in svn, which we will be replacing soon with one built on top of CoreObject. The Calendar Store API is quite simple, but hasn't yet been reimplemented. I have an implementation of the subset of the Keychain API that I talk about in this chapter somewhere, which I'll commit once I've finished tidying it (it's difficult to do securely without Mach ports).
19 Pasteboards covers the pasteboard APIs. These are the same on GNUstep, although the 10.6 APIs (which are much, much, cleaner) aren't implemented yet. I like these APIs a lot, so I'll probably implement them soon.
20 Services covers the system services mechanism that Apple inherited from NeXT. This is one of my favourite features of GNUstep and works well. We have a few things that use it in svn.
21 Adding Scripting talks about AppleScript. If you read and understand these APIs, you'll know why no one has bothered reimplementing them. This chapter is around 30 pages. You can describe how to do the same thing with Étoilé's ScriptKit in about half a page.
22 Networking talks about sockets, NSStream, the URL loading system, distributed objects, and Bonjour. All of these should work with GNUstep, although I've not tested NSNetService.
23 Concurrency talks about various ways of writing parallel code. The NSOperation implementation in GNUstep is quite immature, so probably won't work. Grand Central works on FreeBSD, but not yet anywhere else.
24 Portable Cocoa is obviously relevant to GNUstep.
25 Advanced Tricks covers some tricks with the runtime system. Most of these work as-is on non-Apple platforms, and a few have example code for the GNU runtime.
A lot of the examples from the book use declared properties, so if you want to try them with GNUstep you'll need to use clang (and either libobjc2 or link Étoilé's Objective-C 2 framework). I found declared properties very useful in writing concise code for the examples and this provided me with the motivation to get them working on GNUstep (they do now).
InformIT sells both the printed copy and the (DRM-free PDF) eBook edition, while Amazon has the dead-tree edition for a bit less.
Compilers, Runtimes, and Web Apps
As regular readers will know, as well as hacking on Étoilé, I also maintain the GNUstep Objective-C runtime and associated support in clang. Recently, I started working on something that we discussed at the hackathon: web app integration.
Étoilé has lots of features that would make it convenient for writing web applications. The EtoileUI framework, for example, lets you specify user interfaces in quite an abstract way. We would like, for example, to be able to take the EtoileUI tree and turn this into a tree of web-based views, just as we currently turn it into a tree of AppKit views. This would make it trivial for all Étoilé user interfaces to have an 'export to web' menu button, so when you are away from your computer you can press this button and access your Étoilé programs from a remote browser.
With this support, it might also be nice to be able to write stand-alone web applications with Étoilé. That's more or less what I've been working on for the last week or so. I now have a set of classes that talk FastCGI to a web server and handle events and session management. This will be used to construct a set of MVC views that can be used from EtoileUI or directly.
While I was working on it, I got a bit bored writing accessor methods, so I decided to use clang, which supports declared properties. I've also been using the GNUstep runtime on my own machine for a while. When using clang, it made sense to use the new non-fragile ABI supported by this runtime.
Of course, doing so immediately broke everything. After a bit of bug fixing, in both clang and the runtime, I got the code working with the new ABI. This properly supports properties, non-fragile instance variables, and fast forwarding, among other improvements.
Clang is now able to compile most of Étoilé with the new ABI. This is backwards compatible with the old one, so you can still link against libraries (like GNUstep, which uses @defs and so doesn't support the non-fragile ABI) compiled with the GCC ABI. There are a few more bugs to fix, but then we should be able to fully support clang as a compiler for Étoilé. In the meantime, the GNUstep runtime is working as a drop-in replacement for the GCC runtime.
CoreObject over XMPP
One of the best things about hippyware development is when someone else implements something you'd planned to do. If you've been following this blog for a few years, you'll remember that one of the ideas I had very early on when I started working on EtoileSerialize (the low-level part of CoreObject) was to be able to send objects remotely for automatic whiteboarding of arbitrary objects. Today, in SILC, Niels announced:
Niels Grewe: I just send the first object over XMPP :-D
For the past few weeks, Niels has been working on tidying up some of the rough edges of XMPPKit and EtoileSerialize and joining them together. He has rewritten the XML serializer and added a deserializer.
At the hackathon this year, I added an ETXMLWriter class and protocol, which implement a similar set of APIs to the ETXMLParser delegate (this is great for testing, because you can plug the two together and just check that the input matches the output). This is now used by XMPPKit and is also used by Niels' new code in EtoileSerialize. This makes it relatively easy to join the two together, so you can embed serialized objects in an XMPP stream and send them to other users.
Although the code is still quite messy, and Niels is tidying it before committing, it has a lot of potential. Trivial whiteboarding from every Étoilé application is the most obvious use; we can send CoreObject invocations between users so they can both edit the same documents easily. Another exciting possibility is live backups. While you are editing a document, all of the changes you make are recorded to disk by CoreObject, so a power failure won't lose you anything except, maybe, the last couple of edits if your OS doesn't commit them to disk fast enough. But what happens if the disk dies, or you drop your laptop? With CoreObject streaming the changes over XMPP (which doesn't take a huge amount of bandwidth), you can start working again exactly where you left off, as soon as you find a new computer to work on.
I hope to put together some demos of this over the next few months. It probably won't make it into 0.4.2, but it should be in 0.5.
The Étoilé Runtime is dead, long live the GNUstep runtime!
A few years ago, I wrote a new Objective-C runtime, which had lots of new and exciting features, but one big disadvantage: it was not backwards compatible. Since then, I've been heavily involved in clang and have implemented support there for the GNU runtime. I added preliminary support for the new runtime, but it bitrotted before I finished it.
After looking at the GNUstep code, I came to the conclusion that updating it to support a new runtime would be very difficult. In particular, changing the type of IMP would have broken a lot of existing code.
Since then, I've also been working on ObjectiveC2.framework in Étoilé svn. This provides an implementation of Apple's new runtime API on top of the GNU runtime. This makes it easier to port code from OS X to GNUstep and also implements a few of the newer functions that clang is now adding calls to for Objective-C 2 support. These include bits of the language like fast enumeration (for...in loops), @synchronized(), and declared properties.
About a week ago, I started a new project. This has now been committed to the GNUstep repository as libobjc2. The new runtime starts as a fork of the GNU runtime, incorporates the improvements from the ObjectiveC2 framework, a number of ideas from the Étoilé runtime, and a little bit of new stuff.
From the GNU runtime we get a fairly reasonable implementation of the core Objective-C language. From the ObjectiveC2 framework we get most of the things required for Objective-C 2. From the Étoilé runtime we get a safe way of caching method pointers (so the compiler can now safely insert IMP caching and you probably never need to do it in code ever again) and a means of modifying the receiver of messages (so we can now implement incredibly fast proxies for things like CoreObject). The new stuff includes built-in support for Object Planes, support for non-fragile instance variables, introspection on declared properties and optional methods in protocols, and a few other things. The performance characteristics of the GNUstep runtime are similar to those of the Étoilé runtime; slightly slower than the GNU runtime for basic message sends, faster for cached message sends, and much faster for proxy messages.
I finished the compiler support for all of the new features last night and will be committing it to clang once it's undergone code review. In the meantime, I've been updating GNUstep so that it now supports all of the new magic on NSObject added with 10.5 and 10.6. I've also added type introspection on blocks, so we can use them with ETPrototype safely, just as we use Smalltalk blocks now.
The Étoilé runtime is now officially deprecated. It should be regarded as a research prototype. Almost all of the ideas from the paper have now been merged into the GNUstep runtime. The - horribly buggy, bloated, and untested - threading code has now all gone. The current version of the runtime does not support type-dependent dispatch, but it does return the method types in the slot, so they can be tested in LanguageKit to avoid problems with polymorphic selectors. One feature that was dropped was the offset field in slots. After profiling, it was determined that the test for whether this field was present on every message send was costing more than it was saving in the cases where it existed. We can get the same benefit in a more general way by speculatively inlining the accessor methods.
The new runtime is now fully supported by GNUstep, bringing support for all of the 10.5 and 10.6 methods that were added to NSObject. The patches for support in clang are pending review, but should be merged soon. Once this has been done, I will also add LLVM passes that automatically cache the lookup. This optimization code can then be shared between clang and LanguageKit, because it will work on the LLVM IR generated by either.
It's a Bitter Sweet Compiler that's Just Too Late...
LanguageKit has supported just-in-time (JIT) compiling for a while, but in my last commit I added support for just-too-late (JTL) compiling too. The idea behind JIT compilation is that you only generate the executable code just before it's called. With JTL compilation, we generate it afterwards.
On the face of it, this isn't very useful. In practice, applications run more than once, and the second time is now a fair bit faster. If you've run Melodie, you'll have noticed that it takes a while longer to start than most GNUstep / Étoilé applications. This is because it is parsing all of the Smalltalk code, compiling it to LLVM bitcode, optimising this, and then starting.
If you use the latest LanguageKit svn (note: it needs a very recent LLVM svn build), you will notice that the second time you load it, it starts a lot faster. This is because it's loading a shared object (.so) file that was generated in the background the first time the program ran. It's currently quite conservative about regenerating this; if any framework or source file is modified then it should fall back to the JIT and run the JTL compiler in the background to refresh the cache.
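The conservative "regenerate if anything changed" rule amounts to comparing modification dates. The following is only a sketch of that idea, assuming clang with ARC; the function names are hypothetical and LanguageKit's actual check may look at different things.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helpers for illustration, not LanguageKit API.

// A cache is usable only if it exists and no input is newer than it.
static BOOL SketchCacheIsFresh(NSDate *cacheDate, NSArray *inputDates)
{
    if (cacheDate == nil) { return NO; }   // no cached .so file yet
    for (NSDate *inputDate in inputDates)
    {
        if ([cacheDate compare:inputDate] == NSOrderedAscending)
        {
            return NO;                     // an input changed after caching
        }
    }
    return YES;
}

// Reads the modification dates from disk. An input that cannot be read
// counts as stale, which errs on the side of recompiling.
static BOOL SketchCacheFileIsFresh(NSString *cachePath, NSArray *inputPaths)
{
    NSFileManager *fm = [NSFileManager defaultManager];
    NSDate *cacheDate =
        [[fm attributesOfItemAtPath:cachePath error:NULL] fileModificationDate];
    NSMutableArray *inputDates = [NSMutableArray array];
    for (NSString *path in inputPaths)
    {
        NSDate *date =
            [[fm attributesOfItemAtPath:path error:NULL] fileModificationDate];
        if (date == nil) { return NO; }
        [inputDates addObject:date];
    }
    return SketchCacheIsFresh(cacheDate, inputDates);
}
```

When the check fails, the loader would take the JIT path for this run and refresh the cache in the background, as described above.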
So, as I write this, Melodie is playing my test album (Bitter Sweet Symphony EP, courtesy of iTunes Plus), without having done any compilation. It doesn't even load the code generation bundles, because the code generator is never instantiated (the bundle with the code generation support is loaded on-demand).
The performance is not yet ideal. In future, the JIT compiler will do some run-time profiling and the JTL compiler will then use the profiled version of the library. This means that every Étoilé application written in a LanguageKit language (e.g. Pragmatic Smalltalk or EScript) will benefit from profiling-driven optimisations, and may even out-perform statically-compiled Objective-C (although profiled Objective-C is likely to still be faster).
There are currently two places where LanguageKit will store the caches; in the source bundle or in the user's home directory. This means that you can benefit from static compilation even when you don't have write access to the bundle (e.g. it's an application installed by root).
This is something I've been meaning to do with LanguageKit for a long time, so it's nice to see it working properly. Note that this works with all LanguageKit bundles, not just applications. If you put a plugin in your LKPlugins directory, for example, it will benefit from the same kind of caching.
Summer Update - Video, Reflection, Model Description
My summer break is about half over, and as the Étoilé "summer student" developer, I thought I should give everyone an update on what I've been working on. The past two months have been a bit unfocused:
- Inspired by Nicolas' patch to make NSSplitView resizing "live" in GNUstep, I wrote a similar patch for NSTableView. Now, when resizing or moving table columns, the table is updated in real-time. In my opinion, these small changes really make GNUstep feel more modern.
- I wrote a prototype of a movie view for MediaKit. There are two versions: one is software-only (FFMPEG handles decoding, colorspace conversion, scaling, and the resulting bitmap data is painted on an NSView), the other uses an NSOpenGLView to scale the video in hardware. This was really quite easy (a few hundred lines of code): the FFMPEG libraries do all of the hard work. This isn't in MediaKit yet - we need to improve the threading architecture a bit to handle video - but it shouldn't be too difficult to finish.
- Since last year, I've wanted to write a reflection API for Etoile based on mirrors (see Mirrors: Design Principles for Meta-level Facilities of Object-Oriented Programming Languages). I finally wrote a basic implementation: (Header, Source, Test), written using the OS X Objective-C 2.0 Runtime API - so it works on both OS X and GNUstep (thanks to David's ObjectiveC2 framework). My implementation is read-only and only reflects the runtime state of the program. The great thing about mirrors is the idea that you should be able to use the same API to examine and manipulate running code as you use to manipulate source code. In the future I would like to extend it with write access (so we can add and remove classes and methods), sub-method reflection integrated with LanguageKit (specific to each language provided by LanguageKit), and the ability to mirror source code.
Here is a simple class browser written in Smalltalk:
- At the end of the hackathon, Quentin explained the idea of model description to me, and I thought it was really cool. Similar to how mirrors separate meta-information about an object's structure (class, instance variables) into a separate object, a model description object separates meta-information about a model object's role in the overall model. For example, the model description could list the properties of the model object, which are read-only or read-write, the type of value permitted for the property (using UTIs), whether the property is multivalued, etc. I'm working on something similar to FAME, and hope to commit a basic version of it in the next few days.
Lastly, I've been working on polishing the website. There is still more to do, but you'll notice there is now more content, and hopefully it is easier to navigate.
As for the rest of the summer, there are several projects I would like to work on:
- UI design. From my perspective, we have a consensus on the goals of Etoile, and the UI of some parts of the system. As we're getting closer to implementing more of it, I think it would be useful to continue design discussions and plan some of the smaller details.
- CoreObject development. I think the model description framework will help modularize and simplify CoreObject, and I want to investigate serialization of objects using the model description (which FAME does)
- Work with EtoileUI more; prototype better UIs for Melodie and the photo manager
- Start the messaging framework (chat, email, etc.)
Higher-order messaging in EtoileFoundation
Over the past weeks, there have been some interesting additions to the EtoileFoundation framework to provide higher-order messaging facilities. Higher-order messaging means having (or at least: having the illusion of having) messages that take messages as their arguments. This can turn out to be quite handy at times. For example, if you are using any class that descends from NSObject, thanks to Quentin you now have a -ifResponds method at your disposal, which frees you from the fear that you might send a message to an object that it doesn't understand.
Borrowing an example from Quentin's documentation of that method, suppose you have cats and dogs in your code zoo where your Dog class implements a -bark method, but the Cat class doesn't. The following naïve approach will then cause an exception to be thrown, because cat does not respond to -bark:
[dog bark];
[cat bark];
If you send the message to the result of -ifResponds instead, you can make sure that this doesn't happen:
[[dog ifResponds] bark];
[[cat ifResponds] bark];
This is neat, especially when you're sending a message to an object that might or might not implement the method, but you don't care about the case where the message is not understood. Internally you are, of course, not sending a message to a message. Rather, -ifResponds returns a proxy object that, upon a message send, will forward the message to the original object if it responds to it. Otherwise it will simply return nil.
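A proxy of this kind can be built on NSProxy and the standard forwarding hooks. The sketch below is not the EtoileFoundation implementation: the names SketchIfRespondsProxy, -sketchIfResponds, SketchDog and SketchCat are invented for the example, -bark is made to return a string only so that the result is observable, and it assumes clang with ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the proxy returned by -ifResponds; the real
// EtoileFoundation class may be structured differently.
@interface SketchIfRespondsProxy : NSProxy
{
    id target;
}
- (id)initWithTarget:(id)aTarget;
@end

@implementation SketchIfRespondsProxy
- (id)initWithTarget:(id)aTarget
{
    // NSProxy has no -init, so there is no [super init] to call.
    target = aTarget;
    return self;
}

- (NSMethodSignature *)methodSignatureForSelector:(SEL)aSelector
{
    NSMethodSignature *sig = [target methodSignatureForSelector:aSelector];
    if (sig == nil)
    {
        // Unknown selector: pretend it takes no arguments and returns an
        // object, so that forwarding can go ahead and yield nil.
        sig = [NSMethodSignature signatureWithObjCTypes:"@@:"];
    }
    return sig;
}

- (void)forwardInvocation:(NSInvocation *)anInvocation
{
    if ([target respondsToSelector:[anInvocation selector]])
    {
        [anInvocation invokeWithTarget:target];
    }
    else
    {
        void *nothing = NULL; // the bytes of a nil object pointer
        [anInvocation setReturnValue:&nothing];
    }
}
@end

@interface NSObject (SketchIfResponds)
- (id)sketchIfResponds;
@end

@implementation NSObject (SketchIfResponds)
- (id)sketchIfResponds
{
    return [[SketchIfRespondsProxy alloc] initWithTarget:self];
}
@end

// Hypothetical zoo: only the dog can bark.
@interface SketchDog : NSObject
- (NSString *)bark;
@end
@implementation SketchDog
- (NSString *)bark { return @"Woof"; }
@end

@interface SketchCat : NSObject
@end
@implementation SketchCat
@end
```

With these definitions, sending -bark through [dog sketchIfResponds] reaches the dog, while the same message through [cat sketchIfResponds] quietly returns nil. The fallback signature only makes sense for messages whose caller expects an object or nothing back; a real implementation has to be more careful with other return types.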
Another example of higher-order messaging at its best is the -inNewThread method from EtoileThread, which has been around for a while. Any message you send to this will be executed in a new thread, meaning that your programme will only block when you try to access the return value. Sometimes it can be really nice if you don't need to wait for a lengthy operation to complete, especially if you are a hungry cat:
[[dog inNewThread] takeForALongWalk];
[cat feed];
Higher-Order Messaging for Collections
The other addition, which I managed to squeeze into EtoileFoundation with much help and support from David and Quentin, is higher-order messaging related to collections. These higher-order messaging methods make it really easy to manipulate the objects in a collection. One of them is -mappedCollection, which allows you to send messages to every object in the collection, giving you a new collection with the manipulated objects. Since EtoileFoundation provides the useful ETCollection protocol, which aggregates the collection classes from the GNUstep/Cocoa Foundation framework, this works indiscriminately with NSArray, NSDictionary, NSSet, NSIndexSet and their subclasses.
Even if you subclassed one of those yourself and have special needs on how to handle the elements in your custom collection, you can still use the existing mechanism by implementing the two methods beginning with -placeObject:. This is already used by some classes, as you can see in the ETCollection+HOM.m source file.
Consider the following example: You have an NSArray named fruitBasket which holds several instances of a class named Fruit which implements a -peeledFruit method to return an instance of its PeeledFruit subclass. How would one solve the problem of peeling all the fruit in the basket? The conventional way would be to enumerate all the objects in the array, sending the -peeledFruit message to each and placing the object returned in a new array, yielding roughly between 5 and 15 lines of code, depending on whether you (can) use fast enumeration and how paranoid you are about sending the message to a wrong object. -mappedCollection helps you to reduce that code to just one line:
NSArray *basketOfPeeledFruit = (NSArray*)[[fruitBasket mappedCollection] peeledFruit];
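For comparison, the conventional loop that this one-liner replaces might look roughly like the following. SketchFruit and SketchPeeledFruit are hypothetical stand-ins for the Fruit and PeeledFruit classes of the example, and the code assumes clang with ARC and fast enumeration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-ins for the Fruit and PeeledFruit classes.
@interface SketchFruit : NSObject
- (id)peeledFruit;
@end

@interface SketchPeeledFruit : SketchFruit
@end

@implementation SketchFruit
- (id)peeledFruit
{
    return [[SketchPeeledFruit alloc] init];
}
@end

@implementation SketchPeeledFruit
@end

// The longhand version: enumerate, check, send, collect.
static NSArray *SketchPeelAll(NSArray *fruitBasket)
{
    NSMutableArray *peeled =
        [NSMutableArray arrayWithCapacity:[fruitBasket count]];
    for (id fruit in fruitBasket)
    {
        // The "paranoid" part: skip anything that cannot be peeled.
        if (![fruit respondsToSelector:@selector(peeledFruit)]) { continue; }
        id result = [fruit peeledFruit];
        // Arrays cannot hold nil, so drop empty results.
        if (result != nil) { [peeled addObject:result]; }
    }
    return peeled;
}
```

The loop itself is not difficult, but it has to be written out again for every message you want to map over a collection.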
If fruitBasket had been a mutable collection, you could also have used -map, which modifies the collection directly. There is also another useful method using the same pattern: The -filter method allows you to send a message returning BOOL to each element in the collection to determine whether it should remain in the collection or be removed. This would, for example, allow me to make sure that there are no foul fruit in my basket prior to peeling them. Unfortunately, -filter is only available in a variant that modifies the collection directly, so I have no way to keep the original basket and return a new collection. Why?
The problem with returning a new collection after filtering is that, again, the whole higher-order messaging stuff is, strictly speaking, a con. -mappedCollection, -map and -filter (as well as -leftFold and -rightFold, see below) in truth return proxy objects that forward the messages you send to them to each object in the original collection. There is no such thing as higher-order messaging from the compiler's perspective. If you want to return a new collection, everything will work well as long as you only send messages to the proxy that return pointers to objects.
In the above case, the compiler would look at the return type of the method -peeledFruit and allocate enough space to hold a pointer to a PeeledFruit instance. But the proxy object does in fact return an NSArray. This is why you need to cast the return value appropriately, but that is no problem since both are pointers to some area in memory and thus are of the same size. This is not the case if you send a message to a filter proxy. It will return BOOL and the compiler will only allocate enough space for a BOOL return value. If the filter proxy did return a pointer to the filtered collection, it would not fit into the BOOL-sized portion of memory reserved by the compiler. Terrible things could ensue, and that's why it's better not to do it.
So if we assume that our fruitBasket is in fact mutable and that the Fruit class implements -isNotFoul (which returns a BOOL), we can remove the bad fruit from the basket and peel the rest by saying:
[[fruitBasket filter] isNotFoul];
[[fruitBasket map] peeledFruit];
The map operation can also serve as a substitute for the KVC method -valueForKey:. Imagine that every month my phone company will send an array of records for the calls I've placed and received (named callLog). Each record is an NSDictionary containing information about a single call: The names and numbers of caller and callee, duration of the call etc. If I want to get a list of the people I called, I have the following two options, one using higher-order messaging, the other using key-value coding:
NSArray *calledPersons = (NSArray*)[[callLog mappedCollection] objectForKey: @"calleeName"];
NSArray *calledPersons = [callLog valueForKey: @"calleeName"];
Of course, the KVC-variant is arguably more concise, but I'd like to demonstrate how higher-order messaging can help me to find out how much time I spent on the phone this month. I simply extract the “duration”-field from each record:
NSArray *callDurations = (NSArray*)[[callLog mappedCollection] objectForKey: @"duration"];
(Again -valueForKey: could serve the same purpose.) All I need to do now is compute the total of all objects stored in callDurations; a task that can be immensely simplified by two other new methods: -leftFold and -rightFold, which repeatedly invoke a method with an accumulator and each element in a collection, building up the return value in the accumulator. By “left” or “right” you can indicate in what order the elements shall be processed. Since my Duration class implements -durationByAddingDuration: I can easily find the total this way:
Duration *total = [[callDurations leftFold] durationByAddingDuration: [Duration nullDuration]];
The argument [Duration nullDuration] is used to set up an initial value for the accumulator. I'm setting it to an empty value, since I have no particular use for it. Of course, since addition is commutative and associative, I could have used -rightFold as well. It might also be worth noting that you shouldn't use nil as the initial value of the accumulator, at least for left folds. In left folds the accumulator will be the receiver of the message you send to the proxy object, and if it's set up to nil initially, no message can change that and nil will also be the return value of the operation.
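Written out as a plain loop, a left fold and its nil pitfall look something like this. The sketch does not use the EtoileFoundation proxy at all: SketchDuration and SketchLeftFold are hypothetical names, the Duration class here just wraps a number of seconds, and the code assumes clang with ARC.

```objc
#import <Foundation/Foundation.h>

// Hypothetical stand-in for the Duration class of the example.
@interface SketchDuration : NSObject
{
    NSTimeInterval seconds;
}
+ (SketchDuration *)durationWithSeconds:(NSTimeInterval)aValue;
+ (SketchDuration *)nullDuration;
- (NSTimeInterval)seconds;
- (SketchDuration *)durationByAddingDuration:(SketchDuration *)other;
@end

@implementation SketchDuration
+ (SketchDuration *)durationWithSeconds:(NSTimeInterval)aValue
{
    SketchDuration *duration = [[self alloc] init];
    duration->seconds = aValue;
    return duration;
}

+ (SketchDuration *)nullDuration
{
    return [self durationWithSeconds:0];
}

- (NSTimeInterval)seconds
{
    return seconds;
}

- (SketchDuration *)durationByAddingDuration:(SketchDuration *)other
{
    return [SketchDuration durationWithSeconds:seconds + [other seconds]];
}
@end

// Left fold: the accumulator is the receiver, each element the argument.
static id SketchLeftFold(NSArray *collection, SEL selector, id initial)
{
    id accumulator = initial;
    for (id element in collection)
    {
        // Mirrors messaging nil: a nil accumulator can never become non-nil.
        if (accumulator == nil) { return nil; }
        id (*combine)(id, SEL, id) =
            (id (*)(id, SEL, id))[accumulator methodForSelector:selector];
        accumulator = combine(accumulator, selector, element);
    }
    return accumulator;
}
```

Calling SketchLeftFold with @selector(durationByAddingDuration:) and [SketchDuration nullDuration] sums the durations in order, whereas passing nil as the initial value gives nil back for any non-empty collection.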
This is nice and convenient, but might not be particularly speedy depending on your use case. Since a message sent to the proxy object goes through the second-chance dispatch mechanism of the Objective-C runtime before it is forwarded to elements of the collection, there is some penalty associated with this initial message send. The more objects there are in your collection, the less likely it is that you will notice it, but in performance critical code you might want to check whether that penalty is still significant.
Using Blocks
But there is also a second area where higher-order messaging might be found wanting. If you are often doing various complex modifications of objects in collections, and want to use -map and friends, you might find yourself adding quite a few methods like -doComplexStuff:withFoo:andBar:andEvenMoreArguments: to your classes, even if you only use them once or twice. If that is the case, and you are lucky enough to have compiled EtoileFoundation with clang (passing the -fblocks switch and linking to the ObjectiveC2 framework), you can also use the new blocks feature of Objective-C 2.0 to achieve something quite similar to higher-order messaging. EtoileFoundation provides analogues to all collection higher-order messaging methods for use with blocks. So we could rewrite the fruit peeling example as follows:
[fruitBasket mapWithBlock: ^(id x){ return [x peeledFruit]; }];
For this simple case, this might seem to increase complexity, but it is a lot more expressive, since the block can include an arbitrary amount of code (and it can even reference variables outside the scope of the block). Additionally, since no proxy object is needed to which arbitrary messages are sent, there is even a -filteredCollectionWithBlock: method, which is known to return a pointer to an object and won't cause any trouble.
You can get an overview of the implemented methods from the ETCollection+HOM.h header file. I hope that they are going to be useful and that you will have as much fun using them as I had writing them. If you run into any problems please drop me a line. For those interested in more information on higher-order messaging etc. both Marcel Weiher's and Stéphane Ducasse's paper on the topic (PDF) and David's article on Advanced Flow Control for Objective-C are highly recommended.
VirtualBox Dev Image
A few weeks ago, I put together a VirtualBox image which has a snapshot of Etoile trunk installed. If you've been wanting to play with Smalltalk or EtoileUI but had trouble compiling everything, hopefully this will be useful! Available here.
Dealing with Documents
The Étoilé team has a pretty strong opinion about how users should be able to work with documents on their computers. It seems as if we're not alone, though. Lukas Mathis has written an article lamenting the state of document creation on modern systems, and then takes a look in the history books to show that the way we do it now isn't the only way, and, most importantly, is probably not the best way. Lukas offers up a good solution for approximating the ideal using OS X and a template-based system. Our proposed solution is closer to OpenDoc, but if you'd like to understand why we feel so strongly about this (as well as read about some of the systems that have influenced us), give the article a read.
Hackathon Roundup
Last week saw the second Étoilé hackathon, hosted once again by the Swansea University Computer Science Department. They gave us a nice room overlooking the bay to hack in, and the use of their Internet connection for a bit. The hackathon officially started on Wednesday, but Eric arrived a day early:
Eric: After a few days of being a tourist in London and Bristol, I took the train to Swansea Monday evening. I neglected to print out a map which included both David's house and the train station, so I soon got lost and walked up Town Hill. I eventually found myself on the map, and arrived at David's much later than I should have - we were late for trivia at a local pub, so we headed there right away. I enjoyed meeting David's friends, answered (maybe) one trivia question, and tried some local beer. Afterwards, David showed me the results of his initial work on ProjectManager.
We spent some time on Tuesday hacking on LanguageKit on Eric's iBook. A few bits of LanguageKit are specific to the host platform and this was the first time anyone had tested it on PowerPC Linux. By Tuesday evening, LanguageKit (Smalltalk) was passing all of the tests on PowerPC Linux that it passes on other supported platforms. I pointed Eric at the right bits of code for implementing the PowerPC/Linux ABI, but he did the real work and improved the test suite a bit. I hacked on the Project Manager some more.
Nicolas had a traditional experience with British trains, and arrived later than anticipated, while Quentin had the opposite experience, resulting in their meeting at the station.
Quentin: On May 12, I was on the plane from Paris to Bristol to head to the new Étoilé hackathon edition, with my main luggage largely filled with cheese and saucisson. Seven hours later I was in Swansea one hour earlier than expected, so I decided to wait for Nicolas a bit since our arrivals were roughly in the same time slot. After a short taxi ride, we met Eric and David, who were both actively working on getting LanguageKit working on Eric's iBook (a PPC machine).
Nicolas: We mostly spent this first evening discussing various technical topics and which things we wanted to work on during the rest of the hackathon.
The evening wasn't just spent on technical discussions. We opened a couple of bottles, and each of us took it in turns to do some short demos of stuff we'd been working on.
Quentin: We celebrated Étoilé and the beginning of the new 2009 hackathon organized by David with plenty of good stuff: red wine, bread, saucisson and cheese. Nicolas was pretty quick to suggest the opening of the saucisson and I convinced him we couldn't eat it without some bread :-)
I had a cold during the hackathon week, so I was falling asleep shortly after midnight most days. The others stayed up a lot later, with Quentin putting together a simple DTP-like application as a demo on the first night.
On Wednesday, we walked down to campus and got our door access set up properly.
One of the things we've been planning for 0.5 is a unified message store and delivery system. Something similar was outlined in The Humane Interface, but without the implementation details. This is something that several of us are likely to implement parts of, so it was important to get a coherent model of how all of the bits would fit together.
Quentin: We ended the morning on the CoreObject unified message store (mail, im, rss etc.) which had been discussed several times on the lists over the previous year. We tried to work out all the involved interactions with GML on the whiteboard. We finally settled on a solution which is still to be implemented. The whiteboard result was the usual multi-layered arrow/box soup we all appreciate ;-)
This year we managed to schedule the hackathon to occur during the university term. This had the advantage that it was possible to find food on campus, so a little while after we'd arrived we headed to an adjacent building for food. We then spent the rest of the afternoon hacking. I spent some time fixing various platform-specific bugs in LanguageKit then hacked a bit more on the Project Manager, while Quentin worked on EtoileUI:
Quentin: After lunch, I was busy testing the last part of the new EtoileUI event handling on GNUstep. I was happily surprised to discover a large part was working almost flawlessly. In the middle of the afternoon, I started worrying about how I was going to be able to split all this work into small commits without spending a week on the merge. Although I had already committed many changes related to the new event handling model, the last chunk was a lot bigger and was tangled with other unrelated code I had to fix on the way. Various uncommitted new features/classes and the cross development on both Mac OS X and GNUstep were making everything harder.
Nicolas spent most of the time working on Code Monkey, the Smalltalk IDE built on top of LanguageKit. I spent some time on Wednesday afternoon writing code that could inspect a C stack and extract the Smalltalk StackContext objects, allowing for complete introspection of the stack.
Nicolas: Wednesday found us at the University, where we started hacking on EtoileUI, ETSerialize, Project Manager and CodeMonkey. Commits started to flow in as the day passed, and we ended the day drinking some wine and eating a great cheese fondue with other Swansea friends.
In the evening, I made a fondue. Apparently this is a specialty of the region of France where Quentin is from (as is Reblochon, one of my favourite cheeses), so he watched very closely while I made it to ensure I wasn't doing anything, as he put it, 'heretical'.
Quentin: After dinner, Eric convinced me to push the big chunk in a single go. I was reluctant to do so, but Git decided to get in the way in the middle of the night and wrongly renamed files at random in my local branch when rebasing onto master. Happily, Nicolas was sitting right next to me and he managed to get Git to work a bit more sanely than I was able to. I finally gave up the last bits of my developer civility and pushed almost 10000 lines in a single commit. At the same time, I extracted a few work-in-progress bits here and there to commit them separately. I added a very rough Magritte-like model description to EtoileFoundation with ETModelDescription and also ETPaneLayout, which will supersede ETPaneSwitcherLayout and the PaneKit when finished. The model description stuff is demoed in the FormExample of EtoileUI, where ETFormLayout uses it to generate a basic form-like UI. The ETTemplateItemLayout infrastructure behind ETFormLayout is also used by the new ETIconLayout which was added during the hackathon. All that stuff is still quite unpolished… Don't expect too much from it.
On Thursday, we knew Fred Kiefer, the GNUstep AppKit maintainer, was going to join us at some point, but weren't sure when. He and his girlfriend were spending some time hiking in Wales, and planned their trip so he could spend a bit of time with us.
So that Fred wouldn't get to university and find us absent, we got up early (I, once again, had gone to sleep around 12:30, but apparently everyone else stayed up for a few hours after).
Nicolas: Thursday we worked a bit at the university, but as our room was to be occupied for a couple of hours during lunch time, we went to eat in a small coffee place at the Swansea marina, walking from the University to the marina along the beach.
At the last hackathon, Quentin had somehow managed not to see the beach so it was nice to stroll along to the marina and drink some nice coffee and hack on a sofa while we enjoyed the free WiFi and good food.
Quentin: Most of the day was spent improving and debugging the new EtoileUI event handling, which makes event handling logic reusable in the form of tools/instruments (in the spirit of Taligent and OpenDPI) and allows semantic actions to be dispatched instead of input-device-specific events. From time to time I got sidetracked into discussing other stuff like CoreObject or LanguageKit.
Nicolas: In the late afternoon we finally had some news from Fred, who was waiting for us at the University, and we went back for a productive few hours, discussing many issues with him and doing some coding (notably I ended up writing a live resize version of NSSplitView which turned out to be reasonably fast even on slow computers).
We had a discussion with Fred a while ago over the split between GNUstep and Étoilé and we wanted to talk in person about merging some of our changes (better horizontal menu support and theming, in particular) back upstream into GNUstep. Fred wasn't able to stay very long (and I ran away for a couple of hours for my weekly game of Ultimate Frisbee) but the discussions were productive and hopefully will lead to better collaboration between GNUstep and Étoilé in the future.
After Fred left we went back home and assembled CorePizza 2.0. This year we sent Nicolas to find the beer and he didn't get lost, and Eric displayed an unsuspected skill in creating even pizza bases.
Quentin: David deserted around midnight. Eric, Nicolas and I were busy discussing how to improve the CodeMonkey UI, and what the most important missing pieces were now that debugging was underway with the new stack introspection support written by David yesterday. This was the night of CodeMonkey meets CoreObject. Sadly we gave up in the middle of the night on getting CodeMonkey to work with CoreObject. In the end, Eric discovered two days later that the main problem was that the message name that was expected to trigger the persistency was mistyped :-/ … and we somehow missed it.
Friday morning started a bit later. We didn't have to get up for any specific reason. We did make it in to campus in the morning, but only just. The extra sleep was helpful for me, and my cold had almost gone away by Friday night.
Quentin: Friday was the talks day. I had initially prepared a long talk on CoreObject, but the schedule had 20-minute time slots, so I tried to quickly rearrange my slides with Eric's help to drastically reduce their number. In addition, I put together some slides on EtoileUI while Nicolas was preparing some slides to present GNUstep and Étoilé history.
Friday was the last day I was here, and we had talks planned for the afternoon. After quickly preparing some slides, we gave talks about the history of GNUstep and Étoilé, LanguageKit, CoreObject and EtoileUI -- all talks that you can now watch online :)
A lot of the morning was spent fixing up slides and then panicking that no one knew where the equipment we were planning to use to record the talks was. In the end we used the built-in iSight on Nicolas' MacBook Pro for video and my external one as a microphone.
Nicolas left shortly after the Friday talks finished. To recover from Nicolas' departure, Eric, David and I then went to the pub just outside the university campus to drink beers. We had an entertaining discussion on wildly incoherent subjects: mountains and snow, the history of Canada, France and the UK and a bit of Étoilé stuff here and there. LanguageKit and Étoilé on the web were the most discussed topics on the Étoilé side. We debated what was the best way to implement exporting EtoileUI-based applications as web applications and the various approaches we could take to do so… Then there was a long discussion which started with the organization of a ski-oriented hackathon in North America, turned into a wine oriented discussion (even French wine export troubles :-) then finally back to the holy Newton and the sacred Psion. David was motivated enough to take us for a walkthrough of the Psion 3A inside an emulator on his MacBook.
The Psion emulator was originally written for DOS, and runs inside DOSBox very nicely. I originally got my copy off a Computer Shopper (I think) cover CD, but you can probably find a copy online. The Psion did a lot of things very well, such as always saving state when you closed a document, so the user never really noticed the difference between applications and documents being closed and just being in the background. Anyone designing mobile devices should copy as much as they can from the Psion (and the Newton).
Friday was the last official day of the hackathon, but Eric and Quentin stayed until Monday morning, when they both headed off to Paris for Quentin to show Eric a sunnier bit of Europe.
Over the weekend, Eric and Quentin continued to hack on CoreObject and EtoileUI. Quentin tidied up the old Object Manager code which lets you visually inspect the AST of a Smalltalk program, while Eric did some work on the overlay shelf.
On Sunday we walked along the coast to Mumbles and had coffee and cakes in a sea-front cafe. For someone who lives amongst the patisseries of Paris, Quentin has a surprising obsession with Welsh cakes.
Quentin: That day I ate a lot of welsh cakes, scones and chelsea buns, that's almost all I can remember :-D As a last hacking objective, I got my generic Object Manager written in Smalltalk more or less working… the next step is now to improve CoreObject to get the whole thing more usable. The other part was mostly the continuation of the rewrite of various event-related methods in EtoileUI and a more exhaustive test coverage of the event handling.
On Saturday evening some friends of mine were playing in a tango band. Given the choice between coming out and dancing with beautiful women, or staying in and hacking, Eric and Quentin both chose to hack. Two nights in a row. When you next run Étoilé, remember their dedication…
Overall, we got a lot done. Nicolas did a lot on Code Monkey. It is now integrated with CoreObject for data storage and the final, intended, user interface was discussed in a lot of detail. Quentin and Eric did a lot with CoreObject and EtoileUI. Eric also did some hacking on LanguageKit, including some of the C++ bits that previously no one except me understood. I spent a lot of time tidying up bits of code, working on LanguageKit and the Project Manager, and finally got to spend some time improving the Objective-C 2 compatibility library. Not all of my commits are visible in Étoilé svn, because I also spent some time hacking on clang, including implementing support for Objective-C 2 declared properties on the GNU runtime.
Hackathon Statistics:
Total svn commits: 58. Total svn diff size: 20681 lines. GML diagrams drawn: 3. Welshcakes consumed: More than fit in a SmallInt.
LanguageKit SmallInt Improvements
Yesterday, I spent some more time hacking on clang. This time, it was implementing -ftrapv, a command-line option which automatically inserts overflow checking on integer arithmetic operations. This was made very easy to do because LLVM 2.5 now supports this in the IR. There are now intrinsic functions like llvm.sadd.with.overflow which return a pair containing the result of the arithmetic operation and a flag indicating whether it overflowed. On x86 (and some other back ends - I don't know exactly which ones now, but eventually it should be all of them), this just provides access to a condition flag in a register.
The implementation of -ftrapv in GCC just aborts in case of overflow, but in clang I had it call a function. This function takes the two operands and some information about the operation as arguments, and returns a value which is used in place of the overflowed result. I contributed a simple default version of this to clang, which simply printed a helpful error message and aborted.
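The general shape of "check for overflow, then let a handler supply the replacement value" can be sketched in plain C. This is not the code clang generates for -ftrapv; it uses the `__builtin_add_overflow` builtin found in current GCC and clang as a stand-in, and `saturating_add_handler` and `checked_add` are hypothetical names. The handler here saturates instead of aborting, purely to show that its return value replaces the overflowed result.

```c
#include <limits.h>

/* Hypothetical handler: receives both operands and returns the value to
 * use in place of the overflowed result. This one saturates; the default
 * handler described above prints a message and aborts instead. */
static int saturating_add_handler(int a, int b)
{
    (void)b;
    /* Signed addition can only overflow when both operands share a sign,
     * so the sign of either operand gives the direction of the overflow. */
    return (a < 0) ? INT_MIN : INT_MAX;
}

/* Add two ints, deferring to the handler when the result does not fit. */
static int checked_add(int a, int b)
{
    int result;
    if (__builtin_add_overflow(a, b, &result)) {
        return saturating_add_handler(a, b);
    }
    return result;
}
```

A language runtime could plug in a different handler, for example one that returns a promoted value instead of a clamped one.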
For LanguageKit, I did something a bit more interesting. The implementation of arithmetic operations in Smalltalk originally was quite naive. A SmallInt in LanguageKit is an integer squeezed into a pointer. On a 32-bit platform, it is a 31-bit integer, while on 64-bit platforms it is a 63-bit integer. In both cases, the remaining (low) bit is set to 1, to distinguish it from an object pointer (object pointers are always word-aligned and so their least-significant bit is always 0). For each operation, I was doing something like this:
a >>= 1; // Turn the SmallInt into an int
b >>= 1; // Turn the SmallInt into an int
c = a + b; // Do the actual addition
if (c > SMALL_INT_MAX) { handle overflow } // Do some simple overflow checking
c = (c << 1) | 1; // Shift the int back to a SmallInt and set the low bit
return c; // Return the result
One of the problems with this is that every operation then added at least one pair of shifts which the LLVM optimisers could not remove. Detecting that a value right shifted and then left shifted by the same amount is the same as the original is nontrivial, and requires knowledge of the values being stored.
The new code is much simpler. This is the real code from the addition implementation, a function which takes two pointers, obj and other, as arguments:
OTHER_OBJECT(plus);
intptr_t val = ((intptr_t)other) & ~ 1;
return (void*)((intptr_t)obj + val);
The first line is a macro which tests if other is an object (typically a BigInt, but maybe an NSNumber or similar). If it is, then it promotes obj to a BigInt and tries the addition again. The second line clears the low bit on the other object pointer. The final line just adds the two SmallInts together. Because the low bit on one is always 1, and the low bit in the other is always 0, this addition does the same thing as the two right shifts, addition, left shift, and or in the older version. Even better, because we are now adding word-sized numbers, the hardware overflow checking can now be used correctly. If the addition in the last line fails, the handler will be invoked. This is implemented in the LanguageKit runtime framework and automatically inserts a BigInt pointer in place of the SmallInt value.
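The equivalence of the two approaches follows from the encoding: a SmallInt holding n is stored as 2n + 1, so (2a + 1) + 2b = 2(a + b) + 1. The sketch below checks this in plain C. It leaves out the object test and the overflow handling, and the helper names (`small_int_tag`, `small_int_untag`, `add_with_shifts`, `add_with_mask`) are hypothetical, not LanguageKit functions.

```c
#include <stdint.h>

/* Encode n as 2n + 1 (assumes n is small enough not to overflow). */
static intptr_t small_int_tag(intptr_t value)
{
    return value * 2 + 1;
}

/* Decode: tagged values are odd, so tagged - 1 is even and divides exactly. */
static intptr_t small_int_untag(intptr_t tagged)
{
    return (tagged - 1) / 2;
}

/* Older approach: untag both operands, add, then retag the result. */
static intptr_t add_with_shifts(intptr_t obj, intptr_t other)
{
    intptr_t a = small_int_untag(obj);
    intptr_t b = small_int_untag(other);
    return small_int_tag(a + b);
}

/* Newer approach: clear the tag bit on one operand and add directly.
 * (2a + 1) + 2b == 2(a + b) + 1, which is already a tagged result. */
static intptr_t add_with_mask(intptr_t obj, intptr_t other)
{
    intptr_t val = other & ~(intptr_t)1;
    return obj + val;
}
```

For example, 20 and 22 are stored as 41 and 45; clearing the tag on the second gives 44, and 41 + 44 = 85, which is the tagged form of 42.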
I've made similar improvements to the multiplication and subtraction routines. Previously, I was using a very naive approach to checking for multiplication overflow. If either of the operands was bigger than the maximum value you could store in a half-word then I was promoting the operation. Now, multiplications are only promoted if the result does not fit in a 31-bit or 63-bit signed value, just like addition and subtraction.
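One way to get the same "promote only when the result does not fit" behaviour for multiplication is sketched below. This is an assumption about how it could be done, not the LanguageKit source: with SmallInts stored as 2n + 1, multiplying 2a by b gives 2ab, which overflows the machine word exactly when ab does not fit in the SmallInt range. The function name `small_int_multiply` is hypothetical, and `__builtin_mul_overflow` is the checked-arithmetic builtin from current GCC and clang.

```c
#include <stdbool.h>
#include <stdint.h>

/* Multiply two tagged SmallInts (each stored as 2n + 1).
 * Returns true and stores the tagged product in *result on success;
 * returns false if the product does not fit, in which case a real
 * implementation would promote to a BigInt. */
static bool small_int_multiply(intptr_t obj, intptr_t other, intptr_t *result)
{
    intptr_t twice_a = obj - 1;   /* 2a: the tag bit is always 1, so this clears it */
    intptr_t b = (other - 1) / 2; /* b: fully untagged second operand */
    intptr_t product;
    if (__builtin_mul_overflow(twice_a, b, &product)) {
        return false;
    }
    *result = product + 1;        /* 2ab is even, so adding 1 sets the tag bit */
    return true;
}
```

Here 6 and 7 are stored as 13 and 15, and the call produces 85, the tagged form of 42.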
So, what does this mean performance-wise? I wrote a simple benchmark (in Etoile/Languages/Benchmark) to test this a while ago. This implements the most naive way of calculating the Fibonacci sequence (recursively call fib(n-1) and fib(n-2)) in both Objective-C and Smalltalk. The code measures the amount of CPU time taken to calculate fib(30) 100 times in each version.
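For reference, the shape of such a benchmark can be sketched in plain C. This is not the code in Etoile/Languages/Benchmark; `naive_fib` and `cpu_seconds_for` are hypothetical names, the base case is assumed to return 1 for arguments below 2, and CPU time is measured with the standard `clock()` function.

```c
#include <time.h>

/* Naive doubly-recursive Fibonacci, returning 1 for arguments below 2. */
static unsigned long naive_fib(unsigned int n)
{
    if (n < 2) {
        return 1;
    }
    return naive_fib(n - 1) + naive_fib(n - 2);
}

/* CPU time, in seconds, taken to compute naive_fib(n) `runs` times. */
static double cpu_seconds_for(unsigned int n, unsigned int runs)
{
    /* volatile sink keeps the compiler from discarding the calls */
    volatile unsigned long sink = 0;
    clock_t start = clock();
    for (unsigned int i = 0; i < runs; i++) {
        sink += naive_fib(n);
    }
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling `cpu_seconds_for(30, 100)` would correspond to the measurement described above; the absolute numbers depend entirely on the machine and compiler.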
When I first wrote this benchmark, some time before Christmas, it showed that the Smalltalk implementation took around 30 times as long as the Objective-C version. This wasn't too bad; I never intended Smalltalk to be used for heavy number crunching. The entire point of targeting the Objective-C ABI was that you can use Objective-C, C, or even inline assembly if you really care about speed. Looking at the generated bitcode, I saw that there were a lot of redundant shifts, and so a great deal of potential for optimisation. Running it again, with the new SmallInt code, I get this:
ObjC fibonacci execution took 23.773438 seconds.
Smalltalk fibonacci execution took 35.953125 seconds.
Ratio: 1.512323
Smalltalk now takes 1.5 times as long as Objective-C. It's worth mentioning in this case that using a slightly better algorithm in Smalltalk is around a factor of 100 faster; good algorithms with bad compilers almost always beat good compilers with bad algorithms.
I didn't actually believe these results when I first looked at them, and spent quite a while going through the code to check that they were correct. This test is pretty much the worst possible case for Smalltalk, so in general use I'd expect Smalltalk to be even closer to Objective-C in terms of performance.
By the way, this test also now shows that the LowerIfTrue transform is actually useful. This is an AST transform in one of the LanguageKit plugins that turns -ifTrue: and related message sends in the AST into conditionals. When I wrote it, I tested this benchmark with and without it and there was no measurable difference; the overhead from the arithmetic was dwarfing the overhead from the extra message sends. This is not the case now. Disabling this optimisation gives this result:
ObjC fibonacci execution took 23.773438 seconds.
Smalltalk fibonacci execution took 60.390625 seconds.
Ratio: 2.540256
The Smalltalk implementation sends an -ifTrue: message when testing whether the argument is less than 2 (in which case it returns 1), while the Objective-C version just uses an if statement. With the transform, the Smalltalk version does not need the extra message send to the block and so compiles to almost the same machine code. I am not a huge fan of this optimisation in theory (since it makes the language marginally less expressive), but in practice it is unlikely to affect any existing code and is implemented by most other Smalltalk compilers.
GNUstep and Clang
I'm going to break from tradition slightly now and talk a bit about something that is not directly part of Étoilé. For a while, I've been intermittently hacking on clang, the new C/C++/Objective-C front end for the LLVM compiler infrastructure. If you've done any development for the iPhone or for recent versions of OS X, you will probably have used llvm-gcc, which uses the same back end but a front end based on GCC.
Since GCC switched to GPLv3, Apple has been slowly migrating away from it. The GPL wasn't great for Apple anyway, since they wanted to be able to use the same front end code in Xcode for syntax highlighting, autocompletion, and error reporting as they use in the compiler, and the GPL didn't allow them to do this without releasing Xcode under a GPL-compatible license. A couple of years ago, they started the clang project to replace the GPL'd front end and create a reusable set of libraries that handled parsing, analysis, and LLVM bitcode generation for C-family languages.
Why is this relevant to Étoilé? For some very simple reasons:
- Apple are no longer contributing much to the main GCC tree.
- No one outside Apple contributes much to Objective-C in GCC.
- Étoilé needs an Objective-C compiler.
The original Objective-C code in GCC was contributed by NeXT (after being attacked by the FSF's lawyers) and only supported the (closed) NeXT Objective-C runtime library. Over time, the NeXT and GNU runtimes have diverged a lot. NeXT, and later Apple, maintained their own branch of GCC and never merged support for the GNU runtime. The GNU branch maintained support for both, via an unreadable mess of #ifdefs in a single 10,000-line file.
Clang, starting from scratch, has a clean abstraction layer between the runtime-specific and runtime-agnostic bits of code. Each runtime library needs to implement a subclass of the CGObjCRuntime C++ class, which implements hooks for generating runtime library data structures and calls.
I started working on the GNU runtime support a little while ago, and it is now in a usable state. For a little while, the things that have stopped GNUstep compiling with clang have been missing support for GCC extensions to C. Now, these have all been fixed. GNUstep-base now compiles with clang, but doesn't link. The remaining issue is that LLVM does not support the __builtin_apply() family of GCC extensions. These are not actually used by GNUstep in most cases - their functionality is replaced by either libffi or libffcall - but they are still called by a few unused methods. A small restructuring of the GNUstep-base code will allow it to be compiled with the current svn version of clang. I've also tried compiling a few Étoilé frameworks with clang, and so far they've all successfully built (although I haven't tested if they work yet...).
That's not to say that clang is completely ready to replace GCC as the compiler for GNUstep and Étoilé. In summary:
- All of the NeXT-era Objective-C stuff is supported.
- Fast enumeration works (and is implemented for some of the GNUstep collection classes).
- @try/@catch and so on do not work yet (although the old-style exception macros do).
- A very small subset of declared properties may work (untested).
- Garbage collection does not yet work.
- Lots more testing needs to be done.
If you want to get involved, take a look at the CGObjCGNU.cpp file in lib/CodeGen in the clang repository. It contains a number of empty methods, which need implementing to bring the GNU runtime implementation up to feature-parity with the Apple version. Alternatively, try compiling your Objective-C code with the latest clang and report any errors to me and to the clang team.
Exception handling is probably the next thing to implement. The code in CGObjCMac.cpp contains a lot of things that probably should be factored out into runtime-agnostic code, so if you're interested in working on this, take a look in that file for inspiration.
Étoilé 0.4.1 Packages
We released 0.4.1 a few days ago. If you want to try it out, but don't want to have to build the system yourself, you now have several alternatives.
Arch Linux now has a package for 0.4.1.
FreeBSD has had a metaport of Étoilé since 0.1 back in 2006. It has now been updated to 0.4.1. Note that this metaport also includes some older things that are not part of 0.4, typically because they are not stable enough to be included in a release.
Gentoo has Étoilé 0.4.1 in the GNUstep overlay (not official link, because I can't work out how you're supposed to navigate the Gentoo packages site).
We're interested in hearing from anyone running Étoilé on other platforms. We've had reports of it working successfully on Debian and Ubuntu, although packages are not available for these platforms (Ubuntu has a Brainstorm suggestion and bug relating to Étoilé packages). I don't know if anyone has run recent versions of Étoilé on OpenBSD or Solaris, but I'd be interested in fixing any problems encountered on these platforms.
If you're interested in adding packages for Étoilé to your favourite operating system then send questions to our packagers mailing list or ask in SILC.
Étoilé 0.4.1 Released
I've just branched the 0.4.1 release and uploaded the tarballs to the usual location. This is an incremental release, providing bug fixes and feature enhancements to 0.4.0.
The main focus for this release is an improved LanguageKit. This is now better-integrated into the whole Étoilé system and can be used to load bundles providing extra features to existing applications without the need for source-code availability.
As usual, we only recommend releases for people creating packages. If you are compiling from source, please check out the trunk or stable branches in subversion.
Étoilé interviewed on FLOSS Weekly Podcast
David and Quentin were on the latest episode of the FLOSS Weekly podcast with Randal Schwartz and Leo Laporte. Listen to the interview here.
LanguageKit Updates
Lots of fun things have happened with LanguageKit recently and since I haven't written anything on this blog for a while I thought I'd write about some of them.
Fixing Bugs
A bit before Christmas, I found a subtle bug in clang that was causing the class names not to be NUL-terminated in the support file. This worked fine in some cases and crashed in others, depending entirely on the location of the generated code in memory. Eric was very confused by the fact that code was working when he added a constant string and failing when he didn't. When this was fixed, a lot of heisenbugs went away.
Returning Structures
I implemented some very basic understanding of calling conventions. It turns out that LLVM doesn't know anything about the host ABI's structure returning mechanism, which was causing some very strange results. This was especially confusing to me since it worked on FreeBSD, but not Linux and everyone I asked told me that they used the same calling convention on x86. It turns out this isn't quite true. FreeBSD uses an in-register mechanism for returning small structures, while Linux uses the old PCC calling convention, passing them on the stack. Adding basic ABI-awareness meant that you can now call functions that return NSRange and similar structures and pass them as arguments. When you get an NSRange returned from a method, it is boxed in an NSValue. When you pass it to a method expecting a range, -rangeValue is called. This allows you to substitute any other object that responds to -rangeValue.
Stack Unwinding
The biggest missing feature before Christmas was support for non-local returns in Smalltalk. It is quite common to want to write code like this:
(some guard condition) ifTrue: [^nil].
The problem here is that the ^nil is inside a block. In Pragmatic Smalltalk, this will be in a nested stack frame. The return, however, is meant to return from the method, not from the block. This is a really ugly bit of Smalltalk, but one that apparently people use.
This month, I finally got around to implementing it. This is done via the zero-cost exception mechanism. The return statement in the block is transformed in the code generator to a self nonLocalReturn:nil. This method is written in Objective-C and looks up the scope chain until it gets to the method. It then throws a new kind of exception which unwinds the stack until it gets to the method.
This means that it will work with intermediate stack frames written in any language. If you have a @finally
block and are compiling with native exceptions in Objective-C then this should be called on the way up the stack. Similarly, any cleanup that intervening Smalltalk stack frames need to do will also be done.
This is a very expensive operation. For each stack frame between the block and the method there is a call to the unwinding library, at least two calls to the LanguageKit personality function, and a call to the frame's cleanup code. Fortunately, non-local returns are mainly used for guard clauses so it's not likely to be invoked in the most common code paths. In future versions there will be more aggressive block inlining and some of these returns will be transformed into local returns.
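The control flow of a non-local return can be sketched in plain C. This is only an illustration under stated assumptions: it substitutes setjmp/longjmp for the zero-cost exception mechanism described above, so unlike the real implementation it runs no cleanup code in the frames it skips. MethodContext, non_local_return, guard_block, call_block and classify are all hypothetical names, not LanguageKit API.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for a method's context: where to land and what to return. */
typedef struct {
    jmp_buf landing;               /* set up by the method before it runs any block */
    const char *volatile result;   /* volatile: written between setjmp and longjmp */
} MethodContext;

/* A "block" is a function that receives the context of the method that created it. */
typedef void (*Block)(MethodContext *method, int value);

/* Rough analogue of `self nonLocalReturn: value`: abandon every frame between
 * the block and the method. longjmp, unlike real unwinding, runs no cleanup. */
static void non_local_return(MethodContext *method, const char *value)
{
    assert(method != NULL);
    method->result = value;
    longjmp(method->landing, 1);
}

/* The guard block, roughly `(value < 0) ifTrue: [^'negative']`. */
static void guard_block(MethodContext *method, int value)
{
    if (value < 0) {
        non_local_return(method, "negative");
    }
}

/* An intermediate frame sitting between the method and the block. */
static void call_block(Block block, MethodContext *method, int value)
{
    block(method, value);
}

/* The "method": returns early if the block performs a non-local return. */
const char *classify(int value)
{
    MethodContext context = { .result = NULL };
    if (setjmp(context.landing) != 0) {
        return context.result;     /* arrived here via non_local_return() */
    }
    call_block(guard_block, &context, value);
    return "non-negative";         /* the block returned normally */
}
```

Calling classify(-5) leaves through the block and skips the rest of call_block and classify, while classify(7) falls through to the ordinary return. The cost profile differs from the real mechanism: longjmp jumps in one step, whereas the exception-based version visits every intervening frame.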
Loading Bundles
One of the things I'd been planning to do from the start is allow bundles of Smalltalk to be run. I've now committed support for loading bundles which contain a LKInfo.plist resource. This specifies a list of Smalltalk (or other language...) files that will be compiled and a list of frameworks that will be loaded.
When used in conjunction with edlc there is some dynamic patching of the NSBundle class so that the loaded bundle will be returned in response to a +mainBundle message. This allows NSApp to find it and load resources easily. As a demonstration of this, Eric has committed the first pure-Smalltalk application to svn recently. This uses a simple shell script as the application binary:
#!/bin/sh
# Quote the path so that bundles in directories containing spaces still launch.
edlc -b "$(dirname "$0")"
You can launch this application with openapp or it will run as any Objective-C application would. The class specified by the PrincipalClass field in the plist is instantiated and sent a -run message. This does the same thing as the NSApplicationMain() function that most Objective-C apps use:
NSObject subclass: PhotoManager
[
    run [
        | nsapp |
        nsapp := ETApplication sharedApplication.
        NSBundle loadNibNamed: 'MainMenu' owner: nsapp.
        nsapp toggleDevelopmentMenu: nil.
        nsapp setDelegate: self.
        nsapp run.
    ]
]
Since LanguageKit supports static compiling as well as JIT, I eventually plan on extending this to emit a shared library for each bundle and load it rather than recompiling if the code hasn't changed.
Application Patching
One of the nice features of GNUstep is the idea of an AppKit user bundle. These are specified in defaults and will be loaded when AppKit initializes. Étoilé provides three of these. One relates to theming, one relates to the menu bar, and the final one is for general behaviour.
Part of the refactoring I did recently to LanguageKit moved the code generation component, which is huge, to a separate framework which is now dynamically loaded on demand. This means you can link LanguageKit against other applications without much overhead, and this is exactly what we're now doing.
If LanguageKit is installed then the EtoileBehavior bundle will load it into any GNUstep application. It will then look in $(GNUSTEP_USER_ROOT)/Library/LKPlugins for a directory matching the name of the application. If one exists, then it will load every bundle with a .lkplugin extension. This allows you to add arbitrary code to any application. Using categories, you can even replace existing methods.
Why is this important? Take a look at this from the Free Software Foundation:
The freedom to improve the program, and release your improvements (and modified versions in general) to the public, so that the whole community benefits (freedom 3). Access to the source code is a precondition for this.
In an Étoilé environment, access to the source code is now not a precondition of this freedom. You can improve a program, in a high-level language, without access to the source code.
AST Transforms
There is now a simple API for adding transforms on the LanguageKit abstract syntax tree. You just need to adopt a simple protocol and you will get the opportunity to replace each AST node in the tree. You can also use this mechanism to collect information about the tree.
Plugins using this mechanism will be collected in Languages/LKPlugins in svn. Currently there is just one simple demo, which replaces every comment with a log message. This allows you to record every time a line containing a comment is passed.
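The shape of such a transform pass can be sketched in C. This is a hypothetical model rather than the LanguageKit protocol: Node, Transform, apply_transform and comment_to_log are invented for illustration, and the example transform rewrites a node in place instead of allocating a replacement.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NODE_SEQUENCE, NODE_COMMENT, NODE_LOG, NODE_MESSAGE } NodeKind;

#define MAX_CHILDREN 8

/* Hypothetical AST node: a kind, some source text, and a few children. */
typedef struct Node {
    NodeKind kind;
    const char *text;
    struct Node *children[MAX_CHILDREN];
    size_t child_count;
} Node;

/* A transform is offered each node and returns the node that should take its
 * place. Returning the argument unchanged leaves the tree as it was. */
typedef Node *(*Transform)(Node *node, void *state);

/* Visit children first, then the node itself, splicing in whatever the
 * transform returns. */
Node *apply_transform(Node *node, Transform transform, void *state)
{
    assert(node != NULL && node->child_count <= MAX_CHILDREN);
    for (size_t i = 0; i < node->child_count; i++) {
        node->children[i] = apply_transform(node->children[i], transform, state);
    }
    return transform(node, state);
}

/* Example transform in the spirit of the demo plugin: every comment becomes a
 * log statement carrying the same text. The state pointer counts rewrites,
 * showing how the same pass can also collect information about the tree. */
Node *comment_to_log(Node *node, void *state)
{
    if (node->kind == NODE_COMMENT) {
        node->kind = NODE_LOG;
        *(size_t *)state += 1;
    }
    return node;
}
```

Passing a counter as the state argument lets one function serve both purposes mentioned above: rewriting nodes and gathering statistics in a single walk.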
Code Monkeys
The final, but possibly most exciting, development recently is CodeMonkey, a new IDE being written by Nicolas. This uses LanguageKit for compiling and for syntax highlighting using code based on Günther's pretty-printer.
It's still at quite an early stage of development - not even a 0.1 release yet and probably won't make it into 0.4.1 - but it's very promising. This does a few neat things with AST transforms. If you add a new method to a class then it will emit a new category on the class, rather than a new copy of the class (which would confuse the runtime). More interestingly, if you add an instance variable to a loaded class it will turn every reference to this into KVC calls. When you restart the application, the transform won't run and they will be direct ivar accesses.
The medium term plan for CodeMonkey is to turn it into a bundle. EtoileBehavior will then add a new menu item which will load the bundle and run the UI. This will give you Squeak-like introspection and modification of running applications.
Prototypes and LanguageKit
Over the last few days, I've been playing around with rewriting the prototypes code not to require a modified runtime. The old code used a new (backwards-compatible) version of the GNU Objective-C runtime that added a 'third chance' dispatch mechanism, allowing objects to implement their own method lookup function.
The new version works by doing a hidden class transform. One of the unused flags in the class structure has been set to define whether a class is a hidden class or a real one. Hidden classes are just like normal classes, but they are generated at runtime and inserted at the end of the inheritance chain. When you send an object a -becomePrototype message, it inserts a new class/metaclass pair as its class, with the old class becoming its new superclass.
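A minimal C model of the hidden class transform, assuming a toy runtime in which selectors are strings and methods take only the receiver. Class, Object, lookup, become_prototype and set_method are hypothetical names; the real code works on the GNU runtime's class structures and also creates a metaclass, which this sketch leaves out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct Object Object;
typedef int (*Method)(Object *self);

#define MAX_METHODS 8

/* Toy class structure: a superclass link, the "hidden" flag, and a method table. */
typedef struct Class {
    struct Class *super;
    int hidden;                          /* 1 for per-object classes made at run time */
    const char *selectors[MAX_METHODS];
    Method methods[MAX_METHODS];
    size_t method_count;
} Class;

struct Object { Class *isa; };

/* Ordinary lookup: walk the inheritance chain, most-derived class first. */
Method lookup(const Object *object, const char *selector)
{
    for (const Class *cls = object->isa; cls != NULL; cls = cls->super) {
        for (size_t i = 0; i < cls->method_count; i++) {
            if (strcmp(cls->selectors[i], selector) == 0) return cls->methods[i];
        }
    }
    return NULL;
}

/* Insert a fresh hidden class between the object and its current class.
 * Returns 0 on success, -1 if allocation fails. */
int become_prototype(Object *object)
{
    if (object->isa != NULL && object->isa->hidden) return 0;  /* already done */
    Class *hidden = calloc(1, sizeof *hidden);
    if (hidden == NULL) return -1;
    hidden->super = object->isa;   /* the old class becomes the superclass */
    hidden->hidden = 1;
    object->isa = hidden;
    return 0;
}

/* Add or replace a method on this one object only. */
int set_method(Object *object, const char *selector, Method method)
{
    assert(object->isa != NULL && object->isa->hidden);
    Class *cls = object->isa;
    for (size_t i = 0; i < cls->method_count; i++) {
        if (strcmp(cls->selectors[i], selector) == 0) {
            cls->methods[i] = method;
            return 0;
        }
    }
    if (cls->method_count == MAX_METHODS) return -1;
    cls->selectors[cls->method_count] = selector;
    cls->methods[cls->method_count] = method;
    cls->method_count++;
    return 0;
}

/* Two sample methods used to show that an override affects one object only. */
int describe_class(Object *self) { (void)self; return 1; }
int describe_prototype(Object *self) { (void)self; return 2; }
```

Because the hidden class sits in front of the original class, normal lookup finds per-object methods first and falls back to inherited ones, with no change to the dispatch code itself. Other instances of the original class are unaffected.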
Since I'm inserting new classes, this gave the opportunity to add some other information, most usefully an NSMutableDictionary for storing extra values. In Smalltalk, there is a mechanism a lot like this. Instance variables in the Objective-C sense are called 'indexed instance variables,' but others stored in a linked list are also possible (but slightly slower to access).
The hidden class implements a few methods of its own. The most important of these are the two related to Key-Value Coding, meaning that you can use the standard KVC methods to access this hidden dictionary. In Objective-C, you simply do something like this:
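#import <Foundation/Foundation.h>
/* MyObject and -setMethod:forSelector: are assumed to be declared by the
 * prototype support code described above. */

/* Setter installed at run time: stores the value in the hidden dictionary via KVC. */
id setValueIMP(id self, SEL _cmd, id aValue)
{
    [self setValue:aValue forKey:@"TestKey"];
    return self;
}

/* Getter installed at run time: reads the value back via KVC. */
id getValueIMP(id self, SEL _cmd)
{
    return [self valueForKey:@"TestKey"];
}

int main(void)
{
    id pool = [NSAutoreleasePool new];
    id proto = [[MyObject new] autorelease];
    [proto setMethod:(IMP)setValueIMP forSelector:@selector(setTestValue:)];
    [proto setMethod:(IMP)getValueIMP forSelector:@selector(testValue)];
    [proto setTestValue:@"A string"];
    NSLog(@"Test value: %@", [proto testValue]);
    [pool release];
    return 0;
}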
This will output something like this, when run:
2008-12-20 16:16:29.054 test[6813] Test value: A string
Of course, this is not possible in Smalltalk, since Smalltalk doesn't have the ability to write functions outside of classes. It does, however, have the ability to create blocks.
This week, I tweaked the code slightly so that if you insert a block into this dictionary as an object, it implicitly registers a trampoline method for the key as a selector. When you call the method, the trampoline uses the _cmd argument (the hidden argument in all Objective-C methods giving the selector used to look up the current method) to get the block from the dictionary and then sends it a -value: message, or a value:value: message (and so on) depending on the number of arguments that the block expects.
This lets you write code in Smalltalk like this:
NSObject subclass: SmalltalkTool
[
    run
    [
        | a |
        a := NSObject new.
        a becomePrototype.
        a setValue:[ :object :aValue | ETTranscript show:aValue; cr ]
            forKey:'logValue:'.
        a logValue:'A string'.
    ]
]
When you run this with edlc, you get:
A string
Obviously this is slower than a direct method call, since you first enter the trampoline, then look up the block in a dictionary, and then finally call the block, but it should be a lot faster than using forwarding. On GNUstep, sending a message via forwardInvocation: takes around 300 times as long as a direct message send. A message send to a block costs around 3 times as much as a message send to an Objective-C object. A message send to a prototype costs around ten times as much. Note that a lot of the extra overhead from the block (which has a similar overhead to calling a Smalltalk method) is due to the need to clean up the object context at the end of the call.
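The three steps of a prototype message send (enter the trampoline, look up the block by selector, call the block with the right number of arguments) can be modelled in C. Everything here is a hypothetical sketch: Prototype, Block, set_block_for_key and trampoline are invented names, the dictionary is a small linear table, and arguments are plain ints rather than objects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SLOTS 8

typedef struct Prototype Prototype;

/* A "block": how many arguments it expects (receiver included) and its code. */
typedef struct {
    int arity;
    union {
        int (*value)(Prototype *self);                     /* like -value: */
        int (*value_value)(Prototype *self, int argument); /* like -value:value: */
    } code;
} Block;

/* Stand-in for the hidden dictionary: selector names mapped to blocks. */
struct Prototype {
    const char *keys[MAX_SLOTS];
    Block blocks[MAX_SLOTS];
    size_t slot_count;
};

/* Store a block under a key, replacing any existing entry. Returns 0 or -1. */
int set_block_for_key(Prototype *self, const char *key, Block block)
{
    assert(self != NULL && key != NULL);
    for (size_t i = 0; i < self->slot_count; i++) {
        if (strcmp(self->keys[i], key) == 0) {
            self->blocks[i] = block;
            return 0;
        }
    }
    if (self->slot_count == MAX_SLOTS) return -1;
    self->keys[self->slot_count] = key;
    self->blocks[self->slot_count] = block;
    self->slot_count++;
    return 0;
}

/* The trampoline: one shared entry point that uses the selector it was called
 * with (the cmd argument) to find the block, then picks the call form from the
 * block's arity. Returns -1 when no block is registered for the selector. */
int trampoline(Prototype *self, const char *cmd, int argument)
{
    for (size_t i = 0; i < self->slot_count; i++) {
        if (strcmp(self->keys[i], cmd) != 0) continue;
        const Block *block = &self->blocks[i];
        switch (block->arity) {
        case 1: return block->code.value(self);
        case 2: return block->code.value_value(self, argument);
        default: return -1;
        }
    }
    return -1;
}

/* Two sample blocks with different arities. */
int count_slots(Prototype *self) { return (int)self->slot_count; }
int double_value(Prototype *self, int argument) { (void)self; return argument * 2; }
```

The extra cost relative to a direct call is visible in the structure: every send pays for the table search and the arity switch before the block's own code runs.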
Étoilé 0.4 Hits FreeBSD Ports
Jesse just pointed out to me that the FreeBSD ports for Étoilé were updated to 0.4.0 last week. FreeBSD was the first operating system to package Étoilé, thanks to some excellent work by Dirk Meyer, who also maintains a large number of GNUstep ports for FreeBSD.
Dirk has added all of the 0.4 release to the ports tree, and you will also find a number of things that were part of the 0.2 release but weren't considered polished enough for 0.4.0 (but should be reappearing over the 0.4.x series after some more testing). There is a metaport called etoile which will install all of these. Simply do:
# cd /usr/ports/x11/etoile
# make install clean
Or, if you have portupgrade installed:
# portinstall etoile
And you will have the full set of Étoilé packages installed. If you want to package Étoilé for your own favourite operating system, then please subscribe to the etoile-packaging list and post any questions you might have there.
Massive LanguageKit Improvements
This weekend, I rewrote large chunks of LanguageKit, simplifying a few things and upgrading a few other things. If you are using LanguageKit / Smalltalk, make sure to update both LanguageKit and SmalltalkKit - they contain interrelated changes.
The changes involved the way in which local variables are allocated on the stack. Every stack frame now contains a StackContext object. All of the local variables are now stored as instance variables in this object. The BlockClosure object is now stateless - all of the state information is stored in the context hierarchy.
This simplified the symbol table a lot, and meant that a lot of ugly code related to closures has gone away. The most obvious side effect of this is that accesses to variables from blocks now work correctly. In addition to contexts, blocks are also now allocated on the stack. This should speed things up a bit and have the nice side-effect that it gets rid of a potential memory leak that used to exist.
The other side effect is that it greatly simplified the task of implementing upward funargs. If you want to keep a block around for longer than the duration of the parent scope, then you just need to send it a -retain message (assigning it to an instance variable in Smalltalk does this for you). When you do this, you get a copy of the block, rather than the original (you don't want to store pointers to things that are on the stack - bad things happen if you do).
The following program illustrates this:
NSObject subclass: SmalltalkTool [
    | block |
    run [
        self setBlock.
        self callBlock.
    ]
    setBlock [
        | a |
        a := 'Local variable'.
        block := [ a log. ].
    ]
    callBlock [
        'Testing retained block:' log.
        block value.
    ]
]
When you run this, you get the following output:
$ edlc -f retainblock.st
2008-12-06 17:27:36.530 edlc[40803] Testing retained block:
2008-12-06 17:27:36.566 edlc[40803] Local variable
As you can see, the block is stored in an instance variable and is still able to access the variable a from the setBlock method, even after that method has exited.
The block you acquire still contains a pointer to the context on the stack, however. This means that if you use it before the enclosing function, method, or block, returns, you will still get any changes to other variables propagated back up the stack.
When a copy of a block is made, it calls -retainWithPointer: on its scope. This method stores the address of the pointer which references the block and swizzles the block's isa pointer so that it becomes a RetainedContext object. At the end of any LanguageKit-generated block or method, a simple test is performed on the context to see if it is a RetainedContext. If it is, it is sent a -promote message. This then copies it to the heap and updates all of the pointers referencing it.
As you can imagine, this is quite an expensive procedure. The good news is that it's rarely needed, and only incurs one test at the end of each method or block when it isn't. Since blocks are now allocated on the stack, I expect that the new version should be faster than the old one (although I haven't actually benchmarked it).
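The promotion step can be sketched in C under some simplifying assumptions: the context holds a single int in place of the local variables, a flag replaces the isa swizzle, and only one referring pointer is tracked. Context, Block, retain_with_pointer, promote, make_counter and call_block are hypothetical names chosen to mirror the description, not the LanguageKit implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* A frame's context. The retained flag plays the role of the isa swizzle. */
typedef struct Context {
    int retained;               /* set when a block copy may outlive the frame */
    struct Context **referrer;  /* address of the pointer that refers to us */
    int counter;                /* stands in for the frame's local variables */
} Context;

typedef struct { Context *scope; } Block;

/* Rough analogue of -retainWithPointer: remember who points at the context. */
void retain_with_pointer(Context *context, Context **referrer)
{
    assert(context != NULL && referrer != NULL && *referrer == context);
    context->retained = 1;
    context->referrer = referrer;
}

/* Rough analogue of -promote: copy the context to the heap and redirect the
 * recorded pointer to the copy. Returns the heap copy, or NULL on failure. */
Context *promote(Context *context)
{
    Context *heap = malloc(sizeof *heap);
    if (heap == NULL) return NULL;
    *heap = *context;
    *heap->referrer = heap;
    return heap;
}

/* A "method" whose context lives on the stack and whose block escapes.
 * Returns 0 on success, -1 if promotion fails. */
int make_counter(Block *out)
{
    Context frame = { .retained = 0, .referrer = NULL, .counter = 0 };
    out->scope = &frame;                        /* block points at the stack */
    retain_with_pointer(&frame, &out->scope);   /* the caller keeps the block */
    frame.counter = 41;       /* later changes are still seen through the block */
    /* The single test performed at the end of every method or block. */
    if (frame.retained && promote(&frame) == NULL) {
        out->scope = NULL;
        return -1;
    }
    return 0;
}

/* Calling the block after its method has returned uses the heap copy. */
int call_block(Block *block)
{
    return ++block->scope->counter;
}
```

The cheap path is the common one: a frame whose context was never retained pays only for the flag test, and the copy happens at most once, when the frame exits.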
The new version is in trunk now, so please test it. People watching svn will have noticed that the BlockContext object currently has an unused instance variable: char **symbolTable. This will be set to an array of the variable names for every variable in the context, which will allow contexts to be introspected.
Étoilé 0.4.0 Release Announcement
Étoilé intends to be an innovative, GNUstep-based, user environment built from the ground up on highly modular and light components. It is created with project and document orientation in mind, in order to allow users to create their own workflow by reshaping or recombining provided Services (aka Applications) and Components. Flexibility and modularity on both User Interface and code level should allow us to scale from handheld to desktop environments.
0.4 is a developer-targeted release on its way towards this goal. As a developer-focussed release, this predominantly consists of frameworks. A few demonstration applications are also included. More will be added during the 0.4.x release series, leading to a user-focussed 0.5 release next year.
Highlights
CoreObject is a framework for describing and organizing model objects. It supports automatic persistence and versioning by recording messages sent to objects. It offers a flexible versioning scheme where both individual objects and their entire object graph can be versioned separately. The built-in object model is a generalization of the property model used by the AddressBook framework. Foreign model objects can also be integrated by wrapping them with a special proxy. CoreObject uses the EtoileSerialize framework which, in many cases, allows objects and messages to be automatically serialized with no extra code being written.
LanguageKit is a compiler kit built on top of LLVM for creating dynamic language implementations using an Objective-C runtime for the object model. This is used by SmalltalkKit, implementing Étoilé's Pragmatic Smalltalk, a Smalltalk JIT compiler which generates code binary-compatible with Objective-C, allowing classes to be written in a mixture of Smalltalk and Objective-C.
EtoileFoundation is the core framework for all Étoilé projects, providing numerous convenience methods on top of the OpenStep foundation and significantly better support for reflection. This includes EtoileThread which allows objects to transparently be run in a separate thread. It also includes a number of extensions to the Objective-C object model, allowing traits and mixins. This framework is used by most of the rest of Étoilé and provides a number of core functions, such as UUID and XML handling.
EtoileUI is also available as an early preview release and should not be considered stable. EtoileUI is a high-level, object-oriented, user interface toolkit that provides a uniform tree representation for graphical objects on top of the AppKit. All User Interface concerns such as layouts, event handlers, styles, model objects etc. will be implemented as pluggable aspects. It also shares the same interfaces as other CoreObject systems. The combination of these three key features makes it possible to inspect and reshape both User Interface and model objects at runtime through direct manipulation. It comes with a library of layouts where each one encapsulates a custom and pluggable visual presentation.
Other frameworks, such as LuceneKit, providing full-text indexing and searching, and OgreKit, a powerful regular expression framework are also included. UnitKit is a simple and flexible unit testing framework used by much of Étoilé. A new addition is MediaKit, a framework used to provide support for sound playback and recording and, in future, video. SystemConfig has received a number of improvements since our last release, including support for modifying basic X11 keyboard settings and monitoring the battery level.
Several applications are part of this release, such as Mélodie, a music jukebox using CoreObject for the music library and MediaKit for playback. Étoilé applications which use ScriptKit are scriptable from outside using Objective-C or Smalltalk. This is used by the hot corners and gesture recognition tool to run arbitrary commands in response to corner activations or mouse gestures, and by ScriptServices which allows arbitrary shell or Smalltalk scripts to be invoked on the current selection from any GNUstep or Étoilé application.
Screenshots
[Screenshots: Étoilé in the Dictionary; About and Vindaloo PDF reader]
Availability
Étoilé 0.4.0 is currently available in source code form only. It may be downloaded from http://download.gna.org/etoile/etoile-0.4.0.tar.gz or obtained from Subversion with the following command:
svn co svn://svn.gna.org/svn/etoile/tags/Etoile-0.4.0
If you wish to use the latest stable release, then you can download http://download.gna.org/etoile/etoile-0.4.0-svn.tar.gz before running svn up to seed your source tree.
More Information
Visit our website: http://www.etoileos.com/ and blog: http://etoileos.com/news/ Or subscribe to our mailing lists: https://gna.org/mail/?group=etoile Or join our SILC channel: silc://silc.etoileos.com/Etoile
Static Compiling Smalltalk
One of the things I wanted to do with Smalltalk was allow static compilation. This is possible with LLVM as the back end. The compiler creates LLVM IR, a low-level intermediate representation form, which is then used to perform optimisations and can be compiled or interpreted. I was using this for the JIT - the IR was created when the code was loaded but turned in to native code on-demand, when each method was used.
Today I committed a few changes to LanguageKit to allow the bitcode to be written to a file instead of loaded. This was slightly more complicated than you might imagine. I use a trick with the JIT where each Smalltalk module uses the set of functions defining small integer messages as a template. This allows them to be inlined nicely without having to worry about cross-module optimisations. For static compilation, this is not desirable, so the biggest change was allowing it to reference these functions externally or internally depending on how the code generator was being used.
Once this was done, I added a new -c option to edlc. If you now do:
$ edlc -c -f test.st
You will get a file test.bc as output. This contains the LLVM bitcode for the Smalltalk file. The next step is to link together all of the .bc files, including the MsgSendSmallInt.bc file which contains definitions of small integer messages:
$ llvm-link ${GNUSTEP_LOCAL_ROOT}/Library/Frameworks/LanguageKit.framework/Versions/0/Resources/MsgSendSmallInt.bc test.bc -o smalltalk.bc
This outputs a single file, smalltalk.bc, containing all of the bitcode from the various modules. If you compiled more than one Smalltalk file then list all of the .bc files here. This is completely unoptimised, so let's run some optimisations on it:
$ opt -O3 smalltalk.bc -o smalltalk.optimised.bc
This runs the same set of optimisations that llvm-gcc runs at -O3. I haven't actually tested whether this is a sensible set for Smalltalk code, but hopefully it is (if anyone can come up with a good list of optimisations before I get around to testing properly, please let me know).
Now we have an optimised bitcode file, we want to turn this into object code. This is a two-step process:
$ llc smalltalk.optimised.bc
$ gcc -c smalltalk.optimised.s
The first step produces assembly code, and the second step assembles it (you can use as for the second step, but I was lazy and just threw it at the GCC compiler driver). You now have a file called smalltalk.optimised.o, an object code file that you can link in to your executable just as you would an object code file compiled from Objective-C.
This sounds a bit complicated, and it is. It's actually more steps than the first C compiler I ever used (where preprocess, compile, assemble, and link were all separate steps) required. Fortunately, Nicola Pero is working on adding support for it to GNUstep Make, so soon it should be just a matter of putting SMALLTALK_FILES=... in your GNUmakefile.
The bad news is that this is too big a change to be properly reviewed in time for 0.4.0, so unless you are running trunk you will have to wait for a bit to see it. 0.4.1 should be out around the new year, so you don't have too long to wait...
Packaging Étoilé
Nicolas just sent me a link to this Ubuntu brainstorm idea. Someone is proposing full Étoilé packages for inclusion in Ubuntu.
Currently, Étoilé is a bit of a moving target. It's been a long time since our last release. FreeBSD has ports for this release, but so much has changed in subversion since then that these are no longer a good introduction to Étoilé.
Hopefully this will change next month. We are planning on releasing Étoilé 0.4.0 on the 31st of October. If you want to get an idea of what it will contain then take a look at the current stable branch - this will be tagged 0.4.0 in under a week.
After this, we will be moving to a time-based point release schedule, with Étoilé 0.4.1 being released at the end of the year, 0.4.2 at the end of February, and so on.
Hopefully this will make life easier for packagers. We aim to only require released versions of dependencies for release versions of Étoilé (0.4.0 will require LLVM 2.4, for example, which is due for release on October 30).
If you are interested in providing packages for your platform, then please get in touch and let us know what we can do to help.
The road to CoreObject Part 3: Mixing Temporal Object Store and Name Service
CoreObject is a central piece of Étoilé, often discussed but rarely seen :-) The good news is that it's currently shaping up pretty well. But before going into the details and illustrating CoreObject persistency with an example in a later post, I'd like to give a brief overview of it.
The overall architecture has evolved substantially over the past two years. The implementation started with the writing of EtoileSerialize by David, and CollectionKit then OrganizeKit by Yen-Ju. These last two frameworks were built to provide a semi-structured object model, inspired by the AddressBook framework, that can be used to write applications managing collections of objects (music, photos, contacts etc.). This summer, Eric wrote a music manager named Mélodie based on this reusable object model. As such, Mélodie is the first Étoilé application that truly uses CoreObject.
Until recently, CoreObject mostly existed as a fork of OrganizeKit in the Étoilé repository. The persistency model was to store the whole core object graph in a single property list, or in multiple property lists but without the possibility of referencing core objects across these property lists. This was a very important limitation that prevented concurrency control and versioning of objects through EtoileSerialize. Moreover, each time a process wanted to access a core object, the entire core object graph had to be deserialized. Over the past two months, I have revisited CoreObject so that it fully leverages EtoileSerialize for persistency, supports loading core objects into memory on demand, interacts with a metadata server to track stored objects, and provides better control over the history of core objects.
The updated version of the semi-structured object model also brings a very transparent approach to persistency: you don't need to call EtoileSerialize explicitly or even use a proxy to wrap your objects.
Now let's look at the various building blocks of the framework. The basic idea behind CoreObject is to provide a reusable model for organizing objects and handling their persistency. The low-level persistency logic is implemented by EtoileSerialize; CoreObject extends it with:
- a protocol to organize core objects into groups (COObject and COGroup protocols)
- a main backend that provides a semi-structured object model (COObject and COGroup classes)
- additional backends to attach external object graphs (for example mounting a filesystem or exporting an application UI into the core object graph)
- COProxy for integrating persistent model objects not derived from the COObject class
- a metadata server to track stored objects and index both the metadata and the content of core objects (COMetadataServer class)
- a per process object factory and cache that is used to handle the faulting and uniquing of core objects (COObjectServer class)
So CoreObject mostly adds a name service on top of EtoileSerialize and persists the name service structure and the objects bound to it in the same uniform representation. This representation is the core object graph, where each object and each group is stored as a persistent root by EtoileSerialize. Each persistent root is identified by a UUID/URL pair. Persistent roots are currently stored as object bundles on the filesystem. An object bundle is a directory that contains the history of the object in terms of snapshots and deltas. Deltas are serialized invocations that represent logical changes. EtoileSerialize defines a protocol for the storage model, so new ways to store the objects could be defined. Examples include changing the layout of object bundles, or storing all the objects in a single flat file, over the network, or in other kinds of data store such as the ZFS DMU (the low-level ZFS transactional store on which the filesystem is built).
The UUID/URL pairs are stored in a metadata server, which defines all the objects that belong to a core object graph. In future, the core object graph should thus be able to span multiple computers or data stores backed by a single metadata server. The metadata server is currently based on a PostgreSQL database.
For this first approach, multiple users cannot share a single core object graph and the access rights are simply defined by the permissions set on the object bundles at the filesystem level.
Out of the box, EtoileSerialize provides the basic infrastructure for per-object history. This makes per-object undo/redo possible. However, objects such as photos, music, and contacts are usually organized into libraries, and when you use a photo manager or a music manager you expect undo/redo to operate on the last modification to the currently opened library. With per-object history alone, undo/redo would only work when one or several objects are selected as targets for undo. The user would also have to remember which object was modified last if the selection changed after editing it.
To solve this problem, CoreObject introduces the notion of object contexts. An object context is a pool where you insert related core objects. The object context records a history that is the interleaved histories of all the objects that belong to it. By this means, it becomes possible to navigate and restore the history per object and per context.
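The idea of one interleaved history that can be walked per context or per object can be sketched in C. This is a toy model with invented names (ObjectContext, Change, context_set, context_undo, context_undo_object): objects are integer slots and each change stores the old and new value, whereas CoreObject records serialized invocations as deltas.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OBJECTS 4
#define MAX_HISTORY 32

/* One recorded change: which object, and its value before and after. */
typedef struct { size_t object; int before; int after; } Change;

/* The context owns its objects and a single history interleaving their changes. */
typedef struct {
    int values[MAX_OBJECTS];
    Change history[MAX_HISTORY];
    size_t length;
} ObjectContext;

/* Modify an object through the context so that the change is recorded. */
int context_set(ObjectContext *context, size_t object, int value)
{
    assert(object < MAX_OBJECTS);
    if (context->length == MAX_HISTORY) return -1;
    Change change = { object, context->values[object], value };
    context->history[context->length++] = change;
    context->values[object] = value;
    return 0;
}

/* Restore the value a history entry replaced, then drop the entry. */
static void revert_entry(ObjectContext *context, size_t index)
{
    Change change = context->history[index];
    context->values[change.object] = change.before;
    memmove(&context->history[index], &context->history[index + 1],
            (context->length - index - 1) * sizeof(Change));
    context->length--;
}

/* Per-context undo: revert the most recent change, whichever object it hit. */
int context_undo(ObjectContext *context)
{
    if (context->length == 0) return -1;
    revert_entry(context, context->length - 1);
    return 0;
}

/* Per-object undo: revert the most recent change made to one object. */
int context_undo_object(ObjectContext *context, size_t object)
{
    for (size_t i = context->length; i > 0; i--) {
        if (context->history[i - 1].object == object) {
            revert_entry(context, i - 1);
            return 0;
        }
    }
    return -1;
}
```

With this layout a library-wide undo needs no selection at all, while an undo aimed at a single object simply searches the same history backwards for that object's latest entry.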
Most of the elements of the architecture outlined at the beginning have already been implemented, if we put aside the indexing service. Various key pieces remain to be written though: concurrency control, update feed to push object changes to client applications, in-store deletion model and history cleaning.
From a broader perspective, integration with the branching support of EtoileSerialize, exporting core objects to other formats, and collaborative editing will also have to be fully worked out. Finally, the versioning of structured documents will require additional support to be truly convenient and integrate perfectly with EtoileUI.
So, you want to invent a language?
I posted a little while ago about the Smalltalk compiler in Étoilé svn. Since then, Truls Becken has rewritten my parser (which was quite bad, and is now quite good) and tidied up the code a little. I've also refactored it into two frameworks, LanguageKit and SmalltalkKit. LanguageKit contains all of the abstract syntax tree and code generation stuff, while SmalltalkKit contains all of the Smalltalk-specific parts.
The total line count for the Smalltalk-specific part is a shade over 500 lines of code. This means that writing a new front-end for something Smalltalk-like is very easy (I plan on adding some things to LanguageKit to make slightly less Smalltalk-like languages similarly easy).
If you want to play, then the first thing you need is a subclass of LKCompiler, which implements two methods: +fileExtension and +parser. The first returns the extension used by scripts in your language (@"st" for Smalltalk), while the second returns the Class implementing your parser.
Then you need to implement the parser. This just needs to implement one method, parseString: which takes a string as an argument and returns an AST. For Smalltalk, I have a hand-written tokeniser and use LEMON (from the SQLite project) for the parser. The tokeniser simply turns the string into a stream of tokens and then passes them one at a time to the parser (it might be simpler if I wrote it using something like Lex, but since it's only 200 lines of code now I can't really be bothered). The parser is generated from a BNF-like description of the grammar, with instructions in Objective-C on how to generate the AST from this.
Now that Truls has rewritten it, the Smalltalk grammar is a fairly good example of a LEMON grammar. If you want to write a new language, a good first step is tweaking Smalltalk a bit. If you find that you want a semantic construct that isn't supported by the AST, drop in to SILC and talk to me - adding static flow control (if statements and while loops) is high on my list of priorities, as is support for primitive (non-object) types that aren't auto-boxed.
Scripting and Gestures
Two of the things that have been on my TODO list for about two (maybe three) years are cross-app scripting and mouse gestures. StepTalk had some preliminary support for cross-app scripting, but I don't think it made it into a release. I never really liked its approach, since it seemed horribly over-engineered (for reference, the Smalltalk interpreter in StepTalk is about twice as much code as the Pragmatic Smalltalk compiler and support library).
Yesterday, I committed the first version of ScriptKit. This is a very lightweight cross-app scripting framework built on top of Distributed Objects. It simply exports a dictionary containing a set of named objects for scripting. By default, NSApp (the application object) is exported. If you don't want to give unrestricted access to remote scripts you can export your own object with the 'Application' key and filter out some messages. You can also export other objects with their own names. In future we will define a set of standard-but-optional ones that Étoilé services should export (e.g. the current document, some CoreObject related things and so on).
For the paranoid, I plan on adding a 'Paranoid Mode' which uses a pre-shared key to prevent unauthorised scripts from controlling the app.
The nice side-effect of using DO as the core is that it is also trivial to send scripting events from Objective-C. Anyone who has tried doing this with Cocoa has probably given up and just generated a string containing AppleScript code and passed this to the scripting engine. Since we are using a Smalltalk which is toll-free bridged with Objective-C, it makes sense to just expose scripting objects as Objective-C / Smalltalk objects (well, object proxies) and use them directly, without a confusing abstraction layer.
The other thing I added yesterday was a gesture recognition engine. Today I remembered that 'x is a-cross' and fixed it so that it actually works. This is embedded in Corner.app, which currently handles hot corners for Étoilé (allowing scripts to be run when the mouse enters and leaves a screen corner). If you hold down control and shift, it enters a gesture quasi-mode. It then tracks mouse movements. Each movement is treated as an approximation of a movement in one of 8 directions, numbered 1 to 8 clockwise from the top (i.e. 1 is up, 5 is down, and so on). Complete gestures are therefore turned into strings ('gesture words'), so an 'h' shape would be '5135' (down-up-right-down). Distance moved in each direction is ignored because when doing mouse gestures I am rubbish at getting distances right, while with this system I can consistently do the gesture I was trying to.
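The encoding described above can be sketched in a few lines of plain C. This is only an illustration, not the code in Corner.app: the `movement` type, the function names, the convention that `dy` is positive when the mouse moves up, and the choice to merge consecutive movements in the same direction (one way of ignoring distance) are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One mouse movement. Assumption: dy is positive when moving up the screen. */
typedef struct { double dx, dy; } movement;

/* Map a movement to '1'..'8': 1 is up, then clockwise in 45 degree steps. */
static char direction_digit(movement m)
{
    const double tan_22_5 = 0.41421356237; /* tan(22.5 degrees) */
    double ax = m.dx < 0 ? -m.dx : m.dx;
    double ay = m.dy < 0 ? -m.dy : m.dy;

    if (ax <= ay * tan_22_5) return m.dy > 0 ? '1' : '5'; /* mostly vertical */
    if (ay <= ax * tan_22_5) return m.dx > 0 ? '3' : '7'; /* mostly horizontal */
    if (m.dx > 0) return m.dy > 0 ? '2' : '4';            /* right-hand diagonals */
    return m.dy > 0 ? '8' : '6';                          /* left-hand diagonals */
}

/* Build a gesture word from a list of movements and return its length.
 * Consecutive movements in the same direction collapse into one digit,
 * so the distance travelled in each direction does not matter. */
static size_t gesture_word(const movement *moves, size_t count,
                           char *out, size_t out_size)
{
    assert(out_size > 0);
    size_t len = 0;
    for (size_t i = 0; i < count; i++) {
        if (moves[i].dx == 0 && moves[i].dy == 0) continue; /* no movement */
        char digit = direction_digit(moves[i]);
        if (len > 0 && out[len - 1] == digit) continue;     /* same direction */
        if (len + 1 >= out_size) break;                     /* word is full */
        out[len++] = digit;
    }
    out[len] = '\0';
    return len;
}
```

Fed a long stroke down, a stroke up, a short stroke right and a stroke down, `gesture_word` produces the word for the 'h' shape, which could then be used as a dictionary key.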
Corner maintains a dictionary mapping gesture words to objects. These objects can be written in Smalltalk or Objective-C. They have to implement a -gesturePerformed method, and this will be called whenever the gesture they are associated with is drawn. Now that cross-application scripting is working, this can be used to control any application, for example locking the screen and setting an away message in the Jabber client.
Currently, there is one default gesture - drawing an h hides the active application (if the active application supports scripting, otherwise it does nothing). Others will probably be added in time for 0.4.
OgreKit Tutorial #3
OgreKit also comes with a find panel. It can work on NSTextView, NSTableView and NSOutlineView. The latter two are not ported yet, but the architecture is extensible to other graphical interfaces. An example of using the OgreKit find panel is under '/Etoile/Developer/Examples/OgreKitExample'. First, we need to connect the find panel to the text view:
- (void) awakeFromNib
{
textView = [scrollView documentView];
textFinder = [OgreTextFinder sharedTextFinder];
[textView setRichText: NO]; /* Use Plain text adaptor */
[textFinder setTargetToFindIn: textView];
}
OgreKit find panel can search both plain text and attributed text. Here, text view is set to use plain text and the right adaptor will be used by OgreKit find panel automatically. To connect find panel and text view, use -setTargetToFindIn: from OgreTextFinder. That's all.
To bring up the find panel, add this action to a menu item:
- (void) findPanelAction: (id)sender
{
[textFinder showFindPanel: sender];
}
Now you have a find panel which supports regular expressions by default.
OgreKit Tutorial #2
Here are some examples of using OgreKit:
In NSMutableString, -chomp removes all newlines ('\n') anywhere in a mutable string.
NSObject subclass: SmalltalkTool
[
run
[
| target |
target := NSMutableString stringWithString: 'alphabetagammadelta\n\n\n'.
target length log.
target chomp.
target length log.
]
]
In OGRegularExpression, -replaceAllMatchesInString:withString: replaces all matched strings.
NSObject subclass: SmalltalkTool
[
run
[
| regex target result |
regex := OGRegularExpression regularExpressionWithString:'a[^a]*a'.
target := 'alphabetagammadelta'.
result := regex replaceAllMatchesInString:target withString: '###'.
target log.
result log.
]
]
You can even swap the matched substrings like this:
NSObject subclass: SmalltalkTool
[
run
[
| regex target result |
regex := OGRegularExpression regularExpressionWithString:'(a)([^a]*a)'.
target := 'alphabetagammadelta'.
result := regex replaceAllMatchesInString:target withString: '(\2)(\1)'.
target log.
result log.
]
]
OgreKit also supports various regular expression syntaxes:
OgrePOSIXBasicSyntax POSIX Basic RE
OgrePOSIXExtendedSyntax POSIX Extended RE
OgreEmacsSyntax Emacs
OgreGrepSyntax grep
OgreGNURegexSyntax GNU regex
OgreJavaSyntax Java (Sun java.util.regex)
OgrePerlSyntax Perl
OgreRubySyntax Ruby (default)
OgreSimpleMatchingSyntax Simple Matching
Now, let's go back to Objective-C. Instead of using a regular expression to build the replacement string, you can have a delegate method do it. Use -replaceAllMatchesInString:delegate:replaceSelector:contextInfo: to specify the delegate and method, then write your own replace method. Here, the replace method is -count:contextInfo:, which returns the number of matched letters.
- (void) testReplaceDelegate
{
OGRegularExpression *regex = [OGRegularExpression regularExpressionWithString: @"a[^a]*a"];
NSString *target = @"alphabetagammadelta";
NSString *result = [regex replaceAllMatchesInString: target
delegate: self
replaceSelector: @selector(count:contextInfo:)
contextInfo: nil];
NSLog(@"Target %@", target);
NSLog(@"Result %@", result);
}
- (NSString *) count: (OGRegularExpressionMatch *) match
contextInfo: (id) contextInfo
{
return [NSString stringWithFormat: @"(%d)", (int)[[match matchedString] length]];
}
The result will be:
Target alphabetagammadelta
Result (5)bet(3)mm(6)
OgreKit Tutorial #1
David asked me to write an example of using the OgreKit framework. I figured it might be interesting to do that in combination with Smalltalk. This post shows you how to set up everything on Ubuntu 8.04. Of course, you need to have GNUstep installed first.
Dependencies for LLVM are 'lemon', 'flex' and 'bison'. They can be installed from Ubuntu packages. It is necessary to use LLVM trunk for Smalltalk. According to the LLVM User Guide, you can download LLVM trunk with
svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
After configuration with './configure', compile it with 'make ENABLE_OPTIMIZED=1' for a release build (10x faster than a debug build) and install it with 'make install'. It is not necessary to install the frontend for this purpose.
Once LLVM is ready, compile and install 'EtoileFoundation' and 'Smalltalk'. To avoid debug information from Smalltalk, use 'make debug=no' for compilation. Use 'st -f test.sh' under the Smalltalk directory to test Smalltalk. There are also a few examples under the 'examples' directory.
For OgreKit, you need to have oniguruma from Ubuntu packages. Then compile and install OgreKit as usual.
Finally, this is a small script to check everything.
NSObject subclass: SmalltalkTool
[
run
[
| regex matches |
regex := OGRegularExpression regularExpressionWithString:'a[^a]*a'.
matches := regex allMatchesInString:'alphabetagammadelta'.
matches foreach:[ :x | x matchedString log.].
]
]
Save it in a text file called 'ogre.st', for example, and run it with 'st -f ogre.st -l OgreKit'. Parameter 'f' refers to the file name and 'l' refers to the OgreKit framework. The regular expression pattern is 'a[^a]*a', which means a string whose first and last letters are 'a', with no 'a' in between. Matching this pattern against the string 'alphabetagammadelta' gives 3 substrings: 'alpha', 'aga', 'adelta'. Results are stored in an array of OGRegularExpressionMatch. Use '-matchedString' to retrieve the matched substring.
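For readers without OgreKit at hand, the same pattern can be tried from plain C. The sketch below uses POSIX regex.h rather than OgreKit or oniguruma, so it only illustrates what the pattern matches; the helper name `all_matches` and the 32-byte match buffers are assumptions made for the example.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Collect up to max_matches non-overlapping matches of pattern in text.
 * Each match is copied (truncated to 31 characters) into out[i].
 * Returns the number of matches, or -1 if the pattern does not compile. */
static int all_matches(const char *pattern, const char *text,
                       char out[][32], int max_matches)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0) return -1;

    int count = 0;
    const char *cursor = text;
    regmatch_t m;
    while (count < max_matches &&
           regexec(&re, cursor, 1, &m, cursor == text ? 0 : REG_NOTBOL) == 0) {
        size_t len = (size_t)(m.rm_eo - m.rm_so);
        if (len == 0) break;              /* do not loop on empty matches */
        if (len > 31) len = 31;
        memcpy(out[count], cursor + m.rm_so, len);
        out[count][len] = '\0';
        count++;
        cursor += m.rm_eo;                /* continue after this match */
    }
    regfree(&re);
    return count;
}
```

Calling `all_matches("a[^a]*a", "alphabetagammadelta", found, 4)` fills `found` with the same three substrings as the Smalltalk script.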
Fun With Threads
This weekend I started working on a replacement for MultimediaKit. This has been on my TODO list for a while, since the current one is GPL-tainted. I started working with libavcodec and libavformat directly, since these are LGPLd.
In order to get consistent latency, ideally I wanted the decoder running in its own thread. Since we have a threading library in svn, I thought I'd try using it (okay, I wrote it, but I've not actually had the need of a threading framework since then). The first thing I needed to do was create the player object and put it in its own thread:
MusicPlayer *player = [[MusicPlayer alloc] initWithDefaultDevice];
// Move the player into a new thread.
player = [player inNewThread];
Actually, that's all I needed to do. After putting some files in the player's queue, I could periodically query its state, like this:
// Periodically wake up and see where we are.
while (1)
{
id pool = [NSAutoreleasePool new];
sleep(2);
NSLog(@"Playing %@ at %lld/%lld", [player currentFile],
[player currentPosition] / 1000, [player duration] / 1000);
[pool release];
}
Note the complete lack of any locking or thread operations here. The player object, after the call to -inNewThread, is really a proxy which maintains a lockless ring buffer storing messages between the player and my main thread. When I send it a currentFile message, it adds it to the queue and returns a proxy. If I try to use the proxy (here, NSLog will do so by sending it a -description message) then my calling thread will block. The other two messages return primitives, so they block immediately.
When I am not sending the player messages, the run loop managed by EtoileThread sends it a -shouldIdle message whenever the message queue is empty, and if the answer is yes it then sends it an -idle message. The -idle method reads the next frame from the audio file, decodes it, and passes it to the output device. All of these are synchronous, blocking calls (although the output device does some buffering) and so it's very simple code. Neither thread needs to spend much time waiting on a mutex - the structure used to send messages between threads is a hybrid ring buffer, which runs in lockless mode unless it has spent a little bit of time spinning (at which point it uses a mutex).
This means that, while playing, the cost of checking for new messages is very cheap (one comparison operation, in fact). While paused (and not receiving messages), the object will automatically switch to locked mode and wait for a condition variable to wake it up, so you aren't wasting CPU.
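To make the "one comparison" claim concrete, here is a minimal single-producer, single-consumer ring buffer in C11. It is not the EtoileThread implementation: the names, the fixed queue size and the use of C11 atomics are assumptions, and the fallback to a mutex and condition variable after spinning is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_SIZE 8 /* must be a power of two */

typedef struct {
    void *slots[QUEUE_SIZE];
    atomic_size_t head; /* next slot to write; only the producer changes it */
    atomic_size_t tail; /* next slot to read; only the consumer changes it */
} message_queue;

static void queue_init(message_queue *q)
{
    atomic_init(&q->head, 0);
    atomic_init(&q->tail, 0);
}

/* The cheap check the consumer makes while busy: one comparison. */
static bool queue_is_empty(message_queue *q)
{
    return atomic_load_explicit(&q->head, memory_order_acquire)
        == atomic_load_explicit(&q->tail, memory_order_relaxed);
}

/* Called only by the sending thread. Returns false if the queue is full. */
static bool queue_push(message_queue *q, void *message)
{
    size_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head - tail == QUEUE_SIZE) return false;
    q->slots[head % QUEUE_SIZE] = message;
    /* Publish the slot only after it has been written. */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}

/* Called only by the receiving thread. Returns false if the queue is empty. */
static bool queue_pop(message_queue *q, void **message)
{
    size_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (head == tail) return false;
    *message = q->slots[tail % QUEUE_SIZE];
    /* Hand the slot back to the producer only after reading it. */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}
```

Because each index is written by exactly one thread, neither side needs a lock on the fast path; a hybrid design would add a sleeping path for when `queue_is_empty` stays true for a while.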
The best thing is that all of this is hidden away in EtoileThread (in EtoileFoundation), so any of your objects can use the same mechanism with almost no code. Just adopt the Idle protocol if you want to do something when your object isn't receiving messages from another thread, and send it an inNewThread message just after creation.
Pragmatic Smalltalk 0.5
I've been calling Étoilé 'a pragmatic Smalltalk' for a long time (although Nicolas, I believe, was the one to coin the expression). Smalltalk is a really great language, but it has two disadvantages:
1) It tends to be bytecode-interpreted, which is not very fast.
2) Implementations tend to be all-or-nothing.
The first is less of a problem now that CPUs are so fast they spend 90% of their time idle in a typical desktop workload. The second is much more of a problem. Smalltalk-80 includes a complete GUI and common implementations, such as Squeak, adopt this model. This means that Squeak applications and 'native' applications are entirely separate. If there is one thing that Squeak doesn't have that you need, then using Squeak is not easy.
This week, I committed the first version of the Smalltalk compiler I have been working on to Étoilé svn. Unlike other Smalltalk implementations, this is designed from the ground up for interoperability. Smalltalk objects are compiled (to native code) as Objective-C objects. This means that they can subclass Objective-C objects, and can even implement categories on Objective-C objects. There is no C function interface - if you want to call C functions then call them from Objective-C.
The compiler is in three components. SmalltalkKit contains everything required to take a string containing Smalltalk code and compile it to a set of Objective-C objects.
The Support library contains things needed by Smalltalk but not Objective-C. The most important class here is the BlockClosure class, which implements a Smalltalk block as an Objective-C object with a function pointer as an instance variable and pointers to bound variables and space for promoting other variables (eliminating the need for garbage collected stack frames). There are also a few categories, such as map: and related methods on NSArray which take blocks as arguments. Note that these are implemented in Objective-C even though they are used by Smalltalk - they could, in most cases, easily be implemented in Smalltalk instead.
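The shape of such a block object can be sketched in plain C. This is an illustration only: the struct layout, the two-slot limit, the `long` value type and the reading of "promotion" as copying a variable out of its stack frame into the closure are assumptions, not the actual BlockClosure class.

```c
#include <assert.h>
#include <stddef.h>

typedef struct block_closure block_closure;

struct block_closure {
    /* Compiled body of the block; it receives the closure itself. */
    long (*function)(block_closure *self, long argument);
    long *bound[2];    /* pointers to variables bound from the enclosing scope */
    long promoted[2];  /* storage for variables promoted out of a stack frame */
};

/* Equivalent of sending value: to the block. */
static long block_value(block_closure *block, long argument)
{
    return block->function(block, argument);
}

/* Copy a bound variable into the closure so that it outlives its frame. */
static void block_promote(block_closure *block, size_t index)
{
    block->promoted[index] = *block->bound[index];
    block->bound[index] = &block->promoted[index];
}

/* A map: in the style of the NSArray category, over a C array. */
static void array_map(const long *in, long *out, size_t count,
                      block_closure *block)
{
    for (size_t i = 0; i < count; i++) {
        out[i] = block_value(block, in[i]);
    }
}

/* Body for a block like [ :x | x + offset ], with offset in bound[0]. */
static long add_offset_body(block_closure *self, long x)
{
    return x + *self->bound[0];
}
```

A closure built as `{ add_offset_body, { &offset, NULL }, { 0, 0 } }` sees later changes to `offset` until `block_promote` is called on slot 0, after which it carries its own copy and no longer depends on the original stack frame.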
The final part is a tool which compiles a Smalltalk file, instantiates a specified class, and sends the instance a run message. This is very small and shows how the compiler can be used, and will serve as the framework for writing complete applications in Smalltalk.
The parsing is done in Objective-C, using the Lemon parser generator from SQLite. The abstract syntax tree (AST) is constructed out of Objective-C objects, which means it's exposed to Smalltalk. As a result, Smalltalk programs can generate code easily by constructing the AST and invoking its compileWith: method, or by instantiating a parser and giving it a string.
Currently, the compiler only works in-process. It uses runtime introspection when constructing the AST. Code generation, however, is done via LLVM, and involves generating an LLVM intermediate representation (IR) version of the AST, running LLVM optimisation passes on this, and then compiling it to native code. With minor modifications, it is possible to emit the LLVM IR as bitcode and then run extra optimisations on it or compile and link it as a native library. Whether this is interesting depends on how long it takes to run the compiler. For the simple test I've done so far, program startup has taken much longer than parsing and code generation (and I'm using a debug build of LLVM, which is about 10% the speed of a release build). For larger programs, it might be worth statically-compiling. If parsing is a major overhead, it might be worth caching the bitcode for each Smalltalk input class.
So far, it is a fairly naive implementation. Lots more optimisations are possible (some are very easy) than are currently done. My aim, however, is to move as many as possible into LLVM passes, so that they can be used when compiling other dynamic languages. The code representing the Objective-C object model is taken from code I wrote for clang, the new C language family front end for LLVM, and so is also used for compiling Objective-C with LLVM.
Building a Better Garbage Collector
One day a student came to Moon and said, "I understand how to make a better garbage collector. We must keep a reference count of the pointers to each of the cans." Moon patiently told the student the following story:
"One day a student came to Moon and said, "I understand how to make a better garbage collector...
I am not a fan of many of the things Apple has done to Objective-C recently. The one thing I liked the idea of in principle was garbage collection. Unfortunately, they seem to have done this very badly, so I set about seeing if there was a better way. First, some background:
There are, generally speaking, two kinds of garbage collection: reference counting and tracing. With reference counting, every assignment increments the reference count of the new value and decrements the reference count of the old value. When an object's reference count hits zero, it is freed. This is what is traditionally done with OpenStep, via the -retain and -release methods.
The other alternative is tracing. This requires every object to be known to the garbage collector. Globals are identified as 'roots' and periodically the collector attempts to navigate from the roots to every reachable object. Those that can not be reached are freed.
In 2004, some very bright guys at IBM's T.J. Watson Research Center (a nice place to visit, by the way - it's on top of a hill, with huge windows and overlooks some gorgeous scenery) came up with a Unified Theory of Garbage collection in which they propose that these are really equivalent. A tracing garbage collector needs to set a flag indicating that an object has been reached, and this can be seen as a special case of a reference count (one capped at one). Reference counting garbage collectors need some extra mechanism for detecting loops, and this is equivalent to the tracing operation.
When Apple added tracing GC to Cocoa, they threw away the reference counting mechanism. This was a shame, since all that is needed to turn reference counting into full GC is the addition of a cycle detecting algorithm. If a has a reference to b and b has a reference to a then, with pure reference counting, both a and b will leak. The rôle of the cycle detector is to run periodically and make sure this does not happen.
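The leak is easy to reproduce with a toy reference counter in C. This is a sketch for illustration, not GNUstep's NSObject: the `object` struct, the function names and the `live_objects` counter are made up for the example.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct object {
    unsigned retain_count;
    struct object *other;   /* a single retained reference */
} object;

static int live_objects = 0; /* objects allocated and not yet freed */

static object *object_new(void)
{
    object *o = calloc(1, sizeof *o);
    assert(o != NULL);
    o->retain_count = 1;     /* the creator owns one reference */
    live_objects++;
    return o;
}

static object *object_retain(object *o)
{
    if (o != NULL) o->retain_count++;
    return o;
}

static void object_release(object *o)
{
    if (o == NULL || --o->retain_count > 0) return;
    object_release(o->other); /* give up our reference, like -dealloc */
    free(o);
    live_objects--;
}

/* Assignment: retain the new value, release the old one. */
static void object_set_other(object *o, object *other)
{
    object_retain(other);
    object_release(o->other);
    o->other = other;
}
```

With a chain (a refers to b), releasing both external references frees both objects. With a loop (a refers to b and b refers to a), each object keeps the other's count at one after the external references are released, so `live_objects` stays at two and nothing is ever freed.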
Fortunately, two of the same guys who came up with the unified theory had, a few years earlier, published another paper in which they describe a mechanism for adding an efficient (i.e. fast) cycle detector to a reference counting system.
I have implemented this for GNUstep, and it shows promise. I've made a few modifications to NSObject - retain counts are now stored in a 16-bit value (if you have more than 65535 references to a single object, you probably have a bug) and the other 16 bits are now used to store flags, including a colour. The colour is set by the cycle detection algorithm, which is invoked periodically when a buffer of objects which have been released but not freed becomes full.
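A 32-bit word split this way might look like the following C sketch. The exact bit positions, the colour names and the helper functions are assumptions made for illustration, and the updates here are not atomic, unlike what a real multi-threaded runtime needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low 16 bits: retain count. High 16 bits: flags, including a colour. */
typedef uint32_t object_header;

enum { COUNT_MASK = 0xFFFF, FLAGS_SHIFT = 16, COLOUR_MASK = 0x7 };

/* Hypothetical colour values for the cycle detector. */
typedef enum { BLACK, GREY, WHITE, PURPLE, TRANSPARENT } colour;

static unsigned header_count(object_header h)
{
    return h & COUNT_MASK;
}

static colour header_colour(object_header h)
{
    return (colour)((h >> FLAGS_SHIFT) & COLOUR_MASK);
}

/* Return a copy of the header with a new colour and an unchanged count. */
static object_header header_with_colour(object_header h, colour c)
{
    object_header cleared = h & ~((uint32_t)COLOUR_MASK << FLAGS_SHIFT);
    return cleared | ((uint32_t)c << FLAGS_SHIFT);
}

/* Increment the count; refuse rather than overflow into the flag bits. */
static bool header_retain(object_header *h)
{
    if (header_count(*h) == COUNT_MASK) return false;
    *h += 1;
    return true;
}

/* Decrement the count; returns true when it reaches zero. */
static bool header_release(object_header *h)
{
    assert(header_count(*h) > 0);
    *h -= 1;
    return header_count(*h) == 0;
}
```

Keeping the count in the low bits means a retain or release is a plain add or subtract on the whole word, while the colour can be read or replaced without disturbing the count.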
One minor problem with this is that objects can now exist safely in loops and -retain can be called on an object which is currently executing -dealloc. This can lead to an infinite loop, and some careful juggling is required to ensure that no objects deallocate themselves while freeing loops.
The code works, although it has a few (easily fixable) limitations. I created a small graph of five Pair objects. Each one has a reference to itself and to the next one in the ring. The code correctly determines that this contains a loop, and destroys all five objects when the autorelease pool is destroyed.
The only major limitation is that I've only written code for atomically accessing the colours on x86. This can be fixed trivially by simply writing these functions. A smaller issue is that a number of GNUstep classes indulge in premature optimisation by calling NSDeallocateObject() in their -dealloc method, rather than calling [super dealloc].
I currently use a modified version of the algorithm in the paper which uses an NSHashTable to store pointers to objects that might contain loops. Since there's space in the flags field for a 'buffered' flag, I can easily extend it to use this, and replace the hash table with a static array. This is better for two reasons: it should be faster, and it means that we can use thread-local storage without having to worry about explicit destructors (which are currently called by listening for a thread terminating notification, which is slightly fragile and will potentially fail to catch things released when a thread is dying).
Since some code already handles loops via unretained references, the current code has problems. To avoid this, I introduced an extra colour (transparent), and any objects with this colour are assumed to always be acyclic. This allows automatic loop detection to be turned on on a per-object, or per-class basis. In future, this can be used in reverse: to turn off loop detection for intrinsically acyclic data structures (e.g. trees).
My main motivation for this is for the Smalltalk compiler I am currently working on. Since Smalltalk expects garbage collection, and Objective-C does not provide it, this presents a small problem. A tracing collector can be used, however this is very tricky when integrating with a C-like language, since it means that everything which may potentially contain pointers has to be checked, including integers and untyped buffers.
On the way to EtoileUI, Part 1: Back to the Hackathon
For the Swansea hackathon, I gave a quick overview of EtoileUI. When I came back, I intended to upload it and wrote a post on the subject but the time went by faster than I expected :-)
During April and May, the framework has steadily improved to the point of being usable on both GNUstep and Cocoa, but it is still quite experimental and many things remain to be worked out. In the early days of April, the stability of EtoileUI on GNUstep wasn't really satisfactory either, since I initially wrote a large part of it on Mac OS X before backporting it this winter.
Before trying to explain what EtoileUI is in upcoming posts, here is the link to the presentation (PDF) I did.
As you probably know from the previous post, the integration of CoreObject and EtoileUI is moving forward, especially now that Eric has started to use both in a real application named EtoileTunes. This recent project has also motivated me to be quicker at squashing annoying bugs in these frameworks. On my side, a generic object manager based on the example of the last slide is coming along nicely. For now, it is mostly a CoreObject-based file manager which supports several views (icon, list, column, etc.), although it can be used to browse and mutate any object graph that complies with a simple object collection protocol (declared in EtoileFoundation).
EtoileTunes
For the past few weeks I've been working on EtoileTunes, a music player for Etoile. My goals were to try to improve my Objective-C/GNUstep knowledge, to try out programming with EtoileUI and CoreObject, to work on a replacement for MultimediaKit, and to hopefully end up with a good enough music player to use regularly.
It's still in fairly early stages, but I have something that uses TagLib to read music file metadata, then constructs a CoreObject with that metadata. It then puts these into a COGroup subclass which represents a playlist, and displays this group with EtoileUI. The eventual goal, as I understand it, is for EtoileTunes not to be a separate application, but just a pre-defined layout which the normal object-manager UI in Etoile can take on. Jesse, Quentin, and I discussed some ideas for what would make a good music manager UI, so hopefully I can build some working mockups with EtoileTunes.
On the MultimediaKit replacement side, I have an incomplete, but working, Objective-C wrapper around Xine-lib which provides an API similar to MultimediaKit's (only for music, not video playback, so far). I'm also working on a GStreamer backend. A future task might be to write a MultimediaKit framework from "scratch" - an Objective-C framework that would fit between Etoile apps/services and the operating system's/sound server's audio API. It could use the ffmpeg project's libavcodec/libavformat libraries to do all of the hard work of decoding media formats. The advantages of this would be having something that exactly fits Etoile's needs, and fits in to the system perfectly (for example, the same code for decoding music could be used for transcoding between different formats, and be integrated into a system Etoile might provide for converting file formats), and being able to ensure details like gapless playback work perfectly. This might be a lot of work though - especially if it has to handle video playback and recording, and multiple OS backends, so it may be best to stick with GStreamer/Xine for now.
If you would like to try it, the code is in /branches/ericwa/EtoileTunes. It's still very work-in-progress, though.
Lastly, any suggestions for a better name? :)
Compiler Fun
Anyone following the Étoilé svn logs recently will notice that I haven't been committing much for a few weeks. The reason for this is that I've been taking a short break to do some compiler hacking.
Objective-C support was first added to GCC by some guys at NeXT. They didn't want to release their code, but were eventually forced to by the FSF. They did not release the code for their runtime library, and so this code was completely useless to anyone else. RMS wrote a drop-in replacement for this library, which became the GNU Objective-C runtime. Gradually the GNU and NeXT runtimes diverged and the Objective-C support code in GCC became littered with #ifdefs.
After Apple bought NeXT, they continued developing their version of GCC in a branch. This branch was slightly cleaner, since it never had support for the GNU runtime, but no use to anyone on platforms other than Darwin for the same reason. This code is no fun at all to work with - Objective-C structures are lowered to the corresponding C structures, so there is no clean Objective-C AST to work with and runtime-specific code is interleaved with the abstract representations. When Apple add a new language feature, they add it to their branch, and if anyone else wants to use it then they have to merge the changes into the main trunk. Unfortunately, no one is doing this and Objective-C support in GCC is in a rather depressing state (bugs in Objective-C are not seen as show stoppers for a release, as we saw in the early 4.x series).
Recently, GCC switched to GPLv3. Apple corporate policy is that they will not touch GPLv3 code, and so the Apple branch is now a fork of GCC 4.2. Features added to GNU GCC will not find their way into Apple GCC and vice versa, unless explicitly licensed in a compatible way by their contributor.
Apple have also started looking at a new compiler, known as LLVM. This is a modular infrastructure for building compilers. It currently has an Objective-C/C/C++ front end based on Apple's GCC. This combination of an LLVM back end and a GCC front end is typically known as llvm-gcc. It is found in the iPhone SDK and is likely to be found in the OS X dev tools soon. GCC isn't really designed to be split apart like this, however, and so the Apple guys have been working on a new front end: clang.
Unlike GCC, clang has very clean layering. This is intentional, since Apple also want to use it in Xcode for syntax highlighting and refactoring tools. This means that every single Objective-C language construct gets corresponding AST nodes which are then passed to another part of the program which emits LLVM intermediate representation (IR) code - static single assignment assembly language - which is then turned into native code for the desired platform.
When I first looked at clang, most of the parsing code for Objective-C was done, but none of the code generation part. This meant that I was free to add any interfaces I wanted. Clang now has an abstract class encapsulating all of the runtime-specific behaviour and hooks in the generic code that call this. I have also written a complete implementation of this for the GNU runtime and an almost-complete one for the Étoilé runtime. As a result of this, clang can now compile about 90% of the files in GNUstep-base without issue. The remaining ones are failing due to a couple of outstanding bugs with implicit casts (the LLVM type system is a lot more strict than the Objective-C one and so casts which are implicit in Objective-C need to become explicit in the IR) and a few C features. GNUstep uses variable length arrays in a few places, for example, and I have only added partial support for these.
My changes to Clang are currently undergoing code review, but after this has happened and I've made the required changes they should go in.
Objective-C isn't the only thing that makes this interesting. Since the object model code is all isolated in a separate class, it is possible to plug this into other compilers trivially. Generating classes, protocols and categories, selectors and message sends that use the underlying GNU runtime (and soon the Étoilé runtime) functionality is trivial when using this class (each high-level construct is mapped to a method call). I am currently in the process of writing a Smalltalk compiler that uses this same back end. LLVM supports both JIT and static compilation, so we will be able to JIT-compile Smalltalk while developing, dump it to a file, and static compile it for distribution.
This means that Smalltalk will be a first-class citizen of the Étoilé ecosystem. Applications will be able to be written in Smalltalk and Smalltalk classes will be able to inherit from Objective-C classes. There is no bridging - Smalltalk methods will be compiled to native code and attached to the same structures as Objective-C methods. Once this is finished, I will be recommending Smalltalk as the development language-of-choice for new Étoilé applications. If you discover that a particular piece of code is too slow (after profiling) then you might want to rewrite it in Objective-C (or even pure C), although I don't expect Smalltalk to be much slower than Objective-C.
Smalltalk is not the only high-level language we will implement in this way - just the first. Expect Io, JavaScript and maybe even Self implementations later. These languages are all prototype-based, however, and so require a few features that are not found in the GNU runtime (but are in the Étoilé runtime) for full support.
Hackathon Recap
The Étoilé hackathon ended ten days ago. I left Swansea on Tuesday afternoon to fly back to Paris after a mix of train and bus. Afterwards David and Damien Pollet succeeded in closing the hackathon by writing the first bits of a bridge between GNU Smalltalk and Objective-C.
Following Friday talks, we wandered around the university looking for a pub where we could eat. We finally ended up at David's place and had a nice break around beers (that aren't beers but ales), coffee, saucissons and we ordered some real food too. Afterwards Nicolas was still motivated to learn more about EtoileUI. His interest for EtoileUI had the bad side-effect of shortening the night quite dramatically!
During the week-end, we discussed the proper way to implement a semantic editor and how to integrate it with EtoileUI and CoreObject. Nicolas spent most of his time writing a new prototype. He already wrote a very rough one a few years ago.
Damien arrived from Lugano on the second day of the hackathon. After having installed Étoilé, he started to play with the possibility of hacking together a Smalltalk bridge. In the meantime, David continued his work on EtoileSerialize, while I spent a large part of the week-end cleaning stuff here and there and hunting EtoileUI bugs on GNUstep (Saturday and Sunday night especially :-).
We also began to clean EtoileFoundation a bit: David moved EtoileThread to it, and last week I managed to bring in EtoileXML too and have both compile fine as subframeworks (EtoileFoundation.h playing the role of an umbrella header). In the middle of our discussions centered around CoreObject and EtoileSerialize, I got the impression David was doing some coding on LLVM too, while Damien was fighting with autotools next to me. At least, I was able to follow David's presence on the #llvm channel right on the wall screen where Jesse appeared the previous day. I'm pretty sure everything went on roughly that way until Monday!
On Saturday evening, I lost myself in the Swansea suburbs and the five-minute walk to buy a few ales became a one-hour walk in the Welsh drizzle. The next day, the weather remained quite unstable, constantly hesitating between storm, sun and waxy clouds. On Monday, our own trinity (EtoileSerialize, CoreObject and EtoileUI) was expecting us for a last serious day of hacking. The sky was now in better shape and could have been described as sunny and warm by the previous day's standards ;-) David had the good idea to plan a break at the pub for the evening, so we could slow down a bit before really ending the hackathon. For the last day, I went to a coffee shop where I had previously bought some nice scones and Chelsea cakes (iirc), then we left David's place for the university and our daily coffee session with a nice vista on the sea and the sound of the seagulls in the background (or is it just my vivid imagination?).
The hackathon was just great, although the time went by very quickly. David had organized everything well and after our daily hacking sessions, he even managed to cook some very nice stuff (like weird and tasty squashes) for the hungry French hackers, so I'm looking forward to the next one. Maybe in France…
Hackathon Progress
The hackathon started on Friday, with Quentin and Nicolas arriving in Swansea in the early afternoon. After briefly settling in, the three of us gave a short series of talks to the postgraduate students in the department of computer science.
I managed to persuade Quentin to write some documentation, which can be seen here:
Jesse, unfortunately, was unable to be with us in the flesh, but appeared on the wall in the middle of the afternoon as our very own Big Brother (or possibly Emmanuel Goldstein):
Most of the time has been spent working on CoreObject-related things. Nicolas has implemented a basic semantic text editor, which Quentin is wrapping up in CoreObject. I spent most of the time so far working on EtoileSerialise (or EtoileSerialize, as it is now known). It now passes the test automatically serialising nested structures of arrays of structures. The storage has also been abstracted away, and branching (finally) implemented.
Last night, after a day of hacking on various things, we implemented CorePizza, the official Étoilé project food:
Someone stole an hour from the middle of last night, so we're a bit tired today, but still making progress.
Summer of Code
The Google Summer of Code list was published last night. We have some good news and some bad news. The bad news is that Étoilé wasn't selected. The good news is that GNUstep was. Since Étoilé is built on top of GNUstep, everything that benefits them benefits us, so anyone interested in the summer of code and Étoilé should consider applying for one of the GNUstep places. We won't know how many places each project has been awarded until later on in the process, but last year they got two, so hopefully a couple of interesting projects can be finished as a result of the summer.
Spring Hackathon
The dates and location of the first Étoilé Spring Hackathon are now finalised. We will officially be starting on Friday the 28th of March and continuing until the 1st of April.
The Swansea University Computer Science department has very kindly agreed to provide us with a room for the duration of the hackathon in this building. Access to the building and room is via security card. I'll arrange some visitors' cards for attendees.
We're about a fifteen-minute walk away from a pub that serves real ale and has free WiFi, which should come in handy for the evenings.
If you're coming, let me know when you're likely to arrive. I'll try to arrange some kind of social event on the Thursday evening before the real hacking starts.
Some Quick StepChat News
My last few posts have all been about very low-level stuff and have been sorely lacking in pretty pictures.
For the last day or so I've been working on adding vCard support to StepChat. It's not finished, but it now creates a 'Jabber People' group in your address book and adds any published vCards there. In future, it will merge any changes in a nice way. For now, it only gets vCards once (I need to poke the presence stuff to spot vCard updates). One nice bonus is that Jabber stores avatars in vCards, and once you can load vCards you get avatars almost for free. You will see in the screenshot that Dom has a nice line-art dragon for his avatar.
I also chased down the bug that was preventing colours displaying in the roster with GNUstep, so now that works too. Finally, one more thing you can see in the screenshot is that GNUstep now has nice menu dividers. This shot was taken with the Cairo backend (which works almost perfectly; the only bug I've seen with it is that text in the status message box on the roster gets horribly smeared. Since this doesn't happen in other text boxes it's probably a nib loading error).
Unfortunately, the bug causing windows to have the wrong titles is still there. I will have to have a hunt for it at some point (or just wait for Yen-Ju to fix it).
Another Day, Another Runtime
After spending a little while poking at the GNU runtime, I came to two conclusions:
- It was at least twice as complicated as it needed to be.
- GNU coding style really hurts my eyes.
I've spent a little while thinking about what I want from a runtime. One of my recent projects has been writing a Smalltalk JIT that targets the GNU runtime (still quite work-in-progress) and so I know the GNU runtime can support Smalltalk as well as Objective-C. Quentin, meanwhile, has been working on the Io bridge. While this works, it is a bit ugly because the Io object model doesn't really mesh well with the Smalltalk object model that the GNU runtime uses. More on this later.
Beyond better being able to support Io, I read a few interesting papers recently. The first was on Polymorphic Inline Caching. This is quite a neat idea, and allows you to eliminate the cost of dynamic method lookups in a number of cases. Unfortunately, this is quite hard to get right. Consider the simple Objective-C line:
[object message];
With the Apple runtime, this will be translated roughly into something like this:
objc_msgSend(object, @selector(message));
I'm cheating a bit here, and skimming over how the @selector() directive is expanded. In contrast, the GNU runtime does this:
IMP method = objc_msg_lookup(object, @selector(message));
method(object, @selector(message));
This is quite nice, since it means that a small compiler change is all that's needed to cache the method. We could replace this with something like this:
static IMP cached_method = NULL;
static Class cached_class = Nil;
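if(cached_class != object->isa)
{
    /* Cache miss: remember the class as well as the method it resolved to. */
    cached_class = object->isa;
    cached_method = objc_msg_lookup(object, @selector(message));
}
cached_method(object, @selector(message));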
Now you only need to bother with the (expensive) method lookup if you reach this bit of code with two different object types. There are a lot of places in code where you will get the same kind of object all of the time, and this can give a huge speed boost. In other cases, you get a number of different ones and this is where polymorphic inline caching comes in. Rather than keeping a single (class, method) pair cached, you keep a few. Profiling can determine the optimal size for this cache relatively easily.
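To make the polymorphic variant concrete, here is a minimal sketch in plain C. It uses a toy object model rather than the GNU runtime's real structures, and every name in it (toy_class, slow_lookup, pic_cache, pic_send) is made up for illustration. The cache size of four is an arbitrary assumption, not a measured optimum.

```c
#include <stddef.h>
#include <string.h>

/* Toy object model for illustration only; not the GNU runtime's structures. */
typedef struct toy_object toy_object;
typedef int (*toy_imp)(toy_object *self);

typedef struct toy_class {
    const char *method_name;  /* each toy class implements a single method */
    toy_imp     method;
} toy_class;

struct toy_object { const toy_class *isa; };

int slow_lookups = 0;         /* counts how often the expensive path runs */

/* Stand-in for objc_msg_lookup(): the deliberately "expensive" path. */
toy_imp slow_lookup(const toy_class *cls, const char *name)
{
    slow_lookups++;
    return strcmp(cls->method_name, name) == 0 ? cls->method : NULL;
}

#define PIC_SIZE 4            /* hypothetical size; profiling would pick this */

/* One cache per call site, holding a few (class, method) pairs. */
typedef struct {
    const toy_class *cls[PIC_SIZE];
    toy_imp          imp[PIC_SIZE];
    int              used;
} pic_cache;

/* Send the message, consulting the call site's cache before the slow path. */
int pic_send(pic_cache *cache, toy_object *obj, const char *name)
{
    for (int i = 0; i < cache->used; i++)
        if (cache->cls[i] == obj->isa)
            return cache->imp[i](obj);          /* hit: no lookup needed */

    toy_imp imp = slow_lookup(obj->isa, name);  /* miss: expensive lookup */
    if (imp == NULL)
        return -1;                              /* message not understood */
    if (cache->used < PIC_SIZE) {               /* remember it if there is room */
        cache->cls[cache->used] = obj->isa;
        cache->imp[cache->used] = imp;
        cache->used++;
    }
    return imp(obj);
}

/* Two trivial method implementations for two different classes. */
int cat_speak(toy_object *self) { (void)self; return 1; }
int dog_speak(toy_object *self) { (void)self; return 2; }
```

A call site that alternates between instances of two classes pays for the slow lookup once per class and is served from the cache afterwards. Note that this sketch never invalidates its entries.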
Nice and easy? Well, there's a catch. Objective-C is a dynamic language. You can load bundles which will replace methods at runtime and languages like Io allow even different objects to have different methods. This means that you need to check that the cache is valid before you use it. A problem.
This, and the difficulty in supporting Objective-C 2.0 on the GNU runtime, caused me to write a new one from scratch. This took just under 48 hours (after which I ate and went out to a well-earned salsa class). What's new?
First, inline caching can now be done safely. Rather than looking up methods, the runtime looks up slots. The slot contains (among other things) an IMP and a version. How does the version help? Consider two classes, A and B. A inherits from B, which implements a -foo method. Somewhere in my code, I call this method on an instance of A and cache the result. Two things can cause this cache to become invalid:
- B's implementation of the method being modified / replaced.
- A having an implementation of the method added.
The first case Just Works™ since a pointer to the slot (which contains a pointer to the method) is cached. You can modify the method without any problems (great for debuggers and runtime optimisations). The second case is more tricky. When you add a method to A, it first performs a lookup on the selector. If this returns a non-NULL slot, then the version of the located slot is incremented. Any time you cache a slot, you should also cache the version; if there is a version mismatch with the cached slot then you need to perform the lookup again.
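The versioning scheme can be sketched in a few lines of plain C. This is only an illustration of the idea as described here, not the runtime's actual API: the names slot, klass, lookup_foo, add_foo, call_site and send_foo are invented, there is only one selector (-foo), and the call site is assumed to only ever see instances of one class.

```c
#include <stddef.h>

typedef int (*imp_t)(void);

/* A slot holds the method pointer and a version used to invalidate caches. */
typedef struct slot {
    imp_t    method;
    unsigned version;
} slot;

/* Toy class with room for a single selector, -foo, plus a superclass link. */
typedef struct klass {
    struct klass *super;
    slot         *foo;     /* NULL if this class does not define -foo itself */
} klass;

/* Walk up the class chain until some class provides a slot for -foo. */
slot *lookup_foo(klass *cls)
{
    for (; cls != NULL; cls = cls->super)
        if (cls->foo != NULL)
            return cls->foo;
    return NULL;
}

/* Adding a method: bump the version of the slot it will hide, then install. */
void add_foo(klass *cls, slot *storage, imp_t method)
{
    slot *old = lookup_foo(cls);
    if (old != NULL)
        old->version++;            /* anyone who cached 'old' must look again */
    storage->method  = method;
    storage->version = 0;
    cls->foo = storage;
}

/* What a call site caches: the slot pointer and the version it saw. */
typedef struct {
    slot    *cached;
    unsigned version;
} call_site;

int send_foo(call_site *site, klass *cls)
{
    if (site->cached == NULL || site->cached->version != site->version) {
        slot *found = lookup_foo(cls);      /* stale or empty: look up again */
        if (found == NULL)
            return -1;                      /* no implementation anywhere */
        site->cached  = found;
        site->version = found->version;
    }
    return site->cached->method();          /* always goes through the slot */
}

int impl_b(void)  { return 1; }   /* B's original -foo */
int impl_b2(void) { return 2; }   /* B's replacement -foo */
int impl_a(void)  { return 3; }   /* -foo later added to A */
```

Replacing B's method in place needs no invalidation because the call goes through the cached slot, while adding -foo to A bumps the version of B's slot and forces the next send to look up again.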
The slots also contain an optional offset. This can be used to implement very fast set/get methods. A lot of the time you wrap instance variables up in set/get methods to insulate users of the code from changes to the instance variable layout. This comes with a speed penalty. I can't make this go away completely, but the new runtime allows you to avoid the method call and just access the ivar directly, while maintaining the dynamic lookup. This can make things like KVO faster, since you can do direct ivar access while there are no observers and then switch to indirect access when there are some.
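As a rough illustration of the offset idea, the following C sketch reads an instance variable directly when the slot advertises an offset and falls back to calling the getter otherwise. The layout and the names (accessor_slot, get_via_slot, point) are assumptions made for this example; in particular, treating an offset of 0 as "no direct access" is my own convention, chosen because offset 0 is occupied by the isa pointer.

```c
#include <stddef.h>

/* A toy object with one instance variable after the isa pointer. */
typedef struct point {
    const void *isa;
    int         x;
} point;

/* Hypothetical slot layout: a getter plus an optional ivar offset. */
typedef struct accessor_slot {
    int    (*method)(void *self);  /* the ordinary getter method */
    size_t   offset;               /* 0 means "no shortcut; call the method" */
} accessor_slot;

/* The lookup stays dynamic (we still go through the slot), but when an
 * offset is present the method call itself is skipped. */
int get_via_slot(const accessor_slot *slot, void *self)
{
    if (slot->offset != 0)
        return *(int *)((char *)self + slot->offset);
    return slot->method(self);
}

int method_calls = 0;              /* counts real method invocations */

int point_x(void *self)
{
    method_calls++;
    return ((point *)self)->x;
}
```

Switching an object between the two modes, as suggested for KVO, would then amount to clearing or restoring the offset in its slot.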
I said the new runtime was simpler (no exaggeration; it's roughly 10% of the code size of the GNU runtime). That's partly because it works in a slightly different way to other Objective-C runtimes. While the GNU runtime provides the functionality required to implement Smalltalk in C, the new one provides the functionality required to implement Self in C, and then implements Smalltalk in Self (which is very easy).
Every object has (or, rather, can have) its own dispatch table and its own lookup function. Classes really are just objects. Anything you do with classes, you can also do with objects; for example you can add a method to a single object at runtime (say hello to closures, prototypes, and all of the things Io and Lisp programmers have been mocking you for having to do without).
The class model contains a nod to that used by Animorphic Smalltalk. This used mixins as a base type. That's effectively what I'm doing although they're called classes so as not to scare off the old Objective-C programmers. Classes can be composed as mixins are, which is how concrete protocols are supported (the only difference between a concrete protocol and a mixin is that the compiler does type checking for a concrete protocol. From the perspective of the runtime they are the same).
Oh, and every object has an associated recursive mutex, so @synchronized can now be generated easily.
To get an idea of how the runtime is used, take a look at example.c, which contains some simple example Objective-C code in comments and the equivalent code the compiler should be producing. I'd love to see this supported in LLVM, so anyone familiar with that codebase who feels like lending a hand please let me know.
You can also find more information in the release announcement email, including a more detailed overview and the API docs.
None of the interfaces are set in stone yet, so any suggestions are welcome. You can grab a copy of the code by doing:
svn co http://svn.gna.org/svn/etoile/branches/libobjc_tr/
Labels: libobjc, Objective-C, runtime, shiny
Objective-C: Étoilé Vs Leopard
Mac OS X 10.5, codenamed Leopard, is due out in a week or so. One of the features it advertises is Objective-C 2.0. I've written a bit about Objective-C 2.0 before. In this post, I'm going to compare some of the new language features present in Étoilé with those present in Leopard.
Garbage Collection
The big feature all the Java programmers want is garbage collection. GNUstep has actually supported this for a little while. If you use RETAIN() and RELEASE() macros instead of sending retain and release messages, you get the correct stubs for the garbage collector generated when compiling with garbage collection enabled, or -retain and -release messages sent otherwise.
This support was originally begun in 2004, but I'm not aware of anyone who uses it. Part of the problem is that mixing GC and non-GC code is tricky, so it's really only an option for people with no legacy code. Another part is that it adds some runtime overhead.
Loose Protocols
Apple now allows you to specify that some methods in a protocol are optional. Apparently this is useful, but I can't think why. Objective-C gives two ways of accomplishing this already. The first is to use an informal protocol; a category on NSObject with a default (typically null) implementation of the methods. The other is to query at runtime with respondsToSelector: whether a delegate implements a method.
The point of using a formal protocol, rather than an informal one, is so that the compiler can check that you have implemented the methods. Another possible reason is to allow a runtime check for a set of methods at once. A loose protocol gives you none of these. It just moves things that should be in the documentation into the code. Great for people who believe header files are documentation, not so great for the rest of us.
Concrete Protocols
Concrete protocols are a potentially useful part of Objective-C 2.0. They allow protocols to contain default implementations of a method. In Étoilé, we have something similar; typesafe mixins.
Mixins allow you to maintain the separation of interfaces and implementations that Objective-C encourages. Mixins, unlike concrete protocols, are defined as classes. If you want to add a method to a class, you first define a class implementing it, like this:
@interface Mixin : NSObject {
}
- (void) method;
@end
@implementation Mixin
- (void) method
{
NSLog(@"Method called");
}
@end
When you want to apply it to a class, you simply do:
[aClass mixInClass:[Mixin class]];
After this, all instances of aClass will respond to -method (it will not get a double helping of the methods declared in NSObject). You can even declare and use instance variables in the Mixin class. When you apply a mixin you will get an exception if one of the following happens:
- The types of instance variables declared in the mixin do not match those declared in the class. The class can include more instance variables than the mixin, but it must include all of them. This allows mixins to directly access class ivars; something not possible with concrete protocols.
- The types of methods declared in the mixin conflict with the types of methods declared in the class (or a class the class inherits from).
Method Attributes and Properties
Method attributes might be really nice, but the number of GCC function attributes that you are allowed to use is very small. Most of the ones that are actually useful cannot be used in Objective-C without radically changing the way in which method lookup is handled; for example by adding an equivalent of Java's final keyword.
Properties seem at first glance to be a nice idea. They are close to C#'s implicit set and get methods. In terms of expressiveness, they give nothing more than key-value coding already allows us. They may be slightly faster; the compiler could possibly add some code to translate them into ivar lookups if the implementation is for direct access to ivars. Something similar could probably be done with KVC, in the same way that polymorphic inline caching works. I'd be surprised if Apple has implemented this, however. At the moment, the only advantage is to add some confusing extra syntax.
The only remaining feature of Objective-C 2.0 that I recall is the foreach construct. Étoilé has a macro which works in a similar way. The following two lines are semantically equivalent:
FOREACH(anArray, string, NSString*)
for(NSString * string in anArray)
The latter is slightly faster, but requires anArray to implement a very messy countByEnumeratingWithState:objects:count: method, which retrieves 16 objects with a single call. The Étoilé version is slightly slower (although it does do IMP caching for you), but works with any collection that supports -objectEnumerator and so does not require multiple code paths. It's included with EtoileFoundation, so can be used on OS X too, including OS X 10.4 and earlier.
Prototypes and Futures
We've run out of new features for Leopard, but there's still one new one for Étoilé. We have support for prototypes in Objective-C. Any object that inherits from ETPrototype, or implements the ETPrototype protocol can use them. This required a small (binary-compatible) modification to the runtime system to allow delegation of method lookup to the class.
By using nested functions (not supported on OS X), you can create and add methods at runtime, like so:
id anObject;
DEFMETHOD(method)
{
//Code goes here.
}
[anObject setMethod:(IMP)method forSelector:@selector(foo)];
[anObject foo];
You can also declare methods with arguments, and declare them outside the scope of a function / method. Note that, as with nested functions, you can not call the method after the current function has returned if it references any local variables. Note too that instance variables in the method can only be accessed by casting self to the correct type and accessing them explicitly (e.g. ((MyClass*)self)->ivar).
These prototype objects can then be -clone'd, have KVC-accessible ivars added and removed using the -setValue:forKey: method, and be used just as prototypes in Self or Io. We don't restrict you to class-based programming.
Note, however, that prototypes do come with some runtime overhead and so should probably not be used everywhere. The same mechanism can be used for closures; if the nested function you add as a method uses lexical scoping, and is called immediately, then it will work as a block would in Smalltalk.
While not technically a language feature, as I mentioned earlier, we also have support for futures.
Playing with the Runtime Again
Everyone should have a hobby, and at the moment mine seems to be abusing the GNU Objective-C runtime. One of my spare time projects is writing a Smalltalk JIT that uses the GNU runtime to provide the object model (allowing you to subclass Objective-C objects in Smalltalk).
Last night, Quentin demonstrated the old maxim that the way to get anything done in an open source project is to tell the developers that it sucks because it can't do X; pretty soon they'll have it doing X, for any value of X. Quentin's criticism was that it wasn't possible to subvert the message dispatch mechanism very easily.
What does that mean? Let's take a look at what happens when you send a message in Objective-C. First, you write something like this:
[object doSomethingWith:aParameter];
The compiler then converts this into something like this:
SEL sel = sel_get_any_uid("doSomethingWith:");
IMP imp = objc_msg_lookup(object, sel);
imp(object, sel, aParameter);
Note that this is a simplification, and the selector will typically be cached somewhere. The important function is objc_msg_lookup, which returns the function that implements the method. These functions always take the object and selector as the first two arguments, and may take others.
For a language like Smalltalk or Objective-C, this mechanism makes sense. For something like Io, it almost does. The problem, for Io, is the implementation of objc_msg_lookup(). This looks at a sparse array structure in the class structure to find the mapping. This isn't helpful for a prototype-based language like Io, where instances might have different methods to their classes. For Io, you want to be able to alter the behaviour of the objc_msg_lookup() function on a per-object basis. For bonus points, you want to do this without breaking binary compatibility (the GNU C++ standard library people were very unpopular when they did this).
Fortunately, the Objective-C class structure has a field called info, which is a bitfield. Actually, it's half a bitfield; the upper half is used to store the id of the class in the system, limiting you to 64K different classes, and the lower half stores flags. These flags are used for various different purposes, including indicating whether a class has an +initialize method that needs to be called. Not all of them are used, so I added a new one. I then modified the objc_msg_lookup function to include a special case if this is set.
Now, if you set the flag on your class then the runtime will know it wants to handle message lookup itself:
+ (void) initialize
{
CLS_SETOBJECTMESSAGEDISPATCH(self);
[super initialize];
}
You then implement a method like this for your class:
+ (IMP) messageLookupForObject:(id)anObject selector:(SEL)aSelector
This lets you store your own version of a dispatch table in an instance variable, so you can add methods to a specific object at runtime. Objects which are extended in Io can have this flag set on their classes at runtime, and use a separate dispatch mechanism for the Io methods.
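The shape of the change can be sketched in plain C with a toy class structure. None of this is the real libobjc code: the structure layout, the bit chosen for the flag, and the names (toy_class, TOY_OBJECT_DISPATCH, toy_msg_lookup, proto_object) are assumptions for illustration. The 16/16 split of info mirrors the description above.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_SELECTORS       4
#define TOY_FLAG_BITS       16                  /* lower half of info: flags */
#define TOY_OBJECT_DISPATCH (UINT32_C(1) << 5)  /* hypothetical unused flag */

typedef struct toy_class toy_class;
typedef struct toy_object { toy_class *isa; } toy_object;
typedef int (*toy_imp)(toy_object *self);
typedef toy_imp (*lookup_hook)(toy_object *obj, int selector);

struct toy_class {
    uint32_t    info;                    /* upper half: class number */
    toy_imp     methods[TOY_SELECTORS];  /* class-wide dispatch table */
    lookup_hook hook;                    /* consulted only if the flag is set */
};

uint32_t toy_class_number(const toy_class *cls)
{
    return cls->info >> TOY_FLAG_BITS;
}

/* Stand-in for the patched objc_msg_lookup(): one extra test on the flag. */
toy_imp toy_msg_lookup(toy_object *obj, int selector)
{
    toy_class *cls = obj->isa;
    if (selector < 0 || selector >= TOY_SELECTORS)
        return NULL;
    if ((cls->info & TOY_OBJECT_DISPATCH) && cls->hook != NULL)
        return cls->hook(obj, selector); /* the class handles lookup itself */
    return cls->methods[selector];       /* the normal, class-wide path */
}

/* An object that carries its own dispatch table in an instance variable. */
typedef struct {
    toy_object base;
    toy_imp    own_methods[TOY_SELECTORS];
} proto_object;

/* Hook: prefer the object's own methods, fall back to the class table. */
toy_imp proto_lookup(toy_object *obj, int selector)
{
    proto_object *p = (proto_object *)obj;
    if (p->own_methods[selector] != NULL)
        return p->own_methods[selector];
    return obj->isa->methods[selector];
}

int class_impl(toy_object *self)  { (void)self; return 1; }
int object_impl(toy_object *self) { (void)self; return 2; }
```

With the flag clear, every instance resolves selector 0 to the class method; once the flag is set, an instance that has installed its own method gets that one instead, while other instances are unaffected.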
It can also be used for more efficient proxying for local objects. The CoreObject proxy object wants to pass messages that don't change object state right through, without logging them. With this mechanism, it will be possible to implement a hashmap lookup with the same sort of cost as performing a normal message lookup for messages that are passed through (store an NSMapTable in a COProxy ivar containing selector to IMP mappings for the proxied object), and return the message forwarding IMP for those that aren't.
Another, potentially interesting, option would be to combine this with some runtime code generation to dynamically construct proxy methods that would log their arguments and then pass them on without needing to construct an NSInvocation.
Hopefully, this patch will make it upstream as far as the GNUstep version of libobjc, even if it doesn't make it all the way into GCC. Anyone who wants to play with it themselves can find the diff in this mailing list post.
Futures in Objective-C
I like concurrent programming, but I don't like threads. Like files, they're a nice abstraction for operating system designers but not so much for userspace hippies.
In functional languages, you can often get concurrency for free by having a clever compiler, instead of a clever human. This is good, because clever humans are expensive. Clever compilers are too, but they're easier to copy than clever humans.
Consider the following bit of Objective-C:
id foo = [anObject doSomething];
...
[foo doSomethingElse];
In Smalltalk, objects were regarded as simple computers that communicated by message passing. The fact that this message passing was implemented with a stack was hidden. To the Smalltalk way of thinking, the objects were independent. This bit of code sends a doSomething message to anObject, and blocks until it sends a return message.
From here, it doesn't take long to realise that you don't actually need it to block until the [foo doSomethingElse] line. So, can we implement this in Objective-C in the general case? The answer is yes, and that's what the EtoileThread framework (soon to be in EtoileFoundation) does. It works on OS X too.
I actually wrote EtoileThread in a hotel in Dublin in June last year, but I recently rewrote a lot of it to be more efficient. I'd like to give a little overview of how it works.
There are three core components to this. The first is the ETThreadedObject class, which encapsulates an object in its own thread. It's an NSProxy subclass, and forwards messages to the real object. You create it typically via an NSObject category, which adds +threadedNew and -inNewThread methods. When you send a message to the object returned by either of these, the following sequence happens:
- The invocation is caught by the ETThreadedObject and put into a ring buffer.
- The forwardInvocation: method returns, and the calling code receives an ETThreadProxyReturn object.
- The second thread retrieves the invocation from the ring buffer and executes it.
- The second thread passes the real return value to the previously returned ETThreadProxyReturn.
- Any calls to methods in the returned proxy block until this point.
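The sequence above can be approximated in plain C with POSIX threads. This is a simplified sketch, not the EtoileThread implementation: it starts one thread per invocation instead of queueing into a ring buffer, the "object" returned is just a C string, and all of the names (future, invocation, run_invocation, future_value) are invented for the example.

```c
#include <pthread.h>
#include <stddef.h>

/* Placeholder handed back to the caller straight away. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    int             done;
    const char     *value;
} future;

/* A captured call: what to run, and which future receives the result. */
typedef struct {
    const char *(*work)(void);
    future      *result;
} invocation;

void future_init(future *f)
{
    pthread_mutex_init(&f->lock, NULL);
    pthread_cond_init(&f->ready, NULL);
    f->done  = 0;
    f->value = NULL;
}

/* Runs in the second thread: execute the call, then fulfil the future. */
void *run_invocation(void *arg)
{
    invocation *inv = arg;
    const char *v = inv->work();
    pthread_mutex_lock(&inv->result->lock);
    inv->result->value = v;
    inv->result->done  = 1;
    pthread_cond_broadcast(&inv->result->ready);
    pthread_mutex_unlock(&inv->result->lock);
    return NULL;
}

/* Any use of the result blocks here until the second thread has set it. */
const char *future_value(future *f)
{
    pthread_mutex_lock(&f->lock);
    while (!f->done)
        pthread_cond_wait(&f->ready, &f->lock);
    const char *v = f->value;
    pthread_mutex_unlock(&f->lock);
    return v;
}

/* Example "method" to run in the other thread. */
const char *get_foo(void) { return "foo"; }
```

The caller would create the future, hand the invocation to pthread_create(), carry on with other work, and only block once it calls future_value().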
What does that look like in practice? Well, we'll look at the simple example program included with the framework (ETThreadTest.m for those following in svn) and see. First, we define a simple class that has some trivial methods:
@implementation ThreadTest
- (void) log:(NSString*)aString
{
    sleep(2);
    NSLog(@"%@", aString);
}
- (id) getFoo
{
    sleep(2);
    return @"foo";
}
@end
The first just NSLogs whatever is passed to it, and the second returns a constant string. Next, in the main body, we create an instance of this in its own thread:
id proxy = [ThreadTest threadedNew];
We then send this a log message:
[proxy log:@"1) Logging in another thread"];
And then a getFoo message. Recall that the implementations of both of these messages had a 2 second delay built into them. This was introduced to make it obvious which order everything was being executed in.
NSString * foo = [proxy getFoo];
Next, we NSLog something from the main thread, just to show where we are.
NSLog(@"2) [proxy getFoo] called. Attempting to capitalize the return...");
Then, we NSLog the return value from the getFoo method (for good measure, we'll send it a message and NSLog the result, rather than NSLoging it directly):
NSLog(@"3) [proxy getFoo] is capitalized as %@", [foo capitalizedString]);
Finally, since we know we are calling a future, we get the real object out and NSLog it.
if([foo isFuture])
{
    NSLog(@"4) Real object returned by future: %@", [(ETThreadProxyReturn*)foo value]);
}
What happens when we run this? Take a look:
$ ./ETThreadTest
2007-09-23 21:59:39.718 ETThreadTest[25196] 2) [proxy getFoo] called. Attempting to capitalize the return...
2007-09-23 21:59:41.695 ETThreadTest[25196] 1) Logging in another thread
2007-09-23 21:59:43.695 ETThreadTest[25196] 3) [proxy getFoo] is capitalized as Foo
2007-09-23 21:59:43.695 ETThreadTest[25196] 4) Real object returned by future: foo
Note that the NSLog from the main thread completes first. Note also the two second delays. Finally, note that the third and fourth log statements don't complete until after the getFoo method has run, since they depend on the returned value.
What's improved in the implementation of this in the last week? My first version was quite experimental. It used an NSMutableArray to store the invocation queue. This meant that every message going into the queue required these steps:
- Acquiring a mutex.
- Inserting an object into an NSMutableArray.
- Signalling a condition variable (if the array was empty).
- Releasing a mutex.
On the receiving end, you needed the following:
- Acquiring a mutex.
- Sleeping on a condition variable (if the array is empty).
- Removing the first object from an NSMutableArray (quite expensive).
- Releasing a mutex.
This is a minimum of four system calls (a maximum of six) and at least one expensive array operation. This isn't too bad if you are only very occasionally sending messages to your threaded objects, and are expecting them to take a long time to complete. If you are sending a lot of messages, however, the overhead quickly stops it being worthwhile to bother with the second thread.
The new implementation uses a lockless ring buffer in this situation. Inserting an object into this involves a subtraction and a comparison to see if it's full (we just spin using sched_yield() if it is, but with enough space for 128 invocations in the buffer that should be rare), inserting two objects in a C array (the invocation and the proxy value), an addition, and a memory barrier (on weakly-ordered platforms, not on x86). Removing an object switches back to the old-style locking mode if the queue is empty. We don't want to spin if there is no work to do for the consumer thread, because that could last a long time. Then, taking the objects out of the array is just copying their pointers out of the array and another addition.
The ring buffers, like those in Xen, use free-running counters and a mask of the lower bits to translate this into an address.
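For readers who have not met this style of ring buffer, here is a minimal single-producer, single-consumer sketch using C11 atomics. It is not the EtoileThread code: it stores one pointer per entry rather than an invocation and a proxy, it reports "full" and "empty" to the caller instead of spinning or falling back to locks, and the names (ring, ring_insert, ring_remove) are made up.

```c
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 128                /* number of entries; a power of two */
#define RING_MASK (RING_SIZE - 1)    /* keeps only the low bits of a counter */

_Static_assert((RING_SIZE & RING_MASK) == 0, "RING_SIZE must be a power of two");

typedef struct {
    void       *slots[RING_SIZE];
    atomic_uint producer;   /* free-running; written only by the producer */
    atomic_uint consumer;   /* free-running; written only by the consumer */
} ring;

void ring_init(ring *r)
{
    atomic_init(&r->producer, 0);
    atomic_init(&r->consumer, 0);
}

/* Producer side. Returns 1 on success, 0 if the buffer is full. */
int ring_insert(ring *r, void *item)
{
    unsigned p = atomic_load_explicit(&r->producer, memory_order_relaxed);
    unsigned c = atomic_load_explicit(&r->consumer, memory_order_acquire);
    if (p - c == RING_SIZE)          /* a subtraction and a comparison */
        return 0;
    r->slots[p & RING_MASK] = item;  /* the mask turns the counter into an index */
    /* The release store is the barrier that publishes the slot contents. */
    atomic_store_explicit(&r->producer, p + 1, memory_order_release);
    return 1;
}

/* Consumer side. Returns the oldest item, or NULL if the buffer is empty. */
void *ring_remove(ring *r)
{
    unsigned c = atomic_load_explicit(&r->consumer, memory_order_relaxed);
    unsigned p = atomic_load_explicit(&r->producer, memory_order_acquire);
    if (p == c)
        return NULL;
    void *item = r->slots[c & RING_MASK];
    atomic_store_explicit(&r->consumer, c + 1, memory_order_release);
    return item;
}
```

Because the counters are never reset, "full" and "empty" are distinguished by their difference alone, and unsigned wrap-around is harmless as long as the size is a power of two.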
There are two special cases. The futures code only works for methods that return objects. For methods that return non-objects, we need to wait until the invocation has completed. For void return methods, we simply complete asynchronously, but don't return a proxy object.
If you want a more detailed overview of how it all works, there are lots of comments in the code. Have fun, and file helpful bug reports.
XHTML-IM Support
Ages ago, I wrote most of an XHTML-IM parser for StepChat. I didn't enable it in the default build, since it wasn't quite working. Over the weekend, I finished it off, tested it, and added support for generating some XHTML-IM. The code is not completely compliant with the specification, since it doesn't bother checking if the other party supports HTML. This will probably be added soon. It also contains a couple of work-arounds for libgaim-based clients, which interpret the standard quite creatively.
Below, you can see a conversation I had with my debugging persona (yes, I know talking to yourself is a sign of madness, especially when you use the Internet to do it). The first image comes from FreeBSD/GNUstep while the second one comes from OS X. Camaelon is not enabled for Jabber on my FreeBSD box because it was breaking things (I think it's fixed now, but I haven't got around to removing the default).
The more observant among you will notice that I fixed a bug in the handling of bold text in the middle of this conversation.
Labels: jabber, omgponies, stepchat, xhtml-im, xmpp
Drop Shadows
Shadows are nice. Since we got compositing managers for X11, we've had the ability to do this. The xcompmgr program does it, but in a gimmicky, eye-candy way, which doesn't really add much to usability.
Shadows are definitely something we want though, so yesterday I forked xcompmgr and put it in our repository. Today we have some (early) results. Drop shadows:
Three things to note. The first is that the dock and menu bar don't have shadows. The second is that the active window has a bigger one than the others. Finally, there are no shadows visible on the Typewriter window at the top right of the screen. The first two of these are intentional. The third is not, but only occurs in screenshots, not on the screen, and so is probably not very important.
Étoilé 0.2 Troubleshoot
Since the 0.2 release, there have been some issues with setting up the Étoilé environment. Here are a few steps to narrow down the problem.
First, make sure your GNUstep is correctly installed. The current stable release of GNUstep is Make 2, Base 1.14, GUI/Back 0.12. You can also get them from the GNUstep stable branch. Run a few GNUstep applications in any graphical environment to be sure your GNUstep is working, and remember to source GNUstep.sh in your profile. You can get help from the GNUstep mailing list if problems occur at this stage.
Second, after installing Étoilé, run a few user-level applications, such as Typewriter, Sketch, StepChat, Vindaloo, AddressManager and FontManager. They are regular GNUstep applications. You can run them in any graphical environment such as GNOME, KDE or WindowMaker. If you ran 'setup.sh' during the installation, several bundles have also been installed. For testing purposes, you can remove them with:
defaults delete NSGlobalDomain GSAppKitUserBundles
and add them back with:
defaults write NSGlobalDomain GSAppKitUserBundles '(
/usr/local/GNUstep/System/Library/Bundles/Camaelon.themeEngine,
/usr/local/GNUstep/System/Library/Bundles/EtoileMenus.bundle,
/usr/local/GNUstep/System/Library/Bundles/EtoileBehavior.bundle)'
Make sure the paths are correct. You can also try different combinations of these bundles. Camaelon is the theme engine. EtoileMenus is the horizontal menu. EtoileBehavior handles various tasks behind the scenes, and you will not see any change in the user interface with it.
Third, you should be able to set up Étoilé manually. With GNOME, you can log into a failsafe session with xterm only. There should be something similar on KDE. Once xterm shows up, run these system-level applications one by one:
gdomap &
openapp Azalea &
openapp AZBackground &
openapp EtoileMenuServer &
openapp AZDock &
If they all run properly, you are close to having a functional Étoilé environment. Log out of the session by exiting from xterm and log into the failsafe session again. Run the 'etoile_system' tool (no openapp!) and all the system applications should launch automatically. If not, check your SystemTaskList.plist in your GNUstep/System/Library/Etoile/ or ~/GNUstep/Library/Etoile/. It contains all the applications you launched in the previous step.
Finally, to add Étoilé to your GDM, make sure these files exist:
- etoile.desktop, in your xsession directory, such as '/usr/share/xsessions'. It should contain the line 'Exec=/usr/local/bin/etoile'. You can get this file from Etoile/Services/Private/System/.
- /usr/local/bin/etoile. This is the actual script to run 'etoile_system'. It should look like this:
. /usr/local/GNUstep/System/Library/Makefiles/GNUstep.sh
etoile_system
There is a space between '.' and '/usr/local/GNUstep/System/Library/Makefiles/GNUstep.sh'. Also make sure the file's permissions are correct (it must be executable).
If you log into the failsafe session again, you should be able to run the script '/usr/local/bin/etoile' to launch etoile_system, which will then launch all the other system-level applications. By this point, you should have an Étoilé session available in GDM.
If you want to set up the Login.app, see 'Etoile/Services/Private/Login/README' for details.
Update: There is a summary of latest GNUstep/Etoile on Solaris.
The Road to CoreObject Part 2: Why Bother?
Since the last post, a lot of people have asked me 'why are you doing this? What advantage does it actually give?' In this post, I'll try to explain.
One Abstraction, Two Uses
What is a file? Over the last year, I've asked a number of people that, from computer scientists to technophobes. None has managed to give me a clear answer. The next question I asked is 'What is a document?' Everyone I asked gave me a clear answer.
From a user interface perspective, it's clear that a document is a better abstraction than a file. A file is a very convenient abstraction for operating systems; it's basically a virtualised block device with a simple text key (the path/filename) that can be used to uniquely identify it. It is not a good abstraction for users.
Files are used for two things:
- Storing a document.
- Publishing a document.
From a user interface perspective, these are very different tasks. Storing a document is not something that should ever need to be done explicitly. Raskin's first law states:
A program shall not harm a user's data, or through inaction allow the user's data to come to harm.
Everything I do to a document should automatically be stored if possible. In some situations, such as sudden power failure, some data loss is inevitable, but the program should do everything it can to minimise the chance of avoidable data loss. A simple corollary to this is that versioning information should also be stored. If I hit select all, delete, then I don't want the stored form of my document to be overwritten with an empty document. I want an undo feature, and I don't want this to be contingent on keeping the document in memory (select all, delete, {autosave}, power failure, panic).
CoreObject's serialisation function does this. You don't need to explicitly save a document. From the time an application tells CoreObject to manage the object graph representing the document model, you have the ability to replay every single change you've made to it (this actually works in the version in /trunk now, although it needs more testing).
While you don't have to save a document explicitly, you might want to tag it with some metadata. Some of this will be created automatically for all objects (creation dates, modification dates, etc). Some will be created automatically for certain object types (e.g. colour depth, word count, table of contents). Some can be specified manually. This will be indexed by the higher layers of CoreObject. These tags can either be assigned to a specific version, or to the latest version. You might tag a book you are working on with the book title, and also tag the version you sent to the proof readers, so you can jump back to that one to compare with the comments they gave you.
Publishing is a very different problem. When you publish a document, you typically don't want to include revision information, you want a snapshot. A few government agencies have been embarrassed in recent years by forgetting that Word Documents are intended for storing, not publishing, and include a lot of revision information.
How does CoreObject help with publishing? Well, the current implementation doesn't (yet), but the plan is to integrate something like Apple's UTI (or, more likely, UTI itself). This is a type hierarchy supporting multiple inheritance that is orthogonal to the object hierarchy. Each compliant object will publish a number of types that it inherits from, such as rich text, or image. It will also support exporting its contents as each of these. For complex compound documents, the root document will simply query the enclosed components, and assemble a composite of images, text etc. Each object only needs to be able to export to something one layer up the type hierarchy. For example, a word processor might export as rich text, and the system would then convert this to text using a shared component.
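To make the 'one layer up' idea concrete, here is a minimal Python sketch. It is not CoreObject or UTI code: the type names, the TYPE_PARENTS table and the export_as function are all made up for illustration, and real UTIs use identifiers of their own.

```python
# Hypothetical type graph (an assumption, not UTI itself): each type lists
# its parent types together with a converter that turns a value of that
# type into the parent type.
TYPE_PARENTS = {
    "word-processor-document": {"rich-text": lambda doc: doc["body"]},
    "rich-text": {"text": lambda runs: "".join(run["text"] for run in runs)},
    "text": {},
}


def export_as(value, source_type, target_type):
    """Convert value to target_type by climbing the hierarchy one layer at a time."""
    if source_type == target_type:
        return value
    for parent, convert in TYPE_PARENTS[source_type].items():
        try:
            return export_as(convert(value), parent, target_type)
        except LookupError:
            continue  # this parent does not lead to the target type
    raise LookupError(f"{source_type} cannot be exported as {target_type}")


document = {"body": [{"text": "Hello, ", "bold": False},
                     {"text": "world", "bold": True}]}
plain = export_as(document, "word-processor-document", "text")
```

The word processor only knows how to produce rich text; the rich-text-to-text step stands in for the shared component mentioned above.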
What About My Friends
The other important feature of CoreObject is collaboration, which is central to the Étoilé vision. CoreObject's serialisation of invocations allows these to be sent across any kind of network connection. In 0.3, there will be an XML-over-XMPP system for this. This will stream changes between two (or, in theory, more) users as they are made. Some systems exist for doing this in very specific cases, such as SubEthaEdit for text and a few whiteboarding solutions for images. CoreObject will allow us to do this in the general case. Any document that works with CoreObject will be able to be shared in this way.
Because it only sends the deltas, this approach will scale to relatively large object types. Imagine something like a raw digital photograph. These can easily be several tens of megabytes. The changes made to them, however, are usually of the form 'alter the brightness level by 5%,' or 'apply this filter with these parameters.' These are not very big, and so once the photograph is initially shared, it can be tweaked in a collaborative fashion easily.
This is even true of video editing. Something like Apple's Final Cut does non-destructive editing. While the source footage is often tens of gigabytes, the project file is very small, since all it contains are instructions like 'insert ten seconds from source file x at y in the timeline,' and 'cross fade for 10 seconds.' With CoreObject, we get this kind of non-destructive editing for free, and we also get the ability to collaborate on documents like this for free. We could have two people editing the same video on their own machines and having the changes automatically kept in sync. Once it's done, they export it as something like MPEG-4, and anyone can view it irrespective of whether they're using Étoilé.
Labels: CoreObject
Étoilé 0.2 is now officially released
Étoilé 0.2 is now officially released. See the full 0.2 Release Announcement for more information. There are a number of screenshots of this release online. A source tarball is available for download. Those preferring to use subversion should check out /tags/Etoile-0.2 from the repository. If you have any questions regarding this release, please post your queries to the Etoile-discuss mailing list or visit the SILC channel Etoile on silc.etoile-project.org.
First Steps into Étoilé and GNUstep Programming
If you are interested in contributing to Étoilé, either directly or by writing your own applications, I have compiled some resources below for learning GNUstep programming…
In my opinion, the best resource for Objective-C development with the Cocoa/GNUstep frameworks is Apple's documentation. The Cocoa documentation is now excellent, but don't forget that GNUstep isn't always in sync with Cocoa, and Carbon isn't available on GNUstep platforms.
The best place to begin is probably the following Apple guide: Cocoa Fundamentals.
On the GNUstep website, visit the Developer documentation area. From this documentation, you should read the gnustep-make documentation. Note that it isn't really exhaustive and doesn't cover the new features brought by the latest release, 2.0. The Base Programming Manual can be worth reading. It gives some detail on GNUstep-specific extensions such as the documentation system and logging, and includes a very good overview of Distributed Objects with real examples.
GNUstep API references:
- Foundation
- Foundation Additions… This is where you can find documentation on XML support and Runtime utility functions.
- Foundation Tools… This is where the GNUstep documentation system (aka GSDoc) and the related tool autogsdoc are explained.
- AppKit
Cocoa Dev Central is a good place to start with plenty of nice tutorials too.
Here are other very good guides you can read to dig into GNUstep development:
- Application Architecture
- Memory Management
- Document based Application
- Event Handling
- Coding Guidelines
- Objective-C Language
- Drawing
Many good resources are listed on CocoaDev and Stepwise.
If you are interested in books, check:
- Cocoa Programming for Mac OS X. I learnt Objective-C development with the first edition of this book :-)
- Cocoa Programming. This book is outdated, but it's still the only one that covers advanced topics decently.
The Road to CoreObject Part 1: EtoileSerialise
I've now submitted my PhD thesis and more or less wrapped up my Xen Book, so now I'm taking the summer off (apart from the odd article) to work on Étoilé before looking for some kind of job.
I've mainly been focussing on development of CoreObject, specifically the low level components required for CoreObject, this month.
Files Suck
Many of the ideas here come from a discussion between myself and Nicolas a few years back about how computers suck and how we can make them suck less (a recurring theme, that eventually led to the formation of Étoilé when it turned out that we weren't the only people having this conversation).
CoreObject is intended to be one of the foundation pieces of Étoilé. The current roadmap calls for an experimental version in 0.3, a stable interface in 0.4, and a completely stable version in 0.5.
What is CoreObject? Basically, it's a replacement for a filesystem as a programmer and user interface. Files (in the UNIX sense of the word) never were a good abstraction; an untyped series of bytes is no use to anyone. The operating system needs to deal with things like this, but programmers shouldn't have to.
We already have a much nicer abstraction than a file; the object. Unlike files, objects have all of the structure and introspection that we want in order to be able to interact with them programmatically. In Étoilé, we want to treat everything as an object, and objects as first-class citizens.
Persistence
At the lowest level, CoreObject provides a persistence layer, and that's what the EtoileSerialise framework is for. It turns out that the Objective-C runtime stores the names and types of all of a class's instance variables, making it almost possible to serialise arbitrary objects without any extra code.
Why Almost? Because Objective-C isn't really a language, it's two languages that you are allowed to mix together in the same source file. One of these languages is a close relative of Smalltalk. The other is C. Anything that just uses the Smalltalk components of the language is easy, since the runtime stores all of the type information. Anything that uses the C component is impossible, because there is no runtime information at all. Most code that we are interested in lies on the boundary.
The EtoileSerialise framework doesn't serialise anything other than objects. Objects, however, have instance variables that can be C types. Some of these are easy. An int, or float, for example, has its type encoded in the object's description, and so we just need some special handlers for the various C types (of which there are not many). The same is true of static arrays. If you say something like 'int foo[20];' in an instance variable description, the runtime encodes the size of the array and we can retrieve that and transparently serialise the array. The same is true of structs, where the runtime stores the type for each field.
Dynamic arrays start to get a bit harder. A dynamic array in C is just a pointer. If you say 'int * foo' then the only information available to the serialiser is that foo is a pointer to an integer. It could be an array, or a pointer to an aliased intrinsic. If it's an array, we have no way of knowing the size. This is not quite true. On Windows and OS X there are extensions to the malloc() family that let us know the size of a block of memory identified by a pointer, but they are non-portable. This doesn't help us at all when we have something like 'int **' though, since we don't know if it's a 2D array, an index array, or something else. At this point, the serialiser just gives up.
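The difference shows up directly in the type strings the Objective-C runtime keeps for instance variables: 'int foo[20]' is encoded as '[20i]', while 'int *foo' is just '^i'. The following Python sketch is not EtoileSerialise code. The function name and the byte sizes are assumptions for a typical platform, and structs are left out. It shows why the first case can be handled automatically and the second cannot.

```python
import re

# Scalar codes from the Objective-C type encoding scheme ('c' = char,
# 'i' = int, 'f' = float, 'd' = double). The byte sizes are an assumption
# for a typical platform, not something the encoding itself records.
SCALAR_SIZES = {"c": 1, "i": 4, "f": 4, "d": 8}


def serialisable_size(encoding):
    """Return how many bytes to store, or None when the size is unknowable."""
    if encoding in SCALAR_SIZES:
        return SCALAR_SIZES[encoding]
    array = re.fullmatch(r"\[(\d+)(.+)\]", encoding)
    if array:
        # Static array: the element count is part of the encoding.
        element_size = serialisable_size(array.group(2))
        if element_size is None:
            return None
        return int(array.group(1)) * element_size
    if encoding.startswith("^"):
        # Pointer: could be a dynamic array or an alias; no length is recorded.
        return None
    raise ValueError(f"unsupported type encoding: {encoding}")
```

With this, serialisable_size("[20i]") has an answer, while serialisable_size("^i") and serialisable_size("^^i") return None, which is the point where a generic serialiser has to give up or ask the object for help.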
Fortunately, Objective-C is a nice dynamic language, so we can fudge this slightly. For objects that use low-level components of C, we introduce an informal protocol that asks the object to manually serialise it. This takes the ivar name and the serialiser back end as arguments, and so anything the serialiser can do, this method can too. On the deserialisation end, two other methods are available for fixups. One is the converse of this, requesting that an object manually deserialise an ivar. The other is invoked with no arguments once an entire object graph has been deserialised. Note that the serialise and deserialise methods don't do any type checking for manually serialised things, so you can serialise an int as a C string if you need to. One reason you might want to do this is for an object wrapping a file, where the ivar would contain a file descriptor (an int). If you just store this and reload it, you will get nonsense, so you might instead store the file name and re-open the file on deserialisation.
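Here is a rough Python sketch of that informal protocol, using the file descriptor example. The method names serialise_ivar and deserialise_ivar, the FileWrapper class and the dictionary used as a back end are all invented for illustration; the real framework's names and signatures will differ.

```python
import os
import tempfile


class FileWrapper:
    """Wraps an open file; the raw descriptor means nothing after a reload."""

    def __init__(self, path):
        self.path = path
        self.fd = os.open(path, os.O_RDONLY)

    def serialise_ivar(self, name, store):
        """Hypothetical hook: return True if the ivar was stored by hand."""
        if name == "fd":
            store["fd"] = self.path  # store a string where the int used to be
            return True
        return False

    def deserialise_ivar(self, name, store):
        """Hypothetical converse hook: re-open the file instead of loading an int."""
        if name == "fd":
            self.fd = os.open(store["fd"], os.O_RDONLY)
            return True
        return False


def serialise(obj):
    store = {}
    for name, value in vars(obj).items():
        if not obj.serialise_ivar(name, store):
            store[name] = value  # default path: store the value as-is
    return store


def deserialise(cls, store):
    obj = cls.__new__(cls)  # skip __init__, as a deserialiser would
    for name, value in store.items():
        if not obj.deserialise_ivar(name, store):
            setattr(obj, name, value)
    return obj


with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as handle:
    handle.write("hello")
original = FileWrapper(handle.name)
stored = serialise(original)
os.close(original.fd)

restored = deserialise(FileWrapper, stored)
data = os.read(restored.fd, 5)
os.close(restored.fd)
os.unlink(handle.name)
```

Note that nothing checks that the stored type matches the declared one, which is exactly the freedom described above.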
There are two other pieces of the puzzle. One is named structures. Some structures need special handling, and it is a bit rubbish to expect the developer of every object that needs to handle them to know about this. Fortunately, the runtime system knows the name of structures that are used. To make use of this, you can register a function that handles serialisation of a named structure. The serialiser will then call this whenever a structure of this form is encountered. The other part is versioning.
OpenStep's NSObject already has a -version and a -setVersion: method. We make use of this with the serialiser by encoding the version with each serialised class (for subclasses, we encode the version of each class in the hierarchy with the instance variables inherited from that class). The manual deserialiser method takes the version as a third argument. If you change the instance variables of a class, it's easy to add support for deserialising the old version by implementing this. It is even possible to do this in reverse, by supplying a category on the old object that loads the new one, or even deserialise an object using an object of a completely different class using the poseAsClass: mechanism.
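As a sketch of how the version argument helps, assume a class whose version 1 stored a single 'name' and whose version 2 splits it in two. The Contact class, the record layout and the hook name below are hypothetical and written in Python rather than Objective-C.

```python
class Contact:
    VERSION = 2  # version 1 had one 'name' ivar; version 2 splits it

    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    def deserialise_ivar(self, name, store, version):
        """Hypothetical hook: handle ivars that only exist in old versions."""
        if version == 1 and name == "name":
            self.first_name, _, self.last_name = store["name"].partition(" ")
            return True
        return False


def deserialise(cls, record):
    obj = cls.__new__(cls)
    store, version = record["ivars"], record["version"]
    for name, value in store.items():
        if not obj.deserialise_ivar(name, store, version):
            setattr(obj, name, value)  # current-version ivars load directly
    return obj


old_record = {"version": 1, "ivars": {"name": "Ada Lovelace"}}
new_record = {"version": 2, "ivars": {"first_name": "Grace", "last_name": "Hopper"}}
ada = deserialise(Contact, old_record)
grace = deserialise(Contact, new_record)
```

Both records end up as objects with the version 2 layout, so the rest of the program never sees the old one.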
Versioning
We can version classes, but what about objects? It would be nice to have the revision history preserved. We do this by turning a model-controller-view trio into a model-(CoreObject proxy)-controller-view system.
With Objective-C, everything is an object, including the messages you send to objects (the equivalent of method calls in C++/Java). The combination of the message name (selector) and arguments is known as an invocation, represented by the NSInvocation class. Our COProxy object, or a subclass, sits between the model and controller and serialises every message that is sent to the model. This stores the complete revision history of the object. To reload any version of the object, you can just reload and replay the invocations. So that this doesn't take too long, the COProxy object periodically serialises a copy of the object. Currently this is done every 100 messages. In future it will be configurable.
I mentioned COProxy subclasses. The reason for needing these is that Objective-C doesn't have the concept of 'const' methods, i.e. methods that are guaranteed not to affect the state of the object. We don't want to bother serialising these, so we will use a subclass for each class we might use as the principal class in our model to automatically pass these through.
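The record-and-replay idea fits in a few lines of Python. This is only a sketch of the technique, not COProxy: the class names are invented, the log lives in memory rather than in a serialiser back end, and only the 100-message snapshot interval comes from the text.

```python
import copy


class RecordingProxy:
    """Forwards method calls to a model object and logs each one."""

    def __init__(self, model, snapshot_interval=100):
        self._model = model
        self._interval = snapshot_interval
        self._log = []                               # (method name, args) pairs
        self._snapshots = {0: copy.deepcopy(model)}  # message count -> full copy

    def __getattr__(self, name):
        # Only reached for names the proxy itself lacks, i.e. model methods.
        method = getattr(self._model, name)

        def record_and_forward(*args):
            result = method(*args)
            self._log.append((name, args))
            if len(self._log) % self._interval == 0:
                self._snapshots[len(self._log)] = copy.deepcopy(self._model)
            return result

        return record_and_forward

    def version(self, count):
        """Rebuild the model as it was after the first `count` messages."""
        start = max(n for n in self._snapshots if n <= count)
        model = copy.deepcopy(self._snapshots[start])
        for name, args in self._log[start:count]:
            getattr(model, name)(*args)
        return model


class Counter:
    def __init__(self):
        self.total = 0

    def add(self, amount):
        self.total += amount


proxy = RecordingProxy(Counter(), snapshot_interval=100)
for _ in range(250):
    proxy.add(2)
earlier = proxy.version(150)  # starts from the snapshot at 100, replays 50 messages
```

The snapshots bound the replay cost: reloading any version never replays more than one interval's worth of messages.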
We thus have saving, restoring, and versioning of arbitrary objects for free. Since we're greedy, we want more. Let's also have branching and merging. Branching is easy; we just define two objects with the same previous version. What about merging? Well, I think we can do this by re-playing the invocations from the two branches in an interleaved way. This will probably be done with a UI allowing the user to select which invocation should be run next.
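A minimal sketch of that interleaved replay, assuming each branch is simply the list of invocations made since the common ancestor. The merge function and the choose callback are hypothetical; in the real thing the callback would be a UI asking the user, whereas here it just alternates.

```python
import itertools


def merge(base, ours, theirs, choose):
    """Replay two branches onto a copy of base, in an order picked by choose."""
    merged = list(base)
    ours, theirs = list(ours), list(theirs)
    while ours or theirs:
        # Ask for a decision only while both branches still have work left.
        if not theirs or (ours and choose(ours[0], theirs[0]) == "ours"):
            name, args = ours.pop(0)
        else:
            name, args = theirs.pop(0)
        getattr(merged, name)(*args)
    return merged


base = ["title"]
ours = [("append", ("intro",)), ("append", ("body",))]
theirs = [("insert", (0, "cover")), ("append", ("index",))]

turns = itertools.cycle(["ours", "theirs"])  # stand-in for the user's choices
result = merge(base, ours, theirs, lambda mine, yours: next(turns))
```

Here the document model is a plain Python list and the invocations are list methods, which keeps the example self-contained.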
Anything else? What about collaboration? Since we are serialising invocations, we can pass them over a network to another user, and they can keep their copy of an object in sync with ours. With a simple locking protocol, we can have bi-directional syncing. The serialiser and deserialiser are split into a front and back end, with the back end defining the storage format. At the moment there is a binary file format and a simple human-readable output-only back end for testing. An XML backend will be added too, allowing objects (including invocations) to be passed over XMPP (Jabber).
Oh, one more thing. This also gives us non-destructive editing of any arbitrary object type, from text through images to video, as long as we have an object encapsulating it. Excited yet?
Does it work?
So, what's the current status? Actually, pretty good. The code contains a few 32-bit x86isms that need fixing. Serialisation works for a lot of objects with no modification, and more with a little tweaking. Deserialisation is a bit less finished. Deserialisation of named structures is not finished, and neither is the special code-path for serialising invocations. The COProxy object works, and serialises invocations properly. It does not yet include a mechanism for re-loading them, but the example back end allows you to see that all the required information is saved.
Currently, the build system creates a test app, rather than a framework. This will be changed towards the end of the month, when the first alpha will be ready. The interfaces are still likely to change, however, so don't start integrating it into your code yet. Étoilé 0.3 will have a fairly stable version for everyone to play with and, hopefully, some of the higher levels of CoreObject, which will deal with metadata, indexing, and type conversions too.
Idiot! Use Smalltalk!
Some people will be reading this and thinking 'this whole thing would be much easier in Smalltalk.' This is true. In Smalltalk it would be possible to write the completely generic version, and use the garbage collector to track any aliasing, etc. So, why don't we use Smalltalk? Smalltalk's a great language (ask Nicolas why, but only if you've got a long time to listen to the answer), and easy to learn; even small children can pick it up quickly. It doesn't, however, play nicely with other languages (or even GUIs), and there are not many people who know the language well.
Nicolas has described Étoilé as a 'pragmatic Smalltalk.' We sacrifice some of the nice features of Smalltalk, but gain the ability to make use of lots of legacy C code. Objective-C isn't quite as nice as Smalltalk, although it's close, but we gain a lot more from the Objective-C frameworks than we lose from the language.
Labels: CoreObject, EtoileSerialise, shiny
Font Manager
I've been working on a new app for Étoilé: Font Manager. Font Manager is, as the name suggests, an application with which to manage fonts.
As you can see in the above screenshot, Font Manager already shows you your installed fonts, and presents samples of them. Planned features are: disabling/enabling fonts, previewing and installing fonts, and automatic conversion to the nfont format.
Update 8/7/07: New URL for image & thumbnail.
Jabber Update
There have been lots of changes in the Jabber code recently. The GNUstep version has had a bit of extra hacking to work around some GNUstep bugs (I'm going to have to have a serious look at the GNUstep implementation of NSOutlineView over the summer), and Jesse and I have been working on integrating µblogging and Jabber.
First, a quick introduction to the Jabber code. The /Services/User/Jabber directory in trunk contains an application, StepChat, and two frameworks. One of these handles XML parsing, the other handles XMPP. The application is a thin wrapper around the XMPP framework. Currently it works on OS X and GNUstep (although there are some GNUstep-related bugs remaining). Eventually, the code will diverge into two branches, one taking on Étoilé-specific features, the other remaining a stand-alone application for OS X and GNUstep-without-Étoilé users.
The framework uses a 'people' and 'identities' abstraction that is likely to show up in other parts of Étoilé. It understands that a person can have multiple identities (e.g. a work Jabber ID, a home Jabber ID, and an MSN Messenger account) and treats the person as the important one. When you have a conversation, it is with a person. You can swap between talking to any of the person's identities transparently. If you have a conversation active then it will automatically pick the most relevant identity to talk to, so if a person moves from one machine logged in with one account to another logged in with another (or the same account with a different resource) then it will switch over to the correct one automatically. It maintains the same abstraction in the roster, which has a three-tier hierarchy of groups, people and identities. Of course, if you want to talk to a specific identity, you still can.
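In code, the abstraction might look something like the Python sketch below. The class names, the fields and especially the rule for picking the 'most relevant' identity (online first, then most recently active) are assumptions for illustration, not what the XMPP framework actually does.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    address: str
    online: bool = False
    last_active: float = 0.0  # timestamp of the last message from this identity


@dataclass
class Person:
    name: str
    identities: list = field(default_factory=list)

    def best_identity(self):
        """Assumed rule: prefer online identities, then the most recently active."""
        return max(self.identities, key=lambda i: (i.online, i.last_active))


work = Identity("alice@work.example", online=True, last_active=10.0)
home = Identity("alice@home.example")
alice = Person("Alice", [work, home])

# Three-tier roster: groups -> people -> identities.
roster = {"Friends": [alice]}

first = alice.best_identity().address
home.online, home.last_active = True, 20.0  # Alice moves to her home machine
second = alice.best_identity().address
```

The conversation is held with alice throughout; only the identity that messages are routed to changes.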
On to the µblogging stuff. Jesse has been playing with a Jaiku µblog recently. Since µblog entries are very similar to status messages in an IM application, it made sense to try integrating them. A couple of changes were made to StepChat for this. First, a new status message UI was created, adding a text field to the roster so it's easy to set a message (the old UI was a quick hack), and a 'presence log' UI was added to give an aggregated view of messages set by others. Next, the presence code was modified to send a distributed notification whenever the presence was updated.
Jesse and I put together an example tool which makes use of this and pushes updated presences out to Jaiku. Now, when you set your presence in StepChat, it will set your presence and push it out to Jaiku. Anyone who wants to bother could add support for Twitter fairly easily too.
Build individual Étoilé components
The Étoilé project comes with many components, but you don't have to build everything if you don't want to use every component. You can just go to an individual subdirectory and build a component with 'make; sudo make install'. Most applications are under /Etoile/Services. Components in the User subdirectory are for regular tasks. They are often launched by users when needed. Components in the Private subdirectory usually stay running through the whole session after login.
If you choose to build Étoilé components individually, there are two levels of dependencies. First, some applications depend on other frameworks or libraries. If a build fails, you should be able to find out which framework or library is needed. Second, some applications depend on other applications. GNUstep provides an excellent mechanism for inter-process communication and Étoilé takes advantage of it. In that case, the component will build and run, but may not behave as expected. An example is that AZDock depends on Azalea.
If you run into any problems building or running an Étoilé component, you can ask people on the Étoilé SILC server at silc.etoile-project.org. Here is a list of SILC clients to use.
Progress in May
Besides fixing bugs and preparing LiveCD 0.2, we have new applications in -stable:
And some experimental applications in -trunk:
While these are small applications, they demonstrate that GNUstep applications can easily work together.
Login.app
Screenshot of Login.app to be used with GDM. It will be available in the coming 0.2 release of Étoilé.
Sketch
Sketch is Nicolas' port of Apple's Sketch from the developer examples. Besides being very straightforward to port, it uses the original nib files (not gorm files), because GNUstep and Gorm support reading and writing nib files. It is not fully functional yet, but should be easy to fix. It is in etoile/trunk/Etoile/Services/User/Sketch
Étoilé cooler inside GNOME... who knows!
Here is a funny screenshot of Étoilé environment running inside GNOME.
I just improved menu support today. If the menu bar isn't in the usual location, the menus are still properly displayed. Étoilé inside GNOME is now truly usable :-) By the way, Jesse also improved the menu look to better integrate with Nesedah, as you can see when you click on the screenshot thumbnail. The new menu gradient was featured in the mockups posted last week.
To start the environment, I just typed etoile_system in gnome-terminal. To exit, I can choose the Log out entry in the Étoilé top left menu. Only GNUstep applications are quit in this corner case.
You can also see that AZDock, on the left border, nicely tracks the X applications currently running too.
Maybe I should mention that the main reason behind this patch was to make debugging Étoilé system stuff (Menu, Session etc.) friendlier.
Google Summer of Code
GNUstep, on which Étoilé is built, was awarded two Google Summer of Code places this year. This will fund two students to work on GNUstep over the summer period. One of the two will be mentored by Étoilé developer David Chisnall, working on some missing AppKit classes and tidying (read: rewriting) the GNUstep text system. While neither student will be working on Étoilé directly, anything that strengthens GNUstep will benefit Étoilé.
Étoilé Stable
We have created an Étoilé stable branch for regular users. It can be obtained through svn:
svn co http://svn.gna.org/svn/etoile/stable/Etoile Etoile
Components in the stable branch are relatively stable. Packagers are advised to use the stable branch instead of trunk. A simple 'make' can build everything[1]. Here is a short list of what we offer in the stable branch:
- Menu bar (MenuServer + EtoileWildMenu)
- Window manager (Azalea)
- Dock (AZDock)
- Desktop background (AZBackground)
- Modern theme (Camaelon)
- Dictionary (DictionaryReader)
- RSS reader (Grr)
- PDF viewer (Vindaloo)
- Address book (AddressManager)
- Calculator (Calc)
- Text editor (Typewriter)
- and many other components behind the scene
[1] PopplerKit uses the C++ language. Users may need to specify their C++ compiler to build it. For example: 'CXX=g++-4.1 make'.
New Etoile Mockups
I designed some new mockups, available here:
Descriptions of all this stuff are on the mailing list.
Étoilé: The Thesis!
Today the Étoilé project got a huge compliment from student Michal Čáp in the form of a thesis paper proposal:
I am 22 y.o. computer science student from the Czech Republic, but I’m studying at England (Coventry) this year. I am writing here because I am now working on my final year project, which is about Étoilé. I am basically aiming to produce a work that should describe what is Étoilé project about and what are the main ideas behind. I am doing that because I am interested in the ideas put forward by Étoilé project. I can remember that when I read about Étoilé for the first time it took me quite a lot of time and effort to find out what is it all about and why is it so different. Therefore I want to work out a paper that would allow people to get into Étoilé more quickly.
He’s already done some really cool mockups. While some of the stuff is in need of updating due to recent changes, I’m still really impressed about what he’s been able to put together based on some random (and poorly organized) notes, emails and sketches. I can’t wait to chat more with him and get some of the blurrier points of the UI crystalized.
Install Étoilé (Survivor Guide)
Here are some quick instructions to build and install Étoilé…
First you need to build and install the latest GNUstep svn version (not the latest release), then you can build and install the Étoilé svn version too.
The easiest way is to download a repository snapshot for both GNUstep and Étoilé.
Now you have to take some time to install the GNUstep and Étoilé dependencies. They are described in this document for Ubuntu Linux. You can probably extrapolate the packages you need for other distributions or platforms from this document. If you have worked out a list of packages to install on platforms other than Ubuntu, we would be interested in including this list directly in the Étoilé repository, as we already did for Ubuntu.
After that, you have to build and install GNUstep. Detailed instructions can be found here. In short, inside core there are four modules to build. Inside each one, type ./configure && make && sudo make install. Do it in this order: make, base, gui, back.
Then you are ready to actually install Étoilé. Unless you have already installed the oniguruma regular expression library on your computer, you probably need to build it by hand, since packages usually don't exist. Then Étoilé...
- Go into /trunk/Dependencies/oniguruma and type: ./configure && make && sudo make install
- Go into /trunk/Etoile and type: make && sudo make install && ./setup.sh
Now Étoilé should be successfully installed; you just have to log out and choose the right session in GDM, KDM or similar. If you cannot find the Étoilé session as a choice, you should submit a bug report. However, you can still use Étoilé by starting it with the etoile script located in /usr/local/bin.
A screenshot
Although it looks similar to the previous ones, much has been improved underneath.
GNUstep participates in Google Summer of Code 2007
This year, GNUstep got accepted to SoC. Some projects such as enhancing text system and printing, or porting WebKit to GNUstep will help Étoilé in general. You can also digg the story to help spread the word.
Keyboard control and theme of Azalea
Azalea is a port of Openbox 3, so Openbox configuration also applies to Azalea. In addition, Azalea comes with a default configuration for keyboard control and theme under Resources/openbox/rc.xml and Resources/themes/Azalea/ respectively. Here are some of the keyboard controls in rc.xml:
- Ctrl+Alt+Left: move to the left desktop.
- Ctrl+Alt+Right: move to the right desktop.
- Ctrl+Alt+d: toggle desktop (hide and show all windows).
- Alt+Tab: next window.
- Alt+Space: show window menu (same as mouse left-click on icon on window title bar).
- Ctrl+Space followed by 'r': bring up root menu (same as mouse right-click on empty space).
If a menu is brought up, use the arrow keys to move around and the 'enter' key to select. Keyboard control is handy in certain cases. For example, if the desktop is managed by GWorkspace, a mouse right-click cannot bring up the root menu, but keyboard control still works.
Setup Spell Checker for GNUstep
First, the GNU Aspell headers and library have to be installed. When you configure gnustep-gui, these checks should all report yes:
checking for new_aspell_document_checker in -laspell... yes
checking aspell.h usability... yes
checking aspell.h presence... yes
checking for aspell.h... yes
After compilation of gnustep-gui, GSspell.service should be available under GNUstep/System/Library/Services.
Now, spell checking is available, for example, in Typewriter. If not, try running make_services.
FOSDEM 2007
Many of the Etoile devs will be at FOSDEM in Brussels next month. Stop by the GNUstep booth and say hi!
Minimal Étoilé
This is an update to the previous post. Although you can probably do everything by following INSTALL, here are some details about what is going on. I use Ubuntu/PPC 6.10.
All the dependencies can be installed through Ubuntu package manager and GNUstep can be installed by following GNUstep Build Guide. Art backend is recommended and ArtResources is already included in GNUstep.
Étoilé can be downloaded through SVN. These are the components to install:
The default theme is etoile/Themes/Nesedah.theme and should be copied under ~/GNUstep/Library/Themes.
Two user defaults should be added (assuming GNUstep is installed under /usr/local/):
Now, the system is ready to go. Run setup.sh to set up the environment, or follow these steps:
#!/bin/sh
. /usr/local/GNUstep/System/Library/Makefiles/GNUstep.sh
etoile_system
The next time you log in, choose Étoilé. etoile_system will execute the tools and applications in SystemTaskList.plist mentioned above.
Have fun !!
If you want, just log out...
Log out support has been fully implemented recently.
Recent System updates and the addition of the EtoileBehavior bundle now make it possible to log out in a safe and clean way.
Safe and clean means every running application is asked to quit and can say whether it is able to. EtoileBehavior is registered as an AppKit bundle when you run the setup.sh script; it is then automatically injected into any AppKit-based application at launch. EtoileBehavior currently includes the client part of the protocol that makes log out possible, since GNUstep doesn't include anything to do this by default.
Any application can cancel the log out, either immediately or after some delay.
An application that asks for a delay has 45 seconds to reply to the log out request; otherwise the log out is cancelled. For testing purposes, the reply delay is currently reduced to 15 seconds. System waits to be notified that all applications have terminated, then it terminates all system processes (like gpbs, Azalea etc.) and exits.
Really Simplifying Internal Dependency Handling
This is a followup to my previous post: Simplifying Internal Dependency Handling.
Why 'Really' in the title? When a module relies on another one, you now don't even need to declare this dependency. In my first attempt, discussed previously, the DEPENDENCIES variable was used to do so. All support related to this variable has been removed. You also no longer need to declare a module used as a dependency in etoile.make.
To make things easier to understand, I'm going to take Grr and RSSKit as an example yet again. The only thing you need to know is that Grr depends on the RSSKit framework.
What you have to add now:
- Grr GNUmakefile: -include ../../../etoile.make
- RSSKit GNUmakefile: -include ../../etoile.make
- etoile.make: nothing
Behind the scenes, any module that includes etoile.make in its GNUmakefile now gets exported to a shared Build directory at the root of the repository. In the first attempt, the dependency was imported directly into the module that needed it.
What you had to add before:
- Grr GNUmakefile:
  DEPENDENCIES = RSSKit
  -include ../../../etoile.make
- RSSKit GNUmakefile: nothing
- etoile.make: RSSKit = Frameworks/RSSKit
Étoilé: What's going on? (August/September/October 2006)
Well... I haven't written anything about the ongoing development in a long time, so I guess I should summarize a bit :-) I will first talk about what happened from August to October, and leave November for another post.
August 2006
August was a fairly calm month, mostly spent working on existing software. Quentin and Günther worked on UnitTests (a graphical tool to run... unit tests, yes!), fixing some bugs, reimporting the gorm (ui) file, etc. Quentin also worked on setup.sh, a script that automatically sets up your environment properly to run Étoilé, and on various makefile tricks (better dependency support, etc.). WorkspaceSwitcher (a plugin for the menu system, to change workspaces) and Grr (the RSS reader) are now included in the build process by default. Günther continued to work on DictionaryReader, adding local dictionary support (jargon file included). Finally, Yen-Ju updated the Io bridge and Azalea to their latest versions.
September 2006
September was much more active...
System
Quentin worked on the MenuServer, integrating it more tightly with other parts of Étoilé by adding hardware and LookAndBehavior entries. It's also more Azalea-friendly, using XWindowServerKit by default and thus eliminating window position issues. He also improved the build process, reworking internal dependencies, the makefile system, etc.
Yen-Ju created AZDock (see this post below for some screenshots), a simple dock application for Étoilé.
Frameworks
Yen-Ju added some new functions to XWindowServerKit to get properties and commands. He also updated Oniguruma (the regular expressions library) to version 4.4.2. OgreKit was also updated (OgreKit is a regular expressions library for Objective-C which uses Oniguruma), and he added an example of how to use it. Talking about examples, he also added a LuceneKit example. Finishing with the frameworks, he imported into Étoilé PopplerKit, an Objective-C framework written by Stefan Kleine Stegemann interfacing with Poppler, the pdf library, and AddressesKit, a framework implementing the Cocoa AddressBook framework written by Björn Giesler. Neither is really updated anymore, so having them in Étoilé simplifies getting them and working on them.
Applications
Yen-Ju updated Azalea to the 3-3-release version of openbox. He also imported into the repository Vindaloo, a pdf viewer using PopplerKit (also written by Stefan Kleine Stegemann), and AddressesManager, an addressbook manager written by Björn Giesler. Finally, he wrote and imported TypeWriter, a basic rtf editor (using OgreKit for the find panel though).
Quentin imported Calc into the repository, a simple calculator written using the Io bridge by Baptiste Heyman (see this post below).
October 2006
October was an even more active month :-)
System
Quentin worked on the build process -- including Io by default, as well as various frameworks (the EtoileExtensionsKit framework being split into EtoileFoundation, EtoileUI and DistributedView). He also worked on System (etoile_system), bringing a major update introducing basic session management and better process handling (based on code written by Saso Kiselkov). MenuServer was thus updated to enable the log out feature :-) (finally!)
Frameworks
Yen-Ju worked on many, many things this month.. First, on frameworks: he improved XWindowServerKit (a framework encapsulating x window commands), RSSKit (the RSS framework used by Grr), with fetching content in background. He also wrote CollectionKit, a framework providing common storage facility for records with properties, such as contact information, playlist, bookmark.. and used it to provide an implementation of BookmarkKit (whose job is.. yes, you guessed it!). He also added a BookmarkKitExample. Armed with these toolkits, he started working on a new version of Grr, using BookmarkKit and CollectionKit, with searching implemented, etc.
Not satisfied with such a low amount of work, he then updated the Io bridge to the current version, as well as Calc. He then worked on a couple of Azalea bug fixes and on AZDock (keeping track of recent applications, etc).
Günther worked on RSSKit, adding code to store and load articles as property lists, and a beginning of podcasting support. He also relicensed it to LGPL.
Applications
Günther fixed some bugs in DictionaryReader and also improved the speed.
Finally, as Yen-Ju had a few minutes left, he started writing Babbler, a media player based on GStreamer, and finally wrote Mantella, an experimental browser based on GtkMozEmbed -- yes, a GNUstep web browser using Mozilla! (see the post below from Yen-Ju).
That's all for today, I will post the summary for November in the next few days... (this time with screenshots if I can)
Mantella
An experimental web browser based on GtkMozEmbed.
etoile-dev members, please resubscribe
On September 13, the etoile-dev list got wrongly recreated for an unknown reason. This issue led to the loss of the membership list, so every member of the etoile-dev list must resubscribe to it, since I have no copy of the members list elsewhere.
I'm really sorry for the inconvenience, but I suppose this bug is a side effect of the update of the Savane tools used by GNA (the update was underway at the time); you probably noticed the refreshed web interface. The etoile-commits list was wrongly recreated too, since I received a similar mail about it the same day.
Here is the link to subscribe: https://mail.gna.org/listinfo/etoile-dev/
I only came back last night, which is why this list issue is only fixed now.
Io and GNUstep
This is a screenshot of a calculator written in Io and GNUstep via ObjcBridge by Baptiste HEYMAN. The user interface is made with Gorm. It can be found in the latest Io. After compiling and installing Io, use 'openapp ./Calc.app' under addons/ObjcBridge/samples to run the calculator. The Io file (Calc.app/Calc) may need to be made executable. The statically linked ioobjc in Étoilé (under Etoile/Languages/Io/) can also be used.
Étoilé on Wikipedia
Étoilé on CIA
You can now check the svn commits of Étoilé on CIA.
Screenshot of Étoilé dock
Simplifying Internal Dependency Handling
Inside the Étoilé repository, it's common to have a module depending on other Étoilé modules. To build such a module you need to include the headers of the dependency and link the related object code (usually in a library or framework form).
This is easy if you compile and install modules one by one, taking care of any dependencies yourself. For example, the Grr application relies on the RSSKit framework. With a fresh copy of the repository, you first step into Frameworks/RSSKit and type make && make install, then you move to Services/User/Grr and type make && make install once again.
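The manual procedure above can be written down as a small script. This is only a sketch: build_modules and DRY_RUN are hypothetical names made up for this illustration (they are not part of the Étoilé build system), and the module paths are assumed to be Frameworks/RSSKit and Services/User/Grr relative to the repository root.

```shell
#!/bin/sh
# Build and install modules one by one, in the order given.
# Dependencies must be listed before the modules that need them.
# build_modules and DRY_RUN are hypothetical helpers for this sketch.
# With DRY_RUN=1 the commands are only printed, not executed.

build_modules() {
    for dir in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "(cd $dir && make && make install)"
        else
            # Stop at the first module that fails to build or install.
            (cd "$dir" && make && make install) || return 1
        fi
    done
}

# From the root of the repository, RSSKit first because Grr links against it:
# build_modules Frameworks/RSSKit Services/User/Grr
```

Listing the dependency first is the whole trick here; the script does not discover dependencies by itself.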
This becomes trickier when you do make && make install for the whole repository (especially when you install Étoilé for the first time). Without any GNUmakefile tweaks, Grr compilation will fail, reporting an error about RSSKit missing on your system. Until now, every module in the repository had to be hacked in order to be included in the default build process. The default build process is what gets compiled when you type make at the root of the repository.
The hack usually took the form of a GNUmakefile.postamble and a GNUmakefile.preamble (both from the PreferencesKitExample module in this case).
To avoid the repetition of this boilerplate code everywhere, I recently committed a new makefile called etoile.make directly at the root of the repository. By including it in your GNUmakefile, the code you have to write to handle an internal dependency is drastically reduced. Here are the lines needed to handle the RSSKit dependency in the Grr GNUmakefile:
DEPENDENCIES = RSSKit
-include ../../../etoile.make
Finally, before including this code, you need to check this dependency is already declared in etoile.make with a line like:
RSSKit = Frameworks/RSSKit
If it's not the case, you need to update etoile.make too.
There is currently a limitation with etoile.make: you cannot declare more than one dependency through the DEPENDENCIES variable, but this should improve really soon.
Tutorial: Accessing X window system from GNUstep, part I
While GNUstep is a portable development environment, some applications do need to access the underlying X window system. This tutorial illustrates a simple way to do so.
Note: the code may not be executable as is. It is only meant as a demonstration.
First, the application delegate needs to register itself for X window events:
  - (void) applicationWillFinishLaunching: (NSNotification *)aNotification
  {
    Display *dpy = (Display *)[GSCurrentServer() serverDevice];
    int screen = [[NSScreen mainScreen] screenNumber];
    Window root_win = RootWindow(dpy, screen);
Then the application delegate can listen to the events:
    /* Listen event */
    NSRunLoop *loop = [NSRunLoop currentRunLoop];
    int xEventQueueFd = XConnectionNumber(dpy);
    [loop addEvent: (void*)(gsaddr)xEventQueueFd
              type: ET_RDESC
           watcher: (id<RunLoopEvents>)self
           forMode: NSDefaultRunLoopMode];
  }
After [NSApp run], X window events will go into -receivedEvent:type:extra:forMode: and the application delegate can make use of them.
  - (void) receivedEvent: (void *)data
                    type: (RunLoopEventType)type
                   extra: (void *)extra
                 forMode: (NSString *)mode
  {
    /* dpy and server are assumed to be kept by the delegate,
       for example as instance variables */
    XEvent event;
    while (XPending(dpy))
      {
        XNextEvent(dpy, &event);
        /* Intercept event here */
        switch (event.type)
          {
            case Expose:
            case DestroyNotify:
            case PropertyNotify:
            case FocusIn:
            default:
              /* Go back to GNUstep */
              [server processEvent: &event];
          }
      }
  }
If the application needs to listen to the root window, call XSelectInput() at the end of -applicationWillFinishLaunching:. For example:
    /* Listen to root window for window closing and opening */
    XSelectInput(dpy, root_win, PropertyChangeMask);
By default, applications only receive events acting on their own windows. If they listen to other windows, such as the root window, do not pass the events belonging to those other windows on to the application ([server processEvent: &event]). Otherwise, it will behave strangely.
A short guide for beginners
This is a walk-through to get a minimal Étoilé system running. It is not a complete guide and is for people who have some knowledge of Unix, mostly programmers. I use Ubuntu/PPC 6.06.
All the dependencies can be installed through the Ubuntu package manager, and GNUstep can be installed by following the GNUstep Build Guide. The Art backend is recommended and ArtResources is already included in GNUstep.
Étoilé can be downloaded through SVN. These are the components to install:
The default theme is etoile/Themes/Nesedah.theme and should be copied under ~/GNUstep/Library/Themes.
Two user defaults should be added (assuming GNUstep is installed under /usr/local/):
Now, the system is ready to go. Azalea is the window manager and EtoileMenuServer is the menu bar on top of the screen. To start, use 'openapp Azalea.app' and 'openapp EtoileMenuServer.app' from a terminal emulator.
Étoilé: What's going on? (July 2006)
Another summary of the svn activity :-)
Well, it's already the 8th... so I'll simply sum things up to today :-)
Yen-Ju did some updates on Azalea, as well as updating the Io VM; Saso fixed MenuServer to use the proper defaults key for suppressing the app icon tiles that GNUstep generates. Quentin imported Oniguruma 4.2.1 (released on 31/07/2006), the regular expression engine used in LuceneKit, into the repository (in the Dependencies folder) -- easier for people to get it that way, as it's apparently slightly exotic on some operating systems / distributions.
Günther continued to work on the new RSSKit -- more refactoring, some fixes, multithreading code, and doxygen documentation. Grr (the rss reader) then also had its share of updates and the current version should work with the current RSSKit. Günther also worked on DictionaryReader (the dictionary application), and did some refactoring and a few UI improvements (avoiding unnecessary line breaks, an edit menu item to copy/select all). Finally, he added icons to the Monitor and the Look and Behaviour preference panels.
Quentin fixed a few compilation issues in ServicesBarKit, and fixed the gorm file for the Hardware pref panel (it was apparently corrupted in the cvs to svn transition).
Quentin also worked on System -- the Étoilé main system process, which plays the role of an init process and a main server process (taking care of starting/stopping and monitoring Étoilé core processes, possibly restarting them if they die). So System is now a real server and daemon, Distributed Objects support was added, it is now compiled by default when compiling Étoilé, and there is even some early documentation. Also, an "etoile.desktop" file was added -- which means you can now easily start Étoilé (Azalea, MenuServer..) directly from a login panel such as GDM/KDM :-) System is still in its early stages, but looks promising.
Finally, Quentin also added two nice postinstall scripts:
- setup.sh: installs Étoilé transparently
- setdown.sh: reverts the effect of setup.sh
setup.sh nicely installs everything you need, in the proper place: EtoileWildMenu bundle, the Camaelon theme engine, Nesedah (our default GUI theme), System, etc.
Étoilé: What's going on? (June 2006)
A summary of what happened this last month on the subversion repository...
There wasn't as much activity compared to the previous month -- I'd guess for various reasons (end of academic year, exams, holidays, etc.) ;-)
Most of the activity was concentrated on updating the code, fixing bugs, reorganization, or refactorization.
The only addition to the repository is a new menulet (MenuServer plugin) by Yen-Ju Chen, WorkspaceSwitcher, which, as its name indicates, allows you to switch between virtual desktops provided by Azalea.
Günther Noack worked on RSSKit -- more unit tests, some refactorization (a new class hierarchy for RSS types -- each class contains code to parse documents of that type). Thus the current version is deemed experimental, but bodes well for the future ;-)
He also worked a bit on DictionaryReader, fixing some bugs ('bad dictionary name' bug in particular).
Saso Kiselkov added Solaris OS support code in the MenuServer, as well as fixing some bugs and reorganizing the makefiles.
Quentin Mathé worked on the ServicesBarKit to prepare its future integration with the MenuServer (basically, the ServicesBarKit lets applications provide "tray icons" -- currently displayed in a separate window; in the near future it will be integrated into the MenuServer so those tray icons will appear directly in the menu, similar to OS X). He also fixed a few things in the compilation process, added some better log statements in MenuServer, and reorganized a few things in EtoileExtensionsKit.
Azalea (our window manager) was updated by Yen-Ju, Quentin also improved the build process to better deal with the XWindowServerKit dependency.
Finally, Yen-Ju continued to work on the Io bridge, adding ObjC constants and importing an example (Developer/Examples/IoExample/example.io) on how to use the Io Bridge to create a simple gui application.
To sum it up, lots of "internal work" this month... :-)
People like screenshots...
...so here's a shot of the current version of DictionaryReader.app, which is an application to look up words that also integrates nicely in your personal GNUstep environment by providing a service.
The application queries multiple free dictionaries via the Dict protocol and allows you to easily browse through their content using a web-browser like interface with "Hyperlinks" and a "back"-button.
Étoilé: What's going on? (May 2006)
While we do not seem to write a lot of news on the blog, étoilé is far from dead: there is a lot of activity going on on the repository and the mailing list. Let's see what happened during last month...
First: the existing stuff...
Yen-Ju did a lot of work on Azalea, our X11 window manager, doing its share of fixes and improvements (like working nicely with borderless windows or compiling on OSX), and on top of that he is working on a branch using AppKit directly for drawing...
Günther fixed some issues with DictionaryReader and applied a patch from Chris Vetter to add a service -- so you can now select some text in any application, go in the service menu, click on "look up in dictionary" and voila! a window will open containing the definition of what you selected. Very convenient isn't it ?
Saso improved the MenuServer quite a bit, also applying various patches from Chris Vetter and David Chisnall. The info panel now properly displays information about the machine/os, and a nice animation was even added in the about panel.
To give you an idea of where we are, here is a screenshot showing Azalea plus the MenuServer and an application (TalkSoup, an irc client), with the Nesedah theme:
On the frameworks front, Quentin is working on the ServicesBarKit to integrate it with the MenuServer; Yen-Ju updated LuceneKit to the 391895 java version, and Günther added unit tests in RSSKit.
In addition to improving what is on the repository, new code also appeared :-)
Yen-Ju imported the first version of XWindowServerKit -- a simple framework that applications can use to "talk" with X11 in order to set things like the available space, the current desktop for a window, etc. He also imported a patched version of the io bridge and added StepTalk support -- so you can now use io to develop apps and/or as a StepTalk language... nice !
Finally... I imported the Nesedah theme into the repository, and fixed a couple of things. Quentin worked on the Documentation, specifically a technical overview of Étoilé, a user experience overview, and a core object overview.
So as you see, things are moving fast (special mention to Yen-Ju!) and I will try to regularly post some news about what's going on in the repository...
New Azalea based on OpenBox3
A newer version of Azalea has been ported from OpenBox3. It currently supports GNUstep window style and window level. It is available under trunk/Etoile/Services/Private/Azalea/.
Étoilé presentation at the Fosdem
We gave a talk presenting Étoilé in the GNUstep devroom at FOSDEM 2006.
I uploaded the slides and the video. Check also the page about the other talks!
Azalea
Azalea is derived from WMaker released by Enrico Sersale, which is based on WindowMaker 0.92. The goal is to replace WINGs with GNUstep while keeping it working like WindowMaker. It is still experimental and may not work out in the end.
I put Azalea in SVN in the hope that people can help here and there. It is in Etoile/Service/Private/. You can run 'configure' before 'make', but 'make' alone should work. It depends on libwraster, but not WINGs. Some parts of WINGs are in the source code already. It should work like WindowMaker. No major function is broken. If you find one, it is a bug to be fixed.
Besides the TODO in README, here is the list of tasks in my mind:
Switch Panel (switchpanel.m):
This is only called from cycling.m and is eye candy. Therefore, it should be easy to work on. An NSPanel will probably do. But you might need to deal with both Xwindow and NSApplication events. It is triggered by a key press through Xwindow, then once the switch panel (NSPanel) shows up, it is handled by NSApplication. Currently it does not work because the pixmap files are missing. But Alt-Tab still works to switch focus among windows.
Contextual menu:
The main menu is an NSMenu already (right-click on the empty space). There are still four menus that need to be rewritten: dock menu (icon on the right), application icon and miniwindow menu (icon on the bottom), clip menu (icon on the top-left corner), window menu (right-click on the window title or Ctrl+ESC). Some menus will call a dialog for settings (the icon chooser is rewritten already). They can be rewritten as NSPanel later. You can check OpenRootMenu() in rootmenu.m for the main menu. It would be better to replace these menus with contextual menus from GNUstep.
Array and dictionary:
WINGs has its own WMArray and WMHashTable. It may be nice to replace them with NSArray and NSDictionary, or NSMapTable and NSHashTable. But it is used everywhere, and may not be worth to rewrite at this stage.
Notification:
Same as above.
Dialog:
Besides the dialogs called from the menus mentioned above, there are still two dialogs: the message and crash dialogs. I have already partially rewritten the message dialog. But here is the trick: the dialogs may be called outside the NSApplication run loop, especially when Azalea starts up and crashes. Therefore, it may not be safe to use NSPanel in these situations.
Property list:
Most of the property list are already rewritten. The rest is in winspector, which will be replaced by WMWindowInspector. It is work-in-progress. Once it is done, we should be able to remove property list from WINGs.
General clean-up:
There are some general functions here and there, which can be easily rewritten in ObjC. They can probably be collected into a WMUtilities class.
Localization:
Replacing all @"..." with _(@"...") should be fine.
Remove most of WINGs:
Some parts of WINGs may not be easy to replace, but the others should be cleaned up.
PreferencesKit Documentation... well, it's ready to taste!
After several days spent fighting with autogsdoc (the GNUstep documentation system), I have finally managed to output the PreferencesKit documentation in html. I'm happy to stop fiddling with it, now that I understand how it works and its shortcomings.
Basically, in my case it was very important to pass both headers and implementations, with implementations following headers in the autogsdoc parameters… Another important point was to turn on both the -Verbose and -Warn options and to never forget that autogsdoc may fail silently without reporting any errors. When autogsdoc stumbles on constants declared in several places, you can just process your source code files two by two (.h and .m); that's the last trick I used to output the current PreferencesKit Reference.
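The "two by two" trick can be sketched as a small shell loop. Take it as an illustration only: document_pairs and DRY_RUN are hypothetical names invented for this sketch, each header is assumed to sit next to an implementation file with the same base name, and the boolean options are assumed to be passed in the usual GNUstep defaults style (-Verbose YES -Warn YES).

```shell
#!/bin/sh
# Run autogsdoc on one header/implementation pair at a time,
# header first, with verbose and warning output turned on.
# document_pairs and DRY_RUN are hypothetical helpers for this sketch.
# With DRY_RUN=1 the commands are only printed, not executed.

document_pairs() {
    for header in "$@"; do
        # Foo.h is paired with Foo.m (assumed naming convention).
        impl="${header%.h}.m"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "autogsdoc -Verbose YES -Warn YES $header $impl"
        else
            autogsdoc -Verbose YES -Warn YES "$header" "$impl" || return 1
        fi
    done
}

# Example, run from the directory holding the sources:
# document_pairs *.h
```

Since autogsdoc may fail silently, the exit status check above is not enough on its own; it is still worth reading the verbose output for each pair.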
With some key points highlighted in documentation, I'm pretty sure it could be easier to tackle autogsdoc. I suppose I have to report the weird experiences I encountered with it ;-)
Back to the crux of the matter now… I have updated PreferencesKit page with various links :
both recently uploaded on Étoilé website.
Preferences Applications Early Screenshots
Three Étoilé Preferences applications are currently planned:
- Look & Behavior
- Hardware
- Network
Another one to handle user accounts will probably be needed.
Here are some early screenshots (presenting both Hardware and Look & Behavior currently under development).
You can download these applications from the Étoilé repository (they are located in trunk/Etoile/Services/User). Note that you will have to compile PreferencesKit first. If you build the whole repository, the build system will take care of that.
These applications are based on PreferencesKit and currently contain no real functionality. Their User Interface is still to be discussed and refined. Preferences Utilities are a very important point for Étoilé in my opinion, because both KDE and GNOME lack a simple yet efficient User Experience in this area.
It is planned to use GNOME system-tools-backend (written in Perl) as Preferences applications backend to handle various OS deployment. More about System Tools Backend here.
PreferencesKit Preview
I'm happy to announce the first PreferencesKit preview, which I have been working on until recently (with some help from Yen-Ju). It is based on the initial GSSystemPreferences code written by Uli Kusterer. Thanks Uli :-)
PreferencesKit is a framework which provides various features to build flexible Preferences-like window in any GNUstep or Cocoa applications.
Key features:
- Generic plugin model/schema
- Plugins registry mechanism (with search paths) that can be specialized
- NSPreferencePane and Backbone PrefsModule support
- Possibility to choose between various presentations like toolbar or table view (where panes are listed)
- Possibility to extend PreferencesKit with new custom presentation
- Facility for Preferences-only applications development (just set up a main nib file, list plugins to load in a plist and link PreferencesKit to have an application ready to use)
- Cocoa compatibility (Xcode project bundled)
Here are PreferencesKitExample screenshots running on both GNUstep and Cocoa.
You can download the framework from Étoilé repository. You can browse the code with this link.
Most PreferencesKit features are demonstrated in PreferencesKitExample; it is worth taking a look at it.
GNA! Hotspot
On a regular basis, Gna! people pick and interview one project to highlight the software and personalities that drive Libre Software community. And for the 8th hotspot.. it's us!
Link to the interview: https://gna.org/forum/forum.php?forum_id=1046 (an html version is also available)
New Site Theme
The etoile-project.org site has a new theme -- Flora! The theme should be going live soon, but in the mean time, if you have a wiki account and want to test drive it, follow the instructions below:
- Point your browser here
- Click 'Skin', select 'Flora' and then 'Save Preferences'
You may notice some bugs here and there -- we'll be ironing those out throughout the week.
Build LuceneKit straight
Some people asked how to build LuceneKit from scratch. Here are the instructions:
-
  ./configure --prefix=/usr/local (or other place like /sw/local)
  make
  sudo make install
-
  ./configure --prefix=/usr/local/ (or other place like /sw/local)
  make
  sudo make install
- OgreKit in Etoile.
  make
  sudo make install
- LuceneKit in Etoile.
  make
  sudo make install
You probably need to use -lLuceneKit -lOgreKit to link the libraries in your project.
AddressBook with LuceneKit support
A new version of AddressBook with LuceneKit support can be downloaded here. See README.LuceneKit for details. It creates a new group with the search results and supports boolean and prefix queries.
This is just a showcase for LuceneKit.
WildMenus
WildMenus is a nice gui bundle that gives GNUstep apps horizontal menus. As it could be an option for Etoile apps (horizontal menus are nice on small screens), and as I wanted to slightly modify it to be a better Camaelon companion, I just imported it (with Michael's green light) into Etoile's repository, in /Etoile/Bundles/WildMenus. That way people have a simpler way of getting WildMenus, and we can use subversion to improve it.
Track... New Étoilé List!
A new Étoilé list has been set up; it is called etoile-track. It is configured to receive notifications sent by the Bug, Patch and Task trackers each time one of their items is updated. I would have preferred to name this list etoile-support or possibly etoile-bug, but neither was accepted by the GNA system (strangely, they are reported as reserved to avoid potential conflicts).
If you are involved in Étoilé or just interested to track Étoilé development, you should subscribe here.
For now, this list isn't mirrored on Gmane.
One last note: the list is fully public… You don't need to be a subscribed member to post to it. Any mail about support issues is welcome on it, so why not officially call it the Étoilé support list!
Finally we have our Build Guide
Today, I finally added the long-promised Étoilé Build Guide to our repository, in the form of two files, README and INSTALL. INSTALL will probably need some adjustments; Nicolas has already improved it a bit.
Étoilé Build Guide is available on Étoilé web site too, take a look here.
Now it should be possible to try building this whole repository to know on which platforms Étoilé is really supported.
Subversion!
Quentin announced today that the Subversion repository is now set up on gna.org, with the previous cvs history imported.
Either an SVN client or SVK is appropriate. Quentin wrote a small tutorial/introduction to SVK, which brings distributed capabilities to SVN.
OgreKit compilation was also fixed yesterday by Quentin, and the latest Camaelon changes (done after the cvs history was saved..) were committed too.
Here and Tomorrow
Étoilé News is alive now... Well, you probably want to know more about Étoilé, so maybe you should head over there: http://www.etoile-project.org