News
Étoilé 0.4.0 Release Announcement
Posted on 16 November 2008 by
Étoilé intends to be an innovative, GNUstep-based, user environment built from the ground up on highly modular and light components. It is created with project and document orientation in mind, in order to allow users to create their own workflow by reshaping or recombining provided Services (aka Applications) and Components. Flexibility and modularity on both User Interface and code level should allow us to scale from handheld to desktop environments.
0.4 is a developer-targeted release on its way towards this goal. As a developer-focussed release, this predominantly consists of frameworks. A few demonstration applications are also included. More will be added during the 0.4.x release series, leading to a user-focussed 0.5 release next year.
Highlights
CoreObject is a framework for describing and organizing model objects. It supports automatic persistence and versioning by recording messages sent to objects. It offers a flexible versioning scheme where both individual objects and their entire object graph can be versioned separately. The built-in object model is a generalization of the property model used by the AddressBook framework. Foreign model objects can also be integrated by wrapping them with a special proxy. CoreObject uses the EtoileSerialize framework which, in many cases, allows objects and messages to be automatically serialized with no extra code being written.
LanguageKit is a compiler kit built on top of LLVM for creating dynamic language implementations using an Objective-C runtime for the object model. This is used by SmalltalkKit, implementing Étoilé's Pragmatic Smalltalk, a Smalltalk JIT compiler which generates code binary-compatible with Objective-C, allowing classes to be written in a mixture of Smalltalk and Objective-C.
EtoileFoundation is the core framework for all Étoilé projects, providing numerous convenience methods on top of the OpenStep foundation and significantly better support for reflection. This includes EtoileThread which allows objects to transparently be run in a separate thread. It also includes a number of extensions to the Objective-C object model, allowing traits and mixins. This framework is used by most of the rest of Étoilé and provides a number of core functions, such as UUID and XML handling.
EtoileUI is also available as an early preview release and should not be considered stable. EtoileUI is a high-level, object-oriented, user interface toolkit that provides a uniform tree representation for graphical objects on top of the AppKit. All User Interface concerns such as layouts, event handlers, styles, model objects etc. will be implemented as pluggable aspects. It also shares the same interfaces as other CoreObject systems. The combination of these three key features makes it possible to inspect and reshape both User Interface and model objects at runtime through direct manipulation. It comes with a library of layouts where each one encapsulates a custom and pluggable visual presentation.
Other frameworks, such as LuceneKit, providing full-text indexing and searching, and OgreKit, a powerful regular expression framework, are also included. UnitKit is a simple and flexible unit testing framework used by much of Étoilé. A new addition is MediaKit, a framework used to provide support for sound playback and recording and, in future, video. SystemConfig has received a number of improvements since our last release, including support for modifying basic X11 keyboard settings and monitoring the battery level.
Several applications are part of this release, such as Mélodie, a music jukebox using CoreObject for the music library and MediaKit for playback. Étoilé applications which use ScriptKit are scriptable from outside using Objective-C or Smalltalk. This is used by the hot corners and gesture recognition tool to run arbitrary commands in response to corner activations or mouse gestures, and by ScriptServices which allows arbitrary shell or Smalltalk scripts to be invoked on the current selection from any GNUstep or Étoilé application.
Screenshots
[Screenshot: Étoilé in the Dictionary]
[Screenshot: About and Vindaloo PDF reader]
Availability
Étoilé 0.4.0 is currently available in source code form only and may be downloaded at http://download.gna.org/etoile/etoile-0.4.0.tar.gz. It may also be obtained from Subversion with the following command:
svn co svn://svn.gna.org/svn/etoile/tags/Etoile-0.4.0
If you wish to use the latest stable release, then you can download
http://download.gna.org/etoile/etoile-0.4.0-svn.tar.gz before running
svn up to seed your source tree.
More Information
Visit our website: http://www.etoileos.com/ and blog: http://etoileos.com/news/ Or subscribe to our mailing lists: https://gna.org/mail/?group=etoile Or join our SILC channel: silc://silc.etoileos.com/Etoile
Static Compiling Smalltalk
Posted on 10 November 2008 by
One of the things I wanted to do with Smalltalk was allow static compilation. This is possible with LLVM as the back end. The compiler creates LLVM IR, a low-level intermediate representation form, which is then used to perform optimisations and can be compiled or interpreted. I was using this for the JIT - the IR was created when the code was loaded but turned into native code on demand, when each method was used.
Today I committed a few changes to LanguageKit to allow the bitcode to be written to a file instead of loaded. This was slightly more complicated than you might imagine. I use a trick with the JIT where each Smalltalk module uses the set of functions defining small integer messages as a template. This allows them to be inlined nicely without having to worry about cross-module optimisations. For static compilation, this is not desirable, so the biggest change was allowing it to reference these functions externally or internally depending on how the code generator was being used.
Once this was done, I added a new -c option to edlc. If you now do:
$ edlc -c -f test.st
You will get a file test.bc as output. This contains the LLVM bitcode for the Smalltalk file. The next step is to link together all of the .bc files, including the MsgSendSmallInt.bc file which contains definitions of small integer messages:
$ llvm-link ${GNUSTEP_LOCAL_ROOT}/Library/Frameworks/LanguageKit.framework/Versions/0/Resources/MsgSendSmallInt.bc test.bc -o smalltalk.bc
This outputs a single file, smalltalk.bc, containing all of the bitcode from the various modules. If you compiled more than one Smalltalk file then list all of the .bc files here. This is completely unoptimised, so let's run some optimisations on it:
$ opt -O3 smalltalk.bc -o smalltalk.optimised.bc
This runs the same set of optimisations that llvm-gcc runs at -O3. I haven't actually done any proper tests to see whether this set is sensible, but hopefully it is (if anyone can come up with a good list of optimisations before I get around to testing it properly, please let me know).
Now we have an optimised bitcode file, we want to turn this into object code. This is a two-step process:
$ llc smalltalk.optimised.bc
$ gcc -c smalltalk.optimised.s
The first step produces assembly code, and the second step assembles it (you can use as for the second step, but I was lazy and just threw it at the GCC compiler driver). You now have a file called smalltalk.optimised.o, an object code file that you can link into your executable just as you would an object code file compiled from Objective-C.
This sounds a bit complicated, and it is. It's actually more steps than the first C compiler I ever used (where preprocess, compile, assemble, and link were all separate steps) required. Fortunately, Nicola Pero is working on adding support for it to GNUstep Make, so soon it should be just a matter of putting SMALLTALK_FILES=... in your GNUmakefile.
The bad news is that this is too big a change to be properly reviewed in time for 0.4.0, so unless you are running trunk you will have to wait for a bit to see it. 0.4.1 should be out around the new year, so you don't have too long to wait...
Packaging Étoilé
Posted on 27 October 2008 by
Nicolas just sent me a link to this Ubuntu brainstorm idea. Someone is proposing full Étoilé packages for inclusion in Ubuntu.
Currently, Étoilé is a bit of a moving target. It's been a long time since our last release. FreeBSD has ports for this release, but so much has changed in Subversion since then that these are no longer a good introduction to Étoilé.
Hopefully this will change next month. We are planning on releasing Étoilé 0.4.0 on the 31st of October. If you want to get an idea of what it will contain then take a look at the current stable branch - this will be tagged 0.4.0 in under a week.
After this, we will be moving to a time-based point release schedule, with Étoilé 0.4.1 being released at the end of the year, 0.4.2 at the end of February, and so on.
Hopefully this will make life easier for packagers. We aim to only require released versions of dependencies for release versions of Étoilé (0.4.0 will require LLVM 2.4, for example, which is due for release on October 30).
If you are interested in providing packages for your platform, then please get in touch and let us know what we can do to help.
The road to CoreObject Part 3: Mixing Temporal Object Store and Name Service
Posted on 19 October 2008 by
CoreObject is a central piece of Étoilé, often discussed but rarely seen :-) The good news is that it's currently shaping up pretty well. But before going into the details and illustrating CoreObject persistency with an example in a future post, I'd like to give a brief overview of it.
The overall architecture has evolved substantially over the past two years. The implementation started with the writing of EtoileSerialize by David, and CollectionKit then OrganizeKit by Yen-Ju. These last two frameworks were built to provide a semi-structured object model inspired by the AddressBook framework, which can be used to write applications managing collections of objects (music, photos, contacts etc.). This summer, Eric wrote a music manager named Mélodie based on this reusable object model. As such, Mélodie is the first Étoilé application that truly uses CoreObject.
Until recently, CoreObject mostly existed as a fork of OrganizeKit in the Étoilé repository. The persistency model was to store the whole core object graph in a single property list, or in multiple property lists but without the possibility of referencing core objects across these property lists. This was a very important limitation that prevented concurrency control and versioning of objects through EtoileSerialize. Moreover, each time a process wanted to access a core object, the entire core object graph had to be deserialized. Over the past two months, I have revisited CoreObject so that it fully leverages EtoileSerialize for persistency, supports loading core objects into memory on demand, interacts with a metadata server to track stored objects, and provides better control over the history of core objects.
The updated version of the semi-structured object model also brings a very transparent approach to persistency: you don't need to call EtoileSerialize explicitly or even use a proxy to wrap your objects.
Now let's look at the various building blocks of the framework. The basic idea behind CoreObject is to provide a reusable model for organizing objects and handling their persistency. The low-level persistency logic is implemented by EtoileSerialize; CoreObject extends it with:
- a protocol to organize core objects into groups (COObject and COGroup protocols)
- a main backend that provides a semi-structured object model (COObject and COGroup classes)
- additional backends to attach external object graphs (for example mounting a filesystem or exporting an application UI into the core object graph)
- COProxy for integrating persistent model objects not derived from the COObject class
- a metadata server to track stored objects and index both the metadata and the content of core objects (COMetadataServer class)
- a per process object factory and cache that is used to handle the faulting and uniquing of core objects (COObjectServer class)
So CoreObject mostly adds a name service on top of EtoileSerialize and persists the name service structure and the objects bound to it in the same uniform representation. This representation is the core object graph, where each object and each group is stored as a persistent root by EtoileSerialize. Each persistent root is identified by a UUID/URL pair. Persistent roots are currently stored as object bundles on the filesystem. An object bundle is a directory that contains the history of the object in terms of snapshots and deltas. Deltas are serialized invocations that represent logical changes. EtoileSerialize defines a protocol for the storage model, so new ways to store the objects could be defined: for example, changing the layout of object bundles, or storing all the objects in a single flat file, over the network, or in other kinds of data stores such as the ZFS DMU (the low-level ZFS transactional store on which the filesystem is built).
The UUID/URL pairs are stored in a metadata server, which defines all the objects that belong to a core object graph. In future, the core object graph should thus be able to span multiple computers or data stores backed by a single metadata server. The metadata server is currently based on a PostgreSQL database.
For this first approach, multiple users cannot share a single core object graph and the access rights are simply defined by the permissions set on the object bundles at the filesystem level.
Out of the box, EtoileSerialize provides the basic infrastructure for per-object history, which makes per-object undo/redo possible. However, objects such as photos, music and contacts are usually organized into libraries, and when you use a photo manager or a music manager you expect undo/redo to operate on the last modification to the currently opened library. If that wasn't the case, undo/redo would only work when one or several objects are selected as targets for undo. The user would also have to remember the last modified object after changing the selection.
To solve this problem, CoreObject introduces the notion of object contexts. An object context is a pool where you insert related core objects. The object context records a history that interleaves the histories of all the objects that belong to it. By this means, it becomes possible to navigate and restore the history both per object and per context.
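To make the idea of an interleaved history concrete, here is a minimal sketch in Python rather than Objective-C. It is not the CoreObject API: `Track`, `ObjectContext`, `send` and `undo` are hypothetical names invented for the illustration, and each delta is reduced to a simple attribute change instead of a serialized invocation.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Toy model object (hypothetical); state changes only through the context."""
    title: str = ""
    rating: int = 0


class ObjectContext:
    """Pool of related objects sharing one interleaved history (illustrative only)."""

    def __init__(self):
        # Each delta is (object, attribute, old value, new value), in arrival order.
        self.history = []

    def send(self, obj, attribute, value):
        """Apply a change to obj and record it as a delta."""
        self.history.append((obj, attribute, getattr(obj, attribute), value))
        setattr(obj, attribute, value)

    def undo(self, obj=None):
        """Undo the latest delta in the context, or the latest delta for obj only."""
        for index in range(len(self.history) - 1, -1, -1):
            target, attribute, old, _new = self.history[index]
            if obj is None or target is obj:
                setattr(target, attribute, old)
                del self.history[index]
                return target
        return None


context = ObjectContext()
song, other = Track(), Track()
context.send(song, "title", "Blue")
context.send(other, "rating", 5)
context.send(song, "rating", 3)

context.undo()       # per context: reverts the most recent change (song.rating)
context.undo(song)   # per object: reverts song.title, skipping the change to other
```

The same list serves both kinds of undo: walking it backwards without a filter gives library-wide undo, while filtering on one object gives the per-object history.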
Most of the elements of the architecture outlined at the beginning have already been implemented, if we put aside the indexing service. Various key pieces remain to be written though: concurrency control, update feed to push object changes to client applications, in-store deletion model and history cleaning.
More broadly, integration with the branching support of EtoileSerialize, exporting core objects to other formats, and collaborative editing will also have to be fully worked out. Finally, the versioning of structured documents will require additional support to be truly convenient and integrate perfectly with EtoileUI.
So, you want to invent a language?
Posted on 12 October 2008 by
I posted a little while ago about the Smalltalk compiler in Étoilé svn. Since then, Truls Becken has rewritten my parser (which was quite bad, and is now quite good) and tidied up the code a little. I've also refactored it into two frameworks, LanguageKit and SmalltalkKit. LanguageKit contains all of the abstract syntax tree and code generation stuff, while SmalltalkKit contains all of the Smalltalk-specific parts.
The total line count for the Smalltalk-specific part is a shade over 500 lines of code. This means that writing a new front-end for something Smalltalk-like is very easy (I plan on adding some things to LanguageKit to make slightly less Smalltalk-like languages similarly easy).
If you want to play, then the first thing you need is a subclass of LKCompiler, which implements two methods: +fileExtension and +parser. The first returns the extension used by scripts in your language (@"st" for Smalltalk), while the second returns the Class implementing your parser.
Then you need to implement the parser. This just needs to implement one method, parseString: which takes a string as an argument and returns an AST. For Smalltalk, I have a hand-written tokeniser and use LEMON (from the SQLite project) for the parser. The tokeniser simply turns the string into a stream of tokens and then passes them one at a time to the parser (it might be simpler if I wrote it using something like Lex, but since it's only 200 lines of code now I can't really be bothered). The parser is generated from a BNF-like description of the grammar, with instructions in Objective-C on how to generate the AST from this.
Now that Truls has rewritten it, the Smalltalk grammar is a fairly good example of a LEMON grammar. If you want to write a new language, a good first step is tweaking Smalltalk a bit. If you find that you want a semantic construct that isn't supported by the AST, drop in to SILC and talk to me - adding static flow control (if statements and while loops) is high on my list of priorities, as is support for primitive (non-object) types that aren't auto-boxed.
Scripting and Gestures
Posted on 13 August 2008 by
Two of the things that have been on my TODO list for about two (maybe three) years are cross-app scripting and mouse gestures. StepTalk had some preliminary support for cross-app scripting, but I don't think it made it into a release. I never really liked its approach, since it seemed horribly over-engineered (for reference, the Smalltalk interpreter in StepTalk is about twice as much code as the Pragmatic Smalltalk compiler and support library).
Yesterday, I committed the first version of ScriptKit. This is a very lightweight cross-app scripting framework built on top of Distributed Objects. It simply exports a dictionary containing a set of named objects for scripting. By default, NSApp (the application object) is exported. If you don't want to give unrestricted access to remote scripts you can export your own object with the 'Application' key and filter out some messages. You can also export other objects with their own names. In future we will define a set of standard-but-optional ones that Étoilé services should export (e.g. the current document, some CoreObject related things and so on).
For the paranoid, I plan on adding a 'Paranoid Mode' which uses a pre-shared key to prevent unauthorised scripts from controlling the app.
The nice side-effect of using DO as the core is that it is also trivial to send scripting events from Objective-C. Anyone who has tried doing this with Cocoa has probably given up and just generated a string containing AppleScript code and passed this to the scripting engine. Since we are using a Smalltalk which is toll-free bridged with Objective-C, it makes sense to just expose scripting objects as Objective-C / Smalltalk objects (well, object proxies) and use them directly, without a confusing abstraction layer.
The other thing I added yesterday was a gesture recognition engine. Today I remembered that 'x is a-cross' and fixed it so that it actually works. This is embedded in Corner.app, which currently handles hot corners for Étoilé (allowing scripts to be run when the mouse enters and leaves a screen corner). If you hold down control and shift, it enters a gesture quasi-mode. It then tracks mouse movements. Each movement is treated as an approximation of a movement in one of 8 directions, numbered 1 to 8 clockwise from the top (i.e. 1 is up, 5 is down, and so on). Complete gestures are therefore turned into strings ('gesture words'), so an 'h' shape would be '5135' (down-up-right-down). Distance moved in each direction is ignored because when doing mouse gestures I am rubbish at getting distances right, while with this system I can consistently do the gesture I was trying to.
Corner maintains a dictionary mapping gesture words to objects. These objects can be written in Smalltalk or Objective-C. They have to implement a -gesturePerformed method, and this will be called whenever the gesture they are associated with is drawn. Now that cross-application scripting is working, this can be used to control any application, for example locking the screen and setting an away message in the Jabber client.
Currently, there is one default gesture - drawing an h hides the active application (if the active application supports scripting, otherwise it does nothing). Others will probably be added in time for 0.4.
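The gesture-word encoding described above can be sketched in a few lines. This is an illustration in Python, not the code in Corner.app: the function names, the 45-degree sectors centred on each direction, and the `handlers` dictionary are assumptions made for the example. Screen coordinates are assumed to have y growing downwards.

```python
import math


def direction(dx, dy):
    """Quantise a movement into 1..8, numbered clockwise from up (1 = up, 5 = down)."""
    # atan2(dx, -dy) measures the angle clockwise from 'up' when y grows downwards.
    angle = math.degrees(math.atan2(dx, -dy)) % 360
    return int((angle + 22.5) // 45) % 8 + 1


def gesture_word(points):
    """Turn a list of (x, y) samples into a gesture word, ignoring distances."""
    word = ""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (x0, y0) == (x1, y1):
            continue  # no movement between these two samples
        digit = str(direction(x1 - x0, y1 - y0))
        if not word.endswith(digit):
            word += digit  # consecutive moves in one direction collapse to one digit
    return word


# Hypothetical table mapping gesture words to actions, standing in for the
# objects that implement -gesturePerformed.
handlers = {"5135": lambda: "hide active application"}

# A wobbly 'h': down, up, right, down, with uneven distances.
h_shape = [(0, 0), (0, 25), (0, 50), (1, 30), (12, 30), (12, 60)]
word = gesture_word(h_shape)
action = handlers.get(word)
```

Because only the sequence of directions is kept, the long first stroke and the short later strokes produce the same word as a neatly drawn 'h'.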
OgreKit Tutorial #3
Posted on 6 August 2008 by
OgreKit also comes with a find panel. It can work on NSTextView, NSTableView and NSOutlineView. The latter two are not ported yet, but the architecture can be extended to other graphical interfaces. An example of using the OgreKit find panel is under '/Etoile/Developer/Examples/OgreKitExample'. First, we need to connect the find panel to the text view:
- (void) awakeFromNib
{
    textView = [scrollView documentView];
    textFinder = [OgreTextFinder sharedTextFinder];
    [textView setRichText: NO]; /* Use Plain text adaptor */
    [textFinder setTargetToFindIn: textView];
}
The OgreKit find panel can search both plain text and attributed text. Here, the text view is set to use plain text, and the OgreKit find panel will automatically use the right adaptor. To connect the find panel and the text view, use -setTargetToFindIn: from OgreTextFinder. That's all.
To bring up the find panel, add this action to the menu:
- (void) findPanelAction: (id)sender
{
    [textFinder showFindPanel: sender];
}
Now you have a find panel which supports regular expressions by default. Here is a screenshot:

OgreKit Tutorial #2
Posted on 1 August 2008 by
Here are some examples of using OgreKit:
In NSMutableString, -chomp removes all newlines ('\n') anywhere in a mutable string.
NSObject subclass: SmalltalkTool
[
run
[
| target |
target := NSMutableString stringWithString: 'alphabetagammadelta\n\n\n'.
target length log.
target chomp.
target length log.
]
]
In OGRegularExpression, -replaceAllMatchesInString:withString: replaces all matched strings.
NSObject subclass: SmalltalkTool
[
run
[
| regex target result |
regex := OGRegularExpression regularExpressionWithString:'a[^a]*a'.
target := 'alphabetagammadelta'.
result := regex replaceAllMatchesInString:target withString: '###'.
target log.
result log.
]
]
You can even swap the matched substring like this:
NSObject subclass: SmalltalkTool
[
run
[
| regex target result |
regex := OGRegularExpression regularExpressionWithString:'(a)([^a]*a)'.
target := 'alphabetagammadelta'.
result := regex replaceAllMatchesInString:target withString: '(\2)(\1)'.
target log.
result log.
]
]
OgreKit also supports various regular expression syntaxes:
OgrePOSIXBasicSyntax POSIX Basic RE
OgrePOSIXExtendedSyntax POSIX Extended RE
OgreEmacsSyntax Emacs
OgreGrepSyntax grep
OgreGNURegexSyntax GNU regex
OgreJavaSyntax Java (Sun java.util.regex)
OgrePerlSyntax Perl
OgreRubySyntax Ruby (default)
OgreSimpleMatchingSyntax Simple Matching
Now, let's go back to Objective-C. Instead of using a regular expression to replace a string, you can have a delegate method do it. Use -replaceAllMatchesInString:delegate:replaceSelector:contextInfo: to specify the delegate and method, then write your own replace method. Here, the replace method is -count:contextInfo:, which returns the length of each matched substring.
- (void) testReplaceDelegate
{
    OGRegularExpression *regex = [OGRegularExpression regularExpressionWithString: @"a[^a]*a"];
    NSString *target = @"alphabetagammadelta";
    NSString *result = [regex replaceAllMatchesInString: target
                                               delegate: self
                                        replaceSelector: @selector(count:contextInfo:)
                                            contextInfo: nil];
    NSLog(@"Target %@", target);
    NSLog(@"Result %@", result);
}
/* Called once per match; the returned string replaces the matched text. */
- (NSString *) count: (OGRegularExpressionMatch *) match
         contextInfo: (id) contextInfo
{
    /* Cast so that the argument matches the %d format specifier. */
    return [NSString stringWithFormat: @"(%d)", (int)[[match matchedString] length]];
}
The result will be:
Target alphabetagammadelta
Result (5)bet(3)mm(6)
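As a cross-check of the results in this tutorial, the same patterns can be run through Python's standard `re` module. This is only a comparison sketch: it does not use OgreKit, and it assumes that these particular patterns behave the same way in Python's syntax as in OgreKit's default Ruby syntax.

```python
import re

pattern = re.compile(r"a[^a]*a")
target = "alphabetagammadelta"

# Equivalent of -allMatchesInString: followed by -matchedString on each match.
matches = pattern.findall(target)

# Equivalent of -replaceAllMatchesInString:withString: with a fixed replacement.
replaced = pattern.sub("###", target)

# Swapping the two captured groups, as in the '(\2)(\1)' example.
swapped = re.sub(r"(a)([^a]*a)", r"(\2)(\1)", target)

# A callable plays the role of the delegate's replace method.
counted = pattern.sub(lambda m: "(%d)" % len(m.group(0)), target)

print(matches)
print(replaced)
print(swapped)
print(counted)
```

The last line printed is `(5)bet(3)mm(6)`, matching the delegate example: the three matches 'alpha', 'aga' and 'adelta' are replaced by their lengths.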
OgreKit Tutorial #1
Posted on 31 July 2008 by
David asked me to write an example of using the OgreKit framework. I figured it might be interesting to do that in combination with Smalltalk. This post shows you how to set up everything on Ubuntu 8.04. Of course, you need to have GNUstep installed first.
Dependencies for LLVM are 'lemon', 'flex' and 'bison'. They can be installed from Ubuntu packages. It is necessary to use LLVM trunk for Smalltalk. According to the LLVM User Guide, you can download LLVM trunk with
svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
After configuration with './configure', compile it with 'make ENABLE_OPTIMIZED=1' for release build (10x faster than debug build) and install it with 'make install'. It is not necessary to install the frontend for this purpose.
Once LLVM is ready, compile and install 'EtoileFoundation' and 'Smalltalk'. To avoid debug information from Smalltalk, use 'make debug=no' for compilation. Use 'st -f test.st' in the Smalltalk directory to test Smalltalk. There are also a few examples in the 'examples' directory.
For OgreKit, you need to have oniguruma from Ubuntu packages. Then compile and install OgreKit as usual.
Finally, this is a small script to check everything.
NSObject subclass: SmalltalkTool
[
run
[
| regex matches |
regex := OGRegularExpression regularExpressionWithString:'a[^a]*a'.
matches := regex allMatchesInString:'alphabetagammadelta'.
matches foreach:[ :x | x matchedString log.].
]
]
Save it in a text file called 'ogre.st', for example, and run it with 'st -f ogre.st -l OgreKit'. Parameter 'f' refers to the file name and 'l' refers to the OgreKit framework. The regular expression pattern is 'a[^a]*a', which matches a string whose first and last letters are 'a' and which has no 'a' in between. Using this pattern to match the string 'alphabetagammadelta' gives 3 substrings: 'alpha', 'aga', 'adelta'. Results are stored in an array of OGRegularExpressionMatch. Use '-matchedString' to retrieve the matched substring.
Fun With Threads
Posted on 29 July 2008 by
This weekend I started working on a replacement for MultimediaKit. This has been on my TODO list for a while, since the current one is GPL-tainted. I started working with libavcodec and libavformat directly, since these are LGPLd.
In order to get consistent latency, ideally I wanted the decoder running in its own thread. Since we have a threading library in svn, I thought I'd try using it (okay, I wrote it, but I've not actually had the need of a threading framework since then). The first thing I needed to do was create the player object and put it in its own thread:
MusicPlayer *player = [[MusicPlayer alloc] initWithDefaultDevice];
// Move the player into a new thread.
player = [player inNewThread];
Actually, that's all I needed to do. After putting some files in the player's queue, I could periodically query its state, like this:
// Periodically wake up and see where we are.
while (1)
{
    id pool = [NSAutoreleasePool new];
    sleep(2);
    NSLog(@"Playing %@ at %lld/%lld", [player currentFile],
          [player currentPosition] / 1000, [player duration] / 1000);
    [pool release];
}
Note the complete lack of any locking or thread operations here. The player object, after the call to -inNewThread, is really a proxy which maintains a lockless ring buffer storing messages between the player and my main thread. When I send it a currentFile message, it adds it to the queue and returns a proxy. If I try to use the proxy (here, NSLog will do so by sending it a -description message) then my calling thread will block. The other two messages return primitives, so they block immediately.
When I am not sending the player messages, the run loop managed by EtoileThread sends it a -shouldIdle message whenever the message queue is empty and, if the object answers that it should idle, sends it an -idle message. The -idle method reads the next frame from the audio file, decodes it, and passes it to the output device. All of these are synchronous, blocking calls (although the output device does some buffering) and so it's very simple code. Neither thread needs to spend much time waiting on a mutex - the structure used to send messages between threads is a hybrid ring buffer, which runs in lockless mode unless it has spent a little bit of time spinning (at which point it uses a mutex).
This means that, while playing, the cost of checking for new messages is very cheap (one comparison operation, in fact). While paused (and not receiving messages), the object will automatically switch to locked mode and wait for a condition variable to wake it up, so you aren't wasting CPU.
The best thing is that all of this is hidden away in EtoileThread (in EtoileFoundation), so any of your objects can use the same mechanism with almost no code. Just adopt the Idle protocol if you want to do something when your object isn't receiving messages from another thread, and send it an inNewThread message just after creation.
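The shape of this mechanism can be approximated in Python. The sketch below is not EtoileThread: it uses a locked `queue.Queue` where EtoileThread uses a hybrid lockless ring buffer, every call returns a `Future` instead of distinguishing object and primitive returns, and `ThreadedProxy`, `Player`, `should_idle` and `idle` are hypothetical names chosen to mirror the description.

```python
import queue
import threading
from concurrent.futures import Future


class ThreadedProxy:
    """Forwards method calls to a target object that lives on its own thread."""

    def __init__(self, target):
        self._target = target
        self._messages = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def __getattr__(self, name):
        # Only reached for names the proxy itself lacks, i.e. the target's methods.
        def send(*args):
            future = Future()
            self._messages.put((name, args, future))
            return future  # the caller blocks only when it asks for .result()
        return send

    def _run(self):
        while True:
            try:
                message = self._messages.get_nowait()
            except queue.Empty:
                # Queue is empty: let the object do background work if it wants to.
                if self._target.should_idle():
                    self._target.idle()
                    continue
                message = self._messages.get()  # otherwise sleep until a message arrives
            name, args, future = message
            try:
                future.set_result(getattr(self._target, name)(*args))
            except Exception as error:
                future.set_exception(error)


class Player:
    """Toy stand-in for the music player: 'decodes' one frame per idle call."""

    def __init__(self, frames):
        self.frames = list(frames)
        self.played = []

    def should_idle(self):
        return bool(self.frames)

    def idle(self):
        self.played.append(self.frames.pop(0))

    def position(self):
        return len(self.played)


player = ThreadedProxy(Player(range(1000)))
position = player.position().result()  # blocks until the worker thread answers
```

The calling code contains no locks: all access to the `Player` happens on the worker thread, and the only synchronisation point is the message queue.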
Pragmatic Smalltalk 0.5
Posted on 12 July 2008 by
I've been calling Étoilé 'a pragmatic Smalltalk' for a long time (although Nicolas, I believe, was the one to coin the expression). Smalltalk is a really great language, but it has two disadvantages:
1) It tends to be bytecode-interpreted, which is not very fast.
2) Implementations tend to be all-or-nothing.
The first is less of a problem now that CPUs are so fast they spend 90% of their time idle in a typical desktop workload. The second is much more of a problem. Smalltalk-80 includes a complete GUI, and common implementations such as Squeak adopt this model. This means that Squeak applications and 'native' applications are entirely separate. If there is one thing that Squeak doesn't have that you need, then using Squeak is not easy.
This week, I committed the first version of the Smalltalk compiler I have been working on to Étoilé svn. Unlike other Smalltalk implementations, this is designed from the ground up for interoperability. Smalltalk objects are compiled (to native code) as Objective-C objects. This means that they can subclass Objective-C objects, and can even implement categories on Objective-C objects. There is no C function interface - if you want to call C functions then call them from Objective-C.
The compiler is in three components. SmalltalkKit contains everything required to take a string containing Smalltalk code and compile it to a set of Objective-C objects.
The Support library contains things needed by Smalltalk but not Objective-C. The most important class here is the BlockClosure class, which implements a Smalltalk block as an Objective-C object with a function pointer as an instance variable and pointers to bound variables and space for promoting other variables (eliminating the need for garbage collected stack frames). There are also a few categories, such as map: and related methods on NSArray which take blocks as arguments. Note that these are implemented in Objective-C even though they are used by Smalltalk - they could, in most cases, easily be implemented in Smalltalk instead.
The final part is a tool which compiles a Smalltalk file, instantiates a specified class, and sends the instance a run message. This is very small and shows how the compiler can be used, and will serve as the framework for writing complete applications in Smalltalk.
The parsing is done in Objective-C, using the Lemon parser generator from SQLite. The abstract syntax tree (AST) is constructed out of Objective-C objects, which means it's exposed to Smalltalk. As a result, Smalltalk programs can generate code easily by constructing the AST and invoking its compileWith: method, or by instantiating a parser and giving it a string.
Currently, the compiler only works in-process. It uses runtime introspection when constructing the AST. Code generation, however, is done via LLVM, and involves generating an LLVM intermediate representation (IR) version of the AST, running LLVM optimisation passes on this, and then compiling it to native code. With minor modifications, it is possible to emit the LLVM IR as bitcode and then run extra optimisations on it or compile and link it as a native library. Whether this is interesting depends on how long it takes to run the compiler. For the simple test I've done so far, program startup has taken much longer than parsing and code generation (and I'm using a debug build of LLVM, which is about 10% the speed of a release build). For larger programs, it might be worth statically-compiling. If parsing is a major overhead, it might be worth caching the bitcode for each Smalltalk input class.
So far, it is a fairly naive implementation. Many more optimisations are possible than are currently done, and some of them are very easy. My aim, however, is to move as many as possible into LLVM passes, so that they can be used when compiling other dynamic languages. The code representing the Objective-C object model is taken from code I wrote for clang, the new C language family front end for LLVM, and so is also used for compiling Objective-C with LLVM.
Building a Better Garbage Collector
Posted on 6 July 2008 by
One day a student came to Moon and said, "I understand how to make a better garbage collector. We must keep a reference count of the pointers to each of the cans." Moon patiently told the student the following story:
"One day a student came to Moon and said, "I understand how to make a better garbage collector...
I am not a fan of many of the things Apple has done to Objective-C recently. The one thing I liked the idea of in principle was garbage collection. Unfortunately, they seem to have done this very badly, so I set about seeing if there was a better way. First, some background:
There are, generally speaking, two kinds of garbage collection: reference counting and tracing. With reference counting, every assignment increments the reference count of the new value and decrements the reference count of the old value. When an object's reference count hits zero, it is freed. This is what is traditionally done with OpenStep, via the -retain and -release methods.
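The reference counting scheme described above can be sketched in a few lines of plain C. This is only an illustration: the Object struct, object_retain(), object_release() and object_assign() are hypothetical names made up for this example and are not the GNUstep or OpenStep implementation.

```c
#include <stdlib.h>

/* Hypothetical object header, for illustration only. */
typedef struct Object {
    unsigned refcount;                 /* number of owners */
    void (*dealloc)(struct Object *);  /* called once when the count hits zero */
} Object;

static int live_objects = 0;           /* bookkeeping for the example only */

static void object_free(Object *o) {
    live_objects--;
    free(o);
}

Object *object_new(void) {
    Object *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->refcount = 1;                   /* the creator owns the first reference */
    o->dealloc = object_free;
    live_objects++;
    return o;
}

Object *object_retain(Object *o) {
    if (o != NULL) o->refcount++;
    return o;
}

void object_release(Object *o) {
    if (o != NULL && --o->refcount == 0)
        o->dealloc(o);
}

/* An assignment increments the count of the new value and decrements the
 * count of the old one. Retaining first keeps self-assignment safe. */
void object_assign(Object **slot, Object *value) {
    object_retain(value);
    object_release(*slot);
    *slot = value;
}
```

Storing an object in a slot with object_assign() and then overwriting the slot with NULL releases it again, and the object is freed as soon as nobody else holds a reference.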
The other alternative is tracing. This requires every object to be known to the garbage collector. Globals are identified as 'roots' and periodically the collector attempts to navigate from the roots to every reachable object. Those that can not be reached are freed.
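As a rough sketch of tracing, here is a toy mark-and-sweep pass over a fixed-size heap. Everything in it (the Node type, the heap array, collect()) is an assumption made for the example. A real collector also has to find roots on the stack and in registers, which this ignores.

```c
#include <stdbool.h>
#include <stddef.h>

#define HEAP_SIZE 8
#define MAX_REFS  2

/* Hypothetical heap cell: every object is known to the collector. */
typedef struct Node {
    bool in_use;
    bool marked;                  /* "has been reached" flag */
    struct Node *refs[MAX_REFS];  /* outgoing pointers, NULL if unused */
} Node;

static Node heap[HEAP_SIZE];

/* Navigate from one object to everything reachable from it. */
static void mark(Node *n) {
    if (n == NULL || n->marked) return;
    n->marked = true;
    for (int i = 0; i < MAX_REFS; i++)
        mark(n->refs[i]);
}

/* Mark from the roots, then free whatever was not reached.
 * Returns the number of objects freed. */
size_t collect(Node *const *roots, size_t root_count) {
    size_t freed = 0;
    for (size_t i = 0; i < HEAP_SIZE; i++)
        heap[i].marked = false;
    for (size_t i = 0; i < root_count; i++)
        mark(roots[i]);
    for (size_t i = 0; i < HEAP_SIZE; i++) {
        if (heap[i].in_use && !heap[i].marked) {
            heap[i] = (Node){0};  /* "free" the cell */
            freed++;
        }
    }
    return freed;
}
```

Note that two unreachable objects pointing at each other are freed without any special handling, which is exactly the case pure reference counting cannot deal with.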
In 2004, some very bright guys at IBM's T.J. Watson Research Center (a nice place to visit, by the way - it's on top of a hill, with huge windows and overlooks some gorgeous scenery) came up with a Unified Theory of Garbage collection in which they propose that these are really equivalent. A tracing garbage collector needs to set a flag indicating that an object has been reached, and this can be seen as a special case of a reference count (one capped at one). Reference counting garbage collectors need some extra mechanism for detecting loops, and this is equivalent to the tracing operation.
When Apple added tracing GC to Cocoa, they threw away the reference counting mechanism. This was a shame, since all that is needed to turn reference counting into full GC is the addition of a cycle detecting algorithm. If a has a reference to b and b has a reference to a then, with pure reference counting, both a and b will leak. The rôle of the cycle detector is to run periodically and make sure this does not happen.
Fortunately, two of the same guys who came up with the unified theory had, a few years earlier, published another paper in which they describe a mechanism for adding an efficient (i.e. fast) cycle detector to a reference counting system.
I have implemented this for GNUstep, and it shows promise. I've made a few modifications to NSObject - retain counts are now stored in a 16-bit value (if you have more than 65535 references to a single object, you probably have a bug) and the other 16 bits are now used to store flags, including a colour. The colour is set by the cycle detection algorithm, which is invoked periodically when a buffer of objects which have been released but not freed becomes full.
One minor problem with this is that objects can now exist safely in loops and -retain can be called on an object which is currently executing -dealloc. This can lead to an infinite loop, and some careful juggling is required to ensure that no objects deallocate themselves while freeing loops.
The code works, although it has a few (easily fixable) limitations. I created a small graph of five Pair objects. Each one has a reference to itself and to the next one in the ring. The code correctly determines that this contains a loop, and destroys all five objects when the autorelease pool is destroyed.
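The ring test above can be reproduced with a simplified, single-threaded sketch of trial deletion, in the spirit of the paper mentioned earlier. This is not the GNUstep code: the Pair layout, the header packing (16-bit count in the low half, colour in the bits above it) and all function names are assumptions for illustration, and the buffering of candidate roots is left out entirely.

```c
#include <stdint.h>
#include <stddef.h>

enum Colour { BLACK = 0, GRAY = 1, WHITE = 2 };

typedef struct Pair {
    uint32_t header;      /* bits 0-15: retain count, bits 16-17: colour */
    struct Pair *first;
    struct Pair *second;
    int freed;            /* set instead of calling free(), so tests can look */
} Pair;

#define COUNT_MASK   0xFFFFu
#define COLOUR_SHIFT 16

static unsigned count_of(const Pair *p) { return p->header & COUNT_MASK; }
static void set_count(Pair *p, unsigned c) {
    p->header = (p->header & ~COUNT_MASK) | (c & COUNT_MASK);
}
static enum Colour colour_of(const Pair *p) {
    return (enum Colour)((p->header >> COLOUR_SHIFT) & 3u);
}
static void set_colour(Pair *p, enum Colour c) {
    p->header = (p->header & ~(3u << COLOUR_SHIFT)) | ((uint32_t)c << COLOUR_SHIFT);
}

/* Phase 1: subtract every reference that originates inside the subgraph. */
static void mark_gray(Pair *p) {
    if (p == NULL || colour_of(p) == GRAY) return;
    set_colour(p, GRAY);
    Pair *kids[2] = { p->first, p->second };
    for (int i = 0; i < 2; i++) {
        if (kids[i] == NULL) continue;
        set_count(kids[i], count_of(kids[i]) - 1);
        mark_gray(kids[i]);
    }
}

/* Undo phase 1 for anything reachable from an externally referenced object. */
static void scan_black(Pair *p) {
    set_colour(p, BLACK);
    Pair *kids[2] = { p->first, p->second };
    for (int i = 0; i < 2; i++) {
        if (kids[i] == NULL) continue;
        set_count(kids[i], count_of(kids[i]) + 1);
        if (colour_of(kids[i]) != BLACK) scan_black(kids[i]);
    }
}

/* Phase 2: a count of zero means only internal references were left. */
static void scan(Pair *p) {
    if (p == NULL || colour_of(p) != GRAY) return;
    if (count_of(p) > 0) { scan_black(p); return; }
    set_colour(p, WHITE);
    scan(p->first);
    scan(p->second);
}

/* Phase 3: everything still white is cyclic garbage. */
static int collect_white(Pair *p) {
    if (p == NULL || colour_of(p) != WHITE) return 0;
    set_colour(p, BLACK);
    int n = collect_white(p->first);
    n += collect_white(p->second);
    p->freed = 1;
    return n + 1;
}

/* Returns the number of objects found to be garbage. */
int collect_cycles(Pair *candidate) {
    mark_gray(candidate);
    scan(candidate);
    return collect_white(candidate);
}

/* Each Pair refers to itself and to the next Pair in the ring, so every
 * count starts at 2 once all external owners have let go. */
void make_ring(Pair *ring, int n) {
    for (int i = 0; i < n; i++) {
        ring[i].header = 0;
        ring[i].first = &ring[i];
        ring[i].second = &ring[(i + 1) % n];
        ring[i].freed = 0;
        set_count(&ring[i], 2);
    }
}
```

Calling make_ring() on five Pairs and then collect_cycles() on any one of them reports all five as garbage. If one Pair is given an extra external reference first, nothing is collected and every count is restored to its original value.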
The only major limitation is that I've only written code for atomically accessing the colours on x86. This can be fixed trivially by simply writing these functions. A smaller issue is that a number of GNUstep classes indulge in premature optimisation by calling NSDeallocateObject() in their -dealloc method, rather than calling [super dealloc].
I currently use a modified version of the algorithm in the paper which uses an NSHashTable to store pointers to objects that might contain loops. Since there's space in the flags field for a 'buffered' flag, I can easily extend it to use this, and replace the hash table with a static array. This is better for two reasons: it should be faster, and it means that we can use thread-local storage without having to worry about explicit destructors (which are currently called by listening for a thread terminating notification, which is slightly fragile and will potentially fail to catch things released when a thread is dying).
Since some code already handles loops via unretained references, the current code has problems. To avoid this, I introduced an extra colour (transparent), and any objects with this colour are assumed to always be acyclic. This allows automatic loop detection to be turned on on a per-object, or per-class basis. In future, this can be used in reverse: to turn off loop detection for intrinsically acyclic data structures (e.g. trees).
My main motivation for this is for the Smalltalk compiler I am currently working on. Since Smalltalk expects garbage collection, and Objective-C does not provide it, this presents a small problem. A tracing collector can be used, however this is very tricky when integrating with a C-like language, since it means that everything which may potentially contain pointers has to be checked, including integers and untyped buffers.
On the way to EtoileUI, Part 1: Back to the Hackathon
Posted on 6 June 2008 by
For the Swansea hackathon, I gave a quick overview of EtoileUI. When I came back, I intended to upload it and write a post on the subject, but the time went by faster than I expected :-)
During April and May, the framework steadily improved to the point of now being usable on both GNUstep and Cocoa, but it is still quite experimental and many things remain to be worked out. In the early days of April, the stability of EtoileUI on GNUstep wasn't really satisfactory either, since I initially wrote a large part of it on Mac OS X before backporting it this winter.
Before trying to explain what EtoileUI is in upcoming posts, here is the link to the presentation (PDF) I gave.
As you probably know from the previous post, the integration of CoreObject and EtoileUI is moving forward, especially now that Eric has started to use both in a real application named EtoileTunes. This recent project has also motivated me to be quicker at squashing annoying bugs in these frameworks. On my side, a generic object manager based on the example of the last slide is coming along nicely. For now, it is mostly a CoreObject-based file manager which supports several views (icon, list, column, etc.), although it can be used to browse and mutate any object graphs that comply to a simple object collection protocol (declared in EtoileFoundation).
EtoileTunes
Posted on 26 May 2008 by
For the past few weeks I've been working on EtoileTunes, a music player for Etoile. My goals were to try to improve my Objective-C/GNUstep knowledge, to try out programming with EtoileUI and CoreObject, to work on a replacement for MultimediaKit, and to hopefully end up with a good enough music player to use regularly.
It's still in fairly early stages, but I have something that uses TagLib to read music file metadata, then constructs a CoreObject with that metadata. It then puts these into a COGroup subclass which represents a playlist, and displays this group with EtoileUI. The eventual goal, as I understand it, is for EtoileTunes not to be a separate application, but just a pre-defined layout which the normal object-manager UI in Etoile can take on. Jesse, Quentin, and I discussed some ideas for what would make a good music manager UI, so hopefully I can build some working mockups with EtoileTunes.
On the MultimediaKit replacement side, I have an incomplete, but working, Objective-C wrapper around Xine-lib which provides an API similar to MultimediaKit's (only for music, not video playback so far.) I'm also working on a GStreamer backend. A future task might be to write a MultimediaKit framework from "scratch" - an Objective-C framework that would fit between Etoile apps/services and the operating system's/sound server's audio API. It could use the ffmpeg project's libavcodec/libavformat libraries to do all of the hard work of decoding media formats. The advantages of this would be having something that exactly fits Etoile's needs, and fits in to the system perfectly (for example, the same code for decoding music could be used for transcoding between different formats, and be integrated into a system Etoile might provide for converting file formats), and being able to ensure details like gapless playback work perfectly. This might be a lot of work though - especially if it has to handle video playback and recording, and multiple OS backends, so it may be best to stick with GStreamer/Xine for now.
Here's a screenshot of EtoileTunes:
If you would like to try it, the code is in /branches/ericwa/EtoileTunes. It's still very work-in-progress, though.
Lastly, any suggestions for a better name? :)
Compiler Fun
Posted on 12 May 2008 by
Anyone following the Étoilé svn logs recently will notice that I haven't been committing much for a few weeks. The reason for this is that I've been taking a short break to do some compiler hacking.
Objective-C support was first added to GCC by some guys at NeXT. They didn't want to release their code, but were eventually forced to by the FSF. They did not release the code for their runtime library, and so this code was completely useless to anyone else. RMS wrote a drop-in replacement for this library, which became the GNU Objective-C runtime. Gradually the GNU and NeXT runtimes diverged and the Objective-C support code in GCC became littered with #ifdefs.
After Apple bought NeXT, they continued developing their version of GCC in a branch. This branch was slightly cleaner, since it never had support for the GNU runtime, but no use to anyone on platforms other than Darwin for the same reason. This code is no fun at all to work with - Objective-C structures are lowered to the corresponding C structures, so there is no clean Objective-C AST to work with and runtime-specific code is interleaved with the abstract representations. When Apple add a new language feature, they add it to their branch, and if anyone else wants to use it then they have to merge the changes into the main trunk. Unfortunately, no one is doing this and Objective-C support in GCC is in a rather depressing state (bugs in Objective-C are not seen as show stoppers for a release, as we saw in the early 4.x series).
Recently, GCC switched to GPLv3. Apple corporate policy is that they will not touch GPLv3 code, and so the Apple branch is now a fork of GCC 4.2. Features added to GNU GCC will not find their way into Apple GCC and vice versa, unless explicitly licensed in a compatible way by their contributor.
Apple have also started looking at a new compiler, known as LLVM. This is a modular infrastructure for building compilers. It currently has an Objective-C/C/C++ front end based on Apple's GCC. This combination of an LLVM back end and a GCC front end is typically known as llvm-gcc. It is found in the iPhone SDK and is likely to be found in the OS X dev tools soon. GCC isn't really designed to be split apart like this, however, and so the Apple guys have been working on a new front end, clang.
Unlike GCC, clang has very clean layering. This is intentional, since Apple also want to use it in Xcode for syntax highlighting and refactoring tools. This means that every single Objective-C language construct gets corresponding AST nodes which are then passed to another part of the program which emits LLVM intermediate representation (IR) code - static single assignment assembly language - which is then turned into native code for the desired platform.
When I first looked at clang, most of the parsing code for Objective-C was done, but none of the code generation part. This meant that I was free to add any interfaces I wanted. Clang now has an abstract class encapsulating all of the runtime-specific behaviour and hooks in the generic code that call this. I have also written a complete implementation of this for the GNU runtime and an almost-complete one for the Étoilé runtime. As a result of this, clang can now compile about 90% of the files in GNUstep-base without issue. The remaining ones are failing due to a couple of outstanding bugs with implicit casts (the LLVM type system is a lot more strict than the Objective-C one and so casts which are implicit in Objective-C need to become explicit in the IR) and a few C features. GNUstep uses variable length arrays in a few places, for example, and I have only added partial support for these.
My changes to Clang are currently undergoing code review, but after this has happened and I've made the required changes they should go in.
Objective-C isn't the only thing that makes this interesting. Since the object model code is all isolated in a separate class, it is possible to plug this into other compilers trivially. Generating classes, protocols and categories, selectors and message sends that use the underlying GNU runtime (and soon the Étoilé runtime) functionality is trivial when using this class (each high-level construct is mapped to a method call). I am currently in the process of writing a Smalltalk compiler that uses this same back end. LLVM supports both JIT and static compilation, so we will be able to JIT-compile Smalltalk while developing, dump it to a file, and static compile it for distribution.
This means that Smalltalk will be a first-class citizen of the Étoilé ecosystem. Applications will be able to be written in Smalltalk and Smalltalk classes will be able to inherit from Objective-C classes. There is no bridging - Smalltalk methods will be compiled to native code and attached to the same structures as Objective-C methods. Once this is finished, I will be recommending Smalltalk as the development language-of-choice for new Étoilé applications. If you discover that a particular piece of code is too slow (after profiling) then you might want to rewrite it in Objective-C (or even pure C), although I don't expect Smalltalk to be much slower than Objective-C.
Smalltalk is not the only high-level language we will implement in this way - just the first. Expect Io, JavaScript and maybe even Self implementations later. These languages are all prototype-based, however, and so require a few features that are not found in the GNU runtime (but are in the Étoilé runtime) for full support.
Hackathon Recap
Posted on 11 April 2008 by
The Étoilé hackathon ended ten days ago. I left Swansea on Tuesday afternoon to fly back to Paris after a mix of train and bus. Afterwards, David and Damien Pollet succeeded in closing the hackathon by writing the first bits of a bridge between GNU Smalltalk and Objective-C.
Following Friday's talks, we wandered around the university looking for a pub where we could eat. We finally ended up at David's place and had a nice break around beers (that aren't beers but ales), coffee, saucissons and we ordered some real food too. Afterwards Nicolas was still motivated to learn more about EtoileUI. His interest in EtoileUI had the bad side-effect of shortening the night quite dramatically!
During the week-end, we discussed the proper way to implement a semantic editor and how to integrate it with EtoileUI and CoreObject. Nicolas spent most of his time writing a new prototype. He had already written a very rough one a few years ago.
Damien arrived from Lugano on the second day of the hackathon. After having installed Étoilé, he started to play with the possibility of hacking a Smalltalk bridge. In the meantime, David continued his work on EtoileSerialize, while I spent a large part of the week-end cleaning stuff here and there and hunting bugs in EtoileUI on GNUstep (Saturday and Sunday night especially :-).
We also began to clean up EtoileFoundation a bit: David moved EtoileThread into it, and last week I managed to bring in EtoileXML too and have both compile fine as subframeworks (EtoileFoundation.h playing the role of an umbrella header). In the middle of our discussions centered around CoreObject and EtoileSerialize, I got the impression David was doing some coding on LLVM too, while Damien was fighting with autotools next to me. At least, I was able to follow David's presence on the #llvm channel right on the wall screen where Jesse appeared the previous day. I'm pretty sure everything went on roughly that way until Monday!
On Saturday evening, I got lost in the Swansea suburbs and the five-minute walk to buy a few ales became a one-hour walk in the Welsh drizzle. The next day, the weather remained quite unstable, constantly hesitating between storm, sun and waxy clouds. On Monday, our own trinity (EtoileSerialize, CoreObject and EtoileUI) was expecting us for a last serious day of hacking. The sky was now in better shape and could have been described as sunny and warm by the previous day's standards ;-) David had the good idea to plan a break at the pub for the evening, so we could slow down a bit before really ending the hackathon. For the last day, I went to a coffee shop where I had previously bought some nice scones and Chelsea cakes (iirc), then we left David's place for the university and our daily coffee session with a nice view of the sea and the sound of the seagulls in the background (or is it just my vivid imagination?).
The hackathon was just great, although the time went by very quickly. David had organized everything well and after our daily hacking sessions, he even managed to cook some very nice stuff (like weird and tasty squashes) for the hungry French hackers, so I'm looking forward to the next one. Maybe in France…
Hackathon Progress
Posted on 30 March 2008 by
The hackathon started on Friday, with Quentin and Nicolas arriving in Swansea in the early afternoon. After briefly settling in, the three of us gave a short series of talks to the postgraduate students in the department of computer science.

I managed to persuade Quentin to write some documentation, which can be seen here:

Jesse, unfortunately, was unable to be with us in the flesh, but appeared on the wall in the middle of the afternoon as our very own Big Brother (or possibly Emmanuel Goldstein):

Most of the time has been spent working on CoreObject-related things. Nicolas has implemented a basic semantic text editor, which Quentin is wrapping up in CoreObject. I spent most of the time so far working on EtoileSerialise (or EtoileSerialize, as it is now known). It now passes the test automatically serialising nested structures of arrays of structures. The storage has also been abstracted away, and branching (finally) implemented.

Last night, after a day of hacking on various things, we implemented CorePizza, the official Étoilé project food:

Someone stole an hour from the middle of last night, so we're a bit tired today, but still making progress.
Summer of Code
Posted on 18 March 2008 by
The Google Summer of Code list was published last night. We have some good news and some bad news. The bad news is that Étoilé wasn't selected. The good news is that GNUstep was. Since Étoilé is built on top of GNUstep, everything that benefits them benefits us, so anyone interested in the summer of code and Étoilé should consider applying for one of the GNUstep places. We won't know how many places each project was awarded until later on in the process, but last year they got two so hopefully a couple of interesting projects can be finished as a result of the summer.
Spring Hackathon
Posted on 6 March 2008 by
The dates and location of the first Étoilé Spring Hackathon are now finalised. We will officially be starting on Friday the 28th of March and continuing until the 1st of April.
The Swansea University Computer Science department has very kindly agreed to provide us with a room for the duration of the hackathon in this building. Access to the building and room is via security card. I'll arrange some visitors' cards for attendees.
We're about fifteen minutes walk away from a pub that serves real ale and has free WiFi, which should come in handy for the evenings.
If you're coming, let me know when you're likely to arrive. I'll try to arrange some kind of social event on the Thursday evening before the real hacking starts.
Some Quick StepChat News
Posted on 13 November 2007 by
My last few posts have all been about very low-level stuff and have been sorely lacking in pretty pictures.
For the last day or so I've been working on adding vCard support to StepChat. It's not finished, but it now creates a 'Jabber People' group in your address book and adds any published vCards there. In future, it will merge any changes in a nice way. For now, it only gets vCards once (I need to poke the presence stuff to spot vCard updates). One nice bonus is that Jabber stores avatars in vCards, and once you can load vCards you get avatars almost for free. You will see in the screenshot that Dom has a nice line-art dragon for his avatar.
I also chased down the bug that was preventing colours displaying in the roster with GNUstep, so now that works too. Finally, one more thing you can see in the screenshot is that GNUstep now has nice menu dividers. This shot was taken with the Cairo backend (which works almost perfectly; the only bug I've seen with it is that text in the status message box on the roster gets horribly smeared. Since this doesn't happen in other text boxes it's probably a nib loading error).
Unfortunately, the bug causing windows to have the wrong titles is still there. I will have to have a hunt for it at some point (or just wait for Yen-Ju to fix it).
Another Day, Another Runtime
Posted on 10 November 2007 by
After spending a little while poking at the GNU runtime, I came to two conclusions:
- It was at least twice as complicated as it needed to be.
- GNU coding style really hurts my eyes.
I've spent a little while thinking about what I want from a runtime. One of my recent projects has been writing a Smalltalk JIT that targets the GNU runtime (still quite work-in-progress) and so I know the GNU runtime can support Smalltalk as well as Objective-C. Quentin, meanwhile, has been working on the Io bridge. While this works, it is a bit ugly because the Io object model doesn't really mesh well with the Smalltalk object model that the GNU runtime uses. More on this later.
Beyond better being able to support Io, I read a few interesting papers recently. The first was on Polymorphic Inline Caching. This is quite a neat idea, and allows you to eliminate the cost of dynamic method lookups in a number of cases. Unfortunately, this is quite hard to get right. Consider the simple Objective-C line:
[object message];
With the Apple runtime, this will be translated roughly into something like this:
objc_msgSend(object, @selector(message));
I'm cheating a bit here, and skimming over how the @selector() directive is expanded. In contrast, the GNU runtime does this:
IMP method = objc_msg_lookup(object, @selector(message));
method(object, @selector(message));
This is quite nice, since it means that a small compiler change is all that's needed to cache the method. We could replace this with something like this:
static IMP cached_method = NULL;
static Class cached_class = Nil;
if(cached_class != object->isa)
{
    /* Remember which class the cached method belongs to. */
    cached_class = object->isa;
    cached_method = objc_msg_lookup(object, @selector(message));
}
cached_method(object, @selector(message));
Now you only need to bother with the (expensive) method lookup if you reach this bit of code with two different object types. There are a lot of places in code where you will get the same kind of object all of the time, and this can give a huge speed boost. In other cases, you get a number of different ones and this is where polymorphic inline caching comes in. Rather than keeping a single (class, method) pair cached, you keep a few. Profiling can determine the optimal size for this cache relatively easily.
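To make the "keep a few pairs" idea concrete, here is a small sketch of a polymorphic inline cache in plain C. The Class, Object and IMP types, slow_lookup() (standing in for objc_msg_lookup()), the cache size of four and the round-robin replacement are all assumptions chosen for the example, not anything a real runtime or compiler does.

```c
#include <stddef.h>

typedef struct Class Class;
typedef struct Object { const Class *isa; } Object;
typedef int (*IMP)(Object *self);

/* Hypothetical class with a single method, to keep the example short. */
struct Class { const char *name; IMP sound; };

static int slow_lookups = 0;   /* counts trips through the expensive path */

/* Stand-in for the runtime's method lookup. */
static IMP slow_lookup(const Class *cls) {
    slow_lookups++;
    return cls->sound;
}

#define PIC_SIZE 4

/* One of these would live at each call site. */
typedef struct InlineCache {
    const Class *cls[PIC_SIZE];
    IMP imp[PIC_SIZE];
    int used;   /* number of valid entries */
    int next;   /* next entry to overwrite once the cache is full */
} InlineCache;

int send_cached(InlineCache *cache, Object *obj) {
    /* Fast path: a handful of pointer comparisons. */
    for (int i = 0; i < cache->used; i++)
        if (cache->cls[i] == obj->isa)
            return cache->imp[i](obj);

    /* Slow path: do the real lookup and remember the result. */
    IMP imp = slow_lookup(obj->isa);
    int slot;
    if (cache->used < PIC_SIZE) {
        slot = cache->used++;
    } else {
        slot = cache->next;
        cache->next = (cache->next + 1) % PIC_SIZE;
    }
    cache->cls[slot] = obj->isa;
    cache->imp[slot] = imp;
    return imp(obj);
}

/* Two sample classes for trying it out. */
static int cat_sound(Object *self) { (void)self; return 1; }
static int dog_sound(Object *self) { (void)self; return 2; }
static const Class Cat = { "Cat", cat_sound };
static const Class Dog = { "Dog", dog_sound };
```

Alternating between a Cat and a Dog at the same call site costs two slow lookups in total, however many messages are sent. The sketch never invalidates its entries, so it is only safe as long as methods never change.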
Nice and easy? Well, there's a catch. Objective-C is a dynamic language. You can load bundles which will replace methods at runtime and languages like Io allow even different objects to have different methods. This means that you need to check that the cache is valid before you use it. A problem.
This, and the difficulty in supporting Objective-C 2.0 on the GNU runtime caused me to write a new one from scratch. This took just under 48 hours (after which I ate and went out to a well-earned salsa class). What's new?
First, inline caching can now be done safely. Rather than looking up methods, the runtime looks up slots. The slot contains (among other things) an IMP and a version. How does the version help? Consider two classes, A and B. A inherits from B, which implements a -foo method. Somewhere in my code, I call this method on an instance of A and cache the result. Two things can cause this cache to become invalid:
- B's implementation of the method being modified / replaced.
- A having an implementation of the method added.
The first case Just Works™ since a pointer to the slot (which contains a pointer to the method) is cached. You can modify the method without any problems (great for debuggers and runtime optimisations). The second case is more tricky. When you add a method to A, it first performs a lookup on the selector. If this returns a non-NULL slot, then the version of the located slot is incremented. Any time you cache a slot, you should also cache the version; if there is a version mismatch with the cached slot then you need to perform the lookup again.
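The two invalidation cases can be walked through with a toy model in C. None of these names come from the actual runtime: Slot, Class, CallSiteCache, class_add_foo() and send_foo() are hypothetical, each class has room for exactly one method (foo), and the call site is assumed to only ever see instances of one class.

```c
#include <stddef.h>

typedef struct Object Object;
typedef int (*IMP)(Object *self);

typedef struct Slot { IMP method; unsigned version; } Slot;

/* Toy class: at most one method, foo. NULL means "inherited". */
typedef struct Class { struct Class *super; Slot *foo; } Class;
struct Object { Class *isa; };

static Slot *lookup_foo(Class *cls) {
    for (; cls != NULL; cls = cls->super)
        if (cls->foo != NULL) return cls->foo;
    return NULL;
}

/* Adding a method first looks up the selector and bumps the version of
 * whatever slot was found, so that caches holding it become stale. */
void class_add_foo(Class *cls, Slot *new_slot) {
    Slot *old = lookup_foo(cls);
    if (old != NULL) old->version++;
    cls->foo = new_slot;
}

/* What a call site would keep: the slot and the version seen at cache time. */
typedef struct CallSiteCache { Slot *slot; unsigned version; } CallSiteCache;

static int cache_misses = 0;

int send_foo(CallSiteCache *cache, Object *obj) {
    if (cache->slot == NULL || cache->slot->version != cache->version) {
        cache_misses++;
        cache->slot = lookup_foo(obj->isa);
        if (cache->slot == NULL) return -1;   /* does not respond to foo */
        cache->version = cache->slot->version;
    }
    /* The call goes through the slot, so a replaced IMP is picked up. */
    return cache->slot->method(obj);
}

/* Sample implementations for trying it out. */
static int foo_b(Object *self)         { (void)self; return 1; }
static int foo_b_patched(Object *self) { (void)self; return 3; }
static int foo_a(Object *self)         { (void)self; return 2; }
```

Replacing the method pointer inside B's slot is picked up without a new lookup, because the cache holds the slot and not the IMP. Calling class_add_foo() on A bumps the version of B's slot, so the next send notices the mismatch, looks up again and finds A's new slot.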
The slots also contain an optional offset. This can be used to implement very fast set/get methods. A lot of the time you wrap instance variables up in set/get methods to insulate users of the code from changes to the instance variable layout. This comes with a speed penalty. I can't make this go away completely, but the new runtime allows you to avoid the method call and just access the ivar directly, while maintaining the dynamic lookup. This can make things like KVO faster, since you can do direct ivar access while there are no observers and then switch to indirect access when there are some.
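As a sketch of how an offset in the slot could replace a getter call, consider the following. The Slot layout, the NO_OFFSET sentinel and slot_read_int() are invented for this example and should not be read as the runtime's real interface.

```c
#include <stddef.h>

typedef int (*IntGetter)(void *self);

/* Hypothetical slot: a method plus an optional ivar offset. */
typedef struct Slot {
    IntGetter method;
    ptrdiff_t offset;   /* NO_OFFSET when callers must go through the method */
} Slot;

#define NO_OFFSET ((ptrdiff_t)-1)

typedef struct Point { int x; int y; } Point;

static int getter_calls = 0;   /* counts real method calls, for the example */

static int point_get_y(void *self) {
    getter_calls++;
    return ((Point *)self)->y;
}

/* The lookup still happens dynamically (the caller obtains the slot at
 * runtime), but the call itself is skipped when an offset is published. */
int slot_read_int(const Slot *slot, void *object) {
    if (slot->offset != NO_OFFSET)
        return *(int *)((char *)object + slot->offset);
    return slot->method(object);
}
```

A KVO-style use would publish offsetof(Point, y) while nobody is observing the property and switch the slot to NO_OFFSET once an observer is added, so that every access goes through the method again.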
I said the new runtime was simpler (no exaggeration; it's roughly 10% of the code size of the GNU runtime). That's partly because it works in a slightly different way to other Objective-C runtimes. While the GNU runtime provides the functionality required to implement Smalltalk in C, the new one provides the functionality required to implement Self in C, and then implements Smalltalk in Self (which is very easy).
Every object has (or, rather, can have) its own dispatch table and its own lookup function. Classes really are just objects. Anything you do with classes, you can also do with objects; for example you can add a method to a single object at runtime (say hello to closures, prototypes, and all of the things Io and Lisp programmers have been mocking you for having to do without).
The class model contains a nod to that used by Animorphic Smalltalk. This used mixins as a base type. That's effectively what I'm doing although they're called classes so as not to scare off the old Objective-C programmers. Classes can be composed as mixins are, which is how concrete protocols are supported (the only difference between a concrete protocol and a mixin is that the compiler does type checking for a concrete protocol. From the perspective of the runtime they are the same).
Oh, and every object has an associated recursive mutex, so @synchronized can now be generated easily.
To get an idea of how the runtime is used, take a look at example.c, which contains some simple example Objective-C code in comments and the equivalent code the compiler should be producing. I'd love to see this supported in LLVM, so anyone familiar with that codebase who feels like lending a hand please let me know.
You can also find more information in the release announcement email, including a more detailed overview and the API docs.
None of the interfaces are set in stone yet, so any suggestions are welcome. You can grab a copy of the code by doing:
svn co http://svn.gna.org/svn/etoile/branches/libobjc_tr/
Labels: libobjc, Objective-C, runtime, shiny
Objective-C: Étoilé Vs Leopard
Posted on 18 October 2007 by
Mac OS X 10.5, codenamed Leopard, is due out in a week or so. One of the features it advertises is Objective-C 2.0. I've written a bit about Objective-C 2.0 before. In this post, I'm going to compare some of the new language features present in Étoilé with those present in Leopard.
Garbage Collection
The big feature all the Java programmers want is garbage collection. GNUstep has actually supported this for a little while. If you use RETAIN() and RELEASE() macros instead of sending retain and release messages, you get the correct stubs for the garbage collector generated when compiling with garbage collection enabled, or -retain and -release messages sent otherwise.
This support was originally begun in 2004, but I'm not aware of anyone who uses it. Part of the problem is that mixing GC and non-GC code is tricky, so it's really only an option for people with no legacy code. Another part is that it adds some runtime overhead.
Loose Protocols
Apple now allows you to specify that some methods in a protocol are optional. Apparently this is useful, but I can't think why. Objective-C gives two ways of accomplishing this already. The first is to use an informal protocol; a category on NSObject with a default (typically null) implementation of the methods. The other is to query at runtime with respondsToSelector: whether a delegate implements a method.
The point of using a formal protocol, rather than an informal one, is so that the compiler can check that you have implemented the methods. Another possible reason is to allow a runtime check for a set of methods at once. A loose protocol gives you none of these. It just moves things that should be in the documentation into the code. Great for people who believe header files are documentation, not so great for the rest of us.
Concrete Protocols
Concrete protocols are a potentially useful part of Objective-C 2.0. They allow protocols to contain default implementations of a method. In Étoilé, we have something similar; typesafe mixins.
Mixins allow you to maintain the separation of interfaces and implementations that Objective-C encourages. Mixins, unlike concrete protocols, are defined as classes. If you want to add a method to a class, you first define a class implementing it, like this:
@interface Mixin : NSObject {
}
- (void) method;
@end
@implementation Mixin
- (void) method
{
NSLog(@"Method called");
}
@end
When you want to apply it to a class, you simply do:
[aClass mixInClass:[Mixin class]];
After this, all instances of aClass will respond to -method (it will not get a double helping of the methods declared in NSObject). You can even declare and use instance variables in the Mixin class. When you apply a mixin you will get an exception if one of the following happens:
- The types of instance variables declared in the mixin do not match those declared in the class. The class can include more instance variables than the mixin, but it must include all of them. This allows mixins to directly access class ivars; something not possible with concrete protocols.
- The types of methods declared in the mixin conflict with the types of methods declared in the class (or a class the class inherits from).
Method Attributes and Properties
Method attributes might be really nice, but the number of GCC function attributes that you are allowed to use is very small. Most of the ones that are actually useful cannot be used in Objective-C without radically changing the way in which method lookup is handled; for example by adding an equivalent of Java's final keyword.
Properties seem at first glance to be a nice idea. They are close to C#'s implicit set and get methods. In terms of expressiveness, they give nothing more than key-value coding already allows us. They may be slightly faster; the compiler could possibly add some code to translate them into ivar lookups if the implementation is for direct access to ivars. Something similar could probably be done with KVC, in the same way that polymorphic inline caching works. I'd be surprised if Apple has implemented this, however. At the moment, the only advantage is to add some confusing extra syntax.
The only remaining feature of Objective-C 2.0 that I recall is the foreach construct. Étoilé has a macro which works in a similar way. The following two lines are semantically equivalent:
FOREACH(anArray, string, NSString*)
for(NSString * string in anArray)
The latter is slightly faster, but requires anArray to implement a very messy countByEnumeratingWithState:objects:count: method, which retrieves 16 objects with a single call. The Étoilé version is slightly slower (although it does do IMP caching for you), but works with any collection that supports -objectEnumerator and so does not require multiple code paths. It's included with EtoileFoundation, so can be used on OS X too, including OS X 10.4 and earlier.
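The idea behind an enumerator-based loop macro with cached lookup can be sketched in plain C. This is not the Étoilé FOREACH macro itself: the enumerator struct, FOREACH_SKETCH and total_length are hypothetical stand-ins, with a function pointer playing the role of the cached IMP for -nextObject.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct enumerator enumerator;
typedef void *(*next_fn)(enumerator *);

struct enumerator {
    next_fn next;   /* plays the role of -nextObject */
    void  **items;
    size_t  count;
    size_t  index;
};

/* Returns the next item, or NULL when the collection is exhausted. */
void *array_next(enumerator *e)
{
    assert(e != NULL);
    return (e->index < e->count) ? e->items[e->index++] : NULL;
}

/* The outer loop runs once and fetches the function pointer a single time;
 * the inner loop then calls it directly on every iteration. */
#define FOREACH_SKETCH(e, var, type) \
    for (next_fn var##_next = (e)->next; var##_next != NULL; var##_next = NULL) \
        for (type var; (var = (type)var##_next(e)) != NULL; )

/* Example use: sum the lengths of all strings in the collection. */
size_t total_length(enumerator *e)
{
    size_t total = 0;
    FOREACH_SKETCH(e, s, const char *) {
        total += strlen(s);
    }
    return total;
}
```

Because the macro only needs something that hands out one object at a time, the same loop body works for any collection that can provide such an enumerator.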
Prototypes and Futures
We've run out of new features for Leopard, but there's still one new one for Étoilé. We have support for prototypes in Objective-C. Any object that inherits from ETPrototype, or implements the ETPrototype protocol can use them. This required a small (binary-compatible) modification to the runtime system to allow delegation of method lookup to the class.
By using nested functions (not supported on OS X), you can create and add methods at runtime, like so:
id anObject;
DEFMETHOD(method)
{
//Code goes here.
}
[anObject setMethod:(IMP)method forSelector:@selector(foo)];
[anObject foo];
You can also declare methods with arguments, and declare them outside the scope of a function / method. Note that, as with nested functions, you can not call the method after the current function has returned if it references any local variables. Note too that instance variables in the method can only be accessed by casting self to the correct type and accessing them explicitly (e.g. ((MyClass*)self)->ivar).
These prototype objects can then be -clone'd, have KVC-accessible ivars added and removed using the -setValue:forKey: method, and be used just as prototypes in Self or Io. We don't restrict you to class-based programming.
Note, however, that prototypes do come with some runtime overhead and so should probably not be used everywhere. The same mechanism can be used for closures; if the nested function you add as a method uses lexical scoping, and is called immediately, then it will work as a block would in Smalltalk.
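For readers unfamiliar with Self or Io, the general prototype model can be sketched in plain C. This is an assumption-laden illustration, not how ETPrototype is implemented: the proto struct, the fixed-size slot table and all function names are hypothetical, and selectors are plain strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SLOTS 8

typedef struct proto proto;
typedef int (*method_fn)(proto *self);

struct proto {
    proto      *parent;              /* object this one was cloned from */
    const char *names[MAX_SLOTS];    /* selector names */
    method_fn   methods[MAX_SLOTS];  /* implementations */
    size_t      count;
};

/* Adds or replaces a method on this object only. Returns 0 if full. */
int proto_set_method(proto *obj, const char *name, method_fn fn)
{
    assert(obj != NULL && name != NULL);
    for (size_t i = 0; i < obj->count; i++) {
        if (strcmp(obj->names[i], name) == 0) {
            obj->methods[i] = fn;
            return 1;
        }
    }
    if (obj->count == MAX_SLOTS) {
        return 0;
    }
    obj->names[obj->count] = name;
    obj->methods[obj->count] = fn;
    obj->count++;
    return 1;
}

/* Looks in the object first, then in the objects it was cloned from. */
method_fn proto_lookup(const proto *obj, const char *name)
{
    for (; obj != NULL; obj = obj->parent) {
        for (size_t i = 0; i < obj->count; i++) {
            if (strcmp(obj->names[i], name) == 0) {
                return obj->methods[i];
            }
        }
    }
    return NULL;
}

/* A clone starts with no methods of its own and delegates to its prototype. */
proto proto_clone(proto *original)
{
    proto copy = { .parent = original, .count = 0 };
    return copy;
}

int answer_one(proto *self) { (void)self; return 1; }
int answer_two(proto *self) { (void)self; return 2; }
```

Overriding a method on a clone leaves the original untouched, which is the property that distinguishes this from adding a method to a class.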
While not technically a language feature, as I mentioned earlier, we also have support for futures.
Playing with the Runtime Again
Posted on 9 October 2007 by
Everyone should have a hobby, and at the moment mine seems to be abusing the GNU Objective-C runtime. One of my spare time projects is writing a Smalltalk JIT that uses the GNU runtime to provide the object model (allowing you to subclass Objective-C objects in Smalltalk).
Last night, Quentin demonstrated the old maxim that the way to get anything done in an open source project is to tell the developers that it sucks because it can't do X; pretty soon they'll have it doing X, for any value of X. Quentin's criticism was that it wasn't possible to subvert the message dispatch mechanism very easily.
What does that mean? Let's take a look at what happens when you send a message in Objective-C. First, you write something like this:
[object doSomethingWith:aParameter];
The compiler then converts this into something like this:
SEL sel = sel_get_any_uid("doSomethingWith:");
IMP imp = objc_msg_lookup(object, sel);
imp(object, sel, aParameter);
Note that this is a simplification, and the selector will typically be cached somewhere. The important function is objc_msg_lookup, which returns the function that implements the method. These functions always take the object and selector as the first two arguments, and may take others.
For a language like Smalltalk or Objective-C, this mechanism makes sense. For something like Io, it almost does. The problem, for Io, is the implementation of objc_msg_lookup(). This looks at a sparse array structure in the class structure to find the mapping. This isn't helpful for a prototype-based language like Io, where instances might have different methods to their classes. For Io, you want to be able to alter the behaviour of the objc_msg_lookup() function on a per-object basis. For bonus points, you want to do this without breaking binary compatibility (the GNU C++ standard library people were very unpopular when they did this).
Fortunately, the Objective-C class structure has a field called info, which is a bitfield. Actually, it's half a bitfield; the upper half is used to store the id of the class in the system, limiting you to 64K different classes, and the lower half stores flags. These flags are used for various different purposes, including indicating whether a class has an +initialize method that needs to be called. Not all of them are used, so I added a new one. I then modified the objc_msg_lookup function to include a special case if this is set.
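The split of the info field into a class number and a set of flags can be sketched in plain C. The layout here is an assumption for illustration (a 32-bit word, upper 16 bits for the number, lower 16 for flags), and INFO_OBJECT_DISPATCH is a hypothetical bit value; the real runtime's field width and the bit chosen by the patch may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: upper 16 bits hold the class number (hence the 64K
 * limit), lower 16 bits hold flags. */
#define INFO_FLAG_MASK       0x0000FFFFu
#define INFO_NUMBER_SHIFT    16
/* Hypothetical flag meaning "this class handles its own message lookup". */
#define INFO_OBJECT_DISPATCH 0x0100u

/* Builds an info word from a class number and a set of flags. */
uint32_t info_make(uint32_t class_number, uint32_t flags)
{
    assert(class_number <= 0xFFFFu);
    return (class_number << INFO_NUMBER_SHIFT) | (flags & INFO_FLAG_MASK);
}

/* Sets a flag without disturbing the class number or other flags. */
uint32_t info_set_flag(uint32_t info, uint32_t flag)
{
    assert((flag & ~INFO_FLAG_MASK) == 0);
    return info | flag;
}

int info_has_flag(uint32_t info, uint32_t flag)
{
    return (info & flag) != 0;
}

uint32_t info_class_number(uint32_t info)
{
    return info >> INFO_NUMBER_SHIFT;
}
```

Because the new flag uses a previously unused bit, existing code that reads the other flags or the class number sees no change, which is what keeps the modification binary-compatible.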
Now, if you set the flag on your class then the runtime will know it wants to handle message lookup itself:
+ (void) initialize
{
CLS_SETOBJECTMESSAGEDISPATCH(self);
[super initialize];
}
You then implement a method like this for your class:
+ (IMP) messageLookupForObject:(id)anObject selector:(SEL)aSelector
This lets you store your own version of a dispatch table in an instance variable, so you can add methods to a specific object at runtime. Objects which are extended in Io can have this flag set on their classes at runtime, and use a separate dispatch mechanism for the Io methods.
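The special case in the lookup function can be sketched in plain C. Everything here is a simplified stand-in rather than the patched runtime: class_info, msg_lookup and per_object_hook are hypothetical names, selectors are strings, and each dispatch table holds a single entry to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct object object;
typedef struct class_info class_info;
typedef int (*imp_fn)(object *self, const char *sel);
typedef imp_fn (*lookup_hook)(object *obj, const char *sel);

struct class_info {
    int         object_dispatch; /* stands in for the new info flag */
    lookup_hook hook;            /* stands in for +messageLookupForObject:selector: */
    const char *sel;             /* one-entry class dispatch table */
    imp_fn      imp;
};

struct object {
    class_info *isa;
    const char *own_sel;         /* per-object method, if any */
    imp_fn      own_imp;
};

/* Ordinary class-based lookup. */
imp_fn class_lookup(const class_info *cls, const char *sel)
{
    return (cls->sel != NULL && strcmp(cls->sel, sel) == 0) ? cls->imp : NULL;
}

/* Lookup with the special case: if the flag is set, ask the class's hook. */
imp_fn msg_lookup(object *obj, const char *sel)
{
    assert(obj != NULL && obj->isa != NULL && sel != NULL);
    if (obj->isa->object_dispatch && obj->isa->hook != NULL) {
        return obj->isa->hook(obj, sel);
    }
    return class_lookup(obj->isa, sel);
}

/* Example hook: prefer the object's own method, else fall back to the class. */
imp_fn per_object_hook(object *obj, const char *sel)
{
    if (obj->own_sel != NULL && strcmp(obj->own_sel, sel) == 0) {
        return obj->own_imp;
    }
    return class_lookup(obj->isa, sel);
}

int class_impl(object *self, const char *sel) { (void)self; (void)sel; return 10; }
int object_impl(object *self, const char *sel) { (void)self; (void)sel; return 20; }
```

Classes that never set the flag pay only for one extra test on the lookup path, and objects without their own methods still resolve to the class implementation.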
It can also be used for more efficient proxying for local objects. The CoreObject proxy object wants to pass messages that don't change object state right through, without logging them. With this mechanism, it will be possible to implement a hashmap lookup with the same sort of cost as performing a normal message lookup for messages that are passed through (store an NSMapTable in a COProxy ivar containing selector to IMP mappings for the proxied object), and return the message forwarding IMP for those that aren't.
Another, potentially interesting, option would be to combine this with some runtime code generation to dynamically construct proxy methods that would log their arguments and then pass them on without needing to construct an NSInvocation.
Hopefully, this patch will make it upstream as far as the GNUstep version of libobjc, even if it doesn't make it all the way into GCC. Anyone who wants to play with it themselves can find the diff in this mailing list post.
Futures in Objective-C
Posted on 23 September 2007 by
I like concurrent programming, but I don't like threads. Like files, they're a nice abstraction for operating system designers but not so much for userspace hippies.
In functional languages, you can often get concurrency for free by having a clever compiler, instead of a clever human. This is good, because clever humans are expensive. Clever compilers are too, but they're easier to copy than clever humans.
Consider the following bit of Objective-C:
id foo = [anObject doSomething];
...
[foo doSomethingElse];
In Smalltalk, objects were regarded as simple computers that communicated by message passing. The fact that this message passing was implemented with a stack was hidden. To the Smalltalk way of thinking, the objects were independent. This bit of code sends a doSomething message to anObject, and blocks until it sends a return message.
From here, it doesn't take long to realise that you don't actually need it to block until the [foo doSomethingElse] line. So, can we implement this in Objective-C in the general case? The answer is yes, and that's what the EtoileThread framework (soon to be in EtoileFoundation) does. It works on OS X too.
I actually wrote EtoileThread in a hotel in Dublin in June last year, but I recently rewrote a lot of it to be more efficient. I'd like to give a little overview of how it works.
There are three core components to this. The first is the ETThreadedObject class, which encapsulates an object in its own thread. It's an NSProxy subclass, and forwards messages to the real object. You create it typically via an NSObject category, which adds +threadedNew and -inNewThread methods. When you send a message to the object returned by either of these, the following sequence happens:
- The invocation is caught by the ETThreadedObject and put into a ring buffer.
- The forwardInvocation: method returns, and the calling code receives an ETThreadProxyReturn object.
- The second thread retrieves the invocation from the ring buffer and executes it.
- The second thread passes the real return value to the previously returned ETThreadProxyReturn.
- Any calls to methods in the returned proxy block until this point.
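The last two steps of this sequence, where the worker hands over the real value and the caller blocks only when it actually uses the result, can be sketched in plain C with POSIX threads. This is a minimal illustration, not the EtoileThread code: the future struct and all function names are hypothetical, there is no invocation queue, and the "method" is hard-coded to return a constant string. Some platforms need the -pthread compiler flag to build it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    int             done;
    const char     *value;
} future;

void future_init(future *f)
{
    assert(f != NULL);
    pthread_mutex_init(&f->lock, NULL);
    pthread_cond_init(&f->ready, NULL);
    f->done = 0;
    f->value = NULL;
}

/* Called by the worker thread once the real return value is known. */
void future_fulfil(future *f, const char *value)
{
    pthread_mutex_lock(&f->lock);
    f->value = value;
    f->done = 1;
    pthread_cond_broadcast(&f->ready);
    pthread_mutex_unlock(&f->lock);
}

/* Called by the caller; blocks only if the value is not ready yet. */
const char *future_get(future *f)
{
    pthread_mutex_lock(&f->lock);
    while (!f->done) {
        pthread_cond_wait(&f->ready, &f->lock);
    }
    const char *value = f->value;
    pthread_mutex_unlock(&f->lock);
    return value;
}

/* Worker standing in for the second thread executing the invocation. */
void *worker_get_foo(void *arg)
{
    future_fulfil((future *)arg, "foo");
    return NULL;
}

/* Starts the work and returns immediately; returns 0 on success. */
int send_get_foo(future *f, pthread_t *thread)
{
    future_init(f);
    return pthread_create(thread, NULL, worker_get_foo, f);
}
```

The caller is free to do other work between send_get_foo and future_get, which is where the concurrency comes from.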
What does that look like in practice? Well, we'll look at the simple example program included with the framework (ETThreadTest.m for those following in svn) and see. First, we define a simple class that has some trivial methods:
@implementation ThreadTest
- (void) log:(NSString*)aString
{
sleep(2);
NSLog(@"%@", aString);
}
- (id) getFoo
{
sleep(2);
return @"foo";
}
@end
The first just NSLogs whatever is passed to it, and the second returns a constant string. Next, in the main body, we create an instance of this in its own thread:
id proxy = [ThreadTest threadedNew];
We then send this a log message:
[proxy log:@"1) Logging in another thread"];
And then a getFoo message. Recall that the implementations of both of these messages had a two-second delay built into them. This was introduced to make it obvious which order everything was being executed in.
NSString * foo = [proxy getFoo];
Next, we NSLog something from the main thread, just to show where we are.
NSLog(@"2) [proxy getFoo] called. Attempting to capitalize the return...");
Then, we NSLog the return value from the getFoo method (for good measure, we'll send it a message and NSLog the result, rather than NSLogging it directly):
NSLog(@"3) [proxy getFoo] is capitalized as %@", [foo capitalizedString]);
Finally, since we know we are calling a future, we get the real object out and NSLog it.
if([foo isFuture])
{
NSLog(@"4) Real object returned by future: %@",
[(ETThreadProxyReturn*)foo value]);
}
What happens when we run this? Take a look:
$ ./ETThreadTest
2007-09-23 21:59:39.718 ETThreadTest[25196] 2) [proxy getFoo] called. Attempting to capitalize the return...
2007-09-23 21:59:41.695 ETThreadTest[25196] 1) Logging in another thread
2007-09-23 21:59:43.695 ETThreadTest[25196] 3) [proxy getFoo] is capitalized as Foo
2007-09-23 21:59:43.695 ETThreadTest[25196] 4) Real object returned by future: foo
Note that the NSLog from the main thread completes first. Note also the two second delays. Finally, note that the third and fourth log statements don't complete until after the getFoo method has run, since they depend on the returned value.
What's improved in the implementation of this in the last week? My first version was quite experimental. It used an NSMutableArray to store the invocation queue. This meant that every message going into the queue required these steps:
- Acquiring a mutex.
- Inserting an object into an NSMutableArray.
- Signalling a condition variable (if the array was empty).
- Releasing a mutex.
On the receiving end, you needed the following:
- Acquiring a mutex.
- Sleeping on a condition variable (if the array is empty).
- Removing the first object from an NSMutableArray (quite expensive).
- Releasing a mutex.
This is a minimum of four system calls (a maximum of six) and at least one expensive array operation. This isn't too bad if you are only very occasionally sending messages to your threaded objects, and are expecting them to take a long time to complete. If you are sending a lot of messages, however, the overhead quickly stops it being worthwhile to bother with the second thread.
The new implementation uses a lockless ring buffer in this situation. Inserting an object into this involves a subtraction and a comparison to see if it's full (we just spin using sched_yield() if it is, but with enough space for 128 invocations in the buffer that should be rare), inserting two objects in a C array (the invocation and the proxy value), an addition, and a memory barrier (on weakly-ordered platforms, not on x86). Removing an object switches back to the old-style locking mode if the queue is empty. We don't want to spin if there is no work to do for the consumer thread, because that could last a long time. Then, taking the objects out of the array is just copying their pointers out of the array and another addition.
The ring buffers, like those in Xen, use free-running counters and a mask of the lower bits to translate this into an address.
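A ring buffer with free-running counters and a mask can be sketched in plain C using C11 atomics. This is an illustration under assumptions, not the EtoileThread source: it handles one producer and one consumer, stores a single pointer per slot rather than an invocation and a proxy, reports full or empty to the caller instead of spinning or falling back to locking, and all names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 128u              /* must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

typedef struct {
    void       *slots[RING_SIZE];
    atomic_uint producer;           /* free-running: only ever incremented */
    atomic_uint consumer;           /* free-running: only ever incremented */
} ring;

void ring_init(ring *r)
{
    assert(r != NULL);
    assert((RING_SIZE & RING_MASK) == 0); /* power-of-two check */
    atomic_init(&r->producer, 0);
    atomic_init(&r->consumer, 0);
}

/* Producer side. Returns 0 if the buffer is full, 1 on success. */
int ring_push(ring *r, void *item)
{
    unsigned p = atomic_load_explicit(&r->producer, memory_order_relaxed);
    unsigned c = atomic_load_explicit(&r->consumer, memory_order_acquire);
    if (p - c == RING_SIZE) {
        return 0;                   /* the subtraction and comparison */
    }
    r->slots[p & RING_MASK] = item; /* mask turns the counter into an index */
    /* Release ordering acts as the barrier: the slot write becomes visible
     * before the incremented counter does. */
    atomic_store_explicit(&r->producer, p + 1, memory_order_release);
    return 1;
}

/* Consumer side. Returns 0 if the buffer is empty, 1 on success. */
int ring_pop(ring *r, void **item)
{
    unsigned c = atomic_load_explicit(&r->consumer, memory_order_relaxed);
    unsigned p = atomic_load_explicit(&r->producer, memory_order_acquire);
    if (p == c) {
        return 0;
    }
    *item = r->slots[c & RING_MASK];
    atomic_store_explicit(&r->consumer, c + 1, memory_order_release);
    return 1;
}
```

Because the counters are unsigned and the size is a power of two, the difference p - c stays correct even after the counters wrap around, so no explicit wrap handling is needed.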
There are two special cases. The futures code only works for methods that return objects. For methods that return non-objects, we need to wait until the invocation has completed. For void return methods, we simply complete asynchronously, but don't return a proxy object.
If you want a more detailed overview of how it all works, there are lots of comments in the code. Have fun, and file helpful bug reports.
XHTML-IM Support
Posted on 27 August 2007 by
Ages ago, I wrote most of an XHTML-IM parser for StepChat. I didn't enable it in the default build, since it wasn't quite working. Over the weekend, I finished it off, tested it, and added support for generating some XHTML-IM. The code is not completely compliant with the specification, since it doesn't bother checking if the other party supports HTML. This will probably be added soon. It also contains a couple of work-arounds for libgaim-based clients, which interpret the standard quite creatively.
Below, you can see a conversation I had with my debugging persona (yes, I know talking to yourself is a sign of madness, especially when you use the Internet to do it). The first image comes from FreeBSD/GNUstep while the second one comes from OS X. Camaelon is not enabled for Jabber on my FreeBSD box because it was breaking things (I think it's fixed now, but I haven't got around to removing the default).

The more observant among you will notice that I fixed a bug in the handling of bold text in the middle of this conversation.
Labels: jabber, omgponies, stepchat, xhtml-im, xmpp
Drop Shadows
Posted on 9 August 2007 by
Shadows are nice. Since we got compositing managers for X11, we've had the ability to do this. The xcompmgr program does it, but in a gimmicky, eye-candy way, which doesn't really add much to usability.
Shadows are definitely something we want though, so yesterday I forked xcompmgr and put it in our repository. Today we have some (early) results. Drop shadows:
Three things to note. The first is that the dock and menu bar don't have shadows. The second is that the active window has a bigger one than the others. Finally, there are no shadows visible on the Typewriter window at the top right of the screen. The first two of these are intentional. The third is not, but only occurs in screenshots, not on the screen, and so is probably not very important.
Étoilé 0.2 Troubleshoot
Posted on 3 August 2007 by
Since the 0.2 release, there have been some issues with setting up the Étoilé environment. Here are a few steps to narrow down the problem.
First, make sure GNUstep is correctly installed. The current stable release of GNUstep is Make 2, Base 1.14, GUI/Back 0.12. You can also get them from the GNUstep stable branch. Run a few GNUstep applications in any graphical environment to be sure GNUstep is working, and remember to source GNUstep.sh in your profile. You can get help from the GNUstep mailing list if problems occur at this stage.
Second, after installing Étoilé, run a few user-level applications, such as Typewriter, Sketch, StepChat, Vindaloo, AddressManager and FontManager. They are regular GNUstep applications, so you can run them in any graphical environment such as GNOME, KDE or WindowMaker. If you ran 'setup.sh' during the installation, several bundles have also been installed. For testing purposes, you can remove them with:
defaults delete NSGlobalDomain GSAppKitUserBundles
and add them back with:
defaults write NSGlobalDomain GSAppKitUserBundles '(
/usr/local/GNUstep/System/Library/Bundles/Camaelon.themeEngine,
/usr/local/GNUstep/System/Library/Bundles/EtoileMenus.bundle,
/usr/local/GNUstep/System/Library/Bundles/EtoileBehavior.bundle)'
Make sure the paths are correct. You can also try different combinations of these bundles. Camaelon is the theme engine. EtoileMenus is the horizontal menu. EtoileBehavior handles various tasks behind the scenes, and you will not see any change in the user interface with it.
Third, you should be able to set up Étoilé manually. With GNOME, you can log into a failsafe session with xterm only. There should be something similar on KDE. Once xterm shows up, run these system-level applications one by one:
gdomap &
openapp Azalea &
openapp AZBackground &
openapp EtoileMenuServer &
openapp AZDock &
If they all run properly, you are close to having a functional Étoilé environment. Log out of the session by exiting xterm, and log into the failsafe session again. Run the 'etoile_system' tool (without openapp!) and all the system applications should launch automatically. If not, check SystemTaskList.plist in GNUstep/System/Library/Etoile/ or ~/GNUstep/Library/Etoile/. It lists all the applications you launched in the previous step.
Finally, to add Étoilé to GDM, make sure these files exist:
- etoile.desktop, in your xsession directory, such as '/usr/share/xsessions'. And it should contain a line 'Exec=/usr/local/bin/etoile'. You can get this file in Etoile/Services/Private/System/.
- /usr/local/bin/etoile. This is the actual script to run 'etoile_system'. It should look like this:
. /usr/local/GNUstep/System/Library/Makefiles/GNUstep.sh
etoile_system
Note the space between '.' and '/usr/local/GNUstep/System/Library/Makefiles/GNUstep.sh'. Also make sure the file permissions are correct.
If you log into the failsafe session again, you should be able to run the '/usr/local/bin/etoile' script to launch etoile_system, which will then launch all the other system-level applications. At this point, you should have an Étoilé session available in GDM.
If you want to set up the Login.app, see 'Etoile/Services/Private/Login/README' for details.
Update: There is a summary of the latest GNUstep/Etoile on Solaris.
The Road to CoreObject Part 2: Why Bother?
Posted on 30 July 2007 by
Since the last post, a lot of people have asked me 'why are you doing this? What advantage does it actually give?' In this post, I'll try to explain.
One Abstraction, Two Uses
What is a file? Over the last year, I've asked a number of people that, from computer scientists to technophobes. None has managed to give me a clear answer. The next question I asked is 'What is a document?' Everyone I asked gave me a clear answer.
From a user interface perspective, it's clear that a document is a better abstraction than a file. A file is a very convenient abstraction for operating systems; it's basically a virtualised block device with a simple text key (the path/filename) that can be used to uniquely identify it. It is not a good abstraction for users.
Files are used for two things:
- Storing a document.
- Publishing a document.
From a user interface perspective, these are very different tasks. Storing a document is not something that should ever need to be done explicitly. Raskin's first law states:
A program shall not harm a user's data, or through inaction allow the user's data to come to harm.
Everything I do to a document should automatically be stored if possible. In some situations, such as sudden power failure, some data loss is inevitable, but the program should do everything it can to minimise the chance of avoidable data loss. A simple corollary to this is that versioning information should also be stored. If I hit select all, delete, then I don't want the stored form of my document to be overwritten with an empty document. I want an undo feature, and I don't want this to be contingent on keeping the document in memory (select all, delete, {autosave}, power failure, panic).
CoreObject's serialisation function does this. You don't need to explicitly save a document. From the time an application tells CoreObject to manage the object graph representing the document model, you have the ability to replay every single change you've made to it (this actually works in the version in /trunk now, although it needs more testing).
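The idea of recovering any earlier state by replaying recorded changes can be sketched in plain C. This is a toy model under assumptions, not CoreObject: the document is a short text buffer, the only changes are "append" and "delete all", and the history struct and function names are hypothetical. CoreObject itself records the messages sent to model objects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPS  64
#define MAX_TEXT 64

typedef enum { OP_APPEND, OP_DELETE_ALL } op_kind;

typedef struct {
    op_kind     kind;
    const char *text;   /* used by OP_APPEND only */
} op;

typedef struct {
    op     ops[MAX_OPS];
    size_t count;       /* number of recorded changes == latest version */
} history;

/* Records a change; returns the new version number, or 0 if the log is full. */
size_t history_record(history *h, op_kind kind, const char *text)
{
    assert(h != NULL);
    if (h->count == MAX_OPS) {
        return 0;
    }
    h->ops[h->count].kind = kind;
    h->ops[h->count].text = text;
    return ++h->count;
}

/* Rebuilds the document as it was at `version` by replaying the log. */
void history_replay(const history *h, size_t version, char out[MAX_TEXT])
{
    assert(h != NULL && version <= h->count);
    out[0] = '\0';
    for (size_t i = 0; i < version; i++) {
        if (h->ops[i].kind == OP_DELETE_ALL) {
            out[0] = '\0';
        } else {
            /* Append, truncating if the fixed buffer would overflow. */
            strncat(out, h->ops[i].text, MAX_TEXT - 1 - strlen(out));
        }
    }
}
```

In this model the "select all, delete" disaster is just one more entry in the log: replaying up to the version before it brings the text back.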
While you don't have to save a document explicitly, you might want to tag it with some metadata. Some of this will be created automatically for all objects (creation dates, modification dates, etc). Some will be created automatically for certain object types (e.g. colour depth, word count, table of contents). Some can be specified manually. This will be indexed by the higher layers of CoreObject. These tags can either be assigned to a specific version, or to the latest version. You might tag a book you are working on with the book title, and also tag the version you sent to the proof readers, so you can jump back to that one to compare with the comments they gave you.
Publishing is a very different problem. When you publish a document, you typically don't want to include revision information, you want a snapshot. A few government agencies have been embarrassed in recent years by forgetting that Word Documents are intended for storing, not publishing, and include a lot of revision information.
How does CoreObject help with publishing? Well, the current implementation doesn't (yet), but the plan is to integrate something like Apple's UTI (or, more likely, UTI itself). This is a type hierarchy supporting multiple inheritance that is orthogonal to the object hierarchy. Each compliant object will publish a number of types that it inherits from, such as rich text, or image. It will also support exporting its contents as each of these. For complex compound documents, the root document will simply query the enclosed components, and assemble a composite of images, text etc. Each object only needs to be able to export to something one layer up the type hierarchy. For example, a word processor might export as rich text, and the system would then convert this to text using a shared component.
What About My Friends
The other important feature of CoreObject is collaboration, which is central to the Étoilé vision. CoreObject's serialisation of invocations allows these to be sent across any kind of network connection. In 0.3, there will be an XML-over-XMPP system for this. This will stream changes between two (or, in theory, more) users as they are made. Some systems exist for doing this in very specific cases, such as SubEthaEdit for text and a few whiteboarding solutions for images. CoreObject will allow us to do this in the general case. Any document that works with CoreObject will be able to be shared in this way.
Because it only sends the deltas, this approach will scale to relatively large object types. Imagine something like a raw digital photograph. These can easily be several tens of megabytes. The changes made to them, however, are usually of the form 'alter the brightness level by 5%,' or 'apply this filter with these parameters.' These are not very big, and so once the photograph is initially shared, it can be tweaked in a collaborative fashion easily.
This is even true of video editing. Something like Apple's Final Cut does non-destructive editing. While the source footage is often tens of gigabytes, the project file is very small, since all it contains are instructions like 'take ten seconds from source file x and insert them at y in the timeline,' and 'cross fade for 10 seconds.' With CoreObject, we get this kind of non-destructive editing for free, and we also get the ability to collaborate on documents like this for free. We could have two people editing the same video on their own machines and having the changes automatically kept in sync. Once it's done, they export it as something like MPEG-4, and anyone can view it irrespective of whether they're using Étoilé.
Labels: CoreObject
Étoilé 0.2 is now officially released
Posted on 28 July 2007 by
Étoilé 0.2 is now officially released. See the full 0.2 Release Announcement for more information. There are a number of screenshots of this release online. A source tarball is available for download. Those preferring to use subversion should check out /tags/Etoile-0.2 from the repository. If you have any questions regarding this release, please post your queries to the Etoile-discuss mailing list or visit the SILC channel Etoile on silc.etoile-project.org.




