Is the world Class oriented?

When programming we in some sense try to model the world, or model our understanding of an aspect of the world. Or maybe even specifically model a piece of the world in a way that makes it easier for us to work with. But the world isn’t everything. We are also modeling ideas and abstractions that have no physical counterpart. Of course, we aren’t just modeling physical objects, “things”. We also have to model processes, relations, work flows and interactions.

How do we model these things? In most cases we generally use either object orientation or functional programming. Structured programming without explicit object orientation support in the language can still be divided into actions that operate on data, in something that looks a bit like object orientation. I won’t touch on the functional approach here, since it’s too tangential to what I want to discuss.

When I talk about object oriented languages here, many will probably associate mostly to languages like Java, C#, C++ or maybe Python or Ruby. These are all quite firmly in the category of class based object oriented languages. Is that all there is to object orientation? Not really. In fact, there are several ways to look at OO. Depending on what part you emphasize, you will get object systems that vary quite drastically.

For example, Scheme was originally an attempt to understand the Actor model, where the way actors work will look like OO (at least with tinted glasses). Common Lisp has CLOS, where generic methods aren’t associated with a specific class, but instead can be specialized on zero or more arguments. (Of course, you can mimic regular class based OO in CLOS, by only ever specializing on the first argument, but that diminishes the power a lot).

And then there is Self. Self was a programming language and environment based on Smalltalk, that introduced prototype based object orientation. This model basically removes the distinction between classes and other objects. Even in really open systems, where classes are instances just like other objects, classes still is the only thing you can make instances of in a class based OO language. In Self you clone an object to create a new one. Both values and methods are slots, and all slots of an object is copied when an object is cloned.

Self influenced several other languages, most notably JavaScript, NewtonScript, Io and REBOL. All of these have their own special take on prototyping, but the general distinction between prototype based and class based languages is easy to make.

I’m currently working on Ioke, and just like Io, Ioke will be a prototype based language. It will use differential delegation, which means that a new clone of something doesn’t get any copies of slots, but instead has a pointer back to the object it was cloned from. If you try to call a method on the object and that method isn’t found, the search will continue to the object it was created from, and so on. Since Ioke will allow you to have more than one prototype, I’ve decided to not use the word prototype, but instead call them mimics (an object mimics zero or more other objects).

We, as programmers, need to have powerful ways to model the world. But as I mentioned in the beginning, we also need to be able to model abstract concepts. I’m getting more and more convinced that prototype based OO is easier to work with and reason around than class based OO. The world around is can definitely be categorized, but the that is a side effect of the way our minds work. There is no real world equivalent of Plato’s Theory of Forms, but yet we insist in programming like it was true.

Working with ideas categorized as classes constrain us. As an example, most programs start out as some vaguely formed thoughts, and these thoughts get clearer and clearer as we continue developing the system. In most cases we end up refining the class definition in a source file and recompiling. But working with a prototype based system, this isn’t the only choice. You can instead create an initial set of objects and then continue to refine and refine with new objects. This is especially powerful when doing exploratory programming.

Is the world really class oriented? No. Is it prototype based? Not really. (Although a case could definitely be made for calling evolution, DNA and genetics for prototype based.) We can see the world in class oriented glasses, or we can see it using prototype based lenses. Both choices work well for certain things, and when modeling it’s nice to have both. So the question is this – can you express the one naturally in the other?

Class based system generally have a hard time looking like prototype based languages (although Ruby actually is very close). Prototype based systems on the other hand can very naturally congeal some of the core concepts into classes, and it’s easy to work with these as if they were real classes. (Some of the proposals for EcmaScript 4 added class oriented OO on top of the prototype based system).

I choose to make Ioke prototype based, since this gives me more power and flexibility.

Why not Io?

I have been asked a few times in different circumstances why I feel the need to create my own language instead of just working with Io. That is a very valid question, so I’m going to try to answer it here.

First of all, I like Io a lot. Ioke is close enough to Io that it will be obvious who the parent is. In my mind at least, the differences are in many ways cosmetic and in those that are not it’s because I have some fairly specific things in mind.

So what are the main differences? Well, first of all it runs on the JVM. I want it that way because of all the obvious reasons. The Java platform is just such a good place to be. All the libraries are there, a good way of writing extensions in a language that is not C/C++, a substrate that gives me threads, GC and Unicode for free. So these reasons make a big difference both for people using Ioke, and for me. I want to be able to use Ioke to interact with other languages, polyglot programming and all. And since I expect Ioke to be much more expressive than most other languages, I think it will be a very good choice to put on top of a stable layer in the JVM. Being implemented in C makes these benefits go away.

Of course I could just have ported Io to the JVM and be done with it. That’s how it all started. But then I realized that if I decided to not do a straight port, I could change some things. You have seen some discussions about the decisions I’m revisiting here. The whole naming issue, handling of numbers, etc. Other things are more core. I want to allow as much syntactic flexibility as possible. I really can’t stand the three different assignment operators. I know they make the implementation easier to handle, but having one assignment operator with pluggable semantics gives a more expressive language.

Another thing I’m adding in is literal syntax for arrays and hashes, and literal syntax for referencing and setting elements in these. Literals make such a difference in a language and I can’t really handle being without it. These additions substantially complicate the language, but I think it’s worth it for the expressive power.

A large difference in Ioke will be the way AST modification will be handled. Io gives loads of power to the user with regard to this, but I think there is more that can be done. I’m adding macros to Ioke. These will be quite powerful. As an example, the DefaultMethod facility (that gives arguments, optional arguments, REAL keyword arguments and rest argument) can actually be implemented in Ioke itself, using macros. At the moment this code is in Java, but that’s only because of the bootstrapping needed. The word macro might be a bad choice here though, since it executes at the same time as a method. The main difference is that a macro generally has call-by-name/call-by-need semantics, and that it will modify it’s current or surrounding AST nodes in some way. Yes, you heard me right, the macro facility will allow you to modify AST siblings. In fact, a macro could change your whole script from that point on… Of course Io can do this, with some trickery. But Ioke will have facilities to do it. Why? Doesn’t that sound dangerous… Yeah. It does, but on the other hand it will make it really easy to implement very flexible DSLs.

A final note – coming from Ruby I’ve always found Io’s libraries a bit hard to work with. Not sure why – it’s probably just personal taste, but the philosophy behind the Io libraries seem to not provide the things I like in a core library. So I will probably base Ioke’s core library more on Ruby than on Io.

There you have it. These are the main reasons I decided to not use Io. And once I started to diverge from Io, I decided to take a step back and start thinking through the language from scratch. Ioke will be the result, when it’s finished. (Haha. Finished. Like a language is ever finished… =)

The contract of IO#read

It’s interesting. After Charlie made an immense effort and rewrote our IO system, basically from scratch, I have started to find bugs. But these are generally not bugs in the IO code, but bugs in Ruby libraries that depend on the way MRI usually works. One of the more annoying ones are IO#read(n), where n is the length you want to read.

This method is not guaranteed to return a string of length n, even if we haven’t hit EOF yet. You can NEVER be sure that what you get back is the length you requested. Ever. If you have code that doesn’t check the length of the returned string from read, you are almost guaranteed to have a bug just waiting to happen.

Of course, it might work perfectly on your machine and every other machine you test it on. The reason for this is that read(n) will usually return n bytes, but that depends on the socket implementation or file reading implementation of the operating system, it depends on the size of the cache in the network interface, it depends on network latency, and many other things. Please, just make sure to check the return values length before going ahead and using it.

Case in point: net/ldap has this exact problem. Look in net/ber.rb and you will see. There are also two – possibly three – bugs (couple of years old too) that reports different failures because of this.

One thing that makes this problem hard to find is the fact that if you insert debug statement, you will affect the timing in such a way that the code might actually work with debug statement but not without them.

Oh, did I mention that StringIO#read works the same way? It has exactly the same guarantees as IO#read.

Silly Io experiment

I have known for some time that Io is quite powerful as a language. I wanted to check if it was also suited for creating DSLs. Since I’m lazy I didn’t want to create a new DSL just for this, so I decided to appropriate the associations language from ActiveRecord.

The process I followed was this – first come up with something that looks passably nice, then try to implement it. The only requirements was that the names of the associations were saved away somewhere, once per prototype for each model. This is the syntax I came up with (well OK, I didn’t work very long on it…):

Post := IoRecord {
has many authors
belongs to blog
belongs to isp

Author := IoRecord {
has many blogs
has many posts
has one name

Actually, this looks almost readable, right? It’s all valid Io code. The question is, how do we get it to do what we want? Without further ado, here’s an implementation that gives us enough:

IoRecord := Object clone do(
init := method(
self belongings := list
self hasOnes := list
self hasManies := list
self hasFews := list

appender := method(msg,
blk := block(
call sender doMessage(msg) append(call message next name)
call message setNext(call message next next)
blk setIsActivatable(true)

collector := method(
meths := call argCount / 2
waiter := Object clone
for(index, 0, meths-1,
waiter setSlot(call argAt(index*2) name,
appender(call argAt(index*2+1)))

belongs := collector(
to, belongings

has := collector(
many, hasManies,
one, hasOnes

curlyBrackets := method(
current := self clone
call message setName("do")
current doMessage(call message)

Since I’m a bad person and really enjoys metaprogramming, I had to get rid of most of the duplication the first implementation contained. Let’s say it like this: the first version didn’t have the collector and appender methods. And boy do they make a difference. This is real metaprogramming. Actually, these two methods are actually macros, almost as powerful as Common Lisps. Notice that the words we send in to collector doesn’t actually get evaluated. This is one of the reasons we don’t need to use symbols – we can just take the unevaluated messages and take their name. In the appender macro we’re doing something really funky where we use setNext.

The final effect of that setNext call is that in something like “has many foobar”, Io will not try to evaluate foobar at all. In fact, even before Io tries this, we remove the foobar message from the chain and inserts the next message instead.

Oh, right, and the Io lexer actually inserts a synthetic message called “curlyBrackets” when it finds curly brackets. The brackets themselves actually act exactly like parenthesis. You can try this out in Io by changing the closing parenthesis to a closed curly bracket instead – Io doesn’t care. Your brackets doesn’t have to match up in type, just in cardinality.

Of course, this is just the beginning. Io will blow your mind.

The thing I’m finding myself missing is a good literal syntax for hashes and lists. I’m thinking about implementing something for that. Also, using atPut is starting to get annoying. I want square brackets access. Shouldn’t be too hard, since Io does the same thing with square brackets as it does with curly brackets. Almost, at least.

And btw, the Io standard library… Not sure what I think about it. I miss many things from the Ruby core library actually. Now, Io combined with the Ruby libraries, that might be fun?

Emacs Inferior mode for the Io language

For anyone who like the Io language and work with Emacs, here is a small Inferior mode for Emacs and Io.
It’s adapted from the Ruby Inferior mode. Download it here. To use it, just do “(require ‘inf-io)” and then M-x run-io.