Ioke at Amsterdam.rb


Monday evening I’ll be in Amsterdam, talking about Ioke to Amsterdam.rb, in between some other presentations. Seeing as this is going to be the first public presentation about Ioke, I’m absolutely thrilled. It’s going to be great fun. I still haven’t decided exactly how to present Ioke, but I think it will be made easier by this being a Ruby crowd.

As far as I understand, the evening is actually fully booked, so there is no more room for people to come. But hopefully everyone in Amsterdam who is interested in Ioke (all three of you? =) will be there. The presentation will be at TTY’s offices in Amsterdam, starting at 19:00. It seems my slot will start at 20:35.

Hope to see you there. This might be an historic event! =)



Ioke is not a Lisp


I generally get lots of different kinds of reactions from people when showing them Ioke. A very common misconception about the language is that it is “just another dialect of Lisp”. That is quite an easy mistake to make, since I’ve borrowed heavily from the things I like about Lisp. But there are also several things that make Ioke fundamentally different. Ioke is Lisp turned inside out and skewed a bit.

First we have the syntax. Ioke has syntax inspired by Ruby and Self, with some variations that come from Common Lisp too. The main difference between how Ruby and Ioke handle syntax is that Ioke transforms everything down to the same message passing idiom. Most of the syntax is in the form of operators, which don’t get any special handling by the parser at all. But the syntax is still there, and it is also more deeply embedded than in Lisp. Ioke acknowledges the existence of different kinds of literals, and allows you to override and handle them differently if you want. One of the examples in the distribution is a small parser combinator library. It allows you to use regular text and number literals, but in the context of the parser those literals will return new parsers for their types.

Common Lisp can play these syntactic games with reader macros, but it is generally not done. Ioke embraces the use of syntax to improve readability and the creation of nice DSLs.

Of course, any Lisp guy will tell you that syntax has been tried, and that programmers preferred S-expressions. The latest example of this is Dylan. But I’ll have to admit that if you look at the Dylan syntax, you understand why programmers didn’t feel like bothering with it. It’s one thing to try syntax by just bolting on a clumsy Algol derivation. It is another thing entirely to actually focus on syntax.

That said, Ioke is homoiconic, and translates to a structure that could easily be represented as cons cells. It doesn’t, though, since the message chain abstraction works better.

The other thing that really makes Ioke different from Lisp (and also a reason infix S-expressions would be extremely impractical) is that Ioke is fundamentally object oriented, built on a message passing idiom. In a Lisp, all function calls are evaluated in the context of the current package (at least in Common Lisp). You can get different behavior if the function you call is in fact a generic function, but in reality you’re still talking about one function. If you want to chain calls together, you have to turn them inside out. That doesn’t lend itself well to a message passing paradigm where there is an explicit receiver for everything.

Ioke, in contrast, has the message passing model at its core. Any evaluation in Ioke is a message send to some receiver. In that model, you really need to make it easy to chain messages in some way. And that’s why S-expressions would never work well for Ioke. S-expressions fundamentally don’t use the concept of a receiver at all.
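
To make the contrast concrete, here is a small illustration (the Lisp rendering is only for comparison, and the pipeline itself is just an example):

; In Common Lisp, the calls nest outwards around the data:
;   (remove-if-not (lambda (x) (> x 4))
;                  (mapcar (lambda (x) (* x 2)) (list 1 2 3 4 5)))
; In Ioke, the same pipeline reads left to right, each message
; sent to the result of the previous one:
[1, 2, 3, 4, 5] map(x, x * 2) filter(x, x > 4) ; => [6, 8, 10]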

All the other differences between Ioke and any Lisp could be chalked up to minor dialectical differences, but the two biggies are the ones above. Ioke is not a Lisp. It’s heavily inspired by Lisp, but it’s fundamentally different from it.



Ioke’s need for speed


I find it interesting that one of the expectations people have of Ioke is that it will be faster than Io. I need to make this really clear: Ioke is not even close to the speed of Io. Ruby is a blazing arrow compared to Ioke. I know this, and I’ve made a deliberate design decision not to take performance into consideration at all right now.

There are several ways to think about this. I’m creating Ioke for several reasons. One of them is to see how far you can push expressiveness in a language. Can you take the blub hierarchy another level up? To be able to do this, it’s impossible to be concerned with speed.

I realize that if Ioke someday becomes a successful language, performance will be more of an issue. But there are several implicit assumptions in that statement. The first one is that it will be successful at all. After all, most languages fail. If Ioke fails to become successful enough (for any definition of the term), it will almost certainly not be because of lack of speed. And if lack of speed is the only problem, well, that can be fixed by focusing totally on performance after the fact. But I feel that it is much more important to first get the model correct, and then optimize, rather than the other way around.

Programmers all know the old aphorism about the dangers of premature optimization. But we still do it. All the time. It is really hard for me to try to avoid it in Ioke. But I think I’m succeeding, because Ioke is really dead slow right now.

Ioke does need speed. But it doesn’t need speed right now. There are many ifs in the success of a language, but if all of them are fulfilled, well, then maybe in a year or two a focus on speed will come. Ioke is a malleable language, and it will be possible to do all kinds of tricks with it to make performance improvements happen. But that is a consequence of the runtime model. The fact that the language is really flexible makes it slow now, but it will also make it easier to improve performance later.



Ioke S released


Exactly one month after the first release of Ioke, I am very happy to announce that Ioke S has been released. It has been a team effort and I am immensely pleased with the result of it.

Ioke is a language that is designed to be as expressive as possible. It is a dynamic language targeted at the Java Virtual Machine. It’s been designed from scratch to be a highly flexible general purpose language. It is a prototype-based programming language that is inspired by Io, Smalltalk, Lisp and Ruby.

Homepage: http://ioke.org
Download: http://ioke.org/download.html
Programming guide: http://ioke.org/guide.html

Ioke S is the second release of Ioke. It includes a large number of new features compared to Ioke 0. Among the most important are syntactic macros, full regular expression support, for comprehensions, aspects, cond and case, destructuring macros, and many other things.

Ioke S also includes a large number of bug fixes, and several example programs.

Features:

  • Expressiveness first
  • Strong, dynamic typing
  • Prototype based object orientation
  • Homoiconic language
  • Simple syntax
  • Powerful macro facilities
  • Condition system
  • Aspects
  • Developed using TDD
  • Documentation system that combines documentation with specs
  • Wedded to the JVM

The many things added in Ioke S could not have been done without the support of several new contributors. I would like to call out and thank:
T W <twellman@gmail.com>
Sam Aaron <samaaron@gmail.com>
Carlos Villela <cv@lixo.org>
Brian Guthrie <btguthrie@gmail.com>
Martin Elwin <elvvin@gmail.com>
Felipe Rodrigues de Almeida <felipero@gmail.com>



Guide to Ioke development environment


Wonderful! Martin Elwin posted the first part of a series on setting up an Ioke development environment. It’s fantastic to see people blog about Ioke, and Martin has written some other pieces too. The JSON parser is especially cool. You can see the development environment post here.



Talking about Ioke, at Amsterdam.rb


Thanks to Sam Aaron, I will talk about Ioke at Amsterdam.rb on February the 23rd. Great stuff. I’m not sure yet what I will cover, since it’s still quite far in the future.

For the record, since I released Ioke 0 on December 23rd, it would be fun to release Ioke S on Jan 23rd, and Ioke P on Feb 23rd. The Ioke S release date I think I can hold, but the P one is much more worrisome. Anyway, see you in Amsterdam, hopefully!



Macro types in Ioke – or: what is a dmacro?


With the release of Ioke 0, things regarding types of code were pretty simple. At that point Ioke had DefaultMethod, LexicalBlock and DefaultMacro. (That’s not counting the raw message chains of course). But since then I’ve seen fit to add several new types of macros to Ioke. All of these have their reason for existing, and I thought I would try to explain those reasons a bit here.

But first I need to explain what DefaultMacro is. Generally speaking, when you send the message “macro” in Ioke, you will get back an instance of DefaultMacro. A DefaultMacro is executed at runtime, just like regular methods, and in the same namespace. So a macro has a receiver, just as a method does. In fact, the main difference between macros and methods is that you can’t define arguments for a macro. And when a message activates a macro, the arguments sent to that message will not be evaluated. Instead, the macro gets access to a cell called “call”. This cell is a mimic of the kind Call.

What can you do with a Call, then? Well, you can get access to the unevaluated arguments. The easiest way to do this is by doing “call arguments”. That returns a list of messages. A Call also contains the message sent to activate it. This can be accessed with “call message”. A Call also contains a reference to the ground in which the message was sent. This is accessed with “call ground”, and is necessary to be able to evaluate arguments correctly. Finally, there are some convenience methods that allow the macro to evaluate arguments. Doing “call argAt(2)” will evaluate the third argument and return it. This is a short form for the equivalent “call arguments[2] evaluateOn(call ground, call ground)”.
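
To make that concrete, here is a minimal sketch of a raw macro that uses nothing but the Call cells described above (the macro name is made up, and I use println just to show the unevaluated message):

showAndEval = macro(
  call arguments[0] println ; prints the raw argument message, e.g. 2 +(3)
  call argAt(0))            ; then evaluates that argument and returns the value

showAndEval(2 + 3) ; prints the message chain and returns 5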

This is all well and good. Macros allow you to do most things you would want to do, really. But they are quite rough to work with in their raw form. There is also some plumbing that is a bit inconvenient. One common thing that you might want to do is to transform the argument messages without evaluating them, return those messages and have them be inserted in place of the current macro. You can do this directly, but as mentioned above it is a bit inconvenient. So I added DefaultSyntax. You define a DefaultSyntax with a message called “syntax”. The first time a syntax is activated, it will run, replace itself with its own result, and then execute that result. The next time that piece of code is reached, the syntax will not execute; instead, the result of the first invocation will already be there. This is the feature that lies behind for comprehensions.

To make this a bit more concrete, let’s create a very simplified version of it. This version is fixed to take three arguments: an argument name, an enumerable to iterate over, and an expression for how to map the output value. Basically, a different way of calling “map”. A case like this is good, because we have all the information necessary to transform it, instead of evaluating it directly.

An example use case could look like this:

myfor(x, 1..5, x*2) ; returns [2,4,6,8,10]

Here myfor will return the code to double the elements in the range, and then execute that code.

The syntax definition to make this possible looks like this:

myfor = syntax(
  "takes a name, an enumerable, and a transforming expression
and returns the result of transforming each entry in the
expression, with the current value of the enumerable
bound to the name given as the first argument",

  argName = call arguments[0]
  enumerable = call arguments[1]
  argCode = call arguments[2]
  ''(`enumerable map(`argName, `argCode))
)

As you can see, I’ve provided a documentation text. This is available at runtime.

Syntactic macros also have access to “call”, just like regular macros. Here we use it to assign three variables. These variables get the argument messages themselves, not the results of evaluating them. Finally, a metaquote is used. A metaquote takes its content and returns the message chain inside of it, except that wherever a ` is encountered, the message at that point will be evaluated and the result spliced into the message chain. The result is to transform “myfor(x, 1..5, x*2)” into “1..5 map(x, x*2)”.

As might be visible, the handling of arguments is kind of impractical here. There are two problems with it, really. First, it’s really verbose. Second, it doesn’t check for too many or too few arguments. Doing these things by hand would complicate the code at the expense of readability. And regular macros have exactly the same problem. That’s why I implemented the d-family of destructuring macros. The current versions of this are dmacro, dsyntax, dlecro and dlecrox. They all work the same, except that they generate macros, syntaxes, lecros or lecroxes, depending on which version is used.

Let’s take the previous example and show what it would look like with dsyntax:

myfor = dsyntax(
  "takes a name, an enumerable, and a transforming expression
and returns the result of transforming each entry in the
expression, with the current value of the enumerable
bound to the name given as the first argument",

  [argName, enumerable, argCode]

  ''(`enumerable map(`argName, `argCode))
)

The only difference here is that we use dsyntax instead of syntax. The usage of “call arguments[n]” is gone, replaced with a list of names. Under the covers, dsyntax will make sure the right number of arguments is sent, and provide a good error message otherwise. After it has ensured the right number of arguments, it will also assign the names in the list to their corresponding arguments. This process is highly flexible: you can choose to evaluate some messages and not others, and you can also collect messages into a list of messages.

But the really nice thing about dsyntax is that it allows several alternative argument lists. Say we wanted to provide the option of giving either 3 or 4 arguments, where the expansion looks the same for 3 arguments, but if 4 arguments are provided, the third one will be interpreted as a condition. In other words, to be able to do this:

myfor(x, 1..5, x*2) ; returns [2,4,6,8,10]
myfor(x, 1..5, x<4, x*2) ; returns [2,4,6]

Here a condition is used in the comprehension to filter out some elements. Just as with the original, this code transforms into an obvious application of “filter” followed by “map”. The updated version of the syntax looks like this:

myfor = dsyntax(
  "takes a name, an enumerable, and a transforming expression
and returns the result of transforming each entry in the
expression, with the current value of the enumerable
bound to the name given as the first argument",

  [argName, enumerable, argCode]

  ''(`enumerable map(`argName, `argCode)),

  [argName, enumerable, condition, argCode]

  ''(`enumerable filter(`argName, `condition) map(`argName, `argCode))
)

The only thing added is a new destructuring pattern that matches the new case and in that situation returns code that includes a call to filter.

The destructuring macros have more features than these, but this is the skinny on why they are useful. In fact, I’ve used a combination of syntax and dmacro to remove a lot of repetition from the Enumerable core methods, for example. Things like this make it possible to provide abstractions where you only need to specify what’s necessary, and nothing more.

And remember, the destructuring I’ve shown with dsyntax works exactly the same for macros and lecros. Regular methods don’t need it as much, since the rules for DefaultMethod arguments are so flexible anyway. But for macros this has really made a large difference.



Operators in Ioke


When I first published the guide for Ioke, one of the reactions was that that was one heck of a lot of operators. The reason is that I listed all the available Ioke operators in the guide, and there are quite a lot of them. Implicit in that reaction is the assumption that these operators all have defined meanings and are used in Ioke. That’s not true at all. So, I thought I’d describe a little bit more what operators are, how they work, and why they are necessary in a language like Ioke.

First of all, all available operators can be found in the Ioke grammar file. They are available as the tokens ComparisonOperator, RegularBinaryOperator and IncDec. Together, they are at the moment 10 + 77 + 2 = 89 operators. Together with assignment, that is 90 available operators. As mentioned above, most of these aren’t actually used anywhere. Instead they are available for use by any client program. They are there specifically for creating nice DSLs and readable APIs.

Operators only exist in the parsing stage of Ioke. After that everything is a message, and a message based on an operator is no different from one based on a regular message send. So most of the operator handling happens in the operator shuffling stage. Since Ioke uses whitespace for regular message application, the implementation could treat EVERYTHING the same. That is also the base case. If you turn off operator shuffling, you could write =(foo, 1 +(2 *(20 **(42)))) to assign an expression to the name foo. This way of writing isn’t necessarily convenient, though, which is why Ioke adopts an operator shuffling scheme similar to Io’s.

So, what is an operator? It is really just a method with some funky characters in the name. All operators in Ioke are binary or trinary. What looks like a unary operator is simply a message send to an implicit receiver. So the - in 10 - 5 is the same operator as the - in -5. It’s just that the second version will call the method - on DefaultBehavior, giving it 5 as the argument. The result will be the negation of the argument. Binary operators aren’t anything strange, but when I say trinary operators, people will probably think about the ?: combination available in some languages. That isn’t exactly what I mean. There is another distinction between operators that is useful in Ioke: that between assigning operators and regular ones. The assigning operators are =, things like += and /=, and the increment and decrement operators ++ and --. All assigning operators are trinary except for the increment and decrement operators.

So what does this mean? (And I realize this is becoming a bit rambling at this point…) OK, the rule is this. All assigning operators take a place to assign to. That is the left hand side. All of them except increment and decrement also take a value to send to the assignment operator. But that leaves the actual receiver of the assignment message. Since assignment is just a message like everything else, there must be a receiver. So, in Ioke, if I write foo += 1+2, that will be translated (I will explain this later) into +=(foo, 1 +(2)). At this stage it looks like the += message is sent without a receiver, but everything in Ioke has a default receiver, called the ground. In another situation, suppose we have a class-like object called Foo. Then the expression Foo bar = 42 will be translated into Foo =(bar, 42). Here it is more apparent that the receiver of the = message is actually Foo, and that the direct left hand side and right hand side of the = sign are both arguments. This means that there are three operands for these assignment operators, and that is why they are called trinary.

Back to operator shuffling. In the examples I’ve shown above, the operator shuffling step is code that will basically take something that looks like regular arithmetic or assignment and rearrange it into the real message form. So x+1 will be translated into x +(1). Something like 2+2*3 will be translated into 2 +(2 *(3)). All operators translated this way have an associativity to make sure they follow expectations. You tune this using the Dict inside Message OperatorTable operators. This can be used to create DSLs with new or different operators.
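
To collect the translations in one place:

x+1          ; shuffled into x +(1)
2+2*3        ; shuffled into 2 +(2 *(3))
-5           ; shuffled into -(5), sent to the ground per the unary rule above
foo += 1+2   ; shuffled into +=(foo, 1 +(2)), also sent to the ground
Foo bar = 42 ; shuffled into Foo =(bar, 42), sent to Foo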

One thing that might surprise some people is that regular alphabetical names can be used as operators too. That is how something like “and” can be used in infix position.

I know that the operator rules are a bit complicated, and the grammar isn’t fantastic either. But when working with it, it feels really natural for me – and that is the goal. Operators and syntax are important. They make a large difference to the feel of the language. So I decided to make it as obvious as possible, without sacrificing the internal consistency of the language. And ultimately the control is in the hands of the Ioke programmer.

So, to recap, any of the available operators can be defined in the regular way. Just define it like you would anything else. And you can assign macros or syntax to operators too. They are just handled as regular names.



An Ioke spelling corrector


A while back, Peter Norvig (of AI, Lisp and Google fame) published a small entry on how spelling correctors work. He included some code in Python to illustrate the concept, and this code has ended up being a very commonly used showcase for programming languages.

It would be ridiculous to suggest that I generally like to follow tradition, but in this case I think I will. This post will take a look at the version implemented in Ioke, and use that to highlight some of the interesting aspects of Ioke. The code itself is quite simple, and doesn’t use any of the object oriented features of Ioke. Neither does it use macros, although some of the features used are based on macros, so I will get to explain a bit of that.

For those that haven’t seen my Ioke posts before, you can find out more at http://ioke.org. The features used in this example are not yet released, so to follow along you’ll have to download the source and build it yourself. Ioke S should be out in about 1-2 weeks though, and at that point this blog post will describe released software.

First, for reference, here is Norvig’s original corrector. You can also find his corpora there: http://norvig.com/spell-correct.html. Go read it! It’s a very good article.

This code is available in the Ioke repository in examples/spelling/spelling.ik. I’m adding a few line breaks here to make it fit the blog width; except for that, everything is the same.

Let’s begin with the method “words”:

words = method(text,
  #/[a-z]+/ allMatches(text lower))

This method takes a text argument and calls the method “lower” on it, which returns a new text that is the original converted to lower case. A regular expression that matches one or more alphabetical characters is then used, and the method allMatches is called on it with the lower-cased text. This returns a list of texts for all the places the expression matches in the text.
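
For example (the input text here is made up):

words("The Quick, brown fox!") ; => ["the", "quick", "brown", "fox"]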

The next method is called “train”:

train = method(features,
  features fold({} withDefault(1), model, f,
    model[f] ++
    model))

The argument “features” should be a list of texts. We then call fold on this list (you might know fold as reduce or inject; those names would have been fine too). The first argument to fold is the start value. This should be a dict, with a default value of 1. The second argument is the name that will be used to refer to the accumulated value, and the third argument is the name to use for the current feature. Finally, the last argument is the actual code to execute. This code just takes the feature (which is a text), indexes into the dict with it and increments the number there. It then returns the dict, since that will be the model argument in the next iteration.
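
To make that concrete with a tiny, made up feature list:

model = train(["the", "cat", "the"])
model["the"] ; => 3, the default of 1 incremented once per occurrence
model["cat"] ; => 2
model["dog"] ; => 1, the default for a word never seen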

The next piece of code uses the former methods:

NWORDS = train(words(FileSystem readFully("small.txt")))

alphabet = "abcdefghijklmnopqrstuvwxyz" chars

As you can see, we define a variable called NWORDS that contains the result of first reading a text file, then extracting all the words from that text, and finally using that to train on. The next assignment gets a list of all the characters (as texts) by calling “chars” on a text. I could have just written [“a”, “b”, “c”, …] etc, but I’m a bit lazy.

OK, now we come to the meat of the code. For an explanation of why this code does what it does, refer to Norvig’s article:

edits1 = method(word,
  s = for(i <- 0..(word length + 1),
    [word[0...i], word[i..-1]])

  set(
    *for(ab <- s,
      ab[0] + ab[1][1..-1]), ;deletes
    *for(ab <- s[0..-2],
      ab[0] + ab[1][1..1] + ab[1][0..0] + ab[1][2..-1]), ;transposes
    *for(ab <- s, c <- alphabet,
      ab[0] + c + ab[1][1..-1]), ;replaces
    *for(ab <- s, c <- alphabet,
      ab[0] + c + ab[1]))) ;inserts

The mechanics of it are as follows. We create a method assigned to the name edits1. This method takes one argument called “word”. We then create a local variable called “s”. This contains the result of executing a for comprehension. There are several things going on here. The first part of the comprehension gives a generator (that’s the part with the <-). The thing on the right is what to iterate over, and the thing on the left is the name to give each element on each iteration. Basically, this comprehension goes from 0 to the length of the word plus 1. (The two dots denote an inclusive Range.) The second argument to “for” is what to actually return. In this case we create a new list with two elements. The three dots create an exclusive Range, and ending a Range in -1 means that it will extract the text all the way to the end.

The rest of the code in this method is four different comprehensions. The results of these comprehensions are splatted, or spread out, as arguments to the “set” method. The * is the symbol used to splat things. Basically, it means that instead of four lists, set will get all the elements of all the lists as separate arguments. Finally, set will create a set from these arguments and return that.
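
A tiny illustration of the splat on its own, outside the comprehensions:

set(*[1, 2], *[2, 3]) ; same as set(1, 2, 2, 3): a set containing 1, 2 and 3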

Whew. That was a mouthful. The next method is easy in comparison. More of the same, really:

knownEdits2 = method(word,
  for:set(e1 <- edits1(word),
    e2 <- edits1(e1),
    NWORDS key?(e2),
    e2))

Here we use another variation of a comprehension, namely a set comprehension. A regular comprehension returns a list. A set comprehension returns a set instead. This comprehension will only return words that are available as keys in NWORDS.

known = method(words,
  for:set(w <- words,
    NWORDS key?(w), w))

This method uses a set comprehension to find all words in “words” that are keys in NWORDS. At this point you might wonder what a comprehension actually is. And it’s quite easy. Basically, a comprehension is a piece of nice syntax around a combination of calls to “filter”, “map” and “flatMap”. In the case of a set comprehension, the calls go to “filter”, “map:set” and “flatMap:set” instead. The whole implementation of comprehensions is available in Ioke, in the file called src/builtin/F10_comprehensions.ik. Beware though, it uses some fairly advanced macro magic.
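
As a rough sketch of what that expansion amounts to for the method above (not the literal code the comprehension macro generates):

for:set(w <- words, NWORDS key?(w), w)
; corresponds roughly to:
words filter(w, NWORDS key?(w)) map:set(w, w)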

OK, back to spelling. Let’s look at the last method:

correct = method(word,
  candidates = known([word]) ifEmpty(
    known(edits1(word)) ifEmpty(
      knownEdits2(word) ifEmpty(
        [word])))
  candidates max(x, NWORDS[x]))

The correct method takes a word to correct and returns the best possible match. It first tries to see if the word is already known. If the result of that is empty, it tries to see if any edits of the word are known, and if that doesn’t work, if any edits of edits are known. Finally it just falls back to the original word. If more than one candidate spelling is known, the max method is used to determine which one appeared the most in the corpus.
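
Using it looks something like this (the actual corrections depend entirely on the training text, so treat the outputs as illustrative):

correct("speling") ; => "spelling", assuming the corpus knows that word
correct("zxqwv")   ; => "zxqwv", when no known word is within two edits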

The ifEmpty macro is a bit interesting. What it does is pretty simple, and you could have written it yourself. But it just happens to be part of the core library. The implementation looks like this:

List ifEmpty = dmacro(
  "if this list is empty, returns the result of evaluating the argument, otherwise returns the list",

  [then]
  if(empty?,
    call argAt(0),
    self))

This is a dmacro, which basically just means that the handling of arguments is taken care of. The argument list for a destructuring macro can be seen in the square brackets. Using a dmacro instead of a raw macro means that we will get a good error message if the wrong number of arguments is provided. The implementation checks if its list is empty. If it is, it returns the value of the first argument; otherwise it doesn’t evaluate anything and returns itself.
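
Using it is straightforward:

[] ifEmpty("nothing here")     ; => "nothing here"
[1, 2] ifEmpty("nothing here") ; => [1, 2], and the argument is never evaluated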

So, you have now seen a 16 line spelling corrector in Ioke (it’s 16 without the extra line breaks I added for the blog).



An Ioke update


I haven’t written here in a while; the reason is that I’ve been seriously heads down with Ioke, having a blast implementing new things and planning the future. I’m happy to have received several contributions from other people. GitHub makes this so easy it’s silly.

Since December 23rd – when Ioke 0 was released – I have made quite a lot of changes to Ioke. The highlights are these:

  • fixed all outstanding bugs reported
  • several examples, contributed by Carlos Villela
  • Range#each
  • ensure (like Ruby ensure, Java finally)
  • lexical macros
  • become!
  • full implementation of Regexp and Regexp Match
  • freeze!, thaw!
  • case expression
  • list comprehensions, set comprehensions and dict comprehensions
  • support for TextMate
  • alternative syntax for text and regexps
  • interpolation inside of regexps
  • support for syntactic macros
  • support for quoting and metaquoting
  • cond expression
  • destructuring macros
  • added support for inverted ranges
  • added methods to remove cells
  • added methods to find the owner of cells

So as you can see, I’ve been kinda busy. Ioke has garnered some real attention too, which is great fun. The GitHub repository has over 100 watchers. The guide has been viewed over 3500 times. The distribution packages have been downloaded about 400 times.

But right now I’m looking to the future. There are still many, many small holes in the core libraries, but with regards to the big stuff there are basically three pieces missing: Java integration, aspects and concurrency. I haven’t started on any of these because I haven’t decided exactly what they should look like. The concurrency issue in particular is definitely problematic. So I’m punting on it right now. But I promise to have an answer to these three major issues within a few weeks.

So what’s the plan? Ioke S will be released within 2 weeks. The guide needs to be updated quite substantially, since all the features I listed above need to be described, and some of them are really complicated. They are very nice, of course. I will also create at least a few more examples to show some actual code. I might write about those here to show off some features.