Ioke syntax


Or: How using white space for application changes the syntax of a language.

I have spent most of the weekend working with different syntax elements of Ioke. Several of them are actually based on one simple decision I made quite early, and I thought it would be interesting to take a look at some of the syntax elements I’ve worked on, from the angle of how they are based on that one decision.

What is this decision then? In the manner of Smalltalk, Self and Io, I decided that periods are not the way to apply methods. Instead, space makes sense for this. So if in Java you would write “foo().bar(1).quux(2,3)” this would be written as “foo bar(1) quux(2, 3)” in Ioke. Everything is an expression, and sending a message to something is done by putting the message adjacent to the thing receiving the message, separated by whitespace. This turns out to have some consequences I really didn’t expect, and several parts of the syntax have actually changed a lot because of this decision. I’ll take a look at the things that changed most recently because of it.

Terminators

Most languages without explicit expression nesting (the kind Lisp has) need some way to decide when a chain of message passing should stop. Most scripting languages today try to use newlines, and then use semicolons when newlines don’t quite work. That’s what I started out doing with Ioke too (since Io does it). But once I started thinking about it, I realized that Smalltalk got this thing right too. Since I don’t use dots for message application, I’m free to use them for termination. You still don’t need to terminate things that are obviously terminated with newlines, but when you need a terminator, the dot reads very well. I’ve always disliked the intrusiveness of semicolons – they seem to take too much visual space for me. Dots feel like the right size, and there is also a more pleasing symmetry with commas.

Comments

Once you don’t use semicolons for termination, you can use them for other things. I am quite fond of the Lisp tradition of using semicolons for comments, so I decided to not use hashes for that anymore. One of the ways Lisp systems use semicolons for comments is that they use different numbers of them to mark different kinds of documentation. The Common Lisp convention is to use four semicolons for headlines, three semicolons for left justified comments, two semicolons for a new line of comment that should be indented with the program text, and one semicolon for comments on the same line as program text. These things work because semicolons don’t take up so much visual space when stacked. A hash would never work for it.

The obvious question from any person with Unix experience will be how I handle shebangs if a hash isn’t a comment anymore. The short answer is that I will provide general reader macro syntax based on hash. Since the shebang always starts with “#!” that would be a perfect application for a reader macro. That also opens up the possibility for other interesting reader macros, but I’ll take that question later.

Operator precedence

This one was totally unexpected. I had planned to add regular operator precedence style and it ended up being quite painful. I should probably have guessed the problem, but I didn’t – two grammar files later and I’m now hopefully a bit wiser. The problem ended up being whitespace. Since I use whitespace to separate application, but whitespace is also interesting to judge operator precedence, what happened was that the parsers I got working actually had an exponential amount of backtracking. Two lines of regular code without operators still backtracked enough to take a minute or two to parse. Ouch. So what’s the solution? Two passes of parsing. Or not exactly, but almost. I’m currently implementing something like Io’s operator shuffling, which is a general solution to rearrange operators into a canonical form based on precedence rules. What’s fun with it is that the rules can be dynamically changed. If you want Smalltalk style left to right precedence, that should be possible by just setting the precedence to 1 for all operators. You can also turn off operator shuffling completely, which means you can’t use infix operators at all.
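To make the shuffling idea a bit more concrete, here is a minimal sketch in Ruby. It is not Io's or Ioke's actual implementation. The names shuffle, climb and DEFAULT_PRECEDENCE are hypothetical, and the sketch assumes that higher numbers bind tighter and that all operators are left associative. It takes a flat list of operands and operators, as a first parsing pass might produce, and rearranges it into a nested prefix form based on a precedence table that can be swapped out at run time.

```ruby
# Hypothetical precedence table: higher numbers bind tighter in this sketch.
DEFAULT_PRECEDENCE = { :+ => 1, :- => 1, :* => 2, :/ => 2 }

# Rearranges a flat list like [1, :+, 2, :*, 3] into a nested prefix form
# such as [:+, 1, [:*, 2, 3]], using precedence climbing.
def shuffle(tokens, table = DEFAULT_PRECEDENCE)
  climb(tokens.dup, table, 0)
end

def climb(rest, table, min_prec)
  left = rest.shift                      # an operand
  while (op = rest.first) && table.fetch(op) >= min_prec
    rest.shift                           # consume the operator
    # Operators are treated as left associative, so the right-hand side
    # may only contain operators that bind strictly tighter.
    right = climb(rest, table, table.fetch(op) + 1)
    left = [op, left, right]
  end
  left
end

shuffle([1, :+, 2, :*, 3])               # => [:+, 1, [:*, 2, 3]]

# Giving every operator the same precedence yields Smalltalk style
# left to right evaluation.
flat = { :+ => 1, :- => 1, :* => 1, :/ => 1 }
shuffle([1, :+, 2, :*, 3], flat)         # => [:*, [:+, 1, 2], 3]
```

Because the table is just data passed to the shuffler, changing the precedence rules does not require touching the grammar, which is the property the paragraph above is after.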

I’m also planning a way to scope these things, so you can actually change quite a lot of the syntax without switching the parser.

At some point I’m planning to explore how it would work to use an Antlr tree parser to do the shuffling. My intuition is that it would work well, but I’ll have to find the time to do it.

Syntactic flexibility

All is not perfect, but the current scheme seems to work well. I’ve been able to get a real amount of flexibility into the syntax, with loads of operators free for anyone to use and subclass. The result will be the possibility to create internal DSLs that Ruby could only dream of. Some things get harder too, though. Regular expression syntax for example. If you can create a statement like this: “[10,12,14] map(/2 * 2/a)”, it’s kinda obvious that there is no easy way to know whether the statement inside the mapping call is a regular expression or an expression fragment. In Ioke the decision is simple, the above is an expression fragment. I’ve decided to make it really easy to work with regular expression syntax. Interestingly, it was one of the reasons I wanted reader macros, and it turns out that using #/ will work well. So a regular expression looks just like in a Perl-like language, except that you add a hash before the first slash: #/foo/ =~ “str”. It seems that hash will end up being my syntax sin bin for those cases where I want syntax without touching the parser too much.
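As an illustration of how hash-based reader macros could dispatch, here is a small Ruby sketch. It is an assumption about the general shape and not Ioke's reader: READER_MACROS and read_hash_form are hypothetical names, the shebang path in the usage is made up, and the regular expression handler does not deal with escaped slashes. The character following the hash selects a handler, and each handler consumes what it needs and returns a token plus the remaining source.

```ruby
# Hypothetical reader macro table: the character after '#' picks the handler.
# Each handler receives the source after "#x" and returns [token, rest].
READER_MACROS = {
  # "#!" swallows the rest of the line, so shebangs are ignored by the parser.
  "!" => lambda do |src|
    line, rest = src.split("\n", 2)
    [[:shebang, line], rest.to_s]
  end,
  # "#/" reads up to the next slash as a regular expression literal.
  "/" => lambda do |src|
    body, rest = src.split("/", 2)
    [[:regexp, Regexp.new(body)], rest.to_s]
  end
}

def read_hash_form(source)
  raise ArgumentError, "not a reader macro" unless source.start_with?("#")
  READER_MACROS.fetch(source[1]).call(source[2..-1])
end

read_hash_form("#/foo/ =~ str")
# => [[:regexp, /foo/], " =~ str"]
read_hash_form("#!/usr/bin/env ioke\nfoo bar")
# => [[:shebang, "/usr/bin/env ioke"], "foo bar"]
```

New syntax then becomes a matter of adding an entry to the table, with the main grammar left alone.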

It’s funny to see how many things in classic syntax change if you change how message passing works. I like Ioke more and more for each of these things I find, and it currently looks very pleasant to work with. Dots are such an improvement for one-liners.



Hacking trampolining CPS


I spent some quality time today trying to hack together a continuation passing style system in Ruby, to clarify some of my thinking. I ended up with something that is more or less a very small interpreter for S expressions, that uses a trampolining CPS interpreter. The language is not in any way complete: such things as assignment aren’t there, there is only one global scope and so on, so the continuations in this system are really not useful for anything except for hacking with it to gain understanding.

As such, I thought people might find it a bit interesting. I wish I’d seen something like this 5 or 10 years ago… Note that this code is extremely hacky and incomplete and bad and whatnot. Be warned. =)

OK, first you need to “gem install sexp”. This provides dead easy parsing of S expressions. Since that wasn’t the main purpose of this code, doing it with a Gem was easier.

The first part of the code we need is the requires, and structures to represent continuations:

require 'rubygems'
require 'sexpressions'

class Cont
  def initialize(k)
    @k = k
  end
end

class BottomCont < Cont
  def initialize(k, &block)
    super(k)
    @f = block
  end

  def resume(v)
    @f.call(v)
  end
end

class IfCont < Cont
  def initialize(k, et, ef, r)
    super(k)
    @et, @ef, @r = et, ef, r
  end

  def resume(v)
    evaluate((v ? @et : @ef), @r, @k)
  end
end

class CallCont < Cont
  def initialize(k, r)
    super(k)
    @r = r
  end

  def resume(v)
    evaluate(v, @r, @k)
  end
end

class ContCont < Cont
  def initialize(k, v, r)
    super(k)
    @r, @v = r, v
  end

  def resume(v)
    evaluate(@v, @r, v)
  end
end

class NextCont < Cont
  def initialize(k, ne, r)
    super(k)
    @ne, @r = ne, r
  end

  def resume(v)
    evaluate(@ne, @r, @k)
  end
end

BottomCont is what we use to do something at the end of the program. We could print something, or anything else. IfCont is used to implement a conditional. It’s quite easy – once we resume we check the truth value and evaluate the next part based on the result. CallCont will invoke some existing S expressions in a variable. It just takes the value and evaluates that. ContCont is a bit trickier. It will take a value, and then when asked to resume will assume that the parameter to resume is a continuation and invoke that continuation with the value it got earlier. Finally, NextCont is used to implement basic sequencing. It basically just throws away the earlier value and uses the next instead.

The actual code for evaluate and a helper function looks like this:

def evaluate_sexp(sexp)
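  # The `return` inside this block returns from evaluate_sexp itself,
  # which is what eventually breaks out of the trampoline loop below.
  cont = BottomCont.new(nil) do |val|
    return val
  end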

  env = {
    :haha => proc{|x| puts "calling proc"; 43 },
    :print => proc{|x| puts "printing" },
    :save_cont => proc{|x| puts "saving cont"; env[:saved] = x; true },
    :foo => 42,
    :bar => 33,
    :flux => "(call flux)".parse_sexp.first
  }

  c = evaluate(sexp, env, cont)

  # The trampoline: keep calling the thunk returned by the previous step.
  while true
    c = c.call
  end
end

def evaluate(e, r, k)
  if e.is_a?(Array)
    case e.first
    when :if
      evaluate(e[1], r, IfCont.new(k,e[2],e[3],r))
    when :call
      evaluate(e[1], r, CallCont.new(k, r))
    when :continue
      p [:calling, :continue, e[1]]
      evaluate(e[1], r, ContCont.new(k, e[2], r))
    when :prog2
      evaluate(e[1], r, NextCont.new(k, e[2], r))
    end
  else
    case e
    when :true
      proc { k.resume(true) }
    when :nil
      proc { k.resume(nil) }
    when Symbol
      proc {
        if r[e].is_a?(Proc)
          k.resume(r[e].call(k))
        else
          k.resume(r[e])
        end
      }
    else
      proc { k.resume(e) }
    end
  end
end

Here evaluate_sexp is the entry point to the code. We first create a BottomCont that will just return the value. We then create an environment that includes simple values, a function (flux) that calls itself, and some procs that do different things. Finally evaluate is called, and then we repeatedly evaluate the thunk it returns. Since we know that the bottom continuation will return, we can actually invoke this part indefinitely. That is the actual trampolining part, right there.
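The trampoline can also be looked at in isolation. The following is a minimal sketch, separate from the interpreter above. The names trampoline and count_down are hypothetical, and unlike evaluate_sexp this version stops when a step returns something that is not a Proc instead of relying on the bottom continuation to return.

```ruby
# Runs thunks until one of them returns something that is not a Proc.
def trampoline(thunk)
  thunk = thunk.call while thunk.is_a?(Proc)
  thunk
end

# CPS-style countdown: each step returns a thunk for the next step instead
# of calling it directly, so the Ruby stack never grows.
def count_down(n, k)
  if n.zero?
    proc { k.call(:done) }
  else
    proc { count_down(n - 1, k) }
  end
end

# 100,000 "recursive" steps run at constant stack depth.
trampoline(count_down(100_000, proc { |v| v }))   # => :done
```

The important detail is that count_down never calls itself directly. It only describes the next step, and the loop in trampoline is the only place where anything is actually invoked.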

The evaluate function will check if it’s an array we got, and in that case it will check the first entry and switch based on that, creating IfCont, CallCont, ContCont or NextCont based on the entry. If it’s a primitive value we do something different. As you can see we first check if the value is one of a few special ones, and then if it’s a symbol we look it up in the environment. If the value from the environment is a proc we invoke it with the current continuation, which means the proc can do funky stuff with it. The common thing for all the branches is that they wrap everything they do in a thunk, and inside that thunk call resume on the continuation with the value provided.

Finally we can try it out a bit:

p evaluate_sexp("123".parse_sexp.first) # 123
p evaluate_sexp("bar".parse_sexp.first) # 33
p evaluate_sexp("nil".parse_sexp.first) # nil

p evaluate_sexp("(if quux 13 (if true (if nil 444 555)))".parse_sexp.first) # 555
p evaluate_sexp("(if quux 13 (if true (if nil 444 haha)))".parse_sexp.first)

Here you can see that simple things work as expected.

What about calling the flux function, that will invoke itself?

p evaluate_sexp("(call flux)".parse_sexp.first)

This will actually loop endlessly. When we add trampolining to a CPS interpreter, we in effect get a stackless interpreter, in such a way that we get proper tail calls for free.

Finally, what about the actual continuation stuff? Another way of creating an eternal loop is to do something like this:

p evaluate_sexp("(prog2 save_cont (prog2 print (continue saved 33333)))".parse_sexp.first)

This piece of interesting code will actually loop forever. How? Well, first the prog2 will run the proc in save_cont. This will save the current continuation, and then return true from the proc. Then the next prog2 will be entered, running the print proc. Finally, the last part will evaluate the continue form, which will take the continuation in saved and invoke it with the value 33333. This will in effect jump back to the first prog2, return 33333 from the call to save_cont and go into the next prog2 again. Looping…

If you use an if statement instead, and return nil from the inner call to the continuation, and add some printing to the IfCont#resume, you can see that that point will only be invoked twice:

p evaluate_sexp("(if save_cont (prog2 print (continue saved nil)) 321)".parse_sexp.first)

This will generate:

[:running, :if, :statement]
printing
[:calling, :continue, :saved]
[:running, :if, :statement]
321

Here it’s obvious that the if statement runs twice, and that the second time the test evaluates to false, which makes the final continuation return 321.
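The same behaviour can be reproduced without the interpreter, which may make it easier to see that a continuation is just a value that can be resumed more than once. This is a sketch with hypothetical names (run_if_twice, if_cont), using lambdas in place of the Cont classes above.

```ruby
# Explicit-continuation version of the `if` example. Lambdas stand in for
# the Cont objects used by the interpreter.
def run_if_twice
  log = []
  saved = nil

  # Continuation that receives the value of the `if` test expression.
  if_cont = lambda do |v|
    log << [:running, :if, v]
    if v
      # Then-branch: (continue saved nil), returned as a thunk.
      -> { saved.call(nil) }
    else
      321                                # else-branch
    end
  end

  saved = if_cont                        # what save_cont does with its continuation
  step = -> { if_cont.call(true) }       # save_cont then returns true

  # Trampoline: keep bouncing until we get a non-thunk value.
  step = step.call while step.is_a?(Proc)
  [step, log]
end

run_if_twice
# => [321, [[:running, :if, true], [:running, :if, nil]]]
```

The log shows the test point being entered twice, first with true from save_cont and then with nil from re-entering the saved continuation.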

I hope this little excursion into CPS land was interesting for someone. It’s a quite useful technique to know about, once you wrap your head around it.



Ioke 0 roadmap


The first release of Ioke will be called Ioke 0, and I aim to have it more or less finished in a month or so. At the longest, it might take until Christmas. So, since it’s coming soon, I thought I would just put in a list of the kind of things I’m aiming to have in it at that release. I’ll also quickly discuss some features I will have in the language but that are going to be in Ioke I or Ioke II.

First, the first release of the language means that the basic core is there. The message passing works and you can create new things, methods and blocks. Numbers are in, but nothing with decimal points so far. If I need decimals for some of the other stuff I’m implementing, I’ll add them, otherwise integers might be the only numbers in Ioke 0. I’m OK with that. The core library will be quite small at this point too. Ioke 0 will be a usable language, but it’s definitely not batteries included in any way.

These are some specific things I want to implement before releasing it:

  • List and Dict should be in, including literal syntax for creation, aref-fing and aset-ting. Having syntax for aset means that I will have in place a simple version of setting of places, instead of just names.
  • Enumerable-like implementation for List and Dict.
  • DefaultMethod and LexicalBlock should support regular, optional, keyword and rest arguments. Currently only the rest arguments are missing, and this is mostly because I don’t have Lists yet.
  • Basic support for working with message instances, to provide crude metaprogramming.
  • The full condition system. That includes modifying the implementation to provide good restarts in the core. It also might include a crude debugger. Restarts are implemented, but the conditions will take some time.
  • cellMissing should be there. Contexts should be implemented in terms of it.
  • Basic IO functionality.
  • A reader (that reads Ioke syntax and returns the generated Message tree).
  • Access to program arguments.
  • IIk – Interactive Ioke. The REPL should definitely be in, and be tightly integrated with the main-program. I’m taking the Lisp route here, not the Ruby one. IIk will be implemented in Ioke, and should drive the evolution of several of the above features.
  • Dokgen – A tool to generate documentation about existing cells in the system. Since this information is available at run time it should be exceedingly easy to create this tool. Having it will drive features too.
  • Affirm – A testing framework written in Ioke. The goal will be to rewrite the full test suite of Ioke (which is currently using JtestR) into using Affirm instead. That’s going to happen between Ioke 0 and Ioke I.
  • Documentation that covers the full language, and some usage pointers.

There are some features I’m not sure about yet. They are larger and might prove to be too large to rush out. The main one of these is the Java integration features. Right now I’m thinking about waiting with that support.

I have loads of features planned for the future. These are the ones that I’m most interested in getting in there quite soon, which means they’ll be in either I or II.

  • Java Integration
  • Full ‘become’, with the twist that become will actually not change the class of an instance, but instead change an instance into the other instance. This is something I’ve always wanted in Ruby, and ‘become’ seems to be a fitting way to do it. This will make transparent futures and things like that quite easy to implement.
  • Common Lisp like format, that can handle formatting of elements in a List in the formatting language. Not sure I’m going to use the same syntax as Common Lisp, though. Maybe I’ll just make it into an extension of the printf support?
  • Simple aspects. Namely, it should be possible to add before, after and around advice to any cell in the system. I haven’t decided if I should restrict this to only activatable cells or any cell at all.
  • Ranges.
  • Macros. I’m not sure which version I’ll end up with yet. I have two ideas that might be more or less the same, but both of them are really, really powerful.
  • Simple methods. In Ioke, a method is something that follows a very simple interface. It’s extremely easy to create something that acts like a method in some cases but does something different. Simple methods are restricted in the kind of meta programming they can do, which means they can be compiled down to quite efficient code. This is a bit further away, maybe III or IV.
  • Continuations. I would like to have them. I think I can do it without changing too much of the structure. This is not at all a certainty at the moment, but it might happen.

That’s about it for now. Once I have the core language in place I want to start working on useful libraries around it. Once 0 is out, I’m planning to start using Ioke as my main scripting language, and have that drive what libraries I need to create and so on.

Around II or III, I think it’s time to go metacircular. Not necessarily for the implementation, but to describe the semantics in it. Might be possible to do something like SLang too, and compile Ioke to Java for the needed core.

If you are interested in following the development, you can check it out at my git repository at http://github.com/olabini/ioke, or at the project pages at http://ioke.kenai.com. The Git repository is the canonical one right now, and the Kenai HG one is a clone of that. If you’re interested in discussing Ioke, there are mailing lists at the project pages. I also will have a real page for the project ready for the first release. But I promise you will notice when that release happens.



Condition system in Ioke


Continuing in the series of “standing on the shoulders of giants”, I will in this post talk a little bit about one feature I always miss in “modern” languages. A condition system. According to Wikipedia, Smalltalk also had one, but I’m only familiar with the Common Lisp and Dylan versions. And that’s where my inspiration is coming from.

So what is a condition system, you might ask? Isn’t that just conditionals? Nope, not really. It’s actually totally different. Conditions are a generalization of errors. This means that errors are a kind of condition, but you can also do other things with conditions. The main difference is that signaling a condition doesn’t necessarily mean that the stack will be unwound. Instead, the code handling the condition can choose to do different things depending on the condition. Conditions can have different severity and different default actions. So most real errors would probably end up unwinding the stack if no handler was defined for them. But warnings can also be conditions – and the default warning action might depend on a command line flag. Or maybe you want to have a handler for a specific warning that should be an error in your code. Etc. The possibilities are quite literally endless, and when you can define your own conditions, the power of this mechanism should be apparent. (And if you’re thinking continuations, that’s not exactly it, but almost.)

Common Lisp allows you to disregard some exceptions. You can also restart the code that threw an exception, if you think the problem is likely to be intermittent. And the handler for this doesn’t need to be lexically close to the place that caused the problem. Instead, the place that raised a condition will give you a couple of options on what to do, commonly called restarts. Restarts are lexical closures that can change the environment that caused the exception, which means it’s possible to fix things quite neatly. Most (all?) Common Lisp implementations have a default exception handler that drops you into the interactive debugger, which is invoked with the lexical closure of the point where the exception happened. You might want to read that last sentence a few times over if it didn’t make sense the first time.

You might ask yourself if this magic is some kind of Lisp trick? That’s a good question, since it seems to mostly be available in Lisp systems, such as Common Lisp and Dylan. (Smalltalk is obviously very Lispy too. =)

There is nothing technical standing in the way of other languages adopting a condition system. Ruby would do well with it for many things, although I don’t think it would be a great thing to graft it on at this point.

I’m pretty sure I can implement it easily in Ioke, on top of the JVM. The interesting point will be to see how I can make it interact with Java exceptions…

Since I haven’t implemented it yet, I don’t know the exact syntax, but I do have some ideas.

bindHandler(
  noSuchCell: block(c, c asText println),
  incorrectArity: block(c, c asText println),
  # code that might cause the above conditions
)

bindHandler(
  somethingHappened: block(c, invokeRestart(useNewValue, 24)),
  loop(
    value = 1
    bindRestart(
      useNewValue: block(val, value = val),
      quit: block(break),

      value println
      signal(somethingHappened))))

These examples show a small subset of what you will be able to do – define handlers for specific conditions, and in these handlers call restarts. I will probably provide some way to collect all handlers for a condition, so Common Lisp style interactive choices can be made. I will try to keep the system a bit easier than the Common Lisp version, but hopefully I’ll be able to retain its power.
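To show that nothing Lisp-specific is needed, here is a rough sketch of the handler and restart mechanism in Ruby, built on catch and throw. It is only an illustration under simplifying assumptions and not a design for Ioke: the Conditions module, its method names and the parse_all example are all hypothetical, conditions are plain symbols, and there is no default handler or debugger. The key property is that the handler runs while the signaling code is still on the stack, and only the chosen restart unwinds.

```ruby
module Conditions
  @handlers = []   # stack of { condition_name => handler }
  @restarts = []   # stack of [catch_tag, { restart_name => block }]

  class << self
    # Runs the block with the given condition handlers active.
    def bind_handler(handlers)
      @handlers.push(handlers)
      yield
    ensure
      @handlers.pop
    end

    # Runs the block with the given restarts available. If a handler invokes
    # one, the stack unwinds to here and the restart's result is returned.
    def bind_restart(restarts)
      tag = Object.new
      @restarts.push([tag, restarts])
      name, args = begin
        catch(tag) { return yield }      # normal completion: no restart used
      ensure
        @restarts.pop
      end
      restarts.fetch(name).call(*args)
    end

    # Calls the innermost matching handler WITHOUT unwinding the stack.
    def signal(condition, *details)
      frame = @handlers.reverse.find { |h| h.key?(condition) }
      frame[condition].call(*details) if frame
      nil
    end

    # Transfers control to the innermost restart with the given name.
    def invoke_restart(name, *args)
      tag, = @restarts.reverse.find { |_, rs| rs.key?(name) }
      raise ArgumentError, "no restart named #{name}" unless tag
      throw tag, [name, args]
    end
  end
end

# Low-level code offers restarts but does not decide which one to use.
def parse_all(inputs)
  inputs.map do |text|
    Conditions.bind_restart(use_value: ->(v) { v }, skip: -> { nil }) do
      if text.match?(/\A\d+\z/)
        text.to_i
      else
        Conditions.signal(:bad_number, text)
        raise ArgumentError, "unhandled bad number: #{text}"
      end
    end
  end
end

# High-level code picks the policy, far away from where the problem occurs.
handlers = { bad_number: ->(_text) { Conditions.invoke_restart(:use_value, 0) } }
Conditions.bind_handler(handlers) { parse_all(%w[1 two 3]) }   # => [1, 0, 3]
```

Swapping the handler for one that invokes the skip restart changes the result to [1, nil, 3] without touching parse_all, which is the separation between policy and mechanism that the condition system is meant to give.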



Why not Io?


I have been asked a few times in different circumstances why I feel the need to create my own language instead of just working with Io. That is a very valid question, so I’m going to try to answer it here.

First of all, I like Io a lot. Ioke is close enough to Io that it will be obvious who the parent is. In my mind at least, the differences are in many ways cosmetic and in those that are not it’s because I have some fairly specific things in mind.

So what are the main differences? Well, first of all it runs on the JVM. I want it that way because of all the obvious reasons. The Java platform is just such a good place to be. All the libraries are there, a good way of writing extensions in a language that is not C/C++, a substrate that gives me threads, GC and Unicode for free. So these reasons make a big difference both for people using Ioke, and for me. I want to be able to use Ioke to interact with other languages, polyglot programming and all. And since I expect Ioke to be much more expressive than most other languages, I think it will be a very good choice to put on top of a stable layer in the JVM. Being implemented in C makes these benefits go away.

Of course I could just have ported Io to the JVM and be done with it. That’s how it all started. But then I realized that if I decided to not do a straight port, I could change some things. You have seen some discussions about the decisions I’m revisiting here. The whole naming issue, handling of numbers, etc. Other things are more core. I want to allow as much syntactic flexibility as possible. I really can’t stand the three different assignment operators. I know they make the implementation easier to handle, but having one assignment operator with pluggable semantics gives a more expressive language.

Another thing I’m adding in is literal syntax for arrays and hashes, and literal syntax for referencing and setting elements in these. Literals make such a difference in a language and I can’t really handle being without them. These additions substantially complicate the language, but I think it’s worth it for the expressive power.

A large difference in Ioke will be the way AST modification will be handled. Io gives loads of power to the user with regard to this, but I think there is more that can be done. I’m adding macros to Ioke. These will be quite powerful. As an example, the DefaultMethod facility (that gives arguments, optional arguments, REAL keyword arguments and rest arguments) can actually be implemented in Ioke itself, using macros. At the moment this code is in Java, but that’s only because of the bootstrapping needed. The word macro might be a bad choice here though, since it executes at the same time as a method. The main difference is that a macro generally has call-by-name/call-by-need semantics, and that it will modify its current or surrounding AST nodes in some way. Yes, you heard me right, the macro facility will allow you to modify AST siblings. In fact, a macro could change your whole script from that point on… Of course Io can do this, with some trickery. But Ioke will have facilities to do it. Why? Doesn’t that sound dangerous… Yeah. It does, but on the other hand it will make it really easy to implement very flexible DSLs.

A final note – coming from Ruby I’ve always found Io’s libraries a bit hard to work with. Not sure why – it’s probably just personal taste, but the philosophy behind the Io libraries seems to not provide the things I like in a core library. So I will probably base Ioke’s core library more on Ruby than on Io.

There you have it. These are the main reasons I decided to not use Io. And once I started to diverge from Io, I decided to take a step back and start thinking through the language from scratch. Ioke will be the result, when it’s finished. (Haha. Finished. Like a language is ever finished… =)



Language revolution


JAOO was interesting this year. A collection of very diverse subjects, and many focusing on programming languages – we had presentations about functional programming, JavaScript, Fortress and JRuby. Guy Steele and Richard Gabriel did their 50 in 50 presentation, which was amazing. I’ve also managed to get quite a lot of work done on Ioke. The result of all this is that my head has been swimming with thoughts about programming languages. I’ve also had the good fortune of spending time talking about languages with such people as Bill Venners, Lars Bak, Neal Ford, Martin Fowler, Guy Steele, Richard Gabriel, Dave Thomas, Erik Meijer, Jim des Rivieres, Josh Holmes and many others.

It is obvious that we live in interesting times for programming languages. But are they interesting enough? What are the current trends in cutting edge programming languages? I can see these:

  • Better implementation techniques. V8 is an example of this, and so is Hotspot. V8 employs new techniques to drive innovation further, while Hotspot’s engineers continuously add both old and new techniques to their tool box.
  • DSLs. The focus by some people on domain specific languages seems to be part of the larger focus on languages as an important tool.
  • Functional semantics. Erik Meijer’s keynote was the largest push in this direction, although many languages keep adding features that make it easier to work in a functional style. Clojure is one of the new languages that come from this point, and so is Scala. The focus on concurrency generally leads people to the conclusion that a more functional style is necessary. From the concurrency aspect we get the recent focus on Erlang. Fortress also seems to be mostly in this category.
  • Static typing. Scala and Haskell are probably the most representative of this approach, in trying to stretch static typing as far as possible to improve both the programmer experience, semantics and performance.

Is this really it? You can quibble about the specific categories and where the borders are. I’m not entirely satisfied with where I put Fortress, for example, but all in all it feels like this is what’s going on.

Seeing 50 in 50 reminded me about how many languages we have seen, and how different these all are. It feels like most of the innovation happened in the past. So why is the current state of programming languages so poor? Is it because other things overshadow the language itself? I really don’t believe that. I think a good enough language would enable better tools, more productivity and more successful projects. So why isn’t it happening? We seem to be stuck in a rut. Anders Hejlsberg said in his opening keynote that the last 10-15 years have been an anomaly. I really do hope so.

What is apparent from the list compiled above is that everything that currently happens is very much evolutionary in approach. Innovation is happening, but it’s mostly small innovation.

We need a language revolution. We need totally new ways of looking at programming languages. We need new innovation, unfettered by the failures and successes of times past. We need more language implementors. We need more people thinking about these things.

I don’t know what the new approaches need to be, but the way I see it the last 10 years have been quite disappointing. If programming languages really are important tools, why haven’t we seen the same kind of innovation in that field as we have in IDEs and tools? Why haven’t we seen totally new ideas crop up? Is it because language development is always evolutionary? Does it have to be? Or is everyone interested in the field already convinced that we are at the peak right now? Or that Lisp or Smalltalk was the peak?

What needs to be rethought? I’ve read Jonathan Edwards recently, and he writes a lot about revisiting basic ideas and conclusions. I don’t agree with everything he says, but in this matter he’s totally right. We need to revisit all assumptions. We need to figure out better ways of doing things. Programming languages are just too important. We shouldn’t be satisfied with the current approaches just because we don’t know anything better.

We need a revolution.



Naming of core concepts


Another part of naming that I’ve spent some time thinking about is what I should call the core concepts in the language. A typical example of this is using the word Object for the base of the structure. I’ve never liked this and I get the feeling that many languages have chosen this just because other languages have it, instead of for any good reason.

In a prototype based language it definitely doesn’t feel good. My current favorite name for the hierarchy base is Origin. Since everything is actually copied from this object, that word seems right.

Another naming issue that turns up quite quickly is what the things you store in an object are called. In JavaScript they are called properties. In Io they are called slots. It’s hard to articulate why I don’t like these names. Maybe this is a personal preference, but at the moment I’m considering calling them ‘cells’ in Ioke.

What about String? That does seem like an arbitrary choice too. A String is probably short for String of characters in most cases, and that’s really an implementation choice. What if you do like JavaScript engines where a String is actually a collection of character buffers stitched together? In that case String feels like a slightly suspect name. For Ioke I’m currently leaning towards just calling it Text. Text is immutable by default and you need a TextBuffer to be able to modify text on the fly.

Another thing that sometimes feels very strange in prototype based languages is the name you give the operation to create new instances. In Io it’s called clone. In JavaScript you use new. Here I’m really not sure what to do. At the moment I’m considering leaving it as ‘clone’ but providing two factory methods that will allow better flow while reading and also avoid the clone keyword. Say that you have something called Person. To create a new Person the code should look like this:

p = Person with(
  givenName = "Ola"
  surName = "Bini")

This code will first call clone, and then execute the code given to with inside the context of the newly cloned object. This pattern should be quite common when creating instances. The other common pattern I see occurring is this:

tb = TextBuffer from("some text")

Maybe there are other useful cases, but with and from will work quite well for most purposes in the core libraries. What’s nice is that these methods are extremely easy to create and they don’t need any fancy runtime support. They are entirely composable from other primitives.
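To show what I mean by composable, here is a rough sketch in JavaScript, which is also prototype based. This is not Ioke code and not how Ioke implements anything. The `Origin` object, the `clone` method and the `chars` field are all names I made up for the illustration. The point is only that `with` and `from` need nothing more than a clone operation plus ordinary assignment:

```javascript
// Hypothetical hierarchy base, named after the "Origin" idea above.
const Origin = {
  // clone: make a new object that delegates to the receiver.
  clone() {
    return Object.create(this);
  },
  // with: clone the receiver, then run the initializer with the
  // fresh clone as its context.
  with(init) {
    const copy = this.clone();
    init.call(copy);
    return copy;
  },
};

const Person = Origin.clone();

const p = Person.with(function () {
  this.givenName = "Ola";
  this.surName = "Bini";
});

// from: clone the receiver, then seed the clone from an existing value.
// The "chars" field is an assumption made for this sketch.
const TextBuffer = Origin.clone();
TextBuffer.from = function (text) {
  const copy = this.clone();
  copy.chars = Array.from(text);
  return copy;
};

const tb = TextBuffer.from("some text");
```

Here `p` delegates to `Person`, and `Person` itself is left untouched by the initializer, which is the behavior I want from with.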

These are the kinds of decisions I’m currently trying to make. If you have something that works nicely instead of ‘clone’, do let me know.



Variations on naming


I’ve been spending lots of time thinking about naming lately. What kind of naming scheme do you follow in your class names? What do your method names look like? Your variables? Do you use active words or passive? Should you use longer, descriptive names or shorter, more succinct ones? How can you best use symbolic freedom to allow better naming? What kind of names do you choose for your core classes? Does it matter or should you just go for Good Ol’ ‘Object’?

These musings are part of my design thoughts about Ioke, and I’ll describe some of the decisions I’ve made.

I want to start with something simple. What naming scheme do you follow for your class-like objects? I’m going to go the same way as Java here, using capitalized words. This will be a convention and not anything forced by syntax: the class-like objects will not actually be classes, since Ioke is a prototype-based language.

So what about method names and variable names? First of all there is no distinction in Ioke. Everything is a message. There are several variations here that are quite common. I have opinions on all of them, of course.

  • C#: A typical method might be called ‘ShouldRender’. The words are capitalized and put together without any separation characters. Most symbols aren’t allowed so method names are generally restricted to numbers, letters and underscores. This style doesn’t appeal to me at all. I find it really hard to read. That the naming convention is the same as for class names makes it hard to tell the two kinds of names apart. Having the words connected without any separation characters also doesn’t help.
  • Java: The same as C# except that the first character is lower case: ‘shouldRender’. The same restrictions apply as for C#. I find it slightly easier to read than C#, since there is a visible difference between class names and method names. But the train wreck style of putting together the words is still inconvenient.
  • Ruby: With symbol freedom and lower case words separated by underscores, Ruby has a quite different style. The example would look like this: ‘should_render?’. The question mark is really nice to have there, and the underscores really make it easier to read. Of course, the difference between class names and method names is quite large, which is both a good and a bad thing.
  • Lisp: Typical Lisp naming is all lower case and separated by dashes: ‘should-render?’. Lisp also has high degrees of symbolic freedom so you can use basically any character except for white space and close parenthesis in it. I like this style a lot. It’s eminently readable but the main problem is the dash. Allowing dashes in names in a language with infix operators means that you must use spaces around a minus sign which really would annoy me. So even though I like this style I don’t think it’s practical unless your language uses Lisp style notation for operator names.
  • Smalltalk: This style of naming is definitely the most different. It tries to avoid words that run together by using keyword selectors. That means you interleave the arguments with the parts of the name of the method to call. For example: ‘render: “str” on: screen1’ is actually a method called ‘render:on:’. This method of naming has really nice reading benefits but there is also a high cost. Specifically, Ioke uses spaces to chain invocation, which means that if I used Smalltalk style names I would need to surround all method invocations with parentheses, which won’t look good at all in this case. There are lots of good reasons for this naming but ultimately it doesn’t work for Ioke.
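
The problem with dashes in the Lisp style is easy to see with a toy tokenizer. The sketch below is JavaScript and the token rule is a hypothetical one I made up for the illustration, not the Ioke lexer. It simply allows dashes inside names and shows what happens to a subtraction written without spaces:

```javascript
// Hypothetical token rule: a name starts with a letter, may contain
// letters, digits and dashes, and may end in ? or !.
// Anything else recognized here is an operator or a number.
function tokenize(source) {
  return source.match(/[a-z][a-z0-9-]*[?!]?|[-+*\/]|\d+/g) || [];
}

// Without spaces the whole thing is read as one name.
const joined = tokenize("width-margin");

// With spaces the minus sign becomes its own token.
const spaced = tokenize("width - margin");

console.log(joined); // [ 'width-margin' ]
console.log(spaced); // [ 'width', '-', 'margin' ]
```

So with this rule every infix minus has to be surrounded by spaces, which is exactly the annoyance described above.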

Those are basically the naming styles I can think of right now. Everything else is mostly variations on them. So what will I use for Ioke? I don’t know exactly, but the two alternatives I’m considering right now are Ruby names and Java names + symbolic freedom. I’m leaning towards the Java naming. Adding more symbols will make it easier to use good names. Small things like question marks and exclamation marks really make a difference. Another reason for going with Java names is that it allows interacting with Java without doing name mangling (like JRuby does). That’s highly attractive, even though the method names will be a bit more compact with this convention.
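
To make the name mangling point concrete, here is a minimal sketch in JavaScript of the kind of translation a language with Ruby style names needs before it can call a Java method. The function name `toJavaName` is hypothetical, and the rule only handles underscores. It is not the actual rule set JRuby uses:

```javascript
// Hypothetical helper: turn an underscore separated name into the
// Java style name it would have to be mapped to.
function toJavaName(name) {
  // Drop each underscore and upper-case the character that follows it.
  return name.replace(/_([a-z0-9])/g, (match, ch) => ch.toUpperCase());
}

console.log(toJavaName("should_render")); // shouldRender
```

With Java style names in the language itself this step disappears, because the name you write is already the name Java expects.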

Any opinions on this?