Tutorial at JAOO about JRuby testing


Just thought I’d mention it here – I’m at JAOO this year and will give a tutorial about testing Java with JRuby. It will be a great tutorial and I hope to see many of you there.



Upcoming talks


There hasn’t been much interesting happening this summer, but the fall is shaping up to be pretty busy. I will be talking at several different conferences, and thought I’d mention when and where I will be appearing.

First, this week I’m presenting at JavaZone in Oslo. I will present at 11:45 tomorrow, talking about Ioke.

Next week is the JVM Language Summit in Santa Clara. It is shaping up to be a great collection of people with many interesting discussions and talks. Take a look at the details for the talks. The people there are some of the most experienced language developers and implementors in the world. It should be a blast. I will do a talk about Ioke, and also a workshop about the challenges of improving Ioke’s performance.

After that I will attend RubyFoo in London, Oct 2-3, where I will talk about JRuby. RubyFoo will feature Matz, Sam Aaron, Aslak Hellesøy, Adam Wiggins and me. It should be great fun!

At JAOO this year (Oct 4-9 in Aarhus, Denmark) I will do a tutorial about testing Java code with JRuby. This conference also looks like it will be great. Many interesting talks and speakers. And of course, JAOO is generally the best conference I’ve ever been to.

At Øredev in Malmö, Sweden (Nov 2-6), I will be talking about Ioke.

And finally, at QCon SF in San Francisco (Nov 16-20) I will be hosting a track on emerging languages. After JAOO, QCon is my favorite conference, so I think it will be very nice too.

So, several interesting conferences coming up. Hope to see many of you there!



Re2j – a small lexer generator for Java


There is a tool called re2c. It’s pretty neat. Basically it allows you to intersperse a regular expression based grammar in comments inside of C code, and those comments will be transformed into a basic lexer. A few things make re2c different from other similar tools. The first is that the supported feature set is deliberately limited (which is good), and the generated code is fast. The other good part is that you can have several sections in the same source file – the productions for any specific piece of code are constrained to their own comment.

As it happens, why the lucky stiff used re2c when he made Syck (the C-based YAML processor used in Ruby and many other languages). So when I set out to port Syck to Java, the first problem was to figure out the best way to port the lexers built with re2c. I ended up using Ragel for the implicit scanner, and thought about doing the same for the token scanner, but Ragel is pretty painful to use for more than one main production in the same source file. The syntax is not exactly the same either, so switching would have added to the burden of porting the scanner.

At the end of the day the most pragmatic choice was to port the output generator in re2c to generate Java instead. This turned out to be pretty easy, and the result is now used in Yecht, which was merged as the YAML processor for JRuby a few days ago.

You can find re2j in my github repository at http://github.com/olabini/re2j. It is still a C++ program, and it probably won’t compile very well on Windows. But it’s good enough for many small use cases. Everything works exactly as in re2c, except for one small difference: you can define a parameter called YYDATA that points to a byte or char buffer that should be the place to read from. For an example usage, take a look at the token scanner: http://github.com/olabini/yecht/blob/master/src/main/org/yecht/TokenScanner.re.

I haven’t put any compiled binaries out anywhere, and at some point it might be nice to merge this with the proper re2c project so you can give a flag to generate Java instead of C, but for now this is all there is to the project.



What is eval?


The glib answer to this question would be: “evil”. Of course, that doesn’t really tell us anything new. I wanted to explore the question of where in the spectrum eval fits in, in dynamic languages, and why the power of the language is ultimately increased by including eval.

Lately I’ve been saying that having eval is actually a roundabout way of having the interpreter be first class. After some thinking I’ve realized that this isn’t strictly true, which is why I wanted to spend some more time on eval.

The history of eval goes back to McCarthy’s paper on Lisp, long before Lisp was actually implemented. The interesting point is that the eval given in that paper can be used by the language itself, and the language can define its own semantics in terms of itself, so a complete eval can be implemented in the language. An interpreter with this property is generally called a metacircular interpreter. Of course, having eval be this easy to implement in the language itself also makes it extremely simple to tweak it a bit and implement subtly different versions of the language. All of these advantages are not really based on eval itself, though, but rather on the fact that Lisp is so easy to define in terms of itself.

Eval shines more in languages where it’s really hard to define the semantics, like in JavaScript, Ruby or Perl. In these languages it is still possible to implement an eval in the language itself, but it’s extremely hard. In these languages, having eval gives you an escape hatch into the already implemented interpreter that is running the host code.

There are two different versions of eval in common use. Which one is used mostly depends on the type of the language. In homoiconic languages you will generally not give strings to eval, since you can give the code to execute directly to eval. The typical example of this is Lisp, where eval takes an S-expression. Since it is so easy to build S-expressions (and they are fundamentally more expressive than strings), this version of eval makes many things easy. Languages that are not homoiconic generally take a string that contains the code, and will then parse and execute it.
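Ruby is a typical example of the second kind – a minimal sketch of string-based eval:

```ruby
# Non-homoiconic eval: the program is handed over as text,
# parsed at runtime, and then executed.
code = "1 + 2 * 3"
result = eval(code)
puts result  # => 7
```

The parse step happens on every call, which is part of why string-based eval is both slower and harder to analyze than its S-expression cousin.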

Most versions of eval also take an argument that contains the current binding information, or the current context. In some versions this is implicit and can never be sent in explicitly, while some languages (like Ruby and Lisp) allow you to send in the binding separately. For this to be powerful you obviously need a way to get at the binding in a current context, and then be able to store that somewhere.
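Ruby shows both halves of this: `Kernel#binding` captures the current context as a first-class object, which can then be stored and handed to eval elsewhere (the method name below is just illustrative):

```ruby
# Capture a context in one place...
def make_binding
  secret = "hidden"
  binding  # reifies the local variables, self, etc.
end

# ...and evaluate code against it somewhere else entirely.
b = make_binding
puts eval("secret", b)  # => hidden
```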

So, in summary, eval depends on two capabilities that are more or less orthogonal. The first is to call out to the interpreter and ask it to execute some code. The second is to be able to manipulate code contexts in a limited manner. Some languages allow you to do whatever you want with contexts, but that is definitely not the norm – since it disallows some very powerful optimization techniques. It is possible to get access to this information without sacrificing performance, though, as Smalltalk shows.

To get back to the question of whether eval has anything to do with first class objects, we first need to look at what it actually means to be first class. Of course, the criteria for being first class depend to a degree on what language we are talking about. The Wikipedia definition is that a first class object is something that can be used inside the programming language without restriction, compared to other entities in the language. In the context of an object oriented language, this would mean that you should be able to create new instances of it, store it in variables, pass it as an argument to methods, return it from methods, call methods on it, and so on.

Seen that way, the eval function is actually pretty restricted in what you can do with it. Specifically, in Ruby, if you apply the refactoring Extract Method to a piece of code that includes eval, eval will not work the same afterwards. This makes eval fundamentally different from all other methods in Ruby.
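A minimal sketch of why Extract Method breaks eval in Ruby (the method names are hypothetical):

```ruby
def original
  x = 42
  eval("x")        # eval sees the surrounding local x => 42
end

# The same eval call after Extract Method: the binding has changed.
def extracted(code)
  eval(code)       # there is no local x in this scope
end

def refactored
  x = 42
  extracted("x")   # raises NameError – eval can no longer see x
end

puts original      # => 42
begin
  refactored
rescue NameError
  puts "the extracted eval lost its binding"
end
```

No ordinary method is sensitive to this kind of move, which is exactly the restriction that keeps eval from being fully first class.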

So let’s change the question a bit – how can we make the interpreter first class while still retaining the simplicity of eval? The first step is to actually make the interpreter into a class. This class has one instance that is the currently running runtime. Once you have that object available at runtime, the next step is to be able to create new instances of the interpreter, and finally to be able to ask it to invoke code. The second piece of the puzzle is to make bindings/contexts first class, so you can create new ones at runtime and manipulate them. Once you have those two things together, eval is just a shortcut for getting the current interpreter and the current context and asking the interpreter to evaluate some code.

Ioke doesn’t have this right now, but I have made a place for it. There is an object called Runtime that reflects the current runtime. The plan is to make it possible to call mimic on it, and by doing so create a new interpreter from the current one. What is interesting is that this makes it possible to get some inherent security too: since the second runtime mimics the first one, the second one won’t have capabilities that the first one lacks.

In Ioke a binding is just a regular Ioke object – nothing special at all really, and you can just create any kind of object and use that as a binding object. The core of simplicity in Ioke makes these operations that much simpler.

Eval is a strange beast, but at the end of the day it is still about accessing the interpreter. Generalizing this makes much more interesting things possible.



Google Wave


It has been almost two weeks since Google Wave was announced, so I thought I’d write a little about my thoughts on the subject. I have actually waited with this blog post because I haven’t been exactly sure what I think about it yet. Of course, I’m still not – but hopefully I will be able to get some of my thoughts collected by writing this post.

So let me start with the basics. Google Wave is a combination of email, instant messaging, forums, document creation and much more. The core concept is the wave, which in turn is made up of blips. The blips are what actually contain content. Blips are threaded in terms of each other. The most distinctive thing about Wave is probably that a wave is persistent and centralized. The model assumes that there is only one instance of each wave (although waves can be federated and cached temporarily).

In general, there are three kinds of actors in a wave system. The first is the human participant. You have a list of contacts and so on, that you can use to include people in a wave. The default settings make it possible for all participants to see all blips in a wave, but you can set this on a blip-by-blip basis. The second kind of actor is the gadget: basically a piece of functionality that gets inserted directly into a blip. Gadgets are written in JavaScript and are as such client side functionality. The final kind of actor is the robot, a computerized participant that can do more or less the same things a human participant can.

The robots are interesting, because most of the functionality in Wave is actually implemented in terms of robots. And if you want to build more interesting systems on top of Wave, the robots will be the way to achieve this. For the moment, you can only deploy robots on Google AppEngine – we have been told that this will change, though.

So how does a robot work? Conceptually, a robot gets a bundle of events that it can react to by doing different things to a simple object model that represents the wave/blip in question. It’s not much harder than that. The available events are actually pretty few right now: changes of participants, when a new blip is submitted, when a blip is created or has its title changed, when a blip is deleted, and when a document is changed. Some of these events don’t happen often, while the document-changed event is generated on every new character. The actual protocol does bundling and so on, so you won’t necessarily run the robot on every character.

The actual wire protocol is built on top of JSON – it hasn’t actually been fully documented yet, the reason being that it’s not totally stable. At the moment it also looks like the protocol is pretty chatty, and that for most real world scenarios, you will want to have quite a few robots in most conversations, which could potentially lead to a large amount of traffic to the robots.

So, what are my impressions? I think it’s definitely cool. I think there is absolutely the potential for Wave to be a new platform that could replace many of the existing ones. Of course, it is still very early days. This means that the functionality and protocols are subject to change. I’m also looking forward to when the implementations will be open sourced so it will be easy to set up your own instance.

At the same time, my initial experience was that Wave easily became very confusing, especially when having several conversations going. On the other hand the Wave team reported the same, but also noted that after they got used to working with the system they learned new ways to handle it. I guess the same will happen for me, after some time of usage.

In summary: Wave is cool. It will be the platform for many applications, and the platform has great potential. It’s going to be interesting.



Scala LiftOff


During Saturday I attended the Scala LiftOff conference. There were about 50-60 people there this year – many interesting people. The format of the conference was to have everyone propose sessions to talk about, and then put them in different time slots. This worked out great in practice, except for the small detail that the venue had terrible acoustics. It was extremely hard to make out what people were saying at times.

The exceptions to the unconference format were Martin’s keynote talk, before we started, and also something they called speedgeeking. I’ll talk more about that later.

So, Martin Odersky talked about the next five years of Scala – this information is pretty well covered in Dean Wampler’s blog entry about the BASE meeting. I am impressed by some of the things they’re planning for the future.

After that I decided to attend a session John Rose put together, about JSR 292, invokedynamic, and other features we could add to the JVM to make life for Scala easier. This turned into a pretty interesting discussion about different things. Martin Odersky was there and gave his perspective on what kind of features would be most useful. He was especially interested in interface injection and tail-call optimization, but we managed to cover quite a lot of ground in this discussion.

During the next slot I ended up being a butterfly – no session was really extremely interesting.

We had lunch and during that I saw Alex Payne describe some of the things they are doing at Twitter using Scala. After that came the speedgeeking. The basic idea was that twelve people should do small demos, max five minutes. They would do those demos for a smaller group of people, and then switch group – until everyone had seen those demos. I didn’t like this concept at all, and the way it worked out was just annoying – I ended up talking to John Rose and Martin Odersky for most of the time.

After that, Josh, Amanda and I figured that the weather was very nice, so we moved our sessions outside. This also solved the problem with the bad acoustics. The first of the outside sessions was Martin convening people to talk about equality and hash code semantics. The way implicits work right now makes for some very strange and unexpected cases – such that “bob”.reverse == “bob” is not true. There are also several intricacies in how to handle hash code calculation for mutable collections. We didn’t really come to any conclusions, but Martin was happy that he’d gone through the available options and thoughts pretty thoroughly.

After that Josh and Amanda led a discussion about what kinds of patterns we’re starting to see in hybrid functional object-oriented languages. Amanda’s experience with F# came in handy when comparing to the approaches used in Scala. No real conclusions here either, but lots of interesting discussions. My one mental note was to look up a recent OBJ paper, detailing an object calculus. This reference came from Paul Snively.

After that the conference was over – Josh, Amanda and I were joined by Paul Snively for a beer. That ended up with me ranting about programming languages, as usual…

All in all, Scala LiftOff was a great conference, with a collection of many interesting people from several language communities. This ended up sparking very interesting discussions.



The Clojure meetup and general geekiness


The Bay Area Clojure user group threw a JavaOne special on Wednesday afternoon, with Rich Hickey as special guest. I went there, and it turned out that a large collection of former and current ThoughtWorkers were present, among all the other Clojure enthusiasts. The plan was lightning talks for a while and then a general town hall with Rich answering questions. The reality turned out to be a bit different – firstly because people spent quite long on their talks, and people asked many questions, and so on. The second problem was that the projector in the place had some serious problems – which basically resulted in everyone projecting pink-tinted presentations.

There were several interesting talks. The first one took a look at what the Clojure compiler actually generates. This turned a bit funny when Rich chimed in and basically said “that doesn’t look right” – the presenter had simplified some of what was happening. I don’t envy the presenter in this case, but it all turned into good fun, and I think we all learned a bit about what Clojure does during compilation.

There was a longer talk about something called Swarmli, which was a very small distributed computing network, written in about 300 lines of code. I defocused during that talk since I had to hack some stuff in Ioke.

After that, one of the JetBrains guys showed off the new IntelliJ Clojure plugin. It seems to be quite early days for it still, but there is potential for good cross-language refactoring, joint compilation and other goodies there.

Finally, my colleague Bradford Cross did a very cool talk about some of the work he’s currently doing at a startup. The work seems to be perfectly suited for Clojure, and the code shown was very clear and simple. Very cool stuff, really. ThoughtWorks – actually using Clojure on client projects. Glad to see that.

After that it was time for Rich Hickey. Rich decided to give a lightning talk himself – about chunked sequences. Very cool in concept, but actually one of those ideas that seem very simple and evident after the fact. Chunked sequences really seems to promise even better Clojure performance in many cases – without even requiring changes to client code.

After that there was a general Q&A session, where questions ranged all over the map, from personal to professional. One of the more contentious things said was about Rich’s attitude to testing. This caused lots of discussion later in the evening.

All in all, this was really a great event. We ended up at a nearby bar/restaurant afterwards and had long discussions about programming languages. A great evening.



Second day of JavaOne


The second day of JavaOne ended up being not as draining as the first one, although I had lots of interesting times this day too. I’ve divided it into two blog posts – this is about what happened at JavaOne, and the next one will be about the Clojure meetup.

The first session of the day was Nick Sieger’s talk about using JRuby in production at Kenai. An interesting talk about some of the things that worked, and some of the things that didn’t. A surprising number of decisions were handed down by fiat, since they needed to use Sun products for many things.

After that Neal Ford gave a comparison between JRuby and Groovy. I don’t have much to say about this talk, except that some things seemed to be a bit more complicated to achieve in Groovy than in Ruby.

As it turns out, the next talk was my final talk of the day. This was Bob Lee (crazy bob) talking about references and garbage collection on the JVM. A very good talk, and I learned about how the Google Collections MapMaker actually solves some of my Ioke problems. I ended up integrating it during the evening and it works great.

The second day had fewer talks for me – but I still had a very good time and even learned some stuff. Nice.



First days of JavaOne and CommunityOne


I’ve been spending the last few days in San Francisco, attending CommunityOne and JavaOne. We are right now up to the second day of JavaOne, so I felt it would be a good idea to take a look at what’s been going on during the first two days.

I will not talk about the general sessions here, since I as a rule avoid going to them. So, I started out CommunityOne seeing Guillaume talk about what is new in Groovy 1.6. Pretty interesting stuff, and many useful things. Although, one of the things I noted was that many of the default usages of AST transformations actually just make up for the lack of class body code. Things like @Singleton, which need an AST transformation in Groovy, are very simple to do by executing code in the class body in Ruby.
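As a sketch of that point: Ruby’s standard library Singleton module is nothing but code that runs when the class body is executed – no compiler support involved (the class name here is made up for illustration):

```ruby
require 'singleton'

class Configuration
  # This include executes at class-definition time. It is ordinary
  # code running in the class body, not an AST transformation:
  # it privatizes .new and defines .instance on the class.
  include Singleton
end

a = Configuration.instance
puts a.equal?(Configuration.instance)  # => true, only one instance ever exists
```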

After that I saw John Rose talk about the Da Vinci machine project. Pretty nice stuff going on there, really. The JVM will really improve with this technology.

Charles Nutter did a reprise of his Beyond Impossible JRuby talk. It’s a really good talk that focuses on the things that you really wouldn’t think possible to do on the JVM, that we’ve had to do to get JRuby working well.

Guido talked about Python 3000 – much of that was really a look at the history of Python, and as such was really interesting. Unfortunately, my jetlag started to get the better of me at that point, so my focus could have been better.

For me, the first day of JavaOne started out with the Script Bowl. This year the languages represented were Jython, Groovy, Clojure, Scala and JRuby. I think they all did a pretty good job of showcasing the languages, although it’s very hard to do that in such a small timeframe. I think I sympathized the most with Rich Hickey (creator of Clojure) – the reason being that the Clojure model is the most dissimilar from the rest of the languages. But this dissimilarity is actually the key to understanding why Clojure is so powerful, so if you don’t understand it, you’re just going to be turned off by Clojure’s weird surface semantics. (Hint: they are not weird, they are necessary and powerful and really cool.) Rich made a valiant effort to convey this by talking a lot about the data structures that make up Clojure, but I’m unsure how much of it actually penetrated.

Tom did a great job with the JRuby demos – he had a good Flash 3D game running using a JRuby DSL, and then some slides showcasing how much benefit JRuby gets from the Ruby community. Good stuff.

After that I went to Rich’s Clojure talk. I’ve seen him give similar talks several times, but I don’t get tired of seeing this. As usual, Rich did a good job of giving a whirlwind tour of the language.

After lunch I went to the talk by Konstantin about JetBrains MPS. I was curious about MPS since I’ve been spending time with Intentional lately. I came away from the talk with a pretty different view of MPS compared to going in, actually. My initial reaction is that MPS seems pretty limited compared to what you can do with Intentional.

Then it was time to see Yehuda Katz talk about Ruby – this was a great intro to Ruby and I think the audience learned a lot there.

The first evening of JavaOne was really crazy, actually. I ended up first going to Brian Goetz and John Rose’s talk about building a Renaissance VM. This was a bit of an expansion of John’s CommunityOne talk, and gave a good overview of the different pieces we’re looking at in JSR 292, and also other things that should be in the JDK in some way to make a multi-language future possible.

Tobias Ivarsson gave a BOF about language interoperability on the JVM. This ended up being more about the interface injection feature that Tobias has been hacking on. We had some pretty good discussion, and I think we ended up with a feeling that we need to discuss this a bit more – especially if the API should be push or pull based. Good session by Tobias, though.

And then it was finally time for my BOF, called Hacking JRuby. This was actually a pretty mixed bag, containing lots of fun small pieces of JRuby knowledge that can be useful if you want to do some weird things with JRuby. The slides can be found here: http://dist.codehaus.org/jruby/talks/HackingJRuby.pdf. I think the talk went pretty well, although it was in a late slot so not many people showed up.

The final session of the day was a BOF called JRuby Experiences in the Real World. This ended up being a conversation between about 10-12 people about their JRuby experiences. Very interesting.

After that I was totally beat, and ended up going home and crashing. So that was my first day at JavaOne.



Google I/O


Currently sitting in a session on day two of the Google I/O conference. The morning opened up with the keynote and announcement of Google Wave, which is something that seems very cool and has a lot of potential. Very cool start of the day.

After that I watched Ben and Dion talk about Bespin. I hadn’t seen Bespin before – it was definitely interesting, although I will be hard pressed to give up Emacs any day soon.

During lunch I came up with a fun idea, but it required something extra. I talked to Jon Tirsen, a Swedish friend from his ThoughtWorks days, who is on the Google Wave team – and he managed to get me an early access account for Google Wave. So I spent the next few hours hacking – and was able to unveil an Ioke Wave Robot during my talk. It is basically only a hello world thing, but it is almost certainly the first third-party Google Wave code… You can find it at http://github.com/olabini/iokebot. It is deployed as iokebot@appspot.com so when you have your Wave account you can add it to any waves. Very cool. I do believe there is a real potential for scripting languages to handle these tasks. Since most of it is about gluing services together, dynamic languages should be perfectly suited for it.

Finally I did my talk about JRuby and Ioke – that went quite well too. The video should be up on Google sooner or later.

And that was basically my Google I/O experience. Very nice conference and lots of interesting people.