The pain of compiling try-catch


I’ve been spending some time trying to implement a compiler for the defined?-feature of Ruby. If you haven’t seen it, be happy. It’s quite annoying, and incredibly complicated to implement, since you basically need to create a small interpreter especially just for nodes existing within defined?. So why is defined? so important? Well, for one it’s actually needed to implement the construct ||= correctly. And that is used everywhere, which means that not compiling it will severely impact our ability to compile code. Also, it just so happens that OpAsgnOrNode (as it’s called), and EnsureNode, are the two nodes left to implement to be able to compile Test::Unit assert-methods, since the internal _wrap_assertion uses both ensure and ||=.
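
For readers who haven't met defined? before, here is a minimal sketch of what it returns and of the usual ||= idiom. The method and class names (describe_definitions, Settings) are made up for illustration, and the sketch only shows the Ruby-level behaviour, not how JRuby compiles it.

```ruby
# defined? inspects its argument without evaluating it and returns
# a description string, or nil when the thing is not defined.
def describe_definitions
  local = 1
  [
    defined?(local),         # "local-variable"
    defined?(@missing),      # nil: this instance variable was never assigned
    defined?(puts),          # "method"
    defined?(String),        # "constant"
    defined?(no_such_thing)  # nil instead of a NameError
  ]
end

# x ||= value assigns only when x is currently undefined, nil or false.
class Settings
  def options
    @options ||= { "verbose" => false }
  end
end

p describe_definitions

settings = Settings.new
settings.options["verbose"] = true
p settings.options["verbose"] # the hash was created once and then reused
```

The point of the second half is that the right-hand side of ||= runs only on the first call, which is why the change to the hash survives.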

So, now you know why. Next, a quick intro to the compilation strategy of JRuby. Basically we try to compile each script and each method into one Java method. We try to use the stack as much as possible, since that way we can chain statements together correctly. And that’s about it.

The problem enters when you need to handle exceptions in the emitted Java bytecode. This isn’t a problem in the interpreter, since we explicitly return a value for each node, and the interpreter doesn’t use the Java stack as much as the compiler does. We also want to be able to use finally blocks at places, especially to ensure that ensure can be compiled down, but also to make the implementation of defined? safe.

So what’s the problem? Can’t we just emit the catch-table and so on correctly? Well, yes, we can do that. But it doesn’t work. Because of a very annoying feature of the JVM. Namely, when a catch-block is entered, the stack gets blown away. Completely. So if the Ruby code is in the middle of a long chained statement, everything will disappear. And what’s worse, this will actually fail to load with a Verifier exception, saying “inconsistent stack height”, since there will now be one code path with things on the stack, and one code path with no values on the stack, and the way JRuby works, these will end up at the same point later on. And the JVM doesn’t allow that either.
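
To make the situation concrete, here is a small sketch of Ruby code with this shape. The method names (risky, combine) are invented for the example. If a compiler pushes arguments onto the stack as it goes, the first two arguments are already there when the rescue clause is entered.

```ruby
# A call that always fails, standing in for any code that may raise.
def risky
  raise ArgumentError, "boom"
end

# Collects its three arguments so the result is easy to inspect.
def combine(a, b, c)
  [a, b, c]
end

# The third argument is itself an expression with a rescue clause.
# With a stack-based compilation strategy, 1 and 2 would already be
# on the operand stack when the exception handler starts running.
result = combine(1, 2, (begin
  risky
rescue ArgumentError
  3
end))

p result
```

In the interpreter this is unremarkable and evaluates to an array of the three values. It is only the stack-based bytecode for the same expression that runs into the cleared stack.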

This makes it incredibly hard to handle these constructs in bytecode, and frankly, right now I have no idea how to do it. My first approach was to actually create a new method for each try-catch or try-finally, and just have the code in there instead. The fine thing about that is that the surrounding stack will not be blown away since it’s part of the invoking method, and not in the current activation frame. And that approach actually works fairly well. Until you want to refer to values from outside the try or catch block. Then it breaks down.
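
As a rough Ruby-level analogy (not the actual emitted bytecode), the sketch below moves a protected region into its own method. All names here are hypothetical. It shows why values from the surrounding method become awkward: they have to be passed in and handed back explicitly.

```ruby
# Original shape: the protected region reads and writes a local
# variable that belongs to the enclosing method.
def inline_version(input)
  total = 10
  begin
    total += Integer(input)
  rescue ArgumentError
    total = -1
  end
  total
end

# Outlined shape: the region lives in a separate method, so it gets
# its own frame and cannot see the caller's `total` directly.
def outlined_region(total, input)
  total + Integer(input)
rescue ArgumentError
  -1
end

def outlined_version(input)
  total = 10
  # The local has to be threaded through by hand.
  total = outlined_region(total, input)
  total
end

p inline_version("5"), outlined_version("5")
p inline_version("oops"), outlined_version("oops")
```

With one local this is manageable. With many locals, or with values that are half-way through a chained expression, threading everything through by hand is where the approach gets painful.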

So, right now I don’t know what to do. We have no way of knowing at any specific place how deep the stack is, so it’s not possible to copy it somewhere, and then restore it in the catch block. That would be totally inefficient too. In fact, I have no idea how other implementations handle this. There’s gotta be a trick to it.



FOSCON slides


I have been asked for the slides of my FOSCON presentation. They can be downloaded in PDF format from here: http://ologix.com/JRubyWhirlwindTour.pdf.



Should JRuby support 1.4.2?


Right now we’re trying to decide if JRuby should upgrade from Java 1.4.2 to Java 5. There are some compelling reasons for this, but I’m not 100% sure it’s a good idea. Any comments from my readers?

In practical terms, this will mean that JRuby 1.0 will continue to be supported on 1.4.2, but new development will only work on Java 5 or higher. There is talk about using retrotranslator for handling 1.4.2 compatibility in later versions.

So. Please, comments and opinions!



Really Radical Ruby


I had a very good time at FOSCON III with the Portland Ruby Brigade yesterday. There were lots of entertaining talks too. I would say that my “lightning talk” wasn’t really a lightning talk at all. Unless you count the speed of my talking… I managed to race through all 22 of my slides, all of them chock full of information, in the time allotted to me. Hopefully people learned something from it.

Chad Wathington demonstrated Mingle, which was also very nice.

John Lam showed us a taste of IronRuby, and also talked some about the implementation particulars that made certain things faster in IronRuby than on MRI. Interesting stuff, but I’m looking forward to the full talk tomorrow.

Alan McKean from GemStone showcased GemStone/JRuby – still a work in progress though. For those of you who don’t know what GemStone is, think extremely powerful object persistence. And they’re building a new version for Ruby, on top of JRuby. Very cool.

I realized that there are a few points about JRuby that haven’t been emphasized enough, though, so here are the executive summary bullet points:

  • JRuby is totally Java compliant and runs on any Java.
  • JRuby is 1.0
  • JRuby supports Rails
  • ThoughtWorks offers commercial support for JRuby
  • JRuby performance is on par with the C implementation, on average.


OSCON: first tutorial day


Yesterday was the first tutorial day at OSCON. Due to some planning mistakes, I didn’t get the correct conference pass, so I missed the first tutorial. After that was sorted out I proceeded to the second tutorial of the day: Advanced Techniques for Parsing, by Mark Dominus. Of course I knew that the code would be Perl, but that didn’t disturb me so much, since I expected to see some advanced parsing techniques. This is where disappointment hit me. Maybe the Perl code used was advanced, but it was not in any way advanced parsing. The first 2 hours were spent implementing a recursive descent parser with 7 productions. After that, I decided that I wouldn’t be learning anything from this presentation, and headed back to the hotel, which was good, since I got sick that afternoon and spent the rest of the evening slightly delirious in my bed.

But now I’m up and going again, sitting here waiting for the tutorial “Real World Grails” to begin. I’m looking forward to seeing how Grails is actually used, since the presentations I’ve seen on it usually just show scaffolding and simpler things.

I have also decided on the subject for tonight’s FOSCON, but the slides are not finished yet. And the topic I chose is kind of a cop out: JRuby Cavalcade is the title of the talk, and I will basically just run through loads of interesting and funny JRuby things until I run out of time or get booed off stage. Hope to see you there!



JRuby at FOSCON 2007


So, I will attend and present at FOSCON 2007, which is arranged by the Portland Ruby Brigade (PDX.rb), and the theme is Really Radical Ruby. It’s on Tuesday, more information here. It seems to be an interesting event, so please show up!

I have not yet decided what I’m going to talk about. JRuby will be involved in some way of course. Does anyone have any requests about what I should talk about?



Back to JRuby regular expressions


It seems that this issue comes up every third month. After all the work we have done, we realize that regular expressions need some real work again. Our current solution works quite well. We have imported JRegex into JRuby, and done a whole slew of modifications to it. It runs well and has no issues with deeply nested regular expressions (Java’s engine uses a recursive algorithm, making it stack overflow for certain inputs. Certain very common inputs, in say … Rails. *sigh*).

But JRegex is good. It’s not perfect though. It’s slightly slower than the Java engine, it doesn’t support everything in the Java engine, and conversely, it supports some things that Java doesn’t support. The major problem is that we don’t have MRI compliant multibyte support, and the implementation of our engine is wildly different compared to MRI’s engine, and Oniguruma.

At some point we will probably just bite the bullet and do a real port of Oniguruma. But until such time comes, I have extracted our current regular expression stuff, and put everything behind a common interface. What that means is that with the current trunk, you can actually choose which Regular Expression engine you want to use. You can even write your own and plug it in. The interface is really small right now. At the moment we only have JRegex and Java, and the Java engine doesn’t pass all tests (I think; I haven’t tried, since that wasn’t the point of this exercise). Anyway, it means you can have Java Regular Expressions if you want them, right in your JRuby code. But only where you want them. So, you can control which engine is used globally by doing one of these two:

jruby -J-Djruby.regexp=java your_ruby_script.rb
jruby -J-Djruby.regexp=jregex your_ruby_script.rb

The last is currently the default, so it’s not needed. In the future JRegex may no longer be the default, but this option should still be there. The nicer thing about this is that you can also use Java Regexps inline, even if you want to use JRegex for most expressions:

begin
  p(/\p{javaLowerCase}*/ =~ "abc")
  p $&
rescue => e
  p e
end

p(/\p{javaLowerCase}*/j =~ "abc")
p $&

Now, the first example will actually raise a RegexpError, because javaLowerCase is not a valid character class in JRegex. But note the small “j” I’ve added to the second Regexp literal! That expression works and will match exactly as you would expect.



RSpec and RBehave run on JRuby


I’m not sure if this is well known or not, so I’ve been meaning to write a quick notice about it. The short story is this: JRuby can run RSpec and RBehave. Why is this important? Well, you can write code that tests Java code using RSpec and RBehave, meaning that it will be possible to get much more readable tests, even for code living in Java land.

Even if your organization won’t accept writing an application in Ruby, it would probably be easier to get the testing done in Ruby. And writing tests in an effective language means that you will either write more production code, or more tests. Either of those is quite a good outcome.

A quick example of this in action. To run this example, you need JRuby 1.0 or later, and the rspec gem:

require 'java'

describe java.util.ArrayList, " when first created" do
  before(:each) do
    @list = java.util.ArrayList.new
  end

  it "should be empty" do
    @list.should be_empty
  end

  it "should be able to add an element" do
    @list.add "content"
  end

  it "should raise exception when getting anything" do
    lambda{ @list.get 0 }.should raise_error(java.lang.IndexOutOfBoundsException)
  end
end

In this code the examples are not that realistic, but you can see that the RSpec code looks the same for Java code as it does for Ruby code. Even the raise_error exception matcher works. You can run it like this:

jruby -S spec -fs arraylist_spec.rb

The RBehave test suite also runs, which means you can stop using JBehave now… =)

This is a perfect example of the intersection where JRuby’s approach can be very powerful, utilizing the existing Ruby libraries to benefit your Java programming.



The results of JRuby compilation


If you are interested in what actually happens when JRuby compiles Ruby to Java bytecode, I have added some small utilities to help out with this. To compile a string:

require 'jruby'
JRuby::compile("puts 1,2,3")

If you are running with -J-Djruby.jit.enabled=false, you can also inspect the result of compiling a block:

require 'jruby'

JRuby::compile do
  puts "Hello World"
end

The results of both of these invocations will be an object of type JRuby::CompiledScript. It has four attributes: name, class_name, original_script and code. The original_script attribute is only available when compiling from a string. The code attribute contains a Java byte array, and as such is not very useful by itself. But you can use the inspect_bytecode method to get a string which describes the compiled class. So, to see how JRuby compiles puts "Hello, World":

require 'jruby'

puts JRuby::compile(<<CODE).inspect_bytecode
puts "Hello, World"
CODE

Once you know what happens, you can start contributing to the compiler! =)



JRuby Inside


Peter Cooper has opened a “sister” site to RubyInside, called JRubyInside. It seems very promising; the address is http://www.jrubyinside.com.