JRuby now JIT-compiles assertions


One of the major problems when running automated testing with JRuby is that all the standard Test::Unit assertions would never be JIT compiled, meaning that they would be quite slow. Actually, assertions seems to be very slow when running interpreted in JRuby. I have a small test case, courtesy of Michael Schubert:

require 'test/unit'
require 'benchmark'

class A < Test::Unit::TestCase
[10_000, 100_000].each do |n|
define_method "test_#{n}" do
puts "test_#{n}"
5.times do
puts Benchmark.measure{n.times{assert_equal true,true}}
end
end
end
end

This code will show quite nicely how large the overhead of asserts are, by using assert_equal. Now, the numbers for MRI for this benchmark looks like this:

Loaded suite test_assert
Started
test_10000
0.150000 0.000000 0.150000 ( 0.155817)
0.150000 0.000000 0.150000 ( 0.158376)
0.160000 0.000000 0.160000 ( 0.155575)
0.150000 0.000000 0.150000 ( 0.154380)
0.160000 0.000000 0.160000 ( 0.157737)
.test_100000
1.520000 0.010000 1.530000 ( 1.539325)
1.530000 0.000000 1.530000 ( 1.543889)
1.520000 0.010000 1.530000 ( 1.540376)
1.530000 0.000000 1.530000 ( 1.543742)
1.530000 0.010000 1.540000 ( 1.558292)
.
Finished in 8.509493 seconds.

2 tests, 550000 assertions, 0 failures, 0 errors

And for JRuby without compilation:

Loaded suite test_assert
Started
test_10000
1.408000 0.000000 1.408000 ( 1.408000)
0.582000 0.000000 0.582000 ( 0.582000)
0.425000 0.000000 0.425000 ( 0.426000)
0.419000 0.000000 0.419000 ( 0.418000)
0.466000 0.000000 0.466000 ( 0.467000)
.test_100000
4.189000 0.000000 4.189000 ( 4.190000)
4.196000 0.000000 4.196000 ( 4.196000)
4.139000 0.000000 4.139000 ( 4.139000)
4.165000 0.000000 4.165000 ( 4.165000)
4.162000 0.000000 4.162000 ( 4.162000)
.
Finished in 24.181 seconds.

2 tests, 550000 assertions, 0 failures, 0 errors

It’s quite obvious that something is very wrong. We’re about 2.5-3 times slower.

Now, the way the JRuby compiler works, we build it piece by piece and the JIT will try to compile a method that’s used enough. If there is any node that can’t be compiled it will fail and fall back on interpretation. In the case of assertions, all Test::Unit assertions use a small helper method called _wrap_assertion that looks like this:

def _wrap_assertion
@_assertion_wrapped ||= false
unless (@_assertion_wrapped)
@_assertion_wrapped = true
begin
add_assertion
return yield
ensure
@_assertion_wrapped = false
end
else
return yield
end
end

When I started out on this quest, there were two things in this method that doesn’t compile. The first is the ||= construct, which I mentioned in an earlier blog post. The problem with it is that it requires that we can compile DefinedNode too, and that one is large. The second problem node is Ensure. After lots of work, I’ve finally managed to implement most of these safely, and falling back on interpretation when it’s not safe. Without further ado, the numbers after compilation with these features added:

Loaded suite test_assert
Started
test_10000
0.996000 0.000000 0.996000 ( 1.013000)
0.415000 0.000000 0.415000 ( 0.415000)
0.110000 0.000000 0.110000 ( 0.110000)
0.099000 0.000000 0.099000 ( 0.100000)
0.109000 0.000000 0.109000 ( 0.109000)
.test_100000
1.012000 0.000000 1.012000 ( 1.012000)
1.008000 0.000000 1.008000 ( 1.000000)
1.017000 0.000000 1.017000 ( 1.017000)
1.039000 0.000000 1.039000 ( 1.039000)
1.024000 0.000000 1.024000 ( 1.024000)
.
Finished in 6.966 seconds.

So we’re looking at over 4 times improvement in speed, and about 33% percent faster than MRI. Try your test cases; hopefully it will show up.



ObjectSpace: to have or not to have


Among all the features of Ruby that JRuby supports, I would say that two things take the number one place as being really inconvenient. Threads are one; making the native threading of Java match the green threading semantics of Ruby is not fun, and it’s not even possible for all edge cases. But that argument have been made several times by both me and Charles.

ObjectSpace now, that is another story. The problems with OS are many. But first, let’s take a quick look at the most common usage of OS; iterating over classes:

ObjectSpace::each_object(Class) do |c|
  p c if c < Test::Unit::TestCase
end

This code is totally obvious; we iterate over all instances of Class in the system, and print an inspected version of them if the class is a subclass of Test::Unit::TestCase.

Before we take a closer look at this example, let’s talk quickly about how MRI and JRuby implements this functionality. In fact, having this functionality in MRI is dead easy. It’s actually very simple, and there are no performance problems of having it when it’s not used. The trick is that MRI just walks the heap when iterating over ObjectSpace. Since MRI can inspect the heap and stack without problems, this means that nothing special needs to be done to support this behavior. (Note that this can never be safe when using a real threading system).

So, the other side of the story: how does JRuby implement it? Well, JRuby can’t inspect the heap of course. So we need to keep a WeakReference to each instance of RubyObject ever created in the system. This is gross. We pay a huge penalty for managing all this stuff. Many of the larger performance benefits we have found the last year have revolved around having internal objects be smarter and not put themselves into ObjectSpace until necessary. One of my latest optimizations of regexp matching was simple to make MatchData lazy, so it only goes into OS when someone actually uses it. RDoc runs about 40% faster when ObjectSpace is turned off for JRuby.

So, is it worth it? In real life, when do you need the functionality of ObjectSpace? I’ve seen two places that use it in code I use every day. First, Rails uses it to find generators, and secondly, Test::Unit uses it to find instances of TestCase. But the fun thing is this; the above code is almost exactly what they do; they iterate over all classes in the system and checking if they inherit from a specific base class. Isn’t that a quite gross implementation? Shouldn’t it be possible to do something better? Euhm, yes:

module SubclassTracking
  def self.extended(klazz)
    (class <<klazz; self; end).send :attr_accessor,
                                     :subclasses
    (class <<klazz; self; end).send :define_method,
                                 :inherited do |clzz|
      klazz.subclasses << clzz
      super
    end
    klazz.subclasses = []  end
end

# Where Test::Unit::TestCase is defined
Test::Unit::TestCase.extend SubclassTracking

# Load all other classes# To find all subclasses and test them:
Test::Unit::TestCase.subclasses

I would say that this code solves the problem more elegantly and useful than ObjectSpace. There are no performance degradation due to it, and it will only effect subclasses of the class you are interested in. What’s the best benefit of this? You can use the -O flag when running JRuby, and your tests and rest of the code will run much faster and use less memory.

As a sidenote: I’m putting together a patch based on this to both Test::Unit and Rails. ObjectSpace is unnecessary for real code and the vision of JRuby is that you will explicitly have to turn it on to use it, instead of the other way around.

Anyone have any real world examples of things you need to do with ObjectSpace?