Current state of Regular Expressions


As I’ve made clear earlier, the current regular expression situation has once again become impractical. To reiterate the history: We began with regular Java regex support. This started to cave in when we found out that the algorithm used is actually recursive, and fails for some common regexps used inside Rails among others. To fix that, we integrated JRegex instead. That’s the engine 1.0 was released with and is still the engine in use. It works fairly well, and is fast for a Java engine. But not fast enough. In particular, there is no support for searching for exact strings and failing fast, and the engine requires us to transform our byte[]-strings to char[] or String. Not exactly optimal. Another problem is that compatibility with MRI suffers, especially in the multi byte support.

There are two solutions currently on the way. Core developer Marcin are working on a port of the 1.9 regexp engine Oniguruma. This port still has some way to go, and is not integrated with JRuby. The other effort is called REJ, and is a port of the MRI engine I did a few months back. I’ve freshened up the work and integrated it with JRuby in a branch. At the moment this work actually seems to go quite well, but there are some snags.

First of all, let me point out that this approach gives us more or less total multibyte compatibility for 1.8, which is quite nice.

When doing benchmarking, I’m generally using Rails as the bar. I have a series of regular expressions that Petstore uses for each requst, and I’m using these to check performance. As a first datapoint, JRuby+REJ is faster at parsing regexps than JRuby trunk for basically all regexps. This ranges from slightly faster to twice as fast.

Most of the Rails regexen are actually faster in REJ than in JRuby+trunk, but the problem is that some of them are actually quite a bit slower. 4 of the 22 Rails regexps are slower, by between 20 and 250% percent. There are also this one: /.*_f/ =~ “_fxxxxxxxxxxxxxxxxxxxxxxx” which basically runs about 10x slower than JRuby trunk. Not nice at all.

In the end, the problem is backtracking. Since REJ is a straight port of the MRI code, the backtracking is also ported. But it seems that Java is unusually bad at handling that specific algorithm, and it performs quite badly. At the moment I’m continuing to look at it and trying to improve performance in all ways possible, so we’ll see what happens. Charles Nutter have also started to look at it.

But what’s really interesting is that I reran my Petstore benchmarks with the current REJ code. To rehash, my last results with JRuby trunk looked like this:

controller :   1.804000   0.000000   1.804000 (  1.804000)
view : 5.510000 0.000000 5.510000 ( 5.510000)
full action: 13.876000 0.000000 13.876000 ( 13.876000)

But the results from rerunning with REJ was interesting, to say the least. I expected bad results because of the bad backtracking performance, but it seems the other speed improvements weigh up:

controller :   1.782000   0.000000   1.782000 (  1.782000)
view : 4.735000 0.000000 4.735000 ( 4.735000)
full action: 12.727000 0.000000 12.727000 ( 12.727000)

As you can see, the improvement is quite large in the view numbers. It is also almost there compared to MRI which had 4.57. Finally, the full action is better by a full second too. Again, MRI is 9.57s and JRuby 12.72. It’s getting closer. I am quite optimistic right now, provided that we manage to fix the remaining problems with backtracking, our regexp engine might well be a great boon to performance.



Updated JRuby on Rails performance numbers


So, after my last post, several of us have spent time looking at different parts of Rails and JRuby performance. We have managed to improve things quite nicely since my last performance note. Some of the things changed have been JRuby’s each_line implementation, JRuby’s split implementation, a few other improvements, and some small fixes to AR-JDBC. After that, here are some new numbers for Petstore. Remember, the MRI numbers are for MRI 1.8.6 with native C MySQL. The JRuby numbers is with ActiveRecord-JDBC trunk and MySQL. (I’m only showing the best numbers of each). The number in bold is the most important one for comparison.

MRI   controller :   1.000000   0.070000   1.070000 (  1.430260)
JRuby controller : 1.804000 0.000000 1.804000 ( 1.804000)

MRI view : 4.410000 0.150000 4.560000 ( 4.575399)
JRuby view : 5.510000 0.000000 5.510000 ( 5.510000)

MRI full action: 8.260000 0.410000 8.670000 ( 9.574460)
JRuby full action: 13.876000 0.000000 13.876000 ( 13.876000)

As you can see, we are talking about 9.5s MRI to 13.8s for JRuby, which I find is a quite nice achievement if you look at the numbers from Friday. We are inching closer and closer. Both the view and the controller numbers are looking very nice. This is actually indicative of a nice trend – since general JRuby primitive performance is really good, the slowness in our Regular Expression engine is weighed up by much faster execution speed.

Once the port of Oniguruma lands, this story will almost certainly look very different. But even so, this is looking good.



JRuby discovery number one


After my last entry I’ve spent lots of time checking different parts of JRuby, trying to find the one true bottleneck for Rails. Of course, I still haven’t found it (otherwise I would have said YAY in the subject for this blog). But I have found a few things – for example, symbols are slow right now, but Bill’s work will make them better. And it doesn’t affect Rails performance at all.

But the discovery I made was when I looked at the performance of the regular expressions used in Rails. There are exactly 50 of them for each request, so I did a script that checked the performance of each of them against MRI. And I found that there was one in particular that had really interesting performance when comparing MRI to JRuby. In fact, it was between 200 and a 1000 times slower. What’s worse, the performance wasn’t linear.

So which regular expression was the culprit? Well, /.*?\n/m. That doesn’t look to bad. And in fact, this expression displayed not one, but two problems with JRuby. The first one is that any regular expression engine should be able to fail fast on something like this, simply because there is a string that always needs to be part of a string for this expression to match. In MRI, that part of the engine is called bm_search, and is a very fast way to fail. JRuby doesn’t have that. Marcin is working on a port of Oniguruma though, so that will fix that part of the problem.

But wait, if you grep for this regexp in the Rails sources you won’t find it. So where was it actually used? Here is the kicker: it was used in JRuby’s implementation of String#each_line. So, let’s take some time to look at a quick benchmark for each_line:

require 'benchmark'

str = "Content-Type: text/html; charset=utf-8\r\nSet-Cookie: aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa "

TIMES=100_000

puts "each_line on small string with several lines"
10.times do
puts(Benchmark.measure{TIMES.times { str.each_line{} }})
end

str = "abc" * 15

puts "each_line on short string with no line divisions"
10.times do
puts(Benchmark.measure{TIMES.times { str.each_line{} }})
end

str = "abc" * 4000

puts "each_line on large string with no line divisions"
10.times do
puts(Benchmark.measure{TIMES.times { str.each_line{} }})
end

As you can see, we simple measure the performance of doing a 100 000 each_line calls on three different strings. The first one is a short string with several newlines, the second is a short string with no newlines, and the last is a long string with no newlines. How does MRI run this benchmark?

each_line on small string with several lines
0.160000 0.000000 0.160000 ( 0.157664)
0.150000 0.000000 0.150000 ( 0.160450)
0.160000 0.000000 0.160000 ( 0.171563)
0.150000 0.000000 0.150000 ( 0.157854)
0.150000 0.000000 0.150000 ( 0.154578)
0.150000 0.000000 0.150000 ( 0.154547)
0.160000 0.000000 0.160000 ( 0.158894)
0.150000 0.000000 0.150000 ( 0.158064)
0.150000 0.010000 0.160000 ( 0.156975)
0.160000 0.000000 0.160000 ( 0.156857)
each_line on short string with no line divisions
0.080000 0.000000 0.080000 ( 0.086789)
0.090000 0.000000 0.090000 ( 0.084559)
0.080000 0.000000 0.080000 ( 0.093477)
0.090000 0.000000 0.090000 ( 0.084700)
0.080000 0.000000 0.080000 ( 0.089917)
0.090000 0.000000 0.090000 ( 0.084176)
0.080000 0.000000 0.080000 ( 0.086735)
0.090000 0.000000 0.090000 ( 0.085536)
0.080000 0.000000 0.080000 ( 0.084668)
0.090000 0.000000 0.090000 ( 0.090176)
each_line on large string with no line divisions
3.350000 0.020000 3.370000 ( 3.404514)
3.330000 0.020000 3.350000 ( 3.690576)
3.320000 0.040000 3.360000 ( 3.851804)
3.320000 0.020000 3.340000 ( 3.651748)
3.340000 0.020000 3.360000 ( 3.478186)
3.340000 0.020000 3.360000 ( 3.447704)
3.330000 0.020000 3.350000 ( 3.448651)
3.350000 0.010000 3.360000 ( 3.489842)
3.350000 0.020000 3.370000 ( 3.429135)
3.350000 0.010000 3.360000 ( 3.372925)

OK, this looks reasonable. The large string is obviously taking more time to search, but not incredibly much time. What about trunk JRuby?

each_line on small string with several lines
32.668000 0.000000 32.668000 ( 32.668000)
30.785000 0.000000 30.785000 ( 30.785000)
30.824000 0.000000 30.824000 ( 30.824000)
30.878000 0.000000 30.878000 ( 30.877000)
30.904000 0.000000 30.904000 ( 30.904000)
30.826000 0.000000 30.826000 ( 30.826000)
30.550000 0.000000 30.550000 ( 30.550000)
32.331000 0.000000 32.331000 ( 32.331000)
30.971000 0.000000 30.971000 ( 30.971000)
30.537000 0.000000 30.537000 ( 30.537000)
each_line on short string with no line divisions
7.472000 0.000000 7.472000 ( 7.472000)
7.350000 0.000000 7.350000 ( 7.350000)
7.516000 0.000000 7.516000 ( 7.516000)
7.252000 0.000000 7.252000 ( 7.252000)
7.313000 0.000000 7.313000 ( 7.313000)
7.262000 0.000000 7.262000 ( 7.262000)
7.383000 0.000000 7.383000 ( 7.383000)
7.786000 0.000000 7.786000 ( 7.786000)
7.583000 0.000000 7.583000 ( 7.583000)
7.529000 0.000000 7.529000 ( 7.529000)
each_line on large string with no line divisions

Ooops. That doesn’t look so good… And also, where is the last ten lines? Eh… It’s still running. It’s been running for two hours to produce the first line. That means that it’s taking at least 7200 seconds which is more than 2400 times slower than MRI. But in fact, since the matching of the regular expression above is not linear, but exponential in performance, I don’t expect this to ever finish.

There are a few interesting lessons to take away from this exercise:

  1. There may still be implementation problems like this in many parts of JRuby – performance will improve by quite much every time we find something like this. I haven’t measured Rails performance after this is fixed, and I don’t expect it to actually fix the whole problem, but I think I’ll see better numbers.
  2. Understand regular expressions. Why is /.*?\n/ so incredibly bad for strings over a certain length? In this case it’s the combination of .* and ?. What would be a better implementation in almost all cases? /[^\n]\n/. Notice that there is no backtracking in this implementation, and because of that, this regexp will have performance O(n) while the earlier one was O(n^2). Learn and know these things. They are the difference between usage and expertise.


Mystery: An exposé on JRuby performance


So, after Charlies awesome work yesterday (documented here), I felt it was time I put the story straight on general JRuby performance. It’s something of a mystery. I’ll begin by showing you the results of lots of different benchmarks and comparison to MRI. All these benchmarks have been run on my MacBook Pro, dual 2.33Ghz cores with 2Gb memory. JRuby revision is 4578, Java is Apple 1.6 beta, and Ruby version is 1.8.6 (2007-03-13 patchlevel 0). JRuby was ran with -J-server and -O.

Now, lets begin with the YARV benchmarks. The first values is always MRI, the second i JRuby. I’ve made the benchmark red when MRI is faster, and green when JRuby is faster.

bm_app_answer.rb:
0.480000 0.010000 0.490000 ( 0.511641)
0.479000 0.000000 0.479000 ( 0.478000)

bm_app_factorial.rb:
ERROR stack level too deep
2.687000 0.000000 2.687000 ( 2.687000)

bm_app_fib.rb:
5.880000 0.030000 5.910000 ( 6.151324)
2.687000 0.000000 2.687000 ( 2.687000)

bm_app_mandelbrot.rb:
1.930000 0.010000 1.940000 ( 1.992984)
2.752000 0.000000 2.752000 ( 2.876000)

bm_app_pentomino.rb:
84.160000 0.450000 84.610000 ( 88.205519)
77.117000 0.000000 77.117000 ( 77.117000)

bm_app_raise.rb:
2.600000 0.430000 3.030000 ( 3.169156)
2.162000 0.000000 2.162000 ( 2.162000)

bm_app_strconcat.rb:
1.390000 0.010000 1.400000 ( 1.427766)
1.003000 0.000000 1.003000 ( 1.003000)

bm_app_tak.rb:
7.570000 0.060000 7.630000 ( 7.888381)
2.676000 0.000000 2.676000 ( 2.676000)

bm_app_tarai.rb:
6.020000 0.030000 6.050000 ( 6.186971)
2.236000 0.000000 2.236000 ( 2.236000)

bm_loop_times.rb:
4.240000 0.020000 4.260000 ( 4.404826)
3.354000 0.000000 3.354000 ( 3.354000)

bm_loop_whileloop.rb:
9.450000 0.050000 9.500000 ( 9.678552)
5.037000 0.000000 5.037000 ( 5.037000)

bm_loop_whileloop2.rb:
1.890000 0.010000 1.900000 ( 1.936502)
1.039000 0.000000 1.039000 ( 1.039000)

bm_so_ackermann.rb:
ERROR stack level too deep
4.928000 0.000000 4.928000 ( 4.927000)

bm_so_array.rb:
5.580000 0.020000 5.600000 ( 5.709101)
5.552000 0.000000 5.552000 ( 5.552000)

bm_so_concatenate.rb:
1.580000 0.040000 1.620000 ( 1.647592)
1.602000 0.000000 1.602000 ( 1.602000)

bm_so_exception.rb:
4.170000 0.390000 4.560000 ( 4.597234)
4.683000 0.000000 4.683000 ( 4.683000)

bm_so_lists.rb:
0.970000 0.030000 1.000000 ( 1.036678)
0.814000 0.000000 0.814000 ( 0.814000)

bm_so_matrix.rb:
1.700000 0.010000 1.710000 ( 1.765739)
1.878000 0.000000 1.878000 ( 1.879000)

bm_so_nested_loop.rb:
5.130000 0.020000 5.150000 ( 5.258066)
4.661000 0.000000 4.661000 ( 4.661000)

bm_so_object.rb:
5.480000 0.030000 5.510000 ( 5.615154)
3.095000 0.000000 3.095000 ( 3.095000)

bm_so_random.rb:
1.760000 0.010000 1.770000 ( 1.806116)
1.495000 0.000000 1.495000 ( 1.495000)

bm_so_sieve.rb:
0.680000 0.010000 0.690000 ( 0.705296)
0.853000 0.000000 0.853000 ( 0.853000)

bm_vm1_block.rb:
19.920000 0.110000 20.030000 ( 20.547236)
12.698000 0.000000 12.698000 ( 12.698000)

bm_vm1_const.rb:
15.720000 0.080000 15.800000 ( 16.426734)
7.654000 0.000000 7.654000 ( 7.654000)

bm_vm1_ensure.rb:
14.530000 0.070000 14.600000 ( 15.137106)
7.588000 0.000000 7.588000 ( 7.589000)

bm_vm1_length.rb:
17.230000 0.090000 17.320000 ( 20.406438)
6.415000 0.000000 6.415000 ( 6.416000)

bm_vm1_rescue.rb:
11.520000 0.040000 11.560000 ( 11.736435)
5.604000 0.000000 5.604000 ( 5.603000)

bm_vm1_simplereturn.rb:
17.560000 0.080000 17.640000 ( 18.178065)
5.413000 0.000000 5.413000 ( 5.413000)

bm_vm1_swap.rb:
22.160000 0.110000 22.270000 ( 22.698746)
6.836000 0.000000 6.836000 ( 6.835000)

bm_vm2_array.rb:
5.600000 0.020000 5.620000 ( 5.675354)
1.844000 0.000000 1.844000 ( 1.844000)

bm_vm2_method.rb:
9.800000 0.030000 9.830000 ( 9.918884)
5.152000 0.000000 5.152000 ( 5.151000)

bm_vm2_poly_method.rb:
13.570000 0.050000 13.620000 ( 13.803066)
10.289000 0.000000 10.289000 ( 10.289000)

bm_vm2_poly_method_ov.rb:
3.990000 0.010000 4.000000 ( 4.071277)
1.750000 0.000000 1.750000 ( 1.749000)

bm_vm2_proc.rb:
5.670000 0.020000 5.690000 ( 5.723124)
3.267000 0.000000 3.267000 ( 3.267000)

bm_vm2_regexp.rb:
0.380000 0.000000 0.380000 ( 0.387671)
0.961000 0.000000 0.961000 ( 0.961000)

bm_vm2_send.rb:
3.720000 0.010000 3.730000 ( 3.748266)
2.135000 0.000000 2.135000 ( 2.136000)

bm_vm2_super.rb:
4.100000 0.010000 4.110000 ( 4.138355)
1.781000 0.000000 1.781000 ( 1.781000)

bm_vm2_unif1.rb:
3.320000 0.010000 3.330000 ( 3.348069)
1.385000 0.000000 1.385000 ( 1.385000)

bm_vm2_zsuper.rb:
4.810000 0.020000 4.830000 ( 4.856368)
2.920000 0.000000 2.920000 ( 2.921000)

bm_vm3_thread_create_join.rb:
0.000000 0.000000 0.000000 ( 0.006621)
0.368000 0.000000 0.368000 ( 0.368000)

What is interesting about these numbers is that almost all of them are way faster, and the ones that are slower are so by a quite narrow margin (except for the regexp and thread tests).

So, this is the baseline against MRI. Let’s take a look at a few other benchmarks. These can all be found in JRuby, in the test/bench directory. When you run them, all of them generate 5 or 10 runs of all measures, but I’ve simply taken the best one for each. The repetition is to allow Java to warm up. Here are a few different benchmarks, same convention and running parameters as above:

bench_fib_recursive.rb
----------------------
1.390000 0.000000 1.390000 ( 1.412710)
0.532000 0.000000 0.532000 ( 0.532000)


bench_method_dispatch.rb
------------------------
Control: 1m loops accessing a local variable 100 times
2.830000 0.010000 2.840000 ( 2.864822)
0.105000 0.000000 0.105000 ( 0.105000)

Test STI: 1m loops accessing a fixnum var and calling to_i 100 times
10.000000 0.030000 10.030000 ( 10.100846)
2.111000 0.000000 2.111000 ( 2.111000)

Test ruby method: 1m loops calling self's foo 100 times
16.130000 0.060000 16.190000 ( 16.359876)
7.971000 0.000000 7.971000 ( 7.971000)


bench_method_dispatch_only.rb
-----------------------------
Test ruby method: 100k loops calling self's foo 100 times
1.570000 0.000000 1.570000 ( 1.588715)
0.587000 0.000000 0.587000 ( 0.587000)


bench_block_invocation.rb
-------------------------
1m loops yielding a fixnum 10 times to a block that just retrieves dvar
2.800000 0.010000 2.810000 ( 2.822425)
1.194000 0.000000 1.194000 ( 1.194000)

1m loops yielding two fixnums 10 times to block accessing one
6.550000 0.030000 6.580000 ( 6.623452)
2.064000 0.000000 2.064000 ( 2.064000)

1m loops yielding three fixnums 10 times to block accessing one
7.390000 0.020000 7.410000 ( 7.467841)
2.120000 0.000000 2.120000 ( 2.120000)

1m loops yielding three fixnums 10 times to block splatting and accessing them
9.250000 0.040000 9.290000 ( 9.339131)
2.451000 0.000000 2.451000 ( 2.451000)

1m loops yielding a fixnums 10 times to block with just a fixnum (no vars)
1.890000 0.000000 1.890000 ( 1.908501)
1.278000 0.000000 1.278000 ( 1.277000)

1m loops calling a method with a fixnum that just returns it
2.740000 0.010000 2.750000 ( 2.766255)
1.426000 0.000000 1.426000 ( 1.426000)


bench_string_ops.rb
----
Measure string array sort time
5.950000 0.060000 6.010000 ( 6.055483)
8.061000 0.000000 8.061000 ( 8.061000)

Measure hash put time
0.390000 0.010000 0.400000 ( 0.398593)
0.208000 0.000000 0.208000 ( 0.209000)

Measure hash get time (note: not same scale as put test)
1.620000 0.000000 1.620000 ( 1.646155)
0.740000 0.000000 0.740000 ( 0.740000)

Measure string == comparison time
2.340000 0.010000 2.350000 ( 2.368579)
0.812000 0.000000 0.812000 ( 0.812000)

Measure string == comparison time, different last pos
2.690000 0.000000 2.690000 ( 2.724772)
0.860000 0.000000 0.860000 ( 0.860000)

Measure string <=> comparison time
2.340000 0.010000 2.350000 ( 2.369915)
0.824000 0.000000 0.824000 ( 0.824000)

Measure 'string'.index(fixnum) time
0.790000 0.010000 0.800000 ( 0.808189)
1.113000 0.000000 1.113000 ( 1.113000)

Measure 'string'.index(str) time
2.860000 0.010000 2.870000 ( 2.892730)
0.956000 0.000000 0.956000 ( 0.956000)

Measure 'string'.rindex(fixnum) time
0.800000 0.000000 0.800000 ( 0.817300)
0.631000 0.000000 0.631000 ( 0.631000)

Measure 'string'.rindex(str) time
12.190000 0.040000 12.230000 ( 12.310492)
1.247000 0.000000 1.247000 ( 1.247000)


bench_ivar_access.rb
----
100k * 100 ivar gets, 1 ivar
0.500000 0.000000 0.500000 ( 0.582682)
0.340000 0.000000 0.340000 ( 0.340000)

100k * 100 ivar sets, 1 ivar
0.700000 0.010000 0.710000 ( 0.816724)
0.402000 0.000000 0.402000 ( 0.401000)

100k * 100 attr gets, 1 ivar
0.970000 0.000000 0.970000 ( 0.988212)
0.875000 0.000000 0.875000 ( 0.874000)

100k * 100 attr sets, 1 ivar
1.390000 0.010000 1.400000 ( 1.406535)
1.114000 0.000000 1.114000 ( 1.114000)

100k * 100 ivar gets, 2 ivars
0.490000 0.000000 0.490000 ( 0.506206)
0.344000 0.000000 0.344000 ( 0.344000)

100k * 100 ivar sets, 2 ivars
0.680000 0.000000 0.680000 ( 0.693064)
0.388000 0.000000 0.388000 ( 0.388000)

100k * 100 attr gets, 2 ivars
0.970000 0.000000 0.970000 ( 0.989313)
0.878000 0.000000 0.878000 ( 0.878000)

100k * 100 attr sets, 2 ivars
1.400000 0.000000 1.400000 ( 1.434206)
1.129000 0.000000 1.129000 ( 1.128000)

100k * 100 ivar gets, 4 ivars
0.490000 0.000000 0.490000 ( 0.502097)
0.340000 0.000000 0.340000 ( 0.340000)

100k * 100 ivar sets, 4 ivars
0.690000 0.000000 0.690000 ( 0.696852)
0.389000 0.000000 0.389000 ( 0.389000)

100k * 100 attr gets, 4 ivars
0.970000 0.010000 0.980000 ( 0.986163)
0.872000 0.000000 0.872000 ( 0.872000)

100k * 100 attr sets, 4 ivars
1.370000 0.010000 1.380000 ( 1.394921)
1.128000 0.000000 1.128000 ( 1.128000)

100k * 100 ivar gets, 8 ivars
0.500000 0.000000 0.500000 ( 0.519511)
0.344000 0.000000 0.344000 ( 0.344000)

100k * 100 ivar sets, 8 ivars
0.690000 0.000000 0.690000 ( 0.710896)
0.389000 0.000000 0.389000 ( 0.389000)

100k * 100 attr gets, 8 ivars
0.970000 0.000000 0.970000 ( 0.987582)
0.870000 0.000000 0.870000 ( 0.870000)

100k * 100 attr sets, 8 ivars
1.380000 0.000000 1.380000 ( 1.400542)
1.132000 0.000000 1.132000 ( 1.132000)

100k * 100 ivar gets, 16 ivars
0.500000 0.000000 0.500000 ( 0.523690)
0.342000 0.000000 0.342000 ( 0.343000)

100k * 100 ivar sets, 16 ivars
0.680000 0.000000 0.680000 ( 0.707385)
0.391000 0.000000 0.391000 ( 0.391000)

100k * 100 attr gets, 16 ivars
0.970000 0.010000 0.980000 ( 1.017880)
0.879000 0.000000 0.879000 ( 0.879000)

100k * 100 attr sets, 16 ivars
1.370000 0.010000 1.380000 ( 1.387713)
1.128000 0.000000 1.128000 ( 1.128000)


bench_for_loop.rb
----
100k calls to a method containing 5x a for loop over a 10-element range
0.890000 0.000000 0.890000 ( 0.917563)
0.654000 0.000000 0.654000 ( 0.654000)

All of this is really good, of course. Lots of green. And the red parts are not that bad either.

But here is the mystery. General Rails performance sucks. This test case can be ran by checking out petstore from tw-commons at RubyForge. There is a file called script/console_bench that will run the benchmarks. The two commands ran was: ruby script/console_bench production, and jruby -J-server -O script/console_bench ar_jdbc. With further ado, here are the numbers:

controller :   1.000000   0.070000   1.070000 (  1.472701)
controller : 2.144000 0.000000 2.144000 ( 2.144000)
view : 4.470000 0.160000 4.630000 ( 4.794336)
view : 6.335000 0.000000 6.335000 ( 6.335000)
full action: 8.300000 0.400000 8.700000 ( 9.597351)
full action: 16.055000 0.000000 16.055000 ( 16.055000)

These numbers plainly stink. And we can’t find the reason for it. As you can see, most benchmarks are much better, and there is nothing we can think of that would add up to a general degradation like this. It can’t be IO since these tests use app.get directly. So what is it? There is very likely to be one or two bottlenecks causing this problem, and when we have found it, Rails will most likely blaze along. But profiling haven’t helped yet. We found a few things, but there is still stuff missing. It’s an interesting problem.

My current thesis is that either symbols or regexps are responsible. I’ll spend the day checking that.



Recent interviews


I have been doing several interviews recently, mostly in connection with my book. Here are the ones I know of right now:



Announcing JRuby/LDAP


I have just released JRuby/LDAP, a new project within JRuby-extras, that aim to be interface compatible with Ruby/LDAP. That is, if you have a Ruby project written for Ruby/LDAP, you should be able to run it on JRuby without any changes.

This is a first release and some of the more obscure functionalities haven’t been added yet.

Installation is easy:

jruby -S gem install jruby-ldap

If you’re interested in the source, it’s available on RubyForge, within the JRuby-extras project.

JRuby/LDAP uses JNDI to achieve nice LDAP access. You can also use standard Java ways (i.e. jndi.properties) to change the LDAP access implementation.

The code is BSD licensed.



The first fully functional Ruby compiler


It’s a glorious day. The JRuby compiler is now complete and functional. Without sacrificing interpreted mode. Read more in Charles blog, here.



Rubinius is important


I predict that parts of this blog posts will make certain people uncomfortable, annoyed and possibly foamy mouthed. If you feel that you’re of this disposition, please don’t read any further.

As I’m working on JRuby, I obviously think that JRuby is the best solution for many problems I perceive in the MRI implementation currently. I have been quite careful to never say anything along the lines that JRuby is better than anything else, though. I will continue with that stance. However, I won’t restrict myself in the same way regarding Rubinius.

In fact, I’m getting more and more convinced that for the people that don’t need the things Java infrastructure can give you, Rubinius is the most important project around, in Ruby-land. More than that, Rubinius is MRI done right. If nothing substantial changes in the current timeline and plans for Ruby 1.9.1, I predict that Rubinius will be the CRuby implementation of choice within 6 months. Rubinius is an implementation done the way MRI should have been. Of course, Matz have always focused on the language, not the implementation. I’m very happy about that, since it means that we have an outstanding language.

But still. Rubinius will win over MRI and YARV. I’ve had this thought for a while, and I’m finally more or less convinced that it’s true. Of course, there are a few preconditions. The first and most important one is that Rubinius delivers 1.0 as planned, by the end of the year and that it doesn’t have abysmal performance. Or if YARV would happen to be totally finished and perfectly usable in the same time frame, things might take a different turn.

Why is Rubinius so good, compared to the existing C implementations? There are a number of good reasons for this:

  • It is byte code based. This means it’s easier to handle performance.
  • It has a pluggable, very clean architecture, meaning that for example garbage collection/object memory can be switched out to use another algorithm.
  • It is designed to be thread safe (though this is not really true yet), and Multi-VM capable.
  • It works with existing MRI extensions.
  • Most of the code is written in Ruby.
  • It gives you access to all the innards, directly from your Ruby code (stuff like MethodContexts/BlockContexts, etc).
  • The project uses Valgrind to ensure that the C code written is bullet proof.

Anyway. I put my money on Rubinius. Of course, that doesn’t mean I don’t think JRuby have a place to fill in the eco system. In fact, the real interesting question is what will happen when both Rubinius and JRuby have become more mature. I’d personally love to see more cooperation and sharing between the projects. Not a merging, since the goals are too separate, but it would be wonderful if JRuby could use the same Ruby code for all the primitive operations as Rubinius does.

Right now we have a simple Rubinius engine in JRuby, that can interpret and run some simpler byte codes from Rubinius.

JRuby and Rubinius are both extremely important. Right now I believe JRuby is more important, since it opens up a totally different market for Ruby, and gives the benefits of Java to Ruby. Rubinius has another place to fill.

Of course, being who I am, I have also looked into what would be required to port Rubinius to Java, using the same approach directly instead of going through JRuby. If you decide to use Java’s Garbage Collector, Java Threads, and reuse the JRuby parser you would end up with about 40 files of C code to port. Most of these are extremely easy, and none is really that hard. And what you would end up with is something that would run the same things Rubinius does, but with the possibility of invoking Java code at the same time. (Of course, I hope that Evan reserves a block of about 8-16 bytecodes that can be implementation dependent – these Jubinius would use to interop with Java code).



Ruby+Erlang concurrency?


I keep reading from lots of people that you can’t bolt Erlang’s concurrency model on Ruby. But is this really true? MRI already has green threads. Adding a higher level of concurrency with the basic primitives of !, recv and spawn doesn’t seem like a gigantic project. The main problem would be to prohibit access to shared memory between the spawned green threads, and avoid the GIL. But that doesn’t seem to be that large of a problem. The main question is rather if this model would fit well with the Ruby language… Since Erlang was designed from the ground up with these primitives in mind, many of the libraries and functions work well with it. For example, pattern matching work exactly the same in recv as in function dispatch or case expressions.

On the other hand, the send, recv and spawn primitives in Gambit Scheme seems to work out really well even though the language is LISP in root (or maybe that’s the reason?)

In fact, this is one of the few places where it would be harder to add something to JRuby than MRI. Since we can’t control the full stack it would be very hard to implement anything resembling Erlang processes in Java. And Java threads would almost certainly be too heavy weight for this to work. Hmm.



What about Sun’s Ruby strategy?


Wow. Today was a strange day for blog reading. I’ve already had several WTF moments. Or what do you say about 7 reasons I switched back to PHP after two years on Rails, where the first and most important reason seemed to be

IS THERE ANYTHING RAILS/RUBY CAN DO THAT PHP CAN’T DO? … (thinking)… NO.

Bloody hell. All programming languages in use today are Turing complete. Of course you can do exactly the same things in all of them. But I still don’t program in Intercal that often.

Example number two: About JRuby and CPython performance. This blog post uses the Alioth language benchmark to show us that Python’s C implementation is faster than JRuby. Of course, the JRuby version used in this comparison is 1.0, which is now almost 4 months old. Comparing the C implementation of one language with a Java implementation of another language seems kind of suspect anyway.

But these two examples is nothing compared to a post by Krishna Kotecha from yesterday. It’s called Sun’s Ruby strategy – Engage and Contain?. You should ge read it. It’s actually quite amusing. I usually don’t respond to other blog posts, and I don’t usually quote in my blog. I’ll do an except here, because there are several points in that post I want to elaborate on.

Compromise definitely seemed to be an underlying theme at RailsConf Europe. DHH’s keynote downplayed the need for evangelism – something I strongly disagree with. Rails has certainly made a lot of progress towards wider acceptance, but we’ve got a really long way to go before more companies start to adopt it, and I certainly don’t think turning down the evangelism and doing stealth deployments via JRuby is the answer.

There are really two interesting points in this paragraph. First of all the question of evangelism. I would say it’s about time to turn it down. Actually, I’ve gotten the very real impression that the wild-eyed Rails evangelism is now turning people away from Rails rather than winning more “converts”. Telling people about the advantages of Rails is still something that needs to be done, but the full fledged Rails marketing machine has already done it’s work and should be turned down a notch.

The second point Krishna sneaked in there is about JRuby “stealth deployments”. I’m pretty sure no one will ever do a real stealth deployment, and I find that concept totally wrong.

At RailsConf Europe 2007 however, Dave didn’t even specifically discuss Rails – and this seems to have been at the behest of the conference organizers. If this is the case, then the Rails community is already in trouble. Is this the price of Sun’s ’support’: that the community is no longer able to freely discuss the platform and what work needs to be done to get it accepted in the enterprise on its own terms?

Where does this particular conspiracy theory come from? Is there any evidence whatsoever that the organizers of RailsConf wanted Dave to not speak about the shortcomings of Rails? And even if that were the case, what’s there to say that Sun is the reason for this? (Couldn’t it have been IBM or ThoughtWorks, who were also Diamond sponsors?)

I have real problems with this attitude and approach. Selling Rails and Ruby, as “just a Java library” is a massive disservice to the technology, and simply means enterprise customers and decision makers won’t evaluate Ruby on its own merits.

What is important to realize is that the argument “just a Java library” will only ever be used in the case of organizations where there are good technology arguments for using JRuby on Rails, but non-technical management are making decisions based on what is most safe at the moment (see the Blub Paradox by Paul Graham). In most cases the “just a Java library” argument is useful only when talking about conservative environments who are standardized on a certain platform. And believe me, Krishna, there are many places where Java is the only allowed technology to be deployed. But in most cases JRuby will work fine for those IS departments. Is it really a disservice to the technology to make Rails and Ruby into something that can be used in even wider domains, removing cruft and bloatware at all places possible? Is it a disservice for the technology to be used in places where it would never enter without the help of JRuby?

But JRuby is not the best answer for Rails and Ruby developers.

I don’t really understand this quote. Obviously Krishna have strong opinions about the subject, but stating something as a fact without telling the reasons for it being that way doesn’t feel that interesting.

Serious Rails deployments (Mongrel not some Java Application Server) within enterprise environments may be difficult to achieve, but with the right political backing and developer persistence it can be done.

I must say I find it interesting that Mongrel is viewed as a serious deployment option when compared with a standard Java application server. The question here isn’t how we should get enterprise environments to use Mongrel; rather, we should first decide if Mongrel really is a serious enough deployment environment for enterprises.

And this will benefit the whole Rails community – not just those who tie themselves to Sun’s technology platform.

I heard this rumor about Java being Open Source… Does that still mean tying yourself to “Sun’s technology platform”?

And also, the vendor with the most to lose if Rails really does fulfill its potential in enterprise environments.

Why? What exactly does Sun lose if Rails win? Sun is using Rails and benefiting from it. And be careful about this: Rails is Rails, no matter if it runs on JRuby, MRI, YARV, Rubinius, IronRuby, Ruby.NET, XRuby, Cardinal or any other Ruby implementation now or ever. Rails is Rails.

Maybe I’ll start to believe when they start promoting Ruby on Rails at JavaOne, as opposed to promoting JRuby on Rails at RailsConf.

I guess you didn’t attend JavaOne 2007, where both JRuby on Rails and Ruby on Rails had sessions, including promoting in one of the major keynotes. Sun is serious about Java being a multilingual platform. Of course they’re spending money on getting these languages working on Java, but Sun is also giving support to both Rubinius and MRI. Would they really do that if this conspiracy theory is correct? For more information about that particular data point, take a look at Tim Bray’s blog here: Rubinius Sprint.

Much more likely I think, is that we’ll see a Java based Rails alternative that ships with some new version of Java which has been designed to incorporate features from dynamic languages like Ruby and Python.

Java will almost certainly never ship with a web framework. That said, Phobos is one of the Sun projects for web development that uses JavaScript and incorporates features from Rails and Ruby, and also Python and other languages and frameworks.

And still, Sun doesn’t seem to have a problem with Rails and Phobos living side by side. GlassFish includes support for both. And the Rails support doesn’t make any changes to Rails, it doesn’t require you to do anything extra, except that your application should run in JRuby. The latest version basically allows you to say “glassfish_rails start” while standing in your application directory.

…what compromises are we making for Sun’s involvement…

Yeah. What compromises are we making for Sun’s involvement in the community? Except handling the fact that we get more commercial backing, more money in the ecosystem, more help from Sun engineers creating high quality Ruby code, a server that happens to host the SVN server for Ruby itself, and so on? Are these contributions? Sure. Are they commitments? Yeah. Are they something that will require compromises from the community? No, not really.

Sometimes I think that many in the Open Source world still panics as soon as a big company starts to make inroads. And yes, in many cases this hasn’t worked out well. But we gotta see to the facts too. Some companies we will never be able to trust, but Sun has definitely been on the right side of the Open Source fence for a long time. Come on, people.