Lexing Ruby.


It has become apparent that JRuby’s hand-coded lexer is a liability in the long run. The code is hard to maintain and probably suboptimal with regard to speed. So two weeks ago I decided to look into using a lexer generator. I haven’t had much time, but over the last few days I’ve started reimplementing the JRuby lexer with the help of JFlex.

The first parts have actually been really easy. I already have a simple version working, tied in with JRuby. There’s not much of the syntax there yet, but some of it works really well; well enough to check performance and see the road ahead. So, I have two test cases. The first one looks like this:

class H < Hash
end

puts H.new.class
puts H[:foo => :bar].class


and the second like this:


class H < Hash
  def abc a, b, *c
    1_0.times {
      puts "Hellu"
    }
  end

  def / g, &blk
    2.times do
      puts "well"
    end
  end
end

H.new.abc 1, 2
H.new/3


The first one tests a corner case in subclassing core JRuby classes. The second one is just nonsense to test different parts of the syntax. Both of these parse and run correctly on JRuby with the JFlex-based lexer. It’s not much, but it’s something to build on.

Regarding performance, I’ve created a test program that uses the JRuby parser to parse a specified file 10 x 10_000 times, reporting the time for each round and the total. I’ve run it on both the first and the second example, and both are about 8 to 10 percent faster with the new lexer. I also expect the improvement to be larger for bigger files. Right now the lexer keeps track of line number, offset and column number, which is also a performance drain. Removing that tracking gives about 2-3 percent more.
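A harness along those lines might look roughly like the sketch below. It is written in Ruby purely for illustration and is not the actual test program: `time_rounds` is a hypothetical helper, and the block passed to it is a stand-in for whatever invokes the JRuby parser on the file.

```ruby
require 'benchmark'

# Hypothetical helper: runs the given block `iterations` times per round,
# for `rounds` rounds, printing each round's wall-clock time and the total.
def time_rounds(rounds = 10, iterations = 10_000)
  timings = Array.new(rounds) do |round|
    elapsed = Benchmark.realtime { iterations.times { yield } }
    puts format('round %2d: %.3fs', round + 1, elapsed)
    elapsed
  end
  puts format('total:    %.3fs', timings.inject(0.0) { |sum, t| sum + t })
  timings
end

# Stand-in workload (an assumption); the real program would parse a file here.
source = "class H < Hash\nend\n"
timings = time_rounds(10, 1_000) { source.split("\n") }
```

Reporting every round separately, rather than only the total, makes it easier to spot warm-up effects in the first few rounds.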



Speed in JRuby.


I had really wanted my next post here to be a tutorial about how to get Camping working with ActiveRecord-JDBC and JRuby, but I managed to find two quite serious bugs related to blocks while trying things out. One of the bugs is really strange, and manifests in some of the Markaby CSS-Proxy internals. So, Markaby works, if you refrain from adding CSS class names to tags.

Since Monday I’ve been busy with various things in JRuby. I’ve looked around for easy performance fixes, of which there are loads. Many of them didn’t give much, but a few made a noticeable difference. One of these places was the loading process, where it’s important to do as few searches as possible, since most of them are really expensive. In a few situations JRuby actually parsed and ran a file several times. This happened in WEBrick, among others.

I’ve also implemented a Java version of Enumerable, which was very tricky to get working at first, since JRuby doesn’t really support calling methods that yield from Java. So, I had to devise a way of doing this. The generic case seems to be too hard to do right now, but the specific case of calling each and collecting the results in a List is finally working well.
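The idea of driving everything through each and a collected list can be sketched in Ruby itself. This is only an illustration of the approach, not the Java code inside JRuby; the module `CollectingEnumerable`, its method names and the `Countdown` class are all made up for the example.

```ruby
# Hypothetical helper module: every operation first drains `each` into an
# Array, then works on that Array without yielding to `each` again.
module CollectingEnumerable
  def collect_all
    items = []
    each { |item| items << item }
    items
  end

  def sorted
    collect_all.sort
  end

  def includes?(wanted)
    collect_all.include?(wanted)
  end

  def first_n(count)
    collect_all.first(count)
  end
end

# Hypothetical collection that only knows how to yield its elements.
class Countdown
  include CollectingEnumerable

  def initialize(from)
    @from = from
  end

  def each
    @from.downto(1) { |i| yield i }
  end
end

countdown = Countdown.new(5)
puts countdown.sorted.inspect
puts countdown.first_n(2).inspect
```

The price of this design is that each always runs to completion, even for operations that could stop early, which is the trade-off for only crossing the yield boundary in one place.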

I devised a test case where all Enumerable operations are used 2 or 3 times, and ran this test case 1_000.times. The result for trunk JRuby was ~7s, while the Java implementation of Enumerable took that down to ~5s, which is nice.

Anyway. That’s mostly all. I’m off on vacation now.



JvYAML status


I have finally started work on JvYAML again, and I hope to have a new release together within 2 weeks, which will contain JavaBean materialization and the completed emitter. Hopefully I’ll manage to integrate this with JRuby not long after.

It’s been really hard to get going with this, since I have so many different projects going simultaneously.



Some JRuby tidbits


I’ve spent some time this weekend taking a look at various JRuby issues.

  • Camping:
    The most interesting was trying to get Camping to work. Camping is a microframework for certain kinds of web applications. Very neat, but it uses lots of Ruby tricks to function. As such it’s a really good test of JRuby’s capabilities, but it’s quite hard to debug. I have got it working really well for basic applications, in the process finding a bug in our Hash#[] implementation that fortunately was easy to fix. But today I’ve unearthed something really hairy. The test case demands two classes with two methods each, and some really strange block tweaking happens in it. I hope Charlie is up to the challenge. Markaby (which Camping uses) needs this functionality to work.
  • Mongrel:
    I’ve started to take a serious look at the Mongrel support. Danny’s work on the parser seems really great, so I’ve added basic JRuby integration to our Subversion repository, and plan to try getting it to work with Mongrel-0.4 later on today.
  • ActiveRecord-JDBC:
    Last week I added some Oracle functionality to it, but this still needs some work. I’m thinking that we need to factor out some driver-specific functionality soon. Anyway, I released a version 0.0.1 gem so that people can easily use what functionality we have.
  • JvYAML:
    I’ve finally begun working on the Emitter in earnest again. I hope to have most of it finished by Friday, before I go on vacation.

Expect a tutorial on using JRuby and Camping together very soon; hopefully tomorrow. And a few days after that, maybe we can showcase a functional Mongrel in JRuby too?



Announcing Swedish Rails


I would like to announce my plugin Swedish Rails.

Rails provides many goodies and helpers for you, but some of these are dependent on your user interface being in English. Date controls, for example. Pluralization is another. And have you ever tried to capitalize a string with Swedish letters in it? Then you know it doesn’t work that well. Until now, that is. You install this plugin, make sure all your Swedish source uses UTF-8, and most of these problems disappear. Downcase, upcase, swapcase and capitalize work as expected. Integer#ordinalize gives Swedish ordinals. And month names and weekday names are in Swedish.
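To give a feel for the case-conversion part, here is a minimal sketch. It is not the plugin’s code: the module `SwedishCase` is hypothetical, and it assumes a Ruby whose strings are character-aware (1.9 or later). On current Rubies String#upcase already handles å, ä and ö, so the extra `tr` step only matters on interpreters whose case methods are ASCII-only, as they were when this was written.

```ruby
# encoding: utf-8

# Hypothetical module illustrating the idea; not the plugin's actual code.
module SwedishCase
  LOWER = 'åäö'
  UPPER = 'ÅÄÖ'

  # Let Ruby handle a-z, then map the Swedish letters explicitly.
  def self.upcase(text)
    text.upcase.tr(LOWER, UPPER)
  end

  def self.downcase(text)
    text.downcase.tr(UPPER, LOWER)
  end

  def self.capitalize(text)
    return text.dup if text.empty?
    upcase(text[0, 1]) + downcase(text[1..-1])
  end
end

puts SwedishCase.capitalize('ärlig')
puts SwedishCase.upcase('gävle')
```

Porting the same idea to another language would mostly be a matter of changing the two letter tables.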

This plugin isn’t only for people looking to create Swedish Web interfaces. It can also act as a map for creating a plugin like this for any western language. Just replace all the text strings in this plugin, change the module name and you’re set to go. (Of course you’ll have to know the language you’re porting to, though.)

The plugin can be found at:
http://svn.ki.se/rails/plugins/swe_rails
and some information here:
http://opensource.ki.se/swe_rails.html

The plugin itself is fairly simple. Most of it monkey patches different parts of Ruby proper or aspects of Rails. One of the more interesting parts (which could also be usable by itself) translates all internal month and weekday names into Swedish. This is probably the most useful part when doing Rails GUIs, since you don’t have to create your own date_select tag, and all the other variations on this. I’ve also added some inflector rules, but only for the pluralize helper (so don’t start naming database tables in Swedish, please!). The problem is that English pluralization rules are pretty simple. Just tuck an -s on the end and hope for the best. Swedish uses five or six declensions for nouns, and most of these are irregular. This makes it slightly hard to describe inflection rules for Swedish, but I gave it a try anyway.

So, enjoy! I just wish I had done this a year ago. Then I wouldn’t have had to write subtly different versions of this code all over the place.



Rails, SOAP4R and Java.


I’ve spent the last three weeks working part time on a project called LPW at work. LPW is a set of web services that talk with the Swedish student databases. The libraries are implemented in Java and deployed with Axis. My project was to create a web system that students can use to access various LPW services. We had an old implementation of this, written in Java with uPortal as a framework, but for the new implementation we decided that using Ruby on Rails would be interesting and probably worthwhile.

I’ve spent about 40 hours on the implementation. I did everything except the user interface; I was given finished HTML pages and integrated these with the system. Initially we expected the complete development to take about 120-150 hours with Ruby and about 250 if we did it in Java. In retrospect, I’m pretty sure the Java version would have taken more than that; probably between 350 and 400 hours. Since I wrote the first version in Java I can say this pretty accurately.

So, did I have any interesting experiences while writing this system? Oh yes; otherwise I wouldn’t be writing. First of all, I managed to release two plugins in the process of this project. More information about them can be found in older blog entries. I also tried Mongrel for the first time, and I just have to say that I will never go back to WEBrick.

But the most interesting part of using Ruby was that I had to get SOAP4R and Axis to work together. In the process I found some interesting things. Let me describe the relevant layout of my application.

The project goal was to implement 4-6 different services, which are all available as separate WSDL files. I generated the drivers for these with wsdl2ruby. After generating the client for each, I renamed the driver.rb to, for example, addr_driver.rb, and created a file called lpw_valueobjects.rb where I put all value objects found in defaultDriver.rb. I repeated this process for each service, and then put all these files into the directory RAILS_ROOT/vendor/lpw.

Now, hitting one or two web services each time a student goes to a page isn’t really realistic, so I decided early to implement some way of caching the results of the service calls. Since the data in question is pretty static this works well.

So, to integrate the services into Rails, I created a base class called LPWObject in the models directory. In this class I defined self.wsdl_class and a few other helpers that let me transparently handle the caching of service classes without actually having them inside the LPWObject itself. This becomes important when I want to save the results in the session. Ruby can’t marshal objects which have singleton methods, and WSDL2Ruby depends heavily on singleton methods for the service client.

The first problem with getting Ruby and Java to work over SOAP was with Swedish characters. When I added something like “Gävlegatan 74” to an attribute and then tried to send this over SOAP, Soap4R transformed the attribute type into Base64 instead of xsd:String. Suffice it to say, Axis didn’t really like this. The solution was to add $KCODE = "UTF8" to environment.rb. This lets Soap4R believe that åäöÅÄÖ are part of regular strings.

The next problem came when I tried to save some of the value objects into the session. After looking for a long while, I found that if the value object had an attribute called “not”, WSDL2Ruby didn’t generate an
attr_accessor :not
for it until runtime, which creates singleton methods on the value object. The solution was to add these accessors myself. I’m not sure why wsdl2ruby does it like this, but there is probably some weird interaction with the not keyword in Ruby.
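The marshalling limitation is easy to reproduce in isolation. The sketch below uses a made-up `Address` class as a stand-in for a generated value object, and `extra` as a made-up name for an accessor added to a single instance at runtime; only the Marshal behaviour itself is standard Ruby.

```ruby
# encoding: utf-8

# Hypothetical value object standing in for a generated one.
class Address
  attr_accessor :street
  attr_accessor :not   # declared up front, on the class itself
end

plain = Address.new
plain.street = 'Gävlegatan 74'
# Accessors defined on the class do not get in the way of marshalling.
restored = Marshal.load(Marshal.dump(plain))
puts restored.street

patched = Address.new
# A method added to this one instance at runtime is a singleton method.
patched.define_singleton_method(:extra) { @extra }

dump_error =
  begin
    Marshal.dump(patched)
    nil
  rescue TypeError => e
    e
  end
puts "dump failed: #{dump_error.message}" if dump_error
```

Declaring the accessor on the class, as in the workaround above, keeps the instances free of singleton methods and therefore storable in the session.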

The final problem (I’m not sure whether to blame Axis or Soap4R for it) came when one of the value objects contained a byte array with a PDF file. For some reason Axis sends the regular response XML looking fine, but before and after it there is some garbled data. It looks like Axis actually sends the byte data as an attachment too, not just inside the byte array. Soap4R didn’t handle this at all, and I got an “illegal token” from the XML parser. The solution to this problem is the worst one, and I should really send a bug report about this, but I haven’t had the time yet. Anyway, to fix it, I monkeypatched SOAP::HTTPStreamHandler, and aliased the method send. Inside my new send method I first call the old one, then use a regexp to extract only the XML part of the response and reset the receive_string to this. It fixes my problem and works fairly well, but it isn’t pretty.
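The shape of that patch, alias the old method and then wrap it, can be sketched without SOAP4R at all. Everything below is a stand-in: `StubStreamHandler`, `Response`, the regexp and the sample payload are assumptions made for the example, and only the name receive_string is taken from the description above.

```ruby
# Hypothetical response object with the one attribute the patch touches.
Response = Struct.new(:receive_string)

# Hypothetical handler standing in for the real stream handler class.
class StubStreamHandler
  def send(_payload)
    Response.new("--part\r\njunk\r\n" \
                 "<?xml version=\"1.0\"?><Envelope><Body>ok</Body></Envelope>" \
                 "\r\n--part\r\n%PDF-binary-noise")
  end
end

# The monkey patch: reopen the class, keep the old method under a new name,
# and define a replacement that cleans up the response.
class StubStreamHandler
  XML_PART = /<\?xml.*<\/Envelope>/m   # assumed pattern for the XML portion

  alias_method :send_without_cleanup, :send

  def send(payload)
    response = send_without_cleanup(payload)
    xml = response.receive_string[XML_PART]
    response.receive_string = xml if xml   # leave it untouched if nothing matched
    response
  end
end

cleaned = StubStreamHandler.new.send('request body')
puts cleaned.receive_string
```

Falling back to the untouched string when the regexp does not match keeps ordinary responses working exactly as before.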

So, in conclusion, Soap4R and Axis seem to work well together, except for a few corner cases. I’m really happy about a project that could’ve taken 10 times longer to complete, though.



InPlaceEditor with Autocompletion.


Since the AJAX controls in Rails are so neat, and the InPlaceEditor is probably the neatest since it’s so easy and useful, I immediately decided to start fixing one small feature I needed/wanted very much. That is, autocompletion on InPlaceEditor fields. So, I’ve created a plugin that does this. It was actually much easier than I thought, since the prototype JavaScript library makes JavaScript almost painless to work with. Not quite nice, but not bad either.

Anyway, it’s dead simple to use: do it like you’ve done it with InPlaceEditor and you’ll be fine. More information can be found here (I’ve finally convinced my employer to host a place where we can release open source, so it can be seen that KI actually supports open source). The Subversion path is: http://svn.ki.se/rails/plugins/in_place_completer.

Much joy!



CAS Rails filter


Yesterday I released a Rails filter for doing authentication with CAS. This is really neat, but it’s the Rails plugins that make it so neat, since you just have to install the plugin, add three configuration parameters and everything will just work out of the box with all your controllers protected by authentication.

During development it’s often practical to not do real authentication, but rather just get a username back as if you’d actually been authenticated already. This can be easily accomplished in three ways with the filter, through a configuration option:

CAS::Filter.fake = "testuid"

which returns the string provided as a username

CAS::Filter.fake = :param

which takes the value of params[:username] and returns this as the authenticated user, and lastly

CAS::Filter.fake = lambda { |controller| ['testuser1', 'testuser2', 'testuser3'][rand(3)] }

which invokes the proc every time the filter is called, and uses the string returned as username.
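How a filter might turn those three kinds of value into a username can be sketched as follows. This is an illustration only, not the plugin’s implementation: `fake_username` and `FakeController` are hypothetical names, and the real filter works with actual Rails controllers.

```ruby
# Hypothetical stand-in for a controller; only `params` is needed here.
FakeController = Struct.new(:params)

# Hypothetical resolver: returns the username implied by the `fake` setting.
def fake_username(fake, controller)
  case fake
  when String then fake                          # fixed username
  when :param then controller.params[:username]  # taken from the request
  when Proc   then fake.call(controller)         # computed on every call
  end
end

controller = FakeController.new({ :username => 'alice' })
users = ['testuser1', 'testuser2', 'testuser3']

puts fake_username('testuid', controller)
puts fake_username(:param, controller)
puts fake_username(lambda { |_c| users[rand(3)] }, controller)
```

Dispatching on the type of the setting keeps the configuration to a single assignment while still allowing arbitrary logic through the lambda form.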

Anyway, authentication is just one of those things that you don’t want to think about. It should just be there. And now it is:

script/plugin install http://svn.ki.se/rails/plugins/cas_auth

Enjoy.



Rails and plugin irritation.


I’ve just started work on a medium sized Rails-application (100-200 hours), and about the first thing I wanted to do was to add my CAS filter (see next posting) to the system. Now, what I’d really like is to be able to set the filter to fake authentication while doing development on my box but going to real authentication when in production. I first tried using $RAILS_ROOT/config/environments/development.rb to set this option, but this doesn’t work. Plugins are loaded really late in the startup sequence so my best choice is to add

load "plugin/#{ENV['RAILS_ENV']}.rb" if File.exist?("plugin/#{ENV['RAILS_ENV']}.rb")

at the end of environment.rb and add my specific configuration to $RAILS_ROOT/plugin/development.rb

This works, of course, but it is something that I thought would already be part of the startup process. Oh well, you can’t have everything.



The perils of hashCode


A few days ago, two colleagues and I tried to track down a very tricky bug. After some hours of looking, we finally found it, and it was actually due to a misconception that I had about the workings of HashSet and HashMap. I’m not sure if I’m the only one that didn’t know this, but it’s very logical once you’ve found it out. You see, if you save an object in a HashSet, and then change the object in such a way that its hashCode changes, then you won’t find that object in the Set anymore. It will still be there, and you will still iterate over it, but if you ask, for example, set.contains(obj), it will return false. If you iterate over the set and call Iterator#remove, this will silently fail to remove anything, since the HashSet can’t find the object you want to remove. So, if you save things in a HashSet or use them as keys in a HashMap, make sure that the object is immutable, otherwise you’ll get extremely hard-to-find bugs.
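The same pitfall exists for mutable keys in a Ruby Hash, which makes for a compact demonstration. Note that this is a Ruby analogue of the problem, not Java’s HashSet; the variable names are made up for the example.

```ruby
# Sixteen array keys, each mapped to its first element.
keys = Array.new(16) { |i| [i, i] }
index = {}
keys.each { |key| index[key] = key.first }

victim = keys[3]
before = index[victim]                        # found while the key is unchanged

victim << 99                                  # mutating the key changes victim.hash

lookup_after   = index[victim]                # hashed lookup no longer finds it
known_after    = index.key?(victim)
still_iterated = index.keys.include?(victim)  # iteration still sees the entry
deleted        = index.delete(victim)         # removes nothing
size_after     = index.size

index.rehash                                  # re-index entries by their current hash
lookup_rehashed = index[victim]

puts "before mutation:  #{before.inspect}"
puts "after mutation:   #{lookup_after.inspect}, key?=#{known_after}, iterated=#{still_iterated}"
puts "after rehash:     #{lookup_rehashed.inspect}"
```

Ruby at least offers Hash#rehash as a repair; with HashSet and HashMap the safe options are immutable keys, or removing the entry before changing it and adding it back afterwards.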

Incidentally, one of the best newsletters about Java programming wrote about this issue ages ago. Regardless of whether you work with Java professionally or just for fun, I implore you to subscribe to JavaSpecialists by Dr Heinz Max Kabutz. It can be found here.