That Rails Thing


I need to join the fray. Pat Eyler has announced a competition together with APress. The first installment is about how Rails has made me a better programmer. Details to be found here.

So, let’s get at it. Rails has inspired and affected me in numerous ways. You have to realize that I have been programming web applications in Java for way too many years. Before that I did both ASP and PHP. Nothing ever felt natural. I’ve tried many of the Lisp frameworks (and Uncommon Web is really cool). I’ve embraced continuation-based frameworks. But none felt so powerful, yet nonintrusive. Rails embodies the best parts of Ruby.

For me, Rails is very interesting for several reasons. First and foremost is the fact that it is an incredibly good framework. It’s really amazing how good it is. And that in itself acts as an inspiration. To know that software can be this good makes me want to strive to perfect my APIs, make my libraries more usable, and find new and novel ways to improve my DSLs.

But that’s only part of it. The flip side of this is that Rails is not perfect. In fact, there are myriad ways it can be improved. And that gives me hope, because it also means that no matter how good my code gets, it can always get better. And that means I’ll never be totally bored!

In a very literal sense, Rails has made me a better programmer in another way. Since Ruby on Rails was the first Ruby-application to make it into Karolinska Institutet, it means I can thank Rails for being able to write more Ruby at work. And writing Ruby instead of Java must make you a better programmer, neh?

Of course, Rails is an excellent testing tool for seeing how far we have left with JRuby. It goes without saying that you become a better programmer by implementing a language…

I have several applications and libraries that I keep in mind when writing my code. Those libraries act as a gauge against which I measure my code. Code quality, testing ability, interface, sheer trickiness; in all these areas, Rails is one of the top frameworks I know.



Another neat trick


Another sweet thing, that I’ve actually wondered idly about from time to time, is how you could get “provided?”-style parameters in Ruby. If you have used a language with this functionality, like Common Lisp for example, you know that this can be highly useful.

And just the other day, while reading Hal Fulton’s excellent (but poorly proofread) The Ruby Way (2nd ed), I found the way. It looks like this:

def foo(bar, baz = (baz_provided=true; 42))
  p [bar,baz,baz_provided]
end

foo(13,47)
foo(13)
foo(13,42)

This will provide the output:

[13, 47, nil]
[13, 42, true]
[13, 42, nil]

which illustrates the usage pretty well. The magic, of course, lies in the fact that default parameters are evaluated, just the same as anything else. You could do some very crazy things inside that default value specification if you wanted. Anyway, just a nugget.
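One place where this is handy is telling an explicitly passed value apart from an omitted one, even when the two are equal. A minimal sketch of that, where the method name greet and the flag greeting_omitted are made up for illustration:

```ruby
# Hypothetical example: greet and greeting_omitted are illustrative names.
# The default expression only runs when the caller leaves the argument out,
# so the flag becomes true only in that case and stays nil otherwise.
def greet(name, greeting = (greeting_omitted = true; "Hello"))
  source = greeting_omitted ? "default" : "explicit"
  "#{greeting}, #{name} (#{source})"
end

puts greet("Ola")            # prints: Hello, Ola (default)
puts greet("Ola", "Hello")   # prints: Hello, Ola (explicit)
```

Note that the flag here is inverted compared to a Common Lisp supplied-p parameter: it is set when the argument is missing, not when it is given.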



A very small Ruby method


I haven’t had much time or inclination for blogging lately. Not much happening at the moment, actually. But I came up with one small thing I wanted to document. Just a practical thing for certain situations. Basically it’s a with-method, that works fairly well. It’s nothing magical and the trick is basic. It’s more or less an alias, actually:

module Kernel
  def with(obj = nil, &block)
    (obj || self).instance_eval(&block)
  end
end

with("abc") do
  puts reverse
end

"abc".with do
  puts reverse
end

As you can see, instance_eval can be used like JavaScript’s or VB’s with. This is nice for the simple reason of documentation. I find this usage much easier to read and understand than most usages of instance_eval that I’ve seen.
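One thing to keep in mind is that instance_eval changes self inside the block, so instance variables of the calling object are not visible there. A small sketch of that effect; the Report class and its @title variable are invented for the illustration:

```ruby
module Kernel
  def with(obj = nil, &block)
    (obj || self).instance_eval(&block)
  end
end

# Hypothetical class, only here to show what self is inside the block.
class Report
  def initialize
    @title = "outer"
  end

  def run
    # Inside the block self is the string "abc", so reverse is String#reverse
    # and @title is looked up on the string, where it is unset (nil).
    with("abc") { [reverse, @title] }
  end
end

p Report.new.run   # prints: ["cba", nil]
```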

So, that’s it for today. I’ll probably be back soon with some recent JRuby developments too.



Some notes from the Stockholm Rails meet


Last night (Wednesday) about 35 developers with an unhealthy interest in Rails met up in Stockholm, at Valtech’s offices, to share some experiences and talk shop.

It was very interesting and fun to meet people I can relate to. We ended up talking programming languages for a few hours after the main event ended.

Peter Marklund did a presentation on a CRM system in Rails, and Christian and Albert from Adocca talked about caching, and handling a Rails application that needs to scale into the millions. Nice stuff.

Last, I did an improvised presentation on LPW, a small Rails application that is fed its main data through Web Services instead of ActiveRecord. I believe it went fairly well, even though I had no slides and almost no preparation… =)



Nooks and Crannies of Ruby


There are many small parts of Ruby, tips, tricks and strange things. I thought that I would write about some of the more interesting of these, since some of them are common idioms in the Ruby community. The basis for the information is as always from the Pick-axe, but how these things are used in real life comes from various places.

The splat operator

The asterisk is sometimes called the splat operator when not used for multiplication. It is used in two different places for opposite cases. When on the right hand side of an expression, it is used to convert an array into more than one right hand value. This makes splicing of lists very easy and nice to do.

a,b,c = *[1,3,2]

Second, it’s used on the left hand side to collect more than one right hand value into an array:

*a = 1,3,2

It makes no difference whether you’re calling a method or assigning variables. What matters, as usual with programming languages, is that there is a left hand side and a right hand side (lhs and rhs from now on):

def foo(a,*b)
  p b
end

foo 1,2,3,*[4,5,6]
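The two directions can be checked side by side with a short sketch. The method name tail_of is made up for the illustration:

```ruby
# lhs splat: collect the remaining right hand values into an array
first, *rest = 1, 3, 2
p first   # prints: 1
p rest    # prints: [3, 2]

# rhs splat in a method call: the array is spread into separate arguments,
# and the lhs splat in the parameter list collects everything after a
def tail_of(a, *b)
  b
end

p tail_of(1, 2, 3, *[4, 5, 6])   # prints: [2, 3, 4, 5, 6]
```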

This is all old news, and not very exciting. It’s useful and the basis for some niceties, but nothing overwhelming. The thing that is really nice about the rhs version of the splat operator is what it does if the value it’s applied to isn’t an array. Basically, the interpreter first checks if there is a to_ary-method available. If not, it goes for the to_a method. Now, Kernel has a default to_a-method so all objects will respond to to_a. Calling this method directly is deprecated, but if it is called through splat or Kernel#Array it doesn’t generate a warning. So:

a = *1

will result in the same thing as

a = 1

except for jumping through some unnecessary hoops underneath the covers. But say that you have an object that implements Enumerable and you want to do something with it, for instance transform a Hash into an array of 2-element arrays. You can do it like this:

*a = *{:a=>1,:b=>2}

Now, this still isn’t that useful. Oh, it’s slightly useful but there is a method in Hash that does this too. But say that we have a file object:

*a = *open('/etc/passwd')

Since File includes Enumerable, it also has a to_a method which creates the array by using each to iterate and collect all elements. In this case all the lines in the file.
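The same effect can be tried without touching a real file. The sketch below uses StringIO from the standard library as a stand-in for the File object (that substitution is my assumption, as is the sample data); it includes Enumerable in the same way:

```ruby
require 'stringio'

# StringIO stands in for an opened file; each yields one line at a time
io = StringIO.new("root:x:0\nola:x:1000\n")

# Splatting inside an array literal goes through to_a, which uses each
lines = [*io]
p lines   # prints: ["root:x:0\n", "ola:x:1000\n"]

# The Hash case from above, written the same way
pairs = [*{:a => 1, :b => 2}]
p pairs   # prints: [[:a, 1], [:b, 2]]
```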

Camping uses the splat operator in many places, mostly with the common idiom of taking any arguments offered and passing them all on as separate arguments again:

def foo(*args)
  bar(*args)
end

Symbols and to_proc

I hesitate to use the word neat, but I can’t really find anything that better describes the sweet, sweet combination of symbols and to_proc. I’m going to show you a small example of how it’s used before I explain this very common practice:

[1e3,/(foo)/,"abc",:hoho].collect &:to_s

Now, this code will not run without a small addition to your code base. But first of all, let’s just walk through the code. First we define a literal array that contains four elements of different types. One Float, one Regexp, a String and a Symbol. Then we call collect to make a new array out of this. But where we usually provide collect with a block, we instead see the ampersand that symbolizes that we want to turn a Proc-object into a block argument for a method. But what comes next is not a variable, but a symbol. So, what happens? Well, the ampersand checks if the value provided to it is a Proc, and if not it calls to_proc on the value in question, if such a method is defined. And how should this method look? Like this:

class Symbol
  def to_proc
    lambda { |o| o.send(self) }
  end
end

Now, this method is nothing much. But it employs some fun trickery. It first creates a Proc by calling Kernel#lambda with a literal block. This block takes one argument, and the block calls the method send on the argument with self as argument. As self in this case would be a symbol, and specifically the symbol :to_s in the above example, the end result is that the Proc returned will call to_s on each object yielded to the block. So, with this explanation it’s easier to understand what the first example does. In effect it is exactly the same as

[1e3,/(foo)/,"abc",:hoho].collect {|v| v.to_s}

but without that nasty duplication of the v-argument. It’s not a big saving, but many small savings…

I recommend installing facets, which includes numerous small, nice solutions like this. They can also be required separately, so if you have facets installed, just require 'facet/symbol/to_proc' to get this specific functionality included.
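The ampersand conversion is not tied to Symbol; any object that responds to to_proc can be handed over in place of a block. A small sketch with an invented Doubler class shows the mechanism without reopening Symbol. The last line assumes a Ruby where Symbol#to_proc is available, either built in or through an addition like the one above:

```ruby
# Hypothetical class: anything with a to_proc method works after the ampersand
class Doubler
  def to_proc
    lambda { |x| x * 2 }
  end
end

doubled = [1, 2, 3].collect(&Doubler.new)
p doubled   # prints: [2, 4, 6]

# Relies on Symbol#to_proc being defined
strings = [1e3, "abc", :hoho].collect(&:to_s)
p strings   # prints: ["1000.0", "abc", "hoho"]
```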

Using operators as method names

Ruby allows many more operators to be redefined than most languages. This makes some interesting tricks possible, but most importantly it can make your code radically more readable. An excellent example of this can be found in the net/ldap-library (available as ruby-net-ldap from RubyGems). Now, LDAP uses something called filters for searching, and the syntax for filters is basically prefix notation with ampersand, pipe and exclamation mark for and, or and not, respectively. Now, with the net/ldap-library you can define a combined filter like this:

include Net
f = (LDAP::Filter.eq(:cn,'*Ola*') & LDAP::Filter.eq(:mail,'*ologix*')) |
    LDAP::Filter.eq(:uid,'olagus')

This defines a filter that basically says: find all entries where cn is ‘*Ola*’ and mail is ‘*ologix*’, or uid is ‘olagus’. Thanks to the infix operators this is very readable, and easy to understand for anyone who knows LDAP.
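To see how little is needed for this kind of interface, here is a minimal sketch of a filter class that builds the prefix-notation string from infix operators. MiniFilter is a made-up class for illustration and is not how net/ldap implements its filters:

```ruby
# Hypothetical stand-in for a filter class; not the net/ldap implementation
class MiniFilter
  attr_reader :expr

  def initialize(expr)
    @expr = expr
  end

  # Builds a simple equality filter such as (cn=*Ola*)
  def self.eq(attribute, value)
    new("(#{attribute}=#{value})")
  end

  # a & b becomes the prefix form (&a b)
  def &(other)
    MiniFilter.new("(&#{expr}#{other.expr})")
  end

  # a | b becomes the prefix form (|a b)
  def |(other)
    MiniFilter.new("(|#{expr}#{other.expr})")
  end

  def to_s
    expr
  end
end

f = (MiniFilter.eq(:cn, '*Ola*') & MiniFilter.eq(:mail, '*ologix*')) |
    MiniFilter.eq(:uid, 'olagus')
puts f   # prints: (|(&(cn=*Ola*)(mail=*ologix*))(uid=olagus))
```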

The next example comes from Hpricot, where _why puts the slash to good use:

doc = Hpricot(open("http://redhanded.hobix.com/index.html"))
(doc/"span.entryPermalink").set("class", "newLinks")

Note how neatly doc/"span…" fits in; it looks like XQuery, or any other path query syntax. But it’s just regular Ruby code and the slash is just a method call. I’m really sad that /. isn’t allowed as a method in this way… =)

Now, according to the Pickaxe, all of these infix operators will be translated from arg1 op arg2 into arg1.op(arg2). But Ruby still needs to be able to parse everything. This means that most operators need to have one required argument. Trying this with a home-defined *-operator will not work:

x = a *

But, an experimental syntax for importing packages in JRuby actually used this effect:

import java.util.*

This is just a simple exploitation of the fact that * is a regular method name and, used like this, will be parsed by Ruby as one too, which means it doesn’t need an argument. So, which operators are available for your leisure? According to the Pickaxe, these are [], []=, **, !, ~, + (unary), - (unary), *, /, %, +, -, >>, <<, &, ^, |, <=, <, >, >=, <=>, ==, ===, !=, =~, !~.
Note that the method names when implementing the unary + and - are +@ and -@:

class String
  def -@
    swapcase
  end
end

The most important thing to remember when reusing operators like this is to not overdo it. Use it where it makes sense and is natural but not elsewhere. Remember that Ruby code should follow the principle of least surprise. The above example of using unary minus to return a swapcased version of the string is probably not obvious enough to warrant its use, for example.

Using lifecycle methods to simplify daily life

Inversion of control is all the rage in the Java world right now, but using callbacks of all kinds has always been a great way to make code readable and compact. The Observer pattern is used in many places, and I suspect it’s implemented without any knowledge of the pattern in most places.

Ruby contains a few callback methods and lifecycle hooks that make life that much easier for the Ruby library writer. Probably the most useful of these is Module#included. Basically, this is a method you define like this:

module Enumerable
  def self.included(mod)
    puts "and now Enumerable has been used by #{mod.inspect}..."
  end
end

It will be called every time the module is included somewhere else.
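A common use of this hook is to add class-level methods to whatever includes the module. The following is a sketch of that idiom; the names Auditable, ClassMethods, auditable? and Invoice are all invented for the example:

```ruby
# Hypothetical module: including it also extends the includer with ClassMethods
module Auditable
  def self.included(base)
    # base is the class or module doing the include
    base.extend(ClassMethods)
  end

  module ClassMethods
    def auditable?
      true
    end
  end
end

class Invoice
  include Auditable
end

p Invoice.auditable?   # prints: true
```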

There are other callbacks that can be useful. Module#method_added, Module#method_removed, Module#method_undefined and counterparts for Kernel with singleton prefixed. Class#inherited is interesting. Through this you can actually keep track of all direct subclasses of your class and with some metaprogramming trickery (basically writing a new inherited for each subclass that does the same thing) you can get hold of the complete tree of subclasses. If you want that for some reason. I would for example use this approach for Test::Unit, rather than iterating over ObjectSpace. But I guess that’s a matter of taste.
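A sketch of subclass tracking with Class#inherited could look like the following. The class names are hypothetical. Note that in this version the hook is a class-level method and is therefore itself inherited by the subclasses, so on the Ruby versions I have tried it also gets called for indirect subclasses:

```ruby
# Hypothetical base class that records every class created below it
class Base
  @@descendants = []

  def self.inherited(subclass)
    super
    @@descendants << subclass
  end

  def self.descendants
    @@descendants.dup
  end
end

class Child < Base; end
class GrandChild < Child; end

p Base.descendants   # prints: [Child, GrandChild]
```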

Class variables versus Class instance variables

This is one thing that always trips people up. Including me. Class variables are special variables that are associated with a class. They are referenced with two at-signs and a name, like @@name. So far, it’s simple. But classes are also instances of Class, which means that these instances can have regular one-at-sign instance variables. These are not the same thing. Not at all. Something like this:

class Foo
  @@borg = []
  @me = nil

  def initialize
    @me = self
    Foo::add_borg
  end

  def self.add_borg
    @@borg << @me
  end
end

will result in a @@borg-list filled with nils. This is because the first @me refers to an instance variable in the Foo instance of Class; not the @me instance variable associated with an instance of the Foo-class.

Condensed lesson: Classes have instance variables of their own. These are rarely useful; they usually contribute to hard-to-find errors. And don’t confuse them with class variables, which are a totally different kind of beast.
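The difference becomes visible as soon as a subclass enters the picture. In this sketch (Counter, SubCounter and the variable names are made up), the class variable is shared by the whole hierarchy while each class gets its own class instance variable:

```ruby
class Counter
  @@shared = 0   # class variable: one slot for Counter and all subclasses
  @local = 0     # class instance variable: belongs to the Counter object only

  class << self
    attr_reader :local
  end

  def self.bump
    @@shared += 1
    # self is whichever class received bump, so @local is per class
    @local = (@local || 0) + 1
  end

  def self.shared
    @@shared
  end
end

class SubCounter < Counter; end

Counter.bump
SubCounter.bump
SubCounter.bump

p Counter.shared      # prints: 3
p Counter.local       # prints: 1
p SubCounter.local    # prints: 2
```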

Shortcuts: __FILE__ and ARGF

Ruby contains a myriad of shortcuts, many influenced by Perl and others invented to make it easier to write condensed programs. The regexp result globals are always good to have, but there are others that can be very useful too. The two that I like most are __FILE__ and ARGF. __FILE__ is also part of a very, very common idiom that the Pickaxe details. Combined with the global $0 it makes it easy to tell apart the case where a file is required from the case where it’s executed. Basically, $0 contains the name of the file that has been executed. In C this would be argv[0]. __FILE__ is the full filename of the file the code can be found in. If these are the same, the current file is the one asked to execute. This is useful in many places. I use it often in gemspecs:

if $0 == __FILE__
  Gem::manage_gems
  Gem::Builder.new(spec).build
end

If I run the file above with gem build, this part will not execute, but if I execute the file directly, it will run.
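The same idiom works for any file that should be usable both as a library and as a script. A minimal sketch, with an invented shout method:

```ruby
# Hypothetical library method; available to anyone who requires this file
def shout(words)
  words.map { |w| w.upcase }.join(" ")
end

# Only runs when this file is the one handed to the interpreter,
# not when it is pulled in through require
if $0 == __FILE__
  puts shout(ARGV)
end
```

Saved as a file and run directly with a few arguments, it prints them in upper case; when required from another file it only defines shout.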

Matz sometimes likes to show how to implement the UNIX utility cat in Ruby:

puts *ARGF

This combines tip number uno in this blog entry with the constant ARGF. ARGF is a nice special object that when you reference it will open all the files named in ARGV. If you have any options in your ARGV you’d better remove them before referencing ARGF, though. Basically what you get when referencing ARGF is a file handle to the files named on the command line. And since a File has Enumerable and thus to_a, splat will read all the lines in all the files and combine them into an array and then splay the array into the call to puts which will print each line. Here you are, cat!

There are other globals and constants available, but most aren’t as useful as the previously named. For example you can use __END__ on a line of its own, and the interpretation of the code will stop there and the rest of the file will be available through the constant DATA. I haven’t seen anyone use this. It’s a remnant from when Ruby was a tool to replace Perl, and the other scripting tools in UNIX.

Everything is runtime

Basically, the whole difference in Ruby compared to compiled languages is that everything happens at runtime. Actually, this difference can be seen when looking at Lisp too. In Common Lisp there are three different times when code can be evaluated: at compile-time, load-time and eval-time. In Java class-structure is fixed. You can’t change class structure based on compile parameters (oh boy, sometimes I miss C-style macros). But in Ruby, everything is runtime. Everything happens at that time (except for constants… this is a different story). This means that class definitions can be customized based on environment. A typical example is this:

class Foo
  include Tracing if $DEBUG
end

This class will include the Tracing methods when the -d flag is provided, and leave them out when it’s not. Basically there isn’t much syntax in Ruby that couldn’t be implemented in the language itself. A class declaration can be duplicated with

Foo = Class.new do
  # class declarations go here
end

And almost all parts of a method-definition with def can be provided with define_method. The glaring mismatch (blocks) will be corrected with 1.9. Except for that, it’s just sugar. If statements could be implemented with duck typing/polymorphism:
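As a sketch of how far this goes, the following builds a small class without using either the class or the def keyword. The Point name and its x and y attributes are made up for the example:

```ruby
# Class.new creates an anonymous class; assigning it to a constant names it
Point = Class.new do
  %w(x y).each do |name|
    # reader, equivalent to: def x; @x; end
    define_method(name) { instance_variable_get("@#{name}") }
    # writer, equivalent to: def x=(value); @x = value; end
    define_method("#{name}=") { |value| instance_variable_set("@#{name}", value) }
  end
end

pt = Point.new
pt.x = 3
pt.y = 4
p [pt.x, pt.y]   # prints: [3, 4]
p Point.name     # prints: "Point"
```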

class TrueClass
  def if(t,f)
    t.call
  end
end

class FalseClass
  def if(t,f)
    f.call if f
  end
end

x = true

x.if lambda{ puts "true" }, lambda{ puts "false"}

And that’s the real Lisp heritage of Ruby. There really isn’t any essential syntax. Everything can be implemented with the basics of receiver, message, arguments, and blocks. Just remember that. It’s the basis for all useful metaprogramming. There is no compile-time. Everything can change. “There is no spoon”.



Announcing ActiveRecord-Mimer 0.0.1


The initial version of ActiveRecord-Mimer has been released.

The project aims to provide complete ActiveRecord support for the Mimer SQL database engine. This initial release provides the basis for that. Most operations work, including migrations. The only exceptions are rename_column and rename_table, which aren’t supported by the underlying database engine.

The project resides at RubyForge: http://rubyforge.org/projects/ar-mimer

and can be installed with RubyGems by
gem install activerecord-mimer

The code is released under an MIT license.



OpenSSL status report


I just checked in a few updates to my openssl branch for JRuby. Boy is it tricky getting everything right. It seems like every DER format Java crypto emits differs from the OpenSSL DER output. And it’s really incompatible. As an example I have been forced to reimplement the DER dumping for X509 certificates myself, and that’s not the only place.

But the work is actually going forward; as fast as I can make it when I’m only doing this in my spare time and my regular work takes lots of time right now. I can’t say for sure when it will be finished or usable, but I know for a fact that most of the MRI tests run now. What’s missing is PKCS#7, X509 CRLs and X509 cert-stores, plus the regular SSL socket support. Not much, compared to what actually works.

But that leads me to two issues. We have recently agreed that OpenSSL support will require BouncyCastle and Java 5. There is really no other way to get this working. 1.4.2 is fine for basic Digest support and some of the RSA/DSA support, but Java is sorely lacking in the ASN.1 and X509 department. Nothing whatsoever. Which is why we need BouncyCastle, which is fairly complete. I have only been forced to reimplement one or two central classes. Quite good. But SSL support is another story. As you may know, 1.4.2 has SSLSocket and SSLServerSocket. The problem is this: they aren’t any good. For a start, they are blocking, and there isn’t any support in 1.4.2 for NIO SSL sockets. Whoopsie. Which explains the requirement on Java 5. Tiger adds the SSLEngine class which can be used to implement NIO SSL, with the caveat that it heightens complexity. I have only taken a cursory look at this yet. Right now I want the other stuff working first, since there are so many dependencies on them.

But it’s really going forward. Now, if I only had this as my day job, this would be finished in a few days… Alas, that’s not the way it is. Expect further updates in a week or two.



The JRuby Tutorial #4: Writing Java extensions for JRuby


There are many reasons to write a Java extension for JRuby. Maybe your favorite Ruby library hasn’t been ported to JRuby yet, or you want to directly interface with some Java code without going through JRuby’s Java interface. Maybe you need the speed from doing calculations in Java, or you just want to add missing functionality. Whatever the reason, writing extensions for JRuby can be tricky if you don’t know how the internals of JRuby work. The purpose of this tutorial is to show how to build a simple extension that exercises many parts of the Ruby language and how to implement this with Java.

The example will be a module called Sequence with one class inside it called Sequence. Whenever I create something as a Java extension, I usually write functional Ruby code for doing it first, to get the structure of the code straight in my head. So, without further ado, here is the Sequence module:

module Sequence
  def self.fibonacci(to=20)
    Sequence.new(1,1,1..to)
  end

  def self.lucas(to=20)
    Sequence.new(1,3,1..to)
  end

  class Sequence
    include Enumerable
    attr_reader :n1,:n2,:range
    def initialize(n1,n2,range)
      @n1, @n2, @range = n1,n2,range
      regenerate
    end
    # Writers n1=, n2= and range= store the new value and rebuild the values
    %w(n1 n2 range).each do |n|
      define_method("#{n}=") do |v|
        instance_variable_set("@#{n}",v)
        regenerate
      end
    end
    def regenerate
      @value = []
      v1, v2 = @n1, @n2
      @value << v1 if @range === 1
      @value << v2 if @range === 2
      3.upto(@range.last) do |i|
        v1, v2 = v2, v1+v2
        @value << v2 if @range === i
      end
      nil
    end
    def [](ix)
      @range = ix..(@range.last) if ix < @range.first
      @range = (@range.first)..(ix+1) if ix > @range.last
      regenerate
      @value[ix-@range.first]
    end
    def each(&b)
      @value.each(&b)
    end
    def to_a
      @value
    end
    def to_s
      @value.to_s
    end
    def inspect
      "#<Sequence::Sequence n1=#@n1 n2=#@n2 range=#@range value=#{@value.inspect}>"
    end
  end
end
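The heart of the class is the generation step in regenerate. Pulled out into a standalone method it can be tried on its own; generalized_fibonacci is a name made up for this sketch, and it takes the two seed values and the range as plain arguments:

```ruby
# Standalone version of the generation step, for experimentation.
# Positions are 1-based; only values whose position is in range are kept.
def generalized_fibonacci(n1, n2, range)
  values = []
  v1, v2 = n1, n2
  values << v1 if range === 1
  values << v2 if range === 2
  3.upto(range.last) do |i|
    v1, v2 = v2, v1 + v2
    values << v2 if range === i
  end
  values
end

p generalized_fibonacci(1, 1, 1..10)   # prints: [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
p generalized_fibonacci(1, 3, 1..8)    # prints: [1, 3, 4, 7, 11, 18, 29, 47]
p generalized_fibonacci(1, 1, 5..8)    # prints: [5, 8, 13, 21]
```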

Interfacing with the JRuby runtime

There are a few different ways to write extensions for JRuby. The difference isn’t big from a functional viewpoint, but there is a definite gap in usability. I call the two major ways to implement an extension the MetaClass way, and the MRI way. The MetaClass way subclasses the Java class that represents a Ruby class, called RubyClass, and implements some meta information methods and classes. The MRI way, in contrast, just creates the Ruby class in code, and adds methods to it in some static initializer. This tutorial will use the MRI way for two reasons; first, it’s easier and doesn’t require so many files and classes, and second, when porting MRI C extensions, the MetaClass way doesn’t map very well to how MRI does things.

Project setup

To make the extension building as simple as possible, it helps to follow a few conventions. First of all, I’m going to call the extension “fib”. I want my potential users to be able to require ‘fib’ and get all the good Sequence-functionality. To achieve this there are two things to keep in mind. First, the jar-file should be called fib.jar and put somewhere in JRuby’s load path. Secondly, there should be a class called FibService that implements the BasicLibraryService interface. For our purposes, FibService.java will contain all functionality, but in a realistic situation it makes sense to extract the functionality and let the library loader just set up the environment. The skeleton for my FibService.java will look like this:

import java.io.IOException;

import org.jruby.IRuby;

import org.jruby.runtime.load.BasicLibraryService;

public class FibService implements BasicLibraryService {
    public boolean basicLoad(IRuby runtime) throws IOException {
        return true;
    }
}

At this point the only imports needed are for IRuby, which is the main interface for the JRuby runtime, and the BasicLibraryService which provides the basicLoad method. The return value specifies if the service was loaded correctly or not.

Basic structure

I will start by adding the basic structure for our code; the Sequence module and class:

import java.io.IOException;

import org.jruby.IRuby;
import org.jruby.RubyClass;
import org.jruby.RubyModule;

import org.jruby.runtime.builtin.IRubyObject;

import org.jruby.runtime.load.BasicLibraryService;

public class FibService implements BasicLibraryService {
    public boolean basicLoad(IRuby runtime) throws IOException {
        RubyModule mSequence = runtime.defineModule("Sequence");
        RubyClass cSequence = mSequence.defineClassUnder("Sequence",runtime.getObject());
        cSequence.includeModule(runtime.getModule("Enumerable"));
        cSequence.attr_reader(new IRubyObject[]{runtime.newSymbol("n1"),
                                                runtime.newSymbol("n2"),
                                                runtime.newSymbol("range")});
        return true;
    }
}

What this code does is to establish the Sequence module at the top level, and then define the Sequence class inside this module. We need to specify a super class for it, and this is what the runtime.getObject()-call is about. Basically it’s a shortcut for writing runtime.getClass("Object"). After we have defined the class, we make it include Enumerable, and then create attribute readers for the 3 instance variables. Despite the name, newSymbol doesn’t necessarily create a new symbol; it returns an existing one if there is one.

Singleton methods

We’re going to create the singleton factory methods before actually creating the implementation for the class. The new class looks like this:

import java.io.IOException;

import org.jruby.IRuby;
import org.jruby.RubyClass;
import org.jruby.RubyFixnum;
import org.jruby.RubyModule;
import org.jruby.RubyNumeric;

import org.jruby.runtime.CallbackFactory;
import org.jruby.runtime.builtin.IRubyObject;
import org.jruby.runtime.load.BasicLibraryService;

public class FibService implements BasicLibraryService {
    public boolean basicLoad(IRuby runtime) throws IOException {
        RubyModule mSequence = runtime.defineModule("Sequence");
        RubyClass cSequence = mSequence.defineClassUnder("Sequence",runtime.getObject());
        cSequence.includeModule(runtime.getModule("Enumerable"));
        cSequence.attr_reader(new IRubyObject[]{runtime.newSymbol("n1"),
                                                runtime.newSymbol("n2"),
                                                runtime.newSymbol("range")});

        CallbackFactory fibService_cb = runtime.callbackFactory(FibService.class);
        mSequence.defineSingletonMethod("fibonacci",fibService_cb.getOptSingletonMethod("fibonacci"));
        mSequence.defineSingletonMethod("lucas",fibService_cb.getOptSingletonMethod("lucas"));

        return true;
    }

    private static IRubyObject seq(int a1, int a2, RubyModule module, IRubyObject[] args) {
        IRuby runtime = module.getRuntime();
        int to = 20;
        if(module.checkArgumentCount(args,0,1) == 1) {
            to = RubyNumeric.fix2int(args[0]);
        }
        IRubyObject[] seqArgs = new IRubyObject[3];
        seqArgs[0] = runtime.newFixnum(a1);
        seqArgs[1] = runtime.newFixnum(a2);
        seqArgs[2] = runtime.getClass("Range").callMethod("new",
            new IRubyObject[]{RubyFixnum.one(runtime),runtime.newFixnum(to)});
        return module.getClass("Sequence").callMethod("new",seqArgs);
    }

    public static IRubyObject fibonacci(IRubyObject recv, IRubyObject[] args) {
        return seq(1,1,(RubyModule)recv,args);
    }

    public static IRubyObject lucas(IRubyObject recv, IRubyObject[] args) {
        return seq(1,3,(RubyModule)recv,args);
    }
}

This code contains a number of new things. First of all, our singleton methods need implementations. Since we don’t need any data associated with these methods, static Java methods suffice for implementation. A CallbackFactory is used to get a reflection handle on the methods. I use the method call getOptSingletonMethod on the CallbackFactory; this is because the one parameter to the two methods is optional, so the callback factory will look for a static method with signature IRubyObject name(IRubyObject, IRubyObject[]). We’ll later see how we can specify explicit types for method arguments. The recv argument is a specialty for static methods. Usually when working with Ruby instances from Java code, you will have a handle to the runtime implicit in the self, but this isn’t possible for static methods. The recv parameter is the instance of RubyModule/RubyClass that the method is called on. In our case this is a handy way of getting hold of the Sequence-module.

All IRubyObjects have checkArgumentCount, which is a simple utility method for methods with optional arguments. Basically, it takes an array, the minimum and maximum argument count, and throws a Ruby exception if the count isn’t correct. It also returns the actual argument count (which is the same as args.length right now). Note, if porting C Ruby code, that these two numeric parameters to checkArgumentCount are NOT the same as those of rb_scan_args, where for example “12” means one required and two optional parameters. The equivalent with checkArgumentCount would be checkArgumentCount(args,1,3).
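To make the contract concrete, here is a rough Ruby rendition of what the paragraph above describes. It is only an illustration written for this text, not JRuby’s implementation, and the name check_argument_count and the error message are my own:

```ruby
# Illustration only: min and max are absolute counts, so one required plus
# two optional parameters is expressed as (1, 3)
def check_argument_count(args, min, max)
  unless (min..max).include?(args.length)
    raise ArgumentError, "wrong number of arguments (#{args.length} for #{min})"
  end
  args.length
end

p check_argument_count([], 0, 1)        # prints: 0
p check_argument_count([20], 0, 1)      # prints: 1
p check_argument_count([1, 2, 3], 1, 3) # prints: 3
```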

RubyNumeric has a few utility methods, where fix2int is one of the more useful. It basically allows us to translate a Ruby integer into the corresponding Java type.

The most common types have shortcut creation methods in IRuby, and newFixnum is one of these. To create a new Range we have to get a reference to the class and call new on it, though.

The Sequence class

Here comes the meat of it all. This is the final version of the Java source:

import java.io.IOException;

import java.util.ArrayList;
import java.util.List;
import java.util.Iterator;

import org.jruby.IRuby;
import org.jruby.RubyArray;
import org.jruby.RubyClass;
import org.jruby.RubyFixnum;
import org.jruby.RubyModule;
import org.jruby.RubyNumeric;
import org.jruby.RubyObject;
import org.jruby.RubyRange;

import org.jruby.runtime.CallbackFactory;
import org.jruby.runtime.builtin.IRubyObject;
import org.jruby.runtime.load.BasicLibraryService;

public class FibService implements BasicLibraryService {
public boolean basicLoad(IRuby runtime) throws IOException {
RubyModule mSequence = runtime.defineModule("Sequence");
RubyClass cSequence = mSequence.defineClassUnder("Sequence",runtime.getObject());
cSequence.includeModule(runtime.getModule("Enumerable"));
cSequence.attr_reader(new IRubyObject[]{runtime.newSymbol("n1"),
runtime.newSymbol("n2"),
runtime.newSymbol("range")});

CallbackFactory fibService_cb = runtime.callbackFactory(FibService.class);
mSequence.defineSingletonMethod("fibonacci",fibService_cb.getOptSingletonMethod("fibonacci"));
mSequence.defineSingletonMethod("lucas",fibService_cb.getOptSingletonMethod("lucas"));

CallbackFactory seq_cb = runtime.callbackFactory(Sequence.class);
cSequence.defineSingletonMethod("new",seq_cb.getOptSingletonMethod("newInstance"));
cSequence.defineMethod("initialize",seq_cb.getMethod("initialize",RubyFixnum.class,RubyFixnum.class,RubyRange.class));
cSequence.defineMethod("n1=",seq_cb.getMethod("set_n1",RubyFixnum.class));
cSequence.defineMethod("n2=",seq_cb.getMethod("set_n2",RubyFixnum.class));
cSequence.defineMethod("range=",seq_cb.getMethod("set_range",RubyRange.class));
cSequence.defineMethod("[]",seq_cb.getMethod("arr_ix",RubyFixnum.class));
cSequence.defineMethod("each",seq_cb.getMethod("each"));
cSequence.defineMethod("to_a",seq_cb.getMethod("to_a"));
cSequence.defineMethod("to_s",seq_cb.getMethod("to_s"));
cSequence.defineMethod("inspect",seq_cb.getMethod("inspect"));

return true;
}

private static IRubyObject seq(int a1, int a2, RubyModule module, IRubyObject[] args) {
IRuby runtime = module.getRuntime();
int to = 20;
if(module.checkArgumentCount(args,0,1) == 1) {
to = RubyNumeric.fix2int(args[0]);
}
IRubyObject[] seqArgs = new IRubyObject[3];
seqArgs[0] = runtime.newFixnum(a1);
seqArgs[1] = runtime.newFixnum(a2);
seqArgs[2] = runtime.getClass("Range").callMethod("new",
new IRubyObject[]{RubyFixnum.one(runtime),runtime.newFixnum(to)});
return module.getClass("Sequence").callMethod("new",seqArgs);
}

public static IRubyObject fibonacci(IRubyObject recv, IRubyObject[] args) {
return seq(1,1,(RubyModule)recv,args);
}

public static IRubyObject lucas(IRubyObject recv, IRubyObject[] args) {
return seq(1,3,(RubyModule)recv,args);
}

public static class Sequence extends RubyObject {
public static IRubyObject newInstance(IRubyObject recv, IRubyObject[] args) {
Sequence result = new Sequence(recv.getRuntime(), (RubyClass)recv);
result.callInit(args);
return result;
}

public Sequence(IRuby runtime, RubyClass type) {
super(runtime,type);
}

public IRubyObject initialize(RubyFixnum n1, RubyFixnum n2, RubyRange range) {
setInstanceVariable("@n1",n1);
setInstanceVariable("@n2",n2);
setInstanceVariable("@range",range);
regenerate();
return this;
}

public IRubyObject set_n1(RubyFixnum n1) {
setInstanceVariable("@n1",n1);
regenerate();
return n1;
}

public IRubyObject set_n2(RubyFixnum n2) {
setInstanceVariable("@n2",n2);
regenerate();
return n2;
}

public IRubyObject set_range(RubyRange range) {
setInstanceVariable("@range",range);
regenerate();
return range;
}

private void regenerate() {
List v = new ArrayList();
int v1 = RubyNumeric.fix2int(getInstanceVariable("@n1"));
int v2 = RubyNumeric.fix2int(getInstanceVariable("@n2"));
IRubyObject r = getInstanceVariable("@range");
if(r.callMethod("===",getRuntime().newFixnum(1)).isTrue()) {
v.add(getRuntime().newFixnum(v1));
}
if(r.callMethod("===",getRuntime().newFixnum(2)).isTrue()) {
v.add(getRuntime().newFixnum(v2));
}
int l = RubyNumeric.fix2int(r.callMethod("last"));
for(int i=3;i<=l;i++) {
int tmp = v1;
v1 = v2;
v2 = tmp + v1;
if(r.callMethod("===",getRuntime().newFixnum(i)).isTrue()) {
v.add(getRuntime().newFixnum(v2));
}
}
setInstanceVariable("@value",getRuntime().newArray(v));
}

public IRubyObject arr_ix(RubyFixnum ix) {
int index = RubyNumeric.fix2int(ix);
if(index < RubyNumeric.fix2int(getInstanceVariable("@range").callMethod("first"))) {
setInstanceVariable("@range",getRuntime().getClass("Range").callMethod("new",
new IRubyObject[]{ix,getInstanceVariable("@range").callMethod("last")}));
}
if(index > RubyNumeric.fix2int(getInstanceVariable("@range").callMethod("last"))) {
setInstanceVariable("@range",getRuntime().getClass("Range").callMethod("new",
new IRubyObject[]{getInstanceVariable("@range").callMethod("first"), getRuntime().newFixnum(index+1)}));
}
regenerate();
return getInstanceVariable("@value").callMethod("[]",
getRuntime().newFixnum(index -
RubyNumeric.fix2int(getInstanceVariable("@range").callMethod("first"))));
}

public IRubyObject each() {
Iterator iter = ((RubyArray)getInstanceVariable("@value")).getList().iterator();
while(iter.hasNext()) {
getRuntime().getCurrentContext().yield((IRubyObject)iter.next());
}
return getRuntime().getNil();
}

public IRubyObject to_a() {
return getInstanceVariable("@value");
}

public IRubyObject to_s() {
return getInstanceVariable("@value").callMethod("to_s");
}

public IRubyObject inspect() {
StringBuffer sb = new StringBuffer("#<Sequence::Sequence n1=");
sb.append(getInstanceVariable("@n1").toString());
sb.append(" n2=");
sb.append(getInstanceVariable("@n2").toString());
sb.append(" range=");
sb.append(getInstanceVariable("@range").toString());
sb.append(" value=");
sb.append(getInstanceVariable("@value").callMethod("inspect").toString());
sb.append(">");
return getRuntime().newString(sb.toString());
}
}
}

Compiling this and placing it in fib.jar on your load path will allow JRuby to use the code as if it were Ruby. Try it out.

Now, let’s take the code in pieces. First of all, the initialization code defines the methods available and gives them a reflected implementation through CallbackFactory. We create a static inner class to hold the actual implementation of the class. This isn’t strictly necessary in this case, since we haven’t associated any external state with the object, but it makes for cleaner separation and code that is easier to understand. Note that we need to have our own new-implementation. This is one of the drawbacks of the MRI technique. When using MetaClasses you can define an allocateObject-method that automatically gets used by the runtime. Most of CallbackFactory’s different getMethod variants are used. This shows how to have a fixed number of arguments with specific classes.
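To show what the hand-written new-implementation has to do, here is a minimal plain-Ruby sketch of the same two steps that newInstance performs in the Java code: create the object, then run initialize on it. The class name Point is made up for illustration and is not part of the extension.

```ruby
# Hypothetical example class; only the shape of self.new matters here.
class Point
  attr_reader :x, :y

  # Hand-written replacement for the default Class#new:
  # allocate the object, then call initialize with the arguments.
  def self.new(*args)
    obj = allocate
    obj.send(:initialize, *args)  # initialize is private, so use send
    obj
  end

  def initialize(x, y)
    @x = x
    @y = y
  end
end

point = Point.new(1, 2)
puts point.x  # prints 1
```

The Java version does the same thing with a constructor call followed by callInit(args).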

The initialize method just sets the instance variables and then calls the method regenerate. Note that this isn’t a Ruby method anymore. I didn’t feel it was necessary to expose it, and using Java call semantics makes this slightly more efficient. Apart from that, there is nothing really strange in this code. I use the fact that you can create a new Ruby array from a list to make the regeneration of @value easier. But in most cases this is purely Ruby translated to JRuby code. The only point where something strange is happening is in fact in the each-method. Handling blocks with JRuby in Java isn’t always practical, so I tend to find it easier to refactor the Ruby code into something that calls yield specifically, by itself.
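For comparison, the logic of regenerate and each can be sketched in plain Ruby roughly like this. It is a sketch only: SketchSequence is a hypothetical name chosen so it does not clash with the real Sequence class, and the setters and [] are left out.

```ruby
# Plain-Ruby sketch of the logic the Java Sequence class mirrors.
class SketchSequence
  include Enumerable
  attr_reader :n1, :n2, :range

  def initialize(n1, n2, range)
    @n1 = n1
    @n2 = n2
    @range = range
    regenerate
  end

  # Calls yield explicitly for every element, like the Java each().
  def each
    @value.each { |v| yield v }
    nil
  end

  private

  # Rebuild @value: walk the sequence up to range.last and keep
  # only the positions (1-based) that fall inside the range.
  def regenerate
    v1 = @n1
    v2 = @n2
    @value = []
    @value << v1 if @range === 1
    @value << v2 if @range === 2
    3.upto(@range.last) do |i|
      v1, v2 = v2, v1 + v2
      @value << v2 if @range === i
    end
  end
end

puts SketchSequence.new(1, 1, 1..10).to_a.inspect
# prints [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```

Keeping regenerate private here matches the decision in the Java code not to expose it as a Ruby method.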

Conclusion

Implementing a Java extension for JRuby can be tricky, but the hard part is mostly knowing what services are available where. By having the JRuby source code available it’s easy to get a peek into the internals and find out more about those things that are problematic. Taking a look at how the core classes are implemented often gives some hints on how to continue, too. For example, RubyZlib, RubyYAML, RubyOpenSSL, RubyStringIO and RubyEnumerable are all mostly written in this style, and there are various examples of the different styles available.

If you need the speed or if it’s more practical to implement the functionality in Java, I would say that writing an extension is fairly easy once you get started. The important thing to remember is to be sure what the interface should be, and implement everything else outside of JRuby, demarcating the interface from the implementation.



Effectiveness of automated refactoring.


The last few weeks have seen some discussion regarding refactoring tools for dynamic languages. The basic questions are if it is possible, and if so, how effective it would be. More information about the debate in question can be found here, here and here. I’m not really going into the fray here; I just wanted to provide my hypothesis on something tangential to the issue.

The interesting point is something Cedric said in his blog:

And without this, who wants an IDE that performs a correct refactoring “most of the time”?

The underlying assumption here is that there can actually exist a refactoring tool that works all the time. So, my question is this: “Can a refactoring tool be 100% effective?” (Here effectiveness is defined as completely fulfilling the refactoring preconditions, postconditions and invariants, without introducing dangers or errors in the code.)

My hypothesis (and there will be no rigorous proof of my position) is based on one axiom:

Automated refactoring is a subset of the Church-Turing halting problem.

I have no direct proof for this position, but it seems intuitive to me that for a refactoring to always be completely correct, you would need to know things about the program that aren’t always possible to predict from the code alone. In this case, test runs would be necessary, and in those cases the halting problem enters. Now, for a strongly, statically typed language you would have to go to some effort to actually produce a program that couldn’t be safely refactored, but the possibility is still there.

One commenter on one of the blog entries above said that a precondition for 100% refactoring of Java would be that you didn’t use reflection and other meta-tricks. But the problem is, to avoid the halting problem you would have to remove enough features of Java to make it into a Turing-incomplete language. And by then it wouldn’t be usable for general purpose programming.
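To make the reflection point a bit more concrete, here is a small Ruby sketch of the kind of code I have in mind. The names Invoice, report and grand_total are made up for illustration. The method name is assembled at run time, so a tool that only reads the source has no general way of knowing that report depends on total.

```ruby
class Invoice
  def initialize(amounts)
    @amounts = amounts
  end

  def total
    @amounts.inject(0) { |sum, a| sum + a }
  end
end

# The method name only exists as a complete string at run time.
def report(invoice, name_parts)
  invoice.send(name_parts.join)
end

puts report(Invoice.new([1, 2, 3]), ["to", "tal"])  # prints 6

# Simulate what a "rename total to grand_total" refactoring would do
# if it missed the reflective call above.
class Invoice
  alias_method :grand_total, :total
  remove_method :total
end

begin
  report(Invoice.new([1, 2, 3]), ["to", "tal"])
rescue NoMethodError => e
  puts "broken after rename: #{e.class}"  # prints broken after rename: NoMethodError
end
```

In this toy case the parts are literals, but they could just as well come from user input or a computation whose result can’t be known without running the program.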

I see no way out of this dilemma, unless my axiom is wrong. But if it is correct, there can never ever be a 100% effective refactoring tool.



JRuby import


OK, for those of you who thought that importing a class into the current namespace by assigning a constant was too clumsy, here is a small implementation (based on include_class) that lets you use import like you do in Java:

require 'java'

class Object
  def import(name)
    unless String === name
      name = name.java_class.inspect
    end
    class_name = name.match(/((.*)\.)?([^\.]*)/)[3]
    clazz = self.kind_of?(Module) ? self : self.class
    unless clazz.const_defined?(class_name)
      if (respond_to?(:class_eval, true))
        class_eval("#{class_name} = #{name}")
      else
        eval("#{class_name} = #{name}")
      end
    end
  end
end

import java.util.TreeMap

x = TreeMap.new
x['foo'] = 'bar'
puts x
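
The only slightly cryptic part is the regular expression that picks out the short class name. It can be tried on its own in plain Ruby, without JRuby. The helper name simple_name below is hypothetical and is not part of the import implementation above.

```ruby
# Same regular expression as in import: the optional, greedy first group
# swallows everything up to the last dot, and group 3 is what remains.
def simple_name(qualified)
  qualified.match(/((.*)\.)?([^\.]*)/)[3]
end

puts simple_name("java.util.TreeMap")  # prints TreeMap
puts simple_name("TreeMap")            # prints TreeMap
```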