Ola Bini: Programming Language Synchronicity

October 5th, 2006

JRuby progress

I thought that I should post a little notice about what’s happening with JRuby right now. Most of this is available for you if you subscribe to FishEye for JRuby (http://fisheye.codehaus.org/changelog/~rss/jruby/rss.xml).

There are a few very nice innovations going on. The most important is probably that Java integration has been improved quite substantially. Basically, if the packages you need are in the java, javax, org or com packages, you can just refer to classes the same way you refer to them in Java (except for classes that don’t have a capital initial letter, but you don’t do that, do you?). Typical JRuby code could look like this:

 require 'java'

TreeMap = java.util.TreeMap
x = TreeMap.new(java.util.Comparator.impl { |m,o1,o2|
            o1 <=> o2
          })

Now, there are two new features in here, and one JRuby idiom that should be used in JRuby code. The first feature is simply to refer to classes by name. The idiom is to import classes into the current namespace by assigning them to constants. The second interesting feature is the ‘impl’ method, which is available on all interfaces. This method will create an anonymous implementation of the interface, which will call the block when any of the methods in the interface is called. The method name goes in the parameter ‘m’ in the example. This allows a very clean syntax for implementing one-method interfaces, like in the example above. Before these were added, you had to use import_class for each class, then create a class for the interface, and then use this class. The code would easily have doubled for this example.
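To make the dispatch idea behind impl concrete, here is a minimal plain-Ruby sketch. It is not how JRuby implements impl, and it does not touch Java at all; BlockImpl is a hypothetical class that simply forwards every method call, with the method name first, to a single block.

```ruby
# Hypothetical stand-in for an interface implementation backed by one block.
# Plain Ruby only; this is an illustration, not JRuby's implementation.
class BlockImpl
  def initialize(&block)
    @block = block
  end

  # Every unknown method ends up here. The method name is passed first,
  # like the 'm' parameter in the JRuby example.
  def method_missing(name, *args)
    @block.call(name, *args)
  end

  def respond_to_missing?(name, include_private = false)
    true
  end
end

seen = []
comparator = BlockImpl.new { |m, o1, o2| seen << m; o1 <=> o2 }
result = comparator.compare(1, 2)
```

After this runs, seen holds the name of the method that was called, which is the same information the block receives in ‘m’.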

A few days ago Charles and Tom made me a committer in the JRuby project, which obviously feels very good, though I’m slightly nervous. But I’ve been able to fix a number of very small bugs since then. The more interesting of these, in no particular order:

Correct implementation of default-inspect. (This isn’t that interesting for implementations, but it makes debugging much easier)
Hash#each didn’t yield properly in some edge-cases. This fix was submitted by Miguel, and applied by me.
I also added a fix to improve the Java method matching when the methods had a primitive in their arguments. This means that from now on, if you have two Java-methods, foo(float a) and foo(int a), and call it from Ruby with “foo(32)”, the float-one won’t get called anymore. Pretty nice.
A String#crypt implementation, which is based on some stuff I did years ago, which means there are no copyright issues.
The flash-issue with Rails (a message placed in the flash doesn’t get removed). This was a Marshalling issue, which was quite easy to fix. Big win.
Stack overflow when calling non-implemented methods when subclassing an interface. It calls method_missing instead, now.

There are a few things being discussed on the list right now. We will soon commit a readline.rb that checks if the JNI-based GNU readline-bridge is present, and if so uses it. This means that if you want, you can have basic readline in JIRB.

We are talking about the best way to implement OpenSSL support too. This is quite a big thing, though, and as Tom put it “It will be pretty difficult but the person who does it will be worshipped”. I’m not sure about the right way to go with implementation either. It seems the available OpenSSL JNI-implementations aren’t good enough, so the best route seems to be JSSE. If you have any opinions or suggestions, please get in touch.

In other news, it seems I’m going to be talking about JRuby at JavaForum in Malmö. The date is not final but it seems likely to be the 23rd of October. If you’re in Malmö, please stop by. I’ll post more information as soon as everything is clear.

6 Comments | By Ola Bini | In: Uncategorized | tags: jruby, ruby. | #

October 3rd, 2006

Announcing ActiveRecord-JDBC 0.2.2

Version 0.2.2 of ActiveRecord-JDBC has now been released. It contains numerous smaller bug fixes, but more importantly, support for MimerSQL. The internals have been slightly refactored to make it easier to change database-specific instructions further down the road.

The release can be found at http://rubyforge.org/frs/?group_id=2014 or installed through RubyGems.

No Comments | By Ola Bini | In: Uncategorized | tags: active record, jdbc, jruby, jruby-extras, ruby. | #

October 2nd, 2006

The JRuby Tutorial #3: Playing with Mongrel

This part of the tutorial will be based on some slightly not-released software, but since it is so cool, I bet you will try it anyway. Basically, what I’m going to show you is how to get Mongrel 0.4 working with JRuby, and then how you can serve your JRuby on Rails-application with said version of Mongrel.

What you’ll need
First of all, check out the latest trunk version of JRuby. There are some smoking new fixes in there that are needed for this hack. Next, you will also need to check out the 0.4-branch of Mongrel. This can be done with the following command:

svn co svn://rubyforge.org/var/svn/mongrel/branches/mongrel-0.4

You need to manually copy two parts of mongrel into your JRuby home. If $MONGREL_SRC is the name of the directory where you checked out mongrel, these commands will suffice:

cp -r $MONGREL_SRC/lib/mongrel* $JRUBY_HOME/lib/ruby/site_ruby/1.8
cp $MONGREL_SRC/projects/gem_plugin/lib/gem_plugin.rb $JRUBY_HOME/lib/ruby/site_ruby/1.8
echo '#!/usr/bin/env jruby' > $JRUBY_HOME/bin/mongrel_rails
cat $MONGREL_SRC/bin/mongrel_rails >> $JRUBY_HOME/bin/mongrel_rails
chmod +x $JRUBY_HOME/bin/mongrel_rails

You will need to download the JRuby-specific http11-extension library. This can be downloaded here, and should also be put in the $JRUBY_HOME/lib/ruby/site_ruby/1.8-directory.

You’re now set to go.

Simple web hosting
I will now show how to set up a small web server that can serve both files and servlets. There really isn’t much to it. First of all, we need to include some libraries:

require 'mongrel'
require 'zlib'
require 'java'
include_class 'java.lang.System'

Next step is to create a simple HttpHandler (which is like a Servlet, for you Java-buffs):

class SimpleHandler < Mongrel::HttpHandler
  def process(request, response)
    response.start do |head,out|
      head["Content-Type"] = "text/html"
      # The heredoc body and its terminator stay flush left on purpose.
      results = <<-"EDN";
<html>
<body>
  Your request:<br/>
  <pre>#{request.params.inspect}</pre>
  <a href=\"/files\">View the files.</a><br/>
  At: #{System.currentTimeMillis}
</body>
</html>
EDN
      # Accept-Encoding is a list such as "gzip, deflate", so look for the token
      if request.params["HTTP_ACCEPT_ENCODING"].to_s.include?("deflate")
        head["Content-Encoding"] = "deflate"
        # send it back deflated
        out << Zlib::Deflate.deflate(results)
      else
        # deflate not supported, send it back as-is
        out << results
      end
    end
  end
end

Now, this handler basically just generates a bunch of HTML and sends it back. The HTML contains the request parameters. Just to show how easy it is to combine Java output with Ruby output, I have added a call to System.currentTimeMillis. This could of course be anything. The last part is to actually make this handler active. To finalize, we also start the server:

@simple = SimpleHandler.new
@http_server = Mongrel::HttpServer.new('0.0.0.0',3333)
@http_server.register("/", @simple)
if ARGV[0]
  @files = Mongrel::DirHandler.new(ARGV[0])
  @http_server.register("/files", @files)
end

puts "running at 0.0.0.0:3333"

@http_server.run

If you start this script with:

jruby testMongrel.rb htdocs

you can visit localhost:3333 and expect to see some nice output.

Making it work with Rails
A prerequisite for this part is that you have a functional JRuby on Rails-application using ActiveRecord-JDBC. If that is the case, you just need to go to your application directory and execute this command:

$JRUBY_HOME/bin/mongrel_rails --prefix "" start

and everything should just work.

So, that’s it. JRuby on Rails, with Mongrel. Enjoy.

No Comments | By Ola Bini | In: Uncategorized | tags: jruby, mongrel, rails, ruby. | #

September 30th, 2006

In Joy and Sorrow with Continuations

Continuations are one of those topics that tend to crop up now and again. This is not strange, of course, since they happen to be one of the more powerful features of certain languages, but also one of the most confusing. I would like to stick my head out and say that continuations are probably up there beside real macros in power. The reason for this is that you can implement so many language features in terms of them.

Since there still seems to be some confusion about them, I’ll write my piece on them. Not just for you readers of course, but more importantly for myself. I intend to get a good grip on continuations in Ruby by writing this (and this is incidentally one of the best ways to learn about something confusing; try to write about it).

First of all, exactly what is a continuation? Basically, at every point in the evaluation of an expression, there will be one or more continuations lurking. For example, take the very simple expression foo = 13 * (10 - 7). Here there are 4 interesting continuations waiting. (There are actually 8 of them all in all, but only 4 are interesting.) We start by looking at the expression 10 - 7. We can write the rest of the expression like this: foo = 13 * [], where the square brackets mark the place where the result of the expression 10 - 7 will go. That rest of the expression, with the square brackets as its hole, is the continuation of 10 - 7. The result of evaluating 10 - 7 will be injected into the rest of the expression, and that is what the continuation is.

Until now, I have spoken about continuations as a concept. Those of you who know the Ruby interpreter know that it isn’t coded in continuation-passing style. But it could be, and it doesn’t really matter, since we still have a way to get at the current continuation. So, how should a continuation be represented? The way most languages choose to do it, a continuation is nothing but an anonymous closure that takes one parameter: the result to return to the evaluation. In the example above, if we inject the callcc primitive into the mix, we will have code that looks like this:

foo = 13 * callcc {|c| c.call(10-7) }

This doesn’t really look that spectacular, of course. The above code will have exactly the same effect as the first example, namely binding the variable ‘foo’ to the value 39.

If you want to, you can look at every computation like this. It sometimes helps to imagine that you just wring the evaluation inside out.
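As a small sketch of what "inside out" means, the same computation can be written with every continuation made explicit as a lambda. This does not use callcc at all; sub_cps and mul_cps are hypothetical helper names, and each one hands its result to the continuation k instead of returning it.

```ruby
# Continuation-passing versions of - and *: the last argument k is
# the continuation, i.e. "what to do with the result".
sub_cps = lambda { |a, b, k| k.call(a - b) }
mul_cps = lambda { |a, b, k| k.call(a * b) }

foo = nil

# foo = 13 * (10 - 7), written inside out.
sub_cps.call(10, 7, lambda { |v|      # v fills the hole in: foo = 13 * []
  mul_cps.call(13, v, lambda { |w|    # w fills the hole in: foo = []
    foo = w
  })
})
```

The outer lambda is the continuation of 10 - 7, and the inner lambda is the continuation of the multiplication. After the call, foo holds 39, just as in the direct version.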

So. What can you do with them? Almost anything, actually. Many parts of Scheme are implemented in CPS (continuation-passing style). But a few concrete things that can be implemented easily are exceptions, throw/catch, breaks, returns, coroutines, generators and much more. As an example, we can implement a return like this:

def val
  callcc do |ret|
    1000.times do |v|
      if v == 13
        ret.call(v+1)
      end
    end
  end
end
bar = val
puts bar

What happens here is exactly the same result as if we had used the keyword return. Most of the other flow control primitives can be implemented this way too.

What has made continuations trendy lately is something called continuation web servers. The idea is to make the statelessness of the web totally transparent by hiding the client round trips inside methods. These methods save the current continuation and then break off evaluation. When the response from the client arrives, the continuation is looked up in some session storage and restarted where it left off. Basically, this allows web applications to work more or less exactly like a console application. This is very powerful, but as I hope this small post has shown, continuations have much more to give.
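The save-and-restart cycle can be sketched in a few lines. This sketch swaps in a Fiber for the saved continuation, since a Fiber can be suspended and resumed in much the same way; it is not how any particular continuation web server works, and SESSIONS, ask, start_session and continue_session are all hypothetical names.

```ruby
# Session storage: session id => suspended computation.
SESSIONS = {}

# Looks like a blocking console prompt, but actually suspends the flow.
# The value passed to the next resume becomes the return value of ask.
def ask(prompt)
  Fiber.yield(prompt)
end

# First request: start the flow and run it up to the first ask.
def start_session(id)
  flow = Fiber.new do
    name = ask("What is your name?")
    age  = ask("How old are you, #{name}?")
    "Hello #{name}, #{age} is a fine age."
  end
  SESSIONS[id] = flow
  flow.resume
end

# Later requests: look up the suspended flow and restart it with the answer.
def continue_session(id, answer)
  SESSIONS.fetch(id).resume(answer)
end

page1 = start_session(:s1)
page2 = continue_session(:s1, "Ola")
page3 = continue_session(:s1, "26")
```

The block inside start_session reads like a console program, while each call to start_session or continue_session corresponds to one round trip.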

2 Comments | By Ola Bini | In: Uncategorized | tags: callcc, continuations, coroutines, ruby. | #

September 25th, 2006

Two things in Rails

This will be a short in-between post. Don’t expect to be annoyed, enlightened or even trivially entertained. I’m just going to describe two small things I do in all my Rails-projects, and I haven’t found a way to do them as plugins. This is very annoying, of course, so I hope someone from the Rails team will eventually see this and tell me how to do it DRY.

1. Add a production_test environment
I feel constrained by the three environments that get delivered by Rails out of the box. And I find that for every project where the customer isn’t myself and the codebase is bigger than about 50 lines of (hand-written) code, I tend to add a new environment to Rails: ‘production_test’. The problem this environment solves is the situation where I want my customers to test out an application, but I don’t want them to do it against a real production environment. For example, I did an application called LPW a few months back that works against a 3rd party web service. This web service has one production environment and one test environment. I want production_test to be as fast, responsive and generally as much like the production environment as possible, but not go against the production web service. I solve this by adding a production_test env which is exactly like the production environment, except that I change the address of the web service endpoint to the test one.

I usually do this, so I can give my customers a nice application that they can play with, but without worrying about them damaging production data.

2. Add plugin environment configuration
This is actually a major pain. I have developed a few plugins, and generally I want them to have configurations based on which environment we are in. For example, the CAS authentication plugin shouldn’t really redirect to the CAS server when in the development environment. But I can’t set this in any good way, since the plugins will be loaded after the environment-specific files have been loaded. So, what I do is simply add a new directory called config/plugin, and in environment.rb I have this:

 plugin_environment = File.join(RAILS_ROOT,'config', 'plugin', "#{ENV['RAILS_ENV']}.rb")
load plugin_environment if File.exist?(plugin_environment)

This solution sucks, but it works.

5 Comments | By Ola Bini | In: Uncategorized | tags: rails, ruby. | #

September 24th, 2006

The Ruby singleton class

After my post on Meta-programming techniques I got a few comments and questions about the singleton-class. This feature seems to be quite hard to understand, so I have decided that I will try to clarify the issue by first describing what it is, and then detailing why it is so useful. This entry will be concept-heavy and code-light.

What it is
A child with many names, the singleton class has been called metaclass, shadow class, and other similar names. I will stay with singleton class, since that’s the term the Pickaxe uses for it.

Now, in Ruby, every object has a class that it is an instance of. You can find this class by calling the method class on any object. The methods an object responds to will originally be the ones in that object’s class. But as you probably know, Ruby allows you to add new methods to any object. There are two syntaxes to do this:

class << foo
  def bar
    puts "hello world"
  end
end

and

def foo.bar
  puts "hello world"
end

To the Ruby interpreter, there is no difference in this case. Now, if foo is a String, the method bar will be available to call on the object referenced by foo, but not on any other Strings. The way this works is that the first time a method on a specific object is defined, a new, anonymous class will be inserted between the object and the real class. So, when I try to call a method on foo, the interpreter will first search inside the anonymous class for a definition, and then go on searching the real class hierarchy for an implementation. As you probably understand, that anonymous class is our singleton class.
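A short sketch makes the lookup order visible. The variable names are only for illustration, and the method returns its string instead of printing it so that the result is easy to inspect.

```ruby
foo   = String.new("a string")
other = String.new("another string")

# Defining a method on this one object creates its singleton class.
def foo.bar
  "hello, world"
end

# The anonymous class that sits between foo and String.
sclass = (class << foo; self; end)

foo_has_bar        = foo.respond_to?(:bar)           # only foo got the method
other_has_bar      = other.respond_to?(:bar)         # other Strings did not
owner_is_singleton = foo.method(:bar).owner == sclass
still_a_string     = foo.class == String             # the "real" class is unchanged
```

Note that foo.class still reports String: the singleton class is skipped when you ask an object for its class, even though it is searched first when a method is called.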

The other part of the mystery about singleton classes (and the really nifty part) is this. Remember, all objects can have a singleton class. And classes are objects in themselves. A class such as String is actually an instance of the class Class. There is nothing special about these instances. They have capitalized names, but that’s because the names are constants. And, since every class in Ruby is an instance of the class Class, what are called class methods, or static methods if you come from Java, are actually just singleton methods defined on the instance of the class in question. So, say you would add a new class method to String:

def String.hello
  puts "hello"
end

String.hello

And now you see that the syntax is actually the same as when we add a new singleton method to any other object. The only difference here is that the object happens to be an instance of Class. There are two other common ways to define class methods, but they work the same way:

class String
  def self.hello
    puts "hello"
  end
end

class String
  class << self
    def hello
      puts "hello"
    end
  end
end

The second version especially needs explaining, for two reasons: it is the preferred idiom in Ruby, and it makes the singleton class explicit. What happens is that, since the code inside the “class String”-declaration is executed in the scope of the String instance of Class, we can get at the singleton class with the same syntax we used to define foo.bar earlier. So, the definition of hello will happen inside the singleton class for String. This also explains the common idiom for getting the singleton class:

 class << self; self; end

There is no other good way to get it, so we extract the self from inside a singleton class definition.
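The claim that class methods are just singleton methods on the class object can be checked directly. In this sketch Greeter is a hypothetical class used only for the demonstration, so that String is left alone.

```ruby
class Greeter
  class << self
    def hello
      "hello"
    end
  end
end

# The same idiom as above, applied from outside the class body.
sclass = (class << Greeter; self; end)

class_methods        = Greeter.singleton_methods(false)  # defined on Greeter itself
defined_in_singleton = sclass.instance_methods(false)    # the same method, seen from the singleton class
instance_has_hello   = Greeter.new.respond_to?(:hello)   # instances do not get it
```

The method shows up both as a singleton method of Greeter and as an instance method of Greeter’s singleton class, which is two views of the same definition.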

Why is it so useful for metaprogramming?
Obviously, you can define class methods with it, but that’s not the main benefit. You can do many metaprogramming tricks with it that are impossible without it. The first one is to create a superclass that can define new class methods on subclasses of itself. That is the use I showcased in my earlier blog entry. The problem is that you can’t just use self by itself, since that only gives the class instance. This code with results shows the difference:

class String
  p self
end # => String

class String
  p (class << self; self; end)
end # => #<Class:String>

And, if you want to use define_method, module_eval and all the other tricks, you need to invoke them on the singleton-class, not the regular self. Basically, if you need to dynamically define class methods, you need the singleton-class. This example will show the difference between defining a dynamic method with self or the singleton class:

class String
  self.module_eval do
    define_method :foo do
      puts "inside foo"
    end
  end

  (class << self; self; end).module_eval do
    define_method :bar do
      puts "inside bar"
    end
  end
end

"string".foo # prints "inside foo"
String.bar # prints "inside bar"

As you can see, the singleton class will define the method on the class instead. Of course, if you know the class name it will always be easier to avoid having an explicit singleton class, but when the method needs to be defined dynamically you need it. It’s as simple as that.

8 Comments | By Ola Bini | In: Uncategorized | tags: metaprogramming, ruby, singleton class. | #

September 23rd, 2006

Three ways to add Ruby Macros

As most of my readers probably have realized at this point, I have a few obsessions. Lisp and Ruby happen to be two of the more prominent ones. And regarding Lisp, macros are what especially interest me. I have been doing much thinking lately on how you could go about adding some kind of macro facility to Ruby and these three options are the result.

I should begin by saying that none of these options are entirely practical right now. All of them have some serious problems which I frankly haven’t been able to come up with an answer for yet. But that doesn’t stop me from blogging about my ideas, of course. Another thing to notice is that this is not about hygienic macros. This is the full-blown, powerful, blow-the-moon-away version of macros.

MacRuby – Direct defmacro in Ruby
The first approach rests on modifying the language itself. You can add a defmacro keyword which takes a name and a code block to execute. Each time the compiler/interpreter finds a macro definition, it will remember the name. When that name is found in the code later on, each place will be marked. Then, before execution begins, every marked call site is replaced by the output of calling the macro with the subnodes at that site as arguments. An example of a simple macro:

 defmacro log logger, level, *messages
if $DEBUG
  :call, logger, level, *messages
else
  :nop
end
end

log @l, :debug, "value is: #{very_expensive_operation()}"

What’s interesting in this case is that the messages will not be evaluated if the $DEBUG flag is not set. This is because the value returned from the macro will be spliced into the AST only if that flag is set. Otherwise a no-op will be inserted instead. Obviously, for this kind of code to work, the interpreter would need to change substantially. There is also a big problem with it, since it’s very hard to fit this model into the object-oriented system of Ruby. As I think about it now, it seems macros would be the only non-OOP feature in Ruby, if added in this way. Another big problem with this model is that it is really not that intuitive what the resulting code from the macro will be. As soon as something more advanced needs to be returned, it will be very hard getting it straight in your head. One solution to this would be to do it the standard CL way. First write the output from the macro in several different instances. Then transform this to the AST code through a tool that parses the code. Then transform this into the macro. This process would be helped by tools, of course.
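For comparison, the deferred evaluation (though not the AST rewriting) can already be had in plain Ruby by passing the message as a block. This is a different technique from the macro described above, shown only to make the "messages will not be evaluated" point concrete; TinyLogger and its enabled flag are hypothetical.

```ruby
# Hypothetical logger: the message is a block, so it is only
# evaluated when logging is actually enabled.
class TinyLogger
  attr_reader :lines

  def initialize(enabled)
    @enabled = enabled
    @lines = []
  end

  def debug
    return unless @enabled
    @lines << yield
  end
end

calls = 0
very_expensive_operation = lambda { calls += 1; "42" }

quiet = TinyLogger.new(false)
quiet.debug { "value is: #{very_expensive_operation.call}" }
calls_when_disabled = calls   # the block was never run

loud = TinyLogger.new(true)
loud.debug { "value is: #{very_expensive_operation.call}" }
calls_when_enabled = calls    # the block ran exactly once
```

The difference from a real macro is that the check still happens at run time on every call, and the caller has to remember to wrap the message in a block.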

Back-and-Lisp-Ruby – Write macros in Lisp, translate Ruby back and forth
Another way to achieve this power in Ruby would be to separate the macro language from the main language. In effect, the macros would be a classic pre-processor. To offer the same power level as Lisp and others, the best way would be to write the macros themselves in a Lisp dialect, then transform Ruby in a well-defined way to Lisp and back again. (See the next approach for more about this idea.) In this situation the same macro as before could look like this:

 (defmacro log (logger level &rest messages)
 (if $DEBUG
     `(,level ,logger ,@messages)
     '()))

The main difference in this code is that the macro and the output from the macro are Lisp. We have gotten rid of the ugly :call and :nop return values, and to me this seems quite readable. Of course, I’m not sure everyone else feels the same way. And we still have the same problem with object orientation. It’s missing.

RoCL – Ruby over Common Lisp
The final idea is to build a Ruby runtime within Common Lisp and transform Ruby into Common Lisp before running it. The macros could either be added as Ruby code or Lisp code. Everything will be transformed into the equivalent code in Lisp, maybe using CLOS as the object system, or building something based on Ruby’s. Of course, the semantics of many things would change, and many libraries would need to be rewritten. But in the end, there would be incredible power available. Especially if we can make it go both ways, so that Common Lisp can use Ruby libraries.

An example transformation could look like this. From this Ruby:

 class String
  def revert(a, *args)
    if block_given?
      yield a
    else
      args + [a]
    end
  end
end

"abc".revert "one" do |x|
  puts x
end

This is nonsense code, if you hadn’t noticed. =) It could be transformed into this Lisp:

 (with-class "String" nil
            (def revert (a block &rest args)
              (if block
                  (apply block a)
                  (+ args [a]))))
(revert "abc" "one" #'(lambda (x)
                        (puts self x)))

Conclusions
It is very hard to actually retrofit macros into Ruby after the fact. I’m still not sure it can be done while keeping enough of Ruby’s semantics to make it meaningful. It seems that we need a new language. But if I had to choose among these approaches, the RoCL one seems the most interesting and also the most fun to implement. If I have a motto it would have to be something along the lines of “best of all worlds”. I want the best from Ruby, Java, Lisp, Erlang and everything I can find.

17 Comments | By Ola Bini | In: Uncategorized | tags: lisp, macros, ruby. | #

September 22nd, 2006

The Dark Ages of programming languages

We seem to be living in the dark ages of programming languages. I’m not saying this to bash everything; I’m actually being totally objective right now. Obviously, our situation right now is much better than it was 10 years ago. Or even 5 years ago. I would actually say that it’s really much better now than 1 year ago. But programming is still way too painful in almost all cases. We are doing so much stuff by hand that obviously should be done by computer.

I spend quite a lot of time learning new languages now and then, to try to find something that’s really good for me. So far, the best contestants are Ruby, Erlang, OCaml and Lisp, but all of those have their share of problems too. They just suck less than the alternatives.

Ruby… I really like Ruby. Ruby is such an improvement that I really want to do almost everything in it nowadays. I think in Ruby half the time and in Lisp the other half. But it’s not enough. It is still clunky. I want tail calls. I want real macros. I want blazing speed and complete integration with good libraries for everything and more. I’m just a sucker for power, and I want more of it in Ruby.
Erlang and OCaml. These languages are really great. For specific applications. Specifically, Erlang is totally superior for concurrent programming. And OCaml is incredibly fast, very typesafe and has great GUI libraries. So, if I was asked to do something massively concurrent I would probably choose Erlang, and OCaml if it was GUI programming. But otherwise… Well, Erlang does have some neat functional properties, but not any nice macro support. It doesn’t have a central code repository and many other things you expect from a general purpose language. OCaml suffers from the same things.
Lisp is the love of my life. But as so many people before me have noted, all the implementations are bad in some way or another. Scheme is lovely; for research. Common Lisp is so powerful, but it needs users. Lots of them, creating libraries for every little data format there can be, creating competing implementations of particularly important APIs, like databases.

Conclusion. Nothing is good enough, right now. I see two paths ahead. Two ways that could actually end in the “100-year language”.

The first path is one new language. This language will be based on all the best features of all current languages, plus a good amount of research output. I have a small list of what this language would need to be successful as the next big one:

It needs to be multiparadigm. I’m not saying it can’t choose one paradigm as the base, but it should be possible to program in it functionally, OOP, AOP, imperative. It should be possible to build a declarative library so you can do logic programming without leaving the language.
It should have static type inference where possible. It should also allow optional type hints. This is so important for creating great implementations. It can also increase readability in some cases.
It needs all the trappings of functional languages; closures, first-class functions and lambdas. This is essential, to avoid locking the language into an evolutionary corner.
It needs garbage collection. Possibly several competing implementations of GC’s, running evolutionary algorithms to find out which one is best suited for long running processes of the program in question.
A JIT VM. It seems almost a given right now that Virtual Machines are a big win. They can also be made incredibly fast.
Another JIT VM.
A non-VM implementation. Several competing implementations for different purposes is important to allow competition and experimentation with new features of implementation.
Great integration with legacy languages (Java, Ruby (note, I’m counting on all Rubyists moving to this new language when it gets out, making Ruby legacy), Cobol). This is obvious. There are too many things lying around, bitrotting, that we will never get rid of.
The language and at least one production quality implementation needs to be totally open-source. No lock-in of the language should be possible.
Likewise, good company support is essential. A language needs money to be developed.
A centralized code/library repository. This is one of Java’s biggest failings. Installing a new library in Java is painful. We need something like CPAN, ASDF, RubyGems.
The language needs great, small and very orthogonal libraries. The libraries included with the language need to be great, since they have to be small but still pack all the most needed punch.
Concurrency must be a breeze. There should be facilities in the language itself for making this obvious. (Like Erlang or Gambit Scheme).
It should be natural to do meta-programming in it (in the manner of Ruby).
It should be natural to solve problems bottom-up, by implementing DSL’s inside or outside the language.
The language needs a powerful macro facility that isn’t too hard to use.
Importantly, for the macro facility, the language needs to have a well-defined syntax tree of the simplest possible kind, but it also needs to have optional syntax.

So, that’s what I deem necessary (but maybe not sufficient) for a really useful, good, long term programming language. When I read this list, it doesn’t seem that probable that this language will show up any time soon, though. Actually, it seems kinda unrealistic.

So maybe the other way ahead is the right one? The other way I envision is that languages become easier and easier to create, and languages have their strength in different places. Along this path I envision the descendants of Ruby and Erlang exploiting what they’re good at and eschewing everything else. But for this strategy to work, the first thing implemented in each language needs to be a seamless way to integrate with other languages. Maybe there will come an extremely good glue-language (not like Perl or Ruby, but a language that only will serve as glue between programming languages), and all languages will implement good support for that language. For example you could code a base Erlang concurrent framework, which uses G (the glue language) to implement some enterprise functionality in Java sandboxes, and some places where Ruby through G will implement a DSL, which has subparts where Ruby uses G to run Prolog knowledge engines.

If I had to choose between the two futures, I am frankly more inclined towards the one-language one. But the multi-language way seems much more probable. And since I’m trying to choose a way now, I’m placing my bets on the second option. We are not ready to implement G yet, but I do think that as many p-language techs as possible should do their best to learn how languages can cooperate in different ways, to prepare for this project.

19 Comments | By Ola Bini | In: Uncategorized | tags: future of programming, lisp, metaprogramming, programming languages, ruby. | #

September 20th, 2006

Ruby Metaprogramming techniques

Updated: Scott Labounty wondered how the trace example could work and since a typical metaprogramming technique is writing before- and after-methods, I have added a small version of this.
Updated: Fixed two typos, found by Stephen Viles

I have been thinking a lot about metaprogramming lately. I have come to the conclusion that I would like to see more examples and explanations of these techniques. For good or bad, metaprogramming has entered the Ruby community as the standard way of accomplishing various tasks, and of compressing code. Since I couldn’t find any good resources of this kind, I will get the ball rolling by writing about some common Ruby techniques. These tips are probably most useful for programmers that come to Ruby from another language or haven’t experienced the joy of Ruby metaprogramming yet.

1. Use the singleton-class

Many ways of manipulating single objects are based on manipulations on the singleton class and having this available will make metaprogramming easier. The classic way to get at the singleton class is to execute something like this:

 sclass = (class << self; self; end)

RCR231 proposes the method Kernel#singleton_class with this definition:

module Kernel
  def singleton_class
    class << self; self; end
  end
end

I will use this method in some of the next tips.
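
As a quick illustration of what the singleton class buys you, here is a minimal sketch. The object names are made up for the example, and the guard is there because, as far as I know, Ruby 1.9.2 and later ship singleton_class built in, so the definition above is only needed on older interpreters:

```ruby
# Fallback definition for interpreters that lack a built-in singleton_class.
unless Object.new.respond_to?(:singleton_class)
  module Kernel
    def singleton_class
      class << self; self; end
    end
  end
end

greeter = Object.new   # hypothetical objects, for illustration only
other   = Object.new

# A method defined on the singleton class belongs to this one object only.
greeter.singleton_class.send(:define_method, :hello) { "hello from greeter" }

greeter_says   = greeter.hello              # "hello from greeter"
other_responds = other.respond_to?(:hello)  # false
```

Only greeter gains the new method; every other Object instance is left untouched.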

2. Write DSL’s using class-methods that rewrite subclasses

When you want to create a DSL for defining information about classes, the most common trouble is how to represent the information so that other parts of the framework can use them. Take this example where I define an ActiveRecord model object:

class Product < ActiveRecord::Base
  set_table_name 'produce'
end

In this case, the interesting call is set_table_name. How does that work? Well, there is a small amount of magic involved. One way to do it would be like this:

module ActiveRecord
  class Base
    def self.set_table_name name
      define_attr_method :table_name, name
    end

    def self.define_attr_method(name, value)
      singleton_class.send :alias_method, "original_#{name}", name
      singleton_class.class_eval do 
        define_method(name) do   
          value 
        end
      end
    end
  end
end

What’s interesting here is define_attr_method. In this case we need to get at the singleton class for the Product class, but we do not want to modify ActiveRecord::Base. By using singleton_class we can achieve this: inside a class method, self is the class the method was called on, so singleton_class refers to Product’s singleton class. We have to use send to alias the original method since alias_method is private (and the alias assumes that a table_name method already exists). Then we just define a new accessor which returns the value. If ActiveRecord wants the table name for a specific class, it can just call the accessor on the class. This way of dynamically creating methods and accessors on the singleton class is very common, and especially so in Rails.
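
To try this out without Rails, here is a self-contained sketch of the same idea. MiniRecord and its default table_name are hypothetical stand-ins I made up for the example, not ActiveRecord’s actual implementation, and it assumes singleton_class is available as per tip #1:

```ruby
module MiniRecord                 # hypothetical stand-in, not ActiveRecord
  class Base
    # Default accessor, so there is something to alias and override.
    def self.table_name
      name.downcase + "s"
    end

    def self.set_table_name(value)
      define_attr_method :table_name, value
    end

    def self.define_attr_method(name, value)
      # self is the subclass the call was made on (e.g. Product), so this
      # touches Product's singleton class and leaves Base alone.
      singleton_class.send :alias_method, "original_#{name}", name
      singleton_class.send(:define_method, name) { value }
    end
  end
end

class Product < MiniRecord::Base
  set_table_name 'produce'
end

class Order < MiniRecord::Base; end

product_table  = Product.table_name           # "produce"
original_table = Product.original_table_name  # "products"
order_table    = Order.table_name             # "orders"
```

Order still gets the default name, which shows that the override lives on Product alone.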

3. Create classes and modules dynamically

Ruby allows you to create and modify classes and modules dynamically. You can do almost anything you would like on any class or module that isn’t frozen. This is very useful in certain places. The Struct class is probably the best example, where

PersonVO = Struct.new(:name, :phone, :email)
p1 = PersonVO.new("Ola Bini")   # members are filled in positionally

will create a new class, assign this to the name PersonVO and then go ahead and create an instance of this class. Creating a new class from scratch and defining a new method on it is as simple as this:

c = Class.new
c.class_eval do
  define_method :foo do
    puts "Hello World"
  end
end

c.new.foo    # prints "Hello World"

Apart from Struct, examples of creating classes on the fly can be found in SOAP4R and Camping. Camping is especially interesting, since it has methods that create these classes, and you are supposed to inherit your controllers and views from them. Much of the interesting functionality in Camping is actually achieved in this way. From the unabridged version:

def R(*urls); Class.new(R) { meta_def(:urls) { urls } }; end

This makes it possible for you to create controllers like this:

class View < R '/view/(\d+)'
  def get post_id
  end
end

You can also create modules in this way, and include them in classes dynamically.
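
A small sketch of the module case, with made-up names: Module.new takes a block just like Class.new, and the resulting module can be mixed in wherever it is needed.

```ruby
# Build a module at runtime; greeter_module and greet are hypothetical names.
def greeter_module(greeting)
  Module.new do
    define_method(:greet) { |who| "#{greeting}, #{who}!" }
  end
end

class Swede; end
class Brit;  end

# send works whether include is private (older Rubies) or public (newer ones).
Swede.send :include, greeter_module("Hej")
Brit.send  :include, greeter_module("Hello")

swede_greeting = Swede.new.greet("Ola")   # "Hej, Ola!"
brit_greeting  = Brit.new.greet("Ola")    # "Hello, Ola!"
```

Each call to greeter_module produces a fresh anonymous module that closes over its own greeting.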

4. Use method_missing to do interesting things

Apart from blocks, method_missing is probably the most powerful feature of Ruby. It’s also one that is easy to abuse. Much code can be greatly simplified by good use of method_missing, and some things aren’t even possible without it. A good example (also from Camping) is an extension to Hash:

class Hash
  def method_missing(m,*a)
    if m.to_s =~ /=$/  
      self[$`] = a[0]
    elsif a.empty?  
      self[m.to_s]   # m arrives as a Symbol; the keys used below are Strings
    else  
      raise NoMethodError, "#{m}"
    end
  end
end

This code makes it possible to use a hash like this:

x = {'abc' => 123}
x.abc # => 123
x.foo = :baz
x # => {'abc' => 123, 'foo' => :baz}

As you see, if someone calls a method that doesn’t exist on the hash, the method name will be looked up as a key in the hash itself. If the method name ends with an =, a value will be set with the key of the method name excluding the equal sign.

Another nice method_missing technique can be found in Markaby. The code I’m referring to makes it possible to emit any XHTML tag, with CSS classes attached to it. This code:

body do
  h1.header 'Blog'
  div.content do
    'Hellu'
  end
end

will emit this XML:

  <body><h1 class="header">Blog</h1><div class="content">Hellu</div></body>

Most of this functionality, especially the CSS class names, is created by having a method_missing that sets attributes on self and then returns self again.
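
To make the set-attributes-and-return-self idea concrete, here is a heavily simplified sketch. TinyTag is a hypothetical class of my own, not Markaby’s code, and it handles only a single tag with an optional string body:

```ruby
# Hypothetical, heavily simplified stand-in for a Markaby-style tag object.
class TinyTag
  def initialize(name)
    @name = name
    @classes = []
    @content = ""
  end

  # Any unknown method name is recorded as a CSS class; an optional string
  # argument becomes the tag body. Returning self allows further chaining.
  def method_missing(css_class, content = nil)
    @classes << css_class.to_s
    @content = content.to_s if content
    self
  end

  def to_s
    attrs = @classes.empty? ? "" : " class=\"#{@classes.join(' ')}\""
    "<#{@name}#{attrs}>#{@content}</#{@name}>"
  end
end

html = TinyTag.new(:h1).header('Blog').to_s   # <h1 class="header">Blog</h1>
```

Because every unknown call returns self, class names can be chained, as in TinyTag.new(:div).content.wide.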

5. Dispatch on method-patterns

This is an easy way to achieve extensibility in ways you can’t anticipate. For example, I recently created a small framework for validation. The central Validator class will find all methods in self that begin with check_ and call each of them, making it very easy to add new checks: just add a new method to the class, or to one instance.

methods.grep(/^check_/) do |m|
  send m
end

This is really easy, and incredibly powerful. Just look at Test::Unit, which uses this technique all over the place.
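
Here is a small sketch of what such a validator could look like. It is not the framework mentioned above; the class layout, the check names and the record format are all assumptions made for the example:

```ruby
# Hypothetical validator: every method named check_* is discovered and run.
class Validator
  attr_reader :errors

  def initialize(record)
    @record = record
    @errors = []
  end

  def validate
    # Sorting only makes the order of the checks predictable.
    methods.grep(/^check_/).sort.each { |m| send(m) }
    errors.empty?
  end

  def check_name
    errors << "name is missing" if @record[:name].to_s.empty?
  end

  def check_age
    errors << "age must be positive" unless @record[:age].to_i > 0
  end
end

v = Validator.new(:name => "", :age => 3)

# A check added to one instance only is picked up as well.
def v.check_email
  errors << "email is missing" unless @record[:email]
end

valid = v.validate   # false; v.errors lists the two failed checks
```

Nothing in validate had to change to accommodate check_email, which is the whole point of dispatching on a naming pattern.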

6. Replacing methods

Sometimes a method implementation just doesn’t do what you want. Or maybe it only does half of it. The standard Object Oriented Way ™ is to subclass and override, and then call super. This only works if you have control over the object instantiation for the class in question. This is often not the case, and then subclassing is worthless. To achieve the same functionality, alias the old method and add a new method definition that calls the old method. Make sure that the previous method’s pre- and postconditions are preserved.

class String
  alias_method :original_reverse, :reverse

  def reverse 
    puts "reversing, please wait..." 
    original_reverse
  end
end

A twist on this technique is to alias a method temporarily and then restore the original afterwards. For example, you could do something like this:

def trace(*mths)
  add_tracing(*mths) # aliases the methods named, adding tracing
  yield
ensure
  remove_tracing(*mths) # removes the tracing aliases, even if the block raises
end

This example shows a typical way one could code the add_tracing and remove_tracing methods. It depends on singleton_class being available, as per tip #1:

class Object  
  def add_tracing(*mths)
    mths.each do |m|
      singleton_class.send :alias_method, "traced_#{m}", m      
      singleton_class.send :define_method, m do |*args|        
        $stderr.puts "before #{m}(#{args.inspect})"        
        ret = self.send("traced_#{m}", *args)        
        $stderr.puts "after #{m} - #{ret.inspect}"        
        ret      
      end    
    end  
  end

  def remove_tracing(*mths)    
    mths.each do |m|      
      singleton_class.send :alias_method, m, "traced_#{m}"    
    end  
  end
end

s = "abc"
s.add_tracing :reverse
s.reverse   # writes the before/after lines to $stderr and returns "cba"

If these methods were added to Module (with a slightly different implementation; see if you can get it working!), you could also add and remove tracing on classes instead of instances.

7. Use NilClass to implement the Introduce Null Object refactoring

In Fowler’s Refactoring, the refactoring called Introduce Null Object is for situations where a variable could hold either an object or null, and where null should behave as if it had a predefined value. A typical example would be this:

name = x.nil? ? "default name" : x.name

Now, the refactoring is based on Java, which is why it recommends creating a subclass of the class in question, an instance of which gets set wherever the value would otherwise have been null. For example, a NullPerson object will inherit from Person, and override name to always return the “default name” string. But in Ruby we have open classes, which means you can do this:

def nil.name; "default name"; end
x # => nil
name = x.name # => "default name"

8. Learn the different versions of eval

There are several evaluation primitives in Ruby, and it’s important to know the difference between them, and when to use which. The available contestants are eval, instance_eval, module_eval and class_eval. First, class_eval is an alias for module_eval. Second, there are some differences between eval and the others. Most importantly, eval only takes a string to evaluate, while the others can evaluate a block instead. That means that eval should be your absolute last resort. It has its uses, but mostly you can get away with just evaluating blocks with instance_eval and module_eval.

Eval will evaluate the string in the current environment or, if a binding is provided, in that environment. (See tip #11).

Instance_eval will evaluate the string or the block in the context of the receiver. Specifically, this means that self will be set to the receiver while evaluating.

Module_eval will evaluate the string or the block in the context of the module it is called on. This sees much use for defining new methods on modules or singleton classes. The main difference between instance_eval and module_eval lies in where the methods defined will be put. If you use String.instance_eval and do a def foo inside, this will be available as String.foo, but if you do the same thing with module_eval you’ll get String.new.foo instead.

Module_eval is almost always what you want. Avoid eval like the plague. Follow these simple rules and you’ll be OK.
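
The difference in where def ends up is easy to check for yourself. A minimal sketch, using a throwaway Widget class that exists only for this comparison:

```ruby
class Widget; end   # hypothetical class used only for this comparison

# def inside instance_eval lands on the receiver's singleton class,
# so here it becomes a class-level method.
Widget.instance_eval do
  def describe; "called on the class"; end
end

# def inside module_eval lands in the class itself,
# so it becomes an ordinary instance method.
Widget.module_eval do
  def describe; "called on an instance"; end
end

class_level    = Widget.describe       # "called on the class"
instance_level = Widget.new.describe   # "called on an instance"

# With a plain block, instance_eval simply changes what self is:
length = "abc".instance_eval { size }  # 3
```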

9. Introspect on instance variables

A trick that Rails uses to make instance variables from the controller available in the view is to introspect on an object’s instance variables. This is a grave violation of encapsulation, of course, but can be really handy sometimes. It’s easy to do with instance_variables, instance_variable_get and instance_variable_set. To copy all instance variables from one object to another, you could do it like this:

from.instance_variables.each do |v|
  to.instance_variable_set v, from.instance_variable_get(v)
end
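
Put into a runnable form, the copy could look like the sketch below. The Controller and View classes are hypothetical and only loosely modelled on the Rails case; this is not how Rails itself is implemented:

```ruby
class Controller   # hypothetical, loosely modelled on the Rails case
  def initialize
    @title = "Products"
    @items = [1, 2, 3]
  end
end

class View         # hypothetical; it never sets @title or @items itself
  def render
    "#{@title}: #{@items.join(', ')}"
  end
end

from = Controller.new
to   = View.new

from.instance_variables.each do |v|
  to.instance_variable_set v, from.instance_variable_get(v)
end

rendered = to.render   # "Products: 1, 2, 3"
```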

10. Create Procs from blocks and send them around

Materializing a Proc, saving it in a variable and sending it around makes many APIs very easy to use. This is one of the techniques Markaby uses to manage those CSS class definitions. As the pick-axe details, it’s easy to turn a block into a Proc:

def create_proc(&p); p; end
create_proc do
  puts "hello"
end       # => #<Proc ...>

Calling it is as easy:

p.call(*args)

If you want to use the proc for defining methods, you should use lambda to create it, so return and break will behave the way you expect:

p = lambda { puts "hoho"; return 1 }
define_method(:a, &p)
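
The difference in how return behaves is easiest to see side by side. A small sketch, with method names made up for the comparison:

```ruby
def run_lambda
  l = lambda { return 1 }
  l.call   # return only leaves the lambda
  2
end

def run_proc
  pr = Proc.new { return 1 }
  pr.call  # return leaves run_proc itself
  2
end

lambda_result = run_lambda   # 2
proc_result   = run_proc     # 1
```

With the lambda, execution continues after the call; with the plain Proc, the return inside the block ends the surrounding method.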

Remember that method_missing will receive the block if one is given:

def method_missing(name, *args, &block)
  block.call(*args) if block_given?
end

thismethoddoesntexist("abc","cde") do |*args|
  p args
end  # => ["abc","cde"]

11. Use binding to control your evaluations

If you do feel the need to really use eval, you should know that you can control what variables are available when doing this. Use the Kernel-method binding to get the Binding-object at the current point. An example:

def get_b; binding; end
foo = 13
eval("puts foo",get_b) # => NameError: undefined local variable or method `foo' for main:Object

This technique is used in ERb and Rails, among others, to set which instance variables are available. As an example:

class Holder
  def get_b; binding; end
end

h = Holder.new
h.instance_variable_set "@foo", 25
eval("@foo",h.get_b) # => 25
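
The same Holder can be handed to ERB from the standard library, which evaluates a template inside whatever binding you pass to result. This is a minimal sketch of the idea with a made-up template string, not a description of how Rails wires up its views:

```ruby
require 'erb'

class Holder
  def get_b; binding; end
end

h = Holder.new
h.instance_variable_set "@foo", 25

# The template is evaluated with self set to h, so @foo resolves to 25.
template = ERB.new("foo is <%= @foo %>")
output = template.result(h.get_b)   # "foo is 25"
```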

Hopefully, some of these tips and techniques have clarified metaprogramming for you. I don’t claim to be an expert on either Ruby or Metaprogramming. These are just my humble thoughts on the matter.

29 Comments | By Ola Bini | In: Uncategorized | tags: metaprogramming, ruby. | #