SQLServer is also stupid

I can’t understand the problems database drivers have with whitespace where there shouldn’t be any. It’s clearly wrong…

Say that you create a table in SQLServer with a nullable column. If you use DatabaseMetaData, call getColumns for that table, and fetch the value at index 18 (IS_NULLABLE) for that column, what you get back is “YES”, which is all according to the API. But if you by chance do the same thing for a column that is non-nullable, what value would you get then?
“NO “. Notice the space. Lovely. I *heart* databases.
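
Code that reads this metadata has to trim before comparing. A minimal Ruby sketch of that idea follows; the helper name parse_is_nullable is made up for illustration and is not part of JDBC or AR-JDBC.

```ruby
# Hypothetical helper: interpret the IS_NULLABLE string returned by
# DatabaseMetaData.getColumns while tolerating stray whitespace.
def parse_is_nullable(raw)
  case raw.to_s.strip.upcase
  when "YES" then true
  when "NO"  then false
  else nil # JDBC documents an empty string as "nullability unknown"
  end
end

p parse_is_nullable("YES") # => true
p parse_is_nullable("NO ") # => false
```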

… Someone asked me in response to my last blog post which database I would choose if I could. Hard question. I’d rather do without databases. But if I have to, I’d go with Mimer SQL which is probably the most SQL-compliant database ever, and really doesn’t have WTF moments at all. It’s small, it’s from Sweden, and it’s very nice.

Oracle is stupid

I’ve just spent two days debugging and fixing AR-JDBC issues with Oracle. And let me tell you, those days haven’t been fun. I am really unfond of Oracle at the moment. You probably saw my last post. Now, let me add a new point of insanity to the proceedings…

Say that you define a table like this:

create table companies (
  firm_id integer default null references companies
)

Now, try to get the default value of the column firm_id in some manner. For example, you could use JDBC: call DatabaseMetaData.getColumns and then getString(13) on the result. You could also use the OCI8 C interface; you would get the same result. Any guesses? What is the default value of the column firm_id? Some might say that it should be the String “null”. Nope. It’s the String “null “. Notice the space.

Now, if you instead defined your table like this:

create table companies (
  firm_id integer default null
)

In this case, what is the default value of the column firm_id? It’s “null”. Without a space. Yes, it varies. Yes, it actually varies based on the formatting of the SQL used to create the table. You could potentially use the Whitespace language to embed arbitrary programs in the null default value… Because if there are two blanks between the word null and the next token, then that’s what you will get from the default value. Notice that we used two totally different interfaces to get this information, so it’s obviously something that is saved in the database engine. Wow.
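
One way to cope on the reading side is to normalize the reported default before using it. This is only a sketch: normalize_default is a hypothetical helper, and treating a textual “null” default as no default at all is an assumption rather than anything Oracle or AR-JDBC prescribes.

```ruby
# Hypothetical helper: clean up a column default as reported by the driver.
# Assumption: a default whose text is just "null" means "no default".
def normalize_default(raw)
  return nil if raw.nil?
  value = raw.strip
  value.downcase == "null" ? nil : value
end

p normalize_default("null ")  # => nil
p normalize_default("null")   # => nil
p normalize_default(" 42 ")   # => "42"
```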

Is this insane? Is it crazy? Am I thoroughly disgusted by now?

Yes. And also, AR-JDBC finally runs all ActiveRecord tests with Oracle.

What’s wrong with this code?

Today I will introduce to you a method from ActiveRecord. The method takes a parameter called type, and that value can be, for example, :primary_key, :string or :integer. Now, in the first line there is a call to native_database_types. Generally, that call returns a structure that looks somewhat like this:

def native_database_types #:nodoc:
  {
    :primary_key => "int(11) DEFAULT NULL auto_increment PRIMARY KEY",
    :string      => { :name => "varchar", :limit => 255 },
    :text        => { :name => "text" },
    :integer     => { :name => "int", :limit => 11 },
    :float       => { :name => "float" },
    :decimal     => { :name => "decimal" },
    :datetime    => { :name => "datetime" },
    :timestamp   => { :name => "datetime" },
    :time        => { :name => "time" },
    :date        => { :name => "date" },
    :binary      => { :name => "blob" },
    :boolean     => { :name => "tinyint", :limit => 1 }
  }
end

The method itself looks like this.

def type_to_sql(type, limit = nil, precision = nil, scale = nil) #:nodoc:
  native = native_database_types[type]
  column_type_sql = native.is_a?(Hash) ? native[:name] : native
  if type == :decimal # ignore limit, use precison and scale
    precision ||= native[:precision]
    scale ||= native[:scale]
    if precision
      if scale
        column_type_sql << "(#{precision},#{scale})"
      else
        column_type_sql << "(#{precision})"
      end
    else
      raise ArgumentError, "Error adding decimal column: precision cannot be empty if scale if specified" if scale
    end
    column_type_sql
  else
    limit ||= native[:limit]
    column_type_sql << "(#{limit})" if limit
    column_type_sql
  end
end

There is something very wrong with this implementation. Of course, there could exist many errors here, but what I’m thinking about right now is a violation of the usual way methods should work. And in effect, that problem with this method has caused ActiveRecord-JDBC to implement some very inefficient code to handle this method. And it gets called a lot in ActiveRecord. I’ll get back later today with a pointer to what’s wrong here, and I will also discuss some of what I’ve done in AR-JDBC to handle this situation. I hope for many suggestions here! =)

This is what’s wrong

I must say, I got some really good responses to my post about what was wrong with the code I posted. Most of those responses concerned the design of the code, and I agree, this part of Rails could have been done much better. But what I was thinking about was actually a bug. And Lars Westegren (my former colleague) nailed it at the first try. Let me show two important excerpts from this code:

column_type_sql = native.is_a?(Hash) ? native[:name] : native

and here:

column_type_sql << "(#{limit})" if limit

Obviously, the double left arrow (<<) is append, and in every case where there is a limit, this append changes the String in place. This is one of the cases where it’s kind of annoying that strings are mutable. If I cache away the values that native_database_types should return, then the next time anyone wants a string SQL type, that will generate VARCHAR(255)(255). The next time again, VARCHAR(255)(255)(255). And so on. So either I need to recreate the hash every time, or I need to do a deep clone of it every time. Neither of these options is very good, and it seems the deep clone option isn’t fast enough, even when done in Java, so I decided to go with a hash literal instead. Was that the right choice? I don’t know. It improves performance, but on the other hand it churns objects and creates new objects all the time. All of this because of some sloppy coding in Rails.

What’s the lesson learned? Never modify your arguments, unless that is an explicit part of the contract for that method and part of the documentation.
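
The effect is easy to reproduce outside Rails. The sketch below uses simplified stand-ins, not the real type_to_sql: both helpers take the cached hash as an argument, and the dup-based variant is one possible fix on the callee side, which is a different technique from the hash-literal workaround used in AR-JDBC.

```ruby
# Simplified stand-in for type_to_sql: appends to the cached String in place.
def buggy_type_to_sql(types, type)
  native = types[type]
  sql = native[:name]            # same object as the one stored in the cache
  sql << "(#{native[:limit]})" if native[:limit]
  sql
end

# Same logic, but works on a copy so the caller's data is left alone.
def safe_type_to_sql(types, type)
  native = types[type]
  sql = native[:name].dup        # copy before appending
  sql << "(#{native[:limit]})" if native[:limit]
  sql
end

# Hypothetical cached value of native_database_types (mutable strings).
cache = { :string => { :name => String.new("varchar"), :limit => 255 } }
p buggy_type_to_sql(cache, :string) # => "varchar(255)"
p buggy_type_to_sql(cache, :string) # => "varchar(255)(255)"

fresh = { :string => { :name => String.new("varchar"), :limit => 255 } }
p safe_type_to_sql(fresh, :string)  # => "varchar(255)"
p safe_type_to_sql(fresh, :string)  # => "varchar(255)"
```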

Testing with JRuby on Rails and ActiveRecord-JDBC

This will be a highly uninflammatory blog post, in contrast to the last one. Specifically, there is a slight problem when running the command

jruby -S rake

in a JRuby on Rails application. This problem is caused by some hard coded values in the database Rake definitions for Rails. But don’t despair, there is a simple solution to this. It’s not as simple as it should be (invisible) but it’s easy enough. Provided you have JRUBY_HOME set and your version of AR-JDBC is 0.3.1, execute this command from your Rails application root:

cp $JRUBY_HOME/lib/ruby/gems/1.8/gems/ActiveRecord-JDBC-0.3.1/lib/tasks/jdbc_databases.rake lib/tasks

Since the hard coded values are hard to override, the jdbc_databases.rake file just hacks Rake to be able to redefine tasks and then redefines the core tasks. This shouldn’t affect a bi-Ruby installation, since the overriding only happens on JRuby, not on MRI. If someone has a better way to do this, please tell me. =)
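
For readers wondering what redefining a Rake task can look like, here is a minimal sketch. It is not the contents of jdbc_databases.rake: it assumes a Rake version that provides Rake::Task#clear, and the redefine_task helper, the task name, and the replacement body are all illustrative.

```ruby
require "rake"

# Hypothetical helper: drop the existing actions and prerequisites of a
# task (if it is already defined), then define it again with a new body.
def redefine_task(name, &block)
  name = name.to_s
  Rake::Task[name].clear if Rake::Task.task_defined?(name)
  Rake::Task.define_task(name, &block)
end

# Only override on JRuby, so an MRI run keeps the stock task definitions.
if defined?(JRUBY_VERSION)
  redefine_task("db:test:prepare") do
    puts "JDBC-aware replacement would go here" # placeholder body
  end
end
```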

ActiveHibernate – Any takers?

This is a call for action. JRuby on Rails is getting more stable each day, and JRuby performance is consistently improving. This means that JRuby on Rails is well on the path of becoming a viable platform for several kinds of web development.

ActiveRecord-JDBC is central to this, and it’s a good project and works very well. Within the limitations of ActiveRecord, of course. So, I’m thinking (and this is by no means a new thought) that I would like to have the rest of Rails, but using Hibernate as the backend. With JRuby approaching 1.0 fast, ActiveHibernate seems like a neat project. The only problem is time. So why is ActiveHibernate tempting? Well, for all those situations where ActiveRecord really doesn’t fit: composite keys, complicated legacy systems. Or databases where you would like to use prepared statements for everything. Or get really nice speed.

What needs to be done? For it to be really useful, there are a few points. First, a Hibernate extension that serializes and deserializes into RubyObjects. By doing it that way, there is no need to create Java bean classes. Secondly, provide all the useful help functionality around the Hibernate implementation that AR users have gotten used to. This includes things like validations and automatic handling of updates and inserts, and is generally about building a good Ruby interface around Hibernate. This also includes creating a good way of configuring Hibernate without having to resort to the XML. Maybe using YAML? Or a Ruby DSL? After that has been done, the final point is easy: get mind share and start doing applications with it! I for one know that I would like to use such a project!

On ActiveRecord-JDBC performance

I have been a bit concerned about the performance of our component that connects ActiveRecord with JDBC. Since ActiveRecord demands that every result of a select should be turned into a big array of hashes of strings to strings, I suspected we would be quite inefficient at this, and I wasn’t sure I could put all my faith in JDBC either.

So, as a good developer, I decided to test this, with a very small microbenchmark, to see how bad the situation actually was.

Since I really wanted to check the raw database and unmarshalling performance, I decided to not use ActiveRecord classes, but do executions directly. The inner part of my benchmark execution looks like this:

conn.create_table :test_perf, :force => true do |t|
  t.column :one, :string
  t.column :two, :string
end

100.times do
  conn.insert("INSERT INTO test_perf(one, two) VALUES('one','two')")
end

1000.times do
  conn.select_all("SELECT * FROM test_perf")
end

conn.drop_table :test_perf

It is executed with a recent MySQL Community Edition 5 server, locally, with matching JDBC drivers. The MRI tests are run with 1.8.6, and both use ActiveRecord 1.15.3. ActiveRecord-JDBC is a prerelease of 0.2.4, available from trunk. My machine is an IBM Thinkpad T43p, running Debian. It’s 32bit and Java 6.
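
A timing harness of this general shape can be put together with Ruby’s standard Benchmark module. The sketch below is not the actual benchmark script: fake_select_all is a stand-in that only builds the array of string-to-string hashes ActiveRecord expects, with no database involved, and time_selects is a made-up helper name.

```ruby
require "benchmark"

# Stand-in for conn.select_all: one hash of strings to strings per row.
def fake_select_all(rows)
  Array.new(rows) { { "one" => "one", "two" => "two" } }
end

# Time a number of repeated "selects" and return a Benchmark::Tms.
def time_selects(iterations, rows)
  Benchmark.measure do
    iterations.times { fake_select_all(rows) }
  end
end

puts Benchmark::CAPTION       # the user/system/total/real header line
puts time_selects(1000, 100)  # timings vary from machine to machine
```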

The results were highly interesting. First, let’s see the baseline: the Ruby results:

      user     system      total        real
  7.730000   0.020000   7.750000 (  8.531013)

Frankly, I wasn’t that impressed with these numbers. I thought Ruby database performance was better. Oh well. The interesting part is the JRuby AR-JDBC results:

      user     system      total        real
  6.948000   0.000000   6.948000 (  6.948000)

WOW! We’re actually faster in this quite interesting test. Not what I had expected at all, but very welcome news indeed. Note that there is still much block overhead in the interpreter, so the results are a little bit skewed in MRI’s favour by this, too.

Announcing ActiveRecord-Mimer 0.0.1

The initial version of ActiveRecord-Mimer has been released.

The project aims to provide complete ActiveRecord support for the Mimer SQL database engine. This initial release provides the basis for that. Most operations work, including migrations. The only exceptions are rename_column and rename_table, which aren’t supported by the underlying database engine.

The project resides at RubyForge: http://rubyforge.org/projects/ar-mimer

and can be installed with RubyGems by
gem install activerecord-mimer

The code is released under an MIT license.

Announcing ActiveRecord-JDBC 0.2.2

Version 0.2.2 of ActiveRecord-JDBC has now been released. It contains numerous smaller bug fixes, but more importantly, support for Mimer SQL. The internals have been slightly refactored to make it easier to change database-specific instructions further down the road.

The release can be found at http://rubyforge.org/frs/?group_id=2014 or installed through RubyGems.