The development of Oracle Mix


Rich Manalang just posted a very nice entry on the Oracle AppsLab about the technology behind Oracle Mix, how we developed it and so on. Read it here.



Accumulators in Ruby


So, Ben Butler-Cole and I discussed the fact that accumulators in Ruby aren’t really done in the obvious way. This is due to the somewhat annoying feature of Ruby that nested method definitions with the def keyword aren’t lexically scoped, so you can’t implement an internal accumulator like you would in Python, Lisp, Haskell or other languages like that.

I’ve seen three different ways to handle this in Ruby code. To illustrate, let’s take the classic example of reversing a list. The functional way of doing this is to define an internal accumulator, which makes the implementation tail recursive and very efficient on linked lists.
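To make the scoping point concrete, here is a minimal sketch. The class and method names (Scoping, outer, inner) are made up for illustration. It shows that a nested def can't see the locals of the enclosing method, and that it just defines another method on the class:

```ruby
class Scoping
  def outer
    secret = 42
    # The nested def is not a closure: it defines a regular method on the
    # class when `outer` runs, and the body cannot see `secret`.
    def inner
      secret
    rescue NameError
      :not_visible
    end
    inner
  end
end

obj = Scoping.new
result = obj.outer                  # the local `secret` is out of reach for `inner`
leaked = obj.respond_to?(:inner, true)  # `inner` now lives on the class itself
```

So the helper both loses access to the enclosing variables and ends up in the class namespace anyway.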

So, the task is to reverse a list in a functional, recursive approach. First version, using optional arguments:

class Reverser
  def reverse(list, index = list.length-1, result = [])
    return result if index == -1
    result << list[index]
    reverse(list, index - 1, result)
  end
end

So, this one uses two default arguments, which makes it very easy to reuse the same method in the recursive case. The problem here is that the optional arguments expose an implementation detail which the caller really has no need of knowing. The implementation is simple but it puts more burden on the caller. This is also the pattern I see in most places in Ruby code. From a design perspective it’s not really that great.
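A small sketch of what that exposure means in practice. The class is the one above; the calls and their arguments are just illustrative. Nothing stops a caller from supplying the "internal" arguments and getting a surprising result:

```ruby
class Reverser
  def reverse(list, index = list.length - 1, result = [])
    return result if index == -1
    result << list[index]
    reverse(list, index - 1, result)
  end
end

reverser = Reverser.new
normal  = reverser.reverse([1, 2, 3])              # intended use
partial = reverser.reverse([1, 2, 3], 0)           # caller supplied the start index
seeded  = reverser.reverse([1, 2, 3], 2, [:junk])  # caller supplied the accumulator
```

Here normal is [3, 2, 1], partial is [1] and seeded is [:junk, 3, 2, 1], all of which the method signature happily allows.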

So, the next solution is to just define a private accumulator method:

class Reverser
  def reverse(list)
    reverse_accumulator(list, list.length-1, [])
  end

  private
  def reverse_accumulator(list, index, result)
    return result if index == -1
    result << list[index]
    reverse_accumulator(list, index - 1, result)
  end
end

This is probably in many cases the preferable solution. It makes the interface easier, but adds to the class namespace. To be sure, the responsibility for the implementation of an algorithm should ideally belong at the same place. With this solution you might have it spread out all over the place. Which brings us to the original problem – you can’t define lexically scoped methods within another method. So, in the third solution I make use of the fact that you can actually have recursive block invocations:

class Reverser
  def reverse(list)
    (rec = lambda do |index, result|
      return result if index == -1
      result << list[index]
      rec[index - 1, result]
    end)[list.length-1, []]
  end
end

The good thing about this implementation is that we avoid the added burden of both a divided implementation and an exposed implementation. It might seem a bit more complex to read if you’re not familiar with the pattern. Remember that [] is an alias for the call method on Procs. Also, since the assignment to rec happens in a static scope we can actually refer to it from inside the block and get the right value. Finally, all assignments return the assigned value which means that we can just enclose everything in parens and apply it directly. Another neat aspect of this is that since the block is a closure, we don’t need to pass the list variable around anymore.
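If the one-expression form is hard to read, the same idea can be spelled out in two steps. This is only a sketch of the pattern described above, using call instead of [] and a separate invocation instead of the enclosing parens:

```ruby
class Reverser
  def reverse(list)
    # `rec` is already a known local when the block body is parsed,
    # so the lambda can refer to itself through the closure.
    rec = lambda do |index, result|
      return result if index == -1   # return in a lambda leaves only the lambda
      result << list[index]          # `list` comes from the enclosing method
      rec.call(index - 1, result)    # same as rec[index - 1, result]
    end
    rec.call(list.length - 1, [])
  end
end

reversed = Reverser.new.reverse([1, 2, 3])
```

After this, reversed holds [3, 2, 1], and neither the index nor the accumulator is visible to the caller.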

Does anyone have a better solution on how to handle this? Accumulators aren’t really that common in Ruby – is this a result of Ruby making functional programming unneat, or are they just not needed?
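For comparison, and not as an answer to the question, one idiom that often takes the place of a hand-written accumulator in Ruby is inject, which threads the accumulated value through the block. A minimal sketch, with a made-up method name:

```ruby
# Not recursive: inject passes the accumulator from one block call to the next.
def reverse_with_inject(list)
  list.inject([]) { |result, item| result.unshift(item) }
end

reversed = reverse_with_inject([1, 2, 3])
```

This leaves [3, 2, 1] in reversed, but it sidesteps the recursion rather than solving the scoping problem.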



A new language


So, it’s that time of the year again. The restlessness flows over me. I feel cold and numb. And no, it’s not because I live in London – it’s because I need the warmth of learning a new language.

Now, I want something I can actually get into and learn. I’ve tried to get into OCaml, but I gotta admit I hate the type system. I have no problem with bondage-style statically typed languages (Haskell’s type system is really nice, for example) but OCaml’s really feels like half of it exists just to cover up holes in the other half. There seems to be a large overlap in functionality, and lots of workarounds for handling things that should be simple.

I’m half way into Erlang, but for several reasons the language feels very primitive.

I’ve kinda thought about maybe getting serious with Scala. I like many of the language features, it’s a nicely designed language and so on. But – hear this, people – I would love to get away from the JVM for a while, just for the sake of it. I can do Scala later. I actually have a medium sized project lined up for my Scala learning. But not right now.

So, what do I want? Something I haven’t touched before. I would love something that involves radically new language features, if there are any left to discover. I have no need for it to be static or dynamic specifically. Doesn’t really matter. It would be fun if it’s new, but if it’s old, good and still in use in some sectors that would be fun too. Specifically something that’s not mainly run on the JVM or CLR. And of course, not any of the “mainstream” languages, which I actually tend to know fairly well (and yeah, to my sorrow that includes the whole W-family…).

Please help me! Give this December new meaning for me. I promise, if someone comes up with a nice language to try out, I’ll be very fair to it when I evaluate and learn it. =)



Ruby memory leaks


They aren’t really common, but they do exist. As with any other garbage collected language, you can still be susceptible to memory leaks. In many cases they can also be very insidious. Say that you have a really large Rails application. After some time it grinds to a halt, CPU bound in GC. It may not even be a leak, it could just be something that creates so much garbage that the collector cannot take care of it.

I gotta admit, I’m not sure how to find such a problem. After getting histograms of objects, and trying to profile it, maybe run with ruby-debug, I would be out of options. Maybe some kind of shotgun technique – shutting down parts of the application, trying to pinpoint the location of the problem.
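For what it’s worth, the object histogram step can be sketched in plain Ruby with ObjectSpace. The helper name object_histogram is made up, and note that JRuby disables ObjectSpace by default, so this is mostly an MRI technique:

```ruby
# Count live objects per class and return the most numerous ones first.
def object_histogram(limit = 10)
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
  counts.sort_by { |_klass, count| -count }.first(limit)
end

# Print the top classes; run this periodically and compare the output.
object_histogram(5).each do |klass, count|
  puts "#{count}\t#{klass}"
end
```

A class whose count keeps growing between runs is a suspect, but this only says what is piling up, not who is holding on to it.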

Now, ordinarily, that would have been the end of my search. A failure. Or maybe several weeks of trying to read through the sources.

The alternative? Run the application in JRuby. See if the same memory leak shows up (remember, it might be a bad interaction with MRI’s runtime system that gives you grief. Or maybe even a bug in MRI’s garbage collector). But if it doesn’t go away, you’re in luck. Wait until the CPU starts chugging for real, and then take a heap dump using the jmap Java SDK tool. Once that’s done, you’ll be sitting with a large honking binary file that you can’t do much with. The standard way of reading it is through jhat, but that doesn’t give much to go on.

But then I found this wonderful tool called SAP Memory Analyzer. Google it and download it. It’s marvelous. Easily the best heap analyzer I’ve run across in a long time. Its only flaw is that it runs in Eclipse… But well, it can’t be everything, right?

Once you’ve opened up the file in SAP, you can do pretty much everything. It’s quite self explanatory. The way I usually go about things is to use the core option, and then choose “find_leak”. That almost always gives me some good suspects that I can continue investigating. From there on it’s just to drill down and find out exactly what’s going on.

Tell me if you can do that in any way as easy as that with MRI. I would love to know. But right now, JRuby is kicking butt in this regard.



Oracle developers on OSX unite!


All my ranting aside, Oracle RDBMS is pretty good. It’s got good performance, and lots of features you really need in a database. I shan’t proclaim it my favorite database, but it’s definitely something I have no problem working with. Except for that one small detail…

Yeah, you guessed it. Oracle support on Mac OS X is kinda… nonexistent. The best solution I’ve come up with is to run Parallels with a Windows or Linux instance and run Oracle XE inside of that. But that only works if I want to use the JDBC thin driver. OCI development? You’re screwed. And the Parallels route isn’t exactly painless either. Especially from a performance point of view.

So what do we need? OCI8 precompiled binaries would be a good start. But in the end, the only workable solution for all developers on OS X in the world who want to be able to use Oracle is a compatible Oracle XE for Intel OS X. It shouldn’t really be too hard, right? It’s just a BSD beneath the covers…

Anyway, it’s kinda interesting. If you’re a consultant or a developer, OS X is definitely the superior platform. That’s a fact (well, except for Java 6…). The lack of Oracle support forces people to develop their application against Postgres and then let continuous integration – you are using CI, right? – tell them if they made any Oracle-unfriendly mistakes. That doesn’t really sound too professional.

So, go on and vote for this in Oracle Mix. The links are here: https://mix.oracle.com/ideas/we-need-the-oracle-clients-oci-jdbc-for-the-apple-intel-osx-platform,
https://mix.oracle.com/ideas/compile-oracle-xe-for-intel-os-x



Oracle Mix has launched


Over the last 5 weeks, a team consisting of me, Alexey Verkhovsky, Matt Wastrodowski and Toby Tripp from ThoughtWorks, and Rich Manalang from Oracle has created a new application based on an internal Oracle application. This site is called Oracle Mix, and it aims to be the way Oracle’s customers communicate with Oracle and each other, suggesting ideas, answering each other’s questions and generally networking.

Why is this a huge deal? Well, for me personally it’s really kinda cool... It’s the first public JRuby on Rails site in existence. It’s deployed on the “red stack”: Oracle Enterprise Linux, Oracle Application Server, Oracle Database, Oracle SSO, Oracle Internet Directory. And JRuby on Rails.

It’s cool. Go check it out: http://mix.oracle.com.



QCon San Francisco recap


Last week I attended QCon San Francisco, a conference organized by InfoQ and Trifork (the company behind JAOO). I must admit that I was very positively surprised. I had expected it to be good, but I was blown away by the quality of most presentations. The conference had a system where you rated sessions by handing in a green, yellow or red card – I think I handed in two yellow cards, and the rest were green.

Everything started out with tutorials. I didn’t go to the first tutorial day, but the second day tutorial was my colleagues Martin Fowler and Neal Ford talking about Domain Specific Languages, so I decided to attend that. All in all it was lots of very interesting material. Sadly, I managed to get slightly food poisoned from the lunch, so I didn’t stay the whole day out.

On Wednesday, Kent Beck started the conference proper with a really good keynote on why Agile development really isn’t anything other than the way the world expects software development to happen nowadays. It’s clear to see that the Agile way provides many of the ilities that we have a responsibility to deliver. A very good talk.

After that Richard Gabriel delivered an extremely interesting presentation on how to think about ultralarge, self sustaining systems, and how we must shift the way we think about software to be able to handle large challenges like this.

The afternoon sessions were dominated by Brian Goetz’s extremely accomplished presentation on concurrency. I really liked seeing most of the knowledge available right now condensed into a 45 minute presentation, discussing most of the things we as programmers need to think about regarding concurrency. I am so glad other people are concentrating on these hard problems, though – concurrency scares me.

The panel on the future of Java was interesting, albeit I didn’t really agree with some of the conclusions Rod Johnson and Josh Bloch arrived at.

The day was capped by Richard Gabriel doing a keynote called 50 in 50. I’m not sure keynote is the right word. A poem, maybe? Or just a performance. It was very memorable, though. And beautiful. It’s interesting that you can apply that word to something that discusses different programming languages, but there you have it.

During the Thursday I was lazy and didn’t attend as many sessions as I did on the Wednesday. I saw Charles doing the JRuby presentation, Neal Ford discussing DSLs again, and my coworker Jim Webber rant about REST, SOA and WSDL. (Highly amusing, but beneath the hilarious surface Jim definitely had something very important to say about how we build Internet applications. I totally agree. Read his blog for more info.)

The Friday was also very good, but I missed the session about Second Life architecture which seemed very interesting. Justin Gehtland talked about CAS and OpenID in Rails, both solutions that I think are really important, and have their place in basically any organization. Something he said that rang especially true with me is that a Single Sign-On architecture isn’t just about security – it’s a way to make it easier to refactor your applications, giving you the possibility to combine or separate applications at will. Very good. Although it was scary to see the code the Ruby CAS server uses to generate token IDs. (Hint, it’s very easy to attack that part of the server.)

Just to strike a balance I had to satisfy my language geekery by attending Erik Meijer’s presentation on C#. It was real good fun, and Erik didn’t get annoyed at the fact that me and Josh Graham interrupted him after more or less every sentence, with new questions.

Finally, I saw half of Obie’s talk about the new REST support in Rails 2.0 (and he gave me a preview copy of his book – review forthcoming). There is lots of stuff there that can really make your application so much easier to code. Nice.

The day ended with two panels, first me, Charles, Josh Susser, Obie and James Cox talking about Rails, the future of the framework and some about the FUD that inevitably happens.

The final panel was Martin Fowler moderating me, Erik Meijer, Aino Vonge Corry and Dan Pritchett, talking about the things we had seen at the conference. The discussion ranged from large scale architecture down to concurrency implementations. Hopefully the audience were satisfied.

All in all, an incredibly good time.



JRuby 1.0.2 released


The JRuby community is pleased to announce the release of JRuby 1.0.2.

Homepage: http://www.jruby.org/
Download: http://dist.codehaus.org/jruby/


JRuby 1.0.2 is a minor release of our stable 1.0 branch. The fixes in this
release include primarily obvious compatibility issues that we felt were
low risk. We periodically push out point releases to continue supporting
production users of JRuby 1.0.x.

Highlights:
- Fixed several nasty issues for users on Windows
- Fixed a number of network compatibility issues
- Includes support for Rails 1.2.5
- Reduced memory footprint
- Improved File IO performance
- trap() fix
- 99 total issues resolved since JRuby 1.0.1

Special thanks to the new JRuby contributors who rose to Charlie's challenge
to write patches for some outstanding bugs: Riley Lynch, Mathias Biilmann
Christensen, Peter Brant, and Niels Bech Nielsen. Welcome aboard...


An interesting memory leak in JRuby


The last two days I had lots of fun with the interesting task of finding a major memory leak in JRuby. The only way I could reliably reproduce it was by running Mingle’s test suite and seeing memory being eaten. I tried several approaches, the first being using jhat to analyze the heap dumps. That didn’t really help me much, since all the interesting queries I tried to run with OQL had a tendency to just cause out of memory errors. Not nice.

Next step was to install SAP Memory Analyzer which actually worked really well, even though it’s built on top of Eclipse. After several false starts, including one where I thought we had found the memory leak, I finally got somewhere. Actually, I did find a memory leak in our method cache implementation. But alas, after fixing that it was obvious there was another leak in there.

I finally got SAP to tell me that RubyClasses were being retained. But when I tried to find the root chain to see how that happened I couldn’t see anything strange. In fact, what I saw was the normal chaining of frames, blocks, classes and other interesting parts. And this is really the problem when debugging this kind of problem in JRuby. Since a leak will almost always be leaking several different kinds of objects, it can be hard to pinpoint the exact problem. In this case I guessed that the problem was in a large branch that Bill merged a few weeks back, so I tried going back to it and checking. Alas, the branch was good. But since I had gone back 200 revisions, I finally knew within which range the problem had to be. Since I couldn’t find anything more from the heap dumps I resorted to the venerable tradition of binary search. Namely going through the revisions and finding the faulty one. According to log2, I would find the bad revision in about 8 tries, so I started out.

After a while I actually found the problem. Let me show it to you here:
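The arithmetic behind that estimate: log2(200) is roughly 7.64, so halving a range of 200 revisions pins down the first bad one in at most 8 checks. A minimal sketch of the search follows. The method name and the revision numbers are hypothetical, and the block stands in for "build this revision and see whether it leaks":

```ruby
# `good` is a revision known to be fine, `bad` one known to leak.
# The block answers "does this revision leak?" for a given revision number.
def first_bad_revision(good, bad)
  tries = 0
  while bad - good > 1
    mid = (good + bad) / 2
    tries += 1
    if yield(mid)
      bad = mid      # leak present: the culprit is at or before mid
    else
      good = mid     # still clean: the culprit is after mid
    end
  end
  [bad, tries]
end

estimate = Math.log2(200)   # roughly 7.64
# Pretend revision 137 introduced the leak, 200 revisions after a known good one.
culprit, tries = first_bad_revision(0, 200) { |rev| rev >= 137 }
```

With these made-up numbers the search lands on revision 137 without ever needing more than 8 checks.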

def __jtrap(*args, &block)
  sig = args.first
  sig = SIGNALS[sig] if sig.kind_of?(Fixnum)
  sig = sig.to_s.sub(/^SIG(.+)/,'\1')
  signal_class = Java::sun.misc.Signal
  signal_class.send :attr_accessor, :prev_handler
  signal_object = signal_class.new(sig) rescue nil
  return unless signal_object
  signal_handler = Java::sun.misc.SignalHandler.impl do
    begin
      block.call
    rescue Exception => e
      Thread.main.raise(e) rescue nil
    ensure
      # re-register the handler
      signal_class.handle(signal_object, signal_handler)
    end
  end
  signal_object.prev_handler = signal_class.handle(signal_object, signal_handler)
end

This is part of our signal handling code. Interestingly enough, I was nonplussed. How could trap leak? I mean, no one actually calls trap enough times to make it leak, right?

Well, wrong. Actually, it seems that ActiveRecord traps abort in transactions and then restores the original handler. So each transaction created new trap handlers. That would have been fine, except for the last line. In effect, in the current signal handler we save a reference to the previous signal handler. After a few iterations we will have a long chain of signal handlers, all pointing back, all held by a hard reference from one of the static root sets in the JVM (namely, the list of all signal handlers). That isn’t so bad though. Except a saved block has references to dynamic scopes (which reference variables). It has a reference to the Frame, and the Frame has references to RubyClass. RubyClass has references to method objects, and method objects have in some cases references to RubyProcs, which in turn have more references to Blocks. At the end, we have a massive leak.

The solution? To simply remove the saving of the previous handler and simplify the signal handler.
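The shape of the leak can be modelled in a few lines of plain Ruby. This is only a toy model: Handler and HandlerTable are made-up stand-ins for the signal handler objects and the JVM’s static handler list, not JRuby’s actual classes:

```ruby
# Hypothetical stand-in for a registered signal handler and its block.
Handler = Struct.new(:block, :prev_handler)

# Hypothetical stand-in for the JVM-wide handler list (a GC root).
class HandlerTable
  def install(keep_previous, &block)
    previous = @current
    # Keeping `previous` reachable from the new handler is the leak:
    # every older handler, and everything its block closes over, stays alive.
    @current = Handler.new(block, keep_previous ? previous : nil)
    previous
  end

  def chain_length
    length = 0
    handler = @current
    while handler
      length += 1
      handler = handler.prev_handler
    end
    length
  end
end

leaky = HandlerTable.new
1000.times { leaky.install(true) { :rollback } }    # one install per "transaction"

fixed = HandlerTable.new
1000.times { fixed.install(false) { :rollback } }

leaky_chain = leaky.chain_length
fixed_chain = fixed.chain_length
```

After a thousand simulated transactions the leaky table still reaches all 1000 handlers, while the fixed one reaches only the current handler.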



QCon and OpenWorld


As mentioned before I will be in San Francisco next week for QCon, and the week after that for Oracle OpenWorld. I will be part of a panel debate at QCon and man a booth at Oracle OpenWorld. In fact, if you’re attending OpenWorld you should visit the ThoughtWorks booth at 343 Moscone South. Looking forward to seeing you there.