Dynamic languages on Google App Engine – an overview


As mentioned in a post a few minute ago here, Google has released App Engine support for Java. This is obviously very cool – and I’ve spent a few weeks testing several things using it. It should come as no surprise that my main goal with this investigation has been to see how dynamic languages fit in with the Java story.

The good news are these: JRuby works very well on the infrastructure. I will spend some more time in another post detailing what you have to do to get a JRuby on Rails application working on Google App Engine. In this post I’ll talk a bit about the different kind of restrictions a language implementation will run into, and what needs fixing.

Several other people has been testing languages such as Groovy, Scala, Clojure and Jython. My own experiments have been focused on JRuby and Ioke. At the moment, Ioke still doesn’t run on GAE/J, but the issue is something I hope will be fixed soon.

When looking at GAE/J, it’s important to keep in mind the security restrictions that Google has been forced to implement, to make the Java implementation totally safe for them. This includes restrictions of many kinds, and some of them might come as a bit of a surprise in some cases. One of the larger things you will notice is that some classes aren’t available – and you will get a ClassNotFoundException if you try to use them from your application. Personally, I believe that using a SecurityException when trying to load these might have been better, but this fact remains: many classes you expect will not be there.

Among the classes that are there (and the important parts of the JDK are there) there are many that will give you different kinds of security related problems too. JRuby trunk has been fixed for all these issues, so it should work without modification.

File system

GAE/J restricts quite a lot of what you can do with the file system. One of the things that surprised me was that calling methods like java.io.File#canRead on a restricted file might throw a SecurityException. Basically this means that all file access in an implementation need to wrap these calls in try-catch blocks.

In JRuby, I solved this by an approach that Ryan Brown gave us – creating a subclass of java.io.File that wraps all these method and return something reasonable. canRead should for example just return false if it gets a SecurityException.

Threads

It’s very hard to secure a thread scheduler – there are ways of screwing up things that are basically impossible to guard against. That means GAE/J does not support threads at all. You can’t create new ones, you can’t create new ThreadGroups or change most settings on these threads.

This is something that is less problematic for some languages, and more problematic for others. I know that Lift (in Scala) for example had some trouble, since it relied very heavily on actors, implemented using thread pools.

Reflection

Java’s reflection capabilities are very powerful, and most of the reflection methods throw several kinds of exceptions. On GAE/J you will have to guard most reflective accesses against SecurityException too. One of the things that many dynamic languages do is to use “setAccessible” on all methods. This will fail on some methods that Google thinks you shouldn’t have access to. Several of the methods on Object are among these.

Verification

In some cases, the bytecode verifier is a bit stricter than for other JDKs. It’s important to try out many corners of the application and see that it works correctly. Of course, if your language generates code at runtime, this is even more important. The good news is that I haven’t seen any problems at all with JRuby. The bad news is that the parser for Ioke doesn’t even load (and that is static Java code). This seems like a small problem in the verifier where a stack height of 0 causes it to fail, so hopefully it will be shortly fixed.

Class loading

One of the early problems for Clojure was some intricacies in the way GAE/J handles class loaders. One of these is that doing ClassLoader.getSystemClassLoader() caused a SecurityException.

Testing

It is not immediately obvious to me how you can test applications written for GAE/J in another language. The intuition is that you would be able to use the local development server to run tests, but many of the things above don’t work exactly the same in the local dev server, and some things are problematic locally but won’t cause any trouble on the server. One thing I’ve noticed is that JRuby doesn’t load correctly, because the dev server doesn’t actually load things from jar-files in such a way that JRuby can load several property files from the same jar-file. This issue doesn’t actually exist on the real servers.

You can use unit tests to test parts of your application, but you need to make sure to stub out all the calls to the Google APIs. This is actually kinda hard in Java, since one of the negative aspects about the GAE/J APIs is that they are built around singleton factories. It is very hard to inject new functionality there.

With JRuby, you can of course override these methods and unit test without running them. The main problem with this kind of unit testing is that it won’t give you any real security on the server – since you might still run into several kinds of security exceptions.

I ended up implementing a very small unit testing framework for sanity checking. This allows you to trigger a test run by going to a specific URL. Of course, this approach sucks.

At the end of the day, it seems the best kind of testing you can do is functional testing using something like Selenium or WebDriver. Or Twist. GAE/J allow you to have different versions of an application, so one way you could utilize this automatically is to allow your automatic test run deploy to a version called “test”, and then you can use a specific url to get the latest version. Say your app is deployed on “testgae”, you can allow your CI to test against “test.latest.testgae.appspot.com”, while the production environment is still running on “testgae.appspot.com”. It’s still not perfect, but it gives you some flexibility and a possibility to run continuous integration on the correct infrastructure.


11 Comments, Comment or Ping

  1. Thanks for exploring dynamic language support. Do you know if anybody has looked into supporting Rhino on GAE?

    BTW, unit testing singletons is easy with powermock; I highly recommend it. http://code.google.com/p/powermock/

    April 8th, 2009

  2. David Ziegler

    One thing I really liked about the python version of app engine is that it scaled down as well as up. That is an application that just got the occasional hit still had good performance, and cost very little to run (if anything).

    From what you describe, Java app engine applications (particularly those with lots of user libraries, or with dynamic language compilation at startup) have a big, expensive JVM warmup time.

    So, if I were to write a JRuby application that only got a flurry of requests every hour or so would I end up with this long warmup time each time – or would google be happy to keep several MB of classes and objects live in a JVM waiting for the next request?

    What did you see in your experimentation? If you left your app for a hour or so, did you get the big hit the next time you accessed it?

    – Thanks

    David

    April 10th, 2009

Reply to “Dynamic languages on Google App Engine – an overview”