A new server infrastructure


A month ago, Joyent told me that the machine I was hosting my server on was being end-of-lifed. This server has been running my blog, the Ioke and Seph home pages and various other small things for several years now. However, I’m ashamed to admit that the machine was always a Snowflake server. So now when I had to do something about it, I decided to make this the right way. So I thought I would write up a little bit what I ended up with in order to make this into a Phoenix server.

The first step was to decide on where to host my new server. At work I’ve been using AWS since January, and I like the experience. However, for various reasons I was not comfortable using Amazon for my personal things, so I looked at the capabilities of other providers. I ended up choosing Rackspace. In general I have liked it, and I would use them again for other things.

Once I had a provider, I had to find the right libraries to script provisioning of the server. We have been using Fabric and Boto for this at work – but we ended up having to write quite a lot of custom code for this. So I wanted something a bit better. I first looked at Pallet – from the feature lists it looked exactly like what I wanted, but in practice, the classpath problems with getting JClouds working with Rackspace correctly was just too much, and I gave up after a while. After that I looked around for other libraries. There are several OpenStack libraries, both for Ruby and for Python. Since I’m more comfortable I ended up with the Ruby version (the gem is called openstack-compute). However, since it was tied heavily to openstack I wrote a thin layer of genering provisioning functionality on top of it. Hopefully I’ll be able to clean it up and open source it soon.

When I had everything in place to provision servers, it was time to figure out how to apply server configurations. I’ve always preferred Puppet over Chef, and once again it’s something we use at my current project, so Puppet it is. I use a model where I pack up all the manifests and push them to the server and then run puppet there. That way I won’t have to deal with a centralized puppet server or anything else getting in the way.

And that is really all there is to it. I have all my sites in Git on Github, I have a cron script to pulls regularly from them, making sure to inject passwords in the right places after pulling. All in all, it ended up being much less painful than I expected.

So what about Rackspace then? The good: the APIs work fine, the machines seems stable and performant, and they now have a CloudDB solution that gives you roughly the same kind of capability as RDS. However, there is currently no backup option for the cloud databases. The two things I missed the most from AWS was EBS volumes and elastic IPs. However, EBS volumes are also a pain to deal with sometimes – I wish there was a better solution. Elastic IPs seems like the right solution for fast deploys, stable DNS names and so on – but they also have some problems, and the more I’ve been thinking about it, the less I realize I want them. In order to get zero-down time deploys and having stable DNS names I think the right solution is to use the Rackspace load balancer feature. You attach the DNS name to the LB, and then have scripts that point to the real machines. The useful feature for what I’ve been doing is that you can take up a new staging/pre-production server, add it to the load balancer (but while doing so, set the policy to only balance your personal home IP or the IP of your smoke test runner to the new box). That means you can try out the machine while everyone else gets routed to the old machine. Once you are satisfied you can just send an atomic update to the load balancer and switch out the policy to only point to the new machine.

Going forward, I think this is the kind of model I actually want for our AWS deploys as well.

So all in all, I’ve been pretty happy with the move.