Evil hook methods?


I have come to realize that there are a few hook methods I really don’t like in Ruby. Or actually, it’s not the hook methods I have a problem with – it’s the way much code is written using them. The specific hooks that seem to cause the most trouble for me are included, extended, append_features and extend_object. Let me first reiterate – I don’t dislike the methods per se. The power they give the language is incredible and should not be underestimated. What I dislike is the way they make things un-obvious when reading code that depends on them.

Let’s take a silly example:

module Ruby;end

module Slippers
  def self.included(other)
    other.send :include, Ruby
  end
end

class Judy
  include Slippers
end

p Judy.included_modules

Since all this code is in the same place, you can see what will happen when someone includes Slippers. And really, in this case the side effect isn’t entirely dire. But imagine that this was part of a slightly larger code base. Like for example Rails. And the modules were spread over the code base. And the included hook did a few more things with your class. No way of knowing what – except reading the code – and the Ruby idiom is that include will add some methods and constants to your class and that is it. Anything else is going outside what the core message of that statement is.

One of the most common things you see with the included hook is something like this:

module Slippers
  module ClassMethods
  end

  def self.included(other)
    other.send :extend, ClassMethods
  end
end

class Judy
  include Slippers
end

This will add some class methods to the class that includes this module. DataMapper does this in the public API, for example. It’s very neat, you only have to include one thing and you get stuff on both your instances and your classes. Except that’s not what include does. Not really. So say you’re debugging someone’s code and happen upon an include statement. You generally don’t check what it’s doing until you’ve exhausted most other options.
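
To see what the hook did behind the include, you can ask the class afterwards. This is only a minimal sketch; the shoe_size method is made up so that there is something to call:

```ruby
module Slippers
  module ClassMethods
    # Hypothetical class-level method, only here for illustration.
    def shoe_size
      7
    end
  end

  def self.included(other)
    other.extend ClassMethods
  end
end

class Judy
  include Slippers
end

# Nothing at the include site says so, but Judy now has class methods too.
p Judy.respond_to?(:shoe_size)                           # => true
p Judy.singleton_class.include?(Slippers::ClassMethods)  # => true
```

The only way to find this out is to go looking for it, which is exactly the problem.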

So what’s wrong with having a public API like this?

module Slippers
  module ClassMethods;end
end

class Judy
  include Slippers
  extend Slippers::ClassMethods
end

where you explicitly include the Slippers module and then extend the class methods. This is more obvious code, it doesn’t hide anything behind your expectations, and it also might give me the possibility to choose. What if I want most of the DataMapper instance methods, but really don’t want to have finders on my class? Maybe I want to have a repository pattern? In that case I’ll have to explicitly remove all class methods introduced by that include, because there is no way of choosing if I want to have the class methods or not.
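
A small sketch of what that choice looks like with the explicit API. The method names click_heels and find_by_color, and the second class Dorothy, are invented for the example:

```ruby
module Slippers
  # Instance-level behaviour (made up for this sketch).
  def click_heels
    "no place like home"
  end

  module ClassMethods
    # Class-level finder (also made up).
    def find_by_color(color)
      "looking for #{color} slippers"
    end
  end
end

# Takes both halves, and says so.
class Judy
  include Slippers
  extend Slippers::ClassMethods
end

# Takes only the instance methods, for example because finders live in a repository.
class Dorothy
  include Slippers
end

p Judy.find_by_color("ruby")
p Dorothy.new.click_heels
p Dorothy.respond_to?(:find_by_color)  # => false
```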

So that’s another benefit of dividing the extending out from the included hook. And finally, what about all those other things that people do in those hooks? Well, you don’t really need it. Make it part of the public API too! Instead of this:

module Slippers
  def self.do_funky_madness_on(other)
    # whatever setup the hook is supposed to perform
  end

  def self.included(other)
    do_funky_madness_on other
  end
end

class Judy
  include Slippers
end

make it explicit, like this:

module Slippers
  def self.do_funky_madness_on(other)
    # the same setup, now called explicitly by the user
  end
end

class Judy
  include Slippers
  Slippers.do_funky_madness_on self
end

This is really just good design. It makes the functionality explicit, it makes it possible for the user to choose what he wants without doing monkey patching. And it makes the code easier to read. Yeah, I know, this will mean more lines of code. Booo hooo! I know that Ruby people are generally obsessed with making their libraries as easy to use as possible, but it feels like it sometimes goes totally absurd and people stop thinking about readability, maintainability and all those other things. And really, Ruby is such a good language that a few more lines of code like this still won’t make a huge impact on the total lines of code.

I’m not saying I haven’t done this, of course. But hopefully I’ll get better at it. And I’m not saying not to use these methods at all – I’m just saying that you should use them with caution and taste.



WordPress posting malfunctioning on Orange


Interesting. While on vacation in Paris I have discovered that Orange seems to mess with some of the POST requests going to my WordPress. That includes posting new posts through the user interface, posting comments, and also posting using MarsEdit. Very strange. If you have Orange and get timeouts for some posts for your WordPress, now you know you’re not alone.



Polyglot Programming thesis


A few months ago I was interviewed for a master’s thesis on Polyglot Programming by Hans-Christian Fjeldberg. This is now finished and available online. You can download it here: http://theuntitledblog.com/wp-content/uploads/2008/08/polyglot_programming-a_business_perspective.pdf



Ribs available from Gems


Ribs is now available from Gems. You can install it using this command:

jruby -S gem install ribs

Much pleasure, and do give your comments and ideas to me.



Announcing Ribs 0.0.1


I am extremely pleased to announce the first release of Ribs.

Ribs is a library for JRuby that allows you to persist Ruby objects using Hibernate. Some time ago I wrote about ActiveHibernate. I have now decided to implement this myself, and the result is the Ribs project.

The first release is quite minimal in scope. You can define and work with models that have primitive values only – there is no support for associations. You can find, create, update and delete model objects. All of this uses Hibernate and JDBC.

To get started, you just define that an object is to be a Ribs model:

class Artist
  Ribs!
end

Once that’s done, you can start working with it.

Of course, this is just the beginning. I have a quite long list of things I’d like to have in the project, but I felt the need to release quickly and often to be more important than to implement everything first.

This release is not really for production usage, but I would appreciate if people tried it out and came with suggestions. The current planned features can be found in the PLAN file, in the git repository.

More documentation can be found here: http://olabini.com/projects/ribs/doc.
You can download the gem at: http://olabini.com/projects/ribs/downloads/ribs-0.0.1.gem.
The git repository is at: git://github.com/olabini/ribs.git.

Ribs will soon be available in the regular gem repositories – as soon as my Rubyforge project has been approved.

The project is released under the MIT license.



Unexpected JRuby overload resolution


Had an interesting bug using Hibernate from JRuby today. Totally unexpected actually. Interestingly, it actually exposed a problem with dynamic dispatch when going into a static language. To a degree I guess it’s about getting our overload resolution more correct, but it seems like a general rule will be hard.

Basically, my problem was this. I was calling update() on org.hibernate.Session. Now, I used the version that takes a String with the entity name, and the actual entity as the other parameter. So the signature update(String, Object) was the one I was aiming for. Sadly, things failed, and kept on failing. And I really couldn’t figure out why. I got this lovely error message: org.hibernate.MappingException: Unknown entity: java.lang.String. This problem can show up for several different reasons, so Google didn’t help.

And then, after tracing the calls for a bit, I finally understood. It just so happens that the default implementation of Session (called org.hibernate.impl.SessionImpl) has a few public update methods that are not part of the Session interface. One of them has the signature update(Object, Serializable). The first parameter is the object to update, and the serializable parameter is the id. JRuby was very helpful in choosing to call that method instead of the update(String, Object) one, since my entity happened to be serializable. Of course, this meant that Hibernate tried to persist a String instead of my real object, and this fails. The workaround was simple in this case: just use the single argument version of update, since the entity name can be figured out from the object.

But in general these kinds of problems will show up sometimes – it’s the price you pay for having an extremely flexible dynamic programming language interfacing with a statically typed language. But we can improve the overload resolution, and also make it possible to control it more explicitly. I’m currently thinking that it might be a good plan to have some debug flags that will give you some output about overload resolution and things like that too. What do you think? How would you solve this in JRuby?



RSA parameters in OpenSSL, Ruby and Java


I would just like to publish this information somewhere, so that Google can help people find it more easily than I did. If you have ever wondered how the internal OpenSSL RSA parameters map to the Java parameters on RSAPrivateCrtKey, this little table will probably help you a bit. There are three different names in motion here. The first one is the internal field name in OpenSSL. These are also used as method names in Ruby. The second name is what gets presented when you use something like to_text on an RSA key. The third name is what it’s called in Java.

  • n == modulus == modulus
  • e == public exponent == publicExponent
  • d == private exponent == privateExponent
  • p == prime1 == primeP
  • q == prime2 == primeQ
  • dmp1 == exponent1 == primeExponentP
  • dmq1 == exponent2 == primeExponentQ
  • iqmp == coefficient == crtCoefficient
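
To make the first column concrete, here is a small sketch that reads the parameters through the Ruby method names and checks the standard RSA relations between them. The 2048 bit key size and the local variable names are my own choices for the example:

```ruby
require 'openssl'

key = OpenSSL::PKey::RSA.generate(2048)

# Convert the OpenSSL::BN values to plain Ruby integers for the arithmetic.
modulus = key.n.to_i
d       = key.d.to_i
prime_p = key.p.to_i
prime_q = key.q.to_i

checks = {
  "n is the product of p and q"  => modulus == prime_p * prime_q,
  "dmp1 is d mod (p - 1)"        => key.dmp1.to_i == d % (prime_p - 1),
  "dmq1 is d mod (q - 1)"        => key.dmq1.to_i == d % (prime_q - 1),
  "iqmp is the inverse of q mod p" => (key.iqmp.to_i * prime_q) % prime_p == 1
}

checks.each { |description, ok| puts "#{description}: #{ok}" }
```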


Ruby HTTPS web calls


As I noted in my entry on Ruby security, VERIFY_NONE is used all over the place. And what I realized when I tried to use VERIFY_PEER was that it really doesn’t work for net/https, and doesn’t seem to ever have worked for me. I got a bit mystified by this since I couldn’t find much mention about it online. And then Victor Grey came to the rescue in one of the comments. The solution is to not use net/https at all, but instead use the httpclient gem (formerly called http-access2). So do a ‘gem install httpclient’. Then you can use this code:

require 'rubygems'
require 'httpclient'

clnt = HTTPClient.new
puts clnt.get_content("https://www.random.org/")

This will just work. Under the covers, httpclient uses VERIFY_PEER as default. And you can see this by changing the hostname from www.random.org to random.org. That will generate a verification error directly. Awesome. So what’s the lesson? Never use net/https, folks!



JRuby 1.1.4 Released


It’s late and I don’t have time to write something witty about this. This is the release announcement:

JRuby 1.1.4 is the fourth point release of JRuby 1.1.  The fixes in this
release are primarily obvious compatibility problems and performance
enhancements.  Our goal is to put out point releases more frequently for
the next several months (about 3-4 weeks a release).  We want a more
rapid release cycle to better address issues brought up by users of JRuby.

Highlights:

– Massive refactoring of Java integration layer
– 2-20x speed up of most features (calls, construction, arrays)
– Many long-standing Ruby/Java interaction bugs fixed
– Existing features made more consistent, reliable
– Closures can be passed as interface to static methods, constructors
– Java exceptions can be raised/rescued directly from Ruby
– Massive memory efficiency improvements (a lot less GC)
– Beginning of Ruby 1.9 support (enabled with --1.9 flag)
– Native complex/rational
– Additional efficiency, performance work in the interpreter
– Memory leak under --manage repaired
– FFI subsystem for calling C libraries
– syslog module from Rubinius is working and included
– win32 API support started
– Thread pooling improved (at least one production user now)
– Array concurrent-access improvements
– 72 issues resolved since JRuby 1.1.3

Issues fixed:
JRUBY-231        Provide attr_reader, attr_writer, and attr_accessor for JavaBean style getters & setters
JRUBY-1183     New closure conversion should prefer methods with convertable args over those without
JRUBY-1300     Masquerading of native Java exceptions
JRUBY-1326     Error invoking overloaded Java constructor
JRUBY-1562     Declaration of certain method name (setJavaObject(Xxx x)) will throw an exception using BSF
JRUBY-1615     Raising java exceptions from ruby causes TypeError
JRUBY-1707     Unable to raise Java exceptions of derived types
JRUBY-1735     Java Integration wraps to much
JRUBY-1839     closure conversion fails for blocks
JRUBY-1964     Determine what test/specs are needed to be written in order to refactor java integration post 1.1
JRUBY-1976     Working with JavaMethods doesn’t work.
JRUBY-2136     $VERBOSE = true; require ‘tmpdir’ gives non-fatal Java exception
JRUBY-2192     YAML parser does not appear to deserialize object types.
JRUBY-2204     Syslog module is not available for JRuby
JRUBY-2236     NPE in isDuckTypeConvertible
JRUBY-2287     Storing ruby objects in java classes instances
JRUBY-2377     Wrong line numbers for ArgumentError for Java calls
JRUBY-2429     Cannot Catch Core Java Exceptions From JRuby Internals in Ruby Code
JRUBY-2439     Trying to subclass a Java class from a signed .jar will crash on you.
JRUBY-2449     Implement closure convention for static java methods
JRUBY-2561     JavaField.set_value(foo, nil) breaks
JRUBY-2673     Java exceptions do not return the wrapped exception when getStackTrace is called
JRUBY-2680     When JIT Compiler compiles the append_features in the ruby\site_ruby\1.8\builtin\javasupport\proxy\interface.rb the compiled code slows down by a factor of 10
JRUBY-2741     OSGify jruby.jar in the release jruby distribution
JRUBY-2749     Make RaiseException show the exception message and the Ruby stack trace
JRUBY-2803     Bad performance calling Java classes
JRUBY-2823     Can’t reference Java’s constants that start with a lower case character
JRUBY-2828     Rational#% differs from MRI when argument is negative
JRUBY-2843     Issues with BasicSocket#close_read
JRUBY-2847     A non-existant jar + dir on the load path causes require to error
JRUBY-2850     In some cases, reopened Java objects cannot find methods on Ruby objects subclassed from Java
JRUBY-2854     AST offset error for StrNode and DStrNode
JRUBY-2857     Coercion error with public member variables
JRUBY-2863     Nested Interfaces can’t find the correct method when Java calls Ruby
JRUBY-2865     Can’t extend a class in default package
JRUBY-2867     Wrong overloaded Java method called when both int and float signatures exist
JRUBY-2869     IO.select fails to block with nil timeout
JRUBY-2870     [REGRESSION] Converting a Ruby array to a Java array (of Object references) broken
JRUBY-2872     JSpinner cannot accept Fixnum for it’s value
JRUBY-2873     FFI needs a way to specify call convention
JRUBY-2874     TCPSocket#new and TCPServer#new crash JRuby when the specified port is out of range (negative or bigger than 65k)
JRUBY-2879     net/ftp library is broken if mathn is also loaded
JRUBY-2880     Regression: 17 new RubySpec failures and 3 unit tests falirues caused by r7327 (Array changes)
JRUBY-2881     JAVA_HOME with () breaks JRuby on windows
JRUBY-2882     Incorrect subclass for constructor arg throws internal JRuby error
JRUBY-2886     Extending a final Java class should be rescuable as a normal Ruby exception type
JRUBY-2890     UDPSocket.recvfrom should block until something is available
JRUBY-2891     UDPSocket.bind throws a Java Error when already bound on Java 5
JRUBY-2892     JRuby releases use random copies of joni svn HEAD
JRUBY-2893     mspec runs need to pass properties through -T argument for compilation, etc
JRUBY-2894     When spec runs fail, Ant is not terminating with a failure message
JRUBY-2899     Using JavaEmbedUtils.rubyToJava causes problems when passing RubyObject-derived parameters back in to Ruby code
JRUBY-2903     Allow implementing Java interfaces with underscored method names
JRUBY-2905     NoMethodError does not give a useful message when thrown in BSF
JRUBY-2906     IOError message is garbled when java.io.IOException message is multi-byte character.
JRUBY-2907     method_missing invocation paths end up boxing arguments twice, among other inefficiencies
JRUBY-2910     Object#send is not specific-arity
JRUBY-2915     Exception construction performance is poor
JRUBY-2918     jruby 1.1.3 + activescaffold 1.1.1 generating RJS error
JRUBY-2919     Time.-(Time) does not include microseconds and is off by 10
JRUBY-2923     Eliminate (unknown) from trace elements
JRUBY-2924     JMX support added leaks memory like crazy
JRUBY-2927     Calling interface method on specific instance from Java doesn’t work.
JRUBY-2928     Same issue with hashCode and toString for Interfaces.
JRUBY-2929     Java Integration with regards to arrays of classes are broken
JRUBY-2931     Templater error causes merb-gen (0.9.4 and trunk) to fail on JRuby 1.1.3
JRUBY-2932     Move static soft reference timezone cache to be runtime-specific cache to remove complexity of dealing with soft references
JRUBY-2938     Calling JavaUtil.convertJavaToUsableRubyObject throws AssertionError
JRUBY-2943     Memory leak in closure coercion
JRUBY-2944     Java caller gets null when calling a method on a Ruby object implementing an interface method declared to return Object[], and the Ruby object returns an array of Ruby objects subclass of Hash converted with #to_java
JRUBY-2946     New invokers attempt to access argument list of non-overloaded methods with incorrect arity
JRUBY-2947     Multidimensional array conversion broke in recent Java integration refactoring

Not too bad, is it? Now go find out more at http://www.jruby.org, and download at http://dist.codehaus.org/jruby/



Ruby Security quick guide


I’ve looked around a bit and it seems that there is no really good guide to security programming in Ruby. Neither is there any book available (although Maik Schmidt’s book Enterprise Recipes with Ruby and Rails will be the best reference once it arrives). The aim for this blog entry will be to note a few things you often would like to do, and how you can do them with Ruby. The focus will be mostly on the cryptographic APIs for Ruby, which don’t have much documentation either. In fact, the best documentation for this is probably the OpenSSL documentation, once you learn how to map it to the Ruby libraries.

You want to avoid clear text passwords in your database

One of the very good properties of handling passwords, is that you usually don’t need to actually know what they are. The only thing you need to be able to do is to reset them if someone forgets their password, and verify the correct combination of username and password. In fact, many practitioners feel better if they don’t need to have the responsibility of knowing the passwords of every user on their system, and conversely I feel much better knowing that my password is secure from even the administrators of the system. Not that I ever use the same password to two different systems, but someone else might… =)

So how do you solve this easily? Well, the easiest way – and also the most common way – is to use a digest. A digest is a mathematical function that takes as input a series of numbers of any length and returns a large number that represents the input text. There are three properties of digests that make them useful: the first is that a small change in the input text generates a large effect in the outcome data, meaning that olabini and olbbini will have quite different digests.

The second is digests generally have a good distribution and small risk of collisions – meaning that it’s extremely unlikely that two different texts have the same output for a specific digest algorithm. The third property is that they are fundamentally one way mathematical operations – it’s very easy to go from plain text to digest, but extremely hard to go in the other direction.
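
The first property is easy to see for yourself. A small sketch, using the two strings from above and counting how many positions of the hex output differ:

```ruby
require 'openssl'

a = OpenSSL::Digest::SHA1.hexdigest("olabini")
b = OpenSSL::Digest::SHA1.hexdigest("olbbini")

puts a
puts b

# Count the hex positions where the two digests differ.
differing = a.chars.zip(b.chars).count { |x, y| x != y }
puts "#{differing} of #{a.length} hex characters differ"
```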

There are currently several algorithms used for digests. MD5 and SHA-1 are by far the most common ones. If you can avoid MD5, do so, because there have recently been some successful attacks against it. The same might happen with SHA-1 at some point, but right now it’s the best algorithm for these purposes. Oh, and avoid doing a double digest – digesting the output of a digest algorithm – since this generally makes the plain text easier to crack, not harder.

Let’s see some code:

require 'openssl'

digest = OpenSSL::Digest::SHA1.hexdigest("My super secret pass")
p digest # => "dd5f30310682e5b41e122c637e8542b1b39466cf"

digest = OpenSSL::Digest::SHA1.hexdigest("my super secret pass")
p digest # => "f923786cc72ed61ae31325b6e8e285e6c35e6519"

d = OpenSSL::Digest::SHA1.new
d << "first part of pass phrase"
d << "second part of pass phrase"
p d.hexdigest # => "f13d7bdee0634c017babb8c72dcebe18f9e0598e"

I have used two different variations here. The first one, calling a method directly on the SHA1 class, is useful when you have a small string that you want to digest immediately. It’s also useful when you won’t need different algorithms or need to send the digest object around. The second method allows you to update the string data to digest several times and then finally generate the end digest from that. I have taken the liberty of using the methods called hexdigest for both cases – this is because they are more readable. If you replace hexdigest with digest, you will get back a Ruby string that contains raw bytes with values from 0 to 255 instead, which means it doesn’t print well. But if you were to compare the results of hexdigest and digest, you would see that they return the same data, just in different formats.
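
A quick sketch of that last point: hex-encoding the raw output of digest gives exactly the hexdigest string.

```ruby
require 'openssl'

raw = OpenSSL::Digest::SHA1.digest("My super secret pass")
hex = OpenSSL::Digest::SHA1.hexdigest("My super secret pass")

p raw.bytesize  # => 20
p hex.length    # => 40

# Hex-encoding the raw bytes yields the same string as hexdigest.
p raw.unpack("H*").first == hex  # => true
```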

So as you can see, this is extremely easy to incorporate in your code. To verify a password you just make a digest of the password the person trying to authenticate sent in to you, and compare that to what’s in your database. Of course, the approach is generalizable to other cases when you want to protect data in such a way that you can’t recover the data itself.

Finally, be careful with this approach. You should generally combine the password with something else to get good security. There are attacks based on something called rainbow tables that make it easy to find the password from a digested password. This can be avoided by using a salt or another kind of secret data added to the password.
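
A minimal sketch of the salt idea, assuming you store the salt next to the digest in your database. The helper names digest_password and password_matches? are made up for this example, and the 16 byte salt length is just an illustrative choice:

```ruby
require 'openssl'
require 'securerandom'

# Hypothetical helper: returns the salt and the salted digest to store.
def digest_password(password, salt = SecureRandom.hex(16))
  [salt, OpenSSL::Digest::SHA1.hexdigest(salt + password)]
end

# Hypothetical helper: recompute with the stored salt and compare.
def password_matches?(password, salt, stored_digest)
  OpenSSL::Digest::SHA1.hexdigest(salt + password) == stored_digest
end

salt1, stored1 = digest_password("My super secret pass")
_salt2, stored2 = digest_password("My super secret pass")

# Same password, different salts, so a precomputed table for one is useless for the other.
p stored1 == stored2
p password_matches?("My super secret pass", salt1, stored1)  # => true
p password_matches?("my super secret pass", salt1, stored1)  # => false
```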

You want to communicate with someone securely

When you want to send messages back and forth between actors, you generally need a way to turn them unreadable and then turn them back into something readable again. The operation you need for this is called a cipher, but there are many kinds of ciphers, and not all of them are right. For example, rot-13 can be considered a cipher, but there is no real security inherent in it.

In general, what you want for communication between two parties is a symmetric cipher using a secret key. A symmetric cipher means that you can use the same key to “lock” and “unlock” a message – or encrypt and decrypt it. There are other kinds of ciphers that are very useful, which we’ll see in the next example, but symmetric ciphers are the most commonly used ciphers since they are quite easy to use and also very efficient. For a symmetric cipher to be completely safe, the key should always be as long as the data that should be encrypted. Generally this doesn’t happen, since key distribution becomes a nightmare, but the current approaches are reasonably secure against most kinds of cracking. Coupled with asymmetric ciphers, they become extremely useful.

There are three things you need to use a symmetric cipher. The first one is the algorithm. There are loads of different kinds of symmetric ciphers around. Which kind you will use doesn’t matter that much as long as you choose something that is reasonably secure. One of the more widely used algorithms is called DES. In its original form DES should definitely not be used (since the key length is only 56 bits, it’s actually not that hard to crack). There is another form of DES called Triple DES, or DES3, which effectively gives you more security. DES3 might work in some circumstances, but I would recommend AES in almost all cases. AES comes in three varieties called AES-128, AES-192 and AES-256. The difference between them is the key length. As you might guess, AES-128 needs a 128 bit long key, AES-192 a 192 bit long and AES-256 a 256 bit long. These key lengths all give reasonable security. The more security you need, the longer the key you can choose. The tradeoff is that the longer the key is, the more time it will take to encrypt and decrypt messages.

Once you have an algorithm, you need a key. This can come in two varieties – either you get a key from a human, in which case the key is generally a password, or you generate a key automatically. The latter should be done with a secure random number generator – NOT with rand().
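
Both varieties can be sketched in a few lines. The random key comes from OpenSSL’s secure random generator. For the password case I am assuming PBKDF2 as the way to stretch a password into key bytes; the salt length and the iteration count are illustrative values, not recommendations:

```ruby
require 'openssl'

# A machine-generated 128 bit key from a secure random number generator.
random_key = OpenSSL::Random.random_bytes(16)

# A 128 bit key derived from a human password (assumed approach: PBKDF2).
password    = "My super secret pass"
salt        = OpenSSL::Random.random_bytes(8)
derived_key = OpenSSL::PKCS5.pbkdf2_hmac_sha1(password, salt, 20_000, 16)

p random_key.bytesize   # => 16
p derived_key.bytesize  # => 16
```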

Depending on which algorithm and mode you choose, you might also need something called an IV (initialization vector). Block ciphers work on a small number of bytes at a time, and chaining modes such as cipher block chaining (CBC) require an IV. In the case of AES-128, a block of 16 bytes is generated on every cycle of the algorithm. These 16 bytes will be based on the algorithm, the key, and the 16 bytes generated the last time. The problem is that the first time there were no bytes generated, which means these will have to be initialized another way and this is where the IV comes in. It’s just the first block that will be used for generating the first real block of data. The IV does not need to be secured and can be sent in the clear. In Ruby, if you don’t provide an IV when initializing your cipher, the default will be a part of the string “OpenSSL for Ruby rulez!”. Depending on what length the IV should have, a substring will be used.
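
To see what the IV buys you, here is a small sketch that encrypts the same text twice with the same key but different random IVs. The helper encrypt_with is made up for the example:

```ruby
require 'openssl'

# Hypothetical helper: AES-128-CBC encryption with an explicit key and IV.
def encrypt_with(key, iv, text)
  cipher = OpenSSL::Cipher.new("aes-128-cbc")
  cipher.encrypt
  cipher.key = key
  cipher.iv  = iv
  cipher.update(text) + cipher.final
end

key = OpenSSL::Random.random_bytes(16)
iv1 = OpenSSL::Random.random_bytes(16)
iv2 = OpenSSL::Random.random_bytes(16)

first  = encrypt_with(key, iv1, "One and two")
second = encrypt_with(key, iv2, "One and two")

# Same key and same plain text, but the cipher text differs because the IV differs.
p first == second
```

Without a fresh IV, identical messages would always produce identical cipher text, which tells an eavesdropper more than you want.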

So, let’s take a look at some code. This code will just encrypt a message with a password and then decrypt it again:

require 'openssl'

cipher = OpenSSL::Cipher::AES128.new("CBC")
cipher.encrypt
cipher.key = "A key that is 16"
cipher.iv =  "An IV that is 16"

output = ""

output << cipher.update("One")
output << cipher.update(" and two")
output << cipher.final

p output # => "\023D\aL\375\314\277\264\256\245\225\a\360|\372+"

cipher = OpenSSL::Cipher::AES128.new("CBC")
cipher.decrypt
cipher.key = "A key that is 16"
cipher.iv =  "An IV that is 16"

output2 = ""

output2 << cipher.update(output)
output2 << cipher.final

p output2 # => "One and two"

We first create a cipher instance, giving it CBC as the type. This is the default but will generate a warning if you don’t supply it. We tell the cipher to encrypt, then initialize it with key= and iv=. The key is 16 bytes because it’s a 128 bit cipher, and the IV is 16 bytes because that’s the block size of AES. Then we call update on the data we would like to encrypt. We need to save the return value from the update call, since that’s part of the generated cipher text. This means that even encrypting really large texts can be very efficient, since you can do it in smaller pieces. Finally you need to call final to get the last generated cipher text. The only difference when decrypting is that we call decrypt on the object instead of encrypt. After that the same update and final calls are made. (Am I the only one who thinks that encrypt and decrypt should be called encrypt! and decrypt!?)

That’s how easy it is to work with ciphers in Ruby. There are some complications with regards to padding, but you generally don’t need to concern yourself with that in most applications.

You want nodes to be able to communicate with each other securely without the headache of managing loads of secret keys

OK, cool, symmetric ciphers are nice. But they have one weakness, which is the secret key. The problem shows up if you want to have a network of computers talking to each other securely. Of course, you can have a single secret key for all of them, which means that all nodes in the network can read messages to other parties. Or you can have a different secret key for each pair of nodes. That’s ok if you have 3 nodes (when you will just need 3 secret keys). But if you have a network with 100 nodes that all need to communicate with each other you will need 4950 keys. It will be hard to distribute these keys since they need to be kept secure, and generally just managing it all will be painful.
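
The numbers come from counting pairs: n nodes need n * (n - 1) / 2 secret keys. A tiny sketch, with pairwise_keys as a made-up helper name:

```ruby
# Number of secret keys needed when every pair of nodes shares its own key.
def pairwise_keys(nodes)
  nodes * (nodes - 1) / 2
end

p pairwise_keys(3)    # => 3
p pairwise_keys(100)  # => 4950

# The same count, by actually enumerating every pair of nodes.
p (1..100).to_a.combination(2).count  # => 4950
```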

The solution to this problem is called asymmetric ciphers. The most common form of this is public key cryptography. The idea is that you need one key to encrypt something, and another key to decrypt it. If you have key1 and encrypt something with it, you can’t decrypt that something with key1. This curious property is extremely useful. In public key cryptography you generate two keys, and then you publish one of them widely. Since the public key can’t be used to decrypt content, there is no security risk in not keeping it secret. The private key should obviously be kept private, since you can generate the public key from it. What does this mean in practice? Well, that in your 100 node network, each node can have a key pair. When node3 wants to send a message to node42, node3 will ask around for the public key of node42, encrypt its message with that and send it to node42. Finally, node42 will decrypt the message using its private key.

There is one really large downside with these ciphers. Namely, they are quite expensive operations, even compared to other ciphers. So the way you generally solve this is to generate a random key for a symmetric cipher, encrypt the message with that key and then encrypt the actual key using the asymmetric cipher. This is how https works, it’s how SSH works, it’s how SMIME and PGP work. It’s generally a good way of matching the strengths and weaknesses of ciphers with each other.
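
A minimal sketch of that combination, under some assumptions of mine: a 2048 bit RSA key pair generated on the spot, AES-128-CBC as the symmetric cipher, and both sides living in the same script. Real protocols add a lot more on top of this:

```ruby
require 'openssl'

# The recipient owns the key pair and hands out only the public half.
recipient  = OpenSSL::PKey::RSA.generate(2048)
public_key = recipient.public_key

# Sender: encrypt the message with a fresh random symmetric key...
cipher = OpenSSL::Cipher.new("aes-128-cbc")
cipher.encrypt
session_key = cipher.random_key
iv          = cipher.random_iv
encrypted_message = cipher.update("This is a secret message. WOHO") + cipher.final

# ...and encrypt only that small key with the expensive asymmetric cipher.
encrypted_key = public_key.public_encrypt(session_key)

# Recipient: recover the symmetric key, then decrypt the message with it.
decipher = OpenSSL::Cipher.new("aes-128-cbc")
decipher.decrypt
decipher.key = recipient.private_decrypt(encrypted_key)
decipher.iv  = iv
decrypted = decipher.update(encrypted_message) + decipher.final

p decrypted  # => "This is a secret message. WOHO"
```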

So how do you use an asymmetric cipher? First you need to generate the keys, and then you can use them. To generate the keys you can use something like this:

require 'openssl'

key = OpenSSL::PKey::RSA.generate(1024)
puts key.to_pem
puts key.public_key.to_pem

This will give you the output in the form of one private key and one public key in PEM format. Once you have this saved away somewhere, you can start distributing the public key. As the name says, it’s public so there is no problem with distributing it. Say that someone has the public key and wants to send you a message. If the PEM-encoded public key is in a file called public_key.pem, this code will generate a message that can only be decrypted with the private key, and then read the private key from private_key.pem and decrypt the message again.

require 'openssl'

key = OpenSSL::PKey::RSA.new(File.read("public_key.pem"))
res = key.public_encrypt("This is a secret message. WOHO")
p res

key2 = OpenSSL::PKey::RSA.new(File.read("private_key.pem"))
res2 = key2.private_decrypt(res)
p res2

Note that the methods we use are called public_encrypt and private_decrypt. Since the public and private prefix is there, there has to be a reason for it. And there is, as you’ll see soon.
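
The example above expects public_key.pem and private_key.pem to exist on disk. One way to produce them is sketched here; it writes into a temporary directory so that running it leaves nothing behind, and the 2048 bit key size and the file permissions are my own choices:

```ruby
require 'openssl'
require 'tmpdir'

key = OpenSSL::PKey::RSA.generate(2048)

results = Dir.mktmpdir do |dir|
  private_path = File.join(dir, "private_key.pem")
  public_path  = File.join(dir, "public_key.pem")

  # Save both halves; only the owner should be able to read the private key.
  File.write(private_path, key.to_pem)
  File.chmod(0600, private_path)
  File.write(public_path, key.public_key.to_pem)

  # Read them back the same way the example above does.
  loaded_private = OpenSSL::PKey::RSA.new(File.read(private_path))
  loaded_public  = OpenSSL::PKey::RSA.new(File.read(public_path))

  [loaded_private.private?, loaded_public.private?]
end

p results  # => [true, false]
```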

You want to prove that it was you and only you who wrote a message

Being able to send things in private is all good and well, but if you distribute your public key widely you may never know who you get messages from. Or rather, you know that you get a message from their addresses (if you’re using mail) – but you have no way of ensuring that the other person is actually who they say they are. There is a way around this problem too, using asymmetric ciphers. Interestingly, if you encrypt something using your private key, anyone with your public key can decrypt it. That doesn’t sound very smart from a security perspective, but it’s actually quite useful. Since you are the only one with your private key, if someone can use your public key to decrypt it, then you have to be the one who wrote the message. This has two important consequences. First, you can always trust that the person who wrote you a message was actually the one writing it. And second, that person can never retract a message after it’s been written, since it’s been signed.

So how do you do this in practice? It’s called cryptographic signatures, and OpenSSL supports it quite easily. If you have the keys we created earlier, you can do something like this using the low level operations available on the keys:

require 'openssl'

pub_key = OpenSSL::PKey::RSA.new(File.read("public_key.pem"))
priv_key = OpenSSL::PKey::RSA.new(File.read("private_key.pem"))

text = "This is the text I want to send"
signature = priv_key.private_encrypt(text)

if pub_key.public_decrypt(signature) == text
  puts "Signature match"
else
  puts "Signature didn't match!"
end

There happens to be a slight problem with this code. It works, but only for short texts: a raw RSA operation can only handle input that is smaller than the key itself, so a longer message would raise an error. You also end up sending the text twice, once in the clear and once encrypted. The way this is solved in basically all cases is to first create a digest of the text and then sign that. The code to do that would look like this:

require 'openssl'

pub_key = OpenSSL::PKey::RSA.new(File.read("public_key.pem"))
priv_key = OpenSSL::PKey::RSA.new(File.read("private_key.pem"))

text = "This is the text I want to send"*200

signature = priv_key.sign(OpenSSL::Digest::SHA1.new, text)

if pub_key.verify(OpenSSL::Digest::SHA1.new, signature, text)
  puts "Signature verified"
else
  puts "Signature NOT verified"
end

As you can see we use the sign and verify methods on the keys. We also have to send in the digest to use.
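To make the size point concrete, here is a small sketch. It assumes a throwaway 2048-bit key generated in memory instead of the PEM files used above, and it uses SHA256 as the digest; both are choices made for illustration only. The signature should come out the same size no matter how long the text is, since only the digest gets signed.

```ruby
require 'openssl'

# Throwaway key generated in memory, for illustration only
key = OpenSSL::PKey::RSA.new(2048)

short_text = "This is the text I want to send"
long_text  = short_text * 200

short_sig = key.sign(OpenSSL::Digest.new('SHA256'), short_text)
long_sig  = key.sign(OpenSSL::Digest.new('SHA256'), long_text)

# An RSA signature is as long as the key modulus: 2048 bits = 256 bytes
puts short_sig.bytesize
puts long_sig.bytesize
```

Both lines should print the same number, even though the second text is 200 times longer than the first.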

You want to ensure that a message doesn’t get modified in transit

Another use of signatures, one that we actually get for free, is that there is no way to tamper with the message in transit without it being noticed. So even if you send everything in clear text, if the signature verification succeeds you know that the message can’t have been tampered with. The reason is simply this: if the text was tampered with, the digest that gets generated would be different, meaning that the signature would not be equal to what’s expected. And if someone tampered with the signature, it wouldn’t match either. Theoretically you can tamper with both the text and the signature, but then you’d first have to crack the private key used, since otherwise you would never be able to generate a correct signature. All of this comes for free with the earlier code example.
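A quick way to convince yourself of this is to tamper with a signed message and see what verify says. The sketch below again assumes a throwaway in-memory key and SHA256, and the message text is made up for the example.

```ruby
require 'openssl'

# Throwaway key pair generated in memory, for illustration only
priv_key = OpenSSL::PKey::RSA.new(2048)
pub_key  = priv_key.public_key

text = "Pay Judy 10 dollars"
signature = priv_key.sign(OpenSSL::Digest.new('SHA256'), text)

# Untouched message: verification should succeed
original_ok = pub_key.verify(OpenSSL::Digest.new('SHA256'), signature, text)

# Text changed in transit: the digest no longer matches the signature
tampered_text = "Pay Judy 1000 dollars"
text_ok = pub_key.verify(OpenSSL::Digest.new('SHA256'), signature, tampered_text)

# Signature changed in transit: flip one bit in the last byte
tampered_sig = signature.dup
last = tampered_sig.bytesize - 1
tampered_sig.setbyte(last, tampered_sig.getbyte(last) ^ 0x01)
sig_ok = pub_key.verify(OpenSSL::Digest.new('SHA256'), tampered_sig, text)

p original_ok
p text_ok
p sig_ok
```

Only the first check should come back true. Changing either the text or the signature makes verification fail.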

You want to ensure that you can trust someone’s public key

Another thing that can get troublesome with public and private keys is the distribution problem. When you have lots of agents you need to make sure that the public key you get from someone is actually their public key and that they are who they say they are. In real life this is a social problem as much as a technological one. But the way you generally make sure that someone’s public key is actually theirs is that you use something called a certificate. A certificate is more or less a signature, where the signer is someone you trust, and the text that is signed is the actual public key in question. If you trust someone that issues certificates, and that someone has issued a certificate saying that a public key belongs to a specific entity, you can trust that public key and use it. What’s more interesting is that certificates can also be signed, which means that you can trust someone, that someone can sign another certificate authority, which signs another certificate authority, which finally signs a public key. If you trust the root CA (certificate authority) you should also be able to trust all the certificates and public keys in the chain. This is how the infrastructure surrounding https works. You have a list of implicitly trusted certificates in your browser, and if an https site can’t be verified with this trust store, you will get one of those popup boxes asking if you really want to trust this site or not.

I will not show any code for this, since X509 certificates are actually quite well documented for Ruby. They are also a solution that takes some more knowledge to handle correctly, so a real book would be recommended.

You want to use https correctly

The protocol https is very useful, since it allows you to apply the whole certificate and asymmetric cipher business to your http communication. There is a library called net/https which works really well if you use it right. The problem is that I see lots and lots of example code that really doesn’t use it correctly. The main issue is the verification level. If you have the constant VERIFY_NONE anywhere in your code, you probably aren’t as secure as you’d like to think. The real problem is that I’ve never been able to get Ruby to verify a certificate with VERIFY_PEER. No matter what I do in the way of adding stores and setting ca_file and ca_path, I haven’t ever been able to make this work. It’s quite strange and I can’t seem to find much mention of it online, since everyone just uses VERIFY_NONE. That’s not acceptable to me, since if no verification happens, there is no way to know if the other party is the right web site.
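For reference, the kind of configuration being discussed usually looks something like the sketch below. The host name is a placeholder, and loading the system default trust store with set_default_paths is an assumption. Whether verification then actually succeeds depends on the trust store available on your machine. No request is made here; the snippet only sets up the connection object.

```ruby
require 'net/http'
require 'openssl'

# Placeholder host, used only for illustration
http = Net::HTTP.new("www.example.com", 443)
http.use_ssl = true

# Ask for the server certificate to be checked against trusted CAs
http.verify_mode = OpenSSL::SSL::VERIFY_PEER

# Assumption: the system's default CA certificates are the ones to trust
store = OpenSSL::X509::Store.new
store.set_default_paths
http.cert_store = store
```

A request made through this object should raise an OpenSSL::SSL::SSLError if the certificate can’t be verified, instead of silently continuing the way VERIFY_NONE does.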

This is one area where JRuby’s OpenSSL seems to work better, in fact. It verifies those sites that should be verified and not the others.

If anyone feels like illuminating this for me, I would be grateful – the state of it right now is unacceptable if you need to do really secure https connections.

Keep security in mind

My final advice is twofold and has nothing to do with Ruby. First, always keep security in mind. Always think about how someone externally can exploit what you’re doing, intercept your connections, or just listen to them. Programmers generally need to be much more aware of this.

The last thing I’ll say is this: start reading Crypto-Gram by Bruce Schneier. While not completely technical anymore, it talks about lots of things surrounding security and also gives a very good feeling for how you should think about and approach security. Subscribe here.