Separate blog about Privacy, Anonymity and Cryptography


It’s been a long while. But I have been writing a little bit during 2014 as well. I decided to switch venue a bit, once my writing became almost exclusively about privacy, anonymity and cryptography. Since my day-to-day job has been research in these areas for the last 2 years, it has become natural to write about what I think on these subjects.

So, if you are interested in following these kinds of thoughts, please follow along at https://reap.ec.



Update


It’s almost a year since my last blog post. I’ve had a busy and interesting year, and wanted to take the time to talk about a few different things. One of the reasons I haven’t blogged as much as I wanted to is that I’ve been working on several things that aren’t exactly public at the moment. Moving to Ecuador has also involved a whole bunch of work and led to less time.

Computer and cloud

Some of my last blog posts had to do with me wanting to change what hardware and cloud providers I used. I went through a few different laptops until I settled on a Thinkpad X1 Carbon (2nd gen). It’s got the right weight-to-power ratio, a larger screen and a nice keyboard, and it works quite well with Linux. I ended up moving to Fedora for my main operating system, which has served me pretty well.

When it comes to password managers, I’ve ended up with KeePass2 for Linux. It works satisfactorily even though it was a bit of a pain to get running, what with Mono dependencies etc.

I have also been hosting my own email infrastructure for the last year or so. It’s worked out quite well.

For hosting my server infrastructure I ended up with a Swedish provider called moln.is – I am now happily far away from Rackspace and the other American hosting providers.

Programming languages

The last few years haven’t really seen much change in the programming language scene. There are new interesting experiments coming out now and again, while Clojure and Scala are gaining more ground. The biggest difference I’ve seen lately is that Go is moving towards becoming a really useful tool, especially for scalable systems where the low-level nature of Go works well, but the security properties of the language are also a plus. Many of the security-oriented systems I’ve seen in the last year are moving towards Go. And of course, being able to create a binary that runs anywhere is extremely powerful.

I have a new language rattling around in the back of my brain. I kinda wish I had 6 months to sit down and get it out. I think it could be quite nice. However, at the moment there doesn’t seem to be that much time for language experiments.

Security

Most of my work over the last 18 months has been focused on research around anonymity, privacy, security and cryptography. It’s an extremely interesting place to be, and ThoughtWorks is actively working on several different angles in this space. The security field is seeing a lot of sea change after the Snowden revelations. Lots of snake oil coming out, but also lots of interesting new approaches.

One of the things that worries me about all this is the continued focus on using browsers to implement security-aware applications. In my mind this is a dangerous trend – browsers are the largest attack surface out there, and you are also forced to use JavaScript. And JavaScript is not a language well suited for strong security.

E-mail is pretty high up on the list for a lot of people and organizations to fix in one way or another. This is also a space where we are working to create something. We will open source what we have quite soon – it’s something pragmatic that we think can make real change. I’m excited about it.

Meanwhile, here in Ecuador I’m doing research and thinking about the problems with transport security, specifically DNS and TLS. Hopefully I’ll have something more substantial to write about that soon.

Conferences?

Some might wonder where I’ve disappeared to, since I haven’t been to many of the conferences I used to go to. I’ve missed the JVM Language Summit, QCon in many cities, StrangeLoop and others. I’m very sad about all that, but I haven’t been able to make it. However, if you happen to be at Goto Aarhus, do say hi.



Technical Details from Snowden


This summer has given confirmation to many things that technologists only guessed at before. We know much more about what the NSA, GCHQ and other intelligence services are doing around the world, and how they are subverting privacy and security in the name of fighting terrorism. All of this is primarily thanks to Edward Snowden, Laura Poitras and Glenn Greenwald – with the help of many other courageous people. For the technically inclined, last week’s revelations about how the NSA is pursuing a broad program to subvert all kinds of encryption were probably among the most worrying releases. But right now we’re also seeing a strong backlash against Greenwald, with claims that he should be releasing the names of the technologies broken, the companies involved and who specifically is complicit in all this. A lot of people are ascribing malicious intentions to Greenwald for keeping these things to himself. I would just like to add two things to the debate:

First, it is highly likely that Snowden did not in fact have access to what specific technologies were broken. It might not exist in the papers he gave to Greenwald and others. As far as we know, Snowden was not cleared for BULLRUN and related programs, and the fact that we know about them is because he managed to get access to protected documents he wasn’t supposed to be able to access. So I think it’s only fair to give Greenwald the benefit of the doubt – he might not be able to tell us the specific algorithms that are broken. Let’s not immediately jump to the conclusion that he is acting maliciously.

When it comes to what companies and people are complicit in these issues, in the short term it would be very useful for us to know. I suspect there are good reasons why this information hasn’t been released yet – but let’s not forget that many companies have been outed as cooperating in one way or another under the PRISM program.

The big problem is this – for us technologists to stop future BULLRUN programs from happening, we need to build new organizational structures. We need to guard ourselves from compromised algorithms and hardware chips with backdoors. In order to do that we need to change how we do these things – and this will require long-term cultural fixes. And even though it would be very satisfying in the short term to know what companies and people to be angry at, in the long run we need to build up an immune system that stops this from happening again.

This all said – I’m dying to know all these details myself. I think it’s pretty human. But let us not lose sight of the real battle.



How do you safely communicate over email?


Communicating safely over email is actually pretty complicated. I wanted to walk through the steps necessary in order to create a complete email identity online that should be reasonably safe against interception, network analysis and impersonation.

Before we begin, you need to have the Tor Browser Bundle installed. You also need to make sure that you never do anything related to your email account without having the connection going over Tor.

One important aspect of this is the ability to find a good email provider where you don’t have to supply real personal information. If you ever have to supply your real information, you will always have to trust that the email provider does the right thing. The one thing you can never get away from is that network analysis can happen on your email if the provider can’t be trusted. If this happens, your only recourse is to be sure that the people you are talking to are using the same techniques, and that you are using several of these accounts for various activities.

The first step is to find a provider that matches your needs. For this I’m going to use RiseUp.net. I could also use Hushmail or other services, although none of these are completely safe. I will first generate the email address/username I want. In order to do this, you need a mechanism for generating randomness. I will use 1Password for this, and generate a completely random string. An alternative is to go to one of the random name generators available online (go there using Tor) and generate a name that way. Once you have a random name and a long, really random password, you can go ahead and start registering for an account.

When signing up, use Tor for all the connections and make sure not to give any of the extra information asked for (such as time zone or country). Once you have completely signed up, use Tor again and sign in to the web client to make sure everything works as it should.

The next step is to create a private key specifically for this email account. I will use the command line to do this, using gpg. Before you create this key, you should also create a new pass phrase for yourself. Use the XKCD Battery Staple method, with about 5-6 words. However, be very careful to choose these words randomly. If you don’t use a random (really random) process, you lose all the benefits of having a pass phrase and it becomes a very weak password. So, once you have the pass phrase, you can use it to create a new private key:

gpg --gen-key

The choices I will make are these: RSA and RSA for the kind of key. A keysize of 4096, and a validity of 2 years. I will specify the username for the email address as the name for the key. Finally you will be asked for the pass phrase, so enter it now. Make sure to never push this key to a keyserver using the gpg program.

Once you have created the key, you should use the Tor browser to add it to the keyservers. First export the public key into a file. Make sure to not export the private part of the key. Once you have Tor up and running you can go to http://sks-keyservers.net/i and submit it there.

In order to use this account you should probably use Thunderbird and TorBirdy. If you have non-anonymous accounts you need to figure out how to run multiple Thunderbird instances, since TorBirdy takes over the full installation. You need a new profile, and you should install Enigmail and TorBirdy once you have Thunderbird installed. Then you can go ahead and configure the mail account. It is important to install TorBirdy before you configure the mail account. Once you’ve configured the mail account, it’s a good idea to make sure Enigmail will encrypt and sign emails by default.

You are now ready to send safe and anonymous email. There are a few different things to keep in mind for this. First, make sure to never access this account over a regular connection. Second, never download keys automatically from the keyserver – instead, always download and import them manually. Finally, never send an email in the clear. Always encrypt it using the right key. If you ever send an email like this in clear text over the same connection, you have lost most of the potential security of the account.

In order for this to work you should give your account information and the fingerprint of the public key in person to the people who should have it.

Finally, even all of these things together cannot guarantee safety.

Comments and corrections to this writeup are very welcome.



Passwords Are Terrible


I’ve been going through a rash of password resets and changes the last few days, and as such things always do, it set me thinking. If I’m lucky, most of this won’t really be much of a surprise for you. It certainly won’t contribute anything significant to the security world.

My thought is simply that passwords are terrible. I know, I know – not an original thought. Basically, passwords might have been the right trade-off between convenience and security a while back. I’m unsure when that stopped being the case, but I’m pretty sure it was more than a few years ago. However, we are still using passwords, and there are a lot of things we do that don’t necessarily make our use of them any better. Just as black hats use social engineering to crack systems, security experts should use social engineering and experience design to entice people to do the safest possible thing under the circumstances. Sadly, we are doing the exact opposite right now.

Let’s take a look at a laundry list of what you should be doing and what you are doing:

  • You should never use the same password in more than one place. REALITY: people basically always reuse passwords or password variants on different services. The proliferation of places that require passwords and logins means we either have the choice of having more than 50 passwords, or reusing them. But if you reuse passwords, then within each set of services that share a password, the service with the most sensitive material is only as protected as the least secure service in that set. So as long as you use the same password, you have to think about all the services you’ve used that password on to have a realistic idea of how protected that password is.
  • You should never use words, numbers from your life or names from your life in your password – scrambled or not. REALITY: basically everyone does one of these things – most wifi-network passwords I know are combinations of the names of the children in the family. In order to remember passwords, we usually base them on existing words and then scramble them, with a few letters switched out for digits or a few symbols added. This practice is basically useless, unless your password is actually a pass phrase. And if your password is in reality a pass phrase, you don’t gain much by scrambling the words. So for a really secure password, use a pass phrase, but make sure the words in the phrase are randomly selected (see the sketch after this list).
  • Security policies usually require you to change passwords every 2 or 3 months. REALITY: this means you are training people to choose insecure passwords. If you have to change passwords often you have a few choices – you can write them down or you can use variations on the same password. Note that remembering a new strong password every 2 months is not an option – people will simply not do it. Most people I know use a sequence of numbers added to a base password, and they change these numbers every time they are forced to change the password. All of these things come together to defeat the purpose of the security policy. It is security theatre, pure and simple. If your company has a policy that requires you to change passwords like this, it is basically guaranteed to be a company with no real security.
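
For what it’s worth, here is a minimal Ruby sketch of how a randomly selected pass phrase could be generated. The word list file is just a stand-in for illustration – in practice you would use a large published list (for example a Diceware list):

require 'securerandom'

# Stand-in word list file – assumed to contain one word per line.
# Use a large, well-known list (several thousand words) in practice.
words = File.readlines("wordlist.txt").map(&:strip)

# Pick six words uniformly at random using a cryptographically secure RNG,
# never rand() and never words picked "out of your head".
pass_phrase = Array.new(6) { words[SecureRandom.random_number(words.length)] }.join(" ")
puts pass_phrase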

What is the solution? I have decided to base my life around 1Password. For most purposes, the combination of the password generator, the browser integration, and the syncing between different devices means that it’s mostly hassle-free to have really strong passwords everywhere I want them. I think 1Password is really good at what it does, but it’s still a stop-gap measure. The basic idea of using passwords for authentication is an idea that should be relegated to history. We need a better alternative. We need something that is more secure and less brittle than passwords. But we also need something that is more convenient than two-factor authentication. For most of us who log in to services all the time, two-factor is just too slow – unless we get to a point where we have a few central authentication providers with roaming authentication.

Is there a better alternative? HTTPS has included support for TLS client certificates for a long time now, and in theory they provide all the things we would want. In practice, they turn out to be inconvenient for people to use, and expiration and other aspects of certificates complicate and frustrate things.

I guess what I would want is a few different things. The first would be to simply make it possible to have my browser automatically sign a challenge and send it back, instead of putting in a password in the login box. That would require a little support from the browser, but could potentially be as easy to use as passwords.
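
To make the idea concrete, a challenge-response login along those lines could look roughly like the following Ruby sketch. This is purely conceptual – key storage, challenge transport and replay protection are all hand-waved away here:

require 'openssl'

# Conceptual sketch only. At registration time the user generates a key pair
# and the service stores the public key instead of a password hash.
user_key = OpenSSL::PKey::RSA.generate(2048)
registered_public_key = user_key.public_key

# At login the server sends a random challenge...
challenge = OpenSSL::Random.random_bytes(32)

# ...the browser signs it with the user's private key...
signature = user_key.sign(OpenSSL::Digest::SHA256.new, challenge)

# ...and the server verifies the signature against the stored public key.
if registered_public_key.verify(OpenSSL::Digest::SHA256.new, signature, challenge)
  puts "Logged in"
else
  puts "Login failed"
end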

Another thing that could make this possible is if 1Password had support for private keys as well as passwords. That would mean syncing between devices would become substantially easier. Someone would have to think up a simple protocol for making it possible to use this system instead of passwords on a web page. This is a bit of a catch-22, since you need support from the browser or an extension for it to be worth putting into your service. I kinda wish Google had done something like this as the default for Google Accounts, instead of going all the way to two-factor.

In summary, I think we are ready for something better than passwords. I would love if we could come together and figure out something with better usability and security than passwords so we can finally get rid of this scourge.



RSA parameters in OpenSSL, Ruby and Java


I would just like to publish this information somewhere, so that Google can help people find it more easily than I did. If you have ever wondered how the internal OpenSSL RSA parameters map to the Java parameters on RSAPrivateCrtKey, the little table below will probably help you a bit (and there is a small Ruby snippet after it showing the same mapping in code). There are three different names in play here. The first one is the internal field names in OpenSSL. These are also used as method names in Ruby. The second name is what gets presented when you use something like to_text on an RSA key. The third name is what it’s called in Java.

  • n == modulus == modulus
  • e == public exponent == publicExponent
  • d == private exponent == privateExponent
  • p == prime1 == primeP
  • q == prime2 == primeQ
  • dmp1 == exponent1 == primeExponentP
  • dmq1 == exponent2 == primeExponentQ
  • iqmp == coefficient == crtCoefficient
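
To see the Ruby side of this in practice, a quick illustration like the following prints each parameter from a freshly generated key – the accessor names are exactly the internal OpenSSL names from the table above:

require 'openssl'

key = OpenSSL::PKey::RSA.generate(2048)

# Accessor on the left, the corresponding to_text / Java name in the comment.
# Output is hex, truncated just to keep it short.
puts key.n.to_s(16)[0, 16]    # modulus          / modulus
puts key.e.to_s(16)           # public exponent  / publicExponent
puts key.d.to_s(16)[0, 16]    # private exponent / privateExponent
puts key.p.to_s(16)[0, 16]    # prime1           / primeP
puts key.q.to_s(16)[0, 16]    # prime2           / primeQ
puts key.dmp1.to_s(16)[0, 16] # exponent1        / primeExponentP
puts key.dmq1.to_s(16)[0, 16] # exponent2        / primeExponentQ
puts key.iqmp.to_s(16)[0, 16] # coefficient      / crtCoefficient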


Ruby HTTPS web calls


As I noted in my entry on Ruby security, VERIFY_NONE is used all over the place. And what I realized when I tried to use VERIFY_PEER was that it really doesn’t work for net/https, and doesn’t seem to ever have worked for me. I got a bit mystified by this since I couldn’t find much mention about it online. And then Victor Grey came to the rescue in one of the comments. The solution is to not use net/https at all, but instead use the httpclient gem (formerly called http-access2). So do a ‘gem install httpclient’. Then you can use this code:

require 'rubygems'
require 'httpclient'

clnt = HTTPClient.new
puts clnt.get_content("https://www.random.org/")

This will just work. Under the covers, httpclient uses VERIFY_PEER as default. And you can see this by changing the hostname from www.random.org to random.org. That will generate a verification error directly. Awesome. So what’s the lesson? Never use net/https, folks!



Ruby Security quick guide


I’ve looked around a bit and it seems that there is no real good guide to security programming in Ruby. Neither is there any book available (although Maik Schmidt’s book Enterprise Recipes with Ruby and Rails will be the best reference once it arrives). The aim of this blog entry is to note a few things you often would like to do, and how you can do them with Ruby. The focus will be mostly on the cryptographic APIs for Ruby, which don’t have much documentation either. In fact, the best documentation for this is probably the OpenSSL documentation, once you learn how to map it to the Ruby libraries.

You want to avoid clear text passwords in your database

One of the very good properties of handling passwords is that you usually don’t need to actually know what they are. The only things you need to be able to do are to reset them if someone forgets their password, and to verify the correct combination of username and password. In fact, many practitioners feel better if they don’t have the responsibility of knowing the passwords of every user on their system, and conversely I feel much better knowing that my password is secure from even the administrators of the system. Not that I ever use the same password for two different systems, but someone else might… =)

So how do you solve this easily? Well, the easiest way – and also the most common way – is to use a digest. A digest is a mathematical function that takes input of any length and returns a large number that represents that input. There are three properties of digests that make them useful: the first is that a small change in the input text generates a large change in the output, meaning that olabini and olbbini will have quite different digests.

The second is that digests generally have a good distribution and a small risk of collisions – meaning that it’s extremely unlikely that two different texts have the same output for a specific digest algorithm. The third property is that they are fundamentally one-way mathematical operations – it’s very easy to go from plain text to digest, but extremely hard to go in the other direction.

There are currently several algorithms used for digests. MD5 and SHA-1 are by far the most common ones. If you can avoid MD5, do so, because there have recently been some successful attacks against it. The same might happen with SHA-1 at some point, but right now it’s the best algorithm for these purposes. Oh, and avoid doing a double digest – digesting the output of a digest algorithm – since this generally makes the plain text easier to crack, not harder.

Let’s see some code:

require 'openssl'

digest = OpenSSL::Digest::SHA1.hexdigest("My super secret pass")
p digest # => "dd5f30310682e5b41e122c637e8542b1b39466cf"

digest = OpenSSL::Digest::SHA1.hexdigest("my super secret pass")
p digest # => "f923786cc72ed61ae31325b6e8e285e6c35e6519"

d = OpenSSL::Digest::SHA1.new
d << "first part of pass phrase"
d << "second part of pass phrase"
p d.hexdigest # => "f13d7bdee0634c017babb8c72dcebe18f9e0598e"

I have used two different variations here. The first one, calling a method directly on the SHA1 class, is useful when you have a small string that you want to digest immediately. It’s also useful when you don’t need different algorithms or need to send the digest object around. The second method allows you to update the string data to digest several times and then finally generate the end digest from that. I have taken the liberty of using the methods called hexdigest in both cases – this is because they are more readable. If you replace hexdigest with digest, you will get back a Ruby string that contains characters from 0 to 255 instead, which means they don’t print well. But if you were to compare the results of hexdigest and digest, you will see that they return the same data, just in different formats.

So as you can see, this is extremely easy to incorporate in your code. To verify a password you just make a digest of the password the person trying to authenticate sent in to you, and compare that to what’s in your database. Of course, the approach is generalizable to other cases when you want to protect data in such a way that you can’t recover the data itself.

Finally, be careful with this approach. You should generally combine the password with something else to get good security. There are attacks based on something called rainbow tables that make it easy to find the password from a digested password. This can be avoided by using a salt or some other kind of secret data added to the password.
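
A minimal sketch of what salting could look like follows here. The salt handling is simplified for illustration – in a real application you would store the salt next to the digest for each user – and SHA-1 is used only to stay consistent with the examples above:

require 'openssl'
require 'securerandom'

# Generate a random salt and digest the salt together with the password.
# Store both the salt and the digest in the database.
salt = SecureRandom.hex(16)
stored_digest = OpenSSL::Digest::SHA1.hexdigest(salt + "My super secret pass")

# To verify a login attempt, redo the same operation with the stored salt
attempt = OpenSSL::Digest::SHA1.hexdigest(salt + "My super secret pass")
puts "Password correct" if attempt == stored_digest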

You want to communicate with someone securely

When you want to send messages back and forth between actors, you generally need a way to turn them unreadable and then turn them back into something readable again. The operation you need for this is called a cipher, but there are many kinds of ciphers, and not all of them are right for the job. For example, rot-13 could be considered a cipher, but there is no real security inherent in it.

In general, what you want for communication between two parties is a symmetric cipher using a secret key. A symmetric cipher means that you can use the same key to “lock” and “unlock” a message – or encrypt and decrypt it. There are other kinds of ciphers that are very useful, which we’ll see in the next example, but symmetric ciphers are the most commonly used ciphers since they are quite easy to use and also very efficient. For a symmetric cipher to be completely safe, the key should always be as long as the data that is to be encrypted. Generally this doesn’t happen, since key distribution becomes a nightmare, but the current approaches are reasonably secure against most kinds of cracking. Coupled with asymmetric ciphers, they become extremely useful.

There are three things you need to use a symmetric cipher. The first one is the algorithm. There are loads of different kinds of symmetric ciphers around. Which kind you use doesn’t matter that much as long as you choose something that is reasonably secure. One of the more widely used algorithms is called DES. In its original form DES should definitely not be used (since the key length is only 56 bits, it’s actually not that hard to crack it). There is another form of DES called Triple DES, or DES3, which effectively gives you more security. DES3 might work in some circumstances, but I would recommend AES in almost all cases. AES comes in three varieties called AES-128, AES-192 and AES-256. The difference between them is the key length. As you might guess, AES-128 needs a 128 bit long key, AES-192 a 192 bit long one and AES-256 a 256 bit long one. These key lengths all give reasonable security. The more security you need, the longer the key you can choose. The tradeoff is that the longer the key is, the more time it will take to encrypt and decrypt messages.

Once you have an algorithm, you need a key. This can come in two varieties – either you get a key from a human, in which case that key is generally a password, or you generate a key automatically. The latter should be done with a secure random number generator – NOT with rand().
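
For example, the OpenSSL bindings can do this for you directly – a small sketch (the same approach works for the IV, which is explained in the next paragraph):

require 'openssl'

# Generate random key material with OpenSSL's CSPRNG instead of rand()
key = OpenSSL::Random.random_bytes(16)   # 128 bits for AES-128

# Alternatively, the cipher object can generate (and set) the values itself
cipher = OpenSSL::Cipher::AES128.new("CBC")
cipher.encrypt            # pick a direction before setting key material
key = cipher.random_key   # generates a random key and assigns it to the cipher
iv  = cipher.random_iv    # same for the IV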

Depending on which algorithm you choose, you might also need something called an IV (initialization vector). The algorithms that require an IV are block ciphers running in a chained mode such as CBC (cipher block chaining), and they work on a small block of bytes at a time. In the case of AES-128, a block of 16 bytes is generated on every cycle of the algorithm. These 16 bytes will be based on the algorithm, the key, and the 16 bytes generated the time before. The problem is that the first time around there are no previously generated bytes, which means the first block has to be initialized another way – and this is where the IV comes in. It’s just the first block, used for generating the first real block of data. The IV does not need to be kept secret and can be sent in the clear. In Ruby, if you don’t provide an IV when initializing your cipher, the default will be a part of the string “OpenSSL for Ruby rulez!”. Depending on what length the IV should have, a substring of it will be used.

So, let’s take a look at some code. This code will just encrypt a message with a password and then decrypt it again:

require 'openssl'

cipher = OpenSSL::Cipher::AES128.new("CBC")
cipher.encrypt
cipher.key = "A key that is 16"
cipher.iv =  "An IV that is 16"

output = ""

output << cipher.update("One")
output << cipher.update(" and two")
output << cipher.final

p output # => "\023D\aL\375\314\277\264\256\245\225\a\360|\372+"

cipher = OpenSSL::Cipher::AES128.new("CBC")
cipher.decrypt
cipher.key = "A key that is 16"
cipher.iv =  "An IV that is 16"

output2 = ""

output2 << cipher.update(output)
output2 << cipher.final

p output2 # => "One and two"

We first create a cipher instance, giving it CBC as the type. This is the default, but it will generate a warning if you don’t supply it. We tell the cipher to encrypt, then initialize it with key= and iv=. The key is 16 bytes because it’s a 128 bit cipher, and the IV is 16 bytes because that’s the block size of AES. Then we call update on the data we would like to encrypt. We need to save the return value from the update call, since that’s part of the generated cipher text. This means that even encrypting really large texts can be very efficient, since you can do it in smaller pieces. Finally you need to call final to get the last generated cipher text. The only difference when decrypting is that we call decrypt on the object instead of encrypt. After that the same update and final calls are made. (Am I the only one who thinks that encrypt and decrypt should be called encrypt! and decrypt!?)

That’s how easy it is to work with ciphers in Ruby. There are some complications with regards to padding, but you generally don’t need to concern yourself with that in most applications.

You want nodes to be able to communicate with each other securely without the headache of managing loads of secret keys

OK, cool, symmetric ciphers are nice. But they have one weakness, which is the secret key. The problem shows up if you want to have a network of computers talking to each other securely. Of course, you can have one secret key shared by all of them, but that means that every node in the network can read messages meant for other parties. Or you can have a different secret key for each combination of nodes. That’s OK if you have 3 nodes (when you will just need 3 secret keys). But if you have a network with 100 nodes that all need to communicate with each other, you will need 4950 keys. It will be hard to distribute these keys since they need to be kept secure, and generally just managing it all will be painful.

The solution to this problem is called asymmetric ciphers. The most common form of this is public key cryptography. The idea is that you need one key to encrypt something, and another key to decrypt it. If you have key1 and encrypt something with it, you can’t decrypt that something with key1. This curious property is extremely useful. In public key cryptography you generate two keys, and then you publish one of them widely. Since the public key can’t be used to decrypt content, there is no security risk in not keeping it secret. The private key should obviously be kept private, since you can generate the public key from it. What does this mean in practice? Well, that in your 100 node network, each node can have a key pair. When node3 wants to send a message to node42, node3 will ask around for the public key of node42, encrypt its message with that and send it to node42. Finally, node42 will decrypt the message using its private key.

There is one really large downside with these ciphers: the operations are quite expensive, even compared to other ciphers. So the way you generally solve this is to generate a random key for a symmetric cipher, encrypt the message with that key and then encrypt the actual key using the asymmetric cipher. This is how https works, it’s how SSH works, and it’s how S/MIME and PGP work. It’s generally a good way of matching the strengths and weaknesses of ciphers with each other.

So how do you use an asymmetric cipher? First you need to generate the keys, and then you can use them. To generate the keys you can use something like this:

require 'openssl'

key = OpenSSL::PKey::RSA.generate(1024)
puts key.to_pem
puts key.public_key.to_pem

This will give you the output in the form of one private key and one public key in PEM format. Once you have this saved away somewhere, you can start distributing the public key. As the name says, it’s public so there is no problem with distributing it. Say that someone has the public key and wants to send you a message. If the PEM-encoded public key is in a file called public_key.pem, this code will generate a message that can only be decrypted with the private key, and then read the private key and decrypt the message again.

require 'openssl'

key = OpenSSL::PKey::RSA.new(File.read("public_key.pem"))
res = key.public_encrypt("This is a secret message. WOHO")
p res

key2 = OpenSSL::PKey::RSA.new(File.read("private_key.pem"))
res2 = key2.private_decrypt(res)
p res2

Note that the methods we use are called public_encrypt and private_decrypt. Since the public and private prefixes are there, there has to be a reason for them. And there is, as you’ll see soon.
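
Before moving on, here is a rough sketch of the hybrid approach described earlier – encrypting the actual message with a random AES key and only encrypting that small key with RSA. It reuses the same PEM files as above:

require 'openssl'

pub_key  = OpenSSL::PKey::RSA.new(File.read("public_key.pem"))
priv_key = OpenSSL::PKey::RSA.new(File.read("private_key.pem"))

# Encrypt the message itself with a random symmetric key...
cipher = OpenSSL::Cipher::AES128.new("CBC")
cipher.encrypt
secret_key = cipher.random_key
iv = cipher.random_iv
encrypted_message = cipher.update("A longer secret message goes here") + cipher.final

# ...and encrypt only the small symmetric key with the asymmetric cipher
encrypted_key = pub_key.public_encrypt(secret_key)

# The receiver decrypts the symmetric key, then the message
decipher = OpenSSL::Cipher::AES128.new("CBC")
decipher.decrypt
decipher.key = priv_key.private_decrypt(encrypted_key)
decipher.iv  = iv
p decipher.update(encrypted_message) + decipher.final # => "A longer secret message goes here"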

You want to prove that it was you and only you who wrote a message

Being able to send things in private is all good and well, but if you distribute your public key widely you may never know who you get messages from. Or rather, you know that you get a message from their addresses (if you’re using mail) – but you have no way of ensuring that the other person is actually who they say they are. There is a way around this problem too, using asymmetric ciphers. Interestingly, if you encrypt something using your private key, anyone with your public key can decrypt it. That doesn’t sound very smart from a security perspective, but it’s actually quite useful. Since you are the only one with your private key, if someone can use your public key to decrypt it, then you have to be the one who wrote the message. This has two important consequences. First, you can always trust that the person who wrote you a message was actually the one writing it. And second, that person can never retract a message after it’s been written, since it’s been signed.

So how do you do this in practice? It’s called cryptographic signatures, and OpenSSL supports it quite easily. If you have the keys we created earlier, you can do something like this using the low level operations available on the keys:

require 'openssl'

pub_key = OpenSSL::PKey::RSA.new(File.read("public_key.pem"))
priv_key = OpenSSL::PKey::RSA.new(File.read("private_key.pem"))

text = "This is the text I want to send"
signature = priv_key.private_encrypt(text)

if pub_key.public_decrypt(signature) == text
  puts "Signature match"
else
  puts "Signature didn't match!"
end

There happens to be a slight problem with this code. It works, but if you want to send the signature along, the message will always be at least twice the size of the original text. Also, the larger the text to encrypt, the longer it takes. The way this is solved in basically all cases is to first create a digest of the text and then sign that. The code to do that would look like this:

require 'openssl'

pub_key = OpenSSL::PKey::RSA.new(File.read("public_key.pem"))
priv_key = OpenSSL::PKey::RSA.new(File.read("private_key.pem"))

text = "This is the text I want to send"*200

signature = priv_key.sign(OpenSSL::Digest::SHA1.new,text)

if pub_key.verify(OpenSSL::Digest::SHA1.new, signature, text)
  puts "Signature verified"
else
  puts "Signature NOT verified"
end

As you can see we use the sign and verify methods on the keys. We also have to send in the digest to use.

You want to ensure that a message doesn’t get modified in transit

Another use of signatures, which we actually get for free together with them, is that there is no way to tamper with the message in transit. So even if you send everything in clear text, if the signature verification succeeds you know that the message can’t have been tampered with. The reason is simply this: if the text was tampered with, the digest that gets generated would be different, meaning that the signature would not match what’s expected. And if someone tampered with the signature, it wouldn’t match either. Theoretically you could tamper with both the text and the signature, but then you’d first have to crack the private key used, since otherwise you would never be able to generate a correct signature. All of this comes for free with the earlier code example.

You want to ensure that you can trust someone’s public key

Another thing that can get troublesome with public and private keys is the distribution problem. When you have lots of agents you need to make sure that the public key you get from someone is actually their public key and that they are who they say they are. In real life this is a social problem as much as a technological one. But the way you generally make sure that someone’s public key is actually theirs is to use something called a certificate. A certificate is more or less a signature, where the signer is someone you trust, and the text that is signed is the actual public key in question. If you trust someone that issues certificates, and that someone has issued a certificate saying that a public key belongs to a specific entity, you can trust that public key and use it.

What’s more interesting is that certificates can also be signed, which means that someone you trust can sign another certificate authority, which signs another certificate authority, which finally signs a public key. If you trust the root CA (certificate authority), you should also be able to trust all the certificates and public keys in the chain. This is how the infrastructure surrounding https works. You have a list of implicitly trusted certificates in your browser, and if an https site can’t be verified with this trust store, you will get one of those popup boxes asking if you really want to trust this site or not.

I will not show any code for this, since X509 certificates are actually quite well documented for Ruby. They are also a solution that takes some more knowledge to handle correctly, so a real book would be recommended.

You want to use https correctly

The protocol https is very useful, since it allows you to apply the whole certificate and asymmetric cipher business to your http communication. There is a library called net/https which works really well if you use it right. The problem is that I see lots and lots of example code that really doesn’t use it correctly. The main problem is the setting of verification level. If you have the constant VERIFY_NONE anywhere in your code, you probably aren’t as secure as you’d like to think. The real problem is that I’ve never been able to get Ruby to verify a certificate with VERIFY_PEER. No matter what I do in the manner of adding stores and setting ca_file and ca_path, I haven’t ever been able to make this work. It’s quite strange and I can’t seem to find much mention of it online, since everyone just uses VERIFY_NONE. That’s not acceptable to me, since if no verification happens, there is no way to know if the other party is the right web site.
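
For reference, this is roughly the kind of setup I keep trying – the ca_file path below is just an example of where a CA bundle might live on your system – and in my experience the verification still doesn’t behave as it should:

require 'net/https'

http = Net::HTTP.new("www.random.org", 443)
http.use_ssl = true
http.verify_mode = OpenSSL::SSL::VERIFY_PEER
http.ca_file = "/etc/ssl/certs/ca-certificates.crt"  # example path to a CA bundle

http.start do |conn|
  puts conn.get("/").code
end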

This is one area where JRuby’s OpenSSL seems to work better, in fact. It verifies those sites that should be verified and not the others.

If anyone feels like illuminating this for me, I would be grateful – the state of it right now is unacceptable if you need to do really secure https connections.

Keep security in mind

My final advice is twofold and has nothing to do with Ruby. First, always keep security in mind. Always think about how someone external can exploit what you’re doing, intercept your connections, or just listen in on them. Programmers generally need to be much more aware of this.

The last thing I’ll say is this: start reading Crypto-Gram by Bruce Schneier. While not completely technical anymore, it talks about lots of things surrounding security and also gives you a very good feeling for how you should think about and approach security. Subscribe here.



Security vs Convenience


I really like Crypto-Gram and read every issue. It’s interesting stuff that talks a lot about how our minds work in conjunction with risk and reward. Today I had a typical example of how security versus convenience is a part of day-to-day life.

I had just checked out from my hotel, and wanted to store all my luggage (including my laptop bag) in the hotel until my ride out of town arrived. I asked about this, and it was fine, they had a room for it. The person at the reception pointed me to a room and said it was open and that I could put my stuff there. Feeling uneasy, I asked how secure it was, and she answered that the door was usually locked. OK, I said, but can someone take any bag from inside of there? Yes, was the answer. I decided I couldn’t store my stuff there. Even if the risk was small, losing my work laptop would be way too bad to risk. But I also decided I couldn’t drag my two heavy bags and laptop bag around.

I ended up putting the large bags in the room, and just taking my laptop bag around with me. I didn’t have as much to lose with the large bags, and the price of inconvenience in taking them along was just too high. These considerations go into everything we do in programming and systems engineering. A totally secure system is generally quite inconvenient to use, while an insecure system can be very pleasant to use. The trick is to get the balance right, I guess.