Cryptography—Not Just a Digital Thing
As defined by Bruce Schneier in his book Applied Cryptography, “The art and science of keeping messages secure is cryptography […].” Cryptography, while now considered fundamental in our digital lives, is not specifically related to computing. It has existed in various forms for millennia. From Wikipedia’s article on the history of cryptography:
The earliest known use of cryptography is found in non-standard hieroglyphs carved into the wall of a tomb from the Old Kingdom of Egypt circa 1900 BCE. These are not thought to be serious attempts at secret communications, however, but rather to have been attempts at mystery, intrigue, or even amusement for literate onlookers. These are examples of still other uses of cryptography, or of something that looks (impressively if misleadingly) like it. Some clay tablets from Mesopotamia somewhat later are clearly meant to protect information — one dated near 1500 BCE was found to encrypt a craftsman’s recipe for pottery glaze, presumably commercially valuable. Later still, Hebrew scholars made use of simple monoalphabetic substitution ciphers (such as the Atbash cipher) beginning perhaps around 500 to 600 BCE. In India around 400 BCE to 200 CE, Mlecchita vikalpa or the art of understanding writing in cypher, and the writing of words in a peculiar way was documented in the Kama Sutra for the purpose of communication between lovers. This was also likely a simple substitution cipher. Parts of the Egyptian demotic Greek Magical Papyri were written in a cypher script.
You have probably seen simple examples of cryptography before. A Caesar cipher, or shift cipher, is commonly used in children’s games which involve decoding a secret message. ROT13 is an extremely common type of shift cipher in which the alphabet is rotated by 13 steps, as illustrated below:
It’s easy to see that this kind of cipher provides no real security, but it’s a simple and fun illustration of the general idea behind cryptography.
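For the curious, here is a minimal Ruby sketch of the rotation described above. The method name `rot13` and the sample string are my own choices for illustration; the only library call is `String#tr`, which maps each letter to the one 13 places further along the alphabet.

```ruby
# Rotate each letter 13 places; anything that is not a letter passes through unchanged.
def rot13(text)
  text.tr('A-Za-z', 'N-ZA-Mn-za-m')
end

secret = rot13('Hello, World!')
puts secret         # Uryyb, Jbeyq!
puts rot13(secret)  # Hello, World!
```

Because the alphabet has 26 letters, applying the 13-step rotation twice brings you back to where you started, so the same method both encodes and decodes.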
Nowadays when we talk about cryptography, we usually talk about it in the context of technology. How is personal and financial information safely transmitted (known as protecting data in transit) on the Web, say, when we make a purchase or look at our bank accounts? How can data safely be stored (known as protecting data at rest) so that someone couldn’t just open up a computer, pop out the hard drive, and have a field day with the information on it?
Some Definitions and a Quick Cybersecurity Primer
In cybersecurity, there are a number of things we are concerned with when it comes to data. These include confidentiality, integrity, availability, and non-repudiation.
Confidentiality means that our data cannot be accessed/read by unauthorized users.
Integrity means that our data gets to us 100% intact, and has not been modified, whether by a malicious actor, data loss, or otherwise.
Availability means that our data is accessible when needed.
Non-repudiation means that if Bob sends some data to Mary, he should not be able to claim later on that he was not, in fact, the sender of that information. In other words, there is some way to determine that no one other than Bob could have sent the data.
Cryptography doesn’t do much for us in the way of availability, but we will look at the various forms of digital cryptography and how they can help us achieve the other three goals listed above. When we talk about digital cryptography, we are usually referring to one of the following:
- Symmetric encryption
- Asymmetric encryption
- Hash functions
- Digital signatures
I will expand on each of these below. All code examples are adapted from Jesus Castello’s awesome post on SitePoint, Exploring Cryptography Fundamentals in Ruby, as well as official Ruby documentation. Also keep in mind that these examples are meant to illustrate the concepts, not to provide best practices for data security.
Okay, before we dive into this: what exactly do we mean by ‘encryption’? Encryption and decryption are typically used to mean enciphering and deciphering, respectively; to put it simply, encrypting a message means making it unreadable to unauthorized parties using a cipher (the specific method for doing so). Decrypting the message means reversing the process and making the data readable once more.
In order to properly encrypt and decrypt our data, we need both the data and a key (which determines the output of our cipher).
With symmetric encryption, the key used to encrypt and decrypt data is the same. Let’s take a string and encrypt it using Ruby and OpenSSL:
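    require 'openssl'
    require 'pry'

    data_to_encrypt = 'now you can read me!'

    # 'aes256' is OpenSSL's short name for AES-256 in CBC mode
    cipher = OpenSSL::Cipher.new('aes256')
    cipher.encrypt           # put the cipher into encryption mode
    key = cipher.random_key  # generate a random key and keep a copy of it
    iv = cipher.random_iv    # do the same for the initialization vector

    data_to_encrypt = cipher.update(data_to_encrypt) + cipher.final

    binding.pry
    true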
Note that I am doing this destructively (reassigning the variable which contained our original string) in order to demonstrate that we are indeed encrypting and decrypting. Now let’s pop into our Pry console:
Notice that our data_to_encrypt variable, which was initially set to ‘now you can read me!’, is now a bunch of unreadable garbage. Let’s reverse the process, using the key we initially saved in the key variable:
As we can see, using the same key we set for encryption, we get back our original string.
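Here is the whole round trip as one script you can run without Pry. Treat it as a sketch: the variable names `plaintext`, `ciphertext`, `decipher`, and `recovered` are mine, and I have spelled the cipher out as 'aes-256-cbc' rather than using the shorter alias.

```ruby
require 'openssl'

plaintext = 'now you can read me!'

# Encrypt: generate a random key and IV, then run the data through the cipher.
cipher = OpenSSL::Cipher.new('aes-256-cbc')
cipher.encrypt
key = cipher.random_key
iv = cipher.random_iv
ciphertext = cipher.update(plaintext) + cipher.final

# Decrypt: a second cipher object in decryption mode, fed the SAME key and IV.
decipher = OpenSSL::Cipher.new('aes-256-cbc')
decipher.decrypt
decipher.key = key
decipher.iv = iv
recovered = (decipher.update(ciphertext) + decipher.final).force_encoding('UTF-8')

puts ciphertext == plaintext  # false
puts recovered == plaintext   # true
```

The decrypting side needs both the key and the IV. The IV is not a secret, but the key very much is.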
The problem with symmetric encryption is this: What if I need to send data securely in a hostile environment, such as the Internet? If the same key is used to encrypt and decrypt data, then I would first need to send you the decryption key to establish a secure connection. But that means I’m sending the key over an insecure connection to start with, meaning the key can be intercepted and used by a third party! How do we get around this? Enter asymmetric encryption.
I’ve never really studied the mathematics of asymmetric encryption, and it would be beyond the scope of this blog post anyway, but I can give you a general idea. To use an asymmetric cipher, you need to generate two keys which are mathematically related. One is your private key, which only you should have access to, and the other is your public key, which (as its name suggests) can be shared publicly with anyone. Data encrypted with one of the two keys can only be decrypted with the other.
So, you request a secure connection to a server, and the server sends its public key. The client generates a key for a symmetric cipher and encrypts it with the server’s public key. The server decrypts the message containing the symmetric key using its private key. Now that both parties have the symmetric key, a secure connection can be established using symmetric encryption.
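To make that handshake concrete, here is a rough sketch with both "sides" living in one Ruby script. It is not how TLS actually does things. The names `server_key`, `session_key`, and `wrapped_key`, the sample message, and the choice of OAEP padding are all my own assumptions for illustration.

```ruby
require 'openssl'

# Server: owns an RSA key pair and hands out only the public half.
server_key = OpenSSL::PKey::RSA.new(2048)
server_public_key = server_key.public_key

# Client: generates a symmetric key and encrypts it with the server's public key.
cipher = OpenSSL::Cipher.new('aes-256-cbc')
cipher.encrypt
session_key = cipher.random_key
iv = cipher.random_iv
padding = OpenSSL::PKey::RSA::PKCS1_OAEP_PADDING
wrapped_key = server_public_key.public_encrypt(session_key, padding)

# Client: everything from here on is encrypted with the symmetric key.
ciphertext = cipher.update('hello, server') + cipher.final

# Server: recovers the symmetric key with its private key, then decrypts the message.
unwrapped_key = server_key.private_decrypt(wrapped_key, padding)
decipher = OpenSSL::Cipher.new('aes-256-cbc')
decipher.decrypt
decipher.key = unwrapped_key
decipher.iv = iv
message = decipher.update(ciphertext) + decipher.final

puts unwrapped_key == session_key  # true
puts message                       # hello, server
```

The symmetric key never travels in the clear. An eavesdropper sees only `wrapped_key`, which is useless without the server's private key.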
But wait! Now we have another problem. How do I know that the server’s public key is legitimate, i.e. belongs to that server!? In general, there are a couple of ways to deal with this issue, but the most common method (and the one used on the web) is by using Public Key Infrastructure (PKI). In the case of websites, a Certificate Authority (CA) issues each site a certificate that ties its domain name to its public key, and the CA digitally signs that certificate. When you connect to a website, your browser checks that signature against a list of Certificate Authorities it already trusts before relying on the site’s public key.
Let’s bring back our string to encrypt from the last section, and generate a public/private key pair:
    require 'openssl'
    require 'pry'

    data_to_encrypt = 'now you can read me!'

    # Generate a 2048-bit RSA key pair
    key = OpenSSL::PKey::RSA.new(2048)

    binding.pry
    true
Now let’s get into our Pry console and work on our string:
First, notice that our key and our public key are separate objects with different object IDs. Using #private_encrypt, we can encrypt our string using our private key. Again, note the nonsensical output that we’ve put in our data_to_encrypt variable. Now let’s decrypt the data using our public key:
Using #public_decrypt, we were able to get back the original message using NOT the private key with which we originally encrypted the message, but its related public key. The same works in reverse.
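Here is a small sketch of both directions in one script. The variable names (`scrambled`, `restored`, `sealed`, `opened`) are mine, and I am using Ruby's default padding for all four calls.

```ruby
require 'openssl'

message = 'now you can read me!'
key = OpenSSL::PKey::RSA.new(2048)
public_key = key.public_key

# Direction 1: encrypt with the private key, decrypt with the public key.
scrambled = key.private_encrypt(message)
restored = public_key.public_decrypt(scrambled)

# Direction 2 ("in reverse"): encrypt with the public key, decrypt with the private key.
sealed = public_key.public_encrypt(message)
opened = key.private_decrypt(sealed)

puts restored == message  # true
puts opened == message    # true
```

Only the second direction keeps a message confidential, since anyone holding the public key can undo the first. The first direction is the idea behind digital signatures.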
A hashing function, unlike symmetric/asymmetric encryption, is a one-way function. You can create a hash from some data, but there is no way to actually reverse the process. As such, it is not a useful way to store data, but it is a useful way to verify the integrity of some data.
A hashing function takes some data as its input and outputs a string that looks random (but is entirely deterministic) and is always the same length. An ideal hashing function creates unique values for different inputs; in practice collisions must exist, but a good function makes them infeasible to find. The exact same input will always produce the exact same hash—which is why we can use it to verify data integrity.
Let’s use a new string this time, run it through a hashing function, and store that hash in a variable.
    require 'digest'
    require 'pry'

    test = 'some data'

    # Raw (binary) SHA-256 digest of the string
    digest = Digest::SHA256.digest(test)

    binding.pry
    true
First, let’s hash our string again and compare that digest to the one we saved in the digest variable:
As we can see, as long as the data remains the same, the digests will still match. Now let’s change the data juuust a bit and compare the digests. Then, we’ll change the data back to exactly what it was originally, and compare the digests once more.
To give you an idea of how different the digests look even for similar data, have a look at the digests themselves:
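If you want to try the comparison yourself, here is a minimal sketch. I am using `hexdigest` instead of `digest` so the output is printable, and the variable names and the one-character tweak to the input are my own.

```ruby
require 'digest'

original = Digest::SHA256.hexdigest('some data')
same     = Digest::SHA256.hexdigest('some data')
changed  = Digest::SHA256.hexdigest('some data!')

puts original == same     # true: identical input, identical digest
puts original == changed  # false: one extra character changes the digest
puts original.length      # 64 hex characters, whatever the input size
puts original
puts changed
```

Printing the last two lines side by side shows how little the two digests have in common, even though the inputs differ by a single character.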
Digital signatures are great for both integrity and non-repudiation. A digital signature is a combination of hashing and asymmetric encryption. That is to say, a message is first hashed, and that hash is encrypted with the sender’s private key. This constitutes the signature, which is sent along with the message.
The recipient uses the sender’s public key to extract the hash from the signature, and the message is then hashed to compare against the extracted hash. If you know for sure that the public key belongs to the sender, and the public key decryption is successful, then you can be assured that the message actually came from the sender. If the extracted hash matches the computed hash for the message, you can be assured of the message’s integrity.
Bear in mind that a digital signature does not necessarily make the message confidential; it is absolutely possible to sign a plaintext message. Digital signatures will work with encrypted messages, but the encryption of the message itself must be performed separately.
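To round things off, here is a sketch of signing and verifying with Ruby's OpenSSL bindings, which do the hashing and the private-key step in a single `sign` call. Bob's key pair, the message text, and the variable names are made up for illustration.

```ruby
require 'openssl'

# Bob's key pair; Mary only ever sees the public half.
bob_key = OpenSSL::PKey::RSA.new(2048)
bob_public_key = bob_key.public_key

message = 'Bob owes Mary lunch'

# Bob: hash the message with SHA-256 and sign that hash with his private key.
signature = bob_key.sign(OpenSSL::Digest.new('SHA256'), message)

# Mary: verify the signature against the message using Bob's public key.
genuine = bob_public_key.verify(OpenSSL::Digest.new('SHA256'), signature, message)

# If the message is altered in transit, the same signature no longer verifies.
tampered = bob_public_key.verify(OpenSSL::Digest.new('SHA256'), signature, 'Bob owes Mary dinner')

puts genuine   # true
puts tampered  # false
```

Notice that `message` stays in plaintext the whole time. The signature proves who sent it and that it was not changed, and nothing more.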