Thinking like a cryptographer, part 2
Last week, we met the granddaddy of cryptosystems: the one-time pad. Julia argued that this was perfectly secure, because if one sees an n-bit ciphertext, there are 2ⁿ possible keys, each corresponding to a different message, and the key is chosen uniformly at random.
We saw that the one-time pad only works if the key is never reused, and the key must be at least as long as the message. These are fairly unrealistic requirements, so this week we will begin to discuss systems that are more practical.
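As a refresher, the one-time pad is only a few lines of code. This sketch works over bytes, XORing the message with a fresh, uniformly random key of the same length; encryption and decryption are the same operation.

```python
import secrets

def keygen(n: int) -> bytes:
    """Sample an n-byte key uniformly at random."""
    return secrets.token_bytes(n)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Encrypting and decrypting are both just XOR with the key.
message = b"we attack at dawn"
key = keygen(len(message))       # the key must be as long as the message
ciphertext = xor(message, key)
assert xor(ciphertext, key) == message
```

Note that the two requirements above show up directly in the code: `keygen` is called with `len(message)`, and reusing `key` for a second message would let an eavesdropper XOR two ciphertexts together and cancel it out.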
Julia breathes a sigh of relief as Ron accepts her presentation of the one-time pad. However, she sees Ron’s next question in her mind’s eye before he even begins to ask it.
R: So we have a perfectly secure cryptosystem, but it needs the key to be just as long as the message, and we can’t even reuse it. That doesn’t seem very useful. We might have to set our sights somewhat lower.
J: And presumably we won’t get perfect security this time.
R: That’s right. What might we hope for instead?
J: We started with “given a ciphertext, it isn’t possible to do better than guessing the message uniformly at random.” What if we say “given a ciphertext, you can’t do much better than guessing the message uniformly at random”?
R: That’s a good start, but we have a problem: you might only care about half the message, say if my message is “we attack at dawn and I like cats”.1
J: Oh, you’re saying I shouldn’t be able to get any information about the message except with a small probability?
R: Yes. How might we phrase that mathematically? Think of “information” as being a set of properties of the message.
J: Let’s lay it out: I want to evaluate some Boolean function f on your messages.2 You have a secret message m. I have some probability of guessing f(m). If I see an encryption of m, my success probability shouldn’t increase much.
Julia is discussing semantic security, one of the earliest generalisations of perfect security to imperfect cryptosystems. Mathematically, if the ciphertext c is given by

c = Enc(k, m),

then the following conditional probability distributions must be approximately equal:3

Pr[f(m) = b | c] ≈ Pr[f(m) = b] for every possible value b.
An obvious question is “what exactly does ≈ mean?” We can reasonably assume that a longer message contains more information, and so does a longer ciphertext. If they are of length λ = |m|, I claim the following: a difference in probabilities 𝛿(λ) is “small enough” if it decreases faster than any polynomial in λ increases. Mathematically, the following should be true for all polynomials p(λ):

p(λ) · 𝛿(λ) → 0 as λ → ∞.
This is not supposed to be obvious. This definition4 is basically cryptographic folklore, and the philosophical justification boils down to the convention that an algorithm is efficient if it runs in a polynomial number of steps — a concept we will return to later today.
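To get a feel for why this works, here is a small numerical illustration, taking 𝛿(λ) = 2⁻λ and p(λ) = λ³ as arbitrary examples: the product p(λ) · 𝛿(λ) shrinks towards zero even though p itself grows quickly.

```python
# delta(lam) = 2**-lam is a negligible function: for ANY polynomial p,
# p(lam) * delta(lam) -> 0.  We use p(lam) = lam**3 as one example.
def delta(lam: int) -> float:
    return 2.0 ** -lam

def p(lam: int) -> float:
    return lam ** 3

for lam in (8, 16, 32, 64):
    print(lam, p(lam) * delta(lam))   # the product keeps shrinking
```

Any function bounded by 1/2^λ, or even 1/2^√λ, passes this test; a function like 1/λ¹⁰⁰ does not, since p(λ) = λ¹⁰¹ eventually outgrows it.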
Quite miraculously, it turns out that a much simpler definition is equivalent to semantic security.
Imagine a game between Charlie and Eve. Charlie picks a key and asks Eve to give him two messages m₀ and m₁ of the same length. He flips a coin: if he gets heads he encrypts m₀, otherwise he encrypts m₁; in either case he sends the resulting ciphertext to Eve. If Eve successfully guesses which message Charlie encrypted, she wins. This is called indistinguishability under a chosen plaintext attack, or IND-CPA for short. (The “chosen plaintext” refers to Eve choosing m₀ and m₁; in the full definition she may also ask Charlie to encrypt other messages of her choice before guessing.) Eve could of course toss a coin herself and have a 50-50 chance of winning, so we demand that she have only a negligible advantage over guessing randomly.
If this is true for all “efficient” adversaries, then the cryptosystem is also semantically secure; conversely, if a cryptosystem is semantically secure then this statement is true. Because it’s much easier to prove, cryptographers usually think purely in terms of this indistinguishability criterion.5 (This is also not supposed to be obvious, but it’s sadly not something I have room to prove here!)
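The game itself is short enough to write down directly. The sketch below is illustrative only: the scheme and the adversary are placeholder toys, with a one-time pad (fresh key each round) standing in for the encryption scheme and an adversary that guesses at random.

```python
import secrets

def game(encrypt, adversary, n=16):
    """One round of the indistinguishability game.
    `encrypt` and `adversary` are stand-ins for a real scheme and attacker."""
    m0, m1 = adversary.choose(n)       # two messages of equal length n
    b = secrets.randbits(1)            # Charlie's coin flip
    c = encrypt(m1 if b else m0)
    return adversary.guess(c) == b     # True iff Eve wins

class RandomEve:
    """A toy adversary: picks two fixed messages, then guesses blindly."""
    def choose(self, n):
        return bytes(n), bytes([0xFF]) * n
    def guess(self, c):
        return secrets.randbits(1)

# One-time pad with a fresh random key every round.
otp = lambda m: bytes(x ^ y for x, y in zip(m, secrets.token_bytes(len(m))))

wins = sum(game(otp, RandomEve()) for _ in range(1000))
# wins should hover around 500: no advantage over coin tossing.
```

A real security proof never runs this experiment, of course; it argues about *all* efficient adversaries at once. But the code makes the rules of the game unambiguous, including the requirement that m₀ and m₁ have the same length (otherwise Eve could win just by looking at the ciphertext’s length).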
R: Now that we have our goal, how can we achieve it?
J: The normal way I’ve seen this done is to take some problem that we think is hard to solve — like factorising a big number — and showing you can’t break our system if you can’t solve the problem.6
R: And how would you prove this security?
J: Using a reduction. We take an algorithm that can break the system and turn it into one that can factorise integers. If no algorithm exists that can factorise integers, then none can exist that breaks the system.
R: But such an algorithm exists for your factorisation example. If n = pq where p and q are prime numbers, then I can just try dividing n by every prime number less than n and see if it divides evenly.
J: Oh, I mean efficient algorithms. For your example, if n is a b-bit number, then you have to check all prime numbers up until about 2ᵇ. That’s exponential, so it’s slow. If it’s a polynomial like b² or b³, then it’s efficient.
R: You’re saying that if no polynomial-time algorithm exists to factorise integers, then no polynomial-time algorithm exists that can break our security.
J: Yes, exactly.
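Ron’s brute-force factoriser is worth writing down, with the standard refinement of stopping at √n (any composite n has a divisor no larger than that). For a b-bit n this still takes on the order of 2^(b/2) steps in the worst case, so it is correct but nowhere near polynomial time.

```python
def trial_divide(n: int):
    """Factorise n by trying every candidate divisor up to sqrt(n).
    The smallest divisor found is automatically prime.  Correct, but the
    loop runs ~2**(b/2) times for a b-bit n: exponential, hence slow."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return n, 1   # no divisor found: n is prime

print(trial_divide(3233))   # 3233 = 53 * 61
```

For a 2048-bit n, the number of loop iterations dwarfs the number of atoms in the observable universe, which is the sense in which “an algorithm exists” and “an efficient algorithm exists” come apart.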
R: One question: what exactly is a polynomial-time algorithm?
J: Um… an algorithm that takes only polynomially-many steps as a function of its input size?
R: OK, an easier question: what exactly is an algorithm?
This is a harder question to answer than it may seem. At first glance, an algorithm is a bit like a function: you give it some inputs (the number n that you want to factorise) and it gives you some outputs (the prime factors p and q such that n = pq). But it isn’t just a function: it’s a sequence of steps to compute that function. Moreover there are many functions that are non-computable; the truth function that maps a statement of arithmetic (such as Goldbach’s conjecture) to true or false is one example among many others.
You have probably heard of the Turing machine. This is an abstract model of computing that roughly works as follows: it has a tape composed of an infinite list of cells, and it has a set of states. Initially, finitely many cells are assigned some symbol such as 1, 2, 3, a, etc.; this represents the machine’s input. At each step, the machine reads the current cell and, based on the symbol it reads and its current state, decides:
whether to overwrite the cell with another symbol;
whether to move the tape left or right (by one cell); and
which state it should move to next.
The Turing machine was one of the earliest models of computing (cf. the lambda calculus), and it appears to capture what we mean by an “algorithm”: it takes input and gives you output (e.g. the final tape’s symbols), and it naturally defines a series of steps to compute that output. Even better, we can take a particular Turing machine and ask: what is the largest number of steps it must take if at most n symbols are on the input tape? If the answer is a polynomial in n, then we can say our machine is efficient.
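To make this concrete, here is a tiny Turing-machine simulator. The example machine flips every bit of its input and halts at the first blank; on n input symbols it takes n + 1 steps, a (very small) polynomial in n, so it counts as efficient in the sense above.

```python
def run_tm(rules, tape, state="start", max_steps=10_000):
    """Simulate a one-tape Turing machine.
    rules: (state, symbol) -> (new_symbol, move, new_state), move in {-1, +1}.
    tape:  dict position -> symbol; unwritten cells read as the blank '_'."""
    pos, steps = 0, 0
    while state != "halt" and steps < max_steps:
        symbol = tape.get(pos, "_")
        new_symbol, move, state = rules[(state, symbol)]
        tape[pos] = new_symbol     # overwrite the current cell
        pos += move                # move the head one cell
        steps += 1
    return tape, steps

# A machine that flips every bit until it reads a blank, then halts.
rules = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
    ("start", "_"): ("_", +1, "halt"),
}
tape, steps = run_tm(rules, {0: "1", 1: "0", 2: "1"})
# tape now reads 0, 1, 0 and steps == 4 (three flips plus the halting move).
```

The dictionary `rules` is exactly the “set of states” from the description above: each entry says what to write, where to move, and which state to enter next.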
J: … and so you see, our bad guy is a Turing machine that must run in polynomial time. More specifically, I’m saying that if some efficient Turing machine breaks our security, then I can combine it with another efficient Turing machine to turn it into a third machine that efficiently factorises integers.
R: What does combining Turing machines mean?
J: Uh… I’m not sure exactly, but intuitively it’s clear. Our machine runs the security breaking machine with some input and gets an answer, then transforms the answer into factorised integers.
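Julia’s intuition can be made concrete with a classical example of the same shape (this is not the cryptosystem’s security reduction itself): a machine that computes Euler’s φ function can be combined with a little algebra to factorise n = pq. Since φ(n) = (p − 1)(q − 1) = n − (p + q) + 1, we can recover p + q from φ(n), and then p and q as the roots of a quadratic. The oracle below cheats by factoring internally, purely so the demo runs.

```python
from math import isqrt

def phi_oracle(n: int) -> int:
    """Stand-in for an assumed efficient phi-computing machine.
    (It cheats by trial division; only its input/output behaviour matters.)"""
    for p in range(2, isqrt(n) + 1):
        if n % p == 0:
            return (p - 1) * (n // p - 1)
    return n - 1   # n prime

def factorise_with_phi(n: int, oracle):
    """The reduction: run the oracle once, then transform its answer.
    p + q = n - phi(n) + 1 and (p - q)**2 = (p + q)**2 - 4n."""
    s = n - oracle(n) + 1          # p + q
    d = isqrt(s * s - 4 * n)       # p - q
    return (s - d) // 2, (s + d) // 2

print(factorise_with_phi(3233, phi_oracle))   # 3233 = 53 * 61
```

The combined machine does only a constant amount of extra arithmetic around one oracle call, so if the oracle is efficient, so is the whole thing; that “wrap the attacker and post-process its answer” pattern is what a cryptographic reduction looks like.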
R: Okay, and why a Turing machine? Nobody in real life is using a Turing machine after all.
J: Well, the Church–Turing thesis tells us that any function computable by any realistic means can be computed by a Turing machine, so every reasonable model of computation is equivalent to it. Turing machines are also easy to reason about, so we may as well use them.
We have arrived at the end of our quest to think like a cryptographer in the 20th century. We saw one way to define imperfect security, and a much simpler version that is surprisingly equivalent. We learnt about what an algorithm is and how we can use a reduction to argue a system satisfies the security definition. If you’d like to see a formal definition of a cryptosystem and a reduction to an assumed-hard problem, my Master’s thesis contains a full discussion for the ElGamal cryptosystem. (If you don’t know much about groups, you may find this other thing I wrote helpful.)
You may have detected a hint of scepticism in Ron’s questions to Julia. I encourage you to think critically about what we’ve done so far. Where are we making arbitrary choices? Do our definitions really say what we want them to? Are there other possible approaches?
Next week, we will enter the twenty-first century.
Which half matters here is entirely up to you.
A Boolean function is one that outputs either true or false; also called a predicate.
An important technicality is that the function f must be “efficiently computable”, as discussed later in this post.
See also a host of equivalent definitions.
To the pedants: cryptographers also think in terms of extensions such as IND-CCA and IND-CCA2.
A common misconception is that cryptography uses NP-complete problems (if you don’t know what this means, don’t worry). This isn’t true. We have algorithms that can solve prime factorisation and integer discrete logarithms in sub-exponential but super-polynomial time, which (assuming NP-complete problems do require exponential time) means they are not NP-complete. It turns out that NP-complete problems like Boolean satisfiability or Hamiltonian cycle finding are too inconvenient to make into cryptosystems, and we have other perfectly good problems that are (probably) not in P.