The Caesar Cipher is a very simple encryption technique named after Julius Caesar. It is a simple substitution cipher in which each letter in the plaintext is replaced with some letter a fixed number of positions down the alphabet.

Here’s an illustration that explains it nicely:

In this post, we’ll take a look at an implementation of the cipher as well as using frequency analysis to break the cipher. All the source code is available here. It is written in a Lisp variant called Urn.

Implementation

Since all the Caesar Cipher does is shift letters around in the alphabet, it’s quite simple to implement:

(import constants (ALPHABET-LETTERS CAPITAL-A-ASCII CAPITAL-Z-ASCII LOWER-A-ASCII LOWER-Z-ASCII))

(defun do-shift (c shift _min _max)
	:hidden
	(with (n (+ c shift))
		(cond
			[(< n _min)
				(+ n ALPHABET-LETTERS)]
			[(> n _max)
				(- n ALPHABET-LETTERS)]
			[else
				n])))

(defun transform (shift text)
	(let* [(original (string->bytes text))
		   (transformed '())]
		(for-each c original
			(cond
				[(and (>= c CAPITAL-A-ASCII) (<= c CAPITAL-Z-ASCII))
					(push! transformed (do-shift c shift CAPITAL-A-ASCII CAPITAL-Z-ASCII))]
				[(and (>= c LOWER-A-ASCII) (<= c LOWER-Z-ASCII))
					(push! transformed (do-shift c shift LOWER-A-ASCII LOWER-Z-ASCII))]
				[else
					(push! transformed c)]))
		(bytes->string transformed)))

As you can see, we first convert the input string (plaintext) to an array of ASCII bytes. Then we add the shift value to that value and correct for if we go “below A” or “above Z”. We do this in the case of lowercase and uppercase letters. We then push the shifted ASCII byte into an array and finally, once we’ve done this for each letter in the input string, convert the resulting array of shifted bytes back into a string and return that as the result. Characters that are not letters are not modified.

If you want to decrypt some ciphertext, you just transform it with the negative of the shift it was encrypted with.

We can also look at this mathematically:

$$ T_n(x) = (x + n) \mod 26 $$

In this example, T is a function that transforms one letter, x, in the cipher by the shift value n. For decryption, as I said, the negative of the shift is used. This function assumes letters are represented as 0 to 25, ignoring case. The concept can be expanded quite easily.

Frequency Analysis

In text that is comprised of valid sentences in some language, in our case, English, the relative frequencies of letters tends to be similar to some average frequency profile. For example, in English, ‘E’ is the most common letter. We can use knowledge of the average relative frequencies of letters in the English language to figure out the shift that was used to encrypt some ciphertext, assuming it was encypted with the Caesar Cipher.

Here is the frequency profile of letters for English:

English Frequencies

The way this is calculated is by counting how many occurences there are of each letter and then dividing each of those sums by the total number of letters. Here is the Urn code for performing this calculation:

(defun letter-index (b)
	:hidden
	(cond
		[(and (>= b CAPITAL-A-ASCII) (<= b CAPITAL-Z-ASCII))
			(- b CAPITAL-A-ASCII)]
		[(and (>= b LOWER-A-ASCII) (<= b LOWER-Z-ASCII))
			(- b LOWER-A-ASCII)]
		[else
			nil]))

(defun analyze (text)
	(let* [(frequencies '(0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0))
		   (total 0)
		   (data (string->bytes text))]
		(for-each b data
			(when-with (c (letter-index b))
				(inc! c)
				(set-idx! frequencies c (+ (get-idx frequencies c) 1))
				(inc! total)))
		(for i 1 26 1
			(set-idx! frequencies i (/ (get-idx frequencies i) total)))
		frequencies))

Before I move on to the next section, let’s also take a look at the frequency profile of the Bee Movie script (the test data) compared to the frequency profile for English:

English vs Bee Movie Script Frequencies

As you can see, it’s not an exact match, but the similarity is blatantly obvious.

Cracking the Caesar Cipher with Frequency Analysis

Since we know roughly what the freuency profile of plaintext should look like, we can use that to figure out the shift needed to decrypt a ciphertext. It’s not a perfect science; it wouldn’t work as well for smaller ciphertexts or a ciphertext whose plaintext is not comprised of English sentences. But in general, it works pretty well.

First of all, let’s define a function we can use to compare two freuency profiles:

(defun distance (a b)
	:hidden
	(assert! (= (len# a) 26) "invalid frequency profile")
	(assert! (= (len# b) 26) "invalid frequency profile")
	(with (total 0)
		(for i 1 ALPHABET-LETTERS 1
			(let* [(d (- (get-idx b i) (get-idx a i)))
				   (d2 (* d d))]
				(set! total (+ total d2))))
		(sqrt total)))

This function just calculates the Euclidean distance between the two frequency profiles. Here’s what it does written mathematically:

$$ D(A, B) = \sqrt{\sum_{i=1}^{26}{(B_i - A_i)^2}} $$

In order to crack a ciphertext, we simply try to minimize this distance. We iteratively search for the shift value that produces a frequency profile with the lowest distance to the frequency profile of English.

Here’s the implementation of that in code:

(defun crack (ciphertext)
	(let* [(best-distance huge)
		   (best-shift nil)]
		(for i 1 (- ALPHABET-LETTERS 1) 1
			(let* [(maybe-cleartext (transform i ciphertext))
				   (maybe-cleartext-freqs (analyze maybe-cleartext))
				   (dist-from-best (distance ENGLISH-LETTERS-RELATIVE-FREQUENCIES maybe-cleartext-freqs))]
				(when (< dist-from-best best-distance)
					(set! best-distance dist-from-best)
					(set! best-shift i))))
		best-shift))

This function finds the shift value that produces a plaintext with a frequency profile as close to that of the English language as possible. It is possible that this will return the wrong shift; it’s only a best approximation. We then use this shift to decrypt the ciphertext.

Notice that we only try shift values between 1 and 25; we don’t try 26 as it would be the same as 0.

Conclusion

That’s really all there is to it. The Caesar Cipher is vulnerable to frequency analysis attacks because, although it scrambles the plaintext, it preserves the frequency information in a lossless fashion. Therefore extracting this information is trivial.