Caesar Cipher - Description and Cryptanalysis

Introduction

The Caesar cipher (a.k.a. the shift cipher, Caesar's Code or Caesar Shift) is one of the earliest known and simplest ciphers. It is a type of substitution cipher in which each letter in the plaintext is 'shifted' a certain number of places down the alphabet. For example, with a shift of 1, A would be replaced by B, B would become C, and so on, with Z wrapping around to A. The method is named after Julius Caesar, who used it to communicate with his generals.

More complex encryption schemes such as the Vigenère cipher employ the Caesar cipher as one element of the encryption process. The widely known ROT13 'encryption' is simply a Caesar cipher with an offset of 13. The Caesar cipher offers essentially no communication security, and it will be shown that it can be easily broken even by hand.

Example

To pass an encrypted message from one person to another, both parties must first have the 'key' for the cipher, so that the sender can encrypt the message and the receiver can decrypt it. For the Caesar cipher, the key is the number of positions by which the cipher alphabet is shifted. Here is a quick example of the encryption and decryption steps involved with the Caesar cipher. The text we will encrypt is 'defend the east wall of the castle', with a shift (key) of 1.
plaintext:  defend the east wall of the castle
ciphertext: efgfoe uif fbtu xbmm pg uif dbtumf
It is easy to see how each character in the plaintext is shifted one place down the alphabet. Decryption is just as easy: apply an offset of -1.
The plain alphabet to cipher alphabet mapping is shown below (for the above example):
plain:  abcdefghijklmnopqrstuvwxyz
cipher: bcdefghijklmnopqrstuvwxyza
If a different key is used, the cipher alphabet will be shifted by a different amount.
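The example above can be reproduced with a few lines of code. What follows is a minimal Python sketch, not the C code referred to later on this page. The function name caesar_shift is a hypothetical choice made for this illustration, and the sketch assumes that only the letters a-z and A-Z are shifted while spaces and punctuation pass through unchanged.

```python
def caesar_shift(text, key):
    """Shift every letter in text by key places, wrapping around the alphabet.

    Characters that are not ASCII letters are copied unchanged (an assumption
    made for this sketch).
    """
    out = []
    for ch in text:
        if 'a' <= ch <= 'z':
            out.append(chr((ord(ch) - ord('a') + key) % 26 + ord('a')))
        elif 'A' <= ch <= 'Z':
            out.append(chr((ord(ch) - ord('A') + key) % 26 + ord('A')))
        else:
            out.append(ch)
    return ''.join(out)


plaintext = "defend the east wall of the castle"
ciphertext = caesar_shift(plaintext, 1)
print(ciphertext)                    # efgfoe uif fbtu xbmm pg uif dbtumf
print(caesar_shift(ciphertext, -1))  # defend the east wall of the castle
```

Decryption reuses the same function with the negated key, which mirrors the 'offset of -1' described above.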

Cryptanalysis

Cryptanalysis is the art of breaking codes and ciphers. The Caesar cipher is probably the easiest of all ciphers to break. Since the shift has to be a number between 1 and 25 (0 or 26 would leave the plaintext unchanged), we can simply try each possibility and see which one results in a piece of readable text. If you know, or can guess, the plaintext for even a single letter of the ciphertext, you can recover the key directly, because every letter is shifted by the same amount.
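Trying every key is simple enough to sketch in Python. The helper names brute_force and key_from_known_pair are hypothetical, chosen only for this illustration, and the sketch assumes the ciphertext was produced by shifting the letters a-z and A-Z only.

```python
def caesar_shift(text, key):
    """Shift ASCII letters by key places; leave everything else unchanged."""
    out = []
    for ch in text:
        if 'a' <= ch <= 'z':
            out.append(chr((ord(ch) - ord('a') + key) % 26 + ord('a')))
        elif 'A' <= ch <= 'Z':
            out.append(chr((ord(ch) - ord('A') + key) % 26 + ord('A')))
        else:
            out.append(ch)
    return ''.join(out)


def brute_force(ciphertext):
    """Return (key, candidate plaintext) for each of the 25 possible keys."""
    return [(key, caesar_shift(ciphertext, -key)) for key in range(1, 26)]


def key_from_known_pair(plain_letter, cipher_letter):
    """Recover the key from one known plaintext/ciphertext letter pair."""
    return (ord(cipher_letter.lower()) - ord(plain_letter.lower())) % 26


for key, candidate in brute_force("efgfoe uif fbtu xbmm pg uif dbtumf"):
    print(f"{key:2d}: {candidate}")

print(key_from_known_pair('d', 'e'))  # 1
```

Only one of the 25 printed lines reads as English, and a human can pick it out at a glance.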
A more systematic approach is to calculate the frequency distribution of the letters in the ciphertext. This consists of counting how many times each letter appears. Natural English text has a very distinct distribution that can be used to help crack codes. This distribution is as follows:

Figure: English Letter Frequencies

This means that the letter 'e' is the most common, and appears almost 13% of the time, whereas 'z' appears far less than 1 percent of the time. Application of the Caesar cipher does not change these letter frequencies; it merely shifts them along a bit (for a shift of 1, the most frequent ciphertext letter will usually be 'f'). A cryptanalyst just has to find the shift that causes the ciphertext frequencies to match up closely with the natural English frequencies, then decrypt the text using that shift.
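The crudest version of this idea is to count the ciphertext letters and assume that the most frequent one stands for 'e'. The Python sketch below does exactly that. It is a shortcut rather than a full comparison of the two distributions, it can easily guess wrong on short or unusual texts, and the function names are hypothetical ones chosen for this illustration.

```python
from collections import Counter


def letter_counts(text):
    """Count how often each letter a-z occurs, ignoring case and non-letters."""
    return Counter(ch for ch in text.lower() if 'a' <= ch <= 'z')


def guess_key_from_peak(ciphertext):
    """Guess the key by assuming the most frequent ciphertext letter is 'e'."""
    counts = letter_counts(ciphertext)
    if not counts:
        raise ValueError("ciphertext contains no letters")
    most_common_letter, _ = counts.most_common(1)[0]
    return (ord(most_common_letter) - ord('e')) % 26


ciphertext = "efgfoe uif fbtu xbmm pg uif dbtumf"
print(letter_counts(ciphertext).most_common(3))  # the three most frequent letters
print(guess_key_from_peak(ciphertext))           # 1
```

For the example ciphertext the most frequent letter is 'f', which sits one place after 'e', so the guessed key is 1.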

If you are still having trouble, try the cryptanalysis section of the substitution cipher page. All strategies that work with the substitution cipher will also work with the Caesar cipher.

For a method that works well on computers, we need a way of figuring out which of the 25 possible decryptions looks the most like English text. This is explained in the Classical Cryptanalysis section 'Text Characterisation'. The key that results in a decryption with the highest likelihood of being English text is most probably the correct key. Of course, the more ciphertext you have, the more likely this is to be true (this is the case for all statistical measures, including the frequency approach above). So the method used is to take the ciphertext, try decrypting it with each key, then see which decryption looks the best. This simplistic method of cryptanalysis only works on very simple ciphers such as the Caesar cipher and the rail fence cipher; even slightly more complex ciphers can have far too many keys to check all of them.
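The try-every-key-and-score procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: it scores each candidate by its log-likelihood under single-letter English frequencies, which is a much simpler measure than the Markov model used by the code mentioned below. The frequency table holds commonly quoted approximate percentages and is not taken from the figure on this page, and the names ENGLISH_FREQ, english_log_likelihood and crack_caesar are hypothetical.

```python
import math

# Approximate English letter frequencies in percent (commonly quoted values,
# assumed here for illustration).
ENGLISH_FREQ = {
    'a': 8.167, 'b': 1.492, 'c': 2.782, 'd': 4.253, 'e': 12.702, 'f': 2.228,
    'g': 2.015, 'h': 6.094, 'i': 6.966, 'j': 0.153, 'k': 0.772, 'l': 4.025,
    'm': 2.406, 'n': 6.749, 'o': 7.507, 'p': 1.929, 'q': 0.095, 'r': 5.987,
    's': 6.327, 't': 9.056, 'u': 2.758, 'v': 0.978, 'w': 2.360, 'x': 0.150,
    'y': 1.974, 'z': 0.074,
}


def caesar_shift(text, key):
    """Shift ASCII letters by key places; leave everything else unchanged."""
    out = []
    for ch in text:
        if 'a' <= ch <= 'z':
            out.append(chr((ord(ch) - ord('a') + key) % 26 + ord('a')))
        elif 'A' <= ch <= 'Z':
            out.append(chr((ord(ch) - ord('A') + key) % 26 + ord('A')))
        else:
            out.append(ch)
    return ''.join(out)


def english_log_likelihood(text):
    """Sum of log letter probabilities; higher means more English-like."""
    return sum(math.log(ENGLISH_FREQ[ch] / 100.0)
               for ch in text.lower() if ch in ENGLISH_FREQ)


def crack_caesar(ciphertext):
    """Try keys 1..25 and return (key, plaintext) for the best-scoring one."""
    best_key = max(
        range(1, 26),
        key=lambda k: english_log_likelihood(caesar_shift(ciphertext, -k)),
    )
    return best_key, caesar_shift(ciphertext, -best_key)


print(crack_caesar("efgfoe uif fbtu xbmm pg uif dbtumf"))
# (1, 'defend the east wall of the castle')
```

Wrong keys tend to produce rare letters such as 'j', 'q', 'x' and 'z', which pull the score down sharply. A score based on longer letter groups would separate the candidates more reliably on short texts.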

Code

I have included here some C code that performs encryption and decryption with the Caesar cipher. It is only meant to show the working of the algorithm, not to be a final polished solution.
caesar_encrypt_decrypt.c
There is also some code that utilises a Markov model to automatically solve a Caesar cipher. Note that very short ciphertexts may not be cryptanalysed properly. 20 to 30 or more characters of ciphertext should ensure correct operation.
caesar_crack.c

References

Wikipedia has a good description of the encryption/decryption process, history and cryptanalysis of this algorithm.

Simon Singh's 'The Code Book' is an excellent introduction to ciphers and codes, and includes a section on Caesar ciphers.
Singh, Simon (2000). The Code Book: The Science of Secrecy from Ancient Egypt to Quantum Cryptography. ISBN 0-385-49532-3.

Copyright James Lyons - 2007 - No reproduction without permission