Vigenere Cipher - Description and Cryptanalysis

Cryptography Home Classical Cryptography Classical Cryptanalysis Modern Cryptography Modern Cryptanalysis Latest Updates Links

Back to: [Classical Cryptography and list of ciphers] [Classical Cryptanalysis] [Cryptography Home]

On this page: [Introduction] [The Algorithm] [Variants] [Javascript Example] [Cryptanalysis] [Code] [References]

Introduction

The Vigenere Cipher is a polyalphabetic substitution cipher. The method was originally described by Giovan Battista Bellaso in his 1553 book La cifra del. Sig. Giovan Battista Bellaso; however, the scheme was later misattributed to Blaise de Vigen�e in the 19th century, and is now widely known as the "Vigen�e cipher".

Blaise de Vigen�e actually invented the stronger Autokey cipher in 1586 (the Autokey cipher is described in the 'Variants' section).

The Vigenere Cipher was considered 'le chiffre ind�hiffrable' (French for 'the unbreakable cipher') for 300 years, until in 1863 Friedrich Kasiski published a successful attack on the Vigen�e cipher. Charles Babbage had, however, already developed the same test in 1854. Gilbert Vernam worked on the vigenere cipher in the early 1900s, and his work eventually led to the one-time pad, which is a provably unbreakable cipher.

The Algorithm

The 'key' for a vigenere cipher is a key word. e.g. 'FORTIFICATION'
The Vigenere Cipher uses the following tableau (the 'tabula recta') to encipher the plaintext:
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    ---------------------------------------------------
A   A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
B   B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
C   C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
D   D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
E   E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
F   F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
G   G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
H   H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
I   I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
J   J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
K   K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
L   L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
M   M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
N   N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
O   O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
P   P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
Q   Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
R   R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
S   S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
T   T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
U   U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
V   V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
W   W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
X   X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
Y   Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
Z   Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
To encipher a message, repeat the keyword above the plaintext:
FORTIFICATIONFORTIFICATIONFO
DEFENDTHEEASTWALLOFTHECASTLE
Now we take the letter we will be encoding, 'D', and find it on the first column on the tableau. Then, we move along the 'D' row of the tableau until we come to the column with the 'F' at the top (The 'F' is the keyword letter for the first 'D'), the intersection is our ciphertext character, 'I'.

So, the ciphertext for the above plaintext is:
FORTIFICATIONFORTIFICATIONFO
DEFENDTHEEASTWALLOFTHECASTLE
ISWXVIBJEXIGGBOCEWKBJEVIGGQS

Variants

There are several ciphers that are very similar to the vigenere cipher.

The Gronsfeld cipher is exactly the same as the vigenere cipher, except numbers are used as the key instead of letters. There is no other difference. The numbers may be picked from a sequence, e.g. the Fibonacci series, or some other pseudo-random sequence.

The Autokey cipher is exactly the same as the vigenere cipher, except the key is used a little differently. With the vigenere cipher, a short key is simply repeated over and over e.g.
FORTIFICATIONFORTIFICATIONFO
DEFENDTHEEASTWALLOFTHECASTLE
The Autokey cipher still requires a key, however it is only used once, after which the message becomes the key.
FORTIFICATIONDEFENDTHEEASTWA
DEFENDTHEEASTWALLOFTHECASTLE
The gronsfeld cipher is cryptanalysed in the same way as the vigenere algorithm, however the autokey cipher will not be broken using the kasiski method since the key does not repeat. The best way to break the autokey cipher is to try and guess portions of the plaintext or key from the ciphertext, knowing they must both follow the frequency distribution of english text. Guessing how the plaintext begins is the easiest way of cracking the cipher.

Javascript Example


Plaintext

keyword =

Ciphertext

Cryptanalysis

Vigenere-like substitution ciphers were regarded by many as practically unbreakable for 300 years. In 1863, a Prussian major named Kasiski proposed a method for breaking a Vigenere cipher that consisted of finding the length of the keyword and then dividing the message into that many simple substitution cryptograms. Frequency analysis could then be used to solve the resulting simple substitutions. Kasiski's technique for finding the length of the keyword was based on measuring the distance between repeated bigrams in the ciphertext. Note that in the above cryptogram the plaintext bigram 'TO' occurs twice in the message at position 0 and 9 and in both cases it lines up perfectly with the first two letters of the keyword. Because of this it produces the same ciphertext bigram, 'KS.' The same can be said of plaintext 'BE' which occurs twice starting at positions 2 and 11, and also is encrypted with the same ciphertext bigram, 'ME.' In fact, any message encrypted with a Vigenere cipher will produce many such repeated bigrams. Although not every repeated bigram will be the result of the encryption of the same plaintext bigram, many will, and this provides the basis for breaking the cipher. By measuring and factoring the distances between recurring bigrams (in this case the distance is 9) Kasiski was able to guess the length of the keyword. For this example,
	Location	01234 56789 01234 56789 01234 56789
	Keyword:	RELAT IONSR ELATI ONSRE LATIO NSREL
	Plaintext: 	TOBEO RNOTT OBETH ATIST HEQUE STION
	Ciphertext:	KSMEH ZBBLK SMEMP OGAJX SEJCS FLZSY
the Kasiski method would create something like the following list: Repeated Bigram Location Distance Factors
KS	9	9	3, 9
SM	10	9	3, 9
ME	11	9	3, 9
...
Factoring the distances between repeated bigrams is a way of identifying possible keyword lengths, with those factors that occur most frequently being the best candidates for the length of the keyword. Note that in this example since 3 is also a factor of 9 (and any of its multiples) both 3 and 9 would be reasonable candidates for keyword length. Although in this example we don't have a clear favorite, we've narrowed down the possibilities to a very small list. Note also that if a longer ciphertext were encrypted with the same keyword ('RELATIONS'), we would expect to find repeated bigrams at multiples of 9 -- i.e., 18, 27, 81, etc. These would also have 3 as a factor. Kasiski's important contribution is to note this phenomenon of repeated bigrams and propose a method -- factoring of distances -- to analyze it. Once the length of the keyword is known, the ciphertext can be broken up into that many simple substitution cryptograms. That is, for a keyword of length 9, every 9-th letter in the ciphertext was encrypted with the same keyword letter. Given the structure of the Vigenere tableau, this is equivalent to using 9 distinct simple substitution ciphers, each of which was derived from 1 of the 26 possible Caesar shifts given in the tableau. The pure Kasiski method proceeds by analyzing these simple substitution cryptograms using frequency analysis and the other standard techniques. A variant of this method, proposed by the French cryptographer Kerckhoff, is based on discovering the keyword itself and then using it to decipher the cryptogram. In Kerckhoff's method, after the message has been separated into several columns, corresponding to the simple substitution cryptograms, one tallies the frequencies in each column and then uses frequency and logical analysis to construct the key. For example, suppose the most frequent letter in the first column is 'K'. We would hypothesize that 'K' corresponds to the English 'E'. If we consult the Vigenere tableau at this point, we can see that if English 'E' were enciphered into 'K' then row G of the table must have been the alphabet used for the first letter of the keyword. This implies that the first letter of the keyword is 'G'. The problem with this "manual" approach is that for short messages there are often several good candidates for English 'E' in each column. This requires the testing of multiple hypotheses, which can get quite tedious and involved. Therefore we need a more sensitive test to discover the alphabet used by each letter of the keyword. Recalling that each row of the Vigenere tableau is one of the 26 Caesar shifts, we can use the chi-square test to determine which of the 26 possible shifts was used for each letter of the keyword. This modern day version of the Kerckhoff method turns out to be very effective.

Code

I have included here some C code that does encryption and decryption of the Vigenere cipher. It is only meant to show the working of the algorithm, not be a final polished solution.
Vigenere_implementation.c

References

[1] Wikipedia has a good description of the encryption/decryption process, history and cryptanalysis of this algorithm

[2] Kahn, D (1973) The CodeBreakers. Macmillan: New York
[3] http://www.trincoll.edu/depts/cpsc/cryptography/vigenere.html has a good description of the algorithm along with cryptanalysis.
Simon Singh's 'The Code Book' is an excellent introduction to ciphers and codes, and includes a section on ADFGVX ciphers.
Singh, Simon (2000). The Code Book: The Science of Secrecy from Ancient Egypt to Quantum Cryptography. ISBN 0-385-49532-3. 

Choose your way out: [Classical Cryptography and list of ciphers] [Classical Cryptanalysis] [Cryptography Home]
[Modern Cryptography] [Modern Cryptanalysis]

Copyright James Lyons - 2007 - No reproduction without permission
dsplabslinuxkamilcryptojames