Sunday, January 10, 2016

Coding the Vigenère Square Cipher

Since I was a young boy, one of my favorite things to play around with has always been ciphers. Although they have largely fallen out of use I still enjoy studying different ciphers and trying to break them down into python algorithms. Today  I will share one such project. The Advanced Vigenère Square Cipher.

The  Vigenère square cipher is an advanced substitution cipher.
With the originial square, there are 26 different cipher alphabets that are used
to encrypt text. Each cipher alphabet is just another rightword Caesar shift
of the original alphabet. I added a 27th alphabet to encode space chars as well
This is what the modified Vigenère square looks like:

               A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
               B C D E F G H I J K L M N O P Q R S T U V W X Y Z   A
               C D E F G H I J K L M N O P Q R S T U V W X Y Z   A B
               D E F G H I J K L M N O P Q R S T U V W X Y Z   A B C
               E F G H I J K L M N O P Q R S T U V W X Y Z   A B C D
               F G H I J K L M N O P Q R S T U V W X Y Z   A B C D E
               G H I J K L M N O P Q R S T U V W X Y Z   A B C D E F
               H I J K L M N O P Q R S T U V W X Y Z   A B C D E F G
               I J K L M N O P Q R S T U V W X Y Z   A B C D E F G H
               J K L M N O P Q R S T U V W X Y Z   A B C D E F G H I
               K L M N O P Q R S T U V W X Y Z   A B C D E F G H I J
               L M N O P Q R S T U V W X Y Z   A B C D E F G H I J K
               M N O P Q R S T U V W X Y Z   A B C D E F G H I J K L
               N O P Q R S T U V W X Y Z   A B C D E F G H I J K L M
               O P Q R S T U V W X Y Z   A B C D E F G H I J K L M N
               P Q R S T U V W X Y Z   A B C D E F G H I J K L M N O
               Q R S T U V W X Y Z   A B C D E F G H I J K L M N O P
               R S T U V W X Y Z   A B C D E F G H I J K L M N O P Q
               S T U V W X Y Z   A B C D E F G H I J K L M N O P Q R
               T U V W X Y Z   A B C D E F G H I J K L M N O P Q R S
               U V W X Y Z   A B C D E F G H I J K L M N O P Q R S T
               V W X Y Z   A B C D E F G H I J K L M N O P Q R S T U
               W X Y Z   A B C D E F G H I J K L M N O P Q R S T U V
               X Y Z   A B C D E F G H I J K L M N O P Q R S T U V W
               Y Z   A B C D E F G H I J K L M N O P Q R S T U V W X
               Z   A B C D E F G H I J K L M N O P Q R S T U V W X Y
                 Z A B C D E F G H I J K L M N O P Q R S T U V W X Y

To use this Vigenère square to encrypt a message, you first choose a keyword
and then repeat it until it is the same length as the message you wish to
encode. You then would write the message underneath the repeated keyword to
see which cipher alphabet you would use for each letter of the message.
The first letter of the message would be encoded using the cipher alphabet
that corresponds with the first letters of the keyword. The cipher alphabet
that uses B for A and C for B etc. would be cipher alphabet 'B'. Each cipher
alphabet is named by the first letter in it. For example if you have a
keyword of WORD and the message you want to encode is I LOVE CRYPTOGRAPHY,
this is what you would do:

message: I LOVE CRYPTOGRAPHY
keyword: WORDWORDWORDWORDWOR
encoded text: DNBRQSQFMLFWJUHDKVO

Vigenère can also be viewed algebraically. If the letters A–Z are taken to be the numbers 0–25,
and addition is performed modulo 26 (or {length of alphabet} if using a modified version), then Vigenère encryption E using the key K can be written,
       C_i = E_K(M_i) = (M_i+K_i) % {length of alphabet}
and decryption D using the key K,
       M_i = D_K(C_i) = (C_i-K_i) % {length of alphabet},
where M = M_1 ... M_n is the message, C = C_1 ... C_n is the ciphertext and K = K_1 ... K_n is the key obtained
by repeating the keyword ceil(n / m) times, where m is the keyword length.


That is sufficient to begin coding te encoding and decoding functions.
First, to encode a message we need to generate a key string the same length as the message to be encoded





where PASSWORD is a string treated like a character list

It is worth noting that any 'A' in the keyword will leave the corresponding
letter unshifted.

Next, we use this to figure out the row and column offsets M_i and K_i
Finally, included in that block of code, we take the offset (M_i+K_i) % len(ALPHABET)
and use that as the key to look up the encoded letter.  This version of the cipher is limited to alphabetic characters only. Adding a few more characters to the alphabet allows for a larger key space as well as more complex messages. In addition to the standard A-Z + " " I added ".", ",", ":", "?", "!","'","\"",";","-","_","&","*","$","%","#", and 0-9

As an example lets take the message below:
Go to 21 st. james street and wait for the payphone to ring once. Then call 555-234-6789 and wait for agent #47 to answer. The challenge is "Is it sunny in philly?" and the response is "only when the orchestra is playing". Await further instructions from your contact. Do not mess this up agent!
I choose the password "CONNERY WAS BEST". After encoding with the new square we are left with
I,$'S1Q3%!'!1"C R!?-1TSR'?R&F%-NM_8H,?$XY,,:N&TY*PS$'S1#K.T$S? G#$'LV&,QNYP1T7J6F7LH8LLM?R&F%-NM_8H,?$EX,P"$C8O8V,$NR-4G!%$XY,,QUNP,,PUR$M-8-W!$M_8U;  ,1'P%,UM,- 24$E?.,"UR?;,U:. WV8K'$4S?- %-UI?8VVR$S; JS!'VR8K'$,PR6K.T4!1YYOV'?W2T"URV1'P''?YT1K, !?W#Q $&S&#,Q. XR V#$QS1&Q"$ZI-0,"UVW12R%NTI?1"
To see how we got there lets take the First Character of the Message "G" and encode it. The corresponding Key character is "C".
so:
M_i = 6 # index of letter G in 0 based alphabet
K_i = 2 # index of letter C in 0 based alphabet
len(ALPHABET) = 52 # Length of the alphabet for modulous
and the formula to figure out the new character is
E_k = (M_i + K_i) % len(ALPHABET)
E_i = ALPHABET[E_k]
Therefore:
E_k = (6 + 2) % 52 =  8
E_i = ALPHABET[8] = "I" # eighth letter of the alphabet
There you have it. The first letter indeed returns the 8 letter of the alphabet, "I"
 
Deciphering the message is simply the opposite. Create a key string from the password the same length as the cipher text, then work out
D_k = (C_i - K_i) % len(ALPHABET)
D_i = ALPHABET[D_k]
So given our example character of "I"
C_i = 8 # index of letter I in 0 based alphabet
K_i = 2 # index of letter C in 0 based alphabet
len(ALPHABET) = 52 # Length of the alphabet for modulous
Finally reversing te steps gets us:
D_k = (8 - 2) % 52 =  6
D_i = ALPHABET[6] = "G" # sixth letter of the alphabet is G

No comments:

Post a Comment