Cracking the Trithemius Cipher

# Cracking the Trithemius Cipher

Since the Trithemius cipher has $26$ unique choices for an offset, 2 unique choices for a direction, and 26 unique choices for a step value, there are only:

$$26 \cdot 2 \cdot 26 = 1,352$$

possible unique key values for this cipher. To try them all by hand would certainly be tedious, but using a computer will make quick work of it. Using the chi-squared method results in a straightforward way to determine a fitness score for each attempted decipher, and then select the configuration values that result in the best score.

Suppose an intercepted ciphertext was suspected to be enciphered with the Trithemius Cipher:

WEXRU EXRZY QDMFS CMFAB KLURE KVEBN BKLPO KODWR TBTOT AXNOK DRRFU DJKZH XVNHI XNBBB PAXDJ XROFZ RSORF TCHEK ZTVPF SBMIP ZONMU UNSSV PDEAR MXFHU IKRLI NHWBC KUNLM BTBZM WGZYB WJZPP TPADM YYBWJ ZZHAZ BUDYS YXXTL FHZMD KVSHA SVJPW ZZHSM VZDQP RLQCH QBCVF BJFJX SSHUC GPNKQ FQHQP HUSCV ODWEB GDQHE DVLGH XPCEP XLOHF VNBYI BBGDH QCIFJ JSEQA TYAKD ARACF TQBRD ODCJN FXKCF KSDPQ EGZTQ YGIBI ELGZX IBBXU EXAZU JSPJF XWFPB AVJXQ ORVWX UETIH LUDNK ENYDE XSTUI QJEGX KUEXW FNHVD EQXBQ OHSFJ LMBRX RSELY BZTHM OREHO BMTCP ZHDAW RLYBK WBLRL MBYVV YIFGZ ZMWGJ JRYFM IBSUH GDARP NIQXB CELUR JWGXP ZLPWZ AKQJL TQYIB FTOPX LJWNW FSQMW LALRI VPNXT XKXWF LLHFO RQFPX TVARB JEIND UEXHP UVDJW FFNGQ GTDZD MYDBC NPKMW PZXKG VAVZS JNGZM WGZSR NXTEH JWKHQ DETCM FFKLL FWGMF HPMUE XAZUJ TIDBF SHOTH DVUBD IPUNO DPXEO PNIFG XSPRL XYZLR OVALJ SLNCO AKDYL FCDHF EISVU MNFSC MFPMG LNJKD ETFTP AUXYL VDZDR MYPJT ZPAKD NKVUQ OBLHX VUDJG CAJTP BKPAK DYZZA TBOHU WVQCJ EJJXM FDTEO HAJLE MTOKH IPVIZ YZFCF OQHGR HQ

Storing it to the variable ciphertext:

ciphertext = 'WEXRU EXRZY QDMFS CMFAB KLURE KVEBN BKLPO KODWR TBTOT AXNOK DRRFU DJKZH XVNHI XNBBB PAXDJ XROFZ RSORF TCHEK ZTVPF SBMIP ZONMU UNSSV PDEAR MXFHU IKRLI NHWBC KUNLM BTBZM WGZYB WJZPP TPADM YYBWJ ZZHAZ BUDYS YXXTL FHZMD KVSHA SVJPW ZZHSM VZDQP RLQCH QBCVF BJFJX SSHUC GPNKQ FQHQP HUSCV ODWEB GDQHE DVLGH XPCEP XLOHF VNBYI BBGDH QCIFJ JSEQA TYAKD ARACF TQBRD ODCJN FXKCF KSDPQ EGZTQ YGIBI ELGZX IBBXU EXAZU JSPJF XWFPB AVJXQ ORVWX UETIH LUDNK ENYDE XSTUI QJEGX KUEXW FNHVD EQXBQ OHSFJ LMBRX RSELY BZTHM OREHO BMTCP ZHDAW RLYBK WBLRL MBYVV YIFGZ ZMWGJ JRYFM IBSUH GDARP NIQXB CELUR JWGXP ZLPWZ AKQJL TQYIB FTOPX LJWNW FSQMW LALRI VPNXT XKXWF LLHFO RQFPX TVARB JEIND UEXHP UVDJW FFNGQ GTDZD MYDBC NPKMW PZXKG VAVZS JNGZM WGZSR NXTEH JWKHQ DETCM FFKLL FWGMF HPMUE XAZUJ TIDBF SHOTH DVUBD IPUNO DPXEO PNIFG XSPRL XYZLR OVALJ SLNCO AKDYL FCDHF EISVU MNFSC MFPMG LNJKD ETFTP AUXYL VDZDR MYPJT ZPAKD NKVUQ OBLHX VUDJG CAJTP BKPAK DYZZA TBOHU WVQCJ EJJXM FDTEO HAJLE MTOKH IPVIZ YZFCF OQHGR HQ'

The following code would score all $1,352$ possible decipherments of the message, and append a 4 element list containing the values of testOffset, testDirection, and testStep to the list theScores creating $1,352$ sub-lists inside this main list.

theScores = []

for testOffset in range(0, 26):
for testDirection in [-1, 1]:
for testStep in range(0,26):
theScores.append( [testOffset, testDirection, testStep, chiSquaredScore( trithemiusDecipher( ciphertext, testOffset, testDirection, testStep ) )] )

Showing just the first 5 elements in the list:

print(theScores[0:5])
[[0, -1, 0, 5832.258106258903], [0, -1, 1, 5065.595988083758], [0, -1, 2, 5691.3602750603095], [0, -1, 3, 3845.910675038989], [0, -1, 4, 9245.868577435736]]

Formatted nicely:

offset: 0	 direction: -1	 step: 0	 chi-squared: 5832.2581
offset: 0	 direction: -1	 step: 1	 chi-squared: 5065.596
offset: 0	 direction: -1	 step: 2	 chi-squared: 5691.3603
offset: 0	 direction: -1	 step: 3	 chi-squared: 3845.9107
offset: 0	 direction: -1	 step: 4	 chi-squared: 9245.8686

Sorting by the chi-squared score and showing only the top 20:

offset: 17	 direction: -1	 step: 4	 chi-squared: 38.8954
offset: 17	 direction: 1	 step: 22	 chi-squared: 38.8954
offset: 17	 direction: -1	 step: 17	 chi-squared: 948.5753
offset: 17	 direction: 1	 step: 9	 chi-squared: 948.5753
offset: 4	 direction: -1	 step: 17	 chi-squared: 1088.3624
offset: 4	 direction: 1	 step: 9	 chi-squared: 1088.3624
offset: 3	 direction: -1	 step: 17	 chi-squared: 2301.1405
offset: 3	 direction: 1	 step: 9	 chi-squared: 2301.1405
offset: 16	 direction: -1	 step: 17	 chi-squared: 2538.9579
offset: 16	 direction: 1	 step: 9	 chi-squared: 2538.9579
offset: 3	 direction: -1	 step: 2	 chi-squared: 3108.4915
offset: 3	 direction: 1	 step: 24	 chi-squared: 3108.4915
offset: 11	 direction: -1	 step: 10	 chi-squared: 3123.7784
offset: 11	 direction: 1	 step: 16	 chi-squared: 3123.7784
offset: 19	 direction: -1	 step: 2	 chi-squared: 3127.5838
offset: 19	 direction: 1	 step: 24	 chi-squared: 3127.5838
offset: 21	 direction: -1	 step: 10	 chi-squared: 3155.6881
offset: 21	 direction: 1	 step: 16	 chi-squared: 3155.6881
offset: 3	 direction: -1	 step: 8	 chi-squared: 3185.8115
offset: 3	 direction: 1	 step: 18	 chi-squared: 3185.8115

We can see that there are two choices that are clearly the most likely as they have the same chi-squared score which is over 900 points lower than the next lowest score. It does seem odd that the scores appear to come in pairs of identical values. More on that in a bit!

Printing out the most likely settings that would yield the plaintext:

print( trithemiusDecipher( ciphertext, 17, -1, 4) )

Which we can see worked out perfectly.

## Multiple Keys to the Same Ciphertext

For each ciphertext created by the Trithemius cipher, there appears to be another configuration that results in the same chi-squared score. Does this mean that it results in the same plaintext? It would seem too coincidental if it didn't. Let's check the second:

print( trithemiusDecipher( ciphertext, 17, 1, 22) )
This result would seem to indicate that there are in fact not $1,352$ unique keys to the Trithemius cipher, as half the keys in that count result in the same ciphertext as the other half. Our revised count is now $676$. Unfortunately, we don't yet know how to easily identify which keys are duplicates, so for now we're stuck checking all $1,352$ possible key configurations to determine which one(s) will yield the correct plaintext.