Mercurial > public > pelican-blog
comparison content/Coding/019-enigma-challenge.rst @ 4:7ce6393e6d30
Adding converted blog posts from old blog.
author | Brian Neal <bgneal@gmail.com> |
---|---|
date | Thu, 30 Jan 2014 21:45:03 -0600 |
parents | |
children | 6e0d4799796d |
comparison
equal
deleted
inserted
replaced
3:c3115da3ff73 | 4:7ce6393e6d30 |
---|---|
1 Completing the Enigma Challenge | |
2 ############################### | |
3 | |
4 :date: 2012-07-22 13:30 | |
5 :tags: Enigma, Py-Enigma, Cpp-Enigma, C++ | |
6 :slug: completing-the-enigma-challenge | |
7 :author: Brian Neal | |
8 | |
9 Since my last `blog post | |
10 <http://deathofagremmie.com/2012/06/06/introducing-py-enigma/>`_, I have spent | |
11 almost all my free time working on `Dirk Rijmenants' Enigma Challenge | |
12 <http://users.telenet.be/d.rijmenants/en/challenge.htm>`_. I'm very happy to | |
13 report that I cracked the last message on the evening of July 10. I thought it | |
14 might be interesting to summarize my experiences working on this challenge in a | |
15 blog post. | |
16 | |
17 The first nine | |
18 -------------- | |
19 | |
20 I had a lot of fun building Py-Enigma_, an Enigma machine simulator library | |
21 written in Python_ based on the information I found on Dirk's `Technical Details | |
22 of the Enigma Machine`_ page. After I had built it and played with it for a | |
23 while, I discovered Dirk also had an `Enigma Challenge`_. I decided this would | |
24 be a good test for Py-Enigma_, and it may give me some ideas on how to improve | |
25 the library for breaking messages. | |
26 | |
27 The first nine messages were pretty easy to solve using Py-Enigma_. Dirk gives | |
28 you most of the key information for each problem, and one can then write a small | |
29 program to brute force the missing information. I was able to solve the first | |
30 nine messages in a weekend. This was very enjoyable, as it did require some | |
31 thinking about how to simulate the problem and then executing it. I also learned | |
32 about `Python's itertools <http://docs.python.org/library/itertools.html>`_ | |
33 library, which contains some very handy functions for generating permutations | |
34 and combinations for brute forcing a solution. | |
35 | |
36 Dirk did a great job on the messages. They referenced actual events from World | |
37 War II and I ended up spending a fair amount of time on Wikipedia reading about | |
38 them. Translating the messages from German to English was a bit difficult, but I | |
39 relied heavily on `Google Translate <http://translate.google.com>`_. | |
40 | |
41 Message 10 | |
42 ---------- | |
43 | |
44 But then I came to the last message. Here, Dirk doesn't give you any key | |
45 information, just ciphertext. Gulp. If you include 2 possible reflectors, the | |
46 Enigma machine's key space for message 10 is somewhere around 2 x 10\ | |
47 :sup:`23` keys, so it is impossible to brute force every possible key, unless | |
48 you can afford to wait around for the heat death of the universe. | |
49 | |
50 I then did some research, and learned about a technique for brute-forcing only | |
51 the rotor and message settings, and then performing a heuristic technique called | |
52 "hill-climbing" on the plugboard settings. I am deeply indepted to Frode Weierud | |
53 and Geoff Sullivan, who have published high level descriptions of this | |
54 technique. See Frode's `The Enigma Collection | |
55 <http://cryptocellar.org/Enigma/>`_ page and Geoff's `Crypto Barn | |
56 <http://www.hut-six.co.uk/>`_ for more information. | |
57 | |
58 Python is too slow, or is it? | |
59 ----------------------------- | |
60 | |
61 After a week or so of studying and tinkering, I came up with a hill-climbing | |
62 algorithm in Python. I then started an attempt to break message 10, but after | |
63 running the program for a little over 24 hours, I realized it was too slow. I | |
64 did a "back of the envelope" calculation and determined I would need several | |
65 hundred years to hill-climb every rotor and message setting. This was a bit | |
66 disappointing, but I knew it was a possibility going in. | |
67 | |
68 I then set out, somewhat reluctantly, to port Py-Enigma to C++. These days I | |
69 don't do any C++ at home, only at work. However I got much happier when I | |
70 decided this was an opportunity to try out some new C++11 features (the embedded | |
71 compiler we use at work has no support for C++11). Okay, things are fun again! I | |
72 have to say that C++11 is really quite enjoyable, and I made good use of the new | |
73 ``auto`` style variable declarations, range-based for-loops, and brace-style | |
74 initialization for containers like ``vector`` and ``map``. | |
75 | |
76 C++ is too slow, or is it? | |
77 -------------------------- | |
78 | |
79 It took a fair amount of time to re-tool my code to C++11. I ensured all my unit | |
80 tests originally written in Python could pass in C++, and that I could solve | |
81 some of the previous messages. I then re-implemented my hill-climbing | |
82 algorithm in C++ and made another attempt at cracking message 10. | |
83 | |
84 My C++ solution was indeed faster, but only by a factor of 8 or 9. I calculated | |
85 I would need about 35 years to hill-climb all rotor and message settings, and I | |
86 didn't really want to see if I could out-live my program. My morale was very low | |
87 at this point. I wasn't sure if I could ever solve this self-imposed problem. | |
88 | |
89 Algorithms & Data Structures are key | |
90 ------------------------------------ | |
91 | |
92 I had wanted to test myself with this problem and had avoided looking at anyone | |
93 else's source code. However, at this juncture, I needed some help. I turned to | |
94 Stephan Krah's `M4 Message Breaking Project | |
95 <http://www.bytereef.org/m4_project.html>`_. Stephan had started a distributed | |
96 computing attack on several 4-rotor naval Enigma machine messages. He had | |
97 graciously made his source code available. Perhaps I could get some hints by | |
98 looking at his code. | |
99 | |
100 Indeed, Stephen's code provided the break I needed. I discovered a very cool | |
101 trick that Stephen was doing right before he started his hill-climb. He | |
102 pre-computed every path through the machine for every letter of the alphabet. | |
103 Since hill-climbing involves running the message through the machine roughly | |
104 600 to 1000 times, this proved to be a huge time saver. After borrowing this | |
105 idea, my hill-climbing times for one fixed rotor & message setting went down | |
106 from 250 milliseconds to 2! | |
107 | |
108 I immediately stopped looking at Stephen's code at this point; I didn't bother | |
109 to compare our hill-climbing algorithms. I believe both of us are influenced by | |
110 Weierud & Sullivan here. I was now convinced I had the speed-up I needed to make | |
111 a serious attempt at message 10 again. | |
112 | |
113 This showed me how critically important it is to have the right data structure | |
114 and algorithm for a problem. However there were two other lessons I needed to | |
115 learn before I was successful. | |
116 | |
117 Utilize all your computing resources | |
118 ------------------------------------ | |
119 | |
120 Up to this point my cracking program was linear. It started at the first rotor | |
121 and message setting, then hill-climbed and noted the score I got. Then it | |
122 proceeded to the next message setting and hill-climbed, noting the score of that | |
123 attempt, and so on. There were two problems with this approach. | |
124 | |
125 I realized if I could do some sort of multi-processing, I could search the key | |
126 space faster. I thought about forking off multiple processes or maybe using | |
127 threads, but that was over-thinking it. In the end, I added command-line | |
128 arguments to the program to tell it where in the key space to start looking. I | |
129 could then simply run multiple instances of my program. My development laptop, | |
130 which runs Ubuntu, is from 2008, and it has 2 cores in it. My desktop PC (for | |
131 gaming), has 4 cores! I next spent a couple nights installing MinGW_ and getting | |
132 my application to build in that environment. I could now run 6 instances of my | |
133 cracking program. Hopefully this would pay off and shorten the time further. | |
134 | |
135 (I didn't even attempt to commandeer my wife's and kids' computers. That | |
136 wouldn't have gone down so well.) | |
137 | |
138 How do you know when you are done? | |
139 ---------------------------------- | |
140 | |
141 The final problem that I had was determining when my cracking program had in | |
142 fact cracked the message. The hill-climbing attempts produced scores for each | |
143 key setting. Up to this point, based on running my program on messages 1 through | |
144 9, I had noticed that when I got a score divided by the message length of around | |
145 10 I had cracked the message. But my tests on messages 1 through 9 were | |
146 hill-climbing using the known ring settings. My message cracker was using a | |
147 shortcut: I was only searching ring settings on the "fast" rotor to save time. | |
148 This would let me get the first part of a message, but if the middle or | |
149 left-most rotor stepped at the wrong time the rest of the message would be | |
150 unintelligible. | |
151 | |
152 To verify my message cracking program, I ran it on some of the previous | |
153 messages. I was shooting for a score divided by message length of 10. Using this | |
154 heuristic, I was in fact able to crack some of the previous messages, but not | |
155 others. It took me a while to realize that not searching the middle rotor's ring | |
156 setting was causing this. The light bulb came on, and I realized that my | |
157 heuristic of 10 is only valid when I search all the ring settings. Instead I | |
158 should just keep track of the highest scores. When you get close enough, the | |
159 score will be significantly higher than average score. Thus I may not know when | |
160 I am done until I see a large spike in the score. I would then have to write | |
161 another program to refine the solution to search for the middle and left-most | |
162 ring setting. | |
163 | |
164 Success at last | |
165 --------------- | |
166 | |
167 It took a little over a month of evenings and weekends to get to this point. | |
168 I even had a false start where I ran my program on 6 cores for 1.5 days only to | |
169 realize I had a bug in my code and I was only searching ring setting 0 on the | |
170 fast ring. But once I had that worked out, I started a search on 6 cores on a | |
171 Sunday night. I checked the logs off and on Monday, but no significant spike in | |
172 the scores were noted. Before leaving for work on Tuesday I checked again, and | |
173 noticed that one instance of my program had found a score twice as high as any | |
174 other. Could that be it? I really wanted to stay and find out, but I had to go | |
175 to work! Needless to say I was a bit distracted at work that day. | |
176 | |
177 After work, I ran the settings through my Python application and my heart | |
178 pounded as I recognized a few German words at the beginning of the message. In | |
179 fact I had gotten around 75% of the message, then some garbage, but then the | |
180 last few words were recognizable German. I then wrote another application to | |
181 search the middle ring setting space. Using the highest score from that program | |
182 allowed me to finally see the entire message. I cannot tell you how happy and | |
183 relieved I was at the same time! | |
184 | |
185 Final thoughts | |
186 -------------- | |
187 | |
188 For a while I didn't think I was going to be smart enough or just plain up to | |
189 the task of solving message 10. Looking back on it, I see a nice progression of | |
190 trying things, experimenting, setbacks, and a few "a-ha" moments that lead to the | |
191 breakthrough. I am very grateful to Dirk Rijmenants for creating the challenge, | |
192 which was a lot of fun. And I also wish to thank Frode Weierud and Geoff | |
193 Sullivan for their hill-climbing algorithm description. Thanks again to Stephan | |
194 Krah for the key speedup that made my attempt possible. | |
195 | |
196 For now I have decided not to publish my cracking program, since it is a tiny | |
197 bit specific to Dirk's challenge. I don't want to ruin his Enigma Challenge by | |
198 giving anything away. I don't believe my broad descriptions here have revealed | |
199 too much. Other people need to experience the challenge for themselves to fully | |
200 appreciate the thrill and hard work that go into code breaking. | |
201 | |
202 I should also note that there are other ways of solving Dirk's challenge. You | |
203 could, with a bit of patience, solve messages 1 through 9 with one of the many | |
204 Enigma Machine GUI simulators (Dirk has a fine one). I'm not certain how | |
205 you could break message 10 using a GUI simulator and trial and error, however. | |
206 | |
207 It is my understanding that this style of Enigma message breaking was not the | |
208 technique used by the allied code-breakers during the war. I believe they used | |
209 knowledge of certain key words, or "cribs", in messages received in weather | |
210 reports and other regular messages to find the key for the day. I look forward | |
211 to reading more about their efforts now. | |
212 | |
213 Finally, I have published the C++11 port of Py-Enigma_, which I am calling | |
214 (rather unimaginatively) Cpp-Enigma_ on bitbucket. I hope Cpp-Enigma_ and | |
215 Py-Enigma_ are useful for other people who want to explore the fascinating world | |
216 of Enigma codes. | |
217 | |
218 .. _Py-Enigma: https://bitbucket.org/bgneal/enigma | |
219 .. _Python: http://www.python.org | |
220 .. _Technical Details of the Enigma Machine: http://users.telenet.be/d.rijmenants/en/enigmatech.htm | |
221 .. _Enigma Challenge: http://users.telenet.be/d.rijmenants/en/challenge.htm | |
222 .. _Cpp-Enigma: https://bitbucket.org/bgneal/cpp-enigma | |
223 .. _MinGW: http://www.mingw.org/ |