bgneal@4
|
1 Completing the Enigma Challenge
|
bgneal@4
|
2 ###############################
|
bgneal@4
|
3
|
bgneal@4
|
4 :date: 2012-07-22 13:30
|
bgneal@14
|
5 :tags: Enigma, Py-Enigma, Cpp-Enigma, C++, simulator
|
bgneal@4
|
6 :slug: completing-the-enigma-challenge
|
bgneal@4
|
7 :author: Brian Neal
|
bgneal@4
|
8
|
bgneal@4
|
9 Since my last `blog post
|
bgneal@4
|
10 <http://deathofagremmie.com/2012/06/06/introducing-py-enigma/>`_, I have spent
|
bgneal@4
|
11 almost all my free time working on `Dirk Rijmenants' Enigma Challenge
|
bgneal@4
|
12 <http://users.telenet.be/d.rijmenants/en/challenge.htm>`_. I'm very happy to
|
bgneal@4
|
13 report that I cracked the last message on the evening of July 10. I thought it
|
bgneal@4
|
14 might be interesting to summarize my experiences working on this challenge in a
|
bgneal@4
|
15 blog post.
|
bgneal@4
|
16
|
bgneal@4
|
17 The first nine
|
bgneal@4
|
18 --------------
|
bgneal@4
|
19
|
bgneal@4
|
20 I had a lot of fun building Py-Enigma_, an Enigma machine simulator library
|
bgneal@4
|
21 written in Python_ based on the information I found on Dirk's `Technical Details
|
bgneal@4
|
22 of the Enigma Machine`_ page. After I had built it and played with it for a
|
bgneal@4
|
23 while, I discovered Dirk also had an `Enigma Challenge`_. I decided this would
|
bgneal@4
|
24 be a good test for Py-Enigma_, and it may give me some ideas on how to improve
|
bgneal@4
|
25 the library for breaking messages.
|
bgneal@4
|
26
|
bgneal@4
|
27 The first nine messages were pretty easy to solve using Py-Enigma_. Dirk gives
|
bgneal@4
|
28 you most of the key information for each problem, and one can then write a small
|
bgneal@4
|
29 program to brute force the missing information. I was able to solve the first
|
bgneal@4
|
30 nine messages in a weekend. This was very enjoyable, as it did require some
|
bgneal@4
|
31 thinking about how to simulate the problem and then executing it. I also learned
|
bgneal@4
|
32 about `Python's itertools <http://docs.python.org/library/itertools.html>`_
|
bgneal@4
|
33 library, which contains some very handy functions for generating permutations
|
bgneal@4
|
34 and combinations for brute forcing a solution.
|
bgneal@4
|
35
|
bgneal@4
|
36 Dirk did a great job on the messages. They referenced actual events from World
|
bgneal@4
|
37 War II and I ended up spending a fair amount of time on Wikipedia reading about
|
bgneal@4
|
38 them. Translating the messages from German to English was a bit difficult, but I
|
bgneal@4
|
39 relied heavily on `Google Translate <http://translate.google.com>`_.
|
bgneal@4
|
40
|
bgneal@4
|
41 Message 10
|
bgneal@4
|
42 ----------
|
bgneal@4
|
43
|
bgneal@4
|
44 But then I came to the last message. Here, Dirk doesn't give you any key
|
bgneal@4
|
45 information, just ciphertext. Gulp. If you include 2 possible reflectors, the
|
bgneal@4
|
46 Enigma machine's key space for message 10 is somewhere around 2 x 10\
|
bgneal@4
|
47 :sup:`23` keys, so it is impossible to brute force every possible key, unless
|
bgneal@4
|
48 you can afford to wait around for the heat death of the universe.
|
bgneal@4
|
49
|
bgneal@4
|
50 I then did some research, and learned about a technique for brute-forcing only
|
bgneal@4
|
51 the rotor and message settings, and then performing a heuristic technique called
|
bgneal@4
|
52 "hill-climbing" on the plugboard settings. I am deeply indepted to Frode Weierud
|
bgneal@4
|
53 and Geoff Sullivan, who have published high level descriptions of this
|
bgneal@4
|
54 technique. See Frode's `The Enigma Collection
|
bgneal@4
|
55 <http://cryptocellar.org/Enigma/>`_ page and Geoff's `Crypto Barn
|
bgneal@4
|
56 <http://www.hut-six.co.uk/>`_ for more information.
|
bgneal@4
|
57
|
bgneal@4
|
58 Python is too slow, or is it?
|
bgneal@4
|
59 -----------------------------
|
bgneal@4
|
60
|
bgneal@4
|
61 After a week or so of studying and tinkering, I came up with a hill-climbing
|
bgneal@4
|
62 algorithm in Python. I then started an attempt to break message 10, but after
|
bgneal@4
|
63 running the program for a little over 24 hours, I realized it was too slow. I
|
bgneal@4
|
64 did a "back of the envelope" calculation and determined I would need several
|
bgneal@4
|
65 hundred years to hill-climb every rotor and message setting. This was a bit
|
bgneal@4
|
66 disappointing, but I knew it was a possibility going in.
|
bgneal@4
|
67
|
bgneal@4
|
68 I then set out, somewhat reluctantly, to port Py-Enigma to C++. These days I
|
bgneal@4
|
69 don't do any C++ at home, only at work. However I got much happier when I
|
bgneal@4
|
70 decided this was an opportunity to try out some new C++11 features (the embedded
|
bgneal@4
|
71 compiler we use at work has no support for C++11). Okay, things are fun again! I
|
bgneal@4
|
72 have to say that C++11 is really quite enjoyable, and I made good use of the new
|
bgneal@4
|
73 ``auto`` style variable declarations, range-based for-loops, and brace-style
|
bgneal@4
|
74 initialization for containers like ``vector`` and ``map``.
|
bgneal@4
|
75
|
bgneal@4
|
76 C++ is too slow, or is it?
|
bgneal@4
|
77 --------------------------
|
bgneal@4
|
78
|
bgneal@4
|
79 It took a fair amount of time to re-tool my code to C++11. I ensured all my unit
|
bgneal@4
|
80 tests originally written in Python could pass in C++, and that I could solve
|
bgneal@4
|
81 some of the previous messages. I then re-implemented my hill-climbing
|
bgneal@4
|
82 algorithm in C++ and made another attempt at cracking message 10.
|
bgneal@4
|
83
|
bgneal@4
|
84 My C++ solution was indeed faster, but only by a factor of 8 or 9. I calculated
|
bgneal@4
|
85 I would need about 35 years to hill-climb all rotor and message settings, and I
|
bgneal@4
|
86 didn't really want to see if I could out-live my program. My morale was very low
|
bgneal@4
|
87 at this point. I wasn't sure if I could ever solve this self-imposed problem.
|
bgneal@4
|
88
|
bgneal@4
|
89 Algorithms & Data Structures are key
|
bgneal@4
|
90 ------------------------------------
|
bgneal@4
|
91
|
bgneal@4
|
92 I had wanted to test myself with this problem and had avoided looking at anyone
|
bgneal@4
|
93 else's source code. However, at this juncture, I needed some help. I turned to
|
bgneal@4
|
94 Stephan Krah's `M4 Message Breaking Project
|
bgneal@4
|
95 <http://www.bytereef.org/m4_project.html>`_. Stephan had started a distributed
|
bgneal@4
|
96 computing attack on several 4-rotor naval Enigma machine messages. He had
|
bgneal@4
|
97 graciously made his source code available. Perhaps I could get some hints by
|
bgneal@4
|
98 looking at his code.
|
bgneal@4
|
99
|
bgneal@4
|
100 Indeed, Stephen's code provided the break I needed. I discovered a very cool
|
bgneal@4
|
101 trick that Stephen was doing right before he started his hill-climb. He
|
bgneal@4
|
102 pre-computed every path through the machine for every letter of the alphabet.
|
bgneal@4
|
103 Since hill-climbing involves running the message through the machine roughly
|
bgneal@4
|
104 600 to 1000 times, this proved to be a huge time saver. After borrowing this
|
bgneal@4
|
105 idea, my hill-climbing times for one fixed rotor & message setting went down
|
bgneal@4
|
106 from 250 milliseconds to 2!
|
bgneal@4
|
107
|
bgneal@4
|
108 I immediately stopped looking at Stephen's code at this point; I didn't bother
|
bgneal@4
|
109 to compare our hill-climbing algorithms. I believe both of us are influenced by
|
bgneal@4
|
110 Weierud & Sullivan here. I was now convinced I had the speed-up I needed to make
|
bgneal@4
|
111 a serious attempt at message 10 again.
|
bgneal@4
|
112
|
bgneal@4
|
113 This showed me how critically important it is to have the right data structure
|
bgneal@4
|
114 and algorithm for a problem. However there were two other lessons I needed to
|
bgneal@4
|
115 learn before I was successful.
|
bgneal@4
|
116
|
bgneal@4
|
117 Utilize all your computing resources
|
bgneal@4
|
118 ------------------------------------
|
bgneal@4
|
119
|
bgneal@4
|
120 Up to this point my cracking program was linear. It started at the first rotor
|
bgneal@4
|
121 and message setting, then hill-climbed and noted the score I got. Then it
|
bgneal@4
|
122 proceeded to the next message setting and hill-climbed, noting the score of that
|
bgneal@4
|
123 attempt, and so on. There were two problems with this approach.
|
bgneal@4
|
124
|
bgneal@4
|
125 I realized if I could do some sort of multi-processing, I could search the key
|
bgneal@4
|
126 space faster. I thought about forking off multiple processes or maybe using
|
bgneal@4
|
127 threads, but that was over-thinking it. In the end, I added command-line
|
bgneal@4
|
128 arguments to the program to tell it where in the key space to start looking. I
|
bgneal@4
|
129 could then simply run multiple instances of my program. My development laptop,
|
bgneal@4
|
130 which runs Ubuntu, is from 2008, and it has 2 cores in it. My desktop PC (for
|
bgneal@4
|
131 gaming), has 4 cores! I next spent a couple nights installing MinGW_ and getting
|
bgneal@4
|
132 my application to build in that environment. I could now run 6 instances of my
|
bgneal@4
|
133 cracking program. Hopefully this would pay off and shorten the time further.
|
bgneal@4
|
134
|
bgneal@4
|
135 (I didn't even attempt to commandeer my wife's and kids' computers. That
|
bgneal@4
|
136 wouldn't have gone down so well.)
|
bgneal@4
|
137
|
bgneal@4
|
138 How do you know when you are done?
|
bgneal@4
|
139 ----------------------------------
|
bgneal@4
|
140
|
bgneal@4
|
141 The final problem that I had was determining when my cracking program had in
|
bgneal@4
|
142 fact cracked the message. The hill-climbing attempts produced scores for each
|
bgneal@4
|
143 key setting. Up to this point, based on running my program on messages 1 through
|
bgneal@4
|
144 9, I had noticed that when I got a score divided by the message length of around
|
bgneal@4
|
145 10 I had cracked the message. But my tests on messages 1 through 9 were
|
bgneal@4
|
146 hill-climbing using the known ring settings. My message cracker was using a
|
bgneal@4
|
147 shortcut: I was only searching ring settings on the "fast" rotor to save time.
|
bgneal@4
|
148 This would let me get the first part of a message, but if the middle or
|
bgneal@4
|
149 left-most rotor stepped at the wrong time the rest of the message would be
|
bgneal@4
|
150 unintelligible.
|
bgneal@4
|
151
|
bgneal@4
|
152 To verify my message cracking program, I ran it on some of the previous
|
bgneal@4
|
153 messages. I was shooting for a score divided by message length of 10. Using this
|
bgneal@4
|
154 heuristic, I was in fact able to crack some of the previous messages, but not
|
bgneal@4
|
155 others. It took me a while to realize that not searching the middle rotor's ring
|
bgneal@4
|
156 setting was causing this. The light bulb came on, and I realized that my
|
bgneal@4
|
157 heuristic of 10 is only valid when I search all the ring settings. Instead I
|
bgneal@4
|
158 should just keep track of the highest scores. When you get close enough, the
|
bgneal@4
|
159 score will be significantly higher than average score. Thus I may not know when
|
bgneal@4
|
160 I am done until I see a large spike in the score. I would then have to write
|
bgneal@4
|
161 another program to refine the solution to search for the middle and left-most
|
bgneal@4
|
162 ring setting.
|
bgneal@4
|
163
|
bgneal@4
|
164 Success at last
|
bgneal@4
|
165 ---------------
|
bgneal@4
|
166
|
bgneal@4
|
167 It took a little over a month of evenings and weekends to get to this point.
|
bgneal@4
|
168 I even had a false start where I ran my program on 6 cores for 1.5 days only to
|
bgneal@4
|
169 realize I had a bug in my code and I was only searching ring setting 0 on the
|
bgneal@4
|
170 fast ring. But once I had that worked out, I started a search on 6 cores on a
|
bgneal@4
|
171 Sunday night. I checked the logs off and on Monday, but no significant spike in
|
bgneal@4
|
172 the scores were noted. Before leaving for work on Tuesday I checked again, and
|
bgneal@4
|
173 noticed that one instance of my program had found a score twice as high as any
|
bgneal@4
|
174 other. Could that be it? I really wanted to stay and find out, but I had to go
|
bgneal@4
|
175 to work! Needless to say I was a bit distracted at work that day.
|
bgneal@4
|
176
|
bgneal@4
|
177 After work, I ran the settings through my Python application and my heart
|
bgneal@4
|
178 pounded as I recognized a few German words at the beginning of the message. In
|
bgneal@4
|
179 fact I had gotten around 75% of the message, then some garbage, but then the
|
bgneal@4
|
180 last few words were recognizable German. I then wrote another application to
|
bgneal@4
|
181 search the middle ring setting space. Using the highest score from that program
|
bgneal@4
|
182 allowed me to finally see the entire message. I cannot tell you how happy and
|
bgneal@4
|
183 relieved I was at the same time!
|
bgneal@4
|
184
|
bgneal@4
|
185 Final thoughts
|
bgneal@4
|
186 --------------
|
bgneal@4
|
187
|
bgneal@4
|
188 For a while I didn't think I was going to be smart enough or just plain up to
|
bgneal@4
|
189 the task of solving message 10. Looking back on it, I see a nice progression of
|
bgneal@4
|
190 trying things, experimenting, setbacks, and a few "a-ha" moments that lead to the
|
bgneal@4
|
191 breakthrough. I am very grateful to Dirk Rijmenants for creating the challenge,
|
bgneal@4
|
192 which was a lot of fun. And I also wish to thank Frode Weierud and Geoff
|
bgneal@4
|
193 Sullivan for their hill-climbing algorithm description. Thanks again to Stephan
|
bgneal@4
|
194 Krah for the key speedup that made my attempt possible.
|
bgneal@4
|
195
|
bgneal@4
|
196 For now I have decided not to publish my cracking program, since it is a tiny
|
bgneal@4
|
197 bit specific to Dirk's challenge. I don't want to ruin his Enigma Challenge by
|
bgneal@4
|
198 giving anything away. I don't believe my broad descriptions here have revealed
|
bgneal@4
|
199 too much. Other people need to experience the challenge for themselves to fully
|
bgneal@4
|
200 appreciate the thrill and hard work that go into code breaking.
|
bgneal@4
|
201
|
bgneal@4
|
202 I should also note that there are other ways of solving Dirk's challenge. You
|
bgneal@4
|
203 could, with a bit of patience, solve messages 1 through 9 with one of the many
|
bgneal@4
|
204 Enigma Machine GUI simulators (Dirk has a fine one). I'm not certain how
|
bgneal@4
|
205 you could break message 10 using a GUI simulator and trial and error, however.
|
bgneal@4
|
206
|
bgneal@4
|
207 It is my understanding that this style of Enigma message breaking was not the
|
bgneal@4
|
208 technique used by the allied code-breakers during the war. I believe they used
|
bgneal@4
|
209 knowledge of certain key words, or "cribs", in messages received in weather
|
bgneal@4
|
210 reports and other regular messages to find the key for the day. I look forward
|
bgneal@4
|
211 to reading more about their efforts now.
|
bgneal@4
|
212
|
bgneal@4
|
213 Finally, I have published the C++11 port of Py-Enigma_, which I am calling
|
bgneal@4
|
214 (rather unimaginatively) Cpp-Enigma_ on bitbucket. I hope Cpp-Enigma_ and
|
bgneal@4
|
215 Py-Enigma_ are useful for other people who want to explore the fascinating world
|
bgneal@4
|
216 of Enigma codes.
|
bgneal@4
|
217
|
bgneal@4
|
218 .. _Py-Enigma: https://bitbucket.org/bgneal/enigma
|
bgneal@4
|
219 .. _Python: http://www.python.org
|
bgneal@4
|
220 .. _Technical Details of the Enigma Machine: http://users.telenet.be/d.rijmenants/en/enigmatech.htm
|
bgneal@4
|
221 .. _Enigma Challenge: http://users.telenet.be/d.rijmenants/en/challenge.htm
|
bgneal@4
|
222 .. _Cpp-Enigma: https://bitbucket.org/bgneal/cpp-enigma
|
bgneal@4
|
223 .. _MinGW: http://www.mingw.org/
|