Science author Simon Singh is standing beside an Enigma machine, talking about the 15,354,393,600 password variants the German encryption box allows with its spaghetti of wiring, pseudo-random rotors and reconfigurable plugboard. He’s talking about the top secret work at Bletchley Park to break the code – the groundwork laid by Polish mathematicians; Alan Turing’s bombe; years of frustrated efforts waiting for a breakthrough.
Behind him, a screen shows that an artificial intelligence has cracked it in 13 minutes.
The stunt is being staged by a data analysis firm, which is showing off its machine learning toolset with a live demonstration that pits it against the very best in 1930s encryption. Enigma Pattern has recreated a code-cracking bombe for the Enigma machine in Python, set up to test all possible combinations of a four-rotor, navy-type machine. Using cloud computing provided by DigitalOcean, the system draws on 2,000 virtual servers to run through 41 million combinations per second.
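To put those figures in proportion, here is a back-of-envelope calculation in Python. It is only a sketch: it assumes the quoted 41 million combinations per second is sustained for the full 13 minutes and is split evenly across the 2,000 servers, neither of which the demo states. The variable names are illustrative.

```python
# Figures quoted in the article.
RATE_PER_SECOND = 41_000_000   # combinations tested per second, all servers combined
SERVERS = 2_000                # virtual servers
MINUTES_TO_CRACK = 13          # time shown on the screen

# Assumption: the work is divided evenly between servers.
per_server_rate = RATE_PER_SECOND // SERVERS

# Assumption: the full rate is sustained for the whole run, so this is an
# upper bound on the number of settings tried, not a measured count.
upper_bound_tested = RATE_PER_SECOND * MINUTES_TO_CRACK * 60

print(f"Per server: {per_server_rate:,} combinations per second")
print(f"At most {upper_bound_tested:,} combinations in {MINUTES_TO_CRACK} minutes")
```

On those assumptions each server handles 20,500 combinations a second, and the whole cluster could try up to roughly 32 billion settings before the screen shows a result.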
This brute force is only half of the approach, however. The output of the bombe, the possible combinations of letters, is fed into a neural network trained on a data set of Brothers Grimm fairy tales. This allows it to identify words it considers to be German, so it can automatically sift through the candidates until it finds something comprehensible. Instead of narrowing the input, it narrows the vast quantity of outputs to a single, sensible line.
In this case, that combination is: “Deutsch ist eine schöne Sprache [German is a beautiful language]”.
“The important thing is that it hasn’t compiled a dictionary out of those words,” explains Mike Gibbons, co-founder of Enigma Pattern. “What it has done is learn that German words are often, say, this long, or often have two or three syllables, or that when an ‘S’ occurs it’s often followed by a ‘T’. So it builds up those kinds of rules, rather than trying to word match.”
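The kind of rule Gibbons describes, such as an ‘S’ often being followed by a ‘T’, can be illustrated with something far simpler than a neural network. The Python sketch below swaps in a character-bigram model: it counts which letters tend to follow which in a sample of German, then scores each candidate decryption by how typical its letter pairs are. It is not Enigma Pattern’s method. The names `TOY_CORPUS`, `normalise`, `train` and `score` are hypothetical, and the corpus is a handful of made-up sentences standing in for the Grimm data set.

```python
import math
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz "
# Enigma traffic used 26 letters only, so umlauts are transliterated.
_TRANSLIT = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})

# Assumption: a tiny stand-in corpus, not the Brothers Grimm training set.
TOY_CORPUS = (
    "es war einmal ein koenig der hatte eine schoene tochter "
    "die sprache der leute ist deutsch und sie ist schoen "
    "der wolf sprach zu dem kind ich bin ein freund "
    "das kind ging in den wald und suchte eine blume "
)

def normalise(text):
    """Lowercase, transliterate umlauts, and keep only a-z and spaces."""
    text = text.lower().translate(_TRANSLIT)
    return "".join(c for c in text if c in ALPHABET)

def train(corpus, alpha=1.0):
    """Count letter pairs; alpha is the add-one smoothing constant."""
    text = normalise(corpus)
    pair_counts = Counter(zip(text, text[1:]))
    first_counts = Counter(text[:-1])  # how often each letter starts a pair
    return pair_counts, first_counts, alpha

def score(model, candidate):
    """Average log-probability of the candidate's letter pairs (higher is more German-like)."""
    pair_counts, first_counts, alpha = model
    text = normalise(candidate)
    pairs = list(zip(text, text[1:]))
    if not pairs:
        return float("-inf")
    total = 0.0
    for a, b in pairs:
        p = (pair_counts[(a, b)] + alpha) / (first_counts[a] + alpha * len(ALPHABET))
        total += math.log(p)
    return total / len(pairs)

model = train(TOY_CORPUS)
candidates = [
    "xqzv kjwq pfgb zzxq vbnm",           # what a wrong rotor setting might produce
    "Deutsch ist eine schöne Sprache",    # the phrase from the demo
]
best = max(candidates, key=lambda c: score(model, c))
print(best)
```

No word list is consulted at any point; the score depends only on letter-pair statistics. The toy corpus is tiny and shares vocabulary with the test phrase, so this shows the mechanism rather than any real ability to generalise.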
The technique on display is designed to give a frame of reference for the kind of scale Enigma Pattern can offer companies ranging from financial firms to medical companies. Feed us your data, the whole setup boasts, and we’ll get you information. While not many people are likely to use neural networks to break Second World War-era encryption, the demo goes a long way to showing how a combination of artificial intelligence and sheer computing power has created a whole different paradigm around data secrecy.
“This is an unfair fight,” admits Gibbons. “The guys who originally put together the Enigma machine knew what capabilities there were to crack a code, so that’s why they came up with that design. If you’re designing a code today you know there are people like us with the capabilities we’ve got, and probably governments with even more capability. So you design on the basis of that.”
While it made short work of an encrypted German phrase, Enigma Pattern’s effort does have serious limitations. The Grimm training set, for example, might have helped it to discern the rhythm and structure of German words, but that same system would be pretty useless with English words, or any other language for that matter. There’s also the more general problem machine learning techniques need to contend with: the black box problem.
This is the issue of not knowing why advanced algorithms are doing what they’re doing. You can see the input, and you can see the results, but the process of machine learning is opaque. It’s a major problem when you consider that the same kinds of tools used here to crack a pre-computer Enigma machine could be put to work across major social infrastructure, from healthcare to the legal system.
“The reason we’re handing over this analysis to a machine is because it is beyond normal human capacity,” says Gibbons. “You’re ultimately saying to a machine: it’s too complex for me, you look after it. So it gives you some results. Then you say to it: Now explain to me, in terms I can understand, how you arrived at that. You’re asking a big question.
“But there are audit requirements needed here,” he acknowledges. “AI is being used to assess parole applications in the US, for example, and a fundamental mechanism in the legal system is the right to appeal. Clearly under those circumstances it needs to be handled in a way that can be explained. So there needs to be an explanation engine that comes back in the other direction.”
Back in the room, the actual Enigma box seems comprehensible by comparison. After the demo, I ask Simon Singh if there’s something fundamentally different in having a mechanical and electric machine, compared to the intangible algorithms of Enigma Pattern’s AI.
“We take this into schools, and you can show it to kids, and they understand every aspect of it,” he tells me. “If I showed you diagrams you’d understand how the wiring works, and you can hear the clunking. With digital encryption you can do that as well. There’s nothing clunky, nothing mechanical, but I can say: here is your message in ASCII, here are your algorithms that will encrypt it.”
He pauses. “But there is something nice about having a thing that’s mechanical and electrical; something you can physically see in front of you,” he adds, twiddling a rotor.