Cryptanalysis

"The attackers find a buffer overflow and never bother attacking the cryptography. ...however, ... An attacker who breaks the cryptography has a low chance of being detected." - Bruce Schneier.

Introduction

Cryptanalysis is the arduous work of breaking a cipher system. Cryptanalysis has its roots in around 650 AD during which theologians wanted to verify the authenticity of ancient manuscripts by comparing the consistency of various words. However, true cryptanalysis probably began in Europe. In fact, one of the most famous cryptanalytic successes occurred in the late 1500's in England, leading to the arrest and execution of the Mary Queen of Scots for treason. Historically, the most common form of cryptanalysis is frequency analysis, which is basically counting the amount of each letter or symbol that occurred in an encrypted message and comparing those amounts with a table of average, normal, language frequency of letters for plaintext messages.

Zimmermann Telegram

Arguably, the most important or famous known act of cryptanalysis was the cracking of the Zimmermann telegram in early 1917 during World War I. It's significance stems from the fact of discovering evidence that would draw an entire and powerful nation, namely the United States, into war. Arthur Zimmermann, Germany's newly appointed foreign minister at the time, wanted Mexico as well as the Japanese to attack the U.S. The British intercepted this message that was en route to Mexico. Two members of the Admiralty's Cipher Bureau, Reverend Montgomery and Nigel de Grey, cracked it by using previously analyzed and encrypted messages. Bit by bit they were able to piece together the message. When U.S. President Woodrow Wilson received this evidence of German aggression, he committed the U.S. to war.

The Cryptanalyst

Today the cryptanalyst has to have unusual ability of observation, strong inductive and deductive reasoning, good judgment, perseverance, tremendous concentration and intuition gained from experience. The cryptanalyst must rearrange data to disclose nonrandom information. Any collateral information is also a benefit and an added weapon for the cryptanalyst's arsenal. The cryptanalyst must be well grounded with principles of mathematics, logic, complexity and computation theory. The cryptanalyst must be very knowledgeable of the cryptolinguistic data of the opposing language. Cryptanalysis refers more to finding hidden and subtle weaknesses and trapdoors within an encryption algorithm. However, brute force attacks can be considered a form of cryptanalysis. Keys can be poorly chosen or accidentally divulged. The cryptanalyst's job is one of the most challenging tasks to be undertaken.

4 General Security Threats

1. Interruption
2. Interception
3. Modification
4. Fabrication

These security attacks can be divided into passive or active attacks. Passive attacks are considered be just eavesdropping on transmitted information. Detecting a passive attack is more difficult than an active attack therefore, requiring prevention as more of a defensive means. Active attacks on the other hand, involves modifying information, masquerading, replay attacks and denial of service. So to defend against an active attack is to detect them where possible and recover.

"Cracking a difficult cipher is akin to climbing a sheer cliff face. The cryptanalyst is seeking any nook or cranny which could provide the slightest purchase." - Simon Singh.

Time Complexity Classifications

There are a few technical factors that contribute to the time complexity of a cryptanalytic attack, such as processor speed, key size and input length. With an algorithmic input size of N we have the following:

1. O(N) is linearly time bounded.
2. O(NX) where X is a constant, is polynomial time bounded.
3. O(XP(N)) where P(N) is a polynomial, is exponentially time bounded.

Since one-time pads are not practical, the next best encryption scheme is to have a key size that is usually some fraction of the length of the message. This would then imply that the key is subject to a brute force attack and therefore can be cracked in principle. However, this is somewhat paradoxical because, as long as the encoded message isn't cracked for a very, very long time then that is all that matters, not that it can be cracked period. It can not be proved that a key can be found quickly, but verifying the key can be done in polynomial time. This then has complexity theory implications, such that this problem is in class NP because no polynomial time-bounded solution exists to find the correct key, except with a lucky guess. Furthermore, since there is no mathematical proof of security, then we have to rely on circumstantial evidence.

Additionally, another area of complexity theory that provides evidence of the security of encryption via keys is NP-complete reducibility. An NP-complete problem can be reduced to a code cracking NP-complete problem and thus would show that code cracking is not solvable in polynomial time. However, NP-complete problems deal with worst case complexity implying that this is a problem that can be easy to solve in some circumstances. However, with regard to cryptography, problems require difficult average case complexity, such as the factorization problem. As long as no one is clever enough to find a fast(er) way to factor very large numbers, cryptographic processes relying on the factoring problem such as RSA, provides enough evidence for security. Furthermore, additional ambiguities exist in measuring the time complexity of cryptanalysis. For instance, counting the total number of possible keys requires linear time complexity but measuring the key length can be exponential time complexity, despite the total number of possible keys being a function of the key length.

"...there may already be remarkable breakthroughs hidden behind the veil of government secrecy." - Simon Singh, on James Ellis' remarks.

Attacks & Knowledge

Ciphertext Only

The ciphertext only attack is the most difficult form of cryptanalysis. The only knowledge the cryptanalyst has is the algorithm and ciphertext(s).

Known Plaintext

The known plaintext attack is when the cryptanalyst knows the matching plaintext or part of it, for a given ciphertext. This is similar to a crib except that it doesn't have to be a guess. Thus the cryptanalyst knows the algorithm and 1 to N ciphertext to plaintext mappings but still does not know the key(s). An example of this attack could be using a keystroke logger to record a plaintext e-mail and intercepting that e-mail in its encrypted form.

Chosen Plaintext

The chosen plaintext attack allows the cryptanalyst to, in essence, seed the the encryption algorithm with a plaintext of his/her choice. The cryptanalyst would know the algorithm and 1 to N ciphertext to plaintext mappings but still does not know the key(s). However, this kind of attack makes it a little easier than a known plaintext attack to fish for weaknesses in say the algorithm. An example of this attack could be inserting your own plaintext messages into a batch of messages that's supposed to be transmitted.

Chosen Ciphertext

The chosen ciphertext attack is when the cryptanalyst is able to choose his/her own ciphertext and have it somehow decrypted by the algorithm to produce a given plaintext. Likewise, building on the above, the cryptanalyst is also able make a chosen plaintext attack as well. The cryptanalyst would know the algorithm and 1 to N ciphertext to plaintext mappings but still does not know the key(s). This kind of attack also makes it a little easier to surface weaknesses in the algorithm as well, thus increasing your chances for deducing the key(s). An example could be intercepting an encrypted message from a sender, alter that message and then send your own encrypted message to the recipient, while somehow having access to the plaintext of the recipient, perhaps via some strong association or back door.

Birthday Attack

The Birthday Attack gets its name from the birthday paradox. One of the professors I had in college had each of us in class one at a time, say our birthday (month and day) aloud. There were about 50 of us students and about half way through, when about the 26th person said her birthday it was the same month and day of the 5th person's birthday, thus a match was found. This matching is referred to as a collision. There are 365 possible days but about ÃN (such that N is the total possible number of elements, namely in this case N = 365) we have roughly, with 20 or more people, that the probability is greater than 50% that 2 people will have the same birthday (month and day). The probability of a collision is equal to X * (X - 1) / (2 * N), such that X is the total number of chosen elements out of N. If X ³ ÃN, then the probability ³ 50%. In reference to bits, the "birthday" bound is roughly 2(K / 2) elements before expecting a collision. Because there are 2K possible values, there must be roughly Ã2K elements for anticipating a collision.

Meet in the Middle

The Meet in the Middle attack is similar to the Birthday Attack, except that the cryptanalyst has a little more flexibility and instead of waiting for a value to occur twice in a single set the cryptanalyst can check for an intersection between 2 sets. For instance, let's say that the cryptanalyst has a pretty good knowledge of the kind of plaintext messages of some given sender, who happens to use 64 bit keys. The cryptanalyst would build 1 of the 2 sets, namely a table, by computing 232 unique hash results for the same plaintext message, with the associated unique key for each hash result, using the same hash function as the sender. Then the cryptanalyst would eavesdrop for each message to check if the encrypted hash value exists in his table of pre-computed hash values. If there is a collision, then it might be possible that the corresponding key in your table is correct, which inevitably would open up the possibility for the cryptanalyst to insert his/her own messages. Again, if there is a total of N values and 1 set has X elements in it and the other set has Y elements in it (such that X + Y ² N) then there exists X * Y possible pairs of elements, such that each pair has a 1/N chance of a collision. The probability of a collision increases, the closer (X * Y) / N is to 1. Although, for efficiency's sake ideally, the cryptanalyst would want to use an X value for 1 set and a Y value for the other set, such that X * Y Å N or X Å Y Å ÃN.

Links

http://www.eweek.com/article/0,3658,s=712&a=24663,00.asp
http://www.umich.edu/~umich/fm-34-40-2/
http://www.mat.dtu.dk/persons/Jakobsen_Thomas/capapers.html
http://www.nsa.gov/programs/employ/c.html
http://ciphersaber.gurus.com/cryptanalysis.html
http://searchsecurity.techtarget.com/sDefinition/0,,sid14_gci214432,00.html
http://www.cs.ucsd.edu/~daniele/CSE207C/
http://home.ecn.ab.ca/~jsavard/crypto/mi060901.htm
http://www.tcs.hut.fi/~helger/crypto/link/block/dc.html
http://cryptome.org/a51-bsw.htm