Perfect Secrecy.
"Paranoia is very useful in this work. ...If your cryptographic system can survive the paranoia model, it has at least a fighting chance of surviving in the real world." - Niels Ferguson & Bruce Schneier.
Introduction
Claude Shannon developed the concept of perfect secrecy in his paper “Communication theory of secrecy systems.” Perfect secrecy is defined as the probability of obtaining the plaintext (pt), of a corresponding and given ciphertext (ct), is the same as obtaining the plaintext without any ciphertext to observe. In other words, the probability that the attacker finds the plaintext after observing the ciphertext is the same as the probability or difficulty of getting the plaintext before observing the ciphertext. The adversary is unable to obtain any information from the ciphertext, hence the observed ciphertext is completely meaningless.
Pr [ pt | ct ] = Pr [ pt ]
Each plaintext, ciphertext,
and key has to occur with equal probability.
Bayes’ theorem is used to prove that a cryptosystem is perfectly secure:
Pr [ pt | ct ] = Pr [ pt ] ´ Pr [ ct | pt ] / Pr [ ct ]
Pr[ct | pt] must cancel
with Pr[ct], i.e. they must be equal.
The Pr[ct | pt] is simply the probability of the key, since the key is
needed to obtain the ciphertext from the plaintext.
One-time Pad
Plaintext: 10010111101010011100000100101010011101
… Key: 11110010100010110010110011100101010101
… Ciphertext: 01100101001000101110110111001111001000
…
The one-time
is an example of a cryptosystem that achieves perfect secrecy. In fact, the
one-time pad, or OTP, (sometimes referred to as the Holy Grail of
Cryptography) achieves the highest form of encryption security. In its most
theoretical form, the OTP encrypts plaintexts with keys that are equally as
long as the message to be encrypted. The key(s) are as random as possible and
are predetermined or preset for both the sender and receiver to have before any
encryption or decryption begins. When considering encrypting bits, the attacker
has no greater than 1/2 probability of guessing each bit correctly. The
following represents a OTP simply with the Xor of bits (notice that the key has
to be as long as the plaintext):
![]()
The OTP was developed shortly after World War I by U.S. Army Major Joseph Mauborgne. Two very large pads of paper would contain the exact same sequence of the thousands of lines of random letters. One pad would be given to the sender and the other to the receiver. Now that the key is established the only thing left is a cipher. Very simply, a Vigenere cipher or some form of polyalphabetic substitution cipher can be used to encrypt the plaintext message and decrypt the ciphertext. After a key was used it was completely discarded and never used again, namely the sheet of random letters used as the key would be torn off and destroyed.
The strength of the OTP comes from the fact that there no known weaknesses so long as keys are never reused. Because of the very high degree of randomness, it is difficult to discern a pattern, nothing can be deduced, thus render ineffective any known form of cryptanalysis. Theoretically, even if every possible key was checked, the correct message would surely surface but so too would many other false positives be uncovered, therefore leaving it up to the code cracker to do additional time consuming work by trying to guess which sensible plaintext message is the correct one.
However, on the opposite end of theory, even though the OTP provides the best form of encryption, it is not very practical to use. The sender and receiver must contain this entirely large set of random keys and stay absolutely synchronized. During a battle many units would need to have a copy of this key or pad and everyone would have to be synchronized. The more mobile a unit is the more difficult it would be because of the sacrifice in resources. One shift in the random sequence of letters or one additional torn off page could easily disrupt effective communications, not to mention an enemy capturing the receiver's pad of keys. So, maybe in some cases, the OTP could be used for ultra secure, highly protective and low bandwidth processes. An example could be the hotline between the Russian President and the U.S. President perhaps.
Entropy
In 1948, Claude Shannon developed axioms characterizing entropy. Entropy is the mathematical measure of the amount of information, disorder, or uncertainty composed as a function of a probability distribution. The idea of entropy for cryptology is measuring the amount of uncertainty for certain events. As an example, given a sample space S = {e1, e2, ..., en}, where Σ Pr[ei] = 1, flipping a fair coin has a sample space of S = {h, t}, heads or tails, and the probability Pr[h] = Pr[t] = 0.5. The entropy denoted H, of either heads or tails is H(h) = H(t) = - log2 0.5 = 1. This simple example gives us a unit of information, which we can call the bit. In fact, the entire sample space can have an entropy, which is the expected value of information of all the events in S: H(S) = Σ -Pr[ei] log2 Pr[ei].
Fair coin toss example: H(S) = 0.5 * (-log 0.5) + 0.5 * (-log 0.5) = 0.5(-(-1)) + 0.5(-(-1)) = 1 bit.
Rolling a fair six-sided die example: H(S) = -Σ (1/6) log (1/6) ≈ 2.585.
Weighted single letter example: H(S) ≈ 4.19 considering frequency charts.
Non-weighted single letter example: H(S) = -Σ (1/26) log (1/26) ≈ 4.7, which is higher uncertainty.
So, given that multiple plaintexts can be encrypted using the same key, we would like to know how probable will a cryptanalyst carry out a successful (ciphertext-only) attack. Ideally, the higher the entropy the higher the security.
Links
http://www.cs.ucla.edu/~jkong/research/security/shannon1949/node10.html
http://www-courses.cs.uiuc.edu/~cs397pgn/
lectures/lecture6-7.pdf
http://www.math.ntu.edu.tw/~jmchen/Ch04.pdf
http://world.std.com/~franl/crypto/one-time-pad.html
http://en2.wikipedia.org/wiki/One-time_pad