Shannon's Theory, Part II
Reference: Douglas R. Stinson, Cryptography: Theory and Practice.

Shannon's theory
1949, "Communication Theory of Secrecy Systems," Bell System Technical Journal.
Two issues:
What is the concept of perfect secrecy, and does any cryptosystem provide it? (Yes: it is possible when a key is used for only one encryption.)
How do we evaluate a cryptosystem when many plaintexts are encrypted using the same key?

Perfect secrecy
Definition: a cryptosystem has perfect secrecy if Pr[x|y] = Pr[x] for all x ∈ P, y ∈ C.
Idea: Oscar can obtain no information about the plaintext by observing the ciphertext.
Example: Alice encrypts the outcome x of a fair coin toss (Pr[Head] = Pr[Tail] = 1/2) and sends the ciphertext y to Bob, while Oscar observes y.
Case 1: Pr[Head|y] = 1/2 and Pr[Tail|y] = 1/2, so Oscar learns nothing (perfect secrecy).
Case 2: Pr[Head|y] = 1 and Pr[Tail|y] = 0, so the ciphertext reveals the plaintext.

Perfect secrecy when |K| = |C| = |P|
Let (P, C, K, E, D) be a cryptosystem with |K| = |C| = |P|. The cryptosystem provides perfect secrecy iff
every key is used with equal probability 1/|K|, and
for every x ∈ P and y ∈ C there is a unique key K such that e_K(x) = y.
Example: the one-time pad over Z_2. The ciphertext 111 arises from plaintext 010 under key 101, and equally from plaintext 111 under key 000, so observing 111 gives no clue which plaintext was sent.

Outline
Introduction
One-time pad
Elementary probability theory
Perfect secrecy
Entropy
Properties of entropy
Spurious keys and unicity distance
Product cryptosystem

Preview (1)
Setting: a single key K is used to encrypt a plaintext string x^n of n characters into a ciphertext string y^n.
We want to know the average amount of ciphertext required for an opponent to be able to uniquely compute the key, given enough computing time.

Preview (2)
That is, we want to know how much uncertainty about the key remains after the ciphertext is observed: the conditional entropy H(K|C^n).
For this we need the tools of entropy.

Entropy (1)
Suppose we have a discrete random variable X. How much information is gained by the outcome of an experiment?
Example: let X represent the toss of a fair coin, Pr[Head] = Pr[Tail] = 1/2.
We could encode Head as 1 and Tail as 0, i.e. one coin toss carries 1 bit of information.

Entropy (2)
Example: a random variable X with Pr[x1] = 1/2, Pr[x2] = 1/4, Pr[x3] = 1/4.
The most efficient encoding is x1 as 0, x2 as 10, and x3 as 11.
The less probable an outcome, the greater its uncertainty, the more information it carries, and the longer its codeword.

Entropy (3)
Notice: an outcome of probability 2^(-n) corresponds to n bits, so probability p corresponds to -log2 p bits.
Example (cont.): the average number of bits needed to encode X is (1/2)(1) + (1/4)(2) + (1/4)(2) = 3/2.

Entropy: definition
Suppose X is a discrete random variable which takes values from a finite set X. The entropy of X is defined as
H(X) = -Σ_{x ∈ X} Pr[x] log2 Pr[x].

Entropy: example
Let P = {a, b} with Pr[a] = 1/4, Pr[b] = 3/4, and K = {K1, K2, K3} with Pr[K1] = 1/2, Pr[K2] = Pr[K3] = 1/4.
Encryption matrix e_K(x):
        a   b
  K1    1   2
  K2    2   3
  K3    3   4
H(P) = -(1/4) log2(1/4) - (3/4) log2(3/4) ≈ 0.81
H(K) = 1.5, H(C) ≈ 1.85
(These values are verified numerically in the sketch further below.)

Properties of entropy (1)
Definition: a real-valued function f is strictly concave on an interval I if
f((x + y)/2) > (f(x) + f(y))/2 for all x, y ∈ I with x ≠ y.

Properties of entropy (2)
Jensen's inequality: suppose f is a continuous strictly concave function on I, with Σ_{i=1}^{n} a_i = 1 and a_i > 0 for 1 ≤ i ≤ n. Then
Σ_{i=1}^{n} a_i f(x_i) ≤ f(Σ_{i=1}^{n} a_i x_i),
with equality iff x_1 = ... = x_n.

Properties of entropy (3)
Theorem: let X be a random variable whose probability distribution takes the values p_1, p_2, ..., p_n with p_i > 0 for 1 ≤ i ≤ n. Then H(X) ≤ log2 n, with equality iff p_i = 1/n for all i.
In other words, the uniform distribution has maximum entropy.

Properties of entropy (4)
Proof:
H(X) = -Σ_{i=1}^{n} p_i log2 p_i = Σ_{i=1}^{n} p_i log2(1/p_i) ≤ log2(Σ_{i=1}^{n} p_i (1/p_i)) = log2 n,
where the inequality is Jensen's inequality applied to the strictly concave function log2.

Entropy of a natural language (1)
H_L: the average information per letter in English.
1. If the 26 letters were uniformly random, the entropy per letter would be log2 26 ≈ 4.70.
2. Taking single-letter frequencies into account, H(P) ≈ 4.19.

Entropy of a natural language (2)
3. However, successive letters are correlated (digrams, trigrams, and so on).
Q: what is the entropy of two or more random variables?
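Before moving on to joint entropy, here is a minimal Python sketch that checks the single-variable entropies in the example above by computing the ciphertext distribution induced by the encryption matrix. The function and variable names are mine, not from the lecture.

    from math import log2

    def entropy(probs):
        # Shannon entropy H(X) = -sum of p*log2(p), skipping zero-probability outcomes
        return -sum(p * log2(p) for p in probs if p > 0)

    pr_p = {'a': 1/4, 'b': 3/4}                # plaintext distribution
    pr_k = {'K1': 1/2, 'K2': 1/4, 'K3': 1/4}   # key distribution
    enc = {('K1', 'a'): 1, ('K1', 'b'): 2,     # encryption matrix e_K(x)
           ('K2', 'a'): 2, ('K2', 'b'): 3,
           ('K3', 'a'): 3, ('K3', 'b'): 4}

    # Induced ciphertext distribution: Pr[y] = sum of Pr[K]*Pr[x] over pairs (K, x) with e_K(x) = y
    pr_c = {}
    for (k, x), y in enc.items():
        pr_c[y] = pr_c.get(y, 0) + pr_k[k] * pr_p[x]

    print(round(entropy(pr_p.values()), 2))    # 0.81
    print(round(entropy(pr_k.values()), 2))    # 1.5
    print(round(entropy(pr_c.values()), 2))    # 1.85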
Properties of entropy (5)
Definition: H(X, Y) = -Σ_{x ∈ X, y ∈ Y} Pr[x, y] log2 Pr[x, y].
Theorem: H(X, Y) ≤ H(X) + H(Y), with equality iff X and Y are independent.
Proof sketch: let p_i = Pr[X = x_i] for 1 ≤ i ≤ m, q_j = Pr[Y = y_j] for 1 ≤ j ≤ n, and r_ij = Pr[X = x_i, Y = y_j]. Then p_i = Σ_{j=1}^{n} r_ij and q_j = Σ_{i=1}^{m} r_ij, and the claim follows by applying Jensen's inequality.

Entropy of a natural language (3)
3. Let P^n be the random variable whose probability distribution is that of all n-grams of plaintext.
Tabulating digrams gives H(P^2)/2 ≈ 3.90; tabulating trigrams gives H(P^3)/3; and so on, tabulating n-grams gives H(P^n)/n.
H_L = lim_{n→∞} H(P^n)/n, and empirically 1.0 ≤ H_L ≤ 1.5.

Entropy of a natural language (4)
The redundancy of L is defined as R_L = 1 - H_L / log2 |P|.
Taking H_L = 1.25 gives R_L = 0.75: the English language is 75% redundant!
This means English text can be compressed to about one quarter of its original length.

Conditional entropy
For any fixed value y of Y, the remaining information about the random variable X is
H(X|y) = -Σ_x Pr[x|y] log2 Pr[x|y].
The conditional entropy is the average amount of information about X that is not revealed by Y:
H(X|Y) = -Σ_y Σ_x Pr[y] Pr[x|y] log2 Pr[x|y].
Theorem: H(X, Y) = H(Y) + H(X|Y).

Theorem about H(K|C) (1)
Let (P, C, K, E, D) be a cryptosystem. Then H(K|C) = H(K) + H(P) - H(C).
Proof: H(K, P, C) = H(C|K, P) + H(K, P). Since the key and plaintext uniquely determine the ciphertext, H(C|K, P) = 0, so
H(K, P, C) = H(K, P) = H(K) + H(P),
because the key and plaintext are independent.

Theorem about H(K|C) (2)
We have H(K, P, C) = H(K, P) = H(K) + H(P).
Similarly, since the key and ciphertext uniquely determine the plaintext, H(P|K, C) = 0 and hence H(K, P, C) = H(K, C).
Now H(K|C) = H(K, C) - H(C) = H(K, P, C) - H(C) = H(K) + H(P) - H(C).

Results (1)
Define the random variables C^n (a ciphertext of length n), P^n (a plaintext of length n), and K (the key). Then
H(K|C^n) = H(K) + H(P^n) - H(C^n).
Recall H_L = lim_{n→∞} H(P^n)/n and set |P| = |C|. For large n,
H(P^n) ≈ n H_L = n (1 - R_L) log2 |P| and H(C^n) ≤ n log2 |C|,
so H(K|C^n) ≥ H(K) - n R_L log2 |P|.

Spurious keys (1)
Example: Oscar obtains the ciphertext WNAJW, which was encrypted with a shift cipher.
K = 5 yields the plaintext "river"; K = 22 yields the plaintext "arena".
One of these keys is correct and the other is spurious.
Goal: prove a bound on the expected number of spurious keys.

Spurious keys (2)
Given y ∈ C^n, the set of possible keys is
K(y) = {K ∈ K : there exists x ∈ P^n with Pr[x] > 0 and e_K(x) = y}.
The number of spurious keys for y is |K(y)| - 1, so the average number of spurious keys is
s̄_n = Σ_{y ∈ C^n} Pr[y] (|K(y)| - 1) = Σ_{y ∈ C^n} Pr[y] |K(y)| - 1.

Relate H(K|C^n) to spurious keys (1)
By definition,
H(K|C^n) = Σ_{y ∈ C^n} Pr[y] H(K|y)
≤ Σ_{y ∈ C^n} Pr[y] log2 |K(y)|
≤ log2 Σ_{y ∈ C^n} Pr[y] |K(y)|   (by Jensen's inequality)
= log2 (s̄_n + 1).

Relate H(K|C^n) to spurious keys (2)
So H(K|C^n) ≤ log2 (s̄_n + 1), and we have already derived H(K|C^n) ≥ H(K) - n R_L log2 |P|.
With keys chosen equiprobably, H(K) = log2 |K|, so
log2 (s̄_n + 1) ≥ log2 |K| - n R_L log2 |P| = log2 (|K| / |P|^(n R_L)).

Relate H(K|C^n) to spurious keys (3)
Theorem: if |C| = |P| and keys are chosen equiprobably, the expected number of spurious keys satisfies
s̄_n ≥ |K| / |P|^(n R_L) - 1.
As n increases, the right-hand side approaches 0.

Relate H(K|C^n) to spurious keys (4)
Setting s̄_n = 0 gives n_0 ≈ log2 |K| / (R_L log2 |P|).
For the substitution cipher, |P| = |C| = 26 and |K| = 26!, so n_0 ≈ 88.4 / (0.75 × 4.7) ≈ 25.

Unicity distance
The unicity distance is the average amount of ciphertext required for an opponent to be able to uniquely compute the key, given enough computing time.

Product cryptosystem
Let S1 = (P, P, K1, E1, D1) and S2 = (P, P, K2, E2, D2) be cryptosystems with the same plaintext space. Their product is
S1 × S2 = (P, P, K1 × K2, E, D), with
encryption e_(K1,K2)(x) = e_K2(e_K1(x)) and decryption d_(K1,K2)(y) = d_K1(d_K2(y)).

Product cryptosystem (cont.)
Two cryptosystems M and S commute if M × S = S × M.
A cryptosystem S is idempotent if S^2 = S; the shift cipher is an example (see the sketch below).
If a cryptosystem is not idempotent, there is a potential increase in security from iterating it several times.
How do we find a non-idempotent cryptosystem?
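As a minimal sketch of why the shift cipher is idempotent, the snippet below composes two shifts and checks that the result equals a single shift by the sum of the keys; the integer letter encoding (a = 0, ..., z = 25) and the particular key values are my own choices for illustration.

    def shift_encrypt(x, k):
        # Shift cipher on a list of letters encoded as integers mod 26
        return [(c + k) % 26 for c in x]

    x = [17, 8, 21, 4, 17]                                  # "river"
    k1, k2 = 5, 22
    two_shifts = shift_encrypt(shift_encrypt(x, k1), k2)    # S x S with key pair (k1, k2)
    one_shift = shift_encrypt(x, (k1 + k2) % 26)            # single shift with key k1 + k2 mod 26
    assert two_shifts == one_shift

Since every key pair (k1, k2) acts exactly like the single key k1 + k2 mod 26, iterating the shift cipher adds no security. The theorem below generalizes this observation to products of commuting idempotent systems.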
Theorem: if S and M are both idempotent and they commute, then S × M is also idempotent:
(S × M) × (S × M) = S × (M × S) × M = S × (S × M) × M = (S × S) × (M × M) = S × M.
Idea: find simple cryptosystems S and M that do not commute; then S × M is possibly non-idempotent.
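One way to act on this idea is to test commutativity by brute force on sample plaintexts. The pair below, a period-2 Vigenère-style shift M and a fixed pair-swapping transposition S, is my own illustrative choice rather than an example from the lecture; the sketch only exhibits a non-commuting pair, and whether the resulting product is actually non-idempotent would still have to be argued separately.

    def vigenere(x, key):
        # M: add a repeating key of shifts (a position-dependent substitution)
        return [(c + key[i % len(key)]) % 26 for i, c in enumerate(x)]

    def swap_pairs(x):
        # S: transposition that swaps each adjacent pair of positions
        y = list(x)
        for i in range(0, len(y) - 1, 2):
            y[i], y[i + 1] = y[i + 1], y[i]
        return y

    x = [0, 0, 0, 0]                           # plaintext "aaaa"
    key = [1, 0]                               # Vigenere key with different shifts per position

    s_then_m = vigenere(swap_pairs(x), key)    # S x M: apply S first, then M
    m_then_s = swap_pairs(vigenere(x, key))    # M x S: apply M first, then S
    print(s_then_m, m_then_s)                  # [1, 0, 1, 0] vs [0, 1, 0, 1]: they do not commute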