University of Siena
PhD short course
Information Theory and Statistics
Siena, 15-19 September, 2014
The method of types
Mauro Barni
University of Siena
Short course on Information Theory and Statistics, Siena, September 2014
M. Barni, University of Siena, VIPP group
Outline of the course
• Part 1: Information theory in a nutshell
• Part 2: The method of types and its relationship with statistics
• Part 3: Information theory and large deviation theory
• Part 4: Information theory and hypothesis testing
• Part 5: Application to adversarial signal processing
Outline of Part 2
• The method of types
– Definitions
– Basic properties with proof of theorems
• Law of large numbers
• Source coding, Universal source coding
Short course on Information Theory and Statistics, Siena, September 2014
M. Barni, University of Siena, VIPP group
University of Siena
Type or empirical probability
Type, or empirical probability, of a sequence x^n:

P_{x^n}(a) = N(a | x^n) / n,   ∀a ∈ X

Set of all the types with denominator n:

P_n = { all types with denominator n }

Example: if X = {0,1},

P_5 = { (0,1), (1/5, 4/5), (2/5, 3/5), (3/5, 2/5), (4/5, 1/5), (1,0) }
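The set P_5 of the example can be enumerated programmatically. A minimal sketch in Python (the function names are illustrative; the alphabet is binary as in the example):

```python
from fractions import Fraction

def types_with_denominator(n, alphabet_size=2):
    """Enumerate P_n: all empirical pmfs (k_1/n, ..., k_m/n) with sum(k_i) = n."""
    def compositions(total, parts):
        if parts == 1:
            yield (total,)
            return
        for k in range(total + 1):
            for rest in compositions(total - k, parts - 1):
                yield (k,) + rest
    return [tuple(Fraction(k, n) for k in c) for c in compositions(n, alphabet_size)]

P5 = types_with_denominator(5)
print(len(P5))                      # 6 types for the binary alphabet, n = 5
print(len(P5) <= (5 + 1) ** 2)      # polynomial bound |P_n| <= (n+1)^|X|
```

For the binary alphabet the count is exactly n+1, well below the (n+1)^|X| bound of the next slide.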
Type class
Type class: all the sequences having the same type
T(P) = { x^n ∈ X^n : P_{x^n} = P }

Example: x^5 = 01100,  P_{x^5} = (3/5, 2/5)

T(P_{x^5}) = { 11000, 10100, 10010, 10001, 01100, 01010, 01001, 00110, 00101, 00011 }
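The example can be checked by brute-force enumeration; a small Python sketch (helper names are hypothetical):

```python
from itertools import product
from fractions import Fraction

def type_of(seq):
    """Empirical pmf of a binary string, as (P(0), P(1))."""
    n = len(seq)
    return (Fraction(seq.count('0'), n), Fraction(seq.count('1'), n))

def type_class(P, n):
    """All length-n binary strings whose type equals P."""
    return [''.join(s) for s in product('01', repeat=n)
            if type_of(''.join(s)) == P]

P = type_of('01100')        # (3/5, 2/5)
T = type_class(P, 5)
print(len(T))               # C(5,2) = 10 sequences, as listed on the slide
```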
Number of types
The number of types grows polynomially with n
Theorem
The number of types with denominator n is upper bounded by:

|P_n| ≤ (n+1)^{|X|}

Proof.
Obvious: each of the |X| entries of a type takes one of the n+1 values 0/n, 1/n, ..., n/n.
Probability of a sequence
Theorem
The probability that a sequence x = x^n is emitted by a DMS with pmf Q is

Q(x) = 2^{-n(H(P_x) + D(P_x||Q))}

If P_x = Q:

Q(x) = 2^{-nH(P_x)} = 2^{-nH(Q)}

Remember
The larger the KL distance between the type of x and Q, the lower the probability.
Probability of a sequence
Proof.

Q(x) = ∏_i Q(x_i) = ∏_{a∈X} Q(a)^{N(a|x)}
     = ∏_{a∈X} Q(a)^{n P_x(a)} = ∏_{a∈X} 2^{n P_x(a) log Q(a)}
     = ∏_{a∈X} 2^{n [P_x(a) log Q(a) - P_x(a) log P_x(a) + P_x(a) log P_x(a)]}
     = 2^{n Σ_a [-P_x(a) log(P_x(a)/Q(a)) + P_x(a) log P_x(a)]}
     = 2^{-n[H(P_x) + D(P_x||Q)]}
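The chain of equalities can be sanity-checked numerically. A sketch in Python, reusing the example sequence x^5 = 01100 and the illustrative pmf Q = (1/3, 2/3) from the coin examples:

```python
from math import log2, prod

def entropy(P):
    """Entropy in bits of a pmf given as a tuple."""
    return -sum(p * log2(p) for p in P if p > 0)

def kl(P, Q):
    """KL divergence D(P||Q) in bits."""
    return sum(p * log2(p / q) for p, q in zip(P, Q) if p > 0)

x = [0, 1, 1, 0, 0]                 # the slide's example sequence
Q = (1 / 3, 2 / 3)                  # illustrative DMS pmf over {0, 1}
n = len(x)
Px = (x.count(0) / n, x.count(1) / n)

direct = prod(Q[s] for s in x)                        # product of symbol probabilities
via_types = 2 ** (-n * (entropy(Px) + kl(Px, Q)))     # 2^{-n(H(P_x)+D(P_x||Q))}
print(abs(direct - via_types) < 1e-12)                # the two expressions agree
```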
Examples
• Probability of a specific sequence with n/2 heads and n/2 tails
– Fair coin
– Biased coin with P(H) = 1/3, P(T) = 2/3
• Same as above with n/3 heads
– Fair coin
– Biased coin with P(H) = 1/3, P(T) = 2/3
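These examples can be evaluated with the formula of the previous theorem; a hedged sketch (n = 12 is an arbitrary length divisible by 2 and 3, and the variable names are illustrative):

```python
from math import log2

def entropy2(p):
    """Binary entropy in bits."""
    return 0.0 if p in (0.0, 1.0) else -p * log2(p) - (1 - p) * log2(1 - p)

def kl2(p, q):
    """Binary KL divergence D((p,1-p) || (q,1-q)) in bits."""
    d = 0.0
    if p > 0: d += p * log2(p / q)
    if p < 1: d += (1 - p) * log2((1 - p) / (1 - q))
    return d

n = 12                                    # hypothetical length, divisible by 2 and 3
for heads_frac in (1 / 2, 1 / 3):         # type of the sequence: fraction of heads
    for q_heads in (1 / 2, 1 / 3):        # fair coin, then biased coin with P(H) = 1/3
        exponent = entropy2(heads_frac) + kl2(heads_frac, q_heads)
        print(f"P_x(H)={heads_frac:.3f}, Q(H)={q_heads:.3f}: "
              f"Q(x) = 2^(-{n}*{exponent:.4f})")
```

When the type matches the coin the divergence term vanishes and Q(x) = 2^{-nH(P_x)}.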
Size of a type class
Theorem
The size of a type class T(P) can be bounded as follows:

(1/(n+1)^{|X|}) 2^{nH(P)} ≤ |T(P)| ≤ 2^{nH(P)}

Remember
The size of a type class grows exponentially, at a rate equal to the entropy of the type.
Size of a type class
Proof. (upper bound)
Given P ∈ Pn consider the probability that a source with pmf P
emits a sequence in T (P). We have
1 ≥ Σ_{x∈T(P)} P(x) = Σ_{x∈T(P)} 2^{-nH(P)} = |T(P)| 2^{-nH(P)}

(each x ∈ T(P) has probability P(x) = 2^{-nH(P)} by the previous theorem, with Q = P), hence

|T(P)| ≤ 2^{nH(P)}
Size of a type class
Proof. (lower bound)
|T(P)| = ( n choose nP(a_1), ..., nP(a_{|X|}) ) = n! / (n_1! n_2! ... n_{|X|}!),  with n_i = nP(a_i)

Stirling approximation:

(n/e)^n ≤ n! ≤ n (n/e)^n

Hence

|T(P)| ≥ (n/e)^n / [ n_1 (n_1/e)^{n_1} · ... · n_{|X|} (n_{|X|}/e)^{n_{|X|}} ]

and, after some algebra,

|T(P)| ≥ (1/(n+1)^{|X|}) 2^{nH(P)}
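For a binary type the class size is exactly a binomial coefficient, so the sandwich bound can be checked directly; a quick Python sketch (n = 20, k = 7 are arbitrary choices):

```python
from math import comb, log2

def entropy(counts):
    """Entropy in bits of the empirical pmf induced by integer counts."""
    n = sum(counts)
    return -sum(k / n * log2(k / n) for k in counts if k > 0)

# Binary type P = (k/n, (n-k)/n): the exact class size is C(n, k)
n, k = 20, 7
size = comb(n, k)
H = entropy((k, n - k))
lower = 2 ** (n * H) / (n + 1) ** 2      # (1/(n+1)^|X|) 2^{nH(P)}, with |X| = 2
upper = 2 ** (n * H)
print(lower <= size <= upper)            # True: the sandwich bound holds
```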
Probability of a type class
Theorem
The probability that a DMS with pmf Q emits a sequence belonging to T(P) can be bounded as follows:

(1/(n+1)^{|X|}) 2^{-nD(P||Q)} ≤ Q(T(P)) ≤ 2^{-nD(P||Q)}

Remember
The larger the KL distance between P and Q, the smaller the probability. If P = Q the exponent is zero, and the probability of observing a sequence whose type is close to Q tends to 1 exponentially fast.
Probability of a type class
Proof.
Q(T(P)) = Σ_{x∈T(P)} Q(x) = Σ_{x∈T(P)} 2^{-n(H(P)+D(P||Q))} = |T(P)| 2^{-n(H(P)+D(P||Q))}

By remembering the bounds on the size of T(P):

(1/(n+1)^{|X|}) 2^{-nD(P||Q)} ≤ Q(T(P)) ≤ 2^{-nD(P||Q)}
In summary
|P_n| ≤ (n+1)^{|X|}

Q(x) = 2^{-n[D(P_x||Q) + H(P_x)]}

|T(P)| ≈ 2^{nH(P)}

Q(T(P)) ≈ 2^{-nD(P||Q)}
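The last estimate can be tested numerically for a binary DMS; a sketch (Q(0) = 0.3, n = 25 and the values of k are illustrative):

```python
from math import comb, log2

def kl2(p, q):
    """Binary KL divergence D((p,1-p) || (q,1-q)) in bits."""
    d = 0.0
    if p > 0: d += p * log2(p / q)
    if p < 1: d += (1 - p) * log2((1 - p) / (1 - q))
    return d

# Hypothetical DMS over {0,1} with Q(0) = 0.3; P is the type with k zeros out of n
n, q0 = 25, 0.3
checks = []
for k in (5, 8, 12):
    exact = comb(n, k) * q0 ** k * (1 - q0) ** (n - k)   # Q(T(P)) computed exactly
    approx = 2 ** (-n * kl2(k / n, q0))                  # first-order estimate 2^{-nD(P||Q)}
    checks.append(exact <= approx <= exact * (n + 1) ** 2)
print(checks)   # [True, True, True]: Q(T(P)) is sandwiched as the theorem states
```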
Information Theory and Statistics
Law of large numbers
The law of large numbers provides the link between Information
Theory and Statistics.
The weak form of the LLN states that
Given a sequence of n iid random variables X_i, let

X̄ = (1/n) Σ_{i=1}^n X_i

Then, ∀ε > 0:

lim_{n→∞} Pr{ |X̄ - μ_X| > ε } = 0

The standard proof is based on the Chebyshev inequality.
The LLN can be easily extended to relative frequencies and probabilities (for discrete random variables).
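The statement can be illustrated with a small Monte Carlo experiment; a sketch (Bernoulli(0.3) samples, the threshold ε = 0.05 and the trial counts are illustrative choices):

```python
import random

random.seed(0)
mu, eps, trials = 0.3, 0.05, 1000   # Bernoulli(0.3) samples; eps = 0.05 is arbitrary
freqs = []
for n in (10, 100, 2000):
    # Estimate Pr{|sample mean - mu| > eps} over many independent runs
    bad = sum(
        abs(sum(random.random() < mu for _ in range(n)) / n - mu) > eps
        for _ in range(trials)
    )
    freqs.append(bad / trials)
print(freqs)   # the empirical probability shrinks as n grows
```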
Law of large numbers (IT perspective)
Q(T(P)) ≈ 2^{-nD(P||Q)}: when n grows, the only type classes with a non-negligible probability are those of types close to Q.

Theorem (law of large numbers)
Let T_Q^ε = { x^n : D(P_{x^n} || Q) ≤ ε }. Then:

Pr(x^n ∉ T_Q^ε) = Σ_{P: D(P||Q)>ε} Q(T(P))
               ≤ Σ_{P: D(P||Q)>ε} 2^{-nD(P||Q)}
               ≤ Σ_{P: D(P||Q)>ε} 2^{-nε}
               ≤ (n+1)^{|X|} 2^{-nε}
               = 2^{-n(ε - |X| log(n+1)/n)}

which tends to 0 when n tends to infinity.
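For a binary source, Pr{x^n ∉ T_Q^ε} can be computed exactly by summing Q(T(P)) over the atypical types; a sketch (a fair source and ε = 0.05 are illustrative choices):

```python
from math import comb, log2

def kl2(p, q):
    """Binary KL divergence D((p,1-p) || (q,1-q)) in bits."""
    d = 0.0
    if p > 0: d += p * log2(p / q)
    if p < 1: d += (1 - p) * log2((1 - p) / (1 - q))
    return d

q0, eps = 0.5, 0.05          # hypothetical fair source, divergence threshold
probs = []
for n in (20, 80, 320):
    # Pr{x^n not in T_Q^eps}: sum Q(T(P)) over types P with D(P||Q) > eps
    # For q0 = 0.5 every sequence has probability 2^{-n}, so Q(T(P)) = C(n,k) q0^n
    p_out = sum(comb(n, k) * q0 ** n
                for k in range(n + 1) if kl2(k / n, q0) > eps)
    probs.append(p_out)
print(probs)                 # decreasing toward 0 as n grows
```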
Source coding (achievability)
Source coding theorem (Shannon ’48)
Given a DMS with pmf Q, any rate R such that

R = H(Q) + ε

is achievable (for any ε > 0).

Idea: code sequences of increasing length n. Code efficiently only the sequences whose type is close to Q, since the others will (almost) never occur. To do that we need only about nH(Q) bits.
Source coding: rigorous proof
Choose a small ε and define

T_Q^ε = { x^n : D(P_{x^n} || Q) ≤ ε }

By the continuity of D:

d(P_{x^n}, Q) ≤ ε',  where ε' → 0 as ε → 0

By the continuity of H:

H(P_{x^n}) ≤ H(Q) + ε'',  where ε'' → 0 as ε' → 0

1. Code the sequences in T_Q^ε by counting them within T_Q^ε
2. Code the sequences not in T_Q^ε by counting them within X^n
Source coding: rigorous proof
The average number of bits is

L ≤ Pr{T_Q^ε} [nH(Q) + nε'' + |X| log(n+1)] + (1 - Pr{T_Q^ε}) n log|X|

so that, with δ ≥ 1 - Pr{T_Q^ε},

L/n ≤ H(Q) + ε'' + |X| log(n+1)/n + δ log|X|

The excess rate over H(Q) can be made arbitrarily small by increasing n and by properly choosing ε and δ.
Universal source coding
What if Q is not known?
The surprising result is that we can still code at any rate larger than the entropy.
Observe the sequence of emitted symbols to estimate Q, then transmit information about the type and the index of the sequence within the type class.
Universal source coding (rigorous proof)
Choose an arbitrarily small ε and let T_Q^ε = { x^n : D(P_{x^n} || Q) ≤ ε }.
Given a sequence x^n, use |X| log(n+1) bits to indicate its type and nH(P_{x^n}) bits to index x^n within the type class.
The average number of bits per symbol is:

|X| log(n+1)/n + Σ_{x^n ∉ T_Q^ε} Q(x^n) H(P_{x^n}) + Σ_{x^n ∈ T_Q^ε} Q(x^n) H(P_{x^n})
≤ |X| log(n+1)/n + Q(x^n ∉ T_Q^ε) log|X| + Q(x^n ∈ T_Q^ε) [H(Q) + δ] ≤ H(Q) + δ'
Being ε and δ (and hence δ’) arbitrarily small, any rate larger than
H(Q) can be obtained.
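The two-stage scheme can be sketched for a binary alphabet, where the n+1 possible types need only ceil(log2(n+1)) bits and the index within the class comes from standard lexicographic ranking (the function name and the bit-count rounding are implementation choices, not from the slides):

```python
from math import comb, ceil, log2

def encode_two_stage(x):
    """Sketch of a two-stage universal code for a binary string:
    first describe the type (count of ones), then the index within the type class."""
    n, k = len(x), x.count('1')
    type_bits = ceil(log2(n + 1))                 # which of the n+1 binary types
    # Lexicographic rank of x among the C(n, k) strings with k ones
    rank, ones_left = 0, k
    for i, c in enumerate(x):
        if c == '1':
            rank += comb(n - 1 - i, ones_left)    # strings with '0' here come first
            ones_left -= 1
    index_bits = ceil(log2(comb(n, k))) if comb(n, k) > 1 else 0
    return type_bits + index_bits, rank

bits, rank = encode_two_stage('01100')
print(bits, rank)    # 7 5: 3 type bits + 4 index bits; rank 5 of 10 in T(P_{x^5})
```

The decoder recovers k from the type bits and inverts the ranking, with no knowledge of Q.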
Channel coding
The method of types can be used to prove many other
results in IT including the channel coding theorem
Outside the scope of this course
References
1. T. M. Cover and J. A. Thomas, "Elements of Information Theory", Wiley.
2. I. Csiszár, "The method of types", IEEE Trans. Inf. Theory, vol. 44, no. 6, pp. 2505-2523, Oct. 1998.
3. I. Csiszár and P. C. Shields, "Information Theory and Statistics: a Tutorial", Foundations and Trends in Commun. and Inf. Theory, NOW Publishers Inc., 2004.