ETH Zürich
Dept. Computer Science
Spring Semester 2015
Information theory
Exercise 4 (Solution)
1 April 2015
4.1 Limit vs AEP
Let $X_1, X_2, X_3, \dots, X_n, \dots$ be an infinite sequence of i.i.d. random variables, each with probability distribution $p(x)$, $x \in \mathcal{X}$. That is, $\Pr(X_i = x) = p(x)$ and $\Pr(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n) = \prod_{i=1}^{n} p(x_i)$.
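a) Find
\[ \lim_{n\to\infty} p(X_1, \dots, X_n)^{1/n}. \]
Hint: use the AEP.
b) Find
\[ \lim_{n\to\infty} \frac{1}{n} \log p(X_1, \dots, X_n). \]
Recap: The weak law of large numbers states that the sample average of i.i.d. random variables converges in probability to the expected value: for every $\varepsilon > 0$,
\[ \lim_{n\to\infty} \Pr\left( \left| \frac{1}{n} \sum_{i=1}^{n} X_i - E_{p(x)}[X] \right| > \varepsilon \right) = 0. \]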
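Since $X$ is a random variable, $\log_2 p(X)$ is also a random variable. Applying the weak law to the i.i.d. variables $\log_2 p(X_i)$, and writing $X^n = (X_1, \dots, X_n)$, we therefore have:
\[ \lim_{n\to\infty} \Pr\left( \left| \frac{1}{n} \underbrace{\sum_{i=1}^{n} \log_2 p(X_i)}_{=\, \log_2 \prod_{i=1}^{n} p(X_i) \,=\, \log_2 p(X^n)} - E_{p(X)}[\log_2 p(X)] \right| > \varepsilon \right) = 0. \]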
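Therefore, since $E_{p(X)}[\log_2 p(X)] = -H(X)$:
\[ \lim_{n\to\infty} \Pr\left( \left| -\frac{1}{n} \log_2 p(X^n) - H(X) \right| > \varepsilon \right) = 0, \]
or equivalently, which answers (b):
\[ \frac{1}{n} \log_2 p(X^n) \xrightarrow{p} -H(X) \quad \text{when } n \to \infty, \]
or equivalently, which answers (a):
\[ p(X^n)^{1/n} \xrightarrow{p} 2^{-H(X)} \quad \text{when } n \to \infty. \]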
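The convergence in (a) can be checked numerically. The following is a minimal sketch, assuming a small example distribution `p` that is not part of the exercise; it draws i.i.d. samples and compares $p(X^n)^{1/n}$ with $2^{-H(X)}$. The helper names are hypothetical.

```python
import math
import random

def entropy_bits(p):
    """Shannon entropy H(X) in bits of a distribution given as a list of probabilities."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def nth_root_of_sequence_probability(p, n, rng):
    """Draw X_1..X_n i.i.d. from p and return p(X_1,...,X_n)^(1/n).

    The product is accumulated in the log domain to avoid floating-point underflow.
    """
    symbols = rng.choices(range(len(p)), weights=p, k=n)
    log2_prob = sum(math.log2(p[s]) for s in symbols)
    return 2.0 ** (log2_prob / n)

p = [0.5, 0.25, 0.125, 0.125]        # assumed example distribution, H(X) = 1.75 bits
rng = random.Random(0)               # fixed seed so runs are repeatable
target = 2.0 ** (-entropy_bits(p))   # the claimed limit 2^{-H(X)}

for n in (10, 100, 10_000, 100_000):
    print(n, nth_root_of_sequence_probability(p, n, rng), target)
```

For growing `n` the printed estimate should settle near `target`, in line with convergence in probability; a single run is of course only an illustration, not a proof.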
c) Let $f(x)$ be a function from $\mathcal{X}$ to the interval $(0, 1]$. Find the limit
\[ \lim_{n\to\infty} \left[ \prod_{i=1}^{n} f(X_i) \right]^{1/n}. \]
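The function $f$ is the mapping $f : \mathcal{X} \to (0, 1]$, so $\log_2 f(X)$ is a well-defined random variable. Since, by the weak law of large numbers,
\[ \lim_{n\to\infty} \Pr\left( \left| \frac{1}{n} \sum_{i=1}^{n} \log_2 f(X_i) - E_{p(X)}[\log_2 f(X)] \right| > \varepsilon \right) = 0, \]
we have:
\[ \frac{1}{n} \log_2 \prod_{i=1}^{n} f(X_i) \xrightarrow{p} E_{p(X)}[\log_2 f(X)], \]
or equivalently
\[ \left( \prod_{i=1}^{n} f(X_i) \right)^{1/n} \xrightarrow{p} 2^{E_{p(X)}[\log_2 f(X)]}. \]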
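As a numerical illustration of part (c), the sketch below compares the geometric mean of $f(X_1), \dots, f(X_n)$ with $2^{E[\log_2 f(X)]}$. The distribution `p` and the values of `f` are assumptions chosen for the example, and the function names are hypothetical.

```python
import math
import random

def geometric_mean_of_f(p, f, n, rng):
    """Sample X_1..X_n i.i.d. from p and return (prod_i f(X_i))^(1/n), computed via logs."""
    symbols = rng.choices(range(len(p)), weights=p, k=n)
    mean_log2 = sum(math.log2(f[s]) for s in symbols) / n
    return 2.0 ** mean_log2

def limit_value(p, f):
    """The claimed limit 2^{E[log2 f(X)]}."""
    return 2.0 ** sum(q * math.log2(v) for q, v in zip(p, f))

p = [0.5, 0.3, 0.2]      # assumed example distribution on a 3-letter alphabet
f = [1.0, 0.5, 0.25]     # assumed values f(x) in (0, 1]
rng = random.Random(0)

for n in (10, 1_000, 100_000):
    print(n, geometric_mean_of_f(p, f, n, rng), limit_value(p, f))
```

Here $E[\log_2 f(X)] = -0.7$, so the estimates should approach $2^{-0.7}$ as `n` grows.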
4.2 Errorless coding with AEP
Let $U_1, U_2, \dots, U_n$ be independent, identically distributed random variables that take their values in the set $\mathcal{U} = \{1, \dots, k\}$, with probability distribution $p(i)$, $i \in \mathcal{U}$. Recall that in class, we divided all sequences in $\mathcal{U}^n$ into two sets: the typical set $A_{n,\varepsilon}$ and its complement, see Figure 1. We talked about a source coding scheme that only coded the typical sequences in $A_{n,\varepsilon}$ and declared an error for sequences which are not in $A_{n,\varepsilon}$. This scheme has a vanishing error probability (by making $\varepsilon$ small and $n$ large).
[Figure 1: Typical sets and source coding. The sequences in $\mathcal{U}^n$ are split into the typical set $A_{n,\varepsilon}$ and the atypical sequences.]
a) Fix $n$ and $\varepsilon$. Show that $n(H(U) + \varepsilon)$ bits are enough to encode the typical sequences in $A_{n,\varepsilon}$ in a uniquely decodable way. We call this code $C_1$ (note that $C_1$ is a fixed-length code for the typical sequences inside $A_{n,\varepsilon}$).
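Each $U_i$ takes values in $\mathcal{U} = \{1, \dots, k\}$. The set of typical sequences (written $A_{n,\varepsilon}$ in the problem statement) is defined as:
\[ A_\varepsilon^n = \left\{ u^n \in \mathcal{U}^n : p(u^n) \in \left[ 2^{-n(H(U)+\varepsilon)},\; 2^{-n(H(U)-\varepsilon)} \right] \right\}. \]
In order to determine the number of bits, we first bound the number of elements in the typical set, $|A_\varepsilon^n|$:
\[ 1 = \sum_{u^n \in \mathcal{U}^n} p(u^n) \overset{(a)}{\ge} \sum_{u^n \in A_\varepsilon^n} p(u^n) \overset{(b)}{\ge} \sum_{u^n \in A_\varepsilon^n} 2^{-n(H(U)+\varepsilon)} = 2^{-n(H(U)+\varepsilon)} \, |A_\varepsilon^n|, \]
where in (a) we sum over a smaller set (i.e. $A_\varepsilon^n \subseteq \mathcal{U}^n$) and in (b) we use the definition of the typical set. Therefore, the typical set contains at most $2^{n(H(U)+\varepsilon)}$ elements (i.e. $|A_\varepsilon^n| \le 2^{n(H(U)+\varepsilon)}$). Giving each typical sequence its own fixed-length binary index, $n(H(U) + \varepsilon)$ bits (rounded up to an integer) are enough to encode $A_\varepsilon^n$ in a uniquely decodable way.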
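The size bound on the typical set can be checked by brute force for a small block length. This is a sketch only, assuming an example source distribution and parameters that are not part of the exercise; it enumerates all $k^n$ sequences, so it is only practical for small `n`.

```python
import itertools
import math

def typical_set(p, n, eps):
    """Enumerate all sequences of length n and return (typical sequences, H in bits).

    A sequence is typical if 2^{-n(H+eps)} <= p(u^n) <= 2^{-n(H-eps)};
    the test is done on log2 p(u^n) to avoid underflow.
    """
    H = -sum(q * math.log2(q) for q in p if q > 0)
    lo, hi = -n * (H + eps), -n * (H - eps)
    members = []
    for seq in itertools.product(range(len(p)), repeat=n):
        log2_prob = sum(math.log2(p[s]) for s in seq)
        if lo <= log2_prob <= hi:
            members.append(seq)
    return members, H

p = [0.7, 0.2, 0.1]   # assumed example source with k = 3 symbols
n, eps = 8, 0.2       # assumed block length and tolerance
members, H = typical_set(p, n, eps)

print("typical sequences:", len(members), "out of", len(p) ** n)
print("upper bound 2^{n(H+eps)}:", 2 ** (n * (H + eps)))
print("bits for a fixed-length index:", math.ceil(math.log2(len(members))))
```

The count of typical sequences should never exceed the printed upper bound, and the index length is the integer number of bits a fixed-length code $C_1$ would actually use.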
b) We refer to the sequences which are not inside $A_{n,\varepsilon}$ as the atypical sequences. Show that $n\lceil \log_2(k) \rceil$ bits are enough to encode the atypical sequences in a uniquely decodable way. We call this fixed-length code $C_2$. (Hint: how many sequences can there be in total?)
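There are $|\mathcal{U}^n| = k^n$ sequences in total, so $n\lceil \log_2 k \rceil$ bits are enough to encode every sequence of the source (for example, using $\lceil \log_2 k \rceil$ bits per symbol). The atypical sequences are defined as:
\[ B_\varepsilon^n = \mathcal{U}^n \setminus A_\varepsilon^n. \]
$n\lceil \log_2 k \rceil$ bits therefore suffice to encode $B_\varepsilon^n$, since $|B_\varepsilon^n| \le |\mathcal{U}^n| = k^n \le 2^{n\lceil \log_2 k \rceil}$.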
In this exercise, we design a code for $(U_1, \dots, U_n)$ which is errorless (i.e., it has zero error probability). The scheme is as follows: we add an initial “flag” bit to indicate whether the sequence $(u_1, u_2, \dots, u_n)$ is in $A_{n,\varepsilon}$ or not. We then use the two fixed-length codes (introduced in parts (a) and (b)) to encode sequences in the typical set and the atypical set separately. More precisely, let $C_1$, $C_2$ be the fixed-length codes for typical and atypical sequences; then we use $(0, C_1)$ to encode the typical sequences, and $(1, C_2)$ to encode the rest.
c) Explain why the code is uniquely decodable.
d) Let $\bar{L}$ be the average codeword length for this scheme. Show that
\[ \bar{L} \le n(H(U) + \varepsilon)\, p(A_{n,\varepsilon}) + n\lceil \log(k) \rceil \, (1 - p(A_{n,\varepsilon})) + 1. \]
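The expected code length is given by:
\[ \bar{L} = \underbrace{\sum_{u^n \in A_\varepsilon^n} p(u^n)\, n(H(U) + \varepsilon)}_{\text{expected \# bits for typical set}} + \underbrace{\sum_{u^n \in B_\varepsilon^n} p(u^n)\, L_{B_\varepsilon^n}}_{\text{expected \# bits for atypical set}} + \underbrace{1}_{\text{flag bit}} \]
\[ \overset{(a)}{\le} n(H(U) + \varepsilon)\, p(A_\varepsilon^n) + n\lceil \log_2 k \rceil \, (1 - p(A_\varepsilon^n)) + 1, \qquad (1) \]
where $L_{B_\varepsilon^n}$ denotes the length of the $C_2$ codewords, and in (a) we use the result from 4.2b, $L_{B_\varepsilon^n} \le n\lceil \log_2 k \rceil$, together with $p(B_\varepsilon^n) = 1 - p(A_\varepsilon^n)$.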
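To make bound (1) concrete, the sketch below builds the flag-bit scheme for a small assumed source and compares its exact expected length with the right-hand side of (1). The source, `n`, and `eps` are illustrative assumptions. Because real codewords have integer lengths, the expected length may exceed the bound as written by less than one bit, which is why the comparison allows one extra bit.

```python
import itertools
import math

def two_part_code_stats(p, n, eps):
    """Return (exact expected length, right-hand side of bound (1)) for the flag-bit scheme.

    Uses brute-force enumeration of all k**n sequences, so only suitable for small n.
    """
    k = len(p)
    H = -sum(q * math.log2(q) for q in p if q > 0)
    lo, hi = -n * (H + eps), -n * (H - eps)   # typical range for log2 p(u^n)

    typical_count = 0
    typical_mass = 0.0
    for seq in itertools.product(range(k), repeat=n):
        log2_prob = sum(math.log2(p[s]) for s in seq)
        if lo <= log2_prob <= hi:
            typical_count += 1
            typical_mass += 2.0 ** log2_prob

    # Integer codeword lengths: an index into the typical set (C1),
    # or n symbols at ceil(log2 k) bits each (C2).
    bits_typical = math.ceil(math.log2(typical_count)) if typical_count > 1 else 0
    bits_atypical = n * math.ceil(math.log2(k))

    expected_len = 1 + bits_typical * typical_mass + bits_atypical * (1 - typical_mass)
    bound = n * (H + eps) * typical_mass + bits_atypical * (1 - typical_mass) + 1
    return expected_len, bound

expected_len, bound = two_part_code_stats([0.7, 0.2, 0.1], n=8, eps=0.2)
print("expected length:", expected_len)
print("bound (1):      ", bound)
```

For such a small `n` the atypical set still carries noticeable probability, so the expected length per symbol stays well above $H(U)$; the bound only becomes tight for large `n`.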
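e) Show that
\[ \lim_{\varepsilon\to 0} \lim_{n\to\infty} \frac{\bar{L}}{n} = H(U). \]
As $n \to \infty$, the atypical sequences contain almost no probability mass: by the AEP, $p(B_\varepsilon^n) \to 0$ for every fixed $\varepsilon > 0$. In the limit, only the typical sequences contribute to the expected length. Dividing equation (1) by $n$ gives
\[ \frac{\bar{L}}{n} \le (H(U) + \varepsilon)\, p(A_\varepsilon^n) + \lceil \log_2 k \rceil \, p(B_\varepsilon^n) + \frac{1}{n} \;\longrightarrow\; H(U) + \varepsilon \quad \text{when } n \to \infty, \]
and letting $\varepsilon \to 0$ shows that the double limit is at most $H(U)$. Since the expected length of any uniquely decodable code for $(U_1, \dots, U_n)$ is at least $H(U_1, \dots, U_n) = nH(U)$, the limit equals $H(U)$; that is, $\bar{L} \approx nH(U)$ when $n \to \infty$ and $\varepsilon \to 0$.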
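The behaviour of the bound for large $n$ can be illustrated for a binary source, where the probability of the typical set can be summed exactly over the number of ones in the sequence. This is a sketch under assumed parameters ($\Pr(U = 1) = 0.2$, $\varepsilon = 0.05$), and the function name is hypothetical; it evaluates the right-hand side of (1) divided by $n$.

```python
import math

def rate_bound_binary(p1, n, eps):
    """Return (right-hand side of (1) divided by n, H(U)) for a binary source with Pr(U=1) = p1."""
    p0 = 1.0 - p1
    H = -(p1 * math.log2(p1) + p0 * math.log2(p0))
    typical_mass = 0.0
    for ones in range(n + 1):
        # -(1/n) log2 p(u^n) depends only on the number of ones in u^n.
        per_symbol_info = -(ones * math.log2(p1) + (n - ones) * math.log2(p0)) / n
        if abs(per_symbol_info - H) <= eps:
            # Probability of all sequences with this many ones, computed in the log domain.
            log_mass = (math.lgamma(n + 1) - math.lgamma(ones + 1) - math.lgamma(n - ones + 1)
                        + ones * math.log(p1) + (n - ones) * math.log(p0))
            typical_mass += math.exp(log_mass)
    typical_mass = min(typical_mass, 1.0)       # guard against rounding slightly above 1
    bits_per_symbol = math.ceil(math.log2(2))   # k = 2 symbols
    rate = (H + eps) * typical_mass + bits_per_symbol * (1.0 - typical_mass) + 1.0 / n
    return rate, H

for n in (10, 100, 1_000, 10_000):
    rate, H = rate_bound_binary(0.2, n, eps=0.05)
    print(n, round(rate, 4), "H(U) + eps =", round(H + 0.05, 4))
```

As `n` grows the printed rate should approach $H(U) + \varepsilon$; repeating the experiment with smaller `eps` (and correspondingly larger `n`) moves it toward $H(U)$.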