Lecture 7:
Chapter 7. Sums of Random
Variables and Long-Term Averages
ELEC206 Probability and Random Processes, Fall 2014
Gil-Jin Jang
[email protected]
School of EE, KNU
Overview
7.1 Sums of Random Variables (7.1.1)
7.2 The Sample Mean and the Laws of Large Numbers
    (Review of 4.6: The Markov and Chebyshev Inequalities∗)
7.3 The Central Limit Theorem (7.3.1)
Many problems involve counting the number of event occurrences,
measuring cumulative effects, or computing arithmetic averages in a
series of measurements.
These problems can be reduced to the problem of finding, exactly or
approximately, the distribution of a random variable that consists of
the sum of n independent, identically distributed random variables.
We investigate sums of random variables and their properties as
n becomes large.
7.1 Sums of Random Variables
7.1.1 Mean and Variance of Sums of Random Variables
Let Sn = X1 + X2 + · · · + Xn. Regardless of statistical dependence, the expected value of a sum of n random variables is equal to the sum of the expected values (proof: see Example 5.24):
E[Sn] = E[X1 + X2 + · · · + Xn] = E[X1] + E[X2] + · · · + E[Xn]
Example 7.1
Find the variance of Z = X + Y.
VAR[Z] = E[(Z − mZ)²] = E[{(X + Y) − (mX + mY)}²]
       = E[{(X − mX) + (Y − mY)}²]
       = E[(X − mX)²] + 2E[(X − mX)(Y − mY)] + E[(Y − mY)²]
       = VAR[X] + VAR[Y] + 2 COV(X, Y)
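As a sanity check on Example 7.1 (not part of the original slides), a short Monte Carlo sketch in Python; the correlated pair (X, Y) below is an arbitrary construction chosen only so that COV(X, Y) ≠ 0:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000

# Correlated pair: X and Y share the common component W,
# so COV(X, Y) > 0. This construction is purely illustrative.
W = rng.normal(0.0, 1.0, N)
X = W + rng.normal(0.0, 1.0, N)
Y = 2.0 * W + rng.normal(0.0, 1.0, N)
Z = X + Y

lhs = Z.var()                                        # VAR[Z] directly
rhs = X.var() + Y.var() + 2.0 * np.cov(X, Y)[0, 1]   # the Example 7.1 formula
print(lhs, rhs)  # agree up to sampling noise (both ≈ 11 here)
```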
Generalisation to many RVs
Variance of the sum of n RVs: “NOT” equal to the sum of the individual variances:
VAR[Sn] = E[ Σ_{j=1}^{n} (Xj − E[Xj]) Σ_{k=1}^{n} (Xk − E[Xk]) ]
        = Σ_{j=1}^{n} Σ_{k=1}^{n} E[(Xj − E[Xj])(Xk − E[Xk])] = Σ_{j=1}^{n} Σ_{k=1}^{n} COV(Xj, Xk)
        = Σ_{k=1}^{n} VAR[Xk] + Σ_{j=1}^{n} Σ_{k=1, k≠j}^{n} COV(Xj, Xk)
Variance of the sum of n “INDEPENDENT” RVs: equal to the sum of the variances:
VAR[Sn] = VAR[X1] + · · · + VAR[Xn]
Example 7.2 Sum of iid RVs
If X1, X2, . . ., Xn are iid (independent, identically distributed) with mean µ and variance σ², then
E[Sn] = E[X1] + · · · + E[Xn] = nµ
VAR[Sn] = n VAR[Xj] = nσ²
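A quick numerical check of Example 7.2, sketched in Python; the Gaussian distribution and the values µ = 3, σ = 2 are illustrative choices, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 10, 200_000
mu, sigma = 3.0, 2.0  # illustrative parameter values

# Each row is one realisation of (X1, ..., Xn); summing rows gives Sn.
X = rng.normal(mu, sigma, size=(trials, n))
Sn = X.sum(axis=1)

print(Sn.mean(), n * mu)       # E[Sn]   ≈ nµ  = 30
print(Sn.var(), n * sigma**2)  # VAR[Sn] ≈ nσ² = 40
```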
7.2 The Sample Mean and the Laws of Large Numbers
Let X be a RV for which the mean, E[X] = µ, is unknown.
Let X1 , . . ., Xn denote n independent, repeated measurements of X;
i.e., Xj ’s are iid RVs with the same pdf as X.
The sample mean is a RV used to estimate the true mean E[X]:
Mn = (1/n) Σ_{j=1}^{n} Xj
We will assess the effectiveness of Mn as an estimator for E[X]
by computing the expected value and variance of Mn , and
investigating the behaviour of Mn as n becomes large.
This is very important in real situations:
“How many measurements are necessary to obtain a reliable
mean?”
“We made n measurements and take the average of them as the
mean value. Can we be confident of our mean?”
Sample Mean
The sample mean is itself a RV, so it will exhibit random variation.
The properties (conditions) of a good “estimator”
1. On the average, it should give the correct value: E[Mn ] = µ
2. Should not vary too much: VAR[Mn ] = E[(Mn − µ)2 ] is small
enough.
The sample mean is an unbiased estimator for µ:
E[Mn] = E[ (1/n) Σ_{j=1}^{n} Xj ] = (1/n) Σ_{j=1}^{n} E[Xj] = (1/n) nµ = µ
since E[Xj] = E[X] = µ, ∀j. The estimator is unbiased: it is centered at the true value.
Variance of Sample Mean
Compute the variance of Mn, whose mean is µ:
VAR[Mn] = E[(Mn − µ)²] = E[(Mn − E[Mn])²]
Mn = (1/n) Σ_{j=1}^{n} Xj = (1/n) Sn, where the Xj’s are iid RVs.
From Example 7.2, VAR[Sn] = n VAR[Xj] = nσ², and hence
VAR[Mn] = (1/n²) VAR[Sn] = nσ²/n² = σ²/n
The variance of the sample mean approaches zero as the number of
samples is increased.
In other words, the probability of the sample mean being close to the true
mean approaches “one” as n becomes very large.
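The 1/n decay is easy to see numerically; a minimal Python sketch, assuming zero-mean Gaussian measurements with an illustrative σ = 2:

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 2.0  # illustrative

for n in (10, 100, 1000):
    # 100,000 independent sample means, each computed from n iid samples
    Mn = rng.normal(0.0, sigma, size=(100_000, n)).mean(axis=1)
    print(n, Mn.var(), sigma**2 / n)  # empirical VAR[Mn] vs σ²/n
```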
4.6 Markov and Chebyshev Inequalities
The mean and variance of a random variable X help obtain bounds for
probabilities of the form P[|X| ≥ t].
Markov inequality: for X nonnegative,
P[X ≥ a] ≤ E[X]/a
proof:
E[X] = ∫_{0}^{a} t fX(t) dt + ∫_{a}^{∞} t fX(t) dt ≥ ∫_{a}^{∞} t fX(t) dt ≥ ∫_{a}^{∞} a fX(t) dt = a P[X ≥ a]
Chebyshev inequality: suppose that the mean is E[X] = m and the variance is VAR[X] = σ². Then
P[|X − m| ≥ a] ≤ σ²/a²
proof: Let D² = (X − m)² be the squared deviation from the mean. Applying the Markov
inequality to D², and noting that P[D² ≥ a²] = P[|X − m| ≥ a],
P[D² ≥ a²] ≤ E[(X − m)²]/a² = σ²/a²
In the case σ² = 0, the inequality implies that P[X = m] = 1.
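A small empirical comparison, sketched in Python with an illustrative standard Gaussian X (the inequality itself is distribution-free):

```python
import numpy as np

rng = np.random.default_rng(3)
m, sigma = 0.0, 1.0
X = rng.normal(m, sigma, 1_000_000)

for a in (1.0, 2.0, 3.0):
    empirical = np.mean(np.abs(X - m) >= a)  # P[|X - m| >= a], estimated
    bound = sigma**2 / a**2                  # Chebyshev upper bound
    print(a, empirical, bound)
```

Because the bound holds for every distribution with finite variance, it is quite loose in this Gaussian case: for a = 2 the true tail probability is about 0.046 versus the bound of 0.25.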
Applying Chebyshev Inequality
Keep in mind that E[Mn] = µ and VAR[Mn] = σ²/n; then the Chebyshev
inequality can be formulated as
P[|Mn − E[Mn]| ≥ ε] ≤ VAR[Mn]/ε²
⇔ P[|Mn − µ| ≥ ε] ≤ σ²/(nε²)
⇔ P[|Mn − µ| < ε] ≥ 1 − σ²/(nε²)
If the true variance σ² is known, we can select the number of samples n
so that Mn is within ε of the true mean with probability 1 − δ or greater.
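Rearranging σ²/(nε²) ≤ δ gives n ≥ σ²/(δε²). A minimal helper, sketched in Python (the function name is ours, not from the slides); it reproduces Example 7.9 on the next slide:

```python
import math

def samples_needed(sigma2: float, eps: float, delta: float) -> int:
    """Smallest n guaranteeing P[|Mn - mu| >= eps] <= delta via Chebyshev."""
    return math.ceil(sigma2 / (delta * eps**2))

print(samples_needed(sigma2=1.0, eps=1.0, delta=0.01))  # 100
```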
Example 7.9
A voltage of (unknown) constant value is to be measured. Each measurement Xj is the
sum of the desired voltage v (true mean) and a noise voltage Nj of zero mean and standard
deviation of 1 microvolt (µV):
Xj = v + Nj
Assume that the noise voltages are independent RVs. How many measurements are
required so that the probability that Mn is within ε = 1 µV of the true mean is at least 0.99?
Solution
E[Xj] = E[v + Nj] = v + E[Nj] = v
VAR[Xj] = VAR[v + Nj] = VAR[Nj] = 1
For Mn = (1/n) Σ_{j=1}^{n} Xj,
P[|Mn − µX| < ε] ≥ 1 − σX²/(nε²)
⇒ P[|Mn − v| < ε] ≥ 1 − 1/n = 0.99
∴ n = 100
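A simulation of Example 7.9, sketched in Python; the true voltage v = 5 and the Gaussian noise model are illustrative assumptions (the Chebyshev guarantee requires neither):

```python
import numpy as np

rng = np.random.default_rng(4)
v, n, eps, trials = 5.0, 100, 1.0, 100_000  # v = 5 uV is an arbitrary choice

# Each row: n = 100 measurements, each v plus zero-mean unit-variance noise
# (Gaussian here for concreteness; the Chebyshev bound needs no such assumption).
Mn = (v + rng.normal(0.0, 1.0, size=(trials, n))).mean(axis=1)

print(np.mean(np.abs(Mn - v) < eps))  # far above the guaranteed 0.99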
Laws of Large Numbers
Weak Law of Large Numbers: Let X1, X2, . . . be a sequence of iid RVs with
finite mean E[X] = µ. Then for ε > 0, and even if the variance of the
Xj’s does not exist,
lim_{n→∞} P[|Mn − µ| < ε] = 1    (= lim_{n→∞} 1 − σ²/(nε²) when the variance exists)
Strong Law of Large Numbers: Let X1, X2, . . . be a sequence of iid RVs with
finite mean E[X] = µ and finite variance. Then
P[ lim_{n→∞} Mn = µ ] = 1
(proof: beyond sophomore level)
Example 7.10
In order to estimate the probability of an event A, a sequence of Bernoulli
trials is carried out and the relative frequency of A is observed. How large
should n be in order to have a 0.95 probability that the relative frequency is
within 0.01 of p = P [A]?
Solution
Let X = IA be the indicator function of the occurrence of A. Since p = P[A] is the
occurrence probability of the Bernoulli trial, E[X] = E[IA] = p and VAR[X] = p(1 − p).
The sample mean for X is:
Mn = (1/n) Σ_{k=1}^{n} Xk = (1/n) Σ_{k=1}^{n} IA,k = fA(n)
Since Mn is an estimator for E[X] = p, fA(n) is also an estimator for p. Applying the
Chebyshev inequality,
P[|fA(n) − p| ≥ ε] ≤ σX²/(nε²) ≤ 1/(4nε²)    (∵ σX² = VAR[X] = p(1 − p) = −(p − 1/2)² + 1/4 ≤ 1/4)
⇒ P[|fA(n) − p| < ε] ≥ 1 − 1/(4nε²)
Setting 1 − 1/(4nε²) = 0.95 with ε = 0.01 gives n = 1/{(1 − 0.95) · 4ε²} = 50,000.
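A simulation sketch of Example 7.10 in Python; the value p = 0.3 is an arbitrary choice, since the bound was derived without knowing p:

```python
import numpy as np

rng = np.random.default_rng(5)
p, n, eps, trials = 0.3, 50_000, 0.01, 20_000  # p is arbitrary; bound holds for any p

# Relative frequency fA(n) over `trials` repetitions of the whole experiment
fA = rng.binomial(n, p, size=trials) / n
print(np.mean(np.abs(fA - p) < eps))  # comfortably above 0.95
```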
7.3 The Central Limit Theorem
Notion: Let X1, X2, . . . be a sequence of RVs with finite mean µ and finite
variance σ². Regardless of the probability distribution of the Xj, the
sum of the first n RVs, Sn = X1 + · · · + Xn, approaches a Gaussian
RV with mean E[Sn] = nE[Xj] = nµ and variance
VAR[Sn] = nVAR[Xj] = nσ², as n becomes large.
Mathematical formulation: Standardise Sn by Zn = (Sn − nµ)/(σ√n); then
lim_{n→∞} P[Zn ≤ z] = (1/√(2π)) ∫_{−∞}^{z} e^{−x²/2} dx
In terms of Mn = (1/n) Σ_{j=1}^{n} Xj: Zn = √n (Mn − µ)/σ.
Importance of Gaussian RV: The central limit theorem explains why the
Gaussian RV appears in so many diverse applications: real observations
often arise as the combined effect of many random variables.
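A small Python sketch of the theorem, using uniform(0, 1) summands as an illustrative non-Gaussian choice:

```python
import numpy as np
from math import erf, sqrt

rng = np.random.default_rng(6)
n, trials = 30, 200_000

# Sum of n iid uniform(0, 1) RVs: each term has mean 1/2 and variance 1/12.
mu, sigma = 0.5, sqrt(1.0 / 12.0)
Sn = rng.uniform(0.0, 1.0, size=(trials, n)).sum(axis=1)
Zn = (Sn - n * mu) / (sigma * sqrt(n))

Phi = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard Gaussian CDF
for z in (-1.0, 0.0, 1.0, 2.0):
    print(z, np.mean(Zn <= z), Phi(z))  # empirical CDF is close to Phi(z) already at n = 30
```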
Example 7.11
Suppose that orders at a restaurant are iid random variables with
mean µ = $8 and standard deviation σ = $2.
1. Estimate the probability that the first 100 customers spend a total of
more than $840.
S100 = Σ_{i=1}^{100} Xi,   Z100 = (S100 − nµ)/√(nσ²) = (S100 − 100 · 8)/√(100 · 2²) = (S100 − 800)/20
P[S100 > 840] = P[Z100 > (840 − 800)/20] ≈ Q(2) = 2.28 × 10⁻²
2. Estimate the probability that the first 100 customers spend a total of
between $780 and $820.
P[780 ≤ S100 ≤ 820] = P[−1 ≤ Z100 ≤ 1] = Φ(1) − Φ(−1) = 1 − Q(1) − Q(1) = 1 − 2Q(1) ≈ 0.682
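The numbers above can be reproduced with the Gaussian tail function; a short Python sketch using the standard identity Q(x) = (1/2)(1 − erf(x/√2)):

```python
from math import erf, sqrt

def Q(x: float) -> float:
    """Gaussian tail probability Q(x) = 1 - Phi(x), via the error function."""
    return 0.5 * (1.0 - erf(x / sqrt(2.0)))

print(Q(2.0))              # ~0.0228  (part 1)
print(1.0 - 2.0 * Q(1.0))  # ~0.6827  (part 2)
```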
Example 7.12
In Example 7.11, after how many orders can we be 90% sure that the
total spent by all customers is more than $1000?
P[Sn > 1000] ≥ 0.9
⇔ P[Zn > (1000 − 8n)/(2√n)] = Q((1000 − 8n)/(2√n)) ≥ 0.9
⇔ Q(−(1000 − 8n)/(2√n)) ≤ 1 − 0.9 = 0.1
⇔ −(1000 − 8n)/(2√n) ≥ Q⁻¹(0.1) ≈ 1.2815
∴ n ≥ 129
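The same answer can be found by direct search; a short Python sketch reusing the Q(x) identity from the previous example:

```python
from math import erf, sqrt

def Q(x: float) -> float:
    return 0.5 * (1.0 - erf(x / sqrt(2.0)))

# Find the smallest n with P[Sn > 1000] ~ Q((1000 - 8n)/(2*sqrt(n))) >= 0.9
n = 1
while Q((1000.0 - 8.0 * n) / (2.0 * sqrt(n))) < 0.9:
    n += 1
print(n)  # 129
```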