Lesson 6, Chapter 5: Convolutions and The Central Limit Theorem
Michael Akritas
Department of Statistics
The Pennsylvania State University
Outline
1. Law of Large Numbers
2. Convolutions
     Introduction
     Distribution of Sums of Normal Random Variables
3. The Central Limit Theorem
     The DeMoivre-Laplace Theorem
Law of Large Numbers
Theorem (The Law of Large Numbers)
Let X1, . . . , Xn be independent and identically distributed, and let g be a function such that −∞ < E[g(X1)] < ∞. Then

    (1/n) ∑_{i=1}^{n} g(Xi) converges in probability to E[g(X1)].

The LLN guarantees the consistency of averages, i.e., as the sample size increases, the average approximates the population mean more and more accurately.

Limitation: the LLN provides no guidance regarding the quality of the approximation.
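As a quick numerical illustration (an added sketch, not part of the original slides; the Exp(1) population, the sample size 10,000, and the seed are arbitrary choices), the running average of iid draws can be plotted in R and seen to settle near the population mean:

# Running average of iid Exp(1) draws; by the LLN it should settle near E(X) = 1
set.seed(1)                                   # for reproducibility
x <- rexp(10000, rate = 1)                    # 10,000 iid draws
running.mean <- cumsum(x) / (1:10000)
plot(running.mean, type = "l", xlab = "n", ylab = "average of first n draws")
abline(h = 1, col = "red")                    # the population mean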
The Assumption of Finite µ and σ²
The LLN requires that the population have a finite mean. Later we will also use the assumption of finite variance.
Is it possible for the mean of a RV not to exist? Or for a RV to have infinite mean, or finite µ but infinite σ²?
The most famous such abnormal RV is the Cauchy, whose pdf is

    f(x) = 1/(π(1 + x²)),  −∞ < x < ∞.

Its median is zero, but its mean does not exist.
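A small simulation (added here as a sketch, not from the slides) shows why averaging fails for the Cauchy: the running average keeps jumping around instead of settling.

# Running average of iid standard Cauchy draws: it does not converge
set.seed(2)
x <- rcauchy(10000)                           # standard Cauchy draws
plot(cumsum(x) / (1:10000), type = "l",
     xlab = "n", ylab = "average of first n draws")
abline(h = 0, col = "red")                    # the median, for reference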
Example
Let X1 and X2 have PDFs

    f1(x) = 1/x², 1 ≤ x < ∞,   and   f2(x) = 2/x³, 1 ≤ x < ∞,

respectively. Then E(X1) = ∞, E(X2) = 2, and Var(X2) = ∞.

Solution.

    E(X1) = ∫_1^∞ x f1(x) dx = ∫_1^∞ (1/x) dx = ∞,    E(X1²) = ∫_1^∞ x² f1(x) dx = ∫_1^∞ 1 dx = ∞,

    E(X2) = ∫_1^∞ x f2(x) dx = ∫_1^∞ (2/x²) dx = 2,    E(X2²) = ∫_1^∞ x² f2(x) dx = ∫_1^∞ (2/x) dx = ∞.
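The convergent integral E(X2) = 2 can also be checked numerically in R with integrate() (an added sketch; the divergent integrals, of course, cannot be evaluated this way):

# Numerical check of E(X2) for the pdf f2(x) = 2/x^3 on [1, Inf)
f2 <- function(x) 2 / x^3
integrate(function(x) x * f2(x), lower = 1, upper = Inf)   # returns 2, up to numerical error
integrate(f2, lower = 1, upper = Inf)                      # sanity check: the pdf integrates to 1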
Distributions with infinite mean or infinite variance are described as heavy tailed.
The LLN may not hold for heavy-tailed distributions: X̄ may fail to converge (as in the Cauchy case) or may diverge to infinity (as when the mean is infinite). The CLT, which we will see shortly, also fails to hold.
Samples coming from heavy-tailed distributions are much more likely to contain outliers.
If large outliers exist in a data set, it may be better to estimate another quantity, such as the median, which is well defined even for heavy-tailed distributions.
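As a rough illustration (an added sketch, not from the slides; sample sizes and seed are arbitrary), the sample median of Cauchy data is stable across repeated samples while the sample mean is not:

# Compare the sample mean and sample median over repeated Cauchy samples
set.seed(3)
means   <- replicate(1000, mean(rcauchy(100)))     # wildly variable from sample to sample
medians <- replicate(1000, median(rcauchy(100)))   # concentrated near the true median 0
c(sd(means), sd(medians))                          # the first is typically far larger than the second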
Convolutions

Introduction
The distribution of the sum of two independent random variables is called the convolution of the two distributions.
In some cases the convolution is easy to find:
1. If Xi ∼ Bin(ni, p), i = 1, . . . , k, are independent, then X1 + · · · + Xk ∼ Bin(n1 + · · · + nk, p). (Example 5.3.2.)
2. If Xi ∼ Poisson(λi), i = 1, . . . , k, are independent, then X1 + · · · + Xk ∼ Poisson(λ1 + · · · + λk). (Example 5.3.1.)
3. We will also see that if Xi ∼ N(µi, σi²), i = 1, . . . , k, are independent, then X1 + · · · + Xk ∼ N(µ1 + · · · + µk, σ1² + · · · + σk²).
Such nice results are the exception. In general, the distribution of a sum of independent random variables is not easy to determine.
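As an added numerical sketch (the rates 1 and 2 and the seed are arbitrary choices), the Poisson result can be checked by comparing simulated sums of Poisson(1) and Poisson(2) draws with the Poisson(3) pmf:

# Check: Poisson(1) + Poisson(2) should be distributed as Poisson(3)
set.seed(4)
s <- rpois(100000, 1) + rpois(100000, 2)                       # simulated sums
round(rbind(empirical   = as.vector(table(factor(s, levels = 0:8))) / 100000,
            theoretical = dpois(0:8, 3)), 3)                   # the two rows should nearly agree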
Example 5.3.3 derives the convolution of two independent U(0, 1) random variables; see http://stat.psu.edu/~mga/401/fig/convol_2Unif.pdf
See also relation (5.3.3) for the general formula for the density of the convolution of two densities (engineering majors might recognize it!).
Instead of deriving the formula for the convolution of two independent U(0, 1) random variables mathematically, we will use R and random numbers to give a numerical "proof" of it.
The numerical "proof" is based on the fact that a histogram serves as an estimator of the pdf. In fact, histograms are consistent estimators (i.e., as the sample size increases, the estimation error tends to zero).
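Relation (5.3.3) is not reproduced in these notes, but for independent X and Y with densities f and g it is presumably the usual convolution integral f_{X+Y}(z) = ∫ f(x) g(z − x) dx. An added sketch evaluating that integral numerically for two U(0, 1) densities recovers the triangular density on [0, 2]:

# Numerical convolution of two U(0,1) densities via the convolution integral
conv.unif <- function(z) {
  integrate(function(x) dunif(x) * dunif(z - x), lower = 0, upper = 1)$value
}
z <- seq(0, 2, by = 0.1)
round(sapply(z, conv.unif), 3)   # triangular density: rises from 0 to 1 at z = 1, then falls back to 0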
Numerical verification of the result of Example 5.3.3 can be done with the R commands "hist" and "density" (a smooth histogram). Try

m <- matrix(runif(150000), nrow = 3)                    # three rows of 50,000 U(0,1) draws each
plot(density((m[1, ] + m[2, ]) / 2), ylim = c(0, 2))    # smooth histogram of the average of two uniforms
lines(density(m[1, ]), col = "red")                     # a single uniform, for comparison
How about the convolution of three independent uniforms?
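One way to explore the three-uniform question (an added sketch in the same spirit as the two-uniform check above):

# Smooth histogram of the average of three independent U(0,1) draws
m <- matrix(runif(150000), nrow = 3)
plot(density((m[1, ] + m[2, ] + m[3, ]) / 3), ylim = c(0, 3))   # piecewise-quadratic, bell-like shape
lines(density(m[1, ]), col = "red")                             # single uniform, for comparison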
Distribution of Sums of Normal Random Variables
Proposition
Let X1, X2, . . . , Xn be independent with Xi ∼ N(µi, σi²), and set Y = a1X1 + · · · + anXn. Then

    Y ∼ N(µY, σY²),

where µY = a1µ1 + · · · + anµn and σY² = a1²σ1² + · · · + an²σn².

Corollary
Let X1, X2, . . . , Xn be iid N(µ, σ²). Then

    X̄ ∼ N(µ, σ²/n).
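A quick simulation sketch (added here; the choices µ = 10, σ = 3, n = 25, and the seed are arbitrary) agreeing with the Corollary:

# The sample mean of n iid N(mu, sigma^2) draws should be N(mu, sigma^2/n)
set.seed(5)
xbar <- replicate(10000, mean(rnorm(25, mean = 10, sd = 3)))
c(mean(xbar), var(xbar))         # should be close to 10 and 9/25 = 0.36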
Example (Controlling the estimation error via n)
We want to be 95% certain that X̄ does not differ from the population mean by more than 0.3 units. How large should the sample size n be? It is known that the population distribution is normal with σ² = 9.

Remark:
1. This example illustrates the use of a probabilistic result (the previous corollary) to answer a statistical question.
2. The example assumes normality with known variance. In real life both are typically unknown. More details on this will be given in Chapter 8.
Solution: Using the previous corollary we have

    P(|X̄ − µ| < 0.3) = P( |X̄ − µ|/(σ/√n) < 0.3/(σ/√n) ) = P( |Z| < 0.3/(σ/√n) ).

We want to find n so that P(|X̄ − µ| < 0.3) = 0.95. Thus,

    0.3/(σ/√n) = z_0.025,   or   n = (1.96 σ / 0.3)² = 384.16.

Thus, using n = 385 will satisfy the precision objective.
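The same calculation in R (an added sketch):

# Sample size needed so that P(|Xbar - mu| < 0.3) = 0.95 when sigma = 3
sigma <- 3
z <- qnorm(0.975)                 # z_0.025 = 1.96
n <- (z * sigma / 0.3)^2
c(n, ceiling(n))                  # about 384.15, so the required sample size is 385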
The Central Limit Theorem
The complexity of finding the exact distribution of a sum (or average) of k (> 2) RVs increases rapidly with k.
This is why the following is the most important theorem in probability and statistics.

Theorem (The Central Limit Theorem)
Let X1, . . . , Xn be iid with mean µ and variance σ² < ∞. Then, for large enough n (n ≥ 30 for the purposes of this course), X̄ has approximately a normal distribution with mean µ and variance σ²/n, i.e.,

    X̄ ∼ N(µ, σ²/n), approximately.
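An added simulation sketch of the CLT, using a clearly non-normal population (Exp(1), an arbitrary choice) and n = 30:

# Histogram of 10,000 sample means of n = 30 Exp(1) draws, with the CLT normal curve
set.seed(6)
xbar <- replicate(10000, mean(rexp(30, rate = 1)))
hist(xbar, freq = FALSE, breaks = 40, main = "CLT: means of 30 Exp(1) draws")
curve(dnorm(x, mean = 1, sd = 1 / sqrt(30)), add = TRUE, col = "red")   # N(mu, sigma^2/n)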
The Central Limit Theorem: Comments
The CLT does not require the Xi to be continuous RVs.
T = X1 + · · · + Xn has approximately a normal distribution with mean nµ and variance nσ², i.e.,

    T = X1 + · · · + Xn ∼ N(nµ, nσ²), approximately.

The CLT can be stated for more general linear combinations of independent RVs, and also for certain dependent RVs. The present statement suffices for the range of applications of this book.
The CLT explains the central role of the normal distribution in probability and statistics.
Example
Two towers are constructed, each by stacking 30 segments of concrete. The height (in inches) of a randomly selected segment is uniformly distributed in (35.5, 36.5). A roadway can be laid across the two towers provided their heights are within 4 inches of each other. Find the probability that the roadway can be laid.

Solution: The heights of the two towers are

    T1 = X1,1 + · · · + X1,30   and   T2 = X2,1 + · · · + X2,30,

where Xi,j is the height of the jth segment used in the ith tower. Each Xi,j has mean 36 and variance 1/12, so by the CLT,

    T1 ∼ N(30 × 36, 30/12) = N(1080, 2.5)   and   T2 ∼ N(1080, 2.5),   approximately.
Example (Continued)
Since the two heights T1, T2 are independent RVs, their difference D is distributed as

    D = T1 − T2 ∼ N(0, 5), approximately.

Therefore, since √5 ≈ 2.236, the probability that the roadway can be laid is

    P(−4 < D < 4) ≈ P(−4/2.236 < Z < 4/2.236) = Φ(1.79) − Φ(−1.79) = .9633 − .0367 = .9266.
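As an added check in R, the roadway probability can be computed directly from the approximating normal distribution of D:

# P(-4 < D < 4) where D ~ N(0, 5) approximately
pnorm(4, mean = 0, sd = sqrt(5)) - pnorm(-4, mean = 0, sd = sqrt(5))   # about 0.926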
Example
The waiting time for a bus, in minutes, is uniformly distributed in [0, 10]. In five months a person catches the bus 120 times. Find the 95th percentile, t0.05, of the person's total waiting time T.

Solution. Write T = X1 + · · · + X120, where Xi is the waiting time the ith time the person catches the bus. Since n = 120 ≥ 30, E(Xi) = 5, and Var(Xi) = 10²/12 = 100/12, we have

    T ∼ N(120 × 5, 120 × 100/12) = N(600, 1000), approximately.

Therefore,

    t0.05 ≈ 600 + z_0.05 √1000 = 600 + 1.645 × √1000 = 652.02.
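In R (an added sketch), the same normal-approximation percentile can be computed directly with qnorm:

# 95th percentile of T under the CLT approximation N(600, 1000)
qnorm(0.95, mean = 600, sd = sqrt(1000))   # about 652.0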
The DeMoivre-Laplace Theorem
The DeMoivre-Laplace theorem was first proved by DeMoivre in 1733 for p = 0.5 and extended to general p by Laplace in 1812.
Due to the prevalence of the Bernoulli distribution in the experimental sciences, it continues to be stated separately even though it is now recognized as a special case of the CLT.

Theorem (DeMoivre-Laplace)
If T ∼ Bin(n, p) then, for n satisfying np ≥ 5 and n(1 − p) ≥ 5,

    T ∼ N(np, np(1 − p)), approximately.
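A small added check in R, comparing an exact binomial probability with the DeMoivre-Laplace approximation (n = 30 and p = 0.4 are arbitrary choices; np = 12 and n(1 − p) = 18 satisfy the rule of thumb):

# Exact vs approximate P(T <= 10) for T ~ Bin(30, 0.4)
n <- 30; p <- 0.4
pbinom(10, size = n, prob = p)                         # exact binomial probability
pnorm(10, mean = n * p, sd = sqrt(n * p * (1 - p)))    # plain normal approximation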
The continuity correction
For an integer-valued RV X, with Y the approximating normal RV, the principle of continuity correction specifies

    P(X ≤ k) ≈ P(Y ≤ k + 0.5).

The DeMoivre-Laplace theorem and the continuity correction yield

    P(T ≤ k) ≈ Φ( (k + 0.5 − np) / √(np(1 − p)) ).

For individual probabilities, use

    P(X = k) ≈ P(k − 0.5 < Y < k + 0.5).
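Continuing the added Bin(30, 0.4) sketch from above, the continuity correction brings the approximation of P(T ≤ 10) noticeably closer to the exact binomial value:

# Continuity-corrected approximation of P(T <= 10) for T ~ Bin(30, 0.4)
n <- 30; p <- 0.4
pnorm(10.5, mean = n * p, sd = sqrt(n * p * (1 - p)))   # corrected normal approximation
pbinom(10, size = n, prob = p)                          # exact value, for comparison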
(Figure: Bin(10, 0.5) PMF together with the N(5, 2.5) PDF, illustrating the approximation of P(X ≤ 5) with and without continuity correction.)
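The figure can be reproduced approximately with the following added R sketch:

# Bin(10, 0.5) PMF (spikes) with the approximating N(5, 2.5) density
k <- 0:10
plot(k, dbinom(k, size = 10, prob = 0.5), type = "h", ylim = c(0, 0.25),
     xlab = "x", ylab = "probability", main = "Bin(10,0.5) PMF and N(5,2.5) PDF")
curve(dnorm(x, mean = 5, sd = sqrt(2.5)), add = TRUE, col = "red")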
Example
A college basketball team plays 30 regular-season games, 16 of which are against class A teams and 14 against class B teams. The probability that the team wins a game is 0.4 if it plays against a class A team and 0.6 if it plays against a class B team. Assuming that the results of different games are independent, approximate the probability that
1. the team will win at least 18 games;
2. the number of wins against class B teams is smaller than the number of wins against class A teams.
Solution: If X1 and X2 denote the numbers of wins against class A and class B teams, respectively, then

    X1 ∼ Bin(16, 0.4)   and   X2 ∼ Bin(14, 0.6).

Since np and n(1 − p) are at least 5 in both cases (the smallest products are 16 × 0.4 = 6.4 and 14 × 0.4 = 5.6), the DeMoivre-Laplace theorem gives, approximately,

    X1 ∼ N(16 × 0.4, 16 × 0.4 × 0.6) = N(6.4, 3.84),
    X2 ∼ N(14 × 0.6, 14 × 0.6 × 0.4) = N(8.4, 3.36).

Consequently, since X1 and X2 are independent,

    X1 + X2 ∼ N(6.4 + 8.4, 3.84 + 3.36) = N(14.8, 7.20),
    X2 − X1 ∼ N(8.4 − 6.4, 3.84 + 3.36) = N(2, 7.20),   approximately.
Solution continued: Hence, using also the continuity correction, the needed approximation for part 1 is

    P(X1 + X2 ≥ 18) = 1 − P(X1 + X2 ≤ 17) ≈ 1 − Φ( (17.5 − 14.8)/√7.2 ) = 1 − Φ(1.006) = 1 − 0.843 = 0.157,

and the needed approximation for part 2 is

    P(X2 − X1 < 0) ≈ Φ( (−0.5 − 2)/√7.2 ) = Φ(−0.932) = 0.176.
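The two probabilities can be checked in R (an added sketch), either via the normal approximations above or, since the game results are independent Bernoulli trials, by direct simulation:

# Normal approximations from the solution
1 - pnorm((17.5 - 14.8) / sqrt(7.2))   # part 1: about 0.157
pnorm((-0.5 - 2) / sqrt(7.2))          # part 2: about 0.176

# Simulation check (100,000 simulated seasons; seed arbitrary)
set.seed(7)
x1 <- rbinom(100000, 16, 0.4)          # wins against class A teams
x2 <- rbinom(100000, 14, 0.6)          # wins against class B teams
mean(x1 + x2 >= 18)                    # should be near the part 1 answer
mean(x2 < x1)                          # should be near the part 2 answer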