Download Central Limit Theorem: Discrete Independent Trials

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Central Limit Theorem: Discrete
Independent Trials
May 15, 2006
CLT: Discrete Independent Trials
Central Limit Theorem for Discrete
Independent Trials
• Let Sn = X1 + X2 + · · · + Xn be the sum of n independent
discrete random variables of an independent trials process with
common distribution function m(x) dened on the integers, with
mean µ and variance σ 2.
• Standardized Sums
Sn∗
Sn − nµ
.
= √
2
nσ
• This standardizes Sn to have expected value 0 and variance 1.
CLT: Discrete Independent Trials
1
• If Sn = j , then Sn∗ has the value xj with
j − nµ
xj = √
.
2
nσ
CLT: Discrete Independent Trials
2
Approximation Theorem
Theorem.
Let X1, X2, . . . , Xn be an independent trials process
and let Sn = X1 +X2 +· · ·+Xn. Assume that the greatest common
divisor of the dierences of all the values that the Xj can take on
is 1. Let E(Xj ) = µ and V (Xj ) = σ 2. Then for n large,
φ(xj )
P (Sn = j) ∼ √
,
2
nσ
√
where xj = (j −nµ)/ nσ 2, and φ(x) is the standard normal density.
CLT: Discrete Independent Trials
3
Central Limit Theorem for a Discrete
Independent Trials Process
Theorem.
Let Sn = X1 + X2 + · · · + Xn be the sum of
n discrete independent random variables with common distribution
having expected value µ and variance σ 2. Then, for a < b,
¶
µ
Z b
2
1
Sn − nµ
<b =√
lim P a < √
e−x /2 dx .
n→∞
2π a
nσ 2
CLT: Discrete Independent Trials
4
Example
A die is rolled 420 times. What is the probability that the sum of
the rolls lies between 1400 and 1550?
CLT: Discrete Independent Trials
5
Example
A die is rolled 420 times. What is the probability that the sum of
the rolls lies between 1400 and 1550?
The sum is a random variable
S420 = X1 + X2 + · · · + X420 .
We have seen that µ = E(X) = 7/2 and σ 2 = V (X) = 35/12.
Thus, E(S420) = 420 · 7/2 = 1470, σ 2(S420) = 420 · 35/12 =
1225, and σ(S420) = 35.
CLT: Discrete Independent Trials
5
µ
P (1400 ≤ S420 ≤ 1550) ≈ P
1399.5 − 1470
1550.5 − 1470
∗
≤ S420 ≤
35
35
∗
= P (−2.01 ≤ S420
≤ 2.30)
≈ N A(−2.01, 2.30) = .9670 .
CLT: Discrete Independent Trials
6
¶
A More General Central Limit Theorem
Theorem.
Let X1, X2, . . . , Xn , . . . be a sequence of
independent discrete random variables, and let Sn = X1 + X2 +
· · · + Xn. For each n, denote the mean and variance of Xn by µn
and σn2 , respectively. Dene the mean and variance of Sn to be mn
and s2n, respectively, and assume that sn → ∞. If there exists a
constant A, such that |Xn| ≤ A for all n, then for a < b,
µ
¶
Z b
Sn − mn
1
−x2 /2
lim P a <
e
dx .
<b =√
n→∞
sn
2π a
CLT: Discrete Independent Trials
7
Central Limit Theorem for Continuous
Independent Trials
CLT: Continuous Independent Trials
8
Standardized Sums
• Suppose we choose n random numbers from the interval [0, 1] with
uniform density. Let X1, X2, . . . , Xn denote these choices, and
Sn = X1 + X2 + · · · + Xn their sum.
• The standardized sum is
Sn∗ =
CLT: Continuous Independent Trials
Sn − nµ
√
.
nσ
9
n= 2
0.4
n= 3
n= 4
0.3
n = 10
0.2
0.1
-3
-2
CLT: Continuous Independent Trials
-1
1
2
3
10
Exponential Density
• Choose numbers from the interval [0, +∞) with an exponential
density with parameter λ.
• Then
1
,
λ
1
= V (Xj ) = 2 .
λ
µ = E(Xi) =
σ2
CLT: Continuous Independent Trials
11
• There are formulas for the density function for Sn and the density
function for Sn∗ :
fSn (x) =
fSn∗ (x) =
CLT: Continuous Independent Trials
λe−λx(λx)n−1
,
(n − 1)!
µ√
¶
√
n
nx + n
fS
.
λ n
λ
12
n= 2
0.5
n= 3
0.4
n = 10
n = 30
0.3
0.2
0.1
-4
CLT: Continuous Independent Trials
-2
2
13
Central Limit Theorem
Theorem.
Let Sn = X1 + X2 + · · · + Xn be the sum of
n independent continuous random variables with common density
function p having expected value µ and variance σ 2.
Let
√
Sn∗ = (Sn − nµ)/ nσ . Then we have, for all a < b,
1
∗
lim P (a < Sn < b) = √
n→∞
2π
CLT: Continuous Independent Trials
Z
b
e
−x2 /2
dx .
a
14
Example
• Suppose a surveyor wants to measure a known distance, say of
1 mile, using a transit and some method of triangulation.
• He knows that because of possible motion of the transit,
atmospheric distortions, and human error, any one measurement is
apt to be slightly in error.
• He plans to make several measurements and take an average.
• He assumes that his measurements are independent random
variables with a common distribution of mean µ = 1 and standard
deviation σ = .0002.
• What can he say about the average?
CLT: Continuous Independent Trials
15
Estimating the Mean
• Now suppose our surveyor is measuring an unknown distance with
the same instruments under the same conditions.
• He takes 36 measurements and averages them.
• How sure can he be that his measurement lies within .0002 of the
true value?
CLT: Continuous Independent Trials
16
Sample Mean
• The sample mean of n measurements:
x1 + x2 + · · · + xn
µ̄ =
,
n
• Moreover
P (|µ̄ − µ| < .0002) ≈ .997 .
• The interval (µ̄ − .0002, µ̄ + .0002) is called the 99.7% condence
interval.
CLT: Continuous Independent Trials
17
Sample Variance
• If he does not know the variance σ 2 of the error distribution, then
he can estimate σ 2 by the sample variance:
(x1 − µ̄)2 + (x2 − µ̄)2 + · · · + (xn − µ̄)2
σ̄ =
.
n
2
CLT: Continuous Independent Trials
18