Math/Stat 352
Lecture 10
Section 4.11
The Central Limit Theorem
Summing random variables
• Generally, summation changes the shape of the distribution: its range of values, spread, mean, and so on.
• There is no simple way to tell what the distribution of X + Y is just from the distributions of X and Y: in general you must work out a convolution, which for continuous variables involves multiple integration (see the small discrete sketch below).
• So what do we do when there are many more summands, like 10, 20, … 1000?
We need some magic to deal with this … enter the Central Limit Theorem.
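For two independent discrete random variables, the distribution of the sum is the convolution of their pmfs. A minimal sketch of that computation, assuming Python with NumPy (the two pmfs here are arbitrary illustrations, not from the lecture):

import numpy as np

# pmfs of two independent discrete random variables on {0, 1, 2, ...}
p_x = np.array([0.2, 0.5, 0.3])   # P(X = 0), P(X = 1), P(X = 2)
p_y = np.array([0.6, 0.4])        # P(Y = 0), P(Y = 1)

# The pmf of X + Y is the convolution of the two pmfs
p_sum = np.convolve(p_x, p_y)
print(p_sum)                      # P(X+Y = 0), ..., P(X+Y = 3); entries sum to 1

For continuous variables the analogous step is an integral, which is the "complicated math" referred to above.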
The Central Limit Effect
Start with one random variable (one summand). Consider two types of random variables: continuous (samples shown in the left panels) and discrete (samples shown in the right panels).
Histograms of 100 observations of the sum, as the number of summands grows:
• One summand: X1 (left) or Y1 (right)
• Two summands: X1 + X2 (left) or Y1 + Y2 (right)
• Three summands: X1 + X2 + X3 (left) or Y1 + Y2 + Y3 (right)
• Twenty summands: X1 + X2 + … + X20 (left) or Y1 + Y2 + … + Y20 (right)
• 100 summands: X1 + X2 + … + X100 (left) or Y1 + Y2 + … + Y100 (right)
• 1000 summands: X1 + X2 + … + X1000 (left) or Y1 + Y2 + … + Y1000 (right)
In both columns, the histograms approach the shape of the normal distribution as the number of summands grows.
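A minimal simulation sketch of this effect, assuming Python with NumPy and SciPy (the Exponential and Bernoulli populations below are illustrative stand-ins; the lecture does not name the distributions used in the plots):

import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(0)
n_obs = 100  # 100 observations of each sum, as in the plots

for n_summands in [1, 2, 3, 20, 100, 1000]:
    # Continuous summands: Exponential(1); discrete summands: Bernoulli(0.3)
    x_sums = rng.exponential(1.0, size=(n_obs, n_summands)).sum(axis=1)
    y_sums = (rng.random(size=(n_obs, n_summands)) < 0.3).sum(axis=1)
    # Skewness near 0 indicates a roughly symmetric, bell-shaped histogram
    print(n_summands, round(skew(x_sums), 2), round(skew(y_sums), 2))

As the number of summands grows, the skewness of both columns of sums drifts toward 0, mirroring the histograms above.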
The Central Limit Theorem
Let X1, …, Xn be a random sample from a population with mean µ and variance σ².
Let
  X̄ = (X1 + X2 + … + Xn)/n
be the sample mean, and let Sn = X1 + … + Xn be the sum of the sample observations.
Then, if n is sufficiently large,
  X̄ ~ N(µ, σ²/n)  and  Sn ~ N(nµ, nσ²),
approximately.
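A quick numerical check of both statements, assuming Python with NumPy (the Uniform(0, 1) population is an arbitrary choice):

import numpy as np

rng = np.random.default_rng(1)
n, reps = 50, 100_000
mu, sigma = 0.5, 1 / np.sqrt(12)        # mean and st. dev. of Uniform(0, 1)

samples = rng.random(size=(reps, n))    # reps independent samples of size n
xbar = samples.mean(axis=1)             # sample means
sn = samples.sum(axis=1)                # sample sums

print(xbar.mean(), mu)                  # close to µ
print(xbar.std(), sigma / np.sqrt(n))   # close to σ/√n
print(sn.mean(), n * mu)                # close to nµ
print(sn.std(), np.sqrt(n) * sigma)     # close to √n·σ

A histogram of xbar (or of sn) would also look close to a normal curve.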
How large a sample is large enough?
Rule of Thumb for the CLT
For most populations, if the sample size is greater than 30, the Central
Limit Theorem approximation is good.
Two Examples of the CLT
Normal approximation to the Binomial:
If X ~ Bin(n, p) and if np > 10 and n(1 − p) > 10, then
  X ~ N(np, np(1 − p))  approximately, and
  p̂ ~ N(p, p(1 − p)/n)  approximately.
Normal Approximation to the Poisson:
If X ~ Poisson(λ), where λ > 10, then X ~ N(λ, λ), approximately.
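A small check of both approximations, assuming SciPy (the particular n, p, λ, and cutoff values below are only illustrative):

from scipy.stats import binom, norm, poisson

# Binomial: X ~ Bin(100, 0.3), so np = 30 > 10 and n(1 - p) = 70 > 10
n, p = 100, 0.3
print(binom.cdf(35, n, p))                               # exact P(X <= 35)
print(norm.cdf(35, loc=n*p, scale=(n*p*(1-p))**0.5))     # normal approximation

# Poisson: X ~ Poisson(25), with λ = 25 > 10
lam = 25
print(poisson.cdf(30, lam))                              # exact P(X <= 30)
print(norm.cdf(30, loc=lam, scale=lam**0.5))             # normal approximation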
Example
The manufacturer of a certain part requires two different machine operations. The
time on machine 1 has mean 0.4 hours and standard deviation 0.1 hours. The time
on machine 2 has mean 0.45 hours and standard deviation 0.15 hours. The times
needed on the machines are independent. Suppose that 65 parts are
manufactured. What is the distribution of the total time on machine 1? On
machine 2? What is the probability that the total time used by both machines
together is between 50 and 55 hours?
Soln: Let X = time on machine 1 in hours, with EX = 0.4 and St. Dev. X = 0.1.
Let Y = time on machine 2 in hours, with EY = 0.45 and St. Dev. Y = 0.15. X and Y are independent.
X1, X2, …, X65 are the times for the 65 parts on machine 1.
Y1, Y2, …, Y65 are the times for the 65 parts on machine 2.
Sx = X1 + X2 + … + X65 = total time on machine 1; Sx is approximately normal with mean
ESx = 65(0.4) = 26 and VarSx = 65(0.1)² = 0.65, so the distribution of Sx is approximately N(26, 0.65).
Let Sy = Y1 + Y2 + … + Y65 = total time on machine 2; Sy is approximately normal with mean
ESy = 65(0.45) = 29.25 and VarSy = 65(0.15)² = 1.4625, so the distribution of Sy is approximately N(29.25, 1.4625).
Example contd.
What is the probability that the total time used by both machines together is between 50 and 55 hours?
We need the distribution of the total time on machines 1 and 2.
Let T = total time for the 65 parts on both machines, so T = Sx + Sy.
Since Sx and Sy are independent and both approximately normal, their sum is also approximately normal with
ET = ESx + ESy = 26 + 29.25 = 55.25 and VarT = VarSx + VarSy = 0.65 + 1.4625 = 2.1125.
So T's distribution is approximately N(55.25, 2.1125), with St. Dev. T = √2.1125 ≈ 1.453.
Then P(50 < T < 55) = P((50 − 55.25)/1.453 < Z < (55 − 55.25)/1.453) = P(−3.61 < Z < −0.17) = 0.4325 − 0.0002 = 0.4323.
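The same calculation in code, assuming SciPy's normal CDF:

from scipy.stats import norm

n = 65
mean_T = n * 0.4 + n * 0.45          # ESx + ESy = 26 + 29.25 = 55.25
var_T = n * 0.1**2 + n * 0.15**2     # VarSx + VarSy = 0.65 + 1.4625 = 2.1125
sd_T = var_T ** 0.5

# P(50 < T < 55) under the approximate N(55.25, 2.1125) distribution
prob = norm.cdf(55, loc=mean_T, scale=sd_T) - norm.cdf(50, loc=mean_T, scale=sd_T)
print(round(prob, 4))                # about 0.43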
Continuity Correction
• The binomial distribution is discrete, while the normal distribution is continuous.
• The continuity correction is an adjustment, made when approximating a discrete distribution with a continuous one, that can improve the accuracy of the approximation.
• If you want to include the endpoints in your probability calculation, extend each endpoint by 0.5, then proceed with the calculation.
  Histogram of X ~ Bin(100, 0.5), approximated by Y ~ N(50, 25):
  P(45 ≤ X ≤ 55) ≈ P(44.5 < Y < 55.5)
• If you want to exclude the endpoints in your probability calculation, move each endpoint in by 0.5, then proceed with the calculation.
  Histogram of X ~ Bin(100, 0.5), approximated by Y ~ N(50, 25):
  P(45 < X < 55) ≈ P(45.5 < Y < 54.5)
• We use the continuity correction for the normal approximation to the binomial distribution, but not for the normal approximation to the Poisson distribution.
Example
If a fair coin is tossed 100 times, use the normal curve to approximate the
probability that the number of heads is between 45 and 55 inclusive.
Soln. X=# of H; X ~ Bin(100, 0.5), approximated by Y ~ N(50, 25)
P(45 ≤ X ≤ 55) ≈ P(44.5 < Y < 55.5) = P(−1.1 < Z < 1.1) = 0.7286.
What about the probability that the number of heads is between 45 and 55 exclusive?
P(45 < X < 55) ≈ P(45.5 < Y < 54.5) = P(−0.9 < Z < 0.9) = 0.6318.
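A quick check of these approximations against the exact binomial probabilities, assuming SciPy:

from scipy.stats import binom, norm

n, p = 100, 0.5
Y = norm(loc=n*p, scale=(n*p*(1-p))**0.5)           # N(50, 25), i.e. sd = 5

# Inclusive: P(45 <= X <= 55); continuity correction extends the endpoints
exact_incl  = binom.cdf(55, n, p) - binom.cdf(44, n, p)
approx_incl = Y.cdf(55.5) - Y.cdf(44.5)
print(round(exact_incl, 4), round(approx_incl, 4))  # both close to 0.73

# Exclusive: P(45 < X < 55); continuity correction pulls the endpoints in
exact_excl  = binom.cdf(54, n, p) - binom.cdf(45, n, p)
approx_excl = Y.cdf(54.5) - Y.cdf(45.5)
print(round(exact_excl, 4), round(approx_excl, 4))  # both close to 0.63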
Example
Suppose X is the score on a test and X ~ N(500, 100²). Let X1, X2, …, X16 be a
sample of scores for 16 individuals and let X̄ be their average score.
Find P(550 < X̄ ≤ 600).
Solution: Since the data come from a normal distribution, X̄ has a normal
distribution with mean µ_X̄ = µ = 500 and standard deviation σ_X̄ = σ/√n = 100/√16 = 25.
Thus, P(550 < X̄ ≤ 600) = P((550 − 500)/25 < (X̄ − 500)/25 ≤ (600 − 500)/25)
= P(2 < Z ≤ 4) = P(Z ≤ 4) − P(Z ≤ 2) = 1 − 0.9772 = 0.0228.
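Checking this with SciPy (a sketch):

from scipy.stats import norm

mu, sigma, n = 500, 100, 16
se = sigma / n**0.5                  # 25, the st. dev. of the sample mean

prob = norm.cdf(600, loc=mu, scale=se) - norm.cdf(550, loc=mu, scale=se)
print(round(prob, 4))                # about 0.023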
Example
Suppose X1, X2, …, X25 are lifetimes of electronic components, with μ = 700
hours and σ = 10 hours. Find P(X̄ ≤ 702), where X̄ is the sample mean of
the lifetimes of the 25 components.
Solution: Usually lifetime data is skewed to the right, so not normal (Why?).
Since n = 25 is reasonably large, we will use the CLT:
X̄ has approximately a N(μ, σ²/n) = N(700, 10²/25) = N(700, 4) distribution.
So, P(X̄ ≤ 702) = P((X̄ − 700)/2 ≤ (702 − 700)/2) = P(Z ≤ 1) = 0.8413.
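The same computation with SciPy (a sketch):

from scipy.stats import norm

mu, sigma, n = 700, 10, 25
se = sigma / n**0.5                      # 10/5 = 2

prob = norm.cdf(702, loc=mu, scale=se)   # P(X-bar <= 702)
print(round(prob, 4))                    # 0.8413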
Example – Water Taxi Safety
A water taxi has a capacity of 3500 lb. Given that the population of men has normally
distributed weights with a mean of 172 lb and a standard deviation of 29 lb,
a) if one man is randomly selected, find the probability that his weight is greater
than 175 lb;
b) if 20 different men are randomly selected, find the probability that their mean
weight is greater than 175 lb (so that their total weight exceeds the safe capacity
of 3500 pounds).
Example – cont.
a) If one man is randomly selected, find the probability that his weight is greater than 175 lb:
  z = (175 − 172)/29 = 0.10
b) If 20 different men are randomly selected, find the probability that their mean weight is greater than 175 lb:
  z = (175 − 172)/(29/√20) = 0.46
Example – conclusion
a) If one man is randomly selected, the probability that his weight is greater than 175 lb is
  P(X > 175) = P(Z > 0.10) = 0.4602.
b) If 20 different men are randomly selected, the probability that their mean weight is greater than 175 lb is
  P(X̄ > 175) = P(Z > 0.46) = 0.3228.
It is much easier (a larger probability) for an individual to deviate from the mean than it is for the mean of a group of 20 to deviate from the mean.
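Both parts computed with SciPy (a sketch):

from scipy.stats import norm

mu, sigma = 172, 29

# a) one man: P(X > 175)
p_one = 1 - norm.cdf(175, loc=mu, scale=sigma)
print(round(p_one, 4))                          # about 0.46

# b) mean of 20 men: P(X-bar > 175); st. dev. of the mean is sigma / sqrt(20)
p_mean = 1 - norm.cdf(175, loc=mu, scale=sigma / 20**0.5)
print(round(p_mean, 4))                         # about 0.32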