5.3 Central Limit Theorem (CLT)
- Also a very important section in the book.
  - In the previous section, we computed probabilities related to an individual observation, such as P(X < 120) = ?
  - We now move to statements about a group of observations. Specifically, we compute probabilities related to the sample mean (shown as X̄), such as P(X̄ < 120) = ?
Performance by a group of individuals
- Suppose a class of 25 students is taking the Stanford-Binet IQ test.
- The teacher would like to consider her students' performance as a whole.
- For example, what is the probability that the class average is above 110?
- We know how to compute probabilities (or percentiles) for any single (individual) student taking the Stanford-Binet IQ test because we know the test scores have a normal distribution with a mean of 100 and a standard deviation of 16. Or, in shorthand, for an IQ score X, we have
  X ~ N(µ = 100, σ = 16)
- But what about a sample mean X̄?
  - To compute probabilities for X̄, we need to know its distribution. Does X̄ also have a normal distribution?
  - In this case, YES!
So, we can use our previously learned procedure for computing probabilities for X̄ (i.e., z-tables).
- It turns out that X̄ is also normally distributed, and the mean and standard deviation for X̄ are related to the normal distribution for X.
  - If X ~ N(µ, σ), then X̄ ~ N(µ_x̄ = µ, σ_x̄ = σ/√n).
The sample mean X̄ is normally distributed with a mean equal to the population mean µ and a variance of σ²/n (that is, a standard deviation of σ/√n).
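The mean and variance of the sample mean follow from a short calculation. The sketch below assumes the n observations X_1, ..., X_n are independent draws from the same population, which is the usual meaning of "random sample" but is an assumption added here rather than something stated on the slide.

```latex
\begin{aligned}
\bar{X} &= \frac{1}{n}\sum_{i=1}^{n} X_i \\
E[\bar{X}] &= \frac{1}{n}\sum_{i=1}^{n} E[X_i] = \frac{1}{n}\cdot n\mu = \mu \\
\operatorname{Var}(\bar{X}) &= \frac{1}{n^{2}}\sum_{i=1}^{n}\operatorname{Var}(X_i)
  = \frac{1}{n^{2}}\cdot n\sigma^{2} = \frac{\sigma^{2}}{n} \\
\sigma_{\bar{X}} &= \sqrt{\operatorname{Var}(\bar{X})} = \frac{\sigma}{\sqrt{n}}
\end{aligned}
```

The variance step is where independence is used: variances of independent terms add, with no covariance terms.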
  - If X ~ N(µ, σ), then X̄ ~ N(µ, σ/√n).
And if n is large enough (say n > 30), then X̄ has approximately this same distribution, even if X was not normally distributed (by the Central Limit Theorem).
- It turns out that an average X̄ is less variable than an individual observation. The variance of a single observation X is σ², while the variance of a sample mean taken from n observations is σ²/n.
- If you had to predict the points scored in a single NFL football game, what range of points would be relevant? 0 to 54?
- If you had to predict the AVERAGE points scored per game over the whole season, what range of points would be relevant? 10 to 30?
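The claim that a mean is less variable than a single observation can be checked with a small simulation. This is only a sketch: it reuses the IQ setting (µ = 100, σ = 16, n = 25), and the function name `simulate_sample_means`, the number of repetitions, and the seed are arbitrary choices made for illustration.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, reps, seed=0):
    """Draw `reps` random samples of size `n` from N(mu, sigma); return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(reps)
    ]


mu, sigma, n = 100, 16, 25
means = simulate_sample_means(mu, sigma, n, reps=20_000)

observed_var = statistics.pvariance(means)  # variance of the simulated means
theoretical_var = sigma**2 / n              # sigma^2 / n = 256 / 25 = 10.24

print(f"variance of one observation:      {sigma**2}")
print(f"theoretical variance of the mean: {theoretical_var:.2f}")
print(f"simulated variance of the mean:   {observed_var:.2f}")
```

The simulated variance should land close to σ²/n, far below the variance σ² of a single score.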
- For a random sample of n = 25 students taking the Stanford-Binet IQ test, we have
  X ~ N(µ = 100, σ = 16), so
  X̄ ~ N(µ_x̄ = 100, σ_x̄ = 16/√25),
  where 16/√25 as a decimal is 3.2.
Compare the distribution of scores for an individual to the distribution of scores for a mean (n = 25).
[Figure: two panels, "Individual IQ Scores" (X) and "Mean IQ Scores (n=25)" (X̄); x-axis: IQ Score (60 to 140); y-axis: relative frequency (0.00 to 0.15)]
- The distributions of X and X̄ are both normal.
- The distributions of X and X̄ both have a mean equal to 100, or µ = 100 and µ_x̄ = 100.
- The spread of X̄ IS MUCH SMALLER than the spread of X. Specifically, σ = 16 and σ_x̄ = 16/√25 = 3.2.
- For a random sample of n = 25 taking the Stanford-Binet IQ test, what is the probability that the group average is more than 110?
  P(X̄ > 110) = 1 - P(X̄ ≤ 110) = ?
Using X̄ ~ N(µ_x̄ = 100, σ_x̄ = 3.2), we will convert 110 to a z-score and compute the probability.
Probability distribution for the average score from 25 students taking the Stanford-Binet test:
X̄ ~ N(µ_x̄ = 100, σ_x̄ = 16/√25)

P(X̄ > 110) = 1 - P(X̄ ≤ 110)
            = 1 - P(z ≤ (110 − 100)/(16/√25))
            = 1 - P(z ≤ 3.13)
            = 1 - 0.9991
            = 0.0009
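The z-table result for P(X̄ > 110) can be cross-checked in software. A minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative. The code uses the unrounded z-score (3.125) rather than the table value 3.13, so the result may differ slightly in later decimal places.

```python
import math
from statistics import NormalDist

mu, sigma, n = 100, 16, 25

# Distribution of the sample mean: N(100, 16 / sqrt(25)) = N(100, 3.2)
xbar_dist = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))

# P(Xbar > 110) = 1 - P(Xbar <= 110)
p_above_110 = 1 - xbar_dist.cdf(110)
print(f"P(Xbar > 110) = {p_above_110:.4f}")
```

Changing `n` or the cutoff in this sketch gives the corresponding probability for other sample sizes.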
[Figure: "Mean IQ Scores (n=25)", the distribution of X̄; x-axis: IQ Score (60 to 140); y-axis: relative frequency (0.00 to 0.15)]
Very small chance of getting a mean at 110 or higher. The area under the curve in red is 0.0009.
Performance by a single individual
  X ~ N(µ, σ)
  Z-score: z = (x − µ)/σ

Performance by a group of n individuals
  X̄ ~ N(µ, σ/√n)
  Z-score: z = (x̄ − µ)/(σ/√n)
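The two z-score formulas differ only in the denominator, which a pair of small helper functions makes explicit. The function names `z_individual`, `z_mean`, and `phi` are hypothetical, chosen for this sketch; `phi` computes the standard normal CDF from `math.erf` in place of a z-table lookup.

```python
import math


def phi(z):
    """Standard normal CDF, P(Z <= z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def z_individual(x, mu, sigma):
    """z-score for a single observation: (x - mu) / sigma."""
    return (x - mu) / sigma


def z_mean(xbar, mu, sigma, n):
    """z-score for a sample mean of n observations: (xbar - mu) / (sigma / sqrt(n))."""
    return (xbar - mu) / (sigma / math.sqrt(n))


# Same cutoff (110), same population (mu = 100, sigma = 16), different question.
z_one = z_individual(110, 100, 16)   # one student
z_avg = z_mean(110, 100, 16, 25)     # average of 25 students

print(f"one student:    z = {z_one:.3f}, P(X > 110)    = {1 - phi(z_one):.4f}")
print(f"25-student avg: z = {z_avg:.3f}, P(Xbar > 110) = {1 - phi(z_avg):.4f}")
```

A score of 110 is fairly ordinary for one student but very unusual for a class average, because dividing σ by √n makes the z-score five times larger when n = 25.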
Exercise 1:
- For a random sample of size n = 9, find the probability that the average IQ score is 110 or higher.
Exercise 2:
- For a random sample of size n = 30, find the probability that the average IQ score is 98 or lower.
What if the original distribution was NOT normal?
- In the previous example, the distribution of individual IQ scores was normal. So, it naturally followed that the average IQ score (X̄) would also have a normal distribution.
- But what about other distributions? There are many other possibilities (uniform, right-skewed, left-skewed, etc.). What then?
- Enter… THE CENTRAL LIMIT THEOREM.
- An incredibly useful rule.
- NO MATTER WHAT DISTRIBUTION you're drawing from, X̄ will be approximately normally distributed as long as you take a large enough random sample (n ≥ 30).
- See applet linked at our website:
  http://onlinestatbook.com/stat_sim/sampling_dist/index.html
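For readers without access to an applet, the idea can be tried with a short simulation. This is a rough sketch, not a reproduction of the applet: the parent distribution (an exponential with mean 1 and standard deviation 1, which is strongly right-skewed), the sample size, the number of repetitions, and all names are choices made here for illustration.

```python
import random
import statistics


def sample_means(draw, n, reps, seed=1):
    """Return the means of `reps` random samples of size `n` taken with `draw(rng)`."""
    rng = random.Random(seed)
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]


def draw_exponential(rng):
    """One draw from a right-skewed parent: exponential with mean 1 and SD 1."""
    return rng.expovariate(1.0)


n = 30
means = sample_means(draw_exponential, n, reps=20_000)

center = statistics.fmean(means)    # should be near the parent mean, 1
spread = statistics.pstdev(means)   # should be near sigma / sqrt(n)
predicted_spread = 1.0 / n ** 0.5

# For a normal distribution, about 68% of values fall within one SD of the mean.
within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)

print(f"mean of sample means: {center:.3f}")
print(f"SD of sample means:   {spread:.3f} (predicted {predicted_spread:.3f})")
print(f"share within one SD:  {within_one_sd:.3f}")
```

Even though individual draws are far from normal, the sample means center on µ, spread out by roughly σ/√n, and put roughly the normal share of values within one standard deviation.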
Comment
- Central Limit Theorem
  - Gives us X̄ ~ N(µ, σ/√n), approximately
- If parent population is VERY non-normal, need n ≥ 30
- If parent population nearly normal, any size n OK
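The effect of sample size on a very non-normal parent can be seen by tracking how lopsided the distribution of X̄ is as n grows. The sketch below is an illustration under assumed settings: the parent is an exponential distribution (chosen here as an example of a strongly right-skewed population), skewness is measured with the standard third-moment formula, and the names `skewness` and `mean_skewness` are hypothetical.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: the average cubed z-score (0 for a symmetric distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


def mean_skewness(n, reps=20_000, seed=2):
    """Skewness of the sample mean for samples of size n from an exponential parent."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]
    return skewness(means)


results = {n: mean_skewness(n) for n in (1, 5, 30)}
for n, skew in results.items():
    print(f"n = {n:2d}: skewness of the sample mean = {skew:.2f}")
```

The skewness shrinks toward 0 as n increases (for this parent it is 2/√n in theory), which is why a larger n is needed before the normal approximation is trustworthy when the parent population is far from normal.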