ECO 72 - INTRODUCTION TO ECONOMIC STATISTICS
Topic 7: The Central Limit Theorem
These slides are copyright © 2003 by Tavis Barr. This material may
be distributed only subject to the terms and conditions set forth in
the Open Publication License, v1.0 or later (the latest version is
presently available at http://www.opencontent.org/openpub/).
• Random Samples
• Sampling Distribution
• The Law of Large Numbers
• The Central Limit Theorem
o Known population mean and variance
o Known population mean, unknown variance
Random Sample
• Every element of the population has the same chance of being included in the sample at each draw.
o Simple sampling: all samples of the same size have an equal chance of being selected from the population.
o Cluster sampling: the population is divided into clusters and then a random sample of the clusters is selected (the clusters are homogeneous with one another, while the elements within each cluster are non-homogeneous).
o Stratified sampling: the population is divided into strata and then a random sample is selected from each stratum (the strata are non-homogeneous with one another, while the elements within each stratum are homogeneous).
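The three sampling schemes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the grouping of the population into ten blocks are hypothetical, and the cluster sampler keeps every element of each chosen cluster (one-stage cluster sampling), which the slides do not specify.

```python
import random

def simple_sample(population, n, rng):
    """Simple random sample: every subset of size n is equally likely."""
    return rng.sample(population, n)

def cluster_sample(clusters, k, rng):
    """Cluster sampling: pick k whole clusters at random.

    Assumption: all elements of each chosen cluster are kept.
    """
    chosen = rng.sample(clusters, k)
    return [x for cluster in chosen for x in cluster]

def stratified_sample(strata, n_per_stratum, rng):
    """Stratified sampling: a simple random sample from every stratum."""
    return [x for stratum in strata for x in rng.sample(stratum, n_per_stratum)]

if __name__ == "__main__":
    rng = random.Random(0)
    population = list(range(100))
    # Hypothetical grouping into ten blocks, used as clusters or as strata.
    groups = [population[i:i + 10] for i in range(0, 100, 10)]
    print(simple_sample(population, 5, rng))
    print(cluster_sample(groups, 2, rng))
    print(stratified_sample(groups, 1, rng))
```

Note that the stratified sample always contains one element from every group, while the cluster sample contains all elements from only a few groups.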
Sampling Distribution (SD)
• The distribution of a given statistic based on a random sample.
• It may be considered as the distribution of the statistic for all possible samples of a given size.
• The sampling distribution depends on the underlying distribution of the population, the statistic being considered, and the sample size used.
SD of the Sample Mean
• In order to obtain information about the population mean μ, a sample is taken and the sample mean $\bar{X}$ is calculated.
• Changing the sample changes the sample mean.
• The set of all possible values of the sample mean, together with the probability of each value occurring, is called the SD of the sample mean.
• So the sample mean itself has a mean and a standard deviation.
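One way to see that the sample mean has its own distribution is to draw many samples and look at the resulting means. The sketch below is an illustration only: the population (the faces of a die), the sample size, and the function name `sample_means` are assumptions, not part of the slides.

```python
import random
import statistics

def sample_means(population, n, repetitions, rng):
    """Draw `repetitions` samples of size n (with replacement) and return their means."""
    return [statistics.fmean(rng.choices(population, k=n)) for _ in range(repetitions)]

if __name__ == "__main__":
    rng = random.Random(42)
    population = [1, 2, 3, 4, 5, 6]  # hypothetical population: faces of a fair die
    means = sample_means(population, n=10, repetitions=5000, rng=rng)
    # The sample means cluster around the population mean (3.5) and have
    # a smaller spread than the individual observations.
    print("mean of sample means:   ", statistics.fmean(means))
    print("std dev of sample means:", statistics.pstdev(means))
    print("std dev of population:  ", statistics.pstdev(population))
```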
The IID assumption
• We make the assumption that observations in the sample are independent and identically distributed (IID).
• Random sampling gives IID sample observations.
• Observations are independent when knowing the value of one observation in a sample does not tell us anything about the value of other observations in that sample.
• Observations are identically distributed if they are all draws from a random variable with the same distribution and parameters (we make no assumption about the distribution).
• It turns out that if our samples are
independent and identically distributed, we
can predict the behavior of large samples.
• The law of large numbers and the central
limit theorem are two of the basic ways of
doing this.
The Law of Large Numbers
The Law of Large Numbers (LLN) states that if the sample is IID and the population has a finite mean and variance, then the sample mean approaches the population mean with probability one as the sample becomes infinitely large.
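The convergence can be watched in a small simulation. This is a sketch under assumptions: the population is taken to be exponential with mean 2 (a hypothetical choice), and `running_means` is an illustrative helper.

```python
import random

def running_means(draws):
    """Return the sample mean after 1, 2, ..., len(draws) observations."""
    total = 0.0
    means = []
    for i, x in enumerate(draws, start=1):
        total += x
        means.append(total / i)
    return means

if __name__ == "__main__":
    rng = random.Random(7)
    # Hypothetical population: exponential with rate 0.5, so the mean is 2.
    draws = [rng.expovariate(0.5) for _ in range(100_000)]
    means = running_means(draws)
    for n in (10, 1000, 100_000):
        print(n, means[n - 1])  # drifts toward 2 as n grows
```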
Central Limit Theorem-Result
The Central Limit Theorem (CLT) states that if the sample is IID and the population has a finite mean μ and variance σ², then:
$\bar{X} \approx N\left(\mu, \frac{\sigma^2}{n}\right)$
when n (the sample size) approaches infinity.
Central Limit Theorem-Remarks
• The observations don't have to be normal for the CLT to work!
• As a rule of thumb, you usually need n ≥ 30 observations for the approximation to work well. (Fewer observations are needed if they come from a symmetric distribution.)
• The standard deviation of the sample mean (also known as the standard error) is $\sigma/\sqrt{n}$.
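These remarks can be checked by simulation with a clearly non-normal population. The sketch below assumes an exponential population with rate 1 (so μ = 1 and σ = 1) and a sample size of 50; both choices, and the helper names, are illustrative only.

```python
import math
import random
import statistics

def simulate_means(n, repetitions, rng, rate=1.0):
    """Means of `repetitions` IID samples of size n from an exponential(rate) population."""
    return [statistics.fmean([rng.expovariate(rate) for _ in range(n)])
            for _ in range(repetitions)]

def coverage(means, mu, se, z=1.96):
    """Fraction of sample means that fall within z standard errors of mu."""
    return sum(abs(m - mu) <= z * se for m in means) / len(means)

if __name__ == "__main__":
    rng = random.Random(11)
    n = 50
    means = simulate_means(n, repetitions=4000, rng=rng)
    se = 1.0 / math.sqrt(n)  # sigma / sqrt(n) with sigma = 1
    print("theoretical standard error:", se)
    print("simulated std dev of means:", statistics.pstdev(means))
    # For a normal distribution about 95% of values lie within 1.96 std devs.
    print("share within 1.96 SE:      ", coverage(means, mu=1.0, se=se))
```

If the normal approximation is good, the simulated spread should be close to σ/√n and the share within 1.96 standard errors should be close to 95%.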
Example of Central Limit Theorem
● Suppose we produce soda. Our quality control engineer claims that our bottles of soda have a mean content of 2000 ml and a standard deviation of 2 ml.
● We take a sample of 100 bottles. How likely is it that the mean content of the bottles in our sample is 1999.5 ml or less?
– The sample mean will be approximately normally distributed. It will have an expected value of 2000 and a standard error of $2/\sqrt{100} = 0.2$.
– So we want to know the probability that a normally distributed variable with mean 2000 and standard deviation 0.2 is less than 1999.5.
– This is the same as the probability that a standard normal variable is less than (1999.5 - 2000)/0.2 = -2.5.
– From the standard normal table, P(z < -2.5) is about 0.0062.
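The soda calculation can be reproduced with Python's standard library instead of a z-table. This is a minimal sketch; the function name `prob_mean_at_most` is hypothetical, and it simply applies the normal approximation N(μ, σ²/n) from the CLT slide.

```python
import math
from statistics import NormalDist

def prob_mean_at_most(x, mu, sigma, n):
    """P(sample mean <= x) under the normal approximation N(mu, sigma^2 / n)."""
    standard_error = sigma / math.sqrt(n)
    return NormalDist(mu, standard_error).cdf(x)

if __name__ == "__main__":
    # Soda example: mu = 2000 ml, sigma = 2 ml, n = 100 bottles.
    z = (1999.5 - 2000) / (2 / math.sqrt(100))
    print("z-score:", z)
    print("P(mean <= 1999.5):", prob_mean_at_most(1999.5, mu=2000, sigma=2, n=100))
```

The result is the area to the left of z = -2.5 under the standard normal curve, well under 1%.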
Another Example of CLT
● Suppose we know that the mean marital age of men in the U.S. is 24.8 years and the standard deviation is 2.5 years.
● If we take a sample of 60 married men, what is the probability that the mean marital age in the sample will be 25.1 years or more?
– The sample mean will be an approximately Normal variable with mean 24.8 and standard deviation $2.5/\sqrt{60} = 2.5/7.75 = 0.32$.
– What is the probability that a Normal variable with mean 24.8 and standard deviation 0.32 is at least 25.1?
– This is the same as the probability that a standard normal variable is at least (25.1 - 24.8)/0.32 = 0.3/0.32 = 0.9375.
– From the table, P(z < 0.94) is 0.826.
– So P(z > 0.94) = 1 - 0.826 = 0.174.
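The same upper-tail probability can be computed directly. The sketch below uses a hypothetical helper, `prob_mean_at_least`, and keeps the standard error unrounded, so its answer (roughly 0.176) differs slightly from the table-based 0.174, which comes from rounding the standard error to 0.32 and z to 0.94.

```python
import math
from statistics import NormalDist

def prob_mean_at_least(x, mu, sigma, n):
    """P(sample mean >= x) under the normal approximation N(mu, sigma^2 / n)."""
    standard_error = sigma / math.sqrt(n)
    return 1.0 - NormalDist(mu, standard_error).cdf(x)

if __name__ == "__main__":
    # Marital age example: mu = 24.8, sigma = 2.5, n = 60.
    print("unrounded:  ", prob_mean_at_least(25.1, mu=24.8, sigma=2.5, n=60))
    # Reproducing the table lookup with z rounded to 0.94:
    print("table-style:", 1.0 - NormalDist().cdf(0.94))
```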
What if we don't know σ?
● Sometimes we know the population mean, but not the population standard deviation.
● In this case, we can substitute the sample standard deviation, s, for the population standard deviation σ.
● Then, the result is that the sample mean is approximately normally distributed with expected value μ and standard error $s/\sqrt{n}$.
Example with σ unknown
● Suppose a company claims that its light bulbs last an average of a thousand hours.
● We take a sample of 500 light bulbs. The average bulb in the sample lasts 950 hours, and the sample standard deviation is 100 hours.
● What is the probability of observing a sample mean this small?
– Here μ = 1000, σ unknown, n = 500, $\bar{X}$ = 950, s = 100.
● Recap:
– Population mean (μ) of 1000, population standard deviation (σ) unknown
– Sample size (n) 500, sample mean ($\bar{X}$) 950, sample standard deviation (s) 100
● What is the probability of $\bar{X}$ this small or smaller?
– $\bar{X}$ is Normal with mean 1000, std error $100/\sqrt{500} = 100/22.36 = 4.47$.
– P($\bar{X}$ < 950) is the same as P(z < [950 - 1000]/4.47), i.e., P(z < -11.18).
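A z-score this extreme is far off any printed table, but the tail probability can still be evaluated numerically. The sketch below is an illustration with hypothetical helper names; it uses `math.erfc` rather than a plain CDF because the answer is too small to survive the subtraction from 1 in floating point.

```python
import math

def z_score(sample_mean, mu, s, n):
    """Standardize the sample mean using the sample standard deviation s."""
    return (sample_mean - mu) / (s / math.sqrt(n))

def lower_tail(z):
    """P(Z < z) for a standard normal, via erfc to keep precision in the far tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

if __name__ == "__main__":
    # Light bulb example: mu = 1000, s = 100, n = 500, sample mean = 950.
    z = z_score(950, mu=1000, s=100, n=500)
    print("z-score:", z)
    print("P(z < z-score):", lower_tail(z))
```

The probability is vanishingly small, so a sample mean of 950 would be extremely unlikely if the company's claim of a 1000-hour average were true.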