35. DEFINING THE CENTRAL LIMIT THEOREM

Sampling Distribution
WELCOME to INFERENTIAL STATISTICS
A Sampling Distribution


We are moving from descriptive statistics to inferential statistics.
Inferential statistics allow the researcher to come to conclusions about a population on the basis of descriptive statistics about a sample.
For example:
Your sample says that a candidate gets support from 47% of respondents.
Inferential statistics allow you to say that the candidate gets support from 47% of the population with a margin of error of +/- 4%.
This means that the support in the population is likely somewhere between 43% and 51%.
Margin of error is taken directly from a sampling distribution. It looks like this:
[Figure: normal curve centered on your sample mean, 47%; 95% of possible sample means lie between 43% and 51%]
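The arithmetic behind the polling example can be sketched in a few lines. The 47% and 4% figures come from the example; treating the margin of error as roughly 2 standard errors is the empirical-rule approximation, which is an assumption at this point in the slides.

```python
# Numbers from the polling example. The factor of 2 is the empirical-rule
# approximation for a 95% range (an assumption, not stated on this slide).
sample_estimate = 0.47   # 47% support in the sample
margin_of_error = 0.04   # +/- 4 percentage points

lower = sample_estimate - margin_of_error
upper = sample_estimate + margin_of_error

# If the margin covers about 2 standard errors, the implied standard error is:
implied_se = margin_of_error / 2

print(f"Likely range: {lower:.0%} to {upper:.0%}")
print(f"Implied standard error: {implied_se:.1%}")
```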
Let’s create a sampling distribution of means…
Take a sample of size 1,500 from the US. Record the mean income. Our census said the mean is $30K.
[Figure: income axis centered at $30K]
Take another sample of size 1,500 from the US. Record the mean income.
Keep repeating: take samples of size 1,500 from the US again and again, recording the mean income each time.
The sample means would stack up in a normal curve: a normal sampling distribution.
[Figure: normal curve of sample means centered at $30K]
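The repeated-sampling idea can be tried out with a small simulation. This is only a sketch: the slides do not give the shape of the income distribution, so the code uses a skewed exponential population with mean $30K as a stand-in, and the number of repeated samples (500) is a hypothetical choice.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

POPULATION_MEAN = 30_000   # the $30K census mean from the slides
SAMPLE_SIZE = 1_500        # sample size from the slides
NUM_SAMPLES = 500          # hypothetical number of repeated samples


def draw_sample_mean():
    """Draw one sample from a skewed (exponential) stand-in population
    and return its mean income."""
    incomes = [random.expovariate(1 / POPULATION_MEAN) for _ in range(SAMPLE_SIZE)]
    return statistics.fmean(incomes)


sample_means = [draw_sample_mean() for _ in range(NUM_SAMPLES)]

center = statistics.fmean(sample_means)
spread = statistics.stdev(sample_means)
within_2_sd = sum(abs(m - center) <= 2 * spread for m in sample_means) / NUM_SAMPLES

print(f"Mean of sample means: ${center:,.0f}")
print(f"Standard deviation of sample means: ${spread:,.0f}")
print(f"Share of sample means within 2 sd of the center: {within_2_sd:.1%}")
```

Even though the stand-in population is strongly skewed, the sample means should cluster near $30K, with most of them within 2 standard deviations of the center.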
Say that the standard deviation of this sampling distribution is $10K. Think back to the empirical rule. What are the odds you would get a sample mean that is more than $20K off from $30K?
[Figure: normal sampling distribution centered at $30K, with the axis marked in z-scores from -3z to 3z]
Answer: $20K is 2 standard deviations. By the empirical rule, about 95% of sample means fall within 2 standard deviations of $30K, so about 5% are more than $20K off: 2.5% in each tail.
[Figure: the same curve with 2.5% marked in each tail, beyond -2z and 2z]
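The empirical-rule answer can be compared with the exact normal tail areas. A minimal sketch using Python's standard library, with the $30K center and $10K standard deviation taken from the slide:

```python
from statistics import NormalDist

# Sampling distribution from the slides: centered at $30K with sd $10K.
sampling_dist = NormalDist(mu=30_000, sigma=10_000)

# Probability of a sample mean below $10K or above $50K (more than $20K off).
lower_tail = sampling_dist.cdf(10_000)
upper_tail = 1 - sampling_dist.cdf(50_000)
prob_off_by_20k = lower_tail + upper_tail

print(f"Lower tail: {lower_tail:.4f}")
print(f"Upper tail: {upper_tail:.4f}")
print(f"Total: {prob_off_by_20k:.4f}")
```

The exact tail areas come out slightly smaller than 2.5% each, because the empirical rule rounds the area within 2 standard deviations down to 95%.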
Central Limit Theorem (CLT)
Central Limit Theorem: As sample size increases, the sampling distribution of sample means approaches a normal distribution with the same mean as the population and a standard deviation equal to the standard deviation of the population divided by the square root of n (the sample size).
\( N(\mu, \sigma/\sqrt{n}) \), with mean \( \mu \) and sd \( \sigma/\sqrt{n} \)
Variability in Sampling Distribution
For example, if the variability is low, we can trust our number more than if the variability is high.
An Example:
• A population’s car values are \( \mu \) = $12K with \( \sigma \) = $4K.
• Which sampling distribution is for sample size 625 and which is for 2500? What are their s.e.’s (standard errors)?
[Figure: two normal sampling distributions centered at $12K, one wide and one narrow, each marking the middle 95% of M’s, with axes in z-scores from -3 to 3]
• For n = 2500: \( \sqrt{2500} = 50 \), so s.e. = $4K/50 = $80
• For n = 625: \( \sqrt{625} = 25 \), so s.e. = $4K/25 = $160
The smaller standard error belongs to the narrower curve, so the narrow distribution is for sample size 2500 and the wide one is for 625.
Which sample will be more precise? If you get a
particularly bad sample, which sample size will
help you be sure that you are closer to the true
mean?
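The car-value example can be checked numerically. This sketch builds the normal approximation from the Central Limit Theorem for both sample sizes. The function name `sampling_distribution` is just an illustrative helper, and the +/- 2 s.e. range is the empirical-rule approximation for the middle 95%.

```python
import math
from statistics import NormalDist

MU = 12_000     # population mean car value ($12K)
SIGMA = 4_000   # population standard deviation ($4K)


def sampling_distribution(n):
    """Normal approximation to the sampling distribution of the mean
    for samples of size n, as given by the Central Limit Theorem."""
    return NormalDist(mu=MU, sigma=SIGMA / math.sqrt(n))


results = {}
for n in (625, 2500):
    se = sampling_distribution(n).stdev
    # Middle 95% approximated as +/- 2 standard errors (empirical rule).
    low, high = MU - 2 * se, MU + 2 * se
    results[n] = (se, low, high)
    print(f"n={n}: s.e. = ${se:,.0f}, 95% of M's between ${low:,.0f} and ${high:,.0f}")
```

Quadrupling the sample size from 625 to 2500 halves the standard error, which is why the larger sample gives the narrower range.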
So we know, in advance of ever collecting a sample, that if the sample size is sufficiently large:
• Repeated samples would pile up in a normal distribution
• The sample means will center on the true population mean
• The standard error will be a function of the population variability and sample size
• The larger the sample size, the more precise, or efficient, a particular sample is
• About 95% of all sample means will fall within +/- 2 s.e. of the population mean
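The "+/- 2 s.e." rule of thumb can be compared with the exact normal values. A short sketch using the standard normal distribution:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, sd 1

# Share of a normal distribution within 2 standard errors of its mean.
share_within_2 = standard_normal.cdf(2) - standard_normal.cdf(-2)

# Multiplier that captures exactly the middle 95%.
exact_multiplier = standard_normal.inv_cdf(0.975)

print(f"Within +/- 2 s.e.: {share_within_2:.4f}")
print(f"Multiplier for exactly 95%: {exact_multiplier:.2f}")
```

The range of +/- 2 s.e. covers slightly more than 95%, and the multiplier for exactly 95% is slightly below 2, so the rule of thumb is a close approximation.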
What proportion of US teens know that 1492 was the year in which Columbus “discovered” America? A Gallup Poll found that 210 out of a random sample of 501 American teens aged 13-17 knew this historically important date. The sample proportion:
\( \hat{p} = 210/501 \approx 0.42 \)
0.42 is the statistic that we use to gain information about the unknown population parameter p. We may estimate that about 42% of US teens know that Columbus discovered America in 1492.
Sampling distribution of sample proportion
\( \hat{p} = \dfrac{\text{count of successes in sample}}{\text{size of the sample}} = \dfrac{X}{n} \)
The mean of the sampling distribution of \( \hat{p} \) is exactly p.
The standard deviation of the sampling distribution of \( \hat{p} \) is \( \sqrt{\dfrac{p(1-p)}{n}} \).
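These formulas can be applied to the Gallup numbers as a rough sketch. The true p is unknown, so the code plugs the sample proportion in for p when estimating the standard deviation. That substitution is an assumption made here for illustration, not a step taken in the slides.

```python
import math

successes = 210   # teens who knew the date
n = 501           # sample size

p_hat = successes / n

# Assumption: p is unknown, so p_hat stands in for p in sqrt(p(1-p)/n).
estimated_sd = math.sqrt(p_hat * (1 - p_hat) / n)

print(f"p_hat = {p_hat:.3f}")
print(f"Estimated sd of p_hat = {estimated_sd:.4f}")
```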
Applying to college: normal calculation involving \( \hat{p} \)
A polling organization asks an SRS (simple random sample) of 1500 first-year college students whether they applied for admission to any other college. In fact, 35% of all first-year students applied to colleges besides the one they are attending. What is the probability that the random sample of 1500 students will give a result within 2 percentage points of this true value?
n = 1500, p = 0.35
\( \mu_{\hat{p}} = p = 0.35 \)
\( \sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}} = \sqrt{\dfrac{0.35(1-0.35)}{1500}} \approx 0.0123 \)
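One way to finish the calculation is to treat the sample proportion as approximately normal with the mean and standard deviation above, then find the area between 0.33 and 0.37. This is a sketch under that normal approximation. The unrounded standard deviation is used, so the result may differ slightly from a table lookup with a rounded z-score.

```python
import math
from statistics import NormalDist

n = 1500   # sample size
p = 0.35   # true population proportion

# Standard deviation of the sampling distribution of the sample proportion.
sd_p_hat = math.sqrt(p * (1 - p) / n)

# Normal approximation to the sampling distribution of the sample proportion.
p_hat_dist = NormalDist(mu=p, sigma=sd_p_hat)

# Probability the sample result lands within 2 percentage points of 0.35.
prob_within_2_points = p_hat_dist.cdf(0.37) - p_hat_dist.cdf(0.33)

print(f"sd of p_hat = {sd_p_hat:.4f}")
print(f"P(0.33 <= p_hat <= 0.37) = {prob_within_2_points:.4f}")
```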
Sampling Distribution
Jeremy, out of boredom, decided to find the probability of a male student at BHS being 72 inches tall. Mr. Delton told him that the average height of the 857 male students at BHS is 67 inches, with a standard deviation of 3.5 inches. Show a statistical procedure that would help Jeremy on his quest to get rid of his boredom.
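One possible procedure, sketched under assumptions the problem does not state: heights are roughly normal, and the 67-inch mean and 3.5-inch standard deviation describe all 857 students. Because the question is about a single student, the sketch uses the population standard deviation rather than a standard error. Since the probability of any exact height is zero for a continuous variable, it shows two readings of "72 inches tall": at least 72 inches, and a height that rounds to 72 inches.

```python
from statistics import NormalDist

MEAN_HEIGHT = 67.0   # inches, from Mr. Delton
SD_HEIGHT = 3.5      # inches, from Mr. Delton

# Assumption: individual heights are approximately normal.
heights = NormalDist(mu=MEAN_HEIGHT, sigma=SD_HEIGHT)

# z-score of a 72-inch student relative to the population of students.
z_score = (72 - MEAN_HEIGHT) / SD_HEIGHT

# Reading 1: the student is at least 72 inches tall.
prob_at_least_72 = 1 - heights.cdf(72)

# Reading 2: the student's height rounds to 72 inches (71.5 to 72.5).
prob_rounds_to_72 = heights.cdf(72.5) - heights.cdf(71.5)

print(f"z = {z_score:.2f}")
print(f"P(height >= 72) = {prob_at_least_72:.4f}")
print(f"P(71.5 <= height <= 72.5) = {prob_rounds_to_72:.4f}")
```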