Sampling
Distributions
OBJECTIVE
Set Up a
Sampling
Distribution,
CLT, &
Applications
RELEVANCE
To see how
sampling can be
used to predict
population
values.
• The U.S. census is only done once
every 10 years because it is
impractical to do it often.
• Therefore, the sample becomes very
important.
The Sampling Issue……
• The goal of the survey is to get the
same results that would be obtained
if everyone in the entire population
had answered.
• It is important that every member of
the population has an equal chance
of being chosen.
Activity……
• Get in groups of 4.
• Find the mean height of your group.
• Now find the mean height of the
class.
• Is the class mean the same as your
group’s mean?
Sample Mean vs. Population
Mean……
• Not every sample mean will be the same
as the population mean, but if you take
good samples the means will be very
close.
$\bar{x}$ = sample mean
$\mu$ = population mean
$\mu_{\bar{x}}$ = mean of the sample means
What is a Sampling
Distribution?
Sampling Distribution – the
distribution of values for a sample
statistic obtained from repeated samples,
all of the same size and all drawn
from the same population.
Example……
• Consider the following set: {0,2,4,6,8}.
a. Make a list of all possible samples of
size 2 that can be drawn from this set.
b. Construct a sampling distribution of
the sample means for samples of size 2.
c. Graph the histogram of the population
and sampling distribution. What do
you notice?
a. Sets of 2 drawn from {0,2,4,6,8}:
(0,0)  (2,0)  (4,0)  (6,0)  (8,0)
(0,2)  (2,2)  (4,2)  (6,2)  (8,2)
(0,4)  (2,4)  (4,4)  (6,4)  (8,4)
(0,6)  (2,6)  (4,6)  (6,6)  (8,6)
(0,8)  (2,8)  (4,8)  (6,8)  (8,8)
b. 1st find the means for
each sample……
(0,0) 0   (2,0) 1   (4,0) 2   (6,0) 3   (8,0) 4
(0,2) 1   (2,2) 2   (4,2) 3   (6,2) 4   (8,2) 5
(0,4) 2   (2,4) 3   (4,4) 4   (6,4) 5   (8,4) 6
(0,6) 3   (2,6) 4   (4,6) 5   (6,6) 6   (8,6) 7
(0,8) 4   (2,8) 5   (4,8) 6   (6,8) 7   (8,8) 8
Sample Space
• Notice that each of these 25 samples
is equally likely to occur.
• Therefore, the probability of each
sample is 1/25 = 0.04.
The sampling distribution of
the sample means (SDSM)
$\bar{x}$   $P(\bar{x})$
0   1/25
1   2/25
2   3/25
3   4/25
4   5/25
5   4/25
6   3/25
7   2/25
8   1/25
Notice it is symmetric and mound-shaped, so it is approximately NORMAL!
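The table above can be rebuilt by brute force. The following is a minimal sketch, assuming (as the list of samples suggests) that the pairs are ordered and drawn with replacement; the variable names are illustrative and not part of the original slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [0, 2, 4, 6, 8]
n = 2  # size of each sample

# Every ordered sample of size n drawn with replacement: 5**2 = 25 samples.
samples = list(product(population, repeat=n))

# Count how many of the equally likely samples give each sample mean.
mean_counts = Counter(Fraction(sum(s), n) for s in samples)

# Each sample has probability 1/25, so P(mean) = count / 25.
for mean in sorted(mean_counts):
    print(f"{int(mean)}   {mean_counts[mean]}/{len(samples)}")
```

The counts rise from 1 to 5 and fall back to 1, which is the mound shape seen in the table.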
Example – You Try……
• Let’s say I picked out all the grades
for the last quiz that were either 57,
67, 77, 87, or 97 and put them in a
pile. Find every possible
combination of quiz grades I could
get if I picked 2 quizzes from this
pile.
• NOTE: There will be 25 possible
combinations.
There are 25 possible
combinations
(57, 57)  (67, 57)  (77, 57)  (87, 57)  (97, 57)
(57, 67)  (67, 67)  (77, 67)  (87, 67)  (97, 67)
(57, 77)  (67, 77)  (77, 77)  (87, 77)  (97, 77)
(57, 87)  (67, 87)  (77, 87)  (87, 87)  (97, 87)
(57, 97)  (67, 97)  (77, 97)  (87, 97)  (97, 97)
Now let's find the mean for
each pair
(57, 57) 57   (67, 57) 62   (77, 57) 67   (87, 57) 72   (97, 57) 77
(57, 67) 62   (67, 67) 67   (77, 67) 72   (87, 67) 77   (97, 67) 82
(57, 77) 67   (67, 77) 72   (77, 77) 77   (87, 77) 82   (97, 77) 87
(57, 87) 72   (67, 87) 77   (77, 87) 82   (87, 87) 87   (97, 87) 92
(57, 97) 77   (67, 97) 82   (77, 97) 87   (87, 97) 92   (97, 97) 97
• Each pair has a 1/25
chance of selection.
• Let’s make a chart.
Chart and Graph
$\bar{x}$   $P(\bar{x})$
57   1/25 = 0.04
62   2/25 = 0.08
67   3/25 = 0.12
72   4/25 = 0.16
77   5/25 = 0.20
82   4/25 = 0.16
87   3/25 = 0.12
92   2/25 = 0.08
97   1/25 = 0.04
[Bar chart: Probability Distribution of Means. Horizontal axis: Quiz Pair Means (57 to 97). Vertical axis: Probability (0 to 0.25).]
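As a rough stand-in for the bar chart, the sketch below recomputes the quiz-pair means and prints a text histogram. It assumes ordered pairs drawn with replacement, matching the 25 combinations listed above; the names are illustrative only.

```python
from collections import Counter
from itertools import product

grades = [57, 67, 77, 87, 97]

# All 25 ordered pairs of quiz grades, drawn with replacement.
pairs = list(product(grades, repeat=2))

# Mean of each pair, then how often each mean occurs.
pair_means = [(a + b) / 2 for a, b in pairs]
counts = Counter(pair_means)

# One row per distinct mean: value, probability, and a bar of '#' marks.
for mean in sorted(counts):
    prob = counts[mean] / len(pairs)
    print(f"{mean:3.0f}  {prob:.2f}  {'#' * counts[mean]}")
```

The bars peak at 77, which is also the mean of the five grades in the pile.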
Sampling
Distribution of
Sample Means SDSM
SDSM……
• If all possible random samples,
each of size n, are taken from any
population with mean $\mu$ and st.
deviation $\sigma$, then the SDSM will:
1. Have a sampling distribution mean
equal to the population mean.
$\mu_{\bar{x}} = \mu$
2. Have a sampling distribution standard
deviation equal to the population st. dev.
divided by the square root of the sample size.
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
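Both properties can be checked by exhaustive enumeration on a small population. This is a sketch under the assumption that samples are drawn with replacement (the setting in which the $\sigma/\sqrt{n}$ formula holds exactly); it reuses the set {0,2,4,6,8} from the earlier example, and the variable names are not from the slides.

```python
import math
import statistics
from itertools import product

population = [0, 2, 4, 6, 8]
n = 2  # size of each sample

# Population mean and population standard deviation.
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

# Mean of every possible ordered sample of size n (with replacement).
sample_means = [statistics.fmean(s) for s in product(population, repeat=n)]

# Center and spread of the sampling distribution of the sample means.
mu_xbar = statistics.fmean(sample_means)
sigma_xbar = statistics.pstdev(sample_means)

print("mean of sample means:", mu_xbar, " population mean:", mu)
print("st. dev. of sample means:", round(sigma_xbar, 4))
print("sigma / sqrt(n):", round(sigma / math.sqrt(n), 4))
```

`pstdev` is used rather than `stdev` because all 25 possible samples are listed, so the result is a population value, not an estimate.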
The shape of the
distribution……
• If the population has a normal
distribution, then the sampling
distribution of the sample means will
also be normal.
• If the population is NOT a normal
distribution, then we use the Central
Limit Theorem: for a large enough
sample size, the sampling distribution
of the sample means is approximately
normal.
The CLT……
• Definition – The SDSM will more closely
resemble the normal distribution as the
sample size increases.
• The CLT can be used to answer questions
about sample means in the same manner
that the normal distribution can be used to
answer questions about individual values.
• The CLT is used when the sampled
population is NOT normal. The
sampling distribution will be
approximately normal when the sample
size is large enough (a common rule of
thumb is n ≥ 30).
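A small simulation can illustrate the CLT. The sketch below is an assumption-laden example, not part of the original lesson: it uses an exponential population (strongly right-skewed, with mean 1 and st. deviation 1), a fixed random seed, and a simple skewness measure as a rough gauge of how far the sample means are from a symmetric, normal-looking shape.

```python
import math
import random
import statistics

def sample_means(n, num_samples=5000, seed=0):
    """Means of num_samples random samples of size n from an exponential population."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

def skewness(values):
    """Skewness of a list of values; it is 0 for a perfectly symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

for n in (2, 10, 30):
    means = sample_means(n)
    print(
        f"n = {n:2d}  skewness = {skewness(means):.2f}  "
        f"st. dev. = {statistics.pstdev(means):.3f}  "
        f"sigma/sqrt(n) = {1 / math.sqrt(n):.3f}"
    )
```

As n grows, the skewness of the simulated sample means should shrink toward 0 and their st. deviation should track $\sigma/\sqrt{n}$, even though the parent population is far from normal.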
The Standard Error of the
Mean……
• The symbol used to represent the
standard deviation of the sample means,
also known as the standard error
of the mean, is $\sigma_{\bar{x}}$.
The SDSM follows these
rules…….
1. The mean: $\mu_{\bar{x}} = \mu$
2. The standard deviation: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
This measures the spread. (Note: “n” is
the size of each sample)
3. The shape:
a. A normal parent population
produces a normal sampling
distribution.
b. When the parent population is NOT
normal, use the CLT: if the sample size is
large enough, the sampling distribution is
approximately normal.
Let’s show how this works
using an example…..
• Consider all possible samples of
size 2 drawn from {2,4,6}. Find the
probability distribution of the
population and draw its histogram.
Then find the sampling distribution of
the sample means and draw its
histogram.
Probability Distribution of
Parent & Histogram……
x   P(x)
2   1/3
4   1/3
6   1/3
$\mu = \sum [x \cdot P(x)] = 4$
$\sigma = \sqrt{\sum [(x - \mu)^2 \cdot P(x)]} = 1.63$
• Now, let’s do a sampling distribution
of sets of 2 from this population we
just described.
The sets of 2 and their
means……
(2,2) 2   (4,2) 3   (6,2) 4
(2,4) 3   (4,4) 4   (6,4) 5
(2,6) 4   (4,6) 5   (6,6) 6
Sampling Distribution……
$\bar{x}$   $P(\bar{x})$
2   1/9
3   2/9
4   3/9
5   2/9
6   1/9
• Find the mean of the
sampling distribution:
$\mu_{\bar{x}} = 4$
Note: $\mu_{\bar{x}} = \mu$
• Find the st. dev. of
the sampling dist:
$\sigma_{\bar{x}} = 1.15$
Note: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{1.63}{\sqrt{2}} = 1.15$
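The numbers in this example can be reproduced from the two probability tables using the formulas $\mu = \sum [x \cdot P(x)]$ and $\sigma = \sqrt{\sum [(x - \mu)^2 \cdot P(x)]}$. The sketch below is one way to do that; the helper names `dist_mean` and `dist_std` are hypothetical, chosen for this illustration.

```python
import math
from fractions import Fraction

def dist_mean(dist):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

def dist_std(dist):
    """Standard deviation of a discrete distribution given as {value: probability}."""
    mu = dist_mean(dist)
    return math.sqrt(sum((x - mu) ** 2 * p for x, p in dist.items()))

# Parent population {2, 4, 6}: each value has probability 1/3.
parent = {2: Fraction(1, 3), 4: Fraction(1, 3), 6: Fraction(1, 3)}

# Sampling distribution of the sample means for samples of size 2.
sdsm = {
    2: Fraction(1, 9),
    3: Fraction(2, 9),
    4: Fraction(3, 9),
    5: Fraction(2, 9),
    6: Fraction(1, 9),
}

print("parent: mean =", dist_mean(parent), " st. dev. =", round(dist_std(parent), 2))
print("SDSM:   mean =", dist_mean(sdsm), " st. dev. =", round(dist_std(sdsm), 2))
print("sigma / sqrt(2) =", round(dist_std(parent) / math.sqrt(2), 2))
```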
The Histogram……
• Now, take a look at
the shape of the
histogram of the
sampling
distribution. It is
approximately
normal.
Properties of SDSM –
Center, Shape, Spread
Center: $\mu_{\bar{x}} = \mu$
Shape: The shape of the distribution of the SDSM
was approx. normal.
Spread: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
Sample Question
• A certain population has a mean of 437 and a
standard deviation of 63. Many samples of size 49
are randomly selected and the means are calculated.
• A. What value would you expect to find for the mean of all
these sample means?
$\mu_{\bar{x}} = 437$
• B. What value would you expect to find for the st.
deviation of all these sample means?
$\sigma_{\bar{x}} = \frac{63}{\sqrt{49}} = 9$
• C. What shape would you expect the distribution of all
these sample means to have?
Approx. Normal
SDSM
Applications
Remember……
• Use “ncdf (z, z)” to find area or
probability under the curve.
• Change all “real” values to z-scores if the mean is not 0.
• Mean of the Sample Means = Population Mean: $\mu_{\bar{x}} = \mu$
• St. Error of the Mean: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
Why is Sample Size
Important?
• If $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$,
what happens as the
sample size increases?
• Answer:
As the sample size
increases, the
standard deviation
of the sample means
decreases. This
means that the
variation is
decreasing.
Remember, less
variation is better.
Larger sample size - smaller variation
Smaller sample size - larger variation
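To see the effect of sample size in numbers, the sketch below tabulates $\sigma/\sqrt{n}$ for several sample sizes. The population st. deviation of 20 and the list of sample sizes are assumptions picked for illustration, and `standard_error` is a hypothetical helper name.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample means for samples of size n."""
    if n < 1:
        raise ValueError("sample size must be at least 1")
    return sigma / math.sqrt(n)

sigma = 20  # assumed population standard deviation, for illustration only

# Each step multiplies the sample size by 4, which halves the standard error.
for n in (1, 4, 16, 64, 256):
    print(f"n = {n:3d}   standard error = {standard_error(sigma, n):.2f}")
```

Because n sits under a square root, cutting the variation in half requires four times as many observations per sample.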
Example……Follow the steps
• A normal population
has a population mean
of 100 and a
population st.
deviation of 20. If a
sample of size 16 is
selected, what is the
probability that this
sample will have a
mean value between
90 and 110?
• Draw the normal
distribution curve and
shade it.
• You need to change
90 and 110 to z-scores.
• Then use normalcdf
(z, z) to find the
probability.
• The z-score formula
will be a little bit
different now because
the st. deviation of the
population must be
changed to the st.
deviation of the sample
means. You now use
$z = \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
• Let’s change the mean
values of 90 and 110.
$z_{90} = \frac{90 - 100}{20 / \sqrt{16}} = \frac{-10}{5} = -2$
$z_{110} = \frac{110 - 100}{20 / \sqrt{16}} = \frac{10}{5} = 2$
• Now use normalcdf
from where you
started shading to
where you stopped
shading:
• ncdf(-2,2) = 0.9545
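The same steps can be followed without a calculator. This is a minimal sketch that assumes Python's standard-library `statistics.NormalDist` as a stand-in for the calculator's normalcdf; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 100, 20, 16

# Standard error of the mean: 20 / sqrt(16) = 5.
se = sigma / math.sqrt(n)

# Convert the two sample-mean cutoffs to z-scores.
z_low = (90 - mu) / se
z_high = (110 - mu) / se

# Area under the standard normal curve between the two z-scores.
standard_normal = NormalDist()  # mean 0, standard deviation 1
prob = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

print(z_low, z_high, round(prob, 4))
```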
Example……You Try
• Kindergarten children have heights
that are approximately normally
distributed with a population mean of
39 inches and a population standard
deviation of 2 inches. A sample of
25 is taken. What is the probability
that this sample will have a mean
value between 38.5 inches and 40
inches?
Answer……
$z_{38.5} = \frac{38.5 - 39}{2 / \sqrt{25}} = \frac{-0.5}{2/5} = -1.25$
$z_{40} = \frac{40 - 39}{2 / \sqrt{25}} = \frac{1}{2/5} = 2.5$
ncdf(-1.25, 2.5) = 0.8881
Cutoff Example
• If the population mean of a
distribution is 39 and the population
st. deviation is 2, within what
limits does the middle 90% of sample
means fall for samples of size 100?
• Hint: This is a cutoff score in the
middle. First, you find the z-scores.
Next, you substitute them back into
the z-score formula.
Answer……
• Find the z-score for
the middle 90%:
z = InvNorm(0.5 ± 0.90/2)
z = ±1.64
• Now, plug these into
the formula with the
new standard
deviation for a
sample mean.
$\pm 1.64 = \frac{\bar{x} - 39}{2 / \sqrt{100}}$
$\bar{x} = 38.67$
$\bar{x} = 39.33$
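The cutoff calculation can be sketched in code as well. This example assumes `NormalDist.inv_cdf` from the Python standard library as a substitute for the calculator's InvNorm; it uses the unrounded z-value (about 1.645), which still gives the same limits to two decimal places.

```python
import math
from statistics import NormalDist

mu, sigma, n = 39, 2, 100
middle = 0.90  # proportion of sample means to capture in the middle

# Standard error of the mean: 2 / sqrt(100) = 0.2.
se = sigma / math.sqrt(n)

# The middle 90% leaves 5% in each tail, so look up the 95th percentile.
tail = (1 - middle) / 2
z = NormalDist().inv_cdf(1 - tail)

# Solve z = (x_bar - mu) / se for x_bar at -z and +z.
lower = mu - z * se
upper = mu + z * se

print(round(z, 3), round(lower, 2), round(upper, 2))
```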