8.1 Sampling Distributions
LEARNING GOAL
Understand the fundamental ideas of sampling distributions
and how the distribution of sample means and the
distribution of sample proportions are formed. Also learn the
notation used to represent sample means and proportions.
Copyright © 2009 Pearson Education, Inc.
Sample Means: The Basic Idea
Table 8.1 lists the weights of the five
starting players (labeled A through E
for convenience) on a professional
basketball team. We regard these five
players as the entire population (with
a mean of 242.4 pounds).
Samples drawn from this population
of five players can range in size from
n = 1 (one player out of the five) to n = 5 (all five players).
With a sample size of n = 1, there are 5 different samples that
could be selected: Each player is a sample. The mean of each
sample of size n = 1 is simply the weight of the player in
the sample.
Slide 8.1- 2
Figure 8.1 shows a
histogram of the means of
the 5 samples; it is called a
distribution of sample
means, because it shows
the means of all 5 samples
of size n = 1.
Figure 8.1 Sampling distribution for sample size n = 1.
The distribution of sample means created by this process
is an example of a sampling distribution. This term simply
refers to a distribution of a sample statistic, such as a mean,
taken from all possible samples of a particular size.
Slide 8.1- 3
Notice that the mean of the 5 sample means is the mean of the
entire population:
(215 + 242 + 225 + 215 + 315) / 5 = 242.4 pounds
This demonstrates a general rule: The mean of a
distribution of sample means is the population mean.
Slide 8.1- 4
Let’s move on to samples of size n = 2, in which each sample
consists of two different players. With five players, there are 10
different samples of size n = 2. Each sample has its own mean.
Table 8.2 lists the 10 samples with their
means.
Figure 8.2 shows
the distribution
of all 10 sample
means.
Again, notice
that the mean of
the distribution
of sample
means is equal to the population mean,
242.4 pounds.
Slide 8.1- 5
Ten different samples of size n = 3 are possible in a population
of five players.
Table 8.3 shows these samples and their means, and Figure
8.3 shows the distribution of these sample means.
Again, the
mean of the
distribution
of sample
means is
equal to the
population mean, 242.4 pounds.
Slide 8.1- 6
With a sample size of n = 4, only 5 different samples are
possible.
Table 8.4 shows these samples and their means, and Figure 8.4
shows the distribution of these sample means.
Slide 8.1- 7
Finally, for a population of five
players, there is only 1 possible
sample of size n = 5: the entire
population. In this case, the
distribution of sample means is
just a single bar (Figure 8.5).
Again the mean of the distribution
of sample means is the population
mean, 242.4 pounds.
Figure 8.5 Sampling distribution
for sample size n = 5.
To summarize, when we work with all possible samples of a
given size from a population, the mean of the distribution of
sample means is always the population mean.
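The rule can be checked directly by listing every possible sample. The sketch below is an illustration and not part of the original slides. It assumes the five weights are the ones that appear in the sum for n = 1 (215, 242, 225, 215, and 315 pounds); the names `weights` and `mean_of_sample_means` are chosen here for illustration.

```python
from itertools import combinations
from statistics import mean

# Weights (pounds) of players A through E, taken from the n = 1 sum
# shown earlier (assumed to match Table 8.1, which is not reproduced here).
weights = [215, 242, 225, 215, 315]
population_mean = mean(weights)

def mean_of_sample_means(values, n):
    """Return (number of samples, mean of their means) for samples of size n."""
    # combinations() picks by position, so the two 215-pound players
    # are treated as different people.
    sample_means = [mean(sample) for sample in combinations(values, n)]
    return len(sample_means), mean(sample_means)

for n in range(1, len(weights) + 1):
    count, center = mean_of_sample_means(weights, n)
    print(f"n = {n}: {count} samples, mean of sample means = {center:.1f}")

print(f"population mean = {population_mean:.1f}")
```

Each sample size should report the same center as the population mean, while the number of possible samples changes with n.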
Slide 8.1- 8
Sample Means with Larger Populations
In typical statistical applications, populations are huge and
it is impractical or expensive to survey every individual in
the population; consequently, we rarely know the true
population mean, μ.
Therefore, it makes sense to consider using the mean of a
sample to estimate the mean of the entire population.
Although a sample is easier to work with, it cannot
possibly represent the entire population exactly. Therefore,
we should not expect an estimate of the population mean
obtained from a sample to be perfect.
The error that we introduce by working with a sample is
called the sampling error.
Slide 8.1- 9
Sampling Error
The sampling error is the error introduced because a
random sample is used to estimate a population
parameter. It does not include other sources of error,
such as those due to biased sampling, bad survey
questions, or recording mistakes.
Slide 8.1- 10
TIME OUT TO THINK
Would you expect the sampling error to increase or
decrease if the sample size were increased? Explain.
Slide 8.1- 11
Results from a survey of students who were asked how
many hours they spend per week using a search engine
on the Internet.
n = 400
μ = 3.88
σ = 2.40
Slide 8.1- 12
A sample of 32 students selected from the 400 on
the previous slide.
1.1 3.8 1.7 7.8 5.7 2.1 6.8 6.5
1.2 4.9 2.7 0.3 3.0 2.6 0.9 6.5
1.4 2.4 5.2 7.1 2.5 2.2 5.5 7.8
5.1 3.1 3.4 5.0 4.7 6.8 7.0 6.5
Sample 1
The mean of this sample is x̄ = 4.17; we use the standard
notation x̄ to denote this mean.
We say that x̄ is a sample statistic because it comes from a
sample of the entire population. Thus, x̄ is called a sample
mean.
Slide 8.1- 13
Notation for Population and Sample Means
n = sample size
μ = population mean
x̄ = sample mean
Slide 8.1- 14
A different sample of 32 students selected from the 400.
1.8 5.2 0.5 0.4 5.7 3.9 4.0 6.5
3.1 2.4 1.2 5.8 0.8 5.4 2.9 6.2
5.7 7.2 0.8 7.2 0.9 6.6 5.1 4.0
5.7 3.2 7.9 3.1 2.5 5.0 3.6 3.1
Sample 2
For this sample, x̄ = 3.98.
Now you have two sample means that don’t agree with each
other, and neither one agrees with the true population mean.
x̄₁ = 4.17 (slide 13)
x̄₂ = 3.98
μ = 3.88 (slide 12)
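As a quick check, the two sample means can be recomputed from the listed responses. This is a minimal sketch and not part of the original slides. The variable names are illustrative, and reporting the difference between each sample mean and the population mean as the "sampling error" is an assumption about how to put a number on that idea.

```python
from statistics import mean

# Hours per week, copied from Sample 1 and Sample 2 in the slides.
sample_1 = [1.1, 3.8, 1.7, 7.8, 5.7, 2.1, 6.8, 6.5,
            1.2, 4.9, 2.7, 0.3, 3.0, 2.6, 0.9, 6.5,
            1.4, 2.4, 5.2, 7.1, 2.5, 2.2, 5.5, 7.8,
            5.1, 3.1, 3.4, 5.0, 4.7, 6.8, 7.0, 6.5]
sample_2 = [1.8, 5.2, 0.5, 0.4, 5.7, 3.9, 4.0, 6.5,
            3.1, 2.4, 1.2, 5.8, 0.8, 5.4, 2.9, 6.2,
            5.7, 7.2, 0.8, 7.2, 0.9, 6.6, 5.1, 4.0,
            5.7, 3.2, 7.9, 3.1, 2.5, 5.0, 3.6, 3.1]
mu = 3.88  # population mean given in the slides

for label, sample in (("Sample 1", sample_1), ("Sample 2", sample_2)):
    x_bar = mean(sample)
    # Assumed measure of sampling error: sample mean minus population mean.
    print(f"{label}: n = {len(sample)}, x-bar = {x_bar:.2f}, "
          f"difference from mu = {x_bar - mu:+.2f}")
```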
Slide 8.1- 15
Figure 8.6 shows a histogram that results from 100 different
samples, each with 32 students. Notice that this histogram is
very close to a normal distribution and its mean is very close
to the population mean, μ = 3.88.
Figure 8.6 A distribution of 100 sample means, with a sample size of n = 32,
appears close to a normal distribution with a mean of 3.88.
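A process like the one behind Figure 8.6 can be imitated with a short simulation. The sketch below is only an illustration. The 400 real survey responses are not listed in the slides, so it ASSUMES a made-up population of 400 values spread evenly between 0 and 8 hours. The seed, the uniform shape, and all variable names are assumptions, and the numbers it prints will not match the figure.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the run is repeatable

# ASSUMPTION: a hypothetical population of 400 values between 0 and 8 hours.
# It stands in for the survey data, which the slides do not list.
population = [round(random.uniform(0, 8), 1) for _ in range(400)]
mu = mean(population)

# Draw 100 samples of size n = 32 (without replacement) and keep each mean.
sample_means = [mean(random.sample(population, 32)) for _ in range(100)]

print(f"population mean          = {mu:.2f}")
print(f"mean of 100 sample means = {mean(sample_means):.2f}")
print(f"smallest sample mean     = {min(sample_means):.2f}")
print(f"largest sample mean      = {max(sample_means):.2f}")
```

Individual sample means scatter around the population mean, but their average should land close to it.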
Slide 8.1- 16
TIME OUT TO THINK
Suppose you choose only one sample of size n = 32.
According to Figure 8.6, are you more likely to choose a
sample with a mean less than 2.5 or a sample with a
mean less than 3.5? Explain.
Slide 8.1- 17
The Distribution of Sample Means
The distribution of sample means is the distribution
that results when we find the means of all possible
samples of a given size.
The larger the sample size, the more closely this
distribution approximates a normal distribution.
In all cases, the mean of the distribution of sample
means equals the population mean.
If only one sample is available, its sample mean, x̄, is
the best estimate for the population mean, μ.
Slide 8.1- 18
Slide 8.1- 19
If we were to include all possible samples of size n = 32,
this distribution would have these characteristics:
• The distribution of sample means is approximately a
normal distribution.
• The mean of the distribution of sample means is 3.88 (the
mean of the population).
• The standard deviation of the distribution of sample
means depends on the population standard deviation and
the sample size. The population standard deviation is σ =
2.40 and the sample size is n = 32, so the standard
deviation of sample means is
σ/√n = 2.40/√32 ≈ 0.42
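The arithmetic for the standard deviation of sample means can be sketched as follows. The function name `sd_of_sample_means` is illustrative, and the extra sample sizes in the loop are assumptions added only to show how the value changes with n.

```python
from math import sqrt

def sd_of_sample_means(sigma, n):
    """Standard deviation of the distribution of sample means: sigma / sqrt(n)."""
    return sigma / sqrt(n)

sigma = 2.40  # population standard deviation (from the slides)
n = 32        # sample size (from the slides)

print(f"n = {n}: sigma / sqrt(n) = {sd_of_sample_means(sigma, n):.2f}")

# Hypothetical larger sample sizes, to show the effect of n.
for size in (128, 512):
    print(f"n = {size}: sigma / sqrt(n) = {sd_of_sample_means(sigma, size):.2f}")
```

Because n sits under a square root in the denominator, the result shrinks as the sample size grows.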
Slide 8.1- 20
Suppose we select the following random sample of 32
responses from the 400 responses given earlier:
5.8 7.5 5.8 5.2 3.9 3.4 7.3 4.1 0.5 7.9 7.7 7.7 5.0
2.3 7.8 2.3 5.0 6.8 6.5 1.7 2.1 7.3 4.0 2.2 5.6 4.7
5.3 3.5 6.5 3.4 6.6 5.0
Sample 3
The mean of this sample is x̄ = 5.01.
Given that the mean of the distribution of sample means is
3.88 and the standard deviation is 0.42, the sample mean
of x̄ = 5.01 has a standard score of
z = (sample mean – pop. mean) / standard deviation = (5.01 – 3.88) / 0.42 ≈ 2.7
Slide 8.1- 21
The sample (from the previous slide) has a standard score
of z = 2.7, indicating that it is 2.7 standard deviations
above the mean of the sampling distribution.
From Table 5.1, this standard score corresponds to the
99.65th percentile, so the probability of selecting another
sample with a mean less than 5.01 is about 0.9965.
It follows that the probability of selecting another sample
with a mean greater than 5.01 is about 1 – 0.9965 =
0.0035.
Apparently, the sample we selected is rather extreme
within this distribution.
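The standard score and the two probabilities can be reproduced without a printed table. This is a sketch under stated assumptions: Table 5.1 is not reproduced here, so the normal curve is evaluated with the error function from Python's `math` module instead, and the helper name `normal_cdf` is illustrative. The standard score is rounded to one decimal place to follow the slide.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Fraction of a standard normal distribution that lies below z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

x_bar, mu, sd = 5.01, 3.88, 0.42   # values from the slides
z = round((x_bar - mu) / sd, 1)    # rounded to one decimal, as on the slide
p_less = normal_cdf(z)             # stand-in for looking z up in Table 5.1

print(f"z = {z}")
print(f"P(sample mean < {x_bar}) is about {p_less:.4f}")
print(f"P(sample mean > {x_bar}) is about {1 - p_less:.4f}")
```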
Slide 8.1- 22
TIME OUT TO THINK
Suppose a sample mean is in the 95th percentile. Explain
why the probability of randomly selecting another sample
with a mean greater than the first mean is 0.05.
Slide 8.1- 23
EXAMPLE 1 Sampling Farms
Texas has roughly 225,000 farms, more than any other state in
the United States. The actual mean farm size is μ = 582 acres
and the standard deviation is σ = 150 acres. For random samples
of n = 100 farms, find the mean and standard deviation of the
distribution of sample means. What is the probability of
selecting a random sample of 100 farms with a mean greater
than 600 acres?
Solution: The mean of the distribution of sample means is the
same as the mean of the entire population, which is 582 acres.
Because the sample size is large, this distribution is
approximately normal.
The standard deviation of the sampling distribution is σ/√n =
150/√100 = 15 acres.
Slide 8.1- 24
EXAMPLE 1 Sampling Farms
Solution: (cont.)
A sample mean of 600 acres therefore has a standard score of
z = (sample mean – pop. mean) / standard deviation = (600 – 582) / 15 = 1.2
According to Table 5.1, this standard score is in the 88th
percentile, so the probability of selecting a sample with a mean
less than 600 acres is about 0.88.
Thus, the probability of selecting a sample with a mean greater
than 600 acres is about 0.12.
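The whole example can be worked in a few lines. The sketch below is an illustration, not the textbook's method: it uses `NormalDist` from Python's `statistics` module in place of Table 5.1 and assumes the distribution of sample means is treated as normal. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 582, 150, 100     # acres, acres, farms (from the example)

sd_means = sigma / sqrt(n)       # standard deviation of the sample means
z = (600 - mu) / sd_means        # standard score of a 600-acre sample mean

# Assumed model: sample means follow a normal curve centered on mu.
sampling_distribution = NormalDist(mu=mu, sigma=sd_means)
p_less = sampling_distribution.cdf(600)

print(f"standard deviation of sample means = {sd_means:.0f} acres")
print(f"z = {z:.1f}")
print(f"P(sample mean < 600) is about {p_less:.2f}")
print(f"P(sample mean > 600) is about {1 - p_less:.2f}")
```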
Slide 8.1- 25
Sample Proportions
In a survey where 400 students were asked if they own a
car, 240 replied that they did.
The exact proportion of car owners is
p = 240/400 = 0.6
This population proportion, p = 0.6, is another example of a
population parameter.
Slide 8.1- 26
TIME OUT TO THINK
Give another survey question that would result in a
population proportion rather than a population mean.
Slide 8.1- 27
A sample of 32 was selected from the 400 students and 21
were car owners.
p̂ = 21/32 = 0.656
This proportion is another example of a sample statistic.
In this case, it is a sample proportion because it is the
proportion of car owners within a sample; we use the symbol p̂
(read “p-hat”) to distinguish this sample proportion from the
population proportion, p.
Slide 8.1- 28
Notation for Population and Sample Proportions
n = sample size
p = population proportion
p̂ = sample proportion
Slide 8.1- 29
Figure 8.7 shows a histogram of sample proportions
from 100 samples of size n = 32. As we found for sample
means, this distribution of sample proportions is very close
to a normal distribution. Furthermore, the mean of this
distribution is very close to the population proportion of 0.6.
Figure 8.7 The distribution of 100 sample proportions, with a sample size
of 32, appears to be close to a normal distribution.
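A histogram like Figure 8.7 comes from repeating the sampling step many times. The sketch below imitates that process using the population described in the slides (400 students, 240 of them car owners). The random seed, the choice to sample without replacement, and the names used are assumptions, so the output will not match the figure exactly.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the run is repeatable (an arbitrary choice)

# Population from the slides: 400 students, 240 of whom own a car.
population = ["Y"] * 240 + ["N"] * 160

def sample_proportion(sample):
    """Proportion of car owners ("Y") in one sample."""
    return sample.count("Y") / len(sample)

# 100 samples of size n = 32, each drawn without replacement.
proportions = [sample_proportion(random.sample(population, 32))
               for _ in range(100)]

print(f"population proportion p         = {sample_proportion(population):.2f}")
print(f"mean of 100 sample proportions  = {mean(proportions):.3f}")
print(f"smallest and largest proportion = {min(proportions):.3f}, {max(proportions):.3f}")
```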
Slide 8.1- 30
Suppose it were possible to select all possible samples
of size n = 32. The resulting distribution would be
called a distribution of sample proportions.
The mean of this distribution equals the population
proportion exactly.
This distribution approaches a normal distribution as
the sample size increases.
In practice, we often have only one sample to work
with. In that case, the best estimate for the population
proportion, p, is the sample proportion, p̂.
Slide 8.1- 31
The Distribution of Sample Proportions
The distribution of sample proportions is the
distribution that results when we find the proportions (p̂ )
in all possible samples of a given size.
The larger the sample size, the more closely this
distribution approximates a normal distribution.
In all cases, the mean of the distribution of sample
proportions equals the population proportion.
If only one sample is available, its sample proportion, p̂ ,
is the best estimate for the population proportion, p.
Slide 8.1- 32
EXAMPLE 2 Analyzing a Sample Proportion
Consider the distribution of sample proportions shown in Figure
8.7 (slide 30). Assume that its mean is p = 0.6 and its standard
deviation is 0.1. Suppose you randomly select the following
sample of 32 responses:
YYNYYYYNYYYYYYNYYNYYYNYYNYYNYNYY
Compute the sample proportion, p̂, for this sample. How far does
it lie from the mean of the distribution? What is the probability of
selecting another sample with a proportion greater than the one
you selected?
Solution: The proportion of Y responses in this sample is
p̂ = 24/32 = 0.75
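Counting the Y responses by hand is error-prone, so a short check may help. The snippet below is a minimal sketch; the variable names are illustrative.

```python
# The 32 responses exactly as listed in the example (Y = yes, N = no).
responses = "YYNYYYYNYYYYYYNYYNYYYNYYNYYNYNYY"

yes_count = responses.count("Y")
p_hat = yes_count / len(responses)

print(f"{yes_count} of {len(responses)} responses are Y, so p-hat = {p_hat}")
# prints: 24 of 32 responses are Y, so p-hat = 0.75
```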
Slide 8.1- 33
EXAMPLE 2 Analyzing a Sample Proportion
Solution: (cont.)
Using a mean of 0.6 and a standard deviation of 0.1, we find
that the sample statistic, p̂ = 0.75, has a standard score of
z = (sample proportion – pop. proportion) / standard deviation = (0.75 – 0.6) / 0.1 = 1.5
The sample proportion is 1.5 standard deviations above the
mean of the distribution.
Using Table 5.1, we see that a standard score of 1.5 corresponds
to the 93rd percentile. The probability of selecting another
sample with a proportion less than the one we selected is about
0.93.
Slide 8.1- 34
EXAMPLE 2 Analyzing a Sample Proportion
Solution: (cont.)
Thus, the probability of selecting another sample with a
proportion greater than the one we selected is about 1 – 0.93 =
0.07.
In other words, if we were to select 100 random samples of 32
responses, we should expect to see only 7 samples with a higher
proportion than the one we selected.
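The numbers in this solution can be checked with a short calculation. This is a sketch under the example's own assumptions (mean 0.6, standard deviation 0.1, a normal-shaped distribution); `NormalDist` from Python's `statistics` module stands in for Table 5.1, and the variable names are illustrative.

```python
from statistics import NormalDist

p, sd = 0.6, 0.1   # mean and standard deviation assumed in the example
p_hat = 0.75       # sample proportion found earlier (24 of 32)

z = (p_hat - p) / sd
p_greater = 1 - NormalDist(mu=p, sigma=sd).cdf(p_hat)

print(f"z = {z:.1f}")
print(f"P(another sample proportion > {p_hat}) is about {p_greater:.2f}")
print(f"expected count in 100 samples: about {100 * p_greater:.0f}")
```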
Slide 8.1- 35
The End
Slide 8.1- 36