Chapter 7: Sampling Distributions
Section 7.3
Sample Means
The Practice of Statistics, 4th edition – For AP*
STARNES, YATES, MOORE
Learning Objectives
After this section, you should be able to…

FIND the mean and standard deviation of the sampling distribution of
a sample mean

CALCULATE probabilities involving a sample mean when the
population distribution is Normal

EXPLAIN how the shape of the sampling distribution of sample
means is related to the shape of the population distribution

APPLY the central limit theorem to help find probabilities involving a
sample mean
Sample Means
We have seen how sample proportions arise most often when we are
interested in categorical variables. We might be interested in finding
the proportion of males or females, etc.
When we record quantitative variables, we are interested in other
statistics such as the median, mean, or standard deviation of the
variable. Sample means are among the most common statistics.
Like any statistic computed from a random sample, a sample mean also
has a sampling distribution.
The Sampling Distribution of $\bar{x}$
As we have seen in Section 7.1, when we choose many SRSs from a
population, the sampling distribution of the sample mean is centered at
the population mean $\mu$ and is less spread out than the population
distribution. Here are the facts.

Mean and Standard Deviation of the Sampling Distribution of Sample Means
Suppose that $\bar{x}$ is the mean of an SRS of size $n$ drawn from a large
population with mean $\mu$ and standard deviation $\sigma$. Then:
The mean of the sampling distribution of $\bar{x}$ is $\mu_{\bar{x}} = \mu$.
The standard deviation of the sampling distribution of $\bar{x}$ is
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
as long as the 10% condition is satisfied: $n \le \frac{1}{10}N$.
Note: These facts about the mean and standard deviation of $\bar{x}$ are true
no matter what shape the population distribution has.
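
The two facts about the center and spread of the sampling distribution can be tried out numerically. The sketch below is only an illustration: the helper name `sampling_distribution_of_mean` and the population values (mean 100, standard deviation 15, population size 10,000) are made up for the example and do not come from the slides.

```python
import math

def sampling_distribution_of_mean(mu, sigma, n, population_size=None):
    """Return (mean, standard deviation) of the sampling distribution of x-bar.

    Hypothetical helper for illustration. If population_size is given,
    the 10% condition n <= N/10 is checked before using sigma / sqrt(n).
    """
    if population_size is not None and n > population_size / 10:
        raise ValueError("10% condition not met: n must be at most N/10")
    return mu, sigma / math.sqrt(n)

# Made-up population values, used only to show the pattern.
for n in (4, 16, 64):
    mean, sd = sampling_distribution_of_mean(100, 15, n, population_size=10_000)
    print(f"n = {n:2d}: mean = {mean}, sd = {sd:.3f}")
```

The mean stays at the population mean for every n, while each fourfold increase in sample size cuts the standard deviation in half, because n sits under a square root.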

Sampling from a Normal Population
We have described the mean and standard deviation of the sampling
distribution of the sample mean $\bar{x}$ but not its shape. That's because the
shape of the distribution of $\bar{x}$ depends on the shape of the population
distribution.
In one important case, there is a simple relationship between the two
distributions. If the population distribution is Normal, then so is the
sampling distribution of $\bar{x}$. This is true no matter what the sample size is.

Sampling Distribution of a Sample Mean from a Normal Population
Suppose that a population is Normally distributed with mean $\mu$ and standard
deviation $\sigma$. Then the sampling distribution of $\bar{x}$ has the Normal
distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, provided
that the 10% condition is met.
Example: Young Women’s Heights
The height of young women follows a Normal distribution with mean
$\mu = 64.5$ inches and standard deviation $\sigma = 2.5$ inches.

Find the probability that a randomly selected young woman is
taller than 66.5 inches.
Let X = the height of a randomly selected young woman. X is N(64.5, 2.5).
$$z = \frac{66.5 - 64.5}{2.5} = 0.80$$
$$P(X > 66.5) = P(Z > 0.80) = 1 - 0.7881 = 0.2119$$
The probability of choosing a young woman at random whose height exceeds
66.5 inches is about 0.21.

Find the probability that the mean height of an SRS of 10 young women
exceeds 66.5 inches.
For an SRS of 10 young women, the sampling distribution of their sample
mean height will have mean and standard deviation
$$\mu_{\bar{x}} = \mu = 64.5 \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2.5}{\sqrt{10}} = 0.79$$
Since the population distribution is Normal, the sampling distribution
will follow an N(64.5, 0.79) distribution.
$$z = \frac{66.5 - 64.5}{0.79} = 2.53$$
$$P(\bar{x} > 66.5) = P(Z > 2.53) = 1 - 0.9943 = 0.0057$$
It is very unlikely (less than a 1% chance) that we would choose an SRS
of 10 young women whose average height exceeds 66.5 inches.
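
The two probabilities in this example can be double-checked without a Normal table. The following is a minimal sketch using only the Python standard library; the helper names `normal_cdf` and `prob_above` are hypothetical, and the code keeps the unrounded standard deviation 2.5/√10 rather than 0.79, so its results may differ from the table-based values in the last digit.

```python
import math

def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_above(value, mean, sd):
    """P(variable > value) for a Normal(mean, sd) variable."""
    z = (value - mean) / sd
    return 1.0 - normal_cdf(z)

mu, sigma = 64.5, 2.5  # heights of young women, in inches (from the example)

# One randomly selected woman: X is N(64.5, 2.5)
p_individual = prob_above(66.5, mu, sigma)

# Mean of an SRS of n = 10: x-bar is N(64.5, 2.5 / sqrt(10))
n = 10
p_sample_mean = prob_above(66.5, mu, sigma / math.sqrt(n))

print(f"P(X > 66.5)     = {p_individual:.4f}")
print(f"P(x-bar > 66.5) = {p_sample_mean:.4f}")
```

The same cutoff of 66.5 inches is far less likely for an average of 10 women than for one woman, because the sample mean varies less than individual heights do.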
Simulating the Sampling Distribution of a Mean
Most population distributions are not Normal.
What is the shape of the sampling distribution
of sample means when the population
distribution isn’t Normal?
We can use simulation to get a sense as to
what the sampling distribution of the sample
mean might look like…
Means – The “Average” of One Die

Let’s start with a simulation of 10,000 tosses of a
die. A histogram of the results is:
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley
Means – Averaging More Dice

Looking at the average of
two dice after a simulation
of 10,000 tosses:

The average of three dice
after a simulation of
10,000 tosses looks like:
Means – Averaging Still More Dice

The average of 5 dice
after a simulation of
10,000 tosses looks like:

The average of 20 dice
after a simulation of
10,000 tosses looks like:
Means – What the Simulations Show


As the sample size (number of dice) gets larger,
each sample average is more likely to be closer
to the population mean.
So, we see the shape continuing to tighten
around 3.5.
And, it probably does not shock you that the
sampling distribution of a mean becomes approximately Normal.
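
The dice simulations described above can be reproduced with a short script. This is a minimal sketch, not the software used to make the original histograms: the function name `simulate_dice_averages` and the random seed are arbitrary choices, and the script prints the center and spread of each set of averages instead of drawing a histogram.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the run is repeatable

def simulate_dice_averages(num_dice, num_tosses=10_000):
    """Average of num_dice fair dice, repeated num_tosses times."""
    return [
        statistics.fmean(random.randint(1, 6) for _ in range(num_dice))
        for _ in range(num_tosses)
    ]

results = {}
for num_dice in (1, 2, 3, 5, 20):
    averages = simulate_dice_averages(num_dice)
    results[num_dice] = (statistics.fmean(averages), statistics.pstdev(averages))
    center, spread = results[num_dice]
    print(f"{num_dice:2d} dice: mean of averages = {center:.2f}, sd = {spread:.2f}")
```

Every row should be centered near 3.5, and the standard deviation should shrink as more dice are averaged, which is the tightening the slides describe.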
The Fundamental Theorem of Statistics


The sampling distribution of any mean becomes
approximately Normal as the sample size grows.
All we need is for the observations to be
independent and collected with randomization.
We don’t even care about the shape of the
population distribution!
This fact is the Fundamental Theorem of Statistics
and is called the Central Limit Theorem (CLT).
The Central Limit Theorem
The CLT is surprising and a bit weird:
Not only does the distribution of the sample means get
closer and closer to the Normal model as the sample
size grows, but this is true regardless of the shape
of the population distribution.
The CLT works better (and faster) the closer the
population model is to a Normal itself. It also works
better for larger samples.
Draw an SRS of size $n$ from any population with mean $\mu$ and finite
standard deviation $\sigma$. The central limit theorem (CLT) says that when $n$
is large, the sampling distribution of the sample mean $\bar{x}$ is approximately
Normal.
Note: How large a sample size $n$ is needed for the sampling distribution to be
close to Normal depends on the shape of the population distribution. More
observations are required if the population distribution is far from Normal.
Something so powerful bears repeating, as it is
what we know to be the Fundamental Theorem of Statistics.
The Central Limit Theorem (CLT)
The mean of a random sample has a sampling
distribution whose shape can be approximated by
a Normal model. The larger the sample, the
better the approximation will be.
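
One way to watch the approximation improve is to simulate sample means from a strongly skewed population and track how lopsided their distribution is. The sketch below makes assumptions that are not in the slides: it uses an exponential population (mean 1, standard deviation 1) as a stand-in for "far from Normal", measures shape with a simple skewness statistic (0 for a symmetric distribution), and picks an arbitrary seed and number of repetitions.

```python
import random
import statistics

random.seed(2)  # arbitrary seed so the run is repeatable

def skewness(values):
    """Average cubed z-score; 0 for a perfectly symmetric shape."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

def sample_means(n, repetitions=5_000):
    """Means of `repetitions` samples of size n from an exponential population."""
    return [
        statistics.fmean(random.expovariate(1.0) for _ in range(n))
        for _ in range(repetitions)
    ]

skew_by_n = {n: skewness(sample_means(n)) for n in (1, 5, 30, 100)}
for n, skew in skew_by_n.items():
    print(f"n = {n:3d}: skewness of sample means = {skew:.2f}")
```

The skewness should fall toward 0 as n grows, which matches the statement that larger samples give a better Normal approximation, and that skewed populations need more observations.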
Assumptions and Conditions

The CLT requires remarkably few assumptions, but
there are a few conditions to check:
1. Random Sampling Condition: The data values must
be sampled randomly or the concept of a sampling
distribution makes no sense.
2. Independence Assumption: The sample values must
be mutually independent. (When the sample is drawn
without replacement, check the 10% condition…)
3. Large Enough Sample Condition: There is no one-size-fits-all
rule, although you can be pretty sure about using a Normal model
if the sample size is at least 30.
Example: Servicing Air Conditioners
Based on service records from the past year, the time (in hours) that a
technician requires to complete preventative maintenance on an air
conditioner follows a distribution that is strongly right-skewed, and
whose most likely outcomes are close to 0. The mean time is $\mu = 1$
hour and the standard deviation is $\sigma = 1$ hour.
Your company will service an SRS of 70 air conditioners. You have budgeted
1.1 hours per unit. Will this be enough?
Since the 10% condition is met (there are more than 10(70) = 700 air
conditioners in the population), the sampling distribution of the mean time
spent working on the 70 units has
$$\mu_{\bar{x}} = \mu = 1 \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{1}{\sqrt{70}} = 0.12$$
The sampling distribution of the mean time spent working is approximately
N(1, 0.12) since n = 70 ≥ 30.
We need to find P(mean time > 1.1 hours).
$$z = \frac{1.1 - 1}{0.12} = 0.83$$
$$P(\bar{x} > 1.1) = P(Z > 0.83) = 1 - 0.7967 = 0.2033$$
If you budget 1.1 hours per unit, there is about a 20% chance the
technicians will not complete the work within the budgeted time.
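
The budget calculation can be checked with a few lines of Python. This is a sketch under the same Normal approximation the example uses; the helper `normal_cdf` and the variable names are hypothetical, and because the code does not round the standard deviation to 0.12, its answer can differ slightly from the table-based 0.2033.

```python
import math

def normal_cdf(z):
    """Standard Normal cumulative probability P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma = 1.0, 1.0   # service time per unit, in hours (from the example)
n = 70                 # number of air conditioners in the SRS
budget = 1.1           # budgeted hours per unit

sd_xbar = sigma / math.sqrt(n)          # standard deviation of the sample mean
z = (budget - mu) / sd_xbar             # standardized budget
p_over_budget = 1.0 - normal_cdf(z)     # approximate, by the CLT (n >= 30)

print(f"sd of x-bar       = {sd_xbar:.3f}")
print(f"z                 = {z:.2f}")
print(f"P(x-bar > budget) = {p_over_budget:.3f}")
```

Changing `budget` or `n` shows how a larger budget, or a larger number of units, lowers the chance of running over.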
Sampling Distribution Models


Always remember that the statistic itself is a
random quantity.
 We can’t know what our statistic will be
because it comes from a random sample.
Fortunately, for the mean and proportion, the CLT
tells us that we can model their sampling
distribution directly with a Normal model.
Sampling Distribution Models (cont.)

There are two basic truths about sampling
distributions:
1. Sampling distributions arise because
samples vary. Each random sample will have
different cases and, so, a different value of
the statistic.
2. Although we can always simulate a sampling
distribution, the Central Limit Theorem saves
us the trouble for means and proportions.
What Can Go Wrong?

Don’t confuse the sampling distribution with the
distribution of the sample.
 When you take a sample, you look at the
distribution of the values, usually with a
histogram, and you may calculate summary
statistics.
 The sampling distribution is an imaginary
collection of the values that a statistic might
have taken for all random samples—the one
you got and the ones you didn’t get.
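
The difference between the two distributions can be made concrete with a simulation. The sketch below is illustrative only: the population (rolls of a fair die), the sample size of 25, the seed, and the function name `draw_sample` are all assumptions chosen for the example.

```python
import random
import statistics

random.seed(3)  # arbitrary seed so the run is repeatable

def draw_sample(n):
    """One random sample of n values from a hypothetical population (a fair die)."""
    return [random.randint(1, 6) for _ in range(n)]

n = 25

# Distribution of the sample: the 25 individual values in the one sample we got.
one_sample = draw_sample(n)
sample_sd = statistics.stdev(one_sample)

# Sampling distribution: the means of many samples, including ones we "didn't get".
many_means = [statistics.fmean(draw_sample(n)) for _ in range(5_000)]
sd_of_means = statistics.pstdev(many_means)

print(f"sd of values within one sample = {sample_sd:.2f}")
print(f"sd of the sample means         = {sd_of_means:.2f}")
```

The values inside a single sample spread out about as much as the population does, while the sample means cluster much more tightly, by roughly a factor of √25 = 5.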
What Can Go Wrong? (cont.)


Beware of observations that are not independent.
 The CLT depends crucially on the assumption
of independence.
 You can’t check this with your data—you have
to think about how the data were gathered.
Watch out for small samples from skewed
populations.
 The more skewed the distribution, the larger
the sample size we need for the CLT to work.
What have we learned?


Sample proportions and means will vary from
sample to sample—that’s sampling error
(sampling variability).
Sampling variability may be unavoidable, but it is
also predictable!
What have we learned? (cont.)


We’ve learned to describe the behavior of sample
proportions when our sample is random and large
enough to expect at least 10 successes and
failures.
We’ve also learned to describe the behavior of
sample means (thanks to the CLT!) when our
sample is random (and larger if our data come
from a population that’s not roughly unimodal and
symmetric).
Section 7.3
Sample Means
Summary
In this section, we learned that…

When we want information about the population mean $\mu$ for some variable,
we often take an SRS and use the sample mean $\bar{x}$ to estimate the unknown
parameter $\mu$. The sampling distribution of $\bar{x}$ describes how the statistic
varies in all possible samples of the same size from the population.

The mean of the sampling distribution is $\mu$, so that $\bar{x}$ is an
unbiased estimator of $\mu$.

The standard deviation of the sampling distribution of $\bar{x}$ is $\sigma/\sqrt{n}$
for an SRS of size $n$ if the population has standard deviation $\sigma$. This
formula can be used if the population is at least 10 times as large as the
sample (10% condition).

Choose an SRS of size $n$ from a population with mean $\mu$ and standard
deviation $\sigma$. If the population is Normal, then so is the sampling
distribution of the sample mean $\bar{x}$. If the population distribution is not
Normal, the central limit theorem (CLT) states that when $n$ is large, the
sampling distribution of $\bar{x}$ is approximately Normal.

We can use a Normal distribution to calculate approximate probabilities for
events involving $\bar{x}$ whenever the Normal condition is met:
If the population distribution is Normal, so is the sampling distribution of $\bar{x}$.
If $n \ge 30$, the CLT tells us that the sampling distribution of $\bar{x}$ will be
approximately Normal in most cases.