Section 7.3
Sample Means
Salaries of MLB players activity
http://www.usatoday.com/sports/mlb/salaries/2014/player/all/
Sample Means
• Averages are less variable than
individual observations
• Averages are more normal than
individual observations
• These two facts contribute to the
popularity of sample means in
statistical inference
The Sampling Distribution of x̄
When we choose many SRSs from a population, the sampling distribution of
the sample mean is centered at the population mean µ and is less spread
out than the population distribution. Here are the facts.
Sampling Distribution of a Sample Mean
Suppose that x̄ is the mean of an SRS of size n drawn from a large population
with mean μ and standard deviation σ. Then:
The mean of the sampling distribution of x̄ is
$$\mu_{\bar{x}} = \mu$$
The standard deviation of the sampling distribution of x̄ is
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
as long as the 10% condition is satisfied: n ≤ (1/10)N or 10n < N.
Note: These facts about the mean and standard deviation of x̄ are true
no matter what shape the population distribution has.
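The two facts above can be checked with a small simulation. The sketch below is only an illustration: the population (20,000 right-skewed values), the sample size n = 25, and the helper name simulate_sample_means are all assumptions made for this example, not part of the original slides.

```python
import math
import random
import statistics

def simulate_sample_means(population, n, reps, rng):
    """Draw `reps` SRSs of size n (without replacement) and return their means."""
    return [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical right-skewed population of N = 20,000 values (not from the slides).
population = [rng.expovariate(1 / 50) for _ in range(20_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 25  # 10n = 250 < N, so the 10% condition holds
means = simulate_sample_means(population, n, reps=4_000, rng=rng)

print(f"population mean      {mu:.2f}")
print(f"mean of sample means {statistics.fmean(means):.2f}")
print(f"sigma / sqrt(n)      {sigma / math.sqrt(n):.2f}")
print(f"sd of sample means   {statistics.stdev(means):.2f}")
```

The first pair of printed values should nearly agree, and so should the second pair, even though the population here is far from Normal.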
Sampling From a Normal Population
We have described the mean and standard deviation of the sampling
distribution of the sample mean x̄ but not its shape. That's because the
shape of the distribution of x̄ depends on the shape of the population
distribution.
In one important case, there is a simple relationship between the two
distributions. If the population distribution is Normal, then so is the
sampling distribution of x̄. This is true no matter what the sample size is.
Sampling Distribution of a Sample Mean from a Normal Population
Suppose that a population is Normally distributed with mean μ
and standard deviation σ. Then the sampling distribution of x̄
has the Normal distribution with mean μ and standard
deviation σ/√n, provided that the 10% condition is met.
Problem: At the P. Nutty Peanut Company, dry-roasted, shelled
peanuts are placed in jars by a machine. The distribution of weights in
the jars is approximately Normal, with a mean of 16.1 ounces and a
standard deviation of 0.15 ounces.
(a) Without doing any calculations, explain which outcome is more
likely: randomly selecting a single jar and finding that the
contents weigh less than 16 ounces or randomly selecting 10 jars
and finding that the average contents weigh less than 16 ounces.
(a) Because averages are less variable than individual
measurements, you would expect the sample mean of 10
jars to be closer, on average, to the true mean of 16.1
ounces. Thus, it is more likely that a single jar would weigh
less than 16 ounces than for the average weight of 10 jars
to be less than 16 ounces.
(b) Find the probability that a randomly selected jar contains less
than 16 ounces of peanuts.
(b) Step 1: State the distribution and values of interest. Let X =
weight of the contents of a randomly selected jar of peanuts. X has
roughly the N(16.1, 0.15) distribution and we want to find P(X < 16).
Step 2: Perform calculations and show your work. The standardized
score for the boundary value is
$$z = \frac{16 - 16.1}{0.15} = -0.67$$
and P(Z < –0.67) = 0.2514. Using technology: The command
normalcdf(lower: −1000, upper: 16, μ: 16.1, σ: 0.15) gives an area of 0.2525.
Step 3: Answer the question. There is about a 25% chance that a
single jar will contain less than 16 ounces of peanuts.
(c) Find the probability that 10 randomly selected jars contain less
than 16 ounces of peanuts, on average.
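(c) Step 1: State the distribution and values of interest. Let x̄ = average weight of the
contents of 10 randomly selected jars of peanuts. We want to find P(x̄ < 16).
Step 2: Check conditions and perform calculations.
10% condition: 10n = 10(10) = 100 < N, the number of all jars of peanuts produced.
Normality: the population distribution is approximately Normal, so the sampling
distribution of x̄ is approximately Normal as well.
$$\mu_{\bar{x}} = \mu = 16.1$$
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{0.15}{\sqrt{10}} \approx 0.047$$
The standardized score for the boundary value is
$$z = \frac{16 - 16.1}{0.047} = -2.13$$
and P(Z < –2.13) = 0.0166.
Using technology: The command normalcdf(lower: −1000, upper: 16, μ:
16.1, σ: 0.047) gives an area of 0.0167.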
Step 3: Answer the question. There is about a 2% chance that a random sample of 10 jars will
contain less than 16 ounces of peanuts, on average. This confirms our answer to part (a).
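For readers without a calculator's normalcdf command, parts (b) and (c) can be reproduced with Python's standard library. This is a minimal sketch, assuming the Normal model stated in the problem; the variable names are chosen for this example only.

```python
import math
from statistics import NormalDist

mu, sigma = 16.1, 0.15  # jar weights in ounces, from the problem statement
n = 10                  # number of jars in the sample

# (b) a single jar: X is roughly N(16.1, 0.15)
p_single = NormalDist(mu, sigma).cdf(16)

# (c) mean of 10 jars: x-bar is roughly N(16.1, 0.15 / sqrt(10))
sigma_xbar = sigma / math.sqrt(n)
p_mean = NormalDist(mu, sigma_xbar).cdf(16)

print(f"P(X < 16)     = {p_single:.4f}")
print(f"sigma_xbar    = {sigma_xbar:.4f}")
print(f"P(x-bar < 16) = {p_mean:.4f}")
```

The probability printed for part (c) comes out slightly larger than the 0.0167 reported above, because the code keeps the unrounded standard deviation (about 0.0474) instead of 0.047. The conclusion is unchanged: roughly a 2% chance.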
The Central Limit Theorem
Consider the strange population distribution
from the Rice University sampling distribution
applet.
Describe the shape of the sampling
distributions as n increases. What do you
notice?
The Central Limit Theorem
Most population distributions are not Normal. It is a remarkable fact that as the
sample size increases, the distribution of sample means changes its shape: it looks
less like that of the population and more like a Normal distribution!
When the sample is large enough, the distribution of sample means is very close to
Normal, no matter what shape the population distribution has, as long as the
population has a finite standard deviation.
If the population distribution is Normal, then so is the sampling distribution of x̄.
This is true no matter what the sample size n is.
Draw an SRS of size n from any population with mean μ and finite
standard deviation σ. The central limit theorem (CLT) says that when n
is large (n ≥ 30), the sampling distribution of the sample mean x̄ is approximately
Normal, even if the population distribution is not Normal.
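A quick way to see the CLT at work without the applet is to simulate sample means from a skewed population and watch the skewness shrink toward 0 as n grows. The sketch below is an illustration under assumptions: it uses an exponential population (strongly right-skewed) with independent draws, and the helper names skewness and sample_mean_skewness are made up for this example.

```python
import random
import statistics

def skewness(values):
    """Sample skewness: the average cubed z-score (near 0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def sample_mean_skewness(n, reps, rng):
    """Skewness of `reps` sample means, each computed from n independent draws."""
    means = [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]
    return skewness(means)

rng = random.Random(2024)  # fixed seed so the run is repeatable
results = {n: sample_mean_skewness(n, reps=3_000, rng=rng) for n in (1, 5, 30)}

for n, skew in results.items():
    print(f"n = {n:>2}: skewness of sample means = {skew:.2f}")
```

With n = 1 the sample means are just individual observations, so their skewness matches the population's. By n = 30 the skewness is much closer to 0, which is what "approximately Normal" looks like numerically.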
The Sampling Distribution of x̄
Mean texts
Suppose that the number of texts sent during a
typical day by students at a particular high school
follows a right-skewed distribution with a mean of
45 and a standard deviation of 35. How likely is it
that a random sample of 50 students will have sent
more than a total of 2500 texts in the last day?
Solution:
State the distribution and values of interest. Let x̄ = average number
of texts for 50 randomly selected students. We want to find P(x̄ >
2500/50) = P(x̄ > 50).
Check conditions:
10% condition: 10n = 10(50) = 500 < N, the population of all students
at this school.
Normality: n = 50 ≥ 30, so by the CLT the sampling distribution of x̄ is
approximately Normal, even though the population is right-skewed.
$$\mu_{\bar{x}} = \mu = 45$$
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{35}{\sqrt{50}} \approx 4.95$$
Thus, x̄ has approximately the N(45, 4.95) distribution.
Perform calculations and show your work. The standardized score for
the boundary value is
$$z = \frac{50 - 45}{4.95} = 1.01$$
and P(Z > 1.01) = 0.1562. Using technology: The
command normalcdf(lower: 50, upper: 1000, μ: 45, σ: 4.95) gives an area
of 0.1562.
Answer the question in context. There is about a 16% chance that a
random sample of 50 students will have sent more than 2500 texts in
total (an average of more than 50 texts each) in the last day.
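The same calculation can be sketched in Python. It assumes, as the solution does, that the CLT makes the Normal approximation reasonable for n = 50; the variable names are chosen for this example only.

```python
import math
from statistics import NormalDist

mu, sigma = 45, 35    # texts per day, from the problem statement
n, total = 50, 2500   # sample size and the total of interest

sigma_xbar = sigma / math.sqrt(n)  # standard deviation of the sample mean
threshold = total / n              # a total over 2500 means an average over 50

# Upper-tail area: P(x-bar > 50) under the approximate Normal model
p = 1 - NormalDist(mu, sigma_xbar).cdf(threshold)

print(f"sigma_xbar     = {sigma_xbar:.2f}")
print(f"P(x-bar > {threshold:.0f}) = {p:.4f}")
```

Converting the total (2500) to an average (50) first is the key step, because the sampling distribution facts are stated for the sample mean rather than the sample total.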