"It is a capital mistake to theorize before one has data."
Sir Arthur Conan Doyle
Chapter 23: INFERENCES FOR MEANS (Pages 520 - 546)
OVERVIEW: In Chapter 18 we were given the standard deviation ($\sigma$) of the population. In reality, one
frequently does not know the standard deviation of the population from which a random sample was
obtained. When sample sizes are small and the population standard deviation is not known, statisticians
make use of the t-distribution, a family of bell-shaped distributions that differ from
the normal distribution in having more area in the tails. Hence, there is a t-distribution table
that differs from the normal distribution table. As sample sizes get larger, the t-distribution approaches
the normal distribution in shape. In a one-sample test, the degrees of freedom (needed to use the table) equal
one less than the sample size, and they determine which t-distribution you will use.
A few important things to note in this situation:
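- When the standard deviation parameter is estimated from a sample, the resulting statistic is called the
standard error.
- The standard error of the sample mean is $SE(\bar{x}) = \dfrac{s_x}{\sqrt{n}}$, where $s_x$ is the sample standard deviation.
- If a sample has mean $\bar{x}$, the one-sample t-statistic is $t = \dfrac{\bar{x} - \mu_0}{s_x/\sqrt{n}}$, where $\mu_0$ is the hypothesized mean.
- A confidence interval for the mean based on the t-distribution has the form $\bar{x} \pm t^* \cdot SE(\bar{x})$,
where t* is the appropriate value from the t-distribution table (table T in back of book).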
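The quantities in these notes can be sketched in a few lines of Python. This is only an illustration, not part of the original handout: the function name `one_sample_t` and the toy data are hypothetical, and nothing beyond the standard library is assumed.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (xbar, s, se, t) for a one-sample t-test of H0: mu = mu0."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)   # sample standard deviation (divides by n - 1)
    se = s / math.sqrt(n)          # standard error of the sample mean
    t = (xbar - mu0) / se          # one-sample t-statistic
    return xbar, s, se, t

# Hypothetical toy data, chosen only to show the arithmetic.
toy = [1, 2, 3, 4, 5]
xbar, s, se, t = one_sample_t(toy, mu0=2)
print(f"xbar={xbar}, s={s:.4f}, SE={se:.4f}, t={t:.4f}, dof={len(toy) - 1}")
```

The degrees of freedom are simply `len(sample) - 1`; the critical value t* still has to come from a t-table or a calculator.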
Assumptions/Conditions:
1. Independence Assumption:
A) Randomization condition:
B) 10% condition:
2. Normal Population Assumption:
A) Nearly Normal Condition:
Examples:
1. A coffee machine dispenses coffee into paper cups. You’re supposed to get 10 ounces of coffee, but
the amount varies slightly from cup to cup. Here are the amounts measured in a random sample of 20
cups. Is there evidence that the machine is shortchanging customers?
9.9   9.7   10.0  10.1  9.9
9.6   9.8   9.8   10.0  9.5
9.7   10.1  9.9   9.6   10.2
9.8   10.0  9.9   9.5   9.9
Define the parameter.
Hypotheses.
Model. Random sample; 20 < 10% of all cups (no reason to doubt independence).
The histogram of sample data to the right looks roughly unimodal and
symmetric, so it’s reasonable to believe that the amount of coffee in all
possible cups could be described by a Normal model.
Mechanics. n = ______   dof = ______   $\bar{x}$ = ______   $s_x$ = ______
t = ______
P-value = ______
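One way to check the mechanics for the coffee data is sketched below. It assumes the hypotheses are H0: $\mu$ = 10 against HA: $\mu$ < 10 (the "shortchanging" question), and it approximates the t-distribution tail area by numerical integration because the Python standard library has no t-distribution. The helper names `t_pdf` and `t_upper_tail` are made up for this sketch.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

cups = [9.9, 9.7, 10.0, 10.1, 9.9, 9.6, 9.8, 9.8, 10.0, 9.5,
        9.7, 10.1, 9.9, 9.6, 10.2, 9.8, 10.0, 9.9, 9.5, 9.9]
mu0 = 10.0                       # assumed null value: the advertised 10 ounces

n = len(cups)
df = n - 1
xbar = statistics.mean(cups)
s = statistics.stdev(cups)
se = s / math.sqrt(n)
t_stat = (xbar - mu0) / se

# Lower-tail P-value; t_stat is negative here, so use symmetry of the t curve.
p_value = t_upper_tail(abs(t_stat), df)

print(f"n={n}, dof={df}, xbar={xbar:.3f}, s={s:.3f}, t={t_stat:.2f}, P={p_value:.4f}")
```

The t-statistic comes out near -3.49, and the tail area is on the order of 0.001, which is the "small P-value" the conclusion refers to.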
Conclusion. Such a small P-value makes it unlikely that the low sample mean resulted from sampling
error, so we ______
Confidence interval. The conditions have been met, so we can create a 95% one-sample t-interval.
(Note that all confidence intervals look alike: estimate ± margin of error.)
$\bar{x} \pm t^*_{19} \cdot SE(\bar{x}) =$ ______
We are 95% confident that ______
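The interval for the coffee data can be checked with the short sketch below. The critical value 2.093 is the standard t-table entry for 95% confidence with 19 degrees of freedom; it is typed in by hand here rather than computed, and the variable names are only illustrative.

```python
import math
import statistics

cups = [9.9, 9.7, 10.0, 10.1, 9.9, 9.6, 9.8, 9.8, 10.0, 9.5,
        9.7, 10.1, 9.9, 9.6, 10.2, 9.8, 10.0, 9.9, 9.5, 9.9]

n = len(cups)
xbar = statistics.mean(cups)
se = statistics.stdev(cups) / math.sqrt(n)

t_star = 2.093                   # t-table value: 95% confidence, 19 dof
margin = t_star * se             # margin of error
lower, upper = xbar - margin, xbar + margin

print(f"estimate = {xbar:.3f}, margin of error = {margin:.3f}")
print(f"95% CI: ({lower:.3f}, {upper:.3f}) ounces")
```

Note that the whole interval lies below 10 ounces, which agrees with the outcome of the hypothesis test.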
2. A company has set a goal of developing a battery that lasts over 5 hours (300 minutes) in continuous
use. In a first test of 12 of these batteries, the following lifespans were measured (in minutes): 321, 295, 332,
351, 281, 336, 311, 253, 270, 326, 311, and 288.
• Is there evidence that the company has met its goal?
Solution:
We want to know if the mean battery lifespan exceeds the 300-minute goal set by the manufacturer. We
have 12 battery lifespans in our sample to test the claim.
Define the parameter.
Hypotheses.
Model.
Randomization Condition: This is not a random sample of batteries,
but merely 12 batteries produced for preliminary testing. However, it
is reasonable to assume that these batteries are representative of all
batteries.
Nearly Normal Condition: The distribution of battery lifespans is
roughly unimodal and symmetric, so it’s reasonable to assume that the
lifespans of all batteries could be described by a Normal model.
Since the conditions have been met, we can do a one-sample t-test for the mean, with 11 degrees of
freedom.
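Mechanics. n = ______   dof = ______   $\bar{x}$ = ______   $s_x$ = ______
$t = \dfrac{\bar{x} - \mu_0}{s_x/\sqrt{n}}$ = ______ (where $\mu_0$ is the hypothesized mean)
P-value = ______
Conclusion. ______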
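The battery mechanics can be checked the same way. The sketch below assumes H0: $\mu$ = 300 against HA: $\mu$ > 300 (the company's goal) and approximates the upper-tail area numerically; `t_pdf` and `t_upper_tail` are helper names invented for this sketch, not functions from any library.

```python
import math
import statistics

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

lifespans = [321, 295, 332, 351, 281, 336, 311, 253, 270, 326, 311, 288]
mu0 = 300                        # assumed null value: the 300-minute goal

n = len(lifespans)
df = n - 1
xbar = statistics.mean(lifespans)
s = statistics.stdev(lifespans)
se = s / math.sqrt(n)
t_stat = (xbar - mu0) / se

# Upper-tail P-value; t_stat is positive for these data.
p_value = t_upper_tail(t_stat, df)

print(f"n={n}, dof={df}, xbar={xbar:.2f}, s={s:.2f}, t={t_stat:.3f}, P={p_value:.3f}")
```

Here the sample mean is above 300, but the t-statistic is well under 1, so the tail area is large (between 0.20 and 0.25).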
• Find a 90% confidence interval for the mean lifespan of this type of battery.
Solution. The conditions have been met, so we can create a one-sample t-interval, with 90% confidence.
$\bar{x} \pm t^*_{11} \cdot SE(\bar{x}) =$ ______
I am 90% confident that the mean battery lifespan is between ______ and ______ minutes.
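A quick check of the 90% interval is sketched below. The critical value 1.796 is the standard t-table entry for 90% confidence with 11 degrees of freedom, entered by hand; the variable names are illustrative only.

```python
import math
import statistics

lifespans = [321, 295, 332, 351, 281, 336, 311, 253, 270, 326, 311, 288]

n = len(lifespans)
xbar = statistics.mean(lifespans)
se = statistics.stdev(lifespans) / math.sqrt(n)

t_star = 1.796                   # t-table value: 90% confidence, 11 dof
margin = t_star * se
lower, upper = xbar - margin, xbar + margin

print(f"estimate = {xbar:.2f}, margin of error = {margin:.2f}")
print(f"90% CI: ({lower:.1f}, {upper:.1f}) minutes")
```

The interval contains 300 minutes, which is consistent with a test that does not find strong evidence the goal has been met.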
• If we wish to conduct another trial, how many batteries must we test to be 95% sure of
estimating the mean lifespan to within 15 minutes? To within 5 minutes?
Solution: We want to know how many batteries to test to be 95% sure
of estimating the mean lifespan to within 15 minutes. First, do a preliminary
estimate using z* = 1.96 as the critical value: solve $ME = z^* \dfrac{s_x}{\sqrt{n}}$ for $n$ with $ME = 15$.
Now, do a better estimate, using $t^*_{n-1}$ (with $n$ taken from the preliminary estimate)
as the critical value.
We would need to sample about 18 batteries in order to estimate the
mean battery lifespan to within 15 minutes, with 95% confidence.
Finally, to estimate the mean battery lifespan to within 5 minutes, you could do the entire process again,
perhaps using a critical value with much higher degrees of freedom. We know that it’s going to take lots
more batteries to cut the margin of error to a third of what it was. Alternatively, we know it will take a
sample about 9 times as large, 18(9) = 162 batteries, since the margin of error was decreased to a third of
its size. (The standard error has the square root of the sample size in its denominator.)
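The two-step sample size estimate can be sketched as follows. It uses the sample standard deviation of the 12 test batteries as the planning value, and the t critical value 2.145 (the 95% table entry for 14 degrees of freedom) is entered by hand; `required_n` is a hypothetical helper name.

```python
import math
import statistics

lifespans = [321, 295, 332, 351, 281, 336, 311, 253, 270, 326, 311, 288]
s = statistics.stdev(lifespans)  # planning value for the standard deviation

def required_n(critical_value, s, margin):
    """Smallest n for which critical_value * s / sqrt(n) <= margin."""
    return math.ceil((critical_value * s / margin) ** 2)

# Step 1: preliminary estimate with z* = 1.96.
n_prelim = required_n(1.96, s, margin=15)

# Step 2: better estimate with t* for n_prelim - 1 = 14 dof (table value).
t_star_14 = 2.145
n_better = required_n(t_star_14, s, margin=15)

# Shortcut for a 5-minute margin: one third the margin needs 3**2 = 9 times the sample.
n_5min = n_better * (15 // 5) ** 2

print(n_prelim, n_better, n_5min)
```

The last line applies the shortcut from the text rather than repeating the full calculation with a new critical value.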
Facts about the t-distribution
1. Density curves of t-distributions are similar in shape to the normal curve.
2. The spread of a t-distribution is a tad greater than that of the normal distribution.
3. As the dof increase, the curve approaches the normal curve.
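These facts can be seen numerically by comparing the area to the right of 2 under several t curves with the same area under the standard normal curve. The sketch below is illustrative only: the cutoff 2 and the list of dof values are arbitrary choices, and `t_pdf` and `t_upper_tail` are helper names invented here.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

cutoff = 2.0
normal_tail = 1 - NormalDist().cdf(cutoff)
tails = {df: t_upper_tail(cutoff, df) for df in (3, 10, 30, 100)}

for df, area in tails.items():
    print(f"dof = {df:>3}: area beyond {cutoff} = {area:.4f}")
print(f"normal   : area beyond {cutoff} = {normal_tail:.4f}")
```

Each t tail area is larger than the normal one, and the gap shrinks as the dof grow.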
Rules for using the t-test
Ideally the population will have a normal distribution, but for times when this is not given:
- For sample sizes less than 15, the t-procedures can be used if the data are ______
- For sample sizes 15 to 40, t-procedures can be safely used as long as the data are ______
- For sample sizes greater than 40, t-procedures can be used
Use the TI-83 to calculate a quick 95% confidence interval for the following problem:
An SRS of 75 male adults living in a particular suburb was taken to study the amount of time they spent
per week doing rigorous exercise. It indicated a mean of 73 minutes with a standard deviation of 21
minutes.
$\mu$ = true mean rigorous exercise time for all males in the suburb
$\bar{x}$ = ______   $s_x$ = ______   C = ______   dof = ______   t* = ______
t-confidence interval = $\bar{x} \pm t^* \dfrac{s_x}{\sqrt{n}}$
The desired 95% confidence interval is ______
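The calculator's TInterval works from the summary statistics alone, and the same arithmetic can be sketched in Python. Since 74 degrees of freedom is not a row in most printed t-tables, this sketch finds t* numerically by bisection; the helpers `t_pdf`, `t_upper_tail`, and `t_critical` are made up for this illustration.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(confidence, df):
    """Find t* whose central area equals `confidence`, by bisection."""
    target = (1 - confidence) / 2      # area in each tail
    lo, hi = 0.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_upper_tail(mid, df) > target:
            lo = mid                   # tail still too big: move right
        else:
            hi = mid
    return (lo + hi) / 2

n, xbar, s = 75, 73.0, 21.0            # summary statistics from the problem
df = n - 1
t_star = t_critical(0.95, df)
margin = t_star * s / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(f"dof = {df}, t* = {t_star:.4f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f}) minutes")
```

With a sample this large, t* is already close to the normal critical value 1.96.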