Chapter 23: Inferences About Means
General Background
We would now like to construct confidence intervals and conduct hypothesis tests for
population means (similar to what we did for population proportions in chapters 19 and 21).
Recall, the distribution of sample means (i.e. the distribution of $\bar{x}$):
• $\sigma(\bar{x}) = \dfrac{\sigma}{\sqrt{n}}$, where $\sigma$ is the population’s standard deviation
• However, we typically will not know the value of $\sigma$.
• Thus we use s (the sample’s standard deviation) instead to get the standard error of the $\bar{x}$ distribution.
• $\sigma(\bar{x}) \approx SE(\bar{x}) = \dfrac{s}{\sqrt{n}}$
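As a small illustration of the standard error, here is a minimal Python sketch. The data values and the function name `standard_error` are invented for this example and are not from the chapter.

```python
import math
import statistics

def standard_error(sample):
    """Estimate the standard deviation of x-bar by s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    return s / math.sqrt(n)

# Hypothetical data, invented only for illustration.
sample = [12.0, 15.0, 11.0, 14.0, 13.0]
se = standard_error(sample)
print(round(se, 4))
```

Note that `statistics.stdev` computes the sample standard deviation s, not the population value.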
PROBLEM
When we use this “distribution” of $\bar{x}$, with the standard error in place of the standard deviation, and calculate “z-scores”, the resulting distribution is no longer a normal curve (even if n is more than 30).
Rather, we get a curve similar to the z-curve. It is centered at 0, unimodal and symmetric, but it is shorter at the center and has thicker tails than the z-curve.
We end up on one of the Gosset t-curves.
• Each one of these curves is defined by a parameter called the “degrees of freedom” or df. For every positive number, there is a t-curve for that many degrees of freedom.
• For problems involving only one population mean, we will use the t-curve with df = n – 1, where n is the sample size.
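To see how the t-curves compare with the z-curve, the following Python sketch evaluates both density curves at the center (0) and out in the tail (3). It uses the standard density formula for the t distribution; the function names `t_pdf` and `z_pdf` are chosen here for illustration.

```python
import math

def t_pdf(x, df):
    """Height of the t-curve with df degrees of freedom at x."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def z_pdf(x):
    """Height of the standard normal curve (z-curve) at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Compare heights at the center and in the tail for a few df values.
for df in (2, 5, 30):
    print("df =", df, "center:", round(t_pdf(0, df), 4), "tail:", round(t_pdf(3, df), 4))
print("z-curve", "center:", round(z_pdf(0), 4), "tail:", round(z_pdf(3), 4))
```

As df grows, the t-curve heights move toward those of the z-curve.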
We can use these distributions if the population is known to be normal, or the sample size is large (C.L.T.), or if the sample data has a fairly linear normal plot. The plot can be created on the TI:
• put the data in L1
• in “STAT PLOT”, choose Plot1, turn it on and select the last plot option
• press ZOOM then 9 to see the plot
• If we did this for the data in problem 36 of chapter 23, the plot should look like this:
[Figure: normal plot of the data from problem 36]
Since this is fairly linear looking, the sample data is consistent with having come from a normal population.
Confidence Intervals For a Single Mean
Assumptions
• a properly selected random sample of size n has mean $\bar{x}$ and standard deviation s
• the population is known to be normal, or the sample size is large, or the sample data produces a fairly linear normal plot
Formula
$\mu$ is in $\bar{x} \pm t^{*}\left(\dfrac{s}{\sqrt{n}}\right)$
where t* is the value from the t-curve with df = n – 1 corresponding to the level of
confidence (this is the equivalent of the z* values used in the confidence intervals
for a population proportion)
This is problematic in practice, because there are different values of t* for each
different t-curve (too many to memorize). So we will, in practice, construct these
using technology.
Calculator
Go into the STAT menu, over to TESTS, and select TInterval…
If you have the actual sample data in L1, choose “Data”. If you have the summary statistics for the sample ($\bar{x}$ and s), choose “Stats” and enter the values.
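For readers without a TI calculator, the same kind of interval can be sketched in Python using only the standard library. This is a rough numerical sketch, not the calculator's algorithm: the helper names (`t_cdf`, `t_star`, `t_interval`) are made up here, the t-curve area is approximated with Simpson's rule, and the summary statistics are hypothetical.

```python
import math

def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) on the t-curve with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    h = abs(x) / steps                      # Simpson's rule on [0, |x|]
    area = pdf(0) + pdf(abs(x))
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_star(confidence, df):
    """Critical value t* with central area `confidence`, found by bisection."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:           # widen until the bracket contains t*
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(xbar, s, n, confidence):
    """Return (lower, upper) for xbar ± t* · s/sqrt(n), with df = n - 1."""
    me = t_star(confidence, n - 1) * s / math.sqrt(n)
    return xbar - me, xbar + me

# Hypothetical summary statistics, invented for illustration.
lower, upper = t_interval(xbar=100.0, s=10.0, n=25, confidence=0.95)
print(round(lower, 2), round(upper, 2))
```

This corresponds to the “Stats” input option, where only $\bar{x}$, s and n are supplied.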
Example 1
Construct a 96% confidence interval for the average calorie content of all vanilla yogurts
using the data from #40 in chapter 23.
Example 2
In a random sample of 50 of a new brand of battery, the average lifespan is 952 hours with a
standard deviation of 18 hours. Construct a 98% confidence interval for the average lifespan
of all such batteries.
Choosing Sample Size
The margin of error for these confidence intervals is $ME = t^{*}\left(\dfrac{s}{\sqrt{n}}\right)$.
So if we choose a desired margin of error and confidence level, we can solve this formula for n to get
$n = \left(\dfrac{t^{*} s}{ME}\right)^{2}$
Now, this formula has 2 problems. We cannot know with which t-curve we are working
without knowing the sample size (because df = n – 1) and we do not have a value for s until
we have taken a sample. So, to determine n we need to know t* and s, but to know t* and s
we need to have a sample.
We “fix” this by replacing the t* values with the z* values from our previous
confidence intervals and we use a value for s that comes from previous studies.
Recall, z* = 1.645 for 90% confidence, 1.96 for 95% confidence, and 2.33 for 98%
confidence.
Thus: $n \geq \left(\dfrac{z^{*} s}{ME}\right)^{2}$
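The sample size calculation can be sketched in Python as follows. The function name `sample_size` and the input values are hypothetical. Note also that the code computes z* from the normal curve directly rather than using the rounded table values, so results may occasionally differ by one from a hand calculation.

```python
import math
from statistics import NormalDist

def sample_size(confidence, s, me):
    """Smallest whole n with n >= (z* · s / ME)^2."""
    # z* leaves area (1 - confidence) / 2 in each tail of the z-curve.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z_star * s / me) ** 2)

# Hypothetical inputs: 95% confidence, s = 15 from an earlier study, ME = 2.
n = sample_size(confidence=0.95, s=15.0, me=2.0)
print(n)
```

The result is always rounded up, since rounding down would give a margin of error larger than the one requested.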
Example 3
Suppose we wish to construct a 98% confidence interval for average body temperature of
people testing positive for a new strain of influenza within 0.14. Suppose also that previous
studies support that the standard deviation of human body temperature is 0.64. How many
subjects must be tested?
Hypothesis Tests For a Single $\mu$ (P-value approach)
1. Hypotheses
H0: $\mu$ = #
Ha: one of (a) $\mu$ > #
(b) $\mu$ < #
(c) $\mu$ ≠ #
This can be done via the TI calculator
by choosing “T-Test”. The menus are
similar to those for the TInterval…
discussed above.
2. Test Statistic
$t = \dfrac{\bar{x} - \#}{s / \sqrt{n}}$
3. P-value
(a) tcdf(t, "∞", df)
(b) tcdf("–∞", t, df)
(c) 2 * tcdf(|t|, "∞", df)
where df = n – 1
4. Conclusion
Compare P-value to $\alpha$
5. Validity
• properly collected random sample
• one of:
  • normal population
  • large sample size (C.L.T.)
  • fairly linear looking normal plot from the sample data
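Steps 2 and 3 above can be sketched in Python with the standard library. This is an illustrative approximation only: the names `t_cdf` and `t_test` are invented here, the tail area is computed with Simpson's rule in place of the calculator's tcdf, and the numbers are hypothetical.

```python
import math

def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x) on the t-curve with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    h = abs(x) / steps                      # Simpson's rule on [0, |x|]
    area = pdf(0) + pdf(abs(x))
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_test(xbar, s, n, mu0, alternative):
    """Return (t, P-value); alternative is 'greater', 'less' or 'two-sided'."""
    df = n - 1
    t = (xbar - mu0) / (s / math.sqrt(n))   # test statistic from step 2
    if alternative == "greater":            # case (a): area to the right of t
        p = 1 - t_cdf(t, df)
    elif alternative == "less":             # case (b): area to the left of t
        p = t_cdf(t, df)
    else:                                   # case (c): both tails beyond |t|
        p = 2 * (1 - t_cdf(abs(t), df))
    return t, p

# Hypothetical summary statistics, invented for illustration.
t, p = t_test(xbar=52.0, s=5.0, n=16, mu0=50.0, alternative="greater")
print(round(t, 3), round(p, 4))
```

The P-value returned would then be compared to the chosen significance level, as in step 4.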
Hypothesis Tests for a Single $\mu$ (C.I. Approach)
1. Hypotheses: Same as above
2. ___ % Confidence interval (for our purposes, constructed using TInterval)
3. Conclusion: Reject H0 if hypothesized value is not in interval, Fail to Reject H0 if it is
4. Validity: Same as above
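The decision rule in step 3 can be written as a short Python function. The function name `ci_decision` and the interval endpoints are hypothetical; the endpoints stand in for whatever TInterval reports.

```python
def ci_decision(lower, upper, mu0):
    """Apply the C.I. approach: reject H0 when mu0 falls outside the interval."""
    if lower <= mu0 <= upper:
        return "Fail to reject H0"
    return "Reject H0"

# Hypothetical interval endpoints, invented for illustration.
print(ci_decision(95.87, 104.13, mu0=100.0))
print(ci_decision(95.87, 104.13, mu0=105.0))
```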
Example 4
Find the P-value for each of the following, assuming samples are from a normal population.
(a) H0: $\mu$ = 100
Ha: $\mu$ < 100
t = -1.48
n = 15
(b) H0: $\mu$ = 58
Ha: $\mu$ ≠ 58
t = -2.64
n = 10
Example 5: Yellow Sheet #6
The posted speed limit on a certain residential road is 30mph. The residents believe that
drivers are speeding on this road on average. They observe 20 randomly selected drivers on
this road and find the mean speed to be 31.8mph with a standard deviation of 4.2mph. Is the
residents’ belief accurate?
Example 6: Yellow Sheet #6 via 90% C.I.
The posted speed limit on a certain residential road is 30mph. The residents believe that
drivers are speeding on this road on average. They observe 20 randomly selected drivers on
this road and find the mean speed to be 31.8mph with a standard deviation of 4.2mph. Is the
residents’ belief accurate? Test the relevant hypotheses using a 90% confidence interval.
Example 7: Yellow Sheet #7
Use the data from chapter 23, #36 to test whether the average caloric intake from all yogurt is
less than 175 calories.