10.1 t Distribution for Inferences
about a Mean
LEARNING GOAL
Understand when it is appropriate to use the Student t
distribution rather than the normal distribution for
constructing confidence intervals or conducting hypothesis
tests for population means, and know how to make proper
use of the t distribution.
Copyright © 2009 Pearson Education, Inc.
Much of the work in preceding sections assumed the
sampling distribution is normal, but a review of articles in
professional journals shows that professional statisticians
rarely use the normal distribution for confidence intervals and
hypothesis tests in real applications.
A major reason for this is that the normal distribution requires
that we know the population standard deviation σ. Because
we generally do not know σ, we must estimate it with the
sample standard deviation s. Statisticians therefore prefer an
approach that does not require knowing σ.
Such is the case with the Student t distribution, or t
distribution for short, which can be used when we do not
know the population standard deviation and either the sample
size is greater than 30 or the population has a normal
distribution.
Slide 10.1- 2
Inferences about a Population Mean: Choosing
between t and Normal Distributions
t distribution: Population standard deviation is not known
and the population is normally distributed.
or
Population standard deviation is not known
and the sample size is greater than 30.
Normal distribution: Population standard deviation is known
and the population is normally distributed.
or
Population standard deviation is known
and the sample size is greater than 30.
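The choice above can be sketched as a small decision function. This is only an illustration of the rule on this slide; the function name, its parameters, and the "neither" label are assumptions made for the sketch, not part of the original material.

```python
def choose_distribution(sigma_known: bool, n: int, population_normal: bool) -> str:
    """Return 't', 'normal', or 'neither' following the rule on this slide."""
    # Both distributions require a large sample (n > 30) or a normal population.
    if not (n > 30 or population_normal):
        return "neither"
    # Known sigma -> normal distribution; unknown sigma -> t distribution.
    return "normal" if sigma_known else "t"


print(choose_distribution(sigma_known=False, n=5, population_normal=True))
print(choose_distribution(sigma_known=True, n=50, population_normal=False))
print(choose_distribution(sigma_known=False, n=12, population_normal=False))
```

The three calls cover one case for each possible answer: a small sample from a normal population with σ unknown, a large sample with σ known, and a small sample from a non-normal population.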
Slide 10.1- 3
Figure 10.1 This figure compares the standard normal distribution to the t
distribution for two different sample sizes. Notice that as the sample size gets
larger, the t distribution more closely approximates the normal distribution.
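The convergence described in the caption can be checked numerically. The sketch below evaluates the standard Student t density and the standard normal density at the center (x = 0) and in the tail (x = 3). The degrees-of-freedom values are illustrative assumptions; they are not the sample sizes used in Figure 10.1.

```python
import math


def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


# Illustrative degrees of freedom (assumed, not taken from the figure).
for df in (2, 10, 30, 100):
    print(f"df = {df:3d}: pdf(0) = {t_pdf(0, df):.4f}, pdf(3) = {t_pdf(3, df):.4f}")
print(f"normal:   pdf(0) = {normal_pdf(0):.4f}, pdf(3) = {normal_pdf(3):.4f}")
```

For small df the t density is lower at the center and higher in the tails than the normal density, and the gap narrows as df grows.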
Slide 10.1- 4
The t distribution is very similar in shape and symmetry to the
normal distribution, but it accounts for the greater variability
that is expected with small samples.
The real value of the t distribution is that it allows us to extend
ideas of confidence intervals or hypothesis tests to many cases
in which we cannot use the normal distribution because we do
not know the population standard deviation.
Keep in mind, however, that it still does not work for all cases.
For example, if we have a small sample of size 30 or less and the
sample data suggest that the population has a distribution which
is radically different from a normal distribution, then neither the
t distribution nor the normal distribution applies. Such cases
require other methods not discussed in this book.
Slide 10.1- 5
Confidence Intervals Using the t
Distribution
To specify a confidence interval, we must first calculate the
margin of error, E.
With a t distribution, the formula is
$$E = t \cdot \frac{s}{\sqrt{n}}$$
where n is the sample size, s is the sample standard deviation,
and t is a value that we look up in Table 10.1 (next slide).
Slide 10.1- 6
Slide 10.1- 7
Steps for finding t values in Table 10.1.
• First, determine the number of degrees of freedom for the
sample data, defined to be the sample size minus 1:
degrees of freedom for t distribution = n – 1
• Table 10.1 (previous slide) shows degrees of freedom in
column 1. Find the row corresponding to the number of
degrees of freedom in your sample data, and then look across
the row to find the appropriate t value.
For confidence intervals for population means, column 2 gives
the t values for 95% confidence and column 3 gives the t values
for 90% confidence.
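When a printed table is not at hand, the same t values can be approximated numerically. The following is a minimal sketch that uses only the Python standard library: it integrates the t density with Simpson's rule and bisects for the t value whose central area equals the confidence level. The function names, step count, and search bounds are assumptions chosen for this illustration, not a standard API.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def central_area(t, df, steps=2000):
    """Area under the t density between -t and +t (Simpson's rule, even steps)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # doubled because the density is symmetric


def t_critical(confidence, n):
    """Two-tailed critical t for a sample of size n (degrees of freedom = n - 1)."""
    df = n - 1
    lo, hi = 0.0, 200.0  # assumed search bracket, wide enough for common cases
    for _ in range(60):  # bisection on the central area
        mid = (lo + hi) / 2
        if central_area(mid, df) < confidence:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(round(t_critical(0.95, 5), 3))   # 4 degrees of freedom, 95% confidence
print(round(t_critical(0.90, 10), 3))  # 9 degrees of freedom, 90% confidence
```

These two calls correspond to the table entries quoted in the worked examples of this section (2.776 for 4 degrees of freedom in column 2, and 1.833 for 9 degrees of freedom in column 3).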
Slide 10.1- 8
Steps for finding t values in table 10.1 (cont.)
We use the table values for the “area in two tails” because the
margin of error can be either below the mean or above it.
For example, 95% confidence means we are looking for a
total area of 0.05 in the two tails combined: 0.025 to the far left
and 0.025 to the far right of a t distribution like those shown in
Figure 10.1 (slide 4).
Once you find the t value for your data and confidence level, you
can determine the confidence interval just as we did in Section
8.2, except using the new formula for the margin of error, E.
Slide 10.1- 9
Confidence Interval for a Population Mean (μ)
with the t Distribution
If conditions require use of the t distribution (σ not known,
and either n > 30 or the population normally distributed), the
confidence interval for the true value of the population
mean (μ) extends from the sample mean minus the
margin of error (x̄ – E) to the sample mean plus the
margin of error (x̄ + E). That is, the confidence interval for
the population mean is
$$\bar{x} - E < \mu < \bar{x} + E \quad (\text{or, equivalently, } \bar{x} \pm E)$$
where the margin of error is
$$E = t \cdot \frac{s}{\sqrt{n}}$$
and we find t from Table 10.1.
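As a rough sketch, the interval can be computed directly from the summary statistics once t has been looked up. The function name and the numbers in the usage lines are hypothetical values chosen for illustration (a sample of 16 from a population assumed to be normal, with the standard 95% table value t = 2.131 for 15 degrees of freedom); they do not come from this section.

```python
import math


def t_confidence_interval(xbar, s, n, t):
    """Return (lower, upper) = (xbar - E, xbar + E) with E = t * s / sqrt(n)."""
    margin = t * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical summary statistics, for illustration only.
lower, upper = t_confidence_interval(xbar=50.0, s=8.0, n=16, t=2.131)
print(f"{lower:.3f} < mu < {upper:.3f}")
```

The t value is passed in as an argument rather than computed, mirroring the table lookup step described above.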
Slide 10.1- 10
EXAMPLE 1 Confidence Interval for Diastolic
Blood Pressure
Here are five measures of diastolic blood pressure from randomly
selected adult men: 78, 54, 81, 68, 66. These five values result in
these sample statistics: n = 5, x̄ = 69.4, s = 10.7. Using this
sample, construct the 95% confidence interval estimate of the
mean diastolic blood pressure level for the population of all adult
men.
Solution: Because the population standard deviation is not
known and because it is reasonable to assume that blood pressure
levels of adult men are normally distributed, we use the t
distribution instead of the normal distribution.
With a sample of size n = 5, the number of degrees of freedom is
degrees of freedom for t distribution = n – 1 = 5 – 1 = 4
Slide 10.1- 11
EXAMPLE 1 Confidence Interval for Diastolic
Blood Pressure
Solution: (cont.)
For 95% confidence, we use column 2 in Table 10.1 to find that
t = 2.776.
We now use this value along with the given sample size (n = 5)
and sample standard deviation (s = 10.7) to calculate the margin
of error, E:
$$E = t \cdot \frac{s}{\sqrt{n}} = 2.776 \cdot \frac{10.7}{\sqrt{5}} = 13.3$$
Slide 10.1- 12
EXAMPLE 1 Confidence Interval for Diastolic
Blood Pressure
Solution: (cont.)
Finally, we use the margin of error and the sample mean to find
the 95% confidence interval:
x̄ – E < μ < x̄ + E
69.4 – 13.3 < μ < 69.4 + 13.3
56.1 < μ < 82.7
Based on the five sample measurements, we have 95%
confidence that the limits of 56.1 and 82.7 contain the mean
diastolic blood pressure level for the population of all adult
men.
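Example 1 can be reproduced from the raw data as a check. The sketch below computes the sample statistics with Python's statistics module and takes t = 2.776 from the example rather than computing it. Because it carries the unrounded standard deviation, intermediate values differ slightly from the hand calculation but round to the same results.

```python
import math
import statistics

# Diastolic blood pressure readings from Example 1.
pressures = [78, 54, 81, 68, 66]

n = len(pressures)
xbar = statistics.mean(pressures)
s = statistics.stdev(pressures)  # sample standard deviation (divides by n - 1)
t = 2.776                        # table value quoted in the example (4 df, 95%)

E = t * s / math.sqrt(n)         # margin of error
print(f"n = {n}, mean = {xbar:.1f}, s = {s:.1f}")
print(f"E = {E:.1f}")
print(f"{xbar - E:.1f} < mu < {xbar + E:.1f}")
```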
Slide 10.1- 13
Hypothesis Tests Using the t Distribution
When the t distribution is used for a hypothesis test of a claim
about a population mean (H0: μ = claimed value), the t value
plays the role that the standard score z played when we studied
these hypothesis tests in Section 9.2.
With the t distribution, instead of calculating the standard score
z, we use the following formula to calculate t:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
where n is the sample size, x̄ is the sample mean, s is the
sample standard deviation, and μ is the population mean
claimed by the null hypothesis.
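The test statistic is a one-line computation. The sketch below is a direct transcription of the formula; the function name and the numbers in the usage line are hypothetical and serve only to show the arithmetic.

```python
import math


def t_statistic(xbar, mu0, s, n):
    """t = (xbar - mu0) / (s / sqrt(n)), where mu0 is the mean claimed by H0."""
    return (xbar - mu0) / (s / math.sqrt(n))


# Hypothetical values: sample mean 52, claimed mean 50, s = 4, n = 16.
print(t_statistic(xbar=52, mu0=50, s=4, n=16))
```

A sample mean below the claimed mean produces a negative t, which is what the left-tailed test looks for.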
Slide 10.1- 14
Once we have calculated t, we decide whether to reject or not
reject the null hypothesis by comparing our value of t to the
critical values of t found in Table 10.1. The critical values
depend on the type of test as follows.
Right-tailed test: Reject the null hypothesis if the computed
test statistic t is greater than or equal to the value of t found in
the column of Table 10.1 labeled “Area in one tail.”
Notice that for the one-tailed test, column 2 gives critical
values for significance at the 0.025 level and column 3 gives
critical values for significance at the 0.05 level.
Left-tailed test: Reject the null hypothesis if the computed
test statistic t is less than or equal to the negative of the value
of t found in the column of Table 10.1 labeled “Area in one
tail.”
Slide 10.1- 15
Left-tailed test: (cont)
Again, because this is a one-tailed test, column 2 gives
critical values for significance at the 0.025 level and column 3
gives critical values for significance at the 0.05 level.
Two-tailed test: Reject the null hypothesis if the absolute
value of the computed test statistic t is greater than or equal to
the value of t found in the column of Table 10.1 labeled “Area
in two tails.”
For this case, column 2 gives critical values for significance
at the 0.05 level and column 3 gives critical values for
significance at the 0.10 level.
The computed test statistic t can also be used to find a P-value;
however, that is usually done with the aid of statistical
software rather than with tables.
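The three rejection rules can be summarized in a short function. This is a sketch of the rules as stated on these slides; the function name and the "right", "left", and "two" labels are assumptions. The critical value passed in is the positive table entry from the column that matches the type of test and the significance level.

```python
def reject_null(t_stat, t_crit, tail):
    """Apply the rejection rule for a 'right', 'left', or 'two' tailed test.

    t_crit is the positive table value taken from the appropriate column.
    """
    if tail == "right":
        return t_stat >= t_crit
    if tail == "left":
        return t_stat <= -t_crit
    if tail == "two":
        return abs(t_stat) >= t_crit
    raise ValueError("tail must be 'right', 'left', or 'two'")


# Hypothetical test statistics with 9 degrees of freedom at the 0.05 level:
# the one-tail critical value is 1.833 and the two-tail critical value is 2.262.
print(reject_null(2.0, 1.833, "right"))
print(reject_null(-2.0, 1.833, "left"))
print(reject_null(2.0, 2.262, "two"))
```

The same statistic of 2.0 is significant in a one-tailed test but not in a two-tailed test at the same level, because the two-tailed critical value is larger.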
Slide 10.1- 16
EXAMPLE 2 Right-Tailed Hypothesis Test for
a Mean
Listed below are ten randomly selected IQ scores of statistics
students:
111 115 118 100 106 108 110 105 113 109
Using methods from Chapter 4, you can confirm that these data
have the following sample statistics: n = 10, x̄ = 109.5, s = 5.2.
Using a 0.05 significance level, test the claim that statistics
students have a mean IQ score greater than 100, which is the
mean IQ score of the general population.
Solution: Based on the claim that the mean IQ of statistics
students is greater than 100, we use the null hypothesis H0: μ =
100 and the alternative hypothesis Ha: μ > 100.
Slide 10.1- 17
EXAMPLE 2 Right-Tailed Hypothesis Test for
a Mean
Solution: (cont.)
Because the standard deviation of all IQ scores for the population
of all statistics students is not known and because it is reasonable
to assume that IQ scores of statistics students are normally
distributed, we use the t distribution instead of the normal
distribution.
The value of the t test statistic is computed as follows:
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \frac{109.5 - 100}{5.2 / \sqrt{10}} = 5.777$$
Slide 10.1- 18
EXAMPLE 2 Right-Tailed Hypothesis Test for
a Mean
Solution: (cont.)
We now need to compare this value to the appropriate critical
value from Table 10.1:
• We find the correct row by recognizing that this data set has
n – 1 = 10 – 1 = 9 degrees of freedom.
• Because it is a one-tailed test and we are asked to test for
significance at the 0.05 level, we use the values from column 3.
• Looking in the row for 9 degrees of freedom and column 3, we
find that the critical value for significance at the 0.05 level is t =
1.833.
Slide 10.1- 19
EXAMPLE 2 Right-Tailed Hypothesis Test for
a Mean
Solution: (cont.)
Because the sample test statistic t = 5.777 is greater than the
critical value t = 1.833, we reject the null hypothesis.
We conclude that there is sufficient evidence to support the claim
that the mean IQ score is greater than 100.
We can be more precise by using software to compute the P-value
for this hypothesis test, which turns out to be 0.000135.
Notice that this P-value is much less than 0.05, so we can be quite
confident in the decision to reject the null hypothesis and support
the claim that the mean IQ score is greater than 100.
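Example 2 can also be reproduced from the raw scores, including an approximate P-value. The sketch below uses only the Python standard library and obtains the right-tail area by integrating the t density with Simpson's rule. The helper names and the step count are assumptions made for this illustration. Because the code carries the unrounded sample standard deviation, its t value comes out slightly larger than the 5.777 obtained by hand with s = 5.2, and the P-value differs slightly as well.

```python
import math
import statistics

# IQ scores from Example 2 and the mean claimed by the null hypothesis.
scores = [111, 115, 118, 100, 106, 108, 110, 105, 113, 109]
mu0 = 100

n = len(scores)
xbar = statistics.mean(scores)
s = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)
t_stat = (xbar - mu0) / (s / math.sqrt(n))


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def right_tail_area(t, df, steps=20000):
    """P(T >= t) for t >= 0, using Simpson's rule on [0, t] (even steps)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3  # half the area lies to the right of zero


t_crit = 1.833  # table value quoted in the example (9 df, one tail, 0.05)
p_value = right_tail_area(t_stat, n - 1)

print(f"mean = {xbar:.1f}, s = {s:.1f}, t = {t_stat:.3f}")
print(f"reject H0: {t_stat >= t_crit}")
print(f"approximate P-value = {p_value:.6f}")
```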
Slide 10.1- 20