Chapter 17
Asking and Answering Questions about More Than Two Means

Preview
Chapter Learning Objectives
17.1 The Analysis of Variance—Single-Factor ANOVA and the F Test
17.2 Multiple Comparisons
Appendix: ANOVA Computations
Are You Ready to Move On? Chapter Review Exercises
Technology Notes
Appendix Tables
Table 7: Values That Capture Specified Upper-Tail F Curve Areas
Table 8: Critical Values of q for the Studentized Range Distribution
Answers to Selected Exercises
James Woodson/Digital Vision/Getty Images
© 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Preview
In Chapters 13 and 14, you learned methods for testing $H_0: \mu_1 - \mu_2 = 0$
(or equivalently, $\mu_1 = \mu_2$), where $\mu_1$ and $\mu_2$ are the means of two different
populations or the mean responses when two different treatments are applied.
However, many investigations involve comparing more than two population or
treatment means, as illustrated in the following example.
Chapter Learning Objectives
Conceptual Understanding
After completing this chapter, you should be able to
C1 Understand how a research question about differences between three or more population or treatment
means is translated into hypotheses.
Mastering the Mechanics
After completing this chapter, you should be able to
M1 Translate a research question or claim about differences between three or more population or
treatment means into null and alternative hypotheses.
M2 Know the conditions for appropriate use of the ANOVA F test.
M3 Carry out an ANOVA F test.
M4 Use a multiple comparison procedure to identify differences in population or treatment means.
Putting It into Practice
After completing this chapter, you should be able to
P1 Recognize when a situation calls for testing hypotheses about differences between three or more
population or treatment means.
P2 Carry out an ANOVA F test and interpret the conclusion in context.
Preview Example
Risky Soccer
In a study to see if the high incidence of head injuries among soccer players might
be related to memory recall, researchers collected data from three samples of college
students (“No Evidence of Impaired Neurocognitive Performance in Collegiate Soccer
Players,” The American Journal of Sports Medicine [2002]: 157–162). One sample consisted
of soccer athletes, one sample consisted of athletes whose sport was not soccer, and one
sample was a comparison group consisting of students who did not participate in sports.
The following information on scores from the Hopkins Verbal Learning Test (which
measures memory recall) was given in the paper.
Group                 Sample Size   Sample Mean Score   Sample Standard Deviation
Soccer Athletes            86             29.90                   3.73
Nonsoccer Athletes         95             30.94                   5.14
Comparison Group           53             29.32                   3.78
Notice that the three sample means are different. But even when the population
means are equal, you would not expect the three sample means to be exactly equal. Are
the differences in sample means consistent with what is expected simply due to chance
differences from one sample to another when the population means are equal, or are the
differences large enough that you should conclude that the three population means are
not all equal? This is the type of problem considered in this chapter.
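The point that sample means differ even when the population means are equal can be illustrated with a small simulation. The sketch below is not part of the study: it assumes three normal populations that share a hypothetical mean of 30 and a hypothetical standard deviation of 4 (illustrative values only), and it reuses the three sample sizes from the table above. The function name `simulate_sample_means` is likewise just an illustrative choice.

```python
import random
import statistics

# Hypothetical common population: every group shares the same mean and
# standard deviation (illustrative values, not taken from the study).
COMMON_MEAN = 30.0
COMMON_SD = 4.0
# Sample sizes from the Preview Example table.
SAMPLE_SIZES = {"Soccer": 86, "Nonsoccer": 95, "Comparison": 53}


def simulate_sample_means(seed):
    """Draw one sample per group from the same normal population and
    return each group's sample mean."""
    rng = random.Random(seed)
    means = {}
    for group, n in SAMPLE_SIZES.items():
        sample = [rng.gauss(COMMON_MEAN, COMMON_SD) for _ in range(n)]
        means[group] = statistics.mean(sample)
    return means


if __name__ == "__main__":
    # Each repetition gives three different sample means, even though
    # the three population means are identical by construction.
    for seed in range(3):
        means = simulate_sample_means(seed)
        print(seed, {group: round(m, 2) for group, m in means.items()})
```

Because the three simulated populations are identical, any spread among the printed sample means is chance variation alone, which is the baseline against which the observed differences have to be judged.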
section 17.1
The Analysis of Variance—Single-Factor ANOVA
and the F Test
When more than two populations or treatments are being compared, the characteristic that
distinguishes the populations or treatments from one another is called the factor under
investigation. For example, an experiment might be carried out to compare three different methods for teaching reading (three different treatments), in which case the factor of
interest would be teaching method, a qualitative factor. If the growth of the fish raised in
waters having different salinity levels—0%, 10%, 20%, and 30%—is of interest, the factor
salinity level is quantitative.
A single-factor analysis of variance (ANOVA) problem involves a comparison of $k$
population or treatment means $\mu_1, \mu_2, \ldots, \mu_k$. The objective is to test
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$
against
$H_a$: At least two of the $\mu$'s are different
When comparing populations, the analysis is based on independently selected random
samples, one from each population. When comparing treatment means, the data are from
an experiment, and the analysis assumes random assignment of the experimental units
(subjects or objects) to treatments. If, in addition, the experimental units are chosen at
random from a population of interest, it is also possible to generalize the results of the
analysis to this population.
Whether the null hypothesis in a single-factor ANOVA should be rejected depends
on how much the samples from the different populations or treatments differ from one
another. Figure 17.1 displays observations that might result when random samples are
selected from each of three populations. Each dotplot displays five observations from the
first population, four observations from the second population, and six observations from
the third population. For both displays, the three sample means are located by arrows.
The means of the two samples from Population 1 are equal, as are the means for the two
samples from Population 2 and for the two samples from Population 3.
Figure 17.1 Two possible ANOVA data sets when three populations are compared, shown as dotplots in panels (a) and (b), with arrows marking the means of Sample 1, Sample 2, and Sample 3: green circle = observation from Population 1; orange circle = observation from Population 2; blue circle = observation from Population 3
After looking at the data in Figure 17.1(a), you would probably think that the claim
$\mu_1 = \mu_2 = \mu_3$ appears to be false. Not only are the three sample means different, but also
the three samples are clearly separated. In other words, differences between the three
sample means are quite large relative to the variability within each sample.
The situation pictured in Figure 17.1(b) is much less clear-cut. The sample means
are as different as they were in the first data set, but now there is considerable overlap among the three samples. The separation between sample means might be due to
the substantial variability in the populations (and therefore the samples) rather than
to differences between $\mu_1$, $\mu_2$, and $\mu_3$. The phrase analysis of variance comes from
the idea of analyzing variability in the data to see how much can be attributed to
differences in the $\mu$'s and how much is due to variability in the individual populations.
In Figure 17.1(a), the within-sample variability is small relative to the between-sample variability, whereas in Figure 17.1(b), a great deal more of the total variability
is due to variation within each sample. If differences between the sample means can
be explained entirely by within-sample variability, there is no compelling reason to
reject $H_0: \mu_1 = \mu_2 = \mu_3$.
Notations and Assumptions
Notation in single-factor ANOVA is a natural extension of the notation used in earlier
chapters for comparing two population or treatment means.
ANOVA Notation
$k$ = number of populations or treatments being compared

Population or treatment             1              2              ...   $k$
Population or treatment mean        $\mu_1$        $\mu_2$        ...   $\mu_k$
Population or treatment variance    $\sigma_1^2$   $\sigma_2^2$   ...   $\sigma_k^2$
Sample size                         $n_1$          $n_2$          ...   $n_k$
Sample mean                         $\bar{x}_1$    $\bar{x}_2$    ...   $\bar{x}_k$
Sample variance                     $s_1^2$        $s_2^2$        ...   $s_k^2$

$N = n_1 + n_2 + \cdots + n_k$ (the total number of observations in the data set)
$T$ = grand total = sum of all $N$ observations in the data set $= n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_k\bar{x}_k$
$\bar{\bar{x}}$ = grand mean $= \dfrac{T}{N}$
A decision between
$$H_0: \mu_1 = \mu_2 = \cdots = \mu_k$$
and
$H_a$: At least two of the $\mu$'s are different
is based on examining the $\bar{x}$ values to see whether observed differences are small enough
to be explained by sampling variability alone or whether an alternative explanation for the
differences is more plausible.
Example 17.1 An Indicator of Heart Attack Risk
The article “Could Mean Platelet Volume Be a Predictive Marker for Acute Myocardial
Infarction?” (Medical Science Monitor [2005]: 387–392) described a study in which four
groups of patients seeking treatment for chest pain were compared with respect to the
mean platelet volume (MPV, measured in fL). The four groups considered were based
on the clinical diagnosis: (1) noncardiac chest pain, (2) stable angina, (3) unstable
angina, and (4) heart attack. The purpose of the study was to determine if the mean
MPV differed for the four groups, and in particular if the mean MPV was different
for the heart attack group, because then MPV could be used as an indicator of heart
attack risk.
To carry out this study, patients seen for chest pain were divided into groups according to diagnosis. The researchers then selected a random sample of 35 from each of the
resulting $k = 4$ groups. The researchers believed that this sampling process would result
in samples that were representative of the four populations of interest and that could be
regarded as if they were random samples from these four populations. Table 17.1 presents
summary values given in the paper.
Table 17.1 Summary Values for MPV Data of Example 17.1
Group Number   Group Description        Sample Size   Sample Mean   Sample Standard Deviation
1              Noncardiac chest pain        35           10.89              0.69
2              Stable angina                35           11.25              0.74
3              Unstable angina              35           11.37              0.91
4              Heart attack                 35           11.75              1.07
With $\mu_i$ denoting the true mean MPV for group $i$ ($i = 1, 2, 3, 4$), consider the
null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$. Figure 17.2 shows a comparative boxplot
of the four samples (based on data consistent with summary values given in the paper).
The mean MPV for the heart attack sample is larger than for the other three samples, and
the boxplot for the heart attack sample appears to be shifted a bit higher than the boxplots
for the other three samples. However, because the four boxplots show substantial overlap,
it is not obvious whether H0 is plausible or should be rejected. In situations like this, a
formal test procedure is helpful.
Figure 17.2 Boxplots for Example 17.1 (MPV axis from 9 to 13; one boxplot each for the noncardiac, stable angina, unstable angina, and heart attack samples)
As with the inferential methods of previous chapters, the validity of the ANOVA test
for $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$ requires that some conditions be met.
Conditions for ANOVA
1. Each of the k population or treatment response distributions is normal.
2. $\sigma_1 = \sigma_2 = \cdots = \sigma_k$ (The k normal distributions have equal standard
deviations.)
3. The observations in the sample from any particular one of the k populations or
treatments are independent of one another.
4. When comparing population means, the k random samples are selected independently of one another. When comparing treatment means, experimental
units are assigned at random to treatments.
In practice, the test based on these assumptions works well as long as the conditions
are not too badly violated. If the sample sizes are reasonably large, normal probability
plots or boxplots of the data in each sample are helpful in checking the condition of normality. Often, however, sample sizes are so small that a separate normal probability plot
or boxplot for each sample is of little value in checking normality. In this case, a single
combined plot can be constructed by first subtracting $\bar{x}_1$ from each observation in the first
sample, $\bar{x}_2$ from each value in the second sample, and so on, and then constructing a normal
probability plot or boxplot of all $N$ deviations from their respective means. Figure 17.3 shows
such a normal probability plot for the data of Example 17.1.
Figure 17.3 A normal probability plot using the combined data of Example 17.1 (axes: Deviation versus Normal score)
There is a formal procedure for testing the equality of population standard deviations.
Unfortunately, it is quite sensitive to even a small violation of the normality condition.
However, the equal population or treatment standard deviation condition can be considered
reasonably met if the largest of the sample standard deviations is at most twice the smallest
one. For example, the largest standard deviation in Example 17.1 is $s_4 = 1.07$, which is
only about 1.5 times the smallest standard deviation ($s_1 = 0.69$).
The analysis of variance test procedure is based on the following measures of variation in the data.
Definition
A measure of differences among the sample means is the treatment sum of
squares, denoted by SSTr and given by
$$\text{SSTr} = n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2$$
A measure of variation within the k samples, called error sum of squares and
denoted by SSE, is
$$\text{SSE} = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2$$
Each sum of squares has an associated df:
treatment df $= k - 1$
error df $= N - k$
A mean square is a sum of squares divided by its df. In particular,
mean square for treatments $= \text{MSTr} = \dfrac{\text{SSTr}}{k - 1}$
mean square for error $= \text{MSE} = \dfrac{\text{SSE}}{N - k}$
The number of error degrees of freedom comes from adding the number of degrees of
freedom associated with each of the sample variances:
$$(n_1 - 1) + (n_2 - 1) + \cdots + (n_k - 1) = n_1 + n_2 + \cdots + n_k - 1 - 1 - \cdots - 1 = N - k$$
Example 17.2 Heart Attack Calculations
Let’s return to the mean platelet volume (MPV) data of Example 17.1. The grand mean $\bar{\bar{x}}$
was calculated to be 11.315. Notice that because the sample sizes are all equal, the grand
mean is just the average of the four sample means (this will not usually be the case when
the sample sizes are unequal). With $\bar{x}_1 = 10.89$, $\bar{x}_2 = 11.25$, $\bar{x}_3 = 11.37$, $\bar{x}_4 = 11.75$, and
$n_1 = n_2 = n_3 = n_4 = 35$,
$$\begin{aligned}
\text{SSTr} &= n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2 \\
&= 35(10.89 - 11.315)^2 + 35(11.25 - 11.315)^2 + 35(11.37 - 11.315)^2 + 35(11.75 - 11.315)^2 \\
&= 6.322 + 0.148 + 0.106 + 6.623 \\
&= 13.199
\end{aligned}$$
Because $s_1 = 0.69$, $s_2 = 0.74$, $s_3 = 0.91$, $s_4 = 1.07$,
$$\begin{aligned}
\text{SSE} &= (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2 \\
&= (35 - 1)(0.69)^2 + (35 - 1)(0.74)^2 + (35 - 1)(0.91)^2 + (35 - 1)(1.07)^2 \\
&= 101.888
\end{aligned}$$
The numbers of degrees of freedom are
treatment df $= k - 1 = 3$
error df $= N - k = 35 + 35 + 35 + 35 - 4 = 136$
from which
$$\text{MSTr} = \frac{\text{SSTr}}{k - 1} = \frac{13.199}{3} = 4.400$$
$$\text{MSE} = \frac{\text{SSE}}{N - k} = \frac{101.888}{136} = 0.749$$
Both MSTr and MSE are quantities whose values can be calculated once sample data
are available (they are statistics). Each of these statistics varies in value from data set to
data set. Both statistics MSTr and MSE have sampling distributions, and these sampling
distributions have mean values. The following box describes the relationship between the
mean values of MSTr and MSE.
When $H_0$ is true ($\mu_1 = \mu_2 = \cdots = \mu_k$),
$$\mu_{\text{MSTr}} = \mu_{\text{MSE}}$$
However, when $H_0$ is false,
$$\mu_{\text{MSTr}} > \mu_{\text{MSE}}$$
and the greater the differences among the $\mu$'s, the larger $\mu_{\text{MSTr}}$ will be relative to $\mu_{\text{MSE}}$.
According to this result, when $H_0$ is true, you would expect the values of the two
mean squares to be close. However, you would expect MSTr to be substantially greater
than MSE when some $\mu$'s differ greatly from others. Thus, a calculated MSTr that is much
larger than MSE is inconsistent with the null hypothesis. In Example 17.2, MSTr = 4.400
and MSE = 0.749, so MSTr is about six times as large as MSE. Can this be attributed
solely to sampling variability, or is the ratio MSTr/MSE large enough to suggest that the
null hypothesis is false? Before a formal test procedure can be described, you have to learn
about a new family of probability distributions called F distributions.
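The mean squares of Example 17.2 and their ratio can be checked numerically. The following is a minimal sketch that takes only the summary values of Table 17.1 as input; the function name `mean_squares` and the dictionary keys are illustrative choices, not notation from the text. Small differences from the hand calculation in the last decimal place are expected because the hand calculation rounds intermediate values.

```python
# Summary values from Table 17.1: (sample size, sample mean, sample standard deviation).
mpv_summary = [
    (35, 10.89, 0.69),  # noncardiac chest pain
    (35, 11.25, 0.74),  # stable angina
    (35, 11.37, 0.91),  # unstable angina
    (35, 11.75, 1.07),  # heart attack
]


def mean_squares(summary):
    """Return the grand mean, SSTr, SSE, MSTr, and MSE from (n, mean, sd) triples."""
    k = len(summary)
    total_n = sum(n for n, _, _ in summary)
    # Grand mean = grand total / N, with grand total = sum of n_i * xbar_i.
    grand_mean = sum(n * mean for n, mean, _ in summary) / total_n
    sstr = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in summary)
    sse = sum((n - 1) * sd ** 2 for n, _, sd in summary)
    return {
        "grand_mean": grand_mean,
        "SSTr": sstr,
        "SSE": sse,
        "MSTr": sstr / (k - 1),        # treatment df = k - 1
        "MSE": sse / (total_n - k),    # error df = N - k
    }


results = mean_squares(mpv_summary)
for name, value in results.items():
    print(f"{name}: {value:.3f}")
print(f"MSTr / MSE: {results['MSTr'] / results['MSE']:.2f}")
```

The printed ratio is the quantity described above as "about six times"; whether a ratio of that size is unusual when the null hypothesis is true is the question the F distribution answers.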
An F distribution always arises in connection with a ratio. A particular F distribution is obtained by specifying both numerator degrees of freedom ($\text{df}_1$) and denominator
degrees of freedom ($\text{df}_2$). Figure 17.4 shows an F curve for a particular choice of $\text{df}_1$ and
$\text{df}_2$. The ANOVA test of this section is an upper-tailed test, so a P-value is the area under
an appropriate F curve to the right of the calculated value of the test statistic.
Figure 17.4 An F curve and P-value for an upper-tailed test (the shaded area under the F curve to the right of the calculated F is the P-value)
Constructing tables of these upper-tail areas is cumbersome, because there are two
degrees of freedom rather than just one (as in the case of t distributions). For selected
($\text{df}_1$, $\text{df}_2$) pairs, the F table (Appendix Table 7) gives only the four numbers that capture
tail areas 0.10, 0.05, 0.01, and 0.001, respectively. Here are the four numbers for $\text{df}_1 = 4$,
$\text{df}_2 = 10$ along with the statements that can be made about the P-value:

Tail area   0.10   0.05   0.01   0.001
Value       2.61   3.48   5.99   11.28

a. $F < 2.61$ → tail area = P-value > 0.10
b. $2.61 < F < 3.48$ → 0.05 < P-value < 0.10
c. $3.48 < F < 5.99$ → 0.01 < P-value < 0.05
d. $5.99 < F < 11.28$ → 0.001 < P-value < 0.01
e. $F > 11.28$ → P-value < 0.001

For example, if $F = 7.12$, then 0.001 < P-value < 0.01. If a test with $\alpha = 0.05$ is used,
$H_0$ should be rejected, because P-value < $\alpha$. The most frequently used statistical computer
packages can provide exact P-values for F tests.
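To make the idea of an upper-tail area concrete, the sketch below approximates it with the Python standard library only. It relies on the standard fact that the upper-tail area of an F curve equals a regularized incomplete beta function, which is integrated here with Simpson's rule. The helper names (`simpson`, `f_upper_tail_area`, `f_critical_value`) are hypothetical, and the code assumes both degrees of freedom are at least 2; in practice a library routine such as `scipy.stats.f.sf` would normally be used instead.

```python
import math


def simpson(func, lower, upper, intervals=2000):
    """Composite Simpson's rule with an even number of intervals."""
    h = (upper - lower) / intervals
    total = func(lower) + func(upper)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(lower + i * h)
    return total * h / 3


def f_upper_tail_area(f_value, df1, df2):
    """Approximate area to the right of f_value under the F(df1, df2) curve.

    Uses P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 * f),
    where I_x is the regularized incomplete beta function.
    """
    if df1 < 2 or df2 < 2:
        raise ValueError("this sketch assumes df1 >= 2 and df2 >= 2")
    if f_value <= 0:
        return 1.0
    a, b = df2 / 2, df1 / 2
    x = df2 / (df2 + df1 * f_value)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

    def integrand(t):
        if t <= 0.0:
            return 0.0 if a > 1 else math.exp(-log_beta)
        return math.exp((a - 1) * math.log(t) + (b - 1) * math.log1p(-t) - log_beta)

    return simpson(integrand, 0.0, x)


def f_critical_value(tail_area, df1, df2):
    """Bisection search for the value that captures the given upper-tail area."""
    lo, hi = 0.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f_upper_tail_area(mid, df1, df2) > tail_area:
            lo = mid  # tail still too large, so move right
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    for area in (0.10, 0.05, 0.01, 0.001):
        print(area, round(f_critical_value(area, 4, 10), 2))
    print("P-value for F = 7.12:", round(f_upper_tail_area(7.12, 4, 10), 4))
```

The four printed values can be compared with the row of table values for $\text{df}_1 = 4$, $\text{df}_2 = 10$ above, and the last line gives a single approximate P-value where the table can only bracket it between 0.001 and 0.01.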
Single Factor ANOVA F Test for Equality of Three or More Population Means
Appropriate when the following conditions are met:
1. Each of the k population or treatment response distributions is normal.
2. $\sigma_1 = \sigma_2 = \cdots = \sigma_k$ (The k normal distributions have equal standard deviations.)
3. The observations in the sample from any particular one of the k populations or
treatments are independent of one another.
4. When comparing population means, the k random samples are selected independently of one another. When comparing treatment means, experimental
units are assigned at random to treatments.
When these conditions are met, the following test statistic can be used:
$$F = \frac{\text{MSTr}}{\text{MSE}}$$
When the conditions above are met and the null hypothesis is true, the F statistic
has an approximate F distribution with
$\text{df}_1 = k - 1$ and $\text{df}_2 = N - k$
Form of the null hypothesis: $H_0: \mu_1 = \mu_2 = \cdots = \mu_k$
Form of the alternative hypothesis: $H_a$: At least two of the $\mu$'s are different
The P-value is: Area under the F curve to the right of the calculated value of the
test statistic
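As an illustration of the boxed procedure, the sketch below applies it to the summary values of the Preview Example (the soccer study). It is an illustration under stated assumptions rather than an analysis from the text: the function names are hypothetical, only the equal-standard-deviation rule of thumb (largest sample standard deviation at most twice the smallest) can be checked from summary values, and normality is simply assumed. Because $k = 3$ here, $\text{df}_1 = 2$, and in that special case the upper-tail area has the closed form $(1 + 2F/\text{df}_2)^{-\text{df}_2/2}$; for any other $\text{df}_1$, Appendix Table 7 or statistical software is needed.

```python
# Preview Example summary values: (sample size, sample mean, sample standard deviation).
soccer_summary = [
    (86, 29.90, 3.73),  # soccer athletes
    (95, 30.94, 5.14),  # nonsoccer athletes
    (53, 29.32, 3.78),  # comparison group
]


def anova_from_summary(summary):
    """Compute the F statistic, its degrees of freedom, and the ratio of the
    largest to the smallest sample standard deviation from (n, mean, sd) triples."""
    k = len(summary)
    total_n = sum(n for n, _, _ in summary)
    grand_mean = sum(n * mean for n, mean, _ in summary) / total_n
    mstr = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in summary) / (k - 1)
    mse = sum((n - 1) * sd ** 2 for n, _, sd in summary) / (total_n - k)
    sds = [sd for _, _, sd in summary]
    return {
        "F": mstr / mse,
        "df1": k - 1,
        "df2": total_n - k,
        "sd_ratio": max(sds) / min(sds),
    }


def upper_tail_area_df1_2(f_value, df2):
    """Exact upper-tail area under an F curve, valid ONLY when df1 = 2."""
    return (1 + 2 * f_value / df2) ** (-df2 / 2)


result = anova_from_summary(soccer_summary)
print("largest sd / smallest sd:", round(result["sd_ratio"], 2))
print("F:", round(result["F"], 2), "with df1 =", result["df1"], "and df2 =", result["df2"])
if result["df1"] == 2:
    p_value = upper_tail_area_df1_2(result["F"], result["df2"])
    print("P-value:", round(p_value, 3))
```

The P-value printed here would then be compared with whatever significance level was chosen before looking at the data, exactly as in the worked examples that follow.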
Example 17.3 Heart Attacks Revisited
Recall that the two mean squares for the MPV data given in Example 17.1 were calculated
in Example 17.2 to be
MSTr = 4.400   MSE = 0.749
You can now use the five-step process for hypothesis testing problems (HMC3) to test the
hypotheses of interest.
Process Step
H Hypotheses
The question of interest is whether there are differences in mean MPV
for the four different diagnosis groups.
Population characteristics of interest:
$\mu_1$ = mean MPV for the noncardiac chest pain group
$\mu_2$ = mean MPV for the stable angina group
$\mu_3$ = mean MPV for the unstable angina group
$\mu_4$ = mean MPV for the heart attack group
Hypotheses:
Null hypothesis: $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
Alternative hypothesis: $H_a$: At least two of the $\mu$'s are different
M Method
Because the answers to the four key questions are hypothesis testing,
sample data, one numerical variable, and four independently selected
samples, consider an ANOVA F test.
Potential method:
ANOVA F test. The test statistic for this test is
$$F = \frac{\text{MSTr}}{\text{MSE}}$$
When the null hypothesis is true, this statistic has approximately an F
distribution with
$\text{df}_1 = k - 1$ and $\text{df}_2 = N - k$
Once you have decided to proceed with the test, you need to select
a significance level for the test. In this example, you might choose a
value for $\alpha$ of 0.05.
Significance level:
$\alpha = 0.05$
C Check
The samples were independently selected. The largest sample standard deviation (from Table 17.1, $s_4 = 1.07$) is not more than twice
as large as the smallest sample standard deviation ($s_1 = 0.69$), so the
equal population standard deviations condition is reasonably met. A
normal probability plot (see Figure 17.3) indicates that the normality
condition is also reasonably met.
C Calculate
MSTr = 4.400   MSE = 0.749 (from Example 17.2)
Test statistic:
$$F = \frac{\text{MSTr}}{\text{MSE}} = \frac{4.400}{0.749} = 5.87$$
Degrees of freedom:
$\text{df}_1 = k - 1 = 4 - 1 = 3$
$\text{df}_2 = N - k = 140 - 4 = 136$
Associated P-value:
P-value = area under F curve to the right of 5.87
Using $\text{df}_1 = 3$ and $\text{df}_2 = 120$ (the closest value to 136 that appears in
the table), Appendix Table 7 shows that the area to the right of 5.78 is
0.001. Since 5.87 > 5.78, it follows that the P-value is less than 0.001.
C Communicate results
Because the P-value is less than the selected significance level, you reject
the null hypothesis.
Decision: Reject $H_0$.
The final conclusion for the test should be stated in context and
answer the question posed.
Conclusion: You can conclude that the mean MPV is not the same for
all four patient groups.
Techniques for determining which means differ are introduced in Section 17.2.
Example 17.4 Hormones and Body Fat
The article “Growth Hormone and Sex Steroid Administration in Healthy Aged Women and
Men” (Journal of the American Medical Association [2002]: 2282–2292) described an experiment to investigate the effect of four treatments on various body characteristics. In this
double-blind experiment, each of 57 female subjects age 65 or older was assigned at random to one of the following four treatments: (1) placebo “growth hormone” and placebo
“steroid” (denoted by P + P); (2) placebo “growth hormone” and the steroid estradiol
(denoted by P + S); (3) growth hormone and placebo “steroid” (denoted by G + P); and
(4) growth hormone and the steroid estradiol (denoted by G + S).
The following table lists data on change in body fat mass over the 26-week period
following the treatments that are consistent with summary quantities given in the article.

Change in Body Fat Mass (kg)
Treatment      P + P     P + S     G + P     G + S
                0.1      -0.1      -1.6      -3.1
                0.6       0.2      -0.4      -3.2
                2.2       0.0       0.4      -2.0
                0.7      -0.4      -2.0      -2.0
               -2.0      -0.9      -3.4      -3.3
                0.7      -1.1      -2.8      -0.5
                0.0       1.2      -2.2      -4.5
               -2.6       0.1      -1.8      -0.7
               -1.4       0.7      -3.3      -1.8
                1.5      -2.0      -2.1      -2.3
                2.8      -0.9      -3.6      -1.3
                0.3      -3.0      -0.4      -1.0
               -1.0       1.0      -3.1      -5.6
               -1.0       1.2                -2.9
                                             -1.6
                                             -0.2
$n$              14        14        13        16
$\bar{x}$      0.064    -0.286    -2.023    -2.250
$s$            1.545     1.218     1.264     1.468
$s^2$          2.387     1.484     1.598     2.155

For this example, $N = 57$, grand total $= -65.4$, and $\bar{\bar{x}} = \dfrac{-65.4}{57} = -1.15$.
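The summary rows of the data table and the grand mean can be reproduced directly from the listed observations. The sketch below uses Python's standard `statistics` module; the variable and function names are illustrative. The observations are transcribed from the table for Example 17.4, with the twelfth P + S value entered as -3.0, the sign that reproduces the reported sample mean of -0.286.

```python
import math
import statistics

# Change in body fat mass (kg) for each treatment, transcribed from the table
# for Example 17.4. The twelfth P + S value is entered as -3.0 (an assumption
# about its sign, chosen because it reproduces the reported mean of -0.286).
body_fat_change = {
    "P + P": [0.1, 0.6, 2.2, 0.7, -2.0, 0.7, 0.0, -2.6, -1.4, 1.5, 2.8, 0.3, -1.0, -1.0],
    "P + S": [-0.1, 0.2, 0.0, -0.4, -0.9, -1.1, 1.2, 0.1, 0.7, -2.0, -0.9, -3.0, 1.0, 1.2],
    "G + P": [-1.6, -0.4, 0.4, -2.0, -3.4, -2.8, -2.2, -1.8, -3.3, -2.1, -3.6, -0.4, -3.1],
    "G + S": [-3.1, -3.2, -2.0, -2.0, -3.3, -0.5, -4.5, -0.7, -1.8, -2.3, -1.3, -1.0,
              -5.6, -2.9, -1.6, -0.2],
}


def summarize(samples):
    """Return n, sample mean, sample standard deviation, and sample variance per group."""
    rows = {}
    for name, values in samples.items():
        rows[name] = {
            "n": len(values),
            "mean": statistics.mean(values),
            "s": statistics.stdev(values),      # divides by n - 1
            "s2": statistics.variance(values),  # divides by n - 1
        }
    return rows


summary = summarize(body_fat_change)
total_n = sum(row["n"] for row in summary.values())
grand_total = math.fsum(x for values in body_fat_change.values() for x in values)
grand_mean = grand_total / total_n

for name, row in summary.items():
    print(f"{name}: n={row['n']}, mean={row['mean']:.3f}, s={row['s']:.3f}, s2={row['s2']:.3f}")
print(f"N={total_n}, grand total={grand_total:.1f}, grand mean={grand_mean:.2f}")
```

The computed variances can differ from the tabled $s^2$ values in the third decimal place, which is consistent with the tabled values having been obtained by squaring the rounded standard deviations.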
Process Step
H Hypotheses
The question of interest is whether there are differences in mean
change in body fat mass for the four treatments.
Population characteristics of interest:
$\mu_1$ = mean change in body fat mass for the P + P treatment
$\mu_2$ = mean change in body fat mass for the P + S treatment
$\mu_3$ = mean change in body fat mass for the G + P treatment
$\mu_4$ = mean change in body fat mass for the G + S treatment
Hypotheses:
Null hypothesis: $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
Alternative hypothesis: $H_a$: At least two of the $\mu$'s are different
M Method
Because the answers to the four key questions are hypothesis testing,
sample data, one numerical variable, and four independently selected
samples, consider an ANOVA F test.
Potential method:
ANOVA F test. The test statistic for this test is
$$F = \frac{\text{MSTr}}{\text{MSE}}$$
When the null hypothesis is true, this statistic has approximately an F
distribution with
$\text{df}_1 = k - 1$ and $\text{df}_2 = N - k$
Once you have decided to proceed with the test, you need to select a
significance level for the test. For this example, a significance level of
0.01 will be used.
Significance level:
$\alpha = 0.01$
C Check
The subjects in the experiment were randomly assigned to treatments.
The largest sample standard deviation ($s_1 = 1.545$) is not more than
twice as large as the smallest sample standard deviation ($s_2 = 1.218$),
so the equal population standard deviations condition is reasonably
met. Boxplots of the data from each of the four samples are shown
in Figure 17.5. The boxplots are roughly symmetric, and there are no
outliers, so the normality condition is also reasonably met.
C Calculate
$$\begin{aligned}
\text{SSTr} &= n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + \cdots + n_k(\bar{x}_k - \bar{\bar{x}})^2 \\
&= 14(0.064 - (-1.15))^2 + 14(-0.286 - (-1.15))^2 + 13(-2.023 - (-1.15))^2 + 16(-2.250 - (-1.15))^2 \\
&= 60.37
\end{aligned}$$
treatment df $= k - 1 = 3$
$$\begin{aligned}
\text{SSE} &= (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + \cdots + (n_k - 1)s_k^2 \\
&= 13(2.387) + 13(1.484) + 12(1.598) + 15(2.155) \\
&= 101.81
\end{aligned}$$
Test statistic:
$$F = \frac{\text{MSTr}}{\text{MSE}} = \frac{\text{SSTr}/\text{treatment df}}{\text{SSE}/\text{error df}} = \frac{60.37/3}{101.81/53} = \frac{20.12}{1.92} = 10.48$$
Degrees of freedom:
$\text{df}_1 = k - 1 = 4 - 1 = 3$
$\text{df}_2 = N - k = 57 - 4 = 53$
Associated P-value:
P-value = area under F curve to the right of 10.48
Using $\text{df}_1 = 3$ and $\text{df}_2 = 60$ (the closest value to 53 that appears in
the table), Appendix Table 7 shows that the area to the right of 6.17 is
0.001. Since 10.48 > 6.17, it follows that the P-value is less than 0.001.
C Communicate results
Because the P-value is less than the selected significance level, reject
the null hypothesis.
Decision: Reject $H_0$.
Conclusion: You can conclude that the mean change in body fat mass
is not the same for all four treatments.
Figure 17.5 Boxplots for the data of Example 17.4 (change in body fat mass axis from -6 to 3; one boxplot each for the P + P, P + S, G + P, and G + S treatments)
Summarizing an ANOVA
ANOVA calculations are often summarized in a tabular format called an ANOVA table. To
understand such a table, one more sum of squares must be defined.
Total sum of squares, denoted by SSTo, is given by
$$\text{SSTo} = \sum_{\text{all } N \text{ obs.}} (x - \bar{\bar{x}})^2$$
with associated df $= N - 1$.
The relationship between the three sums of squares SSTo, SSTr, and SSE is
$$\text{SSTo} = \text{SSTr} + \text{SSE}$$
which is called the fundamental identity for single-factor ANOVA.
The quantity SSTo, the sum of squared deviations about the grand mean, is a measure
of total variability in the data set consisting of all k samples. The quantity SSE results from
measuring variability separately within each sample and then combining. Such within-sample variability is present regardless of whether or not $H_0$ is true. The magnitude of
SSTr, on the other hand, depends on whether the null hypothesis is true or false. The more
the $\mu$'s differ from one another, the larger SSTr will tend to be. SSTr represents variation
that can (at least to some extent) be explained by any differences between means. An informal paraphrase of the fundamental identity for single-factor ANOVA is
total variation = explained variation + unexplained variation
Once any two of the sums of squares have been calculated, the remaining one is
easily obtained from the fundamental identity. Often SSTo and SSTr are calculated first
(using computational formulas given in the appendix to this chapter), and then SSE
is obtained by subtraction: SSE = SSTo - SSTr. All the degrees of freedom, sums of
squares, and mean squares are entered in an ANOVA table, as displayed in Table 17.2.
The P-value usually appears to the right of F when the analysis is done by a statistical
software package.
Table 17.2 General Format for a Single-Factor ANOVA Table
Source of Variation   df        Sum of Squares   Mean Square                            F
Treatments            $k - 1$   SSTr             $\text{MSTr} = \text{SSTr}/(k - 1)$    $F = \text{MSTr}/\text{MSE}$
Error                 $N - k$   SSE              $\text{MSE} = \text{SSE}/(N - k)$
Total                 $N - 1$   SSTo
An ANOVA table from Minitab for the change in body fat mass data of Example 17.4
is shown in Table 17.3. The reported P-value is 0.000, consistent with the previous conclusion that P-value < 0.001.
Table 17.3 An ANOVA Table from Minitab for the Data of Example 17.4
One-way ANOVA
Source   DF       SS      MS       F       P
Factor    3    60.37   20.12   10.48   0.000
Error    53   101.81    1.92
Total    56   162.18
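The df, SS, MS, and F columns of the Minitab table can be rebuilt from the raw observations, which also gives a direct check of the fundamental identity SSTo = SSTr + SSE. The sketch below is a plain-Python illustration with hypothetical names (`anova_table` and its dictionary keys); it does not reproduce the P column, which requires an F-curve tail area from a table or statistical software. The observations are those of Example 17.4, with the twelfth P + S value entered as -3.0, the sign consistent with the reported sample mean.

```python
# Change in body fat mass (kg) from Example 17.4; the twelfth P + S value is
# entered as -3.0 (an assumption about its sign, consistent with the reported mean).
body_fat_change = {
    "P + P": [0.1, 0.6, 2.2, 0.7, -2.0, 0.7, 0.0, -2.6, -1.4, 1.5, 2.8, 0.3, -1.0, -1.0],
    "P + S": [-0.1, 0.2, 0.0, -0.4, -0.9, -1.1, 1.2, 0.1, 0.7, -2.0, -0.9, -3.0, 1.0, 1.2],
    "G + P": [-1.6, -0.4, 0.4, -2.0, -3.4, -2.8, -2.2, -1.8, -3.3, -2.1, -3.6, -0.4, -3.1],
    "G + S": [-3.1, -3.2, -2.0, -2.0, -3.3, -0.5, -4.5, -0.7, -1.8, -2.3, -1.3, -1.0,
              -5.6, -2.9, -1.6, -0.2],
}


def anova_table(samples):
    """Compute the entries of a single-factor ANOVA table from raw observations."""
    groups = list(samples.values())
    all_values = [x for values in groups for x in values]
    total_n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / total_n
    group_means = [sum(values) / len(values) for values in groups]
    # Between-sample variation: squared deviations of sample means from the grand mean.
    sstr = sum(len(v) * (m - grand_mean) ** 2 for v, m in zip(groups, group_means))
    # Within-sample variation: squared deviations from each sample's own mean.
    sse = sum(sum((x - m) ** 2 for x in v) for v, m in zip(groups, group_means))
    # Total variation: squared deviations of every observation from the grand mean.
    ssto = sum((x - grand_mean) ** 2 for x in all_values)
    mstr, mse = sstr / (k - 1), sse / (total_n - k)
    return {
        "df_treatment": k - 1, "df_error": total_n - k, "df_total": total_n - 1,
        "SSTr": sstr, "SSE": sse, "SSTo": ssto,
        "MSTr": mstr, "MSE": mse, "F": mstr / mse,
    }


table = anova_table(body_fat_change)
print(f"{'Source':<8}{'DF':>4}{'SS':>10}{'MS':>9}{'F':>8}")
print(f"{'Factor':<8}{table['df_treatment']:>4}{table['SSTr']:>10.2f}"
      f"{table['MSTr']:>9.2f}{table['F']:>8.2f}")
print(f"{'Error':<8}{table['df_error']:>4}{table['SSE']:>10.2f}{table['MSE']:>9.2f}")
print(f"{'Total':<8}{table['df_total']:>4}{table['SSTo']:>10.2f}")
```

Computing SSTo separately, rather than by adding SSTr and SSE, is a deliberate choice here: agreement between the two routes is a quick arithmetic check on the whole table.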
17.1 Exercises
Each Exercise Set assesses the following chapter learning objectives: C1, M1, M2, M3, P1, P2
Section 17.1
Exercise Set 1
17.1 Give as much information as you can about the P-value
for an upper-tailed F test in each of the following situations.
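a. $\text{df}_1 = 4$, $\text{df}_2 = 15$, $F = 5.37$
b. $\text{df}_1 = 4$, $\text{df}_2 = 15$, $F = 1.90$
c. $\text{df}_1 = 4$, $\text{df}_2 = 15$, $F = 4.89$
d. $\text{df}_1 = 3$, $\text{df}_2 = 20$, $F = 14.48$
e. $\text{df}_1 = 3$, $\text{df}_2 = 20$, $F = 2.69$
f. $\text{df}_1 = 4$, $\text{df}_2 = 50$, $F = 3.24$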
17.2 Employees of a certain state university system
can choose from among four different health plans. Each
plan differs somewhat from the others in terms of hospitalization coverage. Four samples of recently hospitalized individuals were selected, each sample consisting of
people covered by a different health plan. The length of
the hospital stay (number of days) was determined for each
individual selected.
a. What hypotheses would you test to decide whether the mean
lengths of stay are not the same for all four health plans?
b. If each sample consisted of eight individuals and
the value of the ANOVA F statistic was $F = 4.37$,
what conclusion would be appropriate for a test with
$\alpha = 0.01$?
c. Answer the question posed in Part (b) if the F value given
there resulted from sample sizes $n_1 = 9$, $n_2 = 8$, $n_3 = 7$,
and $n_4 = 8$.
17.3 The authors of the paper “Age and Violent Content
Labels Make Video Games Forbidden Fruits for Youth”
(Pediatrics [2009]: 870–876) carried out an experiment to
determine if restrictive labels on video games actually
increased the attractiveness of the game for young game
players. Participants read a description of a new video
game and were asked how much they wanted to play the
game. The description also included an age rating. Some
participants read the description with an age restrictive label
of 7+, indicating that the game was not appropriate for children under the age of 7. Others read the same description,
but with an age restrictive label of 12+, 16+, or 18+. The
following data for 12- to 13-year-old boys are fictitious but
are consistent with summary statistics given in the paper.
(The sample sizes in the actual experiment were larger.) For
purposes of this exercise, you can assume that the boys were
assigned at random to one of the four age label treatments
(7+, 12+, 16+, and 18+). Data shown are the boys’ ratings of how much they wanted to play the game on a scale
of 1 to 10. Do the data provide convincing evidence that the
mean rating associated with the game description by 12- to
13-year-old boys is not the same for all four restrictive rating
labels? Test the appropriate hypotheses using a significance
level of 0.05.
7+ label   12+ label   16+ label   18+ label
    6          8           7          10
    6          7           9           9
    6          8           8           6
    5          5           6           8
    4          7           7           7
    8          9           4           6
    6          5           8           8
    1          8           9           9
    2          4           6          10
    4          7           7           8
17.4 The accompanying data on calcium content of wheat
are consistent with summary quantities that appeared in
the article “Mineral Contents of Cereal Grains as Affected by
Storage and Insect Infestation” (Journal of Stored Products
Research [1992]: 147–151). Four different storage times were
considered. Partial output from the SAS computer package
is also shown.
Storage Period   Observations
0 months         58.75   57.94   58.91   56.85   55.21   57.30
1 month          58.87   56.43   56.51   57.67   59.75   58.48
2 months         59.13   60.38   58.01   59.95   59.51   60.34
4 months         62.32   58.76   60.03   59.36   59.61   61.95
Dependent Variable: CALCIUM
Source            DF   Sum of Squares   Mean Square   F Value   Pr>F
Model              3      32.13815000   10.71271667      6.51   0.0030
Error             20      32.90103333    1.64505167
Corrected Total   23      65.03918333
R-Square   C.V.       Root MSE   CALCIUM Mean
0.494135   2.180018   1.282596   58.8341667
a. Verify that the sums of squares and df’s are as given in
the ANOVA table.
b. Is there sufficient evidence to conclude that the mean
calcium content is not the same for the four different
storage times? Use the value of F from the ANOVA
table to test the appropriate hypotheses at significance
level 0.05.
Section 17.1
Exercise Set 2
17.5 Give as much information as you can about the
P-value of the single-factor ANOVA F test in each of the
following situations.
a. k = 5, n1 = n2 = n3 = n4 = n5 = 4, F = 5.37
b. k = 5, n1 = n2 = n3 = 5, n4 = n5 = 4, F = 2.83
c. k = 3, n1 = 4, n2 = 5, n3 = 6, F = 5.02
d. k = 3, n1 = n2 = 4, n3 = 6, F = 15.90
e. k = 4, n1 = n2 = 15, n3 = 12, n4 = 10, F = 1.75
17.6 The paper referenced in Exercise 17.3 also gave data
for 12- to 13-year-old girls. Data consistent with summary
values in the paper are shown below. Do the data provide
convincing evidence that the mean rating associated with the
game description for 12- to 13-year-old girls is not the same
for all four age restrictive rating labels? Test the appropriate
hypotheses using α = 0.05.
7+ label   12+ label   16+ label   18+ label
    4          4           6           8
    7          5           4           6
    6          4           8           6
    5          6           6           5
    3          3          10           7
    6          5           8           4
    3          6          10           4
    5          8           6           6
   10          5           8           8
    5          9           5           7
17.7 The experiment described in Example 17.4 also
gave data on change in body fat mass for men (“Growth
Hormone and Sex Steroid Administration in Healthy
Aged Women and Men,” Journal of the American Medical
Association [2002]: 2282–2292). Each of 74 male subjects
who were over age 65 was assigned at random to one
of the following four treatments: (1) placebo “growth
hormone” and placebo “steroid” (denoted by P + P); (2)
placebo “growth hormone” and the steroid testosterone
(denoted by P + S); (3) growth hormone and placebo
“steroid” (denoted by G + P); and (4) growth hormone
and the steroid testosterone (denoted by G + S). The
accompanying table lists data on change in body fat
mass over the 26-week period following the treatment
that are consistent with summary quantities given in the
article.
Change in Body Fat Mass (kg)
Treatment   P + P     P + S     G + P     G + S
             0.3      −3.7      −3.8      −5.0
             0.4      −1.0      −3.2      −5.0
            −1.7       0.2      −4.9      −3.0
            −0.5      −2.3      −5.2      −2.6
            −2.1       1.5      −2.2      −6.2
             1.3      −1.4      −3.5      −7.0
             0.8       1.2      −4.4      −4.5
             1.5      −2.5      −0.8      −4.2
            −1.2      −3.3      −1.8      −5.2
            −0.2       0.2      −4.0      −6.2
             1.7       0.6      −1.9      −4.0
             1.2      −0.7      −3.0      −3.9
             0.6      −0.1      −1.8      −3.3
             0.4      −3.1      −2.9      −5.7
            −1.3       0.3      −2.9      −4.5
            −0.2      −0.5      −2.9      −4.3
             0.7      −0.8      −3.7      −4.0
                      −0.7                −4.2
                      −0.9                −4.7
                      −0.6
                      −2.0
n            17        21        17        19
x̄            0.100    −0.933    −3.112    −4.605
s            1.139     1.443     1.178     1.122
s²           1.297     2.082     1.388     1.259
Also, N = 74, grand total = −158.3, and x̿ = −158.3/74 = −2.139.
Carry out an F test to see whether mean change in
body fat mass differs for the four treatments.
17.8 In an experiment to investigate the performance of
four different brands of spark plugs intended for use on a
125-cc motorcycle, five plugs of each brand were tested, and
the number of miles (at a constant speed) until failure was
observed. A partially completed ANOVA table is given. Fill
in the missing entries, and test the relevant hypotheses using
a 0.05 level of significance.
Source of Variation   df   Sum of Squares   Mean Square   F
Treatments
Error                      235,419.04
Total                      310,500.76
Additional Exercise for Section 17.1
17.9 The article “Compression of Single-Wall Corrugated
Shipping Containers Using Fixed and Floating Test Platens”
(Journal of Testing and Evaluation [1992]: 318–320) described
an experiment in which several different types of boxes were
compared with respect to compression strength (in pounds).
The data in the table for Exercise 17.9 resulted from a single-factor
experiment involving k = 4 types of boxes (the sample
means and standard deviations are in close agreement with
values given in the paper). Do these data provide evidence to
support the claim that the mean compression strength is not
the same for all four box types? Test the relevant hypotheses
using a significance level of 0.01.
17.10 The accompanying summary statistics for a measure
of social marginality for samples of youths, young adults,
adults, and seniors appeared in the paper “Perceived Causes
of Loneliness in Adulthood” (Journal of Social Behavior and
Personality [2000]: 67–84). The social marginality score measured
actual and perceived social rejection, with higher scores
indicating greater social rejection. For purposes of this exercise, assume that it is reasonable to regard the four samples as
representative of the U.S. population in the corresponding age
groups and that the distributions of social marginality scores
for these four groups are approximately normal with the same
standard deviation. Is there evidence that the mean social marginality score is not the same for all four age groups? Test the
relevant hypotheses using α = 0.05.
Age Group     Youths   Young Adults   Adults   Seniors
Sample Size    106        255          314       36
x̄              2.00       3.40         3.07      2.84
s              1.56       1.68         1.66      1.89
Table for Exercise 17.9
Type of Box   Compression Strength (lb)                       Sample Mean   Sample SD
1             655.5   788.3   734.3   721.4   679.1   699.4     713.00        46.55
2             789.2   772.5   786.9   686.1   732.1   774.8     756.93        40.34
3             737.1   639.0   696.3   671.7   717.2   727.1     698.07        37.20
4             535.1   628.7   542.4   559.0   586.9   520.0     562.02        39.87
x̿ = 682.50
17.11 The chapter Preview Example described a study
comparing three groups of college students (soccer athletes,
non–soccer athletes, and a comparison group consisting of
students who did not participate in intercollegiate sports). The
following is information on scores from the Hopkins Verbal
Learning Test (which measures immediate memory recall).
                            Soccer Athletes   Nonsoccer Athletes   Comparison Group
Sample size                       86                 95                  53
Sample mean score               29.90              30.94               29.32
Sample standard deviation        3.73               5.14                3.78
In addition, x̿ = 30.19. Suppose that it is reasonable to regard
these three samples as random samples from the three student populations of interest. Is there sufficient evidence to
conclude that the mean Hopkins score is not the same for the
three student populations? Use α = 0.05.
17.12 Suppose that a random sample of size n = 5 was
selected from the vineyard properties for sale in Sonoma
County, California, in each of 3 years. The following data are
consistent with summary information on price per acre (in dollars, rounded to the nearest thousand) for disease-resistant grape
vineyards in Sonoma County (Wines and Vines, November 1999).
1996   30,000   34,000   36,000   38,000   40,000
1997   30,000   35,000   37,000   38,000   40,000
1998   40,000   41,000   43,000   44,000   50,000
a. Construct boxplots for each of the 3 years on a common
axis, and label each by year. Comment on the similarities
and differences.
b. Carry out an ANOVA to determine whether there is evidence to support the claim that the mean price per acre
for vineyard land in Sonoma County was not the same
for the 3 years considered. Use a significance level of
0.05 for your test.
17.13 Parents are frequently concerned when their child
seems slow to begin walking (although when the child
finally walks, the resulting havoc sometimes has the parents wishing they could turn back the clock!). The article
“Walking in the Newborn” (Science, 176 [1972]: 314–315)
reported on an experiment in which the effects of several
different treatments on the age at which a child first walks
were compared. Children in the first group were given special walking exercises for 12 minutes per day beginning at
age 1 week and lasting 7 weeks. The second group of children received daily exercises but not the walking exercises
administered to the first group. The third and fourth groups
were control groups. They received no special treatment
and differed only in that the third group’s progress was
checked weekly, whereas the fourth group’s progress was
checked just once at the end of the study. Observations on
age (in months) when the children first walked are shown
in the accompanying table. Also given is the ANOVA table,
obtained from the SPSS computer package.
              Age                                              n   Total
Treatment 1    9.00    9.50    9.75   10.00   13.00    9.50    6   60.75
Treatment 2   11.00   10.00   10.00   11.75   10.50   15.00    6   68.25
Treatment 3   11.50   12.00    9.00   11.50   13.25   13.00    6   70.25
Treatment 4   13.25   11.50   12.00   13.50   11.50            5   61.75
Analysis of Variance
Source           df   Sum of sq.   Mean Sq.   F Ratio   F Prob
Between Groups    3     14.779       4.926     2.142     .129
Within Groups    19     43.690       2.299
Total            22     58.467
a. Verify the entries in the ANOVA table.
b. State and test the relevant hypotheses using a significance level of 0.05.
Section 17.2 Multiple Comparisons
When H0: μ1 = μ2 = . . . = μk is rejected by the F test, you believe that there are differences among the k population or treatment means. A natural question to ask at this point is,
which means differ? For example, with k = 4, it might be the case that μ1 = μ2 = μ4, with
μ3 different from the other three means. Another possibility is that μ1 = μ4 and μ2 = μ3.
Still another possibility is that all four means are different from one another. A multiple
comparisons procedure is a method for identifying differences among the μ’s once the
hypothesis that all of the means are equal has been rejected. The Tukey-Kramer (T-K) multiple comparisons procedure is one method that can be used to identify differences.
The T-K procedure is based on computing confidence intervals for the difference between
each possible pair of μ’s. For example, for k = 3, there are three differences to consider:
μ1 − μ2
μ1 − μ3
μ2 − μ3
(The difference μ2 − μ1 is not considered, because the interval for μ1 − μ2 provides the
same information. Similarly, intervals for μ3 − μ1 and μ3 − μ2 are not necessary.) Once
all confidence intervals have been computed, each is examined to determine whether the
interval includes 0. If a particular interval does not include 0, the two means are declared
“significantly different” from one another. If an interval includes 0, there is no evidence of
a significant difference between the means involved.
Suppose, for example, that k = 3 and that the three confidence intervals are
Difference   T-K Confidence Interval
μ1 − μ2      (−0.9, 3.5)
μ1 − μ3      (2.6, 7.0)
μ2 − μ3      (1.2, 5.7)
Because the interval for μ1 − μ2 includes 0, you would say that μ1 and μ2 do not differ significantly. The other two intervals do not include 0, so you would conclude that
μ1 ≠ μ3 and μ2 ≠ μ3.
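The decision rule just described is mechanical, so it can be written as a few lines of code. The Python sketch below is only an illustration (the labels `mu1`, `mu2`, `mu3` and the helper `includes_zero` are names chosen here); it applies the rule to the three intervals in the illustration above.

```python
# The three T-K intervals from the illustration above, keyed by the
# pair of means being compared.
intervals = {
    ("mu1", "mu2"): (-0.9, 3.5),
    ("mu1", "mu3"): (2.6, 7.0),
    ("mu2", "mu3"): (1.2, 5.7),
}


def includes_zero(interval):
    """Return True when 0 lies inside the interval (lower, upper)."""
    lower, upper = interval
    return lower <= 0 <= upper


# A pair is declared significantly different when its interval excludes 0.
significant = [pair for pair, ci in intervals.items() if not includes_zero(ci)]
print(significant)   # [('mu1', 'mu3'), ('mu2', 'mu3')]
```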
The T-K intervals are based on critical values for a probability distribution called the
Studentized range distribution. These critical values appear in Appendix Table 8. To find a
critical value, enter the table at the column corresponding to the number of populations or
treatments being compared, move down to the rows corresponding to the number of error
degrees of freedom (N − k), and select either the value for a 95% confidence level or the
one for a 99% level.
The Tukey–Kramer Multiple Comparison Procedure
When there are k populations or treatments being compared, k(k − 1)/2 confidence
intervals must be computed. Denoting the relevant Studentized range critical value
(from Appendix Table 8) by q, the intervals are as follows:
$$\text{For } \mu_i - \mu_j:\quad (\bar{x}_i - \bar{x}_j) \pm q\sqrt{\frac{MSE}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
Two means are judged to differ significantly if the corresponding interval does not
include zero.
If the sample sizes are all the same, you can use n to denote the common value of
n1, n2, . . ., nk. In this case, the ± term for each interval is the same quantity
$$q\sqrt{\frac{MSE}{n}}$$
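As a rough illustration of the formula in the box, the following Python sketch computes a single T-K interval. The function name `tk_interval` and its argument names are ours. The critical value q is not computed; it is assumed to have been looked up in Appendix Table 8. The numbers used are the summary values that appear in Example 17.5 later in this section.

```python
from math import sqrt


def tk_interval(mean_i, mean_j, n_i, n_j, mse, q):
    """Tukey-Kramer interval for mu_i - mu_j.

    q is the Studentized range critical value, which must be looked up
    (for example in Appendix Table 8); this sketch does not compute it.
    """
    margin = q * sqrt((mse / 2) * (1 / n_i + 1 / n_j))
    diff = mean_i - mean_j
    return diff - margin, diff + margin


# Summary values from Example 17.5: MSE = 1.92 and q = 3.74.
lower, upper = tk_interval(0.064, -0.286, 14, 14, mse=1.92, q=3.74)
print(round(lower, 2), round(upper, 2))   # -1.04 1.74

lower, upper = tk_interval(0.064, -2.023, 14, 13, mse=1.92, q=3.74)
print(round(lower, 2), round(upper, 2))   # 0.68 3.5
```

When all sample sizes equal a common n, the term under the square root reduces to MSE/n, so the same function also covers the equal-sample-size case.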
Example 17.5 Hormones and Body Fat Revisited
Example 17.4 introduced the accompanying data on change in body fat mass resulting
from a double-blind experiment designed to compare the following four treatments:
(1) placebo “growth hormone” and placebo “steroid” (denoted by P + P); (2) placebo
“growth hormone” and the steroid estradiol (denoted by P + S); (3) growth hormone and
placebo “steroid” (denoted by G + P); and (4) growth hormone and the steroid estradiol
(denoted by G + S). From Example 17.4, MSTr = 20.12, MSE = 1.92, and F = 10.48
with an associated P-value < 0.001. It was concluded that the mean change in body fat
mass is not the same for all four treatments.
Change in Body Fat Mass (kg)
Treatment   Observations
P + P       0.1, 0.6, 2.2, 0.7, −2.0, 0.7, 0.0, −2.6, −1.4, 1.5, 2.8, 0.3, −1.0, −1.0
P + S       −0.1, 0.2, 0.0, −0.4, −0.9, −1.1, 1.2, 0.1, 0.7, −2.0, −0.9, −3.0, 1.0, 1.2
G + P       −1.6, −0.4, 0.4, −2.0, −3.4, −2.8, −2.2, −1.8, −3.3, −2.1, −3.6, −0.4, −3.1
G + S       −3.1, −3.2, −2.0, −2.0, −3.3, −0.5, −4.5, −0.7, −1.8, −2.3, −1.3, −1.0, −5.6, −2.9, −1.6, −0.2
Treatment   n     x̄        s       s²
P + P       14    0.064    1.545   2.387
P + S       14   −0.286    1.218   1.484
G + P       13   −2.023    1.264   1.598
G + S       16   −2.250    1.468   2.155
Appendix Table 8 gives the 95% Studentized range critical value q = 3.74 (using
k = 4 and error df = 60, the closest tabled value to df = N − k = 53). The first two T-K
intervals are
$$\mu_1 - \mu_2:\quad (0.064 - (-0.286)) \pm 3.74\sqrt{\frac{1.92}{2}\left(\frac{1}{14} + \frac{1}{14}\right)} = 0.35 \pm 1.39 = (-1.04,\ 1.74) \quad \text{(includes 0)}$$
$$\mu_1 - \mu_3:\quad (0.064 - (-2.023)) \pm 3.74\sqrt{\frac{1.92}{2}\left(\frac{1}{14} + \frac{1}{13}\right)} = 2.09 \pm 1.41 = (0.68,\ 3.50) \quad \text{(does not include 0)}$$
The remaining intervals are
μ1 − μ4   (0.97, 3.66)    Does not include 0
μ2 − μ3   (0.32, 3.15)    Does not include 0
μ2 − μ4   (0.62, 3.31)    Does not include 0
μ3 − μ4   (−1.14, 1.60)   Includes 0
You would conclude that μ1 is not significantly different from μ2 and that μ3 is
not significantly different from μ4. You would also conclude that μ1 and μ2 are significantly different from both μ3 and μ4. Note that Treatments 1 and 2 were treatments that
administered a placebo in place of the growth hormone and Treatments 3 and 4 were treatments that included the growth hormone. This analysis was the basis of the researchers’
conclusion that growth hormone, with or without steroids, decreased body fat mass.
Minitab can be used to construct T-K intervals if raw data are available. Typical output
(based on Example 17.5) is shown in Figure 17.6. From the output, you can see that the
confidence interval for μ1 (P + P) − μ2 (P + S) is (−1.039, 1.739), that for μ2 (P + S) −
μ4 (G + S) is (0.619, 3.309), and so on.
Tukey 95% Simultaneous Confidence Intervals
All Pairwise Comparisons
Individual confidence level = 98.95%
G + S subtracted from:
         Lower   Center    Upper
G + P   -1.145    0.227    1.599
P + S    0.619    1.964    3.309
P + P    0.969    2.314    3.659
G + P subtracted from:
         Lower   Center    Upper
P + S    0.322    1.737    3.153
P + P    0.672    2.087    3.503
P + S subtracted from:
         Lower   Center    Upper
P + P   -1.039    0.350    1.739
Figure 17.6 The T-K intervals for Example 17.5 (from Minitab)
Why calculate the T-K intervals rather than use the t confidence interval for a difference between μ’s from Chapter 13? The answer is that the T-K intervals control
the simultaneous confidence level at approximately 95% (or 99%). That is, if the procedure is used repeatedly on many different data sets, in the long run only about 5%
(or 1%) of the time would at least one of the intervals not include the value of what
the interval is estimating. Consider using separate 95% t intervals, each one having a
5% error rate. In those instances, the chance that at least one interval would make an
incorrect statement about a difference in μ’s increases dramatically with the number of
intervals calculated. The Minitab output in Figure 17.6 shows that to achieve a simultaneous confidence level of about 95% (experimentwise or “family” error rate of 5%)
when k = 4 and error df = 53, the individual interval confidence levels must be 98.95%
(individual error rate 1.05%).
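To get a feel for how quickly the chance of at least one incorrect interval grows, the Python sketch below uses a simplifying assumption that is not literally true for pairwise comparisons: it treats the k(k − 1)/2 separate 95% t intervals as if they were independent. The function name is ours, and the resulting figures should be read as a rough guide rather than exact error rates.

```python
def familywise_error_if_independent(k, individual_error=0.05):
    """Chance that at least one of the k(k-1)/2 pairwise intervals misses,
    under the simplifying (and not literally true) assumption that the
    intervals behave independently."""
    n_intervals = k * (k - 1) // 2
    return n_intervals, 1 - (1 - individual_error) ** n_intervals


for k in (3, 4, 5, 6):
    m, rate = familywise_error_if_independent(k)
    print(k, m, round(rate, 3))
# 3 3 0.143
# 4 6 0.265
# 5 10 0.401
# 6 15 0.537
```

Even with only four treatments, six separate 95% intervals would have roughly a one-in-four chance of containing at least one error under this approximation, which is why the individual confidence levels have to be raised.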
An effective display for summarizing the results of any multiple comparisons procedure
involves listing the x̄’s and underscoring pairs judged to be not significantly different.
The process for constructing such a display is described in the following box.
Unless otherwise noted, all content on this page is © Cengage Learning.
Summarizing the Results of the Tukey–Kramer Procedure
1. List the sample means in increasing order, identifying the corresponding population
or treatment just above the value of each x̄.
2. Use the T-K intervals to determine the group of means that do not differ
significantly from the first in the list. Draw a horizontal line extending from the
smallest mean to the last mean in the group identified. For example, if there are
five means, arranged in order,
Population      3     2     1     4     5
Sample mean     x̄3    x̄2    x̄1    x̄4    x̄5
and μ3 is judged to be not significantly different from μ2 or μ1, but is judged to be
significantly different from μ4 and μ5, draw the following line:
Population      3     2     1     4     5
Sample mean     x̄3    x̄2    x̄1    x̄4    x̄5
                --------------
3. Use the T-K intervals to determine the group of means that are not significantly
different from the second smallest. (You need consider only means that appear
to the right of the mean under consideration.) If there is already a line connecting the second smallest mean with all means in the new group identified, no
new line need be drawn. If this entire group of means is not underscored with
a single line, draw a line extending from the second smallest to the last mean
in the new group. Continuing with our example, if μ2 is not significantly
different from μ1 but is significantly different from μ4 and μ5, no new line
need be drawn. However, if μ2 is not significantly different from either
μ1 or μ4 but is judged to be different from μ5, a second line is drawn as
shown:
Population      3     2     1     4     5
Sample mean     x̄3    x̄2    x̄1    x̄4    x̄5
                --------------
                      --------------
4. Continue considering the means in the order listed, adding new lines as needed.
To illustrate this summary procedure, suppose that four samples with x̄1 = 19, x̄2 = 27,
x̄3 = 24, and x̄4 = 10 are used to test H0: μ1 = μ2 = μ3 = μ4 and that this hypothesis is
rejected. Suppose the T-K confidence intervals indicate that μ2 is significantly different
from both μ1 and μ4, and that there are no other significant differences. The resulting summary display would then be
Population      4     1     3     2
Sample mean     10    19    24    27
                --------------
                            --------
Example 17.6 Sleep Time
A biologist studied the effects of ethanol on sleep time. A sample of 20 rats, matched for
age and other characteristics, was selected, and each rat was given an oral injection having
a particular concentration of ethanol per body weight. The rapid eye movement (REM)
sleep time for each rat was then recorded for a 24-hour period, with the results shown in
the following table:
Treatment        Observations                          x̄
1. 0 (control)   88.6   73.2   91.4   68.0   75.2   79.28
2. 1 g/kg        63.0   53.9   69.2   50.1   71.5   61.54
3. 2 g/kg        44.9   59.5   40.2   56.3   38.7   47.92
4. 4 g/kg        31.0   39.6   45.3   25.2   22.7   32.76
Table 17.4 (an ANOVA table from SAS) leads to the conclusion that actual mean REM
sleep time is not the same for all four treatments (the P-value for the F test is 0.0001).
Table 17.4 SAS ANOVA Table for Example 17.6
Analysis of Variance Procedure
Dependent Variable: TIME
Source   DF   Sum of Squares   Mean Square   F Value   Pr > F
Model     3     5882.35750     1960.78583     21.09    0.0001
Error    16     1487.40000       92.96250
Total    19     7369.75750
The T-K intervals are
Difference   Interval          Includes 0?
μ1 − μ2      17.74 ± 17.446    no
μ1 − μ3      31.36 ± 17.446    no
μ1 − μ4      46.52 ± 17.446    no
μ2 − μ3      13.62 ± 17.446    yes
μ2 − μ4      28.78 ± 17.446    no
μ3 − μ4      15.16 ± 17.446    yes
The only T-K intervals that include zero are those for μ2 − μ3 and μ3 − μ4. The corresponding underscoring pattern is
x̄4       x̄3       x̄2       x̄1
32.76    47.92    61.54    79.28
--------------
         --------------
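The steps in the summary box can be sketched in code. The Python function below is one possible reading of that procedure, not a standard library routine: the name `underscore_groups` and its inputs are ours. It takes the sample means and the pairs judged significantly different, and returns the runs of treatments that would share an underscoring line.

```python
def underscore_groups(means, significant_pairs):
    """Return the runs of labels that would share an underscoring line.

    means maps a label to its sample mean; significant_pairs is a
    collection of label pairs whose T-K interval excludes 0.
    """
    differs = {frozenset(pair) for pair in significant_pairs}
    ordered = sorted(means, key=means.get)   # labels, smallest mean first
    lines = []
    for start in range(len(ordered)):
        end = start
        # Extend to the right while the next mean is not significantly
        # different from the mean at the start of the run.
        while (end + 1 < len(ordered)
               and frozenset((ordered[start], ordered[end + 1])) not in differs):
            end += 1
        already_covered = any(s <= start and end <= e for s, e in lines)
        if end > start and not already_covered:
            lines.append((start, end))
    return [ordered[s:e + 1] for s, e in lines]


# Example 17.6: only the intervals for mu2 - mu3 and mu3 - mu4 include 0.
sleep_means = {1: 79.28, 2: 61.54, 3: 47.92, 4: 32.76}
sleep_significant = [(1, 2), (1, 3), (1, 4), (2, 4)]
print(underscore_groups(sleep_means, sleep_significant))   # [[4, 3], [3, 2]]
```

Each inner list corresponds to one line in the display: one line under x̄4 and x̄3, and a second under x̄3 and x̄2, with x̄1 left without a line.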
Figure 17.7 displays the SAS output that agrees with our underscoring; letters are used
to indicate groupings in place of the underscoring.
Figure 17.7 SAS output for Example 17.6
Alpha = 0.05   df = 16   MSE = 92.9625
Critical Value of Studentized Range = 4.046
Minimum Significant Difference = 17.446
Means with the same letter are not significantly different.
Tukey Grouping   Mean     N   Treatment
     A           79.280   5   0 (control)
     B           61.540   5   1 g/kg
  C  B           47.920   5   2 g/kg
  C              32.760   5   4 g/kg
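The "Minimum Significant Difference" in the SAS output is the common ± term for equal sample sizes. The Python sketch below is an illustration with variable names of our choosing. It reproduces that value from the critical value, MSE, and n shown in Figure 17.7, and then lists each pairwise difference in sample means together with whether its interval includes 0.

```python
from itertools import combinations
from math import sqrt

# Summary values from Example 17.6 and Figure 17.7.
means = {1: 79.28, 2: 61.54, 3: 47.92, 4: 32.76}
n, mse, q = 5, 92.9625, 4.046

# With equal sample sizes every interval has the same plus-or-minus term.
msd = q * sqrt(mse / n)
print(round(msd, 3))   # 17.446

results = {}
for i, j in combinations(sorted(means), 2):
    diff = means[i] - means[j]
    results[(i, j)] = abs(diff) <= msd   # True when the interval includes 0
    print(f"mu{i} - mu{j}: {diff:.2f} +/- {msd:.3f}  includes 0: {results[(i, j)]}")
```

Only the comparisons of treatments 2 and 3 and of treatments 3 and 4 are flagged as including 0, which agrees with the letter groupings in the SAS output.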
Example 17.7 Roommate Satisfaction
How satisfied are college students with dormitory roommates? The article “Roommate
Satisfaction and Ethnic Identity in Mixed-Race and White University Roommate Dyads”
(Journal of College Student Development [1998]: 194–199) investigated differences among
randomly assigned African American/white, Asian/white, Hispanic/white, and white/
white roommate pairs. The researchers used a one-way ANOVA to analyze scores on the
Roommate Relationship Inventory to see whether a difference in mean score existed for
the four types of roommate pairs. They reported “significant differences among the means
(P < 0.01). Follow-up Tukey [intervals] . . . indicated differences between White dyads
(M = 77.49) and African American/White dyads (M = 71.27). No other significant differences were found.”
Although the mean satisfaction scores for the Asian/white and Hispanic/white groups
were not given, they must have been between 77.49 (the mean for the white/white pairs)
and 71.27 (the mean for the African American/white pairs). (If they had been larger than
77.49, they would have been significantly different from the African American/white
pairs mean, and if they had been smaller than 71.27, they would have been significantly
different from the white/white pairs mean.) An underscoring consistent with the reported
information is
African-American/White    Hispanic/White and Asian/White    White/White
--------------------------------------------------------
                          ---------------------------------------------
17.2 Exercises
Each Exercise Set assesses the following chapter learning objectives: M4, P1
Section 17.2
Exercise Set 1
17.14 Leaf surface area is an important variable in plant
gas-exchange rates. Dry matter per unit surface area (mg/cm3)
was measured for trees raised under three different growing
conditions. Let μ1, μ2, and μ3 represent the mean dry matter
per unit surface area for the growing conditions 1, 2, and 3,
respectively. The given 95% simultaneous confidence intervals are:
Difference   Interval
μ1 − μ2      (−3.11, −1.11)
μ1 − μ3      (−4.06, −2.06)
μ2 − μ3      (−1.95, 0.05)
Which of the following four statements do you think describes
the relationship between μ1, μ2, and μ3? Explain your choice.
a. μ1 = μ2, and μ3 differs from μ1 and μ2.
b. μ1 = μ3, and μ2 differs from μ1 and μ3.
c. μ2 = μ3, and μ1 differs from μ2 and μ3.
d. All three μ’s are different from one another.
17.15 The accompanying underscoring pattern appears in
the article “Women’s and Men’s Eating Behavior Following
Exposure to Ideal-Body Images and Text” (Communications
Research [2006]: 507–529). Women either viewed slides
depicting images of thin female models with no text (treatment 1); viewed the same slides accompanied by diet and
exercise-related text (treatment 2); or viewed the same
slides accompanied by text that was unrelated to diet and
exercise (treatment 3). A fourth group of women did not
view any slides (treatment 4). Participants were assigned
at random to the four treatments. Participants were then
asked to complete a questionnaire in a room where pretzels
were set out on the tables. An observer recorded how many
pretzels participants ate while completing the questionnaire. Write a few sentences interpreting this underscoring
pattern.
Treatment:                           2      1      4      3
Mean number of pretzels consumed:   0.97   1.03   2.20   2.65
17.16 The accompanying data resulted from a flammability
study in which specimens of five different fabrics were tested
to determine burn times.
Fabric   Burn times
1        17.8   16.2   15.9   15.5
2        13.2   10.4   11.3
3        11.8   11.0    9.2   10.0
4        16.5   15.3   14.1   15.0   13.9
5        13.9   10.8   12.8   11.7
MSTr = 23.67   MSE = 1.39   F = 17.08   P-value = 0.000
The accompanying output gives the T-K intervals as calculated by Minitab. Identify significant differences and give
the underscoring pattern.
Individual error rate = 0.00750
Critical value = 4.37
Intervals for (column level mean) − (row level mean)
          1         2         3         4
2       1.938
        7.495
3       3.278    -1.645
        8.422     3.912
4      -1.050    -5.983    -6.900
        3.830    -0.670    -2.020
5       1.478    -3.445    -4.372     0.220
        6.622     2.112     0.772     5.100
Section 17.2
Exercise Set 2
17.17 The paper “Trends in Blood Lead Levels and Blood
Lead Testing among U.S. Children Aged 1 to 5 Years” (Pediatrics
[2009]: e376–e385) gave data on blood lead levels (in μg/dL)
for samples of children living in homes that had been classified either at low, medium, or high risk of lead exposure,
based on when the home was constructed. After using a multiple comparison procedure, the authors reported the following:
1. The difference in mean blood lead level between low-risk
housing and medium-risk housing was significant.
2. The difference in mean blood lead level between low-risk
housing and high-risk housing was significant.
3. The difference in mean blood lead level between medium-risk
housing and high-risk housing was significant.
Which of the following sets of T-K intervals (Set 1, 2, or 3)
is consistent with the authors’ conclusions? Explain your
choice.
μL = mean blood lead level for children living in low-risk
housing
μM = mean blood lead level for children living in medium-risk
housing
μH = mean blood lead level for children living in high-risk
housing
Difference   Set 1            Set 2            Set 3
μL − μM      (−0.6, 0.1)      (−0.6, −0.1)     (−0.6, −0.1)
μL − μH      (−1.5, −0.6)     (−1.5, −0.6)     (−1.5, −0.6)
μM − μH      (−0.9, −0.3)     (−0.9, 0.3)      (−0.9, −0.3)
17.18 The paper referenced in Exercise 17.15 also gave
the following underscoring pattern for men.
Treatment:                           2      1      3      4
Mean number of pretzels consumed:   6.61   5.96   3.38   2.70
a. Write a few sentences interpreting this underscoring pattern.
b. Using your answers from Part (a) and from Exercise 17.15, write a few sentences describing the differences
between how men and women respond to the treatments.
17.19 Do lizards play a role in spreading plant seeds?
Some research carried out in South Africa would suggest
so (“Dispersal of Namaqua Fig [Ficus cordata cordata] Seeds
by the Augrabies Flat Lizard [Platysaurus broadleyi],” Journal
of Herpetology [1999]: 328–330). The researchers collected
400 seeds of a particular type of fig, 100 of which were
from each treatment: lizard dung, bird dung, rock hyrax
dung, and uneaten figs. They planted these seeds in batches
of 5, and for each group of 5 they recorded how many of the
seeds germinated. This resulted in 20 observations for each
treatment. The treatment means and standard deviations are
given in the accompanying table.
Treatment      n     x̄      s
Uneaten figs   20   2.40   0.30
Lizard dung    20   2.35   0.33
Bird dung      20   1.70   0.34
Hyrax dung     20   1.45   0.28
a. Construct the appropriate ANOVA table, and test the
hypothesis that there is no difference between mean
number of seeds germinating for the four treatments.
b. Is there evidence that seeds eaten and then excreted
by lizards germinate at a higher rate than those eaten
and then excreted by birds? Give statistical evidence
to support your answer.
Additional Exercises for Section 17.2
17.20 Samples of six different brands of diet or imitation
margarine were analyzed to determine the level of physiologically active polyunsaturated fatty acids (PAPUFA, in
percent), resulting in the data shown in the accompanying
table. (The data are fictitious, but the sample means agree
with data reported in Consumer Reports.)
Imperial         14.1   13.6   14.4   14.3
Parkay           12.8   12.5   13.4   13.0   12.3
Blue Bonnet      13.5   13.4   14.1   14.3
Chiffon          13.2   12.7   12.6   13.9
Mazola           16.8   17.2   16.4   17.3   18.0
Fleischmann’s    18.1   17.2   18.7   18.4
a. Test for differences among the true mean PAPUFA percentages for the different brands. Use $\alpha = 0.05$.
b. Use the T-K procedure to compute 95% simultaneous
confidence intervals for all differences between means
and give the corresponding underscoring pattern.
17.21 The nutritional quality of shrubs commonly used for
feed by rabbits was the focus of a study summarized in the
article “Estimation of Browse by Size Classes for Snowshoe
Hare” ( Journal of Wildlife Management [1980]: 34–40). The
energy contents (cal/g) of three sizes (4 mm or less, 5–7 mm,
and 8–10 mm) of serviceberries were studied. Let $\mu_1$, $\mu_2$,
and $\mu_3$ denote the true mean energy content for the three size
classes. Suppose that 95% simultaneous confidence intervals
for $\mu_1 - \mu_2$, $\mu_1 - \mu_3$, and $\mu_2 - \mu_3$ are (−10, 290), (150,
450), and (10, 310), respectively. How would you interpret
these intervals?
17.22 Consider the accompanying data on plant growth
after the application of five different types of growth hormone.
Hormone
1    13   17    7   14
2    21   13   20   17
3    18   14   17   21
4     7   11   18   10
5     6   11   15    8
a. Carry out the F test at level $\alpha = 0.05$.
b. What happens when the T-K procedure is applied? (Note:
This “contradiction” can occur when H0 is “barely”
rejected. It happens because the test and the multiple
comparison method are based on different distributions.
Consult your friendly neighborhood statistician for more
information.)
Chapter 17 Appendix: ANOVA Computations
Let $T_1$ denote the sum of the observations in the sample from the first population or treatment, and let $T_2, \ldots, T_k$ denote the other sample totals. Also let $T$ represent the sum of all
$N$ observations (the grand total) and

$$CF = \text{correction factor} = \frac{T^2}{N}$$

Then

$$SSTo = \sum_{\text{all } N \text{ obs.}} x^2 - CF$$

$$SSTr = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \cdots + \frac{T_k^2}{n_k} - CF$$

$$SSE = SSTo - SSTr$$
Example 15A.1
Treatment 1    4.2   3.7   5.0   4.8    $T_1 = 17.7$   $n_1 = 4$
Treatment 2    5.7   6.2   6.4          $T_2 = 18.3$   $n_2 = 3$
Treatment 3    4.6   3.2   3.5   3.9    $T_3 = 15.2$   $n_3 = 4$
$T = 51.2$   $N = 11$

$$CF = \text{correction factor} = \frac{T^2}{N} = \frac{(51.2)^2}{11} = 238.31$$

$$SSTr = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \cdots + \frac{T_k^2}{n_k} - CF = \frac{(17.7)^2}{4} + \frac{(18.3)^2}{3} + \frac{(15.2)^2}{4} - 238.31 = 9.40$$

$$SSTo = \sum_{\text{all } N \text{ obs.}} x^2 - CF = (4.2)^2 + (3.7)^2 + \cdots + (3.9)^2 - 238.31 = 11.81$$

$$SSE = SSTo - SSTr = 11.81 - 9.40 = 2.41$$
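The computing formulas in this appendix can be checked with a few lines of Python. The following is a minimal sketch, not part of the original text; the function name `anova_sums_of_squares` is a hypothetical helper, and the data are the three treatment samples from the example above.

```python
def anova_sums_of_squares(samples):
    """Return (CF, SSTo, SSTr, SSE) using the appendix computing formulas.

    `samples` is a list of lists, one inner list per treatment.
    (Hypothetical helper name, introduced only for this illustration.)
    """
    all_obs = [x for sample in samples for x in sample]
    n_total = len(all_obs)                 # N
    grand_total = sum(all_obs)             # T
    cf = grand_total ** 2 / n_total        # CF = T^2 / N
    ssto = sum(x ** 2 for x in all_obs) - cf
    # Each treatment contributes T_i^2 / n_i
    sstr = sum(sum(s) ** 2 / len(s) for s in samples) - cf
    sse = ssto - sstr
    return cf, ssto, sstr, sse


# Data from the worked example in this appendix
samples = [
    [4.2, 3.7, 5.0, 4.8],   # Treatment 1
    [5.7, 6.2, 6.4],        # Treatment 2
    [4.6, 3.2, 3.5, 3.9],   # Treatment 3
]
cf, ssto, sstr, sse = anova_sums_of_squares(samples)
print(f"CF = {cf:.2f}, SSTo = {ssto:.2f}, SSTr = {sstr:.2f}, SSE = {sse:.2f}")
```

Rounded to two decimal places, the results should agree with the hand computation in the example.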
Are You Ready to Move On?
Chapter 17 Review Exercises
All chapter learning objectives are assessed in these exercises. The learning objectives assessed in each exercise are given in
parentheses for each exercise.
17.23 (C1, M1, M2, M3)
The paper “Women’s and Men’s Eating Behavior Following
Exposure to Ideal-Body Images and Text” (Communication
Research [2006]: 507–529) describes an experiment in which
74 men were assigned at random to one of four treatments:
1. Viewed slides of fit, muscular men
2. Viewed slides of fit, muscular men accompanied by diet
and fitness-related text
3. Viewed slides of fit, muscular men accompanied by text
not related to diet and fitness
4. Did not view any slides
The participants then went to a room to complete a questionnaire. In this room, bowls of pretzels were set out on
the tables. A research assistant noted how many pretzels
were consumed by each participant while completing the
questionnaire. Data consistent with summary quantities
given in the paper are given in the accompanying table.
Do these data provide convincing evidence that the mean
number of pretzels consumed is not the same for all four
treatments? Test the relevant hypotheses using a significance level of 0.05.
Treatment 1
8
7
4
13
2
1
5
8
11
5
1
0
6
4
10
7
0
12
Treatment 2
Treatment 3
Treatment 4
1
5
2
0
3
0
3
4
4
5
5
7
8
4
0
6
3
5
2
5
7
5
2
0
0
3
4
2
4
1
1
6
8
0
4
9
8
6
2
7
8
8
5
14
9
0
6
3
12
5
6
10
8
6
2
10
17.24 (P1, P2)
Can use of an online plagiarism-detection system reduce
plagiarism in student research papers? The paper “Plagiarism
and Technology: A Tool for Coping with Plagiarism” ( Journal of
Education for Business [2005]: 149–152) describes a study in
which randomly selected research papers submitted by students during five semesters were analyzed for plagiarism.
For each paper, the percentage of plagiarized words in the
paper was determined by an online analysis. In each of the
five semesters, students were told during the first two class
meetings that they would have to submit an electronic version of their research papers and that the papers would be
reviewed for plagiarism. Suppose that the number of papers
sampled in each of the five semesters and the means and
standard deviations for percentage of plagiarized words
are as given in the accompanying table. For purposes of
this exercise, assume that the conditions necessary for the
ANOVA F test are reasonable. Do these data provide evidence to support the claim that mean percentage of plagiarized words is not the same for all five semesters? Test the
appropriate hypotheses using $\alpha = 0.05$.
Semester     n    Mean    Standard deviation
1           39    6.31    3.75
2           42    3.31    3.06
3           32    1.79    3.25
4           32    1.83    3.13
5           34    1.50    2.37
17.25 (M4, P2)
The paper referenced in Exercise 17.3 described an experiment to determine if restrictive age labeling on video games
increased the attractiveness of the game for boys ages 12
to 13. In that exercise, the null hypothesis was $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$, where $\mu_1$ is the population mean attractiveness
rating for the game with the 7+ age label, and $\mu_2$, $\mu_3$, and
$\mu_4$ are the population mean attractiveness scores for the
12+, 16+, and 18+ age labels, respectively. The sample
data are given in the accompanying table.
7+ label      6   6   6   5   4   8   6   1   2   4
12+ label     8   7   8   5   7   9   5   8   4   7
16+ label     7   9   8   6   7   4   8   9   6   7
18+ label    10   9   6   8   7   6   8   9  10   8
a. Compute the 95% T-K intervals and then use the underscoring procedure described in this section to identify
significant differences among the age labels.
b. Based on your answer to Part (a), write a few sentences
commenting on the theory that the more restrictive the
age label on a video game, the more attractive the game
is to 12- to 13-year-old boys.
17.26 (M4)
The authors of the paper “Beyond the Shooter Game:
Examining Presence and Hostile Outcomes among Male Game
Players” (Communication Research [2006]: 448–466) studied
how video game content might influence attitudes and
behavior. Male students at a large Midwestern university
were assigned at random to play one of three action-oriented
video games. Two of the games involved some violence—
one was a shooting game and one was a fighting game. The
third game was a nonviolent race car driving game. After
playing a game for 20 minutes, participants answered a set
of questions. The responses were used to determine values
of three measures of aggression: (1) a measure of aggressive behavior; (2) a measure of aggressive thoughts; and (3)
a measure of aggressive feelings. The authors hypothesized
that the means for the three measures of aggression would
be greatest for the fighting game and lowest for the driving
game.
a. For the measure of aggressive behavior, the paper reports
that the mean score for the fighting game was significantly higher than the mean scores for the shooting and
driving games, but that the mean scores for the shooting
and driving games were not significantly different. The three
sample means were:
               Driving   Shooting   Fighting
Sample mean     3.42       4.00       5.30
Use the underscoring procedure of this section to construct
a display that shows any significant differences in mean
aggressive behavior score among the three games.
b. For the measure of aggressive thoughts, the three sample
means were:
               Driving   Shooting   Fighting
Sample mean     2.81       3.44       4.01
The paper states that the mean score for the fighting game
only significantly differed from the mean score for the
driving game, and that the mean score for the shooting
game did not significantly differ from either the fighting
or driving games. Use the underscoring procedure of this
section to construct a display that shows any significant
differences in mean aggressive thoughts score among the
three games.
Technology Notes
ANOVA
JMP
1. Input the raw data into the first column
2. Input the group information into the second column
3. Click Analyze then select Fit Y by X
4. Click and drag the first column name from the box under
Select Columns to the box next to Y, Response
5. Click and drag the second column name from the box under
Select Columns to the box next to X, Factor
6. Click OK
7. Click the red arrow next to Oneway Analysis of… and select
Means/ANOVA
Unless otherwise noted, all content on this page is © Cengage Learning.
MINITAB
Data stored in separate columns
1. Input each group’s data in a separate column
2. Click Stat then ANOVA then One-Way (Unstacked)…
3. Click in the box under Responses (in separate columns):
4. Double-click the column name containing each group’s data
5. Click OK
Data stored in one column
1. Input the data into one column
2. Input the group information into a second column
3. Click Stat then ANOVA then One-Way…
4. Click in the box next to Response: and double-click the
column name containing the raw data values
5. Click in the box next to Factor: and double-click the column
name containing the group information
6. Click OK
SPSS
1. Input the raw data for all groups into one column
2. Input the group information into a second column (use group
numbers)
3. Click Analyze then click Compare Means then click One-Way ANOVA…
4. Click the name of the column containing the raw data and
click the arrow to move it to the box under Dependent List:
5. Click the name of the column containing the group data and
click the arrow to move it to the box under Factor:
6. Click OK
Excel
1. Input the raw data for each group into a separate column
2. Click the Data ribbon
3. Click Data Analysis in the Analysis group
Note: If you do not see Data Analysis listed on the Ribbon, see
the Technology Notes for Chapter 2 for instructions on installing
this add-on.
4. Select Anova: Single Factor and click OK
5. Click on the box next to Input Range and select ALL columns
of data (if you typed and selected column titles, click the box
next to Labels in First Row)
6. Click in the box next to Alpha and type the significance level
7. Click OK
Note: The test statistic and p-value can be found in the first row
of the table under F and P-value, respectively.
TI-83/84
1. Enter the data for each group into a separate list starting
with L1 (In order to access lists press the STAT key, highlight
the option called Edit… then press ENTER)
2. Press STAT
3. Highlight TESTS
4. Highlight ANOVA and press ENTER
5. Press 2nd then 1
6. Press ,
7. Press 2nd then 2
8. Press ,
9. Continue to input lists where data is stored separated by
commas until you input the final list
10. When you are finished entering all lists, press )
11. Press ENTER
TI-Nspire
Summarized Data
1. Enter the summary information for the first group in a list
in the following order: the value for n followed by a comma
then the value of $\bar{x}$ followed by a comma then the value of s
(In order to access data lists select the spreadsheet option
and press enter)
Note: Be sure to title the lists by selecting the top row of the
column and typing a title.
2. Enter the summary information for the second group in a list in
the following order: the value for n followed by a comma then
the value of $\bar{x}$ followed by a comma then the value of s
3. Continue to enter summary information for each group in
this manner
4. When you are finished entering data for each group, press
menu then 4:Statistics then 4:Stat Tests then C:ANOVA…
then press enter
5. For Data Input Method choose Stats from the drop-down
menu
6. For Number of Groups enter the number of groups, k
7. In the box next to Group 1 Stats select the list containing
group one’s summary statistics
8. In the box next to Group 2 Stats select the list containing
group two’s summary statistics
9. Continue entering summary statistics in this manner for all
groups
10. Press OK
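For readers working outside the packages listed here, the summarized-data route can also be sketched in plain Python. This is an illustration only, not one of the textbook's technology notes: the function name `anova_from_summary` and the three groups of summary statistics are hypothetical. It forms SSTr from the group sizes and means and SSE from the group standard deviations, then computes F; the P-value would still be bracketed using Table 7.

```python
def anova_from_summary(groups):
    """Compute the single-factor ANOVA F statistic from summary statistics.

    `groups` is a list of (n, mean, sd) tuples, one per group.
    (Hypothetical helper name, introduced only for this illustration.)
    """
    k = len(groups)
    n_total = sum(n for n, _, _ in groups)
    # Grand mean is the sample-size-weighted average of the group means
    grand_mean = sum(n * mean for n, mean, _ in groups) / n_total
    sstr = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    sse = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_treatment = k - 1
    df_error = n_total - k
    f_stat = (sstr / df_treatment) / (sse / df_error)
    return f_stat, df_treatment, df_error


# Hypothetical summary statistics: (n, sample mean, sample standard deviation)
groups = [(5, 10.0, 2.0), (5, 12.0, 2.0), (5, 14.0, 2.0)]
f_stat, df1, df2 = anova_from_summary(groups)
print(f"F = {f_stat:.2f} with df1 = {df1} and df2 = {df2}")
```

For these made-up numbers, F is 5.00 with df1 = 2 and df2 = 12. In Table 7 this falls between the 0.05 critical value (3.89) and the 0.01 critical value (6.93), so 0.01 < P-value < 0.05.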
Raw data
1. Enter each group’s data into separate data lists (In order to
access data lists, select the spreadsheet option and press
enter)
Note: Be sure to title the lists by selecting the top row of the
column and typing a title.
2. Press the menu key and select 4:Statistics then 4:Stat Tests
then C:ANOVA… and press enter
3. For Data Input Method choose Data from the drop-down
menu
4. For Number of Groups input the number of groups, k
5. Press OK
6. For List 1 select the list title that contains group one’s data
from the drop-down menu
7. For List 2 select the list title that contains group two’s data
from the drop-down menu
8. Continue to select the appropriate lists for all groups
9. When you are finished inputting lists press OK
Appendix Tables
Table 7 Values That Capture Specified Upper-Tail F Curve Areas

                                                  df1
df2   Area         1        2        3        4        5        6        7        8        9       10
1     .10      39.86    49.50    53.59    55.83    57.24    58.20    58.91    59.44    59.86    60.19
      .05     161.40   199.50   215.70   224.60   230.20   234.00   236.80   238.90   240.50   241.90
      .01    4052.00  5000.00  5403.00  5625.00  5764.00  5859.00  5928.00  5981.00  6022.00  6056.00
2     .10       8.53     9.00     9.16     9.24     9.29     9.33     9.35     9.37     9.38     9.39
      .05      18.51    19.00    19.16    19.25    19.30    19.33    19.35    19.37    19.38    19.40
      .01      98.50    99.00    99.17    99.25    99.30    99.33    99.36    99.37    99.39    99.40
      .001    998.50   999.00   999.20   999.20   999.30   999.30   999.40   999.40   999.40   999.40
3     .10       5.54     5.46     5.39     5.34     5.31     5.28     5.27     5.25     5.24     5.23
      .05      10.13     9.55     9.28     9.12     9.01     8.94     8.89     8.85     8.81     8.79
      .01      34.12    30.82    29.46    28.71    28.24    27.91    27.67    27.49    27.35    27.23
      .001    167.00   148.50   141.10   137.10   134.60   132.80   131.60   130.60   129.90   129.20
4     .10       4.54     4.32     4.19     4.11     4.05     4.01     3.98     3.95     3.94     3.92
      .05       7.71     6.94     6.59     6.39     6.26     6.16     6.09     6.04     6.00     5.96
      .01      21.20    18.00    16.69    15.98    15.52    15.21    14.98    14.80    14.66    14.55
      .001     74.14    61.25    56.18    53.44    51.71    50.53    49.66    49.00    48.47    48.05
5     .10       4.06     3.78     3.62     3.52     3.45     3.40     3.37     3.34     3.32     3.30
      .05       6.61     5.79     5.41     5.19     5.05     4.95     4.88     4.82     4.77     4.74
      .01      16.26    13.27    12.06    11.39    10.97    10.67    10.46    10.29    10.16    10.05
      .001     47.18    37.12    33.20    31.09    29.75    28.83    28.16    27.65    27.24    26.92
6     .10       3.78     3.46     3.29     3.18     3.11     3.05     3.01     2.98     2.96     2.94
      .05       5.99     5.14     4.76     4.53     4.39     4.28     4.21     4.15     4.10     4.06
      .01      13.75    10.92     9.78     9.15     8.75     8.47     8.26     8.10     7.98     7.87
      .001     35.51    27.00    23.70    21.92    20.80    20.03    19.46    19.03    18.69    18.41
7     .10       3.59     3.26     3.07     2.96     2.88     2.83     2.78     2.75     2.72     2.70
      .05       5.59     4.74     4.35     4.12     3.97     3.87     3.79     3.73     3.68     3.64
      .01      12.25     9.55     8.45     7.85     7.46     7.19     6.99     6.84     6.72     6.62
      .001     29.25    21.69    18.77    17.20    16.21    15.52    15.02    14.63    14.33    14.08
8     .10       3.46     3.11     2.92     2.81     2.73     2.67     2.62     2.59     2.56     2.54
      .05       5.32     4.46     4.07     3.84     3.69     3.58     3.50     3.44     3.39     3.35
      .01      11.26     8.65     7.59     7.01     6.63     6.37     6.18     6.03     5.91     5.81
      .001     25.41    18.49    15.83    14.39    13.48    12.86    12.40    12.05    11.77    11.54
9     .10       3.36     3.01     2.81     2.69     2.61     2.55     2.51     2.47     2.44     2.42
      .05       5.12     4.26     3.86     3.63     3.48     3.37     3.29     3.23     3.18     3.14
      .01      10.56     8.02     6.99     6.42     6.06     5.80     5.61     5.47     5.35     5.26
      .001     22.86    16.39    13.90    12.56    11.71    11.13    10.70    10.37    10.11     9.89
10    .10       3.29     2.92     2.73     2.61     2.52     2.46     2.41     2.38     2.35     2.32
      .05       4.96     4.10     3.71     3.48     3.33     3.22     3.14     3.07     3.02     2.98
      .01      10.04     7.56     6.55     5.99     5.64     5.39     5.20     5.06     4.94     4.85
      .001     21.04    14.91    12.55    11.28    10.48     9.93     9.52     9.20     8.96     8.75
11    .10       3.23     2.86     2.66     2.54     2.45     2.39     2.34     2.30     2.27     2.25
      .05       4.84     3.98     3.59     3.36     3.20     3.09     3.01     2.95     2.90     2.85
      .01       9.65     7.21     6.22     5.67     5.32     5.07     4.89     4.74     4.63     4.54
      .001     19.69    13.81    11.56    10.35     9.58     9.05     8.66     8.35     8.12     7.92
12    .10       3.18     2.81     2.61     2.48     2.39     2.33     2.28     2.24     2.21     2.19
      .05       4.75     3.89     3.49     3.26     3.11     3.00     2.91     2.85     2.80     2.75
      .01       9.33     6.93     5.95     5.41     5.06     4.82     4.64     4.50     4.39     4.30
      .001     18.64    12.97    10.80     9.63     8.89     8.38     8.00     7.71     7.48     7.29
13    .10       3.14     2.76     2.56     2.43     2.35     2.28     2.23     2.20     2.16     2.14
      .05       4.67     3.81     3.41     3.18     3.03     2.92     2.83     2.77     2.71     2.67
      .01       9.07     6.70     5.74     5.21     4.86     4.62     4.44     4.30     4.19     4.10
      .001     17.82    12.31    10.21     9.07     8.35     7.86     7.49     7.21     6.98     6.80
14    .10       3.10     2.73     2.52     2.39     2.31     2.24     2.19     2.15     2.12     2.10
      .05       4.60     3.74     3.34     3.11     2.96     2.85     2.76     2.70     2.65     2.60
      .01       8.86     6.51     5.56     5.04     4.69     4.46     4.28     4.14     4.03     3.94
      .001     17.14    11.78     9.73     8.62     7.92     7.44     7.08     6.80     6.58     6.40
15    .10       3.07     2.70     2.49     2.36     2.27     2.21     2.16     2.12     2.09     2.06
      .05       4.54     3.68     3.29     3.06     2.90     2.79     2.71     2.64     2.59     2.54
      .01       8.68     6.36     5.42     4.89     4.56     4.32     4.14     4.00     3.89     3.80
      .001     16.59    11.34     9.34     8.25     7.57     7.09     6.74     6.47     6.26     6.08
16    .10       3.05     2.67     2.46     2.33     2.24     2.18     2.13     2.09     2.06     2.03
      .05       4.49     3.63     3.24     3.01     2.85     2.74     2.66     2.59     2.54     2.49
      .01       8.53     6.23     5.29     4.77     4.44     4.20     4.03     3.89     3.78     3.69
      .001     16.12    10.97     9.01     7.94     7.27     6.80     6.46     6.19     5.98     5.81
17    .10       3.03     2.64     2.44     2.31     2.22     2.15     2.10     2.06     2.03     2.00
      .05       4.45     3.59     3.20     2.96     2.81     2.70     2.61     2.55     2.49     2.45
      .01       8.40     6.11     5.18     4.67     4.34     4.10     3.93     3.79     3.68     3.59
      .001     15.72    10.66     8.73     7.68     7.02     6.56     6.22     5.96     5.75     5.58
18    .10       3.01     2.62     2.42     2.29     2.20     2.13     2.08     2.04     2.00     1.98
      .05       4.41     3.55     3.16     2.93     2.77     2.66     2.58     2.51     2.46     2.41
      .01       8.29     6.01     5.09     4.58     4.25     4.01     3.84     3.71     3.60     3.51
      .001     15.38    10.39     8.49     7.46     6.81     6.35     6.02     5.76     5.56     5.39
19    .10       2.99     2.61     2.40     2.27     2.18     2.11     2.06     2.02     1.98     1.96
      .05       4.38     3.52     3.13     2.90     2.74     2.63     2.54     2.48     2.42     2.38
      .01       8.18     5.93     5.01     4.50     4.17     3.94     3.77     3.63     3.52     3.43
      .001     15.08    10.16     8.28     7.27     6.62     6.18     5.85     5.59     5.39     5.22
20    .10       2.97     2.59     2.38     2.25     2.16     2.09     2.04     2.00     1.96     1.94
      .05       4.35     3.49     3.10     2.87     2.71     2.60     2.51     2.45     2.39     2.35
      .01       8.10     5.85     4.94     4.43     4.10     3.87     3.70     3.56     3.46     3.37
      .001     14.82     9.95     8.10     7.10     6.46     6.02     5.69     5.44     5.24     5.08
21    .10       2.96     2.57     2.36     2.23     2.14     2.08     2.02     1.98     1.95     1.92
      .05       4.32     3.47     3.07     2.84     2.68     2.57     2.49     2.42     2.37     2.32
      .01       8.02     5.78     4.87     4.37     4.04     3.81     3.64     3.51     3.40     3.31
      .001     14.59     9.77     7.94     6.95     6.32     5.88     5.56     5.31     5.11     4.95
22    .10       2.95     2.56     2.35     2.22     2.13     2.06     2.01     1.97     1.93     1.90
      .05       4.30     3.44     3.05     2.82     2.66     2.55     2.46     2.40     2.34     2.30
      .01       7.95     5.72     4.82     4.31     3.99     3.76     3.59     3.45     3.35     3.26
      .001     14.38     9.61     7.80     6.81     6.19     5.76     5.44     5.19     4.99     4.83
23    .10       2.94     2.55     2.34     2.21     2.11     2.05     1.99     1.95     1.92     1.89
      .05       4.28     3.42     3.03     2.80     2.64     2.53     2.44     2.37     2.32     2.27
      .01       7.88     5.66     4.76     4.26     3.94     3.71     3.54     3.41     3.30     3.21
      .001     14.20     9.47     7.67     6.70     6.08     5.65     5.33     5.09     4.89     4.73
24    .10       2.93     2.54     2.33     2.19     2.10     2.04     1.98     1.94     1.91     1.88
      .05       4.26     3.40     3.01     2.78     2.62     2.51     2.42     2.36     2.30     2.25
      .01       7.82     5.61     4.72     4.22     3.90     3.67     3.50     3.36     3.26     3.17
      .001     14.03     9.34     7.55     6.59     5.98     5.55     5.23     4.99     4.80     4.64
25    .10       2.92     2.53     2.32     2.18     2.09     2.02     1.97     1.93     1.89     1.87
      .05       4.24     3.39     2.99     2.76     2.60     2.49     2.40     2.34     2.28     2.24
      .01       7.77     5.57     4.68     4.18     3.85     3.63     3.46     3.32     3.22     3.13
      .001     13.88     9.22     7.45     6.49     5.89     5.46     5.15     4.91     4.71     4.56
26    .10       2.91     2.52     2.31     2.17     2.08     2.01     1.96     1.92     1.88     1.86
      .05       4.23     3.37     2.98     2.74     2.59     2.47     2.39     2.32     2.27     2.22
      .01       7.72     5.53     4.64     4.14     3.82     3.59     3.42     3.29     3.18     3.09
      .001     13.74     9.12     7.36     6.41     5.80     5.38     5.07     4.83     4.64     4.48
27    .10       2.90     2.51     2.30     2.17     2.07     2.00     1.95     1.91     1.87     1.85
      .05       4.21     3.35     2.96     2.73     2.57     2.46     2.37     2.31     2.25     2.20
      .01       7.68     5.49     4.60     4.11     3.78     3.56     3.39     3.26     3.15     3.06
      .001     13.61     9.02     7.27     6.33     5.73     5.31     5.00     4.76     4.57     4.41
28    .10       2.89     2.50     2.29     2.16     2.06     2.00     1.94     1.90     1.87     1.84
      .05       4.20     3.34     2.95     2.71     2.56     2.45     2.36     2.29     2.24     2.19
      .01       7.64     5.45     4.57     4.07     3.75     3.53     3.36     3.23     3.12     3.03
      .001     13.50     8.93     7.19     6.25     5.66     5.24     4.93     4.69     4.50     4.35
29    .10       2.89     2.50     2.28     2.15     2.06     1.99     1.93     1.89     1.86     1.83
      .05       4.18     3.33     2.93     2.70     2.55     2.43     2.35     2.28     2.22     2.18
      .01       7.60     5.42     4.54     4.04     3.73     3.50     3.33     3.20     3.09     3.00
      .001     13.39     8.85     7.12     6.19     5.59     5.18     4.87     4.64     4.45     4.29
30    .10       2.88     2.49     2.28     2.14     2.05     1.98     1.93     1.88     1.85     1.82
      .05       4.17     3.32     2.92     2.69     2.53     2.42     2.33     2.27     2.21     2.16
      .01       7.56     5.39     4.51     4.02     3.70     3.47     3.30     3.17     3.07     2.98
      .001     13.29     8.77     7.05     6.12     5.53     5.12     4.82     4.58     4.39     4.24
40    .10       2.84     2.44     2.23     2.09     2.00     1.93     1.87     1.83     1.79     1.76
      .05       4.08     3.23     2.84     2.61     2.45     2.34     2.25     2.18     2.12     2.08
      .01       7.31     5.18     4.31     3.83     3.51     3.29     3.12     2.99     2.89     2.80
      .001     12.61     8.25     6.59     5.70     5.13     4.73     4.44     4.21     4.02     3.87
60    .10       2.79     2.39     2.18     2.04     1.95     1.87     1.82     1.77     1.74     1.71
      .05       4.00     3.15     2.76     2.53     2.37     2.25     2.17     2.10     2.04     1.99
      .01       7.08     4.98     4.13     3.65     3.34     3.12     2.95     2.82     2.72     2.63
      .001     11.97     7.77     6.17     5.31     4.76     4.37     4.09     3.86     3.69     3.54
90    .10       2.76     2.36     2.15     2.01     1.91     1.84     1.78     1.74     1.70     1.67
      .05       3.95     3.10     2.71     2.47     2.32     2.20     2.11     2.04     1.99     1.94
      .01       6.93     4.85     4.01     3.53     3.23     3.01     2.84     2.72     2.61     2.52
      .001     11.57     7.47     5.91     5.06     4.53     4.15     3.87     3.65     3.48     3.34
120   .10       2.75     2.35     2.13     1.99     1.90     1.82     1.77     1.72     1.68     1.65
      .05       3.92     3.07     2.68     2.45     2.29     2.18     2.09     2.02     1.96     1.91
      .01       6.85     4.79     3.95     3.48     3.17     2.96     2.79     2.66     2.56     2.47
      .001     11.38     7.32     5.78     4.95     4.42     4.04     3.77     3.55     3.38     3.24
240   .10       2.73     2.32     2.10     1.97     1.87     1.80     1.74     1.70     1.65     1.63
      .05       3.88     3.03     2.64     2.41     2.25     2.14     2.04     1.98     1.92     1.87
      .01       6.74     4.69     3.86     3.40     3.09     2.88     2.71     2.59     2.48     2.40
      .001     11.10     7.11     5.60     4.78     4.25     3.89     3.62     3.41     3.24     3.09
∞     .10       2.71     2.30     2.08     1.94     1.85     1.77     1.72     1.67     1.63     1.60
      .05       3.84     3.00     2.60     2.37     2.21     2.10     2.01     1.94     1.88     1.83
      .01       6.63     4.61     3.78     3.32     3.02     2.80     2.64     2.51     2.41     2.32
      .001     10.83     6.91     5.42     4.62     4.10     3.74     3.47     3.27     3.10     2.96
Table 8 Critical Values of q for the Studentized Range Distribution

                        Number of populations, treatments, or levels being compared
Error   Confidence
df      level            3       4       5       6       7       8       9      10
5       95%           4.60    5.22    5.67    6.03    6.33    6.58    6.80    6.99
        99%           6.98    7.80    8.42    8.91    9.32    9.67    9.97   10.24
6       95%           4.34    4.90    5.30    5.63    5.90    6.12    6.32    6.49
        99%           6.33    7.03    7.56    7.97    8.32    8.61    8.87    9.10
7       95%           4.16    4.68    5.06    5.36    5.61    5.82    6.00    6.16
        99%           5.92    6.54    7.01    7.37    7.68    7.94    8.17    8.37
8       95%           4.04    4.53    4.89    5.17    5.40    5.60    5.77    5.92
        99%           5.64    6.20    6.62    6.96    7.24    7.47    7.68    7.86
9       95%           3.95    4.41    4.76    5.02    5.24    5.43    5.59    5.74
        99%           5.43    5.96    6.35    6.66    6.91    7.13    7.33    7.49
10      95%           3.88    4.33    4.65    4.91    5.12    5.30    5.46    5.60
        99%           5.27    5.77    6.14    6.43    6.67    6.87    7.05    7.21
11      95%           3.82    4.26    4.57    4.82    5.03    5.20    5.35    5.49
        99%           5.15    5.62    5.97    6.25    6.48    6.67    6.84    6.99
12      95%           3.77    4.20    4.51    4.75    4.95    5.12    5.27    5.39
        99%           5.05    5.50    5.84    6.10    6.32    6.51    6.67    6.81
13      95%           3.73    4.15    4.45    4.69    4.88    5.05    5.19    5.32
        99%           4.96    5.40    5.73    5.98    6.19    6.37    6.53    6.67
14      95%           3.70    4.11    4.41    4.64    4.83    4.99    5.13    5.25
        99%           4.89    5.32    5.63    5.88    6.08    6.26    6.41    6.54
15      95%           3.67    4.08    4.37    4.59    4.78    4.94    5.08    5.20
        99%           4.84    5.25    5.56    5.80    5.99    6.16    6.31    6.44
16      95%           3.65    4.05    4.33    4.56    4.74    4.90    5.03    5.15
        99%           4.79    5.19    5.49    5.72    5.92    6.08    6.22    6.35
17      95%           3.63    4.02    4.30    4.52    4.70    4.86    4.99    5.11
        99%           4.74    5.14    5.43    5.66    5.85    6.01    6.15    6.27
18      95%           3.61    4.00    4.28    4.49    4.67    4.82    4.96    5.07
        99%           4.70    5.09    5.38    5.60    5.79    5.94    6.08    6.20
19      95%           3.59    3.98    4.25    4.47    4.65    4.79    4.92    5.04
        99%           4.67    5.05    5.33    5.55    5.73    5.89    6.02    6.14
20      95%           3.58    3.96    4.23    4.45    4.62    4.77    4.90    5.01
        99%           4.64    5.02    5.29    5.51    5.69    5.84    5.97    6.09
24      95%           3.53    3.90    4.17    4.37    4.54    4.68    4.81    4.92
        99%           4.55    4.91    5.17    5.37    5.54    5.69    5.81    5.92
30      95%           3.49    3.85    4.10    4.30    4.46    4.60    4.72    4.82
        99%           4.45    4.80    5.05    5.24    5.40    5.54    5.65    5.76
40      95%           3.44    3.79    4.04    4.23    4.39    4.52    4.63    4.73
        99%           4.37    4.70    4.93    5.11    5.26    5.39    5.50    5.60
60      95%           3.40    3.74    3.98    4.16    4.31    4.44    4.55    4.65
        99%           4.28    4.59    4.82    4.99    5.13    5.25    5.36    5.45
120     95%           3.36    3.68    3.92    4.10    4.24    4.36    4.47    4.56
        99%           4.20    4.50    4.71    4.87    5.01    5.12    5.21    5.30
∞       95%           3.31    3.63    3.86    4.03    4.17    4.29    4.39    4.47
        99%           4.12    4.40    4.60    4.76    4.88    4.99    5.08    5.16
Section 17.1
Exercise Set 1
17.1 (a) 0.001 < P-value < 0.01 (b) P-value > 0.10
(c) P-value = 0.01 (d) P-value < 0.001 (e) 0.05 < P-value
< 0.10 (f) 0.01 < P-value < 0.05 (using df1 = 4 and
df2 = 60)
17.2 (a) $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$, $H_a$: At least two of the
four $\mu_i$’s are different. (b) P-value = 0.012, fail to reject $H_0$
(c) P-value = 0.012, fail to reject $H_0$
17.3 F = 6.687, P-value = 0.001, reject $H_0$
17.4 (a) $SSTr = n_1(\bar{x}_1 - \bar{\bar{x}})^2 + n_2(\bar{x}_2 - \bar{\bar{x}})^2 + n_3(\bar{x}_3 - \bar{\bar{x}})^2 + n_4(\bar{x}_4 - \bar{\bar{x}})^2 = 32.13815000$;
Treatment df = k − 1 = 3; $SSE = (n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2 + (n_4 - 1)s_4^2 = 32.90103333$;
Error df = N − k = 20
(b) $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$; $H_a$: At least two among $\mu_1$, $\mu_2$,
$\mu_3$, $\mu_4$ are different; F = 6.51, P-value = 0.003; reject $H_0$.
Additional Exercises
17.9 $H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$; $H_a$: At least two among $\mu_1$,
$\mu_2$, $\mu_3$, $\mu_4$ are different; F = 25.094, P-value ≈ 0; reject $H_0$.
17.11 F = 2.62, 0.05 < P-value < 0.10, fail to reject $H_0$
17.13 (a) See solutions manual for detailed computations.
(b) F = 2.142, P-value > 0.10, fail to reject $H_0$
Section 17.2
Exercise Set 1
17.14 Since the interval for $\mu_2 - \mu_3$ is the only one that
contains zero, we have evidence of a difference between $\mu_1$
and $\mu_2$, and between $\mu_1$ and $\mu_3$, but not between $\mu_2$ and $\mu_3$.
Thus, statement c is the correct choice.
17.15 In increasing order of the resulting mean numbers of
pretzels eaten, the treatments were: slides with related text,
slides with no text, no slides, and slides with unrelated text.
There were no significant differences between the results
for slides with related text and slides with no text, or for
no slides and slides with unrelated text. However, there
was a significant difference between the mean numbers of
pretzels eaten for no slides and slides with no text (and also
between the results for no slides and slides with related
text). Likewise, there was a significant difference between
the mean numbers of pretzels eaten for slides with unrelated
text and slides with no text (and also between the results for
slides with unrelated text and slides with related text).
17.16
               Fabric 3   Fabric 2   Fabric 5   Fabric 4   Fabric 1
Sample mean     10.5       11.633     12.3       14.96      16.35
Additional Exercises
17.21 The interval for $\mu_1 - \mu_2$ contains zero, and hence $\mu_1$
and $\mu_2$ are judged not different. The intervals for $\mu_1 - \mu_3$
and $\mu_2 - \mu_3$ do not contain zero, so $\mu_1$ and $\mu_3$ are judged to
be different, and $\mu_2$ and $\mu_3$ are judged to be different. There
is evidence that $\mu_3$ is different from the other two means.
Are You Ready to Move On?
Chapter 17 Review Exercises
17.23 F = 5.273, P-value = 0.002, reject $H_0$
17.25 (a)
Difference           Interval             Includes 0?
$\mu_1 - \mu_2$      (−4.027, 0.027)      Yes
$\mu_1 - \mu_3$      (−4.327, −0.273)     No
$\mu_1 - \mu_4$      (−5.327, −1.273)     No
$\mu_2 - \mu_3$      (−2.327, 1.727)      Yes
$\mu_2 - \mu_4$      (−3.327, 0.727)      Yes
$\mu_3 - \mu_4$      (−3.027, 1.027)      Yes

               7+ label   12+ label   16+ label   18+ label
Sample mean      4.8         6.8         7.1         8.1
               ____________________
                          __________________________________
(b) The more restrictive the age label on the video game, the
higher the sample mean rating given by the boys used in
the experiment. However, according to the T-K intervals, the
only significant differences were between the means for
the 7+ label and the 16+ label and between the means for
the 7+ label and the 18+ label.
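The T-K intervals reported for 17.25(a) can be reproduced with a short Python sketch. This is an illustration rather than the textbook's own solution. It assumes the critical value q = 3.79 from Table 8 (95%, four groups, error df = 40), the tabulated row closest to the actual error df of 36; that choice matches the half-width implied by the intervals listed above. The names `Q_CRIT` and `tk_interval` are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Attractiveness ratings from Exercise 17.25, grouped by age label
samples = {
    "7+": [6, 6, 6, 5, 4, 8, 6, 1, 2, 4],
    "12+": [8, 7, 8, 5, 7, 9, 5, 8, 4, 7],
    "16+": [7, 9, 8, 6, 7, 4, 8, 9, 6, 7],
    "18+": [10, 9, 6, 8, 7, 6, 8, 9, 10, 8],
}

# Assumption: Table 8 value for 95%, 4 groups, error df 40
# (used as the nearest tabulated row to the actual error df of 36)
Q_CRIT = 3.79

means = {label: sum(xs) / len(xs) for label, xs in samples.items()}
n_total = sum(len(xs) for xs in samples.values())
# SSE pools the squared deviations of each observation from its group mean
sse = sum((x - means[label]) ** 2 for label, xs in samples.items() for x in xs)
mse = sse / (n_total - len(samples))


def tk_interval(a, b):
    """Return the T-K interval (lower, upper) for mean(a) - mean(b)."""
    half_width = Q_CRIT * sqrt(mse / 2 * (1 / len(samples[a]) + 1 / len(samples[b])))
    diff = means[a] - means[b]
    return diff - half_width, diff + half_width


intervals = {pair: tk_interval(*pair) for pair in combinations(samples, 2)}
for (a, b), (lower, upper) in intervals.items():
    verdict = "includes 0" if lower < 0 < upper else "excludes 0"
    print(f"{a} vs {b}: ({lower:.3f}, {upper:.3f}) {verdict}")
```

An interval that excludes 0 marks a pair of labels whose means are judged significantly different, which is the information the underscoring pattern summarizes.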
Answers to Selected Exercises