CHAPTER 13
Section 13.1
1.
a. x̄ = 15 and Σ(xj − x̄)² = 250, so the s.d. of Yi − Ŷi is
   10·√(1 − 1/5 − (xi − 15)²/250) = 6.32, 8.37, 8.94, 8.37, and 6.32 for i = 1, 2, 3, 4, 5.
b. Now x̄ = 20 and Σ(xi − x̄)² = 1250, giving standard deviations 7.87, 8.49, 8.83, 8.94, and 2.83 for i = 1, 2, 3, 4, 5.
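The part (a) computation can be reproduced with a short script. The design points x = 5, 10, 15, 20, 25 used below are an assumption consistent with x̄ = 15 and Σ(xi − x̄)² = 250; they are not quoted in the exercise.

```python
import math

def sd_residual(x_i, xs, sigma):
    """Standard deviation of Y_i - Yhat_i in simple linear regression:
    sigma * sqrt(1 - 1/n - (x_i - xbar)^2 / Sxx)."""
    n = len(xs)
    xbar = sum(xs) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sigma * math.sqrt(1 - 1 / n - (x_i - xbar) ** 2 / sxx)

# Assumed design points: xbar = 15 and Sxx = 250, matching part a
xs = [5, 10, 15, 20, 25]
sds = [round(sd_residual(x, xs, sigma=10), 2) for x in xs]
print(sds)  # [6.32, 8.37, 8.94, 8.37, 6.32]
```

The same function with the part (b) design reproduces the second set of standard deviations.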
c. The deviation from the estimated line is likely to be much smaller for the observation made in the experiment of b for x = 50 than for the experiment of a when x = 25. That is, the observation (50, Y) is more likely to fall close to the least squares line than is (25, Y).

2. The pattern gives no cause for questioning the appropriateness of the simple linear regression model, and no observation appears unusual.
3.
a. This plot indicates there are no outliers, the variance of ε is reasonably constant, and the ε are normally distributed. A straight-line regression function is a reasonable choice for a model.
   [Plot: residuals (about −1 to 1) vs. filtration rate (100–200).]
Chapter 13: Nonlinear and Multiple Regression
b. We need Sxx = Σ(xi − x̄)² = 415,914.85 − (2817.9)²/20 = 18,886.8295. Then, with s² = .4427 (s = .665), each ei* can be calculated as
   ei* = ei / (s·√(1 − 1/20 − (xi − 140.895)²/18,886.8295)).
   The table below shows the values:

   standardized residuals   e/ei*       standardized residuals   e/ei*
   -0.31064                 0.644053    0.6175                   0.64218
   -0.30593                 0.614697    0.09062                  0.64802
   0.4791                   0.578669    1.16776                  0.565003
   1.2307                   0.647714    -1.50205                 0.646461
   -1.15021                 0.648002    0.96313                  0.648257
   0.34881                  0.643706    0.019                    0.643881
   -0.09872                 0.633428    0.65644                  0.584858
   -1.39034                 0.640683    -2.1562                  0.647182
   0.82185                  0.640975    -0.79038                 0.642113
   -0.15998                 0.621857    1.73943                  0.631795

   Notice that if ei* ≈ ei/s, then e/ei* ≈ s. All of the e/ei*'s range between .57 and .65, which are close to s.
c. This plot looks very much the same as the one in part a.
   [Plot: standardized residuals (about −2 to 2) vs. filtration rate (100–200).]
4.
a. The (x, residual) pairs for the plot are (0, -.335), (7, -.508), (17, -.341), (114, .592), (133, .679), (142, .700), (190, .142), (218, 1.051), (237, -1.262), and (285, -.719). The plot shows substantial evidence of curvature.
b. The standardized residuals (in order corresponding to increasing x) are -.50, -.75, -.50, .79, .90, .93, .19, 1.46, -1.80, and -1.12. A standardized residual plot shows the same pattern as the residual plot discussed in the previous exercise. The z percentiles for the normal probability plot are -1.645, -1.04, -.68, -.39, -.13, .13, .39, .68, 1.04, 1.645. The points follow a linear pattern, so the standardized residuals appear to have a normal distribution.
   [Plot: normal probability plot of the standardized residuals vs. z percentiles.]
5.
a. 97.7% of the variation in ice thickness can be explained by the linear relationship between it and elapsed time. Based on this value, it appears that a linear model is reasonable.
b. The residual plot shows a curve in the data, so perhaps a non-linear relationship exists. One observation (5.5, -3.14) is extreme.
   [Plot: residuals (about −3 to 2) vs. elapsed time (0–6).]
6.
a. H0: β1 = 0 vs. Ha: β1 ≠ 0. The test statistic is t = β̂1/s_β̂1, and we will reject H0 if t ≥ t.025,4 = 2.776 or if t ≤ −2.776. Here s_β̂1 = s/√Sxx = 7.265/12.869 = .565, and t = 6.19268/.565 = 10.97. Since 10.97 ≥ 2.776, we reject H0 and conclude that the model is useful.
b. ŷ(7.0) = 1008.14 + 6.19268(7.0) = 1051.49, from which the residual is y − ŷ(7.0) = 1046 − 1051.49 = −5.49. Similarly, the other residuals are -.73, 4.11, 7.91, 3.58, and −9.38.
   [Plot: residuals (−10 to 10) vs. x (10–20).]
   Because a curved pattern appears, a linear regression function may be inappropriate.
c. The standardized residuals are calculated as
   e1* = −5.49 / (7.265·√(1 − 1/6 − (7.0 − 14.48)²/165.5983)) = −1.074,
   and similarly the others are -.123, .624, 1.208, .587, and −1.841.
   [Plot: standardized residuals e* vs. x (10–20).]
   This plot gives the same information as the previous plot. No values are exceptionally large, but the e* of −1.841 is close to 2 standard deviations away from the expected value of 0.
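The standardized residual in exercise 6(c) can be checked numerically; the quantities below (s = 7.265, x̄ = 14.48, Sxx = 165.5983, n = 6) are the ones quoted in that solution.

```python
import math

def standardized_residual(e, x, xbar, sxx, s, n):
    """e* = e / (s * sqrt(1 - 1/n - (x - xbar)^2 / Sxx))."""
    return e / (s * math.sqrt(1 - 1 / n - (x - xbar) ** 2 / sxx))

# First residual of exercise 6: e1 = -5.49 at x = 7.0
e1_star = standardized_residual(-5.49, 7.0, 14.48, 165.5983, 7.265, 6)
print(round(e1_star, 3))  # approximately -1.074
```

The remaining standardized residuals follow from the same call with the other (x, e) pairs.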
7.
a. [Scatter plot: dry weight (60–120) vs. exposure time (0–8).]
   There is an obvious curved pattern in the scatter plot, which suggests that a simple linear model will not provide a good fit.
b. The ŷ's, e's, and e*'s are given below:

   x    y     ŷ      e      e*
   0    110   126.6  -16.6  -1.55
   2    123   113.3  9.7    .68
   4    119   100.0  19.0   1.25
   6    86    86.7   -.7    -.05
   8    62    73.4   -11.4  -1.06
8. First, we will look at a scatter plot of the data, which is quite linear, so it seems reasonable to use linear regression.
   [Scatter plot: oxygen uptake (20–70) vs. heart rate response (40–70).]
   The linear regression output (Minitab) follows:

   The regression equation is
   y = - 51.4 + 1.66 x

   Predictor       Coef     StDev       T      P
   Constant     -51.355     9.795   -5.24  0.000
   x             1.6580    0.1869    8.87  0.000

   S = 6.119    R-Sq = 84.9%    R-Sq(adj) = 83.8%

   Analysis of Variance
   Source          DF      SS      MS      F      P
   Regression       1  2946.5  2946.5  78.69  0.000
   Residual Error  14   524.2    37.4
   Total           15  3470.7

   A quick look at the t and p values shows that the model is useful, and r² shows a strong relationship between the two variables. The observation (72, 72) has large influence, since its x value is some distance from the others. We could run the regression again without this value and get the line:
   oxygen uptake = - 44.8 + 1.52 heart rate response.
9. Both a scatter plot and residual plot (based on the simple linear regression model) for the first data set suggest that a simple linear regression model is reasonable, with no pattern or influential data points which would indicate that the model should be modified. However, scatter plots for the other three data sets reveal difficulties.
   [Scatter plots of y vs. x for data sets #1–#4.]
   For data set #2, a quadratic function would clearly provide a much better fit. For data set #3, the relationship is perfectly linear except for one outlier, which has obviously greatly influenced the fit even though its x value is not unusually large or small. The signs of the residuals here (corresponding to increasing x) are + + + + - - - - - + -, and a residual plot would reflect this pattern and suggest a careful look at the chosen model. For data set #4 it is clear that the slope of the least squares line has been determined entirely by the outlier, so this point is extremely influential (and its x value does lie far from the remaining ones).
10.
a. ei = yi − β̂0 − β̂1xi = yi − ȳ − β̂1(xi − x̄), so
   Σei = Σ(yi − ȳ) − β̂1Σ(xi − x̄) = 0 − β̂1·0 = 0.
b. Since Σei = 0 always, the residuals cannot be independent. There is clearly a linear relationship between the residuals. If one ei is large positive, then at least one other ei would have to be negative to preserve Σei = 0. This suggests a negative correlation between residuals (for fixed values of any n − 2 of them, the other two obey a negative linear relationship).
c. Σxiei = Σxi(yi − ȳ − β̂1(xi − x̄)) = [Σxiyi − (Σxi)(Σyi)/n] − β̂1[Σxi² − (Σxi)²/n], but the first term in brackets is the numerator of β̂1, while the second term is the denominator of β̂1, so the difference becomes (numerator of β̂1) − (numerator of β̂1) = 0.
d. The five ei*'s from Exercise 7 above are −1.55, .68, 1.25, −.05, and −1.06, which sum to −.73. This sum differs too much from 0 to be explained by rounding. In general it is not true that Σei* = 0.
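The identities in 10(a) and 10(c) can be verified numerically on any data set; the small data set below is made up purely for illustration.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the least squares line y = b0 + b1*x."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sxy / sxx
    return ybar - b1 * xbar, b1

# Illustrative data (not from the exercise)
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1 = least_squares(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

sum_e = sum(residuals)                               # part a: should be 0
sum_xe = sum(x * e for x, e in zip(xs, residuals))   # part c: should be 0
print(sum_e, sum_xe)
```

Both sums come out as 0 up to floating-point roundoff, matching the algebraic argument above.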
11.
a. Yi − Ŷi = Yi − Ȳ − β̂1(xi − x̄) = Yi − (1/n)ΣjYj − [(xi − x̄)/Σj(xj − x̄)²]·Σj(xj − x̄)Yj = ΣjcjYj,
   where cj = 1 − 1/n − (xi − x̄)²/Σ(xj − x̄)² for j = i, and cj = −1/n − (xi − x̄)(xj − x̄)/Σ(xj − x̄)² for j ≠ i. Thus Var(Yi − Ŷi) = ΣVar(cjYj) (since the Yj's are independent) = σ²Σcj², which, after some algebra, gives equation (13.2).
b. σ² = Var(Yi) = Var(Ŷi + (Yi − Ŷi)) = Var(Ŷi) + Var(Yi − Ŷi), so
   Var(Yi − Ŷi) = σ² − Var(Ŷi) = σ² − σ²[1/n + (xi − x̄)²/Σ(xj − x̄)²], which is exactly (13.2).
c. As xi moves further from x̄, (xi − x̄)² grows larger, so Var(Ŷi) increases (since (xi − x̄)² has a positive sign in Var(Ŷi)), but Var(Yi − Ŷi) decreases (since (xi − x̄)² has a negative sign there).
12.
a. Σei = 34, which is not 0, so these cannot be the residuals.
b. Each xiei is positive (since xi and ei have the same sign), so Σxiei > 0, which contradicts the result of exercise 10c; these cannot be the residuals for the given x values.
13. The distribution of any particular standardized residual is also a t distribution with n − 2 d.f., since ei* is obtained by taking the standard normal variable (Yi − Ŷi)/σ_{Yi−Ŷi} and substituting the estimate of σ in the denominator (exactly as in the predicted value case). With Ei* denoting the ith standardized residual as a random variable, when n = 25, Ei* has a t distribution with 23 d.f. and t.01,23 = 2.50, so P(Ei* outside (−2.50, 2.50)) = P(Ei* ≥ 2.50) + P(Ei* ≤ −2.50) = .01 + .01 = .02.
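The two-tailed probability in exercise 13 can be checked without statistical tables by numerically integrating the t density with 23 d.f. (Simpson's rule is used here as a sketch; a library routine such as SciPy's `t.sf` would do the same job).

```python
import math

def t_density(t, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def t_tail(t0, df, upper=60.0, steps=100_000):
    """P(T >= t0), via Simpson's rule on [t0, upper]; the tail beyond is negligible."""
    h = (upper - t0) / steps
    total = t_density(t0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(t0 + i * h, df)
    return total * h / 3

p_two_sided = 2 * t_tail(2.50, 23)
print(round(p_two_sided, 3))  # approximately .02
```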
14.
a. n1 = n2 = 3 (3 observations at 110 and 3 at 230), n3 = n4 = 4, ȳ1. = 202.0, ȳ2. = 149.0, ȳ3. = 110.5, ȳ4. = 107.0, and ΣΣyij² = 288,013, so
   SSPE = 288,013 − [3(202.0)² + 3(149.0)² + 4(110.5)² + 4(107.0)²] = 4361.
   With Σxi = 4480, Σyi = 1923, Σxi² = 1,733,500, Σyi² = 288,013 (as above), and Σxiyi = 544,730, SSE = 7241, so SSLF = 7241 − 4361 = 2880. With c − 2 = 2 and n − c = 10, F.05,2,10 = 4.10. MSLF = 2880/2 = 1440 and MSPE = 4361/10 = 436.1, so the computed value of F is 1440/436.1 = 3.30. Since 3.30 is not ≥ 4.10, we do not reject H0. This formal test procedure does not suggest that a linear model is inappropriate.
b. The scatter plot clearly reveals a curved pattern which suggests that a nonlinear model would be more reasonable and provide a better fit than a linear model.
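The lack-of-fit arithmetic in 14(a) can be reproduced directly from the group sizes, group means, and the quoted sums:

```python
# Summary quantities quoted in exercise 14(a)
group_sizes = [3, 3, 4, 4]
group_means = [202.0, 149.0, 110.5, 107.0]
sum_y_sq = 288_013     # sum of all y_ij^2
sse = 7241             # SSE from the simple linear fit (given)

sspe = sum_y_sq - sum(n * m ** 2 for n, m in zip(group_sizes, group_means))
sslf = sse - sspe
c = len(group_sizes)                       # number of distinct x values
n = sum(group_sizes)                       # total sample size
f = (sslf / (c - 2)) / (sspe / (n - c))    # MSLF / MSPE
print(round(sspe), round(sslf), round(f, 2))  # 4361 2880 3.3
```

Since 3.30 < F.05,2,10 = 4.10, the code agrees with the conclusion above.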
Section 13.2
15.
a. [Scatter plot of y vs. x: x from 0 to 60, y from 0 to 15.]
   The points have a definite curved pattern. A linear model would not be appropriate.
b. In this plot we have a strong linear pattern.
   [Scatter plot of ln(y) vs. ln(x).]
c. The linear pattern in b above would indicate that a transformed regression using the natural log of both x and y would be appropriate. The probabilistic model is then y = αx^β·ε (the power function with an error term!).
d. A regression of ln(y) on ln(x) yields the equation ln(y) = 4.6384 − 1.04920 ln(x). Using Minitab we can get a P.I. for y when x = 20 by first transforming the x value: ln(20) = 2.996. The computer-generated 95% P.I. for ln(y) when ln(x) = 2.996 is (1.1188, 1.8712). We must now take the antilog to return to the original units of Y: (e^1.1188, e^1.8712) = (3.06, 6.50).
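Back-transforming the interval endpoints in 15(d) is a one-liner:

```python
import math

# 95% P.I. for ln(y) at ln(x) = 2.996, as quoted from the Minitab output
lo_log, hi_log = 1.1188, 1.8712

# Exponentiate the endpoints to return to the original y scale
lo, hi = math.exp(lo_log), math.exp(hi_log)
print(round(lo, 2), round(hi, 2))  # 3.06 6.5
```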
e. A computer-generated residual analysis (Minitab "Residual Model Diagnostics"):
   [Plots: normal plot of residuals, I chart of residuals, histogram of residuals, residuals vs. fits.]
   Looking at the residuals vs. fits plot (bottom right), one standardized residual, corresponding to the third observation, is a bit large. There are only two positive standardized residuals, but two others are essentially 0. The patterns in the residual plot and the normal probability plot (upper left) are marginally acceptable.
16.
a. Σxi = 9.72, Σyi′ = 313.10, Σxi² = 8.0976, Σyi′² = 288,013, and Σxiyi′ = 255.11 (all from computer printout, where yi′ = ln(L178)), from which β̂1 = 6.6667 and β̂0 = 20.6917 (again from computer output). Thus β̂ = β̂1 = 6.6667 and α̂ = e^β̂0 = 968,927,163.
b. We first predict y′ using the linear model and then exponentiate: ŷ′ = 20.6917 + 6.6667(.75) = 25.6917, so ŷ = L̂178 = e^25.6917 = 1.438051363 × 10¹¹.
c. We first compute a prediction interval for the transformed data and then exponentiate. With t.025,10 = 2.228, s = .5946, and √(1 + 1/12 + (.95 − x̄)²/(Σx² − (Σx)²/12)) = 1.082, the prediction interval for y′ is 27.0251 ± (2.228)(.5946)(1.082) = 27.0251 ± 1.4334 = (25.5917, 28.4585). The P.I. for y is then (e^25.5917, e^28.4585).
17.
a. Σxi′ = 15.501, Σyi′ = 13.352, Σxi′² = 20.228, Σyi′² = 16.572, and Σxi′yi′ = 18.109, from which β̂1 = 1.254 and β̂0 = −.468, so β̂ = β̂1 = 1.254 and α̂ = e^−.468 = .626.
b. The plots give strong support to this choice of model; in addition, r² = .960 for the transformed data.
c. SSE = .11536 (computer printout), s = .1024, and the estimated sd of β̂1 is .0775, so t = (1.25 − 1.33)/.0775 = −1.07. Since −1.07 is not ≤ −t.05,11 = −1.796, H0 cannot be rejected in favor of Ha.
d. The claim that μY·5 = 2μY·2.5 is equivalent to α·5^β = 2α(2.5)^β, or that β = 1. Thus we wish to test H0: β1 = 1 vs. Ha: β1 ≠ 1. With t = (1 − 1.33)/.0775 = −4.30 and rejection region t ≤ −t.005,11 = −3.106, H0 is rejected at level .01 since −4.30 ≤ −3.106.
18. A scatter plot suggests transforming x, so we try x′ = ln(x) and fit y = α + βx′, i.e., y = α + β ln(x). This transformation yields the linear regression equation y = .0197 − .00128x′, or y = .0197 − .00128 ln(x). Minitab output follows:

   The regression equation is
   y = 0.0197 - 0.00128 x

   Predictor         Coef      StDev       T      P
   Constant      0.019709   0.002633    7.49  0.000
   x           -0.0012805  0.0003126   -4.10  0.001

   S = 0.002668    R-Sq = 49.7%    R-Sq(adj) = 46.7%

   Analysis of Variance
   Source          DF          SS          MS      F      P
   Regression       1  0.00011943  0.00011943  16.78  0.001
   Residual Error  17  0.00012103  0.00000712
   Total           18  0.00024046

   The model is useful, based on a t test, with a p-value of .001. But R-Sq = 49.7%, so only 49.7% of the variation in y can be explained by its relationship with ln(x).
   To estimate y when x = 5000, we need x′ = ln(5000) = 8.51718. A point estimate for y when x = 5000 is ŷ = .009906. A 95% prediction interval for y is (.002257, .017555).
19.
a. No, there is definite curvature in the plot.
b. Y′ = β0 + β1x′ + ε where x′ = 1/temp and y′ = ln(lifetime). Plotting y′ vs. x′ gives a plot which has a pronounced linear appearance (and in fact r² = .954 for the straight-line fit).
c. Σxi′ = .082273, Σyi′ = 123.64, Σxi′² = .00037813, Σyi′² = 879.88, and Σxi′yi′ = .57295, from which β̂1 = 3735.4485 and β̂0 = −10.2045 (values read from computer output). With x = 220, x′ = 1/220 = .004545, so ŷ′ = −10.2045 + 3735.4485(.004545) = 6.7748 and thus ŷ = e^ŷ′ = 875.50.
d. For the transformed data, SSE = 1.39857, and n1 = n2 = n3 = 6, ȳ1′. = 8.44695, ȳ2′. = 6.83157, ȳ3′. = 5.32891, from which SSPE = 1.36594, SSLF = .02993, and f = (.02993/1)/(1.36594/15) = .33. Comparing this to F.01,1,15 = 8.68, it is clear that H0 cannot be rejected.
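The prediction in 19(c), made on the transformed scale and then exponentiated, can be checked as follows:

```python
import math

# Fitted coefficients for y' = ln(lifetime) vs. x' = 1/temp, from 19(c)
b0, b1 = -10.2045, 3735.4485

def predict_lifetime(temp):
    """Predict lifetime at a given temperature by back-transforming."""
    y_log = b0 + b1 * (1 / temp)
    return math.exp(y_log)

yhat = predict_lifetime(220)
print(round(yhat, 1))  # approximately 875.5
```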
20. After examining a scatter plot and a residual plot (based on the simple linear regression model) for each of the five suggested models as well as for y vs. x, I felt that the power model Y = αx^β·ε (y′ = ln(y) vs. x′ = ln(x)) provided the best fit. The transformation seemed to remove most of the curvature from the scatter plot, the residual plot appeared quite random, |ei*| < 1.65 for every i, there was no indication of any influential observations, and r² = .785 for the transformed data.
21.
a. The suggested model is Y = β0 + β1x′ + ε where x′ = 10⁴/x. The summary quantities are Σxi′ = 159.01, Σyi = 121.50, Σxi′² = 4058.8, Σyi² = 1865.2, and Σxi′yi = 2281.6, from which β̂1 = −.1485 and β̂0 = 18.1391, and the estimated regression function is ŷ = 18.1391 − 1485/x.
b. x = 500 ⇒ ŷ = 18.1391 − 1485/500 = 15.17.
22.
a. 1/y = α + βx, so with y′ = 1/y, y′ = α + βx. The corresponding probabilistic model is 1/Y = α + βx + ε.
b. 1/y − 1 = e^(α+βx), so ln(1/y − 1) = α + βx. Thus with y′ = ln(1/y − 1), y′ = α + βx. The corresponding probabilistic model is Y′ = α + βx + ε′, or equivalently Y = 1/(1 + e^(α+βx)·ε) where ε = e^(ε′).
c. ln(y) = e^(α+βx) ⇒ ln(ln(y)) = α + βx. Thus with y′ = ln(ln(y)), y′ = α + βx. The probabilistic model is Y′ = α + βx + ε′, or equivalently, Y = e^(e^(α+βx)·ε) where ε = e^(ε′).
d. This function cannot be linearized.
23. Var(Y) = Var(αe^(βx)·ε) = (αe^(βx))²·Var(ε) = α²e^(2βx)·τ², where we have set Var(ε) = τ². If β > 0, this is an increasing function of x, so we expect more spread in y for large x than for small x, while the situation is reversed if β < 0. It is important to realize that a scatter plot of data generated from this model will not spread out uniformly about the exponential regression function throughout the range of x values; the spread will only be uniform on the transformed scale. Similar results hold for the multiplicative power model.

24. H0: β1 = 0 vs. Ha: β1 ≠ 0. The value of the test statistic is z = .73, with a corresponding p-value of .463. Since the p-value is greater than any sensible choice of alpha, we do not reject H0. There is insufficient evidence to claim that age has a significant impact on the presence of kyphosis.
25. The point estimate of β1 is β̂1 = .17772, so the estimate of the odds ratio is e^β̂1 = e^.17772 ≈ 1.194. That is, when the amount of experience increases by one year (i.e., a one-unit increase in x), we estimate that the odds of successfully performing the task increase by a factor of about 1.194. The z value of 2.70 and its corresponding p-value of .007 imply that the null hypothesis H0: β1 = 0 can be rejected at any of the usual significance levels (e.g., .10, .05, .025, .01). Therefore, there is clear evidence that β1 is not zero, which means that experience does appear to affect the likelihood of successfully performing the task. This is consistent with the confidence interval (1.05, 1.36) for the odds ratio given in the printout, since this interval does not contain the value 1.
   [Graph: π̂(x) vs. experience (0–30), rising from about .1 to .9.]
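The odds-ratio interpretation in exercise 25 is just an exponentiation of the slope:

```python
import math

b1 = 0.17772               # estimated logistic regression slope from exercise 25
odds_ratio = math.exp(b1)  # multiplicative change in the odds per one-year increase in x
print(round(odds_ratio, 3))  # approximately 1.194

# Illustration: starting from arbitrary odds of 0.5, one more year of
# experience multiplies the odds by the odds ratio.
odds_next = 0.5 * odds_ratio
```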
Section 13.3
26.
a. There is a slight curve to this scatter plot. It could be consistent with a quadratic regression.
b. We desire R², which we find in the output: R² = 93.8%.
c. H0: β1 = β2 = 0 vs. Ha: at least one βi ≠ 0. The test statistic is f = MSR/MSE = 22.51, and the corresponding p-value is .016. Since the p-value < .05, we reject H0 and conclude that the model is useful.
d. We want a 99% confidence interval, but the output gives us a 95% confidence interval of (452.71, 529.48), which can be rewritten as 491.10 ± 38.38. t.025,3 = 3.182, so s_ŷ·14 = 38.38/3.182 = 12.06. Now, t.005,3 = 5.841, so the 99% C.I. is 491.10 ± 5.841(12.06) = 491.10 ± 70.45 = (420.65, 561.55).
e. H0: β2 = 0 vs. Ha: β2 ≠ 0. The test statistic is t = −3.81, with a corresponding p-value of .032, which is < .05, so we reject H0. The quadratic term appears to be useful in this model.
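The rescaling trick in 26(d), backing the standard error out of a 95% interval and rebuilding a 99% one, can be scripted:

```python
lo95, hi95 = 452.71, 529.48    # 95% C.I. from the output
t95, t99 = 3.182, 5.841        # t.025,3 and t.005,3

center = (lo95 + hi95) / 2                 # 491.095
se = (hi95 - lo95) / (2 * t95)             # estimated sd of Yhat, about 12.06
lo99, hi99 = center - t99 * se, center + t99 * se
print(round(lo99, 1), round(hi99, 1))      # close to (420.65, 561.55)
```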
27.
a. A scatter plot of the data indicated a quadratic regression model might be appropriate.
   [Scatter plot: y (50–75) vs. x (1–8).]
b. ŷ = 84.482 − 15.875(6) + 1.7679(6)² = 52.88; residual = y6 − ŷ6 = 53 − 52.88 = .12.
c. SST = Σyi² − (Σyi)²/n = 586.88, so R² = 1 − 61.77/586.88 = .895.
d. The first two residuals are the largest, but they are both within the interval (−2, 2). Otherwise, the standardized residual plot does not exhibit any troublesome features. For the normal probability plot:

   Residual   z percentile
   -1.95      -1.53
   -.66       -.89
   -.25       -.49
   .04        -.16
   .20        .16
   .58        .49
   .90        .89
   1.91       1.53

   The normal probability plot does not exhibit any troublesome features.
   [Plots: standardized residuals vs. x (1–8); residuals vs. z percentile.]
e. μ̂Y·6 = 52.88 (from b) and t.025,n−3 = t.025,5 = 2.571, so the C.I. is 52.88 ± (2.571)(1.69) = 52.88 ± 4.34 = (48.54, 57.22).
f. SSE = 61.77, so s² = 61.77/5 = 12.35 and √(12.35 + (1.69)²) = 3.90. The P.I. is 52.88 ± (2.571)(3.90) = 52.88 ± 10.03 = (42.85, 62.91).

28.
a. μ̂Y·75 = β̂0 + β̂1(75) + β̂2(75)² = −113.0937 + 3.3684(75) − .0178(75)² = 39.41.
b. ŷ = β̂0 + β̂1(60) + β̂2(60)² = 24.93.
c. SSE = Σyi² − β̂0Σyi − β̂1Σxiyi − β̂2Σxi²yi = 8386.43 − (−113.0937)(210.70) − (3.3684)(17,002) − (−.0178)(1,419,780) = 217.82, so s² = SSE/(n − 3) = 217.82/3 = 72.61 and s = 8.52.
d. R² = 1 − 217.82/987.35 = .779.
e. H0 will be rejected in favor of Ha if either t ≥ t.005,3 = 5.841 or t ≤ −5.841. The computed value of t is t = −.01780/.00226 = −7.88, and since −7.88 ≤ −5.841, we reject H0.
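The interval computations in 27(e) and 27(f) can be checked numerically from the quantities quoted there:

```python
import math

# From exercise 27: yhat at x = 6, its estimated sd, SSE, error df, t.025,5
yhat, s_yhat = 52.88, 1.69
sse, df_error, t = 61.77, 5, 2.571

s2 = sse / df_error                     # MSE = 12.35
ci = (yhat - t * s_yhat, yhat + t * s_yhat)          # C.I. for the mean response
s_pred = math.sqrt(s2 + s_yhat ** 2)                 # sd for a future observation
pi = (yhat - t * s_pred, yhat + t * s_pred)          # prediction interval
print([round(v, 2) for v in ci + pi])
```

The prediction interval is wider than the confidence interval because it adds the residual variance s² under the square root.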
29.
a. From computer output:

   ŷ:      111.89   120.66   114.71   94.06   58.69
   y − ŷ:  -1.89    2.34     4.29     -8.06   3.31

   Thus SSE = (−1.89)² + ... + (3.31)² = 103.37, s² = 103.37/2 = 51.69, and s = 7.19.
b. SST = Σyi² − (Σyi)²/n = 2630, so R² = 1 − 103.37/2630 = .961.
c. H0: β2 = 0 will be rejected in favor of Ha: β2 ≠ 0 if either t ≥ t.025,2 = 4.303 or t ≤ −4.303. With t = −1.84/.480 = −3.83, H0 cannot be rejected; the data does not argue strongly for the inclusion of the quadratic term.
d. To obtain joint confidence of at least 95%, we compute a 98% C.I. for each coefficient using t.01,2 = 6.965. For β1 the C.I. is 8.06 ± (6.965)(4.01) = (−19.87, 35.99) (an extremely wide interval), and for β2 the C.I. is −1.84 ± (6.965)(.480) = (−5.18, 1.50).
e. t.05,2 = 2.920 and β̂0 + 4β̂1 + 16β̂2 = 114.71, so the C.I. is 114.71 ± (2.920)(5.01) = 114.71 ± 14.63 = (100.08, 129.34).
f. If we knew β0, β1, β2, the value of x which maximizes β0 + β1x + β2x² would be obtained by setting the derivative equal to 0 and solving: β1 + 2β2x = 0 ⇒ x = −β1/(2β2). The estimate of this is x = −β̂1/(2β̂2) = 2.19.
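The vertex calculation in 29(f) is a direct consequence of setting the derivative of the fitted quadratic to zero:

```python
b1, b2 = 8.06, -1.84   # estimated linear and quadratic coefficients from exercise 29

# b0 + b1*x + b2*x^2 has derivative b1 + 2*b2*x, which is zero at:
x_max = -b1 / (2 * b2)
print(round(x_max, 2))  # 2.19
```

Because b2 < 0, this stationary point is a maximum of the fitted curve.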
30.
a. R² = .853. This means 85.3% of the variation in wheat yield is accounted for by the model.
b. −135.44 ± (2.201)(41.97) = (−227.82, −43.06).
c. H0: μy·2.5 = 1500 vs. Ha: μy·2.5 < 1500; rejection region: t ≤ −t.01,11 = −2.718. When x = 2.5, ŷ = 1402.15, so t = (1402.15 − 1500)/53.5 = −1.83. Fail to reject H0; the data does not indicate that μy·2.5 is less than 1500.
d. 1402.15 ± (2.201)·√((136.5)² + (53.5)²) = (1081.3, 1725.0).
31.
a. Using Minitab, the regression equation is ŷ = 13.6 + 11.4x − 1.72x².
b. Again using Minitab, the predicted and residual values are:

   ŷ:      23.327   23.327   29.587   31.814   31.814   31.814   20.317
   y − ŷ:  -.327    1.173    1.587    .914     1.786    -.317    .186

   [Plots: residuals vs. fitted values; scatter plot of y (20–34) vs. x (1–6) with the fitted quadratic.]
   The residual plot is consistent with a quadratic model (no pattern which would suggest modification), but it is clear from the scatter plot that the point (6, 20) has had a great influence on the fit. It is the point which forced the fitted quadratic to have a maximum between 3 and 4 rather than, for example, continuing to curve slowly upward to a maximum someplace to the right of x = 6.
c. From Minitab output, s² = MSE = 2.040 and R² = 94.7%. The quadratic model thus explains 94.7% of the variation in the observed y's, which suggests that the model fits the data quite well.
d. σ² = Var(Ŷi) + Var(Yi − Ŷi) suggests that we can estimate Var(Yi − Ŷi) by s² − s²ŷ and then take the square root to obtain the estimated standard deviation of each residual. This gives √(2.040 − (.955)²) = 1.059 and (similarly for all points) 1.059, 1.236, 1.196, 1.196, 1.196, and .233 as the estimated std dev's of the residuals. The standardized residuals are then computed as −.327/1.059 = −.31 and (similarly) 1.10, −1.28, −.76, .16, 1.49, and −1.28, none of which are unusually large. (Note: Minitab regression output can produce these values.) The resulting residual plot is virtually identical to the plot of b. Note that (y − ŷ)/s = −.327/1.426 = −.229 ≠ −.31, so standardizing using just s would not yield the correct standardized residuals.
e. Var(Yf) + Var(Ŷf) is estimated by 2.040 + (.777)² = 2.638, so the prediction sd is √2.638 = 1.624. With ŷ = 31.81 and t.05,4 = 2.132, the desired P.I. is 31.81 ± (2.132)(1.624) = (28.35, 35.27).
32.
a. ŷ = .3463 − 1.2933(x − x̄) + 2.3964(x − x̄)² − 2.3968(x − x̄)³.
b. From a, the coefficient of x³ is −2.3968, so β̂3 = −2.3968. There will be a contribution to x² both from 2.3964(x − 4.3456)² and from −2.3968(x − 4.3456)³. Expanding these and adding yields 33.6430 as the coefficient of x², so β̂2 = 33.6430.
c. x = 4.5 ⇒ x′ = x − x̄ = .1544; substituting into a yields ŷ = .1949.
d. t = −2.3968/2.4590 = −.97, which is not significant (H0: β3 = 0 cannot be rejected), so the inclusion of the cubic term is not justified.
33.
a. x̄ = 20 and s_x = 10.8012, so x′ = (x − 20)/10.8012. For x = 20, x′ = 0, and ŷ = β̂0* = .9671. For x = 25, x′ = .4629, so ŷ = .9671 − .0502(.4629) − .0176(.4629)² + .0062(.4629)³ = .9407.
b. ŷ = .9671 − .0502((x − 20)/10.8012) − .0176((x − 20)/10.8012)² + .0062((x − 20)/10.8012)³
   = .00000492x³ − .000446058x² + .007290688x + .96034944.
c. t = .0062/.0031 = 2.00. We reject H0 if either t ≥ t.025,n−4 = t.025,3 = 3.182 or t ≤ −3.182. Since 2.00 is neither ≥ 3.182 nor ≤ −3.182, we cannot reject H0; the cubic term should be deleted.
d. SSE = Σ(yi − ŷi)², and the ŷi's are the same from the standardized as from the unstandardized model, so SSE, SST, and R² will be identical for the two models.
e. Σyi² = 6.355538 and Σyi = 6.664, so SST = .011410. For the quadratic model R² = .987 and for the cubic model R² = .994. The two R² values are very close, suggesting intuitively that the cubic term is relatively unimportant.
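The expansion in 33(b), converting the standardized-predictor cubic into a raw polynomial in x, can be done mechanically with the binomial theorem:

```python
from math import comb

# Standardized-model coefficients from 33(a): b0*, b1*, b2*, b3*,
# with x' = (x - xbar)/s, xbar = 20, s = 10.8012
b_std = [0.9671, -0.0502, -0.0176, 0.0062]
xbar, s = 20.0, 10.8012

# Expand sum_k bk* ((x - xbar)/s)^k into coefficients of x^0 .. x^3:
# ((x - xbar)/s)^k contributes C(k,j) * (-xbar)^(k-j) / s^k to the x^j term.
raw = [0.0] * 4
for k, bk in enumerate(b_std):
    for j in range(k + 1):
        raw[j] += bk * comb(k, j) * (-xbar) ** (k - j) / s ** k

print([round(c, 8) for c in raw])
```

The result matches the coefficients quoted in part b up to rounding.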
34.
a. x̄ = 49.9231 and s_x = 41.3652, so for x = 50, x′ = (50 − 49.9231)/41.3652 = .001859 and μ̂Y·50 = .8733 − .3255(.001859) + .0448(.001859)² = .873.
b. SST = 1.456923 and SSE = .117521, so R² = .919.
c. ŷ = .8733 − .3255((x − 49.9231)/41.3652) + .0448((x − 49.9231)/41.3652)² = 1.200887 − .01048314x + .00002618x².
d. β̂2 = β̂2*/s_x², so the estimated sd of β̂2 is the estimated sd of β̂2* multiplied by 1/s_x²: s_β̂2 = (.0319)(1/41.3652)² = .0000186.
e. t = .0448/.0319 = 1.40, which is not significant (compared to ±t.025,9 at level .05), so the quadratic term should not be retained.
35. Y′ = ln(Y) = ln α + βx + γx² + ln(ε) = β0 + β1x + β2x² + ε′, where ε′ = ln(ε), β0 = ln(α), β1 = β, and β2 = γ. That is, we should fit a quadratic to (x, ln(y)). The resulting estimated quadratic (from computer output) is ŷ′ = 2.0397 + .1799x − .0022x², so β̂ = .1799, γ̂ = −.0022, and α̂ = e^2.0397 = 7.6883. (The ln(y)'s are 3.6136, 4.2499, 4.6977, 5.1773, and 5.4189, and the summary quantities can then be computed as before.)
Section 13.4
36.
a. Holding age, time, and heart rate constant, maximum oxygen uptake will increase by .01 L/min for each 1 kg increase in weight. Similarly, holding weight, age, and heart rate constant, maximum oxygen uptake decreases by .13 L/min with every 1 minute increase in the time necessary to walk 1 mile.
b. ŷ = 5.0 + .01(76) − .05(20) − .13(12) − .01(140) = 1.8 L/min.
c. ŷ = 1.8 from b, and σ = .4, so, assuming y follows a normal distribution,
   P(1.00 < Y < 2.60) = P((1.00 − 1.8)/.4 < Z < (2.60 − 1.8)/.4) = P(−2.0 < Z < 2.0) = .9544.
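The normal probability in 36(c) can be computed without a z table via the error function (which gives .9545; the table value .9544 quoted above reflects table rounding):

```python
import math

def phi(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 1.8, 0.4
p = phi((2.60 - mu) / sigma) - phi((1.00 - mu) / sigma)
print(round(p, 4))  # 0.9545
```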
37.
a. The mean value of y when x1 = 50 and x2 = 3 is μy·50,3 = −.800 + .060(50) + .900(3) = 4.9 hours.
b. When the number of deliveries (x2) is held fixed, the average change in travel time associated with a one-mile (i.e., one-unit) increase in distance traveled (x1) is .060 hours. Similarly, when distance traveled (x1) is held fixed, the average change in travel time associated with one extra delivery (i.e., a one-unit increase in x2) is .900 hours.
c. Under the assumption that y follows a normal distribution, the mean and standard deviation of this distribution are 4.9 (because x1 = 50 and x2 = 3) and σ = .5 (since the standard deviation is assumed to be constant regardless of the values of x1 and x2). Therefore P(Y ≤ 6) = P(Z ≤ (6 − 4.9)/.5) = P(Z ≤ 2.20) = .9861. That is, in the long run, about 98.6% of all days will result in a travel time of at most 6 hours.
38.
a. Mean life = 125 + 7.75(40) + .0950(1100) − .009(40)(1100) = 143.50.
b. First, the mean life when x1 = 30 is equal to 125 + 7.75(30) + .0950x2 − .009(30)x2 = 357.50 − .175x2. So when the load increases by 1, the mean life decreases by .175. Second, the mean life when x1 = 40 is equal to 125 + 7.75(40) + .0950x2 − .009(40)x2 = 435 − .265x2. So when the load increases by 1, the mean life decreases by .265.
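The effect of the interaction term in exercise 38, that the slope in x2 depends on the level of x1, can be demonstrated directly:

```python
def mean_life(x1, x2):
    """Fitted mean life with an x1*x2 interaction term (exercise 38)."""
    return 125 + 7.75 * x1 + 0.0950 * x2 - 0.009 * x1 * x2

# Slope in x2 at two different x1 levels:
slope_at_30 = mean_life(30, 1) - mean_life(30, 0)
slope_at_40 = mean_life(40, 1) - mean_life(40, 0)
print(round(slope_at_30, 3), round(slope_at_40, 3))  # -0.175 -0.265
```

Without the interaction term the two slopes would be identical; the .009 coefficient is what makes them differ.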
39.
a. For x1 = 2, x2 = 8 (remember the units of x2 are in 1000s), and x3 = 1 (since the outlet has a drive-up window), the average sales are ŷ = 10.00 − 1.2(2) + 6.8(8) + 15.3(1) = 77.3 (i.e., $77,300).
b. For x1 = 3, x2 = 5, and x3 = 0, the average sales are ŷ = 10.00 − 1.2(3) + 6.8(5) + 15.3(0) = 40.4 (i.e., $40,400).
c. When the number of competing outlets (x1) and the number of people within a 1-mile radius (x2) remain fixed, sales will increase by $15,300 when an outlet has a drive-up window.
40.
a. μ̂Y·10,.5,50,100 = 1.52 + .02(10) − 1.40(.5) + .02(50) − .0006(100) = 1.96.
b. μ̂Y·20,.5,50,30 = 1.52 + .02(20) − 1.40(.5) + .02(50) − .0006(30) = 1.40.
c. β̂4 = −.0006, and 100β̂4 = −.06.
d. There are no interaction predictors (e.g., x5 = x1x4) in the model. There would be dependence if interaction predictors involving x4 had been included.
e. R² = 1 − 20.0/39.2 = .490. For testing H0: β1 = β2 = β3 = β4 = 0 vs. Ha: at least one among β1, ..., β4 is not zero, the test statistic is F = (R²/k)/((1 − R²)/(n − k − 1)). H0 will be rejected if f ≥ F.05,4,25 = 2.76. f = (.490/4)/(.510/25) = 6.0. Because 6.0 ≥ 2.76, H0 is rejected and the model is judged useful (this even though the value of R² is not all that impressive).
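The model utility F statistic used in 40(e) depends only on R², k, and n, so it is convenient to wrap it in a small function:

```python
def model_utility_f(r2, k, n):
    """F = (R^2/k) / ((1 - R^2)/(n - k - 1)) for H0: beta_1 = ... = beta_k = 0."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

f40 = model_utility_f(0.490, k=4, n=30)   # exercise 40e: n - k - 1 = 25
print(round(f40, 1))  # 6.0
```

The same function reproduces exercise 41 below: model_utility_f(0.83, k=6, n=37) gives 24.41.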
41. H0: β1 = β2 = ... = β6 = 0 vs. Ha: at least one among β1, ..., β6 is not zero. The test statistic is F = (R²/k)/((1 − R²)/(n − k − 1)). H0 will be rejected if f ≥ F.05,6,30 = 2.42. f = (.83/6)/((1 − .83)/30) = 24.41. Because 24.41 ≥ 2.42, H0 is rejected and the model is judged useful.
42.

a. To test H0: β1 = β2 = 0 vs. Ha: at least one βi ≠ 0, the test statistic is f = MSR/MSE = 319.31 (from output). The associated p-value is 0, so at any reasonable level of significance, H0 should be rejected. There does appear to be a useful linear relationship between temperature difference and at least one of the two predictors.

b. The degrees of freedom for SSE = n − (k + 1) = 9 − (2 + 1) = 6 (which you could simply read in the DF column of the printout), and t.025,6 = 2.447, so the desired confidence interval is 3.000 ± (2.447)(.4321) = 3.000 ± 1.0573, or about (1.943, 4.057). Holding furnace temperature fixed, we estimate that the average change in temperature difference on the die surface will be somewhere between 1.943 and 4.057.

c. When x1 = 1300 and x2 = 7, the estimated average temperature difference is ŷ = −199.56 + .2100x1 + 3.000x2 = −199.56 + .2100(1300) + 3.000(7) = 94.44. The desired confidence interval is then 94.44 ± (2.447)(.353) = 94.44 ± .864, or (93.58, 95.30).

d. From the printout, s = 1.058, so the prediction interval is 94.44 ± (2.447)√((1.058)² + (.353)²) = 94.44 ± 2.729 = (91.71, 97.17).
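The interval computations in (b) and (d) follow a fixed pattern; a minimal sketch with hypothetical helper names, using the constants quoted above:

```python
import math

def coef_ci(bhat, se, tcrit):
    """Two-sided CI for one regression coefficient: bhat +/- t * se."""
    h = tcrit * se
    return (bhat - h, bhat + h)

def pred_interval(yhat, s, se_fit, tcrit):
    """Prediction interval: yhat +/- t * sqrt(s^2 + se_fit^2)."""
    h = tcrit * math.sqrt(s ** 2 + se_fit ** 2)
    return (yhat - h, yhat + h)

print(coef_ci(3.000, 0.4321, 2.447))              # part (b): about (1.943, 4.057)
print(pred_interval(94.44, 1.058, 0.353, 2.447))  # part (d): about (91.71, 97.17)
```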
43.

a. x1 = 2.6, x2 = 250, and x1x2 = (2.6)(250) = 650, so ŷ = 185.49 − 45.97(2.6) − 0.3015(250) + 0.0888(650) = 48.313.

b. No, it is not legitimate to interpret β1 in this way. It is not possible to increase the cobalt content x1 by 1 unit while keeping the interaction predictor x3 fixed: when x1 changes, so does x3, since x3 = x1x2.

c. Yes, there appears to be a useful linear relationship between y and the predictors. We determine this by observing that the p-value corresponding to the model utility test is < .0001 (F test statistic = 18.924).

d. We wish to test H0: β3 = 0 vs. Ha: β3 ≠ 0. The test statistic is t = 3.496, with a corresponding p-value of .0030. Since the p-value is < α = .01, we reject H0 and conclude that the interaction predictor does provide useful information about y.

e. A 95% C.I. for the mean value of surface area under the stated circumstances requires the following quantities: ŷ = 185.49 − 45.97(2) − 0.3015(500) + 0.0888(2)(500) = 31.598. Next, t.025,16 = 2.120, so the 95% confidence interval is 31.598 ± (2.120)(4.69) = 31.598 ± 9.9428 = (21.6552, 41.5408).
44.

a. Holding starch damage constant, for every 1% increase in flour protein, the absorption rate will increase by 1.44%. Similarly, holding flour protein percentage constant, the absorption rate will increase by .336% for every 1-unit increase in starch damage.

b. R² = .96447, so 96.447% of the observed variation in absorption can be explained by the model relationship.

c. To answer the question, we test H0: β1 = β2 = 0 vs. Ha: at least one βi ≠ 0. The test statistic is f = 339.31092, which has a corresponding p-value of zero, so at any significance level we will reject H0. There is a useful relationship between absorption and at least one of the two predictor variables.

d. We would be testing H0: β2 = 0 vs. Ha: β2 ≠ 0. We could calculate the test statistic t = β̂2/sβ̂2, or we could look at the 95% C.I. given in the output. Since the interval (.29828, .37298) does not contain the value 0, we can reject H0 and conclude that 'starch damage' should not be removed from the model.

e. The 95% C.I. is 42.253 ± (2.060)(.350) = 42.253 ± 0.721 = (41.532, 42.974). The 95% P.I. is 42.253 ± (2.060)√(1.09412² + .350²) = 42.253 ± 2.366 = (39.887, 44.619).

f. We test H0: β3 = 0 vs. Ha: β3 ≠ 0, with t = −.04304/.01773 = −2.428. The p-value is approximately 2(.012) = .024. At significance level .01 we do not reject H0. The interaction term should not be retained.
45.

a. The appropriate hypotheses are H0: β1 = β2 = β3 = β4 = 0 vs. Ha: at least one βi ≠ 0. The test statistic is f = (R²/k)/[(1 − R²)/(n − k − 1)] = (.946/4)/[(1 − .946)/20] = 87.6 ≥ 7.10 = F.001,4,20 (the smallest available significance level from Table A.9), so we can reject H0 at any significance level. We conclude that at least one of the four predictor variables appears to provide useful information about tenacity.

b. The adjusted R² value is 1 − [(n − 1)/(n − (k + 1))](SSE/SST) = 1 − [(n − 1)/(n − (k + 1))](1 − R²) = 1 − (24/20)(1 − .946) = .935, which does not differ much from R² = .946.

c. The estimated average tenacity when x1 = 16.5, x2 = 50, x3 = 3, and x4 = 5 is ŷ = 6.121 − .082(16.5) + .113(50) + .256(3) − .219(5) = 10.091. For a 99% C.I., t.005,20 = 2.845, so the interval is 10.091 ± 2.845(.350) = (9.095, 11.087). Therefore, when the four predictors are as specified in this problem, the true average tenacity is estimated to be between 9.095 and 11.087.
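The adjusted R² formula in (b) can be checked numerically; the helper name below is our own:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 = 1 - [(n - 1) / (n - (k + 1))] * (1 - R^2)."""
    return 1 - (n - 1) / (n - k - 1) * (1 - r2)

# Exercise 45(b): R^2 = .946 with n = 25, k = 4
print(round(adjusted_r2(0.946, 25, 4), 3))   # 0.935
```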
46.

a. Yes, there does appear to be a useful linear relationship between repair time and the two model predictors. We determine this by conducting a model utility test: H0: β1 = β2 = 0 vs. Ha: at least one βi ≠ 0. We reject H0 if f ≥ F.05,2,9 = 4.26. The calculated statistic is f = [SSR/k]/[SSE/(n − k − 1)] = MSR/MSE = (10.63/2)/(2.09/9) = 5.315/.232 = 22.91. Since 22.91 ≥ 4.26, we reject H0 and conclude that at least one of the two predictor variables is useful.

b. We will reject H0: β2 = 0 in favor of Ha: β2 ≠ 0 if |t| ≥ t.005,9 = 3.25. The test statistic is t = 1.250/.312 = 4.01, which is ≥ 3.25, so we reject H0 and conclude that the "type of repair" variable does provide useful information about repair time, given that the "elapsed time since the last service" variable remains in the model.

c. A 95% confidence interval for β2 is 1.250 ± (2.262)(.312) = (.5443, 1.9557). We estimate, with a high degree of confidence, that when an electrical repair is required the repair time will be between .54 and 1.96 hours longer than when a mechanical repair is required, while the "elapsed time" predictor remains fixed.

d. ŷ = .950 + .400(6) + 1.250(1) = 4.6, s² = MSE = .23222, and t.005,9 = 3.25, so the 99% P.I. is 4.6 ± (3.25)√(.23222 + (.192)²) = 4.6 ± 1.69 = (2.91, 6.29). The prediction interval is quite wide, suggesting a variable estimate for repair time under these conditions.
47.

a. For a 1% increase in the percentage plastics, we would expect a 28.9 kcal/kg increase in energy content. Also, for a 1% increase in the moisture, we would expect a 37.4 kcal/kg decrease in energy content.

b. The appropriate hypotheses are H0: β1 = β2 = β3 = β4 = 0 vs. Ha: at least one βi ≠ 0. The value of the F test statistic is 167.71, with a corresponding p-value that is extremely small. So, we reject H0 and conclude that at least one of the four predictors is useful in predicting energy content, using a linear model.

c. H0: β3 = 0 vs. Ha: β3 ≠ 0. The value of the t test statistic is t = 2.24, with a corresponding p-value of .034, which is less than the significance level of .05. So we can reject H0 and conclude that percentage garbage provides useful information about energy consumption, given that the other three predictors remain in the model.

d. ŷ = 2244.9 + 28.925(20) + 7.644(25) + 4.297(40) − 37.354(45) = 1505.5, and t.025,25 = 2.060. (Note an error in the text: sŷ = 12.47, not 7.46.) So a 95% C.I. for the true average energy content under these circumstances is 1505.5 ± (2.060)(12.47) = 1505.5 ± 25.69 = (1479.8, 1531.1). Because the interval is reasonably narrow, we would conclude that the mean energy content has been precisely estimated.

e. A 95% prediction interval for the energy content of a waste sample having the specified characteristics is 1505.5 ± (2.060)√((31.48)² + (12.47)²) = 1505.5 ± 69.75 = (1435.7, 1575.2).
48.

a. H0: β1 = β2 = … = β9 = 0 vs. Ha: at least one βi ≠ 0. Rejection region: f ≥ F.01,9,5 = 10.16. f = (R²/k)/[(1 − R²)/(n − k − 1)] = (.938/9)/[(1 − .938)/5] = 8.41. Fail to reject H0. The model does not appear to specify a useful relationship.

b. µ̂y = 21.967, tα/2,n−(k+1) = t.025,5 = 2.571, so the C.I. is 21.967 ± (2.571)(1.248) = (18.76, 25.18).

c. s² = SSE/(n − (k + 1)) = 23.379/5 = 4.6758, and the interval is 21.967 ± (2.571)√(4.6758 + (1.248)²) = (15.55, 28.39).

d. SSEk = 23.379, SSEl = 203.82. H0: β4 = β5 = … = β9 = 0 vs. Ha: at least one of the above βi ≠ 0. Rejection region: f ≥ Fα,k−l,n−(k+1) = F.05,6,5 = 4.95. f = [(203.82 − 23.379)/(9 − 3)]/(23.379/5) = 6.43. Reject H0. At least one of the second-order predictors appears useful.
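The extra-sum-of-squares F statistic in (d) can be sketched as follows; the function name is our own:

```python
def partial_f(sse_reduced, sse_full, k, l, n):
    """F statistic for H0: the k - l extra predictors all have zero coefficients."""
    num = (sse_reduced - sse_full) / (k - l)
    den = sse_full / (n - k - 1)
    return num / den

# Exercise 48(d): SSE_l = 203.82, SSE_k = 23.379, k = 9, l = 3, error df = 5 (n = 15)
print(round(partial_f(203.82, 23.379, 9, 3, 15), 2))   # 6.43
```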
49.

a. µ̂y·18.9,43 = 96.8303; residual = 91 − 96.8303 = −5.8303.

b. H0: β1 = β2 = 0; Ha: at least one βi ≠ 0. Rejection region: f ≥ F.05,2,9 = 8.02. f = (R²/k)/[(1 − R²)/(n − k − 1)] = (.768/2)/[(1 − .768)/9] = 14.90. Reject H0. The model appears useful.

c. 96.8303 ± (2.262)(8.20) = (78.28, 115.38).

d. 96.8303 ± (2.262)√(24.45² + 8.20²) = (38.50, 155.16).

e. We find the center of the given 95% interval, 93.785, and half of the width, 57.845. This latter value is equal to t.025,9(sŷ) = 2.262(sŷ), so sŷ = 25.5725. Then the 90% interval is 93.785 ± (1.833)(25.5725) = (46.911, 140.659).

f. With the p-value for testing Ha: β1 ≠ 0 being 0.208 (from the given output), we would fail to reject H0. This factor is not significant given that x2 is in the model.

g. With R²k = .768 (full model) and R²l = .721 (reduced model), we can use an alternative f statistic (compare formulas 13.19 and 13.20): f = [(R²k − R²l)/(k − l)]/[(1 − R²k)/(n − (k + 1))]. With n = 12, k = 2, and l = 1, we have f = (.768 − .721)/[(1 − .768)/9] = .047/.0257 = 1.83. Also t² = (−1.36)² = 1.85. The discrepancy can be attributed to rounding error.
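The R²-based partial F in (g), and its rough agreement with t², can be verified numerically; the function name is our own, and the small gap from the text's 1.83 comes from the text rounding the denominator:

```python
def partial_f_r2(r2_full, r2_reduced, k, l, n):
    """Partial F from R^2 values: ((Rk^2 - Rl^2)/(k-l)) / ((1 - Rk^2)/(n-k-1))."""
    num = (r2_full - r2_reduced) / (k - l)
    den = (1 - r2_full) / (n - k - 1)
    return num / den

f = partial_f_r2(0.768, 0.721, 2, 1, 12)
print(round(f, 2))             # 1.82 (the text's 1.83 uses a rounded denominator)
print(round((-1.36) ** 2, 2))  # 1.85 = t^2; the gap is rounding error
```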
50.

a. Here k = 5, n − (k + 1) = 6, so H0 will be rejected in favor of Ha at level .05 if either t ≥ t.025,6 = 2.447 or t ≤ −2.447. The computed value of t is t = .557/.94 = .59, so H0 cannot be rejected and inclusion of x1x2 as a carrier in the model is not justified.

b. No; in the presence of the other four carriers, any particular carrier is relatively unimportant, but this is not equivalent to the statement that all carriers are unimportant.

c. SSEk = SST(1 − R²) = 3224.65, so f = [(5384.18 − 3224.65)/3]/(3224.65/6) = 1.34, and since 1.34 is not ≥ F.05,3,6 = 4.76, H0 cannot be rejected; the data does not argue for the inclusion of any second-order terms.
51.

a. No, there is no pattern in the plots which would indicate that a transformation or the inclusion of other terms in the model would produce a substantially better fit.

b. k = 5, n − (k + 1) = 8, so H0: β1 = … = β5 = 0 will be rejected if f ≥ F.05,5,8 = 3.69. Here f = (.759/5)/(.241/8) = 5.04 ≥ 3.69, so we reject H0. At least one of the coefficients is not equal to zero.

c. When x1 = 8.0 and x2 = 33.1, the residual is e = 2.71 and the standardized residual is e* = .44; since e* = e/(sd of the residual), the sd of the residual is e/e* = 6.16. Thus the estimated variance of Ŷ is (6.99)² − (6.16)² = 10.915, so the estimated sd is 3.304. Since ŷ = 24.29 and t.025,8 = 2.306, the desired C.I. is 24.29 ± 2.306(3.304) = (16.67, 31.91).

d. F.05,3,8 = 4.07, so H0: β3 = β4 = β5 = 0 will be rejected if f ≥ 4.07. With SSEk = 390.88, f = [(894.95 − 390.88)/3]/(390.88/8) = 3.44, and since 3.44 is not ≥ 4.07, H0 cannot be rejected and the quadratic terms should all be deleted. (N.b.: this is not a modification which would be suggested by a residual plot.)
52.

a. The complete 2nd-order model obviously provides a better fit, so there is a need to account for interaction between the three predictors.

b. A 95% C.I. for y when x1 = x2 = 30 and x3 = 10 is .66573 ± 2.120(.01785) = (.6279, .7036).

53. Some possible questions might be:
Is this model useful in predicting deposition of poly-aromatic hydrocarbons? A test of model utility gives us an F = 84.39, with a p-value of 0.000. Thus, the model is useful.
Is x1 a significant predictor of y while holding x2 constant? A test of H0: β1 = 0 vs. the two-tailed alternative gives us a t = 6.98 with a p-value of 0.000, so this predictor is significant.
A similar question and solution for testing x2 as a predictor yields a similar conclusion: with a p-value of 0.046, we would accept this predictor as significant if our significance level were anything larger than 0.046.
54.

a. For x1 = x2 = x3 = x4 = +1, ŷ = 84.67 + .650 − .258 + … + .050 = 85.390. The single y corresponding to these xi values is 85.4, so y − ŷ = 85.4 − 85.390 = .010.

b. Letting x1′,…,x4′ denote the uncoded variables, x1′ = .1x1 + .3, x2′ = .1x2 + .3, x3′ = x3 + 2.5, and x4′ = 15x4 + 160. Substitution of x1 = 10x1′ − 3, x2 = 10x2′ − 3, x3 = x3′ − 2.5, and x4 = (x4′ − 160)/15 yields the uncoded function.

c. For the full model k = 14, and for the reduced model l = 4, while n − (k + 1) = 16. Thus H0: β5 = … = β14 = 0 will be rejected if f ≥ F.05,10,16 = 2.49. SSE = (1 − R²)SST, so SSEk = 1.9845 and SSEl = 4.8146, giving f = [(4.8146 − 1.9845)/10]/(1.9845/16) = 2.28. Since 2.28 is not ≥ 2.49, H0 cannot be rejected, so all higher-order terms should be deleted.

d. H0: µY·0,0,0,0 = 85.0 will be rejected in favor of Ha: µY·0,0,0,0 < 85.0 if t ≤ −t.05,26 = −1.706. With µ̂ = β̂0 = 85.5548, t = (85.5548 − 85)/.0772 = 7.19, which is certainly not ≤ −1.706, so H0 is not rejected; prior belief is not contradicted by the data.
Section 13.5
55.

a. ln(Q) = Y = ln(α) + β ln(a) + γ ln(b) + ln(ε) = β0 + β1x1 + β2x2 + ε′, where x1 = ln(a), x2 = ln(b), β0 = ln(α), β1 = β, β2 = γ, and ε′ = ln(ε). Thus we transform to (y, x1, x2) = (ln(Q), ln(a), ln(b)) (take the natural log of the values of each variable) and do a multiple linear regression. A computer analysis gave β̂0 = 1.5652, β̂1 = .9450, and β̂2 = .1815. For a = 10 and b = .01, x1 = ln(10) = 2.3026 and x2 = ln(.01) = −4.6052, from which ŷ = 2.9053 and Q̂ = e^2.9053 = 18.27.

b. Again taking the natural log, Y = ln(Q) = ln(α) + βa + γb + ln(ε), so to fit this model it is necessary to take the natural log of each Q value (and not transform a or b) before using multiple regression analysis.

c. We simply exponentiate each endpoint: (e^.217, e^1.755) = (1.24, 5.78).
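The back-transformation in (a) can be reproduced with the quoted coefficient estimates:

```python
import math

# Fitted log-log model from part (a): ln(Q) = b0 + b1*ln(a) + b2*ln(b)
b0, b1, b2 = 1.5652, 0.9450, 0.1815

a, b = 10, 0.01
y_hat = b0 + b1 * math.log(a) + b2 * math.log(b)
q_hat = math.exp(y_hat)                    # back-transform to the original scale
print(round(y_hat, 4), round(q_hat, 2))    # about 2.9053 and 18.27
```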
56.

a. n = 20, k = 5, n − (k + 1) = 14, so H0: β1 = … = β5 = 0 will be rejected in favor of Ha: at least one among β1,…,β5 ≠ 0 if f ≥ F.01,5,14 = 4.69. With f = (.769/5)/(.231/14) = 9.32 ≥ 4.69, H0 is rejected. Wood specific gravity appears to be linearly related to at least one of the five carriers.

b. For the full model, the adjusted R² is [(19)(.769) − 5]/14 = .687, while for the reduced model the adjusted R² is [(19)(.769) − 4]/15 = .707.

c. From a, SSEk = (.231)(.0196610) = .004542, and SSEl = (.346)(.0196610) = .006803, so f = (.002261/3)/(.004542/14) = 2.32. Since F.05,3,14 = 3.34 and 2.32 is not ≥ 3.34, we conclude that β1 = β2 = β4 = 0.

d. x3′ = (x3 − 52.540)/5.4447 = −.4665 and x5′ = (x5 − 89.195)/3.6660 = .2196, so ŷ = .5255 − (.0236)(−.4665) + (.0097)(.2196) = .5386.

e. t.025,17 = 2.110 (error df = n − (k + 1) = 20 − (2 + 1) = 17 for the two-carrier model), so the desired C.I. is −.0236 ± 2.110(.0046) = (−.0333, −.0139).

f. y = .5255 − .0236[(x3 − 52.540)/5.4447] + .0097[(x5 − 89.195)/3.6660], so β̂3 for the unstandardized model is −.0236/5.4447 = −.004334. The estimated sd of the unstandardized β̂3 is .0046/5.4447 = .000845.

g. ŷ = .532 and √(s² + s²(β̂0 + β̂3x3′ + β̂5x5′)) = .02058, so the P.I. is .532 ± (2.110)(.02058) = .532 ± .043 = (.489, .575).
57.

k    R²       Adj. R²    Ck = SSEk/s² + 2(k + 1) − n
1    .676     .647       138.2
2    .979     .975       2.7
3    .9819    .976       3.2
4    .9824               4

where s² = 5.9825.

a. Clearly the model with k = 2 is recommended on all counts.

b. No. Forward selection would let x4 enter first and would not delete it at the next stage.
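The Ck column of the table above uses Mallows' statistic; a minimal sketch (the function name is our own, and the sanity check below uses made-up inputs rather than the exercise's unlisted SSE values):

```python
def mallows_cp(sse_k, s2, k, n):
    """Mallows' C_k = SSE_k / s^2 + 2(k + 1) - n, with s^2 from the full model."""
    return sse_k / s2 + 2 * (k + 1) - n

# Sanity check: if SSE_k = s^2 * (n - k - 1), then C_k = k + 1, the usual
# benchmark for a subset model with negligible bias.
n, k, s2 = 30, 2, 5.9825
print(round(mallows_cp(s2 * (n - k - 1), s2, k, n), 6))   # 3.0
```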
58.
At step #1 (in which the model with all 4 predictors was fit), t = .83 was the t ratio smallest in
absolute magnitude. The corresponding predictor x3 was then dropped from the model, and a
model with predictors x1 , x2 , and x4 was fit. The t ratio for x4 , -1.53, was the smallest in
absolute magnitude and 1.53 < 2.00, so the predictor x4 was deleted. When the model with
predictors x1 and x2 only was fit, both t ratios considerably exceeded 2 in absolute value, so
no further deletion is necessary.
59. The choice of a "best" model seems reasonably clear-cut. The model with 4 variables, including all but the summerwood fiber variable, would seem best: R² is as large as that of any of the models, including the 5-variable model; adjusted R² is at its maximum; and Cp is at its minimum. As a second choice, one might consider the model with k = 3, which excludes the summerwood fiber and springwood % variables.
60.

Backwards stepping:
Step 1: A model with all 5 variables is fit; the smallest t-ratio is t = .12, associated with variable x2 (summerwood fiber %). Since t = .12 < 2, the variable x2 was eliminated.
Step 2: A model with all variables except x2 was fit. Variable x4 (springwood light absorption) has the smallest t-ratio (t = −1.76), whose magnitude is smaller than 2. Therefore, x4 is the next variable to be eliminated.
Step 3: A model with variables x3 and x5 is fit. Both t-ratios have magnitudes that exceed 2, so both variables are kept and the backwards stepping procedure stops at this step. The final model identified by the backwards stepping method is the one containing x3 and x5.

Forward stepping:
Step 1: After fitting all 5 one-variable models, the model with x3 had the t-ratio with the largest magnitude (t = −4.82). Because the absolute value of this t-ratio exceeds 2, x3 was the first variable to enter the model.
Step 2: All 4 two-variable models that include x3 were fit; that is, the models {x3, x1}, {x3, x2}, {x3, x4}, and {x3, x5}. Of these, the t-ratio 2.12 (for variable x5) was largest in absolute value. Because this t-ratio exceeds 2, x5 is the next variable to enter the model.
Step 3 (not printed): All possible three-variable models involving x3, x5, and another predictor were fit. None of the t-ratios for the added variables have absolute values that exceed 2, so no more variables are added, and the results of these tests are not shown.

Note: Both the forwards and backwards stepping methods arrived at the same final model, {x3, x5}, in this problem. This often happens, but not always; there are cases when the different stepwise methods will arrive at slightly different collections of predictor variables.
61.
If multicollinearity were present, at least one of the four R2 values would be very close to 1,
which is not the case. Therefore, we conclude that multicollinearity is not a problem in this
data.
62. Looking at the hii column and using 2(k + 1)/n = 8/19 = .421 as the criterion, three observations appear to have large influence. With hii values of .712933, .516298, and .513214, observations 14, 15, and 16 correspond to response (y) values 22.8, 41.8, and 48.6.
63.
We would need to investigate further the impact these two observations have on the equation.
Removing observation #7 is reasonable, but removing #67 should be considered as well,
before regressing again.
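The leverage screen used in Exercise 62 can be sketched as follows; the function name is our own, and the fourth h value below is a made-up, unremarkable one added for contrast:

```python
def high_leverage(h_values, k, n):
    """Return (index, h_ii) pairs whose leverage exceeds the 2(k + 1)/n cutoff."""
    cutoff = 2 * (k + 1) / n
    return [(i, h) for i, h in enumerate(h_values, start=1) if h > cutoff]

# Exercise 62: k = 3, n = 19 gives cutoff 8/19 = .421; the three quoted h_ii
# values all exceed it.
print(round(2 * (3 + 1) / 19, 3))                                 # 0.421
print(high_leverage([0.712933, 0.516298, 0.513214, 0.20], 3, 19))
```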
64.

a. 2(k + 1)/n = 6/10 = .6; since h44 > .6, data point #4 would appear to have large influence. (Note: formulas involving matrix algebra appear in the first edition.)

b. For data point #2, x′(2) = (1, 3.453, −4.920), so β̂ − β̂(2) = [e2/(1 − h22)](X′X)⁻¹x(2) = [−.766/(1 − .302)](.3032, .1644, .1156)′ = −1.0974(.3032, .1644, .1156)′ = (−.333, −.180, −.127)′, and similar calculations yield β̂ − β̂(4) = (.106, −.040, .030)′.

c. Comparing the changes in the β̂i's to the sβ̂i's, none of the changes is all that substantial (the largest is 1.2 sd's, for the change in β̂1 when point #2 is deleted). Thus although h44 is large, indicating a potentially high influence of point #4 on the fit, the actual influence does not appear to be great.
c.
Comparing the changes in the
β̂ i ' s to the s βˆi ' s , none of the changes is all that
substantial (the largest is 1.2sd’s for the change in β̂ 1 when point #2 is deleted). Thus
although h 44 is large, indicating a potential high influence of point #4 on the fit, the actual
influence does not appear to be great.
Supplementary Exercises
65.

a. (Boxplots of ppv by prism quality, with means indicated by solid circles, show notably higher ppv for the cracked prisms.) A two-sample t confidence interval, generated by Minitab:

Two sample T for ppv
prism qu        N    Mean   StDev   SE Mean
cracked        12     827     295        85
not cracke     18     483     234        55

95% CI for mu (cracked) - mu (not cracke): (132, 557)
b. The simple linear regression results in a significant model; r² is .577, but we have an extreme observation, with std resid = −4.11. Minitab output is below. Also run, but not included here, was a model with an indicator for cracked/not cracked, and a model with the indicator and an interaction term. Neither improved the fit significantly.

The regression equation is
ratio = 1.00 - 0.000018 ppv

Predictor       Coef          StDev         T         P
Constant        1.00161       0.00204       491.18    0.000
ppv             -0.00001827   0.00000295    -6.19     0.000

S = 0.004892    R-Sq = 57.7%    R-Sq(adj) = 56.2%

Analysis of Variance
Source           DF    SS           MS           F       P
Regression        1    0.00091571   0.00091571   38.26   0.000
Residual Error   28    0.00067016   0.00002393
Total            29    0.00158587

Unusual Observations
Obs   ppv    ratio      Fit        StDev Fit   Residual    St Resid
29    1144   0.962000   0.980704   0.001786    -0.018704   -4.11R

R denotes an observation with a large standardized residual
66.

a. For every 1 cm⁻¹ increase in inverse foil thickness (x), we estimate that the expected steady-state permeation flux increases by .26042 µA/cm². Also, 98% of the observed variation in steady-state permeation flux can be explained by its relationship to inverse foil thickness.

b. A point estimate of flux when inverse foil thickness is 23.5 can be found in the Observation 3 row of the Minitab output: ŷ = 5.722 µA/cm².

c. To test model usefulness, we test the hypotheses H0: β1 = 0 vs. Ha: β1 ≠ 0. The test statistic is t = 17.034, with an associated p-value of .000, which is less than any significance level, so we reject H0 and conclude that the model is useful.

d. With t.025,6 = 2.447, a 95% prediction interval for Y(45) is 11.321 ± 2.447√(.203 + (.253)²) = 11.321 ± 1.264 = (10.057, 12.585). That is, we are confident that when inverse foil thickness is 45 cm⁻¹, a predicted value of steady-state flux will be between 10.057 and 12.585 µA/cm².
e. (Plots: a normal probability plot of the standardized residuals, and standardized residuals vs. x.) The normal plot gives no indication to question the normality assumption, and the residual plots against both x and y (only vs. x shown) show no detectable pattern, so we judge the model adequate.
67.

a. For a one-minute increase in the 1-mile walk time, we would expect VO2max to decrease by .0996, while keeping the other predictor variables fixed.

b. We would expect males to have a VO2max .6566 higher than that of females, while keeping the other predictor variables fixed.

c. ŷ = 3.5959 + .6566(1) + .0096(170) − .0996(11) − .0880(140) = 3.67. The residual is y − ŷ = 3.15 − 3.67 = −.52.

d. R² = 1 − SSE/SST = 1 − 30.1033/102.3922 = .706, so 70.6% of the observed variation in VO2max can be attributed to the model relationship.

e. H0: β1 = β2 = β3 = β4 = 0 will be rejected in favor of Ha: at least one among β1,…,β4 ≠ 0 if f ≥ F.001,4,15 = 8.25. With f = (.706/4)/[(1 − .706)/15] = 9.005 ≥ 8.25, H0 is rejected. It appears that the model specifies a useful relationship between VO2max and at least one of the other predictors.
68.

a. (Scatter plot of log(time) vs. log(edges).) Yes, the scatter plot of the two transformed variables appears quite linear, and thus suggests a linear relationship between the two.

b. Letting y denote the variable 'time', the regression model for the variables y′ and x′ is log10(y) = y′ = α + βx′ + ε′. Exponentiating (taking the antilog of) both sides gives y = 10^(α + β log(x) + ε′) = (10^α)(x^β)(10^ε′) = γ0·x^γ1·ε; i.e., the model is y = γ0·x^γ1·ε, where γ0 = 10^α and γ1 = β. This model is often called a "power function" regression model.

c. Using the transformed variables y′ and x′, the necessary sums of squares are Sx′y′ = 68.640 − (42.4)(21.69)/16 = 11.1615 and Sx′x′ = 126.34 − (42.4)²/16 = 13.98. Therefore β̂1 = Sx′y′/Sx′x′ = 11.1615/13.98 = .79839 and β̂0 = 21.69/16 − (.79839)(42.4/16) = −.76011. The estimate of γ1 is γ̂1 = .7984, and γ̂0 = 10^β̂0 = 10^−.76011 = .1737. The estimated power function model is then y = .1737x^.7984. For x = 300, the predicted value of y is ŷ = .1737(300)^.7984 = 16.502, or about 16.5 seconds.
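The hand computation in (c) can be reproduced from the quoted sums:

```python
# Sums quoted in part (c), for y' = log10(time) regressed on x' = log10(edges)
n = 16
sum_x, sum_y = 42.4, 21.69
sum_xy, sum_xx = 68.640, 126.34

sxy = sum_xy - sum_x * sum_y / n      # S_x'y' = 11.1615
sxx = sum_xx - sum_x ** 2 / n         # S_x'x' = 13.98
b1 = sxy / sxx                        # slope; also the power gamma1
b0 = sum_y / n - b1 * sum_x / n       # intercept on the log10 scale
g0 = 10 ** b0                         # multiplicative constant gamma0
print(round(b1, 4), round(g0, 4))     # about 0.7984 and 0.1737
print(round(g0 * 300 ** b1, 1))       # predicted time at x = 300: about 16.5
```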
69.

a. Based on a scatter plot of temperature vs. pressure (not shown), a simple linear regression model would not be appropriate. Because of the slight but obvious curvature, a quadratic model would probably be more appropriate.

b. Using a quadratic model, a Minitab-generated regression equation is ŷ = 35.423 + 1.7191x − .0024753x², and a point estimate of temperature when pressure is 200 is ŷ = 280.23. Minitab will also generate a 95% prediction interval of (256.25, 304.22). That is, we are confident that when pressure is 200 psi, a single value of temperature will be between 256.25 and 304.22 °F.
70.

a. For the model excluding the interaction term, R² = 1 − 5.18/8.55 = .394, so 39.4% of the observed variation in lift/drag ratio can be explained by the model without the interaction accounted for. However, including the interaction term increases the amount of variation in lift/drag ratio that can be explained by the model to R² = 1 − 3.07/8.55 = .641, or 64.1%.
b. Without interaction, we are testing H0: β1 = β2 = 0 vs. Ha: either β1 or β2 ≠ 0. The test statistic is f = (R²/k)/[(1 − R²)/(n − k − 1)]. The rejection region is f ≥ F.05,2,6 = 5.14, and the calculated statistic is f = (.394/2)/[(1 − .394)/6] = 1.95, which does not fall in the rejection region, so we fail to reject H0; this model is not useful. With the interaction term, we are testing H0: β1 = β2 = β3 = 0 vs. Ha: at least one of the βi's ≠ 0. With rejection region f ≥ F.05,3,5 = 5.41 and calculated statistic f = (.641/3)/[(1 − .641)/5] = 2.98, we still fail to reject the null hypothesis. Even with the interaction term, there is not enough of a significant relationship between lift/drag ratio and the two predictor variables to make the model useful (a bit of a surprise!).
71.
a.
Using Minitab to generate the first order regression model, we test the model utility (to
see if any of the predictors are useful), and with f = 21.03 and a p-value of .000, we
determine that at least one of the predictors is useful in predicting palladium content.
Looking at the individual predictors, the p-value associated with the pH predictor has
value .169, which would indicate that this predictor is unimportant in the presence of the
others.
b. Testing H0: β1 = … = β20 = 0 vs. Ha: at least one of the βi's ≠ 0: with calculated statistic f = 6.29 and p-value .002, this model is also useful at any reasonable significance level.

c. Testing H0: β6 = … = β20 = 0 vs. Ha: at least one of the listed βi's ≠ 0, the test statistic is f = [(SSEl − SSEk)/(k − l)]/[SSEk/(n − k − 1)] = [(716.10 − 290.27)/(20 − 5)]/[290.27/(32 − 20 − 1)] = 1.07. Using significance level .05, the rejection region would be f ≥ F.05,15,11 = 2.72. Since 1.07 < 2.72, we fail to reject H0 and conclude that the quadratic and interaction terms should not be included in the model; they do not add enough information to make this model significantly better than the simple first-order model.
d. Partial output from Minitab follows, which shows all predictors as significant at level .05:

The regression equation is
pdconc = - 305 + 0.405 niconc + 69.3 pH - 0.161 temp + 0.993 currdens
         + 0.355 pallcont - 4.14 pHsq

Predictor    Coef       StDev     T        P
Constant     -304.85    93.98     -3.24    0.003
niconc       0.40484    0.09432   4.29     0.000
pH           69.27      21.96     3.15     0.004
temp         -0.16134   0.07055   -2.29    0.031
currdens     0.9929     0.3570    2.78     0.010
pallcont     0.35460    0.03381   10.49    0.000
pHsq         -4.138     1.293     -3.20    0.004
72.

a. R² = 1 − SSE/SST = 1 − .80017/16.18555 = .9506, so 95.06% of the observed variation in weld strength can be attributed to the given model.

b. The complete second-order model consists of nine predictors and nine corresponding coefficients. The hypotheses are H0: β1 = … = β9 = 0 vs. Ha: at least one of the βi's ≠ 0. The test statistic is f = (R²/k)/[(1 − R²)/(n − k − 1)], where k = 9 and n = 37. The rejection region is f ≥ F.05,9,27 = 2.25. The calculated statistic is f = (.9506/9)/[(1 − .9506)/27] = 57.68, which is ≥ 2.25, so we reject the null hypothesis. The complete second-order model is useful.

c. To test H0: β7 = 0 vs. Ha: β7 ≠ 0 (the coefficient corresponding to the wc*wt predictor), the test statistic value is t = 1.52. With df = 27, the p-value ≈ 2(.073) = .146 (from Table A.8). With such a large p-value, this predictor is not useful in the presence of all the others, so it can be eliminated.

d. The point estimate is ŷ = 3.352 + .098(10) + .222(12) + .297(6) − .0102(10²) − .037(6²) + .0128(10)(12) = 7.962. With t.025,27 = 2.052, the 95% P.I. would be 7.962 ± 2.052(.0750) = 7.962 ± .154 = (7.808, 8.116). Because of the narrowness of the interval, it appears that the value of strength can be accurately predicted.
73.

a. We wish to test H0: β1 = β2 = 0 vs. Ha: either β1 or β2 ≠ 0. The test statistic is f = (R²/k)/[(1 − R²)/(n − k − 1)], where k = 2 for the quadratic model. The rejection region is f ≥ Fα,k,n−k−1 = F.01,2,5 = 13.27. R² = 1 − .29/202.88 = .9986, giving f = 1783. No doubt about it, folks: the quadratic model is useful!

b. The relevant hypotheses are H0: β2 = 0 vs. Ha: β2 ≠ 0. The test statistic value is t = β̂2/sβ̂2, and H0 will be rejected at level .001 if either t ≥ 6.869 or t ≤ −6.869 (df = n − 3 = 5). Since t = −.00163141/.00003391 = −48.1 ≤ −6.869, H0 is rejected. The quadratic predictor should be retained.

c. No. R² is extremely high for the quadratic model, so the marginal benefit of including the cubic predictor would be essentially nil, and a scatter plot doesn't show the type of curvature associated with a cubic model.

d. t.025,5 = 2.571, and β̂0 + β̂1(100) + β̂2(100)² = 21.36, so the C.I. is 21.36 ± .69 = (20.67, 22.05).

e. First, we need to figure out s² from the information we have been given: s² = MSE = SSE/df = .29/5 = .058. Then, the 95% P.I. is 21.36 ± 2.571√(.058 + .1141) = 21.36 ± 1.067 = (20.293, 22.427).
74. A scatter plot of y′ = log10(y) vs. x shows a substantial linear pattern, suggesting the model Y = α·(10^βx)·ε, i.e. Y′ = log(α) + βx + log(ε) = β0 + β1x + ε′. The necessary summary quantities are Σxi = 397, Σxi² = 14,263, Σyi′ = −74.3, Σyi′² = 47,081, and Σxiyi′ = −2358.1, giving β̂1 = [12(−2358.1) − (397)(−74.3)]/[12(14,263) − (397)²] = .08857312 and β̂0 = −9.12196058. Thus β̂ = .08857312 and α̂ = 10^−9.12196058. The predicted value of y′ when x = 35 is −9.12196058 + .08857312(35) = −6.0219, so ŷ = 10^−6.0219.
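The summary-statistic fit above can be reproduced directly:

```python
# Summary quantities quoted above, for y' = log10(y) regressed on x
n = 12
sx, sy = 397, -74.3
sxx, sxy = 14263, -2358.1

b1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
b0 = sy / n - b1 * sx / n
y_log = b0 + b1 * 35                 # predicted log10(y) at x = 35
print(round(b1, 6), round(b0, 4))    # about 0.088573 and -9.122
print(round(y_log, 4))               # about -6.0219
```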
75.
a.
H 0 : β1 = β 2 = 0 will be rejected in favor of H a : either β 1 or β 2 ≠ 0 if
R2
f = (1−R 2 )
k
( n − k −1)
R2 = 1 −
≥ Fα , k, n − k− 1 = F.01,2 ,7 = 9.55 . SST = Σy 2 −
26 .98
= .898 , and f =
264 .5
.898
2
(.102)
7
(Σy) = 264.5 , so
n
= 30.8 . Because 30.8 ≥ 9.55 Ho is rejected at
significance level .01 and the quadratic model is judged useful.
b.  The hypotheses are H0: β2 = 0 vs. Ha: β2 ≠ 0. The test statistic value is t = β̂2/s_β̂2 = −2.3621/.3073 = −7.69, and t.0005,7 = 5.408, so H0 is rejected at level .001 and p-value < .001. The quadratic predictor should not be eliminated.
c.  x = 1 here, and μ̂Y·1 = β̂0 + β̂1(1) + β̂2(1)² = 45.96. t.05,7 = 1.895, giving the C.I. 45.96 ± (1.895)(1.031) = (44.01, 47.91).
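The R²-based F statistic in part a generalizes to any k; a minimal sketch using the part-a values and the tabled critical value F.01,2,7 = 9.55 is:

```python
# Model utility test from R^2: f = (R^2 / k) / ((1 - R^2) / (n - k - 1)).
R2, k, n = 0.898, 2, 10           # values from part a of this problem
f_stat = (R2 / k) / ((1 - R2) / (n - k - 1))

F_crit = 9.55                     # tabled F_.01,2,7
reject_H0 = f_stat >= F_crit      # the quadratic model is judged useful
```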
76.
a.
80.79
b.  Yes: the p-value = .007, which is less than .01.
c.  No: the p-value = .043, which is less than .05.
d.  .14167 ± (2.447)(.03301) = (.0609, .2224)
e.  μ̂Y·9,66 = 6.3067; using α = .05, the interval is 6.3067 ± (2.447)·√((.4851)² + (.162)²) = (5.06, 7.56).
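Part e combines two error components in quadrature; assuming .4851 and .162 are the two standard errors under the solution’s square root, the arithmetic can be sketched as:

```python
import math

# Interval from part e: point estimate, with the two error components
# combined under the square root before scaling by t = 2.447.
y_hat, t_crit = 6.3067, 2.447
half = t_crit * math.sqrt(0.4851**2 + 0.162**2)
interval = (y_hat - half, y_hat + half)
```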
77.
a.  Estimate = β̂0 + β̂1(15) + β̂2(3.5) = 180 + (1)(15) + (10.5)(3.5) = 231.75

b.  R² = 1 − 117.4/1210.30 = .903

c.  H0: β1 = β2 = 0 vs. Ha: either β1 or β2 ≠ 0 (or both). f = (.903/2)/(.097/9) = 41.9, which greatly exceeds F.01,2,9, so there appears to be a useful linear relationship.

d.  s² = SSE/df = 117.40/(12 − 3) = 13.044, √(s² + (est. st. dev.)²) = 3.806, t.025,9 = 2.262. The P.I. is 229.5 ± (2.262)(3.806) = (220.9, 238.1).
78.
The second order model has predictors
x1 , x2 , x3 , x12 , x22 , x32 , x1 x 2 , x1 x 3 , x 2 x 3 with
β1 , β 2 , β 3 , β 4 , β 5 , β 6 , β 7 , β 8 , β 9 . We wish to test
H 0 : β 4 = β 5 = β 6 = β 7 = β 8 = β 9 = 0 vs. the alternative that at least one of these six
corresponding coefficients
β i ' s is not zero. The test statistic value is f =
1.1 <
(821.5 − 5027. 1)
(5027. 1)
(9 −3 )
(20 −10)
=
530.9
= 1.1 . Since
502.71
F.05,6,10 = 3.22 , Ho cannot be rejected. It doesn’t appear as though any of the
quadratic or interaction carriers should be included in the model.
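The partial (extra sum of squares) F test above can be sketched numerically; the reduced-model SSE of 8212.5 is inferred from the intermediate quantities 530.9 and 502.71 shown in the computation:

```python
# Partial F test for the six second-order coefficients:
# f = ((SSE_reduced - SSE_full) / extra_df) / (SSE_full / error_df)
sse_reduced = 8212.5              # inferred: 530.9 * 6 + 5027.1
sse_full = 5027.1
extra_df, error_df = 9 - 3, 20 - 10

f_stat = ((sse_reduced - sse_full) / extra_df) / (sse_full / error_df)
reject_H0 = f_stat >= 3.22        # tabled F_.05,6,10
```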
79.
There are obviously several reasonable choices in each case.
a. The model with 6 carriers is a defensible choice on all three grounds, as are those with 7
and 8 carriers.
b.
The models with 7, 8, or 9 carriers here merit serious consideration. These models merit
consideration because
Rk2 , MSE k , and CK meet the variable selection criteria given in
Section 13.5.
80.
a.  f = (.90/15)/(.10/4) = 2.4. Because 2.4 < 5.86, H0: β1 = ... = β15 = 0 cannot be rejected. There does not appear to be a useful linear relationship.
b.
The high R2 value resulted from saturating the model with predictors. In general, one
would be suspicious of a model yielding a high R2 value when K is large relative to n.
c.  (R²/15)/[(1 − R²)/4] ≥ 5.86 iff R²/(1 − R²) ≥ 21.975 iff R² ≥ 21.975/22.975 = .9565
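Part c’s algebra (solving the F-ratio inequality for R²) can be checked directly:

```python
# Smallest R^2 for which f = (R^2 / 15) / ((1 - R^2) / 4) reaches 5.86:
# the inequality rearranges to R^2 / (1 - R^2) >= 5.86 * 15 / 4.
ratio = 5.86 * 15 / 4         # = 21.975
r2_min = ratio / (1 + ratio)  # = 21.975 / 22.975

# Sanity check: the F ratio at r2_min equals the critical value.
f_at_min = (r2_min / 15) / ((1 - r2_min) / 4)
```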
81.
a.  The relevant hypotheses are H0: β1 = ... = β5 = 0 vs. Ha: at least one among β1, ..., β5 is not 0. F.05,5,111 = 2.29 and f = (.827/5)/(.173/111) = 106.1. Because 106.1 ≥ 2.29, H0 is rejected in favor of the conclusion that there is a useful linear relationship between Y and at least one of the predictors.
b.  t.05,111 = 1.66, so the C.I. is .041 ± (1.66)(.016) = .041 ± .027 = (.014, .068). β1 is the expected change in mortality rate associated with a one-unit increase in the particle reading when the other four predictors are held fixed; we can be 90% confident that .014 < β1 < .068.
c.  H0: β4 = 0 will be rejected in favor of Ha: β4 ≠ 0 if t = β̂4/s_β̂4 is either ≥ 2.62 or ≤ −2.62. t = .041/.007 = 5.9 ≥ 2.62, so H0 is rejected and this predictor is judged important.
d.  ŷ = 19.607 + .041(166) + .071(60) + .001(788) + .041(68) + .687(95) = 99.514, and the corresponding residual is 103 − 99.514 = 3.486.
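The coefficient inference in parts b and c reduces to one-line computations; this sketch takes β̂1 = .041 (s.e. .016) and, consistent with the fitted equation in part d, β̂4 = .041 (s.e. .007):

```python
# Part b: 90% C.I. for beta_1 with t_.05,111 = 1.66.
b1_hat, se_b1 = 0.041, 0.016
ci = (b1_hat - 1.66 * se_b1, b1_hat + 1.66 * se_b1)

# Part c: t statistic for beta_4, compared against +/- 2.62.
t4 = 0.041 / 0.007
significant = abs(t4) >= 2.62
```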
82.
a.  The set {x1, x3, x4, x5, x6, x8} includes both {x1, x4, x5, x8} and {x1, x3, x5, x6}, so R²_1,3,4,5,6,8 ≥ max(R²_1,4,5,8, R²_1,3,5,6) = .723.

b.  R²_1,4 ≤ R²_1,4,5,8 = .723, but it is not necessarily ≤ .689, since {x1, x4} is not a subset of {x1, x3, x5, x6}.
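The fact being used here — adding predictors can never decrease R² — is easy to illustrate numerically. The sketch below fits least squares models on simulated data; the data and predictor subsets are invented for illustration:

```python
import numpy as np

# Simulated data: four candidate predictors and a response.
rng = np.random.default_rng(0)
n = 30
X = rng.normal(size=(n, 4))
y = 1.0 + X @ np.array([0.5, -0.3, 0.0, 0.2]) + rng.normal(scale=0.5, size=n)

def r_squared(cols):
    """R^2 from an intercept-plus-selected-predictors least squares fit."""
    A = np.column_stack([np.ones(n), X[:, cols]])
    resid = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

# A superset's R^2 is at least the max over its subsets' R^2 values.
r_sub = max(r_squared([0, 1]), r_squared([1, 3]))
r_super = r_squared([0, 1, 3])
```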