Chapter 13
Multiple Regression
Section 13.1
Using Several Variables to Predict a Response
Copyright © 2013, 2009, and 2007, Pearson Education, Inc.
Regression Models
The model that contains only two variables, x and y, is called a bivariate model:
μ_y = α + βx
Regression Models
Suppose there are two predictors, denoted by x₁ and x₂. This is called a multiple regression model:
μ_y = α + β₁x₁ + β₂x₂
Multiple Regression Model
The multiple regression model relates the mean μ_y of a quantitative response variable y to a set of explanatory variables x₁, x₂, ....
Multiple Regression Model
Example: For three explanatory variables, the multiple regression equation is:
μ_y = α + β₁x₁ + β₂x₂ + β₃x₃
Multiple Regression Model
Example: The sample prediction equation with three explanatory variables is:
ŷ = a + b₁x₁ + b₂x₂ + b₃x₃
Example: Predicting Selling Price Using
House Size and Number of Bedrooms
The data set “house selling prices” contains observations
on 100 home sales in Florida in November 2003.
A multiple regression analysis was done with selling price
as the response variable and with house size and
number of bedrooms as the explanatory variables.
Example: Predicting Selling Price Using
House Size and Number of Bedrooms
Output from the analysis:
Table 13.3 Regression of Selling Price on House Size and Bedrooms. The regression
equation is price = 60,102 + 63.0 house size + 15,170 bedrooms.
Example: Predicting Selling Price Using
House Size and Number of Bedrooms
Prediction Equation:
ŷ = 60,102 + 63.0x₁ + 15,170x₂
where y = selling price, x₁ = house size, and x₂ = number of bedrooms.
Example: Predicting Selling Price Using
House Size and Number of Bedrooms
One house listed in the data set had house size = 1679 square feet and number of bedrooms = 3.
Find its predicted selling price:
ŷ = 60,102 + 63.0(1679) + 15,170(3) = $211,389
Example: Predicting Selling Price Using
House Size and Number of Bedrooms
Find its residual:
y − ŷ = 232,500 − 211,389 = 21,111
The residual tells us that the actual selling price was
$21,111 higher than predicted.
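To make the arithmetic concrete, here is a minimal Python sketch (not part of the original slides) that evaluates the prediction equation from Table 13.3 and the residual for this house:

```python
# Prediction equation from Table 13.3:
# predicted price = 60,102 + 63.0(house size) + 15,170(bedrooms)

def predicted_price(size_sqft, bedrooms):
    return 60102 + 63.0 * size_sqft + 15170 * bedrooms

y_hat = predicted_price(1679, 3)   # 211,389
residual = 232500 - y_hat          # observed minus predicted = 21,111
print(f"predicted = ${y_hat:,.0f}, residual = ${residual:,.0f}")
```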
The Number of Explanatory Variables
You should not use many explanatory variables in a
multiple regression model unless you have lots of data.
A rough guideline is that the sample size n should be at
least 10 times the number of explanatory variables.
For example, to use two explanatory variables, you
should have at least n = 20.
Plotting Relationships
Always look at the data before doing a multiple
regression.
Most software has the option of constructing scatterplots for each pair of variables on a single graph.
• This is called a scatterplot matrix, as illustrated in the sketch below.
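As an illustration, pandas can draw a scatterplot matrix with its scatter_matrix helper. This is a sketch only: the data frame and its column names (price, size, bedrooms) are placeholders, not the textbook's actual data file.

```python
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix

# Placeholder data; in practice, load the house selling prices data set.
df = pd.DataFrame({
    "price":    [232500, 158000, 195000, 310000],
    "size":     [1679, 1240, 1500, 2400],
    "bedrooms": [3, 2, 3, 4],
})
scatter_matrix(df, figsize=(6, 6))  # one scatterplot for each pair of variables
plt.show()
```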
Plotting Relationships
Figure 13.1 Scatterplot Matrix for Selling Price, House Size, and Number of Bedrooms. The middle plot in the top row has house size on the x-axis and selling price on the y-axis. The first plot in the second row reverses this, with selling price on the x-axis and house size on the y-axis. Question: Why are the plots of main interest the ones in the first row?
Interpretation of Multiple Regression
Coefficients
The simplest way to interpret a multiple regression
equation looks at it in two dimensions as a function of a
single explanatory variable.
We can look at it this way by fixing values for the other
explanatory variable(s).
Interpretation of Multiple Regression
Coefficients
Example using the housing data:
Suppose we fix the number of bedrooms at x₂ = 3.
The prediction equation becomes:
ŷ = 60,102 + 63.0x₁ + 15,170(3) = 105,612 + 63.0x₁
Interpretation of Multiple Regression
Coefficients
Since the slope coefficient of x₁ is 63.0, the predicted selling price increases, for houses with this number of bedrooms, by $63.00 for every additional square foot in house size.
For a 100 square-foot increase in house size, the predicted selling price increases by 100(63.00) = $6,300.
Summarizing the Effect While Controlling
for a Variable
The multiple regression model assumes that the slope for
a particular explanatory variable is identical for all fixed
values of the other explanatory variables.
Summarizing the Effect While Controlling
for a Variable
For example, the coefficient of x₁ in the prediction equation:
ŷ = 60,102 + 63.0x₁ + 15,170x₂
is 63.0 regardless of whether we plug in x₂ = 1, x₂ = 2, or x₂ = 3.
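A short sketch (ours, not the slides') makes the "same slope, different intercept" point explicit by substituting each number of bedrooms:

```python
# Substituting x2 = 1, 2, 3 into y-hat = 60,102 + 63.0*x1 + 15,170*x2
# collapses the equation to three lines in x1 with identical slope 63.0.
for bedrooms in (1, 2, 3):
    intercept = 60102 + 15170 * bedrooms
    print(f"x2 = {bedrooms}: y-hat = {intercept:,} + 63.0 x1")
```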
Summarizing the Effect While Controlling
for a Variable
Figure 13.2 The Relationship Between y and x₁ for the Multiple Regression Equation ŷ = 60,102 + 63.0x₁ + 15,170x₂. This shows how the equation simplifies when number of bedrooms x₂ = 1, x₂ = 2, or x₂ = 3. Question: The lines move upward (to higher ŷ-values) as x₂ increases. How would you interpret this fact?
Slopes in Multiple Regression and in
Bivariate Regression
• In multiple regression, a slope describes the effect of an explanatory variable while controlling effects of the other explanatory variables in the model.
• Bivariate regression has only a single explanatory variable. A slope in bivariate regression describes the effect of that variable while ignoring all other possible explanatory variables.
Importance of Multiple Regression
• One of the main uses of multiple regression is to identify potential lurking variables and control for them by including them as explanatory variables in the model.
• Doing so can have a major impact on a variable's effect.
• When we control a variable, we keep that variable from influencing the associations among the other variables in the study.
Chapter 13
Multiple Regression
Section 13.2
Extending the Correlation and R-Squared for Multiple Regression
Multiple Correlation
To summarize how well a multiple regression model
predicts y, we analyze how well the observed y values
ŷ values.
correlate with the predicted
The multiple correlation is the correlation between the
yˆ values.
observed y values and the predicted
 It is denoted by R.

25
Copyright © 2013, 2009, and 2007, Pearson Education, Inc.
Multiple Correlation
• For each subject, the regression equation provides a predicted value.
• Each subject has an observed y-value and a predicted ŷ-value.
Table 13.4 Selling Prices and Their Predicted Values. These values refer to the two home sales listed in Table 13.1. The predictors are x₁ = house size and x₂ = number of bedrooms.
Multiple Correlation
The correlation computed between all pairs of observed
y-values and predicted y-values is the multiple
correlation, R.
The larger the multiple correlation, the better are the
predictions of y by the set of explanatory variables.
Multiple Correlation
The R-value always falls between 0 and 1.
In this way, the multiple correlation ‘R’ differs from the
bivariate correlation ‘r’ between y and a single variable x,
which falls between -1 and +1.
R-squared
For predicting y, the square of R describes the relative improvement from using the prediction equation instead of using the sample mean, ȳ.
The error in using the prediction equation to predict y is summarized by the residual sum of squares:
Σ(y − ŷ)²
R-squared
The error in using ȳ to predict y is summarized by the total sum of squares:
Σ(y − ȳ)²
R-squared
The proportional reduction in error is:
R² = [Σ(y − ȳ)² − Σ(y − ŷ)²] / Σ(y − ȳ)²
R-squared
• The better the predictions are using the regression equation, the larger R² is.
• For multiple regression, R² is the square of the multiple correlation, R.
Example: Predicting House Selling Prices
For the 200 observations on y = selling price, x₁ = house size, and x₂ = number of bedrooms, a table, called the ANOVA (analysis of variance) table, was created.
The table displays the sums of squares in the SS column.
Table 13.5 ANOVA Table and R-Squared for Predicting House Selling Price (in thousands of dollars) Using House Size (in thousands of square feet) and Number of Bedrooms.
Example: Predicting House Selling Prices
The R² value can be computed from the sums of squares in the table:
R² = [Σ(y − ȳ)² − Σ(y − ŷ)²] / Σ(y − ȳ)²
   = (2,668,870 − 1,269,345) / 2,668,870 = 0.524
Example: Predicting House Selling Prices
Using house size and number of bedrooms together to predict selling price reduces the prediction error by 52%, relative to using ȳ alone to predict selling price.
Example: Predicting House Selling Prices
Find and interpret the multiple correlation.
R = √R² = √0.524 = 0.72
There is a moderately strong association between the observed and the predicted selling prices.
House size and number of bedrooms are very helpful in predicting selling prices.
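The same computation in Python, using only the two sums of squares from Table 13.5 (a sketch for checking the arithmetic):

```python
from math import sqrt

total_ss = 2_668_870     # total sum of squares, sum of (y - ybar)^2
residual_ss = 1_269_345  # residual sum of squares, sum of (y - yhat)^2

r_squared = (total_ss - residual_ss) / total_ss  # proportional reduction in error
r_multiple = sqrt(r_squared)                     # multiple correlation R
print(f"R^2 = {r_squared:.3f}, R = {r_multiple:.2f}")  # 0.524 and 0.72
```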
Example: Predicting House Selling Prices
If we used a bivariate regression model to predict selling price with house size as the predictor, the r² value would be 0.51.
If we used a bivariate regression model to predict selling price with number of bedrooms as the predictor, the r² value would be 0.1.
Example: Predicting House Selling Prices
The multiple regression model has R² = 0.52, a similar value to using only house size as a predictor.
There is clearly more to this prediction than using only one variable in the model. Interpretation of results is important.
Larger lot sizes in this area could mean older homes with smaller size or fewer bedrooms or bathrooms.
Example: Predicting House Selling Prices
Table 13.6 R² Values for Multiple Regression Models for y = House Selling Price
Example: Predicting House Selling Prices
Although R² goes up by only small amounts after house size is in the model, this does not mean that the other predictors are only weakly correlated with selling price.
Because the predictors are themselves highly correlated, once one or two of them are in the model, the remaining ones don't help much in adding to the predictive power.
For instance, lot size is highly positively correlated with number of bedrooms and with size of house. So, once number of bedrooms and size of house are included as predictors in the model, there's not much benefit to including lot size as an additional predictor.
Properties of R²
The previous example showed that R² for the multiple regression model was larger than r² for a bivariate model using only one of the explanatory variables.
A key property of R² is that it cannot decrease when predictors are added to a model.
Properties of R²
• R² falls between 0 and 1.
• The larger the value, the better the explanatory variables collectively predict y.
• R² = 1 only when all residuals are 0, that is, when all regression predictions are perfect.
• R² = 0 when the correlation between y and each explanatory variable equals 0.
Properties of R²
• R² gets larger, or at worst stays the same, whenever an explanatory variable is added to the multiple regression model.
• The value of R² does not depend on the units of measurement.
Chapter 13
Multiple Regression
Section 13.3
Using Multiple Regression to Make Inferences
Inferences about the Population
Assumptions required when using a multiple regression model to make inferences about the population:
• The regression equation truly holds for the population means. This implies that there is a straight-line relationship between the mean of y and each explanatory variable, with the same slope at each value of the other predictors.
Inferences about the Population
Assumptions required when using a multiple regression model to make inferences about the population:
• The data were gathered using randomization.
• The response variable y has a normal distribution at each combination of values of the explanatory variables, with the same standard deviation.
Inferences about Individual Regression
Parameters
Consider a particular parameter, β₁.
• If β₁ = 0, the mean of y is identical for all values of x₁, at fixed values of the other explanatory variables.
• So, H₀: β₁ = 0 states that y and x₁ are statistically independent, controlling for the other variables.
• This means that once the other explanatory variables are in the model, it doesn't help to have x₁ in the model.
SUMMARY: Significance Test about a
Multiple Regression Parameter
1. Assumptions:
• Each explanatory variable has a straight-line relation with μ_y, with the same slope for all combinations of values of other predictors in the model.
• Data gathered using randomization.
• Normal distribution for y with same standard deviation at each combination of values of other predictors in the model.
SUMMARY: Significance Test about a
Multiple Regression Parameter
2. Hypotheses:
• H₀: β₁ = 0
• Hₐ: β₁ ≠ 0
When H₀ is true, y is independent of x₁, controlling for the other predictors.
SUMMARY: Significance Test about a
Multiple Regression Parameter
3. Test Statistic:
t = (b₁ − 0) / se
where se is the standard error for b₁.
SUMMARY: Significance Test about a
Multiple Regression Parameter
4. P-value: Two-tail probability from the t-distribution of values larger than the observed t test statistic (in absolute value). The t-distribution has:
df = n − number of parameters in the regression equation
SUMMARY: Significance Test about a
Multiple Regression Parameter
5. Conclusion: Interpret P-value in context; compare to
significance level if decision needed.
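These five steps translate directly into code. Below is a minimal sketch using scipy's t distribution; the numbers plugged in are from the athlete example that follows, and the helper name slope_t_test is ours:

```python
from scipy import stats

def slope_t_test(b, se, n, n_params):
    """t test for H0: beta = 0 against Ha: beta != 0."""
    t = (b - 0) / se                 # test statistic
    df = n - n_params                # df = n - number of parameters
    p = 2 * stats.t.sf(abs(t), df)   # two-tail P-value
    return t, df, p

t, df, p = slope_t_test(b=-0.960, se=0.648, n=64, n_params=4)
print(f"t = {t:.2f}, df = {df}, P-value = {p:.3f}")  # t = -1.48, P = 0.144
```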
Example: What Helps Predict a Female
Athlete’s Weight?
The “College Athletes” data set comes from a study of 64
University of Georgia female athletes.
The study measured several physical characteristics,
including total body weight in pounds (TBW), height in
inches (HGT), the percent of body fat (%BF) and age.
Example: What Helps Predict a Female
Athlete’s Weight?
The results of fitting a multiple regression model for
predicting weight using the other variables:
Table 13.10 Multiple Regression Analysis for Predicting Weight
Predictors are HGT = height, %BF = body fat, and age of subject.
Example: What Helps Predict a Female
Athlete’s Weight?
Interpret the effect of age on weight in the multiple
regression equation:
Let ŷ = predicted weight, x₁ = height, x₂ = % body fat, and x₃ = age.
Then ŷ = 97.7 + 3.43x₁ + 1.36x₂ − 0.96x₃
Example: What Helps Predict a Female
Athlete’s Weight?
The slope coefficient of age is −0.96.
For athletes having fixed values for x₁ and x₂, the predicted weight decreases by 0.96 pounds for a 1-year increase in age. The ages vary only between 17 and 23.
Example: What Helps Predict a Female
Athlete’s Weight?
Run a hypothesis test to determine whether age helps to
predict weight, if you already know height and percent
body fat. Here are the steps:
1. Assumptions:
• The 64 female athletes were a convenience sample, not a random sample.
• Caution should be taken when making inferences about all female college athletes.
Example: What Helps Predict a Female
Athlete’s Weight?
2. Hypotheses:
• H₀: β₃ = 0
• Hₐ: β₃ ≠ 0
3. Test statistic:
t = (b₃ − 0) / se = −0.960 / 0.648 = −1.48
Example: What Helps Predict a Female
Athlete’s Weight?
4. P-value: This value is reported in the output as 0.144.
5. Conclusion: The P-value of 0.144 does not give much evidence against the null hypothesis that β₃ = 0.
• Age may not significantly predict weight if we already know height and % body fat.
Confidence Interval for a Multiple
Regression Parameter
A 95% confidence interval for a β slope parameter in multiple regression equals:
Estimated slope ± t.025(se)
The t-score has df = n − (# of parameters in the model).
The assumptions are the same as for the t test.
Confidence Interval for a Multiple
Regression Parameter
Construct and interpret a 95% CI for β₃, the effect of age while controlling for height and % body fat:
b₃ ± t.025(se) = −0.96 ± 2.00(0.648) = −0.96 ± 1.30 = (−2.3, 0.3)
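A sketch of the same interval in Python, with the exact t critical value rather than the rounded 2.00:

```python
from scipy import stats

b3, se, df = -0.96, 0.648, 60            # slope, standard error, n - parameters
t_crit = stats.t.ppf(0.975, df)          # about 2.00
lo, hi = b3 - t_crit * se, b3 + t_crit * se
print(f"95% CI: ({lo:.1f}, {hi:.1f})")   # about (-2.3, 0.3)
```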
Confidence Interval for a Multiple
Regression Parameter
At fixed values of x₁ and x₂, we infer that the population mean of weight changes very little (and maybe not at all) for a 1-year increase in age.
• The confidence interval contains 0.
• Age may have no effect on weight, once we control for height and % body fat.
Estimating Variability Around the
Regression Equation
A standard deviation parameter, σ, describes the variability of the observations around the regression equation.
Its sample estimate is:
s = √(Residual SS / df) = √[Σ(y − ŷ)² / (n − # of parameters in reg. eq.)]
Example: What Helps Predict a Female
Athletes’ Weight?
ANOVA table for the "College Athletes" data set:
Table 13.7 ANOVA Table for Multiple Regression Analysis of Athlete Weights
Example: What Helps Predict a Female
Athletes’ Weight?
For female athletes at particular values of height, % of
body fat, and age, estimate the standard deviation of
their weights.
• Begin by finding the Mean Square Error:
s² = residual SS / df = 6131.0 / 60 = 102.2
Notice that this value (102.2) appears in the MS column in the ANOVA table.
Example: What Helps Predict a Female
Athletes’ Weight?
• The standard deviation is:
s = √102.2 = 10.1
• This value is also displayed in the ANOVA table.
• For athletes with certain fixed values of height, % body fat, and age, the weights vary with a standard deviation of about 10 pounds.
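In code, the estimate comes straight from the ANOVA table's residual SS and df (a checking sketch, not the slides' own):

```python
from math import sqrt

residual_ss = 6131.0
df = 60                   # n - number of parameters = 64 - 4
mse = residual_ss / df    # mean square error, about 102.2
s = sqrt(mse)             # about 10.1 pounds
print(f"MSE = {mse:.1f}, s = {s:.1f}")
```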
Example: What Helps Predict a Female
Athletes’ Weight?
Insight:
If the conditional distributions of weight are approximately bell-shaped, about 95% of the weight values fall within about 2s = 20 pounds of the true regression equation.
The Collective Effect of Explanatory
Variables
Example: With 3 predictors in a model, we can test whether they collectively affect y by testing:
H₀: β₁ = β₂ = β₃ = 0
Hₐ: At least one β parameter ≠ 0
The Collective Effect of Explanatory
Variables
The test statistic for H₀ is denoted by F. It equals the ratio of the mean squares from the ANOVA table:
F = (Mean square for regression) / (Mean square error)
The Collective Effect of Explanatory
Variables
• When H₀ is true, the expected value of the F test statistic is approximately 1.
• When H₀ is false, F tends to be larger than 1.
• The larger the F test statistic, the stronger the evidence against H₀.
SUMMARY: F Test That All Beta
Parameters = 0
1. Assumptions:
• Multiple regression equation holds
• Data gathered randomly
• Normal distribution for y with same standard deviation at each combination of predictors
SUMMARY: F Test That All Beta
Parameters = 0
2. Hypotheses:
H₀: β₁ = β₂ = ⋯ = 0
Hₐ: At least one β parameter ≠ 0
3. Test statistic:
F = (Mean square for regression) / (Mean square error)
SUMMARY: F Test That All Beta
Parameters = 0
4. P-value:
Right-tail probability above the observed F test statistic value from the F distribution with:
• df₁ = number of explanatory variables
• df₂ = n − (number of parameters in regression equation)
SUMMARY: F Test That All Beta
Parameters = 0
5. Conclusion: The smaller the P-value, the stronger the evidence that at least one explanatory variable has an effect on y.
• If a decision is needed, reject H₀ if P-value ≤ significance level, such as 0.05.
Example: What Helps Predict a Female
Athlete’s Weight?
For the 64 female college athletes, the regression model for predicting y = weight using x₁ = height, x₂ = % body fat, and x₃ = age is summarized in the ANOVA table on the next slide.
Example: What Helps Predict a Female
Athlete’s Weight?
ANOVA table for the "College Athletes" data set:
Table 13.7 ANOVA Table for Multiple Regression Analysis of Athlete Weights
Example: What Helps Predict a Female
Athlete’s Weight?
Use the output in the ANOVA table to test the hypotheses:
H₀: β₁ = β₂ = β₃ = 0
Hₐ: At least one β parameter ≠ 0
Example: What Helps Predict a Female
Athlete’s Weight?
• The observed F statistic is 40.5.
• The corresponding P-value is 0.000.
• We can reject H₀ at the 0.05 significance level.
In summary, we conclude that at least one predictor has an effect on weight.
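The reported P-value can be reproduced from the F distribution with scipy (a sketch; df₁ = 3 explanatory variables, df₂ = 64 − 4 = 60):

```python
from scipy import stats

F, df1, df2 = 40.5, 3, 60
p_value = stats.f.sf(F, df1, df2)   # right-tail probability above F
print(f"P-value = {p_value:.2e}")   # effectively 0; reported as 0.000
```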
Example: What Helps Predict a Female
Athlete’s Weight?
Insight:
• The F-test tells us that at least one explanatory variable has an effect.
• If the explanatory variables are chosen sensibly, at least one should have some predictive power.
• The F-test result tells us whether there is sufficient evidence to make it worthwhile to consider the individual effects, using t-tests.
Example: What Helps Predict a Female
Athlete’s Weight?
The individual t-tests identify which of the variables are significant (controlling for the other variables).
Table 13.10 Multiple Regression Analysis for Predicting Weight. Predictors are HGT = height, %BF = body fat, and age of subject.
Example: What Helps Predict a Female
Athlete’s Weight?
• If a variable turns out not to be significant, it can be removed from the model.
• In this example, 'age' can be removed from the model.
Chapter 13
Multiple Regression
Section 13.4
Checking a Regression Model Using Residual Plots
Assumptions for Inference with a
Multiple Regression Model
1. The regression equation approximates well the true
relationship between the predictors and the mean of y.
2. The data were gathered randomly.
3. y has a normal distribution with the same standard
deviation at each combination of predictors.
Checking Shape and Detecting Unusual
Observations
To test Assumption 3 (the conditional distribution of y is normal at any fixed values of the explanatory variables):
• Construct a histogram of the standardized residuals.
• The histogram should be approximately bell-shaped.
• Nearly all the standardized residuals should fall between −3 and +3. Any residual outside these limits is a potential outlier.
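A sketch of this check in Python. The residuals here are simulated stand-ins; in practice they would be the standardized residuals from the fitted model:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
std_resid = rng.standard_normal(100)  # placeholder standardized residuals

plt.hist(std_resid, bins=15, edgecolor="black")
plt.axvline(-3, color="red")          # limits beyond which a residual
plt.axvline(3, color="red")           # is a potential outlier
plt.xlabel("Standardized residual")
plt.ylabel("Count")
plt.show()
```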
Example: House Selling Price
For the house selling price data, a MINITAB histogram of the standardized residuals for the multiple regression model predicting selling price by house size and lot size was created and is displayed on the following slide.
Example: House Selling Price
Figure 13.4 Histogram of Standardized Residuals for Multiple Regression Model
Predicting Selling Price. Question: Give an example of a shape for this histogram that
would indicate that a few observations are highly unusual.
Example: House Selling Price
• The residuals are roughly bell-shaped about 0.
• They fall mostly between about −3 and +3.
• No severe nonnormality is indicated.
Plotting Residuals against Each
Explanatory Variable
• Plots of residuals against each explanatory variable help us check for potential problems with the regression model.
• Ideally, the residuals should fluctuate randomly about the horizontal line at 0.
• There should be no obvious change in trend or change in variation as the values of the explanatory variable x₁ increase.
Plotting Residuals against Each
Explanatory Variable
Figure 13.5 Possible Patterns for Residuals, Plotted Against an Explanatory Variable. Question: Why does the pattern in (b) suggest that the effect of x₁ is not linear?
Chapter 13
Multiple Regression
Section 13.5
Regression and Categorical Predictors
Indicator Variables
Regression models can specify categories of a categorical
explanatory variable using artificial variables, called
indicator variables.
• The indicator variable for a particular category is binary.
• It equals 1 if the observation falls into that category and it equals 0 otherwise.
Indicator Variables
In the house selling prices data set, the condition of the
house is a categorical variable. It was measured with
categories (good, not good).
• The indicator variable x for condition is:
x = 1 if house is in good condition
x = 0 if house is not in good condition
Indicator Variables
The regression model is then μ_y = α + βx, with x as just defined.
Substituting the possible values 1 and 0 for x:
μ_y = α + β(1) = α + β, if house is in good condition (so x = 1)
μ_y = α + β(0) = α, if house is not in good condition (so x = 0)
The difference between the mean selling price for houses in good condition and not in good condition is:
(μ_y for good condition) − (μ_y otherwise) = (α + β) − α = β
The coefficient β of the indicator variable x is the difference between the mean selling prices for homes in good condition and for homes not in good condition.
Example: Including Condition in
Regression for House Selling Price
Output from the regression model for selling price of home using house size and condition:
Table 13.11 Regression Analysis of y = Selling Price Using x₁ = House Size and x₂ = Indicator Variable for Condition (Good, Not Good)
Example: Including Condition in
Regression for House Selling Price
1. Find and plot the lines showing how predicted selling
price varies as a function of house size, for homes in
good condition or not in good condition.
2. Interpret the coefficient of the indicator variable for
condition.
Example: Including Condition in
Regression for House Selling Price
The regression equation from the MINITAB output is:
ŷ = 96,271 + 66.5x₁ + 12,927x₂
Example: Including Condition in
Regression for House Selling Price
• For homes not in good condition, x₂ = 0.
• The prediction equation then simplifies to:
ŷ = 96,271 + 66.5x₁ + 12,927(0) = 96,271 + 66.5x₁
Example: Including Condition in
Regression for House Selling Price
• For homes in good condition, x₂ = 1.
• The prediction equation then simplifies to:
ŷ = 96,271 + 66.5x₁ + 12,927(1) = 109,198 + 66.5x₁
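A small sketch showing how the indicator shifts the line's intercept while leaving the slope at 66.5 (the function name is ours):

```python
def predicted_price(size_sqft, good_condition):
    x2 = 1 if good_condition else 0  # indicator variable for condition
    return 96271 + 66.5 * size_sqft + 12927 * x2

for size in (1000, 1500, 2000):
    not_good = predicted_price(size, False)
    good = predicted_price(size, True)
    print(f"size {size}: not good ${not_good:,.0f}, good ${good:,.0f}")
    # The gap is always $12,927, the coefficient of the indicator.
```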
Example: Including Condition in
Regression for House Selling Price
Figure 13.7 Plot of Equation Relating ŷ = Predicted Selling Price to x₁ = House Size, According to x₂ = Condition (1 = Good, 0 = Not Good). Question: Why are the lines parallel?
Example: Including Condition in
Regression for House Selling Price
Both lines have the same slope, 66.5.
• The line for homes in good condition is above the other line (not good) because its y-intercept is larger. This means that for any fixed value of house size, the predicted selling price is higher for homes in better condition.
• The P-value of 0.453 for the test for the coefficient of the indicator variable suggests that this difference is not statistically significant.
Is there Interaction?
For two explanatory variables, interaction exists between them in their effects on the response variable when the slope of the relationship between μ_y and one of them changes as the value of the other changes.
Example: Interaction in effects on House
Selling Price
Suppose the actual population relationship between
house size and the mean selling price is:
μ_y = 100,000 + 50x₁ for homes in good condition
μ_y = 80,000 + 35x₁ for homes not in good condition
• Then the slope for the effect of x₁ differs for the two conditions. There is then interaction between house size and condition in their effects on selling price. See Figure 13.8 on the next slide.
Example: Interaction in effects on House
Selling Price
Figure 13.8 An Example of Interaction. There’s a larger slope between selling price
and house size for homes in good condition than in other conditions.
Example: Interaction in effects on House
Selling Price
How can you allow for interaction when you do a
regression analysis?
• To allow for interaction with two explanatory variables, one quantitative and one categorical, you can fit a separate regression line, with a different slope, between y and the quantitative variable for each category of the categorical variable.
Chapter 13
Multiple Regression
Section 13.6
Modeling a Categorical Response
Modeling a Categorical Response Variable
The regression models studied so far are designed for a
quantitative response variable y.
When y is categorical, a different regression model
applies, called logistic regression.
Examples of Logistic Regression
• A voter's choice in an election (Democrat or Republican), with explanatory variables: annual income, political ideology, religious affiliation, and race.
• Whether a credit card holder pays their bill on time (yes or no), with explanatory variables: family income and the number of months in the past year that the customer paid the bill on time.
The Logistic Regression Model
• Denote the possible outcomes for y as 0 and 1.
• Use the generic terms failure (for outcome = 0) and success (for outcome = 1).
• The population mean of the scores equals the population proportion of '1' outcomes (successes). That is, μ_y = p.
• The proportion, p, also represents the probability that a randomly selected subject has a successful outcome.
The Logistic Regression Model
• The straight-line model is usually inadequate for a binary response variable.
• A more realistic model has a curved S-shape instead of a straight-line trend.
• The regression equation that best models this S-shaped curve is known as the logistic regression equation.
The Logistic Regression Model
Figure 13.10 Two Possible Regressions for a Probability p of a Binary Response
Variable. A straight line is usually less appropriate than an S-shaped curve. Question:
Why is the straight-line regression model for a binary response variable often poor?
The Logistic Regression Model
A regression equation for an S-shaped curve for the probability of success p is:
p = e^(α + βx) / (1 + e^(α + βx))
This equation for p is called the logistic regression equation. Logistic regression is used when the response variable has only two possible outcomes (it's binary).
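The S-shaped curve is easy to code directly (a generic sketch; alpha and beta stand for whatever estimates a fit produces):

```python
from math import exp

def logistic(x, alpha, beta):
    """Logistic regression curve: probability of success at x."""
    z = alpha + beta * x
    return exp(z) / (1 + exp(z))    # always between 0 and 1

print(logistic(0, alpha=0.0, beta=1.0))  # 0.5 at the curve's midpoint
```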
Example: Travel Credit Cards
An Italian study with 100 randomly selected Italian adults
considered factors that are associated with whether a
person possesses at least one travel credit card.
Table 13.12 on the next slide shows results for the first 15 people on this response variable and on the person's annual income (in thousands of euros).
Example: Travel Credit Cards
Table 13.12 Annual Income (in thousands of euros) and Whether Possess a
Travel Credit Card. The response y equals 1 if a person has a travel credit card and
equals 0 otherwise.
Example: Travel Credit Cards
Let x = annual income and let y = whether the person
possesses a travel credit card (1 = yes, 0 = no).
Table 13.13 shows what software provides for
conducting a logistic regression analysis.
Table 13.13 Results of Logistic Regression for Italian Credit Card Data
Example: Travel Credit Cards
Substituting the α and β estimates into the logistic regression model formula yields:
p̂ = e^(−3.52 + 0.105x) / (1 + e^(−3.52 + 0.105x))
Example: Travel Credit Cards
Find the estimated probability of possessing a travel
credit card at the lowest and highest annual income
levels in the sample, which were x = 12 and x = 65.
Example: Travel Credit Cards
For x = 12 thousand euros, the estimated probability of
possessing a travel credit card is:
3.52 0.105 ( 12 )
2.26
e
e
pˆ 

1 e
1 e
0.104

 0.09
1.104
3.521.05 ( 12 )
117
Copyright © 2013, 2009, and 2007, Pearson Education, Inc.
 2.26
Example: Travel Credit Cards
For x = 65 thousand euros, the estimated probability of
possessing a travel credit card is:
3.52 0.105 ( 65 )
3.305
e
e
pˆ 

1 e
1 e
27.2485

 0.97
28.2485
3.521.05 ( 65 )
118
Copyright © 2013, 2009, and 2007, Pearson Education, Inc.
3.305
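Both probabilities follow from one small function (a checking sketch using the fitted estimates −3.52 and 0.105):

```python
from math import exp

def p_hat(income):
    z = -3.52 + 0.105 * income       # fitted linear predictor
    return exp(z) / (1 + exp(z))

print(f"x = 12: {p_hat(12):.2f}")    # about 0.09
print(f"x = 65: {p_hat(65):.2f}")    # about 0.97
```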
Example: Travel Credit Cards
Insight:
• Annual income has a strong positive effect on having a travel credit card.
• The estimated probability of having a travel credit card changes from 0.09 to 0.97 as annual income changes over its range.
Example: Estimating Proportion of
Students Who’ve Used Marijuana
A three-variable contingency table from a survey of
senior high-school students is shown on the next slide.
The students were asked whether they had ever used alcohol, cigarettes, or marijuana.
We'll treat marijuana use as the response variable and cigarette use and alcohol use as explanatory variables.
Example: Estimating Proportion of
Students Who’ve Used Marijuana
Table 13.14 Alcohol, Cigarette, and Marijuana Use for High School Seniors
Example: Estimating Proportion of
Students Who’ve Used Marijuana
• Let y indicate marijuana use, coded (1 = yes, 0 = no).
• Let x₁ be an indicator variable for alcohol use, coded (1 = yes, 0 = no).
• Let x₂ be an indicator variable for cigarette use, coded (1 = yes, 0 = no).
Example: Estimating Proportion of
Students Who’ve Used Marijuana
Table 13.15 MINITAB Output for Estimating the Probability of Marijuana Use
Based on Alcohol Use and Cigarette Use
Example: Estimating Proportion of
Students Who’ve Used Marijuana
The logistic regression prediction equation is:
p̂ = e^(−5.31 + 2.99x₁ + 2.85x₂) / (1 + e^(−5.31 + 2.99x₁ + 2.85x₂))
Example: Estimating Proportion of
Students Who’ve Used Marijuana
For those who have not used alcohol or cigarettes, x₁ = x₂ = 0. For them, the estimated probability of marijuana use is:
p̂ = e^(−5.31 + 2.99(0) + 2.85(0)) / (1 + e^(−5.31 + 2.99(0) + 2.85(0))) = 0.005
Example: Estimating Proportion of
Students Who’ve Used Marijuana
For those who have used both alcohol and cigarettes, x₁ = x₂ = 1. For them, the estimated probability of marijuana use is:
p̂ = e^(−5.31 + 2.99(1) + 2.85(1)) / (1 + e^(−5.31 + 2.99(1) + 2.85(1))) = 0.629
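A sketch that evaluates the fitted equation at all four indicator combinations:

```python
from math import exp

def p_hat(x1, x2):
    """Estimated probability of marijuana use given the two indicators."""
    z = -5.31 + 2.99 * x1 + 2.85 * x2
    return exp(z) / (1 + exp(z))

for x1 in (0, 1):          # alcohol use
    for x2 in (0, 1):      # cigarette use
        print(f"alcohol={x1}, cigarettes={x2}: p-hat = {p_hat(x1, x2):.3f}")
# Extremes: 0.005 for neither, 0.629 for both.
```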
Example: Estimating Proportion of
Students Who’ve Used Marijuana
SUMMARY: The probability that students have tried
marijuana seems to depend greatly on whether they’ve
used alcohol and/or cigarettes.