Week 13
November 24-28
Three Mini-Lectures
QMM 510
Fall 2014
ML 13.1
What Is Multicollinearity?
• Multicollinearity occurs when the independent variables X1, X2, …, Xm are intercorrelated instead of being independent.
• Collinearity is the special case in which only two predictors are correlated.
• The degree of multicollinearity is the real concern.
Variance Inflation
• Multicollinearity induces variance inflation in the estimation of the regression model coefficients.
• This results in wider confidence intervals for the true coefficients b1, b2, …, bk and makes the t statistic less reliable.
• The separate contribution of each predictor in “explaining” the response variable is difficult to identify.
Correlation Matrix
To check whether two predictors are correlated (collinearity), inspect the correlation matrix using Excel, MegaStat, or MINITAB. [Example correlation matrix not reproduced here.]
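The same check is easy to script outside the tools named above. A minimal sketch in Python (pandas), not MegaStat output; the column names and data are hypothetical:

import numpy as np
import pandas as pd

# Hypothetical predictors: X3 is deliberately built from X1, so that
# pair is strongly correlated, while X2 is independent of both.
rng = np.random.default_rng(seed=1)
df = pd.DataFrame({"X1": rng.normal(size=50),
                   "X2": rng.normal(size=50)})
df["X3"] = 0.9 * df["X1"] + 0.1 * rng.normal(size=50)

# Pairwise Pearson correlations among the predictors; collinearity
# shows up as off-diagonal values near +1 or -1.
print(df.corr().round(2))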
Variance Inflation Factor (VIF)
• The matrix scatter plots and correlation matrix only show correlations between any two predictors.
• The variance inflation factor (VIF) is a more comprehensive test for multicollinearity.
• For a given predictor j, the VIF is defined as
      VIFj = 1 / (1 − Rj²)
  where Rj² is the coefficient of determination when predictor j is regressed against all other predictors.
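As a sketch of that definition (assuming the same kind of hypothetical data as above), each VIFj can be computed by regressing predictor j on the others; statsmodels packages the calculation as variance_inflation_factor:

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical predictor matrix; X3 is nearly a copy of X1.
rng = np.random.default_rng(seed=1)
X = pd.DataFrame({"X1": rng.normal(size=50),
                  "X2": rng.normal(size=50)})
X["X3"] = 0.9 * X["X1"] + 0.1 * rng.normal(size=50)

# VIFj = 1/(1 - Rj^2), where Rj^2 comes from regressing predictor j
# on all the other predictors (plus an intercept).
exog = sm.add_constant(X)
for j, name in enumerate(X.columns, start=1):  # index 0 is the constant
    print(name, round(variance_inflation_factor(exog.values, j), 2))

Expect large VIFs for X1 and X3 (they are nearly collinear) and a VIF near 1 for X2.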
Variance Inflation Factor (VIF)
Some possible situations (computed from VIFj = 1/(1 − Rj²)) are:
   Rj² = 0    →  VIFj = 1    (no multicollinearity)
   Rj² = .50  →  VIFj = 2
   Rj² = .90  →  VIFj = 10
   Rj² = .99  →  VIFj = 100
Rules of Thumb
• There is no limit on the magnitude of the VIF.
• A VIF of 10 says that the other predictors “explain” 90% of the variation in predictor j.
• A high VIF indicates that predictor j is strongly related to the other predictors.
• However, a high VIF is not necessarily indicative of instability in the least squares estimates.
• A large VIF is a warning to consider whether predictor j really belongs in the model.
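The “90%” rule of thumb is just the VIF formula inverted: Rj² = 1 − 1/VIFj. A quick check:

# Rj^2 implied by a given VIF, from Rj^2 = 1 - 1/VIF
for vif in (1, 2, 10, 100):
    print(vif, "->", 1 - 1 / vif)  # VIF = 10 gives Rj^2 = 0.90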
Are Coefficients Stable?
• Evidence of instability is when X1 and X2 have a high pairwise correlation with Y, yet one or both predictors have insignificant t statistics in the fitted multiple regression, and/or when X1 and X2 are positively correlated with Y, yet one has a negative slope in the multiple regression.
• As a test, try dropping a collinear predictor from the regression and see what happens to the fitted coefficients in the re-estimated model, as in the sketch below.
• If they don’t change much, then multicollinearity is not a concern.
• If there are sharp changes in one or more of the remaining coefficients in the model, then multicollinearity may be causing instability.
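A minimal sketch of the drop-one check (statsmodels OLS; the data and names are hypothetical):

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical collinear data: X2 is almost a copy of X1, and only X1
# actually drives Y.
rng = np.random.default_rng(seed=2)
n = 50
x1 = rng.normal(size=n)
x2 = 0.95 * x1 + 0.05 * rng.normal(size=n)
y = 3 + 2 * x1 + rng.normal(size=n)

X_full = sm.add_constant(pd.DataFrame({"X1": x1, "X2": x2}))
X_drop = sm.add_constant(pd.DataFrame({"X1": x1}))

# Compare fitted coefficients with and without the collinear predictor;
# sharp changes suggest multicollinearity is causing instability.
print(sm.OLS(y, X_full).fit().params)
print(sm.OLS(y, X_drop).fit().params)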
ML 13.2
Three Important Assumptions
1. The errors are normally distributed.
2. The errors have constant variance (i.e., they are homoscedastic).
3. The errors are independent (i.e., they are nonautocorrelated).
Note: Everything you learned about residual tests in Chapter 12 is still true, except that the residuals are now based on multiple predictors (k ≥ 2). This may affect the degrees of freedom in some tests, but the concepts are basically the same.
Non-Normal Errors
• Non-normality of errors is a mild violation, since the parameter estimates and their variances remain unbiased and consistent.
• Confidence intervals for the parameters may be untrustworthy, because the normality assumption is used to justify using Student’s t.
Tests for Non-Normal Errors
• Quick test: Make a histogram of residuals. Is it bell-shaped? Are there outliers?
• For a more precise test, use the normal probability plot. If H0 is true, the normal probability plot should be linear.
   H0: Errors are normally distributed
   H1: Errors are not normally distributed
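A sketch of both tests in Python (scipy/matplotlib) rather than MegaStat; resid below is simulated as a stand-in for the residuals of a fitted model:

import numpy as np
import scipy.stats as st
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=3)
resid = rng.normal(size=50)  # stand-in for model residuals

fig, (ax1, ax2) = plt.subplots(1, 2)
ax1.hist(resid, bins=10)                    # quick test: bell-shaped?
st.probplot(resid, dist="norm", plot=ax2)   # should be close to a line
plt.show()

# A formal test of H0: errors are normally distributed.
stat, p = st.shapiro(resid)
print("Shapiro-Wilk p-value:", round(p, 3))  # small p -> reject H0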
Example: Normality Test (n = 50, k = 7)
MegaStat’s normal probability plot is somewhat linear, but has some weird values at either end. Possible non-normal residuals?
But a histogram of residuals from Minitab’s Stat > Graphical Summary is arguably normal in shape. From its p-value (.24) we would not reject normality. Note that the mean of the residuals is zero, as it must be.
Heteroscedastic Errors (Nonconstant Variance)
• Ideally, the error magnitude is constant (i.e., homoscedastic errors). Heteroscedastic errors increase or decrease with X.
• In multiple regression, we have several X’s, so for simplicity we often just plot the n residuals against the fitted Y values.
• In the most common form of heteroscedasticity, the variances of the estimators may be understated, the t statistics overstated, and confidence intervals artificially narrow.
Tests for Heteroscedasticity
• Plot the residuals against Yfitted or against each of the k predictors. Ideally, there is no pattern in the residuals moving from left to right.
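A sketch of that plot, with simulated values exhibiting the classic “fan” pattern (residual spread growing with the fitted value):

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical fitted values and residuals whose variance increases
# with Y-hat, i.e., heteroscedastic errors.
rng = np.random.default_rng(seed=4)
y_hat = np.linspace(1, 10, 50)
resid = rng.normal(scale=0.3 * y_hat)

plt.scatter(y_hat, resid)
plt.axhline(0)               # ideally, random scatter about zero
plt.xlabel("fitted Y")
plt.ylabel("residual")
plt.show()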
Example: Regression with 7 Predictors
• Plots of residuals against Yfitted (shown in blue on the slide) or against each of the k predictors (shown in pink) show evidence of heteroscedasticity. Note that one predictor was binary (X = 0 or 1). [Residual plots not reproduced here.]
Autocorrelated Errors
• Autocorrelation is a pattern of non-independent errors.
• It is of more concern in time series data, because the natural order of the data is meaningful (whereas in cross-sectional data the order of observations is often alphabetical or randomized).
• In first-order autocorrelation, et is correlated with et−1.
• The estimated variances of the OLS estimators are biased, resulting in confidence intervals that are too narrow, overstating the model’s fit.
Runs Test for Autocorrelation
• Look at a plot of residuals over time (or by observation). Count the number of sign reversals (i.e., how often does the residual cross the zero centerline?).
• If the pattern is random, the number of sign changes should be near n/2.
• Fewer than n/2 would suggest positive autocorrelation.
• More than n/2 would suggest negative autocorrelation.
Durbin-Watson (DW) Test
• Tests for autocorrelation under the hypotheses
   H0: Errors are non-autocorrelated
   H1: Errors are autocorrelated
• The DW statistic ranges from 0 to 4:
   DW < 2 suggests positive autocorrelation
   DW = 2 suggests no autocorrelation (ideal)
   DW > 2 suggests negative autocorrelation
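Both checks can be scripted. A sketch assuming resid holds the residuals in time order (statsmodels supplies durbin_watson; the data here are simulated):

import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(seed=5)
resid = rng.normal(size=50)  # stand-in for time-ordered residuals

# Runs test idea: count sign changes; expect about n/2 if random.
signs = np.sign(resid)
print("sign changes:", np.sum(signs[1:] != signs[:-1]),
      "vs n/2 =", len(resid) / 2)

# DW = sum((e_t - e_{t-1})^2) / sum(e_t^2); near 2 means no
# autocorrelation, < 2 positive, > 2 negative.
print("Durbin-Watson:", round(durbin_watson(resid), 2))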
Example: Autocorrelation Test (n = 50, k = 7)
• Here is MegaStat’s plot of residuals by observation (n = 50). Count the number of sign reversals (i.e., how often does the residual cross the zero centerline). If the pattern is random, the number of sign changes should be near n/2 and the Durbin-Watson statistic should be near 2.
• The DW statistic is near 2, so there is not much evidence of autocorrelation.
• There are 28 sign changes (close to n/2 = 50/2 = 25), so again there is not much evidence of autocorrelation.
Example: Excel’s Tests of Assumptions
Excel’s Data Analysis > Regression does residual plots (a test for heteroscedasticity) and gives the DW test statistic. Excel’s standardized residuals are done in a strange way, but usually they are not misleading.
Warning: Excel offers normal probability plots for residuals, but they are done incorrectly.
Example: MegaStat’s Tests of Assumptions
MegaStat will do all three tests (if you check the boxes). Its runs plot (residuals by observation) is a visual test for autocorrelation, which Excel does not offer.
ML 13.3
Other Regression Topics
• Outliers? (omit only if clearly errors)
• Missing Predictors? (usually you can’t tell)
• Ill-Conditioned Data (adjust decimals or take logs)
• Significance in Large Samples? (if n is huge, any regression will be significant)
• Model Specification Errors? (may show up in residual patterns)
• Missing Data? (we may have to live without it)
• Binary Response? (if Y = 0, 1 we use logistic regression)
• Stepwise and Best Subsets Regression (MegaStat does these)
Tests for Nonlinearity
• Sometimes the effect of a predictor is nonlinear.
• A simple example would be estimating the volume of lumber to be obtained from a tree.
• To test for suspected nonlinearity of any predictor variable, we can include its square in the regression.
Tests for Interaction
• We can test for interaction between two predictors by including their product in the regression, giving a model with an x1x2 interaction term:
   y = b0 + b1x1 + b2x2 + b3x1x2 + ε
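A sketch of both tests with statsmodels: add x1² to probe nonlinearity and x1·x2 to probe interaction, then inspect the t statistics of the added terms (the data and names are hypothetical):

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(seed=6)
n = 50
x1, x2 = rng.normal(size=n), rng.normal(size=n)
y = 1 + 2 * x1 + 0.5 * x1**2 + rng.normal(size=n)  # truly nonlinear in x1

X = pd.DataFrame({"x1": x1, "x2": x2,
                  "x1_sq": x1**2,     # squared term: test for nonlinearity
                  "x1_x2": x1 * x2})  # product term: test for interaction
fit = sm.OLS(y, sm.add_constant(X)).fit()

# A significant t statistic on x1_sq or x1_x2 suggests the nonlinear or
# interaction effect is real.
print(fit.tvalues.round(2))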
Stepwise Regression
Example: MegaStat
Caution: This is basically a data-mining tool that looks only at fit (not at causal logic). Use it only as a check.
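MegaStat automates the search. As a rough illustration of the idea only (not MegaStat’s algorithm), here is a forward-selection sketch that greedily adds whichever predictor most improves R², which is exactly why it should be used only as a check:

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical data: only x1 and x3 actually drive Y.
rng = np.random.default_rng(seed=7)
n = 50
X = pd.DataFrame(rng.normal(size=(n, 4)), columns=["x1", "x2", "x3", "x4"])
y = 2 * X["x1"] - X["x3"] + rng.normal(size=n)

chosen, remaining, best_r2 = [], list(X.columns), 0.0
while remaining:
    # Try adding each remaining predictor; keep the best improver.
    r2 = {c: sm.OLS(y, sm.add_constant(X[chosen + [c]])).fit().rsquared
          for c in remaining}
    best = max(r2, key=r2.get)
    if r2[best] - best_r2 < 0.01:  # stop when the gain is negligible
        break
    chosen.append(best)
    remaining.remove(best)
    best_r2 = r2[best]

print("selected:", chosen)  # chosen purely on fit, not causal logic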
Chapter Summary
[Summary slides not reproduced here.]