Multiple Regression Analysis: Further Issues
Chapter 6
Wooldridge: Introductory Econometrics: A Modern Approach, 5e
© 2013 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.
Announcements
• Feedback on project proposals: via Piazza and your partner’s worksheet.
• SOURCE program applications are due tomorrow.
• Mid-term course evaluations today.
• Due after Spring break: the next homework, which includes the paper “Do Lenders Favor Politically Connected Firms?”
Example: Do larger firms spend more on Research & Development?
• 32 firms observed
• rdintens is R&D spending as a percent of sales. Mean = 3
• sales measured in millions. Note the one very high outlier.
Research Question: Does having higher sales enable a firm to spend more on R&D?
Predicted values (in red) based on a simple linear regression (SLR) of rdintens on sales.
Note: the coefficient on sales is not statistically significant. We cannot reject the null hypothesis that $\beta_{sales}$ equals zero.
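* Simple linear regression of R&D intensity on sales
reg rdintens sales
* Store the fitted values (predict defaults to the linear prediction)
predict pred_rdintens_linear
* Plot actual and fitted values against sales
twoway (scatter rdintens sales) (scatter pred_rdintens_linear sales)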
Now we try the quadratic specification:
What’s different?
How do we test for the joint significance of sales and salessq?
Recall: the F-test for overall significance of a regression applies here, since sales and salessq are the only explanatory variables in this model.* Stata reports this F statistic automatically.
*If we also had other explanatory variables, we would do an F-test in which the restricted model removes sales and salessq from the regression.
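To make the arithmetic behind these two F-tests concrete, here is a minimal sketch using the R-squared form of the F statistic. The slides use Stata; plain Python is used here only to spell out the formula. The function names and the R-squared value of 0.15 are hypothetical and chosen for illustration, not taken from the Stata output.

```python
def f_stat_overall(r2, n, k):
    """F statistic for H0: all k slope coefficients are zero (R-squared form)."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def f_stat_exclusion(r2_ur, r2_r, n, k, q):
    """F statistic for dropping q regressors from an unrestricted model
    with k regressors; r2_ur and r2_r are the unrestricted and restricted
    R-squareds (same dependent variable in both models)."""
    return ((r2_ur - r2_r) / q) / ((1.0 - r2_ur) / (n - k - 1))


# Hypothetical values: 32 firms, 2 regressors (sales, salessq), R-squared 0.15.
n, k, r2 = 32, 2, 0.15
f_overall = f_stat_overall(r2, n, k)

# With no other regressors, the restricted model contains only a constant,
# so its R-squared is 0 and q = k: the exclusion test is the overall test.
f_excl = f_stat_exclusion(r2, 0.0, n, k, q=2)

print(round(f_overall, 3), round(f_excl, 3))
```

The resulting statistic would be compared with a critical value from the F distribution with (2, 29) degrees of freedom in this hypothetical setup.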
Predictions from the quadratic specification (green) vs. predictions from the linear specification (red):
More on goodness-of-fit and selection of regressors
General remarks on R-squared: In economics, the main goal of regression analysis is to credibly identify causal effects.
• A high R-squared does not imply that there is a causal interpretation.
• A low R-squared does not preclude precise estimation of partial effects.
However, when trying to identify the best functional form, it can be helpful to compare measures of goodness-of-fit.
E.g., comparing a model with log(sales) vs. sales as the explanatory variable:
reg rdintens sales
reg rdintens lsales
Question: What if we want to compare the model with log(sales) as the regressor against a model with a quadratic functional form (sales and sales-squared as regressors)?
Why might we not want to use R-squared?
Because R-squared never decreases, and almost always increases, when we add more regressors. It wouldn’t really be a “fair” comparison, since the quadratic regression has more regressors.
Instead, we use a measure called “adjusted R-squared” that takes into account how many regressors are in the model.
Adjusted R-squared (cont.)
A better estimate, taking degrees of freedom into account, would be
$\bar{R}^2 = 1 - \dfrac{SSR/(n-k-1)}{SST/(n-1)}$
where n is the number of observations and k is the number of regressors. This corrects for degrees of freedom in the numerator and the denominator.
The adjusted R-squared imposes a penalty for adding new regressors.
The adjusted R-squared increases if, and only if, the t-statistic of a newly added regressor is greater than one in absolute value.
How to calculate adjusted R-squared from R-squared:
$\bar{R}^2 = 1 - (1 - R^2)\dfrac{n-1}{n-k-1}$
The adjusted R-squared may even be negative.
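As a small worked example of the conversion from R-squared to adjusted R-squared, the sketch below applies the standard relation 1 - (1 - R²)(n - 1)/(n - k - 1). The function name is hypothetical, and the R-squared values are assumed for illustration; they are not read off the slides.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared from R-squared, n observations and k regressors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


n = 32  # number of firms in the R&D example

# Illustrative R-squared values, assumed for this sketch.
adj_log = adjusted_r2(0.061, n, k=1)   # log(sales) model: one regressor
adj_quad = adjusted_r2(0.148, n, k=2)  # quadratic model: sales and salessq

# A weak fit with several regressors can push the measure below zero.
adj_negative = adjusted_r2(0.02, n, k=3)

print(round(adj_log, 3), round(adj_quad, 3), round(adj_negative, 3))
```

Under these assumed numbers the model with two regressors still has the higher adjusted R-squared, so the penalty for the extra regressor does not reverse the ranking.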
Using adjusted R-squared to choose between nonnested models
Models are nonnested if neither model is a special case of the other.
A comparison between the R-squared of both models would be unfair to the first model (the one with log(sales) as the regressor) because it contains fewer parameters.
In this example, even after adjusting for the difference in degrees of freedom, the quadratic model is still preferred.
Comparing models with different dependent variables
CAUTION: R-squared or adjusted R-squared must not be used to compare models that differ in their definition of the dependent variable.
Example: CEO compensation and firm performance
There is much less variation in log(salary) that needs to be explained than in salary.
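Predicting y when log(y) is the dependent variable
With $\log(y) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$, we have $y = \exp(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)\exp(u)$.
Under the additional assumption that $u$ is independent of $x_1, \dots, x_k$:
$E(y \mid x) = \exp(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)\,E(\exp(u))$
Prediction for y:
$\hat{y} = \exp(\hat{\beta}_0 + \hat{\beta}_1 x_1 + \dots + \hat{\beta}_k x_k)\left(n^{-1}\sum_{i=1}^{n}\exp(\hat{u}_i)\right)$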
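The prediction step can be sketched in a few lines. This is a minimal illustration, assuming a single regressor and made-up data; the helper `ols_simple` and all variable names are hypothetical. It estimates E(exp(u)) by the sample average of exp(residual) and uses it to rescale the naive prediction exp(fitted log(y)).

```python
import math


def ols_simple(x, y):
    """Intercept and slope from a simple OLS regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical toy data (not from the lecture): y must be strictly positive.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 2.6, 4.4, 4.9, 8.3, 9.0]

# Step 1: regress log(y) on x.
log_y = [math.log(v) for v in y]
b0, b1 = ols_simple(x, log_y)

# Step 2: fitted values and residuals on the log scale.
fitted_log = [b0 + b1 * xi for xi in x]
resid = [ly - f for ly, f in zip(log_y, fitted_log)]

# Step 3: estimate E(exp(u)) by the sample average of exp(residual).
alpha0_hat = sum(math.exp(u) for u in resid) / len(resid)

# Step 4: rescale the naive prediction exp(fitted log(y)).
naive = [math.exp(f) for f in fitted_log]
y_hat = [alpha0_hat * v for v in naive]

print(round(alpha0_hat, 4))
```

Because the OLS residuals average to zero and exp is convex, the estimated scale factor is at least one, so simply exponentiating the fitted log values would tend to underpredict y.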
Comparing R-squared of a logged and an unlogged specification
These are the R-squareds for the predictions of the unlogged salary variable (although the second regression is originally for logged salaries). Both R-squareds can now be directly compared.
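One common way to put both specifications on the same footing is to compute the squared correlation between actual salary and each model's prediction of salary in levels. For an OLS model with an intercept, this equals the usual R-squared. The sketch below assumes that approach; the function name and all numbers are made up for illustration and are not the lecture's results.

```python
def squared_corr(actual, predicted):
    """Squared sample correlation between actual and predicted values."""
    n = len(actual)
    a_bar = sum(actual) / n
    p_bar = sum(predicted) / n
    cov = sum((a - a_bar) * (p - p_bar) for a, p in zip(actual, predicted))
    var_a = sum((a - a_bar) ** 2 for a in actual)
    var_p = sum((p - p_bar) ** 2 for p in predicted)
    return cov ** 2 / (var_a * var_p)


# Hypothetical salaries and two hypothetical sets of salary predictions:
# one from a model in levels, one converted from a log(salary) model.
salary = [500.0, 800.0, 1200.0, 2000.0, 3500.0]
pred_level = [700.0, 1000.0, 1400.0, 2100.0, 2800.0]
pred_from_log = [520.0, 760.0, 1250.0, 2150.0, 3300.0]

r2_level = squared_corr(salary, pred_level)
r2_log = squared_corr(salary, pred_from_log)

# Multiplying the predictions by a positive constant, such as an estimate
# of E(exp(u)), leaves the squared correlation unchanged.
r2_log_scaled = squared_corr(salary, [1.1 * p for p in pred_from_log])

print(round(r2_level, 3), round(r2_log, 3))
```

Since both numbers measure how well each model predicts salary in levels, they can be compared directly, which is not true of the R-squareds reported by the two original regressions.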