Multiple Regression Analysis: Further Issues
Chapter 6
Wooldridge: Introductory Econometrics: A Modern Approach, 5e
© 2013 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part.

Announcements
• Feedback on project proposals: via Piazza and your partner's worksheet
• SOURCE program applications due tomorrow
• Mid-term course evaluations today
• Due after Spring break: next homework, includes paper "Do Lenders Favor Politically Connected Firms?"

Example: Do larger firms spend more on Research & Development?
• 32 firms observed
• rdintens is R&D spending as percent of sales. Mean = 3
• sales measured in millions. Note the one very high outlier.
Research Question: Does having higher sales enable a firm to spend more on R&D?

Predicted values (in red) based on a simple linear regression (SLR) of rdintens on sales.
Note: the coefficient on sales is not statistically significant. We cannot reject the null hypothesis that β_sales equals zero.

    reg rdintens sales
    predict pred_rdintens_linear
    twoway (scatter rdintens sales) (scatter pred_rdintens_linear sales)

Now we try the quadratic specification: What's different?
How do we test for the joint significance of sales and salessq?
Recall: F-test for overall significance of a regression (since sales and salessq are the only explanatory variables in this model!*). This F stat is reported by Stata automatically.
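The joint-significance F test described above can also be checked by hand. The following is a minimal Python sketch, not Stata's own routine, assuming the standard R-squared form of the F statistic, F = [(R²_ur − R²_r)/q] / [(1 − R²_ur)/(n − k − 1)], where k counts the regressors in the unrestricted model and q counts the restrictions. The function names and the R-squared value in the demo are hypothetical; only n = 32 and k = 2 come from the example.

```python
def f_stat_from_r2(r2_unrestricted, r2_restricted, n, k, q):
    """R-squared form of the F statistic for q exclusion restrictions.

    n: number of observations
    k: number of regressors in the unrestricted model (intercept not counted)
    q: number of regressors dropped in the restricted model
    Both models must have the same dependent variable.
    """
    df_denominator = n - k - 1
    if df_denominator <= 0 or q <= 0:
        raise ValueError("need n - k - 1 > 0 and q > 0")
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1.0 - r2_unrestricted) / df_denominator
    return numerator / denominator


def overall_f_stat(r2, n, k):
    """Overall significance: the restricted model has only an intercept,
    so its R-squared is zero and all k regressors are tested (q = k)."""
    return f_stat_from_r2(r2, 0.0, n, k, q=k)


if __name__ == "__main__":
    # n = 32 firms and k = 2 (sales, salessq) follow the example;
    # the R-squared of 0.15 is a made-up placeholder, not a reported result.
    print(overall_f_stat(0.15, n=32, k=2))
```

With only sales and salessq in the model, the overall F statistic and the joint test of the two coefficients coincide, which is why the automatically reported value can be used here.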
*If we also had other explanatory variables, then we would do an F test where the restricted model removed sales and salessq from the regression.

[Figure: predictions from the quadratic specification (green) vs. predictions from the linear specification (red)]

More on goodness-of-fit and selection of regressors
General remarks on R-squared:
• In economics, the main goal of regression analysis is to credibly identify causal effects.
• A high R-squared does not imply that there is a causal interpretation.
• A low R-squared does not preclude precise estimation of partial effects.
However, when trying to identify the best functional form, it can be helpful to compare measures of goodness-of-fit.
E.g., comparing a model with log(sales) vs. sales as the explanatory variable:

    reg rdintens sales
    reg rdintens lsales

[Stata output for each regression]

Question: What if we want to compare the model with log(sales) as regressor against a model with a quadratic functional form (sales and sales-squared as regressors)?
Why might we not want to use R-squared?
• Because R-squared never decreases when we add more regressors, and it almost always increases.
• It wouldn't really be a "fair" comparison, since the quadratic regression has more regressors.
• Instead, we use a measure called "adjusted R-squared" that takes into account how many regressors are in the model.
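The penalty for extra regressors can be illustrated numerically. This is a minimal Python sketch, assuming the standard relationship adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1), with n observations and k regressors. The function name and the two R-squared values in the demo are hypothetical placeholders; only n = 32 comes from the example.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared from the ordinary R-squared.

    n: number of observations
    k: number of regressors (intercept not counted)
    """
    df_resid = n - k - 1
    if df_resid <= 0:
        raise ValueError("need n - k - 1 > 0")
    return 1.0 - (1.0 - r2) * (n - 1) / df_resid


if __name__ == "__main__":
    n = 32  # number of firms in the R&D example
    # Placeholder R-squared values, NOT the estimates from the lecture:
    r2_log_model = 0.06   # one regressor: log(sales)
    r2_quadratic = 0.15   # two regressors: sales, salessq
    print("log model:      ", adjusted_r2(r2_log_model, n, k=1))
    print("quadratic model:", adjusted_r2(r2_quadratic, n, k=2))
```

Because the correction factor (n − 1)/(n − k − 1) grows with k, a model with more regressors needs a larger raw R-squared to reach the same adjusted value.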
Adjusted R-squared (cont.)
• Correct for degrees of freedom in the numerator and denominator. A better estimate taking into account degrees of freedom would be
  $$\bar{R}^2 = 1 - \frac{SSR/(n-k-1)}{SST/(n-1)}$$
  where n is the number of observations and k is the number of regressors.
• The adjusted R-squared imposes a penalty for adding new regressors.
• The adjusted R-squared increases if, and only if, the t-statistic of a newly added regressor is greater than one in absolute value.
• How to calculate adjusted R-squared from R-squared:
  $$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k-1}$$
• The adjusted R-squared may even be negative.

Using adjusted R-squared to choose between nonnested models
• Models are nonnested if neither model is a special case of the other.
• A comparison between the R-squared of both models would be unfair to the first model (the one with log(sales) as its only regressor) because the first model contains fewer parameters.
• In this example, even after adjusting for the difference in degrees of freedom, the quadratic model is still preferred.

Comparing models with different dependent variables
CAUTION: R-squared or adjusted R-squared must not be used to compare models which differ in their definition of the dependent variable.
Example: CEO compensation and firm performance
• There is much less variation in log(salary) that needs to be explained than in salary.

Predicting y when log(y) is the dependent variable
With the model
  $$\log(y) = \beta_0 + \beta_1 x_1 + \dots + \beta_k x_k + u$$
and under the additional assumption that u is independent of x_1, ..., x_k:
  $$E(y \mid x) = \exp(\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k)\, E(\exp(u))$$
Prediction for y:
  $$\hat{y} = \exp(\hat{\beta}_0 + \hat{\beta}_1 x_1 + \dots + \hat{\beta}_k x_k) \cdot \Big( n^{-1} \sum_{i=1}^{n} \exp(\hat{u}_i) \Big)$$

Comparing R-squared of a logged and an unlogged specification
• These are the R-squareds for the predictions of the unlogged salary variable (although the second regression is originally for logged salaries).
• Both R-squareds can now be directly compared.
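The two steps above, predicting y from a log(y) regression and then computing a comparable R-squared, can be sketched in Python. This is a sketch under stated assumptions, not the lecture's own code: it assumes the prediction is exp(fitted log y) scaled by the sample mean of exp(residual), and that the comparable R-squared is the squared correlation between y and its prediction. The function names are hypothetical, and the fitted values and residuals are taken as given (for instance, exported from Stata after the log regression).

```python
import math


def predict_level(log_fitted, log_residuals):
    """Predict y in levels from a regression of log(y).

    log_fitted:    fitted values of log(y)
    log_residuals: residuals from the same regression
    The scale factor is the sample mean of exp(residual).
    """
    scale = sum(math.exp(u) for u in log_residuals) / len(log_residuals)
    return [scale * math.exp(f) for f in log_fitted]


def squared_correlation(y, y_hat):
    """Squared sample correlation between y and its predictions,
    usable as an R-squared for y in levels."""
    n = len(y)
    mean_y = sum(y) / n
    mean_hat = sum(y_hat) / n
    cov = sum((a - mean_y) * (b - mean_hat) for a, b in zip(y, y_hat))
    var_y = sum((a - mean_y) ** 2 for a in y)
    var_hat = sum((b - mean_hat) ** 2 for b in y_hat)
    return cov * cov / (var_y * var_hat)


if __name__ == "__main__":
    # Made-up fitted values and residuals, for illustration only.
    log_fitted = [6.0, 6.5, 7.0, 7.5]
    log_residuals = [0.2, -0.3, 0.1, 0.0]
    y = [math.exp(f + u) for f, u in zip(log_fitted, log_residuals)]
    y_hat = predict_level(log_fitted, log_residuals)
    print(squared_correlation(y, y_hat))
```

Simply exponentiating the fitted log values would ignore the scale factor and tend to underpredict y, since the mean of exp(residual) exceeds one whenever the residuals average zero and are not all zero.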