Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Is economic freedom related to economic growth? It is an article of faith among supporters of capitalism: economic freedom leads to economic growth. The publication Economic Freedom of the World: 2003 Annual Report (James Gwartney and Robert Lawson with Neil Emerick, Economic Freedom of the World: 2003 Annual Report, Vancouver: The Fraser Institute, 2003; data retrieved from www.freetheworld.com) is devoted to the examination of economic freedom across the world (economic freedom being defined as freedom to trade (an open trade policy, with minimal barriers to imports and minimal subsidies to domestic industries), freedom to invest (few restrictions on foreign investment), freedom to operate a business without overly burdensome regulation, and secure property rights. The following quote from page xi of that publication summarizes the beliefs of the authors: “The trend toward [economic] liberalization, that is, continued undisturbed. This suggests it is anything but a passing fad, the artifact of some economic ‘bubble.’ Rather, it represents a deep worldwide consensus that the path to prosperity lies in the verities of open trade, sound money, international flows of goods and capital (and labor), market-determined prices, sensible regulation, and the protection of property rights.” Do the data support the existence of these benefits of economic freedom? This is obviously a complex issue, but regression methods can be useful in looking at the evidence. Consider the following plot of the 2000 per capita Gross Domestic Product (GDP, in 1995 U.S. dollars) versus the 2000 Economic Freedom score. Note, by the way, that by using data from 2000, we are avoiding the effects of the worldwide economic downturn that started in 2001. The data comprise a near-census of all of the countries of the world, so when we talk about predicting per capita GDP (for example), we are actually treating these data as a sample from a reasonably stable ongoing process. c 2008, Jeffrey S. Simonoff 1 The freedom score is on a scale from 1 to 5, with lower values corresponding to more economic freedom. Presumably we would expect an inverse relationship here, and that is what we see. On the other hand, it is distinctly nonlinear, as the following fitted line plot shows: c 2008, Jeffrey S. Simonoff 2 It is very apparent from this plot that a straight-line model is not appropriate for these data. We’ll address that point in a little while, but for now I’m going to ignore the obvious evidence, and fit a least squares linear regression model anyway. Here is regression output for this model. Regression Analysis: Per capita GDP versus 2000 freedom score The regression equation is Per capita GDP = 40114 - 10831 2000 freedom score 150 cases used 6 cases contain missing values Predictor Constant 2000 fre S = 8428 Coef 40114 -10831.3 SE Coef 3020 972.8 R-Sq = 45.6% c 2008, Jeffrey S. Simonoff T 13.28 -11.13 P 0.000 0.000 R-Sq(adj) = 45.2% 3 Analysis of Variance Source Regression Residual Error Total DF SS 1 8806128436 148 10513603863 149 19319732299 MS 8806128436 71037864 F 123.96 P 0.000 The t-test for the slope and the F -test are of course equivalent here, and they imply that there is a strongly statistically significant relationship between freedom score and per capita GDP, as the p-value is quite small. We see that the estimated slope is −10831. This says that the estimated decrease in per capita GDP for a one point increase in the freedom score is $10,831. What does this mean? A one point increase in the freedom score is a large one, as it corresponds to a full category difference in a five-point scale. The following output can help in interpreting the coefficient: Descriptive Statistics: 2000 freedom score Variable 2000 fre Variable 2000 fre N 153 SE Mean 0.0605 N* 3 Mean 3.0588 Minimum 1.3000 Median 3.0000 Maximum 5.0000 TrMean 3.0478 Q1 2.5000 StDev 0.7482 Q3 3.6000 As you can see, the mean and median score are right in the middle, at (about) 3. Roughly one-quarter of the countries fall .5 below this, and roughly one-quarter fall .5 above it, so a one-unit change in the freedom index corresponds, for example, to the difference between a country at the 25th percentile and one at the 75th percentile. A per capita GDP difference of over $10,000 certainly seems to be meaningful from a practical point of view. The intercept of 40114 is the estimated expected per capita GDP when the freedom score is 0; since this is impossible, this coefficient has no physical meaning. Does this model provide useful predictive power? The standard error of the estimate is $8428, implying (by the standard assumptions) that this model can predict per capita GDP to within ±(2)(8428) = ±$16856 roughly 95% of the time. Is this a noteworthy accomplishment? Consider the following output: c 2008, Jeffrey S. Simonoff 4 Descriptive Statistics: Per capita GDP Variable Per capi Variable Per capi N 153 SE Mean 914 N* 3 Mean 7250 Minimum 116 Median 1649 Maximum 55744 TrMean 5749 Q1 577 StDev 11309 Q3 7324 The range of per capita GDP values is more than $55,000, so this improvement in predictive power might be considered moderate. This is consistent with the R2 , which tells us that a bit less than half of the variability in per capita GDP is accounted for by economic freedom. Of course, in actual fact, all of this is useless, since the linear model is inappropriate here. The scatter plot made this obvious, but there are other clues also. The target variable is very long right-tailed: c 2008, Jeffrey S. Simonoff 5 This was obvious from the descriptive statistics (the mean is more than four times the median), and also could have been seen in the original scatter plot. Residual plots reinforce the problems: increasing variability of the residuals with the fitted values, and a long right tail: c 2008, Jeffrey S. Simonoff 6 All of this evidence is pointing us in one direction — to analyze these data in a logged scale. Here is a scatter plot of logged per capita GDP (logs base 10) versus economic freedom: c 2008, Jeffrey S. Simonoff 7 Now, that’s more like it! Here is a fitted line plot, followed by regression output: c 2008, Jeffrey S. Simonoff 8 Regression Analysis: Logged GDPpc versus 2000 freedom score The regression equation is Logged GDPpc = 5.66 - 0.764 2000 freedom score 150 cases used 6 cases contain missing values Predictor Constant 2000 fre Coef 5.6608 -0.76425 S = 0.4532 SE Coef 0.1624 0.05231 R-Sq = 59.1% T 34.86 -14.61 P 0.000 0.000 R-Sq(adj) = 58.8% Analysis of Variance Source Regression Residual Error Total DF 1 148 149 c 2008, Jeffrey S. Simonoff SS 43.842 30.394 74.237 MS 43.842 0.205 F 213.48 P 0.000 9 How do we interpret these results? The R2 tells us that the index accounts for almost 60% of the variability in logged per capita GDP, a reasonably large amount. Once again the intercept is meaningless, but what about the slope? The slope coefficient is −.764; this says that a one-unit increase in the economic freedom index is associated with an expected .764 unit decrease in the logged per capita GDP. We’ve already talked about what a oneunit increase in the index means, but what does a .764 unit decrease in logged per capita GDP mean? The key is to remember what logarithms mean. Recall, for example, that log10 10 = 1, and log10 100 = 2. The logs of these two numbers differs by 1, which is telling us that the numbers themselves differ by a multiplicative factor of 101 , or 10 (that is 100 is ten times 10). Two numbers whose logs differ by .764 differ by a multiplicative factor of 10.764 = 5.8; that is, one is 5.8 times the other. Equivalently, if one number has log .764 lower than another, it is a multiplicative factor of 10−.764 = .172 smaller. Of course, we see that the regression itself is highly statistically significant. So, what does our −.764 coefficient mean? It says that a one-unit increase in the economic freedom index is associated with multiplying the expected per capita GDP by 10−.764 , or .172; that is, a one-unit increase in the index is associated with an expected 82.8% decrease in per capita GDP. This is quite a lot! [What if our predictor had also been a logged variable? First, we can just recognize that the coefficients continue to mean what they always mean. Say the fitted model was log Y = 2.3 + .6 × log X (that is, a regression was fit the logarithm (base 10) of Y as the target, and the logarithm of X as the predictor, for some variables X and Y ). The intercept is the estimated expected value of the target (log Y ) when the predictor (log X) equals 0; that is, when X = 1 (since log 1 = 0). So, in this hypothetical case, the estimated Y when X = 1 is 102.3 = 199.5. Whether this means anything or not depends on whether X = 1 is a meaningful condition. The slope tells us the estimated change in log Y when log X increases by 1, but that just corresponds to X being multiplied by 10. Therefore, in this hypothetical case, multiplying X by 10 is associated with multiplying Y by 10.6 = 3.98. Of course, this is also an elasticity, which means that a 1% change in X is estimated to be associated with a .6% change in Y .] We need to adjust our thinking regarding the standard error of the estimate in the same way. It is .453, which we know means that we should be able to predict logged per capita GDP to within ±(2)(.4532) = ±.9064 roughly 95% of the time, but what does that c 2008, Jeffrey S. Simonoff 10 mean? We just use the same argument as before; since 10−.906 = .124 and 10.906 = 8.06, this standard error of the estimate says that knowing the economic freedom index allows us to predict per capita GDP to within a multiplicative factor of (roughly) 8, roughly 95% of the time. So, for example, a country with economic freedom index equaling 3 has predicted logged per capita GDP of 5.6608 − .76425 × 3 = 3.36805, or predicted per capita GDP of 103.36805 = $2334. The standard error of the estimate tells us that we wouldn’t be surprised if the actual per capita GDP was as much as 8 times less that ($289) to as much as 8 times more that ($18,810). This wide a range might surprise you, but it’s inherent to the fact that per capita GDP is very long right-tailed; remember, the range of per capita GDP values is ($116, $55744), so the largest value is more than 480 times the smallest! Residual plots look okay for this model: c 2008, Jeffrey S. Simonoff 11 This is a situation where it is natural to consider the use of confidence and prediction intervals. Consider the case of Libya. Its 200 freedom score was 4.85, but its per capita GDP is not given in the data. What does the regression model imply for that value? Here are confidence and prediction interval results: Predicted Values for New Observations New Obs Fit SE Fit 95% CI 95% PI 1 1.9542 0.1025 (1.7517, 2.1567) (1.0361, 2.8724)X X denotes a point that is an outlier in the predictors. Values of Predictors for New Observations c 2008, Jeffrey S. Simonoff 12 2000 New 1 freedom Obs score 4.85 The confidence interval would correspond to an interval for the average response for all observations in the population with 2000 freedom scores equal to 4.85, but since the response variable is in the log scale it is not meaningful here. On the other hand, the prediction interval is useful here. The 95% interval for logged per capita GDP of (1.0361, 2.8724) converts to an interval of (10.87, 745.42). Note that the interval is asymmetric around the geometric mean of 101.9542 = 89.99, which reflects the much higher natural variability in per capita GDP at the upper end compared to at the lower end. Note that the ratio of the high to the low end of the interval is roughly 68.6, which corresponds closely to our earlier rough prediction interval of being able to predict per capita GDP to within a multiplicative factor of about 8. Thus, it would seem that we’re done, except for one thing — this analysis doesn’t really address that quote at the beginning of this handout. It’s not surprising that wealth and economic freedom would go together, but that doesn’t necessarily say anything about whether it is in a country’s economic best interest to become more economically free. As the quote says, it’s the “path to prosperity” that matters; that is, economic growth, not current economic status. Is economic freedom related to economic growth? Here is a plot of 2000 growth in GDP versus 2000 economic freedom index: c 2008, Jeffrey S. Simonoff 13 This is not very encouraging, as the fitted line plot shows: c 2008, Jeffrey S. Simonoff 14 There is apparently very little relationship between one-year economic growth and economic freedom. Now, of course, one-year growth might be too variable a measure, and longer-term growth measure might be more useful. Still, this result is disappointing, especially since it came at a time of general economic prosperity. Regression Analysis: GDP growth rate versus 2000 freedom score The regression equation is GDP growth rate = 3.10 + 0.372 2000 freedom score 153 cases used 3 cases contain missing values Predictor Constant 2000 fre S = 3.108 Coef 3.100 0.3719 SE Coef 1.061 0.3369 R-Sq = 0.8% c 2008, Jeffrey S. Simonoff T 2.92 1.10 P 0.004 0.271 R-Sq(adj) = 0.1% 15 Analysis of Variance Source Regression Residual Error Total DF 1 151 152 SS 11.767 1458.533 1470.300 MS 11.767 9.659 F 1.22 P 0.271 With an R2 less than 1%, there’s not much need to try to interpret the slope coefficient (in any event, it’s positive, which would imply that more freedom is associated with less growth!). The regression is of course not close to statistically significant. Do the regression assumptions seem reasonable here? Maybe not: c 2008, Jeffrey S. Simonoff 16 There are two clearly unusual points: Equatorial Guinea and Turkmenistan. Each of these countries had very high growth in 2000 (16.9% and 17.6%, respectively), despite repressive economic situations (freedom scores 4.05 and 4.3, respectively). Equatorial Guinea’s GDP growth is unusual in that extensive offshore oil exploitation only began in 1997, while Turkmenistan’s high growth came from oil and natural gas exports. It is important to explore whether these two countries have had a strong effect on the overall model, since they are not at all typical of the general pattern. We can do this by omitting them and seeing how things are affected, being sure to report that this is what we have done. Overall, not much changes; in fact, the relationship becomes even weaker: c 2008, Jeffrey S. Simonoff 17 Regression Analysis: GDP growth rate versus 2000 freedom score The regression equation is GDP growth rate = 3.99 + 0.026 2000 freedom score 151 cases used 3 cases contain missing values Predictor Constant 2000 fre S = 2.750 Coef 3.9872 0.0257 SE Coef 0.9482 0.3027 R-Sq = 0.0% T 4.20 0.09 P 0.000 0.932 R-Sq(adj) = 0.0% Analysis of Variance Source Regression Residual Error Total DF 1 149 150 c 2008, Jeffrey S. Simonoff SS 0.055 1126.886 1126.941 MS 0.055 7.563 F 0.01 P 0.932 18 The model now seems to fit fine, so we should just accept it as reflecting what is really going on here: c 2008, Jeffrey S. Simonoff 19 So what have we learned? The results are decidedly mixed. There is a clear relationship between a country’s current level of wealth and economic freedom, although this is accompanied by a good deal of variability. There is apparently no relationship between economic freedom and short-term GDP growth. Whether there is a relationship between economic freedom and long-term growth cannot be addressed with these data. c 2008, Jeffrey S. Simonoff 20 MINITAB commands Although it is possible to omit observations in a sample by simply highlighting them in the data worksheet and pressing the delete key, this is generally not advisable, since then the observation cannot be recovered without reopening the original file (and if you save the data before doing that, the observation is gone completely). A better approach is to create a subset of the worksheet that has the observations you want; this will create a new worksheet that can be analyzed, but the original worksheet will still be there as well. Click on Data → Subset Worksheet. You can give the new worksheet an identifying name if you like under Name:. Click the radio button next to Specify which rows to exclude, click the radio button next to Specify rows:, and enter the row numbers of the outliers in the associated box. Note that there is a good deal of flexibility in the subsetting; you can identify rows to include or exclude, identify them by some condition (for example, observations with values of a predictor greater than 10), or brush them on a scatter plot and identify them that way, in addition to specifying them by row number(s). c 2008, Jeffrey S. Simonoff 21