STATA Introductory manual
MADE

QUIZ
• What are the main OLS assumptions?
  1. On average right
  2. Linear
  3. Predicting variables and error term uncorrelated
  4. No serial correlation in error term
  5. Homoscedasticity + normality of error term

QUIZ
• Do we know the error term?
• Do we know the coefficients?
• How can we know whether all the assumptions are fulfilled?
  1. On average right => ???
  2. Linearity => ???
  3. X and ε uncorrelated => ???
  4. ε serially uncorrelated => ???
  5. ε homoscedastic => ???

Before STATA
• A little exercise in Excel
  – we'll take a data set
  – compute x'x
  – compute (x'x)^-1
  – compute (x'x)^-1x'
  – compute (x'x)^-1x'y
  … and guess what will come out as a result of all this

Matrix algebra in Excel

Step 1 – see the data

Step 2 – get rid of empty cells
• STATA does that automatically
• In Excel one needs to do that manually
• The best way I know is to sort and delete the empty rows
• … but we do not need to do it, because I've already prepared the spreadsheet (DATA nonempty)

Step 3 – transposition and multiplication
• To transpose:
  – just copy and use Paste Special, then tick the Transpose option
• To multiply:
  – need to know what comes out (the dimensions of the result)
  – select the "right" area
  – write =mmult(range1;range2)
  – to enter the ranges, you can select them with the mouse
  – DO NOT press enter: press ctrl+shift+enter

Step 4 – matrix inversion
• To invert a matrix:
  – need to know what comes out => select this area
  – type =minverse(range)
  – DO NOT press enter: press ctrl+shift+enter
  – … and that's it

Step 5 – just do it yourself
• Get x'x
• Get (x'x)^-1
• Get (x'x)^-1x'
• Get (x'x)^-1x'y

STATA as calculator
• Multiplication and other simple operations
  – display (7*(6+5)/4)^(-0.3)
• Statistical calculations
  – display normal(1.96)
• Generating variables
  – generate t=1967
• Etc...

Getting data to STATA
• How to get data to STATA?
  – From Excel [Copy&Paste, remember "commas"]
  – Manually [if one has the time ...]
  – From other sources [import function]

Reading data in STATA
• How to know what it is?
  – describe [read the texts]
  – summarize [statistical properties]
• Learning more about your data set
  – correlate trade gdppc [just to see how it works]
  – histogram gdppc [graphing is easier from the menu]
  – scatter trade gdppc [as above]

Regression with STATA
• How to do regression?
  – regress gdppc fdi trade inv school2
  – regress gdppc fdi trade inv school2, nocons
  ... and that's it

      Source |       SS       df       MS              Number of obs =      93
-------------+------------------------------           F(  4,    88) =   65.89
       Model |  2.0703e+09     4   517584509           Prob > F      =  0.0000
    Residual |   691290985    88  7855579.38           R-squared     =  0.7497
-------------+------------------------------           Adj R-squared =  0.7383
       Total |  2.7616e+09    92  30017706.8           Root MSE      =  2802.8

------------------------------------------------------------------------------
       gdppc |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         fdi |   508.1666   147.8113     3.44   0.001     214.4226    801.9106
       trade |   13.57583   12.64687     1.07   0.286    -11.55717    38.70884
         inv |  -72.10749   39.74305    -1.81   0.073    -151.0885    6.873478
     school2 |    110.629   10.25845    10.78   0.000     90.24245    131.0155
       _cons |  -83.17579    899.459    -0.09   0.927    -1870.661     1704.31
------------------------------------------------------------------------------

Regression with STATA
• Only now should one start thinking
  – Maybe there is something about groups of countries we should look at in more detail?
  – How about a North/South division?
  – We have a variable for latitude, right?
    • gen south=0
    • replace south=1 if lat<0
    • sum south
    • tab south
  – bysort south: reg gdppc fdi trade inv school2
• We can also draw many of these things
  – twoway (scatter inv gdppc, mlabel(ccode)), by(south)
  ... and many others

Diagnostics with STATA
• Normality of the residual
  – predict e, residual [directly after regress]
  – sktest e [skewness-kurtosis normality test, in the spirit of Jarque-Bera]
• RESET test (so-called omitted variables or functional form test)
  – ovtest, rhs

Diagnostics with STATA
• Structural stability [Chow test]
  – gen d=0
  – gen dfdi=0
  – gen dtrade=0
  – gen dinv=0
  – replace d=1 if lat<0 (why so????)
  – replace dfdi=fdi if lat<0
  – replace dtrade=trade if lat<0
  – replace dinv=inv if lat<0
  – reg gdppc fdi trade inv school2 d dfdi dtrade dinv
  – test (d=0) (dfdi=0) (dtrade=0) (dinv=0)

Diagnostics with STATA
• Heteroscedasticity
  – hettest, rhs [Breusch-Pagan test]
  – imtest, white [White test]
• Autocorrelation (we would need time, not this dataset)
  – tsset t
  – dwstat [Durbin-Watson test]
  – bgodfrey, lags(1 2 3) [Breusch-Godfrey test]
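
The Excel exercise from the "Before STATA" slide can also be tried outside a spreadsheet. What follows is a minimal Python sketch, not part of the original slides. The four data points are invented for illustration (y is built as exactly 1 + 2x), and the helper names transpose, matmul and inverse are hypothetical. It walks through the same four products: x'x, (x'x)^-1, (x'x)^-1x' and (x'x)^-1x'y. The last one is the vector of OLS coefficients, which is what regress reports in its Coef. column for the same data.

```python
# Toy data, invented for illustration only (not the course data set).
# y is built as exactly 1 + 2*x, so OLS should recover intercept 1 and slope 2.
x_values = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]

# Design matrix X: a column of ones (the constant) plus the regressor.
X = [[1.0, x] for x in x_values]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    b_cols = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in b_cols] for row in a]


def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Augment a with the identity matrix: [a | I].
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]


Xt = transpose(X)                      # x'
XtX = matmul(Xt, X)                    # x'x
XtX_inv = inverse(XtX)                 # (x'x)^-1
XtX_inv_Xt = matmul(XtX_inv, Xt)       # (x'x)^-1 x'
y_col = [[v] for v in y]               # y as a column vector
beta = [row[0] for row in matmul(XtX_inv_Xt, y_col)]  # (x'x)^-1 x'y

print(beta)
```

With these toy numbers the printed vector should be close to [1.0, 2.0] (intercept first, then slope), up to floating-point rounding. The column of ones in X plays the role of _cons in the STATA output; leaving it out corresponds to the nocons option.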