PANEL DATA
Development Workshop

What are we going to do today?
1. Panels – introduction and data properties
2. How to measure distance
3. What comes first: trade or GDP?
4. What else affects trade?
5. Role of currency?

Why panel data?
What is the sense of panel data?
– pooled data in econometrics
– panels in econometrics
– long or wide?
– fixed or random effects?

Gravity model
All that theory is cool, but transport costs matter and market size matters: => push and pull
– Isard (1954), logs by Tinbergen (1962) [what if there were no barriers? "missing trade"], Linnemann (1966) [standard macro approach]
– Anderson (1979) [first theoretical model, expenditure-based]
– Helpman-Krugman (1985) [intra-industry trade]
– Bergstrand (1985) [general equilibrium, one country/one factor]
– Bergstrand (1989) [H-O model with Linder hypothesis]

Simplest model
Variables:
– Explained: bilateral trade
– Explanatory: GDP, population, distance

reg trade gdp pop dist

      Source |       SS       df       MS              Number of obs =    1074
-------------+------------------------------           F(  3,  1070) =  543.02
       Model |  196764.006     3  65588.0021           Prob > F      =  0.0000
    Residual |  129238.275  1070  120.783434           R-squared     =  0.6036
-------------+------------------------------           Adj R-squared =  0.6025
       Total |  326002.281  1073  303.823188           Root MSE      =   10.99

------------------------------------------------------------------------------
 tradevolume |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gdpsum |   .0141613   .0011921    11.88   0.000     .0118221    .0165004
population~m |   .0528096   .0228549     2.31   0.021     .0079642     .097655
    distance |  -.0073704   .0005152   -14.31   0.000    -.0083813   -.0063594
       _cons |   5.762674   1.067794     5.40   0.000     3.667467    7.857882
------------------------------------------------------------------------------

Panel data
Same data, same question, but the data now consist of groups observed over time
Stata learns that by
1. Set of commands:
   iis grouping_var
   tis time_var
2. xtset grouping_var time_var
3. tsset grouping_var time_var
(they are all equivalent)

Once the data are set as a panel?
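As a minimal sketch of the panel declaration listed above, and of what becomes available right after it: the group variable id is the one reported in the workshop output, while the time variable name year is an assumption (the slides do not name it), so substitute whatever the dataset actually uses.

```stata
* Sketch only. "year" is a hypothetical name for the time variable;
* "id" is the group variable shown in the panel output.
xtset id year

* Shape of the panel: how many groups, how many periods, any gaps
xtdescribe

* Pooled summary statistics: one mean and one standard deviation per variable
summarize tradevolume gdpsum distance

* Panel summary: the same variables, with the variation split into
* overall, between (across groups) and within (over time inside a group)
xtsum tradevolume gdpsum distance
```

Comparing the last two commands is the quickest way to see whether a variable moves mostly across groups or mostly over time, which matters later for the choice between fixed and random effects.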
xtsum vs sum

Panel regression
Do not forget the context menu in Stata
To find out how to do panel regressions in Stata: Statistics => Longitudinal/panel data
– Many options already covered: xtset, sum, des, tab (check 'em out)
– Also: linear models
Simplest code
xtreg trade pop gdp dist

Panel results

Random-effects GLS regression                   Number of obs      =      1074
Group variable: id                              Number of groups   =        91

R-sq:  within  = 0.4879                         Obs per group: min =         6
       between = 0.6091                                        avg =      11.8
       overall = 0.5995                                        max =        12

Random effects u_i ~ Gaussian                   Wald chi2(3)       =   1070.28
corr(u_i, X)   = 0 (assumed)                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
 tradevolume |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      gdpsum |   .0187795   .0006722    27.94   0.000      .017462    .0200969
population~m |  -.0098166   .0375135    -0.26   0.794    -.0833418    .0637085
    distance |  -.0068902   .0017132    -4.02   0.000     -.010248   -.0035324
       _cons |   4.429218    3.53079     1.25   0.210    -2.491003    11.34944
-------------+----------------------------------------------------------------
     sigma_u |  10.536556
     sigma_e |  3.3908988
         rho |  .90615037   (fraction of variance due to u_i)
------------------------------------------------------------------------------

How do we know if it makes sense?
Is it different from the pooled estimator?
What if we add country effects to the pooled estimation?
Let's try
areg trade pop gdp dist, absorb(grouping_var)

Some we know from the literature and some from experience
– Linear or in logs? Maybe also non-linear terms and interactions, trade or export share, etc.
– Should we do fixed or random effects?
– Are we interested in differences across time or across countries?
Between and within R-squared tell a different story, no?

What do our models say?
xttest0

        tradevolume[id,t] = Xb + u[id] + e[id,t]

        Estimated results:
                         |       Var     sd = sqrt(Var)
                ---------+-----------------------------
               tradevo~e |   303.8232       17.43052
                       e |   11.49819       3.390899
                       u |    111.019       10.53656

        Test:   Var(u) = 0
                             chi2(1) =  4793.89
                         Prob > chi2 =   0.0000

Huge problem: endogeneity
What is first:
– do the rich trade more, or are they rich because they trade more?
– how to get around this problem?
What is it that we want?
– Cross-country differences?
– Time evolution within one country?
– Test theory?

What do you find in the do-file?
1. Declare panel, run simplest models, do graphs, etc.
2. Run diagnostics
3. Learn more
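The first two do-file steps above could look roughly like the sketch below. It is not the workshop's actual do-file. It reuses the abbreviated variable names from the slides (trade, gdp, pop, dist) and the group variable id from the output. The time variable name year is an assumption, as is id being numeric, which the graph line relies on.

```stata
* Sketch only. "year" is a hypothetical time variable name.

* 1. Declare panel, run simplest models, do graphs
xtset id year
regress trade gdp pop dist

* Assumes id is numeric; plots trade over time for a handful of groups
xtline tradevolume if id <= 9

* Random effects is what xtreg estimates when no option is given
xtreg trade pop gdp dist, re
estimates store re

* 2. Run diagnostics
* Breusch-Pagan LM test of Var(u) = 0; must directly follow xtreg, re
xttest0

* Fixed effects (within estimator), comparable to areg with absorb()
xtreg trade pop gdp dist, fe
estimates store fe

* Hausman test: are the FE and RE coefficients systematically different?
hausman fe re
```

If distance does not vary over time within a group, the fixed-effects fit will drop it, and the Hausman test then compares only the coefficients the two models share. That is one practical side of the "fixed or random effects?" question raised earlier.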