CES-Munich Lectures on: Econometric inference on endogenous interventions
Jan F. Kiviet (University of Amsterdam)
document version: 16 February 2011

Lecture 1: Perils of unobserved heterogeneity; simulation illustrations

We provide some basic results from Ordinary Least-Squares (OLS) theory and illustrate them by a small-scale classic Monte Carlo simulation study. That means that we design a data generating process (DGP) ourselves on a computer in all its detail. Next we act as if we do not know the parameter values, or perhaps do not know the appropriate model specification either (which is the usual situation in practice!). Based on a chosen model specification and by employing particular standard OLS-based inference techniques, we use the generated data in order to obtain actual drawings from the true distribution of the employed estimator of the model parameters, or of a test statistic used to verify a particular parametric hypothesis. Thus, instead of using analytical methods to analyze the properties of the inference techniques, we assess these properties by experimentation. In a classic Monte Carlo simulation study we generate data according to the true DGP and apply the technique many times (this is the 'number of replications'). Then, building on the Law of Large Numbers and the Central Limit Theorem, we assess from this sufficiently large simulation sample of realizations the distributional characteristics of the technique (such as bias, variance and other characteristics of its distribution) in a realistically small quasi-empirical sample.

1 Artificial data on student efforts and grades

To get familiar with the interpretation of Monte Carlo simulation results, we first do some simple experiments in a situation where we know a lot about the actual finite sample properties of OLS estimators (and in fact do not need simulation). First, we check how well Monte Carlo results reproduce these known finite sample properties. Next we move to situations where the model is less standard and some actual finite sample properties cannot (easily) be derived analytically. Then the Monte Carlo results on the actual finite sample properties can be used to better understand the situation and to check how well (or how badly) available asymptotic approximations work in practice.

First, regressor variables $W_i$ (work input measured in study weeks) and ability $A_i$ ($i = 1,\dots,n$) are created. In fact we choose $n = 40$, $W_i \sim IIN(5, 1)$ and $A_i \sim IIN(7, 1.2^2)$, but such that the correlation between $A_i$ and $W_i$ equals a particular numerical value $\rho_{AW}$ (called rhoaw in the computer program). Initially we choose $\rho_{AW} = -0.5$. Hence, we assume that more able students tend to work slightly less. We keep these regressor variables constant over the Monte Carlo replications. So, we condition on these values. The true regression relationship (DGP) is chosen to be

  $R_i = \beta_1 + \beta_2 W_i + \beta_3 A_i + \varepsilon_i$, with $\varepsilon_i \sim IIN(0, \sigma_\varepsilon^2)$, $i = 1,\dots,n$.  (1)

The regressors $A_i$ and $W_i$ are actually generated in the program, using $v_i^{(0)}, v_i^{(1)} \sim IIN(0,1)$, according to

  $A_i = \mu_A + \sigma_A v_i^{(0)} \sim IIN(\mu_A, \sigma_A^2)$,  (2)
  $W_i = \mu_W + \sigma_W [\rho_{AW} v_i^{(0)} + (1 - \rho_{AW}^2)^{1/2} v_i^{(1)}] \sim IIN(\mu_W, \sigma_W^2)$,  (3)

so that indeed

  $Cov(A_i, W_i) = E\{[A_i - E(A_i)][W_i - E(W_i)]\} = \sigma_A \sigma_W E\{v_i^{(0)}[\rho_{AW} v_i^{(0)} + (1 - \rho_{AW}^2)^{1/2} v_i^{(1)}]\} = \sigma_A \sigma_W \rho_{AW}$.

Of course, $\mu_A$, $\mu_W$, $\sigma_A^2$ and $\sigma_W^2$ are population moments.
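As an aside, construction (2)-(3) is easily replicated outside EViews. A minimal Python/NumPy sketch (our own illustrative addition, not part of the original programs; the seed is arbitrary and merely echoes the rndseed value used later):

import numpy as np

rng = np.random.default_rng(17022011)      # arbitrary seed
n, rho_aw = 40, -0.5
v0 = rng.standard_normal(n)                # v_i^(0) ~ IIN(0,1)
v1 = rng.standard_normal(n)                # v_i^(1) ~ IIN(0,1), independent of v0
a = 7 + 1.2 * v0                           # A_i ~ IIN(7, 1.2^2), cf. (2)
w = 5 + rho_aw * v0 + np.sqrt(1 - rho_aw**2) * v1   # W_i ~ IIN(5, 1), cf. (3)
print(np.corrcoef(a, w)[0, 1])             # sample correlation; deviates from -0.5 at n = 40

The printed sample correlation will generally deviate from the population value $-0.5$, which is exactly the distinction drawn next.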
The $n$ realized values of $A_i$ and $W_i$ have sample moments

  $m_A = \frac{1}{n}\sum_{i=1}^n A_i$, $m_W = \frac{1}{n}\sum_{i=1}^n W_i$,
  $(std_A)^2 = \frac{1}{n-1}\sum_{i=1}^n (A_i - m_A)^2$, $(std_W)^2 = \frac{1}{n-1}\sum_{i=1}^n (W_i - m_W)^2$,
  $r_{AW} = \frac{\frac{1}{n-1}\sum_{i=1}^n (A_i - m_A)(W_i - m_W)}{std_A \, std_W}$,

where $r_{AW}$ is the sample equivalent of the population correlation coefficient $\rho_{AW}$. Note that the sample means, standard deviations and the correlation will deviate from the chosen population values, because $n$ is rather small.

2 Four EViews simulation programs

We work with the EViews programs LSN.prg (least-squares under normality), LSNN.prg (least-squares with non-normal disturbances), LSNM.prg (least-squares under normality of a misspecified model) and LSNPAN.prg (least-squares under normality exploiting a panel data set). They enable us to examine some properties of least-squares analysis in a model that mimics the situation introduced and discussed earlier.

2.1 LSN; appropriate specification and normal disturbances

In the ideal well-specified multiple regression of $R_i$ on $W_i$ and $A_i$ and an intercept, the least-squares estimators are unbiased and, because the disturbances are normally distributed, so are the OLS coefficient estimates. Then Student t-statistics are really distributed as $t_{37}$ under the null hypothesis tested. Thus type I errors (the probability of rejecting a true null hypothesis) can be controlled exactly and kept as small as one finds appropriate, say 5%. This we illustrate and corroborate by simulation program LSN.

More in general, econometric theory says that for the DGP

  $y = X\beta_0 + \varepsilon$, with $\varepsilon \mid X \sim N(0, \sigma_0^2 I)$,  (4)

where $X$ is an $n \times k$ matrix, $y$ and $\varepsilon$ random $n \times 1$ vectors and $\beta_0$ a $k \times 1$ vector, and where the regressors $X$ are exogenous with respect to $\varepsilon$, OLS is the preferred technique. The in practice unobserved true parameter values $\beta_0$ and $\sigma_0^2$ should be estimated by $\hat\beta = (X'X)^{-1}X'y$ and $s^2 = (y - X\hat\beta)'(y - X\hat\beta)/(n-k)$. One can derive that

  $\hat\beta \sim N(\beta_0, \sigma_0^2 (X'X)^{-1})$,  (5)
  $(n-k)s^2/\sigma_0^2 \sim \chi^2(n-k)$,  (6)
  $(\hat\beta_j - \beta_{0,j}) / \big( s \sqrt{[(X'X)^{-1}]_{jj}} \big) \sim t(n-k)$.  (7)

Hence, the OLS estimator $\hat\beta$ is unbiased for $\beta_0$ and it is normally distributed. Also, $s^2$ is unbiased for $\sigma_0^2$ and distributed as a multiple of a $\chi^2(n-k)$ variable, whereas $Var(s^2) = \frac{2}{n-k}\sigma_\varepsilon^4$. Testing $H_0: \beta_j = c$ by a t-statistic yields a drawing from a Student distribution with $n-k$ degrees of freedom, provided $c$ equals the true value of the $j$-th coefficient in the DGP.

In the program we generate in $R = 10000$ replications (so, again and again) the $n = 40$ realizations of the dependent variable $y_i = \beta_1 + \beta_2 W_i + \beta_3 A_i + \varepsilon_i$, where $\varepsilon_i \sim IIN(0, \sigma_\varepsilon^2)$, $i = 1,\dots,n$. We choose $\sigma_\varepsilon = 1$, $\beta_1 = -5$, $\beta_2 = 1$, $\beta_3 = 1$. Hence, for $\rho_{AW} = -0.5$ we have $y_i \sim IIN(7, 2.24)$, so some of the $y_i$ realizations may be larger than 10 (which would not occur with marks in practice, but this is of no concern for the purpose of the illustration).

Remember that this is in fact a situation where we do not need Monte Carlo to assess $E(\hat\beta)$, $Var(\hat\beta)$, $E(s^2)$ and $Var(s^2)$, or the shape of the distribution functions of $\hat\beta$ and $s^2$, because they can be derived analytically. However, this is a good starting point for understanding the unavoidable (in)accuracies (because $R < \infty$) of Monte Carlo simulation. These will remain, also in more complex situations, where the expectation and variance of estimators cannot always be derived analytically. Then we will exploit Monte Carlo to assess (or better, to approximate) them, and to examine the accuracy for small $n$ of standard large-$n$ asymptotic approximations provided by econometric theory. Be aware of the different roles of the sample sizes $n$ and $R$! In practice one often has to accept that $n$ is rather small, so that asymptotic approximations may be poor. Given some patience and sufficient computer power, however, $R$ can always be chosen so large that the remaining Monte Carlo simulation approximation errors are as small as one wishes.
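Before turning to the EViews code, the full experiment can be summarized in a compact Python/NumPy sketch (again our own illustrative addition, implementing OLS directly; the critical value -1.687 is the 5% quantile of t(37)):

import numpy as np

rng = np.random.default_rng(17022011)
n, R = 40, 10000
beta = np.array([-5.0, 1.0, 1.0])            # intercept, coefficient of w, coefficient of a
v0, v1 = rng.standard_normal(n), rng.standard_normal(n)
a = 7 + 1.2 * v0
w = 5 - 0.5 * v0 + np.sqrt(1 - 0.5**2) * v1
X = np.column_stack([np.ones(n), w, a])      # regressors kept fixed over all replications
XtXinv = np.linalg.inv(X.T @ X)
se2 = np.sqrt(XtXinv[1, 1])                  # [(X'X)^{-1}]_{jj}^{1/2} for beta2, cf. (7)
b2, s2, t2 = np.empty(R), np.empty(R), np.empty(R)
for r in range(R):
    y = X @ beta + rng.standard_normal(n)    # sigma_eps = 1
    bhat = XtXinv @ (X.T @ y)                # OLS, cf. (5)
    res = y - X @ bhat
    s2[r] = res @ res / (n - 3)              # cf. (6)
    b2[r] = bhat[1]
    t2[r] = (bhat[1] - 1.0) / (np.sqrt(s2[r]) * se2)
print(b2.mean(), s2.mean())                  # both should be close to 1 (unbiasedness)
print(b2.std(ddof=1), se2)                   # simulated vs exact std.dev. of beta2-hat (sigma=1)
print((t2 < -1.687).mean())                  # left-tail rejection frequency, close to 0.05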
EViews program LSN:

'program: LSN.prg ([email protected])
'all text after ' is comment
'========= Monte Carlo study of simple normal multiple regression model ===========
!n=40    'n is sample size
workfile c:\JFKdocuments\Munich\CES1\LSN.wf1 u 1 !n    'workfile for Undated series t=1,...,n
rndseed 17022011    'to initialize the random number generator
genr v0=nrnd
genr a=7+1.2*v0    'regressor ability on a 1-10+ scale
!rhoaw=-0.5    'population correlation between a and w
genr v1=nrnd
genr w=5+!rhoaw*v0+@sqrt(1-!rhoaw^2)*v1    'regressor w work in weeks course 1
!beta1=-5    'true intercept
!beta2=1    'true coefficient of w
!beta3=1    'true coefficient of a
!sig=1    'true std.dev sigma of disturbance term
!R=10000    'Monte Carlo will consist of R replications
matrix (!R,3) SIMRES    'Monte Carlo results will be stored in matrix SIMRES
for !rep=1 to !R    'R stochastically independent replications
  genr y = !beta1 + !beta2*w + !beta3*a + !sig*nrnd    'true DGP with Normal disturbances
  equation eq1.ls y c w a    'OLS estimation of correctly specified model
  simres(!rep,1)=eq1.@coefs(2)    'estimate of beta2 put in matrix SIMRES
  simres(!rep,2)=eq1.@stderrs(2)    'its estimated standard error in 2nd column
  simres(!rep,3)=eq1.@se    'estimate of sigma in 3rd column
next
SIMRES.write c:\JFKdocuments\Munich\CES1\simres.txt    'matrix SIMRES is written on a file
workfile c:\JFKdocuments\Munich\CES1\LSNsim.wf1 u 1 !R    'workfile to contain matrix simres
read c:\JFKdocuments\Munich\CES1\simres.txt b2 seb2 s    'SIMRES reformatted as workfile
    '... containing 3 named variables (each of R observations)
genr tb2=(b2-!beta2)/seb2    't test for b2 of true H0
genr s2=s*s    'OLS estimator of sigma-squared
genr rejecttb2L=tb2<@qtdist(0.05,!n-3)    'rejections against 1-sided alternative ...
genr rejecttb2R=tb2>@qtdist(0.95,!n-3)    '... at nominal significance level 5%

Program output. On the regressors:

                 A        W
Mean            6.91     5.27
Maximum        10.12     7.43
Minimum         5.12     1.86
Std.Dev.        1.02     1.23
Observations      40       40
Correlation         -0.39

On the regressions:

[Histogram of B2; sample 1-10000, 10000 observations. Mean 1.000511; Median 0.999789; Maximum 1.493138; Minimum 0.477839; Std.Dev. 0.141591; Skewness 0.002357; Kurtosis 2.985302; Jarque-Bera 0.099276 (p = 0.951574).]

So the actual bias of zero is estimated from the simulations to be 0.000511 (on the basis of these 10000 replications), with standard error 0.0014 (because the 10000 drawings of $\hat\beta_2$ have standard deviation 0.1416, their sample average has standard error $0.1416/\sqrt{R} = 0.0014$). So, the unbiasedness of $\hat\beta_2$ has been corroborated, and also its normality (Jarque-Bera p-value of 0.95).

[Histogram of S2; sample 1-10000, 10000 observations. Mean 0.997461; Median 0.977120; Maximum 2.092082; Minimum 0.383796; Std.Dev. 0.232354; Skewness 0.469465; Kurtosis 3.279405; Jarque-Bera 399.8563 (p = 0.000000).]

The unbiased estimator $s^2$ has an estimated bias of -0.002539 (with standard error 0.0023). It has a distribution conforming to a $\chi^2(37)$ divided by 37, hence it has expectation 1 and true standard deviation $\sqrt{2/37} = 0.2325$, which is in close agreement with the Monte Carlo estimate of 0.2324.
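As a worked check of that standard deviation (a small completion of the argument): since by (6) $37 s^2/\sigma_0^2 \sim \chi^2(37)$ and $Var[\chi^2(m)] = 2m$, we have $Var(s^2) = (\sigma_0^2/37)^2 \cdot 2 \cdot 37 = 2\sigma_0^4/37$, so with $\sigma_0 = 1$ the standard deviation of $s^2$ is $\sqrt{2/37} \approx 0.2325$.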
[Histogram of TB2; sample 1-10000, 10000 observations. Mean 0.002066; Median -0.001504; Maximum 4.552857; Minimum -4.266754; Std.Dev. 1.032424; Skewness -0.016947; Kurtosis 3.193371; Jarque-Bera 16.05881 (p = 0.000326).]

Since $\beta_2 = 1$, the null hypothesis tested is true, so the test statistic (collected R times in variable TB2) follows the Student distribution with 37 degrees of freedom, which has expectation 0 and a standard deviation slightly above 1. Also, it is symmetric, so its skewness is zero, whereas its tails are slightly fatter than those of the normal, giving a kurtosis just above 3. Again, all these known theoretical results are corroborated (but not perfectly re-established, because $R < \infty$) in the simulation.

[Histogram of REJECTTB2L; sample 1-10000, 10000 observations. Mean 0.049000; Median 0.000000; Maximum 1.000000; Minimum 0.000000; Std.Dev. 0.215879; Skewness 4.178479; Kurtosis 18.45969; Jarque-Bera 128683.6 (p = 0.000000).]

Against left-hand side alternatives the true type I error probability is 5% and we find a frequency of rejection of 0.0490 (this estimate has standard error 0.002).

[Histogram of REJECTTB2R; sample 1-10000, 10000 observations. Mean 0.051500; Median 0.000000; Maximum 1.000000; Minimum 0.000000; Std.Dev. 0.221026; Skewness 4.058543; Kurtosis 17.47177; Jarque-Bera 114716.4 (p = 0.000000).]

Against right-hand side alternatives we again find an estimated type I error probability which deviates from the true value by a small and, given the Monte Carlo standard error, acceptable amount.

2.2 LSNN; appropriate specification and nonnormal disturbances

In EViews program LSNN.prg we examine the well-specified multiple regression of $R_i$ on $W_i$ and $A_i$ and an intercept, so again the least-squares estimators are unbiased, but now the disturbances are not normally distributed. Therefore, the OLS estimator itself is no longer normally distributed, and the Student statistics are only approximately distributed as $t_{37}$ under the null hypothesis tested (for $n \to \infty$ they would be $t_\infty = N(0,1)$, but what does that mean for $n - k = 37$?). Thus, type I errors cannot be controlled exactly. How much the actual type I error probability deviates from the nominal one is very hard to evaluate analytically, but in particular cases it can easily be assessed by simulation (choosing the number of replications large enough so that the simulation errors can be guaranteed to be reasonably small). In program LSNN we examine the case in which the disturbances are still independent with expectation zero and variance $\sigma_0^2$, but they are extremely skew: they are centered and rescaled drawings from the $\chi^2(1)$ distribution. So, in more general terms we now consider the DGP

  $y = X\beta_0 + \varepsilon$, with $\varepsilon \mid X \sim (0, \sigma_0^2 I)$,  (8)

where

  $\hat\beta \sim (\beta_0, \sigma_0^2 (X'X)^{-1})$, with $\hat\beta \stackrel{a}{\sim} N(\beta_0, \sigma_0^2 (X'X)^{-1})$ for $n \to \infty$,  (9)
  $(\hat\beta_j - \beta_{0,j}) / \big( s \sqrt{[(X'X)^{-1}]_{jj}} \big) \stackrel{a}{\sim} N(0, 1)$ for $n \to \infty$.  (10)

Apart from its first two moments, the only distributional properties of OLS that can relatively easily be derived analytically are asymptotic in nature. For $n \to \infty$, due to the Central Limit Theorem, convergence of the OLS coefficient estimator towards normality can be established, and similarly for t-statistics when the null hypothesis is true.

EViews program LSNN: This differs from LSN in just one respect: the disturbances $\varepsilon_i$ are non-normal. They now have a skew distribution, but still the same first and second moments, because $\varepsilon_i = \sigma_0 (u_i^2 - 1)/\sqrt{2}$, where $u_i \sim IIN(0,1)$. Note that, although $\varepsilon_i \sim IID(0, \sigma_0^2)$, it has a skew distribution (a rescaled $\chi^2(1)$).
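That the first two moments are indeed unaffected follows in one line (a small completion of the argument): since $u_i^2 \sim \chi^2(1)$ has $E(u_i^2) = 1$ and $Var(u_i^2) = 2$, we get $E(\varepsilon_i) = \sigma_0[E(u_i^2) - 1]/\sqrt{2} = 0$ and $Var(\varepsilon_i) = \sigma_0^2 \, Var(u_i^2)/2 = \sigma_0^2$, while the skewness and kurtosis of $\varepsilon_i$ are those of $\chi^2(1)$, namely $\sqrt{8} \approx 2.83$ and $3 + 12 = 15$ (compare the simulated values 2.837 and 15.14 reported below).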
This does not change the asymptotic properties of $\hat\beta$ and $s^2$, nor those of the test statistics. In fact, $\hat\beta$ and $s^2$ are still unbiased, but their distributions are different, and so is that of the test statistic on $\beta_2$, giving rejection probabilities that may differ systematically from their nominal values. Apart from renaming the program and its workfiles (changing LSN into LSNN) we only changed:

for !rep=1 to !R    'R stochastically independent replications
  genr v2=nrnd    'v2 is yet another series of N(0,1)
  genr eps=!sig*(v2^2-1)/@sqrt(2)    'eps has expectation 0, std.dev. !sig and is skew
  genr y = !beta1 + !beta2*w + !beta3*a + eps    'true DGP with NonNormal disturbances
  equation eq1.ls y c w a    'OLS estimation of correctly specified model

[Histogram of B2; sample 1-1000000, 1000000 observations. Mean 0.999992; Median 1.000638; Maximum 1.892529; Minimum 0.040872; Std.Dev. 0.141208; Skewness -0.049555; Kurtosis 3.715505; Jarque-Bera 21740.43 (p = 0.000000).]

However, to also indicate the effect of changing $R$ (the number of replications), we increased it here to $10^6$, and for good reasons: we want to examine whether the introduced skewness and kurtosis of the disturbances change the skewness and the kurtosis of the OLS coefficient estimates, and it is well known that one needs a very large Monte Carlo sample size in order to estimate those with reasonable precision. Note that the bias and the standard deviation (EViews should actually call what it indicates as Std. Dev. the 'sample standard error') have not changed much (because their true population values remained the same). However, especially the kurtosis did change, and the p-value for the test on normality is now zero, so normality is clearly rejected!

[Histogram of TB2; sample 1-1000000, 1000000 observations. Mean 1.78e-05; Median 0.005021; Maximum 5.541440; Minimum -5.734523; Std.Dev. 1.027734; Skewness -0.047971; Kurtosis 3.093635; Jarque-Bera 748.8517 (p = 0.000000).]

The t-test statistic has a slightly skew distribution. Next we check whether these strange disturbances have affected the actual significance level.

[Histogram of REJECTTB2L; sample 1-1000000, 1000000 observations. Mean 0.050876; Median 0.000000; Maximum 1.000000; Minimum 0.000000; Std.Dev. 0.219745; Skewness 4.087693; Kurtosis 17.70924; Jarque-Bera 11799941 (p = 0.000000).]

Against left-hand side alternatives we find 5.09% (which is genuinely larger than 5%, because $(0.0509 - 0.0500)/0.00022 = 4.09$).

[Histogram of REJECTTB2R; sample 1-1000000, 1000000 observations. Mean 0.048981; Median 0.000000; Maximum 1.000000; Minimum 0.000000; Std.Dev. 0.215828; Skewness 4.179424; Kurtosis 18.46758; Jarque-Bera 12879853 (p = 0.000000).]

And against right-hand side alternatives the estimated actual 4.8981% is also significantly smaller than 5%.
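The Monte Carlo standard errors used in these comparisons follow from the binomial nature of the rejection indicator (a small completion of the argument): each replication yields a Bernoulli($p$) rejection indicator, so the estimated rejection frequency has standard error $\sqrt{p(1-p)/R}$; with $p = 0.05$ this gives $\sqrt{0.05 \times 0.95/10^4} \approx 0.0022$ for $R = 10^4$ and $\approx 0.00022$ for $R = 10^6$, the values 0.002 and 0.00022 used above.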
However, for practical purposes it mostly does not really matter whether we test at the 5% level, or at the 6% or 4% level. Therefore, we should conclude that in this particular example the effect of extremely skew disturbances certainly does not undermine standard OLS inference, because even at $n = 40$ the Central Limit Theorem already exposes its force. To illustrate how remarkable this is, we also show how extremely skew the disturbances used here are. To that end we run the following simple program:

'program skew.prg (illustrates skewness)
!n=1000000
workfile c:\JFKdocuments\Munich\CES1\skew.wf1 u 1 !n
genr eps=(nrnd^2-1)/@sqrt(2)

This yields:

[Histogram of EPS; sample 1-1000000, 1000000 observations. Mean 0.000163; Median -0.386198; Maximum 16.53911; Minimum -0.707107; Std.Dev. 1.001301; Skewness 2.837019; Kurtosis 15.14047; Jarque-Bera 7482735 (p = 0.000000).]

We drew a million realizations because, when we just look at the $n = 40$ drawings that EViews generated in its final replication, the nature of the distribution function is much less clear, as is seen below.

[Histogram of EPS; sample 1-40, 40 observations. Mean -0.161610; Median -0.451359; Maximum 1.967872; Minimum -0.704835; Std.Dev. 0.681433; Skewness 1.580973; Kurtosis 4.643112; Jarque-Bera 21.16287 (p = 0.000025).]

2.3 LSNM; inappropriate specification with normal disturbances

In program LSNM.prg we consider the misspecified regression of $R_i$ on $W_i$ and an intercept, while omitting the regressor $A_i$ (in practice it might be unavailable, thus giving rise to unobserved heterogeneity). The bias of the OLS estimator of the coefficient of $W_i$ can be shown to be given by

  $\beta_3 \, r_{AW} \, \frac{std_A}{std_W}$.  (11)

Hence, the absolute value of this bias increases with $|r_{AW}|$, with $|\beta_3|$ and with $std_A$, and it decreases with $std_W$. Note that there is no bias when $r_{AW} = 0$.

In general terms one can establish that when the DGP is

  $y = X\beta_0 + u = X_1 \beta_0^{(1)} + X_2 \beta_0^{(2)} + u$, with $u \mid X \sim (0, \sigma_0^2 I)$,  (12)

then OLS of $y$ on just the regressors $X_1$ yields $\hat\beta^{(1)} = (X_1'X_1)^{-1}X_1'y$, where

  $E(\hat\beta^{(1)}) = (X_1'X_1)^{-1}X_1'(X_1\beta_0^{(1)} + X_2\beta_0^{(2)}) = \beta_0^{(1)} + (X_1'X_1)^{-1}X_1'X_2\,\beta_0^{(2)}$,  (13)

so there is bias (unless $\beta_0^{(2)} = 0$ or $X_1'X_2 = O$). The second term of (13) specializes to (11) for the DGP of program LSNM. Moreover, we may derive that

  $Var(\hat\beta^{(1)}) = \sigma_0^2 (X_1'X_1)^{-1}$.  (14)

This expression is actually more attractive than that for the well-specified model, where it is $\sigma_0^2 \{X_1'[I - X_2(X_2'X_2)^{-1}X_2']X_1\}^{-1}$, which has larger diagonal elements, except in the case $X_1'X_2 = O$.

EViews program LSNM: Program LSNM has again normal disturbances, but now the misspecified model is estimated, which omits the regressor a (ability). The program allows us to examine some of the effects on OLS inference. The only differences of program LSNM with respect to LSN are that the file names have been changed from "lsn" into "lsnm" and that the line which specifies the regression to be run is now:

equation eq1.ls y c w    'OLS estimation of incorrectly specified model

We run again $R = 10000$ replications now and find for the simulated distribution of $\hat\beta_2$ the histogram:

[Histogram of B2; sample 1-10000, 10000 observations. Mean 0.677359; Median 0.676809; Maximum 1.153622; Minimum 0.169626; Std.Dev. 0.130980; Skewness 0.006884; Kurtosis 2.980307; Jarque-Bera 0.240572 (p = 0.886667).]

This clearly shows the substantial bias, which is estimated by the Monte Carlo simulation to be $0.6774 - 1 = -0.3226$. Note that we find for the analytical bias (11):

  $\beta_3 \, r_{AW} \, \frac{std_A}{std_W} = 1 \times (-0.39) \times \frac{1.02}{1.23} = -0.323$.
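Formula (13) can also be checked numerically without any simulation, since it involves only the fixed regressors. A minimal Python/NumPy sketch (our own illustrative addition, reusing the regressor construction from the first sketch):

import numpy as np

rng = np.random.default_rng(17022011)
n = 40
v0, v1 = rng.standard_normal(n), rng.standard_normal(n)
a = 7 + 1.2 * v0
w = 5 - 0.5 * v0 + np.sqrt(1 - 0.5**2) * v1
X1 = np.column_stack([np.ones(n), w])   # retained regressors: intercept and w
beta3 = 1.0                             # beta^(2): true coefficient of the omitted a
# second term of (13): (X1'X1)^{-1} X1'X2 beta^(2), with X2 = a
bias = beta3 * np.linalg.solve(X1.T @ X1, X1.T @ a)
print(bias[1])                          # bias of the w-coefficient ...
r_aw = np.corrcoef(a, w)[0, 1]
print(beta3 * r_aw * a.std(ddof=1) / w.std(ddof=1))   # ... which equals (11) exactly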
If $r_{AW}$ had been much larger in absolute value, and the ratio of the standard deviations too, we might even have found a negative influence of regressor W instead of the (true) positive effect. The reason is that the estimated coefficient aims to represent the joint effects of W and A. Note also that Std.Dev. is indeed smaller here than in model LSN, in line with (14). Further inference is seriously affected too. The residuals now also represent the effect of A that could not be attributed to W. So, the residual variance is much larger than the disturbance variance, resulting in a positive bias of $s^2$ as an estimator of $\sigma_0^2$. We find:

[Histogram of S2; sample 1-10000, 10000 observations. Mean 1.902624; Median 1.880764; Maximum 3.608298; Minimum 0.537487; Std.Dev. 0.383828; Skewness 0.330991; Kurtosis 3.070766; Jarque-Bera 184.6786 (p = 0.000000).]

The distribution of the t-statistic has now shifted to the left, resulting in much too frequent rejections of the correct null hypothesis against left-hand side alternatives and no rejections against right-hand side alternatives.

[Histogram of TB2; sample 1-10000, 10000 observations. Mean -1.826437; Median -1.810931; Maximum 1.025678; Minimum -5.368940; Std.Dev. 0.770098; Skewness -0.161504; Kurtosis 3.206804; Jarque-Bera 61.29233 (p = 0.000000).]

[Histogram of REJECTTB2L; sample 1-10000, 10000 observations. Mean 0.567700; Median 1.000000; Maximum 1.000000; Minimum 0.000000; Std.Dev. 0.495420; Skewness -0.273317; Kurtosis 1.074702; Jarque-Bera 1668.992 (p = 0.000000).]

We conclude that for a proper interpretation of regression results unobserved heterogeneity can be devastating. The effects are more serious when the unobserved heterogeneity (A) is more strongly correlated with the observed heterogeneity (W). So, at this stage we conclude that non-normality of the disturbances is not a very serious problem. It is correlation of the unexplained part of the model (the disturbances) with the regressors which may ruin a standard OLS-based regression analysis.

2.4 LSNPAN; panel analysis may neutralize unobserved heterogeneity

We will now extend the DGP and the data set and assume that we can analyze exam results for the same students on two different courses and exams; hence we have

  $R_{1i} = \beta_1 + \beta_2 W_{1i} + \beta_3 A_i + \varepsilon_{1i}$, with $\varepsilon_{1i} \sim IIN(0, \sigma_\varepsilon^2)$,
  $R_{2i} = \beta_1 + \beta_2 W_{2i} + \beta_3 A_i + \varepsilon_{2i}$, with $\varepsilon_{2i} \sim IIN(0, \sigma_\varepsilon^2)$, $i = 1,\dots,n$.  (15)

Note that we assume the regression coefficients and the disturbance variance to be the same for the two exams and for all the students. Since the same students sit both exams, the $A_i$ values are also the same in the two relationships, but we distinguish $W_{1i}$ and $W_{2i}$ (students do not necessarily make the same efforts for both exams) and self-evidently $R_{1i}$ and $R_{2i}$ may differ too. So, different efforts will lead to different results, but the disturbances will also differ between the two exams (these are the effects of being ill, or in love, or having a hangover, a broken alarm clock, etc.; these are all modelled as random and supposed to be uncorrelated with W and A). We assume that $W_{1i}$ and $W_{2i}$ have equal expectation $\mu_W$ and variance $\sigma_W^2$, and a correlation given by $\rho_{12}$ (which is presumably positive; students who work hard for exam 1 tend to do so for exam 2 too).
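For later reference, note what these assumptions imply for the differenced work variable (a small step ahead of the discussion below): since $Var(W_{1i}) = Var(W_{2i}) = \sigma_W^2$ and $Corr(W_{1i}, W_{2i}) = \rho_{12}$, we have $Var(W_{1i} - W_{2i}) = 2\sigma_W^2(1 - \rho_{12})$, which shrinks to zero as $\rho_{12} \to 1$; with $\sigma_W^2 = 1$ and $\rho_{12} = 0.8$ (the value used in the program) this variance is only 0.4, while $Var(\varepsilon_{1i} - \varepsilon_{2i}) = 2\sigma_\varepsilon^2$.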
The regressors $A_i$, $W_{1i}$ and $W_{2i}$ are generated in the program from the three mutually independent $IIN(0,1)$ series $v_i^{(0)}$, $v_i^{(1)}$, $v_i^{(2)}$ according to (we immediately incorporated our choice $\sigma_W = 1$):

  $A_i = \mu_A + \sigma_A v_i^{(0)} \sim IIN(\mu_A, \sigma_A^2)$,  (16)
  $W_{1i} = \mu_W + \rho_{AW} v_i^{(0)} + (1 - \rho_{AW}^2)^{1/2} v_i^{(1)}$,
  $W_{2i} = \mu_W + \rho_{AW} v_i^{(0)} + (\rho_{12} - \rho_{AW}^2)(1 - \rho_{AW}^2)^{-1/2} v_i^{(1)} + [1 - \rho_{AW}^2 - (\rho_{12} - \rho_{AW}^2)^2 (1 - \rho_{AW}^2)^{-1}]^{1/2} v_i^{(2)}$.  (17)

One can easily verify that indeed $Var(W_{1i}) = Var(W_{2i}) = 1$ and that $Cov(W_{1i}, W_{2i}) = \rho_{12}$.

In the program we apply OLS to the data of both exams separately, both in the well-specified model and in the model omitting regressor $A_i$. In addition, we also apply OLS to a combination of all data, in which we subtract the two equations given in (15), yielding

  $\Delta R_i = \beta_2 \Delta W_i + \Delta\varepsilon_i$, with $\Delta\varepsilon_i \sim IIN(0, 2\sigma_\varepsilon^2)$,  (18)

where $\Delta R_i = R_{1i} - R_{2i}$, $\Delta W_i = W_{1i} - W_{2i}$ and $\Delta\varepsilon_i = \varepsilon_{1i} - \varepsilon_{2i}$. Note that this model does not suffer from unobserved heterogeneity and can directly be estimated by OLS. The estimator of $\beta_2$ will be unbiased. However, it will most likely have a larger variance than the estimator using just one exam in the (in practice unfeasible) well-specified model, because the variance of the disturbances is twice as large. In addition, the sample variance of $\Delta W_i$ may be small, which will be the case especially when $\rho_{12}$ is large. So, if the students do not change their study attitude from exam to exam, this does not allow us to identify the parameter $\beta_2$, unless variable $A_i$ has been observed.
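The comparison between the well-specified single-exam estimator and the differenced estimator can again be sketched compactly in Python/NumPy (our own illustrative addition, mirroring what equations eq1 and eq5 in the EViews program below do):

import numpy as np

rng = np.random.default_rng(17022011)
n, R, rho_aw, rho12 = 40, 10000, -0.5, 0.8
v0, v1, v2 = (rng.standard_normal(n) for _ in range(3))
a = 7 + 1.2 * v0
w1 = 5 + rho_aw * v0 + np.sqrt(1 - rho_aw**2) * v1
w2 = (5 + rho_aw * v0 + (rho12 - rho_aw**2) / np.sqrt(1 - rho_aw**2) * v1
      + np.sqrt(1 - rho_aw**2 - (rho12 - rho_aw**2)**2 / (1 - rho_aw**2)) * v2)  # cf. (17)
X_full = np.column_stack([np.ones(n), w1, a])    # well-specified model, exam 1
X_diff = np.column_stack([np.ones(n), w1 - w2])  # differenced model (18), with intercept
b_full, b_diff = np.empty(R), np.empty(R)
for r in range(R):
    y1 = -5 + w1 + a + rng.standard_normal(n)    # beta = (-5, 1, 1), sigma_eps = 1
    y2 = -5 + w2 + a + rng.standard_normal(n)
    b_full[r] = np.linalg.lstsq(X_full, y1, rcond=None)[0][1]
    b_diff[r] = np.linalg.lstsq(X_diff, y1 - y2, rcond=None)[0][1]
print(b_full.mean(), b_full.std(ddof=1))   # unbiased, small standard deviation
print(b_diff.mean(), b_diff.std(ddof=1))   # also unbiased, but considerably larger std.dev.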
EViews program LSNPAN:

'program: LSNPAN.prg ([email protected])
'========= Monte Carlo study of simple static panel data model =================
!n=40    'n is sample size
workfile c:\JFKdocuments\Munich\CES1\LSNPAN.wf1 u 1 !n    'workfile for Undated series t=1,...,n
rndseed 17022011
genr v0=nrnd
genr a=7+1.2*v0    'regressor ability on a 1-10+ scale
!rhoaw=-0.5    'population correlation between a and both w1 and w2
!rhow12=0.8    'population correlation between w1 and w2
genr v1=nrnd
genr w1=5+!rhoaw*v0+@sqrt(1-!rhoaw^2)*v1    'regressor w1 work in weeks course 1
genr v2=nrnd
genr w2=5+!rhoaw*v0+(!rhow12-!rhoaw^2)/@sqrt(1-!rhoaw^2)*v1
genr w2=w2+@sqrt(1-!rhoaw^2-(!rhow12-!rhoaw^2)^2/(1-!rhoaw^2))*v2
genr w1_w2=w1-w2
!beta1=-5    'true intercept
!beta2=1    'true coefficient of w
!beta3=1    'true coefficient of a
!sig=1    'true std.dev sigma of error terms
genr Ey1 = !beta1 + !beta2*w1 + !beta3*a
genr Ey2 = !beta1 + !beta2*w2 + !beta3*a
!R=10000    'Monte Carlo will consist of R replications
matrix (!R,5) SIMRES    'Monte Carlo results will be stored in matrix SIMRES
for !rep=1 to !R    'R stochastically independent replications
  genr y1 = Ey1 + !sig*nrnd    'true DGP with Normal errors for course 1
  equation eq1.ls y1 c w1 a    'OLS estimation of correctly specified 1st course
  simres(!rep,1)=eq1.@coefs(2)    'estimate of beta2 put in matrix SIMRES
  equation eq2.ls y1 c w1    'OLS estimation of incorrectly specified 1st course
  simres(!rep,2)=eq2.@coefs(2)    'estimate of beta2 put in matrix SIMRES
  genr y2 = Ey2 + !sig*nrnd    'true DGP with Normal errors for course 2
  equation eq3.ls y2 c w2 a    'OLS estimation of correctly specified 2nd course
  simres(!rep,3)=eq3.@coefs(2)    'estimate of beta2 put in matrix SIMRES
  equation eq4.ls y2 c w2    'OLS estimation of incorrectly specified 2nd course
  simres(!rep,4)=eq4.@coefs(2)    'estimate of beta2 put in matrix SIMRES
  genr y1_y2=y1-y2    'differenced data
  equation eq5.ls y1_y2 c w1_w2    'model with removed unobserved heterogeneity
  simres(!rep,5)=eq5.@coefs(2)    'estimate of beta2 put in matrix SIMRES
next
SIMRES.write c:\JFKdocuments\Munich\CES1\simres.txt    'matrix SIMRES is written on a file
workfile c:\JFKdocuments\Munich\CES1\LSNPANsim.wf1 u 1 !R    'workfile to contain matrix simres
read c:\JFKdocuments\Munich\CES1\simres.txt b21 b21m b22 b22m b2panel

Program output. On the regressors (sample size n = 40):

                 A       W1      W2
Mean            6.91    5.27    4.97
Maximum        10.12    7.43    7.02
Minimum         5.12    1.86    2.31
Std.Dev.        1.02    1.23    1.23

with sample correlations Cor(A,W1) = -0.39, Cor(W1,W2) = 0.91 and Cor(A,W2) = -0.35.

Monte Carlo results (R = 10000) on the regression results for $\beta_2$:

                             Exam 1                 Exam 2            Panel
                      A included  A omitted  A included  A omitted  A removed
$E(\hat\beta_2)$            1.00       0.68        1.00       0.71       1.01
$Std.Dev.(\hat\beta_2)$     0.14       0.13        0.14       0.13       0.43

For the results mentioned in the final column we also present the simulation histogram. Note that in the program we did put an intercept into the panel regression, as one would usually do, although the assumptions made would allow us to remove it.

[Histogram of B2PANEL; sample 1-10000, 10000 observations. Mean 1.006671; Median 1.002574; Maximum 2.589746; Minimum -0.481723; Std.Dev. 0.432228; Skewness 0.027757; Kurtosis 2.938323; Jarque-Bera 2.869153 (p = 0.238216).]

The large standard deviation of this panel estimator does not allow very sharp inferences on the true value of $\beta_2$. The situation would be much better, however, if we had more than 2 exams (and would apply so-called fixed effects panel data analysis). Having data on more than 40 students would help too. This very basic example should only illustrate that unobserved heterogeneity, if it affects all the shifts (exams) of the panel equivalently, does not preclude unbiased inference.

3 Suggestions for further experimentation

1. Run program LSN.prg again, but first choose a different integer value for rndseed. What do you conclude?
2. Run LSN.prg again for R = 100. What do you conclude?
3. Run program LSN.prg again using R = 10000, but choose a different value for rhoaw. Note how this (multicollinearity) affects $Var(\hat\beta_2)$.
4. Run program LSNM.prg for different values of rhoaw and note and explain the effects.
5. Increase $Var(A_i)$ and note how it affects the results of LSN, LSNM and LSNPAN.
6. Increase the value of n to, say, 1000 and examine the effects.
7. Reduce the magnitude of $\sigma_\varepsilon$ (sig in the programs) and examine the effects.