CES-Munich Lectures on:
Econometric inference on endogenous interventions
Jan F. Kiviet (University of Amsterdam)
document version: 16 February 2011
Lecture 1:
Perils of unobserved heterogeneity;
simulation illustrations
We provide some basic results from Ordinary Least-Squares (OLS) theory and illustrate them by a small-scale classic Monte Carlo simulation study. That means that we design a data generating process (DGP) ourselves on a computer in all its detail. Next we act as if we do not know the parameter values, or perhaps do not know the appropriate model specification either (which is the usual situation in practice!). Based on a chosen model specification and by employing particular standard OLS-based inference techniques, we use the generated data in order to obtain actual drawings from the true distribution of the employed estimator of the model parameters, or of a test statistic to verify a particular parametric hypothesis. Thus, instead of using analytical methods to analyze the properties of the inference techniques, we assess these properties by experimentation. In a classic Monte Carlo simulation study we generate data according to the true DGP and apply the technique many times (this is the 'number of replications'). Then, building on the Law of Large Numbers and the Central Limit Theorem, we assess from this sufficiently large simulation sample of realizations the distributional characteristics of the technique (such as bias, variance and other characteristics of its distribution) in a realistically small quasi-empirical sample.
1 Artificial data on student efforts and grades
To get familiar with the interpretation of Monte Carlo simulation results, we first do some simple experiments in a situation where we know a lot about the actual finite sample properties of OLS estimators (and in fact do not need simulation). First, we check how well Monte Carlo results reproduce these known finite sample properties. Next we move to situations where the model is less standard and some actual finite sample properties cannot (easily) be derived analytically. Then the Monte Carlo results on the actual finite sample properties can be used to better understand the situation and to check how well or badly the available asymptotic approximations work in practice.
First, regressor variables $W_i$ (work input measured in study weeks) and ability $A_i$ ($i = 1, \ldots, n$) are created. In fact we choose $n = 40$, $W_i \sim \text{IIN}(5, 1)$ and $A_i \sim \text{IIN}(7, 1.2^2)$, but such that the correlation between $A_i$ and $W_i$ is equal to a particular numerical value $\rho_{AW}$ (called rhoaw in the computer program). Initially we choose $\rho_{AW} = -0.5$. Hence, we assume that more able students tend to work slightly less. We keep these regressor variables constant over the Monte Carlo replications; so, we condition on these values.
The true regression relationship (DGP) is chosen to be:
$$R_i = \beta_1 + \beta_2 W_i + \beta_3 A_i + \varepsilon_i, \qquad \varepsilon_i \sim \text{IIN}(0, \sigma_\varepsilon^2), \qquad i = 1, \ldots, n. \qquad (1)$$
The regressors $A_i$ and $W_i$ are actually generated in the program according to $v_i^{(0)}, v_i^{(1)} \sim \text{IIN}(0, 1)$ and
$$A_i = \mu_A + \sigma_A v_i^{(0)} \sim \text{IIN}(\mu_A, \sigma_A^2), \qquad (2)$$
$$W_i = \mu_W + \sigma_W [\rho_{AW} v_i^{(0)} + (1 - \rho_{AW}^2)^{1/2} v_i^{(1)}] \sim \text{IIN}(\mu_W, \sigma_W^2), \qquad (3)$$
so that indeed
$$\text{Cov}(A_i, W_i) = E\{[A_i - E(A_i)][W_i - E(W_i)]\} = \sigma_A \sigma_W E\{v_i^{(0)}[\rho_{AW} v_i^{(0)} + (1 - \rho_{AW}^2)^{1/2} v_i^{(1)}]\} = \sigma_A \sigma_W \rho_{AW}.$$
Of course, $\mu_A$, $\mu_W$, $\sigma_A^2$, $\sigma_W^2$ are population moments. The $n$ realized values of $A_i$ and $W_i$ have sample moments:
$$m_A = \frac{1}{n}\sum_{i=1}^{n} A_i, \qquad m_W = \frac{1}{n}\sum_{i=1}^{n} W_i,$$
$$(\text{std}_A)^2 = \frac{1}{n-1}\sum_{i=1}^{n} (A_i - m_A)^2, \qquad (\text{std}_W)^2 = \frac{1}{n-1}\sum_{i=1}^{n} (W_i - m_W)^2,$$
$$r_{AW} = \frac{\frac{1}{n-1}\sum_{i=1}^{n} (A_i - m_A)(W_i - m_W)}{\text{std}_A \, \text{std}_W},$$
where $r_{AW}$ is the sample equivalent of the population correlation coefficient $\rho_{AW}$.
Note that the sample means, standard deviations and the correlation will deviate from the
chosen population values, because n is rather small.
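As a quick check, these sample moments can be computed directly in EViews. The following is a minimal sketch, assuming the series a and w created in program LSN.prg below are in the active workfile; @mean, @stdev and @cor are standard EViews functions:

scalar ma=@mean(a)      'sample mean m_A
scalar mw=@mean(w)      'sample mean m_W
scalar stda=@stdev(a)   'sample standard deviation std_A (divisor n-1)
scalar stdw=@stdev(w)   'sample standard deviation std_W (divisor n-1)
scalar raw=@cor(a,w)    'sample correlation r_AW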
2 Four EViews simulation programs
We work with the EViews programs LSN.prg (least-squares under normality), LSNN.prg (least-squares with non-normal disturbances), LSNM.prg (least-squares under normality of a misspecified model) and LSNPAN.prg (least-squares under normality exploiting a panel data set). They enable us to examine some properties of least-squares analysis in a model that mimics the situation introduced and discussed earlier.
2.1 LSN; appropriate specification and normal disturbances
In the ideal well-specified multiple regression of $R_i$ on $W_i$, $A_i$ and an intercept, the least-squares estimators are unbiased and, because the disturbances are normally distributed, so are the OLS coefficient estimates. Then Student t-statistics really are distributed as $t_{37}$ under the null hypothesis tested. Thus type I errors (the probability of rejecting a true null hypothesis) can be controlled exactly and kept as small as one finds appropriate, say 5%. This we illustrate and corroborate by simulation program LSN.
More generally, econometric theory says the following. For the DGP
$$y = X\beta_0 + \varepsilon, \quad \text{with } \varepsilon \mid X \sim N(0, \sigma_0^2 I), \qquad (4)$$
where $X$ is an $n \times k$ matrix, $y$ and $\varepsilon$ random $n \times 1$ vectors and $\beta_0$ a $k \times 1$ vector, the regressors $X$ are exogenous with respect to $\varepsilon$, so OLS is the preferred technique. The in practice unobserved true parameter values $\beta_0$ and $\sigma_0^2$ should be estimated by $\hat\beta = (X'X)^{-1}X'y$ and $s^2 = (y - X\hat\beta)'(y - X\hat\beta)/(n - k)$. One can derive that
$$\hat\beta \sim N(\beta_0, \sigma_0^2 (X'X)^{-1}), \qquad (5)$$
$$(n - k)s^2/\sigma_0^2 \sim \chi^2(n - k), \qquad (6)$$
$$\frac{(\hat\beta - \beta_0)_j}{s\,[(X'X)^{-1}]_{jj}^{1/2}} \sim t(n - k). \qquad (7)$$
Hence the OLS $\hat\beta$ is unbiased for $\beta_0$ and it is normally distributed. Also, $s^2$ is unbiased for $\sigma_0^2$ and distributed as a multiple of a $\chi^2(n - k)$ variable, whereas $\text{Var}(s^2) = \frac{2}{n - k}\sigma_\varepsilon^4$. Testing $H_0: \beta_j = c$ by a t-statistic yields a drawing from a Student distribution with $n - k$ degrees of freedom, provided $c$ equals the true value of the $j$-th coefficient in the DGP.
In the program we generate in $R = 10000$ replications (so, again and again) the $n = 40$ realizations of the dependent variable $y_i = \beta_1 + \beta_2 W_i + \beta_3 A_i + \varepsilon_i$, where $\varepsilon_i \sim \text{IIN}(0, \sigma_\varepsilon^2)$, $i = 1, \ldots, n$. We choose $\sigma_\varepsilon = 1$, $\beta_1 = -5$, $\beta_2 = 1$, $\beta_3 = 1$. Hence, for $\rho_{AW} = -0.5$ we have $y_i \sim \text{IIN}(7, 2.24)$, so some of the $y_i$ realizations may be larger than 10 (which would not occur with marks in practice, but this is of no concern for the purpose of the illustration).
Remember that this is in fact a situation where we do not need Monte Carlo to assess $E(\hat\beta)$, $\text{Var}(\hat\beta)$, $E(s^2)$ and $\text{Var}(s^2)$, or the shape of the distribution functions of $\hat\beta$ and $s^2$, because they can be derived analytically. However, this is a good starting point for understanding the unavoidable (in)accuracies (because $R < \infty$) of Monte Carlo simulation. These will remain, also in more complex situations where the expectation and variance of estimators cannot always be derived analytically. Then we will exploit Monte Carlo to assess (or better, to approximate) them, and to examine the accuracy for small $n$ of standard large-$n$ asymptotic approximations provided by econometric theory.
Be aware of the different roles of the sample sizes $n$ and $R$! In practice one often has to accept that $n$ is rather small, so that asymptotic approximations may be poor. Given some patience and sufficient computer power, $R$ can always be chosen so large that the remaining Monte Carlo simulation approximation errors are as small as one wishes.
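The accuracy claim can be made concrete with the usual simulation-error formula. For any statistic $q$ with finite variance recorded in $R$ independent replications, the Law of Large Numbers and the Central Limit Theorem give
$$\bar q = \frac{1}{R}\sum_{r=1}^{R} q_r \approx E(q), \qquad SE(\bar q) = \frac{sd(q)}{\sqrt{R}},$$
so the Monte Carlo error shrinks at rate $R^{-1/2}$: quadrupling $R$ halves it. This formula will be used repeatedly below to judge the simulation results.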
EViews program LSN:

'program: LSN.prg ([email protected])
'all text after ' is comment
'========= Monte Carlo study of simple normal multiple regression model ===========
!n=40                                    'n is sample size
workfile c:\JFKdocuments\Munich\CES1\LSN.wf1 u 1 !n  'workfile for Undated series t=1,..,n
rndseed 17022011                         'to initialize the random number generator
genr v0=nrnd
genr a=7+1.2*v0                          'regressor ability on a 1-10+ scale
!rhoaw=-0.5                              'population correlation between a and w
genr v1=nrnd
genr w=5+!rhoaw*v0+@sqrt(1-!rhoaw^2)*v1  'regressor w work in weeks course 1
!beta1=-5                                'true intercept
!beta2=1                                 'true coefficient of w
!beta3=1                                 'true coefficient of a
!sig=1                                   'true std.dev sigma of disturbance term
!R=10000                                 'Monte Carlo will consist of R replications
matrix (!R,3) SIMRES                     'Monte Carlo results will be stored in matrix SIMRES
for !rep=1 to !R                         'R stochastically independent replications
genr y = !beta1 + !beta2*w + !beta3*a + !sig*nrnd  'true DGP with Normal disturbances
equation eq1.ls y c w a                  'OLS estimation of correctly specified model
simres(!rep,1)=eq1.@coefs(2)             'estimate of beta2 put in matrix SIMRES
simres(!rep,2)=eq1.@stderrs(2)           'its estimated standard error in 2nd column
simres(!rep,3)=eq1.@se                   'estimate of sigma in 3rd column
next
SIMRES.write c:\JFKdocuments\Munich\CES1\simres.txt  'matrix SIMRES is written on a file
workfile c:\JFKdocuments\Munich\CES1\LSNsim.wf1 u 1 !R  'workfile to contain matrix simres
read c:\JFKdocuments\Munich\CES1\simres.txt b2 seb2 s  'SIMRES reformatted as workfile
'... containing 3 named variables (each of R observations)
genr tb2=(b2-!beta2)/seb2                't test for b2 of true H0
genr s2=s*s                              'OLS estimator of sigma-squared
genr rejecttb2L=tb2<@qtdist(0.05,!n-3)   'rejections against 1-sided alternative ...
genr rejecttb2R=tb2>@qtdist(0.95,!n-3)   '... at nominal significance level 5%
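The histograms and summary statistics reported below were produced from the series in workfile LSNsim.wf1 via the standard EViews series views. A minimal scripted sketch of the headline numbers (assuming @mean is available, which is a standard EViews function, and that program variables such as !beta2 are still in scope, as they are within the same program run) would be:

scalar biasb2=@mean(b2)-!beta2      'Monte Carlo estimate of the bias of the OLS estimator of beta2
scalar rejfreqL=@mean(rejecttb2L)   'estimated type I error probability of the left-sided test
scalar rejfreqR=@mean(rejecttb2R)   'estimated type I error probability of the right-sided test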
Program output. On the regressors:

                 A        W
Mean            6.91     5.27
Maximum        10.12     7.43
Minimum         5.12     1.86
Std.Dev.        1.02     1.23
Observations     40       40

Correlation(A, W) = -0.39
On the regressions:

[Figure: histogram of the R = 10000 simulated values of B2]
Series: B2, Sample 1-10000, Observations 10000. Mean 1.000511; Median 0.999789; Maximum 1.493138; Minimum 0.477839; Std.Dev. 0.141591; Skewness 0.002357; Kurtosis 2.985302; Jarque-Bera 0.099276 (Probability 0.951574).
So the actual bias of zero is estimated from the simulations to be 0.000511 (on the basis of these 10000 replications), with standard error 0.0014: the 10000 drawings of $\hat\beta_2$ have standard deviation 0.1416, so their sample average has standard error $0.1416/\sqrt{R} = 0.0014$. So the unbiasedness of $\hat\beta_2$ has been corroborated, and so has its normality (Jarque-Bera p-value of 0.95).
[Figure: histogram of the R = 10000 simulated values of S2]
Series: S2, Sample 1-10000, Observations 10000. Mean 0.997461; Median 0.977120; Maximum 2.092082; Minimum 0.383796; Std.Dev. 0.232354; Skewness 0.469465; Kurtosis 3.279405; Jarque-Bera 399.8563 (Probability 0.000000).
The unbiased estimator $s^2$ has an estimated bias of -0.002539 (with standard error 0.0023). It has a distribution conforming to a $\chi^2(37)$ divided by 37, hence it has expectation 1 and true standard deviation $\sqrt{2/37} = 0.2325$, which is in close agreement with the Monte Carlo estimate of 0.2324.
[Figure: histogram of the R = 10000 simulated values of TB2]
Series: TB2, Sample 1-10000, Observations 10000. Mean 0.002066; Median -0.001504; Maximum 4.552857; Minimum -4.266754; Std.Dev. 1.032424; Skewness -0.016947; Kurtosis 3.193371; Jarque-Bera 16.05881 (Probability 0.000326).
Since $\beta_2 = 1$, the null hypothesis tested is true, so the test statistic (collected $R$ times in variable TB2) follows the Student distribution with 37 degrees of freedom, which has expectation 0 and a standard deviation slightly above 1 (namely $\sqrt{37/35} \approx 1.028$). Also, it is symmetric, so its skewness is zero, whereas its tails are slightly fatter than for the normal, giving a kurtosis just above 3 (namely $3 + 6/33 \approx 3.18$). Again, all these known theoretical results are corroborated (but not perfectly re-established, because $R < \infty$) in the simulation.
[Figure: histogram of the R = 10000 simulated values of REJECTTB2L]
Series: REJECTTB2L, Sample 1-10000, Observations 10000. Mean 0.049000; Median 0.000000; Maximum 1.000000; Minimum 0.000000; Std.Dev. 0.215879; Skewness 4.178479; Kurtosis 18.45969; Jarque-Bera 128683.6 (Probability 0.000000).
Against left-hand alternatives the true type I error probability is 5% and we find a frequency of rejection of 0.0490 (this estimate has standard error 0.002).
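The quoted standard error follows because each entry of REJECTTB2L is a Bernoulli drawing with success probability 0.05 under the null; worked out:
$$SE(\hat p) = \sqrt{\frac{p(1-p)}{R}} = \sqrt{\frac{0.05 \times 0.95}{10000}} \approx 0.0022.$$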
[Figure: histogram of the R = 10000 simulated values of REJECTTB2R]
Series: REJECTTB2R, Sample 1-10000, Observations 10000. Mean 0.051500; Median 0.000000; Maximum 1.000000; Minimum 0.000000; Std.Dev. 0.221026; Skewness 4.058543; Kurtosis 17.47177; Jarque-Bera 114716.4 (Probability 0.000000).
Against right-hand side alternatives we again find an estimated type I error probability which deviates from the true value by a small and (given the Monte Carlo standard error) acceptable amount.
2.2 LSNN; appropriate specification and non-normal disturbances
In EViews program LSNN.prg we examine the well-specified multiple regression of $R_i$ on $W_i$, $A_i$ and an intercept, so again the least-squares estimators are unbiased, but now the disturbances are not normally distributed. Therefore the OLS estimator itself is no longer normally distributed, and the Student statistics are only approximately distributed as $t_{37}$ under the null hypothesis tested (for $n \to \infty$ they would be $t_\infty = N(0,1)$, but what does that mean for $n - k = 37$?). Thus, type I errors cannot be controlled exactly. How much the actual type I error probability deviates from the nominal one is very hard to evaluate analytically, but in particular cases it can easily be assessed by simulation (by choosing the number of replications large enough so that the simulation errors are guaranteed to be reasonably small).
In program LSNN we examine the case in which the disturbances are still independent with expectation zero and variance $\sigma_0^2$, but extremely skew. They are centered and rescaled drawings from the $\chi^2(1)$ distribution.
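That this centering and rescaling preserves the required first two moments is easily checked: if $u_i \sim N(0,1)$ then $u_i^2 \sim \chi^2(1)$ with $E(u_i^2) = 1$ and $\text{Var}(u_i^2) = 2$, so
$$\varepsilon_i = \sigma_0 \frac{u_i^2 - 1}{\sqrt{2}} \quad\Longrightarrow\quad E(\varepsilon_i) = 0, \qquad \text{Var}(\varepsilon_i) = \frac{\sigma_0^2}{2}\,\text{Var}(u_i^2) = \sigma_0^2.$$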
So, in more general terms we now consider the DGP
$$y = X\beta_0 + \varepsilon, \quad \text{with } \varepsilon \mid X \sim (0, \sigma_0^2 I), \qquad (8)$$
where now only asymptotically
$$\hat\beta \stackrel{a}{\sim} N(\beta_0, \sigma_0^2 (X'X)^{-1}) \quad \text{for } n \to \infty, \qquad (9)$$
$$\frac{(\hat\beta - \beta_0)_j}{s\,[(X'X)^{-1}]_{jj}^{1/2}} \stackrel{a}{\sim} N(0, 1) \quad \text{for } n \to \infty. \qquad (10)$$
Apart from its first two moments, the only distributional properties of OLS that can relatively easily be derived analytically are asymptotic in nature. For $n \to \infty$, due to the Central Limit Theorem, convergence of the OLS coefficient estimator towards normality can be established, and similarly for t-statistics when the null hypothesis is true.
EViews program LSNN:
This differs from LSN in just one respect: the disturbances $\varepsilon_i$ are non-normal. Now they have a skew distribution, but still the same first and second moment, because $\varepsilon_i = \sigma_0 (u_i^2 - 1)/\sqrt{2}$, where $u_i \sim \text{IIN}(0, 1)$. Note that, although $\varepsilon_i \sim \text{IID}(0, \sigma_0^2)$, it has a skew distribution (rescaled $\chi^2_1$). This does not change the asymptotic properties of $\hat\beta$ and $s^2$, nor those of the test statistics. In fact, $\hat\beta$ and $s^2$ are still unbiased, but their distributions are different, and so is that of the test statistic on $\beta_2$, giving rejection probabilities that may differ systematically from their nominal values.
Apart from renaming the program and its workfiles (changing LSN into LSNN) we only changed:
|
for !rep=1 to !R                         'R stochastically independent replications
genr v2=nrnd                             'v2 is yet another series of N(0,1)
genr eps=!sig*(v2^2-1)/@sqrt(2)          'eps has expectation 0, std.dev. !sig and is skew
genr y = !beta1 + !beta2*w + !beta3*a + eps  'true DGP with NonNormal disturbances
equation eq1.ls y c w a                  'OLS estimation of correctly specified model
|
[Figure: histogram of the R = 1000000 simulated values of B2]
Series: B2, Sample 1-1000000, Observations 1000000. Mean 0.999992; Median 1.000638; Maximum 1.892529; Minimum 0.040872; Std.Dev. 0.141208; Skewness -0.049555; Kurtosis 3.715505; Jarque-Bera 21740.43 (Probability 0.000000).
However, to also indicate the effect of changing $R$ (the number of replications) we increased it here to $10^6$, and for good reasons: we want to examine whether the introduced skewness and kurtosis of the disturbances change the skewness and the kurtosis of the OLS coefficient estimates, and it is well known that one needs a very large Monte Carlo sample size in order to estimate those with reasonable precision. Note that the bias and standard deviation (EViews should actually call what it labels as Std. Dev. the 'sample standard error') have not changed much (because their true population values remained the same). However, especially the kurtosis did change, and the p-value for the test on normality is now zero, so normality is clearly rejected!
[Figure: histogram of the R = 1000000 simulated values of TB2]
Series: TB2, Sample 1-1000000, Observations 1000000. Mean 1.78e-05; Median 0.005021; Maximum 5.541440; Minimum -5.734523; Std.Dev. 1.027734; Skewness -0.047971; Kurtosis 3.093635; Jarque-Bera 748.8517 (Probability 0.000000).
The t-test statistic has a slightly skew distribution. Next we check whether these strange disturbances have affected the actual significance level.
[Figure: histogram of the R = 1000000 simulated values of REJECTTB2L]
Series: REJECTTB2L, Sample 1-1000000, Observations 1000000. Mean 0.050876; Median 0.000000; Maximum 1.000000; Minimum 0.000000; Std.Dev. 0.219745; Skewness 4.087693; Kurtosis 17.70924; Jarque-Bera 11799941 (Probability 0.000000).
Against left-hand side alternatives we find 5.09% (which is genuinely larger than 5%, because (.0509 - .0500)/.00022 = 4.09).
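Worked out, with the Bernoulli standard-error formula and $R = 10^6$:
$$SE(\hat p) = \sqrt{\frac{0.05 \times 0.95}{10^6}} \approx 0.00022, \qquad z = \frac{0.0509 - 0.0500}{0.00022} \approx 4.1,$$
which lies far beyond conventional critical values, so this excess rejection frequency is systematic and not simulation noise.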
[Figure: histogram of the R = 1000000 simulated values of REJECTTB2R]
Series: REJECTTB2R, Sample 1-1000000, Observations 1000000. Mean 0.048981; Median 0.000000; Maximum 1.000000; Minimum 0.000000; Std.Dev. 0.215828; Skewness 4.179424; Kurtosis 18.46758; Jarque-Bera 12879853 (Probability 0.000000).
And against right-hand alternatives the estimated actual 4.8981% is also significantly smaller than 5%. However, for practical purposes it mostly does not really matter whether we test at the 5% level, or at the 6% or 4% level. Therefore, we should conclude that in this particular example the effect of extremely skew disturbances certainly does not undermine standard OLS inference, because even at $n = 40$ the Central Limit Theorem already exerts its force. To indicate how remarkable this is, we also illustrate how extremely skew the disturbances used here are, by running the following simple program:
’program skew.prg (illustrates skewness)
!n=1000000
workfile c:\JFKdocuments\Munich\CES1\skew.wf1 u 1 !n
genr eps=(nrnd^2-1)/@sqrt(2)
This yields:
[Figure: histogram of 1000000 drawings of EPS]
Series: EPS, Sample 1-1000000, Observations 1000000. Mean 0.000163; Median -0.386198; Maximum 16.53911; Minimum -0.707107; Std.Dev. 1.001301; Skewness 2.837019; Kurtosis 15.14047; Jarque-Bera 7482735 (Probability 0.000000).
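These simulated values agree closely with the exact moments. Using the standard results that a $\chi^2(k)$ variable has skewness $\sqrt{8/k}$ and kurtosis $3 + 12/k$, and that location and scale changes affect neither, the disturbances $\varepsilon_i = (u_i^2 - 1)/\sqrt{2}$ have
$$\text{skewness} = \sqrt{8} \approx 2.83, \qquad \text{kurtosis} = 3 + 12 = 15.$$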
We drew a million realizations because, when we just look at the $n = 40$ drawings that EViews generated in its final replication, the nature of the distribution function is much less clear, as is seen below.
[Figure: histogram of the n = 40 drawings of EPS from the final replication]
Series: EPS, Sample 1-40, Observations 40. Mean -0.161610; Median -0.451359; Maximum 1.967872; Minimum -0.704835; Std.Dev. 0.681433; Skewness 1.580973; Kurtosis 4.643112; Jarque-Bera 21.16287 (Probability 0.000025).

2.3 LSNM; inappropriate specification with normal disturbances
In program LSNM.prg we consider the misspecified regression of $R_i$ on $W_i$ and an intercept, while omitting the regressor $A_i$ (in practice it might be unavailable, thus giving rise to unobserved heterogeneity). The bias of the OLS estimator of the coefficient of $W_i$ can be shown to be given by
$$\beta_3 \frac{\text{std}_A}{\text{std}_W} r_{AW}. \qquad (11)$$
Hence, the absolute value of this bias increases with $|r_{AW}|$, $|\beta_3|$ and with $\text{std}_A$, and it decreases with $\text{std}_W$. Note that there is no bias when $r_{AW} = 0$.
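Formula (11) can be evaluated directly from the generated regressors. A minimal sketch, assuming the workfile of LSNM.prg is active and using the standard EViews functions @stdev and @cor:

scalar biasb2=!beta3*(@stdev(a)/@stdev(w))*@cor(a,w)   'analytical omitted-variable bias (11)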
In general terms one can establish that when the DGP is
$$y = X\beta_0 + u = X_1\beta_0^{(1)} + X_2\beta_0^{(2)} + u, \quad \text{with } u \mid X \sim (0, \sigma_0^2 I), \qquad (12)$$
then OLS of $y$ on just the regressors $X_1$ yields $\hat\beta^{(1)} = (X_1'X_1)^{-1}X_1'y$, where
$$E(\hat\beta^{(1)}) = (X_1'X_1)^{-1}X_1'(X_1\beta_0^{(1)} + X_2\beta_0^{(2)}) = \beta_0^{(1)} + (X_1'X_1)^{-1}X_1'X_2\beta_0^{(2)}, \qquad (13)$$
so there is bias (unless $\beta_0^{(2)} = 0$ or $X_1'X_2 = O$). The second term of (13) specializes to (11) for the DGP of program LSNM.
Moreover, we may derive that
$$\text{Var}(\hat\beta^{(1)}) = \sigma_0^2 (X_1'X_1)^{-1}. \qquad (14)$$
This expression is actually more attractive than that for the well-specified model, where it is $\sigma_0^2 \{X_1'[I - X_2(X_2'X_2)^{-1}X_2']X_1\}^{-1}$, which has larger diagonal elements, except for the case $X_1'X_2 = O$.
EViews program LSNM:
Program LSNM again has normal disturbances, but now the misspecified model is estimated, which omits the regressor a (ability). The program allows us to examine some of the effects on OLS inference.
The only differences of program LSNM with respect to LSN are that the file names have been changed from "lsn" into "lsnm" and the line which specifies the regression to be run is now:

equation eq1.ls y c w                    'OLS estimation of incorrectly specified model

We run R = 10000 replications again and find for the simulated distribution of $\hat\beta_2$ the histogram:
[Figure: histogram of the R = 10000 simulated values of B2 under misspecification]
Series: B2, Sample 1-10000, Observations 10000. Mean 0.677359; Median 0.676809; Maximum 1.153622; Minimum 0.169626; Std.Dev. 0.130980; Skewness 0.006884; Kurtosis 2.980307; Jarque-Bera 0.240572 (Probability 0.886667).
This clearly shows the substantial bias, which is estimated by the Monte Carlo simulation to be $0.6774 - 1 = -0.3226$. Note that for the analytical bias (11) we find:
$$\beta_3 \frac{\text{std}_A}{\text{std}_W} r_{AW} = 1 \times \frac{1.02}{1.23} \times (-0.39) = -0.323.$$
If $r_{AW}$ had been much larger in absolute value, and the ratio of the standard deviations too, we might have found a negative influence of regressor $W$ instead of the (true) positive effect. The reason is that the estimated coefficient aims to represent the joint effects of $W$ and $A$. Note also that the Std.Dev. is indeed smaller here than in model LSN.
Further inference is seriously affected too. The residuals now also represent the effect of $A$ that could not be attributed to $W$. So the residual variance is much larger than the disturbance variance, resulting in a positive bias of $s^2$ as an estimator of $\sigma_0^2$. We find:
[Figure: histogram of the R = 10000 simulated values of S2 under misspecification]
Series: S2, Sample 1-10000, Observations 10000. Mean 1.902624; Median 1.880764; Maximum 3.608298; Minimum 0.537487; Std.Dev. 0.383828; Skewness 0.330991; Kurtosis 3.070766; Jarque-Bera 184.6786 (Probability 0.000000).
The distribution of the t-statistic has shifted to the left now, resulting in much too frequent
rejections of the correct null hypothesis against left-hand side alternatives and no rejections against
right-hand side alternatives.
[Figure: histogram of the R = 10000 simulated values of TB2 under misspecification]
Series: TB2, Sample 1-10000, Observations 10000. Mean -1.826437; Median -1.810931; Maximum 1.025678; Minimum -5.368940; Std.Dev. 0.770098; Skewness -0.161504; Kurtosis 3.206804; Jarque-Bera 61.29233 (Probability 0.000000).
[Figure: histogram of the R = 10000 simulated values of REJECTTB2L under misspecification]
Series: REJECTTB2L, Sample 1-10000, Observations 10000. Mean 0.567700; Median 1.000000; Maximum 1.000000; Minimum 0.000000; Std.Dev. 0.495420; Skewness -0.273317; Kurtosis 1.074702; Jarque-Bera 1668.992 (Probability 0.000000).
We conclude that unobserved heterogeneity can be devastating for a proper interpretation of regression results. The effects are more serious when the unobserved heterogeneity ($A$) is more strongly correlated with the observed heterogeneity ($W$).
So, at this stage we conclude that non-normality of the disturbances is not a very serious problem. It is correlation of the unexplained part of the model (the disturbances) with the regressors which may ruin a standard OLS-based regression analysis.
2.4 LSNPAN; panel analysis may neutralize unobserved heterogeneity
We will now extend the DGP and data set and assume that we can analyze exam results for the same students on two different courses and exams; hence we have, for $i = 1, \ldots, n$:
$$R_{1i} = \beta_1 + \beta_2 W_{1i} + \beta_3 A_i + \varepsilon_{1i}, \quad \varepsilon_{1i} \sim \text{IIN}(0, \sigma_\varepsilon^2), \qquad (15)$$
$$R_{2i} = \beta_1 + \beta_2 W_{2i} + \beta_3 A_i + \varepsilon_{2i}, \quad \varepsilon_{2i} \sim \text{IIN}(0, \sigma_\varepsilon^2).$$
Note that we assume the regression coefficients and the disturbance variance to be the same for the two exams and for all the students. Since the same students sit both exams, the $A_i$ values are also the same in the two relationships, but we distinguish $W_{1i}$ and $W_{2i}$ (students do not necessarily make the same efforts for both exams) and self-evidently $R_{1i}$ and $R_{2i}$ may be different too. So, different efforts will lead to different results, but also the disturbances will be different at the two exams (these are the effects of being ill, or in love, or having a hangover, a broken alarm clock etc.; these are all modelled as random and supposed to be uncorrelated with $W$ and $A$).
We assume that $W_{1i}$ and $W_{2i}$ have equal expectation $\mu_W$ and variance $\sigma_W^2$ and a correlation given by $\rho_{12}$ (which is presumably positive; students that work hard for exam 1 do so too for exam 2).
The regressors $A_i$, $W_{1i}$ and $W_{2i}$ are generated in the program from the three mutually independent IIN(0,1) series $v_i^{(0)}, v_i^{(1)}, v_i^{(2)}$ according to (we immediately incorporated our choice $\sigma_W = 1$):
$$A_i = \mu_A + \sigma_A v_i^{(0)} \sim \text{IIN}(\mu_A, \sigma_A^2), \qquad (16)$$
$$W_{1i} = \mu_W + \rho_{AW} v_i^{(0)} + (1 - \rho_{AW}^2)^{1/2} v_i^{(1)},$$
$$W_{2i} = \mu_W + \rho_{AW} v_i^{(0)} + (\rho_{12} - \rho_{AW}^2)(1 - \rho_{AW}^2)^{-1/2} v_i^{(1)} + [1 - \rho_{AW}^2 - (\rho_{12} - \rho_{AW}^2)^2 (1 - \rho_{AW}^2)^{-1}]^{1/2} v_i^{(2)}. \qquad (17)$$
One can easily verify that indeed $\text{Var}(W_{1i}) = \text{Var}(W_{2i}) = 1$, and that $E[(W_{1i} - \mu_W)(W_{2i} - \mu_W)] = \rho_{12}$.
In the program we apply OLS to the data of both exams separately, both in the well-specified model and in the model omitting regressor $A_i$. In addition, we also apply OLS to a combination of all data, in which we subtract the two equations given in (15), yielding
$$\Delta R_i = \beta_2 \Delta W_i + \Delta\varepsilon_i, \quad \Delta\varepsilon_i \sim \text{IIN}(0, 2\sigma_\varepsilon^2), \qquad (18)$$
where $\Delta R_i = R_{1i} - R_{2i}$, $\Delta W_i = W_{1i} - W_{2i}$ and $\Delta\varepsilon_i = \varepsilon_{1i} - \varepsilon_{2i}$. Note that this model does not suffer from unobserved heterogeneity and can directly be estimated by OLS. The estimator of $\beta_2$ will be unbiased. However, it will most likely have a larger variance than the estimator using just one exam in the (in practice infeasible) well-specified model, because the variance of the disturbances is twice as large. In addition, the sample variance of $\Delta W_i$ may be small, which will especially be the case if $\rho_{12}$ is large. So, if the students do not change their study attitude from exam to exam, this does not allow us to identify the parameter $\beta_2$, unless variable $A_i$ has been observed.
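To see why a large $\rho_{12}$ is harmful here, note that (at the population values, with $\sigma_W = 1$)
$$\text{Var}(\Delta W_i) = \text{Var}(W_{1i}) + \text{Var}(W_{2i}) - 2\,\text{Cov}(W_{1i}, W_{2i}) = 2(1 - \rho_{12}),$$
which shrinks to zero as $\rho_{12} \to 1$, while $\text{Var}(\Delta\varepsilon_i) = 2\sigma_\varepsilon^2$ stays fixed. With the choice $\rho_{12} = 0.8$ used below, $\text{Var}(\Delta W_i) = 0.4$.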
EViews program LSNPAN:

'program: LSNPAN.prg ([email protected])
' ========= Monte Carlo study of simple static panel data model =================
!n=40                                    'n is sample size
workfile c:\JFKdocuments\Munich\CES1\LSNPAN.wf1 u 1 !n  'workfile for Undated series t=1,...,n
rndseed 17022011
genr v0=nrnd
genr a=7+1.2*v0                          'regressor ability on a 1-10+ scale
!rhoaw=-0.5                              'population correlation between a and both w1 and w2
!rhow12=0.8                              'population correlation between w1 and w2
genr v1=nrnd
genr w1=5+!rhoaw*v0+@sqrt(1-!rhoaw^2)*v1 'regressor w1 work in weeks course 1
genr v2=nrnd
genr w2=5+!rhoaw*v0+(!rhow12-!rhoaw^2)/@sqrt(1-!rhoaw^2)*v1
genr w2=w2+@sqrt(1-!rhoaw^2-(!rhow12-!rhoaw^2)^2/(1-!rhoaw^2))*v2
genr w1_w2=w1-w2
!beta1=-5                                'true intercept
!beta2=1                                 'true coefficient of w
!beta3=1                                 'true coefficient of a
!sig=1                                   'true std.dev sigma of error terms
genr Ey1 = !beta1 + !beta2*w1 + !beta3*a
genr Ey2 = !beta1 + !beta2*w2 + !beta3*a
!R=10000                                 'Monte Carlo will consist of R replications
matrix (!R,5) SIMRES                     'Monte Carlo results will be stored in matrix SIMRES
for !rep=1 to !R                         'R stochastically independent replications
genr y1 = Ey1 + !sig*nrnd                'true DGP with Normal errors for course 1
equation eq1.ls y1 c w1 a                'OLS estimation of correctly specified 1st course
simres(!rep,1)=eq1.@coefs(2)             'estimate of beta2 put in matrix SIMRES
equation eq2.ls y1 c w1                  'OLS estimation of incorrectly specified 1st course
simres(!rep,2)=eq2.@coefs(2)             'estimate of beta2 put in matrix SIMRES
genr y2 = Ey2 + !sig*nrnd                'true DGP with Normal errors for course 2
equation eq3.ls y2 c w2 a                'OLS estimation of correctly specified 2nd course
simres(!rep,3)=eq3.@coefs(2)             'estimate of beta2 put in matrix SIMRES
equation eq4.ls y2 c w2                  'OLS estimation of incorrectly specified 2nd course
simres(!rep,4)=eq4.@coefs(2)             'estimate of beta2 put in matrix SIMRES
genr y1_y2=y1-y2                         'differenced data
equation eq5.ls y1_y2 c w1_w2            'model with removed unobserved heterogeneity
simres(!rep,5)=eq5.@coefs(2)             'estimate of beta2 put in matrix SIMRES
next
SIMRES.write c:\JFKdocuments\Munich\CES1\simres.txt  'matrix SIMRES is written on a file
workfile c:\JFKdocuments\Munich\CES1\LSNPANsim.wf1 u 1 !R  'workfile to contain matrix simres
read c:\JFKdocuments\Munich\CES1\simres.txt b21 b21m b22 b22m b2panel
Program output. On the regressors (sample size n = 40):

                 A       W1      W2
Mean            6.91    5.27    4.97
Maximum        10.12    7.43    7.02
Minimum         5.12    1.86    2.31
Std.Dev.        1.02    1.23    1.23

Correlations: Cor(A, W1) = -0.39, Cor(A, W2) = -0.35, Cor(W1, W2) = 0.91
Monte Carlo results (R = 10000) on the regression results for $\beta_2$:

                        Exam 1                  Exam 2            Panel
                 A included  A omitted   A included  A omitted   A removed
E(b2hat)            1.00        0.68        1.00        0.71        1.01
Std.Dev.(b2hat)     0.14        0.13        0.14        0.13        0.43
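The value 0.43 in the final column can be checked against the textbook variance of the slope estimator in a simple regression, conditioning on the realized regressors and using the sample moments reported above (standard deviation 1.23 for both $W_1$ and $W_2$, correlation 0.91), with disturbance variance $2\sigma_\varepsilon^2 = 2$; a back-of-envelope and therefore only approximate check:
$$\text{Var}(\hat\beta_2^{\,panel}) \approx \frac{2\sigma_\varepsilon^2}{\sum_{i=1}^{n}(\Delta W_i - \overline{\Delta W})^2} \approx \frac{2}{39 \times 2 \times 1.23^2 \times (1 - 0.91)} \approx 0.19, \qquad \text{so sd} \approx 0.43.$$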
For the results mentioned in the final column we also present the simulation histogram. Note that in the program we did put an intercept into the panel regression, as one would usually do, although the assumptions made would allow us to remove it.
[Figure: histogram of the R = 10000 simulated values of B2PANEL]
Series: B2PANEL, Sample 1-10000, Observations 10000. Mean 1.006671; Median 1.002574; Maximum 2.589746; Minimum -0.481723; Std.Dev. 0.432228; Skewness 0.027757; Kurtosis 2.938323; Jarque-Bera 2.869153 (Probability 0.238216).
The large standard deviation of this panel estimator does not allow us to make very sharp inferences on the true value of $\beta_2$. The situation would be much better, however, if we had more than 2 exams (and would apply so-called fixed-effects panel data analysis). Having data on more than 40 students would help too. This very basic example should only illustrate that unobserved heterogeneity, if it affects all the shifts (exams) of the panel equivalently, does not preclude unbiased inference.
3 Suggestions for further experimentation
1. Run program LSN.prg again but first choose a different integer value for rndseed. What do you conclude?
2. Run LSN.prg again for R = 100. What do you conclude?
3. Run program LSN.prg again using R = 10000, but choose a different value for rhoaw. Note how this (multicollinearity) affects $\text{Var}(\hat\beta_2)$.
4. Run program LSNM.prg for different values of rhoaw and note and explain the effects.
5. Increase $\text{Var}(A_i)$ and note how it affects the results of LSN, LSNM and LSNPAN.
6. Increase the value of n to, say, 1000 and examine the effects.
7. Reduce the magnitude of $\sigma_\varepsilon$ (sig in the programs) and examine the effects.