Simple Panel Data Models

Ch. 13. Pooled Cross Sections Across Time:
Simple Panel Data.
• Pooled Cross Sections
• Difference-in-differences (DiD) for treatment effects
• How DiD can eliminate bias in cross-sectional OLS
• Potential sources of bias after DiD
• Panel Data
• First differences for two-period panel data
• Fixed effects for multi-period panel data
• How first differencing or fixed effects can eliminate bias in OLS
• Potential issues with FD and FE models
Pooling Cross Sections across Time: Simple Panel Data Methods
• Policy analysis with pooled cross sections
• Two or more independently sampled cross sections can be used to evaluate
the impact of a certain event or policy change
• Example: effect of a new garbage incinerator’s location on housing prices
• Examine the effect of the location of a house on its price before and after the
garbage incinerator was built
[Two estimated equations, not recovered in this transcript: one for the sample after the incinerator was built and one for the sample before it was built]
• Garbage incinerator and housing prices
• Note: houses near the incinerator site already sold for less before the incinerator
was built. Why?
• It would be inappropriate to interpret the negative effect after the incinerator was
built as a causal effect. Part of it arises because the incinerator was built near
lower-priced homes.
• More appropriate to look at the difference-in-differences (DiD):
After incinerator was built: $p_{near} - p_{far} = -30{,}688.27$
Before incinerator was built: $p_{near} - p_{far} = -18{,}824.37$
Difference in differences (DiD): $-30{,}688.27 - (-18{,}824.37) = -11{,}863.9$
• Difference-in-differences in a regression framework
• The coefficient 𝛿1 on the interaction term is the differential effect of being near
the incinerator after it was built
• Exercise: show that 𝛿1 is the DiD estimator derived above
• The DiD regression also provides a standard error and t statistic for the DiD effect
• If houses sold before and after the incinerator was built were systematically
different, further explanatory variables should be included
• Adding housing characteristics will also reduce the error variance and thus
the standard errors
• Before/after comparisons in “natural experiments”
• DiD can be used to evaluate policy changes or other exogenous events
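The equivalence between the regression coefficient 𝛿1 and the difference of mean differences can be checked numerically. The following is a minimal sketch on simulated data, assuming a regression of price on an intercept, an "after" dummy, a "near" dummy, and their interaction. The variable names (`near`, `after`, `price`) and all numbers in the simulation are hypothetical and are not the estimates quoted in the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

# Hypothetical dummies: near = 1 if the house is near the site,
# after = 1 if the sale happened after the incinerator was built.
near = rng.integers(0, 2, size=n)
after = rng.integers(0, 2, size=n)

# Simulated prices with made-up coefficients (illustration only).
price = (80_000 + 19_000 * after - 19_000 * near
         - 12_000 * near * after + rng.normal(0, 10_000, size=n))

def cell_mean(near_flag, after_flag):
    """Average price in one of the four (near, after) cells."""
    return price[(near == near_flag) & (after == after_flag)].mean()

# (1) DiD computed from the four cell means.
did_means = ((cell_mean(1, 1) - cell_mean(0, 1))
             - (cell_mean(1, 0) - cell_mean(0, 0)))

# (2) DiD as the interaction coefficient in an OLS regression.
X = np.column_stack([np.ones(n), after, near, near * after])
beta, *_ = np.linalg.lstsq(X, price, rcond=None)

# Classical (homoskedastic) OLS standard errors.
resid = price - X @ beta
sigma2 = resid @ resid / (n - X.shape[1])
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))

delta1_hat, delta1_se = beta[3], se[3]
t_stat = delta1_hat / delta1_se

print(f"DiD from means:      {did_means:,.2f}")
print(f"DiD from regression: {delta1_hat:,.2f} (se {delta1_se:,.2f}, t {t_stat:.2f})")
```

Because the regression is saturated in the two dummies, the two DiD numbers agree up to floating-point error. The regression version additionally yields a standard error and makes it straightforward to add further controls.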
• Policy evaluation using difference-in-differences
• Compare outcomes of the two groups (treated and control) before and after the policy change
• Suppose that something else happens in the treated group that causes its growth to differ by 𝜃 relative to the control
group. The DiD estimator will then include both the true effect of treatment (𝛿1) and the effect of the other factors
that cause growth to differ by 𝜃 in the treated group.
• Example: a minimum wage increase is the treatment. How is the DiD estimate of the employment effect biased if the state that
passes the minimum wage increase has unusually high economic growth? Unusually low economic growth?
• A placebo test can check that the DiD estimator isn't picking up the effect of some other factor:
a minimum wage hike shouldn't affect the employment growth of college graduates
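The bias argument can be written out as a short worked equation. This is a sketch under the assumption that, apart from the treatment effect 𝛿1, the treated group's outcome grows by an extra 𝜃 between the two periods for reasons unrelated to the policy. The subscripts T (treated) and C (control) and the symbol 𝛿0 for the common time effect are notation introduced here for illustration.

```latex
\begin{align*}
E[\bar{y}_{T,\text{after}} - \bar{y}_{T,\text{before}}] &= \delta_0 + \delta_1 + \theta \\
E[\bar{y}_{C,\text{after}} - \bar{y}_{C,\text{before}}] &= \delta_0 \\
E[\hat{\delta}_1] &= (\delta_0 + \delta_1 + \theta) - \delta_0 = \delta_1 + \theta
\end{align*}
```

In the minimum wage example, unusually high growth in the treated state corresponds to 𝜃 > 0, which biases the estimated employment effect upward. Unusually low growth corresponds to 𝜃 < 0 and biases it downward.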
• Two-period panel data (fixed effect) analysis
• Example: Effect of unemployment on city crime rate
• Assume that no other explanatory variables are available. Will it be possible
to estimate the causal effect of unemployment on crime?
• Yes, if cities are observed for at least two periods and other factors affecting
crime stay approximately constant over those periods
• Besides the unemployment rate, the model contains:
• a time dummy for the second period
• unobserved city-specific time-invariant factors (= fixed effect)
• other unobserved factors (= idiosyncratic error)
• Examples of time-constant variables that might affect city crime?
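The equation these labels describe is not recoverable from the transcript. A sketch consistent with the labels, using hypothetical variable names (crime, unem, d2) and the symbols a_i for the fixed effect and u_it for the idiosyncratic error, might look like this:

```latex
\[
crime_{it} = \beta_0 + \delta_0\, d2_t + \beta_1\, unem_{it} + a_i + u_{it},
\qquad t = 1, 2
\]
% d2_t   : time dummy, equal to 1 in the second period (assumed name)
% a_i    : unobserved city-specific, time-invariant factors (fixed effect)
% u_{it} : other unobserved factors (idiosyncratic error)
```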
• Effect of unemployment on city crime rate
• Estimate the differenced equation by OLS
• The fixed effect drops out
• The intercept captures the secular increase in crime across all cities
• A 1 percentage point increase in the unemployment rate leads to 2.22 more crimes
per 1,000 people
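The differencing step can be sketched as follows, assuming a two-period model with a second-period dummy, a fixed effect a_i, and an idiosyncratic error u_it. The variable names crime and unem are hypothetical labels. Only the slope estimate of 2.22 comes from the slides.

```latex
\begin{align*}
crime_{i2} &= (\beta_0 + \delta_0) + \beta_1\, unem_{i2} + a_i + u_{i2} \\
crime_{i1} &= \beta_0 + \beta_1\, unem_{i1} + a_i + u_{i1} \\
\Delta crime_i &= \delta_0 + \beta_1\, \Delta unem_i + \Delta u_i
\end{align*}
% a_i cancels when the first-period equation is subtracted from the second.
% \delta_0 is the intercept of the differenced equation (secular change in crime);
% the slides report \hat{\beta}_1 = 2.22.
```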
• Discussion of first-differenced panel estimator
• Further explanatory variables may be included in the original equation
• There may be arbitrary correlation between the unobserved time-invariant
characteristics and the included explanatory variables
• For example, suppose cities with less educated workers (virtually a time-invariant
characteristic) have higher crime and also higher unemployment. How would this bias
the OLS estimate of the effect of unemployment?
• First differencing removes the effect of any time-invariant variable from the
regression. This eliminates the bias that omitting important time-invariant variables would cause in OLS.
• First-differenced estimates will be imprecise if explanatory variables vary little
over time (no estimate is possible if they are time-invariant)
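A small simulation can illustrate the bias question raised above. This is a sketch under stated assumptions, not the slides' data: a time-invariant city effect `a` (think of low education) raises both unemployment and crime, the true effect of unemployment is set to 2.0, and every name and number is made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cities = 2000
beta_true = 2.0  # hypothetical true effect of unemployment on crime

# Time-invariant city effect that raises BOTH unemployment and crime.
a = rng.normal(0, 1, size=n_cities)

unem1 = 6 + 1.5 * a + rng.normal(0, 1, size=n_cities)
unem2 = 6 + 1.5 * a + rng.normal(0, 1, size=n_cities)
# Period 2 adds a common shift of 3 (secular change in crime).
crime1 = 50 + beta_true * unem1 + 5 * a + rng.normal(0, 1, size=n_cities)
crime2 = 53 + beta_true * unem2 + 5 * a + rng.normal(0, 1, size=n_cities)

def ols(y, X):
    """Return OLS coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Pooled OLS in levels, ignoring the city effect.
y = np.concatenate([crime1, crime2])
x = np.concatenate([unem1, unem2])
d2 = np.concatenate([np.zeros(n_cities), np.ones(n_cities)])
pooled_slope = ols(y, np.column_stack([np.ones(2 * n_cities), d2, x]))[2]

# First differences: the city effect cancels out.
fd_intercept, fd_slope = ols(crime2 - crime1,
                             np.column_stack([np.ones(n_cities), unem2 - unem1]))

print(f"pooled OLS slope:       {pooled_slope:.2f}")
print(f"first-difference slope: {fd_slope:.2f} (true value {beta_true})")
print(f"first-difference intercept: {fd_intercept:.2f} (true shift 3)")
```

In this setup pooled OLS overstates the effect because the omitted city effect is positively correlated with unemployment, while the first-differenced slope stays close to the true value.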
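Panel Data Methods with More than 2 Periods
• Fixed effects estimation
• The fixed effect is potentially correlated with the explanatory variables
• Form time averages for each individual
• Subtract the time averages from the original equation: the fixed effect is constant
over time, so it equals its own time average and is removed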
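The equations for this slide are missing from the transcript. A sketch of the standard within transformation for a single regressor, with notation (y, x, a_i, u_it, T periods) assumed here rather than taken from the slides, is:

```latex
\begin{align*}
y_{it} &= \beta_1 x_{it} + a_i + u_{it}, \qquad t = 1, \dots, T \\
\bar{y}_i &= \beta_1 \bar{x}_i + a_i + \bar{u}_i
  \qquad \text{(time averages for each individual } i\text{)} \\
y_{it} - \bar{y}_i &= \beta_1 (x_{it} - \bar{x}_i) + (u_{it} - \bar{u}_i)
  \qquad \text{because } a_i - a_i = 0
\end{align*}
```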
• Estimate the equation in deviations from i-specific means by OLS
• Estimates rely on time variation within cross-sectional units (= within estimator)
• In Stata: xtset, then xtreg with the fe option
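The within estimator can also be computed by hand. The following is a minimal sketch on a simulated balanced panel, not a reproduction of Stata's xtreg: the names (`unit`, `demean_by_unit`) and all parameter values are hypothetical, and the fixed effect is deliberately built to be correlated with the regressor.

```python
import numpy as np

rng = np.random.default_rng(2)
n_units, n_periods = 500, 4
beta_true = 1.5  # hypothetical true slope

# Balanced panel: unit index repeated for each period.
unit = np.repeat(np.arange(n_units), n_periods)

# Fixed effect a_i, correlated with x by construction.
a = rng.normal(0, 1, size=n_units)
x = 2.0 * a[unit] + rng.normal(0, 1, size=n_units * n_periods)
y = beta_true * x + a[unit] + rng.normal(0, 1, size=n_units * n_periods)

def demean_by_unit(v, unit, n_units):
    """Subtract each unit's time average from its observations."""
    sums = np.bincount(unit, weights=v, minlength=n_units)
    counts = np.bincount(unit, minlength=n_units)
    return v - (sums / counts)[unit]

x_dd = demean_by_unit(x, unit, n_units)
y_dd = demean_by_unit(y, unit, n_units)

# Within (fixed effects) estimator: OLS on the demeaned data, no intercept.
beta_fe = (x_dd @ y_dd) / (x_dd @ x_dd)

# Pooled OLS slope for comparison (ignores the fixed effect).
beta_pooled = np.polyfit(x, y, 1)[0]

print(f"within estimator: {beta_fe:.2f}")
print(f"pooled OLS:       {beta_pooled:.2f}")
```

Demeaning a time-invariant variable such as `a[unit]` gives zeros for every observation, which is why its effect cannot be estimated this way.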
Advanced Panel Data Methods
• Example: Effect of training grants on firm scrap rate (number of
defective items per 100 produced)
• Time-invariant reasons why one firm is more productive than another are controlled for. The
important point is that these may be correlated with the other explanatory variables.
• Fixed-effects estimation using the years 1987, 1988, and 1989
(stars denote time-demeaned variables)
• Training grants significantly improve productivity (with a time lag)
• Discussion of fixed effects estimator
• Strict exogeneity in the original model has to be assumed
• The R² of the demeaned equation is not an appropriate goodness-of-fit measure
• The effect of time-invariant variables cannot be estimated
• The effect of interactions with time-invariant variables can be estimated (e.g.
the interaction of education with time dummies)
• If a full set of time dummies is included, the effect of variables whose
change over time is constant cannot be estimated (e.g. age)
• Degrees of freedom have to be adjusted because the individual-specific
averages are estimated in addition to the other coefficients (resulting degrees of
freedom = NT − N − k)
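The degrees-of-freedom adjustment matters when the demeaned regression is run by hand with an ordinary OLS routine. The sketch below assumes the demeaned regression has no intercept, so the routine would use NT − k degrees of freedom. The function names and the example values of N, T, and k are hypothetical.

```python
def fe_degrees_of_freedom(n_units: int, n_periods: int, k: int) -> int:
    """Degrees of freedom for fixed effects estimation: NT - N - k."""
    return n_units * n_periods - n_units - k

def fe_se_correction(n_units: int, n_periods: int, k: int) -> float:
    """Factor by which naive OLS standard errors from the demeaned
    regression (which use NT - k) must be multiplied."""
    naive_df = n_units * n_periods - k
    return (naive_df / fe_degrees_of_freedom(n_units, n_periods, k)) ** 0.5

# Hypothetical panel: 50 units, 3 periods, 4 regressors.
df = fe_degrees_of_freedom(50, 3, 4)
factor = fe_se_correction(50, 3, 4)
print(df)                 # 150 - 50 - 4 = 96
print(round(factor, 3))
```

With few time periods the correction is not negligible: here the naive standard errors are too small by roughly a quarter of their value.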
Advanced Panel Data Methods (Ch. 14)
• Applying panel data methods to other data structures
• Panel data methods can be used in other contexts where constant
unobserved effects have to be removed
• Example: Wage equations for twins
• Write one equation for twin 1 and one for twin 2 in family i; both contain the same
unobserved genetic and family characteristics, which do not vary across twins
• Estimate the differenced equation by OLS
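The twin equations themselves are missing from the transcript. A sketch of how the differencing might look, assuming a log wage outcome, a single generic regressor x (for instance education), and a family effect a_i shared by both twins, is given below. All of this notation is an assumption for illustration.

```latex
\begin{align*}
\log(wage_{i1}) &= \beta_0 + \beta_1 x_{i1} + a_i + u_{i1}
  \qquad \text{(twin 1 in family } i\text{)} \\
\log(wage_{i2}) &= \beta_0 + \beta_1 x_{i2} + a_i + u_{i2}
  \qquad \text{(twin 2 in family } i\text{)} \\
\log(wage_{i1}) - \log(wage_{i2}) &= \beta_1 (x_{i1} - x_{i2}) + (u_{i1} - u_{i2})
\end{align*}
% The family effect a_i plays the role of the fixed effect and cancels
% when one twin's equation is subtracted from the other's.
```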