Applied Econometrics
Martin Huber
Chair of Applied Econometrics - Evaluation of Public Policies
University of Fribourg
Overview
Review of the OLS assumptions
MLR.1 linearity: y = β0 + β1 x1 + ... + βk xk + u
MLR.2 i.i.d. (random) sampling
MLR.3 zero conditional mean of the errors: E(u|x) = 0
MLR.4 no perfect collinearity
MLR.5 homoskedasticity: Var(u|x) = σ²
Topic of this course: What if some assumption(s) is/are violated?
Violation of MLR.1: nonlinear models
Violation of MLR.2: non-random sampling
Violation of MLR.3: omitted variables and endogeneity due to
measurement error: E(u|x) ≠ 0
Violation of MLR.5: heteroskedasticity: Var(u|x) = σ²(x)
Contents of this lecture
1. Modelling nonlinearities using OLS
   General modelling approaches
   Ordinal variables
   Variables representing qualitative features
2. Modelling heterogeneity using OLS
   Dummy variables
   Interaction terms
   Tests for model heterogeneity
Wooldridge Chapters 7.1-7.4
Data example
General modelling approaches
Polynomials (of order J):
$$y = \beta_0 + \sum_{j=1}^{J} \beta_j x^j + u \tag{1}$$
Dummy variables for each category:
$$y = \beta_0 + \sum_{k=1}^{K-1} \beta_k \cdot 1(x = v_k) + u \tag{2}$$
Discrete variable with K different values: x ∈ {v1, v2, ..., vK−1, vK};
1(·) denotes the indicator function.
Individuals with x = vK are the reference group.
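The two approaches in (1) and (2) differ only in how the regressor matrix is built. The following is a minimal sketch, assuming NumPy is available; the helper names `polynomial_design` and `dummy_design` and the small data set are hypothetical and only serve as illustration.

```python
import numpy as np

def polynomial_design(x, order):
    """Columns [1, x, x^2, ..., x^order], matching equation (1)."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([x ** j for j in range(order + 1)])

def dummy_design(x, values):
    """Intercept plus one indicator 1(x = v_k) for every value except the
    last one, which serves as the reference group, matching equation (2)."""
    x = np.asarray(x)
    indicators = [(x == v).astype(float) for v in values[:-1]]
    return np.column_stack([np.ones(len(x))] + indicators)

# Hypothetical data: x takes K = 3 values, y is generated without noise.
x = np.array([1, 2, 3, 1, 2, 3])
y = np.array([5.0, 7.0, 4.0, 5.0, 7.0, 4.0])

X_poly = polynomial_design(x, order=2)
X_dum = dummy_design(x, values=[1, 2, 3])

beta_dum, *_ = np.linalg.lstsq(X_dum, y, rcond=None)
# beta_dum[0] is the mean of y in the reference group (x = 3);
# the remaining entries are differences relative to that group.
print(X_poly.shape, X_dum.shape, beta_dum)
```

The dummy specification places no restriction on the shape of the relation between x and y, whereas a polynomial of low order does.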
Regression with polynomials
As the dependent variable is log(y) rather than y, the coefficients
(multiplied by 100) have to be interpreted as approximate percentage
changes in y due to unit changes in the explanatory variable x.
E.g., the coefficient 0.080 on educ implies that the wage of an
otherwise comparable worker increases by approximately 8% if education
is increased by one year.
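The percentage interpretation is an approximation that works well for small coefficients. As a short sketch, the exact change implied by a log-level model is 100 · (exp(β) − 1); the function name `exact_pct_change` is hypothetical.

```python
import math

def exact_pct_change(beta, delta_x=1.0):
    """Exact percentage change in y when log(y) changes by beta * delta_x."""
    return 100.0 * (math.exp(beta * delta_x) - 1.0)

beta_educ = 0.080  # coefficient on educ quoted above
approx = 100.0 * beta_educ
exact = exact_pct_change(beta_educ)
print(f"approximate: {approx:.2f}%, exact: {exact:.2f}%")
```

For this coefficient the exact figure is about 8.3%, so the rule of thumb is close; the gap widens as the coefficient grows.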
Ordinal variables
Original variable degree in the data: professional degree
1 = no degree (nodeg)
2 = vocational training (voc)
3 = college degree (col)
4 = university degree (uni)
Ordinal variables: ordinal sorting (one value is larger or better than the
other), but no cardinal interpretation (by how much it is larger or better
is not specified)
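$$\text{wage} = \beta_0 + \beta_1\,\text{voc} + \beta_2\,\text{col} + \beta_3\,\text{uni} + u \tag{3}$$
$$\text{wage} = \beta_0 + \delta_0\,\text{female} + \beta_1\,\text{voc} + \beta_2\,\text{col} + \beta_3\,\text{uni} + u \tag{4}$$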
Why did we omit nodeg in (3)?
What is the reference group in (3)?
What is the reference group in (4)?
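One way to approach the first question is to check MLR.4 numerically. The sketch below, assuming NumPy and a made-up sample of the variable degree, compares the rank of the regressor matrix with and without the nodeg dummy.

```python
import numpy as np

# Hypothetical sample of the coded variable degree (1 = nodeg, ..., 4 = uni).
degree = np.array([1, 2, 3, 4, 2, 3, 1, 4])

const = np.ones(len(degree))
nodeg, voc, col, uni = [(degree == v).astype(float) for v in (1, 2, 3, 4)]

# Intercept plus all four dummies: the dummies sum to the constant column.
X_all = np.column_stack([const, nodeg, voc, col, uni])
# Intercept plus three dummies, nodeg omitted as in equation (3).
X_eq3 = np.column_stack([const, voc, col, uni])

print(np.linalg.matrix_rank(X_all), X_all.shape[1])  # rank is below the column count
print(np.linalg.matrix_rank(X_eq3), X_eq3.shape[1])  # full column rank
```

With all four dummies and an intercept the columns are perfectly collinear, so the coefficients are not identified; dropping one category restores full rank and makes that category the reference group.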
Variables representing qualitative features
Original variable type: Type of university employees
1 = Professor
2 = Assistant professor
3 = Doctoral assistant
4 = Head of administration
5 = Administration
6 = Others
Qualitative features: neither ordinal nor cardinal sorting
$$\text{wage} = \beta_0 + \beta_1 1(\text{type} = 1) + \beta_2 1(\text{type} = 2) + \beta_3 1(\text{type} = 3) + \beta_4 1(\text{type} = 4) + \beta_5 1(\text{type} = 5) + u \tag{5}$$
Why shouldn’t we directly use the original variable type?
What is the reference group?
What is the interpretation of β0 ?
What is the interpretation of β1 ?
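The interpretation of the coefficients in (5) can be checked on a small example. This is a sketch with invented wages, assuming NumPy; the variable name `emp_type` stands in for the variable type and is hypothetical.

```python
import numpy as np

# Hypothetical wages for the six employee types (two observations each).
emp_type = np.array([1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6])
wage = np.array([90., 110., 70., 80., 40., 50., 85., 95., 55., 65., 45., 55.])

# Regressors of equation (5): intercept and indicators for types 1 to 5.
X = np.column_stack(
    [np.ones(len(wage))] + [(emp_type == t).astype(float) for t in range(1, 6)]
)
beta, *_ = np.linalg.lstsq(X, wage, rcond=None)

mean_ref = wage[emp_type == 6].mean()    # omitted category: type 6
mean_prof = wage[emp_type == 1].mean()   # type 1

print(beta[0], mean_ref)               # intercept vs. mean of the omitted group
print(beta[1], mean_prof - mean_ref)   # beta_1 vs. difference in group means
```

Because the model contains only dummies, the intercept reproduces the mean wage of the omitted category and each β reproduces a difference in group means. Entering the coded variable directly would instead force the wage to change by the same amount between any two adjacent codes.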
Modelling heterogeneity using OLS
Dummy variables and interaction terms allow for different relations
between dependent and independent variables in different subgroups,
e.g. males and females.
Dummy variables
Suspicion that females and males receive different wages on average,
even with the same level of education
(female = 1 if the individual is female, and female = 0 if the individual
is male):
$$\text{wage} = \beta_0 + \delta_0\,\text{female} + \beta_1\,\text{educ} + u \tag{6}$$
Intercept males: β0
Intercept females: β0 + δ0
Males are the reference group: the group for which the dummy equals 0, so that
the intercept is solely determined by the constant β0.
Interaction terms
Suspicion that females and males not only receive different mean
wages with the same level of education, but also face different returns
to education:
$$\text{wage} = \beta_0 + \delta_0\,\text{female} + \beta_1\,\text{educ} + \delta_1 (\text{female} \cdot \text{educ}) + u \tag{7}$$
Intercept males: β0
Intercept females: β0 + δ0
Slope males: β1
Slope females: β1 + δ1
Interaction term: female · educ
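The group-specific intercepts and slopes listed above can be verified on simulated data. The sketch below assumes NumPy; the data-generating process and its coefficients are invented for illustration. It fits equation (7) and compares the implied intercepts and slopes with separate regressions for males and females.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
female = rng.integers(0, 2, size=n).astype(float)
educ = rng.integers(8, 19, size=n).astype(float)
# Hypothetical data-generating process, chosen only for illustration.
wage = 2.0 - 1.0 * female + 0.8 * educ - 0.2 * female * educ + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Equation (7): intercept, female, educ, female * educ.
X = np.column_stack([np.ones(n), female, educ, female * educ])
b0, d0, b1, d1 = ols(X, wage)

# Separate regressions of wage on educ within each group.
m = female == 0
f = female == 1
males = ols(np.column_stack([np.ones(m.sum()), educ[m]]), wage[m])
females = ols(np.column_stack([np.ones(f.sum()), educ[f]]), wage[f])

print("males:  ", (b0, b1), males)
print("females:", (b0 + d0, b1 + d1), females)
```

The two routes give the same point estimates, since the interacted model places no cross-group restriction on intercept or slope. The advantage of (7) is that δ0 and δ1 can be tested directly.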
Interaction of dummies: overlapping groups
$$\text{wage} = \beta_0 + \gamma_0\,\text{old} + \delta_0\,\text{female} + \beta_1\,\text{educ} + u \tag{8}$$
Intercept young male: β0
Intercept old male: β0 + γ0
Intercept young female: β0 + δ0
Intercept old female: β0 + γ0 + δ0
Age effect: γ0
Gender effect: δ0
Interaction of dummies: non-overlapping groups
$$\text{wage} = \beta_0 + \lambda_1 d_1 + \lambda_2 d_2 + \lambda_3 d_3 + \beta_1\,\text{educ} + u \tag{9}$$
Intercept young male: β0
Intercept old male (d1 = 1): β0 + λ1
Intercept young female (d2 = 1): β0 + λ2
Intercept old female (d3 = 1): β0 + λ3
λj is the wage difference of the group with dj = 1 when compared to
the group of young males (reference group) given an equal level of
education.
What is the relationship between λ1 , λ2 , λ3 , γ0 , and δ0 ?
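A numerical check can help with this question. The sketch below, assuming NumPy and invented coefficients, generates wages exactly from equation (8) without an error term and then fits equation (9) to the same data.

```python
import numpy as np

# All four combinations of old and female, each at two education levels.
old = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)
female = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
educ = np.array([10, 10, 10, 10, 16, 16, 16, 16], dtype=float)

# Hypothetical coefficients for equation (8); no error term, so the fit is exact.
beta0, gamma0, delta0, beta1 = 5.0, 2.0, -1.5, 0.5
wage = beta0 + gamma0 * old + delta0 * female + beta1 * educ

# Group dummies of equation (9); young males are the reference group.
d1 = old * (1 - female)        # old male
d2 = (1 - old) * female        # young female
d3 = old * female              # old female

X9 = np.column_stack([np.ones(len(wage)), d1, d2, d3, educ])
coef, *_ = np.linalg.lstsq(X9, wage, rcond=None)
_, lam1, lam2, lam3, _ = coef

print(lam1, lam2, lam3)
```

When (8) holds, λ1 = γ0, λ2 = δ0 and λ3 = γ0 + δ0. Equation (9) is the more general model because it leaves λ3 unrestricted, so the age effect may differ by gender.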
Regression with dummies
As the dependent variable is log(y) rather than y, the coefficients
have to be interpreted as approximate percentage changes in y due to unit
changes in the explanatory variable (in case of x) or due to a
percentage change in the explanatory variable (in case of log(x)).
E.g., the coefficient 0.054 on colonial implies that the price of an
otherwise comparable house is approximately 5.4% higher if built in
colonial style (but the coefficient is not significant).
The coefficient 0.168 on log(lotsize) implies that the house price
increases by approximately 0.168% if the lot size is increased by 1%.
Controlling for sales and employment, firms that received a grant
trained each worker, on average, 26.25 hours more.
The coefficient −6.07 on log(employ) implies that, if a firm is 1%
larger, it trains each worker approximately 0.0607 hours less.
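The rules of thumb for regressors in logs can be compared with the exact changes implied by the functional form. A minimal sketch using the coefficients quoted for log(lotsize) and log(employ); the function names are hypothetical.

```python
import math

def loglog_pct_change(elasticity, pct_change_x):
    """Exact % change in y for a given % change in x
    when log(y) is regressed on log(x)."""
    return 100.0 * ((1.0 + pct_change_x / 100.0) ** elasticity - 1.0)

def levellog_change(beta, pct_change_x):
    """Exact change in y (in its own units) for a given % change in x
    when y is regressed on log(x)."""
    return beta * math.log(1.0 + pct_change_x / 100.0)

# Coefficients quoted above.
price_change = loglog_pct_change(0.168, 1.0)   # log(price) on log(lotsize)
hours_change = levellog_change(-6.07, 1.0)     # training hours on log(employ)

print(f"house price: {price_change:.4f}% (rule of thumb: 0.168%)")
print(f"training hours: {hours_change:.4f} (rule of thumb: {-6.07 / 100:.4f})")
```

For a 1% change in the regressor, both exact values differ from the rule of thumb only in the third significant digit.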
Regression with dummies and polynomials
Testing for differences in models across groups (Wooldr. 7.4)
Fully interacted model:
SSRur : Sum of squared residuals when estimating the OLS
model with group dummy and interaction terms with all
regressors (unrestricted model)
SSRr : Sum of squared residuals when estimating the OLS model
without group dummy/interaction terms (restricted model)
(recall: $SSR = \sum_{i=1}^{n} \hat{u}_i^2$)
$$F = \frac{(SSR_r - SSR_{ur})/(k + 1)}{SSR_{ur}/[n - 2(k + 1)]} \tag{10}$$
Separate models:
SSR1 : Sum of squared residuals when estimating the OLS model
in subgroup 1
SSR2 : Sum of squared residuals when estimating the OLS model
in subgroup 2
SSR: Sum of squared residuals when estimating the OLS model
in the total sample
Chow test:
$$F = \frac{[SSR - (SSR_1 + SSR_2)]/(k + 1)}{(SSR_1 + SSR_2)/[n - 2(k + 1)]} \tag{11}$$
The standard test is based on the assumption of homoskedasticity,
stating that the variance of the error term is equal in both groups. The
validity of homoskedasticity can be tested.
If it is rejected, heteroskedasticity-robust standard errors are to be
used in the statistic.
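The statistics in (10) and (11) can be computed directly from sums of squared residuals. The following is a sketch under homoskedasticity, assuming NumPy; the helper names `ssr` and `chow_f` and the simulated data are hypothetical.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def chow_f(X, y, group):
    """Chow F statistic of equation (11).

    X must contain the intercept column plus k regressors; group is a 0/1 array.
    """
    n, k_plus_1 = X.shape
    g1 = group == 0
    g2 = group == 1
    ssr_pooled = ssr(X, y)
    ssr_sep = ssr(X[g1], y[g1]) + ssr(X[g2], y[g2])
    return ((ssr_pooled - ssr_sep) / k_plus_1) / (ssr_sep / (n - 2 * k_plus_1))

# Simulated data (hypothetical): one regressor, so k = 1.
rng = np.random.default_rng(1)
n = 300
group = rng.integers(0, 2, size=n)
x = rng.normal(size=n)
y = 1.0 + 0.5 * x + 0.7 * group + 0.4 * group * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

f_separate = chow_f(X, y, group)

# Equation (10): the same statistic from the fully interacted model.
X_full = np.column_stack([X, group[:, None] * X])
ssr_r, ssr_ur = ssr(X, y), ssr(X_full, y)
f_interacted = ((ssr_r - ssr_ur) / 2) / (ssr_ur / (n - 4))

print(f_separate, f_interacted)
```

Both routes return the same value because the residual sum of squares of the fully interacted model equals SSR1 + SSR2. Under the null hypothesis and homoskedasticity, the statistic is compared with the F distribution with k + 1 and n − 2(k + 1) degrees of freedom.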
Example from Wooldridge 7.4
Important limitation of the Chow test: null hypothesis allows for no
differences at all between the groups.
It may be more interesting to allow for an intercept difference
between the groups and then to test for slope differences.
This can be tested by including the group dummy and all
interaction terms, as in equation (7.22), but then test joint
significance of the interaction terms only (in that equation, rather
than testing based on subsamples).
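The slope-only test described above can be sketched as an F test with k restrictions, comparing a model with the group dummy only against the fully interacted model. The code assumes NumPy and homoskedastic errors; the function name `slope_difference_f` and the simulated data are hypothetical.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared OLS residuals from regressing y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def slope_difference_f(x_regs, y, group):
    """F statistic for H0: all interaction coefficients are zero,
    while the intercepts may differ between the two groups.

    x_regs: (n, k) array of regressors without the intercept.
    group:  0/1 array marking the two groups.
    """
    n, k = x_regs.shape
    const = np.ones(n)
    g = group.astype(float)
    X_r = np.column_stack([const, g, x_regs])                        # dummy only
    X_ur = np.column_stack([const, g, x_regs, g[:, None] * x_regs])  # plus interactions
    ssr_r, ssr_ur = ssr(X_r, y), ssr(X_ur, y)
    df_num, df_den = k, n - 2 * (k + 1)
    return ((ssr_r - ssr_ur) / df_num) / (ssr_ur / df_den)

# Simulated data (hypothetical): intercepts differ, slopes are equal.
rng = np.random.default_rng(2)
n = 400
group = rng.integers(0, 2, size=n)
x = rng.normal(size=(n, 2))
y = 1.0 + 0.8 * group + x @ np.array([0.5, -0.3]) + rng.normal(size=n)

f_stat = slope_difference_f(x, y, group)
print(f_stat)
```

Compared with (10), only the numerator degrees of freedom change (k instead of k + 1), because the group dummy is now part of the restricted model as well.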