Download Class 8: Multilevel 2 - UCI School of Social Sciences

Multilevel Models 2
Sociology 229: Advanced Regression
Copyright © 2010 by Evan Schofer
Do not copy or distribute without permission
Announcements
• Assignment 5 Handed out
Multilevel Data
• Simpler example: 2-level data
• Which can be shown as:
Level 2:   Class 1       Class 2       Class 3
Level 1:   S1 S2 S3      S1 S2 S3      S1 S2 S3
Review: Multilevel Data: Problems
• Issue: Multilevel data often violate an
OLS regression assumption
• OLS requires an independent random sample…
• Students from the same class (or school) are not
independent… and may have correlated error
– Models tend to underestimate standard errors
• This leads to false rejection of H0.
– When is multilevel data NOT a problem?
• Answer: If you can successfully control for potential
sources of correlated error
Multilevel Data: Research Purposes
• Multilevel models are more than a “fix” for
problems of OLS. Uses:
– Raudenbush & Bryk 2002:8-9
– 1. Improved estimation of individual effects
• Take advantage of pooling; between/within variance
– 2. Modeling cross-level effects
• Separating effects of individual-level variables from
social context
– 3. Partitioning variance-covariance components
• An important descriptive issue: At what level is most of
the variance?
Review: Multilevel Data
• OLS: underestimated SEs
• Robust Cluster SEs
• A correction for OLS
• Aggregation: Focus on “between group”
effects
• Loss of sample size
• Watch for ecological fallacy
• Fixed effects: “within group” effects
• Random effects: Hybrid “within” & “between”
• Requires strong assumption that Xs uncorrelated with
error.
Fixed Effects Model (FEM)
• Fixed effects model:
$Y_{ij} = \alpha_j + \beta X_{ij} + \epsilon_{ij}$
• For i cases within j groups
• Therefore $\alpha_j$ is a separate intercept for each group
• It is equivalent to looking solely at within-group variation:
$Y_{ij} - \bar{Y}_j = \beta (X_{ij} - \bar{X}_j) + \epsilon_{ij} - \bar{\epsilon}_j$
• X-bar-sub-j is mean of X for group j, etc
• Model is “within group” because all variables are
centered around mean of each group.
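A minimal sketch of the "within" idea, in Python rather than Stata so it can be run anywhere. The toy data and function names are invented for illustration: three groups share a slope of 2 but have different intercepts, so centering each variable on its group mean recovers the common slope while a pooled OLS slope that ignores the groups does not.

```python
# Illustrative sketch only: toy data and helper names are hypothetical.

def mean(values):
    return sum(values) / len(values)

def within_slope(groups):
    """OLS slope of (Y - group mean of Y) on (X - group mean of X)."""
    num = 0.0
    den = 0.0
    for xs, ys in groups:
        x_bar, y_bar = mean(xs), mean(ys)
        for x, y in zip(xs, ys):
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    return num / den

def pooled_slope(groups):
    """Ordinary OLS slope that ignores the grouping entirely."""
    xs = [x for g in groups for x in g[0]]
    ys = [y for g in groups for y in g[1]]
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

# Each group follows Y = intercept + 2 * X, with intercepts 10, 6 and 2.
toy_groups = [
    ([1.0, 2.0, 3.0], [12.0, 14.0, 16.0]),
    ([4.0, 5.0, 6.0], [14.0, 16.0, 18.0]),
    ([7.0, 8.0, 9.0], [16.0, 18.0, 20.0]),
]

print("within slope:", within_slope(toy_groups))
print("pooled slope:", pooled_slope(toy_groups))
```

In this toy case the within slope equals the common within-group slope, while the pooled slope mixes within- and between-group variation.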
Random Effects
• Issue: The dummy variable approach (FEM)
treats group differences as a fixed effect
• Alternatively, we can treat it as a random effect
• Don’t estimate values for each case, but model it
• This requires making assumptions
– e.g., that group differences are normally distributed with a
standard deviation that can be estimated from data
– The random effects model is a hybrid: a weighted
average of between & within group effects
• It exploits between & within information, and thus can
be more efficient than FEM & aggregate models.
– IF distributional assumptions are correct.
Random Effects
• A simple random intercept model
– Notation from Rabe-Hesketh & Skrondal 2005, p. 4-5
Random Intercept Model
$Y_{ij} = \beta_0 + \zeta_j + \epsilon_{ij}$
• Where $\beta_0$ is the main intercept
• Zeta ($\zeta_j$) is a random effect for each group
– Allowing each of j groups to have its own intercept
– Assumed to be independent & normally distributed
• Error (e) is the error term for each case
– Also assumed to be independent & normally distributed
• Note: Other texts refer to random intercepts as uj or nj.
Linear Random Intercepts Model
. xtreg supportenv age male dmar demp educ incomerel ses, i(country) re
Random-effects GLS regression                   Number of obs      =     27807
Group variable (i): country                     Number of groups   =        26
R-sq:  within  = 0.0220                         Obs per group: min =       511
       between = 0.0371                                        avg =    1069.5
       overall = 0.0240                                        max =      2154
Random effects u_i ~ Gaussian                   Wald chi2(7)       =    625.50
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000
------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0038709   .0008152    -4.75   0.000    -.0054688   -.0022731
        male |   .0978732   .0229632     4.26   0.000     .0528661    .1428802
        dmar |   .0030441   .0252075     0.12   0.904    -.0463618      .05245
        demp |  -.0737466   .0252831    -2.92   0.004    -.1233007   -.0241926
        educ |   .0857407   .0061501    13.94   0.000     .0736867    .0977947
   incomerel |   .0090308   .0059314     1.52   0.128    -.0025945    .0206561
         ses |    .131528   .0134248     9.80   0.000     .1052158    .1578402
       _cons |   5.924611   .1287468    46.02   0.000     5.672272     6.17695
-------------+----------------------------------------------------------------
     sigma_u |  .59876138
     sigma_e |  1.8701896
         rho |  .09297293   (fraction of variance due to u_i)
------------------------------------------------------------------------------
Slide notes: Assumes normal uj, uncorrelated with X vars.
sigma_u = SD of u (intercepts); sigma_e = SD of e; rho = intra-class correlation
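As a quick arithmetic check, rho in the output above is the share of total variance attributable to the random intercepts, sigma_u^2 / (sigma_u^2 + sigma_e^2). The short Python sketch below (the function name is hypothetical) plugs in the sigma_u and sigma_e values reported by xtreg.

```python
def intraclass_correlation(sigma_u, sigma_e):
    """Share of total variance due to the group-level random intercept."""
    var_u = sigma_u ** 2
    var_e = sigma_e ** 2
    return var_u / (var_u + var_e)

# Values copied from the xtreg, re output for the environmental attitudes model.
rho_gls = intraclass_correlation(0.59876138, 1.8701896)
print("rho:", rho_gls)
```

The result should agree with the rho line of the output (about .093), i.e. roughly 9% of the variance lies between countries.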
Linear Random Intercepts Model
• Notes: Model can also be estimated with
maximum likelihood estimation (MLE)
• Stata:
xtreg y x1 x2 x3, i(groupid) mle
– Versus “re”, which specifies weighted least squares estimator
• Results tend to be similar
• But, MLE results include a formal test to see whether
intercepts really vary across groups
– Significant p-value indicates that intercepts vary
. xtreg supportenv age male dmar demp educ incomerel ses, i(country) mle
Random-effects ML regression                    Number of obs      =     27807
Group variable (i): country                     Number of groups   =        26
… MODEL RESULTS OMITTED …
-------------+----------------------------------------------------------------
    /sigma_u |   .5397755   .0758087                      .4098891    .7108206
    /sigma_e |   1.869954   .0079331                       1.85447    1.885568
         rho |   .0769142    .019952                      .0448349    .1240176
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)= 2128.07 Prob>=chibar2 = 0.000
Choosing Models
• Which model is best?
• There is much discussion (e.g., Halaby 2004)
• Fixed effects are consistent under a
wider range of circumstances
• Consistent: Estimates approach true parameter values
as N grows very large
• But, they are less efficient than random effects
– In cases with low within-group variation (big between group
variation) and small sample size, results can be very poor
– Random Effects = more efficient
• But, runs into problems if specification is poor
– Esp. if X variables correlate with random group effects
– Usually due to omitted variables.
Hausman Specification Test
• Hausman Specification Test: A tool to help
evaluate fit of fixed vs. random effects
• Logic: Both fixed & random effects models are
consistent if models are properly specified
• However, some model violations cause random effects
models to be inconsistent
– Ex: if X variables are correlated to random error
• In short: Models should give the same results… If not,
random effects may be biased
– If results are similar, use the most efficient model: random
effects
– If results diverge, odds are that the random effects model is
biased. In that case use fixed effects…
Hausman Specification Test
• Strategy: Estimate both fixed & random
effects models
• Save the estimates each time
• Finally invoke Hausman test
– Ex:
• xtreg var1 var2 var3, i(groupid) fe
• estimates store fixed
• xtreg var1 var2 var3, i(groupid) re
• estimates store random
• hausman fixed random
Hausman Specification Test
• Example: Environmental attitudes fe vs re
. hausman fixed random
Direct comparison of coefficients…
                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed       random       Difference          S.E.
-------------+----------------------------------------------------------------
         age |   -.0038917    -.0038709       -.0000207        .0000297
        male |    .0979514     .0978732        .0000783        .0004277
        dmar |    .0024493     .0030441       -.0005948        .0007222
        demp |   -.0733992    -.0737466        .0003475        .0007303
        educ |    .0856092     .0857407       -.0001314        .0002993
   incomerel |    .0088841     .0090308       -.0001467        .0002885
         ses |    .1318295      .131528        .0003015        .0004153
------------------------------------------------------------------------------
b = consistent under Ho and Ha; obtained from xtreg
B = inconsistent under Ha, efficient under Ho; obtained from xtreg
Test: Ho: difference in coefficients not systematic
chi2(7) = (b-B)'[(V_b-V_B)^(-1)](b-B)
        = 2.70
Prob>chi2 = 0.9116
Non-significant p-value indicates that models yield similar results…
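To see where the Hausman statistic comes from, the rough Python sketch below sums the squared (Difference / S.E.) ratios from the table. This is only a diagonal approximation: the real test uses the full matrix V_b - V_B, including off-diagonal terms that the output does not print, so the sum lands in the same ballpark as the reported chi2(7) = 2.70 but does not reproduce it. The dictionary and function names are hypothetical.

```python
# Difference (b-B) and its S.E. for each coefficient, copied from the table.
hausman_rows = {
    "age":       (-0.0000207, 0.0000297),
    "male":      (0.0000783, 0.0004277),
    "dmar":      (-0.0005948, 0.0007222),
    "demp":      (0.0003475, 0.0007303),
    "educ":      (-0.0001314, 0.0002993),
    "incomerel": (-0.0001467, 0.0002885),
    "ses":       (0.0003015, 0.0004153),
}

def diagonal_hausman(rows):
    """Sum of squared standardized differences (ignores covariances)."""
    return sum((diff / se) ** 2 for diff, se in rows.values())

approx_stat = diagonal_hausman(hausman_rows)
print("diagonal-only approximation:", round(approx_stat, 2))
```

Every ratio is below 1 in absolute value, which is why the test finds no systematic difference between the fixed and random effects estimates.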
Within & Between Effects
• Issue: What is the relationship between
within-group effects and between-group
effects?
• FEM models within-group variation
• BEM models between group variation (aggregate)
– Usually they are similar
• Ex: Student skills & test performance
• Within any classroom, skilled students do best on tests
• Between classrooms, classes with more skilled
students have higher mean test scores
– BUT…
Within & Between Effects
• But: Between and within effects can differ!
• Ex: Effects of wealth on attitudes toward welfare
• At the country level (between groups):
– Wealthier countries (high aggregate mean) tend to have pro-welfare attitudes (ex: Scandinavia)
• At the individual level (within group)
– Wealthier people are conservative, don’t support welfare
• Result: Wealth has opposite between vs within effects!
– Watch out for ecological fallacy!!!
– Issue: Such dynamics often result from omitted
level-1 variables (omitted variable bias)
• Ex: If we control for individual “political conservatism”,
effects may be consistent at both levels…
Within & Between Effects / Centering
• Multilevel models & “centering” variables
• Grand mean centering: computing variables
as deviations from overall mean
• Often done to X variables
• Has effect that baseline constant in model reflects
mean of all cases
– Useful for interpretation
• Group mean centering: computing variables
as deviation from group mean
• Useful for decomposing within vs. between effects
• Often in conjunction with aggregate group mean vars.
Within & Between Effects
• You can estimate BOTH within- and between-group effects in a single model
• Strategy: Split a variable (e.g., SES) into two new
variables…
– 1. Group mean SES
– 2. Within-group deviation from mean SES
» Often called “group mean centering”
• Then, put both variables into a random effects model
• Model will estimate separate coefficients for between
vs. within effects
– Ex:
• egen meanvar1 = mean(var1), by(groupid)
• gen withinvar1 = var1 - meanvar1
• Include mean (aggregate) & within variable in model.
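The same split can be sketched outside Stata. The Python snippet below (toy values and names are invented for illustration) computes the group mean and the within-group deviation for each case, mirroring the two generated variables above.

```python
def split_between_within(values, group_ids):
    """Return (group mean, deviation from group mean) for every case."""
    totals, counts = {}, {}
    for value, group in zip(values, group_ids):
        totals[group] = totals.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1
    group_means = {g: totals[g] / counts[g] for g in totals}
    mean_var = [group_means[g] for g in group_ids]
    within_var = [v - m for v, m in zip(values, mean_var)]
    return mean_var, within_var

# Hypothetical SES scores for six people in two countries.
ses = [2.0, 4.0, 6.0, 10.0, 12.0, 14.0]
country = ["A", "A", "A", "B", "B", "B"]

mean_ses, within_ses = split_between_within(ses, country)
print(mean_ses)
print(within_ses)
```

The group-mean variable only varies between groups, and the deviation variable only varies within groups, so their coefficients can be read as between and within effects respectively.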
Within & Between Effects
• Example: Pro-environmental attitudes
. xtreg supportenv meanage withinage male dmar demp educ incomerel ses, i(country) mle
Random-effects ML regression                    Number of obs      =     27807
Group variable (i): country                     Number of groups   =        26
Random effects u_i ~ Gaussian                   Obs per group: min =       511
                                                               avg =    1069.5
                                                               max =      2154
                                                LR chi2(8)         =    620.41
Log likelihood  = -56918.299                    Prob > chi2        =    0.0000
------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     meanage |   .0268506   .0239453     1.12   0.262    -.0200812    .0737825
   withinage |   -.003903   .0008156    -4.79   0.000    -.0055016   -.0023044
        male |   .0981351   .0229623     4.27   0.000     .0531299    .1431403
        dmar |    .003459   .0252057     0.14   0.891    -.0459432    .0528612
        demp |  -.0740394     .02528    -2.93   0.003    -.1235873   -.0244914
        educ |   .0856712   .0061483    13.93   0.000     .0736207    .0977216
   incomerel |    .008957   .0059298     1.51   0.131    -.0026651    .0205792
         ses |    .131454   .0134228     9.79   0.000     .1051458    .1577622
       _cons |   4.687526   .9703564     4.83   0.000     2.785662     6.58939
------------------------------------------------------------------------------
Between & within effects are opposite. Older countries are MORE environmental, but older people are LESS.
Omitted variables? Wealthy European countries with strong green parties have older populations!
Generalizing: Random Coefficients
• Linear random intercept model allows random
variation in intercept (mean) for groups
• But, the same idea can be applied to other coefficients
• That is, slope coefficients can ALSO be random!
Random Coefficient Model
$Y_{ij} = \beta_1 + \zeta_{1j} + \beta_2 X_{ij} + \zeta_{2j} X_{ij} + \epsilon_{ij}$
Which can be written as:
$Y_{ij} = (\beta_1 + \zeta_{1j}) + (\beta_2 + \zeta_{2j}) X_{ij} + \epsilon_{ij}$
• Where zeta-1 is a random intercept component
• Zeta-2 is a random slope component.
Linear Random Coefficient Model
[Figure: both intercepts and slopes vary randomly across j groups. Source: Rabe-Hesketh & Skrondal 2004, p. 63]
Random Coefficients Summary
• Some things to remember:
• Dummy variables allow fixed estimates of intercepts
across groups
• Interactions allow fixed estimates of slopes across
groups
– Random coefficients allow intercepts and/or
slopes to have random variability
• The model does not directly estimate those effects
– Just as we don’t estimate coefficients of “e” for each case…
• BUT, random components can be predicted after you
run a model
– Just as you can compute residuals – random error
– This allows you to examine some assumptions (normality).
STATA Notes: xtreg, xtmixed
• xtreg – allows estimation of between, within
(fixed), and random intercept models
• xtreg y x1 x2 x3, i(groupid) fe - fixed (within) model
• xtreg y x1 x2 x3, i(groupid) be - between model
• xtreg y x1 x2 x3, i(groupid) re - random intercept (GLS)
• xtreg y x1 x2 x3, i(groupid) mle - random intercept (MLE)
• xtmixed – allows random intercepts & slopes
• “Mixed” models refer to models that have both fixed and
random components
• xtmixed [depvar] [fixed equation] || [random eq], options
• Ex: xtmixed y x1 x2 x3 || groupid: x2
– Random intercept is assumed. Random coef for X2 specified.
STATA Notes: xtreg, xtmixed
• Random intercepts
• xtreg y x1 x2 x3, i(groupid) mle
– Is equivalent to
• xtmixed y x1 x2 x3 || groupid: , mle
• xtmixed assumes random intercept – even if no other
random effects are specified after “groupid”
– But, we can add random coefficients for all Xs:
• xtmixed y x1 x2 x3 || groupid: x1 x2 x3 , mle cov(unstr)
– Useful to add: “cov(unstructured)”
• Stata default treats random terms (intercept, slope) as
totally uncorrelated… not always reasonable
• “cov(unstr)” relaxes constraints regarding covariance
among random effects (See Rabe-Hesketh &
Skrondal).
STATA Notes: GLLAMM
• Note: xtmixed can do a lot… but GLLAMM
can do even more!
• “Generalized linear latent and mixed models”
• Must be downloaded into stata. Type “search gllamm”
and follow instructions to install…
– GLLAMM can do a wide range of mixed & latent-variable models
• Multilevel models; Some kinds of latent class models;
Confirmatory factor analysis; Some kinds of Structural
Equation Models with latent variables… and others…
• Documentation available via Stata help
– And, in the Rabe-Hesketh & Skrondal text.
Random intercepts: xtmixed
• Example: Pro-environmental attitudes
. xtmixed supportenv age male dmar demp educ incomerel ses || country: , mle
Mixed-effects ML regression                     Number of obs      =     27807
Group variable: country                         Number of groups   =        26
                                                Obs per group: min =       511
                                                               avg =    1069.5
                                                               max =      2154
                                                Wald chi2(7)       =    625.75
Log likelihood = -56919.098                     Prob > chi2        =    0.0000
------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0038662   .0008151    -4.74   0.000    -.0054638   -.0022687
        male |   .0978558   .0229613     4.26   0.000     .0528524    .1428592
        dmar |   .0031799   .0252041     0.13   0.900    -.0462193    .0525791
        demp |  -.0738261   .0252797    -2.92   0.003    -.1233734   -.0242788
        educ |   .0857707   .0061482    13.95   0.000     .0737204     .097821
   incomerel |   .0090639   .0059295     1.53   0.126    -.0025578    .0206856
         ses |   .1314591   .0134228     9.79   0.000     .1051509    .1577674
       _cons |   5.924237    .118294    50.08   0.000     5.692385    6.156089
------------------------------------------------------------------------------
[remainder of output cut off]
Note: xtmixed yields identical results to xtreg, mle
Random intercepts: xtmixed
• Ex: Pro-environmental attitudes (cont’d)
[Fixed-part coefficient table repeated from the previous slide]
------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
country: Identity            |
                   sd(_cons) |   .5397758   .0758083      .4098899    .7108199
-----------------------------+------------------------------------------------
                sd(Residual) |   1.869954   .0079331       1.85447    1.885568
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) = 2128.07 Prob >= chibar2 = 0.0000
xtmixed output puts all random effects below main coefficients. Here, they are “cons” (constant) for groups defined by “country”, plus residual (e)
Non-zero SD indicates that intercepts vary
Random Coefficients: xtmixed
• Ex: Pro-environmental attitudes (cont’d)
. xtmixed supportenv age male dmar demp educ incomerel ses || country: educ, mle
[output omitted]
------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0035122   .0008185    -4.29   0.000    -.0051164    -.001908
        male |   .1003692   .0229663     4.37   0.000     .0553561    .1453824
        dmar |   .0001061   .0252275     0.00   0.997    -.0493388     .049551
        demp |  -.0722059   .0253888    -2.84   0.004     -.121967   -.0224447
        educ |    .081586   .0115479     7.07   0.000     .0589526    .1042194
   incomerel |    .008965   .0060119     1.49   0.136    -.0028181    .0207481
         ses |   .1311944   .0134708     9.74   0.000     .1047922    .1575966
       _cons |   5.931294    .132838    44.65   0.000     5.670936    6.191652
------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
country: Independent         |
                    sd(educ) |   .0484399   .0087254      .0340312    .0689492
                   sd(_cons) |   .6179026   .0898918      .4646097     .821773
-----------------------------+------------------------------------------------
                sd(Residual) |    1.86651   .0079227      1.851046    1.882102
------------------------------------------------------------------------------
LR test vs. linear regression: chi2(2) = 2187.33 Prob > chi2 = 0.0000
Here, we have allowed the slope of educ to vary randomly across countries
Educ (slope) varies, too!
Random Coefficients: xtmixed
• What if the random intercept or slope
coefficients aren’t significantly different from
zero?
• Answer: that means there isn’t much random variability
in the slope/intercept
• Conclusion: You don’t need to specify that random
parameter
– Also: Models include a LRtest to compare with a
simple OLS model (no random effects)
• If models don’t differ (Chi-square is not significant) stick
with a simpler model.
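For the random intercept model, the LR test is reported as chibar2(01) because the null value of the variance sits on the boundary of the parameter space. A common way to get its p-value is to treat the reference distribution as a 50:50 mixture of chi-square(0) and chi-square(1), which halves the usual 1-df p-value. The Python sketch below assumes that mixture and uses hypothetical function names; it is an illustration, not Stata's own routine.

```python
import math

def chi2_1_sf(x):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

def chibar2_01_pvalue(x):
    """P-value under a 50:50 mixture of chi-square(0) and chi-square(1)."""
    if x <= 0:
        return 1.0
    return 0.5 * chi2_1_sf(x)

# Test statistic reported for the random intercept model above.
print(chibar2_01_pvalue(2128.07))
# A borderline statistic, for comparison with the usual 1-df test.
print(chi2_1_sf(3.841459), chibar2_01_pvalue(3.841459))
```

With a statistic as large as 2128.07 the p-value is effectively zero, so the random intercept clearly improves on OLS in this example.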
Random Coefficients: xtmixed
• What are random coefficients doing?
• Let’s look at results from a simplified model
– Only random slope & intercept for education
[Figure: fitted lines by group; x-axis: highest educational level attained (0 to 8); y-axis ticks from 3 to 8. Model fits a different slope & intercept for each group!]
Random Coefficients
• Why bother with random coefficients?
– 1. A solution for clustering (non-independence)
– Usually people just use random intercepts, but slopes may be
an issue also
– 2. You can create a better-fitting model
– If slopes & intercepts vary, a random coefficient model may fit
better
– Assuming distributional assumptions are met
– Model fit compared to OLS can be tested….
– 3. Better predictions
– Attention to group-specific random effects can yield better
predictions (e.g., slopes) for each group
» Rather than just looking at “average” slope for all groups.
Random Coefficients
• 4. Multilevel models explicitly put attention on
levels of causality
• Higher level / “contextual” effects versus individual /
unit-level effects
• A technology for separating out between/within
• NOTE: this can be done w/out random effects
– But it goes hand-in-hand with clustered data…
• Note: Be sure you have enough level-2 units!
– Ex: Models of individual environmental attitudes
• Adding level-2 effects: Democracy, GDP, etc.
– Ex: Classrooms
• Is it student SES, or “contextual” class/school SES?
Multilevel Model Notation
• So far, we have expressed random effects in
a single equation:
Random Coefficient Model
$Y_{ij} = \beta_1 + \zeta_{1j} + \beta_2 X_{ij} + \zeta_{2j} X_{ij} + \epsilon_{ij}$
• However, it is common to separate levels:
Level 1 equation
$Y_{ij} = \beta_1 + \beta_2 X_{ij} + \epsilon_{ij}$
Intercept equation
$\beta_1 = \gamma_1 + u_{1j}$
Slope equation
$\beta_2 = \gamma_2 + u_{2j}$
Gamma = constant; u = random effect
Here, we specify a random component for
level-1 constant & slope
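To make the link between the two notations explicit, substitute the intercept and slope equations into the level-1 equation. The result has the same structure as the single-equation model, with the gammas playing the role of the fixed coefficients and the u terms playing the role of the zetas:

```latex
\begin{aligned}
Y_{ij} &= \beta_1 + \beta_2 X_{ij} + \epsilon_{ij} \\
       &= (\gamma_1 + u_{1j}) + (\gamma_2 + u_{2j}) X_{ij} + \epsilon_{ij} \\
       &= \gamma_1 + u_{1j} + \gamma_2 X_{ij} + u_{2j} X_{ij} + \epsilon_{ij}
\end{aligned}
```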
Multilevel Model Notation
• The “separate equation” formulation is no
different from what we did before…
• But it is a vivid & clear way to present your models
• All random components are obvious because they are
stated in separate equations
• NOTE: Some software (e.g., HLM) requires this
– Rules:
• 1. Specify an OLS model, just like normal
• 2. Consider which OLS coefficients should have a
random component
– These could be the intercept or any X (slope) coefficient
• 3. Specify an additional formula for each random
coefficient… adding random components when desired
Cross-Level Interactions
• Does context (i.e., level-2) influence the effect
of level-1 variables?
– Example: Effect of poverty on homelessness
• Does it interact with welfare state variables?
– Ex: Effect of gender on math test scores
• Is it different in coed vs. single-sex schools?
– Can you think of others?
Cross-level interactions
• Idea: specify a level-2 variable that affects a
level-1 slope
Level 1 equation
$Y_{ij} = \beta_1 + \beta_2 X_{ij} + \epsilon_{ij}$
Intercept equation
$\beta_1 = \gamma_1 + u_{1j}$
Slope equation with interaction
$\beta_2 = \gamma_2 + \gamma_3 Z_j + u_{2j}$
Cross-level interaction: Level-2 variable Z affects slope (B2) of a level-1 X variable
Coefficient $\gamma_3$ reflects size of interaction (effect on B2 per unit change in Z)
Cross-level Interactions
• Cross-level interaction in single-equation
form:
Random Coefficient Model with cross-level interaction
$Y_{ij} = \beta_1 + \zeta_{1j} + \beta_2 X_{ij} + \zeta_{2j} X_{ij} + \beta_3 X_{ij} Z_j + \epsilon_{ij}$
– Stata strategy: manually compute cross-level
interaction variables
• Ex: Poverty*WelfareState, Gender*SingleSexSchool
• Then, put interaction variable in the “fixed” model
– Interpretation: B3 coefficient indicates the impact
of each unit change in Z on slope B2
• If B3 is positive, increase in Z results in larger B2 slope.
Cross-level Interactions
• Pro-environmental attitudes
. xtmixed supportenv age male dmar demp educ income_dev inc_meanXeduc ses || country: income_mean , mle cov(unstr)
Mixed-effects ML regression                     Number of obs      =     27807
Group variable: country                         Number of groups   =        26
------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0038786   .0008148    -4.76   0.000    -.0054756   -.0022817
        male |   .1006206   .0229617     4.38   0.000     .0556165    .1456246
        dmar |   .0041417    .025195     0.16   0.869    -.0452395    .0535229
        demp |  -.0733013   .0252727    -2.90   0.004    -.1228348   -.0237678
        educ |   -.035022   .0297683    -1.18   0.239    -.0933668    .0233227
  income_dev |   .0081591    .005936     1.37   0.169    -.0034753    .0197934
inc_meanXeduc|   .0265714   .0064013     4.15   0.000     .0140251    .0391177
         ses |   .1307931   .0134189     9.75   0.000     .1044926    .1570936
       _cons |   5.892334    .107474    54.83   0.000     5.681689    6.102979
------------------------------------------------------------------------------
inc_meanXeduc: Interaction between country mean income and individual-level education
Interaction: inc_meanXeduc has a positive effect… The education slope is
bigger in wealthy countries
Note: main effects change. “educ” indicates slope when inc_mean = 0
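A small worked example of how to read the interaction: the implied education slope in a given country is the educ main effect plus the interaction coefficient times that country's mean income. The Python sketch below uses the two coefficients from the output; the income_mean values in the loop are hypothetical, since the slides do not report the scale of that variable.

```python
EDUC_MAIN = -0.035022      # educ coefficient (slope when income mean = 0)
INTERACTION = 0.0265714    # inc_meanXeduc coefficient

def interaction_term(income_mean, educ):
    """Cross-level interaction variable: level-2 mean times level-1 X."""
    return income_mean * educ

def educ_slope(income_mean):
    """Implied effect of one unit of educ at a given country mean income."""
    return EDUC_MAIN + INTERACTION * income_mean

# Hypothetical country mean incomes, chosen only to show the pattern.
for z in (0.0, 2.0, 4.0, 6.0):
    print(z, round(educ_slope(z), 4))
```

The slope rises by about .027 for each unit increase in country mean income, which is the sense in which education matters more in wealthier countries.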
Cross-level Interactions
• Random part of output (cont’d from last slide)
. xtmixed supportenv age male dmar demp educ income_dev inc_meanXeduc ses || country: income_mean , mle cov(unstr)
------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
country: Unstructured        |
                sd(income~n) |   .5419256   .2095339       .253995    1.156256
                   sd(_cons) |   2.326379   .8679172       1.11974      4.8333
        corr(income~n,_cons) |  -.9915202   .0143006      -.999692   -.7893791
-----------------------------+------------------------------------------------
                sd(Residual) |   1.869388   .0079307      1.853909    1.884997
------------------------------------------------------------------------------
LR test vs. linear regression: chi2(3) = 2124.20 Prob > chi2 = 0.0000
Random components:
Income_mean slope allowed to have random variation
Intercepts (“cons”) allowed to have random variation
“cov(unstr)” allows for the possibility of correlation between
random slopes & intercepts… generally a good idea.
Beyond 2-level models
• Sometimes data has 3 levels or more
• Ex: School, classroom, individual
• Ex: Family, individual, time (repeated measures)
• Can be dealt with in xtmixed, GLLAMM, HLM
• Note: stata manual doesn’t count lowest level
– What we call 3-level is described as “2-level” in stata manuals
– xtmixed syntax: specify “fixed” equation and then
random effects starting with “top” level
• xtmixed var1 var2 var3 || schoolid: var2 || classid:var3
– Again, specify unstructured covariance: cov(unstr)
Crossed Effects / 2-way models
• Sometimes data are not really nested… but
crossed
• Example: Longitudinal data: Individuals nested within
countries and years
• Strategies:
– 1. Use a combination of fixed/random effects &
manual dummies
• Ex: Random effects for country, but dummies for years
– 2. Estimate “two-way” variance component model
• Random effects for country & year
• In stata you have to do this manually (3-level model)
– See Rabe-Hesketh & Skrondal for an example.
Advice about building models
• Raudenbush & Bryk 2002
– Start building the level 1 model first
– Then build level 2 model
• Keeping a close eye on level 2 N.
Beyond Linear Models
• Stata can specify multilevel models for
dichotomous & count variables
– Random intercept models
• xtlogit – logistic regression – dichotomous
• xtpois – poisson regression – counts
• xtnbreg – negative binomial – counts
• xtgee – any family, link… w/random intercept
– Random intercept & coefficient models
– Plus, allows more than 2 levels…
• xtmelogit – mixed logit model
• xtmepoisson – mixed poisson model
Shared Frailty Models: EHA
• Shared frailty model = random intercept in an
event history model
• Stata: stcox var1 var2 var3, shared(clusterID)
• Cluster ID variable could be country id, school id, etc…
• Formula: Cox model with shared frailty
$h(t) = h_0(t) \exp(\beta X_{ij} + u_i)$
• Where ui is a random variable for i groups
• Parametric shared frailty models are similar…
Shared Frailty Models: EHA
• Shared frailty (random effects) are useful for:
– 1. Clustered data
• Just like prior examples
– 2. Models with repeated events
• Repeated events is a kind of clustering within caseid
• Again, dummy variables (FEM) is a reasonable option
– In stata, you’d have to enter the dummies manually
• Stata: specify cluster ID and form of frailty
• stcox var1 var2, frailty(gamma) shared(schoolid)
• streg var1 var2, dist(e) frailty(gamma) shared(schoolid)
Activity & Reading Discussion
• Activity: Break into groups of 3-4 (or so)
• Design a study of student performance in advanced
statistics classes
• Imagine data (students) nested within sociology
departments or universities (or both)
• Explicitly theorize individual-level and contextual effects
• Explicitly think of cross-level interactions
– E.g., contextual effects that amplify or diminish effects of a
level-1 variable
• Articles:
• Schofer and Fourcade Gourinchas
• Cohen and Huffman (handout)
Panel Data
• Panel data is a multilevel structure
• Cases measured repeatedly over time
• Measurements are ‘nested’ within cases
Person 1: T1 T2 T3 T4 T5
Person 2: T1 T2 T3 T4 T5
Person 3: T1 T2 T3 T4 T5
Person 4: T1 T2 T3 T4 T5
– Obviously, error is clustered within cases… but…
– Error may also be clustered by time
• Historical time events or life-course events may mean
that cases aren’t independent
– Ex: All T1s and all T5s
• Ex: Models of economic growth… certain periods
(e.g., Oil shocks of 1970s) affect all countries.
Panel Data
• Issue: panel data may involve clustering
across cases & time
• Good news: Stata’s “xt” commands were
made for this
• Allow specification of both ID and TIME clusters…
• Ex: xtreg var1 var2 var3, mle i(countryid) t(year)
– Note: You can also “mix and match” fixed and
random effects
• Ex: You can use dummies (manually) to deal with time clustering, with a random effect for case ids
Panel Data: serial correlation
• Panel data may have another problem:
• Sequential cases may have correlated error
– Ex: Adjacent years (1950 & 1951 or 2007 & 2008) may be
very similar. Correlation denoted by “rho” (ρ)
• Called “autocorrelation” or “serial correlation”
• “Time-series” models are needed
• xtregar – xtreg, for cases in which the error-term is
“first-order autoregressive”
• First order means the prior time influences the current
– Only adjacent time-points… assumes no effect of those prior
• Can be used to estimate FEM, BEM, or GLS model
• Use option “lbi” to test for autocorrelation (rho = 0?).
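A rough way to picture first-order serial correlation is to correlate each residual with the residual from the prior time point, pairing only adjacent observations within the same case. The Python sketch below does this for toy residual series; it is a descriptive illustration with hypothetical names and data, not the estimator xtregar uses and not the statistic reported by the “lbi” option.

```python
import math

def lag1_autocorrelation(panels):
    """Pooled correlation between residuals at t-1 and t within each case."""
    prev, curr = [], []
    for series in panels:
        prev.extend(series[:-1])   # residuals at time t-1
        curr.extend(series[1:])    # residuals at time t, same case
    mean_prev = sum(prev) / len(prev)
    mean_curr = sum(curr) / len(curr)
    cov = sum((p - mean_prev) * (c - mean_curr) for p, c in zip(prev, curr))
    var_prev = sum((p - mean_prev) ** 2 for p in prev)
    var_curr = sum((c - mean_curr) ** 2 for c in curr)
    return cov / math.sqrt(var_prev * var_curr)

# Toy residuals for two cases observed at four time points each.
smooth_panels = [[1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0]]
# A single case whose residuals flip sign every period.
alternating_panels = [[1.0, -1.0, 1.0, -1.0]]

print(lag1_autocorrelation(smooth_panels))
print(lag1_autocorrelation(alternating_panels))
```

Slowly drifting residuals give a positive value, while residuals that flip sign each period give a value of -1; a value near zero would suggest little first-order autocorrelation.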
Panel Data: Choosing a Model
• If clustering is mainly a nuisance:
• Adjust SEs: vce(cluster caseid)
• Or simple fixed or random effects
– Choice between fixed & random
• Hausman test is one way to decide
• Fixed is “safer” – reviewers are less likely to complain
• But, if cross-sectional variation is of interest, fixed can
be a problem…
– In that case, use random effects… and hope the reviewers
don’t give you grief
– More on panel data next week!