Multilevel Models 2
Sociology 229A, Class 18
Copyright © 2008 by Evan Schofer
Do not copy or distribute without permission
Multilevel Data
• Simple example: 2-level data
[Figure: cases grouped into six boxes, each labeled "Class"]
• Which can be shown as:
Level 2:   Class 1      Class 2      Class 3
Level 1:   S1 S2 S3     S1 S2 S3     S1 S2 S3
Multilevel Data: Problems
• When is multilevel data NOT a problem?
– Answer: If you can successfully control for
potential sources of correlated error
• Add a control to OLS model for: classroom, school,
and state characteristics that would be sources of
correlated error in each group
• Ex: Teacher quality, class size, budget, etc…
• But: We often can’t identify or measure all
relevant sources of correlated error
• Thus, we need to abandon simple OLS regression and
try other approaches.
Review: Multilevel Strategies
• Problems of multilevel models
• Non-independence; correlated error
• Standard errors = underestimated
• Solutions:
– Each has benefits, disadvantages…
• 1. OLS regression
• 2. Aggregation (between effects model)
• 3. Robust Standard Errors
• 4. Robust Cluster Standard Errors
• 5. Dummy variables (Fixed Effects Model)
• 6. Random effects models
Robust Standard Errors
• Strategy #1: Improve our estimates of the
standard errors
– Option 1: Robust Standard Errors
• reg y x1 x2 x3, robust
• The Huber / White / “Sandwich” estimator
• An alternative method of computing standard errors
that is robust to a variety of assumption violations
– Provides accurate estimates in presence of heteroskedasticity
• Also, robust to model misspecification
– Note: Freedman’s criticism: What good are accurate SEs if
coefficients are biased due to poor specification?
Robust Cluster Standard Errors
• Option 2: Robust cluster standard errors
– A modification of robust SEs to address clustering
• reg y x1 x2 x3, cluster(groupid)
– Note: Cluster implies robust (vs. regular SEs)
• It is easy to adapt robust standard errors to address
clustering in data; See:
– http://www.stata.com/support/faqs/stat/robust_ref.html
– http://www.stata.com/support/faqs/stat/cluster.html
• Result: SE estimates typically increase, which is
appropriate because non-independent cases aren’t
providing as much information as would a sample of
independent cases.
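A quick way to see this is to fit the same model three times and put the standard errors side by side. The sketch below is illustrative only: y, x1, x2, x3 and groupid are the placeholder names used above, and the stored-estimate names (ols, rob, clust) are made up for this example. vce(robust) and vce(cluster groupid) are the newer spellings of the robust and cluster() options.

```stata
* Sketch: compare conventional, robust, and cluster-robust standard errors.
* y, x1-x3, and groupid are placeholder names.
regress y x1 x2 x3
estimates store ols
regress y x1 x2 x3, vce(robust)
estimates store rob
regress y x1 x2 x3, vce(cluster groupid)
estimates store clust
* Coefficients are identical in all three columns; only the SEs differ.
estimates table ols rob clust, b(%9.4f) se(%9.4f)
```

If the clustered SEs are much larger than the conventional ones, that is a sign that cases within groups are not independent.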
Dummy Variables
• Another solution to correlated error within
groups/clusters: Add dummy variables
• Include a dummy variable for each Level-2 group, to
explicitly model variance in means
• A simple version of a “fixed effects” model (see below)
• Ex: Student achievement; data from 3 classes
• Level 1: students; Level 2: classroom
• Create dummy variables for each class
– Include all but one dummy variable in the model
– Or include all dummies and suppress the intercept
$Y_i = \alpha + \beta_1 D_{\text{Class 2},i} + \beta_2 D_{\text{Class 3},i} + \beta_3 X_i + \varepsilon_i$
Dummy Variables
• What is the consequence of adding group
dummy variables?
• A separate intercept is estimated for each group
• Correlated error is absorbed into intercept
– Groups won’t systematically fall above or below the regression
line
• In fact, all “between group” variation (not just error) is
absorbed into the intercept
– Thus, other variables are really just looking at within group
effects
– This can be good or bad, depending on your goals.
Dummy Variables
• Note: You can create a set of dummy
variables in stata as follows:
• xi i.classid – creates dummy variables for each
unique value of the variable “classid”
– Creates variables named _Iclassid_1, _Iclassid_2, etc
• These dummies can be added to the analysis by
specifying the variable: _Iclassid*
• Ex: reg y x1 x2 x3 _Iclassid*, nocons
– “nocons” removes the constant, allowing you to use a full set
of dummies. Alternately, you could drop one dummy.
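In Stata 11 and later, factor-variable notation does the same job without the xi prefix. A minimal sketch, using the placeholder names y, x1, x2, x3 and classid from above (the stub name cls is made up for this example):

```stata
* Factor-variable notation: dummies for classid, one level omitted as the base.
regress y x1 x2 x3 i.classid
* ibn. keeps every level; pair it with noconstant for a full set of dummies.
regress y x1 x2 x3 ibn.classid, noconstant
* Alternative: build the dummies by hand as cls1, cls2, ...
tabulate classid, generate(cls)
```

The first form corresponds to dropping one dummy; the second corresponds to the “nocons” approach.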
Example: Pro-environmental values
• Dummy variable model
. reg supportenv age male dmar demp educ incomerel ses _Icountry*
      Source |       SS       df       MS              Number of obs =   27807
-------------+------------------------------           F( 32, 27774) =   98.50
       Model |  11024.1401    32  344.504377           Prob > F      =  0.0000
    Residual |  97142.6001 27774  3.49760928           R-squared     =  0.1019
-------------+------------------------------           Adj R-squared =  0.1009
       Total |   108166.74 27806  3.89005036           Root MSE      =  1.8702

------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0038917   .0008158    -4.77   0.000    -.0054906   -.0022927
        male |   .0979514   .0229672     4.26   0.000     .0529346    .1429683
        dmar |   .0024493   .0252179     0.10   0.923     -.046979    .0518777
        demp |  -.0733992   .0252937    -2.90   0.004    -.1229761   -.0238223
        educ |   .0856092   .0061574    13.90   0.000     .0735404     .097678
   incomerel |   .0088841   .0059384     1.50   0.135    -.0027554    .0205237
         ses |   .1318295   .0134313     9.82   0.000     .1055036    .1581554
_Icountry_32 |  -.4775214    .085175    -5.61   0.000    -.6444687   -.3105742
_Icountry_50 |   .3943565   .0844248     4.67   0.000     .2288798    .5598332
_Icountry_70 |   .1696262   .0865254     1.96   0.050     .0000321    .3392203
… dummies omitted …
_Icountr~891 |    .243995   .0802556     3.04   0.002       .08669    .4012999
       _cons |   5.848789    .082609    70.80   0.000     5.686872    6.010707
------------------------------------------------------------------------------
Dummy Variables
• Benefits of the dummy variable approach
• It is simple
– Just estimate a different intercept for each group
• Sometimes the dummy interpretations can be of interest
• Weaknesses
• Cumbersome if you have many groups
• Uses up lots of degrees of freedom (not parsimonious)
• Makes it hard to look at other kinds of group dummies
– Non-varying group variables = collinear with dummies
• Can be problematic if your main interest is to study effects of
variables across groups
– Dummies purge that variation… focus on within-group variation
– If you don’t have much within group variation, there isn’t much left to
analyze.
Dummy Variables
• Note: Dummy variables are a simple example
of a “fixed effects” model (FEM)
• Effect of each group is modeled as a “fixed effect”
rather than a random variable
• Also can be thought of as the “within-group” estimator
– Looks purely at variation within groups
– Stata can do a Fixed Effects Model without the
effort of using all the dummy variables
• Simply request the “fixed effects” estimator in xtreg.
Fixed Effects Model (FEM)
• Fixed effects model:
$Y_{ij} = \alpha_j + \beta X_{ij} + \varepsilon_{ij}$
• For i cases within j groups
• Therefore $\alpha_j$ is a separate intercept for each group
• It is equivalent to looking solely at within-group variation:
$Y_{ij} - \bar{Y}_j = \beta (X_{ij} - \bar{X}_j) + \varepsilon_{ij} - \bar{\varepsilon}_j$
• X-bar-sub-j ($\bar{X}_j$) is the mean of X for group j, etc
• Model is “within group” because all variables are
centered around mean of each group.
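The within-group form follows from averaging the model over the cases in each group and then subtracting. A short derivation, writing the group intercept as $\alpha_j$, the slope as $\beta$, and the group mean of the error as $\bar{\varepsilon}_j$ (notation assumed here for illustration):

```latex
\begin{align*}
Y_{ij} &= \alpha_j + \beta X_{ij} + \varepsilon_{ij} \\
\bar{Y}_j &= \alpha_j + \beta \bar{X}_j + \bar{\varepsilon}_j
  && \text{(average over cases $i$ in group $j$)} \\
Y_{ij} - \bar{Y}_j &= \beta\,(X_{ij} - \bar{X}_j) + (\varepsilon_{ij} - \bar{\varepsilon}_j)
  && \text{(subtract: $\alpha_j$ cancels)}
\end{align*}
```

Because $\alpha_j$ drops out, anything that is constant within a group (measured or not) cannot bias $\beta$, and cannot be estimated either.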
Fixed Effects Model (FEM)
. xtreg supportenv age male dmar demp educ incomerel ses, i(country) fe
Fixed-effects (within) regression               Number of obs      =     27807
Group variable (i): country                     Number of groups   =        26

R-sq:  within  = 0.0220                         Obs per group: min =       511
       between = 0.0368                                        avg =    1069.5
       overall = 0.0239                                        max =      2154

                                                F(7,27774)         =     89.23
corr(u_i, Xb)  = 0.0213                         Prob > F           =    0.0000

------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0038917   .0008158    -4.77   0.000    -.0054906   -.0022927
        male |   .0979514   .0229672     4.26   0.000     .0529346    .1429683
        dmar |   .0024493   .0252179     0.10   0.923     -.046979    .0518777
        demp |  -.0733992   .0252937    -2.90   0.004    -.1229761   -.0238223
        educ |   .0856092   .0061574    13.90   0.000     .0735404     .097678
   incomerel |   .0088841   .0059384     1.50   0.135    -.0027554    .0205237
         ses |   .1318295   .0134313     9.82   0.000     .1055036    .1581554
       _cons |   5.878524    .052746   111.45   0.000     5.775139    5.981908
-------------+----------------------------------------------------------------
     sigma_u |  .55408807
     sigma_e |  1.8701896
         rho |  .08069488   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(25, 27774) =    94.49           Prob > F = 0.0000

Note: Coefficients are identical to the dummy variable model!
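The rho reported in the fixed effects output can be reproduced by hand from sigma_u and sigma_e. The arithmetic below uses the rounded values .5541 and 1.8702 from that output:

```latex
\[
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
     = \frac{0.5541^2}{0.5541^2 + 1.8702^2}
     \approx \frac{0.3070}{0.3070 + 3.4976}
     \approx 0.0807
\]
```

So roughly 8% of the variance not explained by the X variables lies between countries. Note also that sigma_e (1.8702) equals the Root MSE of the dummy variable regression.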
ANOVA: A Digression
• Suppose you wish to model variable Y for j
groups (clusters)
• Ex: Wages for different racial groups
• Definitions:
• The grand mean is the mean of all cases, across all groups
– Y-bar
• The group mean is the mean of a particular sub-group
of the population
– Y-bar-sub-j
ANOVA: Concepts & Definitions
• Y is the dependent variable
• We are looking to see if Y depends upon the particular
group a person is in
• The effect of a group is the difference
between a group’s mean & the grand mean
• Effect is denoted by alpha (α)
• If Y-bar = $8.75 and Y-bar for Group 1 = $8.90, then α for Group 1 = $0.15
• Effect of being in group j is:
$\alpha_j = \bar{Y}_j - \bar{Y}$
• It is like a deviation, but for a group.
ANOVA: Concepts & Definitions
• ANOVA is based on partitioning deviation
• We initially calculated deviation as the
distance of a point from the grand mean:
$d_i = Y_i - \bar{Y}$
• But, you can also think of deviation from a
group mean (called “e”):
$e_{i,\text{Group 1}} = Y_{i,\text{Group 1}} - \bar{Y}_{\text{Group 1}}$
• Or, for any case i in group j:
$e_{ij} = Y_{ij} - \bar{Y}_j$
ANOVA: Concepts & Definitions
• The location of any case is determined by:
• The Grand Mean, μ, common to all cases
• The group “effect” α, common to members of the group
• The distance between a group and the grand mean
• “Between group” variation
• The within-group deviation (e): called “error”
• The distance from group mean to a case’s value
The ANOVA Model
• This is the basis for a formal model:
• For any population with mean μ
• Comprised of J subgroups, Nj in each group
• Each with a group effect α
• The location of any individual can be expressed as follows:
$Y_{ij} = \mu + \alpha_j + e_{ij}$
• Yij refers to the value of case i in group j
• eij refers to the “error” (i.e., deviation from
group mean) for case i in group j
Sum of Squared Deviation
• We are most interested in two parts of model
• The group effects: $\alpha_j$
• Deviation of the group from the grand mean
• Individual case error: $e_{ij}$
• Deviation of the individual from the group mean
• Each is a deviation that can be summed up
• Remember, we square deviations when summing
• Otherwise, they add up to zero
• Remember, variance is just the average squared deviation
Sum of Squared Deviation
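• The total deviation can be partitioned into $\alpha_j$ and
$e_{ij}$ components:
• That is, $\alpha_j + e_{ij}$ = total deviation:
$\alpha_j = \bar{Y}_j - \bar{Y}$
$e_{ij} = Y_{ij} - \bar{Y}_j$
$e_{ij} + \alpha_j = (\bar{Y}_j - \bar{Y}) + (Y_{ij} - \bar{Y}_j) = Y_{ij} - \bar{Y}$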
Sum of Squared Deviation
• The total deviation can be partitioned into $\alpha_j$ and
$e_{ij}$ components:
• The total variance (SStotal) is made up of:
– $\alpha_j$: between group variance (SSbetween)
– $e_{ij}$: within group variance (SSwithin)
– SStotal = SSbetween + SSwithin
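It may not be obvious why the squared deviations add up so neatly, since squaring a sum normally produces a cross-product term. A short derivation, using the definitions of $\alpha_j$ and $e_{ij}$ above and writing $N_j$ for the number of cases in group j:

```latex
\begin{align*}
\sum_{j}\sum_{i} (Y_{ij} - \bar{Y})^2
  &= \sum_{j}\sum_{i} (\alpha_j + e_{ij})^2 \\
  &= \sum_{j} N_j\,\alpha_j^2
   + \sum_{j}\sum_{i} e_{ij}^2
   + 2\sum_{j} \alpha_j \sum_{i} e_{ij} \\
  &= SS_{between} + SS_{within} + 0
\end{align*}
```

The last term vanishes because deviations from a group's own mean sum to zero within each group.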
ANOVA & Fixed Effects
• Note that the ANOVA model is similar to the
fixed effects model
• But FEM also includes an X term to model linear trend
ANOVA: $Y_{ij} = \mu + \alpha_j + e_{ij}$
Fixed Effects Model: $Y_{ij} = \alpha_j + \beta X_{ij} + \varepsilon_{ij}$
• In fact, if you don’t specify any X variables, they are
pretty much the same
Within Group & Between Group
Models
• Group-effect dummy variables in a regression
model create specific estimates of group
effects for all cases
• Bs & error are based on remaining “within group”
variation
• We could do the opposite: ignore within-group
variation and just look at differences between groups
• Stata’s xtreg command can do this, too
• This is essentially just modeling group means!
Between Group Model
. xtreg supportenv age male dmar demp educ incomerel ses, i(country) be
Between regression (regression on group means)  Number of obs      =        27
Group variable (i): country                     Number of groups   =        27

R-sq:  within  =      .                         Obs per group: min =         1
       between = 0.2505                                        avg =       1.0
       overall = 0.2505                                        max =         1

                                                F(7,19)            =      0.91
sd(u_i + avg(e_i.))=  .6378002                  Prob > F           =    0.5216

------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0211517   .0391649     0.54   0.595    -.0608215    .1031248
        male |   3.966173   4.479358     0.89   0.387    -5.409232    13.34158
        dmar |   .8001333   1.127099     0.71   0.486    -1.558913     3.15918
        demp |  -.0571511   1.165915    -0.05   0.961    -2.497439    2.383137
        educ |   .3743473   .2098779     1.78   0.090    -.0649321    .8136268
   incomerel |    .148134   .1687438     0.88   0.391    -.2050508    .5013188
         ses |  -.4126738   .4916416    -0.84   0.412    -1.441691    .6163439
       _cons |   2.031181   3.370978     0.60   0.554    -5.024358     9.08672
------------------------------------------------------------------------------

Note: Results are identical to the aggregated analysis… Note that N is reduced to 27
Fixed vs. Random Effects
• Dummy variables produce a “fixed” estimate
of the intercept for each group
• But, models don’t need to be based on fixed effects
• Example: The error term (ei)
• We could estimate a separate fixed value for each case
– This would use up lots of degrees of freedom – even more
than using group dummies
• In fact, we would use up ALL degrees of freedom
– Stata output would simply report back the raw data (expressed
as deviations from the constant)
• Instead, we model e as a random variable
– We assume it is normal, with standard deviation sigma.
Random Effects
• A simple random intercept model
– Notation from Rabe-Hesketh & Skrondal 2005, p. 4-5
Random Intercept Model
$Y_{ij} = \beta_0 + \zeta_j + \varepsilon_{ij}$
• Where $\beta_0$ is the main intercept
• Zeta ($\zeta_j$) is a random effect for each group
– Allowing each of j groups to have its own intercept
– Assumed to be independent & normally distributed
• Error ($\varepsilon_{ij}$) is the error term for each case
– Also assumed to be independent & normally distributed
• Note: Other texts refer to random intercepts as $u_j$ or $\nu_j$.
Random Effects
• Issue: The dummy variable approach
(ANOVA, FEM) treats group differences as a
fixed effect
• Alternatively, we can treat it as a random effect
• Don’t estimate values for each case, but model it
• This requires making assumptions
– e.g., that group differences are normally distributed with a
standard deviation that can be estimated from data.
Linear Random Intercepts Model
• The random intercept idea can be applied to
linear regression
• Often called a “random effects” model…
• Result is similar to FEM, BUT:
• FEM looks only at within group effects
• Aggregate models (“between effects”) look across
groups
– Random effects models yield a weighted average
of between & within group effects
• It exploits between & within information, and thus can
be more efficient than FEM & aggregate models.
– IF distributional assumptions are correct.
Linear Random Intercepts Model
. xtreg supportenv age male dmar demp educ incomerel ses, i(country) re
Random-effects GLS regression                   Number of obs      =     27807
Group variable (i): country                     Number of groups   =        26

R-sq:  within  = 0.0220                         Obs per group: min =       511
       between = 0.0371                                        avg =    1069.5
       overall = 0.0240                                        max =      2154

Random effects u_i ~ Gaussian                   Wald chi2(7)       =    625.50
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0038709   .0008152    -4.75   0.000    -.0054688   -.0022731
        male |   .0978732   .0229632     4.26   0.000     .0528661    .1428802
        dmar |   .0030441   .0252075     0.12   0.904    -.0463618      .05245
        demp |  -.0737466   .0252831    -2.92   0.004    -.1233007   -.0241926
        educ |   .0857407   .0061501    13.94   0.000     .0736867    .0977947
   incomerel |   .0090308   .0059314     1.52   0.128    -.0025945    .0206561
         ses |    .131528   .0134248     9.80   0.000     .1052158    .1578402
       _cons |   5.924611   .1287468    46.02   0.000     5.672272     6.17695
-------------+----------------------------------------------------------------
     sigma_u |  .59876138
     sigma_e |  1.8701896
         rho |  .09297293   (fraction of variance due to u_i)
------------------------------------------------------------------------------

Note: Assumes normal uj, uncorrelated with X vars
Note: sigma_u = SD of u (intercepts); sigma_e = SD of e; rho = intra-class correlation
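The “weighted average of between & within effects” idea can be made concrete. The GLS random effects estimator is equivalent to OLS on partially demeaned data, where a fraction $\theta_j$ of each group mean is subtracted. The sketch below assumes one X variable with slope $\beta$, an overall intercept $\beta_0$, and group size $n_j$ (symbols chosen here for illustration), and plugs in sigma_u, sigma_e and the average group size from the output above as a rough approximation, since group sizes actually vary:

```latex
\[
Y_{ij} - \theta_j \bar{Y}_j
  = \beta_0 (1 - \theta_j) + \beta\,(X_{ij} - \theta_j \bar{X}_j) + \text{error},
\qquad
\theta_j = 1 - \sqrt{\frac{\sigma_e^2}{n_j \sigma_u^2 + \sigma_e^2}}
\]
\[
\theta \approx 1 - \sqrt{\frac{1.8702^2}{1069.5 \times 0.5988^2 + 1.8702^2}}
       \approx 1 - \sqrt{\frac{3.498}{386.9}}
       \approx 0.90
\]
```

With $\theta = 0$ the model reduces to pooled OLS, and with $\theta = 1$ to the fixed effects (within) model. A value near 0.9 suggests why the random effects coefficients here are so close to the fixed effects ones.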
Linear Random Intercepts Model
• Notes: Model can also be estimated with
maximum likelihood estimation (MLE)
• Stata:
xtreg y x1 x2 x3, i(groupid) mle
– Versus “re”, which specifies a generalized least squares (GLS) estimator
• Results tend to be similar
• But, MLE results include a formal test to see whether
intercepts really vary across groups
– Significant p-value indicates that intercepts vary
. xtreg supportenv age male dmar demp educ incomerel ses, i(country) mle
Random-effects ML regression                    Number of obs      =     27807
Group variable (i): country                     Number of groups   =        26

… MODEL RESULTS OMITTED …
    /sigma_u |   .5397755   .0758087                      .4098891    .7108206
    /sigma_e |   1.869954   .0079331                       1.85447    1.885568
         rho |   .0769142    .019952                      .0448349    .1240176
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)= 2128.07 Prob>=chibar2 = 0.000
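The label chibar2(01) in the likelihood-ratio test deserves a word. Under the null hypothesis sigma_u = 0, the parameter sits on the boundary of its allowed range (a standard deviation cannot be negative), so the test statistic follows a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom, rather than an ordinary chi-square. In practice the p-value is half of the usual one:

```latex
\[
p = \tfrac{1}{2}\,\Pr\!\left(\chi^2_1 \ge 2128.07\right) \approx 0
\]
```

Here the statistic is so large that the halving makes no practical difference: intercepts clearly vary across countries.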
Choosing Models
• Which model is best?
• There is much discussion (e.g., Halaby 2004)
• Fixed effects estimates are consistent under a
wide range of circumstances
• Consistent: Estimates approach true parameter values
as N grows very large
• But, they are less efficient than random effects
– In cases with low within-group variation (big between group
variation) and small sample size, results can be very poor
– Random Effects = more efficient
• But, runs into problems if specification is poor
– Esp. if X variables correlate with random group effects
– Usually due to omitted variables.
Hausman Specification Test
• Hausman Specification Test: A tool to help
evaluate fit of fixed vs. random effects
• Logic: Both fixed & random effects models are
consistent if models are properly specified
• However, some model violations cause random effects
models to be inconsistent
– Ex: if X variables are correlated to random error
• In short: Models should give the same results… If not,
random effects may be biased
– If results are similar, use the most efficient model: random
effects
– If results diverge, odds are that the random effects model is
biased. In that case use fixed effects…
Hausman Specification Test
• Strategy: Estimate both fixed & random
effects models
• Save the estimates each time
• Finally invoke Hausman test
– Ex:
• xtreg var1 var2 var3, i(groupid) fe
• estimates store fixed
• xtreg var1 var2 var3, i(groupid) re
• estimates store random
• hausman fixed random
Hausman Specification Test
• Example: Environmental attitudes fe vs re
. hausman fixed random
Direct comparison of coefficients…
                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed        random       Difference          S.E.
-------------+----------------------------------------------------------------
         age |   -.0038917    -.0038709       -.0000207        .0000297
        male |    .0979514     .0978732        .0000783        .0004277
        dmar |    .0024493     .0030441       -.0005948        .0007222
        demp |   -.0733992    -.0737466        .0003475        .0007303
        educ |    .0856092     .0857407       -.0001314        .0002993
   incomerel |    .0088841     .0090308       -.0001467        .0002885
         ses |    .1318295      .131528        .0003015        .0004153
------------------------------------------------------------------------------
          b = consistent under Ho and Ha; obtained from xtreg
          B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

            chi2(7) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                    =     2.70
          Prob>chi2 =   0.9116

Note: Non-significant p-value indicates that models yield similar results…
Within & Between Effects
• What is the relationship between within-group
effects (FEM) and between-effects (BEM)?
• Usually they are similar
• Ex: Student skills & test performance
• Within any classroom, skilled students do best on tests
• Between classrooms, classes with more skilled
students have higher mean test scores.
Within & Between Effects
• Issue: Between and within effects can differ!
• Ex: Effects of wealth on attitudes toward welfare
• At the individual level (within group)
– Wealthier people are conservative, don’t support welfare
• At the country level (between groups):
– Wealthier countries (high aggregate mean) tend to have pro-welfare attitudes (ex: Scandinavia)
• Result: Wealth has opposite between vs within effects!
– Issue: Such dynamics often result from omitted
level-1 variables (omitted variable bias)
• Ex: If we control for individual “political conservatism”,
effects may be consistent at both levels…
Within & Between Effects
• You can estimate BOTH within- and between-group effects in a single model
• Strategy: Split a variable (e.g., SES) into two new
variables…
– 1. Group mean SES
– 2. Within-group deviation from mean SES
» Often called “group mean centering”
• Then, put both variables into a random effects model
• Model will estimate separate coefficients for between
vs. within effects
– Ex:
• egen meanvar1 = mean(var1), by(groupid)
• gen withinvar1 = var1 - meanvar1
• Include mean (aggregate) & within variable in model.
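Once both pieces are in the model, it is natural to ask whether the between and within coefficients actually differ. A minimal sketch of the full sequence, assuming placeholder names y, var1 and groupid; the final test command is an addition for illustration, not part of the slide:

```stata
* Placeholder names: y, var1, groupid.
* Between component: the group mean of var1.
egen meanvar1 = mean(var1), by(groupid)
* Within component: each case's deviation from its group mean.
generate withinvar1 = var1 - meanvar1
* Random intercept model with both components.
xtreg y meanvar1 withinvar1, i(groupid) re
* Wald test of H0: between effect = within effect.
test meanvar1 = withinvar1
```

A significant test result suggests the two effects differ, which is the situation where a single coefficient for var1 would be misleading.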
Within & Between Effects
• Example: Pro-environmental attitudes
. xtreg supportenv meanage withinage male dmar demp educ incomerel ses,
i(country) mle
Random-effects ML regression                    Number of obs      =     27807
Group variable (i): country                     Number of groups   =        26

Random effects u_i ~ Gaussian                   Obs per group: min =       511
                                                               avg =    1069.5
                                                               max =      2154

                                                LR chi2(8)         =    620.41
Log likelihood  = -56918.299                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     meanage |   .0268506   .0239453     1.12   0.262    -.0200812    .0737825
   withinage |   -.003903   .0008156    -4.79   0.000    -.0055016   -.0023044
        male |   .0981351   .0229623     4.27   0.000     .0531299    .1431403
        dmar |    .003459   .0252057     0.14   0.891    -.0459432    .0528612
        demp |  -.0740394     .02528    -2.93   0.003    -.1235873   -.0244914
        educ |   .0856712   .0061483    13.93   0.000     .0736207    .0977216
   incomerel |    .008957   .0059298     1.51   0.131    -.0026651    .0205792
         ses |    .131454   .0134228     9.79   0.000     .1051458    .1577622
       _cons |   4.687526   .9703564     4.83   0.000     2.785662     6.58939
------------------------------------------------------------------------------

Note: Between & within effects are opposite. Older countries are MORE environmental, but older people are LESS.
Omitted variables? Wealthy European countries with strong green parties have older populations!
Within & Between Effects / Centering
• Multilevel models & “centering” variables
• Grand mean centering: computing variables
as deviations from overall mean
• Often done to X variables
• Has effect that baseline constant in model reflects
mean of all cases
– Useful for interpretation
• Group mean centering: computing variables
as deviation from group mean
• Useful for decomposing within vs. between effects
• Often in conjunction with aggregate group mean vars.
Generalizing: Random Coefficients
• Linear random intercept model allows random
variation in intercept (mean) for groups
• But, the same idea can be applied to other coefficients
• That is, slope coefficients can ALSO be random!
Random Coefficient Model
$Y_{ij} = \beta_1 + \zeta_{1j} + \beta_2 X_{ij} + \zeta_{2j} X_{ij} + \varepsilon_{ij}$
Which can be written as:
$Y_{ij} = (\beta_1 + \zeta_{1j}) + (\beta_2 + \zeta_{2j}) X_{ij} + \varepsilon_{ij}$
• Where zeta-1 ($\zeta_{1j}$) is a random intercept component
• Zeta-2 ($\zeta_{2j}$) is a random slope component.
Linear Random Coefficient Model
[Figure: Both intercepts and slopes vary randomly across j groups. Source: Rabe-Hesketh & Skrondal 2004, p. 63]
Random Coefficients Summary
• Some things to remember:
• Dummy variables allow fixed estimates of intercepts
across groups
• Interactions allow fixed estimates of slopes across
groups
• Random coefficients allow intercepts and/or slopes to
vary across groups randomly!
– The model does not directly estimate those effects, just as a
model does not estimate coefficients for each case residual
– BUT, random components can be predicted after the fact (just
as you can compute residuals – random error).
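Predicting the random components after the fact can be sketched as follows. The names y, x1, groupid, u_slope, u_cons, slope_j and cons_j are placeholders for illustration. The predicted values are best-guess (BLUP) estimates for each group, not estimated parameters.

```stata
* Placeholder names: y, x1, groupid.
* Random intercept plus a random slope on x1.
xtmixed y x1 || groupid: x1, mle
* One new variable per random effect: slope first, intercept last.
predict u_slope u_cons, reffects
* Group-specific slope and intercept = fixed part + predicted random part.
generate slope_j = _b[x1] + u_slope
generate cons_j  = _b[_cons] + u_cons
```

In Stata 13 and later, xtmixed was renamed mixed; the older command name still runs.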
STATA Notes: xtreg, xtmixed
• xtreg – allows estimation of between, within
(fixed), and random intercept models
• xtreg y x1 x2 x3, i(groupid) fe - fixed (within) model
• xtreg y x1 x2 x3, i(groupid) be - between model
• xtreg y x1 x2 x3, i(groupid) re - random intercept (GLS)
• xtreg y x1 x2 x3, i(groupid) mle - random intercept (MLE)
• xtmixed – allows random slopes & coefs
• “Mixed” models refer to models that have both fixed and
random components
• xtmixed [depvar] [fixed equation] || [random eq], options
• Ex: xtmixed y x1 x2 x3 || groupid: x2
– Random intercept is assumed. Random coef for X2 specified.
STATA Notes: xtreg, xtmixed
• Random intercepts
• xtreg y x1 x2 x3, i(groupid) mle
– Is equivalent to
• xtmixed y x1 x2 x3 || groupid: , mle
• xtmixed assumes random intercept – even if no other
random effects are specified after “groupid”
– But, we can add random coefficients for all Xs:
• xtmixed y x1 x2 x3 || groupid: x1 x2 x3 , mle
– Note: xtmixed can do a lot… but GLLAMM can
do even more!
• “Generalized linear latent and mixed models”
• Must be downloaded into stata. Type “search gllamm”
and follow instructions to install…
Random intercepts: xtmixed
• Example: Pro-environmental attitudes
. xtmixed supportenv age male dmar demp educ incomerel ses || country: , mle
Mixed-effects ML regression                     Number of obs      =     27807
Group variable: country                         Number of groups   =        26

                                                Obs per group: min =       511
                                                               avg =    1069.5
                                                               max =      2154

                                                Wald chi2(7)       =    625.75
Log likelihood = -56919.098                     Prob > chi2        =    0.0000

------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0038662   .0008151    -4.74   0.000    -.0054638   -.0022687
        male |   .0978558   .0229613     4.26   0.000     .0528524    .1428592
        dmar |   .0031799   .0252041     0.13   0.900    -.0462193    .0525791
        demp |  -.0738261   .0252797    -2.92   0.003    -.1233734   -.0242788
        educ |   .0857707   .0061482    13.95   0.000     .0737204     .097821
   incomerel |   .0090639   .0059295     1.53   0.126    -.0025578    .0206856
         ses |   .1314591   .0134228     9.79   0.000     .1051509    .1577674
       _cons |   5.924237    .118294    50.08   0.000     5.692385    6.156089
------------------------------------------------------------------------------
[remainder of output cut off]

Note: xtmixed yields identical results to xtreg , mle
Random intercepts: xtmixed
• Ex: Pro-environmental attitudes (cont’d)
------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0038662   .0008151    -4.74   0.000    -.0054638   -.0022687
        male |   .0978558   .0229613     4.26   0.000     .0528524    .1428592
        dmar |   .0031799   .0252041     0.13   0.900    -.0462193    .0525791
        demp |  -.0738261   .0252797    -2.92   0.003    -.1233734   -.0242788
        educ |   .0857707   .0061482    13.95   0.000     .0737204     .097821
   incomerel |   .0090639   .0059295     1.53   0.126    -.0025578    .0206856
         ses |   .1314591   .0134228     9.79   0.000     .1051509    .1577674
       _cons |   5.924237    .118294    50.08   0.000     5.692385    6.156089
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
country: Identity            |
                   sd(_cons) |   .5397758   .0758083      .4098899    .7108199
-----------------------------+------------------------------------------------
                sd(Residual) |   1.869954   .0079331       1.85447    1.885568
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) =  2128.07 Prob >= chibar2 = 0.0000

Note: xtmixed output puts all random effects below main coefficients. Here, they are “cons” (constant) for groups defined by “country”, plus residual (e)
Note: Non-zero SD indicates that intercepts vary
Random Coefficients: xtmixed
• Ex: Pro-environmental attitudes (cont’d)
. xtmixed supportenv age male dmar demp educ incomerel ses || country: educ, mle
[output omitted]
------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0035122   .0008185    -4.29   0.000    -.0051164    -.001908
        male |   .1003692   .0229663     4.37   0.000     .0553561    .1453824
        dmar |   .0001061   .0252275     0.00   0.997    -.0493388     .049551
        demp |  -.0722059   .0253888    -2.84   0.004     -.121967   -.0224447
        educ |    .081586   .0115479     7.07   0.000     .0589526    .1042194
   incomerel |    .008965   .0060119     1.49   0.136    -.0028181    .0207481
         ses |   .1311944   .0134708     9.74   0.000     .1047922    .1575966
       _cons |   5.931294    .132838    44.65   0.000     5.670936    6.191652
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
country: Independent         |
                    sd(educ) |   .0484399   .0087254      .0340312    .0689492
                   sd(_cons) |   .6179026   .0898918      .4646097     .821773
-----------------------------+------------------------------------------------
                sd(Residual) |    1.86651   .0079227      1.851046    1.882102
------------------------------------------------------------------------------
LR test vs. linear regression:       chi2(2) =  2187.33   Prob > chi2 = 0.0000

Note: Here, we have allowed the slope of educ to vary randomly across countries
Note: Educ (slope) varies, too!
Random Coefficients: xtmixed
• What are random coefficients doing?
• Let’s look at results from a simplified model
– Only random slope & intercept for education
[Figure: fitted lines for each group. X-axis: highest educational level attained (0 to 8); y-axis ticks run from 3 to 8. Model fits a different slope & intercept for each group!]
Random Coefficients
• Why bother with random coefficients?
• 1. A solution for clustering (non-independence)
– Usually people just use random intercepts, but slopes may be
an issue also
• 2. You can create a better-fitting model
– If slopes & intercepts vary, a random coefficient model may fit
better
– Assuming distributional assumptions are met
– Model fit compared to OLS can be tested….
• 3. Better predictions
– Attention to group-specific random effects can yield better
predictions (e.g., slopes) for each group
» Rather than just looking at “average” slope for all groups
• 4. Helps us think about multilevel data
» Ex: cross-level interactions (we’ll discuss soon!)
Multilevel Model Notation
• So far, we have expressed random effects in
a single equation:
Random Coefficient Model
$Y_{ij} = \beta_1 + \zeta_{1j} + \beta_2 X_{ij} + \zeta_{2j} X_{ij} + \varepsilon_{ij}$
• However, it is common to separate the fixed
and random parts into multiple equations:
$Y_{ij} = \beta_1 + \beta_2 X_{ij} + \varepsilon_{ij}$   (Just a basic OLS model…)
Intercept equation: $\beta_1 = \gamma_1 + u_{1j}$
Slope equation: $\beta_2 = \gamma_2 + u_{2j}$
But, intercept & slope are each specified separately as having a random component
Multilevel Model Notation
• The “separate equation” formulation is no
different from what we did before…
• But it is a vivid & clear way to present your models
• All random components are obvious because they are
stated in separate equations
• NOTE: Some software (e.g., HLM) requires this
– Rules:
• 1. Specify an OLS model, just like normal
• 2. Consider which OLS coefficients should have a
random component
– These could be the intercept or any X variable (slope)
• 3. Specify an additional formula for each random
coefficient.
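To see that the separate-equation form really is the same model, substitute the intercept and slope equations back into the OLS-style equation. The sketch below assumes the notation $\beta_1 = \gamma_1 + u_{1j}$ and $\beta_2 = \gamma_2 + u_{2j}$, with $\gamma$ for the fixed parts and $u$ for the random parts (symbols chosen here for illustration; other texts differ):

```latex
\begin{align*}
Y_{ij} &= \beta_1 + \beta_2 X_{ij} + \varepsilon_{ij} \\
       &= (\gamma_1 + u_{1j}) + (\gamma_2 + u_{2j}) X_{ij} + \varepsilon_{ij} \\
       &= \underbrace{\gamma_1 + \gamma_2 X_{ij}}_{\text{fixed part}}
        + \underbrace{u_{1j} + u_{2j} X_{ij} + \varepsilon_{ij}}_{\text{random part}}
\end{align*}
```

This matches the single-equation random coefficient model, with $u_{1j}$ and $u_{2j}$ playing the role of the random intercept and random slope components. In xtmixed terms, the fixed part is what goes before the || and the random part is what goes after it.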