Multilevel Models 2
Sociology 229A, Class 18
Copyright © 2008 by Evan Schofer
Do not copy or distribute without permission
Multilevel Data
• Simple example: 2-level data
[Figure: students grouped into classes]
• Which can be shown as:
Level 2 (classes):  Class 1     Class 2     Class 3
Level 1 (students): S1 S2 S3    S1 S2 S3    S1 S2 S3
Multilevel Data: Problems
• When is multilevel data NOT a problem?
– Answer: If you can successfully control for
potential sources of correlated error
• Add a control to OLS model for: classroom, school,
and state characteristics that would be sources of
correlated error in each group
• Ex: Teacher quality, class size, budget, etc…
• But: We often can’t identify or measure all
relevant sources of correlated error
• Thus, we need to abandon simple OLS regression and
try other approaches.
Review: Multilevel Strategies
• Problems of multilevel models
• Non-independence; correlated error
• Standard errors = underestimated
• Solutions:
– Each has benefits, disadvantages…
• 1. OLS regression
• 2. Aggregation (between effects model)
• 3. Robust Standard Errors
• 4. Robust Cluster Standard Errors
• 5. Dummy variables (Fixed Effects Model)
• 6. Random effects models
Robust Standard Errors
• Strategy #1: Improve our estimates of the
standard errors
– Option 1: Robust Standard Errors
• reg y x1 x2 x3, robust
• The Huber / White / “Sandwich” estimator
• An alternative method of computing standard errors
that is robust to a variety of assumption violations
– Provides accurate estimates in presence of heteroskedasticity
• Also, robust to model misspecification
– Note: Freedman’s criticism: What good are accurate SEs if
coefficients are biased due to poor specification?
Robust Cluster Standard Errors
• Option 2: Robust cluster standard errors
– A modification of robust SEs to address clustering
• reg y x1 x2 x3, cluster(groupid)
– Note: Cluster implies robust (vs. regular SEs)
• It is easy to adapt robust standard errors to address
clustering in data; See:
– http://www.stata.com/support/faqs/stat/robust_ref.html
– http://www.stata.com/support/faqs/stat/cluster.html
• Result: SE estimates typically increase, which is
appropriate because non-independent cases aren’t
providing as much information as would a sample of
independent cases.
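The two options can be compared side by side. This is a minimal sketch, assuming Stata's bundled auto dataset as a stand-in for real multilevel data; the variables (price, mpg, weight) and the use of rep78 as the grouping variable are illustrative choices, not the lecture's data, and five clusters is far too few for real work.

```stata
* Illustrative only: auto.dta stands in for real clustered data
sysuse auto, clear

* Conventional OLS standard errors
regress price mpg weight

* Option 1: Huber/White ("sandwich") robust standard errors
regress price mpg weight, vce(robust)

* Option 2: robust standard errors adjusted for clustering on rep78
* (cases with a missing cluster variable are dropped)
regress price mpg weight, vce(cluster rep78)
```

vce(robust) and vce(cluster groupid) are the current spellings of the robust and cluster() options. The first two runs give the same coefficients, and only the standard errors change. The clustered run also drops the cases with missing rep78, so its coefficients can shift slightly.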
Dummy Variables
• Another solution to correlated error within
groups/clusters: Add dummy variables
• Include a dummy variable for each Level-2 group, to
explicitly model variance in means
• A simple version of a “fixed effects” model (see below)
• Ex: Student achievement; data from 3 classes
• Level 1: students; Level 2: classroom
• Create dummy variables for each class
– Include all but one dummy variable in the model
– Or include all dummies and suppress the intercept
$Y_i = \alpha + \beta_2 D_{Class2,i} + \beta_3 D_{Class3,i} + \beta X_i + \varepsilon_i$
Dummy Variables
• What is the consequence of adding group
dummy variables?
• A separate intercept is estimated for each group
• Correlated error is absorbed into intercept
– Groups won’t systematically fall above or below the regression
line
• In fact, all “between group” variation (not just error) is
absorbed into the intercept
– Thus, other variables are really just looking at within group
effects
– This can be good or bad, depending on your goals.
Dummy Variables
• Note: You can create a set of dummy
variables in stata as follows:
• xi i.classid – creates dummy variables for the unique values of the variable “classid”
– Creates variables named _Iclassid_2, _Iclassid_3, etc. (by default xi omits the dummy for the lowest value; “xi, noomit” keeps the full set)
• These dummies can be added to the analysis by
specifying the variable: _Iclassid*
• Ex: reg y x1 x2 x3 _Iclassid*
– If you use a full set of dummies, add “nocons” to remove the constant.
Alternately, drop one dummy and keep the constant.
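A short sketch of two other ways to get group dummies, again assuming the bundled auto dataset with rep78 as a hypothetical grouping variable. The name repdum is made up for illustration, and factor-variable notation (i.rep78) assumes Stata 11 or later.

```stata
sysuse auto, clear

* Factor-variable notation: dummies are built on the fly,
* omitting the first category as the reference group
regress price mpg weight i.rep78

* Explicit dummies: tabulate creates repdum1-repdum5, one per category
tabulate rep78, generate(repdum)

* Full set of dummies with the constant suppressed
regress price mpg weight repdum1-repdum5, noconstant
```

Both regressions should give the same slopes for mpg and weight; they differ only in how the group intercepts are parameterized.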
Example: Pro-environmental values
• Dummy variable model
. reg supportenv age male dmar demp educ incomerel ses _Icountry*

      Source |       SS       df       MS              Number of obs =   27807
-------------+------------------------------           F( 32, 27774) =   98.50
       Model |  11024.1401    32  344.504377           Prob > F      =  0.0000
    Residual |  97142.6001 27774  3.49760928           R-squared     =  0.1019
-------------+------------------------------           Adj R-squared =  0.1009
       Total |   108166.74 27806  3.89005036           Root MSE      =  1.8702

------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0038917   .0008158    -4.77   0.000    -.0054906   -.0022927
        male |   .0979514   .0229672     4.26   0.000     .0529346    .1429683
        dmar |   .0024493   .0252179     0.10   0.923     -.046979    .0518777
        demp |  -.0733992   .0252937    -2.90   0.004    -.1229761   -.0238223
        educ |   .0856092   .0061574    13.90   0.000     .0735404     .097678
   incomerel |   .0088841   .0059384     1.50   0.135    -.0027554    .0205237
         ses |   .1318295   .0134313     9.82   0.000     .1055036    .1581554
_Icountry_32 |  -.4775214    .085175    -5.61   0.000    -.6444687   -.3105742
_Icountry_50 |   .3943565   .0844248     4.67   0.000     .2288798    .5598332
_Icountry_70 |   .1696262   .0865254     1.96   0.050     .0000321    .3392203
… dummies omitted …
_Icountr~891 |    .243995   .0802556     3.04   0.002       .08669    .4012999
       _cons |   5.848789    .082609    70.80   0.000     5.686872    6.010707
------------------------------------------------------------------------------
Dummy Variables
• Benefits of the dummy variable approach
• It is simple
– Just estimate a different intercept for each group
• Sometimes the dummy coefficients themselves are of interest
• Weaknesses
• Cumbersome if you have many groups
• Uses up lots of degrees of freedom (not parsimonious)
• Makes it hard to look at other kinds of group-level variables
– Non-varying group variables = collinear with dummies
• Can be problematic if your main interest is to study effects of
variables across groups
– Dummies purge that variation… focus on within-group variation
– If you don’t have much within group variation, there isn’t much left to
analyze.
Dummy Variables
• Note: Dummy variables are a simple example
of a “fixed effects” model (FEM)
• Effect of each group is modeled as a “fixed effect”
rather than a random variable
• Also can be thought of as the “within-group” estimator
– Looks purely at variation within groups
– Stata can do a Fixed Effects Model without the
effort of using all the dummy variables
• Simply request the “fixed effects” estimator in xtreg.
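The equivalence can be checked directly. A minimal sketch, assuming the bundled auto dataset with rep78 as a hypothetical grouping variable; xtset declares the group variable, which plays the same role as the i() option used in these slides.

```stata
sysuse auto, clear

* Dummy variable version
regress price mpg weight i.rep78

* Fixed effects (within) estimator
xtset rep78
xtreg price mpg weight, fe

* areg absorbs the group dummies without displaying them
areg price mpg weight, absorb(rep78)
```

The slopes on mpg and weight should match across all three commands; only the reported constant and the handling of the group intercepts differ.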
Fixed Effects Model (FEM)
• Fixed effects model:
$Y_{ij} = \alpha_j + \beta X_{ij} + \varepsilon_{ij}$
• For i cases within j groups
• Therefore $\alpha_j$ is a separate intercept for each group
• It is equivalent to looking solely at within-group variation:
$Y_{ij} - \bar{Y}_j = \beta (X_{ij} - \bar{X}_j) + \varepsilon_{ij} - \bar{\varepsilon}_j$
• X-bar-sub-j is mean of X for group j, etc
• Model is “within group” because all variables are
centered around mean of each group.
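The step from the fixed effects model to the within-group form can be spelled out. This is a short derivation sketch using the slide's symbols (group intercept alpha_j, slope beta, error epsilon_ij); bars denote means within group j.

```latex
% Fixed effects model for case i in group j
Y_{ij} = \alpha_j + \beta X_{ij} + \varepsilon_{ij}

% Average the model over the cases in group j
% (alpha_j is constant within the group, so its mean is itself)
\bar{Y}_j = \alpha_j + \beta \bar{X}_j + \bar{\varepsilon}_j

% Subtract the second equation from the first: alpha_j cancels
Y_{ij} - \bar{Y}_j = \beta (X_{ij} - \bar{X}_j) + (\varepsilon_{ij} - \bar{\varepsilon}_j)
```

Because the group intercept drops out, anything constant within a group (measured or not) is removed along with it, which is why only within-group variation identifies the slope.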
Fixed Effects Model (FEM)
. xtreg supportenv age male dmar demp educ incomerel ses, i(country) fe

Fixed-effects (within) regression               Number of obs      =     27807
Group variable (i): country                     Number of groups   =        26

R-sq:  within  = 0.0220                         Obs per group: min =       511
       between = 0.0368                                        avg =    1069.5
       overall = 0.0239                                        max =      2154

                                                F(7,27774)         =     89.23
corr(u_i, Xb)  = 0.0213                         Prob > F           =    0.0000

------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0038917   .0008158    -4.77   0.000    -.0054906   -.0022927
        male |   .0979514   .0229672     4.26   0.000     .0529346    .1429683
        dmar |   .0024493   .0252179     0.10   0.923     -.046979    .0518777
        demp |  -.0733992   .0252937    -2.90   0.004    -.1229761   -.0238223
        educ |   .0856092   .0061574    13.90   0.000     .0735404     .097678
   incomerel |   .0088841   .0059384     1.50   0.135    -.0027554    .0205237
         ses |   .1318295   .0134313     9.82   0.000     .1055036    .1581554
       _cons |   5.878524    .052746   111.45   0.000     5.775139    5.981908
-------------+----------------------------------------------------------------
     sigma_u |  .55408807
     sigma_e |  1.8701896
         rho |  .08069488   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0:     F(25, 27774) =    94.49           Prob > F = 0.0000

Note: Coefficients and standard errors are identical to the dummy variable model! (Only _cons differs.)
ANOVA: A Digression
• Suppose you wish to model variable Y for j
groups (clusters)
• Ex: Wages for different racial groups
• Definitions:
• The grand mean is the mean of all groups
– Y-bar
• The group mean is the mean of a particular sub-group
of the population
– Y-bar-sub-j
ANOVA: Concepts & Definitions
• Y is the dependent variable
• We are looking to see if Y depends upon the particular
group a person is in
• The effect of a group is the difference
between a group’s mean & the grand mean
• Effect is denoted by alpha ($\alpha$)
• If $\bar{Y}$ = $8.75 and $\bar{Y}_{Group 1}$ = $8.90, then $\alpha_{Group 1}$ = $0.15
• Effect of being in group j is:
$\alpha_j = \bar{Y}_j - \bar{Y}$
• It is like a deviation, but for a group.
ANOVA: Concepts & Definitions
• ANOVA is based on partitioning deviation
• We initially calculated deviation as the
distance of a point from the grand mean:
$d_i = Y_i - \bar{Y}$
• But, you can also think of deviation from a
group mean (called “e”):
$e_{i,Group1} = Y_{i,Group1} - \bar{Y}_{Group1}$
• Or, for any case i in group j:
$e_{ij} = Y_{ij} - \bar{Y}_j$
ANOVA: Concepts & Definitions
• The location of any case is determined by:
• The Grand Mean, $\mu$, common to all cases
• The group “effect” $\alpha$, common to members
• The distance between a group and the grand mean
• “Between group” variation
• The within-group deviation (e): called “error”
• The distance from group mean to a case’s value
The ANOVA Model
• This is the basis for a formal model:
• For any population with mean $\mu$
• Comprised of J subgroups, $N_j$ in each group
• Each with a group effect $\alpha_j$
• The location of any individual can be
expressed as follows:
$Y_{ij} = \mu + \alpha_j + e_{ij}$
• $Y_{ij}$ refers to the value of case i in group j
• $e_{ij}$ refers to the “error” (i.e., deviation from
group mean) for case i in group j
Sum of Squared Deviation
• We are most interested in two parts of model
• The group effects: $\alpha_j$
• Deviation of the group from the grand mean
• Individual case error: $e_{ij}$
• Deviation of the individual from the group mean
• Both are deviations that can be summed up
• Remember, we square deviations when summing
• Otherwise, they add up to zero
• Remember variance is just the average squared deviation
Sum of Squared Deviation
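• The total deviation can be partitioned into $\alpha_j$ and
$e_{ij}$ components:
• That is, $\alpha_j + e_{ij}$ = total deviation:
$\alpha_j = \bar{Y}_j - \bar{Y}$
$e_{ij} = Y_{ij} - \bar{Y}_j$
$e_{ij} + \alpha_j = (\bar{Y}_j - \bar{Y}) + (Y_{ij} - \bar{Y}_j) = Y_{ij} - \bar{Y}$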
Sum of Squared Deviation
• The total deviation can be partitioned into $\alpha_j$ and
$e_{ij}$ components:
• The total variance (SStotal) is made up of:
– $\alpha_j$: between group variance (SSbetween)
– $e_{ij}$: within group variance (SSwithin)
– SStotal = SSbetween + SSwithin
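Why the squared deviations add up so neatly can be shown in a few lines. The derivation below is a sketch in the slide's notation (alpha_j for group effects, e_ij for within-group error, N_j cases in group j); the tiny two-group dataset at the end is invented purely to illustrate the arithmetic.

```latex
% Each total deviation is the sum of a group effect and a within-group error
\sum_j \sum_i (Y_{ij} - \bar{Y})^2
  = \sum_j \sum_i (\alpha_j + e_{ij})^2
  = \sum_j N_j \alpha_j^2 + \sum_j \sum_i e_{ij}^2 + 2 \sum_j \alpha_j \sum_i e_{ij}

% Within every group the errors sum to zero, so the cross term vanishes
\sum_i e_{ij} = 0
\quad\Rightarrow\quad
SS_{total} = \underbrace{\sum_j N_j \alpha_j^2}_{SS_{between}}
           + \underbrace{\sum_j \sum_i e_{ij}^2}_{SS_{within}}

% Invented example: group 1 = {1, 3}, group 2 = {5, 7}
% Grand mean = 4; group means = 2 and 6; alpha_1 = -2, alpha_2 = +2
% SS_between = 2(-2)^2 + 2(2)^2 = 16
% SS_within  = (1 + 1) + (1 + 1) = 4
% SS_total   = 9 + 1 + 1 + 9 = 20 = 16 + 4
```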
ANOVA & Fixed Effects
• Note that the ANOVA model is similar to the
fixed effects model
• But FEM also includes an X term to model linear trend
ANOVA: $Y_{ij} = \mu + \alpha_j + e_{ij}$
Fixed Effects Model: $Y_{ij} = \alpha_j + \beta X_{ij} + \varepsilon_{ij}$
• In fact, if you don’t specify any X variables, they are
pretty much the same
Within Group & Between Group
Models
• Adding group-effect dummy variables to a regression
model creates a specific estimate of the group
effect for every group
• βs & error are based on remaining “within group”
variation
• We could do the opposite: ignore within-group
variation and just look at differences between groups
• Stata’s xtreg command can do this, too
• This is essentially just modeling group means!
Between Group Model
. xtreg supportenv age male dmar demp educ incomerel ses, i(country) be

Between regression (regression on group means)  Number of obs      =        27
Group variable (i): country                     Number of groups   =        27

R-sq:  within  =      .                         Obs per group: min =         1
       between = 0.2505                                        avg =       1.0
       overall = 0.2505                                        max =         1

                                                F(7,19)            =      0.91
sd(u_i + avg(e_i.))=  .6378002                  Prob > F           =    0.5216

------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0211517   .0391649     0.54   0.595    -.0608215    .1031248
        male |   3.966173   4.479358     0.89   0.387    -5.409232    13.34158
        dmar |   .8001333   1.127099     0.71   0.486    -1.558913     3.15918
        demp |  -.0571511   1.165915    -0.05   0.961    -2.497439    2.383137
        educ |   .3743473   .2098779     1.78   0.090    -.0649321    .8136268
   incomerel |    .148134   .1687438     0.88   0.391    -.2050508    .5013188
         ses |  -.4126738   .4916416    -0.84   0.412    -1.441691    .6163439
       _cons |   2.031181   3.370978     0.60   0.554    -5.024358     9.08672
------------------------------------------------------------------------------
Note: Results are identical to the aggregated
analysis… Note that N is reduced to 27
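The link between the between-effects estimator and aggregation can be sketched as follows, assuming the bundled auto dataset with rep78 as a hypothetical grouping variable (only five groups, so the numbers themselves mean little).

```stata
sysuse auto, clear
* Keep only cases with a valid group, so both approaches use the same sample
drop if missing(rep78)

* Between-effects estimator
xtset rep78
xtreg price mpg weight, be

* Aggregation by hand: collapse to one row of means per group, then OLS
collapse (mean) price mpg weight, by(rep78)
regress price mpg weight
```

The coefficients from the two regressions should match, since xtreg, be is by default an unweighted OLS regression on the group means.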
Fixed vs. Random Effects
• Dummy variables produce a “fixed” estimate
of the intercept for each group
• But, models don’t need to be based on fixed effects
• Example: The error term (ei)
• We could estimate a separate fixed value for each case
– This would use up lots of degrees of freedom – even more
than using group dummies
• In fact, we would use up ALL degrees of freedom
– Stata output would simply report back the raw data (expressed
as deviations from the constant)
• Instead, we model e as a random variable
– We assume it is normal, with standard deviation sigma.
Random Effects
• A simple random intercept model
– Notation from Rabe-Hesketh & Skrondal 2005, p. 4-5
Random Intercept Model
$Y_{ij} = \beta_0 + \zeta_j + \varepsilon_{ij}$
• Where $\beta_0$ is the main intercept
• Zeta ($\zeta_j$) is a random effect for each group
– Allowing each of j groups to have its own intercept
– Assumed to be independent & normally distributed
• Error ($\varepsilon_{ij}$) is the error term for each case
– Also assumed to be independent & normally distributed
• Note: Other texts refer to random intercepts as $u_j$ or $\nu_j$.
Random Effects
• Issue: The dummy variable approach
(ANOVA, FEM) treats group differences as a
fixed effect
• Alternatively, we can treat it as a random effect
• Don’t estimate values for each case, but model it
• This requires making assumptions
– e.g., that group differences are normally distributed with a
standard deviation that can be estimated from data.
Linear Random Intercepts Model
• The random intercept idea can be applied to
linear regression
• Often called a “random effects” model…
• Result is similar to FEM, BUT:
• FEM looks only at within group effects
• Aggregate models (“between effects”) look across
groups
– Random effects models yield a weighted average
of between & within group effects
• It exploits between & within information, and thus can
be more efficient than FEM & aggregate models.
– IF distributional assumptions are correct.
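One way to see the "weighted average" idea is the standard quasi-demeaning form of the random effects GLS estimator, sketched below. The symbols sigma_u (SD of the group intercepts), sigma_e (SD of the case-level error) and n_j (cases in group j) follow Stata's output labels rather than the slide notation, and the single-X form is a simplification.

```latex
% Share of each group mean that is subtracted before running OLS
\theta_j = 1 - \sqrt{\frac{\sigma_e^2}{n_j \sigma_u^2 + \sigma_e^2}}

% Random effects GLS is OLS on the partially demeaned data
Y_{ij} - \theta_j \bar{Y}_j
  = \beta_0 (1 - \theta_j) + \beta (X_{ij} - \theta_j \bar{X}_j) + \text{error}

% \theta_j = 0 : nothing is subtracted, giving pooled OLS
% \theta_j = 1 : full group-mean centering, giving the fixed effects (within) estimator
```

With large groups or a large sigma_u relative to sigma_e, theta_j is close to 1 and the random effects estimates approach the fixed effects estimates.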
Linear Random Intercepts Model
. xtreg supportenv age male dmar demp educ incomerel ses, i(country) re

Random-effects GLS regression                   Number of obs      =     27807
Group variable (i): country                     Number of groups   =        26

R-sq:  within  = 0.0220                         Obs per group: min =       511
       between = 0.0371                                        avg =    1069.5
       overall = 0.0240                                        max =      2154

Random effects u_i ~ Gaussian                   Wald chi2(7)       =    625.50
corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000

------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0038709   .0008152    -4.75   0.000    -.0054688   -.0022731
        male |   .0978732   .0229632     4.26   0.000     .0528661    .1428802
        dmar |   .0030441   .0252075     0.12   0.904    -.0463618      .05245
        demp |  -.0737466   .0252831    -2.92   0.004    -.1233007   -.0241926
        educ |   .0857407   .0061501    13.94   0.000     .0736867    .0977947
   incomerel |   .0090308   .0059314     1.52   0.128    -.0025945    .0206561
         ses |    .131528   .0134248     9.80   0.000     .1052158    .1578402
       _cons |   5.924611   .1287468    46.02   0.000     5.672272     6.17695
-------------+----------------------------------------------------------------
     sigma_u |  .59876138
     sigma_e |  1.8701896
         rho |  .09297293   (fraction of variance due to u_i)
------------------------------------------------------------------------------

Note: Assumes normal uj, uncorrelated with X vars
Note: sigma_u = SD of u (intercepts); sigma_e = SD of e; rho = intra-class correlation
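The reported rho can be reproduced by hand from the two standard deviations in the random effects output (sigma_u = .59876138, sigma_e = 1.8701896). The worked arithmetic below rounds intermediate values to four decimals.

```latex
\rho = \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}
     = \frac{0.5988^2}{0.5988^2 + 1.8702^2}
     \approx \frac{0.3585}{0.3585 + 3.4976}
     \approx 0.093
```

So roughly 9% of the unexplained variance in supportenv lies between countries; the rest is within-country variation.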
Linear Random Intercepts Model
• Notes: Model can also be estimated with
maximum likelihood estimation (MLE)
• Stata:
xtreg y x1 x2 x3, i(groupid) mle
– Versus “re”, which specifies the generalized least squares (GLS) estimator
• Results tend to be similar
• But, MLE results include a formal test to see whether
intercepts really vary across groups
– Significant p-value indicates that intercepts vary
. xtreg supportenv age male dmar demp educ incomerel ses, i(country) mle

Random-effects ML regression                    Number of obs      =     27807
Group variable (i): country                     Number of groups   =        26

… MODEL RESULTS OMITTED …

    /sigma_u |   .5397755   .0758087                      .4098891    .7108206
    /sigma_e |   1.869954   .0079331                       1.85447    1.885568
         rho |   .0769142    .019952                      .0448349    .1240176
------------------------------------------------------------------------------
Likelihood-ratio test of sigma_u=0: chibar2(01)= 2128.07 Prob>=chibar2 = 0.000
Choosing Models
• Which model is best?
• There is much discussion (e.g., Halaby 2004)
• Fixed effects are consistent under the
widest range of circumstances
• Consistent: Estimates approach true parameter values
as N grows very large
• But, they are less efficient than random effects
– In cases with low within-group variation (big between group
variation) and small sample size, results can be very poor
– Random Effects = more efficient
• But, runs into problems if specification is poor
– Esp. if X variables correlate with random group effects
– Usually due to omitted variables.
Hausman Specification Test
• Hausman Specification Test: A tool to help
evaluate fit of fixed vs. random effects
• Logic: Both fixed & random effects models are
consistent if models are properly specified
• However, some model violations cause random effects
models to be inconsistent
– Ex: if X variables are correlated with the random group effects
• In short: Models should give the same results… If not,
random effects may be biased
– If results are similar, use the most efficient model: random
effects
– If results diverge, odds are that the random effects model is
biased. In that case use fixed effects…
Hausman Specification Test
• Strategy: Estimate both fixed & random
effects models
• Save the estimates each time
• Finally invoke Hausman test
– Ex:
• xtreg var1 var2 var3, i(groupid) fe
• estimates store fixed
• xtreg var1 var2 var3, i(groupid) re
• estimates store random
• hausman fixed random
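A runnable sketch of the same sequence, assuming Stata's nlswork example dataset (loaded over the internet with webuse). The outcome and predictors are arbitrary illustrative picks from that dataset, not the lecture's variables; note that the model is fit with xtreg.

```stata
* Example panel data: women (idcode) observed over several years
webuse nlswork, clear
xtset idcode

* Fixed effects model; save the estimates
xtreg ln_wage age tenure hours, fe
estimates store fixed

* Random effects model; save the estimates
xtreg ln_wage age tenure hours, re
estimates store random

* Hausman test: consistent (fixed) model first, efficient (random) model second
hausman fixed random
```

A small p-value suggests the two sets of coefficients differ systematically, which points toward the fixed effects model.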
Hausman Specification Test
• Example: Environmental attitudes fe vs re
. hausman fixed random

Direct comparison of coefficients…

                 ---- Coefficients ----
             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
             |     fixed        random       Difference          S.E.
-------------+----------------------------------------------------------------
         age |   -.0038917    -.0038709       -.0000207        .0000297
        male |    .0979514     .0978732        .0000783        .0004277
        dmar |    .0024493     .0030441       -.0005948        .0007222
        demp |   -.0733992    -.0737466        .0003475        .0007303
        educ |    .0856092     .0857407       -.0001314        .0002993
   incomerel |    .0088841     .0090308       -.0001467        .0002885
         ses |    .1318295      .131528        .0003015        .0004153
------------------------------------------------------------------------------
               b = consistent under Ho and Ha; obtained from xtreg
    B = inconsistent under Ha, efficient under Ho; obtained from xtreg

    Test:  Ho:  difference in coefficients not systematic

                  chi2(7) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =        2.70
                Prob>chi2 =      0.9116

Note: Non-significant p-value indicates that models yield similar results…
Within & Between Effects
• What is the relationship between within-group
effects (FEM) and between-effects (BEM)?
• Usually they are similar
• Ex: Student skills & test performance
• Within any classroom, skilled students do best on tests
• Between classrooms, classes with more skilled
students have higher mean test scores.
Within & Between Effects
• Issue: Between and within effects can differ!
• Ex: Effects of wealth on attitudes toward welfare
• At the individual level (within group)
– Wealthier people are conservative, don’t support welfare
• At the country level (between groups):
– Wealthier countries (high aggregate mean) tend to have pro-welfare attitudes (ex: Scandinavia)
• Result: Wealth has opposite between vs within effects!
– Issue: Such dynamics often result from omitted
level-1 variables (omitted variable bias)
• Ex: If we control for individual “political conservatism”,
effects may be consistent at both levels…
Within & Between Effects
• You can estimate BOTH within- and between-group effects in a single model
• Strategy: Split a variable (e.g., SES) into two new
variables…
– 1. Group mean SES
– 2. Within-group deviation from mean SES
» Often called “group mean centering”
• Then, put both variables into a random effects model
• Model will estimate separate coefficients for between
vs. within effects
– Ex:
• egen meanvar1 = mean(var1), by(groupid)
• gen withinvar1 = var1 - meanvar1
• Include mean (aggregate) & within variable in model.
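The split can be sketched end to end as follows, assuming Stata's nlswork example dataset (loaded with webuse) and using job tenure as an arbitrary illustrative predictor; the new variable names are made up. Note that the deviation is created with generate, since egen requires a function such as mean().

```stata
webuse nlswork, clear
* Use the same cases for the group means and for the model
keep if !missing(ln_wage, tenure)

* 1. Group mean of the predictor (between component)
bysort idcode: egen mean_tenure = mean(tenure)

* 2. Within-group deviation from that mean (group mean centering)
generate within_tenure = tenure - mean_tenure

* Random effects model with separate between and within coefficients
xtset idcode
xtreg ln_wage mean_tenure within_tenure, re
```

The coefficient on within_tenure is the within-group effect, and the coefficient on mean_tenure is the between-group effect.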
Within & Between Effects
• Example: Pro-environmental attitudes
. xtreg supportenv meanage withinage male dmar demp educ incomerel ses, i(country) mle

Random-effects ML regression                    Number of obs      =     27807
Group variable (i): country                     Number of groups   =        26

Random effects u_i ~ Gaussian                   Obs per group: min =       511
                                                               avg =    1069.5
                                                               max =      2154

                                                LR chi2(8)         =    620.41
Log likelihood  = -56918.299                    Prob > chi2        =    0.0000

------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     meanage |   .0268506   .0239453     1.12   0.262    -.0200812    .0737825
   withinage |   -.003903   .0008156    -4.79   0.000    -.0055016   -.0023044
        male |   .0981351   .0229623     4.27   0.000     .0531299    .1431403
        dmar |    .003459   .0252057     0.14   0.891    -.0459432    .0528612
        demp |  -.0740394     .02528    -2.93   0.003    -.1235873   -.0244914
        educ |   .0856712   .0061483    13.93   0.000     .0736207    .0977216
   incomerel |    .008957   .0059298     1.51   0.131    -.0026651    .0205792
         ses |    .131454   .0134228     9.79   0.000     .1051458    .1577622
       _cons |   4.687526   .9703564     4.83   0.000     2.785662     6.58939
------------------------------------------------------------------------------

Note: Between & within effects are opposite. Older countries are MORE environmental, but older people are LESS.
Omitted variables? Wealthy European countries with strong green parties have older populations!
Within & Between Effects / Centering
• Multilevel models & “centering” variables
• Grand mean centering: computing variables
as deviations from overall mean
• Often done to X variables
• Has effect that baseline constant in model reflects
mean of all cases
– Useful for interpretation
• Group mean centering: computing variables
as deviation from group mean
• Useful for decomposing within vs. between effects
• Often in conjunction with aggregate group mean vars.
Generalizing: Random Coefficients
• Linear random intercept model allows random
variation in intercept (mean) for groups
• But, the same idea can be applied to other coefficients
• That is, slope coefficients can ALSO be random!
Random Coefficient Model
$Y_{ij} = \beta_1 + \zeta_{1j} + \beta_2 X_{ij} + \zeta_{2j} X_{ij} + \varepsilon_{ij}$
Which can be written as:
$Y_{ij} = (\beta_1 + \zeta_{1j}) + (\beta_2 + \zeta_{2j}) X_{ij} + \varepsilon_{ij}$
• Where zeta-1 is a random intercept component
• Zeta-2 is a random slope component.
Linear Random Coefficient Model
[Figure: Both intercepts and slopes vary randomly across j groups. Source: Rabe-Hesketh & Skrondal 2004, p. 63]
Random Coefficients Summary
• Some things to remember:
• Dummy variables allow fixed estimates of intercepts
across groups
• Interactions allow fixed estimates of slopes across
groups
• Random coefficients allow intercepts and/or slopes to
vary across groups randomly!
– The model does not directly estimate those effects, just as a
model does not estimate coefficients for each case residual
– BUT, random components can be predicted after the fact (just
as you can compute residuals – random error).
STATA Notes: xtreg, xtmixed
• xtreg – allows estimation of between, within
(fixed), and random intercept models
• xtreg y x1 x2 x3, i(groupid) fe - fixed (within) model
• xtreg y x1 x2 x3, i(groupid) be - between model
• xtreg y x1 x2 x3, i(groupid) re - random intercept (GLS)
• xtreg y x1 x2 x3, i(groupid) mle - random intercept (MLE)
• xtmixed – allows random slopes & coefs
• “Mixed” models refer to models that have both fixed and
random components
• xtmixed [depvar] [fixed equation] || [random eq], options
• Ex: xtmixed y x1 x2 x3 || groupid: x2
– Random intercept is assumed. Random coef for X2 specified.
STATA Notes: xtreg, xtmixed
• Random intercepts
• xtreg y x1 x2 x3, i(groupid) mle
– Is equivalent to
• xtmixed y x1 x2 x3 || groupid: , mle
• xtmixed assumes random intercept – even if no other
random effects are specified after “groupid”
– But, we can add random coefficients for all Xs:
• xtmixed y x1 x2 x3 || groupid: x1 x2 x3 , mle
– Note: xtmixed can do a lot… but GLLAMM can
do even more!
• “Generalized Linear Latent And Mixed Models”
• Must be downloaded into stata. Type “search gllamm”
and follow instructions to install…
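A short sketch of the random intercept and random coefficient syntax, assuming Stata's pig example dataset (loaded with webuse: weekly weights of pigs, grouped by id) and the mixed command, which is the name xtmixed has had since Stata 13. On older versions, substitute xtmixed. The stored-estimate and variable names are made up for illustration.

```stata
webuse pig, clear

* Random intercept only
mixed weight week || id: , mle
estimates store ri

* Random intercept plus a random slope on week
mixed weight week || id: week, mle

* Predict the group-specific random effects after the fact
* (one new variable per random effect: slope first, then intercept)
predict re_week re_cons, reffects
estimates store rc

* Likelihood-ratio test of the added random slope
lrtest ri rc
```

The predict step illustrates the point that random components are not estimated as coefficients but can be predicted once the model is fit.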
Random intercepts: xtmixed
• Example: Pro-environmental attitudes
. xtmixed supportenv age male dmar demp educ incomerel ses || country: , mle

Mixed-effects ML regression                     Number of obs      =     27807
Group variable: country                         Number of groups   =        26

                                                Obs per group: min =       511
                                                               avg =    1069.5
                                                               max =      2154

                                                Wald chi2(7)       =    625.75
Log likelihood = -56919.098                     Prob > chi2        =    0.0000

------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0038662   .0008151    -4.74   0.000    -.0054638   -.0022687
        male |   .0978558   .0229613     4.26   0.000     .0528524    .1428592
        dmar |   .0031799   .0252041     0.13   0.900    -.0462193    .0525791
        demp |  -.0738261   .0252797    -2.92   0.003    -.1233734   -.0242788
        educ |   .0857707   .0061482    13.95   0.000     .0737204     .097821
   incomerel |   .0090639   .0059295     1.53   0.126    -.0025578    .0206856
         ses |   .1314591   .0134228     9.79   0.000     .1051509    .1577674
       _cons |   5.924237    .118294    50.08   0.000     5.692385    6.156089
------------------------------------------------------------------------------
[remainder of output cut off]

Note: xtmixed yields identical results to xtreg , mle
Random intercepts: xtmixed
• Ex: Pro-environmental attitudes (cont’d)
------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0038662   .0008151    -4.74   0.000    -.0054638   -.0022687
        male |   .0978558   .0229613     4.26   0.000     .0528524    .1428592
        dmar |   .0031799   .0252041     0.13   0.900    -.0462193    .0525791
        demp |  -.0738261   .0252797    -2.92   0.003    -.1233734   -.0242788
        educ |   .0857707   .0061482    13.95   0.000     .0737204     .097821
   incomerel |   .0090639   .0059295     1.53   0.126    -.0025578    .0206856
         ses |   .1314591   .0134228     9.79   0.000     .1051509    .1577674
       _cons |   5.924237    .118294    50.08   0.000     5.692385    6.156089
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
country: Identity            |
                   sd(_cons) |   .5397758   .0758083      .4098899    .7108199
-----------------------------+------------------------------------------------
                sd(Residual) |   1.869954   .0079331       1.85447    1.885568
------------------------------------------------------------------------------
LR test vs. linear regression: chibar2(01) =  2128.07 Prob >= chibar2 = 0.0000

Note: xtmixed output puts all random effects below main
coefficients. Here, they are “cons” (constant) for groups
defined by “country”, plus residual (e)
Note: Non-zero SD indicates that intercepts vary
Random Coefficients: xtmixed
• Ex: Pro-environmental attitudes (cont’d)
. xtmixed supportenv age male dmar demp educ incomerel ses || country: educ, mle

[output omitted]

------------------------------------------------------------------------------
  supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |  -.0035122   .0008185    -4.29   0.000    -.0051164    -.001908
        male |   .1003692   .0229663     4.37   0.000     .0553561    .1453824
        dmar |   .0001061   .0252275     0.00   0.997    -.0493388     .049551
        demp |  -.0722059   .0253888    -2.84   0.004     -.121967   -.0224447
        educ |    .081586   .0115479     7.07   0.000     .0589526    .1042194
   incomerel |    .008965   .0060119     1.49   0.136    -.0028181    .0207481
         ses |   .1311944   .0134708     9.74   0.000     .1047922    .1575966
       _cons |   5.931294    .132838    44.65   0.000     5.670936    6.191652
------------------------------------------------------------------------------

------------------------------------------------------------------------------
  Random-effects Parameters  |   Estimate   Std. Err.     [95% Conf. Interval]
-----------------------------+------------------------------------------------
country: Independent         |
                    sd(educ) |   .0484399   .0087254      .0340312    .0689492
                   sd(_cons) |   .6179026   .0898918      .4646097     .821773
-----------------------------+------------------------------------------------
                sd(Residual) |    1.86651   .0079227      1.851046    1.882102
------------------------------------------------------------------------------
LR test vs. linear regression:       chi2(2) =  2187.33   Prob > chi2 = 0.0000

Note: Here, we have allowed the slope of educ to vary
randomly across countries
Note: Educ (slope) varies, too!
Random Coefficients: xtmixed
• What are random coefficients doing?
• Let’s look at results from a simplified model
– Only random slope & intercept for education
[Figure: fitted lines by group; x-axis: highest educational level attained. Model fits a different slope & intercept for each group!]
Random Coefficients
• Why bother with random coefficients?
• 1. A solution for clustering (non-independence)
– Usually people just use random intercepts, but slopes may be
an issue also
• 2. You can create a better-fitting model
– If slopes & intercepts vary, a random coefficient model may fit
better
– Assuming distributional assumptions are met
– Model fit compared to OLS can be tested….
• 3. Better predictions
– Attention to group-specific random effects can yield better
predictions (e.g., slopes) for each group
» Rather than just looking at “average” slope for all groups
• 4. Helps us think about multilevel data
» Ex: cross-level interactions (we’ll discuss soon!)
Multilevel Model Notation
• So far, we have expressed random effects in
a single equation:
Random Coefficient Model
$Y_{ij} = \beta_1 + \zeta_{1j} + \beta_2 X_{ij} + \zeta_{2j} X_{ij} + \varepsilon_{ij}$
• However, it is common to separate the fixed
and random parts into multiple equations:
$Y_{ij} = \beta_{1j} + \beta_{2j} X_{ij} + \varepsilon_{ij}$ (just a basic OLS model…)
Intercept equation: $\beta_{1j} = \gamma_1 + u_{1j}$
Slope equation: $\beta_{2j} = \gamma_2 + u_{2j}$
But, intercept & slope are each specified
separately as having a random component
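Substituting the intercept and slope equations back into the first equation recovers the single-equation form. In this sketch the fixed parts are written gamma_1, gamma_2 and the random parts u_1j, u_2j; those letters and the j subscripts on the coefficients are assumptions about the notation, chosen to follow common multilevel usage.

```latex
% Level-1 model with group-specific intercept and slope
Y_{ij} = \beta_{1j} + \beta_{2j} X_{ij} + \varepsilon_{ij}

% Level-2 equations for the intercept and the slope
\beta_{1j} = \gamma_1 + u_{1j}, \qquad \beta_{2j} = \gamma_2 + u_{2j}

% Substitute the level-2 equations into the level-1 model
Y_{ij} = (\gamma_1 + u_{1j}) + (\gamma_2 + u_{2j}) X_{ij} + \varepsilon_{ij}
       = \gamma_1 + u_{1j} + \gamma_2 X_{ij} + u_{2j} X_{ij} + \varepsilon_{ij}
```

This matches the single-equation random coefficient model term by term: gamma_1 and gamma_2 play the role of the fixed intercept and slope, and u_1j and u_2j play the role of the random components zeta-1 and zeta-2.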
Multilevel Model Notation
• The “separate equation” formulation is no
different from what we did before…
• But it is a vivid & clear way to present your models
• All random components are obvious because they are
stated in separate equations
• NOTE: Some software (e.g., HLM) requires this
– Rules:
• 1. Specify an OLS model, just like normal
• 2. Consider which OLS coefficients should have a
random component
– These could be the intercept or any X variable (slope)
• 3. Specify an additional formula for each random
coefficient.