Multilevel Models 2
Sociology 8811, Class 24
Copyright © 2007 by Evan Schofer
Do not copy or distribute without permission

Announcements
• Paper #2 due in 2 weeks
• Come see me ASAP if you don’t have a plan
• Unfortunately, I’m unavailable during office hours today
  – Please send me an email to make an appointment at some other time.

Multilevel Data
• Simple example: 2-level data
    [Figure: cases clustered within classes]
• Which can be shown as:

    Level 2:    Class 1         Class 2         Class 3
    Level 1:    S1  S2  S3      S1  S2  S3      S1  S2  S3

Review: Multilevel Strategies
• Problems of multilevel models
• Non-independence; correlated error
• Standard errors = underestimated
• Solutions:
  – Each has benefits, disadvantages…
  1. OLS regression
  2. Aggregation (between effects model)
  3. Robust Standard Errors
  4. Robust Cluster Standard Errors
  5. Dummy variables (Fixed Effects Model)
  6. Random effects models

Example: Pro-environmental values
• Source: World Values Survey (27 countries)
• Let’s simply try OLS regression

    . reg supportenv age male dmar demp educ incomerel ses

          Source |       SS       df       MS              Number of obs =   27807
    -------------+------------------------------           F(  7, 27799) =  104.06
           Model |  2761.86228     7  394.551755           Prob > F      =  0.0000
        Residual |  105404.878 27799  3.79167876           R-squared     =  0.0255
    -------------+------------------------------           Adj R-squared =  0.0253
           Total |   108166.74 27806  3.89005036           Root MSE      =  1.9472

    ------------------------------------------------------------------------------
      supportenv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |  -.0021927    .000803    -2.73   0.006    -.0037666   -.0006187
            male |   .0960975   .0236758     4.06   0.000     .0496918    .1425032
            dmar |   .0959759     .02527     3.80   0.000     .0464455    .1455063
            demp |  -.1226363   .0254293    -4.82   0.000     -.172479   -.0727937
            educ |   .1117587   .0058261    19.18   0.000     .1003393    .1231781
       incomerel |   .0131716   .0056011     2.35   0.019     .0021931    .0241501
             ses |   .0922855   .0134349     6.87   0.000     .0659525    .1186186
           _cons |   5.742023   .0518026   110.84   0.000     5.640487    5.843559
    ------------------------------------------------------------------------------

Dummy Variables
• Another solution to correlated error within groups/clusters: Add dummy variables
• Include a dummy variable for each Level-2 group, to explicitly model variance in means
• A simple version of a “fixed effects” model (see below)
• Ex: Student achievement; data from 3 classes
• Level 1: students; Level 2: classroom
• Create dummy variables for each class
  – Include all but one dummy variable in the model
  – Or include all dummies and suppress the intercept

    \[ Y_i = \alpha + \beta_1 D_{\text{Class 2},i} + \beta_2 D_{\text{Class 3},i} + \beta_3 X_i + \varepsilon_i \]

Dummy Variables
• What is the consequence of adding group dummy variables?
• A separate intercept is estimated for each group
• Correlated error is absorbed into intercept
  – Groups won’t systematically fall above or below the regression line
• In fact, all “between group” variation (not just error) is absorbed into the intercept
  – Thus, other variables are really just looking at within group effects
  – This can be good or bad, depending on your goals.

Example: Pro-environmental values
• Dummy variable model

    . reg supportenv age male dmar demp educ incomerel ses _Icountry*

          Source |       SS       df       MS              Number of obs =   27807
    -------------+------------------------------           F( 32, 27774) =   98.50
           Model |  11024.1401    32  344.504377           Prob > F      =  0.0000
        Residual |  97142.6001 27774  3.49760928           R-squared     =  0.1019
    -------------+------------------------------           Adj R-squared =  0.1009
           Total |   108166.74 27806  3.89005036           Root MSE      =  1.8702

    ------------------------------------------------------------------------------
      supportenv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |  -.0038917   .0008158    -4.77   0.000    -.0054906   -.0022927
            male |   .0979514   .0229672     4.26   0.000     .0529346    .1429683
            dmar |   .0024493   .0252179     0.10   0.923     -.046979    .0518777
            demp |  -.0733992   .0252937    -2.90   0.004    -.1229761   -.0238223
            educ |   .0856092   .0061574    13.90   0.000     .0735404     .097678
       incomerel |   .0088841   .0059384     1.50   0.135    -.0027554    .0205237
             ses |   .1318295   .0134313     9.82   0.000     .1055036    .1581554
    _Icountry_32 |  -.4775214    .085175    -5.61   0.000    -.6444687   -.3105742
    _Icountry_50 |   .3943565   .0844248     4.67   0.000     .2288798    .5598332
    _Icountry_70 |   .1696262   .0865254     1.96   0.050     .0000321    .3392203
                   … dummies omitted …
    _Icountr~891 |    .243995   .0802556     3.04   0.002       .08669    .4012999
           _cons |   5.848789    .082609    70.80   0.000     5.686872    6.010707
    ------------------------------------------------------------------------------

Dummy Variables
• Benefits of the dummy variable approach
• It is simple
  – Just estimate a different intercept for each group
• Sometimes the dummy interpretations can be of interest
• Weaknesses
• Cumbersome if you have many groups
• Uses up lots of degrees of freedom (not parsimonious)
• Makes it hard to look at other kinds of group dummies
  – Non-varying group variables = collinear with dummies
• Can be problematic if your main interest is to study effects of variables across groups
  – Dummies purge that variation… focus on within-group variation
  – If you don’t have much within group variation, there isn’t much left to analyze.
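
The point that group dummies leave only within-group variation can be checked numerically. What follows is a minimal sketch in Python (standard library only), not part of the Stata workflow in these slides. The data are simulated, and the class labels, intercepts, slope of 0.5, and function names are all illustrative assumptions. It fits the dummy-variable model (one dummy per class, no overall intercept) by least squares and compares the slope with the one from regressing class-demeaned Y on class-demeaned X.

```python
import random

def group_means(values, groups):
    """Return {group label: mean of the values in that group}."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return {g: sums[g] / counts[g] for g in sums}

def within_slope(x, y, groups):
    """Slope from regressing group-demeaned y on group-demeaned x."""
    xbar, ybar = group_means(x, groups), group_means(y, groups)
    num = sum((xi - xbar[g]) * (yi - ybar[g]) for xi, yi, g in zip(x, y, groups))
    den = sum((xi - xbar[g]) ** 2 for xi, g in zip(x, groups))
    return num / den

def solve(a, b):
    """Solve the linear system a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (m[r][n] - sum(m[r][c] * z[c] for c in range(r + 1, n))) / m[r][r]
    return z

def lsdv(x, y, groups):
    """OLS of y on one dummy per group plus x, with no overall intercept.
    Returns ({group label: intercept}, slope on x)."""
    labels = sorted(set(groups))
    rows = [[1.0 if g == lab else 0.0 for lab in labels] + [xi]
            for xi, g in zip(x, groups)]
    k = len(labels) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coefs = solve(xtx, xty)
    return dict(zip(labels, coefs[:-1])), coefs[-1]

# Simulated data (hypothetical): 3 classes, different intercepts, common slope 0.5
random.seed(8811)
true_intercepts = {"class1": 2.0, "class2": 5.0, "class3": 9.0}
x, y, groups = [], [], []
for g, a in true_intercepts.items():
    for _ in range(50):
        xi = random.gauss(10, 2)
        x.append(xi)
        y.append(a + 0.5 * xi + random.gauss(0, 1))
        groups.append(g)

intercepts, b_lsdv = lsdv(x, y, groups)
b_within = within_slope(x, y, groups)
print("Dummy-variable slope:", round(b_lsdv, 6))
print("Within-group slope:  ", round(b_within, 6))
```

The two slopes agree up to floating-point error, which is the sense in which the dummy-variable model looks only at within-group variation.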
Dummy Variables
• Note: Dummy variables are a simple example of a “fixed effects” model (FEM)
• Effect of each group is modeled as a “fixed effect” rather than a random variable
• Also can be thought of as the “within-group” estimator
  – Looks purely at variation within groups
  – Stata can do a Fixed Effects Model without the effort of using all the dummy variables
• Simply request the “fixed effects” estimator in xtreg.

Fixed Effects Model (FEM)
• Fixed effects model:

    \[ Y_{ij} = \alpha_j + \beta X_{ij} + \varepsilon_{ij} \]

• For i cases within j groups
• Therefore \(\alpha_j\) is a separate intercept for each group
• It is equivalent to looking solely at within-group variation:

    \[ Y_{ij} - \bar{Y}_j = \beta (X_{ij} - \bar{X}_j) + \varepsilon_{ij} - \bar{\varepsilon}_j \]

• \(\bar{X}_j\) is the mean of X for group j, etc.
• Model is “within group” because all variables are centered around the mean of each group.

Fixed Effects Model (FEM)

    . xtreg supportenv age male dmar demp educ incomerel ses, i(country) fe

    Fixed-effects (within) regression               Number of obs      =     27807
    Group variable (i): country                     Number of groups   =        26

    R-sq:  within  = 0.0220                         Obs per group: min =       511
           between = 0.0368                                        avg =    1069.5
           overall = 0.0239                                        max =      2154

                                                    F(7,27774)         =     89.23
    corr(u_i, Xb)  = 0.0213                         Prob > F           =    0.0000

    ------------------------------------------------------------------------------
      supportenv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |  -.0038917   .0008158    -4.77   0.000    -.0054906   -.0022927
            male |   .0979514   .0229672     4.26   0.000     .0529346    .1429683
            dmar |   .0024493   .0252179     0.10   0.923     -.046979    .0518777
            demp |  -.0733992   .0252937    -2.90   0.004    -.1229761   -.0238223
            educ |   .0856092   .0061574    13.90   0.000     .0735404     .097678
       incomerel |   .0088841   .0059384     1.50   0.135    -.0027554    .0205237
             ses |   .1318295   .0134313     9.82   0.000     .1055036    .1581554
           _cons |   5.878524    .052746   111.45   0.000     5.775139    5.981908
    -------------+----------------------------------------------------------------
         sigma_u |  .55408807
         sigma_e |  1.8701896
             rho |  .08069488   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------
    F test that all u_i=0:     F(25, 27774) =    94.49       Prob > F = 0.0000

• Note: Coefficients on the X variables are identical to the dummy variable model! (Only _cons differs.)

ANOVA: A Digression
• Suppose you wish to model variable Y for j groups (clusters)
• Ex: Wages for different racial groups
• Definitions:
• The grand mean is the mean of all cases, across all groups
  – \(\bar{Y}\)
• The group mean is the mean of a particular sub-group of the population
  – \(\bar{Y}_j\)

ANOVA: Concepts & Definitions
• Y is the dependent variable
• We are looking to see if Y depends upon the particular group a person is in
• The effect of a group is the difference between a group’s mean & the grand mean
• Effect is denoted by alpha (\(\alpha\))
• If \(\bar{Y}\) = $8.75 and \(\bar{Y}_{\text{Group 1}}\) = $8.90, then \(\alpha_{\text{Group 1}}\) = $0.15
• Effect of being in group j is:

    \[ \alpha_j = \bar{Y}_j - \bar{Y} \]

• It is like a deviation, but for a group.

ANOVA: Concepts & Definitions
• ANOVA is based on partitioning deviation
• We initially calculated deviation as the distance of a point from the grand mean:

    \[ d_i = Y_i - \bar{Y} \]

• But, you can also think of deviation from a group mean (called “e”):

    \[ e_{i,\text{Group 1}} = Y_{i,\text{Group 1}} - \bar{Y}_{\text{Group 1}} \]

• Or, for any case i in group j:

    \[ e_{ij} = Y_{ij} - \bar{Y}_j \]

ANOVA: Concepts & Definitions
• The location of any case is determined by:
• The Grand Mean, \(\mu\), common to all cases
• The group “effect” \(\alpha\), common to members
  – The distance between a group and the grand mean
  – “Between group” variation
• The within-group deviation (e): called “error”
  – The distance from the group mean to a case’s value

The ANOVA Model
• This is the basis for a formal model:
• For any population with mean \(\mu\)
• Comprised of J subgroups, \(N_j\) in each group
• Each with a group effect \(\alpha\)
• The location of any individual can be expressed as follows:

    \[ Y_{ij} = \mu + \alpha_j + e_{ij} \]

• \(Y_{ij}\) refers to the value of case i in group j
• \(e_{ij}\) refers to the “error” (i.e., deviation from group mean) for case i in group j

Sum of Squared Deviation
• We are most interested in two parts of model
• The group effects: \(\alpha_j\)
  – Deviation of the group from the grand mean
• Individual case error: \(e_{ij}\)
  – Deviation of the individual from the group mean
• Each are deviations that can be summed up
• Remember, we square deviations when summing
  – Otherwise, they add up to zero
• Remember, variance is just averaged squared deviation

Sum of Squared Deviation
• The total deviation can be partitioned into \(\alpha_j\) and \(e_{ij}\) components:
• That is, \(\alpha_j + e_{ij}\) = total deviation:

    \[ \alpha_j = \bar{Y}_j - \bar{Y} \]
    \[ e_{ij} = Y_{ij} - \bar{Y}_j \]
    \[ e_{ij} + \alpha_j = (\bar{Y}_j - \bar{Y}) + (Y_{ij} - \bar{Y}_j) = Y_{ij} - \bar{Y} \]

Sum of Squared Deviation
• The total deviation can be partitioned into \(\alpha_j\) and \(e_{ij}\) components:
• The total variance (SStotal) is made up of:
  – \(\alpha_j\): between group variance (SSbetween)
  – \(e_{ij}\): within group variance (SSwithin)
  – SStotal = SSbetween + SSwithin

ANOVA & Fixed Effects
• Note that the ANOVA model is similar to the fixed effects model
• But FEM also includes a \(\beta X\) term to model linear trend

    ANOVA:                \[ Y_{ij} = \mu + \alpha_j + e_{ij} \]
    Fixed Effects Model:  \[ Y_{ij} = \alpha_j + \beta X_{ij} + \varepsilon_{ij} \]

• In fact, if you don’t specify any X variables, they are pretty much the same

Within Group & Between Group Models
• Group-effect dummy variables in a regression model create a specific estimate of group effects for all cases
• \(\beta\)s & error are based on remaining “within group” variation
• We could do the opposite: ignore within-group variation and just look at differences between groups
• Stata’s xtreg command can do this, too
• This is essentially just modeling group means!

Between Group Model

    . xtreg supportenv age male dmar demp educ incomerel ses, i(country) be

    Between regression (regression on group means)  Number of obs      =        27
    Group variable (i): country                     Number of groups   =        27

    R-sq:  within  =      .                         Obs per group: min =         1
           between = 0.2505                                        avg =       1.0
           overall = 0.2505                                        max =         1

                                                    F(7,19)            =      0.91
    sd(u_i + avg(e_i.))=  .6378002                  Prob > F           =    0.5216

    ------------------------------------------------------------------------------
      supportenv |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |   .0211517   .0391649     0.54   0.595    -.0608215    .1031248
            male |   3.966173   4.479358     0.89   0.387    -5.409232    13.34158
            dmar |   .8001333   1.127099     0.71   0.486    -1.558913     3.15918
            demp |  -.0571511   1.165915    -0.05   0.961    -2.497439    2.383137
            educ |   .3743473   .2098779     1.78   0.090    -.0649321    .8136268
       incomerel |    .148134   .1687438     0.88   0.391    -.2050508    .5013188
             ses |  -.4126738   .4916416    -0.84   0.412    -1.441691    .6163439
           _cons |   2.031181   3.370978     0.60   0.554    -5.024358     9.08672
    ------------------------------------------------------------------------------

• Note: Results are identical to the aggregated analysis…
• Note that N is reduced to 27

Fixed vs. Random Effects
• Dummy variables produce a “fixed” estimate of the intercept for each group
• But, models don’t need to be based on fixed effects
• Example: The error term (\(e_i\))
• We could estimate a fixed value for all cases
  – This would use up lots of degrees of freedom, even more than using group dummies
• In fact, we would use up ALL degrees of freedom
  – Stata output would simply report back the raw data (expressed as deviations from the constant)
• Instead, we model e as a random variable
  – We assume it is normal, with standard deviation sigma.

Random Effects
• Issue: The dummy variable approach (ANOVA, FEM) treats group differences as a fixed effect
• Alternatively, we can treat it as a random effect
• Don’t estimate values for each case, but model it
• This requires making assumptions
  – e.g., that group differences are normally distributed with a standard deviation that can be estimated from data

Random Effects
• A simple random intercept model
  – Notation from Rabe-Hesketh & Skrondal 2005, p. 4-5

    Random Intercept Model:  \[ Y_{ij} = \beta_0 + \zeta_j + \varepsilon_{ij} \]

• Where \(\beta_0\) is the main intercept
• Zeta (\(\zeta\)) is a random effect for each group
  – Allowing each of j groups to have its own intercept
  – Assumed to be independent & normally distributed
• Error (\(\varepsilon\)) is the error term for each case
  – Also assumed to be independent & normally distributed
• Note: Other texts refer to random intercepts as \(u_j\) or \(\nu_j\).

Linear Random Intercepts Model
• The random intercept idea can be applied to linear regression
• Often called a “random effects” model…
• Result is similar to FEM, BUT:
• FEM looks only at within group effects
• Aggregate models (“between effects”) look across groups
  – Random effects models yield a weighted average of between & within group effects
• It exploits between & within information, and thus can be more efficient than FEM & aggregate models.
  – IF distributional assumptions are correct.

Linear Random Intercepts Model

    . xtreg supportenv age male dmar demp educ incomerel ses, i(country) re

    Random-effects GLS regression                   Number of obs      =     27807
    Group variable (i): country                     Number of groups   =        26

    R-sq:  within  = 0.0220                         Obs per group: min =       511
           between = 0.0371                                        avg =    1069.5
           overall = 0.0240                                        max =      2154

    Random effects u_i ~ Gaussian                   Wald chi2(7)       =    625.50
    corr(u_i, X)       = 0 (assumed)                Prob > chi2        =    0.0000

    ------------------------------------------------------------------------------
      supportenv |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             age |  -.0038709   .0008152    -4.75   0.000    -.0054688   -.0022731
            male |   .0978732   .0229632     4.26   0.000     .0528661    .1428802
            dmar |   .0030441   .0252075     0.12   0.904    -.0463618      .05245
            demp |  -.0737466   .0252831    -2.92   0.004    -.1233007   -.0241926
            educ |   .0857407   .0061501    13.94   0.000     .0736867    .0977947
       incomerel |   .0090308   .0059314     1.52   0.128    -.0025945    .0206561
             ses |    .131528   .0134248     9.80   0.000     .1052158    .1578402
           _cons |   5.924611   .1287468    46.02   0.000     5.672272     6.17695
    -------------+----------------------------------------------------------------
         sigma_u |  .59876138
         sigma_e |  1.8701896
             rho |  .09297293   (fraction of variance due to u_i)
    ------------------------------------------------------------------------------

• Note: Assumes normal \(u_j\), uncorrelated with X vars
• Note: sigma_u = SD of u (intercepts); sigma_e = SD of e; rho = intra-class correlation

Linear Random Intercepts Model
• Notes: Model can also be estimated with maximum likelihood estimation (MLE)
• Stata: xtreg y x1 x2 x3, i(groupid) mle
  – Versus “re”, which specifies the GLS (weighted least squares) estimator
• Results tend to be similar
• But, MLE results include a formal test to see whether intercepts really vary across groups
  – Significant p-value indicates that intercepts vary

    . xtreg supportenv age male dmar demp educ incomerel ses, i(country) mle

    Random-effects ML regression                    Number of obs      =     27807
    Group variable (i): country                     Number of groups   =        26

    … MODEL RESULTS OMITTED …

        /sigma_u |   .5397755   .0758087                      .4098891    .7108206
        /sigma_e |   1.869954   .0079331                       1.85447    1.885568
             rho |   .0769142    .019952                      .0448349    .1240176
    ------------------------------------------------------------------------------
    Likelihood-ratio test of sigma_u=0: chibar2(01)= 2128.07 Prob>=chibar2 = 0.000

Choosing Models
• Which model is best?
• There is much discussion (e.g., Halaby 2004)
• Fixed effects are consistent under the widest range of circumstances
  – Consistent: Estimates approach true parameter values as N grows very large
• But, they are less efficient than random effects
  – In cases with low within-group variation (big between group variation) and small sample size, results can be very poor
  – Random Effects = more efficient
• But, random effects runs into problems if specification is poor
  – Esp. if X variables correlate with random group effects.

Hausman Specification Test
• Hausman Specification Test: A tool to help evaluate fit of fixed vs. random effects
• Logic: Both fixed & random effects models are consistent if models are properly specified
• However, some model violations cause random effects models to be inconsistent
  – Ex: if X variables are correlated with the random error
• In short: Models should give the same results… If not, random effects may be biased
  – If results are similar, use the most efficient model: random effects
  – If results diverge, odds are that the random effects model is biased. In that case use fixed effects…

Hausman Specification Test
• Strategy: Estimate both fixed & random effects models
• Save the estimates each time
• Finally invoke Hausman test
  – Ex:

    xtreg var1 var2 var3, i(groupid) fe
    estimates store fixed
    xtreg var1 var2 var3, i(groupid) re
    estimates store random
    hausman fixed random

Hausman Specification Test
• Example: Environmental attitudes fe vs re
• Direct comparison of coefficients…

    . hausman fixed random

                     ---- Coefficients ----
                 |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))
                 |     fixed        random       Difference          S.E.
    -------------+----------------------------------------------------------------
             age |   -.0038917    -.0038709       -.0000207        .0000297
            male |    .0979514     .0978732        .0000783        .0004277
            dmar |    .0024493     .0030441       -.0005948        .0007222
            demp |   -.0733992    -.0737466        .0003475        .0007303
            educ |    .0856092     .0857407       -.0001314        .0002993
       incomerel |    .0088841     .0090308       -.0001467        .0002885
             ses |    .1318295      .131528        .0003015        .0004153
    ------------------------------------------------------------------------------
                    b = consistent under Ho and Ha; obtained from xtreg
                    B = inconsistent under Ha, efficient under Ho; obtained from xtreg

        Test:  Ho:  difference in coefficients not systematic

                      chi2(7) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                              =        2.70
                    Prob>chi2 =      0.9116

• Non-significant p-value indicates that models yield similar results…
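
As a rough arithmetic check on the Hausman table, the sketch below (Python, standard library only; not Stata's own computation) sums the squared ratio of each coefficient difference to its reported S.E. This treats V_b - V_B as diagonal, which is an assumption: Stata uses the full matrix, so the result should land near the reported 2.70 without matching it. The function name is hypothetical, and the 5% critical value for chi-squared with 7 df (14.07) is supplied here for illustration.

```python
# Values copied from the hausman output:
# (fixed-effects b, random-effects B, reported S.E. of the difference)
rows = {
    "age":       (-.0038917, -.0038709, .0000297),
    "male":      ( .0979514,  .0978732, .0004277),
    "dmar":      ( .0024493,  .0030441, .0007222),
    "demp":      (-.0733992, -.0737466, .0007303),
    "educ":      ( .0856092,  .0857407, .0002993),
    "incomerel": ( .0088841,  .0090308, .0002885),
    "ses":       ( .1318295,  .131528,  .0004153),
}

def hausman_diag_approx(rows):
    """Sum of ((b - B) / se)^2 over coefficients.

    This equals the Hausman statistic only if V_b - V_B is diagonal
    (an assumption made here because only the diagonal is reported).
    """
    return sum(((b - B) / se) ** 2 for b, B, se in rows.values())

h = hausman_diag_approx(rows)
CRIT_CHI2_7_05 = 14.07  # 5% critical value, chi-squared with 7 df

print(f"diagonal-only statistic: {h:.2f} (Stata's full-matrix value: 2.70)")
print("reject Ho at the 5% level?", h > CRIT_CHI2_7_05)
```

Both the approximation and Stata's value fall far below the critical value, consistent with the slide's reading that the fixed and random effects estimates do not differ systematically.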