ED216C HLM 2010
Prof. Rumberger
Lecture: Week 2
I. Preliminary matters
Correction on notation: Lecture notes page 7 (corrected on website version)
School 1:
Yi1 = β01 + β11 Xi1 + ri1
School j:
Yij = β0j + β1j Xij + rij
Notation: The symbol ^ , which is used to represent an estimated parameter, should
appear over the symbol.
Symbols such as Y·j should have a bar (¯) over them.
Questions about anything?
II. Review of models (Handout 4)
Differences due to
(1) Whether predictors at level 1 or level 2;
(2) Whether level-2 predictors are fixed or random
III. One-Way ANOVA (fully unconditional or null) model
Normally first model tested
Used to:
(1) Estimate grand mean for outcome (dependent) variable
(2) Estimate variance components at both levels
(3) Estimate intra-class correlations and reliability
(4) Test hypothesis whether all j units have same mean
Model:
Level-1 model:
Yij = β0j + rij ,
rij ~ N (0, σ²), where σ² = level-1 variance
Level-2 model:
β0j = γ00 + u0j ,
u0j ~ N (0, τ00), where τ00 = level-2 variance
β0j = mean outcome for unit j
γ00 = grand-mean outcome in the population
u0j = random effect associated with unit j
rij = level-1 error term
Combined model:
Yij = γ00 + u0j + rij
(fixed effect: γ00; random effects: u0j + rij)
Var (Yij) = Var (γ00 + u0j + rij) = τ00 + σ²
Intra-class correlation: ρ = τ00 / (τ00 + σ²) measures the proportion of variance
between level-2 units
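The variance decomposition and the intra-class correlation can be checked by simulation. The following is a minimal Python sketch, not part of the lecture: the function names, the number of units, and the per-unit sample size are illustrative assumptions, and the parameter values are borrowed from the HSB illustration later in these notes.

```python
import random
import statistics

def simulate_one_way_anova(n_units, n_per_unit, gamma00, tau00, sigma2, seed=0):
    """Draw Y_ij = gamma00 + u_0j + r_ij with u_0j ~ N(0, tau00), r_ij ~ N(0, sigma2)."""
    rng = random.Random(seed)
    outcomes = []
    for _ in range(n_units):
        u0j = rng.gauss(0.0, tau00 ** 0.5)        # level-2 random effect
        for _ in range(n_per_unit):
            rij = rng.gauss(0.0, sigma2 ** 0.5)   # level-1 error
            outcomes.append(gamma00 + u0j + rij)
    return outcomes

def intraclass_correlation(tau00, sigma2):
    """rho = tau00 / (tau00 + sigma2)."""
    return tau00 / (tau00 + sigma2)

# Assumed design: 2000 units with 20 observations each.
y = simulate_one_way_anova(n_units=2000, n_per_unit=20,
                           gamma00=12.64, tau00=8.61, sigma2=39.15)
total_variance = statistics.pvariance(y)
print(round(total_variance, 2))                        # close to 8.61 + 39.15
print(round(intraclass_correlation(8.61, 39.15), 2))   # 0.18
```

The simulated total variance only approximates τ00 + σ² because it comes from a finite sample.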
Estimation
Two possible estimators for β0j, the true school mean:
(a) Ȳ·j (sample mean) = β0j + r̄·j
Var (r̄·j) = σ² / nj = Vj (error variance)
(b) Ȳ·j = γ00 + u0j + r̄·j
Var (Ȳ·j) = Var (u0j) + Var (r̄·j)
= τ00 + Vj
= parameter variance + error variance
= ∆j
In the case where sample sizes are equal, ∆ = τ00 + V (constant)
The unique, minimum-variance unbiased estimator of γ00 is:
γ̂00 = ∑ Ȳ·j / J
In the case where sample sizes are unequal, ∆j = τ00 + Vj (variable)
The unique, minimum-variance unbiased estimator of γ00 is:
γ̂00 = ∑ ∆j⁻¹ Ȳ·j / ∑ ∆j⁻¹, where ∆j⁻¹ = Precision (Ȳ·j)
This is the weighted least squares (maximum likelihood) estimator
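As a rough illustration of the precision-weighted estimator, the sketch below weights each school mean by 1 / (τ00 + σ² / nj). The function name and the three schools are hypothetical, and the variance components are treated as known.

```python
def precision_weighted_grand_mean(school_means, school_sizes, tau00, sigma2):
    """Estimate gamma00 as sum(prec_j * Ybar_j) / sum(prec_j),
    where prec_j = 1 / (tau00 + sigma2 / n_j)."""
    precisions = [1.0 / (tau00 + sigma2 / n_j) for n_j in school_sizes]
    weighted_sum = sum(p * ybar for p, ybar in zip(precisions, school_means))
    return weighted_sum / sum(precisions)

# Hypothetical data: three schools with unequal sample sizes.
means = [10.0, 13.0, 16.0]
sizes = [5, 50, 20]

unequal = precision_weighted_grand_mean(means, sizes, tau00=8.61, sigma2=39.15)
equal = precision_weighted_grand_mean(means, [20, 20, 20], tau00=8.61, sigma2=39.15)
print(round(unequal, 3))   # the small school (n = 5) receives the least weight
print(round(equal, 3))     # equal sizes give equal weights: the simple average, 13.0
```

With equal sample sizes every precision is the same, so the weighted estimator reduces to the simple average of the school means, matching the equal-n case above.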
HLM uses a weighted combination of both, known as a Bayes estimator:
β*0j = λj Ȳ·j + (1 - λj) γ̂00 , where
λj = reliability of the least squares estimator, Ȳ·j , for the parameter, β0j
= Var (β0j) / Var (Ȳ·j)
= τ00 / (τ00 + Vj )
= (parameter variance)/(parameter variance + error variance)
Reliability will be large (close to 1) when (a) group means vary
substantially across level-2 units or (b) level-1 sample sizes are
large
The more reliable Ȳ·j is, the more weight it receives in β*0j
The less reliable Ȳ·j is, the more weight γ̂00 receives in β*0j
β*0j “pulls” Ȳ·j toward the grand mean, γ̂00 , so it is called a shrinkage estimator
When λj is computed from known variances, β*0j is known as a Bayes estimator
When λj is computed from unknown variances, β*0j is known as an empirical
Bayes estimator
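The shrinkage behavior described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not HLM's own routine: the function names and the example school (sample mean 18.0) are hypothetical, and the variance components are plugged in as if already estimated.

```python
def reliability(tau00, sigma2, n_j):
    """lambda_j = tau00 / (tau00 + V_j), with V_j = sigma2 / n_j."""
    return tau00 / (tau00 + sigma2 / n_j)

def shrinkage_estimate(school_mean, n_j, grand_mean, tau00, sigma2):
    """beta*_0j = lambda_j * Ybar_j + (1 - lambda_j) * gamma00_hat."""
    lam = reliability(tau00, sigma2, n_j)
    return lam * school_mean + (1.0 - lam) * grand_mean

# Hypothetical school with sample mean 18.0, observed with a small and a large sample.
for n_j in (5, 60):
    lam = reliability(8.61, 39.15, n_j)
    est = shrinkage_estimate(18.0, n_j, grand_mean=12.64, tau00=8.61, sigma2=39.15)
    print(n_j, round(lam, 2), round(est, 2))
```

The estimate from the small sample is pulled further toward the grand mean than the estimate from the large sample, because its reliability is lower.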
Interval estimation:
95%CI (β0j) = β*0j ± 1.96 (V*j)½ (when variances, σ² and τ00, are
known)
With unknown variances, we estimate reliabilities:
λj = reliability (Ȳ·j)
= τ00 / [τ00 + (σ² / nj)] for each school j
Overall reliability is λ = ∑ λj / J
Hypothesis Testing
Two types of hypothesis testing:
1. Single parameter tests
2. Multiple parameter tests (Differences across models)
e.g., significance of single predictor variable in multiple equations
e.g., significance of multiple predictors
Choice depends on particular hypothesis you want to test
To begin with, usually limit yourself to single-parameter tests
One of the most common tests, for testing level-2 variance:
H0: τqq = 0
Illustration using HSB data (Book, pp. 69-72)
Review data descriptives
Model:
Level-1 model: Yij = β0j + rij ,
rij ~ N (0, σ²), where σ² = level-1 variance
Level-2 model: β0j = γ00 + u0j ,
u0j ~ N (0, τ00), where τ00 = level-2 variance
Results:
1. Fixed effects
^00 = 12.64
95%CI = 12.64 ± 1.96 (.024) = (12.17, 13.11)
2. Variance components (Random effects)
Var^ (rij) = 39.15
Var^ (0j) = Var^ (u0j) = τ^00 = 8.61
Range of plausible values among schools:
^00 ± 1.96 (τ^00)½
12.64 ± 1.96 (8.61)½ = (6.89, 18.39)
Hypothesis testing: H0: τqq = 0
3. Intra-class correlation
Intra-class correlation: ρ̂ = τ̂00 / (τ̂00 + σ̂²)
= 8.61 / (8.61 + 39.15)
= .18
4. Reliability
λ̂j = reliability (Ȳ·j)
= τ̂00 / (τ̂00 + σ̂² / nj) for each school j
Overall reliability is λ̂ = ∑ λ̂j / J
For this model, reliability = .90, meaning within-school sample means
are good estimates of the true school means (recall nj > 40 students)
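The arithmetic in this illustration can be reproduced with a short Python sketch. The variable names are my own, and the standard error is taken as 0.24, the value that reproduces the reported interval (12.17, 13.11). The overall reliability of .90 is not reproduced because it depends on each school's nj, which these notes do not list.

```python
gamma00_hat = 12.64   # estimated grand mean
se_gamma00 = 0.24     # standard error of gamma00_hat (assumed; see lead-in)
tau00_hat = 8.61      # level-2 (between-school) variance
sigma2_hat = 39.15    # level-1 (within-school) variance

def interval(center, half_width):
    """Return (lower, upper) rounded to two decimals."""
    return (round(center - half_width, 2), round(center + half_width, 2))

ci_grand_mean = interval(gamma00_hat, 1.96 * se_gamma00)
plausible_range = interval(gamma00_hat, 1.96 * tau00_hat ** 0.5)
icc = tau00_hat / (tau00_hat + sigma2_hat)

print(ci_grand_mean)     # (12.17, 13.11)
print(plausible_range)   # (6.89, 18.39)
print(round(icc, 2))     # 0.18
```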
IV. Regression with Means-as-Outcomes
Model:
Level-1 model:
Yij = β0j + rij
Level-2 model:
β0j = γ00 + γ01 Wj + u0j
Combined model:
Yij = γ00 + γ01 Wj + u0j + rij
(fixed effects: γ00 + γ01 Wj; random effects: u0j + rij)
γ00 = intercept
γ01 = effect of Wj on β0j
u0j = β0j - γ00 - γ01Wj = residual
τ00 = residual or conditional variance in β0j, after controlling for Wj
Estimation:
As with the unconditional model, HLM produces a composite estimator for β0j
β*0j = λj Ȳ·j + (1 - λj) (γ̂00 + γ̂01Wj )
Again this is an empirical Bayes or shrinkage estimator, but in this case Ȳ·j is
shrunk toward a predicted value rather than the grand mean, so it is called a
conditional shrinkage estimator
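A minimal sketch of the conditional shrinkage estimator follows. The function name, the example school, and the coefficient and variance values passed in are illustrative assumptions; the only change from the unconditional case is that the sample mean is pulled toward γ̂00 + γ̂01Wj instead of the grand mean.

```python
def conditional_shrinkage(school_mean, n_j, w_j, gamma00, gamma01, tau00, sigma2):
    """beta*_0j = lambda_j * Ybar_j + (1 - lambda_j) * (gamma00 + gamma01 * W_j)."""
    lam = tau00 / (tau00 + sigma2 / n_j)   # tau00 is the residual level-2 variance
    predicted = gamma00 + gamma01 * w_j    # level-2 prediction for this school
    return lam * school_mean + (1.0 - lam) * predicted

# Hypothetical school: sample mean 18.0 from 20 students, W_j = 0.5.
estimate = conditional_shrinkage(18.0, 20, 0.5, gamma00=12.65, gamma01=5.86,
                                 tau00=2.64, sigma2=39.16)
print(round(estimate, 2))   # lies between the prediction (15.58) and the sample mean (18.0)
```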
Interval estimation is the same as in the unconditional model:
95%CI (β0j) = β*0j ± 1.96 (V*j)½ (when variances, σ² and τ00, are known)
Hypothesis testing:
1. Single parameter test for fixed effects:
H0: γqs = 0 (effect of level-2 predictor, Wsj , on particular
level-2 parameter, βqj )
Can be tested with two statistics, but generally with t-statistic
with J - Sq - 1 DOF
2. Single parameter test for variance components:
H0: τqq = 0, where τqq = Var (βqj)
Can be tested with two statistics, but generally with chi-square (χ²) with J - Sq - 1 DOF
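The t-ratio and the degrees-of-freedom rule above amount to very little arithmetic, sketched here in Python. The function names and all input values are hypothetical and chosen only to make the calculation easy to follow.

```python
def t_ratio(estimate, standard_error):
    """t = estimate / SE(estimate), for H0: gamma_qs = 0."""
    return estimate / standard_error

def degrees_of_freedom(n_level2_units, n_level2_predictors):
    """DOF = J - S_q - 1."""
    return n_level2_units - n_level2_predictors - 1

# Hypothetical inputs: 50 level-2 units, 2 level-2 predictors,
# and a coefficient of 2.0 with a standard error of 0.5.
print(degrees_of_freedom(50, 2))   # 47
print(t_ratio(2.0, 0.5))           # 4.0
```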
Illustration using HSB data (Book, pp. 72-75)
Model:
Level-1 model:
Yij = β0j + rij
Level-2 model:
β0j = γ00 + γ01(MEAN SES)j + u0j
Results:
1. Fixed effects
γ̂00 = 12.65
γ̂01 = 5.86
t = γ̂01 / SE (γ̂01) = 16.22 (highly significant)
2. Variance components (Random effects)
Range of plausible values among schools:
^00 ± 1.96 (τ^00)½ = 12.65 ± 1.96 (2.64)½ = (9.47, 15.83)
Note how much smaller this range is than in the conditional model—i.e.
Mean achievement among schools controlling for average SES is much
less variable than overall or observed mean achievement
6
Hypothesis testing: Is residual variance, τ00 , significantly different than
zero?
Statistic: X2 with 158 DOF = 633.52, p < .001
3. Variance explained
Proportion of variance explained in β0j
= [τ̂00 (random ANOVA) - τ̂00 (MEAN SES)] / τ̂00 (random ANOVA)
= (8.61 - 2.64)/8.61
= 0.69
Meaning: Mean SES explains 69 percent of variance in mean achievement
among schools.
4. Conditional intra-class correlation
ρ̂ = τ̂00 / (τ̂00 + σ̂²)
= 2.64 / (2.64 + 39.16)
= .06
Meaning: The proportion of variance between schools after controlling for mean SES is now
6 percent, compared to 18 percent in the unconditional model; much less
between-school variability remains
5. Conditional reliability
Reliability of the least squares residual, û0j = Ȳ·j - γ̂00 - γ̂01(MEAN SES)j :
λ̂ = .74
Still highly reliable, but less reliable than the sample means (.90)
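The derived quantities in this illustration (plausible-value range, variance explained, conditional intra-class correlation) follow directly from the reported estimates. The Python sketch below reproduces them; the variable names are my own. The conditional reliability of .74 is not recomputed because it requires each school's nj.

```python
tau00_anova = 8.61      # level-2 variance, one-way ANOVA model
tau00_meanses = 2.64    # residual level-2 variance, MEAN SES model
sigma2_hat = 39.16      # level-1 variance, MEAN SES model
gamma00_hat = 12.65     # intercept, MEAN SES model

variance_explained = (tau00_anova - tau00_meanses) / tau00_anova
conditional_icc = tau00_meanses / (tau00_meanses + sigma2_hat)
half_width = 1.96 * tau00_meanses ** 0.5
plausible_range = (round(gamma00_hat - half_width, 2),
                   round(gamma00_hat + half_width, 2))

print(round(variance_explained, 2))   # 0.69
print(round(conditional_icc, 2))      # 0.06
print(plausible_range)                # (9.47, 15.83)
```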