PART 4
Non-linear models
• Logistic regression
• Other non-linear models
• Generalized Estimating Equations (GEE)
• Examples
– Crossover study
– British Social Attitudes Survey
Term 4, 2006
BIO656--Multilevel Models
Models for Clustered Data
Inferential goals
• Marginal mean/Population Averaged
– Average response across “the population”
• Mean, conditional on
– Other responses in the cluster
– Unobserved random effects
Interpreting Linear Model Coefficients
Same interpretation for conditional (cluster-specific)
and population-averaged inferences
• Unit change in dependent variable for a unit change in
regressor
• Multi-level models specify correlations and latent effects:
– The random intercept model produces an
equal-correlation model (a single, common within-cluster correlation)
– The latent intercepts can be estimated and used for
prediction
Marginal Models
Inferential Target
• Marginal mean or population-averaged response for
different values of predictor variables
Examples
• Difference in mean alcohol consumption for
two age groups
• Rate of alcohol abuse for states with
addiction treatment programs compared to
those without
Public health assessments
Conditional Models
Conditional on other observations in cluster
• Probability that a person abuses alcohol
given family membership or given the number
of family members that do
• Probability that a person will abuse next year,
if abuses this year
• A person’s average alcohol consumption
given the average in the neighborhood
Conditional Models
Conditional on random effects
• Average consumption, conditional on a latent
tendency
• Probability that a person abuses alcohol, conditional
on a latent tendency
Can be thought of as
conditional on unmeasured covariates
The basic, conditional logistic model
• Conditional on a random effect, you have the logistic
regression:
$\operatorname{logit}(P) = \log\{P/(1-P)\} = u + \alpha + \beta X$
$u \sim (0, \sigma^2)$
Implications
• Generally, the population averaged (marginal) model will
not have the logistic shape
• In any case, the slope on a covariate will have a different
impact in the conditional and marginal models
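The implications above can be checked numerically. The following is a minimal sketch, assuming a Gaussian random intercept and illustrative values for the intercept, slope and standard deviation (none of these numbers come from the slides). It averages the conditional logistic curve over u and prints the conditional (u = 0) and marginal probabilities side by side; the function names are hypothetical.

```python
import math


def expit(z):
    """Inverse logit: probability corresponding to log-odds z."""
    return 1.0 / (1.0 + math.exp(-z))


def marginal_prob(eta, sigma, n_grid=4001, width=8.0):
    """Average expit(u + eta) over u ~ N(0, sigma^2) with a midpoint rule."""
    if sigma == 0:
        return expit(eta)
    lo = -width * sigma
    step = 2.0 * width * sigma / n_grid
    total = 0.0
    for k in range(n_grid):
        u = lo + (k + 0.5) * step
        density = math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += expit(u + eta) * density * step
    return total


# Illustrative (assumed) values, not taken from the slides
alpha, beta, sigma = 0.0, 1.0, 2.0
for x in (-2, -1, 0, 1, 2):
    eta = alpha + beta * x
    print(x, round(expit(eta), 3), round(marginal_prob(eta, sigma), 3))
```

In this sketch the marginal probabilities sit closer to 0.5 than the conditional ones at the same X, which is the flattening of the slope described above.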
Condition on u
Conditional Logistic and Marginal Shapes
$U \sim N(0, 4)$
[Figure: curves labeled u = 0 and b; the value of b is not recoverable]
Conditional Logistic and Marginal Shapes
U is a two-point mixture at ±2
[Figure: curves labeled u = 0 and b; the value of b is not recoverable]
Adjust the conditional slope
to closely match the marginal curve
• Assume that there is a population relation that is
logistic with term $\beta X$
• How far off is the marginal curve produced from the
conditional logistic curve with term $\beta X$?
• Let $\beta^*$ be the slope needed in the conditional logistic
so that the marginal curve produced from it comes
close to the population relation
– “Comes close” means to track the middle part of
the population curve
Non-linear model coefficients
• Usually, population-averaged (marginal) and conditional
models have different shapes
– Conditional logistic is not population logistic
– But, conditional probit is population probit (with Gaussian random effects)
• In any case, population-averaged and cluster-specific
coefficients have different magnitudes and
interpretations because they address different questions
• For example, when u is a two-point, 50/50 mixture at
±2, $\beta = 4$ and $\beta^* = 8$.
Need to consider impact on
probabilities not just on odds ratios
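The probit remark on this slide can be verified numerically. The sketch below assumes a Gaussian random intercept with standard deviation sigma and uses the standard result that averaging a probit curve over such an intercept gives another probit curve whose linear predictor is divided by sqrt(1 + sigma^2). The function names and the values eta = 1, sigma = 2 are illustrative assumptions.

```python
import math


def std_normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def marginal_probit(eta, sigma, n_grid=4001, width=8.0):
    """Average Phi(eta + u) over u ~ N(0, sigma^2) with a midpoint rule."""
    lo = -width * sigma
    step = 2.0 * width * sigma / n_grid
    total = 0.0
    for k in range(n_grid):
        u = lo + (k + 0.5) * step
        density = math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        total += std_normal_cdf(eta + u) * density * step
    return total


def closed_form(eta, sigma):
    """Probit curve with the attenuated linear predictor."""
    return std_normal_cdf(eta / math.sqrt(1.0 + sigma ** 2))


eta, sigma = 1.0, 2.0  # assumed values for illustration
print(marginal_probit(eta, sigma), closed_form(eta, sigma))
```

The two printed numbers agree to numerical precision, so the shape is preserved while the coefficient shrinks. No such closed form holds for the logistic link.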
SHAPE & SLOPE CHANGES
• For linear models, regression coefficients in random
effects models and marginal models are identical:
average of linear model = linear model of average
• For non-linear models, coefficients have different
meanings and values:
average of non-linear model ≠ non-linear model of average
coefficient value and meaning in average model ≠
coefficient value and meaning in conditional model
Conditional Logistic and Marginal Shapes
Log(odds | u) = u -2.0 + 0.4X
[Figure: cluster-specific probabilities and population prevalences for X=0 and X=1]
Logistic Regression Example
Cross-over trial
2 observations per person (before/after)
Response
1= not alcohol dependent; 0 = AlcDep
(so a high probability is good!)
Predictors
period (Pd = 0 or 1)
treatment group (Trt = 0 or 1)
Parameter of interest
• Treatment vs placebo after/before log(OddsRatio)
• A positive slope favors the treatment
Baseline/Follow-up Model
i = period, j = person; logit(P) = log(P/[1-P])
Population level (no individual effects)
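$\operatorname{logit}(P_{ij}) = \alpha + \beta_1 \mathrm{PD}_{ij} + \beta_2 \mathrm{TR}_{ij} + \beta_3 \mathrm{PD}_{ij} \mathrm{TR}_{ij}$
$= \alpha + \beta_1 \mathrm{PD}_i + \beta_2 \mathrm{TR}_j + \beta_3 \mathrm{PD}_i \mathrm{TR}_j$
$\operatorname{logit}(P_{2j}) - \operatorname{logit}(P_{1j}) = \beta_1 + \beta_3 \mathrm{TR}_j$
($\beta_3$ is the treatment effect)
Person-level (individual intercept)
$\operatorname{logit}(P_{ij}) = u_j + \alpha^* + \beta_1^* \mathrm{PD}_i + \beta_2^* \mathrm{TR}_j + \beta_3^* \mathrm{PD}_i \mathrm{TR}_j$
$u_j \sim (0, \sigma^2)$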
Results for population-level regressions
(logistic without multi-level component)
Marginal Models, log(OR) (se)
Regressor               Standard Logistic   Logistic (accounting for correlation)
Intercept               0.66 (0.32)         0.67 (0.29)
Period                  -0.27 (0.38)        -0.30 (0.23)
Treatment ($\beta_3$)   0.56 (0.38)         0.57 (0.23)
Similar estimates; wrong standard error for Std. Logistic
The effect of accounting for correlation
• Treatment effect estimates are the same for marginal
logistic and correlation accounted logistic
– But, SEs are 0.38 and 0.23 respectively
• Why is the second smaller than the first?
Answer
• The treatment effect is estimated by contrasting
(differencing) period 2 and period 1
• The positive, within-person correlation produces a
smaller variance of this difference than does
assuming independence
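The answer above rests on the usual formula for the variance of a difference. The sketch below shows it on a linear scale with made-up variances and correlations; it illustrates the mechanism only and does not reproduce the logistic standard errors reported on the slide. The function name is hypothetical.

```python
def var_of_difference(var1, var2, rho):
    """Var(Y2 - Y1) = var1 + var2 - 2 * rho * sd1 * sd2."""
    return var1 + var2 - 2.0 * rho * (var1 * var2) ** 0.5


# Assumed unit variances; only the correlation changes
for rho in (0.0, 0.3, 0.6):
    print(rho, var_of_difference(1.0, 1.0, rho))
```

With independence (rho = 0) the variance of the difference is the sum of the two variances; any positive within-person correlation reduces it, which is why the analysis that accounts for correlation reports the smaller standard error for a within-person contrast.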
Population-level vs Random Intercept
logistic regressions
log(OR) (se)
                   Marginal                                                            Conditional
Regressor          Ordinary Logistic Regression   Logistic (account for correlation)   RE Conditional Logistic Reg.
Intercept          0.66 (0.32)                    0.67 (0.29)                          2.2 (1.00)
Period             -0.27 (0.38)                   -0.30 (0.23)                         -1.0 (0.84)
Treatment          0.57 (0.38)                    0.57 (0.23)                          1.80 (0.93)
$\sigma$ = sd(u)   0.0                            3.56 (0.81)                          5.00 (2.30)
Marginal Logistic
versus
Random Intercept Logistic
Unconditional Logistic (Population-level inference):
The population AlcnonDep (after/before), treatment/placebo
prevalence odds ratio is exp(0.57) = 1.77
Conditional, RE Logistic (Individual-level inference):
An individual’s AlcnonDep (after/before), treatment/placebo
prevalence odds ratio is exp(1.80) = 6.05
Ratio: (Conditional)/(Marginal)
6.05/1.77 = 3.42 (= $e^{1.23}$; 1.23 = 1.80 - 0.57)
Different questions; different (but compatible) answers
Consequence of Conditional/Marginal
Slope Differences
• A population-level analysis that does not build on
a multi-level model (that does not include the
random effect) can understate the individual-level
(cluster level) risk or benefit
– Understate environmental risk
– Understate benefits of lowering blood pressure
– .........
Relation between marginal
and conditional ORs
logit(pr(Y = 1 | X, u)) = u + log(3)X
u = ±log(3), each with probability 1/2
               u = -log(3)   u = log(3)   Marginal (Population)
X = 0          P = 0.25      P = 0.75     0.50
X = 1          P = 0.50      P = 0.90     0.70
OR(X=1 vs 0)   3.00          3.00         2.33 (=7/3)
3.00 = (.5/.5)/(.25/.75) = (.9/.1)/(.75/.25)
u as a missing covariate
• Without knowing u, a marginal logistic regression
predicts 0.50 and 0.70 for X=0 and X=1 respectively
– The log(OR) slope on X is 0.847 = log(2.333)
• If we know u, a logistic regression with it as a
covariate (conditional on it) predicts as in the table
– The log(OR) slope on X is 1.099 = log(3.00)
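The two slopes quoted above follow directly from the two-point model logit(pr(Y = 1 | X, u)) = u + log(3)X with u = ±log(3), each with probability 1/2. The sketch below recomputes them; the function and variable names are illustrative assumptions.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    return p / (1.0 - p)


U_VALUES = (-math.log(3.0), math.log(3.0))  # each with probability 1/2


def conditional_prob(x, u):
    """pr(Y = 1 | X = x, u) under the two-point model."""
    return expit(u + math.log(3.0) * x)


def marginal_prob(x):
    """pr(Y = 1 | X = x), averaging over the two values of u."""
    return sum(0.5 * conditional_prob(x, u) for u in U_VALUES)


conditional_or = odds(conditional_prob(1, U_VALUES[0])) / odds(conditional_prob(0, U_VALUES[0]))
marginal_or = odds(marginal_prob(1)) / odds(marginal_prob(0))
print(math.log(conditional_or), math.log(marginal_or))
```

The conditional odds ratio is the same for either value of u, while the marginal odds ratio is smaller because averaging takes place on the probability scale rather than the log-odds scale.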
Conditional Logistic and Marginal Shapes
Log(odds | u) = u + X
[Figure: conditional curves for u > 0 and u < 0, plotted against X]
The RE induces association
(Y1, Y2) are in the same cluster
The RE model produces the
following 2×2 table for X = 0
              Y2 = 0   Y2 = 1   Y1 marginal
Y1 = 0        5/16     3/16     0.50
Y1 = 1        3/16     5/16     0.50
Y2 marginal   0.50     0.50     1.00
5/16 = [(3/4)(3/4) + (1/4)(1/4)]/2
pr(Y2 =1 | Y1 = 0) = 3/8 = 3/(3+5)
pr(Y2 =1 | Y1 = 1) = 5/8 = 5/(3+5)
The RE induces association
(Y1, Y2) are in the same cluster
The RE model produces the
following 2×2 table for X = 1
              Y2 = 0   Y2 = 1   Y1 marginal
Y1 = 0        13/100   17/100   0.30
Y1 = 1        17/100   53/100   0.70
Y2 marginal   0.30     0.70     1.00
13/100 = [(1/2)(1/2) + (1/10)(1/10)]/2
pr(Y2 =1 | Y1 = 0) = 17/30 = 17/(17+13)
pr(Y2 =1 | Y1 = 1) = 53/70 = 53/(17+53)
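The 2×2 tables for X = 0 and X = 1 can be rebuilt by averaging, over the two values of u, the product of two conditionally independent Bernoulli probabilities. The sketch below uses exact fractions; the dictionary layout and function names are illustrative assumptions, while the success probabilities are those of the two-point example.

```python
from fractions import Fraction

# pr(Y = 1 | X, u); keys are (x, sign of u), with u = -log(3) or +log(3)
P_SUCCESS = {
    (0, -1): Fraction(1, 4), (0, +1): Fraction(3, 4),
    (1, -1): Fraction(1, 2), (1, +1): Fraction(9, 10),
}


def joint_table(x):
    """Joint distribution of (Y1, Y2) in one cluster for a given X."""
    table = {}
    for y1 in (0, 1):
        for y2 in (0, 1):
            total = Fraction(0)
            for sign in (-1, +1):
                p = P_SUCCESS[(x, sign)]
                p1 = p if y1 == 1 else 1 - p
                p2 = p if y2 == 1 else 1 - p
                total += Fraction(1, 2) * p1 * p2  # Y1, Y2 independent given u
            table[(y1, y2)] = total
    return table


def prob_y2_given_y1(x, y1):
    """pr(Y2 = 1 | Y1 = y1) read off the joint table."""
    table = joint_table(x)
    return table[(y1, 1)] / (table[(y1, 0)] + table[(y1, 1)])


for x in (0, 1):
    print(x, joint_table(x), prob_y2_given_y1(x, 0), prob_y2_given_y1(x, 1))
```

Although Y1 and Y2 are independent given u, the shared random effect makes them dependent once u is averaged out, which is what the unequal conditional probabilities show.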
Updating the distribution of u
For X = 1 (you can try it for X = 0)
pr(u = +log(3) | Y = 0) = pr(u = +log(3), Y = 0)/pr(Y = 0)
= (1/2)(1/10)/(3/10) = 1/6 < 0.5
pr(u = +log(3) | Y = 1) = pr(u = +log(3), Y = 1)/pr(Y = 1)
= (1/2)(9/10)/(7/10) = 9/14 > 0.5
pr(u = +log(3)) = (1/6)(3/10) + (9/14)(7/10) = 0.5
Can use these to get [Y2 | Y1]
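The update of the distribution of u is an application of Bayes' rule, and the updated distribution then gives [Y2 | Y1]. The sketch below works with exact fractions for the two-point example; the function names and argument layout are illustrative assumptions.

```python
from fractions import Fraction


def posterior_u_plus(y, p_plus, p_minus, prior_plus=Fraction(1, 2)):
    """pr(u = +log(3) | Y = y), given pr(Y = 1 | u) for each value of u."""
    like_plus = p_plus if y == 1 else 1 - p_plus
    like_minus = p_minus if y == 1 else 1 - p_minus
    numerator = prior_plus * like_plus
    return numerator / (numerator + (1 - prior_plus) * like_minus)


def predictive_y2(y1, p_plus, p_minus):
    """pr(Y2 = 1 | Y1 = y1), averaging over the updated distribution of u."""
    w = posterior_u_plus(y1, p_plus, p_minus)
    return w * p_plus + (1 - w) * p_minus


# X = 1: pr(Y = 1 | u = +log(3)) = 9/10 and pr(Y = 1 | u = -log(3)) = 1/2
p_plus, p_minus = Fraction(9, 10), Fraction(1, 2)
print(posterior_u_plus(0, p_plus, p_minus), posterior_u_plus(1, p_plus, p_minus))
print(predictive_y2(0, p_plus, p_minus), predictive_y2(1, p_plus, p_minus))
```

For X = 0 the same functions can be called with p_plus = 3/4 and p_minus = 1/4.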
Marginal Multi-level, non-linear Models
GEE: Marginal mean as a function of covariates
• Working independence or other working model
• Followed by Robust SE
– “Cluster(id)” in Stata
– “Robust” Option in SAS Proc Mixed or GenMod
– No “robustness” in BUGS
Conditional mean, as a function of marginal mean
and cluster-specific random effects
– Heagerty (1999, Biometrics)
– Heagerty and Zeger (2000, Statistical Science)
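Working independence followed by a robust SE can be illustrated in the simplest possible case: an intercept-only model for the mean of a binary response. This is a deliberately reduced sketch, not the general GEE algorithm and not the output of any of the packages named above. The data are made up, and the function name is hypothetical. The robust (sandwich) variance sums residuals within each cluster before squaring.

```python
import math


def mean_with_naive_and_robust_se(clusters):
    """clusters: list of lists of 0/1 responses, one inner list per cluster."""
    n_obs = sum(len(c) for c in clusters)
    p_hat = sum(sum(c) for c in clusters) / n_obs  # working-independence estimate
    naive_var = p_hat * (1.0 - p_hat) / n_obs      # treats all responses as independent
    # Sandwich "meat": squared within-cluster residual totals
    meat = sum(sum(y - p_hat for y in c) ** 2 for c in clusters)
    robust_var = meat / n_obs ** 2
    return p_hat, math.sqrt(naive_var), math.sqrt(robust_var)


# Made-up data: six clusters of two perfectly concordant responses
clusters = [[1, 1], [0, 0], [1, 1], [0, 0], [1, 1], [0, 0]]
print(mean_with_naive_and_robust_se(clusters))
```

Here the estimated quantity is a between-cluster average, so positive within-cluster correlation makes the robust SE larger than the naive one. For a within-cluster contrast the direction reverses.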
Generalized Linear Models (GLMs)
$g(\text{mean}) = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p$
(always a marginal model)
Model       Response Y                             $g(\mu)$, with $\mu = E(Y)$                       Distribution                   Coefficient interpretation (per unit change in X)
Linear      Continuous/Bell-shaped                 $\mu$                                             Gaussian, near-Gaussian        Change in E(Y)
Logistic    Binary                                 $\log(\mu/(1-\mu)) = \operatorname{logit}(\mu)$   Bernoulli, Binomial            Change in log odds
Loglinear   Counts, Concentration, Time to event   $\log(\mu)$                                       Poisson, Log-normal, Weibull   Change in log rate
Baseline/Follow-up Model
i = period, j = person; logit(P) = log(P/[1-P])
Population level (no individual effects)
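$\operatorname{logit}(P_{ij}) = \alpha + \beta_1 \mathrm{PD}_{ij} + \beta_2 \mathrm{TR}_{ij} + \beta_3 \mathrm{PD}_{ij} \mathrm{TR}_{ij}$
$= \alpha + \beta_1 \mathrm{PD}_i + \beta_2 \mathrm{TR}_j + \beta_3 \mathrm{PD}_i \mathrm{TR}_j$
$\operatorname{logit}(P_{2j}) - \operatorname{logit}(P_{1j}) = \beta_1 + \beta_3 \mathrm{TR}_j$
($\beta_3$ is the treatment effect)
Person-level (individual intercept)
$\operatorname{logit}(P_{ij}) = u_j + \alpha^* + \beta_1^* \mathrm{PD}_i + \beta_2^* \mathrm{TR}_j + \beta_3^* \mathrm{PD}_i \mathrm{TR}_j$
$u_j \sim (0, \sigma^2)$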
Marginal Generalized Linear Models
via Generalized Estimating Equations (GEE)
• Ordinary GLM (linear, logistic, Poisson,..)
– Population-average parameters
– Logit: $O_{ij} = \operatorname{logit}(p_{ij}) = \beta_0 + \beta_1 X_{ij}$
• Then, model association among observations
i and i’ in cluster j:
corr(log(Oij/ Oi’j)) = function(G)
• Solve generalized estimating equation (GEE)
– (Diggle, Heagerty, Liang and Zeger, 2002)
– Gives highly efficient and valid inferences
on population-average parameters
Marginal Models for the Cross-Over Study
[Table or figure of log(OR) estimates not recoverable]
Estimation method has an effect
Conditional (RE) Models
for the Cross-Over Study
[Table or figure of log(OR) estimates not recoverable]
Accounting for Clustering
via Sample Reuse
Standard GEE: “Robust” option in SAS
Jackknife
• Compute $\hat\beta$
• Delete a person (in general, a “unit”)
• Compute $\hat\beta_{(-i)}$, i = 1, ..., n
• Compute $\beta_i^* = n\hat\beta - (n-1)\hat\beta_{(-i)}$
• Compute the sample (co)variance of the $\beta_i^*$
Bootstrap
• Put each person’s data on a token
• Sample “n” tokens with replacement and compute
estimates from the sample
• Do this “Nboot” times and compute sample
(co)variance of the estimates
• Can get more sophisticated CIs, via BCa
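The jackknife and bootstrap recipes above can be sketched for a scalar estimate, with each person's data kept together as one unit. This is a minimal illustration under stated assumptions: the estimator (an overall mean), the data, and all function names are hypothetical, and the jackknife variance of the estimate is taken as the sample variance of the pseudo-values divided by n, which is the standard final step.

```python
import random
import statistics


def jackknife_variance(units, estimator):
    """Delete-one-unit jackknife variance of a scalar estimator."""
    n = len(units)
    full = estimator(units)
    pseudo = [
        n * full - (n - 1) * estimator(units[:i] + units[i + 1:])
        for i in range(n)
    ]
    return statistics.variance(pseudo) / n


def cluster_bootstrap_variance(units, estimator, n_boot=500, seed=0):
    """Resample whole units with replacement and take the variance of the estimates."""
    rng = random.Random(seed)
    n = len(units)
    estimates = [
        estimator([rng.choice(units) for _ in range(n)])
        for _ in range(n_boot)
    ]
    return statistics.variance(estimates)


def overall_mean(units):
    """Example 'black box': mean of all responses across units."""
    values = [y for unit in units for y in unit]
    return sum(values) / len(values)


# Made-up data: six people, two binary responses each
people = [[1, 1], [0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]
print(jackknife_variance(people, overall_mean))
print(cluster_bootstrap_variance(people, overall_mean))
```

Any estimator that accepts a list of units can be passed in place of overall_mean, since both procedures treat the estimation step as a black box.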
FRAMEWORK FOR SAMPLE REUSE
Data → “Black Box” Procedure → Estimate
British Social Attitudes Survey:
Conditional and Marginal MLMs
Note: Subscript order reversed from our usual
Response
• Yijk = 1 if favor abortion; 0 if not
– district i = 1,…,264
– person j = 1,…,1056
– year k = 1, 2, 3, 4
Levels
1. Time within person
2. Persons within districts
3. Districts
Covariates at the three levels
Level 1: time
• Indicators of time
Level 2: person
• Class: upper working; lower working
• Gender
• Religion: protestant, catholic, other
Level 3: district
• Percentage protestant (derived)
Scientific Questions
Conditional Model
• How does a woman’s religion associate with her
probability of favoring abortion?
• How does the predominant religion in a district
associate with a woman’s probability of favoring
abortion?
Marginal Model
• How does the rate of favoring abortion differ between
Protestants and, otherwise similar, Catholics?
• How does the rate of favoring abortion differ between
districts that are predominantly Protestant versus
Catholic?
Schematic of Marginal Random-effects Model
Conditional Multi-level Model
Modeling the Population Expectation
We build a “regression model” for $\sigma^2$
Person and district random effects
Conditional Multi-level Model Results
All of this is a “regression model” for $\sigma^2$
Conditional model results
• How does a woman’s religion associate with her
probability of favoring abortion?
• How does the predominant religion in a district
associate with a woman’s probability of favoring
abortion?
Marginal Multi-level Model
If the conditional is logistic, can the marginal be logistic?
We simultaneously model the underlying random effects
structure, but we are still fitting the marginal model
Person and district random effects
Marginal Multi-level Model Results
All of this is a “regression model” for $\sigma^2$
Marginal model results
How does the rate of favoring abortion differ between
Protestants and otherwise similar Catholics?
How does the predominant religion in a district
influence the probability of favoring abortion?
Refresher: Forests & Trees
Multi-Level Models:
• Explanatory variables from multiple levels
– Family
– Neighborhood
– State
• Interactions
Must take account of correlation among responses from
same clusters:
• Marginal: GEE, MMM
• Conditional: RE, GLMM
Key Points
“Multi-level” Models:
• Have covariates from many levels and their interactions
• Acknowledge correlation among observations from within a
level (cluster)
Conditional and Marginal Multi-level models have
different targets; ask different questions
• When population-averaged parameters are the focus, use
– GEE
– Marginal Multi-level Models
(Heagerty and Zeger, 2000)
Key Points (continued)
• When cluster-specific parameters are the focus, use
random effects models that condition on unobserved
latent variables that are assumed to be the source of
correlation
• Warning: Model Carefully. Cluster-specific targets often
involve extrapolations where there are no actual data for
support
– e.g. % protestant in neighborhood given a
random neighborhood effect
Recap
Population-averaged parameters
• GEE
• Marginal multi-level models
Cluster-specific parameters and latent effects
• Random Effects models
– built up from latent effects (variance components)
• Possibly, overlay “Time Series” Models
– to induce additional correlation
Warning
• Inferences on latent effects can be very model-dependent
Working Independence
versus modeling correlation
Longitudinal Example
Generate data in clusters (i.e., a person)
• 5 observations per cluster
Response is a linear function of time
$Y_{it} = \beta_0 + \beta_1 t + e_{it}$
The residuals are first-order autoregressive, AR(1)
$e_{it} = \rho\, e_{i(t-1)} + u_{it}$ (the u’s are independent)
$\operatorname{corr}(e_{i(t+s)}, e_{it}) = \rho^s$
Estimate the slope by
• OLS: assumes independent residuals
• Maximum likelihood: models the autocorrelation
Comparisons
Compare the following reported $\operatorname{Var}(\hat\beta_1)$
• That reported by OLS (it’s incorrect)
• That reported by a robustly estimated SE for the
OLS slope (It’s correct for the OLS slope)
• That reported by the MLE model
It’s correct if the MLE model is correct
You can use any working correlation model,
but need a robust SE to get valid inferences
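The gap between the variance OLS reports and the true variance of the OLS slope can be computed exactly for the longitudinal example (5 equally spaced observations per cluster, AR(1) residuals). The sketch below is a simplified calculation under stated assumptions: the residual variance is treated as known, the first-lag correlation is called rho, and the function names are hypothetical. It does not cover the MLE variance.

```python
def ar1_corr(n_times, rho):
    """AR(1) correlation matrix: entry (t, s) is rho ** |t - s|."""
    return [[rho ** abs(t - s) for s in range(n_times)] for t in range(n_times)]


def ols_slope_variances(n_times, rho, sigma2=1.0, n_clusters=1):
    """Return (independence-based variance, true variance) of the OLS slope."""
    times = list(range(1, n_times + 1))
    t_bar = sum(times) / n_times
    c = [t - t_bar for t in times]          # centered times
    sxx = sum(v * v for v in c)
    corr = ar1_corr(n_times, rho)
    quad = sum(c[t] * corr[t][s] * c[s]
               for t in range(n_times) for s in range(n_times))
    naive = sigma2 / (n_clusters * sxx)                 # assumes independent residuals
    true = sigma2 * quad / (n_clusters * sxx ** 2)      # accounts for AR(1) correlation
    return naive, true


for rho in (0.0, 0.25, 0.5, 0.75):
    print(rho, ols_slope_variances(5, rho))
```

The two variances coincide only at rho = 0. A robust SE targets the second quantity without requiring the correlation model to be specified correctly.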
Variance of OLS & MLE Estimates
of $\beta$ versus $\rho$, the first-lag Correlation
[Figure: MLE reported variance, OLS reported variance, and true variance of OLS, plotted against $\rho$]
Analytic Strategy
• Use a model that fits the observed data well
– Directly model the observed data or check fit by
aggregating a random effects model
– “Good” models (candidate models) will give
similar observed-data predictions
• Then, “speculate” on latent effects models by
finding several that fit the observed data
– See if these give similar messages and
produce similar individual-level predictions
– Yes → a sturdy finding; No → additional info
needed
Note: $\sigma > 0$ indicates that there is unexplained,
individual-level heterogeneity
MLMs
Models are multi-level because they
• Include covariates from many levels
(and their interactions)
• Structure correlation among
observations within a cluster
Conditional and marginal models
• Have different goals
• Ask different questions
• Can/should get different answers
Benefits & Drawbacks
of working non-independence
Benefits
• Efficient estimates
• Valid standard errors and sampling distributions
• Protection from some missing data processes
• The MLM/RE approach allows estimating conditional-level
parameters, estimating latent effects and improving
estimates
Drawbacks
• Working non-independence imposes more strict validity
requirements on the fixed effects model (the Xs)
• Can get valid SEs via working independence with robust
standard errors
– At a sacrifice in efficiency
There is no free lunch!
• Working independence models
(coupled with robust SEs!!!)
are sturdy, but inefficient
• Fancy models are potentially
efficient, but can be fragile