Multilevel modelling:
general ideas and uses
30.5.2017
Kari Nissinen
Finnish Institute for
Educational Research
Hierarchical data

• The data in question are organized in a hierarchical (multilevel) manner
• Lower-level units (1-5) are arranged into higher-level units (A, B)
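[Diagram: higher-level unit A contains lower-level units 1 and 2; higher-level unit B contains lower-level units 3, 4 and 5]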
Hierarchical data

Examples
• Students within classes within schools
• Employees within workplaces
• Partners in couples
• Residents within neighbourhoods
• Nestlings within broods within populations…
• Repeated measures within individuals
Hierarchical data

• The key issue is clustering
  • Lower-level units within an upper-level unit tend to be more homogeneous than two arbitrary lower-level units
  • E.g. students within a class: intra-cluster correlation, ICC (positive)
  • Repeated measures: autocorrelation (usually positive)
Hierarchical data
• Clustering => lower-level units are not independent
• In cross-sectional studies this is a problem
  • Two correlated observations provide less information than two independent observations (partial ‘overlap’)
  • Effective sample size is smaller than the nominal sample size => statistical inference appears more powerful than it really is
Clustering in cross-sectional studies
• Basic statistical methods do not recognize the dependence of observations
  • Standard errors (variances) are underestimated => confidence intervals too narrow, statistical tests too significant
• Special methodology is needed for correct variances…
  • Design-based approaches (variance estimation in a cluster sampling framework)
  • Model-based approaches: multilevel models
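
The underestimation described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the data are simulated (not from the talk), clusters are of equal size, and the "cluster-aware" standard error is a simple estimate based on cluster means, standing in for the design-based approaches mentioned on the slide. All function names and parameter values are hypothetical.

```python
import random
import statistics

def simulate_clustered(n_clusters, cluster_size, sd_between, sd_within, seed=1):
    """Return a list of clusters; each observation is cluster effect + unit error."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_clusters):
        u = rng.gauss(0.0, sd_between)  # effect shared by all units in the cluster
        data.append([u + rng.gauss(0.0, sd_within) for _ in range(cluster_size)])
    return data

def naive_se(data):
    """SE of the overall mean, wrongly treating all observations as independent."""
    flat = [y for cluster in data for y in cluster]
    return statistics.stdev(flat) / len(flat) ** 0.5

def cluster_se(data):
    """SE of the overall mean computed from cluster means (equal cluster sizes)."""
    means = [statistics.fmean(cluster) for cluster in data]
    return statistics.stdev(means) / len(means) ** 0.5

# Hypothetical setting: 50 clusters of 20 units each
data = simulate_clustered(n_clusters=50, cluster_size=20, sd_between=1.0, sd_within=1.7)
print(f"naive SE:         {naive_se(data):.3f}")
print(f"cluster-aware SE: {cluster_se(data):.3f}")
```

With positive within-cluster correlation the naive standard error should come out clearly smaller than the cluster-aware one, which is the "too significant" problem in practice.
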
Clustering in cross-sectional studies
• Measure of ‘inference error’ due to clustering: design effect (DEFF)
  • DEFF = ratio of correct variance to underestimated variance (no clustering assumed)
  • A function of the ratio of nominal sample size to effective sample size and/or homogeneity within clusters (ICC)
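
As a rough illustration of how DEFF, ICC and effective sample size relate, the sketch below uses Kish's well-known approximation DEFF = 1 + (m - 1) × ICC for clusters of equal size m. The slides do not state this formula, so it is an assumption here, as are the function names and the example numbers.

```python
def design_effect(cluster_size, icc):
    """Kish's approximation for equal cluster size m: DEFF = 1 + (m - 1) * ICC."""
    return 1.0 + (cluster_size - 1) * icc

def effective_sample_size(n_nominal, cluster_size, icc):
    """Nominal sample size divided by the design effect."""
    return n_nominal / design_effect(cluster_size, icc)

# Hypothetical survey: 100 classes of 20 students each, ICC = 0.20
deff = design_effect(cluster_size=20, icc=0.20)
n_eff = effective_sample_size(n_nominal=2000, cluster_size=20, icc=0.20)
print(f"DEFF = {deff:.2f}, effective n = {n_eff:.0f}")
```

In this made-up case 2000 students carry roughly the information of about 417 independent observations. With ICC = 0 the design effect is 1 and nothing is lost.
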
Hierarchical data
• Hierarchy is a property of the population, which can carry over into the sample data
  • Cluster sampling: hierarchy is explicitly present in data collection => the data possess exactly the same hierarchy (and possible clustering)
  • Simple random sampling (etc.): clustering may or may not appear in the data
    • It is present but hidden; may be difficult to identify
    • Effect may be negligible
Hierarchical data
• Hierarchy does not always lead to clustering: units within a cluster can be uncorrelated
  • The other side of the coin is heterogeneity between upper-level units: if the upper-level units do not differ, the lower-level units within them are no more alike than any other units
  • Zero ICC => no need for special methodology
  • Clustering can affect some target variables but not others
Longitudinal data
• Clustering = measurements on an individual are not independent
  • When analyzing change this is a benefit
    • Each unit serves as its own ‘control unit’ (‘block design’) => ‘true’ change
    • Autocorrelation ‘carries’ this link from one time point to another
    • Appropriate methods utilize this correlation => powerful statistical inference
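
Why the correlation helps can be seen from the variance of a within-individual change. The following is a sketch under assumptions not stated on the slide: two time points, equal variance σ² at both, and correlation ρ between the two measurements on the same individual.

```latex
\operatorname{Var}(Y_{i2} - Y_{i1})
  = \sigma^2 + \sigma^2 - 2\rho\sigma^2
  = 2\sigma^2 (1 - \rho)
```

For two independent samples the corresponding variance would be 2σ², so with, say, ρ = 0.7 the change is estimated with only 30% of that variance.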
Mixed models
• An approach for handling hierarchical / clustered / correlated data
• Typically regression or ANOVA models containing effects of explanatory variables, which can be (i) fixed, (ii) random or (iii) both
  • Linear mixed models: error distribution normal (Gaussian)
  • Generalized linear mixed models: error distribution binomial, Poisson, gamma, etc.
Mixed models
• Variance component models
• Random coefficient regression models
• Multilevel models
• Hierarchical (generalized) linear models
  • All these are special cases of mixed models
  • Similar estimation procedures (maximum likelihood & its variants), etc.
Fixed vs random effects
• 1-way ANOVA fixed effects model
  Y(ij) = μ + α(i) + e(ij)
  • μ = fixed intercept, grand mean
  • α(i) = fixed effect of group i
  • e(ij) = random error (‘random effect’) of unit ij
    • random, because it is drawn from a population
    • it has a probability distribution (often N(0,σ²))
Fixed vs random effects
• Fixed effects determine the means of observations
  E(Y(ij)) = μ + α(i), since E(e(ij)) = 0
• Random effects determine the variances (& covariances/correlations) of observations
  Var(Y(ij)) = Var(e(ij)) = σ²
Fixed vs random effects
• 1-way ANOVA random effects model
  Y(ij) = μ + u(i) + e(ij)
  • μ = fixed intercept, grand mean
  • u(i) = random effect of group i
    • random when the group is drawn from a population of groups
    • has a probability distribution N(0,σ(u)²)
  • e(ij) = random error (‘random effect’) of unit ij
Fixed vs random effects
• Now the mean of observations is just
  E(Y(ij)) = μ
• Variance is (with u(i) and e(ij) independent)
  Var(Y(ij)) = Var(u(i) + e(ij)) = σ(u)² + σ²
  • Sum of two variance components => variance component model
Random effects and clustering
• Random group => units ij and ik (j ≠ k) within group i are correlated:
  Cov(Y(ij),Y(ik)) = Cov(u(i) + e(ij), u(i) + e(ik)) = Cov(u(i), u(i)) = σ(u)²
• Positive intra-cluster correlation
  ICC = Cov(Y(ij),Y(ik)) / Var(Y(ij)) = σ(u)² / (σ(u)² + σ²)
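
The ICC formula can be checked numerically. The sketch below simulates the random effects model Y(ij) = μ + u(i) + e(ij) and estimates the ICC with a simple one-way ANOVA (method-of-moments) estimator for equal group sizes. This estimator is a stand-in chosen for brevity, not the maximum likelihood procedure a mixed-model routine would use, and all names and parameter values are hypothetical.

```python
import random
import statistics

def simulate_groups(n_groups, group_size, mu, sd_u, sd_e, seed=2017):
    """Draw Y(ij) = mu + u(i) + e(ij) with u ~ N(0, sd_u^2) and e ~ N(0, sd_e^2)."""
    rng = random.Random(seed)
    groups = []
    for _ in range(n_groups):
        u = rng.gauss(0.0, sd_u)  # random group effect u(i)
        groups.append([mu + u + rng.gauss(0.0, sd_e) for _ in range(group_size)])
    return groups

def anova_icc(groups):
    """Method-of-moments ICC estimate from one-way ANOVA mean squares."""
    k = len(groups)        # number of groups
    m = len(groups[0])     # units per group (assumed equal)
    group_means = [statistics.fmean(g) for g in groups]
    grand_mean = statistics.fmean(group_means)
    # Between-group and within-group mean squares
    msb = m * sum((gm - grand_mean) ** 2 for gm in group_means) / (k - 1)
    msw = sum((y - gm) ** 2
              for g, gm in zip(groups, group_means) for y in g) / (k * (m - 1))
    var_u = max((msb - msw) / m, 0.0)  # truncate negative estimates at zero
    return var_u / (var_u + msw)

sd_u, sd_e = 1.0, 3.0 ** 0.5           # sigma(u)^2 = 1, sigma^2 = 3
true_icc = sd_u ** 2 / (sd_u ** 2 + sd_e ** 2)
groups = simulate_groups(n_groups=200, group_size=20, mu=50.0, sd_u=sd_u, sd_e=sd_e)
estimate = anova_icc(groups)
print(f"true ICC = {true_icc:.3f}, estimated ICC = {estimate:.3f}")
```

The estimate should land near the true value of 0.25, up to sampling error.
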
Mixed model
• Contains both fixed and random effects, e.g.
  Y(ij) = μ + βX(ij) + u(i) + e(ij)
  • i = school, j = student
  • μ = fixed intercept
  • β = fixed regression coefficient
  • u(i) = random school effect (‘school intercept’)
  • e(ij) = random error of student j in school i
Mixed model
Y(ij) = μ + βX(ij) + u(i) + e(ij)
• The mean of Y is modelled as a function of explanatory variable X through the fixed parameters μ and β
• The variance of Y and within-cluster covariance (ICC) are modelled through the random effects u (‘level 2’) and e (‘level 1’)
• This is the general idea; it can be extended in many ways
[Figure: Regression lines in variance component model: high ICC]
[Figure: Regression lines in variance component model: low ICC]
An extension: random coefficient regression
Y(ij) = μ + βX(ij) + u(i) + v(i)X(ij) + e(ij)
• v(i) = random school slope
• Regression coefficient of X varies between schools: β + v(i)
• A ‘side effect’: the variance of Y varies along with X
  • one possible way to model unequal variances (as a function of X)
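
The ‘side effect’ can be written out explicitly. The sketch below assumes that u(i) and v(i) have variances σ_u² and σ_v² and covariance σ_uv, and that both are independent of e(ij). The symbols σ_v² and σ_uv do not appear on the slides and are introduced here only for illustration.

```latex
\operatorname{Var}(Y_{ij} \mid X_{ij})
  = \operatorname{Var}\bigl(u_i + v_i X_{ij} + e_{ij}\bigr)
  = \sigma_u^2 + 2\sigma_{uv} X_{ij} + \sigma_v^2 X_{ij}^2 + \sigma^2
```

The variance is thus a quadratic function of X, and it reduces to the constant σ_u² + σ² of the variance component model when there is no random slope.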
[Figure: Random coefficient regression]
Regression for repeated measures data
Y(it) = μ(t) + βX(it) + e(it)
• t = time, μ(t) = intercept at time t
• i = individual
• The errors e(it) of individual i are correlated: different (auto)correlation structures (e.g. AR(1)) can be fitted, as well as different variance structures (unequal variances)
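
To make the AR(1) structure concrete: under AR(1) the correlation between the errors of the same individual at time points t and s is ρ^|t - s|, so it decays with the time lag. The sketch below builds that correlation matrix for equally spaced time points; the function name and the value of ρ are hypothetical.

```python
def ar1_correlation(n_times, rho):
    """AR(1) correlation matrix: entry (t, s) equals rho ** |t - s|."""
    return [[rho ** abs(t - s) for s in range(n_times)] for t in range(n_times)]

# Hypothetical example: 4 equally spaced measurement occasions, rho = 0.5
for row in ar1_correlation(n_times=4, rho=0.5):
    print("  ".join(f"{r:.3f}" for r in row))
```

A single parameter ρ describes the whole matrix, which is what makes AR(1) a parsimonious choice compared with estimating every pairwise correlation separately.
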
Thanks!