Multilevel models for
combining macro and micro data
Unit 5
Mark Tranmer
Cathie Marsh Centre for Census and
Survey Research
Introduction
• We will see how the multilevel model provides a
framework for combining individual level survey
data with aggregate group level data.
• We illustrate this with an example where
individual level data from the European Social
Survey are combined with country level data
from the Eurostat New Cronos data.
• The dependent variable in our example is voter
turnout: whether or not the respondent voted in the
most recent election in their country of residence.
Learning objectives (1)
• To introduce the idea of multilevel modelling
• To explain why multilevel modelling is useful
when linking macro and micro data.
• To present the kinds of substantive research
questions that can be answered using this
approach
• To outline software that permits multilevel
models to be fitted.
Learning objectives (2)
• To give an example of linking micro and macro
data in the multilevel model framework by
combining the ESS micro data with country-level
macro data from Eurostat New Cronos
• To briefly outline the various multilevel models in
this context
• To explain how interactions between aggregate
(macro) and individual level (micro) measures
work in these models and why they might
answer important substantive research
questions.
Levels of analysis and inference
• Traditional regression models are used to carry out an
analysis at a single level.
• Such as the individual (person level) with individual level
data
• Or at the group (country level) with aggregate data.
• If we do an individual level analysis we can make
individual level inferences, but without group level
information such inferences may be made out of the
context in which the processes occur.
• This is sometimes referred to as the atomistic fallacy.
• Ideally we want to do the analysis in context
Levels of analysis and inference
• We could also do a group (country level) analysis. For
example relating the % voting in each European country
with the unemployment rate in that country.
• This would tell us whether countries with higher
unemployment tended to have higher (or lower) levels of
voter turnout.
• But it wouldn’t tell us whether unemployed people were
more (or less) likely to vote than employed people.
• To make such an inference about individuals from a
group level analysis would be an example of the
ecological fallacy.
• In general the results of analyses carried out at the
group level do not apply at the individual level.
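The contrast between the two levels can be illustrated with a small sketch. The counts below are invented for illustration only (they are not ESS or Eurostat figures), and the country labels and variable names are assumptions. Across the three toy countries, higher unemployment goes with higher turnout, yet unemployed people are less likely to vote than employed people.

```python
# Toy counts, invented for illustration only (not ESS or Eurostat figures).
# Each country: (employed voters, employed non-voters,
#                unemployed voters, unemployed non-voters)
countries = {
    "A": (57, 38, 2, 3),
    "B": (63, 27, 5, 5),
    "C": (64, 16, 12, 8),
}


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Group (country) level analysis: one unemployment rate and one
# turnout rate per country.
unemployment_rate = []
turnout_rate = []
for ev, en, uv, un in countries.values():
    total = ev + en + uv + un
    unemployment_rate.append((uv + un) / total)
    turnout_rate.append((ev + uv) / total)

country_level_corr = pearson(unemployment_rate, turnout_rate)

# Individual level comparison: pool all people and compare turnout
# among the employed and the unemployed.
emp_votes = sum(c[0] for c in countries.values())
emp_total = sum(c[0] + c[1] for c in countries.values())
unemp_votes = sum(c[2] for c in countries.values())
unemp_total = sum(c[2] + c[3] for c in countries.values())

employed_turnout = emp_votes / emp_total
unemployed_turnout = unemp_votes / unemp_total

print("country-level correlation:", round(country_level_corr, 3))
print("employed turnout:", round(employed_turnout, 3))
print("unemployed turnout:", round(unemployed_turnout, 3))
```

In this toy example the country-level association is positive while the individual-level difference points the other way, which is why the group-level result should not be read as a statement about individuals.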
Multilevel models
• Multilevel models allow us to consider the
individual level and the group level in the same
analysis, rather than having to choose one or
the other.
• For example we can consider the individual and
the country level in the same analysis
• An alternative is to include dummy variables for
each of the groups (i.e. the countries in the
analysis): a so-called fixed effects approach.
• However multilevel models have several
advantages over this approach:
Multilevel models
1. They provide an ideal framework for combining
data from several sources, such as individual
level survey data (micro data) and country level
aggregate data (macro data).
2. They allow sophisticated hypotheses to be
tested without the need to add a lot of extra
variables and interactions to the model. For example, it is
relatively straightforward to consider a research
question such as: is the association of age
with voter turnout stronger in some countries
than others?
Multilevel modelling framework
• The current example involves individual level
micro data from the European Social Survey
• And country level aggregate macro data from
the Eurostat New Cronos.
• There are basically three ways of fitting
multilevel models for voter turnout with these
data:
Multilevel modelling framework
1. Models that involve the micro data only
2. Models that combine micro data and macro
data and assess the additional impact of the
variables from the macro dataset to explain
variations in voter turnout
3. Models that interact variables on the micro
data and macro data, such as whether or not
someone is unemployed (micro data) with the
% long term unemployed in the country (macro
data).
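Before models of types 2 and 3 can be fitted, each individual record needs the macro value for its country attached. A minimal sketch of that step follows, assuming hypothetical variable names (`unemployed`, `lt_unemp_pct`), hypothetical country codes and made-up values; the actual ESS and New Cronos variable names are not given in these slides.

```python
# Hypothetical micro data: one record per survey respondent.
respondents = [
    {"id": 1, "country": "AA", "unemployed": 1, "voted": 0},
    {"id": 2, "country": "AA", "unemployed": 0, "voted": 1},
    {"id": 3, "country": "BB", "unemployed": 1, "voted": 1},
    {"id": 4, "country": "BB", "unemployed": 0, "voted": 1},
]

# Hypothetical macro data: one value per country
# (% long term unemployed; values made up for illustration).
long_term_unemp_pct = {"AA": 2.0, "BB": 6.0}


def attach_macro(micro_rows, macro_by_country, macro_name):
    """Return copies of the micro rows with the country value attached."""
    merged_rows = []
    for row in micro_rows:
        new_row = dict(row)
        new_row[macro_name] = macro_by_country[row["country"]]
        merged_rows.append(new_row)
    return merged_rows


# Type 2 models: micro variables plus the macro variable.
merged = attach_macro(respondents, long_term_unemp_pct, "lt_unemp_pct")

# Type 3 models: add the product of the micro and macro variables.
for row in merged:
    row["unemployed_x_lt_unemp"] = row["unemployed"] * row["lt_unemp_pct"]

for row in merged:
    print(row)
```

Every respondent in the same country receives the same macro value, which is what makes the macro variable a country level predictor in the model.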
Multilevel modelling software
• We will use software called MLwiN.
• Although SPSS can be used for multilevel
modelling to some extent, MLwiN is more
flexible and has better graphics.
• More details of MLwiN at
www.cmm.bristol.ac.uk
• MLwiN is being made free to academics
Part 1: multilevel models
and ESS micro data
Modelling approaches: theory
Model 1: Single level model – e.g. predicting
chance of voting with age
$p_i = \Pr(y_i = 1 \mid x_i)$
$\operatorname{logit}(p_i) = \beta_0 + \beta_1 x_i$
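To read Model 1 on the probability scale, the logit has to be inverted. The sketch below shows that conversion; the coefficient values are hypothetical and chosen only for illustration, and `x` is assumed to be age centred on its mean.

```python
import math


def inv_logit(eta):
    """Convert a value on the logit (log-odds) scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def predicted_probability(beta0, beta1, x):
    """Model 1: probability of voting for an individual with covariate x."""
    return inv_logit(beta0 + beta1 * x)


# Hypothetical coefficients, not estimates from the ESS data.
beta0, beta1 = 0.5, 0.02

for age_centred in (-20, 0, 20):
    p = predicted_probability(beta0, beta1, age_centred)
    print(age_centred, round(p, 3))
```

With a positive slope the predicted probability of voting rises with age, but not linearly: equal steps on the logit scale give unequal steps in probability.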
Modelling approaches: theory
Model 1: Single level model
Modelling approaches: theory
Model 2: null model (multilevel) – getting a sense of
where the variation in voter turnout is: between
people or between countries
$\operatorname{logit}(p_{ij}) = \beta_0 + u_{0j}$
$\operatorname{Var}(u_{0j}) = \sigma^2_{u0}$
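One common way to summarise where the variation lies in the null model is a variance partition coefficient. The sketch below uses the latent variable formulation, in which the individual level variance on the logit scale is taken to be π²/3. This formulation is an assumption brought in here for illustration; the slides do not say which summary is used.

```python
import math

# Variance of the standard logistic distribution, used as the level 1
# (individual) variance under the latent variable formulation.
LEVEL1_LATENT_VARIANCE = math.pi ** 2 / 3


def vpc_latent(sigma2_u0):
    """Share of latent-scale variation that lies between countries."""
    if sigma2_u0 < 0:
        raise ValueError("a variance cannot be negative")
    return sigma2_u0 / (sigma2_u0 + LEVEL1_LATENT_VARIANCE)


# Hypothetical country level variance, not an estimate from the ESS data.
print(round(vpc_latent(0.3), 3))
```

A value near 0 would suggest that almost all variation in turnout is between people; a larger value would point to substantial between-country differences.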
Modelling approaches: theory
Model 3: Multilevel Model with varying intercepts.
Relating age to voting and allowing overall turnout to be
higher/lower in each European country.
$\operatorname{logit}(p_{ij}) = \beta_0 + \beta_1 x_{ij} + u_{0j}$
$\operatorname{Var}(u_{0j} \mid x_{ij}) = \sigma^2_{u0 \mid x}$
Modelling approaches: theory
Model 3: Multilevel Model with varying intercepts
Modelling approaches: theory
Model 4: Multilevel Model with varying intercepts
and slopes – relationship of age with voting can be
stronger/weaker in each country
$\operatorname{logit}(p_{ij}) = \beta_0 + \beta_{1j} x_{ij} + u_{0j}$
$\beta_{1j} = \beta_1 + u_{1j}$
$\operatorname{logit}(p_{ij}) = \beta_0 + \beta_1 x_{ij} + u_{1j} x_{ij} + u_{0j}$
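In Model 4 each country has its own intercept and its own slope. A minimal sketch of how a country-specific prediction is built from the average coefficients and the country residuals is given below; all numerical values are hypothetical, not estimates.

```python
import math


def country_probability(beta0, beta1, u0j, u1j, x):
    """Model 4: probability of voting in country j at covariate value x.

    The country intercept is beta0 + u0j and the country slope is
    beta1 + u1j, so the linear predictor is
    beta0 + beta1 * x + u1j * x + u0j.
    """
    intercept_j = beta0 + u0j
    slope_j = beta1 + u1j
    eta = intercept_j + slope_j * x
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical average coefficients and country residuals (u0j, u1j).
beta0, beta1 = 0.5, 0.02
residuals = {"country 1": (0.4, 0.03), "country 2": (-0.4, -0.03)}

for name, (u0j, u1j) in residuals.items():
    p = country_probability(beta0, beta1, u0j, u1j, x=20)
    print(name, round(p, 3))
```

A country with a positive slope residual has a stronger association of age with voting than the average country; a negative residual gives a weaker one.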
Model 4: graphical
representations
Using MLwiN to read in the data and set
up the binomial model
• We will set up a binomial model in MLwiN
and estimate some multilevel models
(models 2-4) using the ESS micro data
only
• We will use an MLwiN worksheet called
Lmmd6.ws
Using MLwiN to read in the data and set
up the binomial model
• Open MLwiN by locating it in the programmes
listed in the windows start menu or by clicking on
the MLwiN icon on your desktop.
• The default worksheet size for this exercise is
5000 cells which is too small to permit the
analysis. However, it is easy to increase the
worksheet size.
• To do this go to options and make the worksheet
10000 cells (change from 5000). NB: Do not
save worksheet when prompted.
• Now choose data manipulation > names
Setting up the model in MLwiN
Null model (model 2) is now set
up
Estimation type
Model 2 results
Model 3: results – add cent_age to model by
clicking on ‘add term’
Model 4: set up
Model 4: results
Part 2: combining macro and
micro data in multilevel models
Model 5: combining survey and aggregate data.
Combining data in multilevel
models: model 5 – Main effects
$p_{ij} = \Pr(y_{ij} = 1 \mid x_{ij}, X_j)$
$\operatorname{logit}(p_{ij}) = \beta_0 + \beta_1 x_{ij} + \beta_2 X_j + u_{0j}$
$\operatorname{Var}(u_{0j} \mid x_{ij}, X_j) = \sigma^2_{u0 \mid x, X}$
Combining data in multilevel
models: model 6 – interactions
$p_{ij} = \Pr(y_{ij} = 1 \mid x_{ij}, X_j)$
$\operatorname{logit}(p_{ij}) = \beta_0 + \beta_1 x_{ij} + \beta_2 X_j + \beta_3 x_{ij} X_j + u_{0j}$
$\operatorname{Var}(u_{0j} \mid x_{ij}, X_j) = \sigma^2_{u0 \mid x, X}$
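One way to read the interaction term is to collect the terms in the individual level variable. The rearrangement below follows directly from the model 6 equation; the reading in terms of being unemployed assumes, as in the earlier example, that the micro variable is a 0/1 indicator for being unemployed and the macro variable is the % long term unemployed in the country.

```latex
\begin{align*}
\operatorname{logit}(p_{ij})
  &= \beta_0 + \beta_1 x_{ij} + \beta_2 X_j + \beta_3 x_{ij} X_j + u_{0j} \\
  &= (\beta_0 + \beta_2 X_j + u_{0j}) + (\beta_1 + \beta_3 X_j)\, x_{ij}
\end{align*}
% For a 0/1 micro variable, the difference in log-odds of voting between
% x_{ij} = 1 and x_{ij} = 0 in country j is therefore
\[
  \beta_1 + \beta_3 X_j .
\]
```

So the gap in the log-odds of voting between unemployed and employed people is not constant: it depends on the country level value, and the coefficient on the interaction term measures how quickly it changes.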
Model 5 main effects: results
Model 6 Interactions: results
Summary: what you have learnt
in this session
1. The multilevel model is an extremely useful
framework for combining macro and micro
data
2. Multilevel logistic regression models can be
used for an outcome with two categories such
as voter turnout
3. We can then fit a series of models to assess the
nature and extent of individual and country
level variations in voter turnout. We can use
software such as MLwiN to do this.
Summary: what you have learnt
in this session
4. We can then estimate multilevel models
with ESS micro data only
5. We can then combine micro and macro
data by adding variables from Eurostat
New Cronos to the model
6. Finally we can also interact individual
level ESS variables with country level
variables from the New Cronos data