Download Annibale Biggeri - DiSIA - Università degli Studi di Firenze

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Identifiability of Causal Relationships in
Non-Experimental Studies1
Identificabilità di relazioni causali in
studi non sperimentali
Fabrizia Mealli
Dipartimento di Statistica “G. Parenti” - Università degli Studi di Firenze
e-mail: [email protected]
Annibale Biggeri
Dipartimento di Statistica “G. Parenti” - Università degli Studi di Firenze
e-mail: [email protected]
Riassunto: L’identificazione di relazioni causali negli studi non sperimentali è uno
degli obiettivi più importanti dell’analisi epidemiologica. Nella letteratura statistica
esistono diversi approcci per l’identificazione e la stima di effetti causali. Tra questi, un
ruolo importante è rivestito dall’approccio che utilizza il concetto di risultati potenziali
(o Modello Causale di Rubin) e dalla teoria dei grafi causali. Nel presente lavoro si farà
inizialmente riferimento alla teoria dei grafi causali, mettendo in luce la sua relazione
con la definizione epidemiologica di confondimento. Verrà mostrata un’applicazione dei
modelli grafici a catena per l’analisi della relazione tra inquinamento atmosferico e
disturbi respiratori nell’infanzia. Verranno poi analizzate le complicazioni, in termini di
identificabilità degli effetti causali, dovute alla presenza di confondenti non osservabili;
sarà presentato, infine, uno studio longitudinale per l’analisi degli effetti di programmi
di formazione al lavoro sulle prospettive lavorative di giovani.
Keywords: causal models, confounding, graphical models, longitudinal studies, air
pollution.
1. Introduction
Some of the most challenging tasks for Epidemiology and Applied Statistics are related
to the identification of causal relationships in observational studies. Indeed, one could
argue that theoretical contributions on study design in Epidemiology lead to rules to
obtain samples on which causal relationships of interest can be estimated. To assure
validity of the inference appropriate control of confounding is, among others, one of the
main tasks of the researcher. This can be achieved by study design options or by using
statistical techniques in the analysis phase (Rothman and Greenland, 1998), and both
strategies have been and are currently employed.
In Epidemiology, statistical control of confounding is obtained via stratification or,
when many covariates have to be taken into account, via regression models; propensity
score methods would be another solution (Rosenbaum, Rubin, 1993). The first strategy
1
The paper is supported by COFIN 98-99 Biggeri, COFIN 98-99 Marchetti, and COFIN 99-00 Biggeri.
has many links with the so-called Simpson’s paradox while the second one relies
heavily on the correct model specification and therefore has been criticised since many
assumptions seem arbitrary in practice (Freedman, 1999).
There are different statistical approaches to deal with identification and estimation of
causal effects. A very important one, that has led to many operational techniques for
causal analysis, is usually referred to as counterfactual or potential outcomes framework
or Rubin’s Causal Model (Holland, 1986); causal effects involve comparisons of
outcomes that would have been observed under exposures of units to different
treatments1.
Another approach to causal modelling is based on the theory of causal graphs. Recent
development in this field, which has many links with structural equation modelling, has
provided a formal theory for evaluating causal effects and a new perspective for
confounder identification in Epidemiology (Greenland et al., 1999). The use of graphs
provides a tool for integrating statistical and subject-matter information: it facilitates to
explicate the assumptions underlying the causal model and to decide whether the
assumptions are sufficient for obtaining estimates of causal effects from observed
quantities.
The two formalisations have many links (see for example Greenland et al. 1999 and
Robins (1995) for a discussion on this). Rubin’s approach can ease the formulation and
the assessment of critical assumptions such as those, for example, involving
circumstances where features of units influence other units. We believe that they
provide complementary languages for exposing the fundamental assumptions required
to make causal estimates feasible.
In the paper we will first refer to the theory of causal graphs and its relation to the usual
epidemiological definition of confounding. We will show an application of graphical
chain models to a study of respiratory symptoms in childhood and air pollution. We will
then focus on identification problems (and related solutions) raised by the presence of
unmeasured confounding and show a longitudinal study, that may suffer from such
problems, of the effects of Youth Training programmes on labour market prospects.
2. Confounding
There is an intrinsic conflict in the formal definition of confounding based on
association criteria (Biggeri, 1999). This conflict is solved in practice by gathering
information on the subject specific matter. Indeed the definition of ‘effect’ implies a
directional association (XY) which requires an extra-statistical evaluation, while
generally speaking a ‘spurious’ association implies a subject-specific assessment on
what is to be considered the ‘authentic’ association. The ‘association criterion’ for
absence of confounding, which is common in Epidemiology textbooks, is the following:
“Denote with T the set of variables not depending on X. Then the relationship between X
and Y is not “confused” by T if any variable Z in T satisfies at least one of the following
conditions:
a) Z is not associated with X (P(x|z)=P(x)), or
b) Z is not associated with the response Y, given X (P(y|z,x)=P(y|x))”.
1
For a recent discussion on this approach see Dawid (2000).
This is not a purely statistical criterion since subject-specific knowledge is required to
identify the variables in T. Moreover this criterion fails in several circumstances (joint
treatment of confounders, proxy variables, intermediate variables, unmeasured
confounders). With this respect causal graphs can be helpful by providing a “causal”
definition of confounding. Causal graphs encode a set of qualitative assumptions about
causal relations among variables and allows to find conditions for effect identifiability
given those assumptions.
Pearl (1996) discussed this topic and reported a ‘causal’ criterion for absence of
confounding: “Denote with M a causal model generating the observed data. The
probability of observing response Y=y after having assigned treatment X=x is denoted
by P(y||x), given M. Then the relationship between X and Y is not “confused” in model
M if and only if: P(y||x)=P(y|x), where P(y|x) is the conditional probability given model
M”.
Therefore is important to understand that first we must define a causal model, then
conditions for absence of confounding can be derived. Stable absence of confounding is
identified by the absence of a common ancestor in the causal pathway XY (theorem 1
in Pearl,1996) which corresponds to the back-door graphical criterion.
In the presence of confounding the causal effect of X on Y is said to be identifiable if the
quantity P(y||x) can be computed uniquely from any positive distribution of the observed
variables that is compatible with M (Pearl, 1995). Criteria for identification have been
found: the back-door criterion, for example, allows to identify causal effects. A set of
variables Z satisfies the back-door criterion when the following two conditions are
satisfied:
a) no node in Z is a descendant of X;
b) Z blocks any path between X and Y which contains an arrow into X.
If a set a variables Z satisfies the back-door criterion then the causal effect is given by
the formula:
P(y||x)=zP(y|x,z)P(z).
This criterion allows the researcher to search for a sufficient set of confounders needed
for identification (Greenland et al., 1999), to decide what variables must be measured
and controlled for to obtain unconfounded effect estimates. This can also aid in planning
data collection and analysis avoiding subtle pitfalls of confounders selection.
3. Statistical modelling
The use of regression models to cope for potential confounders is popular in the
epidemiological literature. Several strategies have been suggested. Generally speaking
this approach is affected by the drawback of assuming the ‘association’ criterion for
absence of confounding. Moreover the underlying assumptions are far to be clear and
criticism on this line has been raised by several authors (e.g. Freedman 1999 for a recent
one or Vandenbroucke 1986 for a former). The consequence is that researchers tend to
adopt empirical strategies and simpler models. Multivariate responses and models for
multiple determinants are rarely addressed, relying on marginal effect or effect adjusted
for a small core of confounders. Propensity score methods can be a way out of this
problem (Rosenbaum, Rubin, 1983), allowing for properly account for confounding by
modelling the probability of being treated.
Here we show an application where graphical chain models are used to model jointly
propensity to treatment, intermediate variables and treatment effects.
3.1 Example 1: Cross-sectional study on childhood respiratory symptoms and air
pollution
Graphical chain models provide a suitable tool to investigate the structure of
independences in the joint distribution of the variables (Cox and Wermuth, 1996, Ch.
8). In a chain graph, the variables are represented by nodes or vertices; the
nodes/variables are divided into subsets and these subsets ordered in such a way that all
the variables in a given subset are potentially ancestors for the variables in the following
subsets. The ordering among the boxes reflects the causal order, and should be specified
by a priori knowledge (Cox, 1993). Biggeri and Stanghellini (1999) used graphical chain
models to analyse a data set concerning a study on the relationships between
atmospheric pollutants and respiratory symptoms in children, taking into account socioeconomic and individual risk factors. A questionnaire has been distributed to the parents
of 9847 children attending primary schools located in different areas for which measures
of air pollution were available (yearly average of the daily concentration of Particulate
and Nitrogen Oxide, NO2, g/m3).
The aim of the study was to investigate the causal role of several individual and
environmental variables in the prevalence of respiratory symptoms in childhood.
Therefore we have to consider a tri-variate response and subsets of explanatory and
intermediate covariates. Previous knowledge has been used to built a causal model,
leaving to statistical modelling the exact identification of the causal links.
The first subset includes socio-demographic characteristic of the child and geographical
location: mother’s education (IST), father’s social class (C.SO), geographical region of
residence (NS). Then we define four parallel subsets, a subset of outdoor environmental
variables (location of the house close to high traffic roads (ATV), levels of particulate
concentration (PO), levels of nitrogen oxides (AZO)), a subset of indoor environmental
variables (passive smoking exposure due to father (FP) or mother (FM) smoking habits)
and a subset of house characteristics (overcrowding (DE), mouldiness or humidity
(UM)), finally a subset of fixed variables (Sex (SESS), age (ET) and asthma familiarity
(FA)). The responses subset include presence of asthma attacks (ASM), wheezing
symptoms (SIBILI), bronchitis symptoms (cough or phlegm TC) in the last 12 months.
We were able to identify the effect of the socio-demographic characteristics and
geography on the outdoors environmental variables which were the determinants
actually
under study. Such variables acted as confounders only indirectly, through the mediating
effect of indoor environmental variables and smoking habits of the child parents.
Some relationships were present as a consequence of the multivariate response being
considered. Most noticeable, passive smoking has no direct effect on asthma attacks,
while nitrogen oxides exerted a direct action on bronchial responsiveness.
Figure 1: Graphical chain model of the relationships between air pollutants and
respiratory symptoms in childhood.
IST
R
DE
N
FP
UM
ID
FM
TC
C.SO
C
ASM
A
ATV
ET
A’
PO
LV
NS
SIBILI
SESS
O
AZO
TO
FA
M.
4. Front-door criterion and unobserved variables
One of the assumptions underlying the previous example is that uncontrolled (or
unmeasured) confounding was absent or, at least, negligible. Even if some unmeasured
confounding is present, there are still some cases where identifying conditions hold.
This is the case of another condition, named front-door criterion: it considers covariates
that are affected by the treatment and are also active agent determining the response. In
this case, even if there are unobserved variables (U) influencing both the response and
the treatment, the causal effect is identifiable (Pearl, 1995). An example of this would
be the study of the effect of smoking (X) on lung cancer (Y), supposing that the only
active agent is the intermediate variable I; the causal effect would be identified through
the conditional distribution of I given X:
P(y||x)=iP(i|x)i’P(y|x’,i)P(x’).
Figure 2: Graph showing the front-door criterion.
I
X
Y
U
In many applications, especially in socio-economical contexts, we cannot rely on
previous results that ensure identifiability of causal effects: there is a lack of sufficient
observed covariates that would make ignorability assumptions plausible, nor
intermediate variables that are not, themselves, affected by unobserved variables. This
general problem of additional unmeasured confounding can be dealt with, from a
statistical viewpoint, in different ways. The following example will show some of the
problems encountered in many realistic applications in the social sciences.
4.1 Example 2: Longitudinal analysis of youth training programmes and labour
market prospects
A major focus of microeconometric and statistical research and debate continues to be
related to the impact of youth training programmes on the labour market (Dehejia,
Wahba, 1999). Active labour market programmes, like the Youth Training Scheme
(YTS), which was provided by the British government during the 1980s and 1990s, are
usually embedded in the youth labour market history, which involves individual
transitions between different states such as employment, unemployment and various
form of education and training; repeated spells of training can be observed as well as
spells out of the labour force. The timing and destination state of each transition usually
depend on observed individual characteristics as well as previous market experience, but
the selection mechanism into training is potentially affected also by unobservable
characteristics that may be linked to subsequent labour market experience1. Robins
(1997) and Robins et al. (1999) give formal definition of no unmeasured confounders in
a longitudinal setting that are sufficient conditions for estimating causal effects2.
If these conditions do not hold, without additional assumptions, causal effects of YTS
are not identifiable. A partial solution to the problem is offered by the use of
instrumental variables (Angrist et al. 1996; Imbens, Rubin, 1997), i.e. variables that
affect the treatment but are uncorrelated with unobserved factors.
Figure 3: Graph showing an instrumental variable.
X
I
Y
U
In randomised trials with non-compliance, I is the indicator for assignment to the
treatment group, while X is the actual treatment indicator. Under suitable assumptions,
that usually hold in controlled trials but can be less obvious in observational studies,
causal effect for the subpopulation of compliers (those who comply with their
1
Other features of transition data and models in economics relate to dimensionality, institutional
constraints and sample attrition (non-ignorable drop-out) that raise similar problems to those concerning
the estimation of causal effects (Mealli, Pudney, 2000); for a recent contribution to the topic see
Sharfstein et al. (1999).
2
For a discussion of graphical methods for time-dependent variables see, among others, Robins (1997).
assignment) can be estimated, a simple estimator being the ratio two covariances:
cov(Y,I)/cov(X,I).
If the effect of treatment on the population as a whole is of interest, then instrumental
variables allow to derive effect bounds: Balke and Pearl (1997) generalise the results of
Manski (1990) and Robins (1989).
In the YTS study (Mealli, Pudney, 2000), although the data include detailed work
histories of the 1988 cohort of 3791 school leavers which were observed up to an
exogenously-determined date, it is hard to think at some observed variable as an
instrument: in this and similar contexts exclusion restrictions are very unlikely to hold.
Non parametric bounds on treatment effects can still be derived, although their range is
then even wider; tighter bounds or effect’s estimates can be obtained only if additional
prior restrictions are imposed (Manski, 1989).
An example of such assumptions is the one posed in the YTS study: a factor structure
was imposed on unobserved heterogeneity to capture dependence across episodes spent
in different states. Thus a transition model was specified that was empirically tractable
and economically interpretable: persistence, due to unobserved characteristics of
individuals, is in fact generally found in observed sequence of individual transitions.
The specified model is a modified form of the conventional heterogeneous multi-spell
multi-state transition model. Such models proceed by partitioning the observed work
history into a sequence of episodes. For the first spell of the sequence, there is a discrete
distribution of the state variable r0 with conditional probability mass function P(r0| x0,v).
Conditional on past history, each successive episode for j=1...m-1 is characterised by a
joint density/mass function f(tj,rj|xj,v), where xj may include functions of earlier state and
duration variables. The term v is a vector of unobserved random variables, constant over
time, that can thus generate strong serial dependence in the sequence of episodes. Under
our sampling scheme, the final observed spell is usually incomplete. Conditional on the
observed and unobserved covariates, the joint distribution of r0, t1, r1 , …, tm is:
f(r0, t1, r1 , …, tm |X,v) =P(r0|x0,v) f(tj, rj |xj,v) S(tm|xm,v)
The transition components of the model (the probability density/mass function f and the
survivor function S) are based on the notion of a set of origin- and destination-specific
transition intensity functions for each spell; these give the “instantaneous” probability of
exit to a given destination at a particular time, conditional on no previous exit having
occurred. Since the random effects are unobserved, they must be removed by
integration, once a joint (parametric or non parametric) distribution function, G(v), for
the unobserved heterogeneity terms is specified; estimation can then proceed by
maximising the log-likelihood. We opted for a model in which there is are reasonable
degrees of flexibility in both the transition intensity functions (removing proportionality
and monotonicity) and the heterogeneity distribution. In particular, we specify a four
state specific correlated random effects, that encompass the most common forms of
heterogeneity used in practice.
Once the transition intensities are estimated, the problem of estimating training effects
on, for example, the time spent in unemployment is still a hard one. We applied a
simulation strategy similar to those proposed by Robins.
Results show a strong positive effect of YTS participation on employment prospect.
5. Sensitivity analysis
All the assumptions that lead to the estimation of causal effects must be evaluated. If
from the causal graph identifiability conditions hold, then possible model specification
errors can and must be carefully verified.
When such conditions do not hold and effects are not non-parametrically identifiable,
bounds on treatment effects can still be derived. Inference based on additional
assumptions not fully testable from the data, such as those employed in the YTS
example, must be checked by means of sensitivity analysis. Although sensitivity
analysis may render the presentation of results confusing, as it includes non only the
sampling variability but also the variability of estimates across models, deviations from
the basic assumptions and presentation of results under different model specifications
may give alternative viewpoints on the observed data and suggest alternative
interpretations, as well as give a measure of the potential bias that might arise if some
assumptions do not hold (Angrist et al., 1996; Hirano et al., 2000).
Referring to the transition model of the previous section, for example, some of the
assumptions concerning the distribution of unobserved heterogeneity, such as (log-)
normality and independence of included regressors, can be tested and relaxed (Mealli,
Pudney, 1999). Sensitivity of results to possible unmeasured confounders can be
assessed via formal sensitivity analysis (Rosenbaum, Rubin, 1985; Copas, Li, 1997;
Scharfstein et al., 1999).
6. Conclusion
Recent developments within graphical models (Pearl, 1995, 2000) can be the basis for
representing and inferring about causal relationships; within such framework classical
strategies of controlling for confounding are encompassed in a unifying approach, that
exploit graphs’ language to clarify causal effects and their identification.
In the paper, we have given examples relative to confounding problems that show
different criteria of controlling for confounding (Greenland, Pearl and Robins, 1999).
When such theorems do not apply, and effects are not non-parametrically identifiable,
bounds on treatment effects can still be derived and reliance on additional assumptions
discussed, also by means of sensitivity analysis. Some examples of such assumptions,
within an socio-economical context, have been shown.
Previous tools can be extended also to more general graphical models than DAG
(Lauritzen, 2000). Chain graph models are an example: they are represented by graphs
that have both directed and undirected links. We have shown an application of graphical
chain models for the study of causal relationships between atmospheric pollutants and
respiratory symptoms in children.
References
Angrist J.D., Imbens G.W. and Rubin D. (1996) Identification of Causal Effects using
Instrumental Variables, Journal of the American Statistical Association, 91, 444-472.
Biggeri A. (1999) Confondimento e causalità in Epidemiologia. Epidemiologia e
Prevenzione, 23, 253-259.
Biggeri A. and Stanghellini E. (1999) Uso di Modelli Grafici per l’Analisi della
Relazione tra Inquinamento Atmosferico e Disturbi Respiratori nell’Infanzia, Atti
della XXXIX Riunione Scientifica della SIS, Napoli.
Balke A. and Pearl J (1997) Bounds on Treatment Effects from Studies with Imperfect
Compliance, Journal of the American Statistical Association, 92, 439, 1171-1176.
Copas J.B. and Li H.G. (1997) Inference for non-random samples (with discussion),
Journal of the Royal Statistical Society B, 59, 55-95.
Cox D.R., Causality and Graphical Models, Proceedings of the 49th Session of the ISI,
Book 1, 365-372.
Dawid A.P. (2000) Causal Inference Without Conterfactuals (with discussion), Journal
of the American Statistical Association, 95, 450, 407-447.
Deheija R.H., Wahba (1999) Causal Effects in Nonexperimental Studies: Reevaluating
the Evaluation of Training Programs, Journal of the American Statistical
Association, 94, 448, 1053-1062.
Greenland S., Robins J.M. and Pearl J. (1999) Confounding and Collapsibility in Causal
Inference, Statistical Science, 14, 29-46.
Greenland S., Pearl J. and Robins J.M. (1999) Causal Diagrams for Epidemiologic
Research, Epidemiology, 10, 37-48.
Freedman D. (1999) From Association to Causation: Some Remarks on the History of
Statistics, Statistical Science, 14, 243-258.
Hirano K., Imbens G.W., Rubin D.B. and Zhou X. (2000) Assessing the effect of an
influenza vaccine in an encouragement design, Biostatistics, 1, 69-88.
Imbens G. W., Rubin D.B. (1997) Bayesian Inference for causal effects in randomized
experiments with noncompliance, Annals of Statistics, 25, 305-327.
Lauritzen S. L. (2000) Causal Inference from Graphical Models, in: Complex Stochastic
System, Barndorff-Nielsen O.E., Cox D.R. and Klueppelberg C. (eds), Chapman and
Hall, London.
Manski (1989) Anatomy of the selection problem, Journal of Human Resources, 24,
343-360.
Manski (1990) Nonparametric bounds on Treatment Effects, American Economic
Review, Papers and Proceedings, 80, 319-323.
Mealli F. and Pudney S. (1999) Specification tests for random-effects transition models:
an application to a model of the British Youth Training Scheme, Lifetime Data
Analysis, 5, 213-237.
Mealli F. and Pudney S. (2000) Applying Heterogeneous Transition Models in Labour
Economics: the Role of Youth Training in Labour Market Transitions, in: Analysis of
Survey Data, Skinner C.J. and Chambers R.L. (eds.), Wiley, New York.
Pearl J. (1995) Causal Diagrams in Empirical Research, Biometrika, 82, 669-710.
Pearl J. (2000) Causality, Cambridge University Press, New York.
Robins J.M. (1989) The analysis of Randomized and Non-Randomized AIDS Treatment
Trials Using a New Approach to Causal Inference in Longitudinal Studies, in Health
Service Research Methodology: A Focus on AIDS, Sechert L., Freeman H. and
Mulley A. (eds), U.S. Public Health Service.
Robins J. M. (1997) Causal Inference from Complex Longitudinal Data, in Latent
Variable Modelling with Application to Causality, Berkane M. (ed), Springer, New
York, 69-117.
Robins J.M., Greenland S. and Hu F.C. (1999) Estimation of the Causal Effect of a
Time-Varying Exposure on the Marginal Mean of a Repeated Binary Outcome,
Journal of the American Statistical Association, 94, 447, 687-700.
Rosenbaum and Rubin D.B. (1983) The central role of the propensity score in
observational studies for causal effects, Biometrika, 70, 41-55.
Rothman K. and Greenland S. (1998) Modern Epidemiology, 2nd edition, LippincottRaven, Boston.
Scharfstein D.O, Rotnitzky A., Robins J.M. (1999) Adjusting for Nonignorable DropOut Using Semiparametric Nonresponse Models, with discussion, Journal of the
American Statistical Association, 94, 448, 1096-1120.
Vandenbroucke J.P. (1987) Should we abandon statistical modeling altogether?
American Journal of Epidemiology, 126, 1, 10-3.