Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Identifiability of Causal Relationships in Non-Experimental Studies1 Identificabilità di relazioni causali in studi non sperimentali Fabrizia Mealli Dipartimento di Statistica “G. Parenti” - Università degli Studi di Firenze e-mail: [email protected] Annibale Biggeri Dipartimento di Statistica “G. Parenti” - Università degli Studi di Firenze e-mail: [email protected] Riassunto: L’identificazione di relazioni causali negli studi non sperimentali è uno degli obiettivi più importanti dell’analisi epidemiologica. Nella letteratura statistica esistono diversi approcci per l’identificazione e la stima di effetti causali. Tra questi, un ruolo importante è rivestito dall’approccio che utilizza il concetto di risultati potenziali (o Modello Causale di Rubin) e dalla teoria dei grafi causali. Nel presente lavoro si farà inizialmente riferimento alla teoria dei grafi causali, mettendo in luce la sua relazione con la definizione epidemiologica di confondimento. Verrà mostrata un’applicazione dei modelli grafici a catena per l’analisi della relazione tra inquinamento atmosferico e disturbi respiratori nell’infanzia. Verranno poi analizzate le complicazioni, in termini di identificabilità degli effetti causali, dovute alla presenza di confondenti non osservabili; sarà presentato, infine, uno studio longitudinale per l’analisi degli effetti di programmi di formazione al lavoro sulle prospettive lavorative di giovani. Keywords: causal models, confounding, graphical models, longitudinal studies, air pollution. 1. Introduction Some of the most challenging tasks for Epidemiology and Applied Statistics are related to the identification of causal relationships in observational studies. Indeed, one could argue that theoretical contributions on study design in Epidemiology lead to rules to obtain samples on which causal relationships of interest can be estimated. To assure validity of the inference appropriate control of confounding is, among others, one of the main tasks of the researcher. This can be achieved by study design options or by using statistical techniques in the analysis phase (Rothman and Greenland, 1998), and both strategies have been and are currently employed. In Epidemiology, statistical control of confounding is obtained via stratification or, when many covariates have to be taken into account, via regression models; propensity score methods would be another solution (Rosenbaum, Rubin, 1993). The first strategy 1 The paper is supported by COFIN 98-99 Biggeri, COFIN 98-99 Marchetti, and COFIN 99-00 Biggeri. has many links with the so-called Simpson’s paradox while the second one relies heavily on the correct model specification and therefore has been criticised since many assumptions seem arbitrary in practice (Freedman, 1999). There are different statistical approaches to deal with identification and estimation of causal effects. A very important one, that has led to many operational techniques for causal analysis, is usually referred to as counterfactual or potential outcomes framework or Rubin’s Causal Model (Holland, 1986); causal effects involve comparisons of outcomes that would have been observed under exposures of units to different treatments1. Another approach to causal modelling is based on the theory of causal graphs. Recent development in this field, which has many links with structural equation modelling, has provided a formal theory for evaluating causal effects and a new perspective for confounder identification in Epidemiology (Greenland et al., 1999). The use of graphs provides a tool for integrating statistical and subject-matter information: it facilitates to explicate the assumptions underlying the causal model and to decide whether the assumptions are sufficient for obtaining estimates of causal effects from observed quantities. The two formalisations have many links (see for example Greenland et al. 1999 and Robins (1995) for a discussion on this). Rubin’s approach can ease the formulation and the assessment of critical assumptions such as those, for example, involving circumstances where features of units influence other units. We believe that they provide complementary languages for exposing the fundamental assumptions required to make causal estimates feasible. In the paper we will first refer to the theory of causal graphs and its relation to the usual epidemiological definition of confounding. We will show an application of graphical chain models to a study of respiratory symptoms in childhood and air pollution. We will then focus on identification problems (and related solutions) raised by the presence of unmeasured confounding and show a longitudinal study, that may suffer from such problems, of the effects of Youth Training programmes on labour market prospects. 2. Confounding There is an intrinsic conflict in the formal definition of confounding based on association criteria (Biggeri, 1999). This conflict is solved in practice by gathering information on the subject specific matter. Indeed the definition of ‘effect’ implies a directional association (XY) which requires an extra-statistical evaluation, while generally speaking a ‘spurious’ association implies a subject-specific assessment on what is to be considered the ‘authentic’ association. The ‘association criterion’ for absence of confounding, which is common in Epidemiology textbooks, is the following: “Denote with T the set of variables not depending on X. Then the relationship between X and Y is not “confused” by T if any variable Z in T satisfies at least one of the following conditions: a) Z is not associated with X (P(x|z)=P(x)), or b) Z is not associated with the response Y, given X (P(y|z,x)=P(y|x))”. 1 For a recent discussion on this approach see Dawid (2000). This is not a purely statistical criterion since subject-specific knowledge is required to identify the variables in T. Moreover this criterion fails in several circumstances (joint treatment of confounders, proxy variables, intermediate variables, unmeasured confounders). With this respect causal graphs can be helpful by providing a “causal” definition of confounding. Causal graphs encode a set of qualitative assumptions about causal relations among variables and allows to find conditions for effect identifiability given those assumptions. Pearl (1996) discussed this topic and reported a ‘causal’ criterion for absence of confounding: “Denote with M a causal model generating the observed data. The probability of observing response Y=y after having assigned treatment X=x is denoted by P(y||x), given M. Then the relationship between X and Y is not “confused” in model M if and only if: P(y||x)=P(y|x), where P(y|x) is the conditional probability given model M”. Therefore is important to understand that first we must define a causal model, then conditions for absence of confounding can be derived. Stable absence of confounding is identified by the absence of a common ancestor in the causal pathway XY (theorem 1 in Pearl,1996) which corresponds to the back-door graphical criterion. In the presence of confounding the causal effect of X on Y is said to be identifiable if the quantity P(y||x) can be computed uniquely from any positive distribution of the observed variables that is compatible with M (Pearl, 1995). Criteria for identification have been found: the back-door criterion, for example, allows to identify causal effects. A set of variables Z satisfies the back-door criterion when the following two conditions are satisfied: a) no node in Z is a descendant of X; b) Z blocks any path between X and Y which contains an arrow into X. If a set a variables Z satisfies the back-door criterion then the causal effect is given by the formula: P(y||x)=zP(y|x,z)P(z). This criterion allows the researcher to search for a sufficient set of confounders needed for identification (Greenland et al., 1999), to decide what variables must be measured and controlled for to obtain unconfounded effect estimates. This can also aid in planning data collection and analysis avoiding subtle pitfalls of confounders selection. 3. Statistical modelling The use of regression models to cope for potential confounders is popular in the epidemiological literature. Several strategies have been suggested. Generally speaking this approach is affected by the drawback of assuming the ‘association’ criterion for absence of confounding. Moreover the underlying assumptions are far to be clear and criticism on this line has been raised by several authors (e.g. Freedman 1999 for a recent one or Vandenbroucke 1986 for a former). The consequence is that researchers tend to adopt empirical strategies and simpler models. Multivariate responses and models for multiple determinants are rarely addressed, relying on marginal effect or effect adjusted for a small core of confounders. Propensity score methods can be a way out of this problem (Rosenbaum, Rubin, 1983), allowing for properly account for confounding by modelling the probability of being treated. Here we show an application where graphical chain models are used to model jointly propensity to treatment, intermediate variables and treatment effects. 3.1 Example 1: Cross-sectional study on childhood respiratory symptoms and air pollution Graphical chain models provide a suitable tool to investigate the structure of independences in the joint distribution of the variables (Cox and Wermuth, 1996, Ch. 8). In a chain graph, the variables are represented by nodes or vertices; the nodes/variables are divided into subsets and these subsets ordered in such a way that all the variables in a given subset are potentially ancestors for the variables in the following subsets. The ordering among the boxes reflects the causal order, and should be specified by a priori knowledge (Cox, 1993). Biggeri and Stanghellini (1999) used graphical chain models to analyse a data set concerning a study on the relationships between atmospheric pollutants and respiratory symptoms in children, taking into account socioeconomic and individual risk factors. A questionnaire has been distributed to the parents of 9847 children attending primary schools located in different areas for which measures of air pollution were available (yearly average of the daily concentration of Particulate and Nitrogen Oxide, NO2, g/m3). The aim of the study was to investigate the causal role of several individual and environmental variables in the prevalence of respiratory symptoms in childhood. Therefore we have to consider a tri-variate response and subsets of explanatory and intermediate covariates. Previous knowledge has been used to built a causal model, leaving to statistical modelling the exact identification of the causal links. The first subset includes socio-demographic characteristic of the child and geographical location: mother’s education (IST), father’s social class (C.SO), geographical region of residence (NS). Then we define four parallel subsets, a subset of outdoor environmental variables (location of the house close to high traffic roads (ATV), levels of particulate concentration (PO), levels of nitrogen oxides (AZO)), a subset of indoor environmental variables (passive smoking exposure due to father (FP) or mother (FM) smoking habits) and a subset of house characteristics (overcrowding (DE), mouldiness or humidity (UM)), finally a subset of fixed variables (Sex (SESS), age (ET) and asthma familiarity (FA)). The responses subset include presence of asthma attacks (ASM), wheezing symptoms (SIBILI), bronchitis symptoms (cough or phlegm TC) in the last 12 months. We were able to identify the effect of the socio-demographic characteristics and geography on the outdoors environmental variables which were the determinants actually under study. Such variables acted as confounders only indirectly, through the mediating effect of indoor environmental variables and smoking habits of the child parents. Some relationships were present as a consequence of the multivariate response being considered. Most noticeable, passive smoking has no direct effect on asthma attacks, while nitrogen oxides exerted a direct action on bronchial responsiveness. Figure 1: Graphical chain model of the relationships between air pollutants and respiratory symptoms in childhood. IST R DE N FP UM ID FM TC C.SO C ASM A ATV ET A’ PO LV NS SIBILI SESS O AZO TO FA M. 4. Front-door criterion and unobserved variables One of the assumptions underlying the previous example is that uncontrolled (or unmeasured) confounding was absent or, at least, negligible. Even if some unmeasured confounding is present, there are still some cases where identifying conditions hold. This is the case of another condition, named front-door criterion: it considers covariates that are affected by the treatment and are also active agent determining the response. In this case, even if there are unobserved variables (U) influencing both the response and the treatment, the causal effect is identifiable (Pearl, 1995). An example of this would be the study of the effect of smoking (X) on lung cancer (Y), supposing that the only active agent is the intermediate variable I; the causal effect would be identified through the conditional distribution of I given X: P(y||x)=iP(i|x)i’P(y|x’,i)P(x’). Figure 2: Graph showing the front-door criterion. I X Y U In many applications, especially in socio-economical contexts, we cannot rely on previous results that ensure identifiability of causal effects: there is a lack of sufficient observed covariates that would make ignorability assumptions plausible, nor intermediate variables that are not, themselves, affected by unobserved variables. This general problem of additional unmeasured confounding can be dealt with, from a statistical viewpoint, in different ways. The following example will show some of the problems encountered in many realistic applications in the social sciences. 4.1 Example 2: Longitudinal analysis of youth training programmes and labour market prospects A major focus of microeconometric and statistical research and debate continues to be related to the impact of youth training programmes on the labour market (Dehejia, Wahba, 1999). Active labour market programmes, like the Youth Training Scheme (YTS), which was provided by the British government during the 1980s and 1990s, are usually embedded in the youth labour market history, which involves individual transitions between different states such as employment, unemployment and various form of education and training; repeated spells of training can be observed as well as spells out of the labour force. The timing and destination state of each transition usually depend on observed individual characteristics as well as previous market experience, but the selection mechanism into training is potentially affected also by unobservable characteristics that may be linked to subsequent labour market experience1. Robins (1997) and Robins et al. (1999) give formal definition of no unmeasured confounders in a longitudinal setting that are sufficient conditions for estimating causal effects2. If these conditions do not hold, without additional assumptions, causal effects of YTS are not identifiable. A partial solution to the problem is offered by the use of instrumental variables (Angrist et al. 1996; Imbens, Rubin, 1997), i.e. variables that affect the treatment but are uncorrelated with unobserved factors. Figure 3: Graph showing an instrumental variable. X I Y U In randomised trials with non-compliance, I is the indicator for assignment to the treatment group, while X is the actual treatment indicator. Under suitable assumptions, that usually hold in controlled trials but can be less obvious in observational studies, causal effect for the subpopulation of compliers (those who comply with their 1 Other features of transition data and models in economics relate to dimensionality, institutional constraints and sample attrition (non-ignorable drop-out) that raise similar problems to those concerning the estimation of causal effects (Mealli, Pudney, 2000); for a recent contribution to the topic see Sharfstein et al. (1999). 2 For a discussion of graphical methods for time-dependent variables see, among others, Robins (1997). assignment) can be estimated, a simple estimator being the ratio two covariances: cov(Y,I)/cov(X,I). If the effect of treatment on the population as a whole is of interest, then instrumental variables allow to derive effect bounds: Balke and Pearl (1997) generalise the results of Manski (1990) and Robins (1989). In the YTS study (Mealli, Pudney, 2000), although the data include detailed work histories of the 1988 cohort of 3791 school leavers which were observed up to an exogenously-determined date, it is hard to think at some observed variable as an instrument: in this and similar contexts exclusion restrictions are very unlikely to hold. Non parametric bounds on treatment effects can still be derived, although their range is then even wider; tighter bounds or effect’s estimates can be obtained only if additional prior restrictions are imposed (Manski, 1989). An example of such assumptions is the one posed in the YTS study: a factor structure was imposed on unobserved heterogeneity to capture dependence across episodes spent in different states. Thus a transition model was specified that was empirically tractable and economically interpretable: persistence, due to unobserved characteristics of individuals, is in fact generally found in observed sequence of individual transitions. The specified model is a modified form of the conventional heterogeneous multi-spell multi-state transition model. Such models proceed by partitioning the observed work history into a sequence of episodes. For the first spell of the sequence, there is a discrete distribution of the state variable r0 with conditional probability mass function P(r0| x0,v). Conditional on past history, each successive episode for j=1...m-1 is characterised by a joint density/mass function f(tj,rj|xj,v), where xj may include functions of earlier state and duration variables. The term v is a vector of unobserved random variables, constant over time, that can thus generate strong serial dependence in the sequence of episodes. Under our sampling scheme, the final observed spell is usually incomplete. Conditional on the observed and unobserved covariates, the joint distribution of r0, t1, r1 , …, tm is: f(r0, t1, r1 , …, tm |X,v) =P(r0|x0,v) f(tj, rj |xj,v) S(tm|xm,v) The transition components of the model (the probability density/mass function f and the survivor function S) are based on the notion of a set of origin- and destination-specific transition intensity functions for each spell; these give the “instantaneous” probability of exit to a given destination at a particular time, conditional on no previous exit having occurred. Since the random effects are unobserved, they must be removed by integration, once a joint (parametric or non parametric) distribution function, G(v), for the unobserved heterogeneity terms is specified; estimation can then proceed by maximising the log-likelihood. We opted for a model in which there is are reasonable degrees of flexibility in both the transition intensity functions (removing proportionality and monotonicity) and the heterogeneity distribution. In particular, we specify a four state specific correlated random effects, that encompass the most common forms of heterogeneity used in practice. Once the transition intensities are estimated, the problem of estimating training effects on, for example, the time spent in unemployment is still a hard one. We applied a simulation strategy similar to those proposed by Robins. Results show a strong positive effect of YTS participation on employment prospect. 5. Sensitivity analysis All the assumptions that lead to the estimation of causal effects must be evaluated. If from the causal graph identifiability conditions hold, then possible model specification errors can and must be carefully verified. When such conditions do not hold and effects are not non-parametrically identifiable, bounds on treatment effects can still be derived. Inference based on additional assumptions not fully testable from the data, such as those employed in the YTS example, must be checked by means of sensitivity analysis. Although sensitivity analysis may render the presentation of results confusing, as it includes non only the sampling variability but also the variability of estimates across models, deviations from the basic assumptions and presentation of results under different model specifications may give alternative viewpoints on the observed data and suggest alternative interpretations, as well as give a measure of the potential bias that might arise if some assumptions do not hold (Angrist et al., 1996; Hirano et al., 2000). Referring to the transition model of the previous section, for example, some of the assumptions concerning the distribution of unobserved heterogeneity, such as (log-) normality and independence of included regressors, can be tested and relaxed (Mealli, Pudney, 1999). Sensitivity of results to possible unmeasured confounders can be assessed via formal sensitivity analysis (Rosenbaum, Rubin, 1985; Copas, Li, 1997; Scharfstein et al., 1999). 6. Conclusion Recent developments within graphical models (Pearl, 1995, 2000) can be the basis for representing and inferring about causal relationships; within such framework classical strategies of controlling for confounding are encompassed in a unifying approach, that exploit graphs’ language to clarify causal effects and their identification. In the paper, we have given examples relative to confounding problems that show different criteria of controlling for confounding (Greenland, Pearl and Robins, 1999). When such theorems do not apply, and effects are not non-parametrically identifiable, bounds on treatment effects can still be derived and reliance on additional assumptions discussed, also by means of sensitivity analysis. Some examples of such assumptions, within an socio-economical context, have been shown. Previous tools can be extended also to more general graphical models than DAG (Lauritzen, 2000). Chain graph models are an example: they are represented by graphs that have both directed and undirected links. We have shown an application of graphical chain models for the study of causal relationships between atmospheric pollutants and respiratory symptoms in children. References Angrist J.D., Imbens G.W. and Rubin D. (1996) Identification of Causal Effects using Instrumental Variables, Journal of the American Statistical Association, 91, 444-472. Biggeri A. (1999) Confondimento e causalità in Epidemiologia. Epidemiologia e Prevenzione, 23, 253-259. Biggeri A. and Stanghellini E. (1999) Uso di Modelli Grafici per l’Analisi della Relazione tra Inquinamento Atmosferico e Disturbi Respiratori nell’Infanzia, Atti della XXXIX Riunione Scientifica della SIS, Napoli. Balke A. and Pearl J (1997) Bounds on Treatment Effects from Studies with Imperfect Compliance, Journal of the American Statistical Association, 92, 439, 1171-1176. Copas J.B. and Li H.G. (1997) Inference for non-random samples (with discussion), Journal of the Royal Statistical Society B, 59, 55-95. Cox D.R., Causality and Graphical Models, Proceedings of the 49th Session of the ISI, Book 1, 365-372. Dawid A.P. (2000) Causal Inference Without Conterfactuals (with discussion), Journal of the American Statistical Association, 95, 450, 407-447. Deheija R.H., Wahba (1999) Causal Effects in Nonexperimental Studies: Reevaluating the Evaluation of Training Programs, Journal of the American Statistical Association, 94, 448, 1053-1062. Greenland S., Robins J.M. and Pearl J. (1999) Confounding and Collapsibility in Causal Inference, Statistical Science, 14, 29-46. Greenland S., Pearl J. and Robins J.M. (1999) Causal Diagrams for Epidemiologic Research, Epidemiology, 10, 37-48. Freedman D. (1999) From Association to Causation: Some Remarks on the History of Statistics, Statistical Science, 14, 243-258. Hirano K., Imbens G.W., Rubin D.B. and Zhou X. (2000) Assessing the effect of an influenza vaccine in an encouragement design, Biostatistics, 1, 69-88. Imbens G. W., Rubin D.B. (1997) Bayesian Inference for causal effects in randomized experiments with noncompliance, Annals of Statistics, 25, 305-327. Lauritzen S. L. (2000) Causal Inference from Graphical Models, in: Complex Stochastic System, Barndorff-Nielsen O.E., Cox D.R. and Klueppelberg C. (eds), Chapman and Hall, London. Manski (1989) Anatomy of the selection problem, Journal of Human Resources, 24, 343-360. Manski (1990) Nonparametric bounds on Treatment Effects, American Economic Review, Papers and Proceedings, 80, 319-323. Mealli F. and Pudney S. (1999) Specification tests for random-effects transition models: an application to a model of the British Youth Training Scheme, Lifetime Data Analysis, 5, 213-237. Mealli F. and Pudney S. (2000) Applying Heterogeneous Transition Models in Labour Economics: the Role of Youth Training in Labour Market Transitions, in: Analysis of Survey Data, Skinner C.J. and Chambers R.L. (eds.), Wiley, New York. Pearl J. (1995) Causal Diagrams in Empirical Research, Biometrika, 82, 669-710. Pearl J. (2000) Causality, Cambridge University Press, New York. Robins J.M. (1989) The analysis of Randomized and Non-Randomized AIDS Treatment Trials Using a New Approach to Causal Inference in Longitudinal Studies, in Health Service Research Methodology: A Focus on AIDS, Sechert L., Freeman H. and Mulley A. (eds), U.S. Public Health Service. Robins J. M. (1997) Causal Inference from Complex Longitudinal Data, in Latent Variable Modelling with Application to Causality, Berkane M. (ed), Springer, New York, 69-117. Robins J.M., Greenland S. and Hu F.C. (1999) Estimation of the Causal Effect of a Time-Varying Exposure on the Marginal Mean of a Repeated Binary Outcome, Journal of the American Statistical Association, 94, 447, 687-700. Rosenbaum and Rubin D.B. (1983) The central role of the propensity score in observational studies for causal effects, Biometrika, 70, 41-55. Rothman K. and Greenland S. (1998) Modern Epidemiology, 2nd edition, LippincottRaven, Boston. Scharfstein D.O, Rotnitzky A., Robins J.M. (1999) Adjusting for Nonignorable DropOut Using Semiparametric Nonresponse Models, with discussion, Journal of the American Statistical Association, 94, 448, 1096-1120. Vandenbroucke J.P. (1987) Should we abandon statistical modeling altogether? American Journal of Epidemiology, 126, 1, 10-3.