Causal inference for survival analysis (I)
Miguel A. Hernán
Department of Epidemiology, Harvard School of Public Health
Lysebu, September 2004

Outline
1. Definition of causal effect
   - Counterfactuals
2. Estimation of causal effects
   - Inverse probability weighting
3. Causal diagrams
   - Directed acyclic graphs
4. The bias of standard methods
5. Causal models
   - Marginal structural models

Causal inference (I)

An intuitive definition of cause
- Ian took the pill on Sept 1, 2003
  - Five days later, he died
- Had Ian not taken the pill on Sept 1, 2003 (all other things being equal)
  - Five days later, he would have been alive
- Did the pill cause Ian’s death?

An intuitive definition of cause
- Jim didn’t take the pill on Sept 1, 2002
  - Five days later, he was alive
- Had Jim taken the pill on Sept 1, 2002 (all other things being equal)
  - Five days later, he would have been alive
- Did the pill cause Jim’s survival?

Human reasoning for causal inference
- We compare (often only mentally)
  - the outcome when action A is present with
  - the outcome when action A is absent
  - all other things being equal
- If the two outcomes differ, we say that the action A has a causal effect, causative or preventive
- In epidemiology, A is commonly referred to as exposure or treatment

Notation for actual data
- Y=1 if patient died, 0 otherwise
  - Yi=1, Yj=0
- A=1 if patient treated, 0 otherwise
  - Ai=1, Aj=0
  ID    A   Y
  Ian   1   1
  Jim   0   0

Notation for ideal data
- Ya=0=1 if subject would have died, had he not taken the pill
  - Yi,a=0=0, Yj,a=0=0
- Ya=1=1 if patient would have died, had he taken the pill
  - Yi,a=1=1, Yj,a=1=0

  ID    A   Ya=0   Ya=1
  Ian   1   0      1
  Jim   0   0      0

Clarification:
- Upper-case letters for random variables
  - A, Y, Ya=0, Ya=1
- Lower-case letters for possible values (realizations) of those variables
  - a is a possible value (0 or 1) of the random variable A
- For our purposes, random variables are variables with different values for different individuals

(Individual) Causal effect
- For Ian: pill has a causal effect because Yi,a=1 ≠ Yi,a=0
- For Jim: pill does not have a causal effect because Yj,a=1 = Yj,a=0
- Sharp causal null hypothesis holds if, for all subjects, Ya=1 = Ya=0

Potential or counterfactual outcomes (I)
- Ya=0 and Ya=1
- Random variables
- Amenable to mathematical treatment, e.g., statistical models

Potential or counterfactual outcomes (II)
- One of them describes the subject's outcome value that would have been observed under a potential exposure value that the subject did not actually experience
  - Refers to a “counter to the fact” situation
- One of them describes the subject's outcome value under the exposure value that the subject actually experienced
  - Refers to an observed (factual) situation
- A given potential outcome is factual for some subjects and counterfactual for others
- Consistency
  - By definition, if Ai=a then Yi,a = Yi,A = Yi

Available data set
  ID     A   Y   Ya=0   Ya=1
  Ian    1   1   ?      1
  Jim    0   0   0      ?
  Ken    1   0   ?      0
  Leo    0   1   1      ?
  Mike   1   1   ?      1
  Nick   0   0   0      ?
  …
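The "Available data set" table follows mechanically from consistency: the observed Y fills in the potential outcome for the exposure actually received, and the other one stays unknown. A minimal Python sketch of that bookkeeping is given here; the tuple layout and the helper name `potential_outcomes` are assumptions made for illustration, and `None` stands in for the "?" entries.

```python
# Observed data for the six subjects on the slide: (ID, A, Y).
subjects = [
    ("Ian", 1, 1), ("Jim", 0, 0), ("Ken", 1, 0),
    ("Leo", 0, 1), ("Mike", 1, 1), ("Nick", 0, 0),
]


def potential_outcomes(a_observed, y_observed):
    """Apply consistency: Ya equals Y only for a == observed A.

    The potential outcome under the exposure value that was not
    received is unobserved and is represented by None.
    """
    return {a: (y_observed if a == a_observed else None) for a in (0, 1)}


# Map each subject to {0: Ya=0, 1: Ya=1}.
table = {sid: potential_outcomes(a, y) for sid, a, y in subjects}

for sid, po in table.items():
    print(sid, po)
```

Every subject ends up with exactly one missing potential outcome, which is why no individual causal effect can be read off this table.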
Fundamental problem of causal inference
- Individual causal effects cannot be determined
  - except under extremely strong (and generally unreasonable) assumptions
  - because only one counterfactual outcome is observed
- Causal inference as a missing data problem
  - Whether using a randomized experiment or an observational study
- Need another definition of causal effect that requires weaker assumptions

First, more notation
- Pr[Ya=1]
  - proportion of subjects that would have developed the outcome Y had all subjects in the population of interest received exposure value a
  - (Counterfactual) Risk of Ya
- Unconditional or marginal probability
  - “Calculated” by using data from the whole population

(Population) Causal effect
- In the population, exposure A has a causal effect on the outcome Y if
  Pr[Ya=1=1] ≠ Pr[Ya=0=1]
- Causal null hypothesis holds if
  Pr[Ya=1=1] = Pr[Ya=0=1]

Equivalent representations of the causal null hypothesis
- Pr[Ya=1=1] − Pr[Ya=0=1] = 0
- Pr[Ya=1=1] / Pr[Ya=0=1] = 1
- (Pr[Ya=1=1]/Pr[Ya=1=0]) / (Pr[Ya=0=1]/Pr[Ya=0=0]) = 1
- Causal effect can be measured in many scales:
  - causal risk difference, causal risk ratio, causal odds ratio, …
  - Effect measures

Individual versus population causal effects
- Individual causal effects cannot be determined
  - except under extremely strong assumptions
- Population causal effects can be determined under
  - no assumptions (randomized studies)
  - strong assumptions (observational studies)
- We’ll refer to (population) causal effects only

Association and causation: More notation
- Pr[Y=1|A=a]
  - proportion of subjects that developed the outcome Y among those who received exposure value a in the population
  - Risk of Y among the exposed/unexposed
- Conditional probability
  - Calculated by using data from a subset of the population

Association
- The exposure A and the outcome Y are associated if
  Pr[Y=1|A=1] ≠ Pr[Y=1|A=0]
- No association = independence
  Pr[Y=1|A=1] = Pr[Y=1|A=0]
  A ∐ Y, Y ∐ A

Equivalent representations of independence
- Pr[Y=1|A=1] − Pr[Y=1|A=0] = 0
- Pr[Y=1|A=1] / Pr[Y=1|A=0] = 1
- (Pr[Y=1|A=1]/Pr[Y=0|A=1]) / (Pr[Y=1|A=0]/Pr[Y=0|A=0]) = 1
- Association can be measured in many scales:
  - Associational risk difference, associational risk ratio, associational odds ratio, …
  - Association measures

“Association is not causation”
- Again, crucial difference
- Association: different risk in two disjoint subsets of the population determined by the subjects' actual exposure value
  - Pr[Y=1|A=a] is the risk in subjects of the population that meet the condition ‘having actually received exposure a’
- Causation: different risk in the entire population under two exposure values
  - Pr[Ya=1] is the risk in all subjects of the population had they received the counterfactual exposure a

Statistics and causation
- We need counterfactuals to talk about causation
- Statistics (as a discipline) leaned towards banning counterfactuals from statistical language
- Statistics w/o counterfactuals is a language for association, not for causation
- Causal concepts cannot be represented using statistics w/o counterfactuals

An example of causal concept: Confounding
- There is confounding when association is not causation
  Pr[Ya=1] ≠ Pr[Y=1|A=a]
- Confounding cannot be defined using associational (statistical) language

Counterfactual theory in statistics
- Neyman (1923)
  - Effects of point exposures in randomized experiments
- Rubin (1974)
  - Effects of point exposures in randomized and observational studies
- Robins (1986)
  - Total and direct effects of time-varying exposures in longitudinal studies

Association measures
- The associational risk ratio Pr[Y=1|A=1] / Pr[Y=1|A=0] can be directly computed in any study
  - because Y is observed in all subjects of the population
  - Pr[Y=1|A=1] and Pr[Y=1|A=0] are observed risks

Effect measures
- The causal risk ratio Pr[Ya=1=1] / Pr[Ya=0=1] cannot be directly computed in general
  - because Ya=1 and Ya=0 are unobserved in some subjects of the population
  - Pr[Ya=1=1] and Pr[Ya=0=1] are unobserved risks

Effect measures
- can be computed using data from ideal randomized experiments
  - with no assumptions
- More rigorously, effect measures can be consistently estimated using data from ideal randomized experiments
  - For now let’s consider experiments with near-infinite sample sizes only

What is an ideal randomized experiment?
- No loss to follow-up
- Full compliance with (adherence to) assigned exposure or treatment
- Double blind assignment

In ideal randomized experiments
- Pr[Ya=1=1] is equal to Pr[Y=1|A=1]
- Pr[Ya=0=1] is equal to Pr[Y=1|A=0]
- Therefore the associational RR Pr[Y=1|A=1] / Pr[Y=1|A=0] is equal to the causal RR Pr[Ya=1=1] / Pr[Ya=0=1]
- Let’s prove it
  - first we need to describe randomization

Randomization (I)
- One (near-infinite) population
- Divided into two groups
  - Group 1 and group 2
- Membership in each group is randomly assigned
  - e.g., by the flip of a coin
- One group is treated and the other untreated

Randomization (II)
- First option
  - Treat subjects in group 1, don’t treat subjects in group 2
  - The risk is, say, Pr[Y=1|A=1] = 0.57
- Second option
  - Treat subjects in group 2, don’t treat subjects in group 1
  - What is the value of the risk Pr[Y=1|A=1]?

Randomization (III)
- When group membership is randomly assigned, results are the same whether
  - group 1: treated, group 2: untreated
  - or vice versa
- Both groups are comparable or exchangeable
- Exchangeability is the consequence of randomization

Exchangeability
- Subjects in group 1 would have had the same risk as those in group 2 had they received the treatment of those in group 2
- The counterfactual risk in the treated equals the counterfactual risk in the untreated
  Pr[Ya=1|A=1] = Pr[Ya=1|A=0]
  A ∐ Ya, Ya ∐ A

Formal definition of exchangeability
- Ya ∐ A for all a
- Exchangeability implies lack of confounding
- Exchangeability is another causal concept that cannot be represented by associational (statistical) language

Proof: Why Pr[Y=1|A=a] = Pr[Ya=1]?
- Two steps:
  1. Pr[Y=1|A=a] = Pr[Ya=1|A=a], by definition (consistency)
  2. Pr[Ya=1|A=a] = Pr[Ya=1], by randomization (exchangeability)
- Step 2 not generally true in the absence of randomization

In an ideal randomized experiment
- Association is causation
  - because randomization produces exchangeability
- We have a method for causal inference!
  - No need for adjustments of any sort
  - Assumption-free

Example: Does heart transplant (A) increase 5-year survival (Y)?
- Select a large population of potential recipients of a transplant
- Get funding and IRB/Ethical approval
- Randomly allocate them to either transplant (A=1) or medical treatment (A=0)
- 5 years later, compute the associational RR Pr[Y=1|A=1] / Pr[Y=1|A=0]
  - that equals (consistently estimates) the causal RR Pr[Ya=1=1] / Pr[Ya=0=1]

Potential problems of real randomized experiments
a) Loss to follow-up
b) Noncompliance
c) Unblinding
d) Other: ethics, feasibility, cost…

Consequence of problems a), b), c)
- Exchangeability still holds in randomized experiments, but
  - “available association” may not be causation (loss to follow-up)
  - exposure is misclassified (noncompliance) or contaminated (unblinding)
- Causal inference from real randomized studies may require assumptions and analytic methods similar to those for causal inference from observational studies

Conclusion
- No clear-cut separation between randomized and observational studies
- Observational studies are needed
  - In fact, most of human knowledge comes from observations, e.g., evolution theory, plate tectonics theory, hot coffee may cause burns…
- And so are methods for causal inference from observational data

In observational studies
- Absence of randomization implies that exchangeability is not guaranteed
  - The exposed and the unexposed are not generally comparable
  - e.g., individuals who receive a heart transplant may have a more severe disease than those who do not receive it
- In general,
  - Pr[Ya=1=1] is not equal to Pr[Y=1|A=1]
  - Pr[Ya=0=1] is not equal to Pr[Y=1|A=0]
- Therefore the associational RR Pr[Y=1|A=1] / Pr[Y=1|A=0] is not generally equal to the causal RR Pr[Ya=1=1] / Pr[Ya=0=1]

In observational studies, can we assume exchangeability?
- Too strong an assumption!
- In general,
  Pr[Ya=1|A=1] ≠ Pr[Ya=1|A=0]
  i.e., Ya and A are not independent

Dead end?
- Exchangeability (a consequence of randomization) is a condition for causal inference
- Exchangeability is not generally an acceptable assumption in observational studies
- A condition weaker than exchangeability is needed for causal inference from observational data

In search of a weaker condition
- Consider only individuals with the same pre-exposure prognostic factors
- Then the exposed and the unexposed may be exchangeable
  - e.g., among individuals with an ejection fraction of 50%, those who do and do not receive a heart transplant may be comparable
  - e.g., among individuals with CD4 count<100, those who do and do not receive antiretroviral therapy may be comparable
- This is often reasonable
  - Especially if conditioning on many pre-exposure covariates L

Available data set (with covariates)
  ID   L   A   Y   Ya=0   Ya=1
  1    0   1   1   ?      1
  2    0   1   0   ?      0
  3    0   0   1   1      ?
  4    0   0   0   0      ?
  5    1   1   1   ?      1
  6    1   1   0   ?      0
  7    1   0   1   1      ?
  8    1   0   0   0      ?

Conditional exchangeability
- Within levels of the covariates L, exposed subjects would have had the same risk as unexposed subjects had they been unexposed, and vice versa
- Counterfactual risk is the same in the exposed and the unexposed with the same value of L
  Pr[Ya=1|A=1, L=l] = Pr[Ya=1|A=0, L=l]
  A ∐ Ya | L=l, Ya ∐ A | L=l

Formal definition of conditional exchangeability
- Ya ∐ A | L=l
- Conditional exchangeability is equivalent to randomization within levels of L
- It implies no unmeasured (residual) confounding within levels of the measured variables L

OK, conditional exchangeability is a weaker condition, so what?
- Conditional exchangeability is a necessary condition for causal inference from observational data
- Under conditional exchangeability in all strata L=l, we can compute (consistently estimate) the causal risk ratio Pr[Ya=1=1] / Pr[Ya=0=1]
- The assumption of conditional exchangeability implies
  Pr[Y=1|L=l, A=a] = Pr[Ya=1|L=l] for all a

Proof: Pr[Y=1|L=l, A=a] = Pr[Ya=1|L=l]
- Two steps:
  1. Pr[Y=1|L=l, A=a] = Pr[Ya=1|L=l, A=a], by definition of counterfactual variable
  2. Pr[Ya=1|L=l, A=a] = Pr[Ya=1|L=l], by assumption (conditional exchangeability)
- Same as for randomized studies but within levels of L

In an observational study
- Association is causation within levels of the covariates
  - under the assumption of conditional exchangeability
- We have a method for causal inference from observational data
  - that is not assumption-free
- But the need to rely on this assumption is not THE problem

Can we check whether conditional exchangeability holds?
- No
- Even if there is conditional exchangeability, there is no way we can know it with certainty
- This is THE problem
- (All we are saying is that there may be confounding due to unmeasured factors)

The assumption of conditional exchangeability is untestable
- That’s why causal inference from observational data is controversial
- Expert knowledge can be used to enhance the plausibility of the assumption
  - measure as many relevant pre-exposure covariates as possible
- Then one can only hope the assumption of conditional exchangeability is approximately true

Inverse probability weighting (IPW)
- A method to compute causal effects under conditional exchangeability
- Plan of action:
  - YOU will compute the causal risk ratio using IPTW in an observational study
  - i.e., you will compute Pr[Ya=1=1] / Pr[Ya=0=1] under conditional exchangeability
  - We will prove that you were right

A simplified example of observational study
- 500 HIV-infected individuals
- Variables:
  - L=1: CD4 cell count<200 cells/microL
  - A=1: on highly active antiretroviral therapy (HAART)
  - Y=1: AIDS
- Treatment status is decided after looking at CD4 cell count
- No loss to follow-up

The data summarized in a table
         L=0          L=1
         Y=1   Y=0    Y=1   Y=0
  A=1    20    30     108   252
  A=0    40    10     24    16

The data summarized in a tree
[Tree figure: leaf counts 10, 40, 30, 20 for L=0 and 16, 24, 252, 108 for L=1, the same eight cells as in the table]

Your goal
- To compute the effect of HAART on the risk of AIDS on the causal risk ratio scale
  Pr[Ya=1=1] / Pr[Ya=0=1]
- Assuming conditional exchangeability within levels of L
- First, compute Pr[Ya=0=1]
- Second, compute Pr[Ya=1=1]

[Tree figure: pseudopopulation leaf counts 20, 80, 60, 40 for L=0 and 160, 240, 280, 120 for L=1]

Pseudopopulation data analysis
         Ya=1   Ya=0
  a=1    160    340
  a=0    320    180
- Pr[Ya=1=1] = 160 / (160 + 340) = 0.32
- Pr[Ya=0=1] = 320 / (320 + 180) = 0.64
- Causal risk ratio = 0.32 / 0.64 = 0.5
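The pseudopopulation analysis can be reproduced from the eight cell counts of the HAART example. The Python sketch below is illustrative only: the dictionary layout and the function names (`prob_treatment`, `pseudo_risk`, `crude_risk`) are assumptions of this example, and each cell is scaled by one over the probability of its observed treatment given L, which is valid only under conditional exchangeability within levels of L. The crude (associational) risk ratio is computed alongside for comparison.

```python
# Cell counts from the HAART example, keyed by (L, A, Y).
counts = {
    (0, 1, 1): 20, (0, 1, 0): 30, (0, 0, 1): 40, (0, 0, 0): 10,
    (1, 1, 1): 108, (1, 1, 0): 252, (1, 0, 1): 24, (1, 0, 0): 16,
}


def prob_treatment(a, l):
    """f(a|l): proportion with exposure a among subjects with covariate l."""
    n_l = sum(n for (l_, _, _), n in counts.items() if l_ == l)
    n_al = sum(n for (l_, a_, _), n in counts.items() if l_ == l and a_ == a)
    return n_al / n_l


# Pseudopopulation: every cell is multiplied by W = 1 / f(A|L).
pseudo = {
    (l, a, y): n / prob_treatment(a, l) for (l, a, y), n in counts.items()
}


def pseudo_risk(a):
    """Risk of Y=1 in the pseudopopulation members with exposure a."""
    total = sum(n for (_, a_, _), n in pseudo.items() if a_ == a)
    cases = sum(n for (_, a_, y), n in pseudo.items() if a_ == a and y == 1)
    return cases / total


def crude_risk(a):
    """Observed risk Pr[Y=1|A=a] in the original, unweighted population."""
    total = sum(n for (_, a_, _), n in counts.items() if a_ == a)
    cases = sum(n for (_, a_, y), n in counts.items() if a_ == a and y == 1)
    return cases / total


causal_rr = pseudo_risk(1) / pseudo_risk(0)   # about 0.32 / 0.64
crude_rr = crude_risk(1) / crude_risk(0)      # about 0.312 / 0.711

print(f"IPTW risk ratio:  {causal_rr:.2f}")
print(f"crude risk ratio: {crude_rr:.2f}")
```

The weighted ratio matches the 0.5 obtained on the slide, while the crude ratio is about 0.44, so in these data the association differs from the effect computed under conditional exchangeability.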
Which assumption are you making?
- Conditional exchangeability in the population
  Ya ∐ A | L=l for all a
  - exposure is randomized within levels of L
  - no unmeasured confounding within levels of the measured variable L
- Within levels of L, the risk among the exposed if unexposed is the same as the risk among the unexposed in the population
  - and vice versa

Under conditional exchangeability…
- The observational study in the original population is a randomized experiment within levels of L
- The study in the pseudopopulation created by IPTW is a randomized experiment
  - Exposed and unexposed subjects are (unconditionally!) exchangeable
    - Because they are the same individuals
  - Exposure is randomized, i.e., equally probable across levels of the covariate L
    - There is no confounding
- In the pseudopopulation, causal effects can be estimated as in a randomized experiment
  - No need for adjustments of any sort

You did it
- You computed the causal risk ratio using inverse-probability-of-treatment weighting
- Right?
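The claim that exposure is "equally probable across levels of the covariate L" in the pseudopopulation can be checked numerically on the HAART example. The following Python sketch is one way to do so; the helper names (`treated_fraction`, `f`) are assumptions of this illustration, and the counts are those of the example's 2x2x2 table.

```python
# Cell counts from the HAART example, keyed by (L, A, Y).
counts = {
    (0, 1, 1): 20, (0, 1, 0): 30, (0, 0, 1): 40, (0, 0, 0): 10,
    (1, 1, 1): 108, (1, 1, 0): 252, (1, 0, 1): 24, (1, 0, 0): 16,
}


def treated_fraction(population, l):
    """Pr[A=1|L=l] in a population given as {(L, A, Y): size}."""
    n_l = sum(n for (l_, _, _), n in population.items() if l_ == l)
    n_treated = sum(
        n for (l_, a, _), n in population.items() if l_ == l and a == 1
    )
    return n_treated / n_l


def f(a, l):
    """f(a|l) in the original population."""
    p = treated_fraction(counts, l)
    return p if a == 1 else 1 - p


# Weight every cell by W = 1 / f(A|L) to obtain the pseudopopulation.
pseudo = {(l, a, y): n / f(a, l) for (l, a, y), n in counts.items()}

original = {l: treated_fraction(counts, l) for l in (0, 1)}
weighted = {l: treated_fraction(pseudo, l) for l in (0, 1)}

print("Pr[A=1|L] in original population:", original)
print("Pr[A=1|L] in pseudopopulation:  ", weighted)
```

In the original data the treated fraction differs between L=0 and L=1 (treatment depends on CD4 count); after weighting it is one half in both strata, so L no longer predicts A.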
W = 1 / f[A|L]
[Tree figure: pseudopopulation leaf counts, each shown with the weight applied to that leaf]
  Leaf count   Weight
  20           1/.5 = 2
  80           1/.5 = 2
  60           1/.5 = 2
  40           1/.5 = 2
  160          1/.1 = 10
  240          1/.1 = 10
  280          1/.9 = 1.11
  120          1/.9 = 1.11

Inverse-probability-of-treatment weights
- W = 1 / f[A|L]
- Each individual in the population is weighted to create W individuals in the pseudopopulation
- The denominator of your W is (informally) the probability of having your observed treatment value given your L value

Important difference
- W ≠ 1 / PS
  - Not equal for all individuals with same L value because it depends on A value as well
- Propensity score: PS = Pr[A=1|L]
  - The PS is the probability of being treated given L
  - Equal for all individuals with same L value because it does not depend on the A value

Notational clarification
- fA(a) or f(a) is the probability density function (pdf) of the random variable A evaluated at the value a
  - For discrete A: f(a) = Pr[A=a]
- We need to represent the probability that each subject had his/her own exposure level A
  - Pr[A=A] makes no sense
  - f(A) is the pdf evaluated at the random argument A, exactly what we mean

Proof: Inverse probability weighted mean equals counterfactual mean
  E[ I(A=a) Y / f(A|L) ]
  = E[ I(A=a) Ya / f(a|L) ]
      by definition (consistency); f(A|L) = f(a|L) when A=a
  = E{ E[ I(A=a) Ya / f(a|L) | L ] }
      E[X] = E[ E[X|Z] ]
  = E{ E[ I(A=a) | L ] E[ Ya | L ] / f(a|L) }
      by assumption (conditional exchangeability); E[XZ] = E[X] E[Z] if X and Z are independent
  = E{ E[ Ya | L ] }
      just algebra, since E[ I(A=a) | L ] = f(a|L)
  = E[ Ya ]
      E[X] = E[ E[X|Z] ]

The positivity condition
- Required for the proof
- In each level of L in the population, there must be exposed and unexposed individuals
  - If f(l)>0 then f(a|l)>0 for all a
  - conditional probabilities must be positive
- IPW cannot be used when the positivity condition is not met

IPW estimation
- In general, we refer to the causal risk ratio estimate obtained by using IPT weights W as an IPTW estimate
- The idea is a generalization of Horvitz-Thompson (1952) estimators for survey sampling

IPW as a simulation
- Weighting is the equivalent of simulating what would happen had all individuals in the population experienced every possible exposure level
- Individuals in the original population who received exposure level a are weighted to represent all individuals (regardless of exposure level) in the population
  - sample size of pseudopopulation is equal to number of exposure levels times the size of original population
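Two statements from these last slides lend themselves to a quick numerical check: positivity must hold before any weight can be formed, and the pseudopopulation is as large as the number of exposure levels times the original population. The Python sketch below is a hypothetical helper, not part of the original slides; the function name `ip_weights` and the `(L, A)` count layout are assumptions, and the example counts are the HAART data collapsed over Y.

```python
from collections import defaultdict


def ip_weights(counts):
    """Return {(l, a): 1 / f(a|l)}, checking positivity first.

    counts maps (l, a) to a number of subjects. The exposure levels are
    taken to be every value of a that appears anywhere in the data.
    """
    levels = sorted({a for (_, a) in counts})
    n_l = defaultdict(int)
    for (l, _), n in counts.items():
        n_l[l] += n

    weights = {}
    for l, total in n_l.items():
        if total == 0:
            continue  # f(l) = 0, so positivity imposes nothing here
        for a in levels:
            n_la = counts.get((l, a), 0)
            if n_la == 0:
                raise ValueError(
                    f"positivity violated: no subjects with A={a} in L={l}"
                )
            weights[(l, a)] = total / n_la  # 1 / f(a|l)
    return weights


# HAART example collapsed over Y: (L, A) -> number of subjects.
haart = {(0, 1): 50, (0, 0): 50, (1, 1): 360, (1, 0): 40}
w = ip_weights(haart)

original_size = sum(haart.values())
pseudo_size = sum(haart[cell] * w[cell] for cell in haart)
n_levels = len({a for (_, a) in haart})

print("weights:", w)
print("pseudopopulation size:", pseudo_size)
print("levels x original size:", n_levels * original_size)
```

With a stratum in which nobody is untreated, for example `ip_weights({(0, 1): 10, (0, 0): 10, (1, 1): 5})`, the function raises `ValueError` instead of returning weights, which mirrors the statement that IPW cannot be used when positivity is not met.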