Causality 14: Causal Graph Theory

Counterfactual and intervention
* "Vaccination prevented a flu infection", or more generally, "C caused E": what does this mean?
* -> In the closest possible world in which ~C is true, ~E is true.
* What is such a world? The world that differs from the actual world only with respect to C but is otherwise the same; the world in which C, and only C, is altered to ~C.

The idea of intervention
* An intervention "actualizes" the possible world "in which C, and only C, is altered to ~C".
* What does this world look like? We can create this world by making ~C true while preserving everything else.
* Causal claims are claims about the possible consequences of hypothetical interventions.
* "Vaccination prevents a flu infection" = if we get a person vaccinated, then she won't get the flu.
* "C causes E" = if we manipulate the system so as to make C true, then E will happen.

The interventionist theory of causation
* Woodward (2003), Making Things Happen.
* X is a cause of Y if there is a possible intervention on X that changes the probability of Y.
* E.g. vaccination is an (inhibitive) cause of flu <- a mandatory vaccination policy, if applied, reduces the probability of flu infection.
* The meteor strike was a cause of the dinosaur extinction <- if, by whatever means, something could have prevented the strike, the dinosaurs would not have gone extinct.
* "Interventions" need not be actual possibilities; it is OK if they are possible "in principle."

Problems
1. What are "interventions"?
2. How do we calculate the change in the probability of the putative effect under such interventions?
-> The causal graph theory

Causal Graph Theory: the basic framework
* Association/correlation: expressed by a probability distribution over variables.
* Causality: expressed by a graph connecting variables.
* The theory tries to:
  * understand the general relationship between probability and causality (graph);
  * infer a causal/graph structure from statistical data (associations);
  * evaluate the effect of a hypothetical intervention.
* Judea Pearl, Causality; Spirtes, Glymour, and Scheines, Causation, Prediction, and Search.

Causal graphs
* Recall Salmon's process view of causation: the Humean (regularity/probability) theory of causation is based on events, but Salmon claimed that causation cannot be reduced to events; rather, it is a process. A causes B only if there is a physical process from A to B.
* Let us represent these causal processes by arrows: A -> B.
* Causal processes among variables are represented by directed arrows.
* A causal system (a causally connected set of variables) is represented by a causal graph G.
* If A -> B, we say A is a direct cause of B.
* If there is a chain of arrows P -> A -> B -> ... -> Q, then P is a cause of Q, and Q is an effect of P.

Causality and probability
* Suppose the following causal graph: somebody makes a call to Jim (M) -> Jim answers the phone (A).
* This is a causal process, and we expect a probabilistic dependence between M and A: M and A are dependent, for we can predict A from M, and vice versa.
* It seems the causal process generates the probabilistic dependence.
* Now suppose a similar but slightly different causal graph: M -> Jim's phone rings (R) -> A.
* This is a causal process as well, and we still expect M and A to be probabilistically dependent.
* But we also expect M and A to become independent conditional on R.
* Why? If we know the phone rings (R), we already know that someone has made a call (M), even without knowing whether Jim answers the call (A).
* So R screens M off from A, and vice versa (recall that screening off is symmetric); a small simulation sketch follows below.
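To see the screening-off pattern concretely, here is a minimal simulation sketch. It is not from the lecture: the chain M -> R -> A is encoded with made-up probabilities for calling, ringing, and answering. The script estimates P(M) versus P(M | A) (they differ, so M and A are dependent) and P(M | R) versus P(M | R, A) (they coincide, so M and A are independent given R).

```python
# Minimal sketch of the chain M -> R -> A (made-up probabilities, for illustration only):
# someone calls Jim (M), the phone rings (R), Jim answers (A).
import random

random.seed(0)

def simulate(n=200_000):
    samples = []
    for _ in range(n):
        m = random.random() < 0.3                    # M: someone calls Jim
        r = random.random() < (0.95 if m else 0.01)  # R: the phone rings, (almost) only if called
        a = random.random() < (0.80 if r else 0.0)   # A: Jim answers, only if it rings
        samples.append((m, r, a))
    return samples

def prob(samples, event):
    """Relative frequency of an event among the samples."""
    return sum(1 for s in samples if event(s)) / len(samples) if samples else float("nan")

def cond_prob(samples, event, given):
    """Relative frequency of an event among the samples satisfying a condition."""
    return prob([s for s in samples if given(s)], event)

data = simulate()

# Marginally, M and A are dependent: learning A changes the probability of M.
print("P(M)        =", round(prob(data, lambda s: s[0]), 3))
print("P(M | A)    =", round(cond_prob(data, lambda s: s[0], lambda s: s[2]), 3))

# Conditional on R, M and A become (approximately) independent: R screens M off from A.
print("P(M | R)    =", round(cond_prob(data, lambda s: s[0], lambda s: s[1]), 3))
print("P(M | R, A) =", round(cond_prob(data, lambda s: s[0], lambda s: s[1] and s[2]), 3))
```

With these made-up numbers, P(M) is about 0.3 while P(M | A) is close to 1, but P(M | R) and P(M | R, A) agree up to sampling noise, because in the chain A depends on M only through R.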
Causality and probability (continued)
* As we saw with Reichenbach's common cause principle (CCP), there seems to be a relationship between causality and probability.
* Can we reformulate this relationship? What is the general relationship between a causal graph G and a probability distribution P?

The Markov condition
* Let V be a set of variables, G a causal graph over V, and P a probability distribution over V.
* P satisfies the Markov condition with respect to G if: every variable v in V, conditional on its direct causes, is independent of every variable other than its effects.

The Causal Markov Condition (or Assumption)
* If G represents the true causal structure among the variables V (and there is no hidden common cause, i.e. G exhausts all confounding), and P is a probability distribution over V, then G and P satisfy the Markov condition.
* This is the essential assumption of the causal graph theory, and it asserts a fundamental relationship between causality and probability.

The useful properties of the Markov condition
1. The Markov condition logically implies Reichenbach's common cause principle: if P and G are Markov and two variables X and Y are dependent, then X -> Y, or X <- Y, or X <- Z -> Y for some common cause Z.
2. The Markov condition gives the factorization of the joint probability: P(all variables) = P(effect | causes) P(causes), i.e. the joint is the product, over all variables v, of P(v | v's direct causes). This turns out to be useful when calculating the consequences of an intervention.

Interventions on a causal graph
* Recall our first question: what are "interventions"?
* Answer: an intervention is an alteration of the causal graph G. By intervening on X we dissociate X from the pre-existing mechanism and force it to take a fixed value x.
* An intervention on variable X removes all arrows pointing into X and fixes X to some predetermined value x; we denote this by do(X = x).

Intervention: an example
* (In class.)

Evaluating intervention effects
* The second question: how do we evaluate the effects of a given intervention?
* Suppose we know only the probability P over a set of variables V. Then we can calculate P(A | B = b), the probability of A given that B is observed to be b.
* This is different from the probability of A given that B is made to be b: intervention effects cannot be calculated from probability alone.
* But if we also know the causal graph G over V, intervention effects can be evaluated: Pearl's do calculus.

An exemplar setup for the do calculus
* Suppose we have three variables: S (smoking history), C (cancer), D (drinking habit).
* Assume we know P(S, C, D), the joint probability over the three variables, and G, the causal graph over {S, C, D}.
* We want to evaluate P'(C) = P(C | do(~S)), i.e. the probability of cancer if smoking is prohibited.

Procedure of the do calculus (see the sketch below)
1. Carry out the intervention on smoking history: remove all edges into S and set P'(S) = 0, P'(~S) = 1.
2. Calculate P'(S, C, D) in this modified graph, using the Markov factorization P(all variables) = P(effect | causes) P(causes): P'(S, C, D) = P(C | S, D) P'(S) P(D). The right-hand side is obtainable from our knowledge of P, together with the fixed P'(S).
3. The probability of cancer after banning smoking, P'(C), is obtained by summing P'(S, C, D) over S and D (ordinary probability calculus).
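The three steps can be traced with a small numerical sketch. Nothing below comes from the lecture: the causal graph is assumed to be S -> C <- D (matching the factorization P(C | S, D) P(S) P(D) used above), and all the probabilities are invented for illustration.

```python
# Sketch of the three-step do-calculus example, with made-up numbers.
# Assumed graph (not given explicitly in the lecture): S -> C <- D, so the
# Markov factorization is P(S, C, D) = P(C | S, D) * P(S) * P(D).
from itertools import product

# Hypothetical ingredients of the joint distribution P(S, C, D)
p_s = {1: 0.4, 0: 0.6}                      # P(S): smoking history
p_d = {1: 0.5, 0: 0.5}                      # P(D): drinking habit
p_c1_given = {(1, 1): 0.30, (1, 0): 0.20,   # P(C = 1 | S, D): cancer risk
              (0, 1): 0.10, (0, 0): 0.05}

def joint(s, c, d, ps):
    """Markov factorization P(C | S, D) * P(S) * P(D), with a pluggable P(S)."""
    pc1 = p_c1_given[(s, d)]
    return (pc1 if c == 1 else 1.0 - pc1) * ps[s] * p_d[d]

# Pre-intervention probability of cancer: marginalize the original joint over S and D.
p_c1 = sum(joint(s, 1, d, p_s) for s, d in product((0, 1), repeat=2))

# Step 1: intervene, do(~S) -- remove all edges into S (none in this graph)
#         and replace P(S) with the degenerate distribution P'(S)=0, P'(~S)=1.
ps_do = {1: 0.0, 0: 1.0}

# Step 2: the post-intervention joint is P'(S, C, D) = P(C | S, D) * P'(S) * P(D).
# Step 3: sum P'(S, C, D) over S and D to get P'(C = 1) = P(C = 1 | do(~S)).
p_c1_do = sum(joint(s, 1, d, ps_do) for s, d in product((0, 1), repeat=2))

print(f"P(C=1)          = {p_c1:.3f}")     # before the intervention
print(f"P(C=1 | do(~S)) = {p_c1_do:.3f}")  # after banning smoking
```

With these invented numbers, banning smoking lowers the probability of cancer from 0.145 to 0.075; the point is only to show the mechanics of replacing P(S) with the post-intervention P'(S) and then marginalizing.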
So...
1. The causal graph provides a means to formally represent interventions.
2. The causal Markov condition enables one to calculate the intervention effects (using the do calculus).

Summary
* In the interventionist account, X is a cause of Y if we can change P(Y) by intervening on X.
* The intervention can be represented as a modification of the causal graph.
* The causal graph theory assumes a fundamental relationship between causality and probability, i.e. the causal Markov condition.
* Effects of interventions can be evaluated by the do calculus.
* This is not possible with probability alone: we need the causal graph and the Markov condition.