Bill Shipley, Département de biologie, Université de Sherbrooke, Sherbrooke (Qc), Canada

[Figure: log-log scatterplot of the number of murders per year per city against the number of churches per city, with the fitted line Ln(murders) = 0.009 + 0.99*Ln(churches). Diagram: Pop size --> Number of churches; Pop size --> Number of murders.]
Passive prediction is possible ONLY if the underlying causal processes are constant.
New causal context...

[Figure: a 3-D object casting a 2-D shadow.]
What the audience sees: the 2-D shadow. Hidden from view: the 3-D object.
What the scientist sees: the “2-D” correlational shadow. Hidden from view: the “3-D” causal process.
[Diagram: causal graph over A, B, C, D, E]
B & C correlated, but independent given A
A & D correlated, but independent given B & C
And so on...

R.A. Fisher, Statistical Methods for Research Workers (1925)
15 plots with treatment (+fertilizer & water)
15 plots without treatment (+water)
Treatment: 80 g ± 6
Control: 55 g ± 6
T-test: p<0.0001
Nitrogen fertilizer --> Crop growth
[Diagram: Nitrogen fertilizer -?-> Crop growth. Random numbers decide which plots receive the treatment; three arrows are crossed out (X).]

Experimental (observational) unit: the UNIT to which the treatment is applied.
variable 1, variable 2, …, variable n
N fertilizer
N, P, K... Worms...
No causal inferences between variables within the experimental unit.

THE PLANT
[Diagram: Nitrogen fertilizer, Nitrogen absorption, Photosynthetic enzymes, Carbon fixation, Seed yield]
[Diagrams: Scenario 1, Scenario 2 and Scenario 3, three alternative causal arrangements of Fertilizer addition, Nitrogen absorption and Photosynthetic enzymes]

The experimental method (La méthode expérimentale)
Claude Bernard, 1813 - 1878
[Diagram, shown three times, the last time with one link crossed out (X): Color of blood in the renal vein before entering the kidney; Active/inactive state of the kidney; Color of blood in the renal vein upon exiting the kidney]

1. Hypothesize a causal structure. [Diagrams: four candidate structures over A, B, C]
2. Measure the correlations between the variables in their natural state.
3. Predict how these correlations will change if various physical manipulations hold different variables constant.
4. After controlling the variables, compare the new correlations to the predictions made under the hypothesized causal structure.
5. If any of the predicted changes in the correlational structure disagree with the observed changes, then reject the causal structure.
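Steps 2 to 4 can be illustrated with a small simulation. This is only a sketch under stated assumptions: the chain A --> B --> C, the linear equations, the noise levels and the function names `pearson` and `simulate_chain` are all hypothetical and do not come from the slides. It measures the correlation between A and C in the natural state, then again when B is physically held constant, which under this hypothesis should make the correlation vanish.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_chain(n, hold_b_constant, rng):
    """Draw n observations of (A, C) from the assumed chain A --> B --> C."""
    a_vals, c_vals = [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        # Natural state: B responds to A.
        # Manipulation: B is fixed at 0 whatever the value of A.
        b = 0.0 if hold_b_constant else a + rng.gauss(0, 0.5)
        c = b + rng.gauss(0, 0.5)
        a_vals.append(a)
        c_vals.append(c)
    return a_vals, c_vals


rng = random.Random(1)  # fixed seed so the run is repeatable
r_natural = pearson(*simulate_chain(5000, False, rng))
r_controlled = pearson(*simulate_chain(5000, True, rng))

print(f"r(A,C) in the natural state: {r_natural:.2f}")
print(f"r(A,C) with B held constant: {r_controlled:.2f}")
```

If the observed correlation between A and C stayed strong while B was held constant, step 5 would reject the chain hypothesis.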
[Diagrams: Causal hypothesis 1 and Causal hypothesis 2, two alternative structures relating sex, body size in autumn and survival to spring, each with its other causes]

[Diagram: Quantity and quality of summer forage; Body weight in the autumn; Probability of survival until spring]
[Figure: bivariate density surface Z over X and Y, with Z = f(X,Y) = f(X)f(Y)]
[Figure: scatterplots of survival (%) and amount of forage (kg) against body mass (kg), and a plot of the “residuals of Y given X”: survival for a constant body mass against forage quality for a constant body mass]

“3-D” causal process and “2-D” correlational shadow (shown once for hypothesis testing and once for hypothesis generation)
[Diagram: causal graph over A, B, C, D, E]
B & C independent given A
A & D independent given B & C
B & E independent given D
and so on...

[Figure: two bivariate density surfaces. Independent variables: Z = f(X,Y) = f(X)f(Y). Dependent variables: Z = f(X,Y) ≠ f(X)f(Y).]
$$P(\mathbf{x}; \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{n/2} |\boldsymbol{\Sigma}|^{1/2}} \, e^{-\frac{1}{2} [\mathbf{x}-\boldsymbol{\mu}]^{T} \boldsymbol{\Sigma}^{-1} [\mathbf{x}-\boldsymbol{\mu}]}$$

The dangers of mistranslation between languages...
French “demande” vs. English “demand”

Probability distributions
• Deal only in information content conditional on other information
• NOT causal relationships
• There is no notion of a causal (asymmetric) relationship in probability theory
• Consistently mistranslate “X-->Y” as “Y=f(X)”

Bill Gates worth 1,000,000,000$
(machine translation into another language)
(machine translation back into English)
Payment request for doors in the fence worth 1,000,000,000$

Rain --> Mud <-- Other causes of mud
Mud (cm) = 0.1 Rain (cm) + N(0, 0.1)
Rain (cm) = 10 Mud (cm) + N(0, 1)

1. Express causal claims using graph theory (directed acyclic graphs - DAGs).
   Property: asymmetric relationships
   A --> B --> C
2. Apply a graph-theoretic operator (d-separation) on this graph.
   A_||_C|B (A is separated from C given B in the graph)
3. If two vertices (X,Y) in this DAG are d-separated given a set Q of other vertices, then variables X and Y are probabilistically independent given the set Q of conditioning variables in ANY multivariate probability distribution generated by the DAG.
4. There always exists a basis set B of d-separation claims for the DAG that together completely specify the joint probability distribution over the variables represented by the DAG.
   B={A_||_C|B..} implies P(X,Y,Z)
5. Test the independence claims implied by the graphical model against those observed in the data.
   - if there are significant differences, reject the causal model;
   - if there aren’t significant differences, tentatively accept the causal model (and continue testing…)
6. Now, translate the graphical model into prediction equations.
   A --> B --> C
   A = e1
   B = f(A) + e2
   C = f(B) + e3
7. The independence claims in the DAG are local; therefore, to change the causal structure, simply rewrite the DAG and then go back to step 6.
   A    B --> C
   A = e1
   B = e2
   C = f(B) + e3

[Figure, repeated: number of murders per year per city against number of churches per city, Ln(murders) = 0.009 + 0.99*Ln(churches), with Pop size as the common cause. Passive prediction ONLY if the underlying causal processes are constant.]

A few definitions...
[Diagram: graph over A, B, C, D, E]
If you can follow the arrows from i to j then there is a directed path from i to j.
If you can go from i to j while ignoring the direction of the arrows then there is an undirected path from i to j.
Directed path from: A to C (NOT from A to E); E to C (NOT from E to A)
Undirected path from: A to E; E to A

A few definitions...
[Diagrams over A, B, C, D, E illustrating: Non-collider vertex; Unshielded collider vertex; Shielded collider vertex]
[Diagrams over A, B, C, D, E illustrating: Causal children of A and NOT causal children of A; Causal children of E and NOT causal children of E; Causal ancestors of C]

State of a vertex:
A non-collider vertex allows causal influence to flow through it (naturally ON); conditioning (holding constant) blocks causal influence through it (turns OFF).
A collider vertex prevents causal influence from flowing through it (naturally OFF); conditioning (holding constant) allows causal influence through it (turns ON).

Rain --> Mud <-- Water hose
Mud not held constant:
1. It rained
2. Therefore mud
3. No idea about water hose
Mud held constant (known):
1. It didn’t rain
2. There was mud
3. Therefore the water hose was on

Are X and Y d-separated given a set Q={A, B, …} of conditioning vertices?
1. List all undirected paths between X and Y.
For each such undirected path...
2. Are there any non-colliders along this path that are in Q? If yes, the path is blocked; go to the next undirected path.
3. For every collider along this path, is the collider itself or one of its causal descendants in Q? If no, then the path is blocked; go to the next undirected path.
If all undirected paths between X and Y are blocked by Q then X and Y are d-separated by Q.
If X and Y are d-separated by Q, then they are probabilistically independent given Q in any probability distribution generated by the graph.

[Diagram: DAG over A, B, C, D, E; A is a non-collider between B and C]
Are B & C d-separated given A? B_||_C|{A}?
YES. B & C are d-separated given A; therefore B & C will be independent conditional on A.

[Diagram: the same DAG; D is a collider between B and C]
Are B & C d-separated given D? B_||_C|{D}?
NO. B & C are not d-separated given D; therefore B & C will be dependent conditional on D.

A_||_E|{D}? YES
A_||_E|{D,B}? YES
B_||_C|{A,D}? NO
B_||_C|{A,E}? NO
D_||_A|{B}? NO
E_||_B|{D}? YES
… and so on for every unique pair (X,Y) conditioned on every unique subset of the remaining variables...
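The path-by-path check above can be sketched in a few lines of Python. This is an illustrative sketch, not the author's software: the function names are hypothetical, and the edge list `DAG` (A --> B, A --> C, B --> D, C --> D, D --> E) is an assumption inferred from the YES/NO answers listed on the slides, because the arrows of the original diagram did not survive extraction.

```python
def descendants(dag, v):
    """All vertices reachable from v by following the arrows."""
    seen, stack = set(), list(dag.get(v, ()))
    while stack:
        w = stack.pop()
        if w not in seen:
            seen.add(w)
            stack.extend(dag.get(w, ()))
    return seen


def undirected_paths(dag, x, y):
    """Yield every simple path from x to y, ignoring arrow direction."""
    nbrs = {}
    for parent, children in dag.items():
        for child in children:
            nbrs.setdefault(parent, set()).add(child)
            nbrs.setdefault(child, set()).add(parent)

    def walk(path):
        if path[-1] == y:
            yield path
            return
        for n in nbrs.get(path[-1], ()):
            if n not in path:
                yield from walk(path + [n])

    yield from walk([x])


def path_blocked(dag, path, q):
    """Apply steps 2 and 3 of the slide to one undirected path."""
    for prev, v, nxt in zip(path, path[1:], path[2:]):
        is_collider = v in dag.get(prev, ()) and v in dag.get(nxt, ())
        if is_collider:
            # A collider blocks unless it, or one of its descendants, is in Q.
            if v not in q and not (descendants(dag, v) & q):
                return True
        elif v in q:
            # A non-collider blocks as soon as it is conditioned on.
            return True
    return False


def d_separated(dag, x, y, q=()):
    """True if every undirected path between x and y is blocked by q."""
    q = set(q)
    return all(path_blocked(dag, p, q) for p in undirected_paths(dag, x, y))


# Assumed DAG, written as parent -> list of children.
DAG = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}

for x, y, q in [("A", "E", {"D"}), ("B", "C", {"A", "D"}), ("D", "A", {"B"})]:
    print(f"{x}_||_{y}|{sorted(q)}?", "YES" if d_separated(DAG, x, y, q) else "NO")
```

Enumerating every path is exponential in the worst case, which is acceptable for a five-vertex graph but not for large ones.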
Number of such claims for V = 5 vertices:
$$\binom{V}{2} \sum_{x=0}^{V-2} \binom{V-2}{x} = 10 \times [1 + 3 + 3 + 1] = 80$$

Basis set: the smallest set of d-separation claims in a DAG that, together, imply all others.
[Diagram: DAG over A, B, C, D, E]
If you know the basis set, then you can specify the entire structure of the joint probability distribution that is generated by the directed acyclic graph.
Therefore, you can test the causal structure by testing the d-separation claims given in the basis set.

Special basis set: BU = {X_||_Y|{Pa(X) U Pa(Y)}} for every pair X,Y of vertices not directly connected
(each unique pair of non-adjacent vertices, conditioned on the set of parents of both)
BU={A_||_D|{B,C}, A_||_E|{D}, B_||_C|{A}, B_||_E|{A,D}, C_||_E|{A,D}}

List basis set BU    Convert to probabilistic claims    Calculate probability of each claim in data
A_||_D|{B,C}         rA,D|{B,C}=0                       p1=0.23
A_||_E|{D}           rA,E|D=0                           p2=0.50
B_||_C|{A}           rB,C|A=0                           p3=0.001
B_||_E|{A,D}         rB,E|A,D=0                         p4=0.45
C_||_E|{A,D}         rC,E|A,D=0                         p5=0.12

Calculate:
$$C = -2 \sum_{i=1}^{k} \ln(p_i)$$
C = 23.98, k = 5

IF all d-sep claims in the graph are true in the data, then C follows a chi-squared distribution with 2k degrees of freedom.
THEREFORE if the probability of C is below the significance level, the causal structure is rejected by the data.
THEREFORE if the probability of C is above the significance level, the causal structure is consistent with the data.
$\chi^2$ of 23.98 with 10 degrees of freedom gives p=0.008: REJECT causal structure

Claude Bernard, Ronald Fisher, Karl Pearson, Sewall Wright, Clark Glymour, Judea Pearl
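The C statistic and its probability for the five p-values listed above can be reproduced with the standard library alone. This is a minimal sketch: the function names `fisher_c` and `chi2_sf_even_df` are hypothetical, and the tail probability uses the closed-form sum that holds for a chi-squared distribution with an even number of degrees of freedom, which always applies here because the degrees of freedom are 2k.

```python
import math


def fisher_c(p_values):
    """C = -2 * sum(ln(p_i)) over the k independence claims of the basis set."""
    return -2.0 * sum(math.log(p) for p in p_values)


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variable with even df.

    For df = 2k: P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    """
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j) for j in range(df // 2))


# The p-values of the five d-separation claims from the slide.
p_values = [0.23, 0.50, 0.001, 0.45, 0.12]

c_stat = fisher_c(p_values)
p_model = chi2_sf_even_df(c_stat, 2 * len(p_values))

print(f"C = {c_stat:.2f} on {2 * len(p_values)} degrees of freedom")
print(f"p = {p_model:.3f}")
print("REJECT causal structure" if p_model < 0.05 else "consistent with the data")
```

The 0.05 threshold in the last line is an assumed significance level; the slides do not state which level was used.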