Can causal models be evaluated?
Isabelle Guyon, ClopiNet / ChaLearn
http://clopinet.com/causality
[email protected]

Acknowledgements and references
1) Feature Extraction, Foundations and Applications. I. Guyon, S. Gunn, et al., Springer, 2006. http://clopinet.com/fextract-book
2) Causation and Prediction Challenge. I. Guyon, C. Aliferis, G. Cooper, A. Elisseeff, J.-P. Pellet, P. Spirtes, and A. Statnikov, Eds. CiML, volume 2, Microtome, 2010. http://www.mtome.com/Publications/CiML/ciml.html
http://gesture.chalearn.org
Co-founders: Constantin Aliferis, André Elisseeff, Gregory F. Cooper, Alexander Statnikov, Jean-Philippe Pellet, Peter Spirtes.
ChaLearn directors and advisors: Alexander Statnikov, Ioannis Tsamardinos, Richard Scheines, Frederick Eberhardt, Florin Popescu.

Preparation of ExpDeCo: experimental design in causal discovery
• Motivations
• Quiz
• What we want to do (next challenge)
• What we already set up (virtual lab)
• What we could improve
• Your input…
Note: experiment = manipulation = action.

Causal discovery motivations (1): interesting problems
What affects… your health? …the economy? …the climate? Which actions will have beneficial effects?
Predict the consequences of (new) actions:
• Predict the outcome of actions
  – What if we ate only raw foods?
  – What if we required all cars to be painted white?
  – What if we broke up the Euro?
• Find the best action to obtain a desired outcome
  – Determine treatments (medicine)
  – Determine policies (economics)
• Predict counterfactuals
  – A man who was not wearing his seatbelt died in a car accident. Would he have died had he worn it?

Causal discovery motivations (2): lots of data available
http://data.gov
http://data.uk.gov
http://www.who.int/research/en/
http://www.ncdc.noaa.gov/oa/ncdc.html
http://neurodatabase.org/
http://www.ncbi.nlm.nih.gov/Entrez/
http://www.internationaleconomics.net/data.html
http://www-personal.umich.edu/~mejn/netdata/
http://www.eea.europa.eu/data-and-maps/

Causal discovery motivations (3): classical ML is helpless
[Figure: candidate variables X and target Y.]
Predict the consequences of actions: under "manipulations" by an external agent, only causes are predictive; consequences and confounders are not.
• If manipulated, a cause influences the outcome…
• … a consequence does not…
• … and neither does a confounder (a consequence of a common cause).
• Special case: stationary or cross-sectional data (no time series).
• Superficially, the problem resembles a classical feature selection problem. [Figure: data matrix X with dimensions n, n', m.]

Quiz: what could be the causal graph?
Could it be this? [Candidate graph over X1, X2, Y.]
Let's try. [Scatter plots of x1 vs. x2 by class of Y.] Simpson's paradox: X1 || X2 | Y.
Could it be that? [Alternative graph over X1, X2, Y.]
Let's try. [Scatter plots of x1 vs. x2 by class of Y.]
Plausible explanation: X2 || Y, X2 || Y | X1. [Plot of x2 (baseline) vs. x1 (peak), classes Y = disease / normal; variables: baseline (X2), health (Y), peak (X1).]

What we would like
[Scatter plot and causal graph over X1, X2, Y.]
Manipulate X1: [resulting distribution.]
Manipulate X2: [resulting distribution.]
(A toy simulation of such manipulations is sketched at the end of this section.)

What we want to do: causal data mining
How are we going to do it?
• Obstacle 1 (practical): there are many different statements of the "causality problem".
• Obstacle 2 (fundamental): it is very hard to assess solutions.

Evaluation
• Experiments are often:
  – costly,
  – unethical,
  – infeasible.
• Non-experimental "observational" data is abundant and costs less.
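The point that only causes remain predictive under manipulation can be illustrated with a toy structural model. The chain X1 → Y → X2, the noise levels, and the `simulate` helper below are hypothetical choices of mine, a minimal sketch rather than anything taken from the slides:

```python
# Toy structural model X1 -> Y -> X2 (hypothetical illustration, not from the slides).
import numpy as np

rng = np.random.default_rng(0)
m = 100_000

def simulate(do_x1=None, do_x2=None):
    """Draw from the toy model X1 -> Y -> X2, optionally forcing X1 or X2."""
    x1 = rng.normal(size=m) if do_x1 is None else np.full(m, do_x1, dtype=float)
    y = x1 + 0.5 * rng.normal(size=m)               # Y is caused by X1
    x2 = y + 0.5 * rng.normal(size=m)               # X2 is caused by Y
    if do_x2 is not None:                           # forcing X2 does not feed back into Y
        x2 = np.full(m, do_x2, dtype=float)
    return x1, y, x2

# Observational data: both X1 (cause) and X2 (consequence) predict Y.
x1, y, x2 = simulate()
print("corr(X1, Y):", round(np.corrcoef(x1, y)[0, 1], 2))
print("corr(X2, Y):", round(np.corrcoef(x2, y)[0, 1], 2))

# Manipulate the cause X1: the distribution of Y shifts.
_, y_do_x1, _ = simulate(do_x1=2.0)
print("E[Y], E[Y | do(X1=2)]:", round(y.mean(), 2), round(y_do_x1.mean(), 2))

# Manipulate the consequence X2: the distribution of Y is unchanged.
_, y_do_x2, _ = simulate(do_x2=2.0)
print("E[Y], E[Y | do(X2=2)]:", round(y.mean(), 2), round(y_do_x2.mean(), 2))
```

A classical predictor trained on the observational sample would happily rely on X2; only a causal model gets the consequences of manipulating X1 and X2 right, which is exactly what the challenge aims to reward.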
New challenge: ExpDeCo (experimental design in causal discovery)
• Goal: find variables that strongly influence an outcome.
• Method:
  – Learn from a "natural" distribution (observational data).
  – Predict the consequences of given actions (checked against a test set of "real" experimental data).
  – Iteratively refine the model with experiments (using on-line learning from experimental data).

What we have already done
[Figure: the virtual lab — models of systems (e.g. the LUCAS lung-cancer network) answering queries against a database.]
http://clopinet.com/causality
• February 2007: project starts; Pascal2 funding.
• August 2007: two-year NSF grant.
• December 2007: workbench alive; 1st causality challenge.
• September 2008: 2nd causality challenge (Pot luck).
• Fall 2009: virtual lab alive.
• December 2009: Active Learning Challenge (Pascal2).
• December 2010: Unsupervised and Transfer Learning Challenge (DARPA).
• Fall 2012: ExpDeCo (Pascal2).
• Planned: CoMSiCo.

What remains to be done: ExpDeCo (new challenge)
Setup: several paired datasets (preferably real data), each with:
• a "natural" distribution,
• a "manipulated" distribution.
Problems:
– Learn a causal model from the natural distribution.
– Assessment 1: test on the natural distribution.
– Assessment 2: test on the manipulated distribution.
– Assessment 3: on-line learning from the manipulated distribution (sequential design of experiments).

Challenge design constraints
- Largely not relying on "ground truth", which is difficult or impossible to obtain for real data
- Not biased towards particular methods
- Realistic setting, as close as possible to actual use
- Statistically significant, not involving "chance"
- Reproducible on other, similar data
- Not specific to very particular settings
- No cheating possible
- Capitalizing on classical experimental design

Lessons learned from the Causation and Prediction Challenge
[Table: challenge datasets and toy datasets.]

Assessment with manipulations (artificial data)
[Figure: LUCAS0, the natural distribution of the LUCAS network (Anxiety, Peer Pressure, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Born an Even Day, Coughing, Fatigue, Car Accident).]
[Figure: LUCAS1, a manipulated version of the same network.]
[Figure: LUCAS2, another manipulated version.]

Assessment with ground truth
• We define V = the variables of interest (the theoretical minimal set of predictive variables, e.g. the Markov blanket, the direct causes, ...).
• Participants score feature relevance: S = an ordered list of features.
• We assess causal relevance with AUC = f(V, S). (A sketch of such a score appears at the end of this section.)

Assessment without manipulations (real data): using artificial "probes"
[Figure: LUCAP0, the natural distribution of the LUCAS network augmented with artificial probe variables P1, P2, P3, …, PT.]
[Figure: LUCAP1 and LUCAP2, manipulated versions of the same network.]
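The AUC-based assessment can be made concrete with a short sketch. This is not the official challenge scoring code; it simply treats membership in a reference set (the ground-truth set V on artificial data, or the non-probe variables on real data) as the positive class and the position in the submitted ordered list S as the score. The `relevance_auc` helper and all variable indices below are hypothetical:

```python
# Minimal sketch of AUC-based relevance scoring (not the official challenge code).
from sklearn.metrics import roc_auc_score

def relevance_auc(ranked, reference, n_features):
    """AUC of a submitted ranking against a reference set of relevant variables."""
    # Higher score = ranked earlier; features absent from the ranking get 0.
    scores = [0.0] * n_features
    for rank, f in enumerate(ranked):
        scores[f] = float(n_features - rank)
    labels = [1 if f in reference else 0 for f in range(n_features)]
    return roc_auc_score(labels, scores)

# Artificial data: the ground truth V is known and serves as the reference.
V = {2, 5, 7}                     # hypothetical set of truly relevant variables
S = [5, 2, 9, 7, 0]               # hypothetical submitted ordered list
print(relevance_auc(S, V, n_features=10))

# Real data: no ground truth, but the artificial probes are known to be
# irrelevant, so the non-probe variables serve as a (noisy) positive class.
probes = {6, 8, 9}                # hypothetical probe indices
print(relevance_auc(S, set(range(10)) - probes, n_features=10))
```

With probes, the resulting score only approximates the one we would compute with true causal labels; the relation between the two is made explicit in the next slide.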
Scoring using "probes"
• What we can compute (Fscore):
  – Negative class = probes (here, all "non-causes", all manipulated).
  – Positive class = the other variables (which may include both causes and non-causes).
• What we want (Rscore):
  – Positive class = causes.
  – Negative class = non-causes.
• What we get (asymptotically):
  Fscore = (NTruePos/NReal) · Rscore + 0.5 · (NTrueNeg/NReal),
  where NReal is the number of real (non-probe) variables, of which NTruePos are causes and NTrueNeg are non-causes. Intuitively, if the probes behave like real non-causes, a real cause outranks a random probe with probability Rscore, while a real non-cause outranks one with probability 0.5.

Pairwise comparisons
[Figure: pairwise comparison of challenge participants — Gavin Cawley, Yin-Wen Chang, Mehreen Saeed, Alexander Borisov, E. Mwebaze & J. Quinn, H. Jair Escalante, J. G. Castellano, Chen Chu An, Louis Duclos-Gosselin, Cristian Grozea, H. A. Jen, J. Yin & Z. Geng Gr., Jinzhu Jia, Jianming Jin, L.E.B & Y.T., M.B., Vladimir Nikulin, Alexey Polovinkin, Marius Popescu, Ching-Wei Wang, Wu Zhili, Florin Popescu, CaMML Team, Nistor Grozavu.]

Causal vs. non-causal
[Figure: comparison of a causal entry (Jianxin Yin) and a non-causal entry (Vladimir Nikulin).]

Insensitivity to irrelevant features
• Setting: a simple univariate predictive model, binary target and features; all relevant features correlate perfectly with the target; all irrelevant features are drawn at random.
• With 98% confidence, |feature weight| < w and Σi wi xi < v, for bounds w and v that depend on ng (the number of "good", relevant features), nb (the number of "bad", irrelevant features), and m (the number of training examples). (A small simulation of this setting is sketched after the conclusion.)

How to overcome this problem?
• Learning curves in terms of the number of features revealed:
  – without re-training on manipulated data,
  – with on-line learning on manipulated data.
• Give pre-manipulation variable values and the value of the manipulation.
• Other metrics: stability, residuals, instrumental variables, features missing by design.

Conclusion (more: http://clopinet.com/causality)
• We want causal discovery to become "mainstream" data mining.
• We believe we need to start with "simple", standard evaluation procedures.
• Our design is close enough to a typical prediction problem, but with:
  – training on the natural distribution,
  – testing on the manipulated distribution.
• We want to avoid the pitfalls of previous challenge designs:
  – reveal only pre-manipulation variable values,
  – reveal variables progressively, "on demand".
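As promised above, here is a small simulation of the "insensitivity to irrelevant features" setting. The weighting scheme (wi = empirical correlation of feature i with the ±1 target) and the sizes m, ng, nb are assumptions of mine, chosen only to illustrate how the irrelevant-feature weights and their aggregate vote concentrate as described:

```python
# Simulation of the "insensitivity to irrelevant features" setting.
# ASSUMPTIONS (not from the slides): univariate weight w_i = mean(x_i * y)
# with +/-1 coding; the sizes m, n_g, n_b are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
m, n_g, n_b = 200, 10, 1000      # training examples, relevant and irrelevant features

y = rng.choice([-1, 1], size=m)                       # binary target
X_good = np.tile(y[:, None], (1, n_g))                # relevant features: copies of y
X_bad = rng.choice([-1, 1], size=(m, n_b))            # irrelevant features: random +/-1
X = np.hstack([X_good, X_bad])

w = (X * y[:, None]).mean(axis=0)    # w_i = 1 for relevant features, ~N(0, 1/m) otherwise

print("largest |w_i| over irrelevant features :", np.abs(w[n_g:]).max())
print("98th percentile of |w_i| (irrelevant)  :", np.quantile(np.abs(w[n_g:]), 0.98))

# On a fresh point, the aggregate vote of the irrelevant features stays small
# compared with the sum of the relevant-feature weights (which equals n_g).
x_new_bad = rng.choice([-1, 1], size=n_b)
print("|sum_i w_i x_i| over irrelevant features:", abs(w[n_g:] @ x_new_bad))
print("sum of relevant-feature weights         :", w[:n_g].sum())
```

In this toy run the irrelevant weights scale like 1/√m and their combined vote like √(nb/m), which is the kind of concentration the 98%-confidence bounds on w and v express.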