Causal Data Mining
Richard Scheines
Dept. of Philosophy, Machine Learning, & Human-Computer Interaction
Carnegie Mellon

Predictive Data Mining
Finding predictive relationships in data
– What feature of student behavior predicts learning
– Who will default on credit cards
– Who will get an “A” in your course
– Which HS students will do well at CMU
– Do students cluster by “learning style”

Causal Data Mining
Finding causal relationships in data
– What feature of student behavior causes learning
– What will happen when we make everyone take a reading quiz before each class
– What will happen when we program our tutor to intervene to give hints after an error

Predictive Data Mining
Data:

      X1    X2   X3   ...  Xk    Y
1     1.7   28   M    ...  2.4   1
2     2.0   11   F    ...  1.1   0
3     1.9   17   F    ...  1.1   1
...
N     2.8   12   M    ...  1.8   0

Data --> Data Mining Search --> Predictive Model: Y = f(X1, X2, …, Xk)

Predictive Data Mining: Model Classes
1. Simple Regression
2. Locally Weighted Regression
3. Logistic Regression
4. Neural Nets
5. Support Vector Machines
6. Decision Trees
7. Bayes Net
8. Naïve Bayes Classifier
9. Independent Components
10. Clustering
11. Etc.
Data Mining Search --> Predictive Model: Y = f(X1, X2, …, Xk)

Predictive Data Mining
Data Mining Search --> Predictive Model under Constraints:
Y = f(X1, X2, …, Xk), e.g., f restricted to additive functions

Predictive Data Mining
Data Mining Search --> Predictive Model under Constraints:
Y = f(X1, X2, …, Xk),
or Probability Model under Constraints:
P(Y | X1, X2, …, Xk), where P is Gaussian, with mean 0

Predictive Data Mining: Decision Tree Search
[Figure: decision tree predicting P(Hosp.). Split variables: X-Ray (Pos / Neg), Age (threshold 57), Lab1 (threshold 2.3), Lab2 (thresholds 1.4 and 1.8). Leaf values: P(Hosp.) = .78, .10, .66, .59, .75, .05]

Predictive Data Mining ≠ Causal Data Mining
Conditioning is not the same as intervening
– Conditioning: P(Y | X1, X2, …, Xk)
– Intervening: P(Y | X1 set, X2, …, Xk)

Teeth Slides

Causal Discovery: Statistical Data --> Causal Structure
[Figure: flow diagram with the following elements]
– Data
– Statistical Inference
– Independence Relations: X1 _||_ X3 | X2
– Background Knowledge: X2 before X3; no unmeasured common causes
– Discovery Algorithm
– Causal Markov Axiom (D-separation)
– Equivalence Class of Causal Graphs over X1, X2, X3

Causal Discovery Software
TETRAD IV
www.phil.cmu.edu/projects/tetrad

Full Semester Online Course in Causal & Statistical Reasoning
• Course is tooled to record certain events: logins, page requests, print requests, quiz attempts, quiz scores, voluntary exercises attempted, etc.
• Each event was associated with attributes: Time, student-id, Session-id

Printing and Voluntary Comprehension Checks: 2002 --> 2003
[Figure: path diagrams for 2002 and 2003 over the variables pre, print, voluntary questions, quiz, and final. Edge coefficients as extracted: -.41, .75, .302, -.16, -.08, .353, .41, .323, .25]
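
Illustration: conditioning vs. intervening
The claim that conditioning is not the same as intervening can be checked with a small simulation. The sketch below is not from the slides: it assumes a hypothetical toy model in which a hidden common cause Z influences both X and Y, and X has no effect on Y at all. The probabilities (0.5, 0.9/0.1, 0.8/0.2) and the function names are illustrative choices only.

```python
import random


def simulate(n=200_000, seed=0, set_x=None):
    """Draw (x, y) pairs from a toy model Z -> X, Z -> Y (no X -> Y edge).

    If set_x is given, X is forced to that value regardless of Z,
    which models an intervention ("X set") rather than an observation.
    All probabilities here are made-up illustrative values.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z = rng.random() < 0.5                       # hidden common cause
        if set_x is None:
            x = rng.random() < (0.9 if z else 0.1)   # X depends on Z
        else:
            x = set_x                                # intervention cuts Z -> X
        y = rng.random() < (0.8 if z else 0.2)       # Y depends on Z only
        samples.append((x, y))
    return samples


def prob_y_given_x(samples, x_value):
    """Estimate P(Y = 1 | X = x_value) by simple counting."""
    ys = [y for x, y in samples if x == x_value]
    return sum(ys) / len(ys)


# Conditioning: observe the system passively and select cases with X = 1.
p_cond = prob_y_given_x(simulate(), True)

# Intervening: force X = 1 for every case, then look at Y.
p_do = prob_y_given_x(simulate(set_x=True), True)

print(f"P(Y=1 | X=1)     ~ {p_cond:.3f}")
print(f"P(Y=1 | X set=1) ~ {p_do:.3f}")
```

Under these assumed numbers, the conditional probability works out analytically to 0.9 × 0.8 + 0.1 × 0.2 = 0.74, while the interventional probability is 0.5, because setting X tells us nothing about Z. A predictive model fit to the observational data would therefore use X to predict Y quite well and still be wrong about what happens when X is set.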