Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Clusterpoint wikipedia , lookup
Data analysis wikipedia , lookup
Entity–attribute–value model wikipedia , lookup
Information privacy law wikipedia , lookup
Versant Object Database wikipedia , lookup
Data vault modeling wikipedia , lookup
Business intelligence wikipedia , lookup
Open data in the United Kingdom wikipedia , lookup
Tutorial Website https://oschulte.github.io/srl-tutorialslides/ Learning Bayesian Networks for Complex Relational Data Tutorial Introduction Presenters: Oliver Schulte, Ted Kirkpatrick School of Computing Science Simon Fraser University Vancouver-Burnaby, Canada Overview Learning Bayesian Networks for Complex Relational Data The Short Story Many organizations keep their data in a relational database. We describe methods for learning a Bayesian network for data in a relational database. Simultaneous joint statistical analysis of multiple interrelated tables. Learning Bayesian Networks for Complex Relational Data 3 Questions Considered 1.Semantics: how do you interpret a relational/first-order Bayesian network? 2.How can you use it? 3.What are the statistical challenges for learning? 4.What are the computational challenges for learning? Learning Bayesian Networks for Complex Relational Data 4 Motivation Learning Bayesian Networks for Complex Relational Data Database Management Systems Maintain data in linked tables. Structured Query Language (SQL) allows fast data retrieval. E.g., find all movie ratings > 4 where the user is a woman. Multi-billion dollar industry, $Bn15+ in 2006. IBM, Microsoft, Oracle, SAP, Peoplesoft. Learning Bayesian Networks for Complex Relational Data 6 Beyond storing and retrieving data Much new interest in analyzing databases. Data Mining. Data Warehousing. Business Intelligence. Predictive Analytics. Learning Bayesian Networks for Complex Relational Data 7 Unifying Logic and Probability • Fundamental Question in AI: how to combine logic and probability and learning? Statistical-Relational Learning • Domingos (U of W, CS): “Logic handles complexity, probability represents uncertainty.” • Recent survey paper by Stuart Russell Learning Bayesian Networks for Complex Relational Data 8 Query Examples Learning Bayesian Networks for Complex Relational Data Sample Queries Inference in a Bayesian network computes answers to probabilistic queries A Bayesian network for relational data can answer relational probabilistic queries We give some examples of relational and nonrelational queries Learning Bayesian Networks for Complex Relational Data 10 Single-Table Queries (Not relational) Query P(Drama(Movie) = T|RunTime(Movie) = Long) English Paraphrase The probability that a movie is a drama, given that it is long. P(Country(Actor) = U.S.|gender(Actor)=W) The probability that an actor is from the US, given that her gender is woman. 11 Cross-Table Queries (Movies) Query English Paraphrase Positive relationship P(Drama(Movie) = T| RunTime(Movie) = Long, ActsIn(Movie,”brad pitt”), ActsIn(Movie,”julie delp”), Country(“julie delp” = France)) The probability that a movie is from the US, given that it is long, and given that Brad Pitt and Julie Delp have appeared in it and Julie Delp is from France. Negative relationship P(Drama(Movie) = T| RunTime(Movie) = Long, ActsIn(Movie,”brad pitt”), not ActsIn(Movie,”juliette binoche”) Country(“juliette binoche” = France)) The probability that the movie named Movie is from the US, given that it is long, and given that Brad Pitt has appeared in it, and Juliette Binoche has not appeared in it 12 and is from France. Cross-Table Queries (Actors) Query English Paraphrase Positive relationship P(Country(Actor) = U.S.| gender(Actor)=W, ActsIn(“hate”,Actor), RunTime(“hate”)=short) The probability that an actor is from the US, given that she is a woman, and given that she appeared in the movie “hate”. Negative relationship The probability that the actor named Actor is from the US, given that she is a woman, and given that she appeared in the long movie Movie, and did not appear in the short movie “hate”. 13 P(Country(Actor) = U.S.|gender(Actor)=W, not ActsIn(“hate”,Actor), RunTime(“hate”)=short) Motivating Applications The ability to answer relational probabilistic queries has supported a number of successful applications. For example: Relational Query Optimization Information Extraction (DeepDive) Ontology Matching Entity Resolution Link-based classification Anomaly detection/exception mining 14 Tutorial Approach Our tutorial is a survey of issues, not of systems We give references to different systems Discuss a range of issues but only a single model class (Bayesian networks) Most concepts generalize to log-linear models for relational data. Focus on the new challenges of learning Bayesian networks with relational data, compared to traditional iid data Illustrate challenges and solutions with a running example Kimmig, A.; Mihalkova, L. & Getoor, L. (2014), 'Lifted graphical models: a survey', Machine Learning, 1--45. Sutton, C. & McCallum, A. (2007), An Introduction to Conditional Random Fields For Relational Learning’ Introduction to Statistical Relational Learning', MIT Press, , pp. 93-127. 15 Tutorial Plan Relational Data First-Order Bayesian networks Parameter Learning for First-Order BNs 30 min break Structure Learning for First-Order BNs Link-Based Classification using a First-Order BN Relational Anomaly Detection using a First-Order BN Learning Bayesian Networks for Complex Relational Data 16