Introduction to Bayesian Networks

Instructor: Dan Geiger

What is all the fuss about? A happy marriage between probability theory and graph theory. Its successful children: algorithms for fault diagnosis, error-detecting codes, and models of complex systems, with applications in a wide range of fields.

Web page: www.cs.technion.ac.il/~dang/courseBN
Email: [email protected]
Phone: 829 4339
Office: Taub 616.

Course Information

Meetings:
Lecture: Mondays 10:30 – 12:30
Tutorial: Wednesdays 16:30 – 17:30

Grade:
50% from 4 question sets. These question sets are obligatory. Each contains 6 problems. Submit in pairs within two weeks.
50% from a one-hour lecture (priority to graduate students).

Prerequisites:
Data Structures 1 (cs234218)
Algorithms 1 (cs234247)
Probability (any course)

Information and handouts: www.cs.technion.ac.il/~dang/courseBN

Lectures Plan

Mathematical Foundations (5-6 weeks, including 3 student lectures; based on Pearl's Chapter 3 plus papers).
1. Properties of conditional independence (soundness and completeness of marginal independence, graphoid axioms and their interpretation as "irrelevance", incompleteness of conditional independence, no disjunctive axioms possible).
2. Properties of graph separation (Paz and Pearl 85, Theorem 3), soundness and completeness of saturated independence statements. Undirected graphs as I-maps of probability distributions. Markov blankets, pairwise independence basis. Representation theorems (Pearl and Paz, from each basis to I-maps).
3. Markov networks, HC representation theorem, completeness theorem. Markov chains.
4. Bayesian networks, d-separation, soundness, completeness.
5. Chordal graphs as the intersection of Bayesian networks and Markov networks. Equivalence of their 4 definitions.

Combinatorial Optimization of Exact Inference in Graphical Models (3 weeks, including 2 student lectures).
1. Variable elimination; greedy algorithms for optimization.
2. Clique tree algorithm. Conditioning.
3. Treewidth. Feedback vertex set.

Learning (3 weeks, including 2 student lectures).
1. The ML method and the EM algorithm.
2. Chow and Liu's algorithm; the TAN model.
3. K2 measure, score equivalence, Chickering's theorem, Dirichlet priors, characterization theorem.

What is it all about?

• How to use graphs to represent probability distributions over thousands of random variables?
• How to encode conditional independence in directed and undirected graphs?
• How to use such representations for efficient computation of the probability of events of interest?
• How to learn such models from data?

Properties of Independence

I(X,Y) iff Pr(X=x, Y=y) = Pr(X=x) Pr(Y=y) for all values x and y.

Properties:
Symmetry: I(X,Y) ⟹ I(Y,X)
Decomposition: I(X,YW) ⟹ I(X,Y)
Mixing: I(X,Y) and I(XY,W) ⟹ I(X,YW)

Are there more properties of independence?

Properties of Conditional Independence

I(X,Z,Y) if and only if Pr(X=x, Y=y | Z=z) = Pr(X=x | Z=z) Pr(Y=y | Z=z) for all values x, y, and z with Pr(Z=z) > 0.

Properties:
Symmetry: I(X,Z,Y) ⟹ I(Y,Z,X)
Decomposition: I(X,Z,YW) ⟹ I(X,Z,Y)
Mixing: I(X,Z,Y) and I(XY,Z,W) ⟹ I(X,Z,YW)

Are there more properties of conditional independence? (A soundness derivation for Mixing and a numeric checker appear at the end of this handout.)

A Simple Markov Network

[Figure: an undirected four-cycle A – C – B – D – A with edge potentials f1(a,c), f2(c,b), f3(b,d), f4(d,a).]

The probability function represented by this graph satisfies I(A, {C,D}, B) and I(C, {A,B}, D).
In large graphs, how do we compute P(A|B)? How do we learn the best graph from sample data? (A brute-force sketch for the first question appears at the end of this handout.)

Relations to Some Other Courses

"Tell me who your friends are, and I will tell you who you are."

Introduction to Artificial Intelligence (cs236501)
Introduction to Machine Learning (cs236756)
Introduction to Neural Networks (cs236950)
Algorithms in Computational Biology (cs236522)
Error-correcting codes
Data mining
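The Mixing property listed above is sound; a one-step derivation (not spelled out in the slides) shows why. Writing the two premises as factorizations of Pr, for all values x, y, w and all z with Pr(Z=z) > 0:

```latex
% Soundness of Mixing: I(X,Z,Y) and I(XY,Z,W) imply I(X,Z,YW).
\begin{align*}
\Pr(x, y, w \mid z)
  &= \Pr(x, y \mid z)\,\Pr(w \mid z)               && \text{by } I(XY,Z,W) \\
  &= \Pr(x \mid z)\,\Pr(y \mid z)\,\Pr(w \mid z)   && \text{by } I(X,Z,Y)  \\
  &= \Pr(x \mid z)\,\Pr(y, w \mid z).
\end{align*}
% The last step sums the first identity over x to obtain
% Pr(y, w | z) = Pr(y | z) Pr(w | z).
```

The marginal Mixing property on the "Properties of Independence" slide is the special case where Z is empty.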
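The definitions of independence can also be tested numerically. Below is a minimal sketch, assuming a joint distribution stored as a Python dictionary from assignments to probabilities; the toy numbers and the helper names (`marginal`, `is_cond_indep`) are illustrative, not part of the course material.

```python
import itertools
from collections import defaultdict

# Toy joint distribution over binary variables (X, Y, Z), stored as a map
# from assignments (x, y, z) to probabilities.  The numbers are made up;
# the joint is constructed so that X and Y are independent given Z.
pz = {0: 0.4, 1: 0.6}
px1_given_z = {0: 0.3, 1: 0.8}   # Pr(X=1 | Z=z)
py1_given_z = {0: 0.5, 1: 0.1}   # Pr(Y=1 | Z=z)
joint = {}
for x, y, z in itertools.product([0, 1], repeat=3):
    px = px1_given_z[z] if x == 1 else 1 - px1_given_z[z]
    py = py1_given_z[z] if y == 1 else 1 - py1_given_z[z]
    joint[(x, y, z)] = pz[z] * px * py

def marginal(joint, keep):
    """Sum out every position not listed in `keep` (a tuple of indices)."""
    out = defaultdict(float)
    for assignment, p in joint.items():
        out[tuple(assignment[i] for i in keep)] += p
    return out

def is_cond_indep(joint, xi, yi, zi, tol=1e-12):
    """Check I(X,Z,Y) via Pr(x,y,z) Pr(z) = Pr(x,z) Pr(y,z) for all values,
    which is equivalent to Pr(x,y|z) = Pr(x|z) Pr(y|z) when Pr(z) > 0."""
    pxyz = marginal(joint, (xi, yi, zi))
    pxz = marginal(joint, (xi, zi))
    pyz = marginal(joint, (yi, zi))
    pz_m = marginal(joint, (zi,))
    return all(abs(p * pz_m[(z,)] - pxz[(x, z)] * pyz[(y, z)]) <= tol
               for (x, y, z), p in pxyz.items())

print(is_cond_indep(joint, 0, 1, 2))  # True: I(X,Z,Y) holds by construction
print(is_cond_indep(joint, 1, 0, 2))  # True: Symmetry, the same statement
```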
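For the four-node Markov network above, the represented distribution factorizes as Pr(a,b,c,d) ∝ f1(a,c) f2(c,b) f3(b,d) f4(d,a), so P(A|B=b) can be obtained by summing the factor product over C and D and normalizing over A. The sketch below does exactly that by brute force; the potential tables are invented for illustration, and the enumeration is exponential in general, which is why the course covers variable elimination and the clique tree algorithm.

```python
import itertools

# The four pairwise potentials of the cycle A - C - B - D - A from the
# "simple Markov network" slide.  The table entries are made-up numbers;
# any positive values would do.
f1 = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}  # f1(a, c)
f2 = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}  # f2(c, b)
f3 = {(0, 0): 4.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 4.0}  # f3(b, d)
f4 = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}  # f4(d, a)

def unnormalized(a, b, c, d):
    """Product of the edge potentials: proportional to Pr(a, b, c, d)."""
    return f1[(a, c)] * f2[(c, b)] * f3[(b, d)] * f4[(d, a)]

def p_a_given_b(a, b):
    """P(A=a | B=b) by brute-force enumeration: sum the factor product
    over C and D, then normalize over A.  The partition function cancels."""
    score = {av: sum(unnormalized(av, b, c, d)
                     for c, d in itertools.product([0, 1], repeat=2))
             for av in [0, 1]}
    return score[a] / (score[0] + score[1])

print(p_a_given_b(1, 0))
print(p_a_given_b(1, 1))
```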