Introduction to Bayesian Networks
Instructor: Dan Geiger

What is all the fuss about? A happy marriage between probability theory and graph theory.
Its successful offspring: algorithms for fault diagnosis, error-correcting codes,
and models of complex systems, with applications in a wide range of fields.

Web page: www.cs.technion.ac.il/~dang/courseBN
Email: [email protected]
Phone: 829 4339
Office: Taub 616.
Course Information
Meetings:
• Lecture: Mondays 10:30 – 12:30
• Tutorial: Wednesdays 16:30 – 17:30
Grade:
• 50%: 4 question sets. These question sets are obligatory. Each contains 6 problems. Submit in pairs within two weeks.
• 50%: a one-hour lecture (priority given to graduate students).
Prerequisites:
• Data Structures 1 (cs234218)
• Algorithms 1 (cs234247)
• Probability (any course)
Information and handouts:
• www.cs.technion.ac.il/~dang/courseBN
Lectures Plan
• Mathematical Foundations (5-6 weeks, including 3 students' lectures; based on Pearl's Chapter 3 + papers).
1. Properties of conditional independence (soundness and completeness of marginal independence, graphoid axioms and their interpretation as "irrelevance", incompleteness of conditional independence, no disjunctive axioms possible).
2. Properties of graph separation (Paz and Pearl 85, Theorem 3), soundness and completeness of saturated independence statements. Undirected graphs as I-maps of probability distributions. Markov blankets, pairwise independence basis. Representation theorems (Pearl and Paz, from each basis to I-maps).
3. Markov networks, HC (Hammersley-Clifford) representation theorem, completeness theorem. Markov chains.
4. Bayesian networks, d-separation, soundness, completeness.
5. Chordal graphs as the intersection of Bayesian networks and Markov networks. Equivalence of their 4 definitions.
• Combinatorial Optimization of Exact Inference in Graphical Models (3 weeks, including 2 students' lectures).
1. Variable elimination; greedy algorithms for optimization.
2. Clique tree algorithm. Conditioning.
3. Treewidth. Feedback Vertex Set.
• Learning (3 weeks, including 2 students' lectures).
1. The ML (maximum likelihood) method and the EM algorithm.
2. Chow and Liu's algorithm; the TAN model.
3. K2 measure, score equivalence, Chickering's theorem, Dirichlet priors, characterization theorem.
What is it all about?
• How to use graphs to represent probability distributions over thousands of random variables?
• How to encode conditional independence in directed and undirected graphs?
• How to use such representations for efficient computation of the probability of events of interest?
• How to learn such models from data?
Properties of Independence
I(X,Y) iff Pr(X=x, Y=y) = Pr(X=x) Pr(Y=y) for all values x, y.
Properties
Symmetry: I(X,Y) ⇒ I(Y,X)
Decomposition: I(X,YW) ⇒ I(X,Y)
Mixing: I(X,Y) and I(XY,W) ⇒ I(X,YW)
Are there more properties of independence?
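As a purely illustrative aside (not part of the original slides), the following Python sketch builds a small joint table in which X is independent of the pair YW and checks the definition numerically; by Decomposition, I(X,Y) and I(X,W) must then hold as well. The probability values and the helper independent() are assumptions made up for this example.

import numpy as np

# Hypothetical joint distribution P(X, Y, W) built so that X is independent
# of the pair (Y, W): P(x, y, w) = P(x) * P(y, w).
px  = np.array([0.3, 0.7])                    # P(X)
pyw = np.array([[0.1, 0.4], [0.2, 0.3]])      # P(Y, W)
P = px[:, None, None] * pyw[None, :, :]       # P(X, Y, W)

def independent(joint, axes_y):
    # Test I(X, Y-block), where X is axis 0 of `joint` and axes_y lists the
    # axes of the Y-block: compare P(X, Y-block) with P(X) * P(Y-block).
    rest = tuple(a for a in range(1, joint.ndim) if a not in axes_y)
    j = joint.sum(axis=rest) if rest else joint          # P(X, Y-block)
    mx = j.sum(axis=tuple(range(1, j.ndim)))             # P(X)
    my = j.sum(axis=0)                                    # P(Y-block)
    return np.allclose(j, np.multiply.outer(mx, my).reshape(j.shape))

print(independent(P, (1, 2)))   # I(X, YW) holds by construction -> True
print(independent(P, (1,)))     # Decomposition then gives I(X, Y) -> True
print(independent(P, (2,)))     # ... and I(X, W)                  -> True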
Properties of Conditional Independence
I(X,Z,Y) if and only if
Pr(X=x, Y=y | Z=z) = Pr(X=x | Z=z) Pr(Y=y | Z=z) for all values x, y, z.
Properties
Symmetry: I(X,Z,Y) ⇒ I(Y,Z,X)
Decomposition: I(X,Z,YW) ⇒ I(X,Z,Y)
Mixing: I(X,Z,Y) and I(XY,Z,W) ⇒ I(X,Z,YW)
Are there more properties of independence?
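For intuition only (again an illustrative assumption, not course material), the ternary relation can be tested on a small table by checking Pr(x,y,z) Pr(z) = Pr(x,z) Pr(z,y) for every entry, which is equivalent to the definition wherever Pr(z) > 0:

import numpy as np

def cond_independent(P, tol=1e-12):
    # P[x, z, y] is a joint table; test Pr(x,y|z) = Pr(x|z) Pr(y|z) for all z,
    # in the equivalent product form Pr(x,y,z) Pr(z) = Pr(x,z) Pr(z,y).
    pz  = P.sum(axis=(0, 2))          # Pr(z)
    pxz = P.sum(axis=2)               # Pr(x, z)
    pzy = P.sum(axis=0)               # Pr(z, y)
    return np.allclose(P * pz[None, :, None],
                       pxz[:, :, None] * pzy[None, :, :], atol=tol)

# Hypothetical example where I(X,Z,Y) holds because the joint factors as
# Pr(z) Pr(x|z) Pr(y|z).
pz = np.array([0.4, 0.6])
px_given_z = np.array([[0.9, 0.1], [0.2, 0.8]])   # rows indexed by z
py_given_z = np.array([[0.5, 0.5], [0.3, 0.7]])   # rows indexed by z
P = pz[None, :, None] * px_given_z.T[:, :, None] * py_given_z[None, :, :]
print(cond_independent(P))   # True; by Symmetry, swapping X and Y gives the same answer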
A simple Markov network
[Figure: an undirected graph on nodes A, B, C, D forming the cycle A-C-B-D-A, with edge potentials f1(a,c), f2(c,b), f3(b,d), f4(d,a).]
The probability function represented by this graph satisfies I(A, {C,D}, B) and I(C, {A,B}, D).
In large graphs, how do we compute P(A|B)?
How do we learn the best graph from sample data?
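To make the questions concrete, here is a brute-force sketch (with random placeholder potentials over binary variables, an assumption for illustration only) of the factorization this graph encodes, P(a,b,c,d) proportional to f1(a,c) f2(c,b) f3(b,d) f4(d,a). On four variables P(A|B) and the stated independence can be checked by direct summation; it is exactly this summation that becomes infeasible in large graphs.

import numpy as np

rng = np.random.default_rng(0)
f1 = rng.random((2, 2))   # f1(a, c) -- placeholder potential values
f2 = rng.random((2, 2))   # f2(c, b)
f3 = rng.random((2, 2))   # f3(b, d)
f4 = rng.random((2, 2))   # f4(d, a)

# Unnormalized product of the edge potentials, then normalize to get P(a,b,c,d).
P = np.einsum('ac,cb,bd,da->abcd', f1, f2, f3, f4)
P /= P.sum()

# P(A | B): sum out C and D, then condition on B.
pab = P.sum(axis=(2, 3))                              # P(a, b)
print(pab / pab.sum(axis=0, keepdims=True))           # columns give P(a | b)

# Check I(A, {C,D}, B): P(a,b,c,d) P(c,d) = P(a,c,d) P(b,c,d) for all entries.
pcd  = P.sum(axis=(0, 1))
pacd = P.sum(axis=1)
pbcd = P.sum(axis=0)
print(np.allclose(P * pcd[None, None, :, :],
                  pacd[:, None, :, :] * pbcd[None, :, :, :]))   # True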
Relations to Some Other Courses
"Tell me who your friends are and I will tell you who you are."
• Introduction to Artificial Intelligence (cs236501)
• Introduction to Machine Learning (cs236756)
• Introduction to Neural Networks (cs236950)
• Algorithms in Computational Biology (cs236522)
• Error correcting codes
• Data mining