Bayesian Networks
Distinguished Prof. Dr. Panos M. Pardalos
Center for Applied Optimization
Department of Industrial & Systems Engineering
Computer & Information Science & Engineering Department
Biomedical Engineering Program, McKnight Brain Institute
University of Florida
http://www.ise.ufl.edu/pardalos/
Lecture Outline
Introduction
Fundamentals of Probability
  Probability Space
  Conditional Probability
  Bayes Rule
Naïve Bayes Classifier
  Bayesian Approach
  Example
Bayesian Networks
  Graph Theory Concepts
  Definition
  Inferencing
Conclusions
Introduction
- Bayesian networks are applied in cases of uncertainty, when we know certain (conditional) probabilities and are looking for unknown probabilities given specific conditions.
- Applications: bioinformatics and medicine, engineering, document classification, image processing, data fusion, decision support systems, etc.
- Examples:
  - Inference: P(Diagnosis|Symptom)
  - Anomaly detection: Is this observation anomalous?
  - Active data collection: What is the next diagnostic test, given a set of observations?
Discrete Random Variables
- Let A denote a Boolean-valued random variable.
- A denotes an event, and there is some degree of uncertainty as to whether A occurs.
- Examples:
  - A = Patient has tuberculosis
  - A = Coin-flipping outcome is heads
  - A = France will win the World Cup in 2010
Intuition Behind Probability
- Intuitively, the probability of event A equals the proportion of the outcomes where A is true.
- Ω is the set of all possible outcomes; its area is P(Ω) = 1.
- The set colored in orange corresponds to the outcomes where A is true.
- P(A) = area of the orange oval. Clearly 0 ≤ P(A) ≤ 1.
Kolmogorov’s Probability Axioms
“The theory of probability as a mathematical
discipline can and should be developed from axioms in
exactly the same way as geometry and algebra. ”
Andrey Nikolaevich Kolmogorov. Foundations of
the Theory of Probability, 1933.
1. P(A) ≥ 0, ∀A ⊆ Ω
2. P(Ω) = 1
3. σ-additivity: any countable sequence of pairwise disjoint events A1, A2, . . . satisfies

     P(⋃ᵢ Aᵢ) = ∑ᵢ P(Aᵢ)
Other Ways to Deal with Uncertainty
- Three-valued logic: True / False / Maybe
- Fuzzy logic (truth values between 0 and 1)
- Non-monotonic reasoning (especially focused on penguin informatics)
- Dempster–Shafer theory (and an extension known as quasi-Bayesian theory)
- Possibilistic logic
- But...
Coherence of the Axioms
- Kolmogorov's axioms of probability are the only model with this property:
  - Wagers (probabilities) are assigned in such a way that, no matter what set of wagers your opponent chooses, you are not exposed to certain loss.

Bruno de Finetti, Probabilismo, Napoli, Logos 14, 1931, pp. 163-219.
Bruno de Finetti, Probabilism: A Critical Essay on the Theory of Probability and on the Value of Science (translation of the 1931 article), Erkenntnis, volume 31, September 1989, pp. 169-223.
Consequences of the Axioms
- P(Ā) = 1 − P(A), where Ā = Ω\A
- P(∅) = 0
Consequences of the Axioms
- P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
- P(A) = P(A ∩ B) + P(A ∩ B̄)
Conditional Probability
P(B|A) = the proportion of the outcomes in which A is true that also have B true.

Formal definition:

     P(B|A) = P(A ∩ B) / P(A)
Conditional Probability: Example
- Let us draw a card from a deck of 52 playing cards.
- A = the card is a court card. P(A) = 12/52 = 3/13
- B = the card is a queen. P(B) = 4/52 = 1/13, and P(B ∩ A) = P(B) = 1/13
- Applying the definition gives a very intuitive result:
  - P(B|A) = (1/13)/(3/13) = 1/3
  - P(A|B) = (1/13)/(1/13) = 1
- C = the suit is spades. P(C) = 1/4
- Note that P(C|A) = P(C) = 1/4. In other words, event C is independent of A.
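As a quick sanity check, the following minimal Python sketch (not part of the original slides) reproduces these conditional probabilities by enumerating an explicit deck:

```python
from fractions import Fraction

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = [(r, s) for r in ranks for s in suits]

def prob(event, given=None):
    """P(event | given) by counting equally likely cards."""
    space = [c for c in deck if given(c)] if given else deck
    return Fraction(sum(event(c) for c in space), len(space))

court = lambda c: c[0] in {"J", "Q", "K"}   # event A
queen = lambda c: c[0] == "Q"               # event B
spade = lambda c: c[1] == "spades"          # event C

print(prob(court))               # 3/13
print(prob(queen, given=court))  # 1/3
print(prob(court, given=queen))  # 1
print(prob(spade, given=court))  # 1/4 (= P(C), so C is independent of A)
```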
Independent Events
Definition
Two events A and B are independent if and only if P(A ∩ B) = P(A)P(B). Let us denote independence of A and B as I(A, B).

The independence of A and B implies
- P(A|B) = P(A), if P(B) ≠ 0
- P(B|A) = P(B), if P(A) ≠ 0
- Why?
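For the record, the first implication follows directly from the definition of conditional probability: P(A|B) = P(A ∩ B)/P(B) = P(A)P(B)/P(B) = P(A); the second follows by symmetry.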
Conditional Independence
- One might observe that people with longer arms tend to have higher levels of reading skill.
- If the age is fixed, then this relationship disappears.
- Arm length and reading skill are conditionally independent given the age.

Definition
Two events A and B are conditionally independent given C if and only if P(A ∩ B|C) = P(A|C)P(B|C). Notation: I(A, B|C).

- P(A|B, C) = P(A|C), if P(B|C) ≠ 0
- P(B|A, C) = P(B|C), if P(A|C) ≠ 0
Bayes Rule
- The definition of conditional probability

     P(A|B) = P(A ∩ B) / P(B)

  implies the chain rule: P(A ∩ B) = P(A|B)P(B).
- By symmetry, P(A ∩ B) = P(B|A)P(A).
- After we equate the right-hand sides and do some algebra, we obtain Bayes rule:

     P(B|A) = P(A|B)P(B) / P(A)
Monty Hall Problem
- The treasure is equally likely to be contained in one of the boxes A, B, and C, i.e. P(A) = P(B) = P(C) = 1/3. You are offered to choose one of them.
- Let us say you choose box A.
- Then the host of the game opens a box which you did not choose and which does not contain the treasure.
Monty Hall Problem
- For instance, the host has opened box C.
- Then you are offered an option to reconsider your choice. What would you do?
- In other words, what are the probabilities P(A|N_{A,C}) and P(B|N_{A,C})?
- What does your intuition advise?
- Now apply Bayes rule.
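A minimal sketch, not from the slides, that checks the Bayes-rule answer by brute-force enumeration; it assumes the host never opens your box or the treasure box and picks uniformly at random when both remaining boxes are empty:

```python
from fractions import Fraction
from collections import defaultdict

# Joint probabilities P(treasure location, host opens box | you chose A).
joint = defaultdict(Fraction)
for treasure in "ABC":
    p_treasure = Fraction(1, 3)
    # The host opens a box that is neither A (your choice) nor the treasure box.
    options = [box for box in "BC" if box != treasure]
    for opened in options:
        joint[(treasure, opened)] += p_treasure / len(options)

# Condition on the observation N_{A,C}: the host opened box C.
p_open_c = sum(p for (t, o), p in joint.items() if o == "C")
print(joint[("A", "C")] / p_open_c)   # P(A | N_{A,C}) = 1/3
print(joint[("B", "C")] / p_open_c)   # P(B | N_{A,C}) = 2/3 -> switching is better
```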
Classification based on Bayes Theorem
- Let Y denote the class variable. For example, we want to predict whether a borrower will default.
- Let X = (X1, X2, . . ., Xk) denote the attribute set (i.e. home owner, marital status, annual income, etc.).
- We can treat X and Y as random variables and determine P(Y|X) (the posterior probability).
- Knowing the probability P(Y|X′), we can assign the record X′ to the class that maximizes the posterior probability.
- How can we estimate P(Y|X) from training data?
 #   Home Owner   Marital Status   Annual Income   Defaulted Borrower
     (binary)     (categorical)    (continuous)    (class)
 1   Yes          Single           125K            No
 2   No           Married          100K            No
 3   No           Single           70K             No
 4   Yes          Married          120K            No
 5   No           Divorced         95K             Yes
 6   No           Married          60K             No
 7   Yes          Divorced         220K            No
 8   No           Single           85K             Yes
 9   No           Married          75K             No
10   No           Single           90K             Yes

Table: Historical data for default prediction
Bayes approach
- An accurate estimate of the posterior probability for every possible combination of attributes and classes requires a very large training set, even for a moderate number of attributes.
- We can utilize Bayes theorem instead:

     P(Y|X) = P(X|Y) × P(Y) / P(X)

- P(X) is a constant and can be calculated as a normalization multiplier.
- P(Y) can be easily estimated from the training set (the fraction of training records that belong to each class).
- Estimating P(X|Y) is a more challenging task. Methods:
  - Naïve Bayes classifier
  - Bayesian network
Naïve Bayes Classifier

- Attributes are assumed to be conditionally independent, given the class label y:

     P(X|Y = y) = ∏_{i=1..k} P(Xi|Y = y)

- Thus

     P(Y|X) = P(Y) ∏_{i=1..k} P(Xi|Y = y) / P(X)

- Now we need to estimate P(Xi|Y) for i = 1, . . ., k.
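The slides contain no code, so the following is only a sketch of how the categorical part of such a classifier can be estimated by counting; the record layout and helper names are assumptions, and the data are the Home Owner and Marital Status columns of the table above.

```python
# Training records: (home_owner, marital_status, class) from the default-prediction table.
records = [
    ("Yes", "Single", "No"), ("No", "Married", "No"), ("No", "Single", "No"),
    ("Yes", "Married", "No"), ("No", "Divorced", "Yes"), ("No", "Married", "No"),
    ("Yes", "Divorced", "No"), ("No", "Single", "Yes"), ("No", "Married", "No"),
    ("No", "Single", "Yes"),
]

def prior(y):
    """P(Y = y): fraction of training records in class y."""
    return sum(r[-1] == y for r in records) / len(records)

def cond(i, value, y):
    """P(X_i = value | Y = y): fraction of class-y records with that attribute value."""
    in_class = [r for r in records if r[-1] == y]
    return sum(r[i] == value for r in in_class) / len(in_class)

def score(x, y):
    """Unnormalized posterior: P(Y = y) * prod_i P(X_i = x_i | Y = y)."""
    p = prior(y)
    for i, value in enumerate(x):
        p *= cond(i, value, y)
    return p

x = ("No", "Married")  # test record, categorical attributes only
scores = {y: score(x, y) for y in ("No", "Yes")}
print(scores, "->", max(scores, key=scores.get))   # class "No", since P(Married|Yes) = 0
```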
Estimating Probabilities
- P(Xi = x|Y = y) is estimated as the fraction of training instances in class y that take on the particular attribute value x.
- For example:
  - P(Home Owner = Yes|Y = No) = 3/7
  - P(Marital Status = Single|Y = Yes) = 2/3
- What about continuous attributes?
- One solution is to discretize each continuous attribute and then replace the value with its corresponding interval (transform continuous attributes into ordinal attributes).
- How can we discretize?..
Continuous Attributes
- Assume a certain type of probability distribution for the continuous attribute.
- For example, it can be a Gaussian distribution with p.d.f.

     f_ij(x_i) = 1/(√(2π) σ_ij) · exp(−(x_i − μ_ij)² / (2σ_ij²))

- The parameters of f_ij can be estimated from the training records that belong to class y_j.
Continuous Attributes
- Using the approximation

     P(x_i < X_i ≤ x_i + ε | Y = y_j) = ∫_{x_i}^{x_i+ε} f_ij(y) dy ≈ ε · f_ij(x_i)

- and the fact that ε cancels out when we normalize the posterior probability P(Y|X), we can assume

     P(X_i = x_i|Y = y_j) = f_ij(x_i)
Example
- The sample mean for the annual income attribute with respect to class No:

     x̄ = (125 + 100 + 70 + . . . + 75) / 7 = 110

- Variance:

     s² = ((125 − 110)² + (100 − 110)² + . . . + (75 − 110)²) / 6 = 2975
     s = √2975 = 54.54

- Given a test record with income $120K:

     P(Income = 120|No) = 1/(√(2π) · 54.54) · exp(−(120 − 110)² / (2 × 2975)) = 0.0072
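A small sketch (not from the slides) that reproduces these numbers from the income values of the class-No records in the table:

```python
import math

# Annual income (in $K) for the records with class Defaulted = No (table rows 1-4, 6, 7, 9).
income_no = [125, 100, 70, 120, 60, 220, 75]

mean = sum(income_no) / len(income_no)                                 # 110.0
var = sum((x - mean) ** 2 for x in income_no) / (len(income_no) - 1)   # 2975.0 (sample variance)
std = math.sqrt(var)                                                   # ~54.54

def gaussian_pdf(x, mu, sigma2):
    """Gaussian density used as the class-conditional likelihood f_ij(x)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

print(gaussian_pdf(120, mean, var))   # ~0.0072 = P(Income = 120 | No)
```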
Example
- Suppose X = (Home Owner = No, Marital Status = Married, Income = $120K)
- P(Home Owner = Yes | Y = No) = 3/7
- P(Home Owner = No | Y = No) = 4/7
- P(Home Owner = Yes | Y = Yes) = 0
- P(Home Owner = No | Y = Yes) = 1
- P(Marital Status = Divorced | Y = No) = 1/7
- P(Marital Status = Married | Y = No) = 4/7
- P(Marital Status = Single | Y = No) = 2/7
- P(Marital Status = Divorced | Y = Yes) = 1/3
- P(Marital Status = Married | Y = Yes) = 0
- P(Marital Status = Single | Y = Yes) = 2/3
Example
- For annual income:
  - class No: x̄ = 110, s² = 2975
  - class Yes: x̄ = 90, s² = 25
- Class-conditional probabilities:
  - P(X|No) = P(Home Owner = No|No) × P(Status = Married|No) × P(Annual Income = $120K|No) = 4/7 × 4/7 × 0.0072 = 0.0024
  - P(X|Yes) = P(Home Owner = No|Yes) × P(Status = Married|Yes) × P(Annual Income = $120K|Yes) = 1 × 0 × 1.2 × 10⁻⁹ = 0
Example
- Posterior probabilities:
  - P(No|X) = α × 7/10 × 0.0024 = 0.0016α
  - P(Yes|X) = 0
- where α = 1/P(X)
- Since P(No|X) > P(Yes|X), the record is classified as "No".
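For completeness, a tiny sketch showing how the normalization constant α acts on the values computed above (the numbers are copied from the slides):

```python
# Class-conditional likelihoods and priors from the slides above.
likelihood = {"No": 0.0024, "Yes": 0.0}
prior = {"No": 7 / 10, "Yes": 3 / 10}

unnormalized = {y: prior[y] * likelihood[y] for y in prior}   # ~0.0016 and 0
alpha = 1 / sum(unnormalized.values())                        # alpha = 1 / P(X)
posterior = {y: alpha * p for y, p in unnormalized.items()}
print(posterior)   # {'No': 1.0, 'Yes': 0.0} -> classified as "No"
```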
Naïve Bayes Classifier: Discussion

- Robust to isolated noise points, because such points are averaged out when estimating conditional probabilities from data.
- Can handle missing values by ignoring the example during model building and classification.
- Robust to irrelevant attributes: if Xi is irrelevant, then P(Xi|Y) is almost uniformly distributed and thus has little impact on the posterior probability.
- Correlated attributes can degrade the performance, because conditional independence does not hold.
- Bayesian networks account for dependence between attributes.
Directed Graph
Definition
A directed graph or digraph G is an ordered pair G := (V, A) where
- V is a set whose elements are called vertices or nodes,
- A ⊆ V × V is a set of ordered pairs of vertices, called directed edges, arcs, or arrows.

Example:
- V = {V1, V2, V3, V4, V5}
- E = {(V1, V1), (V1, V4), (V2, V1), (V4, V2), (V5, V5)}
- Cycle: V1 → V4 → V2 → V1
Directed Acyclic Graph
Definition
A directed acyclic graph (DAG) is a directed graph with no directed cycles; that is, for any vertex v, there is no nonempty directed path that starts and ends on v.
Some Graph Theory Notions
- V1 and V4 are parents of V2
  - (V1, V2) ∈ E and (V4, V2) ∈ E
- V5, V3 and V2 are descendants of V1
  - V1 is connected to V5, V3 and V2 with directed paths
- V4 and V2 are ancestors of V3
  - There exist directed paths from V4 and V2 to V3
- V6 and V4 are nondescendents of V1
  - Directed paths from V1 to V4 and V6 do not exist
Bayesian Network Definition
- Elements of a Bayesian network:
  - A directed acyclic graph (DAG) that encodes the dependence relationships among a set of variables
  - A probability table associating each node with its immediate parent nodes
- Each node of the graph represents a variable
- Each arc asserts a dependence relationship between the pair of variables
- The DAG satisfies the Markov condition
The Markov Condition
Definition
Suppose we have a joint probability distribution P of the random variables in some set V and a DAG G = (V, E). We say that (G, P) satisfies the Markov condition if for each variable X ∈ V, {X} is conditionally independent of the set of all its nondescendents (ND_X) given the set of all its parents (PA_X): I({X}, ND_X|PA_X).

The definition implies that a root node X, which has no parents, is unconditionally independent of its nondescendents.
Figure: Bayes network: a case study
Markov Condition Example
Node   Parents   Independency
E      ∅         I(E, {D, Hb})
D      ∅         I(D, E)
HD     E, D      I(HD, Hb|{E, D})
Hb     ?         ?
B      ?         ?
C      ?         ?

Note that I(A, B|C) implies I(A, D|C) whenever D ⊂ B.
Naïve Bayes Classifier Representation

- Recall that a naïve Bayes classifier assumes conditional independence of the attributes X1, X2, . . ., Xk, given the target class Y.
- This can be represented using the Bayesian network below, in which Y is the single parent of every attribute node Xi.
Inferencing
- We can compute the joint probability from a Bayesian network:

     P(X1, X2, . . ., Xn) = ∏_{i=1..n} P(Xi|parents(Xi))

- Thus we can compute any conditional probability:

     P(Xk|Xm) = P(Xk, Xm) / P(Xm)
              = [∑ over entries X matching Xk, Xm of P(X)] / [∑ over entries X matching Xm of P(X)]
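As an illustration of this brute-force inference, here is a minimal sketch; the two-node network, its CPT values, and the helper names are made up for the example and are not from the slides.

```python
from itertools import product

# Toy network A -> B with illustrative conditional probability tables.
parents = {"A": (), "B": ("A",)}
values  = {"A": ("t", "f"), "B": ("t", "f")}
cpt = {
    "A": {(): {"t": 0.6, "f": 0.4}},
    "B": {("t",): {"t": 0.9, "f": 0.1}, ("f",): {"t": 0.2, "f": 0.8}},
}

def joint(x):
    """P(X1, ..., Xn) = prod_i P(Xi | parents(Xi)) for one full assignment x."""
    p = 1.0
    for var, val in x.items():
        key = tuple(x[pa] for pa in parents[var])
        p *= cpt[var][key][val]
    return p

def query(target, evidence):
    """P(target | evidence): sum the joint over the matching full assignments."""
    names = list(values)
    num = den = 0.0
    for combo in product(*(values[n] for n in names)):
        x = dict(zip(names, combo))
        if all(x[k] == v for k, v in evidence.items()):
            den += joint(x)
            if all(x[k] == v for k, v in target.items()):
                num += joint(x)
    return num / den

print(query({"A": "t"}, {"B": "t"}))   # P(A=t | B=t) = 0.54/0.62 ≈ 0.871
```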
Example of Inferencing
- Suppose no prior information about the person is given.
- What is the probability of developing heart disease?
- Sum over α ∈ {Yes, No} and β ∈ {Healthy, Unhealthy}:

     P(HD = Yes) = ∑_α ∑_β P(HD = Yes|E = α, D = β) · P(E = α, D = β)
                 = ∑_α ∑_β P(HD = Yes|E = α, D = β) · P(E = α) · P(D = β)
                 = 0.25 · 0.7 · 0.25 + 0.45 · 0.7 · 0.75 + 0.55 · 0.3 · 0.25 + 0.75 · 0.3 · 0.75
                 = 0.49
- Now let us compute the probability of heart disease when the person has high blood pressure; γ ranges over {Yes, No}.
- Probability of high blood pressure:

     P(B = High) = ∑_γ P(B = High|HD = γ) · P(HD = γ)
                 = 0.85 · 0.49 + 0.2 · 0.51 = 0.5185

- The posterior probability of heart disease given high blood pressure is

     P(HD = Yes|B = High) = P(B = High|HD = Yes) · P(HD = Yes) / P(B = High)
                          = (0.85 · 0.49)/0.5185 = 0.8033
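A short sketch that reproduces all three numbers of this case study by enumerating the joint distribution; only the CPT entries that appear in the computations above are used, with their complements obtained as 1 − p, and the variable names E, D, HD, B follow the slides.

```python
# CPT entries read off the worked example above; complements are 1 - p.
p_e = {"Yes": 0.7, "No": 0.3}                # P(E)
p_d = {"Healthy": 0.25, "Unhealthy": 0.75}   # P(D)
p_hd_yes = {("Yes", "Healthy"): 0.25, ("Yes", "Unhealthy"): 0.45,
            ("No", "Healthy"): 0.55, ("No", "Unhealthy"): 0.75}   # P(HD=Yes | E, D)
p_b_high = {"Yes": 0.85, "No": 0.2}          # P(B=High | HD)

def joint(e, d, hd, b):
    """P(E, D, HD, B) factored along the network E -> HD <- D, HD -> B."""
    p = p_e[e] * p_d[d]
    p *= p_hd_yes[(e, d)] if hd == "Yes" else 1 - p_hd_yes[(e, d)]
    p *= p_b_high[hd] if b == "High" else 1 - p_b_high[hd]
    return p

entries = [(e, d, hd, b) for e in p_e for d in p_d
           for hd in ("Yes", "No") for b in ("High", "Low")]

p_hd = sum(joint(*x) for x in entries if x[2] == "Yes")        # 0.49
p_bh = sum(joint(*x) for x in entries if x[3] == "High")       # 0.5185
p_posterior = sum(joint(*x) for x in entries
                  if x[2] == "Yes" and x[3] == "High") / p_bh   # ~0.8033
print(p_hd, p_bh, p_posterior)
```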
Complexity Issues
- Recall that we can compute any conditional probability:

     P(Xk|Xm) = P(Xk, Xm) / P(Xm)
              = [∑ over entries X matching Xk, Xm of P(X)] / [∑ over entries X matching Xm of P(X)]

- In general this requires an exponentially large number of operations.
- We can apply various tricks to reduce the complexity.
- But querying of Bayes nets is NP-hard.

D. M. Chickering, D. Heckerman, C. Meek, Large-Sample Learning of Bayesian Networks is NP-Hard. Journal of Machine Learning Research, 5 (2004) 1287-1330.
Bayesian Networks Discussion
- A Bayes network is an elegant way of encoding causal probabilistic dependencies.
- The dependency model can be represented graphically.
- Constructing a network requires effort, but adding a new variable is quite straightforward.
- Well suited for incomplete data.
- Due to the probabilistic nature of the model, the method is robust to model overfitting.
What we have learned
- Independence and conditional independence
- Bayes theorem
- Naïve Bayes classification
- The definition of a Bayes net
- Computing probabilities with a Bayes net
Literature
Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Introduction to Data Mining, Addison-Wesley, 2005.
Finn V. Jensen, Thomas D. Nielsen, Bayesian Networks and Decision Graphs, 2nd Ed., Springer, 2007.
Richard E. Neapolitan, Learning Bayesian Networks, Prentice Hall, 2003.
Judea Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988.