Bayesian networks
Chapter 14
Sections 1–2
Bayesian networks
• A simple, graphical notation for conditional independence assertions and hence for compact specification of full joint distributions
• Syntax:
– a set of nodes, one per variable
– a directed, acyclic graph (link ≈ "directly influences")
– a conditional distribution for each node given its parents: P(Xi | Parents(Xi))
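A minimal sketch of how these three ingredients might be held in code (Python; the variables are those of the Toothache example on the next slide, but the numbers are illustrative placeholders, not values from the slides):

    # Each node stores its parents and a CPT giving P(node = true | parent values).
    # Illustrative numbers only.
    network = {
        "Cavity":    {"parents": (),          "cpt": {(): 0.2}},
        "Toothache": {"parents": ("Cavity",), "cpt": {(True,): 0.6, (False,): 0.1}},
        "Catch":     {"parents": ("Cavity",), "cpt": {(True,): 0.9, (False,): 0.2}},
    }
    # P(Toothache = true | Cavity = true):
    print(network["Toothache"]["cpt"][(True,)])  # 0.6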
Example
• Topology of network encodes conditional independence
assertions:
• Weather is independent of the other variables
• Toothache and Catch are conditionally independent
given Cavity
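Concretely, the second assertion says that the joint behaviour of the two symptoms factorizes once Cavity is known: P(Toothache, Catch | Cavity) = P(Toothache | Cavity) P(Catch | Cavity).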
Example
• I'm at work, neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes it's set off by minor earthquakes. Is there a burglar?
• Variables: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls
• Network topology reflects "causal" knowledge:
– A burglar can set the alarm off
– An earthquake can set the alarm off
– The alarm can cause Mary to call
– The alarm can cause John to call
Example contd.
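The conditional probability tables shown on this slide are not reproduced in the transcript; the sketch below fills them in with the values commonly quoted for this burglary example (an assumption, not taken from the slide), written as Python data:

    # Burglary network: parents and P(variable = true | parent values),
    # using commonly quoted textbook values (assumed; the slide's figure is missing).
    parents = {
        "Burglary": (), "Earthquake": (),
        "Alarm": ("Burglary", "Earthquake"),
        "JohnCalls": ("Alarm",), "MaryCalls": ("Alarm",),
    }
    cpt = {
        "Burglary":   {(): 0.001},
        "Earthquake": {(): 0.002},
        "Alarm":      {(True, True): 0.95, (True, False): 0.94,
                       (False, True): 0.29, (False, False): 0.001},
        "JohnCalls":  {(True,): 0.90, (False,): 0.05},
        "MaryCalls":  {(True,): 0.70, (False,): 0.01},
    }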
Compactness
• If each variable has no more than k parents, the complete network requires O(n · 2^k) numbers
• I.e., grows linearly with n, vs. O(2^n) for the full joint distribution
• For the burglary net, 1 + 1 + 4 + 2 + 2 = 10 numbers (vs. 2^5 - 1 = 31)
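As a quick check of that count, a short Python sketch using the burglary network's parent sets (a Boolean node with k parents contributes 2^k numbers, one per row of its CPT):

    parents = {"B": (), "E": (), "A": ("B", "E"), "J": ("A",), "M": ("A",)}
    print(sum(2 ** len(p) for p in parents.values()))  # 1 + 1 + 4 + 2 + 2 = 10
    print(2 ** 5 - 1)                                   # full joint over 5 Booleans: 31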
Semantics
• The full joint distribution is defined as the product of the local conditional distributions:
P(X1, … ,Xn) = ∏i=1..n P(Xi | Parents(Xi))
Constructing Bayesian networks
• 1. Choose an ordering of variables X1, … ,Xn
• 2. For i = 1 to n
– add Xi to the network
– select parents from X1, … ,Xi-1 such that
P (Xi | Parents(Xi)) = P (Xi | X1, ... Xi-1)
This choice of parents guarantees:
P(X1, … ,Xn) = ∏i=1..n P(Xi | X1, … , Xi-1)   (chain rule)
             = ∏i=1..n P(Xi | Parents(Xi))    (by construction)
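For instance, using the textbook alarm-network values assumed earlier, the probability that both neighbours call, the alarm sounds, and there is neither a burglary nor an earthquake comes straight out of this product:

    # P(j, m, a, ~b, ~e) = P(j | a) P(m | a) P(a | ~b, ~e) P(~b) P(~e)
    p = 0.90 * 0.70 * 0.001 * (1 - 0.001) * (1 - 0.002)
    print(p)  # roughly 0.000628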
Example
• Suppose we choose the ordering M, J, A, B, E
P(J | M) = P(J)?
Example
• Suppose we choose the ordering M, J, A, B, E
P(J | M) = P(J)? No
P(A | J, M) = P(A | J)? P(A | J, M) = P(A)?
Example
• Suppose we choose the ordering M, J, A, B, E
P(J | M) = P(J)? No
P(A | J, M) = P(A | J)? P(A | J, M) = P(A)? No
P(B | A, J, M) = P(B | A)?
P(B | A, J, M) = P(B)?
Example
• Suppose we choose the ordering M, J, A, B, E
P(J | M) = P(J)? No
P(A | J, M) = P(A | J)? P(A | J, M) = P(A)? No
P(B | A, J, M) = P(B | A)? Yes
P(B | A, J, M) = P(B)? No
P(E | B, A, J, M) = P(E | A)?
P(E | B, A, J, M) = P(E | A, B)?
Example
• Suppose we choose the ordering M, J, A, B, E
P(J | M) = P(J)? No
P(A | J, M) = P(A | J)? P(A | J, M) = P(A)? No
P(B | A, J, M) = P(B | A)? Yes
P(B | A, J, M) = P(B)? No
P(E | B, A, J, M) = P(E | A)? No
P(E | B, A, J, M) = P(E | A, B)? Yes
Example contd.
• Deciding conditional independence is hard in non-causal directions
• (Causal models and conditional independence seem hardwired for humans!)
• Network is less compact
Example contd.
• Deciding conditional independence is hard in non-causal directions
• (Causal models and conditional independence seem hardwired for humans!)
• Network is less compact: 1 + 2 + 4 + 2 + 4 = 13 numbers needed
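The 13 can be reproduced with the same counting as in the Compactness sketch above, now using the parent sets forced by the ordering M, J, A, B, E:

    # Parents chosen under the non-causal ordering M, J, A, B, E
    parents = {"M": (), "J": ("M",), "A": ("J", "M"), "B": ("A",), "E": ("A", "B")}
    print(sum(2 ** len(p) for p in parents.values()))  # 1 + 2 + 4 + 2 + 4 = 13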
Another example
Example from Medical Diagnostics
[Network diagram: patient information nodes (Visit to Asia, Smoking), medical difficulty nodes (Tuberculosis, Lung Cancer, Tuberculosis or Cancer, Bronchitis), and diagnostic test nodes (XRay Result, Dyspnea)]
• Network represents a knowledge structure that models the relationship between medical difficulties, their causes and effects, patient information and diagnostic tests
Example from Medical Diagnostics
Deterministic node "Tuberculosis or Cancer":
Tuberculosis   Lung Cancer   Tub or Can
Present        Present       True
Present        Absent        True
Absent         Present       True
Absent         Absent        False
Conditional probability table for Dyspnea:
Tub or Can   Bronchitis   Dyspnea Present   Dyspnea Absent
True         Present      0.90              0.10
True         Absent       0.70              0.30
False        Present      0.80              0.20
False        Absent       0.10              0.90
• Relationship knowledge is modeled by deterministic functions, logic and conditional probability distributions
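A small sketch of the two tables above as Python data (node and state names follow the slide; the rest of the network's CPTs are omitted):

    # "Tuberculosis or Cancer" is deterministic: a logical OR of its parents.
    def tub_or_can(tuberculosis_present, lung_cancer_present):
        return tuberculosis_present or lung_cancer_present

    # P(Dyspnea = Present | Tub or Can, Bronchitis)
    p_dyspnea_present = {
        (True,  "Present"): 0.90,
        (True,  "Absent"):  0.70,
        (False, "Present"): 0.80,
        (False, "Absent"):  0.10,
    }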
Example from Medical Diagnostics
Marginal beliefs (%) before any findings are entered:
Visit To Asia:            Visit 1.00       No Visit 99.0
Tuberculosis:             Present 1.04     Absent 99.0
Smoking:                  Smoker 50.0      NonSmoker 50.0
Lung Cancer:              Present 5.50     Absent 94.5
Bronchitis:               Present 45.0     Absent 55.0
Tuberculosis or Cancer:   True 6.48        False 93.5
XRay Result:              Abnormal 11.0    Normal 89.0
Dyspnea:                  Present 43.6     Absent 56.4
• Propagation algorithm processes relationship information to provide an unconditional or marginal probability distribution for each node
• The unconditional or marginal probability distribution is frequently called the belief function of that node
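The slides do not say which propagation algorithm the tool uses (junction-tree message passing is typical for such software). As an illustration of what a belief function means, the sketch below computes marginal and evidence-conditioned beliefs by brute-force enumeration of the joint, using the small burglary network with the textbook values assumed earlier rather than the larger medical network:

    import itertools

    # Burglary network (textbook values assumed earlier in this transcript).
    parents = {"B": (), "E": (), "A": ("B", "E"), "J": ("A",), "M": ("A",)}
    cpt = {
        "B": {(): 0.001},
        "E": {(): 0.002},
        "A": {(True, True): 0.95, (True, False): 0.94,
              (False, True): 0.29, (False, False): 0.001},
        "J": {(True,): 0.90, (False,): 0.05},
        "M": {(True,): 0.70, (False,): 0.01},
    }
    order = ["B", "E", "A", "J", "M"]

    def joint(assignment):
        """Probability of one complete assignment: product of local conditionals."""
        p = 1.0
        for v in order:
            pv = cpt[v][tuple(assignment[u] for u in parents[v])]
            p *= pv if assignment[v] else 1.0 - pv
        return p

    def belief(var, evidence=None):
        """P(var = true | evidence), by summing the joint over all assignments."""
        evidence = evidence or {}
        num = den = 0.0
        for values in itertools.product([False, True], repeat=len(order)):
            assignment = dict(zip(order, values))
            if any(assignment[k] != v for k, v in evidence.items()):
                continue
            p = joint(assignment)
            den += p
            if assignment[var]:
                num += p
        return num / den

    print(belief("B"))                          # prior belief in Burglary
    print(belief("B", {"J": True, "M": True}))  # belief after both call findings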
Beliefs (%) after the finding "Visit to Asia" = "Visit" is entered:
Visit To Asia:            Visit 100        No Visit 0
Tuberculosis:             Present 5.00     Absent 95.0
Smoking:                  Smoker 50.0      NonSmoker 50.0
Lung Cancer:              Present 5.50     Absent 94.5
Bronchitis:               Present 45.0     Absent 55.0
Tuberculosis or Cancer:   True 10.2        False 89.8
XRay Result:              Abnormal 14.5    Normal 85.5
Dyspnea:                  Present 45.0     Absent 55.0
• As a finding is entered, the propagation algorithm updates the beliefs attached to each relevant node in the network
• Interviewing the patient produces the information that “Visit to Asia” is “Visit”
• This finding propagates through the network and the belief functions of several nodes are updated
Beliefs (%) after the further finding "Smoking" = "Smoker" is entered:
Visit To Asia:            Visit 100        No Visit 0
Tuberculosis:             Present 5.00     Absent 95.0
Smoking:                  Smoker 100       NonSmoker 0
Lung Cancer:              Present 10.0     Absent 90.0
Bronchitis:               Present 60.0     Absent 40.0
Tuberculosis or Cancer:   True 14.5        False 85.5
XRay Result:              Abnormal 18.5    Normal 81.5
Dyspnea:                  Present 56.4     Absent 43.6
• Further interviewing of the patient produces the finding “Smoking” is “Smoker”
• This information propagates through the network
Beliefs (%) after the X-Ray finding "Normal" is entered:
Visit To Asia:            Visit 100        No Visit 0
Tuberculosis:             Present 0.12     Absent 99.9
Smoking:                  Smoker 100       NonSmoker 0
Lung Cancer:              Present 0.25     Absent 99.8
Bronchitis:               Present 60.0     Absent 40.0
Tuberculosis or Cancer:   True 0.36        False 99.6
XRay Result:              Abnormal 0       Normal 100
Dyspnea:                  Present 52.1     Absent 47.9
• Finished with interviewing the patient, the physician begins the examination
• The physician now moves to specific diagnostic tests such as an X-Ray, which results in a “Normal” finding that propagates through the network
• Note that the information from this finding propagates backward and forward through the arcs
Beliefs (%) after the finding "Present" is entered for "Dyspnea":
Visit To Asia:            Visit 100        No Visit 0
Tuberculosis:             Present 0.19     Absent 99.8
Smoking:                  Smoker 100       NonSmoker 0
Lung Cancer:              Present 0.39     Absent 99.6
Bronchitis:               Present 92.2     Absent 7.84
Tuberculosis or Cancer:   True 0.56        False 99.4
XRay Result:              Abnormal 0       Normal 100
Dyspnea:                  Present 100      Absent 0
• The physician also determines that the patient is having difficulty breathing; the finding “Present” is entered for “Dyspnea” and is propagated through the network
• The doctor might now conclude that the patient has bronchitis and does not have tuberculosis or lung cancer
Summary
• Bayesian networks provide a natural representation for (causally induced) conditional independence
• Topology + conditional probability = compact representation of joint distribution
• Generally easy for domain experts to construct