Bayesian networks
Chapter 14, Sections 1–2

Bayesian networks
• A simple, graphical notation for conditional independence assertions and hence for compact specification of full joint distributions
• Syntax:
– a set of nodes, one per variable
– a directed, acyclic graph (link ≈ "directly influences")
– a conditional distribution for each node given its parents: P(Xi | Parents(Xi))

Example
• Topology of network encodes conditional independence assertions:
• Weather is independent of the other variables
• Toothache and Catch are conditionally independent given Cavity

Example
• I'm at work, neighbor John calls to say my alarm is ringing, but neighbor Mary doesn't call. Sometimes it's set off by minor earthquakes. Is there a burglar?
• Variables: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls
• Network topology reflects "causal" knowledge:
– A burglar can set the alarm off
– An earthquake can set the alarm off
– The alarm can cause Mary to call
– The alarm can cause John to call

Example contd.
[Figure: burglary network with a conditional probability table for each node]

Compactness
• If each variable has no more than k parents, the complete network requires O(n · 2^k) numbers
• I.e., grows linearly with n, vs. O(2^n) for the full joint distribution
• For burglary net, 1 + 1 + 4 + 2 + 2 = 10 numbers (vs. 2^5 − 1 = 31)

Semantics
• The full joint distribution is defined as the product of the local conditional distributions:
  P(X1, …, Xn) = ∏i=1..n P(Xi | Parents(Xi))

Constructing Bayesian networks
• 1. Choose an ordering of variables X1, …, Xn
• 2. For i = 1 to n
– add Xi to the network
– select parents from X1, …, Xi-1 such that P(Xi | Parents(Xi)) = P(Xi | X1, …, Xi-1)
• This choice of parents guarantees:
  P(X1, …, Xn) = ∏i=1..n P(Xi | X1, …, Xi-1)   (chain rule)
               = ∏i=1..n P(Xi | Parents(Xi))   (by construction)

Example
• Suppose we choose the ordering M, J, A, B, E
• P(J | M) = P(J)? No
• P(A | J, M) = P(A | J)? No.  P(A | J, M) = P(A)? No
• P(B | A, J, M) = P(B | A)? Yes.  P(B | A, J, M) = P(B)? No
• P(E | B, A, J, M) = P(E | A)? No.  P(E | B, A, J, M) = P(E | A, B)? Yes

Example contd.
• Deciding conditional independence is hard in non-causal directions
• (Causal models and conditional independence seem hardwired for humans!)
• Network is less compact: 1 + 2 + 4 + 2 + 4 = 13 numbers needed
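To make the semantics and compactness slides concrete, here is a minimal sketch of the burglary network stored as explicit conditional probability tables. The CPT figure itself is not reproduced in this handout, so the numeric values below are the usual textbook numbers for this example and should be read as assumptions, not as part of these slides.

```python
# Minimal sketch: the burglary network as one CPT per node.
# CPT values are the standard textbook numbers (an assumption, not taken from these slides).

P_B = {True: 0.001, False: 0.999}                    # P(Burglary)
P_E = {True: 0.002, False: 0.998}                    # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,      # P(Alarm | Burglary, Earthquake)
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}                      # P(JohnCalls | Alarm)
P_M = {True: 0.70, False: 0.01}                      # P(MaryCalls | Alarm)

def joint(b, e, a, j, m):
    """P(b, e, a, j, m) as the product of each node's CPT entry given its parents."""
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = P_J[a] if j else 1 - P_J[a]
    pm = P_M[a] if m else 1 - P_M[a]
    return P_B[b] * P_E[e] * pa * pj * pm

# Example: P(j, m, a, ¬b, ¬e) = P(j|a) P(m|a) P(a|¬b,¬e) P(¬b) P(¬e) ≈ 0.00063
print(joint(b=False, e=False, a=True, j=True, m=True))
```

The ten stored numbers (1 + 1 + 4 + 2 + 2) are exactly the entries of P_B, P_E, P_A, P_J and P_M; each of the 2^5 joint probabilities is recovered as a product of five of them.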
Another example
[figure only]

Example from Medical Diagnostics
[Figure: network with patient-information nodes Visit to Asia and Smoking; medical-difficulty nodes Tuberculosis, Lung Cancer and Bronchitis; an intermediate Tuberculosis or Cancer node; and diagnostic-test nodes XRay Result and Dyspnea]
• Network represents a knowledge structure that models the relationship between medical difficulties, their causes and effects, patient information and diagnostic tests

Example from Medical Diagnostics
Tuberculosis or Cancer (deterministic OR of its parents):
  Tuberculosis  Lung Cancer   Tub or Can
  Present       Present       True
  Present       Absent        True
  Absent        Present       True
  Absent        Absent        False
P(Dyspnea | Tub or Can, Bronchitis):
  Tub or Can  Bronchitis   Present  Absent
  True        Present      0.90     0.10
  True        Absent       0.70     0.30
  False       Present      0.80     0.20
  False       Absent       0.10     0.90
• Relationship knowledge is modeled by deterministic functions, logic and conditional probability distributions

Example from Medical Diagnostics
[Initial marginal beliefs (%): Visit to Asia: Visit 1.0; Smoking: Smoker 50.0; Tuberculosis: Present 1.04; Lung Cancer: Present 5.50; Bronchitis: Present 45.0; Tub or Can: True 6.48; XRay: Abnormal 11.0; Dyspnea: Present 43.6]
• Propagation algorithm processes relationship information to provide an unconditional or marginal probability distribution for each node
• The unconditional or marginal probability distribution is frequently called the belief function of that node

[Beliefs after entering "Visit to Asia" = Visit (%): Tuberculosis: Present 5.00; Lung Cancer: Present 5.50; Bronchitis: Present 45.0; Tub or Can: True 10.2; XRay: Abnormal 14.5; Dyspnea: Present 45.0]
• As a finding is entered, the propagation algorithm updates the beliefs attached to each relevant node in the network
• Interviewing the patient produces the information that "Visit to Asia" is "Visit"
• This finding propagates through the network and the belief functions of several nodes are updated

[Beliefs after adding "Smoking" = Smoker (%): Tuberculosis: Present 5.00; Lung Cancer: Present 10.0; Bronchitis: Present 60.0; Tub or Can: True 14.5; XRay: Abnormal 18.5; Dyspnea: Present 56.4]
• Further interviewing of the patient produces the finding "Smoking" is "Smoker"
• This information propagates through the network

[Beliefs after adding "XRay Result" = Normal (%): Tuberculosis: Present 0.12; Lung Cancer: Present 0.25; Bronchitis: Present 60.0; Tub or Can: True 0.36; Dyspnea: Present 52.1]
• Finished with interviewing the patient, the physician begins the examination
• The physician now moves to specific diagnostic tests such as an X-Ray, which results in a "Normal" finding that propagates through the network
• Note that the information from this finding propagates backward and forward through the arcs

[Beliefs after adding "Dyspnea" = Present (%): Tuberculosis: Present 0.19; Lung Cancer: Present 0.39; Bronchitis: Present 92.2; Tub or Can: True 0.56]
• The physician also determines that the patient is having difficulty breathing, so the finding "Present" is entered for "Dyspnea" and is propagated through the network
• The doctor might now conclude that the patient has bronchitis and does not have tuberculosis or lung cancer
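The belief-updating walkthrough above can be reproduced with a short sketch. Only the deterministic Tuberculosis-or-Cancer node and the Dyspnea table are given on these slides; the remaining probabilities below are taken from the standard published "chest clinic" version of this network and should be treated as assumptions. For simplicity the sketch answers queries by inference by enumeration (summing the full joint distribution) rather than by the propagation algorithm the slides refer to; on a network this small the results are the same.

```python
from itertools import product

# Conditional probabilities for the chest-clinic network.
# The Dyspnea table and the deterministic OR node come from the slides above;
# the other numbers are the standard published values (assumptions here).
P_asia   = 0.01                                        # P(Visit to Asia)
P_smoke  = 0.50                                        # P(Smoking)
P_tub    = {True: 0.05, False: 0.01}                   # P(Tuberculosis | Visit to Asia)
P_lung   = {True: 0.10, False: 0.01}                   # P(Lung Cancer | Smoking)
P_bronch = {True: 0.60, False: 0.30}                   # P(Bronchitis | Smoking)
P_xray   = {True: 0.98, False: 0.05}                   # P(Abnormal X-ray | Tub or Can)
P_dysp   = {(True, True): 0.90, (True, False): 0.70,   # P(Dyspnea | Tub or Can, Bronchitis)
            (False, True): 0.80, (False, False): 0.10}

def bern(p, value):
    return p if value else 1 - p

def joint(a, s, t, l, b, x, d):
    toc = t or l                                       # deterministic OR node
    return (bern(P_asia, a) * bern(P_smoke, s) *
            bern(P_tub[a], t) * bern(P_lung[s], l) * bern(P_bronch[s], b) *
            bern(P_xray[toc], x) * bern(P_dysp[(toc, b)], d))

VARS = ["Asia", "Smoking", "Tub", "Lung", "Bronchitis", "XRay", "Dyspnea"]

def belief(query, evidence):
    """P(query = True | evidence) by summing the joint over all consistent worlds."""
    num = den = 0.0
    for values in product([True, False], repeat=len(VARS)):
        w = dict(zip(VARS, values))
        if any(w[k] != v for k, v in evidence.items()):
            continue
        p = joint(*(w[v] for v in VARS))
        den += p
        if w[query]:
            num += p
    return num / den

# Enter the findings in the order of the walkthrough and watch the beliefs update.
findings = {}
for name, value in [("Asia", True), ("Smoking", True),
                    ("XRay", False),                   # False = "Normal" X-ray result
                    ("Dyspnea", True)]:
    findings[name] = value
    print(name, {q: round(100 * belief(q, findings), 2)
                 for q in ["Tub", "Lung", "Bronchitis"]})
```

Run in the walkthrough's order, the final step gives roughly Tuberculosis 0.2%, Lung Cancer 0.4% and Bronchitis 92%, matching the conclusion that the patient most likely has bronchitis.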
Summary
• Bayesian networks provide a natural representation for (causally induced) conditional independence
• Topology + conditional probability tables = compact representation of joint distribution
• Generally easy for domain experts to construct