Probabilistic Reasoning
ECE457 Applied Artificial Intelligence
Spring 2007
Lecture #9
Outline

- Bayesian networks
- D-separation and independence
- Inference

Reading: Russell & Norvig, sections 14.1 to 14.4
Recall the Story from FOL

Anyone passing their 457 exam and winning the lottery is happy. Anyone who studies or is lucky can pass all their exams. Bob did not study but is lucky. Anyone who's lucky can win the lottery. Is Bob happy?
Add Probabilities

Anyone passing their 457 exam and winning the lottery has a 99% chance of being happy. Anyone only passing their 457 exam has an 80% chance of being happy, while someone only winning the lottery has a 60% chance, and someone who does neither has a 20% chance. Anyone who studies has a 90% chance of passing their exams. Anyone who's lucky has a 50% chance of passing their exams. Anyone who's both lucky and who studied has a 99% chance of passing, but someone who didn't study and is unlucky has a 1% chance of passing. There's a 20% chance that Bob studied, but a 75% chance that he'll be lucky. Anyone who's lucky has a 40% chance of winning the lottery, while an unlucky person only has a 1% chance of winning.

What's the probability of Bob being happy?
Probabilities in the Story

Example of probabilities in the story:

- P(Lucky) = 0.75
- P(Study) = 0.2
- P(PassExam|Study) = 0.9
- P(PassExam|Lucky) = 0.5
- P(Win|Lucky) = 0.4
- P(Happy|PassExam,Win) = 0.99

Some variables directly affect others! Can we build a graphical representation of the dependencies and conditional independencies between variables?
Bayesian Network

[Figure: belief network with edges Lucky → Win, Lucky → PassExam, Study → PassExam, Win → Happy, PassExam → Happy]

- Also called a belief network
- Directed acyclic graph
- Nodes represent variables
- Edges represent conditional relationships
- Concise representation of any full joint probability distribution
Bayesian Network

[Figure: the same network]

- Nodes with no parents have prior probabilities
- Nodes with parents have conditional probability tables, covering all truth value combinations of their parents
Bayesian Network

(L = Lucky, S = Study, W = Win, E = PassExam, H = Happy)

P(L) = 0.75    P(S) = 0.2

P(W|L):
  L = F: P(W) = 0.01
  L = T: P(W) = 0.4

P(E|L,S):
  L = F, S = F: P(E) = 0.01
  L = T, S = F: P(E) = 0.5
  L = F, S = T: P(E) = 0.9
  L = T, S = T: P(E) = 0.99

P(H|W,E):
  W = F, E = F: P(H) = 0.2, P(¬H) = 0.8
  W = T, E = F: P(H) = 0.6, P(¬H) = 0.4
  W = F, E = T: P(H) = 0.8, P(¬H) = 0.2
  W = T, E = T: P(H) = 0.99, P(¬H) = 0.01
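These tables are straightforward to carry around in code. Below is a minimal sketch in Python (my own encoding, not from the lecture): each table stores P(node = T | parents) keyed by the parents' truth values, and the probability of F is simply the complement.

```python
# Prior probabilities for the root nodes.
priors = {
    "Lucky": 0.75,
    "Study": 0.20,
}

# Conditional probability tables: P(node = T | parent values).
# Keys are tuples of parent truth values, in the order listed.
cpt = {
    # P(Win | Lucky)
    "Win": {(False,): 0.01, (True,): 0.40},
    # P(PassExam | Lucky, Study)
    "PassExam": {
        (False, False): 0.01,
        (True,  False): 0.50,
        (False, True):  0.90,
        (True,  True):  0.99,
    },
    # P(Happy | Win, PassExam)
    "Happy": {
        (False, False): 0.20,
        (True,  False): 0.60,
        (False, True):  0.80,
        (True,  True):  0.99,
    },
}
```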
Bayesian Network

[Figure: a much larger example Bayesian network, with over two dozen nodes labelled a through z]
Chain Rule

Recall the chain rule:

- P(A,B) = P(A|B)P(B)
- P(A,B,C) = P(A|B,C)P(B,C)
- P(A,B,C) = P(A|B,C)P(B|C)P(C)
- P(A1,A2,…,An) = P(A1|A2,…,An)P(A2|A3,…,An)…P(An-1|An)P(An)
- P(A1,A2,…,An) = ∏i=1..n P(Ai|Ai+1,…,An)
Chain Rule

- If we know the value of a node's parents, we don't care about more distant ancestors: their influence is included through the parents
- A node is conditionally independent of its predecessors given its parents
- More generally, a node is conditionally independent of its non-descendants given its parents
- Updated chain rule: P(A1,A2,…,An) = ∏i=1..n P(Ai|parents(Ai))
Chain Rule Example

Probability that Bob is happy because he won the lottery and passed his exam, because he's lucky but did not study:

P(H,W,E,L,¬S) = P(H|W,E) P(W|L) P(E|L,¬S) P(L) P(¬S)
P(H,W,E,L,¬S) = 0.99 × 0.4 × 0.5 × 0.75 × 0.8
P(H,W,E,L,¬S) = 0.1188 ≈ 0.12
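For illustration, the same chain-rule product can be computed from the `priors` and `cpt` dictionaries sketched earlier (a hypothetical helper, not the lecture's code):

```python
def joint(h, w, e, l, s):
    """P(Happy=h, Win=w, PassExam=e, Lucky=l, Study=s) by the chain rule."""
    def p(p_true, value):
        # Probability that a Boolean variable takes `value`,
        # given the probability that it is True.
        return p_true if value else 1.0 - p_true

    return (p(priors["Lucky"], l)
            * p(priors["Study"], s)
            * p(cpt["Win"][(l,)], w)
            * p(cpt["PassExam"][(l, s)], e)
            * p(cpt["Happy"][(w, e)], h))

print(joint(True, True, True, True, False))
# 0.99 * 0.4 * 0.5 * 0.75 * 0.8 = 0.1188
```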
Constructing Bayesian Nets

[Figure: the Lucky/Study/Win/PassExam/Happy network]

- Build from the top down
- Start with root nodes
- Add children
- Go down to the leaves
Constructing Bayesian Nets

[Figure: the same network]

- What happens if we build with the wrong order?
- The network becomes needlessly complicated
- Node ordering is important!
Connections

We can understand dependence in a network by considering how evidence is transmitted through it:

- Information entered at one node
- Propagates to descendants and ancestors through connected nodes
- Provided no node in the path already has evidence (in which case we stop the propagation)
Serial Connection

[Figure: the network, with the serial path Study → PassExam → Happy]

- Study and Happy are dependent
- Study and Happy are independent given PassExam
- Intuitively, the only way Study can affect Happy is through PassExam
Converging Connection

[Figure: the network, with the converging connection Lucky → PassExam ← Study]

- Lucky and Study are independent
- Lucky and Study are dependent given PassExam
- Intuitively, Lucky can be used to explain away Study
Diverging Connection

[Figure: the network, with the diverging connection Win ← Lucky → PassExam]

- Win and PassExam are dependent
- Win and PassExam are independent given Lucky
- Intuitively, Lucky can explain both Win and PassExam; Win and PassExam can affect each other by changing the belief in Lucky
D-Separation

Determines whether two variables are independent given some other variables:

- X is independent of Y given Z if X and Y are d-separated given Z
- X is d-separated from Y if, for every (undirected) path between X and Y, there exists a node Z on the path for which either:
  - The connection is serial or diverging and there is evidence for Z, or
  - The connection is converging and there is no evidence for Z or any of its descendants
D-Separation

[Figure: the three connection types and their blocking conditions]

- Serial (X → Z → Y): Z blocks the path if it is in evidence
- Diverging (X ← Z → Y): Z blocks the path if it is in evidence
- Converging (X → Z ← Y): Z blocks the path if neither Z nor any of its descendants is in evidence
D-Separation

- Can be computed in linear time using a depth-first search algorithm
- A fast algorithm to know whether two nodes are independent
- Allows us to infer whether learning the value of one variable might give us information about another variable, given what we already know
- All d-separated variables are independent, but not all independent variables are d-separated
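A sketch of such a check, assuming the standard linear-time "active trail" reachability traversal (Koller & Friedman style; my own implementation, not the lecture's, with the network given as a dict from each node to its list of parents):

```python
from collections import defaultdict

def d_separated(x, y, z, parents):
    """True if x and y are d-separated given the evidence set z."""
    children = defaultdict(list)
    for node, ps in parents.items():
        for p in ps:
            children[p].append(node)

    # Phase 1: z plus all ancestors of z. A converging connection is
    # unblocked exactly when its node is in this set.
    ancestors, stack = set(), list(z)
    while stack:
        n = stack.pop()
        if n not in ancestors:
            ancestors.add(n)
            stack.extend(parents.get(n, []))

    # Phase 2: search over (node, direction) states, where "up" means the
    # trail reached the node from a child, and "down" from a parent.
    visited, frontier = set(), [(x, "up")]
    while frontier:
        node, direction = frontier.pop()
        if (node, direction) in visited:
            continue
        visited.add((node, direction))
        if node == y:
            return False  # an active trail reaches y
        if direction == "up" and node not in z:
            # Serial/diverging step: continue to parents and children.
            frontier += [(p, "up") for p in parents.get(node, [])]
            frontier += [(c, "down") for c in children[node]]
        elif direction == "down":
            if node not in z:
                frontier += [(c, "down") for c in children[node]]
            if node in ancestors:
                # Converging connection activated by z or a descendant in z.
                frontier += [(p, "up") for p in parents.get(node, [])]
    return True

net = {"Lucky": [], "Study": [], "Win": ["Lucky"],
       "PassExam": ["Lucky", "Study"], "Happy": ["Win", "PassExam"]}
print(d_separated("Lucky", "Study", set(), net))         # True
print(d_separated("Lucky", "Study", {"PassExam"}, net))  # False (explaining away)
```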
D-Separation Exercise

[Figure: an example network with nodes a through j]

- If we observe a value for node g, what other nodes are updated?
  - Nodes f, h and i
- If we observe a value for node a, what other nodes are updated?
  - Nodes b, c, d, e, f
D-Separation Exercise

[Figure: the same network]

- Given an observation of c, are nodes a and f independent?
  - Yes
- Given an observation of i, are nodes g and j independent?
  - No
Other Independence Criteria

[Figure: the large example network]

- A node is conditionally independent of its non-descendants given its parents
- Recall this from the updated chain rule
Other Independence Criteria

[Figure: the large example network]

- A node is conditionally independent of all others in the network given its parents, children, and children's parents
- This set of nodes is called the node's Markov blanket
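As an illustration, a node's Markov blanket can be read directly off the parent lists (a hypothetical helper, reusing the `net` dict from the d-separation sketch):

```python
def markov_blanket(node, parents):
    """Parents, children, and children's other parents of `node`."""
    children = [n for n, ps in parents.items() if node in ps]
    blanket = set(parents.get(node, [])) | set(children)
    for c in children:
        blanket |= set(parents[c])  # co-parents
    blanket.discard(node)
    return blanket

print(markov_blanket("PassExam", net))
# {'Lucky', 'Study', 'Happy', 'Win'} -- Win is a co-parent through Happy
```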
Inference in Bayesian Network

- Compute the posterior probability of a query variable given an observed event
- P(A1,A2,…,An) = ∏i=1..n P(Ai|parents(Ai))
- Observed evidence variables E = E1,…,Em
- Query variable X
- Between them: non-evidence (hidden) variables Y = Y1,…,Yl
- The belief network is X ∪ E ∪ Y
Inference in Bayesian Network

We want P(X|E).

- Recall Bayes' theorem: P(A|B) = P(A,B) / P(B)
  - P(X|E) = α P(X,E)
- Recall marginalization: P(Ai) = Σj P(Ai,Bj)
  - P(X|E) = α ΣY P(X,E,Y)
- Recall the chain rule: P(A1,A2,…,An) = ∏i=1..n P(Ai|parents(Ai))
  - P(X|E) = α ΣY ∏A∈X∪E∪Y P(A|parents(A))
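This sum can be implemented directly by enumerating the hidden variables. A minimal, exponential-time sketch building on the `joint` function from the chain-rule example (my own code, not the lecture's):

```python
from itertools import product

VARS = ["Happy", "Win", "PassExam", "Lucky", "Study"]

def full_joint(a):
    """Joint probability of a complete assignment dict, via the chain rule."""
    return joint(a["Happy"], a["Win"], a["PassExam"], a["Lucky"], a["Study"])

def query(x, evidence):
    """P(x | evidence) by enumeration: sum the full joint over the hidden
    variables, then normalize over the query variable (the alpha above)."""
    hidden = [v for v in VARS if v != x and v not in evidence]
    dist = {}
    for value in (True, False):
        total = 0.0
        for combo in product([True, False], repeat=len(hidden)):
            a = dict(evidence)
            a[x] = value
            a.update(zip(hidden, combo))
            total += full_joint(a)
        dist[value] = total
    alpha = 1.0 / sum(dist.values())
    return {v: alpha * p for v, p in dist.items()}
```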
Inference Example

[Figure: the Lucky/Study/Win/PassExam/Happy network, with the conditional probability tables given earlier]
Inference Example #1

With only the information from the network (and no observations), what's the probability that Bob won the lottery?

P(W) = Σl P(W,l)
P(W) = Σl P(W|l)P(l)
P(W) = P(W|L)P(L) + P(W|¬L)P(¬L)
P(W) = 0.4 × 0.75 + 0.01 × 0.25
P(W) = 0.3025
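The enumeration sketch above reproduces this number (illustrative usage):

```python
print(query("Win", {}))
# {True: 0.3025, False: 0.6975} (up to floating-point rounding)
```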
Inference Example #2

Given that we know that Bob is happy, what's the probability that Bob won the lottery?

From the network, we know:
P(h,e,w,s,l) = P(l)P(s)P(e|l,s)P(w|l)P(h|w,e)

We want to find:
P(W|H) = α Σl Σs Σe P(l)P(s)P(e|l,s)P(W|l)P(H|W,e)

P(¬W|H) is also needed, to normalize.
Inference Example #2

Terms for W = T, one row per assignment of (l, s, e):

l  s  e  | P(l)  P(s)  P(e|l,s)  P(W|l)  P(H|W,e) | product
F  F  F  | 0.25  0.8   0.99      0.01    0.6      | 0.001188
T  F  F  | 0.75  0.8   0.5       0.4     0.6      | 0.072
F  T  F  | 0.25  0.2   0.1       0.01    0.6      | 0.00003
T  T  F  | 0.75  0.2   0.01      0.4     0.6      | 0.00036
F  F  T  | 0.25  0.8   0.01      0.01    0.99     | 0.0000198
T  F  T  | 0.75  0.8   0.5       0.4     0.99     | 0.1188
F  T  T  | 0.25  0.2   0.9       0.01    0.99     | 0.0004455
T  T  T  | 0.75  0.2   0.99      0.4     0.99     | 0.058806

Summing the last column: P(W|H) = α 0.2516493
Inference Example #2

Terms for W = F:

l  s  e  | P(l)  P(s)  P(e|l,s)  P(¬W|l)  P(H|¬W,e) | product
F  F  F  | 0.25  0.8   0.99      0.99     0.2       | 0.039204
T  F  F  | 0.75  0.8   0.5       0.6      0.2       | 0.036
F  T  F  | 0.25  0.2   0.1       0.99     0.2       | 0.00099
T  T  F  | 0.75  0.2   0.01      0.6      0.2       | 0.00018
F  F  T  | 0.25  0.8   0.01      0.99     0.8       | 0.001584
T  F  T  | 0.75  0.8   0.5       0.6      0.8       | 0.144
F  T  T  | 0.25  0.2   0.9       0.99     0.8       | 0.03564
T  T  T  | 0.75  0.2   0.99      0.6      0.8       | 0.07128

Summing the last column: P(¬W|H) = α 0.328878
Inference Example #2

P(W|H) = α ⟨0.2516493, 0.328878⟩
P(W|H) = ⟨0.4335, 0.5665⟩

- Note that P(¬W|H) > P(W|H), because P(W|L) < P(¬W|L)
- Still, the probability of Bob having won the lottery has increased by 13.1 percentage points (from P(W) = 0.3025) thanks to our knowledge that he is happy!
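With evidence Happy = T, the enumeration sketch gives the same posterior (illustrative usage):

```python
print(query("Win", {"Happy": True}))
# {True: 0.4335, False: 0.5665} -- alpha-normalized <0.2516493, 0.328878>
```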
Expert Systems

Bayesian networks are used to implement expert systems:

- Diagnostic systems that contain subject-specific knowledge
- Knowledge (nodes, relationships, probabilities) typically provided by human experts
- The system observes evidence by asking the user questions, then infers the most likely conclusion
Pathfinder

Expert system for medical diagnosis of lymph-node diseases. A very large Bayesian network:

- Over 60 diseases
- Over 100 features of lymph nodes
- Over 30 features for clinical information

A lot of work from medical experts:

- 8 hours to define features and diseases
- 35 hours to build the network topology
- 40 hours to assess the probabilities
Pathfinder

- One node for each disease
  - Assumes the diseases are mutually exclusive and exhaustive
- Large domain, hard to handle
  - Several small networks for individual diagnostic tasks were built
  - Then combined into a single large network
Pathfinder

Testing the network:

- 53 test cases (real diagnostics)
- Diagnostic accuracy as good as a medical expert
Assumptions

- Learning agent
- Environment
  - Fully observable / Partially observable
  - Deterministic / Strategic / Stochastic
  - Sequential
  - Static / Semi-dynamic
  - Discrete / Continuous
  - Single agent / Multi-agent
Assumptions Updated

We can handle a new combination!

- Fully observable & Deterministic: no uncertainty (map of Romania)
- Fully observable & Stochastic: games of chance (Monopoly, Backgammon)
- Partially observable & Deterministic: logic (Wumpus World)
- Partially observable & Stochastic