QUIZ!!
T/F: Forward sampling is consistent. True
T/F: Rejection sampling is faster, but inconsistent. False
T/F: Rejection sampling requires less memory than Forward sampling. True
T/F: In likelihood weighted sampling you make use of every single sample. True
T/F: Forward sampling samples from the joint distribution. True
T/F: L.W.S. only speeds up inference for r.v.’s downstream of evidence. True
T/F: Too few samples could result in division by zero. True
[xkcd comic]
CSE511a: Artificial Intelligence
Fall 2013
Lecture 18: Decision Diagrams
04/03/2013
Robert Pless,
via Kilian Q. Weinberger
Several slides adapted from Dan Klein – UC Berkeley
Announcements
Class Monday April 8 cancelled in lieu of a talk by Sanmay Das:
"Bayesian Reinforcement Learning With Censored Observations and Applications in Market Modeling and Design"
Friday, April 5, 11-12, Lopata 101.
Sampling Example
There are 2 cups.
The first contains 1 penny and 1 quarter
The second contains 2 quarters
Say I pick a cup uniformly at random, then pick a
coin randomly from that cup. It's a quarter (yes!).
What is the probability that the other coin in that
cup is also a quarter?
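One way to sanity-check the answer (2/3, not 1/2) is to simulate the experiment directly, discarding trials where the drawn coin isn't a quarter — exactly the rejection sampling quizzed above. A minimal sketch (all names illustrative):

```python
import random

def trial():
    """Pick a cup uniformly, then a coin from it uniformly; return the
    other coin in that cup, or None if the drawn coin isn't a quarter."""
    cups = [["penny", "quarter"], ["quarter", "quarter"]]
    cup = random.choice(cups)
    i = random.randrange(2)
    if cup[i] != "quarter":
        return None          # drew the penny: reject this trial
    return cup[1 - i]        # the coin we didn't draw

others = [c for c in (trial() for _ in range(100_000)) if c is not None]
print(sum(c == "quarter" for c in others) / len(others))  # ≈ 2/3, not 1/2
```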
Recap: Likelihood Weighting
[Bayes net: Cloudy → Sprinkler, Cloudy → Rain, Sprinkler/Rain → WetGrass]

Sampling distribution if z is sampled and e is fixed evidence:
$S_{WS}(z, e) = \prod_i P(z_i \mid \text{Parents}(Z_i))$

Now, samples have weights:
$w(z, e) = \prod_i P(e_i \mid \text{Parents}(E_i))$

Together, the weighted sampling distribution is consistent:
$S_{WS}(z, e) \cdot w(z, e) = \prod_i P(z_i \mid \text{Parents}(Z_i)) \prod_i P(e_i \mid \text{Parents}(E_i)) = P(z, e)$
Likelihood Weighting

[Bayes net: Cloudy → Sprinkler, Cloudy → Rain, Sprinkler/Rain → WetGrass]

P(C):
  +c  0.5
  -c  0.5

P(S | C):
  +c  +s  0.1
  +c  -s  0.9
  -c  +s  0.5
  -c  -s  0.5

P(R | C):
  +c  +r  0.8
  +c  -r  0.2
  -c  +r  0.2
  -c  -r  0.8

P(W | S, R):
  +s  +r  +w  0.99
  +s  +r  -w  0.01
  +s  -r  +w  0.90
  +s  -r  -w  0.10
  -s  +r  +w  0.90
  -s  +r  -w  0.10
  -s  -r  +w  0.01
  -s  -r  -w  0.99

Samples:
  +c, +s, +r, +w
  …
Likelihood Weighting Example

Samples (evidence: +s, +w):

  Cloudy  Rainy  Sprinkler  Wet Grass  Weight
  0       1      1          1          0.495
  0       0      1          1          0.45
  0       0      1          1          0.45
  0       0      1          1          0.45
  1       0      1          1          0.09

Inference:
Sum over weights that match query value
Divide by total sample weight

What is P(C | +w, +s)?

$P(C=1 \mid +w, +s) = \frac{\sum_{\text{samples with } C=1} \text{sample\_weight}}{\sum_{\text{all samples}} \text{sample\_weight}} = \frac{0.09}{0.495 + 0.45 + 0.45 + 0.45 + 0.09} = \frac{0.09}{1.935} \approx 0.047$
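The same estimate can be computed in a few lines. Below is a minimal likelihood-weighting sketch for this network (the structure and names are my own; True/False stand for +/-). With many samples it converges to the true posterior P(+c | +s, +w) ≈ 0.175; the five-sample estimate above (≈ 0.047) shows how noisy small samples can be.

```python
import random

# CPTs of the network above; True/False stand for +/-.
P_S = {True: 0.1, False: 0.5}    # P(+s | C)
P_R = {True: 0.8, False: 0.2}    # P(+r | C)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.01}  # P(+w | S, R)

def lw_sample(s=True, grass=True):
    """One likelihood-weighted sample with evidence S, W held fixed."""
    w = 1.0
    c = random.random() < 0.5                   # sample C from its prior
    w *= P_S[c] if s else 1 - P_S[c]            # weight by evidence S
    r = random.random() < P_R[c]                # sample R given C
    w *= P_W[s, r] if grass else 1 - P_W[s, r]  # weight by evidence W
    return c, w

num = den = 0.0
for _ in range(100_000):
    c, w = lw_sample()
    den += w
    num += w if c else 0.0
print(num / den)   # estimate of P(+c | +s, +w); converges to ≈ 0.175
```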
Likelihood Weighting

[Bayes net: Cloudy → Sprinkler, Cloudy → Rain, Sprinkler/Rain → WetGrass]

Likelihood weighting is good:
  Takes evidence into account as we generate the sample
  In the network above, W's value will get picked based on the evidence values of S, R
  More samples will reflect the state of the world suggested by the evidence

Likelihood weighting doesn't solve all our problems:
  Evidence influences the choice of downstream variables, but not upstream ones (C isn't more likely to get a value matching the evidence)
  We would like to consider evidence when we sample every variable
Markov Chain Monte Carlo*

Idea: instead of sampling from scratch, create samples that are each like the last one.

Procedure: resample one variable at a time, conditioned on all the rest, but keep evidence fixed. E.g., for P(B | +c), with samples written as (B, A, C):

  (+b, +a, +c) → (-b, +a, +c) → (-b, -a, +c)

Properties: Now samples are not independent (in fact they're nearly identical), but sample averages are still consistent estimators!

What's the point: both upstream and downstream variables condition on evidence.
Gibbs Sampling

1. Set all evidence E1,...,Em to e1,...,em
2. Do forward sampling to obtain x1,...,xn
3. Repeat:
   1. Pick any variable Xi uniformly at random.
   2. Resample xi' from P(Xi | MarkovBlanket(Xi))
   3. Set all other xj' = xj
   4. The new sample is x1',...,xn'

A sketch of this loop on the wet-grass network is shown below.
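Here is a minimal sketch of the loop above for the wet-grass network, with evidence S=+s, W=+w fixed and hidden variables C and R (CPT values copied from the earlier slide; everything else is illustrative):

```python
import random

P_S = {True: 0.1, False: 0.5}    # P(+s | C)
P_R = {True: 0.8, False: 0.2}    # P(+r | C)
P_W = {(True, True): 0.99, (True, False): 0.90,
       (False, True): 0.90, (False, False): 0.01}  # P(+w | S, R)

def bern(p):
    return random.random() < p

def gibbs_estimate(n=100_000):
    """Estimate P(+c | S=+s, W=+w): evidence stays fixed; C and R are
    resampled one at a time from P(var | its Markov blanket)."""
    c, r = bern(0.5), bern(0.5)          # step 2: some initial full sample
    hits = 0
    for _ in range(n):
        if bern(0.5):                    # pick a non-evidence variable
            # C's blanket is {S, R}: P(C | +s, r) ∝ P(C) P(+s|C) P(r|C)
            pt = 0.5 * P_S[True] * (P_R[True] if r else 1 - P_R[True])
            pf = 0.5 * P_S[False] * (P_R[False] if r else 1 - P_R[False])
            c = bern(pt / (pt + pf))
        else:
            # R's blanket is {C, S, W}: P(R | c, +s, +w) ∝ P(R|c) P(+w|+s,R)
            pt = P_R[c] * P_W[True, True]
            pf = (1 - P_R[c]) * P_W[True, False]
            r = bern(pt / (pt + pf))
        hits += c
    return hits / n

print(gibbs_estimate())   # ≈ 0.17, agreeing with likelihood weighting
```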
Markov Blanket

Markov blanket of X:
1. All parents of X
2. All children of X
3. All parents of children of X (except X itself)

X is conditionally independent of all other variables in the BN, given all variables in the Markov blanket (besides X).
MCMC algorithm

[figure slide]

The Markov Chain

[figure slide]
Summary

Sampling can be your salvation
Dominant approach to inference in BNs

Pros/Cons:
Forward Sampling
Rejection Sampling
Likelihood Weighted Sampling
Gibbs Sampling/MCMC
Google's PageRank

http://en.wikipedia.org/wiki/Markov_chain

Page, Lawrence; Brin, Sergey; Motwani, Rajeev; and Winograd, Terry (1999). The PageRank citation ranking: Bringing order to the Web.

See also:
J. Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998.
Google's PageRank

Graph of the Internet (pages and links)

[figure: graph of ten pages A-J with links]
Google's PageRank

Start at a random page, take a random walk. Where do we end up?

[figure: same graph of pages A-J]
Google's PageRank

Add 15% probability of moving to a random page. Now where do we end up?

[figure: same graph of pages A-J]
Google's PageRank

PageRank(P) = probability that a long random walk ends at node P

[figure: same graph of pages A-J]
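The "long random walk" view translates directly into code. A minimal sketch; the link structure below is made up, since the figure's edges aren't recoverable:

```python
import random

# Hypothetical links among the ten pages A-J from the figures.
links = {
    "A": ["B", "C"], "B": ["C"], "C": ["A", "D"], "D": ["E"],
    "E": ["F", "G"], "F": ["E"], "G": ["H"], "H": ["I", "A"],
    "I": ["J"], "J": ["H"],
}

def pagerank_walk(steps=100_000, teleport=0.15):
    """Estimate PageRank as the fraction of time a long random walk
    spends at each page: with probability 0.15 jump to a uniformly
    random page, otherwise follow a random outgoing link."""
    pages = list(links)
    visits = {p: 0 for p in pages}
    page = random.choice(pages)
    for _ in range(steps):
        if random.random() < teleport or not links[page]:
            page = random.choice(pages)        # random restart
        else:
            page = random.choice(links[page])  # follow a random link
        visits[page] += 1
    return {p: visits[p] / steps for p in pages}

print(pagerank_walk())
```

Counting visit frequencies estimates the stationary distribution of this Markov chain; the teleport step makes the chain irreducible, so that distribution exists and doesn't depend on the start page.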
Learning
Where do the CPTs come from?

How do you build a Bayes Net to begin with?

Two scenarios:
Complete data
Incomplete data (touched on briefly; you are not responsible for this)
Learning from complete data

Observe data from the real world:

  Cloudy  Rainy  Sprinkler  Wet Grass
  0       0      1          1
  1       1      0          1
  1       0      0          0
  0       0      1          1
  0       0      1          1
  0       0      1          1
  1       1      0          1
  0       0      1          1
  1       0      0          0
  1       1      0          1
  0       0      1          1
  1       1      0          1
  0       0      1          1
  1       1      0          1
  1       1      0          1
  0       1      0          1
  0       0      1          0

Estimate probabilities by counting:

$P(X = x \mid pa(X) = p) = \frac{\text{count}(pa(X) = p,\ X = x)}{\text{count}(pa(X) = p)}$

$P(X = x \mid pa(X) = p) = \frac{\sum_t I(x_t, x)\, I(pa(X)_t, p)}{\sum_t I(pa(X)_t, p)}$

(Indicator function I(a,b) = 1 if and only if a = b.)

(Just like rejection sampling, except we observe the real world.)
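In code, the counting estimator is one comprehension per CPT entry. A sketch using the rows as reconstructed above, estimating one entry of P(Sprinkler | Cloudy) (names are illustrative):

```python
# The 17 observations from the table above, as
# (cloudy, rainy, sprinkler, wet_grass) tuples.
data = [
    (0, 0, 1, 1), (1, 1, 0, 1), (1, 0, 0, 0), (0, 0, 1, 1), (0, 0, 1, 1),
    (0, 0, 1, 1), (1, 1, 0, 1), (0, 0, 1, 1), (1, 0, 0, 0), (1, 1, 0, 1),
    (0, 0, 1, 1), (1, 1, 0, 1), (0, 0, 1, 1), (1, 1, 0, 1), (1, 1, 0, 1),
    (0, 1, 0, 1), (0, 0, 1, 0),
]

def p_sprinkler_given_cloudy(c):
    """count(Cloudy=c, Sprinkler=1) / count(Cloudy=c)"""
    match = [row for row in data if row[0] == c]
    return sum(row[2] for row in match) / len(match)

print(p_sprinkler_given_cloudy(0))   # 8/9 ≈ 0.89
print(p_sprinkler_given_cloudy(1))   # 0/8 = 0.0
```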
Learning from incomplete data

Observe real-world data
Some variables are unobserved
Trick: Fill them in

EM algorithm:
E = Expectation
M = Maximization

  Cloudy  Rainy  Sprinkler  Wet Grass
  1       ?      ?          0
  0       ?      ?          1
  1       ?      ?          1
  1       ?      ?          0
  1       ?      ?          1
  1       ?      ?          1
  0       ?      ?          1
  1       ?      ?          0
  1       ?      ?          0
  0       ?      ?          1
  0       ?      ?          0
  0       ?      ?          1
  1       ?      ?          0
  1       ?      ?          1
  1       ?      ?          1
  1       ?      ?          0
  1       ?      ?          1
  0       ?      ?          1
  1       ?      ?          0
  1       ?      ?          0
EM-Algorithm

Initialize CPTs (e.g. randomly, uniformly)

Repeat until convergence:

E-Step: Compute for all i and x (inference):
$P(X_i = x,\ pa(X_i) = p \mid V^t)$

M-Step: Update CPT tables:
$P(X_i = x \mid pa(X_i) = p) \leftarrow \frac{\sum_t P(X_i = x,\ pa(X_i) = p \mid V^t)}{\sum_t P(pa(X_i) = p \mid V^t)}$

(Here $V^t$ denotes the values of the visible variables in the t-th data case.)

(Just like likelihood-weighted sampling, but on real-world data.)
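Below is a hedged sketch of EM for the table above: Cloudy and Wet Grass are observed, Rainy and Sprinkler are hidden, and the network is small enough to enumerate the E-step posterior exactly. Initial values and names are illustrative; note that with only C and W observed, the hidden CPTs aren't uniquely identifiable, so the result genuinely depends on initialization, as the next slide warns.

```python
from itertools import product

# Observed (cloudy, wet_grass) pairs from the table above;
# Rainy and Sprinkler are hidden. P(C) needs no EM: count its column.
data = [(1, 0), (0, 1), (1, 1), (1, 0), (1, 1), (1, 1), (0, 1), (1, 0),
        (1, 0), (0, 1), (0, 0), (0, 1), (1, 0), (1, 1), (1, 1), (1, 0),
        (1, 1), (0, 1), (1, 0), (1, 0)]

# Initialize the CPT entries to be learned (values are arbitrary).
p_s = {0: 0.3, 1: 0.6}                               # P(S=1 | C)
p_r = {0: 0.4, 1: 0.7}                               # P(R=1 | C)
p_w = {(s, r): 0.5 for s in (0, 1) for r in (0, 1)}  # P(W=1 | S, R)

def pr(p, x):   # P(X = x) for a Bernoulli with P(X=1) = p
    return p if x else 1 - p

for _ in range(50):   # "repeat until convergence" (fixed count here)
    # E-step: per data case, posterior q(s, r) ∝ P(s|c) P(r|c) P(w|s,r)
    posts = []
    for c, w in data:
        q = {(s, r): pr(p_s[c], s) * pr(p_r[c], r) * pr(p_w[s, r], w)
             for s, r in product((0, 1), (0, 1))}
        z = sum(q.values())
        posts.append({sr: v / z for sr, v in q.items()})
    # M-step: update CPTs using expected counts in place of real counts
    for c in (0, 1):
        rows = [q for (ci, _), q in zip(data, posts) if ci == c]
        p_s[c] = sum(q[1, 0] + q[1, 1] for q in rows) / len(rows)
        p_r[c] = sum(q[0, 1] + q[1, 1] for q in rows) / len(rows)
    for s, r in product((0, 1), (0, 1)):
        tot = sum(q[s, r] for q in posts)
        wet = sum(q[s, r] for (_, w), q in zip(data, posts) if w)
        p_w[s, r] = wet / tot

print(p_s, p_r, p_w)
```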
EM-Algorithm
Properties:
No learning rate
Monotonic convergence
Each iteration improves log-likelihood of data
Outcome depends on initialization
Can be slow for large Bayes Nets ...
Decision Networks
Decision Networks

[decision network: Weather → Forecast; Weather and the Umbrella action feed the utility node U]

MEU: choose the action which maximizes the expected utility given the evidence

Can directly operationalize this with decision networks:
Bayes nets with nodes for utility and actions
Lets us calculate the expected utility for each action

New node types:
Chance nodes (just like BNs)
Actions (rectangles, cannot have parents, act as observed evidence)
Utility node (diamond, depends on action and chance nodes)
Decision Networks

Action selection:
1. Instantiate all evidence
2. Set action node(s) each possible way
3. Calculate posterior for all parents of utility node, given the evidence
4. Calculate expected utility for each action
5. Choose maximizing action

[decision network: Umbrella, Weather, Forecast, U as above]
Example: Decision Networks

[decision network: Umbrella action, Weather chance node, utility U(A,W)]

W     P(W)
sun   0.7
rain  0.3

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

Umbrella = leave: EU(leave) = 0.7 · 100 + 0.3 · 0 = 70
Umbrella = take: EU(take) = 0.7 · 20 + 0.3 · 70 = 35

Optimal decision = leave
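The action-selection recipe above amounts to one expected-utility sum per action. A minimal sketch of this example (names are illustrative):

```python
P_W = {"sun": 0.7, "rain": 0.3}
U = {("leave", "sun"): 100, ("leave", "rain"): 0,
     ("take", "sun"): 20, ("take", "rain"): 70}

def eu(action, belief=P_W):
    """Expected utility of an action under a belief over Weather."""
    return sum(p * U[action, w] for w, p in belief.items())

best = max(("leave", "take"), key=eu)
print(best, eu("leave"), eu("take"))   # leave 70.0 35.0
```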
Decisions as Outcome Trees

[outcome tree: from {}, choose take or leave; a Weather chance node under each choice leads to leaves U(t,s), U(t,r), U(l,s), U(l,r)]

Almost exactly like expectimax / MDPs
Evidence in Decision Networks

Find P(W | F=bad)

[decision network: Umbrella, Weather, Forecast, U]

W     P(W)
sun   0.7
rain  0.3

F     P(F | sun)
good  0.8
bad   0.2

F     P(F | rain)
good  0.1
bad   0.9

Select for evidence:
W     P(F=bad | W)
sun   0.2
rain  0.9

First we join P(W) and P(bad | W):
W     P(W, F=bad)
sun   0.14
rain  0.27

Then we normalize:
W     P(W | F=bad)
sun   0.34
rain  0.66
Example: Decision Networks

[decision network: Umbrella, Weather, Forecast = bad, U]

W     P(W | F=bad)
sun   0.34
rain  0.66

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

Umbrella = leave: EU(leave | bad) = 0.34 · 100 + 0.66 · 0 = 34
Umbrella = take: EU(take | bad) = 0.34 · 20 + 0.66 · 70 = 53

Optimal decision = take
Value of Information

Idea: compute value of acquiring evidence
Can be done directly from decision network

Example: buying oil drilling rights
Two blocks A and B, exactly one has oil, worth k
You can drill in one location
Prior probabilities 0.5 each, and mutually exclusive
Drilling in either A or B has MEU = k/2

[decision network: OilLoc → U ← DrillLoc]

O   P(O)
a   1/2
b   1/2

D   O   U(D, O)
a   a   k
a   b   0
b   a   0
b   b   k

Question: what's the value of information?
Value of knowing which of A or B has oil
Value is expected gain in MEU from new info
Survey may say "oil in a" or "oil in b," prob 0.5 each
If we know OilLoc, MEU is k (either way)
Gain in MEU from knowing OilLoc: VPI(OilLoc) = k - k/2 = k/2
Fair price of information: k/2
Value of Information

Assume we have evidence E=e. Value if we act now:
$MEU(e) = \max_a \sum_s P(s \mid e)\, U(s, a)$

Assume we see that E' = e'. Value if we act then:
$MEU(e, e') = \max_a \sum_s P(s \mid e, e')\, U(s, a)$

BUT E' is a random variable whose value is unknown, so we don't know what e' will be.

Expected value if E' is revealed and then we act:
$MEU(e, E') = \sum_{e'} P(e' \mid e)\, MEU(e, e')$

Value of information: how much MEU goes up by revealing E' first:
$VPI(E' \mid e) = MEU(e, E') - MEU(e)$

VPI = "Value of perfect information"
VPI Example: Weather

[decision network: Umbrella, Weather, Forecast, U]

MEU with no evidence: 70 (leave)
MEU if forecast is bad: 53 (take)
MEU if forecast is good: 95 (leave)

Forecast distribution:
F     P(F)
good  0.59
bad   0.41

A      W     U(A,W)
leave  sun   100
leave  rain  0
take   sun   20
take   rain  70

VPI(Forecast) = 0.59 · 95 + 0.41 · 53 - 70 ≈ 7.8
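These numbers can be checked mechanically: for each forecast value, join and normalize to get P(W | F), take the best action, and average the resulting MEUs. A minimal sketch reusing the tables above (names are illustrative):

```python
P_W = {"sun": 0.7, "rain": 0.3}
P_F_given_W = {("good", "sun"): 0.8, ("bad", "sun"): 0.2,
               ("good", "rain"): 0.1, ("bad", "rain"): 0.9}
U = {("leave", "sun"): 100, ("leave", "rain"): 0,
     ("take", "sun"): 20, ("take", "rain"): 70}

def meu(belief):
    """Max over actions of expected utility under a belief over Weather."""
    return max(sum(p * U[a, w] for w, p in belief.items())
               for a in ("leave", "take"))

meu_now = meu(P_W)                    # act with no evidence: 70 (leave)

vpi = -meu_now
for f in ("good", "bad"):
    joint = {w: P_F_given_W[f, w] * P_W[w] for w in P_W}   # P(W, F=f)
    p_f = sum(joint.values())                              # P(F=f)
    posterior = {w: joint[w] / p_f for w in joint}         # P(W | F=f)
    vpi += p_f * meu(posterior)       # expected MEU after seeing F

print(vpi)   # ≈ 7.7 exactly; the slide's rounded tables give ≈ 7.8
```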
VPI Properties

Nonnegative: $VPI(E' \mid e) \ge 0$
Nonadditive: consider, e.g., obtaining $E_j$ twice; in general $VPI(E_j, E_k \mid e) \ne VPI(E_j \mid e) + VPI(E_k \mid e)$
Order-independent: $VPI(E_j, E_k \mid e) = VPI(E_j \mid e) + VPI(E_k \mid e, E_j) = VPI(E_k \mid e) + VPI(E_j \mid e, E_k)$
Quick VPI Questions
The soup of the day is either clam chowder or split pea,
but you wouldn’t order either one. What’s the value of
knowing which it is?
There are two kinds of plastic forks at a picnic. It must
be that one is slightly better. What’s the value of
knowing which?
You have $10 to bet double-or-nothing and there is a
75% chance that the Rams will beat the 49ers. What’s
the value of knowing the outcome in advance?
You must bet on the Rams, either way. What’s the
value now?
That’s all
Next Lecture
Hidden Markov Models!
(You're gonna love it!!)
Reminder, next lecture is a week from today (on Wednesday).
Go to the talk by Sanmay Das, Lopata 101, Friday 11-12.