QUIZ!!
T/F: Forward sampling is consistent. True
T/F: Rejection sampling is faster, but inconsistent. False
T/F: Rejection sampling requires less memory than Forward sampling. True
T/F: In likelihood weighted sampling you make use of every single sample. True
T/F: Forward sampling samples from the joint distribution. True
T/F: L.W.S. only speeds up inference for r.v.’s downstream of evidence. True
T/F: Too few samples could result in division by zero. True
[xkcd comic]
CSE511a: Artificial Intelligence
Spring 2013
Lecture 18: Decision Diagrams
04/03/2013
Robert Pless,
via Kilian Q. Weinberger
Several slides adapted from Dan Klein – UC Berkeley
Announcements
Class Monday April 8 cancelled in lieu of a talk by Sanmay Das:
“Bayesian Reinforcement Learning With Censored Observations and Applications in Market Modeling and Design”
Friday, April 5, 11-12, Lopata 101.
Sampling Example
 There are 2 cups.
 The first contains 1 penny and 1 quarter.
 The second contains 2 quarters.
 Say I pick a cup uniformly at random, then pick a coin randomly from that cup. It's a quarter (yes!). What is the probability that the other coin in that cup is also a quarter?
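A quick Monte Carlo check of this puzzle (a sketch, not from the slides) agrees with the Bayes answer of 2/3, since drawing a quarter is twice as likely from the all-quarters cup:

```python
import random

# Monte Carlo check of the two-cup example above (a sketch, not from the slides).
# Cup 0 holds a penny and a quarter; cup 1 holds two quarters.
cups = [["penny", "quarter"], ["quarter", "quarter"]]

random.seed(0)
quarter_first = both_quarters = 0
for _ in range(100_000):
    cup = random.choice(cups)     # pick a cup uniformly at random
    i = random.randrange(2)       # pick a coin uniformly from that cup
    if cup[i] == "quarter":
        quarter_first += 1
        if cup[1 - i] == "quarter":
            both_quarters += 1

# Bayes: P(other = quarter | drew quarter) = (1/2 * 1) / (1/2 * 1/2 + 1/2 * 1) = 2/3
print(both_quarters / quarter_first)   # ≈ 0.667
```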
Recap: Likelihood Weighting
 Sampling distribution if z is sampled and e is fixed evidence:
    S_WS(z, e) = ∏_i P(z_i | Parents(Z_i))
 Now, samples have weights:
    w(z, e) = ∏_i P(e_i | Parents(E_i))
 Together, the weighted sampling distribution is consistent:
    S_WS(z, e) · w(z, e) = ∏_i P(z_i | Parents(Z_i)) · ∏_i P(e_i | Parents(E_i)) = P(z, e)
[Bayes net diagram: Cloudy → Sprinkler, Rain; Sprinkler, Rain → WetGrass]
Likelihood Weighting
[Bayes net diagram: Cloudy → Sprinkler, Rain; Sprinkler, Rain → WetGrass]

P(C):      +c: 0.5   -c: 0.5

P(S|C):    +c: +s 0.1, -s 0.9
           -c: +s 0.5, -s 0.5

P(R|C):    +c: +r 0.8, -r 0.2
           -c: +r 0.2, -r 0.8

P(W|S,R):  +s,+r: +w 0.99, -w 0.01
           +s,-r: +w 0.90, -w 0.10
           -s,+r: +w 0.90, -w 0.10
           -s,-r: +w 0.01, -w 0.99

Samples:
+c, +s, +r, +w
…
Likelihood Weighting Example

Sample   Cloudy   Rainy   Sprinkler   Wet Grass   Weight
1        0        1       1           1           0.495
2        0        0       1           1           0.45
3        0        0       1           1           0.45
4        0        0       1           1           0.45
5        1        0       1           1           0.09

 Inference:
 Sum over weights that match query value
 Divide by total sample weight
 What is P(C|+w,+s)?

    P(C=1 | +w, +s) = ( Σ_{samples with C=1} sample_weight ) / ( Σ_{all samples} sample_weight )
                    = 0.09 / (0.495 + 0.45 + 0.45 + 0.45 + 0.09) = 0.09 / 1.935 ≈ 0.047
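As a sketch (not from the slides; variable names are mine), likelihood weighting for this exact network and query looks like:

```python
import random

# Likelihood weighting for P(C | S=+s, W=+w) on the Cloudy/Sprinkler/Rain/
# WetGrass network above, using the CPTs from the slides (illustrative code).
P_c = 0.5                                    # P(+c)
P_s_given_c = {True: 0.1, False: 0.5}        # P(+s | C)
P_r_given_c = {True: 0.8, False: 0.2}        # P(+r | C)
P_w_given_sr = {(True, True): 0.99, (True, False): 0.90,
                (False, True): 0.90, (False, False): 0.01}   # P(+w | S, R)

def lw_estimate(n_samples=100_000):
    total = c_total = 0.0
    for _ in range(n_samples):
        c = random.random() < P_c                # sample C from its prior
        weight = P_s_given_c[c]                  # evidence S=+s: weight, don't sample
        r = random.random() < P_r_given_c[c]     # sample R given C
        weight *= P_w_given_sr[(True, r)]        # evidence W=+w: weight again
        total += weight
        if c:
            c_total += weight
    return c_total / total                       # matching weights / all weights

random.seed(0)
print(lw_estimate())   # ≈ 0.175 (exact: 0.0486 / 0.2781)
```

The five-sample estimate above (≈ 0.047) is far from the exact 0.175; with many samples the weighted estimate converges.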
Likelihood Weighting
 Likelihood weighting is good
 Takes evidence into account as we generate the sample
 At right: W’s value will get picked based on the evidence values of S, R
 More samples will reflect the state of the world suggested by the evidence
 Likelihood weighting doesn’t solve all our problems
 Evidence influences the choice of downstream variables, but not upstream ones (C isn’t more likely to get a value matching the evidence)
 We would like to consider evidence when we sample every variable
[Bayes net diagram: Cloudy → Sprinkler, Rain; Sprinkler, Rain → WetGrass]
Markov Chain Monte Carlo*
 Idea: instead of sampling from scratch, create samples that are each like the last one.
 Procedure: resample one variable at a time, conditioned on all the rest, but keep evidence fixed. E.g., for P(B|+c):
    (+b, +a, +c) → (-b, +a, +c) → (-b, -a, +c)
 Properties: Now samples are not independent (in fact they’re nearly identical), but sample averages are still consistent estimators!
 What’s the point: both upstream and downstream variables condition on evidence.
Gibbs Sampling
1. Set all evidence E1,...,Em to e1,...,em
2. Do forward sampling to obtain x1,...,xn
3. Repeat:
   1. Pick any variable Xi uniformly at random.
   2. Resample xi’ from P(Xi | MarkovBlanket(Xi))
   3. Set all other xj’ = xj
   4. The new sample is x1’,...,xn’
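A sketch of this loop for the same sprinkler network and query P(C | S=+s, W=+w) (illustrative code; the full conditionals come from each variable's Markov blanket, defined on the next slide):

```python
import random

# Gibbs sampling sketch for P(C | S=+s, W=+w); CPTs repeated from the
# likelihood-weighting sketch above (illustrative code, not from the slides).
P_c = 0.5
P_s_given_c = {True: 0.1, False: 0.5}
P_r_given_c = {True: 0.8, False: 0.2}
P_w_given_sr = {(True, True): 0.99, (True, False): 0.90,
                (False, True): 0.90, (False, False): 0.01}

def gibbs_estimate(n_samples=100_000, burn_in=1_000):
    c, r = True, True            # arbitrary initial state; evidence S, W stay fixed
    c_count = 0
    for t in range(n_samples + burn_in):
        # Resample C from P(C | +s, r) ∝ P(C) P(+s|C) P(r|C)  (C's Markov blanket)
        score = lambda cv: (P_c if cv else 1 - P_c) * P_s_given_c[cv] * \
                           (P_r_given_c[cv] if r else 1 - P_r_given_c[cv])
        c = random.random() < score(True) / (score(True) + score(False))
        # Resample R from P(R | c, +s, +w) ∝ P(R|c) P(+w | +s, R)
        score_t = P_r_given_c[c] * P_w_given_sr[(True, True)]
        score_f = (1 - P_r_given_c[c]) * P_w_given_sr[(True, False)]
        r = random.random() < score_t / (score_t + score_f)
        if t >= burn_in and c:
            c_count += 1
    return c_count / n_samples

random.seed(1)
print(gibbs_estimate())   # ≈ 0.175, consistent with likelihood weighting
```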
Markov Blanket
Markov blanket of X:
1. All parents of X
2. All children of X
3. All parents of children of X (except X itself)

X is conditionally independent of all other variables in the BN, given all variables in the Markov blanket (besides X).
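A small illustrative helper (not from the slides) that reads the blanket off a {variable: list-of-parents} description of the network:

```python
# Markov blanket of x: parents, children, and the children's other parents.
def markov_blanket(x, parents):
    children = [v for v, ps in parents.items() if x in ps]
    spouses = {p for ch in children for p in parents[ch] if p != x}
    return set(parents.get(x, [])) | set(children) | spouses

# Sprinkler network: C -> S, C -> R, (S, R) -> W
net = {"C": [], "S": ["C"], "R": ["C"], "W": ["S", "R"]}
print(markov_blanket("R", net))   # {'C', 'S', 'W'}: parent, child, child's other parent
```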
MCMC algorithm

The Markov Chain
Summary
 Sampling can be your salvation
 Dominant approach to inference in BNs
 Pros/Cons:
 Forward Sampling
 Rejection Sampling
 Likelihood Weighted Sampling
 Gibbs Sampling/MCMC
Google’s PageRank
http://en.wikipedia.org/wiki/Markov_chain
Page, Lawrence; Brin, Sergey; Motwani, Rajeev and Winograd, Terry (1999). The PageRank citation ranking: Bringing order to the Web.
See also: J. Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998.
Google’s PageRank
Graph of the Internet (pages and links)
[Graph: pages A–J connected by hyperlinks]
Google’s PageRank
Start at a random page, take a random walk. Where do we end up?
[Same graph of pages A–J]
Google’s PageRank
Add 15% probability of moving to a random page. Now where do we end up?
[Same graph of pages A–J]
Google’s PageRank
PageRank(P) = probability that a long random walk ends at node P
[Same graph of pages A–J]
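A power-iteration sketch of this random walk with 15% teleportation (the link structure below is an arbitrary stand-in, not the graph from the slides):

```python
# PageRank by power iteration: a random surfer who, with probability 0.15,
# jumps to a uniformly random page (sketch; the links below are made up).
def pagerank(links, damping=0.85, iters=100):
    pages = list(links)
    rank = {p: 1 / len(pages) for p in pages}
    for _ in range(iters):
        new = {p: (1 - damping) / len(pages) for p in pages}   # teleport mass
        for p in pages:
            out = links[p] or pages    # a page with no out-links points everywhere
            for q in out:
                new[q] += damping * rank[p] / len(out)
        rank = new
    return rank

links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
print(pagerank(links))   # C collects the most mass; D, with no in-links, the least
```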
Learning
Where do the CPTs come from?
 How do you build a Bayes Net to begin with?
 Two scenarios:
 Complete data
 Incomplete data (touched on briefly; you are not responsible for this)
Learning from complete data
 Observe data from the real world:
 Estimate probabilities by counting:

    P(X = x | pa(X) = π) = count(pa(X) = π, X = x) / count(pa(X) = π)

    P(X = x | pa(X) = π) = ( Σ_t I(x_t, x) · I(pa(X)_t, π) ) / ( Σ_t I(pa(X)_t, π) )

 (Just like rejection sampling, except we observe the real world.)
Cloudy   Rainy   Sprinkler   Wet Grass
0        0       1           1
1        1       0           1
1        0       0           0
0        0       1           1
0        0       1           1
0        0       1           1
1        1       0           1
0        0       1           1
1        0       0           0
1        1       0           1
0        0       1           1
1        1       0           1
0        0       1           1
1        1       0           1
1        1       0           1
0        1       0           1
0        0       1           0
(Indicator function I(a,b)=1 if and only if a=b.)
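A sketch of this counting estimate on the table above, for the single entry P(Sprinkler=1 | Cloudy) (column values transcribed from the table; the code itself is illustrative):

```python
# Counting estimate of one CPT entry from the complete-data table above.
cloudy    = [0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0]
sprinkler = [1, 0, 0, 1, 1, 1, 0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1]

for c in (0, 1):
    rows = [s for cl, s in zip(cloudy, sprinkler) if cl == c]
    # P(S=1 | C=c) = count(C=c, S=1) / count(C=c)
    print(f"P(S=1 | C={c}) = {sum(rows)}/{len(rows)} = {sum(rows) / len(rows):.2f}")
```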
Learning from incomplete data
 Observe real-world data
 Some variables are unobserved
 Trick: Fill them in
 EM algorithm
 E = Expectation
 M = Maximization

Cloudy   Rainy   Sprinkler   Wet Grass
1        ?       ?           0
0        ?       ?           1
1        ?       ?           1
1        ?       ?           0
1        ?       ?           1
1        ?       ?           1
0        ?       ?           1
1        ?       ?           0
1        ?       ?           0
0        ?       ?           1
0        ?       ?           0
0        ?       ?           1
1        ?       ?           0
1        ?       ?           1
1        ?       ?           1
1        ?       ?           0
1        ?       ?           1
0        ?       ?           1
1        ?       ?           0
1        ?       ?           0
EM-Algorithm
 Initialize CPTs (e.g. randomly, uniformly)
 Repeat until convergence
 E-Step:
 Compute for all i and x (inference):
    P(X_i = x, pa(X_i) = π | V^t)
 M-Step:
 Update CPT tables:
    P(X_i = x | pa(X_i) = π) ← ( Σ_t P(X_i = x, pa(X_i) = π | V^t) ) / ( Σ_t P(pa(X_i) = π | V^t) )
 (Just like likelihood-weighted sampling, but on real-world data.)
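A minimal EM sketch for the smallest interesting case, a chain C → R → W with R hidden (the structure, data, and names here are illustrative, not the table above):

```python
import random

# EM sketch: chain C -> R -> W, R hidden. Illustrative model and data only.
data = [(1, 1), (1, 0), (0, 0), (1, 1), (0, 1),
        (0, 0), (1, 1), (1, 1), (0, 0), (1, 0)]   # observed (c, w) pairs

random.seed(0)
p_r = [random.random(), random.random()]   # p_r[c] = P(R=1 | C=c)
p_w = [random.random(), random.random()]   # p_w[r] = P(W=1 | R=r)

def bern(p, v):            # P(V=v) for a Bernoulli with P(V=1) = p
    return p if v else 1 - p

for _ in range(50):
    # E-step: posterior P(R=1 | c, w) for every data case (this is inference)
    num_r, den_r = [0.0, 0.0], [0.0, 0.0]
    num_w, den_w = [0.0, 0.0], [0.0, 0.0]
    for c, w in data:
        like1 = bern(p_r[c], 1) * bern(p_w[1], w)
        like0 = bern(p_r[c], 0) * bern(p_w[0], w)
        post1 = like1 / (like1 + like0)
        num_r[c] += post1
        den_r[c] += 1
        for r, p in ((1, post1), (0, 1 - post1)):
            den_w[r] += p
            num_w[r] += p * w
    # M-step: re-estimate CPTs from expected counts
    p_r = [num_r[c] / den_r[c] for c in (0, 1)]
    p_w = [num_w[r] / den_w[r] for r in (0, 1)]

print("P(R=1|C):", [round(p, 3) for p in p_r])
print("P(W=1|R):", [round(p, 3) for p in p_w])
```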
EM-Algorithm
 Properties:
 No learning rate
 Monotonic convergence
 Each iteration improves log-likelihood of data
 Outcome depends on initialization
 Can be slow for large Bayes Nets ...
Decision Networks
Decision Networks
 MEU: choose the action which maximizes the expected utility given the evidence
 Can directly operationalize this with decision networks
 Bayes nets with nodes for utility and actions
 Lets us calculate the expected utility for each action
 New node types:
 Chance nodes (just like BNs)
 Actions (rectangles, cannot have parents, act as observed evidence)
 Utility node (diamond, depends on action and chance nodes)
[Decision network diagram: Weather → Forecast; Umbrella (action) and Weather → U]
Decision Networks
 Action selection:
 Instantiate all evidence
 Set action node(s) each possible way
 Calculate posterior for all parents of utility node, given the evidence
 Calculate expected utility for each action
 Choose maximizing action
[Same umbrella decision network diagram]
Example: Decision Networks

P(W):      sun 0.7 | rain 0.3

U(A,W):    leave, sun: 100
           leave, rain: 0
           take, sun: 20
           take, rain: 70

Umbrella = leave:  EU(leave) = 0.7·100 + 0.3·0 = 70
Umbrella = take:   EU(take) = 0.7·20 + 0.3·70 = 35
Optimal decision = leave
[Umbrella decision network diagram]
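The same expected-utility computation in a few lines (a sketch; the table values are from the slide):

```python
# Expected utility of each action, from the slide's tables (illustrative code).
P_W = {"sun": 0.7, "rain": 0.3}
U = {("leave", "sun"): 100, ("leave", "rain"): 0,
     ("take", "sun"): 20, ("take", "rain"): 70}

EU = {a: sum(P_W[w] * U[a, w] for w in P_W) for a in ("leave", "take")}
print(EU)                   # {'leave': 70.0, 'take': 35.0}
print(max(EU, key=EU.get))  # 'leave' -- the MEU action
```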
Decisions as Outcome Trees
[Outcome tree: from the root {} the action (take or leave) branches, then Weather; leaves are U(t,s), U(t,r), U(l,s), U(l,r)]
 Almost exactly like expectimax / MDPs
Evidence in Decision Networks
 Find P(W|F=bad)
 Select for evidence

P(W):          sun 0.7 | rain 0.3
P(F|W=sun):    good 0.8 | bad 0.2
P(F|W=rain):   good 0.1 | bad 0.9

 First we join P(W) and P(bad|W):
    P(W, F=bad):   sun 0.7·0.2 = 0.14 | rain 0.3·0.9 = 0.27
 Then we normalize (by 0.14 + 0.27 = 0.41):
    P(W | F=bad):  sun ≈ 0.34 | rain ≈ 0.66
[Umbrella decision network diagram]
Example: Decision Networks

P(W|F=bad):  sun 0.34 | rain 0.66

U(A,W):      leave, sun: 100
             leave, rain: 0
             take, sun: 20
             take, rain: 70

Umbrella = leave:  EU(leave | F=bad) = 0.34·100 + 0.66·0 = 34
Umbrella = take:   EU(take | F=bad) = 0.34·20 + 0.66·70 = 53
Optimal decision = take
[Umbrella decision network diagram, Forecast = bad]
Value of Information
 Idea: compute value of acquiring evidence
 Can be done directly from decision network
 Example: buying oil drilling rights
 Two blocks A and B, exactly one has oil, worth k
 You can drill in one location
 Prior probabilities 0.5 each, and mutually exclusive
 Drilling in either A or B has MEU = k/2

P(OilLoc):   a 1/2 | b 1/2

U(DrillLoc, OilLoc):   a, a: k
                       a, b: 0
                       b, a: 0
                       b, b: k

 Question: what’s the value of information?
 Value of knowing which of A or B has oil
 Value is expected gain in MEU from new info
 Survey may say “oil in a” or “oil in b,” prob 0.5 each
 If we know OilLoc, MEU is k (either way)
 Gain in MEU from knowing OilLoc? VPI(OilLoc) = k/2
 Fair price of information: k/2
[Decision network: OilLoc and DrillLoc → U]
Value of Information
 Assume we have evidence E=e. Value if we act now:
    MEU(e) = max_a Σ_s P(s | e) U(s, a)
 Assume we see that E’ = e’. Value if we act then:
    MEU(e, e’) = max_a Σ_s P(s | e, e’) U(s, a)
 BUT E’ is a random variable whose value is unknown, so we don’t know what e’ will be
 Expected value if E’ is revealed and then we act:
    MEU(e, E’) = Σ_{e’} P(e’ | e) MEU(e, e’)
 Value of information: how much MEU goes up by revealing E’ first:
    VPI_e(E’) = MEU(e, E’) − MEU(e)
 VPI = “Value of perfect information”
VPI Example: Weather

MEU with no evidence:      70 (leave)
MEU if forecast is bad:    ≈ 53 (take)
MEU if forecast is good:   ≈ 95 (leave)

Forecast distribution:   F: good 0.59 | bad 0.41

VPI(Forecast) = 0.59·95 + 0.41·53 − 70 ≈ 7.8 (≈ 7.7 without rounding the MEUs)

U(A,W):   leave, sun: 100
          leave, rain: 0
          take, sun: 20
          take, rain: 70
[Umbrella decision network diagram]
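Putting the whole computation together (a sketch using the slide's tables; it reproduces MEU(∅) = 70, MEU(bad) ≈ 53, MEU(good) ≈ 95, VPI ≈ 7.7):

```python
# VPI of the Forecast, computed from the slide's tables (illustrative code).
P_W = {"sun": 0.7, "rain": 0.3}
P_F_given_W = {("good", "sun"): 0.8, ("bad", "sun"): 0.2,
               ("good", "rain"): 0.1, ("bad", "rain"): 0.9}
U = {("leave", "sun"): 100, ("leave", "rain"): 0,
     ("take", "sun"): 20, ("take", "rain"): 70}

def meu(belief):   # best expected utility under a belief over Weather
    return max(sum(belief[w] * U[a, w] for w in belief) for a in ("leave", "take"))

vpi = -meu(P_W)    # subtract MEU of acting now, with no evidence (70)
for f in ("good", "bad"):
    p_f = sum(P_F_given_W[f, w] * P_W[w] for w in P_W)              # P(F=f)
    posterior = {w: P_F_given_W[f, w] * P_W[w] / p_f for w in P_W}  # P(W | F=f)
    print(f, round(p_f, 2), round(meu(posterior), 1))  # good 0.59 94.9 / bad 0.41 52.9
    vpi += p_f * meu(posterior)    # expected MEU once the forecast is revealed
print("VPI(Forecast) =", round(vpi, 1))   # ≈ 7.7
```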
VPI Properties
 Nonnegative
 Nonadditive: consider, e.g., obtaining E_j twice
 Order-independent
Quick VPI Questions
 The soup of the day is either clam chowder or split pea,
but you wouldn’t order either one. What’s the value of
knowing which it is?
 There are two kinds of plastic forks at a picnic. It must
be that one is slightly better. What’s the value of
knowing which?
 You have $10 to bet double-or-nothing and there is a
75% chance that the Rams will beat the 49ers. What’s
the value of knowing the outcome in advance?
 You must bet on the Rams, either way. What’s the
value now?
That’s all
 Next Lecture
 Hidden Markov Models!
 (You’re gonna love it!!)
 Reminder: next lecture is a week from today (on Wednesday).
 Go to the talk by Sanmay Das, Lopata 101, Friday 11-12.