QUIZ!!
- T/F: Forward sampling is consistent. True
- T/F: Rejection sampling is faster, but inconsistent. False
- T/F: Rejection sampling requires less memory than forward sampling. True
- T/F: In likelihood-weighted sampling you make use of every single sample. True
- T/F: Forward sampling samples from the joint distribution. True
- T/F: Likelihood-weighted sampling only speeds up inference for random variables downstream of evidence. True
- T/F: Too few samples could result in division by zero. True
(comic: xkcd)

CSE511a: Artificial Intelligence, Fall 2013
Lecture 18: Decision Diagrams, 04/03/2013
Robert Pless, via Kilian Q. Weinberger
Several slides adapted from Dan Klein, UC Berkeley

Announcements
Class on Monday, April 8 is cancelled in lieu of a talk by Sanmay Das, "Bayesian Reinforcement Learning With Censored Observations and Applications in Market Modeling and Design," Friday, April 5, 11-12, Lopata 101.

Sampling Example
There are 2 cups. The first contains 1 penny and 1 quarter; the second contains 2 quarters. Say I pick a cup uniformly at random, then pick a coin randomly from that cup. It's a quarter (yes!). What is the probability that the other coin in that cup is also a quarter?

Recap: Likelihood Weighting
Sampling distribution if z is sampled and e is fixed evidence:
  S_WS(z, e) = prod_i P(z_i | parents(Z_i))
Now, samples have weights:
  w(z, e) = prod_i P(e_i | parents(E_i))
Together, the weighted sampling distribution is consistent:
  S_WS(z, e) * w(z, e) = P(z, e)

Likelihood Weighting
CPTs of the example network (Cloudy -> Sprinkler, Rain; Sprinkler, Rain -> WetGrass):
  P(+c) = 0.5, P(-c) = 0.5
  P(+s | +c) = 0.1, P(+s | -c) = 0.5
  P(+r | +c) = 0.8, P(+r | -c) = 0.2
  P(+w | +s, +r) = 0.99, P(+w | +s, -r) = 0.90
  P(+w | -s, +r) = 0.90, P(+w | -s, -r) = 0.01
Samples: +c, +s, +r, +w ...

Likelihood Weighting Example
Five weighted samples with evidence +s, +w:
  Cloudy  Rainy  Sprinkler  Wet Grass  Weight
  0       1      1          1          0.495
  0       0      1          1          0.45
  0       0      1          1          0.45
  0       0      1          1          0.45
  1       0      1          1          0.09
Inference: sum over the weights that match the query value, then divide by the total sample weight. What is P(C | +w, +s)?
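The cup-and-coin example above can be sanity-checked by rejection sampling: simulate the full process, throw away every trial where the drawn coin is not a quarter (the evidence), and count how often the other coin in the chosen cup is also a quarter. This is a minimal Python sketch, not from the slides; the exact answer by Bayes' rule is 2/3.

```python
import random

def sample_other_coin_is_quarter(n_samples, rng):
    """Rejection sampling for the two-cup example.

    Cup 0 holds a penny and a quarter; cup 1 holds two quarters.
    Keep only trials where the drawn coin is a quarter (the evidence),
    then estimate P(other coin is a quarter | drew a quarter).
    """
    cups = [["penny", "quarter"], ["quarter", "quarter"]]
    kept = matches = 0
    while kept < n_samples:
        cup = rng.choice(cups)       # pick a cup uniformly at random
        i = rng.randrange(2)         # pick a coin from that cup
        if cup[i] != "quarter":
            continue                 # reject: evidence not matched
        kept += 1
        if cup[1 - i] == "quarter":
            matches += 1
    return matches / kept

estimate = sample_other_coin_is_quarter(100_000, random.Random(0))
# With this many kept samples the estimate is close to the exact 2/3.
```

Note how rejection sampling wastes the penny-first trials entirely, which is exactly the inefficiency that likelihood weighting avoids.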
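The query P(C | +w, +s) from the worked example is just a weighted average over the five samples. A short Python sketch (the sample values and weights are the ones from the table in the slides):

```python
# (Cloudy value, weight) pairs from the likelihood-weighting example,
# with evidence +s, +w fixed during sampling.
samples = [(0, 0.495), (0, 0.45), (0, 0.45), (0, 0.45), (1, 0.09)]

total_weight = sum(w for _, w in samples)
# Sum the weights of samples matching the query value, then normalize.
p_cloudy = sum(w for c, w in samples if c == 1) / total_weight
# P(C=1 | +w, +s) = 0.09 / 1.935, about 0.047
```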
P(C=1 | +w, +s) = (sum of sample weights with C=1) / (sum of all sample weights) = 0.09 / 1.935 ≈ 0.047

Likelihood Weighting
Likelihood weighting is good:
- It takes evidence into account as we generate the sample. In the network above, W's value will get picked based on the evidence values of S and R.
- More samples will reflect the state of the world suggested by the evidence.
Likelihood weighting doesn't solve all our problems:
- Evidence influences the choice of downstream variables, but not upstream ones (C isn't more likely to get a value matching the evidence).
- We would like to consider evidence when we sample every variable.

Markov Chain Monte Carlo
Idea: instead of sampling from scratch, create samples that are each like the last one.
Procedure: resample one variable at a time, conditioned on all the rest, but keep the evidence fixed. E.g., for P(b | c):
  +b +a +c  ->  -b +a +c  ->  -b -a +c
Properties: now the samples are not independent (in fact they're nearly identical), but sample averages are still consistent estimators!
What's the point: both upstream and downstream variables condition on evidence.

Gibbs Sampling
1. Set all evidence E_1, ..., E_m to e_1, ..., e_m.
2. Do forward sampling to obtain x_1, ..., x_n.
3. Repeat:
   a. Pick a variable X_i uniformly at random.
   b. Resample x_i' from P(X_i | markov_blanket(X_i)).
   c. Set all other x_j' = x_j.
   d. The new sample is x_1', ..., x_n'.

Markov Blanket
The Markov blanket of X consists of:
1. All parents of X
2. All children of X
3. All parents of children of X (except X itself)
X is conditionally independent of all other variables in the BN, given all variables in its Markov blanket.

MCMC Algorithm

The Markov Chain

Summary
Sampling can be your salvation; it is the dominant approach to inference in BNs. Pros/cons of:
- Forward sampling
- Rejection sampling
- Likelihood-weighted sampling
- Gibbs sampling / MCMC

Google's PageRank
http://en.wikipedia.org/wiki/Markov_chain
Page, Lawrence; Brin, Sergey; Motwani, Rajeev; Winograd, Terry (1999). The PageRank citation ranking: Bringing order to the Web. See also: J.
Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998.

Google's PageRank
Graph of the Internet (pages and links): a directed graph over pages A-J.
Start at a random page and take a random walk. Where do we end up?
Add a 15% probability of moving to a random page. Now where do we end up?
PageRank(P) = the probability that a long random walk ends at node P.

Learning
Where do the CPTs come from? How do you build a Bayes net to begin with? Two scenarios:
- Complete data
- Incomplete data (only touched on; you are not responsible for this)

Learning from Complete Data
Observe data from the real world and estimate probabilities by counting:
  P(X = x | pa(X) = pi) = count(pa(X) = pi, X = x) / count(pa(X) = pi)
                        = sum_t I(x_t, x) I(pa(X)_t, pi) / sum_t I(pa(X)_t, pi)
where the indicator function I(a, b) = 1 if and only if a = b, and t ranges over the observed samples. (Just like rejection sampling, except we observe the real world.)
Observed samples:
  Cloudy:    0 1 1 0 0 0 1 0 1 1 0 1 0 1 1 0 0
  Rainy:     0 1 0 0 0 0 1 0 0 1 0 1 0 1 1 1 0
  Sprinkler: 1 0 0 1 1 1 0 1 0 0 1 0 1 0 0 0 1
  Wet Grass: 1 1 0 1 1 1 1 1 0 1 1 1 1 1 1 1 0

Learning from Incomplete Data
Observe real-world data, but some variables are unobserved. Trick: fill them in with the EM algorithm (E = Expectation, M = Maximization). In the example data, Cloudy and Wet Grass are observed for 20 samples, while Rainy and Sprinkler are unobserved ('?') in every sample:
  Cloudy:    1 0 1 1 1 1 0 1 1 0 0 0 1 1 1 1 1 0 1 1
  Wet Grass: 0 1 1 0 1 1 1 0 0 1 0 1 0 1 1 0 1 1 0 0

EM-Algorithm
Initialize the CPTs (e.g., randomly or uniformly), then repeat until convergence:
- E-step: for all i, x, and pi, compute (by inference) P(X_i = x, pa(X_i) = pi | V_t) for each observed data case V_t.
- M-step: update the CPT tables:
  P(X_i = x | pa(X_i) = pi) <- sum_t P(X_i = x, pa(X_i) = pi | V_t) / sum_t P(pa(X_i) = pi | V_t)
(Just like likelihood-weighted sampling, but on real-world data.)
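The Gibbs sampling procedure described earlier can be made concrete on the rain network: hold the evidence S=+s, W=+w fixed and alternately resample C and R from their Markov-blanket conditionals. This is a minimal sketch (not the slides' code); by enumeration the exact answer is P(+c | +s, +w) = 0.0486 / 0.2781 ≈ 0.175, which the chain should approach.

```python
import random

# CPTs of the rain network from the slides.
P_C = 0.5
P_S_given_C = {True: 0.1, False: 0.5}
P_R_given_C = {True: 0.8, False: 0.2}
P_W_given_SR = {(True, True): 0.99, (True, False): 0.90,
                (False, True): 0.90, (False, False): 0.01}

def gibbs_p_cloudy(n_iters, rng, burn_in=1000):
    """Estimate P(+c | +s, +w) by Gibbs sampling.

    Evidence S=+s, W=+w stays fixed; C and R are resampled in turn
    from their Markov-blanket conditionals.
    """
    s, w = True, True                       # evidence, never resampled
    c, r = rng.random() < 0.5, rng.random() < 0.5
    count_c = 0
    for t in range(burn_in + n_iters):
        # Resample C given its blanket {S, R}: proportional to P(c)P(s|c)P(r|c).
        # (W is not in C's blanket: it is a child of S and R, not of C.)
        pc = {}
        for cv in (True, False):
            prior = P_C if cv else 1 - P_C
            ps = P_S_given_C[cv] if s else 1 - P_S_given_C[cv]
            pr = P_R_given_C[cv] if r else 1 - P_R_given_C[cv]
            pc[cv] = prior * ps * pr
        c = rng.random() < pc[True] / (pc[True] + pc[False])
        # Resample R given its blanket {C, S, W}: proportional to P(r|c)P(w|s,r).
        pr_new = {}
        for rv in (True, False):
            p1 = P_R_given_C[c] if rv else 1 - P_R_given_C[c]
            p2 = P_W_given_SR[(s, rv)] if w else 1 - P_W_given_SR[(s, rv)]
            pr_new[rv] = p1 * p2
        r = rng.random() < pr_new[True] / (pr_new[True] + pr_new[False])
        if t >= burn_in and c:
            count_c += 1
    return count_c / n_iters

est = gibbs_p_cloudy(50_000, random.Random(0))
```

Note that consecutive samples are nearly identical, so the effective sample size is smaller than the iteration count; the averages are still consistent.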
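The PageRank random walk described earlier (follow a random link with probability 0.85, otherwise jump to a random page) can be computed by power iteration rather than by simulating walks. A short sketch; the link graph here is a small hypothetical one, since the slides' actual edges are shown only in a figure:

```python
def pagerank(links, damping=0.85, iters=100):
    """Power iteration for PageRank.

    links: dict mapping each node to its list of outgoing links.
    With probability `damping` the surfer follows a random outgoing
    link; otherwise (the 15% from the slides) it jumps to a
    uniformly random page.
    """
    nodes = list(links)
    n = len(nodes)
    rank = {p: 1.0 / n for p in nodes}
    for _ in range(iters):
        new = {p: (1 - damping) / n for p in nodes}
        for p in nodes:
            out = links[p]
            if not out:                      # dangling page: jump anywhere
                for q in nodes:
                    new[q] += damping * rank[p] / n
            else:
                for q in out:
                    new[q] += damping * rank[p] / len(out)
        rank = new
    return rank

# A tiny hypothetical link graph (illustrative only).
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
```

Because the ranks form a probability distribution over pages, they always sum to 1, and pages with many incoming links (here C) end up with the highest rank.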
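The counting formula for learning CPTs from complete data translates directly into code: count how often each (parent configuration, child value) pair occurs and normalize by the parent-configuration count. A minimal sketch; the six data rows are hypothetical stand-ins in the spirit of the slide's table:

```python
from collections import Counter

def estimate_cpt(data, child, parents):
    """Estimate P(child | parents) from fully observed data by counting:
    count(pa(X) = pi, X = x) / count(pa(X) = pi)."""
    joint = Counter()      # counts of (parent values, child value)
    marginal = Counter()   # counts of parent values alone
    for row in data:
        pa = tuple(row[p] for p in parents)
        joint[(pa, row[child])] += 1
        marginal[pa] += 1
    return {key: joint[key] / marginal[key[0]] for key in joint}

# Hypothetical observed samples over (Cloudy, Rainy, Sprinkler, Wet Grass).
data = [
    {"C": 0, "R": 0, "S": 1, "W": 1},
    {"C": 1, "R": 1, "S": 0, "W": 1},
    {"C": 1, "R": 0, "S": 0, "W": 0},
    {"C": 0, "R": 0, "S": 1, "W": 1},
    {"C": 1, "R": 1, "S": 0, "W": 1},
    {"C": 0, "R": 0, "S": 1, "W": 0},
]
cpt = estimate_cpt(data, "R", ["C"])
# e.g. P(R=1 | C=1) = 2/3 from these six rows
```

This is the maximum-likelihood estimate; as the slide notes, it mirrors rejection sampling with the real world supplying the samples.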
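The E/M steps above can be sketched on a deliberately tiny model: a single hidden variable C with one observed child W, rather than the full rain network. This is an illustrative simplification, not the slides' algorithm; the parameter names (pc for P(C=1), pw[c] for P(W=1 | C=c)) are my own. The E-step computes the posterior responsibility P(C=1 | w) for each sample, and the M-step re-estimates the parameters from those expected counts.

```python
import math

def em_two_node(w_data, pc=0.6, pw=(0.3, 0.8), iters=50):
    """EM for a toy net C -> W, with C hidden and W observed (0/1).

    Returns the final parameters and the per-iteration log-likelihoods,
    which EM guarantees to be non-decreasing.
    """
    lls = []
    for _ in range(iters):
        # E-step: posterior responsibility P(C=1 | w) for each sample.
        resp, ll = [], 0.0
        for w in w_data:
            p1 = pc * (pw[1] if w else 1 - pw[1])
            p0 = (1 - pc) * (pw[0] if w else 1 - pw[0])
            resp.append(p1 / (p1 + p0))
            ll += math.log(p1 + p0)
        lls.append(ll)
        # M-step: update parameters from expected counts.
        n = len(w_data)
        s1 = sum(resp)
        pc = s1 / n
        pw = (sum((1 - r) * w for r, w in zip(resp, w_data)) / (n - s1),
              sum(r * w for r, w in zip(resp, w_data)) / s1)
    return pc, pw, lls

pc_hat, pw_hat, lls = em_two_node([1, 1, 1, 0, 0, 1, 0, 1])
```

Running it shows the monotonic-convergence property from the next slide: each iteration improves (or at least never decreases) the data log-likelihood, with no learning rate anywhere.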
EM-Algorithm Properties
- No learning rate.
- Monotonic convergence: each iteration improves the log-likelihood of the data.
- The outcome depends on the initialization.
- Can be slow for large Bayes nets...

Decision Networks
MEU: choose the action which maximizes the expected utility given the evidence. We can directly operationalize this with decision networks: Bayes nets with nodes for utility and actions, which let us calculate the expected utility for each action.
New node types:
- Chance nodes (just like BNs)
- Action nodes (rectangles; cannot have parents; act as observed evidence)
- Utility node (diamond; depends on action and chance nodes)
Running example: an Umbrella action node and a Weather chance node (with a Forecast child) feed a utility node U.

Decision Networks: Action Selection
1. Instantiate all evidence.
2. Set the action node(s) each possible way.
3. Calculate the posterior over all parents of the utility node, given the evidence.
4. Calculate the expected utility for each action.
5. Choose the maximizing action.

Example: Decision Networks
  W     P(W)        A      W     U(A, W)
  sun   0.7         leave  sun   100
  rain  0.3         leave  rain  0
                    take   sun   20
                    take   rain  70
EU(leave) = 0.7 * 100 + 0.3 * 0 = 70; EU(take) = 0.7 * 20 + 0.3 * 70 = 35.
Optimal decision = leave.

Decisions as Outcome Trees
Each action leads to a chance node over Weather, with leaves U(take, sun), U(take, rain), U(leave, sun), U(leave, rain). Almost exactly like expectimax / MDPs.

Evidence in Decision Networks
Find P(W | F=bad). Given:
  W     P(W)   P(F=good | W)   P(F=bad | W)
  sun   0.7    0.8             0.2
  rain  0.3    0.1             0.9
First we join P(W) and P(F=bad | W), then we normalize:
  W     P(W, F=bad)   P(W | F=bad)
  sun   0.14          0.34
  rain  0.27          0.66

Example: Decision Networks, with evidence Forecast = bad
  W     P(W | F=bad)
  sun   0.34
  rain  0.66
EU(leave) = 0.34 * 100 = 34; EU(take) = 0.34 * 20 + 0.66 * 70 = 53.
Optimal decision = take.

Value of Information
Idea: compute the value of acquiring evidence. This can be done directly from the decision network.
Example: buying oil drilling rights. Two blocks, A and B; exactly one has oil,
worth k. You can drill in one location. The prior probabilities are 0.5 each and mutually exclusive, so drilling in either A or B has MEU = k/2.
  O   P(O)      D   O   U(D, O)
  a   1/2       a   a   k
  b   1/2       a   b   0
                b   a   0
                b   b   k
Question: what is the value of knowing which of A or B has oil? The value is the expected gain in MEU from the new information. A survey may say "oil in a" or "oil in b," with probability 0.5 each. If we know OilLoc, the MEU is k (either way), so the gain in MEU from knowing OilLoc is VPI(OilLoc) = k - k/2 = k/2. The fair price of the information is k/2.

Value of Information (general)
Assume we have evidence E = e. Value if we act now:
  MEU(e) = max_a sum_s P(s | e) U(s, a)
Assume we see that E' = e'. Value if we act then:
  MEU(e, e') = max_a sum_s P(s | e, e') U(s, a)
BUT E' is a random variable whose value is unknown, so we don't know what e' will be. Expected value if E' is revealed and then we act:
  MEU(e, E') = sum_{e'} P(e' | e) MEU(e, e')
Value of information: how much the MEU goes up by revealing E' first:
  VPI_e(E') = MEU(e, E') - MEU(e)
(VPI = "value of perfect information.")

VPI Example: Weather
Forecast distribution: P(F=good) = 0.59, P(F=bad) = 0.41.
  A      W     U(A, W)
  leave  sun   100
  leave  rain  0
  take   sun   20
  take   rain  70
MEU with no evidence: 70 (leave). MEU if the forecast is bad: 53 (take). MEU if the forecast is good: ≈ 94.9 (leave). So VPI(F) = 0.59 * 94.9 + 0.41 * 53 - 70 ≈ 7.7.

VPI Properties
- Nonnegative
- Nonadditive (consider, e.g., obtaining E_j twice)
- Order-independent

Quick VPI Questions
- The soup of the day is either clam chowder or split pea, but you wouldn't order either one. What's the value of knowing which it is?
- There are two kinds of plastic forks at a picnic. It must be that one is slightly better. What's the value of knowing which?
- You have $10 to bet double-or-nothing, and there is a 75% chance that the Rams will beat the 49ers. What's the value of knowing the outcome in advance? You must bet on the Rams either way. What's the value now?

That's All
Next lecture: Hidden Markov Models! (You're gonna love it!!)
Reminder: the next lecture is a week from today (on Wednesday). Go to the talk by Sanmay Das, Lopata 101, Friday 11-12.
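The umbrella VPI computation above can be carried out end to end in a few lines: compute MEU with no evidence, compute MEU after each possible forecast (join-then-normalize for the posterior, then maximize over actions), and take the expected gain. A minimal sketch using the slides' numbers:

```python
# Umbrella decision network from the slides.
P_W = {"sun": 0.7, "rain": 0.3}
P_F_given_W = {"sun": {"good": 0.8, "bad": 0.2},
               "rain": {"good": 0.1, "bad": 0.9}}
U = {("leave", "sun"): 100, ("leave", "rain"): 0,
     ("take", "sun"): 20, ("take", "rain"): 70}

def posterior_weather(f):
    """P(W | F=f): join P(W) with P(F=f | W), then normalize."""
    joint = {w: P_W[w] * P_F_given_W[w][f] for w in P_W}
    z = sum(joint.values())
    return {w: p / z for w, p in joint.items()}

def meu(belief):
    """Max over actions of expected utility under a weather belief."""
    return max(sum(belief[w] * U[(a, w)] for w in belief)
               for a in ("leave", "take"))

meu_now = meu(P_W)                       # act with no evidence: 70 (leave)
p_f = {f: sum(P_W[w] * P_F_given_W[w][f] for w in P_W)
       for f in ("good", "bad")}         # forecast distribution: 0.59 / 0.41
meu_with_forecast = sum(p_f[f] * meu(posterior_weather(f)) for f in p_f)
vpi_forecast = meu_with_forecast - meu_now   # about 7.7
```

The same pattern answers the oil-drilling question: with a uniform prior the act-now MEU is k/2, and perfect information raises it to k, so the VPI is k/2.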