QUIZ!!
- T/F: Forward sampling is consistent. True
- T/F: Rejection sampling is faster, but inconsistent. False
- T/F: Rejection sampling requires less memory than forward sampling. True
- T/F: In likelihood-weighted sampling you make use of every single sample. True
- T/F: Forward sampling samples from the joint distribution. True
- T/F: Likelihood-weighted sampling only speeds up inference for random variables downstream of evidence. True
- T/F: Too few samples could result in division by zero. True
(comic: xkcd)

CSE511a: Artificial Intelligence, Fall 2013
Lecture 18: Decision Diagrams, 04/03/2013
Robert Pless, via Kilian Q. Weinberger
Several slides adapted from Dan Klein, UC Berkeley

Announcements
Class on Monday, April 8 is cancelled in lieu of a talk by Sanmay Das, "Bayesian Reinforcement Learning With Censored Observations and Applications in Market Modeling and Design," Friday, April 5, 11-12, Lopata 101.

Sampling Example
There are 2 cups. The first contains 1 penny and 1 quarter; the second contains 2 quarters. Say I pick a cup uniformly at random, then pick a coin randomly from that cup. It's a quarter (yes!). What is the probability that the other coin in that cup is also a quarter?

Recap: Likelihood Weighting
Sampling distribution if z is sampled and e is fixed evidence:
  S_WS(z, e) = prod_i P(z_i | parents(Z_i))
Now, samples have weights:
  w(z, e) = prod_i P(e_i | parents(E_i))
Together, the weighted sampling distribution is consistent:
  S_WS(z, e) * w(z, e) = P(z, e)

Likelihood Weighting
CPTs of the example network (Cloudy -> Sprinkler, Rain; Sprinkler, Rain -> WetGrass):
  P(+c) = 0.5, P(-c) = 0.5
  P(+s | +c) = 0.1, P(+s | -c) = 0.5
  P(+r | +c) = 0.8, P(+r | -c) = 0.2
  P(+w | +s, +r) = 0.99, P(+w | +s, -r) = 0.90
  P(+w | -s, +r) = 0.90, P(+w | -s, -r) = 0.01
Samples: +c, +s, +r, +w ...

Likelihood Weighting Example
Five weighted samples with evidence +s, +w:
  Cloudy  Rainy  Sprinkler  Wet Grass  Weight
  0       1      1          1          0.495
  0       0      1          1          0.45
  0       0      1          1          0.45
  0       0      1          1          0.45
  1       0      1          1          0.09
Inference: sum over the weights that match the query value, then divide by the total sample weight. What is P(C | +w, +s)?
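The cup-and-coin example above can be sanity-checked by rejection sampling: simulate the full process, throw away every trial where the drawn coin is not a quarter (the evidence), and count how often the other coin in the chosen cup is also a quarter. This is a minimal Python sketch, not from the slides; the exact answer by Bayes' rule is 2/3.

```python
import random

def sample_other_coin_is_quarter(n_samples, rng):
    """Rejection sampling for the two-cup example.

    Cup 0 holds a penny and a quarter; cup 1 holds two quarters.
    Keep only trials where the drawn coin is a quarter (the evidence),
    then estimate P(other coin is a quarter | drew a quarter).
    """
    cups = [["penny", "quarter"], ["quarter", "quarter"]]
    kept = matches = 0
    while kept < n_samples:
        cup = rng.choice(cups)       # pick a cup uniformly at random
        i = rng.randrange(2)         # pick a coin from that cup
        if cup[i] != "quarter":
            continue                 # reject: evidence not matched
        kept += 1
        if cup[1 - i] == "quarter":
            matches += 1
    return matches / kept

estimate = sample_other_coin_is_quarter(100_000, random.Random(0))
# With this many kept samples the estimate is close to the exact 2/3.
```

Note how rejection sampling wastes the penny-first trials entirely, which is exactly the inefficiency that likelihood weighting avoids.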
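The query P(C | +w, +s) from the worked example is just a weighted average over the five samples. A short Python sketch (the sample values and weights are the ones from the table in the slides):

```python
# (Cloudy value, weight) pairs from the likelihood-weighting example,
# with evidence +s, +w fixed during sampling.
samples = [(0, 0.495), (0, 0.45), (0, 0.45), (0, 0.45), (1, 0.09)]

total_weight = sum(w for _, w in samples)
# Sum the weights of samples matching the query value, then normalize.
p_cloudy = sum(w for c, w in samples if c == 1) / total_weight
# P(C=1 | +w, +s) = 0.09 / 1.935, about 0.047
```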
P(C=1 | +w, +s) = (sum of sample weights with C=1) / (sum of all sample weights) = 0.09 / 1.935 ≈ 0.047

Likelihood Weighting
Likelihood weighting is good:
- It takes evidence into account as we generate the sample. In the network above, W's value will get picked based on the evidence values of S and R.
- More samples will reflect the state of the world suggested by the evidence.
Likelihood weighting doesn't solve all our problems:
- Evidence influences the choice of downstream variables, but not upstream ones (C isn't more likely to get a value matching the evidence).
- We would like to consider evidence when we sample every variable.

Markov Chain Monte Carlo
Idea: instead of sampling from scratch, create samples that are each like the last one.
Procedure: resample one variable at a time, conditioned on all the rest, but keep the evidence fixed. E.g., for P(b | c):
  +b +a +c  ->  -b +a +c  ->  -b -a +c
Properties: now the samples are not independent (in fact they're nearly identical), but sample averages are still consistent estimators!
What's the point: both upstream and downstream variables condition on evidence.

Gibbs Sampling
1. Set all evidence E_1, ..., E_m to e_1, ..., e_m.
2. Do forward sampling to obtain x_1, ..., x_n.
3. Repeat:
   a. Pick a variable X_i uniformly at random.
   b. Resample x_i' from P(X_i | markov_blanket(X_i)).
   c. Set all other x_j' = x_j.
   d. The new sample is x_1', ..., x_n'.

Markov Blanket
The Markov blanket of X consists of:
1. All parents of X
2. All children of X
3. All parents of children of X (except X itself)
X is conditionally independent of all other variables in the BN, given all variables in its Markov blanket.

MCMC Algorithm

The Markov Chain

Summary
Sampling can be your salvation; it is the dominant approach to inference in BNs. Pros/cons of:
- Forward sampling
- Rejection sampling
- Likelihood-weighted sampling
- Gibbs sampling / MCMC

Google's PageRank
http://en.wikipedia.org/wiki/Markov_chain
Page, Lawrence; Brin, Sergey; Motwani, Rajeev; Winograd, Terry (1999). The PageRank citation ranking: Bringing order to the Web. See also: J.
Kleinberg. Authoritative sources in a hyperlinked environment. Proc. 9th ACM-SIAM Symposium on Discrete Algorithms, 1998.

Google's PageRank
Graph of the Internet (pages and links): a directed graph over pages A-J.
Start at a random page and take a random walk. Where do we end up?
Add a 15% probability of moving to a random page. Now where do we end up?
PageRank(P) = the probability that a long random walk ends at node P.

Learning
Where do the CPTs come from? How do you build a Bayes net to begin with? Two scenarios:
- Complete data
- Incomplete data (only touched on; you are not responsible for this)

Learning from Complete Data
Observe data from the real world and estimate probabilities by counting:
  P(X = x | pa(X) = pi) = count(pa(X) = pi, X = x) / count(pa(X) = pi)
                        = sum_t I(x_t, x) I(pa(X)_t, pi) / sum_t I(pa(X)_t, pi)
where the indicator function I(a, b) = 1 if and only if a = b, and t ranges over the observed samples. (Just like rejection sampling, except we observe the real world.)
Observed samples:
  Cloudy:    0 1 1 0 0 0 1 0 1 1 0 1 0 1 1 0 0
  Rainy:     0 1 0 0 0 0 1 0 0 1 0 1 0 1 1 1 0
  Sprinkler: 1 0 0 1 1 1 0 1 0 0 1 0 1 0 0 0 1
  Wet Grass: 1 1 0 1 1 1 1 1 0 1 1 1 1 1 1 1 0

Learning from Incomplete Data
Observe real-world data, but some variables are unobserved. Trick: fill them in with the EM algorithm (E = Expectation, M = Maximization). In the example data, Cloudy and Wet Grass are observed for 20 samples, while Rainy and Sprinkler are unobserved ('?') in every sample:
  Cloudy:    1 0 1 1 1 1 0 1 1 0 0 0 1 1 1 1 1 0 1 1
  Wet Grass: 0 1 1 0 1 1 1 0 0 1 0 1 0 1 1 0 1 1 0 0

EM-Algorithm
Initialize the CPTs (e.g., randomly or uniformly), then repeat until convergence:
- E-step: for all i, x, and pi, compute (by inference) P(X_i = x, pa(X_i) = pi | V_t) for each observed data case V_t.
- M-step: update the CPT tables:
  P(X_i = x | pa(X_i) = pi) <- sum_t P(X_i = x, pa(X_i) = pi | V_t) / sum_t P(pa(X_i) = pi | V_t)
(Just like likelihood-weighted sampling, but on real-world data.)
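The Gibbs sampling procedure described earlier can be made concrete on the rain network: hold the evidence S=+s, W=+w fixed and alternately resample C and R from their Markov-blanket conditionals. This is a minimal sketch (not the slides' code); by enumeration the exact answer is P(+c | +s, +w) = 0.0486 / 0.2781 ≈ 0.175, which the chain should approach.

```python
import random

# CPTs of the rain network from the slides.
P_C = 0.5
P_S_given_C = {True: 0.1, False: 0.5}
P_R_given_C = {True: 0.8, False: 0.2}
P_W_given_SR = {(True, True): 0.99, (True, False): 0.90,
                (False, True): 0.90, (False, False): 0.01}

def gibbs_p_cloudy(n_iters, rng, burn_in=1000):
    """Estimate P(+c | +s, +w) by Gibbs sampling.

    Evidence S=+s, W=+w stays fixed; C and R are resampled in turn
    from their Markov-blanket conditionals.
    """
    s, w = True, True                       # evidence, never resampled
    c, r = rng.random() < 0.5, rng.random() < 0.5
    count_c = 0
    for t in range(burn_in + n_iters):
        # Resample C given its blanket {S, R}: proportional to P(c)P(s|c)P(r|c).
        # (W is not in C's blanket: it is a child of S and R, not of C.)
        pc = {}
        for cv in (True, False):
            prior = P_C if cv else 1 - P_C
            ps = P_S_given_C[cv] if s else 1 - P_S_given_C[cv]
            pr = P_R_given_C[cv] if r else 1 - P_R_given_C[cv]
            pc[cv] = prior * ps * pr
        c = rng.random() < pc[True] / (pc[True] + pc[False])
        # Resample R given its blanket {C, S, W}: proportional to P(r|c)P(w|s,r).
        pr_new = {}
        for rv in (True, False):
            p1 = P_R_given_C[c] if rv else 1 - P_R_given_C[c]
            p2 = P_W_given_SR[(s, rv)] if w else 1 - P_W_given_SR[(s, rv)]
            pr_new[rv] = p1 * p2
        r = rng.random() < pr_new[True] / (pr_new[True] + pr_new[False])
        if t >= burn_in and c:
            count_c += 1
    return count_c / n_iters

est = gibbs_p_cloudy(50_000, random.Random(0))
```

Note that consecutive samples are nearly identical, so the effective sample size is smaller than the iteration count; the averages are still consistent.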
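The PageRank random walk described earlier (follow a random link with probability 0.85, otherwise jump to a random page) can be computed by power iteration rather than by simulating walks. A short sketch; the link graph here is a small hypothetical one, since the slides' actual edges are shown only in a figure:

```python
def pagerank(links, damping=0.85, iters=100):
    """Power iteration for PageRank.

    links: dict mapping each node to its list of outgoing links.
    With probability `damping` the surfer follows a random outgoing
    link; otherwise (the 15% from the slides) it jumps to a
    uniformly random page.
    """
    nodes = list(links)
    n = len(nodes)
    rank = {p: 1.0 / n for p in nodes}
    for _ in range(iters):
        new = {p: (1 - damping) / n for p in nodes}
        for p in nodes:
            out = links[p]
            if not out:                      # dangling page: jump anywhere
                for q in nodes:
                    new[q] += damping * rank[p] / n
            else:
                for q in out:
                    new[q] += damping * rank[p] / len(out)
        rank = new
    return rank

# A tiny hypothetical link graph (illustrative only).
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(graph)
```

Because the ranks form a probability distribution over pages, they always sum to 1, and pages with many incoming links (here C) end up with the highest rank.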
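The counting formula for learning CPTs from complete data translates directly into code: count how often each (parent configuration, child value) pair occurs and normalize by the parent-configuration count. A minimal sketch; the six data rows are hypothetical stand-ins in the spirit of the slide's table:

```python
from collections import Counter

def estimate_cpt(data, child, parents):
    """Estimate P(child | parents) from fully observed data by counting:
    count(pa(X) = pi, X = x) / count(pa(X) = pi)."""
    joint = Counter()      # counts of (parent values, child value)
    marginal = Counter()   # counts of parent values alone
    for row in data:
        pa = tuple(row[p] for p in parents)
        joint[(pa, row[child])] += 1
        marginal[pa] += 1
    return {key: joint[key] / marginal[key[0]] for key in joint}

# Hypothetical observed samples over (Cloudy, Rainy, Sprinkler, Wet Grass).
data = [
    {"C": 0, "R": 0, "S": 1, "W": 1},
    {"C": 1, "R": 1, "S": 0, "W": 1},
    {"C": 1, "R": 0, "S": 0, "W": 0},
    {"C": 0, "R": 0, "S": 1, "W": 1},
    {"C": 1, "R": 1, "S": 0, "W": 1},
    {"C": 0, "R": 0, "S": 1, "W": 0},
]
cpt = estimate_cpt(data, "R", ["C"])
# e.g. P(R=1 | C=1) = 2/3 from these six rows
```

This is the maximum-likelihood estimate; as the slide notes, it mirrors rejection sampling with the real world supplying the samples.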
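The E/M steps above can be sketched on a deliberately tiny model: a single hidden variable C with one observed child W, rather than the full rain network. This is an illustrative simplification, not the slides' algorithm; the parameter names (pc for P(C=1), pw[c] for P(W=1 | C=c)) are my own. The E-step computes the posterior responsibility P(C=1 | w) for each sample, and the M-step re-estimates the parameters from those expected counts.

```python
import math

def em_two_node(w_data, pc=0.6, pw=(0.3, 0.8), iters=50):
    """EM for a toy net C -> W, with C hidden and W observed (0/1).

    Returns the final parameters and the per-iteration log-likelihoods,
    which EM guarantees to be non-decreasing.
    """
    lls = []
    for _ in range(iters):
        # E-step: posterior responsibility P(C=1 | w) for each sample.
        resp, ll = [], 0.0
        for w in w_data:
            p1 = pc * (pw[1] if w else 1 - pw[1])
            p0 = (1 - pc) * (pw[0] if w else 1 - pw[0])
            resp.append(p1 / (p1 + p0))
            ll += math.log(p1 + p0)
        lls.append(ll)
        # M-step: update parameters from expected counts.
        n = len(w_data)
        s1 = sum(resp)
        pc = s1 / n
        pw = (sum((1 - r) * w for r, w in zip(resp, w_data)) / (n - s1),
              sum(r * w for r, w in zip(resp, w_data)) / s1)
    return pc, pw, lls

pc_hat, pw_hat, lls = em_two_node([1, 1, 1, 0, 0, 1, 0, 1])
```

Running it shows the monotonic-convergence property from the next slide: each iteration improves (or at least never decreases) the data log-likelihood, with no learning rate anywhere.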
EM-Algorithm Properties
- No learning rate.
- Monotonic convergence: each iteration improves the log-likelihood of the data.
- The outcome depends on the initialization.
- Can be slow for large Bayes nets...

Decision Networks
MEU: choose the action which maximizes the expected utility given the evidence. We can directly operationalize this with decision networks: Bayes nets with nodes for utility and actions, which let us calculate the expected utility for each action.
New node types:
- Chance nodes (just like BNs)
- Action nodes (rectangles; cannot have parents; act as observed evidence)
- Utility node (diamond; depends on action and chance nodes)
Running example: an Umbrella action node and a Weather chance node (with a Forecast child) feed a utility node U.

Decision Networks: Action Selection
1. Instantiate all evidence.
2. Set the action node(s) each possible way.
3. Calculate the posterior over all parents of the utility node, given the evidence.
4. Calculate the expected utility for each action.
5. Choose the maximizing action.

Example: Decision Networks
  W     P(W)        A      W     U(A, W)
  sun   0.7         leave  sun   100
  rain  0.3         leave  rain  0
                    take   sun   20
                    take   rain  70
EU(leave) = 0.7 * 100 + 0.3 * 0 = 70; EU(take) = 0.7 * 20 + 0.3 * 70 = 35.
Optimal decision = leave.

Decisions as Outcome Trees
Each action leads to a chance node over Weather, with leaves U(take, sun), U(take, rain), U(leave, sun), U(leave, rain). Almost exactly like expectimax / MDPs.

Evidence in Decision Networks
Find P(W | F=bad). Given:
  W     P(W)   P(F=good | W)   P(F=bad | W)
  sun   0.7    0.8             0.2
  rain  0.3    0.1             0.9
First we join P(W) and P(F=bad | W), then we normalize:
  W     P(W, F=bad)   P(W | F=bad)
  sun   0.14          0.34
  rain  0.27          0.66

Example: Decision Networks, with evidence Forecast = bad
  W     P(W | F=bad)
  sun   0.34
  rain  0.66
EU(leave) = 0.34 * 100 = 34; EU(take) = 0.34 * 20 + 0.66 * 70 = 53.
Optimal decision = take.

Value of Information
Idea: compute the value of acquiring evidence. This can be done directly from the decision network.
Example: buying oil drilling rights. Two blocks, A and B; exactly one has oil,
worth k. You can drill in one location. The prior probabilities are 0.5 each and mutually exclusive, so drilling in either A or B has MEU = k/2.
  O   P(O)      D   O   U(D, O)
  a   1/2       a   a   k
  b   1/2       a   b   0
                b   a   0
                b   b   k
Question: what is the value of knowing which of A or B has oil? The value is the expected gain in MEU from the new information. A survey may say "oil in a" or "oil in b," with probability 0.5 each. If we know OilLoc, the MEU is k (either way), so the gain in MEU from knowing OilLoc is VPI(OilLoc) = k - k/2 = k/2. The fair price of the information is k/2.

Value of Information (general)
Assume we have evidence E = e. Value if we act now:
  MEU(e) = max_a sum_s P(s | e) U(s, a)
Assume we see that E' = e'. Value if we act then:
  MEU(e, e') = max_a sum_s P(s | e, e') U(s, a)
BUT E' is a random variable whose value is unknown, so we don't know what e' will be. Expected value if E' is revealed and then we act:
  MEU(e, E') = sum_{e'} P(e' | e) MEU(e, e')
Value of information: how much the MEU goes up by revealing E' first:
  VPI_e(E') = MEU(e, E') - MEU(e)
(VPI = "value of perfect information.")

VPI Example: Weather
Forecast distribution: P(F=good) = 0.59, P(F=bad) = 0.41.
  A      W     U(A, W)
  leave  sun   100
  leave  rain  0
  take   sun   20
  take   rain  70
MEU with no evidence: 70 (leave). MEU if the forecast is bad: 53 (take). MEU if the forecast is good: ≈ 94.9 (leave). So VPI(F) = 0.59 * 94.9 + 0.41 * 53 - 70 ≈ 7.7.

VPI Properties
- Nonnegative
- Nonadditive (consider, e.g., obtaining E_j twice)
- Order-independent

Quick VPI Questions
- The soup of the day is either clam chowder or split pea, but you wouldn't order either one. What's the value of knowing which it is?
- There are two kinds of plastic forks at a picnic. It must be that one is slightly better. What's the value of knowing which?
- You have $10 to bet double-or-nothing, and there is a 75% chance that the Rams will beat the 49ers. What's the value of knowing the outcome in advance? You must bet on the Rams either way. What's the value now?

That's All
Next lecture: Hidden Markov Models! (You're gonna love it!!)
Reminder: the next lecture is a week from today (on Wednesday). Go to the talk by Sanmay Das, Lopata 101, Friday 11-12.
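The umbrella VPI computation above can be carried out end to end in a few lines: compute MEU with no evidence, compute MEU after each possible forecast (join-then-normalize for the posterior, then maximize over actions), and take the expected gain. A minimal sketch using the slides' numbers:

```python
# Umbrella decision network from the slides.
P_W = {"sun": 0.7, "rain": 0.3}
P_F_given_W = {"sun": {"good": 0.8, "bad": 0.2},
               "rain": {"good": 0.1, "bad": 0.9}}
U = {("leave", "sun"): 100, ("leave", "rain"): 0,
     ("take", "sun"): 20, ("take", "rain"): 70}

def posterior_weather(f):
    """P(W | F=f): join P(W) with P(F=f | W), then normalize."""
    joint = {w: P_W[w] * P_F_given_W[w][f] for w in P_W}
    z = sum(joint.values())
    return {w: p / z for w, p in joint.items()}

def meu(belief):
    """Max over actions of expected utility under a weather belief."""
    return max(sum(belief[w] * U[(a, w)] for w in belief)
               for a in ("leave", "take"))

meu_now = meu(P_W)                       # act with no evidence: 70 (leave)
p_f = {f: sum(P_W[w] * P_F_given_W[w][f] for w in P_W)
       for f in ("good", "bad")}         # forecast distribution: 0.59 / 0.41
meu_with_forecast = sum(p_f[f] * meu(posterior_weather(f)) for f in p_f)
vpi_forecast = meu_with_forecast - meu_now   # about 7.7
```

The same pattern answers the oil-drilling question: with a uniform prior the act-now MEU is k/2, and perfect information raises it to k, so the VPI is k/2.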