Probabilistic Robotics: A Tutorial
Juan Antonio Fernández Madrigal
October 2004
System Engineering and Automation Dpt., University of Málaga (Spain)

Contents
1. Introduction
 1.1 Probabilistic Robotics?
 1.2 The Omnipresent Core: Bayes' Rule
 1.3 Let's Filter! (Bayes Filter)
2. You Will Find Them Everywhere: Basic Mathematical Tools
 2.1 A Visit to the Casino: Monte Carlo Methods
 2.2 Partially Unknown Uncertainty: The EM Algorithm
 2.3 Approximating Uncertainty Efficiently: Particle Filters
3. The Foundations: The Common Bayesian Framework
 3.1 Graphs plus Uncertainty: Graphical Models
 3.2 Arrows on Arcs: Bayesian Networks
 3.3 Let it Move: Dynamic Bayesian Networks (DBNs)
4. Forgetting the Past: Markovian Models
 4.1 It is Easy if it is Gaussian: Kalman Filters
 4.2 On the Line: Markov Chains
 4.3 What to Do?: Markov Decision Processes (MDPs)
 4.4 For Non-Omniscient People: Hidden Markov Models (HMMs)
 4.5 For People that Do Things: POMDPs

1. Introduction

1.1 Probabilistic Robotics?

What is Probabilistic Robotics?
-Robotics that uses probability calculus for modeling and/or reasoning about robot actions and perceptions.
-State-of-the-art robotics.

Why Probabilistic Robotics?
-For coping with uncertainty in the robot's environment.
-For coping with uncertainty/noise in the robot's perceptions.
-For coping with uncertainty/noise in the robot's actions.

1.2 The Omnipresent Core: Bayes' Rule (~1750)

-A rule for updating an existing belief (the probability of some variable) given new evidence (the occurrence of some event).
-In spite of its simplicity, it is the basis for most probabilistic approaches in robotics and other sciences.

 P(R=r | e) = P(e | R=r) P(R=r) / Σ_{all r} P(e | R=r) P(R=r)

where:
-P(R=r | e) is the posterior probability: the probability that the random variable R takes the value r, given that the event e has occurred.
-P(e | R=r) is the conditional probability (likelihood): the probability that the event e occurs if the random variable R took the value r.
-P(R=r) is the prior probability (belief): the probability that R would take the value r anyway.
-Σ_{all r} P(e | R=r) P(R=r) = P(e) is the normalizing factor that makes the posterior add up to 1.

1.3 Let's Filter! (Bayes Filter)

-Bayes' Rule can be iterated to improve the estimate over time:

 P(R=r_{t2} | e_{t1..t2}) = P(e_{t2} | R=r_{t2}) P(R=r_{t2}) / Σ_{all r} P(e_{t2} | R=r) P(R=r)

 where the prior P(R=r_{t2}) is the belief produced by the previous iterations, i.e., by the evidence gathered from t1 up to the instant before t2. This iterated update is the Bayes Filter.
-In general, there are the following possibilities, depending on where the estimation instant lies with respect to the interval [t1, t2] of known evidence (the original slide illustrates this on a timeline):
 -Estimation at t2, the last instant of the evidence: this is called "filtering" (the Bayes Filter).
 -Estimation at an instant inside [t1, t2]: "fixed-lag smoothing".
 -Estimation at an instant beyond t2: "prediction".
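To make the iterated update concrete, here is a minimal Python sketch of Bayes' rule applied repeatedly to a discrete variable. The two-hypothesis prior and the sensor likelihood values are hypothetical, chosen only for illustration, and the sketch assumes a static state (no motion model between updates).

```python
import numpy as np

def bayes_update(prior, likelihood):
    """One application of Bayes' rule over a discrete variable R.

    prior[i]      = P(R = r_i)        (belief before seeing the evidence)
    likelihood[i] = P(e | R = r_i)    (how well r_i explains the evidence e)
    Returns the posterior P(R = r_i | e), normalized so it adds up to 1.
    """
    unnormalized = likelihood * prior
    return unnormalized / unnormalized.sum()

# Iterating the update over a stream of evidence is the (static-state) Bayes filter.
belief = np.array([0.5, 0.5])                 # hypothetical prior over two hypotheses
for likelihood in [np.array([0.9, 0.2]),      # hypothetical sensor likelihoods
                   np.array([0.8, 0.3])]:
    belief = bayes_update(belief, likelihood)
print(belief)                                 # belief concentrates on the first hypothesis
```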
2. You Will Find Them Everywhere: Basic Mathematical Tools

2.1 A Visit to the Casino: Monte Carlo Methods (1946)

-A set of methods based on statistical sampling for approximating some value(s) (any quantitative data) when analytical methods are not available or are computationally unsuitable.
-The error of the approximation does not depend on the dimensionality of the data.
-In its general form, it is a way to approximate integrals. Given the difficult integral (the function f is known):

 I = ∫_0^1 ∫_0^1 ... ∫_0^1 f(u_1, u_2, ..., u_n) du_1 du_2 ... du_n

it can be approximated by the following steps:
 1) Take a uniform probability distribution over the region of integration: U ~ uniform on (0,1)^n.
 2) Calculate the expectation of f(U): E(f(U)) = ∫ f(u) p(u) du, where p(u) is the probability density function of u.
 3) Since U is uniform, p(u) = 1, so E(f(U)) = I.
 4) It follows from probability calculus that the expectation can be approximated by statistical sampling (m samples): E(f(U)) ≈ (1/m) Σ_{k=1..m} f(U_k).
 5) The standard error is σ/√m, where σ is the standard deviation of each sample (unknown).
 6) The error diminishes with many samples, but maybe slowly...
There are techniques of "variance reduction" that also reduce σ: antithetic variates, control variates, importance sampling, stratified sampling...

2.2 Partially Unknown Uncertainty: The EM Algorithm

-The Expectation-Maximization algorithm (1977) can be used in general for estimating any probability distribution from real measurements that may be incomplete.
-The algorithm works in two steps (E and M) that are iterated, improving the likelihood of the estimate over time. It can be shown that the algorithm converges to a local optimum.
 1. E-step (Expectation). Given the current estimate of the distribution, calculate the expectation of the complete-data (log-)likelihood of the measurements with respect to that estimate.
 2. M-step (Maximization). Produce a new estimate that improves (maximizes) that expectation.

-Mathematical formulation (in the general case):
 Z = (X, Y): all the data; X: the measured data; Y: the missing or hidden data (not measured, unknown).
 p(Z | M) = p(X, Y | M): the complete-data likelihood given a model M (we will maximize the expectation of its logarithm).
-E-step. In general, E[h(W) | R=r] = ∫_{all w of W} h(w) p(w | r) dw. Thus,

 E[log p(X,Y | M) | X, M^(i-1)] = ∫_{all y of Y} log p(X, y | M) p(y | X, M^(i-1)) dy

 where M is the model to be optimized and M^(i-1) is the estimate from the previous iteration.
-M-step:

 M^(i) = argmax over M of E[log p(X,Y | M) | X, M^(i-1)]

 Variation: take as M^(i) any M that makes the expectation greater than it was with M^(i-1).

2.3 Approximating Uncertainty Efficiently: Particle Filters

-A Monte Carlo filter (i.e., Monte Carlo sampling iterated over time).
-It is useful due to its efficiency.
-It represents probability distributions by samples ("particles") with associated weights, and yields information about the distributions by computing on those samples.
-As the number of samples increases, the accuracy of the estimate increases.
-There is a diversity of particle filter algorithms, depending on how the samples are selected (a minimal sketch of one step follows below).
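As an illustration of the sampling idea, here is a minimal Python sketch of one predict-update-resample step of a bootstrap particle filter. The 1-D random-walk motion model, the Gaussian measurement likelihood and all numeric values are hypothetical, chosen only so the sketch runs end to end; they are not taken from the tutorial.

```python
import numpy as np

def particle_filter_step(particles, weights, motion_fn, likelihood_fn, measurement, rng):
    """One predict-update-resample step of a bootstrap particle filter.

    particles     : (m, d) array of state samples
    weights       : (m,) importance weights, summing to 1
    motion_fn     : samples the next state of each particle (process model)
    likelihood_fn : returns P(measurement | state) for each particle
    """
    # Predict: propagate each sample through the (stochastic) motion model.
    particles = motion_fn(particles, rng)
    # Update: reweight samples by how well they explain the measurement.
    weights = weights * likelihood_fn(particles, measurement)
    weights /= weights.sum()
    # Resample: draw m particles proportionally to their weights (avoids degeneracy).
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

# Hypothetical 1-D example: random-walk motion, Gaussian measurement noise.
rng = np.random.default_rng(0)
m = 1000
particles = rng.normal(0.0, 1.0, size=(m, 1))
weights = np.full(m, 1.0 / m)
motion = lambda p, rng: p + rng.normal(0.0, 0.1, size=p.shape)
likelihood = lambda p, z: np.exp(-0.5 * ((p[:, 0] - z) / 0.5) ** 2)
particles, weights = particle_filter_step(particles, weights, motion, likelihood, 0.3, rng)
print(particles.mean())   # approximate posterior mean of the state
```

Resampling at every step is only one of the possible choices mentioned in the text; other variants resample only when the weights degenerate.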
3. The Foundations: The Common Bayesian Framework

3.1 Graphs plus Uncertainty: Graphical Models

-A common formalism that copes with both uncertainty and complexity, two problems commonly found in applied mathematics and engineering.
-Many specific derivations: mixture models, factor analysis, hidden Markov models, Kalman filters, etc.
-A graphical model is a graph with associated probabilities. Nodes represent random variables. An arc between two nodes indicates a statistical dependence between the two variables.
-Three basic types:
 A) Undirected graphs (= Markov Random Fields): used in physics, computer vision, ...
 B) Directed graphs (= Bayesian Networks): used in artificial intelligence, statistics, ...
 C) Mixed graphs (= Chain Graphs).

3.2 Arrows on Arcs: Bayesian Networks

-Nodes (variables) can hold discrete or continuous values. Arcs represent causality (and conditional probability).
-The model is completely defined by its graph structure, the values of its nodes (variables) and the conditional probabilities attached to the arcs.
-Classic example ("sprinkler" network): Cloudy (C) is a parent of Sprinkler (S) and Rain (R); Sprinkler (S) and Rain (R) are parents of Wet grass (W). The conditional probabilities are:
 P(C=true) = 0.5, P(C=false) = 0.5
 P(S=true | C=true) = 0.1, P(S=false | C=true) = 0.9; P(S=true | C=false) = 0.5, P(S=false | C=false) = 0.5
 P(R=true | C=true) = 0.8, P(R=false | C=true) = 0.2; P(R=true | C=false) = 0.2, P(R=false | C=false) = 0.8
 P(W=true | S=true, R=true) = 0.99, P(W=false | S=true, R=true) = 0.01
 P(W=true | S=true, R=false) = 0.9, P(W=false | S=true, R=false) = 0.1
 P(W=true | S=false, R=true) = 0.9, P(W=false | S=false, R=true) = 0.1
 P(W=true | S=false, R=false) = 0, P(W=false | S=false, R=false) = 1

Inference

1) Bottom-up reasoning, or diagnostic reasoning: from effects to causes.
-For example: given that the grass is wet (W=true), which is more likely, the sprinkler being on (S=true) or the rain (R=true)? The causes are C, S and R; the effect is W.
-We seek P(S=true | W=true) and P(R=true | W=true).
-Using the definition of conditional probability:

 P(S=true | W=true) = P(S=true, W=true) / P(W=true)

-In general, using the chain rule: P(C,S,R,W) = P(C) P(S|C) P(R|S,C) P(W|S,R,C).
-But by the graph structure (its conditional independences): P(C,S,R,W) = P(C) P(S|C) P(R|C) P(W|S,R).
-The required terms are obtained by marginalization, e.g.:

 P(W=true) = Σ_{all c,s,r} P(C=c, S=s, R=r, W=true)

2) Top-down reasoning, or causal / generative reasoning: from causes to effects.
-For example: given that it is cloudy (C=true), what is the probability that the grass is wet (W=true)?
-We seek P(W=true | C=true). The inference is similar.

Causality

-It is possible to formalise whether a variable (node) is a cause of another or whether they are merely correlated.
-This would be useful, for example, for a robot to learn the effects of its actions...
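The bottom-up query above can be reproduced in a few lines of Python by brute-force enumeration of the joint distribution. The CPT values are the ones listed above; the function names and the dictionary encoding are only illustrative choices, not part of the tutorial.

```python
import itertools

# CPTs of the cloudy/sprinkler/rain/wet-grass network from the text.
P_C = {True: 0.5, False: 0.5}
P_S_given_C = {True: {True: 0.1, False: 0.9}, False: {True: 0.5, False: 0.5}}  # [c][s]
P_R_given_C = {True: {True: 0.8, False: 0.2}, False: {True: 0.2, False: 0.8}}  # [c][r]
P_W_given_SR = {(True, True): 0.99, (True, False): 0.9,
                (False, True): 0.9, (False, False): 0.0}                        # P(W=true | s, r)

def joint(c, s, r, w):
    """P(C,S,R,W) = P(C) P(S|C) P(R|C) P(W|S,R), as given by the graph structure."""
    pw = P_W_given_SR[(s, r)]
    return P_C[c] * P_S_given_C[c][s] * P_R_given_C[c][r] * (pw if w else 1.0 - pw)

def posterior(query, value, evidence):
    """P(query=value | evidence) by enumeration (marginalization) over all variables."""
    num = den = 0.0
    for c, s, r, w in itertools.product([True, False], repeat=4):
        assign = {"C": c, "S": s, "R": r, "W": w}
        if any(assign[k] != v for k, v in evidence.items()):
            continue
        p = joint(c, s, r, w)
        den += p
        if assign[query] == value:
            num += p
    return num / den

print(posterior("S", True, {"W": True}))   # roughly 0.43
print(posterior("R", True, {"W": True}))   # roughly 0.71: rain is the more likely explanation
```

Plain enumeration is exponential in the number of variables, so it only illustrates the definitions; practical Bayesian-network inference relies on smarter algorithms.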
3.3 Let it Move: Dynamic Bayesian Networks (DBNs)

-Bayesian Networks extended with time; they are not "dynamic" in the sense that the graph structure or the parameters vary.
-(Very) simplified taxonomy of Bayesian Networks (reconstructed from the original diagram):
 Graphical Models
  Undirected = Markov Random Fields
  Directed = Bayesian Networks
   Non-temporal
   Temporal = Dynamic Bayesian Networks
    Markov Processes (independence of the future w.r.t. all the past)
     No actions, totally observable: Markov Chains
     No actions, partially observable: Hidden Markov Models (HMMs); Kalman Filters (Gaussian models)
     Actions, totally observable: Markov Decision Processes (MDPs)
     Actions, partially observable: Partially Observable Markov Decision Processes (POMDPs)

4. Forgetting the Past: Markovian Models

4.1 It is Easy if it is Gaussian: Kalman Filters (1960)

-They model dynamic systems with partial observability and Gaussian probability distributions.
-It is of interest to estimate the current state. (The EM algorithm can be thought of as an alternative that is not restricted to Gaussians.)
-Applications: any in which the state of a known dynamical system must be estimated under Gaussian uncertainty / noise: computer vision (tracking), robot SLAM (if the map is considered part of the state), ...
-Extensions: to reduce the computational cost (e.g., when the state has a large description), to cope with more than one hypothesis (e.g., when two indistinguishable landmarks yield a bimodal distribution for the pose of a robot), to cope with non-linear systems (through linearization: the EKF), ...

-Mathematical formulation:
 System model: x = A x' + B u + e_c, which defines P(x | u, x'), where x is the current state of the system, x' the last state, u the actions performed by the system, A and B the known linear model of the system, and e_c the Gaussian noise in the system actions, with mean m_c = 0 and covariance matrix S_c.
 Observation model: z = C x + e_m, which defines P(z | x), where z are the current observations, x is the current state of the system, C the known linear model of the observations, and e_m the Gaussian noise in the observations, with mean m_m = 0 and covariance matrix S_m.

-Through a Bayes Filter, the Gaussian state estimate (mean m_t, covariance S_t) is updated as:
 Prediction: m'_t = A m_{t-1} + B u_t;  S'_t = A S_{t-1} A^T + S_c
 Correction: K_t = S'_t C^T (C S'_t C^T + S_m)^-1
   m_t = m'_t + K_t (z_t - C m'_t)
   S_t = (I - K_t C) S'_t
 (m_t, S_t) is the state estimate at time t.
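A compact Python sketch of these prediction and correction equations follows. The 1-D constant-position model and all the numeric values are hypothetical, chosen only to show one filter step, and the name kalman_step is an illustrative choice.

```python
import numpy as np

def kalman_step(mu, Sigma, u, z, A, B, C, Sigma_c, Sigma_m):
    """One prediction-correction step of the Kalman filter equations above."""
    # Prediction through the linear system model x = A x' + B u + noise.
    mu_bar = A @ mu + B @ u
    Sigma_bar = A @ Sigma @ A.T + Sigma_c
    # Correction with the observation z = C x + noise.
    K = Sigma_bar @ C.T @ np.linalg.inv(C @ Sigma_bar @ C.T + Sigma_m)
    mu_new = mu_bar + K @ (z - C @ mu_bar)
    Sigma_new = (np.eye(len(mu)) - K @ C) @ Sigma_bar
    return mu_new, Sigma_new

# Hypothetical 1-D example: a robot on a line, commanded displacement u,
# noisy position measurement z.
A = np.array([[1.0]]); B = np.array([[1.0]]); C = np.array([[1.0]])
Sigma_c = np.array([[0.05]]); Sigma_m = np.array([[0.2]])
mu, Sigma = np.array([0.0]), np.array([[1.0]])
mu, Sigma = kalman_step(mu, Sigma, np.array([0.5]), np.array([0.6]), A, B, C, Sigma_c, Sigma_m)
print(mu, Sigma)   # estimate pulled toward the measurement, uncertainty reduced
```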
4.2 On the Line: Markov Chains

-The nodes of the network represent a random variable X at given instants of time (unknown except for the first node).
-The arc from node X_n to node X_{n+1} represents the conditional probability of the variable at instant n+1 given its value at instant n; no earlier instant is considered, since the model is Markovian.
-Instants of time are discrete. All the conditionals are known.
-It is of interest: a) causal reasoning, i.e., obtaining X_n from its past; and b) whether the probability distribution converges over time, which is assured if the chain is ergodic (any node is reachable in one step from any other node).
-Applications: direct (physics, computer networks, ...) and indirect (as part of more sophisticated models).

4.3 What to Do?: Markov Decision Processes (MDPs)

-Markov processes with actions (outgoing arcs) that can be carried out at each node (state), with some reward obtained as a result of a given action in a given state.
-It is of interest to obtain the actions that optimize the reward. An assignment of actions to states (which induces a Markov chain) is called a policy.
-It can be shown that in every MDP there always exists an optimal policy (one that optimizes the reward).
-Obtaining the optimal policy is expensive (though polynomial). There are several algorithms for solving it, some of them reducing that cost (by hierarchies, etc.). The most classical one is value iteration (a small sketch is given after this section).
-Applications: decision making in general, robot path planning, travel route planning, elevator scheduling, bank customer retention, autonomous aircraft navigation, manufacturing processes, network switching and routing, ...
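As referenced above, here is a minimal Python sketch of value iteration for a tiny discrete MDP. The two-state, two-action transition and reward numbers are hypothetical and serve only to exercise the loop, and the discount factor gamma is part of the standard formulation rather than something discussed in the tutorial.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, eps=1e-6):
    """Value iteration for a small discrete MDP.

    P[a][s][s'] : probability of reaching s' when doing action a in state s
    R[a][s]     : immediate reward for doing action a in state s
    Returns the optimal value per state and a greedy (optimal) policy.
    """
    n_actions, n_states = len(P), len(P[0])
    V = np.zeros(n_states)
    while True:
        # Bellman backup: best expected reward-to-go over all actions.
        Q = np.array([R[a] + gamma * P[a] @ V for a in range(n_actions)])
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < eps:
            return V_new, Q.argmax(axis=0)
        V = V_new

# Hypothetical 2-state, 2-action MDP.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],      # transitions under action 0
              [[0.5, 0.5], [0.0, 1.0]]])     # transitions under action 1
R = np.array([[1.0, 0.0],                    # rewards of action 0 in each state
              [2.0, -1.0]])                  # rewards of action 1 in each state
V, policy = value_iteration(P, R)
print(V, policy)                             # optimal values and greedy action per state
```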
4.4 For Non-Omniscient People: Hidden Markov Models (~1960-70)

-Markov processes without actions and with partial observability. The states of the network are not directly accessible, except through some stochastic measurement; that is, the observations are a probabilistic function of the state.
-It is of interest:
 a) What is the probability of a sequence of observations, given the network?
 b) Which states have we most likely visited, given the observations and the network parameters?
 c) Which network parameters maximize the probability of having obtained those observations?
-Applications: speech processing, robot SLAM, bioinformatics (gene finding, protein modeling, etc.), image processing, finance, traffic, ...

-Elements of an HMM:
 N = number of states in the network (i-th state = s_i).
 M = number of different possible observations (k-th observation = o_k).
 A = matrix (N x N) of state transition probabilities: a_xy = P(q_{t+1} = s_y | q_t = s_x).
 B = matrix (N x M) of observation probabilities: b_x(o_k) = P(o_t = o_k | q_t = s_x).
 Π = vector (1 x N) of initial state probabilities: π_x = P(q_0 = s_x).
 λ = HMM model = (A, B, Π).

-Solution to problem a): what is the probability of a sequence of observations (of length T), given the network?
 -Direct approach: enumerate all the possible sequences of states (paths) of length T in the network; for each one, calculate the probability that the given sequence of observations is produced if that path is followed, calculate the probability of the path itself, and thus the joint probability of the path and of the observations along it; finally, sum over all the possible paths in the network. Its cost depends on the number of paths of length T: on the order of 2T·N^T operations, unfeasible for T long enough.
 -Efficient approach: the forward-backward procedure (a sketch in code is given at the end of this section).
  -A forward variable is defined, α_t(i) = P(O_1, O_2, ..., O_t, q_t = s_i | λ): the probability of observing the sequence O_1, O_2, ..., O_t and of being in state s_i at time t.
  -It is calculated recursively:
   1. α_1(i) = π_i b_i(O_1), for all states i from 1 to N.
   2. α_{t+1}(j) = [ Σ_{i=1..N} α_t(i) a_ij ] b_j(O_{t+1}), for all states j from 1 to N.
   3. P(O | λ) = Σ_{i=1..N} α_T(i).
  -This calculation is O(N^2 T).
  -Alternatively, a backward variable can be defined, β_t(i) = P(O_{t+1}, O_{t+2}, ..., O_T | q_t = s_i, λ): the probability of the observation sequence O_{t+1}, ..., O_T given that state s_i has been reached at time t.
  -It is calculated recursively (backwards in time):
   1. β_T(i) = 1, for all states i from 1 to N.
   2. β_t(i) = Σ_{j=1..N} a_ij b_j(O_{t+1}) β_{t+1}(j), for all states i from 1 to N.
   3. P(O | λ) = Σ_{i=1..N} π_i b_i(O_1) β_1(i).

-Solution to problem b): which states have we most likely visited, given a sequence of observations of length T and the network?
 -There is not a unique solution (unlike problem a)): it depends on the optimality criterion chosen. But once a criterion is chosen, the solution can be found analytically.
 -The Viterbi algorithm finds the single best sequence of states, the one that maximizes the probability of the whole state sequence given the observations.
 -The following two variables are defined:
  δ_t(i) = max over q_1, ..., q_{t-1} of P(q_1, q_2, ..., q_t = s_i, O_1, O_2, ..., O_t | λ)
  (the highest probability over all the state sequences that end in state s_i at time t and produce the given observations). Recursively: δ_t(j) = [ max_{i=1..N} (δ_{t-1}(i) a_ij) ] b_j(O_t).
  ψ_t(j) = argmax_{i=1..N} (δ_{t-1}(i) a_ij)
  (it records the state that maximizes the expression, i.e., the most likely predecessor of state s_j at time t).
 -The algorithm works as follows:
  1. δ_1(i) = π_i b_i(O_1) and ψ_1(i) = 0, for all states i from 1 to N.
  2. δ_t(j) = [ max_{i=1..N} (δ_{t-1}(i) a_ij) ] b_j(O_t) and ψ_t(j) = argmax_{i=1..N} (δ_{t-1}(i) a_ij), for all states j from 1 to N, for t = 2, ..., T.
  3. P* = max_{i=1..N} δ_T(i); q*_T = argmax_{i=1..N} δ_T(i) (the ending state).
  4. q*_t = ψ_{t+1}(q*_{t+1}), for t = T-1, ..., 1 (backtracking to retrieve all the other states of the sequence).

-Solution to problem c): which network parameters maximize the probability of having obtained the observations?
 -Not only is there no unique solution (as in problem b)), but there is no analytical procedure to obtain it: only approximations are available.
 -Approximation algorithms that obtain locally optimal models exist. The most popular is EM (Expectation-Maximization), which is called Baum-Welch when adapted to HMMs:
  -The sequence of observations is considered the measured data.
  -The sequence of states that yielded those observations is the missing or hidden data.
  -The matrices A, B and Π are the parameters to approximate.
  -The number of states is known a priori.
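Below is a minimal Python sketch of the forward procedure and of the Viterbi algorithm described above. The 2-state, 2-symbol model (A, B, π) and the observation sequence are hypothetical values, chosen only to make the sketch runnable.

```python
import numpy as np

def forward(A, B, pi, obs):
    """Forward procedure: computes P(O | lambda) in O(N^2 T) for the HMM (A, B, pi)."""
    N, T = A.shape[0], len(obs)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                       # alpha_1(i) = pi_i b_i(O_1)
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]   # sum over predecessors, then emit
    return alpha[-1].sum()                             # P(O | lambda) = sum_i alpha_T(i)

def viterbi(A, B, pi, obs):
    """Viterbi algorithm: the single most likely state sequence for the observations."""
    N, T = A.shape[0], len(obs)
    delta = pi * B[:, obs[0]]
    psi = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] * A                    # delta_{t-1}(i) a_ij
        psi[t] = scores.argmax(axis=0)                 # best predecessor of each state j
        delta = scores.max(axis=0) * B[:, obs[t]]
    path = [int(delta.argmax())]                       # q*_T
    for t in range(T - 1, 0, -1):                      # backtrack through psi
        path.append(int(psi[t][path[-1]]))
    return path[::-1], delta.max()

# Hypothetical 2-state, 2-symbol HMM.
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = np.array([0.5, 0.5])
obs = [0, 0, 1]                                        # observed symbol indices
print(forward(A, B, pi, obs))                          # P(O | lambda)
print(viterbi(A, B, pi, obs))                          # most likely state path and P*
```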
4.5 For People that Do Things: POMDPs

-"Partially Observable Markov Decision Processes".
-Markov processes with both actions and partial observability.
-It is of interest:
 -The three problems of HMMs: the likelihood of the model, localisation within a given model, and the calculation of the model itself.
 -The basic problem of MDPs: the best policy to follow (through actions) in order to obtain the greatest benefit.
-Applications: any in which it is necessary both to model some real process (or environment) and to act optimally with that model. Only recently applied to robotics (1996).

References

Thrun S. (2002), "Robotic Mapping: A Survey", Technical Report CMU-CS-02-111.
Murphy K. (1998), "A Brief Introduction to Graphical Models and Bayesian Networks", http://www.ai.mit.edu/~murphyk/Bayes/bayes.html.
Murphy K. (2000), "A Brief Introduction to Bayes' Rule", http://www.ai.mit.edu/~murphyk/Bayes/bayesrule.html.
Contingency Analysis (2004), "MonteCarlo Method", http://www.riskglossary.com/articles/monte_carlo_method.htm.
Bilmes J.A. (1998), "A Gentle Tutorial of the EM Algorithm and its Applications to Parameter Estimation for Gaussian Mixture and Hidden Markov Models", International Computer Science Institute, Technical Report, CA (USA).
Arulampalam S., Maskell S., Gordon N., Clapp T. (2001), "A Tutorial on Particle Filters for On-Line Non-Linear / Non-Gaussian Bayesian Tracking", IEEE Transactions on Signal Processing, vol. 50, no. 2.
West M. (2004), "Elements of Markov Chain Structure and Convergence", notes of the Fall 2004 course, http://www.stat.duke.edu/courses/Fall04/sta214/Notes/214.5.pdf.
Moore A. (2002), "Markov Systems, Markov Decision Processes, and Dynamic Programming", teaching slides at CMU.
Harmon M.E., Harmon S.S. (2000), "Reinforcement Learning: A Tutorial", reading of the New Bulgarian University, www.nbu.bg/cogs/events/2000/Readings/Petrov/rltutorial.pdf.
Cassandra T. (1999), "POMDPs for Dummies", http://www.cs.brown.edu/research/ai/pomdp/tutorial/index.html.
Rabiner L. (1989), "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition", Proceedings of the IEEE, vol. 77, no. 2.
Cassandra A.R., Kaelbling L.P., Kurien J.A. (1996), "Acting under Uncertainty: Discrete Bayesian Models for Mobile-Robot Navigation", Proceedings of IROS'96.