Introduction to Probabilistic Reasoning and Bayesian Networks
Hongtao Du
Group Presentation

Outline
- Uncertain Reasoning
- Probabilistic Reasoning
- Bayesian Network (BN)
- Dynamic Bayesian Network (DBN)

Reasoning
- The activity of guessing the state of the domain from prior knowledge and observations.
- Causal reasoning
- Diagnostic reasoning
- Combinations of these two

Uncertain Reasoning (Guessing)
- Some aspects of the domain are often unobservable and must be estimated indirectly through other observations.
- The relationships among domain events are often uncertain, particularly the relationship between the observables and the non-observables.
- The observations themselves may be unreliable.
- Even when events are observable, we often lack sufficient resources to observe all relevant events.
- Even when the relations among events are certain, it is often impractical to analyze all of them.

Probabilistic Reasoning
- A methodology founded on Bayesian probability theory.
- Events and objects in the real world are represented by random variables.
- Probabilistic models: Bayesian reasoning, evidence theory, robust statistics, recursive operators.

Graphical Model
- A tool that visually illustrates conditional independence among the variables in a given problem.
- Consists of nodes (random variables or states) and edges (connecting two nodes, directed or undirected).
- The lack of an edge represents conditional independence between variables.
- Terminology: chain, path, cycle, directed acyclic graph (DAG), parents and children.

Bayesian Network (BN)
- Also known as a probabilistic network, belief network, or causal network.
- A specific type of graphical model that is represented as a directed acyclic graph.
- A BN consists of a set of variables (nodes) $V = \{1, 2, \dots, k\}$, a set of dependencies (edges) $D$, and a set of probability distribution functions (pdfs) $P$, one for each variable.

Assumptions
- $0 \le P(X) \le 1$
- $P(X) = 1$ if and only if $X$ is certain.
- If $X$ and $Y$ are mutually exclusive, then $P(X \vee Y) = P(X) + P(Y)$.
- Joint probability: $P(X, Y) = P(X \mid Y) P(Y)$.
- Bayes' rule: $P(X \mid Y) = \frac{P(Y \mid X) P(X)}{P(Y)}$, where $X$ represents the hypothesis, $Y$ represents the evidence, $P(Y \mid X)$ is the likelihood, and $P(X \mid Y)$ is the posterior probability (a numeric sketch appears after the Inference slides below).
- If $X$ and $Y$ are conditionally independent given $Z$: $P(X \mid Z, Y) = P(X \mid Z)$.

Belief Propagation
- Given some certain evidence, a BN operates by propagating beliefs throughout the network. For the chain $Z \to Y \to U \to V$:
  $P(Z, Y, U, V) = P(Z) \cdot P(Y \mid Z) \cdot P(U \mid Y) \cdot P(V \mid U)$
- In general,
  $P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
  where $Pa(X_i)$ denotes the parents of node $X_i$.

Explaining Away
- If a node is observed, its parents become dependent.
- Two causes (parents) compete to explain the observed data (child); see the enumeration sketch after this part.

Tasks in a Bayesian Network
- Inference
- Learning

Inference
- Inference is the task of computing the probability of each state of a node in a BN when the other variables are known.
- Method: divide the set of BN nodes into non-overlapping subsets of conditionally independent nodes.
- Example: let $Z_L = X_N \cup Y_M$ with $L = N + M$, where $X_N = \{x_0, \dots, x_{N-1}\}$ are hidden variables, $Y_M = \{y_0, \dots, y_{M-1}\}$ are observed variables, and $U_K \subseteq Z_L$.
- Given that $Y$ is the observed variable, the goal is to find the conditional pdf $P(U_K \mid Y)$ over $U_K$.
- Case 1: $U_K \subseteq Y$. Then
  $P(U_K \mid Y) = \prod_{i=1}^{K} \delta(u_i - y_i)$, where $\delta(x) = 1$ if $x = 0$ and $\delta(x) = 0$ otherwise.
- Case 2: $U_K \subseteq X$. Then
  $P(U_K \mid Y) = \frac{P(U_K, Y)}{P(Y)} = \frac{P(U_K, Y)}{\sum_{U_K} P(U_K, Y)}$, where
  $P(U_K, Y) = \sum_{x \in X \setminus U_K} P(x, U_K, Y)$.
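A minimal numeric sketch of Bayes' rule from the Assumptions slide. The scenario and all probabilities are invented for illustration: a rare hypothesis X and a moderately reliable piece of evidence Y.

```python
# Invented numbers: X is a hypothesis ("disease present"),
# Y is evidence ("test positive").
p_x = 0.01          # prior P(X)
p_y_given_x = 0.95  # likelihood P(Y|X)
p_y_given_not_x = 0.10

# Total probability of the evidence: P(Y) = P(Y|X)P(X) + P(Y|~X)P(~X)
p_y = p_y_given_x * p_x + p_y_given_not_x * (1 - p_x)

# Posterior: P(X|Y) = P(Y|X) P(X) / P(Y)
p_x_given_y = p_y_given_x * p_x / p_y
print(f"P(X|Y) = {p_x_given_y:.3f}")  # ~0.088: still unlikely despite evidence
```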
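In the same spirit, a minimal sketch of the chain factorization $P(Z, Y, U, V) = P(Z) P(Y \mid Z) P(U \mid Y) P(V \mid U)$ and of Case-2 inference by brute-force enumeration. All CPT values and the observation $V = 1$ are invented.

```python
from itertools import product

P_Z = {0: 0.6, 1: 0.4}
P_Y_given_Z = {(0, 0): 0.7, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.8}  # key: (y, z)
P_U_given_Y = {(0, 0): 0.9, (1, 0): 0.1, (0, 1): 0.4, (1, 1): 0.6}  # key: (u, y)
P_V_given_U = {(0, 0): 0.5, (1, 0): 0.5, (0, 1): 0.1, (1, 1): 0.9}  # key: (v, u)

def joint(z, y, u, v):
    # P(Z, Y, U, V) = P(Z) P(Y|Z) P(U|Y) P(V|U)
    return P_Z[z] * P_Y_given_Z[(y, z)] * P_U_given_Y[(u, y)] * P_V_given_U[(v, u)]

# Case-2 inference by enumeration: P(U | V=1) = P(U, V=1) / sum_U P(U, V=1),
# marginalizing the joint over the remaining hidden nodes Z and Y.
v_obs = 1
p_u_v = {u: sum(joint(z, y, u, v_obs) for z, y in product((0, 1), repeat=2))
         for u in (0, 1)}
p_v = sum(p_u_v.values())
print({u: p / p_v for u, p in p_u_v.items()})  # posterior P(U | V = 1)
```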
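A minimal enumeration sketch of explaining away, again with invented numbers: two independent causes A and B share a child C, so observing C makes them dependent.

```python
from itertools import product

P_A = {0: 0.9, 1: 0.1}   # prior on cause A, invented
P_B = {0: 0.9, 1: 0.1}   # prior on cause B, invented
# P(C = 1 | A, B): either cause alone usually triggers the child C.
P_C1 = {(0, 0): 0.01, (0, 1): 0.8, (1, 0): 0.8, (1, 1): 0.95}

def joint(a, b, c):
    p_c = P_C1[(a, b)] if c == 1 else 1.0 - P_C1[(a, b)]
    return P_A[a] * P_B[b] * p_c

def posterior_a(evidence):
    """P(A = 1 | evidence) by enumeration; evidence maps 'b'/'c' to values."""
    num = den = 0.0
    for a, b, c in product((0, 1), repeat=3):
        if any({"b": b, "c": c}[k] != v for k, v in evidence.items()):
            continue
        p = joint(a, b, c)
        den += p
        num += p * a
    return num / den

print(posterior_a({"c": 1}))          # ~0.50: A is a plausible cause of C
print(posterior_a({"c": 1, "b": 1}))  # ~0.12: B "explains away" C
```

Once B is known to have occurred, it accounts for the child, and belief in the competing cause A falls back toward its prior.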
Learning
- Goal: complete the missing beliefs in the network by adjusting the parameters of the Bayesian network so that the pdfs defined by the network sufficiently describe the statistical behavior of the observed data.
- Notation: $M$ is a BN model, $\theta$ is the parameter vector of the probability distribution, and $Z_L$ is the observed data.
- Goal: estimate $\theta$ so as to maximize the posterior probability $P(M \mid Z_L)$:
  $P(M \mid Z_L) = \frac{P(M)}{P(Z_L)} \int P(Z_L \mid \theta, M) P(\theta \mid M) \, d\theta$
- Assume $P(M \mid Z_L)$ is highly peaked around the maximum-likelihood estimate $\hat{\theta}_{ML}$:
  $P(M \mid Z_L) \approx \frac{P(M)}{P(Z_L)} P(Z_L \mid \hat{\theta}_{ML}, M) P(\hat{\theta}_{ML} \mid M)$, with
  $\hat{\theta}_{ML} = \arg\max_{\theta} \log P(Z_L \mid \theta)$
  (a counting-based sketch for the fully observed discrete case appears at the end of this section).

Dynamic Bayesian Network (DBN)
- A Bayesian network extended with a time series to represent temporal dependencies.
- Dynamically changing or evolving over time.
- A directed graphical model of stochastic processes, aimed especially at time-series modeling.
- Satisfies the Markovian condition: the state of the system at time $t$ depends only on its immediate past state at time $t-1$.

Representation
- [Figure: the network unrolled into time slices $t_1, t_2, \dots, t_k$.]
- The transition matrix that represents these time dependencies is called a Conditional Probability Table (CPT).

Description
- $T$: the time boundary we are investigating.
- $Y = \{y_0, \dots, y_{T-1}\}$: observable variables.
- $X = \{x_0, \dots, x_{T-1}\}$: hidden-state variables.
- $P(X, Y) = P(x_0) \prod_{t=1}^{T-1} P(x_t \mid x_{t-1}) \prod_{t=1}^{T-1} P(y_t \mid x_t)$
- $P(x_t \mid x_{t-1})$: the state-transition pdfs, specifying the time dependencies between states.
- $P(y_t \mid x_t)$: the observation pdfs, specifying the dependencies of the observation nodes on the other nodes within time slice $t$.
- $P(x_0)$: the initial state distribution.

Tasks in a DBN
- Inference
- Decoding
- Learning
- Pruning

Inference
- Estimating the pdf of the unknown states from the given observations and the initial probability distributions.
- Goal: find $P(X_0^{T-1} \mid Y_0^{T-1})$, where $Y_0^{T-1} = \{y_0, \dots, y_{T-1}\}$ is a finite set of $T$ consecutive observations and $X_0^{T-1} = \{x_0, \dots, x_{T-1}\}$ is the set of corresponding hidden variables (a forward-filtering sketch appears at the end of this section).
- [Figure: an unrolled DBN in which each hidden node $x_t$ emits an observation node $y_t$.]

Decoding
- Finding the best-fitting probability values for the hidden states that have generated the known observations.
- Goal: determine the sequence of hidden states with the highest probability (see the Viterbi-style sketch at the end of this section):
  $\hat{X}_0^{T-1} = \arg\max_{X_0^{T-1}} P(X_0^{T-1} \mid Y_0^{T-1})$

Learning
- Given a number of observations, estimate the parameters of the DBN that best fit the observed data.
- Goal: maximize the joint probability distribution
  $\log P(X_0^{T-1}, Y_0^{T-1} \mid \theta) = \log P(x_0) + \sum_{t=1}^{T-1} \log P(x_t \mid x_{t-1}) + \sum_{t=1}^{T-1} \log P(y_t \mid x_t)$
  where $\theta$ is the model parameter vector, by solving
  $\frac{\partial}{\partial \theta} \log P(X_0^{T-1}, Y_0^{T-1} \mid \theta) = 0$

Pruning
- An important but difficult task in DBNs: distinguishing which nodes are important for inference, and removing the unimportant nodes.
- Actions: deleting states from a particular node; removing the connection between nodes; removing a node from the network.
- At time slice $t$, let $W_1(t), \dots, W_q(t)$ be designated world nodes: a subset of the nodes representing the part of the network we want to inspect. Let $V(t_k)$, $t_k \le t$, be nodes from earlier slices.
- If the state of $W_i(t)$ is known, i.e. $P(W_i(t) = s_i) = 1$, then the nodes $V(t_k)$ are no longer relevant to the overall goal of the inference. Thus: (1) delete all nodes $V(t_k)$; (2) incorporate the knowledge that $W_i(t) = s_i$ (a numeric check appears at the end of this section).

Future Work
- Probabilistic reasoning in multiagent systems.
- Different DBNs and their applications.
- Discussion of DBN problems.
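A minimal sketch of the maximum-likelihood learning step for the fully observed discrete case, where $\hat{\theta}_{ML} = \arg\max_{\theta} \log P(Z_L \mid \theta)$ reduces to relative-frequency counts over each (parent, child) configuration. The samples below are invented.

```python
from collections import Counter

# Invented fully observed (parent, child) samples for a single binary node.
samples = [(0, 0), (0, 1), (0, 0), (1, 1), (1, 1), (1, 0), (0, 0), (1, 1)]

pair_counts = Counter(samples)
parent_counts = Counter(p for p, _ in samples)

# theta_ML reduces to relative frequencies of each configuration.
cpt = {(p, c): pair_counts[(p, c)] / parent_counts[p] for (p, c) in pair_counts}
print(cpt)  # cpt[(p, c)] is the ML estimate of P(child = c | parent = p)
```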
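A minimal sketch of DBN inference on an invented two-state model: forward filtering computes $P(x_t \mid y_0, \dots, y_t)$ one time slice at a time, alternating a prediction step through the transition CPT `A` with an update step through the observation pdf `B`. The matrices and the helper name `forward_filter` are illustrative, not from the slides.

```python
A  = [[0.9, 0.1],   # A[i][j] = P(x_t = j | x_{t-1} = i), invented
      [0.2, 0.8]]
B  = [[0.7, 0.3],   # B[i][k] = P(y_t = k | x_t = i), invented
      [0.1, 0.9]]
pi = [0.5, 0.5]     # initial state distribution P(x_0)

def normalize(v):
    s = sum(v)
    return [x / s for x in v]

def forward_filter(ys):
    # Initialize with the first observation, then alternate predict/update.
    belief = normalize([pi[i] * B[i][ys[0]] for i in range(2)])
    for y in ys[1:]:
        predicted = [sum(belief[i] * A[i][j] for i in range(2)) for j in range(2)]
        belief = normalize([predicted[j] * B[j][y] for j in range(2)])
    return belief

print(forward_filter([0, 1, 1]))  # P(x_2 | y_0 = 0, y_1 = 1, y_2 = 1)
```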
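For the decoding task, one standard solution is the Viterbi recursion, which computes $\hat{X}_0^{T-1} = \arg\max P(X_0^{T-1} \mid Y_0^{T-1})$ by tracking the best path probability per state plus a backpointer. A minimal sketch for the same kind of two-state model, with invented numbers:

```python
def viterbi(ys, A, B, pi):
    delta = [pi[i] * B[i][ys[0]] for i in range(2)]  # best path prob per state
    backptr = []
    for y in ys[1:]:
        step, new_delta = [], []
        for j in range(2):
            best = max(range(2), key=lambda i: delta[i] * A[i][j])
            step.append(best)
            new_delta.append(delta[best] * A[best][j] * B[j][y])
        backptr.append(step)
        delta = new_delta
    # Trace the best path backwards from the most probable final state.
    path = [max(range(2), key=lambda i: delta[i])]
    for step in reversed(backptr):
        path.append(step[path[-1]])
    return path[::-1]

A  = [[0.9, 0.1], [0.2, 0.8]]   # P(x_t = j | x_{t-1} = i), invented
B  = [[0.7, 0.3], [0.1, 0.9]]   # P(y_t = k | x_t = i), invented
pi = [0.5, 0.5]
print(viterbi([0, 1, 1], A, B, pi))  # most probable hidden-state sequence
```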
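Finally, a minimal numeric check of the pruning rule, assuming the two-state chain from the sketches above: once $x_1$ is observed with certainty, the earlier slice ($x_0$, $y_0$) no longer affects the prediction of $x_2$, so those nodes can be deleted.

```python
from itertools import product

A  = [[0.9, 0.1], [0.2, 0.8]]   # P(x_t = j | x_{t-1} = i), invented
B  = [[0.7, 0.3], [0.1, 0.9]]   # P(y_t = k | x_t = i), invented
pi = [0.5, 0.5]                 # P(x_0)

def p_x2(x1, evidence_y0=None):
    """P(x_2 | x_1 [, y_0]) by brute-force enumeration over the old slice."""
    num = [0.0, 0.0]
    for x0, y0 in product((0, 1), repeat=2):
        if evidence_y0 is not None and y0 != evidence_y0:
            continue
        w = pi[x0] * B[x0][y0] * A[x0][x1]   # weight of (x_0, y_0) given x_1
        for x2 in (0, 1):
            num[x2] += w * A[x1][x2]
    s = sum(num)
    return [n / s for n in num]

print(p_x2(x1=1, evidence_y0=0))  # old slice kept as evidence
print(p_x2(x1=1))                 # old slice ignored: same answer
```

Both calls print the same distribution, namely the transition row $A[x_1]$: the observed world node screens off everything before it, which is exactly why the earlier nodes can be pruned.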