Markov Random Fields
Presented by: Vladan Radosavljevic

Outline
• Intuition: Simple Example
• Theory: Simple Example Revisited
• Application
• Summary

Intuition
• Simple example
• Observation: a noisy image whose pixel values are -1 or +1
• Objective: recover the noise-free image

Intuition
• An idea: represent the pixels as random variables
• y - observed variable
• x - hidden variable
• x and y are binary variables (-1 or +1)
• Question: is there any relation among these variables?

Intuition
• Building a model
• Values of observed and original pixels should be correlated (small level of noise) - make connections!
• Values of neighboring pixels should be correlated (large homogeneous areas, objects) - make connections!
• Final model: a grid of hidden nodes x_i, each connected to its neighbors and to its own observation y_i (figure)

Intuition
• Why do we need a model?
• y is given; x has to be found
• The objective is to find an image x that maximizes p(x|y)
• Use the model: penalize connected pairs whose values have opposite signs, since such pairs are not correlated
• Assume the distribution p(x, y) ∝ exp(-E(x, y)), where the energy E is a sum over all pairs of connected nodes:
  E(x, y) = -β Σ_{(i,j)} x_i x_j - η Σ_i x_i y_i,   β, η > 0
• If x_i and x_j have the same sign, their energy term is lower, so the probability is higher
• The same holds for x_i and y_i

Intuition
• How to find an image x that maximizes the probability p(x|y)?
• Initialize x = y
• Take one node x_i at a time and evaluate E for x_i = +1 and x_i = -1
• Set x_i to the value with the lower E (the higher probability)
• Iterate through all nodes until convergence
• This method (iterated conditional modes, ICM) finds a local optimum

Intuition
• Result: the restored image (figure)
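The update rule above is easy to write down. Here is a minimal sketch of this coordinate-wise energy minimization (ICM), assuming the pairwise energy E(x, y) = -β Σ x_i x_j - η Σ x_i y_i on a 4-connected grid; the function name icm_denoise and the default weights are illustrative choices, not from the slides.

```python
import numpy as np

def icm_denoise(y, beta=1.0, eta=2.0, max_sweeps=10):
    """Iterated conditional modes for binary image denoising.

    y          : 2-D array of observed pixels in {-1, +1}
    beta, eta  : coupling weights (illustrative values)
    Energy     : E(x, y) = -beta * sum_{(i,j)} x_i x_j - eta * sum_i x_i y_i
    """
    x = y.copy()                       # initialize the hidden image with the observation
    H, W = x.shape
    for _ in range(max_sweeps):
        changed = False
        for i in range(H):
            for j in range(W):
                # Sum the 4-connected neighboring hidden values
                # (border pixels simply have fewer neighbors).
                nbr = sum(x[a, b]
                          for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                          if 0 <= a < H and 0 <= b < W)
                # The local energy of setting x_ij = s is -s * (beta * nbr + eta * y_ij),
                # so the sign of (beta * nbr + eta * y_ij) gives the lower energy.
                best = 1 if beta * nbr + eta * y[i, j] >= 0 else -1
                if best != x[i, j]:
                    x[i, j] = best
                    changed = True
        if not changed:                # no pixel flipped: a local optimum was reached
            break
    return x
```

Each single-pixel update can only lower (or keep) the global energy, which is why the sweeps converge, but only to a local optimum, exactly as the slide notes.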
Theory
• Graphical models: a general framework for representing and manipulating joint distributions defined over sets of random variables
• Each variable is associated with a node in a graph
• Edges in the graph represent dependencies between random variables
  – Directed graphs represent causal relationships (Bayesian networks)
  – Undirected graphs represent correlative relationships (Markov random fields)
• Representational aspect: efficiently represent complex independence relations
• Computational aspect: efficiently infer information about data using independence relations

Theory
• Recall: if there are M variables in the model, each having K possible states, a straightforward inference algorithm will be exponential in the size of the model (K^M)
• However, inference algorithms (whether computing distributions, expectations, etc.) can use the structure in the graph for efficient computation

Theory
• How to use the information in the structure?
• Markov property: if all paths that connect nodes in set A to nodes in set B pass through nodes in set C, then A and B are conditionally independent given C: p(A, B | C) = p(A | C) p(B | C)
• The main idea is to factorize the joint probability, then use the sum and product rules for efficient computation

Theory
• General factorization
• If two nodes x_i and x_k are not connected, then they have to be conditionally independent given all other nodes: there is no direct link between them, and all paths between them pass through nodes that are observed
• This can be expressed as p(x_i, x_k | x_{-ik}) = p(x_i | x_{-ik}) p(x_k | x_{-ik}), where x_{-ik} denotes all variables other than x_i and x_k
• Therefore, the joint distribution must be factorized such that unconnected nodes do not appear in the same factor
• This leads to the concept of a clique: a subset of nodes such that all pairs of nodes in the subset are connected
• The factors are defined as functions on the cliques of the graph

Theory
• Example: factorization
  p(x) = (1/Z) Π_C ψ_C(x_C)
  where ψ_C : potential function on the clique C
        Z = Σ_x Π_C ψ_C(x_C) : partition function

Theory
• Since potential functions have to be strictly positive, we can define them as ψ_C(x_C) = exp(-E(x_C)), where E is an energy function (the Hammersley–Clifford theorem proves the correspondence between distributions of this form and Markov random fields)
• Recall the denoising example: E(x, y) = -β Σ_{(i,j)} x_i x_j - η Σ_i x_i y_i is exactly such a sum of energies over cliques of size two

Theory
• Why is this useful?
• Inference algorithms can take advantage of such a representation to significantly increase computational efficiency
• Example: inference on a chain x_1 - x_2 - ... - x_N, whose joint distribution factorizes as
  p(x) = (1/Z) ψ_{1,2}(x_1, x_2) ψ_{2,3}(x_2, x_3) ... ψ_{N-1,N}(x_{N-1}, x_N)
• Finding a marginal distribution
  p(x_n) = Σ_{x_1} ... Σ_{x_{n-1}} Σ_{x_{n+1}} ... Σ_{x_N} p(x)
  by direct summation costs ~ K^N operations

Theory
• If we rearrange the order of summations and multiplications, pushing each sum inside past the factors that do not depend on its variable,
  p(x_n) = (1/Z) [Σ_{x_{n-1}} ψ_{n-1,n}(x_{n-1}, x_n) ... [Σ_{x_1} ψ_{1,2}(x_1, x_2)]] [Σ_{x_{n+1}} ψ_{n,n+1}(x_n, x_{n+1}) ... [Σ_{x_N} ψ_{N-1,N}(x_{N-1}, x_N)]]
  the cost drops to ~ N K^2 (see the code sketch at the end of this section)

Application

Summary
• Advantages
  – Graphical representation
  – Computational efficiency
• Disadvantages
  – Parameter estimation
  – How to define a model?
  – Computing probabilities is sometimes difficult
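To make the K^N-versus-NK^2 contrast from the chain example concrete, here is a minimal sketch that computes a chain marginal both by brute force and by the rearranged (message-passing) order, then checks that the two agree. The potentials are random tables and all names are illustrative, not from the slides.

```python
import itertools
import numpy as np

def naive_marginal(psis, n):
    """Marginal p(x_n) by summing over all K^N joint states.

    psis : list of N-1 arrays, psis[i][a, b] = psi_{i,i+1}(x_i = a, x_{i+1} = b)
    """
    N = len(psis) + 1
    K = psis[0].shape[0]
    p = np.zeros(K)
    for state in itertools.product(range(K), repeat=N):   # K^N terms
        w = 1.0
        for i in range(N - 1):
            w *= psis[i][state[i], state[i + 1]]
        p[state[n]] += w
    return p / p.sum()                 # normalizing absorbs the partition function Z

def chain_marginal(psis, n):
    """Same marginal with the sums pushed inside: ~N * K^2 operations."""
    K = psis[0].shape[0]
    alpha = np.ones(K)                 # forward message: sums over x_1 .. x_{n-1}
    for i in range(n):
        alpha = psis[i].T @ alpha      # absorb one factor: sum_a psi(a, b) * alpha(a)
    beta = np.ones(K)                  # backward message: sums over x_{n+1} .. x_N
    for i in range(len(psis) - 1, n - 1, -1):
        beta = psis[i] @ beta          # sum_b psi(a, b) * beta(b)
    p = alpha * beta
    return p / p.sum()

# Tiny check that both routes agree (K = 2 states, N = 6 nodes).
rng = np.random.default_rng(0)
psis = [rng.random((2, 2)) for _ in range(5)]
assert np.allclose(naive_marginal(psis, 2), chain_marginal(psis, 2))
```

The brute-force loop touches K^N joint states, while each message update is a K x K matrix-vector product, so the rearranged version does roughly N K^2 work, which is the efficiency gain the chain example is meant to illustrate.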