Probabilistic Reasoning for Modeling Unreliable Data
Ron Tal
York University
Agenda
 Modeling Uncertainty
 Bayesian Reasoning
 M-Estimation
 Maximum Likelihood
 Common Pitfall
 More Advanced Models
Modeling Uncertainty
 Why is it necessary?
 The only certainty in this world is uncertainty
 Often we cannot afford, or are simply unable, to explicitly enumerate all relevant variables
 Sometimes uncertainty is caused by the limited reliability of the technology
 Making decisions with unreliable data
Modeling Uncertainty (cont.)
 Three competing paradigms:
 Non-monotonic Logic
 Fuzzy Logic
 Probability Theory
 Since we cannot construct a deterministic solution to
many problems, we model sources of uncertainty as
probability distributions
Bayesian Reasoning
 At the core of probabilistic frameworks is Bayesian
Inference
 Let’s define a few concepts:
 P( E | H ) - The probability of witnessing evidence E given a hypothesis H
 P( H | E ) - The probability of hypothesis H given the evidence E
 P( H ) - The probability of H prior to observing E
 P( E ) - The probability of the evidence: P( E ) = Σ_i P( E | H_i ) P( H_i )
Bayesian Reasoning: Bayes’ theorem
 States that:

    P( H | E ) = P( E | H ) P( H ) / P( E )

 Our life becomes simpler:
    P( H | E ) - what we want to maximize
    P( E | H ) - expressed in terms of our model; we usually know it!
    P( E ) - we don’t always care!
Bayesian Reasoning: Bayes’ theorem
 If we prefer, it can also be written in terms of the joint probability:

    P( H | E ) = P( E ∩ H ) / P( E )
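To make the theorem concrete, here is a minimal Python sketch with two hypotheses; the priors and likelihoods are made-up values for illustration, not numbers from the slides:

# Minimal sketch of Bayes' theorem with two hypotheses and made-up numbers.
priors = {"H1": 0.7, "H2": 0.3}        # P(H_i), assumed known
likelihoods = {"H1": 0.2, "H2": 0.9}   # P(E | H_i), given by our model

# Normalization constant P(E) = sum_i P(E | H_i) P(H_i)
p_e = sum(likelihoods[h] * priors[h] for h in priors)

# Posterior P(H_i | E) for each hypothesis
posteriors = {h: likelihoods[h] * priors[h] / p_e for h in priors}
print(posteriors)  # roughly {'H1': 0.34, 'H2': 0.66}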
M-Estimation
 Bayesian Inference gives us a powerful tool for choosing the hypothesis that best models the data
 A simple example is the set of parameters of a line of
best fit through noisy data
 Statistical tools to achieve this are called M-Estimators
 The most popular choice is a special case called
“Maximum Likelihood Estimator”
Maximum Likelihood
 Recall Bayes’ theorem:
    P( H | E ) = P( E | H ) P( H ) / P( E )
 The denominator is merely a normalization constant
 Maximum Likelihood can be applied if we assume the model prior is constant
Maximum Likelihood (cont.)
 When model prior is constant:
    L( H | e_1, ..., e_n ) = Π_{i=1..n} P( e_i | H )
 Thus, we can fit model parameters by maximizing the
likelihood
Maximum Likelihood (cont.)
 To determine the parameters of a model, we minimize the negative log-likelihood:

    θ̂ = argmin_θ [ −log L( θ ) ]

 This lets us avoid playing with products
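As a quick sketch of minimizing a negative log-likelihood in practice (the coin-flip data and grid search below are illustrative assumptions, not part of the slides):

import numpy as np

# Minimal sketch: estimate a coin's heads probability by minimizing the
# negative log-likelihood over a grid of candidate values.
flips = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])  # 7 heads, 3 tails (made up)

def neg_log_likelihood(p, e):
    # -sum_i log P(e_i | p) for a Bernoulli model
    return -np.sum(e * np.log(p) + (1 - e) * np.log(1 - p))

candidates = np.linspace(0.001, 0.999, 999)
p_hat = candidates[np.argmin([neg_log_likelihood(p, flips) for p in candidates])]
print(p_hat)  # close to the sample frequency of 0.7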
Maximum Likelihood (cont.)
 For Gaussian distribution this is especially convenient:
    θ̂ = argmin_θ [ −Σ_{i=1..n} log( (1/Z) · exp( −( e_i − θ )² / ( 2σ² ) ) ) ]

      = argmin_θ [ Σ_{i=1..n} ( e_i − θ )² / ( 2σ² ) − n · log( 1/Z ) ]
Maximum Likelihood
 Becoming:
1
log 
Z


2 2
e 

Constant


n
2

 min
ˆ




i


 i 1

Least Squares 
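A minimal sketch of this equivalence on synthetic data (the Gaussian samples below are an illustrative assumption): the parameter that minimizes the sum of squared residuals is the Gaussian maximum-likelihood estimate, i.e. the sample mean.

import numpy as np

# Minimal sketch: least squares recovers the Gaussian ML estimate (the mean).
rng = np.random.default_rng(1)
e = rng.normal(loc=3.0, scale=0.5, size=200)  # synthetic Gaussian data

candidates = np.linspace(0.0, 6.0, 601)
sum_sq = np.array([np.sum((e - t) ** 2) for t in candidates])
theta_ls = candidates[np.argmin(sum_sq)]      # least-squares estimate

print(theta_ls, e.mean())  # both are approximately the sample mean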
Common Pitfall
 We love Gaussian Distributions
 We love Least-Squares
 However, using Least-Squares without the process of
probabilistic reasoning is a common rookie mistake
Common Pitfall: Illustration
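A minimal numerical sketch of the pitfall, with made-up data: a single gross outlier drags the least-squares estimate (the mean) far from the bulk of the data, while a robust estimate (the median) barely moves.

import numpy as np

# Made-up data: seven 'good' measurements near 2.0 plus one gross outlier.
good = np.array([1.9, 2.1, 2.0, 1.8, 2.2, 2.05, 1.95])
with_outlier = np.append(good, 50.0)

# Least-squares estimate of location (the mean) is badly skewed by the outlier.
print(good.mean(), with_outlier.mean())            # 2.0 vs 8.0
# A robust estimate (the median) is barely affected.
print(np.median(good), np.median(with_outlier))    # 2.0 vs ~2.03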
Better Modeling
 Many statistical tools are available for when the Gaussian
assumption fails
 Assumptions can include
 Good Data is Gaussian, Outliers are present
 pdf can be represented as a mixture of causes
 No parametric model is best suited for the job
Robust Statistics
 In Robust M-Estimators it is assumed that the data is locally
Gaussian but outliers make traditional Least-Squares unsuitable
 Essentially, with Least-Squares we give ‘bad’ data more credibility than it deserves
 The robust formulation ‘weights’ the data with a Robust Influence Function
Robust Statistics (cont.)
 E.g. Tukey’s Biweight:
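As a sketch, here is the biweight influence function in Python; the tuning constant c is an assumption (c ≈ 4.685 is a commonly used value):

import numpy as np

# Tukey's biweight influence function: psi(r) = r * (1 - (r/c)^2)^2 for |r| <= c,
# and 0 otherwise, so residuals far beyond c get zero influence.
def tukey_biweight_influence(r, c=4.685):
    r = np.asarray(r, dtype=float)
    psi = r * (1.0 - (r / c) ** 2) ** 2
    return np.where(np.abs(r) <= c, psi, 0.0)

print(tukey_biweight_influence([0.5, 2.0, 10.0]))  # the last residual is ignored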
Mixture Models
 Each data point can be modeled as arising from one of several possible causes
 Essentially a weighted sum of distributions
 The Gaussian Mixture Model (GMM) is extremely powerful
 EM Clustering is the standard estimator for fitting it
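A minimal sketch of fitting a GMM with EM, assuming scikit-learn is available; the two synthetic clusters below are illustrative:

import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic 1-D data drawn from two Gaussian 'causes'.
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(6.0, 0.5, 200)])
X = x.reshape(-1, 1)  # scikit-learn expects a 2-D array of samples

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)  # EM under the hood
print(gmm.weights_)      # mixing proportions, roughly 0.6 and 0.4
print(gmm.means_)        # component means, roughly 0 and 6
labels = gmm.predict(X)  # most likely cause for each data point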
Non-parametric
 Actual observed data is used in place of a fitted model
 Usually a histogram
 To find the ML fit between new observed data and the histogram we can minimize the Bhattacharyya Distance:

    D_B( p, q ) = −ln( Σ_{i ∈ N} √( p( i ) · q( i ) ) )
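A minimal sketch of the distance on two made-up, normalized histograms:

import numpy as np

# Bhattacharyya distance between two normalized histograms p and q.
def bhattacharyya_distance(p, q):
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    bc = np.sum(np.sqrt(p * q))  # Bhattacharyya coefficient
    return -np.log(bc)

p = np.array([0.1, 0.4, 0.3, 0.2])  # made-up histogram (sums to 1)
q = np.array([0.2, 0.3, 0.3, 0.2])
print(bhattacharyya_distance(p, p))  # 0.0 for identical histograms
print(bhattacharyya_distance(p, q))  # small positive value for differing ones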
Non-parametric
 Very simple to use
 Sometimes most accurate
 Very inefficient for problems with high dimensionality
Thank You 