MATHEMATICS FOR COMPUTER VISION
WEEK 9
PROBABILITY AND INFERENCE
Dr Fabio Cuzzolin
MSc in Computer Vision
Oxford Brookes University
Year 2013-14
OUTLINE OF WEEK 9
• introduction to probability theory and Bayesian inference
• probability measures
• random variables
• marginal and joint distributions
• Bayes' rule
• conditional probability
• random processes
• Markov chains
PROBABILITY MEASURES AND DISTRIBUTIONS
PROBABILITY MEASURES
• probability measure → a mathematical representation of the notion of chance
• it assigns a probability value to every subset of a collection of possible outcomes (of a random experiment, of a decision problem, etc.)
• the collection of outcomes is called the sample space, or universe
• a subset of the universe is called an event
EXAMPLE
• typical example: the spinning wheel
• a spinning wheel with 3 possible outcomes
• universe Ω = {1,2,3}
• eight possible events (all the subsets of Ω), including the empty set
• probability of ∅ is 0, probability of Ω is 1
• additivity: P({1,2}) = P({1}) + P({2})
FORMAL DEFINITION
• probability measure µ: a real-valued function on a probability space that satisfies countable additivity
• probability space: a triple (Ω, F, µ) formed by a universe Ω, a σ-algebra F of its subsets, and a probability measure µ on F
• not all subsets of Ω necessarily belong to F
• axioms:
  • µ(∅) = 0, µ(Ω) = 1
  • 0 ≤ µ(A) ≤ 1 for all events A ∈ F
  • countable additivity: for every countable collection of pairwise disjoint events Ai,
    µ(∪i Ai) = ∑i µ(Ai)
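The axioms can be checked numerically on the spinning-wheel example above. A minimal Python sketch, assuming illustrative atom probabilities of 0.5, 0.25, 0.25 for the three outcomes:

```python
# Spinning wheel with Omega = {1, 2, 3}; the atom probabilities below are
# hypothetical, chosen only so that they sum to 1.
from itertools import combinations

atoms = {1: 0.5, 2: 0.25, 3: 0.25}
omega = set(atoms)

def P(event):
    # probability of an event (a subset of omega), by additivity over atoms
    return sum(atoms[w] for w in event)

# enumerate the full power set: 2^3 = 8 events
events = [set(c) for r in range(4) for c in combinations(sorted(omega), r)]
assert len(events) == 8
assert P(set()) == 0 and P(omega) == 1        # µ(∅) = 0, µ(Ω) = 1
assert all(0 <= P(e) <= 1 for e in events)    # 0 ≤ µ(A) ≤ 1
assert P({1, 2}) == P({1}) + P({2})           # additivity on disjoint events
```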
RANDOM VARIABLE
• a variable whose value is subject to random variation, i.e. due to chance (what "chance" actually is remains a matter of philosophical debate!)
• it can take one of a set of possible values, each with an associated probability
• mathematically, it is a function X from a sample space Ω (which forms a probability space) to, usually, the reals
• X is subject to a condition of "measurability": each range of values of the real line must have a preimage in Ω which has a probability value
• this way, we can forget about the initial probability space and simply record the probabilities of the various values of X
EXAMPLE
• the sample space is the set of outcomes of rolling two dice
  Ω = { (1,1), (1,2), (1,3), (1,4), ..., (6,4), (6,5), (6,6) }
• a random variable can be the function that associates each roll of the two dice with the sum S of the faces
• random variables can be discrete or continuous
• S is a discrete random variable
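A small sketch of this construction in Python: S is literally a function on Ω, and its distribution is obtained by pushing the uniform probability on Ω forward through S.

```python
# Build S = sum of two fair dice directly from the sample space Omega.
from collections import Counter
from fractions import Fraction

omega = [(a, b) for a in range(1, 7) for b in range(1, 7)]
p_outcome = Fraction(1, 36)            # uniform probability on Omega

pmf = Counter()
for a, b in omega:                     # push forward through S(a, b) = a + b
    pmf[a + b] += p_outcome

print(pmf[7])                          # 1/6: six of the 36 outcomes sum to 7
```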
(CUMULATIVE) PROBABILITY DISTRIBUTION OF A RANDOM VARIABLE
• the probability distribution of a random variable X records the probability values for all real values x in the range of X
• we can then answer all questions of the form: what is the probability P(a ≤ X ≤ b), P(X > a), etc.?
  • these ranges of values are called "Borel sets"
• all the information is captured by the cumulative distribution F(x) = P(X ≤ x)
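Such interval queries follow directly from F. A short sketch, assuming scipy.stats is available, with a standard Gaussian as the example variable:

```python
# P(a ≤ X ≤ b) = F(b) - F(a) for a continuous r.v.; here X ~ N(0, 1).
from scipy.stats import norm

F = norm(loc=0, scale=1).cdf
a, b = -1.0, 1.0
print(F(b) - F(a))     # ≈ 0.6827, the classic one-sigma probability
print(1 - F(a))        # P(X > a)
```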
DISCRETE PROBABILITY DISTRIBUTIONS
• a random variable is called discrete when X can only assume a finite or countably infinite (e.g. the set of integer numbers 1, 2, 3, ...) number of values
• it is described by a (probability) mass function
• common discrete distributions: Poisson, Bernoulli, binomial, ...
  • binomial → mathematical description of the number of successes in a series of trials
BINOMIAL DISTRIBUTION
• the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p:
  P(X = k) = (n choose k) p^k (1−p)^(n−k),  k = 0, 1, ..., n
• [figure: an example of the probability (mass) distribution (left) and of the cumulative distribution (right)]
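A quick sketch with scipy.stats (an assumed dependency); n = 10 and p = 0.3 are illustrative values:

```python
# Binomial distribution: pmf, cdf and moments for n = 10 trials, p = 0.3.
from scipy.stats import binom

n, p = 10, 0.3
X = binom(n, p)
print(X.pmf(3))              # P(exactly 3 successes)
print(X.cdf(3))              # P(at most 3 successes)
print(X.mean(), X.var())     # np = 3.0 and np(1-p) = 2.1 (see the moments slide)
```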
CONTINUOUS DISTRIBUTIONS – PDFS
• a random variable is called continuous when it can assume values in a non-countable set (e.g. the real line)
• it is described by a probability density function (PDF), which describes the likelihood of the variable taking any continuous (real) value
• the probability of any range of values (e.g., an interval) is the integral of the PDF over that range
• there are mixed distributions as well, whose cumulative function combines a continuous part and a discrete (jump) part
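The "probability of an interval = integral of the PDF" statement can be checked numerically. A sketch assuming scipy, with an illustrative Gaussian:

```python
# Probability of an interval = integral of the PDF over it; compare against
# the difference of CDF values.
from scipy.stats import norm
from scipy.integrate import quad

X = norm(loc=0, scale=2)
area, _ = quad(X.pdf, -1, 3)          # numerically integrate the PDF on [-1, 3]
print(area, X.cdf(3) - X.cdf(-1))     # the two values agree
```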
EXAMPLES OF CONTINUOUS PDFS
• examples of continuous PDFs:
  • Gaussian → fundamental, see the central limit theorem
  • Beta, gamma, chi-square, ...
EXAMPLE OF CONTINUOUS PDF: THE GAUSSIAN PDF
• the most "famous" continuous random variable: the Gaussian r.v.
• typical PDF of a Gaussian:
  f(x) = (1 / (σ √(2π))) exp( −(x − µ)² / (2σ²) )
• its shape is characterised by a mean µ and a standard deviation σ
MOMENTS
• a random variable can be (partially) described by its moments, which give some indication of its shape
• n-th moment of a probability distribution:
  E[Xⁿ] = ∫ xⁿ dF(x)
  where X is a random variable with cumulative distribution F
• E is called the expectation operator
• a given moment may or may not exist for a r.v. X
• two major moments: mean and variance
MEAN AND VARIANCE
• mean or expected value (first-order moment)
  • continuous case: E[X] = ∫ x f(x) dx
  • discrete case: E[X] = ∑i xi p(xi)
• variance (second-order central moment)
  • continuous case: Var(X) = ∫ (x − E[X])² f(x) dx
  • discrete case: Var(X) = ∑i (xi − E[X])² p(xi)
• it describes how spread out the values of X are with respect to the mean
• standard deviation → square root of the variance
• relation between mean and variance: Var(X) = E[X²] − (E[X])²
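The mean/variance relation is easy to verify on a sample. A sketch assuming numpy, with an illustrative exponential sample:

```python
# Check Var(X) = E[X²] − (E[X])² on a large sample.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=100_000)     # rate λ = 1/2: mean 2, variance 4

mean = x.mean()
var_direct = ((x - mean) ** 2).mean()            # E[(X − µ)²]
var_shortcut = (x ** 2).mean() - mean ** 2       # E[X²] − (E[X])²
print(mean, var_direct, var_shortcut)            # ≈ 2, ≈ 4, ≈ 4
```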
EXAMPLES OF MOMENTS
• Normal (Gaussian) distribution: mean µ, variance σ²
• Binomial: mean np, variance np(1−p)
• Exponential distribution (rate λ):
  • mean λ⁻¹
  • variance λ⁻²
LAWS OF PROBABILITY
LAW OF LARGE NUMBERS
• describes what happens when you repeat the same random experiment an increasing number of times n
• the average of the results (sample mean) X̄n = (X1 + ... + Xn)/n should be close to the expected value (mean) µ
• probabilities become predictable as we run the same trial more and more times!
• strong law: P( limn→∞ X̄n = µ ) = 1
• weak law: for every ε > 0, limn→∞ P( |X̄n − µ| > ε ) = 0
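A minimal simulation of the law, assuming numpy: the running sample mean of fair die rolls drifts toward the expected value 3.5.

```python
# Running sample mean of die rolls converging to E[X] = 3.5.
import numpy as np

rng = np.random.default_rng(1)
rolls = rng.integers(1, 7, size=100_000)                     # fair die: 1..6
running_mean = np.cumsum(rolls) / np.arange(1, rolls.size + 1)
print(running_mean[[9, 999, 99_999]])                        # closer and closer to 3.5
```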
CENTRAL LIMIT THEOREM
• the mean of a sufficiently large number of iterates of independent random variables is normally (Gaussian) distributed
• let X1, ..., Xn be independent and identically distributed r.v.s with mean µ and variance σ²
• we can build the usual sample average Sn = (X1 + ... + Xn)/n
• the random variable √n (Sn − µ) tends (in distribution) to a Gaussian with mean 0 and variance σ²
CENTRAL LIMIT THEOREM – ILLUSTRATION
• [figure: distribution of the sum of N uniform random variables, for increasing N]
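The illustration can be reproduced with a few lines of numpy (assumed): standardised sums of N uniforms have mean ≈ 0 and standard deviation ≈ 1, and their histogram approaches a bell curve as N grows.

```python
# Sums of N uniform r.v.s on [0, 1]: mean N/2, variance N/12; standardise and
# inspect how close the result is to a standard Gaussian.
import numpy as np

rng = np.random.default_rng(2)
for N in (1, 2, 12):
    s = rng.uniform(0, 1, size=(100_000, N)).sum(axis=1)
    z = (s - N / 2) / np.sqrt(N / 12)
    print(N, z.mean(), z.std())      # ≈ 0 and ≈ 1; a histogram of z looks bell-shaped
```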
CONDITIONAL PROBABILITY
CONDITIONAL PROBABILITIES
• the probability that an event will occur, given that another event has occurred (or not)
• read "probability of A given B", written P(A|B)
• two definitions:
  • as the quotient of the joint probability of A and B and the probability of B:
    P(A|B) = P(A ∩ B) / P(B)
  • as a (multiplication) axiom of probability theory (De Finetti):
    P(A ∩ B) = P(A|B) P(B)
ILLUSTRATIONS
• example: P(A) = 0.52, but P(A|B1) = 0.1 and P(A|B2) = 0.12
• rolling two dice, with A the value of the first die and B that of the second:
  • P({A=2}) = 6/36 = 1/6
  • P({A=2} | {A+B ≤ 5}) = 3/10
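The dice numbers can be verified by enumeration. A short sketch:

```python
# P(A = 2) and P(A = 2 | A + B ≤ 5) for two fair dice, by counting outcomes.
from fractions import Fraction

omega = [(a, b) for a in range(1, 7) for b in range(1, 7)]
cond = [(a, b) for a, b in omega if a + b <= 5]       # conditioning event

print(Fraction(sum(a == 2 for a, b in omega), len(omega)))   # 1/6
print(Fraction(sum(a == 2 for a, b in cond), len(cond)))     # 3/10
```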
LAW OF TOTAL PROBABILITY
• a fundamental law relating marginal probabilities to conditional probabilities
• idea: if the universe can be decomposed into a disjoint partition of events Bi, the marginal (total) probability of an event A is the sum of the joint probabilities with the Bi:
  P(A) = ∑i P(A ∩ Bi)
• each P(A ∩ Bi) can also be expressed via the conditionals:
  P(A) = ∑i P(A|Bi) P(Bi)
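Continuing the dice example, a sketch that partitions Ω by the value of the first die, Bi = {A = i}:

```python
# Law of total probability: P(A+B ≤ 5) recovered by summing joints over the
# partition B_i = {first die = i}.
from fractions import Fraction

omega = [(a, b) for a in range(1, 7) for b in range(1, 7)]
A = [(a, b) for a, b in omega if a + b <= 5]

total = sum(
    Fraction(sum(o[0] == i for o in A), len(omega))   # P(A ∩ B_i)
    for i in range(1, 7)
)
print(total, Fraction(len(A), len(omega)))            # both 5/18
```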
BAYES’ RULE
• relates conditional and prior probabilities:
  P(A|B) = P(B|A) P(A) / P(B)
• it has various interpretations, according to the different interpretations of probability measures
• Bayesian interpretation: probability is a degree of belief in a proposition A, before (P(A)) or after (P(A|B)) new evidence is gathered
• evidence is always in the form "proposition B is true"
• nomenclature:
  • P(A) is the prior (the initial degree of belief in A)
  • P(A|B) is the posterior (after evidence B is considered)
• posteriors are in the form of conditional probabilities, and can be computed by Bayes' rule
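A minimal numerical sketch with hypothetical numbers (a made-up diagnostic-test scenario, not from the slides): prior P(A) = 0.01, likelihoods P(B|A) = 0.9 and P(B|¬A) = 0.05.

```python
# Posterior by Bayes' rule; the denominator P(B) comes from the law of
# total probability over the partition {A, not A}.
p_a = 0.01                 # prior
p_b_given_a = 0.90         # likelihood of the evidence under A
p_b_given_not_a = 0.05     # likelihood of the evidence under not-A

p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
posterior = p_b_given_a * p_a / p_b
print(posterior)           # ≈ 0.154: the evidence raises the belief in A
```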
MANY VARIABLES
JOINT DISTRIBUTION OF SEVERAL RANDOM VARIABLES
• what happens when we have more than one random variable, X, Y, ..., on the same probability space?
• we can define a joint distribution, which specifies the probability of X, Y, etc. falling in any given range of values
• example: the joint Gaussian
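A sketch of the joint Gaussian example, assuming numpy; the mean and covariance values are illustrative:

```python
# Sample a 2-D (joint) Gaussian and check its empirical mean and covariance.
import numpy as np

rng = np.random.default_rng(3)
mean = [0.0, 1.0]
cov = [[1.0, 0.8],
       [0.8, 2.0]]                    # X and Y are correlated
xy = rng.multivariate_normal(mean, cov, size=50_000)
print(xy.mean(axis=0))                # ≈ [0, 1]
print(np.cov(xy.T))                   # ≈ cov
```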
MARGINAL DISTRIBUTION
• from the joint distribution P(X, Y, ...) of two or more random variables X, Y, ..., one can recover the distribution of each single random variable X
• this is called the marginal distribution of X
• discrete formula: P(X = x) = ∑y P(X = x, Y = y)
• continuous formula: fX(x) = ∫ fX,Y(x, y) dy
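The discrete formula amounts to summing the joint table along one axis. A sketch with an illustrative table (numpy assumed):

```python
# Marginalising a discrete joint table P(X, Y) by summing out one variable.
import numpy as np

joint = np.array([[0.10, 0.20],    # rows: values of X
                  [0.30, 0.40]])   # columns: values of Y
p_x = joint.sum(axis=1)            # P(X = x) = ∑_y P(X = x, Y = y)
p_y = joint.sum(axis=0)            # P(Y = y) = ∑_x P(X = x, Y = y)
print(p_x, p_y)                    # [0.3 0.7] and [0.4 0.6]
```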
INDEPENDENCE
• independence of events: P(A ∩ B) = P(A) P(B)
• independence and conditional probability: when P(B) > 0, A and B are independent iff P(A|B) = P(A)
• generalises to n events: distinguish pairwise vs mutual independence
• independence of random variables: every pair of Borel intervals must be independent (as events)
• the joint PDF then decomposes as fX,Y(x, y) = fX(x) fY(y)
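For discrete variables, independence means the joint table equals the outer product of its marginals. A sketch (numpy assumed; the table is constructed to be independent):

```python
# Independence check: the joint table equals the outer product of marginals.
import numpy as np

joint = np.array([[0.12, 0.28],
                  [0.18, 0.42]])
p_x = joint.sum(axis=1)                          # [0.4, 0.6]
p_y = joint.sum(axis=0)                          # [0.3, 0.7]
print(np.allclose(joint, np.outer(p_x, p_y)))    # True: X and Y independent
```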
RANDOM PROCESS
• also called a stochastic process: a collection of random variables
• typically used to describe the evolution of some random value over time, X(t)
• in this sense, it is the statistical counterpart of deterministic dynamical systems, whose evolution is fixed given X(0)
• however, it can be defined over any domain: 2-D, etc.
• discrete time: a sequence of random variables, or time series
• continuous domain: a random field
RANDOM PROCESSES AS RANDOM-VALUED FUNCTIONS
• interpretation: a random process is a function on its domain whose values are random variables
• the random values at different points of the domain can be completely different
• usually, though, they are required to be of the same type (identically distributed)
• the component random variables can be independent or have complicated statistical relations
• examples: EEG signals, stock market fluctuations, but also images and videos!
• Markov Random Fields are used for image segmentation
RANDOM PROCESSES AS ENSEMBLES OF REALIZATIONS
• a helpful interpretation: as an ensemble of functions
• idea: you extract a sample value from each random variable forming the process
• you get a standard, "deterministic" function on the same domain as the random process
• to each such function is attached a probability value → a process is a probability distribution over functions
TYPES OF RANDOM PROCESSES
• a stationary process is one for which the joint distribution of a collection of its random variables does not change when shifted around its domain
  • for instance, P(X(t), X(t+1)) = P(X(t+2), X(t+3))
  • weak-sense stationarity → only the mean and the autocovariance are required to be shift-invariant
• a process is ergodic if its moments can be obtained as limits of sample means and covariances, as the size of the sample goes to ∞ (see the sketch below)
• processes can have discrete time, continuous time, etc.
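A minimal illustration of the ergodic idea, assuming numpy: for an i.i.d. (hence stationary and ergodic) process, the time average along one realization recovers the ensemble mean.

```python
# Time average of a single long realization vs the ensemble mean.
import numpy as np

rng = np.random.default_rng(4)
realization = rng.normal(loc=2.0, scale=1.0, size=100_000)   # one sample path
print(realization.mean())       # time average → ensemble mean 2.0
```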
EXAMPLES
• [figure: a simple Markov chain describing market conditions]
• [figure: an example of the transition matrix of a Markov chain describing weather conditions]
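Since the original transition matrix is in a figure, here is a sketch with hypothetical weather transition probabilities (numpy assumed), simulating the chain and estimating its long-run behaviour:

```python
# Two-state Markov chain (0 = sunny, 1 = rainy) with an illustrative
# transition matrix; long-run state frequencies estimate the stationary
# distribution.
import numpy as np

T = np.array([[0.9, 0.1],     # P(next state | current = sunny)
              [0.5, 0.5]])    # P(next state | current = rainy)

rng = np.random.default_rng(5)
state, counts = 0, np.zeros(2)
for _ in range(100_000):
    state = rng.choice(2, p=T[state])
    counts[state] += 1
print(counts / counts.sum())  # ≈ [5/6, 1/6], the stationary distribution
```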
SUMMARY
SUMMARY OF WEEK 9
• introduction to probability theory
• probability measures
• random variables
• moments, mean and variance
• laws of probability
• Bayes' rule and conditional probabilities
• independence
• random processes
• Markov chains