Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
A gentle introduction to Gaussian distribution Review • Random variable • Coin flip experiment X=0 X=1 X: Random variable Review • Probability mass function (discrete) P(x) P(x) >= 0 0 1 x Any other constraints? Hint: What is the sum? Example: Coin flip experiment Review • Probability density function (continuous) f(x) f(x) >= 0 x Unlike discrete, Density function does not represent probability but its rate of change called the “likelihood” Examples? Review • Probability density function (continuous) f(x) f(x) >= 0 & Integrates to 1.0 x0 X0+dx P( x0 < x < x0+dx ) = f(x0).dx But, P( x = x0 ) = 0 x The Gaussian Distribution Courtesy: http://research.microsoft.com/~cmbishop/PRML/index.htm A 2D Gaussian Central Limit Theorem •The distribution of the sum of N i.i.d. random variables becomes increasingly Gaussian as N grows. •Example: N uniform [0,1] random variables. Central Limit Theorem (Coin flip) • Flip coin N times • Each outcome has an associated random variable Xi (=1, if heads, otherwise 0) • Number of heads NH = x1 + x2 + …. + xN • NH is a random variable – Sum of N i.i.d. random variables Central Limit Theorem (Coin flip) • Probability mass function of NH – P(Head) = 0.5 (fair coin) N=5 N = 10 N = 40 Geometry of the Multivariate Gaussian Moments of the Multivariate Gaussian (1) thanks to anti-symmetry of z Moments of the Multivariate Gaussian (2) Maximum likelihood • Fit a probability density model p(x | θ) to the data – Estimate θ • Given independent identically distributed (i.i.d.) data X = (x1, x2, …, xN) – Likelihood p( X | ) p( x1 | ) p( x2 | ) p( xN | ) – Log likelihood N ln p( X | ) ln p( xi | ) i 1 • Maximum likelihood: Maximize ln p(X | θ) w.r.t. θ Maximum Likelihood for the Gaussian (1) • Given i.i.d. data , the log likelihood function is given by • Sufficient statistics Maximum Likelihood for the Gaussian (2) • Set the derivative of the log likelihood function to zero, • and solve to obtain • Similarly Mixtures of Gaussians (1) • Old Faithful data set Single Gaussian Mixture of two Gaussians Mixtures of Gaussians (2) • Combine simple models into a complex model: Component Mixing coefficient K=3 Mixtures of Gaussians (3) Mixtures of Gaussians (4) • Determining parameters ¹, §, and ¼ using maximum log likelihood Log of a sum; no closed form maximum. • Solution: use standard, iterative, numeric optimization methods or the expectation maximization algorithm (Chapter 9). Thank you!