Bayesian Modelling
Harry R. Erwin, PhD
School of Computing and Technology
University of Sunderland

Resources
• Kéry, Marc (2010) Introduction to WinBUGS for Ecologists, Academic Press
• Ntzoufras, Ioannis (2009) Bayesian Modelling Using WinBUGS, Wiley

Purpose
• The purpose of Bayesian modelling is to predict the response of a statistical process as a function of parameters and variables.
• Note both parameters and variables may vary.
• Bayesian models differ from frequentist models in the use of a prior distribution in addition to likelihood.
• The result of the analysis is a posterior distribution that is the best explanation for the data in light of the prior.

Basic Bayesian Modelling
• Direct computation of the posterior distribution from the prior distribution and the likelihood function representing the data.
• This is particularly efficient when the prior and posterior distributions are in the same family. For example, if the prior and likelihood are Gaussian, so will be the posterior.
• Unfortunately, complex processes cannot be analysed this way, and some sort of sampling approach is needed.

Markov Chain Monte Carlo (MCMC) Methods
• Invented originally by Metropolis et al., and later developed by Hastings.
• Under fairly unrestrictive conditions, a Markov chain will converge to an equilibrium distribution. If that distribution is the probability distribution of interest, life is good, as the chain can be sampled. If the convergence is relatively fast, life is very good.

Gibbs Sampling
• Gibbs sampling is a variant on the Metropolis-Hastings algorithm that is particularly efficient.
• It defines the equilibrium distribution using the distribution of each variable and parameter conditional on the other variables and parameters.
• During an epoch (a cycle through the variables and parameters), each is resampled using the values of the remaining.
• Repeat, periodically recording the values (at an interval long enough that the autocorrelation of the process is minimised, so the recorded values are pseudo-independent).

BUGS
• WinBUGS (or OpenBUGS) is a tool for doing Bayesian modelling of statistical distributions.
• It allows you to use Gibbs sampling to go from a series of relationships to a statistical model implied by those relationships.
• For systems too complicated to be modelled in the frequentist paradigm (see last lecture), it will give you answers.
• For systems that can be modelled in the frequentist paradigm, it gives very similar or identical answers.

Kéry’s Argument
• Kéry recommends this approach for six reasons:
1. Numerical tractability
2. Absence of asymptotics
3. Ease of error propagation
4. Formal framework for combining information
5. Intuition
6. Coherence

Numerical tractability
• Gibbs sampling can handle statistical models too complex to be fitted using classical statistics.
• When the model can be fitted using classical statistics, Gibbs sampling gives answers close to those produced by classical methods.

Absence of asymptotics
• Classical inference using maximum likelihood is unbiased only asymptotically, as the sample size tends to infinity.
• For small samples, classical inference is often biased.
• Bayesian inference by Gibbs sampling does not rely on asymptotic approximations, so it is exact (up to Monte Carlo error) for any sample size.

Ease of error propagation
• In classical statistics, estimating the distribution of parameters often requires approximation methods.
• In Gibbs sampling, the distribution of parameters can be sampled and reported directly.

Formal framework for combining information
• Bayesian methods define a theoretically correct approach for fusing data with existing knowledge of the process being studied.

Intuition
• Bayesian probability is concerned with calculating the distribution of parameters in the model, not with being able to reject null hypotheses on the data.

Coherence
• Bayesian statistics is much simpler in concept than classical statistics.
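
The Gibbs sampling cycle described in the slides (resample each quantity conditional on the others, then record values at intervals) can be sketched in a few lines. This is a minimal illustration in plain Python rather than BUGS, and it assumes a toy target that is not from the lecture: a standard bivariate normal with correlation rho, whose full conditionals are known in closed form (x given y is normal with mean rho*y and variance 1 - rho^2, and symmetrically for y). The function names, burn-in length, and thinning interval are hypothetical choices.

```python
import random
import statistics


def gibbs_bivariate_normal(rho, n_keep, burn_in=500, thin=10, seed=1):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Toy target chosen for illustration only (an assumption, not from the slides).
    """
    rng = random.Random(seed)
    cond_sd = (1.0 - rho ** 2) ** 0.5  # SD of each full conditional
    x, y = 0.0, 0.0                    # arbitrary starting values
    samples = []
    total_epochs = burn_in + n_keep * thin
    for epoch in range(1, total_epochs + 1):
        # One epoch: resample each variable given the current value of the other.
        x = rng.gauss(rho * y, cond_sd)
        y = rng.gauss(rho * x, cond_sd)
        # Discard the burn-in, then record every `thin`-th epoch to reduce
        # autocorrelation between recorded values.
        if epoch > burn_in and (epoch - burn_in) % thin == 0:
            samples.append((x, y))
    return samples


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5


samples = gibbs_bivariate_normal(rho=0.8, n_keep=2000)
xs = [s[0] for s in samples]
ys = [s[1] for s in samples]
mean_x = statistics.fmean(xs)
sd_x = statistics.stdev(xs)
corr_xy = pearson(xs, ys)
print(f"mean(x)={mean_x:.3f}  sd(x)={sd_x:.3f}  corr(x,y)={corr_xy:.3f}")
```

If the chain has reached its equilibrium distribution, the recorded values should have mean near 0, standard deviation near 1, and correlation near the chosen rho. A tool such as WinBUGS derives the conditional distributions from the model specification, so this hand-written version only shows the mechanics.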

Evaluation of Results
• Suppose you have done your Gibbs sampling and you have a series of pseudo-random numbers from what appears to be the joint posterior distribution. How do you check it?
– Verify the numbers come from what appears to be a stationary distribution.
– Verify that the numbers are not auto-correlated.
– Verify that the results are robust.

Worked Example 1
• Using BUGS, demonstrate how to do the analysis.

Worked Example 2
• Running BUGS from R.
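
The stationarity and autocorrelation checks listed under Evaluation of Results can be sketched in plain Python. This is an illustration under stated assumptions, not a BUGS diagnostic: the chain is a synthetic AR(1) series standing in for sampler output, and the helper names, the half-split comparison, and the thinning interval of 50 are hypothetical choices.

```python
import random
import statistics


def autocorrelation(chain, lag):
    """Sample autocorrelation of a sequence at the given lag."""
    n = len(chain)
    mean = statistics.fmean(chain)
    var = sum((v - mean) ** 2 for v in chain)
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean) for i in range(n - lag))
    return cov / var


def half_mean_gap(chain):
    """Gap between the means of the two halves, in units of the overall SD.

    A crude stationarity check: a large gap suggests the chain is still drifting.
    """
    half = len(chain) // 2
    first, second = chain[:half], chain[half:]
    gap = abs(statistics.fmean(first) - statistics.fmean(second))
    return gap / statistics.stdev(chain)


# Synthetic stand-in for sampler output (assumption): an AR(1) series with
# coefficient 0.9, i.e. strongly autocorrelated successive values.
rng = random.Random(42)
chain = [0.0]
for _ in range(20000):
    chain.append(0.9 * chain[-1] + rng.gauss(0.0, 1.0))
chain = chain[1000:]   # discard an initial burn-in

thinned = chain[::50]  # keep every 50th value

acf_raw = autocorrelation(chain, 1)
acf_thinned = autocorrelation(thinned, 1)
gap = half_mean_gap(chain)
print(f"lag-1 autocorrelation: raw={acf_raw:.3f}, thinned={acf_thinned:.3f}")
print(f"half-mean gap (in SDs): {gap:.3f}")
```

For this synthetic chain the raw lag-1 autocorrelation should be close to 0.9 and the thinned value close to zero, which is the pseudo-independence that the recording interval is meant to achieve. The robustness check is not covered here; it usually means rerunning the analysis with different starting values or priors and comparing the results.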