Bayesian Modelling
Harry R. Erwin, PhD
School of Computing and Technology
University of Sunderland
Resources
• Kéry, Marc (2010) Introduction to WinBUGS for
Ecologists, Academic Press
• Ntzoufras, Ioannis (2009) Bayesian Modelling Using
WinBUGS, Wiley
Purpose
• The purpose of Bayesian modelling is to predict
the response of a statistical process as a function of
parameters and variables.
• Note both parameters and variables may vary.
• Bayesian models differ from frequentist models in
the use of a prior distribution in addition to
likelihood.
• The result of the analysis is a posterior distribution
that is the best explanation for the data in light of
the prior.
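In symbols, the relationship between prior, likelihood and posterior is Bayes' theorem. The notation is an assumption for illustration and does not come from the slides: θ stands for the parameters and y for the observed data.

```latex
p(\theta \mid y)
  = \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta')\, p(\theta')\, d\theta'}
  \propto p(y \mid \theta)\, p(\theta)
```

Here p(θ) is the prior, p(y | θ) is the likelihood, and p(θ | y) is the posterior.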
Basic Bayesian Modelling
• Direct computation of the posterior distribution
from the prior distribution and the likelihood
function representing the data.
• This is particularly efficient when the prior is
conjugate to the likelihood, so that the prior and
posterior distributions are in the same family. For
example, if the prior and likelihood are Gaussian,
the posterior is Gaussian too.
• Unfortunately, complex processes cannot be
analysed this way, and some sort of sampling
approach is needed.
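As a minimal sketch of direct computation, the Gaussian case can be worked in closed form. The example assumes a Gaussian prior on an unknown mean and Gaussian observations with known noise variance. The function name and the numbers are illustrative, not taken from the lecture.

```python
def gaussian_posterior(prior_mean, prior_var, data, noise_var):
    """Posterior mean and variance of a Gaussian mean, noise variance known."""
    n = len(data)
    prior_precision = 1.0 / prior_var      # precision = 1 / variance
    data_precision = n / noise_var         # precision contributed by n observations
    post_var = 1.0 / (prior_precision + data_precision)
    sample_mean = sum(data) / n
    # Posterior mean is a precision-weighted average of prior mean and sample mean.
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * sample_mean)
    return post_mean, post_var


# Illustrative numbers: prior N(0, 1), three observations, unit noise variance.
post_mean, post_var = gaussian_posterior(0.0, 1.0, [1.0, 2.0, 3.0], 1.0)
```

With these inputs the posterior mean is pulled from the sample mean (2.0) towards the prior mean (0.0), and the posterior variance is smaller than the prior variance.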
Markov Chain Monte Carlo (MCMC) Methods
• Invented originally by Metropolis et al., and
later developed by Hastings.
• Under fairly unrestrictive conditions, a Markov
chain will converge to an equilibrium
distribution. If that distribution is the probability
distribution of interest, life is good, as the chain
can be sampled. If the convergence is relatively
fast, life is very good.
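The idea can be sketched with a random-walk Metropolis sampler. This is an illustration under assumptions, not the lecture's own code: the target is a standard normal density known only up to a constant, and the function names, step size and seed are hypothetical choices.

```python
import math
import random


def log_standard_normal(x):
    # Log density of N(0, 1) up to an additive constant.
    return -0.5 * x * x


def metropolis(log_density, n_samples, start=0.0, step=1.0, seed=0):
    """Random-walk Metropolis: propose x + noise, accept with prob min(1, ratio)."""
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_ratio = log_density(proposal) - log_density(x)
        # Always accept uphill moves; accept downhill moves with prob exp(log_ratio).
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)  # a rejected proposal repeats the current value
    return samples


samples = metropolis(log_standard_normal, 20000, seed=1)
sample_mean = sum(samples) / len(samples)
sample_var = sum((s - sample_mean) ** 2 for s in samples) / len(samples)
```

After enough iterations the sample mean and variance should be close to 0 and 1, the moments of the equilibrium distribution. Successive draws are correlated, so the effective number of independent samples is smaller than the chain length.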
Gibbs Sampling
• Gibbs sampling is a special case of the Metropolis-Hastings
algorithm in which every proposal is accepted, which makes it
particularly efficient.
• It defines the equilibrium distribution using the
distribution of each variable and parameter conditional
on the other variables and parameters.
• During an epoch (a cycle through the variables and
parameters), each is resampled conditional on the current
values of the others.
• Repeat, recording the values only periodically (thinning),
at an interval long enough that the autocorrelation between
recorded values is small and they are approximately independent.
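The steps above can be sketched for a case where the conditional distributions are known exactly: a bivariate normal with zero means, unit variances and correlation rho. The target, the function names and the burn-in and thinning settings are assumptions chosen for illustration.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_keep, burn_in=500, thin=5, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho * rho)  # sd of each full conditional
    x, y = 0.0, 0.0
    kept = []
    for epoch in range(burn_in + n_keep * thin):
        # One epoch: resample each variable given the current value of the other.
        x = rng.gauss(rho * y, cond_sd)   # x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, cond_sd)   # y | x ~ N(rho * x, 1 - rho^2)
        # Discard the burn-in, then record every `thin`-th epoch.
        if epoch >= burn_in and (epoch - burn_in) % thin == thin - 1:
            kept.append((x, y))
    return kept


def sample_correlation(pairs):
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


draws = gibbs_bivariate_normal(0.8, 5000, seed=2)
estimated_rho = sample_correlation(draws)
```

The correlation estimated from the recorded draws should be close to the value of rho used to define the conditionals, which is a simple check that the chain has reached the intended equilibrium distribution.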
BUGS
• WinBUGS (or OpenBUGS) is a tool for doing Bayesian
modelling of statistical distributions.
• It allows you to use Gibbs sampling to go from a series
of relationships to a statistical model implied by those
relationships.
• For systems too complicated to be modelled in the
frequentist paradigm (see last lecture), it will give you
answers.
• For systems that can be modelled in the frequentist
paradigm, it gives very similar or identical answers.
Kéry’s Argument
• Kéry recommends this approach for six reasons:
1. Numerical tractability
2. Absence of asymptotics
3. Ease of error propagation
4. Formal framework for combining information
5. Intuition
6. Coherence
Numerical tractability
• Gibbs sampling can handle statistical models
too complex to be fitted using classical
statistics.
• When the model can also be fitted using classical
statistics, Gibbs sampling gives answers close
to those produced by classical methods.
Absence of asymptotics
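• Classical inference using maximum likelihood
is only guaranteed to be unbiased asymptotically,
as the sample size tends to infinity.
• For small samples, classical inference is often
biased.
• Bayesian inference by Gibbs sampling does not rely
on asymptotic approximations: the posterior is exact
for any sample size, up to Monte Carlo error.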
Ease of error propagation
• In classical statistics, measuring the distribution
of parameters often requires approximation
methods.
• In Gibbs sampling, the distribution of
parameters can be sampled and reported
directly.
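A minimal sketch of error propagation with posterior samples: to obtain the distribution of a derived quantity, compute it draw by draw and summarise the results. The "posterior draws" below are stand-ins generated from normal distributions purely for illustration. In practice they would come from the sampler's output, and the names alpha and beta are hypothetical.

```python
import random
import statistics

rng = random.Random(42)

# Stand-in posterior draws for two hypothetical parameters (assumption:
# in a real analysis these would be read from the MCMC output).
alpha_draws = [rng.gauss(10.0, 1.0) for _ in range(4000)]
beta_draws = [rng.gauss(2.0, 0.1) for _ in range(4000)]

# Derived quantity: the ratio alpha / beta, computed for each draw.
ratio_draws = [a / b for a, b in zip(alpha_draws, beta_draws)]


def credible_interval(draws, level=0.95):
    """Equal-tailed interval read directly from the sorted draws."""
    ordered = sorted(draws)
    lower_index = int((1.0 - level) / 2.0 * len(ordered))
    upper_index = int((1.0 + level) / 2.0 * len(ordered)) - 1
    return ordered[lower_index], ordered[upper_index]


ratio_mean = statistics.fmean(ratio_draws)
ratio_low, ratio_high = credible_interval(ratio_draws)
```

No delta-method approximation is needed: the uncertainty in both parameters carries through to the ratio automatically because each draw of the ratio uses one joint draw of the parameters.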
Formal framework for combining information
• Bayesian methods define a theoretically correct
approach for fusing data with existing
knowledge of the process being studied.
Intuition
• Bayesian probability is concerned with
calculating the distribution of parameters in the
model, not with being able to reject null
hypotheses on the data.
Coherence
• Bayesian statistics is much simpler in concept
than classical statistics: all inference follows
from a single rule, Bayes' theorem.
Evaluation of Results
• Suppose you have done your Gibbs sampling
and you have a series of pseudo-random
numbers from what appears to be the joint
posterior distribution. How do you check it?
– Verify the numbers come from what appears to be a
stationary distribution.
– Verify that the numbers are not auto-correlated.
– Verify that the results are robust.
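Two of these checks can be sketched with simple summaries: comparing the means of the first and second halves of a chain (a rough stationarity check) and computing the autocorrelation at a given lag. The chain below is a stand-in first-order autoregressive series, used only because its autocorrelation is known. The function names and settings are assumptions, not part of BUGS.

```python
import random
import statistics


def autocorrelation(chain, lag):
    """Sample autocorrelation of a chain at the given lag."""
    n = len(chain)
    mean = statistics.fmean(chain)
    var = sum((v - mean) ** 2 for v in chain) / n
    cov = sum((chain[i] - mean) * (chain[i + lag] - mean)
              for i in range(n - lag)) / n
    return cov / var


def split_means(chain):
    """Means of the first and second halves; large gaps suggest non-stationarity."""
    half = len(chain) // 2
    return statistics.fmean(chain[:half]), statistics.fmean(chain[half:])


def ar1_chain(phi, n, seed=0):
    # Stand-in for sampler output: x[t] = phi * x[t-1] + standard normal noise.
    rng = random.Random(seed)
    x = 0.0
    chain = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        chain.append(x)
    return chain


chain = ar1_chain(0.9, 20000, seed=3)
thinned = chain[::50]                     # keep every 50th value

raw_lag1 = autocorrelation(chain, 1)      # high: successive draws are dependent
thinned_lag1 = autocorrelation(thinned, 1)  # near zero after thinning
first_half_mean, second_half_mean = split_means(chain)
```

Robustness is usually checked separately, for example by rerunning the sampler from different starting values and confirming that the summaries agree.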
Worked Example 1
• Using BUGS, demonstrate how to do the
analysis.
Worked Example 2
• Running BUGS from R.