How many iterations in the Gibbs sampler?
Adrian E. Raftery and Steven Lewis
(September, 1991)
Duke University Machine Learning Group
Presented by Iulian Pruteanu
11/17/2006
Outline
• Introduction
• How many iterations to estimate a posterior quantile?
• Extensions
• Examples
Introduction
• There is no guarantee, no matter how long you run the
MCMC algorithm for, that it will converge to the posterior
distribution.
• Diagnostic statistics identify problems with convergence but
cannot “prove” that convergence has occurred.
• One long run or many short runs?
• The Gibbs sampler can be extremely computationally
demanding, even for relatively small-scale problems.
One long run or many short runs?
• short runs: (a) choose a starting point; (b) run the Gibbs
sampler for T iterations and store only the last iterate; (c)
return to (a).
• one long run may well be more efficient: the starting point of every subsequence of length T is closer to a draw from the stationary distribution than in the short-runs case.
• it is still important to use several different starting points
since we don’t know if a single run has converged.
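To make the contrast concrete, here is a minimal sketch of the two strategies for a toy Gibbs sampler on a bivariate normal target; the target, the correlation rho, and the run lengths are illustrative assumptions, not values from the talk.

    import numpy as np

    rng = np.random.default_rng(0)
    rho = 0.9  # correlation of the toy bivariate-normal target (assumed)

    def gibbs_step(x, y):
        # full conditionals of a standard bivariate normal with correlation rho
        x = rng.normal(rho * y, np.sqrt(1 - rho ** 2))
        y = rng.normal(rho * x, np.sqrt(1 - rho ** 2))
        return x, y

    def short_runs(n_runs=100, T=50):
        # (a) choose a starting point; (b) run T iterations and keep only the
        # last iterate; (c) return to (a)
        draws = []
        for _ in range(n_runs):
            x, y = rng.normal(size=2) * 5.0      # overdispersed starting point
            for _ in range(T):
                x, y = gibbs_step(x, y)
            draws.append((x, y))
        return np.array(draws)

    def one_long_run(n_iter=5000, burn_in=50):
        # a single long run: only the first burn_in iterations are discarded,
        # and every later iterate starts close to the stationary distribution
        x, y = rng.normal(size=2) * 5.0
        chain = []
        for t in range(n_iter):
            x, y = gibbs_step(x, y)
            if t >= burn_in:
                chain.append((x, y))
        return np.array(chain)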
Introduction
• The Raftery-Lewis test:
1. Specify a particular quantile q of the distribution of interest and an accuracy r for the quantile.
2. The test breaks the chain (coming from the Gibbs sampler) into a (1,0) sequence and generates a two-state Markov chain.
3. The test uses this sequence to estimate the transition probabilities, and then the number of additional burn-in iterations and the total chain length required to achieve the prescribed level of accuracy.
The Raftery-Lewis test
• we want to estimate $q = P(U \le u)$ to within $\pm r$ with probability $s$
• $U$ is the quantity whose posterior distribution we are looking to estimate, and $u$ is its $q$-th quantile
• we calculate $U_t$ for each iteration $t$ and then form $Z_t = \delta(U_t \le u)$, where $\delta$ is the indicator function
• the problem is to determine $M$ (initial burn-in iterations), $N$ (further iterations) and $k$ (step size)
• $\{Z_t\}$ is a binary 0-1 process derived from a Markov chain, but is not itself a Markov chain in general
• we form a new process $\{Z_t^{(k)}\}$, where $Z_t^{(k)} = Z_{1+(t-1)k}$
• assuming that $\{Z_t^{(k)}\}$ is indeed a Markov chain, we determine $M$, $N$ and $k$ to approach stationarity.
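A short sketch of the dichotomization and thinning steps above; the function name is mine, and taking $u$ to be the empirical $q$-quantile of the output is an assumption about how $u$ is obtained in practice.

    import numpy as np

    def dichotomize_and_thin(U, q, k):
        """U: array of U_t values from the Gibbs output; q: quantile; k: step size."""
        u = np.quantile(U, q)          # empirical q-quantile, so P(U <= u) ~ q
        Z = (U <= u).astype(int)       # Z_t = 1(U_t <= u), a binary 0-1 process
        Zk = Z[::k]                    # Z_t^(k) = Z_{1+(t-1)k}: every k-th value
        return Z, Zk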
The Raftery-Lewis test
• let $P = \begin{pmatrix} 1-\alpha & \alpha \\ \beta & 1-\beta \end{pmatrix}$ be the transition matrix for $\{Z_t^{(k)}\}$
• the equilibrium distribution is then $\pi = (\pi_0, \pi_1) = \dfrac{(\beta, \alpha)}{\alpha + \beta}$
• the $l$-step transition matrix is $P^l = \dfrac{1}{\alpha+\beta}\begin{pmatrix} \beta & \alpha \\ \beta & \alpha \end{pmatrix} + \dfrac{\lambda^l}{\alpha+\beta}\begin{pmatrix} \alpha & -\alpha \\ -\beta & \beta \end{pmatrix}$, where $\lambda = 1 - \alpha - \beta$
The Raftery-Lewis test
• we require that $P[Z_m^{(k)} = i \mid Z_0^{(k)} = j]$ be within $\varepsilon$ of $\pi_i$ for $i, j = 0, 1$
• then $P[Z_m^{(k)} = i \mid Z_0^{(k)} = j] = (e_j P^m)_i$, where $e_0 = (1, 0)$ and $e_1 = (0, 1)$
• this holds when $\lambda^m = (1-\alpha-\beta)^m \le \dfrac{\varepsilon(\alpha+\beta)}{\max(\alpha, \beta)}$, i.e. $m^* = \dfrac{\log\!\left(\dfrac{\varepsilon(\alpha+\beta)}{\max(\alpha, \beta)}\right)}{\log \lambda}$
• thus $M = m^* k$
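In code, estimating $\alpha$ and $\beta$ by one-step transition counts (the slide does not spell out the estimator; counting transitions is the usual choice) and applying the burn-in formula above might look like this; eps plays the role of $\varepsilon$, and 0.001 is only a commonly used default, not a value from the talk.

    import numpy as np

    def estimate_alpha_beta(Zk):
        # one-step transition counts for the thinned 0/1 sequence Z^(k)
        prev, curr = Zk[:-1], Zk[1:]
        n01 = np.sum((prev == 0) & (curr == 1))
        n00 = np.sum((prev == 0) & (curr == 0))
        n10 = np.sum((prev == 1) & (curr == 0))
        n11 = np.sum((prev == 1) & (curr == 1))
        alpha = n01 / (n00 + n01)      # P(0 -> 1)
        beta = n10 / (n10 + n11)       # P(1 -> 0)
        return alpha, beta

    def burn_in(alpha, beta, k, eps=0.001):
        # M = m* k, where lambda^m* = eps * (alpha + beta) / max(alpha, beta)
        lam = 1.0 - alpha - beta
        m_star = (np.log(eps * (alpha + beta) / max(alpha, beta))
                  / np.log(abs(lam)))  # abs() guards the oscillating case lam < 0
        return int(np.ceil(m_star)) * k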
The Raftery-Lewis test
• the sample mean of the process $\{Z_t^{(k)}\}$ is $\bar{Z}_n^{(k)} = \dfrac{1}{n}\sum_{t=1}^{n} Z_t^{(k)}$, which is approximately normally distributed (central limit theorem)
• so $P[q - r \le \bar{Z}_n^{(k)} \le q + r] = s$ will be satisfied if $n^* = \dfrac{\alpha\beta(2-\alpha-\beta)}{(\alpha+\beta)^3}\left\{\dfrac{\Phi^{-1}\!\left(\tfrac{1}{2}(1+s)\right)}{r}\right\}^2$
• thus $N = n^* k$
• an initial pilot run is used to obtain preliminary estimates of $\alpha$ and $\beta$
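The corresponding sketch for the post-burn-in chain length, with scipy's norm.ppf standing in for $\Phi^{-1}$:

    import numpy as np
    from scipy.stats import norm

    def chain_length(alpha, beta, k, r=0.005, s=0.95):
        # N = n* k, using the normal approximation to the sample mean of Z^(k)
        phi = norm.ppf(0.5 * (1.0 + s))                    # Phi^{-1}((1 + s) / 2)
        n_star = (alpha * beta * (2.0 - alpha - beta)
                  / (alpha + beta) ** 3) * (phi / r) ** 2
        return int(np.ceil(n_star)) * k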
Extensions
• several quantiles: run the Gibbs sampler for $N_i$ iterations for each quantile $i$ and then use the maximum values of $M$, $k$ and $N$ (see the sketch below).
• independent iterates: when it is much more expensive to analyze a Gibbs iterate than to simulate it, it is desirable to have approximately independent Gibbs iterates (achieved by making $k$ large enough).
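For the several-quantiles extension, a sketch under the assumption that the single-quantile pieces above have been wrapped in a function raftery_lewis(U, q, r, s) returning (M, N, k); that wrapper is hypothetical, not shown on the slides.

    def raftery_lewis_multi(U, quantiles, r=0.005, s=0.95):
        # run the single-quantile diagnostic for each quantile of interest ...
        results = [raftery_lewis(U, q, r, s) for q in quantiles]
        # ... and then use the maximum values of M, N and k
        M = max(res[0] for res in results)
        N = max(res[1] for res in results)
        k = max(res[2] for res in results)
        return M, N, k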
Examples
The method was applied to both simulated and real examples.
The results are given for $q = 0.025$, $r = 0.005$ and $s = 0.95$.
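As a point of reference (an aside, not on the slides): if the Gibbs iterates were independent, the number of draws needed for these settings would be $N_{\min} = \{\Phi^{-1}(\tfrac{1}{2}(1+s))\}^2\, q(1-q)/r^2 \approx 3746$, which a quick computation confirms.

    import math
    from scipy.stats import norm

    q, r, s = 0.025, 0.005, 0.95
    n_min = norm.ppf(0.5 * (1 + s)) ** 2 * q * (1 - q) / r ** 2
    print(math.ceil(n_min))   # 3746 draws if the iterates were independent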
Discussion
• for ‘nice’ posterior distributions, this accuracy can be achieved by running the sampler for 5000 iterations and using all the iterates.
• when the posterior is not ‘nice’, the required number can be much greater.
• the required number of iterations can be dramatically different for different problems, and even for different quantities of interest within the same problem.