732A26 Computational statistics
Department of Computer and Information Science
Computer lab 3: Markov Chain Monte Carlo (MCMC) simulation
Learning objectives
The main objective of this computer lab is to make the student acquainted with basic
forms of Markov Chain Monte Carlo simulations.
After completing the lab the student shall be able to implement simple forms of the
Metropolis-Hastings algorithm and interpret the results of a Markov Chain Monte Carlo
simulation.
Assignment 1: Using MCMC to generate binary random fields
Consider a set of pixels organized in $N_1$ rows and $N_2$ columns. If $x_m = 0$ marks a black
pixel and $x_m = 1$ a white one, there is a one-to-one correspondence between black-and-white
images and binary vectors $(x_1, \ldots, x_N)$ of length $N = N_1 N_2$. Your task is to generate
binary random fields (random images) according to a probability distribution on
$\Omega = \{(x_1, \ldots, x_N);\ x_m \in \{0, 1\}\}$ that favours images in which pixels clump together in
groups of equal colour.
Two pixels are said to be next-door-neighbours on an image lattice if they are located in
adjacent positions in the same row or column. Denote by #x the number of pairs in which
the two pixels are next-door-neighbours and of different colour. Suppose that the
probability of $x \in \Omega$ is given by

$$p(x) = \frac{1}{h(\theta)} \exp(-2\theta\, \#x)$$

where $\theta$ is a parameter and $h(\theta)$ a normalizing constant. You shall use a Visual Basic
macro in the Excel document ‘MCMCClumping.xls’ to investigate whether MCMC can
be used to generate random numbers whose marginal distribution is close to p(x). Further
details about the formulae used in the Visual Basic macro can be found in the document
‘Nicholls.pdf’.
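The definition of #x can be made concrete with a small sketch. It is not taken from the Excel macro: the function names and the two example images are hypothetical, and it assumes that the density has the form exp(-2θ #x)/h(θ), with a minus sign in the exponent so that clumped images are favoured for θ > 0.

```python
def count_disagreeing_pairs(image):
    """Return #x: the number of next-door-neighbour pairs of different colour."""
    rows, cols = len(image), len(image[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            # Pair with the right-hand neighbour in the same row.
            if c + 1 < cols and image[r][c] != image[r][c + 1]:
                count += 1
            # Pair with the neighbour below in the same column.
            if r + 1 < rows and image[r][c] != image[r + 1][c]:
                count += 1
    return count


def log_unnormalized_p(image, theta):
    """Return -2 * theta * #x, i.e. log p(x) + log h(theta) (assumed form)."""
    return -2.0 * theta * count_disagreeing_pairs(image)


# Two hypothetical 2 x 4 images: 0 = black pixel, 1 = white pixel.
clumped = [[0, 0, 1, 1],
           [0, 0, 1, 1]]
checkerboard = [[0, 1, 0, 1],
                [1, 0, 1, 0]]

for name, image in (("clumped", clumped), ("checkerboard", checkerboard)):
    print(name, count_disagreeing_pairs(image), log_unnormalized_p(image, theta=0.5))
```

Each neighbouring pair is counted once because every pixel is compared only with the pixel to its right and the pixel below it. The normalizing constant h(θ) is never needed when only ratios of probabilities are compared.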
(i) Open the worksheet ‘Pixel_data’ in the Excel document
‘MCMCClumping.xls’. There you can find an initial binary field and a button
to run a macro that step by step updates this field according to a Markov chain
with stationary distribution p(x). Make repeated runs with this macro to
investigate how the empirical distribution of #x after the burn-in period is
influenced by the length of this period, the total number of updates, and the
parameter $\theta$. The worksheets ‘Final_state’ and ‘MCMC_statistics’ show the
image obtained after the final adjustment, and the frequency distribution of #x
after the end of the burn-in period.
(ii) Select a different initial image with the same number of black and white
pixels as in the previous task. For instance, you can create an image where the
black pixels are located in the left part and the white ones in the right part of
the image. Make new runs with the Visual Basic macro and inspect the results
visually. Is the limiting distribution influenced by the initial state?
(iii) Examine whether or not the limiting distribution of the generated Markov
chain is equal to p(x).
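One way to think about tasks (i) to (iii) is to repeat the experiment on a lattice small enough that p(x) can be computed exactly by enumeration. The sketch below is an illustration under stated assumptions, not the algorithm in the Excel macro or in ‘Nicholls.pdf’: it uses a plain single-site Metropolis update (flip one randomly chosen pixel), assumes p(x) is proportional to exp(-2θ #x), and all function names, the 2 x 2 lattice size and the run lengths are hypothetical choices.

```python
import itertools
import math
import random


def count_disagreeing_pairs(x, n_rows, n_cols):
    """#x for a field stored row by row in a flat list or tuple."""
    count = 0
    for r in range(n_rows):
        for c in range(n_cols):
            m = r * n_cols + c
            if c + 1 < n_cols and x[m] != x[m + 1]:
                count += 1
            if r + 1 < n_rows and x[m] != x[m + n_cols]:
                count += 1
    return count


def exact_distribution_of_count(n_rows, n_cols, theta):
    """Enumerate all 2^N fields and return {k: P(#x = k)} under p(x)."""
    weights = {}
    for x in itertools.product((0, 1), repeat=n_rows * n_cols):
        k = count_disagreeing_pairs(x, n_rows, n_cols)
        weights[k] = weights.get(k, 0.0) + math.exp(-2.0 * theta * k)
    h = sum(weights.values())  # normalizing constant h(theta)
    return {k: w / h for k, w in weights.items()}


def simulated_distribution_of_count(n_rows, n_cols, theta, burn_in, n_updates, rng):
    """Single-site Metropolis updates; tally #x after the burn-in period."""
    n = n_rows * n_cols
    x = [rng.randint(0, 1) for _ in range(n)]  # random initial image
    current = count_disagreeing_pairs(x, n_rows, n_cols)
    tally = {}
    for step in range(n_updates):
        m = rng.randrange(n)
        x[m] = 1 - x[m]  # propose flipping one pixel
        proposed = count_disagreeing_pairs(x, n_rows, n_cols)
        # Symmetric proposal: accept with probability min(1, p(x') / p(x)).
        if rng.random() < math.exp(-2.0 * theta * (proposed - current)):
            current = proposed
        else:
            x[m] = 1 - x[m]  # reject: undo the flip
        if step >= burn_in:
            tally[current] = tally.get(current, 0) + 1
    total = sum(tally.values())
    return {k: v / total for k, v in tally.items()}


rng = random.Random(1)
exact = exact_distribution_of_count(2, 2, theta=0.5)
simulated = simulated_distribution_of_count(
    2, 2, theta=0.5, burn_in=1000, n_updates=101000, rng=rng
)
for k in sorted(exact):
    print(k, round(exact[k], 3), round(simulated.get(k, 0.0), 3))
```

The printed table lists each possible value of #x with its exact and simulated probability. Enumeration is only feasible for very small lattices, since the number of images is 2^N; for the lattice in the Excel document the comparison has to rely on repeated runs, different initial images and different burn-in lengths instead.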
Assignment 2: Implementing the Metropolis-Hastings algorithm
Your task is to use MCMC and the Metropolis-Hastings (MH) algorithm to generate
random numbers with a standard normal distribution.
(i) Let the proposal chain be a Markov chain in which the transition from state x
has a uniform distribution on the interval (x-d, x+d). Derive an analytical
expression for the goodness ratio of the proposed state.
(ii) Choose a suitable d-value and implement the MH algorithm. This can be
done by typing and filling down formulae in an Excel sheet or by writing R code.
Show that the limiting distribution of the generated Markov chain is
close to a standard normal distribution. Investigate how the convergence to the
desired marginal distribution of the Markov chain is influenced by the d-value
of the uniform distribution.
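The two tasks above can be sketched in a few lines of code. The sketch is written in Python rather than Excel or R, and it rests on two assumptions that should be checked against your own derivation: that the "goodness ratio" is the usual MH acceptance ratio, and that, because the uniform proposal on (x-d, x+d) is symmetric, this ratio reduces to the ratio of target densities, exp((x^2 - y^2)/2) for a proposed state y. The function names, seed, run length and the three d-values are hypothetical choices.

```python
import math
import random


def metropolis_hastings_normal(n_samples, d, x0=0.0, seed=0):
    """Random-walk MH targeting N(0, 1) with Uniform(x - d, x + d) proposals."""
    rng = random.Random(seed)
    x = x0
    chain = []
    accepted = 0
    for _ in range(n_samples):
        y = rng.uniform(x - d, x + d)
        # Assumed ratio for a symmetric proposal: pi(y) / pi(x) = exp((x^2 - y^2) / 2).
        log_ratio = 0.5 * (x * x - y * y)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = y  # accept the proposed state
            accepted += 1
        chain.append(x)  # on rejection the current state is repeated
    return chain, accepted / n_samples


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance


burn_in = 1000
for d in (0.1, 2.0, 20.0):
    chain, acceptance_rate = metropolis_hastings_normal(50000, d)
    mean, variance = mean_and_variance(chain[burn_in:])
    print(f"d={d}: acceptance rate {acceptance_rate:.2f}, "
          f"mean {mean:.3f}, variance {variance:.3f}")
```

A very small d gives a high acceptance rate but tiny steps, while a very large d gives large proposed steps that are mostly rejected; in both cases the chain explores the target slowly. Comparing histograms of the chain after burn-in for several d-values is one way to make this visible in the report.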
To hand in
A short report with the results of your simulations including appropriate histograms of
simulation results. If you use R, please also include the code you have used.