Chapter 4
Monte Carlo Methods and
Simulation
Monte Carlo Methods
Introduction to Monte Carlo
1. Monte Carlo methods have been used for centuries.
2. However, the first real use came during World War II, when the method was used to simulate probabilistic problems in neutron diffusion.
3. Named after Monte Carlo in Monaco (one of the world's centers for gambling), due to the similarity to games of chance.
4. Otherwise known as the random walk method.
Solve mathematical problems:
• Multi-dimensional integrals
• Integral equations
• Differential equations
• Matrix inversion
• Extremum of functions
Disadvantages
• Time consuming.
• Doesn't give an exact solution.
• Is the model correct?
• Bugs, bugs, bugs.
Consider particle transport through matter
Three possible methods
• Direct Experiment
– Sometimes virtually impossible
– Very expensive / time consuming
• Theoretical
– Solve complex equations subject to interface & boundary conditions
– Often need approximations
• Monte Carlo
– Use basic physics laws and probabilities
– Minimal amount of approximations
– Since it is a statistical method, it can require large amounts of computer time
What is Monte Carlo
Monte Carlo methods are a class of algorithms that rely on repeated random
sampling to compute their results.
Monte Carlo Methods use Statistical Physics techniques to solve problems that
are difficult or inconvenient to solve deterministically.
Non-Monte Carlo methods typically involve ODE/PDE equations that describe the system.
Monte Carlo methods are stochastic techniques (processes involving a number of random variables depending on a variable parameter).
They are based on the use of random numbers and probability statistics to simulate problems.
Something can be called a Monte Carlo method if it uses random numbers to examine the problem it is solving.
Components of Monte Carlo simulation
Probability distribution functions (pdf's) - the physical (or mathematical) system
must be described by a set of pdf's.
Random number generator - a source of random numbers uniformly distributed on
the unit interval must be available.
Sampling rule - a prescription for sampling from the specified pdf's, assuming the
availability of random numbers on the unit interval, must be given.
Scoring (or tallying) - the outcomes must be accumulated into overall tallies or
scores for the quantities of interest.
Error estimation - an estimate of the statistical error (variance) as a function of the
number of trials and other quantities must be determined.
Variance reduction techniques - methods for reducing the variance in the estimated solution, in order to reduce the computational time of the Monte Carlo simulation.
Parallelization and vectorization - algorithms to allow Monte Carlo methods to be
implemented efficiently on advanced computer architectures.
Probability Density Function
A probability density function (or probability distribution function) is a function f defined
on an interval (a, b) and having the following properties:
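The standard defining properties of such a pdf, presumably what was listed here, are

f(x) \ge 0 \quad \text{for all } x \in (a, b), \qquad \int_a^b f(x)\,dx = 1.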
Uses for Random Numbers
In Monte Carlo calculations, we usually have to take random samples of
certain quantities with definite distributions.
For this reason, the random number we use should follow the same
distribution.
For example, in a one-dimensional Monte Carlo integration, we wish to sample
the variable of integration with uniform probability within the integration limits.
For this purpose, we need random numbers with a uniform distribution.
A simple way to obtain a random integer is to read the timer, or system
clock, on the computer we are using.
Another interesting way to produce random integers is the middle-square
method of von Neumann. If we square an integer consisting of several
digits, both the most significant digits and the least significant digits are
predictable. However, it is more difficult to predict the digits in the middle.
For example, if we square a three-digit integer,
(123)^2 = 15,129
the result is a five-digit integer. By chopping off the leading and the trailing digit, we obtain the number 512.
(512)^2 = 262,144
If we square this number and retain only the third, fourth, and fifth digits, we obtain the number 214. In this way, a sequence of random integers can be obtained, each constructed from the square of the previous one with only the middle part of the digits retained.
(214)^2 = 45,796
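As an illustration (not part of the original notes), here is a minimal C sketch of the middle-square idea, using the more common four-digit variant: square the current value, treat the square as an eight-digit number, and keep its middle four digits.

#include <stdio.h>

/* Illustrative sketch of von Neumann's middle-square generator in its
 * common four-digit form: square the current 4-digit value and keep the
 * middle 4 digits of the (up to) 8-digit result. */
static unsigned middle_square(unsigned x)
{
    unsigned long sq = (unsigned long)x * x;    /* up to 8 digits */
    return (unsigned)((sq / 100UL) % 10000UL);  /* drop last 2 digits, keep next 4 */
}

int main(void)
{
    unsigned x = 5731;                          /* arbitrary 4-digit seed */
    for (int i = 0; i < 10; i++) {
        x = middle_square(x);
        printf("%04u\n", x);
    }
    return 0;
}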
Although this method has been in use for a long time, it is no longer a preferred way to generate random integers. Tests have shown that there are many ways in which such a sequence can degenerate. For example, if for any reason a member of the sequence becomes zero, all the remaining members of the sequence will obviously be zero.
The linear congruence method
The most popular way to generate random integers with a uniform distribution is
the linear congruence method. In this approach, a random integer Xn+1 is produced
from another one Xn through the operation
Xn+1 = (aXn + c) mod m
where the modulus, m, is a positive integer. The multiplier a and increment c are
also positive integers but their values must be less than m. To start off the sequence of random integers X0, X1, X2, ..., we need to input an integer X0, generally referred to as the seed of the random sequence.
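A minimal C sketch of such a generator follows; the constants a = 1664525, c = 1013904223 and m = 2^32 are a widely used illustrative choice, not values prescribed by these notes.

#include <stdio.h>
#include <stdint.h>

/* Minimal linear congruential generator, X_{n+1} = (a*X_n + c) mod m.
 * With m = 2^32 the modulo operation happens automatically in 32-bit
 * unsigned arithmetic. */
static uint32_t lcg_state = 12345u;             /* the seed X_0 */

static uint32_t lcg_next(void)
{
    lcg_state = 1664525u * lcg_state + 1013904223u;
    return lcg_state;
}

/* Map the integer sequence onto uniform deviates in [0, 1). */
static double lcg_uniform(void)
{
    return lcg_next() / 4294967296.0;           /* divide by 2^32 */
}

int main(void)
{
    for (int i = 0; i < 5; i++)
        printf("%10u   %f\n", lcg_next(), lcg_uniform());
    return 0;
}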
A Simple Example: Rolling Dice
As a simple example of a Monte Carlo simulation, consider calculating the
probability of a particular sum of the throw of two dice (with each die having
values one through six). In this particular case, there are 36 combinations of
dice rolls:
Based on this, you can manually compute the probability of a particular
outcome. For example, there are six different ways that the dice could sum to
seven. Hence, the probability of rolling seven is equal to 6 divided by 36 =
0.167.
Instead of computing the probability in this way, however, we could throw the dice a hundred times and record how many times each outcome occurs.
If the dice totaled seven 18 times (out of 100 rolls), we would conclude that the probability of rolling seven is approximately 0.18 (18%).
Obviously, the more times we roll the dice, the more accurate our result becomes. Better than rolling dice a hundred times by hand, we can easily use a computer to simulate rolling the dice 10,000 times (or more).
Because we know the probability of a particular outcome for one die (1 in 6 for each of the six numbers), this is simple. A histogram of 10,000 realizations (the original slides used GoldSim software) shows the resulting distribution of sums; a sketch of the same experiment in C is given below.
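A minimal C sketch of this experiment, using the standard library rand() in place of GoldSim; the seed and the number of trials are illustrative.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Roll two six-sided dice 10,000 times and tally how often each sum
 * (2..12) occurs; print the estimated probability of each sum. */
int main(void)
{
    const int ntrials = 10000;
    int counts[13] = {0};                        /* counts[s] = times sum s appeared */

    srand((unsigned)time(NULL));
    for (int i = 0; i < ntrials; i++) {
        int d1 = rand() % 6 + 1;                 /* first die, 1..6 */
        int d2 = rand() % 6 + 1;                 /* second die, 1..6 */
        counts[d1 + d2]++;
    }

    for (int s = 2; s <= 12; s++)
        printf("sum %2d: %5d rolls, estimated probability %.3f\n",
               s, counts[s], (double)counts[s] / ntrials);
    return 0;
}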
Finding PI
(Figure: a quarter of a circle of radius R = 1, centered at the origin, drawn in the first quadrant of the x-y plane.)
Equation for a circle (centered at the origin): x^2 + y^2 = r^2
Area of a circle: A = \pi r^2
PI equation: \pi = A / r^2
Area of the circle in the 1st quadrant: \pi / 4 = A / r^2
Finding PI
• We can determine PI by finding the area, A, for a given radius, r.
• This is equivalent to integrating the equation of the circle as follows:
A = \int_a^b f(x)\,dx = \int_0^r \sqrt{r^2 - x^2}\,dx
Rejection Method
(Figure: random points, labelled A, B and C, scattered in the unit square of the x-y plane, illustrating which points fall inside the quarter circle and which fall outside.)
Finding PI
Algorithm guts:
• Initialize a counter for the area.
• Select an x.
• Select a y.
• Calculate x^2 + y^2.
• If this value is less than r^2, increment the area counter.
• Repeat the sampling.
Simulation Process
(Flowchart. Main program: start; read the simulation parameters; if this is a new simulation, initialize the positions of all particles, otherwise read the old configuration; call the Monte Carlo loop; if it is not the end of the simulation, repeat; otherwise stop.
Monte Carlo loop subroutine: start; make a trial move; if it satisfies the Metropolis rule, accept the trial move and update the energy and virial; sample the pressure; stop.)
Program to compute Pi using Monte Carlo methods
/* Program to compute Pi using Monte Carlo methods */
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <string.h>
#define SEED 35791246
int main(void)
{
   int niter = 0;                /* number of iterations */
   double x, y;
   int i, count = 0;             /* # of points in the 1st quadrant of unit circle */
   double z;
   double pi;

   printf("Enter the number of iterations used to estimate pi: ");
   scanf("%d", &niter);

   /* initialize random numbers */
   srand(SEED);
   count = 0;
   for (i = 0; i < niter; i++) {
      x = (double)rand() / RAND_MAX;
      y = (double)rand() / RAND_MAX;
      z = x*x + y*y;
      if (z <= 1) count++;
   }
   pi = (double)count / niter * 4;
   printf("# of trials= %d , estimate of pi is %g \n", niter, pi);
   return 0;
}
Monte–Carlo Integration
Advantages
• Evaluates integrals based on a selection of random samples in the integration
domain
• Works no matter how complex the function to be integrated is (even
discontinuous)
• Does not even require a priori knowledge of the integrand!
Theorem (Mean Value Theorem for Integrals): If f(x) is continuous over [a, b],
then there exists a number c, with a < c < b, such that
\frac{1}{b-a} \int_a^b f(x)\,dx = f(c), \qquad \text{i.e.} \qquad \int_a^b f(x)\,dx = (b-a)\,f(c)
Often we are faced with integrals which cannot be done analytically. Especially in the case of multidimensional integrals, the simplest methods of discretisation can become prohibitively expensive. For example, the error in a trapezium rule calculation of a d-dimensional integral falls as N^(-2/d), where N is the number of different values of the integrand used. In a Monte Carlo calculation the error falls as N^(-1/2), independently of the dimension. Hence for d > 4 Monte Carlo integration will usually converge faster.
We consider the expression for the average of a statistic, f(x), when x is a random number distributed according to a distribution p(x):

\langle f(x) \rangle = \int p(x)\,f(x)\,dx

which is just a generalisation of the well-known results for (e.g.) <x> or <x^2>, where we are using the notation <...> to denote averaging. Now consider an integral of the sort which might arise while using Laplace transforms.
\int_0^{\infty} e^{-x} f(x)\,dx

(Figure: plot of exp(-x) against x for 0 ≤ x ≤ 10.)
This integral can be evaluated by generating a set of N random numbers, {x_1, ..., x_N}, from the exponential distribution p(x) = exp(-x), and calculating the mean of f(x) as

\frac{1}{N} \sum_{i=1}^{N} f(x_i)
The error in this mean is evaluated as usual by considering the corresponding
standard error of the mean
2 
1
N 1

f 2 ( x)  f ( x)
2

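A C sketch of this procedure (not from the original notes): exponential deviates are generated as x = -ln(u) from uniform u, f is averaged over them, and the error is estimated with the variance formula above. The choice f(x) = x^2 (whose exact integral against exp(-x) is 2) is purely illustrative.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* Estimate I = integral_0^infinity exp(-x) f(x) dx as the mean of f(x_i),
 * with the x_i drawn from p(x) = exp(-x) via x = -ln(u), u uniform in (0,1).
 * f(x) = x*x is an illustrative choice with exact answer 2. */
static double f(double x) { return x * x; }

int main(void)
{
    const int N = 100000;
    double sum = 0.0, sum2 = 0.0;

    srand(12345);
    for (int i = 0; i < N; i++) {
        double u  = (rand() + 1.0) / (RAND_MAX + 2.0);  /* uniform in (0,1) */
        double x  = -log(u);                            /* exponential deviate */
        double fx = f(x);
        sum  += fx;
        sum2 += fx * fx;
    }

    double mean = sum / N;
    double var  = (sum2 / N - mean * mean) / (N - 1);   /* variance of the mean */
    printf("estimate = %f +/- %f  (exact value 2)\n", mean, sqrt(var));
    return 0;
}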
Example: Let f(x) = 4 / (1 + x^2). Use the Monte Carlo method to calculate approximations to the integral

I = \int_0^1 \frac{4}{1+x^2}\,dx

(Figure: plot of 4/(1+x^2) for 0 ≤ x ≤ 1.)

The exact value is

I = 4 \arctan x \,\Big|_0^1 = 4\arctan 1 - 4\arctan 0 = 4\cdot\frac{\pi}{4} - 0 = \pi
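A C sketch of the corresponding Monte Carlo estimate (not from the original notes): average f(x) = 4/(1+x^2) over uniformly distributed x in (0, 1); the sample size and seed are illustrative.

#include <stdio.h>
#include <stdlib.h>

/* Estimate I = integral_0^1 4/(1+x^2) dx = pi by averaging the integrand
 * over uniform random points in (0, 1). */
int main(void)
{
    const int N = 1000000;
    double sum = 0.0;

    srand(98765);
    for (int i = 0; i < N; i++) {
        double x = (double)rand() / RAND_MAX;   /* uniform sample in [0, 1] */
        sum += 4.0 / (1.0 + x * x);
    }
    printf("estimate of pi = %f\n", sum / N);
    return 0;
}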
The Metropolis Algorithm
In statistical mechanics we commonly want to evaluate thermodynamic averages of
the form
\langle y \rangle = \frac{\sum_i y_i \, e^{-\beta E_i}}{\sum_i e^{-\beta E_i}}

where E_i is the energy of the system in state i and β = 1/kT. Such problems can
be solved using the Metropolis et al. (1953) algorithm.
1. Choose an initial configuration of N spins (e.g. take all the spins to be +1 initially).
2. Pick a spin at random and compute the energy difference, ΔE, that would occur if the spin were flipped.
3. Calculate W, the transition probability for that flip.
4. Draw a random number, r, uniformly distributed between 0 and 1.
5. For ΔE < 0, accept the flip. For ΔE > 0, accept the flip if r < W, otherwise reject the flip and retain the original microstate. In either case, the configuration of the spins obtained at the end of step 5 is counted as a "new configuration".
6. Analyse the resulting configuration as desired, and store its properties to calculate the necessary averages.
A sketch in C of the accept/reject decision (steps 3-5) is given below.
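This sketch is illustrative, not taken from the notes; the function name, the inverse temperature beta, and the test values in main() are assumptions.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* One Metropolis accept/reject decision for a proposed spin flip with
 * energy change dE at inverse temperature beta (steps 3-5 above). */
static int accept_flip(double dE, double beta)
{
    if (dE <= 0.0)
        return 1;                                   /* always accept downhill moves */
    double W = exp(-beta * dE);                     /* transition probability */
    double r = (double)rand() / RAND_MAX;           /* uniform random number in [0, 1] */
    return r < W;                                   /* accept uphill move with probability W */
}

int main(void)
{
    srand(1);
    long accepted = 0, trials = 100000;
    for (long i = 0; i < trials; i++)
        accepted += accept_flip(4.0, 0.4);          /* e.g. dE = 4 and beta = 0.4 */
    printf("measured acceptance %.3f, expected %.3f\n",
           (double)accepted / trials, exp(-0.4 * 4.0));
    return 0;
}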
Lattice spin model with nearest neighbour interaction. The red site interacts only with the 4
adjacent yellow sites.
If there are N sites on the lattice, then the system can be in 2^N states. The energy of the system (the Hamiltonian) can be written in terms of the products of the spins and, if present, their interaction with an external magnetic field. The partition function is then the sum, over all the possible states of the system, of the exponential of -β times the Hamiltonian.
Let us suppose the system is initially in a particular state i and we change it to
another state j. The detailed balance condition demands that in equilibrium the
flow from i to j must be balanced by the flow from j to i.
This can be expressed as
p_i T_{i \to j} = p_j T_{j \to i}
where p_i is the probability of finding the system in state i and T_{i→j} is the probability (or rate) that a system in state i will make a transition to state j. This can be rearranged to read

\frac{T_{i \to j}}{T_{j \to i}} = \frac{p_j}{p_i} = e^{-\beta (E_j - E_i)}
Generally the right-hand side of this equation is known and we want to generate a set of states which obey the distribution p_i. This can be achieved by choosing the transition rates such that
T_{i \to j} =
\begin{cases}
1 & \text{if } p_j \ge p_i \text{ (i.e. } E_j \le E_i\text{)} \\
e^{-\beta (E_j - E_i)} & \text{if } p_j < p_i \text{ (i.e. } E_j > E_i\text{)}
\end{cases}
In practice, if p_j < p_i, a random number η is chosen uniformly between 0 and 1 and the system is moved to state j only if η is less than p_j / p_i = e^{-β(E_j - E_i)}.
This method is not the only way in which the condition can be fulfilled, but it is
by far the most commonly used.
An important feature of the procedure is that it is never necessary to evaluate the partition function (the normalising denominator), but only the relative probabilities of the different states. This is usually much easier to achieve, as it only requires the calculation of the change of energy from one state to another.
Note that, although we have derived the algorithm in the context of
thermodynamics, its use is by no means confined to that case.
The Ising model
Ernst Ising 1900-1998
Born in Cologne, Germany on May 10, 1900, died in Peoria, IL, USA on May 11, 1998.
Obtained his PhD at age 24, but was dismissed from his job when Hitler came to power.
Became a school teacher, emigrated to Luxembourg, but soon fled to the USA.
Known as an excellent teacher.
He published just a few physics papers.
Between 1966 and 2000, at least 16,000 articles with content related to the Ising model were published.
Ferromagnetism arises when a collection of atomic spins align such that their
associated magnetic moments all point in the same direction, yielding a net
magnetic moment which is macroscopic in size. The simplest theoretical
description of ferromagnetism is called the Ising model. This model was invented
by Wilhelm Lenz in 1920: it is named after Ernst Ising, a student of Lenz who
chose the model as the subject of his doctoral dissertation in 1925.
http://cp3-origins.dk/content/movies/2013-10-25-1200-arthur.pdf/#/9
(Figures: a single magnetic dipole moment (spin); magnetic dipole moments randomly oriented in a paramagnetic sample; magnetic dipole moments aligned in a ferromagnetic sample; magnetic dipole moments with alternating alignment in an antiferromagnetic sample. Above T_C a ferromagnet becomes a paramagnet.)
• Named after the physicist Ernst Ising, who studied it in the 1920s
• It is a mathematical model in statistical mechanics
• It has since been used to model diverse phenomena in which bits of information, interacting in pairs, produce collective effects
• Serious applications include complicated models of ferromagnets, fluids, alloys, interfaces, nuclei, and subnuclear particles
• For us, we will use it to model the behavior of a ferromagnet, or rather a single domain within it
• We will account for the tendency of neighboring dipoles to align parallel to each other, while neglecting any longer-range interactions between dipoles
• Notation: for spin-up (↑) we have s_i = 1 and for spin-down (↓) we have s_i = -1
• The energy of a neighboring pair due to their interaction: -ε if parallel and +ε if anti-parallel
We write the energy as
U = -\varepsilon \sum_{\text{neighbouring pairs } i,j} s_i s_j

which is negative if parallel and positive if anti-parallel.
To predict the thermal behavior we need the partition function
Z = \sum_{\text{all possible sets of dipole alignments}} e^{-\beta U}
For N dipoles the number of terms in the sum is 2^N.
In 1D it is possible to solve by hand, as shown by Ingemar in the lecture.
In 2D an exact solution exists but it is complicated; it was found in the 1940s by Lars Onsager, a Norwegian who won the Nobel Prize in Chemistry in 1968. Can you imagine working with a 10 × 10 lattice, which has 2^100 ≈ 10^30 possible states?!
NO exact solution has ever been found in 3D, so we need approximations!
The Ising model
The Ising model for a ferromagnet is not only a very simple model which has a
phase transition, but it can also be used to describe phase transitions in a whole
range of other physical systems. The model is defined using the equation
H = -\sum_{\langle i,j \rangle} J_{ij} S_i S_j

where i and j designate points on a lattice and S_i takes the values ±1/2. The various different physical systems differ in the definition and sign of the various J_ij's.
Here we will consider the simple case of a 2 dimensional square lattice with
interactions only between nearest neighbors. In this case
E = -J \sum_{\langle i,j \rangle} S_i S_j

where j is summed over only the 4 nearest neighbours of i.
This model can be studied using the Metropolis method as described in the notes,
where the state can be changed by flipping a single spin. Note that the change in
energy due to flipping the kth spin from ↑ to ↓ is given by
\Delta E_k = J \sum_{j(k)} S_{j(k)}

where j(k) runs over the nearest neighbours of k. The only quantity which actually occurs in the calculation is

Z_k = \exp(-\Delta E_k / k_B T)
and this can only take one of five different values given by the number of
neighboring ↑ spins. Hence it is sensible to store these in a short array before
starting the calculation. Note also that there is really only 1 parameter in the
model, J/kBT , so that it would make sense to write your program in terms of this
single parameter rather than J and T separately. The calculation should use
periodic boundary conditions, in order to avoid spurious effects due to
boundaries. There are several different ways to achieve this. One of the most
efficient is to think of the system as a single line of spins wrapped round a torus.
This way it is possible to avoid a lot of checking for the boundary. For an N × N system of spins, define an array of 2N^2 elements using the shortest sensible variable type: char in C(++).
It is easier to use 1 for spin ↑ and 0 for spin ↓, as this makes the calculation of the number of neighbouring ↑ spins easier. In order to map between spins in a 2d space, S_r (r = (x, y)), and in the 1d array, S_k, the following mapping can be used:
S_{r+\hat{x}} \to S_{k+1}
S_{r-\hat{x}} \to S_{k+N^2-1}
S_{r+\hat{y}} \to S_{k+N}
S_{r-\hat{y}} \to S_{k+N^2-N}
where the 2nd N^2 elements of the array are always maintained equal to the 1st N^2. This way it is never necessary to check whether one of the neighbours is over the edge. It is important to remember to change S_{k+N^2} whenever S_k is changed.
The calculation proceeds as follows:
1. Initialize the spins, either randomly or aligned.
2. Choose a spin to flip. It is better to choose a spin at random rather than
systematically as systematic choices can lead to spurious temperature gradients
across the system.
3. Decide whether to flip the spin by using the Metropolis condition.
4. If the spin is to be flipped, do so but remember to flip its mirror in the array.
5. Update the energy and magnetization.
6. Add the contributions to the required averages.
7. Return to step 2 and repeat.
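A C sketch of this procedure (illustrative, not from the original notes) using the mirrored-array indexing described above, with spins stored as 1 (↑) and 0 (↓). It uses the s = ±1 spin convention, so ΔE = 2Js_k Σ_j s_j and only five acceptance factors are needed; the lattice size, temperature, number of sweeps and seed are assumptions.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define N   32                                  /* lattice is N x N (illustrative size) */
#define NSQ (N * N)

/* Spins are stored as 1 (up) and 0 (down) in an array of 2*N*N chars whose
 * second half always mirrors the first, so the neighbours of site k are
 * simply k+1, k+N, k+NSQ-1 and k+NSQ-N with no boundary checks. */
static char spin[2 * NSQ];

static int up_neighbours(int k)
{
    return spin[k + 1] + spin[k + N] + spin[k + NSQ - 1] + spin[k + NSQ - N];
}

int main(void)
{
    double betaJ = 0.44;                        /* J/(kB*T), near the 2D critical point */
    double w[5];                                /* the five possible acceptance factors, */
    for (int n = 0; n <= 4; n++)                /* indexed by the number of up neighbours */
        w[n] = exp(-2.0 * betaJ * (2 * n - 4));

    srand(2024);
    for (int k = 0; k < 2 * NSQ; k++)           /* step 1: start fully aligned (all up) */
        spin[k] = 1;

    long M = NSQ;                               /* magnetisation, updated incrementally */
    for (long sweep = 0; sweep < 1000; sweep++) {
        for (int attempt = 0; attempt < NSQ; attempt++) {
            int k = rand() % NSQ;               /* step 2: pick a spin at random */
            int n = up_neighbours(k);
            /* step 3: acceptance factor is w[n] for an up spin, w[4-n] for a down spin */
            double a = spin[k] ? w[n] : w[4 - n];
            if (a >= 1.0 || (double)rand() / RAND_MAX < a) {
                spin[k] = 1 - spin[k];          /* step 4: flip the spin... */
                spin[k + NSQ] = spin[k];        /* ...and its mirror copy */
                M += spin[k] ? 2 : -2;          /* step 5: update the magnetisation */
            }
        }
    }
    printf("magnetisation per spin = %f\n", (double)M / NSQ);
    return 0;
}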
Quantum Monte Carlo Calculation
One does QMC for the same reason as one does classical simulations; there is
no other method able to treat exactly the quantum many-body problem aside from
the direct simulation method where electrons and ions are directly represented as
particles, instead of as a “fluid” as is done in mean-field based methods.
However, quantum systems are more difficult than classical systems because one does not know the distribution to be sampled; it must be solved for. In fact, it is not known today which quantum problems can be "solved" by simulation on a classical computer in a reasonable amount of computer time.
Quantum Monte Carlo Calculation
Introduction
This project is to use the variational quantum Monte Carlo method to calculate the
ground state energy of the He atom. The He atom is a two electron problem which
cannot be solved analytically and so numerical methods are necessary. Quantum
Monte Carlo is one of the more interesting of the possible approaches (although
there are better methods for this particular problem).
The Method
The Schrödinger equation for the He atom in atomic units is

\left[ -\tfrac{1}{2}\nabla_{r_1}^2 - \tfrac{1}{2}\nabla_{r_2}^2 - \frac{2}{r_1} - \frac{2}{r_2} + \frac{1}{r_{12}} \right] \Psi(r_1, r_2) = E\,\Psi(r_1, r_2)

where r_1 and r_2 are the position vectors of the two electrons, r_1 = |r_1|, r_2 = |r_2| and r_{12} = |r_1 - r_2|.
Energies are in units of the Hartree energy (1 Hartree = 2 Rydbergs) and
distances are in units of the Bohr radius. The ground state spatial wavefunction is
symmetric under exchange of the two electrons (the required antisymmetry is
taken care of by the spin part of the wavefunction, which we can forget about
otherwise).
The expression for the energy expectation value of a particular trial wavefunction, Ψ_T(r_1, r_2), is
E_T = \frac{\int \cdots \int \Psi_T^* H \Psi_T \, d^3r_1\, d^3r_2}{\int \cdots \int \Psi_T^* \Psi_T \, d^3r_1\, d^3r_2}
In the variational Monte Carlo method, this equation is rewritten in the form,
E_T = \int \cdots \int \left[ \frac{1}{\Psi_T} H \Psi_T \right] f(r_1, r_2) \, d^3r_1\, d^3r_2
where
f(r_1, r_2) = \frac{\Psi_T^*(r_1, r_2)\, \Psi_T(r_1, r_2)}{\int \cdots \int \Psi_T^* \Psi_T \, d^3r_1\, d^3r_2}
is interpreted as a probability density which is sampled using the Metropolis
algorithm. Note that the Metropolis algorithm only needs to know ratios of the
probability density at different points, and so the normalisation integral,
\int \cdots \int \Psi_T^* \Psi_T \, d^3r_1\, d^3r_2, always cancels out and does not need to be evaluated.
The mean of the values of the “local energy”,
\frac{1}{\Psi_T} H \Psi_T
at the various points along the Monte Carlo random walk then gives an
estimate of the energy expectation value. By the variational principle, the exact
energy expectation value is always greater than or equal to the true ground
state energy; but the Monte Carlo estimate has statistical errors and may lie
below the true ground state energy if these are large enough. Anyway, the
better the trial wavefunction, the closer to the true ground state energy the
variational estimate should be.
In this project you are given a possible trial wavefunction,
\Psi = e^{-2 r_1}\, e^{-2 r_2}\, e^{r_{12}/2}
Use the variational Monte Carlo technique to calculate variational estimates of
the true ground state energy.
Before you start programming, you will need the analytic expressions for the local
energy. This involves some nasty algebra, but the answer is,
r r
1
17 r  r
H     1 12  2 12

4
r1r12
r1r12
The Monte Carlo moves can be made by generating random numbers (use a
library routine to do this) and adding them to the electron coordinates. I suggest
that you update all six electron position coordinates (x1, y1, z1, x2, y2, z2) each
move, and so you will need six random numbers each time. The accepted lore is
that the Metropolis algorithm is most efficient when the step size is chosen to
keep the acceptance probability close to 0.5. However, the method should work in
principle no matter what the step size and you should try a few different step
sizes to confirm that this is indeed the case. The starting positions of the two
electrons can be chosen randomly, but remember that the Metropolis algorithm
only samples the probability distribution exactly in the limit as the number of
moves tends to infinity. You will therefore have to throw away the results from the
moves near the beginning of the run and only start accumulating the values of the
local energy once things have settled down. You should experiment to find out
how many moves you need to throw away.
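A C sketch of the whole procedure for the trial wavefunction above (illustrative, not from the original notes): Metropolis moves of all six coordinates sample |Ψ_T|^2, the early moves are discarded, and the local energy is averaged over the rest. The step size, move counts and seed are assumptions to be tuned as described.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

static double uniform01(void) { return (double)rand() / RAND_MAX; }

/* ln of the trial wavefunction psi = exp(-2 r1) exp(-2 r2) exp(r12/2) */
static double log_psi(const double r1[3], const double r2[3])
{
    double a = 0.0, b = 0.0, c = 0.0;
    for (int i = 0; i < 3; i++) {
        double d = r1[i] - r2[i];
        a += r1[i] * r1[i];
        b += r2[i] * r2[i];
        c += d * d;
    }
    return -2.0 * sqrt(a) - 2.0 * sqrt(b) + 0.5 * sqrt(c);
}

/* local energy (1/psi) H psi = -17/4 + r1.r12/(r1 r12) - r2.r12/(r2 r12) */
static double local_energy(const double r1[3], const double r2[3])
{
    double n1 = 0.0, n2 = 0.0, n12 = 0.0, d1 = 0.0, d2 = 0.0;
    for (int i = 0; i < 3; i++) {
        double v12 = r1[i] - r2[i];
        n1 += r1[i] * r1[i];   n2 += r2[i] * r2[i];   n12 += v12 * v12;
        d1 += r1[i] * v12;     d2 += r2[i] * v12;
    }
    n1 = sqrt(n1);  n2 = sqrt(n2);  n12 = sqrt(n12);
    return -17.0 / 4.0 + d1 / (n1 * n12) - d2 / (n2 * n12);
}

int main(void)
{
    double r1[3], r2[3];
    double step   = 0.5;                 /* trial move size (tune for ~50% acceptance) */
    long   nmoves = 200000, nequil = 20000, naccum = 0;
    double esum = 0.0;

    srand(7);
    for (int i = 0; i < 3; i++) {        /* random starting positions */
        r1[i] = uniform01() - 0.5;
        r2[i] = uniform01() - 0.5;
    }
    double lp = log_psi(r1, r2);

    for (long m = 0; m < nmoves; m++) {
        double t1[3], t2[3];
        for (int i = 0; i < 3; i++) {    /* move all six coordinates at once */
            t1[i] = r1[i] + step * (uniform01() - 0.5);
            t2[i] = r2[i] + step * (uniform01() - 0.5);
        }
        double lp_new = log_psi(t1, t2);
        /* Metropolis: accept with probability |psi_new|^2 / |psi_old|^2 */
        if (uniform01() < exp(2.0 * (lp_new - lp))) {
            for (int i = 0; i < 3; i++) { r1[i] = t1[i]; r2[i] = t2[i]; }
            lp = lp_new;
        }
        if (m >= nequil) {               /* discard the equilibration moves */
            esum += local_energy(r1, r2);
            naccum++;
        }
    }
    printf("variational energy estimate = %f Hartree\n", esum / naccum);
    return 0;
}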
The Physics
In general it is advisable to run the program for some time to allow it to reach
equilibrium before trying to calculate any averages. Close to a phase transition it is
often necessary to run for much longer to reach equilibrium. The behaviour of the
total energy during the run is usually a good guide to whether equilibrium has
been reached. The total energy, E, and the magnetisation can be calculated from

M = \frac{1}{N} \sum_i S_i
It should be possible to calculate these as you go along, by accumulating the
changes rather than by recalculating the complete sum after each step. A lattice
should suffice for most purposes and certainly for testing, but you may require a
much bigger lattice close to a transition. A useful trick is to use the final state at
one temperature as the initial state for the next slightly different temperature. That
way the system won't need so long to reach equilibrium.
It should be possible to calculate the specific heat and the magnetic susceptibility.
The specific heat could be calculated by differentiating the energy with respect to
temperature. This is a numerically questionable procedure however. Much better
is to use the relationship
C_v = \frac{1}{N} \left( \frac{J}{k_B T} \right)^2 \left[ \langle E^2 \rangle - \langle E \rangle^2 \right]
Similarly, in the paramagnetic state, the susceptibility can be calculated using
\chi = \frac{1}{N} \left( \frac{J}{k_B T} \right) \left[ \langle M^2 \rangle - \langle M \rangle^2 \right]
where the averages are over different states, i.e. they can be calculated by averaging over the different Metropolis steps. Both these quantities are expected to diverge at the transition, but the divergence will tend to be rounded off due to the small size of the system. Note, however, that the fact that these two expressions have the form of variances, and that these diverge at the transition, indicates that the average energy and magnetisation will be subject to large fluctuations around the transition. Finally, a warning. A common error made in such calculations is to add a contribution to the averages only when a spin is flipped. In fact this is wrong, as the fact that a spin isn't flipped means that the original state has a higher probability of occupation.
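A short C sketch of the bookkeeping this warning implies (illustrative, not from the original notes): the running E and M are added to the tallies after every Metropolis step, flipped or not, and the fluctuation formulas above are applied at the end. The structure, function names and the dummy values in main() are assumptions.

#include <stdio.h>

/* Running sums of E, E^2, M, M^2 and the number of Metropolis steps tallied. */
struct tallies { double e, e2, m, m2; long n; };

/* Called after EVERY Metropolis step, whether or not the spin was flipped. */
static void tally(struct tallies *t, double E, double M)
{
    t->e += E;   t->e2 += E * E;
    t->m += M;   t->m2 += M * M;
    t->n++;
}

/* Fluctuation formulas from the text (reduced units, betaJ = J/(kB*T)):
 *   Cv  = (1/N) (betaJ)^2 ( <E^2> - <E>^2 )
 *   chi = (1/N) (betaJ)   ( <M^2> - <M>^2 )                                 */
static void report(const struct tallies *t, int nspins, double betaJ)
{
    double e = t->e / t->n, e2 = t->e2 / t->n;
    double m = t->m / t->n, m2 = t->m2 / t->n;
    printf("Cv  = %f\n", betaJ * betaJ * (e2 - e * e) / nspins);
    printf("chi = %f\n", betaJ * (m2 - m * m) / nspins);
}

int main(void)
{
    struct tallies t = {0.0, 0.0, 0.0, 0.0, 0};
    /* In a real run these values would come from the Metropolis loop; here a
     * few dummy (E, M) pairs simply exercise the bookkeeping. */
    tally(&t, -180.0, 60.0);
    tally(&t, -178.0, 58.0);
    tally(&t, -178.0, 58.0);   /* repeated because the move was rejected */
    tally(&t, -182.0, 62.0);
    report(&t, 100, 0.44);
    return 0;
}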