RANDOM NUMBERS AND
MONTE CARLO METHODS
PER LÖTSTEDT
(excerpt adapted for Beräkningsvetenskap I)
Div of Scientific Computing, Dept of Information Technology
Uppsala University, SE-75105 Uppsala, Sweden
email: [email protected]

1 Introduction
Monte Carlo methods use sequences of random numbers to solve problems in,
for example, mathematics, physics and chemistry. The reasons for using random
numbers are:
1. a stochastic process is simulated, where a variable varies randomly in time,
2. a very complicated process is simulated, for which neither analytical
solutions nor analysis are available,
3. the problem is of high dimension, such as an integral in many dimensions.
In the first example, the problem is described by a stochastic model such as a
random walk. Let $X_n$ be the position of a particle in the $x$-direction at time $t_n$.
Then the position is changed by $\Delta X_n$ so that at time $t_{n+1}$ the particle is at
$$X_{n+1} = X_n + \Delta X_n.$$
If the displacement $\Delta X_n$ is chosen randomly, then we have a random walk process.
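The random walk can be sketched in a few lines of Python. The step distribution is an assumption made only for illustration (each displacement is +1 or -1 with equal probability); the text only requires that the displacement is chosen randomly. The names `random_walk`, `n_steps` and `seed` are hypothetical.

```python
import random

def random_walk(n_steps, x0=0.0, seed=None):
    """Return the positions X_0, X_1, ..., X_{n_steps} of a 1D random walk.

    Assumption for illustration: each displacement is +1 or -1 with
    equal probability. Any other random displacement could be used.
    """
    rng = random.Random(seed)
    positions = [x0]
    for _ in range(n_steps):
        dx = rng.choice((-1.0, 1.0))          # random displacement dX_n
        positions.append(positions[-1] + dx)  # X_{n+1} = X_n + dX_n
    return positions

path = random_walk(10, seed=1)
print(path)
```

Fixing the seed makes the walk reproducible, which is convenient when the same experiment has to be repeated.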
In the second example there is no known way of solving the problem other than
simulating its behavior using randomness. In a biological system, molecules
react with each other, collide and move in space. We are interested in the large
scale behavior of the system and not all the details. It is sometimes very difficult
to derive a deterministic model such as a differential equation for the large scale.
A solution is then to simulate all the reactions, collisions and movements in
the system with a limited number of molecules and draw conclusions from the
simulations.
In the last example, a problem that can in principle be solved with a
deterministic method is instead solved with a stochastic algorithm. The reason is that
the deterministic algorithm is so computationally expensive that the stochastic
one is preferred. The computational work of the deterministic method grows too
quickly when the dimension of the integral increases, whereas it grows much more
slowly for the stochastic algorithm.
A discussion of Monte Carlo methods is found in [1, 2, 3].

2 Random number generation
A Monte Carlo method needs a reliable way of generating random numbers.
Because it is difficult to compute perfectly random numbers, most generators
compute pseudo-random numbers instead. They mimic the behavior of true random numbers
but are generated in a deterministic and predictable way. Since many millions
of random numbers are required in Monte Carlo simulations, the pseudo-random
numbers must be easily computed. They should also pass a number of statistical
tests of randomness. If the sequence of numbers is repeated after some period,
that period should be long. In some applications, it is important that the same
sequence can be generated many times for parameter tests. Two pseudo-random
generators satisfying these requirements are the linear congruential generator and
the Fibonacci generator.

2.1 Linear congruential generator
A sequence of integers $x_i$ is generated by
$$x_i = (a x_{i-1} + b) \bmod M, \quad i = 1, 2, \ldots \tag{1}$$
The parameters of the method, $a$, $b$, $x_0$ and $M$, are non-negative integers such that
$a, b, x_0 < M$ and $a \neq 0$. The sequence is initialized by the seed $x_0$ and has the
following properties:
1. $0 \le x_i \le M - 1$,
2. the period is at most $M$.
The first property follows from the definition of mod. For the second, note that
$x_i$ can take only $M$ different values, so in a sequence of $M + 1$ numbers $x_i$ at
least two of them must be equal. Suppose that $x_j = x_{j+p}$ for some $p > 0$. Since each
number is determined by its predecessor, the sequence $x_j, x_{j+1}, \ldots, x_{j+p-1}$ is
identical to the sequence $x_{j+p}, \ldots, x_{j+2p-1}$, and the period is $p \le M$. The upper
bound $M$ on the period length should be chosen large, and preferably $a > 0$ and $b \ge 0$.
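The recursion (1) and the bound on the period can be checked with a short Python sketch. The small parameter values below are assumptions chosen only so that the whole cycle can be inspected by hand; they are not recommended for real use. The helper names `lcg` and `period` are hypothetical.

```python
from itertools import islice

def lcg(a, b, M, x0):
    """Yield x_1, x_2, ... from x_i = (a*x_{i-1} + b) mod M."""
    x = x0
    while True:
        x = (a * x + b) % M
        yield x

def period(a, b, M, x0):
    """Length of the cycle that the sequence started at x0 eventually enters."""
    seen = {x0: 0}              # value -> index of its first occurrence
    x = x0
    i = 0
    while True:
        i += 1
        x = (a * x + b) % M
        if x in seen:           # a repeated value closes the cycle
            return i - seen[x]
        seen[x] = i

print(list(islice(lcg(5, 3, 16, 7), 4)))   # [6, 1, 8, 11]
print(period(5, 3, 16, 7))                 # 16, the maximum possible for M = 16
print(period(3, 0, 16, 1))                 # 4, far below the bound M
```

The two calls to `period` illustrate that the bound $p \le M$ is attained only for suitable $a$ and $b$; a poor choice gives a much shorter cycle.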
Given $x_i$, compute $u_i = x_i / M$. With properly chosen $a$, $b$ and $M$, $u_i$ will
be approximately uniformly (or rectangularly) distributed over the interval $[0, 1)$,
so that $u_i \sim U(0, 1)$ with probability density function $f(x) = 1$ for $x \in [0, 1)$.
Values that have been used in the past are $a = 2^{16} + 3$, $b = 0$ and $M = 2^{31}$, see
[4]. However, this choice of values has deficiencies, see [5]. If we need a uniform
distribution in the interval $[c, d)$, the transformation $v_i = c + (d - c) u_i$ gives
$v_i \sim U(c, d)$. In MATLAB, rand(n) generates a square matrix of dimension $n$ with
uniformly distributed elements in $(0, 1)$.
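As a minimal sketch, the scaling to $[0, 1)$ and the transformation to $[c, d)$ might look as follows in Python. The default parameters are the historical values quoted above and are used only as an illustration, since they are known to be deficient. The names `lcg_uniform` and `rescale`, and the seed value 1, are assumptions.

```python
def lcg_uniform(n, a=2**16 + 3, b=0, M=2**31, x0=1):
    """Return n numbers u_i = x_i / M in [0, 1) from the generator (1).

    The default a, b, M are the historical values from the text,
    shown for illustration only.
    """
    u = []
    x = x0
    for _ in range(n):
        x = (a * x + b) % M
        u.append(x / M)
    return u

def rescale(u, c, d):
    """Map numbers u_i in [0, 1) to v_i = c + (d - c)*u_i in [c, d)."""
    return [c + (d - c) * ui for ui in u]

u = lcg_uniform(10000)
v = rescale(u, -1.0, 3.0)
print(sum(u) / len(u))   # should be close to 0.5, the mean of U(0, 1)
print(sum(v) / len(v))   # should be close to 1.0, the mean of U(-1, 3)
```

A sample mean near the expected value is only a very weak check; it does not reveal the deficiencies mentioned in [5].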

3 Monte Carlo integration
Suppose that we are interested in evaluating the integral
$$I_0 = \int_0^1 g(x)\,dx. \tag{2}$$

3.1 Integral as a sum of random function values
Let the sequence $x_1, x_2, \ldots, x_N$ consist of uniformly distributed numbers in $[0, 1)$.
A Monte Carlo method for $I_0$ in (2) is based on the idea that the average
$$\xi_N = \frac{1}{N} \sum_{i=1}^{N} g(x_i) \tag{3}$$
is an approximation of the integral.
If the integration interval is $[c, d]$, then
$$I_1 = \int_c^d g(x)\,dx = (d - c) \int_0^1 g(c + (d - c)y)\,dy,$$
and the integral is approximated by
$$\xi_N = (d - c)\,\frac{1}{N} \sum_{i=1}^{N} g(x_i),$$
where $x_i$ is a random number with distribution $U(c, d)$. This is generated by
letting $x_i = c + (d - c) y_i$ where $y_i \sim U(0, 1)$.
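The estimate on $[c, d]$ can be sketched in Python as below. The generator from Python's standard `random` module is used in place of a congruential generator, and the integrand $\sin x$ on $[0, \pi]$ (exact integral 2) is an assumed test case; `mc_integrate` is a hypothetical name.

```python
import math
import random

def mc_integrate(g, c, d, n, seed=None):
    """Estimate the integral of g over [c, d] as (d - c) * mean of g(x_i)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        y = rng.random()          # y_i ~ U(0, 1)
        x = c + (d - c) * y       # x_i ~ U(c, d)
        total += g(x)
    return (d - c) * total / n

estimate = mc_integrate(math.sin, 0.0, math.pi, 100_000, seed=0)
print(estimate)   # the exact value of the integral is 2
```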
From the definition of the expectation of $g(X)$ for a random variable $X$ with
uniform distribution we have
$$E[g(X)] = \int_0^1 g(x) f(x)\,dx = \int_0^1 g(x)\,dx.$$
An approximation of $E[g(X)]$ is $\xi_N$ in (3). From the law of large numbers we
conclude that
$$\lim_{N \to \infty} \xi_N = E[g(X)] = \int_0^1 g(x)\,dx.$$
The error in the approximation is
$$\varepsilon_N = I_0 - \xi_N.$$
The central limit theorem in statistics tells us that the error $\varepsilon_N$ is a stochastic
variable such that
$$\varepsilon_N \approx \frac{\sigma}{\sqrt{N}}\,\eta, \tag{4}$$
where $\eta$ is normally distributed $N(0, 1)$. The variance of $g$ is here denoted by $\sigma^2$
and is defined by
$$\sigma^2 = \int_0^1 (g(x) - I_0)^2\,dx. \tag{5}$$
The conclusion from (4) is that the error $\varepsilon_N$ with $N$ terms in the approximation
$\xi_N$ in (3) of $I_0$ in (2) decays at the rate $N^{-1/2}$ as $N$ increases.
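The $N^{-1/2}$ decay can be observed numerically. The sketch below assumes the test integrand $g(x) = x^2$, for which $I_0 = 1/3$ and, from (5), $\sigma^2 = 1/5 - 1/9 = 4/45$. It repeats the estimate many times and compares the root-mean-square error with $\sigma/\sqrt{N}$. The function names and the number of repetitions are assumptions.

```python
import math
import random

def mc_estimate(g, n, rng):
    """The average (3) of g over n uniform random numbers in [0, 1)."""
    return sum(g(rng.random()) for _ in range(n)) / n

def rms_error(g, exact, n, repetitions, rng):
    """Root-mean-square of the error I_0 - xi_N over independent repetitions."""
    sq = 0.0
    for _ in range(repetitions):
        sq += (exact - mc_estimate(g, n, rng)) ** 2
    return math.sqrt(sq / repetitions)

def g(x):
    return x * x

exact = 1.0 / 3.0
sigma = math.sqrt(4.0 / 45.0)
rng = random.Random(0)
for n in (100, 400, 1600):
    # the two printed error columns should be of similar size,
    # and both should halve when n is multiplied by 4
    print(n, rms_error(g, exact, n, 500, rng), sigma / math.sqrt(n))
```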
If we instead compute $I_0$ with the trapezoidal method, then with a constant
step size $h$ and $N = 1/h$, the formula is
$$I_0 \approx h \left( 0.5\,(g(0) + g(1)) + \sum_{i=1}^{N-1} g(ih) \right). \tag{6}$$
The error in the trapezoidal method is proportional to $h^2$ or $N^{-2}$. Compare this
with the error in the Monte Carlo method in (4). In one dimension, the error vanishes
much faster with the trapezoidal method than with the Monte Carlo method
when we add more evaluations of $g$ and $N$ increases. With $N$ 4 times as large,
the error in the Monte Carlo method decreases by a factor 2, but with the
trapezoidal method the error is 16 times smaller! It is difficult to improve the
factor $1/\sqrt{N}$ in (4), but $\sigma$ in (4) can be reduced by variance reduction methods.
Such methods are described in [1, 2].
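The factor 16 can be verified with a small Python sketch of the composite trapezoidal rule on $[0, 1]$, where the step size $h = 1/N$ multiplies the sum. The integrand $e^x$ (exact integral $e - 1$) and the values of $N$ are assumed test choices, and `trapezoid` is a hypothetical name.

```python
import math

def trapezoid(g, n):
    """Composite trapezoidal rule on [0, 1] with n subintervals, h = 1/n."""
    h = 1.0 / n
    interior = sum(g(i * h) for i in range(1, n))
    return h * (0.5 * (g(0.0) + g(1.0)) + interior)

exact = math.e - 1.0                            # integral of exp(x) over [0, 1]
err_n = abs(exact - trapezoid(math.exp, 50))
err_4n = abs(exact - trapezoid(math.exp, 200))
print(err_n / err_4n)                           # close to 16
```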
If the integral is defined over several dimensions the situation is different. Let
$D$ be the dimension and let $\mathcal{D}$ be the $D$-dimensional unit cube $[0, 1]^D$ with
edge length 1. The integral is then
$$I_D = \int_0^1 \int_0^1 \cdots \int_0^1 g(x)\,dx_1\,dx_2 \ldots dx_D = \int_{\mathcal{D}} g(x)\,dV,$$
where $x$ is a $D$-dimensional vector with the components $x_k$, $k = 1, 2, \ldots, D$, and
$dV$ is a volume element in $\mathcal{D}$. With the trapezoidal method we must evaluate $g(x)$
on a grid or lattice with the distance $h$ between the points in every dimension.
The number of points in each dimension will be $h^{-1} + 1$. The total number of
points will then be $N \approx h^{-D}$. The error still decays as $h^2 \approx N^{-2/D}$.
The Monte Carlo approximation of $I_D$ is
$$I_D \approx \frac{1}{N} \sum_{i=1}^{N} g(x_i).$$
Here, $x_i$ is a $D$-dimensional vector with $D$ independent, uniformly distributed
random numbers in $[0, 1)$. The method has an error that decreases like $N^{-1/2}$
independent of the number of dimensions. Compared with the deterministic
trapezoidal method, the error vanishes more rapidly with the Monte Carlo
approximation when $N^{-1/2} < N^{-2/D}$, i.e.
$$2/D < 1/2.$$
This is the case when the dimension is greater than 4.
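A minimal Python sketch of the $D$-dimensional approximation follows. The integrand $g(x) = \sum_k x_k^2$, whose integral over the unit cube is $D/3$, and the choice $D = 6$ are assumptions made so that the result can be checked; `mc_integrate_cube` is a hypothetical name.

```python
import random

def mc_integrate_cube(g, dim, n, seed=None):
    """Estimate the integral of g over the unit cube [0, 1]^dim."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(dim)]   # one random point in the cube
        total += g(x)
    return total / n

def g(x):
    """Test integrand: sum of squares of the components."""
    return sum(xk * xk for xk in x)

D = 6
print(mc_integrate_cube(g, D, 100_000, seed=0))   # the exact value is D/3 = 2
```

For comparison, a trapezoidal grid with $h = 0.1$ in six dimensions would already need $11^6 = 1771561$ evaluations of $g$.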
Figure 1: The function $g(x)$, $g_{\max}$, and the area $A$.

3.2 Integral as a fraction of an enclosing region
An alternative way of computing the integral $I_0$ is to determine the area under
$g(x)$ in the interval $[0, 1]$. If
$$g(x) \ge 0 \quad \text{and} \quad \max_{x \in [0,1]} g(x) \le g_{\max},$$
then generate $N$ pairs of independent and uniformly distributed random numbers
$(X_i, Y_i)$ with $X_i \sim U(0, 1)$, $Y_i \sim U(0, g_{\max})$. Let $M$ be the number of pairs such
that
$$Y_i \le g(X_i).$$
Then the area $A$ under $g(x)$ and the area $A_{\max}$ under $g_{\max}$ satisfy
$$\frac{A}{A_{\max}} \approx \frac{M}{N},$$
see Fig. 1. Since the interval has length 1, $A_{\max} = g_{\max}$, and the integral $I_0$ is
then approximated by
$$I_0 = A \approx \frac{M}{N}\, g_{\max}.$$
The value of $g_{\max}$ above does not have to be the exact maximum of the function, but it should be fairly close to it.
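The counting procedure can be sketched in Python as follows. The integrand $e^{-x^2}$ with the bound $g_{\max} = 1$ is an assumed test case, and the count of accepted pairs is called `hits` in the code to avoid confusion with the modulus of the generator. `hit_or_miss` is a hypothetical name.

```python
import math
import random

def hit_or_miss(g, g_max, n, seed=None):
    """Estimate the integral of g over [0, 1], assuming 0 <= g(x) <= g_max."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = rng.random()              # X_i ~ U(0, 1)
        y = g_max * rng.random()      # Y_i ~ U(0, g_max)
        if y <= g(x):                 # the point lies under the curve
            hits += 1
    return g_max * hits / n

def g(x):
    return math.exp(-x * x)           # 0 <= g(x) <= 1 on [0, 1]

exact = 0.5 * math.sqrt(math.pi) * math.erf(1.0)
print(hit_or_miss(g, 1.0, 100_000, seed=0), exact)
```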
For integrals in several dimensions one similarly generates uniformly distributed
random points in enclosing "volumes". If the region to be integrated
over is irregular, this region also has to be enclosed in a surrounding rectangular
region.
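As an illustration of enclosing an irregular region, the sketch below estimates the area of the unit disc by sampling points uniformly in the surrounding square $[-1, 1]^2$. The disc is an assumed example with known area $\pi$, and `area_by_enclosure` and `inside` are hypothetical names.

```python
import math
import random

def area_by_enclosure(inside, x_range, y_range, n, seed=None):
    """Estimate the area of a region enclosed in a rectangle.

    inside(x, y) returns True when the point belongs to the region.
    """
    rng = random.Random(seed)
    (x0, x1), (y0, y1) = x_range, y_range
    hits = 0
    for _ in range(n):
        x = rng.uniform(x0, x1)
        y = rng.uniform(y0, y1)
        if inside(x, y):
            hits += 1
    # fraction of points inside, times the area of the rectangle
    return (x1 - x0) * (y1 - y0) * hits / n

def unit_disc(x, y):
    return x * x + y * y <= 1.0

print(area_by_enclosure(unit_disc, (-1.0, 1.0), (-1.0, 1.0), 100_000, seed=0), math.pi)
```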
References
[1] R. E. Caflisch, Monte Carlo and quasi-Monte Carlo methods, Acta Numerica, 1998, p. 1–49.
[2] G. Dahlquist, Å. Björck, Numerical Methods, Dover, Mineola, NY, 2003.
[3] M. T. Heath, Scientific Computing, McGraw-Hill, New York, 1997.
[4] P. E. Kloeden, E. Platen, Numerical Solution of Stochastic Differential Equations, Springer, Berlin, 1992.
[5] R. Seydel, Tools for Computational Finance, 2nd ed., Springer, Berlin, 2004.