Download Math 416/516: Stochastic Simulation

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Math 416/516: Stochastic Simulation
Haijun Li
[email protected]
Department of Mathematics
Washington State University
Week 5
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
1 / 23
Outline
1
Inverse Transform Method
2
Acceptance-Rejection Method
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
2 / 23
Generalized Inverse Function
Recall: Any cumulative distribution function (cdf) F (x) is a
non-decreasing, right-continuous function from zero at −∞ to 1 at +∞.
Definition
Let F (x) be a cdf. Then the generalized inverse of F (x) is defined by
F −1 (u) = inf{x : F (x) ≥ u}, ∀ u ∈ [0, 1].
If F (x) is continuous and strictly increasing, then F −1 (u) is the
usual inverse function of F (x). That is,
F (F −1 (u)) = u, for 0 ≤ u ≤ 1;
F −1 (F (x)) = x, for x ∈ R.
F (x) ≥ u if and only if x ≥ F −1 (u).
In general,
F (F −1 (u)) ≥ u, for 0 ≤ u ≤ 1;
Haijun Li
F −1 (F (x)) ≤ x, for x ∈ R.
Math 416/516: Stochastic Simulation
Week 5
3 / 23
Inverse Transform Method
Theorem
Let F (x) be a cdf and U be uniformly distributed over [0, 1]. Then
X = F −1 (U)
has the cdf F (x).
Algorithm (Sampling from F ):
1
Draw a number u from [0, 1] at random.
2
Calculate x = F −1 (u).
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
4 / 23
Example: Sampling from exponential distributions
Let X ∼ exp(λ); that is, F (x) = 1 − e−λx , x ≥ 0.
E(X ) = λ−1 , var(X ) = λ−2 . λ−1 = scale parameter.
Simulate a set of samples x1 , . . . , xn from this distribution.
Algorithm:
1
Draw n random numbers u1 , . . . , un from [0, 1] independently.
2
Calculate
1
xi = F −1 (ui ) = − log(1 − ui ), i = 1, 2, . . . , n.
λ
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
5 / 23
Example: Sampling from Fréchet distributions
−α
Let X ∼ Frechet(α); that is, F (x) = e−x , x ≥ 0. The shape
parameter α > 0 is known as the heavy tail index.
If 0 < α ≤ 1, E(X ) = ∞. If 0 < α ≤ 2, var(X ) = ∞.
For financial applications, α > 2.
In Internet data analysis, α can be smaller than 1.
Contribution: Maurice Fréchet (1927), Sir Ronald Fisher and
Leonard Tippett (1928), Emil Gumbel (1958).
Algorithm:
1
Draw n random numbers u1 , . . . , un from [0, 1] independently.
2
Calculate
xi = F −1 (ui ) = (− log ui )−1/α ,
3
i = 1, 2, . . . , n.
A set of samples {x1 , . . . , xn } is generated from Frechet(α).
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
6 / 23
Sampling from Bernoulli(p) via Inverse Transform
Let X be discrete with a Bernoulli distribution; that is,
P(X = 1) = p, P(X = 0) = 1 − p, where 0 ≤ p ≤ 1.
E(X ) = p, and var(X ) = p(1 − p).
Let F (x) denote the cdf of X , then
0 if 0 ≤ u ≤ 1 − p
−1
F (u) =
1 if 1 − p < u ≤ 1.
Algorithm:
1
Draw n random numbers u1 , . . . , un from [0, 1] independently.
2
Calculate
0 if 0 ≤ ui ≤ 1 − p
yi = F −1 (ui ) =
, i = 1, 2, . . . , n.
1 if 1 − p < ui ≤ 1.
3
A set of samples {y1 , . . . , yn } is generated from Bernoulli(p).
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
7 / 23
Discrete Distributions
Let X be discrete with probability mass function:
pi = P(X = vi ), i = 1, . . . , m,
where v1 ≤ v2 ≤ · · · ≤ vm are all the possible values of X .
Let F (x) denote its cdf, then

 0
Pk
F (x) =
i=1 pi

1
F −1 (u) = vk if
Haijun Li
Pk −1
i=1
pi < u ≤
if x < v1
if vk ≤ x < vk +1
if vm ≤ x.
Pk
i=1 pi .
Math 416/516: Stochastic Simulation
Week 5
8 / 23
Sampling from Discrete Distributions
Note that
Pm
i=1 pi
= 1, and p1 , . . . , pm are all non-negative.
Algorithm:
1
Draw a random numbers u from [0, 1].
2
Calculate

0 ≤ u ≤ p1
 v1 if P
P
−1
−1
vk if ki=1
pi < u ≤ ki=1 pi , 1 < k < m
x = F (u) =
P

vm if m−1
i=1 pi < u ≤ 1.
3
Repeat (1) and (2) n times to generate a set of samples
{x1 , . . . , xn } from the discrete cdf F (x).
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
9 / 23
Remarks
The inverse transform method is useful and effective especially for
sampling from univariate distributions.
The inverse transform method can be applied to generating
samples from multivariate distributions. For example, to generate
a sample from (X , Y ) with joint cdf F (x, y ), using the following
two-step procedure:
(1) Generating a value x from the marginal distribution F1 (x) of X .
(2) Generating a value y from the conditional distribution F2|1 (y |x)
of Y conditioning on X = x.
Both steps can be implemented by using the inverse transform
method, but this method becomes intractable for high-dimensional
distributions.
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
10 / 23
Generate more “easy samples”, then reject some.
Suppose that a (target) probability density function f (x) is difficult
to simulate directly.
But we know how to simulate samples from an alternative
probability density g(x), where g(x) ≥ f (x).
We simulate more samples from g(x), and then accept a portion
of them based on some rules.
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
11 / 23
Acceptance-Rejection Method
Again, let f (x) and g(x) denote two probability density functions such
that cg(x) ≥ f (x) for some constant c ≥ 1.
Theorem
Let X ∼ f (x), and Y ∼ g(y ). If U is uniformly distributed over [0, 1] and
independent of Y , then for any subset A,
f (Y ) .
P(X ∈ A) = P Y ∈ A U ≤
cg(Y )
|
{z
}
accept rule
That is,
f (Y )
X = Y U ≤
cg(Y )
d
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
12 / 23
Proof. Observe that
Z f (Y ) f (y ) P U≤
= P U≤
Y = y g(y )dy
cg(Y )
cg(y )
Z Z
f (y )
1
f (y )
= P U≤
g(y )dy = .
g(y )dy =
cg(y )
cg(y )
c
Similarly, for any set A,
Z f (Y ) f (y ) P Y ∈ A, U ≤
=
g(y )dy
P U≤
cg(Y )
cg(y )
A
Z
Z
f (y )
1
=
g(y )dy =
f (y )dy .
c A
A cg(y )
Therefore,
f (Y )
P
Y
∈
A,
U
≤
cg(Y )
f (Y )
P Y ∈AU≤
=
f (Y )
cg(Y )
P U≤
cg(Y )
Z
f (y )dy = P(X ∈ A).
=
A
Q.E.D.
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
13 / 23
Acceptance-Rejection Algorithm
Again, let f (x) and g(x) denote two probability density functions such
that cg(x) ≥ f (x) for some constant c ≥ 1.
Algorithm:
1
Generate a trial sample value y from the density g(y ).
2
Generate a random number u from the uniform distribution over
[0, 1].
3
Accept y and set x := y if
u≤
f (y )
;
cg(y )
otherwise, discard y and go to (1).
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
14 / 23
Remark
The overall probability of accepting a trial sample value is 1/c,
and so on average, about c sample values from g(y ) are needed
to generate one acceptable sample value from f (x).
c can be selected as
f (x)
.
g(x)
all x’s
c ∗ = max
A smaller c that is close to 1 is more efficient.
The uniform and exponential distributions are often chosen as the
alternative distribution g(x).
The Acceptance-Rejection method can be applied to Monte Carlo
simulations from multi-dimensional distributions.
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
15 / 23
Gamma Distribution
A random variable X is said to have a gamma distribution if it has
the pdf
λk x k −1 e−λx
, x > 0, where Γ(k ) =
f (x) =
Γ(k )
Z
∞
x k −1 e−x dx,
0
If k is a positive integer, then the gamma function Γ(k ) = (k − 1)!.
Widely used in reliability engineering and telecommunication.
E(X ) = λk , V (X ) =
k
.
λ2
k = shape parameter, λ = rate parameter, θ = λ−1 = scale
parameter.
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
16 / 23
A special case: When k = n (integer), the distribution is called an
Erlang distribution (widely used in telecommunication).
k = 1: Exponential distribution. The pdf is given by
f (x) = λe−λx , x ≥ 0.
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
17 / 23
Example: Sampling from Gamma Dist
Consider the case where k = 3/2, λ = 1 (note: Γ(3/2) =
r
x −x
f (x) = 2
e , x ≥ 0.
π
√
π/2):
To simulate samples from this distribution, we use the exponential
distribution as the alternative distribution g(x) = λe−λx , x ≥ 0.
Find the constant c such that
√
2 xe−x
f (x)
c = max
= max √
.
πλe−λx
x≥0
x≥0 g(x)
The maximum is attained at x ∗ = 0.5/(1 − λ) and
1
c=p
.
2eπλ2 (1 − λ)
The smallest c is most efficient. c is minimized at λ∗ = 2/3 with
minimal value c ∗ = 1.257.
That is, g(x) = 23 e−2x/3 and c ∗ = 1.257.
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
18 / 23
Example: Sampling from Gamma Dist (cont’d)
Algorithm:
1
Generate a trial sample y from the exponential distribution with
rate 2/3.
2
Generate a random number u from the uniform distribution over
[0, 1].
3
Accept y and set x := y if
f (y )
u≤
=2
cg(y )
r
2ey −y /3
e
;
3
otherwise, discard y and go to (1).
Remark: In order generate 100 samples from f (x), we need to
generate about 125.7 trial samples from g(y ).
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
19 / 23
Dealing With Unknown Normalizing Constant
Suppose that a probability density function
f (x) = k h(x), denoted by f (x) ∝ h(x)
where h(x) is a known function, but the normalizing constant k is
unknown.
Let g(y ) be a density function from which we know how to draw
samples. If
h(y )
≤ c, ∀ y ,
g(y )
then we can use the acceptance rule
u≤
h(y )
cg(y )
where y is drawn from g(y ) and u is drawn from the uniform
distribution over [0, 1] independently.
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
20 / 23
Example: Unknown Normalizing Constant
Simulate samples from the following density function:
√
2
2
f (x) ∝ e−x (1 − e− 1+x ), x ∈ R.
|
{z
}
h(x)
Observe that
2
e−x (1 − e−
√
1+x 2
)≤
√
2πφ(x)
where φ(x) is the standard normal density function.
Algorithm:
1
Generate a trial sample y from the normal distribution N(0, 1).
2
Generate a random number u from the uniform distribution over
[0, 1].
3
Accept y and set x := y if
√
2
2
h(y )
e−y (1 − e− 1+y )
=
;
u≤√
e−y 2 /2
2πφ(y )
otherwise, discard y and go to (1).
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
21 / 23
Example: 2-Dimensional Simulation
Simulate samples from the following density function:
1/π if x 2 + y 2 ≤ 1
f (x, y ) =
0
otherwise.
Let
g(x, y ) =
1/4 if −1 ≤ x, y ≤ 1
0
otherwise.
Note that g(x, y ) is the joint pdf of two uniformly distributed
random variables over [−1, 1].
Observe that
f (x, y )
4
c = max
= ≈ 1.273.
π
−1≤x,y ≤1 g(x, y )
It is easy to see that
f (x, y )
=
cg(x, y )
Haijun Li
1 if x 2 + y 2 ≤ 1
0 otherwise.
Math 416/516: Stochastic Simulation
Week 5
22 / 23
Example: 2-Dimensional Simulation Algorithm
Algorithm:
1
Generate a pair of independent trial samples (x, y ) from the
uniform distribution over [−1, 1].
2
Generate a random number u from the uniform distribution over
[0, 1].
3
Accept the pair (x, y ) if
u≤
f (x, y )
= 1, equivalently x 2 + y 2 ≤ 1;
cg(x, y )
otherwise, discard (x, y ) and go to (1).
Remark: In order to generate 100 samples, we need to generate
about 127.3 trial pairs from g(x, y ).
Haijun Li
Math 416/516: Stochastic Simulation
Week 5
23 / 23
Related documents