Survey
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project
Math 416/516: Stochastic Simulation Haijun Li [email protected] Department of Mathematics Washington State University Week 5 Haijun Li Math 416/516: Stochastic Simulation Week 5 1 / 23 Outline 1 Inverse Transform Method 2 Acceptance-Rejection Method Haijun Li Math 416/516: Stochastic Simulation Week 5 2 / 23 Generalized Inverse Function Recall: Any cumulative distribution function (cdf) F (x) is a non-decreasing, right-continuous function from zero at −∞ to 1 at +∞. Definition Let F (x) be a cdf. Then the generalized inverse of F (x) is defined by F −1 (u) = inf{x : F (x) ≥ u}, ∀ u ∈ [0, 1]. If F (x) is continuous and strictly increasing, then F −1 (u) is the usual inverse function of F (x). That is, F (F −1 (u)) = u, for 0 ≤ u ≤ 1; F −1 (F (x)) = x, for x ∈ R. F (x) ≥ u if and only if x ≥ F −1 (u). In general, F (F −1 (u)) ≥ u, for 0 ≤ u ≤ 1; Haijun Li F −1 (F (x)) ≤ x, for x ∈ R. Math 416/516: Stochastic Simulation Week 5 3 / 23 Inverse Transform Method Theorem Let F (x) be a cdf and U be uniformly distributed over [0, 1]. Then X = F −1 (U) has the cdf F (x). Algorithm (Sampling from F ): 1 Draw a number u from [0, 1] at random. 2 Calculate x = F −1 (u). Haijun Li Math 416/516: Stochastic Simulation Week 5 4 / 23 Example: Sampling from exponential distributions Let X ∼ exp(λ); that is, F (x) = 1 − e−λx , x ≥ 0. E(X ) = λ−1 , var(X ) = λ−2 . λ−1 = scale parameter. Simulate a set of samples x1 , . . . , xn from this distribution. Algorithm: 1 Draw n random numbers u1 , . . . , un from [0, 1] independently. 2 Calculate 1 xi = F −1 (ui ) = − log(1 − ui ), i = 1, 2, . . . , n. λ Haijun Li Math 416/516: Stochastic Simulation Week 5 5 / 23 Example: Sampling from Fréchet distributions −α Let X ∼ Frechet(α); that is, F (x) = e−x , x ≥ 0. The shape parameter α > 0 is known as the heavy tail index. If 0 < α ≤ 1, E(X ) = ∞. If 0 < α ≤ 2, var(X ) = ∞. For financial applications, α > 2. In Internet data analysis, α can be smaller than 1. Contribution: Maurice Fréchet (1927), Sir Ronald Fisher and Leonard Tippett (1928), Emil Gumbel (1958). Algorithm: 1 Draw n random numbers u1 , . . . , un from [0, 1] independently. 2 Calculate xi = F −1 (ui ) = (− log ui )−1/α , 3 i = 1, 2, . . . , n. A set of samples {x1 , . . . , xn } is generated from Frechet(α). Haijun Li Math 416/516: Stochastic Simulation Week 5 6 / 23 Sampling from Bernoulli(p) via Inverse Transform Let X be discrete with a Bernoulli distribution; that is, P(X = 1) = p, P(X = 0) = 1 − p, where 0 ≤ p ≤ 1. E(X ) = p, and var(X ) = p(1 − p). Let F (x) denote the cdf of X , then 0 if 0 ≤ u ≤ 1 − p −1 F (u) = 1 if 1 − p < u ≤ 1. Algorithm: 1 Draw n random numbers u1 , . . . , un from [0, 1] independently. 2 Calculate 0 if 0 ≤ ui ≤ 1 − p yi = F −1 (ui ) = , i = 1, 2, . . . , n. 1 if 1 − p < ui ≤ 1. 3 A set of samples {y1 , . . . , yn } is generated from Bernoulli(p). Haijun Li Math 416/516: Stochastic Simulation Week 5 7 / 23 Discrete Distributions Let X be discrete with probability mass function: pi = P(X = vi ), i = 1, . . . , m, where v1 ≤ v2 ≤ · · · ≤ vm are all the possible values of X . Let F (x) denote its cdf, then 0 Pk F (x) = i=1 pi 1 F −1 (u) = vk if Haijun Li Pk −1 i=1 pi < u ≤ if x < v1 if vk ≤ x < vk +1 if vm ≤ x. Pk i=1 pi . Math 416/516: Stochastic Simulation Week 5 8 / 23 Sampling from Discrete Distributions Note that Pm i=1 pi = 1, and p1 , . . . , pm are all non-negative. Algorithm: 1 Draw a random numbers u from [0, 1]. 2 Calculate 0 ≤ u ≤ p1 v1 if P P −1 −1 vk if ki=1 pi < u ≤ ki=1 pi , 1 < k < m x = F (u) = P vm if m−1 i=1 pi < u ≤ 1. 3 Repeat (1) and (2) n times to generate a set of samples {x1 , . . . , xn } from the discrete cdf F (x). Haijun Li Math 416/516: Stochastic Simulation Week 5 9 / 23 Remarks The inverse transform method is useful and effective especially for sampling from univariate distributions. The inverse transform method can be applied to generating samples from multivariate distributions. For example, to generate a sample from (X , Y ) with joint cdf F (x, y ), using the following two-step procedure: (1) Generating a value x from the marginal distribution F1 (x) of X . (2) Generating a value y from the conditional distribution F2|1 (y |x) of Y conditioning on X = x. Both steps can be implemented by using the inverse transform method, but this method becomes intractable for high-dimensional distributions. Haijun Li Math 416/516: Stochastic Simulation Week 5 10 / 23 Generate more “easy samples”, then reject some. Suppose that a (target) probability density function f (x) is difficult to simulate directly. But we know how to simulate samples from an alternative probability density g(x), where g(x) ≥ f (x). We simulate more samples from g(x), and then accept a portion of them based on some rules. Haijun Li Math 416/516: Stochastic Simulation Week 5 11 / 23 Acceptance-Rejection Method Again, let f (x) and g(x) denote two probability density functions such that cg(x) ≥ f (x) for some constant c ≥ 1. Theorem Let X ∼ f (x), and Y ∼ g(y ). If U is uniformly distributed over [0, 1] and independent of Y , then for any subset A, f (Y ) . P(X ∈ A) = P Y ∈ A U ≤ cg(Y ) | {z } accept rule That is, f (Y ) X = Y U ≤ cg(Y ) d Haijun Li Math 416/516: Stochastic Simulation Week 5 12 / 23 Proof. Observe that Z f (Y ) f (y ) P U≤ = P U≤ Y = y g(y )dy cg(Y ) cg(y ) Z Z f (y ) 1 f (y ) = P U≤ g(y )dy = . g(y )dy = cg(y ) cg(y ) c Similarly, for any set A, Z f (Y ) f (y ) P Y ∈ A, U ≤ = g(y )dy P U≤ cg(Y ) cg(y ) A Z Z f (y ) 1 = g(y )dy = f (y )dy . c A A cg(y ) Therefore, f (Y ) P Y ∈ A, U ≤ cg(Y ) f (Y ) P Y ∈AU≤ = f (Y ) cg(Y ) P U≤ cg(Y ) Z f (y )dy = P(X ∈ A). = A Q.E.D. Haijun Li Math 416/516: Stochastic Simulation Week 5 13 / 23 Acceptance-Rejection Algorithm Again, let f (x) and g(x) denote two probability density functions such that cg(x) ≥ f (x) for some constant c ≥ 1. Algorithm: 1 Generate a trial sample value y from the density g(y ). 2 Generate a random number u from the uniform distribution over [0, 1]. 3 Accept y and set x := y if u≤ f (y ) ; cg(y ) otherwise, discard y and go to (1). Haijun Li Math 416/516: Stochastic Simulation Week 5 14 / 23 Remark The overall probability of accepting a trial sample value is 1/c, and so on average, about c sample values from g(y ) are needed to generate one acceptable sample value from f (x). c can be selected as f (x) . g(x) all x’s c ∗ = max A smaller c that is close to 1 is more efficient. The uniform and exponential distributions are often chosen as the alternative distribution g(x). The Acceptance-Rejection method can be applied to Monte Carlo simulations from multi-dimensional distributions. Haijun Li Math 416/516: Stochastic Simulation Week 5 15 / 23 Gamma Distribution A random variable X is said to have a gamma distribution if it has the pdf λk x k −1 e−λx , x > 0, where Γ(k ) = f (x) = Γ(k ) Z ∞ x k −1 e−x dx, 0 If k is a positive integer, then the gamma function Γ(k ) = (k − 1)!. Widely used in reliability engineering and telecommunication. E(X ) = λk , V (X ) = k . λ2 k = shape parameter, λ = rate parameter, θ = λ−1 = scale parameter. Haijun Li Math 416/516: Stochastic Simulation Week 5 16 / 23 A special case: When k = n (integer), the distribution is called an Erlang distribution (widely used in telecommunication). k = 1: Exponential distribution. The pdf is given by f (x) = λe−λx , x ≥ 0. Haijun Li Math 416/516: Stochastic Simulation Week 5 17 / 23 Example: Sampling from Gamma Dist Consider the case where k = 3/2, λ = 1 (note: Γ(3/2) = r x −x f (x) = 2 e , x ≥ 0. π √ π/2): To simulate samples from this distribution, we use the exponential distribution as the alternative distribution g(x) = λe−λx , x ≥ 0. Find the constant c such that √ 2 xe−x f (x) c = max = max √ . πλe−λx x≥0 x≥0 g(x) The maximum is attained at x ∗ = 0.5/(1 − λ) and 1 c=p . 2eπλ2 (1 − λ) The smallest c is most efficient. c is minimized at λ∗ = 2/3 with minimal value c ∗ = 1.257. That is, g(x) = 23 e−2x/3 and c ∗ = 1.257. Haijun Li Math 416/516: Stochastic Simulation Week 5 18 / 23 Example: Sampling from Gamma Dist (cont’d) Algorithm: 1 Generate a trial sample y from the exponential distribution with rate 2/3. 2 Generate a random number u from the uniform distribution over [0, 1]. 3 Accept y and set x := y if f (y ) u≤ =2 cg(y ) r 2ey −y /3 e ; 3 otherwise, discard y and go to (1). Remark: In order generate 100 samples from f (x), we need to generate about 125.7 trial samples from g(y ). Haijun Li Math 416/516: Stochastic Simulation Week 5 19 / 23 Dealing With Unknown Normalizing Constant Suppose that a probability density function f (x) = k h(x), denoted by f (x) ∝ h(x) where h(x) is a known function, but the normalizing constant k is unknown. Let g(y ) be a density function from which we know how to draw samples. If h(y ) ≤ c, ∀ y , g(y ) then we can use the acceptance rule u≤ h(y ) cg(y ) where y is drawn from g(y ) and u is drawn from the uniform distribution over [0, 1] independently. Haijun Li Math 416/516: Stochastic Simulation Week 5 20 / 23 Example: Unknown Normalizing Constant Simulate samples from the following density function: √ 2 2 f (x) ∝ e−x (1 − e− 1+x ), x ∈ R. | {z } h(x) Observe that 2 e−x (1 − e− √ 1+x 2 )≤ √ 2πφ(x) where φ(x) is the standard normal density function. Algorithm: 1 Generate a trial sample y from the normal distribution N(0, 1). 2 Generate a random number u from the uniform distribution over [0, 1]. 3 Accept y and set x := y if √ 2 2 h(y ) e−y (1 − e− 1+y ) = ; u≤√ e−y 2 /2 2πφ(y ) otherwise, discard y and go to (1). Haijun Li Math 416/516: Stochastic Simulation Week 5 21 / 23 Example: 2-Dimensional Simulation Simulate samples from the following density function: 1/π if x 2 + y 2 ≤ 1 f (x, y ) = 0 otherwise. Let g(x, y ) = 1/4 if −1 ≤ x, y ≤ 1 0 otherwise. Note that g(x, y ) is the joint pdf of two uniformly distributed random variables over [−1, 1]. Observe that f (x, y ) 4 c = max = ≈ 1.273. π −1≤x,y ≤1 g(x, y ) It is easy to see that f (x, y ) = cg(x, y ) Haijun Li 1 if x 2 + y 2 ≤ 1 0 otherwise. Math 416/516: Stochastic Simulation Week 5 22 / 23 Example: 2-Dimensional Simulation Algorithm Algorithm: 1 Generate a pair of independent trial samples (x, y ) from the uniform distribution over [−1, 1]. 2 Generate a random number u from the uniform distribution over [0, 1]. 3 Accept the pair (x, y ) if u≤ f (x, y ) = 1, equivalently x 2 + y 2 ≤ 1; cg(x, y ) otherwise, discard (x, y ) and go to (1). Remark: In order to generate 100 samples, we need to generate about 127.3 trial pairs from g(x, y ). Haijun Li Math 416/516: Stochastic Simulation Week 5 23 / 23