Simulation and Pseudorandom Sequences

Lecture Series by Harald Niederreiter
National University of Singapore

1. Fundamentals of Simulation
2. Testing of Pseudorandom Numbers
3. Linear Generators
4. Nonlinear Generators I
5. Nonlinear Generators II

Fundamentals of Simulation

• An example
• The principle of simulation
• The Monte Carlo method
• Random vs. pseudorandom numbers
• Nonuniform random variates
• A brief bibliography

An example

There are many problems in computational mathematics that cannot be solved analytically. For instance, we do not know the exact analytic value of

$$\int_0^1 e^{-x^2}\,dx, \qquad \int_2^3 \frac{dx}{\log x}, \qquad \ldots$$

In such cases, we have to resort to numerical methods to obtain a good approximation to the exact value. Numerical methods have been developed for integration, optimization, solving differential equations, linear algebra problems, and so on.

However, there are challenging problems in computational mathematics for which standard numerical methods fail to yield good results.

Example: determine the area of a complicated domain $D$ in the plane.

Method: first enclose $D$ in a rectangle $R$. Then choose points from $R$ at random and estimate, by repeated experiments, the probability $p$ that they fall into $D$. Now

$$p = \frac{\operatorname{area}(D)}{\operatorname{area}(R)},$$

and so $\operatorname{area}(D) = p \cdot \operatorname{area}(R)$.

This is reminiscent of Buffon's needle experiment (Buffon, 1777). Given a floor with equally spaced parallel lines a distance $d$ apart, find the probability $P(d)$ that a needle of length $d$ thrown randomly on the floor will intersect a line. Answer:

$$P(d) = \frac{2}{\pi} = 0.6366\ldots$$

The principle of simulation

The area calculation above and Buffon's needle experiment are both instances of a simulation method. Generally speaking, a simulation method in computational mathematics is a probabilistic method using repeated experiments. Important ingredients are random samples and/or stochastic processes.
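Both experiments described above can be tried on a computer. The following Python sketch is only an illustration under stated assumptions: the domain $D$ is taken to be the unit disk inside the square $R = [-1, 1]^2$ (so the true area is $\pi$), and the function names `estimate_area`, `buffon_probability` and `in_unit_disk` are hypothetical, not part of the lecture.

```python
import math
import random


def estimate_area(in_domain, rect, n_samples, seed=0):
    """Hit-or-miss estimate of area(D) for D inside rect = (x0, x1, y0, y1)."""
    rng = random.Random(seed)
    x0, x1, y0, y1 = rect
    hits = 0
    for _ in range(n_samples):
        x = rng.uniform(x0, x1)
        y = rng.uniform(y0, y1)
        if in_domain(x, y):
            hits += 1
    p = hits / n_samples               # estimated probability of landing in D
    return p * (x1 - x0) * (y1 - y0)   # area(D) = p * area(R)


def buffon_probability(n_throws, d=1.0, seed=0):
    """Estimate P(d) for a needle of length d on lines a distance d apart."""
    rng = random.Random(seed)
    crossings = 0
    for _ in range(n_throws):
        centre = rng.uniform(0.0, d / 2)       # distance from needle centre to nearest line
        angle = rng.uniform(0.0, math.pi / 2)  # acute angle between needle and the lines
        if centre <= (d / 2) * math.sin(angle):
            crossings += 1
    return crossings / n_throws


def in_unit_disk(x, y):
    # Illustrative choice of D (an assumption): the unit disk, with area pi.
    return x * x + y * y <= 1.0


if __name__ == "__main__":
    print(estimate_area(in_unit_disk, (-1.0, 1.0, -1.0, 1.0), 100_000))
    print(buffon_probability(200_000))
```

With sample sizes of this order, the two printed values should lie close to $\pi$ and to $2/\pi$ respectively. The exact digits depend on the seed.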
In the modern age, simulation experiments are not done physically, but are run on a computer.

Applications of simulation technology:
• solve complicated computational problems;
• run probabilistic algorithms in mathematics and computer science.

The Monte Carlo method

Go back to our area calculation. Write

$$\operatorname{area}(D) = \int_R \chi_D(x)\,dx$$

with $\chi_D$ being the characteristic (or indicator) function of $D$. Let $x_1, \ldots, x_N$ be independent, uniformly distributed random samples from $R$. Then

$$\frac{\operatorname{area}(D)}{\operatorname{area}(R)} = p \approx \frac{1}{N} \sum_{n=1}^{N} \chi_D(x_n).$$

On the other hand,

$$\frac{\operatorname{area}(D)}{\operatorname{area}(R)} = \int_R \chi_D(x)\,d\mu,$$

where $d\mu = (\operatorname{area}(R))^{-1}\,dx$ is the differential of a probability measure $\mu$ on $R$. Thus,

$$\int_R \chi_D(x)\,d\mu \approx \frac{1}{N} \sum_{n=1}^{N} \chi_D(x_n).$$

Generalize: Let $(X, \mathcal{B}, \lambda)$ be an arbitrary probability space and let $f$ be a real-valued $\lambda$-integrable function on $X$. We want to approximately compute the integral

$$E(f) = \int_X f\,d\lambda.$$

We choose independent $\lambda$-distributed samples $x_1, \ldots, x_N$ from $X$ and use the Monte Carlo approximation

$$E(f) \approx \frac{1}{N} \sum_{n=1}^{N} f(x_n).$$

In the language of statistics, we are approximating an expected value by a sample mean.

The Monte Carlo method can be applied whenever the quantity to be computed can be expressed as an expected value. Besides numerical integration, it is used for differential equations, integral equations, numerical linear algebra, and so on.

The strong law of large numbers shows that

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(x_n) = E(f) \quad \text{a.e.},$$

where "a.e." means "with probability 1 on the sample space".

To determine the rate of convergence, we assume that $f \in L^2(\lambda)$, i.e., that $f$ is square integrable. We introduce the variance

$$\sigma^2(f) = \int_X (f - E(f))^2\,d\lambda < \infty.$$

Th. 1.1. If $f \in L^2(\lambda)$, then for any $N \ge 1$ we have

$$\int_X \cdots \int_X \left( \frac{1}{N} \sum_{n=1}^{N} f(x_n) - E(f) \right)^2 d\lambda(x_1) \cdots d\lambda(x_N) = \frac{\sigma^2(f)}{N}.$$

Proof (sketch). Put $g = f - E(f)$, so that $\int_X g\,d\lambda = 0$. Expanding the square gives $N^{-2} \sum_{m,n=1}^{N} g(x_m) g(x_n)$ under the integral. The terms with $m \ne n$ integrate to $0$ because the samples are independent, and each of the $N$ terms with $m = n$ integrates to $\sigma^2(f)$.

This means that the root-mean-square integration error is $\sigma(f) N^{-1/2}$. More precise information about the error can be obtained from the central limit theorem.

If we are in $\mathbb{R}^s$, then there are classical methods for numerical integration.
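The $N^{-1/2}$ error rate of Th. 1.1 can be checked empirically before turning to the classical rules. The Python sketch below is a minimal illustration, assuming $X = [0, 1]$ with the uniform distribution and $f(x) = e^{-x^2}$. The reference values `EXACT` and `SIGMA` are computed with `math.erf`, which is itself evaluated numerically, and the helper names `mc_integrate` and `rms_error` are hypothetical.

```python
import math
import random


def mc_integrate(f, n_samples, rng):
    """Monte Carlo estimate of the integral of f over [0, 1] (a sample mean)."""
    return sum(f(rng.random()) for _ in range(n_samples)) / n_samples


def rms_error(f, exact, n_samples, n_trials, seed=0):
    """Root-mean-square error of mc_integrate over independent repetitions."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        total += (mc_integrate(f, n_samples, rng) - exact) ** 2
    return math.sqrt(total / n_trials)


def f(x):
    return math.exp(-x * x)


# Reference value of E(f), expressed through the error function.
EXACT = 0.5 * math.sqrt(math.pi) * math.erf(1.0)

# sigma^2(f) = (integral of f^2 over [0, 1]) - E(f)^2, again via the error function.
SIGMA = math.sqrt(math.sqrt(math.pi / 8) * math.erf(math.sqrt(2.0)) - EXACT ** 2)

if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        observed = rms_error(f, EXACT, n, n_trials=200)
        # The observed RMS error should be close to sigma(f) / sqrt(N).
        print(n, observed, SIGMA / math.sqrt(n))
```

Each tenfold increase in $N$ should shrink the observed error by a factor of roughly $\sqrt{10}$, in line with the theorem.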
These classical methods are based on forming Cartesian products of one-dimensional integration rules such as the midpoint rule or the trapezoidal rule. For instance, if we use the Cartesian product of the trapezoidal rule with a total of $N$ nodes in dimension $s$, then we get an error bound $O(N^{-2/s})$ for sufficiently smooth integrands. Since

$$-\frac{1}{2} < -\frac{2}{s} \quad \text{for } s \ge 5,$$

the Monte Carlo convergence rate $N^{-1/2}$ is better than that of the Cartesian product of the trapezoidal rule for $s \ge 5$.

Random vs. pseudorandom numbers

We have seen that simulation methods, and in particular the Monte Carlo method, are based on random sampling. Ideally, random sampling would be executed by physical experiments (throwing dice, spinning roulette wheels, and so on). This would lead to truly random numbers.

However, a typical implementation of a Monte Carlo method requires about $10^5$ to $10^6$ random numbers. It would take too long to produce these random numbers by physical experiments. Further disadvantages:
• these random numbers need to be stored;
• these random numbers need to be tested extensively;
• no theory of physical random numbers can be developed.

In the computer age it is preferable to use machine-generated random numbers. This is definitely faster than physical experiments. Also, we have to store only a few parameters for the generation algorithm.

If random numbers are generated by a deterministic algorithm, we speak of pseudorandom numbers.

If the generation algorithm for pseudorandom numbers can be subjected to mathematical analysis, we can hope to obtain theoretical results which predict the properties of the pseudorandom numbers. This will reduce the need for extensive statistical testing.

Nonuniform random variates

Random and pseudorandom numbers are sampled according to a given distribution. The distribution is described by a distribution function $F$ on $\mathbb{R}$. This is a nondecreasing function (usually continuous) with

$$\lim_{t \to -\infty} F(t) = 0, \qquad \lim_{t \to \infty} F(t) = 1.$$

A standardized distribution function is the uniform distribution function $U$.
It is defined by $U(t) = 0$ for $t < 0$, $U(t) = t$ for $0 \le t \le 1$, and $U(t) = 1$ for $t > 1$.

Random (or pseudorandom) numbers simulating the uniform distribution are called uniform random (or pseudorandom) numbers. Random (or pseudorandom) numbers simulating any other distribution are called nonuniform random variates.

The generation of nonuniform random variates proceeds in two steps:
1. generate numbers simulating the uniform distribution $U$;
2. transform these numbers to fit the given distribution $F \ne U$.

Many methods are known for step 2, and they often depend on the nature of $F$.

A rather general method is the inversion method. Let $F$ be strictly increasing and continuous on $\mathbb{R}$. Then $F$ has an inverse function $F^{-1}$ which is defined at least on the open interval $(0, 1)$. Now take a sequence $x_1, x_2, \ldots \in (0, 1)$ simulating the uniform distribution. Then define

$$z_n = F^{-1}(x_n), \qquad n = 1, 2, \ldots$$

The sequence $z_1, z_2, \ldots$ simulates $F$ since

$$z_n \le t \iff x_n \le F(t),$$

and for uniformly distributed $x_n$ the event $x_n \le F(t)$ has probability $F(t)$.

A brief bibliography

• P. Bratley, B.L. Fox, and L.E. Schrage, A Guide to Simulation, Springer, New York, 1983.
• L. Devroye, Non-Uniform Random Variate Generation, Springer, New York, 1986.
• G.S. Fishman, Monte Carlo: Concepts, Algorithms, and Applications, Springer, New York, 1996.
• J.E. Gentle, Random Number Generation and Monte Carlo Methods, 2nd ed., Springer, New York, 2003.
• D.E. Knuth, The Art of Computer Programming, Vol. 2: Seminumerical Algorithms, 3rd ed., Addison-Wesley, Reading, MA, 1998.
• I. Peterson, The Jungles of Randomness: A Mathematical Safari, Wiley, New York, 1998.
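Returning to the inversion method from the section on nonuniform random variates, the two generation steps can be sketched in Python as follows. The choice of target distribution is an assumption made for illustration: the logistic distribution function $F(t) = 1/(1 + e^{-t})$, which is continuous and strictly increasing on $\mathbb{R}$ with $F^{-1}(x) = \log(x/(1 - x))$. All function names are hypothetical.

```python
import math
import random


def logistic_cdf(t):
    """F(t) = 1 / (1 + exp(-t)): continuous and strictly increasing on R."""
    return 1.0 / (1.0 + math.exp(-t))


def logistic_inverse_cdf(x):
    """F^{-1}(x) = log(x / (1 - x)) for x in the open interval (0, 1)."""
    return math.log(x / (1.0 - x))


def uniform_open(rng):
    """Step 1: uniform pseudorandom number in (0, 1).

    random() returns values in [0, 1), so an exact 0 is redrawn.
    """
    x = rng.random()
    while x == 0.0:
        x = rng.random()
    return x


def sample_by_inversion(inverse_cdf, n_samples, seed=0):
    """Step 2: transform uniform numbers x_n into z_n = F^{-1}(x_n)."""
    rng = random.Random(seed)
    return [inverse_cdf(uniform_open(rng)) for _ in range(n_samples)]


if __name__ == "__main__":
    z = sample_by_inversion(logistic_inverse_cdf, 100_000)
    for t in (-2.0, 0.0, 1.0):
        empirical = sum(1 for v in z if v <= t) / len(z)
        # The fraction of z_n <= t should be close to F(t).
        print(t, empirical, logistic_cdf(t))
```

The printed comparison is a direct check of the equivalence $z_n \le t \iff x_n \le F(t)$: the empirical fraction of samples below $t$ should approach $F(t)$ as the sample size grows.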