Monte Carlo simulation - University of South Carolina
Random Numbers and Simulation

Generating truly random numbers from a deterministic program is not possible
• Programs have been developed to generate pseudo-random numbers
• Values are generated from deterministic algorithms

© Fall 2011 John Grego and the University of South Carolina

Random Numbers

Pseudo-random deviates from a good generator can pass standard statistical tests for randomness
They appear to be independent and identically distributed
Random number generators for common distributions are available in R
Special techniques (STAT 740) may be needed as well

Monte Carlo Simulation

Some common uses of simulation
• Modeling stochastic behavior
• Calculating definite integrals
• Approximating the sampling distribution of a statistic (e.g., maximum of a random sample)

Modeling Stochastic Behavior

Buffon's needle
Random Walk
Observe \(X_1, X_2, \ldots\), where \(p = P(X_i = 1) = P(X_i = -1) = .5\), and study \(S_1, S_2, \ldots\), where
\[ S_i = \sum_{j=1}^{i} X_j \]
This is also called Gambler's ruin; each \(X_i\) represents a $1 bet with a return of $2 for a win and $0 for a loss.

Gambler's Ruin

The properties of a fair game (p = .5) are a lot more interesting than the properties of an unfair game (p ≠ .5)
Some properties of this process are easy to anticipate (e.g., \(E(S_i)\), which is 0 for a fair game)
Some properties are difficult to anticipate, and their study can be aided by simulation.
• Expected number of returns to 0
• Expected length of a winning streak
• Probability of going broke given an initial bank

Calculating Definite Integrals

In statistics, we often have to calculate difficult definite integrals (posterior distributions, expected values)
\[ I = \int_a^b h(x)\,dx \]
(here, x could be multidimensional)

Example 1
\[ I_1 = \int_0^1 \frac{4}{1+x^2}\,dx \]
Example 2 (the operators between the terms of the integrand are not legible in the source; minus signs are assumed)
\[ I_2 = \int_0^1 \int_0^1 \left(4 - x_1^2 - 2x_2^2\right)\,dx_2\,dx_1 \]

Hit-or-Miss Monte Carlo

Example 1
\[ h(x) = \frac{4}{1+x^2} \]
\[ \int_0^1 \frac{4}{1+x^2}\,dx = 4\left(\arctan(1) - \arctan(0)\right) = 4\left(\pi/4\right) = \pi \]
Determine c such that \(c \ge h(x)\) across the entire region of interest (here, c = 4)
Generate n random uniform \((X_i, Y_i)\) pairs, with the \(X_i\)'s from U[a,b] (here, U[0,1]) and the \(Y_i\)'s from U[0,c] (here, U[0,4])
Count the number of times (call this m) that \(Y_i\) is less than \(h(X_i)\)
Then \(I_1 \approx c(b-a)m/n\)
• I.e., (height)(width)(proportion under curve)

Classical Monte Carlo Integration

\[ I = \int_a^b h(x)\,dx \]
Take n random uniform values \(U_1, \ldots, U_n\) over [a,b] and estimate I using
\[ \hat{I} = \frac{b-a}{n} \sum_{i=1}^{n} h(U_i) \]
This method is straightforward, and it is actually more efficient (smaller variance for the same n) than Hit-or-Miss Monte Carlo

Expected Value of a Function of a Random Variable

Suppose X is a random variable with density f.
Find \(E[h(X)]\) for some function h, e.g., \(E(X^2)\) or \(E(\sin X)\)
\[ E[h(X)] = \int_{\mathcal{X}} h(x) f(x)\,dx \]
For n random values \(X_1, X_2, \ldots, X_n\) from the distribution of X (i.e., with density f),
\[ E[h(X)] \approx \frac{1}{n} \sum_{i=1}^{n} h(X_i) \]

Examples

Example 3: If X is a random variable with a N(10,1) distribution, find \(E(X^2)\)
Example 4: If Y is a random variable with a Beta(5,1) distribution, find \(E(-\ln Y)\)
There are more advanced methods of integration using simulation (Importance Sampling)

Integration

integrate() performs numerical integration for functions of a single variable (not using simulation techniques)
adapt() in the adapt package performs multivariate numerical integration

Approximating the Sampling Distribution of a Statistic

To perform inference (CIs, hypothesis tests) based on sample statistics, we need to know the sampling distribution of the statistics, at least up to an approximation
Example: \(X_1, X_2, \ldots, X_n\) ~ iid \(N(\mu, \sigma^2)\).
\[ T = \frac{\bar{X} - \mu}{s/\sqrt{n}} \]
has a t(n-1) distribution

What if the data's distribution is not known?
• Large sample: Central Limit Theorem
• Small sample: Normal theory or nonparametric procedures based on permutation distributions

If the population distribution is known, we can approximate the sampling distribution with simulation.
• Repeatedly (m times) generate random samples of size n from the population distribution
• Calculate a statistic (say, S) each time
• The empirical (observed) distribution of the S-values approximates the true distribution of S

Example

\(X_1, X_2, X_3, X_4\) ~ Expon(1)
What is the sampling distribution of:
• \(\bar{X}\) (the mean)
• \(\left(\max(X) + \min(X)\right)/2\) (the midrange)
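
The simulation recipe for a sampling distribution (generate m samples of size n, compute the statistic each time, look at the empirical distribution) can be sketched for the Expon(1) example with n = 4. This is a minimal sketch and not the course's own code: it uses Python's standard library rather than R, and the function name `simulate_statistics`, the seed, and m = 10000 are assumptions chosen for illustration.

```python
import random
import statistics

def simulate_statistics(m=10_000, n=4, rate=1.0, seed=540):
    """Draw m samples of size n from Expon(rate).

    Returns two lists of length m: the sample means and the sample midranges.
    (Hypothetical helper; name and defaults are illustrative assumptions.)
    """
    rng = random.Random(seed)
    means, midranges = [], []
    for _ in range(m):
        sample = [rng.expovariate(rate) for _ in range(n)]
        means.append(sum(sample) / n)
        midranges.append((max(sample) + min(sample)) / 2)
    return means, midranges

means, midranges = simulate_statistics()

# Summaries of the two empirical sampling distributions.
for label, values in (("mean", means), ("midrange", midranges)):
    deciles = statistics.quantiles(values, n=10)
    print(label,
          "center:", round(statistics.fmean(values), 3),
          "sd:", round(statistics.stdev(values), 3),
          "median:", round(deciles[4], 3))
```

For Expon(1) with n = 4 the sample mean has expectation 1 and standard deviation 0.5, and the midrange has expectation 7/6, so the printed centers should land near those values; a histogram of either list would show the shape of the corresponding sampling distribution.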
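
Returning to Examples 3 and 4, the sample-average estimate of E[h(X)] can be sketched as follows. This is an illustrative sketch in Python rather than R; the helper name `mc_expectation`, the seed, and n = 100000 are assumptions, not part of the slides.

```python
import math
import random

def mc_expectation(h, draw, n=100_000):
    """Estimate E[h(X)] by averaging h over n independent draws of X.

    `draw` is a zero-argument function returning one draw from the
    distribution of X. (Hypothetical helper for illustration.)
    """
    return sum(h(draw()) for _ in range(n)) / n

rng = random.Random(2011)

# Example 3: X ~ N(10, 1), h(x) = x^2. Exact value: Var(X) + E(X)^2 = 101.
est3 = mc_expectation(lambda x: x * x, lambda: rng.gauss(10, 1))

# Example 4: Y ~ Beta(5, 1), h(y) = -ln(y). Exact value: 1/5,
# because -ln(Y) is exponential with rate 5 when Y ~ Beta(5, 1).
est4 = mc_expectation(lambda y: -math.log(y), lambda: rng.betavariate(5, 1))

print("E(X^2) estimate:", round(est3, 3))
print("E(-ln Y) estimate:", round(est4, 4))
```

Both examples have closed-form answers, which makes them convenient checks on the simulation before applying the same estimator to a problem with no closed form.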
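
Returning to Example 1 of the integration slides, the Hit-or-Miss and Classical Monte Carlo estimators can be compared on the integral of 4/(1+x^2) over [0,1], whose exact value is π. This is a minimal sketch in Python rather than R; the function names `hit_or_miss` and `classical`, the seed, and n = 100000 are assumptions for illustration.

```python
import math
import random

def h(x):
    """Integrand of Example 1."""
    return 4.0 / (1.0 + x * x)

def hit_or_miss(h, a, b, c, n, rng):
    """Hit-or-Miss estimate of the integral of h over [a, b], assuming 0 <= h <= c."""
    m = 0  # number of points falling under the curve
    for _ in range(n):
        x = rng.uniform(a, b)
        y = rng.uniform(0.0, c)
        if y < h(x):
            m += 1
    return c * (b - a) * m / n  # (height)(width)(proportion under curve)

def classical(h, a, b, n, rng):
    """Classical Monte Carlo estimate: (b - a) times the average of h(U_i)."""
    total = sum(h(rng.uniform(a, b)) for _ in range(n))
    return (b - a) * total / n

rng = random.Random(740)
n = 100_000
est_hm = hit_or_miss(h, 0.0, 1.0, 4.0, n, rng)
est_cl = classical(h, 0.0, 1.0, n, rng)

print("hit-or-miss:", round(est_hm, 4))
print("classical:  ", round(est_cl, 4))
print("exact (pi): ", round(math.pi, 4))
```

Repeating both estimators many times with different seeds and comparing the spread of the results is one way to see the efficiency difference between the two methods empirically.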