Computational Methods in Physics PHYS 3437
Dr Rob Thacker, Dept of Astronomy & Physics (MM-301C)
[email protected]

Today's Lecture
- Introduction to Monte Carlo methods
- Background
- Integration techniques

Introduction
- "Monte Carlo" refers to the use of random numbers to model random events that may represent a mathematical or physical problem
- Typically, MC methods require many millions of random numbers
- Of course, computers cannot actually generate truly random numbers, but we can make the period of repetition absolutely enormous
- Such pseudo-random number generators are based on arithmetic operations that truncate numbers to some of their significant digits
- See Numerical Recipes, pp. 266-280 (2nd edition, FORTRAN)
- "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." (John von Neumann)

History of numerical Monte Carlo methods
- Another contribution to numerical methods related to research at Los Alamos
- Late 1940s: scientists want to follow the paths of neutrons through various sub-atomic collision events
- Ulam & von Neumann suggest using random sampling to estimate this process; 100 events could be calculated in 5 hours on ENIAC
- The method is given the name "Monte Carlo" by Nicholas Metropolis
- An explosion of inappropriate use in the 1950s gave the technique a bad name; subsequent research illuminated when the method was appropriate

Terminology
- Random deviate: a number chosen uniformly on [0,1]
- Normal deviate: a number chosen randomly on (-∞,∞), weighted by a Gaussian

Background to MC integration
- Suppose we have a definite integral I = \int_a^b f(x)\,dx
- Given a "good" set of N sample points {x_i} we can estimate the integral as

    I = \int_a^b f(x)\,dx \approx \frac{b-a}{N} \sum_{i=1}^{N} f(x_i)
[Figure: sample points x_3, x_9, ... of f(x) on [a,b]]
- Each sample point yields an element of the integral of width (b-a)/N and height f(x_i)

What MC integration really does
- While the previous explanation is a reasonable interpretation of the way MC integration works, the most popular explanation is this: the estimate is (b-a) times the average height of f(x) over the random samples
[Figure: height given by random samples of f(x) on [a,b], and their average]

Mathematical Applications
- Let's formalize this just a little. By the mean value theorem,

    \int_a^b f(x)\,dx = (b-a)\langle f \rangle

- We can approximate the integral by calculating (b-a)⟨f⟩, and we can calculate ⟨f⟩ by averaging many values of f(x):

    \langle f \rangle_N = \frac{1}{N} \sum_{i=1}^{N} f(x_i)

  where the x_i ∈ [a,b] are chosen randomly

Example 1
- Consider evaluating

    I = \int_0^1 e^x\,dx = e - 1 = 1.718281828\ldots

- Let's take N = 1000, then evaluate f(x) = e^x at 1000 random points x ∈ [0,1]
- For this set of points define I_1 = (b-a)⟨f⟩_{N,1} = ⟨f⟩_{N,1}, since b-a = 1
- Next choose 1000 different x ∈ [0,1] and create a new estimate I_2 = ⟨f⟩_{N,2}
- Next choose another 1000 different x ∈ [0,1] and create a new estimate I_3 = ⟨f⟩_{N,3}

Distribution of the estimates
- We can carry on doing this, say 10,000 times, at which point we'll have 10,000 values estimating the integral, and the distribution of these values will be a normal distribution
- The distribution of all of the I_N estimates constrains the error we would expect on a single I_N estimate
- This is the Central Limit Theorem: for any given I_N estimate, the sum of the random variables within it converges toward a normal distribution

    P(x) = \frac{1}{\sqrt{2\pi}\,\sigma_N}\, e^{-(x - x_0)^2 / 2\sigma_N^2}

- Specifically, the standard deviation σ_N will be the estimate of the error in a single I_N estimate; the mean x_0 will approach e-1
[Figure: Gaussian of width σ_N centred on x_0 ≈ e-1, with x_0-σ_N, x_0, x_0+σ_N marked]

Calculating σ_N
- The formula for the standard deviation of N samples is

    (N-1)\,\sigma_N^2 = \langle f^2 \rangle_N - \langle f \rangle_N^2
                      = \frac{1}{N}\sum_{i=1}^{N} f(x_i)^2 - \left(\frac{1}{N}\sum_{i=1}^{N} f(x_i)\right)^2

- If there is no deviation in the data then the RHS is zero
- Given some deviation, as N→∞ the RHS will settle to some constant value > 0 (in this case ~0.2420359…)
- Thus we can write

    \sigma_N \sim \sqrt{\frac{1}{N-1}} \sim \sqrt{\frac{1}{N}}

- A rough measure of how good a random number generator is:
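Example 1 and the σ_N estimate above can be sketched in Python. This is a minimal illustration; the seed and helper name are arbitrary choices, not from the lecture:

```python
import math
import random

def mc_integrate(f, a, b, n, rng):
    """Estimate the integral of f on [a,b] from n uniform random samples,
    returning the estimate and its one-sigma error estimate."""
    samples = [f(rng.uniform(a, b)) for _ in range(n)]
    mean = sum(samples) / n
    # <f^2> - <f>^2: the variance of a single sample of f
    var_f = sum(s * s for s in samples) / n - mean * mean
    # sigma_N ~ sqrt((<f^2> - <f>^2) / (N - 1)), as in the slide formula
    sigma = math.sqrt(var_f / (n - 1))
    return (b - a) * mean, (b - a) * sigma

rng = random.Random(42)
estimate, sigma = mc_integrate(math.exp, 0.0, 1.0, 1000, rng)
exact = math.e - 1.0
print(f"estimate = {estimate:.4f} +/- {sigma:.4f}  (exact {exact:.4f})")
```

For f(x) = e^x on [0,1], sigma should come out near sqrt(0.242/1000) ≈ 0.0156, and the estimate typically lands within a few sigma of e-1.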
  how well does a histogram of the 10,000 estimates fit a Gaussian?
[Figure: mc.ps, histogram of 10,000 estimates, 1000 samples per I estimate, standard deviation 0.491/√1000]
- Increasing the number of integral estimates makes the distribution closer and closer to the infinite limit

Resulting statistics
- For data that fits a Gaussian, the theory of probability distribution functions asserts that:
  - 68.3% of the data (⟨f⟩_N) will fall within ±σ_N of the mean
  - 95.4% of the data (19/20) will fall within ±2σ_N of the mean
  - 99.7% of the data will fall within ±3σ_N, etc.
- Interpretation of poll data: "These results will be accurate to ±4%, 19 times out of 20"
  - The ±4% corresponds to ±2σ, so σ ≈ 0.02; since σ ∝ 1/√N, N ~ 2500
- This highlights one of the difficulties with random sampling: to improve the result by a factor of 2 we must increase N by a factor of 4!

Why would we use this method to evaluate integrals?
- For 1D it doesn't make a lot of sense: taking h ~ 1/N, the composite trapezoid rule error ~ h² ~ N⁻²; double N, get a result 4 times better
- In 2D, we can use an extension of the trapezoid rule on squares: taking h ~ 1/N^{1/2}, error ~ h² ~ N⁻¹
- In 3D we get h ~ 1/N^{1/3}, so error ~ h² ~ N^{-2/3}
- In 4D we get h ~ 1/N^{1/4}, so error ~ h² ~ N^{-1/2}
- Monte Carlo methods always have σ_N ~ N^{-1/2} regardless of the dimension
- Comparing to the 4D convergence behaviour, we see that MC integration becomes practical at dimension d ≥ 4; it wouldn't make any sense for 3D
- For anything higher than 4D (e.g. 6D, 9D, which are possible!) MC methods tend to be the only way of doing these calculations
- MC methods also have the useful property of being comparatively immune to singularities, provided that:
  - the random generator doesn't hit the singularity
  - the integral does indeed exist!
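The N^{-1/2} error scaling above is easy to check empirically: quadrupling the sample count should roughly halve σ_N. A minimal sketch (seed and sample counts are arbitrary choices):

```python
import math
import random

def mc_sigma(f, a, b, n, rng):
    """One-sigma error estimate of an n-sample MC integral of f on [a,b]."""
    samples = [f(rng.uniform(a, b)) for _ in range(n)]
    mean = sum(samples) / n
    var_f = sum(s * s for s in samples) / n - mean * mean
    return (b - a) * math.sqrt(var_f / (n - 1))

rng = random.Random(1)
for n in (1000, 4000, 16000):
    # sigma should fall by roughly a factor of 2 at each step
    print(n, mc_sigma(math.exp, 0.0, 1.0, n, rng))
```

Each 4x increase in N roughly halves sigma, and crucially this rate is independent of the number of dimensions, which is where MC wins over grid-based rules for d ≥ 4.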
Importance sampling
- In reality many integrands vary rapidly in one part of the number line and more slowly in others
- To capture this behaviour with MC methods we need some way of "putting our points where we need them the most"
- We really want to introduce a new function into the problem, one that allows us to put the samples in the right places

General outline
- Suppose we have two similar functions f(x) and g(x), where g(x) is easy to integrate. Then

    I = \int_a^b f(x)\,dx = \int_a^b \frac{f(x)}{g(x)}\,g(x)\,dx

- Change variables: let dy = g(x)\,dx, so that

    y(x) = \int_a^x g(x')\,dx'

  with y = y(a) when x = a and y = y(b) when x = b. Then

    I = \int_{y(a)}^{y(b)} \frac{f(x(y))}{g(x(y))}\,dy

- This integral has some nice properties: because g(x) ~ f(x) (i.e. g(x) is a reasonable approximation of f(x) that is easy to integrate), the integrand is approximately 1 and shouldn't vary much!
- It should therefore be possible to calculate a good approximation with a fairly small number of samples
- Thus, by applying the change of variables and mapping our sample points, we get a better answer with fewer samples

Example
- Let's look at integrating f(x) = e^x again on [0,1]
- MC random samples are 0.23, 0.69, 0.51, 0.93
- Our integral estimate is then

    I \approx (1-0)\,\frac{1}{4}\left(e^{0.23} + e^{0.69} + e^{0.51} + e^{0.93}\right) = 1.8630

- Difference from the answer: 1.8630 - 1.7183 = 0.1447

Apply importance sampling
- We first need to decide on our g(x) function; as a guess let us take g(x) = 1 + x
- Well, it isn't really a guess: we know these are the first two terms of the Taylor expansion of e^x!
- y(x) is thus given by

    y(x) = \int_0^x (1 + x')\,dx' = x + \frac{x^2}{2}

- For the end points we get y(0) = 0, y(1) = 3/2
- Rearranging y(x) gives x(y):

    x(y) = -1 + \sqrt{1 + 2y}

Set up integral & evaluate samples
- The integral to evaluate is now

    I = \int_0^{3/2} \frac{e^{x(y)}}{1 + x(y)}\,dy = \int_0^{3/2} \frac{e^{-1+\sqrt{1+2y}}}{\sqrt{1+2y}}\,dy

- We must now choose y's on the interval [0, 3/2]:

    y                                    0.345   1.035   0.765   1.395
    e^{-1+\sqrt{1+2y}} / \sqrt{1+2y}     1.038   1.211   1.135   1.324

- The integrand values are close to 1 because g(x) ~ f(x)

Evaluate
- For the new integral we have

    I \approx (3/2 - 0)\,\frac{1}{4}\left(1.038 + 1.211 + 1.135 + 1.324\right) = 1.7655

- Difference from the answer: 1.7655 - 1.7183 = 0.0472, so 3 times better than the previous estimate!
- Clearly this technique of ensuring the integrand doesn't vary too much is extremely powerful
- Importance sampling is particularly important in multidimensional integrals and can add 1 or 2 significant figures of accuracy for a minimal amount of effort

Rejection technique
- Thus far we've looked in detail at the effect of changing sample points on the overall estimate of the integral
- An alternative approach may be necessary when you cannot easily sample the desired region, which we'll call W
- We define a larger region V that includes W
- This is particularly important in multi-dimensional integrals when you can calculate the integral for a simple boundary but not a complex one
- Note you must also be able to calculate the size of V easily
- The sample function is then redefined to be zero outside the region, but keep its normal value within it
[Figure: rejection technique diagram; the region W we want to calculate, under f(x), enclosed in a simpler region V]
- Area of W = area of V multiplied by the fraction of points falling below f(x) within V
- Algorithm: just count the total number of points calculated and the number in W!
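The importance-sampling example above, with g(x) = 1 + x, can be sketched as follows. The seed and sample count are arbitrary illustrative choices:

```python
import math
import random

def plain_mc(n, rng):
    """Plain MC estimate of the integral of e^x on [0,1]."""
    return sum(math.exp(rng.uniform(0.0, 1.0)) for _ in range(n)) / n

def importance_mc(n, rng):
    """Importance-sampled estimate using g(x) = 1 + x.
    y(x) = x + x^2/2 maps [0,1] onto [0, 3/2]; x(y) = -1 + sqrt(1 + 2y).
    The transformed integrand f(x(y))/g(x(y)) stays close to 1."""
    total = 0.0
    for _ in range(n):
        y = rng.uniform(0.0, 1.5)
        x = -1.0 + math.sqrt(1.0 + 2.0 * y)
        total += math.exp(x) / (1.0 + x)
    return 1.5 * total / n

rng = random.Random(7)
exact = math.e - 1.0
print("plain error:     ", abs(plain_mc(1000, rng) - exact))
print("importance error:", abs(importance_mc(1000, rng) - exact))
```

With the same number of samples the importance-sampled estimate is typically several times closer to e-1, because the transformed integrand barely varies.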
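The count-the-points algorithm for the rejection technique can be sketched like this, again using e^x on [0,1]; the enclosing rectangle V = [0,1]x[0,e] and the sample count are illustrative choices:

```python
import math
import random

def rejection_area(f, a, b, fmax, n, rng):
    """Estimate the area of W (under f on [a,b]) by enclosing it in the
    rectangle V = [a,b] x [0, fmax] and counting points that land in W."""
    hits = 0
    for _ in range(n):
        x = rng.uniform(a, b)
        y = rng.uniform(0.0, fmax)
        if y < f(x):  # point falls below f(x), i.e. inside W
            hits += 1
    area_V = (b - a) * fmax
    # area(W) = area(V) * (fraction of points inside W)
    return area_V * hits / n

rng = random.Random(3)
print(rejection_area(math.exp, 0.0, 1.0, math.e, 100_000, rng))
```

Note the two requirements from the slide: V must be easy to sample and its size easy to compute; f only ever needs to be evaluated, never inverted or integrated.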
Better selection of points: subrandom sequences
- Choosing N points using a uniform deviate produces an error that converges as N^{-0.5}
- If we could choose points "better" we could make convergence faster
- For example, using a Cartesian grid of points leads to a method that converges as N^{-1}
- Think of Cartesian points as "avoiding" one another and thus sampling a given region more completely
- However, we don't know a priori how fine the grid should be
- We want to avoid short-range correlations: points shouldn't be too close to one another
- A better solution is to choose points that attempt to "maximally avoid" one another

A list of sub-random sequences
- Tore-SQRT sequences
- Van der Corput & Halton sequences
- Faure sequence
- Generalized Faure sequence
- Nets & (t,s)-sequences
- Sobol sequence
- Niederreiter sequence
- Many to choose from! We'll look very briefly at the Halton & Sobol sequences, both of which are covered in detail in Numerical Recipes

Halton's sequence
- In 1d we obtain the jth number in the sequence, denoted H_j, via:
  (1) write j as a number in base b, where b is prime
  (2) reverse the digits and place a radix point in front
- e.g. 17 in base 3 is 122, which reverses to 0.221 in base 3
- It should be clear why this works: adding an additional digit makes the "mesh" of numbers progressively finer
- For a sequence of points in n dimensions (x_{i1},…,x_{in}) we would typically use the first n primes to generate separate sequences for each of the x_{ij} components

2d Halton's sequence example
[Figure: pairs of points constructed from base 3 & 5 Halton sequences]

Sobol (1967) sequence
- A useful method described in Numerical Recipes as providing close to an N^{-1} convergence rate
- Algorithms are also available at www.netlib.org
[Figure: 2d Sobol sequence points, from Numerical Recipes]
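The two-step recipe above (write j in a prime base, reverse the digits behind a radix point) translates directly into a short generator. A minimal sketch, with the function name an arbitrary choice:

```python
def halton(j, b):
    """Return H_j: write j in base b, reverse the digits, and place a
    radix point in front (i.e. sum of digit_k * b^-(k+1))."""
    h, scale = 0.0, 1.0 / b
    while j > 0:
        j, digit = divmod(j, b)  # peel off digits least-significant first
        h += digit * scale
        scale /= b
    return h

# 17 in base 3 is 122 -> 0.221 in base 3 = 2/3 + 2/9 + 1/27
print(halton(17, 3))

# 2d points from base-2 and base-3 sequences (first n primes per dimension)
points = [(halton(j, 2), halton(j, 3)) for j in range(1, 6)]
print(points)
```

Peeling digits least-significant first performs the reversal automatically: the last digit of j gets the largest weight 1/b, exactly as the slide's digit-reversal rule requires.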
Summary
- MC methods are a useful way of numerically integrating systems that are not tractable by other methods
- The key property of MC methods is the N^{-0.5} convergence rate
- Numerical integration techniques can be greatly improved using importance sampling
- If you cannot write down a function easily, then the rejection technique can often be employed

Next Lecture
- More on MC methods: simulating random walks