Phase Space
• Phase Space: a concept of classical Statistical Mechanics
• Each Phase Space dimension corresponds to a particle degree of freedom
• 3 dimensions correspond to Position in (real) space: x, y, z
• 3 dimensions correspond to Momentum: px, py, pz
(or Energy and direction: E, θ, φ)
• More dimensions may be envisaged, corresponding to other possible
degrees of freedom, such as quantum numbers: spin, etc.
• Each particle is represented by a point in phase space
• Time can also be considered as a coordinate, or it can be considered
as an independent variable: the variation of the other phase space coordinates as a function of time (the trajectory of a phase space point)
constitutes a particle “history”
The Boltzmann Equation
• All particle transport calculations are (explicit or implicit) attempts to
solve the Boltzmann Equation
• It is a balance equation in phase space: at any phase space point, the
increment of particle phase space density is equal to the sum of all
“production terms” minus the sum of all “destruction terms”
• Production: Sources, “Inscattering”, Particle Production, Decay
• Destruction: Absorption, “Outscattering”, Decay
• We can look for solutions of different types: at a number of (real
or phase) space points, averages over (real or phase) space regions,
projected on selected phase space hyperplanes, stationary or time-dependent
Mean of a distribution (I)
• In one dimension:
Given a variable x, distributed according to a function f (x), the mean or average of
another function of the same variable A(x) over an interval [a, b] is given by:
$$\bar{A} = \frac{\int_a^b A(x)\, f(x)\, dx}{\int_a^b f(x)\, dx}$$

Or, introducing the normalized distribution f′:

$$f'(x) = \frac{f(x)}{\int_a^b f(x)\, dx} \qquad\qquad \bar{A} = \int_a^b A(x)\, f'(x)\, dx$$

A particular case is that of A(x) = x:

$$\bar{x} = \int_a^b x\, f'(x)\, dx$$
Mean of a distribution (II)
• In several dimensions:
Given n variables x, y, z, . . . , distributed according to the (normalized) functions
f′(x), g′(y), h′(z) . . . , the mean or average of a function of those variables
A(x, y, z . . .) over an n-dimensional domain D is given by:
$$\bar{A} = \int_x \int_y \int_z \cdots A(x, y, z, \ldots)\, f'(x)\, g'(y)\, h'(z) \cdots\, dx\, dy\, dz \cdots$$
Often impossible to calculate with traditional methods, but we can sample N values of
A with probability f′ · g′ · h′ · · · and divide the sum of the sampled values by N:
Central Limit Theorem:

$$\lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} A(x_i, y_i, z_i, \ldots) = \bar{A}$$

where the points (x_i, y_i, z_i, . . .) are sampled with probability f′(x) g′(y) h′(z) . . .
Each term of the sum is distributed like A (Analog Monte Carlo). Integration, but also simulation!
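As a concrete illustration (not part of the original slides), a minimal Python sketch of this analog estimator in one dimension, with an assumed A(x) and f′(x):

```python
import random
import math

# Minimal sketch of analog Monte Carlo integration (assumed example):
# estimate the mean of A(x) = x^2 under the normalized distribution
# f'(x) = 2x on [0, 1].  Exact answer: integral of x^2 * 2x dx = 0.5.

def sample_fprime():
    # Inverse-CDF sampling: F(x) = x^2, so X = sqrt(xi)
    return math.sqrt(random.random())

def A(x):
    return x * x

N = 100_000
estimate = sum(A(sample_fprime()) for _ in range(N)) / N
print(f"MC estimate: {estimate:.4f} (exact: 0.5)")
```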
Analog Monte Carlo
• In an analog Monte Carlo calculation (an “honest” simulation), not only does the
mean of the contributions converge to the mean of the real distribution,
but so do the variance and all higher-order moments:
$$\lim_{N \to \infty} \left[ \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^n \right]^{1/n} = \sigma_n$$
and fluctuations and correlations are faithfully reproduced.
Integration efficiency
• Traditional numerical integration methods (Simpson, etc.) converge to
the true value as N^(−1/n), where N = number of “points” (intervals) and
n = number of dimensions
• Monte Carlo converges instead as 1/√N
Number of dimensions   Traditional methods   Monte Carlo   Remark
n = 1                  1/N                   1/√N          MC not convenient
n = 2                  1/√N                  1/√N          about equivalent
n > 2                  N^(−1/n)              1/√N          MC converges faster
Sampling from a distribution
Sampling from a uniform distribution
• All computers provide a pseudo-random number generator (or even several of them).
In most computer languages (e.g., Fortran 90, C) a PRNG is even available as an
intrinsic routine
• All such PRNGs return a sequence of numbers uniformly distributed between 0 and 1.
Generally 0 is included and 1 is not. It is easy to change from ξ ∈ [0, 1[ to ξ ∈ ]0, 1]
by taking ξ′ = 1 − ξ (for instance to avoid an overflow due to division by ξ)
• Uniform and Random are independent concepts. For MC, sometimes uniformity is
more important than randomness (but not if correlations are important!)
• But it is important also that sampled histories be uncorrelated with each other
• A simple example of sampling from a uniform distribution: a linear source along the
z axis, between z1 and z2:
Z = z1 + (z2 − z1) ξ
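In code, the sampling is a single line (a Python sketch with assumed endpoints z1, z2):

```python
import random

z1, z2 = 0.0, 10.0       # assumed endpoints of the linear source
xi = random.random()     # uniform in [0, 1)
Z = z1 + (z2 - z1) * xi  # uniformly sampled position along the source
```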
Sampling from a distribution
Sampling from a generic distribution
• Using one random number
• Integrate the distribution function, analytically or numerically, and normalize to 1 to
obtain the normalized cumulative distribution:
$$F(x) = \frac{\int_{x_{min}}^{x} f(x)\, dx}{\int_{x_{min}}^{x_{max}} f(x)\, dx}$$
• Sample a uniform pseudo-random number ξ
• Get the desired result by finding the inverse value X = F⁻¹(ξ), analytically or most
often by interpolation (table look-up)
• Using two random numbers (rejection technique)
• Choose a constant value C ≥ f(x) for all x in [xmin, xmax]
• Sample two random numbers ξ1 and ξ2, and map ξ1 onto the interval: x = xmin + ξ1 (xmax − xmin)
• If ξ2 < f(x)/C, accept X = x; otherwise re-sample ξ1 and ξ2 (both techniques are sketched below)
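A minimal Python sketch of both techniques, for an assumed distribution f(x) = e^(−x) on [0, 5] (illustration, not FLUKA code):

```python
import random
import math

XMIN, XMAX = 0.0, 5.0   # assumed interval

def f(x):
    return math.exp(-x)

# (1) Inverse of the normalized cumulative distribution:
#     F(x) = (1 - exp(-x)) / (1 - exp(-XMAX)),  X = F^-1(xi)
def sample_inverse_cdf():
    xi = random.random()
    return -math.log(1.0 - xi * (1.0 - math.exp(-XMAX)))

# (2) Rejection technique, with C >= f(x) everywhere (here C = f(0) = 1)
C = 1.0

def sample_rejection():
    while True:
        x = XMIN + random.random() * (XMAX - XMIN)
        if random.random() < f(x) / C:
            return x
```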
Analog vs. Biased
Analog Monte Carlo
• samples from actual physical phase space distributions
• predicts average quantities and all statistical moments of any order
• preserves correlations (provided the physics is correct, of course!)
• reproduces fluctuations (provided. . . see above)
• is (almost) safe and (sometimes) can be used as a “black box”
But:
• is inefficient and converges very slowly
• fails to predict important contributions due to rare events
Analog vs. Biased
Biased Monte Carlo
• samples from artificial distributions, and applies a weight to the particles to correct for the bias
• predicts average quantities, but not the higher moments (on the contrary, its goal is to
minimize the second moment!)
• same mean with smaller variance =⇒ faster convergence
• sometimes makes it possible to obtain acceptable statistics where an analog Monte Carlo would take
years of CPU time to converge
But:
• cannot reproduce correlations and fluctuations
• with a few exceptions, requires physical judgment, experience and a good understanding of the
problem (it is not a black box!)
• in general, a user does not get the definitive result after the first run, but needs to do a series of
test runs in order to optimize the biasing parameters
• =⇒ balance between user’s time and CPU time
Reduce variance or CPU time?
Computer cost
A Figure of Merit:
Computer cost of an estimator = σ² × t
(σ² = variance, t = CPU time per primary particle)
• some biasing techniques aim at reducing σ², others at reducing t
• often reducing σ² increases t, and vice versa
• therefore, minimizing σ² × t means reducing σ² at a faster rate than t increases,
or vice versa
• =⇒ the choice depends on the problem, and sometimes a combination of several
techniques is most effective
• bad judgement, or excessive “forcing” on one of the two variables, can have
catastrophic consequences on the other one, making computer cost “explode”
Two different worlds (I)
Physics codes
• In general, physicists are more interested in single events and in
correlations within the same event than in averages over thousands of
events
• therefore, Monte Carlo codes used in physics research have been
mostly fully analog so far
• but there are cases (punchthrough, detector background studies, very
high energy showers with large multiplicities) where an analog
simulation can require prohibitive computer times
Two different worlds (II)
Engineering codes
• engineers and applied physicists working in the nuclear energy field
need to simulate endless neutron histories in scattering materials
• all the low energy neutron transport codes have always used variance
reduction techniques, starting with the first code of Fermi and von
Neumann
• but some studies of nuclear physics detectors would benefit from
simulations which could predict coincidences and anticoincidences
Two different worlds (III)
Simulation vs. Integration
• In high-energy physics and in medical physics, people are used to thinking about Monte Carlo in
terms of simulation:
since all moments of phase space distributions are faithfully reproduced by analog Monte Carlo,
results look “like real life”
• For nuclear engineers, Monte Carlo is a technique to integrate the Boltzmann equation:
what matters is the expectation value of a finite number of quantities of interest, which is
obtained by integrating over phase space (actually integration takes place over the individual
particle paths in phase space, i.e., the histories =⇒ ergodic theorem)
• It is important to realize that the statistical weighting techniques used in biased Monte Carlo are
mathematically rigorous:
for N → ∞, all calculated averages tend to the corresponding physical expectation values
(although the speed of convergence may be very different in different regions of phase space)
Biasing is neither tuning nor tricks, it’s sound applied mathematics!!!
Importance biasing
(FLUKA option: BIASING)
• Importance biasing combines two techniques:
• Surface Splitting (also called Geometry Splitting), which reduces
σ but increases t
• Russian Roulette (which does the opposite)
• it is the simplest, “safest” and easiest to use of all biasing
techniques
• the user assigns a relative importance to each geometry region (the
actual absolute value doesn’t matter), based on:
1. expected fluence attenuation with respect to other regions
2. probability of contribution to score by particles entering the region
Importance biasing
Surface Splitting
• If a particle crosses a region boundary, coming from a region of
importance I1 and entering a region of higher importance I2 > I1 :
• the particle is replaced on average by n = I2/I1 identical
particles with the same characteristics
• the weight of each “daughter” is multiplied by I1/I2
• If I2/I1 is too large, excessive splitting may occur in codes which
do not provide an appropriate protection
• An internal limit in FLUKA prevents excessive splitting when I2/I1 is
too large (> 5), a problem found in many biased codes
Importance biasing
Russian Roulette
• RR acts in the opposite direction: if a particle crosses a boundary
from a region of importance I1 to one of lower importance I2 < I1,
the particle is submitted to a random survival test.
• with a chance I2/I1 the particle survives with its weight increased
by a factor I1/I2
• with a chance (1 − I2/I1) the particle is killed
Importance biasing is commonly used to maintain a constant particle
population, compensating for attenuation due to absorption or distance.
In FLUKA it can be tuned per type of particle
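A schematic Python sketch of the combined splitting/RR rule at a boundary crossing (an illustration of the rules above, not the FLUKA implementation; particles are represented by their weights only):

```python
import random

def cross_boundary(weight, I1, I2):
    """Return the list of weights of the particles that continue."""
    ratio = I2 / I1
    if ratio > 1.0:
        # Surface splitting: replace the particle by n = I2/I1 copies
        # on average, each with weight multiplied by I1/I2.
        n = int(ratio)
        if random.random() < ratio - n:   # handle non-integer ratios
            n += 1
        return [weight * I1 / I2] * n
    else:
        # Russian Roulette: survive with probability I2/I1, with the
        # weight increased by I1/I2; otherwise the particle is killed.
        if random.random() < ratio:
            return [weight * I1 / I2]
        return []
```

In both branches the expected total weight is conserved, which is what makes the game statistically fair.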
Importance biasing
Note: In FLUKA, for technical reasons, importances are internally stored
as integers. Therefore:
• importances can only take values between 0.0001 and 100000.
• an input value 0.00015 is read as 0.0001, 0.00234 is read as 0.0023,
etc.
There is also a user routine USIMBS which allows the user to assign importances
not only at boundaries, but at each step, according to any logic desired
(as a function of position, direction, energy. . . ).
Very powerful, but very time-consuming (it is called at each step!). The
user must balance the time gained by biasing with that wasted by calls.
Importance biasing: problems
• Although importance biasing is relatively easy and safe to use, there are a few cases
where caution is recommended
+-----+---------------------------+
|     |           I=8           D |
|     +---------------------------+
|     |           I=4           C |
| I=? +---------------------------+
|     |           I=2           B |
|     +---------------------------+
|  E  |           I=1           A |
+-----+---------------------------+
Which importance shall we assign to region E?
Whatever value we choose, we will get inefficient
splitting/RR at some of the boundaries
• Another case is that of splitting in vacuum (or air). Splitting daughters are strongly
correlated: one must make sure that their further histories are differentiated enough
to “forget” their correlation (difficult to achieve in ducts and mazes)
• The above applies in part also to muons (the differentiation provided by multiple
scattering and by dE/dx fluctuations is not always sufficient)
Weight Window (I)
(FLUKA options: WW-FACTOr, WW-THRESh, WW-PROFIle)
• the WW technique too is a combination of splitting and RR, but it is based on the
absolute value of the weight of each individual particle, rather than on relative
region importance
• the user sets an upper and a lower weight limit, generally as a function of region,
energy, and particle
• particles having a weight larger than the upper limit are split, those with weight
smaller than the lower limit are submitted to RR =⇒ killed or put back “inside the
window”
• WW is a more powerful biasing tool than Importance Biasing, but it requires also
more experience and patience to set it up correctly.
“It is more an art than a science” (Quote from the MCNP manual)
• use of the WW is essential whenever other biasing techniques generate large weight
fluctuations in a given phase space region.
Weight Window (II)
• Killing a particle with a very low weight (with respect to the average for a given
phase space region) decreases t but has very little effect on the score (and therefore
on σ)
• Splitting a particle with a very large weight
• increases t (in proportion to the number of additional particles to be followed)
• but at the same time reduces σ by avoiding large fluctuations in the
contributions to scoring
• The global effect is to reduce σ²t
• Too wide a window is of course ineffective, but too narrow a window should also be
avoided; otherwise too much CPU time is spent in repeated splitting/RR.
A typical ratio between the upper and the lower edge of the window is about 10. It
is also possible to do RR without splitting (setting the upper window edge = ∞) or
splitting without RR (setting the lower edge = 0); the logic is sketched below
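A minimal sketch of the weight-window logic, with assumed window limits (illustrative Python, not the FLUKA code):

```python
import random

# Assumed window edges, with the typical upper/lower ratio of about 10
W_LOW, W_HIGH = 1.0, 10.0

def apply_weight_window(weight):
    """Return the list of weights of the particles that continue."""
    if weight > W_HIGH:
        # Split so that each daughter falls back inside the window.
        n = int(weight / W_HIGH) + 1
        return [weight / n] * n
    if weight < W_LOW:
        # Russian Roulette: survive with probability weight/W_LOW and
        # put the survivor back "inside the window" at W_LOW.
        if random.random() < weight / W_LOW:
            return [W_LOW]
        return []
    return [weight]
```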
The FLUKA Weight Window
[Figure: diagram of the FLUKA weight window scheme, not reproduced in this transcript]
Leading Particle Biasing (I)
(FLUKA option: EMF-BIAS)
• Leading particle biasing is available only for e+, e− and photons
• it is generally used to avoid the geometrical increase with energy of
the number of particles in an electromagnetic shower
• it is characteristic of EM interactions that
2 particles are present in the final state (at least in the approximation
made by most MC codes)
• if LPB is activated, only one is randomly retained and its weight is
adjusted so as to conserve weight × probability
• the most energetic of the two particles is kept with higher probability
(as the one which is more efficient in propagating the shower)
The FLUKA implementation, unlike that of other codes, takes into account the
indirectly enhanced penetration potential of positrons due to the emission of
annihilation photons
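Schematically, for a two-body final state (a Python sketch; the energy-proportional retention probability is an assumption for illustration, not FLUKA's exact rule):

```python
import random

def leading_particle_bias(E1, E2, weight):
    """Keep one of the two secondaries; conserve weight x probability."""
    p1 = E1 / (E1 + E2)          # more energetic -> more likely retained
    if random.random() < p1:
        return E1, weight / p1   # weight adjusted to stay unbiased
    return E2, weight / (1.0 - p1)
```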
Leading Particle Biasing (II)
• LPB is very effective at reducing t, but increases σ by introducing
large weight fluctuations (a few low energy particles carry a large
weight, while many energetic ones have a small weight)
• therefore, it should nearly always be backed up by WW
• Very useful for shielding calculations at both electron and proton
accelerators!
In FLUKA, LPB can be finely tuned by region, by physical effect
(Compton, bremsstrahlung, etc.) and by particle energy (LPB applied
only below a given energy threshold)
Multiplicity tuning
(FLUKA option: BIASING)
• Multiplicity tuning is unique to FLUKA (or it was until not long ago. . . ), and it is
meant to be for hadrons what LPB is for electrons and photons
• a hadronic nuclear interaction at LHC energies can end in hundreds of secondaries
⇒ to simulate a whole hadronic cascade in bulk matter may take a lot of CPU time
• except for the leading particle, many secondaries are of the same type and have
similar energies and other characteristics
• therefore, it is possible to discard a predetermined average fraction of them,
provided the weight of those which are kept and transported is adjusted so that the
total weight is conserved (but the leading particle must not be discarded)
• the user can tune the average multiplicity in different regions of space by setting a
region-dependent reduction factor (in fact it can even be > 1! But this
possibility is seldom used)
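In sketch form (illustrative Python, with an assumed list-of-weights representation; not the FLUKA algorithm):

```python
import random

def tune_multiplicity(secondaries, leading, r):
    """Keep each non-leading secondary (given by its weight) with
    probability r, dividing its weight by r so that the total weight
    is conserved on average.  The leading particle is never discarded."""
    kept = [leading]
    for w in secondaries:
        if random.random() < r:
            kept.append(w / r)
    return kept
```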
Biased downscattering
Only for experts!
(FLUKA option: LOW-DOWN)
• Especially suited to the multi-group treatment of low-energy neutron transport
• the group-to-group transfer probabilities used in low-energy neutron transport (the
so-called downscattering matrix) can be biased in order to artificially accelerate or
slow down the moderation process
• very difficult to use: the user must be able to visualize the neutron path in phase
space and especially the path projection on the energy axis
• but very efficient for medium-heavy materials when thermal or epithermal neutrons
play an important role
• in inexpert hands it can give results which only apparently converge fast, with large
rare spikes over what appears to be a smooth, statistically sound background
(uniform undersampling, i.e., systematically sampling too rarely in the tail of a
distribution, gives the illusion of a small variance!)
Non-analog absorption (I)
Another biasing technique for low energy neutron transport
(FLUKA option: LOW-BIAS)
• also called survival biasing
• implemented in most low-energy neutron transport codes, where the
user must choose between two options:
• at each neutron collision, either analog scattering or absorption is chosen
according to the actual physical probabilities σs/σT and
(1 − σs/σT ), respectively
• systematic survival with a weight reduced by a factor σs/σT
• in FLUKA there is also a third possible choice: the user can force the
neutron absorption probability to take an arbitrary value, preassigned
on a region-by-region basis and as a function of energy
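The three choices in sketch form (illustrative Python; ps = σs/σT is an assumed input):

```python
import random

def collide_analog(weight, ps):
    """Analog choice: scatter with probability ps, absorb otherwise."""
    return weight if random.random() < ps else 0.0

def collide_survival_biased(weight, ps):
    """Systematic survival, with the weight reduced by sigma_s/sigma_T."""
    return weight * ps

def collide_forced(weight, ps, p_surv):
    """Third (FLUKA-style) choice, sketched: force an arbitrary survival
    probability p_surv, correcting the weight to keep the game unbiased."""
    return weight * ps / p_surv if random.random() < p_surv else 0.0
```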
Non-analog absorption (II)
When and how to use it?
• a small survival probability is often assigned to thermal neutrons to
limit the number of scatterings in non-absorbing media
• also very useful in materials with unusual scattering properties (e.g.,
iron)
• survival probabilities too small with respect to the physical one
σs/σT may introduce large weight fluctuations due to the very
different number of collisions suffered by individual neutrons: in these
cases, a Weight Window should be applied (never set the absorption
probability more than a few times larger than the physical one!!)
Biasing mean free paths
Decay length
(FLUKA option: LAM-BIAS)
• the mean life/average decay length of unstable particles can be
artificially shortened
• it is also possible to increase the generation rate of decay products
without the parent particle actually disappearing
• typically used to enhance statistics of muon or neutrino production
• the kinematics of decay can also be biased (decay angle)
Biasing mean free paths
Interaction length
(FLUKA option: LAM-BIAS)
• in a similar way, the hadron or photon mean free path for nuclear
inelastic interactions can be artificially decreased by a predefined
particle- or material-dependent factor (much more sound than the
“forced interaction” implemented in some old codes!)
• this option is useful for instance to increase the probability of beam
interaction in a very thin target or in a material of very low density
• also necessary to simulate photonuclear reactions with acceptable
statistics, the photonuclear cross section being much smaller than
that for EM processes
User-written biasing (I)
Do it yourself. . .
(FLUKA user routines: source, ubsset)
• the user routine ubsset.f allows the user to generate many biasing
parameters from simple algorithms instead of entering each input
value by hand
• the user can also prepare a biased source
• the case most often encountered is that of a biased source energy
spectrum, in which some energies are preferentially sampled
• but it is also common to bias the source spatial and/or
angular distribution
• of course, in such cases it is the user’s responsibility to ensure that
weights are adjusted in a statistically correct way
User-written biasing (II)
Do it yourself. . .
Some tools available for user code:
• The FLUKA PRNG is provided by function FLRNDM(DUMMY). It returns a double
precision random number uniformly distributed in the [0, 1[ interval
• Subroutine SFECFE(SFE,CFE) returns a random azimuthal sine and cosine
• Subroutine RACO(TXX,TYY,TZZ) returns the cosines of a vector of random
orientation
• Subroutine SFLOOD(XXX,YYY,ZZZ,UXX,VXX,WXX) returns a random position and
direction on the surface of a sphere of unit radius (generating a uniform and
isotropic field inside the sphere)
• Subroutine FLNRRN(RGAUSS) returns a random number normally distributed with
unit variance
User-written biasing (III)
Sampling from a biased distribution
• The sampling technique based on the cumulative distribution can be extended to modify the probability of sampling in different parts of the interval (importance sampling)
• We replace f (x) by a function g(x) = f (x) h(x), where h(x) is any appropriate weight function
• We normalize g(x) in the same way as f (x) in the normal sampling
$$G(x) = \frac{\int_{x_{min}}^{x} g(x)\, dx}{\int_{x_{min}}^{x_{max}} g(x)\, dx} = \frac{\int_{x_{min}}^{x} f(x)\, h(x)\, dx}{B}$$

and we also need the integral of f(x) over the whole interval:

$$A = \int_{x_{min}}^{x_{max}} f(x)\, dx$$
• Now sample from the biased cumulative normalized distribution G instead of the original unbiased
F: take a uniform pseudo-random number ξ, and get the sampled value X by inverting G(x): X = G⁻¹(ξ)
• The particle is assigned a weight

$$W = \frac{B}{A}\,\frac{1}{h(X)}$$
User-written biasing (IV)
Sampling from a biased distribution: a simple example
• Uniform linear source along z axis, between z1 and z2
• The non-biased case was:

$$f(z) = 1 \qquad A = z_2 - z_1 \qquad F(z) = \int_{z_1}^{z} \frac{dz}{A} = \frac{z - z_1}{A}$$

$$Z = z_1 + A\,\xi = z_1 + (z_2 - z_1)\,\xi$$

• We decide to bias with a linear function, in order to sample preferentially larger values of z:

$$h(z) = z \qquad g(z) = h(z)\, f(z) = z \qquad B = \frac{z_2^2 - z_1^2}{2}$$

$$G(z) = \int_{z_1}^{z} \frac{z\, dz}{B} = \frac{z^2 - z_1^2}{z_2^2 - z_1^2} \qquad Z = \sqrt{z_1^2 + \xi\,(z_2^2 - z_1^2)}$$

$$W = \frac{B}{A}\,\frac{1}{h(Z)} = \frac{z_2^2 - z_1^2}{2\,(z_2 - z_1)\,Z} = \frac{z_2 + z_1}{2Z}$$
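The same example as a short Python sketch (illustrative), including a numerical check that the weighted mean stays unbiased:

```python
import random
import math

z1, z2 = 1.0, 5.0   # assumed source endpoints

def sample_biased():
    xi = random.random()
    Z = math.sqrt(z1**2 + xi * (z2**2 - z1**2))   # Z = G^-1(xi)
    W = (z2 + z1) / (2.0 * Z)                     # compensating weight
    return Z, W

# Unbiased check: the weighted mean of Z must converge to (z1 + z2)/2
N = 100_000
mean = sum(Z * W for Z, W in (sample_biased() for _ in range(N))) / N
print(f"weighted mean: {mean:.3f} (expected {(z1 + z2) / 2:.3f})")
```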
User-written biasing (V)
Uniform sampling from a steep distribution
• A special case of importance sampling is when the biasing function chosen is the inverse of the
unbiased distribution function:
$$h(x) = \frac{1}{f(x)} \qquad g(x) = f(x)\, h(x) = 1$$

$$G(x) = \frac{\int_{x_{min}}^{x} g(x)\, dx}{\int_{x_{min}}^{x_{max}} g(x)\, dx} = \frac{x - x_{min}}{x_{max} - x_{min}}$$

• In this case X = G⁻¹(ξ) = x_min + ξ (x_max − x_min), and the sampled particle is assigned a weight

$$W = \frac{B}{A}\,\frac{1}{h(X)} = \frac{f(X)\,(x_{max} - x_{min})}{A}$$
• Because X is sampled with the same probability over all possible values of x, independently of the
value f (X) of the function, this technique is used to ensure that sampling is done uniformly over
the whole interval, even though f (x) might have very small values somewhere
• For instance it may be important to avoid undersampling in the high-energy tail of a spectrum,
steeply falling with energy but more penetrating, such as that of cosmic rays or synchrotron radiation
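A corresponding sketch for an assumed steeply falling spectrum f(E) = e^(−E) (illustrative Python):

```python
import random
import math

# Uniform sampling from a steep distribution f(E) = exp(-E) on [0, 20]:
# sample E uniformly, carry the weight f(E) * (EMAX - EMIN) / A.
EMIN, EMAX = 0.0, 20.0
A = 1.0 - math.exp(-EMAX)   # integral of f over [EMIN, EMAX]

def sample_uniform_weighted():
    E = EMIN + random.random() * (EMAX - EMIN)
    W = math.exp(-E) * (EMAX - EMIN) / A
    return E, W
```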