Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
New Algorithms to make Quantum Monte Carlo more Efficient Daniel R. Fisher William A. Goddard III Materials and Process Simulation Center (MSC) California Institute of Technology 2003, D.R. Fisher Dirac’s Thoughts on Chemistry "The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the application of these laws leads to equations much too complicated to be soluble." - P. Dirac, Proc. Roy. Soc (London) 123:714 (1929) 2003, D.R. Fisher Computational Quantum Mechanics Algorithms Approximation Reliability Scaling With Size Time(2N)/(N) No. of Non-H Atoms Ab initio MO methods CCSD(T) MP2 Perturbation Theory DFT(B3LYP) Hartree-Fock Quantitative (~2 kcal/mol) Semiquantitative Semiquantitative Qualitative O(N6) O(N4) O(N3) O(N3) 64 16 8 8 5-6 10-50 50-100 100-200 Semiempirical MO methods AM1, PM3, MNDO Semiqualitative O(N2-3) 4-8 1000 Classical force-field methods Amber, Charm, MM3, … Semiqualitative O(N1-2) 2-4 103-4 (Semi?)Quantitative Quantitative Exact O(N1-3) O(N1-3) O(N1-3) 2-8 2-8 2-8 15-30* 10-20* 0-1* QMC methods Variational Fixed Node Diffusion Released Node Diffusion * Based on O(N3) Adapted from: Morokuma, et al. IBM J. Res. & Dev. Vol. 45 No. 3/4 May July 2001. p 367-395 2003, D.R. Fisher Hartree-Fock QM Calculations Replace explicit electron-electron interactions with average electron-electron interactions to get N,1-particle equations. Average Position of the Other Electrons This works BUT it does not deal correctly with electron-electron correlation 2003, D.R. Fisher When Is Electron Correlation Important? • Transition Metal Oxides – High Temprature Copper Oxide Superconductors • Heavy Fermion Metals – Usually Rare Earths or Actinides • Organic Charge Transfer Compounds • One and Two Dimensional Electron Gas Systems – Nano-scale Wires • Positron Chemistry – Defect Detection and Analysis • Muon/Exotic Particle Physics – Catalyzed Fusion See 21 April 2000 Science for More Examples 2003, D.R. Fisher QMC Basic Algorithm Traditional QC package Get YT (HF,DFT,MCSCF) Get Starting Jastrow Construct Initial Walker(s) Molecule VMC Equilibrate Task Conceived!!! -Accurate result -Small expense VMC Sampling Optimized? Enough DMC Equilibrate Very Accurate Result DMC Sampling Change Jastrow to Optimize it 2003, D.R. Fisher Metropolis Algorithm and QMC • Reformulate Energy Expectation Integral. E Y * Hˆ Ydx 3 N 1 E M ˆ H Y 3N 2 dx Elocaldx 3 N Y Y i M E x xi i 1 local i • Monte Carlo Integration for this 3N-dimensional integral. – Metropolis algorithm produces electron configurations with respect to Y2. – System energy expectation value is average of the local energies. 2003, D.R. Fisher VMC Wavefunction for a Molecule Y HF-Type Wavefunction Function of how far apart the electrons are from each other Function of how far electrons are from nuclei n n a1*rij a2 *rij2 n N A1 *riA A2 *riA2 YHF exp 1b *r b *r 2 exp 1 *r *r 2 1 ij 2 ij i 1 A1 A1 iA A2 iA i 1 j i What a typical YHe might look like: r1N ar12 r2 N YHe 1s (1)1s (2) exp exp 1 br12 1 r1N 1 r2 N 2003, D.R. Fisher Diffusion QMC 1 2 Y V Y 2 Change variables ... t i Y 0 e Solution at Key Point: ( E1 E0 ) Ground State 1 e ( E2 E0 ) 2 Excited States No Matter What Wavefunction We Start With, It Decays to the True Ground State as We Take Steps in “time,” ! 2003, D.R. Fisher New Algorithms and Areas Affected Traditional QC package Get YT (HF,DFT,MCSCF) Get Starting Jastrow Construct Initial Walker(s) Molecule Generic Jastrow VMC Equilibrate Task Conceived ManagerWorker Parallelization VMC Sampling Change Jastrow to Optimize it Optimized? Enough DMC Equilibrate Very Accurate Result DMC Sampling Decorrelation Algorithm 2003, D.R. Fisher Problem With Metropolis Sampling From intro statistics, the uncorrelated variance in energy is: 2 ( E) E 2 E Probability of a move being accepted is related to the transition probability. T(AB) Rejected Move Accepted Move 2 This does not work in this case because the energies calculated at sequential points are serially correlated. 2 (E) 2 (E) 2 N Energies at these points are correlated! cov( E , E ) i j i j 2003, D.R. Fisher Flyvbjerg-Petersen Decorrelation Algorithm Average blocks of the original data into new data elements. Original Data x1 x2 x3 x4 x5 x6 x7 x8 x9 xn-3 xn-2 xn-1 xn ••••••••••••• Blocked Data x1 x2 x3 x4 x5 x6 x7 x8 x9 If the data blocks are sufficiently large, the blocked data points are uncorrelated and the standard O(N) variance equation can be used. 2 ( E) E 2 E 2 2003, D.R. Fisher VMC Particle-in-a-Box Standard Deviation Calculation Uncorrelated Data Constant True Plateau Correlated Data Underestimate True 2003, D.R. Fisher New Statistical Analysis Algorithm Flyvbjerg-Petersen Algorithm • One processor must do all of the work N = 107-1012 log2N = 23-40 • Must communicate O(N) data when used on a parallel computer • Must store O(N) data • Must be used at the end of the calculation (can’t check convergence on-the-fly). Dynamic Distributable Decorrelation Algorithm Feldmann M.T., D.R. Kent IV, R.P. Muller, W.A. Goddard III. “Efficient Algorithm for “On-the-fly” Error Analysis of Local or Distributed Serially-Correlated Data”, J. Chem. Phys. (submitted) • Perfectly parallel even on inhomogeneous computers • Must communicate O(log2N) data when used on a parallel computer • Must store O(log2N) data • Can provide on-the-fly results (convergence based termination) 2003, D.R. Fisher New Manager Worker Parallelization Algorithm Current Algorithm - Pure Iterative • All processors do an equal amount of work • If one processor finishes first it must wait on all of the others • Intended only for homogeneous machines (all processors the same) • Convergence-based termination not possible New Algorithm - Manager Worker Feldmann M.T., D.R. Kent IV, R.P. Muller, W.A. Goddard III. “Manager-Worker Based Model for Massively Parallel and Perfectly Load-Balanced Quantum Monte Carlo”, J. Comp. Chem. (submitted) • All processors do as much work as they can • All processors complete at the same time • Intended for either homogeneous or inhomogeneous machines (processors can be different) • Convergence-based termination possible with DDDA 2003, D.R. Fisher Parallel Algorithm Performance: Heterogeneous Computer 8 total processors. Mixture of Pentium II 200 MHz and Pentium III 866 MHz 2003, D.R. Fisher Parallel Algorithm Performance: Heterogeneous Computer Performance depends on what is being measured… Not quite linear LLNL Blue Pacific LANL Nirvana Linear to 2048 Processors 2003, D.R. Fisher Are Transferable Parameter Sets Possible? Y HF-Type Wavefunction Function of how far apart the electrons are Function of how far electrons are from nuclei n n a1*rij a2 *rij2 n N A1 *riA A2 *riA2 YHF exp 1b *r b *r 2 exp 1 *r *r 2 1 ij 2 ij i 1 A1 A1 iA A2 iA i 1 j i Can parameter sets be found that are transferable between different systems? Similar idea to contracted Gaussian basis sets (e.g. 6-31G**). 2003, D.R. Fisher Correlation Energy Recovered by Generic Jastrow Function Correlation Energy Recovered (h) / Total Nuclear Charge For The Molecule 0.01 0.008 0.006 benzene 0.004 trans-butadiene 0.002 cis-butadiene ethylene 0 -0.002 1 3 5 7 9 ethane allene acetylene -0.004 methane -0.006 -0.008 -0.01 b parameter All these hydrocarbons have nearly the same optimal “b” parameter! 2003, D.R. Fisher Generic Jastrow Performance 0.1 1 10 100 1000 10000 DMC CH4 optimized 0.01 b=1 b=100 0.001 0.0001 0.00001 QMC Steps 0.1 1 10 100 1000 optimized b=1 0.01 b=3 (Generic) DMC C2H2 Variance Variance b=3 (Generic) b=100 0.001 0.0001 0.00001 QMC Steps 10000 2003, D.R. Fisher Why Isn’t QMC Really Linearly Scaling? • Each processor needs to perform statistically independent calculations • Each processor must begin calculating with independent, statistically significant points in configuration space • A separate Metropolis calculation must be equilibrated for each independent point in configuration space LANL Nirvana Efficiency Number of Processors 2003, D.R. Fisher Current Work- Initialization Solution • Make algorithm that reduces the initialization time. – Better guess for placement of initial walkers in the Monte Carlo random walk. – Reduce the number of steps to equilibrate the walker. – Make algorithm for initialization which itself is parallelizable. • There are two criteria by which we can judge the quality of a configuration. – The sum of one electron probability densities from the SCF calculation: 1 2 SCF x i x N i – The distance of the electrons from each other. 2003, D.R. Fisher Constructing Initial Configurations • Orbitals are linear combinations of primitive gaussians: i d jx y z e aj bj cj j r 2 j The square of the orbital is its probability density. We can change to spherical coordinates and separate the dependence in r, θ, and φ: 2 a a b b c c r 2 j k j k j k j k i d d j dk r 2 j ,k e 0 2 dr sin 0 a a b b sin cos d 1 0 j k j k 1 a j ak b j bk cos c c d j k 2003, D.R. Fisher Probability Distribution Functions • By integrating over one variable at a time, we can get the marginal probability distribution function in each direction. These are the probability distribution functions for the 2px orbital of Ne: Theta Dist P(theta) 1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 Phi Dist 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 0 0 0.5 1 1.5 2 2.5 3 3.5 r (au) P(phi) P(r) Radial Dist 1 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 0 1.5707965 theta 0 1.57 3.14 4.71 6.28 phi Now we can generate uniform random numbers between 0 and 1 to distribute electrons with respect to the probability density of this orbital. 3.141593 2003, D.R. Fisher Deciding which configurations to keep • We can invert each orbital of the molecule and distribute electrons in them according to their occupancy. This will ensure that our initial configurations are in regions of high one electron probability density. • This algorithm, however, will not prevent electrons from being placed near each other. • We plan to develop a heuristic score function that will take a 3N-dimensional walker as input and return a real number, such that the larger the result, the fewer steps the walker will need to take to reach an equilibrium region of configuration space. 2003, D.R. Fisher The Score function • The score function will depend on the one electron density and the distance between pairs of electrons: S x Gx , x , where N x j i 1 x N 1 xi x j x SCF i i Our goal is to determine a functional form of the score function G that is transferable between different molecular systems. 2003, D.R. Fisher Using the score function • Once a reliable score function is developed, it will be possible to construct high-quality walkers: - For an N-electron molecule, we could generate M>N electron positions and then find which combination gives the best walker. - Alternatively, a large set of N-electron configurations could be generated, and then the most favorable could be chosen. - The number of equilibration steps would be determined by the result of the score function ????? Could it be possible to use the score function to dynamically determine when a walker is equilibrated ????? If this is possible, we could terminate the initialization of each walker individually. This would mean we could start gathering data from a walker as soon as it is equilibrated, rather than waiting for the entire ensemble. 2003, D.R. Fisher Conclusion • For parallel calculations, the efficiency drops off drastically as the initialization time and number of processors increase. • This new class of Metropolis initialization algorithms will greatly decrease the time required to initialize a QMC calculation. • The efficient use of parallel processors will allow highly accurate quantum mechanical calculations of larger and more interesting chemical systems. 2003, D.R. Fisher Acknowledgements General: • Mike Feldmann • Chip Kent • Rick Muller • William A. Goddard III • Goddard Group • CACR staff Funding: • $$$ ASCI $$$ (Caltech-ASCI-MP)