COMPUTATIONAL CHEMISTRY

LINEAR SCALING ELECTRONIC STRUCTURE METHODS IN CHEMISTRY AND PHYSICS

Calculating the electronic structure of large atomistic systems requires algorithms that scale linearly with system size. Efficient implementations of these emerging algorithms provide scientists in various fields with powerful software tools to address challenging problems.

STEFAN GOEDECKER, CEA-Grenoble
GUSTAVO E. SCUSERIA, Rice University

Computing in Science & Engineering, July/August 2003. 1521-9615/03/$17.00 © 2003 IEEE. Copublished by the IEEE CS and the AIP.

Scientists have known the nonrelativistic equations of quantum mechanics since 1926, when Austrian physicist Erwin Schrödinger published a series of papers on quantum mechanics. As Paul Dirac pointed out in 1929, questions in quantum mechanics are in principle just questions in applied mathematics. In practice, however, solving these equations has proved challenging. In spite of the impressive computer power at our disposal, solving the basic equation of quantum mechanics (the many-electron Schrödinger equation, which rules the quantum mechanical behavior of molecules and materials at the atomic level and determines their basic properties) remains a difficult task, and will require the interplay of physics, chemistry, mathematics, and computational science.

Developing new methods for electronic structure calculations is more than just developing new algorithms: it requires a deep physical and chemical understanding of many-electron systems. Combining this understanding with modern mathematical concepts leads to algorithms that exploit the peculiarities of electronic systems to yield powerful new electronic structure methods. Adapting these methods for modern computer architectures will result in powerful programs to aid the research of many scientists. In this drive for better methods, algorithms in which computing time increases linearly with respect to the number of atoms in the system are the ultimate goal.
Most physical quantities are extensive; that is, they grow linearly with system size. We might therefore expect that the computational effort will grow linearly with system size as well. An even slower increase in computing time is certainly not possible unless we ignore the basic physics of the electronic system. In this article, we review the physical principles and algorithms behind the quest for electronic structure computational methods that scale linearly with respect to system size.

Calculating the Electronic Structure of Large Atomistic Systems

The development of fast algorithms is one of modern mathematics' main achievements. In the mathematical nomenclature, fast algorithms (such as the fast Fourier transform, the multigrid method, and the fast multipole method) refer to low-complexity algorithms. An algorithm's complexity is related to its asymptotic behavior. If the computing time $T_{\mathrm{CPU}}$ grows as

$$T_{\mathrm{CPU}} \approx \mathrm{const} \cdot N^{\kappa} \tag{1}$$

in the limit of large systems, we say that the algorithm is $O(N^{\kappa})$. The algorithm has a low complexity if $\kappa$ is small. In an $O(N)$ algorithm, the computing time grows only linearly with respect to the system size $N$.

The exponential increase in computer speed drives the development of low-complexity algorithms. Empirically, the constant in Equation 1 is smaller for the best high-complexity algorithms than for the best low-complexity algorithms. Consequently, high-complexity algorithms are typically faster for small problem sizes. This implies that low-complexity algorithms only become important once we have sufficient computational power to treat numerically system sizes beyond the crossover point, the point at which a low-complexity algorithm becomes faster than a high-complexity algorithm. As computing power increases, it is becoming possible to pierce the region beyond the crossover point for more and more problems, making low-complexity algorithms advantageous.
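The crossover point can be made concrete with the cost model of Equation 1. The following is a minimal sketch, not taken from the article: the function names and both prefactors are hypothetical, chosen only so that the linear-scaling code has the larger constant, as the text describes.

```python
# Toy cost model: T_CPU = const * N**kappa (Equation 1).
# All prefactors below are hypothetical, chosen only to illustrate the idea.


def cpu_time(const: float, kappa: float, n: float) -> float:
    """Model computing time for a system of size n."""
    return const * n ** kappa


def crossover_size(const_low: float, kappa_low: float,
                   const_high: float, kappa_high: float) -> float:
    """Size N* at which the low- and high-complexity cost curves intersect.

    Solves const_low * N**kappa_low == const_high * N**kappa_high for N.
    """
    if not kappa_low < kappa_high:
        raise ValueError("kappa_low must be smaller than kappa_high")
    return (const_low / const_high) ** (1.0 / (kappa_high - kappa_low))


# Assumed prefactors: the O(N) code pays a large constant,
# the O(N^3) code a small one.
CONST_LINEAR, CONST_CUBIC = 1000.0, 0.001

n_star = crossover_size(CONST_LINEAR, 1, CONST_CUBIC, 3)
print(f"crossover near N = {n_star:.0f}")

for n in (100, 10000):
    t_lin = cpu_time(CONST_LINEAR, 1, n)
    t_cub = cpu_time(CONST_CUBIC, 3, n)
    faster = "cubic" if t_cub < t_lin else "linear"
    print(f"N = {n:6d}: linear {t_lin:.1e}, cubic {t_cub:.1e} -> {faster} is faster")
```

With these assumed constants the cubic code wins below the crossover and the linear code wins above it; a larger linear prefactor pushes the crossover to larger systems.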
Whereas early fast algorithms targeted standard mathematical problems such as Fourier transformations, partial differential equations, and Coulombic point particle systems, current fast algorithms target more specific problems, such as electronic structure calculations. It is generally important that these fast special-purpose algorithms exploit the physical context. In this article, we only present the fundamental principles of these algorithms. Several more detailed review articles include technical details [1–4].

To specify the size $N$ of an atomistic system, we can measure the number of atoms, number of electrons or, in a simulation context, the number of basis functions used to represent the wavefunctions in the system. Because these measures are related by constants, they are equivalent with respect to algorithm complexity. In this article, $N$ will denote the number of electrons with the tacit understanding that it is proportional to both the number of atoms and the number of basis functions.

Understanding matter at the atomistic level is the key ingredient in many branches of science, such as solid-state physics, chemistry, materials science, and biology. In the spirit of Dirac, we could say that questions in these disciplines are just questions in applied physics. Computer simulations are in principle the preferred approach to addressing atomistic problems, eliminating experimental ambiguities and uncertainties. Unfortunately, highly accurate quantum mechanical simulations of large atomistic systems are still unfeasible. Nevertheless, even less accurate methods (density functional methods, in particular) have great predictive power. For example, density functional methods helped scientists predict fullerenes before they could be synthesized in the laboratory. The fact that nearly half of all theory papers in chemistry and solid-state physics rely, at least in part, on computer simulation results also stresses simulation's important role.
$O(N)$ methods will extend high-accuracy methods to larger systems used in many practical problems and thus strengthen simulation's role in atomistic studies.

Methods for Treating Electronic Systems

A wide range of computational methods is available for simulating atomistic problems. These methods are based on approximations of the difficult many-electron problem and thus represent a compromise between speed and accuracy.

Wavefunction Methods

Wavefunction methods are the most accurate, directly solving the many-electron Schrödinger equation,

$$\left( -\frac{1}{2}\sum_{i=1}^{N}\nabla_i^2 + \sum_{i=1}^{N} V(\mathbf{r}_i) + \sum_{i=1}^{N}\sum_{j=i+1}^{N}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|} \right)\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N) = E\,\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N). \tag{2}$$

We use atomic units in this equation and throughout the article. For simplicity, we also suppress the electrons' spin degrees of freedom. We must solve the eigenvalue problem of Equation 2 under the constraint that the many-electron wavefunction $\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N)$ is antisymmetric with respect to the exchange of any two electron coordinates; that is, $\Psi(\dots,\mathbf{r}_i,\dots,\mathbf{r}_j,\dots) = -\Psi(\dots,\mathbf{r}_j,\dots,\mathbf{r}_i,\dots)$. The price of this method's accuracy is a large prefactor (Equation 1) and, at least until very recently, high complexity.

Traditional quantum chemistry methods are based on an expansion of the many-electron wavefunction in Slater determinants, formed from a set of orthonormal one-electron orbitals $\phi_i(\mathbf{r})$. The determinants' mathematical properties, with respect to the interchange of rows and columns, lead automatically to the required antisymmetry of $\Psi$:

$$\Psi(\mathbf{r}_1,\dots,\mathbf{r}_N) = \sum_I C_I\,\frac{1}{\sqrt{N!}} \begin{vmatrix} \phi_{i_1}(\mathbf{r}_1) & \cdots & \phi_{i_N}(\mathbf{r}_1) \\ \vdots & & \vdots \\ \phi_{i_1}(\mathbf{r}_N) & \cdots & \phi_{i_N}(\mathbf{r}_N) \end{vmatrix}. \tag{3}$$

The composite index $I = (i_1,\dots,i_N)$ characterizes a determinant, or configuration, by specifying which one-electron orbitals $\phi_i$ make up the configuration. If we have $M$ orbitals $\phi_i$, we can form $M!/((M-N)!\,N!)$ different configurations. In the full configuration interaction method, the sum in Equation 3 runs over all possible $M!/((M-N)!\,N!)$ terms.
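To get a feeling for how fast the number of configurations $M!/((M-N)!\,N!)$ grows, the following short sketch evaluates it for a few small systems. The choice of two orbitals per electron ($M = 2N$) is an assumption made only for illustration; realistic basis sets are usually larger, and the function name is hypothetical.

```python
from math import comb


def n_configurations(n_orbitals: int, n_electrons: int) -> int:
    """Number of Slater determinants: M! / ((M - N)! N!)."""
    return comb(n_orbitals, n_electrons)


# Assumption for illustration only: two orbitals per electron (M = 2N).
for n_electrons in (2, 4, 8, 16):
    n_orbitals = 2 * n_electrons
    count = n_configurations(n_orbitals, n_electrons)
    print(f"N = {n_electrons:2d}, M = {n_orbitals:2d}: {count} configurations")
```

Doubling the system size from 8 to 16 electrons in this toy setting raises the count from 12870 to 601080390, which is why the full expansion is limited to very small systems.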
Because one must compute all the coefficients $C_I$, the computational effort is at least $M!/((M-N)!\,N!)$. We can call the scaling exponential because it is worse than any polynomial scaling. The most common quantum chemistry methods restrict the number of configurations to be included in the calculation and thus reduce scaling. Nevertheless, quantum chemistry methods remain highly complex in the standard algorithm context. The popular coupled-cluster method with single and double excitations has, for instance, an $O(N^6)$ scaling unless we use the new $O(N)$ methods. Consequently, conventional algorithms can treat only a few dozen electrons in this way.

The full configuration interaction method will give us the exact answer only if we use a complete basis set, that is, if $M$ is infinite. This of course is not practical. A fast convergence with respect to $M$ would be sufficient; however, because building the electron-electron cusp in the wavefunction from a continuous smooth basis set is difficult, convergence is slow. The preferred basis set for these calculations consists of atom-centered Gaussian-type orbitals, where $\exp(-a(\mathbf{r}-\mathbf{R})^2)$ represents, for example, an s-type orbital, and $\exp(-a(\mathbf{r}-\mathbf{R})^2)\,(x-X)/|\mathbf{r}-\mathbf{R}|$ represents a $p_x$-type orbital. $X$ is the first component of atomic position $\mathbf{R}$ and $x$ is the first component of electronic coordinate $\mathbf{r}$.

Independent Particle Methods

In independent particle methods, electron-electron interactions are treated in an average way. Because in most chemical systems, correlations are relatively small compared to the total energy, a mean field treatment is usually satisfactory. The Kohn-Sham formulation of density functional theory is the most prominent in this context.
In this method, we obtain the electronic ground state in an external potential $V(\mathbf{r})$ by minimizing the energy expression over all mutually orthonormal orbitals $\psi_i(\mathbf{r})$, with $\int \psi_i(\mathbf{r})\,\psi_j(\mathbf{r})\,d\mathbf{r} = \delta_{i,j}$:

$$E = \int \left[ -\frac{1}{2}\sum_{i=1}^{N} \psi_i(\mathbf{r})\nabla^2\psi_i(\mathbf{r}) + V_H(\mathbf{r})\rho(\mathbf{r}) + \epsilon_{xc}(\rho(\mathbf{r})) + V(\mathbf{r})\rho(\mathbf{r}) \right] d\mathbf{r}, \tag{4}$$

where

$$\rho(\mathbf{r}) = \sum_{i=1}^{N} f(\epsilon_i)\,|\psi_i(\mathbf{r})|^2 \tag{5}$$

represents the electron charge density. For simplicity, we again consider the electrons to be spinless, and $f(\epsilon)$ is the Fermi distribution $1/(1 + \exp((\epsilon - \mu)/k_B T))$. The Hartree potential $V_H$,

$$V_H(\mathbf{r}) = \frac{1}{2}\int \frac{\rho(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}\,d\mathbf{r}', \tag{6}$$

represents the classical electrostatic average interaction between the electrons. The exchange correlation energy $\epsilon_{xc}$ is the nonclassical, quantum part of this interaction. The energy obtained by minimization with respect to the variational parameters of the orbitals is the ground state total energy. Because the exact $\epsilon_{xc}$ remains unknown, the Kohn-Sham method is approximate in practice. However, various rather accurate forms of $\epsilon_{xc}$ are currently available.

Minimization algorithms use the gradient $2H\psi_i(\mathbf{r})$ of the total energy expression with respect to the orbitals, which is obtained as a functional derivative of Equation 4. The Kohn-Sham Hamiltonian $H$ is given by

$$H = -\frac{1}{2}\nabla^2 + V_H(\mathbf{r}) + \mu_{xc}(\rho(\mathbf{r})) + V(\mathbf{r}), \tag{7}$$

where

$$\mu_{xc}(\rho) = \frac{\partial \epsilon_{xc}(\rho)}{\partial \rho}$$

is the exchange correlation potential. In a numerical calculation where each orbital wavefunction is represented as a linear combination of basis functions $\chi_k(\mathbf{r})$, the Hamiltonian operator $H$ defined in Equation 7 becomes a matrix $\mathbf{H}$ whose elements are given by

$$H_{kl} = \int \chi_k(\mathbf{r})\,H\,\chi_l(\mathbf{r})\,d\mathbf{r}. \tag{8}$$

Minimizing the total energy is equivalent to solving a self-consistent eigenvalue problem for $\mathbf{H}$. Self-consistency is reached if, after several self-consistency iterations, the charge density obtained from the solution (Equation 5) is identical to the charge density used to construct the Hamiltonian matrix (Equations 7 and 8).
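The self-consistency cycle can be sketched on a toy model. The example below is not the Kohn-Sham scheme itself: it assumes a hypothetical one-dimensional lattice of spinless electrons in which the density-dependent part of the Hamiltonian is a simple local term `u * rho` standing in for the Hartree and exchange correlation potentials, with zero-temperature occupations and plain linear mixing. All names and parameter values are illustrative.

```python
import numpy as np


def build_hamiltonian(rho, v_ext, t=1.0, u=1.0):
    """Toy Hamiltonian matrix: hopping -t, external potential, and a
    density-dependent on-site term u * rho (stand-in for V_H + mu_xc)."""
    n_sites = len(rho)
    hop = -t * np.ones(n_sites - 1)
    return np.diag(v_ext + u * rho) + np.diag(hop, 1) + np.diag(hop, -1)


def density_from_hamiltonian(h, n_occ):
    """Charge density from the n_occ lowest eigenvectors (Equation 5 at T = 0)."""
    _, vecs = np.linalg.eigh(h)
    occupied = vecs[:, :n_occ]
    return np.sum(occupied ** 2, axis=1)


def scf(v_ext, n_occ, mix=0.3, tol=1e-10, max_iter=500):
    """Iterate until the output density equals the input density."""
    rho = np.full(len(v_ext), n_occ / len(v_ext))  # uniform starting guess
    for iteration in range(1, max_iter + 1):
        rho_out = density_from_hamiltonian(build_hamiltonian(rho, v_ext), n_occ)
        if np.max(np.abs(rho_out - rho)) < tol:
            return rho, iteration
        rho = (1.0 - mix) * rho + mix * rho_out  # linear mixing for stability
    raise RuntimeError("self-consistency not reached")


v_ext = np.linspace(-0.5, 0.5, 8)   # hypothetical external potential
rho, n_iter = scf(v_ext, n_occ=4)
print(f"converged in {n_iter} iterations, total charge {rho.sum():.6f}")
```

The diagonalization inside each iteration is the step that scales cubically in a conventional code; the linear scaling methods discussed later replace exactly this step.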
The self-consistent diagonalization approach, the most widely used with Gaussian-type orbitals, yields a cubic scaling with respect to $N$. Minimization-type algorithms for density functional methods also give a cubic scaling. The cubic term arises from the orthogonality requirement for the orbitals. In each step of an iterative minimization of Equation 4, this constraint must be enforced by performing, for example, a Gram-Schmidt orthogonalization. Imposing one constraint per pair of orbitals requires calculating scalar products and linear combinations of the two vectors representing the orbitals. The length of these vectors increases with system size because larger systems require larger basis sets. Hence, imposing a single orthogonality constraint has $O(N)$ complexity. However, we must impose orthogonality not only for one pair of orbitals, but for all $N(N-1)/2$ possible pairs. The overall complexity for the orthogonalization step is therefore $O(N^3)$.

Density functional methods are the current workhorse for atomistic simulations because they represent a good compromise between accuracy and speed. The cubic orthogonalization term in minimization-based methods becomes dominant only in systems of a few hundred atoms. These systems are therefore the largest that minimization algorithms can handle.

In quantum chemistry calculations with Gaussian-type orbitals, an $O(N^4)$ bottleneck must be eliminated before one is confronted with the $O(N^3)$ orthogonalization bottleneck. Because the charge density is a quadratic form with respect to the wavefunction (Equation 5), calculating the Hamiltonian matrix elements of the Hartree potential (Equation 6) requires four-center electron repulsion integrals of the type

$$\int d\mathbf{r} \int d\mathbf{r}'\, \frac{\chi_i(\mathbf{r})\,\chi_j(\mathbf{r})\,\chi_k(\mathbf{r}')\,\chi_l(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|}.$$

With four indices involved, a straightforward evaluation results in $O(N^4)$ complexity.
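The cubic cost of orthogonalization described earlier in this passage can be seen in a direct implementation. The sketch below uses the modified Gram-Schmidt variant (a choice made here for numerical robustness, not something the article specifies) and a hypothetical work estimate that assumes ten basis functions per orbital.

```python
import numpy as np


def gram_schmidt(vectors):
    """Orthonormalize the columns of `vectors` (modified Gram-Schmidt).

    The double loop visits every pair of orbitals once; each visit costs
    one scalar product and one linear combination of two vectors.
    """
    q = np.array(vectors, dtype=float)
    n_orb = q.shape[1]
    for i in range(n_orb):
        for j in range(i):
            q[:, i] -= np.dot(q[:, j], q[:, i]) * q[:, j]
        q[:, i] /= np.linalg.norm(q[:, i])
    return q


def pair_count(n_orb: int) -> int:
    """Number of orthogonality constraints, N(N - 1)/2."""
    return n_orb * (n_orb - 1) // 2


def orthogonalization_work(n_orb: int, basis_per_orbital: int = 10) -> int:
    """Rough operation count: pairs times vector length.

    Assumes (hypothetically) that the basis size grows as
    basis_per_orbital * n_orb.
    """
    return pair_count(n_orb) * basis_per_orbital * n_orb


rng = np.random.default_rng(0)
a = rng.standard_normal((50, 10))       # 10 random "orbitals" in a 50-function basis
q = gram_schmidt(a)
print("orthonormal:", np.allclose(q.T @ q, np.eye(10)))

ratio = orthogonalization_work(200) / orthogonalization_work(100)
print(f"work ratio when N doubles: {ratio:.2f}")
```

Doubling the number of orbitals multiplies the estimated work by roughly eight, the signature of $O(N^3)$ scaling.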
If the basis functions $\chi$ are Gaussian-type orbitals or other well-localized basis functions, reducing the complexity is easy. Unless the pair of basis functions with a common argument $\mathbf{r}$ significantly overlaps, the integral will be negligible. The same is true for the pair of basis functions with an argument $\mathbf{r}'$. Because each localized function overlaps significantly with only a fixed number of neighbors, the number of significant pairs for each argument grows only as $O(N)$. Hence, we can reduce the complexity in a controlled manner to the classical limit of $O(N^2)$. We can even reduce the complexity of the calculation of the Hartree term to $O(N)$. In the context of Gaussian-type orbitals, for example, we can achieve linear scaling using a modified fast multipole method [5,6]. In this approach, the scaling is linear only with respect to the volume, not with respect to the number of basis functions. In other words, the prefactor grows rapidly if we use larger and more accurate basis sets. A scaling behavior that is strictly linear with respect to the number of basis functions can be achieved in a wavelet basis [7].

Tight-Binding and Semiempirical Methods

The number of variational parameters in the expansion of the wavefunction is fairly large in density functional calculations. If we use Gaussian-type basis sets, we typically have a few dozen variational parameters per atom. If we use systematic basis sets such as plane waves, this number can rise to several hundred per atom. Tight-binding and semiempirical methods have far fewer variational parameters: only slightly more than the number of valence electrons per atom. We can obtain the same number of variational parameters using a minimal basis set in a density functional calculation; however, semiempirical methods are not identical to this combination. In semiempirical methods, we do not calculate all the integrals (Equation 8) needed to set up the Hamiltonian matrix exactly, but parameterize them with respect to the atomic positions. The smaller number of variational parameters and the absence of an exact integral evaluation obviously reduce accuracy.
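The screening argument for localized basis functions, made above for the four-center integrals, can be checked on a toy system. The sketch assumes a hypothetical linear chain of identical normalized s-type Gaussians with unit spacing and exponent $a = 1$, for which the overlap of two functions a distance $d$ apart is $\exp(-a d^2/2)$; the threshold value is likewise an assumption.

```python
import math


def gaussian_overlap(a: float, distance: float) -> float:
    """Overlap of two normalized Gaussians exp(-a r^2) separated by `distance`."""
    return math.exp(-0.5 * a * distance ** 2)


def significant_pairs(n_centers: int, spacing: float = 1.0,
                      a: float = 1.0, threshold: float = 1e-10) -> int:
    """Count pairs (k < l) on a linear chain whose overlap exceeds `threshold`."""
    count = 0
    for k in range(n_centers):
        for l in range(k + 1, n_centers):
            if gaussian_overlap(a, (l - k) * spacing) > threshold:
                count += 1
            else:
                break  # overlaps only shrink further along the chain
    return count


for n in (100, 200, 400):
    kept = significant_pairs(n)
    total = n * (n - 1) // 2
    print(f"N = {n}: {kept} significant pairs out of {total}")
```

The number of retained pairs grows linearly with the chain length while the total number of pairs grows quadratically, so the number of non-negligible four-center integrals, which is bounded by the square of the retained pair count, grows as $O(N^2)$ rather than $O(N^4)$.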
We can therefore consider these methods as the crudest quantum mechanical approach.

The total energy in semiempirical methods is the sum over all occupied eigenvalues of the Hamiltonian matrix and possibly other easy-to-calculate contributions from classical position-dependent potentials. The computational bottleneck is therefore the diagonalization of the Hamiltonian matrix. Because a large percentage of all the eigenvectors must be calculated, standard classical diagonalization is the traditional solution method. Consequently, we again obtain cubic scaling ($O(N^3)$) with respect to system size. Although the prefactor in these methods is roughly two orders of magnitude smaller than in density functional calculations, we cannot treat much larger systems. Because of the cubic scaling, reducing the prefactor by a factor of 125 only lets us treat systems five times as large, since $125^{1/3} = 5$.

Force Fields

Force fields, or interatomic potentials, eliminate electrons, the quantum mechanical glue linking the atoms. In these methods, atoms are particles that interact through sophisticated classical potentials, fitted to certain experimental or theoretical results. This approach is also called molecular mechanics. Because the fitted potentials are short range, obtaining linear scaling with respect to system size is straightforward. The only long-range interactions are electrostatic interactions, which we can treat with linear scaling algorithms such as fast multipole methods [8].

Force fields can yield surprisingly accurate results when applied to configurations similar to those in the training or fitting set, but can fail badly otherwise. Obviously, they cannot give properties related to the atom's electronic structure, such as optical properties, polarizabilities, excitation energies, or chemical reaction barriers. They are nevertheless popular in biological applications such as protein modeling, primarily because they are the only method that treats molecular systems containing many thousands or even millions of atoms. The popularity of force fields also shows the clear need for accurate simulation tools for very large systems. Traditional algorithms for solving quantum mechanical equations cannot handle these very large systems. Medium-size systems can, however, now be accessed with quantum mechanical $O(N)$ methods. Because force fields are not quantum mechanical, we do not discuss them further.

Physical Foundations of O(N) Methods

Fundamental chemistry principles strongly suggest that linear scaling is possible. To see this, consider the model crystal shown in Figure 1. Each atom (small discs) in the crystal forms chemical bonds with its neighbors. For example, the nature and energy of the bonds that the red atom forms with its neighbors depend on three factors:

• the number of nearest neighbors (the coordination number),
• the atomic species type that the neighbor represents (whether it is a carbon or an oxygen atom, for instance), and
• the bond length.

Figure 1. The spherical localization regions associated with two atoms. The bonds formed between the red and green atom and their neighbors are indicated by red and green lobes.

These three properties are purely local. In this simple picture, what occurs outside the red atom's localization sphere does not influence the bonding whatsoever. A more profound analysis [1,2] shows that features outside the sphere do influence the bonds in the sphere, but the influence decays rapidly with distance. Van der Waals interactions have the slowest decay in this context, namely $c/r^6$. Fortunately, the prefactor $c$ is small and we can truncate the interactions at a reasonably sized radius. Only the wavefunction methods accurately treat van der Waals interactions, which are essentially absent in independent particle methods.
Therefore, the influence of the region outside the localization sphere decays even faster in independent particle methods. In an insulator, and in a metal at finite temperature, the influence decays exponentially. At room temperature, the exponential decay is much slower in a metal than in an insulator. At zero temperature, the decay in a metal is only algebraic.

Finally, doing an electronic structure calculation in an insulator essentially amounts to determining the different bond energies. To calculate bond energy, we consider only a relatively small region around the bond. As the system grows, the number of bonds increases, but they are all embedded in similarly sized localization spheres. Figure 1 shows the localization region suitable for calculating the bonds formed by the green atom. We then obtain the system's total energy by summing the partial bond energies. Hence the calculation scales linearly with respect to the number of bonds.

Basic Principles of Linear Scaling Algorithms

As described previously, obtaining linear scaling in an electronic structure calculation requires formulating the problem in terms of quantities that reflect the locality principle. Such quantities will decay rapidly, and if we accept a small but controlled error in the calculation, we can simply truncate these quantities at a large enough distance. Because the extended eigenorbitals diagonalizing the independent particle Hamiltonian, usually referred to as canonical orbitals, do not reflect this locality principle, they are not suitable as the basic quantities in $O(N)$ calculations. Linear scaling also rules out the use of basis functions extending over the whole computational volume, such as plane waves.

Density Matrix-Based Algorithms

For both semiempirical and tight-binding methods and independent particle methods with small basis sets, the independent particle density matrix is usually the central quantity for $O(N)$ algorithms.
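The locality that these algorithms rely on can be observed in a small numerical experiment. The sketch below is a toy under stated assumptions, not one of the $O(N)$ algorithms: it takes a hypothetical one-dimensional tight-binding insulator (uniform hopping, alternating on-site energies that open a gap), builds the zero-temperature density matrix as the projector onto the occupied eigenvectors by ordinary cubic-scaling diagonalization, and inspects how its elements decay away from the diagonal.

```python
import numpy as np


def chain_hamiltonian(n_sites: int, t: float = 1.0, delta: float = 1.0):
    """Toy insulator: hopping -t, on-site energies alternating +delta / -delta."""
    onsite = delta * (-1.0) ** np.arange(n_sites)
    hop = -t * np.ones(n_sites - 1)
    return np.diag(onsite) + np.diag(hop, 1) + np.diag(hop, -1)


def density_matrix(h, n_occ: int):
    """Zero-temperature density matrix: projector onto the n_occ lowest states."""
    _, vecs = np.linalg.eigh(h)
    occupied = vecs[:, :n_occ]
    return occupied @ occupied.T


def significant_per_row(n_sites: int, threshold: float = 1e-6) -> int:
    """Number of elements above `threshold` in the central row of the matrix."""
    f = density_matrix(chain_hamiltonian(n_sites), n_sites // 2)
    return int(np.sum(np.abs(f[n_sites // 2]) > threshold))


n_sites = 200
f = density_matrix(chain_hamiltonian(n_sites), n_sites // 2)
mid = n_sites // 2
for d in (1, 10, 20, 40):
    print(f"|F(mid, mid + {d})| = {abs(f[mid, mid + d]):.2e}")

for n in (100, 200, 400):
    print(f"N = {n}: {significant_per_row(n)} significant elements in the central row")
```

In this gapped model the elements fall off exponentially with distance, and the number of elements above the threshold in a row does not grow with the chain length, which is what makes a sparse representation with $O(N)$ entries possible.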
We define the finite-temperature independent-particle density operator $F$ as

$$F(\mathbf{r},\mathbf{r}') = \sum_i f(\epsilon_i)\,\psi_i(\mathbf{r})\,\psi_i(\mathbf{r}'),$$

where $\epsilon_i$ is the eigenvalue and $\psi_i(\mathbf{r})$ is the eigenvector of the (self-consistent) Hamiltonian. At this independent particle model level, the density matrix contains all the information about our quantum mechanical system, that is, the quantities of interest such as charge density or total energy. The fact that $F$ tends to zero when the observation point $\mathbf{r}$ is far from the source point $\mathbf{r}'$ manifests chemistry's locality principle:

$$F(\mathbf{r},\mathbf{r}') \to 0 \quad \text{as} \quad |\mathbf{r}-\mathbf{r}'| \to \infty.$$

On discretization, the independent particle density operator $F$ becomes the independent particle density matrix $\mathbf{F}$. If we use a well-behaved localized basis set, and if the two basis functions entering into the matrix element are far apart, the operator's rapid decay will produce small matrix elements. Neglecting these insignificant matrix elements yields a sparse matrix in which the number of important matrix elements per column or row is independent of system size. Hence the number of matrix elements increases as $O(N)$.

Several approaches to calculating such a sparse density matrix are available. Conceptually, the simplest is based on $\mathbf{F}$ being a matrix function of $\mathbf{H}$, $\mathbf{F} = f(\mathbf{H})$, where $f$ is the Fermi distribution. We can numerically represent the matrix function $f$ in several ways: as a Chebychev polynomial, a rational function, or a recursively built-up polynomial. Another approach is to minimize an energy expression based on the density matrix. All these approaches use sparse matrix-matrix or matrix-vector multiplications to construct $\mathbf{F}$ and therefore scale linearly with the matrix's dimensions.

Wannier Function-Based Algorithms

If we discretize Schrödinger's equation in a basis set of Gaussian-type orbitals, a relatively small number of them can achieve reasonable accuracy. This is because they mimic the atomic orbitals that represent a first-order approximation of more complicated atoms' and solids' electronic structures. Gaussian-type orbitals, however, are not a systematic basis set in that they cannot solve Schrödinger's equation with an arbitrarily small error. Thus, systematic basis sets (such as finite elements or wavelets) are preferred in various applications. The large number of basis functions per atom means that the density matrix contains too many significant elements, even though it is still a sparse matrix. In an insulator at zero temperature, however, we can find a compact representation of the density matrix in terms of Wannier functions $W_i(\mathbf{r})$:

$$F(\mathbf{r},\mathbf{r}') = \sum_{i=\mathrm{occ}} W_i(\mathbf{r})\,W_i(\mathbf{r}').$$

Of all the sets of orbitals that span the occupied space, the Wannier functions best meet a certain chosen localization criterion [9]. Because they are localized, we can associate a center $\mathbf{R}_i$ with each Wannier function $W_i(\mathbf{r})$. The locality principle again manifests through the decay properties of the Wannier functions:

$$W_i(\mathbf{r}) \to 0 \quad \text{as} \quad |\mathbf{R}_i-\mathbf{r}| \to \infty.$$

The vectors obtained by discretizing the Wannier functions in a well-behaved localized basis set are therefore sparse as well. Empirically, the Wannier functions typically represent bonds or lone electron pairs. $O(N)$ algorithms let us directly calculate Wannier-like functions. This can be done either by projection methods or by minimizing a modified total energy expression. Because all the operations are done with sparse vectors, linear scaling can be obtained.

All $O(N)$ algorithms based on the density matrix and Wannier functions solve the electronic structure problem for a fixed independent particle Hamiltonian. In a self-consistent electronic structure calculation, these algorithms must be combined with $O(N)$ methods for the calculation of the self-consistent potential. The only difficult part of the potential is the Hartree part. With the previously mentioned fast multipole method, linear scaling can be obtained for the Hartree potential.

Wavefunction Method-Based Algorithms

The starting point of most sophisticated quantum chemistry wavefunction methods is the Hartree-Fock method, which uses a single-determinant wavefunction. The Hartree-Fock method can be considered the simplest wavefunction method, in which electrons are assigned to individual one-particle orbitals. In Hartree-Fock theory, each electron interacts with the mean Coulomb field created by all other electrons, and "correlations" between them are neglected. From the viewpoint of Kohn-Sham density functional theory, Hartree-Fock contains all the exchange energy but zero correlation energy.

In terms of computational scaling, Hartree-Fock shares some requirements with density functional theory: the Coulomb and diagonalization steps are identical. However, in Hartree-Fock there is no need for a numerical quadrature to evaluate the exchange-correlation functional. On the other hand, the Hartree-Fock exchange energy must be evaluated. Fortunately, the exchange interaction decays exponentially in insulating systems. Thus, the Hartree-Fock exchange energy and associated potential are well suited for numerical screening or truncation because interactions below a given threshold can be safely ignored. Several schemes along these lines have been proposed in the literature, and their $O(N)$ scaling properties have been demonstrated [10].

For metallic systems at zero temperature, the Hartree-Fock exchange interaction decays much more slowly, namely algebraically. This results in a very large prefactor, rendering $O(N)$ methods impractical. However, because the usefulness of the Hartree-Fock method for correctly describing metallic systems has long been questioned, the problem seems to be of little practical relevance.
For the same reason, hybrid density functional methods that contain some fraction of the Hartree-Fock exchange energy remain mostly limited in practice to systems with a significant insulating character.

In wavefunction methods such as configuration interaction, coupled-cluster, or many-body perturbation theory, one has to calculate both the one-particle density matrix $\rho_1$,

$$\rho_1(\mathbf{r}_1',\mathbf{r}_1) = N \int \cdots \int d\mathbf{r}_2 \cdots d\mathbf{r}_N\, \Psi(\mathbf{r}_1',\mathbf{r}_2,\dots,\mathbf{r}_N)\,\Psi(\mathbf{r}_1,\mathbf{r}_2,\dots,\mathbf{r}_N),$$

and the two-particle density matrix $\rho_2$,

$$\rho_2(\mathbf{r}_1',\mathbf{r}_2',\mathbf{r}_1,\mathbf{r}_2) = \frac{N(N-1)}{2} \int \cdots \int d\mathbf{r}_3 \cdots d\mathbf{r}_N\, \Psi(\mathbf{r}_1',\mathbf{r}_2',\mathbf{r}_3,\dots,\mathbf{r}_N)\,\Psi(\mathbf{r}_1,\mathbf{r}_2,\mathbf{r}_3,\dots,\mathbf{r}_N),$$

because the total energy $E$ is a function of both of them:

$$E = \int \left[ \left( -\frac{1}{2}\nabla_{\mathbf{r}_1}^2 + V(\mathbf{r}_1) \right) \rho_1(\mathbf{r}_1',\mathbf{r}_1) \right]_{\mathbf{r}_1' = \mathbf{r}_1} d\mathbf{r}_1 + \iint \frac{\rho_2(\mathbf{r}_1,\mathbf{r}_2,\mathbf{r}_1,\mathbf{r}_2)}{|\mathbf{r}_1-\mathbf{r}_2|}\, d\mathbf{r}_1\, d\mathbf{r}_2.$$

$\rho_1$ becomes identical to the zero-temperature independent particle matrix discussed in the context of independent particle methods if the many-body wavefunction consists of only one determinant. Direct calculation (for example, by energy minimization) of $\rho_1$ and $\rho_2$ is not feasible because the necessary and sufficient conditions for the so-called N-representability of $\rho_2$ remain unknown. These conditions ensure that $\rho_2$ can be obtained from an $N$-electron wavefunction, and they would be needed to constrain a direct search of $\rho_2$.

For a given wavefunction construction (configuration interaction, coupled cluster, or many-body perturbation theory), the two-particle density matrix elements are therefore calculated from the configuration coefficients $C_I$ (Equation 3) that make up the particular wavefunction. The direct calculation of these matrix elements involves solving either a set of linear equations (the secular matrix problem in configuration interaction) or a set of nonlinear algebraic equations (coupled cluster). As mentioned previously, the computational scaling of these procedures with respect to system size is very steep when done in terms of the canonical Hartree-Fock orbitals.
With canonical orbitals, the computational scaling for including all single and double excitations from the reference determinant is $O(N^6)$. Adding the triple and quadruple excitations, which are needed to accurately dissociate molecules containing double or triple bonds, leads to an even worse scaling of $O(N^8)$ and $O(N^{10})$, respectively.

As discussed previously, the key issue in achieving linear scaling in these wavefunction approaches is to formulate the problem in a representation in which the short-range nature of the chemical interactions, including van der Waals interactions, becomes explicit [11]. The traditional formulation of most wavefunction methods in terms of canonical orbitals is not adequate for this purpose. The short-range behavior can, however, be exploited by working with a set of localized molecular orbitals, which allows for truncation based on physical arguments [12,13], or by numerical truncation of small matrix elements in an atomic orbital representation [14].

Perturbation theory methods are also very popular in quantum chemistry. Among them, second-order Moeller-Plesset perturbation theory is the simplest wavefunction method that accounts for van der Waals interactions. These interactions are fundamentally important for understanding both intermolecular interactions in biological systems and packing forces in molecular solids. Until recently, the steep scaling of the Moeller-Plesset computations ($O(N^5)$) ruled out their application to large systems. In the last few years, several methods that allow for Moeller-Plesset calculations of fairly large systems have appeared in the literature and have been incorporated into software packages [12,15]. These developments have also been extended to systems with periodic boundary conditions [16], opening avenues to study novel materials.
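The practical meaning of these scaling exponents can be illustrated with a short calculation. Under the cost model of Equation 1, a computer that is faster by a factor $s$ lets an $O(N^k)$ method treat a system larger by only $s^{1/k}$. The sketch below applies this to the exponents quoted in the text; the speedup of 1000 is an arbitrary assumption and the function name is hypothetical.

```python
def size_gain(speedup: float, exponent: int) -> float:
    """Factor by which the treatable system size grows for an O(N**exponent)
    method when the available computing power grows by `speedup`."""
    return speedup ** (1.0 / exponent)


# Scaling exponents quoted in the text for canonical-orbital formulations.
METHODS = {
    "linear scaling method, O(N)": 1,
    "second-order Moeller-Plesset, O(N^5)": 5,
    "single and double excitations, O(N^6)": 6,
    "triple excitations, O(N^8)": 8,
    "quadruple excitations, O(N^10)": 10,
}

SPEEDUP = 1000.0  # assumed increase in computing power
for name, exponent in METHODS.items():
    gain = size_gain(SPEEDUP, exponent)
    print(f"{name:40s}: system size x {gain:.2f}")
```

For the steepest methods, a thousandfold increase in computing power roughly doubles the accessible system size, whereas a linear scaling method would gain the full factor of 1000.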
The philosophy behind these linear scaling MP2 approaches is similar to that described in the previous paragraph: one must work in a local representation in which the decay properties of the relevant quantities with respect to their spatial separation become explicit and adequate for truncation.

Even though the basic principles of $O(N)$ electronic structure calculations are well established, a substantial amount of work to develop practical computational tools remains. Reducing the prefactors remains crucial. Once the $O(N)$ asymptote is reached, semiempirical methods are typically two orders of magnitude faster than density functional methods, which in turn are typically another two orders of magnitude faster than wavefunction methods. Sparsity of matrices and sparse matrix operations remain central for achieving $O(N)$ scaling in electronic structure methods. Progress in this area, as well as massive parallelization of these tools, will allow even faster computations. In cases where matrix sparsity cannot be easily achieved (as in metals at zero temperature), more research for alternative techniques is still needed.

Acknowledgments

The linear scaling work at Rice University is supported by the US National Science Foundation, the US Department of Energy, the Welch Foundation, and Gaussian.

References

1. S. Goedecker, "Linear Scaling Electronic Structure Methods," Reviews of Modern Physics, vol. 71, no. 4, 1999, pp. 1085–1123.
2. S. Goedecker, Handbook of Numerical Analysis, special volume on computational chemistry, P.G. Ciarlet and C. Le Bris, eds., North-Holland, 2003.
3. G.E. Scuseria, "Linear Scaling Density Functional Calculations with Gaussian Orbitals," J. Physical Chemistry A, vol. 103, no. 25, 1999, pp. 4782–4790.
4. P. Ordejon, "Order-N Tight-Binding Methods for Electronic-Structure and Molecular Dynamics," Computational Materials Science, vol. 12, no. 3, 1998, pp. 157–191.
5. C.A. White and M. Head-Gordon, "Derivation and Efficient Implementation of the Fast Multipole Method," J. Chemical Physics, vol. 101, no. 8, 1994, pp. 6593–6605.
6. M.C. Strain, G.E. Scuseria, and M.J. Frisch, "Achieving Linear Scaling for the Electronic Quantum Coulomb Problem," Science, vol. 271, no. 1, 1996, pp. 51–53.
7. S. Goedecker and O. Ivanov, "Linear Scaling Solution of the Coulomb Problem Using Wavelets," Solid State Comm., vol. 105, no. 11, 1998, pp. 665–669.
8. L. Greengard, "Fast Algorithms for Classical Physics," Science, vol. 265, 1994, pp. 909–914.
9. N. Marzari and D. Vanderbilt, "Maximally Localized Generalized Wannier Functions for Composite Energy Bands," Physical Rev. B, vol. 56, no. 20, 1997, pp. 12847–12865.
10. J.C. Burant, G.E. Scuseria, and M.J. Frisch, "A Linear Scaling Method for Hartree-Fock Exchange Calculations of Large Molecules," J. Chemical Physics, vol. 105, no. 19, 1996, pp. 8969–8972.
11. P. Pulay, "Localizability of Dynamic Electron Correlation," Chemical Physics Letters, vol. 100, no. 2, 1983, pp. 151–154.
12. M. Schütz, G. Hetzer, and H.-J. Werner, "Low-Order Scaling Local Electron Correlation Methods. I. Linear Scaling Local MP2," J. Chemical Physics, vol. 111, no. 13, 1999, pp. 5691–5705.
13. M.S. Lee, P.E. Maslen, and M. Head-Gordon, "Closely Approximating Second-Order Møller-Plesset Perturbation Theory with a Local Triatomics in Molecules Model," J. Chemical Physics, vol. 112, no. 8, 2000, pp. 3592–3601.
14. G.E. Scuseria and P.Y. Ayala, "Linear Scaling Coupled Cluster and Perturbation Theories in the Atomic Orbital Basis," J. Chemical Physics, vol. 111, no. 18, 1999, pp. 8330–8343.
15. P.Y. Ayala and G.E. Scuseria, "Linear Scaling Second-Order Møller-Plesset Theory in the Atomic Orbital Basis for Large Molecular Systems," J. Chemical Physics, vol. 110, no. 8, 1999, pp. 3660–3671.
16. P.Y. Ayala, K.N. Kudin, and G.E. Scuseria, "Atomic Orbital Laplace-Transformed Second-Order Møller-Plesset Theory for Periodic Systems," J. Chemical Physics, vol. 115, no. 21, 2001, pp. 9698–9707.

Stefan Goedecker is a professor of computational physics at the University of Basel. His research interests include linear scaling algorithms for electronic structure calculations and other atomistic simulation methods. He received a PhD from the Swiss Federal Institute of Technology in Lausanne. Contact him at Stefan.Goedecker@unibas.ch.

Gustavo E. Scuseria is the Robert A. Welch Professor of Chemistry at Rice University. His research interests include the development of low-order scaling electronic structure methods and their application to molecules and solids. His undergraduate and PhD degrees are in physics from the University of Buenos Aires. He is a member of the American Chemical Society and a Fellow of the American Physical Society, the American Association for the Advancement of Science, and the Guggenheim Foundation. Contact him at [email protected].