Download LINEAR SCALING ELECTRONIC STRUCTURE METHODS IN

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Hidden variable theory wikipedia , lookup

Scalar field theory wikipedia , lookup

Canonical quantization wikipedia , lookup

Matter wave wikipedia , lookup

Chemical bond wikipedia , lookup

T-symmetry wikipedia , lookup

Renormalization wikipedia , lookup

Symmetry in quantum mechanics wikipedia , lookup

Max Born wikipedia , lookup

Perturbation theory wikipedia , lookup

Wave–particle duality wikipedia , lookup

History of quantum field theory wikipedia , lookup

Particle in a box wikipedia , lookup

Theoretical and experimental justification for the Schrödinger equation wikipedia , lookup

Molecular Hamiltonian wikipedia , lookup

Hydrogen atom wikipedia , lookup

Molecular orbital wikipedia , lookup

Density functional theory wikipedia , lookup

Renormalization group wikipedia , lookup

Density matrix wikipedia , lookup

Coupled cluster wikipedia , lookup

Atomic orbital wikipedia , lookup

Hartree–Fock method wikipedia , lookup

Atomic theory wikipedia , lookup

Relativistic quantum mechanics wikipedia , lookup

Electron configuration wikipedia , lookup

Tight binding wikipedia , lookup

Transcript
COMPUTATIONAL
CHEMISTRY
LINEAR SCALING ELECTRONIC
STRUCTURE METHODS IN
CHEMISTRY AND PHYSICS
Calculating the electronic structure of large atomistic systems requires algorithms that scale
linearly with system size. Efficient implementations of these emerging algorithms provide
scientists in various fields with powerful software tools to address challenging problems.
S
cientists have known the nonrelativistic
equations of quantum mechanics since
1926, when Austrian physicist Erwin
Schrödinger published a series of papers
on quantum mechanics. As Paul Dirac pointed
out in 1929, questions in quantum mechanics are
in principle just questions in applied mathematics. In practice, however, solving these equations
has proved challenging. In spite of the impressive
computer power at our disposal, solving the basic
equation of quantum mechanics—the many-electron Schrödinger equation, which rules the quantum mechanical behavior of molecules and materials at the atomic level and determines their basic
properties—remains a difficult task, and will require the interplay of physics, chemistry, mathematics, and computational science.
Developing new methods for electronic structure
calculations is more than just developing new algorithms: it requires a deep physical and chemical understanding of many-electron systems. Combining
1521-9615/03/$17.00 © 2003 IEEE
Copublished by the IEEE CS and the AIP
STEFAN GOEDECKER
CEA-Grenoble
GUSTAVO E. SCUSERIA
Rice University
14
this understanding with modern mathematical concepts leads to algorithms that exploit the peculiarities
of electronic systems to yield powerful new electronic
structure methods. Adapting these methods for modern computer architectures will result in powerful
programs to aid the research of many scientists.
In this drive for better methods, algorithms in
which computing time increases linearly with respect to the number of atoms in the system are the
ultimate goal. Most physical quantities are extensive—that is, they grow linearly with system size.
We might therefore expect that the computational
effort will grow linearly with system size as well. An
even slower increase in computing time is certainly
not possible unless we ignore the basic physics of
the electronic system.
In this article, we review the physical principles
and algorithms behind the quest for electronic
structure computational methods that scale linearly
with respect to system size.
Calculating the Electronic Structure
of Large Atomistic Systems
The development of fast algorithms is one of modern mathematics’ main achievements. In the mathematical nomenclature, fast algorithmssuch as
the fast Fourier transform, the multigrid method,
and the fast multipole methodrefer to low-complexity algorithms. An algorithm’s complexity is related to its asymptotic behavior. If the computing
COMPUTING IN SCIENCE & ENGINEERING
time TCPU grows as
TCPU const Nk
(1)
in the limit of large systems, we say that the algorithm is O(N k). The algorithm has a low complexity if κ is small. In an O(N) algorithm, the computing time grows only linearly with respect to the
system size N.
The exponential increase in computer speed drives the development of low-complexity algorithms.
Empirically, the constant in Equation 1 is smaller
for the best high-complexity algorithms than for
the best low-complexity algorithms. Consequently,
high-complexity algorithms are typically faster for
small problem sizes. This implies that low-complexity algorithms only become important once we
have sufficient computational power to treat numerically system sizes beyond the crossover
pointthe point at which a low-complexity algorithm becomes faster than a high-complexity algorithm. As computing power increases, it is becoming possible to pierce the region beyond the
crossover point for more and more problems, making low-complexity algorithms advantageous.
Whereas early fast algorithms targeted standard
mathematical problems such as Fourier transformations, partial differential equations, and Coulombic
point particle systems, current fast algorithms target
more specific problems, such as electronic structure
calculations. It is generally important that these fast
special-purpose algorithms exploit the physical context. In this article, we only present the fundamental
principles of these algorithms. Several more detailed
review articles include technical details.1–4
To specify the size N of an atomistic system, we
can measure the number of atoms, number of electrons or, in a simulation context, the number of basis functions used to represent the wavefunctions
in the system. Because these measures are related
by constants, they are equivalent with respect to algorithm complexity. In this article, N will denote
the number of electrons with the tacit understanding that it is proportional to both the number of
atoms and the number of basis functions.
Understanding matter at the atomistic level is the
key ingredient in many branches of science, such as
solid-state physics, chemistry, materials science, and
biology. In the spirit of Dirac, we could say that questions in these disciplines are just questions in applied
physics. Computer simulations are in principle the
preferred approach to addressing atomistic problems,
eliminating experimental ambiguities and uncertainties. Unfortunately, highly accurate quantum mechanical simulations of large atomistic systems are still
JULY/AUGUST 2003
unfeasible. Nevertheless, even less accurate methods—density functional methods, in particular—have
great predictive power. For example, density functional methods helped scientists predict fullerenes before they could be synthesized in the laboratory.
The fact that nearly half of all theory papers in
chemistry and solid-state physics rely, at least in
part, on computer simulation results also stresses
simulation’s important role. O(N) methods will extend high-accuracy methods to larger systems used
in many practical problems and thus strengthen
simulation’s role in atomistic studies.
Methods for
Treating Electronic Systems
A wide range of computational methods is available
for simulating atomistic problems. These methods
are based on approximations of the difficult manyelectron problem and thus represent a compromise
between speed and accuracy.
Wavefunction Methods
Wavefunction methods are the most accurate, directly
solving the many-electron Schrödinger equation,

− 1
 2

×
N
∑
i =1
N
∇ 2i +
N

V ( ri )

i =1

N
1
∑
∑
∑
i =1 j = i +1 r − r
i
+
j
ψ(r1, …, rN) = E ψ(r1, …, rN).
(2)
We use atomic units in this equation and
throughout the article. For simplicity, we also suppress the electrons’ spin degrees of freedom.
We must solve the eigenvalue problem of Equation
2 under the constraint that the many-electron wavefunction Ψ(r1, ..., rN) is antisymmetric with respect to
the exchange of any two electron coordinates—that is,
Ψ(…, ri, ..., rj, ...) = –Ψ(..., rj, ..., ri, ...).
The price of this method’s accuracy is a large prefactor (Equation 1) and, at least until very recently, has
high complexity. Traditional quantum chemistry
methods are based on an expansion of the many-electron wavefunction in Slater determinants, formed
from a set of orthonormal one-electron orbitals φi(r).
The determinants’ mathematical properties, with respect to the interchange of rows and columns, lead
automatically to the required antisymmetry of Ψ:
Ψ(r1,..., rN ) =
∑I
CI
φ i1 ( r1 ) ... φ iN ( r1 )
... ...
...
(3)
N ! φ ( r ) ... φ ( r )
i1 N
iN N
1
15
The composite index I = (i1, ..., iN) characterizes a
determinant, or configuration, by specifying which
one-electron orbitals φi make up the configuration.
If we have M orbitals φi, we can form M!/((M
– N)!N!) different configurations. In the full configuration interaction method, the sum in Equation
3 runs over all possible M!/((M – N)!N!) terms. Because one must compute all the coefficients CI, the
computational effort is at least M!/((M – N)!N!).
We can call the scaling exponential because it is
worse than any polynomial scaling.
The most common quantum chemistry methods
restrict the number of configurations to be included in the calculation and thus reduce scaling.
Nevertheless, quantum chemistry methods remain
highly complex in the standard algorithm context.
The popular coupled-cluster method with single
and double excitations has, for instance, an O(N6)
scaling unless we use the new O(N) methods. Consequently, conventional algorithms can treat only
a few dozen electrons in this way.
The full configuration interaction method will
give us the exact answer only if we use a complete
basis set—that is, if M is infinite. This of course is
not practical. A fast convergence with respect to M
would be sufficient; however, because building the
electron–electron cusp in the wavefunction from
a continuous smooth basis set is difficult, convergence is slow. The preferred basis set for these calculations consists of atom-centered Gaussian-type
orbitals, where exp(–a(r – R)2) represents, for example, an s-type orbital, and exp(–a(r – R)2)(x
– X)/|r – R| represents a px-type orbital. X is the
first component of atomic position R and x is the
first component of electronic coordinate r.
Independent Particle Methods
In independent particle methods, electron–electron
interactions are treated in an average way. Because
in most chemical systems, correlations are relatively small compared to the total energy, a mean
field treatment is usually satisfactory.
The Kohn-Sham formulation of density functional theory is the most prominent in this context. In this method, we obtain the electronic
ground state in an external potential V(r) by
minimizing the energy expression over all mutually orthonormal orbitals ψ i (r), (∫ ψ i (r) ψ j (r)dr
= δi,j),
N
E=
1
− ψ i ( r )∇ 2ψ i ( r )
∫∑
2
i =1
+ V H ( r )ρ( r ) + ∈xc (ρ( r )) + V ( r )ρ( r ) dr (4)
where
16
N
ρ( r ) =
f ( ∈i ) ψ i ( r )
∑
i =1
2
(5)
represents the electron charge density. For simplicity, we again consider the electrons to be spinless, and f(∈) is the Fermi distribution 1/(1 + exp((∈
– µ)/kbT). The Hartree potential VH
V H (r ) =
1 ρ( r )
dr'
2 r − r'
∫
(6)
represents the classical electrostatic average interaction between the electrons.
The exchange correlation energy ∈xc is the nonclassical, quantum part of this interaction. The energy obtained by minimization of the variational
parameters of the orbitals is the ground state total
energy. Because the exact ∈xc remains unknown,
the Kohn-Sham method is approximate in practice.
However, various rather accurate forms of ∈xc are
currently available.
Minimization algorithms use the gradient
2Hψi(r) of the total energy expression with respect
to the orbitals, which is obtained as a functional derivative of Equation 4. The Kohn-Sham Hamiltonian H is given by
1
H = − ∇ 2 + V H ( r ) + µ xc (ρ( r )) + V ( r ) , (7)
2
where
µ xc (ρ) =
∂ ∈xc ( ρ )
∂ρ
is the exchange correlation potential. In a numerical
calculation where each orbital wavefunction is represented as a linear combination of basis functions χk(r),
the Hamiltonian operator H defined in Equation 7
becomes a matrix H whose elements are given by
∫
Hkl = χk(r) H χl(r)dr.
(8)
Minimizing the total energy is equivalent to solving a self-consistent eigenvalue problem for H. Selfconsistency is reached if after several self-consistency
iterations, the charge density obtained from the solution (Equation 5) is identical to the charge density
used to construct the Hamiltonian matrix (Equations 7 and 8). The self-consistent diagonalization
approach—the most widely used with Gaussian-type
orbitals—yields a cubic scaling with respect to N.
Minimization-type algorithms for density functional methods also give a cubic scaling. The cubic term arises from the orthogonality requirement for the orbitals. In each step of an iterative
COMPUTING IN SCIENCE & ENGINEERING
minimization of Equation 4, this constraint must
be reinforced by performing, for example, a
Gram-Schmid orthogonalization. Imposing one
constraint per pair of orbitals requires calculating scalar products and linear combinations of the
two vectors representing the orbitals. The length
of these vectors increases with system size because larger systems require larger basis sets.
Hence, imposing a single orthogonality constraint has O(N) complexity. However, N(N
– 1)/2 constraints exist because we must impose
orthogonality not only for one pair of orbitals,
but for all N(N – 1)/2 possible pairs. The overall
complexity for the orthogonalization step is
therefore O(N 3).
Density functional methods are the current
workhorse for atomistic simulations because they
represent a good compromise between accuracy
and speed. The cubic orthogonalization term in
minimization-based methods becomes dominant
only in systems of a few hundred atoms. These systems are therefore the largest that minimization algorithms can handle.
In quantum chemistry calculations with Gaussiantype orbitals, an O(N 4) bottleneck must be eliminated before one is confronted with the O(N 3) orthogonalization bottleneck.
Because the charge density is a quadratic form
with respect to the wavefunction (Equation 5), calculating the Hamiltonian matrix elements of the
Hartree potential (Equation 6) requires four center electron repulsion integrals of the type
∫ dr ∫ dr'
χ i ( r ) χ k ( r' ) χ l ( r' ) χ j ( r )
r − r'
.
With four indices involved, a straightforward
evaluation results in O(N 4) complexity. If the basis functions χ are Gaussian-type orbitals or
other well-localized basis functions, reducing the
complexity is easy. Unless the pair of basis functions with a common argument r significantly
overlaps, the integral will be negligible. The
same is true for the pair of basis functions with
an argument r′.
Hence, we can reduce the complexity in a controlled manner to the classical limit of O(N 2). We
can even reduce the complexity of the calculation
of the Hartree term to O(N). In the context of
Gaussian-type orbitals, for example, we can
achieve linear scaling using a modified fast multipole method.5,6 In this approach, the scaling is
linear only with respect to the volume—not with
respect to the number of basis functions. In other
words, the prefactor grows rapidly if we use
JULY/AUGUST 2003
larger and more accurate basis sets. A scaling behavior that is strictly linear with respect to the
number of basis functions can be achieved in a
wavelet basis.7
Tight-Binding and Semiempirical Methods
The number of variational parameters in the expansion of the wavefunction is fairly large in density functional calculations. If we use Gaussian-type
basis sets, we typically have a few dozen variational
parameters per atom. If we use systematic basis sets
such as plane waves, this number can rise to several
hundred per atom.
Tight-binding and semiempirical methods
have far fewer variational parameters—only
slightly more than the number of valence electrons per atom. We can obtain the same number
of variational parameters using a minimal basis
set in a density functional calculation; however,
semiempirical methods are not identical to this
combination. In semiempirical methods, we do
not calculate all the integrals (Equation 8) needed
to set up the Hamiltonian matrix exactly, but parameterize them with respect to the atomic positions. The smaller number of variational parameters and the absence of an exact integral
evaluation obviously reduce accuracy. We can
therefore consider these methods as the crudest
quantum mechanical approach.
The total energy in semiempirical methods is
the sum over all occupied eigenvalues of the
Hamiltonian matrix and possibly other easy-tocalculate contributions from classical positiondependent potentials. The computational bottleneck is therefore the diagonalization of the
Hamiltonian matrix. Because a large percentage
of all the eigenvectors must be calculated, standard classical diagonalization is the traditional solution method. Consequently, we again obtain
cubic scaling (O(N 3)) with respect to system size.
Although the prefactor in these methods is
roughly two orders of magnitude smaller than in
density functional calculations, we cannot treat
much larger systems. Due to cubic scaling, reducing the prefactor by 125 only lets us treat systems five times as large.
Force Fields
Force fields, or interatomic potentials, eliminate
electrons—the quantum mechanical glue linking the
atoms. In these methods, atoms are particles that interact through sophisticated classical potentials, fitted to certain experimental or theoretical results.
This approach is also called molecular mechanics. Because the fitted potentials are short range, obtaining
17
(small discs) in the crystals forms chemical bonds
with its neighbors. For example, the nature and energy of the bonds that the red atom forms with its
neighbors depend on three factors:
• the number of nearest neighbors (the coordination number),
• the atomic species type that the neighbor represents (whether it is a carbon or an oxygen
atom, for instance) and,
• the bond length.
Figure 1. The spherical localization regions associated with two atoms.
The bonds formed between the red and green atom and their
neighbors are indicated by red and green lobes.
linear scaling with respect to system size is straightforward. The only long-range interactions are electrostatic interactions, which we can treat with linear
scaling algorithms such as fast multipole methods.8
Force fields can yield surprisingly accurate results when applied to configurations similar to
those in the training or fitting set, but can fail
badly otherwise. Obviously, they cannot give
properties related to the atom’s electronic structure, such as optical properties, polarizabilities,
excitation energies, or chemical reaction barriers.
They are nevertheless popular in biological applications such as protein modeling, primarily because they are the only method that treats molecular systems containing many thousands or even
millions of atoms.
The popularity of force fields also shows the clear
need for accurate simulation tools for very large systems. Traditional algorithms for solving quantum
mechanical equations cannot handle these very
large systems. Medium-size systems can, however,
now be accessed with quantum mechanical O(N)
methods. Because force fields are not quantum mechanical, we do not discuss them further.
Physical Foundations of O(N) Methods
Fundamental chemistry principles strongly suggest
that linear scaling is possible. To see this, consider
the model crystal shown in Figure 1. Each atom
18
These three properties are purely local. In this simple picture, what occurs outside the red atom’s localization sphere does not influence the bonding
whatsoever. A more profound analysis1,2 shows that
features outside the sphere do influence the bonds
in the sphere, but the influence decays rapidly with
distance. Van der Waals interactions have the slowest decay in this context, namely c/r6. Fortunately,
the prefactor c is small and we can truncate the interactions at a reasonably sized radius.
Only the wavefunction methods accurately
treat van der Waals interactions, which are essentially absent in independent particle methods.
Therefore, the influence of the region outside the
localization sphere decays even faster in independent particle methods. In an insulator and a
metal at finite temperature, influence decays exponentially. At room temperature, the exponential decay is much slower in a metal than in an insulator. At zero temperature, the decay in a metal
is only algebraic.
Finally, doing an electronic structure calculation
in an insulator essentially amounts to determining
the different bond energies. To calculate bond energy, we consider only a relatively small region
around the bond. As the system grows, the number
of bonds increases, but they are all embedded in
similarly sized localization spheres.
Figure 1 shows the localization region suitable for
calculating the bonds formed by the green atom. We
then obtain the system’s total energy by summing
the partial bond energies. Hence the calculation
scales linearly with respect to the number of bonds.
Basic Principles
of Linear Scaling Algorithms
As described previously, obtaining linear scaling in
an electronic structure calculation requires formulating the problem in terms of quantities that reflect the locality principle. Such quantities will decay rapidly, and if we accept a small but controlled
error in the calculation, we can simply truncate
these quantities at a large enough distance. Because
COMPUTING IN SCIENCE & ENGINEERING
the extended eigenorbitals diagonalizing the independent particle Hamiltonian, usually referred to
as canonical orbitals, do not reflect this locality principle, they are not suitable as the basic quantities in
O(N) calculations. Linear scaling also rules out the
use of basis functions extending over the whole
computational volume, such as plane waves.
Density Matrix-Based Algorithms
For both semiempirical and tight-binding methods
and independent particle methods with small basis
sets, the independent particle density matrix is usually the central quantity for O(N) algorithms. We
define the finite-temperature independent-particle
density operator F as
F ( r , r' ) =
∑i f (∈i )ψ i (r )ψ i (r' )
where ∈i is the eigenvalue and ψi(r) is the eigenvector
of the (self-consistent) Hamiltonian. At this independent particle model level, the density matrix contains
all the information about our quantum mechanical
system—that is, the quantities of interest such as
charge density or total energy. The fact that F tends
to zero when the observation point r is far from the
source point r′ manifests chemistry’s locality principle:
F ( r , r' )
→
r − r' → ∞
0.
On discretization, the independent particle density
operator F becomes the independent particle density
matrix F. If we use a well-behaved localized basis set,
and if the two basis functions entering into the matrix
element are far apart, the operator’s rapid decay will
produce small matrix elements. Neglecting these insignificant matrix elements yields a sparse matrix in
which the number of important matrix elements per
column or row is independent of system size. Hence
the number of matrix elements increases as O(N).
Several approaches to calculating such a sparse
density matrix are available. Conceptually, the simplest is based on F being a matrix function of H, F
= f(H), where f is the Fermi distribution. We can
numerically represent the matrix function f in several ways: as a Chebychev polynomial, a rational
function, or a recursively built-up polynomial. Another approach is to minimize an energy expression
based on the density matrix. All these approaches use
sparse matrix–matrix or matrix–vector multiplications to construct F and therefore scale linearly with
the matrix’s dimensions.
number of them can achieve reasonable accuracy.
This is because they mimic the atomic orbitals
that represent a first-order approximation of
more complicated atoms’ and solids’ electronic
structures. Gaussian-type orbitals, however, are
not a systematic basis set in that they cannot solve
Schrödinger’s equation with an arbitrarily small
error. Thus, systematic basis sets (such as finite
elements or wavelets) are preferred in various applications. The large number of basis functions
per atom means that the density matrix contains
too many significant elements, even though it is
still a sparse matrix. In an insulator at zero temperature, however, we can find a compact representation of the density matrix in terms of Wannier functions Wi(r)
F ( r , r' ) =
∑ Wi (r )Wi (r' ) .
i = occ
Of all the sets of orbitals that span the occupied
space, the Wannier functions best meet a certain
chosen localization criterion.9 Because they are localized, we can associate a center Ri with each
Wannier function Wi(r). The locality principle
again manifests through the decay properties of the
Wannier functions
Wi (r )
→
Ri −r → ∞
0.
The vectors obtained by discretizing the Wannier functions in a well-behaved localized basis
set are therefore sparse as well. Empirically, the
Wannier functions typically represent bonds or
lone electron pairs.
O(N) algorithms let us directly calculate Wannier-like functions. This can either be done by
projection methods or by minimizing a modified
total energy expression. Because all the operations are done with sparse vectors, linear scaling
can be obtained.
All O(N) algorithms based on the density matrix
and Wannier functions solve the electronic structure problem for a fixed independent particle
Hamiltonian. In a self-consistent electronic structure calculation, these algorithms must be combined with O(N) methods for the calculation of the
self-consistent potential. The only difficult part of
the potential is the Hartree part. With the previously mentioned fast multipole method, linear
scaling can be obtained for the Hartree potential.
Wavefunction Method-Based Algorithms
Wannier Function-Based Algorithms
If we discretize Schrödinger’s equation in a basis
set of Gaussian-type orbitals, a relatively small
JULY/AUGUST 2003
The starting point of most sophisticated quantum
chemistry wavefunction methods is the HartreeFock method, which consists of a single determinant
19
wavefunction. The Hartree-Fock method can be
considered the simplest wavefunction method, in
which electrons are assigned to individual one-particle orbitals. In Hartree-Fock theory, each electron
interacts with the mean Coulomb field created by all
other electrons, and “correlations” between them
are neglected. From the viewpoint of Kohn-Sham
density functional theory, Hartree Fock contains all
the exchange energy but zero correlation energy.
In terms of computational scaling, Hartree-Fock
shares some requirements with density functional
theory: the Coulomb and diagonalization steps are
identical. However, in Hartree-Fock there is no
need for a numerical quadrature to evaluate the exchange–correlation functional. On the other hand,
the Hartree-Fock exchange energy must be evaluated. Fortunately, the exchange interaction decays
exponentially in insulating systems. Thus, the
Hartree-Fock exchange energy and associated potential are well suited for numerical screening or
truncation because interactions below a given
threshold can be safely ignored. Several schemes
along these lines have been proposed in the literature, and their O(N) scaling properties have been
demonstrated.10
For metallic systems at zero temperature, the
Hartree-Fock exchange interaction decays much
more slowly—namely, algebraically. This results in
a very large prefactor, rendering O(N) methods impractical. However, because the usefulness of the
Hartree-Fock method for correctly describing
metallic systems has long been questioned, the problem seems to be of little practical relevance. For the
same reason, hybrid density functional methods that
contain some fraction of the Hartree-Fock exchange
energy remain mostly limited in practice to systems
with a significant insulating character.
In wavefunction methods such as configuration
interaction, coupled-cluster, or many-body perturbation theory, one has to calculate both the oneparticle density matrix ρ1,
∫ ∫
ρ1( r1′ , r1 ) = N ... dr2 ... drN
Ψ( r1′ , r2 , ..., rN ) Ψ( r1, r2 , ..., rN ) .
the two-particle density matrix ρ2
N ( N − 1)
ρ 2 ( r1′ , r2 ′ , r1, r2 ) =
... dr3... drN
2
,
Ψ( r1′ , r2 ′ , r3,..., rN ) Ψ( r1, r2 , r3,..., rN )
∫ ∫
because the total energy E is a function of both of
them
20
 1

E =  − ∇ 2 ′ + V ( r1 ) ρ1 r1′ , r1
r


 2 1

ρ 2 ( r1, r2 , r1, r2 )
+
dr1dr2.
r1 − r2
∫
r1′ = r1′
dr1
∫∫
ρ1 becomes identical to the zero-temperature independent particle matrix discussed in the context
of independent particle methods if the many-body
wavefunction consists of only one determinant.
Direct calculation (for example, by energy minimization) of ρ1 and ρ2 is not feasible because the necessary and sufficient conditions for the so-called Nrepresentability of ρ2 remain unknown. These
conditions ensure that ρ2 can be obtained from an
N-electron wavefunction and they would be needed
to constrain a direct search of ρ2. For a given wavefunction construction (configuration interaction,
coupled cluster, or many-body perturbation theory),
the two-particle density matrix elements are therefore calculated from the configuration coefficients
CI (Equation 3) that make up the particular wavefunction. The direct calculation of these matrix elements involves solving either a set of linear (secular
matrix problem in configuration interaction) or nonlinear algebraic equations (coupled cluster).
As mentioned previously, the computational scaling of these procedures with respect to system size is
very steep when done in terms of the canonical
Hartree-Fock orbitals. With canonical orbitals, the
computational scaling for including all single and
double excitations from the reference determinant is
O(N 6). Adding the triple and quadruple excitations,
which are needed to accurately dissociate molecules
containing double or triple bonds, leads to an even
worse scaling of O(N 8) and O(N 10), respectively.
As discussed previously, the key issue in achieving linear scaling in these wavefunction approaches is to formulate the problem in a representation in which the short-range nature of the
chemical interactions, including van der Waals interactions, becomes explicit.11 The traditional formulation of most wavefunction methods in terms
of canonical orbitals is not adequate for this purpose. The short-range behavior can however be
exploited by working with a set of localized molecular orbitals, which allows for truncation based
on physical arguments,12,13 or by numerical truncation of small matrix elements in an atomic orbital representation.14
Perturbation theory methods are also very popular
in quantum chemistry. Among them, second-order
Moeller-Plesset perturbation theory is the simplest
wavefunction method that accounts for van der Waals
interactions. These interactions are fundamentally
COMPUTING IN SCIENCE & ENGINEERING
important for understanding both intermolecular interactions in biological systems and packing forces in
molecular solids. Until recently, the steep scaling of
the Moeller-Plesset computations (O(N5)) ruled out
their application to large systems. In the last few years,
several methods that allow for Moeller-Plesset calculations of fairly large systems have appeared in the literature and have been incorporated into software
packages.12,15 These developments have also been extended to systems with periodic boundary conditions,16 opening avenues to study novel materials.
The philosophy behind these linear scaling MP2 approaches is similar to that described in the previous
paragraph: one must work in a local representation in
which the decay properties of the relevant quantities
with respect to their spatial separation become explicit
and adequate for truncation.
E
ven though the basic principles of O(N)
electronic structure calculations are well
established, a substantial amount of
work to develop practical computational tools remains. Reducing the prefactors remains crucial. Once the O(N) asymptote is
reached, semiempirical methods are typically two
orders of magnitude faster than density functional
methods, which in turn are typically another two
orders of magnitude faster than wavefunction
methods.
Sparsity of matrices and sparse matrix operations
remain central for achieving O(N) scaling in electronic structure methods. Progress in this area, as
well as massive parallelization of these tools, will allow even faster computations. In cases where matrix sparsity cannot be easily achieved (as in metals
at zero temperature), more research for alternative
techniques is still needed.
Acknowledgments
The linear scaling work at Rice University is supported by
the US National Science Foundation, the US Department
of Energy, the Welch Foundation, and Gaussian.
References
1. S. Goedecker, “Linear Scaling Electronic Structure Methods,” Reviews of Modern Physics, vol. 71, no. 4, 1999, pp. 1085–1123.
2. S. Goedecker, Handbook of Numerical Analysis, special volume on
computational chemistry, P.G. Ciarlet and C. Le Bris, eds., NorthHolland, 2003.
3. G.E. Scuseria, “Linear Scaling Density Functional Calculations
with Gaussian Orbitals,” J. Physical Chemistry A, vol. 103, no. 25,
1999, pp. 4782–4790.
4. P. Ordejon, “Order-N Tight-Binding Methods for Electronic-
JULY/AUGUST 2003
Structure and Molecular Dynamics,” Computational Materials Science, vol. 12, no. 3, 1998, pp. 157–191.
5. C.A. White and M. Head-Gordon, “Derivation and Efficient Implementation of the Fast Multipole Method,” J. Chemical Physics,
vol. 101, no. 8, 1994, pp. 6593–6605.
6. M.C. Strain, G.E. Scuseria and M.J. Frisch, “Achieving Linear Scaling for the Electronic Quantum Coulomb Problem,” Science, vol.
271, no. 1, 1996, pp. 51–53.
7. S. Goedecker and O. Ivanov, “Linear Scaling Solution of the
Coulomb Problem Using Wavelets,” Solid State Comm., vol. 105,
no. 11, 1998, pp. 665–669.
8. L. Greengard, “Fast Algorithms for Classical Physics,” Science, vol.
265, 1994, pp. 909–914.
9. N. Marzari and D. Vanderbilt, “Maximally Localized Generalized
Wannier Functions for Composite Energy Bands,” Physics Rev. B,
vol. 56, no. 20, 1997, pp. 12847–12865.
10. J.C. Burant, G.E. Scuseria, and M.J. Frisch, “A Linear Scaling
Method for Hartree-Fock Exchange Calculations of Large Molecules,” J. Chemical Physics, vol. 105, no. 19, 1996, pp.
8969–8972.
11. P. Pulay, “Localizability of Dynamic Electron Correlation,” Chemical Physics Letters, vol. 100, no. 2, 1983, pp. 151–154.
12. M. Schütz, G. Hetzer, and H.-J. Werner, “Low-Order Scaling
Local Electron Correlation Methods. I. Linear Scaling Local
MP2,” J. Chemical Physics, vol. 111, no. 13, 1999, pp.
5691–5705.
13. M.S. Lee, P.E. Maslen, and M. Head-Gordon, “Closely Approximating Second-Order Møller-Plesset Perturbation Theory with
a Local Triatomics in Molecules Model,” J. Chemical Physics, vol.
112, no. 8, 2000, pp. 3592–3601.
14. G.E. Scuseria and P.Y. Ayala, “Linear Scaling Coupled Cluster and
Perturbation Theories in the Atomic Orbital Basis,” J. Chemical
Physics, vol. 111, no. 18, 1999, pp. 8330–8343.
15. P.Y. Ayala and G.E. Scuseria, “Linear Scaling Second-Order MøllerPlesset Theory in the Atomic Orbital Basis for Large Molecular Systems,” J. Chemical Physics, vol. 110, no. 8, 1999, pp. 3660–3671.
16. P.Y. Ayala, K.N. Kudin, and G.E. Scuseria, “Atomic Orbital
Laplace-Transformed Second-Order Møller–Plesset Theory for
Periodic Systems,” J. Chemical Physics, vol. 115, no. 21, 2001,
pp. 9698–9707.
Stefan Goedecker is a professor of computational
physics at the University of Basel. His research interests
include linear scaling algorithms for electronic structure
calculations and other atomistic simulation methods. He
received a PhD from the Swiss Federal Institute of Technology in Lausanne. Contact him at Stefan.Goedecker@
unibas.ch.
Gustavo E. Scuseria is the Robert A. Welch Professor of
Chemistry at Rice University. His research interests include the development of low-order scaling electronic
structure methods and their application to molecules
and solids. His undergraduate and PhD degrees are in
physics from the University of Buenos Aires. He is a
member of the American Chemical Society and a Fellow
of the American Physical Society, the American Association for the Advancement of Science, and the Guggenheim Foundation. Contact him at [email protected].
21