THE DECOMPOSITIONAL APPROACH TO MATRIX COMPUTATION
The introduction of matrix decomposition into numerical linear algebra revolutionized
matrix computations. This article outlines the decompositional approach, comments on its
history, and surveys the six most widely used decompositions.
In 1951, Paul S. Dwyer published Linear Computations, perhaps the first book devoted entirely to numerical linear algebra.1 Digital computing was in its infancy, and Dwyer focused on computation with mechanical calculators. Nonetheless, the book was state of the art. Figure 1 reproduces a page of the book dealing with Gaussian elimination. In 1954, Alston S. Householder published Principles of Numerical Analysis,2 one of the first modern treatments of high-speed digital computation. Figure 2 reproduces a page from this book, also dealing with Gaussian elimination.

The contrast between these two pages is striking. The most obvious difference is that Dwyer
used scalar equations whereas Householder used partitioned matrices. But a deeper difference is that while Dwyer started from a system of equations, Householder worked with a (block) LU decomposition: the factorization of a matrix into the product of lower and upper triangular matrices.
Generally speaking, a decomposition is a factorization of a matrix into simpler factors. The underlying principle of the decompositional approach to matrix computation is that it is not the business of the matrix algorithmist to solve particular problems but to construct computational platforms from which a variety of problems can be solved. This approach, which was in full swing by the mid-1960s, has revolutionized matrix computation.
To illustrate the nature of the decompositional approach and its consequences, I begin with a discussion of the Cholesky decomposition and the solution of linear systems, essentially the decomposition that Gauss computed in his elimination algorithm.
Figure 1. This page from Linear Computations shows that Paul Dwyer's approach begins with a system of scalar equations. Courtesy of John Wiley & Sons.
Figure 2. On this page from Principles of Numerical Analysis, Alston Householder uses partitioned matrices and LU decomposition. Courtesy of McGraw-Hill.
This article also provides a tour of the five other major matrix decompositions, including the pivoted LU decomposition, the QR decomposition, the spectral decomposition, the Schur decomposition, and the singular value decomposition.

A disclaimer is in order. This article deals primarily with dense matrix computations. Although the decompositional approach has greatly influenced iterative and direct methods for sparse matrices, the ways in which it has affected them are different from what I describe here.

The Cholesky decomposition and linear systems

We can use the decompositional approach to solve the system

Ax = b    (1)

where A is positive definite. It is well known that A can be factored in the form

A = R^T R    (2)

where R is an upper triangular matrix. The factorization is called the Cholesky decomposition of A.

The factorization in Equation 2 can be used to solve linear systems as follows. If we write the system in the form R^T Rx = b and set y = R^(-T) b, then x is the solution of the triangular system Rx = y. However, by definition y is the solution of the system R^T y = b. Consequently, we have reduced the problem to the solution of two triangular systems, as illustrated in the following algorithm:

1. Solve the system R^T y = b.
2. Solve the system Rx = y.    (3)

Because triangular systems are easy to solve, the introduction of the Cholesky decomposition has transformed our problem into one for which the solution can be readily computed.

Figure 3. These varieties of Gaussian elimination are all numerically equivalent.
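In modern terms, Equations 2 and 3 take only a few lines; the following is a minimal sketch in Python with NumPy and SciPy (the function name solve_spd and the test data are mine, not the article's):

import numpy as np
from scipy.linalg import cholesky, solve_triangular

def solve_spd(A, b):
    # Equation 2: factor A = R^T R, with R upper triangular.
    R = cholesky(A, lower=False)
    # Equation 3: two triangular solves.
    y = solve_triangular(R, b, trans='T')  # step 1: R^T y = b
    return solve_triangular(R, y)          # step 2: R x = y

A = np.array([[4., 2.], [2., 3.]])
b = np.array([1., 2.])
print(solve_spd(A, b))  # agrees with np.linalg.solve(A, b)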
We can use such decompositions to solve more than one problem. For example, the following algorithm solves the system A^T x = b:

1. Solve the system Ry = b.
2. Solve the system R^T x = y.    (4)

Again, in many statistical applications we want to compute the quantity ρ = x^T A^(-1) x. Because

A^(-1) = R^(-1) R^(-T)    (5)

we can compute ρ as follows:

1. Solve the system R^T y = x.
2. ρ = y^T y.    (6)
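Equation 6 needs only one triangular solve; a sketch along the same lines (my code, under the same A = R^T R convention):

from scipy.linalg import cholesky, solve_triangular

def quad_form_inverse(A, x):
    # rho = x^T A^{-1} x via Equation 5: A^{-1} = R^{-1} R^{-T}.
    R = cholesky(A, lower=False)
    y = solve_triangular(R, x, trans='T')  # step 1: R^T y = x
    return y @ y                           # step 2: rho = y^T y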
The decompositional approach can also save computation. For example, the Cholesky decomposition requires O(n^3) operations to compute, whereas the solution of triangular systems requires only O(n^2) operations. Thus, if you recognize that a Cholesky decomposition is being used to solve a system at one point in a computation, you can reuse the decomposition to do the same thing later without having to recompute it. Historically, Gaussian elimination and its variants (including Cholesky's algorithm) have solved the system in Equation 1 by reducing it to an equivalent triangular system. This mixes the computation of the decomposition with the solution of the first triangular system in Equation 3, and it is not obvious how to reuse the elimination when a new right-hand side presents itself. A naive programmer is in danger of performing the reduction from the beginning, thus repeating the lion's share of the work. On the other hand, a program that knows a decomposition is in the background can reuse it as needed.

(By the way, the problem of recomputing decompositions has not gone away. Some matrix packages hide the fact that they repeatedly compute a decomposition by providing drivers that solve linear systems with a call to a single routine. If the program calls the routine again with the same matrix, it recomputes the decomposition, unnecessarily. Interpretive matrix systems such as Matlab and Mathematica have the same problem: they hide decompositions behind operators and function calls. Such are the consequences of not stressing the decompositional approach to the consumers of matrix algorithms.)
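SciPy's driver pair makes the factor-once, solve-many pattern explicit; a sketch (my example, not the article's):

import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(1)
M = rng.standard_normal((100, 100))
A = M @ M.T + 100 * np.eye(100)   # a positive definite test matrix

factor = cho_factor(A)            # O(n^3): compute the decomposition once
x1 = cho_solve(factor, rng.standard_normal(100))  # O(n^2) per new
x2 = cho_solve(factor, rng.standard_normal(100))  # right-hand side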
Another advantage of working with decompositions is unity. There are different ways of organizing the operations involved in solving linear systems by Gaussian elimination in general and Cholesky's algorithm in particular. Figure 3 illustrates some of these arrangements: a white area contains elements from the original matrix, a dark area contains the factors, a light gray area contains partially processed elements, and the boundary strips contain elements about to be processed. Most of these variants were originally presented in scalar form as new algorithms. Once you recognize that a decomposition is involved, it is easy to see the essential unity of the various algorithms.

All the variants in Figure 3 are numerically equivalent. This means that one rounding-error analysis serves all. For example, the Cholesky algorithm, in whatever guise, is backward stable: the computed factor R satisfies

(A + E) = R^T R    (7)

where E is of the size of the rounding unit relative to A. Establishing this backward stability is usually the most difficult part of an analysis of the use of a decomposition to solve a problem. For example, once Equation 7 has been established, the rounding errors involved in the solutions of the triangular systems in Equation 3 can be incorporated in E with relative ease. Thus, another advantage of the decompositional approach is that it concentrates the most difficult aspects of rounding-error analysis in one place.
In general, if you change the elements of a positive definite matrix, you must recompute its Cholesky decomposition from scratch. However, if the change is structured, it may be possible to compute the new decomposition directly from the old, a process known as updating. For example, you can compute the Cholesky decomposition of A + xx^T from that of A in O(n^2) operations, an enormous saving over the ab initio computation of the decomposition.
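Such an update can be carried out by sweeping a sequence of plane rotations through R; here is a textbook-style sketch of the standard rank-one update (my implementation, not code from the article):

import numpy as np

def chol_update(R, x):
    # Given A = R^T R (R upper triangular), return the factor of A + x x^T
    # in O(n^2) operations; each pass rotates one entry of x into R.
    R, x = R.copy(), x.astype(float).copy()
    n = x.size
    for k in range(n):
        r = np.hypot(R[k, k], x[k])           # new kth diagonal element
        c, s = r / R[k, k], x[k] / R[k, k]
        R[k, k] = r
        R[k, k + 1:] = (R[k, k + 1:] + s * x[k + 1:]) / c
        x[k + 1:] = c * x[k + 1:] - s * R[k, k + 1:]
    return R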
Finally, the decompositional approach has greatly affected the development of software for matrix computation. Instead of wasting energy developing code for a variety of specific applications, the producers of a matrix package can concentrate on the decompositions themselves, perhaps with a few auxiliary routines to handle the most important applications. This approach has informed the major public-domain packages: the Handbook series,3 Eispack,4 Linpack,5 and Lapack.6 A consequence of this emphasis on decompositions is that software developers have found that most algorithms have broad computational features in common, features that can be relegated to basic linear-algebra subprograms (such as the BLAS), which can then be optimized for specific machines.7-9

For easy reference, the sidebar "Benefits of the decompositional approach" summarizes the advantages of decomposition.
History

All the widely used decompositions had made their appearance by 1909, when Schur introduced the decomposition that now bears his name. However, with the exception of the Schur decomposition, they were not cast in the language of matrices (in spite of the fact that matrices had been introduced by Cayley10 in 1858). I provide some historical background for the individual decompositions later, but it is instructive here to consider how the originators proceeded in the absence of matrices.
Benefits of the decompositional approach

• A matrix decomposition solves not one but many problems.
• A matrix decomposition, which is generally expensive to compute, can be reused to solve new problems involving the original matrix.
• The decompositional approach often shows that apparently different algorithms are actually computing the same object.
• The decompositional approach facilitates rounding-error analysis.
• Many matrix decompositions can be updated, sometimes with great savings in computation.
• By focusing on a few decompositions instead of a host of specific problems, software developers have been able to produce highly effective matrix packages.

Gauss, who worked with positive definite systems defined by the normal equations for least squares, described his elimination procedure as
the reduction of a quadratic form q(x) = x^T Ax (I am simplifying a little here). In terms of the Cholesky factorization A = R^T R, Gauss wrote q(x) in the form

q(x) = ρ1^2(x) + ρ2^2(x) + ... + ρn^2(x)    (8)

where ρi(x) = ri^T x and ri^T is the ith row of R. Thus Gauss reduced q(x) to a sum of squares of linear functions ρi. Because R is upper triangular, the function ρi(x) depends only on the components xi, ..., xn of x. Since the coefficients in the linear forms ρi are the elements of R, Gauss, by showing how to compute the ρi, effectively computed the Cholesky decomposition of A.
Other decompositions were introduced in other ways. For example, Jacobi introduced the LU decomposition as a decomposition of a bilinear form into a sum of products of linear functions having an appropriate triangularity with respect to the variables. The singular value decomposition made its appearance as an orthogonal change of variables that diagonalized a bilinear form. Eventually, all these decompositions found expression as factorizations of matrices.11
The process by which decomposition became so important to matrix computations was slow and incremental. Gauss certainly had the spirit. He used his decomposition to perform many tasks, such as computing variances, and even used it to update least-squares solutions. But Gauss never regarded his decomposition as a matrix factorization, and it would be anachronistic to consider him the father of the decompositional approach.
In the 1940s, awareness grew that the usual algorithms for solving linear systems involved matrix factorization.12,13 John von Neumann and H.H. Goldstine, in their ground-breaking error analysis of the solution of linear systems, pointed out the division of labor between computing a factorization and solving the system:14

We may therefore interpret the elimination method as one which bases the inverting of an arbitrary matrix A on the combination of two tricks: First it decomposes A into the product of two semi-diagonal matrices C, B', and consequently the inverse of A obtains immediately from those of C and B'. Second it forms their inverses by a simple, explicit, inductive process.
In the 1950s and early 1960s, Householder systematically explored the relation between various algorithms in matrix terms. His book The Theory of Matrices in Numerical Analysis is the mathematical epitome of the decompositional approach.15

In 1954, Givens showed how to reduce a symmetric matrix A to tridiagonal form by orthogonal transformations.16 The reduction was merely a way station to the computation of the eigenvalues of A, and at the time no one thought of it as a decomposition. However, it and other intermediate forms have proven useful in their own right and have become a staple of the decompositional approach.

In 1961, James Wilkinson gave the first backward rounding-error analysis of the solutions of linear systems.17 Here, the division of labor is complete. He gives one analysis of the computation of the LU decomposition and another of the solution of triangular systems and then combines the two. Wilkinson continued analyzing various algorithms for computing decompositions, introducing uniform techniques for dealing with the transformations used in the computations. By the time his book The Algebraic Eigenvalue Problem18 appeared in 1965, the decompositional approach was firmly established.

The big six

There are many matrix decompositions, old and new, and the list of the latter seems to grow daily. Nonetheless, six decompositions hold the center. The reason is that they are useful and stable: they have important applications, and the algorithms that compute them have a satisfactory backward rounding-error analysis (see Equation 7). In this brief tour, I provide references only for details that cannot be found in the many excellent texts and monographs on numerical linear algebra,18-26 the Handbook series,3 or the LINPACK Users' Guide.5

The Cholesky decomposition

Description. Given a positive definite matrix A, there is a unique upper triangular matrix R with positive diagonal elements such that

A = R^T R.

In this form, the decomposition is known as the Cholesky decomposition. It is often written in the form

A = LDL^T

where D is diagonal and L is unit lower triangular (that is, L is lower triangular with ones on the diagonal).

Applications. The Cholesky decomposition is used primarily to solve positive definite linear systems, as in Equations 3 and 6. It can also be employed to compute quantities useful in statistics, as in Equation 4.

Algorithms. A Cholesky decomposition can be computed using any of the variants of Gaussian elimination (see Figure 3), modified, of course, to take advantage of symmetry. All these algorithms take approximately n^3/6 floating-point additions and multiplications. The algorithm Cholesky proposed corresponds to the diagram in the lower right of Figure 3.

Updating. Given a Cholesky decomposition A = R^T R, you can calculate the Cholesky decomposition of A + xx^T from R and x in O(n^2) floating-point additions and multiplications. The Cholesky decomposition of A - xx^T can be calculated with the same number of operations. The latter process, which is called downdating, is numerically less stable than updating.

The pivoted Cholesky decomposition. If P is a permutation matrix and A is positive definite, then P^T AP is said to be a diagonal permutation of A (among other things, it permutes the diagonals of A). Any diagonal permutation of A is positive definite and has a Cholesky factor. Such a factorization is called a pivoted Cholesky factorization. There are many ways to pivot a Cholesky decomposition, but the most common one produces a factor satisfying
r_kk^2 >= r_kj^2 + r_(k+1)j^2 + ... + r_jj^2,    j = k+1, ..., n.    (9)

In particular, if A is positive semidefinite, this strategy will assure that R has the form

R = ( R11  R12 )
    (  0    0  )

where R11 is nonsingular and has the same order as the rank of A. Hence, the pivoted Cholesky decomposition is widely used for rank determination.

History. The Cholesky decomposition (more precisely, an LDL^T decomposition) was the decomposition of Gauss's elimination algorithm, which he sketched in 1809 and presented in full in 1810.27,28 Benoit published Cholesky's variant posthumously in 1924.29

The pivoted LU decomposition

Description. Given a matrix A of order n, there are permutations P and Q such that

P^T AQ = LU

where L is unit lower triangular and U is upper triangular. The matrices P and Q are not unique, and the process of selecting them is known as pivoting.

Applications. Like the Cholesky decomposition, the LU decomposition is used primarily for solving linear systems. However, since A is a general matrix, this application covers a wide range. For example, the LU decomposition is used to compute the steady-state vector of Markov chains and, with the inverse power method, to compute eigenvectors.

Algorithms. The basic algorithm for computing LU decompositions is a generalization of Gaussian elimination to nonsymmetric matrices. When and how this generalization arose is obscure (see Dwyer1 for comments and references). Except for special matrices (such as positive definite and diagonally dominant matrices), the method requires some form of pivoting for stability. The most common form is partial pivoting, in which pivot elements are chosen from the column to be eliminated. This algorithm requires about n^3/3 additions and multiplications.

Certain contrived examples show that Gaussian elimination with partial pivoting can be unstable. Nonetheless, it works well for the overwhelming majority of real-life problems.30,31 Why is an open question.

History. In establishing the existence of the LU decomposition, Jacobi32 showed that under certain conditions a bilinear form φ(x, y) can be written in the form

φ(x, y) = ρ1(x)σ1(y) + ρ2(x)σ2(y) + ... + ρn(x)σn(y)

where the ρi and σi are linear functions that depend only on the last (n - i + 1) components of their arguments. The coefficients of the functions are the elements of L and U.
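One pivoted LU factorization likewise serves many problems; a sketch using SciPy's LAPACK-based drivers (partial pivoting only, so Q is the identity here; the data are my example):

import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[0., 2., 1.],
              [1., 1., 1.],
              [2., 1., 3.]])
b = np.ones(3)

lu, piv = lu_factor(A)                # one O(n^3) factorization: P^T A = LU
x = lu_solve((lu, piv), b)            # O(n^2): solve A x = b
y = lu_solve((lu, piv), b, trans=1)   # reuse it to solve A^T y = b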
The QR decomposition

Description. Let A be an m x n matrix with m >= n. There is an orthogonal matrix Q such that

Q^T A = ( R )
        ( 0 )

where R is upper triangular with nonnegative diagonal elements (or positive diagonal elements if A is of rank n). If we partition Q in the form

Q = ( Q_A  Q_⊥ )

where Q_A has n columns, then we can write

A = Q_A R    (10)

This is sometimes called the QR factorization of A.

Applications. When A is of rank n, the columns of Q_A form an orthonormal basis for the column space of A, and the columns of Q_⊥ form an orthonormal basis of its orthogonal complement. In particular, Q_A Q_A^T is the orthogonal projection onto the column space of A. For this reason, the QR decomposition is widely used in applications with a geometric flavor, especially least squares.

Algorithms. There are two distinct classes of algorithms for computing the QR decomposition: Gram-Schmidt algorithms and orthogonal triangularization.

Gram-Schmidt algorithms proceed stepwise by orthogonalizing the kth column of A against the first (k - 1) columns of Q to get the kth column of Q. There are two forms of the Gram-Schmidt algorithm, the classical and the modified, and they both compute only the factorization in Equation 10. The classical Gram-Schmidt is unstable. The modified form can produce a matrix Q_A whose columns deviate
from orthogonality. But the deviation is bounded, and the computed factorization can be used in certain applications, notably computing least-squares solutions. If the orthogonalization step is repeated at each stage, a process known as reorthogonalization, both algorithms will produce a fully orthogonal factorization. When n <= m, the algorithms without reorthogonalization require about mn^2 additions and multiplications.
The method of orthogonal triangularization proceeds by premultiplying A by certain simple orthogonal matrices until the elements below the diagonal are zero. The product of the orthogonal matrices is Q, and R is the upper triangular part of the reduced A. Again, there are two versions. The first reduces the matrix by Householder transformations. The method has the advantage that it represents the entire matrix Q in the same amount of memory that is required to hold A, a great savings when m >> n. The second method reduces A by plane rotations. It is less efficient than the first method, but is better suited for matrices with structured patterns of nonzero elements.
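The modified Gram-Schmidt step just described is compact enough to show in full; a sketch of the textbook algorithm (my code, not the article's):

import numpy as np

def mgs_qr(A):
    # Modified Gram-Schmidt: A = Q R (Equation 10), Q is m x n with
    # orthonormal columns, R is n x n upper triangular.
    A = np.array(A, dtype=float)
    m, n = A.shape
    Q, R = np.zeros((m, n)), np.zeros((n, n))
    for k in range(n):
        v = A[:, k].copy()
        for j in range(k):
            R[j, k] = Q[:, j] @ v      # orthogonalize against the updated v
            v -= R[j, k] * Q[:, j]     # (classical GS would use A[:, k] here)
        R[k, k] = np.linalg.norm(v)
        Q[:, k] = v / R[k, k]
    return Q, R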
Relation to the Cholesky decomposition. From Equation 10, it follows that

A^T A = R^T R.    (11)

In other words, the triangular factor of the QR decomposition of A is the triangular factor of the Cholesky decomposition of the cross-product matrix A^T A. Consequently, many problems, particularly least-squares problems, can be solved using either a QR decomposition from a least-squares matrix or the Cholesky decomposition from the normal equations. The QR decomposition usually gives more accurate results, whereas the Cholesky decomposition is often faster.

Updating. Given a QR factorization of A, there are stable, efficient algorithms for recomputing the QR factorization after rows and columns have been added to or removed from A. In addition, the QR decomposition of the rank-one modification A + xy^T can be stably updated.

The pivoted QR decomposition. If P is a permutation matrix, then AP is a permutation of the columns of A, and (AP)^T (AP) is a diagonal permutation of A^T A. In view of the relation of the QR and the Cholesky decompositions, it is not surprising that there is a pivoted QR factorization whose triangular factor R satisfies Equation 9. In particular, if A has rank k, then its pivoted QR factorization has the form

AP = ( Q1  Q2 ) ( R11  R12 )
                (  0    0  )

where R11 is nonsingular of order k. It follows that either Q1 or the first k columns of AP form a basis for the column space of A. Thus, the pivoted QR decomposition can be used to extract a set of linearly independent columns from A.
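Column extraction with a pivoted QR factorization is a one-call affair in SciPy; a small sketch (the matrix and the rank tolerance are my example choices):

import numpy as np
from scipy.linalg import qr

A = np.array([[1., 2., 3.],
              [2., 4., 7.],
              [3., 6., 8.]])          # second column = 2 x first: rank 2
Q, R, piv = qr(A, pivoting=True)      # A[:, piv] = Q R
k = int(np.sum(np.abs(np.diag(R)) > 1e-12 * np.abs(R[0, 0])))
basis_cols = A[:, piv[:k]]            # k linearly independent columns of A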
History. The QR factorization first appeared in a work by Erhard Schmidt on integral equations.33 Specifically, Schmidt showed how to orthogonalize a sequence of functions by what is now known as the Gram-Schmidt algorithm. (Curiously, Laplace produced the basic formulas,34 but had no notion of orthogonality.) The name QR comes from the QR algorithm, named by Francis (see the history notes for the Schur algorithm, discussed later). Householder introduced Householder transformations to matrix computations and showed how they could be used to triangularize a general matrix.35 Plane rotations were introduced by Givens,16 who used them to reduce a symmetric matrix to tridiagonal form. Bogert and Burris appear to be the first to use them in orthogonal triangularization.36 The first updating algorithm (adding a row) is due to Golub,37 who also introduced the idea of pivoting.
The spectral decomposition

Description. Let A be a symmetric matrix of order n. There is an orthogonal matrix V such that

A = V Λ V^T,    Λ = diag(λ1, ..., λn).    (12)

If vi denotes the ith column of V, then Avi = λi vi. Thus (λi, vi) is an eigenpair of A, and the spectral decomposition shown in Equation 12 exhibits the eigenvalues of A along with a complete orthonormal system of eigenvectors.

Applications. The spectral decomposition finds applications wherever the eigensystem of a symmetric matrix is needed, which is to say in virtually all technical disciplines.

Algorithms. There are three classes of algorithms to compute the spectral decomposition: the QR algorithm, the divide-and-conquer algorithm, and Jacobi's algorithm. The first two require a preliminary reduction to tridiagonal form by orthogonal similarities. I discuss the QR algorithm in the next section on the Schur decomposition.
The divide-and-conquer algorithm38,39 is comparatively recent and is usually faster than the QR algorithm when both eigenvalues and eigenvectors are desired; however, it is not suitable for problems in which the eigenvalues vary widely in magnitude. The Jacobi algorithm is much slower than the other two, but for positive definite matrices it may be more accurate.40 All these algorithms require O(n^3) operations.

Updating. The spectral decomposition can be updated. Unlike the Cholesky and QR decompositions, the algorithm does not result in a reduction in the order of the work: it still remains O(n^3), although the order constant is lower.

History. The spectral decomposition dates back to an 1829 paper by Cauchy,41 who introduced the eigenvectors as solutions of equations of the form Ax = λx and proved the orthogonality of eigenvectors belonging to distinct eigenvalues. In 1846, Jacobi42 gave his famous algorithm for the spectral decomposition, which iteratively reduces the matrix in question to diagonal form by a special type of plane rotations, now called Jacobi rotations. The reduction to tridiagonal form by plane rotations is due to Givens16 and by Householder transformations to Householder.43
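In NumPy the spectral decomposition is a single call; a quick sketch verifying Equation 12 (my example):

import numpy as np

A = np.array([[2., 1.], [1., 2.]])
lam, V = np.linalg.eigh(A)       # eigenvalues (ascending) and orthogonal V
assert np.allclose(V @ np.diag(lam) @ V.T, A)   # A = V Lambda V^T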
The Schur decomposition

Description. Let A be a matrix of order n. There is a unitary matrix U such that

A = U T U^H

where T is upper triangular and the superscript H denotes the conjugate transpose. The diagonal elements of T are the eigenvalues of A, which, by appropriate choice of U, can be made to appear in any order. This decomposition is called a Schur decomposition of A.

A real matrix can have complex eigenvalues and hence a complex Schur form. By allowing T to have real 2 x 2 blocks on its diagonal that contain its complex eigenvalues, the entire decomposition can be made real. This is sometimes called a real Schur form.

Applications. An important use of the Schur form is as an intermediate form from which the eigenvalues and eigenvectors of a matrix can be computed. On the other hand, the Schur decomposition can often be used in place of a complete system of eigenpairs, which, in fact, may not exist. A good example is the solution of Sylvester's equation and its relatives.44,45

Algorithms. After a preliminary reduction to Hessenberg form, which is usually done with Householder transformations, the Schur form is computed using the QR algorithm.46 Elsewhere in this issue, Beresford Parlett discusses the modern form of the algorithm. It is one of the most flexible algorithms in the repertoire, having variants for the spectral decomposition, the singular value decomposition, and the generalized eigenvalue problem.

History. Schur introduced his decomposition in 1909.48 It was the only one of the big six to have been derived in terms of matrices. It was largely ignored until Francis's QR algorithm pushed it into the limelight.
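SciPy computes the real Schur form discussed above; a short sketch (my example, a rotation whose eigenvalues are the complex pair ±i):

import numpy as np
from scipy.linalg import schur

A = np.array([[0., -1.], [1., 0.]])
T, U = schur(A)      # real Schur form: one 2 x 2 block on the diagonal
assert np.allclose(U @ T @ U.T, A)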
The singular value decomposition

Description. Let A be an m x n matrix with m >= n. There are orthogonal matrices U and V such that

U^T A V = ( Σ )
          ( 0 )

where

Σ = diag(σ1, ..., σn),    σ1 >= σ2 >= ... >= σn >= 0.

This decomposition is called the singular value decomposition of A. If U_Σ consists of the first n columns of U, we can write

A = U_Σ Σ V^T    (13)

which is sometimes called the singular value factorization of A. The diagonal elements of Σ are called the singular values of A. The corresponding columns of U and V are called left and right singular vectors of A.

Applications. Most of the applications of the QR decomposition can also be handled by the singular value decomposition. In addition, the singular value decomposition gives a basis for the row space of A and is more reliable in determining rank. It can also be used to compute optimal low-rank approximations and to compute angles between subspaces.

Relation to the spectral decomposition. The singular value factorization is related to the spectral decomposition in much the same way as the QR factorization is related to the Cholesky decomposition. Specifically, from Equation 13 it follows that

A^T A = V Σ^2 V^T.

Thus the eigenvalues of the cross-product matrix A^T A are the squares of the singular values of A, and the eigenvectors of A^T A are right singular vectors of A.
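The optimal low-rank approximation mentioned above comes from truncating Equation 13; a sketch with NumPy (my code; by the Eckart-Young theorem the truncation is optimal in the 2-norm):

import numpy as np

def best_rank_k(A, k):
    # Keep the k leading singular triplets of A = U Sigma V^T.
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]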
Algorithms. As with the spectral decomposition, there are three classes of algorithms for computing the singular value decomposition: the QR algorithm, a divide-and-conquer algorithm, and a Jacobi-like algorithm. The first two require a reduction of A to bidiagonal form. The divide-and-conquer algorithm49 is often faster than the QR algorithm, and the Jacobi algorithm is the slowest.

History. The singular value decomposition was introduced independently by Beltrami50 in 1873 and Jordan51 in 1874. The reduction to bidiagonal form is due to Golub and Kahan,52 as is the variant of the QR algorithm. The first Jacobi-like algorithm for computing the singular value decomposition was given by Kogbetliantz.53

The big six are not the only decompositions in use; in fact, there are many more. As mentioned earlier, certain intermediate forms, such as tridiagonal and Hessenberg form, have come to be regarded as decompositions in their own right. Since the singular value decomposition is expensive to compute and not readily updated, rank-revealing alternatives have received considerable attention.54,55 There are also generalizations of the singular value decomposition and the Schur decomposition for pairs of matrices.56,57 All crystal balls become cloudy when they look to the future, but it seems safe to say that as long as new matrix problems arise, new decompositions will be devised to solve them.

Acknowledgment
This work was supported by the National Science Foundation under Grant No. 970909.8562.
References

1. P.S. Dwyer, Linear Computations, John Wiley & Sons, New York, 1951.
2. A.S. Householder, Principles of Numerical Analysis, McGraw-Hill, New York, 1953.
3. J.H. Wilkinson and C. Reinsch, Handbook for Automatic Computation, Vol. II, Linear Algebra, Springer-Verlag, New York, 1971.
4. B.S. Garbow et al., "Matrix Eigensystem Routines: Eispack Guide Extension," Lecture Notes in Computer Science, Springer-Verlag, New York, 1977.
5. J.J. Dongarra et al., LINPACK Users' Guide, SIAM, Philadelphia, 1979.
6. E. Anderson et al., LAPACK Users' Guide, 2nd ed., SIAM, Philadelphia, 1995.
7. J.J. Dongarra et al., "An Extended Set of Fortran Basic Linear Algebra Subprograms," ACM Trans. Mathematical Software, Vol. 14, 1988, pp. 1-17.
8. J.J. Dongarra et al., "A Set of Level 3 Basic Linear Algebra Subprograms," ACM Trans. Mathematical Software, Vol. 16, 1990, pp. 1-17.
9. C.L. Lawson et al., "Basic Linear Algebra Subprograms for Fortran Usage," ACM Trans. Mathematical Software, Vol. 5, 1979, pp. 308-323.
10. A. Cayley, "A Memoir on the Theory of Matrices," Philosophical Trans. Royal Soc. of London, Vol. 148, 1858, pp. 17-37.
11. C.C. MacDuffee, The Theory of Matrices, Chelsea, New York, 1946.
12. P.S. Dwyer, "A Matrix Presentation of Least Squares and Correlation Theory with Matrix Justification of Improved Methods of Solution," Annals Math. Statistics, Vol. 15, 1944, pp. 82-89.
13. H. Jensen, An Attempt at a Systematic Classification of Some Methods for the Solution of Normal Equations, Report No. 18, Geodaetisk Institut, Copenhagen, 1944.
14. J. von Neumann and H.H. Goldstine, "Numerical Inverting of Matrices of High Order," Bull. Am. Math. Soc., Vol. 53, 1947, pp. 1021-1099.
15. A.S. Householder, The Theory of Matrices in Numerical Analysis, Dover, New York, 1964.
16. W. Givens, Numerical Computation of the Characteristic Values of a Real Matrix, Tech. Report 1574, Oak Ridge Nat'l Laboratory, Oak Ridge, Tenn., 1954.
17. J.H. Wilkinson, "Error Analysis of Direct Methods of Matrix Inversion," J. ACM, Vol. 8, 1961, pp. 281-330.
18. J.H. Wilkinson, The Algebraic Eigenvalue Problem, Clarendon Press, Oxford, UK, 1965.
19. Å. Björck, Numerical Methods for Least Squares Problems, SIAM, Philadelphia, 1996.
20. B.N. Datta, Numerical Linear Algebra and Applications, Brooks/Cole, Pacific Grove, Calif., 1991.
21. J.W. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, 1997.
22. N.J. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, 1996.
23. B.N. Parlett, The Symmetric Eigenvalue Problem, Prentice-Hall, Englewood Cliffs, N.J., 1980; reissued with revisions by SIAM, Philadelphia, 1998.
24. G.W. Stewart, Introduction to Matrix Computations, Academic Press, New York, 1973.
25. G.W. Stewart, Matrix Algorithms I: Basic Decompositions, SIAM, Philadelphia, 1998.
26. D.S. Watkins, Fundamentals of Matrix Computations, John Wiley & Sons, New York, 1991.
27. C.F. Gauss, Theoria Motus Corporum Coelestium in Sectionibus Conicis Solem Ambientium [Theory of the Motion of the Heavenly Bodies Moving about the Sun in Conic Sections], Perthes and Besser, Hamburg, Germany, 1809.
28. C.F. Gauss, "Disquisitio de elementis ellipticis Palladis [Disquisition on the Elliptical Elements of Pallas]," Commentationes Societatis Regiae Scientiarum Gottingensis Recentiores, Vol. 1, 1810.
29. C. Benoit, "Note sur une méthode de résolution des équations normales provenant de l'application de la méthode des moindres carrés à un système d'équations linéaires en nombre inférieur à celui des inconnues; application de la méthode à la résolution d'un système défini d'équations linéaires (Procédé du Commandant Cholesky) [Note on a Method for Solving the Normal Equations Arising from the Application of the Method of Least Squares to a System of Linear Equations whose Number Is Less than the Number of Unknowns; Application of the Method to the Solution of a Definite System of Linear Equations]," Bulletin Géodésique (Toulouse), Vol. 2, 1924, pp. 5-77.
30. L.N. Trefethen and R.S. Schreiber, "Average-Case Stability of Gaussian Elimination," SIAM J. Matrix Analysis and Applications, Vol. 11, 1990, pp. 335-360.
31. L.N. Trefethen and D. Bau III, Numerical Linear Algebra, SIAM, Philadelphia, 1997, pp. 166-170.
32. C.G.J. Jacobi, "Über einen algebraischen Fundamentalsatz und seine Anwendungen [On a Fundamental Theorem in Algebra and Its Applications]," Journal für die reine und angewandte Mathematik, Vol. 53, 1857, pp. 275-280.
33. E. Schmidt, "Zur Theorie der linearen und nichtlinearen Integralgleichungen. I Teil, Entwicklung willkürlicher Funktionen nach System vorgeschriebener [On the Theory of Linear and Nonlinear Integral Equations. Part I, Expansion of Arbitrary Functions by a Prescribed System]," Mathematische Annalen, Vol. 63, 1907, pp. 433-476.
34. P.S. Laplace, Théorie Analytique des Probabilités [Analytical Theory of Probabilities], 3rd ed., Courcier, Paris, 1820.
35. A.S. Householder, "Unitary Triangularization of a Nonsymmetric Matrix," J. ACM, Vol. 5, 1958, pp. 339-342.
36. D. Bogert and W.R. Burris, Comparison of Least Squares Algorithms, Tech. Report ORNL-3499, Vol. 1, Sec. 5.5, Neutron Physics Div., Oak Ridge Nat'l Laboratory, Oak Ridge, Tenn., 1963.
37. G.H. Golub, "Numerical Methods for Solving Least Squares Problems," Numerische Mathematik, Vol. 7, 1965, pp. 206-216.
38. J.J.M. Cuppen, "A Divide and Conquer Method for the Symmetric Eigenproblem," Numerische Mathematik, Vol. 36, 1981, pp. 177-195.
39. M. Gu and S.C. Eisenstat, "A Stable and Efficient Algorithm for the Rank-One Modification of the Symmetric Eigenproblem," SIAM J. Matrix Analysis and Applications, Vol. 15, 1994, pp. 1266-1276.
40. J. Demmel and K. Veselić, "Jacobi's Method Is More Accurate than QR," SIAM J. Matrix Analysis and Applications, Vol. 13, 1992, pp. 1204-1245.
41. A.L. Cauchy, "Sur l'équation à l'aide de laquelle on détermine les inégalités séculaires des mouvements des planètes [On the Equation by which the Inequalities for the Secular Movements of the Planets Are Determined]," Oeuvres Complètes (IIe Série), Vol. 9, 1829.
42. C.G.J. Jacobi, "Über ein leichtes Verfahren die in der Theorie der Säcularstörungen vorkommenden Gleichungen numerisch aufzulösen [On an Easy Method for the Numerical Solution of the Equations that Appear in the Theory of Secular Perturbations]," Journal für die reine und angewandte Mathematik, Vol. 30, 1846, pp. 51-94.
43. J.H. Wilkinson, "Householder's Method for the Solution of the Algebraic Eigenvalue Problem," Computer J., Vol. 3, 1960, pp. 23-27.
44. R.H. Bartels and G.W. Stewart, "Algorithm 432: Solution of the Matrix Equation AX + XB = C," Comm. ACM, Vol. 15, 1972, pp. 820-826.
45. G.H. Golub, S. Nash, and C. Van Loan, "A Hessenberg-Schur Method for the Problem AX + XB = C," IEEE Trans. Automatic Control, Vol. AC-24, 1979, pp. 909-913.
46. J.G.F. Francis, "The QR Transformation, Parts I and II," Computer J., Vol. 4, 1961, pp. 265-271; 1962, pp. 332-345.
47.
48. I. Schur, "Über die charakteristischen Wurzeln einer linearen Substitution mit einer Anwendung auf die Theorie der Integralgleichungen [On the Characteristic Roots of a Linear Transformation with an Application to the Theory of Integral Equations]," Mathematische Annalen, Vol. 66, 1909, pp. 448-510.
49. M. Gu and S.C. Eisenstat, "A Divide-and-Conquer Algorithm for the Bidiagonal SVD," SIAM J. Matrix Analysis and Applications, Vol. 16, 1995, pp. 79-92.
50. E. Beltrami, "Sulle funzioni bilineari [On Bilinear Functions]," Giornale di Matematiche ad Uso degli Studenti delle Università, Vol. 11, 1873, pp. 98-106. An English translation by D. Boley is available as a technical report from the Dept. of Computer Science, Univ. of Minnesota, Minneapolis, 1990.
51. C. Jordan, "Mémoire sur les formes bilinéaires [Memoir on Bilinear Forms]," Journal de Mathématiques Pures et Appliquées, Deuxième Série, Vol. 19, 1874, pp. 35-54.
52. G.H. Golub and W. Kahan, "Calculating the Singular Values and Pseudo-Inverse of a Matrix," SIAM J. Numerical Analysis, Vol. 2, 1965, pp. 205-224.
53. E.G. Kogbetliantz, "Solution of Linear Systems by Diagonalization of Coefficients Matrix," Quarterly of Applied Math., Vol. 13, 1955, pp. 123-132.
54. T.F. Chan, "Rank Revealing QR Factorizations," Linear Algebra and Its Applications, Vol. 88/89, 1987, pp. 67-82.
55. G.W. Stewart, "An Updating Algorithm for Subspace Tracking," IEEE Trans. Signal Processing, Vol. 40, 1992, pp. 1535-1541.
56. Z. Bai and J.W. Demmel, "Computing the Generalized Singular Value Decomposition," SIAM J. Scientific and Statistical Computing, Vol. 14, 1993, pp. 1464-1468.
57. C.B. Moler and G.W. Stewart, "An Algorithm for Generalized Matrix Eigenvalue Problems," SIAM J. Numerical Analysis, Vol. 10, 1973, pp. 241-256.

G.W. Stewart is a professor in the Department of Computer Science and in the Institute for Advanced Computer Studies at the University of Maryland. His research interests focus on numerical linear algebra, perturbation theory, and rounding-error analysis. He earned his PhD from the University of Tennessee, where his advisor was A.S. Householder. Contact Stewart at the Dept. of Computer Science, Univ. of Maryland, College Park, MD 20742; [email protected].