* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download The decompositional approach to matrix computation
Symmetric cone wikipedia , lookup
Rotation matrix wikipedia , lookup
Four-vector wikipedia , lookup
Linear least squares (mathematics) wikipedia , lookup
Determinant wikipedia , lookup
Eigenvalues and eigenvectors wikipedia , lookup
Matrix (mathematics) wikipedia , lookup
Jordan normal form wikipedia , lookup
Matrix calculus wikipedia , lookup
Perron–Frobenius theorem wikipedia , lookup
Principal component analysis wikipedia , lookup
Cayley–Hamilton theorem wikipedia , lookup
System of linear equations wikipedia , lookup
Non-negative matrix factorization wikipedia , lookup
Orthogonal matrix wikipedia , lookup
Gaussian elimination wikipedia , lookup
THEDECOMPOSITIONAL APPROACH TO MKTRIX COMPUTATION The introduction of matrix decomposition into numerical linear algebra revolutionized matrix computations. This article outlines the decompositional approach, comments on its history, and surveys the six most widely used decompositions. 1951, Paul S. Dwyer pnblishcd I,inenr Comptntions, perhaps the first hook dcvoted cntirely to numerical linear algebra.' Digital conipiiting was in its infancy, and Dwyer focused on computation with tncchanical calculators. Nonctheless, the hook was state of the art. Figurc 1 reproduces a pagc of thc book dealing with Gaussian elimination. In 1954, Alston S. Householder published Prim+ p b nf Nzc"cn1 Ann1ysi.r; one of the first modcrn treatments of high-spccd digital coinputation. Figure 2 rcproduccs a page from this hook, also dealing with Gaussian elimiiiation. The contrast between tlicse two pages is striking. l'he most obvious difference is that Ilwyer 11 1521-96151001S10.000 2000 IEEE 50 used scalar equations whereas Householder used partitioned matriccs. Rut a d e e p difference is that while Dwyer started from a system of equations, I-Iouseholder worked with i\ (block) LU decoinpiisition-the factorization of a matrix into the product of lower and upper triangular matrices. Gencrally spcaking, a decomposition is a factorization of a inatrix into simpler factors. T h e undcrlying principlc of thc decompositional approach to matrix computation is that it is not the husiness of tlic matrix algorithmists to solve particular problems hut to construct computational platforms from which a variety of problems can be solvcd. 'rliis approach, which was in full swing by thc inid-l960s, has revolutionized inatrix computation. To illustrate tlic nanirc of tlic decompositional approach and its cousequcnces, I begin with a discussion of the Cholesky decotnpositioii and the solution of linear system-essentially the decoiiiposition that Gauss computed in his dim- COMPUTING IN SCIENCE & ENGINEERING I d P 8-3 + "d. n=t - Ra /* Figure 1. This page from Linear computations shows that Paul D y e r ' s approach begins with a system of scalar equations. Courtesy of John Wiley 6r Sons. Figure 2. On this page from Principles ofNumerical Analysis, Alston Householder uses partitioned matrices and LU decomposition. Courtesy of McCraw-Hill. ination algorithm. 'l'his article also provides a A caii be factored in thc form tour of the five other major matrix decompositions, including thc pivoted I,U decomposition, A = R% (2) the Q R decomposition, the spectral decomposition, the Schur decomposition, and the singu- where R is an upper triangular iuatrix. T h e factorization is called the Cholesky decomposition of lar value decomposition. A disclaiiner is in order. This article deals pri- A. The factorizatioii in Equation 2 caii be used marily with dense matrix computations. Ndiough tlic decompositional approach has greatly to solvc linear systems ;is follows. Ifwc write thc influcnced iterativc and dircct methods for sparse system in thc form R"Rx = b and sety = R-Th, matrices, the ways in which it has affected thctii then x is the solution of thc triangnlar system Rx =y. IIowever, by clefinitioiiy is the solution of are differcnt from what I describe here. thc systeiii R'ry = b. Coiisequently, we havc reduced the prnhlein t~ the solutim of two trianThe Cholesky decomposition and gular systems, iis illustrated in the following alllnear systems gorithm: We caii nse the decompositionill approach to 1 , Solve the system R-5 = b. solve the system 2 . Solve the system &. =y. (3) Ax=b (1) Bccansc triaiigiilar systcins arc easy to solvc, the where A is positive defiiiitc. It is wcll known that introduction of thc Cholesky dccoiiipositioii has JANUARYIFEBRUARY 2000 51 Figure 3. These varieties of Gaussian elimination are all numerically equivalent. transformed our problem into one for which the solution can be readily coniputed. We can nse such decompositions to solve innre than one problem. For example, the following algorithm solves thc systetnATx = b: 1. Solvc the system Ry = b. 2 . Solve the systcin d S x = y . (4) Again, in many statistical applications wc want to coinputc the qnantity p = xTAA'r.Because we can compute p as follows: 1. Solve the system R'y 2 . p = j'y. = x. (6) used to solve a system at one point in a computation, yon can rensc the decomposition to do the same thing later without having to reconpnte it. Historically, Gaussian elimination and its variants (including CholcskyS algorithm) have solved tlie systcin in Equation 1 by rcducing it to an equivalent triangular system. This mixcs tlie computation of the decomposition with the solution of the first triangular system in Equation 3 , a i d it is not obvious how to reuse the elimination when a new right-hand side presents itself. A naive programmer is in danger of pcrforming the reduction from the hcginning, thus repeating the IionTs share of the work. On the other hand, a program that knows a decoinpnsition is in the background can reuse it as needed. (By tlie way, the problem of recomputing dccompositions has not gone away. Some matrix packages hide the fact that they repeatedly coinpnte a decoiiiposition by providing drivers to solve linear systems with a call to a single row tine. If the program calls the routine again with the same matrix, it recomputes thc decoinposition--unnecessarily. Interpretive matrix system snch as Matlab and Nlatheniatica have the same problem-they hide decompositions behind operators and function calls. Such arc the consequences of not strcssing the decoinpositional approach to the consumers of matrix algorithms.) Another advantage ofworking with decoinpositions is unity. There are differcnt w"ys of organizing the operations involved in solving linear systems by Gaussian elimination in general and Cholcskyh algorithm in particular. Figme 3 illustrates some of these arrangcinents: a white area contains clenients froin the original matrix, a dark area contains the factnrs, a light gray area contains partially processed elements, and the boundaiy strips contain clciiients about to he processcd. Most of these variants were originally presented in scalar form as new algorithms. Once you recognize that a dccomposition is involved, it is easy to sec the essential niiity of the various algorithms. All the variants in Fignre 3 are numerically equivalent. This means that one rounding-error analysis serves all. For example, the Choleslcy algorithm, in whatever guise, is backward stable: the computed factor R satisfies T h e dccoinpoaitional approach can also save computation, For example, the Cholesky de(A + E ) = R ~ R (7) composition requires O(n') operations tn compute, whereas the solution of triangular systcins where E is of the size of the rounding unit relarequires only O(nz)operations. Thus, if you rec- tivc to A. Establishing this backward is usually ognizc that a Cholesky decomposition is beiug the most difficult part of an analysis of the use 52 COMPUTING IN SCIENCE & ENGINEERING of a decomposition to solve a prohlem. For cxample, oncc Equation 7 has Iiccn estahlishetl, the ruunding errors involved in the solutions of the triangiilar systems in Equation 3 can bc incorporated in E with relative ease. Thus, another advantage of the decoinpositional approach is that it concentrates the most difficult aspects of rounding-error analysis in one place. In general, if yon change the elements of a positive definite matsix, yon must recompute its Cholesky decomposition from scratch. 1Iowever, if the change is strnchlred, it tilay he possihlc to compute the ncw deconipositinn dircctly froni the old-a process known as z q d n t i q . For exaniplc, you caii cotnpute the Cholesky dccomposition of A + S S froin ~ that of A in O ( 2 ) 01’eratioris, a n enormous savings over the nb initio computation of the decoinposition. Finally, the decompositional approach has greatly affected the development of software for matrix computation. Instead of wasting energy developing code for a variety of specific applications, the producers of a niatrix pachgc can concentrate (in the decompositions themselves, pcrhaps with a few auxiliary routines to handle the most importmt applications. This apprmach has informed the major public-domain packages: the IIandbook series,’ Eispack? Linpack,’ atid 1,apack.6 A consequence o f this emphasis on dcconqinsitions is that software developers have found that most algorithms have hroatl coniliutatioiial featurcs in cornnion-features than can he relegated tn liasic linear-algebra subprograms (such 21s Blas), which caii then lie optimized for specific ~ n a c h i n e s . ’ ~ ~ For easy reference, the sidebar “Benefits of thc decompusitional approach” summarizes the advantages of dccomposition. History All the widely used decompositions had niadc their appcarance by 1900, when Schur introdnced the ilecomposition that now bears his name. However, with the exception ofthc Schur decomposition, they were nnt cast in tlic Iaiip a g e ofmatrices (in spite of the fact that matriccs had been introduccd in 1858“’). I provide some historical background for thc individual decoiupositions later, but it is instructivc here to consider how the originators proceeded in the absence o f matriccs. Gauss, who worked with positivc definite systems defined by the nornial equations for least squares, described his elimination procedure as Benefits of the decompositional approach A matrix decomposition solves not one but many problems. A matrix decomposition, which is generally expensive to compute, can be reused to solve new problems involving the original matrix. The decompositional approach often shows that apparently different algorithms are actually computing the same object. The decompositional approach facilitates rounding-error analysis. Many matrix decompositions can be updated, sometimes with great savings in computation. By focusing on a few decompositions instead of a host of specificproblems, software developers have been able to produce highly effective matrix packages. the reduction of a quadratic form q(x) = i’hx (I ani simplifying a little here). In terms of the Cholesky factorization A = R”R, Ganss wrotc q(x) in the form = p:(x)+p:(x)+K (8) where r:t is the ith row of R . Thus Gauss reduced &) to a sum of squares oflinear lunctioils pk Because R is uppcr triangular, the fiinctinn p;(x) depciids only on the cotnpunents s;,.. .x. of.^ Sincc the coefficients in the linear hisins p; arc the eleincnts ofR, Gauss, by showing how to compute the pi, effectively computed the Cholcsky decomposition ofA. Other decotnpositions were introduced in other ways. For example, Jacobi introduced thc LU decomposition as a decomposition of a h linear form into a slim (if products of linear f~nictions having an apprnpriate triangularity with respect tii the variahlcs. T h e singular valnc decomposition madc its appearance as an orthogonal change ofvariables that diagonalizcd a bilinear form. Eventually, all these decompositions found expressions as factorizations of matrices.“ T h c process by which decomposition bcc2iine so important to matrix computations was slow and increincntal. Gauss certainly had the spirit. H e used his decomposition to perform many tasks, such as computing variances, and even used it to update Icast-sqnares solutions. But Gauss never regarded his dccnmposition as a niatrix factorization. atid it would lie anachro- nistic to consider hiin the father of thc decompositional approach. In the 1940s, awareness grew that the usual algorithms for solving linear systems involved m trix factorization.Lz~l' John Von Neuniann and H.H. Goldstine, in their ground-breaking error analysis of the solution of linear systems, pointcd out the division of labor between computing a factorization and solving the s y s t c ~ n : ' ~ We may therefbre interpret the elimination method as one which bascs the invcrting of an arbitrary matrix A on the coinbination of two tricks: First it decomposes A into the product of two semi-diagonal matrices C, I .. ., and consequcntly tlic invcrsc of A obtains imnicdiately from those of C and B . Second it f o r m their iuverses by a simple, explicit, indoctive process. algorithms that compute them have a satisfactory backward rounding-error analysis (see Equation 7). In this hrief tour, I provide references only for details that cannot he found in the many cxcellent texts and monographs on numerical lincar algcbra,18-26the IIandbook series,' or the LINPACK Users' Guide.' The Cholesky decomposition Desripion. Givcn a positive definite matrix A , tlicre is a uniquc upper triangular matrix R with positivc diagonal elements such that A = KTK. In this form, the decomposition is known as the Cholcsky decomposition. It is often written in the form In the 1'950s and early 1960s, Householdcr A = LDLT systematically explored the rclation between various algorithms in matrix terms. His book The wherc D is diagonal and L is onit lower triangxiTheory of Matrices irz Nwnerical Analysis is the lar (that is, L is lower triangolar with ones on the mathenlatical epitome of the decoinpositional diagonal). Applications. T h e Cholesky decomposition is approach." In 1954, Givens showed how to reduce a syn- used primarily to solve positive definite lincar metric matrix A to tridiagonal form by orthogo- systems, as in Equations 3 and 6. It can also be nal transformation.'6 T h e reduction was merely employed to compute quantities usefiil in statisa way station to the computation of the eigcii- tics, as in Equation 4. values ofA, and a t the time no one thought of it Algorithms. A Cholcsky decomposition can as a decomposition. However, it and othcr in- be computed using any of the variants of Gausstermcdiate f o r m have proven useful in their ian elimination (see Figure 3 t m o d i f i e d , of owii right and have become a staple of thc de- course, to take advantage of symmetry. All these algorithms take approximately n'/6 floatingcompositional approach. In 1961,Jamcs Wilkinson gave thcfirst back- point additions atid multiplications. T h e algoward rounding-error analysis of the solutions of rithm Cholesky proposed corresponds to thc dilinear systems." Here, the division of labor is agram in the lower right of Figurc 3. complete. IIe givcs one analysis of the compnUpdating. Given a Cholesky decomposition tation of the LU decomposition and another of A = R"K,you can calculate the Cholcsky dcthe solution of triangular systems and then com- composition of A + xxT from R and x in O(n') bines the two. Wilkinson continncd analyzing floating-point additions and multiplications.The various algorithms for computing decomposi- Cholesky decomposition ofA -xi" can he caltions, introducing uniform techniques for dcal- culated with the same nnmher of operations. ing with the transformations used in the coni- T h e latter process, which is called downdating is pntations. By the time his hook Algebraic numerically less stable than updating. Bigenvalm Problem" appeared in 1965, the deThe pivoted Cholesky decomposition. If P compositional approach was firmly estahlished. is a permutation matrix and A is positive dcfinite, then PTAP is said to he a diagonal permutation of A (among other things, it permutes the The big six diagonals of A).Any diagonal permutation ofA There are many matrix decompositions, old is positive definite and has a Cholesky factor. and new, and the list of the latter seeins to grow Such a factorization is called a pivoted Cholesky daily. Nonetheless, six decompositions hold the factorization. There are many ways to pivot a center. 'The reason is that they are useful and sta- Cholesb decomposition, hut the most common ble-they have important applicatiotis and the one produces a factor satisfying 54 COMPUTING IN SCIENCE & ENGINEERING History. In estalilishing the existence of the LU decomposition, Jacobi that undcr (9) In particular, if A is positive semidefinite, this stratem will assure that R has the form where R, I is nonsinplar and has tlie same order as t h e rank of A . Hence, the pivoted Cliolcsky decomposition is widely used for rank deterniination. History. The Cholesky decomposition (more precisely, an LDL'" decomposition) was the dccomposition of Gauss's cliniination algorithm, which he sketched in 1809*' and presented in fiill in 1810.*8Benoit publishctl Cholcsky's variant posthuinously in 1924.'' certain conditions a bilinear form Hx, y) can he written in the form d%Y) = P l ( 4 m + PZ(x)dY) + ., , + p,(x)o,LY) where p; and q arc linear tiinctions that depend only on the last (n - i + 1) components of their argnments. T l i c coefficients of the functions are the elements o f L and U. The QR decomposition Description. Let A be an m x VI matrix with m t n. l'liere is an orthogonal matrix Q such that Q T A= where R is upper tsiangnlar with nonnegative diagonal elements (or positive diagonal elements Description. Givcn a matrix A of order n, if A is of rank n). there are permutations P and Q such that If we partition Q in tlic form The pivoted LU decomposition P~AQ = LU .where L is unit lower triangular and Uis upper triangnlar. T h e matsices P atid Q are not unique, and tlie process of selecting them is known as Q = (QA Q3 where Q, has n columns, then we can write A = QAR. (10) pivoting. Applications. Like the Choleslty decompc~si- This is soinctiines called the QRficto&mun of tion, the L U deconiposition is used primarily for A . solving linear systems. However, since A is a Applications. When A is of rank n, the general matrix, this application covers a wide columns of Q, form an orthonormel basis lor range. For example, the L U decomposition is the ctilunm space %A) ofA, and the columns of used to compute the steady-state vector ofMarkov QL form an orthonormal basis of the orthogochains and, with the inverse power method, to nal complenmit of %A). In particular, QnQZ is compute eigenvectors. the orthogonal projection onto N A ) . For this Algorithms. T h e basic algorithm for c o n - reason, tlic QR decomposition is widely uscd in puting L U decompositions is a generalization of applications with a geometric flavor, especially Gaussian elimination to nonsymmetric matrices. least squares. When and how this generalization arose is obAlgorithms. lPierc are two distinct classes of scure (see D y e r ' for comments and references). algorithms for computing the QR decotnposiExcept for special matrices (such as positive def- tion: Gran-Schmidt algorithms and orthogonal inite and diagonally doininant matrices), the triangnlarization. Gram-Schmidt algorithms proceed stcpwisc method requires some tiorin of pivoting for stability The most coninion form is partial pivot- liy orthogonalizing the kth columns ofA against ing, in which pivot elements are chosen from the the first (k - 1) columns of Q to get the kth column to be eliminated. This algorithm re- column o f Q . There arc LWO f o r m of the quires about n'/3 additions and ~ndtiplications. Gram-Schmidt algorithm, thc classical and the Certain contrived examples sliow that Gauss- modified, and they both compute only the ian elimination with partial pivoting can be un- factorization in Equation I O . The classical stable. Nonctlieless, it works well for the over- Grain-Schmidt is unstable. T h e modified form wheliniiig majority of real-life proble~ns.'~~''can produce a matrix Q, whose columns deviate Why is an open question. from orthogonality. But the deviation is ]ANUAUY/FEBRUAUY 2000 55 botiiided, and the compntcd factorization can he used in certain application-notably computing least-sqnares solutions. i f the orthogonalization step is repeated a t cach stage-a process known as ~.eorthu~onulizutiun-liothalgorithms will produce a fiilly orthogonal factorization. When n s m, the algorithms without reorthogonalizatioti require abont mnz additions and miltiplications. Tlie method of orthogonal trianpilarization proceeds by preniultiplying A by certain simple orthogonal matrices until the elcinents below the diagonal are zero. T h e product of the orthogonal matrices is Q,a d R is the upper triaiigular part of the rcduced A. Again, there are ~ w o versions. T h e first reduces the matrix by IIouseholdcr transformations. 'Tlic method has the advantage that it represents the entire matrix Q in the same amount of memory that is required to hold A, a great savings when n >> p. Tlie second method reduces A by plane rotations. It is less efficient than the first method, lint is hetter suited for matrices with structured patterns of nonzero elements. Relation to the Cholesky decomposition. Froin Equation 10, it follows that oted QR factorization has the form It follows that either Q, or the lint k columns of AP form a basis for the colmiin space of A . Thus, the pivoted QR decomposition can he used to extract a set of linearly independent columns from A. History. T h e QR factorization first appeared in a work by Erhard Schmidt 011 integral equatiow3' Specifically,Schmidt showed how to orthogonalize a sequence of functions by what is now known as the Gram-Schmidt algorithm. (Curionsly, Laplace produced the liasic formula~~ but ' had no notion of orthogonality.) T h e name QR comes from the QR algorithm, named by Francis (see the history notes for the Schur algorithm, discussed later). Householder introduced Honscholdcr transformations to matrix computations and showed how they could be used to triaiibdarizc a general ~natrix.~' Planc rotations were introduced hy Givens,16who used them to reducc a symmetric matrix to tridiagonal form. Bogert and Rurris appear to lie the first to use them in orthogonal triangiila~ization.~~. A ~ =AR ~ R . (11) Tlie first updating algorithm (adding a row) is due to Goli~h,~' who also introduced the idea of In other words, the triangular factor of the QR pivoting. decomposition ofA is the triangular factor of the Cholesky decomposition of the cross-product The spectral decomposition matrix A''A. Consequently, many probIeinsDescription. Let A he a symmetric matrix of particularly least-squares prohlcm-can be order n. Therc is a t i orthogonal matrix Vsnch solved using either a QR decoiiipositioii from a that least-squares niatrix or the Cholesky decomposition from tlie noriiial cqiration. ?'he QR decomposition nsually gives more accurate results, whereas the Cholesky decoinposition is often If videnntes the ith column of V, thenAvi = &vi, faster. Thus (Ai, vi)is an cigenpair of A , and tlie spectral Updating. Given a QK factorization of A , decomposition shown in Equation 12 exhibits the there are stable, efficient algnrithms for rcconi- eigenvalues ofA along with completc orthonorputing the QR factorization after rows and mal system of eigenvectors. columns have been added to or rcinovcd from Applications. Thc spectral decomposition A. In addition, the QR decomposition of the finds applications wherever the eigcnsystein o f rank-one nuidification A + .tyT can be stably a syinmetric matrixis needed, which is to say in updated. virtually a11 technical disciplines. T h e pivoted QR decomposition. if P is a Algoritluns. Therc are three classes nf algoperiiiutation matrix, then APis a permutation of rithms to compute the spectral decomposition: the columns of A , and (PIP)~(AP) is a diagonal the QR algorithm, the divide-atid-conquer alperniutatioti o f ATA. In view of the relation of gorithm, and Jacobi's algorithm. T h e first two the QR and the Choleslty decompositions, it is require a prcliinitiary reduction to tridiagonal not surprising that there is a pivoted QR factor- forni by orthogonal similarities. I discuss the QR ization whose triangular factor R satisfies Eqna- algorithm in the next section 011 tlie Schur detion 9. In particular, ifA has rank k, then its piv- composition. T h e divide-and-conquer algo- 56 COMPUTINGIN SCIENCE 8 ENGINEERING r i t h ~ i i ' ~ is . ' ~comparatively recent and is usnally Hcssctiberg form, which is usually done with faster than the QR algorithm when both eigeii- Householder transforimtions, the Scliur from is values and cigenvectors are dcsired; Iiowevcr, it computed using the QR algorithm."6 Elsewhere is not suitable for probleins in which tlie eigen- in this issue, nercsford Parlctt discusses the values vary widely in magnintde. T h c Jacolii al- modern form of the algorithm. It is one of the gorithm is much slower than the other two, but most flcxihlc algorithms in tlie rcpertoirc, having for positivc dcfiiiite matriccs it may hc more variants for tlie spectral dccomposition, the sillaccurate."' All these algorithm require O(n') gular valncs decomposition, a n d tlie generalized eigenvalue problcm. nperations. Updating. T h e spectral decomposition can be History. Schur introduced his dccnmposition updated. Unlike the Clioleslcy and QR decoiii- in 1909?' It was tlic nnly onc of the big six to positions, thc algorithm does not result i n a re- havc been derived in teriiis of matrices. It was duction in the order of thc work-it still remains largely ignorcd until Francis's QK algorithm O(n'), although the order constant is lower. pushed it into the limcliglit. History. T h c spectral dccomposition dates back to a n 1829 paper by Cauchy," who intro- The singular value decomposition duced the eigenvectors as solutions of equations Desmption. Let A bc an ?n x n matrix with of thc hirm Ax = /Lr a t i d proved tlic orthogonal- m t 12. Tlicre are orthogonal matrices U a n d V ity of eigenvcctnrs belonging to distinct cigen- such that values. In 1846, Jacobi" g-avc his fainous algorithm for spectral decomposition, which iteratively reduces the matrix in question to diagonal form by a special type of planc rotations, where now called Jacnbi rotatioiis. T h e reduction to tridiagonal form hy plane rotations is due to C = diag(q, ...,oJ, 0,t o22 ...2 qzt 0. GivensL6and hy Householder traiisforiiiations to H~uselioldcr."~ This dccomposition is called thc sinplar uaLe decampposition o f A .If U, consists ofthc first n The Schur decomposition columns of U, we can write Desmption. Let A hc a matrix of order n. There is a unitary matrix Usuch that A = U,Z V" (13) A = which is sometirnc.s called thc sin,plur vukiie/ictnrizntion of A. The diagonal clctiients of oare called thcsinpfilr u a l w of A . l l i e corrcspntiding columns of U and Vare called leftandri~yhtsiizgulurncctors o f A . Applications. Most of the applications of tlic QR decomposition can also he handlcd by thc siugular valuc decomposition. In addition, thc singular value decoinpnsition gives a hasis for the row space ofA and is more re1ial)le in deterininiiig rank. It can also be uscd to computc optimal low-rank approximations and to computc angles between suhspaces. Relation to the spectral deconzposition. T h e singular value factorization is related to tlie spectral decomposition in much the same way as tlic QR factorizatioo is rclated to the Cholesky decompnsition. Specifically, from Equation 13 it follows that where T i s upper triangular and H means conjugxtc transpose. T h e diagonal elemcnts of Tare the cigcnvalues of A , which, by appropriatc clioicc nf U,can be made to appcar i n any order. l'his decomposition is callcd a Schzir clecimqiosition o f A . A real matrix can have complex eigenvalues and hence a complex Schur fnrin. By allowing T to have rcal2 x 2 blocks on its diagonal that colitain its coniplcx eigenvalucs, the eiitirc dccomposition can be madc rcal. This is sometimes callcd a rcal Schzcrfom. Applications. An important use of the Sclmr foriii is as an intermediate form from which the eigenvalues aiid eigenvectors of a matrix can he computed. O n the other hand, the Schur (lecomposition can nften be nsed in place ofa cninplcte system of eigenpairs, which, in fact, may not exist. A gnod exainplc is the solutioii of Sylvestcr's equation aiid its reIati~es.*,'~ Algorithms. After a preliminary reduction to l'hus the eigenvalues of the cross-product maIANUARYIFEBRUARY 2000 57 trix A"A arc the squares of the singular vectors 7. 1.1. Dongarra etal., "An Extended Setof Fortran Basic Linear Al- gebra Subprograms," ACM irons. Mothematical Sollwore, Vol. of A and the eigenvectors of ATA are right sin14,1988, pp. 1-1 7. pilar vectors ofA. 8. 1.1. Dongarra et ai., "A Set of Level 3 Baric Linear Algebra SubAlgorithms.As with the spectral decomposiprograms," ACM Trans. Mothemotical Soliware, Vol. 16, 1990, pp.1-17. tion, there arc thrce classes of algorithms for 9. C.L. Lawson et ai., "Baric Linear Algebra Subprograms far Forcomputing the singiilar value decomposition: the tran Usage," ACM Tronr. Motliemotical Sotware, Vol. 5. 1979, QR algorithm, a divide-and-conquer algorithm, pp. 308-323. and a Jacobi-likc algorithm. T h e first two re- 10. A. Cayley, "A Memoir an the Theory of Matricer," Philosophicai quire a rcdnction ofA to bidiaganal form. l h e Tionr. RoyalSoc. ofLandon.Vo1. 148, 1858, pp. 17-37. divide-and-conquer a l g o r i t h ~ nis~oftcn ~ fastcr 11, C.C. Mac Duffee, The Theaiy 01 Matrices, Chelsea, New York, 1946. than the QR algorithm, and the Jacobi algorithm 12. P.S. Owyer, "A Matrix Prerentatianof Least Squarer and Correis the slowest. lation Theory with Matrix Justificationof Improved Methods of Histoiy. The singular value decomposition was Solution," Annals Math. Stotirticr, Vol. IS, 1944, pp. 82-89. introduced indcpendently by Bcltrdnii it? 18735u 13. H. len6en. An AttemptataSyrtemoti~Clarri~~otionafsome Methand Jordan in 1874.5' The reduction to bidiagoods lor the Solution 01 N o m " Equations, Report No. 18. G e o ~ doetirk Inititut, Copenhagen, 1944. i d form is dne to Golnb and ICahan," as is the variant of the QR algorithm. The firstJacobi-like 14. J. von Neumann and H.H. Coidrtine, "Numericai Inverting of Matrices of High Order." Bull. Am. Math. Sac., Val. 53,1947, pp. algorithm for computing the singular value de1021-1 099. composition was given by I<ogbctIiantz.'3 IS. A.S. Hourehoider, The Theoiy 01 Mutrim in NumericalAnalysis, Dover, NewYork, 1964. hc big six are not the only decompositions in use; in fact, there are many more. As mentioned earlicr, certain intermediate forins-such as tridiagonal and Hessenherg form-have come to be regarded as decompositions in their own right. Since the singular valuc decomposition is expensive to c o m p t e and not readily updated, rank-revealing alternatives h a w received coiisidcrahle a t t e n t i o ~ iThere . ~ ~ ~are ~ ~also generalizations of the singular value dccotnposition and the Schnr decomposition for pairs of matriNI crystal halls become cloudy when they look to thc fiimrc, but it seems safe to say that as long as new matrix problems arisc, new decompositions will be devised to solve thcni. % Acknowledgment This work was supported by the National Science Foundation under Grant No. 970909.8562. References 1. P.S. Dwyer, Linear Computotianr, JohnWiley & Sons, New York, 1911. 2. A S Homeholder, Principiis of Numerimi Anaiysis, McCraw-Hill, NewYork, 1953. 3. J.H.Wiiklnron and C. Reinrch, Handbook IorAuIomatic Computalion, Val. il,LinrarAlyebra, Springer-Verlag, New York, 1971 4. 8.5. Garbow et al., "Matrix EigenrystemRautinedlrpack Guide Extension." Lecture Notrr in Computer Science, Springer-Verlag, NewYork, 1977. 5. 1.1. Dongarra et al., LINPACK User's Guide, SIAM, Philadelphia, 1979. 6. E. Anderson et al., LAPACK Users' Guide, second ed., SIAM, Philadelphia, 1995. 58 16. W. Givens, Numerid Coinputofionof the Choraclwi5tic Voluei of a Red Matrix, Tech. Report 1574, Oak Ridge Nat' Laboratory, Oak Ridge, Tenn., 1954. 17. I.H. Wiikinron, "Error Analysis of Direct Methods of Matrix Inverrian.''l. ACM,Vol. 8, 1961, pp. 281-330. 18. I.H. Wilkinron, TheAigebr~i~€iyenvaiuePrabiem. Ciarendon Press, Oxford, U.K., 1965. 19. A. BjBrck, Nsmcdmi Methods lor Least Squarer Problems, SIAM, Philadelphia, 1996. 20. B.N. Oatta, Numerid Linear Algebro ond Applicotiani, Brooks/ Cole, PacificGroue, Calif., 1991. 21. I.W. Demmel, Applied Numerirai Linear Aiyebra. SIAM, Philadelphia, 1997. 22. N.1, Higham, AccuroiyandStobiIi~~ofNumericoi Aiyorithmr, SIAM, Philadelphia, 1996. 23. B.N. Parlett, TheSymmetk Eiymvaiue Pmbiem, Prentice-Hall, Englewood Cliffs, N.I., 1980; reissued with revisions by SIAM, Philadelphia, 1998. 24. G.W. Stewart, introduction to Matrix Computations, Academic Press, NewYork, 1973. 25. G.W. Stewart, Motrix Aiyorithmr i:Baric Decompositions, SIAM, Philadelphia, 1998. 26. D.S. Watkins, Fmdomentak ofMotrix Computotioonr, john Wiley & Sonr, New York, 1991, 27. C.F. Gauss, Theoria Motus Carporom Coelertium in Sectionibur Conicis Solem Ambientium [Themy 01 the Motion of the Heovenly Bodies Moving obout the Sun in Conic Sections], Perthes and Berrer, Hamburg, Germany, 1809. 28. C S Gaurs, "Dirquiritio de elementis eilipticir Palladis [Dirquirition on the Elliptical Elements of Pallas]," Commentatineirocietotis rqioe rcieotorlomGattingemis recentioorer, Voi. 1, 1810. 29. C. Benoit, "Note sur une methode de resolution der equations normalei provenant de hpplicatian de la methade der molndrer clrrer a un system d'equatianr linearei en nambre inferieur 'a celui der inconnuer. --application de la methode d la resolution d'un ryrteme deiini d'equationr linearer (Procede du Commandant Cimlesky) [Note an a Method for Soiving the Normal Equations Arising from the Application of the Method of Least Squarer to a System of Linear Equations whore Number Is Less than the Numberof Unknawnrdpplication of the Method to the Solution of a Definite System of Linear Equations]." B u M n Geoderique (iauioure), Vol. 2, 1924, pp. 5-77 COMPUTING IN SCIENCE 8 ENGINEERING 30. L.N. Trefethen and R.S. Schreiber, "Average-Care Stability a i in Tech. Report 9 M 7 ,Dept. of Computer Science, Univ. of Minnesota, Minneapolis, 1990. Gaussian Elimination." SIAM I. M o t h Aoalyris ood Appiimtionr, Vol. 11, 1990, pp. 335-360. 51. C. lardan, "Memoiremi lerformer bilineairer [Memoiran Bilin- 3 1 , L.N. Trefethen and D. BBU 111, Numericcol Linear Algebra, SIAM, Philadelphia, 1997, pp. 166-1 70. ear forms]," )aoinal de Mothematiquer Pure$et Appliquem, DeuxiemeSerie,Vol. 19, 1874,pp. 35-54. 32. C.C.I. lacobi, "Uber einen algebrairchen Fundamentalraa iind seine Anwendungen [On a Fundamental Theorem in Algebra and its Applications]," ) o u r ~ o l l ~ r dieineundongewandte ie Mathemotik, Vol. 53, 1857, pp. 275-280. 52. G.H. Galub and W. Kahan, "Calculating the Singular Valuer and Preudo-Inverse af a Matrix," SIAM 1. NumericalAnalysis, Val. 2, 1965, pp. 205-224. 33. E. Schmidt, "Zur Theorie der linearen und nichtlinearen Integralgleichungen. l Teii, Entwicklung wilikurlichen Funktionen nach System vorgerchriebener [On the Theory af Linear and Nonlinear Integral Equations.Part I, Expansion of Arbitrary functions by I PrescribedSystem]," MothematixrcheAnnalen, Val. 63, 1907, pp.4331176. tion of CoefficientsMillrix.'' iXorterIy of Applied Moth., Vol. 13, 1955, pp. 123-132. 34. P.5. Laplace, Theoh Analytiqoeder Pmbabilites [AnolyticolTheory ofProbabilitier], 3rd ed., Courcier, Paris, 1820. 35. A.S. Householder, "Unitary Triangulariration of a NaniymmetTic Matrix,"). ACM, Val. 5, 1958, pp. 339-342. 36. D. Bogert and W.R. Burrir, Comparison of Leust Squarer Algorithms, Tech. Report ORNL-3499, V.l, 85.5,Neutron Physics Div., Oak Ridge Nat'i Laboratory, Oak Ridge, Tenn., 1963. 53. E.G. Kagbetliana, "Solution of Lineal Systems by Diagonaiiza- 54. T.F. Chan, "Rank Revealing QR Factorizations,'' LinearAlgebia ond Its Appiicotions, Vol. 88/89, 1987, pp. 67-82. 55. C.W. Stewart, "An Updating Algorithm for SubrpaceTracking," IIEETronr. Siq"alPraier~i"~,VVol.40, 1992. pp. 1535-1541 56. 2 . Bai and I.W. Demmei, "Computing tile Generalized Singular Value Decampoiition," SIAM J. Scientifi< ond Statlrtlcd Camputinq, Vol. 14, 1993, pp. 1464-1468. 57. C.B. Moler and G.W. Stewalt, "An Algorithm for Generalized Matrix Eigenvalue Problems," SIAM l. NumericalAoolyrir, Vol. 10, 1973, pp. 241-256. 37. C.H. Golub, "Numerical Methods for Solving teart Squarer Problems," NumerisrrheMothemotih, Vol. 7, 1965. pp. 206-21 6. 38. I.J.M. Cuppen, "A Divide and Conquer Method for lhe Symmetric Eigenproblem," Numerixhe Mothemotik, Vol. 36, 1981, pp. 177-195. 39. M Cu and S.C. Eirenrtat, "Stable and EfficientAlgorithm for the Rank-One Modification of the Symmetric Eigenproblem," SIAM ), M a t i i * A n ~ l y ~ i ~ a n d A p p ~Val. ~ ~ 15, t i ~ 1994, n ~ , pp. 1266-1276. 40. J. Demmel and K. V e d i c , "Jacabi'r Method Is Mare Accurate than QR," SIAMI. Matrix Anolyrir m d Appliccotimi, Val. 13, 1992, pp. 1204-1245. 41, A.L. Cauchy, "Sur I'eqoation ai I'aide de laquelle on determine ler inegaiites iecuiairer der mowementi der planeter [On the Equation bywhich thelnequalitierfortheSeculaiMovementaf Planets is Determined]," Oeuvres Completes (11, e Seie), Vol. 9, 1829. 42. C.C.J. lacobi, "Ubeiein leichter Verfahren die in derTheorieder Sicularrtdrungen varkommenden Gleichungen nuherirch aufmlbren [On an Easy Method for the Numerical Solution of the Equations that Appear in the Theory of Secular Perturbationr1,"lourooif"~di~reine undungewandte Mothemmk, Vol. 30, 1846, pp. 51.594. 43. I.H. Wilkinron, "Householder's Method for the Solution af the Algebraic Eigenvalue Problem," Computei),, Vol. 3, 1960, pp. 23-27. 44. R.H. Barteir and G.W. Stewart, "Algorithm 432: The Solution of the Matrix Eqiiation A X - B X = C," Camm. AGM, Vol. 8, 1972, pp. 820-826. 45. G.H. Colub, 5. Narh, and C. Van Loan, "Heirenberg-Schur Method for the Problem AX + X B = C," IEEE Tram Autornotk Control, Val. AC-24, 1979, pp. 909-91 3. 46. I.G.F. Francis, "The QRTranrformation, Paris i and 1", ).,Vol.4, 1961, pp.265-271, 1962, pp. 332-345. 47. G.W. Stewart is a professor in the Department of Computer Science and in the Institute for Advanced Computer Studies a t the University of Maryland. His research interests focus on numerical linear algebra, perturbation theory, and rounding-error analysis. He earned his PhD from the University of Tennessee, where his advisor was A.S. Householder. Contact Stewart a t the Dept. of Computer Science, Univ. of Maryland, College Park, MD 20742; [email protected]. Computer 1. Schur, "UberdiecharakterirtirchenWhrreln eiiieriinearen Substitution mit einer Anwendung aut die Thearie der Integral gleichungen [On the Characteristic Roots of a Linear Tranrfarmstion with an Application to the Theory of Integral Equationr]," MathemotiicheAnn(r1en.Val. 66, 1909, pp. 448-510. 49. M. Cu and S.C. Eirenrtat, "A Divide~and-ConquerAlgarithmfar tlie Bidiagonal SVD." SIAM). Matrix Analyiir andApplicotioonr, Vol. 16,1995, pp. 79-92. 50. E. Beitrami, "Sulle fundoni bilineari [On Bilinear Functionr]," Ciorode di Motematicheod Uro degli StudentiDelle Univerrita, Vol. 11, 1873, pp. 98-106. An English translation by D. Eoley is available Scientific Programming will return in the next issue of CiSE.