MA5233 Lecture 6
Krylov Subspaces and Conjugate Gradients

Wayne M. Lawton
Department of Mathematics, National University of Singapore
2 Science Drive 2, Singapore 117543
Email [email protected]
http://www.math.nus/~matwml
Tel (65) 6874-2749

EUCLIDEAN SPACES

Definition. A Euclidean structure on a vector space $V$ is a function $\langle \cdot,\cdot \rangle : V \times V \to \mathbf{R}$ that satisfies, for all $a, b \in \mathbf{R}$ and $u, v, w \in V$:

Bilinear: $\langle au + bv, w \rangle = a \langle u, w \rangle + b \langle v, w \rangle$ and $\langle w, au + bv \rangle = a \langle w, u \rangle + b \langle w, v \rangle$
Symmetric: $\langle u, w \rangle = \langle w, u \rangle$
Positive definite: $\langle u, u \rangle \geq 0$, and $u \neq 0 \Rightarrow \langle u, u \rangle > 0$

Definition. The norm is $\|u\| = \sqrt{\langle u, u \rangle}$.

EXAMPLES

Example 1 (Standard). $V = \mathbf{R}^d$, $\langle u, v \rangle = \sum_{j=1}^{d} u_j v_j$.

Example 2. $V = \mathbf{R}^d$, $\langle u, v \rangle = \sum_{j=1}^{d} u_j (Av)_j$, where $A \in \mathbf{R}^{d \times d}$ is positive definite and symmetric.

Example 3. $V = C([a,b])$, $\langle f, h \rangle = \int_a^b f(x)\, h(x)\, p(x)\, dx$, where $p \in V$ is positive except at possibly a finite number of points; hence $p$ is nonnegative.

Example 4. $V$ and $\langle \cdot,\cdot \rangle$ are obtained by completing the Euclidean space in Example 3. Then $V$ is a real Hilbert space = complete real Euclidean space.

ORTHONORMAL BASES

Definition. $q_1, \dots, q_d$ is an orthonormal basis for $V$ if $\langle q_i, q_j \rangle = \delta_{ij}$ (1 if $i = j$, else 0).

Example. $q_1, \dots, q_d \in \mathbf{R}^d$ is an orthonormal basis for the standard Euclidean space iff the matrix $Q = [q_1, \dots, q_d]$ satisfies $Q^T Q = I$, where $Q^T$ is the transpose matrix defined by $(Q^T)_{ij} = Q_{ji}$ and $I$ is the identity matrix defined by $I_{ij} = \delta_{ij}$. Such a matrix is called orthogonal and satisfies $Q Q^T = I$ and $Q^{-1} = Q^T$.

GRAM-SCHMIDT PROCESS

Given a basis $b_1, \dots, b_d$ for a Euclidean space $V$, there exists a unique upper triangular matrix $T$ with positive numbers on its diagonal such that $q_j = \sum_{i=1}^{j} T_{ij} b_i$, $j = 1, \dots, d$, are orthonormal (and therefore a basis for $V$).
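As a minimal illustration, the Gram-Schmidt process can be sketched in pure Python with the standard inner product on $\mathbf{R}^d$. The helper names `dot`, `norm`, and `gram_schmidt` are illustrative choices, not names from the lecture.

```python
import math

def dot(u, v):
    # standard inner product <u, v> = sum_j u_j v_j
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    # ||u|| = sqrt(<u, u>)
    return math.sqrt(dot(u, u))

def gram_schmidt(basis):
    """Orthonormalize b_1,...,b_d: q_1 = b_1/||b_1||, then for j >= 2
    p_j = b_j - sum_i <q_i, b_j> q_i and q_j = p_j/||p_j||."""
    qs = []
    for b in basis:
        p = list(b)
        # subtract the projections onto the previously computed q_i
        for q in qs:
            c = dot(q, b)
            p = [pi - c * qi for pi, qi in zip(p, q)]
        n = norm(p)
        qs.append([pi / n for pi in p])
    return qs

# small usage example: orthonormalize a basis of R^2
qs = gram_schmidt([[1.0, 1.0], [1.0, 0.0]])
```

The entries $T_{ij}$ of the theorem are implicit here: each normalization and projection coefficient determines one entry of the (inverse of the) triangular factor.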
Proof. We apply the Gram-Schmidt process:
$q_1 = b_1 / \|b_1\|$
For $j = 2$ to $d$:
  $p_j = b_j - \sum_{i=1}^{j-1} \langle q_i, b_j \rangle\, q_i$
  $q_j = p_j / \|p_j\|$

QR FACTORIZATION

Given a basis $b_1, \dots, b_d$ for $\mathbf{R}^d$, Gram-Schmidt yields an upper triangular matrix $T$ with positive numbers on its diagonal such that $Q = BT$. Since $R = T^{-1}$ is upper triangular, we therefore obtain a factorization $B = QR$ that has important applications to least-squares problems (Section 5.3) and to computing eigenvalues and eigenvectors (Section 5.5).

PARTIAL HESSENBERG FACTORIZATION

Definition. A (not necessarily square) matrix $H = [h_{ij}]$ is upper Hessenberg if $i > j + 1 \Rightarrow h_{ij} = 0$.

We consider a matrix $A \in \mathbf{R}^{d \times d}$, an integer $n < d$, and orthonormal vectors $q_1, \dots, q_{n+1}$ such that

$A\,[q_1, \dots, q_n] = [q_1, \dots, q_{n+1}] \begin{bmatrix} h_{11} & \cdots & h_{1n} \\ h_{21} & \ddots & \vdots \\ & \ddots & h_{nn} \\ 0 & & h_{n+1,n} \end{bmatrix}$

or, equivalently, $A q_n = h_{1n} q_1 + \cdots + h_{nn} q_n + h_{n+1,n} q_{n+1}$.

KRYLOV SPACES AND ARNOLDI ITERATION

If the Krylov space $K_n = \mathrm{span}\{b, Ab, \dots, A^{n-1} b\}$ has dimension $n$, then an orthonormal basis $q_1, \dots, q_n$ can be computed by GS using the Arnoldi iteration, based on the equation $A q_n = h_{1n} q_1 + \cdots + h_{nn} q_n + h_{n+1,n} q_{n+1}$:
$q_1 = b / \|b\|$ (recall that $b \in \mathbf{R}^d$)
For $j = 2$ to $n$:
  $b_j = A q_{j-1}$
  $p_j = b_j - \sum_{i=1}^{j-1} \langle q_i, b_j \rangle\, q_i$
  $q_j = p_j / \|p_j\|$

COMPLETE HESSENBERG FACTORIZATION

Possibly using more than one Krylov subspace, we can construct an orthonormal basis $q_1, \dots, q_d$ for $\mathbf{R}^d$ such that $AQ = QH$, that is $A = Q H Q^T$, where $Q = [q_1, \dots, q_d]$ and

$H = \begin{bmatrix} h_{11} & h_{12} & \cdots & h_{1d} \\ h_{21} & h_{22} & & h_{2d} \\ 0 & \ddots & \ddots & \vdots \\ 0 & 0 & h_{d,d-1} & h_{dd} \end{bmatrix}$

We observe that the number of Krylov subspaces equals 1 + the number of zeros on the diagonal beneath the main diagonal.

TRIDIAGONAL MATRIX

Theorem. $A^T = A$ iff $H^T = H$.

Proof. $A = Q H Q^T \Rightarrow H = Q^T A Q$, so $H^T = (Q^T A Q)^T = Q^T A^T Q$; therefore $A^T = A \iff H^T = H$.

Corollary. If $A^T = A$ then $H$ is tridiagonal (a symmetric upper Hessenberg matrix must be tridiagonal).
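The Arnoldi iteration above, applying Gram-Schmidt to $A q_{j-1}$ at each step, can be sketched as follows, assuming dense matrices stored as lists of rows; the helper names `matvec` and `arnoldi` are mine, not from the lecture.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def matvec(A, v):
    # A stored as a list of rows
    return [dot(row, v) for row in A]

def arnoldi(A, b, n):
    """Build an orthonormal basis q_1,...,q_n of K_n = span{b, Ab, ...}.
    Returns the basis and the Hessenberg columns: H[j-2] holds
    h_{1,j-1},...,h_{j,j-1} for each step j = 2,...,n."""
    q = [[bi / norm(b) for bi in b]]     # q_1 = b/||b||
    H = []
    for j in range(1, n):
        w = matvec(A, q[j - 1])          # b_j = A q_{j-1}
        col = []
        for qi in q:                     # p_j = b_j - sum <q_i, b_j> q_i
            h = dot(qi, w)
            col.append(h)
            w = [wk - h * qk for wk, qk in zip(w, qi)]
        h = norm(w)                      # h_{j+1,j}, assumed nonzero here
        col.append(h)
        H.append(col)
        q.append([wk / h for wk in w])   # q_j = p_j/||p_j||
    return q, H

# usage: Krylov basis for a small symmetric matrix
A = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
q, H = arnoldi(A, [1.0, 0.0, 0.0], 3)
```

This sketch stops with an error if $\|p_j\| = 0$, i.e. if the Krylov space has dimension less than $n$; a production version would detect that breakdown explicitly.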
LANCZOS ITERATION

Theorem. If $H = H^T$ then

$A\,[q_1, \dots, q_n] = [q_1, \dots, q_{n+1}] \begin{bmatrix} \alpha_1 & \beta_1 & & 0 \\ \beta_1 & \alpha_2 & \ddots & \\ & \ddots & \ddots & \beta_{n-1} \\ & & \beta_{n-1} & \alpha_n \\ 0 & & & \beta_n \end{bmatrix}$

and an orthonormal basis for $K_n$ can be computed by GS using the Lanczos iteration:
$\beta_0 = 0$, $q_0 = 0$, $b \in \mathbf{R}^d$, $q_1 = b / \|b\|$
For $j = 1$ to $n-1$:
  $u = A q_j$
  $\alpha_j = \langle q_j, u \rangle$
  $v = u - \beta_{j-1} q_{j-1} - \alpha_j q_j$
  $\beta_j = \|v\|$
  $q_{j+1} = v / \beta_j$

CONJUGATE GRADIENT ITERATION

The conjugate gradient iteration that Hestenes and Stiefel made famous solves $Ax = b$ under the assumption that $A$ is symmetric and positive definite:
$x_0 = 0$, $r_0 = b$, $p_0 = r_0$, $c_0 = r_0^T r_0$
For $j = 1$ to $n-1$:
  $v_j = A p_{j-1}$
  $\alpha_j = c_{j-1} / (p_{j-1}^T v_j)$
  $x_j = x_{j-1} + \alpha_j p_{j-1}$
  $r_j = r_{j-1} - \alpha_j v_j$
  $c_j = r_j^T r_j$
  $\beta_j = c_j / c_{j-1}$
  $p_j = r_j + \beta_j p_{j-1}$

Theorem 1. The following sets are all equal to $K_n$:
$X_n = \mathrm{span}\{x_1, \dots, x_n\}$, $P_n = \mathrm{span}\{p_0, \dots, p_{n-1}\}$, $R_n = \mathrm{span}\{r_0, \dots, r_{n-1}\}$,
and $j < n \Rightarrow (r_n^T r_j = 0)$ and $(p_n^T A p_j = 0)$.

Proof. By induction:
$(p_0 = r_0)$ and $(p_j = r_j + \beta_j p_{j-1})$ give $P_n = R_n$;
$(x_0 = 0)$ and $(x_j = x_{j-1} + \alpha_j p_{j-1})$ give $X_n = P_n$;
$(r_0 = b)$ and $(r_j = r_{j-1} - \alpha_j v_j = r_{j-1} - \alpha_j A p_{j-1})$ give $R_n = K_n(b)$.
If $j < n-1$ then $r_n^T r_j = r_{n-1}^T r_j - \alpha_n (A p_{n-1})^T r_j = r_{n-1}^T r_j - \alpha_n p_{n-1}^T A r_j = 0$,
and $r_n^T r_{n-1} = r_{n-1}^T r_{n-1} - \alpha_n p_{n-1}^T A r_{n-1} = 0$ since
$\alpha_n = r_{n-1}^T r_{n-1} / p_{n-1}^T A (r_{n-1} + \beta_{n-1} p_{n-2}) = r_{n-1}^T r_{n-1} / p_{n-1}^T A r_{n-1}$.
If $j < n-1$ then $p_n^T A p_j = r_n^T A p_j + \beta_n p_{n-1}^T A p_j = 0$,
and $p_n^T A p_{n-1} = r_n^T A p_{n-1} + \beta_n p_{n-1}^T A p_{n-1} = 0$ since
$\beta_n = r_n^T r_n / r_{n-1}^T r_{n-1} = r_n^T (-\alpha_n A p_{n-1}) / (\alpha_n\, p_{n-1}^T A p_{n-1}) = -r_n^T A p_{n-1} / p_{n-1}^T A p_{n-1}$.

Theorem 2. If $A$ is symmetric and positive definite, then if the CG algorithm to solve $Ax = b$ has not converged, that is $r_{n-1} \neq 0$, then $e_n = x - x_n$ minimizes $\|e_n\|_A^2 = e_n^T A e_n$ over $x_n \in K_n$, and convergence is monotonic: $\|e_n\|_A \leq \|e_{n-1}\|_A$.

Proof. If $0 \neq y \in K_n$ then $2 e_n^T A y = 2 r_n^T y = 0$, therefore $\|e_n + y\|_A^2 = e_n^T A e_n + y^T A y \geq \|e_n\|_A^2$.

Theorem 3. If $\kappa = \mathrm{cond}(A)$ subordinate to the 2-norm, then
$\frac{\|e_n\|_A}{\|e_0\|_A} \leq 2 \left( \frac{\sqrt{\kappa} - 1}{\sqrt{\kappa} + 1} \right)^n$.
Proof. See the handouts.
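The conjugate gradient recurrences above translate almost line for line into pure Python; this is a minimal sketch for dense matrices, with illustrative helper names `dot`, `matvec`, and `conjugate_gradient`.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(A, v):
    # A stored as a list of rows
    return [dot(row, v) for row in A]

def conjugate_gradient(A, b, steps):
    """Hestenes-Stiefel CG for symmetric positive definite A,
    following the slide's recurrences with x_0 = 0."""
    d = len(b)
    x = [0.0] * d          # x_0 = 0
    r = list(b)            # r_0 = b
    p = list(b)            # p_0 = r_0
    c = dot(r, r)          # c_0 = r_0^T r_0
    for _ in range(steps):
        v = matvec(A, p)                              # v_j = A p_{j-1}
        alpha = c / dot(p, v)                         # alpha_j = c_{j-1}/p^T v
        x = [xi + alpha * pi for xi, pi in zip(x, p)] # x_j = x_{j-1} + alpha p
        r = [ri - alpha * vi for ri, vi in zip(r, v)] # r_j = r_{j-1} - alpha v
        c_new = dot(r, r)                             # c_j = r_j^T r_j
        beta = c_new / c                              # beta_j = c_j/c_{j-1}
        c = c_new
        p = [ri + beta * pi for ri, pi in zip(r, p)]  # p_j = r_j + beta p
    return x

# usage: a 2x2 SPD system converges in at most 2 steps (exact arithmetic)
x = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], 2)
```

By Theorem 1 the iterates span the Krylov spaces $K_n$, and by Theorem 2 each step decreases the $A$-norm of the error, which is what the assertions on a small test system confirm numerically.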