COMPUTER SCIENCE 349B                                        HANDOUT #36

NOTES ON THE POWER METHOD

EXAMPLE

Let
$$ A = \begin{pmatrix} 3 & 2 & -1 \\ -2 & -1 & 1 \\ -3 & 3 & 2 \end{pmatrix}, $$
which has eigenvalues 4, 1 and -1. Let $x^{(0)} = (1, 1, 1)^T$ and $p = 1$. The following computations illustrate the Power Method.

$Ax^{(0)} = (4,\ -2,\ 2)^T$, so $\mu_0 = 4$ ($p = 1$) and $x^{(1)} = (1,\ -1/2,\ 1/2)^T$.
$Ax^{(1)} = (3/2,\ -1,\ -7/2)^T$, so $\mu_1 = 3/2$ ($p = 3$) and $x^{(2)} = (-3/7,\ 2/7,\ 1)^T$.
$Ax^{(2)} = (-12/7,\ 11/7,\ 29/7)^T$, so $\mu_2 = 29/7 = 4.1429$ ($p = 3$) and $x^{(3)} = (-0.4138,\ 0.3793,\ 1)^T$.
$Ax^{(3)} = (-1.4828,\ 1.4483,\ 4.3793)^T$, so $\mu_3 = 4.3793$ ($p = 3$) and $x^{(4)} = (-0.3386,\ 0.3307,\ 1)^T$.
$Ax^{(4)} = (-1.3543,\ 1.3465,\ 4.0079)^T$, so $\mu_4 = 4.0079$ ($p = 3$) and $x^{(5)} = (-0.3379,\ 0.3360,\ 1)^T$.

After 10 iterations: $\mu_9 = 4.0000839$ and $x^{(10)} = (-0.3333346,\ 0.3333327,\ 1)^T$.

So it is clear that $\{\mu_k\} \to 4$ and $\{x^{(k)}\} \to (-1/3,\ 1/3,\ 1)^T$, which are the dominant eigenvalue of $A$ and its corresponding eigenvector.

COMMENTS ON THE POWER METHOD

1. The speed of convergence depends on the ratio $|\lambda_2 / \lambda_1|$, which is always $< 1$. Convergence will be slow if this ratio is close to 1. The k-th approximation to $\lambda_1$ is

$$ \frac{x_p^{(k)}}{x_p^{(k-1)}} = \frac{\lambda_1^{k} \left[ \alpha_1 v^{(1)} + \alpha_2 \left( \frac{\lambda_2}{\lambda_1} \right)^{k} v^{(2)} + \cdots + \alpha_n \left( \frac{\lambda_n}{\lambda_1} \right)^{k} v^{(n)} \right]_p}{\lambda_1^{k-1} \left[ \alpha_1 v^{(1)} + \alpha_2 \left( \frac{\lambda_2}{\lambda_1} \right)^{k-1} v^{(2)} + \cdots + \alpha_n \left( \frac{\lambda_n}{\lambda_1} \right)^{k-1} v^{(n)} \right]_p} \approx \lambda_1 , $$

and the order of convergence is linear: if the error at the k-th step is $e_k = \frac{x_p^{(k)}}{x_p^{(k-1)}} - \lambda_1$, then it can be shown that

$$ \lim_{k \to \infty} \left| \frac{e_{k+1}}{e_k} \right| = \left| \frac{\lambda_2}{\lambda_1} \right| . $$

As with any linearly convergent sequence, the speed of convergence can be accelerated by using the Aitken $\Delta^2$ process (see page 562; page 563 of 7th ed.).

2. Another technique for improving the speed of convergence is to try to reduce or minimize the ratio $|\lambda_2 / \lambda_1|$ by a technique called "shifting", which involves applying the Power Method to the matrix $A - pI$ for some scalar $p$. For example, if the set of eigenvalues of $A$ is $\{-4,\ 3.8,\ 1,\ 1\}$, then the speed of convergence of the Power Method depends on how quickly $\left( \frac{3.8}{4} \right)^k \to 0$.
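The scaled iteration illustrated in the example above can be sketched in Python. This is a minimal sketch, not code from the handout; it assumes NumPy, and reproduces the handout's scaling rule ($\mu_k$ is the entry of $Ax^{(k)}$ at the index $p$ where the previous vector had largest magnitude, and each new vector is scaled so its largest-magnitude entry is 1).

```python
import numpy as np

def power_method(A, x0, num_iters):
    """Power Method with infinity-norm scaling, as in the handout's example.

    mu is read off at index p (largest-magnitude entry of the previous
    vector); the new iterate is scaled so its entry at the new p equals 1.
    """
    x = np.asarray(x0, dtype=float)
    p = int(np.argmax(np.abs(x)))        # index of largest |entry| of x^(0)
    mu = 0.0
    for _ in range(num_iters):
        y = A @ x
        mu = y[p]                        # eigenvalue approximation mu_k
        p = int(np.argmax(np.abs(y)))    # new scaling index p
        x = y / y[p]                     # scale so x[p] = 1
    return mu, x

# The 3x3 example from the handout, with eigenvalues 4, 1, -1.
A = np.array([[ 3.0,  2.0, -1.0],
              [-2.0, -1.0,  1.0],
              [-3.0,  3.0,  2.0]])
mu, x = power_method(A, [1.0, 1.0, 1.0], 10)
print(mu)   # mu_9, approximately 4.0000839
print(x)    # x^(10), approximately (-1/3, 1/3, 1)
```

Running this reproduces the tabulated iterates: after 10 iterations the eigenvalue estimate agrees with the handout's $\mu_9 = 4.0000839$.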
However, if the Power Method is applied to the matrix $A - 2.4I$, which has eigenvalues $\{-6.4,\ 1.4,\ -1.4,\ -1.4\}$, then the speed of convergence depends on how quickly $\left( \frac{1.4}{6.4} \right)^k \to 0$. Note that if an eigenvalue of $A - pI$ is computed, then an eigenvalue of $A$ is obtained by adding $p$ onto the computed value. The difficulty with this approach is determining a suitable value for $p$.

3. Most of the limitations of the Power Method (given in the convergence theorem in Handout #35) are not important in practice:

-- the condition that $\alpha_1 \neq 0$ is not important due to the presence of round-off errors; in practice, almost any initial vector $x^{(0)}$ will work.

-- the condition that $A$ be nondefective is not necessary. The Power Method will converge if $A$ is defective, and even if the dominant eigenvalue $\lambda_1$ has multiplicity $> 1$. In this latter case, the method will converge with the same order of convergence as in the nondefective case if $\lambda_1$ has a full set of eigenvectors (but very slowly if not).

The critical limitation of the Power Method: it will not converge if there is not a unique dominant eigenvalue (that is, if $|\lambda_1| = |\lambda_2|$ but $\lambda_1 \neq \lambda_2$).

DEFLATION

Once the dominant eigenvalue $\lambda_1$ and its eigenvector $x_1$ are determined, $\lambda_2$ can be obtained by a procedure called deflation -- that is, use $\lambda_1$ and $x_1$ to compute an $(n-1) \times (n-1)$ matrix that has as its eigenvalues $\lambda_2, \lambda_3, \ldots, \lambda_n$, and then apply the Power Method to this matrix. This will converge to the next eigenvalue $\lambda_2$ of $A$ provided that $|\lambda_2| > |\lambda_3|$.

Wielandt's deflation: page 568 of the text (page 571 of 7th ed.). Ignore this, as it is numerically unstable.

A better deflation technique is the following. Having computed $\lambda_1$ and $x_1$, normalize $x_1$ so that $\|x_1\|_2 = 1$ and determine an orthogonal matrix $P$ such that
$$ P^T x_1 = e_1 \equiv (1,\ 0,\ 0,\ \ldots,\ 0)^T . $$
Note that $P^T$ could be constructed as a Householder matrix. Now define $B = P^T A P$.
Then $A$ and $B$ are (orthogonally) similar (as $P^T = P^{-1}$), and
$$ B e_1 = P^T A P e_1 = P^T A x_1 , \quad \text{since } P^T x_1 = e_1 \Rightarrow x_1 = P e_1 , $$
$$ \phantom{B e_1} = P^T (\lambda_1 x_1) = \lambda_1 P^T x_1 = \lambda_1 e_1 , \quad \text{since } P^T x_1 = e_1 . $$
Since the entries of the vector $Be_1$ are just equal to those of the first column vector of $B$, we have that
$$ B = \begin{pmatrix} \lambda_1 & b_{12} & b_{13} & \cdots & b_{1n} \\ 0 & & & & \\ \vdots & & \hat{B} & & \\ 0 & & & & \end{pmatrix} . $$
Therefore, $\hat{B}$ is an $(n-1) \times (n-1)$ matrix with eigenvalues $\lambda_2, \lambda_3, \ldots, \lambda_n$ (as $B$ is similar to $A$ and clearly $\lambda_1$ is an eigenvalue of $B$), and thus $\hat{B}$ is the desired deflated matrix. The remaining eigenvalues and eigenvectors of $A$ could be computed from $\hat{B}$.
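The Householder-based deflation described above can be sketched in Python. This is an illustrative sketch (NumPy assumed, not part of the handout), using the 3x3 example matrix: since $\|x_1\|_2 = \|e_1\|_2 = 1$, the Householder reflector built from $v = x_1 - e_1$ maps $x_1$ to $e_1$, and it is symmetric and orthogonal, so $P^T x_1 = e_1$ as required.

```python
import numpy as np

def householder_for(x1):
    """Symmetric orthogonal P with P @ x1 = e1 (so also P.T @ x1 = e1).

    Assumes x1 is a unit vector; reflects x1 onto e1 across the
    hyperplane orthogonal to v = x1 - e1.
    """
    n = x1.size
    e1 = np.zeros(n)
    e1[0] = 1.0
    v = x1 - e1
    if np.linalg.norm(v) < 1e-14:    # x1 is already e1; no reflection needed
        return np.eye(n)
    v = v / np.linalg.norm(v)
    return np.eye(n) - 2.0 * np.outer(v, v)

# The handout's example: eigenvalues 4, 1, -1; x1 is the eigenvector for 4.
A = np.array([[ 3.0,  2.0, -1.0],
              [-2.0, -1.0,  1.0],
              [-3.0,  3.0,  2.0]])
x1 = np.array([-1.0, 1.0, 3.0])
x1 = x1 / np.linalg.norm(x1)         # normalize so ||x1||_2 = 1

P = householder_for(x1)
B = P.T @ A @ P                      # B is orthogonally similar to A
B_hat = B[1:, 1:]                    # deflated (n-1) x (n-1) matrix

print(B[:, 0])                       # first column: (4, 0, 0) up to roundoff
print(np.linalg.eigvals(B_hat))      # remaining eigenvalues: 1 and -1
```

As the derivation predicts, the first column of $B$ is $\lambda_1 e_1$, and the trailing block $\hat{B}$ carries the remaining eigenvalues, ready for another Power Method pass.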