COMPUTER SCIENCE 349B
HANDOUT #36
NOTES ON THE POWER METHOD
EXAMPLE
Let

$$A = \begin{pmatrix} 3 & 2 & -1 \\ -2 & -1 & 1 \\ -3 & 3 & 2 \end{pmatrix},$$

which has eigenvalues 4, 1 and −1. Let x^(0) = (1, 1, 1)^T and p = 1. The following computations illustrate the Power Method.
k    A x^(k)                        µk                    p    x^(k+1)
0    (4, −2, 2)^T                   µ0 = 4                1    x^(1) = (1, −1/2, 1/2)^T
1    (3/2, −1, −7/2)^T              µ1 = 3/2              3    x^(2) = (−3/7, 2/7, 1)^T
2    (−12/7, 11/7, 29/7)^T          µ2 = 29/7 = 4.1429    3    x^(3) = (−0.4138, 0.3793, 1)^T
3    (−1.4828, 1.4483, 4.3793)^T    µ3 = 4.3793           3    x^(4) = (−0.3386, 0.3307, 1)^T
4    (−1.3543, 1.3465, 4.0079)^T    µ4 = 4.0079           3    x^(5) = (−0.3379, 0.3360, 1)^T
After 10 iterations:

µ9 = 4.0000839 and x^(10) = (−0.3333346, 0.3333327, 1)^T.

So it is clear that {µk} → 4 and x^(k) → (−1/3, 1/3, 1)^T, which are the dominant eigenvalue of A and its corresponding eigenvector.
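The iteration is straightforward to program. Below is a minimal NumPy sketch of the scaled Power Method used above (the function name, tolerance, and iteration cap are illustrative choices, not part of the handout); applied to the matrix A of this example with x^(0) = (1, 1, 1)^T, it reproduces the µk and x^(k) values tabulated above. Note that Python indices are 0-based, so the handout's p = 1 corresponds to index 0 here.

```python
import numpy as np

def power_method(A, x0, tol=1e-8, max_iter=50):
    """Scaled Power Method: returns the eigenvalue estimates mu_k and the
    final (infinity-norm scaled) eigenvector approximation."""
    x = np.array(x0, dtype=float)
    p = int(np.argmax(np.abs(x)))       # smallest index of a largest-magnitude component
    mus = []
    for _ in range(max_iter):
        y = A @ x
        mus.append(y[p])                # mu_k = y_p, with p from the previous step
        p = int(np.argmax(np.abs(y)))   # new scaling index
        if y[p] == 0.0:                 # A x = 0, so 0 is an eigenvalue; stop
            break
        x_new = y / y[p]                # scale so that the p-th component is 1
        if np.linalg.norm(x_new - x, np.inf) < tol:
            x = x_new
            break
        x = x_new
    return mus, x

A = np.array([[ 3.0,  2.0, -1.0],
              [-2.0, -1.0,  1.0],
              [-3.0,  3.0,  2.0]])
mus, x = power_method(A, [1.0, 1.0, 1.0])
print(mus[:5])   # [4.0, 1.5, 4.1429, 4.3793, 4.0079]  (to 4 decimals)
print(x)         # approaches (-1/3, 1/3, 1)^T
```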
COMMENTS ON THE POWER METHOD
1. The speed of convergence depends on the ratio |λ2/λ1|, which is always < 1. Convergence will be slow if this ratio is close to 1.
The k-th approximation to λ1 is

$$\mu_k = \frac{x_p^{(k)}}{x_p^{(k-1)}}
= \frac{\lambda_1^{k}\left[\alpha_1 v^{(1)} + \alpha_2 \left(\frac{\lambda_2}{\lambda_1}\right)^{k} v^{(2)} + \cdots + \alpha_n \left(\frac{\lambda_n}{\lambda_1}\right)^{k} v^{(n)}\right]_p}
       {\lambda_1^{k-1}\left[\alpha_1 v^{(1)} + \alpha_2 \left(\frac{\lambda_2}{\lambda_1}\right)^{k-1} v^{(2)} + \cdots + \alpha_n \left(\frac{\lambda_n}{\lambda_1}\right)^{k-1} v^{(n)}\right]_p}
\approx \lambda_1,$$
and the order of convergence is linear: if the error at the k-th step is

$$e_k = \frac{x_p^{(k)}}{x_p^{(k-1)}} - \lambda_1,$$

then it can be shown that

$$\lim_{k \to \infty} \left|\frac{e_{k+1}}{e_k}\right| = \left|\frac{\lambda_2}{\lambda_1}\right|.$$
As with any linearly convergent sequence, the speed of convergence can be accelerated by using the Aitken Δ² process (see page 562 of the text; page 563 of the 7th ed.).
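For illustration, here is a generic sketch of the Aitken Δ² process (the function below is an assumed helper, not code from the text); it can be applied directly to the sequence {µk} produced by the Power Method.

```python
import numpy as np

def aitken_delta2(s):
    """Aitken's Delta^2 process: s_hat[k] = s[k] - (Ds_k)^2 / (D2s_k),
    where Ds_k = s[k+1] - s[k] and D2s_k = s[k+2] - 2 s[k+1] + s[k].
    Assumes D2s_k != 0; the output is two terms shorter than the input."""
    s = np.asarray(s, dtype=float)
    d1 = s[1:-1] - s[:-2]                 # first forward differences
    d2 = s[2:] - 2.0 * s[1:-1] + s[:-2]   # second forward differences
    return s[:-2] - d1**2 / d2

# e.g. accelerate the Power Method estimates from the example above:
# mu_hat = aitken_delta2(mus)
```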
2. Another technique for improving the speed of convergence is to try to reduce or minimize the ratio |λ2/λ1| by a technique called "shifting", which involves applying the Power Method to the matrix A − pI for some scalar p.
For example, if the set of eigenvalues of A is {−4, 3.8, 1, 1}, then the speed of convergence of the Power Method depends on how quickly (3.8/4)^k → 0. However, if the Power Method is applied to the matrix A − 2.4I, which has eigenvalues {−6.4, 1.4, −1.4, −1.4}, then the speed of convergence depends on how quickly (1.4/6.4)^k → 0. Note that if an eigenvalue of A − pI is computed, then an eigenvalue of A is obtained by adding p onto the computed value. The difficulty with this approach is determining a suitable value for p.
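As a sketch of how shifting might be coded (the wrapper below is illustrative and reuses the hypothetical power_method function from the example above):

```python
import numpy as np

def shifted_power_method(A, x0, shift, **kwargs):
    """Run the Power Method on A - shift*I, then add the shift back so
    that the returned estimates are eigenvalue estimates for A itself."""
    n = A.shape[0]
    mus, x = power_method(A - shift * np.eye(n), x0, **kwargs)
    return [mu + shift for mu in mus], x

# For a matrix with eigenvalues {-4, 3.8, 1, 1}, as in the example above,
# shifted_power_method(A, x0, 2.4) converges at the rate (1.4/6.4)^k
# instead of (3.8/4)^k.
```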
3. Most of the limitations of the Power Method (given in the convergence theorem in
Handout #35) are not important in practice:
-- the condition that α1 ≠ 0 is not important due to the presence of round-off errors; in practice, almost any initial vector x^(0) will work.
-- the condition that A be nondefective is not necessary. The Power Method will
converge if A is defective, and even if the dominant eigenvalue λ1 has multiplicity > 1.
In this latter case, the method will converge with the same order of convergence as in the
nondefective case if λ1 has a full set of eigenvectors (but very slowly if not).
The critical limitation of the Power Method: it will not converge if there is not a unique dominant eigenvalue (that is, if |λ1| = |λ2| but λ1 ≠ λ2).
DEFLATION
Once the dominant eigenvalue λ1 and its eigenvector x1 are determined, λ2 can be obtained by a procedure called deflation -- that is, use λ1 and x1 to compute an (n − 1) × (n − 1) matrix that has as its eigenvalues λ2, λ3, …, λn, and then apply the Power Method to this matrix. This will converge to the next eigenvalue λ2 of A provided that |λ2| > |λ3|.
Wielandt's deflation: page 568 of the text (page 571 of the 7th ed.). Ignore this, as it is numerically unstable. A better deflation technique is the following.
Having computed λ1 and x1, normalize x1 so that ||x1||_2 = 1 and determine an orthogonal matrix P such that P^T x1 = e1 ≡ (1, 0, 0, …, 0)^T.

Note that P^T could be constructed as a Householder matrix.
Now define B = P^T A P. Then A and B are (orthogonally) similar (as P^T = P^{−1}), and

$$B e_1 = P^T A P e_1 = P^T A x_1 \qquad (\text{since } P^T x_1 = e_1 \Rightarrow x_1 = P e_1)$$
$$\phantom{B e_1} = P^T (\lambda_1 x_1) = \lambda_1 P^T x_1 = \lambda_1 e_1 \qquad (\text{since } P^T x_1 = e_1).$$
Since the entries of the vector B e1 are just equal to those of the first column vector of B, we have that

$$B = \begin{pmatrix}
\lambda_1 & b_{12} & b_{13} & \cdots & b_{1n} \\
0 & & & & \\
0 & & \hat{B} & & \\
\vdots & & & & \\
0 & & & &
\end{pmatrix}.$$
Therefore, B̂ is an (n − 1) × (n − 1) matrix with eigenvalues λ2, λ3, …, λn (as B is similar to A and clearly λ1 is an eigenvalue of B), and thus B̂ is the desired deflated matrix. The remaining eigenvalues and eigenvectors of A could be computed from B̂.
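As a sketch, the deflation described above might be coded as follows (the function and sign conventions are illustrative; the usual cancellation-avoiding sign choice for the Householder vector gives P^T x1 = ±e1 rather than exactly e1, and either sign yields the block structure above):

```python
import numpy as np

def deflate(A, x1):
    """Deflation via a Householder reflector: returns the (n-1)x(n-1)
    matrix B_hat whose eigenvalues are lambda_2, ..., lambda_n."""
    x1 = np.asarray(x1, dtype=float)
    x1 = x1 / np.linalg.norm(x1)              # normalize so ||x1||_2 = 1
    e1 = np.zeros_like(x1)
    e1[0] = 1.0
    # Householder matrix P = I - 2 w w^T with P x1 = -/+ e1; the sign of
    # the shift is chosen so that w is never (near) zero.
    w = x1 + np.copysign(1.0, x1[0]) * e1
    w = w / np.linalg.norm(w)
    P = np.eye(len(x1)) - 2.0 * np.outer(w, w)
    B = P.T @ A @ P                           # B is orthogonally similar to A
    return B[1:, 1:]                          # B_hat: drop first row and column

# Usage: after the Power Method gives x1 (for lambda1), apply it again to
# deflate(A, x1) to estimate lambda2, provided |lambda2| > |lambda3|.
```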