MATH10212 • Linear Algebra • Brief lecture notes

Similarity and Diagonalization

Similar Matrices

Let $A$ and $B$ be $n \times n$ matrices. We say that $A$ is similar to $B$ if there is an invertible $n \times n$ matrix $P$ such that $P^{-1}AP = B$. If $A$ is similar to $B$, we write $A \sim B$.

Remarks

• If $A \sim B$, we can write, equivalently, that $A = PBP^{-1}$ or $AP = PB$.

• The matrix $P$ depends on $A$ and $B$. It is not unique for a given pair of similar matrices $A$ and $B$. To see this, simply take $A = B = I$, in which case $I \sim I$, since $P^{-1}IP = I$ for any invertible matrix $P$.

Theorem 4.21. Let $A$, $B$ and $C$ be $n \times n$ matrices.

a. $A \sim A$.
b. If $A \sim B$, then $B \sim A$.
c. If $A \sim B$ and $B \sim C$, then $A \sim C$.

This means that $\sim$ is an equivalence relation. The main problem is to find a "good" representative in each equivalence class.

The "real" meaning of $P^{-1}AP$ is that it is the matrix of the same linear transformation (given in the standard basis by the matrix $A$) with respect to a different basis, namely the one consisting of the columns of $P$. This explains much better why $A$ and $P^{-1}AP$ share so many properties.

Theorem 4.22. Let $A$ and $B$ be $n \times n$ matrices with $A \sim B$. Then

a. $\det A = \det B$.
b. $A$ is invertible if and only if $B$ is invertible.
c. $A$ and $B$ have the same rank.
d. $A$ and $B$ have the same characteristic polynomial.
e. $A$ and $B$ have the same eigenvalues.

Diagonalization

Definition. An $n \times n$ matrix $A$ is diagonalizable if there is a diagonal matrix $D$ such that $A$ is similar to $D$ — that is, if there is an invertible matrix $P$ such that $P^{-1}AP = D$.

Note that the eigenvalues of $D$ are its diagonal entries, and these are the same as the eigenvalues of $A$.

Theorem 4.23. Let $A$ be an $n \times n$ matrix. Then $A$ is diagonalizable if and only if $A$ has $n$ linearly independent eigenvectors. More precisely, there exist an invertible matrix $P$ and a diagonal matrix $D$ such that $P^{-1}AP = D$ if and only if the columns of $P$ are $n$ linearly independent eigenvectors of $A$ and the diagonal entries of $D$ are the eigenvalues of $A$ corresponding to the eigenvectors in $P$, in the same order.

Theorem 4.25. If $A$ is an $n \times n$ matrix with $n$ distinct eigenvalues, then $A$ is diagonalizable. (This follows since eigenvectors for distinct eigenvalues are linearly independent, by Theorem 4.20.)

Theorem 4.24. Let $A$ be an $n \times n$ matrix and let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be distinct eigenvalues of $A$. If $B_i$ is a basis for the eigenspace $E_{\lambda_i}$, then $B = B_1 \cup B_2 \cup \cdots \cup B_k$ (i.e., the total collection of basis vectors for all of the eigenspaces) is linearly independent.

Lemma 4.26. If $A$ is an $n \times n$ matrix, then the geometric multiplicity of each eigenvalue is less than or equal to its algebraic multiplicity.

Theorem 4.27 (The Diagonalization Theorem). Let $A$ be an $n \times n$ matrix whose distinct eigenvalues are $\lambda_1, \lambda_2, \ldots, \lambda_k$. The following statements are equivalent:

a. $A$ is diagonalizable.
b. The union $B$ of the bases of the eigenspaces of $A$ (as in Theorem 4.24) contains $n$ vectors (which is equivalent to $\sum_{i=1}^{k} \dim E_{\lambda_i} = n$).
c. The algebraic multiplicity of each eigenvalue equals its geometric multiplicity, and all eigenvalues are real numbers — this condition is missing in the textbook!
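The invariants in Theorem 4.22 are easy to probe numerically. The following minimal sketch (not part of the original notes; it assumes Python with NumPy is available) conjugates a matrix by an arbitrary invertible $P$ and checks that determinant, rank and eigenvalues survive:

```python
import numpy as np

# Illustration of Theorem 4.22: B = P^{-1} A P shares determinant,
# rank, and eigenvalues with A, for any invertible P.
rng = np.random.default_rng(seed=1)
A = np.array([[1.0, 2.0],
              [2.0, 1.0]])
P = rng.standard_normal((2, 2))        # a random P is almost surely invertible
B = np.linalg.inv(P) @ A @ P

print(np.isclose(np.linalg.det(A), np.linalg.det(B)))        # True (part a)
print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B))  # True (part c)
print(np.allclose(np.sort(np.linalg.eigvals(A)),
                  np.sort(np.linalg.eigvals(B))))            # True (part e)
```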
In these theorems the eigenvalues are supposed to be real numbers, although for real matrices there may be some complex roots of the characteristic polynomial (in fact, these theorems remain valid for vector spaces and matrices over $\mathbb{C}$ — then, of course, one does not need the condition that the eigenvalues be all real).

Theorem 4.27 and Theorem 4.23 actually give a method to decide whether $A$ is diagonalizable, and if yes, to find $P$ such that $P^{-1}AP$ is diagonal: the columns of $P$ are the vectors of bases of the eigenspaces.

Example. For $A = \begin{pmatrix} 1 & 2 & 2 \\ 2 & 1 & 2 \\ 2 & 2 & 1 \end{pmatrix}$ the characteristic polynomial is
$$\det(A - \lambda I) = \begin{vmatrix} 1-\lambda & 2 & 2 \\ 2 & 1-\lambda & 2 \\ 2 & 2 & 1-\lambda \end{vmatrix} = (1-\lambda)^3 + 8 + 8 - 4(1-\lambda) - 4(1-\lambda) - 4(1-\lambda) = \cdots = -(\lambda - 5)(\lambda + 1)^2.$$
Thus, the eigenvalues are $5$ and $-1$.

Eigenspace $E_{-1}$: $(A - (-1)I)\vec x = \vec 0$; $\begin{pmatrix} 2 & 2 & 2 \\ 2 & 2 & 2 \\ 2 & 2 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$; $x_1 = -x_2 - x_3$, where $x_2, x_3$ are free variables; $E_{-1} = \left\{ \begin{pmatrix} -s-t \\ s \\ t \end{pmatrix} \;\middle|\; s, t \in \mathbb{R} \right\}$; a basis of $E_{-1}$: $\begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$.

Eigenspace $E_5$: $(A - 5I)\vec x = \vec 0$; $\begin{pmatrix} -4 & 2 & 2 \\ 2 & -4 & 2 \\ 2 & 2 & -4 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$; solving this system gives $x_1 = x_2 = x_3$, where $x_3$ is a free variable; $E_5 = \left\{ \begin{pmatrix} t \\ t \\ t \end{pmatrix} \;\middle|\; t \in \mathbb{R} \right\}$; a basis of $E_5$: $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$.

Together the dimensions add up to $3$, so $B_5 \cup B_{-1}$ is a basis of $\mathbb{R}^3$, so $A$ is diagonalizable. Let
$$P = \begin{pmatrix} 1 & -1 & -1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{pmatrix}; \quad \text{then} \quad P^{-1}AP = \begin{pmatrix} 5 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}.$$
(Note that if we arrange the eigenvectors in a different order, then the eigenvalues on the diagonal must be arranged accordingly: let
$$Q = \begin{pmatrix} -1 & -1 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}; \quad \text{then} \quad Q^{-1}AQ = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 5 \end{pmatrix}.)$$

Example. For $A = \begin{pmatrix} 3 & 20 & 29 \\ 0 & 1 & 82 \\ 0 & 0 & 7 \end{pmatrix}$ the eigenvalues are $3$, $1$, and $7$. Since they are distinct, the matrix is diagonalizable. (To find the $P$ such that $P^{-1}AP = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 7 \end{pmatrix}$, one still needs to solve the linear systems $(A - \lambda I)\vec x = \vec 0$.)

Example. For $A = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{pmatrix}$ the only eigenvalue is $3$, of algebraic multiplicity $3$. Eigenspace $E_3$: $\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \vec x = \vec 0$; the matrix has rank $2$, so $\dim E_3 = 1$. So $A$ is not diagonalizable.

Example. Use diagonalization to find $A^{100}$ for $A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}$. The eigenvalues are $-1$ and $3$. Eigenspace $E_3$: $\begin{pmatrix} -2 & 2 \\ 2 & -2 \end{pmatrix} \vec x = \vec 0$; $x_1 = x_2$; basis $\left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right\}$. Eigenspace $E_{-1}$: $\begin{pmatrix} 2 & 2 \\ 2 & 2 \end{pmatrix} \vec x = \vec 0$; $x_1 = -x_2$; basis $\left\{ \begin{pmatrix} -1 \\ 1 \end{pmatrix} \right\}$. Let $P = \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix}$; then $P^{-1}AP = D = \begin{pmatrix} -1 & 0 \\ 0 & 3 \end{pmatrix}$. Now $A = PDP^{-1}$, so
$$A^{100} = (PDP^{-1})^{100} = PDP^{-1} \cdot PDP^{-1} \cdots PDP^{-1} = PD^{100}P^{-1} = \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} -1 & 0 \\ 0 & 3 \end{pmatrix}^{100} \begin{pmatrix} -1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix}$$
$$= \begin{pmatrix} -1 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 3^{100} \end{pmatrix} \begin{pmatrix} -1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix} = \begin{pmatrix} -1 & 3^{100} \\ 1 & 3^{100} \end{pmatrix} \begin{pmatrix} -1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} 3^{100} + 1 & 3^{100} - 1 \\ 3^{100} - 1 & 3^{100} + 1 \end{pmatrix}.$$

Orthogonality in $\mathbb{R}^n$

We introduce the dot product of vectors in $\mathbb{R}^n$ by setting $\vec u \cdot \vec v = \vec u^T \vec v$; that is, if
$$\vec u = \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} \quad \text{and} \quad \vec v = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}$$
then
$$\vec u \cdot \vec v = \vec u^T \vec v = \begin{pmatrix} u_1 & \cdots & u_n \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n.$$

The dot product is frequently called the scalar product or inner product; we shall use the latter term in a slightly more general context. Notice the following properties of the dot product, which can easily be checked directly or which immediately follow from the properties of matrix multiplication. They hold for arbitrary vectors $\vec u, \vec v, \vec w \in \mathbb{R}^n$ and an arbitrary scalar $\lambda$.

• $\vec u \cdot \vec v = \vec v \cdot \vec u$ (commutativity).
• $\vec u \cdot (\vec v + \vec w) = \vec u \cdot \vec v + \vec u \cdot \vec w$
• $\vec u \cdot (\lambda \vec v) = \lambda (\vec u \cdot \vec v)$

(The last two properties are referred to as linearity of the dot product.)

• $\vec u \cdot \vec u = u_1^2 + \cdots + u_n^2$, and therefore $\vec u \cdot \vec u \geq 0$. Moreover, if $\vec u \cdot \vec u = 0$ then $\vec u = \vec 0$.
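As a quick numerical cross-check (a sketch added here, not from the notes; NumPy assumed), both the $3 \times 3$ example and the $A^{100}$ computation can be reproduced directly:

```python
import numpy as np

# The 3x3 example: P built from the eigenspace bases diagonalizes A.
A = np.array([[1.0, 2.0, 2.0],
              [2.0, 1.0, 2.0],
              [2.0, 2.0, 1.0]])
P = np.array([[1.0, -1.0, -1.0],   # columns: basis of E_5, then of E_{-1}
              [1.0,  1.0,  0.0],
              [1.0,  0.0,  1.0]])
print(np.round(np.linalg.inv(P) @ A @ P, 10))   # diag(5, -1, -1)

# The 2x2 example: A^100 = P D^100 P^{-1}.
A2 = np.array([[1.0, 2.0],
               [2.0, 1.0]])
P2 = np.array([[-1.0, 1.0],
               [ 1.0, 1.0]])
D2 = np.diag([-1.0, 3.0])
A100 = P2 @ np.linalg.matrix_power(D2, 100) @ np.linalg.inv(P2)
print(np.allclose(A100, np.linalg.matrix_power(A2, 100)))    # True
```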
We define the length (or norm) $\|\vec v\|$ of a vector
$$\vec v = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}$$
by
$$\|\vec v\| = \sqrt{\vec v \cdot \vec v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}.$$

Orthogonal and Orthonormal Sets of Vectors

A set of vectors $\vec v_1, \vec v_2, \ldots, \vec v_k$ in $\mathbb{R}^n$ is called an orthogonal set if all pairs of distinct vectors in the set are orthogonal — that is, if
$$\vec v_i \cdot \vec v_j = 0 \quad \text{whenever } i \neq j \text{ for } i, j = 1, 2, \ldots, k.$$

The standard basis $\vec e_1, \vec e_2, \ldots, \vec e_n$ of $\mathbb{R}^n$ is an orthogonal set, as is any subset of it. As the first example illustrates, there are many other possibilities.

Example 5.1. Show that $\{\vec v_1, \vec v_2, \vec v_3\}$ is an orthogonal set in $\mathbb{R}^3$ if
$$\vec v_1 = \begin{pmatrix} 2 \\ 1 \\ -1 \end{pmatrix}, \quad \vec v_2 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \quad \vec v_3 = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}.$$

Solution. We must show that every pair of vectors from this set is orthogonal. This is true, since
$$\vec v_1 \cdot \vec v_2 = 2(0) + 1(1) + (-1)(1) = 0$$
$$\vec v_2 \cdot \vec v_3 = 0(1) + 1(-1) + (1)(1) = 0$$
$$\vec v_1 \cdot \vec v_3 = 2(1) + 1(-1) + (-1)(1) = 0$$

Theorem 5.1. If $\vec v_1, \vec v_2, \ldots, \vec v_k$ is an orthogonal set of nonzero vectors in $\mathbb{R}^n$, then these vectors are linearly independent.

Proof. If $c_1, c_2, \ldots, c_k$ are scalars such that $c_1 \vec v_1 + c_2 \vec v_2 + \cdots + c_k \vec v_k = \vec 0$, then
$$(c_1 \vec v_1 + c_2 \vec v_2 + \cdots + c_k \vec v_k) \cdot \vec v_i = \vec 0 \cdot \vec v_i = 0$$
or, equivalently,
$$c_1 (\vec v_1 \cdot \vec v_i) + \cdots + c_i (\vec v_i \cdot \vec v_i) + \cdots + c_k (\vec v_k \cdot \vec v_i) = 0 \quad (1)$$
Since $\vec v_1, \vec v_2, \ldots, \vec v_k$ is an orthogonal set, all of the dot products in equation (1) are zero, except $\vec v_i \cdot \vec v_i$. Thus, equation (1) reduces to
$$c_i (\vec v_i \cdot \vec v_i) = 0.$$
Now, $\vec v_i \cdot \vec v_i \neq 0$ because $\vec v_i \neq \vec 0$ by hypothesis. So we must have $c_i = 0$. The fact that this is true for all $i = 1, \ldots, k$ implies that $\vec v_1, \vec v_2, \ldots, \vec v_k$ is a linearly independent set. $\Box$

Remark. Thanks to Theorem 5.1, we know that if a set of vectors is orthogonal, it is automatically linearly independent. For example, we can immediately deduce that the three vectors in Example 5.1 are linearly independent. Contrast this approach with the work needed to establish their linear independence directly!

An orthogonal basis for a subspace $W$ of $\mathbb{R}^n$ is a basis of $W$ that is an orthogonal set.

Example 5.2. The vectors
$$\vec v_1 = \begin{pmatrix} 2 \\ 1 \\ -1 \end{pmatrix}, \quad \vec v_2 = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \quad \vec v_3 = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix}$$
from Example 5.1 are orthogonal and, hence, linearly independent. Since any three linearly independent vectors in $\mathbb{R}^3$ form a basis of $\mathbb{R}^3$, by the Fundamental Theorem of Invertible Matrices, it follows that $\{\vec v_1, \vec v_2, \vec v_3\}$ is an orthogonal basis for $\mathbb{R}^3$.

Theorem 5.2. Let $\{\vec v_1, \vec v_2, \ldots, \vec v_k\}$ be an orthogonal basis for a subspace $W$ of $\mathbb{R}^n$ and let $\vec w$ be any vector in $W$. Then the unique scalars $c_1, c_2, \ldots, c_k$ such that
$$\vec w = c_1 \vec v_1 + c_2 \vec v_2 + \cdots + c_k \vec v_k$$
are given by
$$c_i = \frac{\vec w \cdot \vec v_i}{\vec v_i \cdot \vec v_i} \quad \text{for } i = 1, \ldots, k.$$

Proof. Since $\vec v_1, \vec v_2, \ldots, \vec v_k$ is a basis for $W$, we know that there are unique scalars $c_1, c_2, \ldots, c_k$ such that $\vec w = c_1 \vec v_1 + c_2 \vec v_2 + \cdots + c_k \vec v_k$ (from Theorem 3.29). To establish the formula for $c_i$, we take the dot product of this linear combination with $\vec v_i$ to obtain
$$\vec w \cdot \vec v_i = (c_1 \vec v_1 + c_2 \vec v_2 + \cdots + c_k \vec v_k) \cdot \vec v_i = c_1 (\vec v_1 \cdot \vec v_i) + \cdots + c_i (\vec v_i \cdot \vec v_i) + \cdots + c_k (\vec v_k \cdot \vec v_i) = c_i (\vec v_i \cdot \vec v_i)$$
since $\vec v_j \cdot \vec v_i = 0$ for $j \neq i$. Since $\vec v_i \neq \vec 0$, $\vec v_i \cdot \vec v_i \neq 0$. Dividing by $\vec v_i \cdot \vec v_i$, we obtain the desired result. $\Box$

A unit vector is a vector of unit length. Notice that if $\vec v \neq \vec 0$ then
$$\vec u = \frac{\vec v}{\|\vec v\|}$$
is a unit vector collinear with $\vec v$ (directed along the same line): $\vec v = \|\vec v\| \vec u$.

A set of vectors in $\mathbb{R}^n$ is an orthonormal set if it is an orthogonal set of unit vectors.
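The coordinate formula of Theorem 5.2 is easy to test on the basis from Example 5.1. Here is a small sketch (not in the original notes; it assumes NumPy, and the test vector $\vec w$ is an arbitrary choice):

```python
import numpy as np

# Theorem 5.2: in an orthogonal basis, c_i = (w . v_i) / (v_i . v_i).
v1 = np.array([2.0,  1.0, -1.0])
v2 = np.array([0.0,  1.0,  1.0])
v3 = np.array([1.0, -1.0,  1.0])
w  = np.array([1.0,  2.0,  3.0])       # any vector works, since here W = R^3

coeffs = [(w @ v) / (v @ v) for v in (v1, v2, v3)]
recon = sum(c * v for c, v in zip(coeffs, (v1, v2, v3)))
print(coeffs)                          # the unique coordinates c_1, c_2, c_3
print(np.allclose(recon, w))           # True: w = c1 v1 + c2 v2 + c3 v3
```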
An orthonormal basis for a subspace $W$ of $\mathbb{R}^n$ is a basis of $W$ that is an orthonormal set.

Theorem 5.3. Let $\{\vec q_1, \vec q_2, \ldots, \vec q_k\}$ be an orthonormal basis for a subspace $W$ of $\mathbb{R}^n$ and let $\vec w$ be any vector in $W$. Then
$$\vec w = (\vec w \cdot \vec q_1)\vec q_1 + (\vec w \cdot \vec q_2)\vec q_2 + \cdots + (\vec w \cdot \vec q_k)\vec q_k$$
and this representation is unique.

Theorem 5.4. The columns of an $m \times n$ matrix $Q$ form an orthonormal set if and only if $Q^T Q = I_n$.

Proof. We need to show that
$$(Q^T Q)_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases}$$
Let $\vec q_i$ denote the $i$th column of $Q$ (and, hence, the $i$th row of $Q^T$). Since the $(i, j)$ entry of $Q^T Q$ is the dot product of the $i$th row of $Q^T$ and the $j$th column of $Q$, it follows that
$$(Q^T Q)_{ij} = \vec q_i \cdot \vec q_j \quad (2)$$
by the definition of matrix multiplication. Now the columns of $Q$ form an orthonormal set if and only if
$$\vec q_i \cdot \vec q_j = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases}$$
which, by equation (2), holds if and only if
$$(Q^T Q)_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases}$$
This completes the proof. $\Box$

If the matrix $Q$ in Theorem 5.4 is a square matrix, it has a special name. An $n \times n$ matrix $Q$ whose columns form an orthonormal set is called an orthogonal matrix. The most important fact about orthogonal matrices is given by the next theorem.

Theorem 5.5. A square matrix $Q$ is orthogonal if and only if $Q^{-1} = Q^T$.

Proof. By Theorem 5.4, $Q$ is orthogonal if and only if $Q^T Q = I$. This is true if and only if $Q$ is invertible and $Q^{-1} = Q^T$, by Theorem 3.13. $\Box$

Example. Each of the following matrices is orthogonal:
$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix}, \quad \begin{pmatrix} \cos\alpha & \sin\alpha \\ \sin\alpha & -\cos\alpha \end{pmatrix}$$

Theorem 5.6. Let $Q$ be an $n \times n$ matrix. The following statements are equivalent:

a. $Q$ is orthogonal.
b. $\|Q\vec x\| = \|\vec x\|$ for every $\vec x$ in $\mathbb{R}^n$.
c. $Q\vec x \cdot Q\vec y = \vec x \cdot \vec y$ for every $\vec x$ and $\vec y$ in $\mathbb{R}^n$.

Theorem 5.7. If $Q$ is an orthogonal matrix, then its rows form an orthonormal set.

Theorem 5.8. Let $Q$ be an orthogonal matrix.

a. $Q^{-1}$ is orthogonal.
b. $\det Q = \pm 1$.
c. If $\lambda$ is an eigenvalue of $Q$, then $|\lambda| = 1$.
d. If $Q_1$ and $Q_2$ are orthogonal $n \times n$ matrices, then so is $Q_1 Q_2$.
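To make Theorems 5.4–5.6 and 5.8 concrete, here is a brief numerical check (an added sketch, not from the notes; NumPy assumed) on the last matrix of the example above, with an arbitrary angle and test vector:

```python
import numpy as np

# Checks on Q = [[cos a, sin a], [sin a, -cos a]] from the example.
a = 0.7
Q = np.array([[np.cos(a),  np.sin(a)],
              [np.sin(a), -np.cos(a)]])

print(np.allclose(Q.T @ Q, np.eye(2)))           # Q^T Q = I      (Theorem 5.4)
print(np.allclose(np.linalg.inv(Q), Q.T))        # Q^{-1} = Q^T   (Theorem 5.5)
x = np.array([3.0, -4.0])
print(np.isclose(np.linalg.norm(Q @ x), np.linalg.norm(x)))     # (Theorem 5.6b)
print(np.isclose(abs(np.linalg.det(Q)), 1.0))    # det Q = +-1    (Theorem 5.8b)
```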