Density of Diagonalizable Square Matrices

Student: Daniel Cervone; Mentor: Saravanan Thiyagarajan
University of Chicago VIGRE REU, Summer 2007

Throughout this paper, $V$ denotes a vector space over $\mathbb{C}$ and $L(V)$ the set of linear operators $\{A \mid A : V \to V\}$. Recall the following definition: if $A$ is a linear operator on a vector space $V$, and there exist a nonzero $v \in V$ and $\lambda \in \mathbb{C}$ such that $Av = \lambda v$, then $v$ and $\lambda$ are an eigenvector and eigenvalue of $A$, respectively. A matrix is called diagonalizable if it is similar to some diagonal matrix.

Theorem 1: If $A \in L(V)$ has $n = \dim V$ distinct eigenvalues, then $A$ is diagonalizable.

Proof: Let $w_1, \dots, w_n$ be eigenvectors corresponding to the distinct eigenvalues $\lambda_1, \dots, \lambda_n$, and let $W$ be the matrix whose columns are $w_1, \dots, w_n$. A quick calculation verifies that

$$
\begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{pmatrix}
\begin{pmatrix} w_1 & w_2 & \cdots & w_n \end{pmatrix}
=
\begin{pmatrix} w_1 & w_2 & \cdots & w_n \end{pmatrix}
\begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix},
$$

since the left-hand side is $\begin{pmatrix} Aw_1 & Aw_2 & \cdots & Aw_n \end{pmatrix}$, the right-hand side is $\begin{pmatrix} \lambda_1 w_1 & \lambda_2 w_2 & \cdots & \lambda_n w_n \end{pmatrix}$, and clearly $Aw_i = \lambda_i w_i$. Moreover, $W$ is invertible, since the eigenvalues of $A$ are distinct and therefore $w_1, \dots, w_n$ are linearly independent. Thus

$$
\begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & & \vdots \\ a_{n,1} & \cdots & a_{n,n} \end{pmatrix}
=
\begin{pmatrix} w_1 & w_2 & \cdots & w_n \end{pmatrix}
\begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}
\begin{pmatrix} w_1 & w_2 & \cdots & w_n \end{pmatrix}^{-1}.
$$

This proves the theorem.
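As a numerical illustration of Theorem 1 (not part of the original argument), the following NumPy sketch builds $W$ from the eigenvectors of a small matrix with distinct eigenvalues and checks that $W^{-1} A W$ is diagonal; the particular matrix $A$ is an arbitrary choice made for this example.

```python
import numpy as np

# An arbitrary 3x3 matrix with n = dim V = 3 distinct eigenvalues
# (4, 2 and 7, read off its triangular form).
A = np.array([[4.0, 1.0, 2.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 7.0]])

eigenvalues, W = np.linalg.eig(A)   # columns of W are eigenvectors w_1, ..., w_n
D = np.linalg.inv(W) @ A @ W        # W^{-1} A W, as in the proof of Theorem 1

print(np.round(D, 10))              # a diagonal matrix carrying the eigenvalues of A
print(np.allclose(A, W @ np.diag(eigenvalues) @ np.linalg.inv(W)))  # True
```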
Theorem 2: Suppose $T \in L(V)$ has nondistinct eigenvalues, and let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$, so that $m < \dim V$. Then there exists a basis of $V$ with respect to which $T$ has the form

$$
\begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{pmatrix},
\qquad \text{where each } A_j \text{ is an upper triangular matrix of the form }
\begin{pmatrix} \lambda_j & * & * \\ & \ddots & * \\ 0 & & \lambda_j \end{pmatrix}.
$$

Proof: For each $1 \le j \le m$, let $U_j$ be the subspace of generalized eigenvectors of $T$ corresponding to $\lambda_j$:

$$U_j = \{\, v \in V : (T - \lambda_j I)^k v = 0 \text{ for some } k \in \mathbb{N} \,\}.$$

It follows immediately that, for each $j$, $U_j = \mathrm{null}\,(T - \lambda_j I)^k$ for $k$ sufficiently large, and that $(T - \lambda_j I)\big|_{U_j}$ is nilpotent. Note that if $A$ is any linear operator on $V$, then $\mathrm{null}(A)$ is a subspace of $V$, since it contains $0$ and is closed under addition and scalar multiplication (both follow from $A$ being linear). Before continuing, we need a crucial lemma.

Lemma 1: If $N$ is a nilpotent linear operator on a vector space $X$, then there exists a basis of $X$ with respect to which $N$ has the form

$$\begin{pmatrix} 0 & * & * \\ & \ddots & * \\ 0 & & 0 \end{pmatrix}$$

(i.e. $N$ has 0's on and below the diagonal).

Proof of Lemma: First note that $N$ nilpotent on $X$ means there exists $p \in \mathbb{N}$ such that $N^p = 0$, so $X = \mathrm{null}(N^p)$. Next, choose a basis $b_1, \dots, b_{k_1}$ of $\mathrm{null}(N)$ and extend it to a basis $b_1, \dots, b_{k_2}$ of $\mathrm{null}(N^2)$, where $k_1 \le k_2 \le \dim X$. We can do this because if $v \in \mathrm{null}(N)$ then $Nv = 0$, so clearly $N(Nv) = 0$; thus $\mathrm{null}(N) \subseteq \mathrm{null}(N^2)$. And since $b_1, \dots, b_{k_1}$ are linearly independent vectors that span $\mathrm{null}(N)$, we can span $\mathrm{null}(N^2)$ by $b_1, \dots, b_{k_1}$ together with one or more linearly independent vectors $b_{k_1+1}, \dots, b_{k_2}$ in $\mathrm{null}(N^2)$ that do not depend on $b_1, \dots, b_{k_1}$. We keep extending in this way to a basis of $\mathrm{null}(N^3)$ and eventually of $\mathrm{null}(N^p)$. In doing so, we establish a basis $B = \{b_1, \dots, b_d\}$ of $X$, where $d = \dim X$, since $B$ is a basis of $\mathrm{null}(N^p) = X$.

Now consider $N$ with respect to this basis. Writing $N$ with respect to $B$ gives the matrix $[\,Nb_1 \mid Nb_2 \mid \cdots \mid Nb_d\,]$, where the $i$-th column is the $d \times 1$ coordinate vector of $Nb_i$ with respect to $B$. Since $b_1 \in \mathrm{null}(N)$, the first column is entirely $0$; the same holds for every column through $k_1$. The next column is $Nb_{k_1+1}$, and $Nb_{k_1+1} \in \mathrm{null}(N)$ since $N^2 b_{k_1+1} = 0$ (recall that $b_{k_1+1} \in \mathrm{null}(N^2)$). Hence $Nb_{k_1+1}$ is a linear combination of $b_1, \dots, b_{k_1}$, so all nonzero entries in column $k_1 + 1$ lie above the diagonal. The same is true for every column from $k_1 + 1$ through $k_2$, since $b_1, \dots, b_{k_2}$ span $\mathrm{null}(N^2)$. Similarly, the next column, $Nb_{k_2+1}$, lies in $\mathrm{null}(N^2)$ since $b_{k_2+1}$ is a basis vector of $\mathrm{null}(N^3)$; thus $Nb_{k_2+1}$ depends only on $b_1, \dots, b_{k_2}$, and any nonzero entries in column $k_2 + 1$ lie above the diagonal. Continuing this process through column $d$ confirms that $N$ with respect to the basis $B$ has the form

$$\begin{pmatrix} 0 & * & * \\ & \ddots & * \\ 0 & & 0 \end{pmatrix}.$$

This proves the lemma.

We now continue the proof of the theorem. Recall that for each $1 \le j \le m$, $(T - \lambda_j I)\big|_{U_j}$ is nilpotent. Thus, by the lemma just proved, for each $j$ there exists a basis $B_j$ of $U_j$ such that, with respect to $B_j$,

$$(T - \lambda_j I)\big|_{U_j} = \begin{pmatrix} 0 & * & * \\ & \ddots & * \\ 0 & & 0 \end{pmatrix},
\qquad \text{and therefore} \qquad
T\big|_{U_j} = \begin{pmatrix} \lambda_j & * & * \\ & \ddots & * \\ 0 & & \lambda_j \end{pmatrix}.$$

Moreover, if $B$ is the basis $\{B_1, \dots, B_m\}$ of $U = U_1 \oplus U_2 \oplus \cdots \oplus U_m$, then $T\big|_U$ with respect to $B$ has the form

$$\begin{pmatrix} T_1 & & 0 \\ & \ddots & \\ 0 & & T_m \end{pmatrix},
\qquad \text{where each } T_j \text{ is an upper triangular matrix of the form }
\begin{pmatrix} \lambda_j & * & * \\ & \ddots & * \\ 0 & & \lambda_j \end{pmatrix}.$$

This is the form required by the theorem; however, we still need to show that this form is attainable for $T$ with respect to a basis of all of $V$. It suffices to show that $V = U$, for then a basis of $U$ is a basis of $V$.

To do this, consider the linear operator $S \in L(V)$ defined by $S = \prod_{j=1}^{m} (T - \lambda_j I)^{\dim V}$. Our claim is that $S\big|_U = 0$. To verify it, note first that $\mathrm{null}\,(T - \lambda_j I)^1 \subseteq \mathrm{null}\,(T - \lambda_j I)^2 \subseteq \cdots \subseteq \mathrm{null}\,(T - \lambda_j I)^n$ for any $n$. We want to strengthen this statement into the following lemma.

Lemma 2: $\mathrm{null}\,(T - \lambda_j I)^1 \subseteq \mathrm{null}\,(T - \lambda_j I)^2 \subseteq \cdots \subseteq \mathrm{null}\,(T - \lambda_j I)^{\dim V} = \mathrm{null}\,(T - \lambda_j I)^{\dim V + 1} = \cdots = \mathrm{null}\,(T - \lambda_j I)^{\dim V + n}$ for every $n \in \mathbb{N}$.

Proof: Suppose there exists $k$ such that $\mathrm{null}\,(T - \lambda_j I)^k = \mathrm{null}\,(T - \lambda_j I)^{k+1}$. If $x \in \mathrm{null}\,(T - \lambda_j I)^{k+n+1}$ for some $n \in \mathbb{N}$, then $(T - \lambda_j I)^{k+n+1} x = 0$, so $(T - \lambda_j I)^{k+1} (T - \lambda_j I)^{n} x = 0$ and $(T - \lambda_j I)^{n} x \in \mathrm{null}\,(T - \lambda_j I)^{k+1} = \mathrm{null}\,(T - \lambda_j I)^{k}$; that is, $(T - \lambda_j I)^{k+n} x = 0$, so $x \in \mathrm{null}\,(T - \lambda_j I)^{k+n}$. Thus $\mathrm{null}\,(T - \lambda_j I)^{k+n+1} \subseteq \mathrm{null}\,(T - \lambda_j I)^{k+n}$, and hence $\mathrm{null}\,(T - \lambda_j I)^{k+n+1} = \mathrm{null}\,(T - \lambda_j I)^{k+n}$. So $\mathrm{null}\,(T - \lambda_j I)^{k} = \mathrm{null}\,(T - \lambda_j I)^{k+1}$ implies

$$\mathrm{null}\,(T - \lambda_j I)^1 \subseteq \mathrm{null}\,(T - \lambda_j I)^2 \subseteq \cdots \subseteq \mathrm{null}\,(T - \lambda_j I)^{k} = \mathrm{null}\,(T - \lambda_j I)^{k+1} = \cdots = \mathrm{null}\,(T - \lambda_j I)^{k+n}.$$

Now we want to show that $\mathrm{null}\,(T - \lambda_j I)^{\dim V} = \mathrm{null}\,(T - \lambda_j I)^{\dim V + 1}$. Assume the contrary; by the argument above, if any inclusion in the chain were an equality then all subsequent inclusions would be equalities as well, so every inclusion must be strict:

$$\mathrm{null}\,(T - \lambda_j I)^1 \subsetneq \mathrm{null}\,(T - \lambda_j I)^2 \subsetneq \cdots \subsetneq \mathrm{null}\,(T - \lambda_j I)^{\dim V} \subsetneq \mathrm{null}\,(T - \lambda_j I)^{\dim V + 1}.$$

Since each $\mathrm{null}\,(T - \lambda_j I)^i$ is a subspace of $V$, the strict inclusion $\mathrm{null}\,(T - \lambda_j I)^i \subsetneq \mathrm{null}\,(T - \lambda_j I)^{i+1}$ implies $\dim\big(\mathrm{null}\,(T - \lambda_j I)^i\big) + 1 \le \dim\big(\mathrm{null}\,(T - \lambda_j I)^{i+1}\big)$, since the subspace on the left has strictly smaller dimension than the one on the right. But then $\dim\big(\mathrm{null}\,(T - \lambda_j I)^{\dim V + 1}\big) > \dim V$, a contradiction since $\mathrm{null}\,(T - \lambda_j I)^{\dim V + 1}$ is a subspace of $V$. Therefore

$$\mathrm{null}\,(T - \lambda_j I)^1 \subseteq \cdots \subseteq \mathrm{null}\,(T - \lambda_j I)^{\dim V} = \mathrm{null}\,(T - \lambda_j I)^{\dim V + 1} = \cdots = \mathrm{null}\,(T - \lambda_j I)^{\dim V + n}.$$

This completes the proof of Lemma 2.

We return to verifying that $S\big|_U = 0$. Consider $Su$ for some $u \in U$. Since $u \in U$, we may write $u = u_1 + u_2 + \cdots + u_m$ with $u_i \in U_i$. Since matrix multiplication distributes over addition, $Su = Su_1 + Su_2 + \cdots + Su_m$. Moreover, for all $i, j \le m$, the operators $(T - \lambda_i I)^{\dim V}$ and $(T - \lambda_j I)^{\dim V}$ commute (their product in either order expands into powers of $T$ and products of $T$ with $I$, and clearly $T$ commutes with $T$ while $I$ commutes with any matrix). So

$$Su_i = (T - \lambda_1 I)^{\dim V} (T - \lambda_2 I)^{\dim V} \cdots (T - \lambda_{i-1} I)^{\dim V} (T - \lambda_{i+1} I)^{\dim V} \cdots (T - \lambda_m I)^{\dim V} (T - \lambda_i I)^{\dim V} u_i.$$

Of course, $(T - \lambda_i I)^{\dim V} u_i = 0$, since $u_i \in U_i$ and, by Lemma 2, $U_i = \mathrm{null}\,(T - \lambda_i I)^{\dim V}$. Thus $Su = 0$, and we have proven our claim that $S\big|_U = 0$, which gives $U \subseteq \mathrm{null}(S)$.

Conversely, suppose $u \in \mathrm{null}(S)$. Then $\big(\prod_{j=1}^{m} (T - \lambda_j I)^{\dim V}\big) u = 0$, so for some $i \le m$ we have $(T - \lambda_i I)^{\dim V} u = 0$, hence $u \in U_i \subseteq U$. Therefore $\mathrm{null}(S) \subseteq U$, and so $\mathrm{null}(S) = U$.

Now, we have shown that for all $i, j \le m$, $(T - \lambda_i I)^{\dim V}$ and $(T - \lambda_j I)^{\dim V}$ commute; from this it follows that $S$ and $T$ commute. For a vector $v \in V$, of course $S(Tv) \in \mathrm{Img}(S)$; and since $T$ and $S$ commute, $T(Sv) \in \mathrm{Img}(S)$ as well (i.e. $\mathrm{Img}(S)$ is invariant under $T$). Assume now that $\mathrm{Img}(S) \ne \{0\}$. Since $\mathrm{Img}(S)$ is a nonzero subspace invariant under $T$, there exists $w \in \mathrm{Img}(S)$ that is an eigenvector of $T$. Moreover, $w \in \mathrm{Img}(S)$ means there exists $x \in V$ such that $Sx = w$ is an eigenvector of $T$. By definition $Sx \ne 0$, so $x \notin \mathrm{null}(S)$. But $Sx$ is an eigenvector of $T$, so $Sx$ lies in some $U_j \subseteq U = \mathrm{null}(S)$, and hence $S(Sx) = 0$. Thus $x \in \mathrm{null}(S^2)$ but $x \notin \mathrm{null}(S)$, so $\mathrm{null}(S) \subsetneq \mathrm{null}(S^2)$. This contradicts Lemma 2, since $\mathrm{null}(S) \subsetneq \mathrm{null}(S^2)$ means

$$\dim\Big(\mathrm{null}\,\prod_{j=1}^{m} (T - \lambda_j I)^{\dim V}\Big) < \dim\Big(\mathrm{null}\,\prod_{j=1}^{m} (T - \lambda_j I)^{2\dim V}\Big).$$

Therefore $\mathrm{Img}(S) = \{0\}$. Applying the rank-nullity theorem to $S : V \to V$ gives $\dim V = \dim\big(\mathrm{null}(S)\big) + \dim\big(\mathrm{Img}(S)\big)$. Since $\mathrm{Img}(S) = \{0\}$, we have $\dim\big(\mathrm{Img}(S)\big) = 0$, so $\dim V = \dim\big(\mathrm{null}(S)\big)$. We showed earlier that $U = \mathrm{null}(S)$, so $\dim U = \dim V$; and $U$ being a subspace of $V$ with $\dim U = \dim V$ forces $U = V$. Thus a basis of $U$ is also a basis of $V$. This proves the theorem.
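As a small numerical check on Lemma 2 and on the decomposition $V = U_1 \oplus \cdots \oplus U_m$ established above (this illustration is not part of the paper; the 4x4 matrix and the use of NumPy are arbitrary choices), the sketch below shows the nullities of $(T - \lambda_j I)^k$ stop growing once $k$ reaches $\dim V$, and that the dimensions of the generalized eigenspaces sum to $\dim V$.

```python
import numpy as np

# A 4x4 example with eigenvalue 2 of algebraic multiplicity 3 (but only one
# independent eigenvector) and eigenvalue 5.
T = np.array([[2.0, 1.0, 0.0, 0.0],
              [0.0, 2.0, 1.0, 0.0],
              [0.0, 0.0, 2.0, 0.0],
              [0.0, 0.0, 0.0, 5.0]])
n = T.shape[0]
I = np.eye(n)

def nullity(M):
    """dim null(M), computed as n - rank(M)."""
    return M.shape[0] - np.linalg.matrix_rank(M)

# Lemma 2: dim null((T - 2I)^k) grows and then stabilizes by k = dim V.
print([nullity(np.linalg.matrix_power(T - 2 * I, k)) for k in range(1, n + 2)])
# -> [1, 2, 3, 3, 3]

# The generalized eigenspaces U_j = null((T - lambda_j I)^dimV) fill out V:
# their dimensions sum to dim V, matching U = V in the proof of Theorem 2.
dims = [nullity(np.linalg.matrix_power(T - lam * I, n)) for lam in (2.0, 5.0)]
print(dims, sum(dims) == n)   # -> [3, 1] True
```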
Observation: If $T \in L(V)$ and there exists a basis of $V$ with respect to which $T$ is triangular, this is equivalent to saying that there exists a triangular $T' \in L(V)$ such that $T \sim T'$ ($T$ is similar to $T'$), i.e. there exists a nonsingular matrix $P$ such that $T = P^{-1} T' P$.

Theorem 3: For any $T \in L(V)$ there exists a sequence $A_n \in L(V)$ such that each $A_n$ has distinct eigenvalues and $A_n \to T$.

Proof: Theorem 2 (in light of the observation above) shows that there exists $T'$ such that $T \sim T'$ and $T'$ can be written

$$T' = \begin{pmatrix} \lambda_1 & & *_{T'} \\ & \ddots & \\ 0 & & \lambda_k \end{pmatrix},$$

where the $\lambda_i$ are eigenvalues of $T$ but are not necessarily distinct from one another ($i \ne j$ does not imply $\lambda_i \ne \lambda_j$), and $*_{T'}$ denotes the entries of $T'$ above the diagonal. Now let

$$A_n = \begin{pmatrix} \lambda_{1,n} & & *_{T'} \\ & \ddots & \\ 0 & & \lambda_{k,n} \end{pmatrix},$$

where $\lambda_{i,n} \to \lambda_i$ for each $1 \le i \le k$, and $\lambda_{i,n} = \lambda_{j,n} \iff i = j$ for all $i, j \le k$ (so the eigenvalues of each $A_n$ are distinct). Since $\lambda_{i,n} \to \lambda_i$ for each $i$, $A_n \to T'$ entrywise, and hence $A_n \to T'$. But $T \sim T'$ means there is a nonsingular matrix $P$ such that $T = P^{-1} T' P$. Now $A_n \to T'$ implies $P^{-1} A_n P \to T$, since matrix multiplication is continuous (this is fairly easy to verify: if $v_n \to v$, then $A v_n \to A v$ entrywise, so $A v_n \to A v$; and if a sequence of matrices $X_n \to X$, then the column vectors converge, so $A X_n \to A X$, and similarly $X_n A \to X A$). And since $A_n \sim P^{-1} A_n P$, the eigenvalues of $P^{-1} A_n P$ are equal to those of $A_n$. Thus $P^{-1} A_n P$ is a sequence of matrices with distinct eigenvalues and $P^{-1} A_n P \to T$. This proves the theorem.

Corollary: Theorems 1, 2, and 3 imply that any square matrix is a limit point of a sequence of square matrices with distinct eigenvalues; by definition, then, square matrices with distinct eigenvalues are dense in $L(V)$. And Theorem 1 shows that any square matrix with distinct eigenvalues is diagonalizable, so the diagonalizable matrices are also dense in $L(V)$.
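The corollary can be observed numerically. The sketch below (an illustration under assumed choices, not part of the paper) starts from a defective Jordan block, perturbs its diagonal entries by small pairwise distinct amounts exactly as in the proof of Theorem 3, and checks that the perturbed matrices have distinct eigenvalues while converging to the original matrix.

```python
import numpy as np

# A defective matrix (a 3x3 Jordan block): eigenvalue 2 repeated, only one
# eigenvector, so T itself is not diagonalizable.  It is already upper
# triangular, so it plays the role of T' in the proof of Theorem 3 (P = I).
T = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])

def A(n):
    """Perturb the diagonal entries by small, pairwise distinct amounts."""
    return T + np.diag([1.0 / n, 2.0 / n, 3.0 / n])

for n in (10, 100, 1000):
    An = A(n)
    eigs = np.linalg.eigvals(An)
    distinct = np.unique(np.round(eigs, 9)).size == len(eigs)
    # distance to T shrinks like 1/n while the eigenvalues stay distinct
    print(n, np.linalg.norm(An - T), distinct)
```

Each A(n) has distinct eigenvalues and is therefore diagonalizable by Theorem 1, so the sequence exhibits the non-diagonalizable matrix T as a limit point of diagonalizable matrices.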