Density of diagonalizable square matrices
Student: Daniel Cervone; Mentor: Saravanan Thiyagarajan
University of Chicago VIGRE REU, Summer 2007.
For this entire paper, we will refer to V as a finite-dimensional vector space over ℂ and L(V) as the set of linear operators
{A⏐A : V → V}. Recall the following definition: if A is a linear operator on a vector space V,
and ∃ v ≠ 0 ∈ V and λ ∈ ℂ st Av = λv, then v and λ are an eigenvector and eigenvalue of A, respectively.
Theorem 1: A matrix is called diagonalizable if it is similar to some diagonal matrix. If A ∈ L(V) has
distinct eigenvalues then A is diagonalizable.
Proof : Let w1 … w n (assuming dimV = n) be the eigenvectors that correspond to each eigenvalue.
Let W be the matrix that has w1 … w n for each of its columns. A quick calculation will verify that:
A W = W diag(λ1, …, λn).
Indeed, LHS = (Aw1 | Aw2 | … | Awn) and RHS = (λ1 w1 | λ2 w2 | … | λn wn), and clearly Awi = λi wi.
And we know that W is invertible, since eigenvectors corresponding to distinct eigenvalues are linearly independent, so
w1 … wn are linearly independent. Thus:
A = W diag(λ1, …, λn) W⁻¹, or equivalently W⁻¹ A W = diag(λ1, …, λn),
so A is similar to a diagonal matrix. This proves the theorem.
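As a quick numerical illustration of Theorem 1 (a sketch only; the 2 × 2 matrix below is an arbitrary choice with two distinct eigenvalues, not an example from the paper), one can check with numpy that W⁻¹ A W comes out diagonal:

import numpy as np

# Illustrative 2x2 matrix with distinct eigenvalues 5 and 2.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigvals, W = np.linalg.eig(A)     # columns of W are eigenvectors w1, w2
D = np.linalg.inv(W) @ A @ W      # W^{-1} A W
print(np.round(D, 10))            # approximately diag(5, 2), up to the ordering chosen by eig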
Theorem 2: Suppose T ∈ L(V) with nondistinct eigenvalues. Let λ1 … λm be the distinct eigenvalues of T,
thus m < dim(V). Then ∃ a basis of V with respect to which T has the form:
a block diagonal matrix diag(A1, …, Am) (with 0's outside the blocks), where each Aj is an upper triangular matrix
whose diagonal entries all equal λj and whose entries above the diagonal (denoted *) are arbitrary.
Proof : ∀1 ≤ j ≤ m let Uj be the subspace of generalized eigenvectors of T corresponding to λj:
∀j, Uj = { v ∈ V : (T − λj I)^k v = 0 for some k ∈ ℕ }.
It follows from this that ∀j, Uj = null((T − λj I)^k) for all sufficiently large k (Lemma 2 below shows that k = dimV suffices), and that (T − λj I)⏐Uj is nilpotent.
Note that if A is any linear operator on V, then null(A) is a subspace of V since it contains 0 and clearly
satisfies closure under addition and scalar multiplication (these follow from A being linear).
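For a concrete feel for this definition (an illustrative example, not one from the paper), the matrix T below has the single repeated eigenvalue 2, and the standard basis vector e2 is a generalized eigenvector that is not an ordinary eigenvector:

import numpy as np

T = np.array([[2.0, 1.0],
              [0.0, 2.0]])
M = T - 2*np.eye(2)               # T - lambda*I with lambda = 2
e2 = np.array([0.0, 1.0])
print(M @ e2)                     # [1. 0.]: nonzero, so e2 is not an eigenvector of T
print(M @ (M @ e2))               # [0. 0.]: (T - 2I)^2 e2 = 0, so e2 is a generalized eigenvector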
Before continuing, we need a crucial lemma:
Lemma 1: If N is a nilpotent linear operator on a vector space X, then ∃ a basis of X with respect to which
N is strictly upper triangular, i.e. N has 0's on and below the diagonal and arbitrary entries (denoted *) above it.
Proof of Lemma: First note that N nilpotent on X ⇒ ∃ p ∈ ℕ st N^p = 0 ⇒ X = null(N^p).
Next, choose a basis {b1, …, b_{k1}} of null(N) and extend this to a basis {b1, …, b_{k2}} of null(N^2), where k1 ≤ k2 ≤ dim X.
We can do this because if v ∈ null(N), then Nv = 0, so clearly N(Nv) = 0. Thus null(N) ⊆ null(N^2).
And since b1, …, b_{k1} are linearly independent vectors that span null(N), we can span null(N^2) by
b1, …, b_{k1} together with zero or more linearly independent vectors b_{k1+1}, …, b_{k2} in null(N^2) that are not in the span of b1, …, b_{k1}.
We can keep extending the basis of null(N^2) to a basis of null(N^3) and eventually of null(N^p). In doing so,
we establish a basis B = {b1, …, bn} of X, where n = dim X, since B is a basis of null(N^p) = X.
Now let us consider N with respect to this basis. Writing N with respect to B gives
the matrix [Nb1⏐Nb2⏐Nb3 …⏐Nbn], where each column is the (n × 1) coordinate vector of Nbi with respect to B.
Since b1 ∈ null(N), the first column is entirely 0. This is in fact true for each column through column k1.
The next column is Nb_{k1+1}, where Nb_{k1+1} ∈ null(N) since N^2 b_{k1+1} = 0 (recall that b_{k1+1} ∈ null(N^2)).
Nb_{k1+1} ∈ null(N) ⇒ Nb_{k1+1} is a linear combination of b1 … b_{k1} ⇒ all nonzero entries in column k1 + 1
lie above the diagonal. This is in fact true for every column from k1 + 1 through k2, since b1 … b_{k2} span null(N^2).
Similarly, we can take the next column, Nb_{k2+1}, which is in null(N^2) since b_{k2+1} is a basis vector of
null(N^3). Thus Nb_{k2+1} depends only on b1 … b_{k2}, and any nonzero entries in column k2 + 1 lie above
the diagonal. We can continue this process through column n, thus confirming that N with respect to
the basis B has 0's on and below the diagonal, i.e. is strictly upper triangular. This proves the lemma.
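The construction in the lemma can be checked numerically on one concrete nilpotent operator (an illustrative choice, not from the paper): the N below sends e1 → e2 → e3 → 0, so null(N) = span{e3}, null(N^2) = span{e2, e3}, and null(N^3) = R^3, and ordering the basis as in the proof makes N strictly upper triangular:

import numpy as np

N = np.array([[0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
# Basis built as in the proof: b1 spans null(N), b2 extends it to null(N^2), b3 to null(N^3).
B = np.column_stack([[0.0, 0.0, 1.0],
                     [0.0, 1.0, 0.0],
                     [1.0, 0.0, 0.0]])
print(np.linalg.inv(B) @ N @ B)   # zeros on and below the diagonal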
We now continue the proof of the theorem. Recall that ∀1 ≤ j ≤ m (T − λ j I)⏐Uj is nilpotent. Thus,
by the lemma we just proved, ∀j ∃ a basis B j of U j st with respect to B j :
(T − λj I)⏐Uj is strictly upper triangular (0's on and below the diagonal, arbitrary entries * above), and therefore
T⏐Uj is upper triangular with every diagonal entry equal to λj and arbitrary entries * above the diagonal.
Moreover, if B is the basis of U obtained by combining B1, …, Bm, where U = U1 ⊕ U2 ⊕ … ⊕ Um, then T⏐U with respect to B has
the block diagonal form diag(T1, …, Tm) (with 0's outside the blocks), where each Tj is an upper triangular matrix
with every diagonal entry equal to λj and arbitrary entries * above the diagonal.
Note that this is the desired form corresponding to our theorem. However, we still need to show
that this form is attained by T with respect to a basis of all of V. It suffices to show that V = U;
then clearly a basis of U is a basis of V.
To do this, consider the linear operator S ∈ L(V) where S = ∏_{j=1}^m (T − λj I)^dimV. Our claim is that S⏐U = 0.
To verify this, consider that null(T − λj I)^1 ⊆ null(T − λj I)^2 ⊆ … ⊆ null(T − λj I)^n for any n. We want to strengthen this
statement into the following lemma:
Lemma 2: ∀ n ∈ ℕ, null(T − λj I)^1 ⊆ null(T − λj I)^2 ⊆ … ⊆ null(T − λj I)^dimV = null(T − λj I)^(dimV+1) = … = null(T − λj I)^(dimV+n).
Proof : Suppose ∃ k st null(T − λj I)^k = null(T − λj I)^(k+1). If x ∈ null(T − λj I)^(k+n+1) for some n ∈ ℕ,
then (T − λj I)^(k+n+1) x = 0
⇒ (T − λj I)^(k+1) (T − λj I)^n x = 0 ⇒ (T − λj I)^n x ∈ null(T − λj I)^(k+1) = null(T − λj I)^k ⇒ (T − λj I)^(k+n) x = 0,
i.e. x ∈ null(T − λj I)^(k+n). Thus null(T − λj I)^(k+n+1) ⊆ null(T − λj I)^(k+n), and since the reverse inclusion always
holds, null(T − λj I)^(k+n+1) = null(T − λj I)^(k+n).
So null(T − λj I)^k = null(T − λj I)^(k+1) ⇒
null(T − λj I)^1 ⊆ null(T − λj I)^2 ⊆ … ⊆ null(T − λj I)^k = null(T − λj I)^(k+1) = … = null(T − λj I)^(k+n).
Now we want to show that null(T − λj I)^dimV = null(T − λj I)^(dimV+1). To prove this, assume the contrary. By what we just
showed, no earlier inclusion can be an equality either, so the whole chain is strict:
null(T − λj I)^1 ⊊ null(T − λj I)^2 ⊊ … ⊊ null(T − λj I)^dimV ⊊ null(T − λj I)^(dimV+1). Since each null(T − λj I)^i is a
subspace of V, null(T − λj I)^i ⊊ null(T − λj I)^(i+1) ⇒ dim(null(T − λj I)^i) + 1 ≤ dim(null(T − λj I)^(i+1)).
Also dim(null(T − λj I)) ≥ 1, since if T − λj I were injective then so would be all of its powers, and the chain could not be strict.
But then dim(null(T − λj I)^(dimV+1)) ≥ dim(null(T − λj I)) + dimV > dimV,
which is a contradiction since null(T − λj I)^(dimV+1) is a subspace of V. Therefore the following is true:
null(T − λj I)^1 ⊆ null(T − λj I)^2 ⊆ … ⊆ null(T − λj I)^dimV = null(T − λj I)^(dimV+1) = … = null(T − λj I)^(dimV+n). This
completes the proof of Lemma 2.
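Lemma 2 can also be seen numerically on a small example (again an arbitrary illustrative matrix): for the 3 × 3 matrix T below with the single eigenvalue 2, dim null((T − 2I)^k) grows with k and stabilizes once k reaches dimV = 3:

import numpy as np

T = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])
M = T - 2*np.eye(3)
for k in range(1, 6):
    dim_null = 3 - np.linalg.matrix_rank(np.linalg.matrix_power(M, k))
    print(k, dim_null)            # prints dimensions 1, 2, 3, 3, 3: the chain stabilizes at k = dimV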
We again return to verifying that S⏐U = 0. Now consider Su for some u ∈ U.
u ∈ U ⇒ u = u1 + u2 + … + um for ui ∈ Ui. Since matrix multiplication distributes over addition, Su = Su1 + Su2 + … + Sum.
Moreover, we know that ∀ i,j ≤ m, (T − λi I)^dimV and (T − λj I)^dimV commute (expanding either product gives the same sum of
powers of T, since T commutes with itself and I commutes with any matrix).
So Sui = (T − λ1 I)^dimV ⋯ (T − λ_{i−1} I)^dimV (T − λ_{i+1} I)^dimV ⋯ (T − λm I)^dimV (T − λi I)^dimV ui. Of course, (T − λi I)^dimV ui = 0,
since ∀1 ≤ i ≤ m, ui ∈ Ui = null((T − λi I)^dimV). Thus Su = 0 and we have proven our claim that S⏐U = 0, which gives U ⊆ null(S).
Conversely, suppose u ∈ null(S), so that (∏_{j=1}^m (T − λj I)^dimV) u = 0. The polynomials pj(x) = (x − λj)^dimV share no
root in pairs, so the polynomials qi(x) = ∏_{j≠i} pj(x) have no common root at all, and therefore ∃ polynomials r1, …, rm
st r1 q1 + … + rm qm = 1. Then u = r1(T) q1(T) u + … + rm(T) qm(T) u, and for each i,
(T − λi I)^dimV ri(T) qi(T) u = ri(T) pi(T) qi(T) u = ri(T) S u = 0 (all of these factors are polynomials in T and commute),
so ri(T) qi(T) u ∈ Ui.
⇒ u ∈ U ⇒ null(S) ⊆ U ⇒ null(S) = U.
Now, S is a product of powers of the operators (T − λj I), i.e. a polynomial in T, so S and T commute. For a vector v ∈ V,
of course S(Tv) ∈ Img(S). Since T and S commute, T(Sv) = S(Tv) ∈ Img(S) for every v ∈ V, i.e. Img(S) is invariant under T.
Let us assume that Img(S) ≠ 0. Then Img(S) is a nonzero T-invariant subspace of a finite-dimensional complex vector space,
so ∃ w ∈ Img(S) st w is an eigenvector of T. Moreover, w ∈ Img(S) ⇒ ∃ x ∈ V st Sx = w is an eigenvector of T.
By definition, Sx ≠ 0, thus x ∉ null(S). But Sx is an eigenvector of T, say with eigenvalue λi, so (T − λi I)^dimV (Sx) = 0,
and since the factors of S commute, S(Sx) = 0. Thus x ∈ null(S^2) but x ∉ null(S), so null(S) ⊊ null(S^2).
This contradicts Lemma 2: the statement and proof of Lemma 2 hold with (T − λj I) replaced by any linear operator on V,
and since the factors of S commute, S = (∏_{j=1}^m (T − λj I))^dimV and S^2 = (∏_{j=1}^m (T − λj I))^2dimV. Applying the lemma
to the operator ∏_{j=1}^m (T − λj I) therefore gives null(S) = null(S^2), a contradiction.
Therefore Img(S)=0. If we apply the rank-nullity theorem to S:V → V, we get:
dimV = dim(null(S)) + dim(Img(S)).
Img(S)=0 ⇒ dim(Img(S))=0, so dimV = dim(null(S)). We showed earlier that U = null(S), so dimU=dimV.
And U being a subspace of V and dimU=dimV ⇒ U=V.
Thus, a basis of U is also a basis of V. This proves the theorem.
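A small instance of Theorem 2 (the matrix below is an illustrative choice, not taken from the paper): T has only the repeated eigenvalue 2 and is not diagonalizable, but the basis {b1, b2} of generalized eigenvectors, built by extending a basis of null(T − 2I) to one of null((T − 2I)^2) as in Lemma 1, puts T in upper triangular form with 2's on the diagonal:

import numpy as np

T = np.array([[3.0, 1.0],
              [-1.0, 1.0]])        # (T - 2I)^2 = 0, so U1 = null((T - 2I)^2) = R^2
b1 = np.array([1.0, -1.0])         # spans null(T - 2I): an ordinary eigenvector
b2 = np.array([1.0, 0.0])          # extends {b1} to a basis of null((T - 2I)^2)
B = np.column_stack([b1, b2])
print(np.linalg.inv(B) @ T @ B)    # approximately [[2, 1], [0, 2]]: upper triangular with lambda = 2 on the diagonal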
Observation: If T ∈ L(V) and ∃ a basis of V with respect to which T is triangular, this is equivalent to saying
that ∃ T′ ∈ L(V) st T′ is triangular and T ∼ T′ (T is similar to T′), i.e. ∃ nonsingular matrix P st T = P⁻¹T′P.
Theorem 3: For any T ∈ L(V), ∃ a sequence of operators An ∈ L(V) st each An has distinct eigenvalues and An → T.
Proof : Theorem 2 (in light of the observation above) shows that ∃ T′ st T ∼ T′ and T′
can be written as an upper triangular matrix with diagonal entries λ1, …, λk and arbitrary entries * above the diagonal,
where the λi are eigenvalues of T but not necessarily distinct from one another (i ≠ j does not imply λi ≠ λj).
Now let An be the matrix obtained from T′ by replacing each diagonal entry λi with λ_{i,n}, keeping the entries above the
diagonal unchanged, where ∀1 ≤ i ≤ k, λ_{i,n} → λi as n → ∞, and ∀ i,j ≤ k, λ_{i,n} = λ_{j,n} ⇔ i = j
(the eigenvalues of each An are distinct).
The fact that ∀1 ≤ i ≤ k, λ_{i,n} → λi means An → T′ entrywise, i.e. An → T′.
But T ∼ T′ ⇒ ∃ nonsingular matrix P st T = P⁻¹T′P. Now An → T′ ⇒ P⁻¹AnP → P⁻¹T′P = T, since matrix multiplication is
continuous (this is fairly easy to verify: if vn → v entrywise then Avn → Av entrywise, and if a sequence of matrices
Xn → X then the columns of Xn converge to the columns of X, so AXn → AX and likewise XnA → XA). And since An ∼ P⁻¹AnP, the
eigenvalues of P⁻¹AnP are equal to those of An. Thus P⁻¹AnP is a sequence of matrices with distinct eigenvalues and
P⁻¹AnP → T. This proves the theorem.
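Theorem 3 (and the corollary below) can be illustrated numerically; the matrix and the perturbation here are illustrative choices that echo the proof rather than reproduce it. T is not diagonalizable, but perturbing its diagonal entries by distinct small amounts gives matrices An with distinct eigenvalues converging to T:

import numpy as np

T = np.array([[2.0, 1.0],
              [0.0, 2.0]])                 # repeated eigenvalue 2; not diagonalizable
for n in [10, 100, 1000]:
    A_n = T + np.diag([1.0/n, 2.0/n])      # perturb the diagonal entries by distinct amounts
    print(n, np.round(np.linalg.eigvals(A_n), 6), np.linalg.norm(A_n - T))
# the eigenvalues 2 + 1/n and 2 + 2/n stay distinct while ||A_n - T|| -> 0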
Corollary: Theorems 1, 2, and 3 imply that any square matrix is the limit of a sequence of square matrices
with distinct eigenvalues. By definition, then, square matrices with distinct eigenvalues are dense in L(V). And
Theorem 1 shows that any square matrix with distinct eigenvalues is diagonalizable, thus the diagonalizable matrices
are also dense in L(V).
REFERENCES:
Axler, Sheldon. Linear Algebra Done Right. 1997.
http://planetmath.org