MATH10212 • Linear Algebra • Brief lecture notes
Similarity and Diagonalization
Similar Matrices
Let A and B be n × n matrices. We say that A is similar to B if there is an
invertible n × n matrix P such that P^{-1}AP = B. If A is similar to B, we
write A ∼ B.
Remarks
• If A ∼ B, we can write, equivalently, that A = PBP^{-1} or AP = PB.
• The matrix P depends on A and B. It is not unique for a given pair
of similar matrices A and B. To see this, simply take A = B = I, in
which case I ∼ I, since P^{-1}IP = I for any invertible matrix P.
Theorem 4.21. Let A, B and C be n × n matrices.
a. A ∼ A.
b. If A ∼ B, then B ∼ A.
c. If A ∼ B and B ∼ C, then A ∼ C.
This means that ∼ is an equivalence relation. The main problem is to
find a “good” representative in each equivalence class.
The “real” meaning of P^{-1}AP is that it is the matrix of the same linear
transformation (the one given in the standard basis by the matrix A) in a different
basis, namely the basis consisting of the columns of P. This explains much more clearly
why so many properties are shared by A and P^{-1}AP.
Theorem 4.22. Let A and B be n × n matrices with A ∼ B. Then
a. det A = det B.
b. A is invertible if and only if B is invertible.
c. A and B have the same rank.
d. A and B have the same characteristic polynomial.
e. A and B have the same eigenvalues.
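As a quick numerical illustration of Theorem 4.22 (an added sketch, not part of the original notes; it assumes numpy is available), one can form B = P^{-1}AP for a random invertible P and check that the determinant, rank, characteristic polynomial and eigenvalues of A and B agree:

    import numpy as np

    np.random.seed(0)
    A = np.random.rand(4, 4)
    P = np.random.rand(4, 4)          # a random matrix is almost surely invertible
    B = np.linalg.inv(P) @ A @ P      # B is similar to A

    assert np.isclose(np.linalg.det(A), np.linalg.det(B))           # same determinant
    assert np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B)     # same rank
    assert np.allclose(np.poly(A), np.poly(B))                      # same characteristic polynomial
    assert np.allclose(np.sort_complex(np.linalg.eigvals(A)),
                       np.sort_complex(np.linalg.eigvals(B)))       # same eigenvalues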
Diagonalization
Definition. An n × n matrix A is diagonalizable if there is a diagonal
matrix D such that A is similar to D — that is, if there is an invertible
matrix P such that P^{-1}AP = D.
Note that the eigenvalues of D are its diagonal elements, and these are
the same eigenvalues as for A.
Theorem 4.23. Let A be an n × n matrix. Then A is diagonalizable if and
only if A has n linearly independent eigenvectors.
More precisely, there exists an invertible matrix P and a diagonal matrix D such
that P^{-1}AP = D if and only if the columns of P are n linearly independent
eigenvectors of A and the diagonal entries of D are the eigenvalues of A
corresponding to the eigenvectors in P, in the same order.
Theorem 4.25. If A is an n × n matrix with n distinct eigenvalues, then
A is diagonalizable.
Indeed, eigenvectors corresponding to distinct eigenvalues are linearly independent, by Theorem 4.20.
Theorem 4.24. Let A be an n × n matrix and let
λ1 , λ2 , . . . , λk
be distinct eigenvalues of A. If Bi is a basis for the eigenspace Eλi , then
B = B1 ∪ B2 ∪ · · · ∪ Bk
(i.e., the total collection of basis vectors for all of the eigenspaces) is linearly
independent.
Lemma 4.26. If A is an n × n matrix, then the geometric multiplicity of
each eigenvalue is less than or equal to its algebraic multiplicity.
Theorem 4.27. The Diagonalization Theorem Let A be an n×n matrix
whose distinct eigenvalues are λ1 , λ2 , . . . , λk . The following statements are
equivalent:
a. A is diagonalizable.
b. The union B of the bases of the eigenspaces of A (as in Theorem 4.24)
contains n vectors (which is equivalent to Σ_{i=1}^{k} dim E_{λi} = n).
c. The algebraic multiplicity of each eigenvalue equals its geometric
multiplicity, and all eigenvalues are real numbers — this condition is
missing in the textbook!
In these theorems the eigenvalues are supposed to be real numbers,
although for real matrices there may be some complex roots of the characteristic polynomial (in fact, these theorems remain valid for vector spaces
and matrices over C — then, of course, one does not need the condition that
the eigenvalues be all real).
Theorem 4.27 and Theorem 4.23 together give a method to decide whether A
is diagonalizable and, if so, to find a P such that P^{-1}AP is diagonal: the
columns of P are the vectors of the bases of the eigenspaces.
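The following minimal sketch (added here for illustration; it assumes numpy is available and works in floating point, so it is a numerical check rather than a proof) implements this method: numpy.linalg.eig returns the eigenvalues together with a matrix whose columns are corresponding eigenvectors, and A is diagonalizable exactly when n of these columns are linearly independent.

    import numpy as np

    def try_diagonalize(A, tol=1e-10):
        """Return (P, D) with P^{-1} A P = D, or None if A is not diagonalizable."""
        n = A.shape[0]
        eigenvalues, P = np.linalg.eig(A)      # columns of P are eigenvectors of A
        if np.linalg.matrix_rank(P, tol) < n:  # fewer than n independent eigenvectors
            return None
        return P, np.diag(eigenvalues)

For the matrix A of the example below, try_diagonalize returns a pair (P, D) with D ≈ diag(5, −1, −1) up to the ordering numpy chooses, and with columns of P spanning the eigenspaces (possibly scaled differently from the hand computation).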


Example. For
    A = [ 1 2 2 ]
        [ 2 1 2 ]
        [ 2 2 1 ]
the characteristic polynomial is
    det(A − λI) = | 1−λ   2    2  |
                  |  2   1−λ   2  |
                  |  2    2   1−λ |
                = (1 − λ)^3 + 8 + 8 − 4(1 − λ) − 4(1 − λ) − 4(1 − λ) = · · · = −(λ − 5)(λ + 1)^2.
Thus, the eigenvalues are 5 and −1.
   

Eigenspace E_{−1}: (A − (−1)I)~x = ~0, i.e.
    [ 2 2 2 ] [x1]   [0]
    [ 2 2 2 ] [x2] = [0] ;
    [ 2 2 2 ] [x3]   [0]
x1 = −x2 − x3, where x2, x3 are free variables;
    E_{−1} = { (−s − t, s, t)^T | s, t ∈ R };
a basis of E_{−1}: (−1, 1, 0)^T, (−1, 0, 1)^T.
 

 
Eigenspace E_5: (A − 5I)~x = ~0, i.e.
    [ −4  2  2 ] [x1]   [0]
    [  2 −4  2 ] [x2] = [0] ;
    [  2  2 −4 ] [x3]   [0]
solving this system gives x1 = x2 = x3, where x3 is a free variable;
    E_5 = { (t, t, t)^T | t ∈ R };
a basis of E_5: (1, 1, 1)^T.
Together the dimensions add up to 3, so B_5 ∪ B_{−1} is a basis of R^3, so A is
diagonalizable.




Let
    P = [ 1 −1 −1 ]
        [ 1  1  0 ] ;
        [ 1  0  1 ]
then
    P^{-1}AP = [ 5  0  0 ]
               [ 0 −1  0 ]
               [ 0  0 −1 ] .
(Note that if we arrange the eigenvectors in a different order, then the
eigenvalues on the diagonal must be arranged accordingly: let
    Q = [ −1 −1 1 ]
        [  1  0 1 ] ;
        [  0  1 1 ]
then
    Q^{-1}AQ = [ −1  0 0 ]
               [  0 −1 0 ]
               [  0  0 5 ] .)


Example. For
    A = [ 3 20 29 ]
        [ 0  1 82 ]
        [ 0  0  7 ]
the eigenvalues are 3, 1, and 7. Since they are distinct, the matrix is
diagonalizable.
(To find a P such that
    P^{-1}AP = [ 3 0 0 ]
               [ 0 1 0 ]
               [ 0 0 7 ] ,
one still needs to solve the linear systems (A − λI)~x = ~0.)


Example. For
    A = [ 3 1 0 ]
        [ 0 3 1 ]
        [ 0 0 3 ]
the only eigenvalue is 3, of algebraic multiplicity 3.
Eigenspace E_3: (A − 3I)~x = ~0, i.e.
    [ 0 1 0 ]
    [ 0 0 1 ] ~x = ~0;
    [ 0 0 0 ]
this matrix has rank 2, so dim E_3 = 1. So A is not diagonalizable.
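Numerically (an added sketch, assuming numpy is available), the geometric multiplicity can be read off as dim E_3 = 3 − rank(A − 3I):

    import numpy as np

    A = np.array([[3, 1, 0], [0, 3, 1], [0, 0, 3]])
    geom_mult = 3 - np.linalg.matrix_rank(A - 3 * np.eye(3))   # dimension of E_3
    print(geom_mult)   # prints 1, which is less than the algebraic multiplicity 3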
Example. Use diagonalization to find A^100 for
    A = [ 1 2 ]
        [ 2 1 ] .
The eigenvalues are −1 and 3.
Eigenspace E_3: (A − 3I)~x = ~0, i.e.
    [ −2  2 ] ~x = ~0;
    [  2 −2 ]
x1 = x2; a basis: { (1, 1)^T }.
Eigenspace E_{−1}: (A − (−1)I)~x = ~0, i.e.
    [ 2 2 ] ~x = ~0;
    [ 2 2 ]
x1 = −x2; a basis: { (−1, 1)^T }.
Let
    P = [ −1 1 ] ;
        [  1 1 ]
then
    P^{-1}AP = D = [ −1 0 ]
                   [  0 3 ] .
Now A = PDP^{-1}, so
    A^100 = (PDP^{-1})^100 = PDP^{-1} · PDP^{-1} · · · PDP^{-1} = P D^100 P^{-1}
          = [ −1 1 ] [ −1 0 ]^100 [ −1/2 1/2 ]
            [  1 1 ] [  0 3 ]     [  1/2 1/2 ]
          = [ −1 1 ] [ 1    0   ] [ −1/2 1/2 ]
            [  1 1 ] [ 0 3^100 ] [  1/2 1/2 ]
          = [ −1 3^100 ] [ −1/2 1/2 ]
            [  1 3^100 ] [  1/2 1/2 ]
          = (1/2) [ 3^100 + 1   3^100 − 1 ]
                  [ 3^100 − 1   3^100 + 1 ] .
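In floating point (an added sketch, assuming numpy is available), the same computation can be checked against repeated multiplication and against the closed form above:

    import numpy as np

    A = np.array([[1.0, 2.0], [2.0, 1.0]])
    P = np.array([[-1.0, 1.0], [1.0, 1.0]])
    D = np.diag([-1.0, 3.0])

    A100 = P @ np.diag(np.diag(D) ** 100) @ np.linalg.inv(P)   # P D^100 P^{-1}
    assert np.allclose(A100, np.linalg.matrix_power(A, 100))   # repeated multiplication

    closed_form = np.array([[3**100 + 1, 3**100 - 1],
                            [3**100 - 1, 3**100 + 1]], dtype=float) / 2
    assert np.allclose(A100, closed_form)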
Orthogonality in R^n
We introduce the dot product of vectors in R^n by setting
    ~u · ~v = ~u^T ~v;
that is, if
    ~u = (u1, . . . , un)^T  and  ~v = (v1, . . . , vn)^T,
then
    ~u · ~v = ~u^T ~v = [ u1 · · · un ] (v1, . . . , vn)^T = u1 v1 + u2 v2 + · · · + un vn.
The dot product is frequently called the scalar product or inner product;
we shall use the latter term in a slightly more general context. Notice the
following properties of the dot product, which can be easily checked directly
or immediately follow from the properties of matrix multiplication. They
hold for arbitrary vectors ~u, ~v, ~w ∈ R^n and an arbitrary scalar λ.
• ~u · ~v = ~v · ~u (commutativity).
• ~u · (~v + ~w) = ~u · ~v + ~u · ~w
• ~u · (λ~v) = λ(~u · ~v) (The last two properties are referred to as the linearity
of the dot product.)
• ~u · ~u = u_1^2 + · · · + u_n^2 and therefore ~u · ~u ≥ 0. Moreover, if ~u · ~u = 0, then
~u = ~0.
We define the length (or norm) ||~v|| of a vector ~v = (v1, . . . , vn)^T by
    ||~v|| = √(~v · ~v) = √(v_1^2 + v_2^2 + · · · + v_n^2).
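In numerical work (an added sketch, assuming numpy is available; the vectors are arbitrary illustrative choices), the dot product and the norm are computed exactly as defined above:

    import numpy as np

    u = np.array([1, 2, 3])
    v = np.array([4, 5, 6])

    dot = u @ v                # u^T v = u1*v1 + u2*v2 + u3*v3 = 32
    norm_v = np.sqrt(v @ v)    # ||v|| = sqrt(v . v) = sqrt(77)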
Orthogonal and Orthonormal Sets of Vectors
A set of vectors
~v1 , ~v2 , . . . , ~vk
in R^n is called an orthogonal set if all pairs of distinct vectors in the set
are orthogonal – that is, if
    ~vi · ~vj = 0 whenever i ≠ j, for i, j = 1, 2, . . . , k.
The standard basis
    ~e1, ~e2, . . . , ~en
in R^n is an orthogonal set, as is any subset of it. As the first example
illustrates, there are many other possibilities.
Example 5.1. Show that {~v1, ~v2, ~v3} is an orthogonal set in R^3 if
    ~v1 = (2, 1, −1)^T,  ~v2 = (0, 1, 1)^T,  ~v3 = (1, −1, 1)^T.
Solution. We must show that every pair of vectors from this set is orthogonal.
This is true, since
~v1 · ~v2 = 2(0) + 1(1) + (−1)(1) = 0
~v2 · ~v3 = 0(1) + 1(−1) + (1)(1) = 0
~v1 · ~v3 = 2(1) + 1(−1) + (−1)(1) = 0
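The same check can be done mechanically (an added sketch, assuming numpy is available):

    import numpy as np

    v1 = np.array([2, 1, -1])
    v2 = np.array([0, 1, 1])
    v3 = np.array([1, -1, 1])

    # every pair of distinct vectors has dot product zero
    assert v1 @ v2 == 0 and v2 @ v3 == 0 and v1 @ v3 == 0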
Theorem 5.1.
If
~v1 , ~v2 , . . . , ~vk
is an orthogonal set of nonzero vectors in R^n, then these vectors are linearly
independent.
Proof If c1 , c2 , . . . , ck are scalars such that
c1~v1 + c2~v2 + · · · + ck~vk = ~0,
then
(c1~v1 + c2~v2 + · · · + ck~vk ) · ~vi = ~0 · ~vi = 0
or, equivalently,
    c1(~v1 · ~vi) + · · · + ci(~vi · ~vi) + · · · + ck(~vk · ~vi) = 0        (1)
Since
~v1 , ~v2 , . . . , ~vk
is an orthogonal set, all of the dot products in equation (1) are zero, except
~vi · ~vi . Thus, equation (1) reduces to
ci (~vi · ~vi ) = 0
Now, ~vi · ~vi ≠ 0 because ~vi ≠ ~0 by hypothesis. So we must have ci = 0. The
fact that this is true for all i = 1, . . . , k implies that
~v1 , ~v2 , . . . , ~vk
is a linearly independent set.
¤
Remark. Thanks to Theorem 5.1, we know that if a set of vectors
is orthogonal, it is automatically linearly independent. For example, we
can immediately deduce that the three vectors in Example 5.1 are linearly
independent. Contrast this approach with the work needed to establish
their linear independence directly!
An orthogonal basis for a subspace W of R^n is a basis of W that is an
orthogonal set.
Example 5.2. The vectors
    ~v1 = (2, 1, −1)^T,  ~v2 = (0, 1, 1)^T,  ~v3 = (1, −1, 1)^T
from Example 5.1 are orthogonal and, hence, linearly independent. Since
any three linearly independent vectors in R^3 form a basis in R^3, by the
Fundamental Theorem of Invertible Matrices, it follows that {~v1, ~v2, ~v3} is an
orthogonal basis for R^3.
Theorem 5.2. Let
    {~v1, ~v2, . . . , ~vk}
be an orthogonal basis for a subspace W of R^n and let ~w be any vector in
W. Then the unique scalars c1, c2, . . . , ck such that
    ~w = c1~v1 + c2~v2 + · · · + ck~vk
are given by
    ci = (~w · ~vi) / (~vi · ~vi)   for i = 1, . . . , k.
Proof. Since
    ~v1, ~v2, . . . , ~vk
is a basis for W, we know that there are unique scalars c1, c2, . . . , ck such that
    ~w = c1~v1 + c2~v2 + · · · + ck~vk
(from Theorem 3.29). To establish the formula for ci, we take the dot product
of this linear combination with ~vi to obtain
    ~w · ~vi = (c1~v1 + c2~v2 + · · · + ck~vk) · ~vi
            = c1(~v1 · ~vi) + · · · + ci(~vi · ~vi) + · · · + ck(~vk · ~vi)
            = ci(~vi · ~vi),
since ~vj · ~vi = 0 for j ≠ i. Since ~vi ≠ ~0, ~vi · ~vi ≠ 0. Dividing by ~vi · ~vi, we obtain
the desired result.
¤
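Theorem 5.2 in computational form (an added sketch, assuming numpy is available; the vector ~w below is an arbitrary illustrative choice), using the orthogonal basis of R^3 from Example 5.1:

    import numpy as np

    basis = [np.array([2, 1, -1]), np.array([0, 1, 1]), np.array([1, -1, 1])]
    w = np.array([3.0, 2.0, 1.0])                  # any vector of R^3

    c = [(w @ v) / (v @ v) for v in basis]         # c_i = (w . v_i) / (v_i . v_i)
    assert np.allclose(sum(ci * vi for ci, vi in zip(c, basis)), w)   # w = c1 v1 + c2 v2 + c3 v3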
A unit vector is a vector of unit length. Notice that if ~v ≠ ~0, then
    ~u = ~v / ||~v||
is a unit vector collinear with ~v (directed along the same line):
    ~v = ||~v|| ~u.
A set of vectors in R^n is an orthonormal set if it is an orthogonal set of
unit vectors. An orthonormal basis for a subspace W of R^n is a basis of
W that is an orthonormal set.
Theorem 5.3. Let
    {~q1, ~q2, . . . , ~qk}
be an orthonormal basis for a subspace W of R^n and let ~w be any vector in
W. Then
    ~w = (~w · ~q1)~q1 + (~w · ~q2)~q2 + · · · + (~w · ~qk)~qk,
and this representation is unique.
Theorem 5.4. The columns of an m × n matrix Q form an orthonormal set
if and only if Q^T Q = I_n.
Proof. We need to show that
    (Q^T Q)_{ij} = 0 if i ≠ j,  and  (Q^T Q)_{ij} = 1 if i = j.
Let ~qi denote the ith column of Q (and, hence, the ith row of Q^T). Since
the (i, j) entry of Q^T Q is the dot product of the ith row of Q^T and the jth
column of Q, it follows that
    (Q^T Q)_{ij} = ~qi · ~qj        (2)
by the definition of matrix multiplication.
Now the columns of Q form an orthonormal set if and only if
    ~qi · ~qj = 0 if i ≠ j,  and  ~qi · ~qj = 1 if i = j,
which, by equation (2), holds if and only if
    (Q^T Q)_{ij} = 0 if i ≠ j,  and  (Q^T Q)_{ij} = 1 if i = j.
This completes the proof.
¤
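A small check of Theorem 5.4 with a non-square Q (an added sketch, assuming numpy is available): normalizing two of the orthogonal vectors from Example 5.1 gives a 3 × 2 matrix with orthonormal columns, so Q^T Q = I_2, while QQ^T need not equal I_3.

    import numpy as np

    v1 = np.array([2.0, 1.0, -1.0])
    v2 = np.array([0.0, 1.0, 1.0])
    Q = np.column_stack([v1 / np.linalg.norm(v1), v2 / np.linalg.norm(v2)])   # 3 x 2

    assert np.allclose(Q.T @ Q, np.eye(2))   # orthonormal columns
    print(np.allclose(Q @ Q.T, np.eye(3)))   # False: Q is not square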
If the matrix Q in Theorem 5.4 is a square matrix, it has a special
name.
An n × n matrix Q whose columns form an orthonormal set is called an
orthogonal matrix.
The most important fact about orthogonal matrices is given by the next
theorem.
Theorem 5.5.
A square matrix Q is orthogonal if and only if Q^{-1} = Q^T.
Proof. By Theorem 5.4, Q is orthogonal if and only if Q^T Q = I. This is true
if and only if Q is invertible and Q^{-1} = Q^T, by Theorem 3.13.
¤
Example
Each of the following matrices is orthogonal:
    [ 1 0 ] ,   [ 1  0 ] ,   [ 1/√2  1/√2 ] ,   [ cos α   sin α ]
    [ 0 1 ]     [ 0 −1 ]     [ 1/√2 −1/√2 ]     [ sin α  −cos α ] .
Theorem 5.6. Let Q be an n × n matrix. The following statements are
equivalent:
a. Q is orthogonal.
b. ||Q~x|| = ||~x|| for every ~x in R^n.
c. Q~x · Q~y = ~x · ~y for every ~x and ~y in R^n.
Theorem 5.7. If Q is an orthogonal matrix, then its rows form an
orthonormal set.
Theorem 5.8.
Let Q be an orthogonal matrix.
a. Q^{-1} is orthogonal.
b. det Q = ±1.
c. If λ is an eigenvalue of Q, then |λ| = 1.
d. If Q1 and Q2 are orthogonal n × n matrices, then so is Q1 Q2 .
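A numerical illustration of Theorems 5.5 and 5.8 (an added sketch, assuming numpy is available), using the reflection matrix [cos α, sin α; sin α, −cos α] from the example above together with a rotation matrix:

    import numpy as np

    a = 0.7                                            # any angle
    Q = np.array([[np.cos(a),  np.sin(a)],
                  [np.sin(a), -np.cos(a)]])            # reflection: orthogonal
    R = np.array([[np.cos(a), -np.sin(a)],
                  [np.sin(a),  np.cos(a)]])            # rotation: also orthogonal

    assert np.allclose(np.linalg.inv(Q), Q.T)             # Theorem 5.5: Q^{-1} = Q^T
    assert np.isclose(abs(np.linalg.det(Q)), 1)           # det Q = +1 or -1
    assert np.allclose(np.abs(np.linalg.eigvals(Q)), 1)   # every eigenvalue satisfies |lambda| = 1
    assert np.allclose((Q @ R).T @ (Q @ R), np.eye(2))    # the product Q R is again orthogonal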