CLASS NOTES ON LINEAR ALGEBRA
STEVEN DALE CUTKOSKY
1. Matrices
Suppose that F is a field. F n will denote the set of n×1 column vectors with coefficients
in F , and Fm will denote the set of 1 × m row vectors with coefficients in F . Mm,n (F )
will denote the set of m × n matrices A = (aij ) with coefficients in F . We will sometimes
denote the zero vector of F n by 0n , the zero vector of Fm by 0m and the zero vector of
Mm,n by 0m,n . We will also sometimes abbreviate these different zeros as ~0 or 0. We will
let ei ∈ F n be the column vector whose entries are all zero, except for a 1 in the i-th
row. If A ∈ Mm,n (F ) then At ∈ Mn,m (F ) will denote the transpose of A. We will write
A = (A1 , A2 , . . . , An ) where Ai ∈ F m are the columns of A, and

$$A = \begin{pmatrix} A^1 \\ A^2 \\ \vdots \\ A^m \end{pmatrix}$$

where A^j ∈ Fn are the rows of A. For x = (x1 , . . . , xn )t ∈ F n , we have the formula
Ax = x1 A1 + x2 A2 + · · · + xn An .
In particular, Aei = Ai for 1 ≤ i ≤ n and (ei )t Aej = aij for 1 ≤ i ≤ m, 1 ≤ j ≤ n. We
also have

$$Ax = \begin{pmatrix} A^1 x \\ A^2 x \\ \vdots \\ A^m x \end{pmatrix} = \begin{pmatrix} (A^1)^t \cdot x \\ (A^2)^t \cdot x \\ \vdots \\ (A^m)^t \cdot x \end{pmatrix}$$

where for x = (x1 , . . . , xn )t , y = (y1 , . . . , yn )t ∈ F n , x · y = x1 y1 + x2 y2 + · · · + xn yn is the
dot product on F n .
Lemma 1.1. The following are equivalent for two matrices A, B ∈ Mm,n (F ).
1. A = B.
2. Ax = Bx for all x ∈ F n .
3. Aei = Bei for 1 ≤ i ≤ n.
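These formulas are easy to test computationally. A minimal Python sketch, assuming numpy is available (the matrix A and the vector x are arbitrary examples):

import numpy as np

# An example 3 x 2 matrix over F = R and a vector x in F^2.
A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([7.0, 8.0])

# Ax = x1 A1 + x2 A2: a matrix times a vector is the corresponding
# linear combination of the columns of the matrix.
assert np.allclose(A @ x, x[0] * A[:, 0] + x[1] * A[:, 1])

# A ei = Ai: multiplying by the i-th standard basis vector extracts column i,
# which is the observation behind Lemma 1.1.
for i in range(2):
    e = np.zeros(2)
    e[i] = 1.0
    assert np.allclose(A @ e, A[:, i])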
N will denote the natural numbers {0, 1, 2, . . .}. Z+ is the set of positive integers
{1, 2, 3, . . .}.
2. Span, Linear Dependence, Basis and Dimension
In this section we suppose that V is a vector space over a field F . We will denote the
zero vector in V by 0V . Sometimes, we will abbreviate, and write the zero vector as ~0 or
0.
Definition 2.1. Suppose that S is a nonempty subset of V . Then
Span(S) = {c1 v1 + · · · + cn vn | n ∈ Z+ , v1 , . . . , vn ∈ S and c1 , . . . , cn ∈ F }.
This definition is valid even when S is an infinite set. We define Span(∅) = {~0}.
Lemma 2.2. Suppose that S is a subset of V . Then Span(S) is a subspace of V .
Definition 2.3. Suppose that S is a subset of V . S is a linearly dependent set if there
exists n ∈ Z+ , v1 , . . . , vn ∈ S and c1 , . . . , cn ∈ F such that c1 , . . . , cn are not all zero, and
c1 v1 + c2 v2 + · · · + cn vn = 0. S is a linearly independent set if it is not linearly dependent.
Observe that ∅ is a linearly independent set.
Definition 2.4. Suppose that S is a subset of V which is a linearly independent set and
Span(S) = V . Then S is a basis of V .
Theorem 2.5. Suppose that V is a vector space over a field F . Then V has a basis.
We have that the empty set ∅ is a basis of the 0 vector space {~0}.
Theorem 2.6. (Extension Theorem) Suppose that V is a vector space over a field F and
S is a subset of V which is a linearly independent set. Then there exists a basis of V which
contains S.
Theorem 2.7. Suppose that V is a vector space over a field F and S1 and S2 are two
bases of V . Then S1 and S2 have the same cardinality.
This theorem allows us to make the following definition.
Definition 2.8. Suppose that V is a vector space over a field F . Then the dimension of
V is the cardinality of a basis of V .
The dimension of {~0} is 0.
Definition 2.9. V is called a finite dimensional vector space if V has a finite basis.
For the most part, we will consider finite dimensional vector spaces. If V is finite
dimensional of dimension n, a basis is considered as an ordered set β = {v1 , . . . , vn }.
Lemma 2.10. Suppose that V is a finite dimensional vector space and W is a subspace
of V . Then dim W ≤ dim V , and dim W = dim V if and only if V = W .
Lemma 2.11. Suppose that V is a finite dimensional vector space and β = {v1 , v2 , . . . , vn }
is a basis of V . Suppose that v ∈ V . Then v has a unique expression
v = c1 v1 + c2 v2 + · · · + cn vn
with c1 , . . . , cn ∈ F .
Lemma 2.12. Let {v1 , . . . , vn } be a set of generators of a vector space V . Let {v1 , . . . , vr }
be a maximal subset of linearly independent elements. Then {v1 , . . . , vr } is a basis of V .
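Lemma 2.12 suggests a procedure: scan the generators in order and keep each vector that is independent of those already kept. A hedged numpy sketch (the function name extract_basis and the generators are ours, for illustration; matrix rank serves as the independence test):

import numpy as np

def extract_basis(generators):
    # Keep each vector that increases the rank, i.e. is linearly
    # independent of the vectors kept so far (Lemma 2.12).
    kept = []
    for v in generators:
        if np.linalg.matrix_rank(np.array(kept + [v])) == len(kept) + 1:
            kept.append(v)
    return kept

# Four generators of a 2-dimensional subspace of R^3.
gens = [np.array([1.0, 0.0, 1.0]),
        np.array([2.0, 0.0, 2.0]),   # dependent on the first
        np.array([0.0, 1.0, 0.0]),
        np.array([1.0, 1.0, 1.0])]   # dependent on the vectors kept so far
basis = extract_basis(gens)
assert len(basis) == 2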
3. Direct Sums
Suppose that V is a vector space over a field F , and W1 , W2 , . . . , Wm are subspaces of
V . Then the sum
W1 + · · · + Wm = Span(W1 ∪ · · · ∪ Wm ) = {w1 + w2 + · · · + wm | wi ∈ Wi for 1 ≤ i ≤ m}
is the subspace of V spanned by W1 , W2 , . . . , Wm .
The sum W1 + · · · + Wm is called a direct sum, denoted by W1 ⊕ W2 ⊕ · · · ⊕ Wm , if
every element v ∈ W1 + W2 + · · · + Wm has a unique expression v = w1 + · · · + wm with
wi ∈ Wi for 1 ≤ i ≤ m.
We have the following useful criterion.
Lemma 3.1. The sum W1 + W2 + · · · + Wm is a direct sum if and only if 0 = w1 + w2 +
· · · + wm with wi ∈ Wi for 1 ≤ i ≤ m implies wi = 0 for 1 ≤ i ≤ m.
In the case when V is finite dimensional, the direct sum is characterized by the following
equivalent conditions.
Lemma 3.2. Suppose that V is a finite dimensional vector space. Let n = dim (W1 +
· · · + Wm ) and ni = dim Wi for 1 ≤ i ≤ m. The following conditions are equivalent
1. If v ∈ W1 + W2 + · · · + Wm then v has a unique expression v = w1 + · · · + wm with
wi ∈ Wi for 1 ≤ i ≤ m.
2. Suppose that βi = {wi,1 , . . . , wi,ni } are bases of Wi for 1 ≤ i ≤ m. Then β =
{w1,1 , . . . , w1,n1 , w2,1 , . . . , wm,nm } is a basis of W1 + · · · + Wm .
3. n = n1 + · · · + nm .
The following lemma gives a useful criterion.
Lemma 3.3. Suppose that U and W are subspaces of a vector space V . Then V = U ⊕ W
if and only if V = U + W and U ∩ W = {~0}.
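For subspaces of F n presented by spanning columns, the criterion of Lemma 3.3 reduces to rank computations, since dim(U + W) is the rank of the concatenated columns and dim(U ∩ W) = dim U + dim W − dim(U + W). A hedged numpy sketch (the subspaces are arbitrary examples):

import numpy as np

# U = span{e1, e2} and W = span{e3} as column spaces in R^3.
U = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
W = np.array([[0.0],
              [0.0],
              [1.0]])

dim_U = np.linalg.matrix_rank(U)
dim_W = np.linalg.matrix_rank(W)
dim_sum = np.linalg.matrix_rank(np.hstack([U, W]))   # dim(U + W)

# U ∩ W = {0} exactly when dim(U + W) = dim U + dim W (Lemma 3.2),
# and V = U ⊕ W when in addition dim(U + W) = dim V (Lemma 3.3).
assert dim_sum == dim_U + dim_W == 3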
4. Linear Maps
In this section, we will suppose that all vector spaces are over a fixed field F .
We recall the definition of equality of maps, which we will use repeatedly to show that
two maps are the same. Suppose that f : V → W and g : V → W are maps (functions).
Then f = g if and only if f (v) = g(v) for all v ∈ V .
If f : U → V and g : V → W , we will usually write the composition g ◦ f : U → W as
gf . If x ∈ U , we will sometimes write f x for f (x), and gf x for (g ◦ f )(x).
Definition 4.1. Suppose that V and W are vector spaces. A map L : V → W is linear if
L(v1 + v2 ) = L(v1 ) + L(v2 ) for v1 , v2 ∈ V and L(cv) = cL(v) for v ∈ V and c ∈ F .
Lemma 4.2. Suppose that L : V → W is a linear map. Then
1. L(0V ) = 0W .
2. L(−v) = −L(v) for v ∈ V .
Lemma 4.3. Suppose that L : V → W is a linear map and T : W → U is a linear map.
Then the composition T L : V → U is a linear map.
If S and T are sets, a map f : S → T is 1-1 and onto if and only if there exists a map
g : T → S such that g ◦ f = IS and f ◦ g = IT , where IS is the identity map of S and IT is
the identity map of T . When this happens, g is uniquely determined. We write g = f −1
and say that g is the inverse of f .
Definition 4.4. Suppose that L : V → W is a linear map. L is an isomorphism if there
exists a linear map T : W → V such that T L = IV is the identity map of V , and LT = IW
is the identity map of W .
The map T in the above definition is unique, if it exists, by the following lemma. If L
is an isomorphism, we write L−1 = T , and call T the inverse of L.
Lemma 4.5. Suppose that L : V → W is a map. Suppose that T : W → V and
S : W → V are maps which satisfy T L = IV , LT = IW and SL = IV , LS = IW . Then
S = T.
Lemma 4.6. A linear map L : V → W is an isomorphism if and only if L is 1-1 and
onto.
Lemma 4.7. Suppose that L : V → W is a linear map. Then
1. Image L = {L(v) | v ∈ V } is a subspace of W .
2. Kernel L = {v ∈ V | L(v) = 0} is a subspace of V .
Lemma 4.8. Suppose that L : V → W is a linear map. Then L is 1-1 if and only if
Kernel L = {0}.
Definition 4.9. Suppose that F is a field, and V , W are vector spaces over F . Let
LF (V, W ) be the set of linear maps from V to W .
We will sometimes write L(V, W ) = LF (V, W ) if the field is understood to be F .
Lemma 4.10. Suppose that F is a field, and V , W are vector spaces over F . Then
LF (V, W ) is a vector space.
Theorem 4.11. (Dimension Theorem) Suppose that L : V → W is a homomorphism of
finite dimensional vector spaces. Then
dim Kernel L + dim Image L = dim V.
Proof. Let {v1 , . . . , vr } be a basis of Kernel L. Extend this to a basis {v1 , . . . , vr , vr+1 , . . . , vn }
of V . Let wi = L(vi ) for r + 1 ≤ i ≤ n. We will show that {wr+1 , . . . , wn } is a basis of Image L. Since L is linear, {wr+1 , . . . , wn } span Image L. We must show that
{wr+1 , . . . , wn } are linearly independent. Suppose that we have a relation
cr+1 wr+1 + · · · + cn wn = 0
for some cr+1 , . . . , cn ∈ F . Then
L(cr+1 vr+1 + · · · + cn vn ) = cr+1 L(vr+1 ) + · · · + cn L(vn ) = cr+1 wr+1 + · · · + cn wn = 0.
Thus cr+1 vr+1 + · · · + cn vn ∈ Kernel L, so that we have an expansion
cr+1 vr+1 + · · · + cn vn = d1 v1 + · · · + dr vr
for some d1 , . . . , dr ∈ F , which we can rewrite as
d1 v1 + · · · + dr vr − cr+1 vr+1 − · · · − cn vn = 0.
Since {v1 , . . . , vn } are linearly independent, we have that d1 = · · · = dr = cr+1 = · · · =
cn = 0. Thus {wr+1 , . . . , wn } are linearly independent.
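For L = LA : F n → F m the Dimension Theorem reads rank A + dim{x | Ax = 0} = n, which can be spot-checked numerically. A hedged sketch, assuming scipy is available for an independent kernel computation (the matrix is an arbitrary example):

import numpy as np
from scipy.linalg import null_space

# A rank-2 example map L_A : R^4 -> R^3.
A = np.array([[1.0, 2.0, 3.0, 4.0],
              [2.0, 4.0, 6.0, 8.0],   # twice the first row
              [0.0, 1.0, 0.0, 1.0]])

dim_image = np.linalg.matrix_rank(A)           # dim Image L_A
dim_kernel = null_space(A).shape[1]            # dim Kernel L_A
assert dim_image + dim_kernel == A.shape[1]    # = dim R^4 = 4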
Corollary 4.12. Suppose that V is a finite dimensional vector space and Φ : V → V and
Ψ : V → V are linear maps such that ΨΦ = IV . Then Φ and Ψ are isomorphisms and
Ψ = Φ−1 .
Theorem 4.13. (Universal Property of Vector Spaces) Suppose that V and W are finite
dimensional vector spaces, β = {v1 , . . . , vn } is a basis of V , and w1 , . . . , wn ∈ W . Then
there exists a unique linear map L : V → W such that L(vi ) = wi for 1 ≤ i ≤ n.
Proof. We first prove existence. For v ∈ V , define L(v) = c1 w1 + c2 w2 + · · · + cn wn if
v = c1 v1 + · · · + cn vn with c1 , . . . , cn ∈ F . L is a well defined map since every v ∈ V has a
unique expression v = c1 v1 + · · · + cn vn with c1 , . . . , cn ∈ F , as β is a basis of V . We have
that L(vi ) = wi for 1 ≤ i ≤ n. We leave the verification that L is linear to the reader.
Now we prove uniqueness. Suppose that T : V → W is a linear map such that T (vi ) = wi
for 1 ≤ i ≤ n. Suppose that v ∈ V . Then v = c1 v1 + · · · + cn vn for some c1 , . . . , cn ∈ F .
We have that
T (v) = T (c1 v1 + · · · + cn vn ) = c1 T (v1 ) + · · · + cn T (vn ) = c1 w1 + · · · + cn wn = L(v).
Since T (v) = L(v) for all v ∈ V , we have that T = L.
Lemma 4.14. Suppose that A ∈ Mm,n (F ). Then the map LA : F n → F m defined by
LA (x) = Ax for x ∈ F n is linear.
Lemma 4.15. Suppose that A = (A1 , . . . , An ) ∈ Mm,n (F ). Then
Image LA = Span{A1 , . . . , An }.
Kernel LA = {x ∈ F n | Ax = 0} is the solution space to Ax = 0.
Lemma 4.16. Suppose that A, B ∈ Mm,n (F ). Then LA+B = LA + LB . Suppose that
c ∈ F . Then cLA = LcA . Suppose that A ∈ Mm,n (F ) and C ∈ Ml,m (F ). Then the composition
of maps LC LA = LCA .
Lemma 4.17. The map Φ : Mm,n (F ) → LF (F n , F m ) defined by Φ(A) = LA for A ∈ Mm,n (F )
is an isomorphism.
Proof. By Lemma 4.16, Φ is a linear map. We will now show that Kernel Φ is {0}. Let
A ∈ Kernel Φ. Then LA = 0. Thus Ax = 0 for all x ∈ F n . By Lemma 1.1, we have
that A = 0. Thus Kernel Φ = {0} and Φ is 1-1. Suppose that L ∈ LF (F n , F m ). Let
wi = L(ei ) for 1 ≤ i ≤ n. Let A = (w1 , . . . , wn ) ∈ Mm,n (F ). For 1 ≤ i ≤ n, we have
that LA (ei ) = Aei = wi . By Theorem 4.13, L = LA . Thus Φ is onto, and Φ is an
isomorphism.
Definition 4.18. Suppose that V is a finite dimensional vector space. Suppose that β =
{v1 , . . . , vn } is a basis of V . Define the coordinate vector (v)β ∈ F n of v with respect to β by
(v)β = (c1 , . . . , cn )t ,
where c1 , . . . , cn ∈ F are the unique elements of F such that v = c1 v1 + · · · + cn vn .
Lemma 4.19. Suppose that V is a finite dimensional vector space. Suppose that β =
{v1 , . . . , vn } is a basis of V . Suppose that u1 , u2 ∈ V and c ∈ F . Then (u1 + u2 )β =
(u1 )β + (u2 )β and (cu1 )β = c(u1 )β .
Theorem 4.20. Suppose that V is a finite dimensional vector space. Let β = {v1 , . . . , vn }
be a basis of V . Then the map Φ : V → F n defined by Φ(v) = (v)β is an isomorphism.
Proof. Φ is a linear map by Lemma 4.19. Note that Φ(vi ) = ei for 1 ≤ i ≤ n. By Theorem
4.13, there exists a unique linear map Ψ : F n → V such that Ψ(ei ) = vi for 1 ≤ i ≤ n
(where {e1 , . . . , en } is the standard basis of F n ). Now ΨΦ : V → V is a linear map which
satisfies ΨΦ(vi ) = vi for 1 ≤ i ≤ n. By Theorem 4.13, there is a unique linear map from
V → V which takes vi to vi for 1 ≤ i ≤ n. Since the identity map IV of V has this
property, we have that ΨΦ = IV . By a similar calculation, ΦΨ = IF n . Thus Φ is an
isomorphism (with inverse Ψ).
Definition 4.21. Suppose that V and W are finite dimensional vector spaces. Suppose
that β = {v1 , . . . , vn } is a basis of V and β′ = {w1 , . . . , wm } is a basis of W . Suppose that
L : V → W is a linear map. Define the matrix M^β_{β′}(L) ∈ Mm,n (F ) of L with respect to
the bases β and β′ to be the matrix

M^β_{β′}(L) = ((L(v1 ))_{β′} , (L(v2 ))_{β′} , . . . , (L(vn ))_{β′} ).

The matrix M^β_{β′}(L) of L with respect to β and β′ has the following important property:

(1) M^β_{β′}(L)(v)_β = (L(v))_{β′}

for all v ∈ V . In fact, M^β_{β′}(L) is the unique matrix A such that A(v)_β = (L(v))_{β′} for all
v ∈ V .
Lemma 4.22. Suppose that V and W are finite dimensional vector spaces. Suppose
that β = {v1 , . . . , vn } is a basis of V and β′ = {w1 , . . . , wm } is a basis of W . Suppose
that L1 , L2 ∈ LF (V, W ) and c ∈ F . Then M^β_{β′}(L1 + L2 ) = M^β_{β′}(L1 ) + M^β_{β′}(L2 ) and
M^β_{β′}(cL1 ) = c M^β_{β′}(L1 ).
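Property (1) determines M^β_{β′}(L) concretely: its columns are the β′-coordinates of the images of the β-basis vectors. A hedged numpy sketch for V = R^3, W = R^2 and L = LA (the bases, stored as the columns of P and Q, are arbitrary examples):

import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])
P = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])   # basis beta of R^3 (columns)
Q = np.array([[2.0, 1.0],
              [1.0, 1.0]])        # basis beta' of R^2 (columns)

# Column i of M is (L(v_i))_{beta'}, which works out to Q^{-1} A P.
M = np.linalg.inv(Q) @ A @ P

v = np.array([1.0, -2.0, 5.0])
v_beta = np.linalg.solve(P, v)             # (v)_beta
Lv_betap = np.linalg.solve(Q, A @ v)       # (L(v))_beta'
assert np.allclose(M @ v_beta, Lv_betap)   # property (1)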
Theorem 4.23. Suppose that V and W are finite dimensional vector spaces. Suppose
that β = {v1 , . . . , vn } is a basis of V and β′ = {w1 , . . . , wm } is a basis of W . Then the
map Λ : LF (V, W ) → Mm,n (F ) defined by Λ(L) = M^β_{β′}(L) is an isomorphism.
Proof. Λ is a linear map by Lemma 4.22. It remains to verify that Λ is 1-1 and onto,
from which it will follow that Λ is an isomorphism. Suppose that L1 , L2 ∈ LF (V, W ) and
Λ(L1 ) = Λ(L2 ). Then M^β_{β′}(L1 ) = M^β_{β′}(L2 ), so that (L1 (vi ))_{β′} = (L2 (vi ))_{β′} for 1 ≤ i ≤ n.
Thus L1 (vi ) = L2 (vi ) for 1 ≤ i ≤ n. Since β is a basis of V , L1 = L2 by the uniqueness
statement of Theorem 4.13. Thus Λ is 1-1. Suppose that A = (aij ) ∈ Mm,n (F ). Write
A = (A1 , . . . , An ) where Ai = (a_{1,i} , . . . , a_{m,i} )t ∈ F m are the columns of A. Let
zi = \sum_{j=1}^{m} a_{ji} w_j ∈ W for 1 ≤ i ≤ n. By Theorem 4.13, there exists a linear map
L : V → W such that L(vi ) = zi for 1 ≤ i ≤ n. We have that (L(vi ))_{β′} = (zi )_{β′} = Ai
for 1 ≤ i ≤ n. We have that Λ(L) = M^β_{β′}(L) = (A1 , . . . , An ) = A, and thus Λ is onto.
Corollary 4.24. Suppose that V is a vector space of dimension n and W is a vector space
of dimension m. Then LF (V, W ) is a vector space of dimension mn.
Lemma 4.25. Suppose that U , V and W are finite dimensional vector spaces, with respective bases β1 , β2 , β3 . Suppose that L1 ∈ LF (U, V ) and L2 ∈ LF (V, W ). Then
M^{β1}_{β3}(L2 L1 ) = M^{β2}_{β3}(L2 ) M^{β1}_{β2}(L1 ).
Theorem 4.26. Suppose that V is a finite dimensional vector space, and β is a basis of V .
Then the map Λ : LF (V, V ) → Mn,n (F ) defined by Λ(L) = Mββ (L) is a ring isomorphism.
5. Bilinear Forms
Let U, V, W be vector spaces over a field F . Let Bil(U × V, W ) be the set of bilinear
maps from U × V to W . Bil(U × V, W ) is a vector space over F . We call an element of
BilF (V ×V, F ) a bilinear form. A bilinear form is usually written as a pairing < v, w >∈ F
for v, w ∈ V . A bilinear form <, > is symmetric if < v, w >=< w, v > for all v, w ∈ V . A
symmetric bilinear form is nondegenerate if whenever v ∈ V is such that < v, w >= 0 for
all w ∈ V then v = 0.
Theorem 5.1. Let F be a field and A ∈ Mm,n (F ). Let gA : F m × F n → F be defined
by gA (v, w) = v t Aw. Then gA ∈ BilF (F m × F n , F ). Further, the map Ψ : Mm,n (F ) →
BilF (F m × F n , F ) defined by Ψ(A) = gA is an isomorphism of F -vector spaces.
Also, in the case when m = n,
1. Ψ gives a 1-1 correspondence between n × n matrices and bilinear forms on F n .
2. Ψ gives a 1-1 correspondence between n × n symmetric matrices and symmetric
bilinear forms on F n .
3. Ψ gives a 1-1 correspondence between invertible n × n symmetric matrices and
nondegenerate symmetric bilinear forms on F n .
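In coordinates gA is a matrix sandwich, and bilinearity is immediate. A hedged numpy sketch of the correspondence in Theorem 5.1 (the matrix and vectors are arbitrary examples):

import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [2.0, 5.0, 1.0]])   # a 2 x 3 matrix gives g_A on R^2 x R^3
v = np.array([1.0, -1.0])
w = np.array([0.0, 2.0, 3.0])

g_vw = v @ A @ w                  # g_A(v, w) = v^t A w

# Bilinearity in the first argument, for instance:
v2 = np.array([2.0, 0.5])
assert np.isclose((v + v2) @ A @ w, v @ A @ w + v2 @ A @ w)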
From now on in this section, suppose that V is a vector space over a field F , and <, >
is a nondegenerate symmetric form on V . An important example is the dot product on
F m where F is a field. We begin by stating the identity
(2) < v, w > = (1/2) [ < v + w, v + w > − < v, v > − < w, w > ]

for v, w ∈ V . (This identity, and the arguments below that use it, assume that the
characteristic of F is not 2, so that 1/2 ∈ F .)
Define v, w ∈ V to be orthogonal (or perpendicular) if < v, w >= 0. Suppose that
S ⊂ V is a subset. Define
S ⊥ = {v ∈ V |< v, w >= 0 for all w ∈ S}.
Lemma 5.2. Suppose that S is a subset of V . Then
1. S ⊥ is a subspace of V .
2. If U is the subspace Span(S) of V , then U ⊥ = S ⊥ .
Lemma 5.3. Suppose

$$A = \begin{pmatrix} A^1 \\ \vdots \\ A^m \end{pmatrix} \in M_{m,n}(F).$$

Let U = Span{(A^1 )t , . . . , (A^m )t } ⊂ F n , which is the column space of At . Then U ⊥ =
Kernel LA .
A basis β = {v1 , . . . , vn } of V is called an orthogonal basis if < vi , vj > = 0 whenever
i ≠ j.
Theorem 5.4. Suppose that V is a finite dimensional vector space with a symmetric
nondegenerate bilinear form < , >. Then V has an orthogonal basis.
Proof. We prove the theorem by induction on n = dim V . If n = 1, the theorem is true
since any basis is an orthogonal basis.
Assume that dim V = n > 1, and that any subspace of V of dimension less than n has
an orthogonal basis. There are two cases. The first case is when every v ∈ V satisfies
< v, v >= 0. Then by (2), we have < v, w >= 0 for every v, w ∈ V . Thus any basis of V
is an orthogonal basis.
Now assume that there exists v1 ∈ V such that < v1 , v1 > ≠ 0. Let U = Span{v1 }, a
one dimensional subspace of V . Suppose that v ∈ V . Let

c = < v, v1 > / < v1 , v1 > ∈ F.

Then v − cv1 ∈ U ⊥ , and thus v = cv1 + (v − cv1 ) ∈ U + U ⊥ . It follows that V = U + U ⊥ .
We have that U ∩ U ⊥ = {~0}, since < v1 , v1 > ≠ 0. Thus V = U ⊕ U ⊥ . Since U ⊥ has
dimension n − 1 by Lemma 3.2, we have by induction that U ⊥ has an orthogonal basis
{v2 , . . . , vn }. Thus {v1 , v2 , . . . , vn } is an orthogonal basis of V .
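The proof of Theorem 5.4 is effectively an algorithm: pick v1 with < v1 , v1 > ≠ 0, split off Span{v1}, and recurse on the orthogonal complement. A hedged Python sketch for a form < x, y > = x^t A y on R^n given by a symmetric matrix A (the function name is ours; the sketch keeps the remaining vectors when all self-pairings vanish, which suffices for generic examples):

import numpy as np

def orthogonal_basis(A, tol=1e-10):
    # Follow the recursion in the proof of Theorem 5.4: choose v with
    # <v, v> != 0, then replace every other vector w by w - (<w,v>/<v,v>) v,
    # which lies in the orthogonal complement of v.
    n = A.shape[0]
    vecs = [np.eye(n)[:, i] for i in range(n)]
    basis = []
    while vecs:
        k = next((i for i, v in enumerate(vecs) if abs(v @ A @ v) > tol), None)
        if k is None:
            # Every remaining spanning vector has <v, v> = 0; the sketch
            # assumes the form vanishes on their span (cf. identity (2)).
            basis.extend(vecs)
            break
        v = vecs.pop(k)
        basis.append(v)
        vecs = [w - ((w @ A @ v) / (v @ A @ v)) * v for w in vecs]
    return basis

# Example: the form diag(1, 1, -1) on R^3.
A = np.diag([1.0, 1.0, -1.0])
B = orthogonal_basis(A)
for i, b in enumerate(B):
    for j, c in enumerate(B):
        if i != j:
            assert abs(b @ A @ c) < 1e-8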
Theorem 5.5. Let V be a finite dimensional vector space over a field F , with a symmetric
bilinear form < , >. Assume dim V > 0. Let V0 be the subspace of V defined by
V0 = V ⊥ = {v ∈ V |< v, w >= 0 for all w ∈ V }.
Let {v1 , . . . , vn } be an orthogonal basis of V . Then the number of integers i such that
< vi , vi >= 0 is equal to the dimension of V0 .
Proof. Order the {v1 , . . . , vn } so that < vi , vi > = 0 for 1 ≤ i ≤ r and < vi , vi > ≠ 0 for
r < i ≤ n. Suppose that w ∈ V . Then w = c1 v1 + · · · + cn vn for some c1 , . . . , cn ∈ F , and

< vi , w > = \sum_{j=1}^{n} c_j < vi , vj > = c_i < vi , vi > = 0

for 1 ≤ i ≤ r. Thus v1 , . . . , vr ∈ V0 .
Now suppose that v ∈ V0 . We have v = d1 v1 + · · · + dn vn for some d1 , . . . , dn ∈ F . We
have

0 = < vi , v > = \sum_{j=1}^{n} d_j < vi , vj > = d_i < vi , vi >

for 1 ≤ i ≤ n. Since < vi , vi > ≠ 0 for r < i ≤ n we have di = 0 for r < i ≤ n. Thus
v = d1 v1 + · · · + dr vr ∈ Span{v1 , . . . , vr }. Thus V0 = Span{v1 , . . . , vr }. Since v1 , . . . , vr
are linearly independent, they are a basis of V0 , and thus dim V0 = r.
The dimension of V0 in Theorem 5.5 is called the index of nullity of the form.
Theorem 5.6. (Sylvester’s Theorem) Let V be a finite dimensional vector space over R
with a symmetric bilinear form. There exists an integer r with the following property. If
{v1 , . . . , vn } is an orthogonal basis of V , then there are precisely r integers i such that
< vi , vi >> 0.
Proof. Let {v1 , . . . , vn } and {w1 , . . . , wn } be orthogonal bases of V . We may suppose that
their elements are arranged so that < vi , vi > > 0 if 1 ≤ i ≤ r, < vi , vi > < 0 if r + 1 ≤ i ≤ s
and < vi , vi > = 0 if s + 1 ≤ i ≤ n. Similarly, < wi , wi > > 0 if 1 ≤ i ≤ r′ , < wi , wi > < 0
if r′ + 1 ≤ i ≤ s′ and < wi , wi > = 0 if s′ + 1 ≤ i ≤ n.
We first prove that v1 , . . . , vr , wr′+1 , . . . , wn are linearly independent. Suppose we have
a relation
x1 v1 + · · · + xr vr + yr′+1 wr′+1 + · · · + yn wn = 0
for some x1 , . . . , xr , yr′+1 , . . . , yn ∈ R. Then
x1 v1 + · · · + xr vr = −(yr′+1 wr′+1 + · · · + yn wn ).
Let ci = < vi , vi > and di = < wi , wi > for all i. Taking the product of each side of the
preceding equation with itself, we have

c1 x1² + · · · + cr xr² = dr′+1 yr′+1² + · · · + ds′ ys′² .

Since the left hand side is ≥ 0 and the right hand side is ≤ 0, we must have that both
sides of this equation are zero. Thus x1 = · · · = xr = 0. Since wr′+1 , . . . , wn are linearly
independent, yr′+1 = · · · = yn = 0.
Since dim V = n, we have that r + n − r′ ≤ n, so that r ≤ r′ . Interchanging the roles
of the two bases in the proof, we also obtain r′ ≤ r, so that r = r′ .
The integer r in the proof of Sylvester's theorem is called the index of positivity of the
form.
6. Exercises
1. Let V be an n-dimensional vector space over a field F . The dual space of V is the
vector space V ∗ = L(V, F ). Elements of V ∗ are called functionals.
i) Suppose that β = {v1 , . . . , vn } is a basis of V . For 1 ≤ i ≤ n, define vi∗ ∈ V ∗
by vi∗ (v) = ci if v = c1 v1 + · · · + cn vn ∈ V (so that vi∗ (vj ) = δij ). Prove that
the vi∗ are elements of V ∗ and that they form a basis of V ∗ . (This basis is
called the dual basis to β.)
ii) Suppose that V has a nondegenerate symmetric bilinear form < , >. For
v ∈ V define Lv : V → F by Lv (w) =< v, w > for w ∈ V . Show that Lv ∈ V ∗ ,
and that the map Φ : V → V ∗ defined by Φ(v) = Lv is an isomorphism of
vector spaces.
2. Suppose that L : V → W is a linear map of finite dimensional vector spaces.
Define L̂ : W ∗ → V ∗ by L̂(ϕ) = ϕL for ϕ ∈ W ∗ .
i) Show that L̂ is a linear map.
ii) Suppose that L is 1-1. Is L̂ 1-1? Is L̂ onto? Prove your answers.
iii) Suppose that L is onto. Is L̂ 1-1? Is L̂ onto? Prove your answers.
iv) Suppose that A ∈ Mm,n (F ) where F is a field, and L = LA : F n → F m . Let
{e1 , . . . , en } be the standard basis of F n , and {f1 , . . . , fm } be the standard
basis of F m . Let β = {e∗1 , . . . , e∗n } be the dual basis of (F n )∗ and let
β′ = {f1∗ , . . . , fm∗ } be the dual basis of (F m )∗ . Compute M^{β′}_{β}(L̂).
3. Let V be a vector space of dimension n, and W be a subspace of V . Define
PerpV ∗ (W ) = {ϕ ∈ V ∗ | ϕ(w) = 0 for all w ∈ W }.
i) Prove that dim W + dim PerpV ∗ (W ) = n.
ii) Suppose that V has a nondegenerate symmetric bilinear form < , >. Prove
that the map v 7→ Lv of Problem 1.ii) induces an isomorphism of
W ⊥ = {v ∈ V |< v, w >= 0 for all w ∈ W }
with PerpV ∗ (W ). Conclude that dim W + dim W ⊥ = n.
iii) Use the conclusions of Problem 3.ii) to prove that the column rank of a matrix
is equal to the row rank of a matrix (see Definition 7.4). This gives a different
proof of part of Theorem 7.5.
iv) (rank-nullity theorem) Suppose that A ∈ Mm,n (F ) is a matrix. Show that
the dimension of the solution space
{x ∈ F n | Ax = ~0}
is equal to n − rank A. Thus with the notation of Lemma 5.3,
dim U ⊥ + rank A = n.
v) Consider the complex vector space C2 with the dot product as nondegenerate
symmetric bilinear form. Give an example of a subspace W of C2 such that
C2 ≠ W ⊕ W ⊥ . Verify for your example that dim W + dim W ⊥ = dim C2 = 2.
4. Suppose that V is a finite dimensional vector space with a nondegenerate symmetric bilinear form < , >. An operator Φ on V is a linear map Φ : V → V . If Φ is
an operator on V , show that there exists a unique operator Ψ : V → V such that
< Φ(v), w >=< v, Ψ(w) > for all v, w ∈ V . Ψ is called the transpose of Φ, and we
write Ψ = Φt .
Hint: Define a map Ψ : V → V as follows. For w ∈ V , let L : V → F be the
map defined by L(v) =< Φ(v), w > for v ∈ V . Verify that L is linear, so that
L ∈ V ∗ . By Problem 1.ii), there exists a unique w0 ∈ V such that for all v ∈ V ,
we have L(v) =< v, w0 >. Define Ψ(w) = w0 . Prove that Ψ is linear, and that
< Φ(v), w >=< v, Ψ(w) > for all v, w ∈ V . Don’t forget to prove that Ψ is unique!
5. Let V = F n with the dot product as nondegenerate symmetric bilinear form.
Suppose that L : V → V is an operator. Show that Lt = LAt if A ∈ Mn×n (F ) is
the n × n matrix such that L = LA .
6. Give a “down to earth” (but “coordinate dependent”) proof of Problem 4, starting
like this. Let β = {v1 , . . . , vn } be a basis of V . The linear map Φβ : V → F n
defined by Φβ (v) = (v)β is an isomorphism, with inverse Ψ : F n → V defined by
Ψ((x1 , . . . , xn )t ) = x1 v1 + · · · + xn vn . Define a product [ , ] on F n by [x, y] =<
Ψ(x), Ψ(y) >. Verify that [ , ] is a nondegenerate symmetric bilinear form on F n .
By our classification of nondegenerate symmetric bilinear forms on F n , we know
that [x, y] = xt By for some symmetric invertible matrix B ∈ Mn×n (F ).
7. Determinants
Theorem 7.1. Suppose that F is a field, and n is a positive integer. Then there exists a
unique function Det : Mn,n (F ) → F which satisfies the following properties:
1. Det is multilinear in the columns of matrices in Mn,n (F ).
2. Det is alternating in the columns of matrices in Mn,n (F ); that is Det(A) = 0
whenever two columns of A are equal.
3. Det(In ) = 1.
Existence can be proven by defining a function on A = (aij ) ∈ Mn,n (F ) by

Det(A) = \sum_{σ ∈ Sn} sgn(σ) a_{σ(1),1} a_{σ(2),2} · · · a_{σ(n),n}

where for a permutation σ ∈ Sn ,

sgn(σ) = 1 if σ is even, and sgn(σ) = −1 if σ is odd.
It can be verified that Det is multilinear, alternating and Det(In ) = 1.
Uniqueness is proven by showing that if a function Φ : Mn,n (F ) → F is multilinear and
alternating on columns and satisfies Φ(In ) = 1 then Φ = Det.
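The permutation sum can be implemented directly, though it has n! terms; comparing it against a library determinant is a useful sanity check. A hedged Python sketch (the helper names are ours):

from itertools import permutations
import numpy as np

def sign(perm):
    # Sign of a permutation (as a tuple), computed by counting inversions.
    s = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                s = -s
    return s

def det(A):
    # Det(A) = sum over sigma in S_n of sgn(sigma) a_{sigma(1),1} ... a_{sigma(n),n}.
    n = A.shape[0]
    total = 0.0
    for sigma in permutations(range(n)):
        term = float(sign(sigma))
        for j in range(n):
            term *= A[sigma[j], j]
        total += term
    return total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
assert np.isclose(det(A), np.linalg.det(A))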
Lemma 7.2. Suppose that A, B ∈ Mn,n (F ). Then
1. Det(AB) = Det(A)Det(B).
2. Det(At ) = Det(A).
Lemma 7.3. Suppose that A ∈ Mn,n (F ). Then the following are equivalent:
1. Det(A) ≠ 0.
2. The columns {A1 , . . . , An } of A are linearly independent.
3. The solution space to Ax = 0 is (0).
Proof. 2. is equivalent to 3. by the dimension formula, applied to the linear map LA :
F n → F n.
We will now establish that 1. is equivalent to 2. Suppose that {A1 , . . . , An } are linearly
dependent. Then, after possibly permuting the columns of A, there exist c1 , . . . , cn−1 ∈ F
such that An = c1 A1 + · · · + cn−1 An−1 . Thus
Det(A) = Det(A1 , . . . , An−1 , An )
= Det(A1 , . . . , An−1 , c1 A1 + · · · + cn−1 An−1 )
= c1 Det(A1 , . . . , An−1 , A1 ) + · · · + cn−1 Det(A1 , . . . , An−1 , An−1 )
= 0.
Now suppose that {A1 , . . . , An } are linearly independent. Then dim Image LA = n. By
the dimension theorem, dim Kernel LA = 0. Thus LA is an isomorphism, so A is invertible
with an inverse B. We have that 1 = Det(In ) = Det(AB) = Det(A)Det(B). Thus
Det(A) ≠ 0.
Definition 7.4. Suppose that A ∈ Mm,n (F ).
1. The column rank of A is the dimension of the column space Span{A1 , . . . , An },
which is a subspace of F m .
2. The row rank of A is the dimension of the row space Span{A^1 , . . . , A^m }, which is
a subspace of Fn .
3. The rank of A is the largest integer r such that there exists an r × r submatrix B
of A such that Det(B) ≠ 0.
Theorem 7.5. Suppose that A ∈ Mm,n (F ). Then the column rank of A, the row rank of
A and the rank of A are all equal.
Proof. Let s be the column rank of A, and let r be the rank of A. We will show that s = r.
There exists an r × r submatrix B of A such that Det(B) ≠ 0. There exist 1 ≤ i1 <
i2 < · · · < ir ≤ m and 1 ≤ j1 < j2 < · · · < jr ≤ n such that B is obtained from A by
deleting the rows A^i from A such that i ∉ {i1 , . . . , ir } and deleting the columns Aj such
that j ∉ {j1 , . . . , jr }. We have that the columns of B are linearly independent, so the
columns Aj1 , . . . , Ajr of A are linearly independent. Thus s ≥ r.
We now will prove that r ≥ s by induction on n, from which it follows that r = s. The
case n = 1 is easily verified; r and s are either 0 or 1, depending on whether A is zero or not.
Assume that the induction statement is true for matrices with fewer than n columns. After
permuting the columns of A, we may assume that the first s columns of A are linearly
independent. Since the column space is a subspace of F m , we have that s ≤ m. Let B be
the matrix consisting of the first s columns of A. If s < n, then by induction we have that
the rank of B is s, so that A has rank r ≥ s.
We have reduced to the case where s = n ≤ m. Let C be the m × (n − 1) submatrix of
A consisting of the first n − 1 columns of A. By induction, there exists an (n − 1) × (n − 1)
submatrix E of C whose determinant is nonzero. After possibly interchanging rows of A,
we may assume that E consists of the first n − 1 rows of C. For 1 ≤ i ≤ n, let Ei be the
column vector consisting of the first n − 1 rows of the i-th column of A, and for 1 ≤ i ≤ n
and n ≤ j ≤ m, let E_i^j be the column vector consisting of the first n − 1 rows of the i-th
column of A, followed by the j-th row of the i-th column of A.
Since Det(E) ≠ 0, {E1 , . . . , En−1 } are linearly independent, and are thus a basis of
F n−1 . Thus there are unique elements a1 , . . . , an−1 ∈ F such that

(3) a1 E1 + · · · + an−1 En−1 = En .

Suppose that all determinants of n × n submatrices of A are zero. Then {E_1^j , . . . , E_n^j }
are linearly dependent for n ≤ j ≤ m. By uniqueness of the relation (3), we must then
have that

a1 E_1^j + · · · + an−1 E_{n−1}^j − E_n^j = 0

for n ≤ j ≤ m. Thus

a1 A1 + · · · + an−1 An−1 − An = 0,

a contradiction to our assumption that A has column rank n. Thus some n × n submatrix
of A has nonzero determinant, and thus r ≥ s.
Taking the transpose of A, the above argument shows that the row rank of A is equal
to the rank of A.
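Theorem 7.5 can be spot-checked numerically: all three quantities of Definition 7.4 agree. A hedged sketch that computes the rank of Definition 7.4.3 by brute force over square submatrices (feasible only for small matrices; the helper name is ours):

from itertools import combinations
import numpy as np

def minor_rank(A, tol=1e-10):
    # Largest r such that some r x r submatrix has nonzero determinant.
    m, n = A.shape
    for r in range(min(m, n), 0, -1):
        for rows in combinations(range(m), r):
            for cols in combinations(range(n), r):
                if abs(np.linalg.det(A[np.ix_(rows, cols)])) > tol:
                    return r
    return 0

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],   # twice the first row
              [0.0, 1.0, 1.0]])

col_rank = np.linalg.matrix_rank(A)     # column rank
row_rank = np.linalg.matrix_rank(A.T)   # row rank
assert col_rank == row_rank == minor_rank(A) == 2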
8. Rings
A set A is a ring if it has an addition operation under which A is an abelian group, and
an associative multiplication operation such that a(b + c) = ab + ac and (b + c)a = ba + ca
for all a, b, c ∈ A. A further has a multiplicative identity 1A (written 1 when there is no
danger of confusion). A ring A is a commutative ring if ab = ba for all a, b ∈ A.
Suppose that A and B are rings. A ring homomorphism ϕ : A → B is a mapping
such that ϕ is a homomorphism of abelian groups, ϕ(ab) = ϕ(a)ϕ(b) for a, b ∈ A and
ϕ(1A ) = 1B .
Suppose that B is a commutative ring, A is a subring of B and S is a subset of B.
The subring of B generated by S and A is the intersection of all subrings T of B which
contain A and S. The subring of B generated by S and A is denoted by A[S]. It should
be verified that A[S] is in fact a subring of B, and that
A[S] = { \sum_{i_1 ,i_2 ,...,i_r = 0}^{n} a_{i_1 ,...,i_r} s_1^{i_1} · · · s_r^{i_r} | r ∈ N, n ∈ N, a_{i_1 ,...,i_r} ∈ A, s_1 , . . . , s_r ∈ S }.

Now let R be a commutative ring, and let Ri = R for i ∈ N. Let

T = {{ai } ∈ ×i∈N Ri | ai = 0 for i ≫ 0}.
We can also write a sequence {ai } ∈ T as (a0 , a1 , a2 , . . . , ar , 0, 0, . . .) for some r ∈ N. T is
a ring with addition {ai } + {bi } = {ai + bi } and multiplication {ai }{bi } = {ck } where
c_k = \sum_{i+j=k} a_i b_j .
The zero element 0T is the sequence all of whose terms are 0, and 1T is the sequence all
of whose terms are zero except the zero-th term, which is 1. That is, 0T = (0, 0, . . .) and
1T = (1, 0, 0, . . .). Let x be the sequence whose first term is 1 and all other terms are zero;
that is, x = (0, 1, 0, 0, . . .). Then x^i is the sequence whose i-th term is 1 and all other
terms are zero. The natural map of R into T which maps a ∈ R to the sequence whose
zero-th term is a and all other terms are zero identifies R with a subring of T . Suppose
that {ai } ∈ T . Then there is some natural number r such that ai = 0 for all i > r. We
have that

{ai } = (a0 , a1 , . . . , ar , 0, 0, . . .) = \sum_{i=0}^{r} a_i x^i .
Thus the subring R[x] of T generated by x and R is equal to T . R[x] is called a polynomial
ring.
Lemma 8.1. Suppose that f (x), g(x) ∈ R[x] have expansions f (x) = a0 + a1 x + · · · + ar x^r
and g(x) = b0 + b1 x + · · · + br x^r with a0 , . . . , ar ∈ R and b0 , . . . , br ∈ R. Suppose
that f (x) = g(x). Then ai = bi for 0 ≤ i ≤ r.
Proof. We have f (x) = (a0 , a1 , . . . , ar , 0, . . .) and g(x) = (b0 , b1 , . . . , br , 0, . . .) as elements
of ×i∈N Ri . Since two sequences in ×i∈N Ri are equal if and only if their coefficients are
equal, we have that ai = bi for all i.
Theorem 8.2. (Universal Property of Polynomial Rings) Suppose that A and B are commutative rings and ϕ : A → B is a ring homomorphism. Suppose that b ∈ B. Then there
exists a unique ring homomorphism ϕ̄ : A[x] → B such that ϕ̄(a) = ϕ(a) for a ∈ A and
ϕ̄(x) = b.
Proof. We first prove existence. Define ϕ̄ : A[x] → B by ϕ̄(f (x)) = ϕ(a0 ) + ϕ(a1 )b + · · · +
ϕ(ar )b^r for

(4) f (x) = a0 + a1 x + · · · + ar x^r ∈ A[x]

with a0 , . . . , ar ∈ A. This map is well defined, since the expansion (4) is unique by Lemma
8.1. It should be verified that ϕ̄ is a ring homomorphism with the desired properties.
We will now prove uniqueness. Suppose that Ψ : A[x] → B is a ring homomorphism
such that Ψ(a) = ϕ(a) for a ∈ A and Ψ(x) = b. Suppose that f (x) ∈ A[x]. Write
f (x) = a0 + a1 x + · · · + ar x^r with a0 , . . . , ar ∈ A. We have

Ψ(f (x)) = Ψ(a0 + a1 x + · · · + ar x^r ) = Ψ(a0 ) + Ψ(a1 )Ψ(x) + · · · + Ψ(ar )Ψ(x)^r
= ϕ(a0 ) + ϕ(a1 )b + · · · + ϕ(ar )b^r = ϕ̄(a0 + a1 x + · · · + ar x^r )
= ϕ̄(f (x)).

Since this identity is true for all f (x) ∈ A[x], we have that Ψ = ϕ̄.
Let F be a field, and F [x] be a polynomial ring over F . Suppose that f (x) = a0 +
a1 x + · · · + an x^n ∈ F [x] with a0 , . . . , an ∈ F and an ≠ 0. The degree of f (x) is defined to
be deg f (x) = n. If f (x) = 0 we define deg f (x) = −∞. For f, g ∈ F [x] we have

deg(f g) = deg f + deg g and deg(f + g) ≤ max{deg f, deg g}.
f ∈ F [x] is a unit (has a multiplicative inverse in F [x]) if and only if f is a nonzero element
of F , if and only if deg f = 0 (this follows from the formulas on degree). F [x] is a domain;
that is it has no zero divisors. A basic result is Euclidean division.
Theorem 8.3. Suppose that f, g ∈ F [x] with f 6= 0. Then there exist unique polynomials
q, r ∈ F [x] such that g = qf + r with deg r < deg f .
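Euclidean division is constructive: repeatedly cancel the leading term of the remainder against the leading term of f. A hedged Python sketch over F = Q, representing a polynomial as a list of Fractions with index i holding the coefficient of x^i (the helper names are ours):

from fractions import Fraction

def deg(p):
    # Degree of a coefficient list; deg 0 = -1 stands in for -infinity here.
    d = len(p) - 1
    while d >= 0 and p[d] == 0:
        d -= 1
    return d

def poly_divmod(g, f):
    # Return (q, r) with g = q f + r and deg r < deg f; f must be nonzero.
    df = deg(f)
    assert df >= 0
    r = [Fraction(c) for c in g]
    q = [Fraction(0)] * (max(deg(g) - df, 0) + 1)
    while deg(r) >= df:
        dr = deg(r)
        c = r[dr] / f[df]       # cancels the leading term of r
        q[dr - df] = c
        for i in range(df + 1):
            r[i + dr - df] -= c * f[i]
    return q, r

# Example: x^3 + 2x + 1 = x (x^2 + 1) + (x + 1).
g = [Fraction(1), Fraction(2), Fraction(0), Fraction(1)]
f = [Fraction(1), Fraction(0), Fraction(1)]
q, r = poly_divmod(g, f)
assert q == [Fraction(0), Fraction(1)] and deg(r) < deg(f)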
Corollary 8.4. Suppose that f (x) ∈ F [x] is nonzero. Then f (x) has at most deg f
distinct roots in F .
Corollary 8.5. F [x] is a principal ideal domain; that is every ideal in F [x] is generated
by a single element.
An element f (x) ∈ F [x] is defined to be irreducible if it has degree ≥ 1, and f cannot
be written as a product f = gh with g, h ∉ F . A polynomial f (x) ∈ F [x] of positive
degree n is monic if f (x) = a0 + a1 x + · · · + an−1 x^{n−1} + x^n for some a0 , . . . , an−1 ∈ F .
Theorem 8.6. (Unique Factorization) Suppose that f (x) ∈ F [x] has positive degree. Then
there is a factorization

(5) f (x) = c f_1^{n_1} · · · f_r^{n_r}

where r is a positive integer, f1 , . . . , fr ∈ F [x] are distinct monic irreducibles, 0 ≠ c ∈ F and
n1 , . . . , nr are positive integers.
This factorization is unique, in the sense that any other factorization of f (x) as a
product of monic irreducibles is obtained from (5) by permuting the fi .
If F is an algebraically closed field (such as the complex numbers C) the monic irreducibles are exactly the linear polynomials x − α for α ∈ F .
Theorem 8.7. Suppose that F is an algebraically closed field and f (x) ∈ F [x] has positive
degree. Then there is a (unique) factorization
(6) f (x) = c(x − α1 )^{n1} · · · (x − αr )^{nr}
where r is a positive integer, α1 , . . . , αr ∈ F are distinct, 0 6= c ∈ F and n1 , . . . , nr are
positive integers.
Suppose that f (x), g(x) ∈ F [x] are not both zero. A common divisor of f and g in F [x]
is an element h ∈ F [x] such that h divides f and h divides g in F [x]. A greatest common
divisor of f and g in F [x] is an element d ∈ F [x] such that d is a common divisor of f and
g in F [x] and d divides every other common divisor of f and g in F [x].
Theorem 8.8. Suppose that f, g ∈ F [x] are not both zero. Then
1. there exists a unique monic greatest common divisor d(x) = gcd(f, g) of f and g
in F [x].
2. There exist p, q ∈ F [x] such that d = pf + qg.
3. Suppose that K is a field containing F . Then d is a greatest common divisor of f
and g in K[x].
9. Eigenvalues and Eigenvectors
Suppose that F is a field.
Definition 9.1. Suppose that V is a vector space over a field F and L : V → V is a linear
map. An element λ ∈ F is an eigenvalue of L if there exists a nonzero element v ∈ V
such that L(v) = λv. A nonzero vector v ∈ V such that L(v) = λv is called an eigenvector
of L with eigenvalue λ. If λ ∈ F is an eigenvalue of L, then
E(λ) = {v ∈ V | L(v) = λv}
is the eigenspace of λ for L.
Lemma 9.2. Suppose that V is a vector space over a field F , L : V → V is a linear map,
and λ ∈ F is an eigenvalue of L. Then E(λ) is a subspace of V of positive dimension.
Theorem 9.3. Suppose that V is a vector space over a field F , and L : V → V is a linear
map. Let λ1 , . . . , λr ∈ F be distinct eigenvalues of L. Then the subspace of V spanned by
E(λ1 ), . . . , E(λr ) is a direct sum; that is,

E(λ1 ) + E(λ2 ) + · · · + E(λr ) = E(λ1 ) ⊕ E(λ2 ) ⊕ · · · ⊕ E(λr ).
Proof. We prove the theorem by induction on r. The case r = 1 follows from the definition
of direct sum. Now assume that the theorem is true for r − 1 eigenvalues. Suppose that
vi ∈ E(λi ) for 1 ≤ i ≤ r, wi ∈ E(λi ) for 1 ≤ i ≤ r and
(7) v1 + v2 + · · · + vr = w1 + w2 + · · · + wr .
We must show that vi = wi for 1 ≤ i ≤ r. Multiplying equation (7) by λr , we obtain
(8) λr v1 + λr v2 + · · · + λr vr = λr w1 + λr w2 + · · · + λr wr .
Applying L to equation (7) we obtain
(9) λ1 v1 + λ2 v2 + · · · + λr vr = λ1 w1 + λ2 w2 + · · · + λr wr .
Subtracting equation (8) from (9), we obtain
(λ1 − λr )v1 + · · · + (λr−1 − λr )vr−1 = (λ1 − λr )w1 + · · · + (λr−1 − λr )wr−1 .
Now by induction on r, and since λi − λr 6= 0 for 1 ≤ i ≤ r − 1, we get that vi = wi for
1 ≤ i ≤ r − 1. Substituting into equation (7), we then obtain that vr = wr , and we see
that the sum is direct.
Corollary 9.4. Suppose that V is a finite dimensional vector space over a field F , and
L : V → V is linear. Then L has at most dim(V ) distinct eigenvalues.
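Theorem 9.3 implies in particular that eigenvectors belonging to distinct eigenvalues are linearly independent, which is easy to check numerically. A hedged numpy sketch (the matrix is an arbitrary example with two distinct eigenvalues):

import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 5.0]])
eigvals, eigvecs = np.linalg.eig(A)   # columns of eigvecs are eigenvectors

assert len(set(np.round(eigvals, 8))) == 2   # distinct eigenvalues
assert np.linalg.matrix_rank(eigvecs) == 2   # the eigenvectors are independent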
10. Characteristic Polynomials
Suppose that F is a field. We let F [t] be a polynomial ring over F .
Definition 10.1. Suppose that F is a field, and A ∈ Mn,n (F ). The characteristic polynomial of A is the polynomial pA (t) = Det(tIn − A) ∈ F [t].
pA (t) is a monic polynomial of degree n.
Lemma 10.2. Suppose that V is an n-dimensional vector space over a field F , and L :
V → V is a linear map. Let β, β′ be two bases of V . Then

p_{M^β_β (L)}(t) = p_{M^{β′}_{β′}(L)}(t).

Proof. We have that

M^{β′}_{β′}(L) = M^{β}_{β′}(I) M^{β}_{β}(L) M^{β′}_{β}(I).

Let A = M^{β}_{β}(L), B = M^{β′}_{β′}(L) and C = M^{β′}_{β}(I), so that B = C^{−1}AC. Then

p_B (t) = Det(tIn − B) = Det(tIn − C^{−1}AC) = Det(C^{−1}(tIn − A)C)
= Det(C)^{−1} Det(tIn − A) Det(C) = Det(tIn − A) = p_A (t).
Definition 10.3. Suppose that V is an n-dimensional vector space over a field F , and
L : V → V is a linear map. Let β be a basis of V . Define the characteristic polynomial
pL (t) of L to be
pL (t) = p_{M^β_β (L)}(t).
Lemma 10.2 shows that the definition of characteristic polynomial of L is well defined;
it is independent of choice of basis of V .
Theorem 10.4. Suppose that V is an n-dimensional vector space over a field F , and
L : V → V is a linear map. Then the eigenvalues of L are the roots of pL (t) = 0 which
are in F . In particular, we have another proof that L has at most n distinct eigenvalues.
Proof. Let β be a basis of V , and suppose that λ ∈ F . Then the following are equivalent:
1. λ is an eigenvalue of L.
2. There exists a nonzero vector v ∈ V such that Lv = λv.
3. The kernel of λI − L : V → V is nonzero.
4. The solution space to Mββ (λI − L)y = 0 is nonzero.
5. The solution space to (λIn − Mββ (L))y = 0 is nonzero.
6. Det(λIn − Mββ (L)) = 0.
7. pL (λ) = 0.
Definition 10.5. Suppose that V is an n-dimensional vector space over a field F , and
L : V → V is a linear map. L is diagonalizable if there exists a basis β of V such that
Mββ (L) is a diagonal matrix.
Theorem 10.6. Suppose that V is an n-dimensional vector space over a field F , and
L : V → V is a linear map. Then the following are equivalent:
1. L is diagonalizable.
2. There exists a basis of V consisting of eigenvectors of L.
3. Let λ1 , . . . , λr be the distinct eigenvalues of L. Then
dim E(λ1 ) + · · · + dim E(λr ) = n.
Proof. We first prove 1 implies 2. By assumption, there exists a basis β = {v1 , . . . , vn } of
V such that

$$M^\beta_\beta(L) = ((L(v_1))_\beta , \ldots , (L(v_n))_\beta) = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & \lambda_n \end{pmatrix}$$
is a diagonal matrix. Thus L(vi ) = λi vi for 1 ≤ i ≤ n and so β is a basis of V consisting
of eigenvectors of L.
We now prove 2 implies 3. By assumption, there exists a basis β = {v1 , . . . , vn } of V
consisting of eigenvectors of L. Thus if λ1 , . . . , λr are the distinct eigenvalues of L, then
V ⊂ E(λ1 ) + · · · + E(λr ). Thus
V = E(λ1 ) + · · · + E(λr ) = E(λ1 ) ⊕ · · · ⊕ E(λr )
by Theorem 9.3, so
n = dim V = dim E(λ1 ) + · · · + dim E(λr ).
We now prove 3 implies 1. We have that E(λ1 ) + · · · + E(λr ) = E(λ1 ) ⊕ · · · ⊕ E(λr ) by
Theorem 9.3, so that E(λ1 ) ⊕ · · · ⊕ E(λr ) is a subspace of V which has dimension n = dim V .
Thus V = E(λ1 ) ⊕ · · · ⊕ E(λr ).
Let {v^i_1 , . . . , v^i_{s_i}} be a basis of E(λi ) for 1 ≤ i ≤ r. Then

β = {v^1_1 , . . . , v^1_{s_1} , v^2_1 , . . . , v^2_{s_2} , . . . , v^r_1 , . . . , v^r_{s_r}}

is a basis of V = E(λ1 ) ⊕ · · · ⊕ E(λr ). We have that Mββ (L) = ((L(v^1_1 ))β , . . . , (L(v^r_{s_r} ))β ) is
a diagonal matrix.
Lemma 10.7. Suppose that A ∈ Mn,n (F ) is a matrix. Then LA is diagonalizable if and
only if A is similar to a diagonal matrix (there exists an invertible matrix B ∈ Mn,n (F )
and a diagonal matrix D ∈ Mn,n (F ) such that B −1 AB = D).
Proof. Suppose that LA is diagonalizable. Then there exists a basis β = {v1 , . . . , vn } of
F n consisting of eigenvectors of LA . Let λi be the eigenvalue of vi , so that Avi = λi vi for
1 ≤ i ≤ n. We have that

$$D = M^\beta_\beta(L_A) = ((L_A(v_1))_\beta , \ldots , (L_A(v_n))_\beta) = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & \lambda_n \end{pmatrix}$$

is a diagonal matrix. Let β∗ = {E^1 , . . . , E^n } be the standard basis of F n . Then

D = M^{β}_{β}(LA ) = M^{β∗}_{β}(idF n ) M^{β∗}_{β∗}(LA ) M^{β}_{β∗}(idF n ) = B^{−1}AB

where B = M^{β}_{β∗}(idF n ) = (v1 , v2 , . . . , vn ).
Now suppose that B −1 AB = D is a diagonal matrix. Let B = (B^1 , . . . , B^n ) and
β = {B^1 , . . . , B^n }, a basis of F n since B is invertible. Then

M^{β}_{β}(LA ) = M^{β∗}_{β}(idF n ) M^{β∗}_{β∗}(LA ) M^{β}_{β∗}(idF n ) = B^{−1}AB = D

is a diagonal matrix,

$$D = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & \lambda_n \end{pmatrix},$$

so LA (B^i ) = λi B^i for 1 ≤ i ≤ n, and so {B^1 , . . . , B^n } is a basis of eigenvectors of LA , and
so LA is diagonalizable.
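Lemma 10.7 is what a numerical eigendecomposition computes: the columns of B form an eigenvector basis, and B^{-1}AB is diagonal. A hedged numpy sketch (the matrix is an arbitrary diagonalizable example):

import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
eigvals, B = np.linalg.eig(A)    # columns of B are eigenvectors of L_A

D = np.linalg.inv(B) @ A @ B     # change of basis to the eigenvector basis
assert np.allclose(D, np.diag(eigvals))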
11. Triangulation
Let V be a finite dimensional vector space over a field F , of positive dimension n ≥ 1.
Suppose that L : V → V is a linear map. A subspace W of V is L-invariant if L(W ) ⊂ W .
A fan {V1 , . . . , Vn } of L in V is a finite sequence of L-invariant subspaces
V1 ⊂ V2 ⊂ · · · ⊂ Vn = V
such that dim Vi = i for 1 ≤ i ≤ n.
A fan basis (of the fan {V1 , . . . , Vn }) for L is a basis {v1 , . . . , vn } of V such that
{v1 , . . . , vi } is a basis of Vi for 1 ≤ i ≤ n. Given a fan, a fan basis always exists.
Theorem 11.1. Let β = {v1 , . . . , vn } be a fan basis for L. Then the matrix Mββ (L) of L
with respect to the basis β is an upper triangular matrix.
Proof. Let {V1 , . . . , Vn } be the fan corresponding to the fan basis. L(Vi ) ⊂ Vi for all i
implies there exist aij ∈ F such that
L(v1 ) = a11 v1
L(v2 ) = a21 v1 + a22 v2
..
.
L(vn ) = an1 v1 + · · · + ann vn .
Thus
Mββ (L) = ((L(v1 ))β , (L(v2 ))β , . . . , (L(vn ))β ), whose (j, i) entry is aij for j ≤ i and 0
for j > i, is upper triangular.
Definition 11.2. A linear map L : V → V is triangulizable if there exists a basis β of V
such that Mββ (L) is upper triangular. A matrix A ∈ Mn,n (F ) is triangulizable if there exists
an invertible matrix B ∈ Mn,n (F ) such that B −1 AB is upper triangular.
A matrix A is triangulizable if and only if the linear map LA : F n → F n is triangulizable;
if β is a fan basis for LA , then

M^{β}_{β}(LA ) = M^{st}_{β}(I) M^{st}_{st}(LA ) M^{β}_{st}(I) = B^{−1}AB

where I : F n → F n is the identity map, st denotes the standard basis of F n , and
B = M^{β}_{st}(I).
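Over C a triangulating change of basis can be computed in practice with the Schur decomposition, which even yields an orthonormal fan basis. A hedged sketch, assuming scipy is available (the matrix is an arbitrary example):

import numpy as np
from scipy.linalg import schur

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])           # eigenvalues ±i, so we triangulate over C
T, Z = schur(A, output='complex')     # A = Z T Z^H, Z unitary, T upper triangular

assert np.allclose(Z.conj().T @ A @ Z, T)
assert np.allclose(T, np.triu(T))     # T is upper triangular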
Suppose that V is a vector space, and W1 , W2 are subspaces such that V = W1 ⊕ W2 .
Then every element v ∈ V has a unique expression v = w1 + w2 with w1 ∈ W1 and
w2 ∈ W2 . We may thus define a projection π1 : V → W1 by π1 (v) = w1 if v = w1 + w2
with w1 ∈ W1 and w2 ∈ W2 , and define a projection π2 : V → W2 by π2 (v) = w2 for
v = w1 + w2 ∈ V . These projections are linear maps, which depend on both W1 and W2 .
Composing with the inclusion W1 ⊂ V , we can view π1 as a map from V to V . Composing
with the inclusion W2 ⊂ V , we can view π2 as a map from V to V . Then π1 + π2 = IV .
Theorem 11.3. Let V be a nonzero finite dimensional vector space over a field F , and
let L : V → V be a linear map. Suppose that the characteristic polynomial pL (t) factors
into linear factors in F [t] (this will always happen if F is algebraically closed). Then there
exists a fan of L in V , so that L is triangulizable.
Proof. We prove the theorem by induction on n = dim V . If n = 1, then any basis of
V is a fan basis for V . Assume that the theorem is true for linear maps of vector spaces
W over F of dimension less than n = dim V . Since pL (t) splits into linear factors in
F [t], and pL (t) has degree n > 0, pL (t) has a root λ1 in F , which is thus an eigenvalue
of L. Let v1 ∈ V be an eigenvector of L with the eigenvalue λ1 . Let V1 be the one
dimensional subspace of V generated by {v1 }. Extend {v1 } to a basis β = {v1 , v2 , . . . , vn }
of V . Let W = Span{v2 , . . . , vn }. β′ = {v2 , . . . , vn } is a basis of W . V is the direct sum
V = V1 ⊕ W . Let π1 : V → V1 and π2 : V → W be the projections. The composed linear
map is π2 L : W → W . From

$$M^\beta_\beta(L) = \begin{pmatrix} \lambda_1 & * \\ 0 & M^{\beta'}_{\beta'}(\pi_2 L) \end{pmatrix},$$
we calculate pL (t) = (t − λ1 )pπ2 L (t). Since pL (t) factors into linear factors in F [t], pπ2 L (t)
also factors into linear factors in F [t].
By induction, there exists a fan of π2 L in W , say {W1 , . . . , Wn−1 }. Let Vi = V1 + Wi−1
for 2 ≤ i ≤ n. The subspaces Vi form a chain
V1 ⊂ V2 ⊂ · · · ⊂ Vn = V.
Let {u1 , . . . , un−1 } be a fan basis for π2 L, so that {u1 , . . . , uj } is a basis of Wj for 1 ≤ j ≤
n − 1. Then {v1 , u1 , . . . , ui−1 } is a basis of Vi . We will show that {V1 , . . . , Vn } is a fan of
L in V , with fan basis {v1 , u1 , . . . , un−1 }.
We have that L = IL = (π1 + π2 )L = π1 L + π2 L. Let v ∈ Vi . We have an expression
v = cv1 + wi−1 with c ∈ F and wi−1 ∈ Wi−1 . π1 L(v) = π1 (L(v)) ∈ V1 ⊂ Vi . π2 L(v) =
π2 L(cv1 ) + π2 L(wi−1 ). Now π2 L(cv1 ) = cπ2 L(v1 ) = cπ2 (λ1 v1 ) = 0 and π2 L(wi−1 ) ∈ Wi−1
since {W1 , . . . , Wn−1 } is a fan of π2 L in W . Thus π2 L(v) ∈ Wi−1 and so L(v) ∈ Vi . Thus
Vi is L-invariant, and we have shown that {V1 , . . . , Vn } is a fan of L in V .
12. The minimal polynomial of a linear operator
Suppose that L : V → V is a linear map, where V is a finite dimensional vector space
over a field F . Let LF (V, V ) be the F -vector space of linear maps from V to V . LF (V, V )
is a ring with multiplication given by composition of maps. Let I : V → V be the identity
map, which is the multiplicative identity of the ring LF (V, V ). There is a natural inclusion
of the field F as a subring (and subvector space) of LF (V, V ), obtained by mapping λ ∈ F
to λI ∈ LF (V, V ). Let F [L] be the subring of LF (V, V ) generated by F and L.
F [L] = {a0 I + a1 L + · · · + ar Lr | r ∈ N and a0 , . . . , ar ∈ F }.
F [L] is a commutative ring. By the universal property of polynomial rings, there is a
surjective ring homomorphism ϕ from the polynomial ring F [t] onto F [L], obtained by
mapping f (t) ∈ F [t] to f (L). ϕ is also a vector space homomorphism. Since F [L] is
a subspace of LF (V, V ), which is an F -vector space of dimension n2 , F [L] is a finite
dimensional vector space. Thus the kernel of ϕ is nonzero. Let mL (t) be the monic
generator of the kernel of ϕ. We have an isomorphism
F [L] ≅ F [t]/(mL (t)).
mL (t) is the minimal polynomial of L.
We point out here that if g(t), h(t) ∈ F [t] are polynomials, then since F [L] is a commutative ring, we have equality of composition of operators
g(L) ◦ h(L) = h(L) ◦ g(L).
We generally write g(L)h(L) = h(L)g(L) to denote the composition of operators. If v ∈ V ,
we will also write Lv for L(v) and f (L)v for f (L)(v).
The above theory also works for matrices. The set Mn,n (F ) of n × n matrices is a vector space
over F , and is also a ring with matrix multiplication. F is embedded as a subring by the
map λ → λIn , where In denotes the n × n identity matrix. Given A ∈ Mn,n (F ), F [A] is a
commutative subring, and there is a natural surjective ring homomorphism F [t] → F [A].
The kernel is generated by a monic polynomial mA (t) which is the minimal polynomial of
A.
Now suppose that V is a finite dimensional vector space over F , and β is a basis of V .
Then the map LF (V, V ) → Mn,n (F ) defined by ϕ → Mββ (ϕ) for ϕ ∈ LF (V, V ) is a vector
space isomorphism, and a ring isomorphism.
Suppose that L : V → V is a linear map, and let A = Mββ (L). Then we have an induced
isomorphism of F [L] with F [A]. For a polynomial f (x) ∈ F [x], we have that the image of
f (L) in F [A] is
Mββ (f (L)) = f (Mββ (L)) = f (A).
Thus we have that mA (t) = mL (t).
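In practice mL (t) = mA (t) can be found by locating the first power of A that depends linearly on the lower powers. A hedged numpy sketch in floating point (exact arithmetic would be used in earnest; the helper name is ours):

import numpy as np

def minimal_polynomial(A, tol=1e-8):
    # Return coefficients c_0, ..., c_{d-1}, 1 of the monic m_A(t), found as
    # the first linear dependence among the flattened matrices I, A, A^2, ...
    n = A.shape[0]
    powers = [np.eye(n).flatten()]
    for d in range(1, n + 1):
        powers.append(np.linalg.matrix_power(A, d).flatten())
        M = np.column_stack(powers[:-1])
        coef, *_ = np.linalg.lstsq(M, powers[-1], rcond=None)
        if np.linalg.norm(M @ coef - powers[-1]) < tol:
            return np.append(-coef, 1.0)   # A^d = sum_i coef_i A^i
    raise AssertionError("unreachable: deg m_A <= n by Hamilton-Cayley")

# A projection P satisfies P^2 = P, so m_P(t) = t^2 - t.
P = np.array([[1.0, 0.0],
              [0.0, 0.0]])
assert np.allclose(minimal_polynomial(P), [0.0, -1.0, 1.0])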
Lemma 12.1. Suppose that A is an n × n matrix with coefficients in a field F , and K is
an extension field of F . Let f (x) be the minimal polynomial of A over F , and let g(x) be
the minimal polynomial of A over K. Then f (x) = g(x).
Proof. Let a1 , a2 , . . . , ar ∈ K be elements of K which are linearly independent over F . We
will use the following observation. Suppose that A1 , . . . , Ar ∈ Mnn (F ) and a1 A1 + · · · +
ar Ar = 0 in Mnn (K). Then since each coefficient of a1 A1 + · · · + ar Ar is zero, we have
that A1 = A2 = · · · = Ar = 0.
Write g(x) = x^e + b_{e−1} x^{e−1} + · · · + b_0 with bi ∈ K. Let {a1 = 1, a2 , . . . , ar } be a basis of
the subspace of K (recall that K is a vector space over F ) spanned by {1, b_{e−1} , . . . , b_0 }.
Expand b_i = \sum_{j=1}^{r} c_{ij} a_j with c_{ij} ∈ F . Let f_1 (x) = x^e + c_{e−1,1} x^{e−1} + · · · + c_{0,1} and
f_j (x) = c_{e−1,j} x^{e−1} + · · · + c_{0,j} for j ≥ 2. f_i (x) ∈ F [x] for all i, and g(x) = \sum_{j=1}^{r} a_j f_j (x).
We have that 0 = g(A) = \sum_{i=1}^{r} a_i f_i (A), which implies that f_i (A) = 0 for all i, as observed
above. Thus f (x) divides f_i (x) in F [x] for all i, and then f (x) divides g(x) in K[x]. Since
f (A) = 0, g(x) divides f (x) in K[x], so f and g generate the same ideal, so f = g is the
unique monic generator.
13. The theorem of Hamilton-Cayley
Theorem 13.1. Let V be a nonzero finite dimensional vector space over an algebraically
closed field F , and let L : V → V be a linear map. Let pL (t) be its characteristic polynomial. Then pL (L) = 0.
Here “pL (L) = 0” means that pL (L) is the zero map on V .
Proof. There exists a fan {V1 , . . . , Vn } of L in V , with associated fan basis β = {v1 , . . . , vn }.
The matrix Mββ (L) ∈ Mn,n (F ) is upper triangular. Write Mββ (L) = (aij ). Then
pL (t) = Det(tIn − Mββ (L)) = (t − a11 )(t − a22 ) · · · (t − ann ).
We shall prove by induction on i that
(L − a11 I) · · · (L − aii I)v = 0
for all v ∈ Vi .
We first prove this in the case i = 1. Then (L − a11 I)v = Lv − a11 v = 0 for v ∈ V1 ,
since V1 is in the eigenspace for the eigenvalue a11 .
Now assume that i > 1, and that
(L − a11 I) · · · (L − ai−1,i−1 I)v = 0
for all v ∈ Vi−1 . Suppose that v ∈ Vi . Then v = v 0 + cvi for some v 0 ∈ Vi−1 and c ∈ F .
Let z = (L − aii I)v 0 . z ∈ Vi−1 because L(Vi−1 ) ⊂ Vi−1 and aii v 0 ∈ Vi−1 . By induction,
(L − a11 I) · · · (L − ai−1,i−1 I)(L − aii I)v 0 = (L − a11 I) · · · (L − ai−1,i−1 I)z = 0.
We have (L − aii I)cvi ∈ Vi−1 , since L(vi ) = ai1 v1 + ai2 v2 + · · · + aii vi . Thus
(L − a11 I) · · · (L − ai−1,i−1 I)(L − aii I)cvi = 0,
and thus
(L − a11 I) · · · (L − ai,i I)v = 0.
Corollary 13.2. Let F be an algebraically closed field and suppose that A ∈ Mn,n (F ).
Let pA (t) be its characteristic polynomial. Then pA (A) = 0.
Here “pA (A) = 0” means that pA (A) is the zero matrix.
Proof. Let L = LA : F n → F n . Then 0 = pL (L). Let β be the standard basis of F n .
Since A = Mββ (L), we have that pL (t) = pA (t) ∈ F [t]. Now
pA (A) = pL (Mββ (L)) = Mββ (pL (L)) = Mββ (0) = 0
where the right most zero in the above equation denotes the n × n zero matrix.
Corollary 13.3. Suppose that F is a field, and A ∈ Mn,n (F ). Then pA (A) = 0.
Proof. Let K be an algebraically closed field containing F . From the natural inclusion
Mn,n (F ) ⊂ Mn,n (K) we may regard A as an element of Mn,n (K). The computation
pA (t) = Det(tIn − A) does not depend on the field F or K. By Corollary 13.2, pA (A) =
0.
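Corollary 13.3 is easy to test numerically by evaluating pA at A. A hedged numpy sketch (np.poly returns the coefficients of the characteristic polynomial, highest degree first; the matrix is an arbitrary example):

import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
c = np.poly(A)   # p_A(t) = t^2 - 5t - 2 for this A

n = A.shape[0]
pA_of_A = sum(c[k] * np.linalg.matrix_power(A, n - k) for k in range(n + 1))
assert np.allclose(pA_of_A, np.zeros((n, n)))   # p_A(A) = 0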
Theorem 13.4. (Hamilton-Cayley) Suppose that V is a nonzero finite dimensional vector
space over a field F and L : V → V is a linear map. Then pL (L) = 0.
Proof. Let β be a basis of V , and let A = Mββ (L) ∈ Mn,n (F ). We have pL (t) = pA (t). By
Corollary 13.3, pA (A) = 0. Now
Mββ (pL (L)) = pL (Mββ (L)) = pA (A) = 0.
Thus pL (L) = 0.
Corollary 13.5. Suppose that V is a nonzero finite dimensional vector space over a
field F and L : V → V is a linear map. Then the minimal polynomial mL (t) divides the
characteristic polynomial pL (t) in F [t].
Proof. By the Hamilton-Cayley Theorem, pL (t) is in the kernel of the homomorphism
F [t] → F [L] which takes t to L and is the identity on F . Since mL (t) is a generator for
this ideal, mL (t) divides pL (t).
14. Invariant Subspaces
Suppose that V is a finite dimensional vector space over a field F , and L : V → V is
a linear map. Suppose that W1 , . . . , Wr are L-invariant subspaces of V such that V =
W1 ⊕ · · · ⊕ Wr . Let βi = {vi,1 , . . . , vi,σ(i) } be bases of Wi , and let

β = {v1,1 , . . . , v1,σ(1) , v2,1 , . . . , vr,σ(r) }
which is a basis of V . Since the Wi are L-invariant, the matrix of L with respect to β is
a block matrix

$$M^\beta_\beta(L) = \begin{pmatrix} A_1 & 0 & \cdots & 0 \\ 0 & A_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & A_r \end{pmatrix}$$

where Ai = M^{βi}_{βi}(L|Wi ) are σ(i) × σ(i) matrices.
Lemma 14.1. Suppose that V is a vector space over a field F and L : V → V is a linear
map. Suppose that f (t) ∈ F [t] and let W be the kernel of f (L) : V → V . Then W is an
L-invariant subspace of V .
Proof. Let x ∈ W . Then f (L)Lx = Lf (L)x = L0 = 0. Thus Lx ∈ W .
Theorem 14.2. Let f (t) ∈ F [t]. Suppose that f = f1 f2 with f1 , f2 ∈ F [t], deg f1 ≥ 1,
deg f2 ≥ 1 and gcd(f1 , f2 ) = 1. Suppose that V is a vector space over a field F , L : V → V
is a linear map and f (L) = 0. Let W1 = kernel f1 (L) and W2 = kernel f2 (L). Then
V = W1 ⊕ W2 .
Proof. Since gcd(f1 , f2 ) = 1, there exist g1 , g2 ∈ F [t] such that 1 = g1 (t)f1 (t) + g2 (t)f2 (t),
so that
(10)
g1 (L)f1 (L) + g2 (L)f2 (L) = I
is the identity map of V . Let x ∈ V .
x = g1 (L)f1 (L)x + g2 (L)f2 (L)x.
g1 (L)f1 (L)x ∈ W2 since
f2 (L)g1 (L)f1 (L)x = g1 (L)f1 (L)f2 (L)x = g1 (L)f (L)x = g1 (L)0 = 0.
Similarly, g2 (L)f2 (L)x ∈ W1 . Thus V = W1 +W2 . To show that the sum is direct, we must
show that an expression x = w1 + w2 with w1 ∈ W1 and w2 ∈ W2 is uniquely determined
by x. Apply g1 (L)f1 (L) to this expression to get
g1 (L)f1 (L)x = g1 (L)f1 (L)w1 + g1 (L)f1 (L)w2 = 0 + g1 (L)f1 (L)w2 .
Now apply (10) to w2 , to get
w2 = g1 (L)f1 (L)w2 + g2 (L)f2 (L)w2 = g1 (L)f1 (L)w2 + 0,
which implies that w2 = g1 (L)f1 (L)x is uniquely determined by x. Similarly, w1 =
g2 (L)f2 (L)x is uniquely determined by x.
Theorem 14.3. Let V be a vector space over a field F , and let L : V → V be a linear
map. Suppose f (t) ∈ F [t] satisfies f (L) = 0. Suppose that f (t) = f1 (t)f2 (t) · · · fr (t) where
fi (t) ∈ F [t] and gcd(fi , fj ) = 1 if i ≠ j. Let Wi = kernel fi (L) for 1 ≤ i ≤ r. Then
V = W1 ⊕ W2 ⊕ · · · ⊕ Wr .
Proof. We prove the theorem by induction on r, the case r = 1 being trivial. We have that
gcd(f1 , f2 f3 · · · fr ) = 1 since gcd(f1 , fi ) = 1 for 2 ≤ i ≤ r. Theorem 14.2 thus implies
V = W1 ⊕ W , where W = kernel f2 (L)f3 (L) · · · fr (L). fj (L) : W → W for 2 ≤ j ≤ r.
By induction on r, we have that W = U2 ⊕ · · · ⊕ Ur , where for j ≥ 2, Uj is the kernel of
fj (L) : W → W . Thus V = W1 ⊕ U2 ⊕ · · · ⊕ Ur .
We will prove that Wj = Uj for j ≥ 2, which will establish the theorem. We have
that Uj ⊂ Wj . Let v ∈ Wj , with j ≥ 2. Then v ∈ kernel fj (L) implies v ∈ W . Thus
v ∈ Wj ∩ W = Uj .
Corollary 14.4. Suppose V is a vector space over a field F . Let L : V → V be a linear
map. Suppose that a monic polynomial f (t) ∈ F [t] satisfies f (L) = 0, and f (t) splits into
linear factors in F [t] as
f (t) = (t − α1 )m1 · · · (t − αr )mr
where α1 , . . . , αr ∈ F are distinct. Let Wi = kernel(L − αi I)^{mi} . Then
V = W1 ⊕ W2 ⊕ · · · ⊕ Wr .
As an application of this Corollary, we consider the following example.
Let A(C) be the set of analytic functions on C, and let D = d/dx. Suppose that n ∈ N and a0 , a1 , . . . , an−1 ∈ C. Let
V = {f ∈ A(C) | D^n f + a_{n−1} D^{n−1} f + · · · + a0 f = 0}.
Then V is a complex vector space. Let
p(t) = t^n + a_{n−1} t^{n−1} + · · · + a0 ∈ C[t].
Suppose that p(t) factors as
p(t) = (t − α1 )^{m1} · · · (t − αr )^{mr}
with α1 , . . . , αr ∈ C distinct. Then by Corollary 14.4, we have that
V = W1 ⊕ · · · ⊕ Wr
where
Wi = {f ∈ A(C) | (D − αi I)^{mi} f = 0}.
Now from the following Lemma, we are able to construct a basis of V . In fact, we see that
{e^{α1 x}, xe^{α1 x}, . . . , x^{m1 −1} e^{α1 x}, e^{α2 x}, . . . , x^{mr −1} e^{αr x}}
is a basis of V .
Lemma 14.5. Let α ∈ C, and m ∈ N. Let
W = {f ∈ A(C) | (D − αI)^m f = 0}.
Then
{e^{αx}, xe^{αx}, . . . , x^{m−1} e^{αx}}
is a basis of W .
Proof. It can be verified by induction on m that
(D − αI)^m f = e^{αx} D^m (e^{−αx} f)
for f ∈ A(C). Thus f ∈ W if and only if D^m (e^{−αx} f) = 0. The functions whose m-th derivatives are zero are the polynomials of degree ≤ m − 1. Hence the space of solutions to (D − αI)^m f = 0 is the space generated by {e^{αx}, xe^{αx}, . . . , x^{m−1} e^{αx}}. It remains to verify that these functions are linearly independent. Suppose that there exist ci ∈ C such that
c0 e^{αs} + c1 s e^{αs} + · · · + c_{m−1} s^{m−1} e^{αs} = 0
for all s ∈ C. Let
q(t) = c0 + c1 t + · · · + c_{m−1} t^{m−1} ∈ C[t].
We have q(s)e^{αs} = 0 for all s ∈ C. But e^{αs} ≠ 0 for all s ∈ C, so q(s) = 0 for all s ∈ C. Since C is infinite, and the equation q(t) = 0 has at most a finite number of roots if q(t) ≠ 0, we must have q(t) = 0. Thus ci = 0 for 0 ≤ i ≤ m − 1.
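The first step of this proof can also be checked directly. The following is a sketch, assuming sympy: it applies (D − αI) m times to x^k e^{αx} for k < m and verifies that the result is 0.

```python
from sympy import symbols, exp, diff, simplify

x, a = symbols('x alpha')
m = 3
for k in range(m):
    f = x**k * exp(a * x)
    for _ in range(m):
        f = diff(f, x) - a * f  # apply (D - alpha I) once
    print(simplify(f))          # prints 0 for each k = 0, 1, 2
```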
15. Cyclic Vectors
Definition 15.1. Let V be a finite dimensional vector space over a field F , and let L :
V → V be a linear map. V is called L-cyclic if there exists v ∈ V such that
V = F [L]v = {f (L)v | f (t) ∈ F [t]}.
We will say that the vector v is L-cyclic.
Lemma 15.2. Let V be a finite dimensional vector space over a field F , and let L : V → V
be a linear map. Suppose that V is L-cyclic and v ∈ V is an L-cyclic vector. Let d be the
degree of the minimal polynomial mL (t) ∈ F [t] of L. Then {v, Lv, . . . , Ld−1 v} is a basis
of V .
Proof. Suppose w ∈ V . Then w = f (L)v for some f (t) ∈ F [t]. We have f (t) =
p(t)mL (t) + r(t) with p(t), r(t) ∈ F [t] and deg r(t) < d. Thus w = f (L)v = r(L)v ∈
Span{v, Lv, . . . , Ld−1 v}. This shows that V = Span{v, Lv, . . . , Ld−1 v}. Suppose that
there is a dependence relation
c0 v + c1 Lv + · · · + cs Ls v = 0
where 0 ≤ s ≤ d − 1, all ci ∈ F and cs ≠ 0. Let f (t) = c0 + c1 t + · · · + cs t^s ∈ F [t]. Let w ∈ V . Then there exists g(t) ∈ F [t] such that w = g(L)v. Then
f (L)w = f (L)g(L)v = g(L)f (L)v = g(L)0 = 0.
Thus the operator f (L) = 0, which implies that mL (t) divides f (t), which is impossible
since f (t) is nonzero and has degree less than d. Thus {v, Lv, . . . , Ld−1 v} are linearly
independent, and a basis of V .
With the hypotheses of Lemma 15.2, express mL (t) = a0 + a1 t + · · · + a_{d−1} t^{d−1} + t^d with the ai ∈ F . Let β = {v, Lv, . . . , L^{d−1} v}, a basis of V . Then
$$M_\beta^\beta(L) = \begin{pmatrix}
0 & 0 & \cdots & 0 & 0 & -a_0 \\
1 & 0 & \cdots & 0 & 0 & -a_1 \\
0 & 1 & \cdots & 0 & 0 & -a_2 \\
\vdots & & \ddots & & & \vdots \\
0 & 0 & \cdots & 1 & 0 & -a_{d-2} \\
0 & 0 & \cdots & 0 & 1 & -a_{d-1}
\end{pmatrix}.$$
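This matrix is the companion matrix of mL (t), and its characteristic polynomial recovers mL (t). The following is a sketch, assuming numpy, with the arbitrary example mL (t) = t^3 − 2t + 5.

```python
import numpy as np

a = [5.0, -2.0, 0.0]        # a_0, a_1, a_2 for m_L(t) = t^3 - 2t + 5
d = len(a)
C = np.zeros((d, d))
C[1:, :-1] = np.eye(d - 1)  # 1's on the subdiagonal, so C e_i = e_{i+1}
C[:, -1] = [-c for c in a]  # the last column carries -a_0, ..., -a_{d-1}

print(np.poly(C))           # [1, 0, -2, 5] up to rounding: m_L(t) again
```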
16. Nilpotent Operators
Suppose that V is a vector space, and L : V → V is a linear map. We will say that L is
nilpotent if there exists a positive integer r such that Lr = 0. If r is the smallest positive
integer such that Lr = 0, then mL (t) = tr .
Theorem 16.1. Let V be a finite dimensional vector space over a field F , and let L : V →
V be a nilpotent linear map. Then V is a direct sum of L-invariant L-cyclic subspaces.
Proof. We prove the theorem by induction on dim V . If V has dimension 1, then V
is L-cyclic. Assume that the theorem is true for vector spaces of dimension less than
s = dim V .
Let r be the smallest integer such that Lr = 0, so that the minimal polynomial of L is
mL (t) = t^r . Since L^{r−1} ≠ 0, there exists w ∈ V such that L^{r−1} w ≠ 0. Let v = L^{r−1} w. Then Lv = 0, so Kernel L ≠ 0. Now
dim L(V ) + dim Kernel L = dim V
implies dim L(V ) < dim V .
By induction on s = dim V , L(V ) is a direct sum of L-invariant subspaces which are L-cyclic, say
L(V ) = W1 ⊕ · · · ⊕ Wm .
Let wi ∈ Wi be an L|Wi -cyclic vector, so that if ri is the degree of the minimal polynomial
mL|Wi (t), then Wi has a basis {wi , Lwi , . . . , Lri −1 wi } by Lemma 15.2. Since L|Wi is
nilpotent, mL|Wi (t) = tri . Let vi ∈ V be such that Lvi = wi .
Let Vi be the cyclic subspace F [L]vi of V . Lri +1 vi = Lri wi = 0 implies that {vi , Lvi , . . . , Lri vi }
spans Vi . Suppose that
d0 vi + d1 Lvi + · · · + dri Lri vi = 0
with d0 , . . . , dri ∈ F . Then
0 = L(0) = L(d0 vi + d1 Lvi + · · · + dri Lri vi )
= d0 Lvi + · · · + dri −1 Lri vi + dri Lri +1 vi
= d0 wi + · · · + dri −1 Lri −1 wi .
Thus d0 = · · · = d_{ri −1} = 0, since {wi , . . . , L^{ri −1} wi } is a basis of Wi . Now since L^{ri} vi = L^{ri −1} wi ≠ 0, we have d_{ri} = 0. Thus {vi , Lvi , . . . , L^{ri} vi } are linearly independent, and are
thus a basis of Vi .
We will prove that the subspace V ′ = V1 + · · · + Vm of V is a direct sum. We have to prove that if
(11) 0 = u1 + · · · + um
with ui ∈ Vi , then ui = 0 for all i. Since ui ∈ Vi , we have an expression ui = fi (L)vi where fi (t) ∈ F [t] is a polynomial. Thus (11) becomes
(12) f1 (L)v1 + · · · + fm (L)vm = 0.
Apply L to (12), and recall that Lfi (L) = fi (L)L, to get
f1 (L)w1 + · · · + fm (L)wm = 0.
Now W1 + · · · + Wm is a direct sum decomposition of L(V ) by L-invariant subspaces, so
fi (L)wi = 0 (for all i with 1 ≤ i ≤ m) which implies that fi (L)(Wi ) = 0, since Wi is
wi -cyclic. Thus mL|Wi (t) = tri divides fi (t). In particular, t divides fi (t) in F [t]. Write
fi (t) = gi (t)t for some polynomial gi (t). Then fi (L) = gi (L)L. (12) implies
g1 (L)w1 + · · · + gm (L)wm = 0.
Thus gi (L)wi = 0 for all i, since L(V ) is the direct sum of the Wi . This implies that t^{ri} divides gi (t) in F [t], so that t^{ri +1} divides fi (t), which implies that ui = fi (L)vi = 0. This proves that V ′ is the direct sum V ′ = V1 ⊕ · · · ⊕ Vm .
An element of L(V ) is of the form
f1 (L)w1 + · · · + fm (L)wm = f1 (L)Lv1 + · · · + fm (L)Lvm
= L(f1 (L)v1 + · · · + fm (L)vm )
for some polynomials fi (t) ∈ F [t]. Thus L(V ) ⊂ L(V ′ ). Since V ′ ⊂ V , we have L(V ′ ) ⊂ L(V ), and so L(V ′ ) = L(V ). Now let v ∈ V . Then Lv = Lv ′ for some v ′ ∈ V ′ , and L(v − v ′ ) = 0. Thus v = v ′ + (v − v ′ ) ∈ V ′ + Kernel L. We conclude that V = V ′ + Kernel L (which may not be a direct sum).
Let β ′ = {L^j vi | 1 ≤ i ≤ m, 0 ≤ j ≤ ri } be the basis we have constructed of V ′ . We extend β ′ to a basis of V by using elements of Kernel L, to get a basis β = {β ′ , z1 , . . . , ze } of V where z1 , . . . , ze ∈ Kernel L. Each zj satisfies Lzj = 0, so zj is an eigenvector of L,
and the one dimensional space generated by zj is L-invariant and cyclic. Let this subspace be Zj . We have
V = V ′ ⊕ Z1 ⊕ · · · ⊕ Ze = V1 ⊕ · · · ⊕ Vm ⊕ Z1 ⊕ · · · ⊕ Ze ,
expressing V as a direct sum of L-cyclic subspaces.
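The following is a small sketch, assuming numpy, illustrating the conclusion in the simplest case of a single cyclic summand: for the nilpotent matrix N below, the iterates of v = e1 already form a basis, so V itself is N-cyclic.

```python
import numpy as np

N = np.array([[0., 0., 0.],
              [1., 0., 0.],
              [0., 1., 0.]])  # N^2 != 0 and N^3 = 0, so m_N(t) = t^3
v = np.array([1., 0., 0.])
B = np.column_stack([v, N @ v, N @ N @ v])
print(np.linalg.matrix_rank(B))                      # 3: {v, Nv, N^2 v} is a basis
print(np.allclose(np.linalg.matrix_power(N, 3), 0))  # True
```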
17. Jordan Form
Definition 17.1. Suppose that V is a finite dimensional vector space over a field F , and
L : V → V is a linear map. A basis β of V is a Jordan basis for L if the matrix Mββ (L)
is a block matrix
$$J = M_\beta^\beta(L) = \begin{pmatrix} J_1 & 0 & \cdots & 0 \\ 0 & J_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & J_m \end{pmatrix},$$
where each Ji is a Jordan block; that is, a matrix of the form
$$J_i = \begin{pmatrix}
\alpha_i & 1 & 0 & \cdots & 0 & 0 \\
0 & \alpha_i & 1 & \cdots & 0 & 0 \\
0 & 0 & \alpha_i & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & \alpha_i & 1 \\
0 & 0 & 0 & \cdots & 0 & \alpha_i
\end{pmatrix}$$
for some αi ∈ F .
A matrix of the form J is called a Jordan form.
Theorem 17.2. Suppose that V is a finite dimensional vector space over a field F , and
L : V → V is a linear map. Suppose that the minimal polynomial mL (t) splits into linear
factors in F [t]. Then V has a Jordan basis for L.
Proof. By assumption, we have a factorization mL (t) = (t − α1 )a1 · · · (t − αs )as with
α1 , . . . , αs ∈ F distinct. Let
Vi = {v ∈ V | (L − αi I)ai v = 0}
for 1 ≤ i ≤ s. We proved in Lemma 14.1 and Corollary 14.4 that the Vi are L-invariant subspaces of V and V = V1 ⊕ · · · ⊕ Vs . It thus suffices to prove the theorem in the case
that there exists α ∈ F and r ∈ N such that (L − αI)r = 0. In this case we apply Theorem
16.1 to the nilpotent operator T = L − αI, to show that V is a direct sum of T -cyclic
subspaces. Since a T -invariant subspace is L-invariant, it suffices to prove the theorem in
the case that T is nilpotent, and V is T -cyclic. Let v be a T -cyclic vector. There exists
a positive integer r such that the minimal polynomial of T is mT (t) = tr , and by Lemma
15.2, β = {T^{r−1} v, . . . , T v, v} is a basis of V . Recalling that T = L − αI, we have
L(L − αI)^{r−1} v = α(L − αI)^{r−1} v,
L(L − αI)^{r−2} v = (L − αI)^{r−1} v + α(L − αI)^{r−2} v,
⋮
Lv = (L − αI)v + αv.
Thus the matrix of L with respect to the basis β is the Jordan block
$$M_\beta^\beta(L) = \begin{pmatrix}
\alpha & 1 & 0 & \cdots & 0 & 0 \\
0 & \alpha & 1 & \cdots & 0 & 0 \\
0 & 0 & \alpha & \cdots & 0 & 0 \\
\vdots & & & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & \alpha & 1 \\
0 & 0 & 0 & \cdots & 0 & \alpha
\end{pmatrix}.$$
The proof of the following theorem reduces the calculation of the Jordan form of an
operator L to a straightforward calculation, if we are able to determine the eigenvalues
(roots of the characteristic equation) of L.
Theorem 17.3. Suppose that V is a finite dimensional vector space over a field F , and
L : V → V is a linear map. Suppose that the minimal polynomial mL (t) splits into linear
factors in F [t]. For α ∈ F and i ∈ N, define
si (α) = dim Kernel (L − αI)i .
Then the Jordan form of L is uniquely determined, up to permutation of the Jordan blocks, by the si (α).
Proof. Suppose that α ∈ F . Let si = si (α) for 1 ≤ i ≤ n = dim V . Suppose that β is a Jordan basis of V . Let
$$A = M_\beta^\beta(L) = \begin{pmatrix} J_1 & 0 & \cdots & 0 \\ 0 & J_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & J_s \end{pmatrix}$$
where the
$$J_i = \begin{pmatrix} \alpha_i & 1 & 0 & \cdots & 0 \\ 0 & \alpha_i & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \alpha_i \end{pmatrix}$$
are Jordan blocks. Suppose that Ji has size ni for 1 ≤ i ≤ s. We have that
dim Kernel (L − αI)^i = dim Kernel L_{M_\beta^\beta((L−αI)^i)} = dim Kernel L_{(A−αI_n)^i} ,
where In is the n × n identity matrix and L_{(A−αI_n)^i} is the associated linear map from F^n to F^n . We have that
$$(A - \alpha I_n)^i = \begin{pmatrix} (J_1 - \alpha I_{n_1})^i & 0 & \cdots & 0 \\ 0 & (J_2 - \alpha I_{n_2})^i & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & (J_s - \alpha I_{n_s})^i \end{pmatrix}.$$
Now if αj ≠ α, then
$$(J_j - \alpha I_{n_j})^i = \begin{pmatrix} (\alpha_j - \alpha)^i & * & \cdots & * \\ 0 & (\alpha_j - \alpha)^i & \cdots & * \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & (\alpha_j - \alpha)^i \end{pmatrix},$$
and for 1 ≤ i ≤ nj − 1, (Jj − αj I_{nj})^i is the matrix with 1's on the i-th superdiagonal and 0's elsewhere, so that the 1 in the first row occurs in the (i + 1)-st column. We also have that (Jj − αj I_{nj})^{nj} = 0. Thus
$$\dim \operatorname{Kernel} L_{(J_j - \alpha I_{n_j})^i} = \begin{cases} 0 & \text{if } \alpha_j \neq \alpha, \\ \min\{i, n_j\} & \text{if } \alpha_j = \alpha. \end{cases}$$
Let ri be the number of Jordan blocks Jj of size nj = i with αj = α. Then
s1 = r1 + r2 + · · · + rn
⋮
si = r1 + 2r2 + 3r3 + · · · + (i − 1)r_{i−1} + i(ri + · · · + rn )
⋮
sn = r1 + 2r2 + 3r3 + · · · + nrn .
From the above system we obtain r1 = 2s1 − s2 , ri = 2si − si+1 − si−1 for 2 ≤ i ≤ n − 1 and
rn = sn − sn−1 . Since the Jordan form of L is determined up to permutation of the Jordan
blocks by the knowledge of the ri = ri (α) for α ∈ F , and the si = si (α) are completely
determined by L, the Jordan form of L is completely determined up to permutation of
Jordan blocks.
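The proof translates directly into a computation. The following is a sketch, assuming sympy; the matrix A is an illustrative example with the single eigenvalue α = 2. The si (α) are computed exactly by rank-nullity, and the block counts ri are recovered from the formulas above.

```python
from sympy import Matrix, eye

A = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 2]])  # one 2x2 and one 1x1 Jordan block for alpha = 2
alpha, n = 2, 3

# s_i = dim Kernel (A - alpha I)^i, with s_0 = 0
s = [0] + [n - ((A - alpha * eye(n)) ** i).rank() for i in range(1, n + 1)]
s.append(s[n])  # the kernels stabilize, so s_{n+1} = s_n

r = [2 * s[i] - s[i + 1] - s[i - 1] for i in range(1, n + 1)]
print(r)        # [1, 1, 0]: one block of size 1, one block of size 2
```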
18. Exercises
Suppose that V is a finite dimensional vector space over an algebraically closed field F ,
and L : V → V is a linear map.
1. If L is nilpotent and not the zero map, show that L is not diagonalizable.
2. Show that L can be written in the form L = D + N where D : V → V is
diagonalizable, N : V → V is nilpotent, and DN = N D.
3. Give a formula for the minimal polynomial of L, in terms of its Jordan form.
4. Let pL (t) be the characteristic polynomial of L, and suppose that it has a factorization
pL (t) = \prod_{i=1}^{r} (t − αi )^{mi}
where α1 , . . . , αr are distinct. Let f (t) be a polynomial. Express the characteristic polynomial p_{f(L)} (t) as a product of factors of degree 1.
19. Inner Products
For z ∈ C, let \bar{z} denote the complex conjugate of z. Then z ∈ R if and only if z = \bar{z}.
Suppose that V is a vector space over F , where F = R or F = C. An inner product on V is a (respectively, real or complex) valued function on V × V such that for v1 , v2 , w ∈ V and α1 ∈ F :
1. < v1 + v2 , w > = < v1 , w > + < v2 , w >,
2. < α1 v1 , w > = α1 < v1 , w >,
3. < v, w > = \overline{< w, v >},
4. < w, w > ≥ 0 and < w, w > = 0 if and only if w = 0.
An inner product space is a vector space with an inner product. A complex inner product is often called an Hermitian product.
A real inner product is a positive definite, symmetric bilinear form. However, an Hermitian inner product is not a bilinear form, since it is conjugate linear in the second variable. In spite of this difference, the theory for real and complex inner products is very similar.
From now on in this section, suppose that V is an inner product space, with an inner
product <, >.
For v ∈ V , we define the norm of v by
||v|| = \sqrt{< v, v >}.
From the definition of an inner product, we obtain the polarization identities. If V is a real inner product space, then for v, w ∈ V , we have
(13) < v, w > = \frac{1}{4}||v + w||^2 − \frac{1}{4}||v − w||^2 .
If V is an Hermitian inner product space, then for v, w ∈ V , we have
(14) < v, w > = \frac{1}{4}||v + w||^2 − \frac{1}{4}||v − w||^2 + \frac{i}{4}||v + iw||^2 − \frac{i}{4}||v − iw||^2 .
A property of inner products that we will use repeatedly is the fact that if x, y ∈ V and
< v, x >=< v, y > for all v ∈ V , then x = y.
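Identity (13) can be checked numerically. The following is a quick sketch, assuming numpy, using the dot product on R^3 and randomly chosen vectors.

```python
import numpy as np

rng = np.random.default_rng(1)
v, w = rng.standard_normal(3), rng.standard_normal(3)
lhs = np.dot(v, w)
rhs = (np.linalg.norm(v + w) ** 2 - np.linalg.norm(v - w) ** 2) / 4
print(np.isclose(lhs, rhs))  # True
```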
We say that vectors v, w ∈ V are orthogonal or perpendicular if < v, w >= 0. Suppose
that S ⊂ V is a subset. Define
S ⊥ = {v ∈ V |< v, w >= 0 for all w ∈ S}.
Lemma 19.1. Suppose that S is a subset of V . Then
1. S ⊥ is a subspace of V .
2. If U is the subspace Span(S) of V , then U ⊥ = S ⊥ .
A set of nonzero vectors {v1 , . . . , vr } in V are called orthogonal if < vi , vj >= 0 whenever
i 6= j.
Lemma 19.2. Suppose that {v1 , · · · , vr } are nonzero orthogonal vectors in V . Then
v1 , . . . , vr are linearly independent.
A set of vectors {u1 , . . . , ur } in V are called orthonormal if < ui , uj >= 0 if i 6= j and
||ui || = 1 for all i.
Lemma 19.3. Suppose that {u1 , · · · , ur } are orthonormal vectors in V . Then u1 , . . . , ur
are linearly independent.
A basis of V consisting of orthonormal vectors is called an orthonormal basis.
Lemma 19.4. Let {v1 , . . . , vs } be a set of nonzero orthogonal vectors in V , and v ∈ V . Let
c_i = \frac{< v, v_i >}{< v_i , v_i >}.
1. We have that
v − \sum_{i=1}^{s} c_i v_i ∈ Span{v1 , . . . , vs }^⊥ .
2. Let a1 , . . . , as ∈ F . Then
||v − \sum_{i=1}^{s} c_i v_i || ≤ ||v − \sum_{i=1}^{s} a_i v_i ||
with equality if and only if ai = ci for all i. Thus c1 v1 + · · · + cs vs gives the best approximation to v as a linear combination of v1 , . . . , vs .
Proof. We first prove 1. For 1 ≤ j ≤ s, we have
< v − \sum_{i=1}^{s} c_i v_i , v_j > = < v, v_j > − \sum_{i=1}^{s} c_i < v_i , v_j > = < v, v_j > − c_j < v_j , v_j > = 0.
We now prove 2. We have that
||v − \sum_{i=1}^{s} a_i v_i ||^2 = ||v − \sum_{i=1}^{s} c_i v_i + \sum_{i=1}^{s} (c_i − a_i ) v_i ||^2 = ||v − \sum_{i=1}^{s} c_i v_i ||^2 + ||\sum_{i=1}^{s} (c_i − a_i ) v_i ||^2
since v − \sum_{i=1}^{s} c_i v_i ∈ Span{v1 , . . . , vs }^⊥ .
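The following is a small sketch, assuming numpy, computing the coefficients ci for a pair of orthogonal vectors in R^3 and verifying part 1: the difference between v and its best approximation is orthogonal to the span.

```python
import numpy as np

v1, v2 = np.array([1., 0., 0.]), np.array([0., 1., 0.])  # orthogonal vectors
v = np.array([3., 4., 5.])
c = [np.dot(v, u) / np.dot(u, u) for u in (v1, v2)]
p = c[0] * v1 + c[1] * v2  # the best approximation, here (3, 4, 0)
print(np.dot(v - p, v1), np.dot(v - p, v2))  # 0.0 0.0: v - p is orthogonal to the span
```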
Theorem 19.5. (Gram Schmidt) Suppose that V is a finite dimensional inner product space. Suppose that {u1 , . . . , ur } is a set of orthonormal vectors in V . Then {u1 , . . . , ur } can be extended to an orthonormal basis {u1 , . . . , ur , ur+1 , . . . , un } of V .
Proof. First extend u1 , . . . , ur to a basis {u1 , . . . , ur , vr+1 , . . . , vn } of V . Inductively define
w_i = v_i − \sum_{j=1}^{i−1} < v_i , u_j > u_j
and
u_i = \frac{1}{||w_i ||} w_i
for r + 1 ≤ i ≤ n.
Let Vr = Span{u1 , . . . , ur } and Vi = Span{u1 , . . . , ur , vr+1 , . . . , vi } for r + 1 ≤ i ≤ n. Then Vi = Span{u1 , . . . , ui } for r + 1 ≤ i ≤ n. By Lemma 19.4, we have that ui ∈ (V_{i−1} )^⊥ for r + 1 ≤ i ≤ n. Thus {u1 , . . . , un } is an orthonormal set of vectors which forms a basis of V .
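The inductive construction in this proof is the classical Gram-Schmidt algorithm. The following is a minimal sketch, assuming numpy and real scalars, starting from an arbitrary basis (the case r = 0).

```python
import numpy as np

def gram_schmidt(V):
    # orthonormalize the columns of V (assumed linearly independent)
    basis = []
    for v in V.T:
        w = v - sum(np.dot(v, u) * u for u in basis)  # subtract the projections
        basis.append(w / np.linalg.norm(w))           # normalize
    return basis

U = gram_schmidt(np.array([[1., 1.], [0., 1.], [0., 0.]]))
print(np.round([[np.dot(a, b) for b in U] for a in U], 10))  # the 2x2 identity
```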
Corollary 19.6. Suppose that W is a subspace of V . Then V = W ⊕ W^⊥ .
Proof. First construct, using Gram Schmidt, an orthonormal basis {w1 , . . . , ws } of W . Now apply Gram Schmidt again to extend this to an orthonormal basis {w1 , . . . , ws , u1 , . . . , ur } of V . Let U = Span{u1 , . . . , ur }. We have that V = W ⊕ U . It remains to show that U = W^⊥ . Let x ∈ U . Then x = \sum_{i=1}^{r} c_i u_i for some ci ∈ F . For all j,
< x, w_j > = \sum_{i=1}^{r} c_i < u_i , w_j > = 0.
Thus x ∈ W^⊥ and U ⊂ W^⊥ .
Suppose x ∈ W^⊥ . Expand x = \sum_{i=1}^{r} c_i u_i + \sum_{j=1}^{s} d_j w_j for some ci , dj ∈ F . For all k,
0 = < x, w_k > = \sum_{i=1}^{r} c_i < u_i , w_k > + \sum_{j=1}^{s} d_j < w_j , w_k > = d_k .
Thus x ∈ U and W^⊥ ⊂ U .
We now can give another proof of Theorem 7.5, when F = R.
Corollary 19.7. Suppose that A ∈ Mm,n (R). Then the row rank of A is equal to the
column rank of A.
Suppose that V , W are inner product spaces (both over R or over C), with inner
products <, > and [, ] respectively. We will say that a linear map ϕ : V → W is a map of
inner product spaces if < v, w >= [ϕ(v), ϕ(w)] for all v, w ∈ V .
Corollary 19.8. Suppose that V is a real inner product space of dimension n. Consider
Rn as an inner product space with the dot product. Then there is an isomorphism ϕ : V →
Rn of inner product spaces.
Corollary 19.9. Suppose that V is an Hermitian inner product space of dimension n. Consider Cn as an inner product space with the standard Hermitian product, [v, w] = v^t \bar{w} for v, w ∈ Cn . Then there is an isomorphism ϕ : V → Cn of inner product spaces.
Proof. Let {v1 , . . . , vn } be an orthonormal basis of V . Let {e1 , . . . , en } be the standard basis of Cn . We can thus define a linear map ϕ : V → Cn by ϕ(v) = a1 e1 + · · · + an en if v = a1 v1 + · · · + an vn with a1 , . . . , an ∈ C. Since {e1 , . . . , en } is a basis of Cn , ϕ is an isomorphism. We have < vi , vj > = δij and [ϕ(vi ), ϕ(vj )] = [ei , ej ] = δij for 1 ≤ i, j ≤ n. Suppose that v = a1 v1 + · · · + an vn ∈ V and w = b1 v1 + · · · + bn vn ∈ V . Then
< v, w > = \sum_{i,j=1}^{n} a_i \bar{b}_j δ_{ij} = \sum_{i=1}^{n} a_i \bar{b}_i .
We also calculate
[ϕ(v), ϕ(w)] = [\sum_{i=1}^{n} a_i e_i , \sum_{j=1}^{n} b_j e_j ] = \sum_{i,j=1}^{n} a_i \bar{b}_j δ_{ij} = \sum_{i=1}^{n} a_i \bar{b}_i .
Thus < v, w > = [ϕ(v), ϕ(w)] for all v, w ∈ V .
20. Symmetric, Hermitian, Orthogonal and Unitary Operators
Throughout this section, assume that V is a finite dimensional inner product space. Let
<, > be the inner product. Suppose that L : V → V is a linear map.
Recall that the dual space of V is the vector space V ∗ = LF (V, F ), and dim V ∗ = dim V .
Lemma 20.1. Suppose that z ∈ V . Then the map ϕz : V → F defined by ϕz (v) =< v, z >
is a linear map. Suppose that ψ ∈ V ∗ . Then there exists a unique z ∈ V such that ψ = ϕz .
Proof. Let {v1 , . . . , vn } be an orthonormal basis of V with dual basis {v1∗ , . . . , vn∗ }. Suppose that ψ ∈ V ∗ . Then there exist (unique) c1 , . . . , cn ∈ F such that ψ = c1 v1∗ + · · · + cn vn∗ . Let z = \bar{c}_1 v1 + · · · + \bar{c}_n vn ∈ V . We have that ψ(vi ) = ci for 1 ≤ i ≤ n, and since < vi , \bar{c}_j vj > = c_j < vi , vj >,
ϕz (vi ) = < vi , z > = c1 < vi , v1 > + · · · + cn < vi , vn > = ci = ψ(vi )
for 1 ≤ i ≤ n. Since {v1 , . . . , vn } is a basis of V , we have that ψ = ϕz . Suppose that w ∈ V and ϕw = ψ. Expand w = \bar{d}_1 v1 + · · · + \bar{d}_n vn . We have that
ci = ψ(vi ) = ϕw (vi ) = < vi , w > = di
for 1 ≤ i ≤ n. Thus di = ci for 1 ≤ i ≤ n, and w = z.
Theorem 20.2. There exists a unique linear map L∗ : V → V such that < Lv, w >=<
v, L∗ w > for all v, w ∈ V .
Proof. For fixed w ∈ V , the map ψw : V → F defined by ψw (v) =< L(v), w > is a linear
map. Hence ψw ∈ V ∗ . By Lemma 20.1, there exists a unique vector zw ∈ V such that
ψw = ϕzw (where ϕzw (v) =< v, zw > for v ∈ V ). Thus we have a function L∗ : V → V
defined by L∗ (w) = zw for w ∈ V .
We will now verify that L∗ is linear. Suppose w1 , w2 ∈ V . For v ∈ V ,
< v, L∗ (w1 + w2 ) > = < L(v), w1 + w2 > = < L(v), w1 > + < L(v), w2 > = < v, L∗ (w1 ) > + < v, L∗ (w2 ) > = < v, L∗ (w1 ) + L∗ (w2 ) > .
Since this identity holds for all v ∈ V , we have that L∗ (w1 + w2 ) = L∗ (w1 ) + L∗ (w2 ). Suppose w ∈ V and c ∈ F . Then
< v, L∗ (cw) > = < L(v), cw > = \bar{c} < L(v), w > = \bar{c} < v, L∗ (w) > = < v, cL∗ (w) > .
Since this identity holds for all v ∈ V , we have that L∗ (cw) = cL∗ (w). Thus L∗ is linear.
Now we prove uniqueness. Suppose that T : V → V is a linear map such that < L(v), w > = < v, T (w) > for all v, w ∈ V . Then for any w ∈ V , < v, T (w) > = < v, L∗ (w) > for all v ∈ V . Thus T (w) = L∗ (w), and L∗ is unique.
L∗ is called the adjoint of L.
Definition 20.3. L is called self adjoint if L∗ = L.
If V is a real inner product space, a self adjoint operator is called symmetric, and we
sometimes write L∗ = Lt . If V is an Hermitian inner product space, a self adjoint operator
is called Hermitian.
Lemma 20.4. Let V = Rn with the standard inner product. Suppose that L = LA : Rn →
Rn for some A ∈ Mnn (R). Then L∗ = LAt .
Lemma 20.5. Let V = Cn with the standard Hermitian inner product. Suppose that L = LA : Cn → Cn for some A ∈ Mnn (C). Then L∗ = L_{\bar{A}^t} .
Proof. Let {e1 , . . . , en } be the standard basis of Cn . Let B ∈ Mn,n (C) be the unique matrix such that L∗ = LB . Write A = (aij ) and B = (bij ). We have that e_i^t B e_j = b_{ij} . For 1 ≤ i, j ≤ n, we have
< Le_i , e_j > = < e_i , L∗ e_j > = < e_i , B e_j > = e_i^t \overline{B e_j} = e_i^t \bar{B} e_j = \bar{b}_{ij} .
We also calculate
< Le_i , e_j > = < Ae_i , e_j > = (Ae_i )^t \bar{e}_j = e_i^t A^t e_j = a_{ji} .
Thus \bar{b}_{ij} = a_{ji} , so b_{ij} = \bar{a}_{ji} and B = \bar{A}^t .
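The lemma can be checked numerically. The following is a sketch, assuming numpy; np.vdot conjugates its first argument, so inner below is the product < x, y > = x^t \bar{y} used in these notes.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

inner = lambda x, y: np.vdot(y, x)  # <x, y> = x^t conj(y)
print(np.isclose(inner(A @ v, w), inner(v, A.conj().T @ w)))  # True
```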
Lemma 20.6. The following are equivalent:
1. LL∗ = I.
2. ||Lv|| = ||v|| for all v ∈ V .
3. < Lv, Lw > = < v, w > for all v, w ∈ V .
Proof. Suppose that 1. holds, so that LL∗ = I. Then L∗ is a right inverse of L so that L
is invertible with inverse L∗ . Thus L∗ L = I. Suppose that v ∈ V . Then
||Lv||2 =< Lv, Lv >=< v, L∗ Lv >=< v, v >= ||v||2 ,
and thus 2. holds.
Now suppose that 2. holds. Then 3. holds by the appropriate polarization identity (13)
or (14).
Finally suppose that 3. holds. For fixed w ∈ V , we have < v, w > = < Lv, Lw > = < v, L∗ Lw > for all v ∈ V . Thus w = L∗ Lw. Since this holds for all w ∈ V , L∗ L = I, so that L is an isomorphism with LL∗ = I, and 1. holds.
The above Lemma motivates the following definition.
Definition 20.7. L : V → V is an isometry if LL∗ = I, where I is the identity map of
V.
If V is a real inner product space, then an isometry is called an orthogonal transformation (or real unitary). If V is an Hermitian inner product space, then an isometry is
called a unitary transformation.
Lemma 20.8. Let {v1 , . . . , vn } be an orthonormal basis of V . Then L is an isometry
if and only if {Lv1 , . . . , Lvn } is an orthonormal basis of V .
Theorem 20.9. If W is an L-invariant subspace of V and L = L∗ , then W ⊥ is an
L-invariant subspace of V .
Proof. Let v ∈ W ⊥ . For all w ∈ W we have < Lv, w >=< v, L∗ w >=< v, Lw >= 0, since
Lw ∈ W . Hence Lv ∈ W ⊥ .
Lemma 20.10. Suppose that V is an Hermitian inner product space and L is Hermitian.
Suppose that v ∈ V . Then < Lv, v >∈ R.
Proof. We have < Lv, v > = < v, L∗ v > = < v, Lv > = \overline{< Lv, v >}. Thus < Lv, v > ∈ R.
21. The Spectral Theorem
Lemma 21.1. Let A ∈ Mnn (R) be symmetric. Then A has a real eigenvalue.
Proof. Regarding A as a matrix in Mn,n (C), we have that A is Hermitian. Now A has a complex eigenvalue λ, with an eigenvector v ∈ Cn . Let < , > be the standard Hermitian product on Cn . We have that A = A^t = \bar{A}^t , so LA : Cn → Cn is Hermitian, by Lemma 20.5. By Lemma 20.10, we then have that < Av, v > = λ < v, v > is real. Since < v, v > is real and nonzero, we have that λ is real.
Theorem 21.2. Suppose that V is a finite dimensional real vector space with an inner
product. Suppose that L : V → V is a symmetric linear map. Then L has an eigenvector.
Proof. Let β = {v1 , . . . , vn } be an orthonormal basis of V . We have an isomorphism
Φ : V → Rn defined by Φ(v) = (v)β for v ∈ V . Let [, ] be the dot product on Rn . Let
A = Mββ (L) ∈ Mnn (R). We have
< v, w > = [(v)β , (w)β ]
for v, w ∈ V . For v, w ∈ V , we calculate
< L(v), w > = [(L(v))β , (w)β ] = [A(v)β , (w)β ] = [(v)β , A^t (w)β ]
and
< L(v), w > = < v, L(w) > = [(v)β , A(w)β ].
Since [(v)β , A^t (w)β ] = [(v)β , A(w)β ] for all v, w ∈ V , we have that A = A^t . Thus A has a real eigenvalue λ, which has a real eigenvector x ∈ Rn by Lemma 21.1. Let y = Φ−1 (x) ∈ V . Then
(L(y))β = A(y)β = Ax = λx = (λy)β .
Thus L(y) = λy. Since y is nonzero, as x is nonzero, we have that y is an eigenvector of L.
Theorem 21.3. Suppose that V is a finite dimensional real vector space with an inner
product. Suppose that L : V → V is a symmetric linear map. Then V has an orthonormal
basis consisting of eigenvectors of L.
Proof. We prove the Theorem by induction on n = dim V . By Theorem 21.2 there exists an eigenvector v ∈ V for L. Let W be the one dimensional subspace of V generated by v. W is L-invariant. By Theorem 20.9, W^⊥ is also L-invariant. We have that V = W ⊕ W^⊥ , so dim W^⊥ = dim V − 1. The restriction of L to W^⊥ is a symmetric linear map of W^⊥ to itself. By induction on n, there exists an orthonormal basis {u2 , . . . , un } of W^⊥ consisting of eigenvectors of L. Thus {u1 = \frac{1}{||v||} v, u2 , . . . , un } is an orthonormal basis of V consisting of eigenvectors of L.
Corollary 21.4. Let A ∈ Mnn (R) be symmetric. Then there exists an orthogonal matrix
P ∈ Mn,n (R) such that P t AP = P −1 AP is a diagonal matrix.
Proof. Rn has an orthonormal basis β = {u1 , . . . , un } of eigenvectors of LA : Rn → Rn by Lemma 20.4 and Theorem 21.3. Let λ1 , . . . , λn be the respective eigenvalues. Then M_β^β (LA ) is the diagonal matrix D with λ1 , λ2 , . . . , λn as diagonal elements. Let β_{st} = {e1 , . . . , en } be the standard basis of Rn . Let P = M_β^{β_{st}} (I) = (u1 , . . . , un ), so that P^{−1} = M_{β_{st}}^β (I). LP is an isometry by Lemma 20.8, since P ei = ui for 1 ≤ i ≤ n. Thus P is orthogonal and P^{−1} = P^t by Lemmas 20.4 and 20.6. Then
D = M_β^β (LA ) = M_{β_{st}}^β (I) M_{β_{st}}^{β_{st}} (LA ) M_β^{β_{st}} (I) = P^{−1} A P = P^t A P.
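The following is a sketch, assuming numpy, illustrating the Corollary: np.linalg.eigh returns the eigenvalues of a symmetric matrix together with an orthogonal matrix P whose columns are orthonormal eigenvectors, so P^t AP is diagonal.

```python
import numpy as np

A = np.array([[2., 1.], [1., 2.]])      # a symmetric matrix
eigvals, P = np.linalg.eigh(A)          # columns of P: orthonormal eigenvectors
print(np.allclose(P.T @ P, np.eye(2)))  # True: P is orthogonal
print(np.round(P.T @ A @ P, 10))        # diag(1, 3)
```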
The proof of Theorem 21.3 also proves the following theorem.
Theorem 21.5. Suppose that V is a finite dimensional complex vector space with an
Hermitian inner product. Suppose that L : V → V is a Hermitian linear map. Then V
has an orthonormal basis consisting of eigenvectors of L. All eigenvalues of L are real.
Corollary 21.6. Let A ∈ Mnn (C) be Hermitian (A = \bar{A}^t ). Then there exists a unitary matrix U ∈ Mn,n (C) such that \bar{U}^t AU = U^{−1} AU is a real diagonal matrix.
34