Abstract Vector Spaces, Linear Transformations, and Their
Coordinate Representations
Contents

1 Vector Spaces
  1.1 Definitions
    1.1.1 Basics
    1.1.2 Linear Combinations, Spans, Bases, and Dimensions
    1.1.3 Sums and Products of Vector Spaces and Subspaces
  1.2 Basic Vector Space Theory
2 Linear Transformations
  2.1 Definitions
    2.1.1 Linear Transformations, or Vector Space Homomorphisms
  2.2 Basic Properties of Linear Transformations
  2.3 Monomorphisms and Isomorphisms
  2.4 Rank-Nullity Theorem
3 Matrix Representations of Linear Transformations
  3.1 Definitions
  3.2 Matrix Representations of Linear Transformations
4 Change of Coordinates
  4.1 Definitions
  4.2 Change of Coordinate Maps and Matrices
1 Vector Spaces

1.1 Definitions

1.1.1 Basics
A vector space (linear space) V over a field F is a set V on which the operations addition,
+ : V × V → V , and left F -action or scalar multiplication, · : F × V → V , satisfy: for all
x, y, z ∈ V and a, b ∈ F (with 1 the multiplicative identity of F )

VS1 x + y = y + x
VS2 (x + y) + z = x + (y + z)
VS3 ∃0 ∈ V such that x + 0 = x
VS4 ∀x ∈ V, ∃y ∈ V such that x + y = 0
(VS1–VS4 say that (V, +) is an abelian group)
VS5 1x = x
VS6 (ab)x = a(bx)
VS7 a(x + y) = ax + ay
VS8 (a + b)x = ax + bx

If V is a vector space over F , then a subset W ⊆ V is called a subspace of V if W is a vector space
over the same field F and with addition and scalar multiplication +|W ×W and ·|F ×W .
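To see the axioms in action, here is a small numerical sketch (added for illustration, not part of the original notes) that spot-checks VS1–VS8 for V = R^3 over F = R using numpy; the random vectors, scalars, and seed are arbitrary choices.

```python
import numpy as np

# Spot-check the vector space axioms VS1-VS8 for V = R^3 over F = R
# with random vectors and scalars (a numerical sanity check, not a proof).
rng = np.random.default_rng(0)
x, y, z = rng.standard_normal((3, 3))
a, b = rng.standard_normal(2)

assert np.allclose(x + y, y + x)                  # VS1: commutativity
assert np.allclose((x + y) + z, x + (y + z))      # VS2: associativity
assert np.allclose(x + np.zeros(3), x)            # VS3: additive identity
assert np.allclose(x + (-x), np.zeros(3))         # VS4: additive inverse
assert np.allclose(1 * x, x)                      # VS5: unital action
assert np.allclose((a * b) * x, a * (b * x))      # VS6: compatibility of scalar multiplication
assert np.allclose(a * (x + y), a * x + a * y)    # VS7: distributivity over vector addition
assert np.allclose((a + b) * x, a * x + b * x)    # VS8: distributivity over scalar addition
print("VS1-VS8 hold numerically for these samples")
```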
1.1.2 Linear Combinations, Spans, Bases, and Dimensions
If ∅ ≠ S ⊆ V , v1 , . . . , vn ∈ S and a1 , . . . , an ∈ F , then a linear combination of v1 , . . . , vn is the
finite sum

a1 v1 + · · · + an vn   (1.1)

which is a vector in V . The ai ∈ F are called the coefficients of the linear combination. If
a1 = · · · = an = 0, then the linear combination is said to be trivial. In particular, considering
the special case of 0 in V , the zero vector, we note that 0 may always be represented as a linear
combination of any vectors u1 , . . . , un ∈ V ,

0u1 + · · · + 0un = 0

where the coefficients are 0 ∈ F and the right-hand side is the zero vector 0 ∈ V .
This representation is called the trivial representation of 0 by u1 , . . . , un . If, on the other hand,
there are vectors u1 , . . . , un ∈ V and scalars a1 , . . . , an ∈ F such that
a1 u1 + · · · + an un = 0
where at least one ai ≠ 0 ∈ F , then that linear combination is called a nontrivial representation
of 0. Using linear combinations we can generate subspaces, as follows. If S ≠ ∅ is a subset of V ,
then the span of S is given by
span(S) := {v ∈ V | v is a linear combination of vectors in S}
(1.2)
The span of ∅ is by definition
span(∅) := {0}
(1.3)
If span(S) = V , then S is said to generate or span V , and to be a generating set or spanning set for V .
A nonempty subset S of a vector space V is said to be linearly independent if, taking any finite
number of distinct vectors u1 , . . . , un ∈ S, we have for all a1 , . . . , an ∈ F that
a1 u1 + a2 u2 + · · · + an un = 0 =⇒ a1 = · · · = an = 0
That is, S is linearly independent if the only representation of 0 ∈ V by vectors in S is the trivial one.
In this case, the vectors u1 , . . . , un themselves are also said to be linearly independent. Otherwise,
if there is at least one nontrivial representation of 0 by vectors in S, then S is said to be linearly
dependent.
Given ∅ 6= S ⊆ V , a nonzero vector v ∈ S is said to be an essentially unique linear combination
of the vectors in S if, up to order of terms, there is one and only one way to express v as a linear
combination of u1 , . . . , un ∈ S. That is, if there are a1 , . . . , an , b1 , . . . , bm ∈ F \{0} and distinct
u1 , . . . , un ∈ S and distinct v1 , . . . , vm ∈ S, then, re-indexing the bi s if necessary,

v = a1 u1 + · · · + an un = b1 v1 + · · · + bm vm   =⇒   m = n and ai = bi , ui = vi for all i = 1, . . . , n
A subset β ⊆ V is called a (Hamel) basis if it is linearly independent and span(β) = V . We also
say that the vectors of β form a basis for V . Equivalently, as explained in Theorem 1.13 below, β
is a basis if every nonzero vector v ∈ V is an essentially unique linear combination of vectors in β.
In the context of inner product spaces of infinite dimension, there is a difference between a vector
space basis, the Hamel basis of V , and an orthonormal basis for V , the Hilbert basis for V : though
both always exist, they are equal only when dim(V ) < ∞.
The dimension of a vector space V is the cardinality of any basis for V , and is denoted dim(V ).
V is finite-dimensional if it is the zero vector space {0} or if it has a basis of finite cardinality.
Otherwise, if its basis has infinite cardinality, it is called infinite-dimensional. In the former case,
dim(V ) = |β| = n < ∞ for some n ∈ N, and V is said to be n-dimensional, while in the latter
case, dim(V ) = |β| = κ, where κ is a cardinal number, and V is said to be κ-dimensional.
If V is finite-dimensional, say of dimension n, then an ordered basis for V is a finite sequence or
n-tuple (v1 , . . . , vn ) of linearly independent vectors v1 , . . . , vn ∈ V such that {v1 , . . . , vn } is a basis
for V . If V is infinite-dimensional but with a countable basis, then an ordered basis is a sequence
(vn )n∈N such that the set {vn | n ∈ N} is a basis for V . The term standard ordered basis is
applied only to the ordered bases for F n , (F ω )0 , (F B )0 and F [x] and its subsets Fn [x]. For F n ,
the standard ordered basis is given by
(e1 , e2 , . . . , en )
where e1 = (1, 0, . . . , 0), e2 = (0, 1, 0, . . . , 0), . . . , en = (0, . . . , 0, 1). Since F is any field, 1 represents
the multiplicative identity and 0 the additive identity. For F = R, the standard ordered basis is just
as above with the usual 0 and 1. If F = C, then 1 = (1, 0) and 0 = (0, 0), so the ith standard basis
vector for Cn looks like this:

ei = ((0, 0), . . . , (1, 0), . . . , (0, 0)),   with (1, 0) in the ith position
For (F ω )0 the standard ordered basis is given by
(en )n∈N ,
where en (k) = δnk
and generally for (F B )0 the standard ordered basis is given by:
(eb )b∈B ,   where eb (x) = 1 if x = b, and eb (x) = 0 if x ≠ b
When the term standard ordered basis is applied to the vector space Fn [x], the subspace of F [x] of
polynomials of degree less than or equal to n, it refers to the basis
(1, x, x2 , . . . , xn )
For F [x], the standard ordered basis is given by
(1, x, x2 , . . . )
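As an illustration (added here, not in the original notes), the following Python sketch builds the standard ordered basis (e1 , . . . , en ) of R^n and encodes the basis (1, x, x^2 , . . . , x^n ) of Rn [x] by coordinate vectors; the helper name poly_coords is just a hypothetical choice.

```python
import numpy as np

n = 4

# Standard ordered basis (e1, ..., en) for R^n: the columns of the identity matrix.
E = np.eye(n)
e1, e2 = E[:, 0], E[:, 1]
print(e1)  # [1. 0. 0. 0.]

# The standard ordered basis (1, x, x^2, ..., x^n) of R_n[x] can be encoded by
# coordinate vectors: a0 + a1 x + ... + an x^n  <->  (a0, a1, ..., an).
def poly_coords(coeffs, degree):
    """Coordinate vector of a polynomial relative to (1, x, ..., x^degree)."""
    v = np.zeros(degree + 1)
    v[: len(coeffs)] = coeffs
    return v

p = poly_coords([2, 0, 5], degree=n)  # 2 + 5x^2 viewed in R_4[x]
print(p)  # [2. 0. 5. 0. 0.]
```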
1.1.3 Sums and Products of Vector Spaces and Subspaces
If S1 , . . . , Sk ∈ P (V ) are subsets or subspaces of a vector space V , then their (internal) sum is
given by
S1 + · · · + Sk = Σ_{i=1}^{k} Si := {v1 + · · · + vk | vi ∈ Si for 1 ≤ i ≤ k}   (1.4)
More generally, if {Si | i ∈ I} ⊆ P (V ) is any collection of subsets or subspaces of V , then their sum
is defined to be
Σ_{i∈I} Si = { s1 + · · · + sn | sj ∈ ∪_{i∈I} Si and n ∈ N }   (1.5)
If S is a subspace of V and v ∈ V , then the sum {v} + S = {v + s | s ∈ S} is called a coset of S and
is usually denoted
v+S
(1.6)
We are of course interested in the (internal) sum of subspaces of a vector space. In this case,
we may have the special condition
V = Σ_{i=1}^{k} Si   and   Sj ∩ Σ_{i≠j} Si = {0}, ∀j = 1, . . . , k   (1.7)
When this happens, V is said to be the (internal) direct sum of S1 , . . . , Sk , and this is symbolically
denoted
V = S1 ⊕ S2 ⊕ · · · ⊕ Sk = ⊕_{i=1}^{k} Si = {s1 + · · · + sk | si ∈ Si }   (1.8)
In the infinite-dimensional case we proceed similarly. If F = {Si | i ∈ I} is a family of subsets of V
such that
V = Σ_{i∈I} Si   and   Sj ∩ Σ_{i≠j} Si = {0}, ∀j ∈ I   (1.9)
then V is a direct sum of the Si ∈ F, denoted
V = ⊕F = ⊕_{i∈I} Si = { s1 + · · · + sn | sj ∈ ∪_{i∈I} Si , and sji ∈ Si if sji ≠ 0 }   (1.10)
We can also define the (external) sum of distinct vector spaces which do not lie inside a larger
vector space: if V1 , . . . , Vn are vector spaces over the same field F , then their external direct sum is
the cartesian product V1 × · · · × Vn , with addition and scalar multiplication defined componentwise.
And we denote the sum, confusingly, by the same notation:
V1 ⊕ V2 ⊕ · · · ⊕ Vn := {(v1 , v2 , . . . , vn ) | vi ∈ Vi , i = 1, . . . , n}
(1.11)
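For two spaces, the external direct sum is simply the Cartesian product with componentwise operations. The sketch below (an added illustration, with V1 = R^2 and V2 = R^3 chosen arbitrarily) shows the componentwise operations and the identification of R^2 ⊕ R^3 with R^5 by concatenation.

```python
import numpy as np

# External direct sum V1 (+) V2 for V1 = R^2, V2 = R^3, realized as pairs (v1, v2).
# Addition and scalar multiplication are componentwise, so the map
# (v1, v2) -> concatenate(v1, v2) identifies R^2 (+) R^3 with R^5.
v = (np.array([1.0, 2.0]), np.array([3.0, 4.0, 5.0]))
w = (np.array([0.5, 0.5]), np.array([1.0, 0.0, -1.0]))
c = 2.0

v_plus_w = (v[0] + w[0], v[1] + w[1])       # componentwise addition
c_times_v = (c * v[0], c * v[1])            # componentwise scalar multiplication

flat = np.concatenate(v)                    # the corresponding vector in R^5
print(v_plus_w, c_times_v, flat)
```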
In the infinite-dimensional case, we have two types of external direct sum, one where there is no
restriction on the sequences, the other where we only allow sequences with finite support. To
distinguish these two cases we use the terms direct product and external direct sum, respectively:
let F = {Vi | i ∈ I} be a family of vector spaces over the same field F . Then their direct product
is given by
∏_{i∈I} Vi := { v = (vi )i∈I | vi ∈ Vi }   (1.12)
           = { f : I → ∪_{i∈I} Vi | f (i) ∈ Vi }   (1.13)
The infinite-dimensional external direct sum is then defined:
⊕_{i∈I} Vi := { v = (vi )i∈I | vi ∈ Vi and v has finite support }   (1.14)
           = { f : I → ∪_{i∈I} Vi | f (i) ∈ Vi and f has finite support }   (1.15)
When all the Vi are equal, there is some special notation:
V n := ∏_{i=1}^{n} V (finite-dimensional)   (1.16)
V I := ∏_{i∈I} V (infinite-dimensional)   (1.17)
(V I )0 := ⊕_{i∈I} V   (1.18)
Example 1.1 Common examples of vector spaces are the sequence spaces F n , F ω , F B , (F ω )0
and (F B )0 in a field F over that field, i.e. the various types of cartesian products of F equipped
with addition and scalar multiplication operations defined componentwise (ω = N and B is any set,
and where (F ω )0 and (F B )0 denote all functions f : ω → F , respectively f : B → F , with finite
support). Some typical ones are
Qn   Qω   QB   (Qω )0   (QB )0
Rn   Rω   RB   (Rω )0   (RB )0
Cn   Cω   CB   (Cω )0   (CB )0

each over Q, R, or C, respectively.
1.2 Basic Vector Space Theory
Theorem 1.2 If V is a vector space over a field F , then for all x, y, z ∈ V and a, b ∈ F the
following hold:
(1) x + y = z + y =⇒ x = z
(Cancellation Law)
(2) 0 ∈ V is unique.
(3) x + y = x + z = 0 =⇒ y = z
(Uniqueness of the Additive Inverse)
(4) 0x = 0
(5) (−a)x = −(ax) = a(−x)
(6) The finite external direct sum V1 ⊕ · · · ⊕ Vn is a vector space over F if each Vi is a vector space
over F and if we define + and · componentwise by
(v1 , . . . , vn ) + (u1 , . . . , un ) := (v1 + u1 , . . . , vn + un )
c(v1 , . . . , vn ) := (cv1 , . . . , cvn )
More generally, if F = {Vi | i ∈ I} is a family of vector spaces over the same field F , then the
direct product ∏_{i∈I} Vi and external direct sum ⊕_{i∈I} Vi = (V I )0 are vector spaces over F , with
+ and · given by

v + w = (f : I → ∪_{i∈I} Vi ) + (g : I → ∪_{i∈I} Vi ) := f + g : I → ∪_{i∈I} Vi
cv = c (f : I → ∪_{i∈I} Vi ) := cf : I → ∪_{i∈I} Vi
Proof: (1) x = x+0 = x+(y +(−y)) = (x+y)+(−y) = (z +y)+(−y) = z +(y +(−y)) = z +0 = z.
(2) ∃01 , 02 =⇒ 01 = 01 + 02 = 02 + 01 = 02 . (3)(a) x + y1 = x + y2 = 0 =⇒ y1 = y2 by the
Cancellation Law. (b) y1 = y1 + 0 = y1 + (x + y2 ) = (y1 + x) + y2 = 0 + y2 = y2 + 0 = y2 . (4)
0x + 0x = (0 + 0)x = 0x = 0x + 0V =⇒ 0x = 0V by the Cancellation Law. (5) ax + (−(ax)) = 0,
and by 3 −(ax) is unique. But note ax + (−a)x = (a + (−a))x = 0x = 0, hence (−a)x = −(ax), and
also a(−x) = a((−1)x) = (a(−1))x = (−a)x, which we already know equals −(ax). (6) All 8 vector
space properties are verified componentwise, since that is how addition and scalar multiplication are
defined. But the components are vectors in a vector space, so all 8 properties will be satisfied. ∎

Theorem 1.3 Let V be a vector space over F . A subset S of V is a subspace iff for all a, b ∈ F
and x, y ∈ S we have one of the following equivalent conditions:
(1) ax + by ∈ S
(2) x + y ∈ S and ax ∈ S
(3) x + ay ∈ S
Proof: If S satisfies any of the above implications, then it is a subspace because vector space
properties VS1-2 and VS5-8 follow from the fact that S ⊆ V , while VS3 follows because we can
choose a = b = 0 and 0 ∈ V is unique (Theorem 1.2), and VS4 follows from VS3, in the first case
by choosing b = −a and x = y, in the second case by choosing y = (−1)x, and in the third case by
choosing a = −1 and y = x. Conversely, if S is a vector space, then (1)-(3) follow by definition. ∎
Theorem 1.4 Let V be a vector space. If C = {Si | i ∈ K} is a collection of subspaces of V , then
Σ_{i∈K} Si ,   ⊕_{i∈K} Si   and   ∩_{i∈K} Si   (1.19)
are subspaces.
Proof: (1) If C = {Si | i ∈ K} is a collection of subspaces, then none of the Si are empty by
the previous theorem, so if a, b ∈ F and v, w ∈ Σ_{i∈K} Si , then v = si1 + · · · + sin ∈ Σ_{i∈K} Si and
w = sj1 + · · · + sjm ∈ Σ_{i∈K} Si such that sik ∈ Sik for all ik ∈ K. Consequently

av + bw = a(si1 + · · · + sin ) + b(sj1 + · · · + sjm ) = asi1 + · · · + asin + bsj1 + · · · + bsjm ∈ Σ_{i∈K} Si

because asik ∈ Sik for all ik ∈ K, the Sik being subspaces. Hence by the previous theorem
Σ_{i∈K} Si is a subspace of V . The direct sum case is a subcase of this general case.

(2) If a, b ∈ F and u, v ∈ ∩_{i∈K} Si , then u, v ∈ Si for all i ∈ K, whence au + bv ∈ Si for all i ∈ K,
or au + bv ∈ ∩_{i∈K} Si , and ∩_{i∈K} Si is a subspace by the previous theorem. ∎
Theorem 1.5 If S and T are subspaces of a vector space V , then S ∪ T is a subspace iff S ⊆ T or
T ⊆ S.
Proof: If S ∪ T is a subspace, then for all a, b ∈ F and x, y ∈ S ∪ T we have ax + by ∈ S ∪ T .
Suppose neither S ⊆ T nor T ⊆ S holds, and take x ∈ S\T and y ∈ T \S. Then, x + y is neither
in S nor in T (for if it were in S, say, then y = (x + y) + (−x) ∈ S, a contradiction), contradicting
the fact that S ∪ T is a subspace. Conversely, suppose, say, S ⊆ T , and choose any a, b ∈ F and
x, y ∈ S ∪ T . If x, y ∈ S, then ax + by ∈ S ⊆ S ∪ T , while if x, y ∈ T , then ax + by ∈ T ⊆ S ∪ T . If
x ∈ S and y ∈ T , then x ∈ T as well and ax + by ∈ T ⊆ S ∪ T . Thus in all cases ax + by ∈ S ∪ T ,
and S ∪ T is a subspace.
Theorem 1.6 If V is a vector space, then for all S ⊆ V , span(S) is a subspace of V . Moreover,
if T is a subspace of V and S ⊆ T , then span(S) ⊆ T .
Proof: If S = ∅, span(∅) = {0}, which is a subspace of V ; if T is a subspace, then ∅ ⊆ T and
{0} ⊆ T . If S ≠ ∅, then any x, y ∈ span(S) have the form x = a1 v1 + · · · + an vn and y = b1 w1 + · · · + bm wm
for some a1 , . . . , an , b1 , . . . , bm ∈ F and v1 , . . . , vn , w1 , . . . , wm ∈ S. Then, if a, b ∈ F we have

ax + by = a(a1 v1 + · · · + an vn ) + b(b1 w1 + · · · + bm wm ) = (aa1 )v1 + · · · + (aan )vn + (bb1 )w1 + · · · + (bbm )wm ∈ span(S)
so span(S) is a subspace of V . Now, suppose T is a subspace of V and S ⊆ T . If w ∈ span(S), then
w = a1 w1 + · · · + an wn for some w1 , . . . , wn ∈ S and a1 , . . . , an ∈ F . Since S ⊆ T , w1 , . . . , wn ∈ T ,
whence w = a1 w1 + · · · + an wn ∈ T . ∎
Corollary 1.7 If V is a vector space, then S ⊆ V is a subspace of V iff span(S) = S.
Proof: If S is a subspace of V , then by the theorem S ⊆ S =⇒ span(S) ⊆ S. But if s ∈ S, then
s = 1s ∈ span(S), so S ⊆ span(S), and therefore S = span(S). Conversely, if S = span(S), then S
is a subspace of V by the theorem.
Corollary 1.8 If V is a vector space and S, T ⊆ V , then the following hold:
(1) S ⊆ T ⊆ V =⇒ span(S) ⊆ span(T )
(2) S ⊆ T ⊆ V and span(S) = V =⇒ span(T ) = V
(3) span(S ∪ T ) = span(S) + span(T )
(4) span(S ∩ T ) ⊆ span(S) ∩ span(T )
Proof: (1) and (2) are immediate, so we only need to prove 3 and 4:
(3) If v ∈ span(S ∪ T ), then ∃v1 , . . . , vm ∈ S, u1 , . . . , un ∈ T and a1 , . . . , am , b1 , . . . , bn ∈ F such
that
v = a1 v1 + · · · + am vm + b1 u1 + · · · + bn un
Note, however, a1 v1 + · · · + am vm ∈ span(S) and b1 u1 + · · · + bn un ∈ span(T ), so that v ∈ span(S) +
span(T ). Thus span(S ∪ T ) ⊆ span(S) + span(T ). Conversely, if v = s + t ∈ span(S) + span(T ),
then s ∈ span(S) and t ∈ span(T ), so that by 1, since S ⊆ S ∪ T and T ⊆ S ∪ T , we must have
span(S) ⊆ span(S ∪ T ) and span(T ) ⊆ span(S ∪ T ). Consequently, s, t ∈ span(S ∪ T ), and since
span(S ∪ T ) is a subspace, v = s + t ∈ span(S ∪ T ). That is span(S) + span(T ) ⊆ span(S ∪ T ).
Thus, span(S) + span(T ) = span(S ∪ T ).
(4) By Theorem 1.6 span(S ∩ T ), span(S) and span(T ) are subspaces, and by Theorem 1.4 span(S) ∩
span(T ) is a subspace. Now, consider x ∈ span(S ∩ T ). There exist vectors v1 , . . . , vn ∈ S ∩ T and
scalars a1 , . . . , an ∈ F such that x = a1 v1 + · · · + an vn . But since v1 , . . . , vn belong to both S and
T , x ∈ span(S) and x ∈ span(T ), so that x ∈ span(S) ∩ span(T ). It is not in general true, however,
that span(S) ∩ span(T ) ⊆ span(S ∩ T ). For example, if S = {e1 , e2 } ⊆ R2 and T = {e1 , (1, 1)},
then span(S) ∩ span(T ) = R2 ∩ R2 = R2 , but span(S ∩ T ) = span({e1 }) is the x-axis, and R2 ⊄ span({e1 }).
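The counterexample is easy to verify numerically (an added sketch): dimensions of spans are computed as matrix ranks, with the generating vectors placed in columns.

```python
import numpy as np

# S = {e1, e2}, T = {e1, (1, 1)} in R^2; dim span(X) = rank of the matrix whose columns are X.
S = np.array([[1.0, 0.0], [0.0, 1.0]]).T          # columns e1, e2
T = np.array([[1.0, 0.0], [1.0, 1.0]]).T          # columns e1, (1, 1)

dim_span_S = np.linalg.matrix_rank(S)             # 2: span(S) = R^2
dim_span_T = np.linalg.matrix_rank(T)             # 2: span(T) = R^2
dim_span_S_cap_T = np.linalg.matrix_rank(np.array([[1.0], [0.0]]))  # S ∩ T = {e1}: dim 1

print(dim_span_S, dim_span_T, dim_span_S_cap_T)   # 2 2 1
# span(S) ∩ span(T) = R^2 has dimension 2 > 1 = dim span(S ∩ T).
```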
Theorem 1.9 If V is a vector space and S ⊆ T ⊆ V , then
(1) S linearly dependent =⇒ T linearly dependent
(2) T linearly independent =⇒ S linearly independent
Proof: (1) If S is linearly dependent, there are scalars a1 , . . . , am ∈ F , not all zero, and vectors
v1 , . . . , vm ∈ S such that a1 v1 + · · · + am vm = 0. If S ⊆ T , then v1 , . . . , vm are also in T , and have
the same property, namely a1 v1 + · · · + am vm = 0. Thus, T is linearly dependent. (2) If T is linearly
independent, then if v1 , . . . , vn ∈ S, a1 , . . . , an ∈ F and a1 v1 + · · · + an vn = 0, since S ⊆ T implies
v1 , . . . , vn ∈ T , we must have a1 = a2 = · · · = an = 0, and S is linearly independent.
Theorem 1.10 If V is a vector space, then S ⊆ V is linearly dependent iff S = {0} or there exist
distinct vectors v, u1 , . . . , un ∈ S such that v is a linear combination of u1 , . . . , un .
Proof: If S is linearly dependent, then S consists of at least one element by definition. Suppose
S has only one element, v. Then, either v = 0 or v ≠ 0. If v ≠ 0, consider a ∈ F . Then,
av = 0 =⇒ a = 0, which means S is linearly independent, contrary to assumption. Therefore,
v = 0, and so S = {0}. If S contains more than one element and v ∈ S, then since S is linearly
dependent, there exist distinct u1 , . . . , un ∈ S and a1 , . . . , an ∈ F , not all zero such that
a1 u1 + · · · + an un = 0
If v = 0, we're done. If v ≠ 0, then, since {u1 , . . . , un } is linearly dependent, so is {u1 , . . . , un , v},
by Theorem 1.9, so that there are b1 , . . . , bn+1 ∈ F , not all zero, such that
b1 u1 + · · · + bn un + bn+1 v = 0
Choose bn+1 ≠ 0, for if we let bn+1 = 0 we have gained nothing. Then,

v = (−b1 /bn+1 ) u1 + · · · + (−bn /bn+1 ) un
and v is a linear combination of distinct vectors in S.
Conversely, if S = {0}, clearly S is linearly dependent, for we can choose a, b ∈ F both nonzero and
have a0 + b0 = 0. Otherwise, if there are distinct u1 , . . . , un , v ∈ S and a1 , . . . , an ∈ F such that
v = a1 u1 + · · · + an un , then obviously 0 = −1v + a1 u1 + · · · + an un , and since −1 ≠ 0, S is linearly
dependent.
Corollary 1.11 If S = {u1 , . . . , un } ⊆ V , then S is linearly dependent iff u1 = 0 or for some
1 ≤ k < n we have uk+1 ∈ span(u1 , . . . , uk ).
Theorem 1.12 If V is a vector space and u, v ∈ V are distinct, then {u, v} is linearly dependent
iff u or v is a multiple of the other.
Proof: If {u, v} is linearly dependent, there are scalars a, b ∈ F not both zero such that au + bv = 0.
If a = 0, then b ≠ 0, and so au + bv = bv = 0 =⇒ v = 0 =⇒ u ≠ 0, and so v = 0u. If neither a nor
b is equal to zero, then au + bv = 0 =⇒ u = −a−1 bv. Conversely, suppose, say, u = av. If a = 0,
then u = 0 and v can be any vector other than u, since by hypothesis u ≠ v, whence au + 0v = 0 for
any a ∈ F , including nonzero a, so that {u, v} is linearly dependent. Otherwise, if a ≠ 0, we have
u = av =⇒ 1u + (−a)v = 0, and both 1 ≠ 0 and −a ≠ 0, so that {u, v} is linearly dependent. ∎

Theorem 1.13 (Properties of a Basis) If V is a vector space, then the following are equivalent,
and any S ⊆ V satisfying any of them is a basis:
(1) S is linearly independent and V = span(S).
(2) (Essentially Unique Representation) Every nonzero vector v ∈ V is an essentially unique
linear combination of vectors in S.
(3) No vector in S is a linear combination of other vectors in S.
(4) S is a minimal spanning set, that is V = span(S) but no proper subset of S spans V .
(5) S is a maximal linearly independent set, that is S is linearly independent but any proper
superset is not linearly independent.
Proof: (1)=⇒ (2): If S is linearly independent and v ∈ V is a nonzero vector such that for distinct
si and ti in S

v = a1 s1 + · · · + an sn = b1 t1 + · · · + bm tm

then grouping any sij and tij that are equal, we have

0 = (ai1 − bi1 )si1 + · · · + (aik − bik )sik + aik+1 sik+1 + · · · + ain sin − bik+1 tik+1 − · · · − bim tim

which implies aik+1 = · · · = ain = bik+1 = · · · = bim = 0, so n = m = k, and for j = 1, . . . , k we have
aij = bij , whence v is essentially uniquely represented as a linear combination of vectors in S.
(1)=⇒ (3): The contrapositive of Theorem 1.10.
(3)=⇒ (1): If 3 holds and a1 s1 + · · · + an sn = 0, where the si are distinct and a1 ≠ 0, then n > 1
and s1 = −(1/a1 )(a2 s2 + · · · + an sn ), which violates 3. Thus 3 implies 1.
(1)⇐⇒ (4): If S is a linearly independent spanning set and T is a proper subset of S that also spans
V , then any vectors in S\T would have to be linear combinations of the vectors in T , violating 3
for S (3 is equivalent to 1, as we just showed). Thus S is a minimal spanning set. Conversely, if S
is a minimal spanning set, then it must be linearly independent, for otherwise there would be some
s ∈ S that is a linear combination of other vectors in S, which would mean S\{s} also spans V ,
contradicting minimality. Thus 4 implies 1.
(1)⇐⇒ (5): If 1 holds, so S is linearly independent and spans V , but S is not maximal, then
∃v ∈ V \S such that S ∪ {v} is also linearly independent. But then v ∉ span(S) by 3, contradicting span(S) = V .
Therefore S is maximal and 1 implies 5. Conversely, if S is a maximal linearly independent set,
then S must span V , for otherwise there is a v ∈ V \S that isn't a linear combination of vectors
in S, implying S ∪ {v} is a linearly independent proper superset, violating maximality. Therefore
span(S) = V , and 5 implies 1.
Corollary 1.14 If V is a vector space, then S = {v1 , . . . , vn } ⊆ V is a basis for V iff
V = span(v1 ) ⊕ span(v2 ) ⊕ · · · ⊕ span(vn )
(1.20)
The previous theorem did not say that any such basis exists for a given vector space. It merely gave
the defining characteristics of a basis, should we ever come across one. The next theorem assures us
that all vector spaces have a basis – a consequence of Zorn's lemma.
Theorem 1.15 (Existence of a Basis) If V is a nonzero vector space and I ⊆ S ⊆ V with I
linearly independent and V = span(S), then ∃B basis for V for which
I⊆B⊆S
(1.21)
More specifically,
(1) Any nonzero vector space has a basis.
(2) Any linearly independent set in V is contained in a basis.
(3) Any spanning set in V contains a basis.
Proof: Let A ⊆ P (V ) be the set of all linearly independent sets L such that I ⊆ L ⊆ S. Then A
is non-empty because I ⊆ I ⊆ S, so I ∈ A. Now, if
C = {Ik | k ∈ K} ⊆ A
is a chain in A, that is, a totally ordered (under set inclusion) subset of A, then the union

U = ∪C = ∪_{k∈K} Ik
is linearly independent and satisfies I ⊆ U ⊆ S, that is U ∈ A, so every chain in A has an upper
bound in A. By Zorn's lemma, A therefore has a maximal element B, which is linearly independent. But of course
such a B is a basis for V = span(S), for if any s ∈ S is not a linear combination of elements in B,
then B ∪ {s} is linearly independent and B ⊊ B ∪ {s}, contradicting the maximality of B. Therefore
S ⊆ span(B), and so V = span(S) ⊆ span(B) ⊆ V , or V = span(B). This shows that (1) there is a
basis B for V , (2) any linearly independent set I has I ⊆ B for this B, and (3) any spanning set S
has B ⊆ S for this B.
Corollary 1.16 (All Subspaces Have Complements) If S is a subspace of a vector space V ,
then there exists a subspace T of V such that
V =S⊕T
(1.22)
Proof: If S = V , then let T = {0}. Otherwise, if V \S ≠ ∅, choose a basis for S and extend it
(Theorem 1.15) to a basis B for V ; then S = span(B ∩ S) and B ∩ (V \S) ≠ ∅. That is, there is a nonzero subspace
T = span(B ∩ (V \S)), and by Corollary 1.8
V = span(B) = span((B ∩ S) ∪ (B ∩ (V \S))) = span(B ∩ S) + span(B ∩ (V \S)) = S + T
But S ∩ T = {0} because of the essentially unique representation of vectors in V as linear combinations of vectors in B. Hence, V = S ⊕ T , and T is a complement of S.
Theorem 1.17 (Replacement Theorem) If V is a vector space such that V = span(S) for some
S ⊆ V with |S| = n, and if B ⊆ V is linearly independent with |B| = m, then
(1) m ≤ n
(2) ∃C ⊆ S with |C| = n − m such that V = span(B ∪ C)
Proof: The proof is by induction on m, beginning with m = 0. For this m, B = ∅, so taking C = S,
we have B ∪ C = ∅ ∪ S = S, which generates V . Now suppose the theorem holds for some m ≥ 0.
For m+1, let B = {v1 , . . . , vm+1 } ⊆ V be linearly independent. By Theorem 1.9, B 0 = {v1 , . . . , vm }
is linearly independent, and so by the induction hypothesis m ≤ n and ∃C 0 = {u1 , . . . , un−m } ⊆ S
such that {v1 , . . . , vm } ∪ C 0 generates V . This means ∃a1 , . . . , am , b1 , . . . , bn−m ∈ F such that
vm+1 = a1 v1 + · · · + am vm + b1 u1 + · · · + bn−m un−m
(1.23)
Note that if n = m, C 0 = ∅, so that vm+1 ∈ span(B 0 ), contradicting the assumption that B is
linearly independent. Therefore, n > m, or n ≥ m + 1. Moreover, some bi , say b1 , is nonzero,
for otherwise we have vm+1 = a1 v1 + · · · + am vm , leading again to B being linearly dependent in
contradiction to assumption. Solving (1.23) for u1 , we get
u1 = (−b1⁻¹a1 )v1 + · · · + (−b1⁻¹am )vm + b1⁻¹vm+1 + (−b1⁻¹b2 )u2 + · · · + (−b1⁻¹bn−m )un−m
Let C = {u2 , . . . , un−m }. Then u1 ∈ span(B ∪ C), and since v1 , . . . , vm , u2 , . . . , un−m ∈ B ∪ C, they
are also in span(B ∪ C), whence
B 0 ∪ C 0 ⊆ span(B ∪ C)
Now, since span(B 0 ∪ C 0 ) = V by the induction hypothesis, we have by Corollary 1.8 that span(B ∪
C) = V . So we have B linearly independent with |B| = m + 1 vectors, m + 1 ≤ n, span(B ∪ C) = V ,
and C ⊆ S with |C| = (n − m) − 1 = n − (m + 1) vectors, so the theorem holds for m + 1. Therefore
the theorem is true for all m ∈ N by induction.
Corollary 1.18 If V is a vector space with a finite spanning set, then every basis for V contains
the same number of vectors.
Proof: Suppose S is a finite spanning set for V , and let β and γ be two bases for V with |γ| > |β| = k,
so that some subset T ⊆ γ contains exactly k + 1 vectors. Since T is linearly independent and β
generates V , the replacement theorem implies that k + 1 ≤ k, a contradiction. Therefore γ is finite
and |γ| ≤ k. Reversing the roles of β and γ shows that k ≤ |γ|, so that |γ| = |β| = k.
Corollary 1.19 If V is a finite-dimensional vector space with dim(V ) = n, then the following hold:
(1) Any finite spanning set for V contains at least n vectors, and a spanning set for V that contains
exactly n vectors is a basis for V .
(2) Any linearly independent subset of V that contains exactly n vectors is a basis for V .
(3) Every linearly independent subset of V can be extended to a basis for V .
Proof: Let B be a basis for V . (1) Let S ⊆ V be a finite spanning set for V . By Theorem
1.15, ∃B ⊆ S that is a basis for V , and by Corollary 1.18 |B| = n vectors. Therefore |S| ≥ n,
and |S| = n =⇒ S = B, so that S is a basis for V . (2) Let I ⊆ V be linearly independent
with |I| = n. By the replacement theorem ∃T ⊆ B with |T | = n − n = 0 vectors such that
V = span(I ∪ T ) = span(I ∪ ∅) = span(I). Since I is also linearly independent, it is a basis for
V . (3) If I ⊆ V is linearly independent with |I| = m vectors, then the replacement theorem asserts
that ∃H ⊆ B containing exactly n − m vectors such that V = span(I ∪ H). Now, |I ∪ H| ≤ n, so
that by 1 |I ∪ H| = n, and so I ∪ H is a basis for V extended from I.
Theorem 1.20 Any two bases for a vector space V have the same cardinality.
Proof: If V is finite-dimensional, then the previous corollary applies, so we need only consider bases
that are infinite sets. Let B = {bi | i ∈ I} be a basis for V and let C be any other basis for V . Then
all c ∈ C can be written as finite linear combinations of vectors in B, where all the coefficients are
nonzero. That is, if Uc = {1, . . . , nc }, we have

c = Σ_{i=1}^{nc} ai bi = Σ_{i∈Uc} ai bi
for unique a1 , . . . , anc ∈ F . But because C is a basis, we must have I = ∪_{c∈C} Uc , for if ∪_{c∈C} Uc ⊊ I,
then all c ∈ C can be expressed by a proper subset B 0 of B, so that V = span(B 0 ), which is a
contradiction of the minimality of B as a spanning set. Now, for all c ∈ C we have |Uc | < ℵ0 , which
implies that

|B| = |I| = |∪_{c∈C} Uc | ≤ ℵ0 |C| = |C|
Reversing the roles of B and C gives |C| ≤ |B|, so by the Schröder-Bernstein theorem we have
|B| = |C|.
Corollary 1.21 If V is a vector space and S is a subspace of V :
(1) dim(S) ≤ dim(V )
(2) dim(S) = dim(V ) < ∞ =⇒ S = V
(3) V is infinite-dimensional iff it contains an infinite linearly independent subset.
Theorem 1.22 Let V be a vector space.
(1) If B is a basis for V and B = B1 ∪ B2 , where B1 ∩ B2 = ∅, then V = span(B1 ) ⊕ span(B2 ).
(2) If V = S ⊕T and B1 is a basis for S and B2 is a basis for T , then B1 ∩B2 = ∅ and B = B1 ∪B2
is a basis for V .
Proof: (1) If B1 ∩ B2 = ∅ and B = B1 ∪ B2 is a basis for V , then 0 ∉ B1 ∪ B2 . But, if a nonzero vector
v ∈ span(B1 ) ∩ span(B2 ), then B1 ∩ B2 ≠ ∅, a contradiction. Hence {0} = span(B1 ) ∩ span(B2 ).
Moreover, since B1 ∪ B2 is a basis for V , and since it is also a basis for span(B1 ) + span(B2 ), we
must have V = span(B1 ) + span(B2 ), and hence V = span(B1 ) ⊕ span(B2 ). (2) If V = S ⊕ T ,
then S ∩ T = {0}, and since 0 ∉ B1 ∪ B2 , we have B1 ∩ B2 = ∅. Also, since all v ∈ V = S ⊕ T
have the form a1 u1 + · · · + am um + b1 v1 + · · · + bn vn for u1 , . . . , um ∈ B1 and v1 , . . . , vn ∈ B2 ,
v ∈ span(B1 ∪ B2 ), so B1 ∪ B2 is a basis for V by Theorem 1.13.
Theorem 1.23 If S and T are subspaces of a vector space V , then
dim(S) + dim(T ) = dim(S + T ) + dim(S ∩ T )
(1.24)
As a consequence, V = S ⊕ T iff dim(V ) = dim(S) + dim(T ).
Proof: If B = {bi | i ∈ I} is a basis for S ∩ T , then we can extend this to a basis A ∪ B for S and to
another basis B ∪ C for T , where A = {aj | j ∈ J}, C = {ck | k ∈ K} and A ∩ B = ∅ and B ∩ C = ∅.
Then A ∪ B ∪ C is a basis for S + T : clearly span(A ∪ B ∪ C) = S + T , so we need only verify
that A ∪ B ∪ C is linearly independent. To that end, suppose not: suppose ∃c1 , . . . , cn ∈ F \{0} and
v1 , . . . , vn ∈ A ∪ B ∪ C such that
c1 v1 + c2 v2 + · · · + cn vn = 0
Then some vi ∈ A and some vj ∈ C, since by our construction A ∪ B and B ∪ C are linearly independent. Isolating
the vectors in A, say v1 , . . . , vk , on one side of the equality shows that there is a vector

x = a1 v1 + · · · + ak vk = ak+1 vk+1 + · · · + an vn

with the left side in span(A) and the right side in span(B ∪ C), whence
x ∈ span(A) ∩ span(B ∪ C) = span(A) ∩ T ⊆ S ∩ T
since span(A) ⊆ S. Consequently x = 0, and therefore a1 = · · · = an = 0 since A and B ∪ C are
linearly independent sets, a contradiction. Thus A ∪ B ∪ C is linearly independent and hence a basis
for S + T . Moreover,

dim(S) + dim(T ) = |A ∪ B| + |B ∪ C| = |A| + |B| + |B| + |C| = |A ∪ B ∪ C| + |B| = dim(S + T ) + dim(S ∩ T )
Of course if S ∩ T = {0}, then dim(S ∩ T ) = 0, and
dim(S) + dim(T ) = dim(S ⊕ T ) = dim(V )
while if dim(V ) = dim(S) + dim(T ), then dim(S ∩ T ) = 0, so S ∩ T = {0}, and so V = S ⊕ T .
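A numerical illustration of (1.24) (added here as a sketch, with random subspaces of R^6 given by generating columns): dim(S + T) is the rank of the concatenated generators, dim(S ∩ T) is computed independently from a kernel dimension, and the formula is then checked.

```python
import numpy as np

rng = np.random.default_rng(1)
# Subspaces S = col(A) and T = col(B) of R^6, given by random generating sets.
A = rng.standard_normal((6, 3))
B = rng.standard_normal((6, 4))

def nullity(M):
    # nullity = number of columns minus rank (Rank-Nullity for the matrix map).
    return M.shape[1] - np.linalg.matrix_rank(M)

dim_S = np.linalg.matrix_rank(A)
dim_T = np.linalg.matrix_rank(B)
dim_sum = np.linalg.matrix_rank(np.hstack([A, B]))        # dim(S + T)

# dim(S ∩ T): vectors Au with Au = Bw, counted modulo pairs with Au = 0 and Bw = 0.
dim_cap = nullity(np.hstack([A, -B])) - nullity(A) - nullity(B)

assert dim_S + dim_T == dim_sum + dim_cap                  # Theorem 1.23
print(dim_S, dim_T, dim_sum, dim_cap)
```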
Corollary 1.24 If S and T are subspaces of a vector space V , then
(1) dim(S + T ) ≤ dim(S) + dim(T )
(2) dim(S + T ) ≥ max{dim(S), dim(T )}, if S and T are finite-dimensional.
Theorem 1.25 (Direct Sums) If F = {Si | i ∈ I} is a family of subspaces of a vector space V
such that V = Σ_{i∈I} Si , then the following statements are equivalent:

(1) V = ⊕_{i∈I} Si .
(2) 0 ∈ V has a unique expression as a sum of vectors each from different Si , namely as a sum
of zeros: for any distinct j1 , . . . , jn ∈ I, we have
vj1 + vj2 + · · · + vjn = 0 and vji ∈ Sji for each i =⇒ vj1 = · · · = vjn = 0
(3) Each v ∈ V has a unique expression as a sum of distinct vji ∈ Sji \{0},

v = vj1 + vj2 + · · · + vjn

(4) If γi is a basis for Si , then γ = ∪_{i∈I} γi is a basis for V . If V is finite-dimensional, then this
may be restated in terms of ordered bases γi and γ.
Proof: (1) =⇒ (2): Suppose 1 is true, that is V = ⊕_{i∈I} Si , and choose vectors v1 , . . . , vk from
distinct summands, say vi ∈ Si after relabeling, with v1 + · · · + vk = 0. Then for any j,

−vj = Σ_{i≠j} vi ∈ Σ_{i≠j} Si

However, −vj ∈ Sj , whence

−vj ∈ Sj ∩ Σ_{i≠j} Si = {0}

so that vj = 0. This is true for all j = 1, . . . , k, so v1 = · · · = vk = 0, proving 2.
(2) =⇒ (3): Suppose 2 is true and let v ∈ V = Σ_{i∈I} Si be given by

v = u1 + · · · + un = w1 + · · · + wm

for some ui ∈ Ski and wj ∈ Skj . Then, grouping terms from the same subspaces,

0 = v − v = (ui1 − wi1 ) + · · · + (uik − wik ) + uik+1 + · · · + uin − wik+1 − · · · − wim
=⇒ ui1 = wi1 , . . . , uik = wik and uik+1 = · · · = uin = wik+1 = · · · = wim = 0

proving uniqueness.
(3) =⇒ (4): Suppose each vector v ∈ V = Σ_{i∈I} Si can be uniquely written as v = v1 + · · · + vk ,
for vi ∈ Sji . For each i ∈ I let γi be an ordered basis for Si , so that since V = Σ_{i∈I} Si , we must
have that V = span(γ) = span(∪_{i∈I} γi ), so we only need to show that γ is linearly independent.
To that end, let

v11 , v12 , . . . , v1n1 ∈ γi1
. . .
vm1 , vm2 , . . . , vmnm ∈ γim

for any bases γij for Sij ∈ F, then let aij ∈ F for i = 1, . . . , m and j = 1, . . . , ni , and let
wi = Σ_{j=1}^{ni} aij vij . Then, suppose

w1 + · · · + wm = Σ_{i,j} aij vij = 0
Since 0 ∈ ∩_{i=1}^{m} Sji and 0 = 0 + · · · + 0, by the assumed uniqueness of expression of v ∈ V by wi ∈ Si
we must have wi = 0 for all i, and since all the γi are linearly independent, we must have aij = 0
for all i, j.
(4) =⇒ (1): If 4 is true, then there exist ordered bases γi for the Si such that γ = ∪_{i∈I} γi is an ordered
basis for V . By Corollary 1.8 and Theorem 1.22,

V = span(γ) = span(∪_{i∈I} γi ) = Σ_{i∈I} span(γi ) = Σ_{i∈I} Si
Choose any j ∈ I and suppose for a nonzero vector v ∈ V we have v ∈ Sj ∩ Σ_{i≠j} Si . Then,

v ∈ Sj ∩ Σ_{i≠j} Si = span(γj ) ∩ span(∪_{i≠j} γi )

which means v is a nontrivial linear combination of elements in γj and of elements in ∪_{i≠j} γi , so that
v can be expressed as a linear combination of ∪_{i∈I} γi in more than one way, which contradicts
uniqueness of representation of vectors in V in terms of a basis for V , Theorem 1.13. Consequently,
any such v must be 0. That is,

Sj ∩ Σ_{i≠j} Si = {0}

and the sum is direct, i.e. V = ⊕_{i∈I} Si , proving 1. ∎
2 Linear Transformations

2.1 Definitions

2.1.1 Linear Transformations, or Vector Space Homomorphisms
If V and W are vector spaces over the same field F , a function T : V → W is called a linear
transformation or vector space homomorphism if it preserves the vector space structure, that
is if for all vectors x, y ∈ V and all scalars a, b ∈ F we have
T (x + y) = T (x) + T (y)
T (ax) = aT (x)
or equivalently
T (ax + by) = aT (x) + bT (y)
(2.1)
The set of all linear transformations from V to W is denoted
L(V, W )
or
Hom(V, W )
(2.2)
Linear transformations, like group and ring homomorphisms, come in different types:
1. linear operator (vector space endomorphism) on V – if V is a vector space over F , a
linear operator, or vector space endomorphism, is a linear transformation T ∈ L(V, V ). The
set of all linear operators, or vector space endomorphisms, is denoted
L(V )
or
End(V )
(2.3)
2. monomorphism (embedding) – a 1-1 or injective linear transformation.
3. epimorphism – an onto or surjective linear transformation.
4. isomorphism (invertible linear transformation) from V to W – a bijective (and therefore invertible) linear transformation T ∈ L(V, W ), i.e. a monomorphism which is also an
epimorphism. If such an isomorphism exists, V is said to be isomorphic to W , and this
relationship is denoted
V ≅ W   (2.4)
The set of all linear isomorphisms from V to W is denoted
GL(V, W )
(2.5)
by analogy with the general linear group.
5. automorphism – a bijective linear operator. The set of all automorphisms of V is
denoted
Aut(V )
or
GL(V )
(2.6)
Some common and important linear transformations are:
1. The identity transformation, IV : V → V, given by IV (x) = x for all x ∈ V . Its linearity
is obvious: IV (ax + by) = ax + by = aIV (x) + bIV (y).
2. The zero transformation, T0 : V → W, given by T0 (x) = 0 for all x ∈ V . Sometimes this is
denoted by 0 if the context is clear, like the zero polynomial 0 ∈ F [x]. This is also clearly a
linear transformation.
3. An idempotent operator T ∈ L(V ) satisfies T 2 = T .
A crucial concept, which will allow us to classify and so distinguish operators, is that of similarity
of operators. Two linear operators T, U ∈ L(V ) are said to be similar, denoted
T ∼U
(2.7)
if there exists an automorphism φ ∈ Aut(V ) such that
T = φ ◦ U ◦ φ−1
(2.8)
Similarity is an equivalence relation on L(V ):
1. T ∼ T , because T = I ◦ T ◦ I −1
2. T ∼ U implies T = φ ◦ U ◦ φ−1 for some automorphism φ ∈ L(V ), so that U = φ−1 ◦ T ◦ φ, or
U = ψ ◦ T ◦ ψ −1 , where ψ = φ−1 is an automorphism. Thus U ∼ T .
3. S ∼ T and T ∼ U implies S = φ◦T ◦φ−1 and T = ψ◦U ◦ψ −1 where φ and ψ are automorphisms,
so
S = φ ◦ T ◦ φ−1 = φ ◦ (ψ ◦ U ◦ ψ −1 ) ◦ φ−1 = (φ ◦ ψ) ◦ U (φ ◦ ψ)−1
and since φ ◦ ψ is an automorphism of V , we have S ∼ U .
This partitions L(V ) into equivalence classes. As we will see, each equivalence class corresponds to
a single matrix representation, depending only on the choice of basis for V , which may be different
for each operator.
The kernel or null space of a linear transformation T ∈ L(V, W ) is the pre-image of 0 ∈ W under
T:
ker(T ) = T −1 (0)
(2.9)
i.e. ker(T ) = {x ∈ V | T (x) = 0}. As we will show below, ker(T ) is a subspace of V .
The range or image of a linear transformation T ∈ L(V, W ) is the range of T as a set function: it
is variously denoted by R(T ), T (V ) or im(T ),
R(T ) = im(T ) = T (V ) := {T (x) | x ∈ V }
(2.10)
As we show below, R(T ) is a subspace of W . The rank of a linear transformation T ∈ L(V, W ) is
the dimension of its range:
rank(T ) := dim R(T )
(2.11)
The nullity of T ∈ L(V, W ) is the dimension of its kernel:
null(T ) := dim ker(T )
(2.12)
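For the concrete case of a matrix map TA ∈ L(F n , F m ), TA (x) = Ax, the rank and nullity can be computed directly; the sketch below (an added illustration, with an arbitrarily chosen A) reads a kernel basis off the SVD.

```python
import numpy as np

# T = T_A : R^4 -> R^3, T_A(x) = A x.  rank(T) = dim im(T) and null(T) = dim ker(T).
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 1.0],
              [1.0, 3.0, 1.0, 2.0]])   # third row = row1 + row2, so rank 2

rank = np.linalg.matrix_rank(A)

# ker(T): right singular vectors whose singular values are (numerically) zero.
_, s, Vt = np.linalg.svd(A)
tol = max(A.shape) * np.finfo(float).eps * s[0]
null_basis = Vt[np.sum(s > tol):].T       # columns form a basis of ker(T)
nullity = null_basis.shape[1]

print(rank, nullity)                      # 2 2
assert np.allclose(A @ null_basis, 0)     # the basis vectors really lie in the kernel
```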
Given an operator T ∈ L(V ), a subspace S of V is said to be T -invariant if v ∈ S =⇒ T (v) ∈ S,
that is if T (S) ⊆ S. The restriction of T to a T -invariant subspace S is denoted TS , and of course
TS ∈ L(S) is a linear operator on S. In the case that V = R ⊕ S decomposes as a direct sum of
T -invariant subspaces R and S, we say the pair (R, S) reduces T , or is a reducing pair for T , and
we say that T is the direct sum of operators TR and TS , denoting this by
T = TR ⊕ TS
(2.13)
The expression T = τ ⊕ σ ∈ L(V ) means there are subspaces R and S of V for which (R, S) reduces
T and τ = TR and σ = TS .
If V is a direct sum of subspaces V = R ⊕ S, then the function πR,S : V → V given by
πR,S (v) = πR,S (r + s) = r   (2.14)

where r ∈ R and s ∈ S, is called the projection onto R along S.
Example 2.1 The projection on the y-axis along the x-axis is the map πRy ,Rx : R2 → R2 given by
πRy ,Rx (x, y) = (0, y).

Example 2.2 The projection on the y-axis along the line L = {(s, s) | s ∈ R} is the map πRy ,L :
R2 → R2 given by πRy ,L (x, y) = πRy ,L ((x, x) + (0, y − x)) = (0, y − x).
Any projection πR,S is linear, i.e. πR,S ∈ L(V ):
πR,S (au + bv)
= πR,S (a(r1 + s1 ) + b(r2 + s2 ))
= πR,S ((ar1 + br2 ) + (as1 + bs2 ))
= ar1 + br2
= aπR,S (u) + bπR,S (v)
And clearly
R = im(πR,S )
and
S = ker(πR,S )
(2.15)
Two projections ρ, σ ∈ L(V ) are said to be orthogonal, denoted

ρ ⊥ σ   (2.16)

if

ρ ◦ σ = σ ◦ ρ = 0 ≡ T0   (2.17)
This is of course by analogy with orthogonal vectors v ⊥ w, for which we have hv, wi = hw, vi = 0.
We use projections to define a resolution of the identity, a sum of the form π1 + · · · + πk = I ∈
L(V ), where the πi are pairwise orthogonal projections, that is πi ⊥ πj if i ≠ j. This occurs when
V = S1 ⊕ S2 ⊕ · · · ⊕ Sk , πi = π_{Si , ⊕_{j≠i} Sj} , and πi ⊥ πj for i ≠ j. Resolutions of the identity will be
important in establishing the spectral resolution of a linear operator.
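As a concrete sketch (added here) tying together Example 2.2, orthogonality in the sense of (2.17), and resolutions of the identity: in R², the projection onto the y-axis along L = {(s, s)} and the projection onto L along the y-axis are given by explicit matrices; each is idempotent, their composites vanish, and they sum to the identity.

```python
import numpy as np

# In R^2, write v = (x, y) = (x, x) + (0, y - x), with (x, x) in L = {(s, s)} and (0, y - x)
# on the y-axis.  The two projections are then given by the matrices below.
P_y_along_L = np.array([[0.0, 0.0],
                        [-1.0, 1.0]])    # (x, y) -> (0, y - x): onto the y-axis along L
P_L_along_y = np.array([[1.0, 0.0],
                        [1.0, 0.0]])     # (x, y) -> (x, x): onto L along the y-axis

I = np.eye(2)
assert np.allclose(P_y_along_L @ P_y_along_L, P_y_along_L)   # idempotent: a projection
assert np.allclose(P_L_along_y @ P_L_along_y, P_L_along_y)   # idempotent: a projection
assert np.allclose(P_y_along_L @ P_L_along_y, 0)             # orthogonal in the sense of (2.17)
assert np.allclose(P_L_along_y @ P_y_along_L, 0)
assert np.allclose(P_y_along_L + P_L_along_y, I)             # a resolution of the identity
```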
2.2 Basic Properties of Linear Transformations
Theorem 2.3 If V and W are vector spaces over the same field F and T ∈ L(V, W ), then ker(T )
is a subspace of V and im(T ) is a subspace of W . Moreover, if T ∈ L(V ), then {0}, ker(T ), im(T )
and V are T -invariant.
Proof: (1) If x, y ∈ ker(T ) and a, b ∈ F , we have T (ax + by) = aT (x) + bT (y) = a0 + b0 = 0 so
ax + by ∈ ker(T ), and the statement follows by Theorem 1.3. If x, y ∈ im(T ) and a, b ∈ F , then
∃u, v ∈ V such that T (u) = x and T (v) = y, whence ax + by = aT (u) + bT (v) = T (au + bv) ∈ im(T ).
(2) Clearly {0} and V are T -invariant, the first because T (0) = 0, the second because the codomain
of T is V . If x ∈ im(T ), then of course x ∈ V , so that T (x) ∈ im(T ), while if x ∈ ker(T ), then
T (x) = 0 ∈ ker(T ).
Theorem 2.4 If V and W are vector spaces and B = {vi | i ∈ I} is a basis for V , then for any
T ∈ L(V, W ) we have
im(T ) = span(T (B))
(2.18)
Proof: T (vi ) ∈ im(T ) for each i ∈ I by definition, and because im(T ) is a subspace, we have
span(T (B)) ⊆ im(T ). Moreover, span(T (B)) is a subspace by Theorem 1.6. For the reverse inclusion,
choose any w ∈ im(T ). Then ∃v ∈ V such that w = T (v), so that by Theorem 1.13 ∃!a1 , . . . , an ∈ F
such that v = a1 v1 + · · · + an vn , where by vi we mean vi ≡ vji . Consequently,
w = T (v) = T (a1 v1 + · · · + an vn ) = a1 T (v1 ) + · · · + an T (vn ) ∈ span(T (B))
Theorem 2.5 (A Linear Transformation Is Defined by Its Action on a Basis) Let V and
W be vector spaces over the same field F . If B = {vi | i ∈ I} is a basis for V , then we can define a
unique linear transformation T ∈ L(V, W ) by arbitrarily specifying the values wi = T (vi ) ∈ W for
each i ∈ I and extending T by linearity, i.e. specifying that for all a1 , . . . , an ∈ F our T satisfy
T (a1 v1 + · · · + an vn )
= a1 T (v1 ) + · · · + an T (vn )
(2.19)
= a1 w1 + · · · + an wn
Proof: By Theorem 1.13: ∀v ∈ V , ∃a1 , . . . , an ∈ F such that v = a1 v1 + · · · + an vn . Defining
T ∈ W V by (2.19) we have the following results:
(1) T ∈ L(V, W ): For any u, v ∈ V , there exist unique a1 , . . . , an , b1 , . . . , bm ∈ F and u1 , . . . , un ,
v1 , . . . , vm ∈ B such that
u = a1 u1 + · · · + an un
v = b1 v1 + · · · + bm vm
If w1 , . . . , wn , z1 , . . . , zm ∈ W are the vectors for which we have specified
T (u) = a1 T (u1 ) + · · · + an T (un )
T (v) = b1 T (v1 ) + · · · + bm T (vm )
= a1 w1 + · · · + an wn
= b1 z1 + · · · + bm zm
then, for all a, b ∈ F we have
T (au + bv)
= T (a(a1 u1 + · · · + an un ) + b(b1 v1 + · · · + bm vm ))
= aa1 T (u1 ) + · · · + aan T (un ) + bb1 T (v1 ) + · · · + bbm T (vm )
= aT (a1 u1 + · · · + an un ) + bT (b1 v1 + · · · + bm vm )
= aT (u) + bT (v)
(2) T is unique: suppose ∃U ∈ L(V, W ) such that U (vi ) = wi = T (vi ) for all i ∈ I. Then for any
v ∈ V we have v = Σ_{i=1}^{n} ai vi for unique ai ∈ F , so that

U (v) = Σ_{i=1}^{n} ai U (vi ) = Σ_{i=1}^{n} ai wi = T (v)
and U = T .
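Theorem 2.5 is what makes matrix representations possible: a linear map is pinned down by the images of a basis. The sketch below (added for illustration; the basis of R³ and the images in R² are arbitrary choices) extends such a prescription by linearity.

```python
import numpy as np

# A basis (v1, v2, v3) of R^3 (columns of V_basis) and arbitrarily chosen images
# w_i = T(v_i) in R^2 (columns of W_images).  By Theorem 2.5 this determines a unique
# linear T; its standard matrix is W_images @ V_basis^{-1}.
V_basis = np.array([[1.0, 1.0, 0.0],
                    [0.0, 1.0, 1.0],
                    [0.0, 0.0, 1.0]])           # columns v1, v2, v3 (invertible, so a basis)
W_images = np.array([[1.0, 0.0, 2.0],
                     [0.0, 3.0, 1.0]])          # columns w1, w2, w3

A = W_images @ np.linalg.inv(V_basis)           # the unique linear extension

assert np.allclose(A @ V_basis, W_images)       # T(vi) = wi

# Extension by linearity: x = a1 v1 + a2 v2 + a3 v3 is sent to a1 w1 + a2 w2 + a3 w3.
x = 2.0 * V_basis[:, 0] - 1.0 * V_basis[:, 1] + 0.5 * V_basis[:, 2]
print(A @ x, 2.0 * W_images[:, 0] - 1.0 * W_images[:, 1] + 0.5 * W_images[:, 2])
```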
Theorem 2.6 If V, W, Z are vector spaces over the same field F and T ∈ L(V, W ) and U ∈
L(W, Z), then U ◦ T ∈ L(V, Z).
Proof: If v, w ∈ V and a, b ∈ F , then

(U ◦ T )(av + bw) = U (T (av + bw)) = U (aT (v) + bT (w)) = aU (T (v)) + bU (T (w)) = a(U ◦ T )(v) + b(U ◦ T )(w)
so U ◦ T ∈ L(V, Z).
Theorem 2.7 If V and W are vector spaces over F , then L(V, W ) is a vector space over F under
ordinary addition and scalar multiplication of functions.
Proof: First, note that if T, U ∈ L(V, W ) and a, b, c, d ∈ F , then ∀v, w ∈ V
(aT + bU )(cv + dw)
= aT (cv + dw) + bU (cv + dw)
= acT (v) + adT (w) + bcU (v) + bdU (w)
= c(aT (v) + bU (v)) + d(aT (w) + bU (w))
= c(aT + bU )(v) + d(aT + bU )(w)
so (aT + bU ) ∈ L(V, W ), and L(V, W ) is closed under addition and scalar multiplication. Moreover,
L(V, W ) satisfies VS1-8 because W is a vector space, and addition and scalar multiplication in
L(V, W ) are defined pointwise for values which any functions S, T, U ∈ L(V, W ) take in W .
2.3 Monomorphisms and Isomorphisms
Theorem 2.8 If V, W, Z are vector spaces over the same field F and T ∈ GL(V, W ) and U ∈
GL(W, Z) are isomorphisms, then
(1) T −1 ∈ L(W, V )
(2) (T −1 )−1 = T , and hence T −1 ∈ GL(W, V ).
(3) (U ◦ T )−1 = T −1 ◦ U −1 ∈ GL(Z, V ).
Proof: (1) If w, z ∈ W and a, b ∈ F , since T is bijective ∃!u, v ∈ V such that T (u) = w and
T (v) = z. Hence T −1 (w) = u and T −1 (z) = v, so that
T −1 (aw + bz) = T −1 (aT (u) + bT (v)) = T −1 (T (au + bv))
= au + bv = aT −1 (w) + bT −1 (z)
(2) T −1 ◦ T = IV and T ◦ T −1 = IW shows that T is the inverse of T −1 , that is T = (T −1 )−1 .
(3) First, any composition of bijections is a bijection, so U ◦ T is bijective. Moreover,
(U ◦ T ) ◦ (U ◦ T )−1 = IZ = U ◦ U −1 = U ◦ IW ◦ U −1 = U ◦ (T ◦ T −1 ) ◦ U −1 = (U ◦ T ) ◦ (T −1 ◦ U −1 )
and
(U ◦ T )−1 ◦ (U ◦ T ) = IV = T −1 ◦ T = T −1 ◦ IW ◦ T = T −1 ◦ (U −1 ◦ U ) ◦ T = (T −1 ◦ U −1 ) ◦ (U ◦ T )
so by the cancellation law and the fact that L(V, Z) is a vector space, we must have (U ◦ T )−1 =
T −1 ◦ U −1 ∈ L(Z, V ) and by (1) it’s an isomorphism. Alternatively, we can get (3) from (2) via
Theorem 2.5:
U ◦ T : V → Z, so (U ◦ T )−1 : Z → V
and of course
T −1 ◦ U −1 : Z → V
Let α, β and γ be bases for V , W and Z, respectively, such that
T (α) = β and U (β) = U (T (α)) = (U ◦ T )(α) = γ, so that (U ◦ T )−1 (γ) = α
which is possible by Theorem 2.10. But we also have
U −1 (γ) = β and T −1 (β) = α
so that
(T −1 ◦ U −1 )(γ) = α
By Theorem 2.5 (uniqueness) (U ◦ T )−1 = T −1 ◦ U −1 and by (1) it’s an isomorphism.
Theorem 2.9 If V and W are vector spaces over the same field F and T ∈ L(V, W ), then T is
injective iff ker(T ) = {0}.
Proof: If T is 1-1 and x ∈ ker(T ), then T (x) = 0 = T (0), so that x = 0, whence ker(T ) = {0}.
Conversely, if ker(T ) = {0} and T (x) = T (y), then,
0 = T (x) − T (y) = T (x − y) =⇒ x − y = 0
or x = y, and so T is 1-1.
Theorem 2.10 Let V and W be vector spaces over the same field and S ⊆ V . Then for any
T ∈ L(V, W ),
(1) T is injective iff it carries linearly independent sets into linearly independent sets.
(2) T is injective implies S ⊆ V is linearly independent iff T (S) ⊆ W is linearly independent.
Proof: (1) If T is 1-1 and S ⊆ V is linearly independent, then for any v1 , . . . , vn ∈ S and a1 , . . . , an ∈ F with

a1 T (v1 ) + · · · + an T (vn ) = 0W = T (0V )

we have T (a1 v1 + · · · + an vn ) = T (0V ), so by injectivity a1 v1 + · · · + an vn = 0V , and hence a1 = · · · = an = 0,
whence {T (v1 ), . . . , T (vn )} is linearly independent, and since this is true for all vi ∈ S, T (S) is linearly independent, so T carries linearly independent sets into linearly independent sets. Conversely,
if T carries linearly independent sets into linearly independent sets, let B be a basis for V and
suppose T (u) = T (v) for some u, v ∈ V . Since u = a1 u1 + · · · + an un and v = b1 v1 + · · · + bm vm for
unique ai , bi ∈ F and u1 , . . . , un , v1 , . . . , vm ∈ B, we have
0 = T (u) − T (v) = T (u − v)
= T ((ai1 − bj1 )vi1 + · · · + (aik − bjk )vik + aik+1 uik+1 + · · · + ain uin − bik+1 vik+1 − · · · − bim vim )
= (ai1 − bj1 )T (vi1 ) + · · · + (aik − bjk )T (vik ) + aik+1 T (uik+1 ) + · · · + ain T (uin ) − bik+1 T (vik+1 ) − · · · − bim T (vim )
so that aik+1 = · · · = ain = bik+1 = · · · = bim = 0, which means m = n = k and therefore ai1 = bj1 ,
. . . , aik = bjk . Consequently
u = ai1 vi1 + · · · + aik vik = v
whence T is 1-1.
(2) If T is 1-1 and S ⊆ V is linearly independent, then by (1) T (S) is linearly independent. Conversely, if T is 1-1 and T (S) is linearly independent, then for K = {v1 , . . . , vk } ⊆ S ⊆ V we have
that
T (0) = 0 = a1 T (v1 ) + · · · + ak T (vk ) = T (a1 v1 + · · · + ak vk )
implying simultaneously that a1 v1 + · · · + ak vk = 0 and a1 = · · · = ak = 0, so K is linearly
independent. But this is true for all such K ⊆ S, so S is linearly independent.
Theorem 2.11 If V and W are vector spaces over the same field F and T ∈ GL(V, W ), then for
any S ⊆ V we have
(1) V = span(S) iff W = span(T (S)).
(2) S is linearly independent in V iff T (S) is linearly independent in W .
(3) S is a basis for V iff T (S) is a basis for W .
Proof: (1) V = span(S) ⇐⇒ W = im(T ) = T (span(S)) = span(T (S)), using T ∈ GL(V, W ).
(2) S linearly independent means for any s1 , . . . , sn ∈ S we have Σ_{i=1}^{n} ai si = 0 ⇐⇒ ai = 0 for all i,
which implies

T (Σ_{i=1}^{n} ai si ) = Σ_{i=1}^{n} ai T (si ) = 0 = T (0)   =⇒ (T ∈ GL(V, W ))   Σ_{i=1}^{n} ai si = 0   =⇒ (S lin. indep.)   a1 = · · · = an = 0
so T (S) is linearly independent, since this is true for all si ∈ S. Conversely, if T (S) is linearly
independent we have for any T (s1 ), . . . , T (sn ) ∈ T (S)

0 = Σ_{i=1}^{n} ai T (si ) = T (Σ_{i=1}^{n} ai si ) = T (0)   =⇒ (T ∈ GL(V, W ))   Σ_{i=1}^{n} ai si = 0   =⇒ (T (S) lin. indep.)   a1 = · · · = an = 0

so S is linearly independent.
(3) S basis for V ⇐⇒ T (S) linearly independent in W and W = span(T (S)) (by (1) and (2)) =⇒ T (S) is a
basis for W . We could also say that since T −1 is an isomorphism by Theorem 2.8, we have that
since T (S) is a basis for W , S = T −1 (T (S)) is a basis for V .
Theorem 2.12 (Isomorphisms Preserve Bases) If V and W are vector spaces over the same
field F and B is a basis for V , then T ∈ GL(V, W ) iff T (B) is a basis for W .
Proof: If T ∈ GL(V, W ) is an isomorphism, then T is bijective, so by Theorem 2.11 T (B) is a basis
for W . Conversely, if T (B) is a basis for W , then ∀u ∈ V , ∃!a1 , . . . , an ∈ F and v1 , . . . , vn ∈ B such
that u = a1 v1 + · · · + an vn . Therefore,
0 = T (u) = T (a1 v1 + · · · + an vn ) = a1 T (v1 ) + · · · + an T (vn ) =⇒ a1 = · · · = an = 0
so ker(T ) = {0}, whence by Theorem 2.9 T is injective. But since span(T (B)) = W , we have for
all w ∈ W that ∃!a1 , . . . an ∈ F such that w = a1 T (v1 ) + · · · + an T (vn ) = T (a1 v1 + · · · + an vn ), so
∃u = a1 v1 + · · · + an vn ∈ V such that w = T (u), and W ⊆ im(T ). We obviously have im(T ) ⊆ W ,
and so T is surjective. Therefore it is bijective and an isomorphism.
Theorem 2.13 (Isomorphisms Preserve Dimension) If V and W are vector spaces over the
same field F , then
V ≅ W ⇐⇒ dim(V ) = dim(W )   (2.20)

Proof: V ≅ W =⇒ ∃T ∈ GL(V, W ), so B basis for V =⇒ T (B) basis for W , and dim(V ) = |B| =
|T (B)| = dim(W ). Conversely, if dim(V ) = |B| = |C| = dim(W ), where C is a basis for W , then
∃T : B → C bijective. Extending T to V by linearity defines a unique T ∈ L(V, W ) by Theorem
2.5 and T is an isomorphism because it is surjective, im(T ) = W , and injective, ker(T ) = {0}, so
V ≅ W . ∎
Corollary 2.14 If V and W are vector spaces over the same field and T ∈ L(V, W ), then dim(V ) <
dim(W ) =⇒ T cannot be onto and dim(V ) > dim(W ) =⇒ T cannot be 1-1.
Corollary 2.15 If V is an n-dimensional vector space over F , where n ∈ N, then
V ≅ F n   (2.21)

If κ is any cardinal number, B is a set of cardinality κ and V is a κ-dimensional vector space over
F , then

V ≅ (F B )0   (2.22)

Proof: As a result of the fact that dim(F n ) = n and dim((F B )0 ) = κ, by the previous theorem. ∎

Corollary 2.16 If V and W are vector spaces over the same field F and T ∈ GL(V, W ), then for
any subspace S of V we have that T (S) is a subspace of W and dim(S) = dim(T (S)).
Proof: If S is a subspace of V , then ∀x, y ∈ S and a, b ∈ F , we have ax+by ∈ S, and T (x), T (y) ∈ T (S)
and T (ax+by) = aT (x)+bT (y) ∈ T (S) imply that T (S) is a subspace of W , while dim(S) = dim(T (S))
follows from the theorem.
2.4 Rank-Nullity Theorem
Lemma 2.17 If V and W are vector spaces over the same field F and T ∈ L(V, W ), then any
complement of the kernel of T is isomorphic to the range of T , that is
V = ker(T ) ⊕ ker(T )c   =⇒   ker(T )c ≅ im(T )   (2.23)
where ker(T )c is any complement of ker(T ).
Proof: By Theorem 1.23 dim(V ) = dim ker(T ) + dim ker(T )c . Let
T c = T |ker(T )c : ker(T )c → im(T )
and note that T c is injective by Theorem 2.9 because

ker(T c ) = ker(T ) ∩ ker(T )c = {0}
Moreover T c is surjective because T c (ker(T )c ) = im(T ): first we obviously have T c (ker(T )c ) ⊆ im(T ),
while for the reverse inclusion suppose v ∈ im(T ). Then v = T (u) for some u ∈ V , and by Theorem
1.25 ∃!s ∈ ker(T ) and t ∈ ker(T )c such that u = s + t, so that

v = T (u) = T (s + t) = T (s) + T (t) = 0 + T (t) = T c (t) ∈ T c (ker(T )c )

so im(T ) = T c (ker(T )c ) and T c is surjective, so that T c is an isomorphism, whence

ker(T )c ≅ im(T ). ∎
Theorem 2.18 (Rank-Nullity Theorem) If V and W are vector spaces over the same field F
and T ∈ L(V, W ), then
dim(V ) = rank(T ) + null(T )
(2.24)
Proof: By the lemma (and Corollary 1.16, which provides a complement ker(T )c ) we have
ker(T )c ≅ im(T ), which by the previous theorem implies
dim ker(T )c = dim im(T ) = rank(T ), while by Theorem 1.23 V = ker(T ) ⊕ ker(T )c implies
dim(V ) = dim ker(T ) + dim ker(T )c = dim ker(T ) + dim im(T ) = null(T ) + rank(T )
which completes the proof.
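For matrix maps TA : R^n → R^m the theorem says rank(A) + null(A) = n; a quick numerical check (added here as a sketch, with deliberately rank-deficient random matrices) follows.

```python
import numpy as np

rng = np.random.default_rng(2)

# Rank-Nullity for matrix maps T_A : R^n -> R^m: rank(A) + null(A) = n = dim(domain).
for m, n, k in [(3, 5, 2), (5, 3, 1), (4, 4, 3)]:
    A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))   # rank k almost surely
    rank = np.linalg.matrix_rank(A)
    _, s, Vt = np.linalg.svd(A)
    tol = max(A.shape) * np.finfo(float).eps * s[0]
    null_basis = Vt[np.sum(s > tol):].T          # columns span ker(T_A)
    assert np.allclose(A @ null_basis, 0)
    assert rank + null_basis.shape[1] == n       # the Rank-Nullity Theorem
    print(f"A is {m}x{n}: rank={rank}, nullity={null_basis.shape[1]}")
```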
Corollary 2.19 Let V and W be vector spaces over the same field F and T ∈ L(V, W ). If
dim(V ) = dim(W ), then the following are equivalent:
(1) T is injective.
(2) T is surjective.
(3) rank(T ) = dim(V )
Proof: By the Rank-Nullity Theorem, rank(T ) + null(T ) = dim(V ), and by Theorem 2.9 we have

T is 1-1   ⇐⇒ (Thm 2.9)   ker(T ) = {0}   ⇐⇒   null(T ) = 0   ⇐⇒ (R-N Thm)   dim(im(T )) = rank(T ) = dim(V ) = dim(W ) (by assumption)   ⇐⇒ (Thm 1.21.1)   im(T ) = W   ⇐⇒   T is onto

which completes the proof. ∎
Theorem 2.20 For any linear operator T ∈ L(V ) we have that
ker(T ) ⊆ ker(T 2 ) ⊆ · · ·
(2.25)
Moreover, if there is an m ∈ N for which rank(T m ) = rank(T m+1 ), then for all n ∈ N we have
rank(T m ) = rank(T m+n )   and   ker(T m ) = ker(T m+n )   (2.26)
Proof: (1) Let n ∈ N and v ∈ ker(T n ), so that T n (v) = 0. Then, 0 = T (0) = T (T n (v)) = T n+1 (v),
so v ∈ ker(T n+1 ), which proves (2.25).
(2) If there is an m ∈ N such that rank(T m ) = rank(T m+1 ), then by the Rank-Nullity Theorem null(T m ) = null(T m+1 ), which by (1) shows that ker(T m ) = ker(T m+1 ) and hence T m (V ) =
T m+1 (V ). But then T |ker(T m ) = T0 ∈ L(ker(T m )) and T |T m (V ) must be bijective. Consequently,
(T |T m (V ) )(v) ≠ 0 if v ≠ 0, so (T |T m (V ) )n (v) ≠ 0 if v ≠ 0 for all n ∈ N. This shows that (T |T m (V ) )n (v) = 0 iff
v ∈ ker(T m ) for all n ∈ N, so ker(T m+n ) = ker(T m ) and T m (V ) = T m+n (V ) for all n ∈ N, proving
(2.26).
3 Matrix Representations of Linear Transformations

3.1 Definitions
If β = (v1 , . . . , vn ) is an ordered basis for a finite-dimensional vector space V over F , then by
Theorem 1.13 we know that for any vector x ∈ V there are unique scalars a1 , . . . , an ∈ F such that
x = a1 v1 + · · · + an vn . The coordinate vector of x ∈ V with respect to (relative to) β is defined
to be the (column) vector in F n consisting of the scalars ai :

[x]β := (a1 , a2 , . . . , an )ᵀ   (3.1)
and the coordinate map, also called the standard representation of V with respect to β,
φβ : V → F n   (3.2)

is given by

φβ (x) = [x]β   (3.3)
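Computing [x]β amounts to solving a linear system: if the basis vectors form the columns of a matrix B, then x = B [x]β , so [x]β = B⁻¹x. The sketch below (added for illustration, with an arbitrarily chosen basis of R³) also checks the linearity of φβ.

```python
import numpy as np

# Coordinate map phi_beta for V = R^3 with ordered basis beta = (b1, b2, b3),
# stored as the columns of B.  Since x = B [x]_beta, the coordinates are obtained
# by solving a linear system.
B = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0]])               # an (arbitrarily chosen) basis of R^3

def coords(x):
    return np.linalg.solve(B, x)              # [x]_beta

x = np.array([2.0, 3.0, 5.0])
cx = coords(x)
assert np.allclose(B @ cx, x)                 # x = sum_i (cx)_i * b_i

# phi_beta is linear: [ax + by]_beta = a [x]_beta + b [y]_beta.
y = np.array([1.0, -1.0, 0.0])
a, b = 2.0, -3.0
assert np.allclose(coords(a * x + b * y), a * coords(x) + b * coords(y))
```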
Let V and W be finite dimensional vector spaces over the same field F with ordered bases β =
(v1 , . . . , vn ) and γ = (w1 , . . . , wm ), respectively. If there exist (and there do exist) unique scalars
aij ∈ F such that
T (vj ) = Σ_{i=1}^{m} aij wi   for j = 1, . . . , n   (3.4)
then the matrix representation of a linear transformation T ∈ L(V, W ) in the ordered
bases β and γ is the m × n matrix A defined by Aij = aij ,
A = [T ]γβ := [ [T (v1 )]γ  [T (v2 )]γ  · · ·  [T (vn )]γ ] =

⎡ a11  a12  · · ·  a1n ⎤
⎢ a21  a22  · · ·  a2n ⎥
⎢  ⋮    ⋮    ⋱    ⋮  ⎥
⎣ am1  am2  · · ·  amn ⎦
Here are some examples:
1. If T ∈ L(R2 , R3 ) is given by T (x1 , x2 ) = (x1 + 3x2 , 0, 2x1 − 4x2 ), and β = (e1 , e2 ) and
γ = (e1 , e2 , e3 ) are the standard ordered bases for R2 and R3 , respectively, then the matrix
representation of T is found as follows:

    T (e1) = T (1, 0) = (1, 0, 2)  = 1e1 + 0e2 + 2e3
    T (e2) = T (0, 1) = (3, 0, −4) = 3e1 + 0e2 − 4e3

                        [ 1   3 ]
    =⇒      [T ]γβ  =   [ 0   0 ]
                        [ 2  −4 ]
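As a quick sanity check of this computation (a sketch in Python/numpy; the test vector is arbitrary), the matrix above reproduces T on coordinate vectors:

    import numpy as np

    # The matrix [T]^gamma_beta computed in Example 1 (standard bases).
    T_mat = np.array([[1.,  3.],
                      [0.,  0.],
                      [2., -4.]])

    def T(x1, x2):
        return np.array([x1 + 3*x2, 0., 2*x1 - 4*x2])

    x = np.array([5., -2.])
    print(T(*x))        # [-1.  0. 18.]
    print(T_mat @ x)    # the same vector: [T(x)]_gamma = [T]^gamma_beta [x]_beta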
2. If T ∈ L(R3[x], R2[x]) is given by T (f (x)) = f ′(x), and β = (1, x, x^2, x^3) and γ = (1, x, x^2) are
the standard ordered bases for R3[x] and R2[x], respectively, the matrix representation of T is
found as follows:

    T (1)   = 0 · 1 + 0x + 0x^2
    T (x)   = 1 · 1 + 0x + 0x^2
    T (x^2) = 0 · 1 + 2x + 0x^2
    T (x^3) = 0 · 1 + 0x + 3x^2

                        [ 0  1  0  0 ]
    =⇒      [T ]γβ  =   [ 0  0  2  0 ]
                        [ 0  0  0  3 ]
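The same check works for the derivative map, viewing a polynomial as its coefficient vector relative to β. A brief Python/numpy sketch with an arbitrary cubic:

    import numpy as np

    # The matrix of differentiation from Example 2 (standard bases of R_3[x] and R_2[x]).
    D = np.array([[0., 1., 0., 0.],
                  [0., 0., 2., 0.],
                  [0., 0., 0., 3.]])

    # f(x) = 4 - x + 5x^2 + 2x^3, whose derivative is -1 + 10x + 6x^2.
    f_beta = np.array([4., -1., 5., 2.])
    print(D @ f_beta)   # [-1. 10.  6.], the gamma-coordinates of f'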
3.2   Matrix Representations of Linear Transformations
Theorem 3.1 If A ∈ Mm,n (F ), B, C ∈ Mn,p (F ), D, E ∈ Mq,m (F ), and a ∈ F , then
(1) A(B + C) = AB + AC and (D + E)A = DA + EA
(2) a(AB) = (aA)B = A(aB)
(3) Im A = A = AIn
Proof: For each case we show the result by showing that it holds for each ijth entry as a result of
the properties of the field F :
    [A(B + C)]ij = Σ_{k=1}^{n} Aik (B + C)kj = Σ_{k=1}^{n} Aik (Bkj + Ckj) = Σ_{k=1}^{n} (Aik Bkj + Aik Ckj)
                 = Σ_{k=1}^{n} Aik Bkj + Σ_{k=1}^{n} Aik Ckj = (AB)ij + (AC)ij = [AB + AC]ij

    [(D + E)A]ij = Σ_{k=1}^{m} (D + E)ik Akj = Σ_{k=1}^{m} (Dik + Eik) Akj = Σ_{k=1}^{m} (Dik Akj + Eik Akj)
                 = Σ_{k=1}^{m} Dik Akj + Σ_{k=1}^{m} Eik Akj = (DA)ij + (EA)ij = [DA + EA]ij

    [a(AB)]ij = a Σ_{k=1}^{n} Aik Bkj = Σ_{k=1}^{n} (aAik) Bkj = [(aA)B]ij = Σ_{k=1}^{n} Aik (aBkj) = [A(aB)]ij

    (Im A)ij = Σ_{k=1}^{m} (Im)ik Akj = Σ_{k=1}^{m} δik Akj = Aij = Σ_{k=1}^{n} Aik δkj = Σ_{k=1}^{n} Aik (In)kj = (AIn)ij
Theorem 3.2 If A ∈ Mm,n (F ) and B ∈ Mn,p (F ), and bj is the jth column of B, then
    AB = A[b1 b2 · · · bp]            (3.5)
       = [Ab1 Ab2 · · · Abp]          (3.6)

That is, if uj is the jth column of AB, then uj = Abj. As a result, if A ∈ Mm,n (F ) and aj is the
jth column of A, then

    A = AIn = Im A                    (3.7)
Proof: First, we have

          [ (AB)1j ]   [ Σ_{k=1}^{n} A1k Bkj ]       [ B1j ]
    uj =  [   ..   ] = [         ..          ]  = A  [  ..  ] = Abj
          [ (AB)mj ]   [ Σ_{k=1}^{n} Amk Bkj ]       [ Bnj ]
As a result
A = [a1 a2 · · · an ] = [Ae1 Ae2 · · · Aen ] = AIn
and
Im A = Im [a1 a2 · · · an ] = [Im a1 Im a2 · · · Im an ] = [a1 a2 · · · an ] = A
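This column-by-column description of the product is easy to confirm numerically; here is a short sketch in Python/numpy with arbitrary small random matrices:

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.integers(-3, 4, size=(3, 4)).astype(float)   # arbitrary small matrices
    B = rng.integers(-3, 4, size=(4, 2)).astype(float)

    AB = A @ B
    cols = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])
    print(np.array_equal(AB, cols))   # True: the j-th column of AB is A b_j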
Theorem 3.3 If V is a finite-dimensional vector space over a field F with ordered basis β =
(b1 , . . . , bn ), the coordinate map φβ : V → F n is an isomorphism,
φβ ∈ GL(V, F n )
(3.8)
Proof: First, φβ ∈ L(V, F n ) since ∀x, y ∈ V and a, b ∈ F we have by the ordinary rules of matrix
addition and scalar multiplication (or alternatively by the rules of addition and scalar multiplication
on n-tuples)
φβ (ax + by) = [ax + by]β = a[x]β + b[y]β = aφβ (x) + bφβ (y)
Second, because φβ (b1 ) = e1 , . . . , φβ (bn ) = en , it takes a basis into a basis. Therefore it is an
isomorphism by Theorem 2.12.
Theorem 3.4 Let V and W be finite-dimensional vector spaces having ordered bases β and γ,
respectively, and let T ∈ L(V, W ). Then for all v ∈ V we have
[T (v)]γ = [T ]γβ [v]β
(3.9)
In other words, if A = [T ]γβ is the matrix representation of T , if TA ∈ L(F n , F m ) is the matrix
multiplication map, and if φβ ∈ GL(V, F^n) and φγ ∈ GL(W, F^m) are the respective coordinate maps,
then
φγ ◦ T = TA ◦ φβ
(3.10)
and the following diagram commutes:
               T
        V ----------> W
        |             |
     φβ |             | φγ
        v             v
       F^n ---------> F^m
               TA
Proof: If β = (v1, . . . , vn) is an ordered basis for V and γ = (w1, . . . , wm) is an ordered basis for W,
then let

                                                 [ a11  · · ·  a1n ]
    [T ]γβ = [ [T (v1)]γ  · · ·  [T (vn)]γ ]  =  [  ..    . .   ..  ]
                                                 [ am1  · · ·  amn ]
Now, ∀u ∈ V, ∃!b1 , . . . , bn ∈ F such that u = b1 v1 + · · · + bn vn . Therefore,
    T (u) = T (b1 v1 + · · · + bn vn) = b1 T (v1) + · · · + bn T (vn)

so that

    [T (u)]γ = φγ(T (u)) = φγ(b1 T (v1) + · · · + bn T (vn))
             = b1 φγ(T (v1)) + · · · + bn φγ(T (vn))
             = b1 [T (v1)]γ + · · · + bn [T (vn)]γ
             = [T ]γβ [u]β
which completes the proof.
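Theorem 3.4 can also be spot-checked with non-standard bases. The following sketch (Python/numpy; the map and both bases are made up for illustration) builds [T ]γβ column by column from (3.4) and compares the two sides of (3.9):

    import numpy as np

    # A made-up map T : R^2 -> R^3 and non-standard ordered bases beta, gamma
    # (basis vectors stored as the columns of Bbeta and Cgamma).
    def T(x):
        return np.array([x[0] + x[1], x[0] - x[1], 2*x[1]])

    Bbeta  = np.array([[1., 1.],
                       [0., 1.]])
    Cgamma = np.array([[1., 0., 1.],
                       [0., 1., 1.],
                       [0., 0., 1.]])

    # Column j of [T]^gamma_beta is [T(v_j)]_gamma, found by solving Cgamma * col = T(v_j).
    T_gb = np.column_stack([np.linalg.solve(Cgamma, T(Bbeta[:, j]))
                            for j in range(Bbeta.shape[1])])

    v = np.array([2., -1.])
    lhs = np.linalg.solve(Cgamma, T(v))        # [T(v)]_gamma
    rhs = T_gb @ np.linalg.solve(Bbeta, v)     # [T]^gamma_beta [v]_beta
    print(np.allclose(lhs, rhs))               # True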
Theorem 3.5 If V, W, Z are finite-dimensional vector spaces with ordered bases α, β and γ, respectively, and T ∈ L(V, W ) and U ∈ L(W, Z), then
[U ◦ T ]γα = [U ]γβ [T ]βα
(3.11)
Proof: By the previous two theorems we have
    [U ◦ T ]γα = [ [(U ◦ T )(v1)]γ  · · ·  [(U ◦ T )(vn)]γ ]
               = [ [U (T (v1))]γ  · · ·  [U (T (vn))]γ ]
               = [ [U ]γβ [T (v1)]β  · · ·  [U ]γβ [T (vn)]β ]
               = [U ]γβ [ [T (v1)]β  · · ·  [T (vn)]β ]
               = [U ]γβ [T ]βα
which completes the proof.
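In coordinates, composing maps therefore multiplies their matrices. A brief sketch in Python/numpy, using standard bases so that the matrix representations are the matrices themselves (the particular matrices are arbitrary):

    import numpy as np

    T = np.array([[1., 2.],
                  [0., 1.],
                  [3., 0.]])        # T : F^2 -> F^3
    U = np.array([[1., 0., 2.],
                  [0., 1., 1.]])    # U : F^3 -> F^2

    x = np.array([1., -2.])
    print(np.allclose(U @ (T @ x), (U @ T) @ x))   # True: [U o T] = [U][T]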
Corollary 3.6 If V is an n-dimensional vector space over F with an ordered basis β and I ∈ L(V )
is the identity operator, then [I]β = In ∈ Mn (F ).
Proof: Since I(bj) = bj for each j, the jth column of [I]β is [I(bj)]β = [bj]β = ej, so
[I]β = [e1 e2 · · · en] = In.
Theorem 3.7 If V and W are finite-dimensional vector spaces of dimensions n and m, respectively,
with ordered bases β and γ, respectively, and T ∈ L(V, W ), then
rank(T ) = rank([T ]γβ )
and
null(T ) = null([T ]γβ )
(3.12)
Proof: This follows from Theorem 3.4, since φγ and φβ are isomorphisms, and so carry bases into
bases, and φγ ◦ T = TA ◦ φβ , so that
    rank(T ) = dim im(T ) = dim(φγ(im(T ))) = rank(φγ ◦ T )
             = rank(TA ◦ φβ) = dim im(TA ◦ φβ) = dim im(TA)
             = rank(TA) = rank(A) = rank([T ]γβ)

while the second equality follows from the Rank-Nullity Theorem along with the fact that the ranks
are equal.
Theorem 3.8 If V and W are finite-dimensional vector spaces over F with ordered bases β =
(b1 , . . . , bn ) and γ = (c1 , . . . , cm ), then the function
    Φ : L(V, W ) → Mm,n (F )        (3.13)
    Φ(T ) = [T ]γβ                  (3.14)

is an isomorphism,

    Φ ∈ GL(L(V, W ), Mm,n (F ))     (3.15)

and consequently

    L(V, W ) ≅ Mm,n (F )            (3.16)

and

    dim L(V, W ) = dim Mm,n (F ) = mn        (3.17)
Proof: First, Φ ∈ L(L(V, W ), Mm,n (F )): given any T, U ∈ L(V, W ), there exist scalars rij, sij ∈ F, for i = 1, . . . , m and j = 1, . . . , n, such that

    T (b1) = r11 c1 + · · · + rm1 cm          U (b1) = s11 c1 + · · · + sm1 cm
      ..                                        ..
    T (bn) = r1n c1 + · · · + rmn cm          U (bn) = s1n c1 + · · · + smn cm
Hence, for all k, l ∈ F and j = 1, . . . , n we have
    (kT + lU )(bj) = kT (bj) + lU (bj) = k Σ_{i=1}^{m} rij ci + l Σ_{i=1}^{m} sij ci = Σ_{i=1}^{m} (krij + lsij) ci
As a consequence of this and the rules of matrix addition and scalar multiplication we have
    Φ(kT + lU ) = [kT + lU ]γβ

                  [ kr11 + ls11  · · ·  kr1n + ls1n ]
                = [      ..        . .        ..    ]
                  [ krm1 + lsm1  · · ·  krmn + lsmn ]

                    [ r11  · · ·  r1n ]       [ s11  · · ·  s1n ]
                = k [  ..    . .   ..  ]  + l [  ..    . .   ..  ]
                    [ rm1  · · ·  rmn ]       [ sm1  · · ·  smn ]

                = k[T ]γβ + l[U ]γβ
                = kΦ(T ) + lΦ(U )
Moreover, Φ is bijective by Theorem 2.5, since for every A ∈ Mm,n (F ) there is a unique T ∈ L(V, W ) such that Φ(T ) = A, namely the unique T with

    T (bj) = A1j c1 + · · · + Amj cm        for j = 1, . . . , n

This makes Φ onto, and also 1-1, because ker(Φ) = {T0}: the only T ∈ L(V, W ) with Φ(T ) = O ∈ Mm,n (F ) is the zero transformation, since

    T (bj) = 0c1 + · · · + 0cm = 0        for j = 1, . . . , n

defines a unique transformation, namely T0.
Theorem 3.9 If V and W are finite-dimensional vector spaces over the same field F with ordered
bases β = (b1, . . . , bn) and γ = (c1, . . . , cn), respectively, then T ∈ L(V, W ) is an isomorphism iff
[T ]γβ is invertible, in which case

    [T^{-1}]βγ = ([T ]γβ)^{-1}        (3.18)
Proof: If T is an isomorphism and dim(V ) = dim(W ) = n, then T −1 ∈ L(W, V ), and T ◦ T −1 = IW
and T −1 ◦ T = IV , so that by Theorem 3.5, Corollary 3.6 and Theorem 3.1 we have
[T ]γβ [T −1 ]βγ = [T ◦ T −1 ]γ = [IW ]γ = In = [IV ]β = [T −1 ◦ T ]β = [T −1 ]βγ [T ]γβ
so that [T ]γβ is invertible with inverse [T^{-1}]βγ, and by the uniqueness of the multiplicative inverse in
Mn (F ), which follows from the uniqueness of TA^{-1} ∈ L(F^n), we have

    [T^{-1}]βγ = ([T ]γβ)^{-1}
Conversely, if A = [T ]γβ is invertible, there is an n × n matrix B such that AB = BA = In, so by
Theorem 2.5 there is a unique U ∈ L(W, V ) such that U (cj) = Σ_{i=1}^{n} Bij bi for j = 1, . . . , n. Hence B = [U ]βγ.
To show that U = T^{-1}, note that

    [U ◦ T ]β = [U ]βγ [T ]γβ = BA = In = [IV ]β

and

    [T ◦ U ]γ = [T ]γβ [U ]βγ = AB = In = [IW ]γ

But since Φ ∈ GL(L(V, W ), Mn (F )) is an isomorphism, and therefore 1-1, we must have that
U ◦ T = IV and T ◦ U = IW
By the uniqueness of the inverse, however, U = T −1 , and T is an isomorphism.
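A brief numerical sketch of Theorem 3.9 (Python/numpy; the invertible matrix is arbitrary and represents an operator in the standard basis):

    import numpy as np

    A = np.array([[2., 1., 0.],
                  [1., 2., 1.],
                  [0., 1., 2.]])    # [T] for an invertible T in the standard basis

    A_inv = np.linalg.inv(A)        # [T^{-1}] = [T]^{-1}
    print(np.allclose(A @ A_inv, np.eye(3)))    # True
    print(np.allclose(A_inv @ A, np.eye(3)))    # True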
4   Change of Coordinates

4.1   Definitions
We can also now define the change of coordinates, or change of basis, operator. If V is a
finite-dimensional vector space, say dim(V ) = n, and β = (b1 , . . . , bn ) and γ = (c1 , . . . , cn ) are
two ordered bases for V , then the coordinate maps φβ , φγ ∈ L(V, F n ), which are isomorphisms by
Theorem 2.15, may be used to define a change of coordinates operator φβ,γ ∈ L(F n ) changing β
coordinates into γ coordinates, that is having the property
    φβ,γ([v]β) = [v]γ        (4.1)

We define the operator by

    φβ,γ := φγ ◦ φβ^{-1}     (4.2)
The relationship between these three functions is illustrated in the following commutative diagram:
              V
            /   \
        φβ /     \ φγ
          v       v
         F^n ----> F^n
           φγ ◦ φβ^{-1}
The change of coordinates operator has a matrix representation
    Mβ,γ = [φβ,γ]ρ = [φγ]ρβ = [ [b1]γ  [b2]γ  · · ·  [bn]γ ]        (4.3)

4.2   Change of Coordinate Maps and Matrices
Theorem 4.1 (Change of Coordinate Matrix) Let V be a finite-dimensional vector space over
a field F and β = (b1 , . . . , bn ) and γ = (c1 , . . . , cn ) be two ordered bases for V . Since φβ and φγ are
isomorphisms, the following diagram commutes,
              V
            /   \
        φβ /     \ φγ
          v       v
         F^n ----> F^n
           φγ ◦ φβ^{-1}

and the change of basis operator φβ,γ := φγ ◦ φβ^{-1} ∈ L(F^n), changing β coordinates into γ coordinates,
is an automorphism of F^n. Its matrix representation,

    Mβ,γ = [φβ,γ]ρ ∈ Mn (F )        (4.4)

where ρ = (e1, . . . , en) is the standard ordered basis for F^n, is called the change of coordinate
matrix, and it satisfies the following conditions:

1. Mβ,γ = [φβ,γ]ρ = [φγ]ρβ = [ [b1]γ  [b2]γ  · · ·  [bn]γ ]

2. [v]γ = Mβ,γ [v]β for all v ∈ V

3. Mβ,γ is invertible and Mβ,γ^{-1} = Mγ,β = [φβ]ργ
Proof: The first point is shown as follows:
    Mβ,γ = [φβ,γ]ρ = [ φβ,γ(e1)  φβ,γ(e2)  · · ·  φβ,γ(en) ]
         = [ (φγ ◦ φβ^{-1})(e1)  (φγ ◦ φβ^{-1})(e2)  · · ·  (φγ ◦ φβ^{-1})(en) ]
         = [ (φγ ◦ φβ^{-1})([b1]β)  (φγ ◦ φβ^{-1})([b2]β)  · · ·  (φγ ◦ φβ^{-1})([bn]β) ]
         = [ φγ(b1)  φγ(b2)  · · ·  φγ(bn) ]
         = [ [b1]γ  [b2]γ  · · ·  [bn]γ ]
         = [φγ]ρβ

or by Theorem 3.5 and Theorem 3.9 we have that

    Mβ,γ = [φβ,γ]ρ = [φγ ◦ φβ^{-1}]ρ = [φγ]ρβ [φβ^{-1}]βρ = [φγ]ρβ ([φβ]ρβ)^{-1} = [φγ]ρβ In^{-1} = [φγ]ρβ In = [φγ]ρβ
The second point follows from Theorem 3.4, since
    φγ(v) = (φγ ◦ I)(v) = (φγ ◦ (φβ^{-1} ◦ φβ))(v) = ((φγ ◦ φβ^{-1}) ◦ φβ)(v)

implies that

    [v]γ = [φγ(v)]ρ = [((φγ ◦ φβ^{-1}) ◦ φβ)(v)]ρ = [φγ ◦ φβ^{-1}]ρ [φβ(v)]ρ = [φβ,γ]ρ [v]β = Mβ,γ [v]β
And the last point follows from the fact that φβ and φγ are isomorphisms, so that φβ,γ is an isomorphism, and hence φβ,γ^{-1} ∈ L(F^n) is an isomorphism, and because the diagram above commutes we must have

    φβ,γ^{-1} = (φγ ◦ φβ^{-1})^{-1} = φβ ◦ φγ^{-1} = φγ,β

so that by 1

    Mβ,γ^{-1} = [φβ,γ^{-1}]ρ = [φγ,β]ρ = Mγ,β
or alternatively by Theorem 3.9
    Mβ,γ^{-1} = ([φβ,γ]ρ)^{-1} = [φβ,γ^{-1}]ρ = [φγ,β]ρ = [φβ]ργ = Mγ,β
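Both the formula for Mβ,γ and property 2 can be verified directly in coordinates. A small sketch in Python/numpy with two made-up bases of R^2 (basis vectors stored as columns, in standard coordinates):

    import numpy as np

    B = np.array([[1., 1.],
                  [0., 1.]])        # beta = (b1, b2)
    C = np.array([[2., 0.],
                  [1., 1.]])        # gamma = (c1, c2)

    M = np.linalg.solve(C, B)       # M_{beta,gamma}: its j-th column is [b_j]_gamma

    v = np.array([3., 2.])
    v_beta  = np.linalg.solve(B, v)
    v_gamma = np.linalg.solve(C, v)
    print(np.allclose(v_gamma, M @ v_beta))                      # True: [v]_gamma = M [v]_beta
    print(np.allclose(np.linalg.inv(M), np.linalg.solve(B, C)))  # True: M^{-1} = M_{gamma,beta}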
Corollary 4.2 (Change of Basis) Let V and W be finite-dimensional vector spaces over the same
field F and let (β, γ) and (β 0 , γ 0 ) be pairs of ordered bases for V and W , respectively. If T ∈ L(V, W ),
then
    [T ]γ′β′ = Mγ,γ′ [T ]γβ Mβ′,β            (4.5)
             = Mγ,γ′ [T ]γβ Mβ,β′^{-1}       (4.6)
where Mγ,γ′ and Mβ′,β are change of coordinate matrices. That is, the diagram relating T : V → W to the matrix multiplication maps TA and TA′, where A = [T ]γβ and A′ = [T ]γ′β′, via the coordinate maps φβ, φβ′, φγ, φγ′ and the change of coordinate maps φβ,β′ and φγ′,γ, commutes.
Proof: This follows from the fact that if β = (b1, . . . , bn), β′ = (b′1, . . . , b′n), γ = (c1, . . . , cm), and
γ′ = (c′1, . . . , c′m), then for each i = 1, . . . , n we have

    [T (b′i)]γ′ = [(φγ,γ′ ◦ T ◦ φβ,β′^{-1})(b′i)]γ′

so

    [T ]γ′β′ = [φγ,γ′ ◦ T ◦ φβ,β′^{-1}]γ′β′
             = [φγ,γ′]ρm [T ]γβ [φβ,β′^{-1}]ρn
             = [φγ,γ′]ρm [T ]γβ ([φβ,β′]ρn)^{-1}
             = Mγ,γ′ [T ]γβ Mβ,β′^{-1}
which completes the proof.
Corollary 4.3 (Change of Basis for a Linear Operator) If V is a finite-dimensional vector
space with ordered bases β and γ, and T ∈ L(V ), then
    [T ]γ = Mβ,γ [T ]β Mβ,γ^{-1}        (4.7)

where Mβ,γ is the change of coordinates matrix.
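Corollary 4.3 is the familiar similarity relation. A short sketch in Python/numpy, with a made-up operator matrix and an arbitrary invertible matrix standing in for Mβ,γ:

    import numpy as np

    T_beta = np.array([[1., 2.],
                       [0., 3.]])   # [T]_beta for a hypothetical operator on R^2
    M = np.array([[1., 1.],
                  [1., 2.]])        # any invertible matrix works as M_{beta,gamma}

    T_gamma = M @ T_beta @ np.linalg.inv(M)     # equation (4.7)

    # Similar matrices represent the same operator, so similarity invariants agree:
    print(np.isclose(np.trace(T_beta), np.trace(T_gamma)))             # True (both 4)
    print(np.isclose(np.linalg.det(T_beta), np.linalg.det(T_gamma)))   # True (both 3)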
Corollary 4.4 If we are given any two of the following:
(1) A ∈ Mn (F ) invertible
(2) an ordered basis β for F n
(3) an ordered basis γ for F n
the third is uniquely determined by the equation A = Mβ,γ , where Mβ,γ is the change of coordinates
matrix of the previous theorem.
Proof: If we have A = Mβ,γ = [φγ]ρβ = [ [b1]γ  [b2]γ  · · ·  [bn]γ ], suppose we know A and γ. Then
[bi]γ is the ith column of A, so bi = A1i c1 + · · · + Ani cn, and β is uniquely determined. If β and γ are given,
then by the previous theorem Mβ,γ is given by Mβ,γ = [ [b1]γ  [b2]γ  · · ·  [bn]γ ]. Lastly, if A and β
are given, then γ is given by the first case applied to A^{-1} = Mγ,β.