Algebra I – lecture notes
version β
Imperial College London
Mathematics 2005/2006
Contents

1 Groups
  1.1 Definition and examples
      1.1.1 Group table
  1.2 Subgroups
      1.2.1 Criterion for subgroups
  1.3 Cyclic subgroups
      1.3.1 Order of an element
  1.4 More on the symmetric groups Sn
      1.4.1 Order of a permutation
  1.5 Lagrange's Theorem
      1.5.1 Consequences of Lagrange's Theorem
  1.6 Applications to number theory
      1.6.1 Groups
  1.7 Applications of the group Z∗p
      1.7.1 Mersenne Primes
      1.7.2 How to find Mersenne Primes
  1.8 Proof of Lagrange's Theorem

2 Vector Spaces and Linear Algebra
  2.1 Definition of a vector space
  2.2 Subspaces
  2.3 Solution spaces
  2.4 Linear Combinations
  2.5 Span
  2.6 Spanning sets
  2.7 Linear dependence and independence
  2.8 Bases
  2.9 Dimension
  2.10 Further Deductions

3 More on Subspaces
  3.1 Sums and Intersections
  3.2 The rank of a matrix
      3.2.1 How to find row-rank(A)
      3.2.2 How to find column-rank(A)

4 Linear Transformations
  4.1 Basic properties
  4.2 Constructing linear transformations
  4.3 Kernel and Image
  4.4 Composition of linear transformations
  4.5 The matrix of a linear transformation
  4.6 Eigenvalues and eigenvectors
      4.6.1 How to find evals / evecs of T
  4.7 Diagonalisation
  4.8 Change of basis

5 Error-correcting codes
  5.1 Introduction
  5.2 Theory of Codes
      5.2.1 Error Correction
  5.3 Linear Codes
      5.3.1 Minimum distance of a linear code
  5.4 The Check Matrix
  5.5 Decoding
Introduction
(1) Groups – used throughout maths and science to describe symmetry – e.g. every
physical object, algebraic equation or system of differential equations, . . . , has a
group associated with it.
(2) Vector spaces – have seen and studied some of these already, e.g. Rn .
Chapter 1
Groups
1.1 Definition and examples
Definition 1.1. Let S be a set. A binary operation ∗ on S is a rule which assigns to any
ordered pair (a, b) (a, b ∈ S) an element a ∗ b ∈ S.
In other words, it’s a function from S × S → S.
Eg 1.1.
1) S = Z, a ∗ b = a + b
2) S = C, a ∗ b = ab
3) S = R, a ∗ b = a − b
4) S = R, a ∗ b = min(a, b)
5) S = {1, 2, 3}, a ∗ b = a (eg. 1 ∗ 1 = 1, 2 ∗ 3 = 2)
Given a binary operation on a set S and a, b, c ∈ S, can form “a ∗ b ∗ c” in two ways
(a ∗ b) ∗ c
a ∗ (b ∗ c)
These may or may not be equal.
Eg 1.2. In 1), (a ∗ b) ∗ c = a ∗ (b ∗ c).
In 3), (3 ∗ 5) ∗ 4 = (3 − 5) − 4 = −6. Whereas 3 ∗ (5 ∗ 4) = 3 − (5 − 4) = 2
Definition 1.2. A binary operation ∗ is associative if for all a, b, c ∈ S
(a ∗ b) ∗ c = a ∗ (b ∗ c)
Associativity is important.
Eg 1.3. Solve 5 + x = 2.
We add −5 to get −5 + (5 + x) = −5 + 2. Now we use associativity! We rebracket to
(−5 + 5) + x = −5 + 2. Thus 0 + x = −5 + 2, so x = −3.
To do this, we needed
1) associativity of +
2) the existence of 0 (with 0 + x = x)
3) existence of −5 (with −5 + 5 = 0)
Generally, suppose we have a binary operation ∗ and an equation
a ∗ x = b
(a, b ∈ S constants, x ∈ S unknown). To be able to solve, we need
1) associativity
2) existence of e ∈ S such that e ∗ x = x for x ∈ S
3) existence of a′ ∈ S such that a′ ∗ a = e
Then can solve
a ∗ x = b
a′ ∗ (a ∗ x) = a′ ∗ b
(a′ ∗ a) ∗ x = a′ ∗ b
e ∗ x = a′ ∗ b
x = a′ ∗ b
A group will be a structure in which we can solve equations like this.
Definition 1.3. A group (G, ∗) is a set G with a binary operation ∗ satisfying the following
axioms (for all a, b, c ∈ G)
(1) if a, b ∈ G then a ∗ b ∈ G
(closure)
(2) (a ∗ b) ∗ c = a ∗ (b ∗ c)
(associativity)
(3) there exists e ∈ G such that
e ∗ x = x ∗ e = x for all x ∈ G
(identity axiom)
(4) for any a ∈ G, there exists a′ ∈ G such that
a ∗ a′ = a′ ∗ a = e
(inverse axiom)
Element e in (3) is an identity element of G.
Element a′ in (4) is an inverse of a in G.
Eg 1.4.

                       closure   assoc.   identity   inverse   group
 (Z, +)                  yes       yes      yes        yes      yes
 (Z, −)                  yes       no       no         –        no
 (Z, ×)                  yes       yes      yes        no       no
 (Q, +)                  yes       yes      yes        yes      yes
 (Q, ×)                  yes       yes      yes        no       no
 (Q − {0}, ×)            yes       yes      yes        yes      yes
 (C − {0}, ×)            yes       yes      yes        yes      yes
 ({1, −1, i, −i}, ×)     yes       yes      yes        yes      yes
We check the group axioms for the last example
• Closure Multiplication table:

        ×     1    −1     i    −i
        1     1    −1     i    −i
       −1    −1     1    −i     i
        i     i    −i    −1     1
       −i    −i     i     1    −1
• Associativity Follows from associativity of (C, ×)
• Identity 1
• Inverse from table
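Since this group is finite, all four axioms can also be checked mechanically. Here is a minimal sketch (not part of the original notes), using Python's built-in complex numbers to represent {1, −1, i, −i}:

# Check the group axioms for G = {1, -1, i, -i} under multiplication.
G = [1, -1, 1j, -1j]

# closure: every product lands back in G
assert all(a * b in G for a in G for b in G)
# associativity
assert all((a * b) * c == a * (b * c) for a in G for b in G for c in G)
# identity element
e = 1
assert all(e * a == a * e == a for a in G)
# inverses: every element has one in G
assert all(any(a * b == e for b in G) for a in G)
print("({1, -1, i, -i}, x) satisfies all four group axioms")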
Uniqueness of identity and inverses
Proposition 1.1. Let (G, ∗) be a group.
1) G has exactly one identity element
2) Each element of G has exactly one inverse
Proof.
1) Suppose e, e′ are identity elements. So
e ∗ x = x ∗ e = x for all x ∈ G
e′ ∗ x = x ∗ e′ = x for all x ∈ G
Then
e = e ∗ e′ = e′
2) Let x ∈ G and suppose x′ , x′′ are inverses of x. That means
x′ ∗ x = x ∗ x′ = e
x′′ ∗ x = x ∗ x′′ = e
Then
x′ = x′ ∗ e
   = x′ ∗ (x ∗ x′′)
   = (x′ ∗ x) ∗ x′′
   = e ∗ x′′
   = x′′
□
Notation 1.1.
• e is the identity element of G
• x−1 is the inverse of x
• Instead of “(G, ∗) is a group”, we write “G is a group under ∗”.
• Often drop the ∗ in a ∗ b, and write just ab.
Eg 1.5.
In (Z, +), x⁻¹ = −x. Z is a group under addition.
In (Q − {0}, ×), x⁻¹ = 1/x.
Definition 1.4. We say (G, ∗) is a finite group if |G| is finite; (G, ∗) is an infinite group if
|G| is infinite.
Eg 1.6. All groups in Eg 1.4 are infinite except the last, which has size (order) 4.
Eg 1.7. Let F = R or C. Say a matrix (aij ) is a matrix over F if all aij ∈ F .
Set of all n × n matrices over F under matrix multiplication is not a group (problem with
inverse axiom). But let’s define GL(n, F ) to be the set of all n × n invertible matrices over
F.
Definition 1.5. Denote by GL(n, F) the set of all invertible n × n matrices over the field F:
GL(n, F) = {(aij) | aij ∈ F, the matrix (aij) invertible}
Claim GL(n, F ) is a group under matrix multiplication.
Proof. Write G = GL(n, F ).
Closure Let A, B ∈ G. So A, B are invertible. Now
(AB)−1 = B −1 A−1
since
(AB)(B⁻¹A⁻¹) = A(BB⁻¹)A⁻¹ = AIA⁻¹ = AA⁻¹ = I
(B −1 A−1 )(AB) = B −1 (A−1 A)B = B −1 IB = I
So AB ∈ G.
Associativity proved in M1GLA.
Identity is identity matrix In .
Inverse of A is A−1 (since AA−1 = A−1 A = I). Note A−1 ∈ G as it has an inverse,
A.
□
1) GL(1, R) is the set of all (a) with a ∈ R, a ≠ 0. This is just the group (R − {0}, ×).
2) GL(2, C) is the set of
   ( a b )
   ( c d )
with a, b, c, d ∈ C, ad − bc ≠ 0.
Note 1.1. Usually, in a group (G, ∗), a ∗ b is not the same as b ∗ a.
Definition 1.6. Let (G, ∗) be a group. If for all a, b ∈ G, a ∗ b = b ∗ a we call G an abelian
group.
Eg 1.8. (Z, +) is abelian as a + b = b + a for all a, b ∈ Z. So are the other groups in Eg 1.4.
So is GL(1, F ).
But GL(2, R) is not abelian, since
( 0 1 ) ( 1 −1 )   ( 0  2 )
( 1 0 ) ( 0  2 ) = ( 1 −1 )
whereas
( 1 −1 ) ( 0 1 )   ( −1 1 )
( 0  2 ) ( 1 0 ) = (  2 0 )
Groups of permutations
Definition 1.7. S a set. A permutation of S is a function f : S → S which is a bijection
(both injection and surjection).
Eg 1.9. S = {1, 2, 3, 4}, f : 1 → 2, 2 → 3, 3 → 4, 4 → 1 is a permutation.
Notation 1.2.
f = ( 1 2 3 4
      2 4 3 1 )
is the permutation 1 → 2, 2 → 4, 3 → 3, 4 → 1.
Let
g = ( 1 2 3 4
      3 1 2 4 )
The composition f ◦ g is defined by
(f ◦ g)(s) = f(g(s))
Here
f ◦ g = ( 1 2 3 4
          3 2 4 1 )
Recall the inverse function f⁻¹, which "undoes" f. Here
f⁻¹ = ( 1 2 3 4
        4 1 3 2 )
Notice f ◦ f⁻¹ = ( 1 2 3 4
                   1 2 3 4 ) = e, the identity function.
Proposition 1.2. Let S = {1, 2, 3, . . . , n} and let G be the set of all permutations of
S. Then (G, ◦), ◦ being the function composition, is a group, i.e. G is a group under
composition.
Proof. Notation for f ∈ G is
f = (  1    2   · · ·   n
     f(1) f(2) · · · f(n) )
• Closure By M1F, if f , g are bijections S → S then f ◦ g is a bijection.
• Associativity Let f, g, h ∈ G and apply to s ∈ S. Then
f ◦ (g ◦ h)(s) = f (g ◦ h(s))
= f (g(h(s)))
(f ◦ g) ◦ h(s) = (f ◦ g)(h(s)) = f (g(h(s)))
So f ◦ (g ◦ h) = (f ◦ g) ◦ h
• Identity is e = ( 1 2 · · · n
                    1 2 · · · n ), since e ◦ f = f ◦ e = f.
• Inverse of f is f⁻¹ = ( f(1) f(2) · · · f(n)
                            1    2   · · ·   n ), and f⁻¹ ◦ f = f ◦ f⁻¹ = e.
□
Definition 1.8. The group of all permutations of {1, 2, . . . , n} is written Sn and called the
symmetric group of degree n.
Eg 1.10. S2 = { e, ( 1 2 ; 2 1 ) }. So |S2| = 2.
Eg 1.11. S3 consists of the six permutations with top row 1 2 3 and bottom rows
1 2 3,  2 3 1,  3 1 2,  1 3 2,  3 2 1,  2 1 3
so |S3| = 6.
Proposition 1.3. Sn is a finite group of size n!.
Proof. For f = (  1    2   · · ·   n
                f(1) f(2) · · · f(n) ),
the number of choices for f(1) is n, for f(2) is n − 1, for f(3) is n − 2, . . . , for f(n) is 1. The
total number of permutations is n · (n − 1) · (n − 2) · · · 1 = n!. □
Notation 1.3. (Multiplicative notation for groups)
If (G, ∗) is a group, we'll usually write just ab instead of a ∗ b. We can define powers
a² = a ∗ a
a³ = a ∗ a ∗ a
· · ·
a^n = a ∗ a ∗ · · · ∗ a   (n factors)
When we write “Let G be a group”, we mean the binary operation ∗ is understood, and
we’re writting ab instead of a ∗ b, etc.
Eg 1.12. In (Z, +), ab = a ∗ b = a + b and a² = a ∗ a = 2a, i.e. a^n = na.
1.1.1 Group table
Definition 1.9. Let G be a group, with elements a, b, c, . . . . Form a group table, whose entry in row x, column y is the product xy:

       a     b     c   · · ·
 a    a²    ab    ac   · · ·
 b    ba    b²    bc   · · ·
 ..
 .
Eg 1.13. In S3, write
e   = ( 1 2 3 ; 1 2 3 )
a   = ( 1 2 3 ; 2 3 1 )
a²  = ( 1 2 3 ; 3 1 2 )
b   = ( 1 2 3 ; 2 1 3 )
ab  = ( 1 2 3 ; 3 2 1 )
a²b = ( 1 2 3 ; 1 3 2 )
The group table is

         e     a     a²    b     ab    a²b
 e       e     a     a²    b     ab    a²b
 a       a     a²    e     ab    a²b   b
 a²      a²    e     a     a²b   b     ab
 b       b     a²b   ab    e     a²    a
 ab      ab    b     a²b   a     e     a²
 a²b     a²b   ab    b     a²    a     e
Useful relations (visible in the table):
• a³ = e
• ba = a²b
• b² = e
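The whole table can also be rebuilt mechanically by composing the six permutations. A short sketch (not part of the notes; the tuple encoding (f(1), f(2), f(3)) is our own choice):

# Rebuild the group table of S3 by composing permutations.
def compose(f, g):
    # (f o g)(x) = f(g(x)); symbols are 1-based, tuples 0-indexed
    return tuple(f[g[x] - 1] for x in range(3))

names = {
    (1, 2, 3): "e",
    (2, 3, 1): "a",
    (3, 1, 2): "a2",
    (2, 1, 3): "b",
    (3, 2, 1): "ab",
    (1, 3, 2): "a2b",
}
elems = list(names)
for f in elems:
    # row of the table for element f
    print(names[f], [names[compose(f, g)] for g in elems])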
1.2 Subgroups
Definition 1.10. Let (G, ∗) be a group. A subgroup of (G, ∗) is a subset of G which is
itself a group under ∗.
Eg 1.14.
• (Z, +) is a subgroup of (R, +).
• (Q − {0}, ×) is not a subgroup of (R, +)
• ({1, −1}, ×) is a subgroup of ({1, −1, i, −i}, ×)
• ({1, i}, ×) is not a subgroup of ({1, −1, i, −i}, ×) (closure fails – i × i = −1)
1.2.1 Criterion for subgroups
Proposition 1.4. Let G be a group.¹ Let H ⊆ G. Then H is a subgroup of G if the
following conditions are true
(1) e ∈ H (where e is the identity of G)
(2) if h, k ∈ H then hk ∈ H (H is closed)
(3) if h ∈ H then h⁻¹ ∈ H
Proof. Assume (1)–(3). Check the group axioms for H:
Closure – true by (2)
Associativity – true by associativity for G
Identity – by (1)
Inverses – by (3)
□
Eg 1.15. Let G = GL(2, R) (2 × 2 invertible matrices over R). Let
H = { ( 1 n ; 0 1 ) | n ∈ Z }
Claim H is a subgroup of G.
Proof. Check (1)–(3) of the previous proposition:
(1) e = ( 1 0 ; 0 1 ) ∈ H
(2) Let h = ( 1 n ; 0 1 ), k = ( 1 p ; 0 1 ). Then hk = ( 1 n+p ; 0 1 ) ∈ H.
(3) Let h = ( 1 n ; 0 1 ). Then h⁻¹ = ( 1 −n ; 0 1 ) ∈ H.
□
¹ so ∗ is understood and we write ab instead of a ∗ b
1.3 Cyclic subgroups
Let G be a group. Let a ∈ G. Recall
a1 = a, a2 = aa, . . .
We also set a⁰ = e, and define negative powers
a⁻² = a⁻¹a⁻¹
a^(−n) = a⁻¹ · · · a⁻¹   (n factors)
Note 1.2. All the powers a^n (n ∈ Z) lie in G (by closure).
Lemma 1.1. For any m, n ∈ Z,
a^m a^n = a^(m+n)
Proof. For m, n > 0:
a^m a^n = (a · · · a)(a · · · a)   (m factors, then n factors)
        = a^(m+n)
For m ≥ 0, n < 0:
a^m a^n = (a · · · a)(a⁻¹ · · · a⁻¹)   (m factors, then −n factors)
        = a^(m−(−n)) = a^(m+n)   (after cancellation)
Similarly for m < 0, n ≥ 0. Finally, when m, n < 0:
a^m a^n = (a⁻¹ · · · a⁻¹)(a⁻¹ · · · a⁻¹)   (−m factors, then −n factors)
        = a^(m+n)
□
Proposition 1.5. Let G be a group. Let a ∈ G. Define
A = {a^n | n ∈ Z} = {. . . , a⁻², a⁻¹, e, a, a², . . . }
Then A is a subgroup of G.
Proof. Check (1)–(3) of Proposition 1.4:
(1) e = a⁰ ∈ A
(2) if a^m, a^n ∈ A then a^m a^n = a^(m+n) ∈ A
(3) if a^n ∈ A then (a^n)⁻¹ = a^(−n) ∈ A
□
Definition 1.11. Write A = ⟨a⟩, called the cyclic subgroup of G generated by a. So for
each element a ∈ G we get a cyclic subgroup ⟨a⟩ of G.
Eg 1.16. (1) G = (Z, +). What is the cyclic subgroup ⟨3⟩?
Well, in additive notation 3¹ = 3, 3² = 3 + 3 = 6, 3^n = 3n, 3⁻¹ = −3, 3^(−n) = −3n. So ⟨3⟩ = {3n | n ∈ Z}.
Similarly ⟨1⟩ = {n | n ∈ Z} = Z.
(2) G = S3, a = ( 1 2 3 ; 2 3 1 ). What is ⟨a⟩?
Well,
a⁰ = e
a¹ = a
a² = ( 1 2 3 ; 3 1 2 )
a³ = e
a⁴ = a
a⁵ = a²
..
a⁻¹ = a³a⁻¹ = a²
a⁻² = a
..
Hence ⟨a⟩ = {a^n | n ∈ Z} = {e, a, a²}.
Now consider ⟨b⟩, b = ( 1 2 3 ; 2 1 3 ). Here b⁰ = e, b¹ = b, b² = e, . . . .
So ⟨b⟩ = {e, b}.
(3) All the cyclic subgroups of S3 = {e, a, a², b, ab, a²b}:
⟨e⟩   = {e}
⟨a⟩   = {e, a, a²}
⟨a²⟩  = {e, a, a²}
⟨b⟩   = {e, b}
⟨ab⟩  = {e, ab}
⟨a²b⟩ = {e, a²b}
Definition 1.12. Say a group G is a cyclic group if there exists an element a ∈ G such
that
G = ⟨a⟩ = {a^n | n ∈ Z}
Call a a generator for G.
Eg 1.17.
(1) (Z, +) = ⟨1⟩, so (Z, +) is cyclic with generator 1.
(2) ({1, −1, i, −i}, ×) is cyclic, generator i, since
⟨i⟩ = {i⁰, i¹, i², i³} = {1, i, −1, −i}
Another generator is −i, but 1 and −1 are not generators.
(3) S3 is not cyclic, as none of its 5 cyclic subgroups is the whole of S3.
For any n ∈ N there exists a cyclic group of size n (having n elements) – Cn:
Cn = {x ∈ C | x^n = 1}
the complex n-th roots of unity, under multiplication. By M1F, we know Cn = {1, ω, ω², . . . , ω^(n−1)},
where ω = e^(2πi/n). So
Cn = ⟨ω⟩
a cyclic subgroup of (C − {0}, ×).
1.3.1 Order of an element
Definition 1.13. Let G be a group, let a ∈ G. The order of a, written o(a), is the
smallest positive integer k such that a^k = e. So o(a) = k means a^k = e and a^i ≠ e for
i = 1, . . . , k − 1.
If no such k exists, we say a has infinite order and we write o(a) = ∞.
Eg 1.18.
(1) e has order 1, and is the only such element.
(2) G = S3, a = ( 1 2 3 ; 2 3 1 ). Then a¹ = a, a² = ( 1 2 3 ; 3 1 2 ), a³ = ( 1 2 3 ; 1 2 3 ) = e. So
o(a) = 3.
For b = ( 1 2 3 ; 2 1 3 ), b¹ ≠ e, b² = e, so o(b) = 2. Full list:
o(e)   = 1
o(a)   = 3
o(a²)  = 3
o(b)   = 2
o(ab)  = 2
o(a²b) = 2
(3) G = (Z, +). What is o(3)? In G, e = 0 and 3^n = n × 3 (additive notation). So 3^n ≠ e for any n ∈ N, so
o(3) = ∞.
(4) G = GL(2, C), A = ( i 0 ; 0 e^(2πi/3) ). Then
A^k = ( i^k 0 ; 0 e^(2πik/3) )
The smallest k for which this is the identity is 12, ∴ o(A) = 12.
Proposition 1.6. G a group, a ∈ G. The number of elements in the cyclic subgroup
generated by a is equal to o(a):
|⟨a⟩| = o(a)
Proof.
(1) Suppose o(a) = k, finite. This means a^k = e, but a^i ≠ e for 1 ≤ i ≤ k − 1.
Write A = ⟨a⟩ = {a^n | n ∈ Z}. Then A contains
e, a, a², . . . , a^(k−1)
These are all different elements of G: if a^i = a^j for 0 ≤ i < j ≤ k − 1, then
a⁻ⁱ a^i = a⁻ⁱ a^j
e = a^(j−i)
with 1 ≤ j − i ≤ k − 1 – contradicting o(a) = k.
Hence A contains e, a, . . . , a^(k−1), all distinct, so
|A| ≥ k
We now show that every element of A is one of e, a, . . . , a^(k−1). Let a^n ∈ A. Write
n = qk + r, 0 ≤ r ≤ k − 1
Then
a^n = a^(qk+r)
    = a^(qk) a^r
    = (a^k)^q a^r
    = e^q a^r
    = a^r
So a^n = a^r ∈ {e, a, a², . . . , a^(k−1)}. We've shown
A = {e, a, a², . . . , a^(k−1)}
so |A| = k = o(a).
(2) Suppose o(a) = ∞. This means
a^i ≠ e for i ≥ 1
If i < j then a^i ≠ a^j, since a^i = a^j would give e = a^(j−i), a contradiction.
Then
A = {. . . , a⁻², a⁻¹, e, a, a², . . . }
and all these elements are different elements of G. So |A| = ∞ = o(a). □
Eg 1.19.
(1) G = S3, a = ( 1 2 3 ; 2 3 1 ). Then ⟨a⟩ = {e, a, a²}, size 3, o(a) = 3.
(2) G = (Z, +). Then ⟨3⟩ = {3n | n ∈ Z} is infinite, and o(3) = ∞.
(3) Cn = ⟨ω⟩ = {1, ω, . . . , ω^(n−1)}, size n, and o(ω) = n.
1.4 More on the symmetric groups Sn
Eg 1.20. Let
f = ( 1 2 3 4 5 6 7 8
      4 5 6 3 2 7 1 8 ) ∈ S8.
What are f², f⁵? We need better notation to see the answers quickly. Observe the numbers 1–4–3–6–7–1 form
a cycle, as do the numbers 2–5 and 8.
We will write f = (1 4 3 6 7)(2 5)(8). These are the cycles of f . Each symbol in the first
cycle goes to the next except for the last 7 which goes back to the first 1. The cycles are
disjoint – they have no symbols in common. Call this the cycle notation for f .
Definition 1.14. In general, an r-cycle is a permutation
(a1 a2 . . . ar)
which sends a1 → a2 → · · · → ar → a1.
Eg 1.21. Can easily go from cycle notation back to two-row notation, e.g.
g = (1 5 3)(2 4)(6 7) ∈ S7
g = ( 1 2 3 4 5 6 7
      5 4 1 2 3 7 6 )
Proposition 1.7. Every permutation f in Sn can be expressed in the cycle notation, i.e.
as a product of disjoint cycles.
Proof. The following procedure works:
Start with 1, and write down the sequence
1, f(1), f²(1), . . . , f^(r−1)(1)
until the first repeat f^r(1). Then in fact f^r(1) = 1: if f^r(1) = f^i(1) with 0 < i < r, then applying f^(−i) gives
f^(r−i)(1) = 1
a repeat occurring before step r – contradiction. So we have the r-cycle
(1 f(1) · · · f^(r−1)(1))
the first cycle of f.
Second cycle: Pick a symbol i not in the first cycle, and write
i, f(i), f²(i), . . . , f^(s−1)(i)
where f^s(i) = i. This is the second cycle of f. It is disjoint from the first: if not, say f^j(i) = k lies
in the first cycle; then i = f^(s−j)(k) would also lie in the first cycle (which is closed under f) – contradiction.
Now carry on: pick a symbol not in the first two cycles and repeat to get the third cycle, and so on until
we have used all the symbols 1, . . . , n. So
f = (1 f(1) · · · f^(r−1)(1))(i f(i) · · · f^(s−1)(i)) . . .
a product of disjoint cycles. □
Note 1.3. Cycle notation is not quite unique – e.g. (1 2 3 4) can be written as (2 3 4 1),
and (1 2)(3 4 5) = (3 4 5)(1 2). The notation is unique apart from such changes.
Eg 1.22.
1. The elements of S3 in cycle notation:
e   = (1)(2)(3)
a   = (1 2 3)
a²  = (1 3 2)
b   = (1 2)(3)
ab  = (1 3)(2)
a²b = (2 3)(1)
2. For disjoint cycles, order of multiplication does not matter, e.g.
(1 2)(3 4 5) = (3 4 5)(1 2)
For non-disjoint cycles it does matter, e.g.
(1 2)(1 3) ≠ (1 3)(1 2)
3. Multiplication is easy using cycle notation, e.g.
f = (1 2 3 5 4) ∈ S5
g = (2 4)(1 5 3) ∈ S5
then
f g = (1 4 3 2)(5)
Definition 1.15. Let g = (1 2 3)(4 5)(6 7)(8)(9) ∈ S9. The cycle shape of g is
(3, 2, 2, 1, 1)
i.e. the sequence of numbers giving the cycle-length of g in descending order. Abbreviate:
(3, 2², 1²)
Eg 1.23. How many permutations of each cycle-shape are there in S4?

 cycle-shape    e.g.            number in S4
 (1⁴)           e               1
 (2, 1²)        (1 2)(3)(4)     (4 choose 2) = 6
 (3, 1)         (1 2 3)(4)      (4 choose 3) × 2 = 8
 (4)            (1 2 3 4)       3! = 6
 (2²)           (1 2)(3 4)      3

Total 24 = 4!.
1.4.1 Order of a permutation
Recall the order o(f) of f ∈ Sn is the smallest positive integer k such that f^k = e.
Eg 1.24. Let f = (1 2 3 4), a 4-cycle. Then
f¹ = f
f² = (1 3)(2 4)
f³ = (1 4 3 2)
f⁴ = e
So o(f) = 4. Similarly, if f = (1 2 . . . r) then o(f) = r.
Eg 1.25. g = (1 2 3)(4 5 6 7). What is o(g)?
g² = (1 2 3)(4 5 6 7) ◦ (1 2 3)(4 5 6 7) = (1 2 3)²(4 5 6 7)²   (disjoint cycles commute)
Similarly
g^i = (1 2 3)^i (4 5 6 7)^i
To make g^i = e, we need i to be divisible by 3 (to get rid of (1 2 3)^i) and by 4 (to get rid of
(4 5 6 7)^i). So o(g) = lcm(3, 4) = 12.
Same argument gives
Proposition 1.8. The order of a permutation, written as a product of disjoint cycles, is the least common
multiple of the cycle lengths.
Eg 1.26. Order of (1 2)(3 4 5 6) is lcm(2, 4) = 4. The order of (1 3)(3 4 5 6) is not 4 (not
disjoint)
Eg 1.27. Pack of 8 cards. Shuffle by dividing into two halves and interlacing, so if original
order is 1, 2, . . . , 8 then the order is 1, 5, 2, 6, 3, 7, 4, 8. How many shuffles bring cards back
to original order?
This is the permutation s ∈ S8
s = ( 1 2 3 4 5 6 7 8
      1 5 2 6 3 7 4 8 )
In cycle notation s = (1)(2 5 3)(4 6 7)(8). So the order of s is o(s) = lcm(3, 3, 1, 1) = 3, so 3
shuffles are required.
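This computation is easy to mechanise. A small sketch (not part of the notes; cycles is a hypothetical helper) that finds the disjoint cycles of s and applies Proposition 1.8:

# Disjoint-cycle decomposition and order of a permutation.
from math import lcm

def cycles(perm):
    # perm maps each symbol k in 1..n to perm[k]
    n = len(perm)
    seen, result = set(), []
    for start in range(1, n + 1):
        if start in seen:
            continue
        cycle, k = [], start
        while k not in seen:
            seen.add(k)
            cycle.append(k)
            k = perm[k]
        result.append(tuple(cycle))
    return result

# the 8-card shuffle s from the example above
s = {1: 1, 2: 5, 3: 2, 4: 6, 5: 3, 6: 7, 7: 4, 8: 8}
cs = cycles(s)
print(cs)                          # [(1,), (2, 5, 3), (4, 6, 7), (8,)]
print(lcm(*[len(c) for c in cs]))  # 3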
1.5 Lagrange's Theorem
Recall G a finite group means G has a finite number of elements. Size of G is |G|, e.g.
|S3 | = 6.
Theorem 1.1. Let G be a finite group. If H is any subgroup of G, then |H| divides |G|.
Eg 1.28. Subgroups of S3 have size 1, 2, 3 or 6.
Note 1.4. It does not work the other way round, i.e. if a is a number dividing |G|, then
there may well not exist a subgroup of G of size a.
1.5.1 Consequences of Lagrange's Theorem
Corollary 1. If G is a finite group and a ∈ G then o(a) divides |G|.
Proof. Let H = ⟨a⟩, the cyclic subgroup of G generated by a. By Proposition 1.6, |H| = o(a), so by
Lagrange, o(a) divides |G|. □
Corollary 2. Let G be a finite group and let n = |G|. If a ∈ G, then a^n = e.
Proof. Let k = o(a). By Corollary 1, k divides n. Say n = kr. So a^n = (a^k)^r = e^r = e. □
Corollary 3. If |G| is a prime number, then G is cyclic.
Proof. Let |G| = p, prime. Pick a ∈ G with a ≠ e. By Lagrange, the cyclic subgroup ⟨a⟩
has size dividing p. It contains e, a, so has size ≥ 2, therefore has size p. As |G| = p, this
implies G = ⟨a⟩, cyclic. □
Eg 1.29. Subgroups of S3: these have size 1, 2, 3 or 6. A subgroup of size 1 is {e}; subgroups of size
2 or 3 are cyclic by Corollary 3; size 6 is S3 itself. So we know all the subgroups of S3.
1.6 Applications to number theory
Definition 1.16. Fix a positive integer m ∈ N. For any integer r, the residue class of r
modulo m denoted [r]m is
[r]m = {km + r | k ∈ Z}
Eg 1.30.
[0]5 = {5k | k ∈ Z}
[1]5 = {. . . , −9, −4, 1, 6, 11, . . . } = [6]5
[−2]5 = [3]5 = [8]5
Since every integer is congruent to 0, 1, 2, . . . , m − 1 modulo m,
[0]m ∪ [1]m ∪ · · · ∪ [m − 1]m = Z
and every integer is in exactly one of these residue classes.
Proposition 1.9.
[a]m = [b]m ⇔ a ≡ b mod m
Proof.
→ Suppose [a]m = [b]m . As a ∈ [a]m this implies a ∈ [b]m , so a ≡ b mod m.
← Suppose a ≡ b mod m. Now
x ≡ a mod m ⇔ x ≡ b mod m
(as ≡ is an equivalence relation). So
x ∈ [a]m ⇔ x ∈ [b]m
Therefore [a]m = [b]m .
□
Eg 1.31.
[17]9 = [−19]9
Definition 1.17. Write Zm for the set of all the residue classes
[0]m , [1]m , . . . , [m − 1]m
From now on we’ll usually drop the subscript m and write
[r] = [r]m
Definition 1.18. Define +, × on Zm by
[a] + [b] = [a + b]
[a] · [b] = [ab]
This is well-defined: if [a] = [a′] and [b] = [b′], then
[a + b] = [a′ + b′]
[ab] = [a′b′]
Eg 1.32. In Z5:
[2] + [4] = [1]
[3] + [3] = [1]
[3] · [3] = [4]
1.6.1 Groups
Eg 1.33. (Zm , +) is a group. What about (Zm , ×)? Identity will be [1]. So [0] will have
no inverse (as [0] [a] = [0]). So let
Z∗m = Zm − {[0]}
For which m is (Z∗m , ×) a group?
Eg 1.34.
Z∗2 = {[1]}. This is a group.
Z∗3 = {[1], [2]}:

   ·     [1]   [2]
  [1]    [1]   [2]
  [2]    [2]   [1]

Compare with S2 to see it is a group.
Z∗4 = {[1], [2], [3]}:
Here [2] ∈ Z∗4, but [2][2] = [4] = [0] ∉ Z∗4, so the multiplication table does not close up.
Theorem 1.2. (Z∗m , ×) is a group iff m is a prime number.
Proof.
→ Suppose Z∗m is a group. If m is not a prime, then
m = ab, 1 < a, b < m
so [a], [b] ∈ Z∗m (neither is [0]). But
[a] · [b] = [ab] = [m] = [0]
This contradicts closure. So m is prime.
← Suppose m is a prime, write m = p.
We show that Z∗p is a group.
– Closure Let [a], [b] ∈ Z∗p. Then [a], [b] ≠ [0], so p ∤ a and p ∤ b. Then p ∤ ab (as
p is prime – result from M1F). So
[a][b] = [ab] ≠ [0]
Thus [a] [b] ∈ Z∗p .
– Associativity
([a][b])[c] = [ab][c] = [(ab)c]
[a]([b][c]) = [a][bc] = [a(bc)]
These are equal as (ab)c = a(bc) for a, b, c ∈ Z.
– Identity is [1] as [a] [1] = [1] [a] = [a].
– Inverses Let [a] ∈ Z∗p . We want to find [a′ ] such that [a] [a′ ] = [a′ ] [a] = [1], i.e.
[aa′ ] = [1]
aa′ ≡ 1 mod p
Here’s how. Well, [a] 6= [0] so p 6 |a. As p is prime, hcf (p, a) = 1. By M1F,
there exist integers s, t ∈ Z with
sp + ta = 1
Then
ta = 1 − sp ≡ 1
mod p
So
[t] [a] = [1]
Then [t] ∈ Z∗p ([t] ≠ [0]) and [t] = [a]⁻¹.
□
So, Z∗p (p prime)
(1) is abelian
(2) has p − 1 elements
Eg 1.35. Z∗5 = {[1], [2], [3], [4]}. Is Z∗5 cyclic?
Well,
[2]² = [4]    [2]³ = [3]    [2]⁴ = [1]
So Z∗5 = ⟨[2]⟩.
Eg 1.36. In the group Z∗31 what is [7]⁻¹?
From the proof above, we want to find s, t with
7s + 31t = 1
Use the Euclidean algorithm:
31 = 4 · 7 + 3
7 = 2 · 3 + 1
So
1 = 7 − 2 · 3
  = 7 − 2(31 − 4 · 7)
  = 9 · 7 − 2 · 31
So [7]⁻¹ = [9].
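This back-substitution is the extended Euclidean algorithm, and it is easy to run mechanically. A minimal sketch (not from the notes; inverse_mod is a hypothetical helper name):

# Extended Euclid: find s with s*a + t*p = 1, so [a]^(-1) = [s] in Z_p*.
def inverse_mod(a, p):
    # invariant maintained: old_r = old_s*a (mod p)
    old_r, r = a, p
    old_s, s = 1, 0
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
    assert old_r == 1, "a must be coprime to p"
    return old_s % p

print(inverse_mod(7, 31))  # 9, matching [7]^(-1) = [9] above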
1.7 Applications of the group Z∗p
Theorem 1.3. (Fermat’s Little Theorem) Let p be a prime, and let n be an integer not
divisible by p. Then
n^(p−1) ≡ 1 mod p
Proof. Work in the group
Z∗p = {[1], . . . , [p − 1]}
As p ∤ n, [n] ≠ [0], so
[n] ∈ Z∗p
Now Corollary 2 says: if |G| = k then a^k = e for all a ∈ G. Hence
[n]^(p−1) = identity of Z∗p = [1]
Since
[n]^(p−1) = [n] · · · [n] = [n^(p−1)]
we get
[n^(p−1)] = [1] ⇒ n^(p−1) ≡ 1 mod p
(from Prop. 1.9). □
Corollary 4. Let p be prime. Then for all integers n
n^p ≡ n mod p
Proof. If p ∤ n then by FLT
n^(p−1) ≡ 1 mod p
n^p ≡ n mod p
If p | n then both n^p and n are congruent to 0 mod p. □
Eg 1.37. p = 5: 13⁴ ≡ 1 mod 5.
p = 17: 62^16 ≡ 1 mod 17.
Eg 1.38. Find the remainder when 6^82 is divided by 17.
6^16 ≡ 1 mod 17   (FLT)
(6^16)^5 = 6^80 ≡ 1 mod 17
6^82 = 6^80 · 6² ≡ 6² ≡ 2 mod 17
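As a quick sanity check (not part of the notes), Python's built-in three-argument pow computes such modular powers directly:

# modular exponentiation: pow(base, exponent, modulus)
print(pow(6, 82, 17))  # 2, as computed above
print(pow(6, 16, 17))  # 1, as Fermat's Little Theorem predicts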
Second application.
1.7.1 Mersenne Primes
Definition 1.19. A prime number p is called a Mersenne prime if p = 2^n − 1 for some
n ∈ N.
Eg 1.39.
2² − 1 = 3
2³ − 1 = 7
2⁴ − 1 = 15
2⁵ − 1 = 31
2⁷ − 1 = 127
The largest known primes are Mersenne primes. The largest known (as of 2/2/06) is
2^30402457 − 1
Connection with perfect numbers
Definition 1.20. A positive integer N is perfect if N is equal to the sum of its positive
divisors (including 1, but not N itself).
Eg 1.40.
6 = 1+2+3
28 = 1 + 2 + 4 + 7 + 14
Theorem 1.4. (Euler)
(1) If 2^n − 1 is prime then 2^(n−1)(2^n − 1) is perfect.
(2) Every even perfect number is of this form.
Proof.
(1) Sheet 4.
(2) Harder – look it up.
□
It is still unsolved – is there an odd perfect number?
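A small sketch (not part of the notes) verifying the first few cases of Euler's theorem by brute force:

# For each n with 2^n - 1 prime, 2^(n-1) * (2^n - 1) should be perfect.
def is_perfect(N):
    # sum of proper divisors (including 1, excluding N)
    return N == sum(d for d in range(1, N) if N % d == 0)

for n in [2, 3, 5, 7]:               # 2^n - 1 = 3, 7, 31, 127, all prime
    N = 2 ** (n - 1) * (2 ** n - 1)
    print(N, is_perfect(N))          # 6, 28, 496, 8128 -- all perfect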
1.7.2 How to find Mersenne Primes
Proposition 1.10. If 2^n − 1 is prime, then n must be prime.
Proof. Suppose n is not prime. So
n = ab, 1 < a, b < n
Then
2^n − 1 = 2^(ab) − 1
        = (2^a − 1)(2^(a(b−1)) + 2^(a(b−2)) + · · · + 2^a + 1)
(using x^b − 1 = (x − 1)(x^(b−1) + · · · + 1) with x = 2^a)
So 2^n − 1 has a factor 2^a − 1 > 1, so is not prime. Hence 2^n − 1 prime implies n prime. □
Eg 1.41. We know
2² − 1, 2³ − 1, 2⁵ − 1, 2⁷ − 1
are prime. Next cases:
2^11 − 1, 2^13 − 1, 2^17 − 1
Are these prime?
We will answer this using the group Z∗p . We will need
Proposition 1.11. Let G be a group, and let a ∈ G. Suppose a^n = e. Then o(a) | n.
Proof. Let K = o(a). Write
n = qK + r, 0 ≤ r < K
Then
e = a^n = a^(qK+r)
  = a^(qK) a^r = (a^K)^q a^r
  = e^q a^r
  = a^r
So a^r = e. Since K is the smallest positive integer such that a^K = e, and 0 ≤ r < K, this forces
r = 0. Hence K = o(a) divides n. □
Proposition 1.12. Let N = 2^p − 1, p prime. Let q be prime, and suppose q | N. Then
q ≡ 1 mod p.
Proof. q | N means N ≡ 0 mod q, i.e.
2^p ≡ 1 mod q
This means that
[2]^p = [1] in Z∗q
We know that Z∗q is a group of order q − 1. By Proposition 1.11, o([2]) in Z∗q divides p, so is
1 or p as p is prime.
If o([2]) = 1, then
[2] = [1] in Z∗q
that is
2 ≡ 1 mod q
1 ≡ 0 mod q
so q|1, a contradiction.
Hence we must have
o([2]) = p
By Corollary 1,
o([2]) divides |Z∗q| = q − 1
That is, p divides q − 1:
q − 1 ≡ 0 mod p
q ≡ 1 mod p
□
Test for a Mersenne prime N = 2^p − 1:
List all the primes q with q ≡ 1 mod p and q < √N, and check, one by one, whether any
divide N. If none of them divide N, then N is prime (a composite N must have a prime factor at most √N).
Eg 1.42. p = 11. N = 2^11 − 1 = 2047, √N < 50. Which primes q less than 50 have q ≡ 1
mod 11? We check through all numbers congruent to 1 mod 11:
12, 23, 34, 45
The only prime less than 50 that can possibly divide 2047 is 23. Now we check whether
23 | 2^11 − 1, i.e., whether 2^11 ≡ 1 mod 23:
2⁵ ≡ 32 ≡ 9 mod 23
2^10 ≡ (2⁵)² ≡ 9² ≡ 81 ≡ 12 mod 23
2^11 ≡ 24 ≡ 1 mod 23
Conclusion – 2^11 − 1 is not a prime – it has a factor 23.
Eg 1.43. 2^13 − 1 is prime – Exercise sheet.
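The test above can be mechanised. A sketch under the stated method (not from the notes; is_prime and mersenne_is_prime are hypothetical helpers):

# Trial-divide N = 2^p - 1 only by primes q <= sqrt(N) with q = 1 (mod p),
# as Proposition 1.12 allows.
from math import isqrt

def is_prime(n):
    return n > 1 and all(n % d for d in range(2, isqrt(n) + 1))

def mersenne_is_prime(p):
    N = 2 ** p - 1
    q = p + 1                 # candidates are congruent to 1 mod p
    while q <= isqrt(N):
        if is_prime(q) and N % q == 0:
            return False, q   # found a prime factor
        q += p
    return True, None

print(mersenne_is_prime(11))  # (False, 23)
print(mersenne_is_prime(13))  # (True, None)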
1.8 Proof of Lagrange's Theorem
Now we prove Lagrange's Theorem.
Theorem 1.5. Let G be a finite group of order |G|, with a subgroup H of order |H| = m.
Then m divides |G|.
Note 1.5. The idea – write H = {h1, . . . , hm}. Then we divide G into r "blocks", shown here as columns:

   1      2      3     · · ·   r
   H      Hx     Hy    · · ·
   h1     h1x    h1y
   h2     h2x    h2y
   ..
   .
We want the blocks to have the following three properties
(1) Each block has m distinct elements
(2) No element of G belongs to two blocks
(3) Every element of G belongs to (exactly) one block
Then |G| is the total number of elements listed in the blocks, i.e. rm, so m divides |G|.
Definition 1.21. For x ∈ G, H a subgroup of G, define the right coset
Hx = {hx | h ∈ H} = {h1x, h2x, . . . , hmx}
The official name for a "block" is a right coset.
Note 1.6. Hx ⊆ G
Eg 1.44. G = S3, H = ⟨a⟩ = {e, a, a²}, a = (1 2 3).
H = He = Ha = Ha²; e.g. Ha² = {ea², aa², a²a²} = {a², e, a}
Take b = (1 2), so b² = e:
Hb = {eb, ab, a²b} = {b, ab, a²b}
So G splits into the two blocks {e, a, a²} and {b, ab, a²b}.
Lemma 1.2. For any x in G,
|Hx| = m
Proof. By definition, we have
Hx = {h1x, . . . , hmx}
These elements are all different: if
hix = hjx
then multiplying by x⁻¹ on the right gives hi = hj. So |Hx| = m. □
Lemma 1.3. If x, y ∈ G then either Hx = Hy or Hx ∩ Hy = ∅.
Proof. Suppose
Hx ∩ Hy ≠ ∅
We will show this implies Hx = Hy.
We can choose an element a ∈ Hx ∩ Hy. Then
a = hix and a = hjy
for some hi, hj ∈ H. So
hix = hjy
x = hi⁻¹hjy
Then for any h ∈ H
hx = h hi⁻¹hj y
As H is a subgroup, h hi⁻¹hj ∈ H. Hence
hx ∈ Hy
This shows Hx ⊆ Hy.
Similarly, from
hix = hjy
y = hj⁻¹hix
so for any h ∈ H
hy = h hj⁻¹hi x ∈ Hx
So Hy ⊆ Hx.
We conclude Hx = Hy. □
Lemma 1.4. Let x ∈ G. Then x lies in the right coset Hx.
Proof. As H is a subgroup, e ∈ H. So x = ex ∈ Hx. □
Theorem 1.6. Let G be a finite group of order |G|, with a subgroup H of order |H| = m.
Then m divides |G|.
Proof. By Lemma 1.4, G is equal to the union of all the right cosets of H, i.e.
G = ∪ (x ∈ G) Hx
Some of these right cosets will be equal (e.g. G = S3, H = ⟨a⟩, then H = He = Ha = Ha²).
Let the list of different right cosets be
Hx1, . . . , Hxr
Then
G = Hx1 ∪ Hx2 ∪ · · · ∪ Hxr    (1.1)
and Hxi ≠ Hxj if i ≠ j (e.g. in G = S3, G = H ∪ Hb).
By Lemma 1.3, Hxi ∩ Hxj = ∅ if i ≠ j, so (1.1) pictures G as a disjoint union of blocks
Hx1 | Hx2 | · · · | Hxr.
So |G| = |Hx1| + · · · + |Hxr|. By Lemma 1.2,
|Hxi| = m = |H|
So
|G| = rm = r|H|
Therefore |H| divides |G|. □
Proposition 1.13. Let G be a finite group, and H a subgroup of G. Let
r = |G| / |H|
Then there are exactly r different right cosets of H in G, say
Hx1, . . . , Hxr
They are disjoint, and
G = Hx1 ∪ · · · ∪ Hxr
Definition 1.22. The integer r = |G| / |H| is called the index of H in G, written
r = |G : H|
Eg 1.45.
(1) G = S3, H = ⟨a⟩ = {e, a, a²}. Index |G : H| = 6/3 = 2. There are 2 right cosets H,
Hb, and G = H ∪ Hb.
(2) G = S3, K = ⟨b⟩ = {e, b} where b = (1 2)(3). Index |G : K| = 6/2 = 3. So there are 3
right cosets – they are
Ke = K = {e, b}
Ka = {a, ba} = {a, a²b}
Ka² = {a², ba²} = {a², ab}
Chapter 2
Vector Spaces and Linear Algebra
Recall
Rn = {(x1 , x2 , . . . , xn ) | xi ∈ R}
Basic operations on Rn :
• addition (x1 , . . . , xn ) + (y1 , . . . , yn ) = (x1 + y1 , . . . , xn + yn )
• scalar multiplication λ(x1 , . . . , xn ) = (λx1 , . . . , λxn ), λ ∈ R
These operations satisfy the following rules:
• Addition rules
A1 u + (v + w) = (u + v) + w
associativity
A2 v + 0 = 0 + v = v
identity
A3 v + (−v) = 0
inverses
A4 u + v = v + u
abelian
(These say (Rn , +) is an abelian group)
• Scalar multiplication rules
S1 λ(v + w) = λv + λw
S2 (λ + µ)v = λv + µv
S3 λ(µv) = (λµ)v
S4 1v = v
These are easily proved for Rn :
Eg 2.1.
A1:
u + (v + w) = (u1, . . . , un) + ((v1, . . . , vn) + (w1, . . . , wn))
            = (u1, . . . , un) + (v1 + w1, . . . , vn + wn)
            = (u1 + (v1 + w1), . . . )
            = ((u1 + v1) + w1, . . . )    (assoc. of (R, +))
            = ((u1, . . . ) + (v1, . . . )) + (w1, . . . )
            = (u + v) + w
S3:
λ(μv) = λ(μv1, . . . , μvn)
      = (λ(μv1), . . . , λ(μvn))
      = ((λμ)v1, . . . , (λμ)vn)    (assoc. of (R, ×))
      = (λμ)v

2.1 Definition of a vector space
A vector space will be a set of objects with addition and scalar multiplication defined
satisfying the above axioms. Want to let the scalars be either R or C (or a lot of other
things). So let
F = either R or C
Definition 2.1. A vector space over F is a set V of objects called vectors together with
a set of scalars F and with
• a rule for adding any two vectors v, w ∈ V to get a vector v + w ∈ V
• a rule for multiplying any vector v ∈ V by any scalar λ ∈ F to get a vector λv ∈ V .
• a zero vector 0 ∈ V
• for any v ∈ V , a vector −v ∈ V
Such that the axioms A1-A4 and S1-S4 are satisfied.
There are many different types of vector spaces
Eg 2.2.
(1) Rn is a vector space over R
(2) Cn = {(z1 , . . . , zn ) | zi ∈ C} with addition u + v and scalar multiplication λv (λ ∈ C)
is a vector space over C.
(3) Let m, n ∈ N. Define
Mm,n = set of all m × n matrices with real entries
(So in this example, “vectors” are matrices.) Adopt the usual rules for addition and
scalar multiplication of matrices: A = (aij ), B = (bij ), λ ∈ R
A + B = (aij + bij )
λA = (λaij )
Zero vector is the matrix 0 (m × n zero matrix). And −A = (−aij ). Then Mm,n
becomes a vector space over R (check axioms).
(4) A non-example: Let V = R2 , with usual addition defined and new scalar multiplication: λ ∗ (x1 , x2 ) = (λx1 , 0). Let’s check axioms
– A1-A4 hold
– S1 λ ∗ (v + w) = λ ∗ v + λ ∗ w holds
– S2 (λ + µ) ∗ v = λ ∗ v + µ ∗ v holds
– S3 λ ∗ (μ ∗ v) = (λμ) ∗ v holds
– S4 1 ∗ v = v fails. To show this, need to produce just one v for which it fails,
e.g. 1 ∗ (17, 259) = (17, 0) ≠ (17, 259)
(5) Functions. Let
V = set of all functions f : R → R
So “vectors” are functions.
– Addition: f + g is the function x ↦ f(x) + g(x)
– Scalar multiplication: λf is the function x ↦ λf(x) (λ ∈ R)
– Zero vector is the function 0 : x ↦ 0
– Negatives: −f is the function x ↦ −f(x)
Check the axioms
– A1 using associativity of (R, +):
(f + (g + h))(x) = f(x) + (g + h)(x)
                 = f(x) + (g(x) + h(x))
                 = (f(x) + g(x)) + h(x)
                 = (f + g)(x) + h(x)
                 = ((f + g) + h)(x)
Conclude V is a vector space over R.
(6) Polynomials. Recall a polynomial over R is an expression
p(x) = a0 + a1x + · · · + an x^n
with all ai ∈ R. Let
V = set of all polynomials over R
– Addition If p(x) = Σ ai x^i, q(x) = Σ bi x^i then
p(x) + q(x) = Σ (ai + bi) x^i
– Scalar multiplication If p(x) = Σ ai x^i, ai ∈ R, then
λp(x) = Σ λai x^i
– Zero vector is 0 – the polynomial with all coefficients 0
– Negative of p(x) = Σ ai x^i is
−p(x) = Σ −ai x^i
Now check A1-A4, S1-S4. So V is a vector space over R.
Consequence of axioms
Proposition 2.1. Let V be a vector space over F and let v ∈ V , λ ∈ F
(1) 0v = 0
(2) λ0 = 0
(3) if λv = 0 then λ = 0 or v = 0
(4) (−λ)v = −(λv) = λ(−v)
Proof.
(1) Observe
0v = (0 + 0)v = 0v + 0v    (by S2)
Adding −(0v) to both sides:
0v + (−(0v)) = (0v + 0v) + (−(0v))
0 = 0v
(2)
λ0 = λ(0 + 0) = λ0 + λ0    (by S1)
so, adding −(λ0) to both sides, 0 = λ0.
Parts (3), (4) – Ex. sheet 5. □
2.2 Subspaces
Definition 2.2. Let V be a vector space over F , and let W ⊆ V . Say W is a subspace of
V if W is itself a vector space, with the same addition and scalar multiplication as V .
Criterion for subspaces
Proposition 2.2. W is a subspace of vector space V if the following hold:
(1) 0 ∈ W
(2) if v, w ∈ W then v + w ∈ W
(3) if w ∈ W , λ ∈ F then λw ∈ W
Proof. Assume (1), (2), (3). We show W is a vector space.
• Addition and scalar multiplication on W are defined by (2), (3).
• Zero vector 0 ∈ W by (1)
• Negative −w = (−1)w ∈ W by (3).
Finally, A1-A4, S1-S4 hold for W since they hold for V .
Eg 2.3.
1. V is a subspace of itself.
2. {0} is a subspace of any vector space.
3. Let V = R2 and
W = {(x1 , x2 ) | x1 + 2x2 = 0}
Claim W is a subspace of R2 .
Proof. Check (1)-(3) from the proposition
(1) 0 ∈ W since 0 + 2 · 0 = 0
(2) Let v = (v1 , v2 ) ∈ W , w = (w1 , w2 ) ∈ W . So
v1 + 2v2 = w1 + 2w2 = 0
v1 + w1 + 2(v2 + w2 ) = 0
v + w = (v1 + w1 , v2 + w2 ) ∈ W
(3) Let v = (v1 , v2 ) ∈ W , λ ∈ R. Then
v1 + 2v2 = 0
λv1 + 2λv2 = 0
λv = (λv1 , λv2 ) ∈ W
So W is a subspace by Proposition 2.2. □
4. Same proof shows that any line through 0 (ie. px1 + qx2 = 0) is a subspace of R2 .
Note 2.1. A line not through the origin is not a subspace (no zero vector).
The only subspace of R2 are: lines through 0, R2 itself, {0}.
5. Let V = vector space of polynomials over R. Define
W = polynomials of degree at most 3
(recall deg(p(x)) = highest power of x appearing in p(x)).
Claim W is a subspace of V .
Proof.
(1) 0 ∈ W
(2) if p(x), q(x) ∈ W then deg(p), deg(q) ≤ 3, hence deg(p + q) ≤ 3, so p + q ∈ W .
(3) if p(x) ∈ W, λ ∈ R, then λp(x) has degree at most 3, so λp(x) ∈ W.
□
2.3 Solution spaces
Vast collection of subspaces of Rn is provided by the following
Proposition 2.3. Let A be an m × n matrix with real entries and let
W = {x ∈ Rn | Ax = 0}
(The set of solutions of the system of linear equations Ax = 0)
Then W is a subspace of Rn .
Proof. We check 3 conditions of 2.2.
(1) 0 ∈ W (as A0 = 0)
(2) if v, w ∈ W then Av = Aw = 0. Hence A(v + w) = 0, so v + w ∈ W
(3) if v ∈ W , λ ∈ R (Av = 0), then A(λv) = λ(Av) = λ0 = 0, so λv ∈ W
□
Definition 2.3. The system Ax = 0 is a homogeneous system of linear equations, and W
is called the solution space
Eg 2.4.
1. m = 1, n = 2, A = (a b). Then
W = {x ∈ R² | ax1 + bx2 = 0}
which is a line through 0.
2. m = 1, n = 3, A = (a b c). Then
W = {x ∈ R³ | ax1 + bx2 + cx3 = 0}
a plane through 0.
3. m = 2, n = 4, A = ( 1 2 1 0 ; −1 0 1 2 ). Here
W = {x ∈ R⁴ | x1 + 2x2 + x3 = 0, −x1 + x3 + 2x4 = 0}
4. Try a non-linear equation:
W = {(x1, x2) ∈ R² | x1x2 = 0}
Is W a subspace? The answer is no. To show this, we need a single counterexample to one of the conditions
of 2.2, e.g.:
(1, 0), (0, 1) ∈ W, but (1, 0) + (0, 1) = (1, 1) ∉ W.
2.4 Linear Combinations
Definition 2.4. Let V be a vector space over F and let v1 , v2 , . . . , vk be vectors in V . A
vector v ∈ V of the form
v = λ1 v1 + λ2 v2 + · · · + λk vk
is called a linear combination of v1 , . . . , vk .
Eg 2.5.
1. V = R². Let v1 = (1, 1). The linear combinations of v1 are the vectors
v = λv1 = (λ, λ)   (λ ∈ R)
These form the line through the origin and v1, i.e. x1 − x2 = 0.
2. V = R2 . Let
v1 = (1, 0)
v2 = (0, 1)
The linear combinations of v1 , v2 are
λ1 v1 + λ2 v2 = (λ1 , λ2 )
So every vector in R2 is a linear combination of v1 , v2 .
3. V = R3 . Let
v1 = (1, 1, 1)
v2 = (2, 2, −1)
Typical linear combination is
λ1 v1 + λ2 v2 = (λ1 + 2λ2 , λ1 + 2λ2 , λ1 − λ2 )
This gives all vectors in the plane containing origin, v1 , v2 , which is x1 − x2 = 0. So eg.
(1, 0, 0) is not a linear combination of v1 , v2 .
2.5 Span
Definition 2.5. Let V be a vector space over F , and let v1 , . . . , vk be vectors in V . Define
the span of v1 , . . . , vk , written
Sp(v1 , . . . , vk )
to be the set of all linear combinations of v1 , . . . , vk . In other words
Sp(v1 , . . . , vk ) = {λ1 v1 + · · · + λk vk | λi ∈ F } ⊆ V
Eg 2.6.
1. V = R², any v1 ∈ V. Then
Sp(v1) = all vectors λv1 (λ ∈ R)
       = line through 0, v1
2. In R2 ,
Sp((1, 0), (0, 1)) = R2
3. In R³, v1 = (1, 1, 1), v2 = (2, 2, −1):
Sp(v1, v2) = plane containing 0, v1, v2
           = plane x1 = x2
4. In R³,
Sp(v1 = (1, 0, 0), v2 = (0, 1, 0), v3 = (0, 0, 1)) = whole of R³
5. V = R³. Let
w1 = (1, 0, 0)
w2 = (1, 1, 0)
w3 = (1, 1, 1)
Claim: Sp(w1, w2, w3) = R³.
Proof. Observe
v1 = w1
v2 = w2 − w1
v3 = w3 − w2
Hence any linear combination of v1 , v2 , v3 is also a linear combination of w1 , w2 , w3 (i.e.
(λ1 , λ2 , λ3 ) = λ1 v1 +λ2 v2 +λ3 v3 = λ1 w1 +λ2 (w2 −w1 )+λ3 (w3 −w2 ) ∈ Sp(w1 , w2 , w3 ))
□
6. V = vector space of polynomials over R. Let
v1 = 1
v2 = x
v3 = x2
Then
Sp(v1, v2, v3) = {λ1v1 + λ2v2 + λ3v3 | λi ∈ R}
              = {λ1 + λ2x + λ3x² | λi ∈ R}
              = set of all polynomials of degree ≤ 2
Eg 2.7. In general, if v1, v2 are vectors in R³, not on the same line through 0 (i.e. v2 ≠ λv1 for any λ),
then
Sp(v1, v2) = plane through 0, v1, v2
Proposition 2.4. V vector space, v1 , . . . , vk ∈ V . Then
Sp(v1 , . . . , vk )
is a subspace of V .
Proof. Check the conditions of 2.2
(1) Taking all λi = 0 (using 2.1)
0v1 + 0v2 + · · · + 0vk = 0 + · · · + 0 = 0
So 0 is a linear combination of v1 , . . . , vk , so 0 ∈ Sp(v1 , . . . , vk )
(2) Let v, w ∈ Sp(v1 , . . . , vk ), so
v = λ1 v1 + · · · + λk vk
w = µ1 v1 + · · · + µk vk
Then v + w = (λ1 + µ1 )v1 + · · · + (λk + µk )vk ∈ Sp(v1 , . . . , vk ).
(3) Let v ∈ Sp(v1, . . . , vk), λ ∈ F, so
v = λ1v1 + · · · + λkvk
so
λv = (λλ1)v1 + · · · + (λλk)vk ∈ Sp(v1, . . . , vk)
□
2.6 Spanning sets
Definition 2.6. V vector space, W a subspace of V . We say vectors v1 , . . . , vk span W if
(1) v1 , . . . , vk ∈ W and
(2) W = Sp(v1, . . . , vk)
Call the set {v1 , . . . , vk } a spanning set of W .
Eg 2.8.
• {(1, 0, 0), (1, 1, 0), (1, 1, 1)} is a spanning set for R³.
• (1, 1, 1) , (2, 2, −1) span the plane x1 − x2 = 0.
• Let
W = { x ∈ R⁴ | ( 1 1 3 1 ; 2 3 1 1 ; 1 0 8 2 ) x = 0 }
Find a (finite) spanning set for W.
Solve the system (row-reduce the augmented matrix):

( 1 1 3 1 | 0        ( 1  1  3  1 | 0        ( 1 1  3  1 | 0
  2 3 1 1 | 0    →     0  1 −5 −1 | 0    →     0 1 −5 −1 | 0
  1 0 8 2 | 0 )        0 −1  5  1 | 0 )        0 0  0  0 | 0 )

The echelon form gives
x1 + x2 + 3x3 + x4 = 0
x2 − 5x3 − x4 = 0
General solution: set x4 = a, x3 = b. Then
x2 = a + 5b
x1 = −a − 3b − (a + 5b) = −2a − 8b
i.e. x = (−2a − 8b, a + 5b, b, a). So W = {(−2a − 8b, a + 5b, b, a) | a, b ∈ R}. Define
two vectors (take a = 1 and b = 0 and vice versa):
w1 = (−2, 1, 0, 1)    (a = 1, b = 0)
w2 = (−8, 5, 1, 0)    (a = 0, b = 1)
Claim W = Sp(w1, w2).
Proof. Observe
(−2a − 8b, a + 5b, b, a) = a(−2, 1, 0, 1) + b(−8, 5, 1, 0) = aw1 + bw2
□
This gives a general method of finding spanning sets of solution spaces.
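As a cross-check (not in the notes), sympy's nullspace() carries out the same echelon-form computation:

# Basis for the solution space of Ax = 0.
from sympy import Matrix

A = Matrix([[1, 1, 3, 1],
            [2, 3, 1, 1],
            [1, 0, 8, 2]])
for v in A.nullspace():
    print(v.T)  # two basis vectors, spanning the same subspace as w1, w2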
2.7 Linear dependence and independence
Definition 2.7. V vector space over F . We say a set of vectors v1 , . . . , vk in V is a linearly
independent set if the following condition holds
λ1v1 + · · · + λkvk = 0 ⇒ all λi = 0
Usually just say the vectors v1 , . . . , vk are linearly independent vectors.
We say the set {v1, . . . , vk} is linearly dependent if the opposite is true, i.e. if we can find
scalars λi such that
(1) λ1 v1 + · · · + λk vk = 0
(2) at least one λi 6= 0
Eg 2.9.
1. V = R2 , v1 = (1, 1). Then {v1 } is a linearly independent set, as
λv1 = 0 ⇒ (λ, λ) = (0, 0)
⇒ λ=0
2. V = R², the set {0} is linearly dependent, e.g.
2 · 0 = 0   (coefficient 2 ≠ 0)
3. In R2 , let v1 = (1, 1), v2 = (2, 1). Is {v1 , v2 } linearly independent?
Consider the equation
λ1v1 + λ2v2 = 0
i.e.
(λ1, λ1) + (2λ2, λ2) = (0, 0)
i.e.
λ1 + 2λ2 = 0
λ1 + λ2 = 0
⇒ λ1 = λ2 = 0. So yes, {v1, v2} is linearly independent.
4. In R³, let
v1 = (1, 0, 1)
v2 = (2, 2, −1)
v3 = (1, 4, −5)
Are v1, v2, v3 linearly independent?
Consider the system
x1v1 + x2v2 + x3v3 = 0    (2.1)
This is the system of linear equations
( 1  2  1
  0  2  4    x = 0
  1 −1 −5 )
(i.e. (v1 v2 v3) x = 0, with the vi as columns). Solve:

( 1  2  1 | 0        ( 1  2  1 | 0        ( 1 2 1 | 0
  0  2  4 | 0    →     0  2  4 | 0    →     0 2 4 | 0
  1 −1 −5 | 0 )        0 −3 −6 | 0 )        0 0 0 | 0 )

Solution: x = (3a, −2a, a) (any a). So
3v1 − 2v2 + v3 = 0
So v1, v2, v3 are linearly dependent. Geometrically, v1, v2 span a plane in R³ and v3 =
−3v1 + 2v2 ∈ Sp(v1, v2) lies in this plane.
In general: in R³, three vectors are linearly dependent iff they are coplanar.
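A quick cross-check (not in the notes) with sympy:

# Verify the dependence relation and that the three vectors span only a plane.
from sympy import Matrix

v1, v2, v3 = Matrix([1, 0, 1]), Matrix([2, 2, -1]), Matrix([1, 4, -5])
print(3 * v1 - 2 * v2 + v3)              # the zero vector
print(Matrix.hstack(v1, v2, v3).rank())  # 2, so coplanar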
5. V = vector space of polynomials over R. Let
p1(x) = 1 + x²
p2(x) = 2 + 2x − x²
p3(x) = 1 + 4x − 5x²
Are p1, p2, p3 linearly dependent?
Consider the equation
λ1p1 + λ2p2 + λ3p3 = 0
Equating coefficients:
λ1 + 2λ2 + λ3 = 0
2λ2 + 4λ3 = 0
λ1 − λ2 − 5λ3 = 0
We showed in the previous example that a solution is
λ1 = 3, λ2 = −2, λ3 = 1
So
3p1 − 2p2 + p3 = 0
So they are linearly dependent.
6. V = vector space of functions R → R. Let
f1 (x) = sin x, f2 (x) = cos x
So f1 , f2 ∈ V . Are f1 , f2 linearly independent? Sheet 6.
Two basic results about linearly independent sets.
Proposition 2.5. Any subset of a linearly independent set of vectors is linearly independent.
Proof. Let S be a lin. indep. set of vectors, and T ⊆ S. Label vectors in S, T
T = {v1 , . . . , vt }
S = {v1 , . . . , vt , vt+1 , . . . , vs }
Suppose
λ1 v1 + · · · + λt vt = 0
Then
λ1 v1 + · · · + λt vt + 0vt+1 + · · · + 0vs = 0
As S is lin. indep., all coeffs must be 0, so all λi = 0. Thus T is lin. indep.
□
Proposition 2.6. V vector space, v1 , . . . , vk ∈ V . Then the following two statements are
equivalent (i.e. (1)⇔(2)).
(1) v1 , . . . , vk are lin. dependent
(2) there exists i such that vi is a linear combination of v1 , . . . , vi−1 .
Proof.
(1) ⇒ (2) Suppose v1, . . . , vk are lin. dep., so there exist λi such that
λ1v1 + · · · + λkvk = 0
and λj ≠ 0 for some j. Choose the largest j for which λj ≠ 0. So
λ1v1 + · · · + λjvj = 0
Then
λjvj = −λ1v1 − · · · − λ(j−1)v(j−1)
So
vj = −(λ1/λj)v1 − · · · − (λ(j−1)/λj)v(j−1)
which is a linear combination of v1, . . . , v(j−1).
(1) ⇐ (2) Assume vi is a linear combination of v1, . . . , v(i−1), say
vi = λ1v1 + · · · + λ(i−1)v(i−1)
Then
λ1v1 + · · · + λ(i−1)v(i−1) − vi + 0v(i+1) + · · · + 0vk = 0
Not all the coefficients in this equation are zero (the coefficient of vi is −1). So v1, . . . , vk are
lin. dependent. □
Eg 2.10. v1 = (1, 0, 1), v2 = (2, 2, −1), v3 = (1, 4, −5) in R³. These are linearly dependent:
3v1 − 2v2 + v3 = 0. And v3 = −3v1 + 2v2 is a linear combination of the previous ones.
Proposition 2.7. V vector space, v1 , . . . , vk ∈ V . Suppose vi is a linear combination of
v1 , . . . , vi−1 . Then
Sp(v1 , . . . , vk ) = Sp(v1 , . . . , vi−1 , vi+1 , . . . , vk )
(i.e. throwing out vi does not change Sp(v1 , . . . , vk ))
Proof. Let
vi = μ1v1 + · · · + μ(i−1)v(i−1)   (μj ∈ F)
Now consider
v = λ1v1 + · · · + λkvk ∈ Sp(v1, . . . , vk)
Then
v = λ1v1 + · · · + λ(i−1)v(i−1) + λi(μ1v1 + · · · + μ(i−1)v(i−1)) + λ(i+1)v(i+1) + · · · + λkvk
So v is a lin. comb. of
v1, . . . , v(i−1), v(i+1), . . . , vk
Therefore Sp(v1, . . . , vk) ⊆ Sp(v1, . . . , v(i−1), v(i+1), . . . , vk). The reverse inclusion is clear, so the
two spans are equal. □
Eg 2.11. v1 = (1, 0, 1), v2 = (2, 2, −1), v3 = (1, 4, −5). Here
v3 = −3v1 + 2v2
So Sp(v1 , v2 , v3 ) = Sp(v1 , v2 ).
2.8 Bases
Definition 2.8. V a vector space. We say a set of vectors {v1, . . . , vk} in V is a basis of V
if
(1) V = Sp(v1 , . . . , vk )
(2) {v1 , . . . , vk } is a linearly independent set.
Informally, a basis is a spanning set for which we cannot throw any of the vectors away.
Eg 2.12.
1. {(1, 0), (0, 1)} is a basis of R² (call the two vectors v1, v2).
Proof.
(1) (x1, x2) = x1v1 + x2v2, so R² = Sp(v1, v2)
(2) v1, v2 are linearly independent as
λ1v1 + λ2v2 = 0 ⇒ (λ1, λ2) = (0, 0) ⇒ λ1 = λ2 = 0
□
2. (1, 0, 0), (1, 1, 0), (1, 1, 1) is a basis of R³.
Proof.
(1) They span R³ – previous example.
(2)
x1v1 + x2v2 + x3v3 = 0
leads to the system
( 1 1 1 | 0
  0 1 1 | 0
  0 0 1 | 0 )
with the only solution x1 = x2 = x3 = 0, ∴ v1, v2, v3 are lin. indep. □
Theorem 2.1. Let V be a vector space with a spanning set v1 , . . . , vk (i.e. V = Sp(v1 , . . . , vk )).
Then there is a subset of {v1 , . . . , vk } which is a basis of V .
Proof. Consider the set
v1 , . . . , vk
We throw away vectors in this list which are linear combinations of the previous vectors
in the list. End up with a basis. Process:
Casting out Process
First, throw away any zero vectors in the list.
• Start at v2: if it is a linear combination of v1 (i.e. v2 = λv1), then delete it; if not,
leave it there.
• Now consider v3 : if it is a linear combination of the remaining previous vectors, delete
it; if not, leave it there.
• Continue, moving from left to right, deleting any vi , which is a linear combination of
previous vectors in the list.
End up with a subset {w1 , . . . , wm } of {v1 , . . . , vk } such that
(1) V = Sp(w1 , . . . , wm ) (by 2.7)
(2) no wi is a linear combination of previous ones.
Then {w1 , . . . , wm } form a linearly independent set by 2.6. Therefore {w1 , . . . , wm } is a
basis of V .
□
Eg 2.13.
1. V = R3 , v1 = (1, 0, 1), v2 = (2, 2, −1), v3 = (1, 4, −5). Let W = Sp(v1 , v2 , v3 ). Find
a basis of W .
1) Is v2 a linear combination of v1 ? No: leave it in.
2) Is v3 a linear combination of v1 , v2 ? Yes: v3 = −3v1 + 2v2 .
So cast out v3 : basis for W is {v1 , v2 }.
2. Here’s a meatier example of The Casting out Process. Let V = R4 and
v1
v2
v3
v4
v5
=
=
=
=
=
(1, −2, 3, 1)
(2, 2, −2, 1)
(5, 2, −1, 3)
(11, 2, 1, 7)
(2, 8, 2, 3)
Let W = Sp(v1 , . . . , v5 ), subspace of R4 . Find abasis
 of W .
v1
 .. 
The all-in-one-go method : Form 5 × 4 matrix  .  and reduce it to echelon form:
v5







v1
1 −2
3 1
2
2 −2 1 
 v2
5
2 −1 3 
 v3
11
2
1 7  v4
v5
2
8
2 3



→ 





→ 



v1
1 −2
3
1
0
6 −8 −1 
 v2 − 2v1
0 12 −16 −2 
 v3 − 5v1
0 24 −32 −4  v4 − 11v1
v5 − 2v1
0 12 −4
1

v1
1 −2
3
1

0
6 −8 −1  v2 − 2v1
0
0
0
0 
 v3 − 5v1 − 2(v2 − 2v1 )
0
0
0
0  v4 − 11v1 − 4(v2 − 2v1 )
v5 − 2v1 − 2(v2 − 2v1 )
0
0 12
3
So v3 is a linear combination of v1 and v2 : cast it out. And v4 is linear combination
of previous ones: cast it out.
Row vectors in echelon form are linearly independent: So last row v5 + 2v1 − 2v2 is
not a linear combination of the first two rows. So v5 is not a linear combination of
v1 , v2 .
Conclude: Basis of W is {v1 , v2 , v5 }.
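The same casting out can be read off from the pivot columns of a reduced matrix. A sketch (not in the notes) using sympy:

# The pivot columns of the matrix with v1, ..., v5 as COLUMNS are exactly
# the vectors the casting-out process keeps.
from sympy import Matrix

vs = [Matrix([1, -2, 3, 1]), Matrix([2, 2, -2, 1]), Matrix([5, 2, -1, 3]),
      Matrix([11, 2, 1, 7]), Matrix([2, 8, 2, 3])]
_, pivots = Matrix.hstack(*vs).rref()
print(pivots)  # (0, 1, 4): keep v1, v2, v5 -- the basis found above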
To help with spanning calculations:
Eg 2.14. Let v1 = (1, 2, −1), v2 = (2, 0, 1), v3 = (0, −1, 3), v4 = (1, 2, 3). Do v1 , v2 , v3 , v4
span the whole of R3 ?
Let b ∈ R³. Then b ∈ Sp(v1, v2, v3, v4) iff the system x1v1 + x2v2 + x3v3 + x4v4 = b has a solution
for x1, x2, x3, x4 ∈ R. This system is (with the vi as columns):

( 1  2  0 1 | b1             ( 1  2  0 1 | b1
  2  0 −1 2 | b2     →         0 −4 −1 0 | b2 − 2b1
 −1  1  3 3 | b3 )             0  3  3 4 | b3 + b1 )

                     →       ( 1  2  0 1 | b1
                               0 −4 −1 0 | b2 − 2b1
                               0  0  9 16 | · · · )

Every row of the echelon form has a pivot, so this system has a solution for any b ∈ R³. Hence
Sp(v1, . . . , v4) = R³.
2.9 Dimension
Definition 2.9. A vector space V is finite-dimensional if it has a finite spanning set, i.e.
there is a finite set of vectors v1 , . . . , vk such that V = Sp(v1 , . . . , vk ).
Eg 2.15. Rn is finite dimensional. To show this, let
e1 = (1, 0, 0, . . . , 0)
e2 = (0, 1, 0, . . . , 0)
..
.
en = (0, 0, 0, . . . , 1)
Then for any x = (x1 , . . . , xn ) ∈ Rn
x = x1 e1 + x2 e2 + · · · + xn en
So Rn = Sp(e1 , . . . , en ) is finite-dimensional.
Note 2.2. {e1 , . . . , en } is linearly independent since λ1 e1 +· · ·+λn en = 0 implies (λ1 , . . . , λn ) =
0, so all λi = 0. So {e1 , . . . , en } is a basis for Rn , called the standard basis.
Eg 2.16. Let V be a vector space of polynomials over R.
Claim: V is not finite-dimensional.
Proof. By contradiction. Assume V has a finite spanning set p1, . . . , pk. Let deg(pi) = ni
and let n = max(n1, . . . , nk). Any linear combination λ1p1 + · · · + λkpk (λi ∈ R) has degree
≤ n. So the polynomial x^(n+1) is not a linear combination of vectors from our assumed spanning
set; contradiction. □
Proposition 2.8. Any finite-dimensional vector space has a basis.
Proof. Let V be a finite-dimensional vector space. Then V has a finite spanning set. This
contains a basis of V by Theorem 2.1.
□
Definition 2.10. The dimension of V is the number of vectors in any basis of V .1 Written
dim V
Eg 2.17. Rn has basis e1 , . . . , en , so dim Rn = n.
Eg 2.18. Let v ∈ R2 , v 6= 0 and let L be the line through 0 and v. So L is a subspace.
L = {λv | λ ∈ R}
So L = Sp(v) and {v} is a basis of L. So dim L = 1.
Eg 2.19. Let v1, v2 ∈ R³ with v1, v2 ≠ 0 and v2 ≠ λv1. Then Sp(v1, v2) = P is a plane
through 0, v1, v2. As v2 ≠ λv1, {v1, v2} is linearly independent, so is a basis of P. So
dim P = 2.
Major result:
Theorem 2.2. Let V be a finite-dimensional vector space. Then all bases of V have the
same number of vectors.
Proof. Based on:
Lemma 2.1. (Replacement Lemma) V a vector space. Suppose v1 , . . . , vk and x1 , . . . , xr
are vectors in V such that
• v1 , . . . , vk span V
• x1 , . . . , xr are linearly independent
Then
(1) r ≤ k and
(2) there is a subset {w1 , . . . , wk−r } of {v1 , . . . , vk } such that x1 , . . . , xr , w1 , . . . , wk−r span
V (i.e. we can replace r of the v’s by the x’s and still span V )
Eg 2.20. V = R3 .
• e1 , e2 , e3 span R3
• x1 = (1, 1, −1)
According to 2.1(2), we can replace one of the ei ’s by x1 and get a spanning set {x1 , ei , ej }.
How? Consider spanning set
x1 , e1 , e2 , e3
This set is linearly dependent since x = e1 + e2 − e3 . By 2.6, one of the vectors is therefore
a linear combination of previous ones – in this case
e3 = e1 + e2 − x1
So cast out e3 – spanning set is {x1 , e1 , e2 }.
¹ The following theorem shows the uniqueness of this number.
Proof. (of Lemma 2.1)
Consider S1 = {x1, v1, . . . , vk}. This spans V. It is linearly dependent, as x1 is a linear
combination of the spanning set v1, . . . , vk. So by 2.6, some vector in S1 is a
linear combination of previous ones. This vector is not x1, so it is some vi. By 2.7,
V = Sp(x1, v1, . . . , vk with vi deleted)
Now let S2 = {x1, x2, v1, . . . , vk with vi deleted}. This spans V and
is linearly dependent, as x2 is a linear combination of the others. By 2.6, there exists a vector
in S2 which is a linear combination of previous ones. It is not x1 or x2, as x1, x2 are linearly
independent. So it is some vj. By 2.7,
V = Sp(x1, x2, v1, . . . , vk with vi, vj deleted)
Continue like this, adding x's, deleting v's.
If r > k, then eventually we delete all the v's and get V = Sp(x1, . . . , xk). Then x(k+1) is a
linear combination of x1, . . . , xk. This can't happen as x1, . . . , x(k+1) is a linearly independent
set. Therefore r ≤ k.
This process ends when we've used up all the x's, giving
V = Sp(x1, . . . , xr, k − r remaining v's)
□
(Proof of 2.2 continued)
Let {v1 , . . . , vk } and {x1 , . . . , xr } be bases of V . Both are spanning sets for V and both
are linearly independent. Now v1 , . . . , vk span and x1 , . . . , xr is linearly independent, so by the
previous lemma, r ≤ k.
Similarly, x1 , . . . , xr span and v1 , . . . , vk is linearly independent, so by the previous lemma
again, k ≤ r.
Hence r = k. So all bases of V have the same number of vectors.
2
2.10 Further Deductions
Proposition 2.9. Let dim V = n. Any spanning set for V of size n is a basis of V .
Proof. Let {v1 , . . . , vn } be the spanning set. By 2.1, this set contains a basis of V . By 2.2,
all bases of V have size n. Therefore, {v1 , . . . , vn } is a basis of V .
2
Eg 2.21. Is (1, −2, 3), (0, 2, 5), (−1, 0, 6) a basis of R3 ?
    (  1 −2 3 )      ( 1 −2 3 )      ( 1 −2  3 )
    (  0  2 5 )  →   ( 0  2 5 )  →   ( 0  2  5 )
    ( −1  0 6 )      ( 0 −2 9 )      ( 0  0 14 )
The rows of this echelon form are linearly independent, so can’t cast out any vectors. So
they form a basis.
Proposition 2.10. If {x1 , . . . , xr } is a linearly independent set in V , then there is a basis
of V containing x1 , . . . , xr .
Proof. Let v1 , . . . , vn be a basis of V . By 2.1(2), there exists {w1 , . . . , wn−r } ⊆ {v1 , . . . , vn }
such that
V = Sp(x1 , . . . , xr , w1, . . . , wn−r )
Then x1 , . . . , xr , w1 , . . . , wn−r is a spanning set of size n, hence is a basis by 2.9.
2
Eg 2.22. Let v1 = (1, 0, −1, 2), v2 = (1, 1, 2, 5) ∈ R4 . Find a basis of R4 containing v1 , v2 .
Claim: v1 , v2 , e1 , e2 is a basis of R4 .
Proof. Clearly we can get all standard basis vectors e1 , e2 , e3 , e4 as linear combinations of
v1 , v2 , e1 , e2 . So v1 , v2 , e1 , e2 span R4 , so they are a basis by 2.9.
2
Proposition 2.11. Let W be a subspace of V . Then
(1) dim W ≤ dim V
(2) If W 6= V , then dim W < dim V
Proof.
(1) Let w1 , . . . , wr be a basis of W . This set is linearly independent. So by Proposition
2.10 there is a basis of V containing it. Say w1 , . . . , wr , v1 , . . . , vs . Then dim V =
r + s ≥ r = dim W .
(2) If dim W = dim V , then s = 0 and w1 , . . . , wr is a basis of V , so V = Sp(w1 , . . . , wr ) =
W.
2
Eg 2.23. (The subspaces of R3 ) Let W be a subspace of R3 . Then dim W ≤ dim R3 = 3.
Possibilities:
• dim W = 3 Then W = R3
• dim W = 2 Then W has a basis {v1 , v2 } so W = Sp(v1 , v2 ), which is a plane through
0, v1 , v2 .
• dim W = 1 Then W has a basis {v1 } so W = Sp(v1 ), which is a line through 0, v1 .
• dim W = 0 Then W = {0}.
Conclude: The subspaces of R3 are {0}, R3 and lines and planes containing 0.
Proposition 2.12. Let dim V = n. Any set of n vectors which is linearly independent is
a basis of V .
Proof. Let v1 , . . . , vn be linearly independent. By Proposition 2.10 there is a basis containing v1 , . . . , vn . As all bases have n vectors, v1 , . . . , vn must be a basis.
2
Eg 2.24. Is the set (1, 1, −1, 0), (2, 0, −1, 2), (0, 3, 1, −1), (2, 2, 1, 0) a basis of R4 ?
    ( 1 1 −1  0 ) v1      ( 1  1 −1  0 )      ( 1 1 −1  0 )      ( 1 1 −1   0 ) w1
    ( 2 0 −1  2 ) v2  →   ( 0 −2  1  2 )  →   ( 0 1  2  1 )  →   ( 0 1  2   1 ) w2
    ( 0 3  1 −1 ) v3      ( 0  3  1 −1 )      ( 0 0 −5 −4 )      ( 0 0 −5  −4 ) w3
    ( 2 2  1  0 ) v4      ( 0  0  3  0 )      ( 0 0  3  0 )      ( 0 0  0 −12 ) w4
The vectors w1 , w2 , w3 , w4 are linearly independent (clear as they are in echelon form). By
2.12, w1 , . . . , w4 are a basis of R4 , therefore v1 , . . . , v4 span R4 (as w’s are linear combinations of v’s), therefore v1 , . . . , v4 is a basis of R4 , by 2.9.
Proposition 2.13. Let dim V = n. Then any set of n + 1 or more vectors in V is
linearly dependent.
Proof. Let S be a set of n + 1 or more vectors. If S is linearly independent, it is contained
in a basis by Proposition 2.10, which is impossible as all bases have n vectors. So S is
linearly dependent.
2
Eg 2.25. (A fact about matrices)
Let V = M2,2 , the vector space of all 2 × 2 matrices over R (usual addition A + B and
scalar multiplication λA of matrices). Basis: let
    E11 = ( 1 0 ),  E12 = ( 0 1 ),  E21 = ( 0 0 ),  E22 = ( 0 0 )
          ( 0 0 )         ( 0 0 )         ( 1 0 )         ( 0 1 )
Claim: E11 , E12 , E21 , E22 is a basis of V = M2,2 .
Proof.
• Span:
    ( a b ) = aE11 + bE12 + cE21 + dE22
    ( c d )
• Linear independence: λ1 E11 + λ2 E12 + λ3 E21 + λ4 E22 = 0 implies
    ( λ1 λ2 ) = ( 0 0 )  ⇒  λi = 0
    ( λ3 λ4 )   ( 0 0 )
2
So dim V = 4.
Now let A ∈ V = M2,2 . Consider I, A, A2 , A3 , A4 . These are 5 vectors in V , so they are
linearly dependent by 2.13. This means there exist λi ∈ R (at least one non zero) such
that
λ4 A4 + λ3 A3 + λ2 A2 + λ1 A + λ0 I = 0
This means, if we write
p(x) = λ4 x4 + λ3 x3 + λ2 x2 + λ1 x + λ0
then p(x) ≠ 0 and p(A) = 0. So we've proved the following:
Proposition 2.14. For any 2 × 2 matrix A there exists a nonzero polynomial p(x) of
degree ≤ 4, such that p(A) = 0.
Note 2.3. This generalizes to n × n matrices.
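The dependence underlying Proposition 2.14 is easy to exhibit by machine. Below is a small sketch (assuming numpy is available; the sample matrix A and all variable names are ours): it flattens I, A, A2, A3, A4 into five vectors of R4 and extracts a null vector, whose entries are coefficients λ0 , . . . , λ4 of an annihilating polynomial.

    import numpy as np

    A = np.array([[1., 2.], [3., 4.]])                         # any 2x2 matrix
    powers = [np.linalg.matrix_power(A, k) for k in range(5)]  # I, A, ..., A^4
    M = np.column_stack([P.flatten() for P in powers])         # 4x5 matrix
    # 5 vectors in a 4-dimensional space are linearly dependent (2.13),
    # so M has a nonzero null vector: its entries are lambda_0, ..., lambda_4.
    _, _, Vt = np.linalg.svd(M)
    coeffs = Vt[-1]
    p_of_A = sum(c * P for c, P in zip(coeffs, powers))
    print(np.allclose(p_of_A, 0))                              # True: p(A) = 0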
Summary so far
V a finite-dimensional vector space (i.e. V has a finite spanning set)
• A basis of V is a linearly independent spanning set
• All bases have the same size, called dim V (Theorem 2.2)
• Every spanning set contains a basis (Theorem 2.1)
Write dim V = n.
• Any spanning set of size n is a basis (Proposition 2.9)
• Any linearly independent set of size n is a basis (Proposition 2.12)
• Any linearly independent set is contained in a basis (Proposition 2.10)
• Any set of n + 1 or more vectors is linearly dependent (Proposition 2.13)
• Any subspace W of V has dim W ≤ n, and dim W = n ⇒ W = V (Proposition 2.11)
Chapter 3
More on Subspaces
3.1 Sums and Intersections
Definition 3.1. V a vector space. Let U, W be subspaces of V . The intersection of U
and W is
U ∩ W = {v | v ∈ U and v ∈ W }
The sum of U and W
U + W = {u + w | u ∈ U and w ∈ W }
Note 3.1. U + W contains
• all the vectors u ∈ U (as u = u + 0 ∈ U + W )
• all the vectors w ∈ W
• many more vectors (usually)
Eg 3.1. V = R2 U = Sp(1, 0), W = Sp(0, 1). Then U + W contains all vectors λ1 (1, 0) +
λ2 (0, 1) = (λ1 , λ2 ). So U + W is the whole of R2 .
Proposition 3.1. U ∩ W and U + W are subspaces of V .
Proof. Use the subspace criterion, Proposition 2.3.
U +W
(1) As U, W are subspaces, both contain 0, so 0 + 0 = 0 ∈ U + W
(2) Let u1 + w1 , u2 + w2 ∈ U + W (where ui ∈ U, wi ∈ W ). Then (u1 + w1 ) + (u2 +
w2 ) = (u1 + u2 ) + (w1 + w2 ) ∈ U + W
(3) Let u + w ∈ U + W , λ ∈ F . Then
λ(u + w) = λu + λw ∈ U + W
U ∩ W – Sheet 8.
2
What about dimensions of U + W , U ∩ W ?
First:
Proposition 3.2. If U = Sp(u1 , . . . , ur ) and W = Sp(w1 , . . . , ws ), then
U + W = Sp(u1 , . . . , ur , w1 , . . . , ws )
Proof. Let u + w ∈ U + W . Then (λi , µi ∈ F )
    u = λ1 u1 + · · · + λr ur
    w = µ1 w1 + · · · + µs ws
So
    u + w = λ1 u1 + · · · + λr ur + µ1 w1 + · · · + µs ws ∈ Sp(u1 , . . . , ur , w1 , . . . , ws )
So
    U + W ⊆ Sp(u1 , . . . , ur , w1 , . . . , ws )
All the ui , wi are in U+W . As U+W is a subspace, it therefore contains Sp(u1 , . . . , ur , w1 , . . . , ws ).
Hence
U + W = Sp(u1 , . . . , ur , w1 , . . . , ws )
2
Eg 3.2. In the above example, U + W = Sp((1, 0), (0, 1)) = R2 .
Eg 3.3. Let U = {x ∈ R3 | x1 + x2 + x3 = 0}, W = {x ∈ R3 | − x1 + 2x2 + x3 = 0} subspaces of R3 . Find bases of U, W , U ∩ W , U + W .
• For U general solution is (−a − b, b, a), so basis for U is: {(−1, 0, 1), (−1, 1, 0)}.
• For W general solution is (2b + a, b, a), so basis of W is: {(1, 0, 1), (2, 1, 0)}.
• U ∩ W : this is the set of x ∈ R3 with
    (  1 1 1 ) x = 0
    ( −1 2 1 )
Solve
    (  1 1 1 | 0 )  →  ( 1 1 1 | 0 )
    ( −1 2 1 | 0 )     ( 0 3 2 | 0 )
General solution is (−a, −2a, 3a). Basis for U ∩ W is {(−1, −2, 3)}.
• U + W : By Proposition 3.2
U + W = Sp((−1, 0, 1), (−1, 1, 0), (1, 0, 1), (2, 1, 0))
Check that only 1 vector can be cast out. So U + W has dimension 3, so
U + W = R3
So
    dim U = dim W = 2,  dim(U ∩ W ) = 1,  dim(U + W ) = 3
Theorem 3.1. Let V be a finite-dimensional space and let U, W be subspaces of V . Then
dim (U + W ) = dim U + dim W − dim (U ∩ W )
Proof. Let
dim U = m
dim W = n
dim (U ∩ W ) = r
Aim: to prove dim (U + W ) = m + n − r. Start with basis {x1 , . . . , xr } of U ∩ W . By 2.10,
can extend this to bases
Bu = {x1 , . . . , xr , u1 , . . . , um−r } basis of U
Bw = {x1 , . . . , xr , w1 , . . . , wn−r } basis of W
Let
B = Bu ∪ Bw = {x1 , . . . , xr , u1 , . . . , um−r , w1 , . . . , wn−r }
Claim B is a basis of U + W .
Proof.
(1) Span: B spans U + W by Proposition 3.2.
(2) Linear independence: We show that B is linearly independent. Suppose
    λ1 x1 + · · · + λr xr + α1 u1 + · · · + αm−r um−r + β1 w1 + · · · + βn−r wn−r = 0    (3.1)
Let
    v = β1 w1 + · · · + βn−r wn−r
Then v ∈ W . Also, by (3.1),
    v = −λ1 x1 − · · · − λr xr − α1 u1 − · · · − αm−r um−r ∈ U
So v is in U ∩ W . As x1 , . . . , xr is a basis of U ∩ W ,
    v = γ1 x1 + · · · + γr xr    (γi ∈ F )
As v = β1 w1 + · · · + βn−r wn−r , this gives
    −γ1 x1 − · · · − γr xr + β1 w1 + · · · + βn−r wn−r = 0
Since Bw = {x1 , . . . , xr , w1 , . . . , wn−r } is linearly independent, this forces (for all i)
γi = 0
βi = 0
(i.e. v = 0). Then by (3.1)
    λ1 x1 + · · · + λr xr + α1 u1 + · · · + αm−r um−r = 0
Since Bu = {x1 , . . . , xr , u1, . . . , um−r } is linearly independent, this forces (for all i)
λi = 0
αi = 0
So we've shown that in (3.1), all coefficients λi , αi , βi are zero, showing that B =
Bu ∪ Bw is linearly independent. Hence B is a basis of U + W .
2
So we have proved that
    dim (U + W ) = r + (m − r) + (n − r) = m + n − r
2
Eg 3.4. V = R4 . Suppose U, W are subspaces with dim U = 2, dim W = 3. Then
dim(U + W ) ≥ 3 (as U + W contains W ) and dim(U + W ) ≤ 4 (as U + W ⊆ R4 ). Possibilities:
• dim(U + W ) = 3. Then U + W = W and so U ⊆ W .
• dim(U + W ) = 4 (in other words U + W = R4 ). Then
    dim(U ∩ W ) = dim U + dim W − dim(U + W ) = 1
For example this happens for:
    U = Sp(e1 , e2 ) = {(x1 , x2 , 0, 0) | xi ∈ R}
    W = Sp(e1 , e3 , e4 ) = {(x1 , 0, x3 , x4 ) | xi ∈ R}
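On this example the dimension formula can be sanity-checked numerically; a sketch assuming numpy, whose matrix_rank gives the dimension of a row space:

    import numpy as np

    # rows span U = Sp(e1, e2) and W = Sp(e1, e3, e4) in R^4
    U = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], dtype=float)
    W = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]], dtype=float)
    dim_U = np.linalg.matrix_rank(U)                    # 2
    dim_W = np.linalg.matrix_rank(W)                    # 3
    dim_sum = np.linalg.matrix_rank(np.vstack([U, W]))  # dim(U + W) = 4, by Prop 3.2
    print(dim_U + dim_W - dim_sum)                      # 1 = dim(U ∩ W), by Theorem 3.1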
3.2 The rank of a matrix
Definition 3.2. Let A be an m × n matrix with real entries. Define
    row-space(A) = subspace of Rn spanned by the rows of A
    column-space(A) = subspace of Rm spanned by the columns of A
Eg 3.5.
    A = ( 3  1 2 )
        ( 0 −1 1 )
    row-space(A) = Sp((3, 1, 2), (0, −1, 1))
    column-space(A) = Sp( (3, 0)T , (1, −1)T , (2, 1)T )
Definition 3.3. Let A be an m × n matrix. Define
    row-rank(A) = dim row-space(A)
    column-rank(A) = dim column-space(A)
Eg 3.6. In the above example,
    row-rank(A) = column-rank(A) = 2
3.2.1 How to find row-rank(A)
Procedure:
(1) Reduce A to echelon form by row operations, say
    A′ = ( 0 . . . 1 . . . . . . . . . )
         ( 0 . . . 0 . . . 1 . . . . . )
         (             . . .           )
         ( 0 . . . 0 . . . 0 . . . 1 . )
Then (we will prove this)
    row-space(A) = row-space(A′ )
(2) Then row-rank(A) = number of nonzero rows in echelon form A′ and these nonzero
rows are a basis for row-space(A).
Proof.
(1) Rows of A′ are linear combinations of the rows of A (since they are obtained by row
operations ri → ri + λrj , etc.) Therefore
row-space(A′ ) ⊆ Sp( rows of A)
= row-space(A)
By reversing the row operations to go from A′ to A, we see that rows of A are linear
combinations of rows of A′ , so
row-space(A) ⊆ row-space(A′ )
Therefore row-space(A) = row-space(A′ )
(2) Let the nonzero rows of A′ be v1 , . . . , vr :
    A′ = ( 0 . . . 1 . . . . . . )  v1
         ( 0 . . . 0 . . . 1 . . )  v2
         (          . . .        )  . . .
         ( 0 . . . 0 . . . 0 1 . )  vr
Then
    row-space(A′ ) = Sp(v1 , . . . , vr )
Also v1 , . . . , vr are linearly independent: if the leading 1 of vj is in column ij , then
λ1 v1 + · · · + λr vr = 0 implies
    λ1 = 0 (since λ1 is the i1 coordinate of the LHS)
    λ2 = 0 (since, once λ1 = 0, λ2 is the i2 coordinate of the LHS)
    and so on, λj = 0 for all j.
Therefore v1 , . . . , vr is a basis for row-space(A′ ), hence for row-space(A). So
row-rank(A) = r = no. of nonzero rows of A′
2
Eg 3.7. Find the row-rank of
    A = (  1 2  5 )
        (  2 1  0 )
        ( −1 4 15 )
Reduce to echelon form:
    A → ( 1  2   5 )  →  ( 1  2   5 )
        ( 0 −3 −10 )     ( 0 −3 −10 )
        ( 0  6  20 )     ( 0  0   0 )
Row-rank is 2.
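For comparison, the same row-rank can be computed with a computer algebra system; a sketch assuming sympy (rref returns the reduced echelon form together with its pivot columns):

    from sympy import Matrix

    A = Matrix([[1, 2, 5], [2, 1, 0], [-1, 4, 15]])
    R, pivots = A.rref()     # reduced row echelon form, pivot column indices
    print(R)                 # its nonzero rows are a basis for row-space(A)
    print(len(pivots))       # row-rank(A) = 2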
Eg 3.8. Find the dimension of
W = Sp((−1, 1, 0, 1), (2, 3, 1, 0), (0, 1, 2, 3)) ⊆ R4
Observe
    W = row-space(A),  where A = ( −1 1 0 1 )
                                 (  2 3 1 0 )
                                 (  0 1 2 3 )
So dim W = row-rank(A).
    A → ( −1 1 0 1 )  →  ( −1 1 0  1 )
        (  0 5 1 2 )     (  0 5 1  2 )
        (  0 1 2 3 )     (  0 0 9 13 )
So dim W = row-rank(A) = 3.
3.2.2 How to find column-rank(A)?
Clearly
column-rank(A) = row-rank(AT )
Eg 3.9. Find the column-rank of
    A = (  1 2  5 )
        (  2 1  0 )
        ( −1 4 15 )
Then
    AT = ( 1 2 −1 )  →  ( 1 2 −1 )
         ( 2 1  4 )     ( 0 3 −6 )
         ( 5 0 15 )     ( 0 0  0 )
So column-rank(A) = 2.
Theorem 3.2. For any matrix A,
row-rank(A) = column-rank(A)
Proof. Let
    A = ( a11 · · · a1n )  v1
        (     . . .     )  . . .
        ( am1 · · · amn )  vm
So vi = (ai1 , . . . , ain ).
Let
k = row-rank(A)
= dim Sp(v1 , . . . , vm )
Let w1 , . . . , wk be a basis for row-space(A). Say
w1 = (b11 , . . . , b1n )
..
.
wk = (bk1 , . . . , bkn )
Each vi ∈ Sp(w1 , . . . , wk ), so (λij ∈ F )
v1 = λ11 w1 + · · · + λ1k wk
..
.
vm = λm1 w1 + · · · + λmk wk
Equating coordinates:
ith coord of v1 : a1i = λ11 b1i + · · · + λ1k bki
..
.
ith coord of vm : ami = λm1 b1i + · · · + λmk bki
This says
    ith column of A = (a1i , . . . , ami )T = b1i (λ11 , . . . , λm1 )T + · · · + bki (λ1k , . . . , λmk )T
Hence each column of A is a linear combination of the k column vectors
    l1 = (λ11 , . . . , λm1 )T , . . . , lk = (λ1k , . . . , λmk )T
So column-space(A) is spanned by these k vectors. So
column-rank(A) = dim column-space(A) ≤ k = row-rank(A)
So we’ve shown that
column-rank(A) ≤ row-rank(A)
Applying the same to AT :
column-rank(AT ) ≤ row-rank(AT )
i.e.
row-rank(A) ≤ column-rank(A)
Hence row-rank(A) = column-rank(A).
2
Eg 3.10. (Illustrating the proof.) Let
    A = (  1 2 −1 0 )  v1
        ( −1 1  0 1 )  v2
        (  0 3 −1 1 )  v3
As v3 = v1 + v2 , basis of row-space is w1 , w2 where
w1 = v1
w2 = v2
Write each vi as a linear combination of w1 , w2
v1 = w1 = 1w1 + 0w2
v2 = w2 = 0w1 + 1w2
v3 = w1 + w2 = 1w1 + 1w2
So the column vectors l1 , l2 are
    l1 = ( 1, 0, 1 )T ,  l2 = ( 0, 1, 1 )T
These span column-space(A). Check:
    (  1, −1,  0 )T = l1 − l2
    (  2,  1,  3 )T = 2l1 + l2
    ( −1,  0, −1 )T = −l1
Definition 3.4. The rank of a matrix is its row-rank (equivalently, by Theorem 3.2, its
column-rank), written rank(A) or rk(A).
Proposition 3.3. Let A be n × n. Then the following four statements are equivalent:
(1) rank(A) = n
(2) rows of A are a basis of Rn
(3) columns of A are a basis of Rn
(4) A is invertible
Proof.
• (1) ⇔ (2):
    rank(A) = n ⇔ dim row-space(A) = n ⇔ the n rows of A span Rn
             ⇔ the n rows are a basis of Rn (2.9)
• (1) ⇔ (3): Similarly.
• (1) ⇔ (4):
    rank(A) = n ⇔ the echelon form of A has n nonzero rows ⇔ A can be reduced to In
             ⇔ A is invertible (M1GLA, 7.5)
2
Chapter 4
Linear Transformations
Linear transformations are functions from one vector space to another which “preserve”
addition and scalar multiplication, i.e. for a linear transformation T , if
    v1 ↦ w1 = T(v1 ),   v2 ↦ w2 = T(v2 )
then
    v1 + v2 ↦ w1 + w2 = T(v1 ) + T(v2 )
and
    λv1 ↦ λw1 = λT(v1 )
Definition 4.1. Let V , W be vector spaces. A function T : V → W is a linear transformation if
1) T (v1 + v2 ) = T (v1 ) + T (v2 ) for all v1 , v2 ∈ V
2) T (λv) = λT (v) for all v ∈ V , λ ∈ F
Eg 4.1.
(1) Define T : R1 → R1 by
T (x) = sin x
Then T is not a linear transformation: e.g.
So 2T
π
2
6= T (π)
T (π) = sin π = 0
π π
= 2 sin = 2
2T
2
2
(2) T : R2 → R1 ,
T (x1 , x2 ) = x1 + x2
T is a linear transformation:
1)
T ((x1 , x2 ) + (y1 , y2)) = T (x1 + y1 , x2 + y2 )
= x1 + x2 + y1 + y2
= T (x1 , x2 ) + T (y1 , y2 )
2)
    T(λ(x1 , x2 )) = T(λx1 , λx2 )
                  = λx1 + λx2
                  = λ(x1 + x2 )
                  = λT(x1 , x2 )
(3) T : R2 → R2 ,
T (x1 , x2 ) = x1 + x2 + 1
T is not linear: e.g.
T (2(1, 0)) = T (2, 0) = 3
2T (1, 0) = 4
(4) V vector space of polynomials. Define T : V → V by
T (p(x)) = p′ (x)
e.g.
T (x3 − 3x) = 3x2 − 3
Then T is a linear transformation:
1)
T (p(x) + q(x)) = p′ (x) + q ′ (x)
= T (p(x)) + T (q(x))
2)
T (λp(x)) = λp′ (x) = λT (p(x))
Basic examples:
Proposition 4.1. Let A be an m × n matrix over R. Define T : Rn → Rm by (for all
x ∈ Rn , column vectors)
T (x) = Ax
Then T is a linear transformation.
Proof.
1)
T (v1 + v2 ) = A(v1 + v2 )
= Av1 + Av2
= T (v1 ) + T (v2 )
2)
T (λv) = A(λv)
= λAv
= λT (v)
2
Eg 4.2.
1. Define T : R3 → R2 by
    T( x1 , x2 , x3 ) = ( x1 − 3x2 + x3 , x1 + x2 − 2x3 )
Then
    T(x) = ( 1 −3  1 ) x
           ( 1  1 −2 )
So T is a linear transformation.
2. Rotation ρθ : R2 → R2 is
    ρθ (x) = (  cos θ  sin θ ) x
             ( −sin θ  cos θ )
so is a linear transformation.
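The two conditions of Definition 4.1 are easy to spot-check for a matrix map; a minimal sketch assuming numpy, using the matrix from part 1 above and randomly chosen vectors:

    import numpy as np

    A = np.array([[1., -3., 1.], [1., 1., -2.]])
    T = lambda x: A @ x

    rng = np.random.default_rng(0)
    v1, v2 = rng.standard_normal(3), rng.standard_normal(3)
    lam = 2.5
    print(np.allclose(T(v1 + v2), T(v1) + T(v2)))  # condition 1): additivity
    print(np.allclose(T(lam * v1), lam * T(v1)))   # condition 2): homogeneity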
4.1 Basic properties
Proposition 4.2. Let T : V → W be a linear transformation
(i) T (0V ) = 0W
(ii) T (λ1 v1 + · · · + λk vk ) = λ1 T (v1 ) + · · · + λk T (vk )
Proof.
(i)
T (0V ) = T (0v)
= 0T (v)
= 0W
(ii)
T (λ1 v1 + · · · + λk vk ) = T (λ1 v1 + · · · + λk−1 vk−1 ) + T (λk vk )
= T (λ1 v1 + · · · + λk−1 vk−1 ) + λk T (vk )
Repeat to get (ii).
2
4.2 Constructing linear transformations
Eg 4.3. Find a linear transformation T : R2 → R3 which sends
    e1 = ( 1, 0 ) ↦ w1 = ( 1, −1, 2 )T
    e2 = ( 0, 1 ) ↦ w2 = ( 0, 1, 3 )T
We are forced to define
    T( x1 , x2 ) = T( x1 e1 + x2 e2 ) = x1 T(e1 ) + x2 T(e2 ) = x1 w1 + x2 w2
So the only possible choice for T is
    T( x1 , x2 ) = ( x1 , −x1 + x2 , 2x1 + 3x2 )T
This is a linear transformation, as it is
    T(x) = (  1 0 )
           ( −1 1 ) x
           (  2 3 )
And it does send e1 ↦ w1 , e2 ↦ w2 .
In general:
Proposition 4.3. Let V, W be vector spaces, and let v1 , . . . , vn be a basis of V . For any
n vectors w1 , . . . , wn in W there is a unique linear transformation T : V → W such that
T (v1 ) = w1
..
.
T (vn ) = wn
Proof. Let v ∈ V . Write
v = λ1 v1 + · · · + λn vn
By 4.2(ii), the only possible choice for T (v) is
T (v) = T (λ1 v1 + · · · + λn vn )
= λ1 T (v1 ) + · · · + λn T (vn )
= λ1 w1 + · · · + λn wn
So this is our definition of T : V → W – if v = λ1 v1 + · · · + λn vn , then
T (v) = λ1 w1 + · · · + λn wn
We show this function T is a linear transformation:
1) Let v = λ1 v1 + · · · + λn vn , w = µ1 v1 + · · · + µn vn . Then
v + w = (λ1 + µ1 )v1 + · · · + (λn + µn )vn , so
    T(v + w) = (λ1 + µ1 )w1 + · · · + (λn + µn )wn
             = (λ1 w1 + · · · + λn wn ) + (µ1 w1 + · · · + µn wn )
             = T(v) + T(w)
2) Let v = λ1 v1 + · · · + λn vn , λ ∈ F . Then
    T(λv) = T( λλ1 v1 + · · · + λλn vn )
          = λλ1 w1 + · · · + λλn wn
          = λ(λ1 w1 + · · · + λn wn )
          = λT(v)
So T is a linear transformation sending vi ↦ wi for all i, and it is unique.
2
Remark 4.1. This shows that once we know what a linear transformation does to the
vectors in a basis, we know what it does to all vectors.
Eg 4.4. V = the vector space of polynomials over R of degree ≤ 2. Basis of V : 1, x, x2 . Pick
3 vectors in V : w1 = 1 + x, w2 = x − x2 , w3 = 1 + x2 . By 4.3 there exists a unique linear
transformation T : V → V sending
    1 ↦ w1 ,  x ↦ w2 ,  x2 ↦ w3
Then, by 4.2,
    T(a + bx + cx2 ) = aT(1) + bT(x) + cT(x2 )
                     = a(1 + x) + b(x − x2 ) + c(1 + x2 )
                     = a + c + (a + b)x + (c − b)x2
4.3 Kernel and Image
Definition 4.2. T : V → W a linear transformation. Define the image Im(T ) to be
    Im(T ) = {T (v) | v ∈ V } ⊆ W
The kernel Ker (T ) is
    Ker (T ) = {v ∈ V | T (v) = 0} ⊆ V
Eg 4.5. T : R3 → R2 ,
    T( x1 , x2 , x3 ) = ( 3x1 + x2 + 2x3 , −x1 + x3 ) = (  3 1 2 ) x = Ax
                                                       ( −1 0 1 )
Then
    Ker (T ) = { x ∈ R3 | T (x) = 0 } = { x ∈ R3 | Ax = 0 } = solution space of Ax = 0
and
    Im (T ) = set of all vectors ( 3x1 + x2 + 2x3 , −x1 + x3 )T
            = set of all vectors x1 ( 3, −1 )T + x2 ( 1, 0 )T + x3 ( 2, 1 )T
            = column-space of A
Proposition 4.4. T : V → W linear transformation. Then
i) Ker (T ) is a subspace of V
ii) Im (T ) is a subspace of W
Proof.
i) Use 2.3:
1) 0 ∈ Ker (T ) since T (0V ) = 0W (by 4.2)
2) Let v, w ∈ Ker (T ). Then T (v) = T (w) = 0, so
T (v + w) = T (v) + T (w)
= 0+0=0
So v + w ∈ Ker (T ).
3) Let v ∈ Ker (T ), λ ∈ F . Then
T (λv) = λT (v)
= λ0 = 0
So λv ∈ Ker (T )
ii)
1) 0 ∈ Im (T ) as 0 = T (0)
2) w1 , w2 ∈ Im (T ) so w1 = T (v1 ), w2 = T (v2 )
w1 + w2 = T (v1 ) + T (v2 )
= T (v1 + v2 )
so w1 + w2 ∈ Im (T )
3) w ∈ Im (T ), λ ∈ F . Then
w = T (v)
λw = λT (v)
= T (λv)
so λw ∈ Im (T ).
2
Eg 4.6. Let Vn = the vector space of polynomials of degree ≤ n. Define T : Vn → Vn−1 by
T (p(x)) = p′ (x)
Then T is a linear transformation.
Ker (T ) = {p(x) | T (p(x)) = 0}
= {p(x) | p′ (x) = 0}
= V0 (the constant polynomials)
and Im (T ) = Vn−1 . This has basis 1, x, x2 , . . . , xn−1 , so dimension n.
Proposition 4.5. Let T : V → W be a linear transformation. If v1 , . . . , vn is a basis of
V , then
Im (T ) = Sp(T (v1 ), . . . , T (vn ))
Proof. Let T (v) ∈ Im (T ). Write
v = λ1 v1 + · · · + λn vn
Then, by 4.2,
    T (v) = λ1 T (v1 ) + · · · + λn T (vn ) ∈ Sp(T (v1 ), . . . , T (vn ))
This shows
Im (T ) ⊆ Sp(T (v1 ), . . . , T (vn ))
All T (vi ) ∈ Im (T ), so as Im (T ) is a subspace, Sp(T (v1 ), . . . , T (vn )) ⊆ Im (T ). Therefore
Sp(T (v1 ), . . . , T (vn )) = Im (T ).
2
Important class of kernels and images:
Proposition 4.6. Let A be an m × n matrix, and define T : Rn → Rm by (x ∈ Rn )
T (x) = Ax
Then
1) Ker (T ) = solution space of the system Ax = 0.
2) Im T = column-space(A)
3) dim Im (T ) = rank(A)
Proof.
1)
Ker (T ) = {x | T (x) = 0}
= {x ∈ Rn | Ax = 0}
= solution space of Ax = 0
2) Take a standard basis e1 , . . . , en of Rn . By 4.5
Im (T ) = Sp(T (e1 ), . . . , T (en ))
Here
    T (ei ) = Aei = A ( 0, . . . , 1, . . . , 0 )T = i-th column of A
(the 1 in position i).
So Im (T ) = Sp(columns of A) = column-space(A).
3) dim (Im (T )) = dim (column-space(A)) = rank(A)
2
Eg 4.7. T : R3 → R3 ,
    T (x) = (  1 2 3 )
            ( −1 0 1 ) x
            (  1 4 7 )
Find bases for Ker (T ) and Im (T ).
• Ker (T ):
    (  1 2 3 | 0 )      ( 1 2 3 | 0 )      ( 1 2 3 | 0 )
    ( −1 0 1 | 0 )  →   ( 0 2 4 | 0 )  →   ( 0 2 4 | 0 )
    (  1 4 7 | 0 )      ( 0 2 4 | 0 )      ( 0 0 0 | 0 )
General solution (a, −2a, a). Basis for Ker (T ) is (1, −2, 1).
• Im (T ): Im (T ) = column-space(A). Its dimension is rank(A) = 2, so a basis is
    ( 1, −1, 1 )T ,  ( 2, 0, 4 )T
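By 4.6, both bases can also be read off by machine; a sketch assuming sympy (nullspace and columnspace return bases of the kernel and the column space):

    from sympy import Matrix

    A = Matrix([[1, 2, 3], [-1, 0, 1], [1, 4, 7]])
    print(A.nullspace())     # basis of Ker(T): [Matrix([1, -2, 1])]
    print(A.columnspace())   # basis of Im(T): the first two columns of A
    print(A.rank())          # 2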
Theorem 4.1 (Rank-nullity Theorem). Let V, W be vector spaces, and T : V → W be a
linear transformation. Then
    dim (Ker (T )) + dim (Im (T )) = dim (V )
(dim (Ker (T )) is sometimes called the nullity of T .)
Proof. Let r = dim (Ker (T )). Let u1 , . . . , ur be a basis of Ker (T ). By 2.10 we can extend
this to
u1 , . . . , ur , v1 , . . . , vs
basis of V . So dim V = r + s.
We want to show that dim (Im (T )) = s. By 4.5
Im (T ) = Sp(T (u1 ), . . . , T (ur ), T (v1 ), . . . , T (vs ))
Each T (ui ) = 0, as ui ∈ Ker (T ). So
Im T = Sp(T (v1 ), . . . , T (vs ))
(∗)
Claim: T (v1 ), . . . , T (vs ) is a basis of Im T .
Proof. Span shown by (∗). Suppose
λ1 T (v1 ) + · · · + λs T (vs ) = 0
Then, by 4.2,
    T (λ1 v1 + · · · + λs vs ) = 0
So λ1 v1 + · · · + λs vs ∈ Ker T . So
λ1 v1 + · · · + λs vs = µ1 u1 + · · · + µr ur
(as u1 , . . . , ur are basis of Ker T ). That is
µ1 u1 + · · · + µr ur − λ1 v1 − · · · − λs vs = 0
As u1 , . . . , ur , v1 , . . . , vs is a basis of V , it is linearly independent, and so
µi = λi = 0 for all i
This shows T (v1 ), . . . , T (vs ) is linearly independent, hence a basis of Im T .
2
So dim (Im T ) = s and
dim (Ker T ) + dim (Im T ) = r + s = dim V.
2
Consequences for linear equations
Proposition 4.7. Let A be an m × n matrix, and
W = solution space of Ax = 0
= {x ∈ Rn | Ax = 0}
Then
dim W = n − rank(A)
Proof. Define a linear transformation T : Rn → Rm by
    T (x) = Ax    (x ∈ Rn )
Then, by 4.6, Ker (T ) = W and Im (T ) = column-space(A). By 4.1,
dim Rn = n = dim (Ker T ) + dim Im T
= dim W + dim column-space(A)
So
n = dim W + rank(A)
2


Eg 4.8. Let
    A = ( 1 1 3 2 5 )
        ( 0 1 3 2 1 )
        ( 0 0 0 1 2 )
Let W be the solution space of the system Ax = 0, so W ⊆ R5 . Then, by 4.7,
    dim W = 5 − rank(A) = 5 − 3 = 2.
General solution is (−4a, 3a−3b, b, −2a, a). This has two free variables and is a(−4, 3, 0, −2, 1)+
b(0, −3, 1, 0, 0) so (−4, 3, 0, −2, 1), (0, −3, 1, 0, 0) is a basis of W .
In general, number of free variables in the general solution of Ax = 0 is the dimension of
the solution space, i.e. n − rank(A).
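The free-variable count agrees with 4.7 by machine as well; a quick sketch assuming sympy:

    from sympy import Matrix

    A = Matrix([[1, 1, 3, 2, 5],
                [0, 1, 3, 2, 1],
                [0, 0, 0, 1, 2]])
    print(A.rank())            # 3 = rank(A)
    print(len(A.nullspace()))  # 2 = dim W = n - rank(A), as in Proposition 4.7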
4.4 Composition of linear transformations
Definition 4.3. Let
T : V → W,
S:W →X
be linear transformations (V, W, X vector spaces). The composition S ◦ T : V → X is
defined by
    S ◦ T (v) = S(T (v))
Usually just write ST .
Then ST is again a linear transformation:
    ST (v1 + v2 ) = S(T (v1 + v2 ))
                  = S(T (v1 ) + T (v2 ))    (T is linear)
                  = ST (v1 ) + ST (v2 )     (S is linear)
Eg 4.9.
1. Let
    T : Rn → Rm ,   S : Rm → Rp
be
    T (x) = Ax (x ∈ Rn ),   S(x) = Bx (x ∈ Rm )
So
    ST (x) = S(T (x)) = S(Ax) = B(Ax) = BAx
2. Let T : V → V . Define
    T2 = T ◦ T : V → V ,   T2 (v) = T (T (v))
If T : Rn → Rn , T (x) = Ax, then
    T2 (x) = A2 x,  T3 (x) = A3 x,  . . . ,  Tn (x) = An x
4.5 The matrix of a linear transformation
A linear transformation is a type of function between two vector spaces. We’ll show how
to associate a matrix with any linear transformation. This will enable us to use matrix
theory to study linear transformations.
Let T : V → V be a linear transformation (V a vector space). Let B = {v1 , . . . , vn } be a
basis of V (finite dimensional). Each T (vi ) ∈ V , so is a linear combination of vi ’s
T (v1 ) = a11 v1 + a21 v2 + · · · + an1 vn
T (v2 ) = a12 v1 + a22 v2 + · · · + an2 vn
..
.
T (vn ) = a1n v1 + a2n v2 + · · · + ann vn
Definition 4.4. The matrix of T (with respect to the basis B) is
    [T]B = ( a11 a12 . . . a1n )
           ( a21 a22 . . . a2n )
           (       . . .       )
           ( an1 an2 . . . ann )
Eg 4.10. T : R2 → R2 ,
    T( x1 , x2 ) = ( 2x1 − x2 , x1 + 2x2 ) = ( 2 −1 ) ( x1 )
                                             ( 1  2 ) ( x2 )
Let B = {e1 , e2 }. Work out [T]B :
    T(e1 ) = ( 2, 1 )T = 2e1 + e2
    T(e2 ) = ( −1, 2 )T = −e1 + 2e2
So
    [T]B = ( 2 −1 )
           ( 1  2 )
With another basis
    B′ = {(1, 1), (0, 1)} = {v1 , v2 }
What is [T]B′ ?
    T(v1 ) = T( 1, 1 ) = ( 2 − 1, 1 + 2 ) = ( 1, 3 ) = 1v1 + 2v2
    T(v2 ) = T( 0, 1 ) = ( −1, 2 ) = −v1 + 3v2
Hence
    [T]B′ = ( 1 −1 )
            ( 2  3 )
Eg 4.11. V the vector space of polynomials of degree ≤ 2. Define T : V → V by T (p) = p′ .
Basis
    B = {1, x, x2 }
Then T (1) = 0, T (x) = 1, T (x2 ) = 2x. Hence
    [T]B = ( 0 1 0 )
           ( 0 0 2 )
           ( 0 0 0 )
Observe, for p(x) = a + bx + cx2 ,
    [T (p(x))]B = [T]B [p(x)]B = ( 0 1 0 ) ( a )   ( b  )
                                 ( 0 0 2 ) ( b ) = ( 2c )
                                 ( 0 0 0 ) ( c )   ( 0  )
Definition 4.5. Let V be a vector space over F , B = {v1 , . . . , vn } a basis of V . Let v ∈ V ,
v = λ1 v1 + · · · + λn vn , λi ∈ F . Define the coordinate vector of v, [v]B ∈ F n , to be
    [v]B = ( λ1 , . . . , λn )T
Proposition 4.8. Let V be a vector space, B a basis of V , T : V → V a linear transformation,
and v a vector in V . Then
    [T (v)]B = [T]B [v]B
Proof. Let v = λ1 v1 + · · · + λn vn . So
    [v]B = ( λ1 , . . . , λn )T
Let [T]B = (aij ), so
    T (vi ) = a1i v1 + a2i v2 + · · · + ani vn
So
    T (v) = λ1 T (v1 ) + · · · + λn T (vn )
          = λ1 (a11 v1 + · · · + an1 vn ) + · · · + λn (a1n v1 + · · · + ann vn )
          = (a11 λ1 + · · · + a1n λn ) v1 + · · · + (an1 λ1 + · · · + ann λn ) vn
So
    [T (v)]B = ( a11 λ1 + · · · + a1n λn , . . . , an1 λ1 + · · · + ann λn )T = [T]B [v]B
2
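Proposition 4.8 can be checked on the differentiation example Eg 4.11; a sketch assuming sympy, with a sample polynomial of our own choosing:

    from sympy import Matrix, symbols, diff

    x = symbols('x')
    T_B = Matrix([[0, 1, 0], [0, 0, 2], [0, 0, 0]])  # [T]_B for T(p) = p', B = {1, x, x^2}

    a, b, c = 3, -1, 4                                # p(x) = 3 - x + 4x^2 (sample)
    print(diff(a + b*x + c*x**2, x))                  # 8*x - 1
    print(T_B * Matrix([a, b, c]))                    # (b, 2c, 0) = (-1, 8, 0) = [p']_B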
4.6 Eigenvalues and eigenvectors
Definition 4.6. Let T : V → V be a linear transformation. Say v ∈ V is an eigenvector
of T if
(1) v ≠ 0
(2) T (v) = λv for some λ ∈ F
Call λ an eigenvalue of T .
Eg 4.12.
1. V a vector space of polys of degree ≤ 2. Define T : V → V by
T (p(x)) = p(x + 1) − p(x)
T is a linear transformation. The eigenvectors of T are non-zero polynomials p(x)
such that
T (p(x)) = λp(x)
p(x + 1) − p(x) = λp(x)
p(x + 1) = (λ + 1)p(x).
2. Let T : Rn → Rn be defined by
    T (v) = Av
where A is an n × n matrix. Then T (v) = λv iff Av = λv; the eigenvalues and eigenvectors
of T are the same as those of the matrix A.
4.6.1 How to find evals / evecs of T ?
Let v be an eigenvector of T , so
T (v) = λv
Let B be a basis of V . Then
[T (v)]B = [λv]B = λ [v]B
By 4.8,
    [T]B [v]B = λ [v]B
So the column vector [v]B is an eigenvector of the matrix [T]B , and λ is an eigenvalue of this
matrix. Hence:
Proposition 4.9. Let T : V → V , B a basis of vector space V .
1) The eigenvalues of T are the eigenvalues of the matrix [T ]B
2) The eigenvectors of T are the vectors v such that [v]B is an eigenvector of the matrix
[T ]B .
Eg 4.13. V = polynomials of degree ≤ 2,
    T (p(x)) = p(x + 1) − p(x)
Find the eigenvectors and eigenvalues of T .
Let B = {1, x, x2 }. Computing as in the previous examples (T (1) = 0, T (x) = 1,
T (x2 ) = 2x + 1),
    [T]B = ( 0 1 1 )
           ( 0 0 2 ) = A
           ( 0 0 0 )
Characteristic polynomial:
    |A − λI| = | −λ  1  1 |
               |  0 −λ  2 | = −λ3
               |  0  0 −λ |
So the only eigenvalue is 0. The eigenvectors of A are the solutions of (A − λI)x = 0; for
λ = 0,
    ( 0 1 1 )
    ( 0 0 2 ) x = 0
    ( 0 0 0 )
gives x = ( a, 0, 0 )T . So the eigenvectors of T are the polynomials a, i.e. the constant
polynomials.
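The same eigen-computation by machine, as a sketch assuming sympy:

    from sympy import Matrix

    A = Matrix([[0, 1, 1], [0, 0, 2], [0, 0, 0]])   # [T]_B from Eg 4.13
    print(A.eigenvals())    # {0: 3}: the only eigenvalue is 0
    print(A.eigenvects())   # eigenvectors are multiples of (1, 0, 0),
                            # i.e. the constant polynomials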
4.7 Diagonalisation
Let T : V → V be a linear transformation. Suppose B = {v1 , . . . , vn } is a basis of V such
that the matrix [T ]B is diagonal. So
    [T]B = ( λ1 · · ·  0 )
           (    . . .    )
           (  0 · · · λn )
This means
    T (v1 ) = λ1 v1
    T (v2 ) = λ2 v2
    . . .
    T (vn ) = λn vn
Proposition 4.10. The matrix [T ]B is diagonal iff B is a basis consisting of eigenvectors
of T .
Definition 4.7. Linear transformation T : V → V is diagonalisable if there exists a basis
B of V consisting of eigenvectors of T .
Eg 4.14. V = polynomials of degree ≤ 2.
1. T1 (p(x)) = p(x + 1) − p(x). We've shown that the only eigenvectors are the constant
polynomials. So there exists no basis of eigenvectors; T1 is not diagonalisable.
2. T2 (p(x)) = p′ (x). Then for B = {1, x, x2 },
    [T2 ]B = ( 0 1 0 )
             ( 0 0 2 )
             ( 0 0 0 )
We find, as for T1 , that the only eigenvectors of T2 are the constant polynomials; T2 is
not diagonalisable.
3. T3 (p(x)) = p(1 − x). Here
    [T3 ]B = ( 1  1  1 )
             ( 0 −1 −2 )
             ( 0  0  1 )
Is T3 diagonalisable?
Eg 4.15. T : R2 → R2 ,
    T( x1 , x2 ) = ( x2 , −2x1 + 3x2 ) = (  0 1 ) ( x1 )
                                         ( −2 3 ) ( x2 )
Is T diagonalisable?
For B = {e1 , e2 },
    [T]B = (  0 1 ) = A
           ( −2 3 )
We find that the eigenvalues of A are 1, 2, with eigenvectors ( a, a )T and ( b, 2b )T
respectively. So there is a basis of eigenvectors of A:
    B′ = { ( 1, 1 ), ( 1, 2 ) }
So
    [T]B′ = ( 1 0 )
            ( 0 2 )
From M1GLA, recall that if (columns are the vectors in B′ )
    P = ( 1 1 )
        ( 1 2 )
then
    P −1 AP = ( 1 0 )
              ( 0 2 )
i.e.
    P −1 [T]B P = [T]B′
General theory:
4.8 Change of basis
V a vector space, B = {v1 , . . . , vn } a basis of V . T a linear transformation T : V → V .
Matrix [T]B = (aij ) (an n × n matrix) where
T (vi ) = a1i v1 + · · · + ani vn
Question: If B′ = {w1 , . . . , wn } is another basis of V , we get another matrix [T]B′ . What is
the relation between [T]B and [T]B′ ?
To answer this, write
    w1 = p11 v1 + · · · + pn1 vn
    . . .
    wn = p1n v1 + · · · + pnn vn
where pij ∈ F . The matrix
    P = ( p11 · · · p1n )
        (     . . .     )
        ( pn1 · · · pnn )
is called the change of basis matrix from B to B ′ .
Proposition 4.11. P is invertible, and
[T ]B′ = P −1 [T ]B P
Proof. Omitted (in favour of further material). It can be looked up, e.g. in Lipschutz,
Linear Algebra.
2
Eg 4.16. Let T : R2 → R2 ,
    T( x1 , x2 ) = ( x2 , −2x1 + 3x2 ) = (  0 1 ) ( x1 )
                                         ( −2 3 ) ( x2 )
Let
    B = { (1, 0), (0, 1) },   B′ = { (1, 1), (1, 2) }
Then the change of basis matrix is
    P = ( 1 1 )
        ( 1 2 )
and
    [T]B = (  0 1 ),   [T]B′ = ( 1 0 )
           ( −2 3 )            ( 0 2 )
So by 4.11,
    P −1 [T]B P = [T]B′
i.e.
    P −1 (  0 1 ) P = ( 1 0 )
         ( −2 3 )     ( 0 2 )
(familiar from M1GLA).
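The conjugation itself is one line numerically; a sketch assuming numpy:

    import numpy as np

    A = np.array([[0., 1.], [-2., 3.]])   # [T]_B in the standard basis
    P = np.array([[1., 1.], [1., 2.]])    # columns are the vectors of B'
    print(np.linalg.inv(P) @ A @ P)       # [[1. 0.] [0. 2.]] = [T]_{B'}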
Chapter 5
Error-correcting codes
5.1 Introduction
Everyday language: alphabet: a,b,c,. . . and words: ahem, won, pea, too, . . . . A code is a
language for machine communication: alphabet: 0, 1 and codewords: selected strings of
0’s and 1’s.
Eg 5.1. ASCII code:
A → 1000001
B → 1000010
. . .
9 → 0111001
Message --encode--> codewords --transmission (with noise!)--> received words --decode--> decoded message
Errors in transmission happen (1 in 100 bits).
In everyday language errors can usually be corrected:
lunear algebra → linear algebra
Try to do the same with codes.
Eg 5.2. Simplest code:
Messages YES/ NO.
Codewords: YES - 111, NO - 000.
Decode by taking majority, e.g. receive 010, decode as 000. This corrects a single error.
Definition 5.1. If w is a string of 1's and 0's, the parity check-bit of w is 1 if the number
of 1's in w is odd, and 0 otherwise.
Eg 5.3. Parity check-bit of
11010 is 1,
00101 is 0.
Eg 5.4. Here’s a code which communicates 8 messages abc (a, b, c are 0 or 1). Codewords
are abcxyz where
x is parity check-bit for ab
y is parity check-bit for ac
z is parity check-bit for bc
So codewords are
000000, 100110, . . .
Suppose we receive a string 010110. Here
abx = 011 - ok.
acy = 001 - wrong.
bcz = 100 - wrong.
So there is an error. If only 1 error, it must be c, so corrected codeword is 011110.
Claim: This code can correct any single error.
Proof. Tabulate, for a single error in each position, which checks come out wrong:

    error in:  a      b      c      x      y      z
    abx        wrong  wrong  ok     wrong  ok     ok
    acy        wrong  ok     wrong  ok     wrong  ok
    bcz        ok     wrong  wrong  ok     ok     wrong

So the pattern of wrong's and ok's pinpoints the error.
2
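Here is a small sketch of this code in Python (the function names parity and encode are ours); encode builds the codeword abcxyz of Eg 5.4:

    def parity(bits):
        # parity check-bit: 1 if the number of 1's is odd, else 0 (Definition 5.1)
        return sum(bits) % 2

    def encode(a, b, c):
        # codeword abcxyz with x, y, z the check-bits for ab, ac, bc (Eg 5.4)
        return [a, b, c, parity([a, b]), parity([a, c]), parity([b, c])]

    print(encode(0, 1, 1))  # [0, 1, 1, 1, 1, 0], the corrected codeword 011110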
5.2 Theory of Codes
Recall Z2 = {[0] , [1]}. So
0+0 = 0
0+1 = 1
1+1 = 0
Define Zn2 = {(x1 , . . . , xn ) | xi ∈ Z2 }.
Definition 5.2. A (binary) code of length n is a subset C of Zn2 . Members of C are called
codewords.
Eg 5.5. “Triple-check” code
C3 = {abcxyz | x = a + b, y = a + c, z = b + c}
= {000000, 100110, 010101, 001011, 110011, 101101, 011110, 111000}
Definition 5.3. For x, y ∈ Zn2 the distance d(x, y) is the number of positions where x and
y differ.
Eg 5.6. d(11010, 01111) = 3.
Proposition 5.1 (Triangle Inequality). For x, y, z ∈ Zn2
d(x, y) ≤ d(x, z) + d(z, y)
Proof. Let x = (x1 , . . . , xn ), etc. Let
    U = { i | xi ≠ yi }
    S = { i | xi ≠ yi , xi = zi }
    T = { i | xi ≠ yi , xi ≠ zi }
so d(x, y) = |U|. Then
    |U| = |S| + |T | ,  |S| ≤ d(y, z) ,  |T | ≤ d(x, z)
so d(x, y) ≤ d(x, z) + d(z, y).
2
Definition 5.4. For a code C, the minimum distance of C is
    min { d(x, y) | x, y ∈ C, x ≠ y }
Call it d(C).
5.2.1 Error Correction
Code C. Send a codeword c. Some errors are made and c′ is received. Correct c′ to the
nearest codeword. Want this to be c.
Definition 5.5. Let e ≥ 1. We say a code C corrects e errors if, whenever a codeword
c ∈ C is sent, and at most e errors are made, the received word is corrected to c
Equivalently
(1) Let Se (c) = {w ∈ Zn2 | d(c, w) ≤ e}. Then C corrects e errors iff
Se (c) ∩ Se (c′ ) = ∅
for all c, c′ ∈ C such that c ≠ c′ .
(2) C corrects e errors if for any c, c′ ∈ C and w ∈ Zn2
d(c, w) ≤ e,
d(c′ , w) ≤ e ⇒ c = c′
Proposition 5.2. If the minimum distance d(C) ≥ 2e + 1, then C corrects e errors.
Proof. Apply (2). Suppose c, c′ ∈ C and w ∈ Zn2 with d(c, w) ≤ e and d(c′ , w) ≤ e. Then by 5.1
d(c, c′) ≤ d(c, w) + d(w, c′ )
≤ 2e
So c = c′ , since d(C) ≥ 2e + 1.
2
Eg 5.7. d(C3 ) = 3 = 2 · 1 + 1, therefore C3 corrects 1 error.
5.3 Linear Codes
Claim: Zn2 is a vector space over Z2 (scalars are Z2 = {0, 1}).
Proof. Define addition (x1 , . . . , xn ) + (y1 , . . . , yn ) = (x1 + y1 , . . . , xn + yn ) and scalar multiplication λ(x1 , . . . , xn ) = (λx1 , . . . , λxn ). Check axioms
(1) Zn2 is an abelian group under +
(2) four scalar multiplication axioms hold
2
Note 5.1. Unlike Rn , Zn2 is a finite vector space, having 2^n vectors. It has dimension n,
with standard basis e1 , . . . , en .
Definition 5.6. Code C ⊆ Zn2 is a linear code if C is a subspace of Zn2 .
By 2.3, this means
(1) 0 ∈ C
(2) c, d ∈ C, then c + d ∈ C
(3) c ∈ C, then λc ∈ C (as λ can be only 0 and 1).
Eg 5.8.
    C3 = { abcxyz | x = a + b, y = a + c, z = b + c }
       = { x ∈ Z62 | ( 1 1 0 1 0 0 ) x = 0 }
                     ( 1 0 1 0 1 0 )
                     ( 0 1 1 0 0 1 )
a solution space in Z62 .
Aim: Find linear codes C with:
• dim C large
• length of C small
• d(C) large
Proposition 5.3. If C is a linear code and dim C = m, then the number of codewords in C is
    |C| = 2^m
Proof. Let c1 , . . . , cm be a basis of C. So C consists of all linear combinations
    λ1 c1 + · · · + λm cm    (λi ∈ Z2 )
There are 2 choices (0 or 1) for each λi . Hence 2^m codewords.
2
Eg 5.9. C3 has 8 codewords abcxyz.
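Enumerating C3 by machine confirms Proposition 5.3 with m = 3; a sketch in Python (the function name c3 is ours):

    from itertools import product

    def c3(a, b, c):
        # codeword abcxyz of the triple-check code C3 (Eg 5.5)
        return (a, b, c, (a + b) % 2, (a + c) % 2, (b + c) % 2)

    codewords = [c3(a, b, c) for a, b, c in product([0, 1], repeat=3)]
    print(len(codewords))   # 8 = 2^3 codewords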
5.3.1 Minimum distance of linear code
Definition 5.7. For w ∈ Zn2 define the weight
wt (w) = number of 1’s in w
Eg 5.10. wt(1101010) = 4
Note 5.2. wt (w) = d(w, 0)
Proposition 5.4. For x, y, z ∈ Zn2
d(x, y) = d(x + z, y + z).
Proof. Notice that x + z is x changed in the places where z has a 1. Similarly y + z is y
changed in the places where z has a 1. So x + z, y + z differ in the same places where x, y differ. 2
Main Result
Proposition 5.5. If C is a linear code, then
    d(C) = min { wt(c) | c ∈ C, c ≠ 0 }
Proof. Let c ∈ C, c ≠ 0, be a codeword of smallest weight, say wt(c) = r. Then d(c, 0) = r,
hence
    d(C) ≤ r    (∗)
Let x, y ∈ C, x ≠ y. Then
    d(x, y) = d(x + y, y + y)    (by 5.4)
            = d(x + y, 0)
            = wt(x + y)
            ≥ r    (as x + y ∈ C)
So d(C) ≥ r. Hence, by (∗), d(C) = r.
2
Eg 5.11. C3 = {000000, 100110, . . .} has minimum nonzero weight 3, hence d(C3 ) = 3.
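By Proposition 5.5, d(C3 ) is a minimum-weight scan over the codewords; a sketch in Python:

    from itertools import product

    words = [(a, b, c, (a + b) % 2, (a + c) % 2, (b + c) % 2)
             for a, b, c in product([0, 1], repeat=3)]
    # minimum weight over the nonzero codewords = d(C3) (Proposition 5.5)
    print(min(sum(w) for w in words if any(w)))  # 3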
5.4 The Check Matrix
Definition 5.8. Let A be an m × n matrix with entries in Z2 . If C is the linear code
C = {x ∈ Zn2 | Ax = 0}
then A is a check matrix for C.
Eg 5.12.
1. A check matrix for C3 is
    ( 1 1 0 1 0 0 )
    ( 1 0 1 0 1 0 )
    ( 0 1 1 0 0 1 )
2. Let
    A = ( 1 0 1 0 )
        ( 1 1 0 1 )
the check matrix of
    C = { x ∈ Z42 | Ax = 0 } = { x ∈ Z42 | x3 = x1 , x4 = x1 + x2 }
C sends 4 messages x1 x2 with two “check-bits” x3 x4 :
    C = {0000, 1011, 0101, 1110}
    dim C = 2,  d(C) = 2
So C does not correct any errors.
Proposition 5.6. If C is a linear code of length n with check matrix A
C = {x ∈ Zn2 | Ax = 0}
then
dim C = n − rank(A)
Proof. This is Proposition 4.7 (rank-nullity).
Eg 5.13. Find dim (C), where
    C = { x ∈ Z62 | Ax = 0 }
for a 4 × 6 check matrix A over Z2 . Reducing A to echelon form by row operations leaves
3 nonzero rows, so rank(A) = 3 and dim C = 6 − 3 = 3.
Relation between check matrix and min. distance
Proposition 5.7. Let C be a linear code
    C = {x ∈ Zn2 | Ax = 0}
with check matrix A.
(1) If all columns of A are different, and no column is zero, then C corrects 1 error.
(2) Let d ≥ 1. Suppose every set of d − 1 columns of A is linearly independent. Then
d(C) ≥ d.
Proof.
(1) Must show that d(C) ≥ 3, i.e. wt(c) ≥ 3 for every c ∈ C \ {0}. If there is a codeword
c of weight 1, then c = ei , and 0 = Ac = Aei , which is the i-th column of A; so a column
of A is zero. Contradiction.
If there is a codeword c of weight 2, then c = ei + ej , so
0 = Ac = Aei + Aej
= column i + column j
i.e. column i equals column j. Contradiction.
(2) Similarly: if c ∈ C is a nonzero codeword of weight w, then 0 = Ac is a sum of w
columns of A, so those w columns are linearly dependent. If every set of d − 1 columns is
linearly independent, every nonzero codeword therefore has weight ≥ d, so d(C) ≥ d by 5.5.
2
Eg 5.14. To correct two errors, need d(C) ≥ 5. So the condition in (2) is that any 4
columns are linearly independent.
Eg 5.15. There are 7 non-zero column vectors in Z32 . Write them down:
    A = ( 1 1 1 0 1 0 0 )
        ( 1 1 0 1 0 1 0 )
        ( 1 0 1 1 0 0 1 )
Let
    H = { x ∈ Z72 | Ax = 0 }
      = { x1 . . . x7 | x5 = x1 + x2 + x3 , x6 = x1 + x2 + x4 , x7 = x1 + x3 + x4 }
Then
    dim H = 4,   |H| = 16
H corrects one error by 5.7(1). The code H is Ham(3), a Hamming code.
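A sketch of Ham(3) in Python (the function name ham3 is ours), confirming |H| = 16 and that the minimum distance is 3, so one error is corrected:

    from itertools import product

    def ham3(x1, x2, x3, x4):
        # codeword x1...x7 of Ham(3), with the three check-bits of Eg 5.15
        return (x1, x2, x3, x4,
                (x1 + x2 + x3) % 2,
                (x1 + x2 + x4) % 2,
                (x1 + x3 + x4) % 2)

    H = [ham3(*m) for m in product([0, 1], repeat=4)]
    print(len(H))                              # 16 = 2^4 codewords
    print(min(sum(c) for c in H if any(c)))    # 3, so Ham(3) corrects 1 error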
5.5 Decoding
Say C = {x ∈ Zn2 | Ax = 0} corrects 1 error. Send c ∈ C, and make 1 error, in the i-th bit.
Received word is
c′ = c + ei
How to find i?
Ac′ = Ac + Aei
= 0 + Aei
= i-th column of A !
So we correct i-th bit, where Ac′ is the i-th column of A.
Eg 5.16. With the code H as above, receive w = 0111010. Then
    Aw = ( 1 1 1 0 1 0 0 )
         ( 1 1 0 1 0 1 0 ) wT = ( 0, 1, 0 )T
         ( 1 0 1 1 0 0 1 )
This is the 6-th column of A, so we correct the 6-th bit: the corrected codeword is 0111000.
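The decoding rule translates directly into a column search; a sketch assuming numpy, with the check matrix of Ham(3) (the function name correct is ours):

    import numpy as np

    A = np.array([[1, 1, 1, 0, 1, 0, 0],
                  [1, 1, 0, 1, 0, 1, 0],
                  [1, 0, 1, 1, 0, 0, 1]])   # check matrix of Ham(3) (Eg 5.15)

    def correct(w):
        # single-error correction: flip bit i where Aw equals the i-th column of A
        s = A @ w % 2
        if not s.any():
            return w                         # Aw = 0: w is already a codeword
        i = next(j for j in range(A.shape[1]) if (A[:, j] == s).all())
        w = w.copy()
        w[i] ^= 1
        return w

    print(correct(np.array([0, 1, 1, 1, 0, 1, 0])))  # [0 1 1 1 0 0 0], as in Eg 5.16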
THE END.
[email protected]