
Algebra I – lecture notes
version β
Imperial College London, Mathematics, 2005/2006

Contents

1 Groups
  1.1 Definition and examples
    1.1.1 Group table
  1.2 Subgroups
    1.2.1 Criterion for subgroups
  1.3 Cyclic subgroups
    1.3.1 Order of an element
  1.4 More on the symmetric groups S_n
    1.4.1 Order of permutation
  1.5 Lagrange's Theorem
    1.5.1 Consequences of Lagrange's Theorem
  1.6 Applications to number theory
    1.6.1 Groups
  1.7 Applications of the group Z_p^*
    1.7.1 Mersenne Primes
    1.7.2 How to find Mersenne Primes
  1.8 Proof of Lagrange's Theorem
2 Vector Spaces and Linear Algebra
  2.1 Definition of a vector space
  2.2 Subspaces
  2.3 Solution spaces
  2.4 Linear Combinations
  2.5 Span
  2.6 Spanning sets
  2.7 Linear dependence and independence
  2.8 Bases
  2.9 Dimension
  2.10 Further Deductions
3 More on Subspaces
  3.1 Sums and Intersections
  3.2 The rank of a matrix
    3.2.1 How to find row-rank(A)
    3.2.2 How to find column-rank(A)?
4 Linear Transformations
  4.1 Basic properties
  4.2 Constructing linear transformations
  4.3 Kernel and Image
  4.4 Composition of linear transformations
  4.5 The matrix of a linear transformation
  4.6 Eigenvalues and eigenvectors
    4.6.1 How to find evals / evecs of T?
  4.7 Diagonalisation
  4.8 Change of basis
5 Error-correcting codes
  5.1 Introduction
  5.2 Theory of Codes
    5.2.1 Error Correction
  5.3 Linear Codes
    5.3.1 Minimum distance of linear code
  5.4 The Check Matrix
  5.5 Decoding

Introduction

(1) Groups are used throughout maths and science to describe symmetry: e.g. every physical object, algebraic equation or system of differential equations has a group associated with it.
(2) Vector spaces: we have seen and studied some of these already, e.g. R^n.

Chapter 1. Groups

1.1 Definition and examples

Definition 1.1. Let S be a set. A binary operation ∗ on S is a rule which assigns to any ordered pair (a, b) with a, b ∈ S an element a ∗ b ∈ S. In other words, it is a function S × S → S.

Eg 1.1.
1) S = Z, a ∗ b = a + b
2) S = C, a ∗ b = ab
3) S = R, a ∗ b = a − b
4) S = R, a ∗ b = min(a, b)
5) S = {1, 2, 3}, a ∗ b = a (e.g. 1 ∗ 1 = 1, 2 ∗ 3 = 2)

Given a binary operation on a set S and a, b, c ∈ S, we can form “a ∗ b ∗ c” in two ways:
(a ∗ b) ∗ c and a ∗ (b ∗ c).
These may or may not be equal.

Eg 1.2. In 1), (a ∗ b) ∗ c = a ∗ (b ∗ c) for all a, b, c. In 3), (3 ∗ 5) ∗ 4 = (3 − 5) − 4 = −6, whereas 3 ∗ (5 ∗ 4) = 3 − (5 − 4) = 2.

Definition 1.2. A binary operation ∗ is associative if for all a, b, c ∈ S
(a ∗ b) ∗ c = a ∗ (b ∗ c)

Associativity is important.

Eg 1.3. Solve 5 + x = 2. We add −5 to get −5 + (5 + x) = −5 + 2. Now we use associativity! We rebracket to (−5 + 5) + x = −5 + 2. Thus 0 + x = −5 + 2, so x = −3.
To do this, we needed
1) associativity of +
2) the existence of 0 (with 0 + x = x)
3) the existence of −5 (with −5 + 5 = 0)

Generally, suppose we have a binary operation ∗ and an equation
a ∗ x = b
(a, b ∈ S constants, x ∈ S unknown). To be able to solve it, we need
1) associativity
2) existence of e ∈ S such that e ∗ x = x for all x ∈ S
3) existence of a′ ∈ S such that a′ ∗ a = e

Then we can solve:
a ∗ x = b
a′ ∗ (a ∗ x) = a′ ∗ b
(a′ ∗ a) ∗ x = a′ ∗ b
e ∗ x = a′ ∗ b
x = a′ ∗ b

A group will be a structure in which we can solve equations like this.

Definition 1.3. A group (G, ∗) is a set G with a binary operation ∗ satisfying the following axioms (for all a, b, c ∈ G):
(1) if a, b ∈ G then a ∗ b ∈ G (closure)
(2) (a ∗ b) ∗ c = a ∗ (b ∗ c) (associativity)
(3) there exists e ∈ G such that e ∗ x = x ∗ e = x for all x ∈ G (identity axiom)
(4) for any a ∈ G, there exists a′ ∈ G such that a ∗ a′ = a′ ∗ a = e (inverse axiom)

The element e in (3) is an identity element of G. The element a′ in (4) is an inverse of a in G.

Eg 1.4.
                       closure  assoc.  identity  inverse  group
(Z, +)                 yes      yes     yes       yes      yes
(Z, −)                 yes      no      no        n/a      no
(Z, ×)                 yes      yes     yes       no       no
(Q, +)                 yes      yes     yes       yes      yes
(Q, ×)                 yes      yes     yes       no       no
(Q − {0}, ×)           yes      yes     yes       yes      yes
(C − {0}, ×)           yes      yes     yes       yes      yes
({1, −1, i, −i}, ×)    yes      yes     yes       yes      yes
(n/a: without an identity element the inverse axiom cannot be stated.)

We check the group axioms for the last example.
• Closure: see the multiplication table
        1    −1    i    −i
  1     1    −1    i    −i
  −1    −1   1     −i   i
  i     i    −i    −1   1
  −i    −i   i     1    −1
• Associativity: follows from associativity of (C, ×)
• Identity: 1
• Inverses: read off from the table

Uniqueness of identity and inverses

Proposition 1.1. Let (G, ∗) be a group.
1) G has exactly one identity element
2) Each element of G has exactly one inverse

Proof.
1) Suppose e, e′ are identity elements. So for all x ∈ G
e ∗ x = x ∗ e = x
e′ ∗ x = x ∗ e′ = x
Then e = e ∗ e′ = e′ (the first equality because e′ is an identity, the second because e is).
2) Let x ∈ G and suppose x′, x′′ are inverses of x.
That means
x′ ∗ x = x ∗ x′ = e
x′′ ∗ x = x ∗ x′′ = e
Then
x′ = x′ ∗ e
   = x′ ∗ (x ∗ x′′)
   = (x′ ∗ x) ∗ x′′
   = e ∗ x′′
   = x′′ □

Notation 1.1.
• e is the identity element of G
• x^{-1} is the inverse of x
• Instead of “(G, ∗) is a group”, we write “G is a group under ∗”.
• We often drop the ∗ in a ∗ b, and write just ab.

Eg 1.5. In (Z, +), x^{-1} = −x; Z is a group under addition. In (Q − {0}, ×), x^{-1} = 1/x.

Definition 1.4. We say (G, ∗) is a finite group if |G| is finite; (G, ∗) is an infinite group if |G| is infinite.

Eg 1.6. All the groups in Eg 1.4 are infinite except the last, which has size (order) 4.

Eg 1.7. Let F = R or C. Say a matrix (a_ij) is a matrix over F if all a_ij ∈ F. The set of all n × n matrices over F under matrix multiplication is not a group (there is a problem with the inverse axiom). But let us define GL(n, F) to be the set of all n × n invertible matrices over F.

Definition 1.5. Denote the set of all invertible n × n matrices over the field F by GL(n, F):
GL(n, F) = {A = (a_ij) | 1 ≤ i, j ≤ n, a_ij ∈ F, A invertible}

Claim. GL(n, F) is a group under matrix multiplication.

Proof. Write G = GL(n, F).
Closure: Let A, B ∈ G. So A, B are invertible. Now (AB)^{-1} = B^{-1}A^{-1}, since
(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = I
(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = I
So AB ∈ G.
Associativity: proved in M1GLA.
Identity: the identity matrix I_n.
Inverses: the inverse of A is A^{-1} (since AA^{-1} = A^{-1}A = I). Note A^{-1} ∈ G as it has an inverse, namely A. □

For example:
1) GL(1, R) is the set of all (a) with a ∈ R, a ≠ 0. This is just the group (R − {0}, ×).
2) GL(2, C) is the set of all $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with a, b, c, d ∈ C and ad − bc ≠ 0.

Note 1.1. Usually, in a group (G, ∗), a ∗ b is not the same as b ∗ a.

Definition 1.6. Let (G, ∗) be a group. If a ∗ b = b ∗ a for all a, b ∈ G, we call G an abelian group.

Eg 1.8. (Z, +) is abelian, as a + b = b + a for all a, b ∈ Z. So are the other groups in Eg 1.4. So is GL(1, F). But GL(2, R) is not abelian, since
$\begin{pmatrix} 1 & -1 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \begin{pmatrix} -1 & 1 \\ 2 & 0 \end{pmatrix}$
whereas
$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & -1 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 0 & 2 \\ 1 & -1 \end{pmatrix}$
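
The failure of commutativity in GL(2, R) is easy to confirm numerically. The following is a minimal sketch, assuming the two matrices of Eg 1.8 are (1 −1; 0 2) and (0 1; 1 0) as read from the flattened layout; `matmul2` and `det2` are hypothetical helper names introduced only for this illustration.

```python
def matmul2(p, q):
    """Multiply two 2x2 matrices given as nested lists (hypothetical helper)."""
    return [
        [sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def det2(p):
    """Determinant ad - bc of a 2x2 matrix."""
    return p[0][0] * p[1][1] - p[0][1] * p[1][0]


# Matrices assumed from Eg 1.8; both have nonzero determinant,
# so both lie in GL(2, R).
A = [[1, -1], [0, 2]]
B = [[0, 1], [1, 0]]

AB = matmul2(A, B)
BA = matmul2(B, A)

print(det2(A), det2(B))  # 2 -1
print(AB)                # [[-1, 1], [2, 0]]
print(BA)                # [[0, 2], [1, -1]]
print(AB == BA)          # False
```

A single pair with AB ≠ BA is enough to show the group is not abelian; no search over all matrices is needed.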

Groups of permutations

Definition 1.7. Let S be a set. A permutation of S is a function f : S → S which is a bijection (both an injection and a surjection).

Eg 1.9. S = {1, 2, 3, 4}, f : 1 → 2, 2 → 3, 3 → 4, 4 → 1 is a permutation.

Notation 1.2.
$f = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix}$
is the permutation 1 → 2, 2 → 4, 3 → 3, 4 → 1. Let
$g = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 2 & 4 \end{pmatrix}$
The composition f ◦ g is defined by (f ◦ g)(s) = f(g(s)), so g is applied first. Here
$f \circ g = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 4 & 1 \end{pmatrix}$
Recall that the inverse function f^{-1} undoes f: f^{-1}(t) = s whenever f(s) = t. Here
$f^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 3 & 2 \end{pmatrix}$
Notice that
$f \circ f^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix} = e$,
the identity function.

Proposition 1.2. Let S = {1, 2, 3, . . . , n} and let G be the set of all permutations of S. Then (G, ◦), where ◦ is composition of functions, is a group, i.e. G is a group under composition.

Proof. The notation for f ∈ G is
$f = \begin{pmatrix} 1 & 2 & \cdots & n \\ f(1) & f(2) & \cdots & f(n) \end{pmatrix}$
• Closure: By M1F, if f, g are bijections S → S then f ◦ g is a bijection.
• Associativity: Let f, g, h ∈ G and s ∈ S. Then
(f ◦ (g ◦ h))(s) = f((g ◦ h)(s)) = f(g(h(s)))
((f ◦ g) ◦ h)(s) = (f ◦ g)(h(s)) = f(g(h(s)))
So f ◦ (g ◦ h) = (f ◦ g) ◦ h.
• Identity: $e = \begin{pmatrix} 1 & 2 & \cdots & n \\ 1 & 2 & \cdots & n \end{pmatrix}$, since e ◦ f = f ◦ e = f.
• Inverses: the inverse of f is $f^{-1} = \begin{pmatrix} f(1) & f(2) & \cdots & f(n) \\ 1 & 2 & \cdots & n \end{pmatrix}$, and f^{-1} ◦ f = f ◦ f^{-1} = e. □

Definition 1.8. The group of all permutations of {1, 2, . . . , n} is written S_n and called the symmetric group of degree n.

Eg 1.10. $S_2 = \left\{ e, \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} \right\}$. So |S_2| = 2.

Eg 1.11.
$S_3 = \left\{ \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}, \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix} \right\}$
So |S_3| = 6.

Proposition 1.3. S_n is a finite group of size n!.

Proof. For $f = \begin{pmatrix} 1 & 2 & \cdots & n \\ f(1) & f(2) & \cdots & f(n) \end{pmatrix}$, the number of choices for f(1) is n, for f(2) is n − 1, for f(3) is n − 2, . . . , for f(n) is 1. The total number of permutations is n · (n − 1) · (n − 2) · · · 1 = n!. □

Notation 1.3. (Multiplicative notation for groups) If (G, ∗) is a group, we’ll usually write just ab instead of a ∗ b.
We can define powers
a^2 = a ∗ a
a^3 = a ∗ a ∗ a
· · ·
a^n = a ∗ a ∗ · · · ∗ a (n factors)
When we write “Let G be a group”, we mean the binary operation ∗ is understood, and we’re writing ab instead of a ∗ b, etc.

Eg 1.12. In (Z, +), ab = a ∗ b = a + b and a^2 = a ∗ a = 2a; in general a^n = na.

1.1.1 Group table

Definition 1.9. Let G be a group, with elements a, b, c, . . . . Form a group table, in which the entry in row x and column y is xy:
       a      b      c     · · ·
a      a^2    ab     ac    · · ·
b      ba     b^2    bc    · · ·
⋮

Eg 1.13. In S_3 write
$e = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}$, $a = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$, $a^2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$,
$b = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$, $ab = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}$, $a^2 b = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}$
The group table is
         e       a       a^2     b       ab      a^2b
e        e       a       a^2     b       ab      a^2b
a        a       a^2     e       ab      a^2b    b
a^2      a^2     e       a       a^2b    b       ab
b        b       a^2b    ab      e       a^2     a
ab       ab      b       a^2b    a       e       a^2
a^2b     a^2b    ab      b       a^2     a       e
The whole table follows from the relations
• a^3 = e
• ba = a^2 b
• b^2 = e

1.2 Subgroups

Definition 1.10. Let (G, ∗) be a group. A subgroup of (G, ∗) is a subset of G which is itself a group under ∗.

Eg 1.14.
• (Z, +) is a subgroup of (R, +).
• (Q − {0}, ×) is not a subgroup of (R, +) (the operation is different).
• ({1, −1}, ×) is a subgroup of ({1, −1, i, −i}, ×).
• ({1, i}, ×) is not a subgroup of ({1, −1, i, −i}, ×) (closure fails: i × i = −1).

1.2.1 Criterion for subgroups

Proposition 1.4. Let G be a group (so ∗ is understood and we write ab instead of a ∗ b). Let H ⊆ G. Then H is a subgroup of G if the following conditions are true:
(1) e ∈ H (where e is the identity of G)
(2) if h, k ∈ H then hk ∈ H (H is closed)
(3) if h ∈ H then h^{-1} ∈ H

Proof. Assume (1)-(3). Check the group axioms for H:
Closure: true by (2)
Associativity: true by associativity for G
Identity: by (1)
Inverses: by (3) □

Eg 1.15. Let G = GL(2, R) (2 × 2 invertible matrices over R). Let
$H = \left\{ \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix} \,\middle|\, n \in \mathbb{Z} \right\}$
Claim. H is a subgroup of G.
Proof. Check (1)-(3) of the previous proposition.
(1) $e = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \in H$
(2) Let $h = \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$, $k = \begin{pmatrix} 1 & p \\ 0 & 1 \end{pmatrix}$. Then $hk = \begin{pmatrix} 1 & p+n \\ 0 & 1 \end{pmatrix} \in H$.
(3) Let $h = \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$. Then $h^{-1} = \begin{pmatrix} 1 & -n \\ 0 & 1 \end{pmatrix} \in H$. □

1.3 Cyclic subgroups

Let G be a group. Let a ∈ G.
Recall a^1 = a, a^2 = aa, . . . We also define the zeroth and negative powers:
a^0 = e
a^{-2} = a^{-1} a^{-1}
a^{-n} = a^{-1} · · · a^{-1} (n factors)

Note 1.2. All the powers a^n (n ∈ Z) lie in G (by closure).

Lemma 1.1. For any m, n ∈ Z
a^m a^n = a^{m+n}

Proof. For m, n > 0, a^m a^n is a product of m factors a followed by n factors a, i.e. m + n factors in total, so a^m a^n = a^{m+n}.
For m ≥ 0, n < 0, a^m a^n is a product of m factors a followed by −n factors a^{-1}; cancelling gives a^m a^n = a^{m−(−n)} = a^{m+n}. Similarly for m < 0, n ≥ 0.
Finally, when m, n < 0, a^m a^n is a product of −m factors a^{-1} followed by −n factors a^{-1}, i.e. −m − n factors a^{-1} in total, so a^m a^n = a^{m+n}. □

Proposition 1.5. Let G be a group. Let a ∈ G. Define
A = {a^n | n ∈ Z} = {. . . , a^{-2}, a^{-1}, e, a, a^2, . . .}
Then A is a subgroup of G.

Proof. Check (1)-(3) of Proposition 1.4:
(1) e = a^0 ∈ A
(2) if a^m, a^n ∈ A then a^m a^n = a^{m+n} ∈ A
(3) if a^n ∈ A then (a^n)^{-1} = a^{-n} ∈ A □

Definition 1.11. Write A = ⟨a⟩, called the cyclic subgroup of G generated by a.
So for each element a ∈ G we get a cyclic subgroup ⟨a⟩ of G.

Eg 1.16.
(1) G = (Z, +). What is the cyclic subgroup ⟨3⟩? In additive notation, 3^1 = 3, 3^2 = 3 + 3 = 6, 3^n = 3n, 3^{-1} = −3, 3^{-n} = −3n. So ⟨3⟩ = {3n | n ∈ Z}. Similarly ⟨1⟩ = {n | n ∈ Z} = Z.
(2) G = S_3, $a = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$. What is ⟨a⟩? Well,
a^0 = e
a^1 = a
$a^2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$
a^3 = e
a^4 = a
a^5 = a^2
⋮
a^{-1} = a^3 a^{-1} = a^2
a^{-2} = a
⋮
Hence ⟨a⟩ = {a^n | n ∈ Z} = {e, a, a^2}.
Now consider ⟨b⟩, $b = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$. Here b^0 = e, b^1 = b, b^2 = e, . . . . So ⟨b⟩ = {e, b}.
(3) All the cyclic subgroups of S_3 = {e, a, a^2, b, ab, a^2 b}:
⟨e⟩ = {e}
⟨a⟩ = {e, a, a^2}
⟨a^2⟩ = {e, a, a^2}
⟨b⟩ = {e, b}
⟨ab⟩ = {e, ab}
⟨a^2 b⟩ = {e, a^2 b}

Definition 1.12. Say a group G is a cyclic group if there exists an element a ∈ G such that
G = ⟨a⟩ = {a^n | n ∈ Z}
Call a a generator for G.

Eg 1.17.
(1) (Z, +) = ⟨1⟩, so (Z, +) is cyclic with generator 1.
(2) ({1, −1, i, −i}, ×) is cyclic with generator i, since ⟨i⟩ = {i^0, i^1, i^2, i^3} = {1, i, −1, −i}. Another generator is −i, but 1 and −1 are not generators.
(3) S_3 is not cyclic, as none of its 5 cyclic subgroups is the whole of S_3.

For any n ∈ N there exists a cyclic group of size n (having n elements), denoted C_n:
C_n = {x ∈ C | x^n = 1}
the complex n-th roots of unity, under multiplication. By M1F, we know
C_n = {1, ω, ω^2, . . . , ω^{n−1}}, where ω = e^{2πi/n}.
So C_n = ⟨ω⟩, a cyclic subgroup of (C − {0}, ×).

1.3.1 Order of an element

Definition 1.13. Let G be a group, let a ∈ G. The order of a, written o(a), is the smallest positive integer k such that a^k = e. So o(a) = k means a^k = e and a^i ≠ e for i = 1, . . . , k − 1. If no such k exists, we say a has infinite order and we write o(a) = ∞.

Eg 1.18.
(1) e has order 1, and is the only such element.
(2) G = S_3, $a = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$. Then a^1 = a, $a^2 = \begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$, $a^3 = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix} = e$. So o(a) = 3.
For $b = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$, b^1 ≠ e, b^2 = e, so o(b) = 2. Full list:
o(e) = 1
o(a) = 3
o(a^2) = 3
o(b) = 2
o(ab) = 2
o(a^2 b) = 2
(3) G = (Z, +). What is o(3)? In G, e = 0 and 3^n = n × 3. So 3^n ≠ e for any n ∈ N, so o(3) = ∞.
(4) G = GL(2, C), $A = \begin{pmatrix} i & 0 \\ 0 & e^{2\pi i/3} \end{pmatrix}$. Then
$A^k = \begin{pmatrix} i^k & 0 \\ 0 & e^{2\pi i k/3} \end{pmatrix}$
This is the identity exactly when 4 | k and 3 | k, so the smallest such k is 12. ∴ o(A) = 12.

Proposition 1.6. Let G be a group, a ∈ G. The number of elements in the cyclic subgroup generated by a is equal to o(a):
|⟨a⟩| = o(a)

Proof.
(1) Suppose o(a) = k, finite. This means a^k = e, but a^i ≠ e for 1 ≤ i ≤ k − 1. Write A = ⟨a⟩ = {a^n | n ∈ Z}. Then A contains
e, a, a^2, . . . , a^{k−1}
These are all different elements of G: if a^i = a^j for some 0 ≤ i < j ≤ k − 1, then
a^{-i} a^i = a^{-i} a^j
e = a^{j−i}
with 0 < j − i < k, contradicting the minimality of k = o(a). Hence A contains e, a, . . . , a^{k−1}, all distinct, so |A| ≥ k.
We now show that every element of A is one of e, a, . . . , a^{k−1}. Let a^n ∈ A. Write
n = qk + r, 0 ≤ r ≤ k − 1
Then
a^n = a^{qk+r}
    = a^{qk} a^r
    = (a^k)^q a^r
    = e^q a^r
    = a^r
So a^n = a^r ∈ {e, a, a^2, . . . , a^{k−1}}. We’ve shown A = {e, a, a^2, . . . , a^{k−1}}, so |A| = k = o(a).
(2) Suppose o(a) = ∞. This means a^i ≠ e for i ≥ 1. If i < j then a^i ≠ a^j, since a^i = a^j would give e = a^{j−i} with j − i ≥ 1, a contradiction. Then
A = {. . . , a^{-2}, a^{-1}, e, a, a^2, . . .}
and all these elements are different elements of G. So |A| = ∞ = o(a). □

Eg 1.19.
(1) G = S_3, $a = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$. Then ⟨a⟩ = {e, a, a^2}, of size 3, and o(a) = 3.
(2) G = (Z, +). Then ⟨3⟩ = {3n | n ∈ Z} is infinite and o(3) = ∞.
(3) C_n = ⟨ω⟩ = {1, ω, . . . , ω^{n−1}}, of size n, and o(ω) = n.

1.4 More on the symmetric groups S_n

Eg 1.20. Let $f = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 4 & 5 & 6 & 3 & 2 & 7 & 1 & 8 \end{pmatrix} \in S_8$. What are f^2, f^5?
We need better notation to see the answers quickly. Observe that the numbers 1 → 4 → 3 → 6 → 7 → 1 are in a cycle, as are the numbers 2 → 5 → 2, and 8 is fixed. We will write
f = (1 4 3 6 7)(2 5)(8).
These are the cycles of f. Each symbol in the first cycle goes to the next, except for the last symbol 7, which goes back to the first symbol 1. The cycles are disjoint: they have no symbols in common. Call this the cycle notation for f.

Definition 1.14. In general, an r-cycle is a permutation (a_1 a_2 . . . a_r) which sends a_1 → a_2 → · · · → a_r → a_1 and fixes every other symbol.

Eg 1.21. We can easily go from cycle notation to the original notation, e.g. for g = (1 5 3)(2 4)(6 7) ∈ S_7
$g = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 5 & 4 & 1 & 2 & 3 & 7 & 6 \end{pmatrix}$

Proposition 1.7. Every permutation f in S_n can be expressed in cycle notation, i.e. as a product of disjoint cycles.

Proof. The following procedure works. Start with 1, and write down the sequence
1, f(1), f^2(1), . . . , f^{r−1}(1)
until the first repeat f^r(1). Then in fact f^r(1) = 1: if instead f^r(1) = f^i(1) for some 0 < i < r, then applying f^{-i} gives f^{r−i}(1) = 1, a repeat occurring before f^r(1), which is a contradiction as f^r(1) is the first repeat. So we have the r-cycle
(1 f(1) · · · f^{r−1}(1)),
the first cycle of f.
Second cycle: pick a symbol i not in the first cycle, and write
i, f(i), f^2(i), . . . , f^{s−1}(i)
where f^s(i) = i. Then this is the second cycle of f.
This cycle is disjoint from the first: if not, say f^j(i) = k lies in the first cycle, then f^{s−j}(k) = f^s(i) = i would be in the first cycle, contrary to the choice of i.
Now carry on: pick j not in the first two cycles and repeat to get the third cycle, and continue until we have used all the symbols 1, . . . , n. So
f = (1 f(1) · · · f^{r−1}(1))(i f(i) · · · f^{s−1}(i)) . . .
a product of disjoint cycles. □

Note 1.3. Cycle notation is not quite unique: e.g. (1 2 3 4) can be written as (2 3 4 1), and (1 2)(3 4 5) = (3 4 5)(1 2). The notation is unique apart from such changes.

Eg 1.22.
1. The elements of S_3 in cycle notation:
e = (1)(2)(3)
a = (1 2 3)
a^2 = (1 3 2)
b = (1 2)(3)
ab = (1 3)(2)
a^2 b = (2 3)(1)
2. For disjoint cycles, the order of multiplication does not matter, e.g.
(1 2)(3 4 5) = (3 4 5)(1 2)
For non-disjoint cycles it does matter, e.g.
(1 2)(1 3) ≠ (1 3)(1 2)
3. Multiplication is easy using cycle notation, e.g. if
f = (1 2 3 5 4) ∈ S_5
g = (2 4)(1 5 3) ∈ S_5
then (applying g first) fg = (1 4 3 2)(5).

Definition 1.15. Let g = (1 2 3)(4 5)(6 7)(8)(9) ∈ S_9. The cycle shape of g is (3, 2, 2, 1, 1), i.e. the sequence of cycle lengths of g in descending order. Abbreviate: (3, 2^2, 1^2).

Eg 1.23. How many permutations of each cycle shape are there in S_4?
cycle shape    e.g.           number in S_4
(1^4)          e              1
(2, 1^2)       (1 2)(3)(4)    $\binom{4}{2} = 6$
(3, 1)         (1 2 3)(4)     $\binom{4}{3} \times 2 = 8$
(4)            (1 2 3 4)      3! = 6
(2^2)          (1 2)(3 4)     3
Total 24 = 4!.

1.4.1 Order of permutation

Recall that the order o(f) of f ∈ S_n is the smallest positive integer k such that f^k = e.

Eg 1.24. Let f = (1 2 3 4), a 4-cycle. Then
f^1 = f
f^2 = (1 3)(2 4)
f^3 = (1 4 3 2)
f^4 = e
So o(f) = 4. Similarly, if f = (1 2 . . . r) then o(f) = r.

Eg 1.25. g = (1 2 3)(4 5 6 7). What is o(g)?
g^2 = (1 2 3)(4 5 6 7) ◦ (1 2 3)(4 5 6 7)
    = (1 2 3)^2 (4 5 6 7)^2 (the cycles are disjoint, so they commute)
Similarly
g^i = (1 2 3)^i (4 5 6 7)^i
To make g^i = e, we need i to be divisible by 3 (to get rid of (1 2 3)^i) and by 4 (to get rid of (4 5 6 7)^i). So o(g) = lcm(3, 4) = 12.

The same argument gives

Proposition 1.8. The order of a permutation written as a product of disjoint cycles is the least common multiple of the cycle lengths.

Eg 1.26. The order of (1 2)(3 4 5 6) is lcm(2, 4) = 4. The order of (1 3)(3 4 5 6) is not 4, because the cycles are not disjoint (the product is a 5-cycle, of order 5).

Eg 1.27. Take a pack of 8 cards. Shuffle by dividing into two halves and interlacing, so if the original order is 1, 2, . . . , 8 then the new order is 1, 5, 2, 6, 3, 7, 4, 8. How many shuffles bring the cards back to the original order? This is the permutation s in S_8
$s = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ 1 & 5 & 2 & 6 & 3 & 7 & 4 & 8 \end{pmatrix}$
In cycle notation s = (1)(2 5 3)(4 6 7)(8). So the order of s is o(s) = lcm(3, 3, 1, 1) = 3, and 3 shuffles are required.

1.5 Lagrange’s Theorem

Recall that G is a finite group if G has a finite number of elements. The size of G is |G|, e.g. |S_3| = 6.

Theorem 1.1. Let G be a finite group. If H is any subgroup of G, then |H| divides |G|.

Eg 1.28. Subgroups of S_3 have size 1, 2, 3 or 6.

Note 1.4. It does not work the other way round, i.e. if a is a number dividing |G|, then there may well not exist a subgroup of G of size a.

1.5.1 Consequences of Lagrange’s Theorem

Corollary 1. If G is a finite group and a ∈ G then o(a) divides |G|.

Proof. Let H = ⟨a⟩, the cyclic subgroup of G generated by a. By Proposition 1.6, |H| = o(a), so by Lagrange, o(a) divides |G|. □

Corollary 2. Let G be a finite group and let n = |G|. If a ∈ G, then a^n = e.

Proof. Let k = o(a). By Corollary 1, k divides n. Say n = kr. So a^n = (a^k)^r = e^r = e. □

Corollary 3. If |G| is a prime number, then G is cyclic.

Proof. Let |G| = p, prime. Pick a ∈ G with a ≠ e. By Lagrange, the cyclic subgroup ⟨a⟩ has size dividing p. It contains e and a, so has size ≥ 2, and therefore has size p. As |G| = p, this implies G = ⟨a⟩, which is cyclic. □

Eg 1.29.
Subgroups of S_3. The proper subgroups have size 1 (namely ⟨e⟩), 2 or 3, and those of size 2 or 3 are cyclic by Corollary 3. So we know all the subgroups of S_3.

1.6 Applications to number theory

Definition 1.16. Fix a positive integer m ∈ N. For any integer r, the residue class of r modulo m, denoted [r]_m, is
[r]_m = {km + r | k ∈ Z}

Eg 1.30.
[0]_5 = {5k | k ∈ Z}
[1]_5 = {. . . , −9, −4, 1, 6, 11, . . .}
[−2]_5 = [3]_5 = [8]_5

Since every integer is congruent to exactly one of 0, 1, 2, . . . , m − 1 modulo m,
[0]_m ∪ [1]_m ∪ · · · ∪ [m − 1]_m = Z
and every integer is in exactly one of these residue classes.

Proposition 1.9.
[a]_m = [b]_m ⇔ a ≡ b mod m

Proof.
⇒ Suppose [a]_m = [b]_m. As a ∈ [a]_m, this implies a ∈ [b]_m, so a ≡ b mod m.
⇐ Suppose a ≡ b mod m. Now
x ≡ a mod m ⇔ x ≡ b mod m
(as ≡ is an equivalence relation). So
x ∈ [a]_m ⇔ x ∈ [b]_m
Therefore [a]_m = [b]_m. □

Eg 1.31. [17]_9 = [−19]_9

Definition 1.17. Write Z_m for the set of all the residue classes
[0]_m, [1]_m, . . . , [m − 1]_m
From now on we’ll usually drop the subscript m and write [r] = [r]_m.

Definition 1.18. Define +, × on Z_m by
[a] + [b] = [a + b]
[a] · [b] = [ab]
This is well defined, as
[a] = [a′] and [b] = [b′] ⇒ [a + b] = [a′ + b′] and [ab] = [a′b′]

Eg 1.32. In Z_5:
[2] + [4] = [1]
[3] + [3] = [1]
[3] · [3] = [4]

1.6.1 Groups

Eg 1.33. (Z_m, +) is a group. What about (Z_m, ×)? The identity will be [1]. So [0] will have no inverse (as [0][a] = [0]). So let
Z_m^* = Z_m − {[0]}
For which m is (Z_m^*, ×) a group?

Eg 1.34. Z_2^* = {[1]}. This is a group.
Z_3^* = {[1], [2]}:
·      [1]   [2]
[1]    [1]   [2]
[2]    [2]   [1]
Compare with S_2 to see it is a group.
Z_4^* = {[1], [2], [3]}:
·      [1]   [2]   [3]
[1]    [1]   [2]   [3]
[2]    [2]   [0]   [2]
[3]    [3]   [2]   [1]
Here [2] ∈ Z_4^*, but [2][2] = [0] ∉ Z_4^*, so closure fails.

Theorem 1.2. (Z_m^*, ×) is a group iff m is a prime number.

Proof.
⇒ Suppose Z_m^* is a group. If m is not a prime, then
m = ab, 1 < a, b < m
so [a], [b] ∈ Z_m^* (neither is [0]), but
[a] · [b] = [ab] = [m] = [0]
This contradicts closure. So m is prime.
⇐ Suppose m is a prime, and write m = p. We show that Z_p^* is a group.
– Closure: Let [a], [b] ∈ Z_p^*. Then [a], [b] ≠ [0], so p ∤ a and p ∤ b. Then p ∤ ab (as p is prime; result from M1F). So
[a][b] = [ab] ≠ [0]
Thus [a][b] ∈ Z_p^*.
– Associativity:
([a][b])[c] = [ab][c] = [(ab)c]
[a]([b][c]) = [a][bc] = [a(bc)]
These are equal, as (ab)c = a(bc) for a, b, c ∈ Z.
– Identity: [1], as [a][1] = [1][a] = [a].
– Inverses: Let [a] ∈ Z_p^*. We want to find [a′] such that [a][a′] = [a′][a] = [1], i.e.
[aa′] = [1]
aa′ ≡ 1 mod p
Here’s how. [a] ≠ [0], so p ∤ a. As p is prime, hcf(p, a) = 1. By M1F, there exist integers s, t ∈ Z with
sp + ta = 1
Then
ta = 1 − sp ≡ 1 mod p
So [t][a] = [1]. Then [t] ∈ Z_p^* (since [t] ≠ [0]) and [t] = [a]^{-1}. □

So Z_p^* (p prime)
(1) is abelian
(2) has p − 1 elements

Eg 1.35. Z_5^* = {[1], [2], [3], [4]}. Is Z_5^* cyclic? Well,
[2]^2 = [4]
[2]^3 = [3]
[2]^4 = [1]
So Z_5^* = ⟨[2]⟩.

Eg 1.36. In the group Z_31^*, what is [7]^{-1}? From the proof above, we want to find s, t with
7s + 31t = 1
Use the Euclidean algorithm:
31 = 4 · 7 + 3
7 = 2 · 3 + 1
So
1 = 7 − 2 · 3
  = 7 − 2(31 − 4 · 7)
  = 9 · 7 − 2 · 31
So [7]^{-1} = [9].

1.7 Applications of the group Z_p^*

Theorem 1.3. (Fermat’s Little Theorem) Let p be a prime, and let n be an integer not divisible by p. Then
n^{p−1} ≡ 1 mod p

Proof. Work in the group Z_p^* = {[1], . . . , [p − 1]}. As p ∤ n, [n] ≠ [0], so [n] ∈ Z_p^*. Now Corollary 2 says: if |G| = k then a^k = e for all a ∈ G. Hence
[n]^{p−1} = identity of Z_p^* = [1]
Since [n]^{p−1} = [n] · · · [n] = [n^{p−1}], we get
[n^{p−1}] = [1] ⇒ n^{p−1} ≡ 1 mod p
(from Proposition 1.9). □

Corollary 4. Let p be prime. Then for all integers n
n^p ≡ n mod p

Proof. If p ∤ n then by FLT
n^{p−1} ≡ 1 mod p
and multiplying by n gives
n^p ≡ n mod p
If p | n then both n^p and n are congruent to 0 mod p. □

Eg 1.37. p = 5: then 13^4 ≡ 1 mod 5. p = 17: then 62^16 ≡ 1 mod 17.

Eg 1.38.
Find the remainder when 6^82 is divided by 17.
6^16 ≡ 1 mod 17 (FLT)
(6^16)^5 = 6^80 ≡ 1 mod 17
6^82 = 6^80 · 6^2 ≡ 6^2 ≡ 2 mod 17
So the remainder is 2.

Second application.

1.7.1 Mersenne Primes

Definition 1.19. A prime number p is called a Mersenne prime if p = 2^n − 1 for some n ∈ N.

Eg 1.39.
2^2 − 1 = 3
2^3 − 1 = 7
2^4 − 1 = 15 (not prime)
2^5 − 1 = 31
2^7 − 1 = 127
The largest known primes are Mersenne primes. The largest known as of 2/2/06 is 2^30402457 − 1.

Connection with perfect numbers

Definition 1.20. A positive integer N is perfect if N is equal to the sum of its positive divisors (including 1, but not N).

Eg 1.40.
6 = 1 + 2 + 3
28 = 1 + 2 + 4 + 7 + 14

Theorem 1.4. (Euler)
(1) If 2^n − 1 is prime then 2^{n−1}(2^n − 1) is perfect.
(2) Every even perfect number is of this form.

Proof.
(1) Sheet 4.
(2) Harder; look it up. □

It is still unsolved whether there is an odd perfect number.

1.7.2 How to find Mersenne Primes

Proposition 1.10. If 2^n − 1 is prime, then n must be prime.

Proof. Suppose n is not prime. So
n = ab, 1 < a, b < n
Then
2^n − 1 = 2^{ab} − 1 = (2^a − 1)(2^{a(b−1)} + 2^{a(b−2)} + · · · + 2^a + 1)
(using x^b − 1 = (x − 1)(x^{b−1} + · · · + x + 1) with x = 2^a). So 2^n − 1 has a factor 2^a − 1 with 1 < 2^a − 1 < 2^n − 1, so it is not prime. Hence 2^n − 1 prime implies n prime. □

Eg 1.41. We know 2^2 − 1, 2^3 − 1, 2^5 − 1, 2^7 − 1 are prime. The next cases are
2^11 − 1, 2^13 − 1, 2^17 − 1
Are these prime? We will answer this using the group Z_p^*. We will need

Proposition 1.11. Let G be a group, and let a ∈ G. Suppose a^n = e. Then o(a) | n.

Proof. Let K = o(a). Write
n = qK + r, 0 ≤ r < K
Then
e = a^n = a^{qK+r}
  = a^{qK} a^r = (a^K)^q a^r
  = e^q a^r
  = a^r
So a^r = e. Since K is the smallest positive integer such that a^K = e and 0 ≤ r < K, this forces r = 0. Hence K = o(a) divides n. □

Proposition 1.12. Let N = 2^p − 1, p prime. Let q be prime, and suppose q | N. Then q ≡ 1 mod p.

Proof. q | N means N ≡ 0 mod q, i.e.
2^p ≡ 1 mod q
This means that
[2]^p = [1] in Z_q^*
We know that Z_q^* is a group of order q − 1.
We also know, by Proposition 1.11, that o([2]) in Z_q^* divides p, so it is 1 or p as p is prime. If o([2]) = 1, then [2] = [1] in Z_q^*, that is
2 ≡ 1 mod q
1 ≡ 0 mod q
so q | 1, a contradiction. Hence we must have o([2]) = p. By Corollary 1,
o([2]) divides |Z_q^*| = q − 1
That is, p divides q − 1:
q − 1 ≡ 0 mod p
q ≡ 1 mod p □

Test for a Mersenne prime

Let N = 2^p − 1. List all the primes q with q ≡ 1 mod p and q ≤ √N and check, one by one, to see if any divide N. If none of them divide N, then N is prime.

Eg 1.42. p = 11. N = 2^p − 1 = 2047, and √N < 50. Which primes q less than 50 have q ≡ 1 mod 11? We check through all numbers congruent to 1 mod 11:
12, 23, 34, 45
The only prime less than 50 that can possibly divide 2047 is 23. Now we check to see if 23 | 2^11 − 1, i.e. if 2^11 ≡ 1 mod 23.
2^5 = 32 ≡ 9 mod 23
2^10 = (2^5)^2 ≡ 9^2 = 81 ≡ 12 mod 23
2^11 ≡ 24 ≡ 1 mod 23
Conclusion: 2^11 − 1 is not a prime; it has a factor of 23 (2047 = 23 × 89).

Eg 1.43. 2^13 − 1 is prime (exercise sheet).

1.8 Proof of Lagrange’s Theorem

Now we have to prove Lagrange’s Theorem.

Theorem 1.5. Let G be a finite group of order |G|, with a subgroup H of order |H| = m. Then m divides |G|.

Note 1.5. The idea: write H = {h_1, . . . , h_m}. Then we divide G into “blocks”, numbered 1, 2, 3, . . . , r:
H       Hx       Hy       · · ·
h_1     h_1 x    h_1 y    · · ·
h_2     h_2 x    h_2 y    · · ·
⋮       ⋮        ⋮
h_m     h_m x    h_m y    · · ·
We want the blocks to have the following three properties:
(1) Each block has m distinct elements
(2) No element of G belongs to two blocks
(3) Every element of G belongs to (exactly) one block
Then |G| is the total number of elements listed in the blocks, i.e. rm, so m divides |G|.

Definition 1.21. For x ∈ G and H a subgroup of G, define the right coset
Hx = {hx | h ∈ H} = {h_1 x, h_2 x, . . . , h_m x}
The official name for a “block” is a right coset.

Note 1.6. Hx ⊆ G

Eg 1.44. G = S_3, H = ⟨a⟩ = {e, a, a^2}, a = (1 2 3).
H = He = Ha = Ha2 = ea2 , aa2 , a2 a2 2 a , e, a Take b = (1 2), so b2 = e, Hb = eb, ab, a2 b = b, ab, a2 b e a a2 b ab a2 n Lemma 1.2. For any x in G |Hx| = m Proof. By definition, we have Hx = {h1 , x, . . . , hm x} These elements are all different, as hi x = hj x hi xx−1 = hj xx−1 hi = hj So |Hx| = m. 2 Lemma 1.3. If x, y ∈ G then either Hx = Hy or Hx ∩ Hy = ∅. Proof. Suppose Hx ∩ Hy 6= ∅ We will show this implies Hx = Hy. We can choose an element a ∈ Hx ∩ Hy. Then a = hi x a = hj y 31 1.8. PROOF OF LAGRANGE’S THEOREM Algebra I – lecture notes for some hi , hj ∈ H. a =i x = hi y x = h−1 i hj y Then for any h ∈ H hx = hh−1 i hj y As H is a subgroup, hh−1 i hj ∈ H. Hence hx ∈ Hy This shows Hx ⊆ Hy. Similarly hi x = hj y y = h−1 j hi x so for any h ∈ H So Hy ⊆ Hx. We conclude Hx = Hy. hy = hh−1 j hi x ∈ Hx 2 Lemma 1.4. Let x ∈ G. Then x lies in the right coset Hx. Proof. As H is a subgroup, e ∈ H. So x = ex ∈ Hx. 2 Theorem 1.6. Let G be a finite group of order |G|, with a subgroup H of order |H| = m. Then m divides |G|. Proof. By 1.4, G is equal to the union all the right cosets of H, i.e. [ G= Hx x∈G Some of these right cosets will be equal (eg. G = S3 , H = hai, then H = He = Ha = Ha2 ). Let the list of different right cosets be Hx1 , . . . , Hxr Then G = Hx1 ∪ Hx2 ∪ · · · ∪ Hxr and Hxi 6= Hxj if i 6= j (eg. in G = S3 , G = H ∪ Hb). By 1.3, Hxi ∩ Hxj = ∅ if i 6= j. Picture G = Hx1 Hx2 32 ··· Hxr (1.1) Algebra I – lecture notes 1.8. PROOF OF LAGRANGE’S THEOREM So |G| = |Hx1 | + · · · + |Hxr |. By 1.2 |Hxi | = m = |H| So |G| = rm = r|H| Therefore |H| divides |G|. 2 Proposition 1.13. Let G be a finite group, and H a subgroup of G. Let r= |G| |H| Then there are exactly r different right cosets of H in G, say Hx1 , . . . Hxr They are disjoint, and G = Hx1 ∪ · · · ∪ Hxr Definition 1.22. The integer r = |G| |H| is called the index of H in G, written r = |G : H| Eg 1.45. (1) G = S3 , H = hai = {e, a, a2 }. Index |G : H| = Hb and G = H ∪ Hb. 6 3 = 2. 
There are 2 right cosets, H and Hb.

(2) G = S_3, K = ⟨b⟩ = {e, b} where b = (1 2)(3). Index |G : K| = 6/2 = 3. So there are 3 right cosets; they are

  Ke = K = {e, b}
  Ka = {a, ba} = {a, a^2 b}
  Ka^2 = {a^2, ba^2} = {a^2, ab}

Chapter 2
Vector Spaces and Linear Algebra

Recall

  R^n = {(x_1, x_2, ..., x_n) | x_i ∈ R}

Basic operations on R^n:
• addition: (x_1, ..., x_n) + (y_1, ..., y_n) = (x_1 + y_1, ..., x_n + y_n)
• scalar multiplication: λ(x_1, ..., x_n) = (λx_1, ..., λx_n), λ ∈ R

These operations satisfy the following rules:

• Addition rules
  A1 u + (v + w) = (u + v) + w   (associativity)
  A2 v + 0 = 0 + v = v   (identity)
  A3 v + (−v) = 0   (inverses)
  A4 u + v = v + u   (abelian)
  (These say (R^n, +) is an abelian group.)

• Scalar multiplication rules
  S1 λ(v + w) = λv + λw
  S2 (λ + µ)v = λv + µv
  S3 λ(µv) = (λµ)v
  S4 1v = v

These are easily proved for R^n:

Eg 2.1.
A1
  u + (v + w) = (u_1, ..., u_n) + ((v_1, ..., v_n) + (w_1, ..., w_n))
              = (u_1, ..., u_n) + (v_1 + w_1, ..., v_n + w_n)
              = (u_1 + (v_1 + w_1), ...)
              = ((u_1 + v_1) + w_1, ...)
              = ((u_1, ...) + (v_1, ...)) + (w_1, ...)
              = (u + v) + w
S3
  λ(µv) = λ(µv_1, ..., µv_n)
        = (λ(µv_1), ..., λ(µv_n))
        = ((λµ)v_1, ..., (λµ)v_n)   (assoc. of (R, ×))
        = (λµ)v

2.1 Definition of a vector space

A vector space will be a set of objects with addition and scalar multiplication defined, satisfying the above axioms. We want to let the scalars be either R or C (or a lot of other things). So let

  F = either R or C

Definition 2.1. A vector space over F is a set V of objects called vectors together with a set of scalars F and with
• a rule for adding any two vectors v, w ∈ V to get a vector v + w ∈ V
• a rule for multiplying any vector v ∈ V by any scalar λ ∈ F to get a vector λv ∈ V
• a zero vector 0 ∈ V
• for any v ∈ V, a vector −v ∈ V
such that the axioms A1-A4 and S1-S4 are satisfied.

There are many different types of vector spaces.

Eg 2.2.
(1) R^n is a vector space over R.

(2) C^n = {(z_1, ..., z_n) | z_i ∈ C} with addition u + v and scalar multiplication λv (λ ∈ C) is a vector space over C.

(3) Let m, n ∈ N. Define

  M_{m,n} = set of all m × n matrices with real entries

(So in this example, "vectors" are matrices.) Adopt the usual rules for addition and scalar multiplication of matrices: for A = (a_ij), B = (b_ij), λ ∈ R,

  A + B = (a_ij + b_ij)
  λA = (λa_ij)

The zero vector is the matrix 0 (the m × n zero matrix), and −A = (−a_ij). Then M_{m,n} becomes a vector space over R (check axioms).

(4) A non-example: Let V = R^2, with the usual addition and a new scalar multiplication:

  λ ∗ (x_1, x_2) = (λx_1, 0)

Let's check the axioms:
 – A1-A4 hold
 – S1 λ ∗ (v + w) = λ ∗ v + λ ∗ w holds
 – S2 (λ + µ) ∗ v = λ ∗ v + µ ∗ v holds
 – S3 λ ∗ (µ ∗ v) = (λµ) ∗ v holds
 – S4 1 ∗ v = v fails. To show this, we need to produce just one v for which it fails, e.g.

  1 ∗ (17, 259) = (17, 0) ≠ (17, 259)

(5) Functions. Let

  V = set of all functions f : R → R

So "vectors" are functions.
 – Addition: f + g is the function x ↦ f(x) + g(x)
 – Scalar multiplication: λf is the function x ↦ λf(x) (λ ∈ R)
 – Zero vector is the function 0 : x ↦ 0
 – Inverses: −f is the function x ↦ −f(x)

Check the axioms, e.g. A1, using associativity of R:

  (f + (g + h))(x) = f(x) + (g + h)(x)
                   = f(x) + (g(x) + h(x))
                   = (f(x) + g(x)) + h(x)
                   = (f + g)(x) + h(x)
                   = ((f + g) + h)(x)

Conclude V is a vector space over R.

(6) Polynomials. Recall a polynomial over R is an expression

  p(x) = a_0 + a_1 x + ... + a_n x^n

with all a_i ∈ R. Let

  V = set of all polynomials over R

 – Addition: if p(x) = Σ a_i x^i, q(x) = Σ b_i x^i then p(x) + q(x) = Σ (a_i + b_i) x^i
 – Scalar multiplication: if p(x) = Σ a_i x^i, λ ∈ R then λp(x) = Σ λa_i x^i
 – Zero vector is 0, the polynomial with all coefficients 0
 – Negative of p(x) = Σ a_i x^i is −p(x) = Σ (−a_i) x^i

Now check A1-A4, S1-S4. So V is a vector space over R.
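The non-example (4) can also be explored numerically. The following is a minimal sketch in Python, not part of the original notes: the helper names `add`, `smul` and `check_axioms`, and the choice of sample vectors and scalars, are assumptions made for illustration. It spot-checks S1-S4 on a few integer samples; passing such a check is not a proof that an axiom holds, but a single failure (as for S4) is a genuine counterexample.

```python
from itertools import product

def add(v, w):
    """Usual addition on R^2."""
    return (v[0] + w[0], v[1] + w[1])

def smul(lam, v):
    """The modified scalar multiplication: lam * (x1, x2) = (lam*x1, 0)."""
    return (lam * v[0], 0)

# Hypothetical sample data; integers keep every comparison exact.
SAMPLES = [(17, 259), (1, 0), (0, 1), (-3, 4)]
SCALARS = [-2, 0, 1, 3]

def check_axioms():
    """Return, for each of S1-S4, whether it held on every sample tried."""
    ok = {"S1": True, "S2": True, "S3": True, "S4": True}
    for lam, mu in product(SCALARS, repeat=2):
        for v, w in product(SAMPLES, repeat=2):
            ok["S1"] &= smul(lam, add(v, w)) == add(smul(lam, v), smul(lam, w))
            ok["S2"] &= smul(lam + mu, v) == add(smul(lam, v), smul(mu, v))
            ok["S3"] &= smul(lam, smul(mu, v)) == smul(lam * mu, v)
            ok["S4"] &= smul(1, v) == v
    return ok

if __name__ == "__main__":
    print(check_axioms())
    print(smul(1, (17, 259)))  # the counterexample used in the text
```

On these samples only S4 is reported as failing, matching the counterexample 1 ∗ (17, 259) = (17, 0) given above.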
Consequence of axioms

Proposition 2.1. Let V be a vector space over F and let v ∈ V, λ ∈ F. Then
(1) 0v = 0
(2) λ0 = 0
(3) if λv = 0 then λ = 0 or v = 0
(4) (−λ)v = −λv = λ(−v)

Proof.
(1) Observe

  0v = (0 + 0)v = 0v + 0v   by S2
  0v + (−0v) = (0v + 0v) + (−0v)
  0 = 0v

(2)
  λ0 = λ(0 + 0) = λ0 + λ0   by S1
  0 = λ0

Parts (3), (4): Ex. sheet 5. □

2.2 Subspaces

Definition 2.2. Let V be a vector space over F, and let W ⊆ V. Say W is a subspace of V if W is itself a vector space, with the same addition and scalar multiplication as V.

Criterion for subspaces

Proposition 2.2. W is a subspace of a vector space V if the following hold:
(1) 0 ∈ W
(2) if v, w ∈ W then v + w ∈ W
(3) if w ∈ W, λ ∈ F then λw ∈ W

Proof. Assume (1), (2), (3). We show W is a vector space.
• Addition and scalar multiplication on W are defined by (2), (3).
• Zero vector 0 ∈ W by (1).
• Negative −w = (−1)w ∈ W by (3).
Finally, A1-A4, S1-S4 hold for W since they hold for V. □

Eg 2.3.
1. V is a subspace of itself.

2. {0} is a subspace of any vector space.

3. Let V = R^2 and

  W = {(x_1, x_2) | x_1 + 2x_2 = 0}

Claim: W is a subspace of R^2.
Proof. Check (1)-(3) from the proposition.
(1) 0 ∈ W since 0 + 2 · 0 = 0.
(2) Let v = (v_1, v_2) ∈ W, w = (w_1, w_2) ∈ W. So v_1 + 2v_2 = w_1 + 2w_2 = 0. Hence v_1 + w_1 + 2(v_2 + w_2) = 0, so v + w = (v_1 + w_1, v_2 + w_2) ∈ W.
(3) Let v = (v_1, v_2) ∈ W, λ ∈ R. Then v_1 + 2v_2 = 0, so λv_1 + 2λv_2 = 0, so λv = (λv_1, λv_2) ∈ W.
So W is a subspace by 2.2. □

4. The same proof shows that any line through 0 (i.e. px_1 + qx_2 = 0) is a subspace of R^2.

Note 2.1. A line not through the origin is not a subspace (no zero vector). The only subspaces of R^2 are: lines through 0, R^2 itself, {0}.

5. Let V = vector space of polynomials over R. Define

  W = polynomials of degree at most 3

(recall deg(p(x)) = highest power of x appearing in p(x)). Claim: W is a subspace of V.
Proof.
(1) 0 ∈ W
(2) if p(x), q(x) ∈ W then deg(p), deg(q) ≤ 3, hence deg(p + q) ≤ 3, so p + q ∈ W.
(3) if p(x) ∈ W, λ ∈ R, then λp(x) has degree at most 3, so λp(x) ∈ W. □

2.3 Solution spaces

A vast collection of subspaces of R^n is provided by the following.

Proposition 2.3. Let A be an m × n matrix with real entries and let

  W = {x ∈ R^n | Ax = 0}

(the set of solutions of the system of linear equations Ax = 0). Then W is a subspace of R^n.

Proof. We check the 3 conditions of 2.2.
(1) 0 ∈ W (as A0 = 0)
(2) if v, w ∈ W then Av = Aw = 0. Hence A(v + w) = Av + Aw = 0, so v + w ∈ W
(3) if v ∈ W (so Av = 0) and λ ∈ R, then A(λv) = λ(Av) = λ0 = 0, so λv ∈ W □

Definition 2.3. The system Ax = 0 is a homogeneous system of linear equations, and W is called the solution space.

Eg 2.4.
1. m = 1, n = 2, A = (a b). Then

  W = {x ∈ R^2 | ax_1 + bx_2 = 0}

which is a line through 0.

2. m = 1, n = 3, A = (a b c). Then

  W = {x ∈ R^3 | ax_1 + bx_2 + cx_3 = 0}

a plane through 0.

3. m = 2, n = 4,

$$A = \begin{pmatrix} 1 & 2 & 1 & 0 \\ -1 & 0 & 1 & 2 \end{pmatrix}$$

Here

  W = {x ∈ R^4 | x_1 + 2x_2 + x_3 = 0, −x_1 + x_3 + 2x_4 = 0}

4. Try a non-linear equation:

  W = {(x_1, x_2) ∈ R^2 | x_1 x_2 = 0}

Is W a subspace of R^2? The answer is no. To show this, we need a single counterexample to one of the conditions of 2.2, e.g. (1, 0), (0, 1) ∈ W, but (1, 0) + (0, 1) = (1, 1) ∉ W.

2.4 Linear Combinations

Definition 2.4. Let V be a vector space over F and let v_1, v_2, ..., v_k be vectors in V. A vector v ∈ V of the form

  v = λ_1 v_1 + λ_2 v_2 + ... + λ_k v_k   (λ_i ∈ F)

is called a linear combination of v_1, ..., v_k.

Eg 2.5.
1. V = R^2. Let v_1 = (1, 1). The linear combinations of v_1 are the vectors

  v = λv_1 = (λ, λ)   (λ ∈ R)

These form the line through the origin and v_1, i.e. x_1 − x_2 = 0.

2. V = R^2. Let v_1 = (1, 0), v_2 = (0, 1). The linear combinations of v_1, v_2 are

  λ_1 v_1 + λ_2 v_2 = (λ_1, λ_2)

So every vector in R^2 is a linear combination of v_1, v_2.

3. V = R^3.
Let v_1 = (1, 1, 1), v_2 = (2, 2, −1). A typical linear combination is

  λ_1 v_1 + λ_2 v_2 = (λ_1 + 2λ_2, λ_1 + 2λ_2, λ_1 − λ_2)

This gives all vectors in the plane containing the origin, v_1, v_2, which is x_1 − x_2 = 0. So e.g. (1, 0, 0) is not a linear combination of v_1, v_2.

2.5 Span

Definition 2.5. Let V be a vector space over F, and let v_1, ..., v_k be vectors in V. Define the span of v_1, ..., v_k, written Sp(v_1, ..., v_k), to be the set of all linear combinations of v_1, ..., v_k. In other words

  Sp(v_1, ..., v_k) = {λ_1 v_1 + ... + λ_k v_k | λ_i ∈ F} ⊆ V

Eg 2.6.
1. V = R^2, any v_1 ∈ V. Then

  Sp(v_1) = all vectors λv_1 (λ ∈ R) = line through 0, v_1

2. In R^2, Sp((1, 0), (0, 1)) = R^2

3. In R^3, v_1 = (1, 1, 1), v_2 = (2, 2, −1):

  Sp(v_1, v_2) = plane containing 0, v_1, v_2 = plane x_1 = x_2

4. In R^3, with v_1 = (1, 0, 0), v_2 = (0, 1, 0), v_3 = (0, 0, 1):

  Sp(v_1, v_2, v_3) = whole of R^3

5. V = R^3. Let

  w_1 = (1, 0, 0)
  w_2 = (1, 1, 0)
  w_3 = (1, 1, 1)

Claim: Sp(w_1, w_2, w_3) = R^3.
Proof. With v_1, v_2, v_3 as in 4, observe

  v_1 = w_1
  v_2 = w_2 − w_1
  v_3 = w_3 − w_2

Hence any linear combination of v_1, v_2, v_3 is also a linear combination of w_1, w_2, w_3, i.e.

  (λ_1, λ_2, λ_3) = λ_1 v_1 + λ_2 v_2 + λ_3 v_3 = λ_1 w_1 + λ_2 (w_2 − w_1) + λ_3 (w_3 − w_2) ∈ Sp(w_1, w_2, w_3) □

6. V = vector space of polynomials over R. Let

  v_1 = 1, v_2 = x, v_3 = x^2

Then

  Sp(v_1, v_2, v_3) = {λ_1 v_1 + λ_2 v_2 + λ_3 v_3 | λ_i ∈ R}
                    = {λ_1 + λ_2 x + λ_3 x^2 | λ_i ∈ R}
                    = set of all polynomials of degree ≤ 2

Eg 2.7. In general, if v_1, v_2 are vectors in R^3, not on the same line through 0 (i.e. v_2 ≠ λv_1), then

  Sp(v_1, v_2) = plane through 0, v_1, v_2

Proposition 2.4. V vector space, v_1, ..., v_k ∈ V. Then Sp(v_1, ..., v_k) is a subspace of V.

Proof. Check the conditions of 2.2.
(1) Taking all λ_i = 0 (using 2.1)

  0v_1 + 0v_2 + ... + 0v_k = 0 + ... + 0 = 0

So 0 is a linear combination of v_1, ..., v_k, so 0 ∈ Sp(v_1, ..., v_k).
(2) Let v, w ∈ Sp(v_1, ...
, v_k), so

  v = λ_1 v_1 + ... + λ_k v_k
  w = µ_1 v_1 + ... + µ_k v_k

Then v + w = (λ_1 + µ_1)v_1 + ... + (λ_k + µ_k)v_k ∈ Sp(v_1, ..., v_k).
(3) Let v ∈ Sp(v_1, ..., v_k), λ ∈ F, so v = λ_1 v_1 + ... + λ_k v_k, so

  λv = (λλ_1)v_1 + ... + (λλ_k)v_k ∈ Sp(v_1, ..., v_k) □

2.6 Spanning sets

Definition 2.6. V vector space, W a subspace of V. We say vectors v_1, ..., v_k span W if
(1) v_1, ..., v_k ∈ W and
(2) W = Sp(v_1, ..., v_k)
Call the set {v_1, ..., v_k} a spanning set of W.

Eg 2.8.
• {(1, 0, 0), (1, 1, 0), (1, 1, 1)} is a spanning set for R^3.
• (1, 1, 1), (2, 2, −1) span the plane x_1 − x_2 = 0.
• Let

$$W = \left\{ x \in \mathbb{R}^4 \;\middle|\; \begin{pmatrix} 1 & 1 & 3 & 1 \\ 2 & 3 & 1 & 1 \\ 1 & 0 & 8 & 2 \end{pmatrix} x = 0 \right\}$$

Find a (finite) spanning set for W.

Solve the system:

$$\left(\begin{array}{cccc|c} 1 & 1 & 3 & 1 & 0 \\ 2 & 3 & 1 & 1 & 0 \\ 1 & 0 & 8 & 2 & 0 \end{array}\right) \to \left(\begin{array}{cccc|c} 1 & 1 & 3 & 1 & 0 \\ 0 & 1 & -5 & -1 & 0 \\ 0 & -1 & 5 & 1 & 0 \end{array}\right) \to \left(\begin{array}{cccc|c} 1 & 1 & 3 & 1 & 0 \\ 0 & 1 & -5 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right)$$

Echelon form:

  x_1 + x_2 + 3x_3 + x_4 = 0
  x_2 − 5x_3 − x_4 = 0

General solution:

  x_4 = a
  x_3 = b
  x_2 = a + 5b
  x_1 = −a − 3b − (a + 5b) = −2a − 8b

i.e. x = (−2a − 8b, a + 5b, b, a). So

  W = {(−2a − 8b, a + 5b, b, a) | a, b ∈ R}

Define two vectors (take a = 1 and b = 0, and vice versa):

  w_1 = (−2, 1, 0, 1)   (a = 1, b = 0)
  w_2 = (−8, 5, 1, 0)   (a = 0, b = 1)

Claim: W = Sp(w_1, w_2).
Proof. Observe

  (−2a − 8b, a + 5b, b, a) = a(−2, 1, 0, 1) + b(−8, 5, 1, 0) = aw_1 + bw_2 □

This gives a general method of finding spanning sets of solution spaces.

2.7 Linear dependence and independence

Definition 2.7. V vector space over F. We say a set of vectors v_1, ..., v_k in V is a linearly independent set if the following condition holds:

  λ_1 v_1 + ... + λ_k v_k = 0  ⇒  all λ_i = 0

Usually we just say the vectors v_1, ..., v_k are linearly independent vectors. We say the set {v_1, ..., v_k} is linearly dependent if the opposite is true, i.e. if we can find scalars λ_i such that
(1) λ_1 v_1 + ... + λ_k v_k = 0
(2) at least one λ_i ≠ 0

Eg 2.9.
1. V = R^2, v_1 = (1, 1).
Then {v_1} is a linearly independent set, as

  λv_1 = 0  ⇒  (λ, λ) = (0, 0)  ⇒  λ = 0

2. V = R^2, the set {0} is linearly dependent, e.g. 2 · 0 = 0.

3. In R^2, let v_1 = (1, 1), v_2 = (2, 1). Is {v_1, v_2} linearly independent? Consider the equation

  λ_1 v_1 + λ_2 v_2 = 0

i.e. (λ_1, λ_1) + (2λ_2, λ_2) = (0, 0), i.e.

  λ_1 + 2λ_2 = 0
  λ_1 + λ_2 = 0

which forces λ_1 = λ_2 = 0. So {v_1, v_2} is linearly independent.

4. In R^3, let

  v_1 = (1, 0, 1)
  v_2 = (2, 2, −1)
  v_3 = (1, 4, −5)

Are v_1, v_2, v_3 linearly independent? Consider the system

  x_1 v_1 + x_2 v_2 + x_3 v_3 = 0   (2.1)

This is the system of linear equations

$$\begin{pmatrix} 1 & 2 & 1 \\ 0 & 2 & 4 \\ 1 & -1 & -5 \end{pmatrix} x = 0$$

(i.e. (v_1 v_2 v_3) x = 0, with the v_i written as columns). Solve:

$$\left(\begin{array}{ccc|c} 1 & 2 & 1 & 0 \\ 0 & 2 & 4 & 0 \\ 1 & -1 & -5 & 0 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 2 & 1 & 0 \\ 0 & 2 & 4 & 0 \\ 0 & -3 & -6 & 0 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 2 & 1 & 0 \\ 0 & 2 & 4 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right)$$

Solution x = (3a, −2a, a) (any a). So

  3v_1 − 2v_2 + v_3 = 0

So v_1, v_2, v_3 are linearly dependent.

Geometrically, v_1, v_2 span a plane in R^3 and v_3 = −3v_1 + 2v_2 ∈ Sp(v_1, v_2) is in this plane. In general: in R^3, three vectors are linearly dependent iff they are coplanar.

5. V = vector space of polynomials over R. Let

  p_1(x) = 1 + x^2
  p_2(x) = 2 + 2x − x^2
  p_3(x) = 1 + 4x − 5x^2

Are p_1, p_2, p_3 linearly dependent? Consider the equation

  λ_1 p_1 + λ_2 p_2 + λ_3 p_3 = 0

Equating coefficients:

  λ_1 + 2λ_2 + λ_3 = 0
  2λ_2 + 4λ_3 = 0
  λ_1 − λ_2 − 5λ_3 = 0

We showed in the previous example that a solution is λ_1 = 3, λ_2 = −2, λ_3 = 1. So

  3p_1 − 2p_2 + p_3 = 0

So p_1, p_2, p_3 are linearly dependent.

6. V = vector space of functions R → R. Let

  f_1(x) = sin x, f_2(x) = cos x

So f_1, f_2 ∈ V. Are f_1, f_2 linearly independent? Sheet 6.

Two basic results about linearly independent sets.

Proposition 2.5. Any subset of a linearly independent set of vectors is linearly independent.

Proof. Let S be a lin. indep. set of vectors, and T ⊆ S. Label the vectors in S, T:

  T = {v_1, ..., v_t}
  S = {v_1, ..., v_t, v_{t+1}, ..., v_s}

Suppose

  λ_1 v_1 + ... + λ_t v_t = 0

Then

  λ_1 v_1 + ... + λ_t v_t + 0v_{t+1} + ... + 0v_s = 0

As S is lin.
indep., all coeffs must be 0, so all λ_i = 0. Thus T is lin. indep. □

Proposition 2.6. V vector space, v_1, ..., v_k ∈ V. Then the following two statements are equivalent (i.e. (1) ⇔ (2)).
(1) v_1, ..., v_k are lin. dependent
(2) there exists i such that v_i is a linear combination of v_1, ..., v_{i−1}.

Proof.
(1) ⇒ (2) Suppose v_1, ..., v_k is lin. dep., so there exist λ_i such that

  λ_1 v_1 + ... + λ_k v_k = 0

and λ_j ≠ 0 for some j. Choose the largest j for which λ_j ≠ 0. So

  λ_1 v_1 + ... + λ_j v_j = 0

Then

  λ_j v_j = −λ_1 v_1 − ... − λ_{j−1} v_{j−1}

So

  v_j = −(λ_1 / λ_j) v_1 − ... − (λ_{j−1} / λ_j) v_{j−1}

which is a linear combination of v_1, ..., v_{j−1}.

(1) ⇐ (2) Assume v_i is a linear combination of v_1, ..., v_{i−1}, say

  v_i = λ_1 v_1 + ... + λ_{i−1} v_{i−1}

Then

  λ_1 v_1 + ... + λ_{i−1} v_{i−1} − v_i + 0v_{i+1} + ... + 0v_k = 0

Not all the coefficients in this equation are zero (the coefficient of v_i is −1). So v_1, ..., v_k are lin. dependent. □

Eg 2.10. v_1 = (1, 0, 1), v_2 = (2, 2, −1), v_3 = (1, 4, −5) in R^3. These are linearly dependent: 3v_1 − 2v_2 + v_3 = 0. And v_3 = −3v_1 + 2v_2 is a linear combination of the previous ones.

Proposition 2.7. V vector space, v_1, ..., v_k ∈ V. Suppose v_i is a linear combination of v_1, ..., v_{i−1}. Then

  Sp(v_1, ..., v_k) = Sp(v_1, ..., v_{i−1}, v_{i+1}, ..., v_k)

(i.e. throwing out v_i does not change Sp(v_1, ..., v_k)).

Proof. Let

  v_i = µ_1 v_1 + ... + µ_{i−1} v_{i−1}   (µ_j ∈ F)

Now consider v = λ_1 v_1 + ... + λ_k v_k ∈ Sp(v_1, ..., v_k). Then

  v = λ_1 v_1 + ... + λ_{i−1} v_{i−1} + λ_i (µ_1 v_1 + ... + µ_{i−1} v_{i−1}) + λ_{i+1} v_{i+1} + ... + λ_k v_k

So v is a lin. comb. of v_1, ..., v_{i−1}, v_{i+1}, ..., v_k. Therefore Sp(v_1, ..., v_k) ⊆ Sp(v_1, ..., v_{i−1}, v_{i+1}, ..., v_k). The reverse inclusion is clear, so the two spans are equal. □

Eg 2.11. v_1 = (1, 0, 1), v_2 = (2, 2, −1), v_3 = (1, 4, −5). Here v_3 = −3v_1 + 2v_2. So Sp(v_1, v_2, v_3) = Sp(v_1, v_2).

2.8 Bases

Definition 2.8. V a vector space. We say a set of vectors {v_1, ...
, v_k} in V is a basis of V if
(1) V = Sp(v_1, ..., v_k)
(2) {v_1, ..., v_k} is a linearly independent set.

Informally, a basis is a spanning set for which we cannot throw any of the vectors away.

Eg 2.12.
1. {v_1, v_2} with v_1 = (1, 0), v_2 = (0, 1) is a basis of R^2.
Proof.
(1) (x_1, x_2) = x_1 v_1 + x_2 v_2, so R^2 = Sp(v_1, v_2)
(2) v_1, v_2 are linearly independent as

  λ_1 v_1 + λ_2 v_2 = 0  ⇒  (λ_1, λ_2) = (0, 0)  ⇒  λ_1 = λ_2 = 0 □

2. v_1 = (1, 0, 0), v_2 = (1, 1, 0), v_3 = (1, 1, 1) is a basis of R^3.
Proof.
(1) They span R^3: previous example.
(2) x_1 v_1 + x_2 v_2 + x_3 v_3 = 0 leads to the system

$$\left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \end{array}\right)$$

with the only solution x_1 = x_2 = x_3 = 0. ∴ v_1, v_2, v_3 are lin. indep. □

Theorem 2.1. Let V be a vector space with a spanning set v_1, ..., v_k (i.e. V = Sp(v_1, ..., v_k)). Then there is a subset of {v_1, ..., v_k} which is a basis of V.

Proof. Consider the set v_1, ..., v_k. We throw away vectors in this list which are linear combinations of the previous vectors in the list. We end up with a basis. Process:

Casting out Process
First, throw away any zero vectors in the list.
• Start at v_2: if it is a linear combination of v_1 (i.e. v_2 = λv_1), then delete it; if not, leave it there.
• Now consider v_3: if it is a linear combination of the remaining previous vectors, delete it; if not, leave it there.
• Continue, moving from left to right, deleting any v_i which is a linear combination of previous vectors in the list.

We end up with a subset {w_1, ..., w_m} of {v_1, ..., v_k} such that
(1) V = Sp(w_1, ..., w_m) (by 2.7)
(2) no w_i is a linear combination of previous ones.
Then {w_1, ..., w_m} form a linearly independent set by 2.6. Therefore {w_1, ..., w_m} is a basis of V. □

Eg 2.13.
1. V = R^3, v_1 = (1, 0, 1), v_2 = (2, 2, −1), v_3 = (1, 4, −5). Let W = Sp(v_1, v_2, v_3). Find a basis of W.
1) Is v_2 a linear combination of v_1? No: leave it in.
2) Is v_3 a linear combination of v_1, v_2?
Yes: v_3 = −3v_1 + 2v_2. So cast out v_3: a basis for W is {v_1, v_2}.

2. Here's a meatier example of the Casting out Process. Let V = R^4 and

  v_1 = (1, −2, 3, 1)
  v_2 = (2, 2, −2, 1)
  v_3 = (5, 2, −1, 3)
  v_4 = (11, 2, 1, 7)
  v_5 = (2, 8, 2, 3)

Let W = Sp(v_1, ..., v_5), a subspace of R^4. Find a basis of W.

The all-in-one-go method: form the 5 × 4 matrix with rows v_1, ..., v_5 and reduce it to echelon form, keeping track of each row as a combination of the v_i:

$$\begin{pmatrix} 1 & -2 & 3 & 1 \\ 2 & 2 & -2 & 1 \\ 5 & 2 & -1 & 3 \\ 11 & 2 & 1 & 7 \\ 2 & 8 & 2 & 3 \end{pmatrix} \begin{matrix} v_1 \\ v_2 \\ v_3 \\ v_4 \\ v_5 \end{matrix}$$

$$\to \begin{pmatrix} 1 & -2 & 3 & 1 \\ 0 & 6 & -8 & -1 \\ 0 & 12 & -16 & -2 \\ 0 & 24 & -32 & -4 \\ 0 & 12 & -4 & 1 \end{pmatrix} \begin{matrix} v_1 \\ v_2 - 2v_1 \\ v_3 - 5v_1 \\ v_4 - 11v_1 \\ v_5 - 2v_1 \end{matrix}$$

$$\to \begin{pmatrix} 1 & -2 & 3 & 1 \\ 0 & 6 & -8 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 12 & 3 \end{pmatrix} \begin{matrix} v_1 \\ v_2 - 2v_1 \\ v_3 - 5v_1 - 2(v_2 - 2v_1) \\ v_4 - 11v_1 - 4(v_2 - 2v_1) \\ v_5 - 2v_1 - 2(v_2 - 2v_1) \end{matrix}$$

So v_3 is a linear combination of v_1 and v_2: cast it out. And v_4 is a linear combination of previous ones: cast it out. The nonzero row vectors in echelon form are linearly independent. So the last row v_5 + 2v_1 − 2v_2 is not a linear combination of the first two rows. So v_5 is not a linear combination of v_1, v_2. Conclude: a basis of W is {v_1, v_2, v_5}.

To help with spanning calculations:

Eg 2.14. Let v_1 = (1, 2, −1), v_2 = (2, 0, 1), v_3 = (0, −1, 3), v_4 = (1, 2, 3). Do v_1, v_2, v_3, v_4 span the whole of R^3?

Let b ∈ R^3. Then b ∈ Sp(v_1, v_2, v_3, v_4) iff the system x_1 v_1 + x_2 v_2 + x_3 v_3 + x_4 v_4 = b has a solution for x_1, x_2, x_3, x_4 ∈ R. This system is

$$\left(\begin{array}{cccc|c} 1 & 2 & 0 & 1 & b_1 \\ 2 & 0 & -1 & 2 & b_2 \\ -1 & 1 & 3 & 3 & b_3 \end{array}\right) \to \left(\begin{array}{cccc|c} 1 & 2 & 0 & 1 & b_1 \\ 0 & -4 & -1 & 0 & b_2 - 2b_1 \\ 0 & 3 & 3 & 4 & b_3 + b_1 \end{array}\right) \to \left(\begin{array}{cccc|c} 1 & 2 & 0 & 1 & b_1 \\ 0 & -4 & -1 & 0 & b_2 - 2b_1 \\ 0 & 0 & 9 & 16 & \cdots \end{array}\right)$$

This system has a solution for any b ∈ R^3. Hence Sp(v_1, ..., v_4) = R^3.

2.9 Dimension

Definition 2.9. A vector space V is finite-dimensional if it has a finite spanning set, i.e. there is a finite set of vectors v_1, ..., v_k such that V = Sp(v_1, ..., v_k).

Eg 2.15. R^n is finite-dimensional. To show this, let

  e_1 = (1, 0, 0, ..., 0)
  e_2 = (0, 1, 0, ..., 0)
  ...
  e_n = (0, 0, 0, ..., 1)

Then for any x = (x_1, ..., x_n) ∈ R^n

  x = x_1 e_1 + x_2 e_2 + ... + x_n e_n

So R^n = Sp(e_1, ...
, e_n) is finite-dimensional.

Note 2.2. {e_1, ..., e_n} is linearly independent since λ_1 e_1 + ... + λ_n e_n = 0 implies (λ_1, ..., λ_n) = 0, so all λ_i = 0. So {e_1, ..., e_n} is a basis for R^n, called the standard basis.

Eg 2.16. Let V be the vector space of polynomials over R. Claim: V is not finite-dimensional.

Proof. By contradiction. Assume V has a finite spanning set p_1, ..., p_k. Let deg(p_i) = n_i and let n = max(n_1, ..., n_k). Any linear combination λ_1 p_1 + ... + λ_k p_k (λ_i ∈ R) has degree ≤ n. So the polynomial x^{n+1} is not a linear combination of vectors from our assumed spanning set; contradiction. □

Proposition 2.8. Any finite-dimensional vector space has a basis.

Proof. Let V be a finite-dimensional vector space. Then V has a finite spanning set. This contains a basis of V by Theorem 2.1. □

Definition 2.10. The dimension of V is the number of vectors in any basis of V.¹ Written dim V.

Eg 2.17. R^n has basis e_1, ..., e_n, so dim R^n = n.

Eg 2.18. Let v ∈ R^2, v ≠ 0 and let L be the line through 0 and v. So L is a subspace,

  L = {λv | λ ∈ R}

So L = Sp(v) and {v} is a basis of L. So dim L = 1.

Eg 2.19. Let v_1, v_2 ∈ R^3 with v_1, v_2 ≠ 0 and v_2 ≠ λv_1. Then Sp(v_1, v_2) = P is a plane through 0, v_1, v_2. As v_2 ≠ λv_1, {v_1, v_2} is linearly independent, so is a basis of P. So dim P = 2.

Major result:

Theorem 2.2. Let V be a finite-dimensional vector space. Then all bases of V have the same number of vectors.

Proof. Based on:

Lemma 2.1. (Replacement Lemma) V a vector space. Suppose v_1, ..., v_k and x_1, ..., x_r are vectors in V such that
• v_1, ..., v_k span V
• x_1, ..., x_r are linearly independent
Then
(1) r ≤ k and
(2) there is a subset {w_1, ..., w_{k−r}} of {v_1, ..., v_k} such that x_1, ..., x_r, w_1, ..., w_{k−r} span V
(i.e. we can replace r of the v's by the x's and still span V)

Eg 2.20. V = R^3.
• e_1, e_2, e_3 span R^3
• x_1 = (1, 1, −1)
According to Lemma 2.1(2), we can replace one of the e_i's by x_1 and get a spanning set {x_1, e_i, e_j}. How? Consider the spanning set

  x_1, e_1, e_2, e_3

This set is linearly dependent since x_1 = e_1 + e_2 − e_3. By 2.6, one of the vectors is therefore a linear combination of previous ones; in this case

  e_3 = e_1 + e_2 − x_1

So cast out e_3: the spanning set is {x_1, e_1, e_2}.

¹ The following theorem shows the uniqueness of this number.

Proof. (of Lemma 2.1) Consider S_1 = {x_1, v_1, ..., v_k}. This spans V. It is linearly dependent, as x_1 is a linear combination of the spanning set v_1, ..., v_k. So by 2.6, some vector in S_1 is a linear combination of previous ones. This vector is not x_1, so it is some v_i. By 2.7,

  V = Sp(x_1, v_1, ..., v_{i−1}, v_{i+1}, ..., v_k)

Now let S_2 = {x_1, x_2, v_1, ..., v_{i−1}, v_{i+1}, ..., v_k}. This spans V and is linearly dependent, as x_2 is a linear combination of the others. By 2.6, there exists a vector in S_2 which is a linear combination of previous ones. It is not x_1 or x_2, as x_1, x_2 are linearly independent. So it is some v_j. By 2.7,

  V = Sp(x_1, x_2, and all the v's except v_i and v_j)

Continue like this, adding x's, deleting v's. If r > k, then eventually we delete all the v's and get V = Sp(x_1, ..., x_k). Then x_{k+1} is a linear combination of x_1, ..., x_k. This can't happen, as x_1, ..., x_{k+1} is a linearly independent set. Therefore r ≤ k. This process ends when we've used up all the x's, giving

  V = Sp(x_1, ..., x_r, k − r remaining v's) □

(Proof of 2.2 continued) Let {v_1, ..., v_k} and {x_1, ..., x_r} be two bases of V. Both are spanning sets for V and both are linearly independent. Now v_1, ..., v_k span and x_1, ..., x_r is linearly independent, so by the previous lemma, r ≤ k. Similarly, x_1, ..., x_r span and v_1, ..., v_k is linearly independent, so by the previous lemma again, k ≤ r. Hence r = k.
So all bases of V have the same number of vectors. □

2.10 Further Deductions

Proposition 2.9. Let dim V = n. Any spanning set for V of size n is a basis of V.

Proof. Let {v_1, ..., v_n} be the spanning set. By Theorem 2.1, this set contains a basis of V. By 2.2, all bases of V have size n. Therefore {v_1, ..., v_n} is a basis of V. □

Eg 2.21. Is (1, −2, 3), (0, 2, 5), (−1, 0, 6) a basis of R^3?

$$\begin{pmatrix} 1 & -2 & 3 \\ 0 & 2 & 5 \\ -1 & 0 & 6 \end{pmatrix} \to \begin{pmatrix} 1 & -2 & 3 \\ 0 & 2 & 5 \\ 0 & -2 & 9 \end{pmatrix} \to \begin{pmatrix} 1 & -2 & 3 \\ 0 & 2 & 5 \\ 0 & 0 & 14 \end{pmatrix}$$

The rows of this echelon form are linearly independent, so we can't cast out any vectors. So they form a basis.

Proposition 2.10. If {x_1, ..., x_r} is a linearly independent set in V, then there is a basis of V containing x_1, ..., x_r.

Proof. Let v_1, ..., v_n be a basis of V. By Lemma 2.1(2), there exists {w_1, ..., w_{n−r}} ⊆ {v_1, ..., v_n} such that

  V = Sp(x_1, ..., x_r, w_1, ..., w_{n−r})

Then x_1, ..., x_r, w_1, ..., w_{n−r} is a spanning set of size n, hence is a basis by 2.9. □

Eg 2.22. Let v_1 = (1, 0, −1, 2), v_2 = (1, 1, 2, 5) ∈ R^4. Find a basis of R^4 containing v_1, v_2.

Claim: v_1, v_2, e_1, e_2 is a basis of R^4.
Proof. Clearly we can get all standard basis vectors e_1, e_2, e_3, e_4 as linear combinations of v_1, v_2, e_1, e_2. So v_1, v_2, e_1, e_2 span R^4, so they are a basis by 2.9. □

Proposition 2.11. Let W be a subspace of V. Then
(1) dim W ≤ dim V
(2) If W ≠ V, then dim W < dim V

Proof.
(1) Let w_1, ..., w_r be a basis of W. This set is linearly independent. So by Proposition 2.10 there is a basis of V containing it, say w_1, ..., w_r, v_1, ..., v_s. Then dim V = r + s ≥ r = dim W.
(2) If dim W = dim V, then s = 0 and w_1, ..., w_r is a basis of V, so V = Sp(w_1, ..., w_r) = W. □

Eg 2.23. (The subspaces of R^3) Let W be a subspace of R^3. Then dim W ≤ dim R^3 = 3. Possibilities:
• dim W = 3. Then W = R^3.
• dim W = 2. Then W has a basis {v_1, v_2}, so W = Sp(v_1, v_2), which is a plane through 0, v_1, v_2.
• dim W = 1. Then W has a basis {v_1}, so W = Sp(v_1), which is a line through 0, v_1.
• dim W = 0. Then W = {0}.
Conclude: The subspaces of R^3 are {0}, R^3 and lines and planes containing 0.

Proposition 2.12. Let dim V = n. Any set of n vectors which is linearly independent is a basis of V.

Proof. Let v_1, ..., v_n be linearly independent. By Proposition 2.10 there is a basis containing v_1, ..., v_n. As all bases have n vectors, v_1, ..., v_n must be a basis. □

Eg 2.24. Is the set (1, 1, −1, 0), (2, 0, −1, 2), (0, 3, 1, −1), (2, 2, 1, 0) a basis of R^4?

$$\begin{pmatrix} 1 & 1 & -1 & 0 \\ 2 & 0 & -1 & 2 \\ 0 & 3 & 1 & -1 \\ 2 & 2 & 1 & 0 \end{pmatrix} \begin{matrix} v_1 \\ v_2 \\ v_3 \\ v_4 \end{matrix} \to \begin{pmatrix} 1 & 1 & -1 & 0 \\ 0 & -2 & 1 & 2 \\ 0 & 3 & 1 & -1 \\ 0 & 0 & 3 & 0 \end{pmatrix} \to \begin{pmatrix} 1 & 1 & -1 & 0 \\ 0 & -2 & 1 & 2 \\ 0 & 0 & 5 & 4 \\ 0 & 0 & 3 & 0 \end{pmatrix} \to \begin{pmatrix} 1 & 1 & -1 & 0 \\ 0 & -2 & 1 & 2 \\ 0 & 0 & 5 & 4 \\ 0 & 0 & 0 & -12 \end{pmatrix} \begin{matrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{matrix}$$

(The row operations are: r_2 → r_2 − 2r_1 and r_4 → r_4 − 2r_1; then r_3 → 2r_3 + 3r_2; then r_4 → 5r_4 − 3r_3.)

The vectors w_1, w_2, w_3, w_4 are linearly independent (clear as they are in echelon form). By 2.12, w_1, ..., w_4 are a basis of R^4, therefore v_1, ..., v_4 span R^4 (as the w's are linear combinations of the v's), therefore v_1, ..., v_4 is a basis of R^4, by 2.9.

Proposition 2.13. Let dim V = n. Then any set of n + 1 or more vectors in V is linearly dependent.

Proof. Let S be a set of n + 1 or more vectors. If S is linearly independent, it is contained in a basis by Proposition 2.10, which is impossible as all bases have n vectors. So S is linearly dependent. □

Eg 2.25. (A fact about matrices) Let V = M_{2,2}, the vector space of all 2 × 2 matrices over R (usual addition A + B and scalar multiplication λA of matrices).

Basis: Let

$$E_{11} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad E_{12} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \quad E_{21} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}, \quad E_{22} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$$

Claim: E_11, E_12, E_21, E_22 is a basis of V = M_{2,2}.
Proof.
• Span:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} = aE_{11} + bE_{12} + cE_{21} + dE_{22}$$

• Linear independence: λ_1 E_11 + λ_2 E_12 + λ_3 E_21 + λ_4 E_22 = 0 implies

$$\begin{pmatrix} \lambda_1 & \lambda_2 \\ \lambda_3 & \lambda_4 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \;\Rightarrow\; \lambda_i = 0$$ □

So dim V = 4. Now let A ∈ V = M_{2,2}. Consider I, A, A^2, A^3, A^4. These are 5 vectors in V, so they are linearly dependent by 2.13.
This means there exist λ_i ∈ R (at least one nonzero) such that

  λ_4 A^4 + λ_3 A^3 + λ_2 A^2 + λ_1 A + λ_0 I = 0

This means, if we write

  p(x) = λ_4 x^4 + λ_3 x^3 + λ_2 x^2 + λ_1 x + λ_0

then p(x) ≠ 0 and p(A) = 0. So we've proved the following:

Proposition 2.14. For any 2 × 2 matrix A there exists a nonzero polynomial p(x) of degree ≤ 4, such that p(A) = 0.

Note 2.3. This generalizes to n × n matrices.

Summary so far

V a finite-dimensional vector space (i.e. V has a finite spanning set)
• A basis of V is a linearly independent spanning set
• All bases have the same size, called dim V (Theorem 2.2)
• Every spanning set contains a basis (Theorem 2.1)
Write dim V = n
• Any spanning set of size n is a basis (Proposition 2.9)
• Any linearly independent set of size n is a basis (Proposition 2.12)
• Any linearly independent set is contained in a basis (Proposition 2.10)
• Any set of n + 1 or more vectors is linearly dependent (Proposition 2.13)
• Any subspace W of V has dim W ≤ n, and dim W = n ⇒ W = V (Proposition 2.11)

Chapter 3
More on Subspaces

3.1 Sums and Intersections

Definition 3.1. V a vector space. Let U, W be subspaces of V. The intersection of U and W is

  U ∩ W = {v | v ∈ U and v ∈ W}

The sum of U and W is

  U + W = {u + w | u ∈ U and w ∈ W}

Note 3.1. U + W contains
• all the vectors u ∈ U (as u + 0 ∈ U + W)
• all the vectors w ∈ W
• many more vectors (usually)

Eg 3.1. V = R^2, U = Sp((1, 0)), W = Sp((0, 1)). Then U + W contains all vectors λ_1 (1, 0) + λ_2 (0, 1) = (λ_1, λ_2). So U + W is the whole of R^2.

Proposition 3.1. U ∩ W and U + W are subspaces of V.

Proof. Use the subspace criterion, Proposition 2.2.

U + W:
(1) As U, W are subspaces, both contain 0, so 0 + 0 = 0 ∈ U + W
(2) Let u_1 + w_1, u_2 + w_2 ∈ U + W (where u_i ∈ U, w_i ∈ W). Then

  (u_1 + w_1) + (u_2 + w_2) = (u_1 + u_2) + (w_1 + w_2) ∈ U + W

(3) Let u + w ∈ U + W, λ ∈ F. Then

  λ(u + w) = λu + λw ∈ U + W

U ∩ W: Sheet 8. □
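As a computational aside, the dimension of a sum such as the one in Eg 3.1 can be checked by machine. The sketch below is an illustration only, not part of the original notes: the function names `rank` and `dim_sum` are hypothetical. It assumes that each subspace is given by a finite spanning set, and it uses the fact that every u + w is then a linear combination of the two spanning sets taken together, so the dimension of U + W is the number of nonzero rows left after row-reducing all the spanning vectors. Exact `Fraction` arithmetic avoids rounding issues.

```python
from fractions import Fraction

def rank(rows):
    """Number of nonzero rows after Gaussian elimination (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # index of the next pivot row
    for c in range(ncols):
        # Look for a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Clear column c below the pivot.
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def dim_sum(span_u, span_w):
    """dim(U + W), where U and W are given by (assumed) spanning sets."""
    return rank(list(span_u) + list(span_w))

if __name__ == "__main__":
    # Eg 3.1: U = Sp((1, 0)), W = Sp((0, 1)), so U + W should be all of R^2.
    print(dim_sum([(1, 0)], [(0, 1)]))
```

The same `rank` helper applied to the rows (1, 0, 1), (2, 2, −1), (1, 4, −5) of Eg 2.10 returns 2, consistent with those three vectors being linearly dependent.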
What about the dimensions of U + W, U ∩ W? First:

Proposition 3.2. If U = Sp(u_1, ..., u_r), W = Sp(w_1, ..., w_s), then

  U + W = Sp(u_1, ..., u_r, w_1, ..., w_s)

Proof. Let u + w ∈ U + W. Then (λ_i, µ_i ∈ F)

  u = λ_1 u_1 + ... + λ_r u_r
  w = µ_1 w_1 + ... + µ_s w_s

So u + w = λ_1 u_1 + ... + λ_r u_r + µ_1 w_1 + ... + µ_s w_s ∈ Sp(u_1, ..., u_r, w_1, ..., w_s). So

  U + W ⊆ Sp(u_1, ..., u_r, w_1, ..., w_s)

All the u_i, w_i are in U + W. As U + W is a subspace, it therefore contains Sp(u_1, ..., u_r, w_1, ..., w_s). Hence

  U + W = Sp(u_1, ..., u_r, w_1, ..., w_s) □

Eg 3.2. In the above example, U + W = Sp((1, 0), (0, 1)) = R^2.

Eg 3.3. Let U = {x ∈ R^3 | x_1 + x_2 + x_3 = 0}, W = {x ∈ R^3 | −x_1 + 2x_2 + x_3 = 0}, subspaces of R^3. Find bases of U, W, U ∩ W, U + W.

• For U the general solution is (−a − b, b, a), so a basis for U is {(−1, 0, 1), (−1, 1, 0)}.

• For W the general solution is (2b + a, b, a), so a basis of W is {(1, 0, 1), (2, 1, 0)}.

• U ∩ W: this is

$$\left\{ x \in \mathbb{R}^3 \;\middle|\; \begin{pmatrix} 1 & 1 & 1 \\ -1 & 2 & 1 \end{pmatrix} x = 0 \right\}$$

Solve

$$\left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ -1 & 2 & 1 & 0 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 3 & 2 & 0 \end{array}\right)$$

The general solution is (−a, −2a, 3a). A basis for U ∩ W is {(−1, −2, 3)}.

• U + W: By Proposition 3.2

  U + W = Sp((−1, 0, 1), (−1, 1, 0), (1, 0, 1), (2, 1, 0))

Check that we can cast out only 1 vector. So U + W has dimension 3, so U + W = R^3.

So

  dim U = dim W = 2
  dim(U ∩ W) = 1
  dim(U + W) = 3

Theorem 3.1. Let V be a finite-dimensional vector space and let U, W be subspaces of V. Then

  dim(U + W) = dim U + dim W − dim(U ∩ W)

Proof. Let

  dim U = m
  dim W = n
  dim(U ∩ W) = r

Aim: to prove dim(U + W) = m + n − r. Start with a basis {x_1, ..., x_r} of U ∩ W. By 2.10, we can extend this to bases

  B_U = {x_1, ..., x_r, u_1, ..., u_{m−r}}   basis of U
  B_W = {x_1, ..., x_r, w_1, ..., w_{n−r}}   basis of W

Let

  B = B_U ∪ B_W = {x_1, ..., x_r, u_1, ..., u_{m−r}, w_1, ..., w_{n−r}}

Claim: B is a basis of U + W.
Proof.
(1) Span: B spans U + W by Proposition 3.2.
(2) Linear independence: We show that B is linearly independent. Suppose

  λ_1 x_1 + ... + λ_r x_r + α_1 u_1 + ... + α_{m−r} u_{m−r} + β_1 w_1 + ... + β_{n−r} w_{n−r} = 0

i.e.

$$\sum_{i=1}^{r} \lambda_i x_i + \sum_{i=1}^{m-r} \alpha_i u_i + \sum_{i=1}^{n-r} \beta_i w_i = 0 \qquad (3.1)$$

Let

$$v = \sum_{i=1}^{n-r} \beta_i w_i$$

Then v ∈ W. Also

$$v = -\sum_{i=1}^{r} \lambda_i x_i - \sum_{i=1}^{m-r} \alpha_i u_i \in U$$

So v is in U ∩ W. As x_1, ..., x_r is a basis of U ∩ W,

$$v = \sum_{i=1}^{r} \gamma_i x_i \qquad (\gamma_i \in F)$$

As v = Σ β_i w_i, this gives

$$-\sum_{i=1}^{r} \gamma_i x_i + \sum_{i=1}^{n-r} \beta_i w_i = 0$$

Since B_W = {x_1, ..., x_r, w_1, ..., w_{n−r}} is linearly independent, this forces (for all i)

  γ_i = 0, β_i = 0

(i.e. v = 0). Then by (3.1)

$$\sum_{i=1}^{r} \lambda_i x_i + \sum_{i=1}^{m-r} \alpha_i u_i = 0$$

Since B_U = {x_1, ..., x_r, u_1, ..., u_{m−r}} is linearly independent, this forces (for all i)

  λ_i = 0, α_i = 0

So we've shown that in (3.1), all coefficients λ_i, α_i, β_i are zero, showing that B = B_U ∪ B_W is linearly independent. Hence B is a basis of U + W. □

So we have proved that dim(U + W) = r + (m − r) + (n − r) = m + n − r. □

Eg 3.4. V = R^4. Suppose U, W are subspaces with dim U = 2, dim W = 3. Then dim(U + W) ≥ 3 (as it contains W) and dim(U + W) ≤ 4 (as U + W ⊆ R^4). Possibilities:
• dim(U + W) = 3. Then U + W = W and so U ⊆ W.
• dim(U + W) = 4 (in other words U + W = R^4). Then

  dim(U ∩ W) = dim U + dim W − dim(U + W) = 1

For example this happens for:

  U = Sp(e_1, e_2) = {(x_1, x_2, 0, 0) | x_i ∈ R}
  W = Sp(e_1, e_3, e_4) = {(x_1, 0, x_2, x_3) | x_i ∈ R}

3.2 The rank of a matrix

Definition 3.2. Let A be an m × n matrix with real entries. Define

  row-space(A) = subspace of R^n spanned by the rows of A
  column-space(A) = subspace of R^m spanned by the columns of A

Eg 3.5.

$$A = \begin{pmatrix} 3 & 1 & 2 \\ 0 & -1 & 1 \end{pmatrix}$$

  row-space(A) = Sp((3, 1, 2), (0, −1, 1))

$$\text{column-space}(A) = \mathrm{Sp}\left( \begin{pmatrix} 3 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \begin{pmatrix} 2 \\ 1 \end{pmatrix} \right)$$

Definition 3.3. Let A be an m × n matrix. Define

  row-rank(A) = dim row-space(A)
  column-rank(A) = dim column-space(A)

Eg 3.6.
In the above example row-rank(A) = column-rank(A) = 2.

3.2.1 How to find row-rank(A)

Procedure:

(1) Reduce A to echelon form by row operations, say
\[ A' = \begin{pmatrix} 0 & \cdots & 1 & \cdots & \cdots & \cdots & \cdots \\ 0 & \cdots & 0 & \cdots & 1 & \cdots & \cdots \\ & & & \vdots & & & \\ 0 & \cdots & 0 & \cdots & 0 & 1 & \cdots \end{pmatrix} \]
Then (we will prove this) row-space(A) = row-space(A′).

(2) Then row-rank(A) = number of nonzero rows in the echelon form A′, and these nonzero rows are a basis for row-space(A).

Proof.

(1) Rows of A′ are linear combinations of the rows of A (since they are obtained by row operations r_i → r_i + λr_j, etc.). Therefore

row-space(A′) ⊆ Sp(rows of A) = row-space(A).

By reversing the row operations to go from A′ to A, we see that the rows of A are linear combinations of the rows of A′, so row-space(A) ⊆ row-space(A′). Therefore row-space(A) = row-space(A′).

(2) Let the nonzero rows of A′ be v_1, …, v_r, and let i_j be the column containing the leading 1 of v_j. Then row-space(A′) = Sp(v_1, …, v_r). Also v_1, …, v_r are linearly independent, since λ_1 v_1 + · · · + λ_r v_r = 0 implies

λ_1 = 0 (since λ_1 is the i_1 coordinate of the LHS),
λ_2 = 0 (since, given λ_1 = 0, λ_2 is the i_2 coordinate of the LHS),
…
λ_j = 0 (since, given λ_1 = · · · = λ_{j−1} = 0, λ_j is the i_j coordinate of the LHS).

Therefore v_1, …, v_r is a basis for row-space(A′), hence for row-space(A). So row-rank(A) = r = number of nonzero rows of A′. □

Eg 3.7. Find the row-rank of
\[ A = \begin{pmatrix} 1 & 2 & 5 \\ 2 & 1 & 0 \\ -1 & 4 & 15 \end{pmatrix} \]
Reduce to echelon form:
\[ A \to \begin{pmatrix} 1 & 2 & 5 \\ 0 & -3 & -10 \\ 0 & 6 & 20 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & 5 \\ 0 & -3 & -10 \\ 0 & 0 & 0 \end{pmatrix} \]
Row-rank is 2.

Eg 3.8. Find the dimension of

W = Sp((−1, 1, 0, 1), (2, 3, 1, 0), (0, 1, 2, 3)) ⊆ R^4.

Observe that W = row-space(A), where
\[ A = \begin{pmatrix} -1 & 1 & 0 & 1 \\ 2 & 3 & 1 & 0 \\ 0 & 1 & 2 & 3 \end{pmatrix} \]
So dim W = row-rank(A).
\[ A \to \begin{pmatrix} -1 & 1 & 0 & 1 \\ 0 & 5 & 1 & 2 \\ 0 & 1 & 2 & 3 \end{pmatrix} \to \begin{pmatrix} -1 & 1 & 0 & 1 \\ 0 & 5 & 1 & 2 \\ 0 & 0 & 9 & 13 \end{pmatrix} \]
So dim W = row-rank(A) = 3.

3.2.2 How to find column-rank(A)?

Clearly column-rank(A) = row-rank(A^T).

Eg 3.9. Find the column-rank of
\[ A = \begin{pmatrix} 1 & 2 & 5 \\ 2 & 1 & 0 \\ -1 & 4 & 15 \end{pmatrix} \]
\[ A^T = \begin{pmatrix} 1 & 2 & -1 \\ 2 & 1 & 4 \\ 5 & 0 & 15 \end{pmatrix} \to \begin{pmatrix} 1 & 2 & -1 \\ 0 & 3 & -6 \\ 0 & 0 & 0 \end{pmatrix} \]
So column-rank(A) = 2.

Theorem 3.2. For any matrix A, row-rank(A) = column-rank(A).

Proof. Let
\[ A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \]
with rows v_1, …, v_m, so v_i = (a_{i1}, …, a_{in}). Let

k = row-rank(A) = dim Sp(v_1, …, v_m).

Let w_1, …, w_k be a basis for row-space(A). Say

w_1 = (b_{11}, …, b_{1n})
…
w_k = (b_{k1}, …, b_{kn})

Each v_i ∈ Sp(w_1, …, w_k), so (λ_{ij} ∈ F)

v_1 = λ_{11} w_1 + · · · + λ_{1k} w_k
…
v_m = λ_{m1} w_1 + · · · + λ_{mk} w_k

Equating coordinates:

i-th coordinate of v_1: a_{1i} = λ_{11} b_{1i} + · · · + λ_{1k} b_{ki}
…
i-th coordinate of v_m: a_{mi} = λ_{m1} b_{1i} + · · · + λ_{mk} b_{ki}

This says
\[ \text{$i$-th column of } A = \begin{pmatrix} a_{1i} \\ \vdots \\ a_{mi} \end{pmatrix} = b_{1i} \begin{pmatrix} \lambda_{11} \\ \vdots \\ \lambda_{m1} \end{pmatrix} + \cdots + b_{ki} \begin{pmatrix} \lambda_{1k} \\ \vdots \\ \lambda_{mk} \end{pmatrix} \]
Hence each column of A is a linear combination of the k column vectors
\[ l_1 = \begin{pmatrix} \lambda_{11} \\ \vdots \\ \lambda_{m1} \end{pmatrix}, \; \ldots, \; l_k = \begin{pmatrix} \lambda_{1k} \\ \vdots \\ \lambda_{mk} \end{pmatrix} \]
So column-space(A) is spanned by these k vectors. So

column-rank(A) = dim column-space(A) ≤ k = row-rank(A).

So we have shown that column-rank(A) ≤ row-rank(A). Applying the same to A^T:

column-rank(A^T) ≤ row-rank(A^T), i.e. row-rank(A) ≤ column-rank(A).

Hence row-rank(A) = column-rank(A). □

Eg 3.10 (illustrating the proof). Let
\[ A = \begin{pmatrix} 1 & 2 & -1 & 0 \\ -1 & 1 & 0 & 1 \\ 0 & 3 & -1 & 1 \end{pmatrix} \]
with rows v_1, v_2, v_3. As v_3 = v_1 + v_2, a basis of the row-space is w_1, w_2 where w_1 = v_1, w_2 = v_2. Write each v_i as a linear combination of w_1, w_2:

v_1 = w_1 = 1w_1 + 0w_2
v_2 = w_2 = 0w_1 + 1w_2
v_3 = w_1 + w_2 = 1w_1 + 1w_2

So the column vectors l_1, l_2 are

l_1 = (1, 0, 1)^T, l_2 = (0, 1, 1)^T.

These span column-space(A). Check:

(1, −1, 0)^T = l_1 − l_2
(2, 1, 3)^T = 2l_1 + l_2
(−1, 0, −1)^T = −l_1
(0, 1, 1)^T = l_2

Definition 3.4. The rank of a matrix is its row-rank (equivalently, its column-rank), written rank(A) or rk(A).

Proposition 3.3. Let A be n × n. Then the following four statements are equivalent:

(1) rank(A) = n
(2) the rows of A are a basis of R^n
(3) the columns of A are a basis of R^n
(4) A is invertible

Proof.
• (1) ⇔ (2):
rank(A) = n ⇔ dim row-space(A) = n
⇔ the n rows of A span R^n
⇔ the n rows are a basis of R^n (2.9)

• (1) ⇔ (3): Similarly.

• (1) ⇔ (4):
rank(A) = n ⇔ A can be reduced to an echelon form with n nonzero rows
⇔ A can be reduced to I_n
⇔ A is invertible (M1GLA, 7.5) □

Chapter 4
Linear Transformations

Linear transformations are functions from one vector space to another which "preserve" addition and scalar multiplication, i.e. for a linear transformation T, if

v_1 ↦ w_1 = T(v_1)
v_2 ↦ w_2 = T(v_2)

then

v_1 + v_2 ↦ w_1 + w_2 = T(v_1) + T(v_2)

and

λv_1 ↦ λw_1 = λT(v_1).

Definition 4.1. Let V, W be vector spaces. A function T : V → W is a linear transformation if

1) T(v_1 + v_2) = T(v_1) + T(v_2) for all v_1, v_2 ∈ V
2) T(λv) = λT(v) for all v ∈ V, λ ∈ F

Eg 4.1.

(1) Define T : R^1 → R^1 by T(x) = sin x. Then T is not a linear transformation: e.g.

T(π) = sin π = 0
2T(π/2) = 2 sin(π/2) = 2

So 2T(π/2) ≠ T(π).

(2) T : R^2 → R^1, T(x_1, x_2) = x_1 + x_2. T is a linear transformation:

1) T((x_1, x_2) + (y_1, y_2)) = T(x_1 + y_1, x_2 + y_2)
= x_1 + y_1 + x_2 + y_2
= T(x_1, x_2) + T(y_1, y_2)

2) T(λ(x_1, x_2)) = T(λx_1, λx_2)
= λx_1 + λx_2
= λ(x_1 + x_2)
= λT(x_1, x_2)

(3) T : R^2 → R^1, T(x_1, x_2) = x_1 + x_2 + 1. T is not linear: e.g.

T(2(1, 0)) = T(2, 0) = 3
2T(1, 0) = 4

(4) V the vector space of polynomials. Define T : V → V by T(p(x)) = p′(x), e.g. T(x^3 − 3x) = 3x^2 − 3. Then T is a linear transformation:

1) T(p(x) + q(x)) = p′(x) + q′(x) = T(p(x)) + T(q(x))
2) T(λp(x)) = λp′(x) = λT(p(x))

Basic examples:

Proposition 4.1. Let A be an m × n matrix over R. Define T : R^n → R^m by (for all x ∈ R^n, column vectors)

T(x) = Ax.

Then T is a linear transformation.

Proof.

1) T(v_1 + v_2) = A(v_1 + v_2) = Av_1 + Av_2 = T(v_1) + T(v_2)
2) T(λv) = A(λv) = λAv = λT(v) □

Eg 4.2.

1.
Define T : R^3 → R^2 by
\[ T\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 - 3x_2 + x_3 \\ x_1 + x_2 - 2x_3 \end{pmatrix} \]
Then
\[ T(x) = \begin{pmatrix} 1 & -3 & 1 \\ 1 & 1 & -2 \end{pmatrix} x \]
So T is a linear transformation.

2. Rotation ρ_θ : R^2 → R^2 is
\[ \rho_\theta(x) = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} x \]
so is a linear transformation.

4.1 Basic properties

Proposition 4.2. Let T : V → W be a linear transformation. Then

(i) T(0_V) = 0_W
(ii) T(λ_1 v_1 + · · · + λ_k v_k) = λ_1 T(v_1) + · · · + λ_k T(v_k)

Proof.

(i) T(0_V) = T(0v) = 0T(v) = 0_W

(ii) T(λ_1 v_1 + · · · + λ_k v_k) = T(λ_1 v_1 + · · · + λ_{k−1} v_{k−1}) + T(λ_k v_k)
= T(λ_1 v_1 + · · · + λ_{k−1} v_{k−1}) + λ_k T(v_k)

Repeat to get (ii). □

4.2 Constructing linear transformations

Eg 4.3. Find a linear transformation T : R^2 → R^3 which sends

e_1 = (1, 0)^T ↦ w_1 = (1, −1, 2)^T
e_2 = (0, 1)^T ↦ w_2 = (0, 1, 3)^T

We are forced to define

T((x_1, x_2)^T) = T(x_1 e_1 + x_2 e_2)
= x_1 T(e_1) + x_2 T(e_2)
= x_1 w_1 + x_2 w_2

So the only possible choice for T is
\[ T\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ -x_1 + x_2 \\ 2x_1 + 3x_2 \end{pmatrix} \]
This is a linear transformation, as it is
\[ T(x) = \begin{pmatrix} 1 & 0 \\ -1 & 1 \\ 2 & 3 \end{pmatrix} x \]
And it does send e_1 ↦ w_1, e_2 ↦ w_2.

In general:

Proposition 4.3. Let V, W be vector spaces, and let v_1, …, v_n be a basis of V. For any n vectors w_1, …, w_n in W there is a unique linear transformation T : V → W such that

T(v_1) = w_1
…
T(v_n) = w_n

Proof. Let v ∈ V. Write v = λ_1 v_1 + · · · + λ_n v_n. By 4.2(ii), the only possible choice for T(v) is

T(v) = T(λ_1 v_1 + · · · + λ_n v_n)
= λ_1 T(v_1) + · · · + λ_n T(v_n)
= λ_1 w_1 + · · · + λ_n w_n

So this is our definition of T : V → W: if v = λ_1 v_1 + · · · + λ_n v_n, then T(v) = λ_1 w_1 + · · · + λ_n w_n.

We show this function T is a linear transformation:

1) Let v = Σ λ_i v_i, w = Σ µ_i v_i. Then v + w = Σ (λ_i + µ_i) v_i, so

T(v + w) = Σ (λ_i + µ_i) w_i
= Σ λ_i w_i + Σ µ_i w_i
= T(v) + T(w)

2) Let v = Σ λ_i v_i, λ ∈ F. Then

T(λv) = T(Σ λλ_i v_i)
= Σ λλ_i w_i
= λ Σ λ_i w_i
= λT(v)

So T is a linear transformation sending v_i ↦ w_i for all i, and it is unique. □
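The construction in Proposition 4.3 can be sketched in a few lines of Python for the special case V = R^n with the standard basis, using the vectors w_1, w_2 of Eg 4.3. The helper name make_linear_map is hypothetical (it is not part of the notes), and vectors are represented as plain lists; this is only an illustration of "extend linearly from a basis", not a general implementation.

```python
def make_linear_map(images):
    """Return the map T with T(e_i) = images[i], extended linearly.

    Assumption: V = R^n with the standard basis e_1, ..., e_n, and
    `images` lists the chosen vectors w_1, ..., w_n (equal-length lists).
    """
    m = len(images[0])

    def T(x):
        # T(x_1 e_1 + ... + x_n e_n) = x_1 w_1 + ... + x_n w_n
        return [sum(x_i * w[k] for x_i, w in zip(x, images)) for k in range(m)]

    return T


# Data from Eg 4.3: e_1 -> w_1 = (1, -1, 2), e_2 -> w_2 = (0, 1, 3).
w1, w2 = [1, -1, 2], [0, 1, 3]
T = make_linear_map([w1, w2])

print(T([1, 0]))  # image of e_1
print(T([0, 1]))  # image of e_2
print(T([2, 5]))  # 2*w_1 + 5*w_2
```

Once the images of the basis vectors are fixed, every other value of T is determined, which is the content of the uniqueness statement.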
Remark 4.1. This shows that once we know what a linear transformation does to the vectors in a basis, we know what it does to all vectors.

Eg 4.4. V = vector space of polynomials over R of degree ≤ 2. Basis of V: 1, x, x^2. Pick 3 vectors in V: w_1 = 1 + x, w_2 = x − x^2, w_3 = 1 + x^2. By 4.3 there exists a unique linear transformation T : V → V sending

1 ↦ w_1
x ↦ w_2
x^2 ↦ w_3

Then, by 4.2,

T(a + bx + cx^2) = aT(1) + bT(x) + cT(x^2)
= a(1 + x) + b(x − x^2) + c(1 + x^2)
= a + c + (a + b)x + (c − b)x^2

4.3 Kernel and Image

Definition 4.2. Let T : V → W be a linear transformation. Define the image Im(T) to be

Im(T) = {T(v) | v ∈ V} ⊆ W.

The kernel Ker(T) is

Ker(T) = {v ∈ V | T(v) = 0} ⊆ V.

Eg 4.5. T : R^3 → R^2,
\[ T\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 & 1 & 2 \\ -1 & 0 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3x_1 + x_2 + 2x_3 \\ -x_1 + x_3 \end{pmatrix} \]
where A denotes the 2 × 3 matrix. Then

Ker(T) = {x ∈ R^3 | T(x) = 0}
= {x ∈ R^3 | Ax = 0}
= solution space of Ax = 0

Im(T) = set of all vectors (3x_1 + x_2 + 2x_3, −x_1 + x_3)^T
= set of all vectors x_1 (3, −1)^T + x_2 (1, 0)^T + x_3 (2, 1)^T
= column-space of A

Proposition 4.4. Let T : V → W be a linear transformation. Then

i) Ker(T) is a subspace of V
ii) Im(T) is a subspace of W

Proof.

i) Use 2.3:

1) 0 ∈ Ker(T) since T(0_V) = 0_W (by 4.2).
2) Let v, w ∈ Ker(T). Then T(v) = T(w) = 0, so T(v + w) = T(v) + T(w) = 0 + 0 = 0. So v + w ∈ Ker(T).
3) Let v ∈ Ker(T), λ ∈ F. Then T(λv) = λT(v) = λ0 = 0. So λv ∈ Ker(T).

ii)

1) 0 ∈ Im(T) as 0 = T(0).
2) Let w_1, w_2 ∈ Im(T), so w_1 = T(v_1), w_2 = T(v_2). Then w_1 + w_2 = T(v_1) + T(v_2) = T(v_1 + v_2), so w_1 + w_2 ∈ Im(T).
3) Let w ∈ Im(T), λ ∈ F. Then w = T(v) and λw = λT(v) = T(λv), so λw ∈ Im(T). □

Eg 4.6. Let V_n = vector space of polynomials of degree ≤ n. Define T : V_n → V_{n−1} by T(p(x)) = p′(x). Then T is a linear transformation.

Ker(T) = {p(x) | T(p(x)) = 0}
= {p(x) | p′(x) = 0}
= V_0 (the constant polynomials)

and Im(T) = V_{n−1}. This has basis 1, x, x^2, . . .
, x^{n−1}, so it has dimension n.

Proposition 4.5. Let T : V → W be a linear transformation. If v_1, …, v_n is a basis of V, then

Im(T) = Sp(T(v_1), …, T(v_n)).

Proof. Let T(v) ∈ Im(T). Write v = λ_1 v_1 + · · · + λ_n v_n. Then, by 4.2,

T(v) = λ_1 T(v_1) + · · · + λ_n T(v_n) ∈ Sp(T(v_1), …, T(v_n)).

This shows Im(T) ⊆ Sp(T(v_1), …, T(v_n)). All T(v_i) ∈ Im(T), so as Im(T) is a subspace, Sp(T(v_1), …, T(v_n)) ⊆ Im(T). Therefore Sp(T(v_1), …, T(v_n)) = Im(T). □

Important class of kernels and images:

Proposition 4.6. Let A be an m × n matrix, and define T : R^n → R^m by (x ∈ R^n)

T(x) = Ax.

Then

1) Ker(T) = solution space of the system Ax = 0
2) Im(T) = column-space(A)
3) dim Im(T) = rank(A)

Proof.

1) Ker(T) = {x | T(x) = 0} = {x ∈ R^n | Ax = 0} = solution space of Ax = 0.

2) Take the standard basis e_1, …, e_n of R^n. By 4.5,

Im(T) = Sp(T(e_1), …, T(e_n)).

Here T(e_i) = Ae_i = A(0, …, 1, …, 0)^T = i-th column of A. So Im(T) = Sp(columns of A) = column-space(A).

3) dim(Im(T)) = dim(column-space(A)) = rank(A). □

Eg 4.7. T : R^3 → R^3,
\[ T(x) = \begin{pmatrix} 1 & 2 & 3 \\ -1 & 0 & 1 \\ 1 & 4 & 7 \end{pmatrix} x \]
Find bases for Ker(T) and Im(T).

• Ker(T):
\[ \left(\begin{array}{ccc|c} 1 & 2 & 3 & 0 \\ -1 & 0 & 1 & 0 \\ 1 & 4 & 7 & 0 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 2 & 3 & 0 \\ 0 & 2 & 4 & 0 \\ 0 & 2 & 4 & 0 \end{array}\right) \to \left(\begin{array}{ccc|c} 1 & 2 & 3 & 0 \\ 0 & 2 & 4 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \]
General solution (a, −2a, a). Basis for Ker(T) is (1, −2, 1).

• Im(T): Im(T) = column-space(A). Its dimension is rank(A) = 2. So a basis is (1, −1, 1)^T, (2, 0, 4)^T.

Theorem 4.1 (Rank-nullity Theorem). Let V, W be vector spaces, and T : V → W be a linear transformation. Then

dim(Ker(T)) + dim(Im(T)) = dim(V).

Proof. Let r = dim(Ker(T)). Let u_1, …, u_r be a basis of Ker(T). By 2.10 we can extend this to a basis u_1, …, u_r, v_1, …, v_s of V. So dim V = r + s. We want to show that dim(Im(T)) = s. By 4.5,

Im(T) = Sp(T(u_1), …, T(u_r), T(v_1), …, T(v_s)).

Each T(u_i) = 0, as u_i ∈ Ker T. So Im T = Sp(T(v_1), .
. . , T(v_s))   (∗)

Claim: T(v_1), …, T(v_s) is a basis of Im T.

Proof. Span is shown by (∗). Suppose

λ_1 T(v_1) + · · · + λ_s T(v_s) = 0.

Then, by 4.2,

T(λ_1 v_1 + · · · + λ_s v_s) = 0.

So λ_1 v_1 + · · · + λ_s v_s ∈ Ker T. So

λ_1 v_1 + · · · + λ_s v_s = µ_1 u_1 + · · · + µ_r u_r

(as u_1, …, u_r are a basis of Ker T). That is,

µ_1 u_1 + · · · + µ_r u_r − λ_1 v_1 − · · · − λ_s v_s = 0.

As u_1, …, u_r, v_1, …, v_s is a basis of V, it is linearly independent, and so µ_i = λ_i = 0 for all i. This shows T(v_1), …, T(v_s) is linearly independent, hence a basis of Im T. □

So dim(Im T) = s and dim(Ker T) + dim(Im T) = r + s = dim V. □

Note: dim(Ker(T)) is sometimes called the nullity of T.

Consequences for linear equations

Proposition 4.7. Let A be an m × n matrix, and

W = solution space of Ax = 0 = {x ∈ R^n | Ax = 0}.

Then dim W = n − rank(A).

Proof. Define the linear transformation T : R^n → R^m by (x ∈ R^n) T(x) = Ax. Then, by 4.6, Ker T = W and Im T = column-space(A). By 4.1,

dim R^n = n = dim(Ker T) + dim(Im T)
= dim W + dim column-space(A)

So n = dim W + rank(A). □

Eg 4.8. Let
\[ A = \begin{pmatrix} 1 & 1 & 3 & 2 & 5 \\ 0 & 1 & 3 & 2 & 1 \\ 0 & 0 & 0 & 1 & 2 \end{pmatrix} \]
Let W be the solution space of the system Ax = 0. So W ⊆ R^5. Then, by 4.7,

dim W = 5 − rank(A) = 5 − 3 = 2.

The general solution is (−4a, 3a − 3b, b, −2a, a). This has two free variables and is a(−4, 3, 0, −2, 1) + b(0, −3, 1, 0, 0), so (−4, 3, 0, −2, 1), (0, −3, 1, 0, 0) is a basis of W.

In general, the number of free variables in the general solution of Ax = 0 is the dimension of the solution space, i.e. n − rank(A).

4.4 Composition of linear transformations

Definition 4.3. Let T : V → W, S : W → X be linear transformations (V, W, X vector spaces). The composition S ◦ T : V → X is defined by

S ◦ T(v) = S(T(v)).

Usually we just write ST.
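As a small numerical illustration of Definition 4.3, the following Python sketch composes two maps given by matrices and compares the result with applying the product matrix. The matrices are borrowed from Eg 4.2 and Eg 4.3; the helper names matvec and matmul are hypothetical and vectors are plain lists, so this is a sketch under those assumptions rather than anything prescribed by the notes.

```python
def matvec(M, x):
    """Multiply the matrix M (list of rows) by the column vector x."""
    return [sum(m * x_j for m, x_j in zip(row, x)) for row in M]


def matmul(B, A):
    """Return the matrix product BA (both given as lists of rows)."""
    cols = list(zip(*A))
    return [[sum(b * a for b, a in zip(row, col)) for col in cols] for row in B]


A = [[1, -3, 1], [1, 1, -2]]    # T : R^3 -> R^2, matrix from Eg 4.2
B = [[1, 0], [-1, 1], [2, 3]]   # S : R^2 -> R^3, matrix from Eg 4.3


def T(x):
    return matvec(A, x)


def S(y):
    return matvec(B, y)


def ST(x):
    # Definition 4.3: (S o T)(v) = S(T(v))
    return S(T(x))


x = [1, 2, 3]
print(ST(x))                     # apply T, then S
print(matvec(matmul(B, A), x))   # apply the single matrix BA
```

Both printed vectors agree for this x, which is what one expects when S and T are both given by matrices.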
Then ST is again a linear transformation:

ST(v_1 + v_2) = S(T(v_1 + v_2))
= S(T(v_1) + T(v_2)) (T is linear)
= ST(v_1) + ST(v_2) (S is linear)

and similarly ST(λv) = S(λT(v)) = λST(v).

Eg 4.9.

1. Let T : R^n → R^m, S : R^m → R^p be

T(x) = Ax (x ∈ R^n)
S(x) = Bx (x ∈ R^m)

So

ST(x) = S(T(x))
= S(Ax)
= B(Ax)
= BAx

2. Let T : V → V. Define T^2 = T ◦ T : V → V, so T^2(v) = T(T(v)). If T : R^n → R^n, T(x) = Ax, then

T^2(x) = A^2 x
T^3(x) = A^3 x
…
T^k(x) = A^k x

4.5 The matrix of a linear transformation

A linear transformation is a type of function between two vector spaces. We'll show how to associate a matrix with any linear transformation. This will enable us to use matrix theory to study linear transformations.

Let T : V → V be a linear transformation (V a vector space). Let B = {v_1, …, v_n} be a basis of V (finite dimensional). Each T(v_i) ∈ V, so is a linear combination of the v_i's:

T(v_1) = a_{11} v_1 + a_{21} v_2 + · · · + a_{n1} v_n
T(v_2) = a_{12} v_1 + a_{22} v_2 + · · · + a_{n2} v_n
…
T(v_n) = a_{1n} v_1 + a_{2n} v_2 + · · · + a_{nn} v_n

Definition 4.4. The matrix of T (with respect to the basis B) is
\[ [T]_B = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \]

Eg 4.10. T : R^2 → R^2,
\[ T\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2x_1 - x_2 \\ x_1 + 2x_2 \end{pmatrix} = \begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \]
Let B = {e_1, e_2} = {(1, 0)^T, (0, 1)^T}. Work out [T]_B:

T(e_1) = (2, 1)^T = 2e_1 + e_2
T(e_2) = (−1, 2)^T = −e_1 + 2e_2

So
\[ [T]_B = \begin{pmatrix} 2 & -1 \\ 1 & 2 \end{pmatrix} \]
With another basis B′ = {(1, 1)^T, (0, 1)^T} = {v_1, v_2}, what is [T]_{B′}?

T(v_1) = T((1, 1)^T) = (2 − 1, 1 + 2)^T = (1, 3)^T = 1v_1 + 2v_2
T(v_2) = T((0, 1)^T) = (−1, 2)^T = −v_1 + 3v_2

Hence
\[ [T]_{B'} = \begin{pmatrix} 1 & -1 \\ 2 & 3 \end{pmatrix} \]

Eg 4.11. V the vector space of polynomials of degree ≤ 2. Define T : V → V by T(p) = p′. Basis B = {1, x, x^2}.

T(1) = 0
T(x) = 1
T(x^2) = 2x

Hence
\[ [T]_B = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix} \]
Observe that for p(x) = a + bx + cx^2 we have T(p(x)) = b + 2cx, and
\[ [T(p(x))]_B = \begin{pmatrix} b \\ 2c \\ 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \end{pmatrix} = [T]_B \, [p(x)]_B \]

Definition 4.5. Let V be a vector space over F, B = {v_1, . . .
, v_n} a basis of V. Let v ∈ V, v = λ_1 v_1 + · · · + λ_n v_n, λ_i ∈ F. Define the vector of v, [v]_B ∈ F^n, to be

[v]_B = (λ_1, …, λ_n)^T.

Proposition 4.8. Let V be a vector space, T : V → V a linear transformation, B a basis of V and v a vector in V. Then

[T(v)]_B = [T]_B [v]_B.

Proof. Let v = Σ_{i=1}^{n} λ_i v_i, so [v]_B = (λ_1, …, λ_n)^T. Let [T]_B = (a_{ij}), so

T(v_i) = a_{1i} v_1 + a_{2i} v_2 + · · · + a_{ni} v_n.

So
\[ T(v) = \sum_{i=1}^{n} \lambda_i T(v_i) = \sum_{i=1}^{n} \lambda_i (a_{1i} v_1 + \cdots + a_{ni} v_n) = \left( \sum_{i=1}^{n} \lambda_i a_{1i} \right) v_1 + \cdots + \left( \sum_{i=1}^{n} \lambda_i a_{ni} \right) v_n \]
So
\[ [T(v)]_B = \begin{pmatrix} \sum_i \lambda_i a_{1i} \\ \vdots \\ \sum_i \lambda_i a_{ni} \end{pmatrix} = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_n \end{pmatrix} = [T]_B [v]_B \]
□

4.6 Eigenvalues and eigenvectors

Definition 4.6. Let T : V → V be a linear transformation. Say v ∈ V is an eigenvector of T if

(1) v ≠ 0
(2) T(v) = λv for some λ ∈ F

Call λ an eigenvalue of T.

Eg 4.12.

1. V the vector space of polynomials of degree ≤ 2. Define T : V → V by

T(p(x)) = p(x + 1) − p(x).

T is a linear transformation. The eigenvectors of T are the non-zero polynomials p(x) such that

T(p(x)) = λp(x)
p(x + 1) − p(x) = λp(x)
p(x + 1) = (λ + 1)p(x)

2. Let T : R^n → R^n be defined by T(v) = Av, where A is an n × n matrix. Then T(v) = λv iff Av = λv; the eigenvalues and eigenvectors of T are the same as those of the matrix A.

4.6.1 How to find evals / evecs of T?

Let v be an eigenvector of T, so T(v) = λv. Let B be a basis of V. Then

[T(v)]_B = [λv]_B = λ[v]_B.

By 4.8,

[T]_B [v]_B = λ[v]_B.

So the column vector [v]_B is an eigenvector of the matrix [T]_B, and λ is an eigenvalue of this matrix. Hence:

Proposition 4.9. Let T : V → V be a linear transformation and B a basis of the vector space V.

1) The eigenvalues of T are the eigenvalues of the matrix [T]_B.
2) The eigenvectors of T are the vectors v such that [v]_B is an eigenvector of the matrix [T]_B.

Eg 4.13. V = polynomials of degree ≤ 2, T(p(x)) = p(x + 1) − p(x). Find the eigenvectors and eigenvalues of T. Let B = {1, x, x^2}.
So (from one of the previous examples)
\[ A = [T]_B = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix} \]
Characteristic polynomial:
\[ |A - \lambda I| = \begin{vmatrix} -\lambda & 1 & 1 \\ 0 & -\lambda & 2 \\ 0 & 0 & -\lambda \end{vmatrix} = -\lambda^3 \]
So the only eigenvalue is 0. Eigenvectors of A are the non-zero solutions of the equation (A − λI)x = 0. So for λ = 0:
\[ \left(\begin{array}{ccc|c} 0 & 1 & 1 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \]
with solutions x = (a, 0, 0)^T. So the eigenvectors of T are the polynomials a (a ≠ 0), i.e. the non-zero constant polynomials.

4.7 Diagonalisation

Let T : V → V be a linear transformation. Suppose B = {v_1, …, v_n} is a basis of V such that the matrix [T]_B is diagonal. So
\[ [T]_B = \begin{pmatrix} \lambda_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \lambda_n \end{pmatrix} \]
This means

T(v_1) = λ_1 v_1
T(v_2) = λ_2 v_2
…
T(v_n) = λ_n v_n

Proposition 4.10. The matrix [T]_B is diagonal iff B is a basis consisting of eigenvectors of T.

Definition 4.7. A linear transformation T : V → V is diagonalisable if there exists a basis B of V consisting of eigenvectors of T.

Eg 4.14. V = polynomials of degree ≤ 2.

1. T_1(p(x)) = p(x + 1) − p(x). We have shown that the only eigenvectors are the constant polynomials. So there exists no basis of eigenvectors; T_1 is not diagonalisable.

2. T_2(p(x)) = p′(x). Then for B = {1, x, x^2}
\[ [T_2]_B = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{pmatrix} \]
We find, as for T_1, that the only eigenvectors of T_2 are the constant polynomials; T_2 is not diagonalisable.

3. T_3(p(x)) = p(1 − x). Here
\[ [T_3]_B = \begin{pmatrix} 1 & 1 & 1 \\ 0 & -1 & -2 \\ 0 & 0 & 1 \end{pmatrix} \]
Is T_3 diagonalisable?

Eg 4.15. T : R^2 → R^2,
\[ T\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -2 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_2 \\ -2x_1 + 3x_2 \end{pmatrix} \]
Is T diagonalisable? For B = {e_1, e_2} = {(1, 0)^T, (0, 1)^T},
\[ [T]_B = \begin{pmatrix} 0 & 1 \\ -2 & 3 \end{pmatrix} = A \]
We find that the eigenvalues of A are 1, 2, with eigenvectors (a, a)^T and (b, 2b)^T respectively. So there is a basis of eigenvectors of A:

B′ = {(1, 1)^T, (1, 2)^T}.

So
\[ [T]_{B'} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \]
From M1GLA, recall that if (columns equal to the vectors in B′)
\[ P = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \]
then
\[ P^{-1} A P = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \]
i.e. P^{−1} [T]_B P = [T]_{B′}.

General theory:

4.8 Change of basis

V a vector space, B = {v_1, …, v_n} a basis of V. T a linear transformation T : V → V.
Matrix [T]_B = (a_{ij}) (n × n matrix), where

T(v_i) = a_{1i} v_1 + · · · + a_{ni} v_n.

Question: If B′ = {w_1, …, w_n} is another basis of V, we get another matrix [T]_{B′}. What is the relation between [T]_B and [T]_{B′}?

To answer this, write

w_1 = p_{11} v_1 + · · · + p_{n1} v_n
…
w_n = p_{1n} v_1 + · · · + p_{nn} v_n

where p_{ij} ∈ F. The matrix
\[ P = \begin{pmatrix} p_{11} & \cdots & p_{1n} \\ \vdots & \ddots & \vdots \\ p_{n1} & \cdots & p_{nn} \end{pmatrix} \]
is called the change of basis matrix from B to B′.

Proposition 4.11. P is invertible, and

[T]_{B′} = P^{−1} [T]_B P.

Proof. Omitted (in favour of further material). It can be looked up, e.g. in Lipschutz, Linear Algebra. □

Eg 4.16. Let T : R^2 → R^2,
\[ T\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -2 & 3 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_2 \\ -2x_1 + 3x_2 \end{pmatrix} \]
Let

B = {(1, 0)^T, (0, 1)^T}, B′ = {(1, 1)^T, (1, 2)^T}.

Then the change of basis matrix is
\[ P = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} \]
and
\[ [T]_B = \begin{pmatrix} 0 & 1 \\ -2 & 3 \end{pmatrix}, \qquad [T]_{B'} = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \]
So by 4.11, P^{−1} [T]_B P = [T]_{B′}, i.e.
\[ P^{-1} \begin{pmatrix} 0 & 1 \\ -2 & 3 \end{pmatrix} P = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \]
(familiar from M1GLA).

Chapter 5
Error-correcting codes

5.1 Introduction

Everyday language: alphabet: a, b, c, … and words: ahem, won, pea, too, ….

A code is a language for machine communication: alphabet: 0, 1 and codewords: selected strings of 0's and 1's.

Eg 5.1. ASCII code:

A → 1000001
B → 1000010
…
9 → 0111001

Message → (encode) → codewords → (transmission, with noise!) → received words → (decode) → decoded message

Errors in transmission happen (1 in 100 bits). In everyday language errors can usually be corrected:

lunear algebra → linear algebra

Try to do the same with codes.

Eg 5.2. Simplest code: messages YES / NO. Codewords: YES - 111, NO - 000. Decode by taking the majority, e.g. receive 010, decode as 000. This corrects a single error.

Definition 5.1. If w is a string of 1's and 0's, the parity check-bit of w is 1 if the number of 1's in w is odd, and 0 otherwise.

Eg 5.3. The parity check-bit of 11010 is 1, and of 00101 is 0.

Eg 5.4.
Here's a code which communicates 8 messages abc (a, b, c are 0 or 1). Codewords are abcxyz where

x is the parity check-bit for ab
y is the parity check-bit for ac
z is the parity check-bit for bc

So the codewords are 000000, 100110, …

Suppose we receive the string 010110. Here abx = 011 - ok, acy = 001 - wrong, bcz = 100 - wrong. So there is an error. If there is only 1 error, it must be in c, so the corrected codeword is 011110.

Claim: This code can correct any single error.

Proof. The table shows which checks fail when a given bit is wrong:

wrong bit:   a      b      c      x      y      z
abx          wrong  wrong  ok     wrong  ok     ok
acy          wrong  ok     wrong  ok     wrong  ok
bcz          ok     wrong  wrong  ok     ok     wrong

So the pattern of wrong's and ok's pinpoints the error. □

5.2 Theory of Codes

Recall Z_2 = {[0], [1]}. So

0 + 0 = 0
0 + 1 = 1
1 + 1 = 0

Define Z_2^n = {(x_1, …, x_n) | x_i ∈ Z_2}.

Definition 5.2. A (binary) code of length n is a subset C of Z_2^n. Members of C are called codewords.

Eg 5.5. "Triple-check" code

C_3 = {abcxyz | x = a + b, y = a + c, z = b + c}
= {000000, 100110, 010101, 001011, 110011, 101101, 011110, 111000}

Definition 5.3. For x, y ∈ Z_2^n the distance d(x, y) is the number of positions where x and y differ.

Eg 5.6. d(11010, 01111) = 3.

Proposition 5.1 (Triangle Inequality). For x, y, z ∈ Z_2^n,

d(x, y) ≤ d(x, z) + d(z, y).

Proof. Let x = x_1 … x_n, etc. Let

U = {i | x_i ≠ y_i}
S = {i | x_i ≠ y_i, x_i = z_i}
T = {i | x_i ≠ y_i, x_i ≠ z_i}

so d(x, y) = |U|. Then

|U| = |S| + |T|
|S| ≤ d(y, z)
|T| ≤ d(x, z)

so d(x, y) ≤ d(x, z) + d(z, y). □

Definition 5.4. For a code C, the minimum distance of C is

min {d(x, y) | x, y ∈ C, x ≠ y}.

Call it d(C).

5.2.1 Error Correction

Code C. Send a codeword c. Some errors are made and c′ is received. Correct c′ to the nearest codeword. We want this to be c.

Definition 5.5. Let e ≥ 1. We say a code C corrects e errors if, whenever a codeword c ∈ C is sent and at most e errors are made, the received word is corrected to c.

Equivalently:

(1) Let S_e(c) = {w ∈ Z_2^n | d(c, w) ≤ e}.
Then C corrects e errors iff S_e(c) ∩ S_e(c′) = ∅ for all c, c′ ∈ C such that c ≠ c′.

(2) C corrects e errors if for any c, c′ ∈ C and w ∈ Z_2^n

d(c, w) ≤ e, d(c′, w) ≤ e ⇒ c = c′.

Proposition 5.2. If the minimum distance d(C) ≥ 2e + 1, then C corrects e errors.

Proof. Apply (2). Suppose c, c′ ∈ C and w ∈ Z_2^n satisfy d(c, w) ≤ e, d(c′, w) ≤ e. Then by 5.1

d(c, c′) ≤ d(c, w) + d(w, c′) ≤ 2e.

So c = c′, since d(C) ≥ 2e + 1. □

Eg 5.7. d(C_3) = 3 = 2 · 1 + 1, therefore C_3 corrects 1 error.

5.3 Linear Codes

Claim: Z_2^n is a vector space over Z_2 (the scalars are Z_2 = {0, 1}).

Proof. Define addition

(x_1, …, x_n) + (y_1, …, y_n) = (x_1 + y_1, …, x_n + y_n)

and scalar multiplication

λ(x_1, …, x_n) = (λx_1, …, λx_n).

Check the axioms:

(1) Z_2^n is an abelian group under +
(2) the four scalar multiplication axioms hold □

Note 5.1. Unlike R^n, Z_2^n is a finite vector space, having 2^n vectors. It has dimension n, with standard basis e_1, …, e_n.

Definition 5.6. A code C ⊆ Z_2^n is a linear code if C is a subspace of Z_2^n.

By 2.3, this means

(1) 0 ∈ C
(2) c, d ∈ C, then c + d ∈ C
(3) c ∈ C, then λc ∈ C (automatic given (1), as λ can only be 0 or 1).

Eg 5.8.

C_3 = {abcxyz | x = a + b, y = a + c, z = b + c}
\[ = \left\{ x \in \mathbb{Z}_2^6 \;\middle|\; \begin{pmatrix} 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \end{pmatrix} x = 0 \right\} \]
a solution space in Z_2^6.

Aim: Find linear codes C with

• dim C large
• length of C small
• d(C) large

Proposition 5.3. If C is a linear code and dim C = m, then the number of codewords in C is |C| = 2^m.

Proof. Let c_1, …, c_m be a basis of C. So C consists of all linear combinations

λ_1 c_1 + · · · + λ_m c_m (λ_i ∈ Z_2).

There are 2 choices, 0 or 1, for each λ_i. Hence there are 2^m codewords. □

Eg 5.9. C_3 has 8 codewords abcxyz.

5.3.1 Minimum distance of linear code

Definition 5.7. For w ∈ Z_2^n define the weight

wt(w) = number of 1's in w.

Eg 5.10. wt(1101010) = 4.

Note 5.2. wt(w) = d(w, 0).

Proposition 5.4. For x, y, z ∈ Z_2^n, d(x, y) = d(x + z, y + z).

Proof. Notice that x + z is x changed in the places where z has a 1.
Similarly y + z is y changed in the places where z has a 1. So x + z, y + z differ in the same places where x, y differ. □

Main Result

Proposition 5.5. If C is a linear code,

d(C) = min {wt(c) | c ∈ C, c ≠ 0}.

Proof. Let c ∈ C, c ≠ 0, be a codeword of smallest weight, say wt(c) = r. Then d(c, 0) = r, hence

d(C) ≤ r.   (∗)

Let x, y ∈ C, x ≠ y. Then

d(x, y) = d(x + y, y + y) (by 5.4)
= d(x + y, 0)
= wt(x + y)
≥ r (as x + y ∈ C and x + y ≠ 0)

So d(C) ≥ r. Hence, by (∗), d(C) = r. □

Eg 5.11. C_3 = {000000, 100110, …} has min(wt(c)) = 3, hence d(C_3) = 3.

5.4 The Check Matrix

Definition 5.8. Let A be an m × n matrix with entries in Z_2. If C is the linear code

C = {x ∈ Z_2^n | Ax = 0}

then A is a check matrix for C.

Eg 5.12.

1. A check matrix for C_3 is
\[ \begin{pmatrix} 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \end{pmatrix} \]

2. Let
\[ A = \begin{pmatrix} 1 & 0 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{pmatrix} \]
check matrix of

C = {x ∈ Z_2^4 | Ax = 0}
= {x ∈ Z_2^4 | x_3 = x_1, x_4 = x_1 + x_2}

C sends 4 messages x_1 x_2 with two "check-bits" x_3 x_4.

C = {0000, 1011, 0101, 1110}
dim C = 2
d(C) = 2

So C does not correct any errors.

Proposition 5.6. If C is a linear code of length n with check matrix A,

C = {x ∈ Z_2^n | Ax = 0},

then dim C = n − rank(A).

Proof. This is Proposition 4.7 (rank-nullity). □

Eg 5.13. Find dim(C), where C = {x ∈ Z_2^6 | Ax = 0}.

[The entries of the check matrix A (4 rows, 6 columns, over Z_2) and of its reduction A → · · · to echelon form are not recoverable here.]

So rank(A) = 3, dim C = 6 − 3 = 3.

Relation between check matrix and min. distance

Proposition 5.7. Let C be a linear code C = {x ∈ Z_2^n | Ax = 0} with check matrix A.

(1) If all columns of A are different, and no column is zero, then C corrects 1 error.
(2) Let d ≥ 1. Suppose every set of d − 1 columns of A is linearly independent. Then d(C) ≥ d.

Proof.

(1) We must show that d(C) ≥ 3, i.e. wt(c) ≥ 3 for every c ∈ C \ {0}. If there is a codeword c of weight 1, then c = e_i = (0, …, 1, …, 0)^T (with the 1 in position i)
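, and 0 = Ac = Ae_i, which is the i-th column of A. Contradiction.

If there is a codeword c of weight 2, then c = e_i + e_j (i ≠ j), so

0 = Ac = Ae_i + Ae_j = column i + column j,

i.e. column i equals column j. Contradiction.

(2) Let c ∈ C, c ≠ 0, have weight k, with 1's in positions i_1, …, i_k. Then 0 = Ac = column i_1 + · · · + column i_k, so these k columns are linearly dependent. If k ≤ d − 1, they lie in a set of d − 1 columns, which is then linearly dependent. Contradiction. So wt(c) ≥ d for all non-zero c ∈ C, and d(C) ≥ d by 5.5. □

Eg 5.14. To correct two errors, we need d(C) ≥ 5. So the condition in (2) is that any 4 columns are linearly independent.

Eg 5.15.

1. There are 7 non-zero column vectors in Z_2^3. Write them down as the columns of a matrix: let
\[ A = \begin{pmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1 \end{pmatrix} \]
and

H = {x ∈ Z_2^7 | Ax = 0}
= {x_1 … x_7 | x_5 = x_1 + x_2 + x_3, x_6 = x_1 + x_2 + x_4, x_7 = x_1 + x_3 + x_4}

dim H = 4
|H| = 16

H corrects one error by 5.7(1). The code H is Ham(3), a Hamming code.

5.5 Decoding

Say C = {x ∈ Z_2^n | Ax = 0} corrects 1 error. Send c ∈ C and make 1 error, in the i-th bit. The received word is

c′ = c + e_i.

How to find i?

Ac′ = Ac + Ae_i = 0 + Ae_i = i-th column of A!

So we correct the i-th bit, where Ac′ is the i-th column of A.

Eg 5.16. With the code H as above, receive w = 0111010.
\[ A w^T = \begin{pmatrix} 1 & 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 \\ 1 \\ 1 \\ 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \]
This is the 6-th column of A, so the error is in the 6-th bit and the corrected codeword is 0111000.

THE END.

as1005@ic.ac.uk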
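As a postscript to Section 5.5, the decoding rule "correct the i-th bit, where Ac′ is the i-th column of A" can be sketched in Python for the code H of Eg 5.15. The function names syndrome and correct_single_error are hypothetical (they do not come from the notes), words are represented as lists of 0/1 integers, and the sketch assumes at most one bit was flipped in transmission.

```python
# Check matrix A of the code H = Ham(3) from Eg 5.15, one list per row.
A = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]


def syndrome(A, w):
    """Return the product A w computed over Z_2, as a list of bits."""
    return [sum(a * x for a, x in zip(row, w)) % 2 for row in A]


def correct_single_error(A, w):
    """Correct w assuming at most one bit was flipped (Section 5.5)."""
    s = syndrome(A, w)
    if not any(s):
        return list(w)  # A w = 0, so w is already a codeword
    columns = [[row[j] for row in A] for j in range(len(w))]
    i = columns.index(s)  # position whose column equals A w
    fixed = list(w)
    fixed[i] ^= 1  # flip the i-th bit
    return fixed


received = [0, 1, 1, 1, 0, 1, 0]  # w = 0111010 from Eg 5.16
print(syndrome(A, received))
print(correct_single_error(A, received))
```

Because the columns of A are exactly the 7 distinct non-zero vectors of Z_2^3, every non-zero value of Aw matches exactly one column, so the lookup in the sketch always succeeds for this particular A.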