Download M1GLA: Geometry and Linear Algebra Lecture Notes

Document related concepts

Linear least squares (mathematics) wikipedia , lookup

Symmetric cone wikipedia , lookup

Rotation matrix wikipedia , lookup

Laplace–Runge–Lenz vector wikipedia , lookup

Euclidean vector wikipedia , lookup

Determinant wikipedia , lookup

Matrix (mathematics) wikipedia , lookup

Non-negative matrix factorization wikipedia , lookup

Exterior algebra wikipedia , lookup

Vector space wikipedia , lookup

Singular-value decomposition wikipedia , lookup

Covariance and contravariance of vectors wikipedia , lookup

Jordan normal form wikipedia , lookup

Orthogonal matrix wikipedia , lookup

Perron–Frobenius theorem wikipedia , lookup

Gaussian elimination wikipedia , lookup

Cayley–Hamilton theorem wikipedia , lookup

System of linear equations wikipedia , lookup

Eigenvalues and eigenvectors wikipedia , lookup

Matrix multiplication wikipedia , lookup

Four-vector wikipedia , lookup

Matrix calculus wikipedia , lookup

Transcript
Geometry & Linear Algebra
Lectured in Autumn 2014 at Imperial College by Prof. A. N. Skorobogatov.
Humbly typed by Karim Bacchus.
Caveat Lector: unofficial notes. Comments and corrections should be sent to
[email protected]. Other notes available at wwwf.imperial.ac.uk/∼kb514.
Syllabus
This course details how Euclidean geometry can be developed in the framework
of vector algebra, and how results in algebra can be illuminated by geometrical
interpretation.
Geometry in R2 : Vectors, lines, triangles, Cauchy-Schwarz inequality, Triangle
inequality.
Matrices and linear equations: Gaussian elimination, matrix algebra, inverses,
determinants of 2 × 2 and 3 × 3 matrices. Eigenvalues and eigenvectors, diagonalisation of matrices and applications. Conics and quadrics: Matrix methods,
orthogonal matrices and diagonalisation of symmetric matrices.
Geometry in R3 : Lines, planes, vector product, relations with distances, areas
and volumes.
Vector spaces: Axioms, examples, linear independence and span, bases and
dimension, subspaces.
Appropriate books
J. Fraleigh and R. Beauregard, Linear Algebra
S. Lipschutz and M. Lipson, Linear Algebra
Contents
0 Introduction
3
1 Geometry in R2
1.1 Lines in R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.2 Triangles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.3 Angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4
4
8
10
2 Matrices
12
2.1 System of Linear Equations . . . . . . . . . . . . . . . . . . . . 12
2.2 Gaussian Elimination . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3 Matrix Algebra
19
3.1 Square matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4 Applications of Matrices
26
4.1 Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.2 Markov Chains . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5 Inverses
29
5.1 Transpose . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
6 Determinants
33
6.1 3 × 3 determinants . . . . . . . . . . . . . . . . . . . . . . . . . 33
7 Eigenvalues and Eigenvectors
37
7.1 Diagonalisation of Matrices . . . . . . . . . . . . . . . . . . . . 39
8 Conics and Quadrics
8.1 Standard Forms of Conics . .
8.2 Reducing with Trigonometry
8.3 Reducing with Eigenvectors .
8.4 Quadric Surfaces . . . . . . .
8.5 Reducing Quadrics . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
46
46
48
51
56
58
9 Geometry in R3
61
9.1 Planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
9.2 Rotations in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
10 Vector spaces
68
10.1 Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
10.2 * Error correcting codes * . . . . . . . . . . . . . . . . . . . . . 79
0
Introduction
Geometry & Linear Algebra
0 Introduction
Geometry Greek approach
Lecture 0
• Undefined objects (points, lines)
• Axioms, e.g. through any two distant points in a plane there is exactly
one line.
• Using logic you deduce theorems
• Proof
Contractibility
Using only a ruler and a compass e.g. perpendicular bisector, 60◦ (... can you
construct 20◦ ? )
What can we construct using a ruler and compass, once we are given a unit
length?
• Rational numbers are constructable
• So are square roots
1
√
2
1
√
But can you construct 3 2?
Modern Approach: Cartesian (Rene Descartes)
Plane ←→ the set of points (x1 , x2 ), x1 , x2 ∈ R
Line ←→ a set of points satisfying ax1 + bx2 = c for some given a, b, c ∈ R
e.g. a circle x21 + x22 = r
Space ←→ {(x1 , x2 , x3 ) x1 , x2 , x3 ∈ R}
Planes ←→ {ax1 + bx2 + cx3 = d}
Lines ←→ {ax1 + bx2 + cx3 = d}
Advantages:
(i) Can use Analysis
(ii) Can do any dimension.
(iii) Can replace R by C or by binary fields, i.e. {0, 1} with binary operations
1 + 1 = 0.
3
1
Geometry in R2
Geometry & Linear Algebra
1 Geometry in R2
R is the set of real numbers:
Lecture 1
− 32
0
1
√
2
π
This has the properties of being:
• Commutative: ab = ba
• Associative: a + (b + c) = (a + b) + c
• Distributive: c(a + b) = ca + cb
• If a < b, c > 0 then ca < cb
Definition. R2 = the set of ordered pairs (a, b) where a, b ∈ R i.e. (a, b) ̸=
(b, a) in general. Elements of R2 are points, or vectors. 0 = (0, 0) = origin
of R2 .
A scalar is an element of R.
Properties of R2 :
(i) Addition (sum): (a, b) + (a′ , b′ ) = (a + a′ , b + b′ )
(a + a′ , b + b′ )
R
(a′ , b′ )
(a, b)
R
(ii) Scalar Multiplication: If λ = R, (a, b) ∈ R2 , then
λ · (a, b) = (λa, λb)
v ∈ R2 =⇒ λv ∈ R2
Exercise: Check λ(v1 + v2 ) = λv1 + λv2 where λ ∈ R, v1 , v2 ∈ R2 .
1.1 Lines in R2
Definition. L is a line in R2 if ∃u, v ∈ R2 v ̸= 0 with
L = {u + λv | λ ∈ R}
So if u = (a1 , b1 ), v = (a2 , b2 ) then L = {(a1 + λb1 , a2 + λb2 ) | λ ∈ R}.
4
1
Geometry in R2
Geometry & Linear Algebra
Example 1.1.
y
x line is {(0, 0) + λ(1, 0) | λ ∈ R}
L
y line is {(0, 0) + λ(0, 1) | λ ∈ R}
x
We draw the line
L = {x + y = 1}
= {(1, 0) + λ(−1, 1) | λ ∈ R}.
Note that v is the vector for which L is parallel to.
Checking the components
x = 1 + (−1)λ = 1 − λ
y =0+1·λ=λ
}
x+y =1
Assume L, M are two lines in R2 , with
L = {u = λv | λ ∈ R}, v ̸= 0
M = {a + µb | µ ∈ R}, b ̸= 0
Proposition 1.2. The two lines are the same (L = M ) if and only if the
following holds:
(i) v = αb for some α ∈ R
(ii) L ∩ M ̸= ∅ (L and M have a point in common)
Proof. ( =⇒ )
(i) Assume that L = M . We know that
u ∈ L =⇒ u ∈ M
=⇒ u = a + µb for some µ ∈ R
Also
u + v = L (λ = v) =⇒ u + v ∈ M
=⇒ u + v = a + µ1 b for some µ1 ∈ R
=⇒ v = (a + µ1 b) − (a − µb) = (µ1 − µ)b = αb.
(ii) Since L = m, surely L ∩ M = ∅.
5
Geometry in R2
1
Geometry & Linear Algebra
( ⇐= ) Assume v = αb =⇒ L = {u + λαb | λ ∈ R}. We also know that
L ∩ M ̸= ∅ =⇒ ∃c ∈ L ∩ M
=⇒ c ∈ L
=⇒ c = u + λ0 αb, for some λ0 ∈ R.
Then also
c ∈ M =⇒ c = a + µb, for some µ ∈ R
=⇒ u + λ0 αb = a + µb
=⇒ u = a + µb − λ0 αb = a + (µ − λ0 α)b
(∗)
Let’s consider a point inside L:
L = u + λαb
= a + (µ − λ0 α)b + λαb
(using ∗)
= a + (µ − λ0 α + λα)b ∈ M
So any point in L is inside M , so L ⊆ M . By symmetry M ⊆ L =⇒ L =
M.
”
Parallel Lines
Let
L = {u = λv | λ ∈ R}, v ̸= 0
M = {a + µb | µ ∈ R}, b ̸= 0
Definition. L and M are parallel if ∃ α ∈ R such that b = αv.
Proposition 1.3. Let x, y be two distinct points of R2 . Then there exists a
unique line which contains both x and y.
Proof. (Existence) Choose u − x, v = y − x. Then let
L = {x + λ(y − x) | λ ∈ R}
N.B. v = y − x ̸= 0. We want to show that x, y ∈ l. Chose λ = 0, x + λ(y =
x) = x ∈ L. Also choose λ = 1, x + λy − λx = y ∈ L. So x, y are contained in
a line.
(Uniqueness) L = {u + lv | l ∈ R}. L is a line through x and y. x ∈ L =⇒
x = u + l1 v for l1 ∈ R. y ∈ L =⇒ y = u + l2 v for l2 ∈ R.
y − x = (l2 − l1 )v =⇒ v =
y−x
(l2 − l1 )
The vector that defined the direction of L is proportional to the vector y − x.
By the question we answered before, it’s enough to show that the two lines
have a point in common. x, y are points in common =⇒ Two lines are the
same.
”
6
1
Geometry in R2
Problem: Construct
Geometry & Linear Algebra
√
x.
Lecture 2
C
α
Claim: |CD| =
√
β
1
A
|AD| = 1
|BD| = x
D
x
B
x.
Proof. The angle ACD = 90◦ , since AB is the diameter of a circle =⇒
α + β = 90◦ . Note △ABC is similar to △DBC, since CAD = DCB =
α, ACD = CBD = 90◦ and CBA = CBD = β. So we have a proportion:
|BD|
|CD|
=
|CD|
|AD|
√
x
|CD|
=
=⇒ |CD| = x
|CD|
1
”
Recall: The plane R2 is the set of vectors (x1 , x2 ), xi ∈ R.
For lines
L = {u = λv | λ ∈ R}, v ̸= 0
M = {a + µb | µ ∈ R}, b ̸= 0
L || M ⇐⇒ v and b are proportional.
L = M ⇐⇒
(L ∩ M ̸= ∅).
v and b are proportional and they have a common point
Proposition 1.4. Through any two points, there is a unique line.
Proposition 1.5. Any two non-parallel lines meet in a unique point.
Proof. See exercise sheet.
Example 1.6. Let
L = {(0, 1) + λ(1, 1)}
M = {(4, 0) + µ(2, 1)}
The common point = (0, 1) + λ(1, 1) = (4, 0) + µ(2, 1).
Equating x and y’s, we see from the first co-ordinate: λ = 4 + 4µ. From the
second co-ordinate: 1 + λ = µ.
7
1
Geometry in R2
Geometry & Linear Algebra
So
λ = 4 + 2(λ + 1)
=⇒ λ = −6
Therefore the common point is (−6, −5).
1.2 Triangles
Definition. A triangle is a set of 3 points in R2 that are not on a common
line.
c
b+c
2
a+b
2
a+c
2
a
b
Definition (Midpoint). The midpoint of ab is the vector 21 (a + b).
Definition (Median). A median is the line joining a corner to the midpoint
of the opposite side.
Proposition 1.7. The three medians of the triangle meet in a common point.
Proof. The medians are
La = {a + λ( b+c
2 − a)}
a+c
Lb = {b + λ( 2 − b)}
Lc = {c + λ( a+b
2 − c)}
We can write them as
La = {(1 − λ)a + λ2 b + λ2 c}
Lb = {(1 − µ)b + µ2 a + µ2 c}
Take λ = µ, then
La = Lb .
Check: λ =
La ∩ Lb ∩ Lc .
2
3
µ
2c
=⇒
=
λ
2 c.
Also
λ
2
= 1 − µ = 1 − λ =⇒ λ =
La = Lb . So the common point is
1
3 (a
Hence
+ b + c) ∈
”
Definition (Distance). The length of
√a vector x = (x1 , x2 ) is
We denote the length of x by ||x|| = x21 + x22 .
8
2
3.
√
x21 + x22 .
1
Geometry in R2
Geometry & Linear Algebra
The distance between two points x, y ∈ R2 is
√
||x − y|| = (x1 − y1 )2 + (x2 − y2 )2
Definition (Scalar product). The scalar product (or dot product) of two
vectors x, y ∈ R2 is
(x · y) = x1 y1 + x2 y2
E.g. If x = (1, −1), y = (1, 2) then (x · y) = 1 + (−2) = −1.
Easy properties: For any vectors x, y, z ∈ R2
• x · (y + z) = x · y + x · z (distribution over +)
• ||x||2 = (x · x) = x21 + x22
• (x · y) = (y · x) (commutativity)
Proposition 1.8 (Cauchy-Schwartz Inequality). For x, y ∈ R2
|(x · y)| ≤ ||x|| · ||y||
With equality iff x, y are multiples of the same vector.
Proof. w.l.o.g. y ̸= (0, 0). Consider all vectors of the form x − λy. The length
||x − λy|| squared is clearly ≥ 0, so
||x − λy||2 = (x − λy · x − λy)
= (x · x) − 2λ(x · y) + λ2 (y · y)
= ||x||2 − 2λ(x · y) + λ2 ||y|| ≥ 0
This is a quadratic polynomial in λ with real coefficients and at most one real
root (= 0). Now
λ2 ||y||2 − 2λ(x · y) + ||x||2 ≥ 0 =⇒ discriminant, D ≤ 0
=⇒ ((2(x · y))2 ≤ 4||x||2 ||y||2
=⇒ 4(x · y)2 ≤ 4||x||2 ||y||2
=⇒ |x · y| ≤ ||x||||y||
The equality condition is clear for y = 0. Assume that y is not the zero vector.
If |(x · y)|2 = ||x|| · ||y||, then D = 0.
D = 4(x · y)2 − 4||x||2 · ||y||2 = 0
Then aλ2 +bλ+c has a double root, say λ0 . But aλ20 +bλ0 +c = ||x−λ0 y||2 = 0.
Therefore x − λ0 y is the zero vector. Thus x = λ0 y.
”
Proposition 1.9 (Triangle Inequality). For any vectors x, y ∈ R2 , we have
||x + y|| ≤ ||x|| + ||y||
If equality holds, then x and y are multiples of the same vector.
9
Lecture 3
1
Geometry in R2
Geometry & Linear Algebra
x+y
y
x
Proof. It is enough to prove that the square on LHS ≤ RHS, i.e.
||x + y||2 ≤ ||x||2 + ||y||2 + 2||x|| · ||y||
We’ve already seen that
||x + y||2 = (x + y · x + y) = ||x||2 + ||y||2 + 2(x · y)
By Cauchy-Schwartz (x · y) ≤ ||x|| · ||y||, hence the inequality is proved. To
prove the equality condition, we note that the triangle inequality is an equality
precisely when (x · y) = ||x|| + ||y||. By Theorem 1.4, we conclude that x and
y are multiples of the same vector.
”
Recall the scalar product of x as ||x||2 = (x · x). The scalar product can be
reconstructed if we only know the lengths of vectors:
Proposition 1.10. For any two vectors x, y ∈ R2 , we have
(x · y) =
1
(||x||2 + ||y||2 − ||x − y||2 )
2
Proof.
1
(||x||2 + ||y||2 − ||x||2 − ||y||2 + 2(x · y)
2
= (x · y)
RHS =
”
1.3 Angles
From the Cauchy-Schwartz inequality, we find that −1 ≤
(x · y)
≤ 1.
||x|| · ||y||
Definition. We define θ as the angle such that cos θ =
y
θ
x
10
(x · y)
||x|| · ||y||
1
Geometry in R2
Geometry & Linear Algebra
In particular, x and y are perpendicular iff (x · y) = 0.
Lines in R2 can be written as L = {u + λv | λ ∈ R}. This will be referred
to as vector form. Lines can also be described by their Cartesian equation:
px1 + qx2 + r = 0, where p1 , q1 ∈ R.
Definition. Any vector perpendicular to the direction vector of a line is
called a normal vector to L.
n
L
If L = {u + λv}, v = (v1 , v2 ), then n = (−v2 , v1 ) is a normal vector to L.
Remark. Let n ∈ R2 be any (non-zero) normal vector to L = {u + λv}. Then
all points x ∈ L have the property that (x · n) is constant. Indeed, x = u + λv,
hence
(x · n) = ((u + λv) · n) = (u · n) + λ(v · n) = u · n.
| {z }
=0
Therefore every point x of L satisfies the equation n1 x1 + n2 x2 = c, where c
is a constant (c = u1 n1 +u2 n2 ). L can be given by −v2 x1 +v1 x2 = c for some c.
From Cartesian to Vector form
Start with L which is the zero set of px1 + qx2 + r = 0. Now (p, q) is a normal
vector to L, hence v = (q, −p) is a possible direction vector of L. ((−q, p) is
another possibility.) Then L = {u + λv | λ ∈ R} for an appropriate vector of
a. px1 + qx2 + r = 0. ((p, q) · x) = constant. (this constant is −r).
The lines L = {u+λv} and M = {a+µb} are perpendicular iff (v ·b) = 0. The
lines px1 + qx2 + r = 0 and ex1 + f x2 + g = 0 are perpendicular iff the normal
vectors to these lines e.g. (p, q) and (e, f ) are perpendicular. This happens
when (p, q) · (e, f ) = 0.
Claim: The vector (p, q) is a normal vector to L, given by px1 + qx2 + r = 0.
Proof. We must show that (p, q) is perpendicular to (x − y), where x ∈ L and
y ∈ L.
Consider
y
x
px1 + qx2 + r = 0
(1)
py1 + qy2 + r = 0
(2)
(1) − (2) : p(x1 − y1 ) + q(x2 − y2 ) = 0
=⇒ (p, q) · (x1 − y1 , x2 − y2 ) = 0
Hence (p, q) · (x − y) = 0. So that (p, q) is perpendicular to (x − y).
11
”
Lecture 4
2
Matrices
Geometry & Linear Algebra
2 Matrices
Definition. Let n be a natural number, n = 1, 2, 3, . . . . Then Rn is the
set of all vectors with n co-ordinates (x1 , . . . , xn ).
E.g. R is the real line. R2 is the real plane. R3 is 3-dimensional space etc.
Definition (Vector Operations).
Vector addition:
(x1 , x2 , . . . , xn ) + (y1 , y2 , . . . , yn )
= (x1 + y1 , x2 + y2 , . . . , xn + yn )
Multiplication by scalar α ∈ R:
α(x1 , . . . , xn ) = (αx1 , . . . , αxn )
Dot product of two vectors:
(x1 , . . . , xn ) · (y1 , . . . , yn ) = (x1 y1 + x2 y2 + . . . xn yn )
2.1 System of Linear Equations
A linear equation of n vairables is of the form a1 x1 + . . . an xn = b, where
a1 , . . . , an , b ∈ R. We want to solve systems of such linear equations.
Example 2.1.
x1 + x2 = 3
−2x1 + x2 = 1
(1)
(2)
2∗(1)+(2) elimated x, leaving 3x2 = 7 =⇒ x2 = 73 . Then x1 = 3−x2 = 23 .
Example 2.2.
x1 − 2x2 = 2
(1)
3x1 − 6x2 = 4
(2)
−3 × (1) + 2 is 0 = −2. So this system has no solutions.
Example 2.3.
x1 + x2 = 0
x2 − x3 = 0
(1)
(2)
x1 − x2 + 2x3 = 0
(3)
12
2
Matrices
Geometry & Linear Algebra
Observe (3) = (1) − 2 × (2). So (1) and (2) are enough here, so (3) is
automatically satisfied if (1) and (2) are satisfied. So x3 = a ∈ R, any real
number. Then (2) says that x2 = a. Then (1) says that x1 = −a. So this
system has infinitely many solutions, which are (−a, a, a) for any a ∈ R.
Consider a general system of linear equations of n variables x1 , . . . , xn :
a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
..
.
am1 x1 + am2 x2 + · · · + amn xn = bm
A vector (y1 , . . . , yn ), yi ∈ R is a solution of the system if all the equations
are satisfied when we put x1 = y1 , . . . , xn = yn .
2.2 Gaussian Elimination
Example 2.4.
x1 + 2x2 + 3x3 = 9
4x1 + 5x2 + 6x3 = 24
(1)
(2)
3x1 + x2 − 2x3 = 4
(3)
Step 1: Eliminate x1 from (2) and (3). We replace (2) by (2) − 4 × (1) and
replace (3) by (3) − 3 × (1). We obtain:
x1 + 2x2 + 3x3 = 9
−3x2 − 6x3 = −12
−5x2 − 11x3 = −23
Step 2: Make the coefficent of x2 in (2) equal to 1.
x1 + 2x2 + 3x3 = 9
x2 + 2x3 = 4
−5x2 − 11x3 = −23
STEP 3: Eliminate x2 from equation (3) by adding 5 × (2) to (3):
x1 + 2x2 + 3x3 = 9
x2 + 2x3 = 4
−x3 = −3
Step 4: Make the coefficient of x3 = 1 =⇒ x3 = 3.
13
2
Matrices
Geometry & Linear Algebra
Step 5: Work backwards to obtain first x2 from (2) and then x1 from (1).
=⇒ x2 = 4 − 2x3 = −2 and x1 = 9 − 2x2 − 3x3 = 4. Finally we’ve solved
the system and we have found all the solutions of the system.
2.3 Matrices
Definition. A m × n matrix is a rectangular array of mn real numbers,
arranged into m rows and n columns.


a11 a12 . . . a1n
 a21 a22 . . . a2n 


 ..
.. 
 .
. 
am1
am2
...
amn
The matrix entry is written down as aij where i is the number of the row
and j is the number of the column. Sometimes this matrix is denoted by
(aij ) or (aih )1 ≤ 1 ≤ m, 1 ≤ j ≤ n.
Basic examples of matrices
- Any n-dimensional vector is a matrix of size 1 × n. (1, . . . , n) is a matrix
with one row.
 x1 
x2
- A column vector is  .. . This is a matrix of size m × 1.
.
xm
Other examples

2 5
1 −1 5
has size 2 × 3 and 6 7 has size 3 × 2.
2 0 1
0 3


1 0 0 0
0 1 0 0 

-
0 0 1 0 is the identity matrix (of size 4 × 4).
0 0 0 1
(

)
A matrix such that m = n is called a square matrix.
To our general linear system we attach its co-efficient matrices:


 
a11 a12 . . . a1n
b1
 a21 a22 . . . a2n 


 .. 
 ..
..  and  . 
 .
. 
bm
am1 am2 . . . amn

a11
 a21

=⇒  .
 ..
a12
a22
...
...
a1n
a2n
..
.
am1
am2
. . . amn

b1
b2 

 , the augmented matrix.

bm
14
2
Matrices
Geometry & Linear Algebra
Definition (Elementary Row Operations). The following operations are
the elementary row operations.
(i) Add a scalar multiple of one row to another: ri −→ ri + λrj , λ ∈ R.
(ii) Swap two rows (ri ↔ rj )
(iii) Multiply a row by a non-zero number: ri −→ λri , λ ∈ R, λ ̸= 0.
Point: The new equations are combinations of the old ones, and vice-versa, so
that the set of solutions is exactly the same.
Example 2.4 (Revisited).


3
9
6 24
−2 4
1 2
4 5
3 1
Step 1: r2 −→ r2 − 4r1 , r3 −→ r3 − 3r1 .


1 2
3
9
0 −3 −6 −12
0 −5 −11 −23
Step 2: r2 −→ − 13 r2 .

Step 3: r3 −→ r3 + 5r2 .
1 2
0 1
0 −5
3
2
−11

1
0
0
3
2
−1
2
1
0

9
4 
−23

9
4
−3
Step 4: r3 −→ −1r3

1 2 3
0 1 2
0 0 1
{z
|

9
4
3
}
x1 + 2x2 + 3x3 = 9
=⇒
x2 + 2x3 = 4
x3 = 3
Echelon form
Definition. A matrix is in echelon form if
(i) The first non-zero entry in each row is appears to the right on the
non-zero entry in all the rows above
15
Lecture 5
2
Matrices
Geometry & Linear Algebra
(ii) All rows consiting only of zeros (0, 0, . . . , 0), if any, appear at the
bottom.
Examples of echelon form:
(
0
0
1
0

)
5 0
Yes
5 −3
6 2
0 0
0 0
Non-examples
(
0 1
1 0
)
17
No
2
(
0 1
0 1
17
5

0
5 Yes
0
4
1
0

)
No
0 0
1 2
0 1

0
5 No
2
Remark. If we reduce the augmented matrix of a system of linear equations
to the echelon form, then we can easily solve it.
Example 2.5.


1 −1 3 0
0 1 1 1 
0 0 0 0
If we have a row (0, . . . , 0, a) where a ̸= 0 then there are no solutions. We
say that the linear system is inconsistent since 0 · x1 + 0 · x2 + . . . 0 · xn = a.
Now ignore all zero-rows. To solve this linear system, we start at the bottom:
x2 + x3 = 1. x3 = a, where a is any real number. x1 = 1 − a.
Express the variable corresponding to the first non-zero entry in this row
in terms of the variables that come after it. x1 − x2 + 3x3 = 0. Thus x1 =
x2 −3x3 = 1−a−3a = 1−4a. The general solution is (1−4a, 1−a, a), a ∈ R.
The variables that are assigned arbitrary values are called free variables, and
all other variables are dependent variables.
Example 2.6.


1 −1 3 0
0 1 1 1 
0 0 0 2
0 · x1 + 0 · x2 + 0 · x3 = 2. This is inconsistent, so no solutions.
Example 2.7.

1
0
0
1
0
0
1
0
0
3
1
0

2 4 0
−1 0 3
1 1 0
x5 + x6 = 0, so let x6 = a, a ∈ R. Then x5 = −a. x4 − x5 = 3 =⇒ x4 =
3 − a.
x1 + x2 + x3 + 3x4 + 2x5 + 4x6 = 0. We get new free variables: x2 = b, b ∈ R
and x3 = c, for any c ∈ R. =⇒ x1 = −b − c + a − 9 =⇒ (a − b − c −
16
2
Matrices
Geometry & Linear Algebra
9, b, c, −a − 3, −a, a).
Theorem 2.8: Gaussian Elimination
Any matrix can be reduced to a matrix in echelon form by elementary
row operations.
Example:

1 0
1 0

1 1
1 −1
3
3
4
2

5 1
8 1

7 0
3 2
2
5
5
−1
Step 1: Clear the first column.

1 0
3
2
0 0
0
3

0 1
1
3
0 −1 −1 −3

5
1
3
0

2 −1
−2 1
Step 2: Swap 2nd row with

1
0

0
0
4th row

5
1
−2 1 

2 −1
3
0
Step 3: r3 −→ r2 + r3

0
3
2
−1 −1 −3
1
1
3
0
0
3
1
0

0
0
0
1
0
0
3
1
0
0
2
3
0
3
5
2
0
3

1
1

0
0
and we’re done.
Now for the proof of the theorem:

a11
 a21

Proof. Let A =  .
 ..
...
...
am1
...

a1n
a2n 

..  be an arbitrary m × n matrix.
. 
amn
The basic procedure consists of two steps:
Step 1: Find a non-zero entry furthest to

0 0 ... 0
0 0 . . . 0

 .. ..
..
. .
.
0 0
...
the left in A. So A looks like this:

∗ ...
∗ . . .



∗
0 ∗ ...
If this non-zero element is in the row with number j, then swap the first row
with the j-th row (ri ↔ rj ). Multiply this new row 1 to make this non-zero
element equal to 1.
17
Lecture 6
2
Matrices
Geometry & Linear Algebra
Outcome of Step 1:

0 ...
0 . . .

 ..
.
0 ...
0 1 ...
0 ∗ ...
..
. ∗
0 ∗ ...

∗
∗



∗
Step 2 Subtracting multiples of the first row from other rows to ensure that
then entries underneath this 1 are all equal to 0.
Outcome of Step 2:









0 ...
0 ...
..
.
0 ...
0
0
..
.
0
1 ∗ ... ∗ 
0 ∗ ... ∗ 



..

.
A1

0 ∗ ... ∗
From now on we only use row operations involving rows from 2 to m. (never
use row 1 anymore). We’ll apply steps 1 and 2 to the “red” matrix A1 . (A1
is A with the first j columns and 1st row removed). The outcome of steps 1
and 2 applied to A1 is:










0
0
0
..
.
0
... 0 1 ∗
... 0 0 1
... 0 0 0
.. .. ..
. . .
... 0 0 0

... ∗ ∗ 
... ∗ ∗ 

∗ ... ∗ 




A2

∗ ... ∗
Let us now consider A2 and repeat steps 1 and 2 for A2 and so on. The
number of rows of A1 is m − 1, the number of rows of A2 is m − 2, . . . . Hence
after finitely many iterations, we’ll stop. In the end the matrix will be in the
echelon form!
”
18
3
Matrix Algebra
Geometry & Linear Algebra
3 Matrix Algebra
Definition. The zero matrix of size

0
0 = 0
0
m × n is

... 0
. . . 0
... 0
Addition
If A and B are both m × n matrices, with entries aij and bij respectively, then
A + B is the matrix of size m × n with entries aij + bij :


a11 + b11 . . . a1n + b1n


..
..


.
.
am1 + bm1
. . . amn + bmn
Multiply matrices by a scalar:
If c ∈ R and A is a matrix of size m × n, with A = (aij ) then cA is the matrix
of size m × n with entries (caij ).
(
)
(
)
a11 a12
ca11 ca12
If A =
cA =
a21 a22
ca21 ca22
Matrix Multiplication:
Let A be a matrix of size m × n and B a matrix of size n × k. Then AB is a
matrix of size m × k defined as follows:


← r1 (A) →


↑
↑
↑
← r2 (A) →


B = c1 (B) c2 (B) . . . ck (B)
A=

..


.
↓
↓
↓
← rk (A) →
Each if the row vectors ri (A) has n co-ordinates. Each of the column vectors ri (B) has n co-ordinates. Because each of ri (A) and cj (B) has exactly n co-ordinates, we can form the dot product ri (A) · cj (B) for any
1 ≤ i ≤ m 1 ≤ j ≤ k. The ij−entry of AB is defined to be the dot prodcut
ri (A) · cj (B).
Examples 3.1.
(i) 1 × 1 matrices are just a number, so if A = (a) and B = (b) =⇒
AB = (ab).
(
)
(
)
a12
b12
(ii) 2 × 2 matrices, A = aa11
, B = bb11
, then
21 a22
21 b22
(
AB =
a11 b11 + a12 b21
a21 b11 + a22 b21
19
a11 b12 + a12 b22
a21 b12 + a22 b22
)
3
Matrix Algebra
Geometry & Linear Algebra
(b )
11
..
(iii) A = (a11 , a12 , . . . , a1n ) a 1 × n matrix and B =
then
.
bn1
AB = (a11 b11 + a12 b21 + · · · + a1n bn1 )
(which is a 1 × 1 matrix, i.e. a number)
(iv) BA will be a matrix of size n × m :



b11 a111
 b21 a11



 ..  (a11 , a12 , . . . , a1n ) =  ..
 .
.
bn1
bn1 a11
b11
b21
...
...

b11 a1n
b21 a1n 

.. 
. 
...
bn1 a1n
Remark. In general multiplication of matrices is far from symmetric! AB ̸=
BA.
Recall: If A is an m × n matrix and B is an n × p matrix, then AB is an m × p
matrix. The ij−entry is the dot product ri (A) · cj (B).
Examples 3.2.
(i)
(
1
3
2
4
)(
0 1
1 0
)
=
(
2
4
1
3
)
(
0 1
1 0
)(
1
3
2
4
)
=
(
)
3 4
1 2
So AB ̸= BA.
(ii)
(
1 2
3 4
)(
−1 0 1
1 1 2
)
=
(
1 2
1 4
5
11
)
(iii) If A has size m × n and v is a column vector with n co-ordinates, then
Av is well defined. e.g.

  

1 2 −1
x1
x1 + 2x2 − x3
0 1 1  x2  =  x2 + x3 
1 0 2
x3
x1 + 2x3
Therefore the system of linear equations
x1 + 2x2 − x3 = 1
x2 + x3 = −7
x1 + 2x3 = 0
can be written in matrix form. Ax = B, where A is the 3 matrix
above and B is the vector (1, −7, 0)t .
20
Lecture 7
3
Matrix Algebra
Geometry & Linear Algebra
In general the system of equations
a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
..
.
am1 x1 + am2 x2 + · · · + amn xn = bm
is equivalent to our matrix equation Ax = b, where A is the matrix with entires aij , x = (x1 , . . . , xn )t and b = (b1 , . . . , bn )t .
Proposition 3.3 (Distributivity Law).
(i) Let A be an m × n matrix and let B and C be n × p matrices. Then
A(B + C) = AB + AC
(ii) Let D and E be m × n matrices and F be an n × p matrix. Then
(D + E)F = DF + EF
Proof. (i) By definition, the ij-entry of A(B + C) is
ri (A) · cj (B + C) = ri (A) · (cj (B) + cj (C))
= (ri (A) · cj (B) + (ri (A) · cj (C))
= ij − entry of AB + ij − entry of AC
= ij − entry of AB + AC
”
Exercise: Prove (ii).
Application to system of linear equations
Remark 1. Any system of linear equations can only have 0, 1 or infinitely
many solutions.
Proof. We’ve seen cases when there is 0 or only 1 solution.
Suppose we have at least two distinct solutions, say p and q (column vectors
with n co-ordinates). Let λ ∈ R be any real number. Then A(p + λ(q − p) =
Ap+A(λ(q −p)) by Prop. 3.3. This equals Ap+λA(q −p) = Ap+λ·Aq −λAp.
Since p and q are solutions, we have Ap = Aq = b. So A(p+λ(q−p)) = Ap = b.
Since q − p is not the zero vector, we have have obtained infinitely many
solutions in the form p + λ(q − p) = x.
”
Remark 2. If p is a solution to Ax = b, then any solution of Ax = b is the sum
of p and the general solution of Ax = 0.
Proof. Indeed if Ax = b, then A(x−p) = Ax−Ap = b−b = 0, so x = p+(x−p)
where x−p is a solution of Ay = 0. Conversely if Ay = 0 then A(p+y) = b. ”
Consider matrices A, B, C. Does (AB)C = A(BC)? Yes!
21
3
Matrix Algebra
Geometry & Linear Algebra
Theorem 3.4: Associativity law
Let A be a matrix of size m × n, and let B be a matrix of n × p and let
C be a matrix of size p × q. Then
(AB)C = A(BC)
( y1 )
..
Lemma 3.5. Let x = (x1 , . . . , xn ) be a row vector and let y =
and B
.
be an n × p matrix with entries bij . Then x · By = xB · y.
yp
Remark. We’re multiplying a (n × p) matrix by a (p × 1) column vector, and
also a (1×n) row vector by a (n×p) matrix, so both products are well-defined.
Proof of Lemma. Idea: both sides are equal to
and j.
∑
bij xi yj over all values of i
Consider LHS = (x1 , . . . , xn ) · By

b11
 ..
By =  .
bn1
...
  

b1p
y1
b11 y1 + · · · + b1p yp

..   ..  = 
..

.  .  
.
...
bnp
bn1 y1 + · · · + bnp yp
 ∑p

j=1 bij yj


.

=
∑ ..

p
b
y
nj
j
j=1
yp
Then
LHS = (x1 , . . . , xn ) · By


p
n
∑
∑
xi ·
=
bij yj 
i=1
=
j=1
p
n ∑
∑
bij xi yj
i=1 j=1
Now consider RHS:
RHS = xB · y
= (x1 b11 + . . . xn bn1 , . . . , x1 b1p + · · · + xn bnp )


n
n
∑
∑
=
b11 xi , . . . ,
bip xi 
i=1
=
p
∑
y j ·
j=1
=
i=1

p
n ∑
∑
n
∑

bij xi 
i=1
”
bij xi yj
i=1 j=1
22
3
Matrix Algebra
Geometry & Linear Algebra
Proof of Theorem. Observation: ri (AB) = ri (A)B, indeed:
Lecture 8
ri (AB) = (r1 (A) · c1 (B), . . . , ri (A) · cp (B))
which is the same as
( )
ri (A)B = (← ri (A) →) B
= (r1 (A) · c1 (B), . . . , ri (A) · cp (B))
Similarly cj (BC) = Bcj (C). (Exercise, easy)
To finish the proof, note that the ij-entry of (AB)C is the dot product
ri (AB) · cj (C) = ri (A)B · cj (C)
= ri (A) · Bci (C) (by Lemma 3.7)
= the ij − entry of A(BC).
”
3.1 Square matrices
Definition. A matrix of size n × n for some n is called a square matrix.
These are nice as we can multiply these matrices by themselves, in fact A2 =
AA is only defined when A is a square matrix.
A2 = (AA)A = A(AA)
..
.
An = An−1 A for all n ≥ 2
1 is a very important number as it’s a number such that multiplying everything by 1 gets you the same thing. There is a similar thing for matrices:
Definition. The identity matrix
by

1
0



0
In is the square matrix of size n × n given
0
1
...
...
..
.
0
...

0
0



1
The ij-entry of In is 1 if i = j, and is 0 otherwise if i ̸= j.
Proposition 3.6. For any square matrix A of size n × n, we have AIn =
In A = A.
So the identity matrix is an analogue of 1.
23
3
Matrix Algebra
Geometry & Linear Algebra
Proof.



1 0
a11 a12 . . . a1n 
 ..
..  0 1
 .
. 

an1 an2 . . . ann
0 0


a11 a12 . . . a1n
 ..
.. 
= .
. 
an1
an2

... 0
. . . 0


..

.
... 1
. . . ann
”
The other way is similar.
What about the inverse? For many reasons it’s important to be able to invert
matrices, and at this point we have to be careful as multiplication is not commutative, so there may be two inverses. Let me make a temporary definition
first:
Definition. B is the left inverse of A if BA = In .
C is the right inverse of A if AC = In
Proposition 3.7. Assume that A has both a right inverse C and a left inverse
B. Then B = C.
Proof. By Theorem 3.6, we have (BA)C = B(AC). But then by Prop 3.8 we
can continue on the left: C = In C = (BA)C = B(AC). And simiarly we can
continue on the right: (BA)C = B(AC) = BIn = B.
”

Warning. Not all matrices are invertible. e.g.
(
)(
) (
)
0 1
0 1
0 0
=
0 0
0 0
0 0
Clearly if A2 = 0, then the inverse does not exist; if there is a B such that
BA = I2 , then (BA)A = B(AA) by associativity, however AA = A2 = 0,
hence B(AA) = 0. Whereas (BA)A = I2 A = A. Since A ̸= 0 we have a
contradiction. Therefore B does not exist.
A matrix with 2 identical (or proportional) rows is not invertible, e.g.
(
)
1 2
A=
1 2
is not invertible. Let me go back to this example later on when we have more
theory...
24
3
Matrix Algebra
Geometry & Linear Algebra
Application to Systems of Linear Equations
Proposition 3.8. Let A be a square matrix. The linear system Ax = b has
exactly one solution when A has an inverse (that A has both left and right
inverses).
Proof. By assumption there is a square matrix B of the same size such that
BA = In . Then Ax = b implies B(Ax) = Bb. By associativity, we have
B(Ax) = (Ba)x = In x = x
On the other hand, this equals Bb. Therefore any solution of Ax = b must be
equation to Bb.
Claim: Bb is in fact a solution of Ax = b.
Indeed, A(Bb) = (AB)b. But from Prop 3.8 we know that b = C for which
AC = In . Hence A(Bb) = (AB)b = In b. So Bb is a solution of Ax = b.
”
Remark. If b = 0, then x = 0 is the unique solution of Ax = 0.
25
4
Applications of Matrices
Geometry & Linear Algebra
4 Applications of Matrices
4.1 Geometry
Rotations
Lecture 9
x2
x = (x1 , x2 )
φ
α
x1
Consider a rotation of φ about the origin to the vector x. The vector has
components x1 = p cos α, x2 = p sin α. Then rotation through through the
angle φ sends x to (p cos(α + φ), p sin(α + φ)):
(
)
(
)
p cos α
p cos(α + φ)
x=
7−→
p sin α
p sin(α + φ)
(
)
p cos α cos φ − p sin α sin φ
=
p cos α sin φ + p sin α cos φ
(
)
cos ψx1 + (− sin ψ)x2
=
sin ψx1 + cos ψx2
(
)( )
cos ψ − sin ψ
x1
=
sin ψ
cos ψ
x2
Definition. Let
(
cos ψ
Rψ =
sin ψ
− sin ψ
cos ψ
)
This is the rotation matrix.
When we rotate the vector x by ψ we write
x 7−→ Rψ x (This means “x goes to Rψ x)
x 7−→ Rψ x 7−→ Rψ Rψ x
(
1
Note that Rψ Rψ = I =
0
)
0
.
1
26
4
Applications of Matrices
Geometry & Linear Algebra
Reflections
x2
x = (x1 , x2 )
x1
x′ = (x1 , −x2 )
The reflection in the x-axis sends (x1 , x2 ) to (x1 , −x2 ). In terms of matrices
the reflection sends
( ) (
)( )
x1
1 0
x1
=
x2
0 −1
x2
Definition. (See Sheet 3) The matrix
(
)
cos ψ
sin ψ
sin ψ − cos ψ
represents the reflection in the line x2 = x, tan ψ2 .
It’s easy to check that Sψ2 = I. Also Sψ Sφ = R = Rψ−φ . Thus the composition of two reflections is a rotation:
x2
ψ/2
x1
ψ/2
4.2 Markov Chains
Special Halloween Problem: An enchanted village is inhabited by vampires, zombies and goblins. Every full moon:
25% of zombies turn into vampires
10% of zombies turn into goblins
65% remain as zombies
20% of vampires turn into zombies
5% of vampires turn into goblins
75% stay as vampires
27
4
Applications of Matrices
Geometry & Linear Algebra
5% of goblins turn into vampires
30% of goblins turn into zombies
65% stay as goblins
Question: What will happen after n full moons? If n → ∞ are the proportions going to stabilise?
Let x1 be the proportion of the vampires, let x2 and x3 be the proportions of
zombies and goblins respectively.
x1 + x2 + x3 = 1
xi ≥ 0
After one month the proportions are as follows:
(1)
x1 = proportion of vampires after one month.
(1)
x1 = 0.75x1 + 0.25x2 + 0.05x3
(1)
x2 = 0.20x1 + 0.065x2 + 0.30x3
(1)
x3 = 0.05x1 + 0.10x2 + 0.65x3


0.75 0.25 0.05
x(1) = 0.20 0.65 0.30 x
0.05 0.10 0.65
Let T be this matrix. After n months we have the population represented by
x(n) = T n x
Observe: tij > 0. Crucially, the sum of entries in each column of T is 1.
∑n
Exercise: For any matrix T = (tij ) such that tij > 0 and i=1 tij = 1, if
∑
k
x = (x1 , . . . , xn )t is such that xi ≥ 0 and i=1 xi = 1, then T x has the same
properties as x.
This is an example of a Markov Chain.
Theorem 4.1
If all entries of some power of T are positive, then there exists a unique
vector
( s1 )
..
s=
.
∑n
sn
such that si ≥ 0 and i=1 si = 1 for which T s = s (the stationary state).
For any initial state x, this converges to s. Thus s is an eigenvector of
T with eigenvalue 1.
T s = s, or equivalently (T − I) = 0 is a system of linear equations. IN our
case, solving this system gives
34 15
s = ( 37
86 , 86 , 86 ) = (0.45, 0.4, 0.17)
This is the limit distribution of three species.
28
5
Inverses
Geometry & Linear Algebra
5 Inverses
Theorem 5.1
Let A be a square matrix. If there exists a square matrix B such that
AB = I then this B is unique and satisfies BA = I.
Proof. (Also gives a method for finding this B!)
Let X be the square matrix with unknown entires. We want to solve the
equation AX = I. The entires of X are xij . We have n2 unknowns and n2
equations. We record this as follows: (A | I), a n × 2n matrix.
This is n systems of linear equations in n variables, e.g. for each column of I
we have the following system:
 
0
.

x2j
 .. 
1



..
A
= the jth column of I =  0 

.
.
xnj
..
 x1j 
0
All these n systems have the same co-efficient matrix A, so we solve them using the same process. Apply reduction to echelon form. Perform elementary
row operations on the matrix (A | I)
Claim: The matrix on the LHS cannot have any rows made entirely of zeros.
Proof of Claim. Remember that D is obtained by row operations from I.
We know that two matrices that are obtained from each other by row operations define equivalent linear systems. This means that the linear system I(y1 , . . . , yn ) = 0 has the same solutions as D(y1 , . . . , yn ) = 0. But
(0, . . . , 0) is the only solution to this. Now if D has an all zero row, the system
D(y1 , . . . , yn ) = 0 has free variables, hence infinitely many solutions. This
contradiction proves that D does not have an all zero row.
”
Therefore there is a non-zero entry in the bottom row of D. Say this entry is
in the jth column. Then the system (1.1) has no solutions (follows from the
echelon form method). Therefore, if the matrix on the left has a bottom row
made of zeros, then AX = I has no solutions. So A has no right inverse. It
remains to consider the case when the matrix on the left has no all-zero rows:



(Aech | I) = 



1
*
1
..
0
.


D


1
Perform more elementary row operations to clear the entries above the main
diagonal (This is possible because all diagonal entries equal 1). After this step,
29
Lecture 10
5
Inverses
Geometry & Linear Algebra

we obtain:


(I | E) = 



1


E


0
1
..
0
.
1
Since row operations don’t change the solution of our linear system, we have
IX = E. Hence E is a unique solution of the system AX = I, i.e. AE =
I. We’ve proven that if the right inverse exists it can be obtained by the
procedure, and it is unique.
Finally we now prove that EA = I:
Consider the equation EY = I, where Y = (yij ) is a square matrix with
unknown entires yij . Reverse row operations from the first part of the proof
to so (E | I) 7→ (I | A). EY = I is equivalent to IY = A, that is Y = A.
Therefore EA = I.
”
Finding inverses of 2 × 2 matrices is easy:
A=
(
a
c
)
(
)
b
d −b
, consider B =
d
−c a
(
Then
AB =
(
=
a b
c d
)(
d −b
−c a
)
ad − bc
0
0
ad − bc
)
= (ad − bc)I
Case 1. (ad − bc) is non-zero. Then
(
)
1
d −b
is the inverse of A
ad − bc −c a
Case 2. ad − bc = 0. In this case AB = 0. Then the inverse does not exist
(If CA = I, then C(AB) = (CA)B = IB = B. If A is non-zero then B is
non-zero, so we get a contradiction for 0 = B.) Hence A is not invertible.
Let A be a square matrix. If
Lecture 11
row ops
(
(A | I) −→
∗
0...0
then A has no inverse.
If
row ops
(A | I) −→ (I | E)
then AE = I.
30
)
∗
5
Inverses
Geometry & Linear Algebra
Example 5.2. Let

1 3
A= 2 5
−3 2

−2
−3
−4


1 3 −2 1 0 0
(A | E) =  2 5 −3 0 1 0
−3 2 −4 0 0 1


1
0 0
1 3 −2
−2 1 0
−→ 0 −1 1
0 0
1 −19 11 1


2
1 3 0 −37 22
−→ 0 −1 0 17 −10 −1
0 0 1 −19 11
1


1 0 0 14 −8 −1
1
−→ 0 1 0 −17 10
1
0 0 1 −19 11

So
A−1

14 −8 −1
1
= −17 10
−19 11
1
Recall that Theorem 5.1 says that if there exists a B such that AB = I, then
BA = I.
Question: What if there is a matrix C such that CA = I: does this imply
that CA = I? Yes.
5.1 Transpose
Definition. Let A be a matrix of size n × m. The transpose of A is the
matrix At of size m × n such that the ij-entry of At is the ji−entry of A.
(
E.g. if A =
)
(
)
a b
a c
, then At =
.
c d
b d
Some properties:
• It = I


x1
 
• (x1 , . . . , xn )t =  ... 
xn
• (At )t = A
• ri (At ) = ci (A)t
31
5
Inverses
Geometry & Linear Algebra
• cj (At ) = rj (A)t
Proposition 5.3. (AB)t = B t At
Proof.
The ij-entry of (AB)t = the ji-entry of AB
= rj (A) · ci (B)
= ci (B) · rj (A)
= ri (B t ) · cj (At )
= ij-entry of B t At
”
Corollary 5.4. Let A be a square matrix. If there exists a matrix C such
that CA = I, then AC = I.
Proof. Suppose we have C such that CA = I. Then (CA)t = I t = I. By
Prop 5.2, we have At · C t = I By Theorem 5.1, this implies that C t At = I.
By Prop 5.2, we obtain (AC)t = I. Apply the transpose to both sides:
AC = ((AC)t )t = I t = I
”
Remark. Let A be a square matrix. Then the following properties hold.
(i) If there exists a matrix B such that AB = I, then BA = I.
(ii) If there exists a matrix C such that CA = I, then AC = I. Call
B = C = A−1 the inverse of A.
(iii) If A−1 exists, it is unique.
(iv) A−1 exists if and only if A can be reduced to I by elementary row
operations
(v) A−1 exists if and only if Ax = b has a unique solution.
(vi) A−1 exists if and only if Ax = 0 has a unique solution 0.
32
6
Determinants
Geometry & Linear Algebra
6 Determinants
Definition (2 × 2 determinant).
(
)
a b
det
= ad − bc
c d
a b = ad − bc
Also denoted by c d
We’ve seen that if det(A) ̸= 0, then A−1 exists.
(
)
1
d −b
A−1 =
det(A) −c a
If det(A) = 0, then A−1 doesn’t exist.
Examples 6.1.
(
)
a b
= ad
0 d
(
)
cos φ − sin φ
Rφ =
sin φ cos φ
det
det Rφ = 1.
Sφ =
(
)
cos φ
sin φ
sin φ − cos φ
det Sφ = −1.
Exercise: det(AB) = det(A) det(B).
6.1 3 × 3 determinants
Definition. For a 3 × 3 matrix, A:

a11
A = a21
a31
det(A) = a11 det
(
a22
a32
a23
a33
a12
a22
a32
)
(
− a12 det

a13
a23 
a33
a21
a31
a23
a33
)
+ a13
(
a21
a31
a22
a32
)
= a11 (a22 a33 − a23 a32 ) − a12 (a21 a33 − a23 a31 ) + a13 (a21 a32 − a21 a31 ).
The same decomposition works for n × n matrices.
Let A be a square matrix.
Lecture 12
33
6
Determinants
Geometry & Linear Algebra
Definition. The ij-minor of A is the square matrix obtained from A by
removing the i-th row and the j-th column.
Example 6.2. If

1 2
A = 4 5
5 0
then
A11
(
)
5 6
=
0 8
A13

3
6
8
(
4
=
5
5
0
)
We defined det(A) = a11 det(A11 ) − a12 det(A12 ) + a13 det(A13 ).
Proposition 6.3. Let A be a 3 × 3 matrix.
(i) det(A) = expansion in the 2nd row.
(ii) det(A) = expansion in 3rd row.
Proof of (i). Note that
det(A) = a11 (a22 a33 − a23 a32 ) − a12 (a21 a33 − a23 a31 ) + a13 (a21 a32 − a21 a31 ).


a11 a12 a13
A = a21 a22 a23 
a31 a32 a33
We need to compare this with the following expression:
−a21 (a12 a33 − a32 a13 ) + a22 (a11 a33 − a31 a13 ) − a23 (a11 a32 − a31 a12 )
”
Proof of (ii) is similar.
Remark. There are also expansions in the first column, second column etc.
det(A) = a11 det(A11 ) − a21 det(A21 ) + a31 det(A31 )
Goal: Show that A−1 exists precisely when det(A) ̸= 0.
Lemma 6.4. If a 3 × 3 matrix A has two identical rows, then det(A) = 0.
Proof.

a
det a
x
b
b
y

c
c  = expansion in 3rd row
z
(
)
(
)
(
)
b c
a c
a b
= x det
− y det
+ z det
b c
a c
a b
=0
”
Other cases are similar.
34
6
Determinants
Geometry & Linear Algebra
Proposition 6.5 (Effect of row operations on determinant).
(i) Replacing r1 (A) by ri (A) + λrj (A), j ̸= i does not change det(A)
(ii) Swapping ri (A) and rj (A), j ̸= i multiplies det(A) by −1
(iii) Multiplying ri (A) by λ ̸= 0 has the effect of multiplying det(A) by λ.
Proof. (i) For simplicity assume that i = 2 and j = 1:


a11
a12
a13
A′ = a21 + λa11 a22 + λa12 a23 + λa13 
a31
a32
a33
Let’s expand this det in the 2nd row:
det(A′ ) = −(a21 + λa11 ) det(A21 ) + (a21 + λ11 ) det(A22 ) − (a23 + λ13 ) det(A23 )


a11 a12 a13
= det(A) + λ det a11 a12 a13 
a31 a32 a33
= det(A) + 0
= det(A)
(ii) We have:

a21 a22
det a11 a12
a31 a32

a23
a13 
a33
= expand 3rd row
(
)
(
)
(
)
a
a23
a
a23
a
a22
= a31 det 22
− a32 det 21
+ a33 21
a12 a13
a11 a13
a11 a12
(
)
(
)
(
a12
a13
a
a13
a
= −a31 det
+ a32 det 11
− a33 det 11
a22 a23
a21 a23
a21
a12
a22
)
= − det(A)
(iii) Expand in the row that is multiplied by λ. This immediately gives the
result.
”
Examples 6.6. (i)

1 2
det  2 0
−2 1


−1
1
3  = det 0
1
0
2
−4
5

−1
5
−1
= expand in the 1st column
(
)
−4 5
= det
5 −1
= 4 − 25 = −21
35
6
Determinants
Geometry & Linear Algebra
(ii) Van der Monde determinant:



1 x x2
1 x
det 1 y y 2  = det 0 1
0 1
1 z z2

x2
x + y
x+z

1
x
1
= (y − x) det 0
0 z−x

1
= (y − x)(z − x) det 0
0

1
= (y − x)(z − x) det 0
0

x2
y+x 
z 2 − x2

x
x2
1 x + y
1 x+z

x
x2
1 x + y
0 z−y
= (y − x)(z − x)(z − y)
= (x − y)(y − z)(z − x)
Corollary 6.7. If A′ is obtained from A by row operations, then det(A′ ) ̸= 0
if and only if det(A) ̸= 0.
Proof. A direct consequence of Proposition 6.3.
”
Theorem 6.8
Let A be a 3 × 3 matrix. Then A−1 exists if and only if det(A) ̸= 0.
Proof. Recall that A can be reduced to echelon form by row operations. Let
A′ be the matrix in echelon form to which A reduces. Then det(A′ ) ̸= 0 ⇐⇒
det(A) ̸= 0. Hence we are in Case 2 (If A′ has an all zero-row then we expand
in this row and det(A′ ) = 0.) In Case 2, A′ can be reduced to I by further
row operations. By Theorem 5.1, A is invertible.
We need to prove that if A−1 exists, then det(A) ̸= 0. Indeed, if A−1 exists,
then the echelon form of A has no all-zero rows. Then A can be reduced to
I by row operations. Row operations can only multiply det by a non-zero
number, and they can be reversed. Therefore, det (A) ̸= 0.
”
Remark. For any square matrices A and B of the same size det(AB) =
det(A)det(B). If A−1 exists, then AA−1 = I, so det(A)det(A−1 ) = 1. Hence
det(A) ̸= 0 if A−1 exists.
Final Comment. If A is a square matrix, then Ax = 0 has non-zero solutions
if and only if det(A) = 0. (Indeed if Ax = 0 has a non-zero solution, then it
has at least two distinct solutions, so it has infinitely many solutions. Then
A−1 does’t exist, and det(A) = 0).
36
Lecture 13
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
7 Eigenvalues and Eigenvectors
Definition (Eigenvector, eigenvalue). Let A be a n × n matrix. Then a
non-zero vector, v, is called an eigenvector of A if Av = λv for some λ ∈ R.
In this case λ is called an eigenvalue of A corresponding to the eigenvector
v.
Remarks. A scalar multiple of an eigenvector is also an eigenvector with the
same eigenvalue.
(
3
Example 7.1. If A =
2
Av =
)
( )
2
1
,v=
, we have
0
−2
(
)( ) ( )
3 2
1
−1
=
= −v
2 0
−2
2
So v is an eigenvector of A with eigenvalue −1.
( )
1
But if w =
, then
1
(
3
Aw =
2
2
0
)( ) ( )
( )
1
5
1
=
̸= λ
1
2
1
So w is not an eigenvector of A.
(
1
Remark. If A is a scalar multiple of I, A = c
0
vector is an eigenvector with eigenvalue c.
)
0
, then every non-zero
1
How do we find eigenvectors and eigenvalues?
v is an eigenvector of A if and only if Av = λv for some λ ∈ R. Av = λ·Iv =⇒
(λI − A)v = 0. Here v ̸= 0. Non-zero solution (such as v) of this system of
linear equations exist if and only if det(λI − A) = 0.
Let t be a variable and consider the matrix tI − A:


t − a11
−a12
. . . −a1n
 −a21
t − a12 . . . −a2n 


tI − A =  .
.. 
 ..
. 
−an1 t − an2 . . . t − ann
Definition (Characteristic Polynomial). The determinant of tI − A is
called the characteristic polynomial of A. (For us n = 3, or n = 2)
37
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
For example, if n = 2, then the characteristic polynomial of A is
(
)
t − a11 −a12
det
= (t − a11 )(t − a22 ) − a21 a − 12
−a21 t − a22
= t2 − (a11 + a22 ) + a11 a22 − a21 a12
= t2 − tr(A) + det(A) = 0
Definition (Trace). tr(A) is the trace, defined as tr(A) = a11 + a22
Proposition 7.2. Let A be a 2 × 2 or 3 × 3 matrix. Then the eigenvalues of
A are the roots of the characteristic polynomial of A, i.e. every eigenvalue λ
satisfies det(λI −A) = 0. The eigenvectors of A with eigenvalue λ are non-zero
solutions of the system of linear equations (λI − A)v = 0.
Proof. The real numbers λ for which det(λI − A) = 0 are by definition the
roots of the characteristic polynomial of A. Hence v is a non-zero solution of
(λI − A)v = 0.
”
Example 7.3. Consider
(
A=
3 2
2 0
)
(i) Find the characteristic polynomial of A:
(
)
t − 3 −2
det
= t2 − 3t − 4 = 0
−2
t
=⇒ (t + 1)(t − 4) = 0
(ii) Solve this equation (t + 1)(t − 4) = 0, and find the eigenvalues:
λ1 = −1, λ2 = 4
(iii) Find the eigenvectors with eigenvalue −1:
(
)
−4 −2
λ1 I − A =
−2 −1
Solve
}
−4x1 − 2x2 = 0
−2x1 − x2 = 0
(
−4
−2
)( )
−2
x1
=0
−1
x2
(
)
=⇒ v1 = 1 −2 (or any scalar multiple)
Find eigenvectors with eigenvalue 4:
(
)
1 −2
λ2 I − A =
−2 4
38
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
(
Solve
x1 − 2x2 = 0
−2x1 + 4x2 = 0
}
)( )
−2
x1
=0
4
x2
1
−2
(
=⇒ v2 = 2
)
1 (or any scalar multiple)
Conclusion: Up to proportionality, A has exactly two eigenvectors
with eigenvalue −1 and 4.

Warning. Some 2 × 2 matrices have only one eigenvector up to scalar
multiples!
Example 7.4.
(
A=
1 1
0 1
)
This has characteristic polynomial of
(
)
t−1
1
det
= (t − 1)2
0
t−1
So 1 is the unique eigenvalue of A. Now
(
)
0 −1
I −A=
0 0
So solving
(
)( ) ( )
0 −1
x1
0
=
=⇒ x2 = 0
0 0
x2
0
( )
1
So v =
is the unique eigenvector (up to scalar multiples!)
0
If A is a 3 × 3 matrix, then the characteristic polynomial of A is as polynomial
of degree 3, with leading coefficient 1. Typically, A will have three different
eigenvalues (which may be complex numbers!). There are matrices with 3
non-proportional eigenvectors, but also matrices with fewer eigenvectors, e.g.




0 1 0
0 1 0
A = 0 0 1 or 0 0 0
0 0 0
0 0 1
7.1 Diagonalisation of Matrices
39
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
Definition. A matrix is called diagonal if all non-diagonal entries are zero:


d1 0 . . . 0
 0 d2 . . . 0 


D=.
.. 
 ..
.
0
0
. . . dn
Properties
• diag(d1 , . . . , dn ) + diag(d′1 , . . . , d′n ) = diag(d1 + d′1 , . . . , dn + d′n )
• diag(d1 , . . . , dn )diag(d′1 , . . . , d′n ) = diag(d1 d′1 , . . . , dn d′n )
m
• diag(d1 , . . . , dn )m = diag(dm
1 , . . . , dn )
• det(diag(d1 , . . . , dn )) = d1 . . . dn
−1
• If d1 , . . . , dn ̸= 0 then diag(d1 , . . . , dn )−1 = diag(d−1
1 , . . . , dn )
Consider the polynomial
pm xm + · · · + p0
pi ∈ R
Then
p(A) = pm Am + pm−1 Am−1 + · · · + p1 A + p0 I
where A is a square matrix
D = diag(d1 , . . . , dn ) and p(D) = diag(p(d1 ), . . . , p(dn )).
Can we somehow reduce general matrices to diagonal form? Yes, in most
cases...
Terminology. Let v1 , . . . , vn be vectors in Rn .
Definition. v1 , . . . , vn are linearly dependent if ̸ ∃x1 , . . . , xn ∈ R not all
equal to 0 such that x1 v1 + · · · + xn vn = 0.
In the opposite case, v1 , . . . , vn are called linearly independent.
Examples 7.5.
(i) If one of the vectors is a zero vector, they are linearly dependent, say
v1 = 0. Then take x1 = 1, x2 = · · · = xn = 0. 1·0+0·v2 +. . . 0·vn = 0.
(ii) For n = 2, then are v1 and v2 linearly dependent? x1 v1 +x2 v2 = 0, say
x1 ̸= 0, then v1 = xx12 v2 . Conclusion: v1 and v2 are linearly dependent
iff they are proportional, i.e. v2 is in this line:
0
v1
(iii) For n = 3, v1 , v2 , v3 are linearly dependent iff x1 v1 + x2 v2 + x3 v3 = 0
and one of the coefficients ̸= 0, say x1 ̸= 0. Then v1 = − xx12 v2 − xx31 v3 .
In R3 , v1 lies in the plane through 0, v2 , v3 :
40
Lecture 14
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
v1
v2
0
Conclusion: v1 , v2 , v3 ∈ R3 are linearly dependent iff they belong to a
plane through 0.
Lemma 7.6. (i) Let v1 , v2 ∈ R2 . Define a 2 × 2 matrix B such that v1 , v2 are
the columns of B:


|
|
B =  v1 v2 
|
|
Then det(B) = 0 iff v1 and v2 are linearly dependent.
(ii) Let v1 , v2 , v3 ∈ R3 and define a 3 × 3 matrix B so that v1 , v2 , v3 are the
columns of B. Then det(B) = 0 iff v1 , v2 , v3 are linearly dependent.
Proof of (ii). Note that

 
x1
|
B x2  = v1
x3
|
|
v2
|
 
|
x1
v3  x2 
|
x3
= x1 v1 + x2 v2 + x3 v3
We know that det(B) = 0 ⇐⇒ Bx = 0 has a non-zero solution, where x =
(x1 , x2 , x3 )t ⇐⇒ x1 v1 + x2 v2 + x3 v3 = 0 where some x1 ̸= 0 ⇐⇒ v1 , v2 , v3
are linearly dependent.
”
Theorem 7.7
Let A be a square matrix (3 × 3 case) with eigenvectors v1 , v2 , v3 with
eigenvalues λ1 , λ2 , λ3 respectively. Define


λ1 0
0
D = diag(λ1 , λ2 , λ3 ) =  0 λ2 0 
0
0 λ3
P as a matrix with columns v1 , v2 , v3 .
a) Then AP = P D
b) If v1 , v2 , v3 are linearly independent, then P −1 exists and A = P DP −1
Remark. This is a theorem about the diagonalisation of matrices
41
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
Proof. a) What is AP ? The i-th column of AP is ci (AP ) = Aci (P ) (lecture
8)
ci (AP ) = Aci (P ) = Avi = λi vi
What is P D?


|
|
|
λ1 0
P D = v1 v2 v3   0 λ2
|
|
|
0
0


|
|
|
= λ1 v1 λ2 v2 λ3 v3 
|
|
|

0
0
λ3
Hence AP = P D.
b) By Lemma 7.2, P −1 exists iff v1 , v2 , v3 are linearly independent.
”
Definition. Two square matrices A and B are called equivalent if there
exists an invertible matrix P such that A = P BP −1 .
Exercise: Check that this is indeed an equivalence relation.
Definition. A square matrix A is diagonalisable if A is equivalent to a
diagonal matrix, i.e. ∃ an invertible matrix P such that A = P DP −1 ,
where D is diagonal.
Proposition 7.8.
(i) If A is a diagonalisable 2 × 2 matrix, then A has two linearly independent
eigenvectors (not multiples of each other)
(ii) If A is a diagonalisable 3 × 3 matrix, then A has three linearly independent
eigenvectors.
Remark. This is in fact an “if and only if” statement (as it follows from
Theorem 7.3)
Proof of (ii). A = P DP −1 where D = diag(λ1 , λ2 , λ3 ). Then AP = P D.
Next, consider the columns of P as three vectors, call them v1 , v2 , v3 .



|
|
|
λ1 0
0
P D = v1 v2 v3   0 λ2 0 
|
|
|
0
0 λ3


λ1 p11 λ2 p12 λ3 p13
= λ1 p21 λ2 p22 λ3 p23 
λ1 p31 λ2 p32 λ3 p33


|
|
|
= λ1 v1 λ2 v2 λ3 v3 
|
|
|
Now AP = P D =⇒ ci (P D) = λi vi =⇒ ci (AP ) = Aci (P ) = Avi for
i = 1, 2, 3. Hence vi is an eigenvector of A with eigenvalue λi .
42
Lecture 15
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
Claim: v1 , v2 , v3 are linearly independent. In fact, Lemma 7.2 says P −1 exists
iff they are linearly independent. Since P −1 exists, we’re done.
”
Corollary 7.9.
(i) All matrices that are equivalent to

λ
Jλ =  0
0

1 0
λ 1
0 λ
for λ ∈ R, are non-diagonalisable.
(ii) All matrices equivalent to

Jλ1 µ
λ 1
= 0 λ
0 0

0
0
µ
for any λ1 , µ ∈ R are also non-diagonalisable.
Remark. These are in fact all the non-diagonalisable matrices.
Proof. (i) For contradiction, suppose that A = BJλ B −1 (for some invertible
matrix B) is diagonalisable. Then A = P DP −1 , so BJλ B −1 = P DP −1 =⇒
Jλ = B −1 P DP −1 B. We know that
(B −1 P )−1 = P −1 (B −1 )−1 = P −1 B
Therefore Jλ = (B −1 P )D(B −1 P )−1 .
So Jλ is diagonalisable and by Prop 7.4, Jλ must have three linearly independent eigenvectors. Let’s find them:

t−λ
tI − Jλ =  0
0
−1
t−λ
0

0
−1 
t−λ
det(tI − Jλ ) = (t − λ)3
Hence λ is the only eigenvalue of
(λI − Jλ )v = 0, i.e.

0 −1
0 0
0 0
Jλ . The corresponding eigenvectors satisfy
   
0
x1
0
−1 x2  = 0
0
x3
0
=⇒ x2 = x3 = 0, x1 = a ∈ R. Hence we conclude
 
1
v = 0
0
up to scalar multiples is the only eigenvector. So there are not three linearly independent eigenvectors, a contradiction. Thus A = BJλ B −1 is not a
diagonalisable matrix.
”
43
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
Exercise: Prove (ii) similarly.
Examples 7.10.
(1) Let
(
A=
3 2
2 0
)
In lecture 13, we found this has eigenvalues λ1 = 4, λ2 − 1. The corresponding eigenvectors are
( )
(
)
2
v1 =
v2 = 1 −2
1
So A = P DP −1 where
D=
(
)
4 0
0 −1
(
P =
)
2
1
−1 −2
So A2 = (P DP −1 )(P DP −1 ) = P D2 P −1 . (By induction An = P Dn P −1
since P P −1 = I).
(
)
(
)
1 −2 −1
1 2 1
P −1 =
=
−5 −1 2
5 1 −2
Therefore, An = P Dn P −1
(
)( n
)(
)
2 1
4
0
2 1
1 −2
0 (−1)n
1 −2
(
)
(
)
1 2 · 4n
(−1)n
2 1
=
4n
−2(−1)n
1 −2
5
)
(
1 4n=1 + (−1)n 2 · 4n − 2(−1)n
=
4n + 4(−1)n
5 2 · 4n − 2(−1)n
=⇒ An =
1
5
(2) How do we find a 2 × 2 matrix B such that B 3 = A?
Idea: Take any 2 × 2 matrix C such that C 3 = D e.g.
( 1
)
43
0
C=
0 (−1)
Then B = P CP −1 works. Indeed B 3 = P C 3 P −1 = P DP −1 = A.
(3) Fibonacci Sequence
0, 1, 1, 2, 3, 5, 8, 13, 21, 35, 55, . . .
Formally defined as fn+1 = fn + fn−1
(
Let
A=
44
f0 = 0, f1 = 1.
1 1
1 0
)
7
Eigenvalues and Eigenvectors
Notice that
(
Geometry & Linear Algebra
fn+1
fn
)
(
=A
fn
)
fn−1
Since fn+1 = fn + fn−1 ...
(
)
(
)
fn+1
fn
=A
fn
fn−1
(
)
2 fn−1
=A
fn−2
= ...
( )
f1
=A
f0
( )
1
= An
0
n
We now know how to find An , and this gives a formula for fn .
(4) Is

1
A = −1
1

0 0
2 0
−1 1
diagonalisable?
The characteristic polynomial is (t − 1)2 (t − 2), so λ1 = 2, λ2 = 1. We find
 
0
v1 =  1 
−1
is an eigenvector with eigenvalue λ1 .
We can find two non-proportional eigenvectors for λ1 = 1. Then three
eigenvectors are linearly independent. So A is diagonalisable. [So sometimes
double root =⇒ three eigenvectors and is still diagonalisable.]
45
8
Conics and Quadrics
Geometry & Linear Algebra
8 Conics and Quadrics
Back to 2-dim geometry, we work in R2 .
Lecture 16
Lines are given by linear equations
ax1 + bx2 + c = 0
(polynomial equations in x1 , x2 of degree 2)
Definition. A conic is a curve in R2 that can be given by a quadratic
equation, i.e. a polynomial equation in x1 , x2 of degree 2.
ax21 + bx1 x2 + cx22 + dx1 + cx2 + f = 0
e.g. x21 + x2 = r2 gives a circle. Also (ax1 + bx2 + c)(a′ x1 + b′ x2 + c′ ) = 0 is
the set of solutions is the union of two lines.
8.1 Standard Forms of Conics
Non-degenerate conics:
For a ̸= 0, b ̸= 0 ∈ R we have
(1) An ellipse
x22
x21
+
=1
a2
b2
x2
x1
(2) A hyperbola
x21
x2
− 22 = 1
2
a
b
x2
x1
46
8
Conics and Quadrics
Geometry & Linear Algebra
(3) An imaginary ellipse
x21
x22
−
=1
a2
b2
This defines an empty set (there are no solutions).
−
(4) A parabola
x2 = ax21 + b
x2
b
x1
Degenerate conics:
(5) Two lines with a common point
(
)(
)
x21
x22
x1
x2
x1
x2
−
=
0
=⇒
−
+
=0
a2
b2
a
b
a
b
x2
x1
(6) x21 = 0, a double line.
(7) A pair of two conjugate complex lines meeting in a real point
(
)(
)
x21
x22
x1
x2
x1
x2
+
=
0
=⇒
−
i
+
i
=0
a2
b2
a
b
a
b
The set of real solutions is one point (0, 0).
(8) Two parallel lines
x21
x1
=1 (
= ±1)
a2
a
x2
x1
47
8
Conics and Quadrics
Geometry & Linear Algebra
(9) A pair of parallel complex conjugate lines
There are no real solutions.
x1
a
x21
=1
a2
= ±i.
8.2 Reducing with Trigonometry
Our aim: To use translations and rotations in R2 to reduce any conic to one
of these standard types.
Why conics? These are all slices of the cone x² + y² = z²:
(figure: plane sections of the cone, giving a circle, an ellipse, a parabola and a hyperbola)
(i) z = constant =⇒ a circle / ellipse.
(ii) Substituting z = y + 1 into the equation =⇒ x² = 2y + 1, a parabola.
(iii) y = 1 (a vertical plane) =⇒ a hyperbola.
(iv) z = 0: x² + y² = 0, two conjugate complex lines with a common real point.
(v) y = 0: x² = z², two real lines.
Translations in R2:
Moving the coordinates by (d1, d2) has the effect that
x1 = y1 + d1,   x2 = y2 + d2
Rotations in R2:
Recall that rotating a point P = (p1, p2) to Q = (q1, q2) by φ about O is given by
( q1 ; q2 ) = ( cos φ −sin φ ; sin φ cos φ ) ( p1 ; p2 )
What is the effect of a rotation on coordinates? Rotate the (x1, x2) coordinate system by φ. The y1, y2 coordinates of Q are simply p1 and p2. Therefore the new coordinates y1, y2 and the old coordinates x1, x2 are related as follows:
( x1 ; x2 ) = ( cos φ −sin φ ; sin φ cos φ ) ( y1 ; y2 )
Let’s write quadratic equations in matrix form. Recall that At denotes the
transpose of A, i.e. the matrix whose i − j entry is the j − i entry of A.
Definition. A square matrix is called symmetric if At = A.
e.g.
( a b ; b a ),   ( a b c ; b d e ; c e f )
Recall that we proved that (AB)^t = B^t · A^t and (A + B)^t = A^t + B^t. Consider
A = ( a b/2 ; b/2 c )
Then
(x1, x2) ( a b/2 ; b/2 c ) ( x1 ; x2 ) = (x1, x2) ( ax1 + (b/2)x2 ; (b/2)x1 + cx2 )
= x1(ax1 + (b/2)x2) + x2((b/2)x1 + cx2)
= ax1² + bx1x2 + cx2²
and
(x1, x2) ( d ; e ) = dx1 + ex2 = (d, e) ( x1 ; x2 )
Therefore we can write ax1² + bx1x2 + cx2² + dx1 + ex2 + f as
x^t A x + (d, e)x + f
where x = ( x1 ; x2 ) and x^t = (x1, x2).
Theorem 8.1
Using a rotation and then translation we can always reduce any quadratic
equation to standard form.
Low level proof - Idea: Get rid of the x1x2 term using a rotation. If we can do that, then for some new coefficients a′, c′, d′, e′, f′ we obtain
a′y1² + c′y2² + d′y1 + e′y2 + f′
Then we complete the squares and reduce this equation to
a′′z1² + c′′z2² + f′′
or (when c′ = 0)
a′′z1² + e′′z2 + f′′
Proof. Rotate through φ, then
( x1 ; x2 ) = Rφ ( y1 ; y2 ) = ( cos φ −sin φ ; sin φ cos φ ) ( y1 ; y2 )
=⇒ x^t A x = (Rφ y)^t A (Rφ y) = y^t Rφ^t A Rφ y
We want the matrix Rφ^t A Rφ to be diagonal (then the y1y2 term disappears):
Rφ^t A Rφ = ( cos φ sin φ ; −sin φ cos φ ) ( a b/2 ; b/2 c ) Rφ
= ( a cos φ + (b/2) sin φ   (b/2) cos φ + c sin φ ; −a sin φ + (b/2) cos φ   −(b/2) sin φ + c cos φ ) ( cos φ −sin φ ; sin φ cos φ )
We need to know the off-diagonal terms. The 12 entry is
−a sin φ cos φ − (b/2) sin²φ + (b/2) cos²φ + c sin φ cos φ
= −(a/2) sin 2φ + (b/2) cos 2φ + (c/2) sin 2φ
= (1/2)(b cos 2φ − (a − c) sin 2φ)
If a ̸= c, then take φ such that tan 2φ = b/(a − c) (then this term is 0).
Lecture 17
If a = c, take φ = π/4 (again, the 12 entry is 0).
After this rotation x = Rφ y we have accomplished the first step. Then we obtain the equation
a′y1² + c′y2² + d′y1 + e′y2 + f
If a′ ̸= 0, c′ ̸= 0, we can complete the squares and reduce this to
a′z1² + c′z2² + f′′
with
y1 = z1 + α1,   y2 = z2 + α2
This is exactly a translation by (α1, α2). If a′ ̸= 0 but c′ = 0, we complete the square for y1 and do nothing for y2, and reduce to a′z1² + e′′z2 + f′′. We get a parabola or a degenerate conic.
”
Example 8.2. Reduce to standard form:
5x1² + 4x1x2 + 2x2² + x1 + x2 − 1 = 0
Step 1:
A = ( 5 2 ; 2 2 )
Use Rφ with tan 2φ = b/(a − c) = 4/3. We can take cos φ = 1/√5, sin φ = −2/√5, so that
( x1 ; x2 ) = (1/√5) ( 1 2 ; −2 1 ) ( y1 ; y2 )
Substituting into the equation we obtain
5x1² + 4x1x2 + 2x2² = y1² + 6y2²
The new equation is
y1² + 6y2² − (1/√5)y1 + (3/√5)y2 − 1 = 0
=⇒ (y1 − 1/(2√5))² + 6(y2 + 1/(4√5))² − 9/8 = 0
=⇒ z1² + 6z2² = 9/8
so this is an ellipse.
8.3 Reducing with Eigenvectors
There is a trigonometry-free method!
Note that for any rotation matrix we have Rφ^t = Rφ^{−1}. Indeed
Rφ = ( cos φ −sin φ ; sin φ cos φ ),   Rφ^t = ( cos φ sin φ ; −sin φ cos φ )
and
( cos φ sin φ ; −sin φ cos φ ) ( cos φ −sin φ ; sin φ cos φ ) = ( 1 0 ; 0 1 )
Definition. Let P be a square matrix of size n × n. Then P is called orthogonal if P^t · P = I. (In other words, P^{−1} exists and equals P^t.)
Since the rows of P^t are precisely the columns of P, the condition P^tP = I is equivalent to the following property:
ci(P) · cj(P) = 1 if i = j,   ci(P) · cj(P) = 0 if i ̸= j
In sheet 6 there is an exercise that says a 2 × 2 matrix P is orthogonal ⇐⇒ P is a rotation matrix or a reflection matrix, depending on whether det(P) = 1 or det(P) = −1. Note that Rφ^t A Rφ = Rφ^{−1} A Rφ; in the proof of Theorem 8.1 we wanted Rφ^t A Rφ to be diagonal.
New idea: Use diagonalisation. Let
P = ( v1 | v2 )
be the matrix whose columns v1, v2 are the eigenvectors of A.
Why do v1, v2 exist? Why is P a rotation matrix?
Theorem 8.3
Let A be any real symmetric 2 × 2 matrix
A = ( a b/2 ; b/2 c )   which is not of the form ( α 0 ; 0 α ) for any α ∈ R.
Then A has the following properties:
(i) A has two different real eigenvalues λ1, λ2, λ1 ̸= λ2.
(ii) If v1 is an eigenvector with eigenvalue λ1 and v2 is an eigenvector with eigenvalue λ2, then v1 · v2 = 0 (i.e. v1, v2 are perpendicular).
(iii) Up to swapping v1 and v2 and multiplying v1, v2 by scalars so that ||v1|| = ||v2|| = 1, the matrix
P = ( v1 | v2 )
is a rotation matrix.
Proof. (i) The characteristic polynomial is
det ( t − a  −b/2 ; −b/2  t − c ) = t² − (a + c)t + (ac − b²/4)
so
λ1, λ2 = (1/2)(a + c ± √((a + c)² − 4ac + b²)) = (1/2)(a + c ± √((a − c)² + b²))
Since A is not of the form ( α 0 ; 0 α ), we have either b ̸= 0 or a − c ̸= 0. Hence (a − c)² + b² > 0. Therefore the characteristic polynomial has two different real roots.
(ii) Let v1 , v2 be eigenvectors for λ1 , λ2 respectively. We need to show that
v1 · v2 = 0.
Consider v1t Av2 :
v1t Av2 = v1t (Av2 )
= v1t · λ2 v2
= λ2 (v1t · v2 )
But also
v1t Av2 = (v1t A)v2
= (v1t At )v2
= (Av1 )t v2
= (λ1 v1 )t v2
= λ1 v1t v2
Therefore λ2 (v1t v2 ) = λ1 (v1t v2 ). So (λ1 −λ2 )·(v1t v2 ) = 0, but from (i), λ1 ̸= λ2 ,
so v1t v2 = 0.
(iii) Multiply v1 and v2 by non-zero numbers so that ||v1|| = ||v2|| = 1. Clearly if
v1 = ( α ; β )   with α² + β² = 1
then α = cos φ, β = sin φ for some φ. Since v2 is also a unit vector and is perpendicular to v1, it is either
( −sin φ ; cos φ )   or   ( sin φ ; −cos φ )
Choose v2 = ( −sin φ ; cos φ ). Then
( v1 | v2 ) = ( cos φ −sin φ ; sin φ cos φ ) = Rφ
”
Observe that Rφ^t = Rφ^{−1} (any rotation matrix is an orthogonal matrix). From the diagonalisation theory we know that P^{−1}AP = diag(λ1, λ2), where P = ( v1 | v2 ). Now P = Rφ, so we have
Rφ^t A Rφ = Rφ^{−1} A Rφ = diag(λ1, λ2)
So if x = Rφ y, then
x^t A x = y^t Rφ^t A Rφ y = y^t Rφ^{−1} A Rφ y
=⇒ x^t A x = y^t diag(λ1, λ2) y = λ1 y1² + λ2 y2²
since
(y1, y2) ( λ1 0 ; 0 λ2 ) ( y1 ; y2 ) = λ1 y1² + λ2 y2²
Example 8.4. Same equation as before:
A = ( 5 2 ; 2 2 ),   t² − 7t + 6 = (t − 1)(t − 6)
so λ1 = 1, λ2 = 6. Let's find the first eigenvector (the second follows easily since it's perpendicular).
I − A = ( −4 −2 ; −2 −1 )   =⇒   v1 = (1/√5) ( 1 ; −2 )
(so that ||v1|| = 1). Then we don't even have to calculate v2: we can take
v2 = (1/√5) ( 2 ; 1 )
Check:
(1/√5) ( 1 2 ; −2 1 )
has determinant 1. This is Rφ from the beginning of the lecture.
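Aside (my addition, not in the notes): numpy.linalg.eigh performs exactly this diagonalisation of a symmetric matrix by an orthogonal matrix, so the example can be checked in a few lines. Flipping the sign of one column, if necessary, makes det(P) = 1, i.e. makes P a rotation.

import numpy as np

A = np.array([[5.0, 2.0], [2.0, 2.0]])
eigvals, P = np.linalg.eigh(A)      # orthonormal eigenvectors of a symmetric matrix
if np.linalg.det(P) < 0:
    P[:, 0] *= -1                   # make P a rotation matrix
print(eigvals)                      # [1. 6.]
print(np.round(P.T @ A @ P, 10))    # diag(1, 6): the x1 x2 cross term is gone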
Corollary 8.5. Any conic can be reduced, using a rotation x = Rφ y, to the following form:
λ1 y1² + λ2 y2² + d′y1 + e′y2 + f′
Here λ1, λ2 are the eigenvalues of A = ( a b/2 ; b/2 c ).
Obviously λ1λ2 = det(A) (since the characteristic polynomial is t² − (a + c)t + det(A) = (t − λ1)(t − λ2)).
Therefore if det(A) > 0, then λ1, λ2 have the same sign. We complete the squares and obtain
λ1 z1² + λ2 z2² = h,   where h ∈ R
• If h > 0, this is an ellipse.
• If h < 0, this is an imaginary ellipse.
• If h = 0, this is two complex conjugate lines meeting in one real point (0, 0).
Next, if det(A) < 0, completing the squares we obtain
λ1 z1² + λ2 z2² = h,   h ∈ R
• If h > 0 or h < 0, this is a hyperbola.
• If h = 0, this is a pair of real lines meeting at a point.
Finally, if det(A) = 0, then λ2 = 0, λ1 ̸= 0. Complete the square for y1, so we obtain
λ1 z1² + e′y2 + f′
• If e′ ̸= 0, we reduce to λ1 z1² + e′z2 = 0, which is a parabola.
• If e′ = 0, then λ1 z1² + f′ = 0, i.e. z1² = −f′/λ1. If the RHS is positive, these are two parallel real lines. If the RHS is zero, it's a double line. If the RHS is negative, there are no real solutions; this is two parallel complex conjugate lines.
Moral: If we can prove that our conic is non-empty and non-degenerate, then its type is uniquely determined by det(A):
• det(A) > 0 =⇒ ellipse.
• det(A) = 0 =⇒ parabola.
• det(A) < 0 =⇒ hyperbola.
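Aside (my addition): the moral can be packaged as a tiny classifier; conic_type is a hypothetical helper, and it assumes the conic has already been checked to be non-empty and non-degenerate.

def conic_type(a, b, c):
    # type of the non-empty, non-degenerate conic a x1^2 + b x1 x2 + c x2^2 + ... = 0
    det_A = a * c - (b / 2) ** 2    # det of the symmetric matrix ( a b/2 ; b/2 c )
    if det_A > 0:
        return "ellipse"
    if det_A < 0:
        return "hyperbola"
    return "parabola"

print(conic_type(5, 4, 2))   # the conic of Example 8.2: "ellipse"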
8.4 Quadric Surfaces
Definition. A quadric surface is a surface in R3 defined by an eqn of
degree 2:
ax1² + bx2² + cx3² + dx1x2 + ex1x3 + fx2x3 + gx1 + hx2 + jx3 + k = 0
Task: Use an orthogonal transformation, ideally a rotation, to reduce this
equation to one of standard form.
Non-degenerate forms
Here are the non-degenerate forms, with 0 ̸= a, b, c ∈ R:
(1) Ellipsoid:
x1²/a² + x2²/b² + x3²/c² = 1
(2) Hyperboloid of one sheet:
x1²/a² + x2²/b² − x3²/c² = 1
(3) Hyperboloid of two sheets:
−x1²/a² − x2²/b² + x3²/c² = 1
Lecture 19
(4) Elliptic paraboloid:
x3 = x1²/a² + x2²/b²
(5) Hyperbolic paraboloid:
x3 = x1²/a² − x2²/b²
Some examples of degenerate quadrics:
(6) Cone:
x1²/a² + x2²/b² − x3²/c² = 0
(7) Elliptic cylinder:
x1²/a² + x2²/b² = 1
Similarly there are hyperbolic and parabolic cylinders, etc. You will not be expected to know all the quadrics for the exam.
8.5 Reducing Quadrics
Step 1. Write the equation in matrix form. With
x = (x1, x2, x3)^t, the equation becomes x^t A x + (g, h, j)x + k = 0
where
A = ( a d/2 e/2 ; d/2 b f/2 ; e/2 f/2 c )
Step 2. Find the eigenvalues of A. In fact any symmetric 3 × 3 matrix has three real eigenvalues: det(tI − A) = (t − λ1)(t − λ2)(t − λ3) where λi ∈ R. (See sheet 8 for a proof.)
We've seen before that if λi ̸= λj and vi is an eigenvector for λi and vj is an eigenvector for λj, then vi is perpendicular to vj, i.e. vi · vj = 0. So if λ1 ̸= λ2 ̸= λ3, then we can find eigenvectors v1, v2, v3 of length 1: ||v1|| = ||v2|| = ||v3|| = 1. These vectors will automatically be perpendicular to each other. The same also works in the general case.
Step 3. Let
P = ( v1 | v2 | v3 )
be the 3 × 3 matrix whose columns are v1, v2, v3. Then P is an orthogonal matrix, i.e. P^tP = I, so P^t = P^{−1}, since ||vi|| = 1 for i = 1, 2, 3 and vi · vj = 0 when i ̸= j.
Step 4. Since P^{−1} = P^t, put x = P y where y = (y1, y2, y3)^t. The equation now looks as follows:
x^t A x = y^t P^{−1} A P y
But P^{−1}AP is the diagonal matrix
D = ( λ1 0 0 ; 0 λ2 0 ; 0 0 λ3 )
Therefore
x^t A x = y^t P^{−1}AP y = y^t diag(λ1, λ2, λ3) y = λ1 y1² + λ2 y2² + λ3 y3²
The whole equation now is
λ1 y1² + λ2 y2² + λ3 y3² + g′y1 + h′y2 + j′y3 + k = 0
If λ1λ2λ3 ̸= 0, complete the squares and get one of the two hyperboloids, an ellipsoid, or a degenerate quadric. If λ1λ2 ̸= 0, λ3 = 0, we get a paraboloid (one of the two types) or a degenerate quadric.
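Aside (my addition, not in the notes): Steps 1-4 translate directly into NumPy, since eigh returns an orthogonal matrix of eigenvectors for a symmetric matrix. reduce_quadric is an illustrative helper name; it returns the coefficients λ1, λ2, λ3 and the transformed linear part, as a sketch only.

import numpy as np

def reduce_quadric(a, b, c, d, e, f, g, h, j, k):
    # a x1^2 + b x2^2 + c x3^2 + d x1x2 + e x1x3 + f x2x3 + g x1 + h x2 + j x3 + k = 0
    A = np.array([[a, d / 2, e / 2],
                  [d / 2, b, f / 2],
                  [e / 2, f / 2, c]])
    lam, P = np.linalg.eigh(A)      # P orthogonal, P^t A P = diag(lam)
    lin = np.array([g, h, j]) @ P   # coefficients of y1, y2, y3 after x = P y
    return lam, lin, k

lam, lin, k = reduce_quadric(0, 0, 0, 2, 2, 0, 0, -1, 0, -1)   # Example 8.6 below
print(np.round(lam, 6))   # approximately -1.414214, 0, 1.414214, i.e. -√2, 0, √2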
Example 8.6. Reduce to standard form, using an orthogonal transformation and a translation:
2x1x2 + 2x1x3 − x2 − 1 = 0
1. x^t A x − x2 − 1 = 0, where
A = ( 0 1 1 ; 1 0 0 ; 1 0 0 )
2.
det ( t −1 −1 ; −1 t 0 ; −1 0 t ) = t³ − 2t = t(t − √2)(t + √2)
If this quadric is non-degenerate, it must be a hyperbolic paraboloid (since λ1 = 0, λ2 = √2 > 0 and λ3 = −√2 < 0).
3. Find the eigenvectors. For λ1 = 0, solve Ax = 0:
( 0 1 1 | 0 ; 1 0 0 | 0 ; 1 0 0 | 0 )
Solving, we find v1 = (1/√2)(0, 1, −1)^t.
For λ2 = √2, solve (√2 I − A)x = 0:
( √2 −1 −1 | 0 ; −1 √2 0 | 0 ; −1 0 √2 | 0 )
Solving, we find v2 = (1/2)(√2, 1, 1)^t.
λ3 = −√2 =⇒ v3 = (1/2)(−√2, 1, 1)^t.
4. Hence
P = ( 0 √2/2 −√2/2 ; 1/√2 1/2 1/2 ; −1/√2 1/2 1/2 )
With x = P y the equation becomes
y^t (P^tAP) y + (0, −1, 0)P y − 1 = √2 y2² − √2 y3² − (1/√2)y1 − (1/2)y2 − (1/2)y3 − 1
= √2(y2 − α)² − √2(y3 − β)² − (1/√2)(y1 − δ)
for some α, β, δ ∈ R. So this is a non-degenerate quadric, hence a hyperbolic paraboloid.
9 Geometry in R3
A vector x = (x1, x2, x3) ∈ R3 can be thought of as
• a point P
• the vector OP
• any vector AB of the same length and direction as OP.
Definition. The length of x = (x1, x2, x3) is
||x|| = √(x1² + x2² + x3²) = |OP|
Addition and scalar multiplication are componentwise:
(x1, x2, x3) + (y1, y2, y3) = (x1 + y1, x2 + y2, x3 + y3)
λ(x1, x2, x3) = (λx1, λx2, λx3)
Definition (Lines of R3 ). L = {a + λu : λ ∈ R} is a line in R3 .
Definition (Dot product). For x, y ∈ R3
x · y = x1 y1 + x2 y2 + x3 y3
Proposition 9.1. For x, y ∈ R3 we have x · y = ||x|| · ||y|| · cos θ
9.1 Planes
Let a be a point in the plane and let n be a normal vector for the plane.
x · y = 0 ⇐⇒ x and y are perpendicular. So x − a is perpendicular to
n ⇐⇒ (x − a) · n = 0. This is the equation of a plane.
Now u · (v + w) = u · v + u · w so (x − a) · n = x · n − a · n. So the equation of
the plane is
x·n=a·n
Write a · n = c, a scalar, so you get
x·n=c
Write n = (n1 , n2 , n3 ) then
n1 x 1 + n2 x 2 + n3 x 3 = c
Example 9.2. We need a and n. Take a = (e, e, 0) and n = (0, 0, π). Then (x − a) · n = 0, so x · n = a · n, and
a · n = (e, e, 0) · (0, 0, π) = 0
So the equation is
(x1, x2, x3) · (0, 0, π) = 0,   i.e. πx3 = 0   =⇒   x3 = 0
Lecture 20
Proposition 9.3.
(i) The intersection of two planes is
• Nothing
• A line
• A plane (if the two planes are the same)
(ii) If A, B, C are 3 points which don’t lie on a common line, then there is a
unique plane through A, B, C.
Proof. Problem sheet 7.
Let e1 = (1, 0, 0), e2 = (0, 1, 0) and e3 = (0, 0, 1), so (x1, x2, x3) = x1e1 + x2e2 + x3e3.
Definition (Vector product). The vector product (= cross product) of a and b is
a × b = det ( e1 e2 e3 ; a1 a2 a3 ; b1 b2 b3 ) = e1(a2b3 − a3b2) − e2(a1b3 − a3b1) + e3(a1b2 − a2b1)
= (a2b3 − a3b2, a3b1 − a1b3, a1b2 − a2b1)
Example 9.4.
(1, −1, 0) × (3, 0, 1) = det ( e1 e2 e3 ; 1 −1 0 ; 3 0 1 ) = e1(−1 · 1 − 0 · 0) − e2(1 · 1 − 0 · 3) + e3(1 · 0 − (−1) · 3) = (−1, −1, 3)
Proposition 9.5. a × b is perpendicular to both a and b.
Proof.
a · (a × b) = det ( a1 a2 a3 ; a1 a2 a3 ; b1 b2 b3 ) = 0
since this determinant has two equal rows. Similarly b · (a × b) = 0.
”
Proposition 9.6.
(i) e1 × e2 = e3, e2 × e3 = e1, e3 × e1 = e2
(ii) a × a = 0
(iii) b × a = −a × b
(iv) (λa) × b = λ(a × b) = a × (λb) for any scalar λ ∈ R
(v) a × (b + c) = a × b + a × c
Proof. Use properties of determinants:
(i) e1 × e2 = det ( e1 e2 e3 ; 1 0 0 ; 0 1 0 ) = e3 = (0, 0, 1)
(ii) With a = (a1, a2, a3),
a × a = det ( e1 e2 e3 ; a1 a2 a3 ; a1 a2 a3 ) = 0 = (0, 0, 0)
(iii)–(v) follow in the same way from properties of determinants.
”
Proposition 9.7. For any vectors a, b ∈ R3 we have
||a × b|| = ||a|| · ||b|| · |sin θ| = the area of the parallelogram built on a and b.
Lecture 21
Proof. The area of the parallelogram is ||a|| · ||h|| = ||a|| · ||b|| · |sin θ|, where h is the height. Squaring both sides of the claimed identity:
||a × b||² = (a2b3 − a3b2)² + (a3b1 − a1b3)² + (a1b2 − a2b1)²
and
||a||² · ||b||² · (1 − cos²θ) = ||a||²||b||² − (||a|| ||b|| cos θ)²
= (a1² + a2² + a3²)(b1² + b2² + b3²) − (a · b)²
= (a1² + a2² + a3²)(b1² + b2² + b3²) − (a1b1 + a2b2 + a3b3)²
= a1²b2² + a2²b1² − 2a1a2b1b2 + · · · + (similar terms)
Check that the two expressions are identical.
Remark. a × b is a vector perpendicular to the plane through O, a and b whose length equals the area of the parallelogram built on O, a, b.
Example 9.8. Find the area of the triangle in R3 with corners (1, 0, 0), (1, 1, 0), (0, 1, 1).
Answer: Area = (1/2)||a × b|| where a = (0, 1, 0) and b = (−1, 1, 1).
a × b = det ( e1 e2 e3 ; 0 1 0 ; −1 1 1 ) = e1 · 1 − e2 · 0 + e3 · 1 = e1 + e3 = (1, 0, 1)
Hence ||a × b|| = √2 =⇒ the area of the triangle is √2/2.
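Aside (my addition): np.cross computes the vector product, so this example is a one-liner to verify, assuming NumPy.

import numpy as np

a = np.array([1.0, 1.0, 0.0]) - np.array([1.0, 0.0, 0.0])   # (0, 1, 0)
b = np.array([0.0, 1.0, 1.0]) - np.array([1.0, 0.0, 0.0])   # (-1, 1, 1)
n = np.cross(a, b)
print(n, np.linalg.norm(n) / 2)   # [1. 0. 1.]  0.7071..., i.e. area sqrt(2)/2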
Volume
Proposition 9.9. The volume of the parallelepiped built on a, b, c equals
|a · (b × c)| = |det ( a1 a2 a3 ; b1 b2 b3 ; c1 c2 c3 )|
Proof. Let's show that vol = |a · (b × c)|. By the previous proposition the area of the base is ||b × c||, and b × c = ||b × c|| · n, where n is a unit normal to the plane through 0, b and c. Therefore
|a · (b × c)| = ||a|| · ||b × c|| · |cos θ|
where θ is the angle between a and n. But ||a|| · |cos θ| is the height of the parallelepiped. So we proved that vol = |a · (b × c)|.
Finally, expanding along the first row shows that
a · (b × c) = det ( a1 a2 a3 ; b1 b2 b3 ; c1 c2 c3 )
”
Properties of the scalar triple product
Proposition 9.10. For any a, b, c ∈ R3 we have
(i) a · (b × c) = c · (a × b) = b · (c × a)
(ii) a, b, c are coplanar (i.e. 0, a, b, c belong to a plane) if and only if
det ( a1 a2 a3 ; b1 b2 b3 ; c1 c2 c3 ) = 0
Proof. (i)
det ( a1 a2 a3 ; b1 b2 b3 ; c1 c2 c3 ) = −det ( a1 a2 a3 ; c1 c2 c3 ; b1 b2 b3 ) = det ( c1 c2 c3 ; a1 a2 a3 ; b1 b2 b3 )
Hence a · (b × c) = c · (a × b). Similarly we can prove that c · (a × b) = b · (c × a).
(ii) a, b, c are coplanar if and only if the volume of the parallelepiped built on a, b, c is 0 ⇐⇒ det = 0.
”
Alternative proof of (ii). Any plane through 0 can be given by
m1x1 + m2x2 + m3x3 = 0
(with at least one coefficient mi ̸= 0). Then
( a1 a2 a3 ; b1 b2 b3 ; c1 c2 c3 ) ( m1 ; m2 ; m3 ) = 0
since a, b, c are contained in this plane. This system of linear equations has a non-zero solution, therefore det = 0.
This argument can be reversed: conversely, if det = 0, then there is a non-zero solution, which gives a plane through 0 containing a, b, c.
”
Example 9.11. Are the points (1, 0, 0), (1, 1, 1), (1, 2, 3), (2, 3, 4) in the same plane?
Let a = (0, 1, 1), b = (0, 2, 3), c = (1, 3, 4) (the differences with the first point).
det ( 0 1 1 ; 0 2 3 ; 1 3 4 ) = det ( 1 1 ; 2 3 ) = 1
This is not 0, hence the four points do not belong to a common plane.
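Aside (my addition): the same test in NumPy, taking differences with the first point and computing the 3 × 3 determinant (the scalar triple product).

import numpy as np

p0, p1, p2, p3 = map(np.array, [(1, 0, 0), (1, 1, 1), (1, 2, 3), (2, 3, 4)])
M = np.array([p1 - p0, p2 - p0, p3 - p0], dtype=float)   # rows a, b, c
print(round(np.linalg.det(M)))   # 1, non-zero, so the points are not coplanar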
What is (a × b) × c?
We know that this vector is perpendicular to a × b, hence (a × b) × c lies in the plane through 0, a, b, so the triple vector product (a × b) × c = k1 a + k2 b for some scalars k1, k2.
Proposition 9.12. For a, b, c ∈ R3:
Lecture 22
(i) (a × b) × c = (a · c)b − (b · c)a
(ii) a × (b × c) = (a · c)b − (a · b)c
Remark. The vector product is not associative!
Proof. (i) a × b = (a2b3 − a3b2, a3b1 − a1b3, a1b2 − a2b1). The x1 coordinate of (a × b) × c is (a3b1 − a1b3)c3 − (a1b2 − a2b1)c2. The x1 coordinate of the RHS is (a1c1 + a2c2 + a3c3)b1 − (b1c1 + b2c2 + b3c3)a1. Expanding, LHS = RHS. Similarly, check that the x2 and x3 coordinates of the LHS and RHS agree.
(ii)
a × (b × c) = −(b × c) × a = (c × b) × a = (c · a)b − (b · a)c = (a · c)b − (a · b)c
”
Remark. Use this proposition to establish the Jacobi identity:
(a × b) × c + (c × a) × b + (b × c) × a = 0
This, together with the property a × b = −b × a, turns R3 into a Lie algebra.
Example 9.13. Let a, b, c, d ∈ R3. What is (a × b) · (c × d)?
Answer: Set u = a × b. We proved that
u · (c × d) = d · (u × c)   (Prop 9.10(i))
= d · ((a × b) × c)
= d · ((a · c)b − (b · c)a)   (Prop 9.12(i))
= (a · c)(b · d) − (b · c)(a · d)
Proposition 9.14. Let a ∈ R3, a ̸= 0. Then b × a = 0 if and only if b = λa for some λ ∈ R.
Proof. If b = λa then b × a = λ(a × a) = 0. Conversely, assume that b × a = 0. Then certainly a × (b × a) = a × 0 = 0. But a × (b × a) = (a · a)b − (a · b)a. We know that this is zero, and (a · a) = ||a||² ̸= 0 since a ̸= 0, so b = ((a · b)/||a||²) a.
”
Proposition 9.15. Let a, b ∈ R3 . Assume a ̸= 0. Then the set of x ∈ R3
such that (x − b) × a = 0 is the line {b + λa | λ ∈ R}.
Proof. (x − b) × a = 0 ⇐⇒ x − b = λa for some λ ∈ R by the previous
proposition.
”
9.2 Rotations in R3
The axis of rotation is the line in R3 such that every point of this line is fixed
by the rotation. We write a rotation by a 3 × 3 matrix P such that a vector
(x1 , x2 , x3 )t goes to P (x1 , x2 , x3 )t .
Claim:
(i) P is an orthogonal matrix, i.e. P t = P −1
(ii) Any orthogonal matrix has determinant 1 or −1. If det P = 1, then this
is a rotation matrix.
(iii) The axis is {λv | λ ∈ R}, where v is an eigenvector of P with eigenvalue
1, i.e. P v = v
(iv) To find the angle, we take any unit vector w perpendicular to the axis,
i.e. w · v = 0. Then calculate w · P w = ||w|| · ||P w|| cos θ = cos θ.
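Aside (my addition, a sketch of the claims above, assuming NumPy): the axis is an eigenvector for eigenvalue 1 and the angle comes from comparing a perpendicular unit vector with its image.

import numpy as np

P = np.array([[0.0, -1.0, 0.0],     # rotation by 90 degrees about the x3-axis
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
eigvals, eigvecs = np.linalg.eig(P)
axis = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])   # eigenvector for eigenvalue 1
axis = axis / np.linalg.norm(axis)

w = np.array([1.0, 0.0, 0.0])
w = w - (w @ axis) * axis           # make w perpendicular to the axis
w = w / np.linalg.norm(w)
angle = np.arccos(np.clip(w @ (P @ w), -1.0, 1.0))
print(axis, np.degrees(angle))      # [0. 0. 1.] 90.0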
10 Vector spaces
We’ve seen R2 and R3 . Similarly, we can work with Rn = {(x1 , . . . , xn ) | xi ∈
R}. In the same way we can do linear algebra in Cn = {(x1 , . . . , xn ) | xi ∈ C}.
In this case “scalars” mean “complex numbers”. The same methods work
equally well for binary vectors.
Notation. F2 = {0, 1} is the binary field such that:
0+0=1+1=0
1+0=0+1=1
0×0=1×0=0×1=0
1×1=1
So we have three notions of scalars: R, C, F2 .
What is (F2 )n ?
(F2 )n = {(x1 , . . . , xn ) | xi ∈ F2 }
(x1 , . . . , xn ) + (y1 , . . . , yn ) = (x1 + y1 , x2 + y2 , . . . , xn + yn )
Here + stands for binary addition, e.g. (1, 0) + (1, 1) = (0, 1).
If λ ∈ F2 , then λ(x1 , . . . , xn ) = (λx1 , . . . , λxn ).
Now let F be one of R, C or F2 . I will refer to F as a field.
Rules satisfied by addition of vectors
For any u, v, w ∈ V (here V is Rn, Cn or (F2)n), we have
A1 u + (v + w) = (u + v) + w
A2 u + 0 = 0 + u = u, where 0 = (0, 0, . . . , 0)
A3 u + (−u) = 0, where for u = (u1, . . . , un) we set −u = (−u1, . . . , −un)
A4 u + v = v + u
Rules satisfied by scalar multiplication
For any λ, µ ∈ F and any u, v ∈ V we have:
S1 λ(u + v) = λu + λv
S2 (λ + µ)u = λu + µu
S3 λ(µu) = (λµ)u
S4 1u = u
Lecture 23
Definition (Vector space V over F). Let F be a field (i.e. R, C or F2) and let V be a set, the elements of which are called vectors. Suppose we have addition in V, i.e. a function V × V → V, and multiplication by a scalar, i.e. a function F × V → V, and an element of V called zero, such that the properties A1-A4 and S1-S4 all hold. Then V is a vector space over the field F.
Examples 10.1.
(1) Rn , Cn , (F2 )n
(2) Let S be any set. Consider all functions S → R. Let's define the
structure of a vector space on Functions(S, R).
If f1 : S → R and f2 : S → R are functions, then f1 + f2 is a function
such that for any element s ∈ S, we have (f1 + f2 )(s) = f1 (s) + f2 (s). The
zero element of Functions(S, R) is the zero function, i.e. the function that
is equal to zero ∀s ∈ S. If f ∈ Functions(S, R) then −f is the function such
that (−f )(s) = −f (s), ∀s ∈ S.
It’s clear that A1-4 hold. Now lets define scalar multiplication: λ ∈ R,
f ∈ Functions(s, R) then λf is the function such that (λf )(s) = λ · f (s), ∀s.
Check that S1-4 hold; e.g. let's check S3: λ(µf) is the function such that ∀s ∈ S we have (λ(µf))(s) = λµf(s), which is the same as the value of the function (λµ)f at s.
(3) Similarly the set of functions from S to F is a vector space over F .
(4) Let's be more specific. Let S = R and consider Functions(R, R). Consider just the polynomial functions: these form a subset of the set of all functions R → R. The sum of two polynomials is again a polynomial, and a scalar multiple of a polynomial is also a polynomial. So the set of polynomials is a subspace of the vector space of all functions R → R.
Polynomials inherit addition and scalar multiplication from Functions(R, R),
they are closed under these operations. Hence
R[x] = {the set of all polynomials with real coefficients}
is a vector space over R. We can also check A1-4 and S1-4 for polynomials
directly.
(5) Now consider polynomials of degree up to d, where d is a fixed positive
integer. This set is closed under addition and scalar multiplication so it’s
also a vector space over R.
e.g. d = 0 is the space of polynomials of degree 0 = R. d = 1 is ax + b. So
addition is (ax+b)+(a′x+b′) = (a+a′)x+(b+b′) and multiplication is λ(ax+
b) = λax + λb. So the space of linear polynomials can be identified with R2 .
Similarly, the space of quadratic polynomials = {ax2 + bx + c | a, b, c ∈ R}
can be identified with R3 be associating to ax2 + bx + c the vector (a, b, c).
Notation. V × V = {(v, v ′ ) | v, v ′ ∈ V }. V × V → V means a function
from pairs of elements of V goes to V . For example addition of vectors is such
an operation. F × V = {(λ, v) | λ ∈ F, v ∈ V }. So scalar multiplication is a
function F × V → V .
Let’s deduce some consequences of the axioms A1-4, S1-4.
Proposition 10.2. Let F be a field (i.e. F = R, C or F2 ). Let V be a vector
space over F . For any v ∈ V and any λ ∈ F we have:
(i) 0v = 0
(ii) λ0 = 0
(iii) If λ ∈ F , λ ̸= 0, then λv = 0 only if v = 0.
(iv) (−λ)v = λ(−v) = −(λv)
(v) (−1)v = −v.
Proof.
(i) In F we have 0 = 0 + 0. Therefore 0v = (0 + 0)v = 0v + 0v. Let's add −0v to both sides: 0v − 0v = (0v + 0v) − 0v. By A1, 0v − 0v = 0v + (0v − 0v), and by A3, 0 = 0v. (∗)
(ii) We need to prove λ0 = 0. By A2, 0 = 0 + 0. Then λ0 = λ(0 + 0). By S1, λ0 = λ0 + λ0. Add −λ0 to both sides: λ0 − λ0 = (λ0 + λ0) − λ0. By (∗), 0 = λ0.
(iii) λ−1 ∈ F =⇒ λ−1 (λv) = λ−1 0 = 0. By S3 and S4 (λ−1 λ)v = 1 · v = v =
0.
”
(iv), (v) exercise.
10.1 Subspaces
Definition. Let V be a vector space over F . Let W be a subset of V .
Then W is called a vector subspace of V when W itself is a vector space over F with the same addition and scalar multiplication as in V.
Proposition 10.3. Let W be a subset of a vector space V over F. Then W is a subspace if the following conditions hold:
(i) W contains 0 (zero vector of V )
(ii) For any w1 , w2 ∈ W, w1 + w2 ∈ W (W is closed under vector addition)
(iii) For any w ∈ W and λ ∈ F , λw ∈ W (W is closed under scalar multiplication)
Remark. (i) is a consequence of (ii) and (iii). So to check that W ⊂ V is a subspace, it is enough to check (ii) and (iii). Indeed, suppose W is closed under addition and multiplication by scalars. Take any w ∈ W, w ̸= 0. Then (−1)w = −w ∈ W and w + (−w) = 0 ∈ W (because W is closed under addition).
Lecture 24
Proof. We must check that A1-A4 and S1-S4 hold in W .
−w ∈ W because −w = (−1)w. W contains 0, for any w ∈ W , −w ∈ W ,
closed under addition and scalar multiplication. All axioms hold in W because
they hold in V by assumption.
”
Examples of subspaces:
(i) Trivial subspaces: V is its own subspace. Also {0} is a subspace (closed
by Prop 10.1 and 10.2).
(ii) Polynomials with real coefficients R[x] is a subspace of Functions(R, R).
(iii) Polynomials of degree at most d is a subspace of Functions(R, R) and
equally a subspace of R[x].
Non-examples
(i) The upper half plane {(x1 , x2 ) | x1 ≥ 0, x1 , x2 ∈ R} is not a subspace of
R2 as it fails to be closed with negative scalar multiplication.
(ii) The co-ordinate cross, i.e. the union of the x1 -axis and the x2 -axis is
not a subspace of R2 . It fails vector addition.
In this lecture V is always a vector space over F , where F = R, C or F2 .
Every vector space V contains “trivial” subspaces {0} and V itself. For example, R contains only trivial subspaces. Indeed, if W ̸= {0}, then ∃ w ∈
W, w ̸= 0. Then R = {λw | λ ∈ R}.
In R2 there are non-trivial subspaces, e.g. take x = (x1, x2) ̸= 0. Then {λx | λ ∈ R} is
a vector subspace. Of course {λx | λ ∈ R} is a line through zero. So any line
through zero in R2 is a vector subspace of R2 . These are given by ax1 +bx2 = 0.
Proposition 10.4. Let A be a matrix of size m × n with entries in F. Then the set of vectors x ∈ F n such that Ax = 0 is a vector subspace of F n.
Proof. By Prop 10.3 we need to check that if Ax = 0 and Ay = 0, then A(x + y) = 0. This is clear. Similarly, if Ax = 0, then for any λ ∈ F we have A(λx) = λAx = 0.
”
Definition. Let V be a vector space over F , and let v1 , . . . , vn ∈ V . The
vector λ1 v1 + · · · + λn vn is called a linear combination of v1 , . . . , vn . Here
λ1 , . . . , λn ∈ F are arbitrary scalars.
Remark. Clearly λ1 v1 + · · · + λn vn ∈ V for any λ1 , . . . , λn ∈ F .
Example 10.5 (R2 ). It is easy to see that if x, y ∈ R2 are not multiples of
each other, then R2 = {λx + µy | λ, µ ∈ R}
Lecture 25
Definition. Let v1 , . . . , vn be vectors in V . Then the set of all linear
combinations of v1 , . . . , vn is called the linear span of v1 , . . . , vn . This is
denoted by Sp(v1 , . . . , vn ).
Proposition 10.6. For any v1, . . . , vn ∈ V the linear span Sp(v1, . . . , vn) is a vector subspace of V.
Proof. Again, by Prop 10.3 we need only check that Sp(v1, . . . , vn) is closed under addition and under multiplication by scalars. Indeed
(λ1v1 + · · · + λnvn) + (µ1v1 + · · · + µnvn) = (λ1 + µ1)v1 + · · · + (λn + µn)vn
Also for µ ∈ F we have µ(λ1v1 + · · · + λnvn) = (µλ1)v1 + · · · + (µλn)vn. ”
Remark. Let b = (b1, . . . , bm)^t ∈ F m. When is b an element of Sp(v1, . . . , vn), where vi ∈ F m?
Consider the m × n matrix
A = ( v1 | . . . | vn )
Then b ∈ Sp(v1, . . . , vn) if and only if b = x1v1 + · · · + xnvn for some x1, . . . , xn ∈ F. Equivalently,
( b1 ; . . . ; bm ) = ( v1 | . . . | vn ) ( x1 ; . . . ; xn ) = Ax
for some x ∈ F n.
Example 10.7. V = R3. Consider
v1 = (1, 2, 3)^t,   v2 = (−1, 0, 1)^t,   v3 = (3, 1, −1)^t
What is Sp(v1, v2, v3)?
Using the remark, we know that b ∈ Sp(v1, v2, v3) if and only if Ax = b has a solution.
The augmented matrix of Ax = b is:
( 1 −1 3 | b1 ; 2 0 1 | b2 ; 3 1 −1 | b3 )
Using Gaussian elimination we get
→ ( 1 −1 3 | b1 ; 0 2 −5 | b2 − 2b1 ; 0 4 −10 | b3 − 3b1 )
→ ( 1 −1 3 | b1 ; 0 2 −5 | b2 − 2b1 ; 0 0 0 | b1 − 2b2 + b3 )
If b1 − 2b2 + b3 ̸= 0, then there are no solutions. If b1 − 2b2 + b3 = 0, we have solutions (one free variable =⇒ infinitely many).
Conclusion: b ∈ Sp(v1, v2, v3) if and only if b1 − 2b2 + b3 = 0. In other words, Sp(v1, v2, v3) is a plane in R3 given by the equation x1 − 2x2 + x3 = 0.
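Aside (my addition): membership in a span is just solvability of Ax = b, which can be tested by comparing ranks; in_span is a hypothetical helper, assuming NumPy.

import numpy as np

A = np.column_stack([(1, 2, 3), (-1, 0, 1), (3, 1, -1)]).astype(float)

def in_span(b):
    # b in Sp(v1, v2, v3)  iff  rank [A | b] = rank A
    return np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A)

print(in_span(np.array([1.0, 1.0, 1.0])))   # True:  1 - 2(1) + 1 = 0
print(in_span(np.array([1.0, 0.0, 0.0])))   # False: 1 - 0 + 0 is not 0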
Example 10.8. Same v1, v2 but now let v3 = (3, 1, 0)^t.
What is Sp(v1, v2, v3)?
We need to solve Ax = b, where the augmented matrix is
( 1 −1 3 | b1 ; 2 0 1 | b2 ; 3 1 0 | b3 )
Once again, follow the same procedure:
→ ( 1 −1 3 | b1 ; 0 2 −5 | b2 − 2b1 ; 0 4 −9 | b3 − 3b1 )
→ ( 1 −1 3 | b1 ; 0 2 −5 | b2 − 2b1 ; 0 0 1 | b1 − 2b2 + b3 )
The system has a solution for any choice of b1, b2, b3 (and there is exactly one solution!). So Sp(v1, v2, v3) = R3.
Definition. Let V be a vector space over F , and let W ⊂ V be a subspace.
If W = Sp(v1 , . . . , vn ), then v1 , . . . , vn is called a spanning set for W . Then
we say that v1 , . . . , vn spans W .
Example 10.9. The vectors e1 = (1, 0, . . . , 0), . . . , en = (0, 0, . . . , 1) span
F n . Indeed (x1 , . . . , xn ) = x1 e1 + · · · + xn en .
Go back to the first example: the plane x1 − 2x2 + x3 = 0 is Sp(v1, v2, v3). In fact, this plane is also Sp(v1, v2). The reason for this is that v3 is a linear combination of v1 and v2. Indeed, with
v1 = (1, 2, 3)^t,   v2 = (−1, 0, 1)^t,   v3 = (3, 1, −1)^t
we have
v3 = (1/2)v1 − (5/2)v2
Therefore λ1v1 + λ2v2 + λ3v3 = (λ1 + λ3/2)v1 + (λ2 − (5/2)λ3)v2.
Definition. Let V be a vector space over F . Then vectors v1 , . . . , vn are
linearly dependent if there exists λ1 , . . . , λn ∈ F not all equal to zero such
that λ1 v1 + · · · + λn vn = 0 ∈ V .
The motivation for this is when v1 , . . . , vn are linearly dependent, one of these
vectors is a linear combination of all the others.
Indeed some λi ̸= 0. To fix ideas assume λ1 ̸= 0. Then
v1 = −(1/λ1)(λ2v2 + · · · + λnvn)
In this case Sp(v1, . . . , vn) = Sp(v2, . . . , vn).
Definition. v1 , . . . , vn are linearly independent if the only way to write
the zero vector as a linear combination of v1 , . . . , vn is with all coefficients
equal to 0, i.e. if 0 = λ1 v1 + . . . λn vn =⇒ λ1 = · · · = λn = 0.
Remarks.
(i) One vector v ∈ V is linearly independent (assuming v ̸= 0, i.e. v is a
non-zero vector)
(ii) Two vectors v1, v2 ∈ V are linearly independent if and only if they are not proportional: if λ1v1 + λ2v2 = 0 and λ1 ̸= 0, then v1 = −(λ2/λ1)v2; if λ2 ̸= 0, then v2 = −(λ1/λ2)v1.
(iii) Let V = F m. Suppose v1, . . . , vn ∈ V. Consider the m × n matrix A such that v1, . . . , vn are the columns of A:
A = ( v1 | . . . | vn )
Then v1, . . . , vn are linearly dependent if and only if the equation Ax = 0 has a non-zero solution.
Clearly if x = (x1, . . . , xn)^t is a solution, then x1v1 + · · · + xnvn = 0. If Ax = 0 =⇒ x = 0, then v1, . . . , vn are linearly independent. If Ax = 0 has a non-zero solution then v1, . . . , vn are linearly dependent.
Lecture 26
Proposition 10.10. Any subset of a linearly independent set of vectors is linearly independent.
Proof. Suppose that v1 , . . . , vn are linearly independent. Without loss of generality, we can assume that our subset is v1 , . . . , vk for some k ≤ n. If v1 , . . . , vk
are linearly dependent, then we can find λ1 , . . . , λn ∈ F (scalars) such that
λ1 v1 + . . . λk vk = 0, and not all of λ1 , . . . , λk are 0.
Consider
λ1 v1 + · · · + λk vk + 0vk+1 + · · · + 0vn = 0
This is a linear combination of all v1 , . . . , vn with at least one non-zero coefficient. Hence v1 , . . . , vn are linearly dependent. This is a contradiction,
therefore v1 , . . . , vk are linearly independent.
”
Proposition 10.11. Let V be a vector space over F . Let v1 , . . . , vn ∈ V .
Then v1 , . . . , vn are linearly dependent if and only if one of them is a linear
combination of the others.
Proof. Suppose v1 , . . . , vn are linearly dependent. Then we can find λ1 , . . . , λn
such that λ1 v1 + · · · + λn vn = 0 and one of λi s, say λ1 is non-zero. Then
v1 = −
1
(λ2 v2 + . . . λn vn )
λ1
Conversely if v1 = µ2 v2 + . . . µn vn where µi ’s ∈ F , then
(−1)v1 + µ2 v2 + · · · + µn vn = 0
(−1) ̸= 0, so at least one coefficient is non-zero. Hence v1 , . . . , vn are linearly
dependent.
”
Definition. Let V be a vector space over F . A set of vectors v1 , . . . , vn ∈ V
is called a basis of V if
(i) V = Sp(v1 , . . . , vn )
(ii) v1 , . . . , vn are linearly independent.
In other words, every vector of V is a linear combination of v1 , . . . , vn (this is
property (i)), and no vi is a linear combination of v1 , . . . , vi−1 , vi+1 , . . . , vn .
Examples 10.12.
(i) V = F n . Then e1 = (1, 0, . . . , 0), e2 = (0, 1, . . . , 0), . . . , en = (0, 0, . . . , 1)
is a basis of F n .
(x1 , . . . , xn ) = x1 e1 + . . . xn en , so Sp(e1 , . . . , en ) = F n . Clearly e1 , . . . , en
are linearly independent: λ1 e1 + · · · + λn en = 0. This says (λ1 , λ2 , . . . , λn )
is the zero vector. Hence λi = 0, i = 1, . . . , n. So e1 , . . . , en is a basis.
75
10
Vector spaces
Geometry & Linear Algebra

Warning. There are many other bases! e.g. R2 has a basis (1, 1) and
(1, 3). Indeed (1, 1) and (1, 3) are not proportional, hence are linearly
independent.
(ii) Let V be the vector space of polynomials in x of degree at most d:
V = {fd x^d + · · · + f1 x + f0 | fi ∈ F}
fd x^d + · · · + f0 is a linear combination of 1, x, . . . , x^d with coefficients f0, . . . , fd. But 1, x, . . . , x^d are linearly independent vectors of V: if λ0 + λ1x + · · · + λd x^d = 0 (the zero polynomial), then λ0 = · · · = λd = 0. Conclusion: 1, x, . . . , x^d is a basis of the vector space of polynomials of degree ≤ d.
Other bases exist: e.g. 1, x + 1, x² + 1, . . . , x^d + 1 is also a basis.
Remark. Let V = F n . Then v1 , . . . , vn ∈ V form a basis of V if and only if
the matrix equation Ax = 0 has only one solution x = 0. Here


| ... |
A = v1 . . . vn 
| ... |
We’ve seen that v1 , . . . , vn are linearly independent if the only solution is the
zero solution. We still need to justify that if Ax = 0 has only one solution,
then v1 , . . . , vn span V .
When do v1, . . . , vn ∈ F n form a basis?
Consider
A = ( v1 | . . . | vn )
Answer: v1, . . . , vn are a basis of F n if and only if the equation Ax = 0, where x = (x1, . . . , xn)^t, has the unique solution 0; equivalently, det(A) ̸= 0.
Indeed, x is a solution ⇐⇒ x1v1 + · · · + xnvn = 0, so v1, . . . , vn are linearly independent exactly when the only solution is x1 = · · · = xn = 0. And det(A) ̸= 0 =⇒ A^{−1} exists.
To prove that F n = Sp(v1, . . . , vn), we must observe that any vector b = (b1, . . . , bn)^t can be written as b = Ac for some c = (c1, . . . , cn)^t. This is so because b = Ac says that b = c1v1 + · · · + cnvn. Take c = A^{−1}b. Then b = Ac as required.
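Aside (my addition): over R this criterion is a one-line check; is_basis is an illustrative helper, assuming NumPy, with a small tolerance in place of exact arithmetic.

import numpy as np

def is_basis(*vectors):
    # n vectors form a basis of R^n iff det of the matrix with these columns is non-zero
    A = np.column_stack(vectors).astype(float)
    return A.shape[0] == A.shape[1] and abs(np.linalg.det(A)) > 1e-10

print(is_basis((1, 1), (1, 3)))                        # True, cf. the warning above
print(is_basis((1, 2, 3), (-1, 0, 1), (3, 1, -1)))     # False: they only span a plane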
Proposition 10.13. Let V be a vector space over F with a basis {v1, . . . , vn}. Then any vector v ∈ V is uniquely written as v = λ1v1 + · · · + λnvn.
Lecture 27
Proof. Suppose that
v = λ1v1 + · · · + λnvn
and
v = µ1v1 + · · · + µnvn
Then (µ1 − λ1)v1 + · · · + (µn − λn)vn = 0. Since v1, . . . , vn is a basis, these vectors are linearly independent. By the definition of linear independence, we must have λ1 = µ1, . . . , λn = µn.
”
Theorem 10.14
Let V be a vector space over F. Suppose that {v1, . . . , vn} and {w1, . . . , wm} are both bases of V. Then m = n.
- So any basis of V has the same number of elements.
Proof. V = Sp(w1, . . . , wm) and V = Sp(v1, . . . , vn). Thus we can write
vi = ∑_{j=1}^{m} aij wj   for some aij ∈ F
Let A be the n × m matrix with entries aij. Similarly we write
wj = ∑_{k=1}^{n} bjk vk   for some bjk ∈ F
and let B be the m × n matrix with entries bjk. Therefore
vi = ∑_{j=1}^{m} aij ( ∑_{k=1}^{n} bjk vk ) = ∑_{k=1}^{n} ( ∑_{j=1}^{m} aij bjk ) vk   (∗)
By Proposition 10.13 there is only one way of writing vi as a linear combination of v1, . . . , vn, namely vi = 0·v1 + · · · + 1·vi + · · · + 0·vn. Hence the RHS of (∗) is simply vi. In other words, we have
∑_{j=1}^{m} aij bjk = 1 if k = i   and   ∑_{j=1}^{m} aij bjk = 0 if k ̸= i
These sums are exactly the entries of AB: the ik-entry of AB is ∑_{j=1}^{m} aij bjk. This means AB = In, the identity matrix of size n × n.
By swapping the roles of the vi's and wj's, we show similarly that BA = Im, the identity matrix of size m × m. Thus AB = In and BA = Im.
Recall that the trace of a square matrix is the sum of its diagonal entries.
Fact: Tr(AB) = Tr(BA).
This is enough to finish the proof because Tr(In) = n and Tr(Im) = m.
Proof of Fact. The ii-entry of AB is ∑_{j=1}^{m} aij bji, so
Tr(AB) = ∑_{i=1}^{n} ∑_{j=1}^{m} aij bji
The jj-entry of BA is ∑_{i=1}^{n} bji aij, so
Tr(BA) = ∑_{j=1}^{m} ∑_{i=1}^{n} bji aij = ∑_{i=1}^{n} ∑_{j=1}^{m} aij bji = Tr(AB)
This proves the fact, and hence the theorem.
”
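Aside (my addition): the Fact holds for any pair of rectangular matrices for which both products are defined; a quick numerical spot check, assuming NumPy.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))   # True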
Definition. Suppose a vector space V over F has a basis consisting of n vectors. Then n is called the dimension of V, written dim V.
If V has a basis (of finitely many elements), then V is called a finite-dimensional vector space.
Example 10.15. Let V be the space of all polynomials in x with real
coefficients.
Claim: V is not finite-dimensional.
Proof. Suppose that V is spanned by finitely many polynomials, say f1 (x), . . . , fn (x).
Let d be a natural number so that d > deg fi (x) with i = 1, . . . , n. Then xd
is not a linear combination of f1 (x), . . . , fn (x).
”
Theorem 10.16
Let V be a vector space. Suppose that v1 , . . . , vn ∈ V span V . Then
(i) dimV ≤ n
(ii) There is a subset of {v1 , . . . , vn }, which is a basis of V .
Note that (ii) =⇒ (i). Indeed dimV = number of elements in any basis of
V , so dimV ≤ n, because removing some of the vectors from {v1 , . . . , vn } we
obtain a basis of V . Hence (ii) =⇒ (i).
Proof of (ii). We use the casting out process.
Start with v1, . . . , vn. If v1, . . . , vn are linearly independent, then {v1, . . . , vn} is a basis. If v1, . . . , vn are linearly dependent, then we find λ1, . . . , λn ∈ F such that λ1v1 + · · · + λnvn = 0 and at least one coefficient λi ̸= 0. Say λn ̸= 0. Then
vn = −(1/λn)(λ1v1 + · · · + λn−1vn−1)
Therefore V is spanned by v1, . . . , vn−1, i.e. V = Sp(v1, . . . , vn) = Sp(v1, . . . , vn−1).
Continuing like this, we stop when we are left with a linearly independent set of vectors. At each step we ensure that the remaining vectors still span V. Hence we have obtained a basis of V.
”
We wanted to classify all vector subspaces of R3 . dim(R3 ) = 3, and dim({0}) = 0
by definition. (We define the dimension of {0} to be zero.)
If V ⊂ R3 is a vector space, then dim(V ) can be 0, 1, 2, or 3. If dim(V ) = 3,
then V has a basis of 3 vectors, but then it’s a basis of R3 . Hence V = R3 .
Felina. Finally we look at an application to error-correcting codes:
10.2 * Error correcting codes *
Now F = F2 = {0, 1} with binary addition and multiplication. 1 + 1 = 0.
Consider the n-dimensional vector space (F2 )n = {(x1 , . . . , xn ) | xi ∈ F2 }.
We've seen that every vector subspace V of (F2)^n is a vector space of dimension m over F2, where 0 ≤ m ≤ n.
Let v1, . . . , vm be a basis of V, where m = dim(V). We know that every vector v ∈ V is uniquely written as
v = λ1v1 + · · · + λmvm
where λi ∈ F2, i.e. λi = 0 or λi = 1.
The number of elements in (F2)^n is 2^n. If V ⊂ (F2)^n has dim(V) = m, then |V| = 2^m.
Recall that if v = (x1, . . . , xn) and w = (y1, . . . , yn) in Rn, then the distance between v and w is
√((x1 − y1)² + · · · + (xn − yn)²)
This doesn't work well for F2. Instead consider the Hamming distance.
Definition. Let v, w ∈ (F2 )n . The Hamming distance, dist(v, w) = the
number of i ∈ {1, . . . , n} such that vi ̸= wi .
For example, dist(v, 0) = the number of non-zero coordinates of v.
Any reasonable notion of distance should satisfy three axioms:
Lecture 29
(i) dist(v, v) = 0
(ii) dist(v, w) = dist(w, v) ∀v, w ∈ V
(iii) dist(v, w) ≤ dist(v, u) + dist(u, w)
(dist(v, w) is always a non-negative real number.)
Exercise: Check Axiom (iii).
Error-correcting codes are used to transmit messages in such a way that a distorted message can be decoded correctly.
Messages are first encoded into sequences of 0 and 1 of some length. We get, say, 2^d possible messages, encoded into sequences of some fixed length.
The idea of error correction in the case of 1 error:
The Hamming code
We are going to realise the 2^d messages as elements of a vector subspace V ⊂ (F2)^n.
The construction: consider the m × (2^m − 1) matrix whose columns are all the vectors with coordinates 0 and 1 except the all-0 vector. E.g. if m = 3,
A = ( 1 0 0 1 1 0 1 ; 0 1 0 1 0 1 1 ; 0 0 1 0 1 1 1 )
Define V = {x ∈ (F2)^(2^m − 1) | Ax = 0}.
Crucial property: Suppose v, w ∈ V, v ̸= w. We transmit v, w and receive v′, w′. We assume that at most one error occurs during transmission, so either v′ = v or dist(v, v′) = 1, and similarly for w.
The main point is that the set of v′ ∈ (F2)^n such that dist(v, v′) ≤ 1 does not overlap with the set of w′ ∈ (F2)^n such that dist(w, w′) ≤ 1. Once we've proved that, we know that the closest vector to v′ is the original vector v.
In fact for any v, w ∈ V, dist(v, w) ≥ 3 if v ̸= w.
Proof. Recall
A = ( 1 0 0 1 1 0 1 ; 0 1 0 1 0 1 1 ; 0 0 1 0 1 1 1 )
If dist(v, w) = 1, then v − w has exactly one non-zero coordinate. But V has no such vectors (the matrix has no zero columns).
If dist(v, w) = 2, then v − w has exactly two non-zero coordinates, but there are no such vectors in V (the matrix has no two identical columns). Hence dist(v, w) ≥ 3 if v ̸= w in V.
”
Proof of the crucial property. Suppose v′ = w′ with dist(v, v′) ≤ 1 and dist(w, w′) ≤ 1. Then by the triangle inequality
dist(v, w) ≤ dist(v, v′) + dist(w, w′) ≤ 2,
a contradiction.
”
The decoding is very easy:
Let v ∈ V , so that Av = 0. Let v ′ be the received vector. We assume that
v ′ = v or v ′ = v + e, where e has exactly one non-zero coordinate.
Recipe for decoding.
Av ′ is a column vector with 3 co-ordinates. If Av ′ = 0 then v ′ = v and no
errors occurred. Otherwise Av ′ is a column of A. The position of Av ′ in A is
the coordinate where the error has occurred. Hence we can recover v.
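Aside (my addition, a sketch of the recipe for m = 3, assuming NumPy and doing the arithmetic mod 2 explicitly):

import numpy as np

A = np.array([[1, 0, 0, 1, 1, 0, 1],
              [0, 1, 0, 1, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1]])

v = np.array([1, 1, 0, 1, 0, 0, 0])      # a codeword: A v = 0 (mod 2)
assert not (A @ v % 2).any()

received = v.copy()
received[4] ^= 1                         # a single transmission error in coordinate 5

syndrome = A @ received % 2              # equals the column of A at the error position
error_pos = next(j for j in range(7) if (A[:, j] == syndrome).all())
decoded = received.copy()
decoded[error_pos] ^= 1
print(error_pos, (decoded == v).all())   # 4 True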
Summary
V is a vector space of dimension
dim V = 2^m − 1 − m
Exercise: Show that the system of linear equations that defines V has this number of free variables.
Hence |V| = 2^(2^m − 1 − m), with V ⊂ (F2)^(2^m − 1).
Final Comment: The whole space (F2)^(2^m − 1) is the disjoint union of Hamming spheres of radius 1 centred at the vectors of V, i.e. (F2)^(2^m − 1) is the disjoint union of the sets {v′ | dist(v′, v) ≤ 1} over all v ∈ V. In other words, each vector of (F2)^(2^m − 1) is at distance at most 1 from a unique vector of V.
- End of Geometry & Linear Algebra -