More Lecture Notes in Algebra 1 (Fall Semester 2013)
October 24, 2013

CHAPTER 1
Linear Systems of Equations

1. Introduction

A linear equation in the variables (or unknowns) $x_1, \dots, x_n$ is a statement of the form

$$a_1 x_1 + a_2 x_2 + \dots + a_n x_n = b. \tag{1}$$

Here $a_1, \dots, a_n$ and $b$ are constants which are usually known. By a solution to (1) we mean a set of values of $x_1, x_2, \dots, x_n$ which makes the statement true. The linear equation (1) is called homogeneous if $b = 0$; otherwise it is non-homogeneous.

A linear equation system is a set of linear equations. By a solution to such a system we mean a set of values of the variables such that each of the equations in the system is fulfilled.

Example. (a) The system
$$\begin{cases} x + 2y = 5 \\ -2x + 3y = 4 \end{cases}$$
has a solution $(x, y) = (1, 2)$.

(b) The system
$$\begin{cases} x + y + z = 1 \\ 2x - y + z = 3 \\ 3x + 2z = 4 \end{cases}$$
has, for example, the solutions $(x, y, z) = (2, 0, -1)$ and $(x, y, z) = (4, 1, -4)$.

(c) The system
$$\begin{cases} x + y = 2 \\ x + y = 3 \end{cases}$$
has no solutions, for if there were a solution $(x, y)$, we would have $2 = x + y = 3$, which is impossible.

2. The Gauss–Jordan Elimination Method

We shall now introduce a method for solving linear systems of equations. In this method, we shall successively replace a given system of equations by simpler, equivalent systems. Here, by definition, two systems are equivalent if and only if they have precisely the same solutions.

Example 1. Solve the system
$$(*) \quad \begin{cases} x + 2y = 5 \\ -2x + 3y = 4 \end{cases}$$

Solution. The first equation in the system is equivalent to $x = 5 - 2y$. We can then eliminate $x$ from the second equation by replacing it with $5 - 2y$. The system (*) is thus equivalent to
$$\text{(1)}\ \begin{cases} x + 2y = 5 \\ -2(5 - 2y) + 3y = 4 \end{cases} \iff \text{(2)}\ \begin{cases} x + 2y = 5 \\ 7y = 14 \end{cases} \iff \text{(3)}\ \begin{cases} x + 2y = 5 \\ y = 2 \end{cases}$$
The second equation in the last system says that $y = 2$. We can then eliminate $y$ from the first equation by inserting this value for $y$, so that (3) is equivalent to
$$\text{(4)}\ \begin{cases} x + 2 \cdot 2 = 5 \\ y = 2 \end{cases} \iff \text{(5)}\ \begin{cases} x = 1 \\ y = 2 \end{cases}$$
It follows that the system has the unique solution $(x, y) = (1, 2)$.
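The solutions stated so far, including $(x, y) = (1, 2)$ for the system (*), can be confirmed by substituting them back into the equations. The following is a minimal sketch of such a check; the function name `satisfies` and the representation of an equation as a pair (coefficients, right-hand side) are illustrative assumptions and not notation from these notes.

```python
from fractions import Fraction

def satisfies(equations, values):
    """Return True if `values` makes every equation a1*x1 + ... + an*xn = b true.

    Each equation is a pair (coefficients, b). This layout is a hypothetical
    choice made for the illustration only.
    """
    for coefficients, b in equations:
        # Exact rational arithmetic avoids rounding errors in the comparison.
        lhs = sum(Fraction(a) * Fraction(x) for a, x in zip(coefficients, values))
        if lhs != Fraction(b):
            return False
    return True

# Example (a) and system (*): x + 2y = 5, -2x + 3y = 4
system_a = [((1, 2), 5), ((-2, 3), 4)]
# Example (b): x + y + z = 1, 2x - y + z = 3, 3x + 2z = 4
system_b = [((1, 1, 1), 1), ((2, -1, 1), 3), ((3, 0, 2), 4)]

print(satisfies(system_a, (1, 2)))
print(satisfies(system_b, (2, 0, -1)), satisfies(system_b, (4, 1, -4)))
```

A check of this kind only confirms that a given candidate is a solution; it says nothing about whether further solutions exist, which is what the elimination method decides.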
The elimination we performed to obtain (2) from (*) can be recognized as the following operation: the first equation in (*) is multiplied by 2 and then added to the second equation. The coefficient of $x$ in the resulting sum of equations is then zero, i.e. the variable $x$ is eliminated there:
$$\begin{cases} x + 2y = 5 & (\times 2,\ \text{add to second eq.}) \\ -2x + 3y = 4 \end{cases} \iff \begin{cases} x + 2y = 5 \\ 0 \cdot x + (2 \cdot 2 + 3)y = 2 \cdot 5 + 4 \end{cases} \iff \begin{cases} x + 2y = 5 \\ 7y = 14 \end{cases}$$
The elimination of $y$ from the first equation has a similar interpretation: the second equation in (3) is multiplied by $-2$ and added to the first,
$$\begin{cases} x + 2y = 5 \\ y = 2 & (\times(-2),\ \text{add to first eq.}) \end{cases} \iff \begin{cases} x + 0 \cdot y = 5 - 2 \cdot 2 \\ y = 2 \end{cases}$$
The indicated operation is the main ingredient in the Gauss–Jordan elimination method. We shall execute this operation repeatedly, in a systematic way.

Example 2. Solve the system
$$\begin{cases} x + 4y - 2z = 8 \\ 2x + 9y + z = 7 \\ 3x - 2y - 4z = 6 \end{cases}$$

Solution. We first use the first equation to eliminate $x$ from the other equations. From the second equation we subtract 2·(first eq.) and from the third we subtract 3·(first eq.). The result is
$$\begin{cases} x + 4y - 2z = 8 \\ y + 5z = -9 \\ -14y + 2z = -18 \end{cases}$$
To simplify, we here divide the third equation by 2,
$$\begin{cases} x + 4y - 2z = 8 \\ y + 5z = -9 \\ -7y + z = -9 \end{cases}$$
We now eliminate $y$ from the third equation by adding 7·(second eq.):
$$\begin{cases} x + 4y - 2z = 8 \\ y + 5z = -9 \\ 36z = -72 \end{cases} \iff \begin{cases} x + 4y - 2z = 8 \\ y + 5z = -9 \\ z = -2 \end{cases}$$
Using the third equation, we eliminate $z$ from the first two:
$$\begin{cases} x + 4y = 4 \\ y = 1 \\ z = -2 \end{cases}$$
Finally, $y$ is eliminated from the first equation using the second one:
$$\begin{cases} x = 0 \\ y = 1 \\ z = -2 \end{cases}$$
The system has the unique solution $(x, y, z) = (0, 1, -2)$.

Here is the main principle. The first equation is used to eliminate the first unknown from the other equations. Then the (new) second equation is used to eliminate the second unknown from the subsequent equations, etc. The last equation will only contain the last unknown, which has thus been calculated.
After that we make a backward substitution to eliminate the variables "upwards", one at a time, until the system is completely solved.

The simple rule above cannot always be carried out. We illustrate with a couple of simple examples.

Example 3. In the system
$$\begin{cases} y - 2z = 3 \\ x + 2y - z = 2 \\ 2x + 3y + z = -1 \end{cases}$$
we cannot, as earlier, use the first equation to eliminate $x$ from the other ones. On the other hand, the second equation can be used. To obtain a system of the same form as earlier, we interchange the first two equations:
$$\begin{cases} x + 2y - z = 2 \\ y - 2z = 3 \\ 2x + 3y + z = -1 \end{cases}$$
After this we can proceed as before. Do this! (The system has the unique solution $(x, y, z) = (2, -1, -2)$.)

Example 4. Solve the system
$$\begin{cases} 3x + 2y = 4 \\ 6x + 4y = 1 \end{cases}$$

Solution. We eliminate $x$ from the second equation by subtracting 2·(first eq.) and get
$$\begin{cases} 3x + 2y = 4 \\ 0 = -7 \end{cases}$$
The second equation is a false statement for all values of the variables $x$ and $y$. This means that the system has no solutions.

Example 5. Solve the system
$$\begin{cases} 3x + 2y = 4 \\ 6x + 4y = 8 \end{cases}$$

Solution. Eliminating $x$ from the second equation, we now get
$$\begin{cases} 3x + 2y = 4 \\ 0 = 0 \end{cases}$$
The second equation is always satisfied, so the system is equivalent to the single equation $3x + 2y = 4$. This means that we can let $y$ have an arbitrary value, say $y = t$, and then $x$ is uniquely determined by $y$. Hence the system has infinitely many solutions
$$\begin{cases} x = \tfrac{4}{3} - \tfrac{2}{3}t \\ y = t \end{cases}, \quad t \in \mathbb{R}.$$

Exercises.

1. Solve the following linear systems of equations:
a) $\begin{cases} x + y = 2 \\ -x + y = 4 \end{cases}$  b) $\begin{cases} 2x - 3y = 1 \\ 6x + y = 7 \end{cases}$  c) $\begin{cases} 4x + 3y = 2 \\ 3x - 5y = 6 \end{cases}$

2. Solve the following systems:
a) $\begin{cases} x - y + z = 4 \\ 3x + 5y - z = 0 \\ 2x - y - 3z = 2 \end{cases}$  b) $\begin{cases} 2x - y + 3z = 9 \\ 3x + 2y - 2z = 1 \\ 4x + 5y - 4z = 2 \end{cases}$
c) $\begin{cases} y + z = -2 \\ x - 2y + z = -2 \\ 2x - 5y + 3z = -2 \end{cases}$  d) $\begin{cases} x + y = 8 \\ x + z = 6 \\ y + z = 4 \end{cases}$

3. Solve the following systems:
a) $\begin{cases} 2x - \tfrac{1}{3}y = 2 \\ 6x - y = 4 \end{cases}$  b) $\begin{cases} 4x - 3y = 4 \\ -3x + \tfrac{9}{4}y = -3 \end{cases}$  c) $\begin{cases} 6x + 9y = 5 \\ 9x + 3y = \tfrac{15}{2} \end{cases}$

3.
The Augmented Matrix of a Linear System of Equations

The numerical operations used when solving a linear system involve solely the coefficients of the unknowns and the constants in the right-hand sides. To simplify the notation, one can omit the unknowns and represent the system by a scheme called the augmented matrix of the system. For example, the system
$$(*) \quad \begin{cases} x + 2y + 3z = 10 \\ x + y - z = 4 \\ 2x - y + z = 5 \end{cases} \quad \text{is represented by} \quad \left(\begin{array}{ccc|c} 1 & 2 & 3 & 10 \\ 1 & 1 & -1 & 4 \\ 2 & -1 & 1 & 5 \end{array}\right).$$

Example 6. We solve the system (*) in two ways: by writing down complete equations as before and, in parallel, by just manipulating the augmented matrix.

Eliminate $x$ from eq.'s 2 and 3 (row 2 − row 1, row 3 − 2·row 1):
$$\begin{cases} x + 2y + 3z = 10 \\ -y - 4z = -6 \\ -5y - 5z = -15 \end{cases} \qquad \left(\begin{array}{ccc|c} 1 & 2 & 3 & 10 \\ 0 & -1 & -4 & -6 \\ 0 & -5 & -5 & -15 \end{array}\right)$$
Multiply eq. 2 by $-1$ and eq. 3 by $-\tfrac{1}{5}$ (multiply row 2 by $-1$, row 3 by $-\tfrac{1}{5}$):
$$\begin{cases} x + 2y + 3z = 10 \\ y + 4z = 6 \\ y + z = 3 \end{cases} \qquad \left(\begin{array}{ccc|c} 1 & 2 & 3 & 10 \\ 0 & 1 & 4 & 6 \\ 0 & 1 & 1 & 3 \end{array}\right)$$
Eliminate $y$ from eq. 3 (row 3 − row 2):
$$\begin{cases} x + 2y + 3z = 10 \\ y + 4z = 6 \\ -3z = -3 \end{cases} \qquad \left(\begin{array}{ccc|c} 1 & 2 & 3 & 10 \\ 0 & 1 & 4 & 6 \\ 0 & 0 & -3 & -3 \end{array}\right)$$
Divide eq. 3 by $-3$ (divide row 3 by $-3$):
$$\begin{cases} x + 2y + 3z = 10 \\ y + 4z = 6 \\ z = 1 \end{cases} \qquad \left(\begin{array}{ccc|c} 1 & 2 & 3 & 10 \\ 0 & 1 & 4 & 6 \\ 0 & 0 & 1 & 1 \end{array}\right)$$
Eliminate $z$ in eq.'s 1 and 2 (row 1 − 3·row 3, row 2 − 4·row 3):
$$\begin{cases} x + 2y = 7 \\ y = 2 \\ z = 1 \end{cases} \qquad \left(\begin{array}{ccc|c} 1 & 2 & 0 & 7 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 1 \end{array}\right)$$
Eliminate $y$ in eq. 1 (row 1 − 2·row 2):
$$\begin{cases} x = 3 \\ y = 2 \\ z = 1 \end{cases} \qquad \left(\begin{array}{ccc|c} 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & 1 \end{array}\right)$$
The system thus has the unique solution $(x, y, z) = (3, 2, 1)$.

The admissible operations on the augmented matrix, i.e. those giving rise to an equivalent augmented matrix, are the following:
(i) A row is multiplied by a constant $\neq 0$.
(ii) A constant multiple of a row is added to another row.
(iii) Two rows are interchanged.
(iv) Two columns can be interchanged if one observes that this means that two unknowns are interchanged. This must be taken into account when interpreting the answer.

The operation (iv) is not really needed; one can always circumvent it by other means. Here is an example of this.

Example 7.
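Solve the system
$$\begin{cases} x + 2y - 3z + w = -2 \\ 3x + 6y + 3z - w = -2 \\ 2x + 4y + 3z + w = 9 \\ 2x + 4y - 3z - w = -13 \end{cases}$$

Solution. The augmented matrix is
$$\left(\begin{array}{cccc|c} 1 & 2 & -3 & 1 & -2 \\ 3 & 6 & 3 & -1 & -2 \\ 2 & 4 & 3 & 1 & 9 \\ 2 & 4 & -3 & -1 & -13 \end{array}\right).$$
Applying row 2 − 3·row 1, row 3 − 2·row 1, and row 4 − 2·row 1 gives
$$\sim \left(\begin{array}{cccc|c} 1 & 2 & -3 & 1 & -2 \\ 0 & 0 & 12 & -4 & 4 \\ 0 & 0 & 9 & -1 & 13 \\ 0 & 0 & 3 & -3 & -9 \end{array}\right),$$
and dividing row 2 by 4 and row 4 by 3 gives
$$\sim \left(\begin{array}{cccc|c} 1 & 2 & -3 & 1 & -2 \\ 0 & 0 & 3 & -1 & 1 \\ 0 & 0 & 9 & -1 & 13 \\ 0 & 0 & 1 & -1 & -3 \end{array}\right).$$
Here we realize that we cannot eliminate the variable in the second column. We therefore skip that column and continue with the third one. Interchange rows 2 and 4:
$$\left(\begin{array}{cccc|c} 1 & 2 & -3 & 1 & -2 \\ 0 & 0 & 1 & -1 & -3 \\ 0 & 0 & 9 & -1 & 13 \\ 0 & 0 & 3 & -1 & 1 \end{array}\right).$$
Applying row 3 − 9·row 2 and row 4 − 3·row 2 gives
$$\sim \left(\begin{array}{cccc|c} 1 & 2 & -3 & 1 & -2 \\ 0 & 0 & 1 & -1 & -3 \\ 0 & 0 & 0 & 8 & 40 \\ 0 & 0 & 0 & 2 & 10 \end{array}\right),$$
and dividing row 3 by 8 and row 4 by 2 gives
$$\sim \left(\begin{array}{cccc|c} 1 & 2 & -3 & 1 & -2 \\ 0 & 0 & 1 & -1 & -3 \\ 0 & 0 & 0 & 1 & 5 \\ 0 & 0 & 0 & 1 & 5 \end{array}\right).$$
Row 4 is now unnecessary, so we strike it. Applying row 1 − row 3 and row 2 + row 3 gives
$$\sim \left(\begin{array}{cccc|c} 1 & 2 & -3 & 0 & -7 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & 5 \end{array}\right),$$
and finally row 1 + 3·row 2 gives
$$\sim \left(\begin{array}{cccc|c} 1 & 2 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & 5 \end{array}\right), \quad \text{i.e.} \quad \begin{cases} x + 2y = -1 \\ z = 2 \\ w = 5 \end{cases}$$
Here $y$ can have an arbitrary value (say $t$), and then the solution is fixed. The solutions are thus
$$(x, y, z, w) = (-1 - 2t,\ t,\ 2,\ 5), \quad t \in \mathbb{R}.$$

We finish this section by discussing the application of the Gauss–Jordan method in a few other, slightly more complicated cases.

Example 8. Solve the system
$$\begin{cases} x - 2y = 1 \\ 2x - 3y = 4 \\ 4x - 7y = 5 \end{cases}$$

Solution. The augmented matrix is
$$\left(\begin{array}{cc|c} 1 & -2 & 1 \\ 2 & -3 & 4 \\ 4 & -7 & 5 \end{array}\right).$$
Applying row 2 − 2·row 1 and row 3 − 4·row 1, and then row 3 − row 2, gives
$$\sim \left(\begin{array}{cc|c} 1 & -2 & 1 \\ 0 & 1 & 2 \\ 0 & 1 & 1 \end{array}\right) \sim \left(\begin{array}{cc|c} 1 & -2 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & -1 \end{array}\right).$$
The last line represents the equation $0 = -1$, which is not satisfied for any values of $x$ and $y$. The given system thus lacks solutions.

Example 9. Solve the system
$$\begin{cases} x - 2y + z = 3 \\ -2x + 4y - 2z = -6 \end{cases}$$

Solution. Applying row 2 + 2·row 1 to the augmented matrix gives
$$\left(\begin{array}{ccc|c} 1 & -2 & 1 & 3 \\ -2 & 4 & -2 & -6 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & -2 & 1 & 3 \\ 0 & 0 & 0 & 0 \end{array}\right), \quad \text{i.e.} \quad \begin{cases} x - 2y + z = 3 \\ 0 = 0 \end{cases}$$
The last equation is always satisfied; it does not impose any condition on the unknowns $x$, $y$, and $z$. The given system of equations is therefore equivalent to the single equation $x - 2y + z = 3$.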
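Here we can prescribe values for $y$ and $z$ arbitrarily; the variable $x$ is uniquely determined by these values. The general solution can be written
$$(x, y, z) = (3 + 2s - t,\ s,\ t), \quad s, t \in \mathbb{R}.$$

Example 10. Solve the following system for all values of $a$:
$$\begin{cases} x + y - az = 3 \\ x - ay - z = 2 \\ x - 3y - z = 2 - a \end{cases}$$

Solution. The augmented matrix is
$$\text{(1)} \quad \left(\begin{array}{ccc|c} 1 & 1 & -a & 3 \\ 1 & -a & -1 & 2 \\ 1 & -3 & -1 & 2 - a \end{array}\right).$$
Applying row 2 − row 1 and row 3 − row 1 gives
$$\text{(2)} \quad \sim \left(\begin{array}{ccc|c} 1 & 1 & -a & 3 \\ 0 & -a - 1 & a - 1 & -1 \\ 0 & -4 & a - 1 & -1 - a \end{array}\right).$$
Divide row 3 by $-4$, then interchange row 2 with row 3:
$$\text{(3)} \quad \sim \left(\begin{array}{ccc|c} 1 & 1 & -a & 3 \\ 0 & 1 & \tfrac{1}{4}(1 - a) & \tfrac{1}{4}(1 + a) \\ 0 & -a - 1 & a - 1 & -1 \end{array}\right).$$
Applying row 3 + $(a + 1)$·row 2 gives
$$\text{(4)} \quad \sim \left(\begin{array}{ccc|c} 1 & 1 & -a & 3 \\ 0 & 1 & \tfrac{1}{4}(1 - a) & \tfrac{1}{4}(1 + a) \\ 0 & 0 & \tfrac{1}{4}(a - 1)(3 - a) & \tfrac{1}{4}(a - 1)(a + 3) \end{array}\right).$$
The last row is equivalent to the equation
$$(*) \quad (a - 1)(3 - a)z = (a - 1)(a + 3).$$
Here we need to divide into cases.

Case 1: $a = 1$. Then the equation (*) reduces to $0 = 0$, which is always satisfied. The given system is then equivalent to the first two rows in (4), i.e., applying row 1 − row 2,
$$\left(\begin{array}{ccc|c} 1 & 1 & -1 & 3 \\ 0 & 1 & 0 & \tfrac{1}{2} \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & -1 & \tfrac{5}{2} \\ 0 & 1 & 0 & \tfrac{1}{2} \end{array}\right), \quad \text{i.e.} \quad \begin{cases} x - z = \tfrac{5}{2} \\ y = \tfrac{1}{2} \end{cases}$$
One can e.g. prescribe $z$ arbitrarily and then get $x$ and $y$ from the value of $z$,
$$(x, y, z) = \left(\tfrac{5}{2} + t,\ \tfrac{1}{2},\ t\right), \quad t \in \mathbb{R}.$$

Case 2: $a = 3$. In this case, (*) says that $0 \cdot z = 2 \cdot 6 \iff 0 = 12$. This is false, so the system lacks solutions for $a = 3$.

Case 3: $a \neq 1 \wedge a \neq 3$. In this case, the third row in (4) can be multiplied by $\frac{4}{(a - 1)(3 - a)}$, which leads to
$$\left(\begin{array}{ccc|c} 1 & 1 & -a & 3 \\ 0 & 1 & \tfrac{1}{4}(1 - a) & \tfrac{1}{4}(1 + a) \\ 0 & 0 & 1 & \frac{3 + a}{3 - a} \end{array}\right).$$
Applying row 1 + $a$·row 3 and row 2 + $\tfrac{1}{4}(a - 1)$·row 3, and then row 1 − row 2, gives
$$\sim \left(\begin{array}{ccc|c} 1 & 1 & 0 & \frac{9 + a^2}{3 - a} \\ 0 & 1 & 0 & \frac{a}{3 - a} \\ 0 & 0 & 1 & \frac{3 + a}{3 - a} \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 0 & 0 & \frac{a^2 - a + 9}{3 - a} \\ 0 & 1 & 0 & \frac{a}{3 - a} \\ 0 & 0 & 1 & \frac{3 + a}{3 - a} \end{array}\right).$$

In conclusion, we have arrived at the following result concerning the solutions to the system:
(a) For $a = 3$ the system has no solutions.
(b) For $a = 1$ it has the solutions $(x, y, z) = (\tfrac{5}{2} + t, \tfrac{1}{2}, t)$, $t \in \mathbb{R}$.
(c) For $a \notin \{1, 3\}$ it has the unique solution
$$(x, y, z) = \left(\frac{a^2 - a + 9}{3 - a},\ \frac{a}{3 - a},\ \frac{3 + a}{3 - a}\right).$$

Exercises.

4.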
Solve the following systems:
a) $\begin{cases} x - 9y - 3z = 4 \\ 3x - 2y + z = 2 \\ 2x + 7y + 4z = -2 \end{cases}$  b) $\begin{cases} x - y + z = 0 \\ 3x + 5y - z = 0 \\ 6x + 2y + 2z = 5 \end{cases}$
c) $\begin{cases} 2x + 3y = 2 \\ x + 3y - z = 5 \\ 3x + y + z = 3 \end{cases}$  d) $\begin{cases} x + 2y = -3 \\ 2x - 3y = 8 \\ 10x - y = 2 \end{cases}$
e) $\begin{cases} x - 2y = 6 \\ 3x + y = 2 \\ 5x - 3y = 14 \end{cases}$  f) $\begin{cases} 2x + y - 3z = 4 \\ 4x + 3y - z = 2 \end{cases}$
g) $\begin{cases} x + y - 4z = 7 \\ x + y + 2z = 1 \end{cases}$  h) $\begin{cases} 2x - 3y + 4z = 1 \\ 3x + y - z = 7 \\ x - y + 5z = 1 \\ 4x - 6y + 8z = 3 \end{cases}$
i) $\begin{cases} 2x - 3y + 4z = 1 \\ 3x + y - z = 7 \\ x - y + 5z = 1 \\ 4x - 6y + 8z = 2 \end{cases}$  j) $\begin{cases} 2x - y + z = 3 \\ 4x - 2y + 5z = 0 \\ 2x - y - 2z = 9 \end{cases}$
k) $\begin{cases} x + 3y + z - w = 2 \\ 3x + 5y - z - w = 2 \\ 5x - y - 5z + 3w = 0 \\ 2x + 3y - 3z - 2w = 2 \end{cases}$

5. Solve the system
$$\begin{cases} 2x - 3y + z = a \\ x - 3y + 2z = b \\ 3x + y - 4z = c \end{cases}$$
when a) $a = 2$, $b = -3$, $c = 0$; b) $a = b = c = 0$.
Hint: Several linear systems with the same coefficient matrix but different right-hand sides can be solved simultaneously with the same augmented matrix: one just writes the different right-hand sides next to each other.

6. Determine $a$ and $b$ so that the lines $y - 3x = 2$ and $2y + ax = b$
a) intersect at a point, b) are parallel and different, c) coincide.

7. Determine $a$ and $b$ so that
$$\frac{2x + 13}{(x - 1)(x + 2)} = \frac{a}{x - 1} + \frac{b}{x + 2}.$$
(This is an example of a decomposition in partial fractions of a rational function. Such decompositions are frequently used in the calculation of integrals, for example.)

8. Determine for all values of $a$ the number of solutions to the system
a) $\begin{cases} x + 2y = 2 \\ 2x + ay = a \end{cases}$  b) $\begin{cases} x + 2y = 1 \\ 2x + ay = a \end{cases}$  c) $\begin{cases} x + 2y = 1 \\ 2x + a^2 y = a \end{cases}$

9. Determine all solutions for the following systems for all values of the constant $a$:
a) $\begin{cases} x + 3y = 4 \\ 2x + ay = a \end{cases}$  b) $\begin{cases} x + 3y = 3 \\ 2x + ay = a \end{cases}$  c) $\begin{cases} x - 3ay = 2 \\ ax - 12y = a + 2 \end{cases}$
d) $\begin{cases} x + y = 3 \\ 2x - ay = 2 \end{cases}$  e) $\begin{cases} x - 2ay = 3 \\ ax + 3y = 3a - 1 \end{cases}$  f) $\begin{cases} x + y + 3z = 1 \\ 2x + y + z = 2 \\ 3x + 3y + az = 0 \end{cases}$
g) $\begin{cases} x + y + z = 1 \\ 2x + ay - z = 1 \\ ax + 2y + (a + 3)z = 2 \end{cases}$  h) $\begin{cases} x + y + az = 1 \\ x + ay + z = a \\ ax + y + z = a^2 \end{cases}$

10.
Examine, for different values of $a$ and $b$, the number of solutions to the following linear systems of equations:
a) $\begin{cases} ax + 2y = b \\ 3x + 2y = 5 \end{cases}$  b) $\begin{cases} ax + by = 2 \\ x + y = 1 \end{cases}$

11. Determine $a$, $b$, and $c$ so that
$$\frac{3x^2 + 6x - 16}{x^3 - 4x} = \frac{a}{x} + \frac{b}{x + 2} + \frac{c}{x - 2}.$$

12. Determine the constants $a$, $b$, $c$ so that the function $f(x) = ax^2 + bx + c$ satisfies $f(1) = -3$, $f(2) = 1$ and $f(-1) = 7$.

13. Determine the equation of a third degree polynomial whose graph passes through the points $(0, 1)$, $(1, 1)$, $(2, 1)$, and $(-1, 7)$.

4. Answers to Exercises

1. a) $x = -1$, $y = 3$ b) $x = 1.1$, $y = 0.4$ c) $x = \tfrac{28}{29}$, $y = -\tfrac{18}{29}$.

2. $(x, y, z) =$ a) $(2, -1, 1)$ b) $(1, 2, 3)$ c) $(-6, -2, 0)$ d) $(5, 3, 1)$.

3. a) Has no solution. b) $x = 1 + \tfrac{3}{4}y$, $y \in \mathbb{R}$. c) $(x, y) = (\tfrac{5}{6}, 0)$.

4. a) $(x, y, z) = (1 + \tfrac{3}{2}t,\ t,\ -1 - \tfrac{5}{2}t)$, $t \in \mathbb{R}$. b) No solution. c) $(x, y, z) = (4, -2, -7)$. d) No solution exists. e) $(x, y) = (\tfrac{10}{7}, -\tfrac{16}{7})$ f) $(x, y, z) = (4t + 5, -5t - 6, t)$, $t \in \mathbb{R}$ g) $(x, y, z) = (3 - t, t, -1)$, $t \in \mathbb{R}$ h) Has no solution. i) $(x, y, z) = (2, 1, 0)$ j) $(x, y, z) = (t, 2t - 5, -2)$, $t \in \mathbb{R}$ k) $(x, y, z, w) = (2, -1, 1, -2)$.

5. a) There is no solution. b) $x = y = z = t$, $t \in \mathbb{R}$.

6. a) $a \neq -6$ b) $a = -6$, $b \neq 4$ c) $a = -6$, $b = 4$. d) The system of equations has respectively: exactly one solution, no solutions, infinitely many solutions.

7. $a = 5$, $b = -3$.

8. a) If $a = 4$ there are infinitely many, otherwise unique. b) If $a = 4$ there is no solution, otherwise unique. c) If $a = 2$ there are infinitely many, if $a = -2$ none, otherwise a unique solution.

9. a) For $a = 6$ no solution. For $a \neq 6$, $(x, y) = \left(\frac{a}{a - 6}, \frac{a - 8}{a - 6}\right)$.
b) For $a = 6$: $(x, y) = (3 - 3t, t)$, $t \in \mathbb{R}$. For $a \neq 6$: $(x, y) = (0, 1)$.
c) For $a = -2$, no solution. For $a = 2$: $(x, y) = (2 + 6t, t)$, $t \in \mathbb{R}$. For $a \neq \pm 2$: $(x, y) = \left(\frac{a + 4}{a + 2}, -\frac{1}{3(a + 2)}\right)$.
d) For $a = -2$: insoluble. For $a \neq -2$: $(x, y) = \left(\frac{3a + 2}{a + 2}, \frac{4}{a + 2}\right)$.
e) $(x, y) = \left(\frac{6a^2 - 2a + 9}{2a^2 + 3}, -\frac{1}{2a^2 + 3}\right)$.
f) $a = 9$: insoluble. $a \neq 9$: $(x, y, z) = \left(\frac{a - 15}{a - 9}, \frac{15}{a - 9}, \frac{-3}{a - 9}\right)$.
g) $a \neq 1$: insoluble. $a = 1$: $(x, y, z) = (2t, 1 - 3t, t)$, $t \in \mathbb{R}$.
h) $a = 1$: $(x, y, z) = (1 - t - u, t, u)$, $t, u \in \mathbb{R}$. $a = -2$: insoluble. $a \notin \{1, -2\}$: $(x, y, z) = \left(\frac{a^2 + 2a + 1}{a + 2}, \frac{1}{a + 2}, -\frac{a + 1}{a + 2}\right)$.

10. a) If $a \neq 3$ unique solution; if $a = 3$ and $b = 5$ infinitely many solutions; if $a = 3$ and $b \neq 5$ no solutions. b) If $a = b = 2$ infinitely many solutions; if $a = b \neq 2$ none; if $a \neq b$ a unique solution.

11. $a = 4$, $b = -2$, $c = 1$.

12. $a = 3$, $b = -5$, $c = -1$.

13. The polynomial $-x^3 + 3x^2 - 2x + 1$.

CHAPTER 2
Vectors

1. Basic Definitions

If $P$ and $Q$ are two points in space, we denote by $\overrightarrow{PQ}$ the directed line-segment which starts at $P$ and ends at $Q$. A directed segment is determined by its starting point $P$, its direction, and its magnitude (or length).

Two directed segments $\overrightarrow{PQ}$ and $\overrightarrow{RS}$ are called equivalent (notation: $\overrightarrow{PQ} \sim \overrightarrow{RS}$) if they have the same direction and magnitude. It is easy to verify that this defines an equivalence relation on the set of directed line segments in space.

The vector $\mathbf{u}$ which contains $\overrightarrow{PQ}$ is the set of all directed line segments in space that are equivalent to $\overrightarrow{PQ}$ (i.e. it is the equivalence class containing $\overrightarrow{PQ}$). In symbols:
$$\mathbf{u} = \left\{ \overrightarrow{RS} : \overrightarrow{RS} \sim \overrightarrow{PQ} \right\}. \tag{5}$$
In this situation we say that the line segment $\overrightarrow{PQ}$ represents the vector $\mathbf{u}$. The direction and the magnitude of the vector $\mathbf{u}$ are defined as the direction and magnitude of some representative $\overrightarrow{PQ}$. We will denote the length (or norm) of $\mathbf{u}$ by the symbol $\|\mathbf{u}\|$.

In practice, one often briefly writes $\mathbf{u} = \overrightarrow{PQ}$ instead of using the bulky (but more precise) notation in (5). We will often, without further mention, use this convention in the following.

The zero vector is the vector $\mathbf{0} = \overrightarrow{PP}$; clearly $\|\mathbf{0}\| = 0$. A vector $\mathbf{u}$ with $\|\mathbf{u}\| = 1$ is called a unit vector.

The sum $\mathbf{u} + \mathbf{v}$ of two vectors is defined as follows. Take a directed segment $\overrightarrow{PQ}$ representing $\mathbf{u}$, then a representative $\overrightarrow{QR}$ (with the same $Q$) of $\mathbf{v}$. We define
$$\mathbf{u} + \mathbf{v} = \overrightarrow{PR}.$$
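(This rule is sometimes called the parallelogram law of addition.)

The addition of vectors satisfies the following (easily verified) rules:
(i) $\mathbf{u} + \mathbf{v} = \mathbf{v} + \mathbf{u}$ ("commutativity")
(ii) $\mathbf{u} + (\mathbf{v} + \mathbf{w}) = (\mathbf{u} + \mathbf{v}) + \mathbf{w}$ ("associativity")
(iii) If $\mathbf{u} + \mathbf{v} = \mathbf{u} + \mathbf{w}$ then $\mathbf{v} = \mathbf{w}$ ("the cancellation law")
(iv) $\mathbf{u} + \mathbf{0} = \mathbf{u}$ ("neutral element").

If $\mathbf{u} = \overrightarrow{PQ}$ is a vector, we denote by $-\mathbf{u}$ the vector with the same magnitude but opposite direction, i.e., $-\mathbf{u} = \overrightarrow{QP}$. It is clear that $\mathbf{u} + (-\mathbf{u}) = \mathbf{0}$ for all $\mathbf{u}$. We define the difference between $\mathbf{u}$ and $\mathbf{v}$ by $\mathbf{u} - \mathbf{v} = \mathbf{u} + (-\mathbf{v})$.

Let $s$ be a scalar (scalar is a synonym for "real number"). We shall define a vector $s\mathbf{u}$.
(1) If $s > 0$ we define $s\mathbf{u}$ to be the vector with the same direction as $\mathbf{u}$ and magnitude $s\|\mathbf{u}\|$. (So $s\mathbf{u}$ is a "rescaled" version of $\mathbf{u}$.)
(2) If $s = 0$ we define $s\mathbf{u} = \mathbf{0}$.
(3) If $s < 0$ we define $s\mathbf{u}$ to be the vector with the direction opposite to $\mathbf{u}$ and length $|s|\|\mathbf{u}\|$.

Observe that if $\mathbf{u} \neq \mathbf{0}$, then the vector $\frac{1}{\|\mathbf{u}\|} \cdot \mathbf{u}$ is the unit vector with the same direction as $\mathbf{u}$.

The multiplication with scalars satisfies the following rules:
(a) $s(t\mathbf{u}) = (st)\mathbf{u}$,
(b) $(s + t)\mathbf{u} = s\mathbf{u} + t\mathbf{u}$,
(c) $s(\mathbf{u} + \mathbf{v}) = s\mathbf{u} + s\mathbf{v}$,
(d) $0\mathbf{u} = \mathbf{0}$, $1\mathbf{u} = \mathbf{u}$, $s\mathbf{0} = \mathbf{0}$.

Two non-zero vectors $\mathbf{u}$, $\mathbf{v}$ are called parallel if one of them can be written as a scalar multiple of the other one. We write $\mathbf{u} \parallel \mathbf{v}$ to denote that $\mathbf{u}$ and $\mathbf{v}$ are parallel.

Example 1. Let $O$, $P$, $Q$ be three points in space and put $\mathbf{u} = \overrightarrow{OP}$ and $\mathbf{v} = \overrightarrow{OQ}$. Let $M$ be the mid-point of the segment $PQ$. We claim that
$$\overrightarrow{OM} = \tfrac{1}{2}(\mathbf{u} + \mathbf{v}). \tag{1}$$
To prove this, observe that
$$\overrightarrow{OP} + \overrightarrow{PQ} = \overrightarrow{OQ}, \quad \text{i.e.,} \quad \overrightarrow{PQ} = \overrightarrow{OQ} - \overrightarrow{OP} = \mathbf{v} - \mathbf{u}.$$
This gives that
$$\overrightarrow{OM} = \overrightarrow{OP} + \tfrac{1}{2}\overrightarrow{PQ} = \mathbf{u} + \tfrac{1}{2}(\mathbf{v} - \mathbf{u}) = \tfrac{1}{2}(\mathbf{u} + \mathbf{v}).$$
A simple geometric consequence of this is that the diagonals of a parallelogram divide each other into equal parts. Namely, if $S$ is the fourth corner in the parallelogram with sides $\mathbf{u}$ and $\mathbf{v}$, then $\overrightarrow{OS} = \mathbf{u} + \mathbf{v}$. It thus follows from (1) that $M$ lies halfway between $O$ and $S$.

Example 2.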
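Let $O$, $P$, $Q$, $R$ be four points in space and put
$$\mathbf{u} = \overrightarrow{OP}; \quad \mathbf{v} = \overrightarrow{OQ}; \quad \mathbf{w} = \overrightarrow{OR}.$$
By the center of mass of the triangle $PQR$ we mean the point $N$ defined by
$$\overrightarrow{ON} = \tfrac{1}{3}(\mathbf{u} + \mathbf{v} + \mathbf{w}). \tag{2}$$
Prove that the three medians of the triangle $PQR$ intersect at the point $N$. (A median of a triangle is a line-segment connecting a vertex to the mid-point of the opposite side.)

Solution. Let $M$ be the mid-point of the segment $PQ$. Then (by (1))
$$\overrightarrow{RM} = \overrightarrow{OM} - \overrightarrow{OR} = \tfrac{1}{2}(\mathbf{u} + \mathbf{v}) - \mathbf{w} = \tfrac{1}{2}(\mathbf{u} + \mathbf{v} - 2\mathbf{w}).$$
Moreover,
$$\overrightarrow{RN} = \overrightarrow{ON} - \overrightarrow{OR} = \tfrac{1}{3}(\mathbf{u} + \mathbf{v} + \mathbf{w}) - \mathbf{w} = \tfrac{1}{3}(\mathbf{u} + \mathbf{v} - 2\mathbf{w}).$$
We have shown that
$$2\overrightarrow{RM} = 3\overrightarrow{RN},$$
so $N$ lies on the median $RM$ and divides it according to the ratio 2 : 1. By symmetry of the expression (2), the point $N$ is also on the other medians.

Exercises.

1. Let $A$ and $B$ be two points and $O$ a third point. Let $\mathbf{a} = \overrightarrow{OA}$, $\mathbf{b} = \overrightarrow{OB}$, and let $M$ be the mid-point of the line-segment $AB$. Express the following vectors in terms of $\mathbf{a}$ and $\mathbf{b}$:
a) $\overrightarrow{OB} + \overrightarrow{BA}$ b) $\overrightarrow{AB}$ c) $\overrightarrow{OM}$ d) $\overrightarrow{AM}$.

2. Simplify the following sums:
a) $\overrightarrow{OB} + \overrightarrow{BD} + \overrightarrow{DC}$ b) $\overrightarrow{AC} + \overrightarrow{CO} + \overrightarrow{OB}$
c) $\overrightarrow{AB} + \overrightarrow{OA} + \overrightarrow{BD}$ d) $\overrightarrow{BD} - \overrightarrow{BA} + \overrightarrow{DL}$.

3. Prove that a point $N$ is the center of mass of a triangle $PQR$ if and only if
$$\overrightarrow{NP} + \overrightarrow{NQ} + \overrightarrow{NR} = \mathbf{0}.$$

4. Let $\mathbf{u}$, $\mathbf{v}$, $\mathbf{w}$ be non-zero vectors. Prove that parallelism is an equivalence relation, namely:
a) $\mathbf{u} \parallel \mathbf{u}$
b) $\mathbf{u} \parallel \mathbf{v} \Rightarrow \mathbf{v} \parallel \mathbf{u}$
c) If $\mathbf{u} \parallel \mathbf{v}$ and $\mathbf{v} \parallel \mathbf{w}$, then $\mathbf{u} \parallel \mathbf{w}$.

5. Let $ABC$ be a triangle, $A_1$ the mid-point of $BC$, $B_1$ the mid-point of $AC$, and $C_1$ the mid-point of $AB$. Prove that the vectors $\overrightarrow{AA_1}$, $\overrightarrow{BB_1}$, and $\overrightarrow{CC_1}$ can be used to form a triangle.

6. A median in a tetrahedron is the line-segment from a vertex to the center of mass of the opposite side. Let $O$ be an arbitrary point and $PQRS$ a tetrahedron. Prove that the medians of $PQRS$ intersect at a point $T$ which is given by
$$\overrightarrow{OT} = \tfrac{1}{4}\left(\overrightarrow{OP} + \overrightarrow{OQ} + \overrightarrow{OR} + \overrightarrow{OS}\right).$$
The point $T$ is called the center of mass of the tetrahedron.

2.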
Bases and coordinates

Let $\ell$ be a line in space. We say that a vector $\mathbf{u}$ is parallel to $\ell$, or simply that $\mathbf{u}$ belongs to $\ell$, if $\mathbf{u}$ can be represented by a directed line segment of $\ell$. (Thus by "$\mathbf{u} \in \ell$", we really mean that some representative of $\mathbf{u}$ belongs to $\ell$. We hence use the phrase "belongs to" in slightly different meanings. This should not cause confusion, as long as the reader is aware of the distinction.) Likewise, we say that $\mathbf{u}$ belongs to a plane $\pi$ if $\mathbf{u}$ can be represented by a line segment in $\pi$.

Fix a line $\ell$ and let $\mathbf{e} \neq \mathbf{0}$ be a vector in $\ell$. For any other vector $\mathbf{u}$ in $\ell$ there is then a number $x$ such that
$$\mathbf{u} = x\mathbf{e}. \tag{1}$$
The vector $\mathbf{e}$ is called a basis for $\ell$ and $x$ is the coordinate for $\mathbf{u}$ with respect to the basis $\mathbf{e}$.

Now let $\pi$ be a plane and let $\mathbf{e}_1$, $\mathbf{e}_2$ be two non-parallel vectors in $\pi$. We claim that each vector $\mathbf{u}$ in $\pi$ can be written
$$\mathbf{u} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 \tag{2}$$
where $x_1$ and $x_2$ are real numbers. To show this, we first choose two lines $\ell_1$ and $\ell_2$ in $\pi$ such that $\mathbf{e}_1$ is in $\ell_1$ and $\mathbf{e}_2$ is in $\ell_2$. Let $O$ be the point of intersection between $\ell_1$ and $\ell_2$. To help our geometric intuition, we will in the following fix $O$ as our "origin", and place all vectors in $\pi$ so that they emanate from the point $O$. (So $\mathbf{e}_1 = \overrightarrow{OP_1}$ for some point $P_1$, $\mathbf{u} = \overrightarrow{OP}$, etc.) We then decompose $\mathbf{u}$ into a sum $\mathbf{u} = \mathbf{u}_1 + \mathbf{u}_2$ where $\mathbf{u}_1$ is on $\ell_1$ and $\mathbf{u}_2$ is on $\ell_2$. As in (1) we can then write $\mathbf{u}_1 = x_1\mathbf{e}_1$ and $\mathbf{u}_2 = x_2\mathbf{e}_2$ for some real numbers $x_1$ and $x_2$. This proves (2).

Definition 1. Two non-parallel vectors $\mathbf{e}_1$, $\mathbf{e}_2$ in a plane $\pi$ are called a basis for $\pi$. Given a vector $\mathbf{u} \in \pi$, the numbers $x_1$ and $x_2$ in (2) are uniquely determined; they are called the coordinates of $\mathbf{u}$ with respect to the basis $\mathbf{e}_1$, $\mathbf{e}_2$.

If the basis is fixed, and no misunderstandings can arise, we can suppress the basis vectors in (2), and simply write $\mathbf{u} = (x_1, x_2)$.

To describe all vectors in space in a similar way, we need three vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ which are not co-planar (i.e. they are not in one and the same plane).
We assert that every vector $\mathbf{u}$ then can be written
$$\mathbf{u} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + x_3\mathbf{e}_3 \tag{3}$$
where $x_1$, $x_2$, $x_3$ are real numbers, which are uniquely determined by $\mathbf{u}$. To prove this, we proceed in two steps. First, let $\pi$ be a plane containing $\mathbf{e}_1$ and $\mathbf{e}_2$ and let $\ell$ be a line containing $\mathbf{e}_3$. Let $O$ be the point of intersection between $\ell$ and $\pi$; in the following we place all vectors so that they emanate from $O$. We decompose $\mathbf{u}$ into a sum
$$\mathbf{u} = \mathbf{u}' + \mathbf{u}'', \quad \mathbf{u}' \in \pi, \quad \mathbf{u}'' \in \ell. \tag{4}$$
Applying (2) and (1) we can write $\mathbf{u}' = x_1\mathbf{e}_1 + x_2\mathbf{e}_2$ and $\mathbf{u}'' = x_3\mathbf{e}_3$ for some (unique) real numbers $x_1$, $x_2$, $x_3$. This proves (3).

Definition 2. Three vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ which are not co-planar are called a basis for three-dimensional space. If $\mathbf{u}$ is a three-dimensional vector, then the numbers $x_1$, $x_2$, $x_3$ in (3) are called the coordinates of $\mathbf{u}$ with respect to the basis $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$.

If the basis is understood, we can abbreviate the notation (3) and write $\mathbf{u} = (x_1, x_2, x_3)$.

Example. Let $OPQR$ be a tetrahedron. The vectors $\mathbf{e}_1 = \overrightarrow{OP}$, $\mathbf{e}_2 = \overrightarrow{OQ}$, $\mathbf{e}_3 = \overrightarrow{OR}$ then form a basis for three-dimensional space. Let $N$ be the center of mass of the triangle $PQR$. By Example 2 in the previous section, we then have
$$\overrightarrow{ON} = (1/3, 1/3, 1/3)$$
relative to this basis.

When we have a fixed basis for 3-space, we can compute with coordinates instead of vectors. We then have the rules
$$(x_1, x_2, x_3) + (y_1, y_2, y_3) = (x_1 + y_1, x_2 + y_2, x_3 + y_3) \tag{5}$$
$$t(x_1, x_2, x_3) = (tx_1, tx_2, tx_3). \tag{6}$$
The rule (5) corresponds to the rule for addition of vectors, for if $\mathbf{u} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + x_3\mathbf{e}_3$ and $\mathbf{v} = y_1\mathbf{e}_1 + y_2\mathbf{e}_2 + y_3\mathbf{e}_3$, then
$$\mathbf{u} + \mathbf{v} = (x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + x_3\mathbf{e}_3) + (y_1\mathbf{e}_1 + y_2\mathbf{e}_2 + y_3\mathbf{e}_3) = (x_1 + y_1)\mathbf{e}_1 + (x_2 + y_2)\mathbf{e}_2 + (x_3 + y_3)\mathbf{e}_3.$$
The proof of (6) is similar.

Projections. In the decomposition (4) of a vector $\mathbf{u}$, the vector $\mathbf{u}'$ is called the projection of $\mathbf{u}$ parallel to the line $\ell$ on the plane $\pi$, and $\mathbf{u}''$ is called the projection of $\mathbf{u}$ parallel to $\pi$ on $\ell$.
If the line $\ell$ is normal to the plane $\pi$, then the projection $\mathbf{u}'$ is called the orthogonal (or "right angled") projection of $\mathbf{u}$ on $\pi$. If we decompose both $\mathbf{u}$ and $\mathbf{v}$ in this way, we find
$$\mathbf{u} + \mathbf{v} = (\mathbf{u}' + \mathbf{v}') + (\mathbf{u}'' + \mathbf{v}''),$$
which means that $(\mathbf{u} + \mathbf{v})' = \mathbf{u}' + \mathbf{v}'$ and $(\mathbf{u} + \mathbf{v})'' = \mathbf{u}'' + \mathbf{v}''$.

Exercises.

7. Let $OPQR$ be a tetrahedron and introduce a basis for 3-space by $\mathbf{e}_1 = \overrightarrow{OP}$, $\mathbf{e}_2 = \overrightarrow{OQ}$, $\mathbf{e}_3 = \overrightarrow{OR}$. Let $A$ be the mid-point of the segment $OP$ and $B$ the mid-point of the segment $QR$. Also, let $C$ be the mid-point of the segment $AB$. Determine the coordinates of the vectors $\overrightarrow{OA}$, $\overrightarrow{OB}$, and $\overrightarrow{OC}$ relative to the basis $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$.

3. Linear Dependence

Let $\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_k$ be a collection of vectors. We introduce some basic terminology:

A vector $\mathbf{u}$ of the form $\mathbf{u} = \lambda_1\mathbf{u}_1 + \dots + \lambda_k\mathbf{u}_k$, where $\lambda_1, \dots, \lambda_k$ are real numbers, is called a linear combination of the vectors $\mathbf{u}_1, \dots, \mathbf{u}_k$.

The collection $\mathbf{u}_1, \mathbf{u}_2, \dots, \mathbf{u}_k$ is called linearly dependent if there are real numbers $\lambda_1, \dots, \lambda_k$, not all equal to zero, such that
$$\lambda_1\mathbf{u}_1 + \dots + \lambda_k\mathbf{u}_k = \mathbf{0}.$$
Otherwise, i.e., if
$$\lambda_1\mathbf{u}_1 + \dots + \lambda_k\mathbf{u}_k = \mathbf{0} \ \Rightarrow\ \lambda_1 = \lambda_2 = \dots = \lambda_k = 0,$$
we say that the collection is linearly independent.

Example 1. That two vectors $\mathbf{u}_1$, $\mathbf{u}_2$ are linearly dependent means precisely that there are $\lambda_1$, $\lambda_2$, not both zero, such that $\lambda_1\mathbf{u}_1 + \lambda_2\mathbf{u}_2 = \mathbf{0}$. If $\lambda_2 \neq 0$ this implies $\mathbf{u}_2 = t\mathbf{u}_1$ where $t = -\lambda_1/\lambda_2$, so $\mathbf{u}_1$ and $\mathbf{u}_2$ are parallel. The same conclusion holds if $\lambda_1 \neq 0$. Conversely, if $\mathbf{u}_1$ and $\mathbf{u}_2$ are parallel, say if $\mathbf{u}_1 = t\mathbf{u}_2$, then the relation $1 \cdot \mathbf{u}_1 + (-t)\mathbf{u}_2 = \mathbf{0}$ shows that $\mathbf{u}_1$, $\mathbf{u}_2$ are linearly dependent. We have shown that two vectors are linearly dependent if and only if they are parallel.

The example has the following generalization for arbitrary collections of vectors.

Theorem 3. A collection $\mathbf{u}_1, \dots, \mathbf{u}_k$ is linearly dependent if and only if (at least) one $\mathbf{u}_j$ can be written as a linear combination of the other $\mathbf{u}_i$'s.

Proof. ($\Rightarrow$) Assume that $\mathbf{u}_1, \dots, \mathbf{u}_k$ are linearly dependent.
Then there are real numbers $\lambda_1, \dots, \lambda_k$, not all zero, such that
$$\lambda_1\mathbf{u}_1 + \lambda_2\mathbf{u}_2 + \dots + \lambda_k\mathbf{u}_k = \mathbf{0}.$$
We can w.l.o.g. assume that $\lambda_1 \neq 0$. But then
$$\mathbf{u}_1 = (-\lambda_2/\lambda_1)\mathbf{u}_2 + \dots + (-\lambda_k/\lambda_1)\mathbf{u}_k,$$
which shows that $\mathbf{u}_1$ is a linear combination of $\mathbf{u}_2, \dots, \mathbf{u}_k$.

($\Leftarrow$) Suppose that some $\mathbf{u}_j$ is a linear combination of the other $\mathbf{u}_i$'s. We can w.l.o.g. assume that $j = 1$ and that
$$\mathbf{u}_1 = t_2\mathbf{u}_2 + \dots + t_k\mathbf{u}_k,$$
for some real numbers $t_2, \dots, t_k$. Then
$$1 \cdot \mathbf{u}_1 + (-t_2)\mathbf{u}_2 + \dots + (-t_k)\mathbf{u}_k = \mathbf{0}.$$
Since the coefficient of $\mathbf{u}_1$ is not zero, we infer that the collection is linearly dependent.

Remark 4. If a collection $\mathbf{u}_1, \dots, \mathbf{u}_k$ is linearly dependent, then any larger collection $\mathbf{u}_1, \dots, \mathbf{u}_k, \mathbf{u}_{k+1}, \dots, \mathbf{u}_n$ is also linearly dependent. This holds, since if $\lambda_1\mathbf{u}_1 + \dots + \lambda_k\mathbf{u}_k = \mathbf{0}$ where not all $\lambda_j$ are zero, then $\lambda_1\mathbf{u}_1 + \dots + \lambda_k\mathbf{u}_k + 0\mathbf{u}_{k+1} + \dots + 0\mathbf{u}_n = \mathbf{0}$.

Example 2. Now suppose that three vectors $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ are linearly dependent. Then one of them, say $\mathbf{u}_3$, can be written as a linear combination of $\mathbf{u}_1$ and $\mathbf{u}_2$:
$$\mathbf{u}_3 = t_1\mathbf{u}_1 + t_2\mathbf{u}_2.$$
Hence, if $\mathbf{u}_1$ and $\mathbf{u}_2$ are in a plane $\pi$, then $\mathbf{u}_3$ is also in $\pi$.

Conversely, if $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ belong to a plane $\pi$, then they are linearly dependent. To realize this, it suffices to observe that either $\mathbf{u}_1$, $\mathbf{u}_2$ are linearly dependent, whence $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ is also linearly dependent by the remark above, or $\mathbf{u}_1$, $\mathbf{u}_2$ is a basis for the plane $\pi$, whence $\mathbf{u}_3 \in \pi$ implies that $\mathbf{u}_3$ is a linear combination of $\mathbf{u}_1$ and $\mathbf{u}_2$.

We have shown that three vectors are linearly dependent if and only if they are co-planar. Equivalently, three vectors are linearly independent if and only if they form a basis for 3-space.

Example 3. Assume that the vectors $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ have coordinates
$$\mathbf{u}_1 = (1, -2, 2), \quad \mathbf{u}_2 = (-2, 3, 1), \quad \mathbf{u}_3 = (-1, 3, 2)$$
with respect to some basis for 3-space. We shall investigate whether or not the vectors $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ form a basis.
To this end, we consider the vector equation $\lambda_1\mathbf{u}_1 + \lambda_2\mathbf{u}_2 + \lambda_3\mathbf{u}_3 = \mathbf{0}$, which is equivalent to the linear system
$$\begin{cases} \lambda_1 - 2\lambda_2 - \lambda_3 = 0 \\ -2\lambda_1 + 3\lambda_2 + 3\lambda_3 = 0 \\ 2\lambda_1 + \lambda_2 + 2\lambda_3 = 0 \end{cases}$$
Solving this with the elimination method gives the only solution $\lambda_1 = \lambda_2 = \lambda_3 = 0$. Hence $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ is linearly independent and (by the result of Example 2) is a basis for 3-space.

A collection of four or more vectors in 3-space is always linearly dependent. To show this, it is enough to prove that four vectors are linearly dependent, for then every larger collection is linearly dependent as well. Thus let $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$, $\mathbf{u}_4$ be four arbitrary vectors in 3-space. Then either $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ are linearly dependent, and therefore so are $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$, $\mathbf{u}_4$, or $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ is a basis for 3-space, whence $\mathbf{u}_4$ is a linear combination of $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$. In either case, $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$, $\mathbf{u}_4$ is linearly dependent.

Exercises.

In the following exercises, vectors are expressed by their coordinates relative to some fixed basis $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$.

8. Prove that the vector $\mathbf{v} = (2, -7, 1)$ is in the plane spanned by the vectors $\mathbf{u}_1 = (2, -1, 3)$ and $\mathbf{u}_2 = (1, 1, 2)$. (The unique plane containing two linearly independent vectors is called the plane spanned by the vectors in question.) Determine the coordinates for $\mathbf{v}$ with respect to the basis $\mathbf{u}_1$, $\mathbf{u}_2$.

9. Are the vectors $(1, -2, 1)$, $(2, -1, -1)$, $(-1, -4, 5)$ co-planar?

10. Prove that the vectors $(1, 1, 2)$, $(4, 4, 9)$, $(2, 3, 7)$ form a basis for the three-dimensional space. Determine the coordinates of the vector $(5, 4, 3)$ relative to this basis.

11. For which values of $k$ are the following sets of vectors linearly independent?
a) $(k, k^2, k^3)$, $(2, 2, 2)$;
b) $(1, 1, 1)$, $(1, k, 2k)$, $(k, 1, k)$;
c) $(1, -1, -k)$, $(2, k, 4)$, $(k, 2, -4)$.

4. Lines and Planes

Fix a point $O$ in three-dimensional space. An arbitrary point $P$ can then be described by the position vector $\overrightarrow{OP}$.
For a given basis $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$ there are then unique numbers $x_1$, $x_2$, $x_3$ such that
$$\overrightarrow{OP} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2 + x_3\mathbf{e}_3.$$
The numbers $x_1$, $x_2$, $x_3$ are the coordinates of the point $P$ with respect to the coordinate system $O\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3$. If the coordinate system is clear from the context, we can simply write $P = (x_1, x_2, x_3)$.

Remark 5. In a similar way we can describe points in a plane $\pi$. Let $O$ be a point in $\pi$ and let $\mathbf{e}_1$, $\mathbf{e}_2$ be a basis for $\pi$. Then for each point $P$ in $\pi$ we can write
$$\overrightarrow{OP} = x_1\mathbf{e}_1 + x_2\mathbf{e}_2$$
for unique $x_1, x_2 \in \mathbb{R}$. We say that $x_1$, $x_2$ are the coordinates of $P$ in the coordinate system $O\mathbf{e}_1\mathbf{e}_2$ for $\pi$. In short: $P = (x_1, x_2)$.

Throughout the rest of this section we fix a coordinate system $O\mathbf{e}_1\mathbf{e}_2\mathbf{e}_3$ for three-dimensional space. Each point can then be identified with its coordinates, which we denote by $(x, y, z)$ rather than $(x_1, x_2, x_3)$.

Parametric representation. Let $\ell$ be a line in space. Suppose that $\ell$ passes through a point $Q$ and that a vector $\mathbf{u} \neq \mathbf{0}$ is parallel to $\ell$. Then a point $P$ belongs to $\ell$ if and only if
$$\overrightarrow{QP} = t\mathbf{u} \quad \text{for some } t \in \mathbb{R}. \tag{1}$$
The vector $\mathbf{u}$ is called a direction vector of $\ell$. Denote by $(a, b, c)$ the coordinates of $\mathbf{u}$ with respect to the basis $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$. Thus
$$\mathbf{u} = a\mathbf{e}_1 + b\mathbf{e}_2 + c\mathbf{e}_3.$$
If $Q = (x_0, y_0, z_0)$ and $P = (x, y, z)$, then the vector $\overrightarrow{QP} = \overrightarrow{OP} - \overrightarrow{OQ}$ has coordinates $(x - x_0, y - y_0, z - z_0)$. Hence (1) can be cast in the form
$$\begin{cases} x - x_0 = at \\ y - y_0 = bt \\ z - z_0 = ct \end{cases}, \quad t \in \mathbb{R}. \tag{2}$$
The relation (2) is known as the parametric representation of the line $\ell$.

Example 1. Consider the line $\ell$ which passes through the points $Q = (3, -6, -5)$ and $R = (4, -3, -3)$. A direction vector of $\ell$ is
$$\overrightarrow{QR} = \overrightarrow{OR} - \overrightarrow{OQ} = (4, -3, -3) - (3, -6, -5) = (1, 3, 2).$$
Since $\ell$ passes through $Q$, we conclude that
$$\begin{cases} x - 3 = t \\ y + 6 = 3t \\ z + 5 = 2t \end{cases}, \quad t \in \mathbb{R},$$
is a parametric representation for $\ell$.

In a similar way, one can represent a plane $\pi$ in space. Namely, let $Q$ be a point in $\pi$ and $\mathbf{u}$, $\mathbf{v}$ a basis for $\pi$ (i.e. two non-parallel vectors which are parallel to $\pi$).
Then a point $P$ in space belongs to $\pi$ if and only if
$$(3)\qquad \overrightarrow{QP} = su + tv \quad \text{for some } s, t \in \mathbb{R}.$$
The numbers $s, t$ are the coordinates for $P$ in the coordinate system $Quv$ of $\pi$. If we denote
$$u = ae_1 + be_2 + ce_3, \quad v = a'e_1 + b'e_2 + c'e_3, \quad Q = (x_0, y_0, z_0),$$
then (3) can be written
$$(4)\qquad \begin{cases} x - x_0 = as + a't \\ y - y_0 = bs + b't \\ z - z_0 = cs + c't \end{cases} \qquad s, t \in \mathbb{R}.$$
This formula is called a parametric representation of the plane $\pi$.

Example 2. Consider the plane $\pi$ passing through the points $Q = (1, 2, 0)$, $R_1 = (0, 1, 1)$, and $R_2 = (2, -1, -3)$. Two non-parallel vectors parallel to $\pi$ are
$$u = \overrightarrow{QR_1} = (-1, -1, 1), \quad v = \overrightarrow{QR_2} = (1, -3, -3).$$
Hence
$$\begin{cases} x - 1 = -s + t \\ y - 2 = -s - 3t \\ z = s - 3t \end{cases} \qquad s, t \in \mathbb{R},$$
is a parametric representation of $\pi$.

Example 3. Let us determine the point of intersection between the line $\ell$ of Example 1 and the plane $\pi$ of Example 2. To this end, we use different names for the $t$-parameters in the representations for $\ell$ and $\pi$ and write
$$\ell: \begin{cases} x - 3 = t_1 \\ y + 6 = 3t_1 \\ z + 5 = 2t_1 \end{cases} \qquad \pi: \begin{cases} x - 1 = -s + t_2 \\ y - 2 = -s - 3t_2 \\ z = s - 3t_2. \end{cases}$$
At the point of intersection, the values of $x$, $y$, and $z$ must match, i.e.,
$$\begin{cases} 3 + t_1 = 1 - s + t_2 \\ -6 + 3t_1 = 2 - s - 3t_2 \\ -5 + 2t_1 = s - 3t_2 \end{cases} \Leftrightarrow \begin{cases} s + t_1 - t_2 = -2 \\ s + 3t_1 + 3t_2 = 8 \\ -s + 2t_1 + 3t_2 = 5. \end{cases}$$
After Gaussian elimination, this gives $s = 2$, $t_1 = -1$, and $t_2 = 3$. Inserting $t_1 = -1$ into the equation for $\ell$ gives $(x, y, z) = (2, -9, -7)$.

The equation of a plane. We claim that planes in space correspond precisely to equations of the form
$$(5)\qquad Ax + By + Cz = D$$
where not all coefficients $A, B, C$ are zero. Indeed, suppose that $A \neq 0$ (the cases $B \neq 0$ and $C \neq 0$ are analogous). Then a point $(x, y, z)$ satisfies (5) if and only if
$$\begin{cases} x - (D/A) = -(B/A)s - (C/A)t \\ y = s \\ z = t \end{cases}$$
for some $s, t \in \mathbb{R}$. Thus (5) describes the plane passing through the point $(D/A, 0, 0)$, which is parallel to the vectors $(-B/A, 1, 0)$, $(-C/A, 0, 1)$.
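The passage from the equation $Ax + By + Cz = D$ to a parametric form can be checked on a concrete plane. The following is a minimal Python sketch under stated assumptions: the function names `plane_point` and `satisfies` and the sample plane $3x - y + 2z = 1$ are chosen here for illustration only, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def plane_point(A, B, C, D, s, t):
    """Point of the plane Ax + By + Cz = D with y = s and z = t (requires A != 0)."""
    A, B, C, D = (Fraction(v) for v in (A, B, C, D))
    x = D / A - (B / A) * s - (C / A) * t
    return (x, Fraction(s), Fraction(t))

def satisfies(A, B, C, D, p):
    """True if the point p = (x, y, z) fulfils Ax + By + Cz = D."""
    x, y, z = p
    return A * x + B * y + C * z == D

# Sample plane (an illustrative choice): 3x - y + 2z = 1
A, B, C, D = 3, -1, 2, 1

# Every parameter pair (s, t) should give a point on the plane.
points = [plane_point(A, B, C, D, s, t) for s in range(-2, 3) for t in range(-2, 3)]
all_on_plane = all(satisfies(A, B, C, D, p) for p in points)
print(all_on_plane)
```

The point obtained for $s = t = 0$ is $(D/A, 0, 0)$, in agreement with the description of the plane given above.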
Conversely, one can, by eliminating $s$ and $t$ in the parametric representation (4), prove that any plane can be described by an equation of the form (5). We omit the details, since we shall find an easier method to prove this later on. Instead, we turn to some examples.

Example 4. Consider again the plane
$$\pi: \begin{cases} x - 1 = -s + t \\ y - 2 = -s - 3t \\ z = s - 3t \end{cases} \qquad s, t \in \mathbb{R}.$$
A point $(x, y, z)$ is in $\pi$ if and only if there are $s$ and $t$ satisfying these equations. To determine when this is the case, we can first eliminate $s$ from the last two equations, and then $t$ from the last equation:
$$\begin{cases} x - 1 = -s + t \\ y - 2 = -s - 3t \\ z = s - 3t \end{cases} \Leftrightarrow \begin{cases} x - 1 = -s + t \\ -x + y - 1 = -4t \\ x + z - 1 = -2t \end{cases} \Leftrightarrow \begin{cases} x - 1 = -s + t \\ -x + y - 1 = -4t \\ 3x - y + 2z - 1 = 0. \end{cases}$$
Since we can always choose $s$ and $t$ so that the first two equations are satisfied, we see that a point $(x, y, z)$ belongs to $\pi$ if and only if the third equation is satisfied, i.e., iff $3x - y + 2z - 1 = 0$.

Lines in space can be described as the intersection between two non-parallel planes. This is illustrated by the following example.

Example 5. The points $(x, y, z)$ which belong to both of the planes $2x + y - 3z = 5$ and $x + 2y - z = 4$ are precisely the solutions to the system
$$\begin{cases} 2x + y - 3z = 5 \\ x + 2y - z = 4 \end{cases} \sim \begin{cases} 2x + y - 3z = 5 \\ 3y + z = 3. \end{cases}$$
Inserting $z = 3t$ (to obtain integer coefficients in the parametric representation) we get
$$\begin{cases} x = 2 + 5t \\ y = 1 - t \\ z = 3t. \end{cases}$$
The intersection line thus passes through the point $(2, 1, 0)$ and has direction vector $(5, -1, 3)$.

Remark 6. To describe a line in space we need two equations of the form $Ax + By + Cz = D$. On the other hand, if we only consider points in a plane $\pi$, and if $x, y$ denote coordinates with respect to some coordinate system in $\pi$, then one equation $Ax + By = C$ is sufficient to describe a line. If for example $A \neq 0$, then $(x, y)$ satisfies the equation if and only if
$$\begin{cases} x = (C/A) - (B/A)t \\ y = t \end{cases}$$
for some $t \in \mathbb{R}$.
This is a parametric representation of a line in $\pi$ passing through the point $(C/A, 0)$, with direction vector $(-B/A, 1)$.

Exercises.

12. Give a parametric representation for the line passing through the points $(1, -1, 4)$ and $(2, 3, 5)$.
13. Consider the lines
$$\ell_1: \begin{cases} x = 2 + t \\ y = 1 - t \\ z = 2t \end{cases} \qquad \ell_2: \begin{cases} x = 1 + 2t \\ y = t \\ z = -1 + t. \end{cases}$$
Do they intersect? Are they parallel?
14. Determine whether the following lines intersect:
$$\begin{cases} x = 1 + 15t \\ y = -4 - 21t \\ z = 5 + 33t \end{cases} \qquad \begin{cases} x = 6 - 65t \\ y = -11 + 91t \\ z = 16 - 143t. \end{cases}$$
15. Find a parametric representation of the line $\ell$ which passes through the point $(3, 2, -1)$ and intersects the lines
$$\begin{cases} x = 10 + 5t \\ y = 5 + t \\ z = 2 + 2t \end{cases} \quad \text{and} \quad \begin{cases} x = 1 + t \\ y = t \\ z = -5 + t. \end{cases}$$
16. Find a parametric representation of the plane $\pi$ which passes through the points $(2, 3, 0)$, $(1, 5, 2)$, and $(-1, 4, 3)$.
17. Are the four points $(-1, -1, 0)$, $(0, 4, 1)$, $(1, 0, -1)$, $(1, -3, -2)$ co-planar?
18. Find, in the form $Ax + By + Cz = D$, the equation for the plane $\pi$ which passes through the point $(1, -1, 2)$ and which contains the line
$$\begin{cases} x = 3 - t \\ y = 2 + 2t \\ z = 1 - 3t. \end{cases}$$
19. Prove that a line with direction vector $(a, b, c)$ is parallel to the plane $Ax + By + Cz = D$ if and only if $Aa + Bb + Cc = 0$.
20. Find a parametric representation for the line which passes through the point $(1, 2, 4)$ and which is parallel to the planes $2x + y - z = 3$ and $3x - 3y + z = 0$.
21. Determine the equation for the plane which contains the line
$$\begin{cases} 3x + 4y + z = 5 \\ x - y = -6 \end{cases}$$
and which passes through the mid-point of the segment between the points $(1, 1, 2)$ and $(3, 1, 4)$.

5. Answers to Exercises

1. a) $a$ b) $b - a$ c) $\tfrac{1}{2}(a + b)$ d) $\tfrac{1}{2}(b - a)$.
2. a) $\overrightarrow{OC}$ b) $\overrightarrow{AB}$ c) $\overrightarrow{OD}$ d) $\overrightarrow{AL}$.
7. $\overrightarrow{OA} = (1/2, 0, 0)$, $\overrightarrow{OB} = (0, 1/2, 1/2)$, $\overrightarrow{OC} = (1/4, 1/4, 1/4)$.
8. $v = 3u_1 - 4u_2$.
9. Yes, $(-1, -4, 5) = 3(1, -2, 1) - 2(2, -1, -1)$.
10. $(5, 4, 3) = 23(1, 1, 2) - 4(4, 4, 9) - (2, 3, 7)$. The coordinates are thus $(23, -4, -1)$.
11. a) $k \in \{0, 1, -1\}$ b) $k \in \{1/2, 1\}$ c) $k \in \{-2, 4\}$.
12. $x = 1 + t$, $y = -1 + 4t$, $z = 4 + t$.
13. The lines do not intersect and they are not parallel.
14. The lines coincide.
15. $\ell$: $x = 5 + 2t$, $y = 4 + 2t$, $z = t$.
16. $\pi$: $x = 2 - s - 3t$, $y = 3 + 2s + t$, $z = 2s + 3t$.
17. Yes.
18. $\pi$: $5x - 3y - 14z = 60$.
19. $\pi$: $x - y - z = 0$.
20. $x = 1 + 2t$, $y = 2 + 5t$, $z = 4 + 9t$.
21. $13x + 36y + 7z = 83$.

CHAPTER 3 Distance and Angle

1. Scalar Product

We start with an important definition.

Definition 7. Let $u, v$ be two vectors in space and let $\theta$ be the angle between them (in the interval $0 \le \theta \le \pi$). Their scalar product is the number $(u|v)$ defined by
$$(u|v) = \|u\| \cdot \|v\| \cdot \cos\theta.$$
If $u = 0$ or $v = 0$, the angle $\theta$ is not defined, but in these cases we define $(u|v) = 0$.

The word "scalar" is a synonym for "number"; the term "scalar product" is used because the result $(u|v)$ is a real number and not a vector.

Now consider the case when $v = e$ is a unit vector, i.e., $\|e\| = 1$. If $\theta$ is the angle between $u$ and $e$, then
$$(1)\qquad (u|e) = \|u\| \cos\theta.$$
We shall find an important geometric interpretation of this formula: Place $u$ and $e$ so that they emanate from the same point and let $\ell$ be a line through this point, with direction vector $e$. The orthogonal projection $u''$ of $u$ on $\ell$ (see Chapter 2, end of Section 2) can then be seen to be
$$(2)\qquad u'' = (\|u\| \cos\theta)\,e.$$
(This can be verified by an elementary trigonometrical argument; we ask the reader to supply the details.) Comparing the formulas (1) and (2), it is seen that $u'' = (u|e) \cdot e$, i.e. the number $(u|e)$ is the coordinate of the orthogonal projection $u''$ in the basis $e$ for $\ell$.

If $v \neq 0$ is not a unit vector, we write
$$(u|v) = \|v\| \cdot (u|e),$$
where $e = \frac{1}{\|v\|} \cdot v$ is a unit vector. Hence the absolute value $|(u|v)|$ equals the length of $v$ times the length of the orthogonal projection of $u$ on a line with direction vector $v$. The sign of $(u|v)$ is positive if the angle between $u$ and $v$ is acute, and negative if it is obtuse.

The scalar product satisfies the following rules.
(I) $(u|v) = (v|u)$
(II) $(u + v|w) = (u|w) + (v|w)$
(III) $(tu|v) = t(u|v)$
(IV) $(u|u) \ge 0$ with equality $\Leftrightarrow u = 0$.
(V) $\|u\| = \sqrt{(u|u)}$

The rule (I) is called symmetry of the scalar product; (II) and (III) together are called linearity in the first argument; (IV) is called positive definiteness.

The properties (I), (III), (IV) and (V) are immediate; (II), written in the form $(u + w|v) = (u|v) + (w|v)$, can be realized in the following way: If $v = e$ is a unit vector, then, by the geometrical interpretation of the scalar product, we can see that (II) means precisely that $(u + w)'' = u'' + w''$. That this is the case was shown at the end of Chapter 2, Section 2. This shows (II) when $v$ is a unit vector. For general $v$, we now get (II) by applying the unit vector case to the vector $e$ in the formula $(u|v) = \|v\| \cdot (u|e)$.

Remark 8. Note that linearity in the second argument follows from the symmetry and the linearity in the first argument. In particular, we have $(u|tv) = t(u|v)$.

As an application of the scalar product, we prove a well-known theorem. We say that two vectors $u, v$ are orthogonal to each other if $(u|v) = 0$. (This means that the angle $\theta = \pi/2$, or that one of $u$ or $v$ is the zero vector.)

Theorem 9. (Pythagoras' Theorem) If $u$ and $v$ are orthogonal, then $\|u + v\|^2 = \|u\|^2 + \|v\|^2$.

Proof. If $u, v$ are any vectors (orthogonal or not), then by the computational rules above
$$\|u + v\|^2 = (u + v|u + v) = (u|u) + (u|v) + (v|u) + (v|v) = \|u\|^2 + 2(u|v) + \|v\|^2.$$
Hence if $(u|v) = 0$, then $\|u + v\|^2 = \|u\|^2 + \|v\|^2$.

Exercises.

1. Let $\theta$ be the angle between the sides $AB$ and $BC$ in a triangle $ABC$. Prove the law of cosines:
$$\|\overrightarrow{AC}\|^2 = \|\overrightarrow{AB}\|^2 + \|\overrightarrow{BC}\|^2 - 2\,\|\overrightarrow{AB}\|\,\|\overrightarrow{BC}\| \cos\theta.$$
(Observe that this reduces to Pythagoras' Theorem when $\theta = \pi/2$.)
2. Denote by $a$ and $b$ the side-lengths in a parallelogram, and by $c$ and $d$ the lengths of the diagonals. Prove the parallelogram law: $c^2 + d^2 = 2(a^2 + b^2)$.

2. Orthonormal Bases

Let $\pi$ be a plane and $e_1, e_2$ a basis for $\pi$.
If $u = x_1 e_1 + x_2 e_2$ and $v = y_1 e_1 + y_2 e_2$ are two vectors in $\pi$, then by the rules (I)–(III) for the scalar product,
$$(u|v) = x_1 y_1 (e_1|e_1) + (x_1 y_2 + x_2 y_1)(e_1|e_2) + x_2 y_2 (e_2|e_2).$$
This expression becomes particularly simple if the basis vectors $e_1, e_2$ are unit vectors making a right angle with each other. Namely, then $(e_1|e_1) = (e_2|e_2) = 1$ and $(e_1|e_2) = 0$, and thus
$$(3)\qquad (u|v) = x_1 y_1 + x_2 y_2.$$
A basis for a plane consisting of two orthogonal unit vectors is called an orthonormal basis (in short: ON-basis) for the plane.

The corresponding definition in three dimensions is the following. We say that three vectors $e_1, e_2, e_3$ are pairwise orthogonal if $(e_j|e_k) = 0$ when $j \neq k$. Three vectors $e_1, e_2, e_3$ are said to form an orthonormal basis for three-dimensional space if (i) they have unit length, and (ii) they are pairwise orthogonal. We can summarize the definition of orthonormal basis by the equation
$$(e_j|e_k) = \begin{cases} 0 & \text{if } j \neq k \\ 1 & \text{if } j = k. \end{cases}$$
If $e_1, e_2, e_3$ form an orthonormal basis and $u = x_1 e_1 + x_2 e_2 + x_3 e_3$ and $v = y_1 e_1 + y_2 e_2 + y_3 e_3$, then
$$(4)\qquad (u|v) = x_1 y_1 + x_2 y_2 + x_3 y_3.$$
When $u = v$ this reduces to
$$(5)\qquad \|u\|^2 = x_1^2 + x_2^2 + x_3^2.$$
This can be regarded as the three-dimensional version of the theorem of Pythagoras. ($u$ is here the sum of three pairwise orthogonal vectors $x_1 e_1$, $x_2 e_2$, $x_3 e_3$.)

Example 1. Suppose that $u$ and $v$ have coordinates $u = (4, 1, 1)$ and $v = (2, 2, -1)$ with respect to some orthonormal basis. We shall determine the angle $\theta$ between $u$ and $v$. For this purpose we use (4) and (5) to compute
$$(u|v) = 4 \cdot 2 + 1 \cdot 2 + 1 \cdot (-1) = 9, \qquad \|u\| = \sqrt{16 + 1 + 1} = 3\sqrt{2}, \qquad \|v\| = \sqrt{4 + 4 + 1} = 3.$$
Since $(u|v) = \|u\|\|v\| \cos\theta$, this gives
$$\cos\theta = \frac{(u|v)}{\|u\|\|v\|} = \frac{9}{3\sqrt{2} \cdot 3} = \frac{1}{\sqrt{2}}.$$
This means that $\theta = \pi/4$.

Example 2. The formula (3) can be used to prove many trigonometrical identities. (As we know, the de Moivre formula for multiplication of complex numbers provides another way to do this.)
As an example, we shall now show the following version of the addition formula for cosines:
$$(6)\qquad \cos(\alpha - \beta) = \cos\alpha \cos\beta + \sin\alpha \sin\beta.$$
To this end, let $e_1, e_2$ be an orthonormal basis for the plane and put
$$u = (\cos\alpha)e_1 + (\sin\alpha)e_2, \qquad v = (\cos\beta)e_1 + (\sin\beta)e_2.$$
Since $u$ and $v$ then have length 1, while the angle between them is $\alpha - \beta$, the definition of scalar product shows that $(u|v) = \cos(\alpha - \beta)$. (We have here used the fact that cos is even: $\cos(\alpha - \beta) = \cos(\beta - \alpha)$.) On the other hand, by (3), we have
$$(u|v) = \cos\alpha \cos\beta + \sin\alpha \sin\beta.$$
Comparing the two expressions for $(u|v)$, we infer that the formula (6) holds.

Exercises.

3. A triangle in space has vertices at the points $(1, 0, 2)$, $(0, -1, 1)$, and $(2, 1, 2)$ according to some orthonormal basis. Compute all side-lengths and cosines of angles in the triangle.
4. Let $e_1, e_2, e_3$ be an orthonormal basis in space. Suppose that the vector $u$ makes the angle $\pi/4$ to the vector $e_1$ and the angle $\pi/3$ to the vector $e_2$. What are the possible angles between $u$ and $e_3$?

3. Computing Distances and Angles

A coordinate system $Oe_1e_2e_3$ where $e_1, e_2, e_3$ is an orthonormal basis is called an orthonormal system or an ON-system. In this section we fix an ON-system and assume that all points are represented in that system. The distance between two points $P = (x, y, z)$ and $Q = (x_0, y_0, z_0)$ is then given by
$$\|\overrightarrow{PQ}\| = \sqrt{(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2}.$$

Distance between a point and a line. Consider a line $\ell$ and a point $P$ which is not on the line. By the distance between $P$ and $\ell$ we mean the shortest possible distance from $P$ to a point $R \in \ell$. A geometric consideration shows that the closest point $R$ is determined by the condition that $\overrightarrow{PR}$ be orthogonal to the direction vector of $\ell$. (Draw a figure!)

Example 1. Let $\ell$ be the line through the points $Q = (-2, 1, 1)$ and $S = (0, -1, 2)$, and let $P = (1, 2, 1)$. We shall compute the distance from $P$ to $\ell$.
First note that $\ell$ has direction vector $\overrightarrow{QS} = (2, -2, 1)$, so it has the parametric representation
$$\ell: \begin{cases} x = -2 + 2t \\ y = 1 - 2t \\ z = 1 + t. \end{cases}$$
We now seek the closest point $R \in \ell$ to $P$. To this end we put $R = (-2 + 2t, 1 - 2t, 1 + t)$, so that $\overrightarrow{PR} = (-3 + 2t, -1 - 2t, t)$, and we must determine $t$ so that $\overrightarrow{PR}$ is orthogonal to the direction vector $(2, -2, 1)$ of $\ell$. That is, we shall have
$$0 = (\overrightarrow{PR}|\overrightarrow{QS}) = (-3 + 2t) \cdot 2 + (-1 - 2t) \cdot (-2) + t \cdot 1 = 9t - 4.$$
This gives $t = 4/9$ and $\overrightarrow{PR} = \frac{1}{9}(-19, -17, 4)$. The distance from $P$ to $\ell$ is thus
$$\|\overrightarrow{PR}\| = \frac{1}{9}\sqrt{19^2 + 17^2 + 4^2} = \frac{\sqrt{74}}{3}.$$

Distance from a point to a plane. Let $\pi$ be a plane in space, $Q$ a point in $\pi$, and $n$ a unit normal vector to $\pi$. (This means that $n$ has unit length and points in a direction orthogonal to $\pi$.) The distance $d$ between a point $P$ and $\pi$ is again defined as the smallest distance from $P$ to some point in $\pi$. We assert that
$$(7)\qquad d = \left|(n|\overrightarrow{QP})\right|.$$
To prove this, it suffices to note that the orthogonal projection of the vector $\overrightarrow{QP}$ on $n$ is equal to $(n|\overrightarrow{QP}) \cdot n$. The length of the latter vector must thus equal the sought distance, which proves (7).

Now suppose that $Q = (x_0, y_0, z_0)$, $P = (x, y, z)$, and $n = (A, B, C)$. Then
$$(n|\overrightarrow{QP}) = A(x - x_0) + B(y - y_0) + C(z - z_0).$$
If we put $D = Ax_0 + By_0 + Cz_0$, then (7) can be written
$$(8)\qquad d = |Ax + By + Cz - D|.$$
But $P$ belongs to $\pi$ precisely when $d = 0$, i.e., when
$$(9)\qquad Ax + By + Cz = D.$$
We have shown that an arbitrary plane can be described by a linear equation of the type (9). If $P = (x, y, z)$ is not in $\pi$, then its distance to $\pi$ is given by (8).

Remark 10. The argument above shows that if we have a plane of the form (9), then $n = (A, B, C)$ is a normal vector. Above we assumed that $n$ is a unit vector, i.e., that $A^2 + B^2 + C^2 = 1$. If this is not satisfied, one must first normalize the equation (9) by dividing by $\sqrt{A^2 + B^2 + C^2}$. The distance formula then becomes
$$(10)\qquad d = \frac{|Ax + By + Cz - D|}{\sqrt{A^2 + B^2 + C^2}}.$$

Example 2.
We shall determine an equation for the plane $\pi$ which passes through the point $(1, -1, 2)$ and has as a normal the line
$$\ell: \begin{cases} x = 1 + 3t \\ y = 3 + 2t \\ z = 2 - t. \end{cases}$$
The direction vector of $\ell$, i.e. the vector $(3, 2, -1)$, must then be a normal vector of $\pi$. Since the point $(1, -1, 2)$ belongs to $\pi$, the equation of the plane becomes
$$3(x - 1) + 2(y - (-1)) + (-1)(z - 2) = 0,$$
i.e., $3x + 2y - z = -1$. The distance from the point $P = (5, 6, 7)$ to $\pi$ is thus (see (10))
$$\frac{|3 \cdot 5 + 2 \cdot 6 - 1 \cdot 7 + 1|}{\sqrt{3^2 + 2^2 + 1^2}} = \frac{21}{\sqrt{14}}.$$

Distance between two lines. Given two lines which do not intersect, we define their distance to be the smallest possible distance between one point on the first line and one on the second one. The following example illustrates a method to compute this kind of distance.

Example 3. Consider the lines $\ell_1$ and $\ell_2$ having parametric representations
$$\ell_1: \begin{cases} x = -3 + 2t \\ y = t \\ z = 1 - t \end{cases} \qquad \ell_2: \begin{cases} x = 3 + t \\ y = 4 + 3t \\ z = 2 + 2t. \end{cases}$$
We shall compute the (shortest) distance between $\ell_1$ and $\ell_2$. Direction vectors for the two lines are $(2, 1, -1)$ and $(1, 3, 2)$. Let $\pi$ be the plane parallel to these directions, which passes through the point $(-3, 0, 1)$ on $\ell_1$. Then $\ell_1 \subset \pi$ and also, $\pi$ is parallel to $\ell_2$. Hence the distance $d$ between $\ell_1$ and $\ell_2$ must equal the distance from an arbitrary point on $\ell_2$ to $\pi$. A parametric representation of $\pi$ is
$$\pi: \begin{cases} x = -3 + 2t + s \\ y = t + 3s \\ z = 1 - t + 2s. \end{cases}$$
Elimination of $s$ and $t$ gives the equation $x - y + z = -2$ for the plane $\pi$. The distance from the point $(3, 4, 2)$ on $\ell_2$ to $\pi$ is, according to (10),
$$d = \frac{|3 - 4 + 2 + 2|}{\sqrt{1 + 1 + 1}} = \frac{3}{\sqrt{3}} = \sqrt{3}.$$
The distance between the lines is thus $\sqrt{3}$.

Angle between two planes. If $\pi_1, \pi_2$ are two planes, we define the angle between them to be the angle between the corresponding normal vectors. (In general there are two possibilities for the angle, depending on the mutual orientations of the normal vectors: see the example below.)

Example 4. Suppose that $\pi_1: x - 2y - 2z = -3$ and $\pi_2: x + 4y + z = 5$.
Corresponding normal vectors are $n_1 = (1, -2, -2)$ and $n_2 = (1, 4, 1)$. The angle $\theta$ between the planes then satisfies
$$\cos\theta = \frac{(n_1|n_2)}{\|n_1\|\|n_2\|} = \frac{1 \cdot 1 + (-2) \cdot 4 + (-2) \cdot 1}{\sqrt{1^2 + 2^2 + 2^2} \cdot \sqrt{1^2 + 4^2 + 1^2}} = -\frac{1}{\sqrt{2}}.$$
This gives $\theta = 3\pi/4$. This is the obtuse angle between the planes. There is also another possibility, namely if we substitute $-n_1$ for $n_1$ above. This leads to the acute angle $\pi - \theta = \pi/4$.

Angle between a line and a plane. To determine the angle between a line $\ell$ and a plane $\pi$, one first computes the acute angle $\psi$ between $\ell$ and a normal vector to $\pi$. The angle $\varphi$ between $\ell$ and $\pi$ is defined by $\varphi + \psi = \pi/2$.

Example 5. Suppose that
$$\ell: \begin{cases} x = 2 + t \\ y = 3 + t \\ z = 1 + 4t \end{cases} \qquad \pi: 4x - 11y - 5z = -2.$$
Let $\theta$ denote the angle between the direction vector $(1, 1, 4)$ of $\ell$ and the normal vector $(4, -11, -5)$ of $\pi$. Then
$$\cos\theta = \frac{1 \cdot 4 + 1 \cdot (-11) + 4 \cdot (-5)}{\sqrt{1^2 + 1^2 + 4^2} \cdot \sqrt{4^2 + 11^2 + 5^2}} = -\frac{1}{2}.$$
This gives $\theta = 2\pi/3$, which is obtuse. Hence the acute angle between $\ell$ and the normal to $\pi$ is $\psi = \pi - \theta = \pi/3$. The angle between $\ell$ and $\pi$ is thus $\varphi = \pi/2 - \psi = \pi/6$.

Exercises.

5. Compute the distance between the point $(1, 2, 3)$ and the line
$$\begin{cases} x = 1 - t \\ y = -4 + 2t \\ z = 3 - t. \end{cases}$$
6. The line $\ell$ is the intersection between the planes $x + 2y - 2z = 5$ and $2x - y + z = 0$. Determine the point on $\ell$ which is closest to the origin.
7. The line $\ell$ passes through the point $(1, 2, 3)$ and is perpendicular to the plane $2x - 3y + z = -3$. Find the distance between $\ell$ and the point $(4, 5, 6)$.
8. Determine, in the form $Ax + By + Cz = D$, the equation of the plane which consists of all points which have equal distance to the points $(1, 2, 0)$ and $(-1, 0, 2)$.
9. Find the distance from the plane $3x - 4y + 12z = 13$ to the points $(0, 0, 0)$ and $(2, 1, 3)$. Are these points on the same or on opposite sides of the plane?
10.
a) Determine, in the form $Ax + By + Cz = D$, an equation for the plane $M$ which passes through the points $(2, -3, 0)$ and $(2, -2, 2)$, and is parallel to the line $x = 2 - t$, $y = 1 + t$, $z = 2 - t$. b) Find the distance between the point $(3, -1, 0)$ and $M$.
11. Find the point in the plane through the points $(1, 3, -1)$, $(1, 1, 0)$, $(-1, 3, 2)$ which is closest to the point $(-2, -2, -1)$.
12. a) Prove that the lines
$$\ell_1: \begin{cases} x = 1 + t \\ y = 2 - t \\ z = 3 + 2t \end{cases} \quad \text{and} \quad \ell_2: \begin{cases} x = 3 + t \\ y = 2 + t \\ z = 2 - 3t \end{cases}$$
intersect at a point. b) Find the distance between the point $(3, 4, 5)$ and the plane spanned by $\ell_1$ and $\ell_2$.
13. A ray of light is emitted from the point $(3, -2, -1)$ and reflected off the plane $x - 2y - 2z = 0$. The reflected ray passes the point $(4, -1, -6)$. At which point does the ray hit the plane?
14. Determine the distance between the lines
a) $(x, y, z) = t(-3, 3, 1)$ and $(x, y, z) = (-1, 0, 0) + t(1, 1, 1)$.
b) $(x, y, z) = (1, 2, 3) + t(0, 1, 1)$ and $(x, y, z) = (1, 1, 1) + t(2, 3, 1)$.
15. Consider the lines
$$\ell_1: \begin{cases} x = -12 - t \\ y = 4 - 2t \\ z = 1 + t \end{cases} \quad \text{and} \quad \ell_2: \begin{cases} x = 8 - 3t \\ y = 2 - t \\ z = -3 + t. \end{cases}$$
Determine, in the form $Ax + By + Cz = D$, an equation for the plane which is parallel to $\ell_1$ and $\ell_2$ and has the same distance to the two lines.
16. a) Determine, in the form $Ax + By + Cz = D$, an equation for the plane $M$ which passes through the points $(2, -1, 3)$, $(1, 2, -2)$, and $(1, 0, 2)$. b) Determine the angle between $M$ and the plane $2x + y - z = -1$.
17. Determine the angle between the plane $x + 2y - z = 0$ and the line $(x, y, z) = (3, 5, -1) + t(1, 1, 0)$.
18. A tetrahedron has corners $A = (-1, 2, 0)$, $B = (1, 3, -1)$, $C = (1, 1, 0)$, and $D = (-1, 3, -2)$. Determine the angle between the plane containing the side $BCD$ and the line containing the edge $AB$.

CHAPTER 4 Second degree curves

1. Ellipse, Hyperbola, Parabola

Circle. Let $F$ be a point in a plane $\pi$ and consider the circle consisting of all points $P$ in $\pi$ at a given distance $a$ from $F$.
If $F$ and $P$ have coordinates $(x_0, y_0)$ and $(x, y)$ respectively, where coordinates are represented in some ON-system for $\pi$, then the equation of the circle can be written
$$(x - x_0)^2 + (y - y_0)^2 = a^2.$$
The point $F$ is called the center and $a$ is the radius of the circle. If $F$ is the origin, the equation reduces to $x^2 + y^2 = a^2$.

We shall now discuss the other basic types of second degree curves: ellipse, hyperbola, and parabola.

Ellipse. The definition of an ellipse generalizes the definition of a circle. Let $F_1$ and $F_2$ be two points in a plane $\pi$ and let $a$ be a positive constant; we assume that $2a$ is greater than the distance between $F_1$ and $F_2$. The set of points $P$ in $\pi$ with the property that the sum of the distances from $P$ to $F_1$ and from $P$ to $F_2$ equals $2a$ is called an ellipse.

If we choose an ON-system in $\pi$ such that the origin is the midpoint of the segment $F_1F_2$, and the $x$-axis passes through the points $F_1$ and $F_2$, then $F_1$ and $F_2$ have coordinates $(-c, 0)$ and $(c, 0)$ for some real $c$. We can assume that $c > 0$. That the sum of distances from $P = (x, y)$ to $F_1$ and $F_2$ equals $2a$ means that
$$(1)\qquad \sqrt{(x + c)^2 + y^2} + \sqrt{(x - c)^2 + y^2} = 2a.$$
Squaring the equation (1) leads to
$$(x + c)^2 + (x - c)^2 + 2y^2 + 2\sqrt{\left((x + c)^2 + y^2\right)\left((x - c)^2 + y^2\right)} = 4a^2.$$
Dividing by 2 and rearranging, this becomes
$$x^2 + y^2 + c^2 - 2a^2 = -\sqrt{\left(x^2 + y^2 + c^2 + 2cx\right)\left(x^2 + y^2 + c^2 - 2cx\right)}.$$
Squaring again, we obtain that
$$\left(x^2 + y^2 + c^2\right)^2 - 4a^2\left(x^2 + y^2 + c^2\right) + 4a^4 = \left(x^2 + y^2 + c^2\right)^2 - 4c^2x^2,$$
i.e.,
$$\left(a^2 - c^2\right)x^2 + a^2y^2 = a^2\left(a^2 - c^2\right).$$
If we put $b^2 = a^2 - c^2$ and divide by $a^2b^2$, this becomes
$$(2)\qquad \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$
We have shown that each point $(x, y)$ satisfying the root-equation (1) also satisfies (2). By tracing back in the calculations, one can also verify that all solutions to (2) satisfy (1). (Since we have squared several times, this is not immediate!) Thus the ellipse is completely determined by the equation (2).
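The equivalence between the focal-distance condition (1) and the equation (2) can be checked numerically. The sketch below is only an illustration: the semi-axes $a = 5$, $b = 3$ are sample values, the function name `focal_distance_sum` is made up here, and the points of the ellipse are generated with the standard parametrization $(a\cos t, b\sin t)$, which is not used in the text itself.

```python
import math

def focal_distance_sum(x, y, c):
    """Sum of the distances from (x, y) to the foci (-c, 0) and (c, 0)."""
    return math.hypot(x + c, y) + math.hypot(x - c, y)

a, b = 5.0, 3.0                 # semi-axes (sample values)
c = math.sqrt(a * a - b * b)    # from b^2 = a^2 - c^2, so c = 4 here

# Twelve points on x^2/a^2 + y^2/b^2 = 1, written as (a cos t, b sin t)
angles = [k * 2 * math.pi / 12 for k in range(12)]
sums = [focal_distance_sum(a * math.cos(t), b * math.sin(t), c) for t in angles]

# Each sum should equal 2a up to rounding error.
max_error = max(abs(s - 2 * a) for s in sums)
print(max_error < 1e-9)
```

A floating-point tolerance is used because the square roots are not computed exactly.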
Notice that the ellipse (2) intersects the coordinate axes at the points $(\pm a, 0)$ and $(0, \pm b)$. The segments from the origin to these points are called the semi-axes of the ellipse. The points $F_1 = (-c, 0)$ and $F_2 = (c, 0)$ are the foci of the ellipse.

Hyperbola. If we instead consider the set of points $P$ in a plane $\pi$ such that the difference between the distances to two given points ("foci") $F_1$ and $F_2$ is constant $= 2a$, we get a curve known as a hyperbola. In a similar way to the case of the ellipse, we can introduce an ON-system in $\pi$ such that $F_1 = (-c, 0)$ and $F_2 = (c, 0)$ for a number $c > a$. Hence the equation of the hyperbola becomes
$$(3)\qquad \sqrt{(x + c)^2 + y^2} - \sqrt{(x - c)^2 + y^2} = \pm 2a.$$
Here the plus-sign shall be chosen if $P = (x, y)$ is closer to $F_2$, and the minus-sign if $P$ is closer to $F_1$. The hyperbola is not connected; it has two branches. Calculations analogous to the case for the ellipse show that the equation of the hyperbola can be written
$$(4)\qquad \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1,$$
where this time $b^2 = c^2 - a^2$.

Parabola. Let $\ell$ be a line in a plane $\pi$, and $F$ a point in $\pi$ which is not on $\ell$. The set of points $P$ in $\pi$ whose distance to $\ell$ equals the distance to $F$ is called a parabola. The point $F$ is the focus and the line $\ell$ is called the directrix of the parabola.

Choose an ON-system in $\pi$ such that the $y$-axis is parallel to $\ell$, the $x$-axis passes through $F$, and the origin has equal distance $a$ to $F$ and to $\ell$. We shall prove that the equation of the parabola in this ON-system becomes
$$(5)\qquad y^2 = 4ax.$$
To show this, note that the distance from a point $P = (x, y)$ to $\ell$ is $|x + a|$, while the distance to $F$ is $\sqrt{(x - a)^2 + y^2}$. Squaring these distances leads to (5).

Remark 11. The parabola has an interesting optical property with many practical applications. Each light-ray parallel to the positive $x$-axis will, after reflection in the parabola, pass through the same point $F$. A set of parallel light-rays are thus focussed to the point $F$.
Conversely, if we place a source of light at $F$, this will after reflection give rise to light-rays which are parallel to the $x$-axis. In the case of the ellipse, one has instead that if a light source is placed at $F_1$, then all light-rays will, after reflection in the ellipse, pass through $F_2$.

Remark 12. The ellipse, the hyperbola, and the parabola are all cases of so-called conic sections. This name stems from the fact that all such curves can be obtained as the intersection of a double cone with a suitable plane.

Exercises.

19. Determine the centers of circles which are tangent to the $x$-axis and which pass through the points $(0, 1)$ and $(0, 9)$.
20. Determine the foci of the ellipses a) $9x^2 + 25y^2 = 225$. b) $25x^2 + 169y^2 = 4225$.
21. Determine the equation of the ellipse which intersects the $y$-axis at the points $(0, \pm 2)$ and has foci at the points $(\pm 2, 0)$.
22. Let $(x_0, y_0)$ be a point on the ellipse $x^2/a^2 + y^2/b^2 = 1$.
a) Show that the line $x = x_0 + \alpha t$, $y = y_0 + \beta t$ is tangent to the ellipse if and only if $\alpha x_0/a^2 + \beta y_0/b^2 = 0$.
b) Show that the point $(x, y)$ is on the tangent of the ellipse at $(x_0, y_0)$ if and only if $xx_0/a^2 + yy_0/b^2 = 1$.
23. Find the foci of the hyperbolae a) $16x^2 - 9y^2 = 144$. b) $3x^2 - 5y^2 = 75$.
24. Find the equation of the hyperbola which intersects the $x$-axis at the points $(\pm 2, 0)$ and has foci at $(\pm 3, 0)$.
25. Find the equation of the parabola which is symmetric with respect to the $x$-axis and which passes through the points $(0, 0)$ and $(27, 18)$. Also determine the focus.

2. General Second-Degree Equations

A second-degree equation in the variables $x$ and $y$ is an equation of the form
$$(1)\qquad Ax^2 + Bxy + Cy^2 + Dx + Ey = F.$$
Now suppose that $x$ and $y$ are coordinates with respect to an ON-system $Oe_1e_2$ in the plane. We shall investigate the geometric meaning of the equation (1). In the preceding section, we saw that ellipses, hyperbolas, and parabolas are all described by second-degree equations.
We shall here show that, except for certain "pathological" cases, these three basic types of curves can be used to describe all second-degree curves.

If $A = B = C = 0$, then (1) is a first-degree equation $Dx + Ey = F$. This is (unless $D = E = 0$) the equation for a line. In the sequel, we can hence assume that at least one of the coefficients $A, B, C$ is non-zero.

The main idea for our solution of (1) involves changing coordinates to a new ON-system, where the equation has a simpler form. We start by showing that, by a suitable rotation of the basis vectors, we can get rid of the coefficient $B$ for $xy$. Thus we introduce new basis vectors $e_1', e_2'$ by
$$e_1' = \cos\theta\, e_1 + \sin\theta\, e_2, \qquad e_2' = -\sin\theta\, e_1 + \cos\theta\, e_2.$$
It is easy to see that $e_1', e_2'$ is an orthonormal basis (since $e_1, e_2$ is so). Let $(x', y')$ be the coordinates for a point $P$ relative to the system $Oe_1'e_2'$. Then
$$\overrightarrow{OP} = (x'\cos\theta - y'\sin\theta)\, e_1 + (x'\sin\theta + y'\cos\theta)\, e_2.$$
If $(x, y)$ are the coordinates of $P$ in the "old" system $Oe_1e_2$, we hence have
$$x = x'\cos\theta - y'\sin\theta, \qquad y = x'\sin\theta + y'\cos\theta.$$
Substituting these expressions into (1), we get an equation of the form
$$A'(x')^2 + B'x'y' + C'(y')^2 + D'x' + E'y' = F,$$
where the coefficient $B'$ for $x'y'$ is given by
$$(2)\qquad B' = -2A\cos\theta\sin\theta + B(\cos^2\theta - \sin^2\theta) + 2C\cos\theta\sin\theta = B\cos 2\theta - (A - C)\sin 2\theta,$$
where we have used the familiar "double angle" formulas for cos and sin. If $B = 0$ no rotation is necessary, and we can take $\theta = 0$. If $B \neq 0$ we choose $\theta$ such that
$$\cot 2\theta = \frac{A - C}{B}.$$
The relation (2) then shows that $B' = 0$. In all cases, our new coordinate system turns our equation into the type
$$(3)\qquad A'(x')^2 + C'(y')^2 + D'x' + E'y' = F,$$
for suitable constants $A', C', D', E'$. Since we have assumed that at least one of the numbers $A, B, C$ is not zero, it is easy to see that at least one of $A', C'$ is not zero.

Case 1. We first consider the case when both $A'$ and $C'$ are non-zero.
We can then complete squares in (3) to obtain
$$(4)\qquad A'\left(x' + \frac{D'}{2A'}\right)^2 + C'\left(y' + \frac{E'}{2C'}\right)^2 = \frac{(D')^2}{4A'} + \frac{(E')^2}{4C'} + F.$$
We then make a new change of coordinates by
$$x'' = x' + \frac{D'}{2A'}, \qquad y'' = y' + \frac{E'}{2C'}.$$
This means that the origin in the old system is moved to the point which has $x'y'$-coordinates $(-D'/2A', -E'/2C')$. If
$$F' = \frac{(D')^2}{4A'} + \frac{(E')^2}{4C'} + F,$$
then (4) becomes
$$(5)\qquad A'(x'')^2 + C'(y'')^2 = F'.$$
If $A'$ and $C'$ are both positive, then (5) describes an ellipse, a point, or the empty set, depending on whether $F' > 0$, $F' = 0$, or $F' < 0$. The case when $A'$ and $C'$ are both negative can be reduced to the positive case by multiplying both sides of the equation by $-1$.

If $A'$ and $C'$ have opposite signs, we can after multiplication by a suitable constant assume that $A' > 0$ and $C' = -1$. The equation (5) is then
$$A'(x'')^2 - (y'')^2 = F'.$$
If $F' \neq 0$ this means a hyperbola. If $F' = 0$ we get
$$y'' = \pm\sqrt{A'} \cdot x'',$$
which means a "degenerate hyperbola", or rather: two intersecting straight lines.

Case 2. Now suppose that one of the numbers $A', C'$ in (3) is zero. We can w.l.o.g. assume that $A' = 0$ and $C' \neq 0$. Then (3) says that
$$(6)\qquad C'(y')^2 + D'x' + E'y' = F.$$
If $D' = 0$, then the equation (6) becomes independent of $x'$. The equation then means two lines parallel to the $x'$-axis, or one line parallel to the $x'$-axis, or the empty set, depending on the number of different real solutions to the second-degree equation $C'(y')^2 + E'y' = F$.

If $D' \neq 0$, the equation (6) can be written
$$C'\left(y' + \frac{E'}{2C'}\right)^2 + D'\left(x' - \frac{F}{D'} - \frac{(E')^2}{4C'D'}\right) = 0.$$
If we now put
$$x'' = x' - \frac{F}{D'} - \frac{(E')^2}{4C'D'}, \qquad y'' = y' + \frac{E'}{2C'},$$
we see that (6) transforms into the equation
$$(7)\qquad C'(y'')^2 + D'x'' = 0.$$
This last equation means a parabola: if $C'$ and $D'$ have equal signs, it surrounds the negative $x''$-axis, otherwise it surrounds the positive $x''$-axis.

The above discussion characterizes all possible second-degree curves.
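The rotation step of this reduction lends itself to a small numerical check. The following Python sketch is an illustration under stated assumptions: the function name `rotate_quadratic` is hypothetical, the angle is obtained from $\cot 2\theta = (A - C)/B$ via `math.atan2`, and the sample equation $17x^2 - 16xy + 17y^2 = 225$ is taken from Exercise 26 a). It computes the coefficients $A'$, $B'$, $C'$ of the quadratic part after the substitution $x = x'\cos\theta - y'\sin\theta$, $y = x'\sin\theta + y'\cos\theta$.

```python
import math

def rotate_quadratic(A, B, C):
    """Rotate A x^2 + B xy + C y^2 so that the mixed term vanishes.

    Returns (theta, A1, B1, C1), where A1, B1, C1 are the coefficients of
    (x')^2, x'y', (y')^2 in the rotated system.
    """
    if B == 0:
        theta = 0.0
    else:
        # cot(2*theta) = (A - C)/B  is solved by  2*theta = atan2(B, A - C)
        theta = 0.5 * math.atan2(B, A - C)
    c, s = math.cos(theta), math.sin(theta)
    A1 = A * c * c + B * c * s + C * s * s
    B1 = B * (c * c - s * s) - 2 * (A - C) * c * s
    C1 = A * s * s - B * c * s + C * c * c
    return theta, A1, B1, C1

# Sample equation: 17x^2 - 16xy + 17y^2 = 225
theta, A1, B1, C1 = rotate_quadratic(17, -16, 17)

# With B1 = 0 and A1, C1 > 0 the curve A1 x'^2 + C1 y'^2 = 225 is an ellipse.
semi_axes = sorted(math.sqrt(225 / k) for k in (A1, C1))
print(round(B1, 9), [round(v, 9) for v in semi_axes])
```

Which of the two admissible angles is returned depends on the branch chosen by `atan2`; the other choice merely interchanges the roles of $A'$ and $C'$.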
Except for ellipse, hyperbola, and parabola, there are the following "pathological" cases: two intersecting straight lines, one or two parallel straight lines, a point, the whole plane (when $A = B = C = D = E = F = 0$), and the empty set.

Exercises.

26. Prove that each of the following equations describes an ellipse. Also determine the lengths of the semi-axes.
a) $17x^2 - 16xy + 17y^2 = 225$.
b) $3x^2 + 2xy + 3y^2 = 8$.
c) $9x^2 + y^2 - 18x + 4y + 4 = 0$.
d) $2x^2 + 3y^2 + 12x + 12 = 0$.

3. Answers to Exercises

3. The side-lengths are $\sqrt{3}$, $3$, and $\sqrt{2}$. The cosines of angles are $-\sqrt{2/3}$, $5/(3\sqrt{3})$, and $2\sqrt{2}/3$.
4. $\pi/3$ or $2\pi/3$.
5. $\sqrt{12}$.
6. $(1, 1, -1)$.
7. $3\sqrt{3}$.
8. $x + y - z = 0$.
9. 1 and 25/13 respectively. The points are on opposite sides of the plane.
10. a) $3x - 2y + z = 12$. b) $1/\sqrt{14}$.
11. $(1, -1, 1)$.
12. a) Intersection-point: $(2, 1, 5)$. b) $16/\sqrt{30}$.
13. $(2, 1, 0)$.
14. a) $1/\sqrt{14}$. b) $1/\sqrt{3}$.
15. $x + 2y + 5z = -1$.
16. a) $x + 2y + z = 3$. b) $\pi/3$.
17. $\pi/3$.
18. $\pi/6$.
19. $(\pm 3, 5)$.
20. a) $(\pm 4, 0)$. b) $(\pm 12, 0)$.
21. $x^2 + 2y^2 = 8$.
23. a) $(\pm 5, 0)$. b) $(\pm\sqrt{40}, 0)$.
24. $5x^2 - 4y^2 = 20$.
25. $y^2 = 12x$; focus $(3, 0)$.
26. a) 5 and 3. b) 2 and $\sqrt{2}$. c) 3 and 1. d) $\sqrt{3}$ and $\sqrt{2}$.

CHAPTER 5 Cross-product and Volume-product

1. Orientation of vectors

Let $\pi$ be a plane in space and $u, v$ two non-parallel vectors in $\pi$. Consider the smallest rotation which turns $u$ into a vector with the same direction as $v$. If we view the rotation from one side of the plane, the rotation appears clockwise, but from the other side it will be perceived to be counterclockwise. If $w$ is a vector not in $\pi$, and if the rotation is counterclockwise when seen from the side of $\pi$ in the direction of $w$, then the triple $u, v, w$ is said to be positively oriented. If this is not the case, we say that the triple is negatively oriented. A positively (resp. negatively) oriented system is sometimes called a right handed (resp. left handed) system.

Observe that the ordering of the vectors is essential here.
For example, if $u, v, w$ is positively oriented, then $u, w, v$ is negatively oriented. Namely: place all vectors with their tails at the same point. Then, seen from the tip of $v$, the smallest rotation which turns $u$ into a vector with the same direction as $w$ will be clockwise. The reader is asked to supply a picture of the situation.

2. Cross-product

Let $u$ and $v$ be two vectors in space and denote by $A(u, v)$ the area of the parallelogram spanned by $u$ and $v$. If $u$ and $v$ are parallel, then of course $A(u, v) = 0$; otherwise, a simple geometric consideration shows that

$$A(u, v) = \|u\|\,\|v\| \sin\theta, \tag{8}$$

where $\theta$ is the angle between $u$ and $v$ (in the interval $[0, \pi]$).

Definition 13. The cross-product (or vector-product) $u \times v$ of $u$ and $v$ is the unique vector $w$ with the following properties:
(i) $w$ is orthogonal to $u$ and to $v$,
(ii) $\|w\| = A(u, v)$,
(iii) if $u$ and $v$ are not parallel, then $u, v, w$ is a positively oriented triple.

Remark 14. Note that if the vectors $u$ and $v$ are parallel, then $A(u, v) = 0$ and $u \times v = 0$.

Example 1. Let $e_1, e_2, e_3$ be a positively oriented orthonormal basis. Then

$$e_1 \times e_2 = -e_2 \times e_1 = e_3, \qquad e_3 \times e_1 = -e_1 \times e_3 = e_2, \qquad e_2 \times e_3 = -e_3 \times e_2 = e_1.$$

These formulas are obvious geometrically.

The cross-product can (like the scalar product) be described using an orthogonal projection. Suppose that $v \neq 0$ and that $\pi$ is a plane with normal vector $v$. Denote by $u'$ the orthogonal projection of $u$ on $\pi$. Then $A(u, v) = A(u', v)$, because $\|u'\|$ is the height from $v$ of the parallelogram spanned by $u$ and $v$. Let $w = u \times v$. So $w$ is the vector orthogonal to $u$ and $v$ such that $\|w\| = \|u'\|\,\|v\|$ and $u, v, w$ is a positively oriented triple. It is now seen that

$$u \times v = \|v\| \cdot T(u'), \tag{9}$$

where $T(u')$ is the vector $u'$ rotated by the angle $\pi/2$ clockwise in the plane $\pi$, seen from the tip of $v$.

For the orthogonal projection, we know that $(u_1 + u_2)' = u_1' + u_2'$.
If the resulting vectors are rotated by $\pi/2$ clockwise, it follows that $T((u_1 + u_2)') = T(u_1') + T(u_2')$. It now follows from (9) that

$$(u_1 + u_2) \times v = u_1 \times v + u_2 \times v. \tag{10}$$

We have proved the distributive law for the cross-product. It is also clear from the definition of the cross-product that it obeys linearity in the first argument,

$$(tu) \times v = t(u \times v), \tag{11}$$

as well as a new type of rule:

$$u \times v = -v \times u. \tag{12}$$

The rule (12) is known as the anti-commutativity of the cross-product. A consequence of (12) is that the counterparts of (10) and (11) also hold for the second argument:

$$u \times (v_1 + v_2) = u \times v_1 + u \times v_2, \qquad u \times (tv) = t(u \times v).$$

If $e_1, e_2, e_3$ is a basis for three-dimensional space and

$$u = x_1 e_1 + x_2 e_2 + x_3 e_3, \qquad v = y_1 e_1 + y_2 e_2 + y_3 e_3, \tag{13}$$

we will thus have

$$u \times v = x_1 y_1\, e_1 \times e_1 + x_1 y_2\, e_1 \times e_2 + \dots + x_3 y_3\, e_3 \times e_3.$$

Since $e_j \times e_j = 0$ and $e_j \times e_k = -e_k \times e_j$, this simplifies to

$$(x_1 y_2 - x_2 y_1)\, e_1 \times e_2 + (x_1 y_3 - x_3 y_1)\, e_1 \times e_3 + (x_2 y_3 - x_3 y_2)\, e_2 \times e_3.$$

In the important case when the basis $e_1, e_2, e_3$ is orthonormal and positively oriented, we get (using Example 1, in particular $e_1 \times e_3 = -e_2$) the following result.

Theorem 15. Suppose that $e_1, e_2, e_3$ is a positively oriented orthonormal basis. Then

$$u \times v = (x_2 y_3 - x_3 y_2)\, e_1 + (x_3 y_1 - x_1 y_3)\, e_2 + (x_1 y_2 - x_2 y_1)\, e_3. \tag{14}$$

Remark 16. There is a well-known mnemonic for the above formula. It uses the concept of "determinants", which at this stage we will use solely as a memory aid. A $2 \times 2$-determinant is defined by

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.$$

A $3 \times 3$-determinant is then defined by

$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & b_3 \\ c_2 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & b_3 \\ c_1 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & b_2 \\ c_1 & c_2 \end{vmatrix}.$$

This rule is called "expansion along the first row": one starts in the upper left corner with $a_1$ and multiplies by the $2 \times 2$-determinant obtained by striking the row and column containing $a_1$. Then we proceed to $a_2$ and do the same, but with the opposite (i.e., minus) sign in front of it.
Then we move to $a_3$ (changing sign again). Using determinants we can now write

$$u \times v = \begin{vmatrix} e_1 & e_2 & e_3 \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix}.$$

Remark 17. We saw above that the cross-product is non-commutative (it is in fact anti-commutative). The cross-product does not obey the associative law either. For example, if $e_1, e_2, e_3$ is a positively oriented orthonormal basis, then by Example 1,

$$e_1 \times (e_1 \times e_2) = -e_2 \qquad \text{and} \qquad (e_1 \times e_1) \times e_2 = 0 \times e_2 = 0.$$

Thus it can happen that $u \times (v \times w) \neq (u \times v) \times w$.

Example 2. Consider three points $P_0$, $P_1$, $P_2$, which in a positively oriented orthonormal system have coordinates $(2, 3, -2)$, $(4, 1, 1)$, and $(2, 1, -1)$ respectively. We shall compute the area of the triangle $P_0 P_1 P_2$. This area equals half the area of the parallelogram spanned by the vectors $\overrightarrow{P_0 P_1}$ and $\overrightarrow{P_0 P_2}$. The area of that parallelogram is the length of the cross-product

$$\overrightarrow{P_0 P_1} \times \overrightarrow{P_0 P_2} = (2, -2, 3) \times (0, -2, 1) = (4, -2, -4).$$

Thus the triangle $P_0 P_1 P_2$ has area

$$\tfrac{1}{2}\sqrt{4^2 + 2^2 + 4^2} = 3.$$

Example 3. If $e_1, e_2$ is an orthonormal basis in a plane $\pi$, we can choose a unit normal vector $e_3$ to $\pi$ such that $e_1, e_2, e_3$ becomes a positively oriented orthonormal basis for space. Two vectors $u = x_1 e_1 + x_2 e_2$, $v = y_1 e_1 + y_2 e_2$ in $\pi$ will then have coordinates $(x_1, x_2, 0)$ resp. $(y_1, y_2, 0)$ in the basis $e_1, e_2, e_3$. The area of the parallelogram spanned by $u$ and $v$ therefore equals the length of the cross-product

$$(x_1, x_2, 0) \times (y_1, y_2, 0) = (0, 0, x_1 y_2 - x_2 y_1),$$

i.e. we have the formula

$$A(u, v) = |x_1 y_2 - x_2 y_1|.$$

Example 4. The cross-product can be used to calculate the distance between lines. Assume a positively oriented orthonormal system, and let $\ell_1$ be the line through the points $(1, 1, 1)$ and $(4, 5, 3)$ while $\ell_2$ is the line passing through the points $(-1, -10, -1)$ and $(8, 2, 2)$. Thus $\ell_1$ has direction vector $(3, 4, 2)$ and $\ell_2$ has direction vector $(9, 12, 3)$, or equivalently $(3, 4, 1)$. Since

$$(3, 4, 2) \times (3, 4, 1) = (-4, 3, 0),$$

we see that $e = \tfrac{1}{5}(-4, 3, 0)$ is a unit vector orthogonal to both $\ell_1$ and $\ell_2$.
If $P$ is a point on $\ell_1$ and $Q$ a point on $\ell_2$, we infer that the absolute value of the scalar product $(\overrightarrow{PQ}\,|\,e)$ must equal the distance between $\ell_1$ and $\ell_2$. Choosing, for example, $P = (1, 1, 1)$ and $Q = (8, 2, 2)$, we find that $\overrightarrow{PQ} = (7, 1, 1)$ and

$$(\overrightarrow{PQ}\,|\,e) = \tfrac{1}{5}(-28 + 3 + 0) = -5.$$

The distance between $\ell_1$ and $\ell_2$ is thus $5$.

Exercises.

1. Prove that if $u + v + w = 0$, then $u \times v = v \times w = w \times u$.
2. Find the area of the triangle which in a positively oriented ON-system has its vertices at the points
a) $(1, 2, 3)$, $(3, 4, 1)$, $(2, 0, 2)$.
b) $(5, 1, 1)$, $(2, 3, 2)$, $(3, 2, 3)$.
c) $(1, 0, 0)$, $(0, 1, 0)$, $(0, 0, 1)$.
3. In a positively oriented ON-system, the points $(1, 1, 1)$ and $(0, 3, 3)$ are on the line $\ell_1$, and $(2, 2, -4)$ and $(4, 4, 4)$ are on $\ell_2$. Find the distance between $\ell_1$ and $\ell_2$.
4. Prove that $(u \times v) \times w = (u|w)\, v - (v|w)\, u$. Hint: To simplify the computations, one can choose an ON-basis $e_1, e_2, e_3$ such that $u = x_1 e_1$ and $v = y_1 e_1 + y_2 e_2$.
5. Prove that $(u \times v) \times w = u \times (v \times w)$ if and only if $u$ is parallel to $w$, or $u$ and $w$ are both orthogonal to $v$. Hint: Use the preceding exercise.

3. Volume-product

The cross-product and the scalar product can be combined to form the volume-product $V(u, v, w)$ of three vectors in space:

$$V(u, v, w) = (u \times v\,|\,w). \tag{15}$$

The motivation for the name is that the absolute value of $V(u, v, w)$ equals the volume of the parallelepiped spanned by the vectors $u, v, w$, if they are placed with their tails at one and the same point. To show this, assume that $u$ and $v$ are not parallel and note that the vector

$$e = \frac{u \times v}{\|u \times v\|}$$

is a unit normal to the plane spanned by $u$ and $v$. Denote by $P$ the parallelepiped spanned by $u, v, w$. From elementary geometry we know that the volume of $P$ equals the area of the "base parallelogram" spanned by $u, v$, times the height $h$ of $P$ above the plane spanned by $u$ and $v$. Then $h$ is the length of the orthogonal projection of $w$ on the normal of the plane, i.e., $h = |(e|w)|$.
Since the base parallelogram has area $A(u, v) = \|u \times v\|$, we infer that the volume of $P$ equals

$$A(u, v)\, h = \|u \times v\| \cdot |(e|w)| = |(u \times v\,|\,w)|.$$

The asserted property of the volume-product is proved.

When two vectors $u$ and $v$ are parallel we have $u \times v = 0$, so $V(u, v, w) = 0$ in this case. More generally, the volume-product is zero when the parallelepiped is degenerate, i.e., when the vectors $u, v, w$ are linearly dependent. On the other hand, if $u, v, w$ are linearly independent, the sign of $(u \times v\,|\,w)$ depends on whether or not the two vectors $u \times v$ and $w$ lie on the same side of the plane spanned by $u$ and $v$. The volume-product $V(u, v, w)$ is positive when the triple $u, v, w$ is positively oriented and negative otherwise.

Since the volume of the parallelepiped is the same regardless of how we choose to permute the vectors $u, v, w$, only the sign can change under such a permutation:

$$V(u, v, w) = V(w, u, v) = V(v, w, u) = -V(u, w, v) = -V(v, u, w) = -V(w, v, u).$$

Combining the computational rules for the cross- and scalar products, we find that

$$V(su_1 + tu_2, v, w) = sV(u_1, v, w) + tV(u_2, v, w) \tag{16}$$

for all real numbers $s$ and $t$. Thus we have linearity in the first argument for the volume-product. We similarly have linearity in the second and in the third argument. In short: the volume-product is tri-linear, i.e., linear in each of its three arguments.

Now let $e_1, e_2, e_3$ be a basis for space, and take three vectors

$$u = (x_1, x_2, x_3), \qquad v = (y_1, y_2, y_3), \qquad w = (z_1, z_2, z_3),$$

where coordinates are given according to the chosen basis. Then by linearity in the different arguments,

$$V(u, v, w) = V(x_1 e_1 + x_2 e_2 + x_3 e_3, v, w) = x_1 V(e_1, v, w) + x_2 V(e_2, v, w) + x_3 V(e_3, v, w) = \dots = \sum x_i y_j z_k\, V(e_i, e_j, e_k).$$

Here the sum is over all possible choices of $i, j, k \in \{1, 2, 3\}$, but since $V(e_i, e_j, e_k) = 0$ if two of the indices coincide, only six terms can be non-zero.
Furthermore

$$V(e_1, e_2, e_3) = V(e_3, e_1, e_2) = V(e_2, e_3, e_1) = -V(e_1, e_3, e_2) = -V(e_2, e_1, e_3) = -V(e_3, e_2, e_1).$$

This gives that $V(u, v, w)$ is equal to

$$(x_1 y_2 z_3 + x_3 y_1 z_2 + x_2 y_3 z_1 - x_1 y_3 z_2 - x_2 y_1 z_3 - x_3 y_2 z_1)\, V(e_1, e_2, e_3).$$

The number in front of $V(e_1, e_2, e_3)$ can now be recognized as the $3 \times 3$-determinant (see Remark 16 in the previous section)

$$\begin{vmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{vmatrix}.$$

We have arrived at the formula

$$V(u, v, w) = \begin{vmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{vmatrix} V(e_1, e_2, e_3). \tag{17}$$

This formula is particularly simple when the basis $e_1, e_2, e_3$ is orthonormal and positively oriented. Then $V(e_1, e_2, e_3) = 1$, so we simply have:

$$V(u, v, w) = \begin{vmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{vmatrix}. \tag{18}$$

Example 1. Suppose that the vectors $u, v, w$ have coordinates $(2, -1, 3)$, $(0, 3, 2)$, and $(3, 5, 1)$ respectively, with respect to a positively oriented orthonormal basis. We get

$$V(u, v, w) = \begin{vmatrix} 2 & -1 & 3 \\ 0 & 3 & 2 \\ 3 & 5 & 1 \end{vmatrix} = 2(3 - 10) + 1 \cdot (0 - 6) + 3(0 - 9) = -47.$$

The volume of the parallelepiped spanned by $u, v, w$ is thus $47$.

Three vectors are linearly dependent if and only if they are coplanar, i.e., if the corresponding parallelepiped has volume zero. According to (17) this is equivalent to the determinant of the coordinates of the vectors being zero.

Example 2. To decide when the vectors $(1, a, 2)$, $(-1, 7, 1 + a)$, and $(1, -1, 1)$ are linearly dependent, we form the determinant

$$\begin{vmatrix} 1 & a & 2 \\ -1 & 7 & 1 + a \\ 1 & -1 & 1 \end{vmatrix} = a^2 + 3a - 4 = (a - 1)(a + 4).$$

The vectors are thus linearly dependent if and only if $a = 1$ or $a = -4$.

Exercises.

6. Motivate the identity $(u \times v\,|\,w) = (u\,|\,v \times w)$.
7. The volume of a tetrahedron spanned by three vectors $u, v, w$, rooted at the same point, equals $1/6$ of the volume of the parallelepiped spanned by $u$, $v$, and $w$. Find the volume of the tetrahedron with vertices at
a) $(2, 1, 0)$, $(3, 5, 2)$, $(4, 1, 2)$, $(6, 1, 5)$.
b) $(-2, 2, -3)$, $(2, 1, 3)$, $(1, 4, -2)$, $(0, 5, 1)$.
8. A tetrahedron with volume $5$ has three of its vertices at the points $(2, 1, -1)$, $(3, 0, 1)$, $(2, -1, 3)$.
The fourth vertex is on the positive $y$-axis. Determine its $y$-coordinate.
9. For which values of $a$ and $b$ are the three vectors $(a, b, b)$, $(b, a, b)$, $(b, b, a)$ linearly dependent?
10. For which values of $a$ are the four points $(0, 2, 1)$, $(-a, 1, 0)$, $(-3, 3, -a)$, $(3, -3, 1 + a)$ in the same plane?

4. Quaternions

We have seen that neither the commutative nor the associative law holds for the cross-product. One can ask whether there is some other way of defining multiplication between vectors, so that all the usual computational laws are satisfied. For vectors in a plane, this is true, since plane vectors can be identified with complex numbers. In the definition of multiplication of complex numbers, one starts with the familiar identity $i^2 = -1$; if the usual laws of calculation are to hold, the product of complex numbers must be

$$(x_1 + ix_2)(y_1 + iy_2) = (x_1 y_1 - x_2 y_2) + i(x_1 y_2 + x_2 y_1).$$

As we know, this definition does indeed satisfy all the usual computational rules.

Vectors in three-dimensional space can formally be written

$$x_1 + x_2 i + x_3 j,$$

and if we still insist that $i^2 = -1$, so that the multiplication when $x_3 = 0$ corresponds to multiplication of complex numbers, we just need to establish the rules for multiplication with $j$. In particular, the product $ij$ must be a new vector

$$ij = a + bi + cj. \tag{19}$$

If this is multiplied by $i$ from the left, using that $i^2 = -1$, we obtain

$$-j = -b + ai + c\,ij.$$

If we here substitute the right hand side of (19) for $ij$, we get after simplification that

$$0 = (ac - b) + (bc + a)i + (c^2 + 1)j.$$

This does not make sense, for we cannot have $c^2 + 1 = 0$ for a real number $c$. The above argument shows that it is impossible to extend the multiplication of complex numbers to multiplication of triples of numbers in such a way that the usual laws of calculation are preserved.

Annoyed by inconveniences of this type, the Irish mathematician Hamilton tried instead to define multiplication between 4-tuples

$$x = x_0 + x_1 i + x_2 j + x_3 k. \tag{20}$$
For reasons soon to be made clear, the coefficients are enumerated from 0 to 3, rather than from 1 to 4. In 1843, Hamilton discovered that if one abandons the commutative law $xy = yx$ and defines

$$i^2 = j^2 = k^2 = -1, \qquad ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j, \tag{21}$$

then multiplication of 4-tuples will satisfy all other rules of calculation. Hamilton coined the term quaternions for the set of 4-tuples with this multiplication. He also showed how to define division by a non-zero quaternion.

In 1878, the German mathematician Frobenius proved that if we want to define multiplication of $n$-tuples such that division by non-zero elements is always possible, and if we are prepared to abandon only the commutative law for multiplication, then $n$ must be either 1, 2, or 4. Except for the real and complex fields, Hamilton's quaternions are the only possibility.

Hamilton called the number $x_0$ the scalar part of the quaternion (20), and $x_1 i + x_2 j + x_3 k$ the vector part. If one multiplies two quaternions with scalar parts zero and uses the identities in (21), one finds after a little calculation that

$$(x_1 i + x_2 j + x_3 k)(y_1 i + y_2 j + y_3 k) = -(x_1 y_1 + x_2 y_2 + x_3 y_3) + (x_2 y_3 - x_3 y_2)i + (x_3 y_1 - x_1 y_3)j + (x_1 y_2 - x_2 y_1)k.$$

The scalar part of the right hand side equals the negative of the scalar product of the vectors in the left hand side, and the vector part of the right hand side equals the cross-product of the vectors in the left hand side. The scalar product is older, even though the name was invented by Hamilton, but the cross-product was discovered in this way, as a by-product of multiplication of quaternions. Hamilton also interpreted the quaternions geometrically and defined the scalar and cross-products in a basis-independent way, as we have done in this chapter.

Exercises.

11.
Prove that the formula (21) implies that

$$(x_0 + x_1 i + x_2 j + x_3 k)(x_0 - x_1 i - x_2 j - x_3 k) = x_0^2 + x_1^2 + x_2^2 + x_3^2,$$

and that one therefore can define division by non-zero quaternions.

5. Answers to Exercises

1. Determine $t$ so that the following vectors become orthogonal:
a) $u = (t, 4)$ and $v = (-2, 3)$
b) $u = (t, 2)$ and $v = (t, -8)$
c) $u = (t - 2, 3)$ and $v = (-4, 2t)$.
2. Find the angle between the vectors $u = (-1, 3)$ and $v = (-2, 1)$.
3. Determine $\cos\theta$ where $\theta$ is the angle between
a) $u = (3, 2)$ and $v = (3, -2)$
b) $u = (1, 1)$ and $v = (-2, 1)$.
4. Determine $t$ so that the angle between $u = (-1, t)$ and $v = (1, 1)$ becomes $\pi/3$.
5. Find the length of the vector $\overrightarrow{PQ}$ when
a) $P = (1, 4)$ and $Q = (5, 7)$
b) $P = (-1, 0)$ and $Q = (4, 12)$
c) $P = (2, -3)$ and $Q = (1, 1)$
d) $P = (-\sqrt{2}, \sqrt{3})$ and $Q = (\sqrt{2}, 3\sqrt{3})$.
7. a) $4/3$. b) $10$.
8. $y = 8$.
9. $a = b$ or $a = -2b$.
10. $a = 1$ or $a = -9/4$.

CHAPTER 6
Matrices

1. Basic properties

Definitions. By a $p \times n$-matrix we mean an array of numbers, arranged in the form

$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \dots & a_{pn} \end{pmatrix}$$

with $p$ rows and $n$ columns. The numbers $a_{jk}$ are called matrix elements. Notice that $a_{jk}$ is in the $j$th row and $k$th column. A briefer notation for the same matrix $A$ is

$$A = (a_{jk})_{j,k=1}^{p,n}.$$

In the special case when $p = n$ we say that $A$ is a square matrix of order $n$.

Operations with matrices. Let $A = (a_{jk})_{j,k=1}^{p,n}$ and $B = (b_{jk})_{j,k=1}^{p,n}$ be two $p \times n$ matrices. We define $A + B$ to be the $p \times n$ matrix with entries $a_{jk} + b_{jk}$, i.e.,

$$A + B = (a_{jk} + b_{jk})_{j,k=1}^{p,n}.$$

Example 1.

$$\begin{pmatrix} 2 & 3 & -1 \\ 4 & 2 & 1 \end{pmatrix} + \begin{pmatrix} 3 & -2 & 1 \\ 2 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 5 & 1 & 0 \\ 6 & 3 & 3 \end{pmatrix}.$$

For a scalar $t$ we define $tA$ as the matrix with entries $ta_{jk}$.

Example 2.

$$2 \cdot \begin{pmatrix} 2 & 3 & -1 \\ 4 & 2 & 1 \end{pmatrix} = \begin{pmatrix} 4 & 6 & -2 \\ 8 & 4 & 2 \end{pmatrix}.$$

The definition of the product of two matrices is less obvious. In order to find a reasonable definition, let us consider a linear relation

$$\begin{cases} a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n = y_1 \\ a_{21} x_1 + a_{22} x_2 + \dots + a_{2n} x_n = y_2 \\ \qquad \vdots \\ a_{p1} x_1 + a_{p2} x_2 + \dots + a_{pn} x_n = y_p \end{cases} \tag{1}$$

If the numbers $y_1, \dots, y_p$ are given, then (1) is a linear system for the unknowns $x_1, \dots, x_n$. On the other hand, if $x_1, \dots, x_n$ are given, then (1) gives us the values of $y_1, \dots, y_p$. That is, the quantities $y_1, \dots, y_p$ can via (1) be regarded as functions of $x_1, \dots, x_n$. In order to define matrix multiplication, we shall adopt this latter point of view: we regard (1) as a recipe for a function.

Now suppose that we have another set of variables $z_1, \dots, z_q$ which depend on $y_1, \dots, y_p$ in a similar way,

$$\begin{cases} b_{11} y_1 + b_{12} y_2 + \dots + b_{1p} y_p = z_1 \\ b_{21} y_1 + b_{22} y_2 + \dots + b_{2p} y_p = z_2 \\ \qquad \vdots \\ b_{q1} y_1 + b_{q2} y_2 + \dots + b_{qp} y_p = z_q \end{cases} \tag{2}$$

If we here replace $y_1, \dots, y_p$ by the corresponding left hand sides in (1), we get a relation of the form

$$\begin{cases} c_{11} x_1 + c_{12} x_2 + \dots + c_{1n} x_n = z_1 \\ c_{21} x_1 + c_{22} x_2 + \dots + c_{2n} x_n = z_2 \\ \qquad \vdots \\ c_{q1} x_1 + c_{q2} x_2 + \dots + c_{qn} x_n = z_q \end{cases} \tag{3}$$

where

$$c_{jk} = b_{j1} a_{1k} + b_{j2} a_{2k} + \dots + b_{jp} a_{pk}. \tag{4}$$

Now define two matrices by

$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \dots & a_{pn} \end{pmatrix} \qquad \text{and} \qquad B = \begin{pmatrix} b_{11} & b_{12} & \dots & b_{1p} \\ b_{21} & b_{22} & \dots & b_{2p} \\ \vdots & \vdots & & \vdots \\ b_{q1} & b_{q2} & \dots & b_{qp} \end{pmatrix}.$$

We call these matrices the coefficient matrices of the linear equation systems (1) resp. (2). We define the product $BA$ to be the matrix $C = (c_{jk})_{j,k=1}^{q,n}$ where the $c_{jk}$ are given by (4). Thus the element in position $(j, k)$ in $BA$ is obtained by pairwise multiplication of the elements of row $j$ in $B$ with the elements in column $k$ in $A$, followed by summation. Observe that the matrix product $BA$ is defined only if the number of columns of $B$ equals the number of rows of $A$.

Example 1.

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 1 & 0 & 5 \\ 3 & 6 & 7 \end{pmatrix} = \begin{pmatrix} 1 \cdot 1 + 2 \cdot 3 & 1 \cdot 0 + 2 \cdot 6 & 1 \cdot 5 + 2 \cdot 7 \\ 3 \cdot 1 + 4 \cdot 3 & 3 \cdot 0 + 4 \cdot 6 & 3 \cdot 5 + 4 \cdot 7 \end{pmatrix} = \begin{pmatrix} 7 & 12 & 19 \\ 15 & 24 & 43 \end{pmatrix}.$$

Example 2.

$$\begin{pmatrix} 1 & 2 & 3 \end{pmatrix} \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} = 10.$$

Example 3.

$$\begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \end{pmatrix} = \begin{pmatrix} 3 & 6 & 9 \\ 2 & 4 & 6 \\ 1 & 2 & 3 \end{pmatrix}.$$

Example 4.

$$\begin{pmatrix} 4 & -2 \\ 2 & -1 \end{pmatrix} \begin{pmatrix} 1 & -2 \\ -2 & 4 \end{pmatrix} = \begin{pmatrix} 8 & -16 \\ 4 & -8 \end{pmatrix}, \qquad \begin{pmatrix} 1 & -2 \\ -2 & 4 \end{pmatrix} \begin{pmatrix} 4 & -2 \\ 2 & -1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}.$$

The last example shows that the order between the factors is essential for matrix multiplication. In other words, matrix multiplication is non-commutative: it is possible (and very common) to have $AB \neq BA$.

In Example 1, $B$ is a $2 \times 2$ matrix and $A$ is a $2 \times 3$ matrix. The product $BA$ is therefore a $2 \times 3$ matrix. The product $AB$ is not defined in this case. In order that both $AB$ and $BA$ be defined, it is necessary and sufficient that $A$ be an $n \times p$ and $B$ a $p \times n$ matrix (with the same $n$ and $p$). Then $BA$ is a $p \times p$ matrix and $AB$ is $n \times n$. This is illustrated by Examples 2 and 3.

Definition 18. The square $n \times n$-matrix

$$E = E_n = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \dots & 1 \end{pmatrix}$$

is called the identity matrix of order $n$.

Notice that $E$ is the neutral element for matrix multiplication, i.e. we have $EA = AE = A$ for all $n \times n$ matrices $A$.

While matrix multiplication fails to be commutative, it obeys the other rules of calculation.

Theorem 19. Matrix multiplication obeys the associative law

$$C(BA) = (CB)A \tag{5}$$

and the distributive laws

$$B(A + A') = BA + BA', \qquad (B + B')A = BA + B'A. \tag{6}$$

(We here assume that the dimensions of the matrices are such that the sums and products make sense.)

Proof. The formulas can easily be verified by direct evaluation. Nonetheless, we shall give an alternative argument for the associative law (5). Consider the relation (1) as a function $F : \mathbb{R}^n \to \mathbb{R}^p$, which to each $n$-tuple $(x_1, \dots, x_n) \in \mathbb{R}^n$ associates a $p$-tuple $(y_1, \dots, y_p) \in \mathbb{R}^p$. Likewise (2) can be regarded as a function $G$ from $\mathbb{R}^p$ to $\mathbb{R}^q$, which to $(y_1, \dots, y_p)$ associates $(z_1, \dots, z_q)$. The matrix product $BA$ will then correspond to the composite function $G \circ F$, and (5) follows from the associative law for composition of functions:

$$H \circ (G \circ F) = (H \circ G) \circ F.$$

We shall in this course only be concerned with matrices whose entries are real numbers.
Nonetheless, we want to mention that matrices with complex entries can be handled in the same way, as in the following example.

Example 5. Consider the three matrices

$$I = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad J = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad K = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix},$$

where $i$ is the imaginary unit. If these are multiplied by $-i$, one obtains the famous Pauli matrices; these were used by Paul Dirac in 1928, to formulate an equation for the electron. It is easy to check that

$$I^2 = J^2 = K^2 = -E,$$

where $E$ is the identity matrix of order 2, and that

$$IJ = -JI = K, \qquad JK = -KJ = I, \qquad KI = -IK = J.$$

If this is compared with the formulas for multiplication of quaternions in the preceding chapter, one realizes that Hamilton's quaternions $x_0 + x_1 i + x_2 j + x_3 k$ can be identified with the set of complex $2 \times 2$ matrices of the form

$$x_0 E + x_1 I + x_2 J + x_3 K = \begin{pmatrix} x_0 + ix_1 & x_2 + ix_3 \\ -x_2 + ix_3 & x_0 - ix_1 \end{pmatrix}.$$

The computational rules for quaternions can then be seen as special cases of the rules for matrix multiplication.

Before we close this section, we define a new matrix operation called transposition. If $A$ is a $p \times n$ matrix, then the transpose $A^t$ of $A$ is defined as the $n \times p$ matrix whose rows are the columns of $A$:

$$\text{If } A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \dots & a_{pn} \end{pmatrix} \text{ then } A^t = \begin{pmatrix} a_{11} & a_{21} & \dots & a_{p1} \\ a_{12} & a_{22} & \dots & a_{p2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \dots & a_{pn} \end{pmatrix}.$$

Transposition satisfies the following computational rules (proofs are left as exercises for the reader):

$$(A + B)^t = A^t + B^t, \qquad (AB)^t = B^t A^t.$$

Notice that the last rule says that transposition reverses the order of a matrix product.

Exercises.

1. Let

$$A = \begin{pmatrix} 1 & 0 & 2 \\ 0 & 3 & 1 \\ 2 & 2 & -1 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 1 & 1 \\ 2 & -2 & 0 \\ 1 & 2 & 3 \end{pmatrix}, \qquad C = \begin{pmatrix} 2 & 1 \\ -1 & 1 \\ 1 & 2 \end{pmatrix}.$$

Compute a) $AB$ b) $BA$ c) $A^t B^t$ d) $(A + 3B)C$ e) $CC^t$ f) $C^t C$.
2. Let

$$A = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix} \qquad \text{and} \qquad B = \begin{pmatrix} 1 & -2 \\ 3 & 4 \end{pmatrix}.$$

Determine: a) $A^2 - B^2$, b) $(A + B)(A - B)$. Why are the answers different in a) and b)?
3. Let

$$A = \begin{pmatrix} 1 & -3 \\ -3 & 9 \end{pmatrix}.$$

Find all $2 \times 2$ matrices $B$ such that $AB = BA = 0$. Here $0$ denotes the $2 \times 2$ zero-matrix, i.e.
the matrix all of whose entries equal zero.
4. Denote by $A^k$ the product of a matrix $A$ by itself $k$ times.
a) Prove that if $AB = BA$, then we have the binomial expansion

$$(A + B)^k = A^k + kA^{k-1}B + \binom{k}{2} A^{k-2}B^2 + \dots + B^k.$$

b) Compute $(E + A)^{10}$ where $E$ is the identity matrix and

$$A = \begin{pmatrix} 0 & 5 & 3 \\ 0 & 0 & 3 \\ 0 & 0 & 0 \end{pmatrix}.$$

5. Show that, in order to verify all statements in Example 5, it suffices to prove that $I^2 = J^2 = K^2 = IJK = -E$.

2. Matrix inverse

Let

$$A = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{p1} & a_{p2} & \dots & a_{pn} \end{pmatrix}, \qquad x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \qquad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_p \end{pmatrix}.$$

The linear equation system (1) can then be written in the matrix form

$$Ax = y. \tag{7}$$

We shall here discuss (7) in the important case when $n = p$, i.e., when $A$ is a square matrix of order $n$.

If $n = 1$, the system (7) reduces to a single equation $ax = y$. If $a \neq 0$ this equation can be solved by multiplication with $a^{-1} = 1/a$: $x = a^{-1}y$. There is a counterpart to this procedure also when $n > 1$.

Definition 20. Let $A$ be an $n \times n$-matrix. We say that $A$ is invertible if there is an $n \times n$-matrix $B$ such that

$$AB = E \qquad \text{and} \qquad BA = E.$$

In this case, $B$ is called an inverse to $A$.

Remark 21. If $A$ is invertible, then the inverse is unique. We can thus speak of the inverse and write $B = A^{-1}$. To see this, assume that there are two matrices $B$ and $C$ which are inverse to $A$. Then $BA = E$ and $AC = E$, so

$$B = BE = B(AC) = (BA)C = EC = C.$$

The uniqueness is proved.

Example 1. If

$$A = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}, \qquad B = \begin{pmatrix} -3 & 2 \\ 2 & -1 \end{pmatrix},$$

then by direct calculation, $AB = BA = E$. Thus $A$ is invertible and $B = A^{-1}$. For the same reason, $B$ is invertible and $A = B^{-1}$.

Now suppose that $A$ is an invertible matrix. The linear system $Ax = y$ can then be multiplied by $A^{-1}$ from the left, giving

$$x = Ex = (A^{-1}A)x = A^{-1}(Ax) = A^{-1}y.$$

If the system has a solution $x$ we must thus have $x = A^{-1}y$. That this really is a solution follows from

$$A(A^{-1}y) = (AA^{-1})y = Ey = y.$$

We have proved one direction of the following theorem.

Theorem 22.
A square matrix $A$ is invertible if and only if the linear equation system $Ax = y$ has a unique solution $x$ for all right hand sides $y$. If this is the case, the solution is given by $x = A^{-1}y$.

Remark 23. In the proof, we shall use the following property of matrix multiplication: If $C$ is a square matrix with columns $C_1, C_2, \dots, C_n$, then the matrix $AC$ has columns $AC_1, AC_2, \dots, AC_n$. The (simple) verification of this fact is left as an exercise for the interested reader.

Proof of Theorem 22. It remains to prove that if the system $Ax = y$ has a unique solution for all possible right hand sides $y$, then $A$ is invertible. Let $C$ and $D$ be two square matrices with columns $C_1, C_2, \dots, C_n$ resp. $D_1, D_2, \dots, D_n$. By Remark 23, the matrix identity

$$AC = D \tag{8}$$

is equivalent to the $n$ vector identities $AC_k = D_k$, $k = 1, \dots, n$. Hence if the system $Ax = y$ has a unique solution for all $y$, then for every $n \times n$ matrix $D$ there is precisely one $n \times n$ matrix $C$ satisfying (8). In particular there is a unique $n \times n$ matrix $B$ such that

$$AB = E. \tag{9}$$

In order to show that $A$ is invertible, we shall show that also $BA = E$. But by (9) we have

$$A(BA) = (AB)A = EA = A.$$

The matrix $C = BA$ thus satisfies the equation $AC = A$. This last equation is also satisfied by $C = E$. Since (8) has precisely one solution $C$ for every right hand side $D$, we must then have $BA = E$.

The following example shows how one can calculate inverse matrices in practice.

Example 2. To determine whether the matrix

$$A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}$$

is invertible, we try to solve the system $Ax = y$ for an arbitrary right hand side $y$:

$$\begin{cases} x_1 + x_2 + x_3 = y_1 \\ x_1 + 2x_2 + 3x_3 = y_2 \\ x_1 + 3x_2 + 2x_3 = y_3 \end{cases} \sim \begin{cases} x_1 + x_2 + x_3 = y_1 \\ x_2 + 2x_3 = -y_1 + y_2 \\ x_2 - x_3 = -y_2 + y_3 \end{cases} \sim \dots \sim \begin{cases} 3x_1 = 5y_1 - y_2 - y_3 \\ 3x_2 = -y_1 - y_2 + 2y_3 \\ 3x_3 = -y_1 + 2y_2 - y_3 \end{cases}$$

We see that the system has a unique solution $x$ for each right hand side $y$, so the matrix $A$ is invertible. The last system also shows that

$$A^{-1} = \frac{1}{3} \begin{pmatrix} 5 & -1 & -1 \\ -1 & -1 & 2 \\ -1 & 2 & -1 \end{pmatrix}.$$

Computational rules for the inverse. If both of the matrices $A$ and $B$ are invertible, then the product $AB$ is also invertible, and

$$(AB)^{-1} = B^{-1}A^{-1}.$$

The order between factors is thus reversed after inversion. This is realized by observing that if $A$ and $B$ are invertible then the matrix $D = B^{-1}A^{-1}$ satisfies

$$D(AB) = B^{-1}(A^{-1}A)B = B^{-1}EB = B^{-1}B = E,$$

and similarly $(AB)D = E$. Thus $AB$ is invertible with inverse $D$.

Finally, we leave it to the reader to check that if $A$ is invertible, then $A^t$ is invertible and

$$(A^t)^{-1} = (A^{-1})^t.$$

Exercises.

6. Determine which matrices are invertible. Also determine the inverse matrix when it exists.

$$\text{a) } \begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix} \qquad \text{b) } \begin{pmatrix} 1 & 1 & 2 \\ 2 & 1 & 1 \\ -1 & 1 & 4 \end{pmatrix} \qquad \text{c) } \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{pmatrix}.$$

7. Let

$$A = \begin{pmatrix} 1 & 0 & a \\ 0 & -1 & 1 \\ 1 & 1 & 0 \end{pmatrix}.$$

Calculate $A^{-1}$ for those values of $a$ for which $A$ is invertible.
8. Find the inverse matrices of $A$ and of $A^2$ where

$$A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \\ 1 & 1 & 1 \end{pmatrix}.$$

9. Find a matrix $X$ which solves the matrix equation $AXB = C$ where

$$A = \begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix}, \qquad C = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 1 & 2 \end{pmatrix}.$$

10. Let $A$ and $B$ be two $n \times n$-matrices such that $E - AB$ is invertible. Prove that $E - BA$ is invertible and that

$$(E - BA)^{-1} = E + B(E - AB)^{-1}A.$$

3. Answers to Exercises

1. a) $\begin{pmatrix} 2 & 5 & 7 \\ 7 & -4 & 3 \\ 3 & -4 & -1 \end{pmatrix}$ b) $\begin{pmatrix} 2 & 5 & 0 \\ 2 & -6 & 2 \\ 7 & 12 & 1 \end{pmatrix}$ c) $\begin{pmatrix} 2 & 2 & 7 \\ 5 & -6 & 12 \\ 0 & 2 & 1 \end{pmatrix}$ d) $\begin{pmatrix} 4 & 14 \\ 16 & 5 \\ 10 & 29 \end{pmatrix}$ e) $\begin{pmatrix} 5 & -1 & 4 \\ -1 & 2 & 1 \\ 4 & 1 & 5 \end{pmatrix}$ f) $\begin{pmatrix} 6 & 3 \\ 3 & 6 \end{pmatrix}$.
2. a) $\begin{pmatrix} 5 & 12 \\ -17 & -10 \end{pmatrix}$ b) $\begin{pmatrix} 4 & 9 \\ -20 & -9 \end{pmatrix}$.
3. $\begin{pmatrix} 9t & 3t \\ 3t & t \end{pmatrix}$, $t \in \mathbb{R}$.
4. b) $(E + A)^{10} = \begin{pmatrix} 1 & 50 & 705 \\ 0 & 1 & 30 \\ 0 & 0 & 1 \end{pmatrix}$.
6. a) $\begin{pmatrix} 1 & -2 & 1 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{pmatrix}$. b) Not invertible. c) $\dfrac{1}{2} \begin{pmatrix} 1 & -1 & 1 \\ -1 & 1 & 1 \\ 1 & 1 & -1 \end{pmatrix}$.
7. $A^{-1} = \dfrac{1}{1 - a} \begin{pmatrix} 1 & -a & -a \\ -1 & a & 1 \\ -1 & 1 & 1 \end{pmatrix}$ for $a \neq 1$.
8. $A^{-1} = \dfrac{1}{3} \begin{pmatrix} -2 & -1 & 7 \\ 1 & 2 & -5 \\ 1 & -1 & 1 \end{pmatrix}$ and $(A^2)^{-1} = \dfrac{1}{9} \begin{pmatrix} 10 & -7 & -2 \\ -5 & 8 & -8 \\ -2 & -4 & 13 \end{pmatrix}$.
9. $X = A^{-1}CB^{-1} = \begin{pmatrix} 0 & 3 & -6 \\ 1 & -3 & 4 \end{pmatrix}$.
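The answers to Exercises 6 b) and 8 can be cross-checked by machine. The following Python sketch is not part of the original notes. It assumes Gauss-Jordan elimination applied to the augmented array $(A \mid E)$, which amounts to solving $Ax = y$ for every right hand side at once, and it uses exact rational arithmetic. The helper names `mat_mul` and `inverse` are hypothetical.

```python
from fractions import Fraction


def mat_mul(P, Q):
    """Product PQ: entry (j, k) is row j of P times column k of Q."""
    return [[sum(P[j][i] * Q[i][k] for i in range(len(Q)))
             for k in range(len(Q[0]))] for j in range(len(P))]


def inverse(A):
    """Invert a square matrix by Gauss-Jordan elimination on (A | E).

    Returns None when A is not invertible.
    """
    n = len(A)
    # Augment A with the identity matrix E, using exact fractions.
    M = [[Fraction(A[j][k]) for k in range(n)] +
         [Fraction(int(j == k)) for k in range(n)] for j in range(n)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot becomes 1.
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        # Eliminate this column from all other rows.
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    # The right half of the array now holds the inverse.
    return [row[n:] for row in M]


A = [[1, 2, 3], [2, 3, 1], [1, 1, 1]]          # matrix of Exercise 8
A_inv = inverse(A)
A2_inv = inverse(mat_mul(A, A))
three_A_inv = [[3 * x for x in row] for row in A_inv]
nine_A2_inv = [[9 * x for x in row] for row in A2_inv]
singular = inverse([[1, 1, 2], [2, 1, 1], [-1, 1, 4]])  # Exercise 6 b)
```

Here `three_A_inv` and `nine_A2_inv` hold $3A^{-1}$ and $9(A^2)^{-1}$, so they can be compared entry by entry with the integer matrices in the answer to Exercise 8, and `singular` is `None` because the matrix in Exercise 6 b) is not invertible.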