More Lecture Notes in Algebra 1 (Fall Semester 2013)
October 24, 2013
CHAPTER 1
Linear Systems of Equations
1. Introduction
A linear equation in the variables (or unknowns) x1, ..., xn is a statement of the form

(1)    a1 x1 + a2 x2 + ... + an xn = b.

Here a1, ..., an and b are constants which are usually known. By a solution to (1) we mean a set of values of x1, x2, ..., xn which makes the statement true.

The linear equation (1) is called homogeneous if b = 0; otherwise it is non-homogeneous.

A system of linear equations is a set of linear equations. By a solution to such a system we mean a set of values of the variables such that each of the equations in the system is satisfied.
Example. (a) The system

    x + 2y = 5
    −2x + 3y = 4

has a solution (x, y) = (1, 2).

(b) The system

    x + y + z = 1
    2x − y + z = 3
    3x + 2z = 4

has, for example, the solutions (x, y, z) = (2, 0, −1) and (x, y, z) = (4, 1, −4).

(c) The system

    x + y = 2
    x + y = 3

has no solutions, for if there were a solution (x, y), we would have 2 = x + y = 3, which is impossible.
2. The Gauss–Jordan Elimination Method
We shall now introduce a method for solving linear systems of
equations. In this method, we shall successively replace a given system of equations by simpler, equivalent systems. Here, by definition,
two systems are equivalent iff they have precisely the same solutions.
Example 1. Solve the system

(*)    x + 2y = 5
       −2x + 3y = 4

Solution. The first equation in the system is equivalent to x = 5 − 2y. We can then eliminate x from the second equation by replacing x there with 5 − 2y. The system (*) is thus equivalent to

(1)    x + 2y = 5
       −2(5 − 2y) + 3y = 4

⇔

(2)    x + 2y = 5
       7y = 14

⇔

(3)    x + 2y = 5
       y = 2

The second equation in the last system says that y = 2. We can then eliminate y from the first equation by inserting this value for y, so that (3) is equivalent to

(4)    x + 2·2 = 5
       y = 2

⇔

(5)    x = 1
       y = 2

It follows that the system has the unique solution (x, y) = (1, 2).
The elimination we performed to obtain (2) from (*) can be recognized as the following operation: the first equation in (*) is multiplied by 2 and is then added to the second equation. The coefficient of x in the resulting sum of equations is then zero, i.e. the variable x is eliminated there:

    x + 2y = 5        ×2, add to second eq.
    −2x + 3y = 4

⇔

    x + 2y = 5
    0·x + (2·2 + 3)y = 2·5 + 4

⇔

    x + 2y = 5
    7y = 14

The elimination of y from the first equation has a similar interpretation: the second equation in (3) is multiplied by −2 and added to the first,

    x + 2y = 5
    y = 2        ×(−2), add to first eq.

⇔

    x + 0·y = 5 − 2·2
    y = 2

The indicated operation is the main ingredient in the Gauss–Jordan elimination method. We shall execute this operation repeatedly, in a systematic way.



Example 2. Solve the system

    x + 4y − 2z = 8
    2x + 9y + z = 7
    3x − 2y − 4z = 6

Solution. We first use the first equation to eliminate x from the other equations. From the second equation we subtract 2·(first eq.) and from the third we subtract 3·(first eq.). The result is



    x + 4y − 2z = 8
    y + 5z = −9
    −14y + 2z = −18

To simplify, we here divide the third equation by 2,

    x + 4y − 2z = 8
    y + 5z = −9
    −7y + z = −9
We now eliminate y from the third equation by adding 7·(second eq.):

    x + 4y − 2z = 8
    y + 5z = −9
    36z = −72

⇔

    x + 4y − 2z = 8
    y + 5z = −9
    z = −2

Using the third equation, we eliminate z from the first two equations:

    x + 4y = 4
    y = 1
    z = −2

Finally, y is eliminated from the first equation using the second one:

    x = 0
    y = 1
    z = −2
The system has the unique solution (x, y, z) = (0, 1, −2).
Here is the main principle. The first equation is used to eliminate the first unknown from the other equations. Then the (new) second equation is used to eliminate the second unknown from the subsequent equations, etc. The last equation will only contain the last unknown, which has thus been calculated. After that we make a backwards substitution to eliminate the variables "upwards", one at a time, until the system is completely solved.
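The procedure just described (downward elimination followed by backwards substitution) can be sketched in code. The following minimal Python version is an illustration only, not part of the notes; it assumes a square system with a unique solution and nonzero pivots, so that no row interchanges are needed:

```python
def solve_linear_system(A, b):
    """Solve A x = b by elimination and back substitution.

    Minimal sketch: assumes A is square, has nonzero pivots, and
    the system has a unique solution.
    """
    n = len(A)
    # Work on copies so the caller's data is untouched.
    A = [row[:] for row in A]
    b = b[:]
    # Forward elimination: use equation i to eliminate x_i below it.
    for i in range(n):
        for j in range(i + 1, n):
            factor = A[j][i] / A[i][i]
            for k in range(i, n):
                A[j][k] -= factor * A[i][k]
            b[j] -= factor * b[i]
    # Back substitution: solve for the unknowns "upwards", one at a time.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = b[i] - sum(A[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / A[i][i]
    return x

# The system of Example 2:
#   x + 4y - 2z = 8,  2x + 9y + z = 7,  3x - 2y - 4z = 6
print(solve_linear_system([[1, 4, -2], [2, 9, 1], [3, -2, -4]], [8, 7, 6]))
# → [0.0, 1.0, -2.0]
```

Running it on the system of Example 2 reproduces the solution (0, 1, −2) found above.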
The simple rule above cannot always be carried out. We illustrate with a couple of simple examples.
Example 3. In the system

    y − 2z = 3
    x + 2y − z = 2
    2x + 3y + z = −1

we cannot, as before, use the first equation to eliminate x from the other ones, since its coefficient of x is zero. On the other hand, the second equation can be used. To obtain a system of the same form as before, we interchange the first two equations:

    x + 2y − z = 2
    y − 2z = 3
    2x + 3y + z = −1
After this we can proceed as before. Do this! (The system has the
unique solution (x, y, z) = (2, −1, −2).)
Example 4. Solve the system

    3x + 2y = 4
    6x + 4y = 1

Solution. We eliminate x from the second equation by subtracting 2·(first eq.) and get

    3x + 2y = 4
    0 = −7

The second equation is a false statement for all values of the variables x and y. This means that the system has no solutions.

Example 5. Solve the system

    3x + 2y = 4
    6x + 4y = 8

Solution. Eliminating x from the second equation, we now get

    3x + 2y = 4
    0 = 0

The second equation is always satisfied, so the system is equivalent to the single equation 3x + 2y = 4. This means that we can let y have an arbitrary value, say y = t, and then x is uniquely determined by y. Hence the system has infinitely many solutions

    x = 4/3 − (2/3)t
    y = t,            t ∈ R.
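A quick way to convince oneself that the whole family really solves the system is to substitute it back for a few sample values of t; a small Python check (illustration only):

```python
# Substitute the family x = 4/3 - (2/3)t, y = t back into 3x + 2y = 4.
for t in [-2.0, 0.0, 0.5, 3.0]:
    x, y = 4/3 - (2/3) * t, t
    assert abs(3*x + 2*y - 4) < 1e-12, (x, y)
print("all sampled solutions satisfy 3x + 2y = 4")
```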
Exercises.
1. Solve the following linear systems of equations:

   a)  x + y = 2         b)  2x − 3y = 1       c)  4x + 3y = 2
       −x + y = 4            6x + y = 7            3x − 5y = 6

2. Solve the following systems:

   a)  x − y + z = 4         b)  2x − y + 3z = 9
       3x + 5y − z = 0           3x + 2y − 2z = 1
       2x − y − 3z = 2           4x + 5y − 4z = 2

   c)  y + z = −2            d)  x + y = 8
       x − 2y + z = −2           x + z = 6
       2x − 5y + 3z = −2         y + z = 4

3. Solve the following systems:

   a)  2x − (1/3)y = 2       b)  4x − 3y = 4           c)  6x + 9y = 5
       6x − y = 4                −3x + (9/4)y = −3         9x + 3y = 15/2
3. The Augmented Matrix of a Linear System of Equations
The numerical operations used when solving a linear system involve solely the coefficients of the unknowns and the constants in the right-hand sides. To simplify the notation, one can omit the unknowns and represent the system by a scheme called the augmented matrix of the system. For example, the system

(∗)    x + 2y + 3z = 10
       x + y − z = 4
       2x − y + z = 5

is represented by

    [ 1  2  3  10 ]
    [ 1  1 −1   4 ]
    [ 2 −1  1   5 ]

Example 6. We solve the system (*) in two ways: by writing down complete equations as before and, in parallel, by manipulating only the augmented matrix.
Eliminate x from eq.'s 2 and 3 (row 2 − row 1, row 3 − 2·row 1):

    x + 2y + 3z = 10        [ 1  2  3  10 ]
    −y − 4z = −6            [ 0 −1 −4  −6 ]
    −5y − 5z = −15          [ 0 −5 −5 −15 ]

Multiply eq. 2 by −1 and eq. 3 by −1/5 (row 2 by −1, row 3 by −1/5):

    x + 2y + 3z = 10        [ 1  2  3  10 ]
    y + 4z = 6              [ 0  1  4   6 ]
    y + z = 3               [ 0  1  1   3 ]

Eliminate y from eq. 3 (row 3 − row 2):

    x + 2y + 3z = 10        [ 1  2  3  10 ]
    y + 4z = 6              [ 0  1  4   6 ]
    −3z = −3                [ 0  0 −3  −3 ]
Divide eq. 3 by −3 (row 3 by −3):

    x + 2y + 3z = 10        [ 1  2  3  10 ]
    y + 4z = 6              [ 0  1  4   6 ]
    z = 1                   [ 0  0  1   1 ]

Eliminate z in eq.'s 1 and 2 (row 1 − 3·row 3, row 2 − 4·row 3):

    x + 2y = 7              [ 1  2  0   7 ]
    y = 2                   [ 0  1  0   2 ]
    z = 1                   [ 0  0  1   1 ]

Eliminate y in eq. 1 (row 1 − 2·row 2):

    x = 3                   [ 1  0  0   3 ]
    y = 2                   [ 0  1  0   2 ]
    z = 1                   [ 0  0  1   1 ]
The system thus has the unique solution (x, y, z) = (3, 2, 1).
The admissible operations on the augmented matrix, i.e. those giving rise to an equivalent augmented matrix, are the following:

(i) A row is multiplied by a constant ≠ 0.
(ii) A constant multiple of a row is added to another row.
(iii) Two rows are interchanged.
(iv) Two columns can be interchanged, if one observes that this means that two unknowns are interchanged. This must be taken into account when interpreting the answer.
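The first three operations are easy to carry out mechanically. The sketch below (an illustration in Python, with function names of my own choosing; exact fractions are used so that no rounding enters) redoes the first steps of Example 6:

```python
from fractions import Fraction

def scale_row(M, i, c):
    """(i) Multiply row i by a constant c != 0."""
    assert c != 0
    M[i] = [c * a for a in M[i]]

def add_multiple(M, src, dst, c):
    """(ii) Add c times row src to row dst."""
    M[dst] = [a + c * b for a, b in zip(M[dst], M[src])]

def swap_rows(M, i, j):
    """(iii) Interchange rows i and j."""
    M[i], M[j] = M[j], M[i]

# The augmented matrix of the system (*) from Example 6.
M = [[Fraction(v) for v in row]
     for row in ([1, 2, 3, 10], [1, 1, -1, 4], [2, -1, 1, 5])]
add_multiple(M, 0, 1, -1)            # row 2 - row 1
add_multiple(M, 0, 2, -2)            # row 3 - 2*row 1
scale_row(M, 1, -1)                  # multiply row 2 by -1
scale_row(M, 2, Fraction(-1, 5))     # multiply row 3 by -1/5
print([int(a) for a in M[1]], [int(a) for a in M[2]])
# → [0, 1, 4, 6] [0, 1, 1, 3]
```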
The operation (iv) is not really needed – one can always circumvent it by other means. Here is an example of this.
Example 7. Solve the system

    x + 2y − 3z + w = −2
    3x + 6y + 3z − w = −2
    2x + 4y + 3z + w = 9
    2x + 4y − 3z − w = −13
Solution. The augmented matrix is

1
3


2

2

1
0

∼ 
0

0

1
0

∼ 
0

0

2 −3 1 −2 
6 3 −1 −2 

4 3
1
9 

4 −3 −1 −13

2 −3 1 −2
0 12 −4 4 

0 9 −1 13 

0 3 −3 −9

2 −3 1 −2
0 3 −1 1 

0 9 −1 13 

0 1 −1 −3
row 2 − 3·row 1
row 3 − 2·row 1
row 4 − 2·row 1
divide row 2 by 4
divide row 4 by 3
Here we realize that we can not eliminate the variable in the second
column. We therefore skip that column and continue with the third
one. Interchange rows 2 and 4:


1 2 −3 1 −2
0 0 1 −1 −3




row 3 − 9·row 2
0 0 9 −1 13 


0 0 3 −1 1
row 4 − 3·row 2


1 2 −3 1 −2
0 0 1 −1 −3


∼ 

8 40 
divide row 3 by 8
0 0 0


0 0 0
2 10
divide row 4 by 2


row 1- row 3
1 2 −3 1 −2
0 0 1 −1 −3
row 2 + row 3


∼ 

1
5 
0 0 0


0 0 0
1
5
row 4 is unnecessary - strike it!


row 1 + 3·row 2
1 2 −3 0 −7
0 0 1 0 2 
∼ 



0 0 0 1 5





x + 2y = −1

1 2 0 0 −1


0 0 1 0 2 

∼ 
.
z=2
 i.e. 





0 0 0 1 5

w=5
Here y can have an arbitrary value (say t), and then the solution is
fixed. The solutions are thus
(x, y, z, w) = (−1 − 2t, t, 2, 5),
t ∈ R.
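As a check, the one-parameter family can be substituted back into all four equations of the system; the following small Python loop (illustration only) verifies a few sample values of t:

```python
# Check (x, y, z, w) = (-1 - 2t, t, 2, 5) in the four equations of Example 7.
for t in [-3, 0, 1, 7]:
    x, y, z, w = -1 - 2*t, t, 2, 5
    assert x + 2*y - 3*z + w == -2
    assert 3*x + 6*y + 3*z - w == -2
    assert 2*x + 4*y + 3*z + w == 9
    assert 2*x + 4*y - 3*z - w == -13
print("the family solves the system for every sampled t")
```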
We finish this section by discussing the application of the Gauss–Jordan method in a few slightly more complicated cases.



Example 8. Solve the system

    x − 2y = 1
    2x − 3y = 4
    4x − 7y = 5

Solution. The augmented matrix is

    [ 1 −2  1 ]
    [ 2 −3  4 ]    row 2 − 2·row 1
    [ 4 −7  5 ]    row 3 − 4·row 1, then row 3 − row 2

∼   [ 1 −2   1 ]
    [ 0  1   2 ]
    [ 0  0  −1 ]
The last line represents the equation 0 = −1, which is not satisfied for
any values of x and y. The given system thus lacks solutions.
Example 9. Solve the system

    x − 2y + z = 3
    −2x + 4y − 2z = −6

Solution.

    [  1 −2  1   3 ]
    [ −2  4 −2  −6 ]    row 2 + 2·row 1

∼   [ 1 −2  1  3 ]          x − 2y + z = 3
    [ 0  0  0  0 ]    i.e.  0 = 0

The last equation is always satisfied; it imposes no condition on the unknowns x, y, and z. The given system of equations is therefore equivalent to the single equation x − 2y + z = 3. Here we can prescribe values for y and z arbitrarily; the variable x is then uniquely determined by these values. The general solution can be written

    (x, y, z) = (3 + 2s − t, s, t),    s, t ∈ R.
Example 10. Solve the following system for all values of a:

    x + y − az = 3
    x − ay − z = 2
    x − 3y − z = 2 − a
Solution.

(1)    [ 1   1   −a      3   ]
       [ 1  −a   −1      2   ]    row 2 − row 1
       [ 1  −3   −1    2 − a ]    row 3 − row 1

(2)  ∼ [ 1    1      −a        3    ]    divide row 3 by −4,
       [ 0  −a−1    a−1       −1    ]    then interchange
       [ 0   −4     a−1     −1−a    ]    row 2 with row 3

(3)  ∼ [ 1    1        −a          3      ]
       [ 0    1     (1−a)/4    (1+a)/4    ]
       [ 0  −a−1      a−1         −1      ]    row 3 + (a+1)·row 2

(4)  ∼ [ 1  1         −a                3          ]
       [ 0  1      (1−a)/4          (1+a)/4        ]
       [ 0  0  (1/4)(a−1)(3−a)  (1/4)(a−1)(a+3)    ]

The last row is equivalent to the equation

(∗)    (a − 1)(3 − a)z = (a − 1)(a + 3).
Here we need to divide into cases.
Case 1: a = 1. Then the equation (*) reduces to 0 = 0, which is always satisfied. The given system is then equivalent to the first two rows in (4), i.e.,

    [ 1  1  −1   3  ]    row 1 − row 2
    [ 0  1   0  1/2 ]

∼   [ 1  0  −1  5/2 ]         x − z = 5/2
    [ 0  1   0  1/2 ]    i.e. y = 1/2

One can e.g. prescribe z arbitrarily and then get x and y from the value of z,

    (x, y, z) = (5/2 + t, 1/2, t),    t ∈ R.
Case 2: a = 3. In this case, (*) says that 0 · z = 2 · 6 ⇔ 0 = 12. This
is false, so the system lacks solutions for a = 3.
Case 3: a ≠ 1 and a ≠ 3. In this case, the third row in (4) can be multiplied by 4/((a − 1)(3 − a)), which leads to

    [ 1  1     −a          3       ]    row 1 + a·row 3
    [ 0  1  (1−a)/4    (1+a)/4     ]    row 2 + (1/4)(a−1)·row 3
    [ 0  0     1     (3+a)/(3−a)   ]

∼   [ 1  1  0  (a²+9)/(3−a)  ]    row 1 − row 2
    [ 0  1  0    a/(3−a)     ]
    [ 0  0  1  (3+a)/(3−a)   ]

∼   [ 1  0  0  (a²−a+9)/(3−a) ]
    [ 0  1  0     a/(3−a)     ]
    [ 0  0  1   (3+a)/(3−a)   ]

In conclusion, we have arrived at the following result concerning the solutions to the system:
(a) For a = 3 the system has no solutions.
(b) For a = 1 it has the solutions (x, y, z) = (5/2 + t, 1/2, t), t ∈ R.
(c) For a ∉ {1, 3} it has the unique solution

    (x, y, z) = ( (a² − a + 9)/(3 − a), a/(3 − a), (3 + a)/(3 − a) ).
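The closed-form solution in Case 3 can be spot-checked by substituting it back into the system for a few values of a outside {1, 3}; the snippet below (an illustration, using exact rational arithmetic) does this:

```python
from fractions import Fraction as F

# Check the unique solution of Example 10 for sample values a not in {1, 3}.
for a in [F(0), F(2), F(-5), F(7, 2)]:
    x = (a*a - a + 9) / (3 - a)
    y = a / (3 - a)
    z = (3 + a) / (3 - a)
    assert x + y - a*z == 3
    assert x - a*y - z == 2
    assert x - 3*y - z == 2 - a
print("closed-form solution verified for the sampled values of a")
```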
Exercises.
4. Solve the following systems:

   a)  x − 9y − 3z = 4        b)  x − y + z = 0
       3x − 2y + z = 2            3x + 5y − z = 0
       2x + 7y + 4z = −2          6x + 2y + 2z = 5

   c)  2x + 3y = 2            d)  x + 2y = −3
       x + 3y − z = 5             2x − 3y = 8
       3x + y + z = 3             10x − y = 2

   e)  x − 2y = 6             f)  2x + y − 3z = 4
       3x + y = 2                 4x + 3y − z = 2
       5x − 3y = 14

   g)  x + y − 4z = 7         h)  2x − 3y + 4z = 1
       x + y + 2z = 1             3x + y − z = 7
                                  x − y + 5z = 1
                                  4x − 6y + 8z = 3
   i)  2x − 3y + 4z = 1       j)  2x − y + z = 3
       3x + y − z = 7             4x − 2y + 5z = 0
       x − y + 5z = 1             2x − y − 2z = 9
       4x − 6y + 8z = 2

   k)  x + 3y + z − w = 2
       3x + 5y − z − w = 2
       5x − y − 5z + 3w = 0
       2x + 3y − 3z − 2w = 2



5. Solve the system

       2x − 3y + z = a
       x − 3y + 2z = b
       3x + y − 4z = c

   when a) a = 2, b = −3, c = 0;  b) a = b = c = 0.
   Hint: Several linear systems with the same coefficient matrix but different right-hand sides can be solved simultaneously with the same augmented matrix: one just writes the different right-hand sides next to each other.
6. Determine a and b so that the lines y − 3x = 2 and 2y + ax = b
   a) intersect at a point, b) are parallel and different, c) coincide.
7. Determine a and b so that

       (2x + 13)/((x − 1)(x + 2)) = a/(x − 1) + b/(x + 2).

   (This is an example of a decomposition in partial fractions of a rational function. Such decompositions are frequently used in the calculation of integrals, for example.)
8. Determine for all values of a the number of solutions to the system

   a)  x + 2y = 2        b)  x + 2y = 1        c)  x + 2y = 1
       2x + ay = a           2x + ay = a           2x + a²y = a

9. Determine all solutions of the following systems for all values of the constant a:

   a)  x + 3y = 4        b)  x + 3y = 3        c)  x − 3ay = 2
       2x + ay = a           2x + ay = a           ax − 12y = a + 2

   d)  x + y = 3         e)  x − 2ay = 3           f)  x + y + 3z = 1
       2x − ay = 2           ax + 3y = 3a − 1          2x + y + z = 2
                                                       3x + 3y + az = 0

   g)  x + y + z = 1                h)  x + y + az = 1
       2x + ay − z = 1                  x + ay + z = a
       ax + 2y + (a + 3)z = 2           ax + y + z = a²
10. Examine, for different values of a and b, the number of solutions to the following linear systems of equations:

    a)  ax + 2y = b       b)  ax + by = 2
        3x + 2y = 5           x + y = 1

11. Determine a, b, and c so that

        (3x² + 6x − 16)/(x³ − 4x) = a/x + b/(x + 2) + c/(x − 2).

12. Determine the constants a, b, c so that the function f(x) = ax² + bx + c satisfies f(1) = −3, f(2) = 1 and f(−1) = 7.
13. Determine the equation of a third degree polynomial whose graph passes through the points (0, 1), (1, 1), (2, 1), and (−1, 7).
4. Answers to Exercises
1. a) x = −1, y = 3  b) x = 1.1, y = 0.4  c) x = 28/29, y = −18/29.
2. (x, y, z) =  a) (2, −1, 1)  b) (1, 2, 3)  c) (−6, −2, 0)  d) (5, 3, 1).
3. a) Has no solution.  b) x = 1 + (3/4)y, y ∈ R.  c) (x, y) = (5/6, 0).
4. a) (x, y, z) = (1 + (3/2)t, t, −1 − (5/2)t), t ∈ R.  b) No solution.
   c) (x, y, z) = (4, −2, −7).  d) No solution exists.
   e) (x, y) = (10/7, −16/7)  f) (x, y, z) = (4t + 5, −5t − 6, t), t ∈ R
   g) (x, y, z) = (3 − t, t, −1), t ∈ R  h) Has no solution.
   i) (x, y, z) = (2, 1, 0)  j) (x, y, z) = (t, 2t − 5, −2), t ∈ R
   k) (x, y, z, w) = (2, −1, 1, −2).
5. a) There is no solution.  b) x = y = z = t, t ∈ R.
6. a) a ≠ −6  b) a = −6, b ≠ 4  c) a = −6, b = 4.
   d) The systems have, respectively, exactly one solution, no solutions, and infinitely many solutions.
7. a = 5, b = −3.
8. a) If a = 4 there are infinitely many solutions, otherwise a unique one.
   b) If a = 4 there is no solution, otherwise a unique one.
   c) If a = 2 there are infinitely many solutions, if a = −2 none, otherwise a unique solution.
9. a) For a = 6 no solution. For a ≠ 6: (x, y) = (a/(a − 6), (a − 8)/(a − 6)).
   b) For a = 6: (x, y) = (3 − 3t, t), t ∈ R. For a ≠ 6: (x, y) = (0, 1).
   c) For a = −2, no solution. For a = 2: (x, y) = (2 + 6t, t), t ∈ R.
      For a ≠ ±2: (x, y) = ((a + 4)/(a + 2), −1/(3(a + 2))).
   d) For a = −2: insoluble. For a ≠ −2: (x, y) = ((3a + 2)/(a + 2), 4/(a + 2)).
   e) (x, y) = ((6a² − 2a + 9)/(2a² + 3), −1/(2a² + 3)).
   f) a = 9: insoluble. a ≠ 9: (x, y, z) = ((a − 15)/(a − 9), 15/(a − 9), −3/(a − 9)).
   g) a ≠ 1: insoluble. a = 1: (x, y, z) = (2t, 1 − 3t, t), t ∈ R.
   h) a = 1: (x, y, z) = (1 − t − u, t, u), t, u ∈ R.  a = −2: insoluble.
      a ∉ {1, −2}: (x, y, z) = ((a² + 2a + 1)/(a + 2), 1/(a + 2), −(a + 1)/(a + 2)).
10. a) If a ≠ 3 a unique solution; if a = 3 and b = 5 infinitely many solutions; if a = 3 and b ≠ 5 no solutions.
    b) If a = b = 2 infinitely many solutions; if a = b ≠ 2 none; if a ≠ b a unique solution.
11. a = 4, b = −2, c = 1.
12. a = 3, b = −5, c = −1.
13. The polynomial −x³ + 3x² − 2x + 1.
CHAPTER 2
Vectors
1. Basic Definitions
If P and Q are two points in space, we denote by →PQ the directed line segment which starts at P and ends at Q. A directed segment is determined by its starting point P, its direction, and its magnitude (or length).

Two directed segments →PQ and →RS are called equivalent (notation: →PQ ∼ →RS) if they have the same direction and magnitude. It is easy to verify that this defines an equivalence relation on the set of directed line segments in space.

The vector u which contains →PQ is the set of all directed line segments in space that are equivalent to →PQ (i.e. it is the equivalence class containing →PQ). In symbols:

(5)    u = { →RS : →RS ∼ →PQ }.

In this situation we say that the line segment →PQ represents the vector u.

The direction and the magnitude of the vector u are defined as the direction and magnitude of some representative →PQ. We will denote the length (or norm) of u by the symbol ‖u‖.

In practice, one often briefly writes u = →PQ instead of using the bulky (but more precise) notation in (5). We will often, without further mention, use this convention in the following.

The zero vector is the vector 0 = →PP; clearly ‖0‖ = 0. A vector u with ‖u‖ = 1 is called a unit vector.

The sum u + v of two vectors is defined as follows. Take a directed segment →PQ representing u, then a representative →QR (with the same Q) of v. We define

    u + v = →PR.

(This rule is sometimes called the parallelogram law of addition.)
The addition of vectors satisfies the following (easily verified) rules:
(i) u + v = v + u ("commutativity")
(ii) u + (v + w) = (u + v) + w ("associativity")
(iii) If u + v = u + w then v = w ("the cancellation law")
(iv) u + 0 = u ("neutral element").

If u = →PQ is a vector, we denote by −u the vector with the same magnitude but opposite direction, i.e., −u = →QP. It is clear that u + (−u) = 0 for all u. We define the difference between u and v by

    u − v = u + (−v).

Let s be a scalar (scalar is a synonym for "real number"). We shall define a vector su.
(1) If s > 0 we define su to be the vector with the same direction as u and magnitude s‖u‖. (So su is a "rescaled" version of u.)
(2) If s = 0 we define su = 0.
(3) If s < 0 we define su to be the vector with the direction opposite to u and length |s|‖u‖.

Observe that if u ≠ 0, then the vector (1/‖u‖)·u is the unit vector with the same direction as u.

Multiplication by scalars satisfies the following rules:
(a) s(tu) = (st)u,
(b) (s + t)u = su + tu,
(c) s(u + v) = su + sv,
(d) 0u = 0, 1u = u, s0 = 0.

Two non-zero vectors u, v are called parallel if one of them can be written as a scalar multiple of the other one. We write u ∥ v to denote that u and v are parallel.
Example 1. Let O, P, Q be three points in space and put u = →OP and v = →OQ. Let M be the mid-point of the segment PQ. We claim that

(1)    →OM = (1/2)(u + v).

To prove this, observe that

    →OP + →PQ = →OQ,

i.e.,

    →PQ = →OQ − →OP = v − u.

This gives that

    →OM = →OP + (1/2)→PQ = u + (1/2)(v − u) = (1/2)(u + v).

A simple geometric consequence of this is that the diagonals of a parallelogram divide each other into equal parts. Namely, if S is the fourth corner in the parallelogram with sides u and v, then →OS = u + v. It thus follows from (1) that M lies halfway between O and S.
Example 2. Let O, P, Q, R be four points in space and put

    u = →OP,  v = →OQ,  w = →OR.

By the center of mass of the triangle PQR we mean the point N defined by

(2)    →ON = (1/3)(u + v + w).

Prove that the three medians of the triangle PQR intersect at the point N. (A median of a triangle is a line segment connecting a vertex to the mid-point of the opposite side.)

Solution. Let M be the mid-point of the segment PQ. Then (by (1))

    →RM = →OM − →OR = (1/2)(u + v) − w = (1/2)(u + v − 2w).

Moreover,

    →RN = →ON − →OR = (1/3)(u + v + w) − w = (1/3)(u + v − 2w).

We have shown that

    2·→RM = 3·→RN,

so N lies on the median RM and divides it according to the ratio 2 : 1. By symmetry of the expression (2), the point N is also on the other medians.

Exercises.
1. Let A and B be two points and O a third point. Let a = →OA, b = →OB, and let M be the mid-point of the line segment AB. Express the following vectors in terms of a and b:
   a) →OB + →BA  b) →AB  c) →OM  d) →AM.
2. Simplify the following sums:
   a) →OB + →BD + →DC  b) →AC + →CO + →OB
   c) →AB + →OA + →BD  d) →BD − →BA + →DL.
3. Prove that a point N is the center of mass of a triangle PQR if and only if

       →NP + →NQ + →NR = 0.

4. Let u, v, w be non-zero vectors. Prove that parallelism is an equivalence relation, namely:
   a) u ∥ u  b) u ∥ v ⇒ v ∥ u  c) If u ∥ v and v ∥ w, then u ∥ w.
5. Let ABC be a triangle, A1 the mid-point of BC, B1 the mid-point of AC, and C1 the mid-point of AB. Prove that the vectors →AA1, →BB1, and →CC1 can be used to form a triangle.
6. A median in a tetrahedron is the line segment from a vertex to the center of mass of the opposite side. Let O be an arbitrary point and PQRS a tetrahedron. Prove that the medians of PQRS intersect at a point T which is given by

       →OT = (1/4)(→OP + →OQ + →OR + →OS).

   The point T is called the center of mass of the tetrahedron.
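The median computation in Example 2 can also be checked numerically with coordinate triples; the points in the Python sketch below are arbitrary sample choices (illustration only, using exact fractions so that the equality test is exact):

```python
from fractions import Fraction as F

def add(p, q): return tuple(a + b for a, b in zip(p, q))
def sub(p, q): return tuple(a - b for a, b in zip(p, q))
def scale(c, p): return tuple(c * a for a in p)

# Arbitrary sample points, as position vectors from the origin O.
u = (F(1), F(0), F(2))     # u = OP
v = (F(3), F(4), F(-1))    # v = OQ
w = (F(-2), F(5), F(7))    # w = OR

M = scale(F(1, 2), add(u, v))          # OM = (u + v)/2, mid-point of PQ
N = scale(F(1, 3), add(add(u, v), w))  # ON = (u + v + w)/3, center of mass

RM = sub(M, w)   # RM = OM - OR
RN = sub(N, w)   # RN = ON - OR
assert scale(2, RM) == scale(3, RN)    # so N lies on the median from R
print("2*RM == 3*RN holds for this sample")
```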
2. Bases and Coordinates

Let ℓ be a line in space. We say that a vector u is parallel to ℓ, or simply that u belongs to ℓ, if u can be represented by a directed line segment of ℓ. (Thus by "u ∈ ℓ", we really mean that some representative of u belongs to ℓ. We hence use the phrase "belongs to" in slightly different meanings. This should not cause confusion, as long as the reader is aware of the distinction.)

Likewise, we say that u belongs to a plane π if u can be represented by a line segment in π.

Fix a line ℓ and let e ≠ 0 be a vector in ℓ. For any other vector u in ℓ there is then a number x such that

(1)    u = xe.

The vector e is called a basis for ℓ and x is the coordinate of u with respect to the basis e.

Now let π be a plane and let e1, e2 be two non-parallel vectors in π. We claim that each vector u in π can be written

(2)    u = x1 e1 + x2 e2

where x1 and x2 are real numbers. To show this, we first choose two lines ℓ1 and ℓ2 in π such that e1 is in ℓ1 and e2 is in ℓ2. Let O be the point of intersection between ℓ1 and ℓ2. To help our geometric intuition, we will in the following fix O as our "origin", and place all vectors in π so that they emanate from the point O. (So e1 = →OP1 for some point P1, u = →OP, etc.)

We then decompose u into a sum

    u = u1 + u2

where u1 is on ℓ1 and u2 is on ℓ2. As in (1) we can then write

    u1 = x1 e1  and  u2 = x2 e2

for some real numbers x1 and x2. This proves (2).

Definition 1. Two non-parallel vectors e1, e2 in a plane π are called a basis for π. Given a vector u ∈ π, the numbers x1 and x2 in (2) are uniquely determined; they are called the coordinates of u with respect to the basis e1, e2. If the basis is fixed, and no misunderstandings can arise, we can suppress the basis vectors in (2) and simply write

    u = (x1, x2).
To describe all vectors in space in a similar way, we need three vectors e1, e2, e3 which are not co-planar (i.e. they do not lie in one and the same plane). We assert that every vector u can then be written

(3)    u = x1 e1 + x2 e2 + x3 e3

where x1, x2, x3 are real numbers, which are uniquely determined by u. To prove this, we proceed in two steps.

First, let π be a plane containing e1 and e2 and let ℓ be a line containing e3. Let O be the point of intersection between ℓ and π; in the following we place all vectors so that they emanate from O.

We decompose u into a sum

(4)    u = u′ + u″,    u′ ∈ π, u″ ∈ ℓ.

Applying (2) and (1) we can write u′ = x1 e1 + x2 e2 and u″ = x3 e3 for some (unique) real numbers x1, x2, x3. This proves (3).

Definition 2. Three vectors e1, e2, e3 which are not co-planar are called a basis for three-dimensional space. If u is a three-dimensional vector, then the numbers x1, x2, x3 in (3) are called the coordinates of u with respect to the basis e1, e2, e3. If the basis is understood, we can abbreviate the notation (3) and write

    u = (x1, x2, x3).

Example. Let OPQR be a tetrahedron. The vectors e1 = →OP, e2 = →OQ, e3 = →OR then form a basis for three-dimensional space. Let N be the center of mass of the triangle PQR. By Example 2 in the previous section, we then have

    →ON = (1/3, 1/3, 1/3)

relative to this basis.
When we have a fixed basis for 3-space, we can compute with coordinates instead of vectors. We then have the rules

(5)    (x1, x2, x3) + (y1, y2, y3) = (x1 + y1, x2 + y2, x3 + y3)
(6)    t(x1, x2, x3) = (tx1, tx2, tx3).

The rule (5) corresponds to the rule for addition of vectors, for if

    u = x1 e1 + x2 e2 + x3 e3  and  v = y1 e1 + y2 e2 + y3 e3,

then

    u + v = (x1 e1 + x2 e2 + x3 e3) + (y1 e1 + y2 e2 + y3 e3)
          = (x1 + y1)e1 + (x2 + y2)e2 + (x3 + y3)e3.

The proof of (6) is similar.
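In code, the rules (5) and (6) are just componentwise operations on coordinate triples; a minimal Python sketch (illustration only):

```python
def vec_add(x, y):
    """Rule (5): (x1, x2, x3) + (y1, y2, y3) = (x1+y1, x2+y2, x3+y3)."""
    return tuple(a + b for a, b in zip(x, y))

def vec_scale(t, x):
    """Rule (6): t(x1, x2, x3) = (t*x1, t*x2, t*x3)."""
    return tuple(t * a for a in x)

print(vec_add((1, -2, 2), (-2, 3, 1)))   # → (-1, 1, 3)
print(vec_scale(3, (1, 0, -4)))          # → (3, 0, -12)
```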
Projections. In the decomposition (4) of a vector u, the vector u′ is called the projection of u parallel to the line ℓ on the plane π, and u″ is called the projection of u parallel to π on ℓ.

If the line ℓ is normal to the plane π, then the projection u′ is called the orthogonal (or "right-angled") projection of u on π.

If we decompose both u and v in this way, we find

    u + v = (u′ + v′) + (u″ + v″),

which means that

    (u + v)′ = u′ + v′  and  (u + v)″ = u″ + v″.

Exercises.
7. Let OPQR be a tetrahedron and introduce a basis for 3-space by e1 = →OP, e2 = →OQ, e3 = →OR. Let A be the mid-point of the segment OP and B the mid-point of the segment QR. Also, let C be the mid-point of the segment AB. Determine the coordinates of the vectors →OA, →OB, and →OC relative to the basis e1, e2, e3.
3. Linear Dependence
Let u1, u2, ..., uk be a collection of vectors. We introduce some basic terminology.

A vector u of the form

    u = λ1 u1 + ... + λk uk,

where λ1, ..., λk are real numbers, is called a linear combination of the vectors u1, ..., uk.

The collection u1, u2, ..., uk is called linearly dependent if there are real numbers λ1, ..., λk, not all equal to zero, such that

    λ1 u1 + ... + λk uk = 0.

Otherwise, i.e., if

    λ1 u1 + ... + λk uk = 0   ⇒   λ1 = λ2 = ... = λk = 0,

we say that the collection is linearly independent.

Example 1. That two vectors u1, u2 are linearly dependent means precisely that there are λ1, λ2, not both zero, such that λ1 u1 + λ2 u2 = 0. If λ1 ≠ 0 this implies u1 = tu2 where t = −λ2/λ1, so u1 and u2 are parallel. The same conclusion holds if λ2 ≠ 0. Conversely, if u1 and u2 are parallel, say u1 = tu2, then the relation 1·u1 + (−t)u2 = 0 shows that u1, u2 are linearly dependent. We have shown that two vectors are linearly dependent if and only if they are parallel.
The example has the following generalization for arbitrary collections of vectors.
Theorem 3. A collection u1, ..., uk is linearly dependent if and only if (at least) one uj can be written as a linear combination of the other ui's.

Proof. (⇒) Assume that u1, ..., uk are linearly dependent. Then there are real numbers λ1, ..., λk, not all zero, such that

    λ1 u1 + λ2 u2 + ... + λk uk = 0.

We can w.l.o.g. assume that λ1 ≠ 0. But then

    u1 = (−λ2/λ1)u2 + ... + (−λk/λ1)uk,

which shows that u1 is a linear combination of u2, ..., uk.

(⇐) Suppose that some uj is a linear combination of the other ui's. We can w.l.o.g. assume that j = 1 and that

    u1 = t2 u2 + ... + tk uk

for some real numbers t2, ..., tk. Then

    1·u1 + (−t2)u2 + ... + (−tk)uk = 0.

Since the coefficient of u1 is not zero, we infer that the collection is linearly dependent.

Remark 4. If a collection u1, ..., uk is linearly dependent, then any larger collection u1, ..., uk, uk+1, ..., un is also linearly dependent. This holds since, if

    λ1 u1 + ... + λk uk = 0

where not all λj are zero, then

    λ1 u1 + ... + λk uk + 0·uk+1 + ... + 0·un = 0.
Example 2. Now suppose that three vectors u1, u2, u3 are linearly dependent. Then one of them, say u3, can be written as a linear combination of u1 and u2:

    u3 = t1 u1 + t2 u2.

Hence, if u1 and u2 are in a plane π, then u3 is also in π. Conversely, if u1, u2, u3 belong to a plane π, then they are linearly dependent. To realize this, it suffices to observe that either u1, u2 are linearly dependent, whence u1, u2, u3 is also linearly dependent by the remark above, or u1, u2 is a basis for the plane π, whence u3 ∈ π implies that u3 is a linear combination of u1 and u2. We have shown that three vectors are linearly dependent if and only if they are co-planar. Equivalently, three vectors are linearly independent if and only if they form a basis for 3-space.
Example 3. Assume that the vectors u1, u2, u3 have coordinates

    u1 = (1, −2, 2),  u2 = (−2, 3, 1),  u3 = (−1, 3, 2)

with respect to some basis for 3-space. We shall investigate whether or not the vectors u1, u2, u3 form a basis. To this end, we consider the vector equation

    λ1 u1 + λ2 u2 + λ3 u3 = 0,

which is equivalent to the linear system

    λ1 − 2λ2 − λ3 = 0
    −2λ1 + 3λ2 + 3λ3 = 0
    2λ1 + λ2 + 2λ3 = 0

Solving this with the elimination method gives the only solution λ1 = λ2 = λ3 = 0. Hence u1, u2, u3 is linearly independent and (by the result of Example 2) is a basis for 3-space.
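The computation in Example 3 amounts to checking that the homogeneous system has only the trivial solution. For three vectors in 3-space this is equivalent to a nonzero 3×3 determinant of their coordinate rows (a criterion treated later in most linear algebra courses); a small Python sketch, for illustration:

```python
def det3(u1, u2, u3):
    """Determinant of the 3x3 matrix whose rows are u1, u2, u3."""
    (a, b, c), (d, e, f), (g, h, i) = u1, u2, u3
    return a*(e*i - f*h) - b*(d*i - f*g) + c*(d*h - e*g)

def independent(u1, u2, u3):
    # Nonzero determinant <=> lambda1 = lambda2 = lambda3 = 0 is the
    # only solution of lambda1*u1 + lambda2*u2 + lambda3*u3 = 0.
    return det3(u1, u2, u3) != 0

# The vectors of Example 3:
print(independent((1, -2, 2), (-2, 3, 1), (-1, 3, 2)))  # → True
```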
A collection of four or more vectors in 3-space is always linearly dependent. To show this, it is enough to prove that four vectors are linearly dependent, for then every larger collection is linearly dependent as well. Thus let u1, u2, u3, u4 be four arbitrary vectors in 3-space. Then either u1, u2, u3 are linearly dependent, and therefore so are u1, u2, u3, u4, or u1, u2, u3 is a basis for 3-space, whence u4 is a linear combination of u1, u2, u3. In either case, u1, u2, u3, u4 is linearly dependent.
Exercises. In the following exercises, vectors are expressed by
their coordinates relative to some fixed basis e1 , e2 , e3 .
8. Prove that the vector v = (2, −7, 1) is in the plane spanned
by the vectors u1 = (2, −1, 3) and u2 = (1, 1, 2). (The unique
plane containing two linearly independent vectors is called
the plane spanned by the vectors in question.) Determine the
coordinates for v with respect to the basis u1 , u2 .
9. Are the vectors (1, −2, 1), (2, −1, −1), (−1, −4, 5) co-planar?
10. Prove that the vectors (1, 1, 2), (4, 4, 9), (2, 3, 7) form a basis for
the three-dimensional space. Determine the coordinates of
the vector (5, 4, 3) relative to this basis.
11. For which values of k are the following sets of vectors linearly
independent?
a) (k, k2 , k3 ), (2, 2, 2);
b) (1, 1, 1), (1, k, 2k), (k, 1, k);
c) (1, −1, −k), (2, k, 4), (k, 2, −4).
4. Lines and Planes
Fix a point O in three-dimensional space. An arbitrary point P can then be described by the position vector OP. For a given basis e1 , e2 , e3 there are then unique numbers x1 , x2 , x3 such that
OP = x1 e1 + x2 e2 + x3 e3 .
The numbers x1 , x2 , x3 are the coordinates of the point P with respect
to the coordinate system Oe1 e2 e3 . If the coordinate system is clear from
the context, we can simply write
P = (x1 , x2 , x3 ).
Remark 5. In a similar way we can describe points in a plane π. Let O be a point in π and let e1 , e2 be a basis for π. Then for each point P in π we can write
OP = x1 e1 + x2 e2
for unique x1 , x2 ∈ R. We say that x1 , x2 are the coordinates of P in the coordinate system Oe1 e2 for π. In short: P = (x1 , x2 ).
Throughout the rest of this section we fix a coordinate system
Oe1 e2 e3 for three-dimensional space. Each point can then be identified with its coordinates, which we denote by (x, y, z) rather than
(x1 , x2 , x3 ).
Parametric representation. Let ` be a line in space. Suppose that
` passes through a point Q and that a vector u ≠ 0 is parallel to `.
Then a point P belongs to ` if and only if
(1)  QP = tu
for some t ∈ R. The vector u is called a direction vector of `.
Denote by (a, b, c) the coordinates of u with respect to the basis e1 ,
e2 , e3 . Thus
u = ae1 + be2 + ce3 .
If Q = (x0 , y0 , z0 ) and P = (x, y, z), then the vector QP = OP − OQ has coordinates (x − x0 , y − y0 , z − z0 ). Hence (1) can be cast in the form
(2)  x − x0 = at ,  y − y0 = bt ,  z − z0 = ct ,   t ∈ R.
The relation (2) is known as the parametric representation of the line `.
Example 1. Consider the line ` which passes through the points
Q = (3, −6, −5) and R = (4, −3, −3). A direction vector of ` is
QR = OR − OQ = (4, −3, −3) − (3, −6, −5) = (1, 3, 2).
Since ` passes through Q, we conclude that
x − 3 = t ,  y + 6 = 3t ,  z + 5 = 2t ,   t ∈ R,
is a parametric representation for `.
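As a computational aside, the construction in Example 1 is easy to automate. The sketch below (function names are illustrative, not from the text) builds the direction vector R − Q and evaluates points Q + tu on the line.

```python
# Hedged sketch of Example 1: a line through two points Q and R,
# with direction vector R - Q and points Q + t*(R - Q).

def line_through(q, r):
    """Return (point, direction) for the line through points q and r."""
    direction = tuple(rc - qc for qc, rc in zip(q, r))
    return q, direction

def point_at(q, u, t):
    """The point Q + t*u on the line."""
    return tuple(qc + t * uc for qc, uc in zip(q, u))

Q, R = (3, -6, -5), (4, -3, -3)
q, u = line_through(Q, R)
print(u)                   # (1, 3, 2), as in the text
print(point_at(q, u, 1))   # t = 1 recovers R: (4, -3, -3)
```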
In a similar way, one can represent a plane π in space. Namely, let
Q be a point in π and u, v a basis for π (i.e. two non-parallel vectors
which are parallel to π). Then a point P in space belongs to π if and only if
(3)  QP = su + tv
for some s, t ∈ R. The numbers s, t are the coordinates for P in the
coordinate system Quv of π. If we denote
u = ae1 + be2 + ce3 ,  v = a′e1 + b′e2 + c′e3 ,  Q = (x0 , y0 , z0 ),
then (3) can be written
(4)  x − x0 = as + a′t ,  y − y0 = bs + b′t ,  z − z0 = cs + c′t ,   s, t ∈ R.
This formula is called a parametric representation of the plane π.
Example 2. Consider the plane π passing through the points Q = (1, 2, 0), R1 = (0, 1, 1), and R2 = (2, −1, −3). Two non-parallel vectors in π are
u = QR1 = (−1, −1, 1) ,  v = QR2 = (1, −3, −3).
Hence
x − 1 = −s + t ,  y − 2 = −s − 3t ,  z = s − 3t ,   s, t ∈ R
is a parametric representation of π.
Example 3. Let us determine the point of intersection between the line ` of Example 1 and the plane π of Example 2. To this end, we use different parameter names in the representations for ` and π and write
` :  x − 3 = t1 ,  y + 6 = 3t1 ,  z + 5 = 2t1 ;
π :  x − 1 = −s + t2 ,  y − 2 = −s − 3t2 ,  z = s − 3t2 .
At the point of intersection, the values of x, y, and z must match, i.e.,

3 + t1 = 1 − s + t2              s + t1 − t2 = −2
−6 + 3t1 = 2 − s − 3t2    ⇔      s + 3t1 + 3t2 = 8
−5 + 2t1 = s − 3t2              −s + 2t1 + 3t2 = 5

After Gaussian elimination, this gives s = 2, t1 = −1, and t2 = 3.
Inserting t1 = −1 into the equation for ` gives (x, y, z) = (2, −9, −7).
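The Gaussian elimination step of Example 3 can be reproduced with a small exact solver; this is a minimal sketch (our own helper, not the text's hand computation), using exact fractions to avoid rounding.

```python
# Hedged sketch: Gauss-Jordan elimination over exact fractions for the
# 3x3 system of Example 3, unknowns ordered (s, t1, t2).
from fractions import Fraction

def solve3(a, b):
    """Solve a 3x3 system a*x = b by Gauss-Jordan elimination with pivoting."""
    m = [[Fraction(v) for v in row] + [Fraction(c)] for row, c in zip(a, b)]
    for i in range(3):
        p = next(r for r in range(i, 3) if m[r][i] != 0)  # pivot row
        m[i], m[p] = m[p], m[i]
        for r in range(3):
            if r != i:
                f = m[r][i] / m[i][i]
                m[r] = [x - f * y for x, y in zip(m[r], m[i])]
    return [m[i][3] / m[i][i] for i in range(3)]

s, t1, t2 = solve3([[1, 1, -1], [1, 3, 3], [-1, 2, 3]], [-2, 8, 5])
print(s, t1, t2)                              # 2 -1 3
x, y, z = 3 + t1, -6 + 3 * t1, -5 + 2 * t1    # insert t1 into the equations for the line
print(int(x), int(y), int(z))                 # 2 -9 -7
```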
The equation of a plane. We claim that planes in space correspond precisely to equations of the form
(5)  Ax + By + Cz = D
where not all coefficients A, B, C are zero. Indeed, suppose that A ≠ 0 (the cases B ≠ 0 and C ≠ 0 are analogous). Then a point (x, y, z) satisfies (5) if and only if
x − D/A = −(B/A)s − (C/A)t ,  y = s ,  z = t
for some s, t ∈ R. Thus (5) describes the plane passing through
the point (D/A, 0, 0), which is parallel to the vectors (−B/A, 1, 0),
(−C/A, 0, 1).
Conversely, one can, by eliminating s and t in the parametric representation (4), prove that any plane can be described by an equation
of the form (5). We omit the details, since we shall anyway find
an easier method to prove this later on. Instead, we turn to some
examples.
Example 4. Consider again the plane
π :  x − 1 = −s + t ,  y − 2 = −s − 3t ,  z = s − 3t ,   s, t ∈ R.
A point (x, y, z) is in π if and only if there are s and t satisfying these
equations. To determine when this is the case, we can first eliminate
s from the last two equations, and then t from the last equation:

x − 1 = −s + t ,  y − 2 = −s − 3t ,  z = s − 3t
⇔  x − 1 = −s + t ,  −x + y − 1 = −4t ,  x + z − 1 = −2t
⇔  x − 1 = −s + t ,  −x + y − 1 = −4t ,  3x − y + 2z − 1 = 0.

Since we can always choose s and t so that the first two equations are
satisfied, we see that a point (x, y, z) belongs to π if and only if the
third equation is satisfied, i.e., iff
3x − y + 2z − 1 = 0.
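As a cross-check of Example 4, the normal (A, B, C) of the plane can also be obtained as the cross product of the two direction vectors. The cross product is not developed at this point in the notes, so the sketch below uses it as a black box; the helper name is our own.

```python
# Hedged sketch: the normal of the plane as the cross product of its
# direction vectors u = (-1, -1, 1) and v = (1, -3, -3).

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

u, v = (-1, -1, 1), (1, -3, -3)
A, B, C = cross(u, v)
Q = (1, 2, 0)                        # a point of the plane
D = A * Q[0] + B * Q[1] + C * Q[2]
print((A, B, C, D))   # (6, -2, 4, 2): 6x - 2y + 4z = 2, i.e. 3x - y + 2z = 1
```

Up to the common factor 2, this is the same equation 3x − y + 2z − 1 = 0 found by elimination.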
Lines in space can be described as the intersection between two
non-parallel planes. This is illustrated by the following example.
Example 5. The points (x, y, z) which belong to both of the planes
2x + y − 3z = 5
and x + 2y − z = 4
are precisely the solutions to the system

2x + y − 3z = 5        2x + y − 3z = 5
x + 2y − z = 4    ∼        3y + z = 3 .

Inserting z = 3t (to obtain integer coefficients in the parametric representation) we get
x = 2 + 5t ,  y = 1 − t ,  z = 3t .
The intersection line thus passes through the point (2, 1, 0) and has
direction vector (5, −1, 3).
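A quick sanity check of Example 5: every point of the parametric line should satisfy both plane equations identically in t.

```python
# Hedged check: each point (2 + 5t, 1 - t, 3t) lies on both planes.

def on_both_planes(t):
    x, y, z = 2 + 5 * t, 1 - t, 3 * t
    return (2 * x + y - 3 * z == 5) and (x + 2 * y - z == 4)

print(all(on_both_planes(t) for t in range(-3, 4)))  # True
```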
Remark 6. To describe a line in space we need two equations of the
form Ax + By + Cz = D. On the other hand, if we only consider points
in a plane π, and if x, y denote coordinates with respect to some
coordinate system in π, then one equation Ax + By = C is sufficient to
describe a line. If for example A ≠ 0, then (x, y) satisfies the equation if and only if
x = (C/A) − (B/A)t ,  y = t
for some t ∈ R. This is a parametric representation of a line in π
passing through the point (C/A, 0), with direction vector (−B/A, 1).
Exercises.
12. Give a parametric representation for the line passing through
the points (1, −1, 4) and (2, 3, 5).
13. Consider the lines
`1 :  x = 2 + t ,  y = 1 − t ,  z = 2t ;   `2 :  x = 1 + 2t ,  y = t ,  z = −1 + t.
Do they intersect? Are they parallel?
14. Determine whether the following lines intersect:
x = 1 + 15t ,  y = −4 − 21t ,  z = 5 + 33t   and   x = 6 − 65t ,  y = −11 + 91t ,  z = 16 − 143t.
15. Find a parametric representation of the line ` which passes through the point (3, 2, −1) and intersects the lines
x = 10 + 5t ,  y = 5 + t ,  z = 2 + 2t   and   x = 1 + t ,  y = t ,  z = −5 + t.
16. Find a parametric representation of the plane π which passes
through the points (2, 3, 0), (1, 5, 2), and (−1, 4, 3).
17. Are the four points (−1, −1, 0), (0, 4, 1), (1, 0, −1), (1, −3, −2)
co-planar?
18. Find, in the form Ax + By + Cz = D, the equation for the
plane π which passes through the point (1, −1, 2) and which
contains the line
x = 3 − t ,  y = 2 + 2t ,  z = 1 − 3t .
19. Prove that a line with direction vector (a, b, c) is parallel to the
plane Ax + By + Cz = D if and only if
Aa + Bb + Cc = 0.
20. Find a parametric representation for the line which passes
through the point (1, 2, 4) and which is parallel to the planes
2x + y − z = 3
and
3x − 3y + z = 0.
21. Determine the equation for the plane which contains the line
3x + 4y + z = 5 ,  x − y = −6 ,
and which passes through the midpoint of the segment between the points (1, 1, 2) and (3, 1, 4).
5. Answers to Exercises
1. a) a  b) b − a  c) (1/2)(a + b)  d) (1/2)(b − a).
2. a) OC  b) AB  c) OD  d) AL.
7. OA = (1/2, 0, 0), OB = (0, 1/2, 1/2), OC = (1/4, 1/4, 1/4).
8. v = 3u1 − 4u2 .
9. Yes, (−1, −4, 5) = 3(1, −2, 1) − 2(2, −1, −1).
10. (5, 4, 3) = 23(1, 1, 2) − 4(4, 4, 9) − (2, 3, 7). The coordinates are thus (23, −4, −1).
11. a) k ∈ {0, 1, −1}  b) k ∈ {1/2, 1}  c) k ∈ {−2, 4}.
12. x = 1 + t ,  y = −1 + 4t ,  z = 4 + t.
13. The lines do not intersect and they are not parallel.
14. The lines coincide.
15. ` :  x = 5 + 2t ,  y = 4 + 2t ,  z = t.
16. π :  x = 2 − s − 3t ,  y = 3 + 2s + t ,  z = 2s + 3t.
17. Yes.
18. π : 5x − 3y − 14z = 60.
19. π : x − y − z = 0.
20. x = 1 + 2t ,  y = 2 + 5t ,  z = 4 + 9t.
21. 13x + 36y + 7z = 83.
CHAPTER 3
Distance and Angle
1. Scalar Product
We start with an important definition.
Definition 7. Let u, v be two vectors in space and let θ be the
angle between them (in the interval 0 ≤ θ ≤ π). Their scalar product
is the number (u|v) defined by
(u|v) = ‖u‖ · ‖v‖ · cos θ.
If u = 0 or v = 0, the angle θ is not defined, but in these cases we
define (u|v) = 0.
The word "scalar” is a synonym for "number”; the term "scalar
product” is used because the result (u|v) is a real number and not a
vector.
Now consider the case when v = e is a unit vector, i.e., ‖e‖ = 1. If
θ is the angle between u and e, then
(1)
(u|e) = ‖u‖ cos θ.
We shall find an important geometric interpretation of this formula:
Place u and e so that they emanate from the same point and let `
be a line through this point, with direction vector e. The orthogonal
projection of u on ` (see Chapter 2, end of Section 2) can then be seen to be
(2)  u″ = (‖u‖ cos θ)e.
(This can be verified by an elementary trigonometrical argument –
we ask the reader to supply details.)
Comparing the formulas (1) and (2), it is seen that
u″ = (u|e) · e,
i.e. the number (u|e) is the coordinate of the orthogonal projection u″ in the basis e for `.
If v ≠ 0 is not a unit vector, we write
(u|v) = ‖v‖ · (u|e),
where e = (1/‖v‖) · v is a unit vector. Hence the absolute value |(u|v)|
equals the length of v times the length of the orthogonal projection
of u on a line with direction vector v. The sign of (u|v) is positive if
the angle between u and v is acute, and negative if it is obtuse.
The scalar product satisfies the following rules.
(I) (u|v) = (v|u)
(II) (u + v|w) = (u|w) + (v|w)
(III) (tu|v) = t(u|v)
(IV) (u|u) ≥ 0 with equality ⇔ u = 0.
(V) ‖u‖ = √(u|u)
The rule (I) is called symmetry of the scalar product; (II) and (III)
together are called linearity in the first argument; (IV) is called positive
definiteness.
The properties (I),(III), (IV) and (V) are immediate; (II) can be
realized in the following way: If v = e is a unit vector, then, by the
geometrical interpretation of the scalar product, we can see that (II)
means precisely that
(u + w)00 = u00 + w00 .
That this is the case was shown at the end of Chapter 2, Section 2.
This shows (II) when v is a unit vector. For general v, we now get
(II) by applying the unit vector case to the vector e in the formula
(u|v) = kvk · (u|e).
Remark 8. Note that linearity in the second argument follows from
the symmetry and the linearity in the first argument. In particular,
we have (u|tv) = t(u|v).
As an application of the scalar product, we prove a well-known
theorem. We say that two vectors u, v are orthogonal to each other if
(u|v) = 0. (This means that the angle θ = π/2, or that one of u or v is
the zero vector.)
Theorem 9. (Pythagoras’ Theorem) If u and v are orthogonal, then
‖u + v‖² = ‖u‖² + ‖v‖².
Proof. If u, v are any vectors (orthogonal or not), then by the
computational rules above
‖u + v‖² = (u + v|u + v) = (u|u) + (u|v) + (v|u) + (v|v) = ‖u‖² + 2(u|v) + ‖v‖².
Hence if (u|v) = 0, then
‖u + v‖² = ‖u‖² + ‖v‖².
Exercises.
1. Let θ be the angle between the sides AB and BC in a triangle
ABC. Prove the law of cosines:
|AC|² = |AB|² + |BC|² − 2 |AB| |BC| cos θ.
(Observe that this reduces to Pythagoras’ Theorem when θ =
π/2.)
2. Denote by a and b the side-lengths in a parallelogram, and by
c and d the lengths of the diagonals. Prove the parallelogram
law:
c² + d² = 2(a² + b²).
2. Orthonormal Bases
Let π be a plane and e1 , e2 a basis for π. If
u = x1 e1 + x2 e2
and v = y1 e1 + y2 e2
are two vectors in π, then by the rules (I)–(III) for the scalar product,
(u|v) = x1 y1 (e1 |e1 ) + (x1 y2 + x2 y1 )(e1 |e2 ) + x2 y2 (e2 |e2 ).
This expression becomes particularly simple if the basis vectors e1 , e2
are unit vectors making a right angle with each other. Namely, then
(e1 |e1 ) = (e2 |e2 ) = 1 and (e1 |e2 ) = 0, and thus
(3)
(u|v) = x1 y1 + x2 y2 .
A basis for a plane consisting of two orthogonal unit vectors is called
an orthonormal basis (in short: ON-basis) for the plane.
The corresponding definition in three dimensions is the following.
We say that three vectors e1 , e2 , e3 are pairwise orthogonal if
(e j |ek ) = 0  when  j ≠ k.
Three vectors e1 , e2 , e3 are said to form an orthonormal basis for three-dimensional space if (i) they have unit length, and (ii) they are pairwise orthogonal. We can summarize the definition of orthonormal
basis by the equation
(e j |ek ) = { 0 if j ≠ k ,  1 if j = k }.
If e1 , e2 , e3 form an orthonormal basis and
u = x1 e1 + x2 e2 + x3 e3
and v = y1 e1 + y2 e2 + y3 e3 ,
then
(4)
(u|v) = x1 y1 + x2 y2 + x3 y3 .
When u = v this reduces to
(5)
‖u‖² = x1² + x2² + x3² .
This can be regarded as the three-dimensional version of the theorem
of Pythagoras. (u is here the sum of three pairwise orthogonal vectors
x1 e1 , x2 e2 , x3 e3 .)
Example 1. Suppose that u and v have coordinates
u = (4, 1, 1)
and v = (2, 2, −1)
with respect to some orthonormal basis. We shall determine the angle
θ between u and v. For this purpose we use (4) and (5) to compute
(u|v) = 4 · 2 + 1 · 2 + 1 · (−1) = 9 ,
‖u‖ = √(16 + 1 + 1) = 3√2 ,
‖v‖ = √(4 + 4 + 1) = 3.
Since (u|v) = ‖u‖‖v‖ cos θ, this gives
cos θ = (u|v) / (‖u‖‖v‖) = 9 / (3√2 · 3) = 1/√2 .
This means that θ = π/4.
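Example 1 translates directly into a short computation; the helper names below are our own, not from the text.

```python
# Hedged sketch of Example 1: the angle between two vectors from their
# coordinates in an orthonormal basis, via (u|v) = ||u|| ||v|| cos(theta).
import math

def scalar(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(scalar(u, u))

def angle(u, v):
    return math.acos(scalar(u, v) / (norm(u) * norm(v)))

u, v = (4, 1, 1), (2, 2, -1)
print(scalar(u, v))                            # 9
print(math.isclose(angle(u, v), math.pi / 4))  # True
```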
Example 2. The formula (3) can be used to prove many trigonometrical
identities. (As we know, the de Moivre formula for multiplication of
complex numbers provides another way to do this.)
As an example, we shall now show the following version of the
addition formula for cosines:
(6)
cos(α − β) = cos α cos β + sin α sin β.
To this end, let e1 , e2 be an orthonormal basis for the plane and put
u = (cos α)e1 + (sin α)e2 ,  v = (cos β)e1 + (sin β)e2 .
Since u and v then have length 1, while the angle between them is
α − β, the definition of scalar product shows that
(u|v) = cos(α − β).
(We have here used the fact that cos is even: cos(α − β) = cos(β − α).)
On the other hand, by (3), we have
(u|v) = cos α cos β + sin α sin β.
Comparing the two expressions for (u|v), we infer that the formula (6) holds.
Exercises.
3. A triangle in space has vertices at the points (1, 0, 2), (0, −1, 1),
and (2, 1, 2) according to some orthonormal basis. Compute
all side-lengths and cosines of angles in the triangle.
4. Let e1 , e2 , e3 be an orthonormal basis in space. Suppose that
the vector u makes the angle π/4 to the vector e1 and the angle
π/3 to the vector e2 . What are the possible angles between u
and e3 ?
3. Computing Distances and Angles
A coordinate system Oe1 e2 e3 where e1 , e2 , e3 is an orthonormal
basis is called an orthonormal system or an ON-system. In this section
we fix an ON-system and assume that all points are represented
in that system. The distance between two points P = (x, y, z) and
Q = (x0 , y0 , z0 ) is then given by
|PQ| = √( (x − x0 )² + (y − y0 )² + (z − z0 )² ).
Distance between a point and a line. Consider a line ` and a point P which is not on the line. By the distance between P and ` we mean the shortest possible distance from P to a point R ∈ `. A geometric consideration shows that the closest point R is determined by the condition that PR be orthogonal to the direction vector of `. (Draw a figure!)
Example 1. Let ` be the line through the points Q = (−2, 1, 1) and
S = (0, −1, 2), and let P = (1, 2, 1). We shall compute the distance
from P to `.
First note that ` has direction vector QS = (2, −2, 1), so it has the parametric representation
` :  x = −2 + 2t ,  y = 1 − 2t ,  z = 1 + t.
We now seek the closest point R ∈ ` to P. To this end we put R = (−2 + 2t, 1 − 2t, 1 + t) and we must determine t so that PR is orthogonal to the direction vector (2, −2, 1) of `. That is, we shall have
0 = (PR|QS) = (−3 + 2t) · 2 + (−1 − 2t) · (−2) + t · 1 = 9t − 4.
This gives t = 4/9 and PR = (1/9)(−19, −17, 4). The distance from P to ` is thus
|PR| = (1/9)√(19² + 17² + 4²) = √74 / 3.
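The procedure of Example 1 can be packaged as a function: the minimizing parameter is t = (QP|u)/(u|u), which is exactly the condition (PR|u) = 0 solved for t. A hedged sketch with illustrative names:

```python
# Hedged sketch: distance from point P to the line through Q with direction u.
import math

def dist_point_line(p, q, u):
    qp = tuple(pc - qc for pc, qc in zip(p, q))
    t = sum(a * b for a, b in zip(qp, u)) / sum(a * a for a in u)
    foot = tuple(qc + t * uc for qc, uc in zip(q, u))  # foot of the perpendicular
    return math.dist(p, foot)

P, Q, u = (1, 2, 1), (-2, 1, 1), (2, -2, 1)
d = dist_point_line(P, Q, u)
print(math.isclose(d, math.sqrt(74) / 3))  # True
```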
Distance from a point to a plane. Let π be a plane in space, Q
a point in π, and n a unit normal vector to π. (This means that n has
unit length and points in a direction orthogonal to π.)
The distance d between a point P and π is again defined as the smallest distance from P to a point in π. We assert that
(7)  d = |(n|QP)| .
To prove this, it suffices to note that the orthogonal projection of the vector QP on n is equal to (n|QP) · n. The length of the latter vector must thus equal the sought distance, which proves (7).
Now suppose that Q = (x0 , y0 , z0 ), P = (x, y, z), and n = (A, B, C). Then
(n|QP) = A(x − x0 ) + B(y − y0 ) + C(z − z0 ).
If we put
D = Ax0 + By0 + Cz0 ,
then (7) can be written
(8)  d = |Ax + By + Cz − D| .
But P belongs to π precisely when d = 0, i.e., when
(9)
Ax + By + Cz = D.
We have shown that an arbitrary plane can be described by a linear
equation of the type (9). If P = (x, y, z) is not in π, then its distance to
π is given by (8).
Remark 10. The argument above shows that if we have a plane of
the form (9), then n = (A, B, C) is a normal vector. Above we assumed
that n is a unit vector, i.e., that
A² + B² + C² = 1.
If this is not satisfied, one must first normalize the equation (9) by dividing with √(A² + B² + C²). The distance formula then becomes
(10)  d = |Ax + By + Cz − D| / √(A² + B² + C²) .
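Formula (10) in function form (a sketch with illustrative names); the numbers anticipate Example 2 below.

```python
# Hedged sketch of formula (10): distance from a point to the plane
# Ax + By + Cz = D, with the normalization built in.
import math

def dist_point_plane(p, A, B, C, D):
    return abs(A * p[0] + B * p[1] + C * p[2] - D) / math.sqrt(A * A + B * B + C * C)

# the plane 3x + 2y - z = -1 of Example 2, and the point (5, 6, 7)
d = dist_point_plane((5, 6, 7), 3, 2, -1, -1)
print(math.isclose(d, 21 / math.sqrt(14)))  # True
```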
Example 2. We shall determine an equation for the plane π which passes through the point (1, −1, 2) and is perpendicular to the line
` :  x = 1 + 3t ,  y = 3 + 2t ,  z = 2 − t.
The direction vector of `, i.e. the vector (3, 2, −1), must then be a normal vector of π. Since the point (1, −1, 2) belongs to π, the equation
of the plane becomes
3(x − 1) + 2(y − (−1)) + (−1)(z − 2) = 0,
i.e.,
3x + 2y − z = −1.
The distance from the point P = (5, 6, 7) to π is thus (see (10))
|3 · 5 + 2 · 6 − 1 · 7 + 1| / √(3² + 2² + 1²) = 21/√14 .
Distance between two lines. Given two lines which do not intersect, we define their distance to be the smallest possible distance between a point on the first line and a point on the second. The following example illustrates a method to compute this kind of distance.
Example 3. Consider the lines `1 and `2 having parametric representations
`1 :  x = −3 + 2t ,  y = t ,  z = 1 − t ;   `2 :  x = 3 + t ,  y = 4 + 3t ,  z = 2 + 2t.
We shall compute the (shortest) distance between `1 and `2 . Direction
vectors for the two lines are (2, 1, −1) and (1, 3, 2). Let π be the plane
parallel to these directions, which passes through the point (−3, 0, 1)
on `1 . Then `1 ⊂ π and also π is parallel to `2 . Hence the distance d between `1 and `2 must equal the distance from an arbitrary point on `2 to π. A parametric representation of π is
π :  x = −3 + 2t + s ,  y = t + 3s ,  z = 1 − t + 2s.
Elimination of s and t gives the equation
x − y + z = −2
for the plane π. The distance from the point (3, 4, 2) on `2 to π is, according to (10),
d = |3 − 4 + 2 + 2| / √(1 + 1 + 1) = 3/√3 = √3.
The distance between the lines is thus √3.
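The distance in Example 3 can also be computed in one step: project a vector connecting the two lines onto a common normal n = u1 × u2 of the direction vectors (the cross product used here as a black box). A hedged sketch, with our own helper names:

```python
# Hedged sketch: distance between two non-parallel lines via the common normal.
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dist_lines(p1, u1, p2, u2):
    n = cross(u1, u2)                         # common normal direction
    w = tuple(b - a for a, b in zip(p1, p2))  # connecting vector
    return abs(sum(a * b for a, b in zip(w, n))) / math.sqrt(sum(a * a for a in n))

d = dist_lines((-3, 0, 1), (2, 1, -1), (3, 4, 2), (1, 3, 2))
print(math.isclose(d, math.sqrt(3)))  # True
```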
Angle between two planes. If π1 , π2 are two planes, we define
the angle between them to be the angle between the corresponding
normal vectors. (In general there are two possibilities for the angle,
depending on the mutual orientations of the normal vectors: see the
example below.)
Example 4. Suppose that
π1 : x − 2y − 2z = −3 ,   π2 : x + 4y + z = 5.
Corresponding normal vectors are n1 = (1, −2, −2) and n2 = (1, 4, 1). The angle θ between the planes then satisfies
cos θ = (n1 |n2 ) / (‖n1 ‖‖n2 ‖) = (1 · 1 + (−2) · 4 + (−2) · 1) / (√(1² + 2² + 2²) · √(1² + 4² + 1²)) = −1/√2 .
This gives θ = 3π/4. This is the obtuse angle between the planes. There is also another possibility, namely if we substitute −n1 for n1 above. This leads to the acute angle π − θ = π/4.
Angle between a line and a plane. To determine the angle between a line ` and a plane π, one first computes the acute angle ψ
between ` and a normal vector to π. The angle ϕ between ` and π is
defined by ϕ + ψ = π/2.
Example 5. Suppose that
` :  x = 2 + t ,  y = 3 + t ,  z = 1 + 4t ;   π : 4x − 11y − 5z = −2.
Let θ denote the angle between the direction vector (1, 1, 4) of ` and the normal vector (4, −11, −5) of π. Then
cos θ = (1 · 4 + 1 · (−11) + 4 · (−5)) / (√(1² + 1² + 4²) · √(4² + 11² + 5²)) = −1/2 .
This gives θ = 2π/3, which is obtuse. Hence the acute angle between ` and the normal to π is ψ = π − θ = π/3. The angle between ` and π is thus ϕ = π/2 − ψ = π/6.
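The recipe of Example 5 as a function (illustrative names, not from the text): the angle between a line with direction u and a plane with normal n is π/2 minus the acute angle between u and n.

```python
# Hedged sketch: angle between a line (direction u) and a plane (normal n).
import math

def line_plane_angle(u, n):
    dot = abs(sum(a * b for a, b in zip(u, n)))  # |cos| selects the acute angle
    nu = math.sqrt(sum(a * a for a in u))
    nn = math.sqrt(sum(a * a for a in n))
    return math.pi / 2 - math.acos(dot / (nu * nn))

phi = line_plane_angle((1, 1, 4), (4, -11, -5))
print(math.isclose(phi, math.pi / 6))  # True
```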
Exercises.
5. Compute the distance between the point (1, 2, 3) and the line
x = 1 − t ,  y = −4 + 2t ,  z = 3 − t.
6. The line ` is the intersection between the planes x+2y−2z = 5
and 2x − y + z = 0. Determine the point on ` which is closest
to the origin.
7. The line ` passes through the point (1, 2, 3) and is perpendicular to the plane 2x − 3y + 1 = −3. Find the distance between
` and the point (4, 5, 6).
8. Determine, in the form Ax + By + Cz = D, the equation of the
plane which consists of all points which have equal distance
to the points (1, 2, 0) and (−1, 0, 2).
9. Find the distance from the plane 3x − 4y + 12z = 13 to the
points (0, 0, 0) and (2, 1, 3). Are these points on the same or
on opposite sides of the plane?
10. a) Determine, in the form Ax + By + Cz = D, an equation for
the plane M which passes through the points (2, −3, 0) and
(2, −2, 2), and is parallel to the line x = 2 − t, y = 1 + t, z = 2 − t.
b) Find the distance between the point (3, −1, 0) and M.
11. Find the point in the plane through the points (1, 3, −1), (1, 1, 0),
(−1, 3, 2) which is closest to the point (−2, −2, −1).
12. a) Prove that the lines
`1 :  x = 1 + t ,  y = 2 − t ,  z = 3 + 2t   and   `2 :  x = 3 + t ,  y = 2 + t ,  z = 2 − 3t
intersect at a point.
b) Find the distance between the point (3, 4, 5) and the
plane spanned by `1 and `2 .
13. A ray of light is emitted from the point (3, −2, −1) and reflected
off the plane x − 2y − 2z = 0. The reflected ray passes through the point
(4, −1, −6). At which point does the ray hit the plane?
14. Determine the distance between the lines
a) (x, y, z) = t(−3, 3, 1) and (x, y, z) = (−1, 0, 0) + t(1, 1, 1).
b) (x, y, z) = (1, 2, 3) + t(0, 1, 1) and (x, y, z) = (1, 1, 1) +
t(2, 3, 1).
15. Consider the lines
`1 :  x = 8 − 3t ,  y = 2 − t ,  z = −3 + t   and   `2 :  x = −12 − t ,  y = 4 − 2t ,  z = 1 + t.
Determine, in the form Ax + By + Cz = D, an equation for the
plane which is parallel to `1 and `2 and has the same distance
to the two lines.
16. a) Determine, in the form Ax+By+Cz = D, an equation for the
plane M which passes through the points (2, −1, 3), (1, 2, −2),
and (1, 0, 2).
b) Determine the angle between M and the plane 2x + y −
z = −1.
17. Determine the angle between the plane x + 2y − z = 0 and the
line (x, y, z) = (3, 5, −1) + t(1, 1, 0).
18. A tetrahedron has corners A = (−1, 2, 0), B = (1, 3, −1), C = (1, 1, 0), and D = (−1, 3, −2). Determine the angle between the plane containing the side BCD and the line containing the edge AB.
CHAPTER 4
Second degree curves
1. Ellipse, Hyperbola, Parabola
Circle. Let F be a point in a plane π and consider the set of all
points P in π at a given distance a from F. If F and P have coordinates
(x0 , y0 ) and (x, y) respectively, where coordinates are represented in
some ON-system for π, then the equation of the circle can be written
(x − x0 )2 + (y − y0 )2 = a2 .
Of course, point F is called the center and a is the radius of the circle.
If F is the origin, the equation reduces to
x2 + y2 = a2 .
We shall now discuss the other basic types of second degree curves:
ellipse, hyperbola, and parabola.
Ellipse. The definition of an ellipse generalizes the definition of
a circle. Let F1 and F2 be two points in a plane π and let a be a positive
constant; we assume that 2a is greater than the distance between F1
and F2 . The set of points P in π with the property that the sum of the
distances from P to F1 and from P to F2 equals to 2a is called an ellipse.
If we choose an ON-system in π such that the origin is the midpoint on the segment F1 F2 , and the x-axis passes through the points
F1 and F2 , then F1 and F2 have coordinates (−c, 0) and (c, 0) for some
real c. We can assume that c > 0. That the sum of distances from
P = (x, y) to F1 and F2 equals 2a means that
(1)  √((x + c)² + y²) + √((x − c)² + y²) = 2a.
Squaring the equation (1) leads to
(x + c)² + (x − c)² + 2y² + 2√( ((x + c)² + y²)((x − c)² + y²) ) = 4a².
Dividing by 2 and rearranging, this becomes
x² + y² + c² − 2a² = −√( (x² + y² + c² + 2cx)(x² + y² + c² − 2cx) ).
Squaring again, we obtain that
(x² + y² + c²)² − 4a²(x² + y² + c²) + 4a⁴ = (x² + y² + c²)² − 4c²x²,
i.e.,
(a² − c²)x² + a²y² = a²(a² − c²).
If we put b² = a² − c² and divide with a²b², this becomes
(2)  x²/a² + y²/b² = 1.
We have shown that each point (x, y) satisfying the root-equation (1)
also satisfies (2). By tracing back in the calculations, one can also
verify that all solutions to (2) satisfy (1). (Since we have squared
several times, this is not immediate!) We have shown that the ellipse
is completely determined by the equation (2).
Notice that the ellipse (2) intersects the coordinate axes at the
points (±a, 0) and (0, ±b). The segments from the origin to these
points are called the semi-axes of the ellipse. The points F1 = (−c, 0)
and F2 = (c, 0) are the foci of the ellipse.
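A numerical spot-check of the defining property: for a point on the ellipse (2) with c² = a² − b², the distances to the foci sum to 2a. The numbers below are our own example, not from the text.

```python
# Hedged check: sum of distances from an ellipse point to the foci equals 2a.
import math

a, b = 5, 3
c = math.sqrt(a * a - b * b)             # c = 4, foci at (+-4, 0)
x = 2.0
y = b * math.sqrt(1 - x * x / (a * a))   # a point (x, y) on the ellipse
s = math.dist((x, y), (-c, 0)) + math.dist((x, y), (c, 0))
print(math.isclose(s, 2 * a))  # True
```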
Hyperbola. If we instead consider the set of points P in a plane
π such that the difference between the distances to two given points
("foci”) F1 and F2 is constant = 2a, we get a curve known as a hyperbola.
In a similar way to the case of the ellipse, we can introduce an
ON-system in π such that F1 = (−c, 0) and F2 = (c, 0) for a number
c > a. Hence the equation of the hyperbola becomes
(3)  √((x + c)² + y²) − √((x − c)² + y²) = ±2a.
Here the plus-sign shall be chosen if P = (x, y) is closer to F2 , and the
minus-sign if P is closer to F1 . The hyperbola is not connected; it has two branches.
Calculations analogous to the case for the ellipse show that the
equation of the hyperbola can be written
(4)  x²/a² − y²/b² = 1,
where this time b2 = c2 − a2 .
Parabola. Let ` be a line in a plane π, and F a point in π, which
is not on `. The set of points P in π whose distance to ` equals the
distance to F is called a parabola. The point F is the focus and the line
` is called the directrix of the parabola.
Choose an ON-system in π such that the y-axis is parallel to `,
the x-axis passes through F, and the origin has equal distance a to
F and to `. We shall prove that the equation of the parabola in this
ON-system becomes
(5)  y² = 4ax.
To show this, note that the distance from a point P = (x, y) to ` is x + a, while the distance to F is √((x − a)² + y²). Equating the squares of these distances leads to (5).
Remark 11. The parabola has an interesting optical property with
many practical applications. Each light-ray parallel to the positive
x-axis will, after reflection in the parabola, pass through the same
point F. A set of parallel light-rays are thus focussed to the point F.
Conversely, if we place a source of light at F, this will after reflection
give rise to light-rays which are parallel to the x-axis.
In the case of the ellipse, one has instead that if a light source is
placed at F1 , then all light-rays will pass through F2 .
Remark 12. The ellipse, the hyperbola, and the parabola are all
cases of so-called conic sections. This name stems from the fact that
all such curves can be obtained as the intersection of a double cone
with a suitable plane.
Exercises.
19. Determine the centers of circles which are tangent to the x-axis
and which pass through the points (0, 1) and (0, 9).
20. Determine the foci of the ellipses
a) 9x2 + 25y2 = 225.
b) 25x2 + 169y2 = 4225.
21. Determine the equation of the ellipse which intersects the
y-axis at the points (0, ±2) and has foci at the points (±2, 0).
22. Let (x0 , y0 ) be a point on the ellipse x2 /a2 + y2 /b2 = 1.
a) Show that the line x = x0 + αt, y = y0 + βt is tangent to
the ellipse if and only if αx0 /a2 + βy0 /b2 = 0.
b) Show that the point (x, y) is on the tangent of the ellipse
at (x0 , y0 ) if and only if xx0 /a2 + yy0 /b2 = 1.
23. Find the foci of the hyperbolae
a) 16x2 − 9y2 = 144.
b) 3x2 − 5y2 = 75.
24. Find the equation of the hyperbola which intersects the x-axis
at the points (±2, 0) and has foci at (±3, 0).
25. Find the equation of the parabola which is symmetric with
respect to the x-axis and which passes through the points
(0, 0) and (27, 18). Also determine the focus.
2. General Second-Degree Equations
A second-degree equation in the variables x and y is an equation
of the form
(1)  Ax² + Bxy + Cy² + Dx + Ey = F.
Now suppose that x and y are coordinates with respect to an ON-system Oe1 e2 in the plane. We shall investigate the geometric meaning of the equation (1). In the preceding section, we saw that ellipses,
hyperbolas, and parabolas are all described by second-degree equations. We shall here show that, except for certain "pathological”
cases, these three basic types of curves can be used to describe all
second-degree curves.
If A = B = C = 0, then (1) is a first-degree equation
Dx + Ey = F.
This is (unless D = E = 0) the equation of a line. In the sequel, we can hence assume that at least one of the coefficients A, B, C is non-zero.
The main idea for our solution of (1) involves changing coordinates to a new ON-system, where the equation has a simpler form.
We start by showing that, by a suitable rotation of the basis vectors,
we can get rid of the coefficient B for xy. Thus we introduce new
basis vectors e′1 , e′2 by
e′1 = cos θ e1 + sin θ e2 ,   e′2 = − sin θ e1 + cos θ e2 .
It is easy to see that e′1 , e′2 is again an orthonormal basis (since e1 , e2 is so).
Let (x′ , y′ ) be the coordinates for a point P relative to the system Oe′1 e′2 . Then
OP = (x′ cos θ − y′ sin θ)e1 + (x′ sin θ + y′ cos θ)e2 .
If (x, y) are the coordinates of P in the "old" system Oe1 e2 , we hence have
x = x′ cos θ − y′ sin θ ,  y = x′ sin θ + y′ cos θ.
Substituting these expressions into (1), we get an equation of the form
A′(x′)² + B′x′y′ + C′(y′)² + D′x′ + E′y′ = F,
where the coefficient B′ of x′y′ is given by

(2)  B′ = −2A cos θ sin θ + B(cos²θ − sin²θ) + 2C cos θ sin θ
        = B cos 2θ − (A − C) sin 2θ,

where we have used the familiar "double angle" formulas for cos and sin.
If B = 0 no rotation is necessary, and we can take θ = 0. If B ≠ 0 we choose θ such that

cot 2θ = (A − C)/B.

The relation (2) then shows that B′ = 0. In all cases, our new coordinate system turns the equation into the type

(3)  A′(x′)² + C′(y′)² + D′x′ + E′y′ = F,
for suitable constants A′, C′, D′, E′. Since we have assumed that at least one of the numbers A, B, C is non-zero, it is easy to see that at least one of A′, C′ is non-zero.
Case 1. We first consider the case when both A′ and C′ are non-zero. We can then complete squares in (3) to obtain

(4)  A′(x′ + D′/(2A′))² + C′(y′ + E′/(2C′))² = (D′)²/(4A′) + (E′)²/(4C′) + F.

We then make a new change of coordinates by

x″ = x′ + D′/(2A′),  y″ = y′ + E′/(2C′).

This means that the origin of the old system is moved to the point which has x′y′-coordinates (−D′/(2A′), −E′/(2C′)). If

F′ = (D′)²/(4A′) + (E′)²/(4C′) + F,

then (4) becomes

(5)  A′(x″)² + C′(y″)² = F′.

If A′ and C′ are both positive, then (5) describes an ellipse, a point, or the empty set, depending on whether F′ > 0, F′ = 0, or F′ < 0. The case when A′ and C′ are both negative can be reduced to the positive case by multiplying both sides of the equation by −1. If A′ and C′ have opposite signs, we can after multiplication by a suitable constant assume that A′ > 0 and C′ = −1. The equation (5) is then

A′(x″)² − (y″)² = F′.
If F′ ≠ 0 this means a hyperbola. If F′ = 0 we get

y″ = ±√A′ · x″,

which means a "degenerate hyperbola", or rather: two intersecting straight lines.
Case 2. Now suppose that one of the numbers A′, C′ in (3) is zero. We can w.l.o.g. assume that A′ = 0 and C′ ≠ 0. Then (3) says that

(6)  C′(y′)² + D′x′ + E′y′ = F.

If D′ = 0, then the equation (6) is independent of x′. The equation then means two lines parallel to the x′-axis, or one line parallel to the x′-axis, or the empty set, depending on the number of different real solutions of the second-degree equation C′(y′)² + E′y′ = F. If D′ ≠ 0, the equation (6) can be written

C′(y′ + E′/(2C′))² + D′(x′ − F/D′ − (E′)²/(4C′D′)) = 0.

If we now put

x″ = x′ − F/D′ − (E′)²/(4C′D′),  y″ = y′ + E′/(2C′),

we see that (6) transforms into the equation

(7)  C′(y″)² + D′x″ = 0.

This last equation means a parabola: if C′ and D′ have equal signs, it surrounds the negative x″-axis, otherwise it surrounds the positive x″-axis.
The above discussion characterizes all possible second-degree curves. Besides the ellipse, hyperbola, and parabola, there are the following "pathological" cases: two intersecting straight lines, one or two parallel straight lines, a point, the whole plane (when A = B = C = D = E = F = 0), and the empty set.
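As a numerical check of the rotation step, here is a small Python sketch (the helper name rotate_conic is ours, not from the text). It picks θ from cot 2θ = (A − C)/B via atan2 and verifies that the cross-term coefficient vanishes; we try it on the hyperbola xy = 1.

```python
import math

def rotate_conic(A, B, C):
    """Rotate the axes by the angle theta with cot(2*theta) = (A - C)/B
    and return the coefficients A', B', C' of the quadratic part in the
    new (x', y') coordinates. B' should come out as zero."""
    if B == 0:
        theta = 0.0
    else:
        # cot(2t) = (A - C)/B  <=>  tan(2t) = B/(A - C); atan2 also handles A == C.
        theta = 0.5 * math.atan2(B, A - C)
    c, s = math.cos(theta), math.sin(theta)
    A2 = A * c * c + B * c * s + C * s * s
    B2 = -2 * A * c * s + B * (c * c - s * s) + 2 * C * c * s
    C2 = A * s * s - B * c * s + C * c * c
    return A2, B2, C2

# The hyperbola xy = 1 (A = C = 0, B = 1): theta = pi/4 gives
# (1/2)(x')^2 - (1/2)(y')^2 = 1, a hyperbola in standard position.
A2, B2, C2 = rotate_conic(0.0, 1.0, 0.0)
```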
Exercises.
26. Prove that each of the following equations describes an ellipse. Also determine the lengths of the semi-axes.
a) 17x² − 16xy + 17y² = 225.
b) 3x² + 2xy + 3y² = 8.
c) 9x² + y² − 18x + 4y + 4 = 0.
d) 2x² + 3y² + 12x + 12 = 0.
3. Answers to Exercises

3. The side-lengths are √3, √3, and 2. The cosines of the angles are −√2/3, 5/(3√3), and 2√2/3.
4. π/3 or 2π/3.
5. √12.
6. (1, 1, −1).
7. 3√3.
8. x + y − z = 0.
9. 1 and 25/13 respectively. The points are on opposite sides of the plane.
10. a) 3x − 2y + z = 12. b) 1/√14.
11. (1, −1, 1).
12. a) Intersection-point: (2, 1, 5). b) 16/√30.
13. (2, 1, 0).
14. a) 1/√14. b) 1/√3.
15. x + 2y + 5z = −1.
16. a) x + 2y + z = 3. b) π/3.
17. π/3.
18. π/6.
19. (±3, 5).
20. a) (±4, 0). b) (±12, 0).
21. x² + 2y² = 8.
23. a) (±5, 0). b) (±√40, 0).
24. 5x² − 4y² = 20.
25. y² = 12x; focus (3, 0).
26. a) 5 and 3. b) 2 and √2. c) 3 and 1. d) √3 and √2.
CHAPTER 5
Cross-product and Volume-product
1. Orientation of vectors
Let π be a plane in space and u, v two non-parallel vectors in π.
Consider the smallest rotation which turns u into a vector with the
same direction as v. If we view the rotation from one side of the
plane, the rotation appears clockwise, but from the other side it will
be perceived to be counterclockwise. If w is a vector not in π, and
if the rotation is counterclockwise when seen from the side of π in the
direction of w, then the triple u, v, w is said to be positively oriented.
If this is not the case, we say that the triple is negatively oriented. A
positively (resp. negatively) oriented system is sometimes called a
right handed (resp. left handed) system.
Observe that the ordering of the vectors is essential here. For
example, if u, v, w is positively oriented, then u, w, v is negatively
oriented. Namely: place all vectors with their tails at the same point.
Then, seen from the tip of v, the smallest rotation which turns u into
a vector with the same direction as w, will be clockwise. The reader
is asked to supply a picture of the situation.
2. Cross-product
Let u and v be two vectors in space and denote by A(u, v) the area
of the parallelogram spanned by u and v. If u and v are parallel, then
of course A(u, v) = 0; otherwise, a simple geometric consideration
shows that
(8)  A(u, v) = ‖u‖ ‖v‖ sin θ,
where θ is the angle between u and v (in the interval [0, π]).
Definition 13. The cross-product (or vector-product) u × v of u and
v is the unique vector w with the following properties:
(i) w is orthogonal to u and to v,
(ii) ‖w‖ = A(u, v),
(iii) If u and v are not parallel, then u, v, w is a positively oriented
triple.
Remark 14. Note that if the vectors u and v are parallel, then
A(u, v) = 0 and u × v = 0.
Example 1. Let e1 , e2 , e3 be a positively oriented orthonormal basis.
Then
e1 × e2 = −e2 × e1 = e3 ,
e3 × e1 = −e1 × e3 = e2 ,
e2 × e3 = −e3 × e2 = e1 .
These formulas are obvious geometrically.
The cross-product can (like the scalar product) be described using an orthogonal projection. Suppose that v ≠ 0 and that π is a plane with normal vector v. Denote by u′ the orthogonal projection of u on π. Then A(u, v) = A(u′, v), because ‖u′‖ is the height of the parallelogram spanned by u and v relative to the side v. Let w = u × v. So w is the vector orthogonal to u and v, such that ‖w‖ = ‖u′‖‖v‖ and such that u, v, w is a positively oriented triple. It is now seen that

(9)  u × v = ‖v‖ · T(u′),

where T(u′) is the vector u′ rotated by the angle π/2 clockwise in π, as seen from the tip of v. For the orthogonal projection, we know that (u1 + u2)′ = u′1 + u′2. If the resulting vectors are rotated by π/2 clockwise, it follows that T((u1 + u2)′) = T(u′1) + T(u′2). It now follows from (9) that
(10)
(u1 + u2 ) × v = u1 × v + u2 × v.
We have proved the distributive law for the cross-product. It is also
clear from the definition of the cross-product that it obeys linearity in
the first argument,
(11)
(tu) × v = t(u × v),
as well as a new type of rule:
(12)
u × v = −v × u.
The rule (12) is known as the anti-commutativity of the cross-product.
A consequence of (12) is that the counterparts of (10) and (11) also
hold for the second argument:
u × (v1 + v2) = u × v1 + u × v2,  u × (tv) = t(u × v).
If e1, e2, e3 is a basis for three-dimensional space and

(13)  u = x1e1 + x2e2 + x3e3,  v = y1e1 + y2e2 + y3e3,
we will thus have
u × v = x1y1 e1 × e1 + x1y2 e1 × e2 + … + x3y3 e3 × e3.

Since ej × ej = 0 and ej × ek = −ek × ej, this simplifies to

u × v = (x1y2 − x2y1) e1 × e2 + (x1y3 − x3y1) e1 × e3 + (x2y3 − x3y2) e2 × e3.
In the important case when the basis e1 , e2 , e3 is orthonormal and
positively oriented, we get (using Example 1) the following result.
Theorem 15. Suppose that e1, e2, e3 is a positively oriented orthonormal basis. Then

(14)  u × v = (x2y3 − x3y2)e1 + (x3y1 − x1y3)e2 + (x1y2 − x2y1)e3.
Remark 16. There is a well-known mnemonic trick to remember the above formula. It uses the concept of "determinants", a concept which we at this stage will use solely as a memory aid. By definition, a 2 × 2-determinant is given by

| a b |
| c d | = ad − bc.

A 3 × 3-determinant is then defined by

| a1 a2 a3 |
| b1 b2 b3 | = a1 | b2 b3 | − a2 | b1 b3 | + a3 | b1 b2 |
| c1 c2 c3 |      | c2 c3 |      | c1 c3 |      | c1 c2 |

This rule is called "expansion along the first row": one starts in the upper left corner with a1 and multiplies by the 2 × 2-determinant obtained by striking the row and column containing a1. Then we proceed to a2 and do the same, but with the opposite (i.e., minus-) sign in front of it. Then we move to a3 (changing sign again). Using determinants we can now write

        | e1 e2 e3 |
u × v = | x1 x2 x3 |.
        | y1 y2 y3 |
y1 y2 y3 Remark 17. We saw above that the cross-product is non-commutative
(it is in fact anti-commutative). The cross-product does not obey the
associate law either. For example, if e1 , e2 , e3 is an orthonormal basis,
then by Example 1,
e1 × (e1 × e2 ) = −e2
and
(e1 × e1 ) × e2 = 0 × e2 = 0.
Thus it can happen that u × (v × w) , (u × v) × w.
Example 2. Consider three points P0, P1, P2, which in a positively oriented orthonormal system have coordinates (2, 3, −2), (4, 1, 1), and (2, 1, −1) respectively. We shall compute the area of the triangle P0P1P2. This area equals half the area of the parallelogram spanned by the vectors P0P1→ and P0P2→. The area of that parallelogram is the length of the cross-product

P0P1→ × P0P2→ = (2, −2, 3) × (0, −2, 1) = (4, −2, −4).

Thus the triangle P0P1P2 has area

(1/2)√(4² + 2² + 4²) = 3.
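The computation in the example above can be reproduced in a few lines of Python (the helper names are ours):

```python
import math

def cross(u, v):
    # Coordinate formula (14) for the cross-product.
    x1, x2, x3 = u
    y1, y2, y3 = v
    return (x2 * y3 - x3 * y2, x3 * y1 - x1 * y3, x1 * y2 - x2 * y1)

def triangle_area(p0, p1, p2):
    """Half the length of the cross-product of two edge vectors."""
    a = tuple(p1[i] - p0[i] for i in range(3))
    b = tuple(p2[i] - p0[i] for i in range(3))
    w = cross(a, b)
    return 0.5 * math.sqrt(sum(t * t for t in w))

area = triangle_area((2, 3, -2), (4, 1, 1), (2, 1, -1))
```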
Example 3. If e1, e2 is an orthonormal basis in a plane π, we can choose a unit normal vector e3 to π such that e1, e2, e3 becomes an orthonormal basis for space. Two vectors

u = x1e1 + x2e2,  v = y1e1 + y2e2

in π will then have coordinates (x1, x2, 0) resp. (y1, y2, 0) in the basis e1, e2, e3. The area of the parallelogram spanned by u and v therefore equals the length of the cross-product

(x1, x2, 0) × (y1, y2, 0) = (0, 0, x1y2 − x2y1),

i.e. we have the formula

A(u, v) = |x1y2 − x2y1|.
Example 4. The cross-product can be used to calculate the distance between lines. Assume a positively oriented orthonormal system, and let ℓ1 be the line through the points (1, 1, 1) and (4, 5, 3) while ℓ2 is the line passing through the points (−1, −10, −1) and (8, 2, 2). Thus ℓ1 has direction vector (3, 4, 2) and ℓ2 has direction vector (3, 4, 1). Since

(3, 4, 2) × (3, 4, 1) = (−4, 3, 0),

we see that e = (1/5)(−4, 3, 0) is a unit vector orthogonal to both ℓ1 and ℓ2. If P is a point on ℓ1 and Q a point on ℓ2, we infer that the absolute value of the scalar product (PQ→ | e) must equal the distance between ℓ1 and ℓ2. Choosing, for example, P = (1, 1, 1) and Q = (8, 2, 2), we find that

(PQ→ | e) = −5.

The distance between ℓ1 and ℓ2 is thus 5.
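The procedure of Example 4 can be packaged as a small Python function (our own helper, and it assumes the two lines are not parallel):

```python
import math

def cross(u, v):
    x1, x2, x3 = u
    y1, y2, y3 = v
    return (x2 * y3 - x3 * y2, x3 * y1 - x1 * y3, x1 * y2 - x2 * y1)

def line_distance(p, d1, q, d2):
    """Distance between the line through p with direction d1 and the
    line through q with direction d2, for non-parallel lines: project
    the connecting vector pq on the common normal d1 x d2."""
    n = cross(d1, d2)
    norm = math.sqrt(sum(t * t for t in n))
    pq = tuple(q[i] - p[i] for i in range(3))
    return abs(sum(pq[i] * n[i] for i in range(3))) / norm

# The data of Example 4.
dist = line_distance((1, 1, 1), (3, 4, 2), (-1, -10, -1), (3, 4, 1))
```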
Exercises.
1. Prove that if u + v + w = 0, then
u × v = v × w = w × u.
2. Find the area of the triangle which in a positively oriented
ON-system has its vertices at the points
a) (1, 2, 3), (3, 4, 1), (2, 0, 2).
b) (5, 1, 1), (2, 3, 2), (3, 2, 3).
c) (1, 0, 0), (0, 1, 0), (0, 0, 1).
3. In a positively oriented ON-system, the points (1, 1, 1) and
(0, 3, 3) are on the line `1 , and (2, 2, −4) and (4, 4, 4) are on `2 .
Find the distance between `1 and `2 .
4. Prove that
(u × v) × w = (u|w) v − (v|w) u.
Hint: To simplify the computations, one can choose an
ON-basis e1, e2, e3 such that u = x1e1 and v = y1e1 + y2e2.
5. Prove that
(u × v) × w = u × (v × w)
if and only if u is parallel to w, or u and w are both orthogonal
to v.
Hint: Use the preceding exercise.
3. Volume-product

The cross-product and the scalar product can be combined to form the volume-product V(u, v, w) of three vectors in space:

(15)  V(u, v, w) = (u × v | w).

The motivation for the name is that the absolute value of V(u, v, w) equals the volume of the parallelepiped spanned by the vectors u, v, w, if they are placed with their tails at one and the same point. To show this, note that the vector

e = (u × v)/‖u × v‖

is a unit normal to the plane spanned by u and v. Denote by P the parallelepiped spanned by u, v, w. From elementary geometry we know that the volume of P equals the area of the "base parallelogram" spanned by u, v, times the height h of P above the plane spanned by u and v. Here h is the length of the orthogonal projection of w on the normal of the plane, i.e., h = |(e | w)|. Since the base parallelogram has area A(u, v) = ‖u × v‖, we infer that the volume of P equals

A(u, v)h = ‖u × v‖ · |(e | w)| = |(u × v | w)|.
The asserted property of the volume product is proved.
When two vectors u and v are parallel we have u × v = 0, so
V(u, v, w) = 0 in this case. More generally, the volume-product is
zero when the parallelepiped is degenerate, i.e., when the vectors u,
v, w are linearly dependent. On the other hand, if u, v, w are linearly
independent, the sign of (u × v|w) depends on whether or not the
two vectors u × v and w lie on the same side of the plane spanned by
u and v. The volume product V(u, v, w) is positive when the triple u,
v, w is positively oriented and negative otherwise. Since the volume
of the parallelepiped is the same regardless of how we choose to
permute the vectors u, v, w, only the sign can change under such a
permutation:
V(u, v, w) = V(w, u, v) = V(v, w, u) = −V(u, w, v)
= −V(v, u, w) = −V(w, v, u).
Combining the computational rules for the cross- and scalar-products, we find that

(16)  V(su1 + tu2, v, w) = sV(u1, v, w) + tV(u2, v, w)
for all real numbers s and t. Thus we have linearity in the first argument
for the volume product. We similarly have linearity in the second
and in the third argument. In short: the volume product is tri-linear,
i.e., linear in each of its three arguments.
Now let e1, e2, e3 be a basis for space, and take three vectors u = (x1, x2, x3), v = (y1, y2, y3), w = (z1, z2, z3), where coordinates are given with respect to the chosen basis. Then, by linearity in the different arguments,

V(u, v, w) = V(x1e1 + x2e2 + x3e3, v, w)
           = x1V(e1, v, w) + x2V(e2, v, w) + x3V(e3, v, w)
           = … = Σ xiyjzk V(ei, ej, ek).

Here the sum is over all possible choices of i, j, k ∈ {1, 2, 3}, but since V(ei, ej, ek) = 0 if two of the indices coincide, only six terms can be non-zero. Furthermore

V(e1, e2, e3) = V(e3, e1, e2) = V(e2, e3, e1) = −V(e1, e3, e2) = −V(e2, e1, e3) = −V(e3, e2, e1).
This gives that V(u, v, w) is equal to

(x1y2z3 + x3y1z2 + x2y3z1 − x1y3z2 − x2y1z3 − x3y2z1) V(e1, e2, e3).

The number in front of V(e1, e2, e3) can now be recognized as the 3 × 3-determinant (see Remark 16 in the previous section)

| x1 x2 x3 |
| y1 y2 y3 |.
| z1 z2 z3 |

We have arrived at the formula

                   | x1 x2 x3 |
(17)  V(u, v, w) = | y1 y2 y3 | · V(e1, e2, e3).
                   | z1 z2 z3 |

This formula is particularly simple when the basis e1, e2, e3 is orthonormal and positively oriented. Then V(e1, e2, e3) = 1, so we simply have:

                   | x1 x2 x3 |
(18)  V(u, v, w) = | y1 y2 y3 |.
                   | z1 z2 z3 |
z1 z2 z3 Example 1. Suppose that the vectors u, v, w have coordinates (2, −1, 3),
(0, 3, 2), and (3, 5, 1) respectively, with respect to a positively oriented
orthonormal basis. We get
2 −1 3
V(u, v, w) = 0 3 2 = −47.
3 5 1
The volume of the parallelepiped spanned by u, v, w is thus 47.
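Formula (18) and the examples can be checked with a few lines of Python (the helper det3 is ours; it implements the first-row expansion of Remark 16):

```python
def det3(m):
    """3x3 determinant by expansion along the first row, as in Remark 16.
    The argument is a list of three rows."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))

# Volume-product of Example 1: the rows are the coordinates of u, v, w.
V = det3([(2, -1, 3), (0, 3, 2), (3, 5, 1)])
```

The same helper decides linear dependence: for the vectors of Example 2 with a = 1 the determinant vanishes, confirming dependence.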
Three vectors are linearly dependent if and only if they are coplanar, i.e., if the corresponding parallelepiped has volume zero. According to (17), this is equivalent to the determinant of the coordinates of the vectors being zero.
Example 2. To decide when the vectors (1, a, 2), (−1, 7, 1 + a), and (1, −1, 1) are linearly dependent, we form the determinant

|  1  a    2   |
| −1  7  1 + a | = a² + 3a − 4 = (a − 1)(a + 4).
|  1 −1    1   |

The vectors are thus linearly dependent if a = 1 or a = −4.
Exercises.
6. Motivate the identity
(u × v|w) = (u|v × w) .
7. The volume of a tetrahedron spanned by three vectors u, v, w, rooted at the same point, equals 1/6 of the volume of the parallelepiped spanned by u, v, and w. Find the volume of the tetrahedron with vertices at
the tetrahedron with vertices at
a) (2, 1, 0), (3, 5, 2), (4, 1, 2), (6, 1, 5).
b) (−2, 2, −3), (2, 1, 3), (1, 4, −2), (0, 5, 1).
8. A tetrahedron with volume 5 has three of its vertices at the
points (2, 1, −1), (3, 0, 1), (2, −1, 3). The fourth vertex is on the
positive y-axis. Determine its y-coordinate.
9. For which values of a and b are the three vectors
(a, b, b) ,
(b, a, b) ,
(b, b, a)
linearly dependent?
10. For which values of a are the four points
(0, 2, 1), (−a, 1, 0), (−3, 3, −a), (3, −3, 1 + a)
in the same plane?
4. Quaternions
We have seen that neither the commutative nor the associative law holds for the cross-product. One can ask whether there is some other way of defining multiplication between vectors, so that all the usual computational laws are satisfied. For vectors in a plane this is true, since plane vectors can be identified with complex numbers. In the definition of multiplication of complex numbers, one starts with the familiar identity i² = −1; if the usual laws of calculation are to hold, the product of complex numbers must be

(x1 + ix2)(y1 + iy2) = (x1y1 − x2y2) + i(x1y2 + x2y1).

As we know, this definition does indeed satisfy all the usual computational rules.
Vectors in three-dimensional space can formally be written

x1 + x2i + x3j,

and if we still insist that i² = −1, so that the multiplication when x3 = 0 corresponds to multiplication of complex numbers, we just need to establish the rules for multiplication with j. In particular, the product ij must be a new vector

(19)  ij = a + bi + cj.

If this is multiplied by i, using that i² = −1, we obtain

−j = −b + ai + c·ij.

If we here substitute ij by the right hand side of (19), we get after simplification that

0 = (ac − b) + (bc + a)i + (c² + 1)j.

This does not make sense, for we cannot have c² + 1 = 0 for a real number c.
The above argument shows that it is impossible to extend the multiplication of complex numbers to a multiplication of triples of numbers in such a way that the usual laws of calculation are preserved. Annoyed by this type of inconvenience, the Irish mathematician Hamilton tried instead to define multiplication between 4-tuples

(20)  x = x0 + x1i + x2j + x3k.

For reasons soon to be made clear, the coefficients are enumerated from 0 to 3, rather than from 1 to 4. In 1843, Hamilton discovered that if one abandons the commutative law xy = yx and defines

(21)  i² = j² = k² = −1,  ij = −ji = k,  jk = −kj = i,  ki = −ik = j,

then multiplication of 4-tuples will satisfy all other rules of calculation. Hamilton coined the term quaternions for the set of 4-tuples with this multiplication. He also showed how to define division by a non-zero quaternion.

In 1878, the German mathematician Frobenius proved that if we want to define multiplication of n-tuples such that division by non-zero elements is always possible, and if we are only prepared to abandon the commutative law of multiplication, then n must be either 1, 2, or 4. Except for the real and complex fields, Hamilton's quaternions are the only possibility.

Hamilton called the number x0 the scalar part of the quaternion (20), and x1i + x2j + x3k its vector part. If one multiplies two quaternions with scalar parts zero and uses the identities in (21), one finds after a little calculation that

(x1i + x2j + x3k)(y1i + y2j + y3k) = −(x1y1 + x2y2 + x3y3) + (x2y3 − x3y2)i + (x3y1 − x1y3)j + (x1y2 − x2y1)k.
The scalar part of the right hand side equals the negative of the scalar product of the vectors in the left hand side, and the vector part of the right hand side equals the cross-product of the vectors in the left hand side. The scalar product is older (even though its name was invented by Hamilton), but the cross-product was discovered in exactly this way, as a by-product of multiplication of quaternions. Hamilton also interpreted the quaternions geometrically and defined the scalar- and cross-products in a basis-independent way, as we have done in this chapter.
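The multiplication rules (21) are easy to implement; the sketch below (the helper qmul is ours) encodes a quaternion as a 4-tuple (x0, x1, x2, x3) and can be used to check the scalar-part/vector-part relation above.

```python
def qmul(p, q):
    """Product of quaternions p = p0 + p1*i + p2*j + p3*k and
    q = q0 + q1*i + q2*j + q3*k, expanded using the rules (21):
    scalar part p0*q0 - (p|q), vector part p0*q + q0*p + p x q."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
            p0 * q1 + p1 * q0 + p2 * q3 - p3 * q2,
            p0 * q2 - p1 * q3 + p2 * q0 + p3 * q1,
            p0 * q3 + p1 * q2 - p2 * q1 + p3 * q0)

# The imaginary units as 4-tuples.
i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
```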
Exercises.
11. Prove that the formulas (21) imply that

(x0 + x1i + x2j + x3k)(x0 − x1i − x2j − x3k) = x0² + x1² + x2² + x3²,

and that one therefore can define division by non-zero quaternions.
5. Answers to Exercises
7. a) 4/3. b) 10.
8. y = 8.
9. a = b or a = −2b.
10. a = 1 or a = −9/4.
CHAPTER 6
Matrices
1. Basic properties
Definitions. By a p × n-matrix we mean an array of numbers, arranged in the form

    ( a11 a12 … a1n )
A = ( a21 a22 … a2n )
    (  ⋮   ⋮       ⋮ )
    ( ap1 ap2 … apn )

with p rows and n columns. The numbers ajk are called matrix elements. Notice that ajk is in the j:th row and k:th column. A briefer notation, meaning the same matrix A, is

A = (ajk), 1 ≤ j ≤ p, 1 ≤ k ≤ n.

In the special case when p = n we say that A is a square matrix of order n.

Operations with matrices. Let A = (ajk) and B = (bjk) be two p × n matrices. We define A + B to be the p × n matrix with entries ajk + bjk, i.e.,

A + B = (ajk + bjk).
Example 1.

( 2 3 −1 )   ( 3 −2 1 )   ( 5 1 0 )
( 4 2  1 ) + ( 2  1 2 ) = ( 6 3 3 ).

For a scalar t we define tA as the matrix with entries tajk.

Example 2.

    ( 2 3 −1 )   ( 4 6 −2 )
2 · ( 4 2  1 ) = ( 8 4  2 ).
The definition of the product of two matrices is less obvious. In order to find a reasonable definition, let us consider a linear relation

(1)  a11x1 + a12x2 + … + a1nxn = y1
     a21x1 + a22x2 + … + a2nxn = y2
     …
     ap1x1 + ap2x2 + … + apnxn = yp.

If the numbers y1, …, yp are given, then (1) is a linear system for the unknowns x1, …, xn. On the other hand, if x1, …, xn are given, then (1) gives us the values of y1, …, yp. That is, the quantities y1, …, yp can via (1) be regarded as functions of x1, …, xn. In order to define matrix multiplication, we shall adopt this latter point of view: we regard (1) as a recipe for a function.

Now suppose that we have another set of variables z1, …, zq which depend on y1, …, yp in a similar way,

(2)  b11y1 + b12y2 + … + b1pyp = z1
     b21y1 + b22y2 + … + b2pyp = z2
     …
     bq1y1 + bq2y2 + … + bqpyp = zq.

If we here substitute y1, …, yp by the corresponding left hand sides in (1), we get a relation of the form

(3)  c11x1 + c12x2 + … + c1nxn = z1
     c21x1 + c22x2 + … + c2nxn = z2
     …
     cq1x1 + cq2x2 + … + cqnxn = zq,

where

(4)  cjk = bj1a1k + bj2a2k + ⋯ + bjpapk.
Now define two matrices by

    ( a11 a12 … a1n )          ( b11 b12 … b1p )
A = ( a21 a22 … a2n )  and B = ( b21 b22 … b2p )
    (  ⋮             )          (  ⋮             )
    ( ap1 ap2 … apn )          ( bq1 bq2 … bqp )

We call these matrices the coefficient matrices of the linear equation systems (1) resp. (2). We define the product BA to be the q × n matrix C = (cjk) where the cjk are given by (4). Thus the element in
position ( j, k) in BA is obtained by pairwise multiplication of the
elements of row j in B with the elements in column k in A, followed
by summation.
Observe that the matrix product BA is defined only if the number of columns of B equals the number of rows of A.
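The recipe — row j of B against column k of A, then sum — translates directly into code; a minimal Python sketch (the helper name matmul is ours):

```python
def matmul(B, A):
    """Product BA of a q x p matrix B and a p x n matrix A, given as
    nested lists. Entry (j, k) is row j of B paired against column k
    of A and summed -- formula (4)."""
    q, p, n = len(B), len(A), len(A[0])
    assert all(len(row) == p for row in B), "columns of B must match rows of A"
    return [[sum(B[j][m] * A[m][k] for m in range(p)) for k in range(n)]
            for j in range(q)]
```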
Example 1.

( 1 2 ) ( 1 0 5 )   ( 1·1+2·3  1·0+2·6  1·5+2·7 )   (  7 12 19 )
( 3 4 ) ( 3 6 7 ) = ( 3·1+4·3  3·0+4·6  3·5+4·7 ) = ( 15 24 43 ).

Example 2.

            ( 3 )
( 1 2 3 ) · ( 2 ) = ( 10 ).
            ( 1 )

Example 3.

( 3 )               ( 3 6 9 )
( 2 ) · ( 1 2 3 ) = ( 2 4 6 ).
( 1 )               ( 1 2 3 )

Example 4.

( 4 −2 ) (  1 −2 )   ( 8 −16 )
( 2 −1 ) ( −2  4 ) = ( 4  −8 ),

(  1 −2 ) ( 4 −2 )   ( 0 0 )
( −2  4 ) ( 2 −1 ) = ( 0 0 ).
The last example shows that the order of the factors is essential in matrix multiplication. In other words, matrix multiplication is non-commutative: it is possible (and very common) to have AB ≠ BA.
In Example 1, B is a 2 × 2 matrix and A is a 2 × 3 matrix. The product BA is therefore a 2 × 3 matrix. The product AB is not defined in this case. In order that both AB and BA be defined, it is necessary and sufficient that A be an n × p and B a p × n matrix (with the same n and p). Then AB is an n × n matrix and BA is p × p. This is illustrated by Examples 2 and 3.
Definition 18. The square n × n-matrix

         ( 1 0 … 0 )
E = En = ( 0 1 … 0 )
         ( ⋮  ⋮    ⋮ )
         ( 0 0 … 1 )

is called the identity matrix of order n.

Notice that E is the neutral element for matrix multiplication, i.e. we have

EA = AE = A

for all n × n matrices A.
While matrix multiplication fails to be commutative, it obeys the
other rules of calculation.
Theorem 19. Matrix multiplication obeys the associative law

(5)  C(BA) = (CB)A

and the distributive laws

(6)  B(A + A′) = BA + BA′,  (B + B′)A = BA + B′A.

(We here assume that the dimensions of the matrices are such that the sums and products make sense.)
Proof. The formulas can easily be verified by direct evaluation.
Nonetheless, we shall give an alternative argument for the associative
law (5).
Consider the relation (1) as a function F : Rn → Rp , which to
each n-tuple (x1 , . . . , xn ) ∈ Rn associates a p-tuple (y1 , . . . , yp ) ∈ Rp .
Likewise (2) can be regarded as a function G from Rp to Rq , which
to (y1 , . . . , yp ) associates (z1 , . . . , zq ). The matrix product BA will then
correspond to the composite function G ◦ F, and (5) follows from the
associate law from composition of functions:
H ◦ (G ◦ F) = (H ◦ G) ◦ F.
We shall in this course only be concerned with matrices whose
entries are real numbers. Nonetheless, we want to mention that
matrices with complex entries can be handled in the same way, as in
the following example.
Example 5. Consider the three matrices

I = ( i  0 ),  J = (  0 1 ),  K = ( 0 i ),
    ( 0 −i )       ( −1 0 )       ( i 0 )

where i is the imaginary unit. If these are multiplied by −i, one obtains the famous Pauli matrices; these were used by Paul Dirac in 1928 to formulate an equation for the electron.

It is easy to check that

I² = J² = K² = −E,

where E is the identity matrix of order 2, and that

IJ = −JI = K,  JK = −KJ = I,  KI = −IK = J.

If this is compared with the formulas for multiplication of quaternions in the preceding chapter, one realizes that Hamilton's quaternions x0 + x1i + x2j + x3k can be identified with the set of complex 2 × 2 matrices of the form

x0E + x1I + x2J + x3K = (  x0 + ix1   x2 + ix3 ).
                        ( −x2 + ix3   x0 − ix1 )

The computational rules for quaternions can then be seen as special cases of the rules for matrix multiplication.
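The identities of Example 5 can be verified mechanically with Python's built-in complex numbers (the helper mmul is ours):

```python
def mmul(X, Y):
    """Product of two complex 2 x 2 matrices given as nested lists."""
    return [[sum(X[r][m] * Y[m][c] for m in range(2)) for c in range(2)]
            for r in range(2)]

# The matrices of Example 5; 1j is Python's imaginary unit.
I = [[1j, 0], [0, -1j]]
J = [[0, 1], [-1, 0]]
K = [[0, 1j], [1j, 0]]
negE = [[-1, 0], [0, -1]]
```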
Before we close this section, we define a new matrix operation called transposition. If A is a p × n matrix, then the transpose Aᵗ of A is defined as the n × p matrix whose rows are the columns of A:

       ( a11 a12 … a1n )            ( a11 a21 … ap1 )
If A = ( a21 a22 … a2n )  then Aᵗ = ( a12 a22 … ap2 )
       (  ⋮             )           (  ⋮             )
       ( ap1 ap2 … apn )            ( a1n a2n … apn )

Transposition satisfies the following computational rules (proofs are left as exercises for the reader):

(A + B)ᵗ = Aᵗ + Bᵗ,  (AB)ᵗ = BᵗAᵗ.

Notice that the last rule says that transposition reverses the order of a matrix product.
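The rule (AB)ᵗ = BᵗAᵗ can be spot-checked numerically (the helper names are ours):

```python
def transpose(A):
    """Rows of the transpose are the columns of A."""
    return [list(col) for col in zip(*A)]

def matmul(X, Y):
    # Product XY of compatible matrices given as nested lists.
    return [[sum(X[j][m] * Y[m][k] for m in range(len(Y)))
             for k in range(len(Y[0]))] for j in range(len(X))]

A = [[1, 2], [3, 4]]        # an arbitrary 2 x 2 example
B = [[1, 0, 5], [3, 6, 7]]  # an arbitrary 2 x 3 example
```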
Exercises.

1. Let

    ( 1 0  2 )       ( 0  1 1 )       (  2 1 )
A = ( 0 3  1 ),  B = ( 2 −2 0 ),  C = ( −1 1 ).
    ( 2 2 −1 )       ( 1  2 3 )       (  1 2 )
Compute

a) AB  b) BA  c) AᵗBᵗ  d) (A + 3B)C  e) CCᵗ  f) CᵗC.

2. Let

A = (  1 1 )  and  B = ( 1 −2 ).
    ( −1 1 )           ( 3  4 )

Determine: a) A² − B², b) (A + B)(A − B). Why are the answers different in a) and b)?
3. Let

A = (  1 −3 ).
    ( −3  9 )

Find all 2 × 2 matrices B such that AB = BA = 0. Here 0 denotes the 2 × 2 zero-matrix, i.e. the matrix all of whose entries equal zero.
4. Denote by Aᵏ the product of a matrix A by itself k times.
a) Prove that if AB = BA, then we have the binomial expansion

(A + B)ᵏ = Aᵏ + kAᵏ⁻¹B + (k choose 2)Aᵏ⁻²B² + … + Bᵏ.

b) Compute (I + A)¹⁰ where I is the identity matrix and

    ( 0 5 3 )
A = ( 0 0 3 ).
    ( 0 0 0 )

5. Show that, in order to verify all statements in Example 5, it suffices to prove that

I² = J² = K² = IJK = −E.
2. Matrix inverse

Let

    ( a11 a12 … a1n )       ( x1 )       ( y1 )
A = ( a21 a22 … a2n ),  x = ( x2 ),  y = ( y2 ).
    (  ⋮             )      (  ⋮ )       (  ⋮ )
    ( ap1 ap2 … apn )       ( xn )       ( yp )

The linear equation system (1) can then be written in the matrix form

(7)  Ax = y.

We shall here discuss (7) in the important case when n = p, i.e., when A is a square matrix of order n.
If n = 1, the system (7) reduces to a single equation

ax = y.

If a ≠ 0 this equation can be solved by multiplication with a⁻¹ = 1/a:

x = a⁻¹y.

There is a counterpart to this procedure also when n > 1.
Definition 20. Let A be an n × n-matrix. We say that A is invertible if there is an n × n-matrix B such that

AB = E  and  BA = E.

In this case, B is called an inverse of A.
Remark 21. If A is invertible, then the inverse is unique. We can thus speak of the inverse and write B = A⁻¹. To see this, assume that there are two matrices B and C which are inverses of A. Then BA = E and AC = E, so

B = BE = B(AC) = (BA)C = EC = C.

The uniqueness is proved.
Example 1. If

A = ( 1 2 ),  B = ( −3  2 ),
    ( 2 3 )       (  2 −1 )

then by direct calculation, AB = BA = E. Thus A is invertible and B = A⁻¹. For the same reason, B is invertible and A = B⁻¹.
Now suppose that A is an invertible matrix. The linear system
Ax = y
can then be multiplied by A⁻¹ from the left, giving

x = Ex = (A⁻¹A)x = A⁻¹(Ax) = A⁻¹y.

If the system has a solution x, we must thus have x = A⁻¹y. That this really is a solution follows from

A(A⁻¹y) = (AA⁻¹)y = Ey = y.
We have proved one direction of the following theorem.
Theorem 22. A square matrix A is invertible if and only if the linear equation system Ax = y has a unique solution x for all right hand sides y. If this is the case, the solution is given by x = A⁻¹y.
Remark 23. In the proof, we shall use the following property of
matrix multiplication: If C is a square matrix with columns C1 , C2 , . . . , Cn ,
then the matrix AC has columns AC1 , AC2 , . . . , ACn . The (simple) verification of this fact is left as an exercise for the interested reader.
Proof of Theorem 22. It remains to prove that if the system Ax =
y has a unique solution for all possible right hand sides y, then A is
invertible.
Let C and D be two square matrices with columns C1 , C2 , . . . , Cn
resp. D1 , D2 , . . . , Dn . By Remark 23, the matrix identity
(8)
AC = D
is equivalent to the n vector identities
ACk = Dk ,
k = 1, . . . , n.
Hence if the system Ax = y has a unique solution for all y, there is
precisely one n × n matrix D satisfying (8). In particular there is a
unique n × n matrix B such that
(9)
AB = E.
In order to show that A is invertible, we shall show that also BA = E.
But by (9) we have
A(BA) = (AB)A = EA = A.
The matrix C = BA thus satisfies the equation
AC = A.
This last equation is also satisfied by C = E. Since (8) has precisely one solution C for every right hand side D, we must then have BA = E. □

The following example shows how one can calculate inverse matrices in practice.
Example 2. To determine whether the matrix

    ( 1 1 1 )
A = ( 1 2 3 )
    ( 1 3 2 )
is invertible, we try to solve the system Ax = y for an arbitrary right
hand side y:



x1 + x2 + x3 = y1



x1 + 2x2 + 3x3 = y2




x1 + 3x2 + 2x3 = y3



x1 + x2 + x3 = y1



∼
x2 + 2x3 = −y1 + y2




x2 − x3 =
−y2 + y3
... ∼



3x1
= 5y1 − y2 − y3



∼
3x2
= −y1 − y2 + 2y3




3x3 = −y1 + 2y2 − y3
∼
.
We see that the system has a unique solution x for each right-hand
side y, so the matrix A is invertible. The last system also shows that

    A⁻¹ = (1/3) [  5  −1  −1 ]
                [ −1  −1   2 ]
                [ −1   2  −1 ] .
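The elimination carried out in Example 2 is easy to mechanize: perform Gauss–Jordan elimination on the augmented matrix [A | E]. A sketch in Python with exact rational arithmetic (the function `inverse` is our own illustration, not code from the notes):

```python
from fractions import Fraction

def inverse(A):
    """Invert A by Gauss-Jordan elimination on the augmented matrix [A | E]."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(i == j) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)  # fails if singular
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]       # scale pivot row to 1
        for r in range(n):
            if r != col and M[r][col] != 0:              # clear rest of the column
                M[r] = [a - M[r][col] * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

A = [[1, 1, 1], [1, 2, 3], [1, 3, 2]]
A_inv = inverse(A)

# Multiply every entry by 3 to compare with A^(-1) = (1/3)[...] from Example 2.
print([[int(3 * x) for x in row] for row in A_inv])
# [[5, -1, -1], [-1, -1, 2], [-1, 2, -1]]
```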
Computational rules for the inverse. If both of the matrices A
and B are invertible, then the product AB is also invertible, and

    (AB)⁻¹ = B⁻¹A⁻¹.

The order of the factors is thus reversed under inversion. To see this,
observe that if A and B are invertible, then the matrix D = B⁻¹A⁻¹
satisfies

    D(AB) = B⁻¹(A⁻¹A)B = B⁻¹EB = B⁻¹B = E,

and similarly

    (AB)D = E.

Thus AB is invertible with inverse D.

Finally, we leave it to the reader to check that if A is invertible,
then Aᵗ is invertible and

    (Aᵗ)⁻¹ = (A⁻¹)ᵗ.
Exercises.

6. Determine which matrices are invertible. Also determine the
inverse matrix when it exists.

    a) [ 1  2  3 ]      b) [  1  1  2 ]      c) [ 1  0  1 ]
       [ 0  1  2 ]         [  2  1  1 ]         [ 0  1  1 ]
       [ 0  0  1 ]         [ −1  1  4 ]         [ 1  1  0 ] .
7. Let

    A = [ 1  0  a ]
        [ 0 −1  1 ]
        [ 1  1  0 ] .

Calculate A⁻¹ for those values of a for which A is invertible.
8. Find the inverse matrices of A and of A² where

    A = [ 1  2  3 ]
        [ 2  3  1 ]
        [ 1  1  1 ] .
9. Find a matrix X which solves the matrix equation AXB = C,
where

    A = [ 1  1 ]      B = [ 1  2  3 ]      C = [ 1  2  1 ]
        [ 1  2 ],         [ 0  1  2 ],         [ 2  1  2 ] .
                          [ 0  0  1 ]
10. Let A and B be two n × n matrices such that E − AB is invertible.
Prove that E − BA is invertible and that

    (E − BA)⁻¹ = E + B(E − AB)⁻¹A.
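Before proving the identity in Exercise 10, it can be sanity-checked numerically. A Python sketch with a hand-picked pair A, B (all matrices and helpers below are our own illustration):

```python
def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def matadd(X, Y):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def matsub(X, Y):
    """Entrywise difference of two matrices."""
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

E = [[1, 0], [0, 1]]
A = [[1, 1], [0, 1]]
B = [[1, 0], [1, 1]]

# Inverse of E - AB, found by hand for this particular choice of A and B.
E_AB_inv = [[0, -1], [-1, 1]]

candidate = matadd(E, matmul(matmul(B, E_AB_inv), A))   # E + B(E - AB)^(-1) A
print(matmul(matsub(E, matmul(B, A)), candidate) == E)  # True: it inverts E - BA
```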
3. Answers to Exercises





1. a) [ 2  5  0 ]    b) [ 2  2  5 ]    c) [ 7 ]
      [ 7 −4  3 ]       [ 0 −6  2 ]       [ 5 ]
      [ 3 −4 −1 ]       [ 7 12  1 ]       [ 0 ]

   d) [  4 14 ]      e) [  5 −1  4 ]    f) [ 6 ]
      [ 16  5 ]         [ −1  2  1 ]       [ 3 ] .
      [ 10 29 ]         [  4  1  5 ]

2. a) [   5  12 ]    b) [   4  9 ]
      [ −17 −10 ]       [ −20 −9 ] .

3. [ 9t 3t ]
   [ 3t  t ] ,   t ∈ R.

4. b) (E + A)¹⁰ = [ 1 50 705 ]
                  [ 0  1  30 ] .
                  [ 0  0   1 ]
6. a) [ 1 −2  1 ]    b) Not invertible.    c) (1/2) [  1 −1  1 ]
      [ 0  1 −2 ]                                   [ −1  1  1 ]
      [ 0  0  1 ]                                   [  1  1 −1 ] .

7. A⁻¹ = (1/(1 − a)) [  1 −a −a ]
                     [ −1  a  1 ]    for a ≠ 1.
                     [ −1  1  1 ]

8. A⁻¹ = (1/3) [ −2 −1  7 ]     (A²)⁻¹ = (1/9) [ 10 −7 −2 ]
               [  1  2 −5 ]                    [ −5  8 −8 ]
               [  1 −1  1 ] ,                  [ −2 −4 13 ] .

9. X = A⁻¹CB⁻¹ = [ 0  3 −6 ]
                 [ 1 −3  4 ] .