Download M1GLA: Geometry and Linear Algebra Lecture Notes

Document related concepts

Linear least squares (mathematics) wikipedia , lookup

Symmetric cone wikipedia , lookup

Rotation matrix wikipedia , lookup

Laplace–Runge–Lenz vector wikipedia , lookup

Euclidean vector wikipedia , lookup

Determinant wikipedia , lookup

Matrix (mathematics) wikipedia , lookup

Non-negative matrix factorization wikipedia , lookup

Exterior algebra wikipedia , lookup

Vector space wikipedia , lookup

Singular-value decomposition wikipedia , lookup

Covariance and contravariance of vectors wikipedia , lookup

Jordan normal form wikipedia , lookup

Orthogonal matrix wikipedia , lookup

Perron–Frobenius theorem wikipedia , lookup

Gaussian elimination wikipedia , lookup

Cayley–Hamilton theorem wikipedia , lookup

System of linear equations wikipedia , lookup

Eigenvalues and eigenvectors wikipedia , lookup

Matrix multiplication wikipedia , lookup

Four-vector wikipedia , lookup

Matrix calculus wikipedia , lookup

Transcript
Geometry & Linear Algebra
Lectured in Autumn 2014 at Imperial College by Prof. A. N. Skorobogatov.
Humbly typed by Karim Bacchus.
Caveat Lector: unofficial notes. Comments and corrections should be sent to
[email protected]. Other notes available at wwwf.imperial.ac.uk/∼kb514.
Syllabus
This course details how Euclidean geometry can be developed in the framework
of vector algebra, and how results in algebra can be illuminated by geometrical
interpretation.
Geometry in R2 : Vectors, lines, triangles, Cauchy-Schwarz inequality, Triangle
inequality.
Matrices and linear equations: Gaussian elimination, matrix algebra, inverses,
determinants of 2 × 2 and 3 × 3 matrices. Eigenvalues and eigenvectors, diagonalisation of matrices and applications. Conics and quadrics: Matrix methods,
orthogonal matrices and diagonalisation of symmetric matrices.
Geometry in R3 : Lines, planes, vector product, relations with distances, areas
and volumes.
Vector spaces: Axioms, examples, linear independence and span, bases and
dimension, subspaces.
Appropriate books
J. Fraleigh and R. Beauregard, Linear Algebra
S. Lipschutz and M. Lipson, Linear Algebra
Contents
0 Introduction
3
1 Geometry in R2
1.1 Lines in R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.2 Triangles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1.3 Angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
4
4
8
10
2 Matrices
12
2.1 System of Linear Equations . . . . . . . . . . . . . . . . . . . . 12
2.2 Gaussian Elimination . . . . . . . . . . . . . . . . . . . . . . . . 13
2.3 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3 Matrix Algebra
19
3.1 Square matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
4 Applications of Matrices
26
4.1 Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.2 Markov Chains . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5 Inverses
29
5.1 Transpose . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
6 Determinants
33
6.1 3 × 3 determinants . . . . . . . . . . . . . . . . . . . . . . . . . 33
7 Eigenvalues and Eigenvectors
37
7.1 Diagonalisation of Matrices . . . . . . . . . . . . . . . . . . . . 39
8 Conics and Quadrics
8.1 Standard Forms of Conics . .
8.2 Reducing with Trigonometry
8.3 Reducing with Eigenvectors .
8.4 Quadric Surfaces . . . . . . .
8.5 Reducing Quadrics . . . . . .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
46
46
48
51
56
58
9 Geometry in R3
61
9.1 Planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
9.2 Rotations in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
10 Vector spaces
68
10.1 Subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
10.2 * Error correcting codes * . . . . . . . . . . . . . . . . . . . . . 79
0
Introduction
Geometry & Linear Algebra
0 Introduction
Geometry Greek approach
Lecture 0
• Undefined objects (points, lines)
• Axioms, e.g. through any two distant points in a plane there is exactly
one line.
• Using logic you deduce theorems
• Proof
Contractibility
Using only a ruler and a compass e.g. perpendicular bisector, 60◦ (... can you
construct 20◦ ? )
What can we construct using a ruler and compass, once we are given a unit
length?
• Rational numbers are constructable
• So are square roots
1
√
2
1
√
But can you construct 3 2?
Modern Approach: Cartesian (Rene Descartes)
Plane ←→ the set of points (x1 , x2 ), x1 , x2 ∈ R
Line ←→ a set of points satisfying ax1 + bx2 = c for some given a, b, c ∈ R
e.g. a circle x21 + x22 = r
Space ←→ {(x1 , x2 , x3 ) x1 , x2 , x3 ∈ R}
Planes ←→ {ax1 + bx2 + cx3 = d}
Lines ←→ {ax1 + bx2 + cx3 = d}
Advantages:
(i) Can use Analysis
(ii) Can do any dimension.
(iii) Can replace R by C or by binary fields, i.e. {0, 1} with binary operations
1 + 1 = 0.
3
1
Geometry in R2
Geometry & Linear Algebra
1 Geometry in R2
R is the set of real numbers:
Lecture 1
− 32
0
1
√
2
π
This has the properties of being:
• Commutative: ab = ba
• Associative: a + (b + c) = (a + b) + c
• Distributive: c(a + b) = ca + cb
• If a < b, c > 0 then ca < cb
Definition. R2 = the set of ordered pairs (a, b) where a, b ∈ R i.e. (a, b) ̸=
(b, a) in general. Elements of R2 are points, or vectors. 0 = (0, 0) = origin
of R2 .
A scalar is an element of R.
Properties of R2 :
(i) Addition (sum): (a, b) + (a′ , b′ ) = (a + a′ , b + b′ )
(a + a′ , b + b′ )
R
(a′ , b′ )
(a, b)
R
(ii) Scalar Multiplication: If λ = R, (a, b) ∈ R2 , then
λ · (a, b) = (λa, λb)
v ∈ R2 =⇒ λv ∈ R2
Exercise: Check λ(v1 + v2 ) = λv1 + λv2 where λ ∈ R, v1 , v2 ∈ R2 .
1.1 Lines in R2
Definition. L is a line in R2 if ∃u, v ∈ R2 v ̸= 0 with
L = {u + λv | λ ∈ R}
So if u = (a1 , b1 ), v = (a2 , b2 ) then L = {(a1 + λb1 , a2 + λb2 ) | λ ∈ R}.
4
1
Geometry in R2
Geometry & Linear Algebra
Example 1.1.
y
x line is {(0, 0) + λ(1, 0) | λ ∈ R}
L
y line is {(0, 0) + λ(0, 1) | λ ∈ R}
x
We draw the line
L = {x + y = 1}
= {(1, 0) + λ(−1, 1) | λ ∈ R}.
Note that v is the vector for which L is parallel to.
Checking the components
x = 1 + (−1)λ = 1 − λ
y =0+1·λ=λ
}
x+y =1
Assume L, M are two lines in R2 , with
L = {u = λv | λ ∈ R}, v ̸= 0
M = {a + µb | µ ∈ R}, b ̸= 0
Proposition 1.2. The two lines are the same (L = M ) if and only if the
following holds:
(i) v = αb for some α ∈ R
(ii) L ∩ M ̸= ∅ (L and M have a point in common)
Proof. ( =⇒ )
(i) Assume that L = M . We know that
u ∈ L =⇒ u ∈ M
=⇒ u = a + µb for some µ ∈ R
Also
u + v = L (λ = v) =⇒ u + v ∈ M
=⇒ u + v = a + µ1 b for some µ1 ∈ R
=⇒ v = (a + µ1 b) − (a − µb) = (µ1 − µ)b = αb.
(ii) Since L = m, surely L ∩ M = ∅.
5
Geometry in R2
1
Geometry & Linear Algebra
( ⇐= ) Assume v = αb =⇒ L = {u + λαb | λ ∈ R}. We also know that
L ∩ M ̸= ∅ =⇒ ∃c ∈ L ∩ M
=⇒ c ∈ L
=⇒ c = u + λ0 αb, for some λ0 ∈ R.
Then also
c ∈ M =⇒ c = a + µb, for some µ ∈ R
=⇒ u + λ0 αb = a + µb
=⇒ u = a + µb − λ0 αb = a + (µ − λ0 α)b
(∗)
Let’s consider a point inside L:
L = u + λαb
= a + (µ − λ0 α)b + λαb
(using ∗)
= a + (µ − λ0 α + λα)b ∈ M
So any point in L is inside M , so L ⊆ M . By symmetry M ⊆ L =⇒ L =
M.
”
Parallel Lines
Let
L = {u = λv | λ ∈ R}, v ̸= 0
M = {a + µb | µ ∈ R}, b ̸= 0
Definition. L and M are parallel if ∃ α ∈ R such that b = αv.
Proposition 1.3. Let x, y be two distinct points of R2 . Then there exists a
unique line which contains both x and y.
Proof. (Existence) Choose u − x, v = y − x. Then let
L = {x + λ(y − x) | λ ∈ R}
N.B. v = y − x ̸= 0. We want to show that x, y ∈ l. Chose λ = 0, x + λ(y =
x) = x ∈ L. Also choose λ = 1, x + λy − λx = y ∈ L. So x, y are contained in
a line.
(Uniqueness) L = {u + lv | l ∈ R}. L is a line through x and y. x ∈ L =⇒
x = u + l1 v for l1 ∈ R. y ∈ L =⇒ y = u + l2 v for l2 ∈ R.
y − x = (l2 − l1 )v =⇒ v =
y−x
(l2 − l1 )
The vector that defined the direction of L is proportional to the vector y − x.
By the question we answered before, it’s enough to show that the two lines
have a point in common. x, y are points in common =⇒ Two lines are the
same.
”
6
1
Geometry in R2
Problem: Construct
Geometry & Linear Algebra
√
x.
Lecture 2
C
α
Claim: |CD| =
√
β
1
A
|AD| = 1
|BD| = x
D
x
B
x.
Proof. The angle ACD = 90◦ , since AB is the diameter of a circle =⇒
α + β = 90◦ . Note △ABC is similar to △DBC, since CAD = DCB =
α, ACD = CBD = 90◦ and CBA = CBD = β. So we have a proportion:
|BD|
|CD|
=
|CD|
|AD|
√
x
|CD|
=
=⇒ |CD| = x
|CD|
1
”
Recall: The plane R2 is the set of vectors (x1 , x2 ), xi ∈ R.
For lines
L = {u = λv | λ ∈ R}, v ̸= 0
M = {a + µb | µ ∈ R}, b ̸= 0
L || M ⇐⇒ v and b are proportional.
L = M ⇐⇒
(L ∩ M ̸= ∅).
v and b are proportional and they have a common point
Proposition 1.4. Through any two points, there is a unique line.
Proposition 1.5. Any two non-parallel lines meet in a unique point.
Proof. See exercise sheet.
Example 1.6. Let
L = {(0, 1) + λ(1, 1)}
M = {(4, 0) + µ(2, 1)}
The common point = (0, 1) + λ(1, 1) = (4, 0) + µ(2, 1).
Equating x and y’s, we see from the first co-ordinate: λ = 4 + 4µ. From the
second co-ordinate: 1 + λ = µ.
7
1
Geometry in R2
Geometry & Linear Algebra
So
λ = 4 + 2(λ + 1)
=⇒ λ = −6
Therefore the common point is (−6, −5).
1.2 Triangles
Definition. A triangle is a set of 3 points in R2 that are not on a common
line.
c
b+c
2
a+b
2
a+c
2
a
b
Definition (Midpoint). The midpoint of ab is the vector 21 (a + b).
Definition (Median). A median is the line joining a corner to the midpoint
of the opposite side.
Proposition 1.7. The three medians of the triangle meet in a common point.
Proof. The medians are
La = {a + λ( b+c
2 − a)}
a+c
Lb = {b + λ( 2 − b)}
Lc = {c + λ( a+b
2 − c)}
We can write them as
La = {(1 − λ)a + λ2 b + λ2 c}
Lb = {(1 − µ)b + µ2 a + µ2 c}
Take λ = µ, then
La = Lb .
Check: λ =
La ∩ Lb ∩ Lc .
2
3
µ
2c
=⇒
=
λ
2 c.
Also
λ
2
= 1 − µ = 1 − λ =⇒ λ =
La = Lb . So the common point is
1
3 (a
Hence
+ b + c) ∈
”
Definition (Distance). The length of
√a vector x = (x1 , x2 ) is
We denote the length of x by ||x|| = x21 + x22 .
8
2
3.
√
x21 + x22 .
1
Geometry in R2
Geometry & Linear Algebra
The distance between two points x, y ∈ R2 is
√
||x − y|| = (x1 − y1 )2 + (x2 − y2 )2
Definition (Scalar product). The scalar product (or dot product) of two
vectors x, y ∈ R2 is
(x · y) = x1 y1 + x2 y2
E.g. If x = (1, −1), y = (1, 2) then (x · y) = 1 + (−2) = −1.
Easy properties: For any vectors x, y, z ∈ R2
• x · (y + z) = x · y + x · z (distribution over +)
• ||x||2 = (x · x) = x21 + x22
• (x · y) = (y · x) (commutativity)
Proposition 1.8 (Cauchy-Schwartz Inequality). For x, y ∈ R2
|(x · y)| ≤ ||x|| · ||y||
With equality iff x, y are multiples of the same vector.
Proof. w.l.o.g. y ̸= (0, 0). Consider all vectors of the form x − λy. The length
||x − λy|| squared is clearly ≥ 0, so
||x − λy||2 = (x − λy · x − λy)
= (x · x) − 2λ(x · y) + λ2 (y · y)
= ||x||2 − 2λ(x · y) + λ2 ||y|| ≥ 0
This is a quadratic polynomial in λ with real coefficients and at most one real
root (= 0). Now
λ2 ||y||2 − 2λ(x · y) + ||x||2 ≥ 0 =⇒ discriminant, D ≤ 0
=⇒ ((2(x · y))2 ≤ 4||x||2 ||y||2
=⇒ 4(x · y)2 ≤ 4||x||2 ||y||2
=⇒ |x · y| ≤ ||x||||y||
The equality condition is clear for y = 0. Assume that y is not the zero vector.
If |(x · y)|2 = ||x|| · ||y||, then D = 0.
D = 4(x · y)2 − 4||x||2 · ||y||2 = 0
Then aλ2 +bλ+c has a double root, say λ0 . But aλ20 +bλ0 +c = ||x−λ0 y||2 = 0.
Therefore x − λ0 y is the zero vector. Thus x = λ0 y.
”
Proposition 1.9 (Triangle Inequality). For any vectors x, y ∈ R2 , we have
||x + y|| ≤ ||x|| + ||y||
If equality holds, then x and y are multiples of the same vector.
9
Lecture 3
1
Geometry in R2
Geometry & Linear Algebra
x+y
y
x
Proof. It is enough to prove that the square on LHS ≤ RHS, i.e.
||x + y||2 ≤ ||x||2 + ||y||2 + 2||x|| · ||y||
We’ve already seen that
||x + y||2 = (x + y · x + y) = ||x||2 + ||y||2 + 2(x · y)
By Cauchy-Schwartz (x · y) ≤ ||x|| · ||y||, hence the inequality is proved. To
prove the equality condition, we note that the triangle inequality is an equality
precisely when (x · y) = ||x|| + ||y||. By Theorem 1.4, we conclude that x and
y are multiples of the same vector.
”
Recall the scalar product of x as ||x||2 = (x · x). The scalar product can be
reconstructed if we only know the lengths of vectors:
Proposition 1.10. For any two vectors x, y ∈ R2 , we have
(x · y) =
1
(||x||2 + ||y||2 − ||x − y||2 )
2
Proof.
1
(||x||2 + ||y||2 − ||x||2 − ||y||2 + 2(x · y)
2
= (x · y)
RHS =
”
1.3 Angles
From the Cauchy-Schwartz inequality, we find that −1 ≤
(x · y)
≤ 1.
||x|| · ||y||
Definition. We define θ as the angle such that cos θ =
y
θ
x
10
(x · y)
||x|| · ||y||
1
Geometry in R2
Geometry & Linear Algebra
In particular, x and y are perpendicular iff (x · y) = 0.
Lines in R2 can be written as L = {u + λv | λ ∈ R}. This will be referred
to as vector form. Lines can also be described by their Cartesian equation:
px1 + qx2 + r = 0, where p1 , q1 ∈ R.
Definition. Any vector perpendicular to the direction vector of a line is
called a normal vector to L.
n
L
If L = {u + λv}, v = (v1 , v2 ), then n = (−v2 , v1 ) is a normal vector to L.
Remark. Let n ∈ R2 be any (non-zero) normal vector to L = {u + λv}. Then
all points x ∈ L have the property that (x · n) is constant. Indeed, x = u + λv,
hence
(x · n) = ((u + λv) · n) = (u · n) + λ(v · n) = u · n.
| {z }
=0
Therefore every point x of L satisfies the equation n1 x1 + n2 x2 = c, where c
is a constant (c = u1 n1 +u2 n2 ). L can be given by −v2 x1 +v1 x2 = c for some c.
From Cartesian to Vector form
Start with L which is the zero set of px1 + qx2 + r = 0. Now (p, q) is a normal
vector to L, hence v = (q, −p) is a possible direction vector of L. ((−q, p) is
another possibility.) Then L = {u + λv | λ ∈ R} for an appropriate vector of
a. px1 + qx2 + r = 0. ((p, q) · x) = constant. (this constant is −r).
The lines L = {u+λv} and M = {a+µb} are perpendicular iff (v ·b) = 0. The
lines px1 + qx2 + r = 0 and ex1 + f x2 + g = 0 are perpendicular iff the normal
vectors to these lines e.g. (p, q) and (e, f ) are perpendicular. This happens
when (p, q) · (e, f ) = 0.
Claim: The vector (p, q) is a normal vector to L, given by px1 + qx2 + r = 0.
Proof. We must show that (p, q) is perpendicular to (x − y), where x ∈ L and
y ∈ L.
Consider
y
x
px1 + qx2 + r = 0
(1)
py1 + qy2 + r = 0
(2)
(1) − (2) : p(x1 − y1 ) + q(x2 − y2 ) = 0
=⇒ (p, q) · (x1 − y1 , x2 − y2 ) = 0
Hence (p, q) · (x − y) = 0. So that (p, q) is perpendicular to (x − y).
11
”
Lecture 4
2
Matrices
Geometry & Linear Algebra
2 Matrices
Definition. Let n be a natural number, n = 1, 2, 3, . . . . Then Rn is the
set of all vectors with n co-ordinates (x1 , . . . , xn ).
E.g. R is the real line. R2 is the real plane. R3 is 3-dimensional space etc.
Definition (Vector Operations).
Vector addition:
(x1 , x2 , . . . , xn ) + (y1 , y2 , . . . , yn )
= (x1 + y1 , x2 + y2 , . . . , xn + yn )
Multiplication by scalar α ∈ R:
α(x1 , . . . , xn ) = (αx1 , . . . , αxn )
Dot product of two vectors:
(x1 , . . . , xn ) · (y1 , . . . , yn ) = (x1 y1 + x2 y2 + . . . xn yn )
2.1 System of Linear Equations
A linear equation of n vairables is of the form a1 x1 + . . . an xn = b, where
a1 , . . . , an , b ∈ R. We want to solve systems of such linear equations.
Example 2.1.
x1 + x2 = 3
−2x1 + x2 = 1
(1)
(2)
2∗(1)+(2) elimated x, leaving 3x2 = 7 =⇒ x2 = 73 . Then x1 = 3−x2 = 23 .
Example 2.2.
x1 − 2x2 = 2
(1)
3x1 − 6x2 = 4
(2)
−3 × (1) + 2 is 0 = −2. So this system has no solutions.
Example 2.3.
x1 + x2 = 0
x2 − x3 = 0
(1)
(2)
x1 − x2 + 2x3 = 0
(3)
12
2
Matrices
Geometry & Linear Algebra
Observe (3) = (1) − 2 × (2). So (1) and (2) are enough here, so (3) is
automatically satisfied if (1) and (2) are satisfied. So x3 = a ∈ R, any real
number. Then (2) says that x2 = a. Then (1) says that x1 = −a. So this
system has infinitely many solutions, which are (−a, a, a) for any a ∈ R.
Consider a general system of linear equations of n variables x1 , . . . , xn :
a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
..
.
am1 x1 + am2 x2 + · · · + amn xn = bm
A vector (y1 , . . . , yn ), yi ∈ R is a solution of the system if all the equations
are satisfied when we put x1 = y1 , . . . , xn = yn .
2.2 Gaussian Elimination
Example 2.4.
x1 + 2x2 + 3x3 = 9
4x1 + 5x2 + 6x3 = 24
(1)
(2)
3x1 + x2 − 2x3 = 4
(3)
Step 1: Eliminate x1 from (2) and (3). We replace (2) by (2) − 4 × (1) and
replace (3) by (3) − 3 × (1). We obtain:
x1 + 2x2 + 3x3 = 9
−3x2 − 6x3 = −12
−5x2 − 11x3 = −23
Step 2: Make the coefficent of x2 in (2) equal to 1.
x1 + 2x2 + 3x3 = 9
x2 + 2x3 = 4
−5x2 − 11x3 = −23
STEP 3: Eliminate x2 from equation (3) by adding 5 × (2) to (3):
x1 + 2x2 + 3x3 = 9
x2 + 2x3 = 4
−x3 = −3
Step 4: Make the coefficient of x3 = 1 =⇒ x3 = 3.
13
2
Matrices
Geometry & Linear Algebra
Step 5: Work backwards to obtain first x2 from (2) and then x1 from (1).
=⇒ x2 = 4 − 2x3 = −2 and x1 = 9 − 2x2 − 3x3 = 4. Finally we’ve solved
the system and we have found all the solutions of the system.
2.3 Matrices
Definition. A m × n matrix is a rectangular array of mn real numbers,
arranged into m rows and n columns.


a11 a12 . . . a1n
 a21 a22 . . . a2n 


 ..
.. 
 .
. 
am1
am2
...
amn
The matrix entry is written down as aij where i is the number of the row
and j is the number of the column. Sometimes this matrix is denoted by
(aij ) or (aih )1 ≤ 1 ≤ m, 1 ≤ j ≤ n.
Basic examples of matrices
- Any n-dimensional vector is a matrix of size 1 × n. (1, . . . , n) is a matrix
with one row.
 x1 
x2
- A column vector is  .. . This is a matrix of size m × 1.
.
xm
Other examples

2 5
1 −1 5
has size 2 × 3 and 6 7 has size 3 × 2.
2 0 1
0 3


1 0 0 0
0 1 0 0 

-
0 0 1 0 is the identity matrix (of size 4 × 4).
0 0 0 1
(

)
A matrix such that m = n is called a square matrix.
To our general linear system we attach its co-efficient matrices:


 
a11 a12 . . . a1n
b1
 a21 a22 . . . a2n 


 .. 
 ..
..  and  . 
 .
. 
bm
am1 am2 . . . amn

a11
 a21

=⇒  .
 ..
a12
a22
...
...
a1n
a2n
..
.
am1
am2
. . . amn

b1
b2 

 , the augmented matrix.

bm
14
2
Matrices
Geometry & Linear Algebra
Definition (Elementary Row Operations). The following operations are
the elementary row operations.
(i) Add a scalar multiple of one row to another: ri −→ ri + λrj , λ ∈ R.
(ii) Swap two rows (ri ↔ rj )
(iii) Multiply a row by a non-zero number: ri −→ λri , λ ∈ R, λ ̸= 0.
Point: The new equations are combinations of the old ones, and vice-versa, so
that the set of solutions is exactly the same.
Example 2.4 (Revisited).


3
9
6 24
−2 4
1 2
4 5
3 1
Step 1: r2 −→ r2 − 4r1 , r3 −→ r3 − 3r1 .


1 2
3
9
0 −3 −6 −12
0 −5 −11 −23
Step 2: r2 −→ − 13 r2 .

Step 3: r3 −→ r3 + 5r2 .
1 2
0 1
0 −5
3
2
−11

1
0
0
3
2
−1
2
1
0

9
4 
−23

9
4
−3
Step 4: r3 −→ −1r3

1 2 3
0 1 2
0 0 1
{z
|

9
4
3
}
x1 + 2x2 + 3x3 = 9
=⇒
x2 + 2x3 = 4
x3 = 3
Echelon form
Definition. A matrix is in echelon form if
(i) The first non-zero entry in each row is appears to the right on the
non-zero entry in all the rows above
15
Lecture 5
2
Matrices
Geometry & Linear Algebra
(ii) All rows consiting only of zeros (0, 0, . . . , 0), if any, appear at the
bottom.
Examples of echelon form:
(
0
0
1
0

)
5 0
Yes
5 −3
6 2
0 0
0 0
Non-examples
(
0 1
1 0
)
17
No
2
(
0 1
0 1
17
5

0
5 Yes
0
4
1
0

)
No
0 0
1 2
0 1

0
5 No
2
Remark. If we reduce the augmented matrix of a system of linear equations
to the echelon form, then we can easily solve it.
Example 2.5.


1 −1 3 0
0 1 1 1 
0 0 0 0
If we have a row (0, . . . , 0, a) where a ̸= 0 then there are no solutions. We
say that the linear system is inconsistent since 0 · x1 + 0 · x2 + . . . 0 · xn = a.
Now ignore all zero-rows. To solve this linear system, we start at the bottom:
x2 + x3 = 1. x3 = a, where a is any real number. x1 = 1 − a.
Express the variable corresponding to the first non-zero entry in this row
in terms of the variables that come after it. x1 − x2 + 3x3 = 0. Thus x1 =
x2 −3x3 = 1−a−3a = 1−4a. The general solution is (1−4a, 1−a, a), a ∈ R.
The variables that are assigned arbitrary values are called free variables, and
all other variables are dependent variables.
Example 2.6.


1 −1 3 0
0 1 1 1 
0 0 0 2
0 · x1 + 0 · x2 + 0 · x3 = 2. This is inconsistent, so no solutions.
Example 2.7.

1
0
0
1
0
0
1
0
0
3
1
0

2 4 0
−1 0 3
1 1 0
x5 + x6 = 0, so let x6 = a, a ∈ R. Then x5 = −a. x4 − x5 = 3 =⇒ x4 =
3 − a.
x1 + x2 + x3 + 3x4 + 2x5 + 4x6 = 0. We get new free variables: x2 = b, b ∈ R
and x3 = c, for any c ∈ R. =⇒ x1 = −b − c + a − 9 =⇒ (a − b − c −
16
2
Matrices
Geometry & Linear Algebra
9, b, c, −a − 3, −a, a).
Theorem 2.8: Gaussian Elimination
Any matrix can be reduced to a matrix in echelon form by elementary
row operations.
Example:

1 0
1 0

1 1
1 −1
3
3
4
2

5 1
8 1

7 0
3 2
2
5
5
−1
Step 1: Clear the first column.

1 0
3
2
0 0
0
3

0 1
1
3
0 −1 −1 −3

5
1
3
0

2 −1
−2 1
Step 2: Swap 2nd row with

1
0

0
0
4th row

5
1
−2 1 

2 −1
3
0
Step 3: r3 −→ r2 + r3

0
3
2
−1 −1 −3
1
1
3
0
0
3
1
0

0
0
0
1
0
0
3
1
0
0
2
3
0
3
5
2
0
3

1
1

0
0
and we’re done.
Now for the proof of the theorem:

a11
 a21

Proof. Let A =  .
 ..
...
...
am1
...

a1n
a2n 

..  be an arbitrary m × n matrix.
. 
amn
The basic procedure consists of two steps:
Step 1: Find a non-zero entry furthest to

0 0 ... 0
0 0 . . . 0

 .. ..
..
. .
.
0 0
...
the left in A. So A looks like this:

∗ ...
∗ . . .



∗
0 ∗ ...
If this non-zero element is in the row with number j, then swap the first row
with the j-th row (ri ↔ rj ). Multiply this new row 1 to make this non-zero
element equal to 1.
17
Lecture 6
2
Matrices
Geometry & Linear Algebra
Outcome of Step 1:

0 ...
0 . . .

 ..
.
0 ...
0 1 ...
0 ∗ ...
..
. ∗
0 ∗ ...

∗
∗



∗
Step 2 Subtracting multiples of the first row from other rows to ensure that
then entries underneath this 1 are all equal to 0.
Outcome of Step 2:









0 ...
0 ...
..
.
0 ...
0
0
..
.
0
1 ∗ ... ∗ 
0 ∗ ... ∗ 



..

.
A1

0 ∗ ... ∗
From now on we only use row operations involving rows from 2 to m. (never
use row 1 anymore). We’ll apply steps 1 and 2 to the “red” matrix A1 . (A1
is A with the first j columns and 1st row removed). The outcome of steps 1
and 2 applied to A1 is:










0
0
0
..
.
0
... 0 1 ∗
... 0 0 1
... 0 0 0
.. .. ..
. . .
... 0 0 0

... ∗ ∗ 
... ∗ ∗ 

∗ ... ∗ 




A2

∗ ... ∗
Let us now consider A2 and repeat steps 1 and 2 for A2 and so on. The
number of rows of A1 is m − 1, the number of rows of A2 is m − 2, . . . . Hence
after finitely many iterations, we’ll stop. In the end the matrix will be in the
echelon form!
”
18
3
Matrix Algebra
Geometry & Linear Algebra
3 Matrix Algebra
Definition. The zero matrix of size

0
0 = 0
0
m × n is

... 0
. . . 0
... 0
Addition
If A and B are both m × n matrices, with entries aij and bij respectively, then
A + B is the matrix of size m × n with entries aij + bij :


a11 + b11 . . . a1n + b1n


..
..


.
.
am1 + bm1
. . . amn + bmn
Multiply matrices by a scalar:
If c ∈ R and A is a matrix of size m × n, with A = (aij ) then cA is the matrix
of size m × n with entries (caij ).
(
)
(
)
a11 a12
ca11 ca12
If A =
cA =
a21 a22
ca21 ca22
Matrix Multiplication:
Let A be a matrix of size m × n and B a matrix of size n × k. Then AB is a
matrix of size m × k defined as follows:


← r1 (A) →


↑
↑
↑
← r2 (A) →


B = c1 (B) c2 (B) . . . ck (B)
A=

..


.
↓
↓
↓
← rk (A) →
Each if the row vectors ri (A) has n co-ordinates. Each of the column vectors ri (B) has n co-ordinates. Because each of ri (A) and cj (B) has exactly n co-ordinates, we can form the dot product ri (A) · cj (B) for any
1 ≤ i ≤ m 1 ≤ j ≤ k. The ij−entry of AB is defined to be the dot prodcut
ri (A) · cj (B).
Examples 3.1.
(i) 1 × 1 matrices are just a number, so if A = (a) and B = (b) =⇒
AB = (ab).
(
)
(
)
a12
b12
(ii) 2 × 2 matrices, A = aa11
, B = bb11
, then
21 a22
21 b22
(
AB =
a11 b11 + a12 b21
a21 b11 + a22 b21
19
a11 b12 + a12 b22
a21 b12 + a22 b22
)
3
Matrix Algebra
Geometry & Linear Algebra
(b )
11
..
(iii) A = (a11 , a12 , . . . , a1n ) a 1 × n matrix and B =
then
.
bn1
AB = (a11 b11 + a12 b21 + · · · + a1n bn1 )
(which is a 1 × 1 matrix, i.e. a number)
(iv) BA will be a matrix of size n × m :



b11 a111
 b21 a11



 ..  (a11 , a12 , . . . , a1n ) =  ..
 .
.
bn1
bn1 a11
b11
b21
...
...

b11 a1n
b21 a1n 

.. 
. 
...
bn1 a1n
Remark. In general multiplication of matrices is far from symmetric! AB ̸=
BA.
Recall: If A is an m × n matrix and B is an n × p matrix, then AB is an m × p
matrix. The ij−entry is the dot product ri (A) · cj (B).
Examples 3.2.
(i)
(
1
3
2
4
)(
0 1
1 0
)
=
(
2
4
1
3
)
(
0 1
1 0
)(
1
3
2
4
)
=
(
)
3 4
1 2
So AB ̸= BA.
(ii)
(
1 2
3 4
)(
−1 0 1
1 1 2
)
=
(
1 2
1 4
5
11
)
(iii) If A has size m × n and v is a column vector with n co-ordinates, then
Av is well defined. e.g.

  

1 2 −1
x1
x1 + 2x2 − x3
0 1 1  x2  =  x2 + x3 
1 0 2
x3
x1 + 2x3
Therefore the system of linear equations
x1 + 2x2 − x3 = 1
x2 + x3 = −7
x1 + 2x3 = 0
can be written in matrix form. Ax = B, where A is the 3 matrix
above and B is the vector (1, −7, 0)t .
20
Lecture 7
3
Matrix Algebra
Geometry & Linear Algebra
In general the system of equations
a11 x1 + a12 x2 + · · · + a1n xn = b1
a21 x1 + a22 x2 + · · · + a2n xn = b2
..
.
am1 x1 + am2 x2 + · · · + amn xn = bm
is equivalent to our matrix equation Ax = b, where A is the matrix with entires aij , x = (x1 , . . . , xn )t and b = (b1 , . . . , bn )t .
Proposition 3.3 (Distributivity Law).
(i) Let A be an m × n matrix and let B and C be n × p matrices. Then
A(B + C) = AB + AC
(ii) Let D and E be m × n matrices and F be an n × p matrix. Then
(D + E)F = DF + EF
Proof. (i) By definition, the ij-entry of A(B + C) is
ri (A) · cj (B + C) = ri (A) · (cj (B) + cj (C))
= (ri (A) · cj (B) + (ri (A) · cj (C))
= ij − entry of AB + ij − entry of AC
= ij − entry of AB + AC
”
Exercise: Prove (ii).
Application to system of linear equations
Remark 1. Any system of linear equations can only have 0, 1 or infinitely
many solutions.
Proof. We’ve seen cases when there is 0 or only 1 solution.
Suppose we have at least two distinct solutions, say p and q (column vectors
with n co-ordinates). Let λ ∈ R be any real number. Then A(p + λ(q − p) =
Ap+A(λ(q −p)) by Prop. 3.3. This equals Ap+λA(q −p) = Ap+λ·Aq −λAp.
Since p and q are solutions, we have Ap = Aq = b. So A(p+λ(q−p)) = Ap = b.
Since q − p is not the zero vector, we have have obtained infinitely many
solutions in the form p + λ(q − p) = x.
”
Remark 2. If p is a solution to Ax = b, then any solution of Ax = b is the sum
of p and the general solution of Ax = 0.
Proof. Indeed if Ax = b, then A(x−p) = Ax−Ap = b−b = 0, so x = p+(x−p)
where x−p is a solution of Ay = 0. Conversely if Ay = 0 then A(p+y) = b. ”
Consider matrices A, B, C. Does (AB)C = A(BC)? Yes!
21
3
Matrix Algebra
Geometry & Linear Algebra
Theorem 3.4: Associativity law
Let A be a matrix of size m × n, and let B be a matrix of n × p and let
C be a matrix of size p × q. Then
(AB)C = A(BC)
( y1 )
..
Lemma 3.5. Let x = (x1 , . . . , xn ) be a row vector and let y =
and B
.
be an n × p matrix with entries bij . Then x · By = xB · y.
yp
Remark. We’re multiplying a (n × p) matrix by a (p × 1) column vector, and
also a (1×n) row vector by a (n×p) matrix, so both products are well-defined.
Proof of Lemma. Idea: both sides are equal to
and j.
∑
bij xi yj over all values of i
Consider LHS = (x1 , . . . , xn ) · By

b11
 ..
By =  .
bn1
...
  

b1p
y1
b11 y1 + · · · + b1p yp

..   ..  = 
..

.  .  
.
...
bnp
bn1 y1 + · · · + bnp yp
 ∑p

j=1 bij yj


.

=
∑ ..

p
b
y
nj
j
j=1
yp
Then
LHS = (x1 , . . . , xn ) · By


p
n
∑
∑
xi ·
=
bij yj 
i=1
=
j=1
p
n ∑
∑
bij xi yj
i=1 j=1
Now consider RHS:
RHS = xB · y
= (x1 b11 + . . . xn bn1 , . . . , x1 b1p + · · · + xn bnp )


n
n
∑
∑
=
b11 xi , . . . ,
bip xi 
i=1
=
p
∑
y j ·
j=1
=
i=1

p
n ∑
∑
n
∑

bij xi 
i=1
”
bij xi yj
i=1 j=1
22
3
Matrix Algebra
Geometry & Linear Algebra
Proof of Theorem. Observation: ri (AB) = ri (A)B, indeed:
Lecture 8
ri (AB) = (r1 (A) · c1 (B), . . . , ri (A) · cp (B))
which is the same as
( )
ri (A)B = (← ri (A) →) B
= (r1 (A) · c1 (B), . . . , ri (A) · cp (B))
Similarly cj (BC) = Bcj (C). (Exercise, easy)
To finish the proof, note that the ij-entry of (AB)C is the dot product
ri (AB) · cj (C) = ri (A)B · cj (C)
= ri (A) · Bci (C) (by Lemma 3.7)
= the ij − entry of A(BC).
”
3.1 Square matrices
Definition. A matrix of size n × n for some n is called a square matrix.
These are nice as we can multiply these matrices by themselves, in fact A2 =
AA is only defined when A is a square matrix.
A2 = (AA)A = A(AA)
..
.
An = An−1 A for all n ≥ 2
1 is a very important number as it’s a number such that multiplying everything by 1 gets you the same thing. There is a similar thing for matrices:
Definition. The identity matrix
by

1
0



0
In is the square matrix of size n × n given
0
1
...
...
..
.
0
...

0
0



1
The ij-entry of In is 1 if i = j, and is 0 otherwise if i ̸= j.
Proposition 3.6. For any square matrix A of size n × n, we have AIn =
In A = A.
So the identity matrix is an analogue of 1.
23
3
Matrix Algebra
Geometry & Linear Algebra
Proof.



1 0
a11 a12 . . . a1n 
 ..
..  0 1
 .
. 

an1 an2 . . . ann
0 0


a11 a12 . . . a1n
 ..
.. 
= .
. 
an1
an2

... 0
. . . 0


..

.
... 1
. . . ann
”
The other way is similar.
What about the inverse? For many reasons it’s important to be able to invert
matrices, and at this point we have to be careful as multiplication is not commutative, so there may be two inverses. Let me make a temporary definition
first:
Definition. B is the left inverse of A if BA = In .
C is the right inverse of A if AC = In
Proposition 3.7. Assume that A has both a right inverse C and a left inverse
B. Then B = C.
Proof. By Theorem 3.6, we have (BA)C = B(AC). But then by Prop 3.8 we
can continue on the left: C = In C = (BA)C = B(AC). And simiarly we can
continue on the right: (BA)C = B(AC) = BIn = B.
”

Warning. Not all matrices are invertible. e.g.
(
)(
) (
)
0 1
0 1
0 0
=
0 0
0 0
0 0
Clearly if A2 = 0, then the inverse does not exist; if there is a B such that
BA = I2 , then (BA)A = B(AA) by associativity, however AA = A2 = 0,
hence B(AA) = 0. Whereas (BA)A = I2 A = A. Since A ̸= 0 we have a
contradiction. Therefore B does not exist.
A matrix with 2 identical (or proportional) rows is not invertible, e.g.
(
)
1 2
A=
1 2
is not invertible. Let me go back to this example later on when we have more
theory...
24
3
Matrix Algebra
Geometry & Linear Algebra
Application to Systems of Linear Equations
Proposition 3.8. Let A be a square matrix. The linear system Ax = b has
exactly one solution when A has an inverse (that A has both left and right
inverses).
Proof. By assumption there is a square matrix B of the same size such that
BA = In . Then Ax = b implies B(Ax) = Bb. By associativity, we have
B(Ax) = (Ba)x = In x = x
On the other hand, this equals Bb. Therefore any solution of Ax = b must be
equation to Bb.
Claim: Bb is in fact a solution of Ax = b.
Indeed, A(Bb) = (AB)b. But from Prop 3.8 we know that b = C for which
AC = In . Hence A(Bb) = (AB)b = In b. So Bb is a solution of Ax = b.
”
Remark. If b = 0, then x = 0 is the unique solution of Ax = 0.
25
4
Applications of Matrices
Geometry & Linear Algebra
4 Applications of Matrices
4.1 Geometry
Rotations
Lecture 9
x2
x = (x1 , x2 )
φ
α
x1
Consider a rotation of φ about the origin to the vector x. The vector has
components x1 = p cos α, x2 = p sin α. Then rotation through through the
angle φ sends x to (p cos(α + φ), p sin(α + φ)):
(
)
(
)
p cos α
p cos(α + φ)
x=
7−→
p sin α
p sin(α + φ)
(
)
p cos α cos φ − p sin α sin φ
=
p cos α sin φ + p sin α cos φ
(
)
cos ψx1 + (− sin ψ)x2
=
sin ψx1 + cos ψx2
(
)( )
cos ψ − sin ψ
x1
=
sin ψ
cos ψ
x2
Definition. Let
(
cos ψ
Rψ =
sin ψ
− sin ψ
cos ψ
)
This is the rotation matrix.
When we rotate the vector x by ψ we write
x 7−→ Rψ x (This means “x goes to Rψ x)
x 7−→ Rψ x 7−→ Rψ Rψ x
(
1
Note that Rψ Rψ = I =
0
)
0
.
1
26
4
Applications of Matrices
Geometry & Linear Algebra
Reflections
x2
x = (x1 , x2 )
x1
x′ = (x1 , −x2 )
The reflection in the x-axis sends (x1 , x2 ) to (x1 , −x2 ). In terms of matrices
the reflection sends
( ) (
)( )
x1
1 0
x1
=
x2
0 −1
x2
Definition. (See Sheet 3) The matrix
(
)
cos ψ
sin ψ
sin ψ − cos ψ
represents the reflection in the line x2 = x, tan ψ2 .
It’s easy to check that Sψ2 = I. Also Sψ Sφ = R = Rψ−φ . Thus the composition of two reflections is a rotation:
x2
ψ/2
x1
ψ/2
4.2 Markov Chains
Special Halloween Problem: An enchanted village is inhabited by vampires, zombies and goblins. Every full moon:
25% of zombies turn into vampires
10% of zombies turn into goblins
65% remain as zombies
20% of vampires turn into zombies
5% of vampires turn into goblins
75% stay as vampires
27
4
Applications of Matrices
Geometry & Linear Algebra
5% of goblins turn into vampires
30% of goblins turn into zombies
65% stay as goblins
Question: What will happen after n full moons? If n → ∞ are the proportions going to stabilise?
Let x1 be the proportion of the vampires, let x2 and x3 be the proportions of
zombies and goblins respectively.
x1 + x2 + x3 = 1
xi ≥ 0
After one month the proportions are as follows:
(1)
x1 = proportion of vampires after one month.
(1)
x1 = 0.75x1 + 0.25x2 + 0.05x3
(1)
x2 = 0.20x1 + 0.065x2 + 0.30x3
(1)
x3 = 0.05x1 + 0.10x2 + 0.65x3


0.75 0.25 0.05
x(1) = 0.20 0.65 0.30 x
0.05 0.10 0.65
Let T be this matrix. After n months we have the population represented by
x(n) = T n x
Observe: tij > 0. Crucially, the sum of entries in each column of T is 1.
∑n
Exercise: For any matrix T = (tij ) such that tij > 0 and i=1 tij = 1, if
∑
k
x = (x1 , . . . , xn )t is such that xi ≥ 0 and i=1 xi = 1, then T x has the same
properties as x.
This is an example of a Markov Chain.
Theorem 4.1
If all entries of some power of T are positive, then there exists a unique
vector
( s1 )
..
s=
.
∑n
sn
such that si ≥ 0 and i=1 si = 1 for which T s = s (the stationary state).
For any initial state x, this converges to s. Thus s is an eigenvector of
T with eigenvalue 1.
T s = s, or equivalently (T − I) = 0 is a system of linear equations. IN our
case, solving this system gives
34 15
s = ( 37
86 , 86 , 86 ) = (0.45, 0.4, 0.17)
This is the limit distribution of three species.
28
5
Inverses
Geometry & Linear Algebra
5 Inverses
Theorem 5.1
Let A be a square matrix. If there exists a square matrix B such that
AB = I then this B is unique and satisfies BA = I.
Proof. (Also gives a method for finding this B!)
Let X be the square matrix with unknown entires. We want to solve the
equation AX = I. The entires of X are xij . We have n2 unknowns and n2
equations. We record this as follows: (A | I), a n × 2n matrix.
This is n systems of linear equations in n variables, e.g. for each column of I
we have the following system:
 
0
.

x2j
 .. 
1



..
A
= the jth column of I =  0 

.
.
xnj
..
 x1j 
0
All these n systems have the same co-efficient matrix A, so we solve them using the same process. Apply reduction to echelon form. Perform elementary
row operations on the matrix (A | I)
Claim: The matrix on the LHS cannot have any rows made entirely of zeros.
Proof of Claim. Remember that D is obtained by row operations from I.
We know that two matrices that are obtained from each other by row operations define equivalent linear systems. This means that the linear system I(y1 , . . . , yn ) = 0 has the same solutions as D(y1 , . . . , yn ) = 0. But
(0, . . . , 0) is the only solution to this. Now if D has an all zero row, the system
D(y1 , . . . , yn ) = 0 has free variables, hence infinitely many solutions. This
contradiction proves that D does not have an all zero row.
”
Therefore there is a non-zero entry in the bottom row of D. Say this entry is
in the jth column. Then the system (1.1) has no solutions (follows from the
echelon form method). Therefore, if the matrix on the left has a bottom row
made of zeros, then AX = I has no solutions. So A has no right inverse. It
remains to consider the case when the matrix on the left has no all-zero rows:



(Aech | I) = 



1
*
1
..
0
.


D


1
Perform more elementary row operations to clear the entries above the main
diagonal (This is possible because all diagonal entries equal 1). After this step,
29
Lecture 10
5
Inverses
Geometry & Linear Algebra

we obtain:


(I | E) = 



1


E


0
1
..
0
.
1
Since row operations don’t change the solution of our linear system, we have
IX = E. Hence E is a unique solution of the system AX = I, i.e. AE =
I. We’ve proven that if the right inverse exists it can be obtained by the
procedure, and it is unique.
Finally we now prove that EA = I:
Consider the equation EY = I, where Y = (yij ) is a square matrix with
unknown entires yij . Reverse row operations from the first part of the proof
to so (E | I) 7→ (I | A). EY = I is equivalent to IY = A, that is Y = A.
Therefore EA = I.
”
Finding inverses of 2 × 2 matrices is easy:
A=
(
a
c
)
(
)
b
d −b
, consider B =
d
−c a
(
Then
AB =
(
=
a b
c d
)(
d −b
−c a
)
ad − bc
0
0
ad − bc
)
= (ad − bc)I
Case 1. (ad − bc) is non-zero. Then
(
)
1
d −b
is the inverse of A
ad − bc −c a
Case 2. ad − bc = 0. In this case AB = 0. Then the inverse does not exist
(If CA = I, then C(AB) = (CA)B = IB = B. If A is non-zero then B is
non-zero, so we get a contradiction for 0 = B.) Hence A is not invertible.
Let A be a square matrix. If
Lecture 11
row ops
(
(A | I) −→
∗
0...0
then A has no inverse.
If
row ops
(A | I) −→ (I | E)
then AE = I.
30
)
∗
5
Inverses
Geometry & Linear Algebra
Example 5.2. Let

1 3
A= 2 5
−3 2

−2
−3
−4


1 3 −2 1 0 0
(A | E) =  2 5 −3 0 1 0
−3 2 −4 0 0 1


1
0 0
1 3 −2
−2 1 0
−→ 0 −1 1
0 0
1 −19 11 1


2
1 3 0 −37 22
−→ 0 −1 0 17 −10 −1
0 0 1 −19 11
1


1 0 0 14 −8 −1
1
−→ 0 1 0 −17 10
1
0 0 1 −19 11

So
A−1

14 −8 −1
1
= −17 10
−19 11
1
Recall that Theorem 5.1 says that if there exists a B such that AB = I, then
BA = I.
Question: What if there is a matrix C such that CA = I: does this imply
that CA = I? Yes.
5.1 Transpose
Definition. Let A be a matrix of size n × m. The transpose of A is the
matrix At of size m × n such that the ij-entry of At is the ji−entry of A.
(
E.g. if A =
)
(
)
a b
a c
, then At =
.
c d
b d
Some properties:
• It = I


x1
 
• (x1 , . . . , xn )t =  ... 
xn
• (At )t = A
• ri (At ) = ci (A)t
31
5
Inverses
Geometry & Linear Algebra
• cj (At ) = rj (A)t
Proposition 5.3. (AB)t = B t At
Proof.
The ij-entry of (AB)t = the ji-entry of AB
= rj (A) · ci (B)
= ci (B) · rj (A)
= ri (B t ) · cj (At )
= ij-entry of B t At
”
Corollary 5.4. Let A be a square matrix. If there exists a matrix C such
that CA = I, then AC = I.
Proof. Suppose we have C such that CA = I. Then (CA)t = I t = I. By
Prop 5.2, we have At · C t = I By Theorem 5.1, this implies that C t At = I.
By Prop 5.2, we obtain (AC)t = I. Apply the transpose to both sides:
AC = ((AC)t )t = I t = I
”
Remark. Let A be a square matrix. Then the following properties hold.
(i) If there exists a matrix B such that AB = I, then BA = I.
(ii) If there exists a matrix C such that CA = I, then AC = I. Call
B = C = A−1 the inverse of A.
(iii) If A−1 exists, it is unique.
(iv) A−1 exists if and only if A can be reduced to I by elementary row
operations
(v) A−1 exists if and only if Ax = b has a unique solution.
(vi) A−1 exists if and only if Ax = 0 has a unique solution 0.
32
6
Determinants
Geometry & Linear Algebra
6 Determinants
Definition (2 × 2 determinant).
(
)
a b
det
= ad − bc
c d
a b = ad − bc
Also denoted by c d
We’ve seen that if det(A) ̸= 0, then A−1 exists.
(
)
1
d −b
A−1 =
det(A) −c a
If det(A) = 0, then A−1 doesn’t exist.
Examples 6.1.
(
)
a b
= ad
0 d
(
)
cos φ − sin φ
Rφ =
sin φ cos φ
det
det Rφ = 1.
Sφ =
(
)
cos φ
sin φ
sin φ − cos φ
det Sφ = −1.
Exercise: det(AB) = det(A) det(B).
6.1 3 × 3 determinants
Definition. For a 3 × 3 matrix, A:

a11
A = a21
a31
det(A) = a11 det
(
a22
a32
a23
a33
a12
a22
a32
)
(
− a12 det

a13
a23 
a33
a21
a31
a23
a33
)
+ a13
(
a21
a31
a22
a32
)
= a11 (a22 a33 − a23 a32 ) − a12 (a21 a33 − a23 a31 ) + a13 (a21 a32 − a21 a31 ).
The same decomposition works for n × n matrices.
Let A be a square matrix.
Lecture 12
33
6
Determinants
Geometry & Linear Algebra
Definition. The ij-minor of A is the square matrix obtained from A by
removing the i-th row and the j-th column.
Example 6.2. If

1 2
A = 4 5
5 0
then
A11
(
)
5 6
=
0 8
A13

3
6
8
(
4
=
5
5
0
)
We defined det(A) = a11 det(A11 ) − a12 det(A12 ) + a13 det(A13 ).
Proposition 6.3. Let A be a 3 × 3 matrix.
(i) det(A) = expansion in the 2nd row.
(ii) det(A) = expansion in 3rd row.
Proof of (i). Note that
det(A) = a11 (a22 a33 − a23 a32 ) − a12 (a21 a33 − a23 a31 ) + a13 (a21 a32 − a21 a31 ).


a11 a12 a13
A = a21 a22 a23 
a31 a32 a33
We need to compare this with the following expression:
−a21 (a12 a33 − a32 a13 ) + a22 (a11 a33 − a31 a13 ) − a23 (a11 a32 − a31 a12 )
”
Proof of (ii) is similar.
Remark. There are also expansions in the first column, second column etc.
det(A) = a11 det(A11 ) − a21 det(A21 ) + a31 det(A31 )
Goal: Show that A−1 exists precisely when det(A) ̸= 0.
Lemma 6.4. If a 3 × 3 matrix A has two identical rows, then det(A) = 0.
Proof.

a
det a
x
b
b
y

c
c  = expansion in 3rd row
z
(
)
(
)
(
)
b c
a c
a b
= x det
− y det
+ z det
b c
a c
a b
=0
”
Other cases are similar.
34
6
Determinants
Geometry & Linear Algebra
Proposition 6.5 (Effect of row operations on determinant).
(i) Replacing r1 (A) by ri (A) + λrj (A), j ̸= i does not change det(A)
(ii) Swapping ri (A) and rj (A), j ̸= i multiplies det(A) by −1
(iii) Multiplying ri (A) by λ ̸= 0 has the effect of multiplying det(A) by λ.
Proof. (i) For simplicity assume that i = 2 and j = 1:


a11
a12
a13
A′ = a21 + λa11 a22 + λa12 a23 + λa13 
a31
a32
a33
Let’s expand this det in the 2nd row:
det(A′ ) = −(a21 + λa11 ) det(A21 ) + (a21 + λ11 ) det(A22 ) − (a23 + λ13 ) det(A23 )


a11 a12 a13
= det(A) + λ det a11 a12 a13 
a31 a32 a33
= det(A) + 0
= det(A)
(ii) We have:

a21 a22
det a11 a12
a31 a32

a23
a13 
a33
= expand 3rd row
(
)
(
)
(
)
a
a23
a
a23
a
a22
= a31 det 22
− a32 det 21
+ a33 21
a12 a13
a11 a13
a11 a12
(
)
(
)
(
a12
a13
a
a13
a
= −a31 det
+ a32 det 11
− a33 det 11
a22 a23
a21 a23
a21
a12
a22
)
= − det(A)
(iii) Expand in the row that is multiplied by λ. This immediately gives the
result.
”
Examples 6.6. (i)

1 2
det  2 0
−2 1


−1
1
3  = det 0
1
0
2
−4
5

−1
5
−1
= expand in the 1st column
(
)
−4 5
= det
5 −1
= 4 − 25 = −21
35
6
Determinants
Geometry & Linear Algebra
(ii) Van der Monde determinant:



1 x x2
1 x
det 1 y y 2  = det 0 1
0 1
1 z z2

x2
x + y
x+z

1
x
1
= (y − x) det 0
0 z−x

1
= (y − x)(z − x) det 0
0

1
= (y − x)(z − x) det 0
0

x2
y+x 
z 2 − x2

x
x2
1 x + y
1 x+z

x
x2
1 x + y
0 z−y
= (y − x)(z − x)(z − y)
= (x − y)(y − z)(z − x)
Corollary 6.7. If A′ is obtained from A by row operations, then det(A′ ) ̸= 0
if and only if det(A) ̸= 0.
Proof. A direct consequence of Proposition 6.3.
”
Theorem 6.8
Let A be a 3 × 3 matrix. Then A−1 exists if and only if det(A) ̸= 0.
Proof. Recall that A can be reduced to echelon form by row operations. Let
A′ be the matrix in echelon form to which A reduces. Then det(A′ ) ̸= 0 ⇐⇒
det(A) ̸= 0. Hence we are in Case 2 (If A′ has an all zero-row then we expand
in this row and det(A′ ) = 0.) In Case 2, A′ can be reduced to I by further
row operations. By Theorem 5.1, A is invertible.
We need to prove that if A−1 exists, then det(A) ̸= 0. Indeed, if A−1 exists,
then the echelon form of A has no all-zero rows. Then A can be reduced to
I by row operations. Row operations can only multiply det by a non-zero
number, and they can be reversed. Therefore, det (A) ̸= 0.
”
Remark. For any square matrices A and B of the same size det(AB) =
det(A)det(B). If A−1 exists, then AA−1 = I, so det(A)det(A−1 ) = 1. Hence
det(A) ̸= 0 if A−1 exists.
Final Comment. If A is a square matrix, then Ax = 0 has non-zero solutions
if and only if det(A) = 0. (Indeed if Ax = 0 has a non-zero solution, then it
has at least two distinct solutions, so it has infinitely many solutions. Then
A−1 does’t exist, and det(A) = 0).
36
Lecture 13
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
7 Eigenvalues and Eigenvectors
Definition (Eigenvector, eigenvalue). Let A be a n × n matrix. Then a
non-zero vector, v, is called an eigenvector of A if Av = λv for some λ ∈ R.
In this case λ is called an eigenvalue of A corresponding to the eigenvector
v.
Remarks. A scalar multiple of an eigenvector is also an eigenvector with the
same eigenvalue.
(
3
Example 7.1. If A =
2
Av =
)
( )
2
1
,v=
, we have
0
−2
(
)( ) ( )
3 2
1
−1
=
= −v
2 0
−2
2
So v is an eigenvector of A with eigenvalue −1.
( )
1
But if w =
, then
1
(
3
Aw =
2
2
0
)( ) ( )
( )
1
5
1
=
̸= λ
1
2
1
So w is not an eigenvector of A.
(
1
Remark. If A is a scalar multiple of I, A = c
0
vector is an eigenvector with eigenvalue c.
)
0
, then every non-zero
1
How do we find eigenvectors and eigenvalues?
v is an eigenvector of A if and only if Av = λv for some λ ∈ R. Av = λ·Iv =⇒
(λI − A)v = 0. Here v ̸= 0. Non-zero solution (such as v) of this system of
linear equations exist if and only if det(λI − A) = 0.
Let t be a variable and consider the matrix tI − A:


t − a11
−a12
. . . −a1n
 −a21
t − a12 . . . −a2n 


tI − A =  .
.. 
 ..
. 
−an1 t − an2 . . . t − ann
Definition (Characteristic Polynomial). The determinant of tI − A is
called the characteristic polynomial of A. (For us n = 3, or n = 2)
37
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
For example, if n = 2, then the characteristic polynomial of A is
(
)
t − a11 −a12
det
= (t − a11 )(t − a22 ) − a21 a − 12
−a21 t − a22
= t2 − (a11 + a22 ) + a11 a22 − a21 a12
= t2 − tr(A) + det(A) = 0
Definition (Trace). tr(A) is the trace, defined as tr(A) = a11 + a22
Proposition 7.2. Let A be a 2 × 2 or 3 × 3 matrix. Then the eigenvalues of
A are the roots of the characteristic polynomial of A, i.e. every eigenvalue λ
satisfies det(λI −A) = 0. The eigenvectors of A with eigenvalue λ are non-zero
solutions of the system of linear equations (λI − A)v = 0.
Proof. The real numbers λ for which det(λI − A) = 0 are by definition the
roots of the characteristic polynomial of A. Hence v is a non-zero solution of
(λI − A)v = 0.
”
Example 7.3. Consider
(
A=
3 2
2 0
)
(i) Find the characteristic polynomial of A:
(
)
t − 3 −2
det
= t2 − 3t − 4 = 0
−2
t
=⇒ (t + 1)(t − 4) = 0
(ii) Solve this equation (t + 1)(t − 4) = 0, and find the eigenvalues:
λ1 = −1, λ2 = 4
(iii) Find the eigenvectors with eigenvalue −1:
(
)
−4 −2
λ1 I − A =
−2 −1
Solve
}
−4x1 − 2x2 = 0
−2x1 − x2 = 0
(
−4
−2
)( )
−2
x1
=0
−1
x2
(
)
=⇒ v1 = 1 −2 (or any scalar multiple)
Find eigenvectors with eigenvalue 4:
(
)
1 −2
λ2 I − A =
−2 4
38
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
(
Solve
x1 − 2x2 = 0
−2x1 + 4x2 = 0
}
)( )
−2
x1
=0
4
x2
1
−2
(
=⇒ v2 = 2
)
1 (or any scalar multiple)
Conclusion: Up to proportionality, A has exactly two eigenvectors
with eigenvalue −1 and 4.

Warning. Some 2 × 2 matrices have only one eigenvector up to scalar
multiples!
Example 7.4.
(
A=
1 1
0 1
)
This has characteristic polynomial of
(
)
t−1
1
det
= (t − 1)2
0
t−1
So 1 is the unique eigenvalue of A. Now
(
)
0 −1
I −A=
0 0
So solving
(
)( ) ( )
0 −1
x1
0
=
=⇒ x2 = 0
0 0
x2
0
( )
1
So v =
is the unique eigenvector (up to scalar multiples!)
0
If A is a 3 × 3 matrix, then the characteristic polynomial of A is as polynomial
of degree 3, with leading coefficient 1. Typically, A will have three different
eigenvalues (which may be complex numbers!). There are matrices with 3
non-proportional eigenvectors, but also matrices with fewer eigenvectors, e.g.




0 1 0
0 1 0
A = 0 0 1 or 0 0 0
0 0 0
0 0 1
7.1 Diagonalisation of Matrices
39
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
Definition. A matrix is called diagonal if all non-diagonal entries are zero:


d1 0 . . . 0
 0 d2 . . . 0 


D=.
.. 
 ..
.
0
0
. . . dn
Properties
• diag(d1 , . . . , dn ) + diag(d′1 , . . . , d′n ) = diag(d1 + d′1 , . . . , dn + d′n )
• diag(d1 , . . . , dn )diag(d′1 , . . . , d′n ) = diag(d1 d′1 , . . . , dn d′n )
m
• diag(d1 , . . . , dn )m = diag(dm
1 , . . . , dn )
• det(diag(d1 , . . . , dn )) = d1 . . . dn
−1
• If d1 , . . . , dn ̸= 0 then diag(d1 , . . . , dn )−1 = diag(d−1
1 , . . . , dn )
Consider the polynomial
pm xm + · · · + p0
pi ∈ R
Then
p(A) = pm Am + pm−1 Am−1 + · · · + p1 A + p0 I
where A is a square matrix
D = diag(d1 , . . . , dn ) and p(D) = diag(p(d1 ), . . . , p(dn )).
Can we somehow reduce general matrices to diagonal form? Yes, in most
cases...
Terminology. Let v1 , . . . , vn be vectors in Rn .
Definition. v1 , . . . , vn are linearly dependent if ̸ ∃x1 , . . . , xn ∈ R not all
equal to 0 such that x1 v1 + · · · + xn vn = 0.
In the opposite case, v1 , . . . , vn are called linearly independent.
Examples 7.5.
(i) If one of the vectors is a zero vector, they are linearly dependent, say
v1 = 0. Then take x1 = 1, x2 = · · · = xn = 0. 1·0+0·v2 +. . . 0·vn = 0.
(ii) For n = 2, then are v1 and v2 linearly dependent? x1 v1 +x2 v2 = 0, say
x1 ̸= 0, then v1 = xx12 v2 . Conclusion: v1 and v2 are linearly dependent
iff they are proportional, i.e. v2 is in this line:
0
v1
(iii) For n = 3, v1 , v2 , v3 are linearly dependent iff x1 v1 + x2 v2 + x3 v3 = 0
and one of the coefficients ̸= 0, say x1 ̸= 0. Then v1 = − xx12 v2 − xx31 v3 .
In R3 , v1 lies in the plane through 0, v2 , v3 :
40
Lecture 14
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
v1
v2
0
Conclusion: v1 , v2 , v3 ∈ R3 are linearly dependent iff they belong to a
plane through 0.
Lemma 7.6. (i) Let v1 , v2 ∈ R2 . Define a 2 × 2 matrix B such that v1 , v2 are
the columns of B:


|
|
B =  v1 v2 
|
|
Then det(B) = 0 iff v1 and v2 are linearly dependent.
(ii) Let v1 , v2 , v3 ∈ R3 and define a 3 × 3 matrix B so that v1 , v2 , v3 are the
columns of B. Then det(B) = 0 iff v1 , v2 , v3 are linearly dependent.
Proof of (ii). Note that

 
x1
|
B x2  = v1
x3
|
|
v2
|
 
|
x1
v3  x2 
|
x3
= x1 v1 + x2 v2 + x3 v3
We know that det(B) = 0 ⇐⇒ Bx = 0 has a non-zero solution, where x =
(x1 , x2 , x3 )t ⇐⇒ x1 v1 + x2 v2 + x3 v3 = 0 where some x1 ̸= 0 ⇐⇒ v1 , v2 , v3
are linearly dependent.
”
Theorem 7.7
Let A be a square matrix (3 × 3 case) with eigenvectors v1 , v2 , v3 with
eigenvalues λ1 , λ2 , λ3 respectively. Define


λ1 0
0
D = diag(λ1 , λ2 , λ3 ) =  0 λ2 0 
0
0 λ3
P as a matrix with columns v1 , v2 , v3 .
a) Then AP = P D
b) If v1 , v2 , v3 are linearly independent, then P −1 exists and A = P DP −1
Remark. This is a theorem about the diagonalisation of matrices
41
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
Proof. a) What is AP ? The i-th column of AP is ci (AP ) = Aci (P ) (lecture
8)
ci (AP ) = Aci (P ) = Avi = λi vi
What is P D?


|
|
|
λ1 0
P D = v1 v2 v3   0 λ2
|
|
|
0
0


|
|
|
= λ1 v1 λ2 v2 λ3 v3 
|
|
|

0
0
λ3
Hence AP = P D.
b) By Lemma 7.2, P −1 exists iff v1 , v2 , v3 are linearly independent.
”
Definition. Two square matrices A and B are called equivalent if there
exists an invertible matrix P such that A = P BP −1 .
Exercise: Check that this is indeed an equivalence relation.
Definition. A square matrix A is diagonalisable if A is equivalent to a
diagonal matrix, i.e. ∃ an invertible matrix P such that A = P DP −1 ,
where D is diagonal.
Proposition 7.8.
(i) If A is a diagonalisable 2 × 2 matrix, then A has two linearly independent
eigenvectors (not multiples of each other)
(ii) If A is a diagonalisable 3 × 3 matrix, then A has three linearly independent
eigenvectors.
Remark. This is in fact an “if and only if” statement (as it follows from
Theorem 7.3)
Proof of (ii). A = P DP −1 where D = diag(λ1 , λ2 , λ3 ). Then AP = P D.
Next, consider the columns of P as three vectors, call them v1 , v2 , v3 .



|
|
|
λ1 0
0
P D = v1 v2 v3   0 λ2 0 
|
|
|
0
0 λ3


λ1 p11 λ2 p12 λ3 p13
= λ1 p21 λ2 p22 λ3 p23 
λ1 p31 λ2 p32 λ3 p33


|
|
|
= λ1 v1 λ2 v2 λ3 v3 
|
|
|
Now AP = P D =⇒ ci (P D) = λi vi =⇒ ci (AP ) = Aci (P ) = Avi for
i = 1, 2, 3. Hence vi is an eigenvector of A with eigenvalue λi .
42
Lecture 15
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
Claim: v1 , v2 , v3 are linearly independent. In fact, Lemma 7.2 says P −1 exists
iff they are linearly independent. Since P −1 exists, we’re done.
”
Corollary 7.9.
(i) All matrices that are equivalent to

λ
Jλ =  0
0

1 0
λ 1
0 λ
for λ ∈ R, are non-diagonalisable.
(ii) All matrices equivalent to

Jλ1 µ
λ 1
= 0 λ
0 0

0
0
µ
for any λ1 , µ ∈ R are also non-diagonalisable.
Remark. These are in fact all the non-diagonalisable matrices.
Proof. (i) For contradiction, suppose that A = BJλ B −1 (for some invertible
matrix B) is diagonalisable. Then A = P DP −1 , so BJλ B −1 = P DP −1 =⇒
Jλ = B −1 P DP −1 B. We know that
(B −1 P )−1 = P −1 (B −1 )−1 = P −1 B
Therefore Jλ = (B −1 P )D(B −1 P )−1 .
So Jλ is diagonalisable and by Prop 7.4, Jλ must have three linearly independent eigenvectors. Let’s find them:

t−λ
tI − Jλ =  0
0
−1
t−λ
0

0
−1 
t−λ
det(tI − Jλ ) = (t − λ)3
Hence λ is the only eigenvalue of
(λI − Jλ )v = 0, i.e.

0 −1
0 0
0 0
Jλ . The corresponding eigenvectors satisfy
   
0
x1
0
−1 x2  = 0
0
x3
0
=⇒ x2 = x3 = 0, x1 = a ∈ R. Hence we conclude
 
1
v = 0
0
up to scalar multiples is the only eigenvector. So there are not three linearly independent eigenvectors, a contradiction. Thus A = BJλ B −1 is not a
diagonalisable matrix.
”
43
7
Eigenvalues and Eigenvectors
Geometry & Linear Algebra
Exercise: Prove (ii) similarly.
Examples 7.10.
(1) Let
(
A=
3 2
2 0
)
In lecture 13, we found this has eigenvalues λ1 = 4, λ2 − 1. The corresponding eigenvectors are
( )
(
)
2
v1 =
v2 = 1 −2
1
So A = P DP −1 where
D=
(
)
4 0
0 −1
(
P =
)
2
1
−1 −2
So A2 = (P DP −1 )(P DP −1 ) = P D2 P −1 . (By induction An = P Dn P −1
since P P −1 = I).
(
)
(
)
1 −2 −1
1 2 1
P −1 =
=
−5 −1 2
5 1 −2
Therefore, An = P Dn P −1
(
)( n
)(
)
2 1
4
0
2 1
1 −2
0 (−1)n
1 −2
(
)
(
)
1 2 · 4n
(−1)n
2 1
=
4n
−2(−1)n
1 −2
5
)
(
1 4n=1 + (−1)n 2 · 4n − 2(−1)n
=
4n + 4(−1)n
5 2 · 4n − 2(−1)n
=⇒ An =
1
5
(2) How do we find a 2 × 2 matrix B such that B 3 = A?
Idea: Take any 2 × 2 matrix C such that C 3 = D e.g.
( 1
)
43
0
C=
0 (−1)
Then B = P CP −1 works. Indeed B 3 = P C 3 P −1 = P DP −1 = A.
(3) Fibonacci Sequence
0, 1, 1, 2, 3, 5, 8, 13, 21, 35, 55, . . .
Formally defined as fn+1 = fn + fn−1
(
Let
A=
44
f0 = 0, f1 = 1.
1 1
1 0
)
7
Eigenvalues and Eigenvectors
Notice that
(
Geometry & Linear Algebra
fn+1
fn
)
(
=A
fn
)
fn−1
Since fn+1 = fn + fn−1 ...
(
)
(
)
fn+1
fn
=A
fn
fn−1
(
)
2 fn−1
=A
fn−2
= ...
( )
f1
=A
f0
( )
1
= An
0
n
We now know how to find An , and this gives a formula for fn .
(4) Is

1
A = −1
1

0 0
2 0
−1 1
diagonalisable?
The characteristic polynomial is (t − 1)2 (t − 2), so λ1 = 2, λ2 = 1. We find
 
0
v1 =  1 
−1
is an eigenvector with eigenvalue λ1 .
We can find two non-proportional eigenvectors for λ1 = 1. Then three
eigenvectors are linearly independent. So A is diagonalisable. [So sometimes
double root =⇒ three eigenvectors and is still diagonalisable.]
45
8
Conics and Quadrics
Geometry & Linear Algebra
8 Conics and Quadrics
Back to 2-dim geometry, we work in R2 .
Lecture 16
Lines are given by linear equations
ax1 + bx2 + c = 0
(polynomial equations in x1 , x2 of degree 2)
Definition. A conic is a curve in R2 that can be given by a quadratic
equation, i.e. a polynomial equation in x1 , x2 of degree 2.
ax21 + bx1 x2 + cx22 + dx1 + cx2 + f = 0
e.g. x21 + x2 = r2 gives a circle. Also (ax1 + bx2 + c)(a′ x1 + b′ x2 + c′ ) = 0 is
the set of solutions is the union of two lines.
8.1 Standard Forms of Conics
Non-degenerate conics:
For a ̸= 0, b ̸= 0 ∈ R we have
(1) An ellipse
x22
x21
+
=1
a2
b2
x2
x1
(2) A hyperbola
x21
x2
− 22 = 1
2
a
b
x2
x1
46
8
Conics and Quadrics
Geometry & Linear Algebra
(3) An imaginary ellipse
x21
x22
−
=1
a2
b2
This defines an empty set (there are no solutions).
−
(4) A parabola
x2 = ax21 + b
x2
b
x1
Degenerate conics:
(5) Two lines with a common point
(
)(
)
x21
x22
x1
x2
x1
x2
−
=
0
=⇒
−
+
=0
a2
b2
a
b
a
b
x2
x1
(6) x21 = 0, a double line.
(7) A pair of two conjugate complex lines meeting in a real point
(
)(
)
x21
x22
x1
x2
x1
x2
+
=
0
=⇒
−
i
+
i
=0
a2
b2
a
b
a
b
The set of real solutions is one point (0, 0).
(8) Two parallel lines
x21
x1
=1 (
= ±1)
a2
a
x2
x1
47
8
Conics and Quadrics
Geometry & Linear Algebra
(9) A pair of parallel complex conjugate lines
There are no real solutions.
x1
a
x21
=1
a2
= ±i.
8.2 Reducing with Trigonometry
Our aim: To use translations and rotations in R2 to reduce any conic to one
of these standard types.
Why conics? These are all slices of the cone x² + y² = z²:
(figure: plane sections of the cone, giving a circle, an ellipse, a parabola and a hyperbola)
(i) z = constant =⇒ a circle / ellipse.
(ii) Substituting z = y + 1 into the equation =⇒ x² = 2y + 1, a parabola.
(iii) y = 1 (a vertical plane) =⇒ a hyperbola.
(iv) z = 0: x² + y² = 0, two conjugate complex lines with a common real point.
(v) y = 0: x² = z², two real lines.
Translations in R2:
Moving the coordinates by (d1, d2) has the effect that
x1 = y1 + d1,   x2 = y2 + d2
Rotations in R2:
Recall that rotating a point P = (p1, p2) to Q = (q1, q2) by φ about O is given by
( q1 ; q2 ) = ( cos φ −sin φ ; sin φ cos φ ) ( p1 ; p2 )
What is the effect of a rotation on coordinates? Rotate the (x1, x2) coordinate system by φ. The y1, y2 coordinates of Q are simply p1 and p2. Therefore the new coordinates y1, y2 and the old coordinates x1, x2 are related as follows:
( x1 ; x2 ) = ( cos φ −sin φ ; sin φ cos φ ) ( y1 ; y2 )
Let’s write quadratic equations in matrix form. Recall that At denotes the
transpose of A, i.e. the matrix whose i − j entry is the j − i entry of A.
Definition. A square matrix is called symmetric if At = A.
e.g.
( a b ; b a ),   ( a b c ; b d e ; c e f )
Recall that we proved that (AB)^t = B^t · A^t and (A + B)^t = A^t + B^t. Consider
A = ( a b/2 ; b/2 c )
Then
(x1, x2) ( a b/2 ; b/2 c ) ( x1 ; x2 ) = (x1, x2) ( ax1 + (b/2)x2 ; (b/2)x1 + cx2 )
= x1(ax1 + (b/2)x2) + x2((b/2)x1 + cx2)
= ax1² + bx1x2 + cx2²
and
(x1, x2) ( d ; e ) = dx1 + ex2 = (d, e) ( x1 ; x2 )
Therefore we can write ax1² + bx1x2 + cx2² + dx1 + ex2 + f as
x^t A x + (d, e)x + f
where x = ( x1 ; x2 ) and x^t = (x1, x2).
Theorem 8.1
Using a rotation and then translation we can always reduce any quadratic
equation to standard form.
Low level proof - Idea: Get rid of the x1x2 term using a rotation. If we can do that, then for some new coefficients a′, c′, d′, e′, f′ we obtain
a′y1² + c′y2² + d′y1 + e′y2 + f′
Then we complete the squares and reduce this equation to
a′′z1² + c′′z2² + f′′
or (when c′ = 0)
a′′z1² + e′′z2 + f′′
Proof. Rotate through φ, then
( x1 ; x2 ) = Rφ ( y1 ; y2 ) = ( cos φ −sin φ ; sin φ cos φ ) ( y1 ; y2 )
=⇒ x^t A x = (Rφ y)^t A (Rφ y) = y^t Rφ^t A Rφ y
We want the matrix Rφ^t A Rφ to be diagonal (then the y1y2 term disappears):
Rφ^t A Rφ = ( cos φ sin φ ; −sin φ cos φ ) ( a b/2 ; b/2 c ) Rφ
= ( a cos φ + (b/2) sin φ   (b/2) cos φ + c sin φ ; −a sin φ + (b/2) cos φ   −(b/2) sin φ + c cos φ ) ( cos φ −sin φ ; sin φ cos φ )
We need to know the off-diagonal terms. The 12 entry is
−a sin φ cos φ − (b/2) sin²φ + (b/2) cos²φ + c sin φ cos φ
= −(a/2) sin 2φ + (b/2) cos 2φ + (c/2) sin 2φ
= (1/2)(b cos 2φ − (a − c) sin 2φ)
If a ̸= c, then take φ such that tan 2φ = b/(a − c) (then this term is 0).
Lecture 17
If a = c, take φ = π/4 (again, the 12 entry is 0).
After this rotation x = Rφ y we have accomplished the first step. Then we obtain the equation
a′y1² + c′y2² + d′y1 + e′y2 + f
If a′ ̸= 0, c′ ̸= 0, we can complete the squares and reduce this to
a′z1² + c′z2² + f′′
with
y1 = z1 + α1,   y2 = z2 + α2
This is exactly a translation by (α1, α2). If a′ ̸= 0 but c′ = 0, we complete the square for y1 and do nothing for y2, and reduce to a′z1² + e′′z2 + f′′. We get a parabola or a degenerate conic.
”
Example 8.2. Reduce to standard form:
5x1² + 4x1x2 + 2x2² + x1 + x2 − 1 = 0
Step 1:
A = ( 5 2 ; 2 2 )
Use Rφ with tan 2φ = b/(a − c) = 4/3. We can take cos φ = 1/√5, sin φ = −2/√5, so that
( x1 ; x2 ) = (1/√5) ( 1 2 ; −2 1 ) ( y1 ; y2 )
Substituting into the equation we obtain
5x1² + 4x1x2 + 2x2² = y1² + 6y2²
The new equation is
y1² + 6y2² − (1/√5)y1 + (3/√5)y2 − 1 = 0
=⇒ (y1 − 1/(2√5))² + 6(y2 + 1/(4√5))² − 9/8 = 0
=⇒ z1² + 6z2² = 9/8
so this is an ellipse.
8.3 Reducing with Eigenvectors
There is a trigonometry-free method!
Note that for any rotation matrix we have Rφ^t = Rφ^{−1}. Indeed
Rφ = ( cos φ −sin φ ; sin φ cos φ ),   Rφ^t = ( cos φ sin φ ; −sin φ cos φ )
and
( cos φ sin φ ; −sin φ cos φ ) ( cos φ −sin φ ; sin φ cos φ ) = ( 1 0 ; 0 1 )
Definition. Let P be a square matrix of size n × n. Then P is called orthogonal if P^t · P = I. (In other words, P^{−1} exists and equals P^t.)
Since the rows of P^t are precisely the columns of P, the condition P^tP = I is equivalent to the following property:
ci(P) · cj(P) = 1 if i = j,   ci(P) · cj(P) = 0 if i ̸= j
In sheet 6 there is an exercise that says a 2 × 2 matrix P is orthogonal ⇐⇒ P is a rotation matrix or a reflection matrix, depending on whether det(P) = 1 or det(P) = −1. Note that Rφ^t A Rφ = Rφ^{−1} A Rφ; in the proof of Theorem 8.1 we wanted Rφ^t A Rφ to be diagonal.
New idea: Use diagonalisation. Let
P = ( v1 | v2 )
be the matrix whose columns v1, v2 are the eigenvectors of A.
Why do v1, v2 exist? Why is P a rotation matrix?
Theorem 8.3
Let A be any real symmetric 2 × 2 matrix
A = ( a b/2 ; b/2 c )   which is not of the form ( α 0 ; 0 α ) for any α ∈ R.
Then A has the following properties:
(i) A has two different real eigenvalues λ1, λ2, λ1 ̸= λ2.
(ii) If v1 is an eigenvector with eigenvalue λ1 and v2 is an eigenvector with eigenvalue λ2, then v1 · v2 = 0 (i.e. v1, v2 are perpendicular).
(iii) Up to swapping v1 and v2 and multiplying v1, v2 by scalars so that ||v1|| = ||v2|| = 1, the matrix
P = ( v1 | v2 )
is a rotation matrix.
Proof. (i) The characteristic polynomial is
det ( t − a  −b/2 ; −b/2  t − c ) = t² − (a + c)t + (ac − b²/4)
so
λ1, λ2 = (1/2)(a + c ± √((a + c)² − 4ac + b²)) = (1/2)(a + c ± √((a − c)² + b²))
Since A is not of the form ( α 0 ; 0 α ), we have either b ̸= 0 or a − c ̸= 0. Hence (a − c)² + b² > 0. Therefore the characteristic polynomial has two different real roots.
(ii) Let v1 , v2 be eigenvectors for λ1 , λ2 respectively. We need to show that
v1 · v2 = 0.
Consider v1t Av2 :
v1t Av2 = v1t (Av2 )
= v1t · λ2 v2
= λ2 (v1t · v2 )
But also
v1t Av2 = (v1t A)v2
= (v1t At )v2
= (Av1 )t v2
= (λ1 v1 )t v2
= λ1 v1t v2
Therefore λ2 (v1t v2 ) = λ1 (v1t v2 ). So (λ1 −λ2 )·(v1t v2 ) = 0, but from (i), λ1 ̸= λ2 ,
so v1t v2 = 0.
(iii) Multiply v1 and v2 by non-zero numbers so that ||v1|| = ||v2|| = 1. Clearly if
v1 = ( α ; β )   with α² + β² = 1
then α = cos φ, β = sin φ for some φ. Since v2 is also a unit vector and is perpendicular to v1, it is either
( −sin φ ; cos φ )   or   ( sin φ ; −cos φ )
Choose v2 = ( −sin φ ; cos φ ). Then
( v1 | v2 ) = ( cos φ −sin φ ; sin φ cos φ ) = Rφ
”
Observe that Rφ^t = Rφ^{−1} (any rotation matrix is an orthogonal matrix). From the diagonalisation theory we know that P^{−1}AP = diag(λ1, λ2), where P = ( v1 | v2 ). Now P = Rφ, so we have
Rφ^t A Rφ = Rφ^{−1} A Rφ = diag(λ1, λ2)
So if x = Rφ y, then
x^t A x = y^t Rφ^t A Rφ y = y^t Rφ^{−1} A Rφ y
=⇒ x^t A x = y^t diag(λ1, λ2) y = λ1 y1² + λ2 y2²
since
(y1, y2) ( λ1 0 ; 0 λ2 ) ( y1 ; y2 ) = λ1 y1² + λ2 y2²
Example 8.4. Same equation as before:
A = ( 5 2 ; 2 2 ),   t² − 7t + 6 = (t − 1)(t − 6)
so λ1 = 1, λ2 = 6. Let's find the first eigenvector (the second follows easily since it's perpendicular).
I − A = ( −4 −2 ; −2 −1 )   =⇒   v1 = (1/√5) ( 1 ; −2 )
(so that ||v1|| = 1). Then we don't even have to calculate v2: we can take
v2 = (1/√5) ( 2 ; 1 )
Check:
(1/√5) ( 1 2 ; −2 1 )
has determinant 1. This is Rφ from the beginning of the lecture.
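Aside (my addition, not in the notes): numpy.linalg.eigh performs exactly this diagonalisation of a symmetric matrix by an orthogonal matrix, so the example can be checked in a few lines. Flipping the sign of one column, if necessary, makes det(P) = 1, i.e. makes P a rotation.

import numpy as np

A = np.array([[5.0, 2.0], [2.0, 2.0]])
eigvals, P = np.linalg.eigh(A)      # orthonormal eigenvectors of a symmetric matrix
if np.linalg.det(P) < 0:
    P[:, 0] *= -1                   # make P a rotation matrix
print(eigvals)                      # [1. 6.]
print(np.round(P.T @ A @ P, 10))    # diag(1, 6): the x1 x2 cross term is gone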
Corollary 8.5. Any conic can be reduced, using a rotation x = Rφ y, to the following form:
λ1 y1² + λ2 y2² + d′y1 + e′y2 + f′
Here λ1, λ2 are the eigenvalues of A = ( a b/2 ; b/2 c ).
Obviously λ1λ2 = det(A) (since the characteristic polynomial is t² − (a + c)t + det(A) = (t − λ1)(t − λ2)).
Therefore if det(A) > 0, then λ1, λ2 have the same sign. We complete the squares and obtain
λ1 z1² + λ2 z2² = h,   where h ∈ R
• If h > 0, this is an ellipse.
• If h < 0, this is an imaginary ellipse.
• If h = 0, this is two complex conjugate lines meeting in one real point (0, 0).
Next, if det(A) < 0, completing the squares we obtain
λ1 z1² + λ2 z2² = h,   h ∈ R
• If h > 0 or h < 0, this is a hyperbola.
• If h = 0, this is a pair of real lines meeting at a point.
Finally, if det(A) = 0, then λ2 = 0, λ1 ̸= 0. Complete the square for y1, so we obtain
λ1 z1² + e′y2 + f′
• If e′ ̸= 0, we reduce to λ1 z1² + e′z2 = 0, which is a parabola.
• If e′ = 0, then λ1 z1² + f′ = 0, i.e. z1² = −f′/λ1. If the RHS is positive, these are two parallel real lines. If the RHS is zero, it's a double line. If the RHS is negative, there are no real solutions; this is two parallel complex conjugate lines.
Moral: If we can prove that our conic is non-empty and non-degenerate, then its type is uniquely determined by det(A):
• det(A) > 0 =⇒ ellipse.
• det(A) = 0 =⇒ parabola.
• det(A) < 0 =⇒ hyperbola.
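Aside (my addition): the moral can be packaged as a tiny classifier; conic_type is a hypothetical helper, and it assumes the conic has already been checked to be non-empty and non-degenerate.

def conic_type(a, b, c):
    # type of the non-empty, non-degenerate conic a x1^2 + b x1 x2 + c x2^2 + ... = 0
    det_A = a * c - (b / 2) ** 2    # det of the symmetric matrix ( a b/2 ; b/2 c )
    if det_A > 0:
        return "ellipse"
    if det_A < 0:
        return "hyperbola"
    return "parabola"

print(conic_type(5, 4, 2))   # the conic of Example 8.2: "ellipse"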
8.4 Quadric Surfaces
Definition. A quadric surface is a surface in R3 defined by an eqn of
degree 2:
ax1² + bx2² + cx3² + dx1x2 + ex1x3 + fx2x3 + gx1 + hx2 + jx3 + k = 0
Task: Use an orthogonal transformation, ideally a rotation, to reduce this
equation to one of standard form.
Non-degenerate forms
Here are the non-degenerate forms, with 0 ̸= a, b, c ∈ R:
(1) Ellipsoid:
x1²/a² + x2²/b² + x3²/c² = 1
(2) Hyperboloid of one sheet:
x1²/a² + x2²/b² − x3²/c² = 1
(3) Hyperboloid of two sheets:
−x1²/a² − x2²/b² + x3²/c² = 1
Lecture 19
(4) Elliptic paraboloid:
x3 = x1²/a² + x2²/b²
(5) Hyperbolic paraboloid:
x3 = x1²/a² − x2²/b²
Some examples of degenerate quadrics:
(6) Cone:
x1²/a² + x2²/b² − x3²/c² = 0
(7) Elliptic cylinder:
x1²/a² + x2²/b² = 1
Similarly there are hyperbolic and parabolic cylinders, etc. You will not be expected to know all the quadrics for the exam.
8.5 Reducing Quadrics
Step 1. Write the equation in matrix form. With
x = (x1, x2, x3)^t, the equation becomes x^t A x + (g, h, j)x + k = 0
where
A = ( a d/2 e/2 ; d/2 b f/2 ; e/2 f/2 c )
Step 2. Find the eigenvalues of A. In fact any symmetric 3 × 3 matrix has three real eigenvalues: det(tI − A) = (t − λ1)(t − λ2)(t − λ3) where λi ∈ R. (See sheet 8 for a proof.)
We've seen before that if λi ̸= λj and vi is an eigenvector for λi and vj is an eigenvector for λj, then vi is perpendicular to vj, i.e. vi · vj = 0. So if λ1 ̸= λ2 ̸= λ3, then we can find eigenvectors v1, v2, v3 of length 1: ||v1|| = ||v2|| = ||v3|| = 1. These vectors will automatically be perpendicular to each other. The same also works in the general case.
Step 3. Let
P = ( v1 | v2 | v3 )
be the 3 × 3 matrix whose columns are v1, v2, v3. Then P is an orthogonal matrix, i.e. P^tP = I, so P^t = P^{−1}, since ||vi|| = 1 for i = 1, 2, 3 and vi · vj = 0 when i ̸= j.
Step 4. Since P^{−1} = P^t, put x = P y where y = (y1, y2, y3)^t. The equation now looks as follows:
x^t A x = y^t P^{−1} A P y
But P^{−1}AP is the diagonal matrix
D = ( λ1 0 0 ; 0 λ2 0 ; 0 0 λ3 )
Therefore
x^t A x = y^t P^{−1}AP y = y^t diag(λ1, λ2, λ3) y = λ1 y1² + λ2 y2² + λ3 y3²
The whole equation now is
λ1 y1² + λ2 y2² + λ3 y3² + g′y1 + h′y2 + j′y3 + k = 0
If λ1λ2λ3 ̸= 0, complete the squares and get one of the two hyperboloids, an ellipsoid, or a degenerate quadric. If λ1λ2 ̸= 0, λ3 = 0, we get a paraboloid (one of the two types) or a degenerate quadric.
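Aside (my addition, not in the notes): Steps 1-4 translate directly into NumPy, since eigh returns an orthogonal matrix of eigenvectors for a symmetric matrix. reduce_quadric is an illustrative helper name; it returns the coefficients λ1, λ2, λ3 and the transformed linear part, as a sketch only.

import numpy as np

def reduce_quadric(a, b, c, d, e, f, g, h, j, k):
    # a x1^2 + b x2^2 + c x3^2 + d x1x2 + e x1x3 + f x2x3 + g x1 + h x2 + j x3 + k = 0
    A = np.array([[a, d / 2, e / 2],
                  [d / 2, b, f / 2],
                  [e / 2, f / 2, c]])
    lam, P = np.linalg.eigh(A)      # P orthogonal, P^t A P = diag(lam)
    lin = np.array([g, h, j]) @ P   # coefficients of y1, y2, y3 after x = P y
    return lam, lin, k

lam, lin, k = reduce_quadric(0, 0, 0, 2, 2, 0, 0, -1, 0, -1)   # Example 8.6 below
print(np.round(lam, 6))   # approximately -1.414214, 0, 1.414214, i.e. -√2, 0, √2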
Example 8.6. Reduce to standard form, using an orthogonal transformation and a translation:
2x1x2 + 2x1x3 − x2 − 1 = 0
1. x^t A x − x2 − 1 = 0, where
A = ( 0 1 1 ; 1 0 0 ; 1 0 0 )
2.
det ( t −1 −1 ; −1 t 0 ; −1 0 t ) = t³ − 2t = t(t − √2)(t + √2)
If this quadric is non-degenerate, it must be a hyperbolic paraboloid (since λ1 = 0, λ2 = √2 > 0 and λ3 = −√2 < 0).
3. Find the eigenvectors. For λ1 = 0, solve Ax = 0:
( 0 1 1 | 0 ; 1 0 0 | 0 ; 1 0 0 | 0 )
Solving, we find v1 = (1/√2)(0, 1, −1)^t.
For λ2 = √2, solve (√2 I − A)x = 0:
( √2 −1 −1 | 0 ; −1 √2 0 | 0 ; −1 0 √2 | 0 )
Solving, we find v2 = (1/2)(√2, 1, 1)^t.
λ3 = −√2 =⇒ v3 = (1/2)(−√2, 1, 1)^t.
4. Hence
P = ( 0 √2/2 −√2/2 ; 1/√2 1/2 1/2 ; −1/√2 1/2 1/2 )
With x = P y the equation becomes
y^t (P^tAP) y + (0, −1, 0)P y − 1 = √2 y2² − √2 y3² − (1/√2)y1 − (1/2)y2 − (1/2)y3 − 1
= √2(y2 − α)² − √2(y3 − β)² − (1/√2)(y1 − δ)
for some α, β, δ ∈ R. So this is a non-degenerate quadric, hence a hyperbolic paraboloid.
9 Geometry in R3
A vector x = (x1, x2, x3) ∈ R3 can be thought of as
• a point P
• the vector OP
• any vector AB of the same length and direction as OP.
Definition. The length of x = (x1, x2, x3) is
||x|| = √(x1² + x2² + x3²) = |OP|
Addition and scalar multiplication are componentwise:
(x1, x2, x3) + (y1, y2, y3) = (x1 + y1, x2 + y2, x3 + y3)
λ(x1, x2, x3) = (λx1, λx2, λx3)
Definition (Lines of R3 ). L = {a + λu : λ ∈ R} is a line in R3 .
Definition (Dot product). For x, y ∈ R3
x · y = x1 y1 + x2 y2 + x3 y3
Proposition 9.1. For x, y ∈ R3 we have x · y = ||x|| · ||y|| · cos θ
9.1 Planes
Let a be a point in the plane and let n be a normal vector for the plane.
x · y = 0 ⇐⇒ x and y are perpendicular. So x − a is perpendicular to
n ⇐⇒ (x − a) · n = 0. This is the equation of a plane.
Now u · (v + w) = u · v + u · w so (x − a) · n = x · n − a · n. So the equation of
the plane is
x·n=a·n
Write a · n = c, a scalar, so you get
x·n=c
Write n = (n1 , n2 , n3 ) then
n1 x 1 + n2 x 2 + n3 x 3 = c
Example 9.2. We need a and n. Take a = (e, e, 0) and n = (0, 0, π). Then (x − a) · n = 0, so x · n = a · n, and
a · n = (e, e, 0) · (0, 0, π) = 0
So the equation is
(x1, x2, x3) · (0, 0, π) = 0,   i.e. πx3 = 0   =⇒   x3 = 0
Lecture 20
Proposition 9.3.
(i) The intersection of two planes is
• Nothing
• A line
• A plane (if the two planes are the same)
(ii) If A, B, C are 3 points which don’t lie on a common line, then there is a
unique plane through A, B, C.
Proof. Problem sheet 7.
Let e1 = (1, 0, 0), e2 = (0, 1, 0) and e3 = (0, 0, 1), so (x1, x2, x3) = x1e1 + x2e2 + x3e3.
Definition (Vector product). The vector product (= cross product) of a and b is
a × b = det ( e1 e2 e3 ; a1 a2 a3 ; b1 b2 b3 ) = e1(a2b3 − a3b2) − e2(a1b3 − a3b1) + e3(a1b2 − a2b1)
= (a2b3 − a3b2, a3b1 − a1b3, a1b2 − a2b1)
Example 9.4.
(1, −1, 0) × (3, 0, 1) = det ( e1 e2 e3 ; 1 −1 0 ; 3 0 1 ) = e1(−1 · 1 − 0 · 0) − e2(1 · 1 − 0 · 3) + e3(1 · 0 − (−1) · 3) = (−1, −1, 3)
Proposition 9.5. a × b is perpendicular to both a and b.
Proof.
a · (a × b) = det ( a1 a2 a3 ; a1 a2 a3 ; b1 b2 b3 ) = 0
since this determinant has two equal rows. Similarly b · (a × b) = 0.
”
Proposition 9.6.
(i) e1 × e2 = e3, e2 × e3 = e1, e3 × e1 = e2
(ii) a × a = 0
(iii) b × a = −a × b
(iv) (λa) × b = λ(a × b) = a × (λb) for any scalar λ ∈ R
(v) a × (b + c) = a × b + a × c
Proof. Use properties of determinants:
(i) e1 × e2 = det ( e1 e2 e3 ; 1 0 0 ; 0 1 0 ) = e3 = (0, 0, 1)
(ii) With a = (a1, a2, a3),
a × a = det ( e1 e2 e3 ; a1 a2 a3 ; a1 a2 a3 ) = 0 = (0, 0, 0)
(iii)–(v) follow in the same way from properties of determinants.
”
Proposition 9.7. For any vectors a, b ∈ R3 we have
||a × b|| = ||a|| · ||b|| · |sin θ| = the area of the parallelogram built on a and b.
Lecture 21
Proof. The area of the parallelogram is ||a|| · ||h|| = ||a|| · ||b|| · |sin θ|, where h is the height. Squaring both sides of the claimed identity:
||a × b||² = (a2b3 − a3b2)² + (a3b1 − a1b3)² + (a1b2 − a2b1)²
and
||a||² · ||b||² · (1 − cos²θ) = ||a||²||b||² − (||a|| ||b|| cos θ)²
= (a1² + a2² + a3²)(b1² + b2² + b3²) − (a · b)²
= (a1² + a2² + a3²)(b1² + b2² + b3²) − (a1b1 + a2b2 + a3b3)²
= a1²b2² + a2²b1² − 2a1a2b1b2 + · · · + (similar terms)
Check that the two expressions are identical.
Remark. a × b is a vector perpendicular to the plane through O, a and b whose length equals the area of the parallelogram built on O, a, b.
Example 9.8. Find the area of the triangle in R3 with corners (1, 0, 0), (1, 1, 0), (0, 1, 1).
Answer: Area = (1/2)||a × b|| where a = (0, 1, 0) and b = (−1, 1, 1).
a × b = det ( e1 e2 e3 ; 0 1 0 ; −1 1 1 ) = e1 · 1 − e2 · 0 + e3 · 1 = e1 + e3 = (1, 0, 1)
Hence ||a × b|| = √2 =⇒ the area of the triangle is √2/2.
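Aside (my addition): np.cross computes the vector product, so this example is a one-liner to verify, assuming NumPy.

import numpy as np

a = np.array([1.0, 1.0, 0.0]) - np.array([1.0, 0.0, 0.0])   # (0, 1, 0)
b = np.array([0.0, 1.0, 1.0]) - np.array([1.0, 0.0, 0.0])   # (-1, 1, 1)
n = np.cross(a, b)
print(n, np.linalg.norm(n) / 2)   # [1. 0. 1.]  0.7071..., i.e. area sqrt(2)/2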
Volume
Proposition 9.9. The volume of the parallelepiped built on a, b, c equals
|a · (b × c)| = |det ( a1 a2 a3 ; b1 b2 b3 ; c1 c2 c3 )|
Proof. Let's show that vol = |a · (b × c)|. By the previous proposition the area of the base is ||b × c||, and b × c = ||b × c|| · n, where n is a unit normal to the plane through 0, b and c. Therefore
|a · (b × c)| = ||a|| · ||b × c|| · |cos θ|
where θ is the angle between a and n. But ||a|| · |cos θ| is the height of the parallelepiped. So we proved that vol = |a · (b × c)|.
Finally, expanding along the first row shows that
a · (b × c) = det ( a1 a2 a3 ; b1 b2 b3 ; c1 c2 c3 )
”
Properties of the scalar triple product
Proposition 9.10. For any a, b, c ∈ R3 we have
(i) a · (b × c) = c · (a × b) = b · (c × a)
(ii) a, b, c are coplanar (i.e. 0, a, b, c belong to a plane) if and only if
det ( a1 a2 a3 ; b1 b2 b3 ; c1 c2 c3 ) = 0
Proof. (i)
det ( a1 a2 a3 ; b1 b2 b3 ; c1 c2 c3 ) = −det ( a1 a2 a3 ; c1 c2 c3 ; b1 b2 b3 ) = det ( c1 c2 c3 ; a1 a2 a3 ; b1 b2 b3 )
Hence a · (b × c) = c · (a × b). Similarly we can prove that c · (a × b) = b · (c × a).
(ii) a, b, c are coplanar if and only if the volume of the parallelepiped built on a, b, c is 0 ⇐⇒ det = 0.
”
Alternative proof of (ii). Any plane through 0 can be given by
m1x1 + m2x2 + m3x3 = 0
(with at least one coefficient mi ̸= 0). Then
( a1 a2 a3 ; b1 b2 b3 ; c1 c2 c3 ) ( m1 ; m2 ; m3 ) = 0
since a, b, c are contained in this plane. This system of linear equations has a non-zero solution, therefore det = 0.
This argument can be reversed: conversely, if det = 0, then there is a non-zero solution, which gives a plane through 0 containing a, b, c.
”
Example 9.11. Are the points (1, 0, 0), (1, 1, 1), (1, 2, 3), (2, 3, 4) in the same plane?
Let a = (0, 1, 1), b = (0, 2, 3), c = (1, 3, 4) (the differences with the first point).
det ( 0 1 1 ; 0 2 3 ; 1 3 4 ) = det ( 1 1 ; 2 3 ) = 1
This is not 0, hence the four points do not belong to a common plane.
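Aside (my addition): the same test in NumPy, taking differences with the first point and computing the 3 × 3 determinant (the scalar triple product).

import numpy as np

p0, p1, p2, p3 = map(np.array, [(1, 0, 0), (1, 1, 1), (1, 2, 3), (2, 3, 4)])
M = np.array([p1 - p0, p2 - p0, p3 - p0], dtype=float)   # rows a, b, c
print(round(np.linalg.det(M)))   # 1, non-zero, so the points are not coplanar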
What is (a × b) × c?
We know that this vector is perpendicular to a × b, hence (a × b) × c lies in the plane through 0, a, b, so the triple vector product (a × b) × c = k1 a + k2 b for some scalars k1, k2.
Proposition 9.12. For a, b, c ∈ R3:
Lecture 22
(i) (a × b) × c = (a · c)b − (b · c)a
(ii) a × (b × c) = (a · c)b − (a · b)c
Remark. The vector product is not associative!
Proof. (i) a × b = (a2b3 − a3b2, a3b1 − a1b3, a1b2 − a2b1). The x1 coordinate of (a × b) × c is (a3b1 − a1b3)c3 − (a1b2 − a2b1)c2. The x1 coordinate of the RHS is (a1c1 + a2c2 + a3c3)b1 − (b1c1 + b2c2 + b3c3)a1. Expanding, LHS = RHS. Similarly, check that the x2 and x3 coordinates of the LHS and RHS agree.
(ii)
a × (b × c) = −(b × c) × a = (c × b) × a = (c · a)b − (b · a)c = (a · c)b − (a · b)c
”
Remark. Use this proposition to establish the Jacobi identity:
(a × b) × c + (c × a) × b + (b × c) × a = 0
This, together with the property a × b = −b × a, turns R3 into a Lie algebra.
Example 9.13. Let a, b, c, d ∈ R3. What is (a × b) · (c × d)?
Answer: Set u = a × b. We proved that
u · (c × d) = d · (u × c)   (Prop 9.10(i))
= d · ((a × b) × c)
= d · ((a · c)b − (b · c)a)   (Prop 9.12(i))
= (a · c)(b · d) − (b · c)(a · d)
Proposition 9.14. Let a ∈ R3, a ̸= 0. Then b × a = 0 if and only if b = λa for some λ ∈ R.
Proof. If b = λa then b × a = λ(a × a) = 0. Conversely, assume that b × a = 0. Then certainly a × (b × a) = a × 0 = 0. But a × (b × a) = (a · a)b − (a · b)a. We know that this is zero, and (a · a) = ||a||² ̸= 0 since a ̸= 0, so b = ((a · b)/||a||²) a.
”
Proposition 9.15. Let a, b ∈ R3 . Assume a ̸= 0. Then the set of x ∈ R3
such that (x − b) × a = 0 is the line {b + λa | λ ∈ R}.
Proof. (x − b) × a = 0 ⇐⇒ x − b = λa for some λ ∈ R by the previous
proposition.
”
9.2 Rotations in R3
The axis of rotation is the line in R3 such that every point of this line is fixed
by the rotation. We write a rotation by a 3 × 3 matrix P such that a vector
(x1 , x2 , x3 )t goes to P (x1 , x2 , x3 )t .
Claim:
(i) P is an orthogonal matrix, i.e. P t = P −1
(ii) Any orthogonal matrix has determinant 1 or −1. If det P = 1, then this
is a rotation matrix.
(iii) The axis is {λv | λ ∈ R}, where v is an eigenvector of P with eigenvalue
1, i.e. P v = v
(iv) To find the angle, we take any unit vector w perpendicular to the axis,
i.e. w · v = 0. Then calculate w · P w = ||w|| · ||P w|| cos θ = cos θ.
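Aside (my addition, a sketch of the claims above, assuming NumPy): the axis is an eigenvector for eigenvalue 1 and the angle comes from comparing a perpendicular unit vector with its image.

import numpy as np

P = np.array([[0.0, -1.0, 0.0],     # rotation by 90 degrees about the x3-axis
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
eigvals, eigvecs = np.linalg.eig(P)
axis = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])   # eigenvector for eigenvalue 1
axis = axis / np.linalg.norm(axis)

w = np.array([1.0, 0.0, 0.0])
w = w - (w @ axis) * axis           # make w perpendicular to the axis
w = w / np.linalg.norm(w)
angle = np.arccos(np.clip(w @ (P @ w), -1.0, 1.0))
print(axis, np.degrees(angle))      # [0. 0. 1.] 90.0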
10 Vector spaces
We’ve seen R2 and R3 . Similarly, we can work with Rn = {(x1 , . . . , xn ) | xi ∈
R}. In the same way we can do linear algebra in Cn = {(x1 , . . . , xn ) | xi ∈ C}.
In this case “scalars” mean “complex numbers”. The same methods work
equally well for binary vectors.
Notation. F2 = {0, 1} is the binary field such that:
0+0=1+1=0
1+0=0+1=1
0×0=1×0=0×1=0
1×1=1
So we have three notions of scalars: R, C, F2 .
What is (F2 )n ?
(F2 )n = {(x1 , . . . , xn ) | xi ∈ F2 }
(x1 , . . . , xn ) + (y1 , . . . , yn ) = (x1 + y1 , x2 + y2 , . . . , xn + yn )
Here + stands for binary addition, e.g. (1, 0) + (1, 1) = (0, 1).
If λ ∈ F2 , then λ(x1 , . . . , xn ) = (λx1 , . . . , λxn ).
Now let F be one of R, C or F2 . I will refer to F as a field.
Rules satisfied by addition of vectors
For any u, v, w ∈ V (here V is Rn, Cn or (F2)n), we have
A1 u + (v + w) = (u + v) + w
A2 u + 0 = 0 + u = u, where 0 = (0, 0, . . . , 0)
A3 u + (−u) = 0, where for u = (u1, . . . , un) we set −u = (−u1, . . . , −un)
A4 u + v = v + u
Rules satisfied by scalar multiplication
For any λ, µ ∈ F and any u, v ∈ V we have:
S1 λ(u + v) = λu + λv
S2 (λ + µ)u = λu + µu
S3 λ(µu) = (λµ)u
S4 1u = u
Lecture 23
Definition (Vector space V over F). Let F be a field (i.e. R, C or F2) and let V be a set, the elements of which are called vectors. Suppose we have addition in V, i.e. a function V × V → V, and multiplication by a scalar, i.e. a function F × V → V, and an element of V called zero, such that the properties A1-A4 and S1-S4 all hold. Then V is a vector space over the field F.
Examples 10.1.
(1) Rn , Cn , (F2 )n
(2) Let S be any set. Consider all functions S → R. Let's define the
structure of a vector space on Functions(S, R).
If f1 : S → R and f2 : S → R are functions, then f1 + f2 is a function
such that for any element s ∈ S, we have (f1 + f2 )(s) = f1 (s) + f2 (s). The
zero element of Functions(S, R) is the zero function, i.e. the function that
is equal to zero ∀s ∈ S. If f ∈ Functions(S, R) then −f is the function such
that (−f )(s) = −f (s), ∀s ∈ S.
It’s clear that A1-4 hold. Now lets define scalar multiplication: λ ∈ R,
f ∈ Functions(s, R) then λf is the function such that (λf )(s) = λ · f (s), ∀s.
Check that S1-4 hold; e.g. let's check S3: λ(µf) is the function such that ∀s ∈ S we have (λ(µf))(s) = λµf(s), which is the same as the value of the function (λµ)f at s.
(3) Similarly the set of functions from S to F is a vector space over F .
(4) Let's be more specific. Let S = R and consider Functions(R, R). Consider just the polynomial functions: these form a subset of the set of all functions R → R. The sum of two polynomials is again a polynomial, and a scalar multiple of a polynomial is also a polynomial. So the set of polynomials is a subspace of the vector space of all functions R → R.
Polynomials inherit addition and scalar multiplication from Functions(R, R),
they are closed under these operations. Hence
R[x] = {the set of all polynomials with real coefficients}
is a vector space over R. We can also check A1-4 and S1-4 for polynomials
directly.
(5) Now consider polynomials of degree up to d, where d is a fixed positive
integer. This set is closed under addition and scalar multiplication so it’s
also a vector space over R.
e.g. d = 0 is the space of polynomials of degree 0 = R. d = 1 is ax + b. So
addition is (ax+b)+(a′x+b′) = (a+a′)x+(b+b′) and multiplication is λ(ax+
b) = λax + λb. So the space of linear polynomials can be identified with R2 .
Similarly, the space of quadratic polynomials = {ax2 + bx + c | a, b, c ∈ R}
can be identified with R3 be associating to ax2 + bx + c the vector (a, b, c).
Notation. V × V = {(v, v ′ ) | v, v ′ ∈ V }. V × V → V means a function
from pairs of elements of V goes to V . For example addition of vectors is such
an operation. F × V = {(λ, v) | λ ∈ F, v ∈ V }. So scalar multiplication is a
function F × V → V .
Let’s deduce some consequences of the axioms A1-4, S1-4.
Proposition 10.2. Let F be a field (i.e. F = R, C or F2 ). Let V be a vector
space over F . For any v ∈ V and any λ ∈ F we have:
(i) 0v = 0
(ii) λ0 = 0
(iii) If λ ∈ F , λ ̸= 0, then λv = 0 only if v = 0.
(iv) (−λ)v = λ(−v) = −(λv)
(v) (−1)v = −v.
Proof.
(i) In F we have 0 = 0 + 0. Therefore 0v = (0 + 0)v = 0v + 0v. Let's add −0v to both sides: 0v − 0v = (0v + 0v) − 0v. By A1, 0v − 0v = 0v + (0v − 0v), and by A3, 0 = 0v. (∗)
(ii) We need to prove λ0 = 0. By A2, 0 = 0 + 0. Then λ0 = λ(0 + 0). By S1, λ0 = λ0 + λ0. Add −λ0 to both sides: λ0 − λ0 = (λ0 + λ0) − λ0. By (∗), 0 = λ0.
(iii) λ−1 ∈ F =⇒ λ−1 (λv) = λ−1 0 = 0. By S3 and S4 (λ−1 λ)v = 1 · v = v =
0.
”
(iv), (v) exercise.
10.1 Subspaces
Definition. Let V be a vector space over F . Let W be a subset of V .
Then W is called a vector subspace of V when W itself is a vector space over F with the same addition and scalar multiplication as in V.
Proposition 10.3. Let W be a subset of a vector space V over F. Then W is a subspace if the following conditions hold:
(i) W contains 0 (zero vector of V )
(ii) For any w1 , w2 ∈ W, w1 + w2 ∈ W (W is closed under vector addition)
(iii) For any w ∈ W and λ ∈ F , λw ∈ W (W is closed under scalar multiplication)
Remark. (i) is a consequence of (ii) and (iii). So to check that W ⊂ V is a subspace, it is enough to check (ii) and (iii). Indeed, suppose W is closed under addition and multiplication by scalars. Take any w ∈ W, w ̸= 0. Then (−1)w = −w ∈ W and w + (−w) = 0 ∈ W (because W is closed under addition).
Lecture 24
Proof. We must check that A1-A4 and S1-S4 hold in W .
−w ∈ W because −w = (−1)w. W contains 0, for any w ∈ W , −w ∈ W ,
closed under addition and scalar multiplication. All axioms hold in W because
they hold in V by assumption.
”
Examples of subspaces:
(i) Trivial subspaces: V is its own subspace. Also {0} is a subspace (closed
by Prop 10.1 and 10.2).
(ii) Polynomials with real coefficients R[x] is a subspace of Functions(R, R).
(iii) Polynomials of degree at most d is a subspace of Functions(R, R) and
equally a subspace of R[x].
Non-examples
(i) The upper half plane {(x1 , x2 ) | x1 ≥ 0, x1 , x2 ∈ R} is not a subspace of
R2 as it fails to be closed with negative scalar multiplication.
(ii) The co-ordinate cross, i.e. the union of the x1 -axis and the x2 -axis is
not a subspace of R2 . It fails vector addition.
In this lecture V is always a vector space over F , where F = R, C or F2 .
Every vector space V contains “trivial” subspaces {0} and V itself. For example, R contains only trivial subspaces. Indeed, if W ̸= {0}, then ∃ w ∈
W, w ̸= 0. Then R = {λw | λ ∈ R}.
In R2 there are non-trivial subspaces, e.g. take x = (x1, x2) ̸= 0. Then {λx | λ ∈ R} is
a vector subspace. Of course {λx | λ ∈ R} is a line through zero. So any line
through zero in R2 is a vector subspace of R2 . These are given by ax1 +bx2 = 0.
Proposition 10.4. Let A be a matrix of size m × n with entries in F. Then the set of vectors x ∈ F n such that Ax = 0 is a vector subspace of F n.
Proof. By Prop 10.3 we need to check that if Ax = 0 and Ay = 0, then A(x + y) = 0. This is clear. Similarly, if Ax = 0, then for any λ ∈ F we have A(λx) = λAx = 0.
”
Definition. Let V be a vector space over F , and let v1 , . . . , vn ∈ V . The
vector λ1 v1 + · · · + λn vn is called a linear combination of v1 , . . . , vn . Here
λ1 , . . . , λn ∈ F are arbitrary scalars.
Remark. Clearly λ1 v1 + · · · + λn vn ∈ V for any λ1 , . . . , λn ∈ F .
Example 10.5 (R2 ). It is easy to see that if x, y ∈ R2 are not multiples of
each other, then R2 = {λx + µy | λ, µ ∈ R}
Lecture 25
Definition. Let v1 , . . . , vn be vectors in V . Then the set of all linear
combinations of v1 , . . . , vn is called the linear span of v1 , . . . , vn . This is
denoted by Sp(v1 , . . . , vn ).
Proposition 10.6. For any v1, . . . , vn ∈ V the linear span Sp(v1, . . . , vn) is a vector subspace of V.
Proof. Again, by Prop 10.3 we need only check that Sp(v1, . . . , vn) is closed under addition and under multiplication by scalars. Indeed
(λ1v1 + · · · + λnvn) + (µ1v1 + · · · + µnvn) = (λ1 + µ1)v1 + · · · + (λn + µn)vn
Also for µ ∈ F we have µ(λ1v1 + · · · + λnvn) = (µλ1)v1 + · · · + (µλn)vn. ”
Remark. Let b = (b1, . . . , bm)^t ∈ F m. When is b an element of Sp(v1, . . . , vn), where vi ∈ F m?
Consider the m × n matrix
A = ( v1 | . . . | vn )
Then b ∈ Sp(v1, . . . , vn) if and only if b = x1v1 + · · · + xnvn for some x1, . . . , xn ∈ F. Equivalently,
( b1 ; . . . ; bm ) = ( v1 | . . . | vn ) ( x1 ; . . . ; xn ) = Ax
for some x ∈ F n.
Example 10.7. V = R3. Consider
v1 = (1, 2, 3)^t,   v2 = (−1, 0, 1)^t,   v3 = (3, 1, −1)^t
What is Sp(v1, v2, v3)?
Using the remark, we know that b ∈ Sp(v1, v2, v3) if and only if Ax = b has a solution.
The augmented matrix of Ax = b is:
( 1 −1 3 | b1 ; 2 0 1 | b2 ; 3 1 −1 | b3 )
Using Gaussian elimination we get
→ ( 1 −1 3 | b1 ; 0 2 −5 | b2 − 2b1 ; 0 4 −10 | b3 − 3b1 )
→ ( 1 −1 3 | b1 ; 0 2 −5 | b2 − 2b1 ; 0 0 0 | b1 − 2b2 + b3 )
If b1 − 2b2 + b3 ̸= 0, then there are no solutions. If b1 − 2b2 + b3 = 0, we have solutions (one free variable =⇒ infinitely many).
Conclusion: b ∈ Sp(v1, v2, v3) if and only if b1 − 2b2 + b3 = 0. In other words, Sp(v1, v2, v3) is a plane in R3 given by the equation x1 − 2x2 + x3 = 0.
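Aside (my addition): membership in a span is just solvability of Ax = b, which can be tested by comparing ranks; in_span is a hypothetical helper, assuming NumPy.

import numpy as np

A = np.column_stack([(1, 2, 3), (-1, 0, 1), (3, 1, -1)]).astype(float)

def in_span(b):
    # b in Sp(v1, v2, v3)  iff  rank [A | b] = rank A
    return np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A)

print(in_span(np.array([1.0, 1.0, 1.0])))   # True:  1 - 2(1) + 1 = 0
print(in_span(np.array([1.0, 0.0, 0.0])))   # False: 1 - 0 + 0 is not 0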
Example 10.8. Same v1, v2 but now let v3 = (3, 1, 0)^t.
What is Sp(v1, v2, v3)?
We need to solve Ax = b, where the augmented matrix is
( 1 −1 3 | b1 ; 2 0 1 | b2 ; 3 1 0 | b3 )
Once again, follow the same procedure:
→ ( 1 −1 3 | b1 ; 0 2 −5 | b2 − 2b1 ; 0 4 −9 | b3 − 3b1 )
→ ( 1 −1 3 | b1 ; 0 2 −5 | b2 − 2b1 ; 0 0 1 | b1 − 2b2 + b3 )
The system has a solution for any choice of b1, b2, b3 (and there is exactly one solution!). So Sp(v1, v2, v3) = R3.
Definition. Let V be a vector space over F , and let W ⊂ V be a subspace.
If W = Sp(v1 , . . . , vn ), then v1 , . . . , vn is called a spanning set for W . Then
we say that v1 , . . . , vn spans W .
Example 10.9. The vectors e1 = (1, 0, . . . , 0), . . . , en = (0, 0, . . . , 1) span
F n . Indeed (x1 , . . . , xn ) = x1 e1 + · · · + xn en .
Go back to the first example: the plane x1 − 2x2 + x3 = 0 is Sp(v1, v2, v3). In fact, this plane is also Sp(v1, v2). The reason for this is that v3 is a linear combination of v1 and v2. Indeed, with
v1 = (1, 2, 3)^t,   v2 = (−1, 0, 1)^t,   v3 = (3, 1, −1)^t
we have
v3 = (1/2)v1 − (5/2)v2
Therefore λ1v1 + λ2v2 + λ3v3 = (λ1 + λ3/2)v1 + (λ2 − (5/2)λ3)v2.
Definition. Let V be a vector space over F . Then vectors v1 , . . . , vn are
linearly dependent if there exists λ1 , . . . , λn ∈ F not all equal to zero such
that λ1 v1 + · · · + λn vn = 0 ∈ V .
The motivation for this is when v1 , . . . , vn are linearly dependent, one of these
vectors is a linear combination of all the others.
Indeed some λi ̸= 0. To fix ideas assume λ1 ̸= 0. Then
v1 = −(1/λ1)(λ2v2 + · · · + λnvn)
In this case Sp(v1, . . . , vn) = Sp(v2, . . . , vn).
Definition. v1 , . . . , vn are linearly independent if the only way to write
the zero vector as a linear combination of v1 , . . . , vn is with all coefficients
equal to 0, i.e. if 0 = λ1 v1 + . . . λn vn =⇒ λ1 = · · · = λn = 0.
Remarks.
(i) One vector v ∈ V is linearly independent (assuming v ̸= 0, i.e. v is a
non-zero vector)
(ii) Two vectors v1, v2 ∈ V are linearly independent if and only if they are not proportional: if λ1v1 + λ2v2 = 0 and λ1 ̸= 0, then v1 = −(λ2/λ1)v2; if λ2 ̸= 0, then v2 = −(λ1/λ2)v1.
(iii) Let V = F m. Suppose v1, . . . , vn ∈ V. Consider the m × n matrix A such that v1, . . . , vn are the columns of A:
A = ( v1 | . . . | vn )
Then v1, . . . , vn are linearly dependent if and only if the equation Ax = 0 has a non-zero solution.
Clearly if x = (x1, . . . , xn)^t is a solution, then x1v1 + · · · + xnvn = 0. If Ax = 0 =⇒ x = 0, then v1, . . . , vn are linearly independent. If Ax = 0 has a non-zero solution then v1, . . . , vn are linearly dependent.
Lecture 26
Proposition 10.10. Any subset of a linearly independent set of vectors is linearly independent.
Proof. Suppose that v1 , . . . , vn are linearly independent. Without loss of generality, we can assume that our subset is v1 , . . . , vk for some k ≤ n. If v1 , . . . , vk
are linearly dependent, then we can find λ1 , . . . , λn ∈ F (scalars) such that
λ1 v1 + . . . λk vk = 0, and not all of λ1 , . . . , λk are 0.
Consider
λ1 v1 + · · · + λk vk + 0vk+1 + · · · + 0vn = 0
This is a linear combination of all v1 , . . . , vn with at least one non-zero coefficient. Hence v1 , . . . , vn are linearly dependent. This is a contradiction,
therefore v1 , . . . , vk are linearly independent.
”
Proposition 10.11. Let V be a vector space over F . Let v1 , . . . , vn ∈ V .
Then v1 , . . . , vn are linearly dependent if and only if one of them is a linear
combination of the others.
Proof. Suppose v1 , . . . , vn are linearly dependent. Then we can find λ1 , . . . , λn
such that λ1 v1 + · · · + λn vn = 0 and one of λi s, say λ1 is non-zero. Then
v1 = −
1
(λ2 v2 + . . . λn vn )
λ1
Conversely if v1 = µ2 v2 + . . . µn vn where µi ’s ∈ F , then
(−1)v1 + µ2 v2 + · · · + µn vn = 0
(−1) ̸= 0, so at least one coefficient is non-zero. Hence v1 , . . . , vn are linearly
dependent.
”
Definition. Let V be a vector space over F . A set of vectors v1 , . . . , vn ∈ V
is called a basis of V if
(i) V = Sp(v1 , . . . , vn )
(ii) v1 , . . . , vn are linearly independent.
In other words, every vector of V is a linear combination of v1 , . . . , vn (this is
property (i)), and no vi is a linear combination of v1 , . . . , vi−1 , vi+1 , . . . , vn .
Examples 10.12.
(i) V = F n . Then e1 = (1, 0, . . . , 0), e2 = (0, 1, . . . , 0), . . . , en = (0, 0, . . . , 1)
is a basis of F n .
(x1 , . . . , xn ) = x1 e1 + . . . xn en , so Sp(e1 , . . . , en ) = F n . Clearly e1 , . . . , en
are linearly independent: λ1 e1 + · · · + λn en = 0. This says (λ1 , λ2 , . . . , λn )
is the zero vector. Hence λi = 0, i = 1, . . . , n. So e1 , . . . , en is a basis.
75
10
Vector spaces
Geometry & Linear Algebra

Warning. There are many other bases! e.g. R2 has a basis (1, 1) and
(1, 3). Indeed (1, 1) and (1, 3) are not proportional, hence are linearly
independent.
(ii) Let V be the vector space of polynomials in x of degree at most d:
V = {fd x^d + · · · + f1 x + f0 | fi ∈ F}
fd x^d + · · · + f0 is a linear combination of 1, x, . . . , x^d with coefficients f0, . . . , fd. But 1, x, . . . , x^d are linearly independent vectors of V: if λ0 + λ1x + · · · + λd x^d = 0 (the zero polynomial), then λ0 = · · · = λd = 0. Conclusion: 1, x, . . . , x^d is a basis of the vector space of polynomials of degree ≤ d.
Other bases exist: e.g. 1, x + 1, x² + 1, . . . , x^d + 1 is also a basis.
Remark. Let V = F n . Then v1 , . . . , vn ∈ V form a basis of V if and only if
the matrix equation Ax = 0 has only one solution x = 0. Here


| ... |
A = v1 . . . vn 
| ... |
We’ve seen that v1 , . . . , vn are linearly independent if the only solution is the
zero solution. We still need to justify that if Ax = 0 has only one solution,
then v1 , . . . , vn span V .
When do v1, . . . , vn ∈ F n form a basis?
Consider
A = ( v1 | . . . | vn )
Answer: v1, . . . , vn are a basis of F n if and only if the equation Ax = 0, where x = (x1, . . . , xn)^t, has the unique solution 0; equivalently, det(A) ̸= 0.
Indeed, x is a solution ⇐⇒ x1v1 + · · · + xnvn = 0, so v1, . . . , vn are linearly independent exactly when the only solution is x1 = · · · = xn = 0. And det(A) ̸= 0 =⇒ A^{−1} exists.
To prove that F n = Sp(v1, . . . , vn), we must observe that any vector b = (b1, . . . , bn)^t can be written as b = Ac for some c = (c1, . . . , cn)^t. This is so because b = Ac says that b = c1v1 + · · · + cnvn. Take c = A^{−1}b. Then b = Ac as required.
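Aside (my addition): over R this criterion is a one-line check; is_basis is an illustrative helper, assuming NumPy, with a small tolerance in place of exact arithmetic.

import numpy as np

def is_basis(*vectors):
    # n vectors form a basis of R^n iff det of the matrix with these columns is non-zero
    A = np.column_stack(vectors).astype(float)
    return A.shape[0] == A.shape[1] and abs(np.linalg.det(A)) > 1e-10

print(is_basis((1, 1), (1, 3)))                        # True, cf. the warning above
print(is_basis((1, 2, 3), (-1, 0, 1), (3, 1, -1)))     # False: they only span a plane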
Proposition 10.13. Let V be a vector space over F with a basis {v1, . . . , vn}. Then any vector v ∈ V is uniquely written as v = λ1v1 + · · · + λnvn.
Lecture 27
Proof. Suppose that
v = λ1v1 + · · · + λnvn
and
v = µ1v1 + · · · + µnvn
Then (µ1 − λ1)v1 + · · · + (µn − λn)vn = 0. Since v1, . . . , vn is a basis, these vectors are linearly independent. By the definition of linear independence, we must have λ1 = µ1, . . . , λn = µn.
”
Theorem 10.14
Let V be a vector space over F. Suppose that {v1, . . . , vn} and {w1, . . . , wm} are both bases of V. Then m = n.
- So any basis of V has the same number of elements.
Proof. V = Sp(w1, . . . , wm) and V = Sp(v1, . . . , vn). Thus we can write
vi = ∑_{j=1}^{m} aij wj   for some aij ∈ F
Let A be the n × m matrix with entries aij. Similarly we write
wj = ∑_{k=1}^{n} bjk vk   for some bjk ∈ F
and let B be the m × n matrix with entries bjk. Therefore
vi = ∑_{j=1}^{m} aij ( ∑_{k=1}^{n} bjk vk ) = ∑_{k=1}^{n} ( ∑_{j=1}^{m} aij bjk ) vk   (∗)
By Proposition 10.13 there is only one way of writing vi as a linear combination of v1, . . . , vn, namely vi = 0·v1 + · · · + 1·vi + · · · + 0·vn. Hence the RHS of (∗) is simply vi. In other words, we have
∑_{j=1}^{m} aij bjk = 1 if k = i   and   ∑_{j=1}^{m} aij bjk = 0 if k ̸= i
These sums are exactly the entries of AB: the ik-entry of AB is ∑_{j=1}^{m} aij bjk. This means AB = In, the identity matrix of size n × n.
By swapping the roles of the vi's and wj's, we show similarly that BA = Im, the identity matrix of size m × m. Thus AB = In and BA = Im.
Recall that the trace of a square matrix is the sum of its diagonal entries.
Fact: Tr(AB) = Tr(BA).
This is enough to finish the proof because Tr(In) = n and Tr(Im) = m.
Proof of Fact. The ii-entry of AB is ∑_{j=1}^{m} aij bji, so
Tr(AB) = ∑_{i=1}^{n} ∑_{j=1}^{m} aij bji
The jj-entry of BA is ∑_{i=1}^{n} bji aij, so
Tr(BA) = ∑_{j=1}^{m} ∑_{i=1}^{n} bji aij = ∑_{i=1}^{n} ∑_{j=1}^{m} aij bji = Tr(AB)
This proves the fact, and hence the theorem.
”
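Aside (my addition): the Fact holds for any pair of rectangular matrices for which both products are defined; a quick numerical spot check, assuming NumPy.

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))
B = rng.standard_normal((5, 3))
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))   # True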
Definition. Suppose a vector space V over F has a basis consisting of n vectors. Then n is called the dimension of V, written dim V.
If V has a basis (of finitely many elements), then V is called a finite-dimensional vector space.
Example 10.15. Let V be the space of all polynomials in x with real
coefficients.
Claim: V is not finite-dimensional.
Proof. Suppose that V is spanned by finitely many polynomials, say f1 (x), . . . , fn (x).
Let d be a natural number so that d > deg fi (x) with i = 1, . . . , n. Then xd
is not a linear combination of f1 (x), . . . , fn (x).
”
Theorem 10.16
Let V be a vector space. Suppose that v1 , . . . , vn ∈ V span V . Then
(i) dimV ≤ n
(ii) There is a subset of {v1 , . . . , vn }, which is a basis of V .
Note that (ii) =⇒ (i). Indeed dimV = number of elements in any basis of
V , so dimV ≤ n, because removing some of the vectors from {v1 , . . . , vn } we
obtain a basis of V . Hence (ii) =⇒ (i).
Proof of (ii). We use the casting out process.
Start with v1, . . . , vn. If v1, . . . , vn are linearly independent, then {v1, . . . , vn} is a basis. If v1, . . . , vn are linearly dependent, then we find λ1, . . . , λn ∈ F such that λ1v1 + · · · + λnvn = 0 and at least one coefficient λi ̸= 0. Say λn ̸= 0. Then
vn = −(1/λn)(λ1v1 + · · · + λn−1vn−1)
Therefore V is spanned by v1, . . . , vn−1, i.e. V = Sp(v1, . . . , vn) = Sp(v1, . . . , vn−1).
Continuing like this, we stop when we are left with a linearly independent set of vectors. At each step we ensure that the remaining vectors still span V. Hence we have obtained a basis of V.
”
We wanted to classify all vector subspaces of R3 . dim(R3 ) = 3, and dim({0}) = 0
by definition. (We define the dimension of {0} to be zero.)
If V ⊂ R3 is a vector space, then dim(V ) can be 0, 1, 2, or 3. If dim(V ) = 3,
then V has a basis of 3 vectors, but then it’s a basis of R3 . Hence V = R3 .
Felina. Finally we look at an application to error-correcting codes:
10.2 * Error correcting codes *
Now F = F2 = {0, 1} with binary addition and multiplication. 1 + 1 = 0.
Consider the n-dimensional vector space (F2 )n = {(x1 , . . . , xn ) | xi ∈ F2 }.
We've seen that every vector subspace V of (F2)^n is a vector space of dimension m over F2, where 0 ≤ m ≤ n.
Let v1, . . . , vm be a basis of V, where m = dim(V). We know that every vector v ∈ V is uniquely written as
v = λ1v1 + · · · + λmvm
where λi ∈ F2, i.e. λi = 0 or λi = 1.
The number of elements in (F2)^n is 2^n. If V ⊂ (F2)^n has dim(V) = m, then |V| = 2^m.
Recall that if v = (x1, . . . , xn) and w = (y1, . . . , yn) in Rn, then the distance between v and w is
√((x1 − y1)² + · · · + (xn − yn)²)
This doesn't work well for F2. Instead consider the Hamming distance.
Definition. Let v, w ∈ (F2 )n . The Hamming distance, dist(v, w) = the
number of i ∈ {1, . . . , n} such that vi ̸= wi .
For example, dist(v, 0) = the number of non-zero coordinates of v.
Any reasonable notion of distance should satisfy three axioms:
Lecture 29
(i) dist(v, v) = 0
(ii) dist(v, w) = dist(w, v) ∀v, w ∈ V
(iii) dist(v, w) ≤ dist(v, u) + dist(u, w)
(dist(v, w) is always a non-negative real number.)
Exercise: Check Axiom (iii).
Error-correcting codes are used to transmit messages in such a way that a distorted message can be decoded correctly.
Messages are first encoded into sequences of 0 and 1 of some length. We get, say, 2^d possible messages, encoded into sequences of some fixed length.
The idea of error correction in the case of 1 error:
The Hamming code
We are going to realise the 2^d messages as elements of a vector subspace V ⊂ (F2)^n.
The construction: consider the m × (2^m − 1) matrix whose columns are all the vectors with coordinates 0 and 1 except the all-0 vector. E.g. if m = 3,
A = ( 1 0 0 1 1 0 1 ; 0 1 0 1 0 1 1 ; 0 0 1 0 1 1 1 )
Define V = {x ∈ (F2)^(2^m − 1) | Ax = 0}.
Crucial property: Suppose v, w ∈ V, v ̸= w. We transmit v, w and receive v′, w′. We assume that at most one error occurs during transmission, so either v′ = v or dist(v, v′) = 1, and similarly for w.
The main point is that the set of v′ ∈ (F2)^n such that dist(v, v′) ≤ 1 does not overlap with the set of w′ ∈ (F2)^n such that dist(w, w′) ≤ 1. Once we've proved that, we know that the closest vector to v′ is the original vector v.
In fact for any v, w ∈ V, dist(v, w) ≥ 3 if v ̸= w.
Proof. Recall
A = ( 1 0 0 1 1 0 1 ; 0 1 0 1 0 1 1 ; 0 0 1 0 1 1 1 )
If dist(v, w) = 1, then v − w has exactly one non-zero coordinate. But V has no such vectors (the matrix has no zero columns).
If dist(v, w) = 2, then v − w has exactly two non-zero coordinates, but there are no such vectors in V (the matrix has no two identical columns). Hence dist(v, w) ≥ 3 if v ̸= w in V.
”
Proof of the crucial property. Suppose v′ = w′ with dist(v, v′) ≤ 1 and dist(w, w′) ≤ 1. Then by the triangle inequality
dist(v, w) ≤ dist(v, v′) + dist(w, w′) ≤ 2,
a contradiction.
”
The decoding is very easy:
Let v ∈ V , so that Av = 0. Let v ′ be the received vector. We assume that
v ′ = v or v ′ = v + e, where e has exactly one non-zero coordinate.
Recipe for decoding.
Av ′ is a column vector with 3 co-ordinates. If Av ′ = 0 then v ′ = v and no
errors occurred. Otherwise Av ′ is a column of A. The position of Av ′ in A is
the coordinate where the error has occurred. Hence we can recover v.
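Aside (my addition, a sketch of the recipe for m = 3, assuming NumPy and doing the arithmetic mod 2 explicitly):

import numpy as np

A = np.array([[1, 0, 0, 1, 1, 0, 1],
              [0, 1, 0, 1, 0, 1, 1],
              [0, 0, 1, 0, 1, 1, 1]])

v = np.array([1, 1, 0, 1, 0, 0, 0])      # a codeword: A v = 0 (mod 2)
assert not (A @ v % 2).any()

received = v.copy()
received[4] ^= 1                         # a single transmission error in coordinate 5

syndrome = A @ received % 2              # equals the column of A at the error position
error_pos = next(j for j in range(7) if (A[:, j] == syndrome).all())
decoded = received.copy()
decoded[error_pos] ^= 1
print(error_pos, (decoded == v).all())   # 4 True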
Summary
V is a vector space of dimension
dim V = 2^m − 1 − m
Exercise: Show that the system of linear equations that defines V has this number of free variables.
Hence |V| = 2^(2^m − 1 − m), with V ⊂ (F2)^(2^m − 1).
Final Comment: The whole space (F2)^(2^m − 1) is the disjoint union of Hamming spheres of radius 1 centred at the vectors of V, i.e. (F2)^(2^m − 1) is the disjoint union of the sets {v′ | dist(v′, v) ≤ 1} over all v ∈ V. In other words, each vector of (F2)^(2^m − 1) is at distance at most 1 from a unique vector of V.
- End of Geometry & Linear Algebra -