CHAPTER 2
LINEAR TRANSFORMATIONS
1. Defining linear transformations
Linear transformations are (mathematical abstractions of) very common types of function.
Exercise 1 (Rotations in the plane). Consider the function which, given a vector v
in the plane, produces as output the same vector rotated (anti-clockwise) through an
angle θ: we write Rθ v for this new vector.
What is Rθ (v + w), in terms of Rθ v and Rθ w? What is Rθ (λv) in terms of Rθ v?
Exercise 2 (The simple harmonic oscillator from quantum mechanics). The simple
harmonic oscillator is the quantum mechanical version of a mass oscillating in simple
harmonic motion on a spring. The “operator” H is a function from functions to functions.
Given a function $f : \mathbb{R} \to \mathbb{R}$, we define the new function $Hf : \mathbb{R} \to \mathbb{R}$ by the rule
\[
(Hf)(x) = -\frac{d^2 f}{dx^2}(x) + x^2 f(x).
\]
For instance, if $f(x) = e^{-x^2/2}$, then $Hf(x) = e^{-x^2/2}$.
What is H(f + g) in terms of Hf and Hg? What is H(λf ) in terms of Hf ?
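The worked instance in Exercise 2 can be checked numerically. The following is a minimal sketch, assuming a central finite difference is an acceptable stand-in for the second derivative; the names `apply_H`, `gaussian` and the step size `h` are illustrative choices, not part of the notes.

```python
import math

def apply_H(f, x, h=1e-4):
    """Approximate (Hf)(x) = -f''(x) + x^2 f(x), using a central difference for f''."""
    second_derivative = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
    return -second_derivative + x * x * f(x)

def gaussian(x):
    """The function f(x) = exp(-x^2 / 2) from the exercise."""
    return math.exp(-x * x / 2.0)

# Compare Hf with f at a few sample points; the differences should be tiny.
sample_points = [-2.0, -0.5, 0.0, 1.0, 2.5]
max_error = max(abs(apply_H(gaussian, x) - gaussian(x)) for x in sample_points)
print(max_error)
```

A small value of `max_error` is consistent with Hf = f for this particular f, up to discretisation and rounding error; it is a sanity check, not a proof.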
These two examples enjoy the same basic algebraic properties—they respect the basic
vector operations of addition and scalar multiplication.
Definition 2. Let V and W be vector spaces. A linear transformation (or mapping
or map) from V to W is a function T : V → W such that
T (v + w) = T v + T w
T (λv) = λT (v)
for all vectors v and w and scalars λ.
The aim of our study of linear transformations is two-fold:
• to understand linear transformations in $\mathbb{R}$, $\mathbb{R}^2$ and $\mathbb{R}^3$.
• to bring this understanding to bear on more complex examples.
Let us try to find the linear maps from R to R. Which of the following functions are
linear?
f (x) = 3x + 1
g(x) = −πx
h(x) = e^{−x} + 2
k(x) = sin(x).
In general, the linear functions from R to R are of the form l(x) = cx, for some fixed
c ∈ R (including 0).
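The two defining properties can be tested on sample inputs. The sketch below is illustrative only: the helper `looks_linear`, the sample points and the tolerance are assumptions made for this example. A failed check proves that a function is not linear, whereas a passed check is merely evidence of linearity.

```python
import math

def looks_linear(func, samples, scalars, tol=1e-9):
    """Return True if func passes additivity and homogeneity checks on the samples."""
    for x in samples:
        for y in samples:
            # Additivity: func(x + y) should equal func(x) + func(y).
            if abs(func(x + y) - (func(x) + func(y))) > tol:
                return False
        for lam in scalars:
            # Homogeneity: func(lam * x) should equal lam * func(x).
            if abs(func(lam * x) - lam * func(x)) > tol:
                return False
    return True

candidates = {
    "f": lambda x: 3 * x + 1,
    "g": lambda x: -math.pi * x,
    "h": lambda x: math.exp(-x) + 2,
    "k": lambda x: math.sin(x),
}
samples = [-2.0, -0.5, 0.0, 1.0, 3.0]
scalars = [-1.0, 0.5, 2.0]
results = {name: looks_linear(func, samples, scalars) for name, func in candidates.items()}
print(results)
```

Only g, which has the form l(x) = cx with c = −π, passes both checks.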
2. Linear Maps and Matrices
Suppose that A is an m × n matrix (m rows and n columns). Define $T_A : \mathbb{R}^n \to \mathbb{R}^m$ by the formula
$T_A x = Ax$ for all $x \in \mathbb{R}^n$.
Then TA is a linear map. Indeed,
TA (x + y) = A(x + y) = Ax + Ay = TA (x) + TA (y),
and TA (λx) = λTA (x) similarly.
In some senses, matrices are the only examples of linear maps.
Theorem 15. Suppose that V and W are vector spaces with bases $\mathcal{A} = \{v_1, \dots, v_m\}$
and $\mathcal{B} = \{w_1, \dots, w_n\}$ respectively, and suppose that T : V → W is a linear map. Let $a_i$
denote the vector $T v_i$, let $[a_i]_{\mathcal{B}}$ denote its coordinates relative to the basis $\mathcal{B}$, and let A denote
the matrix with columns $[a_i]_{\mathcal{B}}$. Then for all x in V,
\[
[Tx]_{\mathcal{B}} = A [x]_{\mathcal{A}}.
\]
Proof. For x in V,
\[
x = x_1 v_1 + x_2 v_2 + \dots + x_m v_m,
\]
where $(x_1, x_2, \dots, x_m)^T = [x]_{\mathcal{A}}$. Then
\[
Tx = T(x_1 v_1 + x_2 v_2 + \dots + x_m v_m) = x_1 T v_1 + x_2 T v_2 + \dots + x_m T v_m,
\]
so
\[
[Tx]_{\mathcal{B}} = x_1 [T v_1]_{\mathcal{B}} + x_2 [T v_2]_{\mathcal{B}} + \dots + x_m [T v_m]_{\mathcal{B}} = A (x_1, x_2, \dots, x_m)^T,
\]
as required.
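Example. Let $R_\theta : \mathbb{R}^2 \to \mathbb{R}^2$ be the rotation (anti-clockwise) through the angle θ.
Then
\[
R_\theta \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}
\quad\text{and}\quad
R_\theta \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix},
\]
in the standard basis for $\mathbb{R}^2$. Then
\[
R_\theta \begin{pmatrix} x \\ y \end{pmatrix}
= x R_\theta \begin{pmatrix} 1 \\ 0 \end{pmatrix} + y R_\theta \begin{pmatrix} 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.
\]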
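The rotation example can be reproduced in a few lines. This is a minimal sketch, assuming matrices are stored as lists of rows; `rotation_matrix` and `mat_vec` are hypothetical helper names introduced for the illustration.

```python
import math

def rotation_matrix(theta):
    """Matrix of the anti-clockwise rotation through theta, in the standard basis of R^2."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

R = rotation_matrix(math.pi / 2)      # a quarter turn
e1_image = mat_vec(R, [1.0, 0.0])     # first column of R, approximately (0, 1)
e2_image = mat_vec(R, [0.0, 1.0])     # second column of R, approximately (-1, 0)

# Linearity: rotating 2*e1 + 3*e2 agrees with 2*R(e1) + 3*R(e2).
lhs = mat_vec(R, [2.0, 3.0])
rhs = [2.0 * a + 3.0 * b for a, b in zip(e1_image, e2_image)]
print(lhs, rhs)
```

The images of the two standard basis vectors are exactly the columns of the matrix, which is the content of Theorem 15 in this special case.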
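Example. Let $D_s : \mathbb{R}^3 \to \mathbb{R}^3$ be the dilation by a factor of s in $\mathbb{R}^+$. Then $D_s x = sx$
for all x in $\mathbb{R}^3$. Thus
\[
D_s x = \begin{pmatrix} s & 0 & 0 \\ 0 & s & 0 \\ 0 & 0 & s \end{pmatrix} x.
\]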
Exercise 3. What is the geometric effect of multiplying by the matrix
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 10^{-3} \end{pmatrix}
\]
in $\mathbb{R}^3$?
Exercise 4. What is the geometrical effect of multiplying by the matrix
\[
\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}
\]
in $\mathbb{R}^2$?
Exercise 5. Let a be a unit vector in R3 . Define the map L : R3 → R3 by the
formula Lx = (x · a)a.
(a) Show that L is linear.
(b) Express L as multiplication by a matrix.
(c) Describe L geometrically.
Exercise 6. Let a be a unit vector in R3 . Define the map X : R3 → R3 by the
formula
Xv = v × a.
(a) Show that X is linear.
(b) Express X as multiplication by a matrix.
(c) Describe X geometrically.
3. Kernels and Ranges
Consider the linear system
\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n &= b_1 \\
&\;\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \dots + a_{mn} x_n &= b_m,
\end{aligned}
\]
or equivalently, in matrix form
Ax = b.
What b can we solve this for? Is the solution unique?
We have seen that we can solve the equation if and only if b is a linear combination
of the columns of A; we write b ∈ col(A) for short. We also know that, if xpart is a
particular solution of Ax = b, then every solution is of the form xpart + xhom , where xhom
is a solution of the homogeneous equation
Ax = 0.
Provided that b ∈ col(A), so that at least one solution exists: if the homogeneous
equation has only one solution (namely x = 0), then so does Ax = b; if the homogeneous
equation has many solutions, so does Ax = b.
Next, consider the differential equation
\[
Hu(x) = -\frac{d^2 u}{dx^2}(x) + x^2 u(x) = b(x),
\]
where b is a known function and u is unknown. What b can we solve this for? Is the
solution unique?
We can answer the first question in a formal sense: b must lie in the range of H, also
known as the image of H, and written range(H) or image(H). For the second question,
we can show that, if upart is a particular solution of Hu = b, and uhom is any solution of
the homogeneous equation Hu = 0, then
H(upart + uhom ) = b + 0 = b,
i.e., upart + uhom is a solution of Hu = b. In fact, every solution of Hu = b arises in this
way. Thus the uniqueness or nonuniqueness of the solutions of the homogeneous equation
controls the uniqueness or nonuniqueness of the original equation. We unify these (and
other examples) in the following observations.
Let T : V → W be a linear map. The set of vectors T V = {T v : v ∈ V } is called the
range or image of T , and written range(T ) or image(T ). Then range(T ) is the collection
of vectors w in W for which the equation T x = w can be solved.
Let xpart be a particular solution of the equation T x = w, and let xhom be any solution
of the homogeneous equation T xhom = 0. Then
T (xpart + xhom ) = T (xpart ) + T (xhom ) = w + 0 = w,
so that xpart + xhom is a solution of T x = w. Further, every solution of T x = w is of this
form. Indeed, if T x = w and T xpart = w, then
T (x − xpart ) = T (x) − T (xpart ) = w − w = 0,
so x − xpart is a solution of the homogeneous equation.
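The "particular plus homogeneous" structure can be seen on a small system. The matrix, right-hand side and solution vectors below are made up for the illustration and do not come from the notes; `mat_vec` is a hypothetical helper.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

A = [[1, 1, 0],
     [0, 1, 1]]
b = [2, 3]
x_part = [2, 0, 3]     # one particular solution of A x = b
x_hom = [1, -1, 1]     # a solution of the homogeneous equation A x = 0

# Adding any multiple of x_hom to x_part gives another solution of A x = b.
shifted_solutions = [
    [p + lam * h for p, h in zip(x_part, x_hom)] for lam in (-2, 0, 1, 5)
]
all_solve = all(mat_vec(A, x) == b for x in shifted_solutions)

# Conversely, the difference of two solutions solves the homogeneous equation.
difference = [u - v for u, v in zip(shifted_solutions[0], shifted_solutions[3])]
difference_in_kernel = mat_vec(A, difference) == [0, 0]
print(all_solve, difference_in_kernel)
```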
Given a linear map T : V → W , we define the kernel of T , written ker(T ), to be the
subspace of V consisting of all solutions of the homogeneous equation T x = 0; in symbols
ker(T ) = {x ∈ V : T x = 0}.
4. Rank and nullity
We define the nullity of T , written nullity(T ), to be the dimension of ker(T ). This is
equal to the number of parameters in the general solution of T x = w, whenever that
equation has a solution.
Example. The general solution of
\[
\frac{d^2 u}{dx^2}(x) - u(x) = x
\]
is
\[
u(x) = -x + A \sinh x + B \cosh x.
\]
This has two parameters, because the nullity of the differential operator T, given by
$T f(x) = f''(x) - f(x)$, is 2.
Suppose that V and W are vector spaces, and that T : V → W is a linear map.
The dimension of T (V ) (also known as image(T ) or range(T )) is called the rank of T .
Sometimes the number dim(W ) − dim(T (V )) is called the co-rank of T . Obviously the
rank determines the co-rank, and vice versa.
Proposition 16. Suppose that T : V → W is a linear map. Then
(i) T is one-to-one if and only if nullity(T ) = 0.
(ii) T is onto if and only if rank(T ) = dim(W ), i.e., if and only if co-rank(T ) = 0.
Theorem 17. If U is a subspace of W , then the smallest number of equations needed
to describe U is dim(W ) − dim(U).
Example. Consider the line with parametric equation x = λd, where $d = (2, 0, 1)^T$.
This line is a one-dimensional subspace. It may also be described by the equations y = 0
and x = 2z. It is of codimension 2.
Challenge Problem. Find equations which define $\operatorname{span}\{(1, 2, 0, 1)^T, (1, 0, 2, 0)^T\}$
in $\mathbb{R}^4$. What is the minimal number of equations?
We conclude that the co-rank of a linear transformation $T : \mathbb{R}^m \to \mathbb{R}^n$ is the smallest
number of equations needed to describe the image of T .
Theorem 18 (The rank-nullity theorem for matrices). Suppose that $A \in M_{m,n}$, i.e., A has
m rows and n columns. Then
rank(A) + nullity(A) = n.
Proof. Suppose that A is row-reduced to row-echelon form. Then the columns of A
corresponding to the leading columns of the reduced matrix form a basis for range(A),
hence rank(A) is equal to the number of leading columns. The nonleading columns of
the reduced matrix correspond to the parameters of the solution of Ax = 0, i.e., nullity(A)
is equal to the number of nonleading columns. These numbers add to give the total
number of columns, i.e., n.
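The counting argument in the proof (leading columns give the rank, nonleading columns give the nullity) can be sketched in code. This is an illustrative implementation using exact rational arithmetic; the function names `pivot_columns` and `rank_and_nullity` and the test matrix are assumptions for the example, not part of the notes.

```python
from fractions import Fraction

def pivot_columns(matrix):
    """Row-reduce a copy of the matrix to echelon form; return the leading column indices."""
    rows = [[Fraction(entry) for entry in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivots = []
    r = 0  # index of the next row that still needs a leading entry
    for c in range(n_cols):
        # Look for a row at or below r with a nonzero entry in column c.
        pivot_row = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue  # column c is a nonleading column
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        # Clear the entries below the leading entry.
        for i in range(r + 1, n_rows):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * p for a, p in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return pivots

def rank_and_nullity(matrix):
    """Rank = number of leading columns; nullity = number of nonleading columns."""
    leading = pivot_columns(matrix)
    n_cols = len(matrix[0])
    return len(leading), n_cols - len(leading)

example = [[1, 2, 3, 4],
           [2, 4, 6, 8],
           [0, 1, 1, 0]]
rank, nullity = rank_and_nullity(example)
print(rank, nullity)
```

By construction the two numbers returned always add up to the number of columns of the input matrix.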
Corollary 19. If A is a square matrix, then nullity(A) = co-rank(A). Consequently,
the linear transformations associated to square matrices are one-to-one if and only if they
are onto.
Example. Consider $D : P_n \to P_{n-1}$, given by
\[
D(a_n t^n + \dots + a_0) = n a_n t^{n-1} + \dots + a_1
\]
(i.e., D corresponds to differentiation). By choosing the coefficients $a_n, a_{n-1}, \dots, a_1$
correctly, we can arrange that the right hand side is any polynomial of degree at most n − 1.
Thus $\operatorname{range}(D) = P_{n-1}$, and hence $\operatorname{rank}(D) = \dim(P_{n-1}) = n$. Also, the kernel of D
is the set of all constant polynomials: indeed, $D(a_n t^n + \dots + a_0) = 0$ if and only if
$a_n = a_{n-1} = \dots = a_1 = 0$, and the polynomial is constant. Then nullity(D) = 1. Finally,
$\operatorname{rank}(D) + \operatorname{nullity}(D) = n + 1 = \dim(P_n)$.
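The differentiation example can be mirrored on coefficient lists. In this sketch a polynomial is stored as the list of its coefficients in increasing degree, which is a representation chosen for the illustration; `differentiate` and `antiderivative` are hypothetical helper names.

```python
from fractions import Fraction

def differentiate(coeffs):
    """Map [a_0, a_1, ..., a_n] to the coefficients [a_1, 2 a_2, ..., n a_n] of the derivative."""
    return [k * a for k, a in enumerate(coeffs)][1:]

def antiderivative(coeffs):
    """Return one preimage under differentiate: the antiderivative with zero constant term."""
    return [Fraction(0)] + [Fraction(a, k + 1) for k, a in enumerate(coeffs)]

# Kernel: a constant polynomial in P_3 is sent to the zero polynomial in P_2.
constant_image = differentiate([5, 0, 0, 0])

# Range: any target in P_2 is hit, for instance 1 + 2t + 3t^2.
target = [1, 2, 3]
recovered = differentiate(antiderivative(target))
print(constant_image, recovered)
```

Every target has a preimage, so D is onto, while the constants are sent to zero, in line with rank(D) = n and nullity(D) = 1.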
Exercise 7. Find the nullity of the matrix
\[
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 4 & 2 \\ 1 & -1 & 0 & 0 \end{pmatrix}.
\]
Answer. Reduce to row-echelon form. The reduced matrix is of the form
\[
\begin{pmatrix} 1 & * & * & * \\ 0 & * & * & * \\ 0 & 0 & * & * \end{pmatrix}.
\]
Then the rank of the matrix is 3 and the nullity is 1.
Note that we do not have to find ker(T ) explicitly to show that it is one-dimensional.
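Exercise 8. Find a basis for the kernel of the matrix
\[
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 4 & 2 \\ 1 & -1 & 0 & 0 \end{pmatrix}.
\]
Answer. We need to find $x_1, \dots, x_4$ such that
\[
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 4 & 2 \\ 1 & -1 & 0 & 0 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} = 0,
\]
i.e., to find the solutions of the system represented by the augmented matrix
\[
\left(\begin{array}{cccc|c} 1 & 2 & 3 & 4 & 0 \\ 1 & 0 & 4 & 2 & 0 \\ 1 & -1 & 0 & 0 & 0 \end{array}\right).
\]
Row-reduced, this is of the form
\[
\left(\begin{array}{cccc|c} 1 & * & * & * & 0 \\ 0 & * & * & * & 0 \\ 0 & 0 & * & * & 0 \end{array}\right).
\]
The solution space has the parametric equation
\[
x = \lambda \begin{pmatrix} -10 \\ -10 \\ -2 \\ 9 \end{pmatrix},
\]
and is 1-dimensional. Then $\{(-10, -10, -2, 9)^T\}$ is a basis for the kernel.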
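The answer to Exercise 8 is easy to verify directly: multiplying the matrix by the proposed basis vector should give the zero vector. The variable names below are illustrative.

```python
A = [[1, 2, 3, 4],
     [1, 0, 4, 2],
     [1, -1, 0, 0]]
kernel_vector = [-10, -10, -2, 9]

# Each entry of A * kernel_vector is the dot product of a row of A with the vector.
product = [sum(a * x for a, x in zip(row, kernel_vector)) for row in A]
print(product)
```

This confirms that the vector lies in the kernel; that it spans the kernel follows from the nullity being 1.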
Example. Find a basis for the image of the matrix
\[
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 4 & 2 \\ 1 & -1 & 0 & 0 \end{pmatrix}.
\]
Answer. We row-reduce this matrix, and get
\[
\begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & 4 & 2 \\ 0 & 0 & -9 & -2 \end{pmatrix}.
\]
Thus the first three columns are linearly independent, and the fourth column depends
linearly on these. Hence the vectors $(1, 1, 1)^T$, $(2, 0, -1)^T$ and $(3, 4, 0)^T$ are linearly
independent; since $\mathbb{R}^3$ is 3-dimensional, they must form a basis for $\mathbb{R}^3$, which is therefore the image.
Of course, other sets of three vectors, such as $\{(1, 0, 0)^T, (0, 1, 0)^T, (0, 0, 1)^T\}$, are
also bases for $\mathbb{R}^3$.
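As an independent check that the first three columns are linearly independent, one can compute the determinant of the 3 × 3 matrix they form. The determinant test is a technique swapped in for this illustration (the notes use row reduction), and `det3` is a hypothetical helper.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows (cofactor expansion along row 0)."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# The columns of this matrix are (1, 1, 1), (2, 0, -1) and (3, 4, 0).
first_three_columns = [[1, 2, 3],
                       [1, 0, 4],
                       [1, -1, 0]]
determinant = det3(first_three_columns)
print(determinant)
```

A nonzero determinant means the three columns are linearly independent, in agreement with the row reduction.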
Lemma 20. Suppose that V and W are vector spaces and T : V → W is a function.
Then T is a linear transformation if and only if
T (λu + µv) = λT (u) + µT (v)          (2.1)
for all u, v ∈ V and all scalars λ, µ.
Proof. If T is linear, then
T (λu + µv) = T (λu) + T (µv)
= λT (u) + µT (v).
Conversely, if (2.1) holds, then taking λ = µ = 1 shows that
T (u + v) = T (u) + T (v)
and taking µ = 0 shows that
T (λu) = λT (u),
so T is linear.
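Exercise 9. Define $T : P_3(\mathbb{R}) \to \mathbb{R}^4$ by
\[
T(a_3 x^3 + a_2 x^2 + a_1 x + a_0) = (a_0, a_1, a_2, a_3)^T.
\]
Show that T is a linear mapping.
Answer. Suppose that
\[
\begin{aligned}
p(x) &= a_3 x^3 + a_2 x^2 + a_1 x + a_0 \\
q(x) &= b_3 x^3 + b_2 x^2 + b_1 x + b_0.
\end{aligned}
\]
Then
\[
\begin{aligned}
T(p(x) + q(x)) &= T((a_3 + b_3) x^3 + (a_2 + b_2) x^2 + (a_1 + b_1) x + (a_0 + b_0)) \\
&= (a_0 + b_0, a_1 + b_1, a_2 + b_2, a_3 + b_3)^T \\
&= (a_0, a_1, a_2, a_3)^T + (b_0, b_1, b_2, b_3)^T \\
&= T(p(x)) + T(q(x)),
\end{aligned}
\]
and further
\[
\begin{aligned}
T(\lambda p(x)) &= T(\lambda a_3 x^3 + \lambda a_2 x^2 + \lambda a_1 x + \lambda a_0) \\
&= (\lambda a_0, \lambda a_1, \lambda a_2, \lambda a_3)^T \\
&= \lambda (a_0, a_1, a_2, a_3)^T \\
&= \lambda T(p(x)),
\end{aligned}
\]
so T is linear.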
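Alternatively, by Lemma 20, it suffices to write
\[
\begin{aligned}
T(\lambda p(x) + \mu q(x))
&= T((\lambda a_3 + \mu b_3) x^3 + (\lambda a_2 + \mu b_2) x^2 + (\lambda a_1 + \mu b_1) x + (\lambda a_0 + \mu b_0)) \\
&= (\lambda a_0 + \mu b_0, \lambda a_1 + \mu b_1, \lambda a_2 + \mu b_2, \lambda a_3 + \mu b_3)^T \\
&= \lambda (a_0, a_1, a_2, a_3)^T + \mu (b_0, b_1, b_2, b_3)^T \\
&= \lambda T(p(x)) + \mu T(q(x)).
\end{aligned}
\]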
Exercise 10. Define $T : P_3(\mathbb{R}) \to \mathbb{R}^4$ by
\[
T(a_3 x^3 + a_2 x^2 + a_1 x + a_0) = (a_0, a_1 + 1, a_2, a_3)^T.
\]
Show that T is not a linear map.
Answer. Now $T(0) = (0, 1, 0, 0)^T$, so $T(0) \neq 0$. Since every linear map sends the zero
vector to the zero vector, this implies that T is not linear.
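Exercise 11. Define $T : P_3(\mathbb{R}) \to \mathbb{R}^4$ by
\[
T(a_3 x^3 + a_2 x^2 + a_1 x + a_0) = (a_0, a_1, a_2^2, a_3)^T.
\]
Show that T is not a linear map.
Answer. Suppose that $p(x) = a_3 x^3 + a_2 x^2 + a_1 x + a_0$, where $a_2 \neq 0$. Then
\[
\begin{aligned}
T(\lambda p(x)) &= T(\lambda a_3 x^3 + \lambda a_2 x^2 + \lambda a_1 x + \lambda a_0) \\
&= (\lambda a_0, \lambda a_1, (\lambda a_2)^2, \lambda a_3)^T \\
&= \lambda (a_0, a_1, \lambda a_2^2, a_3)^T \\
&\neq \lambda T(p(x)),
\end{aligned}
\]
unless λ = 0 or 1, or $a_2 = 0$.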
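A single numerical counterexample is enough to show that the map of Exercise 11 fails homogeneity. In this sketch a cubic is stored as the tuple of coefficients (a0, a1, a2, a3); that representation, and the particular choice of p and λ, are assumptions made for the illustration.

```python
def T(coeffs):
    """The map of Exercise 11 on coefficient tuples (a0, a1, a2, a3): the a2 slot is squared."""
    a0, a1, a2, a3 = coeffs
    return (a0, a1, a2 ** 2, a3)

p = (1, 1, 1, 1)      # the polynomial x^3 + x^2 + x + 1, so a2 = 1 is nonzero
lam = 2               # a scalar other than 0 or 1

scaled_then_mapped = T(tuple(lam * a for a in p))     # T(lam * p)
mapped_then_scaled = tuple(lam * t for t in T(p))     # lam * T(p)
print(scaled_then_mapped, mapped_then_scaled)
```

The two results differ in the third component, so T(λp) ≠ λT(p) for this choice of p and λ.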
Exercise 12. Define $T : P_n(\mathbb{R}) \to P_n(\mathbb{R})$ by
\[
T(p(x)) = x^2 \frac{d^2 p(x)}{dx^2} - 2x \frac{dp(x)}{dx} + p(x).
\]
Find rank(T ) and nullity(T ).
Answer. We can represent T by a matrix: let $\mathcal{B}$ be the basis $\{1, x, x^2, \dots, x^n\}$ for
$P_n(\mathbb{R})$. Then
\[
[T(p(x))]_{\mathcal{B}} = M [p(x)]_{\mathcal{B}},
\]
where the ith column of M is $[T(x^{i-1})]_{\mathcal{B}}$. Now
\[
\begin{aligned}
T(x^{i-1}) &= x^2 (i - 1)(i - 2) x^{i-3} - 2x (i - 1) x^{i-2} + x^{i-1} \\
&= [(i^2 - 3i + 2) - 2(i - 1) + 1] x^{i-1} \\
&= (i^2 - 5i + 5) x^{i-1}.
\end{aligned}
\]
Then M is the diagonal matrix
\[
\begin{pmatrix}
* & 0 & 0 & \cdots & 0 \\
0 & * & 0 & \cdots & 0 \\
0 & 0 & * & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & *
\end{pmatrix},
\]
and $m_{ii} = i^2 - 5i + 5$. For i = 1, 2, 3, . . . , this never vanishes, since the roots
$(5 \pm \sqrt{5})/2$ of $i^2 - 5i + 5 = 0$ are not integers. Thus rank(M) = n + 1 and
nullity(M) = 0. It follows that rank(T ) = n + 1 and nullity(T ) = 0.
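The diagonal entries of M can be recovered by applying T to each monomial. This is a minimal sketch in which a polynomial is stored as a list of coefficients in increasing degree; the helpers `derivative`, `shift` and `apply_T`, and the choice n = 6, are assumptions for the illustration.

```python
def derivative(coeffs):
    """Coefficients of p' given the coefficients [a_0, ..., a_n] of p."""
    return [k * a for k, a in enumerate(coeffs)][1:]

def shift(coeffs, power):
    """Multiply a polynomial by x**power."""
    return [0] * power + list(coeffs)

def apply_T(coeffs):
    """Apply T(p) = x^2 p'' - 2x p' + p on coefficient lists."""
    first = shift(derivative(coeffs), 1)                  # x p'
    second = shift(derivative(derivative(coeffs)), 2)     # x^2 p''
    return [s - 2 * f + p for s, f, p in zip(second, first, coeffs)]

n = 6
diagonal = []
for i in range(1, n + 2):                 # i = 1, ..., n + 1 indexes the columns of M
    monomial = [0] * (n + 1)
    monomial[i - 1] = 1                   # coefficients of x^(i-1)
    image = apply_T(monomial)
    diagonal.append(image[i - 1])         # the entry m_ii

formula = [i * i - 5 * i + 5 for i in range(1, n + 2)]
print(diagonal, formula)
```

The computed entries agree with the formula for $m_{ii}$ and none of them is zero, so for this n the matrix has full rank.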
Alternatively, we can avoid using matrices. By the theory of differential equations, the
general solution to
\[
T(f(x)) = 0
\]
for x > 0 is $f(x) = A x^{\alpha} + B x^{\beta}$, where $\alpha, \beta = (3 \pm \sqrt{5})/2$ are the roots of
$r^2 - 3r + 1 = 0$ (substitute $f(x) = x^r$ to see this). Since neither exponent is a non-negative
integer, the only such f (x) which is a polynomial is f (x) = 0. Thus when we consider T
acting on polynomials, the kernel is {0}. By the rank-nullity theorem, the rank is (n + 1).
This means that, for any q in $P_n(\mathbb{R})$, we can find a unique polynomial solution to
\[
T(p(x)) = q(x).
\]