CHAPTER 1

Vector Spaces

We saw different types of vectors last session with very similar algebraic properties. Other mathematical objects share these properties, and we will investigate these: functions, finite vector spaces, polynomials, matrices. Because they have very similar structures, techniques useful for dealing with one of these may be useful for others.

1. Spaces of functions

Let I be an interval, for example [0, 1], and write C(I, R) for the set of all continuous real-valued functions on I. We say that functions f and g are equal, and we write f = g, if and only if f(x) = g(x) for all x ∈ I. Given functions f and g in C(I, R) and λ ∈ R, we define new functions f + g and λf in C(I, R) as follows:

(f + g)(x) = f(x) + g(x) for all x ∈ I

and

(λf)(x) = λf(x) for all x ∈ I.

We write −f for (−1)f, that is, (−f)(x) = −f(x) for all x in I, and 0 for the zero function, i.e., 0(x) = 0 for all x in I.

Proposition 1. The set C(I, R) of all continuous real-valued functions on the interval I has the following properties:

(1) for all f, g ∈ C(I, R), f + g ∈ C(I, R) (closure under addition)
(2) for all f ∈ C(I, R) and λ ∈ R, λf ∈ C(I, R) (closure under scalar multiplication)
(3) for all f ∈ C(I, R), f + 0 = 0 + f = f (existence of zero)
(4) for all f ∈ C(I, R), f + (−f) = (−f) + f = 0 (existence of additive inverses)
(5) for all f, g, h ∈ C(I, R), (f + g) + h = f + (g + h) (addition is associative)
(6) for all f, g ∈ C(I, R), f + g = g + f (addition is commutative)
(7) for all f ∈ C(I, R), 1f = f (1 is an identity for scalar multiplication)
(8) for all f ∈ C(I, R) and all λ, µ ∈ R, (λµ)f = λ(µf) (scalar multiplication is associative)
(9) for all f, g ∈ C(I, R) and all λ, µ ∈ R, (λ + µ)f = λf + µf and λ(f + g) = λf + λg (scalar multiplication distributes over addition).

These are essentially the same properties enjoyed by geometric vectors and algebraic or coordinate vectors.
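The pointwise operations of Proposition 1 can be spot-checked numerically. The following sketch is my own illustration, not part of the notes: functions are represented as Python callables, and `f_add` and `f_scale` are hypothetical helper names for the two vector-space operations.

```python
# A minimal numerical sketch (not from the notes) of the operations on C(I, R):
# functions are Python callables, and the vector-space operations are defined
# pointwise, exactly as in Proposition 1.
import math

def f_add(f, g):
    """(f + g)(x) = f(x) + g(x) for all x in I."""
    return lambda x: f(x) + g(x)

def f_scale(lam, f):
    """(lam·f)(x) = lam·f(x) for all x in I."""
    return lambda x: lam * f(x)

zero = lambda x: 0.0           # the zero function
f = math.sin                   # two continuous functions on I = [0, 1]
g = lambda x: x * x

# Spot-check a few of the nine properties at sample points of I.
for x in (0.0, 0.25, 0.5, 1.0):
    assert f_add(f, zero)(x) == f(x)                  # (3) existence of zero
    assert f_add(f, g)(x) == f_add(g, f)(x)           # (6) commutativity
    lhs = f_scale(2.0, f_add(f, g))(x)                # (9) distributivity:
    rhs = f_add(f_scale(2.0, f), f_scale(2.0, g))(x)  #     2(f + g) = 2f + 2g
    assert abs(lhs - rhs) < 1e-12
print("sampled axioms hold")
```

Of course, checking the axioms at sample points is evidence, not proof; the proofs are one-line consequences of the corresponding properties of real numbers.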
Actually, functions have more properties: you can multiply them, differentiate them, and so on. But many properties of functions rely only on addition and scalar multiplication. Polynomials behave in a similar way to functions; indeed, they are a special case of functions.

Definition 2. We write Pd(R) and Pd(C) for the sets of all real polynomials and all complex polynomials of degree at most d. These are both examples of vector spaces (defined below); the field of scalars for Pd(R) is taken to be R, but the field of scalars for Pd(C) may be taken to be R or C.

We observed last session that matrices can be added and multiplied by scalars in the same way as vectors. Matrices also have a multiplicative structure, which is not commutative.

2. Finite vector spaces

Finite vector spaces require finite fields.

Problem 3. Write Z2 for the integers modulo 2. This is the set {0, 1}, with addition and multiplication defined thus:

+ | 0 1        × | 0 1
0 | 0 1        0 | 0 0
1 | 1 0        1 | 0 1

Show that Z2 is a field.

Problem 4. Write Z3 for the integers modulo 3. This is the set {0, 1, 2}, with addition and multiplication defined thus:

+ | 0 1 2      × | 0 1 2
0 | 0 1 2      0 | 0 0 0
1 | 1 2 0      1 | 0 1 2
2 | 2 0 1      2 | 0 2 1

Show that Z3 is a field.

Challenge Problem. Draw up similar tables for Zn, the integers modulo n. Show that Zn is a field if and only if n is a prime. In particular, Z4 is not a field.

Challenge Problem. Find rules for adding and multiplying four symbols to get a field.

If F is any field, we may form F^n, the set of "vectors" of the form (x1, x2, . . . , xn)^T, where all xi ∈ F. This is a vector space over F.

Finite fields are used in coding. For example, 128-bit encoding works as follows: 128 bits of computer information corresponds to a vector with 128 places, each of which is 0 or 1. A code is a function from Z2^128 to Z2^128 which is one-to-one and onto. Unless we know this function, we cannot decode the message.
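Here is a toy version of this idea, my own illustration rather than anything from the notes, with 3-bit messages instead of 128-bit ones: the code is multiplication by an invertible matrix A over Z2, and decoding is multiplication by its inverse. The particular matrices below are an assumed example, chosen so that A·A⁻¹ is the identity modulo 2.

```python
# A toy sketch (my own, not from the notes) of a matrix code over Z2:
# a 3-bit message is a vector over Z2, encoded by multiplying by an
# invertible matrix A over Z2 and decoded by multiplying by its inverse.
A    = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]   # invertible over Z2
Ainv = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]   # its inverse modulo 2

def mat_vec_mod2(M, x):
    """Matrix-vector product over the field Z2."""
    return [sum(m * xi for m, xi in zip(row, x)) % 2 for row in M]

message = [1, 0, 1]
coded   = mat_vec_mod2(A, message)
decoded = mat_vec_mod2(Ainv, coded)
assert decoded == message    # knowing the inverse lets us decode
print(coded)                 # → [1, 1, 1]
```

Because A is one-to-one and onto as a map from Z2³ to Z2³, every message has exactly one codeword and vice versa.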
Constructing and inverting these functions often involves vector space ideas, such as multiplying by a matrix to encode and multiplying by its inverse to invert.

3. Vector space axioms

Let F be a field and V be a set. Suppose that there are operations + : V × V → V and · : F × V → V. Then V is called a vector space over F if it satisfies the following conditions:

(1) for all v, w ∈ V, their sum v + w is defined and is in V (V is closed under addition)
(2) for all λ ∈ F and v ∈ V, their product λ·v is defined and is in V (V is closed under scalar multiplication)
(3) for all u, v, w ∈ V, (u + v) + w = u + (v + w) (addition is associative)
(4) for all u, v ∈ V, u + v = v + u (addition is commutative)
(5) there exists a vector 0 such that for all v ∈ V, v + 0 = v (there exists a zero vector)
(6) for all v ∈ V, there exists a vector −v such that (−v) + v = 0 (existence of additive inverses)
(7) for all v ∈ V, 1·v = v (1 is a multiplicative identity)
(8) for all v ∈ V and all λ, µ ∈ F, (λµ)·v = λ·(µ·v) (scalar multiplication is associative)
(9) for all u, v ∈ V and all λ, µ ∈ F, (λ + µ)·u = λ·u + µ·u and λ·(u + v) = λ·u + λ·v (scalar multiplication distributes over addition).

Usually we just write λv rather than λ·v. From these defining properties of a vector space, called axioms, we can deduce other properties.

Proposition 5. Suppose that V is a vector space over F. Then

(1) for all u ∈ V, there is only one z ∈ V (the vector 0) such that u + z = u (uniqueness of the zero vector)
(2) for all u, v, w ∈ V, if u + v = u + w, then v = w (cancellation property)
(3) for all u ∈ V, there is only one v ∈ V such that u + v = 0 (uniqueness of inverses)
(4) for all λ ∈ F, λ0 = 0
(5) for all v ∈ V, 0v = 0
(6) for all v ∈ V, (−1)v + v = 0 = v + (−1)v
(7) for all v ∈ V and λ ∈ F, if λv = 0 then λ = 0 or v = 0
(8) for all v, w ∈ V and all λ, µ ∈ F: if λv = µv and v ≠ 0, then λ = µ; if λv = λw and λ ≠ 0, then v = w.

Proof. These follow from the vector space axioms.
We consider only (7) and (8). To prove (7), note that if λv = 0 and λ ≠ 0, then v = λ⁻¹λv = λ⁻¹0 = 0. To prove (8), observe that if λv = µv, then (λ − µ)v = 0, so λ = µ or v = 0 by (7). Similarly, if λv = λw and λ ≠ 0, then λ⁻¹λv = λ⁻¹λw, so v = w. Note that (−1)v = −v.

4. Subspaces

A subspace of a vector space V over a field F is a nonempty subset of V which is a vector space in its own right; in particular, V is a subspace of itself. We are often asked to decide when a subset is a subspace, and this might require us to check up to ten items. This is quite tedious. Consequently, the following theorem is convenient.

Theorem 6 (Subspace theorem). If S is a subset of a vector space V over a field F, then S is a subspace if and only if S has all three of the following properties:

(1) S is not empty
(2) S is closed under addition
(3) S is closed under scalar multiplication.

Further, S is not a subspace if 0 ∉ S or if any of the above properties fails to hold.

Proof. First, note that vector space axioms (3), (4), (7), (8) and (9) automatically hold for subsets. Axioms (1) and (2) might not hold, but are ensured by hypotheses (2) and (3) of this theorem. Finally, 0 = 0v and −v = (−1)v, so axioms (5) and (6) follow from hypothesis (3). Thus if all the hypotheses hold, so do all the vector space axioms. Conversely, if hypothesis (1) is false or if 0 ∉ S, then vector space axiom (5) cannot hold; if hypothesis (2) or (3) fails, then axiom (1) or (2) fails.

Problem 7. Show that the set S of vectors (x1, x2)^T in R2 such that x2 = 2x1 + c is a subspace if and only if c = 0.

Answer. Suppose that c = 0. Clearly (0, 0)^T lies in S, so S is not empty. If x, y ∈ S, then x2 = 2x1 and y2 = 2y1, so x2 + y2 = 2(x1 + y1), and x + y ∈ S. Finally, if x ∈ S and λ ∈ R, then x2 = 2x1, so λx2 = 2λx1, and λx ∈ S.

[Figure: the line x2 = 2x1 through the origin (a subspace) and a parallel line x2 = 2x1 − 4 (not a subspace).]

Suppose that c ≠ 0. Then (0, 0)^T ∉ S.
Therefore S does not contain the zero vector, and so S fails to satisfy the vector space axiom on the existence of the zero vector; thus S is not a subspace.

Problem 8. Show that the set of vectors (x1, x2, x3)^T in R3 such that x1 + 2x2 + 3x3 = d is a subspace if and only if d = 0.

Problem 9. Show that the unit circle in R2,

{(x1, x2)^T ∈ R2 : x1² + x2² = 1},

is not a subspace.

Problem 10. Show that the sets of vectors H1 and H2 in R2 given by

H1 = {(x, y)^T : y ≥ x} and H2 = {(x, y)^T : y ≤ x}

are not subspaces, but H1 ∩ H2 and H1 ∪ H2 are subspaces.

[Figure: the half-planes H1 and H2.]

Problem 11. Suppose that A ∈ Mm,n(R), and define S = {x ∈ Rn : Ax = b}. Show that S is a subspace if and only if b = 0.

Answer. Suppose first that b ≠ 0. Then A0 = 0 ≠ b, so 0 ∉ S. Consequently, S is not a subspace.

Now suppose that b = 0. Then A0 = b, so 0 ∈ S, and S is not empty. Next, suppose that x, y ∈ S. Then Ax = 0 and Ay = 0, so A(x + y) = Ax + Ay = 0, and x + y ∈ S, i.e., S is closed under addition. Finally, suppose that λ ∈ R and x ∈ S. Then Ax = 0, so λAx = 0, and A(λx) = 0. Hence λx ∈ S, i.e., S is closed under scalar multiplication. By the subspace theorem, S is a subspace of Rn.

Problem 12. Suppose A ∈ Mm,n(R). Show that the set of b for which Ax = b has at least one solution is a subspace.

Problem 13. Let S denote the set {p ∈ P3(C) : p(1) = 0}. Show that S is a subspace of P3(C).

Answer. First, 0 ∈ S, so S is not empty. Next, if p, q ∈ S, then p(1) = 0 and q(1) = 0, so (p + q)(1) = p(1) + q(1) = 0, and p + q ∈ S, i.e., S is closed under addition. Further, if λ ∈ C and p ∈ S, then (λp)(1) = λp(1) = 0, so λp ∈ S, i.e., S is closed under scalar multiplication. By the subspace theorem, S is a subspace of P3(C).

5. Linear combinations and spans

Definition 14. A linear combination of vectors v1, v2, . . . , vn is a vector of the form λ1v1 + λ2v2 + · · · + λnvn, where λ1, λ2, . . . , λn are scalars.

Definition 15. The span of the vectors v1, v2, .
. . , vn is the set of all linear combinations of v1, v2, . . . , vn; it is written span{v1, v2, . . . , vn}.

In a vector space, all finite sums of the form λ1v1 + λ2v2 + · · · + λnvn are well-defined, i.e., have an unambiguous meaning. We could put brackets round the sum in many different ways, but it can be shown that all give the same result.

Challenge Problem. In how many different ways can we bracket the sum of n vectors? Prove (by induction) that they are all equivalent.

Theorem 16. The smallest subspace of a vector space which contains the vectors v1, v2, . . . , vn is their span.

This theorem says two things: the span is a subspace, and it is the smallest subspace.

Proof. By the subspace theorem, to show that span{v1, v2, . . . , vn} is a subspace, we must show it is nonempty and closed under addition and scalar multiplication.

First, if λ1 = λ2 = · · · = λn = 0, then λ1v1 + λ2v2 + · · · + λnvn = 0, so 0 ∈ span{v1, v2, . . . , vn}, and the span is not empty.

Next, if w, w′ ∈ span{v1, . . . , vn}, then there exist scalars λ1, . . . , λn and λ′1, . . . , λ′n such that

w = λ1v1 + λ2v2 + · · · + λnvn and w′ = λ′1v1 + λ′2v2 + · · · + λ′nvn.

Adding these, we see that w + w′ = (λ1 + λ′1)v1 + · · · + (λn + λ′n)vn, so w + w′ ∈ span{v1, v2, . . . , vn}.

Finally, suppose that w ∈ span{v1, . . . , vn} and λ is a scalar. Then there exist scalars λ1, . . . , λn such that w = λ1v1 + λ2v2 + · · · + λnvn, so λw = λλ1v1 + λλ2v2 + · · · + λλnvn, and λw ∈ span{v1, v2, . . . , vn}. By the subspace theorem, span{v1, v2, . . . , vn} is a subspace.

To show that span{v1, v2, . . . , vn} is the smallest subspace containing v1, v2, . . . , vn, suppose that W is any subspace such that v1, v2, . . . , vn ∈ W. We need to show that span{v1, . . . , vn} ⊆ W, that is, every vector in span{v1, . . . , vn} is also in W.
Since W is closed under scalar multiplication and addition, given any scalars λ1, . . . , λn, we have λ1v1 + λ2v2 + · · · + λnvn ∈ W. Thus span{v1, v2, . . . , vn} ⊆ W, as required.

We say that a subset {v1, v2, . . . , vn} of a vector space V is a spanning set for V, or just spans V, if span{v1, v2, . . . , vn} = V.

Problem 17. Show that {(1, 0)^T, (0, 1)^T} is a spanning set for R2.

Answer. Take (x, y)^T in R2. Then (x, y)^T = x(1, 0)^T + y(0, 1)^T. Therefore every vector in R2 is a linear combination of the vectors (1, 0)^T and (0, 1)^T; hence they span R2.

Problem 18. Show that {(1, 0, 0)^T, (0, 1, 0)^T, (0, 0, 1)^T} spans R3.

Problem 19. Do the vectors (0, 1, −1)^T, (2, 1, −3)^T and (1, 2, −3)^T span R3? If not, describe their span.

Answer. Suppose (c1, c2, c3)^T ∈ R3. Then (c1, c2, c3)^T ∈ span{(0, 1, −1)^T, (2, 1, −3)^T, (1, 2, −3)^T} if and only if there exist scalars λ1, λ2, λ3 ∈ R such that

λ1(0, 1, −1)^T + λ2(2, 1, −3)^T + λ3(1, 2, −3)^T = (c1, c2, c3)^T.

This is equivalent to the system

0λ1 + 2λ2 + 1λ3 = c1
1λ1 + 1λ2 + 2λ3 = c2
−1λ1 − 3λ2 − 3λ3 = c3,

which in turn can be represented by the augmented matrix

[ 0  2  1 | c1 ]
[ 1  1  2 | c2 ]
[−1 −3 −3 | c3 ]

Note that the columns of the augmented matrix are the vectors in question. We solve this by row-reduction. Applying R1 ←→ R2 and then R3 = R3 + R1 gives

[ 1  1  2 | c2 ]
[ 0  2  1 | c1 ]
[ 0 −2 −1 | c2 + c3 ]

and then R3 = R2 + R3 gives

[ 1  1  2 | c2 ]
[ 0  2  1 | c1 ]
[ 0  0  0 | c1 + c2 + c3 ]

This has a solution if and only if c1 + c2 + c3 = 0. This means that (c1, c2, c3)^T ∈ span{(0, 1, −1)^T, (2, 1, −3)^T, (1, 2, −3)^T} if and only if c1 + c2 + c3 = 0. Since there are vectors in R3, such as (1, 0, 0)^T, which do not satisfy this, span{(0, 1, −1)^T, (2, 1, −3)^T, (1, 2, −3)^T} is a subspace of R3 different from R3; we say that it is a proper subspace of R3.

Recall that we say that the real or complex polynomials p(x) and q(x) are the same when the coefficients are the same, i.e., when the values p(x) and q(x) are the same for all x.
In this case we often just write p = q. We call x a "dummy variable". The same convention applies to functions.

Problem 20. Suppose that

p1(t) = t³ − 3t
p2(t) = t³ − t² − t
p3(t) = t² − t − 1
q(t) = t + 1.

Is it true that q ∈ span{p1, p2, p3}?

Answer. If this were true, then there would exist λ1, λ2, λ3 ∈ R such that q = λ1p1 + λ2p2 + λ3p3, i.e.,

t + 1 = λ1(t³ − 3t) + λ2(t³ − t² − t) + λ3(t² − t − 1)
      = (λ1 + λ2)t³ + (λ3 − λ2)t² − (3λ1 + λ2 + λ3)t − λ3.

So, comparing coefficients, we would have

λ1 + λ2 = 0
λ3 − λ2 = 0
3λ1 + λ2 + λ3 = −1
λ3 = −1.

Back-substituting, λ3 = −1 and λ2 = −1, so λ1 = 1; but then 3λ1 + λ2 + λ3 = 1 ≠ −1, which is impossible. Since this system has no solutions, q ∉ span{p1, p2, p3}.

Problem 21. Is sin(2x + 3) in span{sin(2x), cos(2x)}?

Answer. By the well-known addition formula,

sin(2x + 3) = sin(2x) cos(3) + cos(2x) sin(3) = cos(3) sin(2x) + sin(3) cos(2x),

and sin(3) and cos(3) are scalars, so sin(2x + 3) ∈ span{sin(2x), cos(2x)}.

Problem 22. Is sin(2x) in span{sin x, cos x}?

Answer. Suppose that there exist numbers λ1 and λ2 such that sin(2x) = λ1 sin x + λ2 cos x for all x. Then, taking x = 0 and x = π/2, we see that

0 = 0 + λ2
0 = λ1 + 0,

so we must have λ1 = λ2 = 0. But now taking x = π/4, we get 1 = 0 + 0, which is impossible. Thus sin 2x ∉ span{sin x, cos x}.

To summarise the last two problems: the function sin(2x + 3) is a linear combination of the functions sin(2x) and cos(2x), because the formula sin(2x + 3) = cos 3 sin 2x + sin 3 cos 2x is true for all x; and the function sin(2x) is not a linear combination of the functions sin(x) and cos(x), because we cannot find numbers λ and µ such that sin(2x) = λ sin(x) + µ cos(x) for all x, though for a fixed x this is possible. In these examples, it is hard to find an "x-free" way to write things.

Proposition 23. Suppose that a1, a2, . . . , an ∈ Rm. Let A be the matrix with columns a1, a2, . . . , an. Then b ∈ span{a1, a2, . . .
, an} if and only if there exists x ∈ Rn such that b = Ax.

Proof. Observe that Ax = x1a1 + · · · + xnan. The result follows.

We define the column space of a matrix A to be the subspace spanned by its columns.

Problem 24. Is (−1, 0, 5)^T in the column space of the matrix

[ 1  1 ]
[ 2  1 ]
[ 3 −1 ]  ?

Answer. It is equivalent to ask whether the system represented by the augmented matrix

[ 1  1 | −1 ]
[ 2  1 |  0 ]
[ 3 −1 |  5 ]

has a solution. We proceed to row-reduce the augmented matrix. Applying R2 = R2 − 2R1 and R3 = R3 − 3R1 gives

[ 1  1 | −1 ]
[ 0 −1 |  2 ]
[ 0 −4 |  8 ]

and then R3 = R3 − 4R2 gives

[ 1  1 | −1 ]
[ 0 −1 |  2 ]
[ 0  0 |  0 ]

There is a (unique) solution, so

(−1, 0, 5)^T ∈ span{(1, 2, 3)^T, (1, 1, −1)^T}.

Problem 25. Let S denote span{x³ − x + 1, x³ − x² − 1, x² − x + 2} in P3(R). Which polynomials q(x) belong to S?

Answer. Suppose that q is a linear combination of these polynomials. Then q is of degree at most 3. Write q(x) = a3x³ + a2x² + a1x + a0. Then

q(x) = λ1(x³ − x + 1) + λ2(x³ − x² − 1) + λ3(x² − x + 2)
     = (λ1 + λ2)x³ + (λ3 − λ2)x² − (λ1 + λ3)x + (λ1 − λ2 + 2λ3)

precisely when

λ1 + λ2 = a3
λ3 − λ2 = a2
−λ1 − λ3 = a1
λ1 − λ2 + 2λ3 = a0.

This system of equations is represented by the augmented matrix

[ 1  1  0 | a3 ]
[ 0 −1  1 | a2 ]
[−1  0 −1 | a1 ]
[ 1 −1  2 | a0 ]

We reduce this to row-echelon form. Applying R3 = R3 + R1 and R4 = R4 − R1 gives

[ 1  1  0 | a3 ]
[ 0 −1  1 | a2 ]
[ 0  1 −1 | a1 + a3 ]
[ 0 −2  2 | a0 − a3 ]

and then R3 = R3 + R2 and R4 = R4 − 2R2 give

[ 1  1  0 | a3 ]
[ 0 −1  1 | a2 ]
[ 0  0  0 | a1 + a2 + a3 ]
[ 0  0  0 | a0 − 2a2 − a3 ]

Therefore a3x³ + a2x² + a1x + a0 ∈ span{x³ − x + 1, x³ − x² − 1, x² − x + 2} if and only if a1 + a2 + a3 = 0 and a0 − 2a2 − a3 = 0.

Problem 26. What is the difference between span{(3, 0, −1)^T, (1, 1, −1)^T, (0, −3, 2)^T} and span{(3, 0, −1)^T, (1, 1, −1)^T}?

Answer. These are two subspaces of R3. The first subspace clearly contains every vector of the second subspace. The question is whether the first subspace is larger. Let b = (b1, b2, b3)^T.
Then b ∈ span{(3, 0, −1)^T, (1, 1, −1)^T, (0, −3, 2)^T} if and only if the system represented by the augmented matrix

[ 3  1  0 | b1 ]
[ 0  1 −3 | b2 ]
[−1 −1  2 | b3 ]

has a solution, which is when b1 + 2b2 + 3b3 = 0. Similarly, b ∈ span{(3, 0, −1)^T, (1, 1, −1)^T} if and only if b1 + 2b2 + 3b3 = 0. So the two spans are the same. The answer to the question why the extra vector does not change the span will be our next concern.

6. Linear dependence

Definition 27. Vectors v1, . . . , vn are linearly independent if the only possible choice of scalars λ1, . . . , λn for which

λ1v1 + · · · + λnvn = 0    (1)

is λ1 = · · · = λn = 0. They are linearly dependent if there exists a choice of λ1, . . . , λn, not all 0, so that (1) holds, that is, if they are not linearly independent.

For example, the vectors (1, 0)^T and (2, 0)^T in R2 are linearly dependent, since 2(1, 0)^T + (−1)(2, 0)^T = (0, 0)^T. The vectors (2, 0)^T and (1, 1)^T are linearly independent: indeed, if λ(2, 0)^T + µ(1, 1)^T = (0, 0)^T, then 2λ + µ = 0 and µ = 0, so λ = µ = 0. Things get trickier as the number of vectors increases.

Problem 28. Are the vectors (1, 0, 1)^T, (1, 2, 0)^T, and (0, 2, −1)^T linearly dependent or independent?

Answer. Suppose that

λ1(1, 0, 1)^T + λ2(1, 2, 0)^T + λ3(0, 2, −1)^T = (0, 0, 0)^T.

Equivalently,

1λ1 + 1λ2 + 0λ3 = 0
0λ1 + 2λ2 + 2λ3 = 0
1λ1 + 0λ2 − 1λ3 = 0.    (2)

This can be represented by the augmented matrix

[ 1 1  0 | 0 ]
[ 0 2  2 | 0 ]
[ 1 0 −1 | 0 ]

We row-reduce this to

[ 1 1 0 | 0 ]
[ 0 2 2 | 0 ]
[ 0 0 0 | 0 ]

Thus the original system (2) has infinitely many solutions, and in particular has nonzero solutions, e.g., λ1 = 1, λ2 = −1, λ3 = 1. Since

1(1, 0, 1)^T − 1(1, 2, 0)^T + 1(0, 2, −1)^T = (0, 0, 0)^T,

the vectors are linearly dependent.

Problem 29. Are the polynomials

p1(t) = 1 + t,  p2(t) = 1 − t,  p3(t) = 1 + t²,  p4(t) = 1 − t³

linearly dependent or independent?

Answer. Suppose that λ1p1 + λ2p2 + λ3p3 + λ4p4 = 0.
Then

λ1(1 + x) + λ2(1 − x) + λ3(1 + x²) + λ4(1 − x³) = 0 for all x,

i.e.,

(λ1 + λ2 + λ3 + λ4) + (λ1 − λ2)x + λ3x² − λ4x³ = 0 for all x,

so

λ1 + λ2 + λ3 + λ4 = 0
λ1 − λ2 = 0
λ3 = 0
−λ4 = 0.

Solving, the only solution is λ1 = λ2 = λ3 = λ4 = 0, so the polynomials p1, p2, p3 and p4 are linearly independent.

Note that the rows of the system of equations correspond to the x⁰ terms, the x¹ terms, the x² terms and the x³ terms. This is not an accident: as far as questions of spans and linear dependence are concerned, the polynomial a0 + a1x + a2x² + · · · + anxⁿ behaves like the vector (a0, a1, a2, . . . , an)^T.

Problem 30. Are the functions 1, e^x and e^(2x) linearly independent?

Answer. Suppose that

λ1 + λ2 e^x + λ3 e^(2x) = 0    (3)

for all x. If this implies that λ1 = λ2 = λ3 = 0, then 1, e^x and e^(2x) are linearly independent. One way to show this is via limits: if (3) holds, then

lim_{x→−∞} (λ1 + λ2 e^x + λ3 e^(2x)) = lim_{x→−∞} 0,

so λ1 = 0. Now λ2 e^x + λ3 e^(2x) = 0, so dividing by e^x, λ2 + λ3 e^x = 0. Another limit argument shows that λ2 = 0, and then λ3 e^x = 0, so λ3 = 0. Thus 1, e^x and e^(2x) are linearly independent.

Challenge Problem. Suppose that α1 < α2 < · · · < αn. Show that e^(α1 x), e^(α2 x), . . . , e^(αn x) are linearly independent functions.

Challenge Problem. Suppose that 0 < α1 < α2 < · · · < αn. Show that sin α1x, sin α2x, . . . , sin αnx are linearly independent functions.

Theorem 31. Suppose that v1, . . . , vn are vectors, and that v ∈ span{v1, . . . , vn}. If v1, . . . , vn are linearly independent, there is only one choice of scalars λ1, . . . , λn such that

v = λ1v1 + · · · + λnvn.    (4)

If v1, . . . , vn are linearly dependent, there is more than one choice of scalars λ1, . . . , λn such that (4) holds.

Proof. Suppose that the vectors v1, . . . , vn are linearly independent. Because v is in span{v1, . . . , vn}, there exist λ1, . . . , λn so that v = λ1v1 + · · · + λnvn. Suppose that µ1, . . .
, µn are scalars (possibly different from λ1, . . . , λn) so that v = µ1v1 + · · · + µnvn. Subtracting,

0 = (λ1 − µ1)v1 + · · · + (λn − µn)vn.

By linear independence, λ1 − µ1 = · · · = λn − µn = 0, so µi = λi, and there is only one representation of v in terms of the vi.

Suppose now that v1, . . . , vn are linearly dependent. Then there exist ν1, . . . , νn, not all zero, so that ν1v1 + · · · + νnvn = 0. Since v ∈ span{v1, . . . , vn}, there exist λ1, . . . , λn so that v = λ1v1 + · · · + λnvn. Adding,

v = (λ1 + ν1)v1 + · · · + (λn + νn)vn.

Since not all νi are zero, this is a different representation of v in terms of the vi.

Theorem 32. The vectors v1, . . . , vn are linearly dependent if and only if at least one of them may be written as a linear combination of the others.

Proof. Suppose that v1, . . . , vn are linearly dependent. Then there exist λ1, . . . , λn, not all zero (say λi ≠ 0), so that

λ1v1 + · · · + λivi + · · · + λnvn = 0.

Rearranging to make vi the subject,

vi = −(λ1/λi)v1 − · · · − (λi−1/λi)vi−1 − (λi+1/λi)vi+1 − · · · − (λn/λi)vn,

so vi is a linear combination of the other vectors, i.e., vi lies in their span.

Conversely, suppose that vi may be written as a linear combination of the other vectors, i.e., for some scalars µ1, . . . , µi−1, µi+1, . . . , µn, we have

vi = µ1v1 + · · · + µi−1vi−1 + µi+1vi+1 + · · · + µnvn.

Put µi = −1. Then

µ1v1 + · · · + µi−1vi−1 + µivi + µi+1vi+1 + · · · + µnvn = 0,

i.e., v1, . . . , vn are linearly dependent.

There are several other results whose proofs are similar.

Theorem 33. Suppose that v0, v1, . . . , vn are vectors. Then span{v0, v1, . . . , vn} = span{v1, . . . , vn} if and only if v0 ∈ span{v1, . . . , vn}.

Proof. Suppose first that span{v0, v1, . . . , vn} = span{v1, . . . , vn}. Because v0 is in span{v0, v1, . . . , vn}, it follows that v0 ∈ span{v1, . . . , vn}.
Conversely, suppose that v0 ∈ span{v1, . . . , vn}; then there exist µ1, . . . , µn so that v0 = µ1v1 + · · · + µnvn. Now, if w ∈ span{v0, v1, . . . , vn}, then there are scalars λ0, λ1, . . . , λn so that

w = λ0v0 + λ1v1 + · · · + λnvn
  = λ0(µ1v1 + · · · + µnvn) + λ1v1 + · · · + λnvn
  = (λ0µ1 + λ1)v1 + · · · + (λ0µn + λn)vn,

and so w ∈ span{v1, . . . , vn}, i.e., span{v0, v1, . . . , vn} ⊆ span{v1, . . . , vn}. Since the opposite inclusion span{v0, v1, . . . , vn} ⊇ span{v1, . . . , vn} holds (by definition), the two spans coincide.

Problem 34. Show that the vectors v1 = (1, 2, 0, 3)^T, v2 = (2, 1, 0, 3)^T, v3 = (0, 1, 1, 0)^T, v4 = (−4, 1, 0, −3)^T are linearly dependent. Find a proper subset S of {v1, v2, v3, v4} so that span(S) is the same as span{v1, v2, v3, v4}.

Answer. Consider the equation λ1v1 + λ2v2 + λ3v3 + λ4v4 = 0. This gives rise to a system of equations which is represented by the augmented matrix

[ 1 2 0 −4 | 0 ]
[ 2 1 1  1 | 0 ]
[ 0 0 1  0 | 0 ]
[ 3 3 0 −3 | 0 ]

which we row-reduce to

[ 1  2 0 −4 | 0 ]
[ 0 −3 1  9 | 0 ]
[ 0  0 1  0 | 0 ]
[ 0  0 0  0 | 0 ]

Since C4 is a nonleading column, the general solution is λ4 = λ, λ3 = 0, λ2 = 3λ, λ1 = −2λ. Thus 2v1 − 3v2 − v4 = 0. In particular, v4 ∈ span{v1, v2, v3}, so that span{v1, v2, v3, v4} = span{v1, v2, v3}. However, we also have v1 ∈ span{v2, v3, v4}, so that span{v1, v2, v3, v4} = span{v2, v3, v4}, and v2 ∈ span{v1, v3, v4}, so that span{v1, v2, v3, v4} = span{v1, v3, v4}.

Note that it is not true that v3 ∈ span{v1, v2, v4}. In fact, the position of the zeros in the augmented matrix tells us that C4 is a nonleading column, and this translates into the fact that v4 is a linear combination of v1, v2 and v3; but without explicitly finding the solution, we cannot tell whether v3 ∈ span{v1, v2, v4}, or v2 ∈ span{v1, v3, v4}.
In practical terms, this means that the natural vector to choose to eliminate is v4, because we can see immediately that it is a linear combination of the others.

The preceding example is a particular case of the following theorem, whose proof follows from the previous results.

Theorem 35. If S is a linearly independent set of vectors, then for any proper subset S′ of S, span(S′) is a proper subspace of span(S). However, if S is a linearly dependent set of vectors, then there is at least one proper subset S′ of S so that span(S′) = span(S).

For example, consider the vectors (0, 0)^T and (1, 0)^T. These are linearly dependent because 2(0, 0)^T + 0(1, 0)^T = (0, 0)^T. Also, span{(0, 0)^T, (1, 0)^T} = span{(1, 0)^T}, but span{(0, 0)^T, (1, 0)^T} ≠ span{(0, 0)^T}.

For later use, we need one more theorem.

Theorem 36. Suppose that v1, . . . , vn are linearly independent vectors. If vn+1 ∈ span{v1, . . . , vn}, then v1, . . . , vn, vn+1 are linearly dependent. If vn+1 ∉ span{v1, . . . , vn}, then v1, . . . , vn, vn+1 are linearly independent.

Proof. The first half is already proved. Suppose that v1, . . . , vn are linearly independent and that vn+1 ∉ span{v1, . . . , vn}. If λ1, . . . , λn+1 are scalars so that

λ1v1 + · · · + λnvn + λn+1vn+1 = 0,

then λn+1 = 0, for otherwise we could make vn+1 the subject, and this would show that vn+1 ∈ span{v1, . . . , vn}, which is not allowed. But now λ1v1 + · · · + λnvn = 0, so λ1 = · · · = λn = 0, since v1, . . . , vn are linearly independent. Since we have now shown that λ1 = · · · = λn = λn+1 = 0, it follows that v1, . . . , vn+1 are linearly independent, as required.

7. Bases and dimensions

A basis (plural "bases") for a vector space V is a linearly independent spanning set for V. The number of elements in a basis for V does not depend on the basis. This number is called the dimension of V, and is written dim(V).
This result is found by comparing the number of elements in spanning sets and in linearly independent sets.

Theorem 37. If {v1, . . . , vm} is a set of linearly independent vectors in the vector space V, and {w1, . . . , wn} is a spanning set for V, then m ≤ n.

Proof. We suppose that {v1, . . . , vm} is linearly independent and that {w1, . . . , wn} is a spanning set. Since {w1, . . . , wn} spans V, there are scalars aij (where i = 1, . . . , m and j = 1, . . . , n) so that

v1 = a11w1 + · · · + a1nwn
v2 = a21w1 + · · · + a2nwn
...
vm = am1w1 + · · · + amnwn.    (5)

Suppose that λ1, . . . , λm solve the equations

a11λ1 + a21λ2 + · · · + am1λm = 0
a12λ1 + a22λ2 + · · · + am2λm = 0
...
a1nλ1 + a2nλ2 + · · · + amnλm = 0.    (6)

(Note that the aij have been transposed, going from the array (5) to the array (6).) Then

(a11λ1 + a21λ2 + · · · + am1λm)w1 + (a12λ1 + a22λ2 + · · · + am2λm)w2 + · · · + (a1nλ1 + a2nλ2 + · · · + amnλm)wn = 0,

or equivalently,

λ1(a11w1 + a12w2 + · · · + a1nwn) + λ2(a21w1 + a22w2 + · · · + a2nwn) + · · · + λm(am1w1 + am2w2 + · · · + amnwn) = 0,

i.e., λ1v1 + λ2v2 + · · · + λmvm = 0, and so λ1 = λ2 = · · · = λm = 0, as the vectors v1, v2, . . . , vm are linearly independent.

We have just shown that the only solution to the equations (6) is λ1 = λ2 = · · · = λm = 0. This implies that, if the augmented matrix corresponding to these equations, namely

[ a11 a21 · · · am1 | 0 ]
[ a12 a22 · · · am2 | 0 ]
[  ·   ·        ·   | · ]
[ a1n a2n · · · amn | 0 ]

is row-reduced, then all columns are leading columns. This in turn implies that m ≤ n, as required.

Corollary 38. If {v1, . . . , vm} and {w1, . . . , wn} are both bases of the vector space V, then m = n.

Proof. Since {v1, . . . , vm} spans V and {w1, . . . , wn} is linearly independent, m ≥ n. Conversely, since {v1, . . . , vm} is linearly independent and {w1, . . . , wn} spans V, n ≥ m.
Combining these inequalities, m = n.

Definition 39. The dimension of a vector space V is the number of elements in a basis for V.

Problem 40. Show that the dimension of Rn is n.

Answer. The vectors e1, e2, . . . , en form a basis. Indeed, for any x ∈ Rn, x = x1e1 + · · · + xnen, so they span, and they are linearly independent because if x1e1 + · · · + xnen = 0, then x1 = x2 = · · · = xn = 0. In particular, the dimension of a line is 1 and of a plane is 2.

Problem 41. Let V be the set of all x in R4 such that 2x1 + 3x2 − x3 = 0. Show that V is a subspace of R4 and find dim(V).

Answer. We do not show here that V is a subspace. Consider the vectors

v1 = (1, 0, 2, 0)^T,  v2 = (0, 1, 3, 0)^T,  v3 = (0, 0, 0, 1)^T.

If λ1v1 + λ2v2 + λ3v3 = 0, then

(λ1, λ2, 2λ1 + 3λ2, λ3)^T = (0, 0, 0, 0)^T,

so λ1 = λ2 = λ3 = 0, i.e., the vectors v1, v2 and v3 are linearly independent. Conversely, if x ∈ V, then x3 = 2x1 + 3x2, so that

x = (x1, x2, 2x1 + 3x2, x4)^T = x1(1, 0, 2, 0)^T + x2(0, 1, 3, 0)^T + x4(0, 0, 0, 1)^T,

i.e., x = x1v1 + x2v2 + x4v3, and {v1, v2, v3} spans V. Thus {v1, v2, v3} is a basis, and so dim(V) = 3.

The obvious question to ask about this solution is where the vectors v1, v2 and v3 came from. We shall see the answer to this later.

Suppose that V is a vector space of dimension n, and that {v1, . . . , vm} ⊆ V. If {v1, . . . , vm} spans V, then m ≥ n. If m = n, then {v1, . . . , vm} is a basis, while if m > n, then it is possible to remove m − n of the vectors to get a basis. Similarly, if {v1, . . . , vm} is linearly independent, then m ≤ n. If m = n, then {v1, . . . , vm} is a basis, while if m < n, then it is possible to add n − m vectors to get a basis.

Theorem 42. Suppose that {v1, . . . , vn} is a spanning set for V. Then there is a subset of {v1, . . . , vn} which is a basis of V.

Proof. First, remove any vi which is zero.
Having done this, and relabelled the vectors if necessary, we may assume each v_i is nonzero. Define subsets S_1, S_2, ..., S_n of {v_1, ..., v_n} as follows:

    S_1 = {v_1},
    S_2 = S_1 if v_2 ∈ span(S_1),  and  S_2 = S_1 ∪ {v_2} if v_2 ∉ span(S_1),
    ...
    S_n = S_{n−1} if v_n ∈ span(S_{n−1}),  and  S_n = S_{n−1} ∪ {v_n} if v_n ∉ span(S_{n−1}).

Using Theorem 33, we can show by induction that span{v_1, ..., v_i} = span(S_i) for all i. In particular, when i = n, we see that S_n is a spanning set. On the other hand, it is linearly independent by construction, so it is a basis.

Problem 43. Suppose that

    v_1 = (2, 1, 3, 0)^T,  v_2 = (2, 2, 2, 1)^T,  v_3 = (0, 1, 1, 0)^T,  v_4 = (2, 4, 2, 2)^T,  v_5 = (0, 1, 2, 2)^T.

Show that {v_1, v_2, ..., v_5} spans R^4, and find a subset which is a basis.

Answer. Consider a = (a_1, a_2, a_3, a_4)^T in R^4. We can write a as λ_1 v_1 + · · · + λ_5 v_5 if and only if the system represented by the augmented matrix

    [ 2  2  0  2  0 | a_1 ]
    [ 1  2  1  4  1 | a_2 ]
    [ 3  2  1  2  2 | a_3 ]
    [ 0  1  0  2  2 | a_4 ]

has a solution. Reducing to row-echelon form, we obtain the augmented matrix

    [ 1  2   1   4  1 | a_2 ]
    [ 0  1   0   2  2 | a_4 ]
    [ 0  0  −2  −2  2 | a_1 − 2a_2 + 2a_4 ]
    [ 0  0   0   0  5 | −a_1 − a_2 + a_3 + 2a_4 ].

We can deduce two things from this. First, for any a, there is a solution. Therefore the vectors v_1, v_2, ..., v_5 span R^4. Next, the solution is not unique, as the column corresponding to v_4 is not a leading column. If we take a = 0, then there is a solution for which the coefficient of v_4 is nonzero, so we can write v_4 as a linear combination of the other vectors. This means that we can eliminate v_4 without changing the span. As the span is R^4, which is 4-dimensional, a basis needs at least 4 vectors, and the 4 vectors that remain still span. Consequently {v_1, v_2, v_3, v_5} is a basis.

We can say more about this problem: in fact we can show that v_1 − 2v_2 − v_3 + v_4 = 0, and we can make any one of v_1, ..., v_4 the subject of this formula, for instance v_4 = 2v_2 + v_3 − v_1. This means that we could omit any one of these vectors to get a basis.
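The row-reduction recipe of Problem 43 can be sketched in code. Below is a minimal illustration, assuming Python with exact `Fraction` arithmetic; the function name `basis_subset` is our own choice, not part of the text. We place the vectors as the columns of a matrix, reduce to row-echelon form, and keep the indices of the leading columns.

```python
from fractions import Fraction

def basis_subset(vectors):
    """Given a list of vectors (as tuples) in R^m, return the indices of a
    linearly independent subset with the same span, found by row-reducing
    the matrix whose columns are the vectors and keeping leading columns."""
    m, n = len(vectors[0]), len(vectors)
    # Build the m x n matrix with the vectors as columns, in exact arithmetic.
    A = [[Fraction(vectors[j][i]) for j in range(n)] for i in range(m)]
    leading, row = [], 0
    for col in range(n):
        # Find a pivot in this column at or below the current row.
        pivot = next((r for r in range(row, m) if A[r][col] != 0), None)
        if pivot is None:
            continue  # non-leading column: this vector depends on earlier ones
        A[row], A[pivot] = A[pivot], A[row]
        # Eliminate the entries below the pivot.
        for r in range(row + 1, m):
            factor = A[r][col] / A[row][col]
            A[r] = [a - factor * b for a, b in zip(A[r], A[row])]
        leading.append(col)
        row += 1
    return leading

# The vectors of Problem 43 (written here as tuples; they become columns).
vs = [(2, 1, 3, 0), (2, 2, 2, 1), (0, 1, 1, 0), (2, 4, 2, 2), (0, 1, 2, 2)]
print(basis_subset(vs))  # → [0, 1, 2, 4]: v_4 (index 3) is redundant
```

This reproduces the conclusion of the problem: columns 1, 2, 3 and 5 are leading, so {v_1, v_2, v_3, v_5} is a basis.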
If we had omitted v_4 from the beginning, we would have got the system

    [ 2  2  0  0 | a_1 ]
    [ 1  2  1  1 | a_2 ]
    [ 3  2  1  2 | a_3 ]
    [ 0  1  0  2 | a_4 ],

which row-reduces to

    [ 1  2   1  1 | a_2 ]
    [ 0  1   0  2 | a_4 ]
    [ 0  0  −2  2 | a_1 − 2a_2 + 2a_4 ]
    [ 0  0   0  5 | −a_1 − a_2 + a_3 + 2a_4 ].

From this, we see that it is possible to write every vector a as a linear combination of v_1, v_2, v_3 and v_5, and to do so uniquely. The uniqueness implies that these vectors are linearly independent, by Theorem 31. If we omitted a different vector, we could not see immediately what we would end up with in the row-reduction process.

The method of solution of Problem 43 works for coordinate vectors in general. Given vectors v_1, ..., v_n in R^m, to get a linearly independent set with the same span, we write the vectors as the columns of a matrix, row-reduce it, and remove the vectors corresponding to non-leading columns. This method also works for polynomials: as we have seen, they behave like coordinate vectors. But for other kinds of vectors, such as functions, we cannot use this method. The point of Theorem 42 is that it works for all vector spaces—all we need is a way to decide whether given vectors are in the span of other vectors.

How do we go about enlarging sets of linearly independent vectors to get bases?

Theorem 44. Suppose that {w_1, ..., w_m} is a spanning set for a vector space V, and that {v_1, ..., v_k} is a linearly independent set. Then we can find vectors v_{k+1}, ..., v_n so that {v_1, ..., v_k, v_{k+1}, ..., v_n} is a basis of V.

Proof. Consider the set {v_1, ..., v_k, w_1, ..., w_m}. Since it contains a spanning set, it is itself a spanning set. We apply the algorithm of Theorem 42 to it. When we do this, none of the vectors v_1, ..., v_k will be discarded, because they are linearly independent. Indeed, the algorithm adds vectors to the set until it reaches a vector which depends linearly on the previously chosen vectors, and the first vector which might depend on the previously chosen vectors is w_1.
We can formalise this in another algorithm. Let S_0 = {v_1, ..., v_k}. We construct linearly independent sets S_1, S_2, ..., S_m of vectors as follows:

    S_1 = S_0 if w_1 ∈ span(S_0),  and  S_1 = S_0 ∪ {w_1} if w_1 ∉ span(S_0),
    ...
    S_m = S_{m−1} if w_m ∈ span(S_{m−1}),  and  S_m = S_{m−1} ∪ {w_m} if w_m ∉ span(S_{m−1}).

The proof that this works is similar to the proof of Theorem 42.

Problem 45. Show that the set {(1, 2, 0, 1)^T, (1, 3, 0, 1)^T} is linearly independent in R^4 and that

    (1, 2, 0, 1)^T, (1, 3, 0, 1)^T, (1, 0, 0, 0)^T, (0, 1, 0, 0)^T, (0, 0, 1, 0)^T, (0, 0, 0, 1)^T

spans R^4. Find a subset of this last set which is a basis.

Answer. We first consider the augmented matrix

    [ 1  1 | 0 ]
    [ 2  3 | 0 ]
    [ 0  0 | 0 ]
    [ 1  1 | 0 ].

This row-reduces to

    [ 1  1 | 0 ]
    [ 0  1 | 0 ]
    [ 0  0 | 0 ]
    [ 0  0 | 0 ].

The vectors are linearly independent, because both columns are leading columns, but do not span, because there are some all-zero rows.

For the second part, we consider the augmented matrix

    [ 1  1  1  0  0  0 | a_1 ]
    [ 2  3  0  1  0  0 | a_2 ]
    [ 0  0  0  0  1  0 | a_3 ]
    [ 1  1  0  0  0  1 | a_4 ]

and row-reduce it:

    [ 1  1   1  0  0  0 | a_1 ]
    [ 0  1  −2  1  0  0 | a_2 − 2a_1 ]
    [ 0  0   0  0  1  0 | a_3 ]
    [ 0  0  −1  0  0  1 | a_4 − a_1 ]

then, interchanging the last two rows,

    [ 1  1   1  0  0  0 | a_1 ]
    [ 0  1  −2  1  0  0 | a_2 − 2a_1 ]
    [ 0  0  −1  0  0  1 | a_4 − a_1 ]
    [ 0  0   0  0  1  0 | a_3 ].

This system is consistent, so it has a solution for any (a_1, ..., a_4)^T in R^4, and hence the set of vectors spans R^4. Alternatively, we could argue that the set of standard basis vectors {e_1, ..., e_4} spans R^4, and so any collection of vectors containing these is certainly spanning.

Finally, we observe that columns 4 and 6 in the reduced matrix from the previous part are non-leading. We omit the corresponding vectors. Then

    (1, 2, 0, 1)^T, (1, 3, 0, 1)^T, (1, 0, 0, 0)^T, (0, 0, 1, 0)^T

is a basis for R^4, containing the initial vectors (1, 2, 0, 1)^T and (1, 3, 0, 1)^T.

Problem 46. Suppose that {v_i : i = 1, ..., m} is an orthonormal set in R^n, i.e.,

    v_i · v_j = 1 if i = j,  and  v_i · v_j = 0 if i ≠ j.

(i) Show that {v_i : i = 1, ..., m} is linearly independent.
(ii) Show that, if there exists a nonzero vector v such that v · v_i = 0 for i = 1, ..., m, then we can enlarge {v_i : i = 1, ..., m} to a bigger orthonormal set.

Challenge Problem. This is part (iii) of the previous problem. Show that, if the only vector v for which v · v_i = 0 for all i = 1, ..., m is 0, then {v_i : i = 1, ..., m} is a basis.

Last, but not least, {0} satisfies all the requirements of the definition of a vector space, but in a trivial way. All the discussion above about spans, linear dependence, and so on, is meaningless for it. We define dim{0} = 0; this is a convenient convention.

8. Coordinates with respect to a basis

Let B = {v_1, ..., v_n} be a basis for a vector space V. Any vector v in V may be written in the form

    v = λ_1 v_1 + · · · + λ_n v_n

for appropriate scalars λ_1, ..., λ_n, which are uniquely determined by v. The numbers λ_1, ..., λ_n are called the coordinates of v relative to the basis B, and the column vector with these entries is written [v]_B. The order of the vectors v_1, ..., v_n is important, because it determines the order of the coordinates, and we sometimes refer to an ordered basis to emphasize this. For a fixed vector v, choosing different bases leads to different coordinate vectors.

It is natural to deal with different bases. For example, in describing three-dimensional space, we may want an axis pointing "north", one "up", and one "east". Or we may want axes with one in the direction of the axis of rotation of the earth. Or axes lying in the plane of the solar system. Or axes in the plane of our galaxy. "Choosing coordinates" means choosing a basis, and then writing vectors relative to that basis. We will investigate this further later in this course.
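For coordinate vectors, finding [v]_B amounts to solving the linear system whose coefficient columns are the basis vectors. Below is a minimal sketch, assuming Python with exact `Fraction` arithmetic; the function name `coordinates` and the example basis are our own hypothetical choices, not from the text.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Solve v = λ_1 v_1 + ... + λ_n v_n for the coordinate vector [v]_B,
    by Gauss-Jordan elimination on the augmented matrix [v_1 ... v_n | v].
    Assumes `basis` really is a basis of R^n (n vectors, each of length n)."""
    n = len(basis)
    # Row i of the augmented matrix: i-th entries of each basis vector, then of v.
    A = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
         for i in range(n)]
    for col in range(n):
        # Pick a pivot and clear the rest of the column.
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]

# Hypothetical ordered basis B = {(1,1)^T, (1,-1)^T} of R^2.
B = [(1, 1), (1, -1)]
print(coordinates(B, (3, 1)))  # λ_1 = 2, λ_2 = 1, since 2(1,1) + 1(1,-1) = (3,1)
```

Reordering the basis permutes the coordinates, which is why we speak of an ordered basis: `coordinates([(1, -1), (1, 1)], (3, 1))` yields λ_1 = 1, λ_2 = 2 instead.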