1. General Vector Spaces

1.1. Vector space axioms.

Definition 1.1. Let V be a nonempty set of objects on which the operations of addition and scalar multiplication are defined. By addition we mean a rule for assigning to each pair of vectors u, v ∈ V a unique vector u + v. By scalar multiplication we mean a rule for associating to each scalar k and each u ∈ V a unique vector ku. The set V together with these operations is called a vector space, provided the following properties hold for all u, v, w ∈ V and scalars k, l in some field K:
(1) If u, v ∈ V, then u + v ∈ V. We say that V is closed under addition.
(2) u + v = v + u.
(3) (u + v) + w = u + (v + w).
(4) V contains an object 0, called the zero vector, which satisfies u + 0 = u for every vector u ∈ V.
(5) For each u ∈ V there exists an object −u such that u + (−u) = 0.
(6) If u ∈ V and k ∈ K, then ku ∈ V. We say V is closed under scalar multiplication.
(7) k(u + v) = ku + kv.
(8) (k + l)u = ku + lu.
(9) k(lu) = (kl)u.
(10) 1u = u, where 1 is the multiplicative identity in K.

Remark 1.2. The most important vector spaces are real vector spaces (for which K = R in the preceding definition), and complex vector spaces (where K is the field of complex numbers C).

1.2. Subspaces, linear independence, span, basis.

Definition 1.3. A nonempty subset W of a vector space V is called a subspace if W is closed under scalar multiplication and addition.

Definition 1.4. A set M = {v_1, ..., v_s} of vectors in V is called linearly independent, provided the only scalars c_1, ..., c_s which solve the equation c_1 v_1 + c_2 v_2 + ... + c_s v_s = 0 are c_1 = c_2 = ... = c_s = 0. If M is not linearly independent then it is called linearly dependent.

Definition 1.5. The span of a set of vectors M = {v_1, ..., v_s} is the set of all possible linear combinations of the members of M.

Definition 1.6. A set of vectors in a subspace W of V is said to be a basis for W if it is linearly independent and its span is W.

Definition 1.7.
A vector space V is finite-dimensional if it has a basis with finitely many vectors, and infinite-dimensional otherwise. If V is finite-dimensional, then the dimension of V is the number of vectors in any basis; otherwise the dimension of V is infinite.

1.3. Examples.

Illustration 1: Euclidean and complex spaces. The most important examples of finite-dimensional vector spaces are n-dimensional Euclidean space R^n and n-dimensional complex space C^n.

Illustration 2. An important example of an infinite-dimensional vector space is the space of real-valued functions which have continuous derivatives up to order n on all of R, which we denote by C^n(R).

Definition 1.8. Let f_1(x), f_2(x), ..., f_n(x) be elements of C^(n−1)(R). The Wronskian of these functions is the determinant whose i-th row contains the (i − 1)-th derivatives of the functions:
\[ \begin{vmatrix} f_1(x) & f_2(x) & \cdots & f_n(x) \\ f_1'(x) & f_2'(x) & \cdots & f_n'(x) \\ \vdots & \vdots & & \vdots \\ f_1^{(n-1)}(x) & f_2^{(n-1)}(x) & \cdots & f_n^{(n-1)}(x) \end{vmatrix} \]

Theorem 1.9. (Wronski’s test for linear independence) Let f_1(x), f_2(x), ..., f_n(x) be real-valued functions which have (n − 1) continuous derivatives on all of R. If the Wronskian of these functions is not identically zero on R, then the functions form a linearly independent set in C^(n−1)(R).

Example. Show that f_1(x) = sin^2 2x, f_2(x) = cos^2 2x, f_3(x) = cos 4x are linearly dependent in C^2(R).

Solution: One approach would be to examine the Wronskian of our functions. A simple computation shows that the Wronskian is identically 0, which is consistent with linear dependence but does not by itself prove it, since Theorem 1.9 only guarantees independence when the Wronskian is not identically zero. Instead, since cos 4x = cos^2 2x − sin^2 2x, it follows that f_1(x) − f_2(x) + f_3(x) = 0 for all x, and our functions are linearly dependent.

2. Linear Transformations

2.1. Definition.

Definition 2.1. Let V, W be real vector spaces. A transformation T : V → W is a linear transformation if, for any pair α, β ∈ R and any u, v ∈ V, we have T(αu + βv) = αT(u) + βT(v).

Illustration.
Let V = C^1(R) denote the continuously differentiable real-valued functions defined on R, and W = C^0(R) denote the continuous real-valued functions on R. The derivative operator d/dx : V → W, defined by (d/dx)(f) = df/dx ∈ W for f ∈ V, is linear, since d/dx(αf + βg) = α df/dx + β dg/dx.

Example. Find the matrix representation A of the linear transformation T : R^2 → R^2, where T rotates each vector x ∈ R^2 with basepoint at the origin clockwise by an angle θ.

Solution: We must find the images T(e_1) and T(e_2) of the standard basis vectors under our transformation. It is easy to check that T(e_1) = T((1, 0)^T) = (cos θ, −sin θ)^T, while T(e_2) = T((0, 1)^T) = (sin θ, cos θ)^T. These images are the columns of A, hence
\[ A = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}. \]

2.2. Isomorphism.

Definition 2.2. A linear transformation T : V → W is called an isomorphism if it is one-to-one and onto, and we say a vector space V is isomorphic to W if there is an isomorphism between V and W.

Theorem 2.3. Every real n-dimensional vector space is isomorphic to R^n.

Example. If V is an n-dimensional real vector space and the transformation T : V → R^n is an isomorphism, show there exists a unique inner product ⟨·, ·⟩ on V such that ⟨u, v⟩ = T(u) · T(v), where T(u) · T(v) denotes the Euclidean dot product on R^n.

Solution: Define ⟨u, v⟩ := T(u) · T(v). We show that this defines an inner product.
• ⟨u, v⟩ = T(u) · T(v) = T(v) · T(u) = ⟨v, u⟩.
• ⟨u + v, w⟩ = T(u + v) · T(w) = (T(u) + T(v)) · T(w) = T(u) · T(w) + T(v) · T(w) = ⟨u, w⟩ + ⟨v, w⟩.
• ⟨ku, v⟩ = T(ku) · T(v) = kT(u) · T(v) = k⟨u, v⟩.
• ⟨v, v⟩ = ‖T(v)‖^2 ≥ 0, and since T is an isomorphism (in particular one-to-one), ‖T(v)‖^2 = 0 if and only if v = 0.
So ⟨u, v⟩ satisfies all the properties of an inner product. Uniqueness is immediate, since the requirement ⟨u, v⟩ = T(u) · T(v) fixes the value of ⟨u, v⟩ for every pair u, v ∈ V.

2.3. Kernel and range, one-to-one and onto.

Let T : V → W be a linear transformation. Then:

Definition 2.4. The kernel of T is the set ker(T) := {x ∈ V | T(x) = 0}.

Definition 2.5.
The range of T is the set {y ∈ W | ∃ x ∈ V such that y = T(x)}.

Definition 2.6. T is onto if its range is all of W, and one-to-one if T maps distinct vectors in V to distinct vectors in W. We say T is an injection if it is one-to-one, and a surjection if it is onto.

Example. Let T : V → W be a linear transformation. Show that T is one-to-one if and only if ker(T) = {0}.

Solution: Suppose first T is one-to-one. Since T is linear, T(0) = 0. Since T is one-to-one, 0 is the only vector x for which T(x) = 0, so ker(T) = {0}. Next suppose ker(T) = {0}, and choose x_1, x_2 ∈ V such that x_1 ≠ x_2. Then x_1 − x_2 is a non-zero vector, so it is not in the kernel of T. Hence T(x_1 − x_2) = T(x_1) − T(x_2) ≠ 0, and T is one-to-one.

3. Matrix Algebra

Theorem 3.1. Let T : R^n → R^m be a linear transformation, and let {e_1, ..., e_n} denote the standard basis for R^n. Then given any x ∈ R^n, we can express T(x) as a matrix transformation T(x) = Ax, where A is the m × n matrix whose i-th column is T(e_i).

Let us fix some notation. We denote the entry in the i-th row and j-th column of A by the lowercase a_ij.

3.1. Fundamental spaces of a matrix.

Definition 3.2. Let A be an m × n matrix.
(1) The row (column) space of A is the subspace spanned by the row (column) vectors of A. These are denoted row(A) and col(A) respectively.
(2) The null space is the solution space of Ax = 0, denoted null(A).

Definition 3.3. The dimension of the row space of a matrix A is called the rank of A, while the dimension of the null space is called the nullity of A.

Definition 3.4. If S is a nonempty subset of R^n then the orthogonal complement of S, denoted S^⊥, is the set of vectors in R^n which are orthogonal to every vector in S.

Theorem 3.5. If A is an m × n matrix, then the row space of A and the null space of A are orthogonal complements in R^n. Likewise, the column space of A and the null space of A^T are orthogonal complements in R^m.

Example. For the matrix
\[ A = \begin{pmatrix} 2 & 7 & 4 & 5 & 8 \\ 4 & 4 & 8 & 5 & 4 \\ 1 & -9 & -3 & -5 & -14 \\ 3 & 5 & 7 & 5 & 6 \end{pmatrix}, \]
show that null(A) and row(A) are orthogonal complements.
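The example above can also be explored computationally. The following is a minimal sketch, in Python with exact rational arithmetic, of one way to row-reduce A, read off a basis of null(A) from the free columns, and confirm that each basis vector is orthogonal to every row of A. The helper names rref, null_space_basis and dot are illustrative choices, not part of these notes.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, pivot columns), using exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        # Look for a row at or below r with a non-zero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot here: column c is a free column
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]  # scale the pivot entry to 1
        for i in range(len(m)):
            if i != r:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def null_space_basis(rows):
    """One basis vector of the null space per free column of the reduced matrix."""
    reduced, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for r, c in enumerate(pivots):
            v[c] = -reduced[r][f]  # solve for each pivot variable
        basis.append(v)
    return basis

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

A = [[2, 7, 4, 5, 8],
     [4, 4, 8, 5, 4],
     [1, -9, -3, -5, -14],
     [3, 5, 7, 5, 6]]

reduced, pivots = rref(A)
basis = null_space_basis(A)
rank, nullity = len(pivots), len(basis)
# Every null-space basis vector should be orthogonal to every row of A.
all_orthogonal = all(dot(row, v) == 0 for row in A for v in basis)
print(rank, nullity, all_orthogonal)  # prints: 3 2 True
```

The basis vectors produced this way are scalar multiples of integer vectors, so they may differ by a constant factor from a basis found by hand. Exact fractions are used so that the orthogonality test is a true equality rather than a floating-point approximation.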
Solution: Recall that the null space of A consists of those vectors which solve the equation Ax = 0. It is left as an exercise to show that the null space is spanned by the vectors (7, −6, −3, 0, 5)^T and (−1, −2, −1, 4, 0)^T. Further, row(A) is the same as the span of the non-zero rows of the reduced row echelon form of A (check!), which is given by the matrix
\[ \begin{pmatrix} 1 & 0 & 0 & 1/4 & -7/5 \\ 0 & 1 & 0 & 1/2 & 6/5 \\ 0 & 0 & 1 & 1/4 & 3/5 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}. \]
Hence row(A) is spanned by the three non-zero rows in the reduced matrix. It is easily checked, by computing the dot products pairwise, that any vector in the row space is orthogonal to any vector in the null space. Since dim(row(A)) + dim(null(A)) = 3 + 2 = 5, the two subspaces are orthogonal complements in R^5.

Example. Prove that the row vectors of an invertible n × n matrix A form a basis for R^n.

Solution: If A is invertible, then the row vectors of A are linearly independent (check!). The row space is a subspace of R^n spanned by n linearly independent vectors, so it has dimension n and is therefore all of R^n. It follows that the row vectors form a basis for R^n.

3.2. Dimension theorem.

Theorem 3.6. If A is an m × n matrix, then rank(A) + nullity(A) = n.

Example. Prove that if A is a square matrix for which A and A^2 have the same rank, then null(A) ∩ col(A) = {0}.

Solution: First we show that null(A) = null(A^2). By the dimension theorem, we know that dim(null(A^2)) = n − rank(A^2) = n − rank(A) = dim(null(A)). Since null(A) ⊂ null(A^2) (check!), it follows that null(A) = null(A^2). Suppose now that y ∈ null(A) ∩ col(A). Then there exists x such that y = Ax and Ay = 0. Since A^2 x = Ay = 0, x ∈ null(A^2) = null(A), and therefore y = Ax = 0.

3.3. Rank Theorem for matrices.

Theorem 3.7. The row space and column space of a matrix have the same dimension.

The rank theorem has several immediate implications.

Proposition 3.8. Suppose A is an m × n matrix. Then:
• rank(A) = rank(A^T).
• rank(A) + nullity(A^T) = m.

Example. Prove the preceding proposition.
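Proposition 3.8 can be sanity-checked on a concrete matrix; this is an illustration, not a proof. The sketch below, assuming Python with exact fractions, computes the rank by forward elimination and compares rank(A) with rank(A^T) for the 4 × 5 matrix from the example following Theorem 3.5. The function name rank and the variable A_T are illustrative.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix via forward elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for c in range(len(m[0])):
        # Find a pivot for column c among the rows not yet used.
        p = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[rk], m[p] = m[p], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][c] / m[rk][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
        if rk == len(m):
            break
    return rk

A = [[2, 7, 4, 5, 8],
     [4, 4, 8, 5, 4],
     [1, -9, -3, -5, -14],
     [3, 5, 7, 5, 6]]
A_T = [list(col) for col in zip(*A)]  # transpose: rows of A_T are columns of A

print(rank(A), rank(A_T))  # prints: 3 3
```

Here m = 4, so the second claim of the proposition predicts nullity(A^T) = 4 − 3 = 1, which corresponds to a single linear dependency among the four rows of A.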
Solution: To prove the first claim, note that rank(A) = dim(row(A)) = dim(col(A^T)), the latter equality following since the rows of A are the columns of A^T. By the rank theorem, dim(col(A^T)) = dim(row(A^T)) = rank(A^T), and the result follows. To prove the second claim, recall that A^T is an n × m matrix, so the dimension theorem applied to A^T reads rank(A^T) + nullity(A^T) = m. Now apply part one of the proposition, i.e. rank(A) = rank(A^T), and the result follows.

3.4. Matrix multiplication.

Definition 3.9. Suppose A is an m × n matrix, and B is an n × k matrix. Then we define their product AB to be the m × k matrix whose entry in the i-th row and l-th column is Σ_{j=1}^n a_ij b_jl.

Illustration. Suppose we represent v ∈ R^n as a column vector, i.e. v = (v_1, v_2, ..., v_n)^T with respect to the standard basis. Further, let A = (a_ij) be an n × n matrix with entries a_ij. Then the vector Av obtained by multiplying v by A has components (Av)_i = Σ_{k=1}^n a_ik v_k.

3.5. Change of basis.

Definition 3.10. Suppose B = {v_1, ..., v_k} is an ordered basis for a subspace W of R^n, and w = a_1 v_1 + ... + a_k v_k is an expression for w ∈ W in terms of B. Then we call the scalars a_1, ..., a_k the coordinates of w with respect to B. Further, the k-tuple of coordinates [w]_B := (a_1, ..., a_k)^T is referred to as the coordinate matrix of w with respect to B.

Theorem 3.11. Suppose B = {v_1, ..., v_n} and B' = {v_1', ..., v_n'} are two bases for R^n, and w ∈ R^n. Then the relation between [w]_B and [w]_B' is given by [w]_B' = P_{B→B'} [w]_B, where P_{B→B'} := ([v_1]_B' | [v_2]_B' | ··· | [v_n]_B') is the matrix whose column vectors are the coordinate matrices of the members of B with respect to B'.

Example. Let S denote the standard basis for R^3, and let B = {v_1, v_2, v_3} be the basis with members v_1 = (1, 2, 1), v_2 = (2, 5, 0), and v_3 = (3, 3, 8). Find the transition matrices P_{B→S} and P_{S→B}.

Solution. By our theorem, we know that
\[ P_{B \to S} = ([v_1]_S \mid [v_2]_S \mid [v_3]_S) = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 5 & 3 \\ 1 & 0 & 8 \end{pmatrix}. \]
We can immediately find P_{S→B} by noting that it must be the inverse of P_{B→S} (why?).

3.6.
Similarity and Diagonalizability.

Definition 3.12. If A and C are square matrices with the same size, we say that C is similar to A if there is an invertible matrix P such that C = P^{-1} A P.

Definition 3.13. Properties of similar matrices.
(1) Two square matrices are similar if and only if there exist bases with respect to which the matrices represent the same linear operator.
(2) Similar matrices have the same eigenvalues, determinant, rank, nullity, and trace.

Definition 3.14. A square matrix A is diagonalizable if there exists an invertible matrix P for which P^{-1} A P is a diagonal matrix.

Theorem 3.15. If A is an n × n matrix, then the following are equivalent.
• A is diagonalizable.
• A has n linearly independent eigenvectors.
• R^n has a basis consisting of eigenvectors of A.

Example. Determine whether the matrix
\[ A = \begin{pmatrix} -1 & 4 & -2 \\ -3 & 4 & 0 \\ -3 & 1 & 3 \end{pmatrix} \]
is diagonalizable. If so, find the matrix P that diagonalizes the matrix A.

Solution: You can check that the characteristic polynomial of A is p(λ) = (λ − 1)(λ − 2)(λ − 3), so that A has three distinct eigenvalues. Since eigenvectors corresponding to distinct eigenvalues are linearly independent (check!), A has 3 linearly independent eigenvectors and we know A is diagonalizable. To determine P, we must find eigenvectors corresponding to the eigenvalues λ = 1, 2, 3. The reader can check that these eigenvectors are v_1 = (1, 1, 1)^T, v_2 = (2, 3, 3)^T, and v_3 = (1, 3, 4)^T. Hence one choice of the matrix P, with these eigenvectors as its columns, is given by
\[ P = \begin{pmatrix} 1 & 2 & 1 \\ 1 & 3 & 3 \\ 1 & 3 & 4 \end{pmatrix}. \]

3.7. Orthogonal diagonalizability.

Definition 3.16. A square matrix A is orthogonally diagonalizable if there exists an orthogonal matrix P for which P^T A P is a diagonal matrix.

Theorem 3.17. A real square matrix is orthogonally diagonalizable if and only if it is symmetric.

Example. Prove that if A is a symmetric matrix, then eigenvectors from different eigenspaces are orthogonal.

Solution. Let v_1 and v_2 be eigenvectors corresponding to distinct eigenvalues λ_1, λ_2.
Consider λ_1 v_1 · v_2 = (λ_1 v_1)^T v_2 = (A v_1)^T v_2 = v_1^T A^T v_2. Since A is symmetric, v_1^T A^T v_2 = v_1^T A v_2 = v_1^T λ_2 v_2 = λ_2 v_1 · v_2. This implies (λ_1 − λ_2) v_1 · v_2 = 0, which, since λ_1 ≠ λ_2, tells us v_1 · v_2 = 0.

3.8. Quadratic forms.

Definition 3.18. Let A be a real n × n matrix, and x ∈ R^n. Then the real-valued function x^T A x is called a quadratic form. For example, if
\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}, \]
then the quadratic form associated with A is a_11 x_1^2 + a_22 x_2^2 + (a_12 + a_21) x_1 x_2. When A is symmetric this reads a_11 x_1^2 + a_22 x_2^2 + 2 a_12 x_1 x_2.

Theorem 3.19. (Principal Axes Theorem) If A is a symmetric n × n matrix, then there is an orthogonal change of variable x = P y that transforms the quadratic form x^T A x into a quadratic form y^T D y with no cross product terms. Specifically, if P orthogonally diagonalizes A, then x^T A x = y^T D y = λ_1 y_1^2 + ... + λ_n y_n^2, where λ_1, ..., λ_n are the eigenvalues of A corresponding to the eigenvectors that form the successive columns of P.

Definition 3.20. A quadratic form x^T A x is said to be:
• Positive definite if x^T A x > 0 for all x ≠ 0.
• Negative definite if x^T A x < 0 for all x ≠ 0.
• Indefinite if x^T A x takes both positive and negative values.

Example. Show that if A is a symmetric matrix, then A is positive definite if and only if all eigenvalues of A are positive.

Solution: From the Principal Axes Theorem, we know that we can find an orthogonal P such that, with x = P y, x^T A x = y^T D y = λ_1 y_1^2 + ... + λ_n y_n^2. Since P is invertible, it follows that y ≠ 0 ⇔ x ≠ 0, so x^T A x > 0 for all x ≠ 0 if and only if λ_1 y_1^2 + ... + λ_n y_n^2 > 0 for all y ≠ 0. If every λ_i is positive, this sum is positive for every y ≠ 0. Conversely, if some λ_i ≤ 0, then taking y = e_i gives the value λ_i ≤ 0. This means that x^T A x > 0 for all x ≠ 0 ⇔ all eigenvalues of A are positive.

3.9. Functions of a matrix, matrix exponential.

Definition 3.21. Suppose A is an n × n diagonalizable matrix which is diagonalized by P, and λ_1, λ_2, ..., λ_n are the eigenvalues of A, listed in the order of the corresponding columns of P. If f is a real-valued function whose Taylor series converges on some interval containing the eigenvalues of A, then f(A) = P diag(f(λ_1), f(λ_2), ..., f(λ_n)) P^{-1}.

Example. Given
\[ A = \begin{pmatrix} -2 & 0 & -36 \\ 0 & -3 & 0 \\ -36 & 0 & -23 \end{pmatrix}, \]
compute exp(tA).

Solution.
We leave it as an exercise to show that the eigenvalues are λ = −3, 25, −50, with corresponding eigenvectors v_1 = (0, 1, 0)^T, v_2 = (−4/5, 0, 3/5)^T, and v_3 = (3/5, 0, 4/5)^T. It follows that the matrix P that diagonalizes A is
\[ P = \begin{pmatrix} 0 & -4/5 & 3/5 \\ 1 & 0 & 0 \\ 0 & 3/5 & 4/5 \end{pmatrix}, \]
and A = P diag(−3, 25, −50) P^T (here P^{-1} = P^T, since the columns of P are orthonormal). From Definition 3.21, it follows that exp(tA) = P exp(t diag(−3, 25, −50)) P^T, and exp(t diag(−3, 25, −50)) = diag(exp(−3t), exp(25t), exp(−50t)). It is easy to verify that
\[ \exp(tA) = \begin{pmatrix} \frac{16}{25} e^{25t} + \frac{9}{25} e^{-50t} & 0 & -\frac{12}{25} e^{25t} + \frac{12}{25} e^{-50t} \\ 0 & e^{-3t} & 0 \\ -\frac{12}{25} e^{25t} + \frac{12}{25} e^{-50t} & 0 & \frac{9}{25} e^{25t} + \frac{16}{25} e^{-50t} \end{pmatrix}. \]

3.10. Determinants.

Let
\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \]
denote an arbitrary 2 × 2 matrix. Recall that the determinant of A is defined by det(A) := a_11 a_22 − a_12 a_21. More generally, let A be a square n × n matrix, and denote the entry in the i-th row and j-th column by a_ij.

Definition 3.22. The determinant of a square n × n matrix is defined by the sum
\[ \det(A) = \sum \pm\, a_{1 j_1} a_{2 j_2} \cdots a_{n j_n}. \]
Here the summation is over all permutations (j_1, j_2, ..., j_n) of (1, 2, ..., n), where the sign is “+” if the permutation is even, and “−” if the permutation is odd.

Illustration. Suppose that
\[ A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}. \]
Then:
det(A) = a_11 a_22 a_33 + a_12 a_23 a_31 + a_13 a_21 a_32 − a_13 a_22 a_31 − a_12 a_21 a_33 − a_11 a_23 a_32.

3.11. Properties of determinants.

Proposition 3.23. Suppose A, B are square matrices of the same size. Then:
(1) A is invertible if and only if det(A) ≠ 0.
(2) det(AB) = det(A) · det(B).
(3) det(A) = det(A^T).

Example. Show that a square matrix A is invertible if and only if A^T A is invertible.

Solution: Suppose A is invertible. Then from the first item in the above proposition we know det(A) ≠ 0. Further, from items 2 and 3 we see that det(A^T A) = det(A^T) det(A) = det(A)^2 ≠ 0, so that A^T A is invertible. On the other hand, if det(A^T A) ≠ 0 then by the same equality we have det(A) ≠ 0, and A is invertible.

3.12. Cramer’s rule.
Theorem 3.24. If Ax = b is a linear system of n equations in n unknowns, then the system has a unique solution if and only if det(A) ≠ 0. Cramer’s rule then says that the exact solution is given by
\[ x_1 = \frac{\det(A_1)}{\det(A)}, \quad x_2 = \frac{\det(A_2)}{\det(A)}, \quad \ldots, \quad x_n = \frac{\det(A_n)}{\det(A)}. \]
Here A_i denotes the matrix which results when the i-th column of A is replaced by the column vector b.

Example. Solve
\[ \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \]
using Cramer’s rule.

Solution:
\[ x_1 = \frac{\det \begin{pmatrix} 0 & 0 \\ 1 & 1 \end{pmatrix}}{\det \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}} = \frac{0}{1} = 0, \qquad x_2 = \frac{\det \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}}{\det \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}} = 1. \]

3.13. Formula for A^{-1}.

Definition 3.25. If A is a square matrix, then the minor of entry a_ij is denoted by M_ij, and is defined to be the determinant of the submatrix that remains when the i-th row and j-th column are deleted. The number C_ij = (−1)^{i+j} M_ij is called the cofactor of entry a_ij.

Definition 3.26. If A is a square matrix, the matrix
\[ C = \begin{pmatrix} C_{11} & C_{12} & \cdots & C_{1n} \\ C_{21} & C_{22} & \cdots & C_{2n} \\ \vdots & \vdots & & \vdots \\ C_{n1} & C_{n2} & \cdots & C_{nn} \end{pmatrix} \]
is called the matrix of cofactors. The adjoint of A is the transpose of C, which we denote by adj(A).

Theorem 3.27. If A is invertible, then its inverse is given by
\[ A^{-1} = \frac{1}{\det(A)} \operatorname{adj}(A) = \frac{1}{\det(A)} C^T. \]

Example. Find the inverse of
\[ A = \begin{pmatrix} 2 & 0 & 3 \\ 0 & 3 & 2 \\ -2 & 0 & -4 \end{pmatrix} \]
using Theorem 3.27.

Solution: First we compute the determinant, expanding along the first row:
\[ \det(A) = a_{11} C_{11} + a_{12} C_{12} + a_{13} C_{13} = 2 \det \begin{pmatrix} 3 & 2 \\ 0 & -4 \end{pmatrix} - 0 \det \begin{pmatrix} 0 & 2 \\ -2 & -4 \end{pmatrix} + 3 \det \begin{pmatrix} 0 & 3 \\ -2 & 0 \end{pmatrix} = 2(-12) + 3(6) = -6. \]
We can similarly obtain the remaining C_ij which determine the adjoint. Finally we find that the inverse is:
\[ A^{-1} = -\frac{1}{6} \begin{pmatrix} -12 & 0 & -9 \\ -4 & -2 & -4 \\ 6 & 0 & 6 \end{pmatrix}. \]

3.14. Geometric interpretation of the determinant.

Theorem 3.28. If A is a 2 × 2 matrix, then |det(A)| represents the area of the parallelogram determined by the two column vectors of A, when they are positioned so that their base points coincide.
If A is a 3 × 3 matrix, then |det(A)| represents the volume of the parallelepiped determined by the three column vectors of A, when they are positioned so that their base points coincide.

Example. Find the area of the parallelogram in the plane with vertices P_1(1, 2), P_2(4, 4), P_3(7, 5), P_4(4, 3).

Solution: Let’s consider the vectors P_1P_2 and P_1P_4, which start at P_1 and extend to P_2 and P_4 respectively. A simple calculation shows P_1P_2 = (3, 2)^T and P_1P_4 = (3, 1)^T. Placing these vectors as the columns of the matrix
\[ A = \begin{pmatrix} 3 & 3 \\ 2 & 1 \end{pmatrix}, \]
by our theorem we know that the area of our parallelogram is given by |det(A)| = |3 · 1 − 3 · 2| = 3.

3.15. Cross product.

Definition 3.29. Let u = (u_1, u_2, u_3)^T, v = (v_1, v_2, v_3)^T. The cross product of u with v, denoted u × v, is the vector
\[ u \times v := \left( \det \begin{pmatrix} u_2 & u_3 \\ v_2 & v_3 \end{pmatrix},\; -\det \begin{pmatrix} u_1 & u_3 \\ v_1 & v_3 \end{pmatrix},\; \det \begin{pmatrix} u_1 & u_2 \\ v_1 & v_2 \end{pmatrix} \right)^T. \]

Example. For u = (1, 0, 2)^T, v = (−3, 1, 0)^T, compute u × v.

Solution: By the definition, u × v = ((0 · 0 − 2 · 1), −(1 · 0 − 2 · (−3)), (1 · 1 − 0 · (−3)))^T = (−2, −6, 1)^T.

4. Eigenvalues and eigenvectors

4.1. Eigenvalues of mappings between linear spaces.

Definition 4.1. Suppose V is a real vector space, and T : V → V is a linear map. Then we say λ ∈ R is an eigenvalue of T, provided there exists a non-zero vector x ∈ V such that (T − λI)x = 0.

Example. Suppose V is a real vector space, and let I be the identity operator on V. Find the eigenvalues and eigenspaces of I.

Solution: Since Ix = x for all x ∈ V, it follows that 1 is the only eigenvalue, and the eigenspace corresponding to 1 is all of V.

4.2. Real and complex eigenvalues for maps between finite-dimensional spaces.

Definition 4.2. If A is an n × n matrix, then a scalar λ is called an eigenvalue of A if there exists a non-zero vector x such that Ax = λx. If λ is an eigenvalue of A, then every nonzero vector x such that Ax = λx is called an eigenvector of A.

Example. Find all eigenvalues of the matrix
\[ A = \begin{pmatrix} 4 & 0 & 1 \\ -2 & 1 & 0 \\ -2 & 0 & 1 \end{pmatrix}, \]
and the corresponding eigenvectors.

Solution: We note that λ is an eigenvalue provided the equation (A − λI_d)x = 0 has a non-zero solution x, where I_d denotes the identity matrix. This is only possible if λ solves the characteristic equation det(A − λI_d) = 0. For the matrix A in our example, the characteristic equation reads (check!) λ^3 − 6λ^2 + 11λ − 6 = (λ − 1)(λ − 2)(λ − 3) = 0, which has solutions λ = 1, 2, 3. Next, to determine the eigenvectors corresponding to λ = 1, we must solve the system (A − I_d)x = 0 for non-zero x. In other words, we solve
\[ \begin{pmatrix} 3 & 0 & 1 \\ -2 & 0 & 0 \\ -2 & 0 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \]
Using your favourite solution method, you can easily determine that one eigenvector is (x_1, x_2, x_3)^T = (0, 1, 0)^T. Similarly, we find an eigenvector corresponding to λ = 2 is (−1, 2, 2)^T, and for λ = 3 an eigenvector is (−1, 1, 1)^T. Finally, it is important to note that any non-zero scalar multiple of one of these eigenvectors is also an eigenvector, so we have actually determined a subspace corresponding to each eigenvalue, consisting of its eigenvectors together with the zero vector (referred to as the eigenspace of λ).

Definition 4.3. If n is a positive integer, then a complex n-tuple is a sequence of n complex numbers (v_1, ..., v_n). The set of all complex n-tuples is called complex n-space and is denoted by C^n.

Definition 4.4. If u = (u_1, u_2, ..., u_n) and v = (v_1, v_2, ..., v_n) are vectors in C^n, then the complex Euclidean dot (inner) product of u and v is defined by
\[ u \cdot v := u_1 \overline{v_1} + u_2 \overline{v_2} + \cdots + u_n \overline{v_n}, \]
where the overbar denotes complex conjugation. The Euclidean norm is ‖v‖ := √(v · v).

Definition 4.5. A complex matrix A is a matrix whose entries are complex numbers. Further, we define the complex conjugate of a matrix A, denoted \overline{A}, to be the matrix whose entries are the complex conjugates of the entries of A. That is, if A has entries a_ij, then \overline{A} has entries \overline{a_ij}.

Definition 4.6.
If A is a complex n × n matrix, then the complex roots λ of the characteristic equation det(A − λI) = 0 are called complex eigenvalues of A. Further, complex nonzero solutions x to (A − λI)x = 0 are referred to as the complex eigenvectors corresponding to λ.

Example. Given
\[ A = \begin{pmatrix} 4 & -5 \\ 1 & 0 \end{pmatrix}, \]
determine the eigenvalues and find bases for the corresponding eigenspaces.

Solution: It is left as an exercise to check that the characteristic equation is λ^2 − 4λ + 5 = 0, so the eigenvalues are λ = 2 ± i. Let us determine the eigenspace corresponding to λ = 2 + i. We must solve (λI − A)(x, y)^T = (0, 0)^T, that is,
\[ \begin{pmatrix} -2+i & 5 \\ -1 & 2+i \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]
Since we know this system must have a non-zero solution, it follows that the reduced matrix must have a row of zeros. Hence we need only solve (−2 + i)x + 5y = 0, which has as solution the eigenvector (x, y) = (5/(2 − i), 1) = (2 + i, 1), which spans the eigenspace of λ = 2 + i. It is a good exercise for the reader to check that, for a real matrix A with a complex eigenvalue λ and corresponding eigenvector x, the conjugate \overline{λ} is another eigenvalue with corresponding eigenvector \overline{x}. Hence (2 − i, 1) is a basis for the eigenspace corresponding to λ = 2 − i.

4.3. Generalized Eigenspaces.

Definition 4.7. Let A be a complex n × n matrix, with distinct eigenvalues {λ_1, λ_2, ..., λ_k}. The generalized eigenspace V_λi pertaining to λ_i is defined by V_λi = {x ∈ C^n | (A − λ_i I)^n x = 0}. In particular, all eigenvectors corresponding to λ_i are in V_λi.

Theorem 4.8. Let A be a complex n × n matrix with distinct eigenvalues {λ_1, λ_2, ..., λ_k} and corresponding generalized eigenspaces V_λi, i = 1, ..., k. Then:
(1) V_λi is invariant under A, in the sense that A V_λi ⊂ V_λi for i = 1, ..., k.
(2) The spaces V_λi are mutually linearly independent.
(3) dim V_λi = m(λ_i), where m(λ_i) is the algebraic multiplicity of the eigenvalue λ_i.
(4) A is similar to a block diagonal matrix with k blocks A_1, ..., A_k.

4.4. Jordan Normal Form.

Definition 4.9. Let λ ∈ C.
A Jordan block J_k(λ) is a k × k upper-triangular matrix of the form
\[ J_k(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & 0 & \cdots & \lambda \end{pmatrix}. \]

Definition 4.10. A Jordan matrix is any matrix of the form
\[ J = \begin{pmatrix} J_{n_1}(\lambda_1) & & 0 \\ & \ddots & \\ 0 & & J_{n_k}(\lambda_k) \end{pmatrix}, \]
where each J_{n_i}(λ_i) is a Jordan block, and n_1 + n_2 + ··· + n_k = n.

Theorem 4.11. Given any complex n × n matrix A, there is an invertible matrix S such that
\[ A = S \begin{pmatrix} J_{n_1}(\lambda_1) & & 0 \\ & \ddots & \\ 0 & & J_{n_k}(\lambda_k) \end{pmatrix} S^{-1} = S J S^{-1}, \]
where each J_{n_i}(λ_i) is a Jordan block, and n_1 + n_2 + ··· + n_k = n. The eigenvalues λ_1, ..., λ_k are not necessarily distinct. If A is real with real eigenvalues, then S can be taken to be real.

5. Inner product spaces

5.1. Inner product.

Definition 5.1. An inner product on a real vector space V is a function that associates a unique real number ⟨u, v⟩ to each pair of vectors u, v ∈ V, in such a way that the following properties hold for all u, v, w ∈ V and scalars k:
(1) ⟨v, v⟩ ≥ 0, and ⟨v, v⟩ = 0 if and only if v = 0.
(2) ⟨u, v⟩ = ⟨v, u⟩.
(3) ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩.
(4) ⟨ku, v⟩ = k⟨u, v⟩.
A real vector space equipped with an inner product is called a real inner product space.

Illustration. The most familiar example of an inner product space is R^n, equipped with the Euclidean dot product as inner product. That is, for v, w ∈ R^n, we define the dot product v · w := Σ_{i=1}^n v_i w_i.

Example. Let V = C([0, 2π]), the continuous real-valued functions defined on the closed interval [0, 2π]. We make V into an inner product space by defining an inner product ⟨f, g⟩ := ∫_0^{2π} f(x) g(x) dx, for any two functions f, g ∈ V. Suppose p and q are distinct non-zero integers. Show that f(x) = sin qx and g(x) = cos px are orthogonal with respect to the inner product.

Solution: Using the identity cos px sin qx = (1/2)[sin((p + q)x) − sin((p − q)x)], we see that
\[ \langle f, g \rangle = \int_0^{2\pi} \cos px \, \sin qx \, dx = \frac{1}{2} \int_0^{2\pi} \left[ \sin((p+q)x) - \sin((p-q)x) \right] dx = 0 - 0 = 0, \]
as required.

5.2. Norms, Cauchy-Schwarz inequality.

Definition 5.2.
If V is an inner product space, then we define the norm of v ∈ V by ‖v‖ = √⟨v, v⟩, and the distance between u and v by d(u, v) = ‖u − v‖.

Theorem 5.3. (Pythagoras) If u, v ∈ V are orthogonal with respect to the inner product, then ‖u + v‖^2 = ‖u‖^2 + ‖v‖^2.

Theorem 5.4. (Cauchy-Schwarz Inequality) If u, v are vectors in an inner product space V, then |⟨u, v⟩| ≤ ‖u‖ ‖v‖.

Theorem 5.5. (Triangle Inequality) If u, v are vectors in an inner product space, then ‖u + v‖ ≤ ‖u‖ + ‖v‖.

5.3. Orthogonality, orthonormal bases.

Definition 5.6. A pair of vectors v, w in an inner product space V are called orthogonal if ⟨v, w⟩ = 0. A set W of vectors in an inner product space is called orthogonal if each pair of distinct vectors in W is orthogonal. The set W is orthonormal if it is orthogonal and each vector has unit length. Finally, a basis B which is orthonormal is called an orthonormal basis.

Theorem 5.7. Properties of orthonormal bases:
(1) If {v_1, ..., v_k} is an orthonormal basis for a subspace W ⊂ V, and if w ∈ W, then we may express w = ⟨w, v_1⟩ v_1 + ⟨w, v_2⟩ v_2 + ... + ⟨w, v_k⟩ v_k.
(2) Every nonzero subspace of a finite-dimensional inner product space V possesses an orthonormal basis (Gram-Schmidt).

Example. Confirm that the set v_1 = (2/3, 1/3, 2/3), v_2 = (1/3, 2/3, −2/3), v_3 = (2/3, −2/3, −1/3) is an orthonormal basis for R^3 equipped with the Euclidean inner product.

Solution: We leave it to you to check that v_1, v_2, v_3 are pairwise orthogonal, by computing the dot products. Further, each of these vectors has norm 1, so the set is orthonormal. Finally, an orthogonal set of nonzero vectors is linearly independent (check!), and three linearly independent vectors in R^3 form a basis, so that our set forms an orthonormal basis.

5.4. Hermitian, Unitary, and Normal Matrices.

Definition 5.8. If A is a complex matrix, then the conjugate transpose of A, denoted A^*, is defined by A^* = \overline{A}^T, where the overbar denotes complex conjugation.

Definition 5.9.
A square complex matrix A is said to be unitary if A^* = A^{-1}, and Hermitian if A^* = A.

Theorem 5.10. Suppose A is an n × n unitary complex matrix. Then
• Ax · Ay = x · y for all x, y ∈ C^n.
• The column vectors of A form an orthonormal set with respect to the complex Euclidean inner product, as do the row vectors.

Theorem 5.11. Suppose A is a Hermitian matrix. Then
• The eigenvalues of A are real numbers.
• The eigenvectors from different eigenspaces are orthogonal.

Example. Show that if A is a unitary matrix, then so is A^*.

Solution: Since A is unitary, A^{-1} = A^*, and it is left as an exercise to check that (A^*)^{-1} = (A^{-1})^*. Combining the two, (A^*)^{-1} = (A^{-1})^* = (A^*)^*, as required.

Example. Show that the determinant of a Hermitian matrix is real.

Solution: First of all we show that det(A^*) = \overline{det(A)}. By expanding the formula for the determinant, it is readily seen that det(\overline{A}) = \overline{det(A)}. Using the latter, and the fact that the determinant of a matrix is the same as that of its transpose, we find det(A^*) = det(\overline{A}^T) = det(\overline{A}) = \overline{det(A)}. Since A is Hermitian, det(A) = det(A^*) = \overline{det(A)}, and det(A) is real.

Definition 5.12. A square complex matrix A is called normal if AA^* = A^*A (a property you should check is shared by, for example, unitary and Hermitian matrices).
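The check suggested in Definition 5.12 can be tried on small concrete matrices. The following is a minimal sketch in plain Python, using the built-in complex type; the helper names conj_transpose, matmul and is_normal, and the sample matrices U, H and N, are illustrative choices rather than anything taken from these notes. The entries are chosen to be exactly representable, so equality can be tested directly; with general floating-point entries a tolerance would be needed instead.

```python
def conj_transpose(A):
    """Conjugate transpose A* of a matrix given as a list of rows."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def matmul(A, B):
    """Matrix product AB, entry (i, j) = sum over k of A[i][k] * B[k][j]."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def is_normal(A):
    """True if A A* == A* A (exact comparison, see the note above)."""
    A_star = conj_transpose(A)
    return matmul(A, A_star) == matmul(A_star, A)

U = [[0, 1j], [1j, 0]]          # unitary: U U* is the identity
H = [[2, 1 - 1j], [1 + 1j, 3]]  # Hermitian: H* equals H
N = [[1, 1], [0, 1]]            # neither unitary nor Hermitian

print(is_normal(U), is_normal(H), is_normal(N))  # prints: True True False

# The determinant of the Hermitian matrix H has zero imaginary part.
det_H = H[0][0] * H[1][1] - H[0][1] * H[1][0]
print(det_H.imag == 0)  # prints: True
```

The matrix N shows that normality is a genuine restriction: N N^* and N^* N differ in their diagonal entries.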