MATH10212 • Linear Algebra • Brief lecture notes

Matrix Operations

Definition. A matrix is a rectangular array of numbers called the entries, or elements, of the matrix. The following are all examples of matrices:
\[
\begin{bmatrix} 2 & 1 \\ 0 & 3 \end{bmatrix}, \quad
\begin{bmatrix} \sqrt{2} & 5 & -1 \\ 2 & \pi & 21 \end{bmatrix}, \quad
\begin{bmatrix} 0 \\ 4 \\ 17 \end{bmatrix}, \quad
\begin{bmatrix} 1 & 1 & 1 & 1 \end{bmatrix}, \quad
\begin{bmatrix} 5.1 & 6.9 & -7.3 \\ 1.2 & -1 & 0 \\ 4.4 & 9 & 8.5 \end{bmatrix}, \quad
\begin{bmatrix} 7 \end{bmatrix}
\]
The size of a matrix is a description of the numbers of rows and columns it has. A matrix is called $m \times n$ (pronounced "$m$ by $n$") if it has $m$ rows and $n$ columns. Thus, the examples above are matrices of sizes $2 \times 2$, $2 \times 3$, $3 \times 1$, $1 \times 4$, $3 \times 3$ and $1 \times 1$, respectively. A $1 \times m$ matrix is called a row matrix, and an $n \times 1$ matrix is called a column matrix (or column vector).

We use double-subscript notation to refer to the entries of a matrix $A$. The entry of $A$ in row $i$ and column $j$ is denoted by $a_{ij}$. Thus, if
\[
A = \begin{bmatrix} 3 & 9 & -1 \\ 0 & 5 & 4 \end{bmatrix}
\]
then $a_{13} = -1$ and $a_{22} = 5$. We can therefore compactly denote a matrix $A$ by $[a_{ij}]$ (or $[a_{ij}]_{m \times n}$ if it is important to specify the size of $A$, although the size will usually be clear from the context). With this notation, a general $m \times n$ matrix $A$ has the form
\[
A = \begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\]
If the columns of $A$ are the vectors $\vec{a}_1, \vec{a}_2, \ldots, \vec{a}_n$, then we may represent $A$ as
\[
A = [\vec{a}_1 \ \vec{a}_2 \ \ldots \ \vec{a}_n]
\]
If the rows of $A$ are $\vec{A}_1, \vec{A}_2, \ldots, \vec{A}_m$, then we may represent $A$ as
\[
A = \begin{bmatrix} \vec{A}_1 \\ \vec{A}_2 \\ \vdots \\ \vec{A}_m \end{bmatrix}
\]
The diagonal entries of $A$ are $a_{11}, a_{22}, a_{33}, \ldots$, and if $m = n$ (that is, if $A$ has the same number of rows as columns), then $A$ is called a square matrix.

A square matrix whose nondiagonal entries are all zero is called a diagonal matrix. A diagonal matrix all of whose diagonal entries are the same is called a scalar matrix. If the scalar on the diagonal is 1, the scalar matrix is called an identity matrix.
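The double-subscript notation can be tried out numerically. The following sketch uses Python with NumPy (an aside for experimentation, not part of the notes); note that NumPy indexes from 0, so the textbook entry $a_{ij}$ lives at `A[i-1, j-1]`.

```python
import numpy as np

# The 2 x 3 matrix A from the text, whose entry a13 is -1 and a22 is 5.
A = np.array([[3, 9, -1],
              [0, 5, 4]])

print(A.shape)     # (2, 3): m rows by n columns, i.e. a 2 x 3 matrix
# NumPy indexes from 0, so the textbook entry a_ij is A[i-1, j-1]:
print(A[0, 2])     # a13 = -1
print(A[1, 1])     # a22 = 5
print(np.diag(A))  # the diagonal entries a11, a22: [3 5]
```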
For example, let
\[
A = \begin{bmatrix} 2 & 5 & -1 \\ 0 & 4 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 3 & 4 \\ 1 & 5 \end{bmatrix}, \quad
C = \begin{bmatrix} 6 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \quad
D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
The diagonal entries of $A$ are 2 and 4, but $A$ is not square; $B$ is a square matrix of size $2 \times 2$ with diagonal entries 3 and 5; $C$ is a diagonal matrix; $D$ is the $3 \times 3$ identity matrix. The $n \times n$ identity matrix is denoted by $I_n$ (or simply $I$ if its size is understood).

We can view matrices as generalizations of vectors. Indeed, matrices can and should be thought of as being made up of both row and column vectors. (Moreover, an $m \times n$ matrix can also be viewed as a single "wrapped vector" of length $mn$.) Many of the conventions and operations for vectors carry through (in an obvious way) to matrices.

Two matrices are equal if they have the same size and if their corresponding entries are equal. Thus, if $A = [a_{ij}]_{m \times n}$ and $B = [b_{ij}]_{r \times s}$, then $A = B$ if and only if $m = r$ and $n = s$ and $a_{ij} = b_{ij}$ for all $i$ and $j$.

Example 3.1. Consider the matrices
\[
A = \begin{bmatrix} 2 & 0 \\ 5 & 3 \end{bmatrix}, \quad
B = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad
C = \begin{bmatrix} 2 & 0 & x \\ 5 & 3 & y \end{bmatrix}
\]
Neither $A$ nor $B$ can be equal to $C$ (no matter what the values of $x$ and $y$), since $A$ and $B$ are $2 \times 2$ matrices and $C$ is $2 \times 3$. However, $A = B$ if and only if $a = 2$, $b = 0$, $c = 5$ and $d = 3$.

Matrix Addition and Scalar Multiplication

We define matrix addition componentwise. If $A = [a_{ij}]$ and $B = [b_{ij}]$ are $m \times n$ matrices, their sum $A + B$ is the $m \times n$ matrix obtained by adding the corresponding entries:
\[
A + B = [a_{ij} + b_{ij}]
\]
If $A$ and $B$ are not the same size, then $A + B$ is not defined.

Example 3.3. Let
\[
A = \begin{bmatrix} 1 & 4 & 0 \\ -2 & 6 & 5 \end{bmatrix}, \quad
B = \begin{bmatrix} -3 & 1 & -1 \\ 3 & 0 & 2 \end{bmatrix}, \quad
C = \begin{bmatrix} 4 & 3 \\ 2 & 1 \end{bmatrix}
\]
Then
\[
A + B = \begin{bmatrix} -2 & 5 & -1 \\ 1 & 6 & 7 \end{bmatrix}
\]
but neither $A + C$ nor $B + C$ is defined.

If $A$ is an $m \times n$ matrix and $c$ is a scalar, then the scalar multiple $cA$ is the $m \times n$ matrix obtained by multiplying each entry of $A$ by $c$. More formally, we have
\[
cA = c[a_{ij}] = [ca_{ij}]
\]
[In terms of vectors, we could equivalently have stipulated that each column (or row) of $cA$ is $c$ times the corresponding column (or row) of $A$.]
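Componentwise addition and the equal-size requirement can be checked numerically; a small NumPy sketch (an aside, not part of the notes), using the matrices of Example 3.3:

```python
import numpy as np

# A, B and C from Example 3.3: addition is entrywise and needs equal sizes.
A = np.array([[1, 4, 0],
              [-2, 6, 5]])
B = np.array([[-3, 1, -1],
              [3, 0, 2]])
C = np.array([[4, 3],
              [2, 1]])

print(A + B)   # [[-2  5 -1], [ 1  6  7]], matching the example
print(2 * A)   # scalar multiple: every entry of A is doubled
# A + C is not defined: a 2 x 3 and a 2 x 2 matrix cannot be added,
# and NumPy raises a ValueError if we try.
```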
Example 3.4. For matrix $A$ in Example 3.3,
\[
2A = \begin{bmatrix} 2 & 8 & 0 \\ -4 & 12 & 10 \end{bmatrix}, \quad
\tfrac{1}{2}A = \begin{bmatrix} \tfrac{1}{2} & 2 & 0 \\ -1 & 3 & \tfrac{5}{2} \end{bmatrix}, \quad
(-1)A = \begin{bmatrix} -1 & -4 & 0 \\ 2 & -6 & -5 \end{bmatrix}
\]
The matrix $(-1)A$ is written as $-A$ and called the negative of $A$. As with vectors, we can use this fact to define the difference of two matrices: if $A$ and $B$ are the same size, then
\[
A - B = A + (-B)
\]

Example 3.5. For matrices $A$ and $B$ in Example 3.3,
\[
A - B = \begin{bmatrix} 1 & 4 & 0 \\ -2 & 6 & 5 \end{bmatrix}
- \begin{bmatrix} -3 & 1 & -1 \\ 3 & 0 & 2 \end{bmatrix}
= \begin{bmatrix} 4 & 3 & 1 \\ -5 & 6 & 3 \end{bmatrix}
\]

A matrix all of whose entries are zero is called a zero matrix and denoted by $O$ (or $O_{m \times n}$ if it is important to specify its size). It should be clear that if $A$ is any matrix and $O$ is the zero matrix of the same size, then
\[
A + O = A = O + A \quad \text{and} \quad A - A = O = -A + A
\]

Matrix Multiplication

Definition. If $A$ is an $m \times n$ matrix and $B$ is an $n \times r$ matrix, then the product $C = AB$ is an $m \times r$ matrix. The $(i, j)$ entry of the product is computed as follows:
\[
c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}
\]

Remarks

• Notice that $A$ and $B$ need not be the same size. However, the number of columns of $A$ must be the same as the number of rows of $B$. If we write the sizes of $A$, $B$ and $AB$ in order, we can see at a glance whether this requirement is satisfied. Moreover, we can predict the size of the product before doing any calculations, since the number of rows of $AB$ is the same as the number of rows of $A$, while the number of columns of $AB$ is the same as the number of columns of $B$:
\[
\underbrace{A}_{m \times n}\ \underbrace{B}_{n \times r} = \underbrace{AB}_{m \times r}
\]

• The formula for the entries of the product looks like a dot product, and indeed it is. It says that the $(i, j)$ entry of the matrix $AB$ is the dot product of the $i$th row of $A$ and the $j$th column of $B$:
\[
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
\vdots & \vdots & & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{in} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix}
b_{11} & \cdots & b_{1j} & \cdots & b_{1r} \\
b_{21} & \cdots & b_{2j} & \cdots & b_{2r} \\
\vdots & & \vdots & & \vdots \\
b_{n1} & \cdots & b_{nj} & \cdots & b_{nr}
\end{bmatrix}
\]
Notice that in the expression $c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj}$ the "outer subscripts" on each $ab$ term in the sum are always $i$ and $j$, whereas the "inner subscripts" always agree and increase from 1 to $n$. We see this pattern clearly if we write $c_{ij}$ using summation notation:
\[
c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}
\]

• Matrix multiplication is not commutative: $AB \neq BA$ in general, even if both $AB$ and $BA$ are defined. E.g.,
\[
\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 2 & 1 \\ 0 & 1 \end{bmatrix}
\neq
\begin{bmatrix} 2 & 2 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
\]

• There is no cancellation rule in general:
\[
\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
= \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
\begin{bmatrix} 2 & -2 \\ -2 & 2 \end{bmatrix},
\quad \text{but} \quad
\begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}
\neq
\begin{bmatrix} 2 & -2 \\ -2 & 2 \end{bmatrix}
\]

• Helpful hint on multiplication of matrices: if the left factor $A$ is "sparse", or/and contains "simple" elements like many 0s or 1s, then it may be advantageous to compute the product $AB$ as follows: the $i$th row of $AB$ is a linear combination of the rows of $B$ taken with coefficients given by the $i$th row of $A$. E.g.,
\[
\begin{bmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & -1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 7 \\ -3 & 2 & 5 \\ 2 & 3 & 2 \\ -2 & -1 & 2 \end{bmatrix}
= \begin{bmatrix} 3 & 5 & 9 \\ 3 & 3 & 5 \end{bmatrix};
\]
the first row of $AB$ is the sum of the 1st and 3rd rows of $B$, because we have $1, 0, 1, 0$ in the first row of $A$; the second row of $AB$ is the 1st row of $B$ minus the 4th row of $B$, because we have $1, 0, 0, -1$ in the second row of $A$. Actually, this rule can always be applied, but it is particularly effective when $A$ is "easier" than $B$. (Note: "rows ↔ left" works here again.) Similarly, if the right matrix $B$ is better, then the $k$th column of $AB$ is obtained as a linear combination of the columns of $A$ taken with coefficients given in the $k$th column of $B$.

• The real justification of the definition of the product of matrices (which may seem at first glance rather arbitrary, "too complicated") is connected with so-called linear transformations, which we shall study later. These transformations are mappings (functions) from $\mathbb{R}^n$ to $\mathbb{R}^m$; in coordinates they are given by matrices.
If $\mathbb{R}^m \xrightarrow{\ \varphi\ } \mathbb{R}^n \xrightarrow{\ \psi\ } \mathbb{R}^s$ are two such transformations whose matrices are $A$, $B$, then the resulting, composite, transformation $\vec{a} \mapsto \psi(\varphi(\vec{a}))$ is given precisely by the product $AB$.

• Matrix multiplication is associative: $A(BC) = (AB)C$ (of course, when these products are defined, i.e. with correct sizes $A_{m \times n}$, $B_{n \times s}$, $C_{s \times t}$). This is a very important property, which is very far from being obvious (as the definition of products is so complicated...). The proof of this property is deferred until we study linear transformations later. In fact, then it will become quite easy to prove: if $A$, $B$, $C$ are the matrices for $\mathbb{R}^m \xrightarrow{\ \varphi\ } \mathbb{R}^n \xrightarrow{\ \psi\ } \mathbb{R}^s \xrightarrow{\ \chi\ } \mathbb{R}^t$, then the matrix $A(BC) = (AB)C$ is simply the matrix of the composite transformation $\vec{a} \mapsto \chi(\psi(\varphi(\vec{a})))$...

Theorem 3.1. Let $A$ be an $m \times n$ matrix, $\vec{e}_i$ a $1 \times m$ standard unit vector, and $\vec{e}_j$ an $n \times 1$ standard unit vector. Then
(a) $\vec{e}_i A$ is the $i$th row of $A$, and
(b) $A\vec{e}_j$ is the $j$th column of $A$.

Proof. We prove (b) and leave proving (a) as an exercise. If $\vec{a}_1, \ldots, \vec{a}_n$ are the columns of $A$, then the product $A\vec{e}_j$ can be written
\[
A\vec{e}_j = 0\vec{a}_1 + 0\vec{a}_2 + \cdots + 1\vec{a}_j + \cdots + 0\vec{a}_n = \vec{a}_j
\]
We could also prove (b) by direct calculation:
\[
A\vec{e}_j =
\begin{bmatrix}
a_{11} & \cdots & a_{1j} & \cdots & a_{1n} \\
a_{21} & \cdots & a_{2j} & \cdots & a_{2n} \\
\vdots & & \vdots & & \vdots \\
a_{m1} & \cdots & a_{mj} & \cdots & a_{mn}
\end{bmatrix}
\begin{bmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{bmatrix}
= \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix}
\]
since the 1 in $\vec{e}_j$ is the $j$th entry.

Partitioned Matrices

It will often be convenient to regard a matrix as being composed of a number of smaller submatrices. By introducing vertical and horizontal lines into a matrix, we can partition it into blocks. There is a natural way to partition many matrices, particularly those arising in certain applications. For example, consider the matrix
\[
A = \begin{bmatrix}
1 & 0 & 0 & 2 & -1 \\
0 & 1 & 0 & 1 & 3 \\
0 & 0 & 1 & 4 & 0 \\
0 & 0 & 0 & 1 & 7 \\
0 & 0 & 0 & 7 & 2
\end{bmatrix}
\]
It seems natural to partition $A$ as
\[
A = \left[\begin{array}{ccc|cc}
1 & 0 & 0 & 2 & -1 \\
0 & 1 & 0 & 1 & 3 \\
0 & 0 & 1 & 4 & 0 \\
\hline
0 & 0 & 0 & 1 & 7 \\
0 & 0 & 0 & 7 & 2
\end{array}\right]
= \begin{bmatrix} I & B \\ O & C \end{bmatrix}
\]
where $I$ is the $3 \times 3$ identity matrix, $B$ is $3 \times 2$, $O$ is the $2 \times 3$ zero matrix, and $C$ is $2 \times 2$.
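The entrywise definition $c_{ij} = \sum_k a_{ik}b_{kj}$ and Theorem 3.1 can both be illustrated in code. A NumPy sketch (an aside, not part of the notes; the helper `product` is hypothetical, written only to mirror the definition — real code would use NumPy's `@` operator):

```python
import numpy as np

def product(A, B):
    """Compute AB directly from the definition
    c_ij = a_i1 b_1j + ... + a_in b_nj (a sketch, not how NumPy does it)."""
    m, n = A.shape
    n2, r = B.shape
    assert n == n2, "number of columns of A must equal number of rows of B"
    C = np.zeros((m, r), dtype=A.dtype)
    for i in range(m):
        for j in range(r):
            C[i, j] = sum(A[i, k] * B[k, j] for k in range(n))
    return C

# Non-commutativity example from the Remarks:
A = np.array([[1, 1], [0, 1]])
B = np.array([[2, 0], [0, 1]])
print(product(A, B))   # [[2 1], [0 1]]
print(product(B, A))   # [[2 2], [0 1]]  -- so AB != BA

# Theorem 3.1: e_i A is the ith row of A, and A e_j is the jth column.
M = np.array([[1, 2, 3], [4, 5, 6]])
e2 = np.array([[0, 1]])           # 1 x m standard unit vector, i = 2
f3 = np.array([[0], [0], [1]])    # n x 1 standard unit vector, j = 3
print(product(e2, M))  # [[4 5 6]]: the 2nd row of M
print(product(M, f3))  # [[3], [6]]: the 3rd column of M
```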
In this way, we can view $A$ as a $2 \times 2$ matrix whose entries are themselves matrices. When matrices are being multiplied, there is often an advantage to be gained by viewing them as partitioned matrices. Not only does this frequently reveal underlying structures, but it often speeds up computation, especially when the matrices are large and have many blocks of zeros. It turns out that the multiplication of partitioned matrices is just like ordinary matrix multiplication.

We begin by considering some special cases of partitioned matrices. Each gives rise to a different way of viewing the product of two matrices. Suppose $A$ is $m \times n$ and $B$ is $n \times r$, so the product $AB$ exists. If we partition $B$ in terms of its column vectors, as
\[
B = [\vec{b}_1 \mid \vec{b}_2 \mid \cdots \mid \vec{b}_r]
\]
(here the vertical bars are used simply as dividers, as in the textbook), then
\[
AB = A[\vec{b}_1 \mid \vec{b}_2 \mid \cdots \mid \vec{b}_r]
= [A\vec{b}_1 \mid A\vec{b}_2 \mid \cdots \mid A\vec{b}_r]
\]
This result is an immediate consequence of the definition of matrix multiplication. The form on the right is called the matrix-column representation of the product. In fact, we saw this already in the Helpful Hint: the $j$th column of the product is $A\vec{b}_j$, and clearly it is the linear combination of the columns of $A$ with coefficients given by $\vec{b}_j$. For example,
\[
\begin{bmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 5 \\ 6 & 7 & 8 & 9 \end{bmatrix}
\begin{bmatrix} -1 & 1 \\ 0 & 1 \\ 1 & 1 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 2 & 10 \\ 2 & 14 \\ 2 & 30 \end{bmatrix}
\]
(The first column of this product $C = AB$ is the 3rd column of $A$ minus the 1st column of $A$: that is, $\vec{c}_1 = -1\vec{a}_1 + 1\vec{a}_3$, with the coefficients given by $\vec{b}_1$; the second column of $AB$ is $\vec{c}_2 = 1\vec{a}_1 + 1\vec{a}_2 + 1\vec{a}_3 + 1\vec{a}_4$, i.e. the sum of all columns of $A$.)

Suppose $A$ is $m \times n$ and $B$ is $n \times r$, so the product $AB$ exists. If we partition $A$ in terms of its row vectors, then
\[
AB = \begin{bmatrix} \vec{A}_1 \\ \vec{A}_2 \\ \vdots \\ \vec{A}_m \end{bmatrix} B
= \begin{bmatrix} \vec{A}_1 B \\ \vec{A}_2 B \\ \vdots \\ \vec{A}_m B \end{bmatrix}
\]
Once again, this result is a direct consequence of the definition of matrix multiplication. The form on the right is called the row-matrix representation of the product. Again, we saw this already in the Helpful Hint: the $i$th row of $AB$ is $\vec{A}_i B$, and clearly it is the linear combination of the rows of $B$ with coefficients given by $\vec{A}_i$.

Matrix Powers. When $A$ and $B$ are two $n \times n$ matrices, their product $AB$ will also be an $n \times n$ matrix. A special case occurs when $A = B$. It makes sense to define $A^2 = AA$ and, in general, to define
\[
A^k = \underbrace{AA \cdots A}_{k \text{ times}}
\]
if $k$ is a positive integer. Thus $A^1 = A$, and it is convenient to define $A^0 = I_n$.

If $A$ is a square matrix and $r$ and $s$ are nonnegative integers, then
1. $A^r A^s = A^{r+s}$
2. $(A^r)^s = A^{rs}$

Definition. The transpose of an $m \times n$ matrix $A$ is the $n \times m$ matrix $A^T$ obtained by interchanging the rows and columns of $A$. That is, the $i$th column of $A^T$ is the $i$th row of $A$ for all $i$.

The transpose is sometimes used to give an alternative definition of the dot product of two vectors in terms of matrix multiplication. If
\[
\vec{u} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}
\quad \text{and} \quad
\vec{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}
\]
then
\[
\vec{u} \cdot \vec{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n
= \begin{bmatrix} u_1 & u_2 & \ldots & u_n \end{bmatrix}
\begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}
= \vec{u}^T \vec{v}
\]
(the dot product on the left, a matrix product on the right).

A useful alternative definition of the transpose is given componentwise:
\[
(A^T)_{ij} = A_{ji} \quad \text{for all } i \text{ and } j
\]

Definition. A square matrix $A$ is symmetric if $A^T = A$, that is, if $A$ is equal to its own transpose. A square matrix $A$ is symmetric if and only if $A_{ij} = A_{ji}$ for all $i$ and $j$.

Matrix Algebra

Theorem 3.2. (Algebraic Properties of Matrix Addition and Scalar Multiplication) Let $A$, $B$ and $C$ be matrices of the same size and let $c$ and $d$ be scalars. Then
a. $A + B = B + A$ (Commutativity)
b. $(A + B) + C = A + (B + C)$ (Associativity)
c. $A + O = A$
d. $A + (-A) = O$
e. $c(A + B) = cA + cB$ (Distributivity)
f. $(c + d)A = cA + dA$ (Distributivity)
g. $c(dA) = (cd)A$
h. $1A = A$

We say that matrices $A_1, A_2, \ldots, A_k$ of the same size are linearly independent if the only solution of the equation
\[
c_1 A_1 + c_2 A_2 + \cdots + c_k A_k = O
\]
is the trivial one: $c_1 = c_2 = \cdots = c_k = 0$; otherwise they are linearly dependent. Questions like "Is matrix $B$ equal to a linear combination of given matrices $A_1, \ldots, A_k$?" or "Are given matrices $A_1, \ldots, A_k$ linearly dependent?" can be answered by using linear systems just as for vectors (here regarding matrices as "wrapped" vectors).

Theorem 3.3. (Properties of Matrix Multiplication) Let $A$, $B$ and $C$ be matrices (whose sizes are such that the indicated operations can be performed) and let $k$ be a scalar. Then
a. $(AB)C = A(BC)$ (Associativity)
b. $A(B + C) = AB + AC$ (Left distributivity)
c. $(A + B)C = AC + BC$ (Right distributivity)
d. $k(AB) = (kA)B = A(kB)$
e. $I_m A = A = A I_n$ if $A$ is $m \times n$ (Multiplicative identity)

Proof. (a) Let $A$ be an $m \times n$ matrix, $B$ an $n \times s$ matrix, and $C$ an $s \times t$ matrix (for the products to be defined). First, check that the sizes are the same on the left and on the right. Left: $AB$ is $m \times s$, so $(AB)C$ is $m \times t$. Right: $BC$ is $n \times t$, so $A(BC)$ is $m \times t$.

We now check that every $ij$th entry is the same. Starting from the left, by the definition of the matrix product,
\[
((AB)C)_{ij} = \sum_{k=1}^{s} (AB)_{ik} C_{kj}
= \sum_{k=1}^{s} \Big( \sum_{l=1}^{n} A_{il} B_{lk} \Big) C_{kj}
\]
Using distributivity for numbers, expand the brackets:
\[
= \sum_{k=1}^{s} \sum_{l=1}^{n} A_{il} B_{lk} C_{kj}
\]
Change the order of summation, which is OK since it is simply the sum over all pairs $k, l$ either way:
\[
= \sum_{l=1}^{n} \sum_{k=1}^{s} A_{il} B_{lk} C_{kj}
\]
Collecting terms,
\[
= \sum_{l=1}^{n} A_{il} \Big( \sum_{k=1}^{s} B_{lk} C_{kj} \Big)
\]
Now the inner sum is $(BC)_{lj}$ by the definition of the product, so this equals
\[
\sum_{l=1}^{n} A_{il} (BC)_{lj} = (A(BC))_{ij},
\]
the $ij$th entry on the right. There is also another way to prove the associativity (later, in connection with linear transformations).

We also prove (b) and half of (e). The remaining properties are considered in the exercises in the textbook.
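The index computation in the proof of Theorem 3.3(a) can be spot-checked numerically; a NumPy sketch (an aside, with sizes $2 \times 3$, $3 \times 4$, $4 \times 2$ chosen arbitrarily for this illustration):

```python
import numpy as np

# A numeric spot-check of Theorem 3.3(a), (AB)C = A(BC), with
# A m x n, B n x s, C s x t as in the proof. Integer entries keep
# the comparison exact.
rng = np.random.default_rng(0)
A = rng.integers(-5, 5, size=(2, 3))
B = rng.integers(-5, 5, size=(3, 4))
C = rng.integers(-5, 5, size=(4, 2))

left = (A @ B) @ C
right = A @ (B @ C)
print((left == right).all())   # True: both are the same 2 x 2 matrix

# The key step of the proof: either side equals the double sum
# over all pairs k, l of A_il B_lk C_kj.
i, j = 0, 1
double_sum = sum(A[i, l] * B[l, k] * C[k, j]
                 for l in range(3) for k in range(4))
print(double_sum == left[i, j])   # True
```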
(b) To prove $A(B + C) = AB + AC$, we let the rows of $A$ be denoted by $\vec{A}_i$ and the columns of $B$ and $C$ by $\vec{b}_j$ and $\vec{c}_j$. Then the $j$th column of $B + C$ is $\vec{b}_j + \vec{c}_j$ (since addition is defined componentwise), and thus
\[
[A(B + C)]_{ij} = \vec{A}_i \cdot (\vec{b}_j + \vec{c}_j)
= \vec{A}_i \cdot \vec{b}_j + \vec{A}_i \cdot \vec{c}_j
= (AB)_{ij} + (AC)_{ij} = (AB + AC)_{ij}
\]
Since this is true for all $i$ and $j$, we must have $A(B + C) = AB + AC$.

(e) To prove $AI_n = A$, we note that the identity matrix $I_n$ can be column-partitioned as
\[
I_n = [\vec{e}_1 \mid \vec{e}_2 \mid \cdots \mid \vec{e}_n]
\]
where $\vec{e}_i$ is a standard unit vector. Therefore,
\[
AI_n = [A\vec{e}_1 \mid A\vec{e}_2 \mid \cdots \mid A\vec{e}_n]
= [\vec{a}_1 \mid \vec{a}_2 \mid \cdots \mid \vec{a}_n] = A
\]
by Theorem 3.1(b).

Theorem 3.4. (Properties of the Transpose) Let $A$ and $B$ be matrices (whose sizes are such that the indicated operations can be performed) and let $k$ be a scalar. Then
a. $(A^T)^T = A$
b. $(A + B)^T = A^T + B^T$
c. $(kA)^T = kA^T$
d. $(AB)^T = B^T A^T$
e. $(A^r)^T = (A^T)^r$ for all nonnegative integers $r$

Theorem 3.5.
a. If $A$ is a square matrix, then $A + A^T$ is a symmetric matrix.
b. For any matrix $A$, $AA^T$ and $A^T A$ are symmetric matrices.

The Inverse of a Matrix

Definition. If $A$ is an $n \times n$ matrix, an inverse of $A$ is an $n \times n$ matrix $A'$ with the property that
\[
AA' = I \quad \text{and} \quad A'A = I
\]
where $I = I_n$ is the $n \times n$ identity matrix. If such an $A'$ exists, then $A$ is called invertible.

Examples. 1)
\[
\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

2) Having an inverse helps: e.g. to solve the equation $AX = B$, where $A$, $X$, $B$ are matrices. If there is $A^{-1}$, then simply $X = A^{-1}B$, after we multiply both parts by $A^{-1}$ on the left and use associativity: $A^{-1}(AX) = A^{-1}B$; $(A^{-1}A)X = A^{-1}B$; $X = A^{-1}B$. E.g.,
\[
\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} X_{2 \times 2}
= \begin{bmatrix} 3 & 4 \\ 5 & 6 \end{bmatrix}
\ \Rightarrow \
X = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 3 & 4 \\ 5 & 6 \end{bmatrix}
= \begin{bmatrix} -2 & -2 \\ 5 & 6 \end{bmatrix}
\]
is the unique solution.

3) The matrix $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$ has no inverse: for any matrix $B_{2 \times 2}$ the rows of $AB$ are equal (by the Helpful Hint both are simply the sum of the rows of $B$), so $AB$ cannot be $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, whose rows are not equal.

Theorem 3.6. If $A$ is an invertible matrix, then its inverse is unique.

From now on, when $A$ is invertible, we will denote its (unique) inverse by $A^{-1}$.

Matrix form of a linear system. An arbitrary linear system
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n &= b_2 \\
&\ \ \vdots \\
a_{m1}x_1 + a_{m2}x_2 + \ldots + a_{mn}x_n &= b_m
\end{aligned}
\]
can be viewed as the equality of two big column vectors:
\[
\begin{bmatrix}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n \\
a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n \\
\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \ldots + a_{mn}x_n
\end{bmatrix}
= \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}
\]
The left column in turn is obviously the sum
\[
\begin{bmatrix} a_{11}x_1 \\ a_{21}x_1 \\ \vdots \\ a_{m1}x_1 \end{bmatrix}
+ \begin{bmatrix} a_{12}x_2 \\ a_{22}x_2 \\ \vdots \\ a_{m2}x_2 \end{bmatrix}
+ \cdots
+ \begin{bmatrix} a_{1n}x_n \\ a_{2n}x_n \\ \vdots \\ a_{mn}x_n \end{bmatrix}
= \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} x_1
+ \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} x_2
+ \cdots
+ \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} x_n
\]
By our Helpful Hint on multiplication, the left-hand column is therefore the product of two matrices:
\[
\begin{bmatrix}
a_{11} & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} & \ldots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \ldots & a_{mn}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
= \begin{bmatrix}
a_{11}x_1 + a_{12}x_2 + \ldots + a_{1n}x_n \\
a_{21}x_1 + a_{22}x_2 + \ldots + a_{2n}x_n \\
\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \ldots + a_{mn}x_n
\end{bmatrix}
\]
We can now write the linear system in the matrix form $A\vec{x} = \vec{b}$, where
\[
A = \begin{bmatrix}
a_{11} & a_{12} & \ldots & a_{1n} \\
a_{21} & a_{22} & \ldots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{m1} & a_{m2} & \ldots & a_{mn}
\end{bmatrix}
\]
is the coefficient matrix of the system,
\[
\vec{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\]
is the column matrix (vector) of variables, and
\[
\vec{b} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}
\]
is the column matrix of constant terms (right-hand sides).

Theorem 3.7. If $A$ is an invertible $n \times n$ matrix, then the system of linear equations given by $A\vec{x} = \vec{b}$ has the unique solution $\vec{x} = A^{-1}\vec{b}$ for any $\vec{b}$ in $\mathbb{R}^n$.

The following theorem will be superseded later by more general methods, but it is useful in small examples.

Theorem 3.8. If $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, then $A$ is invertible if and only if $ad - bc \neq 0$, in which case
\[
A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
\]
(So, if $ad - bc = 0$, then $A$ is not invertible.) The expression $ad - bc$ is called the determinant of $A$ (in this special $2 \times 2$ case), denoted $\det A$.
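The formula of Theorem 3.8, together with Theorem 3.7, can be sketched in code (an aside, not part of the notes; the helper `inverse_2x2` is hypothetical, written only to mirror the formula):

```python
import numpy as np

def inverse_2x2(A):
    """Inverse of a 2 x 2 matrix [[a, b], [c, d]] via Theorem 3.8;
    returns None when det A = ad - bc is 0 (A is not invertible)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        return None
    return (1 / det) * np.array([[d, -b],
                                 [-c, a]])

A = np.array([[1, 2],
              [3, 4]])
A_inv = inverse_2x2(A)   # det A = 1*4 - 2*3 = -2, nonzero
print(A_inv @ A)         # the 2 x 2 identity matrix (up to rounding)

# Theorem 3.7: A x = b has the unique solution x = A^{-1} b.
b = np.array([[5], [6]])
print(A_inv @ b)         # [[-4.], [4.5]]

# A matrix with zero determinant has no inverse:
print(inverse_2x2(np.array([[1, 1], [1, 1]])))  # None
```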