Review of linear algebra

1 Vectors and matrices

We will just touch very briefly on certain aspects of linear algebra, most of which should be familiar. Recall that we deal with vectors, i.e. elements of R^n, which here we will denote with boldface letters such as v, and scalars, in other words elements of R. We could also use C^n or Q^n as needed, with scalars respectively C or Q. The main point is that, for the scalars, we need to be able to add, subtract, multiply and divide (except by 0).

We can then add two vectors: if v = (v_1, …, v_n) and w = (w_1, …, w_n), then v + w = (v_1 + w_1, …, v_n + w_n). Scalar multiplication is similarly defined: given t ∈ R and v = (v_1, …, v_n) ∈ R^n, tv = (tv_1, …, tv_n). Vector addition is commutative and associative, there is a zero vector 0 = (0, …, 0), and every vector v has an additive inverse −v = (−1)v = (−v_1, …, −v_n). Scalar multiplication satisfies: for all s, t ∈ R and v ∈ R^n, s(tv) = (st)v and 1v = v. Finally, there are two analogues of the distributive law: for all s, t ∈ R and v ∈ R^n, (s + t)v = sv + tv, and for all t ∈ R and v, w ∈ R^n, t(v + w) = tv + tw. These are easily checked by using the usual properties of addition and multiplication for real numbers.

We will not discuss the standard definitions of linear independence, span, dimension or basis here. However, we will frequently use the standard basis e_1, …, e_n, where the components of e_i are 0 except for the i-th component, which is 1. Thus, any vector v = (v_1, …, v_n) can be uniquely written in terms of the standard basis:

    v = ∑_{i=1}^n v_i e_i.

There is also the dot product or scalar product or inner product of two vectors v, w ∈ R^n, which we shall call the inner product and write as ⟨v, w⟩, although if called the dot product it is usually written as v • w. Note that the product of two vectors is a scalar, whence the name scalar product.
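The vector operations above are easy to make concrete. The following is a minimal sketch (the function names are mine, not from the notes), representing a vector in R^n as a Python list of floats:

```python
def vec_add(v, w):
    """Componentwise sum: v + w = (v_1 + w_1, ..., v_n + w_n)."""
    return [vi + wi for vi, wi in zip(v, w)]

def scalar_mul(t, v):
    """Scalar multiple: tv = (t*v_1, ..., t*v_n)."""
    return [t * vi for vi in v]

def standard_basis(n, i):
    """e_i in R^n: all components 0 except the i-th (1-indexed), which is 1."""
    return [1.0 if j == i - 1 else 0.0 for j in range(n)]

# Any v decomposes uniquely as the sum over i of v_i * e_i:
v = [3.0, -1.0, 2.0]
recombined = [0.0, 0.0, 0.0]
for i, vi in enumerate(v, start=1):
    recombined = vec_add(recombined, scalar_mul(vi, standard_basis(3, i)))
assert recombined == v
```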
It is bilinear and symmetric: for all v, w, u ∈ R^n and t ∈ R,

    ⟨v + w, u⟩ = ⟨v, u⟩ + ⟨w, u⟩;    ⟨tv, w⟩ = t⟨v, w⟩;
    ⟨u, v + w⟩ = ⟨u, v⟩ + ⟨u, w⟩;    ⟨v, tw⟩ = t⟨v, w⟩;
    ⟨v, w⟩ = ⟨w, v⟩.

Of course, the third and fourth identities are a consequence of the first two and the symmetry condition. We can also define the length or norm of v: ‖v‖ = ⟨v, v⟩^{1/2}. The standard basis e_1, …, e_n is orthonormal: for all i, j,

    ⟨e_i, e_j⟩ = 0 if i ≠ j, and 1 if i = j.

Any basis u_1, …, u_n with this property, that ⟨u_i, u_j⟩ = 0 if i ≠ j and ⟨u_i, u_i⟩ = ‖u_i‖² = 1, will be called an orthonormal basis.

Our main interest will be in certain sets of matrices. Recall that an m × n matrix is a rectangular array

    A = [ a_11  a_12  …  a_1n
           ⋮     ⋮    ⋱   ⋮
          a_m1  a_m2  …  a_mn ]

We will often abbreviate this as A = (a_ij). The above matrix consists of m rows and n columns. We refer to the number a_ij as the ij-th entry; this means that a_ij is the number in the i-th row and j-th column. In particular, a vector (x_1, …, x_n) is also a matrix, in this case a 1 × n matrix. We will call such a matrix a row vector. We can also think of a vector as an n × 1 matrix, which we shall refer to as a column vector. (We will often have to think of vectors as column vectors because of our conventions on the way we write functions.)

The set of all m × n matrices is written M_{m,n}(R). M_{m,n}(C), M_{m,n}(Q), and even M_{m,n}(Z) are defined similarly. We can add two matrices in M_{m,n}(R) and multiply a matrix by a scalar. The zero matrix O = O_{m,n} ∈ M_{m,n}(R) is the matrix all of whose entries are 0. In case m = n, we abbreviate M_{n,n}(R) by M_n(R) and call such a matrix a square (n × n) matrix. An important element of M_n(R) is the identity matrix I_n = I, whose diagonal entries a_ii are equal to 1 and whose other entries a_ij, i ≠ j, are equal to 0. It is easy to see that, for all A ∈ M_{m,n}(R), I_m A = A I_n = A.
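A small illustration (my own sketch, not from the notes) of the inner product, the norm, and the orthonormality of the standard basis:

```python
import math

def inner(v, w):
    """<v, w> = sum_i v_i * w_i."""
    return sum(vi * wi for vi, wi in zip(v, w))

def norm(v):
    """||v|| = <v, v>^(1/2)."""
    return math.sqrt(inner(v, v))

e1, e2 = [1.0, 0.0], [0.0, 1.0]
assert inner(e1, e2) == 0.0    # i != j
assert inner(e1, e1) == 1.0    # i == j
assert norm([3.0, 4.0]) == 5.0

# Symmetry and bilinearity in the first slot, on sample vectors:
v, w, u, t = [1.0, 2.0], [3.0, -1.0], [0.5, 0.5], 2.0
assert inner(v, w) == inner(w, v)
assert inner([vi + wi for vi, wi in zip(v, w)], u) == inner(v, u) + inner(w, u)
assert inner([t * vi for vi in v], w) == t * inner(v, w)
```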
Given an m × n matrix A and an n × k matrix B, we can form the matrix product AB, an m × k matrix whose ij-th entry is given by ∑_{t=1}^n a_it b_tj. Thus the ij-th entry of AB is the inner product of the i-th row of A with the j-th column of B. Matrix multiplication is associative and distributes over matrix addition, where defined, but for A, B ∈ M_n(R) (the only case where AB and BA are both defined and of the same shape), it is rarely the case that AB = BA: matrix multiplication is not commutative.

Recall that a linear function F : R^n → R^m is a function F such that, for all v, w ∈ R^n and t ∈ R, F(v + w) = F(v) + F(w) and F(tv) = tF(v). A linear function is completely specified by its values on the standard basis vectors e_1, …, e_n. Conversely, given any set of vectors v_1, …, v_n ∈ R^m, there is a unique linear function F : R^n → R^m such that F(e_i) = v_i for all i, namely F(x_1, …, x_n) = ∑_i x_i v_i. In this case, recall that we can associate an m × n matrix to F as follows: write the vectors v_i = (a_1i, …, a_mi). Then to F we associate the matrix

    A = [ a_11  a_12  …  a_1n
           ⋮     ⋮    ⋱   ⋮
          a_m1  a_m2  …  a_mn ]

Here the columns of A are the vectors v_i, written vertically, and the linear map F(x_1, …, x_n) corresponds to the matrix product A · x, where A · x is the m × 1 matrix (column vector) whose j-th entry is ∑_{i=1}^n a_ji x_i. In particular A · e_i = v_i, written as a column vector; its j-th entry is a_ji, and in the equality

    A · e_i = ∑_{j=1}^m a_ji e_j

the e_i on the left is a basis vector in R^n and the e_j on the right is a basis vector in R^m. Note the reversal of the indices!

The case F : R^n → R^n corresponds to square (n × n) matrices. For example, the linear function Id_{R^n} corresponds to the identity matrix I_n. Then we have:

Proposition 1.1.
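The entry-by-entry description of the product can be sketched directly (helper names are mine): for an m × n matrix A and an n × k matrix B, (AB)_ij is the inner product of row i of A with column j of B.

```python
def mat_mul(A, B):
    """Product of an m x n matrix A with an n x k matrix B, as lists of rows."""
    m, n, k = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "inner dimensions must agree"
    return [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(k)]
            for i in range(m)]

A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
# Matrix multiplication is generally not commutative:
assert mat_mul(A, B) != mat_mul(B, A)

# Applying A to the standard basis vector e_1 (as a column) picks out column 1 of A:
e1 = [[1], [0]]
assert mat_mul(A, e1) == [[1], [3]]
```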
If G : R^k → R^n and F : R^n → R^m are linear maps, and A and B are the matrices corresponding to F and G respectively, then F ∘ G is again linear and the matrix corresponding to F ∘ G is the matrix product A · B.

This gives another, more conceptual proof of the associativity of matrix multiplication.

Let A be an m × n matrix A = (a_ij). Recall that the transpose matrix ᵗA is the n × m matrix whose ij-th entry is a_ji. For example, if A is a square (n × n) matrix, then ᵗA is the reflection of A along the diagonal running from upper left to lower right. Clearly, ᵗ(ᵗA) = A. A calculation shows that, for all standard basis vectors e_i ∈ R^m and e_j ∈ R^n,

    ⟨e_i, A e_j⟩ = ⟨ᵗA e_i, e_j⟩.

(Here of course the first inner product is of vectors in R^m and the second is of vectors in R^n.) Using bilinearity, it follows that, for all v ∈ R^m and w ∈ R^n,

    ⟨v, A w⟩ = ⟨ᵗA v, w⟩.

From this (or directly from the definitions) one can prove: if A is an m × n matrix and B is an n × k matrix, then ᵗ(AB) = ᵗB ᵗA.

2 Invertible matrices

We will write linear maps F : R^n → R^m as matrices A: F(v) = Av, with the understanding that, for the right-hand side, v must be viewed as a column vector. Define the nullspace or kernel of A to be the set {v ∈ R^n : Av = 0}. We then have the basic result:

Proposition 2.1. The linear function A : R^n → R^m is injective ⇐⇒ the nullspace of A is {0} ⇐⇒ the columns of A are linearly independent. The linear function A : R^n → R^m is surjective ⇐⇒ the columns of A span R^m.

Corollary 2.2. Let F : R^n → R^m be a linear function, corresponding to the matrix A.

(i) If F is injective, then n ≤ m.

(ii) If F is surjective, then n ≥ m.

(iii) If n = m, then F is injective ⇐⇒ F is surjective ⇐⇒ F is a bijection, and in this case the inverse function F^{−1} corresponds to a matrix, denoted A^{−1}, with the property that A A^{−1} = A^{−1} A = I_n.

We call a matrix A ∈ M_n(R) invertible if an inverse A^{−1} exists.
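The two transpose identities above, ⟨v, A w⟩ = ⟨ᵗA v, w⟩ and ᵗ(AB) = ᵗB ᵗA, can be checked numerically; this is my own sketch on small sample matrices, not part of the notes:

```python
def transpose(A):
    """The n x m transpose of an m x n matrix A (rows become columns)."""
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def mat_vec(A, x):
    """A applied to x, viewing x as a column vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def inner(v, w):
    return sum(vi * wi for vi, wi in zip(v, w))

A = [[1, 2, 0],
     [0, 1, 3]]          # a 2 x 3 matrix, so A : R^3 -> R^2
v = [1, -1]              # v in R^2
w = [2, 0, 5]            # w in R^3
# <v, A w> = <tA v, w>:
assert inner(v, mat_vec(A, w)) == inner(mat_vec(transpose(A), v), w)

B = [[1, 0],
     [2, 1],
     [0, 4]]             # a 3 x 2 matrix, so AB is defined (2 x 2)
# t(AB) = tB tA:
assert transpose(mat_mul(A, B)) == mat_mul(transpose(B), transpose(A))
```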
The problem of deciding when a given n × n matrix A is invertible can be answered by determinants. Recall that, for every n, we have a function det : M_n(R) → R with the following properties:

1. For all A, B ∈ M_n(R), det(AB) = (det A)(det B).

2. det I_n = 1.

3. A is invertible ⇐⇒ det A ≠ 0. In this case, det(A^{−1}) = (det A)^{−1}.

4. det ᵗA = det A.

Define the general linear group GL_n(R) to be the subset of M_n(R) consisting of invertible matrices. Equivalently, by (3) above, GL_n(R) = {A ∈ M_n(R) : det A ≠ 0}. The subset GL_n(R) of M_n(R) is closed under products, I_n ∈ GL_n(R), and if A ∈ GL_n(R), then by definition A^{−1} exists and A^{−1} ∈ GL_n(R); note that A^{−1} is invertible and that (A^{−1})^{−1} = A.

Define the special linear group SL_n(R) via: SL_n(R) = {A ∈ M_n(R) : det A = 1}. Clearly SL_n(R) ⊆ GL_n(R). By (1) above, SL_n(R) is closed under multiplication and by (2) above, I_n ∈ SL_n(R). Finally, if A ∈ SL_n(R), then A is invertible and A^{−1} ∈ SL_n(R) by (3).

3 Orthogonal matrices

Orthogonal matrices are invertible matrices with very special geometric properties.

Definition 3.1. A linear function A : R^n → R^n is an isometry if, for all v ∈ R^n, ‖Av‖ = ‖v‖. In other words, A preserves length.

Proposition 3.2. Given A ∈ M_n(R), the following conditions on A are equivalent.

(i) A is an isometry, i.e. for all v ∈ R^n, ‖Av‖ = ‖v‖.

(ii) For all v, w ∈ R^n, ⟨Av, Aw⟩ = ⟨v, w⟩. In other words, A preserves the inner product.

(iii) The columns of A are an orthonormal basis of R^n.

(iv) A is invertible and ᵗA = A^{−1}.

(v) The rows of A are an orthonormal basis of R^n.

Proof. (i) =⇒ (ii): This follows from the polarization identity: for all v, w ∈ R^n,

    ‖v + w‖² − ‖v − w‖² = 4⟨v, w⟩.

This in turn follows from the bilinearity and symmetry of the inner product and expansion: for example, ‖v + w‖² = ⟨v + w, v + w⟩ = ⟨v, v⟩ + 2⟨v, w⟩ + ⟨w, w⟩, and similarly for ‖v − w‖². Then, if A is an isometry,

    4⟨Av, Aw⟩ = ‖Av + Aw‖² − ‖Av − Aw‖² = ‖A(v + w)‖² − ‖A(v − w)‖² = ‖v + w‖² − ‖v − w‖² = 4⟨v, w⟩.
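Properties (1)–(3) of the determinant can be checked concretely for 2 × 2 matrices, where det [[a, b], [c, d]] = ad − bc. The following sketch (helper names mine) also illustrates that a product of SL_2 matrices stays in SL_2:

```python
from fractions import Fraction

def det2(A):
    """Determinant of a 2 x 2 matrix: ad - bc."""
    (a, b), (c, d) = A
    return a * d - b * c

def mul2(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(A):
    """Inverse of an invertible 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = A
    det = Fraction(det2(A))
    assert det != 0, "A is invertible iff det A != 0"
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2, 1], [1, 1]]     # det A = 1, so A is in SL_2
B = [[3, 5], [1, 2]]     # det B = 1 as well
assert det2(mul2(A, B)) == det2(A) * det2(B)     # property (1)
assert det2([[1, 0], [0, 1]]) == 1               # property (2)
assert det2(inv2(A)) == Fraction(1, det2(A))     # property (3)
assert det2(mul2(A, B)) == 1                     # SL_2 is closed under products
```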
Hence ⟨Av, Aw⟩ = ⟨v, w⟩.

(ii) =⇒ (i): If ⟨Av, Aw⟩ = ⟨v, w⟩ for all v, w ∈ R^n, then take v = w, so that ‖Av‖² = ⟨Av, Av⟩ = ⟨v, v⟩ = ‖v‖².

(ii) =⇒ (iii): The columns of A are equal to u_i = A e_i. By (ii), ⟨u_i, u_j⟩ = ⟨A e_i, A e_j⟩ = ⟨e_i, e_j⟩. Thus u_1, …, u_n is an orthonormal basis of R^n.

(iii) ⇐⇒ (iv): The ij-th entry of ᵗA A is the inner product ⟨u_i, u_j⟩. Hence ᵗA A = I_n ⇐⇒ ⟨u_i, u_j⟩ is 0 if i ≠ j and 1 if i = j ⇐⇒ the columns of A are an orthonormal basis of R^n.

(iv) ⇐⇒ (v): Similar to the above, using A ᵗA instead of ᵗA A.

(iv) =⇒ (ii): If ᵗA = A^{−1}, then for all v, w ∈ R^n, ⟨Av, Aw⟩ = ⟨v, ᵗA A w⟩ = ⟨v, A^{−1} A w⟩ = ⟨v, w⟩.

We see that any of the five statements in the proposition implies any other, so they are all equivalent.

Definition 3.3. A matrix A ∈ M_n(R) satisfying any (and hence all) of the equivalent properties above is called an orthogonal matrix. The set of all orthogonal n × n matrices is denoted O_n, the orthogonal group. The set of all orthogonal matrices with determinant 1 is denoted SO_n, the special orthogonal group.

Proposition 3.4. If A, B ∈ O_n, then AB ∈ O_n. Moreover I_n ∈ SO_n and hence I_n ∈ O_n. Finally, if A ∈ O_n, then A^{−1} ∈ O_n. Similar statements hold with O_n replaced by SO_n.

Proof. If A, B ∈ O_n, then ᵗ(AB) = ᵗB ᵗA = B^{−1} A^{−1} = (AB)^{−1}. Thus AB ∈ O_n. Clearly I_n ∈ SO_n. Finally, note that, in general, if A is an n × n matrix with an inverse A^{−1}, then ᵗ(A^{−1}) = (ᵗA)^{−1}, by applying the identity ᵗ(AB) = ᵗB ᵗA to the product A A^{−1} = I. Thus, if A is orthogonal, ᵗ(A^{−1}) = (ᵗA)^{−1} = (A^{−1})^{−1}. It follows that A^{−1} ∈ O_n.

The following says that there is not a big difference between O_n and SO_n:

Proposition 3.5. If A ∈ O_n, then det A = ±1.

Proof. Using ᵗA = A^{−1}, we see that det A = det ᵗA = det A^{−1} = (det A)^{−1}. Thus (det A)² = 1, so that det A = ±1.

We sometimes think of SO_n as the set of rigid motions of R^n (fixing the origin). More details about SO_2 and O_2 are in the homework.
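The 2 × 2 rotation matrix is the standard example of an element of SO_2. This sketch (function names mine) checks conditions (iii)/(iv) of Proposition 3.2 numerically, together with det A = 1:

```python
import math

def rotation(theta):
    """Rotation of R^2 by angle theta, as a 2 x 2 matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s],
            [s,  c]]

def mat_mul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

A = rotation(math.pi / 3)
AtA = mat_mul(transpose(A), A)
# tA A = I_2 up to floating-point error, i.e. tA = A^(-1):
for i in range(2):
    for j in range(2):
        expected = 1.0 if i == j else 0.0
        assert abs(AtA[i][j] - expected) < 1e-12

# det A = cos^2(theta) + sin^2(theta) = 1, so A lies in SO_2:
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
assert abs(det - 1.0) < 1e-12
```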