* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Matrix Operations - Tonga Institute of Higher Education
Tensor operator wikipedia , lookup
Capelli's identity wikipedia , lookup
Bra–ket notation wikipedia , lookup
Quadratic form wikipedia , lookup
Factorization of polynomials over finite fields wikipedia , lookup
Cartesian tensor wikipedia , lookup
Linear algebra wikipedia , lookup
Rotation matrix wikipedia , lookup
Eigenvalues and eigenvectors wikipedia , lookup
Jordan normal form wikipedia , lookup
Symmetry in quantum mechanics wikipedia , lookup
Determinant wikipedia , lookup
System of linear equations wikipedia , lookup
Four-vector wikipedia , lookup
Matrix (mathematics) wikipedia , lookup
Singular-value decomposition wikipedia , lookup
Perron–Frobenius theorem wikipedia , lookup
Non-negative matrix factorization wikipedia , lookup
Matrix calculus wikipedia , lookup
Tonga Institute of Higher Education Design and Analysis of Algorithms IT 254 Lecture 7: Matrix Operations Objectives • After this chapter, you will know: – Matrix review – Matrix addition – Matrix multiplication – Solving systems of linear equations – How to implement the above in programming Matrix Operations • Operations on matrices are the heart of scientific computing. That means efficient algorithms for multiplying matrices or solving linear equations are very useful. • Some brief review of matrices: – A matrix – 1 3 4 2 5 1 – – – – Column Vector is an array of arrays (or a rectangular array) 7 This is a 3x3 matrix 1 5 A column vector is one column of a matrix A row vector is one row of a matrix A "unit vector" is a vector where the ith element is 1 and all other spots are 0 A "identity matrix" has ajk = 0 if i != j 1 1 Unit Vector 1 0 0 1 3 7 4 0 Identity 0 1 0 Row Vector Matrix 5 0 0 0 1 Operations on Matrices • Addition – If A and B are m x n matrices, then the sum is C = A + B, which is defined by n m • Multiplication C j 0 k 0 a jk b jk – If A is "m x n" matrix and B is a "n x p" matrix, then product is C, a "m x p" matrix, where C = A*B and can be found by multiplying each item in the row of the first matrix by each item in the column of the second matrix, summing up the value and saving as a new item – Example: 1 5 2 7 31 5 3 1 * 7 3 1 0 22 11 3 1 1 2 6 1 6 2 44 22 6 2 Matrix Properties • Multiplication: Generally, speaking we can say multiplication works like: a b w x aw by ax bz c d * y z cw dy cx dz – There are also properties that matrix multiplication holds, such as: • A*0 = 0 • A(BC) = (AB)C • A(B+C) = AB + AC • Inverses: Some matrices have an "inverse" which means that if you have a matrix A, there exists a matrix A-1 so that A*A-1 = IdentityMatrix and A-1*A = IdentityMatrix • Matrices that do not have an inverse are "singular" Determinants • Determinants allows us to learn different things about a matrix. • If a determinant is zero, then we know the matrix is singular (and does not have an inverse) • Finding the determinant of a 2x2 matrix is simple, larger matrices are harder. • If A = a b then the det(A) = ad - cb c d Matrices and Programming • In the graphs chapter we learned about how to store matrices (for graphs). • We will build upon that knowledge know and extend what we can do with matrices. • Matrix Addition Code: – We assume A and B are initialized with data – Line 4-6 will loop through each row and then loop through each column – It will take the values and add them into C 1: void MatrixAdd(A,B) 2: int n = rows; int k = cols; 3: int C[4][4]; // C = A + B // to add matrices, loop // through each row and column 4: for (int k = 0; k < n; k++) { 5: for (int j = 0; j < k; j++) { 6: C[k][j] = A[k][j] + B[k][j] 7: } 8: } Matrix Multiplication • Matrix multiplication is an important tasks for many science applications and a lot of work has been done on it. • There is an obvious straightforward way that O(n3) that we will look at, but there are other methods that have actually made matrix multiplication faster than O(n3). • One such example is Strassen's algorithm, which we will try to understand Matrix Multiplication • In this example, C is a "n x n" matrix • Lines 3-4 say to loop through all the rows and all the columns of C • Line 5 initializes the row and column of c to be 0 • Lines 6-7 go through each column of a and each row and b, multiplying them together and saving the sum back into c[i][j] • This solution should have an obvious running time of O(n3) 1: MatrixMultiply(A,B) 2: n = A->rows 3: for i = 1 to n 4: for j = 1 to n 5: c[i][j] = 0 6: for k = 1 to n 7: c[i][j] += a[i][k] * b[k][j] Matrix Multiplication • We can also use a recursive algorithm to do matrix multiplication, that uses a straight-forward divide and conquer method. • In the recursive method, we divide a matrix into smaller and smaller sub-matrices that are decided by the following algorithm, where – – – – r = ae + bf s = ag + bh t = ce + df u = cg + dh C = A * r s a b e t u c d f B g h • This will work when the matrices are "n x n" and we can easily divide matrices into smaller "n/2 x n/2" matrices Recursive Matrix Multiplication • The code will recursively perform 8 multiplications and add parts of product matrices four times. With that we can get the following recurrence relation O(1) if n 1 T ( n) n n 2 8T ( 2 ) 4( 2 ) n 1 • • • Why does this work at all and how do we know this recurrence? This is a hard question to answer We can sub-divide matrices and multiply correctly because of two facts. Fact1: We can sub-divide matrices and their products piece by piece. – This idea is called blocking and it means if we take our matrix and split into four pieces we can still do our multiplication. If we just split into four real numbers, then obviously it works, but it ALSO works for entire matrices. • Fact2: We can sub-divide matrices and addition still works – This idea says we can take entire matrices and addition works just like it does with real numbers. • Both of these facts allow us to recursively multiply matrices Recursive Matrix Multiplication • The recurrence relation says that we have 8 recursive calls and a 4(n/2)2 value, which is really O(n2) O(1) if n 1 T ( n) n n 2 8 T ( ) 4 ( ) n 1 2 2 • These numbers come from the recursive nature of the multiplication algorithm • Remember the algorithm: – – – – r = ae + bf s = ag + bh t = ce + df u = cg + dh r s a b e t u c d f g h • Each of these is a sub-matrix. Count the number of multiplications… There are 8 matrix multiplications and then four additions. Each submatrix is n/2, so we need to multiply 8 "n/2 x n/2" matrices (8T(n/2)) • Then we need to add 4 sub matrices. How long does addition of a matrix take? O(n2). So adding 4 "n/2 x n/2" matrices = 4(n/2)2 Recursive Matrix Multiplication • Unfortunately, when we solve the recurrence it is still O(n3) • But there is an amazing algorithm that is better than O(n3), called Strassen's algorithm • Strassen's algorithm runs in O(nlog 7) time which is about O(n2.808). • It should seem odd that you can do matrix multiplication faster than O(n3), but the current fastest matrix multiplication algorithm is O(n2.376). • These faster multiplication algorithms usually have very, very large constants, so it is not a good idea to use them unless you have, for example, a "45 x 45" matrix or bigger (in the Strassen's algorithm case. ) • For many science applications, sometimes matrices are "10000 x 10000", so these faster algorithms are very useful • Another interesting fact is that Strassen's is "weakly stable." This means that there is a round-off error when it is used and you may not always get the correct answer Strassen's Algorithm • Strassen's algorithm changes the straight-forward recursive algorithm by only doing 7 multiplications and 18 additions. Since additions are faster than multiplications, it is actually faster • No one really knows how Strassen (in 1969) was able to figure out that you only needed 7 multiplications and that you could mix and match different additions to get the same answer. • Other methods since have improved upon his ideas, but no one knows exactly how "hard" matrix multiplication is • A lower bound of Ω(n2) has been proved, but beyond that we don't know how fast matrix multiplication can go. Strassen's Algorithm • If we have the following matrix multiplication, then Strassen's says: • Get seven submatrices: – – – – – – – S1 S2 S3 S4 S5 S6 S7 = = = = = = = A(F-H) (A+B)H (C+D)E D(G-E) (A+D)(E+H) (B-D)(G+H) (A-C)(E+F) I K J A B E L C D G F H • With these sub-matrices, we can find out the answers – – – – • • • • I = S5 + S6 + S4 – S2 J = S1 + S2 K = S3 + S4 L = S1 – S7 – S3 + S5 If we count up multiplications, we get 7 If we count up additions (and subtractions), we get 18 Our recurrence relation is T(n) = 7T(n/2) + 18(n/2)2 = O(n2.808) As you can see, know one understands how he discovered this C Strassen's Example • Get seven submatrices: – – – – – – – S1 S2 S3 S4 S5 S6 S7 = = = = = = = A(F-H) 1(-1-(-2)) = 1 (A+B)H (1+3)-2 = -8 (C+D)E (-2+2)0 = 0 D(G-E) 2(4-0) = 8 (A+D)(E+H) (1+2)(0-2) = -6 (B-D)(G+H) (3-2)(4-2) = 2 (A-C)(E+F) (1-(-2))(0-1) = -3 = A * B I K J A B E L C D G 1 3 0 1 2 2 4 2 12 7 C 8 2 • With these sub-matrices, we can find out the answers – – – – I = S5 + S6 + S4 – S2 -6 + 2 + 8 – (-8) = 12 J = S1 + S2 1 + (-8) = -7 K = S3 + S4 0 + 8 = 8 L = S1 – S7 – S3 + S5 1 – (-3) – 0 + (-6) = -2 F H Solving Linear Equations • Solving linear equations is one of the most important uses of matrices. • Linear equations, which appear in many fields, follow the form a x +a x +…+a x =b 11 1 12 2 1n n 1 a21x1 + a22x2 + … + a2nxn = b2 … an1x1 + an2x2 + … + annxn = bn • Which can also be expressed in matrix format as a11 a12 a1n x1 b1 Which can also be written as a x b Ax = b a a 22 21 2 21 2 If A is not-singular then x = A-1b an1 an 2 ann xn bn Solving Linear Equations • We will only look at this case where the system is Ax=b with n equations and n unknowns. (Other systems exists were there are less equations then unknowns or more equations than unknowns. Both are harder to solve) • One way to solve this is to find the inverse and apply the relationship x = A-1b • This would require us to find the inverse though. The solution is also "weakly stable" which means that answers may have round-off errors which will result in a wrong answer • Instead we will implement a "LUP Decomposition" that is faster than the inverse and also stable. LUP Decomposition • Our goal is to solve Ax = b • The idea behind LUP decomposition is to find three n x n matrices L, U and P so that: – – – – PA = LU L is a unit lower triangular matrix U is an upper triangular matrix P is a permutation matrix (usually the Identity Matrix) • Every non-singular matrix has a L, U and P matrix • If we say (because P is the identity matrix): – PAx = Pb, • Then we can re-write the problem as: – LUx = Pb LUP Decomposition • Then we say that – y = Ux • And the system turns into – Ly = Pb – Which we can solve with a method called "forward substitution" • Once we know y, then we can solve: – Ux = y – Using a method called "back substitution" – This vector x is our solution vector Making a LU Decomposition • We will consider the case where A is a "n x n" nonsingular matrix and P is the Identity matrix. • So what we want is PA = LU which equals A = LU • To find LU we do what's called a "Gaussian elimination" • To get the matrices, we start by subtracting multiples of the first equation from the other equations so that the first variable is removed from those equations. • Then we subtract multiples of the second equation to get rid of the second variables in equations below. • We continue this process until we have an upper triangular matrix (U). • The multipliers we used to get rid of the terms make up the Lower triangular matrix (L). LU Decomposition Algorithm • To do a LU decomposition, we will use a recursive algorithm that continues to break a matrix into 4 separate pieces. a11 a12 a a22 21 A an1 an 2 a11 wt 1 v A 0 a11 1 v / a11 I n 1 0 The four pieces help to make up the new matrices L and U a1n wt is a row vector, v is the column vector and A1 is a special matrix a21 that is "n-1 x n-1" We then can break that matrix into ann two matrices, L and U In L, v/a11 are the values that would eliminate terms of the equations In U, the hard part is the A1-(v/a11)wT. This says, make a new n-1xn-1 t w matrix by multiplying (v/a11)*wT A1 (v / a11) wt and then use A1 to subtract from it This is called the Schur complement LU-Decomposition Algorithm • Line 3 begins the for loop that does the "recursive" part. It will go through each submatrix • Line 4: sets the U values along diagonal • Line 5-7 v/a11 and sets the top rows of the U matrix • Lines 8-10 computer the new "n-1 x n-1" matrix (called the Schur 1: LUDecomp(A) 2: n = A->rows 3: for k = 0; k < n; k++ 4: U[k][k] = A[k][k] 5: L[k][k] = 1 6: for i = k+1 to n 7: L[i][k] = A[i][k]/A[k][k] 8: U[k][i] = A[k][i] 9: for t = k+1 to n 10: for s = k+1 to n 11: A[t][s] -= L[i][k]*U[k][i] 12: return L and U LU Decomposition Example 2 3 1 5 6 13 5 19 2 19 10 23 4 10 11 31 2 3 1 5 3 4 2 4 1 16 9 18 2 4 9 21 a) b) 2 3 1 5 1 6 13 5 19 3 2 19 10 23 1 4 10 11 31 2 A = 2 3 1 2 1 5 2 4 1 2 7 17 3 4 4 1 2 3 1 2 c) 0 1 4 1 0 0 1 7 L 0 2 0 0 0 0 1 0 * 3 4 4 1 1 2 1 7 d) 3 4 0 0 1 2 1 0 U 5 4 2 3 5 4 2 3 LUP Decomposition • The LU decomposition we looked at assumes that we will not divide by zero (that the pivot element is never zero) • The P matrix will move rows around so that a pivot element (a number on the diagonal) is not zero. If an entire row is zero, then the matrix is singular and we cannot solve it • The LU decomposition before assumed that the P matrix was the Identity Matrix, meaning that we won't divide by zero. • Making a new P matrix is not very difficult, but it adds more complexity. We will just always assume we don't divide by zero. Solving with LUP Decomposition • We will first solve using the L matrix with something called "forward substitution." • Ly = Pb or Ly = b (since P is identity) • The L matrix is a lower triangular matrix, meaning that it looks like this: Lower Triangular Matrix • And that we can rewrite equations like: y1 L21y1 + y2 L31y1 + L32y2 + y3 Ln1y1 + Ln2y2 + … + yn = = = = b1 b2 b3 bn 1 3 1 2 0 1 4 1 0 0 1 7 0 0 0 1 Forward Substitution • Since we know y1 = b1, we can use that to forward substitute into the next equation – L21y1 + y2 = b2 – Y2 = b2 – L21y1 y2 = b2 – L21b1 • To get the third, we continue this "forward substitution" – Y3 = b3 – (L31y1 + L32y2) • In general, for the ith term, we can write yi bi j 1 Lij y j i 1 Back substitution • Back substitution will solve for the final part of the problem (Ux = y). • Since U is an upper triangular matrix that looks like this: 2 3 1 5 Upper Triangular Matrix 0 4 2 4 0 0 1 2 0 0 0 3 • We can rewrite the equation like this: U11x1 + U12x2 + … + U1nxn U22x2 + U23x3 + … + U2nxn U33x3 + u34x4 + … + U3nxn Unnxn = y1 = y2 = y3 = yn Back Substitution • So we can recursively solve for x, by doing – xn = yn/Unn – xn-1 = (yn-1 – Un-1,nxn)/Un-1,n-1 –… • And the general case: xi yi ( j i 1U ij x j ) / Ui j n LUP-Solve • Lines 3-6 do the forward substitution to find Ly = Pb (although we are assuming P is the Identity matrix, so it does not change b) • Lines 7-10 do the back substitution to find Ux = y. • x is the solution • You should see the running time is O(2n2) = O(n2) 1: LUPSolve(L,U,P,b) 2: n = L->rows 3: for i = 1 to n 4: for j = 1 to i-1 5: sum += L[i][j]*y[j] 6: y[i] = b[i] – sum 7: for i = n down to 1 8: for j = i+1 to n 9: sum += U[i][j]*x[j] 10: x[i] = (y[i] – sum)/U[i][j] 11: return x LUP-Solve Example Ax = b 1 2 1 4 3 5 4 x 1 1 1 2 1 Do the LU Decomposition 1 2 1 3 5 4 1 1 2 0 1 L 3 1 1 3 1 2 1 3 1 1 1 3 3 0 1 0 U 0 0 1 2 1 1 3 1 1 1 3 6 2 1 1 1 0 6 LUP-Solve Example Do forward substitution for Ly = b 0 0 1 4 3 1 0 * y 1 1 3 1 1 y1 = 4 y2 = 1 - 3*4 = -11 y3 = -1 – (-1*4 + -3*-11) = -30 Do back substitution for Ux = y 1 2 1 4 0 1 1 * x 11 0 0 6 30 x3 = -30/6 = -5 x2 = (-11 – (1*-5))/-1 = 6 x1 = (4-(2*6+1*-5))/1 = -3 3 x 6 5 LUP Solve Example Is the following x vector correct? 1 2 1 3 4 3 5 4 6 1 1 1 2 5 1 b1 = -3*1 + 2*6 + 1*-5 = 4 b2 = 3*-3 + 5*6 + 4*-5 = 1 b3 = -1*-3 + 1*6 + 2*-5 = -1 Yes, it is correct. Our algorithm for solving systems of linear equations works for "n x n" matrices that are not singular and do not have a situation where we divide by zero. Summary • Matrix operations have provided computer scientists with many possibilities for creating interesting algorithms. • We should now be familiar with how to multiply matrices (using three different methods) and also how to solve linear equations using LUP decomposition for certain types of matrices.