Selected Numerical Methods
Part 3: direct methods for linear systems
Roberto Ferretti

• Basic facts about norms and linear systems
• The intrinsic conditioning of linear systems
• Gauss Elimination Method
• Gauss Elimination with pivoting
• LU factorization

Basic facts about norms and linear systems

A norm $\|\cdot\|$ is a function mapping a vector space $X$ into $\mathbb{R}$ and satisfying the properties:

$$\|x\| \ge 0, \qquad \|x\| = 0 \ \text{if and only if} \ x = 0;$$
$$\|cx\| = |c|\,\|x\| \quad (c \in \mathbb{R});$$
$$\|x + y\| \le \|x\| + \|y\|.$$

The three norms most commonly used in Numerical Analysis are the euclidean, $\|\cdot\|_\infty$ and $\|\cdot\|_1$ norms, defined as:

$$\|x\|_2 = \Big(\sum_i x_i^2\Big)^{1/2}, \qquad \|x\|_\infty = \max_i |x_i|, \qquad \|x\|_1 = \sum_i |x_i|.$$

If $X$ is the space of bounded linear operators on a vector space $Y$, then the norm is usually required to further satisfy the properties

$$\|AB\| \le \|A\|\,\|B\| \quad \text{(submultiplicativity)},$$
$$\|Ax\|_Y \le \|A\|\,\|x\|_Y \quad \text{(compatibility)},$$

and we call natural norm (associated with a given norm on $Y$) the following norm on $X$:

$$\|A\| := \sup_{x \ne 0} \frac{\|Ax\|_Y}{\|x\|_Y}.$$

In particular, the three natural matrix norms associated with the euclidean, $\|\cdot\|_\infty$ and $\|\cdot\|_1$ norms respectively are:

$$\|A\|_2 = \rho(A^t A)^{1/2}, \qquad \|A\|_\infty = \max_i \sum_j |a_{ij}|, \qquad \|A\|_1 = \max_j \sum_i |a_{ij}|,$$

where $\rho(B) = \max_j |\lambda_j(B)|$ denotes the spectral radius of a matrix. Another matrix norm, compatible with the euclidean norm on vectors, is the Frobenius norm

$$\|A\|_F = \Big(\sum_{i,j} a_{ij}^2\Big)^{1/2},$$

which is not, however, a natural norm ($\|A\|_F \ge \|A\|_2$).

Linear system: in compact form $Ax = b$, in explicit form

$$\begin{cases} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2 \\ \qquad\vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n = b_n \end{cases} \tag{1}$$

• It is known that (1) has a unique solution if and only if the matrix $A$ is nonsingular, and there exist algorithms (e.g., Cramer's rule) for its solution
• The complexity of Cramer's method is factorial, whereas Gaussian elimination and similar algorithms have polynomial complexity

Example: number of operations required to solve a linear 3×3 system:

• In Cramer's method the solution is computed as

$$x_k = \frac{\Delta_k}{\Delta} \quad (k = 1, 2, 3)$$

where $\Delta$, $\Delta_k$ are 3×3 determinants. This requires a total of

3 quotients + (4 determinants) × (6 terms) × (2 products + 1 sum) = 75 operations

• With the method of elimination via diagonalization (Gauss-Jordan) the system is brought to the form

$$\begin{cases} \alpha_1 x_1 = \beta_1 \\ \alpha_2 x_2 = \beta_2 \\ \alpha_3 x_3 = \beta_3 \end{cases}$$

Each variable has to be eliminated (by means of linear combinations of rows) from two equations, giving a total of

(6 eliminations) × (1 quotient + 3 products + 3 sums) + 3 quotients = 45 operations

• With Gaussian elimination via triangularization the system is brought to the form

$$\begin{cases} \alpha_{11}x_1 + \alpha_{12}x_2 + \alpha_{13}x_3 = \beta_1 \\ \alpha_{22}x_2 + \alpha_{23}x_3 = \beta_2 \\ \alpha_{33}x_3 = \beta_3 \end{cases}$$

and the triangular system is solved starting from $x_3$, giving a total of

(3 eliminations) × (1 quotient + 3 products + 3 sums) + 1 quotient + (1 product + 1 sum + 1 quotient) + (2 products + 2 sums + 1 quotient) = 30 operations.

This is the typical strategy of numerical algorithms.
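The vector and matrix norms defined at the start of this section can be tried out on small examples. The following is a minimal sketch in plain Python, not part of the original slides: the function names are hypothetical, and the matrix 2-norm is left out because it requires an eigenvalue computation. It also checks the compatibility property $\|Ax\| \le \|A\|\,\|x\|$ on one example.

```python
import math

def vector_norms(x):
    """Return the euclidean, infinity and 1 norms of a vector (list of numbers)."""
    return {
        "2": math.sqrt(sum(v * v for v in x)),
        "inf": max(abs(v) for v in x),
        "1": sum(abs(v) for v in x),
    }

def matrix_norm_inf(a):
    """Natural norm induced by the infinity norm: maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in a)

def matrix_norm_1(a):
    """Natural norm induced by the 1-norm: maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in a) for j in range(len(a[0])))

def frobenius_norm(a):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for row in a for v in row))

def matvec(a, x):
    """Matrix-vector product A x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

# Illustrative data (chosen for this sketch, not taken from the slides).
A = [[1, -2], [3, 4]]
x = [1, -1]
Ax = matvec(A, x)                      # [3, -1]

print(vector_norms([3, -4]))           # {'2': 5.0, 'inf': 4, '1': 7}
print(matrix_norm_inf(A), matrix_norm_1(A))   # 7 6

# Compatibility: ||Ax|| <= ||A|| ||x|| in the infinity and 1 norms,
# and with the Frobenius norm paired with the euclidean vector norm.
assert vector_norms(Ax)["inf"] <= matrix_norm_inf(A) * vector_norms(x)["inf"]
assert vector_norms(Ax)["1"] <= matrix_norm_1(A) * vector_norms(x)["1"]
assert vector_norms(Ax)["2"] <= frobenius_norm(A) * vector_norms(x)["2"]
```

One example does not prove the inequalities, of course; it only illustrates how the row-sum and column-sum formulas are evaluated.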
The intrinsic conditioning of linear systems

Even before analysing the stability of the solution of a linear system with respect to perturbations, we can expect that conditioning cannot be good whenever the rows of the matrix $A$ are almost linearly dependent (in two dimensions, this amounts to looking for the intersection of two lines with similar slope).

[Figure with two panels: ill-conditioned, well-conditioned]

The intrinsic conditioning of this problem, if evaluated in terms of relative error, is related to the so-called condition number

$$K_*(A) = \|A\|_* \, \|A^{-1}\|_*$$

of the matrix $A$ with respect to the norm $\|\cdot\|_*$.

• If only the right-hand side $b$ is perturbed, the solution $x + \delta x$ of the system $A(x + \delta x) = b + \delta b$ is affected by the relative perturbation

$$\frac{\|\delta x\|}{\|x\|} \le K(A) \, \frac{\|\delta b\|}{\|b\|}$$

• In the more general case in which the matrix $A$ is also perturbed, the expression is more complex, but the conclusions are similar

Gauss Elimination Method

This method is based on the principle of using suitable linear combinations of rows to obtain a sequence of equivalent systems

$$A^{(1)}x = b^{(1)} \ \to\ A^{(2)}x = b^{(2)} \ \to\ \cdots \ \to\ A^{(n)}x = b^{(n)}$$

where the last one is in triangular form.

• Among the classical direct methods, this is the algorithm with the lowest computational complexity for general matrices.
• In machine arithmetic the propagation of rounding errors may become prohibitive in high dimensions.

The endpoint of the elimination process is a triangular system

$$\begin{cases} \alpha_{11}x_1 + \alpha_{12}x_2 + \cdots + \alpha_{1n}x_n = \beta_1 \\ \qquad\quad\ \alpha_{22}x_2 + \cdots + \alpha_{2n}x_n = \beta_2 \\ \qquad\qquad\qquad\vdots \\ \qquad\qquad\qquad\qquad\ \alpha_{nn}x_n = \beta_n \end{cases} \tag{2}$$

whose solution may be computed by the so-called back substitutions as

$$x_n = \frac{\beta_n}{\alpha_{nn}}, \qquad x_k = \frac{1}{\alpha_{kk}} \Big( \beta_k - \sum_{j=k+1}^{n} \alpha_{kj} x_j \Big) \quad (k = n-1, \ldots, 1)$$

in which the value of an unknown is obtained from the (successive) ones already computed.

We start from the initial system, which will be rewritten as $A^{(1)}x = b^{(1)}$, or in extended form:

$$\begin{cases} a_{11}^{(1)}x_1 + a_{12}^{(1)}x_2 + \cdots + a_{1n}^{(1)}x_n = b_1^{(1)} \\ a_{21}^{(1)}x_1 + a_{22}^{(1)}x_2 + \cdots + a_{2n}^{(1)}x_n = b_2^{(1)} \\ \qquad\vdots \\ a_{n1}^{(1)}x_1 + a_{n2}^{(1)}x_2 + \cdots + a_{nn}^{(1)}x_n = b_n^{(1)} \end{cases} \tag{3}$$

Assuming first that $a_{11}^{(1)} \ne 0$, the variable $x_1$ is eliminated by adding to the $k$-th row the first one multiplied by $-a_{k1}^{(1)}/a_{11}^{(1)}$. After $n - 1$ such linear combinations, the unknown $x_1$ will only appear in the first equation.

After the first elimination step, the system will be in the form

$$\begin{cases} a_{11}^{(1)}x_1 + a_{12}^{(1)}x_2 + \cdots + a_{1n}^{(1)}x_n = b_1^{(1)} \\ \qquad\qquad\ a_{22}^{(2)}x_2 + \cdots + a_{2n}^{(2)}x_n = b_2^{(2)} \\ \qquad\qquad\qquad\vdots \\ \qquad\qquad\ a_{n2}^{(2)}x_2 + \cdots + a_{nn}^{(2)}x_n = b_n^{(2)} \end{cases} \tag{4}$$

Assuming that $a_{22}^{(2)} \ne 0$, in order to eliminate the variable $x_2$ from the last $n - 2$ equations one adds to the $k$-th row the second one multiplied by $-a_{k2}^{(2)}/a_{22}^{(2)}$. After this operation, the unknown $x_2$ will only appear in the first two equations (and so forth).

Clearly, it might not be true that the (pivot) element $a_{kk}^{(k)}$ is nonzero.

• This is guaranteed when the leading principal submatrices of $A$ up to order $k$ are nonsingular (e.g., if $A$ is positive definite, $A > 0$)
• However, if $\det A \ne 0$, at least one among the elements $a_{ik}^{(k)}$ for $i > k$ will be nonzero, so that to go past a zero pivot it suffices to exchange the $k$-th row with the row in which the nonzero candidate pivot appears
• In machine arithmetic the choice of the pivot has a remarkable influence on the accuracy of the result

Example: the point $(10, 1)$ solves the system

$$\begin{cases} 70x_1 + 700x_2 = 1400 \\ 3x_1 + 31x_2 = 61 \end{cases}$$

Solving the system by Gaussian Elimination in three-significant-digit arithmetic, the multiplier associated with $x_1$ is $3/70 = 0.0429$ in finite arithmetic, and the system is brought to the triangular form

$$\begin{cases} 70x_1 + 700x_2 = 1400 \\ x_2 = 1 \end{cases}$$

which has the correct solution.

If on the contrary the rows are interchanged, the system takes the form

$$\begin{cases} 3x_1 + 31x_2 = 61 \\ 70x_1 + 700x_2 = 1400 \end{cases}$$

The multiplier associated with $x_1$ is now $70/3 = 23.3$ in finite arithmetic, and the system is triangularized as

$$\begin{cases} 3x_1 + 31x_2 = 61 \\ -22x_2 = -20 \end{cases}$$

which (in three-significant-digit arithmetic) has the solution $x_1 = 10.9$, $x_2 = 0.909$ (that is, with a 9% error).
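The two-equation example above can be replayed with Python's `decimal` module. This is only a sketch under an assumption: every intermediate result is truncated (chopped) to three significant digits. The slides do not state the rounding mode; with truncation the first multiplier is 0.0428 rather than 0.0429, and with round-to-nearest some intermediate values differ. The function name is hypothetical.

```python
from decimal import Decimal, Context, ROUND_DOWN, localcontext

def solve_2x2_three_digits(rows):
    """Gaussian elimination without row exchanges on a 2x2 system.

    rows = [(a11, a12, b1), (a21, a22, b2)]; the first row supplies the pivot.
    Assumption of this sketch: each operation is chopped to 3 significant digits.
    """
    ctx = Context(prec=3, rounding=ROUND_DOWN)
    with localcontext(ctx):
        (a11, a12, b1), (a21, a22, b2) = [
            [Decimal(v) for v in row] for row in rows
        ]
        m = a21 / a11              # multiplier associated with x1
        a22_new = a22 - m * a12    # coefficient of x2 after elimination
        b2_new = b2 - m * b1       # right-hand side after elimination
        x2 = b2_new / a22_new      # back substitution
        x1 = (b1 - a12 * x2) / a11
    return float(x1), float(x2)

# Large pivot (70) first: the exact solution (10, 1) is recovered.
print(solve_2x2_three_digits([(70, 700, 1400), (3, 31, 61)]))   # (10.0, 1.0)

# Small pivot (3) first: multiplier 23.3, solution off by about 9%.
print(solve_2x2_three_digits([(3, 31, 61), (70, 700, 1400)]))   # (10.9, 0.909)
```

In the second ordering the large multiplier turns the last equation into $-22x_2 = -20$, as in the text; the loss of accuracy comes from subtracting nearly equal three-digit numbers.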
Effects reducing the precision of Gaussian Elimination:

• The final system obtained by GE is only approximately triangular (for example, in the second case we have in fact $a_{21}^{(2)} = 70 - 23.3 \cdot 3 = 0.1$)
• When summing numbers of different orders of magnitude, the smaller one loses significant digits due to the increase in the exponent
• Coefficients obtained as the difference between "large" numbers may show an increase in the relative error with respect to the original coefficients

(the latter two situations are more likely to happen with large multipliers)

Complexity of Gaussian Elimination:

• The solution of the triangular system requires, for the $k$-th variable, $n - k$ sums and $n - k$ products. So, the overall operation count is

$$2 + 4 + 6 + \cdots + 2(n-1) = 2\,O\!\left(\frac{n^2}{2}\right) = O(n^2)$$

• The phase of system triangularization requires, to eliminate the $k$-th variable, $(n-k)^2$ products and $(n-k)^2$ sums. Therefore, this phase has the leading complexity, which is

$$2(n-1)^2 + 2(n-2)^2 + \cdots + 8 + 2 = 2\,O\!\left(\frac{n^3}{3}\right) = O\!\left(\frac{2n^3}{3}\right)$$

Gauss Elimination with pivoting

In the strategy of partial pivoting, which is based on row permutations, at the $i$-th elimination step the algorithm brings into the $i$-th position the $j$-th equation (with $j \ge i$), where $j$ is such that

$$|a_{ji}^{(i)}| = \max_{k \ge i} |a_{ki}^{(i)}|.$$

• The complexity of this operation is linear for each elimination step, so that it is not the leading complexity term (actually, an elimination step operates on a whole submatrix and has therefore quadratic complexity).

In the global pivoting strategy, based on both row and column permutations, the element brought into pivot position at the $i$-th elimination step is $a_{jl}^{(i)}$ (with $j, l \ge i$) such that

$$|a_{jl}^{(i)}| = \max_{k,h \ge i} |a_{kh}^{(i)}|.$$

• It becomes necessary to keep a record of the variable interchange operations (column exchanges)
• The complexity of global pivoting is quadratic for each step. Therefore, it is comparable with the leading complexity of the elimination

LU factorization

The elimination of a generic variable $x_i$ amounts to obtaining $A^{(i+1)} = T_i A^{(i)}$ by left multiplying $A^{(i)}$ by a transformation matrix

$$T_i = \begin{pmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & 1 & & & \\ & & -m_{i+1,i} & 1 & & \\ & & \vdots & & \ddots & \\ & & -m_{ni} & & & 1 \end{pmatrix}$$

where the elements $m_{ki}$ for $k > i$ are the multipliers defined as

$$m_{ki} = \frac{a_{ki}^{(i)}}{a_{ii}^{(i)}}.$$

Assuming first that no row permutation is necessary, the upper triangular matrix $A^{(n)} = U$ is therefore obtained as

$$A^{(n)} = T_{n-1}A^{(n-1)} = T_{n-1}T_{n-2}A^{(n-2)} = \cdots = T_{n-1}T_{n-2} \cdots T_1 A^{(1)} = \Lambda A$$

where the matrix $\Lambda = T_{n-1}T_{n-2} \cdots T_1$ is lower triangular as a product of lower triangular matrices. It follows that $\Lambda A = U$ and hence, setting $\Lambda^{-1} = L$:

$$A = LU$$

with $L$ again lower triangular, being the inverse of a lower triangular matrix (moreover, both $\Lambda$ and $L$ have unit elements on the diagonal).

Since $\Lambda = T_{n-1}T_{n-2} \cdots T_1$, then $L = \Lambda^{-1} = T_1^{-1} \cdots T_{n-1}^{-1}$. Setting

$$m_k = (0, \ldots, 0, m_{k+1,k}, \ldots, m_{nk})^t$$

and therefore $T_k = I - m_k e_k^t$, we can easily check that

• The inverse of the transformation $T_k$ is $T_k^{-1} = I + m_k e_k^t$ (since $e_k^t m_k = 0$)
• The product $T_j^{-1}T_k^{-1}$ with $j < k$ gives $T_j^{-1}T_k^{-1} = I + m_j e_j^t + m_k e_k^t$ (since $e_j^t m_k = 0$), and hence by induction

$$L = T_1^{-1} \cdots T_{n-1}^{-1} = I + m_1 e_1^t + m_2 e_2^t + \cdots + m_{n-1} e_{n-1}^t$$

• The matrix $A$ is therefore factorized as the product of $L$ (lower triangular matrix of the multipliers) and $U$ (upper triangular matrix resulting from Gaussian Elimination)
• The solution of the linear system $Ax = b$, introducing the auxiliary variable $z$, is obtained by successive solution of the two triangular systems $Lz = b$ and $Ux = z$
• When pivoting is performed, the system is subject to row permutations so that it becomes $PAx = Pb$. Then, $PA = LU$ and the triangular systems to be solved are $Lz = Pb$ and $Ux = z$

Other problems which can be solved by factorization:

• Inverse computation: in this case the columns $X_i$ of the inverse $A^{-1}$ solve the linear systems $AX_i = e_i$. The matrix $A$ is factorized once at the start, and only two triangular systems need to be solved for each new right-hand side (with $O(2n^2)$ complexity)
• Determinant computation: since $\det L = 1$, one has

$$\det A = \det L \, \det U = \det U = \prod_i u_{ii}$$

(when row exchanges are performed, the factor $\det P = \pm 1$ must also be taken into account)
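The factorization $PA = LU$ with partial pivoting, the two triangular solves $Lz = Pb$, $Ux = z$, and the determinant formula can be put together in a few lines. The following is a minimal sketch in plain Python, not the author's code: the function names are hypothetical, the multipliers are stored in place of the zeroed entries (a common storage choice, assumed here), and the sign of the row permutation is tracked so that the determinant remains correct after row exchanges.

```python
def lu_partial_pivoting(a):
    """Factor P A = L U. Returns (perm, lu, sign).

    lu holds U on and above the diagonal and the multipliers of L below it
    (the unit diagonal of L is implicit). perm[k] is the original index of
    the row moved to position k; sign is det P (+1 or -1).
    """
    n = len(a)
    lu = [[float(v) for v in row] for row in a]
    perm = list(range(n))
    sign = 1
    for i in range(n - 1):
        # Partial pivoting: row with the largest |entry| in column i.
        j = max(range(i, n), key=lambda k: abs(lu[k][i]))
        if lu[j][i] == 0.0:
            raise ValueError("matrix is singular")
        if j != i:
            lu[i], lu[j] = lu[j], lu[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
        for k in range(i + 1, n):
            m = lu[k][i] / lu[i][i]      # multiplier m_ki
            lu[k][i] = m                 # stored where the zero would be
            for h in range(i + 1, n):
                lu[k][h] -= m * lu[i][h]
    if lu[n - 1][n - 1] == 0.0:
        raise ValueError("matrix is singular")
    return perm, lu, sign

def lu_solve(perm, lu, b):
    """Solve A x = b from the factorization: L z = P b, then U x = z."""
    n = len(lu)
    z = [0.0] * n
    for k in range(n):                   # forward substitution (unit diagonal)
        z[k] = b[perm[k]] - sum(lu[k][j] * z[j] for j in range(k))
    x = [0.0] * n
    for k in range(n - 1, -1, -1):       # back substitution
        s = sum(lu[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (z[k] - s) / lu[k][k]
    return x

def determinant(lu, sign):
    """det A = det P * product of the diagonal entries of U."""
    det = float(sign)
    for i in range(len(lu)):
        det *= lu[i][i]
    return det

# The 2x2 example from the pivoting discussion, with the small pivot first:
# partial pivoting exchanges the rows before eliminating.
perm, lu, sign = lu_partial_pivoting([[3, 31], [70, 700]])
x = lu_solve(perm, lu, [61, 1400])       # close to [10, 1] in double precision
d = determinant(lu, sign)                # close to 3*700 - 31*70 = -70
```

Once `perm` and `lu` are available, each further right-hand side costs only the two triangular solves, which is what makes the column-by-column computation of the inverse affordable.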