Selected Numerical Methods
Part 3: direct methods for linear systems
Roberto Ferretti
• Basic facts about norms and linear systems
• The intrinsic conditioning of linear systems
• Gauss Elimination Method
• Gauss Elimination with pivoting
• LU factorization
Basic facts about norms and linear systems
A norm $\|\cdot\|$ is a function mapping a vector space $X$ into $\mathbb{R}$, satisfying the properties:
$\|x\| \geq 0$, and $\|x\| = 0$ if and only if $x = 0$;
$\|cx\| = |c|\,\|x\|$ ($c \in \mathbb{R}$);
$\|x + y\| \leq \|x\| + \|y\|$.
The three norms of most common use in Numerical Analysis are the Euclidean, $\|\cdot\|_\infty$ and $\|\cdot\|_1$ norms, defined as:

$$\|x\|_2 = \left(\sum_i x_i^2\right)^{1/2}, \qquad \|x\|_\infty = \max_i |x_i|, \qquad \|x\|_1 = \sum_i |x_i|$$
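As a quick check in code (my addition, not part of the original slides), all three vector norms are available in NumPy through the `ord` parameter of `numpy.linalg.norm`:

```python
import numpy as np

x = np.array([3.0, -4.0, 12.0])

print(np.linalg.norm(x, 2))       # Euclidean norm: sqrt(9+16+144) = 13.0
print(np.linalg.norm(x, np.inf))  # infinity norm: max |x_i| = 12.0
print(np.linalg.norm(x, 1))       # 1-norm: 3+4+12 = 19.0
```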
If $X$ is the space of linear bounded operators on a vector space $Y$, then the norm is usually required to further satisfy the properties

$$\|AB\| \leq \|A\|\,\|B\| \quad \text{(submultiplicativity)};$$
$$\|Ax\|_Y \leq \|A\|\,\|x\|_Y \quad \text{(compatibility)}$$

and we call natural norm (associated to a given norm on $Y$) the following norm on $X$:

$$\|A\| := \sup_{x \neq 0} \frac{\|Ax\|_Y}{\|x\|_Y}.$$
In particular, the three natural matrix norms associated respectively to the Euclidean, $\|\cdot\|_\infty$ and $\|\cdot\|_1$ norms are:

$$\|A\|_2 = \rho(A^t A)^{1/2}, \qquad \|A\|_\infty = \max_i \sum_j |a_{ij}|, \qquad \|A\|_1 = \max_j \sum_i |a_{ij}|$$

where $\rho(B) = \max_j |\lambda_j(B)|$ denotes the spectral radius of a matrix.
Another matrix norm, compatible with the Euclidean norm on vectors, is the Frobenius norm

$$\|A\|_F = \left(\sum_{i,j} a_{ij}^2\right)^{1/2}$$

which is not, however, a natural norm ($\|A\|_F \geq \|A\|_2$).
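A short numerical verification of these formulas (again a sketch of mine): the matrix norms are computed with `np.linalg.norm`, and the 2-norm is checked against the spectral radius of $A^tA$:

```python
import numpy as np

A = np.array([[2.0, -1.0],
              [4.0,  3.0]])

# ||A||_2 = rho(A^t A)^(1/2)
rho = max(abs(np.linalg.eigvals(A.T @ A)))
print(np.sqrt(rho), np.linalg.norm(A, 2))     # both about 5.398

print(np.linalg.norm(A, np.inf))              # max row sum: |4|+|3| = 7
print(np.linalg.norm(A, 1))                   # max column sum: |2|+|4| = 6
# the Frobenius norm dominates the natural 2-norm
print(np.linalg.norm(A, 'fro') >= np.linalg.norm(A, 2))   # True
```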
Linear system: in compact form $Ax = b$, in explicit form

$$\begin{cases}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n = b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n = b_2 \\
\quad\vdots \\
a_{n1}x_1 + a_{n2}x_2 + \cdots + a_{nn}x_n = b_n
\end{cases} \tag{1}$$
• It is known that (1) has a unique solution if and only if the matrix $A$ is nonsingular, and there exist algorithms (e.g., Cramer's) for its solution
• The complexity of Cramer's method is factorial, whereas Gaussian elimination and similar algorithms have polynomial complexity
Example: number of operations required to solve a linear 3×3 system:
• In Cramer's method the solution is computed as

$$x_k = \frac{\Delta_k}{\Delta} \qquad (k = 1, 2, 3)$$

where $\Delta, \Delta_k$ are 3 × 3 determinants. This requires a total of 3 quotients + (4 determinants)×(6 terms)×(2 products + 1 sum) = 75 operations
• With the method of elimination via diagonalization (Gauss–Jordan) the system is brought to the form

$$\begin{cases}
\alpha_1 x_1 = \beta_1 \\
\alpha_2 x_2 = \beta_2 \\
\alpha_3 x_3 = \beta_3
\end{cases}$$

Each variable has to be eliminated (by means of linear combinations of rows) from two equations, giving a total of (6 eliminations)×(1 quotient + 3 products + 3 sums) + 3 quotients = 45 operations
• With Gaussian elimination via triangularization the system is brought to the form

$$\begin{cases}
\alpha_{11}x_1 + \alpha_{12}x_2 + \alpha_{13}x_3 = \beta_1 \\
\alpha_{22}x_2 + \alpha_{23}x_3 = \beta_2 \\
\alpha_{33}x_3 = \beta_3
\end{cases}$$

and the triangular system is solved starting from $x_3$, giving a total of (3 eliminations)×(1 quotient + 3 products + 3 sums) + 1 quotient + (1 product + 1 sum + 1 quotient) + (2 products + 2 sums + 1 quotient) = 30 operations. This is the typical strategy of numerical algorithms.
The intrinsic conditioning of linear systems
Before analysing the stability of the solution of a linear system with respect to perturbations, note that we expect conditioning to be bad whenever the rows of the matrix $A$ are almost linearly dependent (in two dimensions, this amounts to looking for the intersection of two lines with similar slopes)
[Figure: intersection of two lines, with nearly equal slopes (ill-conditioned) and with clearly distinct slopes (well-conditioned)]
The intrinsic conditioning of this problem, if evaluated in terms of relative error, is related to the so-called condition number $K_*(A) = \|A\|_* \|A^{-1}\|_*$ of the matrix $A$ with respect to the norm $\|\cdot\|_*$
• If only the right-hand side $b$ is perturbed, the solution $x + \delta x$ of the system $A(x + \delta x) = b + \delta b$ is affected by the relative perturbation

$$\frac{\|\delta x\|}{\|x\|} \leq K(A)\,\frac{\|\delta b\|}{\|b\|}$$

• In the more general case in which the matrix $A$ is also perturbed, the expression is more complex, but the conclusions are similar
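A minimal illustration of this bound (the matrix and the numbers are hypothetical, chosen with nearly dependent rows): `np.linalg.cond` computes $K_2(A)$, and the relative error in $x$ stays within $K(A)\,\|\delta b\|/\|b\|$:

```python
import numpy as np

# two almost linearly dependent rows -> large condition number
A = np.array([[1.0, 1.0],
              [1.0, 1.001]])
b = np.array([2.0, 2.001])       # exact solution x = (1, 1)

K = np.linalg.cond(A)            # K_2(A) = ||A||_2 ||A^-1||_2, about 4e3
x = np.linalg.solve(A, b)

db = np.array([1e-5, 0.0])       # small perturbation of the r.h.s.
dx = np.linalg.solve(A, b + db) - x

rel_x = np.linalg.norm(dx) / np.linalg.norm(x)   # about 1e-2
rel_b = np.linalg.norm(db) / np.linalg.norm(b)   # about 3.5e-6
print(rel_x <= K * rel_b)        # True: the bound holds
```

Note how a perturbation of $b$ of relative size about $3.5 \cdot 10^{-6}$ produces a relative change in $x$ of about $10^{-2}$, an amplification close to $K(A)$.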
Gauss Elimination Method
This method is based on the principle of using suitable linear combinations of rows to obtain a sequence of equivalent systems

$$A^{(1)}x = b^{(1)} \to A^{(2)}x = b^{(2)} \to \cdots \to A^{(n)}x = b^{(n)}$$

where the last one is in triangular form
• This is the algorithm having the lowest computational complexity for general matrices.
• In machine arithmetic the propagation of rounding errors may become prohibitive in high dimensions.
The endpoint of the elimination process is a triangular system

$$\begin{cases}
\alpha_{11}x_1 + \alpha_{12}x_2 + \cdots + \alpha_{1n}x_n = \beta_1 \\
\alpha_{22}x_2 + \cdots + \alpha_{2n}x_n = \beta_2 \\
\quad\vdots \\
\alpha_{nn}x_n = \beta_n
\end{cases} \tag{2}$$

whose solution may be computed by the so-called back substitution as

$$x_n = \frac{\beta_n}{\alpha_{nn}}, \qquad x_k = \frac{1}{\alpha_{kk}}\left(\beta_k - \sum_{j=k+1}^{n} \alpha_{kj}x_j\right) \quad (k = n-1, \ldots, 1)$$

in which the value of an unknown is obtained on the basis of the (successive) ones already computed.
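The back-substitution formula maps directly to code; a minimal sketch (the function name and the test system are mine):

```python
import numpy as np

def back_substitution(U, beta):
    """Solve Ux = beta for upper triangular U, from the last unknown up."""
    n = len(beta)
    x = np.zeros(n)
    for k in range(n - 1, -1, -1):
        # x_k = (beta_k - sum_{j > k} u_kj x_j) / u_kk
        x[k] = (beta[k] - U[k, k+1:] @ x[k+1:]) / U[k, k]
    return x

U = np.array([[2.0, 1.0, -1.0],
              [0.0, 3.0,  2.0],
              [0.0, 0.0,  4.0]])
beta = np.array([1.0, 7.0, 8.0])
print(back_substitution(U, beta))   # [1. 1. 2.]
```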
We start from the initial system, which will be rewritten as $A^{(1)}x = b^{(1)}$, or in extended form:

$$\begin{cases}
a_{11}^{(1)}x_1 + a_{12}^{(1)}x_2 + \cdots + a_{1n}^{(1)}x_n = b_1^{(1)} \\
a_{21}^{(1)}x_1 + a_{22}^{(1)}x_2 + \cdots + a_{2n}^{(1)}x_n = b_2^{(1)} \\
\quad\vdots \\
a_{n1}^{(1)}x_1 + a_{n2}^{(1)}x_2 + \cdots + a_{nn}^{(1)}x_n = b_n^{(1)}.
\end{cases} \tag{3}$$

Assuming first that $a_{11}^{(1)} \neq 0$, the elimination of the variable $x_1$ is achieved by adding to the $k$-th row the first one multiplied by $-a_{k1}^{(1)}/a_{11}^{(1)}$. After $n - 1$ such linear combinations, the unknown $x_1$ will only appear in the first equation.
After the first elimination step, the system will be in the form

$$\begin{cases}
a_{11}^{(1)}x_1 + a_{12}^{(1)}x_2 + \cdots + a_{1n}^{(1)}x_n = b_1^{(1)} \\
a_{22}^{(2)}x_2 + \cdots + a_{2n}^{(2)}x_n = b_2^{(2)} \\
\quad\vdots \\
a_{n2}^{(2)}x_2 + \cdots + a_{nn}^{(2)}x_n = b_n^{(2)}.
\end{cases} \tag{4}$$

Assuming that $a_{22}^{(2)} \neq 0$, in order to eliminate the variable $x_2$ from the last $n - 2$ equations one adds to the $k$-th row the second one multiplied by $-a_{k2}^{(2)}/a_{22}^{(2)}$. After this operation, the unknown $x_2$ will only appear in the first two equations (and so forth...).
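A compact sketch of this elimination procedure in NumPy (my own illustration, assuming all pivots are nonzero, as the slides do at this stage); combined with `back_substitution` above it solves $Ax = b$:

```python
import numpy as np

def forward_elimination(A, b):
    """Triangularize Ax = b by row combinations; assumes nonzero pivots."""
    A, b = A.astype(float), b.astype(float)   # work on copies
    n = len(b)
    for i in range(n - 1):                    # step i eliminates variable i
        for k in range(i + 1, n):
            m = A[k, i] / A[i, i]             # multiplier a_ki / a_ii
            A[k, i:] -= m * A[i, i:]          # row_k <- row_k - m * row_i
            b[k] -= m * b[i]
    return A, b

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
b = np.array([4.0, 10.0, 24.0])
U, beta = forward_elimination(A, b)
# back_substitution(U, beta) from the earlier sketch returns [1. 1. 1.]
```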
Clearly, it might not be true that the (pivot) element $a_{kk}^{(k)}$ is nonzero.
• This fact is only true when the $k$-th order principal minor is nonsingular (e.g., if $A > 0$)
• However, if $\det A \neq 0$, at least one among the elements $a_{ik}^{(k)}$ for $i > k$ will be nonzero, so that to go past a zero pivot it suffices to exchange the $k$-th row with the row in which the nonzero candidate pivot appears
• In machine arithmetic the choice of the pivot has a remarkable influence on the accuracy of the result
Example: the point (10, 1) solves the system

$$\begin{cases}
70x_1 + 700x_2 = 1400 \\
3x_1 + 31x_2 = 61
\end{cases}$$

Solving the system by Gaussian elimination in three-significant-digit arithmetic, the multiplier associated to $x_1$ is $3/70 = 0.0429$ in finite arithmetic, and the system is brought to the triangular form

$$\begin{cases}
70x_1 + 700x_2 = 1400 \\
x_2 = 1
\end{cases}$$

which has the correct solution.
If on the contrary the rows are interchanged, the system takes the form

$$\begin{cases}
3x_1 + 31x_2 = 61 \\
70x_1 + 700x_2 = 1400
\end{cases}$$

The multiplier associated to $x_1$ is now $70/3 = 23.3$ in finite arithmetic, and the system is triangularized as

$$\begin{cases}
3x_1 + 31x_2 = 61 \\
-22x_2 = -20
\end{cases}$$

which (in three-significant-digit arithmetic) has the solution $x_1 = 10.9$, $x_2 = 0.909$ (that is, with a 9% error).
Effects reducing the precision of Gaussian elimination:
• The final system obtained by GE is only approximately triangular (for example, in the second case we have in fact $a_{21}^{(2)} = 0.01$)
• When summing numbers of different orders of magnitude, the smaller one loses significant digits due to the increase in the exponent
• Coefficients obtained as differences between "large" numbers may present an increase in the relative error with respect to the original coefficients (the latter two situations are more likely to happen with large multipliers)
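The latter two effects are easy to reproduce in single precision, whose mantissa carries roughly 7 decimal digits (a demonstration of mine, not from the slides):

```python
import numpy as np

# Summing numbers of very different magnitude: the small addend is lost
big, small = np.float32(1.0e8), np.float32(4.0)
print(big + small - big)   # 0.0: the 4.0 falls below the last mantissa digit

# Difference of close "large" numbers: most significant digits cancel
a = np.float32(1.0000123e4)
b = np.float32(1.0000119e4)
print(a - b)               # 0.00390625 instead of 0.004: ~2% relative error
```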
Complexity of Gaussian elimination:
• The solution of a triangular system requires, for the $k$-th variable, $n - k$ sums and $n - k$ products. So, the overall operation count is

$$2 + 4 + 6 + \cdots + 2(n - 1) = 2\,O\!\left(\frac{n^2}{2}\right) = O(n^2)$$

• The phase of system triangularization requires, to eliminate the $k$-th variable, $(n-k)^2$ products and $(n-k)^2$ sums. Therefore, this phase has the leading complexity, which is

$$2(n-1)^2 + 2(n-2)^2 + \cdots + 8 + 2 = 2\,O\!\left(\frac{n^3}{3}\right) = O\!\left(\frac{2n^3}{3}\right)$$
Gauss Elimination with pivoting
In the strategy of partial pivoting, which is based on row permutations, at the $i$-th elimination step the algorithm brings into $i$-th position the $j$-th equation (with $j \geq i$), where $j$ is such that

$$|a_{ji}^{(i)}| = \max_{k \geq i} |a_{ki}^{(i)}|.$$

• The complexity of this operation is linear for each elimination step, so that it is not the leading complexity term (actually, an elimination step operates on a whole submatrix and has therefore quadratic complexity).
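In code, partial pivoting adds one `argmax` search and one row swap per step to the earlier `forward_elimination` sketch (names mine); note that it keeps every multiplier bounded by 1 in absolute value:

```python
import numpy as np

def forward_elimination_pp(A, b):
    """Gaussian elimination with partial pivoting (row exchanges only)."""
    A, b = A.astype(float), b.astype(float)
    n = len(b)
    for i in range(n - 1):
        # bring into i-th position the row with the largest |a_ki|, k >= i
        j = i + np.argmax(np.abs(A[i:, i]))
        A[[i, j]], b[[i, j]] = A[[j, i]], b[[j, i]]
        for k in range(i + 1, n):
            m = A[k, i] / A[i, i]        # now |m| <= 1 by construction
            A[k, i:] -= m * A[i, i:]
            b[k] -= m * b[i]
    return A, b

# the 2x2 example above: the 70-row is selected as pivot row,
# so the small multiplier 3/70 is used regardless of the input row order
A = np.array([[3.0, 31.0], [70.0, 700.0]])
b = np.array([61.0, 1400.0])
print(forward_elimination_pp(A, b))
```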
In the global pivoting strategy, based on both row and column permutations, the element brought into pivot position at the $i$-th elimination step is $a_{jl}^{(i)}$ (with $j, l \geq i$) such that

$$|a_{jl}^{(i)}| = \max_{k,h \geq i} |a_{kh}^{(i)}|.$$

• It becomes necessary to keep record of the variable interchange operations (column exchanges)
• The complexity of global pivoting is quadratic for each step. Therefore, it is comparable with the leading complexity of the elimination
LU factorization
The elimination of a generic variable $x_i$ amounts to obtaining $A^{(i+1)} = T_i A^{(i)}$ by left-multiplying $A^{(i)}$ by a transformation matrix

$$T_i = \begin{pmatrix}
1 & & & & \\
& \ddots & & & \\
& & 1 & & \\
& & -m_{i+1,i} & \ddots & \\
& & \vdots & & \\
& & -m_{ni} & & 1
\end{pmatrix}$$

where the elements $m_{ki}$ for $k > i$ are the multipliers, defined as

$$m_{ki} = \frac{a_{ki}^{(i)}}{a_{ii}^{(i)}}$$
Assuming first that no row permutation is necessary, the upper triangular matrix $A^{(n)} = U$ is therefore obtained as

$$A^{(n)} = T_{n-1}A^{(n-1)} = T_{n-1}T_{n-2}A^{(n-2)} = \cdots = T_{n-1}T_{n-2}\cdots T_1 A^{(1)} = \Lambda A$$

where the matrix $\Lambda = T_{n-1}T_{n-2}\cdots T_1$ is lower triangular as a product of l.t. matrices. It follows that $\Lambda A = U$ and hence, setting $\Lambda^{-1} = L$:

$$A = LU$$

with $L$ again lower triangular, being the inverse of a l.t. matrix (moreover, both $\Lambda$ and $L$ have unit elements on the diagonal).
Since $\Lambda = T_{n-1}T_{n-2}\cdots T_1$, then $L = \Lambda^{-1} = T_1^{-1}\cdots T_{n-1}^{-1}$. Setting

$$m_k = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ m_{k+1,k} \\ \vdots \\ m_{nk} \end{pmatrix}$$

and therefore $T_k = I - m_k e_k^t$, we can easily check that
• The inverse of the transformation $T_k$ is $T_k^{-1} = I + m_k e_k^t$
• The product $T_j^{-1}T_k^{-1}$ with $j < k$ gives $T_j^{-1}T_k^{-1} = I + m_j e_j^t + m_k e_k^t$, and hence by induction $L = T_1^{-1}\cdots T_{n-1}^{-1} = I + m_1 e_1^t + m_2 e_2^t + \cdots + m_{n-1}e_{n-1}^t$
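These identities can be verified numerically; a small sketch of mine for $n = 3$ with arbitrary multiplier values:

```python
import numpy as np

n = 3
I = np.eye(n)
e1, e2 = I[0], I[1]
m1 = np.array([0.0, 2.0, 4.0])   # multipliers of step 1 (zero in entry 1)
m2 = np.array([0.0, 0.0, 3.0])   # multipliers of step 2 (zeros in entries 1-2)

T1 = I - np.outer(m1, e1)        # T_k = I - m_k e_k^t
T2 = I - np.outer(m2, e2)

# T_k^{-1} = I + m_k e_k^t
print(np.allclose(np.linalg.inv(T1), I + np.outer(m1, e1)))       # True
# L = Lambda^{-1} = T_1^{-1} T_2^{-1} = I + m_1 e_1^t + m_2 e_2^t
L = np.linalg.inv(T2 @ T1)
print(np.allclose(L, I + np.outer(m1, e1) + np.outer(m2, e2)))    # True
```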
• The matrix $A$ is therefore factorized into the product of $L$ (the lower triangular matrix of the multipliers) and $U$ (the upper triangular matrix resulting from Gaussian elimination)
• The solution of the linear system $Ax = b$, introducing the auxiliary variable $z$, is obtained by successive solution of the two triangular systems $Lz = b$ and $Ux = z$
• When pivoting is performed, the system is subject to row permutations, so that it becomes $PAx = Pb$. Then $PA = LU$ and the triangular systems to be solved are $Lz = Pb$ and $Ux = z$
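In practice, the factorization with partial pivoting and the two triangular solves are available in SciPy (assuming SciPy is installed; `lu_factor`/`lu_solve` wrap the corresponding LAPACK routines):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])
b = np.array([4.0, 10.0, 24.0])

lu, piv = lu_factor(A)      # PA = LU, with partial pivoting
x = lu_solve((lu, piv), b)  # solves Lz = Pb, then Ux = z
print(x)                    # [1. 1. 1.]
```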
Other problems which can be solved by factorization:
• Inverse computation: in this case the columns $X_i$ of the inverse $A^{-1}$ solve the linear systems

$$AX_i = e_i$$

so $A$ is factorized once at the start, and only two triangular systems need to be solved for each new right-hand side (with $O(2n^2)$ complexity)
• Determinant computation: since $\det L = 1$, one has

$$\det A = \det L \det U = \det U = \prod_i u_{ii}$$
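Both applications can be sketched on top of the same SciPy factorization (illustration mine; note that SciPy pivots, so the determinant picks up the sign of the row permutation $P$):

```python
import numpy as np
from scipy.linalg import lu, lu_factor, lu_solve

A = np.array([[2.0, 1.0, 1.0],
              [4.0, 3.0, 3.0],
              [8.0, 7.0, 9.0]])

# Inverse: factorize once, then two triangular solves per column e_i
lu_piv = lu_factor(A)
Ainv = np.column_stack([lu_solve(lu_piv, e) for e in np.eye(3)])
print(np.allclose(Ainv, np.linalg.inv(A)))       # True

# Determinant: prod(u_ii), times the sign of the permutation P
P, L, U = lu(A)                                  # A = P L U in SciPy
print(np.linalg.det(P) * np.prod(np.diag(U)))    # 4.0 = det(A)
```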