Session 8a Approximation Theory
Ernesto Gutierrez-Miravete
Spring 2003

1 Introduction

There are two fundamental problems in numerical approximation theory. One is to produce faithful yet simple functional representations of given functions or actual data. In practice this is best done by representing the data (or function) piecewise, as in spline interpolation. The other is to create an optimal representation of data or functions by means of a single, simpler function over the entire domain. This session focuses on the second problem: developing optimal representations of given data or given functions.

2 Discrete Least Squares

Consider a situation involving a single independent variable $x$ and an associated dependent variable $y$. Let $m$ pairs of values (points) $(x_i, y_i)$, $i = 1, 2, ..., m$, be given. The approximation problem is to find the equation of a straight line which best represents the collection of points. The line is not required to pass through any of the points.

One possible formulation is the minimax problem of determining the values of the parameters $a$ and $b$ that solve

$$\min_{a,b} \; \max_{1 \le i \le m} |y_i - (a x_i + b)|$$

This is typically a difficult problem to solve. Alternatively, one can attempt to minimize the total absolute deviation

$$\min_{a,b} \sum_{i=1}^{m} |y_i - (a x_i + b)|$$

However, differentiability problems occur, because the absolute value function has no derivative at zero.

A most convenient method for determining linear approximations is the least squares procedure.
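To make the competing error measures concrete, the following minimal Python sketch evaluates the maximum deviation, the total absolute deviation and the sum of squared deviations of a candidate line $y = ax + b$ from a set of points. The function name `line_errors` and the sample points are illustrative assumptions, not part of the original notes.

```python
def line_errors(x, y, a, b):
    """Return the three error measures for the candidate line y = a*x + b."""
    residuals = [yi - (a * xi + b) for xi, yi in zip(x, y)]
    return {
        "minimax": max(abs(r) for r in residuals),   # largest deviation
        "abs_sum": sum(abs(r) for r in residuals),   # total absolute deviation
        "sq_sum": sum(r * r for r in residuals),     # sum of squared errors
    }

# Hypothetical data: compare two candidate lines on the same points.
x = [0.0, 1.0, 2.0, 3.0]
y = [0.1, 0.9, 2.2, 2.8]
print(line_errors(x, y, 1.0, 0.0))
print(line_errors(x, y, 0.9, 0.2))
```

Each criterion ranks candidate lines by a different number, so the "best" line depends on which measure is minimized.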
In the least squares approach, one must find the values of the parameters $a$ and $b$ which minimize the sum of the squares of the errors

$$\min_{a,b} \sum_{i=1}^{m} (y_i - (a x_i + b))^2$$

Minimization requires taking derivatives of the sum of squared errors with respect to the parameters $a$ and $b$ and setting them to zero, which gives the normal equations

$$a \sum_{i=1}^{m} x_i^2 + b \sum_{i=1}^{m} x_i = \sum_{i=1}^{m} x_i y_i$$

and

$$a \sum_{i=1}^{m} x_i + b\,m = \sum_{i=1}^{m} y_i$$

The solution of the normal equations is

$$a = \frac{m \sum_{i=1}^{m} x_i y_i - \left(\sum_{i=1}^{m} x_i\right)\left(\sum_{i=1}^{m} y_i\right)}{m \sum_{i=1}^{m} x_i^2 - \left(\sum_{i=1}^{m} x_i\right)^2}$$

and

$$b = \frac{\left(\sum_{i=1}^{m} x_i^2\right)\left(\sum_{i=1}^{m} y_i\right) - \left(\sum_{i=1}^{m} x_i y_i\right)\left(\sum_{i=1}^{m} x_i\right)}{m \sum_{i=1}^{m} x_i^2 - \left(\sum_{i=1}^{m} x_i\right)^2}$$

Exercise 1. Example 8.1.1 in B-F.

Instead of a linear function $y = ax + b$ one can use an algebraic polynomial of degree $n < m - 1$, $P_n(x) = \sum_{k=0}^{n} a_k x^k$. The constants $a_0, a_1, ..., a_n$ are chosen such that the least squared error is obtained. Again, taking derivatives and equating them to zero, one obtains the $n + 1$ generalized normal equations

$$\sum_{k=0}^{n} a_k \sum_{i=1}^{m} x_i^{j+k} = \sum_{i=1}^{m} y_i x_i^{j}, \quad j = 0, 1, ..., n$$

These normal equations have a unique solution for distinct $x_i$.

Exercise 2. Example 8.1.2 in B-F.

Sometimes an exponential or power-law fitting function is more appropriate. Two possibilities are

$$y = b \exp(ax)$$

and

$$y = b x^a$$

One can proceed as before, minimizing the sum of squared errors with respect to the parameters of the fitting function. However, no closed-form solution of the resulting system of equations can be obtained. Instead, one can take logarithms to obtain

$$\ln y = \ln b + a x$$

and

$$\ln y = \ln b + a \ln x$$

One can then proceed with a linear model using $Y = \ln y$, $B = \ln b$, $A = a$ and $X = x$ or $X = \ln x$. However, the approximation thus obtained is not the least squares approximation of the original problem, since it minimizes the squared errors in $\ln y$ rather than in $y$.

Exercise 3. Example 8.1.3 in B-F.

3 Orthogonal Polynomials and Least Squares

If instead of approximating data points one must approximate functions, one can always use polynomials. Consider the problem of approximating a function $f(x) \in C[a, b]$ with a polynomial $P_n(x)$ of degree at most $n$.
The square error is in this case

$$\int_a^b (f(x) - P_n(x))^2 \, dx$$

One must then find the set of coefficients $a_j$, $j = 0, 1, 2, ..., n$, which minimizes the value of the integral above. The corresponding normal equations form an ill-conditioned system (the coefficient matrix is a Hilbert matrix):

$$\sum_{k=0}^{n} a_k \int_a^b x^{j+k} \, dx = \int_a^b x^{j} f(x) \, dx, \quad j = 0, 1, ..., n$$

Further, having determined $P_n$ does not facilitate obtaining $P_{n+1}$.

Exercise 4. Example 8.2.1 in B-F. Approximate $\sin(\pi x)$ on $[0, 1]$ with a second order algebraic polynomial.

Definition of Linear Independence of Functions. The set $\{\phi_0(x), \phi_1(x), ..., \phi_n(x)\}$ is linearly independent on $[a, b]$ if

$$\sum_{j=0}^{n} c_j \phi_j(x) = 0 \quad \forall \, x \in [a, b]$$

holds only when $c_j = 0$ for all $j = 0, 1, 2, ..., n$.

Theorem of Linear Independence of Polynomials. If $\phi_j$ is a polynomial of degree $j$ for each $j = 0, 1, 2, ..., n$, then the set $\{\phi_0(x), \phi_1(x), ..., \phi_n(x)\}$ is linearly independent on $[a, b]$.

Theorem of Linear Combination. If $\{\phi_0(x), \phi_1(x), ..., \phi_n(x)\}$ is a collection of linearly independent polynomials from the (larger) set of all polynomials of degree at most $n$, any polynomial of degree at most $n$ can be written uniquely as a linear combination of the $\phi_j$'s.

Definition of Weight Function. To assign varying degrees of importance to approximations on certain portions of the approximation interval one uses weight functions. Using the weight function $w(x)$ and a linear combination of a linearly independent set of functions, the error to be minimized is now

$$E(a_0, a_1, ..., a_n) = \int_a^b w(x) \left( f(x) - \sum_{k=0}^{n} a_k \phi_k(x) \right)^2 dx$$

Minimization leads to the system of normal equations

$$\int_a^b w(x) f(x) \phi_j(x) \, dx = \sum_{k=0}^{n} a_k \int_a^b w(x) \phi_k(x) \phi_j(x) \, dx, \quad j = 0, 1, 2, ..., n$$

If the functions $\phi_j$ are chosen such that

$$\int_a^b w(x) \phi_k(x) \phi_j(x) \, dx = \begin{cases} \alpha_j > 0 & \text{if } j = k \\ 0 & \text{if } j \neq k \end{cases}$$

the normal equations decouple into

$$\int_a^b w(x) f(x) \phi_j(x) \, dx = a_j \int_a^b w(x) [\phi_j(x)]^2 \, dx = a_j \alpha_j$$

for each $j = 0, 1, 2, ..., n$, so that

$$a_j = \frac{1}{\alpha_j} \int_a^b w(x) f(x) \phi_j(x) \, dx$$

Definition of Orthogonality of Functions.
A set of functions $\{\phi_0(x), \phi_1(x), ..., \phi_n(x)\}$ on $[a, b]$ is orthogonal with respect to the weight function $w(x)$ if

$$\int_a^b w(x) \phi_k(x) \phi_j(x) \, dx = \begin{cases} \alpha_j > 0 & \text{if } j = k \\ 0 & \text{if } j \neq k \end{cases}$$

If $\alpha_j = 1$ for all $j$, the set is orthonormal.

Theorem of Least Squares Approximation with Orthogonal Sets of Functions. If $\{\phi_0(x), \phi_1(x), ..., \phi_n(x)\}$ is an orthogonal set of functions on $[a, b]$ with respect to the weight function $w(x)$, the least squares approximation to $f(x)$ with respect to $w$ is

$$\sum_{k=0}^{n} a_k \phi_k(x)$$

with

$$a_k = \frac{\int_a^b w(x) \phi_k(x) f(x) \, dx}{\int_a^b w(x) [\phi_k(x)]^2 \, dx} = \frac{1}{\alpha_k} \int_a^b w(x) \phi_k(x) f(x) \, dx$$

Theorem of Recursive Construction of Orthogonal Polynomials. The set of polynomial functions $\{\phi_0(x), \phi_1(x), ..., \phi_n(x)\}$ on $[a, b]$ given by

$$\phi_0(x) = 1$$

$$\phi_1(x) = x - B_1, \quad B_1 = \frac{\int_a^b x \, w(x) [\phi_0(x)]^2 \, dx}{\int_a^b w(x) [\phi_0(x)]^2 \, dx}$$

and, for $k \ge 2$,

$$\phi_k(x) = (x - B_k) \phi_{k-1}(x) - C_k \phi_{k-2}(x)$$

$$B_k = \frac{\int_a^b x \, w(x) [\phi_{k-1}(x)]^2 \, dx}{\int_a^b w(x) [\phi_{k-1}(x)]^2 \, dx}, \quad C_k = \frac{\int_a^b x \, w(x) \phi_{k-1}(x) \phi_{k-2}(x) \, dx}{\int_a^b w(x) [\phi_{k-2}(x)]^2 \, dx}$$

is orthogonal with respect to $w$. This provides a scheme to recursively compute orthogonal polynomials.

Exercise 4. Example 8.2.3 in B-F. Legendre polynomials.

4 Chebyshev Polynomials

The Chebyshev polynomials $\{T_n(x)\}$ are an orthogonal set on $[-1, 1]$ with respect to $w(x) = (1 - x^2)^{-1/2}$. They are given by

$$T_0(x) = 1, \quad T_1(x) = x$$

and

$$T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)$$

for each $n \ge 1$.

Theorem of Zeroes of Chebyshev Polynomials. $T_n(x)$ has $n$ simple zeros in $[-1, 1]$ at

$$x_k^* = \cos\left( \frac{2k - 1}{2n} \pi \right), \quad k = 1, 2, ..., n$$

Further, $T_n(x)$ has absolute extrema at

$$x_k' = \cos\left( \frac{k \pi}{n} \right), \quad k = 0, 1, ..., n$$

The monic Chebyshev polynomials $T_n^m(x)$ are defined as $T_0^m(x) = 1$ and

$$T_n^m(x) = \frac{1}{2^{n-1}} T_n(x)$$

for each $n \ge 1$.

Theorem of Bound. The monic Chebyshev polynomials $T_n^m(x)$, $n \ge 1$, have the property

$$\frac{1}{2^{n-1}} = \max_{x \in [-1, 1]} |T_n^m(x)| \le \max_{x \in [-1, 1]} |P_n^m(x)|$$

for every $P_n^m$ in the set of monic algebraic polynomials of degree $n$.

The optimal location of the interpolating nodes $x_k$, $k = 0, 1, ..., n$, when using Lagrange interpolation on $[-1, 1]$ is $x_k = \cos\left( \frac{2k + 1}{2(n + 1)} \pi \right)$ (i.e. equal to the $(k + 1)$th zero of $T_{n+1}^m(x)$).

Exercise 5.
Example 8.3.1 in B-F.

Chebyshev polynomials can also be used to produce approximating polynomials of reduced order with minimum loss of accuracy.

Exercise 6. Example 8.3.2 in B-F.

5 Rational Function Approximation

Rational function approximations spread approximation errors more evenly over the approximation interval. They can also produce better approximations to discontinuous functions. A rational function $r(x)$ of degree $N$ is of the form

$$r(x) = \frac{p(x)}{q(x)} = \frac{p_0 + p_1 x + ... + p_n x^n}{q_0 + q_1 x + ... + q_m x^m}$$

where $p(x)$ and $q(x)$ are algebraic polynomials of degrees $n$ and $m$, respectively, such that $n + m = N$. If $q_0 = 1$, then $N + 1$ parameters $(q_1, q_2, ..., q_m, p_0, p_1, ..., p_n)$ must be determined when approximating $f(x)$ with $r(x)$.

Padé approximation selects the parameters such that $f^{(k)}(0) = r^{(k)}(0)$ for each $k = 0, 1, ..., N$ (or, equivalently, such that $f - r$ has a zero of multiplicity $N + 1$ at $x = 0$). This leads to the system of simultaneous linear algebraic equations

$$\sum_{i=0}^{k} a_i q_{k-i} = p_k, \quad k = 0, 1, ..., N$$

for the $N + 1$ unknown parameters, with $p_k = 0$ for $k > n$ and $q_k = 0$ for $k > m$. Here the $a_i$'s are the coefficients of the Maclaurin expansion of $f(x)$.

Maple can be used to compute Padé approximations. First one computes the Maclaurin series and then produces the Padé approximation. The command is

g := convert(series(f(x), x), ratpoly, 3, 2)

Exercise 7. Example 8.4.1 in B-F.

Padé Rational Approximation Algorithm.
• Give the values of $m$ and $n$, the function $f(x)$ and the coefficients $a_0, a_1, ..., a_N$ of its Maclaurin polynomial, where $N = m + n$.
• Set $q_0 = 1$; $p_0 = a_0$.
• Set up the linear system with augmented matrix $(b_{i,j})$. For $i = 1, 2, ..., N$:
  • For $j = 1, 2, ..., i - 1$, if $j \le n$ set $b_{i,j} = 0$.
  • If $i \le n$ set $b_{i,i} = 1$.
  • For $j = i + 1, i + 2, ..., N$ set $b_{i,j} = 0$.
  • For $j = 1, 2, ..., i$, if $j \le m$ set $b_{i,n+j} = -a_{i-j}$.
  • For $j = n + i + 1, n + i + 2, ..., N$ set $b_{i,j} = 0$.
  • Set $b_{i,N+1} = a_i$.
• Solve the linear system by Gauss elimination with partial pivoting.
• Output $q_i$ for $i = 0, 1, ..., m$ and $p_i$ for $i = 0, 1, ..., n$.
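The Padé conditions $\sum_{i=0}^{k} a_i q_{k-i} = p_k$ can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the algorithm listed above: it solves the equations for $k = n+1, ..., N$ (where $p_k = 0$) for the $q_j$ first and then evaluates the $p_k$ directly, and it uses exact rational arithmetic from the standard `fractions` module. The function names `solve` and `pade` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def solve(A, rhs):
    """Gaussian elimination with row pivoting, exact on Fractions."""
    size = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]   # augmented matrix
    for col in range(size):
        piv = max(range(col, size), key=lambda r: abs(M[r][col]))
        if M[piv][col] == 0:
            raise ValueError("singular system")
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, size):
            factor = M[r][col] / M[col][col]
            for c in range(col, size + 1):
                M[r][c] -= factor * M[col][c]
    x = [Fraction(0)] * size
    for r in range(size - 1, -1, -1):               # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, size))
        x[r] = (M[r][size] - s) / M[r][r]
    return x

def pade(a, n, m):
    """a: Maclaurin coefficients a_0..a_{n+m}. Returns (p, q) with q[0] = 1."""
    N = n + m
    coef = lambda i: a[i] if i >= 0 else Fraction(0)
    # Equations k = n+1..N: sum_{j=1}^{m} a_{k-j} q_j = -a_k
    A = [[coef(k - j) for j in range(1, m + 1)] for k in range(n + 1, N + 1)]
    rhs = [-a[k] for k in range(n + 1, N + 1)]
    q = [Fraction(1)] + (solve(A, rhs) if m > 0 else [])
    # Equations k = 0..n: p_k = sum_{j=0}^{min(k,m)} a_{k-j} q_j
    p = [sum(coef(k - j) * q[j] for j in range(min(k, m) + 1))
         for k in range(n + 1)]
    return p, q

# Illustrative choice: f(x) = exp(-x), whose coefficients are (-1)^k / k!
a = [Fraction((-1) ** k, factorial(k)) for k in range(6)]
p, q = pade(a, 3, 2)
print(p, q)
```

For the Maclaurin coefficients of $e^{-x}$ with $n = 3$ and $m = 2$ this gives $p = (1, -3/5, 3/20, -1/60)$ and $q = (1, 2/5, 1/20)$.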
To improve accuracy, Chebyshev polynomials can be used instead of algebraic polynomials both in $r(x)$ and in $f(x)$, i.e.

$$r(x) = \frac{\sum_{k=0}^{n} p_k T_k(x)}{\sum_{k=0}^{m} q_k T_k(x)}$$

and

$$f(x) = \sum_{k=0}^{\infty} a_k T_k(x)$$

Exercise 8. Example 8.4.2 in B-F.

Chebyshev Rational Approximation Algorithm.
• Give the values of $m$ and $n$ and the function $f(x)$.
• Set $N = m + n$.
• Set $a_0 = (2/\pi) \int_0^{\pi} f(\cos\theta) \, d\theta$.
• For $k = 1, 2, ..., N + m$ set $a_k = (2/\pi) \int_0^{\pi} f(\cos\theta) \cos(k\theta) \, d\theta$.
• Set $q_0 = 1$.
• For $i = 0, 1, ..., N$ set up the linear system.
• Solve the linear system by Gauss elimination with partial pivoting.
• Output $q_i$ for $i = 0, 1, ..., m$ and $p_i$ for $i = 0, 1, ..., n$.

In Maple the Chebyshev polynomials are determined by first calling the library

with(orthopoly, T)

and then computing the Chebyshev rational approximation by

g := convert(numapprox[chebyshev](f(x), x ...

6 Trigonometric Function Approximation

Trigonometric functions can produce excellent representations of arbitrary functions. It is well known that the set of functions $\{\phi_0(x), \phi_1(x), ..., \phi_{2n-1}(x)\}$ such that

$$\phi_0(x) = \frac{1}{2}$$

$$\phi_k(x) = \cos(kx), \quad k = 1, 2, ..., n$$

$$\phi_{n+k}(x) = \sin(kx), \quad k = 1, 2, ..., n - 1$$

is orthogonal on $[-\pi, \pi]$ with respect to $w(x) = 1$. The set of all linear combinations of the functions $\phi_0, \phi_1, ..., \phi_{2n-1}$ is the set of trigonometric polynomials of degree less than or equal to $n$.

The least squares trigonometric approximation $S_n(x)$ to a given function $f(x) \in C[-\pi, \pi]$, obtained from the set of trigonometric polynomials of degree less than or equal to $n$, is given by

$$S_n(x) = \frac{a_0}{2} + a_n \cos(nx) + \sum_{k=1}^{n-1} [a_k \cos(kx) + b_k \sin(kx)]$$

where the coefficients are determined from

$$a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos(kx) \, dx$$

for $k = 0, 1, 2, ..., n$ and

$$b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin(kx) \, dx$$

for $k = 1, 2, ..., n - 1$. Recall that $\lim_{n \to \infty} S_n(x)$ is the Fourier series of $f(x)$. For finite $n$ we have the truncated Fourier series.

Exercise 9. Example 8.5.1 in B-F. Approximate $|x|$ on $[-\pi, \pi]$ with $S_0(x)$, $S_1(x)$, $S_2(x)$ and $S_3(x)$.
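The coefficient integrals for the continuous trigonometric least squares approximation can be approximated numerically. The sketch below is one possible illustration in Python, assuming a composite midpoint rule for the integrals over $[-\pi, \pi]$ (the notes do not prescribe a quadrature rule); the names `trig_coeffs` and `S` and the number of panels are hypothetical choices.

```python
import math

def trig_coeffs(f, n, panels=2000):
    """Approximate a_k (k = 0..n) and b_k (k = 1..n-1) on [-pi, pi]
    with the composite midpoint rule. b[0] is an unused placeholder."""
    h = 2 * math.pi / panels
    xs = [-math.pi + (i + 0.5) * h for i in range(panels)]
    a = [sum(f(x) * math.cos(k * x) for x in xs) * h / math.pi
         for k in range(n + 1)]
    b = [0.0] + [sum(f(x) * math.sin(k * x) for x in xs) * h / math.pi
                 for k in range(1, n)]
    return a, b

def S(x, a, b, n):
    """Evaluate S_n(x) = a_0/2 + a_n cos(nx) + sum_{k=1}^{n-1} (...)."""
    if n == 0:
        return a[0] / 2
    total = a[0] / 2 + a[n] * math.cos(n * x)
    for k in range(1, n):
        total += a[k] * math.cos(k * x) + b[k] * math.sin(k * x)
    return total

# f(x) = |x| as in Exercise 9; the coefficients do not depend on n,
# so one call serves S_0 through S_3.
a, b = trig_coeffs(abs, 3)
for n in range(4):
    print(n, S(1.0, a, b, n))
```

Because $|x|$ is even, the computed $b_k$ should come out at rounding level, and the $a_k$ can be compared with the closed forms $a_0 = \pi$ and $a_k = 2((-1)^k - 1)/(\pi k^2)$ for $k \ge 1$.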
In many applications where one has to analyze large quantities of data, the discrete analog of the above is often quite useful. Let $\{(x_j, y_j)\}_{j=0}^{2m-1}$ be a collection of $2m$ paired data points with the $x_j$ equally spaced. One wants to find the trigonometric polynomial $S_n(x)$ of degree $n$, of the form

$$S_n(x) = \frac{a_0}{2} + a_n \cos(nx) + \sum_{k=1}^{n-1} [a_k \cos(kx) + b_k \sin(kx)]$$

which minimizes the sum of the squares of the errors over the points,

$$E = \sum_{j=0}^{2m-1} (y_j - S_n(x_j))^2$$

One proceeds as before, minimizing with respect to the parameters and taking advantage of the fact that the trigonometric polynomials are also orthogonal in the discrete case. The optimal coefficients are determined by the Theorem of Constants Minimizing the LS Sum and are given by

$$a_k = \frac{1}{m} \sum_{j=0}^{2m-1} y_j \cos(k x_j)$$

for $k = 0, 1, 2, ..., n$, and

$$b_k = \frac{1}{m} \sum_{j=0}^{2m-1} y_j \sin(k x_j)$$

for $k = 1, 2, ..., n - 1$.

Exercise 10. Example 8.5.2 in B-F. For $f(x) = x^4 - 3x^3 + 2x^2 - \tan(x[x - 2])$ with $x \in [0, 2]$, select 10 equally spaced points and determine $S_3(x)$. First transform the domain $[0, 2]$ into $[-\pi, \pi]$ by means of the change of variable from $x$ to $z = \pi(x - 1)$. The least squares trigonometric polynomial of order $n = 3$ is

$$S_3(z) = \frac{a_0}{2} + a_3 \cos(3z) + \sum_{k=1}^{2} (a_k \cos(kz) + b_k \sin(kz))$$

where

$$a_k = \frac{1}{5} \sum_{j=0}^{9} f\left(1 + \frac{z_j}{\pi}\right) \cos(k z_j)$$

for $k = 0, 1, 2, 3$ and

$$b_k = \frac{1}{5} \sum_{j=0}^{9} f\left(1 + \frac{z_j}{\pi}\right) \sin(k z_j)$$

for $k = 1, 2$.

7 Fast Fourier Transforms

Besides the least squares polynomial of degree $n$ on a set of $2m$ data points $\{(x_j, y_j)\}_{j=0}^{2m-1}$ described above, we are interested in the interpolatory polynomial of order $m$ on those data points. The reason for the interest is that very accurate results are produced when using the interpolating polynomial on large amounts of equally spaced data points. Direct calculation of the polynomial requires $O((2m)^2)$ operations.
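The direct calculation of the discrete coefficients can be sketched in Python as follows. The sketch assumes the nodes are $x_j = -\pi + j\pi/m$, $j = 0, 1, ..., 2m - 1$ (equally spaced, and consistent with the change of variable in Exercise 10); the function name `discrete_trig_coeffs` is hypothetical. Each coefficient is a sum over all $2m$ points, so computing on the order of $2m$ coefficients this way costs on the order of $(2m)^2$ operations.

```python
import math

def discrete_trig_coeffs(y, n):
    """Direct evaluation of a_k (k = 0..n) and b_k (k = 1..n-1) from
    2m samples y_j taken at x_j = -pi + j*pi/m. b[0] is unused."""
    two_m = len(y)
    m = two_m // 2
    x = [-math.pi + j * math.pi / m for j in range(two_m)]
    a = [sum(yj * math.cos(k * xj) for xj, yj in zip(x, y)) / m
         for k in range(n + 1)]
    b = [0.0] + [sum(yj * math.sin(k * xj) for xj, yj in zip(x, y)) / m
                 for k in range(1, n)]
    return a, b

# Setup of Exercise 10: 10 points, z_j = -pi + j*pi/5, samples f(1 + z_j/pi).
def f(x):
    return x**4 - 3 * x**3 + 2 * x**2 - math.tan(x * (x - 2))

m = 5
z = [-math.pi + j * math.pi / m for j in range(2 * m)]
y = [f(1 + zj / math.pi) for zj in z]
a, b = discrete_trig_coeffs(y, 3)
print(a, b[1:])
```

Calling the same function with $n = m$ returns the coefficients $a_0, ..., a_m$ and $b_1, ..., b_{m-1}$ used by the interpolatory polynomial.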
The Fast Fourier Transform algorithm is a very convenient computational procedure based on choosing $m$ so that it can be factored into powers of two. This choice reduces the number of required operations to $O(m \log_2 m)$.

If one demands that the coefficients of the interpolatory polynomial agree with those of the discrete least squares polynomial, $S_m(x)$ is given by

$$S_m(x) = \frac{a_0 + a_m \cos(mx)}{2} + \sum_{k=1}^{m-1} (a_k \cos(kx) + b_k \sin(kx))$$

with

$$a_k = \frac{1}{m} \sum_{j=0}^{2m-1} y_j \cos(k x_j)$$

for $k = 0, 1, 2, ..., m$, and

$$b_k = \frac{1}{m} \sum_{j=0}^{2m-1} y_j \sin(k x_j)$$

for $k = 1, 2, ..., m - 1$.

Interpolating $2m$ points with $S_m(x)$ above directly requires about $(2m)^2$ multiplications/divisions and $(2m)^2$ additions/subtractions. The Fast Fourier Transform algorithm can do it with about $m \log_2 m$ of each by calculating the coefficients in clusters in the complex domain. The complex coefficients $c_k$ are

$$c_k = \sum_{j=0}^{2m-1} y_j \exp\left( \frac{i k \pi j}{m} \right) = \frac{m}{(-1)^k} (a_k + i b_k)$$

for each $k = 0, 1, 2, ..., 2m - 1$. Note that computation of all the $c_k$'s directly with this expression requires $O[(2m)^2]$ operations.

Now select the value of $m$ such that $m = 2^p$ where $p > 0$ is an integer. Thus, for each $k$,

$$c_k + c_{m+k} = \sum_{j=0}^{2m-1} y_j \exp\left( \frac{i k \pi j}{m} \right) + \sum_{j=0}^{2m-1} y_j \exp\left( \frac{i (m + k) \pi j}{m} \right) = \sum_{j=0}^{2m-1} y_j \exp\left( \frac{i k \pi j}{m} \right) (1 + \exp(i \pi j))$$

Since $1 + \exp(i \pi j) = 2$ when $j$ is even and zero otherwise, only the even-indexed terms survive. One can therefore replace the dummy summation index $j$ by $2j$ and write

$$c_k + c_{m+k} = 2 \sum_{j=0}^{m-1} y_{2j} \exp\left( \frac{i k \pi j}{m/2} \right)$$

Similarly,

$$c_k - c_{m+k} = 2 \exp(i k \pi / m) \sum_{j=0}^{m-1} y_{2j+1} \exp\left( \frac{i k \pi j}{m/2} \right)$$

These two expressions allow determination of all the $c_k$'s. This results in a reduction in the number of multiplications required to obtain the $c_k$ to $2m^2 + m$. However, each of the above sums can in turn be replaced by two sums from $j = 0$ to $j = (m/2) - 1$ and, as a result, the number of required multiplications is now $m^2 + 2m$. Repetition of this breakdown process $r$ times leads to a required number of multiplications of $m^2 / 2^{r-2} + m r$.
When $r = p + 1$ the process is complete and, since $m = 2^p$, the number of multiplications is $m^2 / 2^{p-1} + m(p + 1) = 3m + m \log_2 m$, which is of order $m \log_2 m$.

Exercise 11. Example 8.6.1 in B-F.

Fast Fourier Transform Algorithm.
• Give $m$, $p$, uniformly spaced $x_j \in [-\pi, \pi]$ and $y_j$, $j = 0, 1, ..., 2m - 1$.
• Set $M = m$; $q = p$; $\xi = \exp(i \pi / m)$.
• For $j = 0, 1, ..., 2m - 1$ set $c_j = y_j$.
• For $j = 1, 2, ..., M$ set $\zeta_j = \xi^j$; $\zeta_{j+M} = -\zeta_j$.
• Set $K = 0$; $\zeta_0 = 1$.
• For $L = 1, 2, ..., p + 1$:
  • While $K < 2m - 1$:
    • For $j = 1, 2, ..., M$:
      • Let $K = k_p 2^p + k_{p-1} 2^{p-1} + ... + k_1 2 + k_0$; set $K_1 = \lfloor K / 2^q \rfloor = k_p 2^{p-q} + ... + k_{q+1} 2 + k_q$; set $K_2 = k_q 2^p + k_{q+1} 2^{p-1} + ... + k_p 2^q$.
      • Set $\eta = c_{K+M} \zeta_{K_2}$; $c_{K+M} = c_K - \eta$; $c_K = c_K + \eta$.
      • Set $K = K + 1$.
    • Set $K = K + M$.
  • Set $K = 0$; $M = M/2$; $q = q - 1$.
• While $K < 2m - 1$:
  • Let $K = k_p 2^p + k_{p-1} 2^{p-1} + ... + k_1 2 + k_0$; set $j = k_0 2^p + k_1 2^{p-1} + ... + k_{p-1} 2 + k_p$.
  • If $j > K$ interchange $c_j$ and $c_K$.
  • Set $K = K + 1$.
• Set $a_0 = c_0 / m$; $a_m = \mathrm{Re}(\exp(-i \pi m) c_m / m)$.
• For $j = 1, ..., m - 1$ set $a_j = \mathrm{Re}(\exp(-i \pi j) c_j / m)$; $b_j = \mathrm{Im}(\exp(-i \pi j) c_j / m)$.

Exercise 12. Example 8.6.2 in B-F.
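The even/odd splitting behind the Fast Fourier Transform can be sketched recursively in Python. This is a recursive radix-2 variant written for illustration, not the in-place bit-reversal procedure listed in the algorithm above. It assumes $2m$ is a power of two and that the nodes are $x_j = -\pi + j\pi/m$; the names `fft_c`, `direct_c`, `interp_coeffs` and `eval_S` are hypothetical.

```python
import cmath
import math

def fft_c(y):
    """c_k = sum_j y_j exp(i k pi j / m) for len(y) = 2m, a power of two."""
    n = len(y)
    if n == 1:
        return [complex(y[0])]
    even = fft_c(y[0::2])          # sums over y_{2j}
    odd = fft_c(y[1::2])           # sums over y_{2j+1}
    half = n // 2
    c = [0j] * n
    for k in range(half):
        t = cmath.exp(1j * math.pi * k / half) * odd[k]
        c[k] = even[k] + t         # c_k
        c[k + half] = even[k] - t  # c_{m+k}
    return c

def direct_c(y):
    """Same coefficients by the direct O((2m)^2) sum, for comparison."""
    m = len(y) // 2
    return [sum(yj * cmath.exp(1j * k * math.pi * j / m)
                for j, yj in enumerate(y)) for k in range(len(y))]

def interp_coeffs(y):
    """a_k + i b_k = (-1)^k c_k / m; returns a[0..m], b[0..m-1] (b[0] unused)."""
    m = len(y) // 2
    c = fft_c(y)
    a = [((-1) ** k * c[k]).real / m for k in range(m + 1)]
    b = [((-1) ** k * c[k]).imag / m for k in range(m)]
    return a, b

def eval_S(x, a, b):
    """Evaluate the interpolatory polynomial S_m(x)."""
    m = len(a) - 1
    s = (a[0] + a[m] * math.cos(m * x)) / 2
    for k in range(1, m):
        s += a[k] * math.cos(k * x) + b[k] * math.sin(k * x)
    return s

# Hypothetical samples, 2m = 8 points.
y = [1.0, -2.0, 0.5, 3.0, 4.0, -1.5, 2.0, 0.25]
a, b = interp_coeffs(y)
print(a, b[1:])
```

Comparing `fft_c(y)` with `direct_c(y)`, and evaluating `eval_S` at the nodes, gives a quick check that the fast and direct routes agree and that $S_m$ reproduces the data.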