Session 8a
Approximation Theory
Ernesto Gutierrez-Miravete
Spring 2003
1 Introduction
There are two fundamental problems in numerical approximation theory. One is to produce faithful yet simple functional representations of given functions or actual data. In practice this is best done by representing the data (or function) in piecewise fashion, as in spline interpolation. The other is to create an optimal representation of data or a function by means of a single simpler function over the entire domain. This session focuses on the problem of developing optimal representations of given data or given functions.
2 Discrete Least Squares
Consider a situation involving a single independent variable x and an associated dependent
variable y. Let m pairs of values (points) $(x_i, y_i)$, $i = 1, 2, \ldots, m$ be given. Consider the approximation problem of finding the equation of the straight line which best represents the collection of points. It is not required that the line coincide with any of the points. One possible formulation is the minimax problem of determining the values of the parameters a and b such that
$$\min_{a,b} \max_{i=1,2,\ldots,m} |y_i - (a x_i + b)|$$
This is typically a difficult problem to solve. Alternatively, one can attempt to minimize the total absolute deviation
$$\min_{a,b} \sum_{i=1}^{m} |y_i - (a x_i + b)|$$
However, differentiability problems occur, since the absolute value function is not differentiable at zero.
A most convenient method for determining linear approximations is the least squares
procedure. Here, one must find the values of the parameters a and b which minimize the
sum of the squares of the errors
$$\min_{a,b} \sum_{i=1}^{m} (y_i - (a x_i + b))^2$$
Minimization requires taking derivatives of the sum of squared errors with respect to the parameters a and b and setting them to zero, giving the normal equations
$$a \sum_{i=1}^{m} x_i^2 + b \sum_{i=1}^{m} x_i = \sum_{i=1}^{m} x_i y_i$$
and
$$a \sum_{i=1}^{m} x_i + b\,m = \sum_{i=1}^{m} y_i$$
The solution of the normal equations is
$$a = \frac{m\left(\sum_{i=1}^{m} x_i y_i\right) - \left(\sum_{i=1}^{m} x_i\right)\left(\sum_{i=1}^{m} y_i\right)}{m\left(\sum_{i=1}^{m} x_i^2\right) - \left(\sum_{i=1}^{m} x_i\right)^2}$$
and
$$b = \frac{\left(\sum_{i=1}^{m} x_i^2\right)\left(\sum_{i=1}^{m} y_i\right) - \left(\sum_{i=1}^{m} x_i y_i\right)\left(\sum_{i=1}^{m} x_i\right)}{m\left(\sum_{i=1}^{m} x_i^2\right) - \left(\sum_{i=1}^{m} x_i\right)^2}$$
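These formulas translate directly into code. A minimal Python sketch (numpy and the sample data are illustrative assumptions, not from the original notes):

```python
import numpy as np

def linear_least_squares(x, y):
    """Fit y ~ a*x + b by evaluating the closed-form normal-equation solution."""
    m = len(x)
    sx, sy = np.sum(x), np.sum(y)
    sxx, sxy = np.sum(x * x), np.sum(x * y)
    denom = m * sxx - sx ** 2          # common denominator of a and b
    a = (m * sxy - sx * sy) / denom
    b = (sxx * sy - sxy * sx) / denom
    return a, b

# Hypothetical sample data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.3, 3.5, 4.2, 5.0, 7.0])
a, b = linear_least_squares(x, y)
print(f"y = {a:.4f} x + {b:.4f}")
```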
Exercise 1. Example 8.1.1 in B-F.
Instead of a linear function $y = ax + b$ one can use an algebraic polynomial of degree $n < m - 1$, $P_n(x) = \sum_{k=0}^{n} a_k x^k$. The constants $a_0, a_1, \ldots, a_n$ are chosen so as to minimize the least squared error. Again, taking derivatives and equating to zero one obtains the $n + 1$ generalized normal equations
$$\sum_{k=0}^{n} a_k \sum_{i=1}^{m} x_i^{j+k} = \sum_{i=1}^{m} x_i^{j} y_i, \qquad j = 0, 1, \ldots, n$$
These normal equations have a unique solution provided the $x_i$ are distinct.
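A brief sketch of this construction, assembling and solving the generalized normal equations directly; in practice a library routine such as numpy.polyfit solves the same problem with better numerical behavior:

```python
import numpy as np

def poly_least_squares(x, y, n):
    """Solve the (n+1) generalized normal equations for the coefficients a_0..a_n."""
    A = np.array([[np.sum(x ** (j + k)) for k in range(n + 1)]
                  for j in range(n + 1)])          # A[j,k] = sum_i x_i^(j+k)
    rhs = np.array([np.sum((x ** j) * y) for j in range(n + 1)])
    return np.linalg.solve(A, rhs)                 # coefficients a_0, ..., a_n
```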
Exercise 2. Example 8.1.2 in B-F.
Sometimes an exponential fitting function is more appropriate. Two possibilities are
$$y = b e^{ax}$$
and
$$y = b x^a$$
One can proceed as before (minimizing the squared errors with respect to the parameters of the fitting function). However, no closed-form solution of the resulting system of equations can be obtained. Instead, one can take logarithms to obtain
$$\ln y = \ln b + a x$$
and
$$\ln y = \ln b + a \ln x$$
One can then proceed with a linear model using $Y = \ln y$, $B = \ln b$, $A = a$ and $X = x$ or $X = \ln x$. However, the approximation thus obtained is not least squares in the original variables.
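A short sketch of the log-linear approach for $y = b e^{ax}$ (the function name is illustrative; np.polyfit performs the degree-1 fit):

```python
import numpy as np

def exp_fit_loglinear(x, y):
    """Fit y ~ b*exp(a*x) by linear least squares on ln(y) versus x.

    Note: this minimizes the squared error in ln(y), not in y itself.
    """
    a, ln_b = np.polyfit(x, np.log(y), 1)   # slope, intercept of the log-linear model
    return a, np.exp(ln_b)
```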
Exercise 3. Example 8.1.3 in B-F.
3 Orthogonal Polynomials and Least Squares
If, instead of data points, one must approximate a function, one can always use polynomials. Consider the problem of approximating a function $f(x) \in C[a, b]$ with a polynomial $P_n(x)$ of degree at most n. The squared error in this case is
$$\int_a^b (f(x) - P_n(x))^2\,dx$$
One must then find the set of coefficients $a_j$, $j = 0, 1, 2, \ldots, n$ which minimizes the value of the integral above. The corresponding normal equations in this case form an ill-conditioned system (for $[a, b] = [0, 1]$ the coefficient matrix is the Hilbert matrix):
$$\sum_{k=0}^{n} a_k \int_a^b x^{j+k}\,dx = \int_a^b x^j f(x)\,dx, \qquad j = 0, 1, \ldots, n$$
Further, having determined Pn does not facilitate obtaining Pn+1 .
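To see the ill-conditioning concretely, a small sketch that builds the normal-equation matrix for $[a, b] = [0, 1]$, which is exactly the Hilbert matrix, and prints its condition number:

```python
import numpy as np

# On [0, 1] the normal-equation entries are integral of x^(j+k) dx = 1/(j+k+1),
# i.e. the Hilbert matrix.
for n in (2, 5, 10):
    H = np.array([[1.0 / (j + k + 1) for k in range(n + 1)]
                  for j in range(n + 1)])
    print(n, np.linalg.cond(H))   # condition number grows explosively with n
```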
Exercise 4. Example 8.2.1 in B-F. Approximate $\sin(\pi x)$ on [0, 1] with a second-order algebraic polynomial.
Definition of Linear Independence of Functions. The set $\{\phi_0(x), \phi_1(x), \ldots, \phi_n(x)\}$ is linearly independent on [a, b] if
$$\sum_{j=0}^{n} c_j \phi_j(x) = 0 \quad \forall x \in [a, b]$$
implies $c_j = 0$ for all $j = 0, 1, 2, \ldots, n$.
Theorem of Linear Independence of Polynomials. If $\phi_j$ is a polynomial of degree j for each $j = 0, 1, 2, \ldots, n$, then the set $\{\phi_0(x), \phi_1(x), \ldots, \phi_n(x)\}$ is linearly independent on [a, b].
Theorem of Linear Combination. If $\{\phi_0(x), \phi_1(x), \ldots, \phi_n(x)\}$ is a collection of linearly independent polynomials from the (larger) set of all polynomials of degree at most n, then any polynomial of degree at most n can be written uniquely as a linear combination of the $\phi_j$'s.
Definition of Weight Function. To assign varying degrees of importance to the approximation error on different portions of the approximation interval, one uses a nonnegative, integrable weight function w(x).
Using a weight function and a linear combination of a linearly independent set of functions, the error to be minimized is now
$$E(a_0, a_1, \ldots, a_n) = \int_a^b w(x)\left(f(x) - \sum_{k=0}^{n} a_k \phi_k(x)\right)^2 dx$$
Minimization leads to the system of normal equations
$$\int_a^b w(x) f(x) \phi_j(x)\,dx = \sum_{k=0}^{n} a_k \int_a^b w(x) \phi_k(x) \phi_j(x)\,dx = a_j \int_a^b w(x) [\phi_j(x)]^2\,dx = a_j \alpha_j$$
for each $j = 0, 1, 2, \ldots, n$, so that
$$a_j = \frac{1}{\alpha_j} \int_a^b w(x) f(x) \phi_j(x)\,dx$$
where the second equality above uses the orthogonality property
$$\int_a^b w(x) \phi_k(x) \phi_j(x)\,dx = \begin{cases} \alpha_j > 0 & \text{if } j = k \\ 0 & \text{if } j \neq k \end{cases}$$
Definition of Orthogonality of Functions. A set of functions $\{\phi_0(x), \phi_1(x), \ldots, \phi_n(x)\}$ defined on [a, b] is orthogonal with respect to the weight function w(x) if
$$\int_a^b w(x) \phi_k(x) \phi_j(x)\,dx = \begin{cases} \alpha_j > 0 & \text{if } j = k \\ 0 & \text{if } j \neq k \end{cases}$$
If $\alpha_j = 1$ for every j, the set is orthonormal.
Theorem of Least Squares Approximation with Orthogonal Sets of Functions. If $\{\phi_0(x), \phi_1(x), \ldots, \phi_n(x)\}$ is an orthogonal set of functions on [a, b] with respect to the weight function w(x), the least squares approximation to f(x) with respect to w is
$$\sum_{k=0}^{n} a_k \phi_k(x)$$
with
$$a_k = \frac{\int_a^b w(x) \phi_k(x) f(x)\,dx}{\int_a^b w(x) [\phi_k(x)]^2\,dx} = \frac{1}{\alpha_k} \int_a^b w(x) \phi_k(x) f(x)\,dx$$
Theorem of Recursive Construction of Orthogonal Polynomials. The set of polynomial functions $\{\phi_0(x), \phi_1(x), \ldots, \phi_n(x)\}$ on [a, b] given by
$$\phi_0(x) = 1$$
$$\phi_1(x) = x - B_1 = x - \frac{\int_a^b x\,w(x)[\phi_0(x)]^2\,dx}{\int_a^b w(x)[\phi_0(x)]^2\,dx}$$
and, for $k \geq 2$,
$$\phi_k(x) = (x - B_k)\phi_{k-1}(x) - C_k\phi_{k-2}(x) = \left(x - \frac{\int_a^b x\,w(x)[\phi_{k-1}(x)]^2\,dx}{\int_a^b w(x)[\phi_{k-1}(x)]^2\,dx}\right)\phi_{k-1}(x) - \frac{\int_a^b x\,w(x)\phi_{k-1}(x)\phi_{k-2}(x)\,dx}{\int_a^b w(x)[\phi_{k-2}(x)]^2\,dx}\,\phi_{k-2}(x)$$
is orthogonal with respect to w. This provides a scheme to recursively compute orthogonal
polynomials.
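A minimal sketch of this recurrence for w(x) = 1 on [−1, 1], where it reproduces the monic Legendre polynomials; the quadrature grid and helper names are illustrative choices:

```python
import numpy as np

def orthogonal_polys(n, a=-1.0, b=1.0, w=lambda x: np.ones_like(x), npts=2001):
    """Build phi_0..phi_n by the three-term recurrence, as numpy polynomials."""
    x = np.linspace(a, b, npts)

    def integrate(vals):                      # simple composite trapezoid rule
        return np.trapz(vals, x)

    phis = [np.polynomial.Polynomial([1.0])]  # phi_0 = 1
    for k in range(1, n + 1):
        pk1 = phis[-1](x)                     # phi_{k-1} sampled on the grid
        Bk = integrate(x * w(x) * pk1**2) / integrate(w(x) * pk1**2)
        phi = np.polynomial.Polynomial([-Bk, 1.0]) * phis[-1]   # (x - B_k) phi_{k-1}
        if k >= 2:
            pk2 = phis[-2](x)
            Ck = integrate(x * w(x) * pk1 * pk2) / integrate(w(x) * pk2**2)
            phi = phi - Ck * phis[-2]
        phis.append(phi)
    return phis
```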
Exercise 5. Example 8.2.3 in B-F. Legendre polynomials.
4 Chebyshev Polynomials
The Chebyshev polynomials $\{T_n(x)\}$ are an orthogonal set on [−1, 1] with respect to $w(x) = (1 - x^2)^{-1/2}$. They satisfy
$$T_0(x) = 1, \qquad T_1(x) = x$$
and
$$T_{n+1}(x) = 2x T_n(x) - T_{n-1}(x)$$
for each $n \geq 1$.
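A small sketch evaluating $T_n(x)$ directly from this recurrence:

```python
import numpy as np

def chebyshev_T(n, x):
    """Evaluate the Chebyshev polynomial T_n(x) via the three-term recurrence."""
    x = np.asarray(x, dtype=float)
    t_prev, t = np.ones_like(x), x            # T_0 and T_1
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t = t, 2.0 * x * t - t_prev   # T_{k+1} = 2x T_k - T_{k-1}
    return t

x = np.linspace(-1.0, 1.0, 5)
print(chebyshev_T(3, x))   # T_3(x) = 4x^3 - 3x
```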
Theorem of Zeroes of Chebyshev Polynomials. $T_n(x)$ has n simple zeros in [−1, 1] at
$$x_k^* = \cos\left(\frac{2k - 1}{2n}\pi\right), \qquad k = 1, 2, \ldots, n$$
Further, $T_n(x)$ has absolute extrema at
$$x_k' = \cos\left(\frac{k\pi}{n}\right), \qquad k = 0, 1, \ldots, n$$
The monic Chebyshev polynomials $T_n^m(x)$ are defined as
$$T_0^m(x) = 1$$
and
$$T_n^m(x) = \frac{1}{2^{n-1}} T_n(x)$$
for each n ≥ 1.
Theorem of Bound. The monic Chebyshev polynomials on [−1, 1] have the property
$$\frac{1}{2^{n-1}} = \max_{x \in [-1,1]} |T_n^m(x)| \leq \max_{x \in [-1,1]} |P_n^m(x)|$$
where $P_n^m$ is any monic algebraic polynomial of degree n.
The optimal location of the interpolating nodes $x_k$, $k = 0, 1, \ldots, n$ when using Lagrange interpolation on [−1, 1] is
$$x_k = \cos\left(\frac{2k + 1}{2(n + 1)}\pi\right)$$
i.e. the $(k + 1)$th zero of $T_{n+1}(x)$.
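A brief sketch computing these nodes and interpolating at them; scipy's BarycentricInterpolator stands in here for an explicit Lagrange form and is an implementation choice, not part of the original notes:

```python
import numpy as np
from scipy.interpolate import BarycentricInterpolator

def chebyshev_nodes(n):
    """Zeros of T_{n+1}: the optimal Lagrange interpolation nodes on [-1, 1]."""
    k = np.arange(n + 1)
    return np.cos((2 * k + 1) / (2 * (n + 1)) * np.pi)

f = lambda x: np.exp(x)
nodes = chebyshev_nodes(5)
p = BarycentricInterpolator(nodes, f(nodes))   # degree-5 interpolant
xs = np.linspace(-1, 1, 7)
print(np.max(np.abs(f(xs) - p(xs))))           # small maximum error
```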
Exercise 6. Example 8.3.1 in B-F.
Chebyshev polynomials can also be used to produce approximating polynomials of reduced order with minimum loss of accuracy.
Exercise 7. Example 8.3.2 in B-F.
5 Rational Function Approximation
Rational function approximations spread approximation errors more evenly over the approximation interval. They can also produce better approximations to discontinuous functions.
A rational function r(x) of degree N is of the form
$$r(x) = \frac{p(x)}{q(x)} = \frac{p_0 + p_1 x + \cdots + p_n x^n}{q_0 + q_1 x + \cdots + q_m x^m}$$
where p(x) and q(x) are algebraic polynomials of degrees n and m, respectively, such that $n + m = N$. If $q_0 = 1$, then $N + 1$ parameters $(q_1, q_2, \ldots, q_m, p_0, p_1, \ldots, p_n)$ must be determined when approximating f(x) with r(x).
Pade’ approximation selects the parameters such that $f^{(k)}(0) = r^{(k)}(0)$ for each $k = 0, 1, \ldots, N$ (or, equivalently, such that $f - r$ has a zero of multiplicity $N + 1$ at $x = 0$). This leads to the system of simultaneous linear algebraic equations
$$\sum_{i=0}^{k} a_i q_{k-i} = p_k, \qquad k = 0, 1, \ldots, N$$
for the $N + 1$ unknown parameters. Here the $a_i$'s are the coefficients of the Maclaurin expansion of f(x).
Maple can be used to compute Pade approximations. First one computes the Maclaurin series and then converts it into the Pade approximation; for the [3, 2] approximation the command is g := convert(series(f(x), x), ratpoly, 3, 2).
Exercise 8. Example 8.4.1 in B-F.
Pade Rational Approximation Algorithm.
• Give the values of m and n and the function f(x); set $N = m + n$ and let $a_0, a_1, \ldots, a_N$ be the coefficients of its Maclaurin polynomial.
• Set $q_0 = 1$; $p_0 = a_0$.
• Set up the linear system. For $i = 1, 2, \ldots, N$: for $j = 1, 2, \ldots, i - 1$, if $j \leq n$ set $b_{i,j} = 0$; if $i \leq n$ set $b_{i,i} = 1$; for $j = i + 1, i + 2, \ldots, N$ set $b_{i,j} = 0$; for $j = 1, 2, \ldots, i$, if $j \leq m$ set $b_{i,n+j} = -a_{i-j}$; for $j = n + i + 1, n + i + 2, \ldots, N$ set $b_{i,j} = 0$; set $b_{i,N+1} = a_i$.
• Solve the linear system by Gauss elimination with partial pivoting.
• Output $q_i$ for $i = 0, 1, \ldots, m$ and $p_i$ for $i = 0, 1, \ldots, n$.
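A compact sketch of the same construction, solving the Pade conditions $\sum_{i=0}^{k} a_i q_{k-i} = p_k$ directly as one linear system rather than assembling the $b_{i,j}$ table; the function name and example are illustrative:

```python
import numpy as np
from math import factorial

def pade(a, n, m):
    """Compute [n/m] Pade coefficients from Maclaurin coefficients a_0..a_{n+m}.

    Enforces sum_{i=0..k} a_i q_{k-i} = p_k for k = 0..N with q_0 = 1,
    noting that p_k = 0 for k > n. Unknowns: [q_1..q_m, p_0..p_n].
    """
    N = n + m
    A = np.zeros((N + 1, N + 1))
    rhs = np.zeros(N + 1)
    for k in range(N + 1):
        for j in range(1, m + 1):      # coefficients multiplying q_1..q_m
            if k - j >= 0:
                A[k, j - 1] = a[k - j]
        if k <= n:
            A[k, m + k] = -1.0         # coefficient multiplying p_k
        rhs[k] = -a[k]
    sol = np.linalg.solve(A, rhs)
    q = np.concatenate(([1.0], sol[:m]))
    p = sol[m:]
    return p, q

# Example: f(x) = exp(x), so a_k = 1/k!; compute the [2/3] Pade approximation.
a = np.array([1.0 / factorial(k) for k in range(6)])
p, q = pade(a, 2, 3)
```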
To improve accuracy, Chebyshev polynomials can be used instead of algebraic polynomials in both r(x) and f(x), i.e.
$$r(x) = \frac{\sum_{k=0}^{n} p_k T_k(x)}{\sum_{k=0}^{m} q_k T_k(x)}$$
and
$$f(x) = \sum_{k=0}^{\infty} a_k T_k(x)$$
Exercise 9. Example 8.4.2 in B-F.
Chebyshev Rational Approximation Algorithm
• Give the values of m and n and the function f(x).
• Set $N = m + n$.
• Set $a_0 = \frac{1}{\pi} \int_0^{\pi} f(\cos\theta)\,d\theta$.
• For $k = 1, 2, \ldots, N + m$ set $a_k = \frac{2}{\pi} \int_0^{\pi} f(\cos\theta)\cos(k\theta)\,d\theta$.
• Set $q_0 = 1$.
• For $i = 0, 1, \ldots, N$ set up the linear system.
• Solve the linear system by Gauss elimination with partial pivoting.
• Output $q_i$ for $i = 0, 1, \ldots, m$ and $p_i$ for $i = 0, 1, \ldots, n$.
In Maple the Chebyshev polynomials are made available by first calling the library with(orthopoly, T); the Chebyshev rational approximation is then computed by a command of the form g := convert(numapprox[chebyshev](f(x), x), ratpoly, 3, 2).
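For readers without Maple, a short numpy sketch that approximates the Chebyshev series coefficients $a_k$ by evaluating the theta-integrals above with the trapezoid rule:

```python
import numpy as np

def chebyshev_coeffs(f, K, npts=2001):
    """Approximate a_0..a_K in f(x) ~ sum a_k T_k(x) via the theta integrals."""
    theta = np.linspace(0.0, np.pi, npts)
    g = f(np.cos(theta))
    a = np.empty(K + 1)
    a[0] = np.trapz(g, theta) / np.pi                        # a_0 = (1/pi) integral
    for k in range(1, K + 1):
        a[k] = 2.0 / np.pi * np.trapz(g * np.cos(k * theta), theta)
    return a

a = chebyshev_coeffs(np.exp, 5)
```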
6 Trigonometric Function Approximation
Trigonometric functions can produce excellent representations of arbitrary functions. It is well known that the set of functions $\{\phi_0(x), \phi_1(x), \ldots, \phi_{2n-1}(x)\}$ given by
$$\phi_0(x) = \frac{1}{2}$$
$$\phi_k(x) = \cos(kx), \qquad k = 1, 2, \ldots, n$$
$$\phi_{n+k}(x) = \sin(kx), \qquad k = 1, 2, \ldots, n - 1$$
is orthogonal on [−π, π] with respect to $w(x) = 1$. The set of all linear combinations of the functions $\phi_0, \phi_1, \ldots, \phi_{2n-1}$ is the set of trigonometric polynomials of degree less than or equal to n.
The least squares trigonometric approximation to a given function $f(x) \in C[-\pi, \pi]$, obtained from the set of trigonometric polynomials of degree less than or equal to n, is given by
$$S_n(x) = \frac{a_0}{2} + a_n \cos(nx) + \sum_{k=1}^{n-1} [a_k \cos(kx) + b_k \sin(kx)]$$
where the coefficients are determined from
$$a_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos(kx)\,dx, \qquad k = 0, 1, 2, \ldots, n$$
and
$$b_k = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin(kx)\,dx, \qquad k = 1, 2, \ldots, n - 1$$
Recall that $\lim_{n \to \infty} S_n(x)$ is the Fourier series of f(x). For finite n we have the truncated Fourier series.
Exercise 10. Example 8.5.1 in B-F. Approximate f(x) = |x| on [−π, π] with $S_0(x)$, $S_1(x)$, $S_2(x)$ and $S_3(x)$.
In many applications where one has to analyze large quantities of data, the discrete analog of the above is often quite useful. Let $\{(x_j, y_j)\}_{j=0}^{2m-1}$ be a collection of 2m paired data points with the $x_j$ equally spaced. One wants to find the trigonometric polynomial of degree n,
$$S_n(x) = \frac{a_0}{2} + a_n \cos(nx) + \sum_{k=1}^{n-1} [a_k \cos(kx) + b_k \sin(kx)]$$
which minimizes the sum of the squares of the errors over the points, $E = \sum_{j=0}^{2m-1} (y_j - S_n(x_j))^2$.
One proceeds as before, minimizing with respect to the parameters and taking advantage of the fact that the trigonometric polynomials are also orthogonal in the discrete case. The optimal coefficients are determined by the Theorem of Constants Minimizing the LS Sum and are given by
$$a_k = \frac{1}{m} \sum_{j=0}^{2m-1} y_j \cos(k x_j), \qquad k = 0, 1, 2, \ldots, n$$
and
$$b_k = \frac{1}{m} \sum_{j=0}^{2m-1} y_j \sin(k x_j), \qquad k = 1, 2, \ldots, n - 1$$
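A minimal sketch of the discrete fit; the abscissas are taken as $x_j = -\pi + j\pi/m$, one common convention for equally spaced points on $[-\pi, \pi)$ (an assumption here):

```python
import numpy as np

def discrete_trig_ls(y, n):
    """Coefficients of the degree-n least squares trig polynomial on 2m points."""
    m = len(y) // 2
    j = np.arange(2 * m)
    xj = -np.pi + j * np.pi / m                       # equally spaced on [-pi, pi)
    a = np.array([np.sum(y * np.cos(k * xj)) / m for k in range(n + 1)])
    b = np.array([np.sum(y * np.sin(k * xj)) / m for k in range(1, n)])
    return a, b, xj
```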
Exercise 11. Example 8.5.2 in B-F. For $f(x) = x^4 - 3x^3 + 2x^2 - \tan(x[x - 2])$ with $x \in [0, 2]$, select 10 equally spaced points and determine $S_3(x)$. First transform the domain [0, 2] into [−π, π] by means of the change of variable $z = \pi(x - 1)$.
The least squares trigonometric polynomial of order n = 3 is
$$S_3(z) = \frac{a_0}{2} + a_3 \cos(3z) + \sum_{k=1}^{2} [a_k \cos(kz) + b_k \sin(kz)]$$
where
$$a_k = \frac{1}{5} \sum_{j=0}^{9} f\left(1 + \frac{z_j}{\pi}\right) \cos(k z_j), \qquad k = 0, 1, 2, 3$$
and
$$b_k = \frac{1}{5} \sum_{j=0}^{9} f\left(1 + \frac{z_j}{\pi}\right) \sin(k z_j), \qquad k = 1, 2$$
7 Fast Fourier Transforms
Besides the least squares polynomial of degree n on a set of 2m data points $\{(x_j, y_j)\}_{j=0}^{2m-1}$ described above, we are also interested in the interpolatory trigonometric polynomial of order m on those data points. The reason for the interest is that very accurate results are produced when the interpolating polynomial is used on large numbers of equally spaced data points. Direct calculation of the polynomial requires $O((2m)^2)$ operations. The Fast Fourier Transform algorithm is a very convenient computational procedure based on selecting m appropriately, such that it can be factored into powers of two. This choice reduces the number of required operations to $O(m \log_2 m)$.
If one demands that the coefficients of the interpolatory polynomial agree with those of the discrete least squares polynomial, $S_m(x)$ is given by
$$S_m(x) = \frac{a_0 + a_m \cos(mx)}{2} + \sum_{k=1}^{m-1} [a_k \cos(kx) + b_k \sin(kx)]$$
with
$$a_k = \frac{1}{m} \sum_{j=0}^{2m-1} y_j \cos(k x_j), \qquad k = 0, 1, 2, \ldots, m$$
and
$$b_k = \frac{1}{m} \sum_{j=0}^{2m-1} y_j \sin(k x_j), \qquad k = 1, 2, \ldots, m - 1$$
Interpolating 2m points with $S_m(x)$ above directly requires about $(2m)^2$ multiplications/divisions and $(2m)^2$ additions/subtractions. The Fast Fourier Transform algorithm can do it with $O(m \log_2 m)$ of each by calculating the coefficients in clusters in the complex domain.
The complex coefficients $c_k$ are
$$c_k = \sum_{j=0}^{2m-1} y_j \exp\left(\frac{i k \pi j}{m}\right) = \frac{m}{(-1)^k} (a_k + i b_k)$$
for each $k = 0, 1, 2, \ldots, 2m - 1$. Note that computation of all the $c_k$'s directly from this expression requires $O((2m)^2)$ operations.
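For reference, the direct $O((2m)^2)$ evaluation of the $c_k$'s is a short loop; this is the baseline cost the FFT improves upon:

```python
import numpy as np

def coeffs_direct(y):
    """Naive O((2m)^2) computation of c_k = sum_j y_j exp(i*k*pi*j/m)."""
    two_m = len(y)
    m = two_m // 2
    j = np.arange(two_m)
    return np.array([np.sum(y * np.exp(1j * k * np.pi * j / m))
                     for k in range(two_m)])
```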
Now select the value of m such that $m = 2^p$ where $p > 0$ is an integer. Thus, for each k,
$$c_k + c_{m+k} = \sum_{j=0}^{2m-1} y_j \exp\left(\frac{i k \pi j}{m}\right) + \sum_{j=0}^{2m-1} y_j \exp\left(\frac{i (m + k) \pi j}{m}\right) = \sum_{j=0}^{2m-1} y_j \exp\left(\frac{i k \pi j}{m}\right) (1 + \exp(i \pi j))$$
Since $1 + \exp(i \pi j) = 2$ when j is even and zero otherwise, one can replace the dummy summation index j by 2j and write
$$c_k + c_{m+k} = 2 \sum_{j=0}^{m-1} y_{2j} \exp\left(\frac{i k \pi j}{m/2}\right)$$
Similarly,
$$c_k - c_{m+k} = 2 \exp\left(\frac{i k \pi}{m}\right) \sum_{j=0}^{m-1} y_{2j+1} \exp\left(\frac{i k \pi j}{m/2}\right)$$
These two expressions allow determination of all the $c_k$'s and reduce the number of multiplications required to $2m^2 + m$. However, each of the above sums can in turn be replaced by two sums from $j = 0$ to $j = (m/2) - 1$, and as a result the number of required multiplications becomes $m^2 + 2m$. Repeating this breakdown process r times leads to a required number of multiplications of $m^2/2^{r-2} + mr$. When $r = p + 1$ the process is complete and the number of multiplications is of order $m \log_2 m$.
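Carried out recursively, this even/odd splitting is the radix-2 FFT. A minimal recursive sketch follows (the in-place, bit-reversal formulation appears in the algorithm below); applied to the 2m values $y_j$ it returns exactly the $c_k$'s defined above, since $\exp(2\pi i k j/(2m)) = \exp(i k \pi j/m)$:

```python
import numpy as np

def fft_radix2(y):
    """Recursive radix-2 FFT: c_k = sum_j y_j exp(2*pi*i*k*j/N), len(y) a power of 2."""
    N = len(y)
    if N == 1:
        return np.asarray(y, dtype=complex)
    even = fft_radix2(y[0::2])                 # transform of even-indexed samples
    odd = fft_radix2(y[1::2])                  # transform of odd-indexed samples
    twiddle = np.exp(2j * np.pi * np.arange(N // 2) / N)
    # Butterfly: X_k = E_k + w^k O_k, X_{k+N/2} = E_k - w^k O_k
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])
```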
Exercise 12. Example 8.6.1 in B-F.
Fast Fourier Transform Algorithm.
• Give m, p, the uniformly spaced $x_j \in [-\pi, \pi]$, and the $y_j$.
• Set $M = m$; $q = p$; $\xi = \exp(i\pi/m)$.
• For $j = 0, 1, \ldots, 2m - 1$ set $c_j = y_j$.
• For $j = 1, 2, \ldots, M$ set $\zeta_j = \xi^j$; $\zeta_{j+M} = -\zeta_j$.
• Set $K = 0$; $\zeta_0 = 1$.
• For $L = 1, 2, \ldots, p + 1$, while $K < 2m - 1$, for $j = 1, 2, \ldots, M$: let $K = k_p(2^p) + k_{p-1}(2^{p-1}) + \cdots + k_1(2) + k_0$ be the binary expansion of K; set $K_1 = K/2^q = k_p 2^{p-q} + \cdots + k_{q+1}(2) + k_q$ and $K_2 = k_q 2^p + k_{q+1} 2^{p-1} + \cdots + k_p 2^q$.
• Set $\eta = c_{K+M}\,\zeta_{K_2}$; $c_{K+M} = c_K - \eta$; $c_K = c_K + \eta$.
• Set $K = K + 1$.
• Set $K = K + M$.
• Set $K = 0$; $M = M/2$; $q = q - 1$.
• While $K < 2m - 1$: let $K = k_p(2^p) + k_{p-1}(2^{p-1}) + \cdots + k_1(2) + k_0$; set $j = k_0(2^p) + k_1(2^{p-1}) + \cdots + k_{p-1}(2) + k_p$ (the bit reversal of K).
• If $j > K$ interchange $c_j$ and $c_K$.
• Set $K = K + 1$.
• Set $a_0 = c_0/m$; $a_m = \mathrm{Re}(e^{-i\pi m} c_m/m)$.
• For $j = 1, \ldots, m - 1$ set $a_j = \mathrm{Re}(e^{-i\pi j} c_j/m)$; $b_j = \mathrm{Im}(e^{-i\pi j} c_j/m)$.
Exercise 13. Example 8.6.2 in B-F.