Six Myths of Polynomial Interpolation and Quadrature
Lloyd N. Trefethen FRS, FIMA, Oxford University
It is a pleasure to offer this essay for Mathematics Today as a record of my Summer Lecture on 29 June 2011 at the Royal Society.

Polynomials are as basic a topic as any in mathematics, and for numerical mathematicians like me, they are the starting point of numerical methods that in some cases go back centuries, like quadrature formulae for numerical integration and Newton iterations for finding roots. You would think that by now, the basic facts about computing with polynomials would be widely understood. In fact, the situation is almost the reverse. There are indeed widespread views about polynomials, but some of the important ones are wrong, founded on misconceptions entrenched by generations of textbooks.

Since 2006, my colleagues and I have been solving mathematical problems with polynomials using the Chebfun software system (www.maths.ox.ac.uk/chebfun). We have learned from daily experience how fast and reliable polynomials are. This entirely positive record has made me curious to try to pin down where these misconceptions come from, and this essay is an attempt to summarise some of my findings. Full details, including precise definitions and theorems, can be found in my draft book Approximation Theory and Approximation Practice, available at www.maths.ox.ac.uk/~trefethen.

The essay is organised around 'six myths'. Each myth has some truth in it – mathematicians rarely say things that are simply false! Yet each one misses something important.

Throughout the discussion, f is a continuous function defined on the interval [−1, 1], n + 1 distinct points x0, ..., xn in [−1, 1] are given, and pn is the unique polynomial of degree at most n with pn(xj) = f(xj) for each j. Two families of points will be of particular interest: equispaced points, xj = −1 + 2j/n, and Chebyshev points, xj = cos(jπ/n). I will also mention Legendre points, defined as the zeros of the degree n + 1 Legendre polynomial Pn+1.

The discussion of each myth begins with two or three representative quotations from leading textbooks, listed anonymously with the year of publication. Then I say a word about the mathematical truth underlying the myth, and after that, what that truth overlooks.

Myth 1. Polynomial interpolants diverge as n → ∞

Textbooks regularly warn students not to expect pn → f as n → ∞.

'Polynomial interpolants rarely converge to a general continuous function.' (1989)

'Unfortunately, there are functions for which interpolation at the Chebyshev points fails to converge.' (1996)

On the face of it, this caution is justified by two theorems. Weierstrass proved in 1885 [1] that any continuous function can be approximated arbitrarily closely by polynomials. On the other hand, Faber proved in 1914 [2] that no polynomial interpolation scheme, no matter how the points are distributed, will converge for all such functions.

So it sounds as if there is something wrong with polynomial interpolation. Yet the truth is, polynomial interpolants in Chebyshev points always converge if f is a little bit smooth. (We shall call them Chebyshev interpolants.) Lipschitz continuity is more than enough, that is, |f(x) − f(y)| ≤ L|x − y| for some constant L and all x, y ∈ [−1, 1]. So long as f is Lipschitz continuous, as it will be in almost any practical application, pn → f is guaranteed.

There is indeed a big problem with convergence of polynomial interpolants, but it pertains to interpolation in equispaced points. As Runge showed in 1901 [3], equispaced interpolants may diverge exponentially, even if f is so smooth as to be analytic (holomorphic). This genuinely important fact, known as the Runge phenomenon, has confused people. With Faber's theorem seeming to provide justification, a real problem with equispaced polynomial interpolants has been overgeneralised, so that people have suspected it of applying to polynomial interpolants in general.

The smoother f is, the faster its Chebyshev interpolants converge. If f has ν derivatives, with the νth derivative being of bounded variation V, then ‖f − pn‖ = O(V n^−ν) as n → ∞. (By ‖f − pn‖ I mean the maximum of |f(x) − pn(x)| for x ∈ [−1, 1].) If f is analytic, the convergence is geometric, with ‖f − pn‖ = O(ρ^−n) for some ρ > 1. I will give details about the parameter ρ under Myth 5.

For example, here is the degree 10,000 Chebyshev interpolant pn to the sawtooth function f(x) defined as the integral from −1 to x of sign(sin(100t/(2 − t))). This curve may not look like a polynomial, but it is! With ‖f − pn‖ ≈ 0.0001, the plot is indistinguishable from a plot of f itself.

[Figure: the degree 10,000 Chebyshev interpolant to the sawtooth function f.]

There is not much use in polynomial interpolants to functions with so little smoothness as this, but mathematically they are trouble-free. For smoother functions like e^x, cos(10x) or 1/(1 + 25x^2), we get convergence to 15 digits of accuracy for small values of n (14, 34 and 182, respectively).
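Here is a minimal sketch (my own, not from the article) of how one might check those degrees, assuming the Chebfun system mentioned earlier is installed and on the MATLAB path. A chebfun object adaptively chooses the polynomial degree needed to represent a function to about 15 digits, so its length, which is one more than the degree, should come out close to the values quoted above.

f1 = chebfun(@(x) exp(x));
f2 = chebfun(@(x) cos(10*x));
f3 = chebfun(@(x) 1./(1 + 25*x.^2));
disp([length(f1) length(f2) length(f3)])   % expect roughly 15, 35 and 183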
Myth 2. Evaluating polynomial interpolants numerically is problematic

Interpolants in Chebyshev points may converge in theory, but aren't there problems on a computer? Textbooks warn students about this.

'Polynomial interpolation has drawbacks in addition to those of global convergence. The determination and evaluation of interpolating polynomials of high degree can be too time-consuming and can also lead to difficult problems associated with roundoff error.' (1977)

'Although Lagrangian interpolation is sometimes useful in theoretical investigations, it is rarely used in practical computations.' (1985)

'Interpolation is a notoriously tricky problem from the point of view of numerical stability.' (1990)

The origin of this view is the fact that some of the methods one might naturally try for evaluating polynomial interpolants are slow or numerically unstable or both. For example, if you write down the Lagrange interpolation formula in its most obvious form and implement it on a computer as written, you get an algorithm that requires O(n^2) work per evaluation point. (Partly because of this, books warn readers that they should use Newton interpolation formulae rather than Lagrange – another myth.) And if you compute interpolants 'linear algebra style', by setting up a Vandermonde matrix whose columns contain samples of 1, x, x^2, ..., x^n at the grid points, your numerical method is exponentially unstable. This is the 'polyval/polyfit' algorithm of Matlab, and I am guilty of propagating Myth 2 myself by using this algorithm in my textbook Spectral Methods in Matlab [4]. Rounding errors on a computer destroy all accuracy of this method even for n = 60, let alone n = 10,000 as in the plot above.

Or how about n = 1,000,000? Here is a plot of the polynomial interpolant to f(x) = sin(10/x) in a million Chebyshev points. The plot was obtained in about 30 seconds on my laptop by evaluating the interpolant at 2,000 points clustered near zero.

[Figure: the polynomial interpolant to f(x) = sin(10/x) in a million Chebyshev points.]

The fast and stable algorithm that makes these calculations possible comes from a representation of the Lagrange interpolant known as the barycentric formula, published by Salzer in 1972 [5]:

pn(x) = [ Σ′_{j=0}^{n} (−1)^j fj/(x − xj) ] / [ Σ′_{j=0}^{n} (−1)^j/(x − xj) ],

with the special case pn(x) = fj if x = xj. The primes on the summation signs signify that the terms j = 0 and j = n are multiplied by 1/2. The work required is O(n) per evaluation point, and though the divisions by x − xj may look dangerous for x ≈ xj, the formula is numerically stable, as was proved by Nick Higham in 2004 [6].
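To make the formula concrete, here is a minimal MATLAB sketch of barycentric evaluation in Chebyshev points (my own illustration, not code from the article; the function name is arbitrary).

function p = baryeval(fj, x)
% Evaluate the Chebyshev interpolant of the samples fj = f(xj) at the points x,
% using Salzer's barycentric formula. The nodes are xj = cos(j*pi/n), j = 0..n,
% and the weights are (-1)^j, halved for j = 0 and j = n.
fj = fj(:);                                   % column vector of samples
n = length(fj) - 1;
xj = cos(pi*(0:n)'/n);                        % Chebyshev points
w = (-1).^(0:n)';  w([1 n+1]) = w([1 n+1])/2; % barycentric weights
p = zeros(size(x));
for i = 1:numel(x)
    d = x(i) - xj;
    k = find(d == 0, 1);
    if ~isempty(k)
        p(i) = fj(k);                         % special case x = xj
    else
        p(i) = sum(w.*fj./d) / sum(w./d);
    end
end

For example, with fj = exp(cos(pi*(0:20)'/20)), baryeval(fj, linspace(-1,1,1000)) evaluates the degree 20 Chebyshev interpolant of e^x, which agrees with exp essentially to machine precision.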
Myth 3. Best approximations are optimal

This one sounds true by definition!

'Since the Remes algorithm, or indeed any other algorithm for producing genuine best approximations, requires rather extensive computations, some interest attaches to other more convenient procedures to give good, if not optimal, polynomial approximations.' (1968)

'Minimal polynomial approximations are clearly suitable for use in function evaluation routines, where it is advantageous to use as few terms as possible in an approximation.' (1968)

'Ideally, we would want a best uniform approximation.' (1980)

Though the statement of this myth looks like a tautology, there is content in it. The 'best approximation' is a common name for the unique polynomial p∗n that minimises ‖f − p∗n‖. So a best approximant is optimal in the maximum norm, but is it really the best in practice?

As the first quotation suggests, computing p∗n is not a trivial matter, since the dependence on f is nonlinear. By contrast, computing a Chebyshev interpolant with the barycentric formula is easy. Here our prejudices about value for money begin to intrude. If best approximations are hard to compute, they must be valuable!

Two considerations make the truth not so simple. First of all, the maximum-norm accuracy difference between Chebyshev interpolants and best approximations can never be large, for Ehlich and Zeller proved in 1966 [7] that ‖f − pn‖ cannot exceed ‖f − p∗n‖ by more than the factor 2 + (2/π) log(n + 1). Usually the difference is less than that, and in fact, the best known error bounds for functions that have ν derivatives or are analytic, the two smoothness classes mentioned under Myth 1, are just a factor of 2 larger for Chebyshev interpolants as for best approximations.

Secondly, according to the equioscillation theorem going back to Chebyshev in the 1850s, the maximum error of a best approximation is always attained at at least n + 2 points in [−1, 1]. For example, the black curve below is the error curve f(x) − p∗n(x), x ∈ [−1, 1], for the best approximant to |x − 1/4| of degree 100, equioscillating between 102 extreme values ≈ ±0.0027. The red curve corresponds to the polynomial interpolant pn in Chebyshev points. It is clear that for most values of x, |f(x) − pn(x)| is much smaller than |f(x) − p∗n(x)|. Which approximation would be more useful in an application? I think the only reasonable answer is, it depends. Sometimes one really does need a guarantee about worst-case behaviour. In other situations, it would be wasteful to sacrifice so much accuracy over 95% of the range just to gain one bit of accuracy in a small subinterval.

[Figure: error curves for the degree 100 best approximation (black) and Chebyshev interpolant (red) to |x − 1/4|.]
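For a feeling of how slowly the Ehlich–Zeller factor mentioned above grows, here is a trivial check of my own (not from the article):

% The factor by which a Chebyshev interpolant can be worse than the best
% approximation, 2 + (2/pi)*log(n+1), grows only logarithmically with n.
for n = [10 100 1000 1e6]
    fprintf('n = %8d   factor = %5.2f\n', n, 2 + (2/pi)*log(n+1))
end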
Myth 4. Gauss quadrature has twice the order of accuracy of Clenshaw–Curtis

'However, the degree of accuracy for Clenshaw–Curtis quadrature is only n − 1.' (1997)

'Clenshaw–Curtis rules are not optimal in that the degree of an n-point rule is only n − 1, which is well below the maximum possible.' (2002)

Quadrature formulae are usually derived from polynomials. We approximate an integral by a finite sum,

I = ∫_{−1}^{1} f(x) dx ≈ In = Σ_{k=0}^{n} wk f(xk),   (1)

and the weights {wk} are determined by the principle of interpolating f by a polynomial at the points {xk} and integrating the interpolant. Newton–Cotes quadrature corresponds to equispaced points, Clenshaw–Curtis quadrature to Chebyshev points, and Gauss quadrature to Legendre points. Almost every textbook first describes Newton–Cotes, which achieves In = I exactly if f is a polynomial of degree n, and then shows that Gauss has twice this order of exactness: In = I if f is a polynomial of degree 2n + 1. Clenshaw–Curtis is occasionally mentioned, but it only has order of exactness n, no better than Newton–Cotes.
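As a small illustration of that principle (a sketch of my own, not from the article), the weights of an interpolatory rule on given nodes can be computed, for modest n, by requiring the rule to integrate 1, x, ..., x^n exactly. With Chebyshev points as the nodes, the result is the Clenshaw–Curtis rule; for large n one would instead use the FFT-based or Glaser–Liu–Rokhlin constructions mentioned below, since this Vandermonde system becomes badly conditioned.

% Interpolatory quadrature weights on [-1,1] for a given set of nodes:
% impose sum_k w(k)*x(k)^j = integral of x^j over [-1,1] for j = 0..n.
% (Illustrative only: for large n this Vandermonde system is ill-conditioned.)
n = 8;
x = cos(pi*(0:n)'/n);                  % Chebyshev points -> Clenshaw-Curtis weights
m = zeros(n+1,1);  V = zeros(n+1);
for j = 0:n
    m(j+1) = (1 - (-1)^(j+1))/(j+1);   % moment: 2/(j+1) for even j, 0 for odd j
    V(j+1,:) = (x.').^j;               % j-th exactness condition on the weights
end
w = V \ m;                             % quadrature weights
abs(w.'*exp(x) - (exp(1) - exp(-1)))   % test the rule on f(x) = exp(x)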
This ubiquitous emphasis on order of exactness is misleading. Textbooks suggest that the reason Gauss quadrature gives better results than Newton–Cotes is its higher order of exactness, but this is not correct. The problem with Newton–Cotes is that the sample points are equally spaced: the Runge phenomenon. In fact, as Pólya proved in 1933 [8], Newton–Cotes quadrature does not converge as n → ∞, in general, even if f is analytic.

Clenshaw–Curtis and Gauss quadratures behave entirely differently. Both schemes converge for all continuous integrands, and if the integrand is analytic, the convergence is geometric. Clenshaw–Curtis is easy to implement, using either the Fast Fourier Transform or an algorithm of Waldvogel in 2006 [9], and one reason it gets little attention may be that Gauss quadrature had a big head start, invented in 1814 [10] instead of 1960 [11]. Both Clenshaw–Curtis and Gauss quadrature are practical even if n is in the millions, in the latter case because nodes and weights can be calculated by an algorithm implemented in Chebfun due to Glaser, Liu and Rokhlin in 2007 [12]. (Indeed, since Glaser–Liu–Rokhlin, it is another myth to imagine that Gauss quadrature is only practicable for small values of n.)

With twice the order of exactness, we would expect Gauss quadrature to converge twice as fast as Clenshaw–Curtis. Yet it does not. Unless f is analytic in a large region surrounding [−1, 1] in the complex plane, one typically finds that Clenshaw–Curtis quadrature converges at about the same rate as Gauss, as illustrated by these curves for f(x) = exp(−1/x^2):

[Figure: quadrature error versus degree n for Clenshaw–Curtis and Gauss applied to f(x) = exp(−1/x^2).]
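The experiment behind curves like these is easy to repeat. Below is a sketch under the assumption that Chebfun is on the path: legpts and chebpts are Chebfun commands returning Gauss (Legendre) nodes and weights and Chebyshev points with Clenshaw–Curtis weights respectively, and MATLAB's quadgk supplies a reference value for the integral.

% Compare Gauss and Clenshaw-Curtis quadrature for f(x) = exp(-1/x^2).
% Assumes Chebfun is installed (legpts and chebpts are Chebfun commands).
f = @(x) exp(-1./x.^2);
I = quadgk(f, -1, 1, 'AbsTol', 1e-15, 'RelTol', 3e-14);     % reference value
for n = 10:10:100
    [xg, wg] = legpts(n);    errGauss = abs(wg*f(xg) - I);  % Gauss
    [xc, wc] = chebpts(n);   errCC    = abs(wc*f(xc) - I);  % Clenshaw-Curtis
    fprintf('n = %3d   Gauss %.1e   Clenshaw-Curtis %.1e\n', n, errGauss, errCC)
end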
A theorem of mine from 2008 makes this observation precise. If f has a νth derivative of bounded variation V, Gauss quadrature can be shown to converge at the rate O(V(2n)^−ν), the factor of 2 reflecting its doubled order of exactness. The theorem asserts that the same rate O(V(2n)^−ν), with the same factor of 2, is also achieved by Clenshaw–Curtis. (Folkmar Bornemann (private communication) has pointed out that both of these rates can probably be improved by one further power of n.)

The explanation for this surprising result goes back to O'Hara and Smith in 1968 [13]. It is true that (n + 1)-point Gauss quadrature integrates the Chebyshev polynomials Tn+1, Tn+2, ... exactly whereas Clenshaw–Curtis does not. However, the error that Clenshaw–Curtis makes consists of aliasing them to the Chebyshev polynomials Tn−1, Tn−2, ... and integrating these correctly. As it happens, the integrals of Tn+k and Tn−k differ by only O(n^−3), and that is why Clenshaw–Curtis is more accurate than its order of exactness seems to suggest.

Myth 5. Gauss quadrature is optimal

Gauss quadrature may not be much better than Clenshaw–Curtis, but at least it would appear to be as accurate as possible, the gold standard of quadrature formulae.

'The precision is maximised when the quadrature is Gaussian.' (1982)

'In fact, it can be shown that among all rules using n function evaluations, the n-point Gaussian rule is likely to produce the most accurate estimate.' (1989)

There is another misconception here, quite different from the one just discussed as Myth 4. Gauss quadrature is optimal as measured by polynomial order of exactness, but that is a skewed measure. The power of polynomials is nonuniform: they have greater resolution near the ends of an interval than in the middle. Suppose, for example, that f(x) is an analytic function on [−1, 1], which means it can be analytically continued into a neighbourhood of [−1, 1] in the complex x-plane. Then polynomial approximants to f, whether Chebyshev or Legendre interpolants or best approximants, will converge at a geometric rate O(ρ^−n) determined by how close any singularities of f in the plane are to [−1, 1]. To be precise, ρ is the sum of the semiminor and semimajor axis lengths of the largest ellipse with foci ±1 inside which f is analytic and bounded. But ellipses are narrower near the ends than in the middle. If f has a singularity at x0 = iε for some small ε, then we get O(ρ^−n) convergence with ρ ≈ 1 + ε. If f has a singularity at 1 + ε, on the other hand, the parameter becomes ρ ≈ 1 + √(2ε), corresponding to much faster convergence. A function with a singularity at 1.01 converges 14 times faster than one with a singularity at 0.01i.
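To spell out the arithmetic behind that factor of 14 (a worked check of my own, using the ellipse characterisation of ρ given above): an ellipse with foci ±1, semimajor axis a and semiminor axis b satisfies a^2 − b^2 = 1. A singularity at 0.01i lies on the ellipse with b = 0.01 and a = √(1 + 0.0001) ≈ 1.00005, so ρ = a + b ≈ 1.01005. A singularity at 1.01 lies on the ellipse with a = 1.01 and b = √(1.01^2 − 1) = √0.0201 ≈ 0.1418, so ρ ≈ 1.1518. Since the errors behave like ρ^−n, the rates compare as log(1.1518)/log(1.01005) ≈ 14.1, which is the factor of 14 quoted above.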
Quadrature rules generated by polynomials, including both Gauss and Clenshaw–Curtis, show the same nonuniformity. This might seem unavoidable, but in fact, there is no reason why a quadrature formula (1) needs to be derived from polynomials. By introducing a change of variables, one can generate alternative formulae based on interpolation by transplanted polynomials, which may converge up to π/2 times faster than Gauss or Clenshaw–Curtis quadrature for many functions. This idea was developed in a paper of mine with Nick Hale in 2008 [14] and is related to earlier work by Kosloff and Tal-Ezer in 1993 [15].

The following theorem applies to one of the transformed methods Hale and I proposed. Let f be a function that is analytic in an ε-neighbourhood of [−1, 1] for ε ≤ 0.05. Then whereas Gauss quadrature converges at the rate In − I = O((1 + ε)^−2n), transformed Gauss quadrature converges 50% faster, In − I = O((1 + ε)^−3n). Here is an illustration for f(x) = 1/(1 + 25x^2).

[Figure: quadrature error versus degree n for Gauss and transformed Gauss quadrature applied to f(x) = 1/(1 + 25x^2).]

The fact that some quadrature formulae converge up to π/2 times faster than Gauss as n → ∞ is probably not of much practical importance. The importance is conceptual.

Myth 6. Polynomial rootfinding is dangerous

Our final myth originates with Jim Wilkinson (1919–1986), a hero of mine who taught me two courses in graduate school at Stanford. Working with Alan Turing on the Pilot Ace computer in 1950, Wilkinson found that attempts to compute roots of even some low-degree polynomials failed dramatically. He publicised this discovery widely.

'Our main object in this chapter has been to focus attention on the severe inherent limitations of all numerical methods for finding the zeros of polynomials.' (1963)

'Beware: Some polynomials are ill-conditioned!' (1992)

The first of these quotations comes from Wilkinson's book on rounding errors [16], and he also coined the memorable phrase 'the perfidious polynomial' as the title of a 1984 article that won the Chauvenet Prize for outstanding mathematical exposition [17].

What Wilkinson discovered was the extreme ill-conditioning of roots of certain polynomials as functions of their coefficients. Specifically, suppose a polynomial pn is specified by its coefficients in the form a0 + a1x + · · · + anx^n. If pn has roots near the unit circle in the complex plane, these pose no difficulties: they are well-conditioned functions of the coefficients ak and can be computed accurately by Matlab's 'roots' command, based on the calculation of eigenvalues of a companion matrix containing the coefficients. Roots far from the circle, however, such as roots on the interval [−1, 1], can be so ill-conditioned as to be effectively uncomputable. The monomials x^k form exponentially bad bases for polynomials on [−1, 1].

The flaw in the argument is that it says nothing about the condition of roots of polynomials as functions of their values. For effective rootfinding on [−1, 1] based on pointwise samples, all one must do is fix the basis: replace the monomials x^k, which are orthogonal polynomials on the unit circle, by the Chebyshev polynomials Tk(x), which are orthogonal on the interval. Suppose a polynomial pn is specified by its coefficients in the form a0T0(x) + a1T1(x) + · · · + anTn(x). If pn has roots near [−1, 1], these are well-conditioned functions of the coefficients ak, and they can be computed accurately by solving an eigenvalue problem involving a 'colleague matrix'. The details were worked out by Specht in 1957 [18] and Good in 1961 [19].
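As a concrete sketch of that construction (my own illustration; the function name is arbitrary), the colleague matrix can be assembled directly from the three-term recurrence xT0 = T1, xTk = (Tk−1 + Tk+1)/2, with Tn reduced modulo pn:

function r = chebroots(a)
% Roots of pn(x) = a(1)*T0(x) + a(2)*T1(x) + ... + a(n+1)*Tn(x) via the
% eigenvalues of the colleague matrix (Specht 1957, Good 1961).
% Assumes n >= 2 and a(n+1) ~= 0.
a = a(:).';                            % row vector of Chebyshev coefficients
n = length(a) - 1;
C = zeros(n);
C(1,2) = 1;                            % x*T0 = T1
for k = 2:n-1
    C(k,k-1) = 1/2;  C(k,k+1) = 1/2;   % x*T_{k-1} = (T_{k-2} + T_k)/2
end
C(n,n-1) = C(n,n-1) + 1/2;             % T_{n-2} contribution in the last row
C(n,:) = C(n,:) - a(1:n)/(2*a(n+1));   % replace T_n by its expression modulo pn
r = eig(C);

For instance, chebroots([0 0 0 1]) returns the roots 0 and ±√3/2 of T3(x). Chebfun's rootfinder, described next, rests on the same eigenvalue problem.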
Chebfun finds roots of a function f on [−1, 1] by approximating it by a polynomial expressed in Chebyshev form and then solving a colleague-matrix eigenvalue problem, and if the degree is greater than 100, first subdividing the interval recursively to reduce it. These ideas originate with John Boyd in 2002 [20] and are extraordinarily effective. Far from being exceptionally troublesome, polynomial rootfinding when posed in this fashion begins to emerge as the most tractable of all rootfinding problems, for we can solve the problem globally with just O(n^2) work to get all the roots in an interval to high accuracy.

For example, the function f(x) = sin(1000πx) on [−1, 1] is represented in Chebfun by a polynomial of degree 4091. It takes 2 seconds on my laptop to find all 2001 of its roots in [−1, 1], and the maximum deviation from the exact values is 4.4 × 10^−16.

Here is another illustration of the robustness of polynomial rootfinding on an interval. In Chebfun, we have plotted the function f(x) = exp(x/2)(sin(5x) + sin(101x)) and then executed the commands r = roots(f-round(f)), plot(r,f(r),'.'). This sequence solves a collection of hundreds of polynomial rootfinding problems to locate all the points where f takes a value equal to an integer or a half-integer, and plots them as dots. The computation took 2/3 of a second.

[Figure: the function f(x) = exp(x/2)(sin(5x) + sin(101x)) with dots marking the points where it takes integer or half-integer values.]
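Written out in full, the experiment looks like this (a sketch assuming Chebfun is installed; only the two commands quoted above come from the article, the rest is glue):

% Reproduce the experiment described above, assuming Chebfun is on the path.
f = chebfun(@(x) exp(x/2).*(sin(5*x) + sin(101*x)));  % polynomial representation of f
r = roots(f - round(f));     % points where f equals an integer or a half-integer
plot(f), hold on
plot(r, f(r), '.'), hold off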
Conclusion

Perhaps I might close by mentioning another perspective on the misconceptions that have affected the study of computation with polynomials. By the change of variables x = cos θ, one can show that interpolation by polynomials in Chebyshev points is equivalent to interpolation of periodic functions by series of sines and cosines in equispaced points. The latter is the subject of discrete Fourier analysis, and one cannot help noting that whereas there is widespread suspicion that it is not safe to compute with polynomials, nobody worries about the Fast Fourier Transform! In the end this may be the biggest difference between Fourier and polynomial interpolants, the difference in their reputations.

And here's a bonus, free of charge.

Myth 7. Lagrange discovered Lagrange interpolation

It was Waring, in the Philosophical Transactions of the Royal Society in 1779 [21]. Euler used the formula in 1783, and Lagrange in 1795.
References
1 Weierstrass, K. (1885) Über die analytische Darstellbarkeit sogenannter willkürlicher Funktionen einer reellen Veränderlichen, Sitzungsberichte der Akademie zu Berlin, pp. 633–639 and 789–805.
2 Faber, G. (1914) Über die interpolatorische Darstellung stetiger Funktionen, Jahresber. Deutsch. Math. Verein., vol. 23, pp. 190–210.
3 Runge, C. (1901) Über empirische Funktionen und die Interpolation zwischen äquidistanten Ordinaten, Z. Math. Phys., vol. 46, pp. 224–243.
4 Trefethen, L.N. (2000) Spectral Methods in MATLAB, SIAM, Philadelphia.
5 Salzer, H.E. (1972) Lagrangian interpolation at the Chebyshev points xn,v ≡ cos(vπ/n), v = 0(1)n; some unnoted advantages, Comp. J., vol. 15, pp. 156–159.
6 Higham, N.J. (2004) The numerical stability of barycentric Lagrange interpolation, IMA J. Numer. Anal., vol. 24, no. 4, pp. 547–556.
7 Ehlich, H. and Zeller, K. (1966) Auswertung der Normen von Interpolationsoperatoren, Math. Ann., vol. 164, pp. 105–112.
8 Pólya, G. (1933) Über die Konvergenz von Quadraturverfahren, Math. Z., vol. 37, pp. 264–286.
9 Waldvogel, J. (2006) Fast construction of the Fejér and Clenshaw–Curtis quadrature rules, BIT Num. Math., vol. 46, no. 1, pp. 195–202.
10 Gauss, C.F. (1814) Methodus nova integralium valores per approximationem inveniendi, Comment. Soc. R. Scient. Göttingensis Rec., vol. 3, pp. 39–76.
11 Clenshaw, C.W. and Curtis, A.R. (1960) A method for numerical integration on an automatic computer, Numer. Math., vol. 2, pp. 197–205.
12 Glaser, A., Liu, X. and Rokhlin, V. (2007) A fast algorithm for the calculation of the roots of special functions, SIAM J. Sci. Comp., vol. 29, no. 4, pp. 1420–1438.
13 O'Hara, H. and Smith, F.J. (1968) Error estimation in the Clenshaw–Curtis quadrature formula, Comp. J., vol. 11, pp. 213–219.
14 Hale, N. and Trefethen, L.N. (2008) New quadrature formulas from conformal maps, SIAM J. Numer. Anal., vol. 46, pp. 930–948.
15 Kosloff, D. and Tal-Ezer, H. (1993) A modified Chebyshev pseudospectral method with an O(N^−1) time step restriction, J. Comp. Phys., vol. 104, pp. 457–469.
16 Wilkinson, J.H. (1994) Rounding Errors in Algebraic Processes (Prentice-Hall Series in Automatic Computation), Dover Publications.
17 Wilkinson, J.H. (1984) The perfidious polynomial, MAA Stud. Num. Anal.
18 Specht, W. (1957) Die Lage der Nullstellen eines Polynoms. III, Math. Nachr., vol. 16, pp. 363–389.
19 Good, I.J. (1961) The colleague matrix, a Chebyshev analogue of the companion matrix, Quart. J. Math., vol. 12, pp. 61–68.
20 Boyd, J.P. (2002) Computing zeros on a real interval through Chebyshev expansion and polynomial rootfinding, SIAM J. Numer. Anal., vol. 40, no. 5, pp. 1666–1682.
21 Waring, E. (1779) Problems concerning interpolations, Phil. Trans., vol. 69, pp. 59–67.
Urban Maths: Virtual Unreality
A. Townie

Driving through an unfamiliar city centre recently, with its poorly signposted routes and confusing one-way systems, I became increasingly frustrated and completely lost. Sometimes I went with the flow and took the direction being followed by the majority of the traffic; at other times I deliberately avoided such directions. I began to feel like a particle randomly diffusing through an incomprehensible network of roads! When I eventually reached my destination and regained my equilibrium I realised the process I'd followed through the streets reminded me of a technique for solving linear networks that I've only come across in the relatively recent past.

Imagine an electrical circuit comprising a number of interconnected electrical resistors. Focus on a particular interconnection node and the nodes and resistors to which it is directly connected. For example, let's look at a segment of the circuit where node, m, say, might be surrounded by nodes, q, r and s, to which it is connected via resistors R1, R2 and R3, as in Figure 1.

[Figure 1: Segment of electrical network – node m joined to nodes q, r and s through resistors R1, R2 and R3, carrying currents i1, i2 and i3.]

We start conventionally by analysing this part of the circuit using Kirchhoff's current law and Ohm's law. Kirchhoff's current law says that the sum of the electrical currents out of any interconnection node must be zero (this is basically a statement of the conservation of electric charge). So, with currents i1, i2 and i3 from Figure 1 we have:

i1 + i2 + i3 = 0.   (1)

Ohm's law relates the voltage difference across a resistor to the current flowing through it and the value of the resistance. Using Vx to represent the voltage at node x, we have for the three branches of Figure 1:

Vm − Vq = i1 R1,   Vm − Vr = i2 R2,   Vm − Vs = i3 R3.   (2)

Divide each of the equations in (2) by its value of resistance, sum the resulting equations and make use of equation (1) to obtain, after