Applied Numerical Mathematics 61 (2011) 443–459

Numerical experiments on the condition number of the interpolation matrices for radial basis functions

John P. Boyd*, Kenneth W. Gildersleeve
Department of Atmospheric, Oceanic and Space Science, University of Michigan, 2455 Hayward Avenue, Ann Arbor, MI 48109, United States

Article history: received 9 July 2009; received in revised form 4 September 2010; accepted 11 November 2010; available online 1 December 2010.

Keywords: radial basis functions; matrix condition number; interpolation

Abstract: Through numerical experiments, we examine the condition numbers of the interpolation matrix for many species of radial basis functions (RBFs), mostly on uniform grids. For most RBF species that give infinite-order accuracy when interpolating smooth f(x), namely Gaussians, sechs and Inverse Quadratics, the condition number κ(α, N) rapidly asymptotes to a limit κ_asymp(α) that is independent of N and depends only on α, the inverse width relative to the grid spacing. Multiquadrics are an exception in that the condition number for fixed α grows as N². For all four, there is growth proportional to an exponential of 1/α (of 1/α² for Gaussians). For splines and thin-plate splines, which contain no width parameter, the condition number grows asymptotically as a power of N, with a larger power as the order of the RBF increases. Random grids typically increase the condition number (for fixed RBF width) by orders of magnitude. The quasi-random, low-discrepancy Halton grid may, however, have a lower condition number than a uniform grid of the same size.

© 2010 IMACS. Published by Elsevier B.V. All rights reserved.
1. Introduction

Radial basis functions have proved very useful in computer graphics and neural networks [18,11,49,31,10,45] and are growing in popularity for solving partial differential equations [4,27,16,17,14,19,20,1,12,30,32–38,40,46,48,50,21,22]. One serious impediment is that when the RBFs are wide compared to the average grid spacing h, the RBF interpolation matrix is very ill-conditioned. This is very unfortunate because it is known that RBF accuracy generally increases in the "flat" limit and is unreservedly awful in the opposite limit of narrow RBFs. This has inspired some theoretical effort to bound the condition number κ of the interpolation matrix. However, it is very difficult to obtain sharp bounds, and the estimates cataloged in Wendland's book [49] are wildly pessimistic, at least for a uniform grid. In this article, we have taken the direct approach of numerical experimentation.

Radial basis functions always contain, either explicitly or implicitly, a width parameter. We employ a "relative" width parameter α, which scales the width to be proportional to the grid spacing h as the grid density varies. In the limit N → ∞ for fixed α, where N is the number of grid points, which is dubbed the "stationary" limit in [18], the matrix condition number κ(N, α) asymptotes to a function κ_asymp(α) which is a function only of α and the RBF species, or to such a function multiplied by a power of N. For some RBFs, it is possible to make highly plausible conjectures about the analytical form of κ_asymp(α). The so-called "saturation error", which is the finite error that remains even when the grid spacing goes to zero for some RBF species, is known analytically, and turns out to give good clues to condition numbers, too [18].

* Corresponding author. E-mail addresses: [email protected] (J.P. Boyd), [email protected] (K.W. Gildersleeve).
doi:10.1016/j.apnum.2010.11.009
Table 1
Radial basis functions: definitions and polynomial augmentation.

| Name | Abbreviation | φ(r) | Polynomial part |
|---|---|---|---|
| Multiquadrics | MQ | √(1 + ε²r²) | None |
| Generalized MQ | GMQ | (1 + ε²r²)^(m/2) | Degree (m − 2) |
| Inverse multiquadrics | IMQ | 1/√(1 + ε²r²) | None |
| Generalized IMQ | GIMQ | (ε²r² + 1)^(−m/2) | None |
| Inverse quadratic | IQ | 1/(1 + ε²r²) | None |
| Gaussians | GA | exp(−ε²r²) | None |
| Sech | SH | sech(εr) | None |
| Thin plate splines | TPS2 | r² log(r) | Linear |
| Thin plate splines | TPS4 | r⁴ log(r) | Quadratic |
| Thin plate splines | TPS | r^(2m) log(r) | Degree m |
| Cubic | MN3 | r³ | Linear |
| Quintic | MN5 | r⁵ | Quadratic |
| Monomial | MN | r^(2m+1) | Degree m |

Note: ε is the "absolute inverse width", a user-choosable parameter often replaced in applications by the "relative inverse width" α, where ε = α/h with h the average grid spacing.

Radial basis function approximation is very simple: in any number of dimensions d,

f(x) ≈ f_RBF(x; N) ≡ Σ_{j=1}^{N} λ_j φ([α/h] ‖x − x_j‖₂),  x ∈ ℝ^d   (1)

for some function φ(r) and some set of N points x_j, which are called the "centers". Here h is the grid spacing (for a uniform grid) or the average grid spacing (for a non-uniform grid) and α is the "relative inverse width parameter". Many species of φ(r) are collected in Table 1. Because the basis functions are not orthogonal, and therefore Fourier-type orthogonality integrals are not an option, the RBF coefficients λ_j are usually found by interpolation at a set of points y_k that may or may not coincide with the centers. For simplicity, we shall discuss only coincident centers and interpolation points here. Similarly, although it is possible (and indeed desirable) to vary the RBF width on a non-uniform grid, we shall for simplicity assume that the RBF width is the same for all functions in the basis. The interpolation conditions are

f(x_j) = f_RBF(x_j; N),  j = 1, 2, . . . , N   (2)

These can be organized into a matrix system.
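The interpolation system above is short enough to sketch directly. The following Python fragment is our own illustration (the helper names are ours, not the paper's; the paper's computations were done in Maple): it builds the matrix of Eqs. (1)–(2) for the inverse quadratic RBF of Table 1 on a uniform one-dimensional grid and solves for the coefficients λ_j.

```python
import numpy as np

def rbf_interpolate(f, N, alpha, phi):
    """Illustrative helper (not from the paper): 1D RBF interpolation on a
    uniform grid over [-1, 1] with coincident centers and interpolation
    points, i.e. solve V @ lam = f(x_j) as in Eqs. (1)-(2)."""
    x = np.linspace(-1.0, 1.0, N)          # grid points = centers
    h = 2.0 / (N - 1)                      # uniform grid spacing
    r = np.abs(x[:, None] - x[None, :])    # pairwise distances |x_i - x_j|
    V = phi((alpha / h) * r)               # interpolation matrix
    lam = np.linalg.solve(V, f(x))         # enforce f(x_j) = f_RBF(x_j; N)
    return x, lam, V

def rbf_eval(xe, x, lam, alpha, phi):
    """Evaluate the interpolant f_RBF(xe; N) of Eq. (1) at a point xe."""
    h = 2.0 / (len(x) - 1)
    return np.sum(lam * phi((alpha / h) * np.abs(xe - x)))

phi_iq = lambda r: 1.0 / (1.0 + r * r)     # inverse quadratic (IQ), Table 1
x, lam, V = rbf_interpolate(np.cos, 30, alpha=1.0, phi=phi_iq)
```

For α = 1 the matrix is well conditioned, so ordinary double precision suffices here; the severe ill-conditioning that is the subject of this paper emerges as α shrinks.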
The interpolation matrix, which is also known as the "generalized Vandermonde matrix", is a symmetric matrix (for coincident centers and interpolation points) with the elements

V_ij ≡ φ([α/h] ‖x_i − x_j‖₂)   (3)

On a uniform grid with spacing h, the elements of the interpolation matrix are independent of h and depend only on α:

V_ij ≡ φ(α ‖z_i − z_j‖₂)   (4)

where the z_i = x_i/h are the points on a grid rescaled to have unit grid spacing. Consequently, the condition number of the matrix is a function only of N, the size of the matrix, and of α. The condition number κ(α; N) used here is the default in Maple:

κ = ‖V‖_∞ ‖V⁻¹‖_∞   (5)

However, we found little difference from κ_s, the ratio of the largest to the smallest singular value, which is the default in Matlab. The significance of the condition number is a mandatory topic in linear algebra texts such as [47], so we shall note only that the condition number is a worst-case measure of the growth of roundoff error when factoring the matrix. Each floating point operation incurs a roundoff error on the order of magnitude of "machine epsilon", 2 × 10⁻¹⁶ for systems conforming to the IEEE 754 64-bit standard, such as Matlab. Estimating roundoff effects by the ratio of singular values implicitly assumes that the mode of smallest singular value appears in the inhomogeneous term of a matrix equation with the same magnitude as the mode of largest singular value. This is possible, but uncommon in applications; the loss of digits due to roundoff is usually smaller than log₁₀(κ) in practice. Still, solving a matrix equation with a condition number larger than the reciprocal of machine epsilon is like jumping off a cliff; one may land softly, or one may be very sorry.

Table 2
Maple code to compute κ(α, N) for IQ RBFs.
  with(LinearAlgebra):
  Digits := 25:
  alpha := evalf(1/8): ncase := 20:
  for jN from 1 by 1 to ncase do
    N := 20 + 20*(jN - 1); h := 2/(N - 1);
    epsilon := evalf(alpha/h);
    for j from 1 by 1 to N do
      xgrid[j] := evalf(-1 + 2*(j - 1)/(N - 1));
    od:
    for ii from 1 by 1 to N do
      for j from 1 by 1 to N do
        rsq := evalf((xgrid[ii] - xgrid[j])^2);
        G[ii, j] := evalf(1/(1 + epsilon*epsilon*rsq));  # change RBF species in this line
      od:
    od:
    GM := Matrix(N, N, G);
    kappaa[jN] := ConditionNumber(GM);
    print(kappaa[jN]);
  od:

Fig. 1. Left: contours of the condition number κ for one-dimensional Gaussian RBFs on a uniform grid. Right: a dashed line schematically divides the regime where κ is independent of N from the regime where the contours of κ curve and the condition number obeys a power law in the limit α → 0 for fixed N, i.e., κ is proportional to a power of α with an N-dependent exponent.

For radial basis function methods, it is far better to choose α sufficiently large so that the condition number is not too big, or to use alternative strategies that (at extra cost) bypass the interpolation matrix [25,24]. To avoid contamination of our results by roundoff errors, all computations of condition number in this paper, with the exception of those in Fig. 12, were performed in variable precision arithmetic in Maple. The "Digits" parameter was increased until the numbers were independent of "Digits". At the suggestion of a reviewer, and also because of the extreme brevity of the code, the Maple statements to compute the condition number for a typical species are given in Table 2.

2. Review of bounds and theorems

Baxter gives optimal condition number bounds for uniform grids for many RBF species [5]. Baxter and Sivakumar [7] provide condition number estimates when the centers of the Gaussians are shifted by an arbitrary constant χ.
They show that the interpolation matrix is singular when χ is a half-integral multiple of the grid spacing; the condition number is minimized when the interpolation points and centers coincide. Ball, Narcowich and Ward, Schaback and others have proved bounds and theorems for irregular grids [2,39,43,44]. This work is reviewed in the books by Buhmann [11], Wendland [49] and Fasshauer [18]. We shall not reproduce the book chapters here. There is a proverb that asserts "pure mathematics is about what is provable; applied mathematics is about what is". The gap is especially large for condition number estimates in that the theorems in Wendland give bounds for Gaussians on an irregular grid whose logarithm is forty times that of Baxter's much smaller, tighter bound for a uniform grid.

In the rest of this article, we shall try to experimentally clarify the condition numbers of RBF interpolation matrices. Bounds are fine, but asymptotics are better. Fig. 1 shows that there are two regimes and two interesting limits. Fornberg and Zuev [26] discuss condition numbers in the limit α → 0 for fixed N and show a power-law dependence: κ ∼ α^σ(N). In this article, we analyze the opposite limit, the vertical limit in the figure, of N → ∞ for fixed α. However, our analytic approximations are accurate over the entire range of α that one would likely use in practice.

Fig. 2. The matrix condition number for the interpolation matrix of Gaussian RBFs on a one-dimensional uniform grid for five different values of α. The curves are the experimental condition numbers; the horizontal dashed lines are the asymptotes as N → ∞.

Fig. 3. The actual condition number for large N [N = 200] (solid) compared with the approximation κ ∼ (1/2) exp(π²/(4α²)) (dashed). Note the horizontal axis is α².
3. Gaussian RBFs on a one-dimensional uniform grid

On a uniform one-dimensional grid of spacing h, Gaussian RBFs are of the form

φ(x; α, h) = exp(−[α²/h²] x²)   (6)

We can normalize the interpolation interval to x ∈ [−1, 1] without loss of generality; the uniform grid spacing is h = 2/(N − 1) and the grid points are x_j = −1 + (j − 1)h. (This h and grid will be used throughout the rest of the paper.) The elements of the interpolation matrix are

V_ij(α) = exp(−α²(i − j)²)   (7)

We analyzed the condition number in several stages. First, we graphed κ(α, N) versus N for several α. This showed that the condition number is independent of matrix size N for large N, as illustrated in Fig. 2. Second, we plotted the condition number for a fixed large N versus α and versus α². The near-linear behavior of the second plot allowed us to deduce that κ_asymp(α) ≈ (1/2) exp(π²/(4α²)) (Fig. 3). This was not a blind guess; the saturation error [18] for Gaussian RBFs is known from theory to be proportional to exp(−π²/(4α²)) on a uniform grid [9], and Baxter has proved tight bounds of this form [6].

Fig. 4. Plot of the difference between κ_asymp and κ(N) for α = 2/5, plotted using a linear scale for N (left) and a logarithmic scale (right). The linearity of the plot on the left shows that κ(N) asymptotes exponentially fast to κ_asymp.

Third, we plotted the difference from the asymptote as a function of N, the matrix size, on both log–linear and log–log scales. If the decay is a power law,

κ(∞) − κ(N) ∼ p N^r   (8)

where p and r are constants, then

log(κ(∞) − κ(N)) ∼ log(p) + r log(N)   (9)

This means that on a log–log plot, κ(∞) − κ(N) will asymptote as N → ∞ to a straight line with slope r.
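The Gaussian matrix of Eq. (7) is easy to rebuild in ordinary double precision for a moderate α, where κ stays far below the reciprocal of machine epsilon. The check below is our own illustration (the paper's computations used variable-precision Maple): it compares the computed condition number against the deduced asymptote (1/2) exp(π²/(4α²)).

```python
import numpy as np

# Gaussian interpolation matrix of Eq. (7): V_ij = exp(-alpha^2 (i - j)^2).
# For alpha = 1/2 the asymptote (1/2) exp(pi^2 / (4 alpha^2)) is roughly
# 9.7e3, small enough that double precision is adequate for this check.
alpha, N = 0.5, 120
i = np.arange(N)
V = np.exp(-(alpha**2) * (i[:, None] - i[None, :]) ** 2)

kappa_inf = np.linalg.cond(V, np.inf)   # infinity-norm kappa, Eq. (5) (Maple default)
kappa_svd = np.linalg.cond(V)           # singular-value ratio (Matlab default)
kappa_anal = 0.5 * np.exp(np.pi**2 / (4 * alpha**2))
```

Both definitions of κ land within a modest factor of κ_anal even at this modest N, consistent with Figs. 2–3 and with the observation that the two condition-number definitions differ little.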
If the decay is exponential,

κ(∞) − κ(N) ∼ p exp(−qN)   (10)

where p and q > 0 are constants, then

log(κ(∞) − κ(N)) ∼ log(p) − qN   (11)

This means that on a log–linear plot, κ(∞) − κ(N) will asymptote as N → ∞ to a straight line with slope −q. Fig. 4 shows that for Gaussian RBFs, the rate of decay is linear on a log–linear plot, but has an ever-decreasing slope on a log–log plot. This suggests that the rate of decay is exponential. There is no theoretical reason why the behavior for other types of RBFs is necessarily the same, so this analysis must be repeated for each species of RBF.

Fourth, to estimate the asymptote more precisely, we used Prony's method to determine the parameters (κ_asymp, p, q) in an approximation of the form

κ(N) ≈ κ_asymp + p exp(−qN)   (12)

Although Prony's method for fitting a series of exponentials is well known, the special case of a constant plus a single exponential has not been listed in any of the references we consulted, so we have placed the appropriate formulas in Appendix A. When Prony's method was applied to successive trios from our computed sequence of κ(N), computed at every third integer value of N, we found that the sequences for κ_asymp, p and q were decaying oscillations. Averaging adjacent members of these fitted sequences greatly accelerated convergence; a simpler strategy was to fit three successive computed values of even N, omitting the odd N. This yielded Fig. 5, which shows the relative difference between the Prony fit to κ(N) and the analytic approximation,

κ_anal^Gauss(α) ≡ (1/2) exp(π²/(4α²))   (13)

Although some fluctuations due to roundoff are visible at the upper limit of N in the graph, the fact that the Prony-fitted values converge to κ_anal to within better than ten decimal places strongly suggests that the analytic approximation is in fact the exact asymptote.
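Appendix A is not reproduced here, but the three-point constant-plus-single-exponential fit is easy to reconstruct. The algebra below is our own sketch, not necessarily the paper's exact Appendix A formulas: the successive differences of three equally spaced samples form a geometric sequence whose ratio gives q, and summing the geometric tail gives the asymptote.

```python
import numpy as np

def prony_const_plus_exp(N0, dN, k0, k1, k2):
    """Fit kappa(N) = kappa_asymp + p*exp(-q*N), Eq. (12), through three
    equally spaced samples kappa(N0), kappa(N0 + dN), kappa(N0 + 2*dN).

    d1 = k1 - k0 and d2 = k2 - k1 satisfy d2/d1 = exp(-q*dN), which gives q;
    summing the geometric tail gives kappa_asymp = k0 + d1^2/(d1 - d2)."""
    d1, d2 = k1 - k0, k2 - k1
    q = -np.log(d2 / d1) / dN
    kappa_asymp = k0 + d1 * d1 / (d1 - d2)
    p = (k0 - kappa_asymp) * np.exp(q * N0)   # recover p from the first sample
    return kappa_asymp, p, q
```

Applied to trios such as {κ(N − 12), κ(N − 6), κ(N)}, this is the kind of fit that generates each plotted point of Fig. 5; the fit is exact when the data follow Eq. (12) exactly.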
Because this formula was derived by numerical experimentation and not by a deductive proof, we shall regard the statement

lim_{N→∞} κ(α, N) = κ_anal^Gauss(α)   (14)

as a highly probable conjecture. Remarkably, our analytic formula is identical to Baxter's tight bound [6], though we were unaware of his result until after the completion of this part of our work, and the justifications were completely different.

Fig. 5. Plot of (κ_asymp − κ_anal)/κ_anal for α = 2/5. The horizontal axis denotes the value of N for the trio {κ(N − 12), κ(N − 6), κ(N)} that was Prony-fitted to generate the plotted point.

Fig. 6. The matrix condition number for the interpolation matrix of Inverse Quadratic (IQ) RBFs on a one-dimensional uniform grid for five different values of α. The curves are the experimental condition numbers; the horizontal dashed lines are the asymptotes predicted by our analytic formula as N → ∞.

4. Inverse quadratics

IQ RBFs are defined by

φ(r; α, h) = 1/(1 + [α/h]² r²)   (15)

Like Gaussians, inverse quadratic RBFs can yield an exponential rate of convergence if the function f(x) being approximated is analytic on the approximation interval. Fig. 6 shows that, just as for Gaussians, the condition number κ(N) asymptotes to an α-dependent number as N → ∞ for fixed α. It is known that the saturation error for IQ functions is proportional to exp(−π/α). We therefore guessed that the same function, or more accurately its reciprocal, appears in the asymptotic condition number. Adjusting the proportionality constant to match the numerical results suggested that the asymptote is roughly

κ_anal^IQ(α) ≡ (1/2) exp(π/α)   (16)

When we plotted the difference between κ(N; α) and κ_anal^IQ(α) on both log–linear and log–log axes, we found the curves on the log–linear scale became flatter as N increased, whereas the difference curves were nearly linear on the log–log plot.
This implies that the IQ condition numbers are not asymptoting exponentially, as is true of Gaussians, but rather as 1/N.

Table 3
Richardson Extrapolation Table for inverse quadratic RBFs for α = 1/8.

| Degree m → | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Interpolation | 4.1111 | 4.11135 | 4.1113161 | 4.111315747 | 4.1113157974 | 4.11131588602 |
| m + 2 points | 4.1111 | 4.11135 | 4.1113162 | 4.111315754 | 4.1113157783 | 4.11131580170 |
| m + 3 points | 4.1110 | 4.11135 | 4.1113162 | 4.111315742 | 4.1113157751 | 4.11131577261 |

Asymptotic: 4.11131577927.

Note: All numbers are divided by 10¹⁰. Digits that agree with the asymptotic formula are shown in bold face in the original. The three numbers that generate the three points plotted on the right axis (1/α = 8) in Fig. 7 are shown in boxes in the original.

Fig. 7. Circles: interpolation of the set {κ(420), κ(440), κ(460), κ(480), κ(500)} by a quartic polynomial in 1/N. ×'s: the same, but fitting a polynomial of fifth degree in a least-squares sense to these κ values plus κ(380), κ(400). The curve with diamonds fits a sextic polynomial to the last nine κ(N) of the computed sequence.

To fit the sequence of κ(N), we therefore did not use Prony's method, but rather performed least-squares fits of various degrees using various numbers of elements from the end of the sequence of numerically computed κ(N). The constant in the polynomial in 1/N is the extrapolation of the sequence to N → ∞. The results of this strategy, which is often called "Richardson Extrapolation" [41,42], can be conveniently organized in a "Richardson Table" as illustrated in Table 3. The column number denotes the degree m of the fitted polynomial. The top row gives the result for interpolation. Lower rows show the approximations obtained by least-squares fitting using more and more points from the end of the computed sequence. The extrapolation error usually decreases both down and to the right in the table. It decreases to the right because the polynomial degree increases.
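The least-squares extrapolation strategy just described reduces to a few lines. The sketch below is our own illustration (the helper name is ours); it mimics one entry of a Richardson Table using a synthetic sequence whose N → ∞ limit is known exactly.

```python
import numpy as np

def richardson_limit(Ns, kappas, degree):
    """Least-squares fit of a degree-m polynomial in 1/N to the tail of a
    computed sequence kappa(N); the constant term of the polynomial is the
    extrapolated value at 1/N = 0, i.e. the N -> infinity limit."""
    c = np.polyfit(1.0 / np.asarray(Ns, float),
                   np.asarray(kappas, float), degree)
    return c[-1]                 # np.polyfit returns the highest degree first

# Synthetic sequence with known limit 7: kappa(N) = 7 + 3/N + 2/N^2
Ns = np.arange(100, 201, 20)
kappas = 7.0 + 3.0 / Ns + 2.0 / Ns**2
limit = richardson_limit(Ns, kappas, degree=2)
```

Because the synthetic sequence is exactly a quadratic in 1/N, the degree-2 fit recovers the limit 7 to roundoff; for real κ(N) data the accuracy depends on degree and on how much of the tail is used, as the Richardson Tables illustrate.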
The error falls downward in the table because more members of the sequence are used in the least-squares fit. However, a point of diminishing returns will eventually be reached with both polynomial degree and the number of elements in the fit. High-order polynomial interpolation will usually diverge (the Runge Phenomenon [8]) unless the interpolation points are distributed as a Chebyshev grid; unfortunately, we cannot evaluate κ(N) for fractional matrix sizes N. Adding more elements of the sequence means including κ(N) for smaller and smaller N, values farther and farther from the asymptotic limit N → ∞. Restricting Richardson Extrapolation to moderate polynomial degree m in 1/N and least-squares fitting of only the tail of the sequence is usually the best strategy.

Fig. 7 compares the results of Richardson Extrapolation for five different α using three different elements from the Richardson Table for each α. The three curves are almost indistinguishable except for α = 1/8, where all plotted points are smaller than 10⁻⁸ but exhibit a noticeable spread. The graph suggests that

κ(N; α) ∼ (1/2) exp(π/α) + E^IQ(α) + c(α)/N + · · ·   (17)

The N-independent correction E^IQ(α) is graphed in the figure; it appears that this decays exponentially with 1/α relative to the leading term, (1/2) exp(π/α). The difference between its extrapolated limit as α → 0 and κ(N) is so tiny that it is a highly probable conjecture that the leading term in the large-N, small-α approximation to κ(N; α) is exactly, and not merely approximately, (1/2) exp(π/α).

5. Sech RBFs in one dimension

For these,

φ(r; α, h) ≡ sech([α/h] r)   (18)

Table 4
Error in the analytical formula for the condition number of sech RBFs.

| α | N | Relative error of asymptotic formula |
|---|---|---|
| 1 | 250 | 1 part in 4800 |
| 1/2 | 250 | 1 part in 9.3 × 10⁷ |
| 1/5 | 250 | 1 part in 1.26 × 10¹⁰ |
| 1/10 | 300 | 1 part in 408,000 |

Fig. 8.
Left: log–log plot of the condition number κ(α, N) for multiquadrics for several α. The dot–dash curve at the bottom is a guideline, the graph of 4N², showing that for a given fixed α, the condition number follows a simple power law. Right: a plot of κ/N² versus 1/α with a logarithmic scale. The dashed curve, almost indistinguishable from the curve connecting the empirical measurements, is 0.363 exp(3.07/α).

The behavior of the condition number is very similar to Gaussians: as N increases, the condition number asymptotes exponentially fast from below to its limit; the limit is an exponential function of α. We found

κ^sech(α, N) ∼ (1/4) exp(π²/(2α)),  N → ∞   (19)

which we believe to be exact. We did not perform elaborate fits because κ^sech(1/5, 250) is approximated by the asymptotic formula to within 1 part in ten trillion! Agreement is worse for larger α (because the analytical formula is likely an asymptotic approximation as α → 0); agreement is worse for a given N for small α because κ asymptotes to its limit for infinite matrix size N more slowly as α decreases. Nevertheless, Table 4 shows the analytic formula is very accurate for large N and any α.

The IQ RBF has poles at x = ±ih/α; the sech RBF has poles at x = ±i(π/2)h/α (plus others farther from the real axis). It is known [51] that the rate of decay of the Fourier transform of a function is directly proportional to the distance of the nearest singularities of the function from the real axis. In particular, the Fourier transform of the IQ decays proportionally to exp(−(α/h)|k|) for large wavenumber k, while the transform of sech([α/h]r) decays as exp(−[π/2](α/h)|k|). Why does the same π/2 difference appear in the exponential growth of the condition number of the interpolation matrix for these two species? We have no explanation.

6. Multiquadrics (MQ)

The left panel of Fig.
8 shows that over the entire range shown, the condition number for fixed α grows quadratically with the number of points N. The right panel shows that the curves collectively are well approximated by

κ(α, N) ≈ 0.363 N² exp(3.07/α)  [MQ]   (20)

Because the numerical constants are not close to recognizable numbers, no attempt was made at high-accuracy fitting.

7. Power and thin-plate spline RBFs

RBFs such as "power-RBFs" differ from Gaussian RBFs and sech RBFs in that they (i) grow rather than decay with r, (ii) give only a finite-order rate of convergence instead of an exponential decrease in error with N when approximating a smooth f(x), (iii) are quite inaccurate unless augmented by a polynomial part, and (iv) do not contain a width parameter α. Consequently, there is no good reason to suppose that they will behave similarly to the previous cases, and their condition numbers do not.

Fig. 9. Left: the condition number κ(N) for φ(r) = |r|³ with a linear polynomial; the total number of interpolation points is N + 1 while the total number of degrees of freedom is N + 3. The line with ×'s shows that the numerical condition numbers are graphically indistinguishable from 0.4641N⁵. The plot at the right shows that the ratio of the analytic power law divided by the numerical condition numbers is not exactly one, but rather decays to one from below as N → ∞.

Table 5
Richardson Extrapolation Table for cubic RBFs.

| Degree m → | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| Interpolation | 3.58e−03 | −7.993e−06 | 1.60e−08 | 1.65e−10 | −6.50e−11 | −8.63e−11 |
| m + 2 points | 3.68e−03 | −8.48e−06 | 1.76e−08 | 2.02e−10 | −4.13e−11 | −1.49e−10 |
| m + 3 points | 3.79e−03 | −9.03e−06 | 1.94e−08 | 2.41e−10 | −1.60e−11 | −6.98e−11 |
| m + 4 points | 3.90e−03 | −9.67e−06 | 2.17e−08 | 2.85e−10 | −5.51e−12 | −2.50e−11 |
| m + 5 points | 4.03e−03 | −1.04e−05 | 2.45e−08 | 3.41e−10 | −2.19e−12 | −8.27e−12 |
| m + 6 points | 4.16e−03 | −1.13e−05 | 2.79e−08 | 4.14e−10 | −1.27e−12 | −2.96e−12 |
7.1. Power RBFs

When plotted on a log–log scale, κ(N) is almost a straight line for φ(r) = |r|³, henceforth "cubic-RBF". This implies that κ(N) ∼ N^k; the slope of the line shows that k = 5 to within numerical precision (Fig. 9). The proportionality constant p in

κ_anal^Cubic ∼ p N⁵   (21)

can be determined more accurately by fitting polynomials of various degrees in 1/N to various subsequences at the end of our computed sequences, which included κ(N) for N = 20, 40, . . . , 400. (The fit was to the sequence of κ(N)/N⁵.) The Richardson Extrapolation Table for this case is shown in Table 5. To guess an analytical approximation to the Richardson-extrapolated p, we typed the value from the lower right of the Richardson Table into the Inverse Symbolic Calculator, developed at the Centre for Experimental and Constructive Mathematics at Simon Fraser University, which suggests numbers close to the input digits. We found that

p ≈ 2√3 − 3 ≈ 0.4641016 . . .   (22)

To make the Richardson Table easier to read, we subtracted this approximation from the entries. (In the absence of an analytical guess, one can subtract the lowest, rightmost entry in the Richardson Table itself to see more easily how the extrapolation varies with polynomial degree and the number of points included in the fit.) Applying similar methodology to quintic-RBFs, i.e., φ(r) = |r|⁵ augmented by a quadratic polynomial, we found

κ^MN5(N) ∼ 0.112322124 N⁸   (23)

However, the Inverse Symbolic Calculator did not yield a simple approximation to the proportionality constant in this case.

Fig. 10. Left: the condition number κ(N) for φ(r) = |r|² log(|r|) with a linear polynomial; the total number of interpolation points is N + 1 while the total number of degrees of freedom is N + 3. The line with ×'s shows that the numerical condition numbers are graphically indistinguishable from 0.1198139997N⁴.
The plot at the right shows that the ratio of the analytic power law divided by the numerical condition numbers is almost linear in 1/N, which allows accurate extrapolation to 1/N = 0.

7.2. Thin-plate splines

The thin-plate spline

φ(r) ≡ r² log(|r|)   (24)

is commonly employed with a linear polynomial in x appended to the sum of RBF functions. Fig. 10 shows, through the log–log plot on the left, that the condition number grows proportionally to N⁴. Richardson Extrapolation gave the proportionality constant to many decimal places:

κ^TPS2(N) ∼ 0.1198139997 N⁴   (25)

Similarly, for the higher splines with φ(r) = r⁴ log(r),

κ^TPS4(N) ∼ 0.05304350371 N⁷   (26)

8. Two-dimensional grids: square and hexagonal

A uniform Cartesian grid with N grid points filling the unit square, [−1, 1] ⊗ [−1, 1], is composed of the points

x_i = −1 + h(i − 1),  y_j = −1 + h(j − 1),  i = 1, . . . , N_s,  j = 1, . . . , N_s   (27)

where N_s = √N is the number of points along one side of the square.

Iske [31] and Fornberg, Flyer and Russell [23] argue that hexagonal grids are more accurate and better conditioned in two dimensions than square grids. Although square grids are more popular than hexagonal grids in applications [15,3], it is still useful to examine hexagonal grids, too. Fig. 11 shows a typical hexagonal grid. Each point is surrounded by six points which can be connected by line segments that form the sides of a regular hexagon. Confusingly, the grid is also called "triangular" because the grid points can be connected as equilateral triangles as well.

Construction of a hexagonal grid to fit the unit square begins by defining N_s to be the number of points along the bottom row of the grid at y = −1, just as for the Cartesian grid. The distance between adjacent points in the bottom row is h = 2/(N_s − 1). In the row immediately above, we insert points so as to form equilateral triangles with the points in the bottom row.
This implies that the row just above the bottom is at y = −1 + h_v, where h_v = h√3/2. Let N_v denote the number of rows; the number of intervals is N_v − 1. We want to choose the number of rows so that the number of vertical intervals, multiplied by h_v, is approximately two, the height of the unit square. Unfortunately, it is not possible to fit the rows into the square exactly because the vertical separation between rows is an irrational multiple of h. The best we can do to enforce

(N_v − 1)(√3/2) h ≈ 2   (28)

is to choose

N_v ≡ 1 + round(2(N_s − 1)/√3)   (29)

We then adjust the vertical spacing between the rows so that the grid always includes one row at y = −1 and one row at y = 1:

h̃_v = 2/(N_v − 1)   (30)

If mod(N_v, 2) = 0, the number of rows is even and we can create the grid by laying down n_pairs = N_v/2 pairs of rows. When mod(N_v, 2) = 1, the number of rows is odd and we can create the grid by laying down n_pairs = (N_v − 1)/2 pairs plus an additional long row at y = 1. The slight adjustment in vertical spacing (h√3/2 → h̃_v) is not visible to the naked eye, and becomes numerically smaller as N increases.

Fig. 11. A hexagonal grid with 85 points, adjusted to exactly fit the unit square.

Fig. 12. The condition number of the interpolation matrix for Gaussian RBFs on a two-dimensional uniform grid is plotted, for five different values of α, versus √N, where N is the total number of interpolation points (and RBFs) and √N is the number of points along one side of the unit square. The dashed curves show the theoretical asymptote as N → ∞ for each value of α, as labeled.

9. Gaussian RBFs on uniform two-dimensional grids

Because computations in higher dimensions require multiprecision arithmetic in Maple, sometimes requiring several hours on a workstation to compute a single data point, it was not possible to obtain results as precise as in one dimension.
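Returning briefly to the grid construction: Eqs. (28)–(30) condense into a short sketch. The Python rendering below is our own (the paper's computations were in Maple); for N_s = 9 it reproduces the 85-point grid of Fig. 11.

```python
import numpy as np

def hexagonal_grid(Ns):
    """Hexagonal (triangular) grid fitted to the unit square [-1,1] x [-1,1]:
    Ns points along the bottom row, rows forming equilateral triangles, and
    the vertical spacing slightly adjusted (Eq. (30)) so that the bottom and
    top rows land exactly on y = -1 and y = +1."""
    h = 2.0 / (Ns - 1)                              # horizontal spacing
    Nv = 1 + round(2.0 * (Ns - 1) / np.sqrt(3.0))   # number of rows, Eq. (29)
    hv = 2.0 / (Nv - 1)                             # adjusted row spacing, Eq. (30)
    pts = []
    for row in range(Nv):
        y = -1.0 + row * hv
        if row % 2 == 0:          # "long" rows: Ns points starting at x = -1
            xs = -1.0 + h * np.arange(Ns)
        else:                     # offset rows: Ns - 1 points shifted by h/2
            xs = -1.0 + h / 2.0 + h * np.arange(Ns - 1)
        pts.extend((xv, y) for xv in xs)
    return np.array(pts)
```

With N_s = 9 this yields N_v = 10 rows alternating 9 and 8 points, 85 points in all, matching Fig. 11.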
Nevertheless, the numerical results for the square grid strongly suggest that in two dimensions

κ₂(α, N) ∼ [κ₁(α, N)]² = (1/4) exp(π²/(2α²))  [Cond. number: 2D uniform grid]   (31)

If this conjecture is true, then the two-dimensional condition number is larger than the reciprocal of machine epsilon in standard Matlab arithmetic (2.2 × 10⁻¹⁶) for all α < 0.36. The analogous "danger point" in one space dimension is at α = 0.26. We have not performed computations in three or more dimensions, but the trend suggests that the condition number for a uniform grid in three dimensions is likely to be the cube of κ₁(α, N). This conjecture is supported by the bounds of Baxter [6].

Fig. 13. Same as the previous graph, but comparing the condition numbers on square and hexagonal grids. Upper pair of solid curves with markers: α = 0.3. Lower pair of curves: α = 0.4. The dashed curves are Baxter's predicted asymptotes (as N → ∞) for the square grid. In each pair of curves, the uppermost is the square-grid condition number while the lower curve (hexagonal markers) is for the same α but on the hexagonal grid.

Fig. 13 compares the condition numbers for square and hexagonal grids. The hexagonal condition numbers are a little smaller than their Cartesian counterparts, but not by a life-changing ratio. Consequently, we examine only condition numbers on a uniform square grid in the next section.

10. Two-dimensional sech RBFs

Fig. 14. The condition number of the interpolation matrix for sech RBFs on a two-dimensional uniform grid for four different α. The horizontal index is the number of grid points in one direction; √N = 32 is a grid with a total of 1024 points.

Fig. 14 shows that the condition number for sech RBFs in two dimensions rapidly asymptotes to its limit as N increases. However, as shown in Fig.
15, the asymptote is not the square of the one-dimensional condition number, unlike the case of Gaussian RBFs. We have no explanation for the difference.

11. Irregular grids

Grids for observational measurements are often very irregular. Meteorological measurements, for example, are very sparse over the oceans, moderately sparse in deserts and mountains, and very dense in highly populated coastal areas. It is difficult to characterize the wide range of irregularity that arises in the very broad range of applications where RBFs have been deployed. Therefore we have examined three cases: random grids, Halton grids, and randomly perturbed hexagonal grids.

Fig. 15. The condition number of the interpolation matrix for sech RBFs on a two-dimensional uniform grid. For Gaussians, the 2D condition number is the square of the 1D case; for sech's, the 2D result is intermediate between the 1D condition number (lowest curve) and its square (top curve).

Fig. 16. Base-10 logarithm of the condition number of the interpolation matrix for Gaussian RBFs for five different α in one dimension. Solid bars: uniform grid. Histograms: distribution for random grids with an ensemble of 1000 fifty-point grids for each α. The random grids were generated by the Matlab command sort(-1 + 2*rand(1, 50)). The RBFs were exp(−(α²/h²)x²) where h is the average grid spacing on x ∈ [−1, 1]; h = 2/49 when N = 50.

Fig. 16 shows that a completely random grid increases the (mean) condition number by many orders of magnitude compared to a uniform grid for fixed RBF width. We computed κ for each member of an ensemble of a thousand different random grids for each α. There is a big spread of κ within each ensemble; the outliers have very large condition numbers indeed.

When the grid is very irregular, a common stratagem is to vary the RBF width in proportion to the local grid spacing.
This will likely reduce the condition number relative to that of the corresponding fixed-width matrix. We omit variable-width cases because there are many reasonable ways to define a "local width", so a thorough treatment would require another article unto itself.

The performance of random grids is so dismal, in accuracy [not discussed here] as well as in condition number, that the Halton grid has become a popular substitute in the RBF world [18]. The Halton grid was invented to facilitate Monte Carlo calculations [28,13]. Although irregular, the Halton points constitute a so-called "low discrepancy" sequence, a property that greatly accelerates Monte Carlo convergence. Such sequences are also called "quasi-random" or "sub-random" sequences.

Fig. 17. A realization of a Halton grid with 400 points.

Fig. 18. Condition numbers of two-dimensional Halton grids of various sizes (solid) for different α, where N is the total number of points on the grid and the horizontal axis is √N. The dashed lines are the same but for a square grid for comparison.

Fig. 17 illustrates a typical Halton grid. Fig. 18 compares the condition numbers of a Halton grid with those of a uniform grid of the same size for Gaussian RBFs. Both again employed fixed α, which was translated into the RBF width through h, defined as the average grid spacing, the same for both grids. Remarkably, the condition numbers for the Halton grid for small α are somewhat smaller than those of the uniform grid. Even more remarkably, the condition number actually seems to be slowly decreasing for large N (and fixed α) instead of asymptoting to a constant as happens for the uniform grid. No theoretical explanation is known.

It is unclear that the Halton grid is representative of real-world irregularity.
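For concreteness, a Halton point set can be generated from the radical-inverse function; this is the standard construction from [28], written here as a minimal Python sketch (the mapping to [−1, 1]², matching the unit square used throughout this paper, is our choice):

```python
def radical_inverse(n, base):
    """Van der Corput radical inverse: reflect the base-b digits of n
    about the radix point, yielding a value in [0, 1)."""
    inv, denom = 0.0, 1.0
    while n > 0:
        n, digit = divmod(n, base)
        denom *= base
        inv += digit / denom
    return inv

def halton_grid(npts, bases=(2, 3)):
    """First npts two-dimensional Halton points (bases 2 and 3),
    mapped from [0, 1)^2 to [-1, 1)^2."""
    return [tuple(2.0 * radical_inverse(i, b) - 1.0 for b in bases)
            for i in range(1, npts + 1)]
```

The "low discrepancy" property comes from the digit reflection: successive indices fill the square far more evenly than independent random draws, which is why the Halton grid avoids the tight point clusters that inflate the condition numbers of fully random grids.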
Nature and observational networks may be a good deal less fond of "quasi-random", low discrepancy grids than Monte Carlo modelers or RBF partisans.

When one has the freedom to choose the grid points, as in solving partial differential equations, is it better to choose a Halton grid than the more obvious uniform grid? This is a complex question beyond the scope of this article because the answer depends on many issues besides the condition number, including accuracy, preconditioning for iterative solvers, and sorting and searching.

As a third test, we perturbed the hexagonal grid points by random vectors (X_j, Y_j) multiplied by the product of a parameter τ with the grid spacing h:

( x_j(τ), y_j(τ) ) = ( x_j^hexagon + τ h X_j, y_j^hexagon + τ h Y_j )    (32)

The random vectors are computed first and do not vary as the parameter τ varies, so that each grid point traces a straight line as it deforms with increasing τ. For such a randomly perturbed grid, the condition number rises only slightly when τ < 1/2, but then rises steeply, increasing the condition number by a factor of one hundred to one thousand. However, the condition number fluctuates and does not rise monotonically with τ.

Collectively, these three cases show that randomness generically increases the condition number. However, for the Halton grid, the slow decrease in κ with increasing N (and fixed α) imposes a certain humility with respect to the condition numbers for the irregular grids encountered in the real world.

Fig. 19. Two-dimensional hexagonal grid, perturbed by random trajectories scaled by the parameter τ.

Table 6
Condition number: uniform grid.
Name                                    κ
Gaussian 1D                             (1/2) exp(π²/[4α²])
Gaussian 2D                             (1/4) exp(π²/[2α²])
sech 1D                                 (1/4) exp(π²/(2α))
sech 2D                                 0.26 exp(7.20/α)
IQ 1D                                   (1/2) exp(π/α)
MQ 1D                                   0.363 N² exp(3.07/α)
Cubic MN3                               (2√3 − 3) N⁵
Quintic MN5                             0.112322124 N⁸
Thin-plate splines TPS2                 0.1198139997 N⁴
Thin-plate splines TPS4 (r⁴ log(r))     0.05304350371 N⁷

12. Summary

On a uniform grid, the condition numbers of splines and thin-plate splines grow asymptotically as powers of N. We have no explanation for why the power is N⁵ for cubic splines and rises to N⁸ for quintic splines, which are more accurate by only two orders.

Gaussian, sech and Inverse Quadratic RBFs yield infinite order accuracy when interpolating smooth f(x). For these RBFs, the condition number κ(α, N) rapidly asymptotes to a limit κ_asymp(α) that is independent of N and depends only on α, the inverse width relative to the grid spacing. Unfortunately, κ_asymp(α) grows exponentially as α → 0, that is, in the limit of broad, flat RBFs. Multiquadric (MQ) RBFs are similar except for a multiplicative factor of N² in κ. These uniform-grid results are collected in Table 6.

Randomness in the grid generally increases κ. However, even though we offer figures for random, quasi-random (Halton) and randomly perturbed hexagonal lattices, we have only "tested the waters" of irregular grids. In applications, the grid is not random, but has highly problem-dependent irregularity. Meteorological data, for example, is very sparse over the oceans. Accuracy and condition number can be improved by scaling the width of the RBFs to the local density of the grid, but there are many ways to implement such scaling. A full study of the effect of grid irregularity on condition number is therefore long and arduous, and is left for the future.
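The Gaussian entries of Table 6 are easy to spot-check in ordinary double precision. The sketch below is our own code, not the authors' Maple computations: it builds the 1D interpolation matrix A_ij = exp(−α²(i − j)²) on a uniform grid (the RBF width enters only through the integer offset i − j, as in Fig. 16's normalization), and obtains the 2D square-grid matrix as the Kronecker product A ⊗ A, for which κ₂ = κ₁² exactly since the Gaussian is a tensor-product kernel.

```python
import numpy as np

def gaussian_interp_matrix_1d(alpha, N):
    """Interpolation matrix for the RBFs exp(-(alpha/h)^2 (x - x_j)^2)
    on a uniform grid of N points; since (x_i - x_j)/h = i - j, the
    entries depend only on alpha and the integer offset."""
    k = np.arange(N)
    return np.exp(-alpha**2 * (k[:, None] - k[None, :])**2)

alpha, N = 1.0, 40
A = gaussian_interp_matrix_1d(alpha, N)
kappa1 = np.linalg.cond(A)              # 1D condition number (2-norm)
kappa2 = np.linalg.cond(np.kron(A, A))  # 2D tensor (square) grid

asymp1 = 0.5 * np.exp(np.pi**2 / (4 * alpha**2))   # Table 6, Gaussian 1D
asymp2 = 0.25 * np.exp(np.pi**2 / (2 * alpha**2))  # Table 6, Gaussian 2D
```

For α = 1 and N = 40 the computed condition numbers agree with the tabulated asymptotes to within a few percent; smaller α would require the multiprecision arithmetic described in the text, since κ quickly exceeds the reciprocal of machine epsilon.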
Acknowledgements

This work was supported by the National Science Foundation through grants OCE 0451951 and ATM 0723440 and by the Undergraduate Research Opportunities Program (UROP) at the University of Michigan. We thank the three reviewers for their very detailed comments.

Appendix A. Prony's method for the fit of a constant plus an exponential

Gaspard de Prony (1755–1839) developed a method in 1795 for fitting a sum of k exponentials with unknown multipliers and exponents to a sequence of values κ(N), a sort of non-linear interpolation. Here, we specialize to (i) k = 2 and (ii) one of the exponents known to be zero, leaving only three parameters to be fitted:

κ(N) ∼ κ_fit(N; κ_asymp, p, q) ≡ κ_asymp + p exp(qN)    (33)

One minor complication is that, to save computer time, we often did not compute every κ(N), but only the condition number for matrix sizes differing by a "stride" S. Prony's method gives

κ_asymp = [ κ(N) κ(N − 2S) − κ(N − S)² ] / [ κ(N) − 2κ(N − S) + κ(N − 2S) ]    (34)

p exp(qN) = [ κ(N) − κ(N − S) ]² / [ κ(N) − 2κ(N − S) + κ(N − 2S) ]    (35)

q = (1/S) log{ [ κ(N) − κ(N − S) ] / [ κ(N − S) − κ(N − 2S) ] }    (36)

The quoted solutions can be easily verified to solve

κ(N) = κ_fit(N; κ_asymp, p, q)    (37)

κ(N − S) = κ_fit(N − S; κ_asymp, p, q)    (38)

κ(N − 2S) = κ_fit(N − 2S; κ_asymp, p, q)    (39)

We omit a derivation since Prony's method is well described in [29].

References

[1] K. Balakrishnan, R. Jureshkumar, P.A. Ramachandran, An operator splitting-radial basis function method for the solution of transient nonlinear Poisson problems, Comput. Math. Appl. 43 (2002) 289–304.
[2] K. Ball, Eigenvalues of Euclidean distance matrices, J. Approx. Theory 68 (1992) 74–82.
[3] L.A. Barba, Spectral-like accuracy in space of a meshless vortex method, in: V.M.A. Leitao, C.J.S. Alves, C.A.
Duarte (Eds.), Advances in Meshfree Techniques: ECCOMAS Thematic Conference on Meshless Methods, ECCOMAS, in: Comput. Methods Appl. Sci., vol. 5, Springer, New York, 2007, pp. 187–197.
[4] L.A. Barba, L.F. Rossi, Global field interpolation for particle methods, J. Comput. Phys. 229 (2010) 1292–1310.
[5] B.J.C. Baxter, Norm estimates for inverses of Toeplitz distance matrices, J. Approx. Theory 70 (1994) 222–242.
[6] B.J.C. Baxter, On the asymptotic behaviour of the span of the translates of the multiquadric φ(r) = (r² + c²)^{1/2} as c → ∞, Comput. Math. Appl. 24 (1994) 1–6.
[7] B.J.C. Baxter, N. Sivakumar, On shifted cardinal interpolation by Gaussians and multiquadrics, J. Approx. Theory 87 (1996) 36–59.
[8] J.P. Boyd, Chebyshev and Fourier Spectral Methods, second ed., Dover, Mineola, New York, 2001, 665 pp.
[9] J.P. Boyd, L.R. Bridge, Sensitivity of RBF interpolation on an otherwise uniform grid with a point omitted or slightly shifted, Appl. Numer. Math. 60 (2010) 659–672.
[10] M.D. Buhmann, Radial basis functions, Acta Numer. 9 (2000) 1–38.
[11] M.D. Buhmann, Radial Basis Functions: Theory and Implementations, Cambridge Monogr. Appl. Comput. Math., vol. 12, Cambridge Univ. Press, 2003.
[12] T. Cecil, J. Qian, S. Osher, Numerical methods for high dimensional Hamilton–Jacobi equations using radial basis functions, J. Comput. Phys. 196 (2004) 327–347.
[13] H. Chi, M. Mascagni, T. Warnock, On the optimal Halton sequence, Math. Comput. Simulation 70 (2005) 9–21.
[14] P.P. Chinchapatnam, K. Djidjeli, P.B. Nair, Unsymmetric and symmetric meshless schemes for the unsteady convection–diffusion equation, Comput. Methods Appl. Mech. Engrg. 195 (2006) 2432–2453.
[15] T.A. Driscoll, A. Heryudono, Adaptive residual subsampling methods for radial basis function interpolation and collocation problems, Comput. Math. Appl. 53 (2007) 927–939.
[16] G.E. Fasshauer, Solving differential equations with radial basis functions: Multilevel methods and smoothing, Adv. Comput. Math. 11 (1999) 139–159.
[17] G.E. Fasshauer, Newton iteration with multiquadrics for the solution of nonlinear PDEs, Comput. Math. Appl. 43 (2002) 423–438.
[18] G.E. Fasshauer, Meshfree Approximation Methods with MATLAB, Interdiscip. Math. Sci., World Scientific Publishing Company, Singapore, 2007.
[19] A.I. Fedoseyev, M.J. Friedman, E.J. Kansa, Continuation for nonlinear elliptic partial differential equations discretized by the multiquadric method, Internat. J. Bifurcation Chaos 10 (2000) 481–492.
[20] A.I. Fedoseyev, M.J. Friedman, E.J. Kansa, Improved multiquadric method for elliptic partial differential equations via PDE collocation on the boundary, Comput. Math. Appl. 43 (2002) 439–455.
[21] N. Flyer, G.B. Wright, Transport schemes on a sphere using radial basis functions, J. Comput. Phys. 226 (2007) 1059–1084.
[22] N. Flyer, G.B. Wright, A radial basis function method for the shallow water equations on a sphere, Proc. Roy. Soc. A 465 (2009) 1949–1976.
[23] B. Fornberg, N. Flyer, J.M. Russell, Comparisons between pseudospectral and radial basis function derivative approximations, IMA J. Numer. Anal. 30 (2010) 149–172.
[24] B. Fornberg, C. Piret, A stable algorithm for flat radial basis functions on a sphere, SIAM J. Sci. Comput. 30 (2007) 60–80.
[25] B. Fornberg, G. Wright, Stable computation of multiquadric interpolants for all values of the shape parameter, Comput. Math. Appl. 48 (2004) 853–867.
[26] B. Fornberg, J. Zuev, The Runge phenomenon and spatially variable shape parameters in RBF interpolation, Comput. Math. Appl. 54 (2007) 379–398.
[27] C. Franke, R. Schaback, Solving partial differential equations by collocation using radial basis functions, Appl. Math. Comput. 93 (1998) 73–91.
[28] J. Halton, On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals, Numer. Math. 2 (1960) 84–90.
[29] R.W. Hamming, Numerical Methods for Scientists and Engineers, second ed., Dover Press, Mineola, New York, 1973.
[30] Y.C. Hon, K.F. Cheung, X.Z. Mao, E.J. Kansa, Multiquadric solution for shallow water equations, ASCE J. Hydr. Engrg. 125 (1999) 524–533.
[31] A. Iske, Multiresolution Methods in Scattered Data Modelling, Lect. Notes Comput. Sci. Eng., vol. 37, Springer, Heidelberg, 2004.
[32] E.J. Kansa, Multiquadrics—a scattered data approximation scheme with applications to computational fluid dynamics—II. Solutions to hyperbolic, parabolic, and elliptic partial differential equations, Comput. Math. Appl. 19 (1990) 147–161.
[33] E.J. Kansa, H. Power, G.E. Fasshauer, L. Ling, A volumetric integral radial basis function method for time-dependent partial differential equations: I. Formulation, Engrg. Anal. Boundary Elements 28 (2004) 1191–1206.
[34] E. Larsson, B. Fornberg, A numerical study of some radial basis function based solution methods for elliptic PDEs, Comput. Math. Appl. 46 (2003) 891–902.
[35] Q.T. Le Gia, Approximation of parabolic PDEs on spheres using spherical basis functions, Adv. Comput. Math. 22 (2005) 377–397.
[36] L. Ling, R. Opfer, R. Schaback, Results on meshless collocation techniques, Engrg. Anal. Boundary Elements 30 (2006) 247–253.
[37] T.J. Moroney, I.W. Turner, A finite volume method based on radial basis functions for two-dimensional nonlinear diffusion equations, Appl. Math. Model. 30 (2006) 1118–1133.
[38] T.J. Moroney, I.W. Turner, A three-dimensional finite volume method based on radial basis functions for the accurate computational modeling of nonlinear diffusion equations, J. Comput. Phys. 225 (2007) 1409–1426.
[39] F.J. Narcowich, J.D. Ward, Norm estimates for the inverses of a general class of scattered-data radial-function interpolation matrices, J. Fourier Anal. Appl. 69 (1992) 84–109.
[40] R.B. Platte, T.A. Driscoll, Computing eigenmodes of elliptic operators using radial basis functions, Comput. Math. Appl. 48 (2006) 1251–1268.
[41] L.F. Richardson, The deferred approach to the limit. Part I—Single lattice, Philos. Trans. R. Soc. 226 (1927) 299–349.
[42] L.F. Richardson, The deferred approach to the limit. Part I—Single lattice, in: O.M. Ashford, H. Charnock, P.G. Drazin, J.C.R. Hunt, P. Smoker, I. Sutherland (Eds.), Collected Papers of Lewis Fry Richardson, Cambridge Univ. Press, New York, 1993, pp. 625–678.
[43] R. Schaback, Lower bounds for norms of inverses of interpolation matrices for radial basis functions, J. Approx. Theory 79 (1994) 287–306.
[44] R. Schaback, Error estimates and condition numbers for radial basis function interpolation, Constr. Approx. 3 (1995) 251–264.
[45] R. Schaback, H. Wendland, Kernel techniques: From machine learning to meshless methods, Acta Numer. 15 (2006) 543–639.
[46] M. Sharan, E.J. Kansa, S. Gupta, Application of the multiquadric method for the solution of elliptic partial differential equations, Appl. Math. Comput. 84 (1997) 275–302.
[47] L.N. Trefethen, D. Bau III, Numerical Linear Algebra, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, 1997.
[48] H. Wendland, Meshless Galerkin methods using radial basis functions, Math. Comp. 68 (1999) 1521–1531.
[49] H. Wendland, Scattered Data Approximation, Cambridge Univ. Press, 2005.
[50] S.M. Wong, Y.C. Hon, M.A. Golberg, Compactly supported radial basis functions for shallow water equations, Appl. Math. Comput. 127 (2002) 79–101.
[51] A.I. Zayed, Handbook of Function and Generalized Function Transformations, Math. Sci. Ref. Ser., vol. 3, CRC Press, Boca Raton, FL, 1996.
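The three-point Prony fit of Appendix A, Eqs. (34)–(36), is simple enough to verify directly. The sketch below is our illustrative code (function and variable names are not from the paper); it recovers κ_asymp, p and q exactly from three samples of a synthetic κ(N) = κ_asymp + p exp(qN):

```python
import math

def prony_fit(kN, kNS, kN2S, N, S):
    """Fit kappa(N) ~ kappa_asymp + p*exp(q*N) from samples at
    N, N - S and N - 2S, using Eqs. (34)-(36) of Appendix A."""
    denom = kN - 2.0 * kNS + kN2S
    kappa_asymp = (kN * kN2S - kNS**2) / denom       # Eq. (34)
    q = math.log((kN - kNS) / (kNS - kN2S)) / S      # Eq. (36)
    # Eq. (35) gives p*exp(q*N); divide out the exponential to get p
    p = (kN - kNS)**2 / denom * math.exp(-q * N)
    return kappa_asymp, p, q

# Synthetic data with known parameters: the fit should recover them exactly
a_true, p_true, q_true, N, S = 7.0, 3.0, -0.05, 120, 10
kappa = lambda n: a_true + p_true * math.exp(q_true * n)
a_fit, p_fit, q_fit = prony_fit(kappa(N), kappa(N - S), kappa(N - 2 * S), N, S)
```

Because the model has exactly three parameters and three samples are used, the fit is an interpolation: up to rounding error, a_fit, p_fit and q_fit reproduce the generating parameters.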