Stability of Kernel–Based Interpolation

Stefano De Marchi
Department of Computer Science, University of Verona (Italy)

Robert Schaback
Institut für Numerische und Angewandte Mathematik, University of Göttingen (Germany)
Abstract

It is often observed that interpolation based on translates of radial basis functions or non-radial kernels is numerically unstable due to exceedingly large condition numbers of the kernel matrix. But if stability is assessed in function space without considering special bases, this paper proves that kernel–based interpolation is stable. Provided that the data are not too wildly scattered, the L2 or L∞ norms of interpolants can be bounded above by discrete ℓ2 and ℓ∞ norms of the data. Furthermore, Lagrange basis functions are uniformly bounded, and Lebesgue constants grow at most like the square root of the number of data points. However, this analysis applies only to kernels of limited smoothness. Numerical examples support our bounds, but they also show that the case of infinitely smooth kernels must lead to worse bounds in future work, while the observed Lebesgue constants for kernels with limited smoothness even seem to be independent of the sample size and the fill distance.
1 Introduction
We consider the recovery of a real–valued function f : Ω → R on some compact domain Ω ⊆ R^d from its function values f(x_j) on a scattered set X = {x_1, ..., x_N} ⊂ Ω ⊆ R^d. Independent of how the reconstruction is done in detail, we denote the result as s_{f,X} and assume that it is a linear function of the data, i.e. it takes the form

$$ s_{f,X} = \sum_{j=1}^{N} f(x_j)\, u_j \qquad (1) $$

with certain continuous functions u_j : Ω → R. To assess the stability of the recovery process f ↦ s_{f,X}, we look for bounds of the form

$$ \|s_{f,X}\|_{L_\infty(\Omega)} \le C(X)\, \|f\|_{\ell_\infty(X)} \qquad (2) $$

which imply that the map taking the data into the interpolant is continuous with respect to the L_∞(Ω) and ℓ_∞(X) norms. Of course, one can also use L_2(Ω) and ℓ_2(X) norms above. An upper bound for the stability constant C(X) is supplied by putting (1) into (2) to get
$$ C(X) \le \Big\| \sum_{j=1}^{N} |u_j| \Big\|_{L_\infty(\Omega)} =: \Lambda_X. $$

This involves the Lebesgue function

$$ \lambda_X(x) := \sum_{j=1}^{N} |u_j(x)|. $$
Its maximum value Λ_X := max_{x∈Ω} λ_X(x) is called the associated Lebesgue constant. It is a classical problem to derive upper bounds for the stability constant in (2) and for its upper bound, the Lebesgue constant Λ_X. As is well known for recovery by polynomials, upper bounds for the Lebesgue function exist in both the univariate and the bivariate case. Moreover, many authors have faced the problem of finding near-optimal points for polynomial interpolation. All these near-optimal sets of N points have a Lebesgue function that behaves like log(N) in the one-dimensional case and like log²(N) in the two-dimensional one (cf. [2] and references therein). An important example of points suitable for polynomial interpolation in the square, whose Lebesgue constant grows as O(log²(N)), are the so-called Padua points (see [1]). However, stability bounds for multivariate kernel–based recovery processes are missing. We shall derive them as follows. Given a positive definite kernel Φ : Ω × Ω → R, the recovery of functions from function values f(x_j) on the set X = {x_1, ..., x_N} ⊂ Ω ⊆ R^d of N different data sites can be done via interpolants of the form

$$ s_{f,X} := \sum_{j=1}^{N} \alpha_j\, \Phi(\cdot, x_j) $$

from the finite-dimensional space V_X := span{Φ(·, x) : x ∈ X} of translates of the kernel, with the coefficients determined by the linear system

$$ f(x_k) = s_{f,X}(x_k) = \sum_{j=1}^{N} \alpha_j\, \Phi(x_k, x_j), \quad 1 \le k \le N. \qquad (3) $$
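To make system (3) concrete, here is a minimal Python sketch (ours, not from the paper) that assembles the kernel matrix and solves for the coefficients; the ν = 3/2 Sobolev/Matern kernel, its scale c, and the test data are illustrative assumptions.

```python
import numpy as np

def kernel(x, y, c=1.0):
    # Sobolev/Matern kernel (r/c)^nu K_nu(r/c) for nu = 3/2, which is
    # proportional to (1 + r/c) exp(-r/c); c is an assumed scale parameter.
    r = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2) / c
    return (1.0 + r) * np.exp(-r)

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 2))   # scattered data sites in [-1,1]^2
f = np.sin(np.pi * X[:, 0]) * X[:, 1]      # sample data values f(x_j)

A = kernel(X, X)                           # kernel matrix, entries Phi(x_k, x_j)
alpha = np.linalg.solve(A, f)              # coefficients alpha_j of system (3)

def s(x):
    # evaluate the interpolant s_{f,X}(x) = sum_j alpha_j Phi(x, x_j)
    return kernel(np.atleast_2d(x), X) @ alpha

print(np.max(np.abs(s(X) - f)))            # residual at the data sites, ~ 0
```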
This system is based on the kernel matrix with entries Φ(x_k, x_j), 1 ≤ j, k ≤ N. The case of conditionally positive definite kernels is similar, and we suppress the details here. As in classical polynomial interpolation, the interpolant of (3) can also be written in terms of cardinal functions u_j ∈ V_X such that u_j(x_k) = δ_{j,k}. Then the interpolant (3) takes the usual Lagrangian form (1). The reproduction quality of kernel–based methods is governed by the fill distance or mesh norm

$$ h_{X,\Omega} = \sup_{x \in \Omega} \min_{x_j \in X} \|x - x_j\|_2 \qquad (4) $$

describing the geometric relation of X to the domain Ω. In particular, the reproduction error is small if h_{X,Ω} is small. But unfortunately the kernel matrix is badly conditioned if the data locations come close to each other, i.e. if the separation distance

$$ q_X = \frac{1}{2} \min_{\substack{x_i, x_j \in X \\ x_i \neq x_j}} \|x_i - x_j\| \qquad (5) $$

is small.
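Both quantities are easy to estimate numerically. Here is a minimal sketch (ours), approximating the supremum in (4) over a fine test grid; the grid resolution and the example point set are arbitrary choices.

```python
import numpy as np
from scipy.spatial.distance import cdist, pdist

def fill_distance(X, grid):
    # h_{X,Omega}: distance from each test point to its nearest data site, maximized
    return cdist(grid, X).min(axis=1).max()

def separation_distance(X):
    # q_X: half the smallest pairwise distance between data sites
    return 0.5 * pdist(X).min()

g = np.linspace(-1.0, 1.0, 201)
grid = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)  # dense test grid on [-1,1]^2

h = 0.25                                                    # uniform grid of spacing h
u = np.arange(-1.0, 1.0 + h / 2, h)
X = np.stack(np.meshgrid(u, u), axis=-1).reshape(-1, 2)
print(fill_distance(X, grid), separation_distance(X))       # approx h/sqrt(2) and h/2
```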
Then the coefficients of the representation (3) get very large even if the data values f(x_k) are small, and simple linear solvers will fail. However, users often report that the final function s_{f,X} of (3) behaves nicely in spite of the large coefficients, and using stable solvers leads to useful results even in cases of unreasonably large condition numbers. This means that the interpolant can be stably calculated in the sense of (2), while the coefficients in the basis supplied by the Φ(·, x_j) are unstable. This calls for the construction of new and more stable bases, but here we shall focus only on stability bounds. The fill distance (4) and the separation distance (5) are used for standard error and stability estimates for multivariate interpolants, and they will also be of importance here. The inequality q_X ≤ h_{X,Ω} will hold in most cases, but if points of X nearly coalesce, q_X can be much smaller than h_{X,Ω}, causing instability of the standard solution process. Point sets X are called quasi–uniform with uniformity constant γ > 1 if the inequality

$$ \frac{1}{\gamma}\, q_X \le h_{X,\Omega} \le \gamma\, q_X $$

holds. Later, we shall consider arbitrary sets with different cardinalities, but with uniformity constants bounded above by a fixed number. Note that h_{X,Ω} and q_X play an important role in finding good points for radial basis function interpolation, as recently studied in [6, 3, 4].
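For orientation (a worked example, ours): for the uniform grid of spacing h used in the sketch above, the closest pair of sites is at distance h, so q_X = h/2, while the point of [−1, 1]² farthest from the grid is a cell center at distance h/√2, so h_{X,Ω} = h/√2 = √2 · q_X. Uniform grids are therefore quasi–uniform with uniformity constant γ = √2, independently of h, whereas randomly scattered points typically have q_X ≪ h_{X,Ω}.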
To generate interpolants, we allow conditionally positive definite translation-invariant kernels Φ(x, y) = K(x − y) for all x, y ∈ R^d, K : R^d → R, which are reproducing in their "native" Hilbert space N, assumed to be norm–equivalent to some Sobolev space W_2^τ(Ω) with τ > d/2. The kernel then has a Fourier transform satisfying

$$ 0 < c\,(1 + \|\omega\|_2^2)^{-\tau} \le \hat K(\omega) \le C\,(1 + \|\omega\|_2^2)^{-\tau} \qquad (6) $$

at infinity. This includes polyharmonic splines, thin-plate splines, the Sobolev/Matern kernel, and Wendland's compactly supported kernels. It is well known that under the above assumptions the interpolation problem is uniquely solvable, and the space V_X is a subspace of the Sobolev space W_2^τ(Ω). In the following, the constants depend on the space dimension, the domain, and the kernel, and the assertions hold for all sets X of scattered data locations with sufficiently small fill distance h_{X,Ω}. Our central result is

Theorem 1 The classical Lebesgue constant for interpolation with Φ on N = |X| data locations X = {x_1, . . . , x_N} in a bounded domain Ω ⊆ R^d satisfying an outer cone condition has a bound of the form

$$ \Lambda_X \le C \sqrt{N} \left( \frac{h_{X,\Omega}}{q_X} \right)^{\tau - d/2}. $$
For quasi-uniform sets with bounded uniformity γ, this simplifies to

$$ \Lambda_X \le C \sqrt{N}. $$

Each single cardinal function is bounded by

$$ \|u_j\|_{L_\infty(\Omega)} \le C \left( \frac{h_{X,\Omega}}{q_X} \right)^{\tau - d/2}, $$

which in the quasi-uniform case simplifies to ‖u_j‖_{L∞(Ω)} ≤ C.

There is also an L_2 analog of this:

Theorem 2 Under the above assumptions, the cardinal functions have a bound

$$ \|u_j\|_{L_2(\Omega)} \le C\, h_{X,\Omega}^{d/2} \left( \frac{h_{X,\Omega}}{q_X} \right)^{\tau - d/2}, $$

and for quasi-uniform data locations they behave like

$$ \|u_j\|_{L_2(\Omega)} \le C\, h_{X,\Omega}^{d/2}. $$
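Before turning to the proofs, the quantities in these theorems can be checked numerically: each cardinal function u_j is obtained by solving the kernel system against the j-th unit vector, and the Lebesgue function is the sum of their absolute values. A hedged sketch (ours; the kernel and point set repeat the assumptions of the earlier sketches):

```python
import numpy as np

def kernel(x, y, c=1.0):
    # same Sobolev/Matern (nu = 3/2) model kernel as in the earlier sketch
    r = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2) / c
    return (1.0 + r) * np.exp(-r)

def lebesgue_constant(ker, X, grid):
    # Cardinal functions: solve A beta_j = e_j for all j at once, so that
    # U[i, j] = u_j(grid_i) with u_j(x) = sum_k beta_{j,k} Phi(x, x_k).
    A = ker(X, X)                        # N x N kernel matrix (symmetric)
    B = ker(grid, X)                     # values Phi(grid_i, x_k)
    U = np.linalg.solve(A, B.T).T        # equals B A^{-1} since A = A^T
    return np.abs(U).sum(axis=1).max()   # max of the Lebesgue function on the grid

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(50, 2))
g = np.linspace(-1.0, 1.0, 101)
grid = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)
print(lebesgue_constant(kernel, X, grid))  # Theorem 1: at most C sqrt(N) (h/q)^(tau-d/2)
```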
But the Lebesgue constants are only upper bounds for the stability constant in function space. In fact, we can do better:

Theorem 3 Interpolation on sufficiently many quasi–uniformly distributed data is stable in the sense of

$$ \|s_{f,X}\|_{L_\infty(\Omega)} \le C \left( \|f\|_{\ell_\infty(X)} + \|f\|_{\ell_2(X)} \right) $$

and

$$ \|s_{f,X}\|_{L_2(\Omega)} \le C\, h_{X,\Omega}^{d/2}\, \|f\|_{\ell_2(X)} $$

with a constant C independent of X. Note that the right-hand side of the final inequality is a properly scaled discrete version of the L_2 norm. We shall prove these results in the next section. However, it turns out there that assumption (6) is crucial, and we were not able to extend the results to kernels with infinite smoothness. The final section will provide examples showing that similar results are not possible for kernels with infinite smoothness.
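To spell out the scaling remark above (a standard Riemann-sum argument, ours): each of the N sites of a quasi–uniform set occupies a volume of order h_{X,Ω}^d, so

$$ h_{X,\Omega}^{d/2}\, \|f\|_{\ell_2(X)} = \Big( h_{X,\Omega}^{d} \sum_{j=1}^{N} |f(x_j)|^2 \Big)^{1/2} \approx \Big( \int_\Omega |f(x)|^2 \, dx \Big)^{1/2} = \|f\|_{L_2(\Omega)}, $$

i.e. the final inequality compares continuous and discrete L_2 norms at the same scale.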
2 Bounds for Lagrange Bases
Our most important tool for the proof of Theorem 1 is a sampling inequality (cf. [12, Th. 2.6]). For any bounded Lipschitz domain Ω ⊂ R^d with an inner cone condition and the Sobolev space W_2^τ(Ω) with τ > d/2, there are positive constants C and h_0 such that

$$ \|u\|_{L_\infty(\Omega)} \le C \left( h_{X,\Omega}^{\tau - d/2}\, \|u\|_{W_2^\tau(\Omega)} + \|u\|_{\ell_\infty(X)} \right) \qquad (7) $$

holds for all u ∈ W_2^τ(Ω) and all finite subsets X ⊂ Ω with h_{X,Ω} ≤ h_0. This is independent of kernels. We can apply the sampling inequality in two ways:

$$ \begin{aligned}
\|s_{f,X}\|_{L_\infty(\Omega)} &\le C \left( h_{X,\Omega}^{\tau-d/2}\, \|s_{f,X}\|_{W_2^\tau(\Omega)} + \|s_{f,X}\|_{\ell_\infty(X)} \right) \\
&\le C \left( h_{X,\Omega}^{\tau-d/2}\, \|s_{f,X}\|_{W_2^\tau(\Omega)} + \|f\|_{\ell_\infty(X)} \right) \\
&\le C \left( h_{X,\Omega}^{\tau-d/2}\, \|s_{f,X}\|_{\mathcal N} + \|f\|_{\ell_\infty(X)} \right),
\end{aligned} $$

$$ \begin{aligned}
\|u_j\|_{L_\infty(\Omega)} &\le C \left( h_{X,\Omega}^{\tau-d/2}\, \|u_j\|_{W_2^\tau(\Omega)} + \|u_j\|_{\ell_\infty(X)} \right) \\
&\le C \left( h_{X,\Omega}^{\tau-d/2}\, \|u_j\|_{W_2^\tau(\Omega)} + 1 \right) \\
&\le C \left( h_{X,\Omega}^{\tau-d/2}\, \|u_j\|_{\mathcal N} + 1 \right),
\end{aligned} $$
since we know that the space V_X is contained in W_2^τ(Ω). To get a bound on the norm in the native space, we need bounds of the form

$$ \|s\|_{\mathcal N} \le C(X, \Omega, \Phi)\, \|s\|_{\ell_\infty(X)} $$

for arbitrary elements s ∈ V_X. Such bounds are available from [10], but we repeat the basic notation here. Let Φ satisfy (6). Then [10] gives

$$ \|s\|_{W_2^\tau(\Omega)}^2 \le C\, q_X^{-2\tau+d}\, \|s\|_{\ell_2(X)}^2 \le C\, N\, q_X^{-2\tau+d}\, \|s\|_{\ell_\infty(X)}^2 \quad \text{for all } s \in V_X $$

with a different generic constant. If we apply this to u_j (note that ‖u_j‖_{ℓ_2(X)} = 1), we get

$$ \|u_j\|_{L_\infty(\Omega)} \le C \left( \left( \frac{h_{X,\Omega}}{q_X} \right)^{\tau-d/2} + 1 \right), $$

while application to s_{f,X} yields

$$ \begin{aligned}
\|s_{f,X}\|_{L_\infty(\Omega)} &\le C \left( \left( \frac{h_{X,\Omega}}{q_X} \right)^{\tau-d/2} \|f\|_{\ell_2(X)} + \|f\|_{\ell_\infty(X)} \right) \\
&\le C \left( \sqrt{N} \left( \frac{h_{X,\Omega}}{q_X} \right)^{\tau-d/2} + 1 \right) \|f\|_{\ell_\infty(X)}.
\end{aligned} $$

Then the assertions of Theorem 1 and the first part of Theorem 3 follow. For the L_2 case covered by Theorem 2, we take the sampling inequality

$$ \|f\|_{L_2(\Omega)} \le C \left( h_{X,\Omega}^{\tau}\, \|f\|_{W_2^\tau(\Omega)} + h_{X,\Omega}^{d/2}\, \|f\|_{\ell_2(X)} \right) \quad \text{for all } f \in W_2^\tau(\Omega) \qquad (8) $$

of [7, Thm. 3.5]. We can apply it as

$$ \begin{aligned}
\|s_{f,X}\|_{L_2(\Omega)} &\le C \left( h_{X,\Omega}^{\tau}\, \|s_{f,X}\|_{W_2^\tau(\Omega)} + h_{X,\Omega}^{d/2}\, \|s_{f,X}\|_{\ell_2(X)} \right) \\
&\le C \left( h_{X,\Omega}^{\tau}\, \|s_{f,X}\|_{W_2^\tau(\Omega)} + h_{X,\Omega}^{d/2}\, \|f\|_{\ell_2(X)} \right) \\
&\le C \left( \frac{h_{X,\Omega}}{q_X} \right)^{\tau-d/2} h_{X,\Omega}^{d/2}\, \|f\|_{\ell_2(X)},
\end{aligned} $$

$$ \begin{aligned}
\|u_j\|_{L_2(\Omega)} &\le C \left( h_{X,\Omega}^{\tau}\, \|u_j\|_{W_2^\tau(\Omega)} + h_{X,\Omega}^{d/2}\, \|u_j\|_{\ell_2(X)} \right) \\
&\le C \left( h_{X,\Omega}^{\tau-d/2}\, \|u_j\|_{W_2^\tau(\Omega)} + 1 \right) h_{X,\Omega}^{d/2} \\
&\le C \left( \left( \frac{h_{X,\Omega}}{q_X} \right)^{\tau-d/2} + 1 \right) h_{X,\Omega}^{d/2}.
\end{aligned} $$
This proves Theorem 2 and the second part of Theorem 3.
3 Examples
We ran a series of examples for uniform grids on [−1, 1]² and increasing numbers N of data locations. Figure 1 shows the values Λ_X of the Lebesgue constants for the Sobolev/Matern kernel (r/c)^ν K_ν(r/c) for ν = 1.5 at scale c = 20. In this and other examples for kernels with finite smoothness, one can see that our bounds on the Lebesgue constants are valid, but the experimental Lebesgue constants even seem to be uniformly bounded. In all cases, the maximum of the Lebesgue function is attained in the interior of the domain. Things are different for infinitely smooth kernels. Figure 2 shows the behavior for the Gaussian. The maximum of the Lebesgue function is attained near the corners for large scales, while the behavior in the interior is as stable as for kernels with limited smoothness. The Lebesgue constants do not seem to be uniformly bounded.
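The kind of computation behind Figures 1 and 2 can be sketched as follows (our reconstruction, not the authors' code; the scales, grid sizes, and the closed form for ν = 3/2 are our assumptions):

```python
import numpy as np

def gauss(x, y, c):
    # scaled Gaussian Phi_c(x - y) = exp(-||x - y||^2 / c^2)
    r2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-r2 / c**2)

def matern(x, y, c):
    # Sobolev/Matern kernel for nu = 3/2, proportional to (1 + r/c) exp(-r/c)
    r = np.sqrt(((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=2)) / c
    return (1.0 + r) * np.exp(-r)

g = np.linspace(-1.0, 1.0, 15)                 # 15 x 15 = 225 regular points
X = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)
e = np.linspace(-1.0, 1.0, 61)
grid = np.stack(np.meshgrid(e, e), axis=-1).reshape(-1, 2)

for ker, c in [(matern, 2.0), (gauss, 0.4)]:   # illustrative scales, not the paper's
    U = np.linalg.solve(ker(X, X, c), ker(grid, X, c).T).T  # cardinal functions on grid
    print(ker.__name__, np.abs(U).sum(axis=1).max())        # estimated Lebesgue constant
```

For larger Gaussian scales the kernel matrix becomes numerically singular in double precision, which is exactly the basis instability discussed in the introduction.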
[Plot omitted: Lebesgue constant against n (0 to 2000), interior vs. corner maxima, on a logarithmic scale between 10^0.28 and 10^0.36.]

Figure 1: Lebesgue constants for the Sobolev/Matern kernel

A second series of examples was run on 225 regular points in [−1, 1]² for different kernels at different scales, using a scale parameter c via Φ_c(x) = Φ(x/c). Figures 3 to 5 show how the scaling of the Gaussian kernel influences the shape of the associated Lagrange basis functions. The limit for large scales is called the flat limit [5]; there, the Lagrange basis function turns into a Lagrange basis function of the de Boor/Ron polynomial interpolation [9]. Such Lagrange basis functions cannot be expected to be uniformly bounded. In contrast, Figure 6 shows the corresponding Lagrange basis function for the Sobolev/Matern kernel at scale 320. The scales were chosen so large that the kernel matrices at double the scale would have had infeasible condition numbers. Figure 7 shows the Lebesgue function in the situation of Figure 5, while Figure 8 shows the Sobolev/Matern case in the situation of Figure 6.

Acknowledgments. This work has been supported by the CNR-DFG exchange programme 2006 and by the GNCS-Indam funds 2007.
[Plot omitted: Lebesgue constant against n (0 to 2000), interior vs. corner maxima, on a logarithmic scale between 10^0 and 10^2.]

Figure 2: Lebesgue constants for the Gauss kernel
Figure 3: Lagrange basis function on 225 data points, Gaussian kernel with scale 0.1
Figure 4: Lagrange basis function on 225 data points, Gaussian kernel with scale 0.2
Figure 5: Lagrange basis function on 225 data points, Gaussian kernel with scale 0.4
Figure 6: Lagrange basis function on 225 data points, Sobolev/Matern kernel with scale 320
Figure 7: Lebesgue function on 225 regular data points, Gaussian kernel with scale 0.4
Figure 8: Lebesgue function on 225 regular data points, Sobolev/Matern kernel with scale 320
References

[1] Bos, L., Caliari, M., De Marchi, S., Vianello, M. and Xu, Y., Bivariate Lagrange interpolation at the Padua points: the generating curve approach, J. Approx. Theory 143 (2006), 15–25.

[2] Caliari, M., De Marchi, S. and Vianello, M., Bivariate polynomial interpolation on the square at new nodal sets, Appl. Math. Comput. 165(2) (2005), 261–274.

[3] De Marchi, S., On optimal center locations for radial basis interpolation: computational aspects, Rend. Sem. Mat. Torino 61(3) (2003), 343–358.

[4] De Marchi, S., Schaback, R. and Wendland, H., Near-optimal data-independent point locations for radial basis function interpolation, Adv. Comput. Math. 23(3) (2005), 317–330.

[5] Driscoll, T.A. and Fornberg, B., Interpolation in the limit of increasingly flat radial basis functions, Comput. Math. Appl. 43 (2002), 413–422.

[6] Iske, A., Optimal distribution of centers for radial basis function methods, Tech. Rep. M0004, Technische Universität München, 2000.

[7] Madych, W.R., An estimate for multivariate interpolation II, J. Approx. Theory 142(2) (2006), 116–128.

[8] Schaback, R., Lower bounds for norms of inverses of interpolation matrices for radial basis functions, J. Approx. Theory 79 (1994), 287–306.

[9] Schaback, R., Multivariate interpolation by polynomials and radial basis functions, Constr. Approx. 21 (2005), 293–317.

[10] Schaback, R. and Wendland, H., Inverse and saturation theorems for radial basis function interpolation, Math. Comp. 71 (2002), 669–681.

[11] Wendland, H., Scattered Data Approximation, Cambridge Monographs on Applied and Computational Mathematics, Vol. 17, Cambridge University Press, 2005.

[12] Wendland, H. and Rieger, C., Approximate interpolation with applications to selecting smoothing parameters, Numer. Math. 101 (2005), 643–662.