POINTWISE ESTIMATES FOR MULTIVARIATE INTERPOLATION USING CONDITIONALLY POSITIVE DEFINITE FUNCTIONS

JEREMY LEVESLEY
University of Leicester, University Road, Leicester LE1 7RH, UK
Abstract. We seek pointwise error estimates for interpolants, on scattered data, constructed using a basis of conditionally positive definite functions of order $m$, and polynomials of degree not exceeding $m-1$. Two different approaches to the analysis of such interpolation are considered. The former uses distributions and reproducing kernel ideas, whilst the latter is based on a Lagrange function approach. Error estimates in terms of a point density measure are given for both methods of analysis.
1. Introduction

Over recent years it has become clear that radial functions are very useful tools for multivariate approximation. Two radial functions in particular have found favour with practitioners: the multiquadric, $h(x) = \sqrt{1 + \|x\|^2}$, and the thin plate spline, $h(x) = \|x\|^2 \log \|x\|$. In this paper we shall be considering the pointwise convergence of interpolation schemes employing conditionally positive definite (CPD) functions, a class which includes multiquadrics and thin plate splines, as well as other practically important functions.

The multiquadric was first considered by Hardy [5] more than twenty years ago, in the context of fitting surface data. Based on much experimental data, Franke [3] conjectured that the interpolation matrix, $[h(x_i - x_j)]_{ij}$, $i, j = 1, 2, \ldots, N$, is invertible for arbitrary distributions of the centres $x_1, x_2, \ldots, x_N$. This question was settled by Micchelli [8] in the wider setting of conditionally negative definite functions. He showed that the interpolation matrix possesses $N - 1$ negative eigenvalues and a single positive eigenvalue.

The theory for thin plate splines has been developed by Duchon [1, 2] and Meinguet [9]. Duchon showed that an interpolant of the form $s(x) = \sum_{i=1}^{N} c_i h(x - x_i) + p(x)$, where $p \in \Pi_1$, the linear polynomials, is the interpolant which minimizes a particular semi-norm. The thin plate spline is conditionally positive definite of order 2. So we see that the multiquadric and the thin plate spline fit into the wider framework of CPD functions of order $m$, abbreviated to $m$-CPD. In order to define such functions we look again at the formulation of Duchon. For the 2-CPD function $h(x) = \|x\|^2 \log \|x\|$ we look for interpolants of the form
$$ s(x) = \sum_{i=1}^{N} c_i h(x - x_i) + p(x), \qquad p \in \Pi_1. $$
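Micchelli's eigenvalue statement quoted above can be explored numerically. The following is a minimal sketch, not part of the original paper: it assumes the standard multiquadric $\sqrt{1 + \|x\|^2}$, and the $4 \times 4$ grid of centres and all variable names are illustrative choices.

```python
import numpy as np

def multiquadric(r):
    """Multiquadric as a function of r = ||x|| (assumed form sqrt(1 + r^2))."""
    return np.sqrt(1.0 + r ** 2)

# Illustrative centres: a 4 x 4 grid with unit spacing (an assumption, any
# set of distinct points could be used instead).
g = np.arange(4.0)
centres = np.array([(a, b) for a in g for b in g])

# Interpolation matrix A_ij = h(x_i - x_j).
R = np.linalg.norm(centres[:, None, :] - centres[None, :, :], axis=-1)
A = multiquadric(R)

# A is symmetric, so eigvalsh applies; count the signs of the eigenvalues.
eigenvalues = np.linalg.eigvalsh(A)
n_pos = int(np.sum(eigenvalues > 0))
n_neg = int(np.sum(eigenvalues < 0))
print("positive:", n_pos, "negative:", n_neg)
```

For distinct centres the theory predicts one positive and $N - 1$ negative eigenvalues; the counts printed by the sketch can be compared against that prediction.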
For general $m$-CPD functions $h$ we seek an interpolant of the form
$$ s(x) = \sum_{i=1}^{N} c_i h(x - x_i) + p(x), \qquad p \in \Pi_{m-1}. $$
Now, this system has $N + M$, $M = \dim \Pi_{m-1}$, degrees of freedom, where $N$ of these degrees of freedom are absorbed by the interpolation conditions
$$ s(x_i) = v_i. \tag{1} $$
So, in order to have enough conditions to uniquely determine the interpolant we need $M$ more conditions. What we will do is insist that
$$ \sum_{i=1}^{N} c_i p(x_i) = 0, \qquad p \in \Pi_{m-1}. \tag{2} $$
Definition 1 A function $h$ is $m$-CPD if and only if for all choices of $N$ and the data points $x_1, x_2, \ldots, x_N$, the quadratic form
$$ \sum_{i,j=1}^{N} c_i c_j h(x_i - x_j) > 0 $$
for all nonzero vectors $c \in \mathbb{R}^N$ such that for any $p \in \Pi_{m-1}$,
$$ \sum_{i=1}^{N} c_i p(x_i) = 0. $$
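Definition 1 can be checked numerically for a given point set by restricting the quadratic form to the vectors $c$ that annihilate the polynomials. The sketch below is illustrative only: it assumes the thin plate spline in two dimensions with $m = 2$, a $5 \times 5$ grid of data points, and a QR factorisation to obtain a basis `Z` of the admissible vectors; none of these choices comes from the paper.

```python
import numpy as np

def tps(r):
    """Thin plate spline r^2 log r, with the value 0 at r = 0."""
    out = np.zeros_like(r, dtype=float)
    pos = r > 0
    out[pos] = r[pos] ** 2 * np.log(r[pos])
    return out

# Illustrative data points: a 5 x 5 grid on the unit square (assumption).
g = np.linspace(0.0, 1.0, 5)
X = np.array([(a, b) for a in g for b in g])
N = len(X)

# A_ij = h(x_i - x_j); P holds the basis 1, x_1, x_2 of the linear polynomials.
R = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
A = tps(R)
P = np.hstack([np.ones((N, 1)), X])

# The trailing columns of the complete Q factor span {c : P^T c = 0},
# i.e. the vectors satisfying the side conditions in Definition 1.
Q, _ = np.linalg.qr(P, mode="complete")
Z = Q[:, P.shape[1]:]

# Eigenvalues of the quadratic form restricted to the admissible vectors.
restricted_eigs = np.linalg.eigvalsh(Z.T @ A @ Z)
print("smallest restricted eigenvalue:", restricted_eigs.min())
```

A strictly positive smallest eigenvalue is what Definition 1 requires for this particular point set; the check says nothing about other configurations.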
We can write the interpolation problem and side conditions (1) and (2) in the matrix form
$$ \begin{pmatrix} A & P \\ P^T & 0 \end{pmatrix} \begin{pmatrix} c \\ \lambda \end{pmatrix} = \begin{pmatrix} v \\ 0 \end{pmatrix}, \tag{3} $$
where $A_{ij} = h(x_i - x_j)$, $i, j = 1, 2, \ldots, N$, and $P_{i\alpha} = x_i^{\alpha}$, $i = 1, 2, \ldots, N$, $|\alpha| < m$, adopting the usual multiindex notation, and $\lambda$ is the vector of coefficients of the polynomial $p$ with respect to the monomials $x^{\alpha}$. It can be shown that if $h$ is $m$-CPD the vector $c$ is uniquely determined by the data vector $v$. However, if the data points have a certain geometric configuration it may be that the polynomial is not uniquely determined. For the purpose of this discussion we will assume that (3) is always uniquely solvable.

For the remainder of this paper it will be convenient to write the interpolant in a different form. Using the standard definition of the convolution of a measure with a function (see e.g. Rudin [11]) we can write
$$ \sum_{i=1}^{N} c_i h(x - x_i) = \nu_c * h(x), $$
where
$$ \nu_c = \sum_{i=1}^{N} c_i \delta_{x_i}. $$
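(The standard definition tells us that $\delta_x * h(y) = h(y - x)$.) Note that $\nu_c(p) = 0$ if $p \in \Pi_{m-1}$, because of conditions (2). We write $\nu_c \in \Pi_{m-1}^{\perp} = \{\nu : \nu(p) = 0 \text{ for all } p \in \Pi_{m-1}\}$. Hence, the interpolant
$$ s(x) = \sum_{i=1}^{N} c_i h(x - x_i) + p, \qquad p \in \Pi_{m-1}, $$
can be written as
$$ s = \nu_c * h + p, \qquad p \in \Pi_{m-1}, \quad \nu_c \in \Pi_{m-1}^{\perp}, \quad \operatorname{supp} \nu_c \subset X = \{x_1, x_2, \ldots, x_N\}. \tag{4} $$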
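The block system combining the interpolation conditions (1) and the side conditions (2) can be assembled and solved directly. The following is a minimal sketch under stated assumptions: the thin plate spline with $m = 2$ in two dimensions, a small grid of centres, and the helper names `fit_interpolant` and `evaluate`, which are hypothetical and not taken from the paper.

```python
import numpy as np

def tps(r):
    """Thin plate spline r^2 log r, with the value 0 at r = 0."""
    out = np.zeros_like(r, dtype=float)
    pos = r > 0
    out[pos] = r[pos] ** 2 * np.log(r[pos])
    return out

def pairwise_dist(X, Y):
    """Matrix of Euclidean distances between the rows of X and the rows of Y."""
    return np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)

def fit_interpolant(centres, values):
    """Solve the block system for c and the polynomial coefficients lam."""
    N = len(centres)
    A = tps(pairwise_dist(centres, centres))
    P = np.hstack([np.ones((N, 1)), centres])  # basis 1, x_1, x_2 of Pi_1
    M = P.shape[1]
    K = np.block([[A, P], [P.T, np.zeros((M, M))]])
    rhs = np.concatenate([values, np.zeros(M)])
    sol = np.linalg.solve(K, rhs)
    return sol[:N], sol[N:]

def evaluate(x, centres, c, lam):
    """Evaluate s(x) = sum_i c_i h(x - x_i) + p(x) at the rows of x."""
    x = np.atleast_2d(x)
    return tps(pairwise_dist(x, centres)) @ c + lam[0] + x @ lam[1:]

# Illustrative data: a 5 x 5 grid of centres and samples of a smooth function.
g = np.linspace(0.0, 1.0, 5)
centres = np.array([(a, b) for a in g for b in g])
values = np.sin(2 * np.pi * centres[:, 0]) * np.cos(np.pi * centres[:, 1])

c, lam = fit_interpolant(centres, values)
residual = np.max(np.abs(evaluate(centres, centres, c, lam) - values))
side = np.hstack([np.ones((len(centres), 1)), centres]).T @ c
print("max interpolation residual:", residual)
print("side conditions P^T c:", side)
```

Both printed quantities should be at the level of rounding error when the system is uniquely solvable, which is the assumption made in the text.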
In Section 2 we will follow work laid out in two papers of Madych & Nelson [6, 7]. There we will develop a specific theory for each $m$-CPD function which emulates Duchon inasmuch as an interpolant of the form (4) is shown to minimise a semi-norm. We will see in Section 2 that the functions for which we can furnish error estimates are not independent of $h$.

In [12], Wu & Schaback follow a different approach, involving the generation of Lagrange functions, which nevertheless has striking similarities. The interpolation problem is recast as a conditional minimisation problem, an interpolant of the above form being the minimiser of a quadratic functional. In Section 3 we review this approach, pointing out areas of similarity with Section 2.

In the following we shall assume that all integrals are over $\mathbb{R}^n$ and that $C$ is a generic constant which is not necessarily the same at each occurrence.
2. Reproducing Kernel Techniques
In this section we shall, using a fixed $m$-CPD function $h$, generate a space of functions $C_h$ which we can approximate. The space $C_h$ possesses a semi inner product $\langle \cdot, \cdot \rangle_h$, and the pair $C_h$, $\langle \cdot, \cdot \rangle_h$ has the following properties:
(A1) $\langle f, g \rangle_h = 0 \iff f$ or $g \in \Pi_{m-1}$; i.e. $\ker \langle \cdot, \cdot \rangle_h = \Pi_{m-1}$,
(A2) $C_h / \Pi_{m-1} = H$, a Hilbert space,
(A3) $\nu \in \Pi_{m-1}^{\perp} \Rightarrow \nu * h \in C_h$ and $\langle \nu * h, f \rangle_h = \nu(f)$, $f \in C_h$.
Property A3 above is the crucial result here and is the reproducing kernel property.

We can now give an idea of how the proof of pointwise convergence proceeds:
- fix $x$ (we are aiming for a pointwise estimate),
- find a measure $\nu_x \in \Pi_{\ell}^{\perp}$ ($\ell \ge m$) such that $\nu_x(f - s) = f(x) - s(x)$,
- use the reproducing kernel to write
$$ |f(x) - s(x)| = |\nu_x(f - s)| = |\langle \nu_x * h, f - s \rangle_h| \le \|\nu_x * h\|_h \, \|f - s\|_h, $$
- we obtain order $\ell$ convergence by arranging for $\nu_x$ to be supported in a small neighbourhood of $x$.
This idea will be made explicit later.

This section proceeds in four parts. First we construct the space $C_h$ with properties A1-A3. Then we use the reproducing kernel property to give an error estimate which depends on our ability to construct a measure with particular properties. In the third subsection we explicitly construct such a measure, and we close the section by discussing examples for which this approach gives good convergence results.

In [6] the space $C_h$ was generated without use of Fourier transforms, but this meant that describing the functions in $C_h$ was difficult. The methodology of [7], the one which we shall adopt, uses Fourier transform techniques and gives a clearer description of $C_h$.
In order to use Fourier transforms we require an alternative definition of $m$-CPD functions. This can be shown (see [7]) to be equivalent to Definition 1. In this definition $D$ is the space of compactly supported test functions.

Definition 2 A function $h$ is $m$-CPD $\iff$
$$ \int\!\!\int h(x - y) \phi(x) \phi(y) \, dx \, dy > 0, \tag{5} $$
for all $\phi \in D_m := \{\phi \in D : \int p \phi = 0 \ \forall \, p \in \Pi_{m-1}\}$.

We can rewrite (5) as
$$ \int h(x) \, \phi * \tilde{\phi}(x) \, dx > 0, \tag{6} $$
where $\tilde{\phi}(x) = \phi(-x)$.

Note 1 The set $D_m$ contains those functions whose Fourier transforms have zeros of order $m$ at the origin. These are useful as $m$-CPD functions have Fourier transforms with poles at the origin.

The result which is fundamental in the generation of the space $C_h$, with properties A1-A3, is the following one of Gel'fand & Vilenkin [4].

Theorem 1 Let $h$ be continuous and $m$-CPD. Then there exist
(i) a positive Borel measure $\mu$ on $\mathbb{R}^n \setminus \{0\}$,
(ii) a function $\chi \in D$ with $\hat{\chi}(\xi) - 1 = O(\|\xi\|^{2m+1})$ as $\xi \to 0$,
(iii) coefficients $\{a_{\gamma} : |\gamma| \le 2m\}$,
such that, for all $\psi \in D$,
$$ \int h(x) \psi(x) \, dx = \int \Big[ \hat{\psi}(\xi) - \hat{\chi}(\xi) \sum_{|\gamma| \le 2m-1} \frac{D^{\gamma} \hat{\psi}(0)}{\gamma!} \xi^{\gamma} \Big] d\mu(\xi) + \sum_{|\gamma| \le 2m} a_{\gamma} \frac{D^{\gamma} \hat{\psi}(0)}{\gamma!}. \tag{7} $$
Furthermore, for all $v \in V_m$, the Euclidean space of vectors whose components are indexed by the multiindices $\alpha$ with $|\alpha| = m$,
$$ \sum_{|\alpha| = |\beta| = m} v_{\alpha} v_{\beta} a_{\alpha + \beta} \ge 0. \tag{8} $$
The coefficients $\{a_{\gamma} : |\gamma| = 2m\}$ are uniquely determined by $h$.
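Note 2 We note here that this result is very similar to
$$ \int h(x) \psi(x) \, dx = \int \hat{h}(-\xi) \hat{\psi}(\xi) \, d\xi $$
if $h, \psi \in L^2(\mathbb{R}^n)$. Essentially then $d\mu(\xi) = \hat{h}(-\xi) \, d\xi$. To make the integral in (7) convergent we need to subtract off enough of the Taylor expansion of $\hat{\psi}$ near the origin so that the pole in $\hat{h}$ is cancelled out. This is the purpose of the subtraction in (7). The sum in (7) compensates for this subtraction. This note is intended as an explanation of the above result, not a proof. We shall make use later of the idea that $d\mu \approx \hat{h} \, d\xi$.

It turns out that (8) is an important result. For if we define the matrix $A$ by $A_{\alpha\beta} = a_{\alpha+\beta} / (\alpha! \, \beta!)$, $|\alpha| = |\beta| = m$, then, from (8), $A$ is real symmetric and positive semi-definite. Hence, we can define a semi inner product on $V_m$, $\langle v, w \rangle_A = v^T A w$, and semi-norm $\|w\|_A = (w^T A w)^{1/2}$. If we define $N_A = \{w \in V_m : Aw = 0\}$, then the quotient $H_A = V_m / N_A$ is a Hilbert space with inner product $\langle \cdot, \cdot \rangle_A$ inherited from $V_m$.

Now, in mind of (6), let us apply Theorem 1 to $\psi = \phi * \tilde{\phi}$, with $\phi \in D_m$. We obtain, using $(\phi * \tilde{\phi})^{\wedge} = |\hat{\phi}|^2$,
$$ \int h(x) \, \phi * \tilde{\phi}(x) \, dx = \int |\hat{\phi}(\xi)|^2 \, d\mu(\xi) + \sum_{|\gamma| = 2m} \frac{[D^{\gamma} |\hat{\phi}|^2](0)}{\gamma!} a_{\gamma}. \tag{9} $$
Here we have used the fact that $\phi \in D_m$, i.e. $D^{\gamma} \hat{\phi}(0) = 0$, $|\gamma| < m$, to set $[D^{\gamma} |\hat{\phi}|^2](0) = 0$ if $|\gamma| < 2m$. Now, using Leibniz' formula, for $|\gamma| = 2m$,
$$ D^{\gamma} |\hat{\phi}|^2(0) = \sum_{\beta \le \gamma} \frac{\gamma!}{\beta! \, (\gamma - \beta)!} D^{\beta} \hat{\phi}(0) D^{\gamma - \beta} \hat{\phi}(0) = \sum_{\substack{|\alpha| = |\beta| = m \\ \alpha + \beta = \gamma}} \frac{\gamma!}{\alpha! \, \beta!} D^{\alpha} \hat{\phi}(0) D^{\beta} \hat{\phi}(0), $$
as, again, $D^{\gamma} \hat{\phi}(0) = 0$ if $|\gamma| < m$. Hence, the sum in (9) above is
$$ \sum_{|\alpha| = |\beta| = m} D^{\alpha} \hat{\phi}(0) D^{\beta} \hat{\phi}(0) \frac{a_{\alpha+\beta}}{\alpha! \, \beta!} = \hat{\phi}_m^T A \hat{\phi}_m, $$
where $\hat{\phi}_m \in V_m$ and $[\hat{\phi}_m]_{\alpha} = D^{\alpha} \hat{\phi}(0)$, $|\alpha| = m$. Thus, we can rewrite (9) as
$$ \int h(x) \, \phi * \tilde{\phi}(x) \, dx = \int |\hat{\phi}|^2 \, d\mu + \|\hat{\phi}_m\|_A^2 \ge 0, \tag{10} $$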
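using the fact that $\mu$ is positive. This result will be crucial in proving A1-A3 concerning $C_h$.

2.1. THE SPACE $C_h$

We now define the space $C_h$; a complete description of it will be provided later.

Definition 3 $f \in C_h \subset C(\mathbb{R}^n) \iff$ there is a constant $c(f)$ such that, for all $\phi \in D_m$,
$$ |\langle f, \phi \rangle| \le c(f) \Big\{ \int |\hat{\phi}(\xi)|^2 \, d\mu(\xi) + \|\hat{\phi}_m\|_A^2 \Big\}^{1/2}. \tag{11} $$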
The smallest value $c^*(f) = \inf c(f)$ is a semi-norm on $C_h$. It can easily be seen that $c^*$ is homogeneous and satisfies the triangle inequality. We know that if $f \in \Pi_{m-1}$, $\langle f, \phi \rangle = 0$ by the definition of $D_m$. Thus, $\Pi_{m-1}$ is in the kernel of this semi-norm. In fact, it can be shown that $\Pi_{m-1}$ is the kernel. Recall here property A1. Notice on the right hand side of (11) the norm on the Hilbert space $H = L^2(\mu) \oplus H_A$. This Hilbert space will be used later. In this case note that (11) can be rewritten
$$ |\langle f, \phi \rangle| \le \|f\|_h \, \|\hat{\phi} \oplus (\hat{\phi}_m + N_A)\|_H, \tag{12} $$
a Cauchy-Schwarz type inequality. Here we have written $c^*(f)$ as the semi-norm $\|f\|_h$.

It is now enlightening to see what sort of functions are in $C_h$. First of all it is clear that $\Pi_{m-1} \subset C_h$. Secondly, if for $v \in V_m$, we set $q(x) = \sum_{|\alpha| = m} (Av)_{\alpha} (-ix)^{\alpha}$, then
$$ \langle q, \phi \rangle = \sum_{|\alpha| = m} (Av)_{\alpha} \int (-ix)^{\alpha} \phi(x) \, dx = \sum_{|\alpha| = m} (Av)_{\alpha} D^{\alpha} \hat{\phi}(0) = \langle \hat{\phi}_m, v \rangle_A \le \|v\|_A \|\hat{\phi}_m\|_A. $$
In mind of (11) we see that $q \in C_h$ and $\|q\|_h = \|v\|_A$.

Finally, as noted after Theorem 1, $d\mu(\xi) = \hat{h}(-\xi) \, d\xi$, so that, if for $g \in L^2(\mu)$, we define $\psi$ by $\hat{\psi}(-\xi) = g(\xi) \hat{h}(-\xi)$, then, in essence we have
$$ |\langle \psi, \phi \rangle| = \Big| \int \psi(x) \phi(x) \, dx \Big| = \Big| \int \hat{\psi}(-\xi) \hat{\phi}(\xi) \, d\xi \Big| = \Big| \int \hat{\phi}(\xi) g(\xi) \, d\mu(\xi) \Big| \le \|\hat{\phi}\|_{L^2(\mu)} \|g\|_{L^2(\mu)}, $$
and $\psi \in C_h$ with $\|\psi\|_h = \|g\|_{L^2(\mu)}$.

So we have seen that elements of the form $p + q + \psi$, where $p \in \Pi_{m-1}$, and $q$ and $\psi$ are discussed in the previous paragraphs, are in $C_h$. The next result shows that such functions exhaust $C_h$.
Theorem 2
$$ C_h / \Pi_{m-1} \cong L^2(\mu) \oplus H_A = H. $$

Proof In this proof we associate with each element $f \in C_h$ a functional $\ell_f$ on $H$. Because $H$ is a Hilbert space this functional has a unique representer $r_f = g_f \oplus (w_f + N_A)$. The isomorphism in the statement of the theorem associates $f$ with $r_f$.

First define $J : D_m \to H$ by $J\phi = \hat{\phi} \oplus (\hat{\phi}_m + N_A)$. Then, from (12),
$$ f \in C_h \Rightarrow |\langle f, \phi \rangle| \le \|f\|_h \|J\phi\|_H, $$
so that $J\phi_1 = J\phi_2 \Rightarrow \langle f, \phi_1 \rangle = \langle f, \phi_2 \rangle$. Thus, the linear functional $\ell_f$ on $JD_m$, given by
$$ \ell_f(J\phi) = \langle f, \phi \rangle, $$
is well defined. It can be shown that $JD_m$ is dense in $H$ so that $\ell_f$ can be extended to $H$. $H$ is a Hilbert space. Therefore there exists a unique $r_f = g_f \oplus (w_f + N_A) \in H$ such that
$$ \langle f, \phi \rangle = \int g_f(\xi) \hat{\phi}(\xi) \, d\mu(\xi) + \langle w_f, \hat{\phi}_m \rangle_A. $$
Now, let $\Phi : C_h \to H$ be defined by
$$ \Phi f = r_f. \tag{13} $$
Then, from the remarks preceding this theorem, $\Phi$ is onto. Now, if $\Phi f = 0$, then $\langle f, \phi \rangle = 0$ and $f \in \Pi_{m-1}$. Hence, the kernel of this map is $\Pi_{m-1}$ and we obtain the result.

Implicitly with the above isomorphism we have the semi inner product $\langle f_1, f_2 \rangle_h = \langle \Phi f_1, \Phi f_2 \rangle_H$. Clearly then $\Pi_{m-1} = \ker \langle \cdot, \cdot \rangle_h$. The above theorem gives us properties A1 and A2. To prove A3 is more tricky. This is the topic of the next result.
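Proposition 1 If $\nu \in \Pi_{m-1}^{\perp}$ then $\nu * h \in C_h$ and $\langle \nu * h, f \rangle_h = \nu(f)$ for all $f \in C_h$.

Proof In this proof we explicitly construct the representer for $\nu * h$ using Theorem 1. We then use this representer to prove the proposition. Now, for $\phi \in D_m$, it is easy to show that $\langle \nu * h, \phi \rangle = \langle h, \phi_{\nu} \rangle$, where
$$ \phi_{\nu}(x) = \int \phi(x + z) \, d\nu(z). $$
So, using Theorem 1 we have

$\langle h, \phi_{\nu} \rangle = \int \hat{\phi}_{\nu}(\xi) - \hat{\chi}(\xi) \sum_{|\gamma|} \cdots$

[...]

$\cdots \, 0)$, where $d = \max_{y \in \mathbb{R}^n} \min_{x \in X} \|x - y\|$ is a measure of the density of the interpolation points, we obtain the estimate
$$ |f(x) - f_X(x)| \le K M^{\ell} d^{\ell} \int d|\nu_x|(y), \tag{22} $$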
and we have $O(d^{\ell})$ pointwise convergence as long as $\nu_x$ has finite total variation. The number $K$ above restricts the class of $m$-CPD functions we can use. For, if for $\ell \ge m$, $\int \|\xi\|^{2\ell} \, d\mu(\xi)$ does not converge then we cannot get an error estimate via this analysis. This is an important restriction as it disallows some important $m$-CPD functions; see Section 2.4 for examples.

2.3. CONSTRUCTION OF $\nu_x$

To complete the story we need to find a measure $\nu_x$ such that $\nu_x(p) = p(x)$ for all $p \in \Pi_{\ell-1}$ and $\operatorname{rad}(\operatorname{supp} \nu_x) < Md$ for some positive constant $M$. To do this we need a set of points, $\{a_{\alpha} : |\alpha| < \ell\}$, unisolvent with respect to the polynomials in $\Pi_{\ell-1}$. That is, the array $B_{\alpha\beta} = a_{\alpha}^{\beta}$, $|\alpha|, |\beta| < \ell$, is invertible. We also require that the set $\{b_{\alpha} : |\alpha| < \ell\}$, where $b_{\alpha} \in B_1(a_{\alpha}) = \{x : \|x - a_{\alpha}\| \le 1\}$, is also unisolvent. Note here that we are allowing $b_{\alpha}$ to be any point in $B_1(a_{\alpha})$. That such a set $\{a_{\alpha} : |\alpha| < \ell\}$ exists is clear by a simple continuity argument. Let $R$ be chosen such that $B_1(a_{\alpha}) \subset B_R(0)$ for all $|\alpha| < \ell$.

Now choose $t \in B_d(x)$, and consider the set of points $t + d a_{\alpha}$. Now, because $d$ is the density of interpolation points, there exists $x_{\alpha} \in B_d(t + d a_{\alpha})$ with $x_{\alpha} \in X$, the set of interpolation points. Thus, there exists $b_{\alpha} \in B_1(a_{\alpha})$ such that $x_{\alpha} = t + d b_{\alpha}$. Note that
$$ \|x - x_{\alpha}\| \le \|x - t\| + \|t - x_{\alpha}\| \le d + Rd = (R + 1) d. \tag{23} $$
Because of our choice of the set $\{a_{\alpha} : |\alpha| < \ell\}$ we know that $\{b_{\alpha} : |\alpha| < \ell\}$ is unisolvent. Therefore, there exists a set of Lagrange polynomials $\{p_{\alpha} : |\alpha| < \ell\}$ such that $p_{\alpha}(b_{\beta}) = \delta_{\alpha\beta}$. Now we define the measure $\nu_x$ by
$$ \nu_x = \sum_{|\alpha| < \ell} p_{\alpha}\Big( \frac{x - t}{d} \Big) \delta_{x_{\alpha}}. $$
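The Lagrange-polynomial construction above can be illustrated numerically. The sketch below is only an illustration under assumptions: it takes $n = 2$ and $\ell = 3$, uses a principal lattice for the points $a_{\alpha}$ with small fixed perturbations standing in for the $b_{\alpha}$, and obtains the weights $p_{\alpha}((x - t)/d)$ by solving a transposed Vandermonde system. The specific points, the value of $d$ and all names are hypothetical.

```python
import numpy as np

def monomials(Y, ell):
    """Evaluate the monomials y^beta, |beta| < ell, at the rows of Y (n = 2)."""
    Y = np.atleast_2d(Y)
    exps = [(i, j) for i in range(ell) for j in range(ell - i)]
    return np.column_stack([Y[:, 0] ** i * Y[:, 1] ** j for i, j in exps])

ell = 3                      # reproduce polynomials of degree < 3
d = 0.1                      # assumed density of the interpolation points
x = np.array([0.43, 0.57])   # evaluation point
t = np.array([0.40, 0.50])   # a point with ||x - t|| < d

# Assumed unisolvent set a_alpha (principal lattice) and perturbed points
# b_alpha, each within distance 1 of the corresponding a_alpha.
a = np.array([(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (0, 2)], dtype=float)
shift = 0.05 * np.array([(1, -1), (-1, 1), (1, 1), (-1, -1), (1, 0), (0, 1)])
b = a + shift
x_alpha = t + d * b          # stand-ins for the interpolation points x_alpha

# V[alpha, beta] = b_alpha^beta. The Lagrange values w_alpha = p_alpha(y)
# solve V^T w = (y^beta)_beta, where y = (x - t) / d.
V = monomials(b, ell)
y = (x - t) / d
w = np.linalg.solve(V.T, monomials(y, ell)[0])

def q(Z):
    """A test polynomial of degree 2 (arbitrary coefficients)."""
    Z = np.atleast_2d(Z)
    z1, z2 = Z[:, 0], Z[:, 1]
    return 1 + 2 * z1 - z2 + 0.5 * z1 ** 2 + 3 * z1 * z2 - 2 * z2 ** 2

approx = w @ q(x_alpha)      # nu_x applied to q
exact = q(x)[0]              # q evaluated at x
print("nu_x(q) =", approx, " q(x) =", exact)
```

The two printed values should agree up to rounding error, since $q(t + d \, \cdot)$ is again a polynomial of degree less than $\ell$ and the weights are the Lagrange polynomials for the points $b_{\alpha}$ evaluated at $(x - t)/d$.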