THE UNIVERSITY OF NEW SOUTH WALES SCHOOL OF MATHEMATICS DEPARTMENT OF APPLIED MATHEMATICS

MULTIVARIATE APPROXIMATION ON SPHERES, A REPRODUCING KERNEL APPROACH

by

QUOC THONG LE GIA

Supervisor: Prof. Ian H. Sloan

A thesis submitted for the degree of Bachelor of Science with Honours in Applied Mathematics.

University of New South Wales. November 1998

Acknowledgements

I would like to thank my supervisor, Professor Ian H. Sloan, for his continuous advice and guidance over the year. Thanks to Professor J. D. Ward and Professor K. Jetter for providing the materials for the last chapter. Thanks to all the members of the School of Mathematics, UNSW, for their time and consideration with my various questions. Thanks to Dr. W. McKee for his advice on the layout of the thesis. I am also grateful for the financial support from the AIDAB agency. Finally, I owe thanks to my parents, Mr. and Mrs. Le Gia Than, for their encouragement during the writing of the thesis.


Contents

Acknowledgements

1 Introduction

2 Mathematical Preliminaries
  2.1 Multivariate polynomials on the unit sphere
  2.2 Spherical harmonics
  2.3 Spherical t-designs
  2.4 Reproducing kernel Hilbert spaces
  2.5 Linear projections
  2.6 Stone-Weierstrass theorem

3 Polynomial interpolation on spheres
  3.1 Introduction
  3.2 Fundamental systems
  3.3 Multivariate Lagrangian interpolation
    3.3.1 General setting of Lagrangian interpolation
    3.3.2 Relation between Lagrangians and the reproducing kernel
    3.3.3 Relation between spherical t-design and Lagrangians
  3.4 The interpolation operator and its norm
  3.5 Extremal fundamental system

4 Polynomial hyperinterpolation on spheres
  4.1 Introduction
  4.2 Erdős-Turán property
  4.3 Polynomial hyperinterpolation
  4.4 The hyperinterpolation operator and its norm
  4.5 Construction of hyperinterpolation points

5 Optimal approximation on spheres
  5.1 Introduction
  5.2 Optimal approximation in a Hilbert space
  5.3 Positive definite kernels and native space
    5.3.1 Positive definite kernels on spheres
    5.3.2 Native spaces and multi-level approximation
  5.4 Interpolation in reproducing kernel Hilbert spaces
  5.5 Generalized Hermite interpolation on spheres
  5.6 Error analysis for Lagrangian interpolation

Appendix A
  A.1 Hahn-Banach theorem
  A.2 Laplace-Beltrami operator on S^{r−1}
  A.3 Markov inequality
  A.4 Sobolev spaces

Bibliography

Chapter 1 Introduction

The central problem of approximation theory can be described as follows. Given a Banach space X with a finite-dimensional linear subspace U, the best approximants u of x are those elements u ∈ U for which ‖x − u‖ is minimal. To make this precise, define the distance from x to U by

dist(x, U) = inf_{u ∈ U} ‖x − u‖.

If this infimum is attained by one or more elements, these elements are called best approximations of x in U; i.e. a member u of U is a best approximation of x if and only if ‖x − u‖ = dist(x, U). Classical approximation theory has dealt largely with the approximation of univariate (real or complex) functions. The Weierstrass approximation theorem, which states that every continuous function f ∈ C([−1, 1]) can be approximated uniformly by polynomials, is a major classical result. Typical spaces playing the role of X are C(S), L¹(S), L²(S), and typical subspaces playing the role of U are the polynomials of degree ≤ n, the trigonometric polynomials of order n, and spaces of spline functions with specified knots. Multivariate approximation theory concerns the approximation of functions of several real or complex variables. Though there were some important results in this field at the beginning of this century, multivariate issues have been pursued vigorously only since the 1970s. Approximation theory also deals with methods of construction such as interpolation,

2

CHAPTER 1. INTRODUCTION

hyperinterpolation, and with quadrature and cubature formulae, etc. The domain of interest can be a circle, a sphere, a unit ball, a cube or a hyperplane in R^r. The theory of multivariate constructive approximation deals with ways to construct good approximants for multivariate functions. There are two major trends in recent developments of the theory: one concentrates on multivariate approximation in the Euclidean space R^r, the other on approximation on the unit sphere S^{r−1}. In the Euclidean space R^r setting, E. W. Cheney [7] reduces multivariate problems to univariate problems by tensor-product arguments, exploiting their linear structure, while M. J. D. Powell [29] uses radial basis functions to reduce the dimension. In the unit sphere setting, restricted to approximation of functions on the unit sphere, M. Reimer [30] uses the bizonal reproducing kernel together with the Addition Theorem of spherical harmonics to bring down the dimension, and F. Narcowich and J. D. Ward introduce strictly positive definite functions on spheres and discuss the unisolvency of the interpolation problem. K. Jetter et al. [16] consider the Lagrangian interpolation problem as the orthogonal projection onto some finite-dimensional spline space, called the native space. There is also an attempt to combine wavelets and spherical harmonics in the approximation of functions on S^{r−1}. The results of this trend can be found in [14, 26, 27].

In the tensor-product arguments, we start with two Banach spaces, X and Y. Then we construct expressions of the form ∑_{i=1}^{n} x_i ⊗ y_i, where n ∈ N, x_i ∈ X, and y_i ∈ Y. If the elements of X are functions defined on a set S and the elements of Y are functions defined on a set T, then the expression ∑ x_i ⊗ y_i can be interpreted as a function on S × T, namely the function

(s, t) ↦ ∑_{i=1}^{n} x_i(s) y_i(t).
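Two formally different finite sums of this form can represent the same function on S × T. A minimal numerical sketch (the factor functions below are hypothetical, chosen only for illustration):

```python
import numpy as np

# Two formally different representations of the same element of X (x) Y,
# compared through point evaluations (s, t) -> sum_i x_i(s) y_i(t).
def rep1(s, t):
    # x1 (x) y1 + x1 (x) y2
    return np.cos(s) * np.exp(t) + np.cos(s) * np.tanh(t)

def rep2(s, t):
    # x1 (x) (y1 + y2): a shorter representation of the same function
    return np.cos(s) * (np.exp(t) + np.tanh(t))

rng = np.random.default_rng(0)
pts = rng.uniform(-1.0, 1.0, size=(100, 2))
print(max(abs(rep1(s, t) - rep2(s, t)) for s, t in pts))
```

Applying point-evaluation functionals φ = δ_s, ψ = δ_t to both representations gives the same values everywhere, which is exactly the equivalence defined next in (1.1)-(1.2).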

The set of all such expressions is denoted by X ⊗ Y, an algebraic tensor product of X and Y. An equivalence relation on X ⊗ Y is defined as follows: we write

∑_{i=1}^{n} x_i ⊗ y_i ≃ ∑_{j=1}^{m} u_j ⊗ v_j   (1.1)

if and only if

∑_{i=1}^{n} φ(x_i) ψ(y_i) = ∑_{j=1}^{m} φ(u_j) ψ(v_j)   (1.2)

for all φ ∈ X* and all ψ ∈ Y*. (Here X* denotes the Banach space conjugate to X.) Thus, the two expressions in (1.1) are considered equivalent if they represent the same function on X* × Y* as defined in (1.2). Then, from a representative of each equivalence class of X ⊗ Y, the algebra of such expressions is constructed, and from it the approximants of a bivariate function f(s, t).

The bizonal reproducing kernel arguments mainly deal with polynomial approximation on the unit sphere. The main tool is the reproducing kernel G : S^{r−1} × S^{r−1} → R of U, where U is a finite-dimensional subspace of C(S^{r−1}), equipped with the inner product ⟨·, ·⟩, within which the approximation is to be sought. By definition, the reproducing kernel G satisfies:

G(x, ·) ∈ U for x ∈ S^{r−1},   (1.3)

G(x, y) = G(y, x) for (x, y) ∈ S^{r−1} × S^{r−1},   (1.4)

⟨G(x, ·), F⟩ = F(x) for x ∈ S^{r−1}, F ∈ U.   (1.5)

The last property is the reproducing property. Now let {p_0, …, p_n} be an orthonormal basis for U; then

G(x, y) := ∑_{j=0}^{n} p_j(x) p_j(y), ∀x, y ∈ S^{r−1},

uniquely defines a reproducing kernel for U. The first two properties (1.3), (1.4) are obvious. The reproducing property can be verified by

⟨G(x, ·), F⟩ = ⟨∑_{j=0}^{n} p_j(x) p_j(·), F⟩ = ∑_{j=0}^{n} ⟨p_j, F⟩ p_j(x) = F(x),

where the last step follows because ⟨p_j, F⟩ is just the j-th Fourier coefficient of F with respect to the basis {p_j}, and F ∈ U. The uniqueness of G follows from the fact that if G′ is another reproducing kernel, then for some fixed point


y ∈ S^{r−1},

⟨G(·, y) − G′(·, y), G(·, y) − G′(·, y)⟩ = ⟨G(·, y), G(·, y) − G′(·, y)⟩ − ⟨G′(·, y), G(·, y) − G′(·, y)⟩
= (G(y, y) − G′(y, y)) − (G(y, y) − G′(y, y)) = 0.

Since {p_0, …, p_n} can be arbitrary, the spherical harmonics {Y_{ℓm}} on the sphere S^{r−1} can be chosen. If the spherical polynomial interpolating space is rotation-invariant, we can use the Addition Theorem, an important property of spherical harmonics, to reduce G(x, y) from a multivariate function to a univariate function g(t) ∈ C[−1, 1] via the relation

G(x, y) = g(x·y), ∀x, y ∈ S^{r−1},

where x·y is the usual dot product in R^r. This point will be elaborated in chapters 2 and 3.

One of the most important problems in approximation on S^{r−1} is Lagrangian interpolation. We are given an unknown continuous function f ∈ C(S^{r−1}), and x_1, …, x_N are prescribed points on S^{r−1}. Here, f(x_1), …, f(x_N) are given and N is the dimension of some finite-dimensional polynomial space U in which we wish to construct the interpolant of f. The polynomial interpolant of f, which we call Λf, is defined to be the polynomial such that

Λf(x_j) = f(x_j), for all j = 1, …, N.

Now the Lagrangians l_1, …, l_N defined on x_1, …, x_N, having the property l_i(x_j) = δ_{ij}, form a basis for U. Hence we can construct Λf through the Lagrangian polynomials l_i as follows:

Λf = ∑_{j=1}^{N} f(x_j) l_j.
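The same construction in one variable (a sketch on an interval rather than the sphere; the node locations below are arbitrary) shows how the interpolant is assembled from cardinal functions l_i with l_i(x_j) = δ_{ij}:

```python
import numpy as np

def cardinal(nodes, i, x):
    """The i-th Lagrange cardinal polynomial: l_i(x_j) = delta_ij."""
    others = np.delete(nodes, i)
    return np.prod([(x - xj) / (nodes[i] - xj) for xj in others], axis=0)

def interpolate(f, nodes):
    """Build the interpolant Lf = sum_j f(x_j) l_j as a callable."""
    fvals = [f(xj) for xj in nodes]
    return lambda x: sum(fv * cardinal(nodes, i, x) for i, fv in enumerate(fvals))

nodes = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
Lf = interpolate(np.exp, nodes)
# the interpolant matches f exactly at every node
print(max(abs(Lf(x) - np.exp(x)) for x in nodes))
```

With N nodes the interpolant reproduces every polynomial of degree ≤ N − 1, the one-dimensional analogue of reproducing the space U.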

An interesting fact, which will be proved in Chapter 3, is that if we define the matrix L with L_{ij} = ⟨l_i, l_j⟩ and the matrix G with G_{ij} = G(x_i, x_j), then L = G^{−1}. Most of the important estimates for Lagrangian interpolation, e.g. the Lebesgue constant and the Lagrangian square sums, see [31], are expressed through L, which in turn can be expressed through G and hence through the univariate function g ∈ C[−1, 1].

Hyperinterpolation, a generalisation of interpolation, was first introduced by Sloan in [34]. Hyperinterpolation can be defined over general regions, but here we are only interested in hyperinterpolation over the sphere S^{r−1}. Hyperinterpolation starts with a quadrature rule which is exact for every polynomial of degree ≤ 2n. Thus, we require

∑_{k=1}^{M} w_k p(t_k) = ∫_{S^{r−1}} p dS, ∀p ∈ P^r_{2n}(S^{r−1}),

where w_k > 0 and t_k ∈ S^{r−1} for k = 1, …, M, and P^r_{2n}(S^{r−1}) is the space of polynomials of degree ≤ 2n restricted to the sphere S^{r−1}. We remark that M ≥ N, where N is the dimension of the space of all polynomials of degree ≤ n, i.e. P^r_n(S^{r−1}). Then the polynomial hyperinterpolant L_n f ∈ P^r_n(S^{r−1}) of a continuous function f ∈ C(S^{r−1}) is defined as a projection onto the space spanned by an orthonormal basis {p_1, …, p_N}:

L_n f := ∑_{j=1}^{N} ⟨f, p_j⟩_M p_j,

where ⟨·, ·⟩_M is the semi-inner product

⟨u, v⟩_M := ∑_{k=1}^{M} w_k u(t_k) v(t_k),

in which the exact integral is replaced by the quadrature rule. The main result on hyperinterpolation is that the L²(S^{r−1})-norm of the hyperinterpolation operator, i.e.

‖L_n‖_2 = sup{‖L_n f‖_2 : ‖f‖_∞ ≤ 1},

remains bounded as n → ∞. Moreover, ‖L_n‖_2 attains the value of the orthogonal projection norm (the orthogonal projection has norm ‖Π‖_2 = √ω_{r−1}, see Chapter 4); that is,

‖L_n f‖_2 ≤ √ω_{r−1} ‖f‖_∞,

and

‖L_n f − f‖_2 ≤ 2 √ω_{r−1} inf_{χ ∈ P^r_n(S^{r−1})} ‖f − χ‖_∞.
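A one-dimensional sketch of the same recipe, on [−1, 1] instead of the sphere, with the (n+1)-point Gauss-Legendre rule (exact for degree ≤ 2n+1, hence certainly for 2n) playing the role of the quadrature:

```python
import numpy as np
from numpy.polynomial import legendre

def orthonormal_legendre(j, x):
    """p_j = sqrt((2j+1)/2) * P_j, orthonormal on [-1, 1]."""
    c = np.zeros(j + 1)
    c[j] = 1.0
    return np.sqrt((2 * j + 1) / 2.0) * legendre.legval(x, c)

def hyperinterpolate(f, n):
    """Degree-n hyperinterpolant L_n f = sum_j <f, p_j>_M p_j."""
    t, w = legendre.leggauss(n + 1)  # quadrature points and weights
    coeffs = [np.sum(w * f(t) * orthonormal_legendre(j, t)) for j in range(n + 1)]
    return lambda x: sum(c * orthonormal_legendre(j, x) for j, c in enumerate(coeffs))

Lf = hyperinterpolate(np.exp, 8)
print(abs(Lf(0.5) - np.exp(0.5)))
```

Because the quadrature is exact to degree 2n, the discrete coefficients ⟨p, p_j⟩_M coincide with the exact ones whenever p has degree ≤ n, so L_n reproduces every such polynomial.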

Thus ‖L_n f − f‖_2 → 0 as n → ∞. In the above, ω_{r−1} is the total surface area of S^{r−1}.

Approximation theory also deals with other kinds of problems, in which the data are already given in some sense. The general framework for Hilbert spaces was perhaps first set out by Golomb and Weinberger [19], in which all such problems are stated together by means of linear functionals. In the general problem of linear approximation of an unknown element x of a Hilbert space X equipped with a norm ‖·‖, according to [19], we are given the values

F_i(x) = f_i, i = 1, …, n,

and seek an approximation to the value of F(x). Here, F, F_1, …, F_n are given linear functionals defined for all elements of X. An important example is the point-evaluation functional δ_p, where p ∈ S^{r−1}. Without loss of generality, we can assume that F_1, …, F_n are linearly independent. However, in order to estimate |F(x)|, Golomb and Weinberger [19] require an additional nonlinear constraint, namely ‖x‖ ≤ r for some given positive real number r. From this constraint, the concept of a hypercircle is introduced:

C_r := {x ∈ X : ‖x‖ ≤ r, F_i(x) = f_i, i = 1, …, n}.

Geometrically, the hypercircle is the intersection of the hyperplane defined by F_i = f_i, i = 1, …, n, and the ball of radius r defined by ‖x‖ ≤ r. The main result of [19] is that there exists a unique element x* ∈ C_r, which is in fact the center of C_r, that best approximates x in the sense that

‖x*‖ = inf{‖x‖ : F_i(x) = f_i, i = 1, …, n}, with F_i(x*) = f_i, i = 1, …, n.

The estimate for |F(x)| is given by the so-called hypercircle inequality:

|F(x) − F(x*)|² ≤ |F(y*)|² (r² − ‖x*‖²), ∀x ∈ C_r,

where y* is the unique element with unit norm for which

F(y*) = sup{|F(y)| : F_i(y) = 0, i = 1, …, n, ‖y‖ = 1}.
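In a finite-dimensional Hilbert space the center x* is simply the minimum-norm solution of the data equations. A sketch in R^d with the Euclidean inner product, the functionals F_i represented by the rows of an arbitrary matrix (all data below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 6))    # rows represent the functionals F_i
f = rng.standard_normal(3)         # the prescribed data F_i(x) = f_i

# x*: the minimum-norm element satisfying the interpolation conditions,
# computed via the Moore-Penrose pseudoinverse
x_star = np.linalg.pinv(A) @ f

# x* interpolates the data
print(np.linalg.norm(A @ x_star - f))

# any other solution x* + z (with A z = 0) is longer, because x* is
# orthogonal to the null space of A
z = np.linalg.svd(A)[2][-1]        # a null-space direction of A
print(abs(x_star @ z), np.linalg.norm(x_star + z) - np.linalg.norm(x_star))
```

The orthogonality x* ⊥ {z : Az = 0} is exactly what makes x* the center of the hypercircle: ‖x* + z‖² = ‖x*‖² + ‖z‖² for every competing solution.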

Recently a new trend in approximation on spheres has emerged, pursued by Dyn, Narcowich, Ward [14], Cheney [8], and Jetter et al. [16], introducing (strictly) positive definite kernels and the hypercircle inequality as primary tools for generalized interpolation on spheres and compact Riemannian manifolds. The major result of this trend is the characterization of the existence and uniqueness of solutions of generalized interpolation problems on spheres in terms of (strictly) positive definite kernels. The error analysis for the interpolant makes extensive use of the hypercircle inequality and the symmetry of the unit sphere S^{r−1}, see [16] and [14].

In more detail, the notion of positive definite functions on spheres was first introduced by Schoenberg [32], as the class of real continuous functions g(t) such that for arbitrary points p_1, …, p_N on S^{r−1} and real numbers x_1, …, x_N,

∑_{i=1}^{N} ∑_{j=1}^{N} g(θ(p_i, p_j)) x_i x_j ≥ 0,

where N ≥ 2 and θ is the usual geodesic distance on S^{r−1}, θ(p, q) = arccos(p·q). Xu and Cheney [8] gave conditions for Schoenberg's spherical positive definite functions to be strictly positive definite, and hence provided a wide class of basis functions which can be used for generalized interpolation on S^{r−1}. The concept was extended by Narcowich [26] to the Sobolev space setting, in which the real numbers x_i are replaced by distributions in H^{−s}(S^{r−1}). Narcowich [14, Theorem 2.1] proved that a kernel κ ∈ H^{2s}(S^{r−1} × S^{r−1}) ∩ C(S^{r−1} × S^{r−1}) is positive definite if and only if ⟨u ⊗ u, κ⟩ ≥ 0 for every u ∈ H^{−s}(S^{r−1}), where the tensor product of two distributions u, v ∈ H^{−s}(S^{r−1}) is defined as

⟨u ⊗ v, κ⟩ = ∫_{S^{r−1}} u(p) (∫_{S^{r−1}} v(q) κ(p, q) dS(q)) dS(p)
           = ∫_{S^{r−1}} v(q) (∫_{S^{r−1}} u(p) κ(p, q) dS(p)) dS(q).   (1.6)


If, in (1.6) with v = u, equality implies that the distribution u = 0, then we say that κ is strictly positive definite on S^{r−1}.

In this framework, Narcowich [26] proposed that the generalized interpolation problem on S^{r−1} be stated as follows: given a linearly independent set {u_j}_{j=1}^{M} ⊂ H^{−s}(S^{r−1}), complex numbers d_j, j = 1, …, M, and a positive definite kernel κ on S^{r−1}, find u ∈ span{u_1, …, u_M} such that

∫_{S^{r−1}} u_j(p) (κ ⋆ u)(p) dS(p) = d_j, j = 1, …, M,

where

(κ ⋆ u)(p) := ∫_{S^{r−1}} κ(p, q) u(q) dS(q).

He also showed that if the kernel κ is strictly positive definite, then a solution of the problem exists and is unique.

In this thesis, we concentrate on multivariate approximation on the unit sphere S^{r−1}, and attempt to unify various trends of the theory in a reproducing kernel Hilbert space approach. Chapter 2 introduces the mathematical background needed for the work on the unit sphere: multivariate spherical polynomials, spherical harmonics, reproducing kernel Hilbert spaces and linear projections. Chapter 3 describes multivariate polynomial interpolation on the sphere; most of the material is concerned with Lagrangian interpolation on the sphere, its strengths and its limitations. To overcome the limitations of interpolation, hyperinterpolation is introduced in Chapter 4. Roughly speaking, hyperinterpolation suggests that the number of interpolation points should exceed the dimension of the space in order to achieve better accuracy. Reproducing kernels for finite-dimensional polynomial spaces play an important role throughout chapters 3 and 4. Finally, rather than restricting ourselves to polynomials, Chapter 5 introduces the problem of approximation on spheres by splines. In a reproducing kernel Hilbert space, optimal approximation is described as the orthogonal projection onto some finite-dimensional subspace which is spanned by splines on S^{r−1}. We discuss the hypercircle inequality and error analysis for Lagrangian interpolation on S^{r−1}.

The applications of the constructive and approximation theory of multivariate functions are diverse. In recent years, there has been an increasing awareness of the importance of multivariate approximation on the sphere, with obvious applications to meteorology, oceanography and satellite-based techniques such as the Global Positioning System (GPS). Applications also arise in quadrature, placing grids on S², tomography, coding theory, etc. Accounts of potential applications are given in [17] and [37].
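Returning to the generalized interpolation problem above: its simplest instance takes the u_j to be point evaluations, so that the system reduces to a symmetric linear system with matrix [κ(p_i, p_j)]. A sketch with the kernel κ(p, q) = exp(p·q), whose spherical expansion coefficients are all positive, so it is positive definite on the sphere (points and data below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
P = rng.standard_normal((8, 3))
P /= np.linalg.norm(P, axis=1, keepdims=True)   # 8 points on S^2

# interpolation matrix K_ij = kappa(p_i, p_j), kappa(p, q) = exp(p.q)
K = np.exp(P @ P.T)
d = np.sin(3.0 * P[:, 0]) * P[:, 2]             # arbitrary data d_j

# positive definiteness makes K invertible: the coefficients are unique
alpha = np.linalg.solve(K, d)

def s(x):
    """Kernel interpolant s(x) = sum_j alpha_j kappa(x, p_j)."""
    return np.exp(P @ x) @ alpha

print(max(abs(s(p) - dj) for p, dj in zip(P, d)))
```

This is the discrete shadow of the existence-and-uniqueness result quoted above: strict positive definiteness of κ guarantees that the Gram matrix K is nonsingular for distinct points.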

Chapter 2 Mathematical Preliminaries

2.1 Multivariate polynomials on the unit sphere

Let us consider the Euclidean space R^r. The unit sphere S^{r−1} is defined by

S^{r−1} := {x ∈ R^r : ∑_{i=1}^{r} x_i² = 1}.

By P^r we denote the linear space of all real polynomials of the form

P(x) = ∑_m c_m x^m,

where the sum is finite and the m = (m_1, …, m_r) ∈ Z_+^r are multi-indices. In this context, x^m = x_1^{m_1} ⋯ x_r^{m_r}.

In addition, let |m| := m_1 + m_2 + … + m_r. We define the following subspaces of P^r which will be used in subsequent chapters. The space of all polynomials from P^r with total degree at most n is

P^r_n := {P : P(x) = ∑_{|m|≤n} c_m x^m}.

The space of all polynomials from P^r with total degree exactly equal to n is

*P^r_n := {P : P(x) = ∑_{|m|=n} c_m x^m}.

The space of all polynomials restricted to the unit sphere with total degree at most n is

P^r_n(S^{r−1}) := {P|_{S^{r−1}} : P(x) = ∑_{|m|≤n} c_m x^m}.

The space of all homogeneous polynomials restricted to the unit sphere with total degree n is

*P^r_n(S^{r−1}) := {P|_{S^{r−1}} : P(x) = ∑_{|m|=n} c_m x^m}.

The dimensions of the above polynomial spaces are given by the following lemmas, in which C(a, b) denotes the binomial coefficient a!/(b!(a−b)!).

Lemma 2.1.1 The dimension of P^r_n is

dim P^r_n = C(n+r, r).

Proof. We can construct a basis for P^r_n by picking all monomials of the form x_1^{m_1} ⋯ x_r^{m_r}, where

m_1 + … + m_r ≤ n.   (2.1)

The number of non-negative integer solutions of inequality (2.1) is C(n+r, r). □
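The count in the proof can be checked by brute-force enumeration (a small Python check of Lemma 2.1.1):

```python
from itertools import product
from math import comb

def dim_Pn(r, n):
    """Count the multi-indices m in Z_+^r with |m| <= n by enumeration."""
    return sum(1 for m in product(range(n + 1), repeat=r) if sum(m) <= n)

# the enumeration agrees with the closed form C(n + r, r)
for r in range(1, 5):
    for n in range(6):
        assert dim_Pn(r, n) == comb(n + r, r)
print("ok")
```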



Lemma 2.1.2 The dimension of *P^r_n is

dim *P^r_n = C(n+r−1, r−1).

Proof. Firstly, we prove that the two spaces *P^r_n and P^{r−1}_n are isomorphic, i.e. *P^r_n ≅ P^{r−1}_n, by defining the map

H : P^{r−1}_n → *P^r_n, p(x_1, …, x_{r−1}) ↦ x_r^n p(x_1/x_r, …, x_{r−1}/x_r).

Obviously, the map H is linear and surjective. Furthermore, if Hp = Hq for all x ∈ R^r, then by putting x_r = 1 we obtain p = q for all x ∈ R^{r−1}, i.e. p ≡ q. Thus H is bijective and the lemma is proved. □

An important property concerning the decomposition of P^r_n(S^{r−1}) is stated in the following theorem:

Theorem 2.1.1 For n ∈ N_+ and r ≥ 2,

P^r_n(S^{r−1}) = *P^r_n(S^{r−1}) ⊕ *P^r_{n−1}(S^{r−1}).

Proof. Consider the subspace Q^r_n(S^{r−1}) of P^r_n(S^{r−1}) defined by

Q^r_n(S^{r−1}) := {Q ∈ P^r_n(S^{r−1}) : Q(−x) = (−1)^n Q(x)}.

Then, since a polynomial can be decomposed into its even and odd parts, we have

P^r_n(S^{r−1}) = Q^r_n(S^{r−1}) ⊕ Q^r_{n−1}(S^{r−1}).

Obviously, *P^r_n(S^{r−1}) ⊂ Q^r_n(S^{r−1}) and *P^r_{n−1}(S^{r−1}) ⊂ Q^r_{n−1}(S^{r−1}). Furthermore, every element of Q^r_n(S^{r−1}) is of the form

Q(x) = ∑_{i=0}^{⌊n/2⌋} ∑_{|m|=n−2i} c_m x^m.

As x ∈ S^{r−1}, we have |x| = 1 and

Q(x) = |x|^n Q(x/|x|) = ∑_{i=0}^{⌊n/2⌋} ∑_{|m|=n−2i} c_m (x_1² + … + x_r²)^i x^m,

which is an element of *P^r_n(S^{r−1}); hence Q^r_n(S^{r−1}) ⊂ *P^r_n(S^{r−1}). Thus we have

Q^r_n(S^{r−1}) = *P^r_n(S^{r−1}),

and by similar arguments we also get

Q^r_{n−1}(S^{r−1}) = *P^r_{n−1}(S^{r−1}).

So the theorem is proved. □

We immediately have a corollary about the dimension of P^r_n(S^{r−1}):

Corollary 2.1.1

dim P^r_n(S^{r−1}) = C(n+r−1, r−1) + C(n+r−2, r−1).
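The corollary can be illustrated numerically: evaluating all monomials of degree ≤ n at many random points of the sphere and computing the matrix rank exposes the single relation x_1² + … + x_r² = 1 that restriction to the sphere imposes (r = 3 and n = 2 are chosen for the check):

```python
import numpy as np
from itertools import product
from math import comb

r, n = 3, 2
rng = np.random.default_rng(3)
X = rng.standard_normal((200, r))
X /= np.linalg.norm(X, axis=1, keepdims=True)     # 200 points on S^{r-1}

# evaluate every monomial x^m with |m| <= n at the sample points
monomials = [m for m in product(range(n + 1), repeat=r) if sum(m) <= n]
V = np.column_stack([np.prod(X ** np.array(m), axis=1) for m in monomials])

# 10 monomials in R^3, but on the sphere the relation x1^2+x2^2+x3^2 = 1
# lowers the rank to C(n+r-1, r-1) + C(n+r-2, r-1) = 9
print(len(monomials), np.linalg.matrix_rank(V))
```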


Obviously, P^r_n(S^{r−1}) is invariant under rotations: if AA^T = I and P(x) ∈ P^r_n(S^{r−1}), then P_A, defined by P_A(x) := P(Ax), belongs to P^r_n(S^{r−1}). We now introduce some related concepts, adapted from Reimer [30]. In the above polynomial spaces there are some polynomials P with the property that P_A = P for certain rotations A. These are the zonal polynomials, given by the following definition:

Definition 1 A polynomial P is called zonal with axis t ∈ S^{r−1} if P(x) = f(t·x) for x ∈ S^{r−1}, where f is a univariate function [−1, 1] → R.

In the above definition and from now on, the notation x·y stands for the usual dot product in R^r. Next, given t ∈ S^{r−1}, we define a subgroup U^r_t of the rotation group SO(r) by

U^r_t := {A ∈ SO(r) : At = t}.

Theorem 2.1.2 (Zonal polynomials) Let r ≥ 3. Then the polynomial P ∈ P^r is zonal with axis t ∈ S^{r−1} if and only if P_A = P holds for all A ∈ U^r_t.

Proof. Suppose P_A = P for all A ∈ U^r_t. There exists an element u ∈ S^{r−1} with t·u = 0; keep u and t fixed. An arbitrary vector x ∈ S^{r−1} can be expressed as a linear combination of two orthogonal unit vectors t and v (with v in the plane spanned by t and x, v·t = 0), as follows:

x = (t·x) t + √(1 − (t·x)²) v.

Then we can find a rotation A ∈ U^r_t with Av = u, to get

Ax = (t·x) t + √(1 − (t·x)²) u.

Define f ∈ C[−1, 1] by f(ζ) := P(ζt + √(1 − ζ²) u) for ζ ∈ [−1, 1]; we then obtain

P(x) = P_A(x) = P(Ax) = f(t·x), x ∈ S^{r−1}.

The other direction is obvious, since for every orthogonal matrix A we have

At·Ax = (At)^T Ax = t^T A^T A x = t^T x = t·x. □

2.2 Spherical harmonics

One important class of polynomials on the sphere is the spherical harmonics. Let ∆ be the Laplacian operator in R^r, that is,

∆ = ∑_{j=1}^{r} ∂²/∂x_j².

From now through the end of the section, we assume that x ∈ S^{r−1}, i.e. |x| = 1. Following [25], we have the following definition:

Definition 2 Let H_ℓ(x) be a homogeneous polynomial of degree ℓ in R^r which satisfies ∆H_ℓ(x) = 0. Then Y_ℓ := H_ℓ|_{S^{r−1}} is called a (regular) spherical harmonic of order ℓ.

From the definition and Green's theorem, we have

0 = ∫_{|x|≤1} (H_m ∆H_n − H_n ∆H_m) dV = ∫_{S^{r−1}} H_m(x) H_n(x) (n − m) dS,

since

[∂H_m(rx)/∂r]_{r=1} = m H_m(x) and [∂H_n(rx)/∂r]_{r=1} = n H_n(x).

Therefore, for m ≠ n, we have

∫_{S^{r−1}} Y_m(x) Y_n(x) dS = 0.

Denote by *H^r_ℓ(S^{r−1}) the space of all spherical harmonics of order ℓ, and by H^r_n(S^{r−1}) the space of all spherical harmonics of degree ≤ n; then we have the following relations between the spaces, cf. [30]:

P^r_n(S^{r−1}) = H^r_n(S^{r−1}),   (2.2)

H^r_n(S^{r−1}) = ⊕_{ℓ=0}^{n} *H^r_ℓ(S^{r−1}).   (2.3)

The above relations give the dimension N(r, n) of *H^r_n(S^{r−1}) as

N(r, n) = dim P^r_n(S^{r−1}) − dim P^r_{n−1}(S^{r−1}).

With the help of Corollary 2.1.1, we have

N(r, n) = C(n+r−1, r−1) − C(n+r−3, r−1).

It is worth noting that

L²(S^{r−1}) = ⊕_{ℓ=0}^{∞} *H^r_ℓ(S^{r−1}),

i.e. the set of spherical harmonics is dense in L²(S^{r−1}). It is known that *H^r_ℓ(S^{r−1}) is rotation-invariant, since it is an eigenspace of the Laplace-Beltrami operator on S^{r−1}. So, if A is an orthogonal matrix, then Y_ℓ(Ax) ∈ *H^r_ℓ(S^{r−1}). Suppose further that the functions Y_{ℓj}, j = 1, …, N(r, ℓ), constitute an orthonormal set, i.e.

∫_{S^{r−1}} Y_{ℓj}(x) Y_{ℓk}(x) dS = δ_{jk}.   (2.4)
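A quick consistency check of this dimension formula against the decomposition (2.3): the dimensions of the order-ℓ spaces must sum to the dimension of P^r_n(S^{r−1}) from Corollary 2.1.1 (the n = 0 case is handled separately to avoid a negative binomial argument):

```python
from math import comb

def N(r, n):
    """Dimension of the space of order-n spherical harmonics on S^{r-1}."""
    if n == 0:
        return 1
    return comb(n + r - 1, r - 1) - comb(n + r - 3, r - 1)

# the decomposition H_n = sum of *H_l, l <= n, forces the dimensions to add:
# dim P_n^r(S^{r-1}) = sum_{l=0}^{n} N(r, l)
for r in range(2, 6):
    for n in range(8):
        dim_Pn_sphere = comb(n + r - 1, r - 1) + comb(n + r - 2, r - 1)
        assert dim_Pn_sphere == sum(N(r, l) for l in range(n + 1))
print("ok")
```

For example N(3, 2) = 5, the familiar count of spherical harmonics of order 2 on S².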

We can represent Y_{ℓj}(Ax) as a linear combination of the N(r, ℓ) spherical harmonics of order ℓ. In particular,

Y_{ℓj}(Ax) = ∑_{n=1}^{N(r,ℓ)} c_{ℓjn} Y_{ℓn}(x).   (2.5)

Equations (2.4) and (2.5) give

∫_{S^{r−1}} Y_{ℓj}(Ax) Y_{ℓk}(Ax) dS = ∑_{n=1}^{N(r,ℓ)} c_{ℓjn} c_{ℓkn}.   (2.6)

But the orthogonal transformation A leaves the surface element dS unchanged, i.e.

∫_{S^{r−1}} Y_{ℓj}(Ax) Y_{ℓk}(Ax) dS = ∫_{S^{r−1}} Y_{ℓj}(x) Y_{ℓk}(x) dS = δ_{jk}.

So, from (2.6) we now get

∑_{n=1}^{N(r,ℓ)} c_{ℓjn} c_{ℓkn} = δ_{jk}.

This means that the c_{ℓjn} are the elements of an orthogonal matrix; thus also

∑_{n=1}^{N(r,ℓ)} c_{ℓnj} c_{ℓnk} = δ_{jk}.   (2.7)

For any two points x, y we define the polynomial function

G_ℓ(x, y) = ∑_{j=1}^{N(r,ℓ)} Y_{ℓj}(x) Y_{ℓj}(y).   (2.8)

Then, because of (2.7), we have for any orthogonal matrix A

G_ℓ(Ax, Ay) = ∑_{j=1}^{N(r,ℓ)} Y_{ℓj}(Ax) Y_{ℓj}(Ay)
            = ∑_{j=1}^{N(r,ℓ)} (∑_{n=1}^{N(r,ℓ)} c_{ℓjn} Y_{ℓn}(x)) (∑_{m=1}^{N(r,ℓ)} c_{ℓjm} Y_{ℓm}(y))
            = ∑_{n=1}^{N(r,ℓ)} ∑_{m=1}^{N(r,ℓ)} δ_{nm} Y_{ℓn}(x) Y_{ℓm}(y)
            = G_ℓ(x, y).   (2.9)
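For r = 2, where N(2, ℓ) = 2 and the orthonormal harmonics of order ℓ can be taken as cos(ℓθ)/√π and sin(ℓθ)/√π, the rotation invariance of G_ℓ, and the fact that it depends on x and y only through x·y, can be checked directly:

```python
import numpy as np

l = 5
t1, t2 = 0.7, 2.1      # x = (cos t1, sin t1), y = (cos t2, sin t2) on S^1

def G(a, b):
    """G_l(x, y) = sum_j Y_lj(x) Y_lj(y) for the circle harmonics."""
    return (np.cos(l * a) * np.cos(l * b) + np.sin(l * a) * np.sin(l * b)) / np.pi

# rotating both points by the same angle leaves G_l unchanged ...
rot = 1.234
print(abs(G(t1, t2) - G(t1 + rot, t2 + rot)))

# ... because G_l depends only on x.y = cos(t1 - t2)
xy = np.cos(t1 - t2)
print(abs(G(t1, t2) - np.cos(l * np.arccos(xy)) / np.pi))
```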

Property (2.9) of G_ℓ asserts that G_ℓ is a bizonal function, as a consequence of the following definition and Theorem 2.2.1.

Definition 3 For r ≥ 3, a function G ∈ P^r(S^{r−1} × S^{r−1}) is called bizonal if for two arbitrary points x, y ∈ S^{r−1},

G(x, y) = g(x·y)

for some univariate function g : [−1, 1] → R.

The function is called bizonal since it is zonal in both arguments. Similarly to Theorem 2.1.2, there is a theorem for bizonal functions, adapted from [30]:

Theorem 2.2.1 Let x, y ∈ S^{r−1} be arbitrary. The function G ∈ P^r(S^{r−1} × S^{r−1}) is bizonal if and only if G(Ax, Ay) = G(x, y) for all A ∈ SO(r).

Proof. Suppose G(Ax, Ay) = G(x, y) for all matrices A ∈ SO(r). Fix y and choose A ∈ U^r_y; then G(·, y) is a zonal polynomial with axis y. By applying Theorem 2.1.2 to G(·, y), there exists a continuous univariate function g_y : [−1, 1] → R such that G(x, y) = g_y(x·y), x ∈ S^{r−1}. Now, g_y does not depend on y, since for all A ∈ SO(r) and x, y ∈ S^{r−1},

g_{Ay}(x·y) = G(Ax, Ay) = G(x, y) = g_y(x·y).

Hence g_y = g for some g : [−1, 1] → R independent of y. The reverse direction is trivial, since rotations preserve the dot product. □

Applied to the particular function (2.8), the bizonal property gives the following theorem:

Theorem 2.2.2 (Addition Theorem) Let {Y_{ℓk}} be an orthonormal set of N(r, ℓ) spherical harmonics of order ℓ on S^{r−1}. Then

∑_{k=1}^{N(r,ℓ)} Y_{ℓk}(x) Y_{ℓk}(y) = (N(r, ℓ)/ω_{r−1}) P^{(r)}_ℓ(x·y), x, y ∈ S^{r−1},

where ω_{r−1} is the total surface area of the sphere S^{r−1} and P^{(r)}_ℓ(t) is the Legendre polynomial of degree ℓ in R^r.

Proof. Firstly, we recall that the Legendre function (see [25]) is a homogeneous harmonic polynomial, zonal with respect to a fixed axis. Therefore, the univariate function

g : [−1, 1] → R in Theorem 2.2.1 is just a multiple of the Legendre function, so that

∑_{k=1}^{N(r,ℓ)} Y_{ℓk}(x) Y_{ℓk}(y) = C P^{(r)}_ℓ(x·y),

where C is some constant to be determined. Putting x = y and using P_ℓ(1) = 1, we get

∑_{k=1}^{N(r,ℓ)} [Y_{ℓk}(x)]² = C P_ℓ(1) = C.

Integration over S^{r−1} gives N(r, ℓ) = C ω_{r−1}, proving the theorem. □

An application of the Addition Theorem is the following lemma (cf. [25, Lemma 8]):

Lemma 2.2.1 Let Y_ℓ(x) be a spherical harmonic of degree ℓ. Then

|Y_ℓ(x)| ≤ √(N(r, ℓ)/ω_{r−1}) √(∫_{S^{r−1}} |Y_ℓ(x)|² dS).

Proof. We express Y_ℓ(x) as a linear combination of the Y_{ℓk}(x),

Y_ℓ(x) = ∑_{k=1}^{N(r,ℓ)} a_k Y_{ℓk}(x), where a_k = ∫_{S^{r−1}} Y_ℓ(x) Y_{ℓk}(x) dS.

By the Cauchy-Schwarz inequality and the Addition Theorem, we obtain

|Y_ℓ(x)|² ≤ (∑_{k=1}^{N(r,ℓ)} (a_k)²) ∑_{k=1}^{N(r,ℓ)} (Y_{ℓk}(x))² = (N(r, ℓ)/ω_{r−1}) P_ℓ(1) ∑_{k=1}^{N(r,ℓ)} (a_k)².

However,

∫_{S^{r−1}} Y_ℓ(x)² dS = ∑_{k=1}^{N(r,ℓ)} a_k ∫_{S^{r−1}} Y_ℓ(x) Y_{ℓk}(x) dS = ∑_{k=1}^{N(r,ℓ)} (a_k)².

Thus, the result follows immediately. □

2.3 Spherical t-designs

A spherical t-design, a notion introduced by Delsarte, Goethals and Seidel [12], is a set of points T = {t_1, …, t_M} on the sphere S^{r−1} such that the equal-weight quadrature rule based on these points is exact for all polynomials of degree ≤ t:

(ω_{r−1}/M) ∑_{k=1}^{M} p(t_k) = ∫_{S^{r−1}} p(x) dS, ∀p ∈ P^r_t(S^{r−1}),   (2.10)


where ω_{r−1} is the surface area of S^{r−1}. For the quadrature rule to be exact, [12] gives the lower bounds for the cardinality of T:

M ≥ C(r+n−1, r−1) + C(r+n−2, r−1), M ≥ 2 C(r+n−1, r−1),   (2.11)

for t = 2n and t = 2n + 1 respectively. We remark that, in the case t = 2n, the lower bound is the dimension of the polynomial space P^r_n(S^{r−1}). The relation between the cardinality of the set T and the dimension of the polynomial space suggests the following definition:

Definition 4 A spherical t-design T = {t_1, …, t_M} is called a tight spherical t-design if the cardinality of T attains the lower bound given in (2.11).

Examples of tight spherical t-designs for t = 2, 3, 4, 5, 7, 11 were constructed in [12]. However, Bannai and Damerell ([4]) proved:

Theorem 2.3.1 Let t = 2n with n ≥ 3, and let r ≥ 3. Then there exists no tight spherical t-design in S^{r−1}.

Ck (x). It was proved that if a tight design exists, then Rn (x) has all its roots rational. By reducing Rn (x) modulo various primes, it was shown that if its roots are all rational, then their reciprocals are all integers, and all of the same parity as r. Sn (x) is defined as the polynomial having these integer as its roots. Two cases of n are considered where n is even or odd. If n is even, say n = 2m, the sum of the roots of Sn (x) is −2m. Now Rn (x) is the sum of two Gegenbauer polynomials whose roots interlace, so we can divide the roots of Sn (x) into pairs, say a and b such that a > 0, b < 0, a > |b|. Since these are integers of same parity, we find b = −a + 2. Therefore Sn (x) is an even function of (x − 1). By expressing Sn (x) as a polynomial in (x − 1) and finding a nonzero coefficient we obtain a contradiction. If n is odd, say n = 2m + 1, then we pair off all but one of the roots in a similar way. As

before, $a + b \ge 2$ for each pair; since the sum of the roots of $S_n(x)$ is $-(r+2m)$, the unpaired root is $\le -(r+4m)$. But we can show $S_n(x) \ne 0$ in this interval; this contradiction proves the theorem for $n$ odd. □

2.4 Reproducing kernel Hilbert spaces

We take the general theory of reproducing kernels from Aronszajn [2]. Consider a linear class $H$ of functions $f(x)$ defined on a set $E$. Furthermore, assume that $H$ is complete, forming a Hilbert space with respect to an inner product $\langle\cdot,\cdot\rangle$. Then the function $G(x,y)$ is called a reproducing kernel of $H$ if:

(i) For all $y \in E$, $G(\cdot,y) \in H$.

(ii) $G(x,y) = G(y,x)$ for all $x, y \in E$.

(iii) The reproducing property: for every $y \in E$ and every $f \in H$, $f(y) = \langle f, G(\cdot,y)\rangle$.

Other properties of $G$ are summarized in the following:

Property 2.4.1 If a reproducing kernel exists, it is unique.

Proof. Suppose there exists another reproducing kernel $G'$. Then for some $y$ we would have
$$0 < \|G(\cdot,y) - G'(\cdot,y)\|^2 = \langle G(\cdot,y)-G'(\cdot,y),\, G(\cdot,y)\rangle - \langle G(\cdot,y)-G'(\cdot,y),\, G'(\cdot,y)\rangle = 0, \tag{2.12}$$
where both inner products equal $G(y,y) - G'(y,y)$ by the reproducing property of $G$ and $G'$; this contradiction proves uniqueness. □

Property 2.4.2 The reproducing kernel $G$ exists if and only if, for every $y \in E$, $f(y)$ is a continuous functional of $f$ over the Hilbert space $H$.

Proof. If $G$ exists, then applying the Cauchy-Schwarz inequality to $f(y) = \langle f, G(\cdot,y)\rangle$ gives
$$|f(y)| \le \|f\|\,\langle G(\cdot,y), G(\cdot,y)\rangle^{1/2} = G(y,y)^{1/2}\,\|f\|.$$
Conversely, if $f \mapsto f(y)$ is a continuous functional, then by the Riesz representation theorem there exists an element $g_y \in H$ such that $f(y) = \langle f, g_y\rangle$; putting $G(x,y) := g_y(x)$ then yields a reproducing kernel. □

Example 2.4.1 Consider the space $P_n[a,b]$ of all polynomials of degree $\le n$ on the interval $[a,b]$, with the usual inner product
$$\langle f, g\rangle := \int_a^b f(x)\, g(x)\, dx.$$
Because $P_n[a,b]$ is a finite-dimensional subspace of $L_2[a,b]$, $P_n[a,b]$ is a Hilbert space with respect to $\langle\cdot,\cdot\rangle$. Let $\{p_0, \ldots, p_n\}$ be an orthonormal basis for the space. Then we can verify that
$$G(x,y) := \sum_{j=0}^n p_j(x)\, p_j(y)$$
uniquely defines a reproducing kernel in $P_n[a,b]$. The first and second properties are obvious. The reproducing property is checked as follows:
$$\langle p, G(\cdot,y)\rangle = \Big\langle p,\, \sum_{j=0}^n p_j(\cdot)\, p_j(y)\Big\rangle = \sum_{j=0}^n \langle p, p_j\rangle\, p_j(y) = p(y).$$
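As a quick numerical illustration (our own sketch, not part of the thesis; the helper names `p`, `G` are ours), the reproducing property of this kernel can be checked on $[-1,1]$ using the orthonormalised Legendre polynomials and Gauss-Legendre quadrature:

```python
import numpy as np
from numpy.polynomial import legendre

n = 4  # work in P_4[-1, 1]

def p(j, x):
    """j-th orthonormal Legendre polynomial on [-1, 1]."""
    c = np.zeros(j + 1)
    c[j] = 1.0
    return legendre.legval(x, c) * np.sqrt((2 * j + 1) / 2.0)

def G(x, y):
    """Reproducing kernel G(x, y) = sum_j p_j(x) p_j(y)."""
    return sum(p(j, x) * p(j, y) for j in range(n + 1))

# check <q, G(., y)> = q(y) for a degree-4 polynomial q; the quadrature
# rule below is exact for the degree-8 integrand
nodes, weights = legendre.leggauss(2 * n + 2)
q = np.poly1d([1.0, -2.0, 0.5, 1.0, 0.3])
y = 0.3
inner = np.sum(weights * q(nodes) * G(nodes, y))
print(abs(inner - q(y)))  # essentially zero: the kernel reproduces q at y
```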

The following example of a reproducing kernel will be elaborated in Chapter 5.

Example 2.4.2 Consider the positive definite kernels in $C(S^{r-1} \times S^{r-1})$,
$$\kappa(p,q) := \sum_{\ell=0}^{\infty} \sum_{k=1}^{N(r,\ell)} a_{\ell k}\, Y_{\ell k}(p)\, Y_{\ell k}(q),$$
where all $a_{\ell k} > 0$ and $N(r,\ell)$ is the dimension of the space of spherical harmonics of degree $\ell$. Then $\kappa(p,q)$ defines a reproducing kernel in the native space
$$H_\kappa := \Big\{ f \in L_2(S^{r-1}) : \|f\|_\kappa^2 = \sum_{\ell,k} \frac{|\hat f_{\ell k}|^2}{a_{\ell k}} < \infty \Big\}.$$

2.5 Linear projections

Let $U$ be a normed linear space and $V$ a finite-dimensional subspace of $U$. For $u \in U$, define the minimal deviation of $u$ from $V$ as
$$E(u; V) := \inf\{\|v - u\| : v \in V\}.$$
Every element $v \in V$ with $\|v - u\| = E(u;V)$ is called a proximum (or best approximation) to $u$ in $V$. The existence of $v$ is given by the following fundamental theorem, which is adapted from E. W. Cheney [9]. Before the theorem is given, we need a lemma.

Lemma 2.5.1 Every closed, bounded subset of a finite-dimensional subspace $V$ of $U$ is compact.

Proof. Let $\{v_1, \ldots, v_n\}$ be a basis for $V$. For each element $v \in V$ there is a unique tuple $a = (a_1, \ldots, a_n)$ such that $v = \sum_{i=1}^n a_i v_i$. Define the map $T : \mathbb{R}^n \to V$ by $Ta := v$. With the norm $\|a\| := \max_{i=1,\ldots,n} |a_i|$ on $\mathbb{R}^n$, $T$ is continuous: for $a, b \in \mathbb{R}^n$,
$$\|Ta - Tb\| = \Big\| \sum_{i=1}^n a_i v_i - \sum_{i=1}^n b_i v_i \Big\| \le \sum_{i=1}^n |a_i - b_i|\, \|v_i\| \le \|a - b\| \sum_{i=1}^n \|v_i\|.$$
Now let $S$ be a closed, bounded subset of $V$. To prove that $S$ is compact, it is enough to show that $A = \{a : Ta \in S\}$ is compact. Firstly, $A$ is closed: if $a^{(k)} \to a$ with $a^{(k)} \in A$, then $Ta = T(\lim_k a^{(k)}) = \lim_k T(a^{(k)}) \in S$ since $S$ is closed, whence $a \in A$. Secondly, $A$ is bounded. Since the set $\{a : \|a\| = 1\}$ is compact and $T$ is continuous, the infimum $\alpha$ of $\|Ta\|$ is attained on that set; since $\{v_1, \ldots, v_n\}$ is linearly independent, $\alpha > 0$. Thus for any $a \ne 0$, $\|Ta\| = \|T(a/\|a\|)\|\, \|a\| \ge \alpha \|a\|$. Since $\|Ta\|$ is bounded on $A$, $\|a\|$ is bounded on $A$. □

Theorem 2.5.1 Let $\{v_1, \ldots, v_n\}$ be a basis for $V$ and $u$ an element of $U$. The problem of finding
$$\min_{a_i} \|u - (a_1 v_1 + \cdots + a_n v_n)\|$$
has a solution.

Proof. The solution lies in the set $M := \{v \in V : \|u - v\| \le \|u - w\|\}$, where $w$ is an arbitrary fixed element of $V$. The set $M$ is closed and bounded, so by the previous lemma $M$ is compact. Let $\delta = E(u;V)$. From the definition of an infimum, we may find a sequence of points $x_1, x_2, \ldots$ in $M$ with $\|u - x_n\| \to \delta$ as $n \to \infty$. By the compactness of $M$, we may assume that the sequence converges to a point $v$ of $M$ (extracting a subsequence if necessary). We show that $v$ is a point of minimum distance from $u$. From the triangle inequality, $\|u - v\| \le \|u - x_n\| + \|x_n - v\|$; as $n \to \infty$, we have $\|u - v\| \le \delta$. Since $v \in V$, $\|u - v\| \ge \delta$. Hence $\|u - v\| = \delta$. □

We denote by $L(U,V)$ the set of all bounded linear operators from $U$ to $V$. An operator $P \in L(U,V)$ is called a projection operator if $P$ is surjective and $P^2 = P$. We then have an important theorem:

Theorem 2.5.2 Let $P$ be a projection operator from a normed space $U$ onto a finite-dimensional subspace $V$. Then for arbitrary $u \in U$,
$$\|u - Pu\| \le (\|I\| + \|P\|)\, E(u;V),$$
where $\|I\|$ is the norm of the identity operator.

Proof. Since every $v \in V$ can be written as $v = Pu$ for some $u \in U$, it follows that $Pv = P(Pu) = P^2 u = Pu = v$. Let $v \in V$ be a proximum to $u$. Then
$$\|u - Pu\| = \|u - v + P(v - u)\| \le \|u - v\| + \|P(v - u)\| \le (\|I\| + \|P\|)\, E(u;V),$$
as stated. □

Assume further that $U$, equipped with $\langle\cdot,\cdot\rangle$, is a real inner product space and $V$ is one of its finite-dimensional subspaces. A projection $\Pi$ is called an orthogonal projection if
$$\langle u - \Pi u, v\rangle = 0$$
holds for all $u \in U$ and arbitrary $v \in V$. The orthogonal projection is uniquely determined: if $\Pi$ and $\Pi'$ are both orthogonal projections, then $\langle u - \Pi u, v\rangle = 0$ and $\langle u - \Pi' u, v\rangle = 0$ for all $u \in U$, $v \in V$. By subtraction, $\langle (\Pi - \Pi')u, v\rangle = 0$ holds for all $u \in U$, $v \in V$; taking $v = (\Pi - \Pi')u \in V$ gives $\|(\Pi - \Pi')u\| = 0$, hence $\Pi = \Pi'$. An orthogonal projection is a minimal projection in the sense of the following theorem:

Theorem 2.5.3 For an orthogonal projection $\Pi : U \to V$, we have $\|\Pi\| = 1$, where $\|\Pi\| := \sup\{\|\Pi f\| : \|f\| \le 1\}$ and $\|\cdot\|$ is induced from $\langle\cdot,\cdot\rangle$.

Proof. By definition, $\langle u - \Pi u, \Pi u\rangle = 0$ for arbitrary $u \in U$. This implies $\|\Pi u\|^2 = \langle \Pi u, \Pi u\rangle = \langle u, \Pi u\rangle \le \|u\|\, \|\Pi u\|$, and hence
$$\|\Pi u\| \le \|u\|. \tag{2.13}$$
But $\Pi$ is a linear projection, so $\Pi v = v$ and therefore $\|\Pi v\| = \|v\|$ for all $v \in V$; thus
$$\|\Pi\| \ge 1. \tag{2.14}$$
From (2.13) and (2.14) we conclude $\|\Pi\| = 1$. □
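Theorem 2.5.3 can be illustrated directly in the finite-dimensional Euclidean setting (our own sketch; the names `B`, `P` are illustrative): the orthogonal projector onto a subspace satisfies (2.13) and (2.14), so its operator norm is exactly 1.

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((10, 3))            # V = column space of B, a subspace of R^10
P = B @ np.linalg.solve(B.T @ B, B.T)       # orthogonal projector onto V

u = rng.standard_normal(10)
v = B @ np.array([1.0, -2.0, 0.5])          # an element of V

print(np.allclose(P @ P, P))                # P is a projection: P^2 = P
print(np.allclose(P @ v, v))                # P fixes V, so ||P|| >= 1      (2.14)
print(np.linalg.norm(P @ u) <= np.linalg.norm(u) + 1e-12)   # contraction  (2.13)
print(np.linalg.norm(P, 2))                 # spectral norm: exactly 1
```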

2.6 Stone-Weierstrass theorem

One of the most important theorems in approximation theory is the Stone-Weierstrass theorem. The classical version was originally stated by Weierstrass as follows:

Theorem 2.6.1 Let $f \in C[a,b]$. Given an $\epsilon > 0$, we can find a polynomial $p_n(x)$ of sufficiently high degree for which
$$\sup_{x \in [a,b]} |f(x) - p_n(x)| \le \epsilon.$$

Proof. A proof which makes extensive use of Bernstein polynomials can be found in [11], pages 107-122. □

The Weierstrass theorem has been generalised in many different directions. Stone's generalisation, stated below, can be found in [40].

Definition 5 Let $A$ be a family of functions on a set $E$. Then $A$ is said to separate points on $E$ if to every pair of distinct points $x_1, x_2 \in E$ there corresponds a function $f \in A$ such that $f(x_1) \ne f(x_2)$. If to each $x \in E$ there corresponds a function $g \in A$ such that $g(x) \ne 0$, we say that $A$ vanishes at no point of $E$.

Theorem 2.6.2 Let $A$ be an algebra of real continuous functions on a compact set $K$. If $A$ separates points on $K$ and vanishes at no point of $K$, then the uniform closure $B$ of $A$ consists of all real continuous functions on $K$.
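The Bernstein proof of Theorem 2.6.1 is constructive, and the convergence is easy to observe numerically (our own illustration; the function `bernstein` and the test function are our choices, not from the thesis):

```python
import numpy as np
from math import comb

def bernstein(f, n, x):
    """Evaluate the degree-n Bernstein polynomial of f on [0, 1] at x."""
    k = np.arange(n + 1)
    binom = np.array([comb(n, j) for j in k], dtype=float)
    x = np.atleast_1d(x)[:, None]
    return (f(k / n) * binom * x**k * (1.0 - x)**(n - k)).sum(axis=1)

f = lambda t: np.abs(t - 0.5)              # continuous but not differentiable
xs = np.linspace(0.0, 1.0, 1001)
errors = [np.max(np.abs(bernstein(f, n, xs) - f(xs))) for n in (10, 100, 1000)]
print(errors)                               # the sup-norm error decreases with n
```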

Chapter 3

Polynomial interpolation on spheres

3.1 Introduction

In this chapter, interpolation is considered as a linear projection from the space $C(S^{r-1})$ to a finite-dimensional space $\mathbf{P} \subset P^r(S^{r-1})$. The concept of a fundamental system and its properties are discussed in Section 2. Section 3 introduces Lagrangian interpolation, together with a variety of connections between the Lagrangians and the reproducing kernel $G$ on the sphere $S^{r-1}$. Section 4 gives various estimates of the interpolation operator in the $C(S^{r-1})$-to-$C(S^{r-1})$ and $C(S^{r-1})$-to-$L_2(S^{r-1})$ settings. The bizonal property of the reproducing kernel $G$ is the key tool for reducing the multivariate functions to univariate ones. Section 5 discusses the concept of an extremal fundamental system, which can be considered as a best configuration of interpolation points.

3.2 Fundamental systems

The interpolation problem starts with a set of points $T := \{x_1, \ldots, x_N\}$ on the sphere $S^{r-1}$ and an unknown function $f \in C(S^{r-1})$ for which $f|_T$ is given. One important characterisation of the set of points $T$ is captured by the following concept.

Definition 6 (Fundamental system) Let $N$ be the dimension of $\mathbf{P}$. Then the set $\{t_1, \ldots, t_N\}$ is called a fundamental system if the evaluation functionals $f \mapsto f(t_j)$, for $j = 1, \ldots, N$ and $f \in \mathbf{P}$, form a linearly independent set, and hence a basis for the dual space $\mathbf{P}^*$.

The definition of a fundamental system suggests the following propositions.

Proposition 3.2.1 Let $\{p_1, \ldots, p_N\}$ be a basis for $\mathbf{P}$. The set $\{t_1, \ldots, t_N\}$ is a fundamental system if and only if $\det P \ne 0$, where $P_{ij} = p_i(t_j)$.

Proof. From the definition, $t_1, \ldots, t_N$ form a fundamental system for $\mathbf{P}$ if and only if
$$\sum_{j=1}^N a_j\, p(t_j) = 0 \quad \forall p \in \mathbf{P} \implies a_1 = \cdots = a_N = 0.$$
Since $\{p_1, \ldots, p_N\}$ is a basis for $\mathbf{P}$, the above statement is equivalent to
$$\sum_{j=1}^N a_j\, p_k(t_j) = 0 \quad \text{for } k = 1, \ldots, N \implies a_1 = \cdots = a_N = 0.$$
Therefore $\det P \ne 0$ if and only if $\{t_1, \ldots, t_N\}$ is a fundamental system. □

For $r = 1$, if $\mathbf{P}$ is $P_n^1([-1,1])$ and $N = n+1$, then arbitrary points $t_1, \ldots, t_N$ constitute a fundamental system, provided that they are pairwise distinct; this follows from the well-known Vandermonde determinant argument. However, the situation changes entirely in the case $r \ge 2$. There are several ways to give a counterexample to the above fact; one was recently given by Sloan (unpublished) in the following proposition:

Proposition 3.2.2 Assume $r \ge 2$ and $\{t_1, \ldots, t_N\}$ is a fundamental system on $S^{r-1}$. Let $F(t) = \{f_1(t), \ldots, f_N(t)\}$, $0 \le t \le 1$, be a continuous transformation of the points such that
$$F(0) = \{t_1, \ldots, t_N\}, \qquad F(1) = \{t_{\pi_1}, \ldots, t_{\pi_N}\},$$
where $\pi_1, \ldots, \pi_N$ is an odd permutation of $1, \ldots, N$. Then there exists $t_0 \in (0,1)$ such that $\det P = 0$.

Proof. Consider the matrix $P$ along the transformation. The net effect of the transformation is to permute the columns of $P$. Since $\det P$ is continuous in $t$ and the permutation is odd, $\det P$ changes sign as $t$ varies from 0 to 1. By the intermediate value theorem, there exists $t_0$ such that $\det P|_{t=t_0} = 0$. □
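Proposition 3.2.2 can be observed numerically on $S^2$ with $\mathbf{P} = \mathrm{span}\{1, x, y, z\}$ (our own sketch; the point set and the swapping path are our choices): a half-turn about the bisector of two points swaps them, which is an odd permutation, so $\det P$ changes sign and must vanish along the way.

```python
import numpy as np

def rotation(axis, angle):
    """Rodrigues rotation matrix about a unit vector `axis`."""
    ax, ay, az = axis
    K = np.array([[0.0, -az, ay], [az, 0.0, -ax], [-ay, ax, 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

p1, p2, p3 = np.eye(3)                        # three coordinate points on S^2
p4 = np.array([-1.0, -1.0, -1.0]) / np.sqrt(3.0)
m = (p1 + p2) / np.linalg.norm(p1 + p2)       # half-turn about m swaps p1 and p2

def detP(s):
    R = rotation(m, np.pi * s)
    pts = np.array([R @ p1, R @ p2, p3, p4])
    return np.linalg.det(np.vstack([np.ones(4), pts.T]))  # basis 1, x, y, z

d = np.array([detP(s) for s in np.linspace(0.0, 1.0, 201)])
print(d[0], d[-1])                            # equal magnitude, opposite signs
print(bool(np.any(d[:-1] * d[1:] <= 0.0)))    # det P passes through zero
```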

3.3 Multivariate Lagrangian interpolation

3.3.1 General setting of Lagrangian interpolation

Lagrangian interpolation is one of the most important methods in constructive approximation theory. The domain of the problem varies from hyperplanes to other general regions; here we are only interested in Lagrangian interpolation on the unit sphere $S^{r-1}$. Let $\mathbf{P} \subset P^r(S^{r-1})$ be an $N$-dimensional polynomial space, let $p_1, \ldots, p_N$ form a basis for $\mathbf{P}$, and let $T := \{t_1, \ldots, t_N\} \subset S^{r-1}$ be a fundamental system for $\mathbf{P}$. As before, we define the matrix $P$ as
$$P(t_1, \ldots, t_N) := \begin{pmatrix} p_1(t_1) & \cdots & p_1(t_N) \\ \vdots & \ddots & \vdots \\ p_N(t_1) & \cdots & p_N(t_N) \end{pmatrix}.$$
Let $l_i$, for $i = 1, \ldots, N$, be the Lagrangian polynomials on the fundamental system $T$. The representation of the Lagrangians over $T$ is uniquely given by Cramer's rule as
$$l_i(x) = \frac{\det P(t_1, \ldots, t_{i-1}, x, t_{i+1}, \ldots, t_N)}{\det P(t_1, \ldots, t_{i-1}, t_i, t_{i+1}, \ldots, t_N)}, \qquad i = 1, \ldots, N.$$
By the alternating property of determinants, $l_i(t_j) = \delta_{ij}$ for $i, j = 1, \ldots, N$, so $l_1, \ldots, l_N$ form a basis for $\mathbf{P}$. Now let $f \in C(S^{r-1})$ be an unknown continuous function whose values $f(t_1), \ldots, f(t_N)$ are given. The Lagrangian interpolant of $f$, which we call $\Lambda f \in \mathbf{P}$, is the polynomial that satisfies $\Lambda f(t_j) = f(t_j)$ for $j = 1, \ldots, N$. It is uniquely defined as follows.

Definition 7 Given a fundamental system $T = \{t_1, t_2, \ldots, t_N\} \subset S^{r-1}$ and the Lagrangians $l_i$ defined over $T$, the Lagrangian interpolant for $f \in C(S^{r-1})$ is
$$\Lambda f := \sum_{i=1}^N f(t_i)\, l_i. \tag{3.1}$$
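A concrete instance of Definition 7 (our own sketch; the four points and the basis $\{1, x, y, z\}$ are illustrative choices): solving linear systems with the matrix $P$ is numerically equivalent to the Cramer-rule formula for the Lagrangians, and the interpolant reproduces any $f \in \mathbf{P}$ exactly.

```python
import numpy as np

# A fundamental system of N = 4 points on S^2 for P = span{1, x, y, z}
pts = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0],
                [-1.0, -1.0, -1.0]])
pts[3] /= np.linalg.norm(pts[3])

def basis(x):
    return np.concatenate(([1.0], x))          # (1, x, y, z) at a point

P = np.column_stack([basis(t) for t in pts])   # P[i, j] = p_i(t_j)

def lagrangians(x):
    """Vector (l_1(x), ..., l_4(x)); equivalent to the Cramer-rule formula."""
    return np.linalg.solve(P, basis(x))

def interpolant(f, x):                          # definition (3.1)
    return sum(f(t) * l for t, l in zip(pts, lagrangians(x)))

print(np.allclose([lagrangians(t) for t in pts], np.eye(4)))  # l_i(t_j) = delta_ij

f = lambda x: 0.3 + 2.0 * x[0] - x[2]           # f already lies in P
x = np.array([0.6, 0.48, 0.64])                 # a unit vector
print(interpolant(f, x), f(x))                  # identical: interpolation is exact on P
```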

3.3.2 Relation between Lagrangians and the reproducing kernel

To study multivariate Lagrangian interpolation on the sphere $S^{r-1}$, Reimer [30] suggested a method that makes use of the theory of reproducing kernel Hilbert spaces and the Addition Theorem of spherical harmonics. The restriction is that $\mathbf{P}$ must be a rotation-invariant subspace of $P^r(S^{r-1})$, so that the Addition Theorem can be applied to reduce the multivariate problem to a univariate one. The arguments are presented as follows. The representation of the reproducing kernel $G(\cdot,\cdot)$ of $\mathbf{P}$ with respect to an arbitrary inner product $\langle\cdot,\cdot\rangle$ is described in the following lemma:

Lemma 3.3.1 The reproducing kernel of $\mathbf{P}$ is given by
$$G(x,y) := \sum_{j=1}^N p_j(x)\, p_j(y),$$
where $\{p_1, \ldots, p_N\}$ is an arbitrary orthonormal basis of $\mathbf{P}$ with respect to the inner product $\langle\cdot,\cdot\rangle$.

Proof. It is enough to check the reproducing property of $G$:
$$\langle G(x,\cdot), f\rangle = \Big\langle \sum_{j=1}^N p_j(x)\, p_j(\cdot),\, f\Big\rangle = \sum_{j=1}^N \langle p_j, f\rangle\, p_j(x) = f(x).$$
The remaining properties are obvious. □

The reproducing kernel $G$ of $\mathbf{P}$ is unique by Property 2.4.1. When the inner product is the usual inner product in $L_2(S^{r-1})$, i.e.
$$\langle f, g\rangle = \int_{S^{r-1}} f(x)\, g(x)\, dS, \qquad f, g \in \mathbf{P},$$
the spherical harmonics can be chosen as the basis for $\mathbf{P}$. By the Addition Theorem, we can then represent $G(x,y)$ by a univariate function, as stated in the following lemma:

Lemma 3.3.2 If $\mathbf{P}$ is a nonempty rotation-invariant subspace of $P^r(S^{r-1})$ and $G$ is the reproducing kernel of the space, then
$$G(x,y) = \frac{1}{\omega_{r-1}}\, g(x \cdot y),$$
where $\omega_{r-1}$ is the surface area of $S^{r-1}$, the function $g : [-1,1] \to \mathbb{R}$ is a sum of Legendre functions of various degrees with leading constants, and $x \cdot y$ is the usual dot product in $\mathbb{R}^r$.

Proof. Since $\mathbf{P}$ is a nonempty rotation-invariant subspace of $P^r(S^{r-1})$, it is a direct sum of some homogeneous harmonic polynomial spaces $H_\ell^{r*}(S^{r-1})$, where $\ell$ is the degree of the polynomials. We then apply the Addition Theorem to each component $H_\ell^{r*}(S^{r-1})$ of $\mathbf{P}$ to get the result. □

A useful fact that can be deduced from the above lemma is
$$G(x,x) = \frac{1}{\omega_{r-1}}\, g(1) = \frac{N}{\omega_{r-1}}. \tag{3.2}$$
This follows from the fact that $P_\ell^{(r)}(1) = 1$ for the Legendre functions of every degree $\ell$. Since $\{t_1, \ldots, t_N\}$ is assumed to be a fundamental system, the functions $\{G(t_1,\cdot), \ldots, G(t_N,\cdot)\}$ form a basis for $\mathbf{P}$, by the following lemma:

Lemma 3.3.3 Let $T := \{t_1, \ldots, t_N\}$ be an arbitrary set of $N$ points on $S^{r-1}$ and $G$ the reproducing kernel of $\mathbf{P}$. The family of polynomials $\{G(t_i,\cdot) : i = 1, \ldots, N\} \subset \mathbf{P}$ is a basis for $\mathbf{P}$ if and only if $T$ is a fundamental system.

Proof. As pointed out in Lemma 3.3.1, we can write
$$G(t_k,\cdot) = \sum_{j=1}^N p_j(t_k)\, p_j(\cdot), \qquad k = 1, \ldots, N.$$
The matrix $(p_j(t_k))$ is non-singular if and only if $T$ is a fundamental system. Therefore $\mathrm{span}\{G(t_k,\cdot) : k = 1, \ldots, N\} = \mathbf{P}$ holds if and only if $T$ is a fundamental system. □

We are now in a position to state some important theorems about the connection between the Lagrangians $l_j$ and the reproducing kernel $G$:

Theorem 3.3.1 In the polynomial space $\mathbf{P}$, let the matrix $\mathbf{L}$ be defined by $\mathbf{L}_{ij} = \langle l_i, l_j\rangle$ and the matrix $\mathbf{G}$ by $\mathbf{G}_{ij} = G(t_i, t_j)$. Furthermore, suppose that $\{t_1, \ldots, t_N\}$ form a fundamental system. Then we have $\mathbf{L} = \mathbf{G}^{-1}$.

Proof. By Lemma 3.3.3, the $G(t_j,\cdot)$, $j = 1, \ldots, N$, form a basis for $\mathbf{P}$; hence, by the definition of the Lagrangians, we can express $G(t_k,\cdot)$ as
$$G(t_k,\cdot) = \sum_{j=1}^N G(t_k, t_j)\, l_j(\cdot).$$
Meanwhile, the reproducing property of $G$ gives
$$l_k(x) = \langle l_k, G(\cdot,x)\rangle = \Big\langle l_k, \sum_{j=1}^N G(t_j, x)\, l_j\Big\rangle = \sum_{j=1}^N \langle l_k, l_j\rangle\, G(t_j, x)$$
for an arbitrary point $x \in S^{r-1}$. Now, putting $x = t_i$ for $i = 1, \ldots, N$, we get
$$\sum_{j=1}^N \langle l_k, l_j\rangle\, G(t_j, t_i) = l_k(t_i) = \delta_{ki},$$
proving the theorem. □
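Theorem 3.3.1 can be verified numerically in the circle case $r = 2$ (our own sketch; the node set and the grid quadrature are illustrative): for $\mathbf{P} = \mathrm{span}\{1, \cos, \sin\}$ with the $L_2$ inner product, the Gram matrix of the Lagrangians is the inverse of the kernel matrix.

```python
import numpy as np

def G(s, t):
    """Reproducing kernel of span{1, cos, sin} in L2[0, 2*pi]:
    built from the orthonormal basis 1/sqrt(2 pi), cos/sqrt(pi), sin/sqrt(pi)."""
    return 1.0 / (2.0 * np.pi) + np.cos(s - t) / np.pi

T = np.array([0.0, 1.0, 2.5])                       # a fundamental system on S^1
Gmat = G(T[:, None], T[None, :])

def basis(t):
    return np.array([np.ones_like(t), np.cos(t), np.sin(t)])

theta = np.linspace(0.0, 2.0 * np.pi, 20000, endpoint=False)
lvals = np.linalg.solve(basis(T), basis(theta))      # row i holds l_i on the grid
L = (lvals @ lvals.T) * (2.0 * np.pi / theta.size)   # L_ij = integral of l_i l_j

print(np.max(np.abs(L - np.linalg.inv(Gmat))))       # essentially zero: L = G^{-1}
```

The equally spaced Riemann sum is exact here because each product $l_i l_j$ is a trigonometric polynomial of low degree.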

The relation between the Lagrangian square sum and the eigenvalues of $\mathbf{G}$ is given by the following theorem, which is adapted from [31]:

Theorem 3.3.2 (Reimer) Let $T = \{t_1, \ldots, t_N\}$ be an arbitrary system on $S^{r-1}$ and $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_N$ the eigenvalues of the matrix $\mathbf{G}$ defined on $T$. Let $l_j$, $j = 1, \ldots, N$, be the Lagrangians defined on $T$. Furthermore, assume that $\mathbf{G}$ is positive definite. Then $T$ is a fundamental system and, for every $x \in S^{r-1}$,
$$\frac{G(x,x)}{\lambda_N} \le \sum_{j=1}^N l_j^2(x) \le \frac{G(x,x)}{\lambda_1}.$$

Proof. Since $\mathbf{G} = P^T P$ and $\mathbf{G}$ is positive definite, $T$ is a fundamental system. The matrix $\mathbf{G}$ is symmetric by the symmetry of the reproducing kernel $G$; therefore there exists an orthogonal matrix $A$ such that $A\mathbf{G}A^T = M = \mathrm{diag}(\lambda_1, \ldots, \lambda_N)$. Using the definition of the interpolant (3.1) twice, we get
$$G(x,y) = \sum_{j=1}^N l_j(x)\, G(t_j, y) = \sum_{j=1}^N \sum_{k=1}^N l_j(x)\, G(t_j, t_k)\, l_k(y). \tag{3.3}$$
Define $l := (l_1, \ldots, l_N)^T$ and $q = (q_1, \ldots, q_N)^T := A l$; then
$$\sum_{j=1}^N q_j^2(x) = (A l(x))^T A l(x) = l(x)^T l(x) = \sum_{j=1}^N l_j^2(x). \tag{3.4}$$

Equation (3.3) now becomes
$$G(x,y) = l^T(x)\, \mathbf{G}\, l(y) = l^T(x)\, A^T M A\, l(y) = q^T(x)\, M\, q(y) = \sum_{j=1}^N \lambda_j\, q_j(x)\, q_j(y).$$
Finally, with the help of (3.4), we have
$$\lambda_1 \sum_{j=1}^N l_j^2(x) \le G(x,x) \le \lambda_N \sum_{j=1}^N l_j^2(x),$$
where $\lambda_1$ is positive since $\mathbf{G}$ is positive definite. From this, the statement of the theorem follows immediately. □

We should remark that when $\mathbf{P}$ is a rotation-invariant subspace of $C(S^{r-1})$, then $G(x,x) = g(1)/\omega_{r-1}$ for some univariate function $g : [-1,1] \to \mathbb{R}$, since $G$ is bizonal. Since the trace of $\mathbf{G}$ is invariant under a similarity transformation,
$$\sum_{j=1}^N \lambda_j = \frac{N\, g(1)}{\omega_{r-1}}; \tag{3.5}$$
in other words, $\lambda_{av} = g(1)/\omega_{r-1} = G(x,x)$, where $\lambda_{av}$ is the average of all eigenvalues of the matrix $\mathbf{G}$. In the special case $\lambda_1 = \cdots = \lambda_N$, (3.5) implies $\lambda_1 = \lambda_N = G(x,x)$ and, by Theorem 3.3.2, the statement
$$\sum_{j=1}^N l_j^2(x) = 1, \qquad x \in S^{r-1}, \tag{3.6}$$
will follow.

3.3.3 Relation between spherical $t$-designs and Lagrangians

An interesting fact is that if $\mathbf{P}$ is the polynomial space $P_n^r(S^{r-1})$ with $r \ge 3$, $n \ge 3$, then equation (3.6) does not hold. This fact was first recognized by Bos [6], and is a consequence of the following theorem:

Theorem 3.3.3 Let $l_j(x)$, $j = 1, \ldots, N$, be the Lagrangians over a fundamental system $T$ on the sphere $S^{r-1}$. Then
$$\max_{x \in S^{r-1}} \sum_{j=1}^N l_j^2(x) = 1$$
if and only if $T$ is a tight spherical $2n$-design.

But Theorem 2.3.1 tells us that, for $r \ge 3$, $n \ge 3$, there exists no tight spherical design; hence equation (3.6) does not hold. Theorem 3.3.3 is proved through an important theorem about probability measures on $S^{r-1}$. Let $\mu$ be an arbitrary probability measure on $S^{r-1}$, and let $\mathbf{M}(\mu)$ denote the $N \times N$ Gram matrix
$$\mathbf{M}(\mu) := \Big( \int_{S^{r-1}} p_i(x)\, p_j(x)\, d\mu \Big)_{i,j=1,\ldots,N},$$
where $\{p_1, \ldots, p_N\}$ is an arbitrary basis of $\mathbf{P}$. Then we define $p = (p_1, \ldots, p_N)^T$ and, for $\det \mathbf{M}(\mu) \ne 0$,
$$d(x;\mu) := p^T(x)\, \mathbf{M}^{-1}(\mu)\, p(x), \qquad x \in S^{r-1}.$$

The latter quantity is invariant under a change of basis: for if $q = (q_1, \ldots, q_N)^T$ is another basis such that $q = Ap$, where $A$ is a non-singular $N \times N$ matrix, then the new Gram matrix is $\mathbf{M}'(\mu) = A\mathbf{M}(\mu)A^T$ and $d(x;\mu) = d'(x;\mu)$.

Theorem 3.3.4 (Bos) Suppose that $\mu^*$ is a probability measure on $S^{r-1}$. Then $\mu^*$ maximizes $\det \mathbf{M}(\mu)$ if and only if $\max_{x \in S^{r-1}} d(x;\mu^*) = N$. Moreover, the optimal matrix is unique.

Proof. By a lemma in [22, page 323], the family of matrices $\mathbf{M}(\mu)$ is a compact, convex set. For $t \in [0,1]$ and two arbitrary probability measures $\mu_1$ and $\mu_2$ on $S^{r-1}$, $(1-t)\mu_1 + t\mu_2$ is also a probability measure, and $\mathbf{M}((1-t)\mu_1 + t\mu_2) = (1-t)\mathbf{M}(\mu_1) + t\mathbf{M}(\mu_2)$. Since these matrices are Gram matrices, they are symmetric and positive semi-definite, so we can find [20, page 314] a non-singular matrix $A$ such that $A^T\mathbf{M}(\mu_1)A = \mathrm{diag}(a_1, \ldots, a_N)$ and $A^T\mathbf{M}(\mu_2)A = \mathrm{diag}(b_1, \ldots, b_N)$. Thus
$$\det \mathbf{M}((1-t)\mu_1 + t\mu_2) = (\det A)^{-2}\, \det \mathrm{diag}((1-t)a_i + t b_i) \ge (\det \mathbf{M}(\mu_1))^{1-t}\, (\det \mathbf{M}(\mu_2))^t, \tag{3.7}$$
with equality iff $a_i = b_i$, $1 \le i \le N$, i.e., $\mathbf{M}(\mu_1) = \mathbf{M}(\mu_2)$. Hence, if $\mu_1$ and $\mu_2$ both maximize $\det \mathbf{M}(\mu)$, then $\mathbf{M}(\mu_1) = \mathbf{M}(\mu_2)$. Now, we can compute from (3.7) that
$$\frac{\partial^2}{\partial t^2} \log \det \mathbf{M}((1-t)\mu_1 + t\mu_2) = -\sum_{i=1}^N \frac{(b_i - a_i)^2}{((1-t)a_i + t b_i)^2} \le 0.$$
Therefore, $\mu_1$ maximizes $\det \mathbf{M}(\mu)$ iff
$$\frac{\partial}{\partial t} \log \det \mathbf{M}((1-t)\mu_1 + t\mu_2)\Big|_{t=0} \le 0$$
for all probability measures $\mu_2$. But
$$\frac{\partial}{\partial t} \log \det \mathbf{M}((1-t)\mu_1 + t\mu_2) = \mathrm{trace}\Big[\mathbf{M}^{-1}((1-t)\mu_1 + t\mu_2)\, \frac{\partial}{\partial t}\mathbf{M}((1-t)\mu_1 + t\mu_2)\Big].$$
Thus, $\mu_1$ is optimal iff
$$\mathrm{trace}\big[\mathbf{M}^{-1}(\mu_1)(\mathbf{M}(\mu_2) - \mathbf{M}(\mu_1))\big] = \mathrm{trace}\big[\mathbf{M}^{-1}(\mu_1)\mathbf{M}(\mu_2)\big] - N \le 0.$$
We can compute from the definition of the matrix $\mathbf{M}$ that
$$\mathrm{trace}\big[\mathbf{M}^{-1}(\mu_1)\mathbf{M}(\mu_2)\big] = \int_{S^{r-1}} p^T(x)\, \mathbf{M}^{-1}(\mu_1)\, p(x)\, d\mu_2,$$
and so $\mu^*$ maximizes $\det \mathbf{M}(\mu)$ iff
$$\int_{S^{r-1}} d(x;\mu^*)\, d\mu \le N$$
for all probability measures $\mu$. Taking $\mu$ to be the point measure with $\mu(\{x\}) = 1$, we see that if $\mu^*$ maximizes $\det \mathbf{M}(\mu)$ then $d(x;\mu^*) \le N$ for every $x$. But since
$$\int_{S^{r-1}} d(x;\mu)\, d\mu = \mathrm{trace}\big[\mathbf{M}^{-1}(\mu)\mathbf{M}(\mu)\big] = N,$$
we have $N \le \max_{x \in S^{r-1}} d(x;\mu^*)$, and thus we conclude $\max_{x \in S^{r-1}} d(x;\mu^*) = N$. For the converse, if $\max_{x \in S^{r-1}} d(x;\mu^*) = N$, then for any other probability measure $\mu$,
$$\int_{S^{r-1}} d(x;\mu^*)\, d\mu \le N,$$

and so $\mu^*$ maximizes $\det \mathbf{M}(\mu)$. □

Corollary 3.3.1 The normalized surface measure
$$d\mu_S := \frac{dS}{\omega_{r-1}},$$
where $dS$ is the surface measure on $S^{r-1}$, maximizes $\det \mathbf{M}(\mu)$.

Proof. Theorem 3.3.4 holds for any choice of basis, so let us choose the basis $\{p_1, \ldots, p_N\}$ for $P_n^r(S^{r-1})$ to be orthonormal with respect to the surface measure $dS$. With respect to this basis, the Gram matrix becomes
$$\mathbf{M}(\mu_S) = \frac{1}{\omega_{r-1}} \Big( \int_{S^{r-1}} p_i(x)\, p_j(x)\, dS \Big)_{i,j=1,\ldots,N} = \frac{1}{\omega_{r-1}}\, I.$$
Now,
$$d(x;\mu_S) = \omega_{r-1} \sum_{j=1}^N p_j(x)\, p_j(x) = \omega_{r-1}\, G(x,x) = N.$$
The last equality follows from the uniqueness of the reproducing kernel $G$ and the Addition Theorem on $S^{r-1}$. The corollary then follows from Theorem 3.3.4. □

Corollary 3.3.2 The discrete probability measure $\mu_T$ corresponding to an equal-weight quadrature rule for the set $T = \{t_1, \ldots, t_N\}$, i.e.
$$\int_{S^{r-1}} f(x)\, d\mu_T := \frac{1}{N} \sum_{j=1}^N f(t_j),$$
maximizes $\det \mathbf{M}(\mu)$ if and only if
$$\max_{x \in S^{r-1}} \sum_{j=1}^N l_j^2(x) = 1.$$

Proof. Again, Theorem 3.3.4 holds for any choice of basis, so this time we take the Lagrangians $\{l_1, \ldots, l_N\}$ as the basis. The Gram matrix now becomes
$$\mathbf{M}(\mu_T) = \Big( \int_{S^{r-1}} l_i(x)\, l_j(x)\, d\mu_T \Big)_{i,j=1,\ldots,N} = \Big( \frac{1}{N} \sum_{k=1}^N l_i(t_k)\, l_j(t_k) \Big)_{i,j=1,\ldots,N} = \frac{1}{N}\, I.$$
Now,
$$d(x;\mu_T) = N \sum_{i,j=1}^N l_i(x)\, \delta_{ij}\, l_j(x) = N \sum_{j=1}^N l_j^2(x).$$

By applying Theorem 3.3.4 we obtain the statement of the corollary. □

Thus, with respect to the Lagrangian basis, $\mathbf{M}(\mu_T) = \mathbf{M}(\mu_S)$ if and only if
$$\max_{x \in S^{r-1}} \sum_{j=1}^N l_j^2(x) = 1. \tag{3.8}$$
Now the proof of Theorem 3.3.3 follows, for the statement (3.8) means
$$\frac{1}{N} \sum_{k=1}^N l_i(t_k)\, l_j(t_k) = \frac{1}{\omega_{r-1}} \int_{S^{r-1}} l_i(x)\, l_j(x)\, dS$$
for all $i, j = 1, \ldots, N$; and since the products $l_i l_j$ span $P_{2n}^r(S^{r-1})$, this is equivalent to
$$\frac{1}{N} \sum_{k=1}^N p(t_k) = \frac{1}{\omega_{r-1}} \int_{S^{r-1}} p(x)\, dS, \qquad \forall p \in P_{2n}^r(S^{r-1}),$$
i.e., $T$ is a tight spherical $2n$-design.

We realize that there is a connection between the Lagrangians and spherical designs. But the Lagrangians and the reproducing kernel are also connected to each other; hence there should be some other connections between the matrix $\mathbf{G}$ and spherical designs. The following result states such a connection between the eigenvalues of the matrix $\mathbf{G}$ and tight spherical $t$-designs.

Theorem 3.3.5 (Sloan) Let $\lambda_1, \ldots, \lambda_N$ be the eigenvalues of the matrix $\mathbf{G}$ as defined in Theorem 3.3.1 above, and let $T = \{t_1, \ldots, t_N\}$ be a fundamental system for the space $P_n^r(S^{r-1})$. Then $\lambda_1 = \cdots = \lambda_N$ if and only if $T$ is a tight spherical $2n$-design.

Proof. Because $\mathbf{G}$ is symmetric, we can find an orthogonal matrix $A$ (i.e. $AA^T = I$) such that $\mathbf{G} = AMA^T$, where $M = \mathrm{diag}(\lambda_1, \ldots, \lambda_N)$. Suppose now that $\lambda_1 = \cdots = \lambda_N = \lambda$. Then $M = \lambda I$ and $\mathbf{G} = \lambda AA^T = \lambda I$. From the definition of the matrix $\mathbf{G}$,
$$\mathbf{G}_{ij} = \sum_{k=1}^N p_k(t_i)\, p_k(t_j),$$
where $\{p_j\}_{j=1}^N$ is an orthonormal basis for $P_n^r(S^{r-1})$, so we have the equation
$$\lambda\, \delta_{ij} = \sum_{k=1}^N p_k(t_i)\, p_k(t_j),$$
i.e. $\frac{1}{\sqrt\lambda}P^T \frac{1}{\sqrt\lambda}P = I$, where $P_{ij} = p_i(t_j)$. Since $T$ is a fundamental system, the square matrix $P$ is invertible, so we must also have $\frac{1}{\sqrt\lambda}P \frac{1}{\sqrt\lambda}P^T = I$, that is,
$$\frac{1}{\lambda} \sum_{k=1}^N p_i(t_k)\, p_j(t_k) = \delta_{ij}. \tag{3.9}$$
By equation (3.5) and the remark in (3.2),
$$\lambda = \frac{1}{N} \sum_{j=1}^N \lambda_j = \frac{N}{\omega_{r-1}},$$
so equation (3.9) is equivalent to
$$\frac{\omega_{r-1}}{N} \sum_{k=1}^N p_i(t_k)\, p_j(t_k) = \int_{S^{r-1}} p_i(x)\, p_j(x)\, dS.$$
Since all the products $p_i p_j$, for $i, j = 1, \ldots, N$, span $P_{2n}^r(S^{r-1})$, the above equation implies
$$\frac{\omega_{r-1}}{N} \sum_{k=1}^N p(t_k) = \int_{S^{r-1}} p(x)\, dS, \qquad \forall p \in P_{2n}^r(S^{r-1}),$$
i.e. $T$ forms a tight spherical $2n$-design. The above arguments can be traversed in the reverse direction as well. Therefore, the theorem is proved completely. □
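The circle case $r = 2$ gives a small numerical illustration of Theorem 3.3.5 (our own sketch; the degree $n$ is an arbitrary choice): the $N = 2n+1$ equally spaced points form a tight spherical $2n$-design for the trigonometric polynomials of degree $\le n$, and the corresponding matrix $\mathbf{G}$ indeed has all eigenvalues equal to $N/\omega_1 = N/(2\pi)$.

```python
import numpy as np

n = 4
N = 2 * n + 1
T = 2.0 * np.pi * np.arange(N) / N            # equally spaced points on S^1

def G(s, t):
    """Reproducing kernel of the trigonometric polynomials of degree <= n."""
    d = s - t
    return (1.0 + 2.0 * sum(np.cos(k * d) for k in range(1, n + 1))) / (2.0 * np.pi)

Gmat = G(T[:, None], T[None, :])
eigs = np.linalg.eigvalsh(Gmat)
print(eigs)                 # all N eigenvalues coincide
print(N / (2.0 * np.pi))    # their common value, lambda = N / omega_1
```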

3.4 The interpolation operator and its norm

$\Lambda$ can be considered as a linear operator from $C(S^{r-1})$ to the polynomial subspace $\mathbf{P}$. The estimate for the norm of the operator depends on how we define the norm on $\mathbf{P}$. If

the supremum norm is considered, then we have the following theorem, which identifies the interpolation norm with the Lebesgue constant:

Theorem 3.4.1 Let the operator be $\Lambda : C(S^{r-1}) \to \mathbf{P}$, and let the norm of the operator be defined as $\|\Lambda\|_\infty := \sup\{\|\Lambda f\|_\infty : \|f\|_\infty \le 1\}$. Then we have
$$\|\Lambda\|_\infty = \max_{x \in S^{r-1}} \sum_{j=1}^N |l_j(x)|.$$

Proof. Let us define
$$\lambda(x) := \sum_{j=1}^N |l_j(x)|, \qquad x \in S^{r-1}.$$
Then, assuming $\|f\|_\infty \le 1$, we have
$$\Lambda f(x) = \sum_{j=1}^N f(t_j)\, l_j(x) \le \lambda(x), \qquad \forall x \in S^{r-1}. \tag{3.10}$$
Now, since $S^{r-1}$ is compact, $\lambda$ attains its maximum value at some point $\zeta \in S^{r-1}$, and equation (3.10) implies $\|\Lambda\|_\infty \le \lambda(\zeta)$. We then construct a particular function $f^*$ by
$$f^*(x) := \begin{cases} \mathrm{sign}(l_j(\zeta))\,\big(1 - \tfrac{1}{R}|x - t_j|\big) & \text{for } |x - t_j| < R,\ j = 1, \ldots, N, \\ 0 & \text{otherwise}, \end{cases}$$
where $R$ is some chosen real number with $0 < R < \tfrac{1}{2} \min\{|t_i - t_j| : 1 \le i, j \le N,\ i \ne j\}$. Obviously, $\|f^*\|_\infty = 1$. It follows that
$$\|\Lambda\|_\infty \ge \|\Lambda f^*\|_\infty \ge |(\Lambda f^*)(\zeta)| = \sum_{j=1}^N \mathrm{sign}(l_j(\zeta))\, l_j(\zeta) = \lambda(\zeta).$$
Thus we obtain $\|\Lambda\|_\infty = \lambda(\zeta)$ and the theorem is proved. □
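The Lebesgue constant of Theorem 3.4.1 is straightforward to compute in the circle case (our own sketch; the equally spaced node set and the grid resolution are illustrative choices):

```python
import numpy as np

n = 4
N = 2 * n + 1                                  # trig space of degree n
T = 2.0 * np.pi * np.arange(N) / N             # equally spaced nodes on S^1

def basis(t):
    rows = [np.ones_like(t)]
    for k in range(1, n + 1):
        rows += [np.cos(k * t), np.sin(k * t)]
    return np.array(rows)

grid = np.linspace(0.0, 2.0 * np.pi, 4001)
lvals = np.linalg.solve(basis(T), basis(grid))         # Lagrangians on the grid
lebesgue = np.abs(lvals).sum(axis=0).max()             # max_x sum_j |l_j(x)|
print(lebesgue)   # modest here: for these nodes it grows only like log n
```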

The divergence of the interpolation operator as $N \to \infty$ in the supremum norm is proved indirectly through the orthogonal projection operator. Firstly, the orthogonal projection $\Pi$ is shown to be a minimal projection in the supremum norm; here, a linear projection $\Pi$ is said to be minimal if $\|\Pi\| \le \|T\|$ for every linear projection $T$. Then, since interpolation is itself a linear projection, the divergence of the orthogonal projection forces the interpolation operator to diverge. More precisely, we have:

Theorem 3.4.2 Let $r \ge 2$ and let $\mathbf{P}$ denote any finite-dimensional rotation-invariant subspace of $P^r(S^{r-1})$. Then the orthogonal projection $\Pi : C(S^{r-1}) \to \mathbf{P}$ is a minimal linear projection in the supremum norm.

Proof. For $r = 2$, we refer to Berman [5]; for $r \ge 2$, we refer to Daugavet [10]. □

When $\mathbf{P}$ is the polynomial space $P_n^r(S^{r-1})$, the dependence of the supremum norm of the orthogonal projection on $n$ is given by:

Theorem 3.4.3 Let $\Pi(n)$ denote the orthogonal projection from $C(S^{r-1})$ to $P_n^r(S^{r-1})$. Then
$$\|\Pi(n)\|_\infty = \begin{cases} O(\log n) & \text{for } r = 2, \\ O(n^{(r-2)/2}) & \text{for } r \ge 3. \end{cases}$$

Proof. For $r = 2$, we refer to Davis [11], page 358; for $r \ge 3$, the result was shown by Gronwall [21]. □

Now, since $\Lambda$ is a linear projection, by Theorem 3.4.2 we have $\|\Lambda(n)\|_\infty \ge \|\Pi(n)\|_\infty$. Applying the result of Theorem 3.4.3, we obtain the divergence of the interpolation operator as $n \to \infty$. For a fixed value of $n$, while $\|\Pi(n)\|_\infty$ is well understood, little is known about $\|\Lambda(n)\|_\infty$. Obviously, the value of $\|\Lambda(n)\|_\infty$ depends on the choice of the fundamental system, but how the points should be distributed to minimize $\|\Lambda(n)\|_\infty$ is still under extensive research; see [36]. For a fundamental system $T$, a coarse bound for $\|\Lambda(n)\|_\infty$ is given in Corollary 2 of [31], as
$$\|\Lambda(n)\|_\infty \le N^{1/2} \sqrt{\frac{\lambda_{av}}{\lambda_{min}}}, \tag{3.11}$$
where $\lambda_{min} = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_N = \lambda_{max}$ are the eigenvalues of the matrix $\mathbf{G}$ and $\lambda_{av}$ is their average value, as pointed out after Theorem 3.3.2. To see this, we apply the Cauchy-Schwarz inequality to the expression for the uniform norm in Theorem 3.4.1 to get
$$\|\Lambda(n)\|_\infty = \max_{x \in S^{r-1}} \sum_{i=1}^N |l_i(x)| \le N^{1/2} \max_{x \in S^{r-1}} \Big( \sum_{i=1}^N l_i^2(x) \Big)^{1/2}.$$
Using Theorem 3.3.2, in which $G(x,x) = \lambda_{av}$, we arrive at the estimate given in (3.11).

In the above, the interpolation operator was considered from $C(S^{r-1})$ to $C(S^{r-1})$. In another setting of the problem, we can view $\mathbf{P}$ as a subspace of $L_2(S^{r-1})$. The $L_2(S^{r-1})$ norm of the operator is then defined as $\|\Lambda\|_2 := \sup\{\|\Lambda f\|_2 : f \in C(S^{r-1}),\ \|f\|_\infty \le 1\}$, where
$$\|\Lambda f\|_2 = \Big( \int_{S^{r-1}} |\Lambda f(x)|^2\, dS \Big)^{1/2}.$$
The estimation of $\|\Lambda\|_2$ was first studied by Sloan; see [35]. The trivial lower bound for $\|\Lambda\|_2$ is $\|\Lambda\|_2 \ge \sqrt{\omega_{r-1}}$. This can be shown by interpolating the constant function 1 on $S^{r-1}$, i.e.
$$\|\Lambda 1\|_2^2 = \int_{S^{r-1}} 1^2\, dS = \omega_{r-1}.$$
However, the lower bound is not attainable except in some special cases. This is the result of the following theorem:

Theorem 3.4.4 (Sloan) Let $r \ge 3$ and $n \ge 3$, and let the interpolating polynomial space be $P_n^r(S^{r-1})$. Then $\|\Lambda\|_2 > \sqrt{\omega_{r-1}}$, where $\omega_{r-1}$ is the surface area of $S^{r-1}$.

Before the proof is given, we recall the following well-known result (the inequality between the arithmetic and harmonic means).

Lemma 3.4.1 Let $a_1, \ldots, a_N$ be $N$ positive real numbers. Then
$$\frac{a_1 + \cdots + a_N}{N} \ge \frac{N}{a_1^{-1} + \cdots + a_N^{-1}},$$
with equality if and only if $a_1 = a_2 = \cdots = a_N$.

Proof (of Theorem 3.4.4). To prove the theorem, it is enough to prove the existence of $f^* \in C(S^{r-1})$ such that $\|f^*\|_\infty = 1$ and $\|\Lambda f^*\|_2 > \sqrt{\omega_{r-1}}$. Using definition (3.1), with $N$ the dimension of $P_n^r(S^{r-1})$ and $T = \{t_1, \ldots, t_N\}$ a fundamental system for $P_n^r(S^{r-1})$, we have
$$\|\Lambda f\|_2^2 = \langle \Lambda f, \Lambda f\rangle = \sum_{i=1}^N \sum_{j=1}^N f(t_i)\, \mathbf{L}_{ij}\, f(t_j), \tag{3.12}$$
where $\mathbf{L}_{ij} = \langle l_i, l_j\rangle$ as in Theorem 3.3.1, with the particular inner product $\langle l_i, l_j\rangle = \int_{S^{r-1}} l_i(x)\, l_j(x)\, dS$.

41

3.4. THE INTERPOLATION OPERATOR AND ITS NORM

Since |f (tj )| ≤ 1 for all j = 1, . . . , N, we may view equation (3.12) as a positive quadratic form defined over the unit cube I N ⊂ RN : kΛf k22 =

N X N X

ξi Lij ξj ,

i=1 j=1

where f (tj ) has been replaced by ξj . This quadratic form must take its maximum value at one of the vertices of the unit cube; that is when f takes one or other extreme values ±1 at each of the interpolation points tj . Let us define the set of 2N such functions fk (x) = (k1 , . . . , kN )T , where vector k = (k1 , . . . , kN )T which each components kj , for j = 1, . . . , N takes the values ±1 independently. Now, we average the quadratic form over 2N functions as average k

kΛfk k22

= average k

N X N X

ki Lij kj .

i=1 j=1

We observe that, the off-diagonal elements of the matrix L are averaged to zero. Thus, average k

kΛfk k22

=

N X

Ljj .

j=1

Since there must be at least one fk such that the value of kΛfk k2 is as large as the average value, so from the definition of k.k2 , we conclude kΛk22



N X

Ljj .

(3.13)

j=1

Let $\lambda_1, \dots, \lambda_N$ denote the eigenvalues of the matrix $G = L^{-1}$ (Theorem 3.3.1). Since the trace of a matrix is invariant under a similarity transformation,
$$\operatorname{trace}(L) = \sum_{j=1}^N L_{jj} = \sum_{j=1}^N \frac{1}{\lambda_j}.$$
Applying Lemma 3.4.1 to the numbers $\lambda_1^{-1}, \dots, \lambda_N^{-1}$, we obtain
$$\sum_{j=1}^N L_{jj} \ \ge\ \frac{N^2}{\sum_{j=1}^N \lambda_j} = \frac{N}{G(x,x)} = \omega_{r-1},$$
with equality if and only if $\lambda_1 = \dots = \lambda_N$; here the last two steps follow from equations (3.5) and (3.6). Now, by Theorem 3.3.5, $\lambda_1 = \dots = \lambda_N$ if and only if $T$ is a tight spherical $2n$-design, and by Theorem 2.3.1 no tight spherical $2n$-design exists for $r \ge 3$, $n \ge 3$. Hence the inequality above is strict, and combined with (3.13) this gives $\|\Lambda\|_2^2 > \omega_{r-1}$. The theorem is proved. □

We remark that the above proof is simpler than its original version in [35], since it does not appeal to Theorem 3.3.4.
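The key step in the proof, that averaging the quadratic form over all sign vectors kills the off-diagonal terms and leaves the trace, is easy to confirm numerically. The following is a small sketch (not from the thesis; Python with numpy, and the matrix $L$ is a random positive definite stand-in for the Gram matrix of the Lagrangians):

```python
import itertools
import numpy as np

# A small symmetric positive definite matrix standing in for L = (<l_i, l_j>).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
L = A @ A.T + 4.0 * np.eye(4)
N = L.shape[0]

# Average the quadratic form k^T L k over all 2^N sign vectors k.
total = 0.0
for k in itertools.product([-1.0, 1.0], repeat=N):
    k = np.array(k)
    total += k @ L @ k
average = total / 2**N

# Off-diagonal terms cancel in the average, leaving exactly trace(L).
print(average, np.trace(L))
assert np.isclose(average, np.trace(L))
```

The cancellation happens because, for $i \ne j$, the product $k_i k_j$ is $+1$ for exactly half of the sign vectors and $-1$ for the other half.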

3.5 Extremal fundamental system

The concepts of extremal basis and extremal fundamental system were introduced by Reimer, see [30], in an attempt to construct a good fundamental system for Lagrangian interpolation on $S^{r-1}$. We start with the following definition.

Definition 8 (Extremal basis) A basis $p_1, \dots, p_N$ is called extremal if every $p_j$ is best approximated in $P_j := \operatorname{span}\{p_k : k \in \{1, \dots, N\} \setminus \{j\}\}$ by zero, for all $j = 1, \dots, N$.

From the definition, $p_1, \dots, p_N$ is extremal if for all $j = 1, \dots, N$,
$$\Big\| p_j + \sum_{k \ne j} a_k p_k \Big\| \ \ge\ \|p_j\|$$
holds for arbitrary real or complex coefficients $a_k$. An example of an extremal basis is an orthonormal basis. Now let $\{p_1, \dots, p_N\}$ be an arbitrary basis for $P \subset \mathbb{P}_n^r(S^{r-1})$ and let $T := \{t_1, \dots, t_N\}$ be an arbitrary set of $N$ points on $S^{r-1}$. Then we define
$$\Delta(t_1, \dots, t_N) := \det P(t_1, \dots, t_N) = \begin{vmatrix} p_1(t_1) & \dots & p_1(t_N) \\ \vdots & \ddots & \vdots \\ p_N(t_1) & \dots & p_N(t_N) \end{vmatrix}.$$



As $T$ is moved continuously on the sphere $S^{r-1}$, $\Delta$ is a continuous function from $(S^{r-1})^N$ to $\mathbb{R}$. Since $S^{r-1}$ is compact, $(S^{r-1})^N$ is compact, so $|\Delta|$ must attain its maximum value at some fundamental system $T^* = \{t_1^*, \dots, t_N^*\}$, which is called an extremal fundamental system. The maximising point set is independent of the choice of basis: if $q = (q_1, \dots, q_N)^T$ is another basis with $q = Ap$, where $A$ is a nonsingular $N \times N$ matrix and $p = (p_1, \dots, p_N)^T$, then the new determinant is $\Delta'(T^*) = \det(A)\,\Delta(T^*)$, a fixed scalar multiple of the old one. The Lagrangians defined over $T^*$ form an extremal basis, by the following lemma.

Lemma 3.5.1 Let $T^* = \{t_1^*, \dots, t_N^*\}$ be an extremal fundamental system. The Lagrangians corresponding to this fundamental system satisfy $\|l_1\|_\infty = \dots = \|l_N\|_\infty = 1$. Furthermore, $l_1, \dots, l_N$ form an extremal basis for $P$.

Proof. We recall the definition of $l_j$,
$$l_j(x) := \frac{\Delta(t_1^*, \dots, x, \dots, t_N^*)}{\Delta(t_1^*, \dots, t_j^*, \dots, t_N^*)},$$
where $x$ occupies the $j$th position. Since $|\Delta|$ attains its maximum at $(t_1^*, \dots, t_N^*)$ and $l_j(t_k^*) = \delta_{jk}$, we obtain $\|l_j\|_\infty = 1$ for $j = 1, \dots, N$. At the point $t_j^*$, for any real or complex numbers $a_k$, we have
$$\Big( l_j + \sum_{k \ne j} a_k l_k \Big)(t_j^*) = l_j(t_j^*) = 1.$$
Hence for all $j = 1, \dots, N$,
$$\Big\| l_j + \sum_{k \ne j} a_k l_k \Big\|_\infty \ \ge\ 1 = \|l_j\|_\infty,$$
i.e. $\{l_1, \dots, l_N\}$ is an extremal basis for $P$. □

Let $G$ be the reproducing kernel of $P$. For two arbitrary fundamental systems $\mathcal{X} := (x_1, \dots, x_N)$ and $\mathcal{Y} := (y_1, \dots, y_N)$, we define
$$\Delta(\mathcal{X}, \mathcal{Y}) := \det G(\mathcal{X}, \mathcal{Y}) = \begin{vmatrix} G(x_1, y_1) & \dots & G(x_1, y_N) \\ \vdots & \ddots & \vdots \\ G(x_N, y_1) & \dots & G(x_N, y_N) \end{vmatrix}.$$

Lemma 3.5.2 Let $\mathcal{X}, \mathcal{Y}$ be two arbitrary sets of $N$ points on $S^{r-1}$. Then
$$|\Delta(\mathcal{X}, \mathcal{Y})| \ \le\ |\Delta(\mathcal{T}, \mathcal{T})|, \qquad (3.14)$$
where $\mathcal{T}$ is an extremal fundamental system on $S^{r-1}$.

Proof. As a result of Lemma 3.3.3, $G(x_1, \cdot), \dots, G(x_N, \cdot)$ form a basis for $P$, so we can fix $\mathcal{X}$ and let $\mathcal{Y}$ vary to an extremal fundamental system $\mathcal{T}$. By the definition of an extremal fundamental system, $|\Delta(\mathcal{X}, \mathcal{Y})| \le |\Delta(\mathcal{X}, \mathcal{T})|$. Now, since $\mathcal{T}$ is also a fundamental system, $G(\cdot, t_1), \dots, G(\cdot, t_N)$ form a basis for $P$, so by fixing $\mathcal{T}$ and letting $\mathcal{X}$ vary we obtain $|\Delta(\mathcal{X}, \mathcal{T})| \le |\Delta(\mathcal{T}, \mathcal{T})|$. Combining the two inequalities gives $|\Delta(\mathcal{X}, \mathcal{Y})| \le |\Delta(\mathcal{T}, \mathcal{T})|$. □

Recalling the definition of the matrix $G(\mathcal{T})$, where $G_{ij} = G(t_i, t_j)$, we have $\Delta(\mathcal{T}, \mathcal{T}) = \det G(\mathcal{T})$. As $G(\mathcal{T})$ is positive definite, Lemma 3.5.2 implies $|\Delta(\mathcal{X}, \mathcal{Y})| \le \det G(\mathcal{T})$. Suppose $\mathcal{T}'$ is another extremal fundamental system; then $\det G(\mathcal{T}') \le \det G(\mathcal{T})$, and by swapping the roles of $\mathcal{T}'$ and $\mathcal{T}$ we also get $\det G(\mathcal{T}) \le \det G(\mathcal{T}')$. In summary, we have the following theorem.

Theorem 3.5.1 (Reimer) Let $\mathcal{T}$ be an extremal fundamental system for $P$. Then
$$\det G(\mathcal{T}) = \max\{\det G(\mathcal{X}) \mid \mathcal{X} \in (S^{r-1})^N\}.$$

Since the matrix $L = G^{-1}$ (Theorem 3.3.1), we have an immediate corollary.

Corollary 3.5.1 Let $\mathcal{T}$ be an extremal fundamental system for $P$. Then
$$\det L(\mathcal{T}) = \min\{\det L(\mathcal{X}) \mid \mathcal{X} \in (S^{r-1})^N\}.$$
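Theorem 3.5.1 suggests a simple, if crude, way to approximate an extremal fundamental system numerically: maximise $\det G(\mathcal{X})$ over random point configurations. The sketch below (illustrative only, not Reimer's algorithm; Python with numpy) does this for the degree-1 polynomial space on $S^2$, whose reproducing kernel by the addition theorem is $G(x,y) = (1 + 3\,x\cdot y)/(4\pi)$. Since the diagonal entries are fixed at $G(x,x) = 1/\pi$, Hadamard's inequality caps the determinant at $(1/\pi)^4$, the value attained by the regular tetrahedron (a tight 2-design):

```python
import numpy as np

rng = np.random.default_rng(1)

def kernel_gram(X):
    # Reproducing kernel of degree-1 spherical polynomials on S^2:
    # G(x, y) = (1 + 3 x.y) / (4 pi)   (addition theorem, N = 4)
    return (1.0 + 3.0 * X @ X.T) / (4.0 * np.pi)

def random_points(n):
    X = rng.standard_normal((n, 3))
    return X / np.linalg.norm(X, axis=1, keepdims=True)

# Crude random search for a fundamental system maximising det G(X).
best_det, best_X = -np.inf, None
for _ in range(20000):
    X = random_points(4)
    d = np.linalg.det(kernel_gram(X))
    if d > best_det:
        best_det, best_X = d, X

# Hadamard's inequality for positive semi-definite matrices bounds the
# determinant by the product of the diagonal entries, here (1/pi)^4.
print(best_det, (1.0 / np.pi) ** 4)
assert 0.0 < best_det <= (1.0 / np.pi) ** 4 + 1e-12
```

A serious implementation would refine the best configuration by local optimisation rather than pure random sampling; the point here is only that the determinant bound of Theorem 3.5.1 is observable numerically.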

Chapter 4

Polynomial hyperinterpolation on spheres

4.1 Introduction

This chapter introduces hyperinterpolation on the sphere $S^{r-1}$. Section 4.2 is concerned with the Erdős-Turán property for the norms of interpolation operators; roughly speaking, the property concerns the lower bound for the norm of a linear projection in the $C(S^{r-1})$ to $L_2(S^{r-1})$ setting. In the previous chapter, Theorem 3.4.4 showed that a Lagrangian interpolation operator cannot achieve this bound. Inspired by this fact, the concept of hyperinterpolation is introduced in Section 4.3. Section 4.4 deals with estimates for the hyperinterpolation operator, in which we show that the operator achieves the Erdős-Turán lower bound. Various ways of constructing hyperinterpolation points are described in Section 4.5.

4.2 Erdős-Turán property

In this section, we consider linear projections in the $C(S^{r-1})$ to $L_2(S^{r-1})$ setting, as studied in Section 3.4. By arguments similar to those of Section 3.4, for every linear projection $L : C(S^{r-1}) \to \mathbb{P}_n^r(S^{r-1})$ we obtain a lower bound for the norm of the projection,
$$\|L\|_2 \ \ge\ \sqrt{\omega_{r-1}}. \qquad (4.1)$$
Now we present two important linear projections that do achieve the lower bound in (4.1): the identity operator and the orthogonal projection operator. More precisely, we have:

Lemma 4.2.1 The identity operator $I : C(S^{r-1}) \to L_2(S^{r-1})$ satisfies $\|I\|_2 = \sqrt{\omega_{r-1}}$, where
$$\|I\|_2 := \sup\{\|If\|_2 : \|f\|_\infty \le 1\}.$$

Proof. Since the identity map is also a linear projection, we have $\|I\|_2 \ge \sqrt{\omega_{r-1}}$. However, from the definition of the identity map,
$$\|If\|_2 = \|f\|_2 = \Big( \int_{S^{r-1}} |f(x)|^2\, dS \Big)^{1/2} \le \|f\|_\infty \Big( \int_{S^{r-1}} dS \Big)^{1/2} = \|f\|_\infty \sqrt{\omega_{r-1}}.$$
The definition of $\|I\|_2$ now gives $\|I\|_2 \le \sqrt{\omega_{r-1}}$. Therefore $\|I\|_2 = \sqrt{\omega_{r-1}}$ as stated. □

Lemma 4.2.2 The orthogonal projection operator $\Pi : C(S^{r-1}) \to \mathbb{P}_n^r(S^{r-1})$ satisfies $\|\Pi\|_2 = \sqrt{\omega_{r-1}}$, where
$$\|\Pi\|_2 := \sup\{\|\Pi f\|_2 : \|f\|_\infty \le 1\}.$$

Proof. By the definition of orthogonal projection, $\langle f - \Pi f, \Pi f \rangle = 0$ for all $f \in C(S^{r-1})$, where $\langle \cdot, \cdot \rangle$ is the usual inner product in $L_2(S^{r-1})$. Using that property and applying the Cauchy-Schwarz inequality, we obtain
$$\langle \Pi f, \Pi f \rangle = \langle f, \Pi f \rangle \le \|f\|_2\, \|\Pi f\|_2,$$
which is equivalent to $\|\Pi f\|_2 \le \|f\|_2$. Since
$$\|f\|_2 = \Big( \int_{S^{r-1}} |f(x)|^2\, dS \Big)^{1/2} \le \|f\|_\infty \Big( \int_{S^{r-1}} dS \Big)^{1/2} = \|f\|_\infty \sqrt{\omega_{r-1}},$$
we have $\|\Pi f\|_2 \le \|f\|_\infty \sqrt{\omega_{r-1}}$, and hence $\|\Pi\|_2 \le \sqrt{\omega_{r-1}}$. The orthogonal projection is also a linear projection, so by (4.1) we conclude that $\|\Pi\|_2 = \sqrt{\omega_{r-1}}$, and the lemma is proved. □

We now turn to interpolation operators, an important subclass of linear projections. We start from an interpolation operator on the interval $[-1,1] \subset \mathbb{R}$.

Lemma 4.2.3 If $r(x)$ is a positive weight function on $[-1,1]$ (i.e. $r$ is non-negative, integrable, and vanishes only on a finite set), and $p_0, p_1, \dots, p_{n+1}$ is a family of orthogonal polynomials with respect to $r(x)$, then the polynomial $L_n f$ of degree $\le n$ that interpolates a continuous function $f$ at the zeros of $p_{n+1}$ satisfies
$$\Big( \int_{-1}^1 [L_n f(x)]^2\, r(x)\, dx \Big)^{1/2} \le \|f\|_\infty \Big( \int_{-1}^1 r(x)\, dx \Big)^{1/2}.$$

Proof. The Lagrange interpolation formula can be written as
$$(L_n f)(x) = \sum_{i=1}^{n+1} f(x_i)\, l_i(x), \qquad l_i(x) = \frac{p_{n+1}(x)}{(x - x_i)\, p'_{n+1}(x_i)},$$
where $x_1, \dots, x_{n+1}$ are the zeros of $p_{n+1}$. For $i \ne j$, we have
$$\int_{-1}^1 l_i(x)\, l_j(x)\, r(x)\, dx = \frac{1}{p'_{n+1}(x_i)\, p'_{n+1}(x_j)} \int_{-1}^1 p_{n+1}(x)\, \frac{p_{n+1}(x)}{(x - x_i)(x - x_j)}\, r(x)\, dx = 0,$$
since the second factor under the integral sign is a polynomial of degree $n-1$, to which $p_{n+1}$ is orthogonal. For $i = j$, we start from $(\sum_i l_i)^2 = 1$ (since $\sum_i l_i$ is the interpolant of the constant function 1), multiply both sides by $r(x)$, integrate over $[-1,1]$, and use the orthogonality just established to get
$$\sum_{i=1}^{n+1} \int_{-1}^1 l_i^2(x)\, r(x)\, dx = \int_{-1}^1 r(x)\, dx.$$
Thus,
$$\int_{-1}^1 [L_n f(x)]^2\, r(x)\, dx = \sum_{i=1}^{n+1} f^2(x_i) \int_{-1}^1 l_i^2(x)\, r(x)\, dx \le \|f\|_\infty^2 \sum_{i=1}^{n+1} \int_{-1}^1 l_i^2(x)\, r(x)\, dx = \|f\|_\infty^2 \int_{-1}^1 r(x)\, dx,$$
from which the result follows. □

Since $\|L_n\|_r \ge V^{1/2}$ (by the same argument as for (4.1)), we have $\|L_n\|_r = V^{1/2}$, where $V$ is the volume of $[-1,1]$ with respect to the measure $r(x)\,dx$ and
$$\|L_n\|_r := \sup\Big\{ \Big( \int_{-1}^1 [L_n f(x)]^2\, r(x)\, dx \Big)^{1/2} : \|f\|_\infty \le 1 \Big\}.$$
The generalisation of the above lemma to the sphere is developed in the next section. The above lemma is a different form of the following theorem of Erdős and Turán [15]:
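For the constant weight $r(x) \equiv 1$, the bound of Lemma 4.2.3 is easy to confirm numerically: since $L_n f$ has degree $\le n$, $[L_n f]^2$ has degree $\le 2n$ and is integrated exactly by the $(n+1)$-point Gauss-Legendre rule, which collapses the left-hand side to a weighted sum of sampled values. A sketch (not from the thesis; Python with numpy):

```python
import numpy as np

n = 6
nodes, mu = np.polynomial.legendre.leggauss(n + 1)  # zeros of P_{n+1}, Gauss weights

f = lambda x: np.sin(3.0 * x) + 0.5 * np.cos(7.0 * x)

# L_n f matches f at the nodes and (L_n f)^2 has degree <= 2n, so the Gauss
# rule gives the integral exactly:
#   int_{-1}^{1} [L_n f(x)]^2 dx = sum_i mu_i f(x_i)^2 <= 2 ||f||_inf^2.
lhs = np.sum(mu * f(nodes) ** 2)

xs = np.concatenate([np.linspace(-1.0, 1.0, 10001), nodes])
f_sup = np.max(np.abs(f(xs)))
print(lhs, 2.0 * f_sup**2)
assert lhs <= 2.0 * f_sup**2 + 1e-12
```

Here $V = \int_{-1}^1 1\,dx = 2$, matching the constant $V^{1/2}$ in the norm identity above.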

Theorem 4.2.1 Let $r(x)$ be a positive weight function on $[-1,1]$, and let $p_0, p_1, \dots, p_{n+1}$ and $L_n f$ be defined as in the above lemma. Then
$$\Big( \int_{-1}^1 [L_n f(x) - f(x)]^2\, r(x)\, dx \Big)^{1/2} \le 2 \Big( \int_{-1}^1 r(x)\, dx \Big)^{1/2} \inf_{\chi \in \mathbb{P}_n^1} \|f - \chi\|_\infty.$$

Proof. Denote by $V$ the volume of $[-1,1]$ with respect to the measure $r(x)\,dx$, and write $\|\cdot\|_r$ for the $L_2[-1,1]$-norm with respect to the weight function $r(x)$. Then for an arbitrary polynomial $\chi \in \mathbb{P}_n^1[-1,1]$, using $L_n \chi = \chi$ and Lemma 4.2.3,
$$\|L_n f - f\|_r = \|L_n(f - \chi) + \chi - f\|_r \le \|L_n(f - \chi)\|_r + \|\chi - f\|_r \le V^{1/2}\|f - \chi\|_\infty + V^{1/2}\|f - \chi\|_\infty = 2V^{1/2}\|f - \chi\|_\infty.$$
Since this holds for every $\chi$, we obtain the result of the theorem,
$$\|L_n f - f\|_r \le 2V^{1/2} \inf_{\chi \in \mathbb{P}_n^1} \|f - \chi\|_\infty. \qquad \Box$$

Returning to the sphere $S^{r-1}$, it is now natural to state an important property of interpolation operators, which form a subclass of linear projections:

Property 4.2.1 (Erdős-Turán property) An interpolation operator $L_n : C(S^{r-1}) \to \mathbb{P}_n^r(S^{r-1})$ with respect to a fundamental system $T$ is said to have the strong Erdős-Turán property if $\|L_n\|_2 = \sqrt{\omega_{r-1}}$, where $\|L_n\|_2 := \sup\{\|L_n f\|_2 : \|f\|_\infty \le 1\}$.

In the previous chapter, we saw that Lagrangian interpolation does not achieve the strong Erdős-Turán property for $n \ge 3$, $r \ge 3$. The following section introduces polynomial hyperinterpolation, a projection operator that attains the lower bound in the Erdős-Turán property.

4.3 Polynomial hyperinterpolation

Polynomial hyperinterpolation over general regions was first introduced by Sloan in [34]. Here we are only interested in hyperinterpolation on the unit sphere. The starting point of hyperinterpolation is a quadrature rule that is exact for polynomials on the sphere $S^{r-1}$,
$$\sum_{k=1}^M w_k\, p(t_k) = \int_{S^{r-1}} p(x)\, dS, \qquad \forall p \in \mathbb{P}_{2n}^r(S^{r-1}), \qquad (4.2)$$
where the weights $w_k$ are positive and $T$ is a set of points with $M = |T| \ge N$, where $N$ is the dimension of $\mathbb{P}_n^r(S^{r-1})$. We remark that when $M = N$, the hyperinterpolation operator becomes an interpolation operator that satisfies the strong Erdős-Turán property. The existence of such a quadrature rule is provided by the following theorem, a special case of the main theorem in [33]:

Theorem 4.3.1 (Seymour and Zaslavsky) Let $P$ be a finite-dimensional polynomial subspace of $\mathbb{P}^r(S^{r-1})$. Then there exists a finite set $T$, called an averaging set, such that
$$\frac{1}{|T|} \sum_{t \in T} p(t) = \frac{1}{\omega_{r-1}} \int_{S^{r-1}} p\, dS, \qquad \forall p \in P.$$
In the above, $\omega_{r-1}$ is the total surface area of the sphere and $dS$ is the surface measure on the sphere. A semi-inner product with respect to the quadrature rule (4.2) is defined as
$$\langle f, g \rangle_M := \sum_{j=1}^M w_j\, f(t_j)\, g(t_j).$$
It is only a semi-inner product, since one can construct a non-zero continuous function $f$ with $f(t_j) = 0$ for all $j = 1, \dots, M$. For a function $f \in C(S^{r-1})$, the polynomial hyperinterpolant $L_n f$ is now defined as follows.

Definition 9 Let $p_1, \dots, p_N$ be an orthonormal basis for $\mathbb{P}_n^r(S^{r-1})$. Then we define
$$L_n f = \sum_{j=1}^N \langle f, p_j \rangle_M\, p_j \qquad (4.3)$$
as the polynomial hyperinterpolant for $f$ in $\mathbb{P}_n^r(S^{r-1})$.

Several important properties of the hyperinterpolant are described in the following lemmas, which are taken from [34].

Lemma 4.3.1 If $f \in \mathbb{P}_n^r(S^{r-1})$, then $L_n f = f$; i.e. hyperinterpolation is exact if $f$ is a spherical polynomial of degree $\le n$.

Proof. Since $f \in \mathbb{P}_n^r(S^{r-1})$, it may be expressed as $f = \sum_{i=1}^N a_i p_i$. From the definition of the hyperinterpolant,
$$L_n f = \sum_{j=1}^N \Big\langle \sum_{i=1}^N a_i p_i,\, p_j \Big\rangle_M p_j = \sum_{j=1}^N \sum_{i=1}^N a_i \langle p_i, p_j \rangle_M\, p_j = \sum_{j=1}^N a_j p_j = f.$$
In the above, we have used the exactness of the quadrature rule (4.2): since $p_i p_j \in \mathbb{P}_{2n}^r(S^{r-1})$, we have $\langle p_i, p_j \rangle_M = \langle p_i, p_j \rangle = \delta_{ij}$. □

The following lemma is a summary of other fundamental properties of the hyperinterpolant.

Lemma 4.3.2 Given $f \in C(S^{r-1})$, let $L_n f$ be defined by (4.3). Then
(a) $\langle f - L_n f, \chi \rangle_M = 0$ for all $\chi \in \mathbb{P}_n^r(S^{r-1})$;
(b) $\langle L_n f, L_n f \rangle_M + \langle f - L_n f, f - L_n f \rangle_M = \langle f, f \rangle_M$;
(c) $\langle L_n f, L_n f \rangle_M \le \langle f, f \rangle_M$;
(d) $\langle f - L_n f, f - L_n f \rangle_M = \min_{\chi \in \mathbb{P}_n^r(S^{r-1})} \langle f - \chi, f - \chi \rangle_M$.

Proof. (a) It is enough to show that $\langle f, p_j \rangle_M = \langle L_n f, p_j \rangle_M$ for all $j = 1, \dots, N$. Recalling the definition of $L_n f$ in (4.3), we obtain
$$\langle L_n f, p_j \rangle_M = \Big\langle \sum_{k=1}^N \langle f, p_k \rangle_M\, p_k,\, p_j \Big\rangle_M = \langle f, p_j \rangle_M.$$
The last step follows from the exactness of the quadrature rule (4.2) and the orthonormality of the $p_j$, i.e. $\langle p_i, p_j \rangle = \langle p_i, p_j \rangle_M = \delta_{ij}$ for $i, j = 1, \dots, N$.
(b) The statement follows from $\langle L_n f, L_n f \rangle_M = \langle f, L_n f \rangle_M$, which is just a consequence of (a) with $\chi = L_n f$.
(c) Since $\langle f - L_n f, f - L_n f \rangle_M \ge 0$, (b) implies (c).
(d) We replace $f$ by $f - \chi$ in (b), with $\chi \in \mathbb{P}_n^r(S^{r-1})$ and $L_n(f - \chi) = L_n f - \chi$ by Lemma 4.3.1, to obtain
$$\langle L_n f - \chi, L_n f - \chi \rangle_M + \langle f - L_n f, f - L_n f \rangle_M = \langle f - \chi, f - \chi \rangle_M,$$
from which the result follows. □

4.4 The hyperinterpolation operator and its norm

The supremum norm of $L_n$ is defined as
$$\|L_n\|_\infty := \sup\{\|L_n f\|_\infty : f \in C(S^{r-1}),\ \|f\|_\infty \le 1\}.$$

Theorem 4.4.1 (Sloan) Let $G(x,y)$ be the reproducing kernel of $\mathbb{P}_n^r(S^{r-1})$ and let the $t_j$ be the hyperinterpolation points as in (4.2). The supremum norm of the hyperinterpolation operator satisfies
$$\|L_n\|_\infty = \max_{x \in S^{r-1}} \sum_{j=1}^M w_j\, |G(x, t_j)|.$$

Proof. Let us define $g_j(x) := G(x, t_j)$ for $x \in S^{r-1}$, and
$$\gamma(x) := \sum_{j=1}^M w_j\, |g_j(x)|, \qquad x \in S^{r-1}.$$
From the reproducing kernel property of $G$ and the exactness of the quadrature rule (4.2),
$$L_n f(x) = \langle G(x, \cdot), L_n f \rangle = \langle G(x, \cdot), L_n f \rangle_M = \sum_{j=1}^M w_j\, f(t_j)\, G(x, t_j) = \sum_{j=1}^M w_j\, f(t_j)\, g_j(x).$$
Thus, for $x \in S^{r-1}$,
$$|L_n f(x)| \le \sum_{j=1}^M w_j\, |f(t_j)|\, |g_j(x)| \le \|f\|_\infty \sum_{j=1}^M w_j\, |g_j(x)| = \|f\|_\infty\, \gamma(x).$$
Using the continuity of spherical polynomials on the compact set $S^{r-1}$,
$$\|L_n f\|_\infty \le \|f\|_\infty \max_{x \in S^{r-1}} \gamma(x).$$
Assuming that $\gamma$ attains its maximum value at some point $\zeta \in S^{r-1}$, we now construct a function $f^*$ by
$$f^*(x) := \begin{cases} \operatorname{sign}(g_j(\zeta))\,\big(1 - \tfrac{1}{R}|x - t_j|\big) & \text{for } |x - t_j| < R, \\ 0 & \text{otherwise}, \end{cases}$$
where $R$ is a chosen real number with $0 < R < \tfrac12 \min\{|t_i - t_j| : 1 \le i, j \le M,\ i \ne j\}$. Obviously $\|f^*\|_\infty = 1$. It follows that
$$\|L_n\|_\infty \ge \|L_n f^*\|_\infty \ge |(L_n f^*)(\zeta)| = \sum_{j=1}^M w_j\, \operatorname{sign}(g_j(\zeta))\, g_j(\zeta) = \gamma(\zeta).$$
Thus we obtain $\|L_n\|_\infty = \gamma(\zeta)$, and the theorem is proved. □

An estimate for $\|L_n\|_\infty$, taken from [36], is given in the following theorem.

Theorem 4.4.2 $\|L_n\|_\infty \le N^{1/2}$, where $N$ is the dimension of $\mathbb{P}_n^r(S^{r-1})$.

Proof. Let $G$ be the reproducing kernel of $\mathbb{P}_n^r(S^{r-1})$. Using Theorem 4.4.1 and the Cauchy-Schwarz inequality, we obtain
$$\|L_n\|_\infty = \sum_{j=1}^M w_j\, |G(\zeta, t_j)| \le \Big( \sum_{j=1}^M w_j \Big)^{1/2} \Big( \sum_{j=1}^M w_j\, G(\zeta, t_j)^2 \Big)^{1/2} = \omega_{r-1}^{1/2} \Big( \int_{S^{r-1}} G(\zeta, x)^2\, dS \Big)^{1/2},$$
where in the last step the exactness of the quadrature rule (4.2) is used twice: once for the constant function 1, and once for $G(\zeta, \cdot)^2 \in \mathbb{P}_{2n}^r(S^{r-1})$. By applying the reproducing kernel property of $G$ together with remark (3.2), we get
$$\int_{S^{r-1}} G(\zeta, x)^2\, dS = G(\zeta, \zeta) = \frac{N}{\omega_{r-1}},$$
and hence $\|L_n\|_\infty \le N^{1/2}$ as stated. □
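On the circle $S^1$, the reproducing kernel of the trigonometric polynomials of degree $\le n$ is $G(\theta, \theta') = 1/(2\pi) + (1/\pi)\sum_{j=1}^n \cos j(\theta - \theta')$, so the norm formula of Theorem 4.4.1 and the bound of Theorem 4.4.2 can be evaluated directly. A sketch (illustrative, not from the thesis; Python with numpy, using equal weights $2\pi/m$ at $m = 2n+1$ equally spaced points):

```python
import numpy as np

n = 5
m = 2 * n + 1                      # equally spaced points, weights 2*pi/m
N = 2 * n + 1                      # dim of trig polynomials of degree <= n
theta = 2.0 * np.pi * np.arange(m) / m
w = np.full(m, 2.0 * np.pi / m)

def G(a, b):
    # Reproducing kernel of degree-n trigonometric polynomials in L2(S^1).
    d = a - b
    return 1.0 / (2.0 * np.pi) + sum(np.cos(j * d) for j in range(1, n + 1)) / np.pi

# gamma(x) = sum_j w_j |G(x, t_j)|;  ||L_n||_inf = max_x gamma(x)  (Theorem 4.4.1)
xs = np.linspace(0.0, 2.0 * np.pi, 4001)
gamma = np.array([np.sum(w * np.abs(G(x, theta))) for x in xs])
op_norm = gamma.max()

print(op_norm, np.sqrt(N))
assert 1.0 - 1e-9 <= op_norm <= np.sqrt(N) + 1e-9   # Theorem 4.4.2 bound
```

The lower bound $\gamma(x) \ge 1$ in the final check comes from $L_n 1 = 1$; the grid maximum slightly underestimates the true supremum, so only the upper bound $N^{1/2}$ is asserted exactly.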

For the special case $r = 3$, in which $N = (n+1)^2$, a better upper bound for $\|L_n\|_\infty$ can be achieved if the quadrature rule in (4.2) satisfies the spherical cap assumption introduced in [36]:

Assumption 1 The $M$ points of the quadrature rule $Q$ in (4.2) are distributed such that there exists a constant $c_1 > 0$, independent of $M$ and $Q$, such that for every spherical cap $A$ with spherical radius $1/\sqrt{M}$ we have $\sum_{t_j \in A} w_j \le c_1 |A|$.

In the above, $|A| = 2\pi(1 - \cos(1/\sqrt{M}))$ is the area of the spherical cap. Here we parametrize the sphere $S^2$ by spherical coordinates $(\theta, \phi) \in [0, \pi] \times [0, 2\pi)$. The spherical cap with spherical radius $\alpha$ is cut out of $S^2$ by a cone of half-angle $\alpha$; the axis of the cap is the polar axis of the cone.

Theorem 4.4.3 For $r = 3$, suppose the spherical cap assumption is satisfied. Then $\|L_n\|_\infty \le C n^{1/2}$ for some positive constant $C$.

It has been pointed out in [36] that the tensor product Gauss-Legendre, Clenshaw-Curtis, and Fejér rules satisfy the spherical cap assumption. The $L_2(S^{r-1})$ norm of the hyperinterpolation operator is defined as
$$\|L_n\|_2 := \sup\{\|L_n f\|_2 : f \in C(S^{r-1}),\ \|f\|_\infty \le 1\}.$$

Theorem 4.4.4 (Sloan) Let the hyperinterpolation operator be $L_n : C(S^{r-1}) \to \mathbb{P}_n^r(S^{r-1})$. Then the norm of the operator is $\|L_n\|_2 = \sqrt{\omega_{r-1}}$, where $\omega_{r-1}$ is the total surface area of $S^{r-1}$.

Proof. Since $L_n f = f$ for $f \in \mathbb{P}_n^r(S^{r-1})$, $L_n$ is a linear projection. Hence $\|L_n\|_2 \ge \sqrt{\omega_{r-1}}$. For the reverse inequality, we use the exactness of the quadrature rule (4.2) for polynomials in $\mathbb{P}_{2n}^r(S^{r-1})$ to obtain
$$\|L_n f\|_2^2 = \langle L_n f, L_n f \rangle = \langle L_n f, L_n f \rangle_M \le \langle f, f \rangle_M,$$
where the last step is part (c) of Lemma 4.3.2. We also have
$$\langle f, f \rangle_M = \sum_{k=1}^M w_k\, f(t_k)^2 \le \sum_{k=1}^M w_k\, \|f\|_\infty^2 = \omega_{r-1} \|f\|_\infty^2.$$
Therefore $\|L_n f\|_2 \le \sqrt{\omega_{r-1}}\, \|f\|_\infty$. By the definition of the norm, this implies $\|L_n\|_2 \le \sqrt{\omega_{r-1}}$, so $\|L_n\|_2 = \sqrt{\omega_{r-1}}$. □

We immediately have a corollary:

Corollary 4.4.1 Let $L_n f$ be the hyperinterpolant for $f \in C(S^{r-1})$. Then
$$\|L_n f - f\|_2 \le 2\sqrt{\omega_{r-1}} \inf_{\chi \in \mathbb{P}_n^r(S^{r-1})} \|f - \chi\|_\infty.$$
Thus $\|L_n f - f\|_2 \to 0$ as $n \to \infty$.

Proof. For arbitrary $\chi \in \mathbb{P}_n^r(S^{r-1})$, using $L_n \chi = \chi$ and Theorem 4.4.4,
$$\|L_n f - f\|_2 = \|L_n(f - \chi) + \chi - f\|_2 \le \|L_n(f - \chi)\|_2 + \|\chi - f\|_2 \le \sqrt{\omega_{r-1}}\|f - \chi\|_\infty + \sqrt{\omega_{r-1}}\|\chi - f\|_\infty = 2\sqrt{\omega_{r-1}}\|f - \chi\|_\infty.$$
Since this is true for all $\chi \in \mathbb{P}_n^r(S^{r-1})$, we have
$$\|L_n f - f\|_2 \le 2\sqrt{\omega_{r-1}} \inf_{\chi \in \mathbb{P}_n^r(S^{r-1})} \|f - \chi\|_\infty.$$
By the Stone-Weierstrass theorem, $\inf_\chi \|f - \chi\|_\infty$ can be made arbitrarily small as $n$ increases. Hence $\|L_n f - f\|_2 \to 0$ as $n \to \infty$. □

4.5 Construction of hyperinterpolation points

Since hyperinterpolation is based on the exactness of the quadrature rule, the construction of hyperinterpolation points is closely related to the construction of exact quadrature rules on the sphere $S^{r-1}$. There is a fairly large literature dealing with the exactness of quadrature rules on spheres and with the relation between the number of points and the dimension of the interpolating polynomial space. Spherical designs and tensor products of Gauss quadrature rules on the sphere are among the rules that can be used for hyperinterpolation on $S^{r-1}$. We begin with an example on $S^1$.

Example 4.5.1 (The unit circle $S^1 \subset \mathbb{R}^2$.) Here $dS$ is the angular measure in radians and $\omega_1 = 2\pi$. The space of polynomials of degree $\le n$ is spanned by the $2n+1$ functions $\{1, \sin\theta, \cos\theta, \dots, \sin n\theta, \cos n\theta\}$. Consider $m = 2n+1$ points equally distributed on the circle; i.e. in polar coordinates the points are given by
$$\theta_k = \frac{2\pi k}{m}, \qquad k = 0, \dots, m-1.$$
Then the weights are found to be $w_k = 2\pi/m$, and the quadrature rule with these weights is exact for all polynomials of degree $\le 2n$:
$$\frac{2\pi}{m} \sum_{k=0}^{m-1} g\Big(\frac{2\pi k}{m}\Big) = \int_0^{2\pi} g(\theta)\, d\theta, \qquad g \in \mathbb{P}_{2n}^2(S^1).$$
Hence the hyperinterpolant $L_n f$ of $f$ is
$$L_n f(\theta) = \frac{a_0}{2} + \sum_{j=1}^n (a_j \cos j\theta + b_j \sin j\theta),$$
where
$$a_j = \frac{2}{m} \sum_{k=0}^{m-1} \cos\Big(\frac{2\pi jk}{m}\Big)\, f\Big(\frac{2\pi k}{m}\Big), \quad j \ge 0, \qquad b_j = \frac{2}{m} \sum_{k=0}^{m-1} \sin\Big(\frac{2\pi jk}{m}\Big)\, f\Big(\frac{2\pi k}{m}\Big), \quad j \ge 1. \qquad \Box$$
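The coefficient sums in the example are just a discrete Fourier transform of the sampled values, and by Lemma 4.3.1 the hyperinterpolant reproduces any trigonometric polynomial of degree $\le n$ exactly. A sketch of the construction (not from the thesis; Python with numpy):

```python
import numpy as np

n = 4
m = 2 * n + 1
theta_k = 2.0 * np.pi * np.arange(m) / m                       # equally spaced nodes
f = lambda t: 1.0 + 2.0 * np.cos(3 * t) - 0.7 * np.sin(4 * t)  # degree 4 <= n

fk = f(theta_k)
a = np.array([2.0 / m * np.sum(np.cos(j * theta_k) * fk) for j in range(n + 1)])
b = np.array([2.0 / m * np.sum(np.sin(j * theta_k) * fk) for j in range(1, n + 1)])

def L_n(t):
    return a[0] / 2.0 + sum(a[j] * np.cos(j * t) + b[j - 1] * np.sin(j * t)
                            for j in range(1, n + 1))

# Exact reproduction of a degree-n trigonometric polynomial (Lemma 4.3.1).
ts = np.linspace(0.0, 2.0 * np.pi, 100)
assert np.allclose(L_n(ts), f(ts))
print(np.max(np.abs(L_n(ts) - f(ts))))
```

Exactness holds because the $m$-point equal-weight rule integrates all trigonometric polynomials of degree $\le m - 1 = 2n$ exactly, which is precisely the requirement (4.2).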

There are various explicit ways of constructing an exact quadrature rule for polynomials over the sphere $S^2$. Stroud [38] proposed tensor products of Gauss rules with respect to appropriate angular variables. More precisely, in polar coordinates,
$$\int_{S^2} g\, dS = \int_0^{2\pi}\!\!\int_0^\pi g(\theta, \phi)\, \sin\theta\, d\theta\, d\phi,$$
with $\theta$ the polar angle and $\phi$ the azimuthal angle, and a rule of order $2n+1$ is
$$\frac{\pi}{n+1} \sum_{j=1}^{2(n+1)} \sum_{i=1}^{n+1} \mu_i\, g\Big(\theta_i, \frac{j\pi}{n+1}\Big),$$
where the $\{\cos\theta_i\}$ are the zeros of the Legendre polynomial of degree $n+1$ and the $\{\mu_i\}$ are the corresponding Gauss-Legendre weights. In a different direction, Bajnok [3] proposed an explicit construction of spherical designs on $S^2$. Suppose that the set $X = \{x_1, x_2, \dots, x_n\}$ is an interval $t$-design on $[-1,1]$, that is,

for every continuous real function $f$ in a given finite-dimensional subspace of $C([-1,1])$, the following property holds:
$$\frac{1}{n} \sum_{k=1}^n f(x_k) = \frac{1}{1 - (-1)} \int_{-1}^1 f(x)\, dx.$$
Then the planes given by the equations $x = x_i$ intersect the sphere $x^2 + y^2 + z^2 = 1$ in $n$ circles. On each circle, $m$ points are equidistributed as a regular $m$-gon, where $m$ is some positive integer $\ge t+1$. Bajnok proved that the resulting $mn$ nodes form a $t$-design on $S^2$.

It is worth mentioning the known results for $t$-designs on the interval and for spherical designs. Let $m(t)$ denote the size of a smallest $t$-design on $[-1,1]$, and let $m'(t)$ denote the minimum integer such that for every $n \ge m'(t)$ an interval $t$-design of size $n$ exists. The existence of $m'(t)$ (and hence of $m(t)$) was proved in a general setting by Seymour and Zaslavsky in 1984 [33]. The values of $m(t)$ are only known for $t \le 9$: $m(1) = 1$, $m(2) = m(3) = 2$, $m(4) = m(5) = 4$, $m(6) = m(7) = 6$ and $m(8) = m(9) = 9$. It was proved by S. N. Bernstein in 1937 that $m(t) \le t$ if and only if $t \le 7$ or $t = 9$. For spherical designs, let $M(t)$ denote the size of the smallest $t$-design on $S^2$; all known tight spherical designs on $S^2$ are:

• for $t = 1$, $M(1) = 2$: a pair of antipodal points is a tight 1-design;
• for $t = 2$, $M(2) = 4$: the regular tetrahedron is a tight 2-design;
• for $t = 3$, $M(3) = 8$: the regular octahedron is a tight 3-design;
• for $t = 5$, $M(5) = 12$: the regular icosahedron is a tight 5-design.

The cube is a 3-design (but not a 4-design), and the dodecahedron is a 5-design (but not a 6-design). For other values of $t$, $M(t)$ has not been determined. Bajnok [3] gives an upper bound $M(t) \le (t+1)\,m(t)$, where $m(t)$ is as listed above for $t \le 9$.

The following example, called the discrete Fourier transform on $S^2$ in Driscoll and Healy [13], can be viewed as a construction of hyperinterpolation on $S^2$.

Example 4.5.2 We parametrize the points on $S^2$ by spherical coordinates $p = (\theta, \phi) \in [0, \pi] \times [0, 2\pi)$. The complex spherical harmonics, denoted by $Y_\ell^m$ for $m = -\ell, \dots, \ell$ and $\ell = 0, \dots, n$, constitute an orthonormal basis for the interpolating polynomial space $\mathbb{P}_n^3(S^2)$. The approximation is constructed on the equiangular grid points
$$\theta_s = \frac{\pi s}{2n}, \quad \phi_t = \frac{\pi t}{n}, \qquad 0 \le s \le 2n-1,\ 0 \le t \le 2n-1.$$
The hyperinterpolant of a function $f \in C(S^2)$ is expressed as a Laplace sum
$$L_n f = \sum_{\ell=0}^n \sum_{m=-\ell}^\ell \alpha_{\ell m}(f)\, Y_\ell^m,$$
where
$$\alpha_{\ell m}(f) = \frac{\sqrt{2\pi}}{2n} \sum_{s=0}^{2n-1} a_s^{(n)} \sum_{t=0}^{2n-1} Y_\ell^m(\theta_s, \phi_t)\, f(\theta_s, \phi_t),$$
and
$$a_s^{(n)} := \frac{2\sqrt{2}}{n}\, \sin\Big(\frac{\pi s}{n}\Big) \sum_{\ell=0}^{n-1} \frac{1}{2\ell+1}\, \sin\Big((2\ell+1)\frac{\pi s}{n}\Big).$$
In this construction, the number of interpolation points is $M = 4n^2$, whereas the dimension of $\mathbb{P}_n^3(S^2)$ is $(n+1)^2$. The discrete inner product is given by
$$\langle u, v \rangle_M := \sum_{s=0}^{2n-1} a_s^{(n)} \sum_{t=0}^{2n-1} u(\theta_s, \phi_t)\, v(\theta_s, \phi_t), \qquad u, v \in \mathbb{P}_n^3(S^2). \qquad \Box$$
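As a concrete illustration of the constructions in this section, Stroud's tensor-product rule can be assembled in a few lines and its exactness tested on sample polynomials (a sketch, not from the thesis; Python with numpy):

```python
import numpy as np

n = 4
tcos, mu = np.polynomial.legendre.leggauss(n + 1)   # cos(theta_i), Gauss weights
theta = np.arccos(tcos)
phi = np.pi * np.arange(1, 2 * (n + 1) + 1) / (n + 1)

def integrate(g):
    # Stroud's tensor-product rule: (pi/(n+1)) * sum_j sum_i mu_i g(theta_i, phi_j)
    T, P = np.meshgrid(theta, phi, indexing="ij")
    return np.pi / (n + 1) * np.sum(mu[:, None] * g(T, P))

# z = cos(theta):  int_{S^2} 1 dS = 4*pi,  int_{S^2} z^2 dS = 4*pi/3.
assert np.isclose(integrate(lambda t, p: np.ones_like(t)), 4.0 * np.pi)
assert np.isclose(integrate(lambda t, p: np.cos(t) ** 2), 4.0 * np.pi / 3.0)
# A degree-4 monomial: int_{S^2} x^2 z^2 dS = 4*pi/15.
x2z2 = lambda t, p: (np.sin(t) * np.cos(p)) ** 2 * np.cos(t) ** 2
assert np.isclose(integrate(x2z2), 4.0 * np.pi / 15.0)
print("tensor rule exact on sample polynomials")
```

The rule is exact for all spherical polynomials of degree $\le 2n+1$ because Gauss-Legendre handles the $\cos\theta$ direction and the equally spaced $\phi$ points handle the azimuthal trigonometric degree.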

Chapter 5

Optimal approximation on spheres

5.1 Introduction

In this chapter, we consider problems of constructive approximation in which the data points are specified. We then construct and estimate the interpolant based on the given data. Rather than restricting ourselves to polynomial interpolation, in which the interpolant lies inside some finite-dimensional polynomial space, in this chapter we introduce other interpolation spaces, spanned by splines or radial basis functions. In this general setting, reproducing kernels, in their various forms, still play an important role. For error estimation, we introduce the hypercircle inequality as a primary tool. Fourier analysis in Euclidean space $\mathbb{R}^r$ is used for analysing properties of the interpolant.

5.2 Optimal approximation in a Hilbert space

In a Hilbert space $H$, it is well known that optimal approximation can be interpreted as orthogonal projection. The problem of optimal approximation in $H$ can be stated as follows. Assume that $F, F_1, \dots, F_M$ are linearly independent continuous linear functionals on $H$. Furthermore, assume that for some $v \in H$ we are given the values
$$F_i(v) = f_i, \qquad i = 1, \dots, M,$$
and seek an optimal approximation to the value of $F(v)$. For example, if $H$ is an appropriate space of functions defined on the sphere, $F_1, \dots, F_M$ might be the point evaluation functionals at

prescribed points $t_1, \dots, t_M$, while $F$ might be the point evaluation functional at another point $t$. By the Riesz representation theorem, there exist $\phi_1, \dots, \phi_M$ such that for all $v \in H$, $F_j(v) = \langle v, \phi_j \rangle$, $j = 1, \dots, M$. Let $W = \operatorname{span}\{\phi_1, \phi_2, \dots, \phi_M\} \subset H$, and let $W^\perp$ be the orthogonal complement, i.e. $v \in W^\perp \iff \langle v, \phi_j \rangle = 0$ for $j = 1, \dots, M$. Now define the orthogonal projection from $H$ to $W$ as $\Pi : H \to W$ such that
$$\langle \Pi v - v, \chi \rangle = 0, \qquad \forall \chi \in W. \qquad (5.1)$$
The element $u := \Pi v \in W$ is called, in this chapter, the interpolant of $v \in H$. The minimum property of the interpolant $\Pi v$ is given by the following lemma, which is just the Pythagoras theorem in $H$.

Lemma 5.2.1 For all $v \in H$, we have $\|v\|^2 = \|\Pi v\|^2 + \|v - \Pi v\|^2$.

Proof. We expand the right-hand side as
$$\|\Pi v\|^2 + \langle v - \Pi v, v - \Pi v \rangle = 2\|\Pi v\|^2 + \|v\|^2 - 2\langle v, \Pi v \rangle = 2\|\Pi v\|^2 + \|v\|^2 - 2\langle \Pi v, \Pi v \rangle = \|v\|^2,$$
where the middle step uses relation (5.1) with $\chi = \Pi v$. □

Now we apply the above lemma to prove the following inequality, which appears in the mathematical literature as the hypercircle inequality. The hypercircle inequality can be found in Golomb and Weinberger [19], Davis [11], Stroud [38] and many mathematical physics texts, in various forms. Here we are only interested in the hypercircle inequality in a Hilbert space, which can be used for error analysis of the interpolant. Firstly, we obtain:

Theorem 5.2.1 Let $F$ be any continuous linear functional on $H$ and let $\Pi$ be the orthogonal projection from $H$ to $W$ as defined above. Let $v \in H$. Then we have
$$|F(v) - F(\Pi v)|^2 \le (\|v\|^2 - \|\Pi v\|^2)\, \|\phi - \Pi\phi\|^2,$$
where $\phi \in H$ is the representer of $F$ according to the Riesz representation theorem, $F(v) = \langle v, \phi \rangle$ for all $v \in H$.

Proof. Since $\phi$ is the representer of $F$, we have
$$|F(v) - F(\Pi v)| = |\langle v - \Pi v, \phi \rangle| = |\langle v - \Pi v, \phi \rangle - \langle v - \Pi v, \Pi\phi \rangle| = |\langle v - \Pi v, \phi - \Pi\phi \rangle| \le \|v - \Pi v\|\, \|\phi - \Pi\phi\|, \qquad (5.2)$$
where the second equality follows from relation (5.1) (since $\Pi\phi \in W$) and the last step from the Cauchy-Schwarz inequality. Applying the result of Lemma 5.2.1, we obtain
$$|F(v) - F(\Pi v)|^2 \le (\|v\|^2 - \|\Pi v\|^2)\, \|\phi - \Pi\phi\|^2. \qquad \Box$$

In the light of the above theorem, the hypercircle inequality described in Golomb and Weinberger [19] appears as a corollary. We recall the definition of a hypercircle in $H$,
$$C_r := \{v \in H : \|v\| \le r,\ F_i(v) = f_i,\ i = 1, \dots, M\},$$
where $r > 0$ is some given real number. For all $w \in W^\perp$ we have $\Pi w = 0$, so in this case (5.2) gives $|F(w)| \le \|w\|\, \|\phi - \Pi\phi\|$, with equality if and only if $w$ is a scalar multiple of $\phi - \Pi\phi$. Let
$$y^* = \frac{\phi - \Pi\phi}{\|\phi - \Pi\phi\|}; \qquad (5.3)$$
then $y^*$ is the unit-norm element for which $F|_{W^\perp}$ attains its upper bound, and
$$F(y^*) = \langle y^*, \phi \rangle = \langle y^*, \phi - \Pi\phi \rangle = \|\phi - \Pi\phi\|.$$
Thus, we have the following corollary:

Corollary 5.2.1 Let $F, F_1, \dots, F_M$ be given bounded linear functionals defined over $H$ and let $v^* = \Pi v$, where $\Pi$ is the orthogonal projection as defined above. Then
$$|F(v) - F(v^*)|^2 \le |F(y^*)|^2\, (r^2 - \|v^*\|^2), \qquad v \in C_r,$$
where $y^*$ is the unique element with unit norm for which
$$F(y^*) = \sup_{\substack{F_i(y) = 0,\ i = 1, \dots, M \\ \|y\| = 1}} |F(y)|.$$
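The whole construction is transparent in a finite-dimensional Hilbert space, where the projection $\Pi$ is computed from the Gram matrix of the representers. The following sketch (illustrative only, not from the thesis; Python with numpy, taking $H = \mathbb{R}^{10}$ with the Euclidean inner product) checks the interpolation property $F_i(\Pi v) = F_i(v)$ and the inequality of Theorem 5.2.1:

```python
import numpy as np

rng = np.random.default_rng(2)
d, M = 10, 4

# Representers of the data functionals F_i(v) = <v, phi_i>, and of F.
Phi = rng.standard_normal((d, M))    # columns phi_1..phi_M span W
phi = rng.standard_normal(d)         # representer of F
v = rng.standard_normal(d)

# Orthogonal projection onto W via the Gram system (relation (5.1)).
Gram = Phi.T @ Phi
def project(u):
    c = np.linalg.solve(Gram, Phi.T @ u)
    return Phi @ c

Pv, Pphi = project(v), project(phi)

# Interpolation: the projection matches all the data functionals of v.
assert np.allclose(Phi.T @ Pv, Phi.T @ v)

# Hypercircle inequality (Theorem 5.2.1).
lhs = (phi @ v - phi @ Pv) ** 2
rhs = (v @ v - Pv @ Pv) * ((phi - Pphi) @ (phi - Pphi))
print(lhs, rhs)
assert lhs <= rhs + 1e-12
```

In the RKHS setting of the next section the representers $\phi_i$ become kernel translates $\kappa(\cdot, t_i)$, and the same Gram-system computation yields the kernel interpolant.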

In the above corollary, following Golomb and Weinberger [19], $v^*$ is characterised as the center of the hypercircle $C_r$, with $\|v^*\| = \inf_{F_i(v) = f_i} \|v\|$ and $F_i(v^*) = f_i$ for $i = 1, \dots, M$.

5.3 Positive definite kernels and native space

5.3.1 Positive definite kernels on spheres

The notion of positive definite functions on the spheres $S^{r-1}$ and $S^\infty$ was first introduced by Schoenberg [32] in 1942:

Definition 10 A real continuous function $g(t)$ is said to be positive definite on $S^{r-1}$ if
$$\sum_{i=1}^N \sum_{j=1}^N g(\theta(p_i, p_j))\, x_i x_j \ge 0$$
for any $N$ points $p_1, \dots, p_N$ on $S^{r-1}$, any real numbers $x_1, \dots, x_N$, and all $N = 2, 3, \dots$. Here $\theta$ is the usual geodesic distance on $S^{r-1}$, $\theta(p,q) = \arccos(p \cdot q)$.

In the above definition, if equality for distinct points $p_1, \dots, p_N$ implies $x_i = 0$ for all $i = 1, \dots, N$, then we say that $g$ is strictly positive definite. It follows from the definition that $g(\theta) = f(\cos\theta)$ for some continuous function $f$. Let $P_n^{(\lambda)}(\cos t)$ be the ultraspherical polynomials defined by the expansion
$$(1 - 2r\cos t + r^2)^{-\lambda} = \sum_{n=0}^\infty r^n P_n^{(\lambda)}(\cos t), \qquad (\lambda > 0).$$
For $\lambda = 0$ we set $P_n^{(0)}(\cos t) = \cos nt$. We remark that $P_n^{((r-2)/2)}$ is just the Legendre polynomial of degree $n$ in $\mathbb{R}^r$. Schoenberg established the following result ([32, Theorem 1]).

Theorem 5.3.1 The most general $f(\cos\theta)$ which is positive definite on $S^{r-1}$ is given by the expansion
$$f(\cos\theta) = \sum_{n=0}^\infty a_n P_n^{(\lambda)}(\cos\theta), \qquad \Big(a_n \ge 0,\ \lambda = \tfrac12(r-2)\Big),$$
provided that the series converges for $\theta = 0$.

It follows from the definition that, for a given set of distinct points $p_1, \dots, p_N$ on $S^{r-1}$, the matrix $G$ defined by $G_{ij} := g(\theta(p_i, p_j))$ is positive semi-definite if $g$ is positive definite, and positive definite if $g$ is strictly positive definite. Recently, Cheney [8] put emphasis on strictly positive definite kernels in $C(S^{r-1} \times S^{r-1})$ in the problem of interpolating scattered data on a sphere. Suppose we are given numerical values $a_1, a_2, \dots, a_N$ associated with certain prescribed distinct points $p_1, \dots, p_N$ on $S^{r-1}$. If a strictly positive definite function $g$ is available, then it is possible to interpolate the data by a function of the form
$$F(x) = \sum_{j=1}^N c_j\, g(\theta(x, p_j)), \qquad x \in S^{r-1}.$$

The interpolation conditions become
$$a_i = F(p_i) = \sum_{j=1}^N c_j\, g(\theta(p_i, p_j)), \qquad (1 \le i \le N),$$
or, in matrix notation, $a = Gc$. Since $G$ is symmetric and positive definite, the matrix $G$ is invertible. Xu and Cheney gave conditions for $g$ to be strictly positive definite in the following theorem ([41, Theorem 2]).

Theorem 5.3.2 Let $r$ be a positive integer and set $\lambda = (r-2)/2$. Let
$$g(t) = \sum_{k=0}^\infty a_k P_k^{(\lambda)}(\cos t), \qquad a_k \ge 0, \qquad \sum_{k=0}^\infty a_k P_k^{(\lambda)}(1) < \infty.$$
Let $p_1, p_2, \dots, p_N$ be $N$ distinct points on $S^{r-1}$. In order that the matrix $G$, where $G_{ij} = g(\theta(p_i, p_j))$, be positive definite, it is sufficient that the coefficients $a_k$ be positive for $0 \le k < N$.
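Once $g$ is given by a Legendre expansion with positive coefficients (here $r = 3$, $\lambda = 1/2$, so the $P_k^{(\lambda)}$ are the classical Legendre polynomials), the interpolation system $a = Gc$ is immediate to set up and solve. A sketch (not from the thesis; Python with numpy, and the coefficients $a_k = 2^{-k}$ are an arbitrary illustrative choice satisfying the Xu-Cheney condition):

```python
import numpy as np
from numpy.polynomial.legendre import legval

rng = np.random.default_rng(3)

# Zonal kernel g(cos theta) = sum_k a_k P_k(cos theta), a_k = 2^{-k} > 0,
# truncated at K terms; strictly positive definite by Theorem 5.3.2.
K = 30
ak = 0.5 ** np.arange(K)
g = lambda ctheta: legval(ctheta, ak)

# Scattered distinct points on S^2 and data to interpolate.
N = 12
P = rng.standard_normal((N, 3))
P /= np.linalg.norm(P, axis=1, keepdims=True)
data = rng.standard_normal(N)

G = g(np.clip(P @ P.T, -1.0, 1.0))  # G_ij = g(theta(p_i, p_j))
c = np.linalg.solve(G, data)        # solve a = G c

F = lambda x: g(np.clip(P @ x, -1.0, 1.0)) @ c
residual = np.array([F(p) for p in P]) - data
print(np.max(np.abs(residual)))
assert np.max(np.abs(residual)) < 1e-6
```

For this particular coefficient choice the series even has a closed form, $g(t) = (1 - t + 1/4)^{-1/2} \cdot$ const via the Legendre generating function, but the truncated evaluation above is all the scheme needs.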

5.3. POSITIVE DEFINITE KERNELS AND NATIVE SPACE

63

In a more general setting, proposed by Narcowich [26], we now consider the following absolutely convergent kernel, κ(p, q) =

(r,ℓ) ∞ NX X

aℓk Yℓk (p)Yℓk (q),

ℓ=0 k=1

p, q ∈ S r−1 .

(5.4)

The only assumption we need is that κ(., .) is continuous on S r−1 × S r−1 . There are many ways to achieve this assumption. One way, used by Narcowich [26], is to embed the kernel in an appropriate Sobolev space by the following condition, ∞ X

N (r,ℓ)

(1 + 2λℓ )

ℓ=0

2s

X k=1

a2ℓk < ∞,

(5.5)

for some positive integer $s$, where $\lambda_\ell = \ell(\ell + r - 2)$ are the eigenvalues of the Laplace-Beltrami operator on $S^{r-1}$. It can be proved that condition (5.5) ensures that the kernel lies in the Sobolev space $H^{2s}(S^{r-1} \times S^{r-1})$. By the Sobolev embedding theorem, for $s > (r-1)/2$ the kernel is then continuous, i.e. $\kappa \in C(S^{r-1} \times S^{r-1})$.

For the interpolation problem, we need to know under what conditions the kernel $\kappa(\cdot,\cdot)$ is positive definite or strictly positive definite. Sufficient conditions, which generalise those in theorems 5.3.1 and 5.3.2, are given in the following theorem.

Theorem 5.3.3 For $r \ge 3$, the kernel $\kappa(\cdot,\cdot)$ as defined in (5.4) is positive definite if all the coefficients $a_{\ell k} \ge 0$, and strictly positive definite if all the coefficients $a_{\ell k} > 0$.

Before the proof is given, we need a useful lemma.

Lemma 5.3.1 For a finite set of distinct points $p_1, p_2, \dots, p_N$ on $S^{r-1}$, we can always find a point $p \in S^{r-1}$ so that $p \cdot p_j = \cos\theta(p, p_j)$, $j = 1, \dots, N$, are $N$ distinct real numbers.

Proof. Suppose there exists no such point $p$. Then for every $p \in S^{r-1}$ we can choose a pair of points $p_i, p_j$ in the given set so that $p_i \cdot p = p_j \cdot p$. Now $(p_i - p_j) \cdot t = 0$ for all $t$ on the hypersphere $C_{ij}$ defined by the intersection of $S^{r-1}$ with the hyperplane $\{x \in \mathbb{R}^r : (p_i - p_j) \cdot x = 0\}$. Since the points $p_1, \dots, p_N$ are distinct, $C_{ij} \neq \emptyset$ for all $i, j = 1, \dots, N$. This would lead to $S^{r-1} \subset \bigcup_{i,j=1,\dots,N} C_{ij}$, which is not possible since $S^{r-1}$

CHAPTER 5. OPTIMAL APPROXIMATION ON SPHERES

is of dimension $r-1$ whereas $\bigcup_{i,j=1,\dots,N} C_{ij}$ is of dimension $r-2$. $\Box$

Proof. (of theorem 5.3.3) For a given vector $c = (c_1, \dots, c_N) \in \mathbb{R}^N$, we have
$$\sum_{i=1}^{N} \sum_{j=1}^{N} c_i \kappa(p_i, p_j) c_j = \sum_{i=1}^{N} \sum_{j=1}^{N} c_i \sum_{\ell,k} a_{\ell k} Y_{\ell k}(p_i) Y_{\ell k}(p_j)\, c_j = \sum_{\ell,k} a_{\ell k} \Big( \sum_{i=1}^{N} c_i Y_{\ell k}(p_i) \Big) \Big( \sum_{j=1}^{N} c_j Y_{\ell k}(p_j) \Big) = \sum_{\ell,k} a_{\ell k} \Big( \sum_{i=1}^{N} c_i Y_{\ell k}(p_i) \Big)^2. \tag{5.6}$$

Clearly, if $a_{\ell k} \ge 0$ then the right-hand side is non-negative, so $\kappa$ is a positive definite kernel. If $a_{\ell k} > 0$ and the right-hand side of (5.6) equals $0$, then
$$\sum_{i=1}^{N} c_i Y_{\ell k}(p_i) = 0, \qquad \forall\, \ell \in \mathbb{N},\; k = 1, \dots, N(r,\ell).$$
Since the points $p_i$ are distinct and the number of points $N$ is finite, by lemma 5.3.1 we can choose a fixed pole $p$ on $S^{r-1}$ so that the $p_i \cdot p = \cos(\theta(p_i, p)) =: \cos(\theta_i)$ are distinct. Thus,
$$\sum_{i=1}^{N} c_i Y_{\ell k}(p_i) Y_{\ell k}(p) = 0, \qquad \forall\, \ell \in \mathbb{N},\; k = 1, \dots, N(r,\ell).$$
Summing over $k$ and applying the Addition theorem, we obtain
$$\sum_{i=1}^{N} c_i P_\ell^{(r)}(p_i \cdot p) = 0, \qquad \forall\, \ell \in \mathbb{N}. \tag{5.7}$$

Now, the Legendre polynomials $P_0^{(r)}, \dots, P_{N-1}^{(r)}$ generate the $N$-dimensional space of univariate polynomials of degree $\le N-1$, for which it is well known that interpolation of arbitrary data at any $N$ nodes is possible. Therefore, there exists a univariate polynomial $P$ such that $P(\cos\theta_i) = c_i$. Hence, by taking the appropriate linear combination of the equations (5.7) for $\ell = 0, \dots, N-1$,
$$0 = \sum_{i=1}^{N} c_i P(\cos\theta_i) = \sum_{i=1}^{N} c_i^2,$$
implying $c_i = 0$ for $i = 1, \dots, N$. So the kernel is strictly positive definite. $\Box$
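The positivity argument in the proof is easy to check numerically. The sketch below (an illustration, not part of the thesis) works on $S^2$ (so $r = 3$) with the closed-form kernel whose Legendre coefficients $a_\ell = z^\ell$ are all strictly positive (the family $\kappa_z$ considered later in this section), and evaluates the quadratic form $\sum_{i,j} c_i \kappa(p_i,p_j) c_j$ at random points; the point set, the value $z = 0.5$ and the vector $c$ are arbitrary choices.

```python
import math, random

def kernel(p, q, z=0.5):
    # Closed-form kernel on S^2 whose Legendre coefficients a_l = z^l are all
    # strictly positive (Poisson-type generating function, r = 3).
    t = sum(pi * qi for pi, qi in zip(p, q))          # p.q = cos(theta)
    return (1 - z * z) / (4 * math.pi * (1 + z * z - 2 * z * t) ** 1.5)

def random_unit_vector(rng):
    v = [rng.gauss(0, 1) for _ in range(3)]
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

rng = random.Random(0)
pts = [random_unit_vector(rng) for _ in range(8)]      # distinct p_1, ..., p_N
c = [rng.uniform(-1, 1) for _ in pts]                  # an arbitrary c != 0

# By theorem 5.3.3 the quadratic form sum_ij c_i kappa(p_i, p_j) c_j is > 0.
Q = sum(c[i] * kernel(pts[i], pts[j]) * c[j]
        for i in range(len(pts)) for j in range(len(pts)))
print(Q > 0)
```

By theorem 5.3.3 the printed value is necessarily true for any nonzero $c$, not merely for this random sample.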

5.3.2 Native spaces and multi-level approximation

If $\kappa$ is a strictly positive definite kernel then, upon completion, $\kappa$ defines a Hilbert space, called the native space (a notion introduced by Madych and Nelson [23]), defined as a subspace of square integrable functions on the sphere,
$$H_\kappa := \Big\{ f \in L^2(S^{r-1}) : \|f\|_\kappa^2 = \sum_{\ell,k} \frac{|\hat{f}_{\ell k}|^2}{a_{\ell k}} < \infty \Big\}. \tag{5.8}$$

Here, it is understood that for a function $f \in L^2(S^{r-1})$, $\hat{f}_{\ell k}$ is the Fourier coefficient in the expansion
$$f = \sum_{\ell=0}^{\infty} \sum_{k=1}^{N(r,\ell)} \hat{f}_{\ell k} Y_{\ell k}, \qquad \text{so that} \qquad \hat{f}_{\ell k} := \int_{S^{r-1}} f(x) Y_{\ell k}(x)\, dS.$$

We shall show that $H_\kappa$ is in fact a reproducing kernel Hilbert space.

Theorem 5.3.4 If the kernel $\kappa(p,q)$ as defined in (5.4) is strictly positive definite and $a_{\ell k} > 0$ for all $\ell, k$, then $H_\kappa$ is a reproducing kernel Hilbert space with kernel $\kappa$ with respect to the inner product
$$\langle f, g \rangle_\kappa := \sum_{\ell=0}^{\infty} \sum_{k=1}^{N(r,\ell)} \frac{\hat{f}_{\ell k}\, \hat{g}_{\ell k}}{a_{\ell k}}.$$

Proof. The symmetry of the kernel $\kappa(p,q)$ is obvious. For the first property of a reproducing kernel, we need to check that $\kappa(p,\cdot) \in H_\kappa$. This follows from the definition of $\kappa$ and the continuity of $\kappa(\cdot,\cdot)$. The definition gives
$$\kappa(p,\cdot) = \sum_{\ell=0}^{\infty} \sum_{k=1}^{N(r,\ell)} a_{\ell k} Y_{\ell k}(p) Y_{\ell k}(\cdot),$$
so the Fourier coefficients are $\widehat{\kappa(p,\cdot)}_{\ell k} = a_{\ell k} Y_{\ell k}(p)$. Thus, by the continuity of $\kappa(\cdot,\cdot)$ on the compact set $S^{r-1} \times S^{r-1}$,
$$\|\kappa(p,\cdot)\|_\kappa^2 = \sum_{\ell=0}^{\infty} \sum_{k=1}^{N(r,\ell)} a_{\ell k}\, [Y_{\ell k}(p)]^2 = \kappa(p,p) < \infty.$$
To verify the reproducing property, suppose $f \in H_\kappa$. Then
$$\langle f, \kappa(p,\cdot) \rangle_\kappa = \sum_{\ell=0}^{\infty} \sum_{k=1}^{N(r,\ell)} \frac{\hat{f}_{\ell k}\, a_{\ell k}}{a_{\ell k}} Y_{\ell k}(p) = \sum_{\ell=0}^{\infty} \sum_{k=1}^{N(r,\ell)} \hat{f}_{\ell k} Y_{\ell k}(p) = f(p).$$

Thus, $\kappa(\cdot,\cdot)$ defines a reproducing kernel Hilbert space. $\Box$
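A one-dimensional analogue of the reproducing property can be verified numerically. The sketch below (my illustration, not from the thesis) replaces the spherical harmonics by a finite orthonormal Fourier basis on the circle and checks that $\langle f, \kappa(p,\cdot) \rangle_\kappa = f(p)$; the truncation level, the coefficients $a_k$ and the test point are arbitrary choices.

```python
import math, random

def basis(k, x):
    # Orthonormal Fourier basis on [0, 2*pi): a circle analogue of the Y_lk.
    if k == 0:
        return 1 / math.sqrt(2 * math.pi)
    n = (k + 1) // 2
    trig = math.cos if k % 2 == 1 else math.sin
    return trig(n * x) / math.sqrt(math.pi)

K = 7                                  # truncation level (illustrative)
a = [2.0 ** -k for k in range(K)]      # strictly positive coefficients a_k

rng = random.Random(1)
b = [rng.uniform(-1, 1) for _ in range(K)]     # Fourier coefficients of f

def f(x):
    return sum(b[k] * basis(k, x) for k in range(K))

# kappa(p, .) has Fourier coefficients a_k * e_k(p), so
#   <f, kappa(p, .)>_kappa = sum_k b_k (a_k e_k(p)) / a_k = f(p).
p = 0.7
inner = sum(b[k] * (a[k] * basis(k, p)) / a[k] for k in range(K))
print(abs(inner - f(p)) < 1e-12)
```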

We now consider a family of continuous kernels defined by the generating function of the Legendre polynomials (see [25, page 30]), with parameter $0 < z < 1$:
$$\kappa_z(p,q) = \frac{1}{\omega_{r-1}} \frac{1 - z^2}{(1 + z^2 - 2z\, p \cdot q)^{r/2}} = \frac{1}{\omega_{r-1}} \sum_{\ell=0}^{\infty} N(r,\ell)\, z^\ell P_\ell^{(r)}(p \cdot q).$$

For this family of kernels, it follows from the Addition theorem that the coefficients are $a_{\ell k} = a_\ell = z^\ell$. Choosing $\{z_j\}_{j=1}^{\infty}$ to be an increasing sequence of positive numbers with $\lim_{j\to\infty} z_j = 1$, we obtain a family of native spaces $V_j := H_{\kappa_j}$ for $\kappa_j = \kappa_{z_j}$. Some appropriate choices of $z_j$ are $z_j = e^{-1/j}$, $z_j = 2^{-1/j}$ or $z_j = 1 - 1/(j+1)$. The sequence of native spaces has the following properties.

Lemma 5.3.2 The native spaces $V_j$ defined above satisfy
(i) $V_1 \subset V_2 \subset \dots \subset V_j \subset V_{j+1} \subset \dots \subset L^2(S^{r-1})$;
(ii) $V_j$ is dense in $L^2(S^{r-1})$, for $j = 1, 2, \dots$.

Proof. Property (i) follows directly from the definition of the native spaces in (5.8): taking a function $f \in V_j$, and using $z_j < z_{j+1}$, we have
$$\|f\|_{V_{j+1}}^2 = \sum_{\ell=0}^{\infty} \frac{1}{z_{j+1}^\ell} \sum_{k=1}^{N(r,\ell)} |\hat{f}_{\ell k}|^2 < \|f\|_{V_j}^2 < \infty,$$
thus $f \in V_{j+1}$. To prove property (ii), we observe that $V_j$ contains every polynomial on $S^{r-1}$. Since every continuous function can be approximated by polynomials, and every function in $L^2(S^{r-1})$ can be approximated by continuous functions, we conclude that $V_j$ is dense in $L^2(S^{r-1})$. $\Box$
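The generating-function identity behind this family can be checked numerically for $r = 3$, where $N(3,\ell) = 2\ell + 1$ and $P_\ell^{(3)}$ is the classical Legendre polynomial. The sketch below (an illustration, with arbitrary $z$, $t$ and truncation level) compares a truncated series with the closed form.

```python
import math

def legendre_values(t, L):
    # P_0(t), ..., P_L(t) via the standard three-term recurrence.
    P = [1.0, t]
    for n in range(1, L):
        P.append(((2 * n + 1) * t * P[n] - n * P[n - 1]) / (n + 1))
    return P[:L + 1]

# For r = 3, N(3, l) = 2l + 1 and the identity reads
#   sum_l (2l + 1) z^l P_l(t) = (1 - z^2) / (1 + z^2 - 2 z t)^{3/2}.
z, t, L = 0.6, 0.3, 200
series = sum((2 * l + 1) * z ** l * P
             for l, P in enumerate(legendre_values(t, L)))
closed = (1 - z * z) / (1 + z * z - 2 * z * t) ** 1.5
print(abs(series - closed) < 1e-10)
```

Since $|P_\ell(t)| \le 1$ on $[-1,1]$, the truncation error is bounded by a geometric tail and the truncated sum agrees with the closed form to high accuracy.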

The result of the above lemma gives rise to multilevel approximation on $S^{r-1}$, as studied in [28]. The multilevel approximation technique is formulated in [28] as follows. Assume that there is a sequence of nested spaces
$$\mathcal{W}_0 \subset \mathcal{W}_1 \subset \mathcal{W}_2 \subset \dots \subset \mathcal{W}_m = \mathcal{W}. \tag{5.9}$$


In each space $\mathcal{W}_j$ we pose an approximation problem
$$\inf_{g \in W_j} \|f_j - g\|_{\mathcal{W}_j} = \|f_j - f_j^*\|_{\mathcal{W}_j},$$
approximating an element $f_j \in \mathcal{W}_j$ by elements of a closed subspace $W_j \subset \mathcal{W}_j$. The function $f_j$ is always the residual of the previous step, i.e.
$$f_j := f_{j-1} - f_{j-1}^* \in \mathcal{W}_{j-1}, \qquad 2 \le j \le m, \quad f_1 := f.$$
The final approximation after step $m$ to the input $f = f_1$ is $g_m^* := \sum_{j=1}^{m} f_j^*$, and it lies in the space $V_m := \sum_{j=1}^{m} W_j$, where the sum need not be direct or orthogonal. The spaces $V_m$ are nested as in a multiresolution analysis, but the spaces $W_m$ are not necessarily orthogonal. The intermediate spaces of (5.9) should allow a sequence of recursive Jackson bounds
$$\|f_j - f_j^*\|_{\mathcal{W}_j} \le K_j \|f_j\|_{\mathcal{W}_{j-1}}, \qquad 1 \le j \le m, \tag{5.10}$$
where $K_j$ is a positive constant depending on $\mathcal{W}_j$, $\mathcal{W}_{j-1}$ and $W_j$. The error bounds (5.10) can be applied recursively, with the final result taking the form
$$\|f - g_m^*\|_{\mathcal{W}} = \|f_1 - g_m^*\|_{\mathcal{W}_m} \le \Big( \prod_{j=1}^{m} K_j \Big) \|f_1\|_{\mathcal{W}_0}.$$
To achieve a better result than single-level approximation, multilevel approximation requires $0 < K_j < 1$.
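The residual-correction loop can be sketched in a toy one-dimensional setting. Below, the nested spaces $W_j$ are piecewise constants on $2^j$ cells of $[0,1]$ and each level's fit is the cellwise average; both the spaces and the target function are illustrative choices of mine, not the spherical setting of [28].

```python
import math

xs = [i / 512 for i in range(513)]                 # evaluation grid on [0, 1]
f = [math.sin(2 * math.pi * x) + 0.3 * math.cos(7 * x) for x in xs]

def project(residual, cells):
    # L2-best approximation from piecewise constants on `cells` equal cells.
    out = [0.0] * len(residual)
    n = len(residual)
    for c in range(cells):
        lo, hi = c * n // cells, (c + 1) * n // cells
        avg = sum(residual[lo:hi]) / (hi - lo)
        for i in range(lo, hi):
            out[i] = avg
    return out

residual, approx = f[:], [0.0] * len(f)
errors = []
for j in range(6):                        # levels W_0, W_1, ..., W_5
    fj_star = project(residual, 2 ** j)   # f_j^*: fit the current residual
    approx = [a + s for a, s in zip(approx, fj_star)]
    residual = [r - s for r, s in zip(residual, fj_star)]
    errors.append(math.sqrt(sum(r * r for r in residual) / len(residual)))

# g_m^* = sum_j f_j^*, so f = g_m^* + final residual (telescoping), and the
# discrete L2 error is non-increasing level by level (orthogonal projections).
exact = max(abs(a + r - v) for a, r, v in zip(approx, residual, f))
monotone = all(errors[k + 1] <= errors[k] + 1e-12 for k in range(len(errors) - 1))
print(exact < 1e-9, monotone)
```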

5.4 Interpolation in reproducing kernel Hilbert spaces

In the general Lagrangian interpolation problem, suppose we are given a set of points on the sphere $S^{r-1}$, labelled $T = \{t_1, \dots, t_M\}$, and the values $f(t_1), \dots, f(t_M)$, where $f \in H_\kappa$. In the framework of the previous section, the given linear functionals are then the Dirac $\delta$ functions, i.e. $F_i = \delta_{t_i}$, $i = 1, \dots, M$, $t_i \in T$, and we wish to approximate $F = \delta_p$, $p \in S^{r-1}$. At this point, we recall the reproducing kernel property of $\kappa(\cdot,\cdot)$: $\langle \kappa(p,\cdot), f \rangle_\kappa = f(p)$ for $f \in H_\kappa$, $p \in S^{r-1}$. Thus, the representers of $F, F_1, \dots, F_M$ are $\kappa(p,\cdot), \kappa(t_1,\cdot), \dots, \kappa(t_M,\cdot)$, respectively. These functions are called spherical basis functions in [27], and they are defined independently of any particular choice of spherical coordinate system. The interpolation space is then defined by
$$W = \operatorname{span}\{\kappa(t_1,\cdot), \kappa(t_2,\cdot), \dots, \kappa(t_M,\cdot)\},$$
and its orthogonal complement is
$$W^\perp = \{u : \langle \kappa(t_i,\cdot), u \rangle_\kappa = 0, \; i = 1, \dots, M\}.$$

It is natural to raise the question of existence and uniqueness of the interpolant. We call an element $u \in W$ a solution to the Lagrangian interpolation problem in the native space $H_\kappa$ if
$$u(t_i) = f(t_i), \qquad \forall\, i = 1, \dots, M. \tag{5.11}$$

The following lemma shows that a solution to the interpolation problem exists.

Lemma 5.4.1 For $f \in H_\kappa$, the orthogonal projection $\Pi f$ of $f$ onto $W$, as defined in section 2, is a solution of the Lagrangian interpolation problem.

Proof. We need to show that $\Pi f(t_i) = f(t_i)$ for all $i = 1, \dots, M$. By the definition of $\Pi$,
$$\langle \Pi f - f, \kappa(t_i,\cdot) \rangle_\kappa = 0, \qquad \forall\, i = 1, \dots, M. \tag{5.12}$$
However, in $H_\kappa$ the $\kappa(t_i,\cdot)$ are the representers of the point evaluation functionals $F_i$. Thus, equation (5.12) implies $0 = F_i(\Pi f - f) = \Pi f(t_i) - f(t_i)$ for all $i = 1, \dots, M$, proving the lemma. $\Box$

Since a strictly positive definite kernel is a built-in feature of a native space, we have

Theorem 5.4.1 In the native space $H_\kappa$ defined by the strictly positive definite kernel $\kappa$, there always exists a solution of the Lagrangian interpolation problem. Moreover, the solution is unique.

Proof. Since $u \in W$, we can express $u$ as a linear combination of the $\kappa(t_i,\cdot)$,
$$u(\cdot) = \sum_{j=1}^{M} c_j \kappa(t_j, \cdot), \qquad c_j \in \mathbb{R}.$$
The condition (5.11) now becomes $\sum_{j=1}^{M} c_j \kappa(t_j, t_i) = f(t_i)$, $i = 1, \dots, M$. Since $\kappa$ is strictly positive definite, the matrix $\{\kappa(t_j, t_i)\}$ is (strictly) positive definite and hence invertible. Thus, the existence and uniqueness of the interpolant follows. $\Box$
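The resulting linear system is small and dense, so it can be solved directly. The sketch below (an illustration on $S^2$, with an arbitrary strictly positive definite kernel of the family $\kappa_z$, arbitrary points and an arbitrary target function) assembles $\{\kappa(t_j, t_i)\}$, solves for the coefficients $c_j$, and checks the interpolation conditions (5.11).

```python
import math, random

def kernel(p, q, z=0.5):
    # strictly positive definite kernel on S^2 (coefficients a_l = z^l > 0);
    # one admissible choice among many
    t = sum(a * b for a, b in zip(p, q))
    return (1 - z * z) / (4 * math.pi * (1 + z * z - 2 * z * t) ** 1.5)

def solve(A, b):
    # Gaussian elimination with partial pivoting for the dense system A x = b.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for k in range(n):
        piv = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[piv] = M[piv], M[k]
        for r in range(k + 1, n):
            m = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= m * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

rng = random.Random(2)
def unit():
    v = [rng.gauss(0, 1) for _ in range(3)]
    s = math.sqrt(sum(x * x for x in v))
    return [x / s for x in v]

T = [unit() for _ in range(6)]                  # interpolation points t_1..t_M
def f(p):
    return p[0] * p[1] + p[2]                   # illustrative target function

K = [[kernel(tj, ti) for tj in T] for ti in T]  # invertible by theorem 5.4.1
c = solve(K, [f(t) for t in T])
def u(p):
    return sum(cj * kernel(tj, p) for cj, tj in zip(c, T))

worst = max(abs(u(t) - f(t)) for t in T)        # check u(t_i) = f(t_i)
print(worst < 1e-8)
```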

5.5 Generalized Hermite interpolation on spheres

In order to do generalized interpolation on spheres, we follow the framework of Narcowich et al. [14]. For a given function $\kappa \in C(S^{r-1} \times S^{r-1}) \cap H^{2s}(S^{r-1} \times S^{r-1})$, we may define the tensor product of two distributions $u, v \in H^{-s}(S^{r-1})$ as
$$\langle u \otimes v, \kappa \rangle = \int_{S^{r-1}} u(p) \left( \int_{S^{r-1}} v(q)\, \kappa(p,q)\, dS(q) \right) dS(p) = \int_{S^{r-1}} v(q) \left( \int_{S^{r-1}} u(p)\, \kappa(p,q)\, dS(p) \right) dS(q). \tag{5.13}$$

Dyn, Narcowich and Ward [14, Theorem 2.1] proved

Theorem 5.5.1 Let the kernel $\kappa \in H^{2s}(S^{r-1} \times S^{r-1}) \cap C(S^{r-1} \times S^{r-1})$ be self-adjoint, i.e. $\kappa(p,q) = \kappa(q,p)$. Then $\kappa$ is positive definite if and only if for every $u \in H^{-s}(S^{r-1})$,
$$\langle u \otimes u, \kappa \rangle \ge 0. \tag{5.14}$$

If equality in (5.14) implies that the distribution $u = 0$, then we say that $\kappa$ is strictly positive definite on $S^{r-1}$. The convolution of the kernel with a distribution $u$ is defined as
$$\kappa \star u(p) := \int_{S^{r-1}} \kappa(p,q)\, u(q)\, dS(q).$$

In this framework, Narcowich [26] proposed that the generalized interpolation problem on $S^{r-1}$ be stated as follows: given a linearly independent set $\{u_j\}_{j=1}^{M} \subset H^{-s}(S^{r-1})$, complex numbers $d_j$, $j = 1, \dots, M$, and a positive definite kernel $\kappa$ on $S^{r-1}$, find $u \in U := \operatorname{span}\{u_1, \dots, u_M\}$ such that
$$\int_{S^{r-1}} u_j(p)(\kappa \star u)(p)\, dS(p) = d_j, \qquad j = 1, \dots, M.$$

In the following, we give some examples of distributions that represent point evaluation, point evaluation of a partial derivative, and the average of $f$:
• if $d_j = f(p_j)$ then $u = \delta_{p_j}$;
• if $d_j = \partial_\phi f((\theta_j, \phi_j))$ then $u = -\partial_\phi \left( \dfrac{\delta(\theta - \theta_j)\,\delta(\phi - \phi_j)}{\sin\theta} \right)$;
• if $d_j = \int_{S^{r-1}} f(x)\, dS(x)$ then $u = (\omega_{r-1})^{-1}$.

Theorem 5.5.2 (Narcowich) The generalized interpolation problem has a solution, and the solution is unique, if and only if the kernel $\kappa$ is strictly positive definite.

Proof. The existence and uniqueness of the interpolant in $\operatorname{span}\{u_1, \dots, u_M\}$ follows from the invertibility of the matrix $A$, where
$$A_{jk} := \int_{S^{r-1}} u_j(p)(\kappa \star u_k)(p)\, dS(p), \qquad j, k = 1, \dots, M. \tag{5.15}$$

Firstly, we show that $A$ is self-adjoint, using the self-adjointness of $\kappa$:
$$\begin{aligned}
A_{jk} &= \int_{S^{r-1}} u_j(p)(\kappa \star u_k)(p)\, dS(p) \\
&= \int_{S^{r-1}} u_j(p) \left( \int_{S^{r-1}} \kappa(p,q)\, u_k(q)\, dS(q) \right) dS(p) \\
&= \int_{S^{r-1}} u_j(p) \left( \int_{S^{r-1}} \kappa(q,p)\, u_k(q)\, dS(q) \right) dS(p) \\
&= \int_{S^{r-1}} u_k(q) \left( \int_{S^{r-1}} \kappa(q,p)\, u_j(p)\, dS(p) \right) dS(q) \qquad \text{by (5.13)} \\
&= A_{kj} \qquad \text{by (5.15)}.
\end{aligned}$$
Next, we show that $A$ is positive definite by considering the quadratic form
$$c^* A c = \sum_{j=1}^{M} \sum_{k=1}^{M} \bar{c}_j A_{jk} c_k, \qquad \text{where } c = (c_1, \dots, c_M) \in \mathbb{C}^M.$$


From equation (5.15), we obtain
$$c^* A c = \sum_{j=1}^{M} \sum_{k=1}^{M} \int_{S^{r-1}} \bar{c}_j u_j(p)(\kappa \star c_k u_k)(p)\, dS(p) = \int_{S^{r-1}} \Big( \sum_{j} \bar{c}_j u_j(p) \Big) \Big[ \kappa \star \Big( \sum_{k} c_k u_k \Big) \Big](p)\, dS(p).$$
If we let $u := \sum_{k=1}^{M} c_k u_k$ in the above, then
$$c^* A c = \int_{S^{r-1}} \bar{u}(p)(\kappa \star u)(p)\, dS(p),$$
or, in short notation, $c^* A c = \langle u \otimes u, \kappa \rangle$. Since $\kappa$ is positive definite,
$$c^* A c = \langle u \otimes u, \kappa \rangle \ge 0. \tag{5.16}$$
Suppose now that $A c = 0$; then $\langle u \otimes u, \kappa \rangle = 0$ by (5.16). Since $\kappa$ is strictly positive definite on $S^{r-1}$, $u = 0$. Because the $u_j$ are linearly independent, $u = \sum_{j=1}^{M} c_j u_j = 0$ implies $c_j = 0$ for all $j = 1, \dots, M$, i.e. $c = 0$, and hence the matrix $A$ is (strictly) positive definite. Therefore, the theorem is proved. $\Box$
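A circle analogue ($S^1$) makes the mixed-functional case concrete. In the sketch below (my illustration, not from the thesis), $\kappa$ is the Poisson kernel, whose Fourier coefficients $z^{|n|}$ are positive, and the two functionals are a point evaluation and the mean value over the circle; the matrix $A_{jk}$ is assembled by quadrature and the resulting interpolant is checked against the prescribed data. All numerical parameters are arbitrary choices.

```python
import math

z = 0.5
def kappa(s, t):
    # Poisson kernel on the circle: 1 + 2 sum_n z^n cos(n(s - t)); its Fourier
    # coefficients are positive, so it is strictly positive definite.
    return (1 - z * z) / (1 - 2 * z * math.cos(s - t) + z * z)

def mean(g, n=400):
    # mean value over the circle (trapezoid rule; spectrally accurate here)
    return sum(g(2 * math.pi * i / n) for i in range(n)) / n

t1 = 1.0      # u_1 = point evaluation at t1, u_2 = mean value functional
A = [[kappa(t1, t1), mean(lambda q: kappa(t1, q))],
     [mean(lambda q: kappa(q, t1)), mean(lambda p: mean(lambda q: kappa(p, q)))]]
d = [2.0, 0.5]                       # prescribed data: f*(t1) and mean(f*)

det = A[0][0] * A[1][1] - A[0][1] * A[1][0]      # A invertible (theorem 5.5.2)
c = [(d[0] * A[1][1] - d[1] * A[0][1]) / det,
     (A[0][0] * d[1] - A[1][0] * d[0]) / det]

def fstar(p):                        # f* = kappa * (c1 u_1 + c2 u_2)
    return c[0] * kappa(p, t1) + c[1] * mean(lambda q: kappa(p, q))

ok1 = abs(fstar(t1) - d[0])          # point-evaluation condition
ok2 = abs(mean(fstar) - d[1])        # mean-value condition
print(ok1 < 1e-8, ok2 < 1e-8)
```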

For $u, v \in H^{-s}(S^{r-1})$, we can use $\kappa$ to define an inner product $[u,v]_\kappa := \langle v \otimes u, \kappa \rangle$, where $\langle \cdot, \cdot \rangle$ is the usual $L^2(S^{r-1})$ inner product. The norm associated with this inner product is $\|u\|_\kappa := \sqrt{\langle u \otimes u, \kappa \rangle}$. In the inner product space we have just obtained, the generalized interpolation problem from the beginning of the section is recast as follows. Suppose $f := \kappa \star v$ is the unknown function, but $[v, u_j]_\kappa = d_j$, $j = 1, \dots, M$, are known. We look for an interpolant $f^* := \kappa \star v^*$, with $v^* \in U$, approximating the original function $f$. Since $f^*$ must satisfy the interpolation requirement, we must have $[v^*, u_j]_\kappa = d_j$. Thus $[v - v^*, u_j]_\kappa = 0$ for $j = 1, \dots, M$; in other words, $v - v^*$ is orthogonal to $U = \operatorname{span}\{u_1, \dots, u_M\}$.

Theorem 5.5.3 Let $f, f^*, v, v^*$ be defined as above and let $w$ be an arbitrary distribution in $H^{-s}(S^{r-1})$. Then
$$|\langle w, f - f^* \rangle| \le \operatorname{dist}_\kappa(v, U)\, \operatorname{dist}_\kappa(w, U).$$

Proof. For any $u \in U$,
$$|\langle w, f - f^* \rangle| = |\langle w, \kappa \star v - \kappa \star v^* \rangle| = |[v - v^*, w]_\kappa| = |[v - v^*, w - u]_\kappa| \le \|v - v^*\|_\kappa\, \|w - u\|_\kappa,$$
where the third equality uses $v - v^* \perp U$ and the last step is the Cauchy-Schwarz inequality. Since $v - v^* \perp U$, we have $\|v - v^*\|_\kappa = \operatorname{dist}_\kappa(v, U)$. Since $u \in U$ is arbitrary, we may choose $u$ to be the orthogonal projection of $w$ onto $U$, so that $\|w - u\|_\kappa = \operatorname{dist}_\kappa(w, U)$. Therefore, the theorem is proved. $\Box$

5.6 Error analysis for Lagrangian interpolation

In this section we obtain an error analysis for the Lagrangian interpolation problem on $S^{r-1}$ in the framework set out in the previous section, mainly following Jetter et al. [16]. The Dirac $\delta$ functions play an important role here. It is convenient to introduce the mesh norm of a specified set of points $T$ on $S^{r-1}$ as
$$h(T) := \sup_{p \in S^{r-1}} \theta(p, T), \qquad \text{where } \theta(p, T) := \min_{t \in T} \theta(p, t).$$
We start with the following proposition (cf. [16, Proposition 1]).

Proposition 5.6.1 Suppose a given set of points $T$ has mesh norm $h(T) < 1/(2\ell)$, where $\ell \in \mathbb{N}^+$. Then
$$\max_{t \in T} |\delta_t(Y)| \ge \tfrac{1}{2} \|Y\|_\infty, \qquad \forall\, Y \in H_r^\ell(S^{r-1}).$$

Proof. Without loss of generality, we can pick $Y \in H_r^\ell(S^{r-1})$ such that $\|Y\|_\infty = 1$; then $|Y(p)| = 1$ for some $p \in S^{r-1}$. For a given $\epsilon > 0$, pick $t \in T$ so that $\theta(p, t) < (1 + \epsilon)/(2\ell)$. Using the Markov inequality (see Appendix A.3), we obtain
$$|Y(p) - Y(t)| \le \ell\, \theta(p, t)\, \|Y\|_\infty < (1 + \epsilon)/2.$$
Thus,
$$|Y(t)| = |Y(t) - Y(p) + Y(p)| \ge |Y(p)| - |Y(p) - Y(t)| > (1 - \epsilon)/2.$$
Since this is true for at least one point $t \in T$, we can let $\epsilon \to 0$ to get $\max_{t \in T} |\delta_t(Y)| \ge 1/2$, and hence the result follows. $\Box$
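A circle analogue of this norming-set argument can be tested directly: for a trigonometric polynomial of degree $\ell$ sampled on an equispaced set whose mesh norm is below $1/(2\ell)$, the Bernstein inequality forces the sampled maximum to be at least half the sup norm. The sketch below (an illustration with arbitrary degree and random coefficients) checks this.

```python
import math, random

rng = random.Random(3)
l = 5
coef = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(l + 1)]

def Y(x):
    # a random trigonometric polynomial of degree l (circle analogue of H_r^l)
    return sum(a * math.cos(n * x) + b * math.sin(n * x)
               for n, (a, b) in enumerate(coef))

# Equispaced samples have mesh norm h = pi/N, so h < 1/(2l) needs N > 2*pi*l.
N = int(2 * math.pi * l) + 2
T = [2 * math.pi * i / N for i in range(N)]

sup = max(abs(Y(2 * math.pi * i / 20000)) for i in range(20000))
sample_max = max(abs(Y(t)) for t in T)
print(sample_max >= 0.5 * sup)
```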

Proposition 5.6.2 For continuous linear functionals on $C(S^{r-1})$ of the form $F = \sum_{t \in T} c_t \delta_t$, we have $\|F\| = \sum_{t \in T} |c_t|$.

Proof. For a function $f \in C(S^{r-1})$,
$$\Big| \sum_{t \in T} c_t \delta_t(f) \Big| \le \sum_{t \in T} |c_t|\, |f(t)| \le \|f\|_\infty \sum_{t \in T} |c_t|. \tag{5.17}$$
Thus, the definition of the norm gives $\|F\| \le \sum_{t \in T} |c_t|$. For the reverse inequality, we construct a continuous function $f^*$ with $\|f^*\|_\infty = 1$ and $f^* = \operatorname{sign}(c_t)$ near each $t \in T$. Then, since $\|f^*\|_\infty = 1$,
$$\|F\| \ge |F(f^*)| = \sum_{t \in T} |c_t|.$$
Thus, the result follows. $\Box$

An obvious corollary of the above proposition is that $\|\delta_t\| = 1$ for all $t \in T$. We now introduce the sampling operator
$$\mathcal{T} : C(S^{r-1}) \to C(T), \qquad f \mapsto f|_T.$$
Obviously, $\mathcal{T}$ is a contraction and a bounded linear operator. Here, it is understood that $C(T) := \{f|_T : f \in C(S^{r-1})\}$. The space $C(T)$ can be identified with $\mathbb{R}^M$, where $M = |T|$, equipped with the maximum norm; its dual space $C(T)^*$ is therefore $\operatorname{span}\{\delta_t|_{C(T)} : t \in T\}$. The following theorem, which plays a crucial role in the error analysis, appears in Jetter et al. [16] in a slightly different form; here we state it in the form we need.


Theorem 5.6.1 Let $W$ be a finite dimensional subspace of $C(S^{r-1})$ and assume that
$$\max_{t \in T} |\delta_t(w)| \ge \tfrac{1}{2} \|w\|_\infty, \qquad \forall\, w \in W. \tag{5.18}$$
Then the dual space $W^*$ can be identified with $\operatorname{span}\{\delta_t|_W : t \in T\}$. Moreover, any $g \in W^*$ with $\|g\| = 1$ can be expressed in the form
$$g = \sum_{t \in T} c_t \delta_t|_W, \qquad \sum_{t \in T} |c_t| \le 2.$$

Proof. We consider the sampling operator restricted to $W$, $\mathcal{T}_0 := \mathcal{T}|_W : W \to \mathcal{T}(W) \subset C(T)$. For an element $w \in W$, condition (5.18) gives
$$\|\mathcal{T}_0(w)\|_\infty = \max_{t \in T} |w(t)| = \max_{t \in T} |\delta_t(w)| \ge \tfrac{1}{2} \|w\|_\infty.$$
Therefore $\mathcal{T}_0$ is an isomorphism and $\|\mathcal{T}_0^{-1}\| \le 2$. Now consider the adjoint operator $\mathcal{T}_0^* : (\mathcal{T}(W))^* \to W^*$. We wish to show that $\mathcal{T}_0^*$ is also an isomorphism. Firstly, $\mathcal{T}_0^*$ is a bounded linear operator: by definition, for a given bounded linear functional $f \in (\mathcal{T}(W))^*$,
$$\|\mathcal{T}_0^* f\| = \sup_{w \in W} \{ |(\mathcal{T}_0^* f)(w)| : \|w\|_\infty = 1 \}, \tag{5.19}$$
and the definition of the adjoint operator together with the boundedness of $\mathcal{T}_0$ give
$$|(\mathcal{T}_0^* f)(w)| = |f(\mathcal{T}_0 w)| \le \|f\|\, \|\mathcal{T}_0 w\| \le \|f\|\, \|\mathcal{T}_0\|\, \|w\|_\infty.$$
Thus, in (5.19), since $\|w\|_\infty = 1$, $\|\mathcal{T}_0^* f\| \le \|\mathcal{T}_0\|\, \|f\|$, i.e. $\mathcal{T}_0^*$ is a bounded linear operator. Secondly, by a similar argument, appealing to the boundedness of $\mathcal{T}_0^{-1}$ which has already been proved, $(\mathcal{T}_0^*)^{-1}$ is also bounded. Therefore $\mathcal{T}_0^*$ is an isomorphism and $\|(\mathcal{T}_0^*)^{-1}\| \le \|\mathcal{T}_0^{-1}\| \le 2$. Consequently, for any $g \in W^*$ with $\|g\| = 1$, there exists $f \in (\mathcal{T}(W))^*$ with $\mathcal{T}_0^*(f) = g$ and
$$\|f\| = \|(\mathcal{T}_0^*)^{-1} g\| \le \|(\mathcal{T}_0^*)^{-1}\|\, \|g\| \le 2.$$
Finally, since $\mathcal{T}(W)$ is a subspace of $C(T)$, by the Hahn-Banach theorem every $f \in (\mathcal{T}(W))^*$ can be extended to $F \in (C(T))^*$,
$$F = \sum_{t \in T} c_t \delta_t|_{C(T)}, \qquad \text{so that} \quad f = F|_{\mathcal{T}(W)} \quad \text{and} \quad \|f\| = \|F\| = \sum_{t \in T} |c_t| \le 2.$$

So, for a fixed $g \in W^*$, denote by $F$ as above the extension of the functional $(\mathcal{T}_0^*)^{-1} g$. It remains to verify that $g = F|_W$. This is checked by
$$g(w) = g(\mathcal{T}_0^{-1} \mathcal{T}_0 w) = ((\mathcal{T}_0^*)^{-1} g)(\mathcal{T}_0 w) = \Big( \sum_{t \in T} c_t \delta_t|_{C(T)} \Big)(\mathcal{T}_0 w) = \Big( \sum_{t \in T} c_t \delta_t|_W \Big)(w),$$
for all $w \in W$. Thus, the theorem is proved. $\Box$

Now we apply this result to obtain the estimate for Lagrangian interpolation (cf. [16, Theorem 2]).

Theorem 5.6.2 Suppose the specified set of points $T$ has mesh norm $h(T)$ and $M = |T|$, and let $F, F_1, \dots, F_M$ be point evaluation functionals. Then there exist numbers $c_i$, $i = 1, \dots, M$, such that
$$\Big( F - \sum_{i=1}^{M} c_i F_i \Big) Y = 0 \quad \forall\, Y \in H_r^L(S^{r-1}), \qquad \text{and} \qquad \sum_{i=1}^{M} |c_i| \le 2, \tag{5.20}$$
where $L := \lfloor 1/(2h(T)) \rfloor$. The term $|F(y^*)|^2$ in the hypercircle inequality is bounded by
$$|F(y^*)|^2 \le \frac{5(M+1)}{\omega_{r-1}} \sum_{\ell > L} a_\ell N(r,\ell), \qquad \text{where} \quad a_\ell := \max_{1 \le k \le N(r,\ell)} a_{\ell k}.$$

Proof. The existence of coefficients $c_i$ with $\sum_i |c_i| \le 2$ is given by theorem 5.6.1 together with proposition 5.6.1. By orthogonality and the Fourier expansion of $y^*$,
$$|F(y^*)| = \Big| \Big( F - \sum_{i=1}^{M} c_i F_i \Big) y^* \Big| \le \sum_{\ell=0}^{\infty} \sum_{k=1}^{N(r,\ell)} \Big| \Big( F - \sum_{i=1}^{M} c_i F_i \Big) Y_{\ell k} \Big|\, |\hat{y}^*_{\ell k}|.$$
By condition (5.20), we have
$$\Big( F - \sum_{i=1}^{M} c_i F_i \Big) Y_{\ell k} = 0 \qquad \forall\, \ell \le L,\; 1 \le k \le N(r,\ell).$$


Then the Cauchy-Schwarz inequality gives
$$|F(y^*)|^2 \le \left( \sum_{\ell > L} \sum_{k=1}^{N(r,\ell)} \frac{|\hat{y}^*_{\ell k}|^2}{a_{\ell k}} \right) \left( \sum_{\ell > L} \sum_{k=1}^{N(r,\ell)} a_{\ell k} \Big| \Big( F - \sum_{i=1}^{M} c_i F_i \Big) Y_{\ell k} \Big|^2 \right) \le \sum_{\ell > L} \sum_{k=1}^{N(r,\ell)} a_{\ell k} \Big| \Big( F - \sum_{i=1}^{M} c_i F_i \Big) Y_{\ell k} \Big|^2, \tag{5.21}$$
since the first sum is bounded by $\|y^*\|_\kappa^2 = 1$. Now we estimate the inner sum in (5.21) by using the Addition theorem, with $t_0 := p$ and $c_0 := -1$:
$$\sum_{k=1}^{N(r,\ell)} a_{\ell k} \Big| \Big( F - \sum_{i=1}^{M} c_i F_i \Big) Y_{\ell k} \Big|^2 = \sum_{k=1}^{N(r,\ell)} a_{\ell k} \Big| \sum_{i=0}^{M} c_i Y_{\ell k}(t_i) \Big|^2 \le a_\ell \sum_{k=1}^{N(r,\ell)} \sum_{i,j=0}^{M} c_i c_j Y_{\ell k}(t_i) Y_{\ell k}(t_j) = a_\ell \frac{N(r,\ell)}{\omega_{r-1}} \sum_{i,j=0}^{M} c_i c_j P_\ell^{(r)}(t_i \cdot t_j).$$
The matrix $P := \big[ P_\ell^{(r)}(t_i \cdot t_j) \big]$, $i, j = 0, \dots, M$, is symmetric and positive semi-definite. Thus, there exist an orthogonal matrix $A$ and eigenvalues $0 \le \lambda_0 \le \lambda_1 \le \dots \le \lambda_M$ so that $P = A^T \operatorname{diag}(\lambda_0, \dots, \lambda_M) A$, and hence
$$\sum_{i,j=0}^{M} c_i c_j P_\ell^{(r)}(t_i \cdot t_j) = c^T P c \le c^T A^T \operatorname{diag}(\lambda_M, \dots, \lambda_M) A c = \lambda_M \sum_{i=0}^{M} c_i^2.$$
The proof is finished by the assumption (5.20), which gives
$$\sum_{i=0}^{M} |c_i|^2 \le |c_0|^2 + \Big( \sum_{i=1}^{M} |c_i| \Big)^2 \le 5,$$
together with $\lambda_M \le \operatorname{trace}(P) = M + 1$. $\Box$

Since $\|f - \Pi f\|_\kappa \le \|f\|_\kappa$ for all $f \in H_\kappa$, the hypercircle inequality and the above theorem give the following corollary.

Corollary 5.6.1 Let $T$ be a given set of data points with mesh norm $h(T)$ and $M = |T|$. For a function $f \in H_\kappa$, let $u \in W$ be the interpolant of $f$. Then
$$\|f - u\|_\infty \le \left( \frac{5(M+1)}{\omega_{r-1}} \sum_{\ell = L+1}^{\infty} a_\ell N(r,\ell) \right)^{1/2} \|f\|_\kappa,$$
where $L := \lfloor 1/(2h(T)) \rfloor$.
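The tail sum in this bound is easy to evaluate for the family $a_\ell = z^\ell$ on $S^2$, where $N(3,\ell) = 2\ell + 1$. The sketch below (illustrative parameters of my choosing) computes $\sum_{\ell > L} (2\ell+1) z^\ell$ for $L = \lfloor 1/(2h) \rfloor$ and a sequence of decreasing mesh norms $h$, showing the geometric decay of the error bound.

```python
# Tail of the error bound for r = 3 and a_l = z^l (so N(3, l) = 2l + 1):
# as h(T) decreases, L = floor(1/(2h)) grows and the tail decays.
z = 0.8
def tail(L, cutoff=2000):
    # truncation at `cutoff` is harmless: z^2000 underflows to ~0
    return sum((2 * l + 1) * z ** l for l in range(L + 1, cutoff))

bounds = [tail(int(1 / (2 * h))) for h in (0.5, 0.25, 0.125, 0.0625)]
decaying = all(b2 < b1 for b1, b2 in zip(bounds, bounds[1:]))
print(decaying)
```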

Appendix A

A.1 Hahn-Banach theorem

Let $V$ be a Banach space, $W$ a linear subspace of $V$, and $p$ a semi-norm defined on $V$. Suppose that $f$ is a linear functional on $W$ satisfying $|f(v)| \le p(v)$ for $v \in W$. Then there exists an extension of $f$ to a linear functional $F$ on $V$ (with $F = f$ on $W$) such that $|F(v)| \le p(v)$ for $v \in V$. In case $V$ is a Hilbert space and $p$ is the associated norm, the result follows readily from the orthogonal decomposition of $V$.

A.2 Laplace-Beltrami operator on $S^{r-1}$

Firstly, we express $x \in \mathbb{R}^r$ in polar coordinates,
$$x = \rho \big( t\, e_r + \sqrt{1 - t^2}\, z_{r-1} \big), \tag{A.1}$$
where $z_{r-1}$ is a unit vector in the span of the unit vectors $e_1, e_2, \dots, e_{r-1}$, i.e. a point of $S^{r-2}$. Suppose now that $v_1, \dots, v_{r-2}$ is some coordinate representation of $S^{r-2}$; we then set
$$u_r = \rho; \qquad u_{r-1} = t; \qquad u_i = v_i \quad \text{for } i = 1, \dots, r-2,$$
so that the unit vector $z_r = x/\rho$ is a function of $t$ and $v_1, \dots, v_{r-2}$, that is, of $u_1, \dots, u_{r-1}$. Let
$$g_{ik} = \frac{\partial z_r}{\partial u_i} \cdot \frac{\partial z_r}{\partial u_k}; \qquad g = \det(g_{ik}); \qquad g^{ik} g_{ij} = \delta_j^k, \qquad i, k = 1, 2, \dots, r-1;$$


we form the Laplace-Beltrami operator for $S^{r-1}$ as
$$\Delta^* = \frac{1}{\sqrt{g}} \sum_{i=1}^{r-1} \sum_{k=1}^{r-1} \frac{\partial}{\partial u_i} \left( \sqrt{g}\, g^{ik} \frac{\partial}{\partial u_k} \right).$$
From equation (A.1), for $i, k = 1, 2, \dots, r-1$,
$$\frac{\partial x}{\partial u_r} = z_r; \qquad \frac{\partial x}{\partial u_i} = \rho \frac{\partial z_r}{\partial u_i}; \qquad \frac{\partial x}{\partial u_r} \cdot \frac{\partial x}{\partial u_r} = 1; \qquad \frac{\partial x}{\partial u_i} \cdot \frac{\partial x}{\partial u_k} = \rho^2 g_{ik}; \qquad \frac{\partial x}{\partial u_i} \cdot \frac{\partial x}{\partial u_r} = 0,$$
and then we obtain
$$\Delta = \frac{\partial^2}{\partial \rho^2} + \frac{r-1}{\rho} \frac{\partial}{\partial \rho} + \frac{1}{\rho^2} \Delta^*.$$
As $\rho^\ell Y_\ell(x)$ is a harmonic function, we get
$$0 = \Delta \big( \rho^\ell Y_\ell(x) \big) = \ell(\ell + r - 2)\, \rho^{\ell-2} Y_\ell(x) + \rho^{\ell-2} \Delta^* Y_\ell(x),$$
which gives $\Delta^* Y_\ell = -\ell(\ell + r - 2) Y_\ell$, i.e.

Theorem A.2.1 $H_r^\ell(S^{r-1})$ is the eigenspace of the Laplace-Beltrami operator with respect to the eigenvalue $-\ell(\ell + r - 2)$.
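For $S^2$ the statement can be verified by finite differences: in spherical coordinates $(\theta, \phi)$ the operator reads $\Delta^* = \frac{1}{\sin\theta}\partial_\theta(\sin\theta\,\partial_\theta) + \frac{1}{\sin^2\theta}\partial_\phi^2$. The sketch below (my check, with an arbitrary evaluation point and step size) applies it to the degree-one harmonic $Y(\theta,\phi) = \sin\theta\cos\phi$ and compares with $-\ell(\ell + r - 2)Y = -2Y$.

```python
import math

def Y(t, p):
    return math.sin(t) * math.cos(p)     # degree-1 spherical harmonic on S^2

def laplace_beltrami(f, t, p, h=1e-4):
    # central finite differences for
    #   Delta* = d^2/dt^2 + cot(t) d/dt + (1/sin^2 t) d^2/dp^2
    d2t = (f(t + h, p) - 2 * f(t, p) + f(t - h, p)) / h ** 2
    dt = (f(t + h, p) - f(t - h, p)) / (2 * h)
    d2p = (f(t, p + h) - 2 * f(t, p) + f(t, p - h)) / h ** 2
    return d2t + (math.cos(t) / math.sin(t)) * dt + d2p / math.sin(t) ** 2

t0, p0 = 1.0, 0.5
err = abs(laplace_beltrami(Y, t0, p0) + 2 * Y(t0, p0))   # l(l + r - 2) = 2
print(err < 1e-5)
```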

A.3 Markov inequality

(cf. Jetter et al. [16]) The restriction of any spherical harmonic $Y_\ell$ of order $\ell$ to a great circle (the geodesic curves on the sphere) is a univariate trigonometric polynomial of degree $\le \ell$. Hence the classical Bernstein inequality implies
$$|D_T Y_\ell(p)| \le \ell\, \|Y_\ell\|_\infty,$$
where $D_T$ denotes any unit tangential derivative at $p$, the supremum norm is taken on $S^{r-1}$, and $Y_\ell \in H_r^\ell(S^{r-1})$. In integrated form, for $Y \in H_r^\ell(S^{r-1})$,
$$|Y(p) - Y(q)| \le \ell\, \theta(p, q)\, \|Y\|_\infty, \qquad p, q \in S^{r-1}.$$


A.4 Sobolev spaces

The Sobolev space $H^s(S^{r-1})$ is defined as
$$H^s(S^{r-1}) = \{ u \in L^2(S^{r-1}) : D^m u \in L^2(S^{r-1}) \text{ for } |m| \le s \},$$
where $m = (m_1, m_2, \dots, m_{r-1})$ is a multi-index and
$$D^m = \frac{\partial^{|m|}}{\partial x_1^{m_1}\, \partial x_2^{m_2} \cdots \partial x_{r-1}^{m_{r-1}}}.$$

It is known from Fourier analysis that membership of $H^s$ can be characterised through the Fourier transform $\hat{u}(\xi)$:
$$u \in H^s \iff (1 + |\xi|^2)^{s/2}\, \hat{u} \in L^2.$$

Theorem A.4.1 (Sobolev embedding theorem) If $s > (r-1)/2$, then every $u \in H^s(S^{r-1})$ is bounded and continuous.

Proof. By the Fourier inversion formula, it suffices to prove that $\hat{u} \in L^1$. Indeed, using the Cauchy-Schwarz inequality,
$$\int |\hat{u}(\xi)|\, d\xi \le \left( \int |\hat{u}(\xi)|^2 (1 + |\xi|^2)^s\, d\xi \right)^{1/2} \left( \int (1 + |\xi|^2)^{-s}\, d\xi \right)^{1/2}.$$
The last integral, taken over the $(r-1)$-dimensional frequency domain, is finite for $s > (r-1)/2$, from which the result follows. $\Box$
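The convergence condition can be illustrated with a two-dimensional lattice sum (so the dimension is $r - 1 = 2$ and the threshold is $s = 1$); the sketch below is a discrete analogue of the integral in the proof, not the integral itself, comparing partial sums for $s = 2$ (convergent, small tail) and $s = 1$ (logarithmically divergent).

```python
def box_sum(s, R):
    # partial sum of (1 + |xi|^2)^(-s) over the integer lattice box [-R, R]^2
    return sum((1.0 + i * i + j * j) ** -s
               for i in range(-R, R + 1) for j in range(-R, R + 1))

tail_conv = abs(box_sum(2.0, 200) - box_sum(2.0, 100))  # s = 2 > 1: converges
tail_div = box_sum(1.0, 200) - box_sum(1.0, 100)        # s = 1: log growth
print(tail_conv < 1e-3, tail_div > 1.0)
```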

Bibliography

[1] K. Atkinson, Numerical integration on the sphere, J. Austral. Math. Soc. Ser. B 23, 332-347 (1982).
[2] N. Aronszajn, Theory of reproducing kernels, Trans. Amer. Math. Soc., 337-404 (1950).
[3] B. Bajnok, Construction of spherical t-designs, Geometriae Dedicata 43, no. 2, 167-179 (1992).
[4] E. Bannai, R. M. Damerell, Tight spherical designs, I, J. Math. Soc. Japan 31, 199-207 (1979).
[5] D. L. Berman, On a class of linear operators, Dokl. Akad. Nauk SSSR 85, 13-16 (1952). (Russian)
[6] L. Bos, Some remarks on the Fejér problem for Lagrange interpolation in several variables, J. Approximation Theory 60, 133-140 (1990).
[7] E. W. Cheney, Multivariate Approximation Theory: Selected Topics, Regional Conference Series in Applied Mathematics, SIAM (1986).
[8] E. W. Cheney, Approximation using positive definite functions, in Approximation Theory VIII, vol. 1: Approximation and Interpolation (C. K. Chui and L. L. Schumaker, eds.), 145-168, World Scientific, Singapore (1995).
[9] E. W. Cheney, Introduction to Approximation Theory, McGraw-Hill (1966).
[10] I. K. Daugavet, Some applications of the Marcinkiewicz-Berman identity, Vestnik Leningrad University, Math. 1, 321-327 (1974).
[11] P. J. Davis, Interpolation and Approximation, Blaisdell Publishing (1965).
[12] P. Delsarte, J. M. Goethals, J. J. Seidel, Spherical codes and designs, Geom. Dedicata 6, 363-388 (1977).
[13] J. R. Driscoll, D. M. Healy, Computing Fourier transforms and convolutions on the 2-sphere, Advances in Applied Mathematics 15, 202-250 (1994).
[14] N. Dyn, F. J. Narcowich, J. D. Ward, Variational principles and Sobolev-type estimates for generalized interpolation on a Riemannian manifold, Constr. Approx. (to appear).
[15] P. Erdős, P. Turán, On interpolation. I. Quadrature and mean convergence in the Lagrange interpolation, Ann. Math. 38, 145-155 (1937).
[16] K. Jetter, J. Stöckler, J. D. Ward, Error estimates for scattered data interpolation on spheres, preprint (1997).
[17] W. Freeden, E. W. Grafarend, Mathematische Methoden der Geodäsie, Oberwolfach Conference (1995).
[18] W. Freeden, U. Windheuser, Combined spherical harmonic and wavelet expansion - a future concept in Earth's gravitational determination, Appl. Comp. Harm. Analysis 4, 1-37 (1997).
[19] M. Golomb, H. F. Weinberger, Optimal approximation and error bounds, pp. 117-190 in On Numerical Approximation (R. E. Langer, ed.), Madison (1959).
[20] G. H. Golub, C. F. Van Loan, Matrix Computations, Johns Hopkins Press, Baltimore (1983).
[21] T. H. Gronwall, On the degree of the convergence of the Laplace series, Trans. Amer. Math. Soc. 15, 1-30 (1914).
[22] S. Karlin, W. J. Studden, Tchebycheff Systems: With Applications in Analysis and Statistics, Interscience, New York (1966).
[23] W. R. Madych, S. A. Nelson, Multivariate interpolation: a variational theory, preprint.
[24] U. Maier, J. Fliege, Charge distribution of points on the sphere and corresponding cubature formulae, in Multivariate Approximation: Recent Trends and Results, 147-159, Mathematical Research vol. 101, Akademie-Verlag, Berlin (1997).
[25] C. Müller, Spherical Harmonics, Lecture Notes in Mathematics, vol. 17, Springer-Verlag, Berlin-Heidelberg (1966).
[26] F. J. Narcowich, Generalized Hermite interpolation and positive definite kernels on a Riemannian manifold, J. Math. Analysis and Appl. 190, 165-193 (1995).
[27] F. J. Narcowich, J. D. Ward, Nonstationary wavelets on the m-sphere for scattered data, Appl. Comp. Harm. Analysis 3, 324-336 (1996).
[28] F. J. Narcowich, R. Schaback, J. D. Ward, Multilevel interpolation and approximation, Texas A&M University, College Station, preprint (1997).
[29] M. J. D. Powell, The theory of radial basis approximation in 1990, in Wavelets, Subdivision and Radial Functions (W. Light, ed.), Oxford University Press, Oxford (1990).
[30] M. Reimer, Constructive Theory of Multivariate Functions, Wissenschaftsverlag, Mannheim, Wien, Zürich (1990).
[31] M. Reimer, Interpolation on the sphere and bounds for the Lagrangian square sums, Results in Mathematics 11 (1987).
[32] I. J. Schoenberg, Positive definite functions on spheres, Duke Math. Journal 9, 96-108 (1942).
[33] P. D. Seymour, T. Zaslavsky, Averaging sets: a generalization of mean values and spherical designs, Advances in Mathematics 52, 213-240 (1984).
[34] I. H. Sloan, Hyperinterpolation over general regions, Journal of Approximation Theory (1995).
[35] I. H. Sloan, Interpolation and hyperinterpolation on the sphere, in Multivariate Approximation: Recent Trends and Results, Mathematical Research vol. 101, Akademie-Verlag, Berlin (1997).
[36] I. H. Sloan, R. S. Womersley, Constructive polynomial approximation on the sphere, manuscript (1998).
[37] N. J. Sloane, Encrypting by random rotations, in Cryptography, Lecture Notes in Computer Science 149 (T. Beth, ed.), Berlin (1983).
[38] A. H. Stroud, Approximate Calculation of Multiple Integrals, Prentice-Hall (1971).
[39] G. Szegő, Orthogonal Polynomials, AMS, New York (1939).
[40] W. Rudin, Principles of Mathematical Analysis, McGraw-Hill (1989).
[41] Y. Xu, E. W. Cheney, Strictly positive definite functions on spheres, Proc. Amer. Math. Soc. 116, 201-215 (1992).