Block Thresholding on the Sphere
Claudio Durastanti
University of "Tor Vergata", Rome
March 5, 2013
Abstract
The aim of this paper is to study the nonparametric regression estimators on the sphere built by needlet block thresholding. The block thresholding procedure proposed here follows the method introduced by Hall, Kerkyacharian and Picard in [24], [25], modified to exploit the properties of standard spherical needlets. We then investigate the convergence rates of these estimators and establish their adaptive properties over Besov balls. This work is strongly motivated by issues arising in Cosmology and Astrophysics, concerning in particular the analysis of cosmic rays.
AMS classification: 62G08, 62G20, 65T60
Keywords: Block Thresholding, Needlets, Spherical Data, Nonparametric Regression
1 Introduction
Over the last years, wavelet techniques have been used to achieve remarkable results in the field of statistics, in particular in the framework of minimax estimation in nonparametric settings. The pioneering work in this area was provided by Donoho et al. in [11], where the authors proved that nonlinear wavelet estimators based on thresholding techniques attain nearly optimal minimax rates, up to logarithmic terms, for a large class of unknown density and regression functions. Since then, this research area has been deeply investigated and extended; we suggest for instance [23] as a textbook reference. Specifically, among the available techniques, we focus on the wavelet block thresholding procedure. Loosely speaking, this method keeps or annihilates blocks of wavelet coefficients on each given level (for more details, see [23]), hence representing an intermediate way between local and global thresholding, which fix a threshold respectively for each coefficient and for the whole set of them. This procedure, initially suggested in [15] for orthogonal series based estimators and later applied by [24] to both wavelet and kernel density estimation on R (see also [25]), was used in [5] jointly with oracle inequalities; overlapping block thresholding estimators were studied in [6]. Block thresholding was also applied to study adaptivity in density estimation in [9]; a data-driven block thresholding procedure for wavelet regression is investigated in [7], while wavelet-based block thresholding rules on maxisets are proposed in [1].

Even if a huge number of results concern estimation with the thresholding paradigm in standard Euclidean frameworks, such as R or R^n, more recent applications are being established in more general settings, such as spherical data or more general manifolds. In particular, we refer to a highly successful construction of a second-generation wavelet system on the sphere, the so-called needlets. Needlets were introduced by Narcowich, Petrushev and Ward in [36], [37]; their stochastic properties, when applied to spherical random fields, were studied in [2], [3], [32] and [33]. This approach has been extended to more general manifolds by [20], [21], [22], while its generalization to spin fiber bundles on the sphere, the so-called spin (pure and mixed) needlets, was described in [18], [19]. Most of this research can be motivated by its applications to Cosmology and Astrophysics: for instance, a huge amount of spherical data concerning the Cosmic Microwave Background radiation is being provided by the satellite missions WMAP and Planck, see [39], [35], [40], [16], [41], [42], [10], [43], [13] and [14] for more details. The applications mentioned here, however, do not concern thresholding estimation; rather, they are related to the study of random fields on the sphere, such as angular power spectrum estimation, higher-order spectra, testing for Gaussianity and isotropy, and several others (see also [8]). As another example, we mention experiments concerning the incoming directions of Ultra High Energy Cosmic Rays, such as the AUGER Observatory (http://www.auger.org).

* Research supported by ERC Grant n. 277742 Pascal.
† E-mail: [email protected]
The Ultra-High Energy Cosmic Rays are particles with energy above 10^18 eV reaching the Earth. Even if they were discovered almost a century ago, their origin and their mechanisms of acceleration and propagation are still unknown. As described in [4], see also [17], an efficient nonparametric estimation of the density function of these data would help to explain the origin of the High Energy Cosmic Rays: if the density is uniform, they are generated by cosmological effects, such as the decay of the massive particles generated during the Big Bang; if, on the other hand, it is highly non-uniform and, moreover, strongly correlated with the local distribution of nearby Galaxies, they are generated by astrophysical phenomena, as for instance the acceleration into Active Galactic Nuclei. Massive amounts of data in this area are expected to be available in the next few years. Also in view of this application, the needlet approach was recently applied within the thresholding paradigm to the estimation of directional data: the seminal contribution in this field is due to [4], see also [28], [27], while applications to astrophysical data are still under way, see for instance [16], [17] and [26]. Minimax estimators for spherical data, outside the needlet approach, were also studied by Kim and coauthors (see [30], [29], [31]). Furthermore, adaptive nonparametric regression estimators of spin functions, based on the spin pure and mixed needlets defined in [18], [19], were investigated in [12]. In this case, the needlet nonparametric regression estimators were built on spin fiber bundles on the sphere, i.e. the
function to be estimated does not take scalars as its values but algebraic curves living on the tangent plane at each point of the sphere. This work, hence, extends the results established in [4] and [12] towards the needlet block thresholding procedure, following two main directions. First of all, we will suggest a construction of blocks of needlet coefficients, exploiting the Voronoi cells based on the geodesic distance on the sphere. Then, we will define the needlet block thresholding estimator, for which we will establish an optimal convergence rate. To this aim, we will use both the needlet properties established in [36], [37] (see also [34]) and a set of well-consolidated standard techniques introduced by [11] (see also [23]), remarking that this kind of approach has also been applied within the needlet framework to local thresholding by [4] and [12]. Section 2 will recall some preliminary notions, such as needlets, their main properties and the Besov spaces. Section 3 will describe the block thresholding procedure we build for needlet regression estimation, while Section 4 will present the main minimax results. Section 5 will collect some auxiliary probabilistic results, while Section 6 will provide the proof of the main result of this work, Theorem 1.
2 Background results
In this Section, we briefly review a few well-known facts about Voronoi cells on the sphere, the spherical needlet construction and Besov spaces. Concerning Voronoi cells, we follow [3] closely; further details can be found for instance in the textbook [34], see also [2] and [37]. From now on, given two positive sequences $\{a_j\}$ and $\{b_j\}$, we write $a_j \approx b_j$ if there exists a constant $c > 0$ such that $c^{-1} a_j \le b_j \le c a_j$ for all $j$. Furthermore, $B_{x_0}(\alpha) = \{x \in S^2 : d(x, x_0) < \alpha\}$ and $\bar{B}_{x_0}(\alpha) = \{x \in S^2 : d(x, x_0) \le \alpha\}$ denote respectively the standard open and closed balls on $S^2$ around $x_0 \in S^2$, while $|A|$ is the spherical measure of a general subset $A \subset S^2$. Given $\varepsilon > 0$, the set $\Xi_\varepsilon = \{x_1, \dots, x_N\}$ of points on $S^2$, such that for $i \ne j$ we have $d(x_i, x_j) > \varepsilon$, is called a maximal $\varepsilon$-net if it satisfies $d(x, \Xi_\varepsilon) < \varepsilon$ for $x \in S^2$, $\cup_{x_i \in \Xi_\varepsilon} B_{x_i}(\varepsilon) = S^2$ and $B_{x_i}(\varepsilon/2) \cap B_{x_j}(\varepsilon/2) = \emptyset$ for $i \ne j$. For all $x_i \in \Xi_\varepsilon$, a family of Voronoi cells is defined as
\[ V(x_i) = \{x \in S^2 : \text{for } j \ne i,\ d(x, x_i) < d(x, x_j)\}. \tag{1} \]
In [3] it is proved that
\[ B_{x_i}(\varepsilon/2) \subset V(x_i) \subset B_{x_i}(\varepsilon). \]
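The net and the Voronoi cells described above can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the net is extracted greedily from a finite cloud of random candidate points, so it is maximal only relative to that cloud and not over the whole sphere, and all function names are hypothetical.

```python
import math
import random

def random_point_on_sphere(rng):
    # A normalised Gaussian vector is uniformly distributed on S^2.
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return tuple(c / norm for c in v)

def geodesic_distance(x, y):
    # Great-circle distance between two unit vectors.
    dot = sum(a * b for a, b in zip(x, y))
    return math.acos(max(-1.0, min(1.0, dot)))

def greedy_net(points, eps):
    # Keep a point only if it is farther than eps from every point kept so far.
    net = []
    for p in points:
        if all(geodesic_distance(p, c) > eps for c in net):
            net.append(p)
    return net

def voronoi_index(x, net):
    # Index of the net point closest to x, i.e. the Voronoi cell containing x.
    return min(range(len(net)), key=lambda i: geodesic_distance(x, net[i]))

rng = random.Random(0)
candidates = [random_point_on_sphere(rng) for _ in range(2000)]
eps = 0.5
net = greedy_net(candidates, eps)
cells = [voronoi_index(x, net) for x in candidates]
```

By construction the selected points are pairwise more than `eps` apart, and every candidate lies within `eps` of the centre of its cell, which mirrors the inclusion of each Voronoi cell in the ball of radius $\varepsilon$.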
We now recall the construction of the scalar needlet framework, referring to [36], [37] for a more detailed discussion, see also [4] and [34]. A needlet system is a well-localized tight frame on the sphere: it is a well-known fact (cf. [36]) that any function belonging to $L^2(S^2)$ can be represented as a linear combination of the components of that frame, preserving furthermore some of the most relevant properties of needlets. Indeed, let us recall that the space $L^2(S^2)$ of square-integrable functions on the sphere can be decomposed as the direct sum of the spaces $H_l$ of harmonic polynomials of degree $l$, spanned by the spherical harmonics $\{Y_{lm}\}_{m=-l}^{l}$, whose definition and properties can be found in [44] and [4]. If we consider
\[ \mathcal{K}_l = \bigoplus_{l'=0}^{l} H_{l'}, \]
the space of the restrictions to $S^2$ of the polynomials of degree less than or equal to $l$, the following quadrature formula holds (see for instance [4]): given $l \in \mathbb{N}$, there exists a finite subset $\mathcal{X}_l \subset S^2$ such that a positive real number $\lambda_\xi$ (the cubature weight) corresponds to each $\xi \in \mathcal{X}_l$ (the cubature point) and, for all $f \in \mathcal{K}_l$,
\[ \int_{S^2} f(x)\,dx = \sum_{\xi \in \mathcal{X}_l} \lambda_\xi f(\xi). \]
Given $B > 1$ and a resolution level $j$, we call $\mathcal{X}_{[B^{2(j+1)}]} = Z_j$, $\mathrm{card}(Z_j) = N_j$; from now on, any element of the set of cubature points and weights, $\{\xi_{jk}, \lambda_{jk}\}$, will be indexed by $j$, the resolution level, and $k$, the cardinality over $j$, belonging to $Z_j$. Furthermore, we choose $\{Z_j\}_{j \ge 1}$ to be nested so that
\[ N_j \approx B^{2j}, \qquad \lambda_{jk} \approx B^{-2j}. \tag{2} \]
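We consider a symmetric, real-valued, non-negative function $b(\cdot)$ (see again [4]) such that
1. it has compact support on $[B^{-1}, B]$;
2. $b \in C^\infty(\mathbb{R})$;
3. the following unitary property holds for $|\xi| \ge 1$:
\[ \sum_{j \ge 0} b^2\left(\frac{\xi}{B^j}\right) = 1. \]
For each $\xi_{jk} \in Z_j$, given $b(\cdot)$ and $B$, the scalar needlets are defined as
\[ \psi_{jk}(x) = \sqrt{\lambda_{jk}} \sum b \ \dots \]
[...] The Besov norm is defined as follows:
\[ \|f\|_{B^r_{\pi q}} = \begin{cases} \|f\|_{L^\pi(S^2)} + \left[ \sum_j B^{jq(r + \frac{1}{2} - \frac{1}{\pi})} \left( \sum_k |\beta_{jk}|^\pi \right)^{q/\pi} \right]^{1/q}, & q < \infty, \\ \|f\|_{L^\pi(S^2)} + \sup_j B^{j(r + \frac{1}{2} - \frac{1}{\pi})} \left\| (\beta_{jk})_k \right\|_{\ell^\pi}, & q = \infty. \end{cases} \]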
As shown for instance in [4], if $\max(0, 1/\pi - 1/q) < r$ and $\pi, q \ge 1$, then we have
\[ f \in B^r_{\pi q} \iff \|f\|_{B^r_{\pi q}} < \infty. \]
Besov spaces satisfy, among other properties, some embeddings which will be pivotal in our proofs below. As proven in [4] and [12], for $\pi_1 \le \pi_2$ and $q_1 \le q_2$ we have
\[ B^r_{\pi q_1} \subset B^r_{\pi q_2}, \qquad B^r_{\pi_2 q} \subset B^r_{\pi_1 q}, \qquad B^r_{\pi_1 q} \subset B^{r - \frac{1}{\pi_1} + \frac{1}{\pi_2}}_{\pi_2 q}. \tag{7} \]

3 Needlet Block Thresholding on the Sphere
In this Section we present the needlet estimators for nonparametric regression problems; we then suggest a procedure to fix blocks for any given resolution level $j$ and, consequently, define the so-called needlet block threshold estimator. The first step is close to the one described in [4], [12] for local thresholding, the second being an adaptation to the sphere of the procedure developed on $\mathbb{R}$ in [24], [25], see also [23]. In order to introduce the nonparametric regression estimator, let us initially define the so-called uncentered isonormal Gaussian process with mean $f$. Following [38], an isonormal Gaussian process over $H$ is defined as $X = \{X(h) : h \in H\}$, where $H$ is a real separable Hilbert space with inner product $\langle \cdot, \cdot \rangle$. Hence, we assume that $X$ describes a family of (uncentered) Gaussian variables, defined on some probability space $(\Omega, \mathcal{F}, P)$, such that for all $h_1, h_2 \in H$, $\{X(h_1), X(h_2)\}$ are jointly Gaussian with mean
\[ E X(h) = \langle h, f \rangle = \int_{S^2} f(x) h(x)\,dx \]
and covariance
\[ E\,(X(h_1) - E X(h_1))\,(X(h_2) - E X(h_2)) = \langle h_1, h_2 \rangle. \]
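In our case, $\Omega := S^2$ and $\mathcal{F}$ is the $\sigma$-algebra generated by $X$. We will use $L^2(S^2)$ instead of $L^2(S^2, \mathcal{F}, P)$ to simplify the notation. We shall in fact be concerned with sequences $\{X_n\}$ of such processes, where we assume that
\[ E X_n(h) = \langle h, f \rangle = \int_{S^2} f(x) h(x)\,dx \]
and covariance
\[ E\,(X_n(h_1) - E X_n(h_1))\,(X_n(h_2) - E X_n(h_2)) = \frac{1}{n} \langle h_1, h_2 \rangle. \]
Consider now the usual needlet system $\{\psi_{jk}\}_{j,k}$ and let $f \in L^p(S^2)$; we have the following:
\[ \hat{\beta}_{jk} = X_n(\psi_{jk}) = \beta_{jk} + \varepsilon_{jk;n}, \tag{8} \]
where
\[ \beta_{jk} = E X_n(\psi_{jk}) = \langle \psi_{jk}, f \rangle = \int_{S^2} f(x) \psi_{jk}(x)\,dx, \]
and
\[ E \varepsilon_{jk;n} = E\left[ X_n(\psi_{jk}) - E X_n(\psi_{jk}) \right] = 0, \]
\[ E \varepsilon^2_{jk;n} = \frac{1}{n} \langle \psi_{jk}, \psi_{jk} \rangle_{L^2(S^2)} = \frac{1}{n} \|\psi_{jk}\|^2_{L^2(S^2)}, \]
\[ E \varepsilon_{jk_1;n} \varepsilon_{jk_2;n} = \frac{1}{n} \langle \psi_{jk_1}, \psi_{jk_2} \rangle_{L^2(S^2)} = \frac{1}{n} \, \frac{\sum_l b^2\left(\frac{l}{B^j}\right) \frac{2l+1}{4\pi} P_l(\langle \xi_{jk_1}, \xi_{jk_2} \rangle)}{\sum_l b^2\left(\frac{l}{B^j}\right) \frac{2l+1}{4\pi}}. \tag{9} \]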
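To get a feeling for the covariance structure of the noise terms in (9), the ratio of Legendre sums can be evaluated numerically. The sketch below is not taken from the paper: it assumes $B = 2$, reads the argument of the window as $l / B^j$, and builds $b(\cdot)$ through a standard smooth-bump construction from the needlet literature, evaluated with a simple midpoint quadrature; all function names are hypothetical.

```python
import math

B = 2.0  # bandwidth parameter B > 1 (illustrative choice)

def _bump(t):
    # Smooth bump supported on (-1, 1).
    return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0

def _integral(a, b, steps=2000):
    # Midpoint rule for the bump over [a, b].
    h = (b - a) / steps
    return h * sum(_bump(a + (i + 0.5) * h) for i in range(steps))

_TOTAL = _integral(-1.0, 1.0)

def _smooth_step(u):
    # Rises smoothly from 0 at u = -1 to 1 at u = 1.
    if u <= -1.0:
        return 0.0
    if u >= 1.0:
        return 1.0
    return min(1.0, _integral(-1.0, u) / _TOTAL)

def _phi(t):
    # Equals 1 on [0, 1/B] and decays smoothly to 0 on [1/B, 1].
    if t <= 1.0 / B:
        return 1.0
    if t >= 1.0:
        return 0.0
    return _smooth_step(1.0 - 2.0 * B / (B - 1.0) * (t - 1.0 / B))

def b_squared(xi):
    # b^2(xi) = phi(xi / B) - phi(xi), supported on [1/B, B].
    return max(0.0, _phi(xi / B) - _phi(xi))

def legendre(l_max, x):
    # P_0(x), ..., P_{l_max}(x) by the three-term recurrence.
    values = [1.0, x]
    for l in range(1, l_max):
        values.append(((2 * l + 1) * x * values[l] - l * values[l - 1]) / (l + 1))
    return values[: l_max + 1]

def coefficient_correlation(j, angle):
    # Ratio of sums in (9) for two level-j cubature points separated by
    # the geodesic distance `angle`.
    l_max = int(math.ceil(B ** (j + 1)))
    p = legendre(l_max, math.cos(angle))
    weights = [b_squared(l / B ** j) * (2 * l + 1) / (4 * math.pi)
               for l in range(l_max + 1)]
    return sum(w * pl for w, pl in zip(weights, p)) / sum(weights)
```

Since $b^2$ is written as a difference of the same function at two scales, the sum over $j$ telescopes, which is why the unitary property of $b(\cdot)$ holds for this particular construction. Evaluating `coefficient_correlation(j, angle)` for growing `angle` gives a rough picture of how the dependence between coefficients at the same level varies with the distance between their cubature points.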
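In a formal sense, one could consider the Gaussian white noise measure $W$ on the sphere such that for all $A, B \subset S^2$ we have
\[ E\,W(A) W(B) = \int_{A \cap B} dx, \]
so that, consistently with the variance of $\varepsilon_{jk;n}$ given above,
\[ \varepsilon_{jk;n} = \frac{1}{\sqrt{n}} \int_{S^2} \psi_{jk}(x)\,W(dx). \]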
As described above (see also [4], [12]), $f$ can be expressed in terms of needlet coefficients, up to a constant, as
\[ f = \sum_{j \ge 0} \sum_{k=1}^{N_j} \beta_{jk} \psi_{jk}. \]
Let us now define the blocks on which we will apply the thresholding procedure: as anticipated in the Introduction, differently from [24], the structure of the needlet framework itself suggests a quite intuitive way to proceed. Let us fix $j > 0$: recall that for each resolution level $j$, we have $N_j \approx B^{2j}$ cubature points. Given the size of the blocks, i.e. the number of cubature points belonging to each of them, say $\ell_j$, we will build using (1) a set of Voronoi cells, each containing $\ell_j$ cubature points. For each cell, we choose a cubature point $\xi_{js}$ to index it; we define $S_j(\ell_j)$ as the number of Voronoi cells obtained by splitting the cubature points into groups of cardinality $\ell_j$. Let us define the set
\[ R_{j;s} = \{ k : \xi_{jk} \in V(\xi_{js}) \}, \qquad s = 1, \dots, S_j. \]
From (1), it is immediate to see that each cubature point $\xi_{jk}$ belongs to a unique Voronoi cell. We choose $\ell_j$ such that
\[ \ell_j = [N_j^{\gamma}] \le N_j, \]
where $[\cdot]$ denotes the integer part and $0 < \gamma < 1$, such that
\[ S_j = \frac{N_j}{\ell_j} \approx B^{2j(1 - \gamma)}. \]
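As a quick arithmetic illustration of how the block size and the number of blocks scale with the resolution level, consider the sketch below. It rests on assumptions made only for this example: $N_j$ is taken exactly equal to $B^{2j}$ (the text only gives its order of magnitude), the exponent in $(0, 1)$ governing the block size is called `gamma`, and the function name is hypothetical.

```python
import math

def block_layout(B, j, gamma):
    # N_j is set to B^(2j) purely for illustration.
    n_j = int(round(B ** (2 * j)))
    # Block size: integer part of N_j^gamma (small guard against rounding).
    l_j = max(1, int(n_j ** gamma + 1e-9))
    # Number of blocks needed to cover all cubature points.
    s_j = math.ceil(n_j / l_j)
    return n_j, l_j, s_j

for j in range(1, 5):
    print(j, block_layout(2.0, j, 0.5))
```

With `B = 2` and `gamma = 0.5`, both the block size and the number of blocks grow like $B^{j}$, so neither the local (blocks of size one) nor the global (a single block) regime is reached.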
We finally define, for any integer $p \ge 1$,
\[ A_{js;p} := \frac{1}{\ell_j} \sum_{k \in R_{j;s}} \beta_{jk}^p, \]
and its corresponding estimator
\[ \hat{A}_{js;p} = \frac{1}{\ell_j} \sum_{k \in R_{j;s}} \hat{\beta}_{jk}^p, \]
similar to the ones suggested in [24], Remark 4.7. Let us define the following weight function
\[ w_{js;p} = I\left( \hat{A}_{js;p} > \kappa t_n^p \right); \]
we have:
\[ \hat{f} = \sum_{j=0}^{J_n} \sum_{s=1}^{S_j} \left( \sum_{k \in R_{j;s}} \hat{\beta}_{jk} \psi_{jk} \right) w_{js;p}, \tag{10} \]
where:
$J_n$ is the highest resolution level considered, taken such that $B^{J_n} = n^{1/2}$;
$\kappa$ is the threshold constant (for more discussion see for instance [4], [12]);
the scaling factor $t_n$ depends on the size of the sample. We will fix $t_n = n^{-1/2}$.
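The keep-or-kill rule on blocks can be sketched in a few lines for a single resolution level. This is an illustration under simplifying assumptions, not the procedure of the paper: blocks are consecutive index ranges rather than Voronoi cells, the noise in the usage example is independent across coefficients, the threshold constant is passed as `kappa`, and the function name is hypothetical.

```python
import random

def block_threshold(beta_hat, block_size, p, kappa, n):
    """Keep or kill whole blocks of estimated coefficients at one level."""
    t_n = n ** -0.5  # scaling factor t_n = n^(-1/2)
    kept = [0.0] * len(beta_hat)
    weights = []
    for start in range(0, len(beta_hat), block_size):
        block = beta_hat[start:start + block_size]
        # Block statistic: average of the p-th powers over the block.
        a_hat = sum(b ** p for b in block) / block_size
        # Weight: 1 if the block statistic exceeds kappa * t_n^p, else 0.
        w = 1 if a_hat > kappa * t_n ** p else 0
        weights.append(w)
        if w:
            kept[start:start + len(block)] = block
    return kept, weights

# Usage on simulated coefficients: one block carries signal, three do not.
rng = random.Random(1)
n = 10_000
true_beta = [0.5] * 8 + [0.0] * 24
noisy = [b + rng.gauss(0.0, n ** -0.5) for b in true_beta]
kept, weights = block_threshold(noisy, block_size=8, p=2, kappa=4.0, n=n)
```

Compared with coefficient-wise thresholding, the decision for each coefficient here borrows strength from its neighbours in the block, which is the intermediate regime between local and global thresholding discussed in the Introduction.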
4 Minimax $L^p$-risk rates of convergence
This Section describes the performance of the procedure in terms of the optimality of its convergence rates with respect to general $L^p(S^2)$-loss functions: this result is established in the next theorem. Note that this procedure achieves the minimax rates provided in [4] and [12], see also [23]. Furthermore, as in the frameworks described in [4] and [12], the minimax rates are not affected by the construction over the sphere, which is instead pivotal in the development of statistical procedures.
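Theorem 1 Let $f \in B^r_{\pi q}(G)$, the Besov ball such that $\|f\|_{B^r_{\pi q}(G)} \le M < +\infty$, $r - \frac{2}{\pi} > 0$. Consider $\hat{f}$ as defined by (10). For $p \in \mathbb{N}$, there exists a constant $c_p = c_p(p, r, q, M, B)$ such that
\[ \sup_{f \in B^r_{\pi q}(G)} E \|\hat{f} - f\|^p_{L^p(S^2)} \ \dots \]
where $\alpha(r, \pi, p) =$ [...]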
8
2 2=nj B 2j e
nx2 2
we obtain E2
C
e
nx2 4
Z
p x> 2pn2 j p 2
C2 n
,
10
nx2 4
xp
+2j
1
B 2j e
e
nx2 4
nx2 2
,
dx
!
x dx
dx ,
!
x dx
jk
= E1 + E2 , where
x)
so we achieve (12). In order to prove (13), we write bjs;p A
P
Ajs;p > tpn
80 > `j < 1 X bp =P @ jk > : `j
p
E b jk A
k=1
De…ne
e
where
jk
:=
p
n
jk
+
p
11=p p
n"jk;n =
"jk :=
p
n
>p
jk
9 > =
n> ;
.
+ "jk ,
n"jk;n ;
our aim is hence to study the behaviour of the terms of the form 0
p `j `j X p nX @1 "pjk + `j `j k=1
p 1 jk "jk
+ ::: +
pn
(p 1)=2
`j
k=1
`j X
k=1
Observe that:
`j 1 X `j
0
k=1
we have that `j X
k=1
2p 2 jk
Nj X
2p 2 jk
11=2 0
`j X @1 `j
p 1 jk "jk
=O B
js
1
j(1
B
p 1 A jk "jk
.
(14)
11=2 `j X 1 @ "2jk A ; `j
2p 2 A jk
k=1
11=p
p
1)
k=1
=O B
js
B
p j( p
2 1)
.
k=1
On the other hand, by Lemma 3, for all p; > 0, there exists 8 9 `j 0 .
> 0 there exists > 0 such that 9 8 `j = C : `j k=1
Let us rewrite
we can take p to be even; note indeed that 9 8 9 `j =