On the strong consistency of the kernel estimator of extreme conditional quantiles

Stéphane Girard(1,2) and Sana Louhichi(2)

(1) Mistis, Inria Grenoble Rhône-Alpes, France. (2) Laboratoire Jean Kuntzmann, France.
Abstract

Nonparametric regression quantiles can be obtained by inverting a kernel estimator of the conditional distribution. The asymptotic properties of this estimator are well known in the case of ordinary quantiles of fixed order. The goal of this paper is to establish the strong consistency of the estimator in the case of extreme conditional quantiles. In such a case, the probability of exceeding the quantile tends to zero as the sample size increases, and the extreme conditional quantile is thus located in the distribution tails.
1 Introduction

Quantile regression plays a central role in various statistical studies. In particular, nonparametric regression quantiles obtained by inverting a kernel estimator of the conditional distribution function are extensively investigated in the sample case [3, 27, 29, 30]. Extensions to random fields [1], time series [14], functional data [11] and truncated data [25] are also available. However, all these papers are restricted to conditional quantiles of fixed order $\alpha \in (0,1)$. In the following, $\alpha$ denotes the conditional probability of being larger than the conditional quantile. Consequently, the above-mentioned asymptotic theories do not apply in the distribution tails, i.e. when $\alpha = \alpha_n \to 0$ or $\alpha_n \to 1$ as the sample size $n$ goes to infinity. Motivating applications include, for instance, environmental studies [16, 28], finance [31], insurance [4] and image analysis [26].

The asymptotics of extreme conditional quantile estimators have been established in a number of regression models. Chernozhukov [6] and Jurečková [22] considered extreme quantiles in the linear regression model and derived their asymptotic distributions under various error distributions. Other parametric models are considered in [10, 28]. A semi-parametric approach to modeling trends in extremes has been introduced in [9], based on local polynomial fitting of the generalized extreme-value distribution. Hall and Tajvidi [21] suggested a nonparametric estimation of the temporal trend when fitting parametric models to extreme values. Another semi-parametric method has been developed in [2], using a conditional Pareto-type distribution for the response. Fully nonparametric estimators of extreme conditional quantiles have been discussed in [2, 5], including
local polynomial maximum likelihood estimation and spline fitting via maximum penalized likelihood. Recently, [15, 18] proposed, respectively, a moving-window based estimator for the tail index and for extreme quantiles of heavy-tailed conditional distributions, and established their asymptotic properties. In the kernel-smoothing case, the asymptotic theory for quantile regression in the tails is still in full development. [19, 20] analyzed the case $\alpha_n = 1/n$ in the particular situation where the response $Y$ given $X = x$ is uniformly distributed. The asymptotic distribution of the kernel estimator of extreme conditional quantiles is established in [7, 17] for heavy-tailed conditional distributions. This result is extended to all types of tails in [8]. Here, we focus on the strong consistency of the kernel estimator for extreme conditional quantiles. Our main result is established in Section 2. Some illustrative examples are provided in Section 3. The proofs of the main results are given in Section 4 and the proofs of the auxiliary results are postponed to the Appendix.
2 Main results

Let $(X_i, Y_i)_{1\le i\le n}$ be independent copies of a random pair $(X,Y) \in \mathbb{R}^d \times \mathbb{R}^+$ with density $f_{(X,Y)}$. Let $g$ be the density of $X$, which we suppose strictly positive. The conditional survival function of $Y$ given $X = x$ is denoted by
\[
\bar F(y|x) = \mathbb{P}(Y > y \mid X = x) = \frac{1}{g(x)} \int_y^{+\infty} f_{(X,Y)}(x,z)\,dz.
\]
The kernel estimator of $\bar F(y|x)$ is, for $x$ such that $\sum_{i=1}^n K_h(x - X_i) \neq 0$,
\[
\bar F_n(y|x) = \frac{\sum_{i=1}^n K_h(x - X_i)\,\mathbf{1}_{Y_i > y}}{\sum_{i=1}^n K_h(x - X_i)},
\]
where $h = h_n \to 0$ as $n \to \infty$ and $K_h(u) = h^{-d} K(u/h)$; the kernel $K$ is a measurable function which satisfies the conditions:

$(K_1)$ $K$ is a continuous probability density.

$(K_2)$ $K$ has compact support: there exists $R > 0$ such that $K(u) = 0$ for any $\|u\| > R$. Define $\kappa := \|K\|_\infty = \sup_{x\in\mathbb{R}^d} K(x) < \infty$.
Recall that, for a class of functions $\mathcal{G}$, $\mathcal{N}(\varepsilon, \mathcal{G}, d_Q)$ denotes the minimal number of balls $\{g,\ d_Q(g,g') < \varepsilon\}$ of $d_Q$-radius $\varepsilon$ needed to cover $\mathcal{G}$, where $d_Q$ is the $L_2(Q)$-metric. Let $\mathcal{K}$ be the set of functions $\mathcal{K} = \{K((x-\cdot)/h),\ h > 0,\ x \in \mathbb{R}^d\}$ and let $\mathcal{N}(\varepsilon, \mathcal{K}) = \sup_Q \mathcal{N}(\varepsilon\|K\|_\infty, \mathcal{K}, d_Q)$, where the supremum is taken over all probability measures $Q$ on $\mathbb{R}^d \times \mathbb{R}$. Suppose that

$(K_3)$ for some $C, \nu > 1$, $\mathcal{N}(\varepsilon, \mathcal{K}) \le C\varepsilon^{-\nu}$ for any $\varepsilon \in (0,1)$.
A number of sufficient conditions under which $(K_3)$ holds are discussed in [13] and the references therein. Finally, suppose that

$(A_1)$ for all $\alpha \in (0,1)$ there exists a unique $q(\alpha|x) \in \mathbb{R}$ such that $\bar F(q(\alpha|x)|x) = \alpha$.

The conditional quantile $q(\alpha|x)$ is then the inverse of the function $\bar F(\cdot|x)$ at the point $\alpha$. Let $(\alpha_n)$ be a fixed sequence of levels with values in $[0,1]$. For any $x \in \mathbb{R}^d$, define $q(\alpha_n|x)$ as the unique solution of the equation
\[
\alpha_n = \bar F(q(\alpha_n|x)|x), \tag{1}
\]
whose existence is guaranteed by Assumption $(A_1)$. The kernel estimator of the conditional quantile $q(\alpha|x)$ is
\[
\hat q_n(\alpha|x) = \inf\{y \in \mathbb{R},\ \bar F_n(y|x) \le \alpha\}. \tag{2}
\]
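For concreteness, the two estimators above can be computed directly from the data. The following Python sketch is ours, not part of the paper: the biweight kernel, the function names and the assumption of no ties among the $Y_i$ are illustration choices, made so that $(K_1)$-$(K_2)$ hold with $R = 1$.

\begin{verbatim}
import numpy as np

def biweight_kernel(u):
    # Compactly supported kernel on the unit ball of R^d: satisfies (K1)-(K2) with R = 1.
    norm2 = np.sum(u ** 2, axis=-1)
    return np.where(norm2 <= 1.0, (1.0 - norm2) ** 2, 0.0)

def survival_estimate(x, y, X, Y, h):
    # \bar F_n(y|x) = sum_i K_h(x - X_i) 1{Y_i > y} / sum_i K_h(x - X_i).
    # X has shape (n, d), Y has shape (n,), x has shape (d,).
    w = biweight_kernel((x - X) / h)          # the factor h^{-d} cancels in the ratio
    if w.sum() == 0.0:
        raise ValueError("no observation in the kernel window; increase h")
    return np.sum(w * (Y > y)) / w.sum()

def quantile_estimate(x, alpha, X, Y, h):
    # \hat q_n(alpha|x) = inf{y : \bar F_n(y|x) <= alpha}, assuming no ties among the Y_i.
    w = biweight_kernel((x - X) / h)
    if w.sum() == 0.0:
        raise ValueError("no observation in the kernel window; increase h")
    order = np.argsort(Y)
    Ys, ws = Y[order], w[order]
    # tail[k] = \bar F_n(Y_(k)|x): weight of the observations strictly above the k-th order statistic
    tail = np.append(np.cumsum(ws[::-1])[::-1][1:], 0.0) / w.sum()
    return Ys[np.argmax(tail <= alpha)]       # first order statistic where the survival drops below alpha
\end{verbatim}

With a compactly supported kernel, only the observations with $\|x - X_i\| \le h$ contribute, so the effective sample size behind $\hat q_n(\alpha|x)$ is of order $n h^d \alpha$, the quantity that appears in Theorem 1 below.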
Finally, denote by $\hat g_n(x)$ the kernel density estimator of the probability density $g$, i.e.
\[
\hat g_n(x) = \frac{1}{n}\sum_{i=1}^n K_h(x - X_i).
\]
Our main result is the following.

Theorem 1. Let $(X_i,Y_i)_{1\le i\le n}$ be independent copies of a random pair $(X,Y) \in \mathbb{R}^d\times\mathbb{R}^+$ with density $f_{(X,Y)}$. Let $g$ be the density of $X$, which we suppose bounded and strictly positive. Suppose that assumption $(A_1)$ is satisfied. Let $(\alpha_n)$ be a sequence of levels in $[0,1]$ for which
\[
\limsup_{n\to\infty}\ \sup_{x\in\mathbb{R}^d} \frac{\hat q_n(\alpha_n|x)}{q(\alpha_n|x)} \le \mathrm{Cst}, \tag{3}
\]
almost surely. Suppose that the kernel $K$ satisfies Conditions $(K_1)$, $(K_2)$, $(K_3)$. Define
\[
A(y,z,x,h_n) = \sup_{u:\,d(u,x)\le R h_n} \left|\frac{\bar F(y|u)}{\bar F(z|x)} - 1\right|.
\]
Suppose that, for some fixed positive $\varepsilon_0$ and for $z \in \{q(\alpha_n|x), (1+\varepsilon)q(\alpha_n|x)\}$,
\[
\limsup_{n\to\infty}\ \sup_{x\in\mathbb{R}^d,\,|\varepsilon|\le\varepsilon_0} A\bigl((1+\varepsilon)q(\alpha_n|x), z, x, h_n\bigr) \le C < \infty. \tag{4}
\]
If, moreover,
\[
\lim_{n\to\infty} n h_n^d\alpha_n = \infty
\quad\text{and}\quad
\lim_{n\to\infty} \frac{\ln(\alpha_n h_n^d \wedge \alpha_n^2)}{n h_n^d\alpha_n} = 0, \tag{5}
\]
then there exists a positive constant $C$, not depending on $x$, such that for $n$ sufficiently large
\[
\left|1 - \frac{\bar F(\hat q_n(\alpha_n|x)|x)}{\bar F(q(\alpha_n|x)|x)}\right|
\le C \sup_{|\varepsilon|\le\varepsilon_0} A\bigl((1+\varepsilon)q(\alpha_n|x), (1+\varepsilon)q(\alpha_n|x), x, h_n\bigr)
+ \frac{C}{\hat g_n(x)}\sqrt{\frac{\ln(\alpha_n^{-1}h_n^{-d}) \vee \ln\ln n}{n\alpha_n h_n^d}},
\quad\text{almost surely.}
\]
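As a hedged illustration of condition (5) (the power rates below are our choice and do not appear in the paper), take $\alpha_n = n^{-a}$ and $h_n = n^{-b}$ with $a, b > 0$. Then $n h_n^d\alpha_n = n^{1-a-bd}$ and $\ln(\alpha_n h_n^d \wedge \alpha_n^2) = -\max(a+bd,\,2a)\ln n$, so that
\[
n h_n^d\alpha_n \to \infty
\quad\text{and}\quad
\frac{\ln(\alpha_n h_n^d \wedge \alpha_n^2)}{n h_n^d\alpha_n} = -\frac{\max(a+bd,\,2a)\ln n}{n^{1-a-bd}} \to 0
\]
as soon as $a + bd < 1$: the level may tend to zero and the bandwidth may shrink, provided the effective sample size $n h_n^d\alpha_n$ still grows polynomially.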
The first term of the bound can be interpreted as a bias term due to the kernel smoothing. The second term can be seen as a variance term, $n\alpha_n h_n^d$ being the effective number of points used in the estimation. The following proposition gives conditions under which (3) is satisfied.

Proposition 1. Suppose that $g$ is Lipschitzian and bounded above by $g_{\max}$. Let $v_d = \int_{\|v\|\le 1} dv$ be the volume of the unit sphere. If $A(q(\alpha_n|x), q(\alpha_n|x), x, 0, h) \to 0$ and there exists $\varepsilon > 0$ such that
\[
\sum_{n=1}^{\infty} n h^d\alpha_n \exp\{-v_d\, g_{\max}\, n h^d\alpha_n (1+\varepsilon)\} = \infty, \tag{6}
\]
then
\[
\limsup_{n\to\infty}\ \sup_{x\in\mathbb{R}^d} \frac{\hat q_n(\alpha_n|x)}{q(\alpha_n|x)} \le 1, \quad\text{almost surely.}
\]
Some examples of distributions satisfying condition (4) are provided in the next section.
3 Examples

Let us first focus on a conditional Pareto distribution defined as
\[
\bar F(y|x) = y^{-\theta(x)}, \quad\text{for all } y > 0. \tag{7}
\]
Here, $\theta(x) > 0$ can be read as the inverse of the conditional extreme-value index. The above distribution belongs to the so-called Fréchet maximum domain of attraction, which encompasses all distributions with heavy tails. As a consequence of Theorem 1, we have:

Corollary 1. Let us consider a conditional Pareto distribution (7) such that $0 < \theta_{\min} \le \theta(x) \le \theta_{\max}$ for all $x \in \mathbb{R}^d$. Assume that $\theta$ is Lipschitzian. If the sequences $(\alpha_n)$ and $(h_n)$ are such that $h_n\log\alpha_n \to 0$ as $n \to \infty$ and (5), (6) hold, then $\hat q_n(\alpha_n|x)/q(\alpha_n|x) \to 1$ almost surely as $n \to \infty$.

Let us now consider a conditional exponential distribution defined as
\[
\bar F(y|x) = \exp(-\theta(x)y), \quad\text{for all } y > 0, \tag{8}
\]
where $\theta(x) > 0$ is the inverse of the conditional expectation of $Y$ given $X = x$. This distribution belongs to the Gumbel maximum domain of attraction, which collects all distributions with a null conditional extreme-value index. These distributions are often referred to as light-tailed distributions. In such a case, Theorem 1 yields a stronger convergence result than in the heavy-tail framework:

Corollary 2. Let us consider a conditional exponential distribution (8) with $0 < \theta_{\min} \le \theta(x) \le \theta_{\max}$ for all $x \in \mathbb{R}^d$. Assume that $\theta$ is Lipschitzian. If the sequences $(\alpha_n)$ and $(h_n)$ are such that $h_n\log\alpha_n \to 0$ as $n \to \infty$ and (5), (6) hold, then $\hat q_n(\alpha_n|x) - q(\alpha_n|x) \to 0$ almost surely as $n \to \infty$.
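To illustrate Corollary 1 numerically, one can simulate from the conditional Pareto model and compare the estimator with the true quantile $q(\alpha|x) = \alpha^{-1/\theta(x)}$. The sketch below reuses the hypothetical quantile_estimate function from the sketch in Section 2; the choices $\theta(x) = 1 + x$, $\alpha_n = n^{-1/2}$ and $h_n = n^{-1/4}$ (so that $a + bd = 3/4 < 1$ in the rate example above) are ours and only meant as an example.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)

n, d = 100_000, 1
theta = lambda x: 1.0 + x[..., 0]            # Lipschitz, bounded between 1 and 2 on [0, 1]

X = rng.uniform(0.0, 1.0, size=(n, d))
U = rng.uniform(0.0, 1.0, size=n)
Y = U ** (-1.0 / theta(X))                   # conditional Pareto: P(Y > y | X = x) = y^{-theta(x)}

alpha_n = n ** (-0.5)                        # extreme level, alpha_n -> 0
h_n = n ** (-0.25)                           # bandwidth, n * h_n^d * alpha_n -> infinity

x0 = np.array([0.5])
q_true = alpha_n ** (-1.0 / theta(x0))       # q(alpha_n | x0) = alpha_n^{-1/theta(x0)}
q_hat = quantile_estimate(x0, alpha_n, X, Y, h_n)
print(q_hat / q_true)                        # ratio expected to be close to 1 (Corollary 1)
\end{verbatim}

Increasing $n$ (and letting $\alpha_n$ and $h_n$ shrink accordingly) should drive the printed ratio toward one, in line with the strong consistency statement.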
4 Proofs

4.1 Proof of Theorem 1

Clearly, by (1),
\[
\frac{|\bar F(q(\alpha_n|x)|x) - \bar F(\hat q_n(\alpha_n|x)|x)|}{\bar F(q(\alpha_n|x)|x)}
\le \frac{|\alpha_n - \bar F_n(\hat q_n(\alpha_n|x)|x)|}{\bar F(q(\alpha_n|x)|x)}
+ \frac{|\bar F_n(\hat q_n(\alpha_n|x)|x) - \bar F(\hat q_n(\alpha_n|x)|x)|}{\bar F(q(\alpha_n|x)|x)}.
\]
First, from (2), $|\alpha_n - \bar F_n(\hat q_n(\alpha_n|x)|x)|$ is bounded above by the maximal jump of $\bar F_n(y|x)$ at some observation point $(X_j, Y_j)$:
\[
|\alpha_n - \bar F_n(\hat q_n(\alpha_n|x)|x)| \le \frac{\max_{j=1,\dots,n} K_h(x - X_j)}{\sum_{i=1}^n K_h(x - X_i)}.
\]
It follows from $(K_2)$ that
\[
\frac{|\alpha_n - \bar F_n(\hat q_n(\alpha_n|x)|x)|}{\bar F(q(\alpha_n|x)|x)} \le \frac{\kappa}{n h^d\alpha_n\,\hat g_n(x)}.
\]
Let us then focus on the second term:
\[
\frac{|\bar F_n(\hat q_n(\alpha_n|x)|x) - \bar F(\hat q_n(\alpha_n|x)|x)|}{\bar F(q(\alpha_n|x)|x)}
= \frac{\bar F(\hat q_n(\alpha_n|x)|x)}{\bar F(q(\alpha_n|x)|x)}
\left|\frac{\bar F_n(\hat q_n(\alpha_n|x)|x)}{\bar F(\hat q_n(\alpha_n|x)|x)} - 1\right|. \tag{9}
\]
We write $\hat q_n(\alpha_n|x) = (1+\varepsilon)\,q(\alpha_n|x)$, with $\varepsilon = \hat q_n(\alpha_n|x)/q(\alpha_n|x) - 1$. Condition (3) allows us to deduce that there exists, for $n$ sufficiently large, an $\varepsilon_0$ not depending on $x$ such that $|\varepsilon| \le \varepsilon_0$. Consequently, taking (9) into account, there exists a positive constant $\varepsilon_0$, not depending on $x$ and $n$, such that (for the sake of simplicity, we write $q = q(\alpha_n|x)$)
\[
\left|\frac{\bar F(\hat q_n(\alpha_n|x)|x)}{\bar F(q(\alpha_n|x)|x)} - 1\right|
\le \frac{\kappa}{n h^d\alpha_n\,\hat g_n(x)}
+ \sup_{|\varepsilon|\le\varepsilon_0}\left(\frac{\bar F(q(1+\varepsilon)|x)}{\bar F(q|x)}
\left|\frac{\bar F_n(q(1+\varepsilon)|x)}{\bar F(q(1+\varepsilon)|x)} - 1\right|\right). \tag{10}
\]
Our purpose now is to control the term $\bigl|\bar F_n(q(1+\varepsilon)|x)/\bar F(q(1+\varepsilon)|x) - 1\bigr|$ in (10). To this end, write
\[
\bar F_n(y|x) = \frac{\sum_{i=1}^n K_h(x - X_i)\mathbf 1_{Y_i>y}}{\sum_{i=1}^n K_h(x - X_i)} =: \frac{\hat\psi_n(y,x)}{\hat g_n(x)},
\quad\text{with}\quad
\hat\psi_n(y,x) = \frac1n\sum_{i=1}^n K_h(x - X_i)\mathbf 1_{Y_i>y},
\quad
\hat g_n(x) = \frac1n\sum_{i=1}^n K_h(x - X_i).
\]
We need the following lemma.

Lemma 1. Suppose that Condition $(K_2)$ holds. Then, for each $n$, one has for any $y\in\mathbb R$ and $x\in\mathbb R^d$ for which $\bar F(y|x)\,\hat g_n(x)\neq 0$,
\[
\left|\frac{\bar F_n(y|x)}{\bar F(y|x)} - 1\right|
\le A(y,y,x,h_n)
+ (1 + A(y,y,x,h_n))\frac{|\hat g_n(x) - \mathbb E\hat g_n(x)|}{\hat g_n(x)}
+ \frac{|\hat\psi_n(y,x) - \mathbb E(\hat\psi_n(y,x))|}{\bar F(y|x)\,\hat g_n(x)}.
\]
The proof is postponed to the Appendix. According to Lemma 1, we have to control the two quantities $|\hat g_n(x) - \mathbb E\hat g_n(x)|$ and $|\hat\psi_n(y,x) - \mathbb E(\hat\psi_n(y,x))|/\bar F(y|x)$. This is the purpose of Propositions 2 and 3 below.

Proposition 2 (Einmahl and Mason, 2005). Suppose that $g$ is a bounded density on $\mathbb R^d$ and that the assumptions (K.i),...,(K.iv) of [13] are all satisfied. Then, for any $c > 0$,
\[
\limsup_{n\to\infty}\ \sup_{c\ln n/n\le h_n^d\le 1}\ \sup_{x\in\mathbb R^d}
\sqrt{\frac{n h_n^d}{\ln(h_n^{-d})\vee\ln\ln n}}\;|\hat g_n(x) - \mathbb E\hat g_n(x)| =: K(c) < \infty,
\]
almost surely.
Our task now is to control $|\hat\psi_n(y,x) - \mathbb E(\hat\psi_n(y,x))|/\bar F(y|x)$. Let $\varepsilon$ be a fixed real number in $[-\varepsilon_0,\varepsilon_0]$ for some arbitrary positive $\varepsilon_0$. The following proposition evaluates the almost-sure asymptotic behaviour of
\[
\frac{|\hat\psi_n(q(1+\varepsilon),x) - \mathbb E(\hat\psi_n(q(1+\varepsilon),x))|}{\bar F(q|x)}.
\]

Proposition 3. Let $(\alpha_n)$ be a sequence in $[0,1]$ and, for $x\in\mathbb R^d$, let $q = q(\alpha_n|x)$ be the conditional quantile defined by (1). Define the set of functions $\mathcal F$ by
\[
\mathcal F = \left\{(u,v)\mapsto K\!\left(\frac{x-u}{h}\right)\mathbf 1_{v>q(1+\varepsilon)},\ n\in\mathbb N,\ h>0,\ x\in\mathbb R^d,\ |\varepsilon|\le\varepsilon_0\right\}, \tag{11}
\]
and suppose that $\mathcal N(\varepsilon,\mathcal F)\le C\varepsilon^{-\nu}$ for some $C,\nu>1$ and all $\varepsilon\in(0,1)$. Suppose also that Condition (4) is satisfied. If $n h_n^d\alpha_n\to\infty$ and $\ln(\alpha_n h_n^d\wedge\alpha_n^2)/(n h_n^d\alpha_n)\to 0$ as $n\to\infty$, then there exists a positive constant $C_1$ such that
\[
\limsup_{n\to\infty}\ \sup_{x\in\mathbb R^d,\,|\varepsilon|\le\varepsilon_0}
\sqrt{\frac{n\alpha_n h_n^d}{\ln(\alpha_n^{-1}h_n^{-d})\vee\ln\ln n}}\;
\frac{|\hat\psi_n(q(1+\varepsilon),x) - \mathbb E(\hat\psi_n(q(1+\varepsilon),x))|}{\bar F(q|x)} \le C_1,
\]
almost surely.

Proof of Proposition 3. We have
\[
\frac{1}{\bar F(q|x)}\bigl(\hat\psi_n(q(1+\varepsilon),x) - \mathbb E(\hat\psi_n(q(1+\varepsilon),x))\bigr)
= \frac{1}{n\bar F(q|x)}\sum_{i=1}^n\bigl[K_h(x-X_i)\mathbf 1_{Y_i>q(1+\varepsilon)} - \mathbb E(K_h(x-X_i)\mathbf 1_{Y_i>q(1+\varepsilon)})\bigr]
= \frac{1}{n h_n^d}\sum_{i=1}^n\bigl(v_{h,x,\varepsilon}(X_i,Y_i) - \mathbb E(v_{h,x,\varepsilon}(X_i,Y_i))\bigr)
= \frac{1}{h_n^d\sqrt n}\,\beta_n(v_{h,x,\varepsilon}),
\]
where
\[
v_{h,x,\varepsilon}(u,v) = K\!\left(\frac{x-u}{h}\right)\frac{\mathbf 1_{v>q(1+\varepsilon)}}{\bar F(q|x)},
\qquad
\beta_n(g) = \frac{1}{\sqrt n}\sum_{i=1}^n\bigl(g(X_i,Y_i) - \mathbb E(g(X_i,Y_i))\bigr).
\]
Define the class of functions
\[
\mathcal G := \mathcal G_{n,h} = \{v_{h,x,\varepsilon},\ x\in\mathbb R^d,\ \varepsilon\in[-\varepsilon_0,\varepsilon_0]\}, \tag{12}
\]
and let $\|\beta_n\|_{\mathcal G} = \sup_{g\in\mathcal G}|\beta_n(g)|$ and
\[
\Theta_n = \sup_{x\in\mathbb R^d,\,\varepsilon\in[-\varepsilon_0,\varepsilon_0]}
\frac{|\hat\psi_n(q(1+\varepsilon),x) - \mathbb E(\hat\psi_n(q(1+\varepsilon),x))|}{\bar F(q|x)}.
\]
Consequently, for any $\gamma>0$,
\[
\mathbb P(\Theta_n>\gamma)
\le \mathbb P\bigl(\sqrt n\,\|\beta_n\|_{\mathcal G} > \gamma\,n h^d\bigr)
\le \mathbb P\Bigl(\max_{1\le m\le n}\sqrt m\,\|\beta_m\|_{\mathcal G} > \gamma\,n h^d\Bigr). \tag{13}
\]
We then have to evaluate $\max_{1\le m\le n}\sqrt m\,\|\beta_m\|_{\mathcal G}$. By Talagrand's inequality (see A.1 in [12]), we have, for any $t>0$ and suitable finite constants $A_1, A_2>0$,
\[
\mathbb P\Bigl(\max_{1\le m\le n}\sqrt m\,\|\beta_m\|_{\mathcal G}
> A_1\Bigl(\mathbb E\bigl\|\sum_{i=1}^n\varepsilon_i g(X_i,Y_i)\bigr\|_{\mathcal G} + t\Bigr)\Bigr)
\le 2\exp(-A_2 t^2/(n\sigma^2)) + 2\exp(-A_2 t/M),
\]
where $(\varepsilon_i)_i$ is a sequence of independent Rademacher random variables, independent of the random vectors $(X_i,Y_i)_{1\le i\le n}$, and
\[
\sup_{g\in\mathcal G}\|g\|_\infty\le M,
\qquad
\sup_{g\in\mathcal G}\operatorname{Var}(g(X,Y))\le\sigma^2.
\]
Here $\|g\|_\infty \le \|K\|_\infty/\alpha_n =: M$, and
\[
\operatorname{Var}(v_{h,x,\varepsilon}(X,Y))
\le \mathbb E(v^2_{h,x,\varepsilon}(X,Y))
= \frac{1}{\bar F^2(q|x)}\,\mathbb E\Bigl(K^2\Bigl(\frac{x-X}{h}\Bigr)\mathbf 1_{Y>q(1+\varepsilon)}\Bigr)
= \frac{1}{\bar F(q|x)}\int K^2\Bigl(\frac{x-u}{h}\Bigr)\frac{\bar F(q(1+\varepsilon)|u)}{\bar F(q|x)}\,g(u)\,du
\le \frac{h^d}{\alpha_n}\Bigl(1 + \sup_{x\in\mathbb R^d,\,\varepsilon\in[-\varepsilon_0,\varepsilon_0]} A(q(1+\varepsilon),q,x,h)\Bigr)\|K\|_2^2\,\|g\|_\infty
= \frac{h_n^d}{\alpha_n}\,L =: \sigma^2, \tag{14}
\]
for some positive constant $L$, since for $n$ sufficiently large
\[
\sup_{x\in\mathbb R^d,\,\varepsilon\in[-\varepsilon_0,\varepsilon_0]} A(q(1+\varepsilon),q,x,h)\le\mathrm{cst}.
\]
Combining this with the above Talagrand inequality, we obtain
\[
\mathbb P\Bigl(\max_{1\le m\le n}\sqrt m\,\|\beta_m\|_{\mathcal G}
> A_1\Bigl(\mathbb E\bigl\|\sum_{i=1}^n\varepsilon_i g(X_i,Y_i)\bigr\|_{\mathcal G} + t\Bigr)\Bigr)
\le 2\exp\Bigl(-A_2 t^2\frac{\alpha_n}{n h_n^d L}\Bigr)
+ 2\exp\Bigl(-A_2 t\frac{\alpha_n}{\|K\|_\infty}\Bigr). \tag{15}
\]
The last bound together with (13) gives
\[
\mathbb P\Bigl(\Theta_n > A_1 n^{-1} h_n^{-d}\Bigl(\mathbb E\bigl\|\sum_{i=1}^n\varepsilon_i g(X_i,Y_i)\bigr\|_{\mathcal G} + t\Bigr)\Bigr)
\le 2\exp\Bigl(-A_2 t^2\frac{\alpha_n}{n h_n^d L}\Bigr)
+ 2\exp\Bigl(-A_2 t\frac{\alpha_n}{\|K\|_\infty}\Bigr). \tag{16}
\]
We now have to evaluate $\mathbb E\|\sum_{i=1}^n\varepsilon_i g(X_i,Y_i)\|_{\mathcal G}$. We argue as in the proof of Proposition A.1 in [12]. By (6.9) of Proposition 6.8 in [24],
\[
\mathbb E\bigl\|\sum_{i=1}^n\varepsilon_i g(X_i,Y_i)\bigr\|_{\mathcal G}
\le 6 t_n + 6\,\mathbb E\bigl(\max_{i\le n}\|\varepsilon_i g(X_i,Y_i)\|_{\mathcal G}\bigr)
\le 6 t_n + 6\,\frac{\|K\|_\infty}{\alpha_n}, \tag{17}
\]
where we have defined
\[
t_n = \inf\Bigl\{t>0,\ \mathbb P\Bigl(\bigl\|\sum_{i=1}^n\varepsilon_i g(X_i,Y_i)\bigr\|_{\mathcal G} > t\Bigr)\le\frac{1}{24}\Bigr\}.
\]
Our purpose is then to control $t_n$. Define the event
\[
F_n = \Bigl\{n^{-1}\sup_{g\in\mathcal G}\sum_{j=1}^n g^2(X_j,Y_j)\le 64\sigma^2\Bigr\},
\]
where $\sigma^2$ is as in (14). Let $g_0$ be a fixed element of $\mathcal G$. We have
\[
\mathbb E\Bigl|\sum_{i=1}^n\varepsilon_i g_0(X_i,Y_i)\mathbf 1_{F_n}\Bigr|\le 8\sigma\sqrt n.
\]
By (A8) of [12], we now have to control $\mathcal N(\varepsilon,\mathcal G,d_{n,2})$. Recall that $\mathcal N(\varepsilon,\mathcal G,d_Q)$ is the minimal number of balls $\{g,\ d_Q(g,g')<\varepsilon\}$ of $d_Q$-radius $\varepsilon$ needed to cover $\mathcal G$, where $d_Q$ is the $L_2(Q)$-metric and
\[
d_{n,2}^2(f,g) = d_{Q_n}^2(f,g) = \int (f(x)-g(x))^2\,dQ_n(x),
\quad\text{with}\quad Q_n = \frac1n\sum_{i=1}^n\delta_{(X_i,Y_i)}.
\]
In other words,
\[
d_{n,2}^2(f,g) = \frac1n\sum_{i=1}^n\bigl(f(X_i,Y_i)-g(X_i,Y_i)\bigr)^2.
\]
We first note that, on the event $F_n$, $\mathcal N(\varepsilon,\mathcal G,d_{n,2}) = 1$ whenever $\varepsilon>16\sigma$. We will therefore suppose $\varepsilon\le 16\sigma$. We have
\[
\mathcal N(\varepsilon,\mathcal G,d_{n,2}) = \mathcal N(\varepsilon,\mathcal G,d_{Q_n})
\le \mathcal N(\varepsilon\alpha_n,\mathcal F,d_{Q_n}),
\]
where $\mathcal F$ is defined by (11). Recall that $\mathcal N(\varepsilon,\mathcal F) = \sup_Q\mathcal N(\varepsilon\|K\|_\infty,\mathcal F,d_Q)$, where the supremum is taken over all probability measures $Q$ on $\mathbb R^d\times\mathbb R$, and that we have supposed $\mathcal N(\varepsilon,\mathcal F)\le C\varepsilon^{-\nu}$ for some $C,\nu>1$ and all $\varepsilon\in(0,1)$. Consequently,
\[
\mathcal N(\varepsilon,\mathcal G,d_{n,2})\le C\Bigl(\frac{\alpha_n\varepsilon}{\|K\|_\infty}\Bigr)^{-\nu}, \tag{18}
\]
as soon as $\alpha_n\varepsilon<\|K\|_\infty$. Hence, almost surely on the event $F_n$,
\[
\int_0^\infty\sqrt{\ln\mathcal N(\varepsilon,\mathcal G,d_{n,2})}\,d\varepsilon
= \int_0^{16\sigma}\sqrt{\ln\mathcal N(\varepsilon,\mathcal G,d_{n,2})}\,d\varepsilon
\le \sum_{i=0}^\infty 2^{-i-1}16\sigma\sqrt{\ln C + \nu\ln\Bigl(\frac{2^{i+1}\|K\|_\infty}{16\,\alpha_n\sigma}\Bigr)}
\le C_2\sum_{i=0}^\infty\frac{\sigma}{2^{i+1}}\sqrt{2\max\Bigl(\ln C_1,\ \ln\frac{2^{i+1}}{\alpha_n^2\sigma^2}\Bigr)},
\]
for some positive constants $C_1, C_2$ depending only on $C$, $\nu$ and $\|K\|_\infty$. We conclude, using (A8) of [12], that
\[
\mathbb E\Bigl(\bigl\|\sum_{i=1}^n\varepsilon_i g(X_i,Y_i)\bigr\|_{\mathcal G}\mathbf 1_{F_n}\Bigr)
\le C_2'\sqrt{n\sigma^2\max\Bigl(\ln C_1,\ \ln\frac{1}{\alpha_n^2\sigma^2}\Bigr)}. \tag{19}
\]
We now use Inequality A2 in [12] (which is due to Giné and Zinn), with $t = 64\sqrt{n\sigma^2}$. We obtain, for $m\ge 1$, since $\|g\|_\infty\le\|K\|_\infty/\alpha_n$ for any $g\in\mathcal G$,
\[
\mathbb P(F_n^c) = \mathbb P\Bigl(n^{-1}\sup_{g\in\mathcal G}\sum_{j=1}^n g^2(X_j,Y_j)>64\sigma^2\Bigr)
\le 4\,\mathbb P\bigl(\mathcal N(\rho n^{-1/4},\mathcal G,d_{n,2})\ge m\bigr)
+ 8m\exp(-n\sigma^2\alpha_n^2/\|K\|_\infty^2),
\]
where $\rho = \min(\sigma n^{1/4}, n^{1/4})$, so that $\rho n^{-1/4} = \min(\sigma,1)$. Hence, by (18),
\[
\mathcal N(\rho n^{-1/4},\mathcal G,d_{n,2})\le C\Bigl(\frac{\alpha_n\min(\sigma,1)}{\|K\|_\infty}\Bigr)^{-\nu}.
\]
Consequently, for $m = \bigl[2C\bigl(\alpha_n\min(\sigma,1)/\|K\|_\infty\bigr)^{-\nu}\bigr]$, we have
\[
\mathbb P(F_n^c)\le 16C\Bigl(\frac{\alpha_n\min(\sigma,1)}{\|K\|_\infty}\Bigr)^{-\nu}\exp(-n\sigma^2\alpha_n^2/\|K\|_\infty^2).
\]
The last bound together with (19) gives
\[
\mathbb P\Bigl(\bigl\|\sum_{i=1}^n\varepsilon_i g(X_i,Y_i)\bigr\|_{\mathcal G}>t\Bigr)
\le \mathbb P(F_n^c) + \frac{1}{t}\,\mathbb E\Bigl(\bigl\|\sum_{i=1}^n\varepsilon_i g(X_i,Y_i)\bigr\|_{\mathcal G}\mathbf 1_{F_n}\Bigr)
\le 16C\Bigl(\frac{\|K\|_\infty}{\alpha_n\min(\sigma,1)}\Bigr)^{\nu}\exp(-n\sigma^2\alpha_n^2/\|K\|_\infty^2)
+ \frac{C_2'}{t}\sqrt{n\sigma^2\max\Bigl(\ln C_1,\ \ln\frac{1}{\alpha_n^2\sigma^2}\Bigr)}. \tag{20}
\]
Let us control the second term in (20). Recall that, by (14), $\sigma^2 = L h_n^d/\alpha_n$ and thus $\alpha_n^2\sigma^2 = L h_n^d\alpha_n$ for some positive constant $L$. This fact, together with $h_n^d\alpha_n\to 0$ as $n\to\infty$, allows us to deduce that for $n$ sufficiently large,
\[
\frac{C_2'}{t}\sqrt{n\sigma^2\max\Bigl(\ln C_1,\ \ln\frac{1}{\alpha_n^2\sigma^2}\Bigr)}
\le \frac{\mathrm{Cst}}{t}\sqrt{\frac{n h_n^d}{\alpha_n}\ln\Bigl(\frac{1}{\alpha_n h_n^d}\Bigr)}. \tag{21}
\]
Our task now is to control the first term in (20). We have
\[
16C\Bigl(\frac{\|K\|_\infty}{\alpha_n\min(\sigma,1)}\Bigr)^{\nu}\exp(-n\sigma^2\alpha_n^2/\|K\|_\infty^2)
\le L_0\exp\Bigl(-n h_n^d\alpha_n\Bigl(L_1 + \frac{\nu\ln(\alpha_n h_n^d\wedge\alpha_n^2)}{2\,n h_n^d\alpha_n}\Bigr)\Bigr),
\]
which tends to $0$ as $n\to\infty$ as soon as $n h_n^d\alpha_n\to\infty$ and $\ln(\alpha_n h_n^d\wedge\alpha_n^2)/(n h_n^d\alpha_n)\to 0$. Hence, for
\[
t \ge 48\,\mathrm{Cst}\sqrt{\frac{n h_n^d}{\alpha_n}\ln\Bigl(\frac{1}{\alpha_n h_n^d}\Bigr)} =: t_n,
\]
inequalities (20) and (21) give, for $n$ sufficiently large and $t\ge t_n$,
\[
\mathbb P\Bigl(\bigl\|\sum_{i=1}^n\varepsilon_i g(X_i,Y_i)\bigr\|_{\mathcal G}>t\Bigr)\le\frac{1}{24}.
\]
We conclude, using this fact together with Inequality (17),
\[
\mathbb E\bigl\|\sum_{i=1}^n\varepsilon_i g(X_i,Y_i)\bigr\|_{\mathcal G}
\le c\Bigl(\sqrt{\frac{n h_n^d}{\alpha_n}\ln\Bigl(\frac{1}{\alpha_n h_n^d}\Bigr)} + \frac{1}{\alpha_n}\Bigr)
= O\Bigl(\sqrt{\frac{n h_n^d}{\alpha_n}\ln\Bigl(\frac{1}{\alpha_n h_n^d}\Bigr)}\Bigr). \tag{22}
\]
Recalling that
\[
\Theta_n = \sup_{x\in\mathbb R^d,\,\varepsilon\in[-\varepsilon_0,\varepsilon_0]}
\frac{|\hat\psi_n(q(1+\varepsilon),x) - \mathbb E(\hat\psi_n(q(1+\varepsilon),x))|}{\bar F(q|x)},
\]
and collecting (22) and (16), yields
\[
\mathbb P\Bigl(\Theta_n > A_1 n^{-1} h_n^{-d}\Bigl(\mathbb E\bigl\|\sum_{i=1}^n\varepsilon_i g(X_i,Y_i)\bigr\|_{\mathcal G} + t\Bigr)\Bigr)
\le 2\exp\Bigl(-A_2\tilde L^2\,\frac{\ln\ln n}{L}\Bigr)
+ 2\exp\Bigl(-\frac{A_2\tilde L}{\|K\|_\infty}\sqrt{n h_n^d\alpha_n\ln\ln n}\Bigr), \tag{23}
\]
for any $t \ge \mathrm{Cst}\Bigl(\sqrt{\frac{n h_n^d}{\alpha_n}\ln\bigl(\frac{1}{\alpha_n h_n^d}\bigr)} \vee \tilde L\sqrt{\frac{n h_n^d}{\alpha_n}\ln\ln n}\Bigr)$ and some $\tilde L > 0$. Now,
\[
\frac{\ln n}{n\alpha_n h_n^d} \le \frac{\ln(n\alpha_n h_n^d)}{n\alpha_n h_n^d} - \frac{\ln(\alpha_n h_n^d\wedge\alpha_n^2)}{n\alpha_n h_n^d},
\]
and both terms on the right-hand side tend to $0$ as $n$ tends to infinity, the first one since $\lim_{n\to\infty} n\alpha_n h_n^d = \infty$ and the second one by (5). Hence
\[
\lim_{n\to\infty}\frac{\ln n}{n\alpha_n h_n^d} = 0,
\]
which proves that $n h_n^d\alpha_n \ge \ln n$ for $n$ sufficiently large. We then conclude from (23) that
\[
\mathbb P\Bigl(\Theta_n > A_1 n^{-1} h_n^{-d}\Bigl(\mathbb E\bigl\|\sum_{i=1}^n\varepsilon_i g(X_i,Y_i)\bigr\|_{\mathcal G} + t\Bigr)\Bigr) \le 4(\ln n)^{-\rho},
\]
since $n h_n^d\alpha_n \ge \ln n$. We choose $\tilde L$ in such a way that $\rho > 1$. Proposition 3 then follows from the Borel-Cantelli lemma.

We continue the proof of Theorem 1. Inequality (10), together with Lemma 1 and Condition (4), gives, for some universal positive constant $C$,
\[
\Bigl|\frac{\bar F(\hat q_n(\alpha_n|x)|x)}{\bar F(q(\alpha_n|x)|x)} - 1\Bigr|
\le \frac{\kappa}{n h^d\alpha_n\,\hat g_n(x)}
+ C\sup_{|\varepsilon|\le\varepsilon_0} A(q(1+\varepsilon), q(1+\varepsilon), x, h_n)
+ C\sup_{|\varepsilon|\le\varepsilon_0}\bigl(1 + A(q(1+\varepsilon), q(1+\varepsilon), x, h_n)\bigr)\frac{|\hat g_n(x) - \mathbb E\hat g_n(x)|}{\hat g_n(x)}
+ C\sup_{|\varepsilon|\le\varepsilon_0}\frac{|\hat\psi_n(q(1+\varepsilon),x) - \mathbb E(\hat\psi_n(q(1+\varepsilon),x))|}{\bar F(q|x)\,\hat g_n(x)}. \tag{24}
\]
We first use Einmahl and Mason's result (cf. Proposition 2 above); all of its requirements are satisfied under the assumptions of Theorem 1. This gives that, for $c\ln n/n \le h_n^d \le 1$,
\[
\sup_{x\in\mathbb R^d}\sqrt{\frac{n h_n^d}{\ln(h_n^{-d})\vee\ln\ln n}}\,|\hat g_n(x) - \mathbb E\hat g_n(x)| < C, \tag{25}
\]
almost surely. Our task now is to apply Proposition 3. We first claim the following.

Lemma 2. Under Condition $(K_3)$, the class of functions $\mathcal F$ defined by (11) satisfies $\mathcal N(\varepsilon,\mathcal F)\le C\varepsilon^{-\nu}$ for some $C>0$, $\nu>1$ and all $\varepsilon>0$.
Proof of Lemma 2. Write $\mathcal F = \mathcal K\cdot\mathcal I$, where the set of functions $\mathcal K$ is $\mathcal K = \{u\mapsto K((x-u)/h),\ x\in\mathbb R^d,\ h>0\}$ and $\mathcal I = \{v\mapsto\mathbf 1_{v>q(1+\varepsilon)},\ x\in\mathbb R^d,\ n\in\mathbb N,\ |\varepsilon|\le\varepsilon_0\}$. The proof of Lemma 2 then follows from Lemma A.1 of [12], since $\mathcal N(\varepsilon,\{v\mapsto\mathbf 1_{v>y},\ y\in\mathbb R\})\le C\varepsilon^{-\tilde\nu}$ with $\tilde\nu>0$ and $C>0$.
Consequently, all the requirements of Proposition 3 are satisfied under the assumptions of Theorem 1. The conclusion of Proposition 3, together with (25), (24) and the facts that
\[
\sqrt{\frac{\ln(h_n^{-d})\vee\ln\ln n}{n h_n^d}} \le \sqrt{\frac{\ln(\alpha_n^{-1}h_n^{-d})\vee\ln\ln n}{n h_n^d\alpha_n}},
\qquad
\frac{1}{n h_n^d\alpha_n} \le \sqrt{\frac{\ln(\alpha_n^{-1}h_n^{-d})\vee\ln\ln n}{n\alpha_n h_n^d}},
\]
completes the proof of Theorem 1.
4.2 Proof of Proposition 1

Let us introduce, for $i = 1,\dots,n$, the triangular array of i.i.d. random variables $Z_i^{(n)}(x)$ defined by $Z_i^{(n)}(x) = Y_i\,\mathbf 1_{\|x - X_i\|\le h}$. Their common survival function can be expanded as
\[
\bar\Psi_n(t,x) = \mathbb P(Z_1^{(n)}(x) > t) = \int_{\|x-u\|\le h}\bar F(t|u)\,g(u)\,du
= h^d\int_{\|v\|\le 1}\bar F(t|x-hv)\,g(x-hv)\,dv,
\]
or equivalently,
\[
\frac{\bar\Psi_n(t,x)}{h^d\bar F(t|x)g(x)}
= \int_{\|v\|\le 1} dv
+ \int_{\|v\|\le 1}\Bigl(\frac{\bar F(t|x-hv)}{\bar F(t|x)} - 1\Bigr)\frac{g(x-hv)}{g(x)}\,dv
+ \int_{\|v\|\le 1}\Bigl(\frac{g(x-hv)}{g(x)} - 1\Bigr)dv.
\]
Letting $v_d = \int_{\|v\|\le 1} dv$ be the volume of the unit sphere and assuming that $g$ is Lipschitzian, it follows that
\[
\frac{\bar\Psi_n(t,x)}{h^d\bar F(t|x)g(x)} = v_d + o(1) + O(A(t,t,x,0,h)),
\]
and, introducing $\beta_n(x) = n\bar\Psi_n(q(\alpha_n|x), x)$, we obtain
\[
\beta_n(x) = v_d\,g(x)\,n h^d\alpha_n\,(1 + o(1))
\]
under the condition $A(q(\alpha_n|x), q(\alpha_n|x), x, 0, h)\to 0$ as $n\to\infty$. We now need the following lemma (also valid for triangular arrays).

Lemma 3 (Klass, 1985 [23]). Let $Z, Z_1, Z_2,\dots$ be a sequence of i.i.d. random vectors and define $M_n = \max\{Z_1,\dots,Z_n\}$. Suppose that $(b_n)$ is nondecreasing, $\mathbb P(Z > b_n)\to 0$ and $n\mathbb P(Z > b_n)\to\infty$ as $n\to\infty$. If, moreover,
\[
\sum_{n=1}^\infty \mathbb P(Z > b_n)\exp\{-n\mathbb P(Z > b_n)\} = \infty,
\]
then $\limsup_{n\to\infty} M_n/b_n < 1$ a.s.

From Lemma 3, a sufficient condition for
\[
\limsup_{n\to\infty}\frac{\max_{1\le i\le n} Z_i^{(n)}(x)}{q(\alpha_n|x)} < 1 \quad\text{a.s.} \tag{26}
\]
is
\[
\sum_{n=1}^\infty \frac{\beta_n(x)}{n}\exp\{-\beta_n(x)\} = \infty,
\]
which is fulfilled under (6). Finally,
\[
\frac{\hat q_n(\alpha_n|x)}{q(\alpha_n|x)} \le \frac{\max_{1\le i\le n} Z_i^{(n)}(x)}{q(\alpha_n|x)},
\]
and the conclusion follows from (26).
4.3 Proofs of corollaries

Proof of Corollary 1. For all $\tau\in[0,1]$ and $(u,x)$ such that $d(u,x)\le Rh_n$, we have
\[
\frac{\bar F(q(\alpha_n|x)(1+\varepsilon)\,|\,u)}{\bar F(q(\alpha_n|x)(1+\varepsilon)^\tau\,|\,x)}
= q(\alpha_n|x)^{\theta(x)-\theta(u)}(1+\varepsilon)^{\tau\theta(x)-\theta(u)}
= \exp\Bigl\{\frac{\theta(u)-\theta(x)}{\theta(x)}\log\alpha_n + (\tau\theta(x)-\theta(u))\log(1+\varepsilon)\Bigr\}
= \exp\{O(h_n\log\alpha_n) + O(\tau\theta(x)-\theta(u))\}
= (1 + O(h_n\log\alpha_n))\exp\{O(\tau\theta(x)-\theta(u))\}. \tag{27}
\]
Assuming that $h_n\log\alpha_n\to 0$ as $n\to\infty$, it follows that (27) is bounded above for all $\tau\in[0,1]$ and $(u,x)$ such that $d(u,x)\le Rh_n$, and therefore condition (4) is fulfilled. If, moreover, $\tau = 1$, then $O(\tau\theta(x)-\theta(u)) = O(\theta(x)-\theta(u)) = O(h_n)$ and thus (27) tends to one as $n$ goes to infinity. Proposition 1 then entails that assumption (3) holds. Theorem 1 implies that
\[
\Bigl|1 - \frac{\bar F(\hat q_n(\alpha_n|x)|x)}{\bar F(q(\alpha_n|x)|x)}\Bigr|
= \Bigl|1 - \Bigl(\frac{\hat q_n(\alpha_n|x)}{q(\alpha_n|x)}\Bigr)^{-\theta(x)}\Bigr| \to 0
\]
almost surely as $n\to\infty$. The conclusion follows.

Proof of Corollary 2. For all $\tau\in[0,1]$ and $(u,x)$ such that $d(u,x)\le Rh_n$, we have
\[
\frac{\bar F(q(\alpha_n|x)(1+\varepsilon)\,|\,u)}{\bar F(q(\alpha_n|x)(1+\varepsilon)^\tau\,|\,x)}
= \exp\Bigl\{(1+\varepsilon)\log(\alpha_n)\Bigl(\frac{\theta(u)-\theta(x)}{\theta(x)} + 1 - (1+\varepsilon)^{\tau-1}\Bigr)\Bigr\}
= \exp\{O(h_n\log\alpha_n)\}\exp\bigl\{(1+\varepsilon)\log(\alpha_n)\bigl(1-(1+\varepsilon)^{\tau-1}\bigr)\bigr\}
= (1 + O(h_n\log\alpha_n))\,o(1) = o(1). \tag{28}
\]
Assuming that $h_n\log\alpha_n\to 0$ as $n\to\infty$, it follows that (28) tends to zero as $n$ goes to infinity, so that assumptions (3) and (4) both hold. Theorem 1 implies that
\[
\Bigl|1 - \frac{\bar F(\hat q_n(\alpha_n|x)|x)}{\bar F(q(\alpha_n|x)|x)}\Bigr|
= \bigl|1 - \exp\bigl((q(\alpha_n|x) - \hat q_n(\alpha_n|x))\theta(x)\bigr)\bigr| \to 0
\]
almost surely as $n\to\infty$. The conclusion follows.
Appendix: proof of auxiliary results

Proof of Lemma 1. Clearly,
\[
\Bigl|\frac{\bar F_n(y|x)}{\bar F(y|x)} - 1\Bigr|
\le \Bigl|\frac{\bar F_n(y|x)}{\bar F(y|x)} - \frac{\mathbb E(\hat\psi_n(y,x))}{\bar F(y|x)\,\mathbb E(\hat g_n(x))}\Bigr|
+ \Bigl|\frac{\mathbb E(\hat\psi_n(y,x))}{\bar F(y|x)\,\mathbb E(\hat g_n(x))} - 1\Bigr|. \tag{29}
\]
We have
\[
\mathbb E(\hat\psi_n(y,x)) = \frac1n\sum_{i=1}^n\mathbb E(K_h(x-X_i)\mathbf 1_{Y_i>y}) = \mathbb E(K_h(x-X_1)\mathbf 1_{Y_1>y})
= \mathbb E\bigl(K_h(x-X_1)\,\mathbb P(Y_1>y|X_1)\bigr)
= \int K_h(x-z)\,\mathbb P(Y_1>y|X_1=z)\,g(z)\,dz
= \int K_h(x-z)\,\bar F(y|z)\,g(z)\,dz,
\]
and $\mathbb E(\hat g_n(x)) = \frac1n\sum_{i=1}^n\mathbb E(K_h(x-X_i)) = \mathbb E(K_h(x-X_1))$. Consequently,
\[
\frac{\mathbb E(\hat\psi_n(y,x))}{\bar F(y|x)\,\mathbb E(\hat g_n(x))} - 1
= \frac{1}{\mathbb E(K_h(x-X_1))}\int K_h(x-z)\Bigl[\frac{\bar F(y|z)}{\bar F(y|x)} - 1\Bigr]g(z)\,dz
= \frac{1}{\mathbb E(K_h(x-X_1))}\int K_h(u)\Bigl[\frac{\bar F(y|x-u)}{\bar F(y|x)} - 1\Bigr]g(x-u)\,du.
\]
We conclude, since the kernel $K$ is compactly supported, that for some $R>0$,
\[
\Bigl|\frac{\mathbb E(\hat\psi_n(y,x))}{\bar F(y|x)\,\mathbb E(\hat g_n(x))} - 1\Bigr|
\le \sup_{x':\,d(x,x')\le hR}\Bigl|\frac{\bar F(y|x')}{\bar F(y|x)} - 1\Bigr| = A(y,y,x,h). \tag{30}
\]
Now,
\[
\Bigl|\frac{\bar F_n(y|x)}{\bar F(y|x)} - \frac{\mathbb E(\hat\psi_n(y,x))}{\bar F(y|x)\,\mathbb E(\hat g_n(x))}\Bigr|
\le \frac{|\hat\psi_n(y,x) - \mathbb E(\hat\psi_n(y,x))|}{\bar F(y|x)\,\hat g_n(x)}
+ \frac{\mathbb E(\hat\psi_n(y,x))}{\bar F(y|x)}\cdot\frac{|\hat g_n(x) - \mathbb E\hat g_n(x)|}{\hat g_n(x)\,\mathbb E\hat g_n(x)}
\le \frac{|\hat\psi_n(y,x) - \mathbb E(\hat\psi_n(y,x))|}{\bar F(y|x)\,\hat g_n(x)}
+ (1 + A(y,y,x,h))\frac{|\hat g_n(x) - \mathbb E\hat g_n(x)|}{\hat g_n(x)},
\]
by (30). The last bound together with (30) and (29) proves Lemma 1.
References

[1] Abdi, S., Abdi, A., Dabo-Niang, S. and Diop, A. (2010). Consistency of a nonparametric conditional quantile estimator for random fields. Mathematical Methods of Statistics, 19(1), 1-21.
[2] Beirlant, J. and Goegebeur, Y. (2004). Local polynomial maximum likelihood estimation for Pareto-type distributions. Journal of Multivariate Analysis, 89, 97-118.
[3] Berlinet, A., Gannoun, A. and Matzner-Lober, E. (2001). Asymptotic normality of convergent estimates of conditional quantiles. Statistics, 35, 139-169.
[4] Beirlant, J., Goegebeur, Y., Segers, J. and Teugels, J. (2004). Statistics of Extremes: Theory and Applications. Wiley.
[5] Chavez-Demoulin, V. and Davison, A.C. (2005). Generalized additive modelling of sample extremes. Journal of the Royal Statistical Society, Series C, 54, 207-222.
[6] Chernozhukov, V. (2005). Extremal quantile regression. The Annals of Statistics, 33(2), 806-839.
[7] Daouia, A., Gardes, L., Girard, S. and Lekina, A. (2011). Kernel estimators of extreme level curves. Test, 20, 311-333.
[8] Daouia, A., Gardes, L. and Girard, S. (2013). On kernel smoothing for extremal quantile regression. Bernoulli, 19, 2557-2589.
[9] Davison, A.C. and Ramesh, N.I. (2000). Local likelihood smoothing of sample extremes. Journal of the Royal Statistical Society, Series B, 62, 191-208.
[10] Davison, A.C. and Smith, R.L. (1990). Models for exceedances over high thresholds. Journal of the Royal Statistical Society, Series B, 52, 393-442.
[11] Dabo-Niang, S. and Laksaci, A. (2012). Nonparametric quantile regression estimation for functional dependent data. Communications in Statistics - Theory and Methods, 41(7), 1254-1268.
[12] Einmahl, U. and Mason, D.M. (2000). An empirical process approach to the uniform consistency of kernel-type function estimators. Journal of Theoretical Probability, 13, 1-37.
[13] Einmahl, U. and Mason, D.M. (2005). Uniform in bandwidth consistency of kernel-type function estimators. The Annals of Statistics, 33(3), 1380-1403.
[14] Ezzahrioui, M. and Ould-Saïd, E. (2008). Asymptotic results of a nonparametric conditional quantile estimator for functional time series. Communications in Statistics - Theory and Methods, 37(16-17), 2735-2759.
[15] Gardes, L. and Girard, S. (2008). A moving window approach for nonparametric estimation of the conditional tail index. Journal of Multivariate Analysis, 99, 2368-2388.
[16] Gardes, L. and Girard, S. (2010). Conditional extremes from heavy-tailed distributions: An application to the estimation of extreme rainfall return levels. Extremes, 13, 177-204.
[17] Gardes, L. and Girard, S. (2012). Functional kernel estimators of large conditional quantiles. Electronic Journal of Statistics, 6, 1715-1744.
[18] Gardes, L., Girard, S. and Lekina, A. (2010). Functional nonparametric estimation of conditional extreme quantiles. Journal of Multivariate Analysis, 101, 419-433.
[19] Girard, S. and Jacob, P. (2008). Frontier estimation via kernel regression on high power-transformed data. Journal of Multivariate Analysis, 99, 403-420.
[20] Girard, S. and Menneteau, L. (2005). Central limit theorems for smoothed extreme value estimates of Poisson point processes boundaries. Journal of Statistical Planning and Inference, 135, 433-460.
[21] Hall, P. and Tajvidi, N. (2000). Nonparametric analysis of temporal trend when fitting parametric models to extreme-value data. Statistical Science, 15, 153-167.
[22] Jurečková, J. (2007). Remark on extreme regression quantile. Sankhya, 69, Part 1, 87-100.
[23] Klass, M. (1985). The Robbins-Siegmund series criterion for partial maxima. The Annals of Probability, 13(4), 1369-1370.
[24] Ledoux, M. and Talagrand, M. (1991). Probability in Banach Spaces. Ergebnisse der Mathematik, Springer.
[25] Ould-Saïd, E., Yahia, D. and Necir, A. (2009). A strong uniform convergence rate of a kernel conditional quantile estimator under random left-truncation and dependent data. Electronic Journal of Statistics, 3, 426-445.
[26] Park, B.U. (2001). On nonparametric estimation of data edges. Journal of the Korean Statistical Society, 30, 265-280.
[27] Samanta, T. (1989). Non-parametric estimation of conditional quantiles. Statistics and Probability Letters, 7, 407-412.
[28] Smith, R.L. (1989). Extreme value analysis of environmental time series: An application to trend detection in ground-level ozone (with discussion). Statistical Science, 4, 367-393.
[29] Stone, C.J. (1977). Consistent nonparametric regression (with discussion). The Annals of Statistics, 5, 595-645.
[30] Stute, W. (1986). Conditional empirical processes. The Annals of Statistics, 14, 638-647.
[31] Tsay, R.S. (2002). Analysis of Financial Time Series. Wiley, New York.