On estimation in a spatial functional linear regression model with derivatives

Stephane Bouka b,1, Sophie Dabo-Niang a,c, Guy Martial Nkiet b

a Laboratoire LEM, CNRS 9221, University of Lille, France
b Laboratoire URMI, Univ. Masuku, Franceville, Gabon
c INRIA-MODAL, Lille, France
Abstract

This paper deals with functional linear regression for spatial data. We study the asymptotic properties of an estimator of a linear model in which a spatial scalar response variable is related to a spatial functional explanatory variable and to its derivative. Convergence results, with rates, are derived for this estimator.

Keywords: Functional regression, mixing spatial process.

To cite this article: S. Bouka, S. Dabo-Niang, G.M. Nkiet, C. R. Acad. Sci. Paris, Ser. I 355 (2017).

Résumé

Estimation in a spatial functional linear regression model with derivatives. This note addresses the estimation of functional linear regression in a spatial setting. We study the asymptotic properties of the estimator of a model in which a real-valued response variable is related to a functional covariate and to its derivative. We establish convergence results for this estimator, and rates of convergence are given.

Keywords: functional regression, mixing spatial process.

Abridged version

We consider the model defined by

Y_i = ⟨β, X_i⟩_E + ⟨γ, X'_i⟩_F + ε_i,  i ∈ D ⊂ Z^d, d ≥ 2,

where E is the Sobolev space of order (2, 1) (f ∈ E means that f and f' belong to F := L²[0, 1]), β and γ are unknown functions belonging to E and F respectively, X_i and ε_i are centered, independent random variables with values in E and R respectively, and X'_i denotes the first derivative of X_i. We assume that (Y_i, X_i) has the same distribution as a random vector (Y, X), and that the process is observed in a region I_n = {1, 2, ..., n}^d, where n = (n, ..., n), n ∈ N*. We are interested in the estimation of the parameters β and γ, and then in prediction at a non-visited site. The estimator of (β, γ) is (β̂_n, γ̂_n), defined as in [10] from the empirical estimators given in (2) to (5) by β̂_n = (S_{n,β} + ψ_n I)^{−1} u_{n,β} and γ̂_n = (S_{n,γ} + ψ_n I)^{−1} u_{n,γ}. Letting T_n (resp. T) be one of the operators Γ_n, Γ'_n, Γ'*_n and Γ''_n (resp. Γ, Γ', Γ'*, Γ''), we have:

Theorem 3.1. Under Assumptions 3.1–3.5 with α_{1,∞}(t) = O(t^{−θ}), θ ≥ d + 1, we have:

E(‖T_n − T‖²_∞) = O(n^{−d} log n).

Writing HS for the space of Hilbert–Schmidt operators and considering the semi-norms ‖·‖_Γ := ‖Γ^{1/2}(·)‖_E and ‖·‖_{Γ''} := ‖Γ''^{1/2}(·)‖_F, we deduce:

Corollary 3.1. Let (v_j)_{j≥1} be an orthonormal basis of E consisting of eigenvectors associated with the eigenvalues (λ_j)_{j≥1} of Γ, with λ_j = O(r^j), 0 < r < 1, j ≥ 1. Then, under the assumptions of Theorem 3.1, we have:

(i) ‖T_n − T‖_{L²(HS)} = O(n^{−d/2} log n);

(ii) ‖β − β̂_n‖²_Γ = O_p(ψ_n²/φ_n²) + O_p((log n)²/(φ_n² ψ_n² n^d)) and ‖γ − γ̂_n‖²_{Γ''} = O_p(ψ_n²/φ_n²) + O_p((log n)²/(φ_n² ψ_n² n^d)).

Setting I_{n+1_d} = {1, 2, ..., n + 1}^d, the predictor and its "theoretical" version at a non-visited site j ∈ I_{n+1_d} \ I_n are defined respectively by Ŷ_j = ⟨β̂_n, X_j⟩_E + ⟨γ̂_n, X'_j⟩_F and Y*_j = ⟨β, X_j⟩_E + ⟨γ, X'_j⟩_F. We then have:

Corollary 3.2. Under the assumptions of Corollary 3.1, for every j ∈ I_{n+1_d} \ I_n:

E[(Ŷ_j − Y*_j)²] = O(ψ_n²/φ_n²) + O((log n)²/(φ_n² ψ_n² n^d)).

Email addresses: [email protected] (Stephane Bouka), [email protected] (Sophie Dabo-Niang), [email protected] (Guy Martial Nkiet).
1 Corresponding author.
Preprint submitted to Statistics, 13 October 2017.

1. Introduction

Several types of functional linear models for independent data have been developed over the years, serving different purposes. The most studied is perhaps the functional linear model for scalar response, originally introduced by [7]. Estimation and prediction problems for this model and some of its generalizations have been tackled mainly for independent data (see, e.g., [1], [3], [9], [10]). Some works exist on functional spatial linear prediction using kriging methods (see, e.g., [4], [5], [8], [11]); they highlight the interest of considering spatial functional linear models. In this paper, the results obtained in [10] on estimation and prediction in the functional linear model with derivatives for independent data are extended to the spatial case. Namely, we consider the model given by:

Y_i = ⟨β, X_i⟩_E + ⟨γ, X'_i⟩_F + ε_i,  i ∈ D ⊂ Z^d, d ≥ 2,

(1)

where E is the Sobolev space of order (2, 1) (f ∈ E means that f and f' belong to F := L²[0, 1]), β and γ are unknown functions belonging to E and F respectively, X_i and ε_i are centered and independent random variables defined on a probability space (Ω, A, P) with values in E and R respectively, and X'_i stands for the first derivative of X_i. We assume that (Y_i, X_i) has the same distribution as a random vector (Y, X), and that the process is observed in a region I_n = {1, 2, ..., n}^d, where n = (n, ..., n), n ∈ N*. In Section 2, estimators of β and γ are introduced, as well as a predictor at a non-visited site. Section 3 gives the assumptions and establishes the asymptotic results, namely convergences with rates for the estimates. Finally, Section 4 is devoted to some indications for proving the theoretical results presented in this note.
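To fix ideas, model (1) can be simulated. The sketch below is illustrative and not part of the note: it assumes d = 2, takes X_i to be a random Fourier series on [0, 1] (so that X'_i is available in closed form), draws the curves independently across sites (a trivially mixing field), and uses arbitrary choices of β, γ and the error standard deviation.

```python
import numpy as np

# Minimal simulation sketch of model (1) with d = 2 (illustrative assumptions).
rng = np.random.default_rng(0)
n, J = 20, 5                     # side length of the grid I_n, Fourier terms
t = np.linspace(0.0, 1.0, 201)   # discretization of [0, 1]
dt = t[1] - t[0]

beta = np.sin(2 * np.pi * t)     # assumed "true" parameter in E
gamma = np.cos(2 * np.pi * t)    # assumed "true" parameter in F

def inner(f, g):
    """Approximate L2[0,1] inner product on the grid t."""
    return float(np.sum(f * g) * dt)

ks = np.arange(1, J + 1)[:, None]   # Fourier frequencies 1..J
X = np.empty((n, n, t.size))        # curves X_i at each site i
Xp = np.empty_like(X)               # derivatives X_i'
Y = np.empty((n, n))                # responses Y_i
for i in range(n):
    for j in range(n):
        a = rng.normal(scale=1.0 / np.arange(1, J + 1))  # decaying random scores
        # X(t) = sum_k a_k * sqrt(2) * sin(2 pi k t); differentiate term by term
        X[i, j] = (a[:, None] * np.sqrt(2) * np.sin(2 * np.pi * ks * t)).sum(0)
        Xp[i, j] = (a[:, None] * np.sqrt(2) * 2 * np.pi * ks
                    * np.cos(2 * np.pi * ks * t)).sum(0)
        Y[i, j] = inner(beta, X[i, j]) + inner(gamma, Xp[i, j]) + rng.normal(0.0, 0.1)
```

Any other basis with differentiable elements would do equally well; the Fourier choice only makes the derivative exact on the grid.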

2. Estimation and prediction

We propose estimators of β and γ by using the same approach as in [10]. Denoting by ⊗_E (resp. ⊗_F) the tensor product defined by (u ⊗_E v)(h) = ⟨u, h⟩_E v (resp. (u ⊗_F v)(h) = ⟨u, h⟩_F v), we consider the covariance and cross-covariance operators

Γ = E(X ⊗_E X),  Γ'* = E(X ⊗_E X'),  Γ' = E(X' ⊗_F X),  Γ'' = E(X' ⊗_F X'),

and we set

Δ = E(Y X) ∈ E,  Δ' = E(Y X') ∈ F,  S_β = Γ − Γ' Γ''^{−1} Γ'*,  S_γ = Γ'' − Γ'* Γ^{−1} Γ'.

Empirical versions of these operators and functions are given by:

Γ_n = (1/n^d) Σ_{i∈I_n} X_i ⊗_E X_i,  Γ'_n = (1/n^d) Σ_{i∈I_n} X'_i ⊗_F X_i,  Δ_n = (1/(n−1)^d) Σ_{i∈I_n} Y_i X_i,   (2)

Γ'*_n = (1/n^d) Σ_{i∈I_n} X_i ⊗_E X'_i,  Γ''_n = (1/n^d) Σ_{i∈I_n} X'_i ⊗_F X'_i,  Δ'_n = (1/(n−1)^d) Σ_{i∈I_n} Y_i X'_i.   (3)

Then, we consider the operators

Γ̃_n^{−1} = (Γ_n + φ_n I)^{−1},  Γ̃''_n^{−1} = (Γ''_n + φ_n I)^{−1},  S_{n,β} = Γ_n − Γ'_n Γ̃''_n^{−1} Γ'*_n,  S_{n,γ} = Γ''_n − Γ'*_n Γ̃_n^{−1} Γ'_n,   (4)

where φ_n and ψ_n are strictly positive real sequences that tend to zero as n → +∞ and I is the identity operator, and we put

u_{n,β} = Δ_n − Γ'_n Γ̃''_n^{−1} Δ'_n,  u_{n,γ} = Δ'_n − Γ'*_n Γ̃_n^{−1} Δ_n.   (5)

The estimate of the pair (β, γ) is (β̂_n, γ̂_n) defined by:

β̂_n = (S_{n,β} + ψ_n I)^{−1} u_{n,β},  γ̂_n = (S_{n,γ} + ψ_n I)^{−1} u_{n,γ}.

Let I_{n+1_d} = {1, 2, ..., n + 1}^d; the predictor at a non-visited site j ∈ I_{n+1_d} \ I_n is defined as

Ŷ_j = ⟨β̂_n, X_j⟩_E + ⟨γ̂_n, X'_j⟩_F.
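The ridge-type construction in (2)–(5) can be sketched numerically. The sketch below is ours, not the authors' code: functions are discretized on a grid of m points, the Sobolev inner product on E is replaced by the L²[0, 1] one for simplicity, each operator becomes an m × m matrix acting through a quadrature weight dt, and the (n−1)^d normalization of Δ_n, Δ'_n is replaced by a plain mean. The function name `fit_spatial_flr` and the default values of φ_n and ψ_n are hypothetical.

```python
import numpy as np

def fit_spatial_flr(X, Xp, Y, t, phi=0.1, psi=0.1):
    """Rough discretized sketch of the estimators of Section 2.

    X, Xp: arrays of shape (n, n, m) holding X_i and X_i' on a grid t of m
    points; Y: (n, n) responses.  The Sobolev inner product on E is
    approximated by the L2[0,1] one, so every operator is an m x m matrix
    acting through the quadrature weight dt.
    """
    dt = t[1] - t[0]
    m = t.size
    Xf = X.reshape(-1, m)              # flatten the spatial index i in I_n
    Xpf = Xp.reshape(-1, m)
    Yf = Y.reshape(-1)
    N = Xf.shape[0]                    # n^d observations

    # empirical covariance / cross-covariance operators, eqs. (2)-(3)
    G   = dt * Xf.T @ Xf / N           # Gamma_n
    Gp  = dt * Xf.T @ Xpf / N          # Gamma'_n  (mean of X' (x)_F X)
    Gps = dt * Xpf.T @ Xf / N          # Gamma'*_n (mean of X (x)_E X')
    Gpp = dt * Xpf.T @ Xpf / N         # Gamma''_n
    Dn  = (Yf[:, None] * Xf).mean(0)   # Delta_n  (plain mean here)
    Dpn = (Yf[:, None] * Xpf).mean(0)  # Delta'_n

    I = np.eye(m)
    Gpp_reg = np.linalg.inv(Gpp + phi * I)   # regularized inverses, eq. (4)
    G_reg = np.linalg.inv(G + phi * I)

    S_beta = G - Gp @ Gpp_reg @ Gps          # S_{n,beta}
    S_gamma = Gpp - Gps @ G_reg @ Gp         # S_{n,gamma}
    u_beta = Dn - Gp @ Gpp_reg @ Dpn         # eq. (5)
    u_gamma = Dpn - Gps @ G_reg @ Dn

    beta_hat = np.linalg.solve(S_beta + psi * I, u_beta)
    gamma_hat = np.linalg.solve(S_gamma + psi * I, u_gamma)
    return beta_hat, gamma_hat
```

With curves simulated from model (1), the returned `beta_hat` and `gamma_hat` can be compared with β and γ on the grid, and the prediction Ŷ_j at a new site is obtained from the same discretized inner products.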

3. Assumptions and results

In order to establish the asymptotic results, the following assumptions will be considered.

Assumption 3.1 Ker Γ = Ker Γ'' = {0}, where, for any operator A, Ker A = {x : Ax = 0}.

Assumption 3.2 (β, γ) ∉ N, where N = {(β, γ) ∈ E × F : β + D*γ = 0}, D being the ordinary differential operator and D* its adjoint.

Assumption 3.3 ‖Γ^{−1/2} β‖_E < ∞ and ‖(Γ'')^{−1/2} γ‖_F < ∞.

Assumption 3.4 The process {Z_i = (X_i, Y_i, X'_i), i ∈ Z^d} is strongly mixing, that is, lim_{n→∞} α_{1,∞}(n) = 0, where

α_{1,∞}(n) = sup{α(σ(Z_i), F_G), i ∈ Z^d, G ⊂ Z^d, ρ(G, {i}) ≥ n},   (6)

α being the α-mixing coefficient defined, for two sub-σ-algebras U and V, by α(U, V) = sup{|P(A ∩ B) − P(A)P(B)|, A ∈ U, B ∈ V}, F_G = σ(Z_j ; j ∈ G), and the distance ρ being defined for any subsets G1 and G2 of Z^d by ρ(G1, G2) = min{|i − j|, i ∈ G1, j ∈ G2}, with |i − j| = max_{1≤s≤d} |i_s − j_s| for any i and j in Z^d.

Assumption 3.5 ‖X_i‖_E ≤ M a.s., where M is a strictly positive constant.

Assumptions 3.1–3.3 are technical conditions that are also considered in [10]. Assumption 3.1 is needed to identify the model defined in (1), and Assumption 3.2 makes it possible to obtain estimators of the parameters β and γ. Examples of spatial processes satisfying Assumption 3.4 can be found in [6]. It is usual to assume either that α_{1,∞}(i) tends to zero at a polynomial rate, or that α_{1,∞}(i) ≤ C exp(−si) for some C, s > 0, that is, at an exponential rate. We consider the polynomial case; our results can, however, also be proved in the exponential case.

In what follows, let the norms ‖·‖_∞ and ‖·‖_{L²(HS)} be defined by ‖T‖_∞ = sup_{x∈E} ‖Tx‖_E / ‖x‖_E and ‖T‖_{L²(HS)} = [E(‖T‖²_{HS})]^{1/2}, where HS is the space of Hilbert–Schmidt operators endowed with the inner product ⟨T, S⟩_{HS} = Σ_{i=1}^{+∞} ⟨T(u_i), S(u_i)⟩_E, (u_i)_{i≥1} being a basis of E. Let T_n (resp. T) be one of the following: Γ_n, Γ'_n, Γ'*_n and Γ''_n (resp. Γ, Γ', Γ'*, Γ''). The following result establishes the mean-square convergence of T_n to T with respect to ‖·‖_∞.

Theorem 3.1 Under Assumptions 3.1–3.5 with α_{1,∞}(t) = O(t^{−θ}), θ ≥ d + 1, we have:

E(‖T_n − T‖²_∞) = O(n^{−d} log n).

A rate of convergence of T_n with respect to the norm ‖·‖_{L²(HS)} appears in the following corollary, as well as rates for β̂_n and γ̂_n with respect to the semi-norms ‖·‖_Γ := ‖Γ^{1/2}(·)‖_E and ‖·‖_{Γ''} := ‖Γ''^{1/2}(·)‖_F respectively.

Corollary 3.1 Let (v_j)_{j≥1} be a sequence of orthonormal eigenfunctions associated with a sequence of eigenvalues (λ_j)_{j≥1} of the operator Γ, with λ_j = O(r^j), 0 < r < 1, j ≥ 1.
Then, under the assumptions of Theorem 3.1, we have:

(i) ‖T_n − T‖_{L²(HS)} = O(n^{−d/2} log n);

(ii) ‖β − β̂_n‖²_Γ = O_p(ψ_n²/φ_n²) + O_p((log n)²/(φ_n² ψ_n² n^d)) and ‖γ − γ̂_n‖²_{Γ''} = O_p(ψ_n²/φ_n²) + O_p((log n)²/(φ_n² ψ_n² n^d)).

The following corollary gives a bound on the prediction error of the predictor Ŷ_j of Y*_j = ⟨β, X_j⟩_E + ⟨γ, X'_j⟩_F at a non-visited site j ∈ I_{n+1_d} \ I_n.

Corollary 3.2 Assume that the assumptions of Corollary 3.1 hold. Then, for each j ∈ I_{n+1_d} \ I_n, we have

E[(Ŷ_j − Y*_j)²] = O(ψ_n²/φ_n²) + O((log n)²/(φ_n² ψ_n² n^d)).   (7)
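As a side illustration (ours, not from the note), one can check the arithmetic of how the two terms of the bound in (7) balance: choosing ψ_n = (log n)^{1/2} n^{−d/4} makes ψ_n²/φ_n² and (log n)²/(φ_n² ψ_n² n^d) coincide, so the bound behaves like (log n) n^{−d/2}/φ_n².

```python
import math

# Arithmetic on the bound in (7): with the assumed choice
# psi_n = (log n)^(1/2) * n^(-d/4), the two terms
#   psi_n^2 / phi_n^2   and   (log n)^2 / (phi_n^2 * psi_n^2 * n^d)
# are equal, both reducing to (log n) * n^(-d/2) / phi_n^2.
def bound_terms(n, d, phi):
    psi = math.sqrt(math.log(n)) * n ** (-d / 4)
    t1 = psi ** 2 / phi ** 2
    t2 = math.log(n) ** 2 / (phi ** 2 * psi ** 2 * n ** d)
    return t1, t2
```

For instance, with d = 2 and φ_n held fixed, both terms decay like (log n)/n, consistent with the n^{−d/2} rate modulo the logarithmic factor.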

4. Brief outline of proofs

For the sake of simplicity, we only deal with the empirical operator Γ_n; similar arguments apply to the other estimators.

Proof of Theorem 3.1: Let T_n = Γ_n = (1/n^d) Σ_{i∈I_n} X_i ⊗_E X_i and T = E(X ⊗_E X). Recall that

E(‖T_n − T‖²_∞) = sup_{x∈E} E(‖T_n x − T x‖²_E) / ‖x‖²_E.

Let L_ij = ⟨⟨X_i, x⟩_E X_i − E(⟨X, x⟩_E X), ⟨X_j, x⟩_E X_j − E(⟨X, x⟩_E X)⟩_E. Then we have

E(‖T_n x − T x‖²_E) = (1/n^{2d}) Σ_{i∈I_n} E(‖⟨X_i, x⟩_E X_i − E(⟨X, x⟩_E X)‖²_E) + (1/n^{2d}) Σ_{i≠j} E(L_ij) := A + B,

where

A = (1/n^{2d}) Σ_{i∈I_n} E(‖⟨X_i, x⟩_E X_i − E(⟨X, x⟩_E X)‖²_E) ≤ 4M⁴ ‖x‖²_E / n^d = c₁ ‖x‖²_E / n^d,

with c₁ a positive constant, and

B = (1/n^{2d}) Σ_{0<|i−j|≤C_n} E(L_ij) + (1/n^{2d}) Σ_{|i−j|>C_n} E(L_ij) := B₁ + B₂.

It is easy to show that E(|L_ij|) ≤ E((L_ii L_jj)^{1/2}) ≤ 4M⁴ ‖x‖²_E. Then, we obtain

|B₁| ≤ c₂ ‖x‖²_E log n / n^d

when C_n = ⌊(log n)^{1/d}⌋ (⌊x⌋ standing for the integer part of x), c₂ being a positive constant. By stationarity, the boundedness of X (‖X‖_E ≤ M a.s.) and Lemma 2.1 (ii) in [12], we have |E(L_ij)| ≤ C ‖x‖²_E M⁴ α_{1,∞}(|i − j|), where C is a positive constant. Since α_{1,∞}(t) = O(t^{−θ}) with θ ≥ d + 1, we have

|B₂| ≤ (C ‖x‖²_E M⁴ / n^d) Σ_{t=1}^{∞} t^{d−1−θ} ≤ c₃ ‖x‖²_E / n^d,

with c₃ a positive constant. Therefore, we deduce that E(‖T_n − T‖²_∞) = O(n^{−d} log n). □


Proof of Corollary 3.1: (i) Let T_n = Γ_n = (1/n^d) Σ_{i∈I_n} X_i ⊗_E X_i and T = E(X ⊗_E X). By definition, we have

‖T_n − T‖_{L²(HS)} = {E(‖T_n − T‖²_{HS})}^{1/2}.

Since (v_j)_{j≥1} is an orthonormal basis of E, we have

E(‖T_n − T‖²_{HS}) = Σ_{i=1}^{Q} E(‖T_n(v_i) − T(v_i)‖²_E) + Σ_{i>Q} E(‖T_n(v_i) − T(v_i)‖²_E) := A + B.

Let us first treat the second term B:

B = Σ_{i>Q} Σ_{j=1}^{+∞} E(⟨T_n(v_i) − T(v_i), v_j⟩²_E) ≤ 4 Σ_{i>Q} [E(⟨X, v_i⟩⁴_E)]^{1/2} Σ_{j=1}^{+∞} [E(⟨X, v_j⟩⁴_E)]^{1/2}.

Since ⟨X, v_j⟩⁴_E ≤ ‖X‖²_E ‖v_j‖²_E ⟨X, v_j⟩²_E ≤ M² ⟨X, v_j⟩²_E a.s. and E(⟨X, v_j⟩²_E) = λ_j with λ_j = O(r^j), 0 < r < 1, j ≥ 1, taking Q = ⌊K log n⌋ with K = 3d/log(1/r) gives B ≤ C₁ exp(−K (log n)(log(1/r))/2) = C₁ n^{−3d/2}, where C₁ is a positive constant. Applying Theorem 3.1 with the same Q = ⌊K log n⌋ allows us to obtain the inequality A ≤ E(‖T_n − T‖²_∞) Σ_{i=1}^{Q} ‖v_i‖²_E ≤ C n^{−d} (log n)², where C is a positive constant. This finishes the proof of Corollary 3.1 (i). The proofs of Corollary 3.1 (ii) and of Corollary 3.2 are derived from that of Corollary 3.1 (i), arguing as in [10]. □

References

[1] F. Comte, J. Johannes (2012). Adaptive functional linear regression. Annals of Statistics, 40, 2765–2797.
[2] N.A.C. Cressie (1991). Statistics for Spatial Data. Wiley Series in Probability and Mathematical Statistics, New York.
[3] A. Cuevas (2014). A partial overview of the theory of statistics with functional data. Journal of Statistical Planning and Inference, 147, 1–23.
[4] R. Giraldo (2014). Cokriging based on curves, prediction and estimation of the prediction variance. InterStat, 2, 1–30.
[5] R. Giraldo, P. Delicado, J. Mateu (2011). Ordinary kriging for function-valued spatial data. Environmental and Ecological Statistics, 18(3), 411–426.
[6] X. Guyon (1995). Random Fields on a Network: Modeling, Statistics and Applications. Springer, New York.
[7] T. Hastie, C. Mallows (1993). A statistical view of some chemometrics regression tools: Discussion. Technometrics, 35, 140–143.
[8] L. Horváth, P. Kokoszka (2012). Inference for Functional Data with Applications. Springer.
[9] B. Liu, L. Wang, J. Cao (2017). Estimating functional linear mixed-effects regression models. Computational Statistics & Data Analysis, 106, 153–164.
[10] A. Mas, B. Pumo (2009). Functional linear regression with derivatives. Journal of Nonparametric Statistics, 21, 19–40.
[11] D. Nerini, P. Monestiez, C. Manté (2010). Cokriging for spatial functional data. Journal of Multivariate Analysis, 101(2), 409–418.
[12] L.T. Tran (1990). Kernel density estimation on random fields. Journal of Multivariate Analysis, 34, 37–53.
