Nov 9, 2007 - Carleton University, Ottawa, Canada. Barbara Szyszkowicz. Carleton University, Ottawa, Canada. Qiying Wang. University of Sydney, Australia.
arXiv:0711.1385v1 [math.PR] 9 Nov 2007
Asymptotics of Studentized U-type processes for changepoint problems

Miklós Csörgő
Carleton University, Ottawa, Canada

Barbara Szyszkowicz
Carleton University, Ottawa, Canada

Qiying Wang
University of Sydney, Australia

Dedicated to the memory of Tibor Nemetz

ABSTRACT

This paper investigates weighted approximations for studentized U-statistics type processes, both with symmetric and antisymmetric kernels, only under the assumption that the distribution of the projection variate is in the domain of attraction of the normal law. The classical second moment condition $E|h(X_1, X_2)|^2 < \infty$ is also relaxed in both cases. The results can be used for testing the null assumption of having a random sample versus the alternative that there is a change in distribution in the sequence.

Key Words and Phrases: Weighted approximations in probability, functional limit theorems, U-statistics type processes, Studentization, change in distribution, symmetric and antisymmetric kernels, Gaussian processes.

AMS 2000 Subject Classification: Primary 60F17, 62G10, Secondary 62E20.

Running Head: Studentized U-type processes
The research of M. Csörgő and B. Szyszkowicz is supported by their NSERC Canada Discovery Grants at Carleton University, Ottawa, and Q. Wang's research is supported in part by the Australian Research Council at the University of Sydney.
1 Introduction and main results: the case of symmetric kernels
Let $X, X_1, X_2, \ldots$ be a sequence of non-degenerate independent real-valued random variables with distribution function $F$. Suppose we are interested in testing the null hypothesis

$H_0$: $X_i$, $1 \le i \le n$, have the same distribution,

against the one change in distribution alternative

$H_A$: there is an integer $k$, $1 \le k < n$, such that $P(X_1 \le t) = \cdots = P(X_k \le t)$, $P(X_{k+1} \le t) = \cdots = P(X_n \le t)$ for all $t$, and $P(X_k \le t_0) \ne P(X_{k+1} \le t_0)$ for some $t_0$.
Testing for this kind of a change in distribution has been studied extensively in the literature by using parametric as well as non-parametric methods. One of the nonparametric methods was proposed by Csörgő and Horváth (1988a, b), who used functionals of a U-statistics type (U-type, from now on) process to test $H_0$ against $H_A$. Let $h(x, y)$ be a measurable real valued symmetric function, i.e. $h(x, y) = h(y, x)$. The U-type process of Csörgő and Horváth (1988a, b) is defined by

$$U_n(t) = Z_{[(n+1)t]} - n^2 t(1-t)\theta, \qquad 0 \le t \le 1,$$

where $\theta = E h(X_1, X_2)$, and

$$Z_k = \sum_{i=1}^{k} \sum_{j=k+1}^{n} h(X_i, X_j), \qquad 1 \le k < n.$$
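The definitions of $Z_k$ and $U_n(t)$ can be illustrated numerically. The following is a minimal sketch, not the authors' code: the function names `z_statistics` and `u_process` are hypothetical, the kernel is passed as a vectorised function, and the conventions $Z_0 = Z_n = 0$ (empty sums) and clamping the index $[(n+1)t]$ to $n$ at $t = 1$ are assumptions made here so that the process is defined on all of $[0, 1]$.

```python
import numpy as np

def z_statistics(x, h):
    """Return (Z_1, ..., Z_{n-1}), where Z_k = sum_{i<=k} sum_{j>k} h(x_i, x_j)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    H = h(x[:, None], x[None, :])  # H[i, j] = h(x_i, x_j), via broadcasting
    return np.array([H[:k, k:].sum() for k in range(1, n)])

def u_process(x, h, theta, t):
    """Evaluate U_n(t) = Z_[(n+1)t] - n^2 t (1 - t) theta for 0 <= t <= 1."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Assumed convention: Z_0 = Z_n = 0 (empty sums); index clamped to n at t = 1.
    z = np.concatenate(([0.0], z_statistics(x, h), [0.0]))
    k = min(int(np.floor((n + 1) * t)), n)
    return z[k] - n**2 * t * (1.0 - t) * theta

x = np.array([1.0, 2.0, 3.0, 4.0])
z_product = z_statistics(x, lambda a, b: a * b)        # kernel h(x, y) = xy
z_gini = z_statistics(x, lambda a, b: np.abs(a - b))   # kernel h(x, y) = |x - y|
```

For the product kernel, $Z_k$ factorises as $(\sum_{i \le k} x_i)(\sum_{j > k} x_j)$, which gives a quick sanity check of the double sum. The $O(n^2)$ kernel matrix is used for clarity only.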
While $Z_k$ itself is not a U-statistic, it can be written as a sum of three U-statistics [cf. Csörgő and Horváth (1988a, b, 1997)]. The rationale behind the definition of $Z_k$ is to compare the first $k$ observations with the remaining $n - k$ ones for $k = 1, \ldots, n-1$, via an appropriate bivariate kernel function $h(x, y)$, in order to capture a possible change in distribution at an unknown time $k$ as postulated in $H_A$. Typical choices of symmetric kernel $h$ are $xy$, $(x - y)^2/2$ (the sample variance), $|x - y|$ (Gini's mean difference), and $\mathrm{sign}(x + y)$ (Wilcoxon's one-sample statistic).

Throughout the paper, we write $g(t) = E(h(X, t) - \theta)$, $\sigma^2 = E g^2(X_1)$ and, for later use, we define a Gaussian process $\Gamma$ by

$$\Gamma(t) = (1 - t)\, W(t) + t\, [W(1) - W(t)], \qquad 0 \le t \le 1, \tag{1}$$
where $\{W(t), 0 \le t < \infty\}$ is a standard Wiener process. Furthermore, let $Q$ be the class of positive functions $q$ on $(0, 1)$, i.e., $\inf_{\delta \le t \le 1-\delta} q(t) > 0$ for $0 < \delta < 1$, which are nondecreasing in a neighbourhood of zero and nonincreasing in a neighbourhood of one, and let

$$I(q, c) = \int_{0+}^{1-} \frac{1}{t(1-t)} \exp\left( -\frac{c\, q^2(t)}{t(1-t)} \right) dt, \qquad 0 < c < \infty.$$
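To get a feel for the limiting process, one can simulate $\Gamma$ on a grid. The sketch below is illustrative only: the function name `simulate_gamma` and the grid size are choices made here, and the Wiener process is approximated by cumulative sums of independent normal increments. Since $W(t)$ and $W(1) - W(t)$ are independent, $\mathrm{Var}\,\Gamma(t) = (1-t)^2 t + t^2 (1-t) = t(1-t)$, and $\Gamma(0) = \Gamma(1) = 0$, which the simulation reproduces.

```python
import numpy as np

def simulate_gamma(m, rng):
    """Simulate Gamma(t) = (1 - t) W(t) + t [W(1) - W(t)] on the grid t_i = i/m."""
    t = np.linspace(0.0, 1.0, m + 1)
    increments = rng.normal(0.0, np.sqrt(1.0 / m), size=m)
    w = np.concatenate(([0.0], np.cumsum(increments)))  # W(t_i), with W(0) = 0
    gamma = (1.0 - t) * w + t * (w[-1] - w)
    return t, gamma

rng = np.random.default_rng(0)
t, gamma = simulate_gamma(1000, rng)

# A weighted sup functional of the path, here with the illustrative
# weight q(t) = (t (1 - t))^(1/4), evaluated on interior grid points only.
q = (t[1:-1] * (1.0 - t[1:-1])) ** 0.25
weighted_sup = np.max(np.abs(gamma[1:-1]) / q)
```

Repeating the simulation many times gives a Monte Carlo approximation to the distribution of such functionals of $\Gamma(\cdot)/q(\cdot)$, subject to discretisation error near the endpoints.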
In terms of these notations, Csörgő and Horváth (1988a, b) and Szyszkowicz (1991, 1992) established the following result [cf. Theorem 2.4.2 in Csörgő and Horváth (1997)].

Theorem A Assume $H_0$, $0 < \sigma^2 < \infty$ and $E|h(X_1, X_2)|^2 < \infty$. Then, on an appropriate probability space for $X, X_1, X_2, \ldots$, we can define a sequence of Gaussian processes $\{\Gamma_n(t), 0 \le t \le 1\}$ such that the equality in distribution

$$\{\Gamma_n(t), 0 \le t \le 1\} =_d \{\Gamma(t), 0 \le t \le 1\} \tag{2}$$

holds for each $n \ge 1$, and as $n \to \infty$,

$$\sup_{0 < t < 1} \left| n^{-3/2} \sigma^{-1} U_n(t) - \Gamma_n(t) \right| / q(t) = o_P(1) \tag{3}$$

if and only if $I(q, c) < \infty$ for all $c > 0$.

Here $\sigma^2 > 0$. This is the so-called non-degenerate case when studying U-statistics via the function $g(t) = E(h(X, t) - \theta)$ that induces the projection of U-statistics into sums of i.i.d. random variables, the so-called Hoeffding (1948) projection principle that, in part, rests on a paper of Halmos (1946).

For functions $x, y$ in $D[0, 1]$ and $q \in Q$, we define the weighted sup-norm metric $\|/q\|$ by

$$\|(x - y)/q\| = \sup_{0 \le t \le 1} |(x(t) - y(t))/q(t)|,$$
whenever this is well defined, i.e., when $\limsup |(x(t) - y(t))/q(t)|$ is finite for $t \downarrow 0$ and $t \uparrow 1$. In view of (2) and this terminology, (3) of Theorem A implies the following weak convergence, a functional limit theorem.

Corollary A With $q \in Q$, and $\to_d$ standing for convergence in distribution as $n \to \infty$, we have

$$h\{n^{-3/2} \sigma^{-1} U_n(\cdot)/q(\cdot)\} \to_d h\{\Gamma(\cdot)/q(\cdot)\}$$

for all $h : D = D[0, 1] \to \mathbb{R}$ that are $(D, \mathcal{D})$ measurable and $\|/q\|$-continuous, or $\|/q\|$-continuous except at points forming a set of measure zero on $(D, \mathcal{D})$ with respect to the measure generated by the Gaussian $\Gamma(\cdot)$ process, if and only if $I(q, c) < \infty$ for all $c > 0$, where $\mathcal{D}$ denotes the $\sigma$-field of subsets of $D$ generated by the finite dimensional subsets of $D$.

Remark A For further use the statement of Corollary A will be summarized by writing, as $n \to \infty$,

$$n^{-3/2} \sigma^{-1} U_n(\cdot)/q(\cdot) \Rightarrow \Gamma(\cdot)/q(\cdot) \quad \text{on } (D[0, 1], \mathcal{D}, \|/q\|).$$

For a summary of notions of convergence and weak convergence in general along these lines, we refer to pages 26–28 and Remarks 2 and 3 on page 49 of Shorack and Wellner (1986), and to Sections 3.3 and 3.4 of Csörgő (2002).
Thus Theorem A provides a basic tool for investigating the asymptotic behaviour of many test statistics for testing $H_0$ versus $H_A$ via corresponding functionals of $\Gamma(\cdot)/q(\cdot)$ for appropriate choices of the kernel $h(x, y)$. This, in turn, motivates the establishment of our first result, in which we reduce the moment conditions related to the kernel $h(x, y)$. It reads as follows.

Theorem 1 Assume $H_0$, $0 < \sigma^2 < \infty$ and $E|h(X_1, X_2)|^{4/3} < \infty$. Then, on an appropriate probability space for $X, X_1, X_2, \ldots$, we can define a sequence of Gaussian processes $\{\Gamma_n(t), 0 \le t \le 1\}$ such that (2) holds true, and if $I(q, c) < \infty$ for some $c > 0$, then as $n \to \infty$,

$$\sup_{1/n \le t \le (n-1)/n} \left| n^{-3/2} \sigma^{-1} U_n(t) - \Gamma_n(t) \right| / q(t) = o_P(1). \tag{4}$$
In addition to reducing the moment conditions required in Theorem A, the result (4) of Theorem 1 generalizes (3) as well. Namely, as a direct consequence of Theorem 1, we have the following corollary.

Corollary 1 Assume $H_0$, $0 < \sigma^2 < \infty$ and $E|h(X_1, X_2)|^{4/3} < \infty$. If $q \in Q$, then

(a) we still have the conclusion of Theorem A, i.e., (3) holds true if and only if $I(q, c) < \infty$ for all $c > 0$;

(b) as $n \to \infty$,

$$n^{-3/2} \sigma^{-1} U_n(\cdot)/q(\cdot) \Rightarrow \Gamma(\cdot)/q(\cdot) \quad \text{on } (D[0, 1], \mathcal{D}, \|/q\|)$$

if and only if $I(q, c) < \infty$ for all $c > 0$;

(c) as $n \to \infty$,
n−3/2 σ −1 sup |Un (t)| q(t) →d 0i3/2 ) ]
E[|ψ(X1 , X2 )|I(k3/2 0, P
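$$
\begin{aligned}
P(I_1^*(n) \ge \epsilon) &\le 4 \epsilon^{-2} n^{-3}\, E \max_{1 \le k \le n-1} \Big| \sum_{j=2}^{k} Y_j \Big|^2 \le A\, \epsilon^{-2} n^{-3} \sum_{j=2}^{n} E Y_j^2 \\
&\le A\, \epsilon^{-2} n^{-1}\, E \psi^2(X_1, X_2) I_{|\psi| \le n^{3/2}} \\
&\le A\, \epsilon^{-2} \Big[ n^{-1/3} E|\psi(X_1, X_2)|^{4/3} + E|\psi(X_1, X_2)|^{4/3} I_{|\psi| \ge n} \Big] \to 0, \quad \text{as } n \to \infty.
\end{aligned}
$$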
This yields $I_1^*(n) = o_P(1)$. By an argument similar to that used for $I_1^*(n)$, we have $I_2^*(n) = o_P(1)$. As for $I_3^*(n)$, by using a similar argument as in the proof of (20), we obtain

$$
\begin{aligned}
E|I_3^*(n)| &\le \frac{1}{n^{3/2}} \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} E\big| \psi(X_i, X_j) - \psi^*(X_i, X_j) + g^*(X_i) + g^*(X_j) \big| \\
&\le 4\, n^{1/2} E\big[ |\psi(X_1, X_2)| I_{|\psi| \ge n^{3/2}} \big] \\
&\le 4\, E\big[ |\psi(X_1, X_2)|^{4/3} I_{|\psi| \ge n^{3/2}} \big] \to 0, \quad \text{as } n \to \infty,
\end{aligned}
$$

which implies that $I_3^*(n) = o_P(1)$. Substituting the respective estimates for $I_t^*(n)$, $t = 0, 1, 2, 3$, into (21), we obtain the required (13). The proof of Lemma 1 is now complete.

The next two lemmas are due to CsCsHM (1986) [cf. Lemma A.5.1 and Theorem A.5.1 respectively in Csörgő and Horváth (1997)]. Proofs of Lemmas 2 and 3 can also be found in Section 4.1 of Csörgő and Horváth (1993).

Lemma 2 Let $q(t) \in Q$. If $I(q, c) < \infty$ for some $c > 0$, then

$$\lim_{t \downarrow 0} t^{1/2}/q(t) = 0 \quad \text{and} \quad \lim_{t \uparrow 1} (1 - t)^{1/2}/q(t) = 0.$$
Lemma 3 Let $\{W(t), 0 \le t < \infty\}$ be a standard Wiener process and $q(t) \in Q$. Then,

(a) $I(q, c) < \infty$ for all $c > 0$ if and only if

$$\limsup_{t \downarrow 0} |W(t)|/q(t) = 0 \ \text{a.s.} \quad \text{and} \quad \limsup_{t \uparrow 1} |W(1) - W(t)|/q(t) = 0 \ \text{a.s.}$$

(b) $I(q, c) < \infty$ for some $c > 0$ if and only if

$$\limsup_{t \downarrow 0} |W(t)|/q(t) < \infty \ \text{a.s.} \quad \text{and} \quad \limsup_{t \uparrow 1} |W(1) - W(t)|/q(t) < \infty \ \text{a.s.}$$
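As a worked illustration of the integral condition (an example added here, not taken from the paper), consider the power weights $q_\nu(t) = (t(1-t))^{\nu}$ with $0 \le \nu \le 1/2$, which belong to $Q$. The computation below shows that $I(q_\nu, c) < \infty$ for all $c > 0$ when $\nu < 1/2$, while $I(q_{1/2}, c) = \infty$ for every $c > 0$, in agreement with Lemma 2.

```latex
% Case 0 <= nu < 1/2: put a = 1 - 2 nu > 0. For 0 < t <= 1/2 we have
% t(1-t) <= t and 1/(t(1-t)) <= 2/t, hence
\[
\int_{0+}^{1/2} \frac{1}{t(1-t)}
   \exp\!\left(-\frac{c\,q_\nu^2(t)}{t(1-t)}\right) dt
 \;\le\; \int_{0+}^{1/2} \frac{2}{t}\, \exp\!\left(-c\,t^{-a}\right) dt
 \;=\; \frac{2}{a} \int_{2^{a}}^{\infty} \frac{e^{-cs}}{s}\, ds \;<\; \infty ,
\]
% using the substitution s = t^{-a}, so that dt/t = -(1/a) ds/s.
% The integral over [1/2, 1-) is handled by the symmetry t -> 1 - t.
%
% Case nu = 1/2: q^2(t)/(t(1-t)) = 1, so
\[
I(q_{1/2}, c) \;=\; e^{-c} \int_{0+}^{1-} \frac{dt}{t(1-t)} \;=\; \infty
\quad \text{for every } c > 0 ,
\]
% consistent with Lemma 2, since t^{1/2}/q_{1/2}(t) = (1-t)^{-1/2} -> 1,
% not 0, as t decreases to 0.
```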
We are now ready to prove our main theorems.

Proof of Theorem 1. Together with the notation as in Section 1, we write $\psi(x, y) = h(x, y) - \theta - g(x) - g(y)$ and $T_n(t) = W_{[(n+1)t]}$, $0 \le t \le 1$, where

$$W_k = (n - k) \sum_{j=1}^{k} g(X_j) + k \sum_{j=k+1}^{n} g(X_j).$$
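The role of $W_k$ can be seen from an exact algebraic identity: with $\psi$ as above, $Z_k - k(n-k)\theta = W_k + \sum_{i=1}^{k} \sum_{j=k+1}^{n} \psi(X_i, X_j)$ for every $k$. The following sketch checks this numerically. It is an illustration under assumptions made here: the product kernel $h(x, y) = xy$, normally distributed data with mean `mu`, for which $\theta = \mu^2$ and $g(t) = \mu t - \mu^2$, and a hypothetical helper name `decomposition_gap`.

```python
import numpy as np

rng = np.random.default_rng(1)
n, mu = 30, 0.7
x = rng.normal(mu, 1.0, size=n)

theta = mu**2              # theta = E h(X1, X2) for h(x, y) = xy
g = mu * x - theta         # g(X_j), where g(t) = E(h(X, t) - theta) = mu t - mu^2
H = np.outer(x, x)         # H[i, j] = h(x_i, x_j)
psi = H - theta - g[:, None] - g[None, :]   # psi(x_i, x_j)

def decomposition_gap(k):
    """Return Z_k - k (n - k) theta - (W_k + double sum of psi); zero up to rounding."""
    z_k = H[:k, k:].sum()
    w_k = (n - k) * g[:k].sum() + k * g[k:].sum()
    psi_sum = psi[:k, k:].sum()
    return z_k - k * (n - k) * theta - (w_k + psi_sum)

max_gap = max(abs(decomposition_gap(k)) for k in range(1, n))
```

The identity holds for any data and any choice of $\theta$ and $g$. What the proof exploits is that, for the true $\theta$ and $g$, the $\psi$ double sum is of smaller order than the linear term $W_k$.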
Noting that $g(X_j)$ are i.i.d. random variables with $E g(X_1) = 0$ and $\sigma^2 = E g^2(X_1) < \infty$, along the lines of the proof of (2.1.45) in Csörgő and Horváth (1997), on an appropriate probability space for $X, X_1, X_2, \ldots$ we can define a sequence of Gaussian processes $\{\Gamma_n(t), 0 \le t \le 1\}$ such that, for each $n \ge 1$,

$$\{\Gamma_n(t), 0 \le t \le 1\} =_d \{\Gamma(t), 0 \le t \le 1\},$$
and if $q \in Q$ and $I(q, c) < \infty$ for some $c > 0$, then, as $n \to \infty$,

$$\sup_{1/n \le t \le (n-1)/n} \left| n^{-3/2} \sigma^{-1} T_n(t) - \Gamma_n(t) \right| / q(t) = o_P(1). \tag{22}$$
By virtue of (22), Theorem 1 will follow if we prove

$$J_n := \sup_{1/n \le t \le (n-1)/n} \left| n^{-3/2} U_n(t) - n^{-3/2} T_n(t) \right| / q(t) = o_P(1). \tag{23}$$

In order to prove (23), write $V_n(t) = W^*_{[(n+1)t]}$, where $W_k^* = \sum_{i=1}^{k} \sum_{j=k+1}^{n} \psi(X_i, X_j)$. Note that $E(\psi(X_1, X_2) \mid X_1) = E(\psi(X_1, X_2) \mid X_2) = 0$ and
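$$E|\psi(X_1, X_2)|^{4/3} \le A\, E|h(X_1, X_2)|^{4/3} < \infty.$$

It follows from (13) that

$$J_n^{(1)} := \sup_{\delta \le t \le 1-\delta} |n^{-3/2} V_n(t)| / q(t) \le \frac{1}{n^{3/2}} \max_{1 \le k \le n-1} \Big| \sum_{i=1}^{k} \sum_{j=k+1}^{n} \psi(X_i, X_j) \Big| \sup_{\delta \le t \le 1-\delta} q^{-1}(t) = o_P(1),$$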
for any $\delta \in (0, 1)$ and $q \in Q$. Let $\delta > 0$ be so small that $q(t)$ is already nondecreasing on $(0, \delta)$ and nonincreasing on $(1 - \delta, 1)$, and let $n$ be so large that $1/n \le \delta$. It follows from (11) and Lemma 2 that

$$J_n^{(2)} := \sup_{0 \,\cdots} |n^{-3/2} V_n(t)| / q(t) \le \cdots$$