Poisson limits and U-statistics
André Dabrowski∗ (Ottawa)
∗ Based on joint work with H. Dehling (Bochum), T. Mikosch (Copenhagen) and O. Sharipov (Uzbek Academy of Sciences), S.P.A. 2002, 99, 137–157. Research supported by NSERC.
• A point process for U-statistics
• Stable limits for U-statistics
• Takens estimator, K-functions, Hill estimator
Consider the U-statistic defined on iid X_i for a positive and symmetric kernel h,
$$H_n = \sum_{i=1}^{n} \sum_{j=1}^{n} h(X_i, X_j) .$$
For example, take h(x, y) = |x − y|^γ. A natural question is to identify the distribution of the (scaled) minimal h(X_i, X_j) as n → ∞. In analogy with extremal processes we can define the point process
$$N_n(A \times B) = \#\{1 \le i < j \le n : ((i/n, j/n),\, a_n h(X_i, X_j)) \in A \times B\}$$
and look for its weak limit.
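As a concrete illustration (an added example, not on the slides): for iid uniform X_i on [0, 1] and h(x, y) = |x − y| we have P[h(X_1, X_2) ≤ u] = 2u − u² ≈ 2u, so a_n = n² is a natural scaling, since the expected number of points of N_n with third coordinate at most x is then
$$\binom{n}{2}\, P\big[h(X_1, X_2) \le x/n^2\big] \approx \frac{n^2}{2} \cdot \frac{2x}{n^2} = x .$$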
[Figure: the points (i/n, j/n, a_n h(X_i, X_j)).]
[Figure: the pairs (i/n, j/n) with h(X_i, X_j) = |X_i − X_j| < 0.05 for uniform X_i.]
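A minimal simulation sketch of how such a figure is produced (illustrative code, not the authors'; the sample size n = 200 is an arbitrary choice):

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng(0)
    n = 200
    x = rng.uniform(size=n)

    # All pairs i < j whose kernel value |X_i - X_j| falls below 0.05.
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)
             if abs(x[i] - x[j]) < 0.05]

    # Plot the pair indices rescaled to the unit square, as on the slide.
    plt.scatter([(i + 1) / n for i, _ in pairs],
                [(j + 1) / n for _, j in pairs], s=4)
    plt.xlabel("i/n")
    plt.ylabel("j/n")
    plt.show()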
Another reason to look at point process limits: weak convergence results are well known∗ for
$$H_n = \sum_{i=1}^{n} \sum_{j=1}^{n} h(X_i, X_j)$$
when Var(h(X1, X2)) < ∞. In parallel with the partial sum case, we expect stable non-Gaussian limits for heavy-tailed h(X1, X2).
∗ e.g. Serfling, R.J. (1980) Approximation Theorems of Mathematical Statistics. Wiley, New York.
A now-standard approach∗ to proving stable limits for partial sums Σ_{i=1}^n X_i involves
• establishing weak convergence of an extremal point process,
$$I_n(A) = \#\{j \in \{1, \dots, n\} : (j/n, a_n^{-1} X_j) \in A\} ,$$
to a simple point process on [0, 1] × (0, ∞], and
• applying the continuous mapping theorem.

For U-statistics the corresponding conditions are regular variation of h(X_1, X_2) at the origin,
$$P[h(X_1, X_2) \le x] = L(x^{-1})\, x^\alpha ,$$
and the following anti-clustering condition: as n → ∞, for all x > 0,
$$n^3\, P[a_n h(X_1, X_2) \le x,\ a_n h(X_1, X_3) \le x] \to 0 .$$
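As a sanity check (added, not on the slides), the condition holds in the uniform example with h(x, y) = |x − y|, α = 1 and the assumed scaling a_n = n²: conditioning on X_1 gives
$$P[h(X_1, X_2) \le u,\ h(X_1, X_3) \le u] \le (2u)^2 ,$$
so
$$n^3\, P[a_n h(X_1, X_2) \le x,\ a_n h(X_1, X_3) \le x] \le n^3 \left(\frac{2x}{n^2}\right)^2 = \frac{4x^2}{n} \to 0 .$$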
Condition D′(u_n). Condition D′(u_n) holds for a stationary sequence Y_n and a sequence u_n of constants if, as k → ∞,
$$\limsup_{n \to \infty}\; n \sum_{j=2}^{[n/k]} P[Y_1 < u_n,\, Y_j < u_n] \to 0 .$$
The analogue for U-statistics: as n → ∞, for all x > 0,
$$n^3\, P[a_n h(X_1, X_2) \le x,\ a_n h(X_1, X_3) \le x] \to 0 .$$
Theorem. N_n ⇒ N, where N is a point process on {(x, y) : 0 < x ≤ 1, 0 < y < x} × (0, ∞].
As δ → 0, the variance (of the Takens estimator) is of the order α^{-2}[1 + δ^α].
Poisson convergence of the K-function
In the spatial analysis of point patterns the K-function is used as a measure of spatial dependence∗. A sample version of it is given by the U-statistic
$$K_n(\delta) = \sum_{i=2}^{n} \sum_{j=1}^{i-1} I_{[0,\delta]}(a_n \|X_i - X_j\|) .$$
Thus we have the kernel h(x, y) = ‖x − y‖, and so we may conclude that K_n(δ) = N_n(E_1 × [0, δ]) converges in distribution to a Poisson random variable with mean δ^α.
∗ Cressie, N. (1993) Statistics for Spatial Data. Wiley, New York.
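A simulation sketch of this Poisson convergence (illustrative, not from the talk). It assumes iid uniform points in the unit square, so that P(‖X_1 − X_2‖ ≤ u) ≈ πu² for small u, i.e. α = 2, and takes a_n = n√(π/2), which makes E K_n(δ) ≈ δ²:

    import numpy as np

    rng = np.random.default_rng(1)
    n, reps, delta = 300, 1000, 1.0
    a_n = n * np.sqrt(np.pi / 2)  # assumed scaling: E K_n(delta) ~ delta^2

    counts = []
    for _ in range(reps):
        pts = rng.uniform(size=(n, 2))
        d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))
        i, j = np.triu_indices(n, k=1)  # pairs i < j
        counts.append(int(np.sum(a_n * d[i, j] <= delta)))

    # For a Poisson(delta^alpha) limit with alpha = 2, the sample mean and
    # variance of K_n(delta) should both be close to delta**2 = 1.
    print(np.mean(counts), np.var(counts))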
More generally, the K_n-processes converge in distribution in M_p(ℝ_+) to a Poisson process K with mean measure α x^{α−1} dx:
$$K_n(\cdot) = N_n(E_1 \times \cdot) = \sum_{i=2}^{n} \sum_{j=1}^{i-1} \varepsilon_{a_n \|X_i - X_j\|}(\cdot) \stackrel{d}{\to} K(\cdot) . \qquad (3)$$
Writing K(δ) = K([0, δ]), it follows from (3) that in D[0, ∞) equipped with the J_1-topology:
$$(K_n(\delta))_{\delta \ge 0} \stackrel{d}{\to} (K(\delta))_{\delta \ge 0} .$$
An application of the continuous mapping theorem, from D[M_0, M_1] with 0 < M_0 < M_1 < ∞ to C[0, b] with b > 0, gives
$$B_n = \left( \int_{M_0}^{M_1} \big(\log^+ K_n(\delta) - \beta \log \delta\big)^2\, d\delta \right)_{\beta \in [0, b]} \stackrel{d}{\to} B = \left( \int_{M_0}^{M_1} \big(\log^+ K(\delta) - \beta \log \delta\big)^2\, d\delta \right)_{\beta \in [0, b]}$$
in C[0, b].
Another application of the continuous mapping theorem shows that the minimizer β_n of B_n on [0, b] converges to the minimizer β_0 of B on [0, b]:
$$\beta_n = \frac{\int_{M_0}^{M_1} \log\delta\, \log^+ K_n(\delta)\, d\delta}{\int_{M_0}^{M_1} (\log\delta)^2\, d\delta} \stackrel{d}{\to} \beta_0 = \frac{\int_{M_0}^{M_1} \log\delta\, \log^+ K(\delta)\, d\delta}{\int_{M_0}^{M_1} (\log\delta)^2\, d\delta} = \alpha + \frac{\int_{M_0}^{M_1} \log\delta\, \big(\log^+ K(\delta) - \log(\delta^\alpha)\big)\, d\delta}{\int_{M_0}^{M_1} (\log\delta)^2\, d\delta} , \qquad (4)$$
where log⁺ x = log(max(1, x)).
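A numerical sketch of the estimator in (4) (hypothetical code, continuing the uniform-square example with α = 2; the choices M_0 = 0.5, M_1 = 5 and the grid size are arbitrary). K_n is evaluated on a grid of δ-values and the integrals are approximated by Riemann sums on a common grid, so the grid spacing cancels in the ratio:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 300
    a_n = n * np.sqrt(np.pi / 2)  # same assumed scaling as above
    M0, M1 = 0.5, 5.0
    grid = np.linspace(M0, M1, 400)

    pts = rng.uniform(size=(n, 2))
    d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))
    i, j = np.triu_indices(n, k=1)
    h = a_n * d[i, j]

    # K_n(delta) on the grid, and log+ x = log(max(1, x)).
    K = np.array([np.sum(h <= delta) for delta in grid])
    log_plus_K = np.log(np.maximum(1, K))

    logd = np.log(grid)
    beta_n = np.sum(logd * log_plus_K) / np.sum(logd ** 2)
    print(beta_n)  # compare with alpha = 2

The finite-n bias of this estimator is exactly what the simulation on the next slide quantifies.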
Using a simple simulation, we estimate the bias. The vertical axis represents the bias in estimating α, i.e. the fraction in line (4). The horizontal axes represent possible values for M_0 and M_1 in that fraction.
[Figure: surface plot of the mean bias (vertical axis, MEAN) over a grid of M_0 and M_1 values.]
Hill estimation of α. In extreme value theory the estimation of α is usually not based on log regression methods but on an increasing number of logarithms of lower order statistics. In order to make the Hill estimator work one needs an increasing number of points, i.e. order statistics of the h(Xi, Xj )’s, in any neighborhood of the origin. The right tool in this context is the tail empirical process instead of the point process limit.
As before, write h_{(1)} ≤ ⋯ ≤ h_{(n(n−1)/2)} for the order statistics of the sample h(X_i, X_j), i = 2, …, n, j = 1, …, i − 1. A classical estimator of α is Hill's estimator, given by
$$\hat\alpha_{n,m} = -\left( \frac{1}{m} \sum_{i=1}^{m} \log\big(h_{(i)}/h_{(m)}\big) \right)^{-1}$$
for m ≥ 1∗.
Theorem. Under regular variation conditions, if m = m_n → ∞ and m_n/n → 0, then Hill's estimator is consistent, i.e. $\hat\alpha_{n,m} \stackrel{P}{\to} \alpha$.
∗ Hill, B.M. (1975) Ann. Statist. 3, 1163–1174.
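A sketch of this estimator on simulated data (illustrative only; uniform points in the unit square give α = 2, and m = ⌊√n⌋ is one arbitrary choice consistent with m_n → ∞, m_n/n → 0). No normalization a_n is needed, since the estimator depends only on ratios of order statistics:

    import numpy as np

    rng = np.random.default_rng(3)
    n = 1000
    pts = rng.uniform(size=(n, 2))
    d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1))
    i, j = np.triu_indices(n, k=1)

    # Order statistics h_(1) <= h_(2) <= ... of the pairwise kernel values.
    h = np.sort(d[i, j])

    m = int(np.sqrt(n))  # m -> infinity with m/n -> 0
    alpha_hat = -1.0 / np.mean(np.log(h[:m] / h[m - 1]))
    print(alpha_hat)  # close to alpha = 2, up to sampling noise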
Proof. It suffices∗ to prove that the tail empirical process N_{n,m} of the points h(X_i, X_j) satisfies
$$N_{n,m} = \frac{1}{m} \sum_{i=2}^{n} \sum_{j=1}^{i-1} \varepsilon_{a_{n,m} h(X_i, X_j)} \stackrel{d}{\to} \mu$$
in M_p(ℝ_+), where →^d denotes convergence in distribution, μ is a measure on the Borel sets of ℝ_+ with density 0.5 α x^{α−1}, and a_{n,m} → ∞ is chosen such that
$$\frac{n^2}{2m}\, P\big(a_{n,m} h(X_1, X_2) \le 1\big) \sim 1 . \qquad (5)$$
∗ Resnick, S.I. and Stărică, C. (1995) J. Appl. Probab. 32, 139–167.
Since μ is deterministic, it suffices to show that the law of large numbers
$$N_{n,m}((a, b]) \stackrel{P}{\to} \mu((a, b])$$
holds for any 0 < a < b < ∞. Thus it suffices to verify that
$$E N_{n,m}((a, b]) \to \mu((a, b]) \qquad \text{and} \qquad \mathrm{var}\big(N_{n,m}((a, b])\big) \to 0 .$$
By the definition of (a_{n,m}) and by regular variation we have
$$E N_{n,m}((a, b]) = \frac{n(n-1)}{2m}\, P\big(a_{n,m} h(X_1, X_2) \in (a, b]\big) \to \mu((a, b]) .$$
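The reduction of the law of large numbers to these two statements is Chebyshev's inequality, spelled out here for completeness:
$$P\big(|N_{n,m}((a, b]) - E N_{n,m}((a, b])| > \varepsilon\big) \le \frac{\mathrm{var}(N_{n,m}((a, b]))}{\varepsilon^2} \to 0 .$$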
We have
$$\mathrm{var}\big(N_{n,m}((a, b])\big) \sim \frac{1}{m^2} \left[ \frac{n^2}{2}\, \mathrm{var}\big(I_{(a,b]}(a_{n,m} h(X_1, X_2))\big) + n^3\, \mathrm{cov}\big(I_{(a,b]}(a_{n,m} h(X_1, X_2)),\, I_{(a,b]}(a_{n,m} h(X_1, X_3))\big) \right] .$$
By (5) the first term converges to 0. The covariance term equals
$$\frac{n^3}{m^2} \left[ P\big(a_{n,m} h(X_1, X_2) \in (a, b],\ a_{n,m} h(X_1, X_3) \in (a, b]\big) - \Big(P\big(a_{n,m} h(X_1, X_2) \in (a, b]\big)\Big)^2 \right]$$
$$= \frac{n^3}{m^2}\, P\big(a_{n,m} h(X_1, X_2) \in (a, b],\ a_{n,m} h(X_1, X_3) \in (a, b]\big) + o(1) .$$
This also converges to 0 by the regular variation condition.
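For completeness, one way to expand the "By (5)" step for the first term, using var(I) ≤ E I and regular variation:
$$\frac{n^2}{2 m^2}\, \mathrm{var}\big(I_{(a,b]}(a_{n,m} h(X_1, X_2))\big) \le \frac{n^2}{2 m^2}\, P\big(a_{n,m} h(X_1, X_2) \in (a, b]\big) \sim \frac{b^\alpha - a^\alpha}{m} \to 0 .$$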