POISSON APPROXIMATION FOR TWO SCAN STATISTICS WITH RATES OF CONVERGENCE

arXiv:1403.4692v1 [math.PR] 19 Mar 2014

Xiao Fang

David Siegmund

National University of Singapore and Stanford University, Stanford University

Abstract. As an application of Stein's method for Poisson approximation, we prove rates of convergence for the tail probabilities of two scan statistics that have been suggested for detecting local signals in sequences of independent random variables subject to possible change-points. Our formulation deals simultaneously with ordinary and with large deviations.

1 INTRODUCTION

Let $\{X_1, \dots, X_n\}$ be a sequence of random variables. A widely studied problem is to test the hypothesis that the $X$'s are independent and identically distributed against the alternative that for some $0 \le i < j \le n$, $\{X_{i+1}, \dots, X_j\}$ have a distribution that differs from the distribution of the other $X$'s. If $t := j - i$ is assumed known and the change in distribution is a shift in the mean, one common suggestion to detect the change is the statistic
$$M_{n;t} = \max_{1 \le i \le n-t+1} (X_i + \cdots + X_{i+t-1}). \qquad (1.1)$$
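For concreteness, the statistic (1.1) can be computed in linear time with a sliding window of partial sums. The following Python sketch is purely illustrative (the data and window size are arbitrary, not taken from the paper):

```python
import numpy as np

def scan_statistic_fixed(x, t):
    """Compute M_{n;t} = max over i of X_i + ... + X_{i+t-1} via partial sums."""
    s = np.concatenate(([0.0], np.cumsum(np.asarray(x, dtype=float))))
    return np.max(s[t:] - s[:-t])   # all n-t+1 window sums at once

# illustrative use: n = 1000 standard normal observations, window length t = 20
rng = np.random.default_rng(0)
print(scan_statistic_fixed(rng.standard_normal(1000), 20))
```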

See Glaz, Naus and Wallenstein (2001) for an introduction to scan statistics. When $t$ is unknown but the distributions of the $X$'s are otherwise completely specified, the maximum log likelihood ratio statistic is
$$\max_{0 \le i < j \le n} (S_j - S_i), \qquad (1.2)$$
where
$$S_i = \sum_{k=1}^{i} \log[f_1(X_k)/f_0(X_k)], \qquad (1.3)$$
with $f_0$ the density of the $X$'s under the null hypothesis and $f_1$ their density under the change.

2 MAIN RESULTS

2.1 Scan statistics with fixed window size

Let $\{X_1, \dots, X_n\}$ be independent, identically distributed random variables with $EX_1 = \mu_0$. For $a > \mu_0$ and $b := at$, we are interested in calculating approximately the probability $P(M_{n;t} \ge b)$. The convergence rate for various suggested approximations for general $X_1$ is not known. In practice, Monte Carlo simulations have been widely used to justify the accuracy of theoretical results. In the following theorem, we provide a Poisson approximation with rate of convergence in the case that the distribution of $X_1$ can be imbedded in an exponential family of probability measures $\{F_\theta : \theta \in \Theta\}$, where
$$dF_\theta(x) = e^{\theta x - \Psi(\theta)}\, dF(x). \qquad (2.1)$$

It is known that the mean and variance of $F_\theta$ are $\Psi'(\theta)$ and $\Psi''(\theta)$ respectively. We assume $F(x)$ is non-degenerate, i.e., $\Psi''(\theta) > 0$.
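Since, as noted above, Monte Carlo simulation is the usual benchmark for such approximations, here is a minimal simulation sketch for estimating $P(M_{n;t} \ge b)$ with $b = at$ in the Gaussian case (all sample sizes and parameter values below are arbitrary illustrations):

```python
import numpy as np

def mc_tail_prob(n, t, a, mu0=0.0, reps=20000, seed=1):
    """Monte Carlo estimate of P(M_{n;t} >= a*t) for i.i.d. N(mu0, 1) data."""
    rng = np.random.default_rng(seed)
    b, hits = a * t, 0
    for _ in range(reps):
        s = np.concatenate(([0.0], np.cumsum(rng.normal(mu0, 1.0, size=n))))
        hits += np.max(s[t:] - s[:-t]) >= b   # scan statistic exceeds threshold
    return hits / reps

print(mc_tail_prob(n=1000, t=20, a=0.8))
```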

In this paper, we use $P_\theta(\cdot)$ (resp. $E_\theta(\cdot)$) to denote the probability (resp. expectation) under which $X_1 \sim F_\theta$.

Theorem 2.1. Let $\{X_1, \dots, X_n\}$ be independent, identically distributed random variables with distribution function $F$ that can be imbedded in an exponential family, as above. Let $EX_1 = \mu_0$. For integers $t < n$, define
$$M_{n;t} = \max_{1 \le i \le n-t+1} (X_i + \cdots + X_{i+t-1}).$$

Let $a > \mu_0$ be such that $\theta_a \in \Theta^o$, the interior of $\Theta$, where $\theta_a$ is defined by $\Psi'(\theta_a) = a$. Let $b = at$. Suppose that $F$ is either arithmetic or, for some $a' > a$ such that $\theta_{a'} \in \Theta^o$,
$$\sup_{\theta_a \le \theta \le \theta_{a'}} \int_{-\infty}^{\infty} |\varphi_\theta(t)|^{\nu}\, dt < \infty \quad \text{for some positive integer } \nu,$$
where $\varphi_\theta$ is the characteristic function of $F_\theta$. Then, for some constant $C$ depending only on the exponential family (2.1), $\mu_0$, and $a$,
$$\Big| P(M_{n;t} \ge b) - (1 - e^{-\lambda}) \Big| \le C\Big( \frac{(\log t)^2}{t} + \frac{\log t \wedge \log(n-t)}{n-t} \Big)(\lambda \wedge 1). \qquad (2.2)$$

In the non-arithmetic case,
$$\lambda = (n-t+1)\, \frac{e^{-[a\theta_a - \Psi(\theta_a)]t}}{\theta_a \sigma_a (2\pi t)^{1/2}}\, \exp\Big[-\sum_{k=1}^{\infty} \frac{1}{k}\, E\big(e^{-\theta_a D_k^+}\big)\Big], \qquad (2.3)$$
where $\sigma_a^2 = \Psi''(\theta_a)$, $D_k = \sum_{i=1}^{k}(X_i^a - X_i)$, and $\{X_i, X_i^a : i \ge 1\}$ are independent, $X_i \sim F$ and $X_i^a \sim F_{\theta_a}$. In the arithmetic case, we assume without loss of generality that $X_i$ is integer-valued with span 1, where the span of an integer-valued random variable is defined to be the largest value of $\Delta$ such that
$$\sum_{k \in \mathbb{Z}} P(X_i = k\Delta + w) = 1 \quad \text{for some } w \in \mathbb{Z}.$$

In this arithmetic case,
$$\lambda = (n-t+1)\, \frac{e^{-(a\theta_a - \Psi(\theta_a))t}\, e^{-\theta_a(\lceil b\rceil - b)}}{(1 - e^{-\theta_a})\, \sigma_a (2\pi t)^{1/2}}\, \exp\Big(-\sum_{k=1}^{\infty} \frac{1}{k}\, E\big(e^{-\theta_a D_k^+}\big)\Big),$$

where $\lceil b\rceil = \inf\{v \in \mathbb{Z} : v \ge b\}$.

Remark 2.1. The various expressions entering into $\lambda$ will be explained below. Here it is important to note that, provided $n-t$ and $t$ are large, the error of approximation in (2.2) is a relative error: the bound is informative both when $n$ is relatively small, so that $\lambda \approx 0$, and when $\lambda$ is bounded away from 0. Although it is possible to trace through the proof of Theorem 2.1 and obtain a numerical value for the constant $C$ in (2.2), it would be too large for practical purposes. Therefore, we do not pursue it here.

Remark 2.2. Arratia, Gordon and Waterman (1990) obtained a bound for $|P(M_{n;t} \ge b) - (1 - e^{-\lambda})|$ for independent, identically distributed Bernoulli random variables. They do not restrict $b$ to grow linearly in $t$ with fixed slope. For fixed $a$, their bound is of the form (cf. equations (11)--(13) of Arratia, Gordon and Waterman (1990))
$$C\Big(e^{-ct} + \frac{t}{n}\Big)(\lambda \wedge 1).$$
Compared to their result, Theorem 2.1 applies to more general distributions and recovers typical limit theorems in the literature on scan statistics. As $t, n-t \to \infty$, Theorem 2.1 guarantees that the relative error in (2.2) goes to 0. See, for example, Theorem 1 of Chan and Zhang (2007).

Remark 2.3. The infinite series appearing in the definition of $\lambda$ is derived as an application of classical random walk results of Spitzer. It arises probabilistically in the proof of Theorem 2.1 in the form $E[1 - \exp\{-\theta_a D_{\tau_+}\}]/E(\tau_+)$, where $\tau_+ = \inf\{t : D_t > 0\}$. The series form is useful for numerical computation. For example, in the very special case that $X_1 \sim N(\mu_0, 1)$, we find that $X_1^a \sim N(a, 1)$ and, in the definition of $\lambda$ in (2.3),
$$\exp\Big(-\sum_{k=1}^{\infty} \frac{1}{k}\, E\big(e^{-\theta_a D_k^+}\big)\Big) = \exp\Big(-2\sum_{k=1}^{\infty} \frac{1}{k}\, \Phi\big(-(a-\mu_0)\sqrt{k/2}\big)\Big) =: (a-\mu_0)^2\, \nu\big(\sqrt{2}\,(a-\mu_0)\big),$$
where the function $\nu(x)$ was defined in (4.38) of Siegmund (1985); for small $x$ it satisfies $\nu(x) = \exp(-cx) + o(x^2)$ for $c \approx 0.583$, while $\nu(x) \sim 2/x^2$ as $x \to \infty$. More generally, by Theorem 8.51 of Siegmund (1985), for the non-arithmetic case of Theorem 2.1,
$$\exp\Big(-\sum_{k=1}^{\infty} \frac{1}{k}\, E\big(e^{-\theta_a D_k^+}\big)\Big) = (a-\mu_0)\theta_a \exp\Bigg\{-\frac{1}{2\pi}\int_{-\infty}^{\infty}\Big(\frac{1}{\theta_a + i\lambda} - \frac{1}{i\lambda}\Big)\Big[\log\frac{1}{1 - g(\lambda)} + \log\big(-i(a-\mu_0)\lambda\big)\Big]\, d\lambda\Bigg\},$$
where $g(\lambda) = Ee^{i\lambda D_1}$.
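As a numerical illustration of Remark 2.3, the following sketch evaluates $\lambda$ of (2.3) for the $N(\mu_0,1)$ case by truncating the series; the truncation level and all parameter values are assumptions of this sketch, not prescriptions from the paper:

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf

def lam_normal(n, t, a, mu0=0.0, K=5000):
    """lambda in (2.3) for F = N(mu0, 1): here theta_a = a - mu0, sigma_a = 1,
    a*theta_a - Psi(theta_a) = (a - mu0)^2 / 2, and
    E exp(-theta_a * D_k^+) = 2 * Phi(-(a - mu0) * sqrt(k/2))."""
    theta = a - mu0
    series = sum(2.0 * Phi(-theta * math.sqrt(k / 2.0)) / k for k in range(1, K + 1))
    rate = theta**2 / 2.0
    return (n - t + 1) * math.exp(-rate * t - series) / (theta * math.sqrt(2.0 * math.pi * t))

lam = lam_normal(n=1000, t=20, a=0.8)
print(lam, 1.0 - math.exp(-lam))   # Poisson approximation to P(M_{n;t} >= at)
```

With the illustrative values above, the output can be compared against the Monte Carlo estimate sketched after (2.1).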

The case that $X_i$ is integer-valued and $a$ is the largest value $X_i$ can take is not covered by Theorem 2.1 because of the constraint $\theta_a \in \Theta^o$. The following corollary covers this case. Its proof is simpler than the proof of Theorem 2.1, and the convergence rate we obtain is faster.

Corollary 2.2. Let $\{X_1, \dots, X_n\}$ be independent, identically distributed random variables with distribution function $F$ that can be imbedded in an exponential family, as in (2.1). Let $EX_1 = \mu_0$. For integers $t < n$, define
$$M_{n;t} = \max_{1 \le i \le n-t+1} (X_i + \cdots + X_{i+t-1}).$$
Assume $X_1$ is integer-valued with span 1. Suppose $a = \sup\{x : p_x := P(X_1 = x) > 0\}$ is finite. Let $b = at$. Then we have, with constants $C$ and $c$ depending only on $p_a$,
$$\big| P(M_{n;t} \ge b) - (1 - e^{-\lambda}) \big| \le C(\lambda \wedge 1)e^{-ct}, \qquad (2.4)$$
where
$$\lambda = (n-t)p_a^t(1 - p_a) + p_a^t.$$

Proof. Following the proof of Theorem 2.1, let $Y_1 = I(X_1 = \cdots = X_t = a)$ and, for $2 \le \alpha \le n-t+1$, $Y_\alpha = I(X_{\alpha-1} < a,\ X_\alpha = \cdots = X_{\alpha+t-1} = a)$. Then, with $W = \sum_{\alpha=1}^{n-t+1} Y_\alpha$,
$$EW = p_a^t + (n-t)p_a^t(1-p_a) = \lambda.$$
Instead of (3.4), we have $P(M_{n;t} \ge b) = P(W \ge 1)$, and instead of (3.8), we have
$$|P(W \ge 1) - (1 - e^{-\lambda})| \le \Big(1 \wedge \frac{1}{\lambda}\Big)(n-t+1)(2t+1)p_a^{2t}.$$
This proves the bound (2.4).
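As a quick numerical check of Corollary 2.2, one can compare $1 - e^{-\lambda}$ with a direct simulation for Bernoulli($p$) observations, for which $a = 1$ and $p_a = p$, so that $\{M_{n;t} \ge t\}$ is the event of a head run of length $t$. All numerical settings below are illustrative assumptions:

```python
import numpy as np

def corollary22_check(n=500, t=8, p=0.5, reps=20000, seed=2):
    """Compare 1 - exp(-lambda) of Corollary 2.2 with a Monte Carlo estimate."""
    lam = (n - t) * p**t * (1 - p) + p**t          # lambda with a = 1, p_a = p
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        x = (rng.random(n) < p).astype(int)
        s = np.concatenate(([0], np.cumsum(x)))
        hits += np.max(s[t:] - s[:-t]) >= t        # a run of t consecutive ones
    return 1.0 - np.exp(-lam), hits / reps

print(corollary22_check())
```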

2.2 Scan statistics with varying window size

Next we study the maximum log likelihood ratio statistic (1.2). Suppose in (1.3), $f_0(x) = dF_{\theta_0}(x)$ and $f_1(x) = dF_{\theta_1}(x)$, where $\{F_\theta : \theta \in \Theta\}$ is an exponential family as in (2.1) and $\theta_0 < \theta_1$. Then we have
$$S_i = \sum_{k=1}^{i} \log[f_1(X_k)/f_0(X_k)] = \sum_{k=1}^{i} (\theta_1 - \theta_0)\Big(X_k - \frac{\Psi(\theta_1) - \Psi(\theta_0)}{\theta_1 - \theta_0}\Big).$$

By an appropriate change of parameters and a slight abuse of notation, studying (1.2) is equivalent to studying the following problem. Let $\{X_1, \dots, X_n\}$ be independent, identically distributed random variables with distribution function $F$ that can be imbedded in an exponential family, as in (2.1). Let $EX_1 = \mu_0 < 0$. Let $S_0 = 0$ and $S_i = \sum_{j=1}^{i} X_j$ for $1 \le i \le n$. Suppose there exists $\theta_1 > 0$ such that
$$\Psi(\theta_1) = 0, \qquad \Psi'(\theta_1) = \mu_1. \qquad (2.5)$$
For $b > 0$, we give an approximation to
$$p_{n,b} := P\Big( \max_{0 \le i < j \le n} (S_j - S_i) \ge b \Big). \qquad (2.6)$$

Theorem 2.3. Let $\{X_1, \dots, X_n\}$, $\mu_1$ and $\theta_1$ be as above, and, in the non-arithmetic case, suppose $\int_{-\infty}^{\infty} |\varphi_{\theta_1}(t)|\, dt < \infty$. Let $h(b) > 0$ be any function such that $h(b) \to \infty$, $h(b) = O(b^{1/2})$ as $b \to \infty$. Suppose $n - b/\mu_1 \ge b^{1/2}h(b)$. Then, for $p_{n,b}$ defined in (2.6), we have
$$\big| p_{n,b} - (1 - e^{-\lambda}) \big| \le C\lambda \Bigg\{ \Big(1 + \frac{b/h^2(b)}{n - b/\mu_1}\Big) e^{-ch^2(b)} + \frac{b^{1/2}h(b)}{n - b/\mu_1} \Bigg\}, \qquad (2.7)$$

where the constants $c, C$ depend only on the exponential family $F_\theta$ and on $\theta_1$. In the non-arithmetic case,
$$\lambda = \Big(n - \frac{b}{\mu_1}\Big)\, \frac{e^{-\theta_1 b}}{\theta_1 \mu_1}\, \exp\Big(-2\sum_{k=1}^{\infty} \frac{1}{k}\, E_{\theta_1} e^{-\theta_1 S_k^+}\Big).$$
In the arithmetic case, where we assume without loss of generality that $X_i$ is integer-valued with span 1 and $b$ is an integer,
$$\lambda = \Big(n - \frac{b}{\mu_1}\Big)\, \frac{e^{-\theta_1 b}}{(1 - e^{-\theta_1})\mu_1}\, \exp\Big(-2\sum_{k=1}^{\infty} \frac{1}{k}\, E_{\theta_1} e^{-\theta_1 S_k^+}\Big).$$

Remark 2.4. We refer to Remark 2.3 for the numerical calculation of $\lambda$. Choosing $h(b) = b^{1/2}$, we get
$$|p_{n,b} - (1 - e^{-\lambda})| \le C\lambda\Big\{e^{-cb} + \frac{b}{n}\Big\}$$
from (2.7). By choosing $h(b) = C(\log b)^{1/2}$ with large enough $C$, we can see that the relative error in the Poisson approximation goes to zero under the conditions
$$b \to \infty, \qquad (b\log b)^{1/2} \ll n - b/\mu_1 = O(e^{\theta_1 b}),$$
where $n - b/\mu_1 = O(e^{\theta_1 b})$ ensures that $\lambda$ is bounded. For the smaller range (in which case $\lambda \to 0$)
$$\delta b \le n - b/\mu_1 = o\big(e^{\theta_1 b/2}\big), \qquad b \to \infty,$$
for some $\delta > 0$, Theorem 2 of Siegmund (1988) obtained more accurate estimates, and the technique used there is different from ours.
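For a concrete Gaussian instance of Theorem 2.3, the series $\sum_k k^{-1} E_{\theta_1} e^{-\theta_1 S_k^+}$ admits a closed form that can be truncated numerically. The family, the closed form $E_{\theta_1} e^{-\theta_1 S_k^+} = 2\Phi(-\mu_1\sqrt{k})$ (a consequence of $S_k \sim N(k\mu_1, k)$ under $P_{\theta_1}$ with $\theta_1 = 2\mu_1$), and all numerical values are assumptions of this sketch, not computations from the paper:

```python
import math
from statistics import NormalDist

Phi = NormalDist().cdf

def lam_theorem23_normal(n, b, mu0=-0.5, K=20000):
    """Non-arithmetic lambda of Theorem 2.3 for F = N(mu0, 1), mu0 < 0.
    Psi(th) = mu0*th + th^2/2, so theta_1 = -2*mu0 and mu_1 = Psi'(theta_1) = -mu0."""
    theta1, mu1 = -2.0 * mu0, -mu0
    series = sum(2.0 * Phi(-mu1 * math.sqrt(k)) / k for k in range(1, K + 1))
    return (n - b / mu1) * math.exp(-theta1 * b) / (theta1 * mu1) * math.exp(-2.0 * series)

lam = lam_theorem23_normal(n=5000, b=6.0)
print(lam, 1.0 - math.exp(-lam))   # Poisson approximation to p_{n,b}
```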

3 PROOFS

Before proving our main theorems, we first introduce our main tool: Stein's method. Stein's method was first introduced by Stein (1972) and further developed in Stein (1986) for normal approximation. Chen (1975) developed Stein's method for Poisson approximation, which has been widely applied, especially in computational biology, after the work by Arratia, Goldstein and Gordon (1990). We refer to Barbour and Chen (2005) for an introduction to Stein's method. The following theorem provides a useful upper bound on the total variation distance between the distribution of a sum of locally dependent Bernoulli random variables and a Poisson distribution. The total variation distance between two distributions is defined as
$$d_{TV}\big(\mathcal{L}(X), \mathcal{L}(Y)\big) = \sup_{A \subset \mathbb{R}} |P(X \in A) - P(Y \in A)|.$$

Theorem 3.1 (Arratia, Goldstein and Gordon (1990)). Let $W = \sum_{\alpha \in \mathcal{A}} Y_\alpha$ be a sum of Bernoulli random variables, where $\mathcal{A}$ is the index set and $P(Y_\alpha = 1) = 1 - P(Y_\alpha = 0) = p_\alpha$. Let $\lambda = \sum_{\alpha \in \mathcal{A}} p_\alpha$, and let $Poi(\lambda)$ denote the Poisson distribution with mean $\lambda$. Then,
$$d_{TV}\big(\mathcal{L}(W), Poi(\lambda)\big) \le \Big(1 \wedge \frac{1}{\lambda}\Big)(b_1 + b_2 + b_3), \qquad (3.1)$$
where, given any neighborhood $B_\alpha$ for each $\alpha$ such that $\alpha \in B_\alpha \subset \mathcal{A}$,
$$b_1 := \sum_{\alpha \in \mathcal{A}} \sum_{\beta \in B_\alpha} p_\alpha p_\beta, \qquad b_2 := \sum_{\alpha \in \mathcal{A}} \sum_{\alpha \ne \beta \in B_\alpha} E(Y_\alpha Y_\beta), \qquad b_3 := \sum_{\alpha \in \mathcal{A}} E\Big| E\big[ Y_\alpha - p_\alpha \mid \sigma(Y_\beta : \beta \notin B_\alpha) \big] \Big|. \qquad (3.2)$$

Remark 3.1. If $B_\alpha$ is chosen such that $Y_\alpha$ is independent of $\{Y_\beta : \beta \notin B_\alpha\}$, then $b_3$ in (3.1) equals 0. Roughly speaking, in order for $b_1$ and $b_2$ to be small, the size of $B_\alpha$ has to be small and $E(Y_\beta \mid Y_\alpha = 1) = o(1)$ for $\alpha \ne \beta \in B_\alpha$.
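To make Theorem 3.1 concrete: once $B_\alpha$ is chosen so that $b_3 = 0$, the bound (3.1) only requires $b_1$ and $b_2$. The sketch below estimates them by Monte Carlo for a toy array of head-run indicators (the helper function, the toy model, and all parameters are illustrative assumptions, not constructions from the paper):

```python
import numpy as np

def stein_poisson_bound(Y, radius):
    """Estimate b1, b2 of (3.2) from samples of a Bernoulli array Y (reps x N),
    with B_alpha = {beta : |beta - alpha| <= radius}; b3 = 0 when Y_alpha is
    independent of the Y_beta outside B_alpha."""
    p = Y.mean(axis=0)                                   # p_alpha
    lam, N = p.sum(), Y.shape[1]
    b1 = b2 = 0.0
    for al in range(N):
        lo, hi = max(0, al - radius), min(N, al + radius + 1)
        b1 += p[al] * p[lo:hi].sum()
        for be in range(lo, hi):
            if be != al:
                b2 += (Y[:, al] * Y[:, be]).mean()       # E(Y_alpha * Y_beta)
    return lam, min(1.0, 1.0 / lam) * (b1 + b2)          # right side of (3.1)

# toy model: Y_alpha indicates that coins alpha, alpha+1, alpha+2 are all heads
rng = np.random.default_rng(3)
x = rng.random((4000, 60)) < 0.4
Y = (x[:, :-2] & x[:, 1:-1] & x[:, 2:]).astype(float)
print(stein_poisson_bound(Y, radius=3))
```

The clumping of runs makes $b_2$ the dominant term in this toy example, which is exactly why the proofs below declump the indicators before applying the theorem.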


3.1 Proof of Theorem 2.1

In this proof, let $C$ and $c$ denote positive constants which may represent different values in different expressions. By choosing $C$ large enough in (2.2), and using (cf. Theorem 1 and Theorem 6 of Petrov (1965))
$$P(M_{n;t} \ge b) \le (n-t+1)P(X_1 + \cdots + X_t \ge b) \sim (n-t+1)\, \frac{e^{-(a\theta_a - \Psi(\theta_a))t}}{t^{1/2}}, \qquad (3.3)$$
where $x \sim y$ means $x/y$ is bounded away from zero and infinity, the bound (2.2) holds true if $t$ or $n-t$ is bounded. Therefore, in the sequel, we can assume $t$ and $n-t$ to be larger than any given constant. We embed the sequence $\{X_1, \dots, X_n\}$ into an infinite i.i.d. sequence $\{\dots, X_{-1}, X_0, X_1, \dots\}$. For each integer $\alpha$, let
$$T_\alpha = X_\alpha + \cdots + X_{\alpha+t-1}, \qquad \widetilde{Y}_\alpha = I(T_\alpha \ge b).$$

To avoid the clumping of 1's in the sequence $(\widetilde{Y}_\alpha)$, which would make a Poisson approximation invalid, we define
$$Y_\alpha = I(T_\alpha \ge b,\ T_{\alpha-1} < b, \dots, T_{\alpha-m} < b),$$
where $m \le \sqrt{t}$ will be chosen later in (3.27). Let
$$W = \sum_{\alpha=1}^{n-t+1} Y_\alpha, \qquad \lambda_1 = EW = (n-t+1)EY_1.$$

In the following, we first bound $|P(M_{n;t} \ge b) - P(W \ge 1)|$, then bound the total variation distance between the distribution of $W$ and $Poi(\lambda_1)$, and finally bound $|\lambda_1 - \lambda|$. First, since $\{M_{n;t} \ge b\} \setminus \{W \ge 1\} \subset \cup_{\alpha=1}^{m}\{T_\alpha \ge b\}$, we have
$$0 \le P(M_{n;t} \ge b) - P(W \ge 1) \le mP(X_1 + \cdots + X_t \ge b). \qquad (3.4)$$
Next, we apply Theorem 3.1 to bound the total variation distance between the distribution of $W$ and $Poi(\lambda_1)$. For each $1 \le \alpha \le n-t+1$, define $B_\alpha = \{1 \le \beta \le n-t+1 : |\alpha - \beta| < t+m\}$. By definition of $B_\alpha$, $Y_\alpha$ is independent of $\{Y_\beta : \beta \notin B_\alpha\}$. Therefore, $b_3$ in (3.2) equals zero. Since $|B_\alpha| < 2(t+m)$,
$$b_1 = \sum_{1 \le \alpha \le n-t+1}\ \sum_{\beta \in B_\alpha} EY_\alpha\, EY_\beta < 2(t+m)\lambda_1 EY_1.$$

By our definition of $Y_\alpha$, for $1 \le |\beta - \alpha| \le m$, $EY_\alpha Y_\beta = 0$, and for $m < |\beta - \alpha| < t+m$, $EY_\alpha Y_\beta \le E\widetilde{Y}_{\alpha \wedge \beta}\widetilde{Y}_{\alpha \vee \beta}$. Therefore, by symmetry,
$$b_2 = \sum_{1 \le \alpha \le n-t+1}\ \sum_{\alpha \ne \beta \in B_\alpha} EY_\alpha Y_\beta < 2(n-t+1)E\widetilde{Y}_1 \sum_{\beta=m+2}^{m+t} P(T_\beta \ge b \mid T_1 \ge b).$$

For $\beta \ge t+1$, $P(T_\beta \ge b \mid T_1 \ge b) = P(T_1 \ge b)$.

Let a positive number $0 < \delta < 1 \wedge (a-\mu_0)/4$ be chosen such that
$$\Psi(\theta_a) - (\mu_0+\delta)\theta_a > 0 \quad \text{and} \quad m < (a''-a)t/\delta, \qquad (3.5)$$

where $a'' < a'$ will be chosen later. The first inequality above is possible because of the strict convexity of $\Psi$. We observe that, for $m+2 \le \beta \le t$, $T_\beta \ge b$ and $X_{t+1} + \cdots + X_{t+\beta-1} \le (\mu_0+\delta)(\beta-1)$ together imply $X_\beta + \cdots + X_t \ge at - (\mu_0+\delta)(\beta-1)$. Therefore,
$$\sum_{\beta=m+2}^{t} P(T_\beta \ge b \mid T_1 \ge b) \le \sum_{\beta=m+2}^{t} \Big[ P\big(X_{t+1} + \cdots + X_{t+\beta-1} \ge (\mu_0+\delta)(\beta-1)\big) + P\big(X_\beta + \cdots + X_t \ge at - (\mu_0+\delta)(\beta-1) \,\big|\, T_1 \ge b\big) \Big].$$

For the first term, we have
$$\sum_{\beta=m+2}^{t} P\big(X_{t+1} + \cdots + X_{t+\beta-1} \ge (\mu_0+\delta)(\beta-1)\big) \le \sum_{\beta=m+2}^{t} e^{-[\theta_{\mu_0+\delta}(\mu_0+\delta) - \Psi(\theta_{\mu_0+\delta})](\beta-1)} \le \frac{e^{-[\theta_{\mu_0+\delta}(\mu_0+\delta) - \Psi(\theta_{\mu_0+\delta})]m}}{1 - e^{-[\theta_{\mu_0+\delta}(\mu_0+\delta) - \Psi(\theta_{\mu_0+\delta})]}}. \qquad (3.6)$$

By the bound on $V$ on page 613 of Komlós and Tusnády (1975), and recalling that we have chosen $\delta$ such that $\Psi(\theta_a) - (\mu_0+\delta)\theta_a > 0$,
$$\sum_{\beta=m+2}^{t} P\big(X_\beta + \cdots + X_t \ge at - (\mu_0+\delta)(\beta-1) \,\big|\, T_1 \ge b\big) \le C\sum_{\beta=m+2}^{t} \sqrt{\frac{t}{t-\beta+1}}\, e^{-[\Psi(\theta_a) - (\mu_0+\delta)\theta_a](\beta-1)} \le C\, \frac{e^{-[\Psi(\theta_a) - (\mu_0+\delta)\theta_a]m}}{1 - e^{-[\Psi(\theta_a) - (\mu_0+\delta)\theta_a]}}.$$

Therefore,
$$b_2 \le C(n-t+1)P(T_1 \ge b)\big[ mP(T_1 \ge b) + e^{-cm} \big]. \qquad (3.7)$$
By Theorem 3.1,
$$\big| P(W \ge 1) - (1 - e^{-\lambda_1}) \big| \le C\Big(1 \wedge \frac{1}{\lambda_1}\Big)(n-t+1)P(T_1 \ge b)\big[ tP(T_1 \ge b) + e^{-cm} \big]. \qquad (3.8)$$

Finally, we calculate $EY_1$ approximately. By symmetry, we can write
$$EY_1 = E\, I(T_1 \ge b,\ T_2 < b, \dots, T_{m+1} < b) = E\, \widetilde{Y}_1(1-\widetilde{Y}_2)\cdots(1-\widetilde{Y}_{m+1}) \le \int_b^{b+m\delta} E\big[ (1-\widetilde{Y}_2)\cdots(1-\widetilde{Y}_{m+1}) \,\big|\, S_t = s \big]\, dP(S_t \le s) + P(S_t \ge b+m\delta), \qquad (3.9)$$
where $S_t = X_1 + \cdots + X_t$. Observe that $T_1 = s$ and $T_{i+1} < b$ imply $T_1 - T_{i+1} = S_i - (S_{i+t} - S_t) > s - b$. Therefore, given $T_1 = s$, $(1-\widetilde{Y}_2)\cdots(1-\widetilde{Y}_{m+1})$ is the indicator of the event $\{\widetilde{S}_i^{s/t} - S_i > s-b,\ 1 \le i \le m\}$, where $\widetilde{S}_i^{s/t}$ is independent of $S_i$,
$$\widetilde{S}_i^{s/t} = \sum_{j=1}^{i} \widetilde{X}_j^{s/t}, \quad \text{and} \quad \mathcal{L}\big(\widetilde{X}_i^{s/t} : 1 \le i \le m\big) = \mathcal{L}\big(X_i : 1 \le i \le m \,\big|\, S_t = s\big).$$

Note that the assumption $m < (a''-a)t/\delta$ in (3.5) implies $a \le s/t \le a''$. It is known that when $m$ is small compared to $t$, the conditional sequence $\{\widetilde{X}_i^{s/t} : 1 \le i \le m\}$ behaves like an i.i.d. sequence $\{X_i^{s/t} : 1 \le i \le m\}$, where $X_i^{s/t}$ comes from the same exponential family (2.1) as $X_i$, but with a different parameter $\theta_{s/t}$. From the proof of Theorem 1.6 of Diaconis and Freedman (1988) and the assumption on the exponential family in the statement of the theorem, we have
$$d_{TV}\Big( \mathcal{L}\big(\widetilde{X}_i^{s/t} : 1 \le i \le m\big),\ \mathcal{L}\big(X_i^{s/t} : 1 \le i \le m\big) \Big) \le C\, \frac{m}{t}. \qquad (3.10)$$
In fact, for the non-arithmetic case, since only the range of parameters $[a, a']$ enters into consideration, we do not need Condition 1.1 of Diaconis and Freedman (1988). In the following we verify their Conditions 1.2--1.4. By our assumptions for the non-arithmetic case, their Conditions 1.2 and 1.4 are satisfied for the range of parameters $[a, a']$. By $\int_{-\infty}^{\infty} |\varphi_{\theta_a}(t)|^{\nu}\, dt < \infty$, we have, for $t \ne 0$, $|\varphi_{\theta_a}(t)| < 1$, and $|\varphi_{\theta_a}(t)| \to 0$ as $|t| \to \infty$. Therefore, there exists $M > 0$ such that $|\varphi_{\theta_a}(t)| < 1/2$ for $|t| > M$. This, together with the fact that $|\varphi_{\theta_a+h}(t) - \varphi_{\theta_a}(t)| \to 0$ as $h \to 0^+$ uniformly in $t$ by the dominated convergence theorem, implies that there exists $a' > a'' > a$ such that
$$\sup_{\theta_a \le \theta \le \theta_{a''}}\ \sup_{|t| > \delta} |\varphi_\theta(t)| < 1 \quad \text{for all } \delta > 0.$$

Therefore, Conditions 1.2--1.4 of Diaconis and Freedman (1988) are satisfied for the range of parameters $[a, a'']$, which yields (3.10). The arithmetic case can be proved similarly. By the likelihood ratio identity, for $b < s \le b+m\delta$ and $m \ge \nu$,
$$d_{TV}\Big( \mathcal{L}\big(X_i^{s/t} : 1 \le i \le m\big),\ \mathcal{L}\big(X_i^a : 1 \le i \le m\big) \Big) \le E_{\theta_{s/t}} I(|S_m| > t/m) + E_{\theta_a} I(|S_m| > t/m) + E_{\theta_a}\Big| e^{(\theta_{s/t}-\theta_a)S_m - m(\Psi(\theta_{s/t})-\Psi(\theta_a))} - 1 \Big|\, I(|S_m| \le t/m).$$
For $s/t \in [a, a'']$, we have
$$|\theta_{s/t} - \theta_a| \le \sup_{\theta_a \le \theta \le \theta_{a''}} \frac{1}{\Psi''(\theta)}\, (s/t - a), \qquad |\Psi(\theta_{s/t}) - \Psi(\theta_a)| \le \sup_{\theta_a \le \theta \le \theta_{a''}} |\Psi'(\theta)|\, (s/t - a). \qquad (3.11)$$
This implies that if $a < s/t \le a + m\delta/t$, $|S_m| \le t/m$ and $m \le \sqrt{t}$, then
$$(\theta_{s/t} - \theta_a)S_m - m(\Psi(\theta_{s/t}) - \Psi(\theta_a)) \le C.$$
Therefore, by Markov's inequality and the fact that $|e^x - 1| \le C|x|$ if $x$ is bounded, for $b \le s \le b+m\delta$,
$$d_{TV}\Big( \mathcal{L}\big(X_i^{s/t} : 1 \le i \le m\big),\ \mathcal{L}\big(X_i^a : 1 \le i \le m\big) \Big) \le \frac{m}{t}\big( E_{\theta_{s/t}}|S_m| + E_{\theta_a}|S_m| \big) + C\, E_{\theta_a}\big| (\theta_{s/t}-\theta_a)S_m - m(\Psi(\theta_{s/t})-\Psi(\theta_a)) \big| \le C m^2/t, \qquad (3.12)$$
where in the last inequality we used $s/t \in [a, a'']$, $E_\theta|S_m| \le Cm$ for $\theta \in [\theta_a, \theta_{a''}]$, and (3.11). Therefore, with $D_i := S_i^a - S_i$, where $S_i^a = \sum_{j=1}^{i} X_j^a$ and $X_j^a$ is defined in the statement of Theorem 2.1, we have, by (3.9), (3.10) and (3.12),

$$EY_1 \le \int_b^{b+m\delta} \Big[ P(D_i > s-b,\ 1 \le i \le m) + C\, \frac{m^2}{t} \Big]\, dP(S_t \le s) + P(S_t \ge b+m\delta). \qquad (3.13)$$

Recalling that $0 < \delta < 1 \wedge (a-\mu_0)/4$ above (3.5), so that $b \le s \le b+m\delta$ implies
$$s - b - m(a-\mu_0)/2 < m(\mu_0-a)/4,$$
we have, for $m_1 > m$,
$$\begin{aligned}
P(D_i > s-b,\ 1 \le i \le m_1) &= P(D_i > s-b,\ 1 \le i \le m)\, P(D_i > s-b,\ m < i \le m_1 \mid D_i > s-b,\ 1 \le i \le m) \\
&\ge P(D_i > s-b,\ 1 \le i \le m)\, P(D_i > s-b,\ m < i \le m_1) \\
&\ge P(D_i > s-b,\ 1 \le i \le m)\, P\big(D_i > s-b,\ i > m,\ D_m \ge m(a-\mu_0)/2\big) \\
&\ge P(D_i > s-b,\ 1 \le i \le m)\Big\{ 1 - P\big(D_m < m(a-\mu_0)/2\big) - \sum_{i=1}^{\infty} P\big(D_i < m(\mu_0-a)/4\big) \Big\}.
\end{aligned} \qquad (3.14)$$
The first inequality in (3.14) follows from the FKG inequality (cf. (1.7) of Karlin and Rinott (1980)) and the fact that $I(D_i > s-b,\ 1 \le i \le m)$ and $I(D_i > s-b,\ m < i \le m_1)$ are both increasing functions of $\{X_1^a - X_1, \dots, X_{m_1}^a - X_{m_1}\}$. Letting $m_1 \to \infty$, we have
$$P(D_i > s-b,\ i \ge 1) \ge P(D_i > s-b,\ 1 \le i \le m)\Big\{ 1 - P\big(D_m < m(a-\mu_0)/2\big) - \sum_{i=1}^{\infty} P\big(D_i < m(\mu_0-a)/4\big) \Big\}. \qquad (3.15)$$

For $0 < r \le \theta_a$,
$$E\exp(-rD_i) = \exp\Big( -i\big[ \Psi(\theta_a) - \Psi(r) - \Psi(\theta_a - r) \big] \Big).$$
By Taylor's expansion,
$$\Psi(\theta_a) - \Psi(\theta_a - r) = ra - \frac{r^2}{2}\Psi''(\theta_a - r_1), \qquad -\Psi(r) = -r\mu_0 - \frac{r^2}{2}\Psi''(r_2),$$
where $0 \le r_1, r_2 \le r$. Therefore,
$$P\big(D_m < m(a-\mu_0)/2\big) \le \exp\Big( -m\big[ \Psi(\theta_a) - \Psi(r) - \Psi(\theta_a - r) - \tfrac{(a-\mu_0)r}{2} \big] \Big) = \exp\Big( -m\Big[ \frac{r}{2}(a-\mu_0) - \frac{r^2}{2}\big( \Psi''(\theta_a - r_1) + \Psi''(r_2) \big) \Big] \Big).$$
Let
$$c_1 = \frac{a-\mu_0}{4\theta_a} \vee \max_{\theta \in \Theta:\, 0 \le \theta \le \theta_a} \Psi''(\theta).$$
Choosing $r = (a-\mu_0)/(4c_1)$, we have
$$P\big(D_m < m(a-\mu_0)/2\big) \le \exp\Big( -\frac{(a-\mu_0)^2}{16c_1}\, m \Big). \qquad (3.16)$$

Similarly,
$$P\big(D_i < m(\mu_0-a)/4\big) \le \exp\Big( -\frac{(a-\mu_0)^2}{16c_1}\, i - \frac{(a-\mu_0)^2}{16c_1}\, m \Big). \qquad (3.17)$$
Applying (3.16) and (3.17) in (3.15), we obtain
$$P(D_i > s-b,\ 1 \le i \le m) \le P(D_i > s-b,\ i \ge 1) + \frac{2\exp\big( -m(a-\mu_0)^2/(16c_1) \big)}{1 - \exp\big( -(a-\mu_0)^2/(16c_1) \big)}.$$

Therefore, by (3.13),
$$EY_\alpha - \frac{\lambda_2}{n-t+1} \le \Bigg[ C\, \frac{m^2}{t} + \frac{2\exp\big( -m(a-\mu_0)^2/(16c_1) \big)}{1 - \exp\big( -(a-\mu_0)^2/(16c_1) \big)} + \frac{P(S_t \ge b+m\delta)}{P(S_t \ge b)} \Bigg] P(S_t \ge b), \qquad (3.18)$$
where
$$\lambda_2 = (n-t+1)\int_b^{\infty} P(D_i > s-b,\ i \ge 1)\, dP(S_t \le s).$$

From the corollary on page 611 of Komlós and Tusnády (1975), and recalling that in proving (2.2) we need only consider $t$ larger than any given constant, we have
$$P(T_1 \ge b+m\delta \mid T_1 \ge b) \le Ce^{-\theta_a m\delta}. \qquad (3.19)$$
After proving a similar and easier lower bound for $EY_1$, we obtain, along with (3.19),
$$\Big| EY_1 - \frac{\lambda_2}{n-t+1} \Big| \le C\Big( \frac{m^2}{t} + e^{-cm} \Big) P(S_t \ge b). \qquad (3.20)$$

To calculate $\lambda_2$, we first consider the non-arithmetic case of Theorem 2.1. By the proof of Theorem 2.7 of Woodroofe (1982), we have, for $x > 0$,
$$P(D_i > x,\ i \ge 1) = \frac{P(D_{\tau_+} > x)}{E\tau_+}, \qquad (3.21)$$
where $\tau_+ = \inf\{i \ge 1 : D_i > 0\}$. Let $x_0 = \log t/\theta_a$. By a change of variable and the likelihood ratio identity,
$$\lambda_2 = (n-t+1)\int_0^{\infty} P(D_i > x,\ i \ge 1)\, dP(S_t \le b+x) = (n-t+1)e^{-(a\theta_a - \Psi(\theta_a))t} \int_0^{x_0} P(D_i > x,\ i \ge 1)\, e^{-\theta_a x}\, dP_{\theta_a}(S_t \le b+x) + O\big( (n-t+1)P(S_t \ge b+x_0) \big). \qquad (3.22)$$

By the local central limit theorem (cf. Feller (1971)), uniformly for $0 \le x \le x_0$,
$$dP_{\theta_a}(S_t \le b+x) = \Big[ \frac{1}{\sigma_a(2\pi t)^{1/2}} + O\Big( \frac{(\log t)^2}{t^{3/2}} \Big) \Big]\, dx. \qquad (3.23)$$
By (3.19) and (3.3),
$$P(S_t \ge b+x_0) = P(S_t \ge b+x_0 \mid S_t \ge b)\, P(S_t \ge b) \le Ce^{-\theta_a x_0}\, \frac{e^{-(a\theta_a - \Psi(\theta_a))t}}{\sqrt{t}} \le C\, \frac{1}{t}\, \frac{e^{-(a\theta_a - \Psi(\theta_a))t}}{\sqrt{t}}. \qquad (3.24)$$

Applying (3.21), (3.23) and (3.24) in (3.22), we obtain
$$\begin{aligned}
\lambda_2 &= \frac{(n-t+1)e^{-(a\theta_a - \Psi(\theta_a))t}}{(E\tau_+)\sigma_a(2\pi t)^{1/2}} \int_0^{x_0} P(D_{\tau_+} > x)\, e^{-\theta_a x}\Big( 1 + O\Big( \frac{(\log t)^2}{t} \Big) \Big)\, dx + O\big( (n-t+1)P(S_t \ge b+x_0) \big) \\
&= \frac{(n-t+1)e^{-(a\theta_a - \Psi(\theta_a))t}}{(E\tau_+)\sigma_a(2\pi t)^{1/2}} \Bigg[ \int_0^{\infty} P(D_{\tau_+} > x)\, e^{-\theta_a x}\, dx\, \Big( 1 + O\Big( \frac{(\log t)^2}{t} \Big) \Big) + O\Big( \frac{1}{t} \Big) \Bigg].
\end{aligned}$$

By the integration by parts formula,
$$\frac{1}{E\tau_+}\int_0^{\infty} P(D_{\tau_+} > x)\, e^{-\theta_a x}\, dx = \frac{1}{\theta_a E\tau_+}\big[ 1 - Ee^{-\theta_a D_{\tau_+}} \big] = \frac{1}{\theta_a E\tau_+} \exp\Big( -\sum_{k=1}^{\infty} k^{-1} E\big( e^{-\theta_a D_k};\ D_k > 0 \big) \Big) = \frac{1}{\theta_a} \exp\Big( -\sum_{k=1}^{\infty} k^{-1} E\big( e^{-\theta_a D_k^+} \big) \Big), \qquad (3.25)$$

where we used the first equality in the proof of Corollary 2.7 of Woodroofe (1982) and Corollary 2.4 of Woodroofe (1982). Therefore,
$$\lambda_2 = \frac{(n-t+1)e^{-(a\theta_a - \Psi(\theta_a))t}}{\theta_a\sigma_a(2\pi t)^{1/2}} \exp\Big( -\sum_{k=1}^{\infty} k^{-1} E\big( e^{-\theta_a D_k^+} \big) \Big)\Big( 1 + O\Big( \frac{(\log t)^2}{t} \Big) \Big). \qquad (3.26)$$
Let
$$m = \lfloor C(\log t \wedge \log(n-t)) \rfloor \qquad (3.27)$$

be such that $e^{-cm} = O\big( \frac{1}{t} \vee \frac{1}{n-t} \big)$ for the constants $c$ in (3.8) and (3.20). Recall that in proving (2.2) we need only consider $t$ larger than any given number, so (3.5) is satisfied. From (3.3),
$$\lambda \sim (n-t+1)P(X_1 + \cdots + X_t \ge b).$$
By (3.20) and (3.26),
$$|\lambda_1 - \lambda| \le C\lambda\Big( \frac{(\log t)^2}{t} + \frac{1}{n-t} \Big).$$
By (3.4) and (3.8),
$$\big| P(M_{n;t} \ge b) - (1 - e^{-\lambda_1}) \big| \le C(\lambda \wedge 1)\Big[ e^{-ct} + \frac{\log t \wedge \log(n-t)}{n-t} \Big].$$
The bound (2.2) is proved by using the above two bounds for the cases $\lambda = O(1)$ and $\lambda \gg 1$ separately, together with $|e^{-\lambda} - e^{-\lambda_1}| \le |\lambda - \lambda_1|e^{-(\lambda \wedge \lambda_1)}$. Next we consider the arithmetic case of Theorem 2.1. Without loss of generality, we assume $X_1$ is integer-valued with span 1. The calculation of $\lambda_2$ is similar to the non-arithmetic case, except that we have, for integers $0 \le k \le x_0$,

$$P_{\theta_a}(S_t = \lceil b\rceil + k) = \frac{1}{\sigma_a(2\pi t)^{1/2}} + O\Big( \frac{(\log t)^2}{t^{3/2}} \Big)$$
and
$$\sum_{k=0}^{\infty} P\big(D_{\tau_+} > \lceil b\rceil - b + k\big)\, e^{-\theta_a(\lceil b\rceil - b + k)} = e^{-\theta_a(\lceil b\rceil - b)} \sum_{k=0}^{\infty} P(D_{\tau_+} > k)\, e^{-\theta_a k} = \frac{e^{-\theta_a(\lceil b\rceil - b)}}{1 - e^{-\theta_a}}\Big[ 1 - E\big( e^{-\theta_a D_{\tau_+}} \big) \Big].$$
Therefore, for the arithmetic case,
$$\lambda_2 = \frac{(n-t+1)e^{-(a\theta_a - \Psi(\theta_a))t}\, e^{-\theta_a(\lceil b\rceil - b)}}{(1 - e^{-\theta_a})\sigma_a(2\pi t)^{1/2}} \exp\Big( -\sum_{n=1}^{\infty} n^{-1} E\big( e^{-\theta_a D_n^+} \big) \Big)\Big( 1 + O\Big( \frac{(\log t)^2}{t} \Big) \Big).$$

3.2 Proof of Theorem 2.3

Recall $S_i = \sum_{k=1}^{i} X_k$. Define $\tau_+ = \inf\{n \ge 1 : S_n > 0\}$ and
$$\tau_b := \inf\{n \ge 1 : S_n \ge b\}, \qquad T_b := \inf\{n \ge 1 : S_n \notin [0, b)\}. \qquad (3.28)$$

In this proof, let $C$ and $c$ denote positive constants which may represent different values in different expressions. If $b$ is bounded, then by choosing $C$ large enough in (2.7) we have $C\lambda b^{1/2}h(b)/(n - b/\mu_1) \ge 1$, and (2.7) is trivial. Therefore, in the following we can assume $b$ is larger than any given constant. Moreover, since we assume $h(b) = O(b^{1/2})$ in the theorem, by choosing $C$ large enough and $c$ small enough in (2.7), we only need to consider the case where $h(b)/b^{1/2}$ is smaller than any given positive constant. In particular, we can assume
$$\frac{h(b)}{b^{1/2}} \le \min\Bigg\{ \frac{2}{\mu_1},\ \frac{2(\theta_1' - \theta_1)\sup_{0 \le \theta \le \theta_1'}\Psi''(\theta)}{\mu_1^2} \Bigg\} \qquad (3.29)$$
for some $\theta_1' \in \Theta$ with $\theta_1 < \theta_1' < 2\theta_1$. We first prove several lemmas that will be used in the proof of Theorem 2.3.

Lemma 3.2. Let $\{X_1, \dots, X_n\}$ be independent, identically distributed random variables with distribution function $F$ that can be imbedded in an exponential family, as in (2.1). Let $EX_1 = \mu_0 < 0$. Let $S_0 = 0$ and $S_i = \sum_{k=1}^{i} X_k$ for $1 \le i \le n$. Suppose there exists $\theta_1 > 0$ such that $\Psi(\theta_1) = 0$. Let $\mathcal{F}_n = \sigma\{X_1, \dots, X_n\}$, and let $T$ be a stopping time with respect to $\{\mathcal{F}_n\}$. Then we have
$$P(F \cap \{T < \infty\}) = E_{\theta_1}\big[ e^{-\theta_1 S_T} I(F \cap \{T < \infty\}) \big] \qquad (3.30)$$
for any $F \in \mathcal{F}_T$.

Proof. Equation (3.30) follows by a direct application of Wald's likelihood ratio identity (cf. Theorem 1.1 of Woodroofe (1982)) to the sequence $\{X_1, X_2, \dots\}$.
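Lemma 3.2 can be sanity-checked numerically. With $T = \tau_b$ and $F$ the whole sample space, (3.30) says $P(\tau_b < \infty) = E_{\theta_1}[e^{-\theta_1 S_{\tau_b}}]$, since $\tau_b < \infty$ a.s. under $P_{\theta_1}$. A rough sketch for the Gaussian family follows; the truncation of the infinite horizon and all parameter values are assumptions of this sketch:

```python
import numpy as np

def wald_identity_check(mu0=-0.5, b=3.0, paths=2000, horizon=5000, seed=4):
    """For F = N(mu0, 1): theta_1 = -2*mu0 satisfies Psi(theta_1) = 0, and under
    P_{theta_1} the increments are N(-mu0, 1), i.e. have positive drift."""
    rng = np.random.default_rng(seed)
    theta1, mu1 = -2.0 * mu0, -mu0
    # RHS of (3.30): simulate the first passage over b under P_{theta_1}
    vals = []
    for _ in range(paths):
        s = 0.0
        while s < b:
            s += rng.normal(mu1, 1.0)
        vals.append(np.exp(-theta1 * s))
    # LHS: crude truncated estimate of P(tau_b < infinity) under P (negative drift)
    hits = sum(np.cumsum(rng.normal(mu0, 1.0, size=horizon)).max() >= b
               for _ in range(paths))
    return hits / paths, float(np.mean(vals))

print(wald_identity_check())   # the two numbers should roughly agree
```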

Lemma 3.3. Let $t = \lceil b/\mu_1 + b^{1/2}h(b) \rceil$. We have
$$P_{\theta_1}(T_b > t) \le Ce^{-ch^2(b)}.$$

Proof. Let
$$r = \frac{\mu_1^2}{2\sup_{0 \le \theta \le \theta_1}\Psi''(\theta)}\, \frac{h(b)}{b^{1/2}}.$$
By (3.29), we have $r < \theta_1$ and $\mu_1 r - \sup_{0 \le \theta \le \theta_1}\Psi''(\theta)\, r^2/2 \ge \mu_1 r/2$. By Markov's inequality and Taylor's expansion,
$$\begin{aligned}
P_{\theta_1}(T_b > t) &\le P_{\theta_1}(S_t \le b) \le e^{rb}\, E_{\theta_1} e^{-rS_t} \\
&\le \exp\big( rb - [\Psi(\theta_1) - \Psi(\theta_1 - r)]\, t \big) \\
&\le \exp\Big( rb - \big[ \mu_1 r - \sup_{0 \le \theta \le \theta_1}\Psi''(\theta)\, r^2/2 \big]\, t \Big) \\
&\le \exp\Bigg( \frac{\sup_{0 \le \theta \le \theta_1}\Psi''(\theta)\, r^2 b}{2\mu_1} - \big[ \mu_1 r - \sup_{0 \le \theta \le \theta_1}\Psi''(\theta)\, r^2/2 \big]\, b^{1/2}h(b) \Bigg) \\
&\le \exp\Bigg( \frac{\sup_{0 \le \theta \le \theta_1}\Psi''(\theta)\, r^2 b}{2\mu_1} - \frac{\mu_1 r}{2}\, b^{1/2}h(b) \Bigg) \\
&\le \exp\Bigg( -\frac{\mu_1^3}{8\sup_{0 \le \theta \le \theta_1}\Psi''(\theta)}\, h^2(b) \Bigg).
\end{aligned}$$
This proves Lemma 3.3.

Lemma 3.4. For positive integers $m$, we have
$$\sum_{i=m}^{\infty} P(S_i \ge 0) \le Ce^{-cm}.$$
Proof. Lemma 3.4 follows from
$$P(S_i \ge 0) \le Ee^{\theta^* S_i} = e^{\Psi(\theta^*)i},$$

where $0 < \theta^* < \theta_1$ and $\Psi(\theta^*) < 0$.

Lemma 3.5. Let $t_1 = \lfloor b/\mu_1 - b^{1/2}h(b) \rfloor$. We have
$$E\, I\big( \cup_{0 \le i < j \le t_1}\{S_j - S_i \ge b\} \big) \le Ce^{-\theta_1 b}\, \frac{b}{h^2(b)}\, e^{-ch^2(b)}.$$

Proof. We only need to consider the case when $t_1 > 0$. Let
$$r = \frac{\mu_1^2}{2\sup_{\theta_1 \le \theta \le \theta_1'}\Psi''(\theta)}\, \frac{h(b)}{b^{1/2}}.$$
By (3.29), $\theta_1 + r \le \theta_1' \in \Theta$. We have
$$P_{\theta_1}(S_j \ge b) \le \exp\big( j[\Psi(\theta_1+r) - \Psi(\theta_1)] - rb \big),$$
thus
$$\sum_{j=1}^{i} P_{\theta_1}(S_j \ge b) \le \frac{1}{1 - e^{\Psi(\theta_1) - \Psi(\theta_1+r)}} \exp\big( i[\Psi(\theta_1+r) - \Psi(\theta_1)] - rb \big).$$
By (3.30) and Taylor's expansion,
$$\begin{aligned}
E\, I\big( \cup_{0 \le i < j \le t_1}\{S_j - S_i \ge b\} \big) &\le e^{-\theta_1 b} \sum_{i=1}^{t_1} \sum_{j=1}^{i} P_{\theta_1}(S_j \ge b) \\
&\le \frac{e^{-\theta_1 b}}{1 - e^{\Psi(\theta_1) - \Psi(\theta_1+r)}} \sum_{i=1}^{t_1} \exp\big( i[\Psi(\theta_1+r) - \Psi(\theta_1)] - rb \big) \\
&\le Ce^{-\theta_1 b}\, \frac{b}{h^2(b)} \exp\big( t_1[\Psi(\theta_1+r) - \Psi(\theta_1)] - rb \big) \\
&\le Ce^{-\theta_1 b}\, \frac{b}{h^2(b)} \exp\Big( \big( \tfrac{b}{\mu_1} - b^{1/2}h(b) \big)\big( r\mu_1 + \sup_{\theta_1 \le \theta \le \theta_1'}\Psi''(\theta)\, \tfrac{r^2}{2} \big) - rb \Big) \\
&\le Ce^{-\theta_1 b}\, \frac{b}{h^2(b)} \exp\Bigg( -b^{1/2}h(b)\, r\mu_1 + \frac{\sup_{\theta_1 \le \theta \le \theta_1'}\Psi''(\theta)\, r^2 b}{2\mu_1} \Bigg) \\
&\le Ce^{-\theta_1 b}\, \frac{b}{h^2(b)}\, e^{-ch^2(b)}.
\end{aligned}$$

Lemma 3.6. If $\int_{-\infty}^{\infty} |\varphi_{\theta_1}(t)|\, dt < \infty$, where $\varphi_{\theta_1}(t) = E_{\theta_1}e^{itX_1}$, then $S_{\tau_+}$ under $F_{\theta_1}$ has a bounded density and is strongly nonarithmetic in the sense that $\liminf_{|\lambda| \to \infty} |1 - \varphi_{\theta_1}(\lambda)| > 0$, where, with a slight abuse of notation, $\varphi_{\theta_1}(\lambda) = E_{\theta_1}e^{i\lambda S_{\tau_+}}$.

Proof. The condition $\int_{-\infty}^{\infty} |\varphi_{\theta_1}(t)|\, dt < \infty$ implies that $X_1$ is strongly nonarithmetic. By (8.42) of Siegmund (1985) with $s = 1$, the distribution of $S_{\tau_+}$ is also strongly nonarithmetic. The condition $\int_{-\infty}^{\infty} |\varphi_{\theta_1}(t)|\, dt < \infty$ also implies that the density of $X_1$ is bounded by a constant $M$. Therefore,
$$\begin{aligned}
P_{\theta_1}\big( S_{\tau_+} \in [x, x+dx] \big) &\le E_{\theta_1} \sum_{n=0}^{\infty} I\big( S_1, \dots, S_n \le 0,\ S_{n+1} \in [x, x+dx] \big) \\
&\le E_{\theta_1} \sum_{n=0}^{\infty} I\big( S_n \le 0,\ S_{n+1} \in [x, x+dx] \big) \\
&= \sum_{n=0}^{\infty} \int_{-\infty}^{0} P_{\theta_1}(S_n \in dt)\, P_{\theta_1}\big( X_1 \in [x-t,\ x+dx-t] \big) \\
&\le M\, dx \sum_{n=0}^{\infty} P_{\theta_1}(S_n \le 0) \le C\, dx,
\end{aligned}$$
where in the last inequality we used
$$P_{\theta_1}(S_n \le 0) \le e^{\Psi(\theta_1 - \theta^*)n} \qquad (3.31)$$
for $0 < \theta^* < \theta_1$, so that $\Psi(\theta_1 - \theta^*) < 0$. This proves that $S_{\tau_+}$ under $F_{\theta_1}$ has a bounded density.

for 0 < θ ∗ < θ1 so that Ψ(θ1 − θ ∗ ) < 0. This proves that Sτ+ under Fθ1 has bounded density. Proof of Theorem 2.3. We embed the sequence {X1 , . . . , Xn } into an infinite + be i.i.d. sequence {. . . , X−1 , X0 , X1 , . . . }. For a positive integer m, let ωm +) = S the m-shifted sample path of ω := {X1 , . . . , Xn }, so Si (ωm m+i (ω) − + + + ), τ (ω + ) are Sm (ω), Tb (ωm ) = inf{n > 1 : Sn (ωm ) ∈ / [0, b)}, and τb (ωm + m defined similarly. Let t = ⌈ µb1 + b1/2 h(b)⌉ and m < t to be chosen at the end of this proof. For 1 6 α 6 n − t, let  Yα = I Sα < Sα−β , ∀ 1 6 β 6 m; Tb (ωα+ ) 6 t, STb (ωα+ ) > b .

That is, Yα is the indicator of the event that the sequence {Si } reaches a local minimum at α and the α-shifted sequence {Si (ωα+ )} exits the interval [0, b) within time t and the first exiting position is b. Let n−t X

W =

Yα .

α=1

In the following, we first compare pn,b with P(W > 1). Then, we approximate the distribution of W by the Poisson distribution with mean E(W ). Finally, we calculate approximately E(W ). First, from the definition of W , we have pn,b > P(W > 1) and with t1 = ⌊b/µ1 − b1/2 h(b)⌋, { max (Sj − Si ) > b}\{W > 1} 06i b, Tb (ωk+ ) > t ⊂ ∪k=0   ∪ ∪k∈[0,m]∪(n−t,n−t1) STb (ωk+ ) > b, Tb (ωk+ ) 6 t   ∪ ∪n−t1 6i b .

By symmetry,

pn,b − P(W > 1)

6 (n − t)P(STb > b, Tb > t) + (m + 2b1/2 h(b) + 2)P(STb > b, Tb 6 t) + EI(∪06i b}).

(3.32)

By (3.30) and Lemma 3.3, we have
$$P(S_{T_b} \ge b) = E_{\theta_1}\big[ e^{-\theta_1 S_{T_b}} I(S_{T_b} \ge b) \big] \le e^{-\theta_1 b} \qquad (3.33)$$
and
$$P(S_{T_b} \ge b,\ T_b > t) = E_{\theta_1}\big[ e^{-\theta_1 S_{T_b}} I(S_{T_b} \ge b,\ T_b > t) \big] \le e^{-\theta_1 b}\, P_{\theta_1}(T_b > t) \le Ce^{-\theta_1 b - ch^2(b)}. \qquad (3.34)$$
Along with Lemma 3.5,
$$p_{n,b} - P(W \ge 1) \le C\Big( n - \frac{b}{\mu_1} \Big)e^{-\theta_1 b}\Bigg[ e^{-ch^2(b)} + \frac{m + b^{1/2}h(b)}{n - b/\mu_1} + \frac{b/h^2(b)}{n - b/\mu_1}\, e^{-ch^2(b)} \Bigg]. \qquad (3.35)$$

Next, we use Theorem 3.1 to obtain a bound on the total variation distance between the distribution of $W$ and $Poi(\lambda_1)$, with $\lambda_1 := E(W) = (n-t)EY_\alpha$. For each $1 \le \alpha \le n-t$, let $B_\alpha = \{1 \le \beta \le n-t : |\beta - \alpha| \le t+m\}$. In applying Theorem 3.1, by our definition of $B_\alpha$, $b_3 = 0$. Since $|B_\alpha| \le 2(t+m)+1$, we have
$$b_1 \le [2(t+m)+1]\lambda_1 EY_\alpha \le C(n-t)(t+m)P^2(S_{T_b} \ge b) \le C(n-t)(t+m)e^{-2\theta_1 b}. \qquad (3.36)$$
Let $\widetilde{Y}_\alpha = I\big( T_b(\omega_\alpha^+) \le t,\ S_{T_b}(\omega_\alpha^+) \ge b \big)$. We have, for $b_2$ in (3.2),
$$b_2 \le \sum_{\alpha=1}^{n-t} \sum_{\alpha \ne \beta \in B_\alpha} E(Y_\alpha Y_\beta) \le 2\sum_{\beta=1}^{n-t}\Bigg[ \sum_{\beta-m \le \alpha \le \beta-1} EY_\beta\widetilde{Y}_\alpha + \sum_{\beta-t-m \le \alpha < \beta-m} EY_\beta\widetilde{Y}_\alpha \Bigg],$$
where, for $\beta-t-m \le \alpha < \beta-m$, we use
$$EY_\beta\widetilde{Y}_\alpha \le E\, Y_\beta\widetilde{Y}_\alpha\big[ I(S_\beta > S_\alpha) + I(S_\beta < S_\alpha) \big] \le E\, I(S_\beta - S_\alpha > 0)\widetilde{Y}_\beta + E\, I\big( S_{T_b}(\omega_\alpha^+) \ge b,\ T_b(\omega_\alpha^+) \le \beta-\alpha \big)\widetilde{Y}_\beta.$$
By independence and symmetry,
$$\sum_{\beta-t-m \le \alpha < \beta-m} EY_\beta\widetilde{Y}_\alpha \le E\widetilde{Y}_1 \sum_{i=m}^{t+m}\big[ P(S_i > 0) + P(S_{T_b} \ge b,\ T_b \le t+m) \big].$$
Since $Y_\beta = 1$ and $\beta-m \le \alpha \le \beta-1$ imply $S_\alpha > S_\beta$, which in turn implies $T_b(\omega_\alpha^+) \le \beta-\alpha$, we have
$$\sum_{\beta-m \le \alpha \le \beta-1} EY_\beta\widetilde{Y}_\alpha \le \sum_{\beta-m \le \alpha \le \beta-1} E\, \widetilde{Y}_\beta\, I\big( S_{T_b}(\omega_\alpha^+) \ge b,\ T_b(\omega_\alpha^+) \le \beta-\alpha \big) \le E\widetilde{Y}_1 \sum_{i=1}^{m} P(S_{T_b} \ge b,\ T_b \le i).$$

Therefore,
$$b_2 \le 2(n-t)E\widetilde{Y}_1\Bigg[ \sum_{i=m}^{t+m}\big( P(S_i > 0) + P(S_{T_b} \ge b,\ T_b \le t+m) \big) + \sum_{i=1}^{m} P(S_{T_b} \ge b,\ T_b \le i) \Bigg] \le 2(n-t)e^{-\theta_1 b}\big[ Ce^{-cm} + (t+m)e^{-\theta_1 b} \big] \qquad (3.37)$$
by Lemma 3.4 and (3.33). From (3.1), (3.36) and (3.37),
$$\big| P(W \ge 1) - (1 - e^{-\lambda_1}) \big| \le C(n-t)e^{-\theta_1 b}\big[ (t+m)e^{-\theta_1 b} + e^{-cm} \big]. \qquad (3.38)$$

Now we bound the difference between $\lambda_1$ and
$$\lambda_2 = (n-t)P(\tau_0 = \infty)P(S_{T_b} \ge b),$$
where $\tau_0 := \inf\{n \ge 1 : S_n \ge 0\}$. Recall
$$\lambda_1 = (n-t)EY_\alpha = (n-t)P\big( S_\alpha - S_{\alpha-\beta} < 0,\ \forall\, 1 \le \beta \le m \big)\, P\big( T_b \le t,\ S_{T_b} \ge b \big) = (n-t)P(\tau_0 > m)\, P(T_b \le t,\ S_{T_b} \ge b).$$
From the upper and lower bounds on their difference,
$$\lambda_2 - \lambda_1 \le (n-t)P(T_b > t,\ S_{T_b} \ge b), \qquad \lambda_1 - \lambda_2 \le (n-t)P(S_{T_b} \ge b)\, P(m < \tau_0 < \infty),$$
we have
$$|\lambda_1 - \lambda_2| \le C(n-t)e^{-\theta_1 b - ch^2(b)} + (n-t)e^{-\theta_1 b}\sum_{i=m}^{\infty} P(S_i \ge 0) \le C(n-t)e^{-\theta_1 b}\big[ e^{-ch^2(b)} + e^{-cm} \big] \qquad (3.39)$$
by (3.34), (3.33) and Lemma 3.4. Finally, we calculate $\lambda_2$ approximately. By (3.30),

$$\lambda_2 = (n-t)e^{-\theta_1 b}\, P(\tau_0 = \infty)\, E_{\theta_1}\big[ e^{-\theta_1(S_{T_b}-b)};\ S_{T_b} \ge b \big]. \qquad (3.40)$$
Since
$$E_{\theta_1} e^{-\theta_1(S_{\tau_b}-b)} = E_{\theta_1}\big[ e^{-\theta_1(S_{\tau_b}-b)};\ S_{T_b} \ge b \big] + E_{\theta_1}\big[ e^{-\theta_1(S_{\tau_b}-b)};\ S_{T_b} < 0 \big],$$
we have
$$E_{\theta_1}\big[ e^{-\theta_1(S_{T_b}-b)};\ S_{T_b} \ge b \big] = E_{\theta_1} e^{-\theta_1(S_{\tau_b}-b)} - E_{\theta_1}\big[ e^{-\theta_1(S_{\tau_b}-b)};\ S_{T_b} < 0 \big] = E_{\theta_1} e^{-\theta_1(S_{\tau_b}-b)} - E_{\theta_1}\Big\{ E_{\theta_1}\big[ e^{-\theta_1(S_{\tau_b}-b)} \,\big|\, S_{T_b} \big];\ S_{T_b} < 0 \Big\}. \qquad (3.41)$$

We first consider the non-arithmetic case. Let $\tau_+^{(0)} = 0$, and let $\tau_+^{(k)}$ be defined recursively by
$$\tau_+^{(k+1)} = \inf\big\{ n > \tau_+^{(k)} : S_n > S_{\tau_+^{(k)}} \big\}.$$
Define $U(x) = \sum_{k=0}^{\infty} P_{\theta_1}\big( S_{\tau_+^{(k)}} \le x \big)$. Observe that $\{S_{\tau_+^{(k+1)}} - S_{\tau_+^{(k)}},\ k = 0, 1, \dots\}$ are i.i.d. with the same distribution as $S_{\tau_+}$. By Lemma 3.6 and (2) of Stone (1965),
$$U(x) = \frac{x}{E_{\theta_1} S_{\tau_+}} + \frac{E_{\theta_1}(S_{\tau_+}^2)}{2(E_{\theta_1} S_{\tau_+})^2} + o(e^{-cx}), \quad \text{as } x \to \infty. \qquad (3.42)$$

Following the proof of Corollary 8.33 of Siegmund (1985), we have, for $x > 0$,
$$\begin{aligned}
P_{\theta_1}(S_{\tau_b} - b > x) &= \sum_{n=0}^{\infty} P_{\theta_1}\big( S_{\tau_+^{(n)}} < b,\ S_{\tau_+^{(n+1)}} > b+x \big) \\
&= \Big( \int_{(0,\, b/2]} + \int_{(b/2,\, b)} \Big)\, U(dt)\, P_{\theta_1}(S_{\tau_+} > b+x-t) \\
&= O\big( P_{\theta_1}(S_{\tau_+} > b/2)\, U(b/2) \big) + \int_{(b/2,\, b)} U(dt)\, P_{\theta_1}(S_{\tau_+} > b+x-t).
\end{aligned} \qquad (3.43)$$

For $x > 0$,
$$P_{\theta_1}(S_{\tau_+} > x) = E_{\theta_1}\Big[ \sum_{i=0}^{\infty} I\big( S_0, \dots, S_i \le 0,\ X_{i+1} > x - S_i \big) \Big] \le \sum_{i=0}^{\infty} P_{\theta_1}(S_i \le 0,\ X_{i+1} > x) = \sum_{i=0}^{\infty} P_{\theta_1}(S_i \le 0)\, P_{\theta_1}(X_{i+1} > x) \le C\, P_{\theta_1}(X_1 > x),$$
where we used (3.31). Therefore, the right tail probability of $S_{\tau_+}$ under $F_{\theta_1}$ decays exponentially. Along with (3.42), the first term on the right-hand side of (3.43) is $o(e^{-cb})$. Let $j = \lceil e^{cb} \rceil$ with small enough $c$, and let $\Delta = \frac{b}{2j}$. Then

where we used (3.31). Therefore, the right tail probability of Sτ+ under Fθ1 decays exponentially. Along with (3.42), the first term on the right-hand side of (3.43) is bounded by o(e−cb ). Let j = ⌈ecb ⌉ with small enough c, and let ∆ = 2jb . Then Z

(b/2,b)

U (dt)Pθ1 (Sτ+ > b + x − t) > A

22

where A=

j X [U (b − (k − 1)∆) − U (b − k∆)]Pθ1 (Sτ+ > x + k∆), k=1

and by (3.42) and the fact that Sτ+ under Fθ1 has bounded density (cf. Lemma 3.6), Z U (dt)Pθ1 (Sτ+ > b + x − t) − A (b/2,b)

j X [U (b − (k − 1)∆) − U (b − k∆)]Pθ1 (Sτ+ ∈ [x + (k − 1)∆, x + k∆]) 6 k=1 −cb

= o(e

).

From (3.42),
$$A = \sum_{k=1}^{j} \frac{\Delta}{E_{\theta_1} S_{\tau_+}}\, P_{\theta_1}(S_{\tau_+} > x+k\Delta) + O(je^{-cb}),$$
with the same $c$ as in (3.42). By choosing $c$ in the definition of $j$ to be small enough, we have $je^{-cb} = o(e^{-cb})$. Using the fact that $S_{\tau_+}$ under $F_{\theta_1}$ has a bounded density and an exponential tail, we have
$$\sum_{k=1}^{j} \frac{\Delta}{E_{\theta_1} S_{\tau_+}}\, P_{\theta_1}(S_{\tau_+} > x+k\Delta) = \frac{1}{E_{\theta_1} S_{\tau_+}} \int_x^{\infty} P_{\theta_1}(S_{\tau_+} > y)\, dy + o(e^{-cb}).$$
Therefore,
$$A = \sum_{k=1}^{j} \Big[ \frac{\Delta}{E_{\theta_1} S_{\tau_+}} + o(e^{-cb}) \Big]\, P_{\theta_1}(S_{\tau_+} > x+k\Delta) = \frac{1}{E_{\theta_1} S_{\tau_+}} \int_x^{\infty} P_{\theta_1}(S_{\tau_+} > y)\, dy + o(e^{-cb}).$$
By (3.43) and the above argument,
$$P_{\theta_1}(S_{\tau_b} - b > x) = \frac{1}{E_{\theta_1} S_{\tau_+}} \int_x^{\infty} P_{\theta_1}(S_{\tau_+} > y)\, dy + o(e^{-cb}). \qquad (3.44)$$

Using the integration by parts formula and the above equality,
$$\begin{aligned}
E_{\theta_1} e^{-\theta_1(S_{\tau_b}-b)} &= 1 - \theta_1 \int_0^{\infty} P_{\theta_1}(S_{\tau_b} - b > x)\, e^{-\theta_1 x}\, dx \\
&= 1 - \frac{\theta_1}{E_{\theta_1} S_{\tau_+}} \int_0^{\infty} \int_x^{\infty} P_{\theta_1}(S_{\tau_+} > y)\, e^{-\theta_1 x}\, dy\, dx + o(e^{-cb}) \\
&= 1 + \frac{1}{E_{\theta_1} S_{\tau_+}} \int_0^{\infty} \big( e^{-\theta_1 y} - 1 \big)\, P_{\theta_1}(S_{\tau_+} > y)\, dy + o(e^{-cb}) \\
&= \frac{1}{\mu_1 E_{\theta_1}\tau_+} \int_0^{\infty} e^{-\theta_1 y}\, P_{\theta_1}(S_{\tau_+} > y)\, dy + o(e^{-cb}) \\
&= \frac{1}{\theta_1\mu_1} \exp\Big( -\sum_{k=1}^{\infty} \frac{1}{k}\, E_{\theta_1} e^{-\theta_1 S_k^+} \Big) + o(e^{-cb}),
\end{aligned} \qquad (3.45)$$

where in the last equality we used (3.25). From (3.41) and (3.45), we have, with $\tau_- := \inf\{n : S_n < 0\}$,
$$E_{\theta_1}\big[ e^{-\theta_1(S_{T_b}-b)};\ S_{T_b} \ge b \big] = \frac{1}{\theta_1\mu_1} \exp\Big( -\sum_{n=1}^{\infty} \frac{1}{n}\, E_{\theta_1} e^{-\theta_1 S_n^+} \Big)\, P_{\theta_1}(S_{T_b} \ge b) + o(e^{-cb}) = \frac{1}{\theta_1\mu_1} \exp\Big( -\sum_{n=1}^{\infty} \frac{1}{n}\, E_{\theta_1} e^{-\theta_1 S_n^+} \Big)\, P_{\theta_1}(\tau_- = \infty) + o(e^{-cb})$$
as $b \to \infty$, where we used
$$0 \le P_{\theta_1}(S_{T_b} \ge b) - P_{\theta_1}(\tau_- = \infty) \le \sum_{i=1}^{\infty} P_{\theta_1}(S_i < -b) \le \sum_{i=1}^{\infty} e^{-\theta^* b}\, e^{\Psi(\theta_1 - \theta^*)i} = o(e^{-cb}),
$$

where $0 < \theta^* < \theta_1$, so that $\Psi(\theta_1 - \theta^*) < 0$. By (3.40) and
$$P(\tau_0 = \infty)\, P_{\theta_1}(\tau_- = \infty) = \exp\Big( -\sum_{k=1}^{\infty} \frac{1}{k}\big[ P(S_k \ge 0) + P_{\theta_1}(S_k < 0) \big] \Big) = \exp\Big( -\sum_{k=1}^{\infty} \frac{1}{k}\big[ E_{\theta_1}\big( e^{-\theta_1 S_k};\ S_k \ge 0 \big) + P_{\theta_1}(S_k < 0) \big] \Big) = \exp\Big( -\sum_{k=1}^{\infty} \frac{1}{k}\, E_{\theta_1} e^{-\theta_1 S_k^+} \Big),$$
which follows from Lemma 3.2 and Corollary 2.4 of Woodroofe (1982), we have
$$\lambda_2 = (n-t)e^{-\theta_1 b}\, \frac{1}{\theta_1\mu_1} \exp\Big( -2\sum_{n=1}^{\infty} \frac{1}{n}\, E_{\theta_1} e^{-\theta_1 S_n^+} \Big) + (n-t)e^{-\theta_1 b}\, o(e^{-cb}) = \Big( 1 + O\Big( \frac{b^{1/2}h(b)}{n - b/\mu_1} \Big) \Big)\lambda + (n-t)e^{-\theta_1 b}\, o(e^{-cb}). \qquad (3.46)$$

For the arithmetic case, assume $X_1$ is integer-valued with span 1 and that $b$ is an integer. By a similar and simpler argument than for (3.44), we have, for integers $k \ge 0$,
$$\begin{aligned}
P_{\theta_1}(S_{\tau_b} - b = k) &= \sum_{n=0}^{\infty} P_{\theta_1}\big( S_{\tau_+^{(n)}} < b,\ S_{\tau_+^{(n+1)}} = b+k \big) \\
&= \sum_{m=1}^{b-1} \Big[ \sum_{n=0}^{\infty} P_{\theta_1}\big( S_{\tau_+^{(n)}} = m \big) \Big]\, P_{\theta_1}(S_{\tau_+} = b+k-m) \\
&= O\Bigg( \sum_{m=1}^{\lfloor b/2 \rfloor} \sum_{n=0}^{\infty} P_{\theta_1}\big( S_{\tau_+^{(n)}} = m \big)\, P_{\theta_1}\big( S_{\tau_+} > \lfloor b/2 \rfloor \big) \Bigg) + \sum_{m=\lfloor b/2 \rfloor+1}^{b-1} \Big( \frac{1}{E_{\theta_1}(S_{\tau_+})} + o(e^{-cb}) \Big)\, P_{\theta_1}(S_{\tau_+} = b+k-m) \\
&= \frac{1}{E_{\theta_1} S_{\tau_+}}\, P_{\theta_1}(S_{\tau_+} > k) + o(e^{-cb}).
\end{aligned}$$

By the above equality and (3.28),
$$E_{\theta_1}\big( e^{-\theta_1(S_{\tau_b}-b)} \big) = \sum_{k=0}^{\infty} e^{-\theta_1 k}\, \frac{1}{E_{\theta_1} S_{\tau_+}}\, P_{\theta_1}(S_{\tau_+} > k) + o(e^{-cb}) = \frac{1}{\mu_1 E_{\theta_1}\tau_+ (1 - e^{-\theta_1})}\big[ 1 - E_{\theta_1} e^{-\theta_1 S_{\tau_+}} \big] + o(e^{-cb}) = \frac{1}{(1 - e^{-\theta_1})\mu_1} \exp\Big( -\sum_{n=1}^{\infty} \frac{1}{n}\, E_{\theta_1} e^{-\theta_1 S_n^+} \Big) + o(e^{-cb}).$$
Similar calculations as in the non-arithmetic case yield
$$\lambda_2 = (n-t)e^{-\theta_1 b}\Bigg\{ \frac{1}{(1 - e^{-\theta_1})\mu_1} \exp\Big( -2\sum_{n=1}^{\infty} \frac{1}{n}\, E_{\theta_1} e^{-\theta_1 S_n^+} \Big) + o(e^{-cb}) \Bigg\}. \qquad (3.47)$$

Theorem 2.3 is proved by combining (3.35), (3.38), (3.39), (3.46) and (3.47), and letting $m = \lfloor ch^2(b) \rfloor$ so that $m < t$.


4 DISCUSSION

The arguments we used to prove Theorem 2.1 and Theorem 2.3 may be useful in proving rates of convergence for tail probabilities of other test statistics for detecting local signals in sequences of independent random variables. Two for which some new techniques will be needed are the Levin and Kline statistic (Levin and Kline (1985)) and the generalized likelihood ratio statistic. For example, let $\{X_1, \dots, X_n\}$ be independent random variables from the exponential family (2.1). Consider the testing problem at the beginning of the introduction. If the mean of $X_1$ is known and, without loss of generality, equal to 0, the generalized likelihood ratio statistic is $\max_{1 \le i < j \le n} \cdots$
