
ASYMPTOTIC NORMALITY OF STATISTICS BASED ON THE CONVEX MINORANTS OF EMPIRICAL DISTRIBUTION FUNCTIONS

BY PIET GROENEBOOM AND RONALD PYKE

TECHNICAL REPORT NO. 5 JULY 1981

THIS RESEARCH IS BASED UPON WORK PARTIALLY SUPPORTED BY THE NATIONAL SCIENCE FOUNDATION UNDER GRANT MCS-78-09858

DEPARTMENT OF STATISTICS, UNIVERSITY OF WASHINGTON, SEATTLE, WASHINGTON 98195

ASYMPTOTIC NORMALITY OF STATISTICS BASED ON THE CONVEX MINORANTS OF EMPIRICAL DISTRIBUTION FUNCTIONS

by Piet Groeneboom¹ and Ronald Pyke²
University of Washington

Abstract

Let $F_n$ be the Uniform empirical distribution function. Write $\hat F_n$ for the (least) concave majorant of $F_n$, and let $\hat f_n$ denote the corresponding density. It is shown that $n\int_0^1 (\hat f_n(t)-1)^2\,dt$ is asymptotically standard normal when centered at $\log n$ and normalized by $(3\log n)^{1/2}$. A similar result is obtained in the 2-sample case in which $\hat f_n$ is replaced by the slope of the convex minorant of $\bar F_m = F_m \circ H_N^{-1}$.

FOOTNOTES
1. This work was done while this author was on leave from the Mathematical Centre, Amsterdam, as a Visiting Professor in the Departments of Statistics and Mathematics at the University of Washington.
2. The research of this author was supported in part by the National Science Foundation, Grant MCS-78-09858.

AMS 1970 Subject Classification. Primary: 62E20 Secondary: 62G99, 60J65 Key words and phrases: empirical distribution function, concave majorant, convex minorant, limit theorems, spacings, Brownian bridge, two-sample rank statistics.

1. Introduction

In 1956, Grenander introduced ideas of 'non-parametric' maximum likelihood estimation. In one of the examples, he found the maximum likelihood estimate (M.L.E.) within the class of all distribution functions that are concave over $[0,\infty)$, or equivalently, the class of all monotone decreasing densities supported on $[0,\infty)$. The M.L.E. in this example is the concave majorant of the ordinary empirical distribution function. For a formal definition of maximum likelihood which covers these 'non-parametric' cases, see Scholz (1980).

Statistics based on either concave majorants or convex minorants of empirical distribution functions have arisen independently in at least two other very different contexts.

In 1975, Behnen proposed a 2-sample rank statistic defined as "the supremum of all standardized and centered simple linear rank statistics having non-decreasing scores". This statistic was shown to have comparable performance to an adaptive statistic proposed by Randles and Hogg (1973) when used against shift alternatives, and to have a much superior performance against stochastically ordered alternatives. Although not mentioned in Behnen (1974, 1975), it was known independently to both Behnen and Scholz (personal correspondence, June and July, 1975) that this statistic is expressible in terms of the $L_2$-norm of the density function of the concave majorant of the usual 2-sample empirical distribution function. The asymptotic distribution of the statistic was left as an open question, although Behnen (1974) provided extensive simulations for selected sample sizes up to $m = n = 100$ which suggested to us the asymptotic normality of the statistic.

In a completely different context, Scholz (1981) proposed a procedure for the combination of p-values from independent tests of significance. The procedure, utilizing Roy's union-intersection principle, results in a statistic that is expressible similarly in terms of the $L_2$-norm of the density function of the concave majorant of the 1-sample Uniform empirical distribution function. Exact distributions are obtained by Scholz for the very small sample sizes which are important for this context. Simulations were also carried out by Scholz (personal communication) for moderate sample sizes to evaluate the feasibility of approximations using asymptotic theory.

The asymptotic distribution of the 1-sample statistic of Scholz is derived in Section 3 and that of the 2-sample statistic of Behnen is obtained in Section 5. The method used in Section 3 utilizes a conditional representation of the concave majorant of the Uniform empirical distribution function in terms of a sequence of Poisson and gamma random variables. This representation is detailed in Section 2. This method is an extension of that used in Pyke (1965) and due originally to Le Cam (1958). The 2-sample case is proved in Section 5 using a strong invariance principle together with the asymptotic normality of the $L_2$-norms of the slope processes of the convex minorants of a sequence of truncated Brownian bridges. The latter is derived in Section 4. As is pointed out in the remarks in Section 6, it is possible to prove the 2-sample result by an analogue of the method presented in Sections 2 and 3. On the other hand, it is possible to prove the 1-sample result by a method analogous to that of Sections 4 and 5. This is done in Groeneboom (1981, Theorem 3.2), where a detailed study of the concave majorant of Brownian motion is presented.
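As a concrete illustration of the 1-sample quantity discussed above, the following Python sketch computes the concave majorant of an empirical distribution function, the integral of its squared density, and the standardized statistic with the centering $\log n$ and scaling $(3\log n)^{1/2}$ stated in the abstract. It is not taken from the report: the function names are hypothetical, the hull is built with a standard monotone-chain scan, and the sample is assumed to consist of distinct points in $(0,1)$.

```python
import math

def concave_majorant(sample):
    """Vertices of the least concave majorant of the empirical df of `sample`.

    The empirical df of n distinct points in (0,1) has upper-left corners
    (x_(i), i/n); together with (0,0) and (1,1), their upper convex hull
    is the concave majorant.
    """
    xs = sorted(sample)
    n = len(xs)
    pts = [(0.0, 0.0)] + [(x, (i + 1) / n) for i, x in enumerate(xs)] + [(1.0, 1.0)]
    hull = []
    for p in pts:
        # Pop the last vertex while it lies on or below the chord to p.
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            cross = (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def squared_density_norm(sample):
    """Integral over [0,1] of the squared slope of the concave majorant."""
    hull = concave_majorant(sample)
    return sum((y1 - y0) ** 2 / (x1 - x0)
               for (x0, y0), (x1, y1) in zip(hull, hull[1:]))

def standardized_statistic(sample):
    """(n * integral of (slope - 1)^2 - log n) / sqrt(3 log n)."""
    n = len(sample)
    l_n = squared_density_norm(sample)
    # The slope integrates to 1, so the integral of (slope - 1)^2 is l_n - 1.
    return (n * (l_n - 1.0) - math.log(n)) / math.sqrt(3.0 * math.log(n))
```

For the toy sample `[0.6, 0.1, 0.5]` the majorant has vertices at 0, 0.1, 0.6 and 1, and the squared-density integral equals 2. The final flat segment from the largest observation to 1 always has slope zero.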

2. The Representation Theorem

In this section, we describe the specific construction that is used to provide a tractable representation for the concave majorant of the uniform empirical process. To do this, the following notation is needed.

Let $F_n$ denote the Uniform empirical distribution function, and write $\hat F_n$ for its concave majorant, the function on $[0,1]$ formed as it were by stretching a rubber band over the top of $F_n$. Let $\eta_n$ be the number of vertices of $\hat F_n$, including the end-points $(0,0)$ and $(1,1)$. Let

, 'nO =. 0

< ~n, 1 < •••
$i \ge 1,\ j \ge 0\}$ be independent r.v.'s, $S_{ji}$ being a $\Gamma(j,1)$ r.v. for $j \ge 1$ and $S_{0i}$ being a $\Gamma(1,1)$ r.v. That is, each $S_{ji}$ is equal in law to a sum of $j$ or $1$ independent Exp(1) r.v.'s.

Assume $\{N_j\}$ and $\{S_{ji}\}$ are independent.

Rewrite the spacings $\{D_{ni}\}$ of the concave majorant in a specific order as follows. First, write $D_{n0}$ for the only zero-step spacing; then write all of the one-step spacings in the order of increasing magnitude (which is the same as the order of appearance in the concave majorant); then write all of the two-step spacings, and so on. In this order, denote them by

$D^{(n)} = (D_{n0};\ D_{n11},\ldots,D_{n1Q_{n1}};\ D_{n21},\ldots,D_{n2Q_{n2}};\ \ldots;\ D_{nn1},\ldots,D_{nnQ_{nn}})$.

Analogously, write $S_{j1} \le S_{j2} \le \cdots \le S_{jN_j}$ for the ordered values of $S_{j1},\ldots,S_{jN_j}$ and set

$S^{(n)} = (S_{01};\ S_{11},\ldots,S_{1N_1};\ S_{21},\ldots,S_{2N_2};\ \ldots;\ S_{n1},\ldots,S_{nN_n})$.

Then the following representation holds, where $N^{(n)} = (N_1,\ldots,N_n)$.

Note: We delete the parameter $n$ from the notation whenever it is unlikely to cause confusion.

Theorem 2.1.

For n>l

where (2.7)

Proof. Let $\{Y(t): t \ge 0\}$ denote a Poisson process with $EY(t) = t$. It is well known that, conditional on the $(n+1)$-th jump occurring at $t = n$, $n^{-1}Y(n\,\cdot)$ on $[0,1)$ is equal in law to $F_n$. The relationship given in (2.6) is closely related.

First of all, by (2.4), the marginal distribution

of 'YQ is the same as the conditional distribution of tVN(n) Now

given Tn = n.

is obtainable by standard techniques starting from the Uniform distribution over [O,lJ n. The main thing to observe is that as is implied f~,~

in the proof of (2.4), the distribution of

~

is the same for almost all

values of the original ordered Uniform spacings.

The latter when multiplied

by n, can be represented as the ordered values of n+1

independent

Yl, ... , Yn+1 given Yl +... + Yn+1 = n. Thus the same techniques that will yield f~,n~(~'~) from the joint density of ordered

Exp(l) r.v.

IS

Uniform spacings will yield f (n) -tn) (~,~,n,n) from ordered N ,S ITn,Sn exponentials. 0
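The proof above represents $n$ times the Uniform spacings as $n+1$ independent Exp(1) variables conditioned on their sum being $n$. A minimal Python sketch of that classical representation follows. It relies on the standard fact that dividing independent Exp(1) draws by their total realizes this conditioning (up to the factor $n$); the function names are hypothetical and not from the report.

```python
import random

def uniform_spacings_from_exponentials(n, rng):
    """n+1 spacings of n Uniform(0,1) order statistics, built from Exp(1) draws."""
    ys = [rng.expovariate(1.0) for _ in range(n + 1)]
    total = sum(ys)
    # Normalizing by the total is equivalent in law to conditioning on the sum.
    return [y / total for y in ys]

def order_statistics(spacings):
    """Partial sums of the first n spacings: the implied Uniform order statistics."""
    out, acc = [], 0.0
    for d in spacings[:-1]:
        acc += d
        out.append(acc)
    return out
```

Calling `order_statistics(uniform_spacings_from_exponentials(5, random.Random(0)))` yields five increasing points in $(0,1)$ that are distributed as the order statistics of a Uniform sample of size 5.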

3. The One-Sample Limit Theorem

The statistic $L_n$ is shown to be asymptotically normal by studying a related statistic suggested by Theorem 2.1. First of all, notice that

(3.1)  $L_n = n^{-1}\sum_{j=1}^{n} j^2 \sum_{i=1}^{Q_j} (nD_{nji})^{-1}$,

which suggests that one might study the conditional limiting distribution of

$L_n^* = n^{-1}\sum_{j=1}^{n} j^2 \sum_{i=1}^{N_j} (1/S_{ji})$

under the

e:] ... O. (Cf. Loeve (1963), p. 316). For this it suffices to compute fourth moments and use Markov's inequality. To this end, we compute

where $A_j$ and $B_j$ are the fourth and second central moments of $(S_{j1}-j)^2/S_{j1}$, respectively. It is easy to show that both $A_j$ and $B_j$ are uniformly bounded in $j > 4$, since

and by the $c_r$-inequality (cf. Loève (1963), p. 155)

$E(S_{j-4,1}-j)^8 \le 2^7\{E[\sum_{i=1}^{j-4}(Y_i-1)]^8 + 4^8\}$,

where $Y_1, Y_2, \ldots$ are independent Exponential r.v.'s with mean 1. Straightforward computations show that $E(X_1+\cdots+X_m)^8 \le Cm^4$ for any independent r.v.'s with means zero. Therefore,

$\sum_j P[|X_{nj}| > \varepsilon] \le \varepsilon^{-4}\sum_j E|X_{nj}|^4$

Similarly,


U, a N(O,l) r.v. To determine the limiting joint distribution of (V n, Wn), we compute the characteristic function .{. rr · · N . it(j/ri)N.} isV+itw E(e n n) = E n ([~ .(sn-~)] J e J) .j=l Sj1- J (3.9)

Now, as $j \to \infty$ while $j/n \to u \in (0,1)$, we have by the Central Limit Theorem for $\{j^{-1/2}(S_{j1}-j)\}$ that

2 2

.(sn-~) ~ e- s u /2.

Sj1- J

Recognizing a Riemann sum in the exponent of (3.9), the limit becomes

as desired.

It is straightforwardly checked by direct integration that the exponent in (3.10) may be written as

(3.11)

where $\varphi$ is the standard Normal density, so that the Lévy measure of the 2-dimensional infinitely divisible r.v. $(V,W)$ is absolutely continuous with respect to Lebesgue measure on $((-\infty,0)\cup(0,\infty))\times(0,1)$ with 'density' $\varphi(v/w)w^{-2}$. There is therefore no Normal part to the distribution of $(V,W)$. This fact completes the proof since it is easily checked that if $\{Y_{ni}\}$ and $\{Z_{ni}\}$ are two triangular arrays which are jointly in the domain of attraction of an infinitely divisible distribution, and if the marginal limiting law of one is Normal and the other has no Normal component, then they are asymptotically independent. □

whereas the result being sought is the limiting conditional distribution of $U_n$ given $V_n = 0$ and $W_n = 1$. To obtain the conditional result from the joint, we follow an idea used originally by Le Cam (1958) to obtain limit laws for sums of a function of Uniform spacings. The method was used by Pyke (1965) to obtain limit laws of more general functions as well as the weak convergence of related processes. It can be shown that

$\varphi_{nm}(t) := E[e^{itU_m} \mid V_n = 0,\ W_n = 1] = E\{E[e^{itU_m} \mid V_m, W_m] \mid V_n = 0,\ W_n = 1\}$

(3.12)  $\qquad = E[e^{itU_m}\, p_{nm}(W_m)\, r_{nm}(V_m, W_m)]$,

where

$p_{nm}(k/m) = P[W_m = k/m \mid W_n = 1]/P[W_m = k/m]$

and

$r_{nm}(v, k/m) = f_{V_m \mid W_m, V_n}(v \mid k/m, 0)/f_{V_m \mid W_m}(v \mid k/m)$

for $k = 0, 1, \ldots$ and real $v$. To verify (3.12), write

$\varphi_{nm}(t) = \sum_{k=0}^{\infty} P[W_m = k/m \mid W_n = 1] \int_{-\infty}^{\infty} E[e^{itU_m} \mid W_m = k/m,\ V_m = v]\, f_{V_m \mid W_m, V_n}(v \mid k/m, 0)\,dv$

$\qquad = \sum_{k=0}^{\infty} P[W_m = k/m]\, p_{nm}(k/m) \int_{-\infty}^{\infty} E[e^{itU_m} \mid W_m = k/m,\ V_m = v]\, r_{nm}(v, k/m)\, f_{V_m \mid W_m}(v \mid k/m)\,dv$,

which is then the unconditional expectation given in (3.12). Since the $N_j$'s are independent,

(3.13)  $p_{nm}(k/m) = P[T_n - T_m = n-k]/P[T_n = n]$,

where $T_n = \sum_{j=1}^{n} jN_j = nW_n$. Moreover, under the condition that $W_m = k/m$ (equivalently, $T_m = k$) and $V_n = 0$ (equivalently, $S_n := \sum_{j=1}^{n}\sum_{i=1}^{N_j} S_{ji} = n$), it is well known that $n^{-1}S_m$ is a Beta$(k, n-k)$ r.v., while under the single condition $V_n = 0$, then $S_m$ is a Gamma$(k,1)$ r.v.
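The ratio in (3.13) can be evaluated exactly for moderate $n$. The sketch below is an illustration, not part of the report: it assumes that the $N_j$ are independent Poisson variables with mean $1/j$ (which is what the characteristic function $g_n$ used later in this section implies), computes the law of $\sum_j jN_j$ by direct convolution, and forms the ratio. The function names are hypothetical.

```python
import math

def pmf_weighted_poisson_sum(js, top):
    """Exact pmf on {0,...,top} of sum_{j in js} j*N_j, N_j ~ Poisson(1/j) independent.

    Mass above `top` is discarded, which does not affect the values at 0..top.
    """
    pmf = [1.0] + [0.0] * top
    for j in js:
        new = [0.0] * (top + 1)
        c = 0
        w = math.exp(-1.0 / j)          # P[N_j = 0]
        while c * j <= top:
            shift = c * j
            for r in range(top + 1 - shift):
                new[r + shift] += pmf[r] * w
            c += 1
            w *= (1.0 / j) / c          # P[N_j = c] from P[N_j = c-1]
        pmf = new
    return pmf

def p_nm(n, m, k):
    """P[T_n - T_m = n - k] / P[T_n = n], with T_n = sum_{j<=n} j*N_j."""
    numerator = pmf_weighted_poisson_sum(range(m + 1, n + 1), n)[n - k]
    denominator = pmf_weighted_poisson_sum(range(1, n + 1), n)[n]
    return numerator / denominator
```

Under the stated Poisson assumption, `pmf_weighted_poisson_sum(range(1, n + 1), n)` is constant on $\{0,\ldots,n\}$, which matches the identity $P[T_n = n] = P[T_n = n-k]$ used in this section, and `p_nm(200, 5, 3)` is close to 1, in line with the behaviour claimed for $m = o(n)$.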

Therefore, for k = 1,2, ... , nand 0




n, the latter would be

The rest of the Lemma is clear.

For $m = o(n)$, $p_{nm}(k/m) \to 1$, as $n \to \infty$, uniformly for $k < cn$ for some $c < 1$.

Proof. By (3.13) and Lemma 3.2, it only remains to show that the numerator of $p_{nm}$ satisfies


(3.17)

By the Fourier inversion formula for the characteristic function of $T_n - T_m = \sum_{j=m+1}^{n} jN_j$,

(3.18)  $P[T_n - T_m = n-k] = (2\pi)^{-1}\int_0^{2\pi} f_{nm}(t)\,e^{-(n-k)it}\,dt$,

where

$f_{nm}(t) = \exp\{-\sum_{j=m+1}^{n} j^{-1}(1-e^{itj})\}$

is the characteristic function of $T_n - T_m$. Integration by parts shows that the right hand side of (3.18) is equal to

$(2\pi(n-k))^{-1}\int_0^{2\pi}\{\sum_{j=1}^{n} e^{itj} - \sum_{j=1}^{m} e^{itj}\}\, f_{nm}(t)\,e^{-(n-k)it}\,dt$.

Since $p_n := P[T_n = n] = P[T_n = n-k]$ for all $k \le n$, then similarly one obtains

$p_n = (2\pi(n-k))^{-1}\int_0^{2\pi}\sum_{j=1}^{n} e^{itj}\, g_n(t)\,e^{-(n-k)it}\,dt$,

where $g_n(t) = \exp\{-\sum_{j=1}^{n} j^{-1}(1-e^{itj})\}$ is the characteristic function of $T_n$. Hence

(3.19)

_ (2~}-1 !2~ E~ ei t j f (t) -(n-k)it dt o J=l nm e

= (1-k/nr 1{p(Tn-Tm- cn] < nE(e m)e- cnt m tj = ne -cnt exp{m Lj=l J.-l(e _ l)}

if one chooses $t = 1/m$. Thus we have proved

Lemma 3.5. If $m = o(n)$, then

(3.20)

$\to 0$.

We can now state and prove the main result.


Theorem 3.1. As $n \to \infty$,

(3.21)  $(3\log n)^{-1/2}(nL_n - n - \log n) \xrightarrow{\mathcal{L}} U$,

a N(O.l) random variable. Proof. In view of (3.5) and the discussion leading up to it, in law to the conditional law of Un given Vn o , the first term converges to 0 because (bn/bm)2 = (log n)/log m... O. For the middle term, we use (3.6) to compute its variance to be Clearly the last term converges to

o.

2 bn2 Enk=m-1 3/k + 0(1) = (1og n)-l 10g(n/m) + 0(1)

,: 1 - (log m) 11 og n = 0(1) . This shows that Un-Um ~> 0 as desired.

The proof is complete. □

4. $L_2$-norm of slopes of convex minorants of truncated Brownian bridges.

We shall prove the following result.

Theorem 4.1. Let $B = \{B(t): t \in [0,1]\}$ be (standard) Brownian bridge on $[0,1]$, let $B_{t,u} = B \cdot 1_{[t,u]}$, where $1_{[t,u]}$ is the indicator of the interval $[t,u]$, and let $g_{t,u}$ be a version of the slope of the convex minorant of $B_{t,u}$ on $(0,1)$. Then

(4.1)  $\{\int_0^1 g^2_{1/n,\,1-1/n}(u)\,du - \log n\}/\sqrt{3\log n} \xrightarrow{\mathcal{L}} Z$,

where $Z$ is a standard normal random variable.

Theorem 4.1 will be used in Section 5 to derive the asymptotic normality of the statistic of Behnen. It is derived from Theorem 3.1 by using the strong approximation of the empirical process by versions of Brownian bridges in Komlós et al. (1975). The following class of functions will play a fundamental role in the sequel.

Definition 4.1. $\mathcal{M}$ is the set of right-continuous and nondecreasing step-functions $J: [0,1] \to \mathbb{R}$, which have only finitely many jumps and satisfy $\int_0^1 J(u)\,du = 0$ and $\int_0^1 J^2(u)\,du = 1$.

Notice that all functions $J \in \mathcal{M}$ satisfy the inequalities

(4.2)  $-u^{-1/2} \le J(u) \le (1-u)^{-1/2}$,  $u \in (0,1)$.

The class $\mathcal{M}$ is also considered in Behnen (1975) and Scholz (1981) (with slight modifications). It can be used to give a convenient representation of the $L_2$-norm of the slope of the convex minorant of bounded real-valued functions on $[0,1]$, which satisfy certain regularity conditions near the boundary of the interval $[0,1]$. The representation of the $L_2$-norm of the slope of the convex minorant by means of functions in $\mathcal{M}$ has been studied by F. Scholz, and the following lemma is a generalization of results in Scholz (1981).
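To make Definition 4.1 and the bounds (4.2) concrete, the following Python sketch centers and scales an arbitrary nondecreasing step function so that it has integral zero and squared integral one, and then checks the two-sided bound piece by piece. The representation of a step function by breakpoints and values, and the function names, are illustrative assumptions rather than anything prescribed in the text.

```python
import math

def normalize_step_function(breaks, values):
    """Center and scale a nondecreasing, non-constant step function.

    breaks = [0 = t_0 < t_1 < ... < t_k = 1]; values[i] is the value on
    [t_i, t_{i+1}). The result has integral 0 and squared integral 1.
    """
    widths = [b - a for a, b in zip(breaks, breaks[1:])]
    mean = sum(v * w for v, w in zip(values, widths))
    var = sum((v - mean) ** 2 * w for v, w in zip(values, widths))
    return [(v - mean) / math.sqrt(var) for v in values]

def satisfies_bounds(breaks, values, tol=1e-12):
    """Check -u^(-1/2) <= J(u) <= (1-u)^(-1/2) on every constant piece."""
    for i, v in enumerate(values):
        left, right = breaks[i], breaks[i + 1]
        # The upper bound is tightest at the left end of the piece,
        # the lower bound at the right end.
        if v > (1.0 - left) ** -0.5 + tol:
            return False
        if v < -(right ** -0.5) - tol:
            return False
    return True
```

For example, with `breaks = [0.0, 0.2, 0.7, 1.0]` and raw values `[-3.0, 0.5, 10.0]`, the normalized values pass `satisfies_bounds`, as the inequalities (4.2) require of any such normalized function.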

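Lemma 4.1. Let $G: [0,1] \to \mathbb{R}$ be a bounded function such that $G(0) = G(1) = 0$ and

(4.3)  $\lim_{t \downarrow 0} t^{-1/2-\delta}G(t) = \lim_{t \downarrow 0} t^{-1/2-\delta}G(1-t) = 0$,

for some $\delta > 0$. Then

(4.4)  $\|\tilde g\|_2 = -\inf_{J \in \mathcal{M}} \int_{(0,1)} G(u)\,dJ(u)$,

where $\|\tilde g\|_2 < \infty$ and $\tilde g$ is a version of the slope of the convex minorant $\tilde G$ of $G$.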
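A small numerical illustration of (4.4) for a step function may help. The Python sketch below takes $G = -1$ on $[0.25, 0.75)$ and $0$ elsewhere, builds its convex minorant as a lower convex hull, and compares the $L_2$-norm of the slope with $-\int G\,dJ$ for the normalized slope $J$. The helper names are hypothetical, and the shifted step function used at the end is only approximately in the class of Definition 4.1 (the tiny renormalization is ignored).

```python
import math

def lower_hull(points):
    """Lower convex hull of points, scanned in increasing x."""
    hull = []
    for p in sorted(points):
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            cross = (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)
            if cross <= 0:      # last vertex lies on or above the chord
                hull.pop()
            else:
                break
        hull.append(p)
    return hull

def G(t):
    """Right-continuous step function: -1 on [0.25, 0.75), 0 elsewhere."""
    return -1.0 if 0.25 <= t < 0.75 else 0.0

# Lower corner points of the graph of G, plus the end-points (0,0) and (1,0).
hull = lower_hull([(0.0, 0.0), (0.25, -1.0), (0.75, -1.0), (1.0, 0.0)])
pieces = [(x0, x1, (y1 - y0) / (x1 - x0))
          for (x0, y0), (x1, y1) in zip(hull, hull[1:])]
norm = math.sqrt(sum(s * s * (x1 - x0) for x0, x1, s in pieces))

# J = slope / norm is a step function; record its jump points and heights.
jumps = [(pieces[i][1], (pieces[i + 1][2] - pieces[i][2]) / norm)
         for i in range(len(pieces) - 1)]

def minus_integral_G_dJ(jump_list):
    """-(integral of G dJ) for a pure-jump J given as (point, height) pairs."""
    return -sum(G(t) * h for t, h in jump_list)

at_vertices = minus_integral_G_dJ(jumps)
# Move a jump slightly to the left wherever G jumps upward at that point.
eps = 1e-9
shifted = minus_integral_G_dJ(
    [(t - eps if G(t) > G(t - eps) else t, h) for t, h in jumps])
```

Here `norm` equals $2\sqrt{2}$. With the jumps of $J$ placed exactly at the hull vertices, `at_vertices` is only $\sqrt{2}$, because $G$ is right-continuous and has already returned to $0$ at $0.75$. After shifting that jump slightly to the left, `shifted` agrees with `norm`. This is why the supremum in (4.4) is in general approached rather than attained.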

Proof. First suppose that $G$ is a step-function which only has jumps at the points $t_1 < \cdots < t_n$, where $t_1 > 0$ and $t_n < 1$. It follows from (4.3) that in this case $G(t) = 0$, if $t < t_1$ or $t > t_n$. Since $\tilde G \le G$, we have

(4.5)

Integration by parts and the Cauchy-Schwarz inequality give

(4.6)  $-\int_{(0,1)} \tilde G\,dJ = \int_0^1 \tilde g(u)J(u)\,du \le \|\tilde g\|_2\,\|J\|_2 = \|\tilde g\|_2$.

Since $G(0) = G(1) = 0$, we also have $\tilde G(0) = \tilde G(1) = 0$ and hence $\int_0^1 \tilde g(u)\,du = 0$. Suppose $\|\tilde g\|_2 > 0$. Without loss of generality we may take a right-continuous version $\tilde g$ of the slope of $\tilde G$ and in this case the function $\bar J = \tilde g/\|\tilde g\|_2$ belongs to $\mathcal{M}$. Hence the upper bound in (4.6) is attained for $J = \bar J$. Combining (4.5) and (4.6) we get

(4.7)  $-\inf_{J \in \mathcal{M}} \int_{(0,1)} G\,dJ \le -\int_{(0,1)} \tilde G\,d\bar J = \|\tilde g\|_2$.

Let $D$ be the set of discontinuity points of $\bar J$; then $D$ is not empty, since otherwise $\tilde G \equiv 0$ and hence $\|\tilde g\|_2 = 0$. The set $D$ is a subset of the set $\{t_1,\ldots,t_n\}$ of discontinuity points of $G$. Let $H: [0,1] \to \mathbb{R}$ be the function defined by

$H(t) = G(t-) \wedge G(t+)$, if $t \in D$ and $G(t) > G(t-) \wedge G(t+)$;  $H(t) = G(t)$, otherwise.

Then $H(t) = \tilde G(t)$, if $t \in D$ and hence, since $\bar J$ is a step-function which only has jumps in $D$,

(4.8)  $\int_{(0,1)} H\,d\bar J = \int_{(0,1)} \tilde G\,d\bar J$.

It is clear that the integral $\int_{(0,1)} H\,d\bar J$ can be approximated arbitrarily close by integrals $\int_{(0,1)} G\,dJ'$, with $J' \in \mathcal{M}$ (move the points $t \in D$, where $G(t) > G(t-) \wedge G(t+)$, a bit to the right or left and consider functions $J' \in \mathcal{M}$ which have jumps of approximately the same height as $\bar J$ at the shifted points instead of the original points). Relation (4.4) now follows from (4.7) and (4.8). If $\|\tilde g\|_2 = 0$, then $\tilde G \equiv 0$, and hence $G \ge 0$. In this case (4.4) also holds, since $\int_{(0,1)} G\,dJ = 0$ for any function $J \in \mathcal{M}$ such that $J$ is constant on the intervals $[0,t)$ and $[t,1)$, with $t \in (0,t_1)$.

Now consider an arbitrary bounded function $G: [0,1] \to \mathbb{R}$ such that $G(t) = 0$, if $t \le a$ or $t \ge 1-a$, where $a \in (0,1/2)$. Define for each $n$ the intervals $I_{k,n}$ by $I_{k,n} = [k2^{-n},(k+1)2^{-n})$, $k = 0,1,\ldots,2^n-2$, and $I_{k,n} = [k2^{-n},1]$, if $k = 2^n-1$, and let

$G_n(t) = \inf_{u \in I_{k,n}} G(u)$,  if $t \in I_{k,n}$,  $k = 0,1,\ldots,2^n-1$.

Fix $\varepsilon > 0$. Let $\mathcal{P}$ be the set of finitely discrete probability measures on $[0,1]$. Then, if $\tilde H$ is the convex minorant of a function $H: [0,1] \to \mathbb{R}$, we have for each $t \in [0,1]$,

$\tilde H(t) = \inf\{\int_{[0,1]} H(u)\,dP : \int_{[0,1]} u\,dP = t,\ P \in \mathcal{P}\}$

(see e.g. Rockafellar (1970), p. 36). Thus there exist positive constants $c_{1,n},\ldots,c_{m(n),n}$ and points $t_{1,n},\ldots,t_{m(n),n}$ belonging to $m(n)$ disjoint intervals $I_{k,n}$ such that, for each $n$ and fixed $t \in [0,1]$,

$\sum_{i=1}^{m(n)} c_{i,n} = 1$,  $\sum_{i=1}^{m(n)} c_{i,n}t_{i,n} = t$  and  $\tilde G_n(t) > \sum_{i=1}^{m(n)} c_{i,n}G_n(t_{i,n}) - \varepsilon$.

This implies that there are points $t'_{i,n}$, with $|t'_{i,n} - t_{i,n}| \le 2^{-n}$, such that

(4.9)  $|t - \sum_{i=1}^{m(n)} c_{i,n}t'_{i,n}| < 2^{-n}$  and  $\tilde G_n(t) > \sum_{i=1}^{m(n)} c_{i,n}G(t'_{i,n}) - 2\varepsilon$

(let $t_{i,n}$ and $t'_{i,n}$ belong to the same interval $I_{k,n}$, and use the definition of $G_n$). The sequence $\{\tilde G_n\}$ is increasing and hence $\lim_{n \to \infty} \tilde G_n(t)$ exists (and is $\le 0$). The convex minorant $\tilde G$ of $G$ is continuous on $[0,1]$, since $G$ is bounded on $[a,1-a]$ and zero outside this interval. Hence by (4.9), $\tilde G(t) \le \lim_{n \to \infty} \tilde G_n(t) + 2\varepsilon$. We also have $\tilde G(t) \ge \tilde G_n(t)$, for all $n$, and thus $\lim_{n \to \infty} \tilde G_n(t) = \tilde G(t)$. Since the sequence $\{\tilde G_n\}$ converges pointwise to $\tilde G$, the right-continuous slopes $\tilde g_n$ of $\tilde G_n$ converge to the right-continuous slope $\tilde g$ of $\tilde G$, except possibly at countably many points of $[0,1]$ (see e.g. Roberts & Varberg (1973),

Problem C(9), p. 20). The functions $\tilde G_n$ and $\tilde G$ are uniformly bounded below on $(a,1-a)$ and zero outside this interval. This implies that the slopes $\tilde g_n$ and $\tilde g$ are uniformly bounded on $(0,1)$. Hence, by dominated convergence, $\lim_{n \to \infty} \|\tilde g_n - \tilde g\|_2 = 0$. Choose $n_0$ such that $\|\tilde g_n\|_2 > \|\tilde g\|_2 - \varepsilon$, for $n \ge n_0$. By the first part of the proof there exists for each $n$ a step-function $J_n \in \mathcal{M}$, such that $-\int_{(0,1)} G_n\,dJ_n > \|\tilde g_n\|_2 - \varepsilon$, where the points of jump of $J_n$, say $u_{1,n},\ldots,u_{p(n),n}$, belong to disjoint intervals $I_{k,n}$ and are contained in $[a,1-a]$. By the definition of $G_n$ there exist points $u'_{1,n},\ldots,u'_{p(n),n}$ such that $G(u'_{i,n}) < G_n(u_{i,n}) + \varepsilon$ and $u_{i,n}$ and $u'_{i,n}$ belong to the same interval $I_{k,n}$. Furthermore, let $J'_n$ be the right-continuous step-function which has the same jumps as $J_n$, but at the points $u'_{i,n}$ instead of $u_{i,n}$ (note that in general $J'_n \notin \mathcal{M}$). Then, by (4.2), we have for $n \ge n_0$,

$-\int_{(0,1)} G\,dJ'_n > -\int_{(0,1)} G_n\,dJ_n - 2\varepsilon a^{-1/2} > \|\tilde g\|_2 - 2\varepsilon - 2\varepsilon a^{-1/2}$.

It is also clear from (4.2) and the definition of the points $u'_{i,n}$ that

$|\int_0^1 J'_n(u)\,du| = |\int_0^1 J'_n(u)\,du - \int_0^1 J_n(u)\,du| \le 2^{-n+1}a^{-1/2}$

and

$|\int_0^1 (J'_n(u))^2\,du - 1| = |\int_0^1 (J'_n(u))^2\,du - \int_0^1 J_n^2(u)\,du| \le 2^{-n+1}a^{-1}$.

Thus, for $n$ sufficiently large we can find a $J''_n \in \mathcal{M}$, obtained from $J'_n$ by making slight adjustments of mass, which satisfies

$-\int_{(0,1)} G\,dJ''_n > \|\tilde g\|_2 - 3\varepsilon - 2\varepsilon a^{-1/2}$.

Therefore $-\inf_{J \in \mathcal{M}} \int_{(0,1)} G\,dJ \ge \|\tilde g\|_2$. Since $-\inf_{J \in \mathcal{M}} \int_{(0,1)} G\,dJ \le -\inf_{J \in \mathcal{M}} \int_{(0,1)} G_n\,dJ = \|\tilde g_n\|_2$, for each $n$, relation (4.4) now follows.

Finally, let $G$ be an arbitrary bounded function, such that $G(0) = G(1) = 0$ and (4.3) is satisfied. By (4.3) and the boundedness of $G$ there exists a constant $c > 0$, such that

(4.10)  $|G(t)| \le c\,\min\{t^{1/2+\delta},(1-t)^{1/2+\delta}\}$,  $t \in [0,1]$.

Thus, if $\tilde g$ is the right-continuous slope of the convex minorant $\tilde G$ of $G$, we have

(4.11)

This implies

(4.12)

Define for each $t \in (0,1/2)$ the function $G_t$ by

$G_t(u) = G(u)$, if $u \in [t,1-t]$;  $G_t(u) = 0$, otherwise.

By (4.10), (4.2) and integration by parts, we have for all $J \in \mathcal{M}$

(4.13)

Ifro ,OG dJ- !(O,1)GtdJI.sc! (O,t) . 0 < c.t /0.

U~odJ+Cffl_t,l)(l"U)~odJ

Let $H_n = G_{1/n}$ and let $\tilde H_n$ be the convex minorant of $H_n$. The sequence $\{H_n\}$ converges uniformly to $G$, as $n \to \infty$, and hence, by the same argument as used above, $\{\tilde H_n\}$ converges pointwise to $\tilde G$. By (4.11) and (4.12) the right-continuous slopes $\tilde h_n$ of $\tilde H_n$ are uniformly bounded (in absolute value) by an $L_2$-function $f$ of the form $f(u) = k\,\min\{u^{-1/2+\delta},(1-u)^{-1/2+\delta}\}$, $u \in (0,1)$, where $k$ is some positive constant. Since a similar bound holds for $\tilde g$, we have by dominated convergence

(4.14)  $\lim_{n \to \infty} \|\tilde h_n - \tilde g\|_2 = 0$.

Thus $\lim_{n \to \infty} \|\tilde h_n\|_2 = \|\tilde g\|_2$ and by (4.13),

(4.15)  $\lim_{n \to \infty} \|\tilde h_n\|_2 = \lim_{n \to \infty}\{-\inf_{J \in \mathcal{M}} \int_{(0,1)} G_{1/n}\,dJ\} = -\inf_{J \in \mathcal{M}} \int_{(0,1)} G\,dJ$.

The result now follows from (4.14) and (4.15). □


Remark 4.1. It is clear that condition (4.3) can be somewhat weakened and we mainly chose (4.3) for convenience.

Proof of Theorem 4.1. Let $U_n$ be the empirical process defined by $U_n(t) = \sqrt{n}(F_n(t) - t)$, $t \in [0,1]$, where $F_n$ is the empirical df of a sample from a uniform distribution on $[0,1]$. With probability one all observations are contained in the open interval $(0,1)$ and hence $U_n$ satisfies almost surely the conditions of Lemma 4.1. Let $\tilde u_n$ be a version of the slope of the convex minorant $\tilde U_n$ of $U_n$. Then, by Lemma 4.1,

Il n U2 =...infj, JJ f(O,l)UndJ. Fix £ >0 and let an = (log n)4/ n, bn = l-a n. There exists 0> 0 such that P[suP t E (0,1) IUn(t) I/It(l-t) 2. Mil 092 n] < e for all n2.3, where 1092n= loglog n (this follows from the law of the iterated logarithm for the empirical process). If IUn(t)I.=:.M/t 1092n and JE M, we have f[o/n,anJ

IUn ldJ .=:. MlJ og2n f[o/n,any''f dJ(t).=:. M{(10g2n)(lOg(nan/o))}~.=:.c.10g2n,

for some constant c independent of n. A similar upper bound holds for f[bn,l-o/nJ IUnldJ. Since sUPJE:M f(O,o/n] t dJ(t)+O~ and similarly sUPJ~ Mf[1-o/n,1) (l-t)dJ(t) +0, as n+ oo , there exists a constant k such that, for all large n, P[sUPJc:M f(o,an]v[bn,l) IUnldJ2.k 1092n] 1-2£, and since £>0 was arbitrarily chosen, we have (4.16)

$(\{\inf_{J \in \mathcal{M}} \int_{(a_n,b_n)} U_n\,dJ\}^2 - \log n)/\sqrt{3\log n} \xrightarrow{\mathcal{L}} Z$.

By Komlós et al. (1975), there are versions of Brownian bridges $B_n$ such that $\sup_{t \in (0,1)} |U_n(t) - B_n(t)| = O((\log n)/\sqrt{n})$ with probability one. Hence, by (4.2),

$|\inf_{J \in \mathcal{M}} \int_{(a_n,b_n)} U_n\,dJ - \inf_{J \in \mathcal{M}} \int_{(a_n,b_n)} B_n\,dJ| = O(\sup_{J \in \mathcal{M}} \int_{(a_n,b_n)} n^{-1/2}\log n\,dJ) = O(1/\log n)$,

almost surely. This implies

(4.17)  $(\{\inf_{J \in \mathcal{M}} \int_{(a_n,b_n)} B_n\,dJ\}^2 - \log n)/\sqrt{3\log n} \xrightarrow{\mathcal{L}} Z$.

By the law of the iterated logarithm for Brownian bridge there exists a constant $k > 0$ such that

$\sup_{J \in \mathcal{M}} \int_{[1/n,a_n]} |B_n|\,dJ$ ... $1-\varepsilon$.

Thus we can replace $a_n$ by $1/n$ and $b_n$ by $1-1/n$ in (4.17). By Lemma 4.1 we have

$-\inf_{J \in \mathcal{M}} \int_{[1/n,\,1-1/n]} B_n\,dJ = \|g_{1/n,\,1-1/n}\|_2$,

where $g_{1/n,\,1-1/n}$ is a version of the slope of the convex minorant of $B_n \cdot 1_{[1/n,\,1-1/n]}$. Since the distribution of $\|g_{1/n,\,1-1/n}\|_2$ will be the same for any version of the Brownian bridge $B_n$, the result now follows. □

5. Asymptotic normality of a statistic proposed by Behnen.

Let $X_1,\ldots,X_m$ and $Y_1,\ldots,Y_n$ be two independent samples from a uniform distribution on $[0,1]$, let $F_m$ ($G_n$) be the empirical df of the first (second) sample and let $H_N$ be the empirical df of the combined sample. With probability one, all observations in the combined sample are different and contained in the open interval $(0,1)$. Thus, on a set of probability one, we can define the inverse $H_N^{-1}$ of $H_N$ as the right-continuous df such that $H_N(H_N^{-1}(k/N)) = k/N$ and $H_N^{-1}(u) = H_N^{-1}(k/N)$, $k/(N+1) \le u < (k+1)/(N+1)$, $k = 0,\ldots,N$. In the sequel we will restrict our attention to the set where $H_N^{-1}$ is well-defined and we shall omit the expression "with probability one". We define the (random) dfs $\bar F_m$ and $\bar G_n$ by

$\bar F_m = F_m \circ H_N^{-1}$ and $\bar G_n = G_n \circ H_N^{-1}$.

Note that by our definition of $H_N^{-1}$ these dfs are right-continuous.

Behnen (1975) considered the statistic

(5.1)  $T_N = \sup_{J \in \mathcal{M}} \int_{(0,1)} J(u)\,d\bar F_m(u)$

(actually he considered slightly different versions, but this will make no difference for the limiting behavior). By integration by parts and Lemma 4.1 it is seen that

(5.2)  $T_N = -\inf_{J \in \mathcal{M}} \int_{(0,1)} (\bar F_m(u) - u)\,dJ(u) = \|f_{m,N} - 1\|_2$,

where $f_{m,N}$ is a version of the slope of the convex minorant of $\bar F_m$. Let

(5.3)  $L_N(t) = (1-\lambda_N)\{\lambda_N^{-1/2}\,U_m(H_N^{-1}(t)) - (1-\lambda_N)^{-1/2}\,V_n(H_N^{-1}(t))\}$,

for all $t \in [0,1]$, where $U_m(u) = \sqrt{m}(F_m(u) - u)$, $V_n(u) = \sqrt{n}(G_n(u) - u)$ and $\lambda_N = m/N$. Then

$\inf_{J \in \mathcal{M}} \int_{(0,1)} \sqrt{N}(\bar F_m(t) - t)\,dJ(t) = \inf_{J \in \mathcal{M}} \int_{(0,1)} L_N\,dJ + O(1)$,

since

$|\sqrt{N}(\bar F_m(t) - t) - L_N(t)| = \sqrt{N}\,|H_N(H_N^{-1}(t)) - t| \le N^{-1/2}$,  $t \in [0,1]$,

(cf. Pyke & Shorack (1968), Lemma 3.1, p. 762, but note that our definition of $H_N^{-1}$ is different; in particular $|H_N(H_N^{-1}(t)) - t| = t \wedge (1-t)$, if $t \wedge (1-t) < (N+1)^{-1}$).

To obtain the limiting behavior of $\|f_{m,N}\|_2$, with $f_{m,N}$ as in (5.2), we compare $L_N$ with the corresponding functional for Brownian bridges $B_m$ and $B'_n$:

(5.4)  $\bar L_N(t) = (1-\lambda_N)\{\lambda_N^{-1/2}\,B_m(H_N^{-1}(t)) - (1-\lambda_N)^{-1/2}\,B'_n(H_N^{-1}(t))\}$.
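For orientation, the quantity on the right of (5.2) depends on the data only through the order in which the two samples interleave. The Python sketch below computes it under one reading of the definitions in this section: the composed df is taken to be constant on each interval $[k/(N+1), (k+1)/(N+1))$, with value equal to the proportion of first-sample observations among the $k$ smallest pooled observations. The function name and the `'X'`/`'Y'` label encoding are hypothetical, and the code is not Behnen's own formulation.

```python
import math

def behnen_statistic(labels):
    """L2-norm of (slope of convex minorant of the composed df) minus 1.

    labels: sequence of 'X' / 'Y' giving the sample membership of the pooled
    observations in increasing order; it must contain at least one 'X'.
    """
    N = len(labels)
    m = sum(1 for s in labels if s == 'X')
    # Lower-right corners of the flat pieces of the composed df, plus the origin.
    pts = [(0.0, 0.0)]
    count = 0
    for k in range(N + 1):
        if k > 0 and labels[k - 1] == 'X':
            count += 1
        pts.append(((k + 1) / (N + 1), count / m))
    # Lower convex hull by a monotone-chain scan.
    hull = []
    for p in pts:
        while len(hull) >= 2:
            (ax, ay), (bx, by) = hull[-2], hull[-1]
            cross = (bx - ax) * (p[1] - ay) - (by - ay) * (p[0] - ax)
            if cross <= 0:
                hull.pop()
            else:
                break
        hull.append(p)
    squared = sum((y1 - y0) ** 2 / (x1 - x0)
                  for (x0, y0), (x1, y1) in zip(hull, hull[1:]))
    # The slope integrates to 1, so the squared norm of (slope - 1) is squared - 1.
    return math.sqrt(max(squared - 1.0, 0.0))
```

With one observation per sample, `behnen_statistic(['Y', 'X'])` evaluates to $\sqrt{2}$ and `behnen_statistic(['X', 'Y'])` to $\sqrt{1/2}$: the statistic is larger when the first sample tends to lie above the second.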

Lemma 5.1. Let $a_N = (\log N)^4/N$, $b_N = 1-a_N$ and let $U_m$, $V_n$, $B_m$ and $B'_n$ be independent versions of empirical processes and Brownian bridges respectively, such that almost surely,

$\sup_{t \in (0,1)} |U_m(t) - B_m(t)| = O(\log m/\sqrt{m})$ and $\sup_{t \in (0,1)} |V_n(t) - B'_n(t)| = O(\log n/\sqrt{n})$,

as $m, n \to \infty$. Then, if $\lambda_N$ is bounded away from 0 and 1, as $N \to \infty$, we have

(5.5)  $\inf_{J \in \mathcal{M}} \int_{[a_N,b_N]} L_N\,dJ - \inf_{J \in \mathcal{M}} \int_{[a_N,b_N]} \bar L_N\,dJ = O(1/\log N)$,

with probability one, as $N \to \infty$.

Proof. Note that

$\sup_{t \in (0,1)} |U_m(H_N^{-1}(t)) - B_m(H_N^{-1}(t))| \le \sup_{t \in (0,1)} |U_m(t) - B_m(t)|$,

with a similar relation for $V_n \circ H_N^{-1} - B'_n \circ H_N^{-1}$. The rest of the proof follows exactly the same pattern as the argument in the proof of Theorem 4.1. □

Lemma 5.2. Let $a_N = (\log N)^4/N$ and $b_N = 1-a_N$. Then, if $B$ is a Brownian bridge on $[0,1]$, we have

$\sup_{J \in \mathcal{M}} \int_{[a_N,b_N]} |B(H_N^{-1}(t)) - B(t)|\,dJ(t) \to 0$,

in probability, as $N \to \infty$.

Proof. Fix $\varepsilon > 0$. There exist $\delta > 0$ and $M > 0$ such that P[sup ...


0, there exists b = b(e:) such that P[Fm(t) ~Fm(bt), all t€ [0,1] ]~ l-€. (see Lemma 2.5, p.761, Pyke &Shorack(1968); our interval for t is [0,1] rather than [1/N, 1], because of our defi ni ti on of HN1) . There exi sts M> 0 such that P[supt £ {0,1)IUm(bt)INt~MI10g2m]
