MAXIMUM LIKELIHOOD ESTIMATION OF A UNIMODAL DENSITY FUNCTION

by

EDWARD J. WEGMAN
Department of Statistics
University of North Carolina at Chapel Hill

Institute of Statistics Mimeo Series No. 608
January 1969

This research arises in part from the author's doctoral dissertation submitted to the Department of Statistics, The University of Iowa. Supported in part by PHS Research Grant No. GM-14448-01 from the Division of General Medical Sciences and in part by N.S.F. Development Grant GU-2059.

Maximum Likelihood Estimation of a Unimodal Density Function

Edward J. Wegman

1. Introduction.

Robertson [4] has described a maximum likelihood estimate of a unimodal density when the mode is known. This estimate is represented as a conditional expectation given a σ-lattice. Discussions of such conditional expectations are given by Brunk [1] and [2].

A σ-lattice, L, of subsets of a measure space (Ω, A, μ) is a collection of subsets of A closed under countable unions and countable intersections and containing both ∅ and Ω. A function f is measurable with respect to L if the set [f > a] is in L for each real a. If Ω is the real line, A is the collection of Borel sets, and μ = λ is Lebesgue measure, let L₂ be the set of square-integrable functions and let L₂(L) be those members of L₂ which are measurable with respect to L. Brunk [1] shows the following definition of conditional expectation with respect to L is suitable.

Definition. If f ∈ L₂, then g ∈ L₂(L) is the conditional expectation of f given L, written E(f|L), if

(1.1)   ∫ f·θ(g) dλ = ∫ g·θ(g) dλ

for every Borel function θ for which the integrals exist. The conditional expectation also satisfies

(1.2)   ∫_A (f − g) dλ ≤ 0   for each A ∈ L,

a property used repeatedly below.
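In the simplest discrete setting, where the σ-lattice is generated by the upper intervals of a finite grid, the L₂ projection onto L-measurable (here, nondecreasing) functions is ordinary least-squares monotone regression, computable by the pool-adjacent-violators algorithm. A minimal sketch; the discrete setting, the name `pava`, and the data are illustrative assumptions, not constructions from the paper:

```python
def pava(values, weights=None):
    """Least-squares nondecreasing fit via pool-adjacent-violators."""
    n = len(values)
    if weights is None:
        weights = [1.0] * n
    # Each block: [weighted mean, total weight, number of points pooled].
    blocks = []
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Pool while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    # Expand blocks back to one fitted value per grid point.
    fit = []
    for m, _, c in blocks:
        fit.extend([m] * c)
    return fit


g = [3.0, 1.0, 2.0]
print(pava(g))  # [2.0, 2.0, 2.0]: the violating pair pools with the third point
```

The fit exhibits the discrete analogue of (1.2): summing g − ĝ over any upper interval of indices (the sets generating this σ-lattice) gives a nonpositive total.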



+ ∫_[R,R'] ĝ_nm(y_r(n)) dλ − ∫_[L,L'] ĝ_nm(y_f(n)) dλ.

Combining the first and third terms of the right-hand side, observing that λ[L, R] = λ[L', R'], and finally observing that both λ[R, R'] and λ[L, L'] are positive and that g(y) > ĝ_nm(y) for all y in [L', R'], it follows that g has a larger likelihood product than ĝ_nm, contrary to the choice of ĝ_nm. Hence either L or R must lie in {y_1, …, y_n}, or else an estimate equally likely as ĝ_nm may clearly be defined, in a similar manner to the above, with modal interval [y_f(n)+1, y_r(n)−1] having proper end points. This completes the proof.

If the modal interval is unspecified, there are at most 2n intervals of this form. Compute the estimates ĝ_nm_1, …, ĝ_nm_2n corresponding to these 2n intervals, and calculate the likelihood products

∏_{j=1}^{n} ĝ_nm_i(y_j),   i = 1, 2, …, 2n.

Select the maximum of these 2n numbers, let ĝ_n be the corresponding estimate, and let m_n be its center mode. (Notice that if 2 or more of the likelihood products are equal, we could have some ambiguity in the definition of ĝ_n; let us adopt the convention of choosing the center mode.) Clearly the likelihood product of ĝ_n is as large as that of any other estimate. If one or more of the sets (−∞, L), [L, R], (R, ∞) contain no observations, we may define an alternate function in a similar manner.
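The selection step just described, evaluate the likelihood product for each candidate, keep the maximizer, and break ties by the center-mode convention, can be sketched as follows. The Laplace family used here is a toy stand-in for the actual known-mode estimates, which this copy does not fully specify; only the selection logic mirrors the text:

```python
import math

def log_likelihood(data, center, scale=1.0):
    # Toy stand-in for the known-mode fit: a Laplace density centered at
    # the candidate mode (an illustrative assumption, not the paper's family).
    return sum(-math.log(2.0 * scale) - abs(y - center) / scale for y in data)

def select_estimate(data):
    """Score each candidate mode, keep the maximizers of the likelihood
    product, and break ties by taking the center mode."""
    candidates = sorted(data)
    scores = [log_likelihood(data, c) for c in candidates]
    best = max(scores)
    # All candidates attaining the maximum (compared with a small tolerance).
    winners = [c for c, s in zip(candidates, scores) if abs(s - best) < 1e-12]
    return winners[len(winners) // 2]   # center-mode convention

print(select_estimate([0.0, 1.0, 2.0, 3.0, 10.0]))  # 2.0
```

For this toy family the maximizer of the likelihood product is the sample median, so the selected mode is 2.0.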

Before closing this section, we note the following lemma.

Lemma 2.2. The maximum likelihood estimate ĝ_nm, which is measurable with respect to L([L, R]), is given by E(g_nm | L([L, R])), where g_nm is as before.

Proof: Let us abbreviate, for the purpose of this lemma, E(g_nm | L_n([L, R])) by ĝ and g_nm by g. We want to show ĝ = E(g | L([L, R])). Clearly, ĝ is measurable with respect to L([L, R]). Property (1.1) is easy to verify, so let us show (1.2) for every A in L([L, R]). We may assume A is a closed interval, [a, b], with y_f(n) ≤ a ≤ L and R ≤ b ≤ y_r(n), and suppose, contrary to (1.2), that

∫_[a,b] (g − ĝ) dλ > 0.

Let [y_j, y_{j+1}] be the interval between observations containing b. Since ĝ and g are both constant on [y_j, y_{j+1}], if g(y_{j+1}) > ĝ(y_{j+1}) then ∫_[b,y_{j+1}] (g − ĝ) dλ > 0, and

∫_[a,y_{j+1}] (g − ĝ) dλ = ∫_[a,b] (g − ĝ) dλ + ∫_[b,y_{j+1}] (g − ĝ) dλ.

Since both terms on the right-hand side are positive, we have

∫_[a,y_{j+1}] (g − ĝ) dλ > 0.

(In the contrary case, ∫_[a,y_j] (g − ĝ) dλ ≥ ∫_[a,b] (g − ĝ) dλ > 0, and the argument proceeds in the same way.) Since ĝ and g are also constant on [y_i, y_{i+1}], the interval between observations containing a, we have in the same manner, when g(y_i) > ĝ(y_i),

∫_[y_i, y_{j+1}] (g − ĝ) dλ > 0.

But [y_i, y_{j+1}] belongs to L_n([L, R]), which means by (1.2)

∫_[y_i, y_{j+1}] (g − ĝ) dλ ≤ 0.

This is a contradiction, so our assumption that ∫_[a,b] (g − ĝ) dλ > 0 is false. A similar argument applies if either a < y_f(n) or R < b ≤ y_r(n) fails to hold, or both. This concludes the proof.

3. Conditional Expectations of the True Density.

In this section, we investigate the conditional expectation, f* = E(f | L([L, R])), of the true density f with respect to the σ-lattice of intervals containing the interval [L, R]. Notice that ∫ f² dλ < ∞, so that f* is well defined. Let M denote the mode of f, let L' = min{L, M}, and let

a = sup{y ≤ L' : (R − y)⁻¹ · ∫_[y,R] f dλ ≥ f(y)}

and

K = (R − a)⁻¹ · ∫_[a,R] f dλ.

Notice that a, indeed, exists, since the set defining a is nonempty. The cases are distinguished by the relative positions of f(L), K, and f(R); case iii follows in a manner similar to case ii. In case ii, f* = K on [a, R] and f* = f elsewhere, and since f(y) ≤ K for y ≤ L', it is clear that

∫_A (f − f*) dλ ≤ 0

for every A in L([L, R]).
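The quantities a and K can be computed numerically for a concrete density. The triangular density below and the grid resolution are illustrative assumptions; the formulas for a = sup{y ≤ L' : (R − y)⁻¹ ∫_[y,R] f dλ ≥ f(y)} and K = (R − a)⁻¹ ∫_[a,R] f dλ follow the definitions in the text:

```python
import math

# Illustrative triangular density with mode M = 1 (an assumption, not
# from the paper): f(y) = y on [0, 1], f(y) = 2 - y on [1, 2].
def f(y):
    if 0.0 <= y <= 1.0:
        return y
    if 1.0 < y <= 2.0:
        return 2.0 - y
    return 0.0

N, TOP = 20000, 2.0
h = TOP / N
# Cumulative midpoint-rule integral of f at the grid points i * h.
cum = [0.0]
for i in range(N):
    cum.append(cum[-1] + f((i + 0.5) * h) * h)

def integral(lo, hi):
    return cum[round(hi / h)] - cum[round(lo / h)]

L, R, M = 0.9, 2.0, 1.0
Lp = min(L, M)                       # L' = min{L, M}

# a = sup{ y <= L' : (R - y)^(-1) * integral(f, [y, R]) >= f(y) }
candidates = [i * h for i in range(int(Lp / h) + 1)]
a = max(y for y in candidates if integral(y, R) / (R - y) >= f(y))
K = integral(a, R) / (R - a)         # f* takes the value K on [a, R]

# For this density the defining condition solves in closed form:
# a = 2 - sqrt(2), and K = f(a) at the boundary.
print(round(a, 3), round(K, 3))      # prints 0.586 0.586
```

At the supremum the averaging condition holds with equality, so K coincides with f(a); the grid search reproduces the closed-form value to within the grid spacing.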




For the consistency results we use methods quite analogous to those found in Robertson [4] or Wegman [5]. Let us turn our attention to ĝ_nm_n. Let η be any positive number. For sufficiently large n, [L − η, R + η] ∈ L([L_n, R_n]), and if L < y_0 < R, then y_0 ∈ [L_n, R_n] eventually. Thus, for sufficiently large n,

ĝ_nm_n(y_0) ≥ (R − L + 2η)⁻¹ · ∫_[L−η,R+η] ĝ_nm_n dλ.

Since for t = ĝ_nm_n(y_0) the set [ĝ_nm_n > t] belongs to L([L_n, R_n]), it is not difficult to see that

(n* − 1)/n ≤ ∫_[L−η,R+η] ĝ_nm_n dλ ≤ (n* + 2)/n,

where n* is the number of observations in [L − η, R + η]. Let us write F_n(x−) for lim_{y↑x} F_n(y). Then we may rewrite the above inequality as

F_n(R + η) − F_n(L − η−) − 1/n ≤ ∫_[L−η,R+η] ĝ_nm_n dλ ≤ F_n(R + η) − F_n(L − η−) + 2/n.

Hence we have

lim_{n→∞} ∫_[L−η,R+η] ĝ_nm_n dλ = F(R + η) − F(L − η−) = ∫_[L−η,R+η] f dλ.

From this,

lim inf_{n→∞} ĝ_nm_n(y_0) ≥ (R − L + 2η)⁻¹ · ∫_[L−η,R+η] f dλ.

But this is true for all η.
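The left limit F_n(x−) = lim_{y↑x} F_n(y) used above simply counts the observations strictly below x, so that F_n(R) − F_n(L−) is exactly the fraction of observations in [L, R]. A small sketch; the function names are illustrative:

```python
from bisect import bisect_left, bisect_right

def ecdf(sample, x):
    """F_n(x): fraction of observations <= x."""
    return bisect_right(sorted(sample), x) / len(sample)

def ecdf_left(sample, x):
    """F_n(x-) = lim of F_n(y) as y increases to x: fraction strictly < x."""
    return bisect_left(sorted(sample), x) / len(sample)

sample = [1.0, 2.0, 2.0, 3.0]
print(ecdf(sample, 2.0))       # 0.75
print(ecdf_left(sample, 2.0))  # 0.25

# F_n(R) - F_n(L-) counts exactly the observations in [L, R]:
L, R = 2.0, 3.0
n_star = sum(1 for y in sample if L <= y <= R)
assert ecdf(sample, R) - ecdf_left(sample, L) == n_star / len(sample)
```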

On the other hand, let t = ĝ_nm_n(y_0), so that T_t = [ĝ_nm_n > t] is an interval, T_t = [y_i(n), y_j(n)]. We wish to show lim_{n→∞} y_i(n) = L. Clearly lim sup y_i(n) ≤ L, so it is necessary to show lim inf y_i(n) ≥ L. If this is not true, there is a subsequence, which we shall again denote y_i(n) for simplicity, and a positive constant η such that y_i(n) ≤ L − η. For the moment, ĝ_nm_n will mean the subsequence corresponding to y_i(n). Since ĝ_nm_n is constant on T_t, ĝ_nm_n(y_0) = ĝ_nm_n(L − η) eventually, and since m_n converges to m and f_m_n(L − η) converges to f_m(L − η), we have, eventually on [L − η, R),

lim sup ĝ_nm_n(x) ≤ f_m(L − η) for all x.

Eventually [L − η, R + η] ∈ L([L_n, R_n]), so that by Fatou's lemma,

lim sup ∫_[L−η,R+η] ĝ_nm_n dλ ≤ ∫_[L−η,R+η] lim sup ĝ_nm_n dλ ≤ ∫_[L−η,R+η] f_m(L − η) dλ.

We have just seen that

lim_{n→∞} ∫_[L−η,R+η] ĝ_nm_n dλ = ∫_[L−η,R+η] f dλ,

so

(4.3)   ∫_[L−η,R+η] f dλ ≤ ∫_[L−η,R+η] f_m(L − η) dλ.

But [L, R] must contain M, since otherwise the characterization of Theorem 3.1 could not hold, and since f has a unique mode, f exceeds f_m(L − η) on some neighborhood of M. Hence (4.3) cannot hold. Thus lim y_i(n) = L and, similarly, y_j(n) converges to R. Writing

ĝ_nm_n(y_0) = (y_j(n) − y_i(n))⁻¹ · ∫_[y_i(n), y_j(n)] ĝ_nm_n dλ = (y_j(n) − y_i(n))⁻¹ · (F_n(y_j(n)) − F_n(y_i(n))) + O(1/n),

by (4.1) it is clear that

lim_{n→∞} ĝ_nm_n(y_0) = (R − L)⁻¹ · (F(R) − F(L))

for y_0 such that L < y_0 < R.




Both m and m' are in the support of f. In the next lemma, when we write ĝ_n, we shall mean that subsequence of the sequence of maximum likelihood estimates for which the center modes converge to m'; by g_n*, we shall mean the maximum likelihood estimate whose mode is m'. Take L = m − (1/2)ε and R = m + (1/2)ε.

Lemma 5.4. For sufficiently large n, ĝ_n and g_n* agree.

This is easily seen, since m_n > m − (1/2)ε as soon as n is sufficiently large. This, together with (5.1) and (5.2), shows

(5.3)   ĝ_n(t_0) ≤ (y_i* − y_i)⁻¹ · ∫_[y_i, y_i*] ĝ_n dλ,

where y_i* is the smallest observation greater than y_i. Now pick t_0 and let T_{t_0} = [y_j*, y_k*]. Again by use of (4.2), we have

(5.4)   ĝ_n(t_0) ≤ (y_i − y_j*)⁻¹ · ∫_[y_j*, y_i] ĝ_n dλ.

Finally, letting t = ĝ_n(y_i) and using (4.1), we have

(5.5)   ĝ_n(y_i) ≤ (y_i* − y_i)⁻¹ · ∫_[y_i, y_i*] ĝ_n dλ.

Using (5.3), (5.4) and (5.5), we wish to show that equality holds in the last inequality. If not, ĝ_n has a jump point in (y_i, y_i*); but y_i ≠ y_i* are adjacent observations, so ĝ_n(t_0) = ĝ_n(y_i). Thus our assumption is contradicted.




Let c and d, both in the support of f, be such that, for sufficiently large n, f_n* and ĝ_n agree on (c − η, d + η)ᶜ with probability one, and let k = min{f(c − η), f(d + η)}. Because f_n* converges to f_m almost uniformly with probability one, and ĝ_n converges to f_m' almost uniformly with probability one, log(f_n*) converges to log(f_m) almost uniformly with probability one and log(ĝ_n) converges to log(f_m') almost uniformly with probability one. Let δ > 0, and pick the set A of arbitrarily small measure so that P(A) · log(kε/2) > −δ. Now if y_1, …, y_n are the ordered observations, let

L_n(f_n*) − L_n(ĝ_n) = (1/n) Σ_{i=1}^{n} {log(f_n*(y_i)) − log(ĝ_n(y_i))}.

Since f_n* and ĝ_n agree on (c − η, d + η)ᶜ eventually, if we let Σ_1 be the summation over y_i in [c − η, d + η] ∩ A and Σ_2 be the summation over y_i in [c − η, d + η] − A, then eventually

L_n(f_n*) − L_n(ĝ_n) = (1/n) Σ_1 {log(f_n*(y_i)) − log(ĝ_n(y_i))} + (1/n) Σ_2 {log(f_n*(y_i)) − log(ĝ_n(y_i))}.

Now

(1/n) Σ_1 {log(f_n*(y_i)) − log(ĝ_n(y_i))} ≥ (n*/n) · log(kε/2)

for sufficiently large n, where n* is the number of observations in [c − η, d + η] ∩ A. Since n*/n converges with probability one to the probability of that set, (1/n) Σ_1 {log(f_n*(y_i)) − log(ĝ_n(y_i))} > −δ eventually with probability one. In a similar manner, one may show that eventually

−(1/n) Σ_1 {log(f_m(y_i)) − log(f_m'(y_i))} > −δ.

By the uniform convergence of log(f_n*) to log(f_m) and of log(ĝ_n) to log(f_m') on [c − η, d + η] − A, we have, eventually with probability one,

L_n(f_n*) − L_n(ĝ_n) ≥ (1/n) Σ_2 {log(f_m(y_i)) − log(f_m'(y_i))} − 2δ,

so that, with probability one and for sufficiently large n,

L_n(f_n*) − L_n(ĝ_n) ≥ (1/n) Σ_{i=1}^{n} {log(f_m(y_i)) − log(f_m'(y_i))} − 3δ.

By the Kolmogorov Strong Law, with probability one,

lim_{n→∞} (L_n(f_n*) − L_n(ĝ_n)) ≥ ∫ log(f_m) f dλ − ∫ log(f_m') f dλ − 3δ.

Since m is the unique mode of f, ∫ log(f_m) f dλ > ∫ log(f_m') f dλ, and since δ was arbitrary, with probability one,

lim_{n→∞} (L_n(f_n*) − L_n(ĝ_n)) > 0.

This is equivalent to f_n* eventually having a larger likelihood product than ĝ_n, which is a contradiction. Hence, with probability one, {m_n} converges to m.
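The appeal to the Kolmogorov Strong Law, that (1/n) Σ log f_m(y_i) converges with probability one to ∫ log(f_m) f dλ, is easy to check numerically. The density below is an illustrative assumption, not one from the paper:

```python
import math
import random

random.seed(7)

# Illustrative density (an assumption): f(y) = 2y on [0, 1], sampled by
# inversion, Y = sqrt(U).  For this f, the limit in the strong law is
#   integral of log(f) * f  =  log(2) - 1/2  (about 0.1931).
n = 200_000
draws = (math.sqrt(1.0 - random.random()) for _ in range(n))
mean_log = sum(math.log(2.0 * y) for y in draws) / n

target = math.log(2.0) - 0.5
print(abs(mean_log - target) < 0.01)  # prints True
```

With Var(log 2Y) = 1/4 for this density, the standard error at n = 200,000 is about 0.001, so the 0.01 tolerance is comfortable.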

6. Summary.

For ε > 0, we have given a maximum likelihood estimate of a unimodal density. This estimate is a strongly consistent estimate of f_m = E(f | L([m − (1/2)ε, m + (1/2)ε])). Of course, f_m = f except on an interval of length ε. Here m is the mode of f and H(m) = ∫ log(f_m) f dλ. If ε = 0, the same method yields a maximum likelihood estimate; however, the lack of a uniform bound causes the consistency arguments described in this paper to fail, and hence the asymptotic behavior of this sort of estimate is unknown. Wegman [5] describes a consistency argument for a related continuous estimate; the analogue for ĝ_n is not difficult.

7. Acknowledgements.

I wish to thank Professor Tim Robertson for suggesting the topic from which this work arose and for his guidance as my thesis advisor at the University of Iowa.

REFERENCES

[1] Brunk, H.D. (1963). "On an extension of the concept of conditional expectation," Proc. Amer. Math. Soc. 14, 298-304.

[2] Brunk, H.D. (1965). "Conditional expectation given a σ-lattice and applications," Ann. Math. Statist. 36, 1339-1350.

[3] Robertson, Tim (1966). "A representation for conditional expectations given σ-lattices," Ann. Math. Statist. 37, 1279-1283.

[4] Robertson, Tim (1967). "On estimating a density which is measurable with respect to a σ-lattice," Ann. Math. Statist. 38, 482-493.

[5] Wegman, E.J. (1968). "A note on estimating a unimodal density," Institute of Statistics Mimeo Series No. 599, The Univ. of North Carolina at Chapel Hill. (Submitted to Ann. Math. Statist.)