Bootstrap algorithms for small samples

Nicholas I. Fisher
CSIRO, Division of Mathematics & Statistics, Sydney, P.O. Box 218, Lindfield, NSW 2070, Australia

Peter Hall
Department of Statistics, Australian National University, G.P.O. Box 4, Canberra, ACT 2601, Australia

Journal of Statistical Planning and Inference 27 (1991) 157-169. North-Holland.

Received 11 May 1989
Recommended by T.L. Lai

Abstract: We describe algorithms for exact computation of nonparametric bootstrap estimators, and show that they are practicable for small samples. For example, if the sample size is n = 6 then the entire bootstrap distribution has only 462 atoms. It is argued that, in this setting, exact calculation of the entire bootstrap distribution is competitive with simulation involving several hundred bootstrap replications. We also describe the role of exact computation in the iterated bootstrap, and discuss a method of importance resampling appropriate to small samples.

AMS Subject Classification: Primary 62G05, 62G15; secondary 62E15.

Key words and phrases: Bootstrap; exact computation; enumeration; importance resampling; Monte Carlo; simulation.

1. Introduction

The nonparametric bootstrap may be used to estimate a wide variety of statistical features, including bias, variance, distribution function and quantile. Exact calculation of bootstrap estimates requires enumeration of all resamples which may be drawn with replacement from the original sample, and also computation of the likelihood and statistic value associated with each resample. This is impractical in large samples, since the number of possible resamples increases exponentially quickly with sample size. However, it is quite feasible for small samples - certainly for samples of size 7 or less, and sometimes for samples of size 8 or 9, depending on the available computing resources. Some authors suggest that between 1000 and 2000 simulations are needed to compute bootstrap estimators by Monte Carlo means. Since there are only about 1700 atoms in the bootstrap distribution when the sample size is n = 7, and fewer than 500 when n = 6, exact calculation is competitive with simulation in these cases.

We describe algorithms for exact computation of nonparametric bootstrap estimates in small samples.


Section 2 reviews basic properties of bootstrap estimators, provides formulae for exact calculation, and discusses the issue of ties.

Section 3 describes the role of exact calculations in iterated bootstrap estimation, using exact calculation in the first bootstrap operation and Monte Carlo simulation in the second. Section 4 discusses a small sample version of Johns’ (1988) importance resampling method, and Section 5 briefly describes generalisations of our technique.
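The feasibility claims above are easy to check directly, since the distinct resamples are simply the multisets of size n drawn from the sample. The following minimal sketch (ours, not from the paper) enumerates them in Python; the sample values and the choice of statistic are purely illustrative.

```python
# Sketch: enumerate every distinct bootstrap resample (atom) of a small sample,
# together with its multinomial probability, and form an exact bootstrap expectation.
from itertools import combinations_with_replacement
from math import factorial


def bootstrap_atoms(sample):
    """Yield (resample, probability) for every distinct resample of `sample`."""
    n = len(sample)
    for idx in combinations_with_replacement(range(n), n):
        counts = [idx.count(i) for i in range(n)]
        denom = 1
        for c in counts:
            denom *= factorial(c)
        prob = factorial(n) / (denom * n**n)    # n!/(k_1! ... k_n! n^n)
        yield [sample[i] for i in idx], prob


if __name__ == "__main__":
    sample = [2.1, 3.4, 1.7, 5.0, 4.2, 2.9]     # illustrative data, n = 6
    atoms = list(bootstrap_atoms(sample))
    print(len(atoms))                           # 462 atoms, as claimed above
    print(sum(p for _, p in atoms))             # probabilities sum to 1
    theta = lambda xs: sum(xs) / len(xs)        # illustrative statistic: the mean
    print(sum(p * theta(r) for r, p in atoms))  # exact bootstrap expectation of theta
```

For n = 6 the loop visits only 462 atoms, so the exact answer costs less than a typical Monte Carlo run of 1000-2000 resamples.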

2. Exact computation of bootstrap estimators

2.1. Introduction and summary

In this section we argue that exact computation of bootstrap estimates, such as bias and quantile estimates, is an attractive proposition for samples of size n ≤ 6. Exact calculation is sometimes feasible for n = 7, 8 and 9, but not really for n ≥ 10 unless extensive computational resources are available. Subsection 2.2 gives notation, Subsection 2.3 describes basic properties of bootstrap estimates, and Subsection 2.4 discusses algorithms for exact computation. In Subsection 2.3 we describe the effect of ties in resamples; Subsection 2.5 suggests a way of allowing for ties in the original sample.

2.2. Notation

Let $\mathcal{X} = \{X_1, \ldots, X_n\}$ denote a random sample from the distribution of a $d$-dimensional vector $X$, put $\bar{X} = n^{-1} \sum X_i$, and suppose $\theta$ is a real-valued function of $d$ variables. We focus attention on properties of $\theta(\bar{X})$. For example, we may wish to estimate $E\{\theta(\bar{X})\}$, or the distribution of $\theta(\bar{X})$, perhaps after a standardization for scale. Cases which may be studied in this context include that where $\theta(\bar{X})$ is a univariate mean, when $d = 1$; or a variance, when $d = 2$; or a correlation coefficient, when $d = 5$; or a function such as a ratio of any number of these quantities.

We begin by noting the asymptotic distribution of $\theta(\bar{X})$. If $\mu = E(X)$ denotes the mean of $X$ then $n^{1/2}\{\theta(\bar{X}) - \theta(\mu)\}$ is asymptotically normal $N(0, \sigma^2)$, where

$$\sigma^2 = \sum_{i=1}^{d} \sum_{j=1}^{d} \theta_i(\mu)\,\theta_j(\mu)\,\sigma_{ij}, \qquad \theta_i(\mu) = (\partial/\partial\mu^{(i)})\,\theta(\mu), \qquad \sigma_{ij} = E(X^{(i)} X^{(j)}) - \mu^{(i)}\mu^{(j)},$$

and elements of vectors are indicated by superscripts. To avoid trivialities, assume that $\sigma^2 > 0$; this avoids pathological examples such as $\theta(x) = x^2$. Our empirical estimator of $\sigma^2$ is the asymptotic estimator

$$\hat\sigma^2_{\mathrm{asy}} = \sum_i \sum_j \theta_i(\bar{X})\,\theta_j(\bar{X})\,\hat\sigma_{ij}, \qquad \hat\sigma_{ij} = n^{-1} \sum_{k=1}^{n} X_k^{(i)} X_k^{(j)} - \bar{X}^{(i)}\bar{X}^{(j)}.$$

In small samples this statistic is not always compellingly attractive as an estimator of $n\,\mathrm{var}\{\theta(\bar{X})\}$. Alternatives are the jackknife estimator $\hat\sigma^2_{\mathrm{jack}}$, based on the leave-one-out means $\bar{X}_i$ (where $\bar{X}_i$ denotes the mean of the $(n-1)$-sample $\mathcal{X} \setminus \{X_i\}$), and the bootstrap estimator $\hat\sigma^2_{\mathrm{boot}}$, defined by

$$\hat\sigma^2_{\mathrm{boot}} = n\bigl(E\{\theta(\bar{X}^*)^2 \mid \mathcal{X}\} - [E\{\theta(\bar{X}^*) \mid \mathcal{X}\}]^2\bigr),$$

where $\bar{X}^*$ equals the mean of a resample $\mathcal{X}^*$ of size $n$ drawn randomly, with replacement, from $\mathcal{X}$. Each of $\hat\sigma^2_{\mathrm{asy}}$, $\hat\sigma^2_{\mathrm{jack}}$ and $\hat\sigma^2_{\mathrm{boot}}$ converges to $\sigma^2$ with probability one as $n \to \infty$. The bootstrap estimator of $\omega = E\{\theta(\bar{X})\}$ is

$$\hat\omega = E\{\theta(\bar{X}^*) \mid \mathcal{X}\}.$$

Hence the bootstrap estimate of bias $E\{\theta(\bar{X})\} - \theta(\mu)$ is $\hat\omega - \theta(\bar{X})$. Let $\hat\sigma^2$ denote any one of $\hat\sigma^2_{\mathrm{asy}}$, $\hat\sigma^2_{\mathrm{jack}}$ and $\hat\sigma^2_{\mathrm{boot}}$, and write $\hat\sigma^{*2}$ for the version of $\hat\sigma^2$ computed for the resample $\mathcal{X}^*$. Bootstrap estimators of the distribution functions

$$F(x) = P\{\theta(\bar{X}) - \theta(\mu) \le x\}, \qquad G(x) = P[\{\theta(\bar{X}) - \theta(\mu)\}/\hat\sigma \le x]$$

are

$$\hat F(x) = P\{\theta(\bar{X}^*) - \theta(\bar{X}) \le x \mid \mathcal{X}\}, \qquad \hat G(x) = P[\{\theta(\bar{X}^*) - \theta(\bar{X})\}/\hat\sigma^* \le x \mid \mathcal{X}],$$

respectively. Quantiles of the bootstrap distributions are often needed to construct confidence intervals. They are

$$\hat t_p = \inf\{x: \hat F(x) \ge p\}, \qquad \hat q_p = \inf\{x: \hat G(x) \ge p\}.$$

Use of $\hat G$, or the quantile $\hat q_p$, to construct confidence regions for $\theta(\mu)$, is known as the percentile-t method, to distinguish it from the percentile method based on $\hat F$ and $\hat t_p$. While percentile-t has advantages over percentile in large samples (Hinkley and Wei, 1984; Hall, 1988), it can suffer difficulties in small samples owing to ties in the resample $\mathcal{X}^*$. These problems will be discussed in Section 3.

2.3. Basic properties of bootstrap estimates

Assume that the distribution of $X$ has no atoms. Then with probability one, all values in the sample $\mathcal{X}$ are distinct, and the number of different unordered resamples $\mathcal{X}^*$ that can be drawn from $\mathcal{X}$ with replacement equals the number of ways of choosing nonnegative integers $k_1, \ldots, k_n$ satisfying $k_1 + \cdots + k_n = n$. This is given by


$$N(n) = \binom{2n-1}{n}$$

(Hall, 1987, Appendix 1). Table 1 lists values of $N(n)$. The qualification 'with probability one' here and below refers to realizations of $\mathcal{X}$. It means that if $\mathcal{E}$ is the collection of realizations for which the qualified statement is valid, then $P(\mathcal{X} \in \mathcal{E}) = 1$. With probability one, each of the $N(n)$ different resamples produces a distinct value of $\theta(\bar{X}^*)$ and of $T^* = \{\theta(\bar{X}^*) - \theta(\bar{X})\}/\hat\sigma^*$, provided that in the latter we ignore resamples which have $\hat\sigma^* = 0$. Now, with probability one the event $\hat\sigma^* = 0$ occurs for precisely those $n$ resamples $\mathcal{X}^*$ which consist of $n$ identical elements. Any given one of these special resamples has conditional probability $n^{-n}$ of arising. Hence,

$$p_0(n) = P(\hat\sigma^* = 0 \mid \mathcal{X}) = n^{-(n-1)}, \tag{2.1}$$

with probability one.

In the event that $\hat\sigma^* = 0$, the studentized ratio $T^*$ is not well defined. However, if we interpret $\hat G$ as

$$\hat G(x) = P\{\theta(\bar{X}^*) - \theta(\bar{X}) \le \hat\sigma^* x \mid \mathcal{X}\}, \tag{2.2}$$

then $\hat G$ is always well defined, as is the quantile $\hat q_p$. This convention can be important in small samples, where the probability at (2.1) may be non-negligible. Formula (2.2) has an obvious analogue in multivariate problems, where $\theta$ is a vector of length $r$ and $\hat\sigma^*$ is an $r \times r$ matrix. Despite these convenient interpretations, there are obvious and serious problems in effectively using resamples which have zero variance, when the percentile-t method is employed.

Table 1
Number of atoms, $N(n) = \binom{2n-1}{n}$, of the bootstrap resampling distribution, together with probabilities under uniform resampling of the most likely atom ($n!\,n^{-n}$) and the least likely atom ($n^{-n}$). Probabilities designated * are exact; all others are rounded to four significant figures.

sample size, n | number of atoms, N(n) | probability of most likely atom | probability of least likely atom
2              | 3                     | 0.5*                            | 0.25*
3              | 10                    | 0.2222                          | 3.704 x 10^-2
4              | 35                    | 9.375 x 10^-2*                  | 3.906 x 10^-3
5              | 126                   | 3.84 x 10^-2*                   | 3.2 x 10^-4*
6              | 462                   | 1.543 x 10^-2                   | 2.143 x 10^-5
7              | 1716                  | 6.120 x 10^-3                   | 1.214 x 10^-6
8              | 6435                  | 2.403 x 10^-3                   | 5.960 x 10^-8
9              | 24310                 | 9.367 x 10^-4                   | 2.581 x 10^-9
10             | 92378                 | 3.629 x 10^-4                   | 1 x 10^-10*
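As a quick cross-check, the three columns of Table 1 follow directly from the closed-form expressions in the caption; a short sketch (Python) of that calculation:

```python
# Sketch: reproduce the columns of Table 1 from their closed forms.
from math import comb, factorial

for n in range(2, 11):
    atoms = comb(2 * n - 1, n)        # N(n), number of distinct resamples
    p_most = factorial(n) / n**n      # the original sample, all values distinct
    p_least = n**(-n)                 # a resample made of n copies of one value
    print(f"n={n:2d}  N(n)={atoms:5d}  most likely={p_most:.4g}  least likely={p_least:.4g}")
```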


Similar problems are often experienced with resamples where a sample value is repeated $n-1$ or $n-2$ times. The corresponding value of $T^*$, although well defined, can be very large. Now, the probability that some sample value is repeated precisely $n-1$ times in $\mathcal{X}^*$ equals

$$p_1(n) = n(n-1) \cdot n\,n^{-n} = (1 - n^{-1})\,n^{-(n-3)}, \qquad n \ge 3.$$

The chance that $\mathcal{X}^* = \{X_i, X_j, X_k, X_k, \ldots, X_k\}$ for some set of three distinct values is

$$p_2(n) = \tfrac{1}{2} n(n-1)(n-2) \cdot n(n-1)\,n^{-n} = \tfrac{1}{2}(1 - n^{-1})^2 (1 - 2n^{-1})\,n^{-(n-5)}, \qquad n \ge 4,$$

and the chance that $\mathcal{X}^* = \{X_i, X_i, X_j, X_j, \ldots, X_j\}$ for some $X_i \ne X_j$ is

$$p_3(n) = n(n-1) \cdot \tfrac{1}{2} n(n-1)\,n^{-n} = \tfrac{1}{2}(1 - n^{-1})^2\,n^{-(n-4)}, \qquad n \ge 5.$$

The sum

$$p(n) = p_0(n) + p_1(n) + p_2(n) + p_3(n) = n^{-n+1} - \tfrac{3}{2} n^{-n+2} + \tfrac{5}{2} n^{-n+3} - \tfrac{3}{2} n^{-n+4} + \tfrac{1}{2} n^{-n+5}$$

equals the probability that some observation in the resample is repeated at least $n-2$ times, and is given in Table 2. Only for $n \ge 7$ do these values not exceed 5%. Therefore we may expect to experience problems with tied resample values when using the percentile-t method in samples of size $n \le 6$.

The most likely resample $\mathcal{X}^*$ is $\mathcal{X}$, the original sample; the least likely is any one of the $n$ resamples of $n$ identical elements. Table 1 lists values of $N(n)$ and of the two extreme probabilities, for $2 \le n \le 10$. We may deduce from those data that for $n = 3$ to $n = 6$, and perhaps also for $n = 7$ to $n = 9$, exact enumeration of bootstrap atoms is computationally feasible. Simulation is really the only alternative for $n \ge 10$.

feasible.

Simulation

Table 2 Probability

that some sample

value is repeated

at least n - 2 times in resample sample

size, n

probability large

of

T*, p(n)

5

0.290

6 I 8 9

0.052 6.8 x 10m3 6.8 x 10m4 5.5 x 10-5

10

3.7 x 10-6

is really

the only

alternative

for

N.I. Fisher, P. Hall / Bootstrap algorithms

162 2.4.

Exact

Suppose

computation

we wish to calculate

the exact value of

j x}.

+ = E{e(X*)

Let g(n) denote the set of all distinct n-tuples 1=(1,, . . . , I,) having 0 5 I, I I2 5 ... 5 1, and 1 Ii= n. Given (I,, . . . , I,) e g(n), let A([,, . . . , f,) be the set of all ordered of (I,, . . . , I,,). Define n-tuples (k,, . . . , k,) which are simply permutations X(n)

Table

=

IJ .A(l,, IG Y(n)

3

Example

of the sets P(n)

. H(n), and subsequent

and

calculation

of v/ from

klkzk,h

I, 121314 1 1 1

. . ..I.).

I

1 1

I I

r

0112 0 1 2 1 0211 2110 2011 2101

0112

10

(12 distinct 1 2

102

’ X(4)

permutations)

1

1102 1120 1201 L

0013

1 2 10

C 12 distinct permutations i

0022

0004 (2.4)

6 distinct permutations 4 distinct permutations

I

(3=3!4~3[(l!l!l!l!)~‘f?{(X,+X*+X3+X4)/4} +(1!1!2!))’

c

Q{(X,+X,+2&)/4}

+(1!3!)-’

c

B{(x,+3x,)/4}

+(2!2!)-’

c

0{(2X,+2X,)/4}

+(4!)~‘(~(~,)+~(~z)+B(~3)+~(~4))1.

[In each case, summation

is over distinct values of i, j, k.]

equation

(2.4), with n = 4

algorithms

163

--k,!n”)-lBjn-l~l k;x,)

(2.3)

N.I. Fisher, P. Hall / BooNrap

Then

Z(n)

contains

lJT=

precisely

c

n!(k,!

ke.li(rr)

N(H) elements,

and

See Table 3 for a clarifying example. These formulae follow from the fact that the resample in which X, is repeated just kj times for 15 is n, arises with probability ’ conditional on l?r. n!(k,! . ..k.!n”) Much of the labour in computing @ comes from evaluating the ratio n!(k,! . ..k.!~“)~‘. This work is reduced if u/ is computed in the form (2.4) rather than (2.3). Similarly, the bootstrap distribution functions P and C? may be calculated exactly as

where B(k) denotes the value of 6 computed for the resample in which X, appears exactly k, times for 1 sirn. The p-th quantile [, of P may be computed by first obtaining all N(n) pairs (o,r)

= [O(+,

kid,

n!(kl!...k,!tPm’j;

ordering these pairs in respect of the first element, as (COG, n,) where ol < ...