Ann. Inst. Statist. Math. Vol. 57, No. 2, 279-290 (2005) ©2005 The Institute of Statistical Mathematics
VARIANCE ESTIMATION FOR SAMPLE QUANTILES USING THE m OUT OF n BOOTSTRAP
K. Y. CHEUNG AND STEPHEN M. S. LEE*
Department of Statistics and Actuarial Science, The University of Hong Kong, Pokfulam Road, Hong Kong, China, e-mail: [email protected]; [email protected]

(Received January 21, 2004; revised May 7, 2004)
Abstract. We consider the problem of estimating the variance of a sample quantile calculated from a random sample of size $n$. The $r$-th-order kernel-smoothed bootstrap estimator is known to yield an impressively small relative error of order $O(n^{-r/(2r+1)})$. It nevertheless requires strong smoothness conditions on the underlying density function, and has a performance very sensitive to the precise choice of the bandwidth. The unsmoothed bootstrap has a poorer relative error of order $O(n^{-1/4})$, but works for less smooth density functions. We investigate a modified form of the bootstrap, known as the $m$ out of $n$ bootstrap, and show that it yields a relative error of order smaller than $O(n^{-1/4})$ under the same smoothness conditions required by the conventional unsmoothed bootstrap on the density function, provided that the bootstrap sample size $m$ is of an appropriate order. The estimator permits exact, simulation-free computation and has accuracy fairly insensitive to the precise choice of $m$. A simulation study is reported to provide empirical comparison of the various methods.
Key words and phrases: m out of n bootstrap, quantile, smoothed bootstrap.

1. Introduction
Suppose that $X_1, \ldots, X_n$ constitute a random sample of size $n$ taken from a distribution $F$. Let $X_{(j)}$ denote the $j$-th smallest datum in the sample. For a fixed $p \in (0, 1)$, assume that $F$ has a continuous and positive density $f$ on $F^{-1}(\mathcal{O})$ for an open neighbourhood $\mathcal{O}$ containing $p$. Denote by $\xi_p = F^{-1}(p)$ the unique $p$-th quantile of $F$. The $p$-th sample quantile $X_{(r)}$ is a natural and consistent estimator for $\xi_p$, where $r = [np] + 1$ and $[\cdot]$ denotes the integer part function. Standard theory establishes that $\sigma_n^2 = \mathrm{Var}(X_{(r)})$ admits an asymptotic expansion

(1.1)    $\sigma_n^2 = n^{-1} p(1-p) f(\xi_p)^{-2} + o(n^{-1}).$

A general discussion can be found in Stuart and Ord (1994). Although (1.1) provides an explicit leading term useful for approximating $\sigma_n^2$, its direct computation requires the value of $f(\xi_p)$, which is usually unknown and difficult to estimate. The conventional, $n$ out of $n$, unsmoothed bootstrap draws a large number of bootstrap samples, each of size $n$, from $X_1, \ldots, X_n$, and estimates $\sigma_n^2$ by the sample variance of

*Supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. HKU 7131/00P).
the bootstrap sample quantiles calculated from the bootstrap samples. Hall and Martin (1988) show that the theoretical $n$ out of $n$ bootstrap estimator $\hat\sigma_n^2$, which is based on infinite simulation of bootstrap samples, has an explicit expression

(1.2)    $\hat\sigma_n^2 = \sum_{j=1}^{n} (X_{(j)} - X_{(r)})^2 w_{n,j},$

where $w_{n,j} = r\binom{n}{r} \int_{(j-1)/n}^{j/n} x^{r-1}(1-x)^{n-r}\,dx$. They prove that $\hat\sigma_n^2$ has a large relative error of order $O(n^{-1/4})$, that is, $\hat\sigma_n^2/\sigma_n^2 = 1 + O_p(n^{-1/4})$. Maritz and Jarrett (1978) note that $\hat\sigma_n^2$ may be more accurate than the leading term in the asymptotic formula (1.1) for $p = 1/2$ in small-sample cases, even if the true $f(\xi_p)$ is employed to calculate the latter. The smoothed bootstrap modifies the $n$ out of $n$ bootstrap procedure by drawing (smoothed) bootstrap samples from a kernel density estimate of $f$ rather than from the empirical distribution of $X_1, \ldots, X_n$. Hall et al. (1989) show that the smoothed bootstrap estimator has a smaller relative error, of order $O(n^{-r/(2r+1)})$ based on a kernel of order $r$, under much stronger smoothness conditions on $f$, provided that the smoothing bandwidth is chosen of order $n^{-1/(2r+1)}$. The $m$ out of $n$ bootstrap, as pioneered by Bickel and Freedman (1981), provides a method for rectifying bootstrap inconsistency in many nonregular problems: see, for example, Swanepoel (1986) and Athreya (1987). It is, however, generally less efficient than the $n$ out of $n$ bootstrap when the latter is consistent: see, for example, Shao (1994) and Cheung et al. (2005). Exceptional cases have been found, though. Wang and Taguri (1998) and Lee (1999) improve the $n$ out of $n$ bootstrap by suitably adjusting the resample size $m$ in estimation and confidence interval problems respectively. Arcones (2003) shows that the $n$ out of $n$ bootstrap provides a consistent estimator for the distribution function of sample quantiles with error of order $O(n^{-1/4})$, whilst the $m$ out of $n$ bootstrap reduces the error to order $O(n^{-1/3})$ by use of $m \propto n^{2/3}$. Janssen et al. (2001) independently obtain similar results for U-quantiles. We shall show in the present context that the $m$ out of $n$ bootstrap is also effective in reducing the relative error of $\hat\sigma_n^2$ under the same minimal smoothness conditions on $f$ as those required by the $n$ out of $n$ bootstrap.
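Because $w_{n,j}$ is the increment of a regularized incomplete beta function over $[(j-1)/n, j/n]$, the estimator (1.2) can be evaluated exactly, with no resampling. The following is a minimal sketch (our illustration, not the authors' code), using the fact that for integer arguments $I_y(r, n-r+1)$ equals a binomial upper-tail probability:

```python
import math

def quantile_boot_var(x, p):
    """Exact n-out-of-n bootstrap variance of the p-th sample quantile,
    i.e. formula (1.2): sum_j (X_(j) - X_(r))^2 * w_{n,j}."""
    n = len(x)
    xs = sorted(x)
    r = int(n * p) + 1  # r = [np] + 1

    def inc_beta(y, a, b):
        # Regularized incomplete beta I_y(a, b) for integer a, b:
        # the upper tail of a Binomial(a + b - 1, y) distribution at a.
        nn = a + b - 1
        return sum(math.comb(nn, i) * y ** i * (1 - y) ** (nn - i)
                   for i in range(a, nn + 1))

    # w_{n,j} = I_{j/n}(r, n-r+1) - I_{(j-1)/n}(r, n-r+1)
    w = [inc_beta(j / n, r, n - r + 1) - inc_beta((j - 1) / n, r, n - r + 1)
         for j in range(1, n + 1)]
    return sum(wj * (xs[j] - xs[r - 1]) ** 2 for j, wj in enumerate(w))
```

The weights telescope to $I_1 - I_0 = 1$, so $\hat\sigma_n^2$ is a probability-weighted average of squared deviations of the order statistics from $X_{(r)}$.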
The rest of the paper is organized as follows. Section 2 reviews the smoothed bootstrap method for variance estimation for sample quantiles. Section 3 studies the convergence rate, as well as the asymptotic distribution, of the m out of n bootstrap variance estimator. Section 4 describes a computational algorithm for empirically determining the optimal m. Section 5 presents a simulation study to compare the performances of the various variance estimators. Section 6 concludes our findings. Technical details are given in the Appendix.
2. Smoothed bootstrap

We review the smoothed bootstrap procedure for estimating $\sigma_n^2$. Instead of resampling from the empirical distribution of $X_1, \ldots, X_n$, the smoothed bootstrap simulates smoothed bootstrap samples from a kernel density estimate $\hat f_b$ of $f$, given by $\hat f_b(x) = (nb)^{-1} \sum_{i=1}^{n} K((x - X_i)/b)$, where $b > 0$ denotes the bandwidth and $K$ is an $r$-th-order kernel function for $r \geq 2$. The smoothed bootstrap estimator $\hat\sigma_{n,b}^2$ of $\sigma_n^2$ is then obtained by calculating the sample variance of the $p$-th smoothed bootstrap sample quantiles.
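Unlike (1.2), $\hat\sigma_{n,b}^2$ has no closed form and is approximated by Monte Carlo. A minimal sketch for the proper-density case $r = 2$, with a Gaussian kernel (our choice for illustration; the paper's simulations use the Epanechnikov kernel): sampling from $\hat f_b$ amounts to resampling a datum and perturbing it by $b$ times a draw from $K$.

```python
import random

def smoothed_boot_var(x, p, b, B=1000, rng=None):
    """Monte Carlo sketch of the smoothed bootstrap estimate sigma^2_{n,b}:
    the sample variance of the p-th quantiles of B samples of size n drawn
    from the kernel density estimate f_b (Gaussian kernel, so f_b is a
    proper density and can be sampled directly)."""
    rng = rng or random.Random(0)
    n = len(x)
    r = int(n * p) + 1
    quantiles = []
    for _ in range(B):
        # A draw from f_b: resample a datum, then add b * (a draw from K).
        sample = sorted(rng.choice(x) + b * rng.gauss(0.0, 1.0) for _ in range(n))
        quantiles.append(sample[r - 1])
    mean = sum(quantiles) / B
    return sum((q - mean) ** 2 for q in quantiles) / (B - 1)
```

For $r > 2$ the kernel estimate takes negative values and this direct sampling scheme breaks down, which is the negativity problem discussed below.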
Let $f^{(j)}$ be the $j$-th derivative of $f$. Assume that $f^{(r)}$ exists and is uniformly continuous, $f^{(j)}$ is bounded for $0 \leq j < r$, $f$ is bounded away from 0 in a neighbourhood of $\xi_p$ and $E|X|^\eta < \infty$ for some $\eta > 0$. Then Hall et al. (1989) show that $\hat\sigma_{n,b}^2$ has the optimal relative error of order $O(n^{-r/(2r+1)})$, achieved by setting $b \propto n^{-1/(2r+1)}$. In principle, the relative error can be made arbitrarily close to $O(n^{-1/2})$ by choosing a sufficiently high kernel order $r$. It should be noted that when $r > 2$, $\hat f_b(x)$ necessarily takes on negative values for some $x$ and poses practical difficulties if smoothed bootstrap samples need be simulated from $\hat f_b$. Negativity correction techniques of some sort must be incorporated into the smoothed bootstrap procedure to make it computationally feasible: see, for example, Lee and Young (1994). In the case where $r = 2$, so that $\hat f_b$ is a proper density function, the optimal relative error of $\hat\sigma_{n,b}^2$ is of order $O(n^{-2/5})$, which already improves upon the unsmoothed $n$ out of $n$ bootstrap, which has a relative error of order $O(n^{-1/4})$.

3. m out of n bootstrap
The $m$ out of $n$ bootstrap modifies the $n$ out of $n$ bootstrap by drawing bootstrap samples of size $m$, instead of $n$, from the empirical distribution of $X_1, \ldots, X_n$, where $m$ satisfies $m = o(n)$ and $m \to \infty$ as $n \to \infty$. The corresponding variance estimator $\hat\sigma_m^2$ is then defined as $m/n$ times the sample variance of the $p$-th bootstrap sample quantiles. Recall that $X_{(j)}$ is the $j$-th order statistic of $X_1, \ldots, X_n$ and $X_{(r)}$ is the $p$-th sample quantile. The $m$ out of $n$ bootstrap variance estimator $\hat\sigma_m^2$ admits an explicit, directly computable, formula:

(3.1)    $\hat\sigma_m^2 = (m/n) \sum_{j=1}^{n} (X_{(j)} - X_{(r)})^2 w_{m,j},$

where $w_{m,j} = k\binom{m}{k} \int_{(j-1)/n}^{j/n} x^{k-1}(1-x)^{m-k}\,dx$ and $k = [mp] + 1$. Our main theorem below establishes asymptotic normality of $\hat\sigma_m^2$ together with the corresponding convergence rate. Its proof is outlined in the Appendix.

THEOREM 3.1. Assume $m \propto n^\lambda$ for some $\lambda \in (0, 1)$, $E|X|^\eta < \infty$ for some $\eta > 0$, $f = F'$ exists and satisfies a Lipschitz condition of order $\nu = 1/2 + \epsilon$, with $\epsilon \in (0, 1/2)$, in a neighbourhood of $\xi_p$, and $f(\xi_p) > 0$. Then

(3.2)    $n^{3/2} m^{-1/4} (\hat\sigma_m^2 - \sigma_n^2) = S_n + O_p(m^{1/4} n^{-1/2} + m^{-1/2 - \epsilon/2} n^{1/2}),$

where $S_n$ converges in distribution to $N(0,\, 2\pi^{-1/2} [p(1-p)]^{3/2} f(\xi_p)^{-4})$.

The expansion (3.2) enables us to deduce the optimal choice of $m$ by which $\hat\sigma_m^2$ achieves the fastest convergence rate, as is asserted in the following corollary.

COROLLARY 3.1. Under the conditions of Theorem 3.1, $\hat\sigma_m^2$ has an optimal relative error of order $O(n^{-(1+2\epsilon)/(4+4\epsilon)})$, achieved by setting $m \propto n^{1/(1+\epsilon)}$.

Hall and Martin (1988) show that $n^{5/4}(\hat\sigma_n^2 - \sigma_n^2)$ has the same asymptotic normal distribution as does $n^{3/2} m^{-1/4}(\hat\sigma_m^2 - \sigma_n^2)$ under exactly the same conditions of Theorem 3.1. It is clear that $\hat\sigma_m^2$ converges to $\sigma_n^2$ at a faster rate than does $\hat\sigma_n^2$, which has a relative error of order $O(n^{-1/4})$. Although the smoothed bootstrap estimator $\hat\sigma_{n,b}^2$ has an even smaller relative error, of order $O(n^{-r/(2r+1)})$, than $\hat\sigma_m^2$ for any $r \geq 2$, it requires that $f$ be at least twice continuously differentiable in a neighbourhood of $\xi_p$, a condition much stronger than those of Theorem 3.1. Moreover, that no such computable expression as (3.1) exists for $\hat\sigma_{n,b}^2$ means that $\hat\sigma_{n,b}^2$ has to be approximated by Monte Carlo simulation, which is computationally more expensive and is not immediately feasible if $r > 2$ due to the problem of negativity of $\hat f_b$. Arcones (2003) establishes versions of Theorem 3.1 and Corollary 3.1 for $m$ out of $n$ bootstrap estimation of the distribution of $X_{(r)}$. He shows, under the stronger assumption that $f$ is differentiable at $\xi_p$, that the fastest convergence rate, of order $n^{-1/3}$, is attained by setting $m \propto n^{2/3}$. Our results apply to the variance of $X_{(r)}$ and to densities $f$ under less stringent smoothness conditions. Densities violating Arcones' but satisfying our smoothness conditions include those which are Lipschitz continuous of order $\nu \in (1/2, 1)$ near $\xi_p$. A simple example is $f(x) = (7/6)(1 - |x|^{3/4})$, for $|x| < 1$, which is Lipschitz continuous of order $3/4$ at $x = 0$.
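Like (1.2), formula (3.1) is exactly computable: $w_{m,j}$ is the increment of $I_y(k, m-k+1)$ over $[(j-1)/n, j/n]$, which for integer arguments reduces to a binomial tail sum. A minimal sketch (our illustration, not the authors' code):

```python
import math

def m_out_of_n_boot_var(x, p, m):
    """Exact m-out-of-n bootstrap variance estimate, formula (3.1):
    (m/n) * sum_j (X_(j) - X_(r))^2 * w_{m,j}, with k = [mp] + 1 and
    w_{m,j} = I_{j/n}(k, m-k+1) - I_{(j-1)/n}(k, m-k+1)."""
    n = len(x)
    xs = sorted(x)
    r = int(n * p) + 1  # index of the p-th sample quantile
    k = int(m * p) + 1  # index of the p-th bootstrap sample quantile

    def inc_beta(y):
        # I_y(k, m-k+1): upper tail of a Binomial(m, y) distribution at k
        return sum(math.comb(m, i) * y ** i * (1 - y) ** (m - i)
                   for i in range(k, m + 1))

    w = [inc_beta(j / n) - inc_beta((j - 1) / n) for j in range(1, n + 1)]
    return (m / n) * sum(wj * (xs[j] - xs[r - 1]) ** 2 for j, wj in enumerate(w))
```

Taking $m = n$ recovers (1.2); a choice such as `m = round(n ** (2/3))` mirrors Corollary 3.1 in the case $\epsilon = 1/2$.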
4. Empirical determination of m

It follows from Corollary 3.1 that fixing $m = cn^\gamma$, for some constants $c$ and $\gamma$ independent of $n$, yields the best convergence rate for $\hat\sigma_m^2$. In practice $\gamma$ is unknown and so is the optimal value of $c$. We sketch below a simple algorithm, based on the bootstrap, for empirical determination of both $c$ and $\gamma$ and hence the optimal choice of $m$.

First fix $S$ distinct bootstrap sample sizes $m_1, \ldots, m_S < n$, for some $S \geq 2$. For each $s = 1, \ldots, S$, calculate $\hat\sigma_s^{*2} = (n/m_s)\hat\sigma_{m_s}^2$, the variance of the $p$-th bootstrap sample quantile induced by the drawing of bootstrap samples of size $m_s$. Generate a large number, $B$ say, of bootstrap samples $\mathcal{X}_{s,1}^*, \ldots, \mathcal{X}_{s,B}^*$, each of size $m_s$, from $X_1, \ldots, X_n$. For each $\mathcal{X}_{s,b}^*$, calculate the $\ell$ out of $m_s$ estimate of $\sigma_s^{*2}$, namely

$\hat\sigma_{s,b,\ell}^{*2} = (\ell/m_s) \sum_{j=1}^{m_s} (X_{s,b,(j)}^* - X_{s,b,(r^*)}^*)^2 \, k^* \binom{\ell}{k^*} \int_{(j-1)/m_s}^{j/m_s} x^{k^*-1} (1-x)^{\ell-k^*}\,dx,$

where $k^* = [\ell p] + 1$, $r^* = [m_s p] + 1$ and $X_{s,b,(i)}^*$ denotes the $i$-th smallest datum in $\mathcal{X}_{s,b}^*$. The mean squared error of the $\ell$ out of $m_s$ bootstrap variance estimate is then estimated by $\widehat{\mathrm{MSE}}_s(\ell) = B^{-1} \sum_{b=1}^{B} (\hat\sigma_{s,b,\ell}^{*2} - \hat\sigma_s^{*2})^2$. Select $\ell = \ell_s$ which minimizes $\widehat{\mathrm{MSE}}_s(\ell)$ over $\ell \in \{1, \ldots, m_s\}$. Asymptotically $\ell_s \approx c m_s^\gamma$, so that $\log \ell_s \approx \log c + \gamma \log m_s$, for $s = 1, \ldots, S$. Standard least squares techniques yield that $c \approx \exp\{D^{-1}(M_2 L_1 - M_1 K)\}$ and $\gamma \approx D^{-1}(SK - M_1 L_1)$, where $M_1 = \sum_{s=1}^{S} \log m_s$, $M_2 = \sum_{s=1}^{S} (\log m_s)^2$, $L_1 = \sum_{s=1}^{S} \log \ell_s$, $K = \sum_{s=1}^{S} (\log m_s)(\log \ell_s)$ and $D = SM_2 - M_1^2$. Finally calculate the optimal $m$ to be $m = [cn^\gamma]$, with $c$ and $\gamma$ fixed at the above approximate values.
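The closing least-squares step can be sketched as follows (our illustration, not the authors' code; the inputs $\ell_s$ are assumed to have been obtained from the pilot-bootstrap MSE minimisation described above):

```python
import math

def fit_optimal_m(ms, ls, n):
    """Least-squares step of the Section 4 algorithm: given pilot resample
    sizes m_s and their MSE-minimising subsample sizes l_s, fit
    log l_s ~ log c + gamma * log m_s and return m = [c * n^gamma]
    ([.] = integer part)."""
    S = len(ms)
    lm = [math.log(m) for m in ms]
    ll = [math.log(l) for l in ls]
    M1 = sum(lm)                      # sum of log m_s
    M2 = sum(v * v for v in lm)       # sum of (log m_s)^2
    L1 = sum(ll)                      # sum of log l_s
    K = sum(a * b for a, b in zip(lm, ll))
    D = S * M2 - M1 ** 2
    c = math.exp((M2 * L1 - M1 * K) / D)
    gamma = (S * K - M1 * L1) / D
    return int(c * n ** gamma)
```

On data that lie exactly on a power law $\ell_s = c m_s^\gamma$, the fit recovers $c$ and $\gamma$ up to floating-point error.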
5. Simulation study

We conducted a simulation study to compare the mean squared errors of $\hat\sigma_n^2$, $\hat\sigma_m^2$ and $\hat\sigma_{n,b}^2$, for $p = 0.1$, $0.5$ and $0.9$ and for fixed values of $m$ and $b$. Random samples of sizes $n = 50$ and $200$ were generated from three distributions: (i) the standard normal distribution, $N(0,1)$, (ii) the chi-squared distribution with 5 degrees of freedom, $\chi_5^2$, and (iii) the double exponential distribution with density function $f(x) = (1/2)\exp(-|x|)$. All three distributions have densities satisfying the Lipschitz condition of order one, so that the conditions of Theorem 3.1 hold for $\epsilon = 1/2$. For the smoothed bootstrap estimator $\hat\sigma_{n,b}^2$, the second-order Epanechnikov kernel function $k(t) = \max\{(3/4)(1 - t^2), 0\}$ was employed. Note that the first derivative of the double exponential density function does not exist at $\xi_{0.5} = 0$, so that the density there lacks the smoothness condition sufficient for proper functioning of the smoothed bootstrap method based on the kernel $k$ above. Each smoothed bootstrap estimate $\hat\sigma_{n,b}^2$ was derived from 1,000 smoothed bootstrap samples. The estimates $\hat\sigma_n^2$ and $\hat\sigma_m^2$ were directly computed using explicit formulae (1.2) and (3.1) respectively. Each mean squared error was obtained by averaging over 1,600 random samples drawn from $F$.

Fig. 1. Normal example: mean squared errors of $\hat\sigma_n^2$, $\hat\sigma_m^2$ (plotted against $m$) and $\hat\sigma_{n,b}^2$ (plotted against $b$) for $n = 50$ and $200$, and $p = 0.1$, $0.5$ and $0.9$.

Figure 1 plots the mean squared error of $\hat\sigma_m^2$ against $m$ (bottom axis) and that of $\hat\sigma_{n,b}^2$ against $b$ (top axis) for the normal distribution. Similar comparisons for the chi-squared and double exponential distributions are given in Figs. 2 and 3 respectively. The mean squared error of $\hat\sigma_n^2$ is also included in each diagram for reference.

For the $N(0,1)$ data, as predicted from asymptotic results, the $n$ out of $n$ bootstrap yields for $\hat\sigma_n^2$ the largest mean squared error, except for cases of large $b$ or $m$, for all combinations of $n$ and $p$. The mean squared error of the smoothed bootstrap estimate
Fig. 2. Chi-squared example: mean squared errors of $\hat\sigma_n^2$, $\hat\sigma_m^2$ (plotted against $m$) and $\hat\sigma_{n,b}^2$ (plotted against $b$) for $n = 50$ and $200$, and $p = 0.1$, $0.5$ and $0.9$.
$\hat\sigma_{n,b}^2$ varies with $b$ parabolically. Although it is asymptotically less accurate, the $m$ out of $n$ bootstrap estimate $\hat\sigma_m^2$ has mean squared error comparable to that of $\hat\sigma_{n,b}^2$ constructed using an optimal $b$, and maintains a more stable performance than $\hat\sigma_{n,b}^2$ for $n = 200$. Among the values of $p$ studied, all three estimators tend to be most accurate at $p = 0.5$ for both $n = 50$ and $200$.

For data drawn from the asymmetric $\chi_5^2$, the mean squared errors of the estimators are in general larger than those observed in the $N(0,1)$ example, and increase as $p$ increases. As in Fig. 1, we see from Fig. 2 that $\hat\sigma_n^2$ is generally the least accurate, while the mean squared errors of $\hat\sigma_m^2$ and $\hat\sigma_{n,b}^2$ are of similar magnitudes. The optimal choice of bandwidth, which yields the minimum mean squared error for $\hat\sigma_{n,b}^2$, increases considerably as $p$ increases; the optimal choice of $m$ for $\hat\sigma_m^2$, by contrast, stays within the same range as $p$ varies, rendering its empirical determination less difficult than that of the optimal bandwidth.

Figure 3 displays the findings for the double exponential data. For $p = 0.1$ and $0.9$, we see that $\hat\sigma_{n,b}^2$ and $\hat\sigma_m^2$ have comparable mean squared errors, which are notably smaller than that of $\hat\sigma_n^2$, provided $b$ and $m$ are selected sensibly. For $p = 0.5$, the mean squared
Fig. 3. Double exponential example: mean squared errors of $\hat\sigma_n^2$, $\hat\sigma_m^2$ (plotted against $m$) and $\hat\sigma_{n,b}^2$ (plotted against $b$) for $n = 50$ and $200$, and $p = 0.1$, $0.5$ and $0.9$.
error of $\hat\sigma_{n,b}^2$ increases significantly as $b$ increases, and is much larger than those of $\hat\sigma_n^2$ and $\hat\sigma_m^2$ for $n = 200$, due plausibly to the lack of smoothness of the double exponential density at $\xi_{0.5} = 0$. In general the $m$ out of $n$ bootstrap performs much better than the $n$ out of $n$ bootstrap for $n = 200$, except for small $m$. Similar to the $N(0,1)$ example, all three methods are most accurate at $p = 0.5$ among the values of $p$ studied.

We note that in most of the investigated cases the accuracy of the $m$ out of $n$ bootstrap deteriorates markedly for some very small values of $m$. A heuristic explanation is as follows. We see from the proof of Theorem 3.1 that asymptotic properties of $\hat\sigma_m^2$ depend critically on the weights $w_{m,j}$ for $j$ close to $r$. Lemma A.1 shows that the $w_{m,j}$ sequence, for $j$ close to $r$, resembles asymptotically the central shape of a normal density. Thus our asymptotic findings can reliably predict finite-sample behaviour only when $w_{m,j}$ attains its mode at some $j$ strictly between 1 and $n$. Examination of the $w_{m,j}$ in detail shows that the latter condition holds only when $m$ exceeds a certain value, $M(n,p)$ say, depending on both $n$ and $p$. Under the settings of our simulation study, we find that for both $n = 50$ and $200$, $M(n,p) = 9$, $2$, $10$ for $p = 0.1$, $0.5$ and $0.9$ respectively. Indeed Figs. 1-3 all suggest that the $m$ out of $n$ bootstrap performance begins to stabilize once
Table 1. Mean squared errors of various variance estimates. In the case of $\hat\sigma_m^2$, results are shown for both the smallest error obtained in the simulation study using fixed $m$ and the error given by empirically selecting $m$ using the algorithm in Section 4. Mean and standard deviation of the empirical $m$ are also included.

Normal example
                                         n = 50                              n = 200
                              p=0.1      p=0.5      p=0.9      p=0.1      p=0.5      p=0.9
$\hat\sigma_n^2$              2.7x10^-3  4.1x10^-4  4.2x10^-3  7.5x10^-5  1.2x10^-5  9.0x10^-5
$\hat\sigma_{n,b}^2$ (fixed b)  6.7x10^-4  1.2x10^-4  9.8x10^-4  1.6x10^-5  2.5x10^-6  1.4x10^-5
$\hat\sigma_m^2$ (fixed m)    6.0x10^-4  7.8x10^-5  1.1x10^-3  1.8x10^-5  1.5x10^-6  1.6x10^-5
$\hat\sigma_m^2$ (empirical m)  1.5x10^-3  1.2x10^-4  2.0x10^-3  2.3x10^-5  3.0x10^-6  3.5x10^-5
mean of empirical m           8.0        5.9        7.9        11.5       7.8        13.5
s.d. of empirical m           6.5        4.2        6.2        11.3       8.8        14.5

Chi-squared example
                                         n = 50                              n = 200
                              p=0.1      p=0.5      p=0.9      p=0.1      p=0.5      p=0.9
$\hat\sigma_n^2$              1.2x10^-2  3.7x10^-2  3.4x10^0   3.3x10^-4  9.4x10^-4  4.8x10^-2
$\hat\sigma_{n,b}^2$ (fixed b)  3.9x10^-3  9.8x10^-3  4.7x10^-1  6.1x10^-5  1.6x10^-4  4.8x10^-2
$\hat\sigma_m^2$ (fixed m)    3.4x10^-3  8.4x10^-3  6.0x10^-1  4.5x10^-5  1.4x10^-4  1.4x10^-2
$\hat\sigma_m^2$ (empirical m)  3.5x10^-3  1.4x10^-2  1.4x10^0   1.0x10^-4  3.3x10^-4  2.5x10^-2
mean of empirical m           8.3        7.7        10.1       8.2        7.6        13.9
s.d. of empirical m           4.2        8.9        7.1        5.8        9.4        12.3

Double exponential example
                                         n = 50                              n = 200
                              p=0.1      p=0.5      p=0.9      p=0.1      p=0.5      p=0.9
$\hat\sigma_n^2$              3.5x10^-2  4.0x10^-4  7.2x10^-2  8.6x10^-4  7.4x10^-6  1.0x10^-3
$\hat\sigma_{n,b}^2$ (fixed b)  4.1x10^-3  2.3x10^-4  8.7x10^-3  5.9x10^-5  5.4x10^-6  5.7x10^-5
$\hat\sigma_m^2$ (fixed m)    7.8x10^-3  2.8x10^-4  1.2x10^-2  3.0x10^-4  4.8x10^-6  2.7x10^-4
$\hat\sigma_m^2$ (empirical m)  2.9x10^-2  3.5x10^-4  3.2x10^-2  3.9x10^-4  8.3x10^-6  6.3x10^-4
mean of empirical m           9.1        15.7       10.3       13.1       34.6       14.8
s.d. of empirical m           7.8        19.3       7.3        11.1       34.2       13.5
$m$ exceeds $M(n,p)$, especially for $n = 200$. On the other hand, the optimal choice of bandwidth for $\hat\sigma_{n,b}^2$ depends crucially on $F$, $n$ and $p$, and its mean squared error increases considerably if $b$ deviates from its optimal value.

Table 1 compares numerically the mean squared error of $\hat\sigma_n^2$ with those of $\hat\sigma_m^2$ and $\hat\sigma_{n,b}^2$ at the optimal choices of $m$ (among values greater than $M(n,p)$) and $b$ as observed from the simulation study. In the case of $\hat\sigma_m^2$, we include also results obtained using $m$ selected by the algorithm described in Section 4, in which 1,000 pilot bootstrap samples were simulated to estimate the mean squared error of the $\ell$ out of $m_s$ bootstrap variance estimate and the $m_s$ were chosen to be $2s + 8$ for $n = 50$ and $12s - 2$ for $n = 200$, $s = 1, \ldots, 8$. The mean and standard deviation of the empirical choice of $m$ are reported alongside the mean squared error findings. We see that the optimally constructed $\hat\sigma_m^2$ and $\hat\sigma_{n,b}^2$, at fixed $m$ and $b$ respectively, have comparable errors. Both of them are considerably more accurate than $\hat\sigma_n^2$. In general, our algorithm for empirical determination of $m$ worked satisfactorily and produced estimates more accurate than $\hat\sigma_n^2$, albeit to a lesser extent than its fixed-$m$ counterpart.
6. Conclusion

We have shown, both theoretically and empirically, that the $m$ out of $n$ bootstrap variance estimator $\hat\sigma_m^2$ is notably superior to the conventional $n$ out of $n$ bootstrap estimator $\hat\sigma_n^2$. For densities satisfying a Lipschitz condition of order within $(1/2, 1]$ near $\xi_p$, $\hat\sigma_m^2$ incurs a relative error of smaller order than $\hat\sigma_n^2$, provided that $m$ is chosen appropriately. The smoothed bootstrap estimator $\hat\sigma_{n,b}^2$ may yield an even smaller relative error using an optimal bandwidth $b$, but requires much stronger smoothness conditions on the density $f$. The $m$ out of $n$ bootstrap therefore offers a convenient alternative which is more accurate than the $n$ out of $n$ bootstrap and more robust than the smoothed bootstrap. Under a smooth $f$ for which both smoothed and unsmoothed bootstraps work properly, $\hat\sigma_n^2$, $\hat\sigma_m^2$ and $\hat\sigma_{n,b}^2$ generate relative errors of orders $O(n^{-1/4})$, $O(n^{-1/3})$ and $O(n^{-2/5})$ respectively, provided that $m \propto n^{2/3}$, $b \propto n^{-1/5}$ and a second-order kernel is used in constructing $\hat\sigma_{n,b}^2$.

Our simulation results agree closely with the asymptotic findings. Both the smoothed and the $m$ out of $n$ bootstraps, when constructed optimally, yield comparable accuracies and outperform the $n$ out of $n$ bootstrap method substantially. The optimal choice of bandwidth for the smoothed bootstrap varies considerably with the problem setting, and the mean squared error of $\hat\sigma_{n,b}^2$ is also very sensitive to the bandwidth: a slight deviation from the optimal value may greatly deteriorate the accuracy of the estimate. One therefore requires a carefully designed, data-dependent procedure for calculating the optimal bandwidth in practice. On the other hand, the observed mean squared error of $\hat\sigma_m^2$ remains relatively stable over a wide range of $m$ beyond $M(n,p)$, especially for large $n$. Also, the optimal choice of $m$ tends to stay within a stable region which varies little with the problem setting. This suggests that the precise determination of $m$ is a less crucial issue than the choice of bandwidth for $\hat\sigma_{n,b}^2$. We have proposed a simple bootstrap-based algorithm for empirically determining the optimal $m$ and obtained satisfactory results in our simulation study.

Unlike most bootstrap-based estimates, $\hat\sigma_n^2$ and $\hat\sigma_m^2$ can be evaluated directly using formulae (1.2) and (3.1) respectively, so that no Monte Carlo simulation is necessary, making their computation exact and very efficient. The smoothed bootstrap estimate $\hat\sigma_{n,b}^2$ must, however, be approximated using Monte Carlo simulation. Use of a higher-order kernel, which yields an improved error rate, further complicates the Monte Carlo procedure due to negativity of the kernel estimate $\hat f_b$.
Appendix

A.1 Proof of Theorem 3.1

The proof is modelled after Hall and Martin's (1988) arguments. Let $\phi$ denote the standard normal density function, $y_{n,j} = (j-1)/n$ and $b_{m,j} = (m y_{n,j} - k)\{m y_{n,j}(1 - y_{n,j})\}^{-1/2}$. The following lemma states a useful asymptotic expansion for the weight $w_{m,j}$.

LEMMA A.1. Assume that $m \propto n^\lambda$ for some $\lambda \in (0, 1)$. There exists some constant $C > 0$ such that

$w_{m,j} = m^{1/2} n^{-1} \{y_{n,j}(1 - y_{n,j})\}^{-1/2} \phi(b_{m,j}) + O(n^{-1} e^{-Cm(y_{n,j} - p)^2}).$
PROOF. Note that $w_{m,j} = I_{j/n}(k, m-k+1) - I_{(j-1)/n}(k, m-k+1)$, where $I_y(a,b) = \sum_{j=a}^{a+b-1} \binom{a+b-1}{j} y^j (1-y)^{a+b-1-j}$. Without loss of generality, consider $j = np + q$ with $q \geq 0$. The proof is completed by considering the Edgeworth expansion of the binomial distribution function for the case 0