On Locally Adaptive Density Estimation
Stephan R. Sain and David W. Scott¹
January 8, 1996
ABSTRACT: In this paper, theoretical and practical aspects of the sample-point adaptive positive kernel density estimator are examined. A closed-form expression for the mean integrated squared error is obtained through the device of preprocessing the data by binning. With this expression, the exact behavior of the optimally adaptive smoothing parameter function is studied for the first time. The approach differs from most earlier techniques in that bias of the adaptive estimator remains $O(h^2)$ and is not "improved" to the rate $O(h^4)$. A practical algorithm is constructed using a modification of least-squares cross-validation. Simulated and real examples are presented, including comparisons with a fixed bandwidth estimator and a fully automatic version of Abramson's adaptive estimator. The results are very promising.

KEY WORDS: Kernel Function, Variable Bandwidth, Binning, Cross-Validation.

¹ Stephan R. Sain is Research Associate, Department of Statistical Science, Southern Methodist University, POB 750332, Dallas, TX 75275. David W. Scott is Professor, Department of Statistics, Rice University, POB 1892, Houston, TX 77251. This research was supported in part by the National Science Foundation under grant DMS-9306658 and the National Security Agency under grant MOD 9086-93. The authors would like to thank the readers for many helpful suggestions.
1. Introduction

Precise theoretical understanding of adaptive density estimators as well as the availability of sound practical algorithms has proven surprisingly difficult. Note that the term adaptive in this setting does not refer to automatic or data-based bandwidth selection, but rather to local smoothing of the estimated density in order to obtain an improved global estimate. This local smoothing can be achieved by varying the functional form of the kernel or the bandwidth or both. In this work, we will only consider taking the variable bandwidth as a function of the data points.

Let $K_h(t) = h^{-1} K(h^{-1} t)$. For data $\{x_1, \ldots, x_n\}$, the fixed kernel estimator is given by

$$\hat{f}_0(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - x_i), \tag{1}$$
where the kernel, $K$, with finite variance, $\sigma_K^2$, generally satisfies $K \ge 0$, $K(-x) = K(x)$, and $\int K = 1$. The smoothing parameter, $h$, is held constant for all $x \in$