CLIN. CHEM. 31/12, 1974-1978 (1985)
Improved Reference-Interval Estimation

Edward K. Shultz,1 Keith E. Willard,2 Steven S. Rich,2 Donald P. Connelly,2 and Gregory C. Critchfield2

We used two standard methods (the common percentile and the log-power gaussian transformation) and two novel methods (weighted percentile and smoothed spline interpolation) to estimate the 2.5th and 97.5th percentiles of 1000 sets of data from eight diverse distributional forms, generated by Monte Carlo simulation. For each distributional form we derived an estimate of optimal performance. Although none of the four proposed methods closely approximated the optimal performance bound, the spline interpolation method and the weighted percentile method were superior to the two standard methods in accurately estimating percentiles.

Additional Keyphrases: statistics, weighted percentiles, smoothed splines, Monte Carlo simulation
For optimal interpretation of a specific test result, knowledge of the underlying distribution of test values in both healthy and diseased populations is necessary. Such information is commonly used in clinical medicine to establish reference intervals. Traditionally, this involves the estimation of the 2.5th and 97.5th percentile values in an appropriately chosen population sample, and it represents an analytical challenge because one must deal with the tails of the distribution, where data are most sparse. Methods for estimating the 2.5th and 97.5th percentile points in the clinical literature have varied in their simplicity of use, accuracy of estimation, and ability to be applied well across a variety of distributional shapes. One widely used method has been to calculate the mean and variance and to assign the mean ± 1.96 SD as the reference interval. This method, based on the assumption that the distribution of values is gaussian, can result in gross miscalculation of the true percentile points when the distribution of data is nongaussian (1, 2). Alternatively, the percentile (order statistic) method, in which no parametric assumptions are made and not even smoothness of the underlying distribution is assumed, has been proposed as a method of choice (1, 3). However, its relatively large variance (2) on moderate-size data sets has provided motivation for developing better methods. Harris and DeMets (4) suggested several types of multistage-gaussian transformations, in which the reference-interval endpoints are estimated by the mean ± 1.96 SD of the transformed data and are then retransformed back. Reed and Wu (1) pointed out, however, that the method was still sensitive to asymmetric distributions and that the order statistic was therefore a reasonable alternative, depending upon the distributional shape.
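To illustrate the widely used gaussian approach described above, a minimal sketch in Python follows (the function name is ours; the paper's own programs were written in Pascal and Fortran-77):

```python
import statistics

def gaussian_reference_interval(values):
    """Parametric 95% reference interval under a gaussian assumption.

    The central 95% of a normal distribution lies within mean +/- 1.96 SD,
    so the 2.5th and 97.5th percentile points are estimated directly from
    the sample mean and standard deviation.
    """
    m = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return m - 1.96 * sd, m + 1.96 * sd
```

As the text notes, this estimate can be grossly wrong when the data are nongaussian, which is the motivation for the alternative methods compared below.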
In 1982, Boyd and Lacher (5) proposed a "log-power" multistage-gaussian transformation, which they reported to be insensitive to asymmetric distributions, successfully transforming them to gaussian shape.

In this paper, we compare two new methods, not previously reported in the clinical laboratory literature, with the common percentile and the "log-power" multistage-gaussian transformation techniques. We found the new methods superior to the usual recommended methods.
Materials and Methods

We used four methods for estimating the 2.5th and 97.5th percentiles: (a) the common "percentile" (sample order statistic) method, (b) the log-power multistage-gaussian transformation technique (5), (c) the weighted percentile method of Harrell and Davis (6), and (d) spline interpolation with smoothing. We tested these methods on eight distributional shapes previously studied (1, 5). Samples of data were generated from six Johnson (7) and two chi-squared (with two and four degrees of freedom) distributions (Figure 1). For each distribution 1000 data sets of 119 values each were generated. This large number of data sets was required to ensure that the observed differences in performance between methods reflected "real" changes and did not arise from sampling variability. For each of these 1000 data sets
we estimated the 2.5th and 97.5th percentiles by each of the four methods. The choice of 119 (instead of 120) points was made so that estimates by the percentile method could be made without interpolating between values.

[Figure 1 appeared here: frequency curves for the six Johnson distributions (A through F) and the two chi-squared distributions.]
1Department of Pathology, Dartmouth-Hitchcock Medical Center, Hanover, NH 03756. 2Department of Laboratory Medicine and Pathology, University of Minnesota, Minneapolis, MN 55455. Received April 9, 1985; accepted August 26, 1985.
Fig. 1. Underlying frequency distributions used in simulation experiments
The top six curves are from the Johnson SB system, where z = γ + δ ln[y/(1 − y)] and z is a normalized gaussian variable; the bottom two curves represent chi-squared distributions with two and four degrees of freedom
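A synthetic data set from the Johnson system can be drawn by inverting the transformation in the caption: draw a standard gaussian z, then solve z = γ + δ ln[y/(1 − y)] for y. A minimal Python sketch (the paper used IMSL routines for its random deviates; the parameter pair below is illustrative only, not necessarily one of the paper's eight forms):

```python
import math
import random

def johnson_sb_deviate(gamma, delta, rng=random):
    # Invert z = gamma + delta * ln[y / (1 - y)]: draw a standard gaussian z,
    # then map it into (0, 1) with the logistic inverse.
    z = rng.gauss(0.0, 1.0)
    return 1.0 / (1.0 + math.exp(-(z - gamma) / delta))

# One synthetic data set of 119 values, as in the simulation design.
data_set = [johnson_sb_deviate(gamma=0.8, delta=1.2) for _ in range(119)]
```

Because the Johnson SB transformation is bounded, every generated value falls strictly between 0 and 1.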
Percentile (Sample Order Statistic) Method

For each of the 1000 data sets, the 119 randomly generated points were ordered by ascending numerical value. The 2.5th and 97.5th percentile points were estimated by

Q = X[p(n + 1)]

where Q is the percentile estimate at the pth level, p is either 0.025 or 0.975, n is the number of points in the sample, and X[i] is the ith ranked observation. For n = 119, Q0.025 = X[3] and Q0.975 = X[117].
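A minimal Python sketch of this order-statistic rule (the function name is ours; the paper's programs were in Pascal and Fortran-77):

```python
def percentile_order_statistic(sample, p):
    # Q = X[p(n + 1)]: take the observation whose 1-based rank is p(n + 1).
    # With n = 119 the rank is an exact integer: 3 for p = 0.025 and
    # 117 for p = 0.975, which is why 119 points were used.
    x = sorted(sample)
    rank = round(p * (len(x) + 1))  # round guards against float representation error
    return x[rank - 1]
```

For example, on a sorted sample of the integers 1 through 119, this returns the 3rd and 117th values for the two percentile levels.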
Log-Power Multistage-Gaussian Transformation Method

The log-power multistage-gaussian method of Boyd and Lacher (5) was performed as described in their paper. Asymmetry (skew) is removed by the log(x + c) transformation, and kurtosis (the "peakedness" of the distribution) is brought to zero, the same value as that of a gaussian distribution, by the power function |X|^k for X > 0 and −|X|^k for X < 0.

For each method, a root mean squared error (RMSE) was calculated as

RMSE_p = {Σ_i [T_p − Q_p(i)]² / n}^(1/2)

where T_p is the true (known) percentile at the pth level, Q_p(i) is the estimated percentile at the pth level for the ith data set, p is either 0.025 or 0.975, and n is the number of synthetic data sets generated (1000 for each distribution).
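The RMSE computation is straightforward; a Python sketch (again illustrative, not the paper's code):

```python
import math

def rmse(true_percentile, estimates):
    # RMSE_p = sqrt( sum_i [T_p - Q_p(i)]^2 / n ), where the sum runs over
    # the n simulated data sets and T_p is the known true percentile.
    n = len(estimates)
    return math.sqrt(sum((true_percentile - q) ** 2 for q in estimates) / n)
```

A method that always recovered the true percentile exactly would score an RMSE of zero.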
The performance of each method was evaluated by the ratio of the RMSE of the method to the RMSE of the percentile (order statistic) method. This eliminated scaling differences in the RMSEs from one end of a distribution to the other and from one distribution to another. We examined a large number of data sets (1000/distribution) to ensure that the resulting differences in RMSE were not due to sampling variability, and confirmed that this was true by increasing the number of runs until the RMSE ratio stabilized (Figure 2). We chose the percentile method as the method for direct comparison because it is widely used and has been proposed as a method of choice (1, 3). To characterize the statistical efficiency (10) of the four methods, we used a maximum likelihood estimator to approximate the best performance possible (see Appendix). If an explicit form of the underlying distribution is known, then a statistically most likely estimate of the percentile can be calculated from the sample data set. For example, given a set of data known to be gaussian in distribution, the most probable estimate for the 97.5th percentile point is the mean + 1.96 SD. A comparable expression exists for every distribution.
Weighted Percentile Method

The weighted percentile method recently proposed by Harrell and Davis (6) results in a percentile estimate from a weighted summation of all the data points. The weights are calculated from the sample cumulative distribution functions.

Computer programs were developed on a Control Data Corporation CYBER 174 in Pascal and Fortran-77, with graphical output on an Apple IIe and the simulation runs done on a Digital Equipment Corporation (DEC) VAX 11/750. Software from the International Mathematical and Statistical Libraries (IMSL), Houston, TX 77036, was used for calculating weighted smoothed splines, random normal and chi-squared deviates, and cumulative and inverse distribution functions.
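The Harrell-Davis estimator weights the ith order statistic by an increment of a Beta(p(n+1), (1−p)(n+1)) distribution function evaluated at i/n. The sketch below is a stdlib-only illustration: where the paper relied on IMSL library routines, we approximate the regularized incomplete beta function by simple midpoint-rule integration, which is an assumption of this sketch rather than anything in the paper.

```python
import math

def _beta_cdf(x, a, b, steps=2000):
    # Regularized incomplete beta function I_x(a, b), evaluated here by
    # midpoint-rule integration; a library routine would normally be used.
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += math.exp((a - 1.0) * math.log(t)
                          + (b - 1.0) * math.log(1.0 - t) - log_norm)
    return total * h

def harrell_davis(sample, p):
    # Weighted-percentile estimate: every order statistic contributes,
    # with weights given by increments of the Beta(p(n+1), (1-p)(n+1))
    # distribution function at the points i/n.
    x = sorted(sample)
    n = len(x)
    a, b = p * (n + 1), (1.0 - p) * (n + 1)
    estimate = 0.0
    for i in range(1, n + 1):
        w = _beta_cdf(i / n, a, b) - _beta_cdf((i - 1) / n, a, b)
        estimate += w * x[i - 1]
    return estimate
```

Because the weights sum to one, the estimate of any percentile of a constant sample is that constant, and by symmetry the estimated median of 1, 2, ..., 9 is 5.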
[Figure 2 appeared here: the RMSE ratio for the Johnson (B) distribution plotted against the number of data sets analyzed (up to 2000), marking both the number of sets used in the literature and the number we used.]
Fig. 2. Variability of relative performance of the log-power method with the number of simulation runs
The ratio of the RMSE (log-power) to the RMSE (percentile) is an indicator of relative performance. The lower the ratio, the better the log-power performance for the particular set of random numbers generated. Definite conclusions are difficult at a low number of simulation runs (i.e.,

Evaluation of Methods

The true 2.5th and 97.5th percentile points were calculated from the known characteristics of the given distribu-