M. CACCIARI (*) – G. MAZZANTI (**) – G.C. MONTANARI (**) J. JACQUELIN (***)
A robust technique for the estimation of the two-parameter Weibull function for complete data sets
Contents: 1. Introduction. — 2. A new estimator for the shape and scale parameters of the Weibull probability distribution function. - 2.1. The Thiel method. - 2.2. The proposed method. — 3. Estimation of the pivotal percentile value and accuracy of the proposed non-parametric estimators. — 4. Comparison between scale and shape parameter estimates calculated by the non-parametric estimator and other estimators reported in the literature. - 4.1. Comparison relevant to standard simulated samples. - 4.2. Comparison relevant to experimental samples. — 5. Robustness of the proposed method. — 6. Conclusion. References. Summary. Riassunto. Key words.
(*) Dipartimento di Ingegneria dell'Informazione, Università di Parma, area Parco delle Scienze 181/a, 43100 Parma, Italy.
(**) Dipartimento di Ingegneria Elettrica, Università di Bologna, Viale Risorgimento 2, 40136 Bologna, Italy. Tel: 0039-051-2093481. Fax: 0039-051-2093470. E-mail: [email protected].
(***) Alcatel CIT, Marcoussis, France.

1. Introduction
Several methods are available in the literature for estimating the parameters of the Weibull distribution, which is generally used to process the results of electrical breakdown tests performed on solid insulation (Cacciari et al., 1994, Jacquelin, 1997). However, some of these methods have significant limitations, in terms of accuracy, under particular conditions of sample size and shape parameter value (small size and low shape parameter values). Another problem associated with the use of methods for the estimation of distribution parameters concerns the robustness of the
method. The results of the accelerated tests performed to characterize insulating materials and to guide insulation system design can, in fact, be significantly affected by outliers, which often occur in life testing owing, e.g., to imperfections of the test objects and electrical transients in the test assembly. A robust method can help neutralize the effect of outliers on the estimation of the scale and shape parameters, as well as of extreme percentiles.
The goal of this paper is to propose new estimators for the scale and shape parameters of a Weibull probability distribution function (p.d.f.) that provide point estimates which are both efficient (and unbiased) and robust, i.e. less affected than other estimators by anomalous experimental values (outliers). Such estimators are applied to a two-parameter Weibull p.d.f. employed to process breakdown data of materials and insulating systems subjected to electrothermal (or thermomechanical) stress, where the presence of outliers can hide aging processes, thus affecting the inference of diagnostic quantities or degradation mechanisms.
By means of the Monte Carlo simulation method, the distributions of the Weibull parameter estimates and the relevant percentiles are determined, as well as the so-called unbiasing factors, which provide improved estimators of the scale and shape parameters. The same Monte Carlo method is then used to check the efficiency of these improved estimators, taking as the error the deviation between the reference values of the parameters of a Weibull distribution function and the median values of the distributions of the parameter estimates obtained by the proposed estimators. The check compares the deviations observed on simulated samples differing in size and β value.
The comparison is then extended to point estimates of the scale and shape parameters for three experimental samples differing in size and data dispersion, calculated both by the non-parametric method proposed here and by the methods most commonly employed in the literature, i.e. the Maximum-Likelihood Estimate (MLE), the Least-Square Regression (LSR) and the Weighted Least-Square Regression (WLSR) techniques.
List of symbols and notations
a: coefficients for the estimate of the scale parameter by the method of Seki-Yokoyama.
b: coefficients for the estimate of the shape parameter by the method of Seki-Yokoyama.
C: cost index.
F: cumulative probability distribution function.
$F(x, \alpha, \beta)$: Weibull cumulative probability distribution function.
M: number of combinations without repetition of [i, j].
N: sample size.
$p^*$: pivotal percentile.
$p_k$: k-th percentile of the sampling distribution of $\beta_{i,j}$.
p.d.f.: probability distribution function.
q: random number uniformly distributed between 0 and 1.
r: correlation coefficient.
$T_S$: total number of simulations.
$T^*$: inequality index of the Thiel test.
x: random variable (e.g., time-to-failure).
$x_i$: experimental values of the random variable.
$x_i^*$: simulated values of the random variable.
$\alpha$: scale parameter of a Weibull p.d.f.
$\beta$: shape parameter of a Weibull p.d.f.
$\hat{\alpha}$: estimate of $\alpha$.
$\hat{\beta}$: estimate of $\beta$.
$\alpha_S$: reference value of the scale parameter of a Weibull p.d.f.
$\beta_S$: reference value of the shape parameter of a Weibull p.d.f.
MLE: maximum likelihood estimator.
LSR: least-square regression of $X_i = \ln(x_i)$ vs $Y_i = \ln \ln \frac{1}{1-F_i}$.
LSRJ: least-square regression of $Y_i = \ln \ln \frac{1}{1-F_i}$ vs $X_i = \ln(x_i)$.
SY: Seki-Yokoyama method.
WLSR: weighted least-square regression of $X_i = \ln(x_i)$ vs $Y_i = \ln \ln \frac{1}{1-F_i}$ (White method).
$\hat{\alpha}_{S,p}$: non-parametric estimate of the scale parameter of a Weibull p.d.f.
$\hat{\alpha}_{S,D}$: non-parametric pivotal estimate of the scale parameter of a Weibull p.d.f.
$\hat{\alpha}_{S,J}$: non-parametric pivotal estimate of the scale parameter of a Weibull p.d.f. according to Jacquelin.
$\hat{\beta}_{S,p}$: non-parametric estimate of the shape parameter of a Weibull p.d.f.
$\alpha_T$: estimate of the scale parameter of a Weibull p.d.f. according to the Thiel method.
$\beta_T$: estimate of the shape parameter of a Weibull p.d.f. according to the Thiel method.
: sum of the absolute differences between a priori and estimated probabilities.
2. A new estimator for the shape and scale parameters of the Weibull probability distribution function
In complete life tests, with reference to a sample consisting of N solid-insulation specimens, the observable values are the measured values $x_i$ (i = 1, . . . , N) of a random variable x, which can be, e.g., the time-to-failure of specimens subjected to a constant electrothermal and/or thermomechanical stress or, alternatively, the electric strength of specimens subjected to a linearly increasing voltage until breakdown (ramp tests). Once the measured values have been sorted in ascending order, the two-parameter Weibull p.d.f. is generally used to fit experimental data (Cacciari et al., 1994, 1996, Montanari et al., 1998). Its well-known expression is the following:
$$F(x; \alpha, \beta) = 1 - \exp\left[-\left(\frac{x}{\alpha}\right)^{\beta}\right] \tag{1}$$
where α and β are the scale and shape parameters, respectively (both > 0). By associating with each value $x_i$ (i = 1, . . . , N) a suitable estimate of its cumulative probability, $F_i$, it is then possible to plot the pairs $\{x_i, F_i\}$ on a suitable probability paper (Weibull diagram), in which eq. (1) is linearized through an appropriate change of coordinates, so that data scatter with respect to the linear form of eq. (1) can be checked and possible outliers can be singled out. The linear form of eq. (1) is the following:
$$\ln x = \ln \alpha + \frac{1}{\beta} \ln \ln \frac{1}{1-F} \tag{2}$$
Median or mean rank estimators are commonly used for $F_i$ (Fothergill, 1990, Ross, 1996). For example, an approximate estimator of the exact value of $F_i$, derived from the median value of the sampling distribution of $F_i$, is given, according to Filliben (Fothergill, 1990, Jacquelin, 1993), by:
$$F_i = \frac{i - 0.3175}{N + 0.365} \tag{3}$$
with $F_1 = 1 - 0.5^{1/N}$ and $F_N = 0.5^{1/N}$.
Several techniques are available to estimate parameters α and β. Among these, the most popular are MLE and LSR. However, each technique has advantages and disadvantages, in terms of accuracy (linked to β values and sample size (Cacciari et al., 1996)), ease of implementation and sensitivity to outliers (robustness). In the following, a non-parametric method, called the Thiel method, is examined and a new procedure for improving precision and robustness is proposed.
2.1. The Thiel method
Referring to the regressive method of Thiel (Sprent, 1989, Birkes and Dodge, 1993), an estimate of parameters α and β can be obtained by means of the following non-parametric regression procedure. The pairs of values $\{\ln x_i, \ln[\ln(\frac{1}{1-F_i})]\}$ are sorted in ascending order. Then, considering all the combinations without repetition of pairs {i, j} of such points (with i = 1, 2, . . . , N − 1 and j = i + 1, i + 2, . . . , N), a tentative value of the shape parameter, $\beta_{i,j}$, is calculated for each pair from the following expression:
$$\beta_{i,j} = \frac{\ln \ln \frac{1}{1-F_j} - \ln \ln \frac{1}{1-F_i}}{\ln x_j - \ln x_i} \tag{4}$$
Since the total number of combinations without repetition of pairs {i, j} is M = N(N − 1)/2, this is also the number of $\beta_{i,j}$ values to be calculated by means of eq. (4). Such M values are then sorted in ascending order and the median value of the so-obtained distribution is chosen as point estimate of the shape parameter, according to Thiel (Sprent, 1989, Castillo and Hadi, 1995, Jacquelin, 1997):
$$\beta_T = \mathrm{median}[\beta_{i,j}] \tag{5}$$
It is now possible to calculate N tentative values of the scale parameter, $\alpha_i$, i = 1, . . . , N (needed in order to achieve the estimate of scale parameter α), as follows:
$$\alpha_i = \frac{x_i}{[-\ln(1 - F_i)]^{1/\beta_T}} \tag{6}$$
Then, the $\alpha_i$ values are sorted in ascending order and the median value of the so-obtained sampling distribution of α is chosen as estimate of the scale parameter, according to Thiel:
$$\alpha_T = \mathrm{median}[\alpha_i] \tag{7}$$
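The procedure of eqs. (3)-(7) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the default tie value `beta_tie` and the test data are assumptions introduced here, and the Filliben ranks of eq. (3) are used for $F_i$.

```python
import math
from itertools import combinations
from statistics import median


def filliben_rank(i, n):
    """Median-rank approximation F_i of eq. (3), including the two end values."""
    if i == 1:
        return 1.0 - 0.5 ** (1.0 / n)
    if i == n:
        return 0.5 ** (1.0 / n)
    return (i - 0.3175) / (n + 0.365)


def thiel_estimate(data, beta_tie=100.0):
    """Return (alpha_T, beta_T) following eqs. (4)-(7)."""
    x = sorted(data)
    n = len(x)
    f = [filliben_rank(i, n) for i in range(1, n + 1)]
    y = [math.log(-math.log(1.0 - fi)) for fi in f]  # ln ln[1/(1 - F_i)]
    lx = [math.log(v) for v in x]
    slopes = []
    for i, j in combinations(range(n), 2):
        dx = lx[j] - lx[i]
        # Tied observations make eq. (4) undefined; a large value is used instead.
        slopes.append((y[j] - y[i]) / dx if dx > 0.0 else beta_tie)
    beta_t = median(slopes)                                    # eq. (5)
    alphas = [x[i] / (-math.log(1.0 - f[i])) ** (1.0 / beta_t)
              for i in range(n)]                               # eq. (6)
    return median(alphas), beta_t                              # eq. (7)


# Hypothetical data lying exactly on a Weibull line (alpha = 2, beta = 1.5).
exact = [2.0 * (-math.log(1.0 - filliben_rank(i, 6))) ** (1.0 / 1.5)
         for i in range(1, 7)]
alpha_t, beta_t = thiel_estimate(exact)
```

For data lying exactly on the line of eq. (2), every pairwise slope equals β, so the sketch returns the generating parameters; real samples scatter around the line and the medians do the smoothing.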
It must be noted that in the cases (not uncommon in breakdown testing) when $x_i = x_j$, so that the denominator of eq. (4) vanishes, a very high value of $\beta_{i,j}$ (e.g. $\beta_{i,j} = 100$) can be used in place of eq. (4), in order to achieve estimates of $\alpha_T$ and $\beta_T$.
2.2. The proposed method
In order to obtain efficient point estimates of the two parameters of the Weibull distribution for samples of small size N (N ≤ 10), even in the presence of anomalous experimental values of the random variable (outliers) (see Lecontre and Tassi, 1987), the Thiel method is modified as follows.
Let us suppose that the size N of the considered sample is fixed and that the reference values, $\alpha_S$ and $\beta_S$, are the Weibull parameter values of the population to which the considered sample belongs. N cumulative probability values $q_i$ (i = 1, . . . , N), randomly and uniformly distributed between 0 and 1, are generated by the Monte Carlo procedure (Montanari et al., 1995), so that a simulated sample can be obtained, composed of N values $x_i^*$ (i = 1, . . . , N) of the random variable equal to:
$$x_i^* = \alpha_S [-\ln(1 - q_i)]^{1/\beta_S} \tag{8}$$
Then, the $x_i^*$ values are sorted in ascending order and the corresponding estimates of the Weibull parameters, $\alpha_i$ and $\beta_{i,j}$, for the simulated sample are obtained by the regressive method of Thiel. A total number of $T_S$ simulations is performed according to the above procedure. If $T_S$ is large enough (typically 10000 or more), the sorted values of $\alpha_{i,s}$ and $\beta_{i,j,s}$, respectively, constitute a good approximation of the relevant estimate sampling distributions. It can be observed (as confirmed by the Tables in the following) that the median value of the distribution of $\alpha_{i,s}$ thus derived is fairly close to the reference value $\alpha_S$. Hence, it can be said that an accurate estimate of $\alpha_S$ is equal to:
$$\hat{\alpha}_S = \mathrm{median}[\alpha_{i,s}] \tag{9}$$
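The simulation loop described above can be sketched as follows. This is a minimal sketch under stated assumptions: the tentative values $\alpha_{i,s}$ and $\beta_{i,j,s}$ of all $T_S$ simulated samples are pooled before taking medians (one reading of the text), Filliben ranks are used for $F_i$, and the function names, seed and number of simulations are illustrative choices.

```python
import math
import random
from itertools import combinations
from statistics import median


def filliben_rank(i, n):
    """Median-rank approximation F_i of eq. (3)."""
    if i == 1:
        return 1.0 - 0.5 ** (1.0 / n)
    if i == n:
        return 0.5 ** (1.0 / n)
    return (i - 0.3175) / (n + 0.365)


def open_uniform(rng):
    """Uniform random number strictly inside (0, 1)."""
    q = rng.random()
    while q == 0.0:
        q = rng.random()
    return q


def simulate_thiel(alpha_s, beta_s, n, t_s, seed=0):
    """Pooled medians of the tentative alpha_i and beta_ij over t_s samples."""
    rng = random.Random(seed)
    g = [-math.log(1.0 - filliben_rank(i, n)) for i in range(1, n + 1)]
    y = [math.log(gi) for gi in g]
    pairs = list(combinations(range(n), 2))
    pooled_alpha, pooled_beta = [], []
    for _ in range(t_s):
        # Simulated sample of eq. (8), sorted in ascending order.
        x = sorted(alpha_s * (-math.log(1.0 - open_uniform(rng))) ** (1.0 / beta_s)
                   for _ in range(n))
        lx = [math.log(v) for v in x]
        betas = [(y[j] - y[i]) / (lx[j] - lx[i]) if lx[j] > lx[i] else 100.0
                 for i, j in pairs]                      # eq. (4)
        beta_t = median(betas)                           # eq. (5)
        pooled_beta.extend(betas)
        pooled_alpha.extend(x[i] / g[i] ** (1.0 / beta_t) for i in range(n))  # eq. (6)
    return median(pooled_alpha), median(pooled_beta)


alpha_med, beta_med = simulate_thiel(alpha_s=1.0, beta_s=1.0, n=5, t_s=3000, seed=1)
```

With these settings the pooled median of the scale values should fall close to the reference value, while the pooled median of the shape values tends to sit above it, which is the behaviour the pivotal percentile is meant to correct.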
The application of the Monte Carlo procedure shows, indeed, that the 50-th percentile of the distribution of $\alpha_{i,s}$ is the best estimate of $\alpha_S$. Therefore, defining as pivotal value $p^*$ for a parameter the percentile of the relevant estimate distribution that best approximates the parameter, $p^* = 50\%$ can be taken as the pivotal value for the scale parameter (Jacquelin, 1996a, 1996b).
As far as the shape parameter is concerned, it can be observed, on the contrary, that the median value of the sampling distribution of $\beta_{i,j,s}$, namely
$$\beta_{s,50\%} = \mathrm{median}[\beta_{i,j,s}] \tag{10}$$
can be quite far from $\beta_S$. Thus, the pivotal percentile for $\beta_S$ differs from the 50-th. In this case, $p^*$ is related to sample size, but not to the value of $\beta_S$ (as will be shown later). Hence, in order to achieve the best estimation of the shape parameter, the values of $p^*$ must be known as a function of sample size, N. Such values are collected in the Tables of Section 3.
Efficient point estimates of the shape parameter (and estimates of the scale parameter alternative to (9)) can be obtained according to the following procedure. The M values $\beta_{i,j}$, derived by means of eq. (4) applied to the experimental values $x_i$, are sorted in ascending order. Among such ranked values, the one whose rank is the largest integer K such that $K < p^*(M + 1)$ (where $p^*$ is the above-discussed pivotal value of the shape parameter, derived in the next Section and reported in Table 2) is assumed as the (non-parametric) shape parameter estimate, $\hat{\beta}_{S,p}$:
$$\hat{\beta}_{S,p} = \mathrm{pivotal}[\beta_{i,j}] \tag{11}$$
An efficient estimate $\hat{\beta}_{S,p}$ is particularly important in the presence of experimental samples of small size and with outlier data. Small sample size also means a low number, M, of $\beta_{i,j}$ values; hence $\hat{\beta}_{S,p}$ can be conveniently estimated by means of a linear interpolation between the two adjacent $\beta_{i,j}$ values whose rank numbers, k and k + 1, bracket the value $p^*(M + 1)$, as follows:
$$\hat{\beta}_{S,p} = \beta_{i,j,(k)} + \left[\beta_{i,j,(k+1)} - \beta_{i,j,(k)}\right] \frac{p_{k+1} - p^*}{p_{k+1} - p_k} \tag{12}$$
where the generic percentile $p_k$ (k = 1, . . . , M) is expressed as:
$$p_k = \frac{k}{M + 1} \tag{13}$$
Once $\hat{\beta}_{S,p}$ is known, an estimate of the scale parameter, $\hat{\alpha}_{S,p}$, can be attained in a direct way, i.e. without resorting to the relevant sampling distribution (eq. (9)), by means of the N values of the scale parameter, $\hat{\alpha}_{S,i}$ (i = 1, . . . , N), that is:
$$\hat{\alpha}_{S,i} = \frac{x_i}{[-\ln(1 - F_i)]^{1/\hat{\beta}_{S,p}}} \tag{14}$$
and assuming as efficient point estimate the median value of the $\hat{\alpha}_{S,i}$ values (as supported in the next Section), namely:
$$\hat{\alpha}_{S,p} = \mathrm{median}[\hat{\alpha}_{S,i}] \tag{15}$$
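A minimal sketch of the rank-selection form of the proposed estimator (eqs. (11), (14) and (15)) is given below. It is illustrative only: the `P_STAR` dictionary copies a subset of the $p^*$ values of Table 2, the function names are assumptions, and the interpolation of eq. (12) is deliberately left out.

```python
import math
from itertools import combinations
from statistics import median

# Pivotal percentiles p*(beta) for N = 5..10, copied from Table 2.
P_STAR = {5: 0.4510, 6: 0.4509, 7: 0.4529, 8: 0.4551, 9: 0.4562, 10: 0.4571}


def filliben_rank(i, n):
    """Median-rank approximation F_i of eq. (3)."""
    if i == 1:
        return 1.0 - 0.5 ** (1.0 / n)
    if i == n:
        return 0.5 ** (1.0 / n)
    return (i - 0.3175) / (n + 0.365)


def pivotal_rank(p_star, m):
    """Largest integer K with K < p*(M + 1), clipped to the range 1..M."""
    k = math.ceil(p_star * (m + 1)) - 1
    return min(max(k, 1), m)


def pivotal_estimate(data, beta_tie=100.0):
    """Return (alpha_Sp, beta_Sp) following eqs. (11), (14) and (15)."""
    x = sorted(data)
    n = len(x)
    g = [-math.log(1.0 - filliben_rank(i, n)) for i in range(1, n + 1)]
    y = [math.log(gi) for gi in g]
    lx = [math.log(v) for v in x]
    betas = sorted((y[j] - y[i]) / (lx[j] - lx[i]) if lx[j] > lx[i] else beta_tie
                   for i, j in combinations(range(n), 2))      # eq. (4), ranked
    beta_sp = betas[pivotal_rank(P_STAR[n], len(betas)) - 1]   # eq. (11)
    alphas = [x[i] / g[i] ** (1.0 / beta_sp) for i in range(n)]  # eq. (14)
    return median(alphas), beta_sp                             # eq. (15)


# Hypothetical data lying exactly on a Weibull line (alpha = 2, beta = 1.5).
exact = [2.0 * (-math.log(1.0 - filliben_rank(i, 5))) ** (1.0 / 1.5)
         for i in range(1, 6)]
alpha_sp, beta_sp = pivotal_estimate(exact)
```

For N = 5 (M = 10) the selected rank is K = 4, since $p^*(M + 1) = 0.4510 \times 11 \approx 4.96$.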
In the next section, the accuracy of estimates βˆS, p and αˆ S, p is verified. The latter is compared also with another estimator of the scale parameter, called non-parametric pivotal, αˆ S,D , that can be implemented more quickly than through eq. (14):
$$\hat{\alpha}_{S,D} = \left[\frac{\sum_{i=1}^{N} x_i^{\hat{\beta}_{S,p}}}{\sum_{i=1}^{N} [-\ln(1 - F_i)]}\right]^{1/\hat{\beta}_{S,p}} \tag{16}$$
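The non-parametric pivotal estimator can be sketched as below. The sketch assumes that eq. (16) is the ratio of $\sum x_i^{\hat{\beta}_{S,p}}$ to $\sum [-\ln(1 - F_i)]$, raised to the power $1/\hat{\beta}_{S,p}$; this reading of the printed formula, the function names and the test data are assumptions made for illustration.

```python
import math


def filliben_rank(i, n):
    """Median-rank approximation F_i of eq. (3)."""
    if i == 1:
        return 1.0 - 0.5 ** (1.0 / n)
    if i == n:
        return 0.5 ** (1.0 / n)
    return (i - 0.3175) / (n + 0.365)


def alpha_sd(data, beta_hat):
    """Scale estimate under the assumed reading of eq. (16)."""
    x = sorted(data)
    n = len(x)
    numerator = sum(v ** beta_hat for v in x)
    denominator = sum(-math.log(1.0 - filliben_rank(i, n)) for i in range(1, n + 1))
    return (numerator / denominator) ** (1.0 / beta_hat)


# Hypothetical data lying exactly on a Weibull line (alpha = 2, beta = 1.5).
exact = [2.0 * (-math.log(1.0 - filliben_rank(i, 5))) ** (1.0 / 1.5)
         for i in range(1, 6)]
alpha_d = alpha_sd(exact, beta_hat=1.5)
```

Unlike eqs. (14)-(15), no sorting of tentative values or median search is needed, which is why this form is quicker to implement.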
The proposed estimators, βˆS, p , αˆ S, p , as well as the couple, βˆS, p , αˆ S,D , can be employed also in the presence of experimental data samples of the grouped type, since it is hypothesized that the pivotal value, p ∗ , even if determined through the sampling distributions obtained by simulative methods for singly-distributed values, remains constant. The estimates obtained from eqns. (15) and (16) provide very close values, as shown in the following Section, with slightly better accuracy for eq. (16).
3. Estimation of the pivotal percentile value and accuracy of the proposed non-parametric estimators
As anticipated in the previous Section, the pivotal percentile $p^*$ is not related to the shape parameter value, but depends on sample size, N, and on the expression chosen to estimate $F_i$. On the contrary, the pivotal value relevant to the non-parametric estimate of the scale parameter is always equal or very close to 50%.
In order to obtain tables that report the $p^*$ values for the shape parameter and, thus, verify the efficiency and accuracy of the proposed method, the percentiles of the distributions of $\alpha_i$ and $\beta_{i,j}$ were calculated for Weibull p.d.f. families with equal reference value of the scale parameter, $\alpha_S = 1$, and different reference values of the shape parameter, $\beta_S$, namely 0.5, 1, 10, with two different sample sizes, i.e. N = 5 and N = 10. In order to determine the values of $F_i$ that appear in the expressions for the calculation of $\alpha_i$ and $\beta_{i,j}$, in addition to the above-mentioned Filliben estimator (eq. (3)), the so-called Benard estimator (Fothergill, 1990, Jacquelin, 1993) was used, i.e.:
$$F_i = \frac{i - 0.30}{N + 0.40} \tag{17}$$
together with the exact cumulative probability estimator, which can be obtained by determining the median value of the so-called beta-function, as suggested in the literature (Fothergill, 1990, Jacquelin, 1993).
The median values of the distributions of $\alpha_i$ and $\beta_{i,j}$, α(50%) and β(50%), obtained for the above-listed values of N and $\alpha_S$, $\beta_S$ by means of the Monte Carlo procedure, are reported in Table 1, together with the pivotal value for the shape parameter, $p^*(\beta)$. Tables 1.A, 1.B and 1.C are relevant, respectively, to the Benard, Filliben and beta-function-based cumulative probability estimators. Due to computing time constraints, the number of Monte Carlo simulations, $T_S$, has been limited to a maximum of 30000. Table 1 confirms what was previously anticipated, that is, $p^*$ is a function of N and does not depend on $\beta_S$ (and $\alpha_S$).
It is important to observe that the pivotal values of β obtained by the Monte Carlo method do not vary significantly with the chosen cumulative probability estimator. The values of p ∗ (β) obtained by the Monte Carlo procedure, but with TS = 100000, are reported in Table 2 as a function of sample size N , using the median unbiased estimator of Fi (eq. (3)).
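One possible way to obtain $p^*(\beta)$ numerically is sketched below. It assumes that the pivotal percentile is the fraction of pooled simulated $\beta_{i,j,s}$ values lying below the reference value $\beta_S$; this interpretation, the function names, the seed and the reduced number of simulations are assumptions made for illustration, not the authors' code.

```python
import math
import random
from itertools import combinations


def filliben_rank(i, n):
    """Median-rank approximation F_i of eq. (3)."""
    if i == 1:
        return 1.0 - 0.5 ** (1.0 / n)
    if i == n:
        return 0.5 ** (1.0 / n)
    return (i - 0.3175) / (n + 0.365)


def open_uniform(rng):
    """Uniform random number strictly inside (0, 1)."""
    q = rng.random()
    while q == 0.0:
        q = rng.random()
    return q


def estimate_p_star(n, beta_s=1.0, t_s=2000, seed=0):
    """Fraction of pooled beta_ij values below beta_S (alpha_S = 1)."""
    rng = random.Random(seed)
    y = [math.log(-math.log(1.0 - filliben_rank(i, n))) for i in range(1, n + 1)]
    pairs = list(combinations(range(n), 2))
    below = 0
    for _ in range(t_s):
        # Logarithm of the simulated sample of eq. (8), sorted ascending.
        lx = sorted(math.log(-math.log(1.0 - open_uniform(rng))) / beta_s
                    for _ in range(n))
        for i, j in pairs:
            dx = lx[j] - lx[i]
            if dx > 0.0 and (y[j] - y[i]) / dx < beta_s:
                below += 1
    return below / (t_s * len(pairs))


p5 = estimate_p_star(5, beta_s=1.0, t_s=3000, seed=2)
p5_steep = estimate_p_star(5, beta_s=10.0, t_s=3000, seed=2)
```

Running the sketch with two different values of $\beta_S$ and the same seed illustrates why the result does not depend on the shape parameter: every simulated slope scales with $\beta_S$, so the comparison against $\beta_S$ is unchanged.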
Table 1: Median values of the distributions of $\alpha_i$ and $\beta_{i,j}$ estimates and pivotal value of the shape parameter for different values of N and $\beta_S$ ($\alpha_S = 1$). A) Benard probability estimator; B) Filliben probability estimator; C) beta-function-based probability estimator. $T_S$ = 30000.

A) Benard probability estimator
N     M     βS     p*(β)       β(50%)      α(50%)
5     10    0.5    0.453435    0.542161    0.968874
5     10    1      0.453435    1.084322    0.984314
5     10    10     0.453435    10.84322    0.998420
10    45    0.5    0.465815    0.522735    0.976026
10    45    1      0.465815    1.045470    0.987940
10    45    10     0.465815    10.45470    0.998787

B) Filliben probability estimator
N     M     βS     p*(β)       β(50%)      α(50%)
5     10    0.5    0.452055    0.543369    0.967061
5     10    1      0.452055    1.086738    0.983393
5     10    10     0.452055    10.86738    0.998327
10    45    0.5    0.463315    0.524438    0.973820
10    45    1      0.463315    1.048876    0.986823
10    45    10     0.463315    10.48876    0.998674

C) Beta-function-based probability estimator
N     M     βS     p*(β)       β(50%)      α(50%)
5     10    0.5    0.452215    0.543221    0.967130
5     10    1      0.452215    1.086443    0.983427
5     10    10     0.452215    10.86443    0.998330
10    45    0.5    0.463037    0.524619    0.973499
10    45    1      0.463037    1.049239    0.986661
10    45    10     0.463037    10.49239    0.998658
On the other hand, the values of p ∗ (β) can be affected by the way the Monte Carlo procedure is carried out, particularly by the fact that samples that exhibit a low correlation coefficient in Weibull plot can be excluded or not from the TS simulations (such samples are always present, but their number is usually very low) (Cacciari et al., 1996, Abernethy, 1996). The same samples employed for the estimation of p ∗ were also used to derive α and β estimates, in terms of expected values and percentiles of the relevant sampling distributions (resorting to eq. (11) for the shape parameter and to eqns. (15) and (16) for the scale parameter). Table 3.A shows the percentiles 5%, 50%, 95% of the sampling distributions of αˆ S, p , βˆS, p (eqns. (15) and (11)) and of the pivotal
Table 2: Values of the pivotal percentile $p^*$ of the shape parameter for different values of N derived by means of the Monte Carlo method ($T_S$ = 100000).

N     M      p*(β)
5     10     0.4510
6     15     0.4509
7     21     0.4529
8     28     0.4551
9     36     0.4562
10    45     0.4571
11    55     0.4589
12    66     0.4596
13    78     0.4610
14    91     0.4616
15    105    0.4623
16    120    0.4625
17    136    0.4635
18    153    0.4645
19    171    0.4648
20    190    0.4652
statistic $(\hat{\alpha}_{S,p}/\alpha_S)^{\hat{\beta}_{S,p}}$ obtained for the Weibull p.d.f. families already considered in Table 1 (i.e. those characterized by: $\alpha_S = 1$; $\beta_S$ = 0.5, 1, 10; N = 5, 10), by means of the Monte Carlo procedure with $T_S$ = 30000. The relevant expected values are also reported. Table 3.B is the same as Table 3.A, except that $\hat{\alpha}_{S,D}$ (eq. (16)) was used in the calculations instead of $\hat{\alpha}_{S,p}$. These tables serve the same purpose as those provided in (Mann et al., 1974) for the BLIE method.
The bias in the estimate of a parameter is defined, in the classical sense, as the difference between the parameter reference value and the expected value of the relevant distribution of parameter estimates, often normalized with respect to the reference value. Then, for the proposed method (which in Table 3.A is called simply the non-parametric method) the bias on the shape parameter ranges between 19.1% (when N = 5) and 5.7% (when sample size is doubled). The bias on the scale parameter, calculated according to the non-parametric method (eq. (15)) and the non-parametric pivotal method (eq. (16)), exhibits for N = 5 a maximum equal to 47% (when $\beta_S$ = 0.5), which decreases to 0.3% or less when $\beta_S$ = 10. The maximum biases on α are roughly halved when sample size is doubled. For some authors (Montanari et al., 1998), since anomalous values are usually encountered in the tails of the sampling distributions of
Table 3A: Percentiles 5%, 50%, 95% of the distributions of the $\hat{\alpha}_{S,p}$, $\hat{\beta}_{S,p}$ estimates (non-parametric estimators: eqns. (15) and (11)) and of the statistic $(\hat{\alpha}_{S,p}/\alpha_S)^{\hat{\beta}_{S,p}}$ obtained for the Weibull p.d.f. families already considered in Table 1 by means of the Monte Carlo procedure ($T_S$ = 30000).

Scale parameter estimate α̂(S,p)
βS     N     5%          50%         E           95%
0.5    5     0.184518    1.031581    1.469315    4.207839
1      5     0.429555    1.015668    1.100302    2.051302
10     5     0.918971    1.001556    0.999896    1.074491
0.5    10    0.316213    1.033510    1.246718    2.898774
1      10    0.562328    1.016617    1.060520    1.702579
10     10    0.944059    1.001649    1.000962    1.054656

Shape parameter estimate β̂(S,p)
βS     N     5%          50%         E           95%
0.5    5     0.255785    0.496208    0.595628    1.254270
1      5     0.511569    0.992415    1.191256    2.508539
10     5     5.115690    9.924154    11.91256    25.08539
0.5    10    0.319423    0.494658    0.528457    0.858165
1      10    0.638845    0.989317    1.056915    1.716331
10     10    6.388452    9.893169    10.56915    17.16331

Pivotal statistic
βS     N     5%          50%         E           95%
0.5    5     0.403844    1.013278    3.55e+10    2.792143
1      5     0.403844    1.013278    3.55e+10    2.792143
10     5     0.403844    1.013278    3.55e+10    2.792143
0.5    10    0.578185    1.016832    1.108532    1.917251
1      10    0.578185    1.016832    1.108532    1.917251
10     10    0.578185    1.016832    1.108532    1.917251

Table 3B: Same as Table 3.A, but with $\hat{\alpha}_{S,D}$ (eq. (16)) in place of $\hat{\alpha}_{S,p}$ (non-parametric pivotal estimators).

Scale parameter estimate α̂(S,D)
βS     N     5%          50%         E           95%
0.5    5     0.161065    0.982857    1.443354    4.275613
1      5     0.401329    0.991392    1.082260    2.067756
10     5     0.912746    0.999136    0.997387    1.075350
0.5    10    0.295337    0.991928    1.204037    2.822808
1      10    0.542601    0.994902    1.038764    1.679495
10     10    0.940538    0.999398    0.998553    1.053117

Shape parameter estimate β̂(S,p)
βS     N     5%          50%         E           95%
0.5    5     0.255785    0.496208    0.595628    1.254270
1      5     0.511569    0.992415    1.191256    2.508539
10     5     5.115690    9.924154    11.91256    25.08539
0.5    10    0.319423    0.494658    0.528457    0.858165
1      10    0.638845    0.989317    1.056915    1.716331
10     10    6.388452    9.893169    10.56915    17.16331

Pivotal statistic (computed with α̂(S,D) in place of α̂(S,p))
βS     N     5%          50%         E           95%
0.5    5     0.34766     0.99178     1.940626    2.58541
1      5     0.34766     0.99178     1.940626    2.58541
10     5     0.34766     0.99178     1.940626    2.58541
0.5    10    0.54216     0.99623     1.071395    1.83700
1      10    0.54161     0.99528     1.070277    1.83495
10     10    0.54113     0.99421     1.069266    1.83349
the Weibull parameters obtained by the above-described Monte Carlo simulative procedure, it is preferable to assume, as error index, the difference (possibly normalized with respect to the reference value) between the median value of the sampling distribution of the parameter estimates and the reference value of the parameter. For other authors (Montanari et al., 1998), it is preferable to evaluate the bias as the ratio between the difference of the 95% and 5% percentiles, derived from the sampling distribution of the parameter estimates, and the reference value.
According to the former approach, the results reported in Table 3 seem to confirm the efficiency of estimator $\hat{\beta}_{S,p}$. Indeed, the 50% percentile of the relevant sampling distribution (obtained by the Monte Carlo procedure) differs from the reference value by ∼0.7% for sample size N = 5, while the difference increases slightly, to 1.1%, for N = 10. As far as the relative error based on the difference between the 95% and 5% percentiles of the $\hat{\beta}_{S,p}$ estimate distribution is concerned, it remains constant as $\beta_S$ changes, and decreases from about 200% to about 100% as sample size is doubled (from N = 5 to N = 10).
Dealing now with the accuracy of the scale parameter estimate, evaluated on the basis of the median-based criterion described above (the ratio of the difference between the median of the distribution of the parameter estimate and the relevant reference value over the reference value itself), the results reported in Table 3 for $\hat{\alpha}_{S,p}$ and $\hat{\alpha}_{S,D}$ seem to support the validity of these estimators. Indeed, the latter exhibits a maximum percent error equal to 1.7% for N = 5 and $\beta_S$ = 0.5, and the error decreases when sample size is doubled (for the same $\beta_S$) and/or $\beta_S$ is raised (for the same N), with a minimum of 0.06% when N = 10 and $\beta_S$ = 10.
In the case of $\hat{\alpha}_{S,p}$, the error depends only on $\beta_S$ (it is approximately the same for N = 5 and N = 10) and decreases from 3% to 0.2% when $\beta_S$ is raised from 0.5 to 10.
In order to extend the investigation on the validity of the proposed estimators, i.e. $\hat{\beta}_{S,p}$ and $\hat{\alpha}_{S,p}$, $\hat{\alpha}_{S,D}$, it is interesting to compare the sampling distributions reported in Table 3 (i.e. those relevant to $\hat{\alpha}_{S,p}$, $\hat{\alpha}_{S,D}$, $\hat{\beta}_{S,p}$ and the statistics $(\hat{\alpha}_{S,p}/\alpha_S)^{\hat{\beta}_{S,p}}$, $(\hat{\alpha}_{S,D}/\alpha_S)^{\hat{\beta}_{S,D}}$) with the sampling distributions of the same statistics obtained by the Monte Carlo procedure using estimators of Weibull parameters commonly reported in the literature, like the Maximum Likelihood Estimator and the Least Square Regression estimator (Jacquelin, 1996c). This comparison is the subject of the next section.
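The three error indexes discussed in this Section (classical bias, median-based error and percentile spread) reduce to simple arithmetic on the tabulated values. The sketch below, whose function and key names are illustrative, applies them to the row of Table 3.A for $\hat{\beta}_{S,p}$ with N = 5 and $\beta_S$ = 1.

```python
def error_indexes(reference, expected, p05, p50, p95):
    """Relative error indexes of an estimator with respect to the reference value."""
    return {
        # Classical bias: expected value vs reference value.
        "bias": abs(expected - reference) / reference,
        # Median-based error: 50% percentile vs reference value.
        "median_error": abs(p50 - reference) / reference,
        # Spread: difference between the 95% and 5% percentiles.
        "spread": (p95 - p05) / reference,
    }


# Values of the shape parameter estimate from Table 3.A (N = 5, beta_S = 1).
idx = error_indexes(reference=1.0, expected=1.191256,
                    p05=0.511569, p50=0.992415, p95=2.508539)
```

For this row the classical bias is about 19%, the median-based error is below 1% and the spread is close to 200%, which shows how strongly the choice of index affects the apparent accuracy of the same estimator.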
4. Comparison between scale and shape parameter estimates calculated by the non-parametric estimator and other estimators reported in the literature
The most commonly used methods for the calculation of the scale and shape parameters of the Weibull distribution are the Least Square Regression of $X_i = \ln(x_i)$ vs $Y_i = \ln[\ln(\frac{1}{1-F_i})]$, i = 1, . . . , N (LSR), the Weighted Least Square Regression (WLSR), the Least Square Regression of $Y_i$ vs $X_i$ (LSRJ), and the Maximum Likelihood Estimator (MLE). In this work, in order to extend the comparison, the estimator of Seki and Yokoyama, which can be classified among the robust estimators (Seki and Yokoyama, 1996), is also used for Weibull parameter estimation of experimental samples.
According to Abernethy (1996), the estimates of the scale and shape parameters are determined by means of the regression of $X_i = \ln(x_i)$ vs $Y_i = \ln[\ln(\frac{1}{1-F_i})]$ (see Lawless, 1982, IEEE guide, 1999). Thus:
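$$\frac{1}{\hat{\beta}_{LSR}} = \frac{\sum_{i=1}^{N} w_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} w_i (Y_i - \bar{Y})^2} \tag{18}$$
$$\hat{\alpha}_{LSR} = \exp\left(\bar{X} - \frac{\bar{Y}}{\hat{\beta}_{LSR}}\right) \tag{19}$$
where $\bar{X}$ and $\bar{Y}$ are equal to:
$$\bar{X} = \frac{\sum_{i=1}^{N} w_i X_i}{\sum_{i=1}^{N} w_i} \tag{20}$$
$$\bar{Y} = \frac{\sum_{i=1}^{N} w_i Y_i}{\sum_{i=1}^{N} w_i} \tag{21}$$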
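The regression of $X_i$ on $Y_i$ with weighted means can be sketched as follows. The sketch is illustrative: the function name is an assumption, Filliben ranks (eq. (3)) are used for $F_i$, and the weights are passed in by the caller (unit weights give the unweighted LSR; tabulated WLSR weights are not reproduced here).

```python
import math


def filliben_rank(i, n):
    """Median-rank approximation F_i of eq. (3)."""
    if i == 1:
        return 1.0 - 0.5 ** (1.0 / n)
    if i == n:
        return 0.5 ** (1.0 / n)
    return (i - 0.3175) / (n + 0.365)


def lsr_estimate(data, weights=None):
    """Return (alpha, beta) from the weighted regression of X = ln x on Y."""
    x = sorted(data)
    n = len(x)
    w = [1.0] * n if weights is None else list(weights)
    X = [math.log(v) for v in x]
    Y = [math.log(-math.log(1.0 - filliben_rank(i, n))) for i in range(1, n + 1)]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, X)) / sw          # weighted mean of X
    y_bar = sum(wi * yi for wi, yi in zip(w, Y)) / sw          # weighted mean of Y
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, X, Y))
    s_yy = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, Y))
    inv_beta = s_xy / s_yy                                     # slope of X on Y, 1/beta
    alpha = math.exp(x_bar - y_bar * inv_beta)                 # intercept gives ln(alpha)
    return alpha, 1.0 / inv_beta


# Hypothetical data lying exactly on a Weibull line (alpha = 2, beta = 1.5).
exact = [2.0 * (-math.log(1.0 - filliben_rank(i, 6))) ** (1.0 / 1.5)
         for i in range(1, 7)]
alpha_lsr, beta_lsr = lsr_estimate(exact)
```

Regressing X on Y treats the plotting positions as error-free and the measured values as the noisy variable, which is the convention the text attributes to Abernethy.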
and the weights, $w_i$, can be equal to 1 (unweighted LSR) or to $1/\mathrm{Var}\{\ln[\ln(\frac{1}{1-F_i})]\}$ (weighted LSR, WLSR). The latter quantities are tabulated in (White, 1969, IEEE Guide, 1999). The difference between LSR and WLSR is due not only to the weighting factors, but also to the choice of cumulative probability estimators for the calculation of $F_i$. Indeed, LSR estimates are based on cumulative probability estimators like the previously reported ones of Filliben (eq. (3)) and of Benard (eq. (17)), or alternatively on the exact estimator derived from the incomplete-beta function, whereas WLSR employs either the expected value of $F_i$ (tabulated in (White, 1969) analogously to the weighting factors), or an approximate mean-rank estimate equal to (Ross, 1994, 1996):
$$F_i = \frac{i - 0.44}{N + 0.25} \tag{22}$$
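According to other authors (Nelson, 1982), the estimates of the shape and scale parameters can be obtained by the regression of $Y_i$ vs $X_i$. Such estimates, indicated as $\hat{\alpha}_{LSRJ}$, $\hat{\beta}_{LSRJ}$, are thus equal to:
$$\hat{\beta}_{LSRJ} = \frac{\sum_{i=1}^{N} w_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{N} w_i (X_i - \bar{X})^2} \tag{23}$$
$$\hat{\alpha}_{LSRJ} = \exp\left[-\frac{\bar{Y} - \hat{\beta}_{LSRJ} \bar{X}}{\hat{\beta}_{LSRJ}}\right] \tag{24}$$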
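The regression of $Y_i$ on $X_i$ can be sketched in the same way. The sketch is illustrative (function names and test data are assumptions) and uses unit weights and Filliben ranks; the scale estimate is computed from the intercept $\bar{Y} - \hat{\beta}\bar{X} = -\hat{\beta} \ln \hat{\alpha}$ of the fitted line, which follows from eq. (2).

```python
import math


def filliben_rank(i, n):
    """Median-rank approximation F_i of eq. (3)."""
    if i == 1:
        return 1.0 - 0.5 ** (1.0 / n)
    if i == n:
        return 0.5 ** (1.0 / n)
    return (i - 0.3175) / (n + 0.365)


def lsrj_estimate(data):
    """Return (alpha, beta) from the unweighted regression of Y on X = ln x."""
    x = sorted(data)
    n = len(x)
    X = [math.log(v) for v in x]
    Y = [math.log(-math.log(1.0 - filliben_rank(i, n))) for i in range(1, n + 1)]
    x_bar = sum(X) / n
    y_bar = sum(Y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(X, Y))
    s_xx = sum((xi - x_bar) ** 2 for xi in X)
    beta = s_xy / s_xx                              # slope of Y on X
    intercept = y_bar - beta * x_bar                # equals -beta * ln(alpha)
    alpha = math.exp(-intercept / beta)
    return alpha, beta


# Hypothetical data lying exactly on a Weibull line (alpha = 2, beta = 1.5).
exact = [2.0 * (-math.log(1.0 - filliben_rank(i, 6))) ** (1.0 / 1.5)
         for i in range(1, 7)]
alpha_j, beta_j = lsrj_estimate(exact)
```

On scattered data the two regressions differ: the slope of Y on X is never larger in magnitude than the reciprocal slope of X on Y, so the LSRJ shape estimate does not exceed the LSR one computed with the same $F_i$ and unit weights.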
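where $\bar{X}$ and $\bar{Y}$ are still given by eqns. (20), (21) and the weights, $w_i$, are commonly set to 1. The Weibull parameter estimates according to the MLE method are given by (Jacquelin, 1996a, 1996b, Nelson, 1982):
$$\frac{1}{\hat{\beta}_{MLE}} = \frac{\sum_{i=1}^{N} x_i^{\hat{\beta}_{MLE}} \ln(x_i)}{\sum_{i=1}^{N} x_i^{\hat{\beta}_{MLE}}} - \frac{1}{N} \sum_{i=1}^{N} \ln(x_i) \tag{25}$$
$$\hat{\alpha}_{MLE} = \left[\frac{\sum_{i=1}^{N} x_i^{\hat{\beta}_{MLE}}}{N}\right]^{1/\hat{\beta}_{MLE}} \tag{26}$$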
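Since eq. (25) defines $\hat{\beta}_{MLE}$ only implicitly, it has to be solved numerically. The sketch below uses plain bisection, which is a choice made here for simplicity and not a method prescribed by the text; the function name and the sample values are hypothetical.

```python
import math


def mle_estimate(data):
    """Return (alpha, beta) solving the likelihood equations by bisection."""
    lx = [math.log(v) for v in data]
    n = len(lx)
    top = max(lx)
    if top == min(lx):
        raise ValueError("all observations are equal: beta is not identifiable")
    mean_lx = sum(lx) / n

    def score(beta):
        # Left-hand side minus right-hand side of the shape equation, with the
        # powers x_i**beta rescaled by max(x)**beta to avoid overflow.
        w = [math.exp(beta * (v - top)) for v in lx]
        return sum(wi * v for wi, v in zip(w, lx)) / sum(w) - mean_lx - 1.0 / beta

    lo = hi = 1.0
    while score(lo) > 0.0:      # score is increasing in beta: bracket the root
        lo /= 2.0
    while score(hi) < 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    mean_power = sum(math.exp(beta * (v - top)) for v in lx) / n
    alpha = math.exp(top) * mean_power ** (1.0 / beta)   # scale equation
    return alpha, beta


sample = [1.2, 2.3, 0.7, 3.1, 1.9]   # hypothetical failure times
alpha_mle, beta_mle = mle_estimate(sample)
```

Bisection is slower than Newton-type iterations but cannot diverge, because the left-hand side minus the right-hand side of the shape equation is monotonically increasing in β.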
4.1. Comparison relevant to standard simulated samples
Standard samples, reconstructed by means of Monte Carlo simulative methods with α and β values chosen a priori, lead to the shape parameter estimates reported in Table 4. In particular, the 50% percentile and the extreme percentiles able to characterize the variance of the sampling distribution of the shape parameter estimates (i.e. the 5% and 95%), together with the relevant expected value, E(β), are reported in the Table as a function of sample size (N = 5, 10). The investigation was limited to such sample sizes, since life tests on solid insulation are usually carried out on a few specimens. The reference value of the shape parameter assumed in the calculation is $\beta_S$ = 1, a typical value for life tests on solid insulation subjected to electrical stress.

Table 4: Comparison between 5%, 50%, 95% percentile estimates and expected value of the sampling distribution of β estimates obtained by means of literature estimators (MLE, LSRJ, WLSR, LSR) and the proposed method. $\alpha_S$ = 1, $\beta_S$ = 1.

N     method                  E(β)      β50%      β5%       β95%
5     MLE                     1.428     1.240     0.682     2.786
5     LSRJ                    1.039     0.902     0.461     2.054
5     WLSR                    0.998     1.046     0.569     2.354
5     LSR                     0.840     0.902     0.461     2.055
5     proposed (eq. (11))     1.191     0.992     0.512     2.508
10    MLE                     1.170     1.105     0.741     1.809
10    LSRJ                    0.968     0.917     0.555     1.554
10    WLSR                    1.000     1.019     0.678     1.682
10    LSR                     0.876     0.917     0.555     1.555
10    proposed (eq. (11))     1.0569    0.9893    0.6388    1.716
It can be inferred from the Table that the proposed method provides expected values of the sampling distribution of the estimated shape parameter characterized by a relative bias ranging from 19% to 5% for N ranging from 5 to 10 (both values are relevant, however, to small sample sizes). These expected values are, on the whole, better than those obtained by MLE and LSR, but worse than those derived by means of LSRJ and WLSR. As previously noticed, some authors (Montanari et al., 1998) prefer to assume, as accuracy index, the difference or the ratio between the median value of the distribution of the parameter estimates and the reference value of the parameter (possibly normalized with respect to the reference value), in order to reduce the influence of extreme values (outliers) that are commonly encountered in the tails of the sampling distributions obtained by the Monte Carlo procedure (even when proper rejection tests for non-Weibull distributed simulated samples are employed, such as the Abernethy criterion or the M-test (Lawless, 1982)). In fact, resorting to the ratio $\beta_{50\%}/\beta_S$ as bias indicator, the proposed non-parametric method provides considerably better estimates than the other methods (MLE, LSRJ, WLSR, LSR). Dealing with the spread
of the distribution, which can be measured (as previously done) by the quantity $[\beta_{95\%} - \beta_{5\%}]/\beta_S$, it can be observed that the LSRJ method is preferable, even if slightly, for low sample size (i.e. N = 5), whereas already for N = 10 all methods provide equivalent spreads.
The fractional biases relevant to the scale parameter values obtained by the same estimators examined in Table 4 (i.e. MLE, LSRJ, WLSR, LSR and the proposed non-parametric method), for $\alpha_S$ = 1 and N = 5, 10 (small sample size, as in the previous calculations), expressed (according to the two alternatives) as $E(\alpha)/\alpha_S$ and $\alpha_{50\%}/\alpha_S$, are shown in Table 5. The fractional biases expressed as $\alpha_{50\%}/\alpha_S$ confirm that the proposed methodology provides scale parameter estimates more accurate than the other examined methods, whereas, according to the fractional biases expressed as $E(\alpha)/\alpha_S$, the scale parameter estimates obtained by means of the proposed method are better than LSRJ, but worse than MLE, WLSR and LSR.

Table 5: Comparison between $E(\alpha)/\alpha_S$ and $\alpha_{50\%}/\alpha_S$ for the same estimators considered in Table 4.

N     method                  E(α)/αS    α50%/αS
5     MLE                     1.032      0.961
5     LSRJ                    1.126      1.047
5     WLSR                    1.004      1.043
5     LSR                     1.010      1.047
5     proposed (eq. (16))     1.082      0.991
10    MLE                     1.016      0.981
10    LSRJ                    1.079      1.035
10    WLSR                    1.000      1.019
10    LSR                     1.019      1.035
10    proposed (eq. (16))     1.038      0.995
The comparison between the estimates of $\beta_{50\%}/\beta_S$ (Table 4) and $\alpha_{50\%}/\alpha_S$ (Table 5) highlights that the non-parametric estimators of the scale and shape parameters, expressed by eqns. (11) and (16), provide better estimates than the other methods considered, even though the analysis has been limited to standard samples (with reference values of scale and shape parameters known a priori). In the next Section, the proposed non-parametric method is used to process experimental data chosen so as to show different levels of fit to the two-parameter Weibull function.
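The accuracy indices used above (ratio of the mean, ratio of the median and normalized 5%-95% spread of the estimates with respect to the reference value) can be illustrated by a minimal Monte Carlo sketch. It is not the procedure behind Tables 4 and 5: the estimator is a plain least-squares regression of ln(x) on ln(-ln(1 - F)), Bernard's approximation (i - 0.3)/(N + 0.4) is assumed for the cumulative probability in place of eq. (3), no rejection test is applied to the simulated samples, and the names `lsr_estimate`, `percentile` and `accuracy_indices` are hypothetical.

```python
import math
import random


def lsr_estimate(sample):
    """Return (alpha, beta) from the regression of X = ln(x) on Y = ln(-ln(1 - F))."""
    xs = sorted(math.log(v) for v in sample)
    n = len(xs)
    # Assumed plotting position: Bernard's median-rank approximation.
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    slope = sxy / syy  # X = ln(alpha) + Y / beta, so the slope is 1/beta
    return math.exp(mx - slope * my), 1.0 / slope


def percentile(sorted_values, p):
    """Empirical percentile (p in [0, 1]) by linear interpolation."""
    k = p * (len(sorted_values) - 1)
    lo = math.floor(k)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (sorted_values[hi] - sorted_values[lo]) * (k - lo)


def accuracy_indices(n, beta_s=1.0, alpha_s=1.0, runs=5000, seed=1):
    """Monte Carlo accuracy indices of the shape estimate for samples of size n."""
    rng = random.Random(seed)
    betas = sorted(
        lsr_estimate([rng.weibullvariate(alpha_s, beta_s) for _ in range(n)])[1]
        for _ in range(runs)
    )
    return {
        "mean_ratio": sum(betas) / runs / beta_s,
        "median_ratio": percentile(betas, 0.50) / beta_s,
        "spread": (percentile(betas, 0.95) - percentile(betas, 0.05)) / beta_s,
    }


for n in (5, 10):
    print(n, accuracy_indices(n))
```

Because the sampling distribution of the shape estimate is skewed to the right, the mean-based index is pulled up by the upper tail, which is the reason given above for preferring the median-based index.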
4.2. Comparison relevant to experimental samples

It is of particular interest to evaluate the accuracy of the estimates of the proposed non-parametric method when applied to experimental samples which fit eq. (2) with different values of the correlation coefficient, as normally occurs in practice. In order to examine this aspect, three samples of specimens subjected to life tests under electrical stress were chosen (the failure times are reported in Table 6). The relevant values of the scale and shape parameters were calculated by the most common methods in the literature and compared with the corresponding estimates obtained by the proposed non-parametric method. Indeed, the latter can be regarded as an accurate reference, since it has been argued in the previous Sections that there is an excellent correspondence between the point estimates of scale and shape parameters derived according to the non-parametric method (i.e. $\hat{\beta}_{S,p}$ and $\hat{\alpha}_{S,p}$) and the median values of the sampling distributions of the scale and shape parameter of Weibull-distributed populations of data.

Table 6: Experimental values of samples with different size N and different correlation coefficient r, chosen in order to compare the proposed method with literature estimators.

Sample No.   N    r        x_i (i = 1, ..., N) [h]
#1           7    0.9581   0.10263; 0.4438; 1.3083; 1.7132; 1.938; 2.7734; 2.9742
#2           10   0.9461   26; 40; 40; 41; 47; 49; 50; 50; 50; 64
#3           10   0.9999   29.32; 35.37; 39.28; 42.40; 45.16; 47.75; 50.35; 53.11; 56.35; 61.00
The extent of the deviation of the experimental values from the straight line of eq. (2), once the pairs of values $\{x_i, F_i\}$ are plotted on Weibull probability paper, is measured by the estimate of the correlation coefficient, r, which is shown in Table 6. The estimates of r have been obtained by the classical methods reported in the literature (Nelson, 1982, Lawless, 1982), with the cumulative probability expressed according to eq. (3) (note that sample #3 has data points perfectly aligned along the straight line, since $r \cong 1$; this particular sample has been chosen as a check case for the results obtained in the previous Sub-section).
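As an illustration of how such a correlation coefficient can be obtained, the following sketch computes r between X = ln(x_i) and Y = ln[-ln(1 - F_i)] for the three samples of Table 6. Bernard's approximation (i - 0.3)/(N + 0.4) is assumed here for F_i in place of eq. (3), so the printed values need not coincide exactly with those of Table 6; the function name `weibull_plot_correlation` is hypothetical.

```python
import math


def weibull_plot_correlation(sample):
    """Correlation coefficient between X = ln(x) and Y = ln(-ln(1 - F))."""
    xs = sorted(math.log(v) for v in sample)
    n = len(xs)
    # Assumed plotting position: Bernard's median-rank approximation.
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Failure times [h] of Table 6.
samples = {
    "#1": [0.10263, 0.4438, 1.3083, 1.7132, 1.938, 2.7734, 2.9742],
    "#2": [26, 40, 40, 41, 47, 49, 50, 50, 50, 64],
    "#3": [29.32, 35.37, 39.28, 42.40, 45.16, 47.75, 50.35, 53.11, 56.35, 61.00],
}

for name, data in samples.items():
    print(name, round(weibull_plot_correlation(data), 4))
```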
The non-parametric estimates of the shape and scale parameters relevant to these experimental samples were calculated according to the previously described non-parametric ($\hat{\beta}_{S,p}$ and $\hat{\alpha}_{S,p}$) and pivotal non-parametric ($\hat{\beta}_{S,p}$ and $\hat{\alpha}_{S,D}$) methods. The results are reported in Table 7. Then, the shape and scale parameters were determined for the same samples by resorting to the literature estimators dealt with in the previous Sub-section, and the deviations of the relevant estimates from those obtained by the proposed method (considered as a reference, as previously stated) are reported in Table 8. In order to broaden the comparison, the scale and shape parameter estimates were also calculated by means of the robust method of Seki and Yokoyama (SY), which, according to some authors (see Seki and Yokoyama, 1996), is particularly well suited to experimental samples that exhibit anomalous values (outliers). These estimators have the following form:

\[ \hat{\alpha}_{S,Y} = \frac{1}{\sum_{i=1}^{N} a_i \ln(x_i)} \tag{27} \]

\[ \hat{\beta}_{S,Y} = \frac{1}{\sum_{i=1}^{N} b_i \ln(x_i)} \tag{28} \]
where the coefficients $a_i$, $b_i$ depend on the sample size and can be found in the above-mentioned work.

Table 7: Non-parametric ($\hat{\beta}_{S,p}$ and $\hat{\alpha}_{S,p}$) and pivotal non-parametric ($\hat{\beta}_{S,p}$ and $\hat{\alpha}_{S,D}$) estimates of the scale and shape parameters relevant to the experimental samples of Table 6.

method (estimate)         symbol          equation No. in text   sample #1   sample #2   sample #3
non-parametric            α̂_{S,p} [h]     (15)                   1.839       50.25       50.00
non-parametric            β̂_{S,p}         (11)                   0.949       5.027       4.999
pivotal non-parametric    α̂_{S,D} [h]     (16)                   1.758       49.85       49.99
The 5%, 50% and 95% percentile estimates of the shape and scale parameters, calculated for the same samples of Table 6 by both the non-parametric method proposed here (eqns. (11) and (15)) and the MLE and LSRJ estimators, are compared in Table 9. For these latter estimators, the tables of $\hat{\beta}/\beta_S$ and $(\hat{\alpha}/\alpha_S)^{\hat{\beta}}$ reported in (Jacquelin, 1996c) were used.

Table 8: Percent deviations of the scale and shape parameter estimates, calculated by the literature estimators dealt with in this Section, with respect to those obtained by the proposed method for the same samples of Table 6. Note that for WLSR a weight $w_i \neq 1$ is used, whereas for LSR a weight $w_i = 1$ is employed (see Section 3).

method   equations     sample #1 [%]     sample #2 [%]     sample #3 [%]
                       α       β         α       β         α       β
MLE      (25), (26)    5.5     45.6      1.6     8.0       0.47    15.1
WLSR     (19), (18)    2.6     26.4      1.6     14.7      0.35    2.8
LSR      (19), (18)    0.78    2.1       1.3     0.56      0       0
LSRJ     (24), (23)    5.8     10.1      0.12    11.0      0       0
SY       (27), (28)    1.53    18.7      7.4     1.3       2.1     13.4

Analogously to what is highlighted in (Jacquelin, 1996c), once the sampling distributions of the statistics $\hat{\beta}/\beta_S$ and $(\hat{\alpha}/\alpha_S)^{\hat{\beta}}$, relevant to the estimates obtained by MLE, LSRJ and the proposed non-parametric method, are available, the most significant percentiles (e.g. 5%, 50%, 95%) of both the shape and scale parameters can be obtained by the following estimators:

\[ \hat{\beta}_{5\%} = \frac{\hat{\beta}}{\left[\hat{\beta}/\beta_S\right]_{95\%,t}} \tag{29} \]

\[ \hat{\beta}_{50\%} = \frac{\hat{\beta}}{\left[\hat{\beta}/\beta_S\right]_{50\%,t}} \tag{30} \]

\[ \hat{\beta}_{95\%} = \frac{\hat{\beta}}{\left[\hat{\beta}/\beta_S\right]_{5\%,t}} \tag{31} \]

where $\hat{\beta}$ indicates the estimate of the shape parameter, relevant to the experimental values $x_i$, calculated by eq. (25) for MLE, by eq. (23) for LSRJ and by eq. (11) for the proposed method, whereas the p% percentiles $[\hat{\beta}/\beta_S]_{p\%,t}$ are derived from proper tables (see, e.g., Table 3), as indicated by subscript t. Since they come from a pivotal statistic, $\hat{\beta}/\beta_S$, they do not require knowledge of the reference value of the shape parameter (this procedure is similar to that proposed by Mann et al. (1974) regarding the BLIE methods).
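The use of eqns. (29)-(31) can be sketched as follows. The pivotal percentiles passed to the function are hypothetical illustration values, not entries of Table 3 or of the tables in (Jacquelin, 1996c), and the function name `shape_percentile_estimates` is an assumption. The point to note is the crossing of percentiles: the lower percentile of the parameter is obtained from the upper percentile of the pivotal statistic, and vice versa.

```python
def shape_percentile_estimates(beta_hat, pivot_5, pivot_50, pivot_95):
    """Apply eqns. (29)-(31): divide the point estimate by the pivotal percentiles.

    pivot_p is the p% percentile of the pivotal statistic beta_hat / beta_S.
    """
    return {
        "5%": beta_hat / pivot_95,
        "50%": beta_hat / pivot_50,
        "95%": beta_hat / pivot_5,
    }


# beta_hat is the non-parametric estimate for sample #2 (Table 7);
# the three pivotal percentiles are hypothetical illustration values.
shape_estimates = shape_percentile_estimates(
    beta_hat=5.027, pivot_5=0.64, pivot_50=1.00, pivot_95=1.72
)
print(shape_estimates)
```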
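The corresponding percentiles of the scale parameter are calculated by means of the following relationships:

\[ \hat{\alpha}_{5\%} = \frac{\hat{\alpha}}{\left\{\left[(\hat{\alpha}/\alpha_S)^{\hat{\beta}}\right]_{95\%,t}\right\}^{1/\hat{\beta}}} \tag{32} \]

\[ \hat{\alpha}_{50\%} = \frac{\hat{\alpha}}{\left\{\left[(\hat{\alpha}/\alpha_S)^{\hat{\beta}}\right]_{50\%,t}\right\}^{1/\hat{\beta}}} \tag{33} \]

\[ \hat{\alpha}_{95\%} = \frac{\hat{\alpha}}{\left\{\left[(\hat{\alpha}/\alpha_S)^{\hat{\beta}}\right]_{5\%,t}\right\}^{1/\hat{\beta}}} \tag{34} \]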
where $\hat{\alpha}$ indicates the estimate of the scale parameter, relevant to the experimental values $x_i$, calculated by eq. (26) for MLE, by eq. (24) for LSRJ and by eq. (15) for the proposed method, whereas the p% percentiles $[(\hat{\alpha}/\alpha_S)^{\hat{\beta}}]_{p\%,t}$ are derived from proper tables (see, e.g., Table 3). Since they come from a pivotal statistic, $(\hat{\alpha}/\alpha_S)^{\hat{\beta}}$, they do not require knowledge of the reference values of the parameters of the examined sample.

Table 9: 5%, 50%, 95% percentile estimates of the shape and scale parameters, calculated for the same samples of Table 6 by both the non-parametric method proposed here (eqns. (11) and (15)) and the MLE and LSRJ estimators, resorting to the sampling distributions of $\hat{\beta}/\beta_S$ and $(\hat{\alpha}/\alpha_S)^{\hat{\beta}}$.

sample        parameter   percentile   MLE                LSRJ               non-parametric
                                       eqns. (25)-(26)    eqns. (23)-(24)    eqns. (15)-(11)
#1 (N = 7)    α           5%           0.96               0.86               0.85
                          50%          1.78               1.87               1.85
                          95%          3.28               4.18               4.26
              β           5%           0.63               0.48               0.47
                          50%          1.19               0.93               0.96
                          95%          1.95               1.69               1.64
#2 (N = 10)   α           5%           43.8               44.1               44.5
                          50%          49.6               49.8               50.3
                          95%          55.8               56.4               56.7
              β           5%           2.98               2.87               2.92
                          50%          4.91               4.85               5.02
                          95%          7.36               8.13               7.91
#3 (N = 10)   α           5%           44.4               44.4               44.2
                          50%          49.9               49.6               50.0
                          95%          55.7               55.5               56.5
              β           5%           3.16               3.22               2.91
                          50%          5.21               5.42               5.00
                          95%          7.81               9.08               7.86

A significant percent deviation can be observed from Table 8 between the shape parameter estimates calculated by MLE, WLSR, LSRJ, SY and those of the proposed method for all the samples processed. Such deviations tend to diminish as the correlation coefficient estimate relevant to the samples increases. It must not be forgotten that the SY method is known in the literature as a robust method, so it should not be greatly affected by outliers. From Table 8, it can also be argued that values of $\beta_S \geq 1$ support the use of LSR, since the regression of $X_i = \ln(x_i)$ vs $Y_i = \ln[\ln(1/(1 - F_i))]$ exhibits acceptable deviations with respect to the estimates coming from the proposed non-parametric method (2.1% in the worst case, relevant to sample #1, with N = 7 and r = 0.95). As the experimental data point scatter decreases (i.e. r tends to 1), a remarkable difference between the proposed method and the others is observed (see Tables 4 and 5) when $\beta_S \leq 1$. These differences vanish as the reference value of the shape parameter increases (e.g. $\beta_S \cong 5$, see Table 8, in which the LSR and LSRJ estimates are equivalent to those obtained by the proposed method). The bias characterizing the different estimates of the shape parameter affects, in turn, the scale parameter estimates. Thus, the previously noted deviations also remain in the estimates of the latter, and the LSR method turns out to be the least biased with respect to the estimates of the method proposed in the present work (0.78% in the worst case, which is once again sample #1).

One more advantage of the proposed estimators, i.e. $\hat{\alpha}_{S,p}$ and $\hat{\beta}_{S,p}$, is that they provide point estimates (by eqns. (11) and (15)) that are very close to the most efficient ones, which are obtained, according to Jacquelin (Jacquelin, 1996b), by applying eqns. (30) and (33), as shown by the values listed in Table 9. Therefore, the point estimates of the shape and scale parameters according to (11) and (15) do not require the application of Jacquelin's technique (Jacquelin, 1996b), which derives the most efficient estimates by correcting the MLE and LSRJ methods.
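The scale-parameter relationships (32)-(34) can be sketched in the same spirit. Here the pivotal percentiles refer to the statistic $(\hat{\alpha}/\alpha_S)^{\hat{\beta}}$, so each of them is raised to the power $1/\hat{\beta}$ before dividing the point estimate. The numerical percentiles below are hypothetical illustration values, not tabulated ones, and the function name `scale_percentile_estimates` is an assumption.

```python
def scale_percentile_estimates(alpha_hat, beta_hat, pivot_5, pivot_50, pivot_95):
    """Apply eqns. (32)-(34).

    pivot_p is the p% percentile of the pivotal statistic
    (alpha_hat / alpha_S) ** beta_hat.
    """
    exponent = 1.0 / beta_hat
    return {
        "5%": alpha_hat / pivot_95 ** exponent,
        "50%": alpha_hat / pivot_50 ** exponent,
        "95%": alpha_hat / pivot_5 ** exponent,
    }


# Point estimates of sample #2 (Table 7); hypothetical pivotal percentiles.
scale_estimates = scale_percentile_estimates(
    alpha_hat=50.25, beta_hat=5.027, pivot_5=0.55, pivot_50=1.00, pivot_95=1.80
)
print(scale_estimates)
```

With a 50% pivotal percentile equal to 1 the median estimate coincides with the point estimate, which is the situation described above for an estimator whose point estimates are already close to the most efficient ones.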
Table 10: Experimental values and correlation coefficient of samples obtained from sample #3 of Table 6 by changing values in the tails.

sample   sample size   r       x_i (i = 1, ..., N) [h]
#4       10            0.990   33; 35.37; 39.28; 42.40; 45.16; 47.35; 50.35; 53.11; 56.35; 58
#5       10            0.994   26; 35.37; 39.28; 42.4; 45.16; 47.35; 50.35; 53.11; 56.35; 64
5. Robustness of the proposed method

An interesting characteristic of the proposed non-parametric estimates of the scale and shape parameters of a Weibull p.d.f. is that they are not significantly influenced by anomalous values (outliers) possibly present in the experimental sample, contrary to what can occur with other estimators (e.g. MLE and LSR). Outliers are a problem that can often affect the results of life tests or electric strength tests aimed at evaluating different materials (on the contrary, for insulation systems the extreme values are of primary importance). In order to evaluate the robustness of the proposed non-parametric method, it was compared with LSR and LSRJ (those that seem to provide the point estimates closest to the proposed non-parametric method), as well as with MLE and SY, the latter known in the literature as a robust method. For this check, sample #3 of Table 6, characterized by a correlation coefficient $r \cong 1$, was taken as a reference, and the two extreme values of this sample (both the lowest and the highest) were displaced far from the straight line of eq. (2), so that the two samples #4 and #5 (reported in Table 10 with the relevant correlation coefficients) were obtained. The displacement was arranged so that the value of the shape parameter would not change with respect to the original distribution.

The presence of outliers affects the estimates of the scale and shape parameters, as shown by the values reported in Table 11. This Table shows the percent deviations of the scale and shape parameter estimates, calculated for the samples of Table 10 by MLE, LSR, LSRJ and SY, from those obtained by the proposed method (taken as a reference). In particular, the non-parametric estimates for sample #4 are $\hat{\alpha}_{S,p} = 49.99$ and $\hat{\beta}_{S,p} = 5.00$, whereas for sample #5 they are $\hat{\alpha}_{S,p} = 50.01$ and $\hat{\beta}_{S,p} = 4.99$. From these values and from those listed in the Table, it can be argued that the values of the shape parameter obtained by the proposed method remain practically constant as anomalous values are introduced, and practically coincide with the reference value of the shape parameter fixed a priori, contrary to the other estimators, which are significantly affected by the presence of outliers.

Table 11: Percent deviations of the scale and shape parameter estimates, calculated for the samples of Table 10 by the MLE, LSR, LSRJ and SY methods, from those obtained by the proposed method. Extreme failure times displaced with respect to sample #3 of Table 6.

deviation [%]     MLE              LSR              LSRJ             SY               proposed
                  α       β        α       β        α       β        α      β         α     β
for sample #4     −1.08   31.4     −0.94   17.71    −0.76   15       2.99   3.77      0     0
for sample #5     ≅0      1.64     0.69    12.62    0.81    13.62    1.46   20.99     0     0

In order to verify the global effectiveness of the proposed method, beyond the empirical evaluation carried out above, some indexes are analyzed that are based on the difference between the estimated cumulative probabilities, $\hat{F}(x_i, \hat{\alpha}, \hat{\beta})$, and those expected, $F_i$, for each failure. The estimates of the so-called inequality index, $T^*$, of the so-called cost index, C, and of the sum of the absolute differences between a priori and estimated probabilities are reported in Table 12. In particular, $T^*$ is equal to (see Zani, 1982):
\[ T^* = \frac{\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[F_i - \hat{F}(x_i, \hat{\alpha}, \hat{\beta})\right]^2}}{\sqrt{\frac{1}{N}\sum_{i=1}^{N} F_i^2} + \sqrt{\frac{1}{N}\sum_{i=1}^{N} \hat{F}(x_i, \hat{\alpha}, \hat{\beta})^2}} \tag{35} \]

where the values of $F_i$ are expressed, in this case, by eq. (3), and $\hat{F}(x_i, \hat{\alpha}, \hat{\beta})$ by eq. (1), once the estimates of the scale and shape parameters shown in Table 11 have been introduced into this latter equation. Alternatively, in order to overcome the constraints relevant to the application of index $T^*$, the value of the mean cost of the prediction errors $F_i - \hat{F}(x_i, \hat{\alpha}, \hat{\beta})$ can be calculated. It is given by:

\[ C = \frac{\sum_{i=1}^{N}\left[F_i - \hat{F}(x_i, \hat{\alpha}, \hat{\beta})\right]^2}{N} \tag{36} \]

which is the square of the mean quadratic error (Zani, 1982).
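A minimal sketch of how such indices can be evaluated is given below for sample #4 of Table 10, using the non-parametric estimates quoted in the text (49.99 and 5.00). Several points are assumptions: Bernard's approximation (i - 0.3)/(N + 0.4) stands in for eq. (3); $T^*$ is taken as the root-mean-square difference divided by the sum of the root-mean-square values of the two probability series; C is taken as the mean squared difference; the function names are hypothetical. The results are therefore not expected to reproduce the published table entry by entry.

```python
import math


def weibull_cdf(x, alpha, beta):
    """Two-parameter Weibull cumulative distribution function."""
    return 1.0 - math.exp(-((x / alpha) ** beta))


def fit_indices(sample, alpha_hat, beta_hat):
    """Return (T*, C, sum of absolute differences) for the given estimates."""
    xs = sorted(sample)
    n = len(xs)
    # A priori probabilities: Bernard's approximation (assumed stand-in for eq. (3)).
    f_prior = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    f_hat = [weibull_cdf(x, alpha_hat, beta_hat) for x in xs]
    squared = sum((f - g) ** 2 for f, g in zip(f_prior, f_hat))
    rms_prior = math.sqrt(sum(f * f for f in f_prior) / n)
    rms_hat = math.sqrt(sum(g * g for g in f_hat) / n)
    t_star = math.sqrt(squared / n) / (rms_prior + rms_hat)
    cost = squared / n
    abs_sum = sum(abs(f - g) for f, g in zip(f_prior, f_hat))
    return t_star, cost, abs_sum


# Sample #4 of Table 10 and the non-parametric estimates quoted in the text.
sample_4 = [33, 35.37, 39.28, 42.40, 45.16, 47.35, 50.35, 53.11, 56.35, 58]
t_star, cost, abs_sum = fit_indices(sample_4, alpha_hat=49.99, beta_hat=5.00)
print(t_star, cost, abs_sum)
```

Most of the absolute-difference sum comes from the two displaced extreme failure times, which is why this index reacts strongly to estimates that chase the outliers.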
Table 12: Values of indexes $T^*$, C and of the sum of absolute differences, relevant to the scale and shape parameter estimates reported in Table 11.

method      T*                  C                   sum of absolute differences
            #4       #5         #4       #5         #4       #5
MLE         0.0383   0.0142     0.5837   0.5754     0.3820   0.1028
LSR         0.0236   0.0187     0.5784   0.5677     0.2501   0.1964
LSRJ        0.0216   0.0202     0.5752   0.5662     0.2345   0.2063
SY          0.0418   0.0349     0.5221   0.5584     0.4713   0.3646
proposed    0.0213   0.0132     0.5622   0.5757     0.1227   0.0810
From the analysis of the different values of $T^*$, C and the sum of absolute differences, it clearly appears that the point estimates of α and β obtained according to the proposed methodology (i.e. by eqns. (11) and (15)) are more robust than those achieved by means of the other methods, at least as far as the above-described samples (conceived in order to examine the influence of outliers on the Weibull parameter estimates) are concerned. On the other hand, if the a priori cumulative probabilities are calculated by means of eqns. (17) or (22), the values of the three indexes are slightly modified with respect to those reported in Table 12. In all cases, however, the values of the sum of absolute differences and of $T^*$ obtained by the proposed method are always the lowest, whereas the values of index C are almost comparable with those of LSR and LSRJ, and higher than those obtained by the SY method. The values of the sum of absolute differences and of $T^*$ highlight the differences between the α and β parameter estimates obtained by the different methods more sharply than those of index C. In conclusion, the α and β parameter estimates obtained by the proposed method respond better to the robustness test relevant to eq. (1) than both the established procedures of Table 12 and BLUE (Lawless, 1982, Nelson, 1982), BLIE and the estimators reported in (Wyckoff et al., 1980, Adatia and Chan, 1985, Zanakis and Mann, 1982).
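The sensitivity to displaced extreme values can be illustrated on samples #3 and #4 with a short sketch. It does not implement eqns. (11) and (15): as a stand-in robust estimator it uses a Theil-type median of all pairwise slopes on the Weibull plot, and it compares it with the least-squares regression of ln(x) on ln(-ln(1 - F)). Bernard's approximation (i - 0.3)/(N + 0.4) is assumed in place of eq. (3), and all function names are hypothetical.

```python
import math
import statistics
from itertools import combinations


def plot_coordinates(sample):
    """X = ln(x) and Y = ln(-ln(1 - F)), with F from Bernard's approximation (assumed)."""
    xs = sorted(math.log(v) for v in sample)
    n = len(xs)
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4))) for i in range(1, n + 1)]
    return xs, ys


def beta_lsr(sample):
    """Shape estimate from the least-squares regression of X on Y (slope = 1/beta)."""
    xs, ys = plot_coordinates(sample)
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return syy / sxy


def beta_median_slope(sample):
    """Shape estimate from the median of all pairwise slopes (Theil-type stand-in)."""
    xs, ys = plot_coordinates(sample)
    points = list(zip(xs, ys))
    slopes = [(x2 - x1) / (y2 - y1) for (x1, y1), (x2, y2) in combinations(points, 2)]
    return 1.0 / statistics.median(slopes)


sample_3 = [29.32, 35.37, 39.28, 42.40, 45.16, 47.75, 50.35, 53.11, 56.35, 61.00]
sample_4 = [33, 35.37, 39.28, 42.40, 45.16, 47.35, 50.35, 53.11, 56.35, 58]

for name, estimator in (("LSR", beta_lsr), ("median slope", beta_median_slope)):
    change = 100.0 * (estimator(sample_4) / estimator(sample_3) - 1.0)
    print(name, round(change, 2), "% change in the shape estimate")
```

The least-squares slope is pulled by the two displaced end points, whereas the median of the pairwise slopes is still determined by the pairs of undisturbed interior points, which mirrors the behaviour reported for LSR and for the proposed method.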
6. Conclusion

The calculation procedure presented in this work, essentially based on the non-parametric determination of pivotal percentiles, provides satisfactory point estimates of the two parameters of a Weibull probability distribution function that describes the times to failure or the breakdown voltages relevant to life tests carried out on solid insulation samples. It has been illustrated, indeed, that the estimates of parameters α and β thus obtained are both accurate and robust. On the whole, they seem to perform better than other methods known in the literature, both those based on classical regressive procedures and those derived from other percentile-based methodologies. In conclusion, the proposed method can be recommended, particularly when dealing with small samples and data sets containing outliers.
REFERENCES

Abernethy, R. B. (1996) The new Weibull handbook, (2nd edition), published by the author, ISBN 0-965-3062-0-8.
Adatia, A. and Chan, L. K. (1985) Robust estimations of the 3-parameter Weibull distribution, IEEE Trans. on Reliability, Vol. R-34, 347-351.
Birkes, D. and Dodge, Y. (1993) Alternative methods of regression, J. Wiley & Sons, New York.
Cacciari, M., Mazzanti, G., and Montanari, G. C. (1994) Electric strength measurements and Weibull statistics on thin EPR films, IEEE Transactions on Dielectrics and Electrical Insulation, Vol. 1, n. 1, 153-159.
Cacciari, M., Mazzanti, G., and Montanari, G. C. (1996) Comparison of maximum likelihood unbiasing methods for the estimation of the Weibull function parameters, IEEE Transactions on Dielectrics and Electrical Insulation, Vol. 3, n. 1, 18-27.
Castillo, E. and Hadi, A. S. (1995) A method for estimating parameters and quantiles of distributions of continuous random variables, Computational Statistics & Data Analysis, Vol. 20, 421-439.
Fothergill, J. C. (1990) Estimating the cumulative probability of failure data points to be plotted on Weibull and other probability paper, IEEE Transactions on Electrical Insulation, Vol. 25, n. 3, 489-492.
IEEE guide for statistical analysis of electrical insulation breakdown data, Draft 6, October 1999.
Jacquelin, J. (1993) A reliable algorithm for the exact median rank function, IEEE Transactions on Electrical Insulation, Vol. 28, n. 2, 168-171 (and ERRATUM Vol. 28, n. 5, p. 892, October 1993).
Jacquelin, J. (1996a) A procedure for the estimation for the three-parameter Weibull distribution, Alcatel-Alsthom Recherche, Internal report, Draft 1.
Jacquelin, J. (1996b) Inference of sampling on Weibull parameter estimation, IEEE Transactions on Dielectrics and Electrical Insulation, Vol. 3, n. 6, 809-816.
Jacquelin, J. (1996c) Separative method for parameter estimation of multiparameter distribution, Alcatel-Alsthom Recherche, Marcoussis, France, Internal report UME/PI/96058/1996.
Jacquelin, J. (1997) Elemental percentile method for parameter estimation of a multiparameter distribution, Alcatel-Alsthom Recherche, Internal report F9/1460, Marcoussis, France.
Lawless, J. F. (1982) Statistical models and methods for lifetime data, J. Wiley & Sons, New York.
Lecontre, J. P. and Tassi, P. (1987) Non-parametric statistics and robustness, Economica, Paris.
Mann, N. R., Schafer, R. E., and Singpurwalla, N. D. (1974) Methods for statistical analysis of reliability and lifetime data, J. Wiley & Sons, New York.
Montanari, G. C., Cavallini, A., Tommasini, L., Cacciari, M., and Contin, A. (1995) Comparison of random generators for Monte Carlo estimates of Weibull parameters, Metron, Vol. LVI, n. 1-2, 55-77.
Montanari, G. C., Mazzanti, G., Cacciari, M., and Fothergill, J. C. (1998) Optimum estimators for the Weibull distribution from censored test data. Progressively-censored tests, IEEE Transactions on Dielectrics and Electrical Insulation, Vol. 5, n. 2, 157-164.
Nelson, W. (1982) Applied life data analysis, J. Wiley & Sons, New York.
Ross, R. (1994) Graphical methods for plotting and evaluating Weibull distributed data, Proc. 4th Int. Conf. Prop. Appl. Diel. Mater., Vol. 1, 250-253, Brisbane, Australia.
Ross, R. (1996) Bias and standard deviation due to Weibull parameter estimation for small data sets, IEEE Transactions on Dielectrics and Electrical Insulation, Vol. 3, n. 1, 28-42.
Seki, T. and Yokoyama, S. (1996) Robust parameter estimation using the bootstrap method for the 2-parameter Weibull distribution, IEEE Transactions on Reliability, Vol. 45, n. 1, 34-41.
Sprent, P. (1989) Applied nonparametric statistical methods, Chapman and Hall, London.
White, J. S. (1969) The moments of log-Weibull order statistics, Technometrics, Vol. 11, n. 2, 374-386.
Wyckoff, J., Bain, L. J., and Engelhardt, M. (1980) Some complete and censored sampling results for the three-parameter Weibull distribution, J. Stat. Comp. and Simulation, Vol. 11, 139-151.
Zanakis, S. H. and Mann, N. R. (1982) A good simple percentile estimator of the Weibull shape parameter for use when all three parameters are unknown, Naval Research Logistics Quarterly, Vol. 29, n. 3, 419-428.
Zani, S. (1982) Statistical indicators of conjuncture, (in Italian), Loescher Publ., Torino, Italy.
Summary

In this work, a procedure for the calculation of the parameters of a probability distribution function, which is both accurate and robust, is described. This procedure, belonging to the family of regressive methods, is based on a "non-parametric" methodology that employs the percentiles of the distribution of the parameter estimates. This statistical procedure is applied to obtain the point estimates of the two parameters of a Weibull distribution function, which is able to describe effectively the experimental results of complete life tests performed on solid dielectric insulation. Unbiasing factors for the scale and shape parameters of the Weibull function are obtained by resorting to the Monte Carlo method. The accuracy and the robustness of the proposed non-parametric method are checked by comparison with the other techniques employed in the literature, that is, the Maximum-Likelihood Estimate (MLE), the Least-Square Regression (LSR) and the Weighted Least-Square Regression (WLSR) techniques. It is shown that the methodology proposed here for the estimation of the parameters of the 2-parameter Weibull function is able to provide accurate and robust estimates, particularly useful for samples of data presenting outliers.

Riassunto: Un metodo robusto per la stima della distribuzione di Weibull a due parametri per lotti completi

This work describes an accurate and robust procedure for calculating the parameters of a probability distribution function. The procedure, which belongs to the family of regressive methods, is based on a "non-parametric" methodology that employs the percentiles of the sampling distribution of the parameter estimates. The statistical procedure is applied to obtain the point estimates of the parameters of a two-parameter Weibull distribution, which can effectively describe the experimental results of complete life tests performed on solid insulation. In addition, correction factors for the systematic error of the scale and shape parameters of the Weibull distribution are derived by resorting to the Monte Carlo method. The accuracy and robustness of the proposed non-parametric method are verified by comparison with the other techniques used in the literature, namely the Maximum Likelihood method (MLE), the linear regression method (LSR) and the weighted linear regression method (WLSR). It follows that the method proposed here for estimating the parameters of the two-parameter Weibull distribution can provide accurate and robust estimates, useful in particular for data sets containing outliers.

Key words

Percentile method; Weibull distribution; Parameter estimates.
[Manuscript received September 2001; final version received July 2002.]