Quality Engineering
ISSN: 0898-2112 (Print) 1532-4222 (Online) Journal homepage: http://www.tandfonline.com/loi/lqen20
NEW ROBUST STATISTICAL PROCESS CONTROL CHART FOR LOCATION Moustafa O. Abu-Shawiesh & Mokhtar B. Abdullah To cite this article: Moustafa O. Abu-Shawiesh & Mokhtar B. Abdullah (1999) NEW ROBUST STATISTICAL PROCESS CONTROL CHART FOR LOCATION, Quality Engineering, 12:2, 149-159, DOI: 10.1080/08982119908962572 To link to this article: http://dx.doi.org/10.1080/08982119908962572
Published online: 29 May 2007.
Full Terms & Conditions of access and use can be found at http://www.tandfonline.com/action/journalInformation?journalCode=lqen20 Download by: [University of Nizwa]
Date: 30 November 2015, At: 23:29
Quality Engineering, 12(2), 149-159 (1999-2000)
NEW ROBUST STATISTICAL PROCESS CONTROL CHART FOR LOCATION
Moustafa O. Abu-Shawiesh, P.O. Box 620417, Irbid 21162, Jordan
Mokhtar B. Abdullah Jabatan Statistics, FSM 43600 UKM Bangi Selangor D.E., Malaysia
Key Words Hodges-Lehmann estimator; Shamos-Bickel-Lehmann estimator; Robust; Statistical process control; Control chart; Control limits; Central line; Outlier.
Copyright © 1999 by Marcel Dekker, Inc.

Introduction
The field of robust statistics was developed to provide more accurate results than the traditional measures, especially when outliers may be present in the data or when the assumptions under which the statistical procedure was developed are not met or are slightly incorrect. These methods give reliable results in the neighborhood of the assumed parametric model. The primary tenet of the robust approach is that a small error in either the data or the probability distribution should not have a significant impact on the conclusion of the analysis. In statistical process control, it is important that the control limits for the control chart be set using a measure of the process parameters (i.e., the location and the scale) that is not unduly affected by extreme observations or outliers in the subgroups. Traditional measures like the sample mean and the sample standard deviation can be easily influenced by these extreme observations.

The presence of an outlier in a data set could imply that this observation was drawn from a different population or that it is due to some sporadic variation. The outlier might also be caused by recording errors. Errors in the data or in the probability distribution used to determine control limits will alter the probabilities of the type I and type II errors. This will increase costs beyond what is expected from the exact model. Outliers should be detected and investigated, and the special cause for their presence should be eliminated, because outliers reduce the sensitivity of control charting procedures: the control limits become stretched so that the detection of the outliers themselves becomes less likely (1). The traditional statistical procedures used in the construction of control charts are optimal if all underlying assumptions, including normality, are satisfied. The loss of efficiency of the traditional statistics when the assumptions are not all satisfied is estimated by Hampel et al. (2) to be in the range 5-50%.

The method proposed in this article is based on more robust measures than the sample mean and the sample standard deviation. They are the Hodges-Lehmann and the Shamos-Bickel-Lehmann estimators. Numerical examples are given to illustrate the use of the proposed method and to compare it to the traditional method. Its performance is investigated using a simulation study.
The Hodges-Lehmann Estimator
The Hodges-Lehmann (HL) estimator was proposed by Hodges and Lehmann (3) as an estimator for the point of symmetry $\theta$ of a continuous and symmetric distribution. It is used for the estimation of the location parameter in one- and two-sample models, and it is a typical example of R-estimators. This estimator is unbiased, translation invariant, and nonparametric. It is based on the Wilcoxon signed-rank statistic, which guarantees the specified type I error if the observations constitute a random sample obtained independently from some distribution that is continuous and symmetric about $\theta$. The HL estimator for a random sample $X_1, X_2, \ldots, X_n$ is defined as follows: Define the $M = n(n+1)/2$ Walsh averages $W_r = (X_i + X_j)/2$, where $r = 1, 2, \ldots, M$ and $i \le j = 1, 2, \ldots, n$. The HL estimator for the point of symmetry $\theta$ is the sample median of the Walsh averages $W_r$ and is given by

$$HL = \begin{cases} W_{(k+1)} & \text{if } M \text{ is odd} \\ \left(W_{(k)} + W_{(k+1)}\right)/2 & \text{if } M \text{ is even,} \end{cases} \qquad (1)$$

where

$$k = \begin{cases} (M-1)/2 & \text{if } M \text{ is odd} \\ M/2 & \text{if } M \text{ is even.} \end{cases}$$

The HL estimator was originally proposed as a nonparametric estimator but was later shown to belong to the class of robust R-estimators (4). The main properties of this estimator can be summarized as follows:
1. The breakdown point for the HL estimator is 29%, which is sufficiently high for most purposes (5).
2. If the underlying distribution for the data is Normal, then the asymptotic relative efficiency (ARE) of the HL estimator relative to the sample mean is 0.955. If the underlying distribution for the data is not Normal, then the ARE is often greater than unity (4).
3. The ARE for the HL estimator is the same as that of the Wilcoxon signed-rank test, and the estimator is asymptotically normally distributed. Also, it is robust against gross errors (6).
4. For a random sample $X_1, X_2, \ldots, X_n$ coming from a continuous distribution that is symmetric about $\theta$, the distribution of the HL estimator is also symmetric about $\theta$, and when the expectation of $X_i$ exists, the statistic HL is an unbiased estimator of $\theta$ (3,7).

The Shamos-Bickel-Lehmann Estimator
This estimator is mentioned by Shamos (8) and later by Bickel and Lehmann (9). We will refer to it as SBL. The SBL is a scale estimator analogous to the HL location estimator. It is obtained by replacing the pairwise averages $(X_i + X_j)/2$ by the pairwise distances $|X_i - X_j|$. The SBL estimator for a random sample $X_1, X_2, \ldots, X_n$ is defined as follows: Define the

$$U = \binom{n}{2} = \frac{n!}{2!\,(n-2)!}$$

pairwise distances $B_r = |X_i - X_j|$, where $r = 1, 2, \ldots, U$ and $i < j = 1, 2, \ldots, n$. The SBL estimator is the sample median of the pairwise distances $B_r$ and is given by

$$SBL = \begin{cases} B_{(g+1)} & \text{if } U \text{ is odd} \\ \left(B_{(g)} + B_{(g+1)}\right)/2 & \text{if } U \text{ is even,} \end{cases} \qquad (3)$$

where

$$g = \begin{cases} (U-1)/2 & \text{if } U \text{ is odd} \\ U/2 & \text{if } U \text{ is even.} \end{cases}$$

The main properties of the SBL estimator can be summarized as follows:
1. The breakdown point for the SBL estimator is 29%.
2. The SBL estimator has a high efficiency in the Normal case (about 86%).
3. The SBL estimator uses an overall median over $\binom{n}{2}$ pairs, where $\binom{n}{2} = n!/[2!\,(n-2)!]$.
4. Rousseeuw and Croux (10) proposed multiplying the SBL estimator by 1.0483 to achieve consistency for the parameter $\sigma$ of Normal distributions.
5. The square of the SBL estimator ($SBL^2$) can be used as an estimate for $\sigma^2$. Of course, SBL is not unbiased or median unbiased for $\sigma$, and $SBL^2$ is not unbiased or median unbiased for $\sigma^2$. However, they give good approximations to what they are estimating (11).
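The two definitions above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the sample data are assumptions made for the example and do not come from the article. The Walsh averages use pairs with i <= j and the pairwise distances use pairs with i < j, as in the definitions.

```python
from itertools import combinations, combinations_with_replacement
from statistics import median


def hodges_lehmann(sample):
    """Median of the n(n+1)/2 Walsh averages (pairs with i <= j)."""
    walsh = [(a + b) / 2 for a, b in combinations_with_replacement(sample, 2)]
    return median(walsh)


def shamos_bickel_lehmann(sample):
    """Median of the n(n-1)/2 pairwise distances (pairs with i < j)."""
    distances = [abs(a - b) for a, b in combinations(sample, 2)]
    return median(distances)


# Hypothetical sample with one gross outlier (100.0).
x = [1.0, 2.0, 3.0, 4.0, 100.0]
print(hodges_lehmann(x))         # 3.0
print(shamos_bickel_lehmann(x))  # 2.5
```

For this sample the mean is 22.0, while the HL value stays at 3.0, which illustrates the resistance to a single outlier that motivates the use of these estimators.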
The Proposed Control Chart
The previous work in the field of robust control charts is limited. Most of the previous methods are robust to a small number of extreme points, and they assume three-sigma limits and a Normal distribution in the calculation of their control limits and central line. Hence, they do not guarantee the specified type I error. In this section, we propose a new robust control chart for location. We assume that the process is continuous and symmetric. Because robustness of the chart is of primary concern, we propose the use of the HL estimator as an estimate for the location parameter and the SBL estimator as an estimate for the scale parameter. The calculations are simple and can be done manually or using a computer. Suppose that we have m subgroups, each of size n. Let the observations in each subgroup be denoted by $X_1, X_2, \ldots, X_n$. The method for constructing the proposed robust control chart for location is given below. For each subgroup, perform steps 1-6.
Table 1. The Control Limit Factors for the Proposed Method
(columns: n, HL, SBL, STD. ERROR HL, c)
1. Form the $M = n(n+1)/2$ Walsh averages $W_{i,jl}$ for subgroup i, where $W_{i,jl} = (X_{ij} + X_{il})/2$, $j \le l$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$, and $l = 1, 2, \ldots, n$.
2. The resulting Walsh averages are ordered as $W_{(i,1)} \le W_{(i,2)} \le \cdots \le W_{(i,M)}$, where $W_{(i,k)}$ represents the kth-order Walsh average in the ith subgroup.
3. Calculate the HL estimator using Eq. (1).
4. Form the $U = n(n-1)/2$ pairwise distances $B_{i,jl}$ for subgroup i, where $B_{i,jl} = |X_{ij} - X_{il}|$, $j < l$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$, and $l = 1, 2, \ldots, n$.
5. The resulting values of $B_{i,jl}$ are ordered as $B_{(i,1)} \le B_{(i,2)} \le \cdots \le B_{(i,U)}$, where $B_{(i,k)}$ represents the kth-order $B_{i,jl}$ in the ith subgroup.
6. Calculate the SBL estimator using Eq. (3).
7. The three-sigma control limits for the proposed control chart are obtained by calculating the average of the HL estimators, $\overline{HL}$, and the average of the SBL estimators, $\overline{SBL}$, and then using the following formula:

$$LCL = \overline{HL} - 3c\,\overline{SBL}, \qquad UCL = \overline{HL} + 3c\,\overline{SBL},$$

where LCL and UCL are the lower and upper control limits, respectively.
8. The values of the correction factor c based on the original subgroup size n are given in Table 1. These values were obtained using a simulation study and are explained later.
9. The central line is obtained by calculating the average of all subgroup HL estimators using

$$CL = \overline{HL} = \frac{1}{m} \sum_{i=1}^{m} HL_i .$$

10. After the control limits and the central line have been calculated, the values of HL for the m subgroups are plotted on the chart. If any one of the points falls outside the control limits, the process is considered to be out of control.
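The ten steps above can be sketched as follows. This is a minimal sketch, not the authors' implementation: the helper names are assumptions, and the correction factor c is passed in as a parameter because it has to be read from Table 1 for the subgroup size in use. The value c = 0.5 in the usage lines is a made-up number chosen only to keep the arithmetic easy to follow.

```python
from itertools import combinations, combinations_with_replacement
from statistics import mean, median


def hl(sample):
    # Steps 1-3: median of the Walsh averages of one subgroup.
    return median((a + b) / 2 for a, b in combinations_with_replacement(sample, 2))


def sbl(sample):
    # Steps 4-6: median of the pairwise distances of one subgroup.
    return median(abs(a - b) for a, b in combinations(sample, 2))


def robust_chart(subgroups, c):
    """Return (LCL, CL, UCL, out) where out lists the 1-based subgroup
    numbers whose HL value falls outside the control limits."""
    hl_values = [hl(g) for g in subgroups]
    sbl_values = [sbl(g) for g in subgroups]
    center = mean(hl_values)                   # step 9: central line
    half_width = 3 * c * mean(sbl_values)      # step 7: three-sigma half-width
    lcl, ucl = center - half_width, center + half_width
    out = [i for i, v in enumerate(hl_values, start=1) if v < lcl or v > ucl]
    return lcl, center, ucl, out               # step 10: points to flag


# Hypothetical data: three subgroups of size 3 and an illustrative c.
data = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(robust_chart(data, c=0.5))  # (1.5, 3.0, 4.5, [])
```

Passing c explicitly keeps the sketch independent of any particular table of factors, so the same function can be reused once the tabulated value for a given n is known.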
The Development of Control Limits for the Proposed Method
This section explains how the control limits for the proposed control chart were developed. The control limits are developed empirically through two series of Monte Carlo simulations. The first series is used to determine the standard deviation of the sampling distribution of the HL estimator (i.e., the standard error of the HL estimator). The mean value of the HL estimator and its standard deviation are determined based on 20,000 samples of size n, $2 \le n \le 15$. The second series consisted of 1000 trials of 20 subgroups of size n. The average values of the HL estimators and the SBL estimators are determined for each trial. The average of the sample statistic (SBL) is not equal to the standard error of the HL estimator, and so a correction factor is required. The factor c is the ratio of the standard error of the HL estimator to the SBL estimator. The Normal distribution is used to calibrate the model. Because the HL and the SBL estimators are robust estimators, they should give reasonable results for distributions in the neighborhood of the Normal. The final results of this simulation study are given in Table 1. Furthermore, because the HL estimator is asymptotically normally distributed, this Normal approximation will allow a more even comparison between the proposed method and Shewhart's method, which is also based on the normality assumption. Finally, three-sigma limits are obtained by multiplying the SBL estimator by 3c, where c is given in Table 1.
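The calibration idea can be sketched with a short Monte Carlo run. This is a simplified sketch under stated assumptions: it uses a single series of N(0, 1) samples to estimate both the standard error of HL and the average SBL, whereas the article uses two separate series; the function name, seed, and default sample count are illustrative choices, and the output is not a substitute for the tabulated factors.

```python
import random
from itertools import combinations, combinations_with_replacement
from statistics import mean, median, stdev


def hl(sample):
    return median((a + b) / 2 for a, b in combinations_with_replacement(sample, 2))


def sbl(sample):
    return median(abs(a - b) for a, b in combinations(sample, 2))


def estimate_c(n, n_samples=20000, seed=0):
    """Estimate c = (standard error of HL) / (average SBL) for N(0, 1) data."""
    rng = random.Random(seed)
    hl_values, sbl_values = [], []
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        hl_values.append(hl(x))
        sbl_values.append(sbl(x))
    # The standard deviation of the simulated HL values approximates
    # the standard error of the HL estimator for subgroup size n.
    return stdev(hl_values) / mean(sbl_values)


print(round(estimate_c(5, n_samples=4000), 3))
```

Because the factor is a ratio of two simulated quantities, it carries Monte Carlo error; increasing `n_samples` or averaging over several seeds would tighten the estimate.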
Numerical Examples
We now illustrate the use of the proposed robust method and compare it with the traditional control chart and the method proposed by Langenberg and Iglewicz (12). Four examples, which represent the different cases we may encounter, are introduced. These cases are Normal without outliers, Normal with outliers, non-Normal without outliers, and non-Normal with outliers.

The first example is taken from Ref. 13 (p. 269). It deals with the fill volume of soft drink beverage bottles. As a part of the study of the process, 15 subgroups of 10 observations each were taken. The data were tested and found to be Normal with no outliers. Table 2 contains the results regarding the control limits, the central line, the number of points falling outside the control limits, the length, and the state of control of the process for the Shewhart (traditional) method, the method of Langenberg and Iglewicz (12) with one trimming rate, and the proposed method. Figure 1 is a combined control chart showing the control limits for the Shewhart method based on the range, the Langenberg and Iglewicz (12) method, and the proposed method based on the HL and SBL estimators.

Table 2. The First Numerical Example: Comparison of Control Charts

METHOD                       LCL      CL       UCL     LENGTH   OUT   STATE
Shewhart based on R/d2      -0.988   -0.001    0.985    1.973    0    In control
Shewhart based on S/c4      -1.044   -0.001    1.041    2.085    0    In control
Langenberg-Iglewicz (10%)   -1.056   -0.065    0.925    1.981    0    In control
Proposed                    -1.100   -0.040    1.020    2.120    0    In control

Figure 1. The combined control chart for the fill volume of soft drink beverage bottles data (horizontal axis: subgroup number). Dotted line: traditional method; dashed line: Langenberg-Iglewicz method; solid line: the proposed method; solid circle: HL value; open circle: mean value.

The second example is taken from Ref. 13 (p. 386). It deals with the soft drink bottle bursting strength. As a part of the study of the process, 20 subgroups of 5 observations each were taken. The data were tested and found to be Normal with outliers. Table 3 contains the results regarding the control limits, the central line, the number of points falling outside the control limits, the length, and the state of control of the process for the Shewhart (traditional) method, the method of Langenberg and Iglewicz (12) with one trimming rate, and the proposed method. Figure 2 is a combined control chart showing the control limits for the Shewhart method based on the range, the Langenberg and Iglewicz (12) method, and the proposed method based on the HL and SBL estimators.

Table 3. The Second Numerical Example: Comparison of Control Charts

METHOD                       LCL      CL       UCL     LENGTH   OUT   STATE
Shewhart based on R/d2      219.5    264.1    308.6    89.1     0    In control
Shewhart based on S/c4      220.9    264.1    307.2    86.3     0    In control
Langenberg-Iglewicz (10%)   219.1    264.6    310.0    90.9     0    In control
Proposed                    218.6    263.3    308.0    89.4     0    In control
Figure 2. The combined control chart for the soft drink bottle bursting-strength data (horizontal axis: subgroup number). Dotted line: traditional method; dashed line: Langenberg-Iglewicz method; solid line: the proposed method; solid circle: HL value; open circle: mean value.

The third example is taken from Ref. 13 (p. 266). The data deal with the high-voltage power supply. As a part of the study of the process, 20 subgroups of 4 observations each were taken. The data were tested and found to be non-Normal with no outliers. Table 4 contains the results regarding the control limits, the central line, the number of points falling outside the control limits, the length, and the state of control of the process for the Shewhart (traditional) method, the method of Langenberg and Iglewicz (12) with one trimming rate, and the proposed method. Figure 3 is a combined control chart showing the control limits for the Shewhart method based on the range, the Langenberg and Iglewicz (12) method, and the proposed method based on the HL and SBL estimators.

Table 4. The Third Numerical Example: Comparison of Control Charts

METHOD                       LCL      CL       UCL     LENGTH   OUT   STATE
Shewhart based on R/d2       5.82    10.38    14.93    9.11     0    In control
Shewhart based on S/c4       6.16    10.38    14.59    8.43     0    In control
Langenberg-Iglewicz (10%)    5.67    10.36    15.05    9.38     0    In control
Proposed                     5.75    10.29    14.83    9.08     0    In control

Figure 3. The combined control chart for the high-voltage power-supply data (horizontal axis: subgroup number). Dotted line: traditional method; dashed line: Langenberg-Iglewicz method; solid line: the proposed method; solid circle: HL value; open circle: mean value.

The fourth example is taken from Ref. 14 (p. 207). It deals with the melt index of an extrusion-grade polyethylene compound. As a part of the study of the process, 20 subgroups of 4 observations each were taken. The data were tested and found to be non-Normal with outliers. Table 5 contains the results regarding the control limits, the central line, the number of points falling outside the control limits, the length, and the state of control of the process for the Shewhart (traditional) method, the method of Langenberg and Iglewicz (12) with one trimming rate, and the proposed method. Figure 4 is a combined control chart showing the control limits for the Shewhart method based on the range, the Langenberg and Iglewicz (12) method, and the proposed method based on the HL and SBL estimators.

Table 5. The Fourth Numerical Example: Comparison of Control Charts

METHOD                       LCL      CL       UCL     LENGTH   OUT   STATE
Shewhart based on R/d2      221.4    235.0    248.7    27.3     0    In control
Shewhart based on S/c4      219.7    235.0    250.4    30.7     0    In control
Langenberg-Iglewicz (10%)   222.5    234.9    247.2    24.7     2    Out of control
Proposed                    220.8    234.5    248.2    27.4     1    Out of control

Figure 4. The combined control chart for the melt index of extrusion-grade polyethylene compound data (horizontal axis: subgroup number). Dotted line: traditional method; dashed line: Langenberg-Iglewicz method; solid line: the proposed method; solid circle: HL value; open circle: mean value.

From the results of the above numerical examples, we notice that when the data come from a Normal distribution with no outliers, the proposed method based on the HL and SBL estimators leads to wider control limits than the traditional method and the robust method of Langenberg and Iglewicz (12). The control limits for the robust method of Langenberg and Iglewicz (12) are very close to the traditional control limits. For the Normal data with outliers, the robust methods still lead to wider control limits and are not affected by outliers like the traditional methods. For non-Normal data with or without outliers, the proposed method in general leads to narrower control limits than the traditional method but wider control limits than the robust method of Langenberg and Iglewicz (12) for the case of non-Normal data with outliers.
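For readers who want to reproduce the "Shewhart based on S/c4" rows used as the baseline in the comparison tables, the following sketch computes the usual three-sigma limits for subgroup means from the average subgroup standard deviation. It is an illustrative sketch only: the function names and the small data set are assumptions, and the unbiasing constant c4 is computed from its standard gamma-function expression rather than read from a table.

```python
from math import gamma, sqrt
from statistics import mean, stdev


def c4(n):
    """Unbiasing constant for the sample standard deviation of n Normal values."""
    return sqrt(2.0 / (n - 1)) * gamma(n / 2.0) / gamma((n - 1) / 2.0)


def shewhart_s_limits(subgroups):
    """Three-sigma limits for subgroup means based on S-bar / c4."""
    n = len(subgroups[0])
    center = mean(mean(g) for g in subgroups)      # grand mean
    s_bar = mean(stdev(g) for g in subgroups)      # average subgroup S
    half_width = 3.0 * s_bar / (c4(n) * sqrt(n))
    return center - half_width, center, center + half_width


# Hypothetical data: two subgroups of size 3.
lcl, cl, ucl = shewhart_s_limits([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
print(round(c4(5), 4))          # 0.94
print(cl, round(ucl - lcl, 4))  # 2.5 3.9088
```

The LENGTH column in the tables is simply UCL minus LCL, so the same difference can be computed for any method once its limits are known.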
The Performance of the Proposed Method
In this section, the performance of the proposed robust control chart is investigated using a series of simulations for short-tailed, Normal, and heavy-tailed distributions. Performance measures were collected for the Shewhart method, the robust method of Langenberg and Iglewicz (12), and the proposed method for the U(0, 1), Normal, logistic, double exponential (DE), 10% contaminated Normal, CN(0.1, 3), 20% contaminated Normal, CN(0.2, 3), and Cauchy distributions. For each distribution, 1000 trials of 20 subgroups of size 5 and 10 were run. The average statistics presented are based on the 1000 trials. Table 6 gives the average centerline values. The Shewhart method is denoted by $\bar{X}$. The method of Langenberg and Iglewicz is denoted by $\bar{X}_{(\alpha)}$, with 10% trimming as recommended. The proposed method is denoted by HL.

In judging the performance of location estimators, it is reasonable to select as the best an estimator that comes closest to the true population center value. In Table 6, the value closest to 0 is considered the best. It seems reasonable to disregard the results for the Normal distribution, as the mean and standard deviation are optimal in this case. With a few exceptions, the proposed method comes closest to the true center value of 0 when compared to the traditional method and the other robust method. Table 7 shows the variance of the various statistics for location used in Table 6. Again, with a few exceptions, the proposed method has the lowest variance when compared with the methods of Shewhart and Langenberg and Iglewicz, especially for the heavy-tailed distributions. Hence, based on the results in Tables 6 and 7, the HL estimator computed for each subgroup is a good robust measure of location, because it comes closest to the true value and has minimum variation for most of the underlying distributions considered.

Table 6. Average Values of Location Statistics for 1000 Control Charts with 20 Subgroups of Size n
(columns: $\bar{X}$, HL, and $\bar{X}_{(\alpha)}$ with $\alpha = 0.10$, each for n = 5 and n = 10; only the first column of values survives)

TYPE OF DISTRIBUTION    $\bar{X}$, n = 5
U(0, 1)                  0.5001
Normal                   0.0026
Logistic                -0.0003
DE                      -0.0006
CN(0.1, 3)               0.0106
CN(0.2, 3)               0.0072
Cauchy                  -1.0187

Table 7. Variance of the Center Values Presented in Table 6

                        $\bar{X}$               HL                  $\bar{X}_{(\alpha)}$, $\alpha$ = 0.10
TYPE OF DISTRIBUTION    n = 5      n = 10      n = 5     n = 10    n = 5     n = 10
U(0, 1)                 0.0014     0.0015      0.0014    0.0016    0.0010    0.0005
Normal                  0.0095     0.0051      0.0103    0.0055    0.0101    0.0053
Logistic                0.0104     0.0051      0.0103    0.0051    0.0109    0.0053
DE                      0.0104     0.0050      0.0089    0.0041    0.01??    0.0052
CN(0.1, 3)              0.0187     0.0083      0.0149    0.0067    0.0167    0.0083
CN(0.2, 3)              0.0270     0.0122      0.0203    0.0088    0.0249    0.0123
Cauchy                  608.2340   537.6518    0.2021    0.0305    0.5338    0.4476

Table 8 is given for completeness. It contains the mean values of the range (R), the standard deviation (S), the SBL estimator, and the trimmed range ($R_{(\alpha)}$). These values are used in determining the control limit values given in Table 10. It is not proper to compare the measures directly, because each measures something different. Table 9 contains the variance of the measures of dispersion given in Table 8. The standard deviation (S) and the SBL estimator of the proposed method are very close in their values, and both have the least variation among the estimators for all sample sizes considered across all distributions. In some situations, the SBL estimator has less variation than the standard deviation (S), especially for the worst case (i.e., the Cauchy distribution).

Table 8. Mean Values for Dispersion

                        S                  R                  SBL               $R_{(\alpha)}$, $\alpha$ = 0.10
TYPE OF DISTRIBUTION    n = 5    n = 10    n = 5    n = 10    n = 5    n = 10   n = 5    n = 10
U(0, 1)                 0.278    0.285     0.667    0.819     0.324    0.306    0.475    0.828
Normal                  0.944    0.975     2.334    3.088     1.053    1.001    2.299    3.062
Logistic                0.927    0.964     2.307    3.127     1.010    0.956    2.241    3.066
DE                      0.903    0.949     2.260    3.171     0.940    0.881    2.163    3.085
CN(0.1, 3)              1.186    1.259     2.948    4.154     1.231    1.163    2.760    3.941
CN(0.2, 3)              1.418    1.515     3.540    5.087     1.439    1.349    3.321    4.881
Cauchy                  16.078   20.879    37.029   68.126    3.197    2.678    10.123   20.476

Table 9. Variance of Statistics Given in Table 8

                        S                       R                        SBL               $R_{(\alpha)}$, $\alpha$ = 0.10
TYPE OF DISTRIBUTION    n = 5       n = 10      n = 5        n = 10      n = 5    n = 10   n = 5    n = 10
U(0, 1)                 0.000       0.000       0.002        0.001       0.001    0.000    0.002    0.001
Normal                  0.006       0.003       0.037        0.032       0.010    0.004    0.041    0.035
Logistic                0.008       0.004       0.051        0.053       0.010    0.004    0.051    0.054
DE                      0.010       0.006       0.068        0.079       0.012    0.005    0.066    0.078
CN(0.1, 3)              0.019       0.012       0.119        0.166       0.018    0.007    0.101    0.156
CN(0.2, 3)              0.028       0.018       0.183        0.259       0.030    0.012    0.178    0.263
Cauchy                  2,986.700   5,454.000   14,942.900   54,564.000  1.171    0.260    18.834   106.500

Table 10 presents the control interval width for the two sample sizes and the various distributions studied. These values are based on Table 8. Table 11 presents the number of points falling outside the control limits for each method, where an additional 1000 subgroups of size 5 and 10 were generated and the number of points among them falling outside the control limits was counted. For the short-tailed distributions, the proposed method leads to a wider interval than the two traditional methods and fewer points outside the control limits. Also, it leads to wider control limits than the other robust method and more points outside the control limits, especially for large sample sizes. This number becomes approximately equal for the different methods as the sample size increases. If the distribution is Normal, then one can do no better than to use the Shewhart approach. However, it should be noted that as the tail weight of the distribution increases, both robust methods perform better than the traditional methods. Finally, although both robust methods perform better for heavy-tailed distributions, the proposed method based on the HL and SBL estimators appears to be better as sample size increases and as the tails become heavier.

Table 10. Control Interval Width for the Traditional and Robust Methods
(the Langenberg-Iglewicz columns for n = 5 and n = 10 do not survive)

                        $\bar{X}$ : S          HL : SBL
TYPE OF DISTRIBUTION    n = 5     n = 10      n = 5    n = 10
U(0, 1)                 0.794     0.556       0.862    0.600
Normal                  2.693     2.456       2.802    1.965
Logistic                2.646     1.881       2.689    1.876
DE                      2.576     1.850       2.502    1.730
CN(0.1, 3)              3.384     2.456       3.281    2.284
CN(0.2, 3)              4.046     2.955       3.831    2.648
Cauchy                  45.890    40.710      8.510    5.258

Table 11. Number of Points Falling Outside the Control Limits for the Traditional and Robust Methods
(columns: $\bar{X}$ : S, $\bar{X}$ : R, HL : SBL, and Langenberg-Iglewicz, each for n = 5 and n = 10; rows: U(0, 1), Normal, Logistic, DE, CN(0.1, 3), CN(0.2, 3), Cauchy; the values do not survive)
The Average-Run-Length Simulation Study
In this section, the in-control average run length ($ARL_0$) and the out-of-control average run length ($ARL_1$) are calculated for the robust and Shewhart control charts. The $ARL_0$s and $ARL_1$s are simulated under the assumption of normality. The tabled values for the $ARL_0$ were obtained by generating observations from the Normal distribution. Sets of m = 20 subgroups, with each subgroup consisting of n = 5 and 10 observations, respectively, were generated from the N(0, 1) distribution. The control limits for both control charts were constructed. After determining the control limits, random N(0, 1) samples of size n = 5 and 10 were generated. The HL and $\bar{X}$ statistics were computed for each sample and compared to the control limits. The control limits for the $ARL_1$ are based on N(0, 1), and the observations used to calculate the statistics HL and $\bar{X}$ are from a Normal distribution with mean $\delta$ and $\sigma = 1$, where $\delta$ represents the size of the shift in the mean. This size of shift is measured in terms of the process standard deviation. Hence, the size of the shift is defined as

$$\delta = \frac{\mu - \mu_0}{\sigma}, \qquad (7)$$

where $\mu_0$ is the in-control process mean, $\mu$ is the process mean after the shift, and $\sigma$ is the process standard deviation. Shifts of size $\delta$ = 0.25, 0.50, 0.75, and 1.00 are considered. The number of samples required for the values of the HL and $\bar{X}$ estimators to exceed the control limits was recorded as a run-length observation, $RL_i$. For runs not signaling by the 25,000th sample, the run length was reported as 25,000. This process was repeated 1000 times, and both average run lengths are calculated as follows:

$$ARL = \frac{\sum_{i=1}^{1000} RL_i}{1000}.$$

The standard errors for each value are also calculated and shown in parentheses. The results for this simulation study are given in Tables 12 and 13. From Tables 12 and 13, we notice that as n increases from 5 to 10, the $ARL_0$ decreases from 564.702 to 345.346 for the robust method and from 474.45 to 369.216 for the traditional method. Note that the exact in-control ARL value of the Shewhart $\bar{X}$ control chart is 370.4. In general, the performance of the two methods is very close, especially when the sample size increases, but it is still better for the Shewhart $\bar{X}$ control chart. For example, whereas the average number of samples required for the $\bar{X}$ control chart in the case of n = 10 to detect a shift of size $\delta$ = 1 is less than two, the robust control chart for the same sample size and the same shift also requires less than two samples.

Table 12. In-Control and Out-of-Control ARLs of the Robust and the $\bar{X}$ Control Charts for n = 5

                 SHIFT SIZE, $\delta$
CONTROL CHART    0.00              0.25              0.50             0.75             1.00
Robust           564.702 (1.274)   262.908 (0.608)   60.994 (0.199)   16.233 (0.029)   6.319 (0.009)
$\bar{X}$        474.45 (1.096)    218.523 (0.499)   44.418 (0.079)   12.831 (0.019)   5.100 (0.006)

Table 13. In-Control and Out-of-Control ARLs of the Robust and the $\bar{X}$ Control Charts for n = 10

                 SHIFT SIZE, $\delta$
CONTROL CHART    0.00              0.25              0.50             0.75             1.00
Robust           345.346 (0.474)   99.991 (0.144)    16.511 (0.019)   4.337 (0.004)    1.947 (0.001)
$\bar{X}$        369.216 (0.546)   100.593 (0.165)   15.043 (0.019)   4.038 (0.004)    1.819 (0.001)
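The exact value of 370.4 quoted for the Shewhart chart can be checked with a short calculation. The sketch below assumes the idealized case in which the in-control mean and standard deviation are known, so that successive samples signal independently with a fixed probability and the ARL is the reciprocal of that probability. The simulated values in Tables 12 and 13 use limits estimated from 20 subgroups, so they are not expected to match these figures exactly. The function names are illustrative.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def xbar_arl(delta, n, width=3.0):
    """ARL of an X-bar chart with known parameters and +/- width sigma limits,
    for a mean shift of delta process standard deviations."""
    shift = delta * sqrt(n)  # shift in units of the standard error of the mean
    p_signal = 1.0 - normal_cdf(width - shift) + normal_cdf(-width - shift)
    return 1.0 / p_signal


print(round(xbar_arl(0.0, 5), 1))   # 370.4
print(xbar_arl(1.0, 10) < 2.0)      # True
```

The second line of output agrees with the observation in the text that a shift of one standard deviation is detected in fewer than two samples on average when n = 10.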
Conclusions
As stated by Clifford (15), there is no universally best control chart. The choice of a control chart depends on the process, personnel, instrumentation, and the importance of the problem. In conclusion, the advantages of a robust control chart outweigh the disadvantage with respect to statistical efficiency. This article has presented a new univariate robust control chart for location and the necessary table of factors for computing the control limits. It has good properties for heavy-tailed distributions and moderate sample sizes. It compares favorably with traditional control charts if the distribution is Normal. The control limits given by Langenberg and Iglewicz (12) are selected so that the limits will be the same as those of the Shewhart method if the distribution is Normal. True robust limits should be slightly larger in the case of Normal distributions; see Ref. 16. Hence, their approach is not fully robust in the formal sense of the word. Also, the proposed method of this article closely represents the philosophy of gross error occurring during each subgroup or actual distributions with heavy tails. The robust method of Langenberg and Iglewicz (12) appears to assume the error only at the overall chart level when determining control limits. The philosophy of the proposed robust control chart is more in keeping with the desire to provide robust limits in the face of non-Normal distributions or errors in data collection. In addition to providing protection against data errors or outliers, the proposed method also gives a better performance than the traditional control charts if the underlying distribution of chance causes is not Normal. Both of these features are usually desirable for any control chart to be applied in industry.
Acknowledgments The authors thank the referee for his suggestions and also special thanks go to the Department of Statistics at the National University of Malaysia.
References
1. Rocke, D. M., $\bar{X}_Q$ and $R_Q$ Charts: Robust Control Charts, Statistician, 41, 97-104 (1992).
2. Hampel, F. R., Ronchetti, E. M., Rousseeuw, P. J., and Stahel, W. A., Robust Statistics: The Approach Based on Influence Functions, John Wiley & Sons, New York, 1986.
3. Hodges, J. L., Jr. and Lehmann, E. L., Estimates of Location Based on Rank Tests, Ann. Math. Statist., 34, 598-611 (1963).
4. Lehmann, E. L., Theory of Point Estimation, John Wiley & Sons, New York, 1983.
5. Hampel, F. R., The Robustness of Some Nonparametric Procedures, in A Festschrift for Erich L. Lehmann, edited by P. J. Bickel, K. A. Doksum, and J. L. Hodges, Wadsworth International Group, Belmont, CA, 1983, pp. 209-238.
6. Hodges, J. L., Jr., Efficiency in Normal Samples and Tolerance of Extreme Values for Some Estimates of Location, in Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1, edited by L. M. Le Cam and J. Neyman, University of California Press, Berkeley, 1967, pp. 163-186.
7. Randles, R. H. and Wolfe, D. A., Introduction to the Theory of Nonparametric Statistics, John Wiley & Sons, New York, 1979.
8. Shamos, M. I., Geometry and Statistics: Problems at the Interface, in New Directions and Recent Results in Algorithms and Complexity, edited by J. F. Traub, Academic Press, New York, 1976, pp. 251-280.
9. Bickel, P. J. and Lehmann, E. L., Descriptive Statistics for Nonparametric Models III: Dispersion, in Contributions to Statistics, Hájek Memorial Volume, edited by J. Jurečková, Academia, Prague, 1979, pp. 33-40.
10. Rousseeuw, P. J. and Croux, C., Alternatives to the Median Absolute Deviation, Technical Report 91-43, Department of Mathematics, University of Antwerp, UIA (1991).
11. Rousseeuw, P. J., private communication, E-mail: rousse@uia.ua.ac.be, Department of Mathematics, University of Antwerp, UIA, 1998.
12. Langenberg, P. and Iglewicz, B., Trimmed Mean $\bar{X}$ and R Charts, J. Qual. Technol., 18(3), 152-161 (1986).
13. Montgomery, D. C., Introduction to Statistical Quality Control, 2nd ed., John Wiley & Sons, Singapore, 1991.
14. Wadsworth, H. M., Stephens, K. S., and Godfrey, A. B., Modern Methods for Quality Control and Improvement, John Wiley & Sons, New York, 1986.
15. Clifford, P. C., Control Charts Without Calculations: Some Modifications and Extensions, Ind. Qual. Control, 15(11), 40-44 (1959).
16. Iglewicz, B., Robust Scale Estimators and Confidence Intervals for Location, in Understanding Robust and Exploratory Data Analysis, edited by D. C. Hoaglin, F. Mosteller, and J. W. Tukey, John Wiley & Sons, New York, 1983, pp. 405-431.
About the Authors: Moustafa O. Abu-Shawiesh received his B.Sc. in statistics from Yarmouk University (Jordan) in 1990, his M.Sc. in statistics from Yarmouk University (Jordan) in 1994, and his Ph.D. in statistics from the National University of Malaysia (Malaysia) in 1999. The author's research interests center on statistical process control (SPC), robust statistics, time series, and forecasting methods. Mokhtar B. Abdullah received his Ph.D. in statistics from Dundee University (U.K.) in 1987. The author's research interests center on environmental statistics, robust statistics, regression analysis, statistical process control (SPC), and econometrics. He has published many articles on these and related topics in Communication in Statistics, Environmental Statistics, and others. He has published and co-authored several books in statistics. He is a member of the American Statistical Association, the American Society for Quality, the Malaysian Institute of Statistics, the International Society of Environmetrics, and the International Association of Statistical Computing.