Feb 26, 2018 - In this paper, Shewhart-type nonparametric control chart based on run statistic is developed for monitoring the known process location.
Communications in Statistics - Theory and Methods
ISSN: 0361-0926 (Print) 1532-415X (Online) Journal homepage: http://www.tandfonline.com/loi/lsta20
Shewhart-type nonparametric control chart for process location D. M. Zombade & V. B. Ghute To cite this article: D. M. Zombade & V. B. Ghute (2018): Shewhart-type nonparametric control chart for process location, Communications in Statistics - Theory and Methods, DOI: 10.1080/03610926.2018.1435811 To link to this article: https://doi.org/10.1080/03610926.2018.1435811
Published online: 26 Feb 2018.
Shewhart-type nonparametric control chart for process location
D. M. Zombade (a) and V. B. Ghute (b)
(a) Department of Statistics, Walchand College of Arts and Science, Solapur, (MS), India; (b) Department of Statistics, Solapur University, Solapur, (MS), India
ABSTRACT
In this paper, a Shewhart-type nonparametric control chart based on the run statistic is developed for monitoring the known process location of a continuous process distribution. The performance of the proposed nonparametric chart is investigated in a simulation study and compared with the sign chart, the signed-rank chart and the X̄ chart under normal and non-normal process distributions, using ARL and SDRL as performance measures. The proposed nonparametric chart performs better than the X̄ chart for detecting shifts in process location under heavy-tailed distributions and is a close competitor to the sign chart for normal and non-normal process distributions.
ARTICLE HISTORY
Received June
Accepted January
KEYWORDS
Average run length; control chart; run test; sign test; signed-rank test.
CONTACT V. B. Ghute, vbghute_stats@rediffmail.com, Department of Statistics, Solapur University, Solapur, (MS), India.
© Taylor & Francis Group, LLC

1. Introduction
Shewhart-type control charts for location are widely used to determine whether a process is in control, to bring an out-of-control process into control, and to monitor a process so that it stays in control. Univariate control charts are used to monitor processes that manufacture products with a single quality characteristic of interest. Most control charts are based on the assumption that the underlying distribution of the process is normal. In reality, this assumption may not hold in all situations. Therefore, nonparametric control charts, which do not depend on the assumption of normality, are needed for monitoring such processes.

A formal definition of a nonparametric or distribution-free control chart is given in terms of its in-control run-length distribution. The number of samples that must be collected before the first out-of-control signal is given by a chart is a random variable called the run length; the probability distribution of the run length is referred to as the run-length distribution. If the in-control run-length distribution is the same for every continuous distribution, the chart is called distribution-free or nonparametric (Chakraborti, Van Der Laan, and Bakir (2001)). The location and scale of a process are the two main parameters monitored by nonparametric control charts. The problem of monitoring the location of a process is important in many applications. The location parameter could be the mean, the median or some percentile of the distribution.

In the literature, several nonparametric control charts have been proposed for monitoring the location parameter of a univariate process. Amin, Reynolds, and Bakir (1995) developed Shewhart and CUSUM control charts based on the sign test statistic. Chakraborti et al. (2001, 2007) presented an extensive overview of the literature on univariate nonparametric control charts. Bakir (2004) proposed a nonparametric Shewhart-type control chart for monitoring the median of a continuous symmetric distribution using the Wilcoxon signed-rank statistic. Chakraborti and Eryilmaz (2007) further improved the performance of the chart due to Bakir (2004) by introducing a 2-of-2 runs rule. Chakraborti and Van de Wiel (2008) developed a control chart based on the Mann-Whitney statistic for detecting location shifts. Graham, Human, and Chakraborti (2010) proposed a nonparametric Shewhart-type control chart based on the median for monitoring the location of a continuous variable in a Phase I process control setting. Human, Chakraborti, and Smith (2010) developed a class of nonparametric Shewhart-type control charts based on the sign statistic using runs rules. Khilare and Shirke (2010) developed a nonparametric synthetic control chart based on the sign statistic to monitor shifts in process location. Pawar and Shirke (2010) developed a nonparametric Shewhart-type synthetic control chart based on the signed-rank statistic to monitor shifts in the known in-control process location. Kritzinger, Human, and Chakraborti (2014) developed an improved Shewhart-type runs-rules nonparametric sign chart to detect large shifts quickly.

For monitoring multivariate process location, some nonparametric control charts based on sign and signed-rank statistics are also available in the literature. Das (2009) proposed a multivariate nonparametric control chart based on the bivariate sign test. Boone and Chakraborti (2011) proposed two Shewhart-type multivariate nonparametric control charts based on multivariate forms of the sign and signed-rank tests. Ghute and Shirke (2012a) developed a nonparametric synthetic control chart based on the bivariate signed-rank test to monitor changes in the location of a bivariate process. Ghute and Shirke (2012b) also developed a nonparametric synthetic control chart based on the bivariate sign test to monitor changes in the location of a bivariate process.
The purpose of this paper is to develop a nonparametric control chart for monitoring the known process location of a continuous symmetric process distribution. In practice the location of the distribution under study is usually unknown and needs to be estimated from preliminary samples taken when the process is assumed to be in control. When only shifts in one direction (up or down) are of interest, a one-sided control chart is desirable; a two-sided control chart is suitable when a shift in either direction is of concern. In this paper we focus on the positive-sided chart, in which upward shifts in the location are of interest. The negative-sided case can be treated in a similar way. When the process distribution is normal, the Shewhart X̄ chart is an appropriate control chart for monitoring the process mean. If the underlying process distribution is non-normal, a nonparametric control chart based on an appropriate nonparametric test can be considered. Many nonparametric tests, such as the sign and signed-rank tests, have been proposed in the literature. In this paper, we introduce a nonparametric Shewhart-type control chart based on the run statistic, called the NP-R chart. The proposed NP-R chart for monitoring the process location is based on runs computed within samples. As noted, the process data are assumed to have a symmetric distribution; hence the goal of this study is to develop a positive-sided control chart for monitoring the process location that works for non-normal symmetric distributions. It is well known (see Gibbons and Chakraborti 2003) that nonparametric statistical tests can be more efficient than their parametric counterparts under skewed or heavy-tailed distributions. Thus we would expect the proposed NP-R chart to perform better than the parametric X̄ chart in some situations. The rest of the paper is organized as follows.
Section 2 provides a brief introduction to existing Shewhart-type nonparametric control charts based on the sign and signed-rank tests for monitoring process location. The proposed NP-R chart for monitoring process location based on the run test statistic is introduced in Section 3. The performance of the proposed chart is evaluated in Section 4 and compared with other charts in Section 5. In Section 6, a numerical example is given to illustrate the charting procedure of the proposed NP-R chart. Robustness of the proposed control chart against contamination by outliers is discussed in Section 7. Some conclusions are given in Section 8.
2. Nonparametric sign and signed-rank charts for location
Let $(X_{i1}, X_{i2}, \ldots, X_{in})$ denote a random sample (rational subgroup) of size n > 1 taken at sampling stage i = 1, 2, 3, .... Assume that the samples are independent and that the observations come from a continuous distribution with cumulative distribution function F and location θ. Amin et al. (1995) developed a Shewhart-type sign control chart (referred to as the NP-S chart) for monitoring the median, based on the charting statistic $SN_i = \sum_{j=1}^{n} \mathrm{sign}(X_{ij} - \theta_0)$, where sign(t) = 1 if t > 0, 0 if t = 0 and −1 if t < 0, and θ0 denotes the specified value of the median. The sign chart is based on the sign test statistic

$$T_i = \sum_{j=1}^{n} I(X_{ij} > \theta_0) \qquad (1)$$

where θ0 denotes the specified value of the median and $I(X_{ij} > \theta_0)$ denotes the usual indicator (1 or 0) function for the event $\{X_{ij} > \theta_0\}$. Thus, $T_i$ denotes the number of observations larger than θ0 in the ith sample, and it follows a binomial distribution with parameters n and probability of success $p = P(X_{ij} > \theta_0)$. The statistic $T_i$ is the control statistic of the nonparametric sign chart for monitoring the process median. The upper and lower control limits of the NP-S chart are given by

$$UCL = n - b \quad \text{and} \quad LCL = a \qquad (2)$$

where the charting constants a and b are integers between 0 and n (inclusive).

Let $R^{+}_{ij}$ denote the rank of $|X_{ij} - \theta_0|$ among the absolute differences $|X_{ij} - \theta_0|$, j = 1, 2, ..., n, within the ith subgroup. Define $SR_i = \sum_{j=1}^{n} \mathrm{sign}(X_{ij} - \theta_0)\, R^{+}_{ij}$, i = 1, 2, ..., where θ0 is the known or specified value of the median θ that is to be monitored. Note the relationship $SR_i = 2T_i^{+} - \frac{n(n+1)}{2}$, where $T_i^{+}$ is the well-known Wilcoxon signed-rank statistic. Bakir (2004) developed a control chart based on the charting statistic SR (referred to as the NP-SR chart) for monitoring the process location θ. The chart gives an out-of-control signal at the first sampling instance i for which $SR_i \geq UCL$, where UCL is the upper control limit corresponding to a positive-sided control chart.
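The two charting statistics of this section can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the five-point subgroup are hypothetical, and the code assumes a continuous distribution, so that no observation equals θ0 and there are no ties among the absolute differences.

```python
def sign_statistics(sample, theta0):
    """Return (T, SN): the count above theta0 and the sum of signs."""
    t = sum(1 for x in sample if x > theta0)
    sn = sum((x > theta0) - (x < theta0) for x in sample)
    return t, sn


def signed_rank_statistic(sample, theta0):
    """Return SR: the sum of signed ranks of |x - theta0| (no ties assumed)."""
    diffs = [x - theta0 for x in sample]
    # Indices ordered by increasing absolute difference; rank 1 is the smallest.
    order = sorted(range(len(diffs)), key=lambda j: abs(diffs[j]))
    sr = 0
    for rank, j in enumerate(order, start=1):
        sr += rank if diffs[j] > 0 else -rank
    return sr


sample = [0.5, -1.0, 2.0, -0.2, 3.0]  # hypothetical subgroup, n = 5
t, sn = sign_statistics(sample, 0.0)
sr = signed_rank_statistic(sample, 0.0)
print(t, sn, sr)  # prints: 3 1 7
```

For this subgroup the positive ranks are 2, 4 and 5, so the Wilcoxon statistic is 11 and 2(11) − 5(6)/2 = 7, in agreement with the relationship stated above.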
3. Control chart based on run statistic
In this section, the basic theory of the run test is described; see Varon (2010). Let $X_1, X_2, \ldots, X_n$ be a subgroup sample of size n > 1 from a distribution with location θ and standard deviation σ. It is assumed that these observations are independent and have a continuous distribution symmetric about the median θ. Let θ0 denote the in-control value of the process median, which is either known or estimated at the end of Phase I process control. Without loss of generality, we assume that θ0 = 0 and σ0 = 1. We are interested in detecting shifts in the process median θ. A test of the hypothesis H0: θ = 0 versus H1: θ > 0 based on runs has been studied by Varon (2010). A run is defined as a succession of one or more identical symbols which are followed and preceded by a different symbol or by no symbol at all. At each inspection point, a nonparametric run statistic is computed from a subgroup sample $X_1, X_2, \ldots, X_n$. For the construction of runs, the variable $\eta_j$ is defined as

$$\eta_j = S(X_{D_j}) = \begin{cases} 1, & \text{if } X_{D_j} > 0 \\ 0, & \text{otherwise} \end{cases}, \quad j = 1, 2, 3, \ldots, n \qquad (3)$$

where $D_j$ is the antirank of $|X|_{(j)}$, such that $|X_{D_j}| = |X|_{(j)}$. Hence $D_j$ labels the X which corresponds to the jth ordered absolute value. The sequence $\eta_1, \eta_2, \ldots, \eta_n$ is then a dichotomized sequence. The changes in the dichotomized sequence are identified with the following indicators. Define $I_1 = 1$ and

$$I_j = \begin{cases} 1, & \text{if } \eta_{j-1} \neq \eta_j \\ 0, & \text{if } \eta_{j-1} = \eta_j \end{cases}, \quad j = 2, 3, \ldots, n. \qquad (4)$$

The number of runs up to the ith element of the dichotomized sequence is obtained through the partial sums

$$r_i = \sum_{j=1}^{i} I_j, \quad i = 1, 2, \ldots, n. \qquad (5)$$

Naturally $r_i \leq r_j$ for i < j, and $r_n$ is the total number of runs in the sequence. The test statistic based on runs is given as

$$R = \frac{1}{r_n} \sum_{j=1}^{n} \delta_j r_j \qquad (6)$$

$$\text{where } \delta_j = \begin{cases} 1, & \text{if } \eta_j = 1 \\ -1, & \text{if } \eta_j = 0 \end{cases}, \quad j = 1, 2, 3, \ldots, n. \qquad (7)$$
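Equations (3)–(7) translate directly into code. The following is a minimal sketch rather than the authors' program (which was written in C); the function name and the example subgroup are hypothetical, and ties among the absolute values are assumed not to occur.

```python
def run_statistic(sample, theta0=0.0):
    """Run statistic R of Equation (6) for one subgroup."""
    x = [v - theta0 for v in sample]
    # Antiranks D_j: indices of x ordered by increasing absolute value.
    antiranks = sorted(range(len(x)), key=lambda j: abs(x[j]))
    # Equation (3): dichotomized sequence eta_1, ..., eta_n.
    eta = [1 if x[d] > 0 else 0 for d in antiranks]
    # Equations (4)-(5): r_j = number of runs up to position j (I_1 = 1).
    r = []
    runs = 0
    for j, e in enumerate(eta):
        if j == 0 or e != eta[j - 1]:
            runs += 1
        r.append(runs)
    # Equations (6)-(7): signed sum of partial run counts, scaled by 1 / r_n.
    total = sum((1 if e == 1 else -1) * rj for e, rj in zip(eta, r))
    return total / r[-1]


# Hypothetical subgroup: ordered by |x| the signs are -, +, -, +, +,
# so eta = [0, 1, 0, 1, 1], r = [1, 2, 3, 4, 4] and R = 6 / 4.
print(run_statistic([0.5, -1.0, 2.0, -0.2, 3.0]))  # prints: 1.5
```

When every observation lies above θ0 there is a single run of ones and R equals n; when every observation lies below θ0, R equals −n.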
Note that R accumulates the number of runs up to each element of the dichotomized sequence, increasing when $\eta_j = 1$ ($\delta_j = 1$, runs of ones) and decreasing when $\eta_j = 0$ ($\delta_j = -1$, runs of zeros). A large value of R indicates a greater number of runs of ones, which is an indication that θ > 0. The inverse of the total number of runs, $1/r_n$, is used as a standardizing factor. The statistic R takes values between −n and n. Large values of R indicate a positive shift, whereas small values indicate a negative shift. For θ > 0, R is expected to take large positive values, so H0 is rejected for large values of R. That is, H0: θ = 0 is rejected in favor of H1: θ > 0 at level of significance α if $R \geq r_{1-\alpha}$, where the critical value $r_{1-\alpha}$ is determined so that $P_{H_0}(R \geq r_{1-\alpha}) = \alpha$.

The proposed NP-R chart uses the run test statistic $R_t$ as the charting statistic, which is the run statistic R for the tth sample. The in-control and out-of-control run-length distributions and their associated characteristics, such as the average run length, the standard deviation of the run length and the median run length, are necessary to design the control chart. Implementation of the control chart also requires control limits, for which knowledge of the statistical distribution of the charting statistic is needed. If the exact distribution of the charting statistic is unknown or intractable, the control limits can be calculated either from an approximate distribution or by Monte Carlo simulation. The distribution of the run statistic R provided by Varon (2010) is not close to any standard distribution. Accordingly, it is difficult to compute the control limits and run-length properties of the chart analytically. To overcome this problem, we use Monte Carlo simulation with 10,000 simulations to derive the necessary results for the proposed NP-R chart.

In the proposed NP-R chart for the process median θ0, we are interested in detecting upward shifts; therefore only the UCL is required. The UCL of the chart is determined such that a desired in-control ARL (ARL0) is achieved. The UCLs are found for specified ARL0 values of 200, 340 and 500 for subgroup sample sizes n = 10 and 15. The NP-R chart consists of plotting the charting statistic $R_t$ sequentially and checking it against the UCL at every instance t. The chart gives an out-of-control signal when the charting statistic falls above the UCL, and we then declare the process to be out-of-control with the median shifted upwards.

Steps for upper one-sided NP-R chart
1. Take a subgroup sample X1, X2, ..., Xn of size n at each inspection point.
2. Compute the control statistic R.
3. Set up the upper control limit UCL for the specified ARL0.
4. Plot R on the chart.
5. If any point goes beyond the UCL, the process is considered to be out-of-control, which indicates a shift in the process median.
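The charting steps above, together with the Monte Carlo evaluation of the run length, can be sketched as follows. This is an illustrative sketch, not the authors' C program: the function names, the seed, the cap `max_samples` and the choice of normal data are assumptions made here, and a signal is taken to mean R strictly above the UCL.

```python
import random


def run_statistic(sample, theta0=0.0):
    """Run statistic R of Equation (6) for one subgroup (no ties assumed)."""
    x = [v - theta0 for v in sample]
    order = sorted(range(len(x)), key=lambda j: abs(x[j]))
    eta = [1 if x[d] > 0 else 0 for d in order]
    r, runs = [], 0
    for j, e in enumerate(eta):
        if j == 0 or e != eta[j - 1]:
            runs += 1
        r.append(runs)
    return sum((1 if e else -1) * rj for e, rj in zip(eta, r)) / r[-1]


def run_length(ucl, n, shift, rng, max_samples=100_000):
    """Number of subgroups drawn until R > ucl, capped at max_samples."""
    for t in range(1, max_samples + 1):
        sample = [rng.gauss(shift, 1.0) for _ in range(n)]  # N(shift, 1) data
        if run_statistic(sample) > ucl:
            return t
    return max_samples


def estimate_arl(ucl, n, shift, reps, seed=1):
    """Monte Carlo estimate of the ARL from reps simulated run lengths."""
    rng = random.Random(seed)
    lengths = [run_length(ucl, n, shift, rng) for _ in range(reps)]
    return sum(lengths) / reps


# Out-of-control ARL for a shift of one standard deviation, n = 10, UCL = 6.5.
print(estimate_arl(ucl=6.5, n=10, shift=1.0, reps=200))
```

In the same spirit, a UCL for a target ARL0 could be searched for by repeating `estimate_arl` with `shift=0.0` over candidate limits; because R is discrete, only certain ARL0 values are attainable for a given n.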
4. Performance evaluation
To assess the performance of the proposed NP-R chart, the ARL is used as the performance measure. Monte Carlo simulation is used to determine the ARL of in-control and out-of-control processes. The performance of a control chart is measured in terms of the run-length distribution. As the run-length distribution is skewed to the right, summary measures such as the mean, the standard deviation and the quartiles are considered to characterize the distribution. For the study, the average run length (ARL), the standard deviation of the run length (SDRL), the median run length (MRL), and the first and third quartiles of the run length are calculated and examined. Consider a process where the quality characteristic of interest X is distributed with location θ and standard deviation σ. Let θ0 and σ0 be the in-control values of θ and σ respectively. When a shift in process location occurs, the location changes from the in-control value θ0 to the out-of-control value θ1 = θ0 + δσ0 (δ > 0). Therefore, when a control chart for location is employed, the process shifts are measured through δ = |θ1 − θ0|/σ0, where θ1 is the shifted location and θ0 is the in-control location. When δ = 0, the process is in-control. Without loss of generality, we take the in-control median to be θ0 = 0. We determined the ARL and other characteristics by simulation when the process was operating under the normal, double exponential and Cauchy distributions with location zero and scale parameters chosen as given below. Equations (8), (9) and (10) respectively give the probability density functions of the normal distribution with location θ and scale σ, the double exponential distribution with location θ and scale λ, and the Cauchy distribution with location θ and scale λ (Bakir 2004).
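$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\theta}{\sigma}\right)^{2}\right], \quad -\infty < x < \infty \text{ and } \sigma > 0 \qquad (8)$$

$$f(x) = \frac{1}{2\lambda} \exp\left(-\frac{|x-\theta|}{\lambda}\right), \quad -\infty < x < \infty \text{ and } \lambda > 0 \qquad (9)$$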
Table 1. ARL and SDRL of the NP-R chart for n = 10 and ARL0 = 200 (columns: shift δ; ARL and SDRL under the normal, double exponential and Cauchy distributions, with the UCL for each distribution).

Table 2. ARL and SDRL of the NP-R chart for n = 10 and ARL0 = 340 (columns: shift δ; ARL and SDRL under the normal, double exponential and Cauchy distributions, with the UCL for each distribution).
$$f(x) = \frac{\lambda}{\pi\left[\lambda^{2} + (x-\theta)^{2}\right]}, \quad -\infty < x < \infty \text{ and } \lambda > 0 \qquad (10)$$
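Observations from the densities in Equations (8)–(10) can be generated by inverting their distribution functions. The sketch below is an assumption about how such a sampler might look, not the authors' C code; the function names are hypothetical and the default scale values are the ones used in the simulation study (λ = 1/√2 for the double exponential and λ = 0.2605 for the Cauchy distribution).

```python
import math
import random


def laplace_quantile(u, theta=0.0, lam=1 / math.sqrt(2)):
    """Inverse cdf of the double exponential density in Equation (9)."""
    if u < 0.5:
        return theta + lam * math.log(2 * u)
    return theta - lam * math.log(2 * (1 - u))


def cauchy_quantile(u, theta=0.0, lam=0.2605):
    """Inverse cdf of the Cauchy density in Equation (10)."""
    return theta + lam * math.tan(math.pi * (u - 0.5))


def draw(dist, theta, rng):
    """One observation with median theta from the named distribution."""
    if dist == "normal":
        return rng.gauss(theta, 1.0)
    u = rng.random()
    while u == 0.0:  # u = 0 would give log(0) or tan(-pi/2)
        u = rng.random()
    if dist == "laplace":
        return laplace_quantile(u, theta)
    if dist == "cauchy":
        return cauchy_quantile(u, theta)
    raise ValueError(f"unknown distribution: {dist}")


rng = random.Random(2018)
delta = 0.4  # shifted median theta1 = theta0 + delta * sigma0, theta0 = 0
print([round(draw(d, delta, rng), 3) for d in ("normal", "laplace", "cauchy")])
```

A shift of size δ is applied by passing θ1 = θ0 + δσ0 as `theta`, which moves the median of each distribution without changing its shape.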
To achieve a standard deviation of 1, we choose σ = 1 for the normal distribution and λ = 1/√2 for the double exponential distribution. The Cauchy distribution has no finite variance, so λ = 0.2605 is used, which gives a tail probability of 0.05 above 1.645, the same as for the standard normal distribution. All the distributions are symmetric, and a shift refers to a shift in the median. The amount of shift in the median is taken over the range δ = 0 (0.2) 1.2. Computer programs written in C are used to study the performance of the NP-R chart. The in-control and out-of-control ARL and SDRL values and other characteristics of the proposed NP-R chart are computed using 10000 simulations for sample sizes n = 10 and 15. For the positive-sided NP-R chart, an upper control limit (UCL) is chosen such that ARL0 ∈ {200, 340, 370, 500}. Tables 1–3 provide the ARL and SDRL of the proposed NP-R chart for detecting shifts of different magnitudes in the process median under the normal, double exponential and Cauchy process distributions when n = 10. The results reported in these tables provide useful information about the detection ability of the NP-R chart under the considered process distributions. As expected, for zero shift in the process median, the ARL values are close to 200, 340 and 500 respectively in Tables 1–3 for all process distributions, representing the in-control cases.

Table 3. ARL and SDRL of the NP-R chart for n = 10 and ARL0 = 500 (columns: shift δ; ARL and SDRL under the normal, double exponential and Cauchy distributions, with the UCL for each distribution).

Table 4. ARL and SDRL of the NP-R chart for n = 15 and ARL0 = 200 (columns: shift δ; ARL and SDRL under the normal, double exponential and Cauchy distributions, with the UCL for each distribution).

For fixed n and given ARL0, the out-of-control ARL values (ARL1) of the NP-R chart decrease sharply with increasing shift in the location. This indicates that the NP-R chart is reasonably effective in detecting shifts in the process median. The pattern is quite similar for the SDRL of the chart. For a given n, the ARL0 values of the NP-R control chart are almost identical under the normal, double exponential and Cauchy process distributions; hence the NP-R chart is considered to be nonparametric. For fixed n and given ARL0, the ARL1 values of the chart decrease with increasing shift in the median under the normal process distribution. For the double exponential process distribution, the general pattern in the ARL1 values remains the same as for the normal distribution, but the ARL values are smaller for a similar shift in the process median, indicating faster detection of shifts under the double exponential distribution. For the Cauchy process distribution, the general pattern in the ARL1 values also remains the same as for the normal and double exponential distributions, but the ARL values are much smaller for similar shifts in the process median, indicating faster detection of shifts under the Cauchy distribution. For example, when n = 10, ARL0 = 200 and the median shift is δ = 0.2, the ARL1 under the Cauchy distribution is 11.15, compared to 31.87 in the double exponential case and 57.75 in the normal case. The effectiveness (speed of detection) of the proposed NP-R chart thus varies with the underlying process distribution. Tables 4–6 provide the ARL and SDRL of the proposed NP-R chart for detecting shifts of different magnitudes in the process median under the normal, double exponential and Cauchy process distributions when n = 15. The simulation study revealed similar conclusions for subgroup sample size n = 15 and ARL0 = 200, 370, 500.
Tables 1–6 show that the NP-R chart has larger ARL1 values under the normal process distribution, whereas it has smaller ARL1 values under the double exponential and Cauchy distributions. Hence the proposed NP-R chart is less efficient in detecting shifts in the process median under the normal process distribution and more efficient under heavy-tailed distributions.

Table 5. ARL and SDRL of the NP-R chart for n = 15 and ARL0 = 370 (columns: shift δ; ARL and SDRL under the normal, double exponential and Cauchy distributions, with the UCL for each distribution).

Table 6. ARL and SDRL of the NP-R chart for n = 15 and ARL0 = 500 (columns: shift δ; ARL and SDRL under the normal, double exponential and Cauchy distributions, with the UCL for each distribution).

Table 7. Quartiles of the run-length distribution of the NP-R chart for n = 10 (columns: shift δ; Q1, Q2, Q3 and IQR under the normal, double exponential and Cauchy distributions).
For a complete understanding of the performance of a chart, Chakraborti and Eryilmaz (2007) suggested studying the entire run-length distribution. The run-length distribution and its associated characteristics (such as the mean (ARL), the standard deviation (SDRL), the median (MRL), etc.) reveal important information regarding the performance of a control chart (Human and Graham 2007). Therefore, along with the ARL and SDRL performance of the NP-R chart, we also study the quartiles of the run-length distribution. We calculate the median (Q2) and the first and third quartiles of the run-length random variable when n = 10 with ARL0 = 200. The in-control (δ = 0) and out-of-control (δ > 0) quartiles of the NP-R chart under the normal, double exponential and Cauchy distributions are shown in Table 7. Note that the mean (ARL0) of the in-control run-length distribution is greater than the median, indicating the right skewness of the distribution. For example, ARL0 is 200 whereas the in-control MRL equals 142 for the normal and double exponential process distributions and 136 for the Cauchy distribution. Based on Table 7, the in-control IQR values of the NP-R chart under the normal, double exponential and Cauchy distributions are nearly equal. For all shifts under consideration, the out-of-control IQR values of the NP-R chart under heavy-tailed distributions are smaller than those under the normal distribution.
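The right skewness noted above can be illustrated with a simple idealization. For a Shewhart-type chart with independent subgroups and a fixed control limit, each sample signals with the same probability p, so the run length is geometric with ARL = 1/p. The sketch below rests on that assumption and uses a hypothetical function name; it gives theoretical values for comparison, not the simulated entries of Table 7.

```python
import math


def geometric_run_length_summary(p):
    """Summary of a geometric run length with signal probability p per sample."""

    def quantile(q):
        # Smallest k with P(run length <= k) = 1 - (1 - p)**k >= q.
        return math.ceil(math.log(1 - q) / math.log(1 - p))

    return {
        "ARL": 1 / p,
        "SDRL": math.sqrt(1 - p) / p,
        "Q1": quantile(0.25),
        "MRL": quantile(0.50),
        "Q3": quantile(0.75),
    }


summary = geometric_run_length_summary(1 / 200)  # nominal ARL0 = 200
print(summary["Q1"], summary["MRL"], summary["Q3"])  # prints: 58 139 277
```

Under this idealization a chart with ARL0 = 200 has a median run length of 139, well below its mean, and an SDRL nearly as large as the ARL, which is the pattern seen in the simulated in-control values.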
5. Performance comparison
For efficiency comparisons, we compare the proposed NP-R chart with the competing NP-S chart proposed by Amin et al. (1995) and the NP-SR chart proposed by Bakir (2004) under the normal, double exponential and Cauchy distributions. We also include the Shewhart X̄ chart for comparison. Note that all the distributions in the study have mean/median zero and are scaled such that they have a standard deviation of one, so that results are easily comparable across distributions. For comparison purposes the control charts are designed so that their ARL0 values are approximately equal. However, because the nonparametric charts are based on charting statistics that have discrete distributions, it is not possible to straightforwardly design the charts such that their ARL0 values are equal to some desired value for a given sample size n. Human (2009) suggested a randomization technique to obtain the same in-control ARL for the charts for the selected sample size and provided the out-of-control ARL values of the NP-S and NP-SR charts when ARL0 was 370, for sample size n = 10 and shifts in process location from 0.0 to 1.2, under the normal, double exponential and Cauchy distributions. The ARL0 of the proposed NP-R chart for n = 10 is 340, which differs from that of the NP-S and NP-SR charts of Human (2009). In order to have a fair comparison, we computed the adjusted ARL and SDRL of chart B with respect to chart A as

$$\mathrm{adj}\,[ARL(\delta)]_B = \frac{[ARL(\delta)]_B}{[ARL(0)]_B}\,[ARL(0)]_A \qquad (11)$$

$$\mathrm{adj}\,[SDRL(\delta)]_B = \frac{[SDRL(\delta)]_B}{[SDRL(0)]_B}\,[SDRL(0)]_A \qquad (12)$$

Table 8. ARL comparison under the normal distribution for n = 10 (columns: shift δ; ARL and SDRL of the X̄, NP-S, NP-SR and NP-R charts, with the UCL of each chart).

Table 9. ARL comparison under the double exponential distribution for n = 10 (columns: shift δ; ARL and SDRL of the X̄, NP-S, NP-SR and NP-R charts, with the UCL of each chart).
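The adjustment in Equations (11) and (12) is a simple rescaling, sketched below. The function name and the numerical inputs are hypothetical and serve only to show the arithmetic; they are not values from the tables.

```python
def adjusted(value_b_delta, value_b_0, value_a_0):
    """Rescale chart B's ARL (or SDRL) at shift delta to chart A's in-control value."""
    return value_b_delta / value_b_0 * value_a_0


# Hypothetical chart B with ARL0 = 340 compared against chart A with ARL0 = 370.
print(adjusted(340.0, 340.0, 370.0))  # prints: 370.0
print(adjusted(85.0, 340.0, 370.0))   # prints: 92.5
```

At δ = 0 the adjusted value equals the in-control value of chart A by construction, so both charts start from the same ARL0 and only the relative decay with δ is compared.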
Tables 8–10 provide the adjusted ARL and SDRL values of the NP-R chart for subgroup sample size n = 10 when the underlying process distributions are normal, double exponential and Cauchy respectively. The corresponding values of the NP-S, NP-SR and X̄ charts are provided in Human (2009).

Table 10. ARL comparison under the Cauchy distribution for n = 10 (columns: shift δ; ARL and SDRL of the X̄, NP-S, NP-SR and NP-R charts, with the UCL of each chart).

Findings from Tables 8 to 10: When the underlying process distribution is normal, the Shewhart X̄ chart has the smallest ARL1 values, whereas the NP-S chart has the largest ARL1 values among the charts. Hence, the Shewhart X̄ chart outperforms the three nonparametric charts for all shifts under consideration. This is an expected result, since the Shewhart X̄ chart is specifically designed for a process operating under a normal distribution. An example of the poorer performance of the nonparametric charts occurs when δ = 0.2, where the ARL1 values of the NP-S, NP-SR and NP-R charts are 104.0, 88.7 and 89.9 respectively, all larger than the ARL1 = 63.4 of the traditional X̄ chart. Under the double exponential process distribution, the NP-SR chart has the smallest ARL1 values, whereas the X̄ chart has the largest ARL1 values for all shifts under consideration. The proposed NP-R chart performs reasonably well: it is slightly less efficient than the NP-SR chart but better than the NP-S chart. When the underlying process distribution is Cauchy, the NP-SR chart again has the smallest ARL1 values, whereas the X̄ chart has the largest ARL1 values for all shifts under consideration. The proposed NP-R chart performs reasonably well: it is slightly less efficient than the NP-SR chart but better than the NP-S chart. The general conclusion of the comparison is that, for all shifts under consideration, the X̄ chart outperforms the other three charts under the normal process distribution, whereas the three nonparametric charts perform significantly better than the X̄ chart for heavy-tailed distributions. The proposed NP-R chart outperforms the NP-S chart for all distributions under consideration.
6. An example
In this section, we illustrate the operation of the proposed NP-R chart using data from Montgomery (2009). Consider the scenario where the median compressive strength of parts manufactured by an injection molding process is monitored using the NP-R chart. The example is based on the data given on page 275 in Table 6.7 of Montgomery (2009). That data set includes 20 samples of 5 observations each. These observations were supplemented with the additional observations given in Table 6.8, which includes 15 samples of 5 observations each. However, n needs to be larger for the attainable ARL values to be close to the typical values of ARL0 used in practice. Therefore, the data are modified by grouping two consecutive samples of size 5 together to obtain samples of size n = 10. From the first 10 samples of size 5 in Table 6.7, we obtained 5 samples of size 10 by grouping two consecutive samples. Similarly, from the first 10 samples of size 5 in Table 6.8, we obtained 5 samples of size 10. Thus, the data set for this example consists of 10 samples of n = 10 observations each. We assume that the underlying distribution is symmetric with in-control median θ0 = 79.53. Each data point was replaced by its deviation from the in-control median θ0, so that the transformed observations have an in-control median of zero. The values of the plotting statistic R for the 10 samples are presented in Table 11. To have ARL0 = 200 for n = 10, based on Table 1, we have UCL = 6.5. Figure 1 depicts the resulting control chart using a UCL of 6.5. For the first sample, we illustrate the calculation of the statistic R1 using Equation (6).
Figure 1. Shewhart-type NP-R chart.
The run statistic R1 for the first sample, using Equation (6), is calculated as

$$R_1 = \frac{\sum_{j=1}^{10} r_j \delta_j}{r_{10}} = \frac{-2}{8} = -0.25.$$

Therefore, the point to be plotted on the NP-R chart for the first sample is −0.25. We developed a program in C to perform similar calculations for the other 9 samples; the results are displayed in Table 11.
7. Robustness of the NP-R chart against outliers
In this section, we study the effects of distributional contamination on the proposed NP-R chart under the normal process distribution. A standard model for generating normally distributed processes that produce occasional outliers is the contaminated normal distribution. To study the robustness of the proposed NP-R chart against outliers, we consider underlying process data from a contaminated normal distribution, which is a mixture of two normal distributions with cumulative distribution function (cdf)

$$F(x) = (1-p)\,N(\mu, 1) + p\,N(\mu, \sigma^{2}), \qquad (13)$$
where 0 ≤ p ≤ 1, N (μ, σ 2 ) is the cdf of normal distribution with mean μ and variance σ 2 . We will refer p and σ 2 as the percentage of contamination and the extremity of contamination respectively. When μ = 0, the process is in-control though producing occasional outliers. When p = 0.0 and μ = 0, Equation (13) becomes cdf of standard normal distribution. The incontrol ARL values of proposed NP-R chart and that of the traditional X chart are computed using 10,000 simulations for each chart when subgroup sample size is n = 10. The simulations are made for all possible combinations of (σ 2 , p), where σ 2 = 4, 9, 16 and p = 0.01, 0.05, 0.10, 0.15, 0.20. Tables 12 and 13 present simulated in-control ARL values of the X and the NP-R charts for various levels of contamination when underlying process distribution is normal. The UCL Table . Values of NP-R plotting statistic. Sample No. R
.
−.
.
−.
−.
.
.
.
.
−.
12
D. M. ZOMBADE AND V. B. GHUTE
Sr. No. Observation Y j X j = Y j − θ0 |X |( j) Rank of |X |( j) = C j Antirank (D j ) ηj Ij rj δj rj δj
. . . − −
. . . − −
. − . . − −
. − . .
. − . . − −
. . .
. − . . − −
. − . .
. − . . − −
. . .
Table . In-control ARL values of X chart for stable process with occasional outliers for contaminated normal distribution. Extremity of outliers Percentage of contamination p = % p = % p = % p = % p = % p = %
σ2
=4
. . . . .
σ2 = 9
σ 2 = 16
. . . . .
. . . . .
of these charts are obtained to give the respective ARL0 , when a process is operating under a normal distribution with no outliers. Following are the important findings from Table 12: r Under very light percentage p = 1% and light extremity of contamination σ 2 = 4, outliers have a small effect as the ARL of traditional X chart drops to 177.51 which entails about 1.1 times as many false alarms as the expected ARL of 202. For moderate extremity of contamination σ 2 = 9, outliers have noticeable effect as the ARL of Xchart drops to 137.78, which entails about 1.5 times as many false alarms as the expected ARL of 203. When extremity of contamination grows to σ 2 = 16, outliers have substantial effect as the ARL of X chart drops to 104.65, which entails about 1.9 times as many false alarms as the expected ARL of 201. r Under the usual percentage p = 10% and light extremity of contamination σ 2 = 4, outliers have noticeable effect as the ARL of X chart drops to 77.73, which entails about 2.6 times as many false alarms as the expected ARL of 202. For moderate extremity of Table . In-control ARL values of NP-R chart for stable process with occasional outliers for contaminated normal distribution. Extremity of outliers Percentage of contamination
σ2
=4
σ2 = 9
σ 2 = 16
p = % p = % p = % p = % p = % p = %
. . . . . .
. . . . . .
. . . . . .
COMMUNICATIONS IN STATISTICS—THEORY AND METHODS
13
contamination σ 2 = 9, outliers have greater effect as the ARL of X chart drops to 34.27, which entails about 5.9 times as many false alarms as the expected ARL of 203. With serve extremity of contamination σ 2 = 16, the ARL of X chart drops to 19.64, entailing 10.23 times as many false alarms as the expected ARL of 201. r When the percentage of contamination is as large as p = 20%, and light extremity of contamination Mi , the ARL of X chart drops to 16.24, which entails about 10.38 times as many false alarms as the expected ARL of 202. With moderate extremity of contamination σ 2 = 9, the ARL of X chart drops to 10.38, this entails about 11.0 times as many false alarms as the expected ARL of 203. With serve extremity of contamination σ 2 = 16, the ARL of X chart drops to just 10.66, entailing 18.85 times as many false alarms as the expected ARL of 201. In the simulation study of robustness, Table 12 shows that in-control ARL value of X chart changes significantly depending on the level of contamination. This is because X chart is based on actual observations in the sample and hence affected by the outliers. On the other hand, Tables 13 shows that in-control ARL values of the proposed NP-R chart are unaffected at all levels of contamination (σ 2 , p) in the process. This is because run statistic is not affected by outliers. The general conclusion of the robustness study against contamination by outliers is that the X chart can be very sensitive to contamination by outliers, where as, the proposed NP-R chart is robust against contamination by outliers.
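The sensitivity of a mean-based chart to the contamination model in Equation (13) can also be examined without simulation. The sketch below is not the authors' simulation code, and it swaps simulation for an exact calculation. It assumes that the X chart plots the subgroup mean with symmetric two-sided limits calibrated to a chosen ARL0 under N(0, 1), and it conditions on the number k of contaminated observations in a subgroup, which is Binomial(n, p). The function name `xbar_in_control_arl` and its parameters are hypothetical, and the resulting figures need not match the simulated values in Table 12.

```python
from math import comb, sqrt
from statistics import NormalDist


def xbar_in_control_arl(n, p, sigma2, arl0=200.0):
    """In-control ARL of a two-sided mean chart under the contaminated normal.

    Assumption: limits are +/- z / sqrt(n), with z chosen so that the ARL
    equals arl0 when every observation is N(0, 1).
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1.0 - 0.5 / arl0)
    signal_prob = 0.0
    for k in range(n + 1):
        # Probability that exactly k of the n observations are contaminated.
        weight = comb(n, k) * p**k * (1.0 - p) ** (n - k)
        # Given k, the sample mean has variance (n - k + k * sigma2) / n**2,
        # i.e. its standard deviation is inflated by this factor.
        inflation = sqrt((n - k + k * sigma2) / n)
        signal_prob += weight * 2.0 * (1.0 - std_normal.cdf(z / inflation))
    return 1.0 / signal_prob


for sigma2 in (4.0, 9.0, 16.0):
    print(sigma2, round(xbar_in_control_arl(10, 0.10, sigma2), 2))
```

With p = 0 the function returns the nominal ARL0, and the value falls as either p or σ² grows, which is the qualitative pattern described for the X chart above.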
8. Conclusions
In this paper, a Shewhart-type nonparametric control chart based on a run statistic is developed for monitoring a known process location of a continuous symmetric process distribution. The proposed NP-R chart requires only simple calculations and is straightforward to implement. The performance of the proposed chart is studied by simulation under normal, double exponential and Cauchy process distributions. The simulation study indicates that the proposed NP-R chart is more efficient than the traditional Shewhart X chart under heavy-tailed distributions, whereas it is less efficient under the normal distribution. The proposed NP-R chart is more efficient than the NP-S chart for detecting a shift in the process median for the process distributions considered. The robustness study indicates that the proposed NP-R chart is robust against contamination by outliers, while the traditional X chart is not. The proposed NP-R chart is specifically recommended in situations where the underlying process distribution is known to have heavier tails than the normal distribution or to be contaminated by occasional outliers.
Acknowledgments
The authors are thankful to the anonymous reviewers and the editor for several helpful comments and suggestions on an earlier version of this manuscript, which resulted in a significant improvement in the presentation.
References
Amin, R. W., M. R. Reynolds Jr., and S. T. Bakir. 1995. Nonparametric quality control charts based on the sign statistic. Communications in Statistics-Theory and Methods 24:1579–1623. doi:10.1080/03610929508831574.
Bakir, S. T. 2004. A distribution-free Shewhart quality control chart based on signed-ranks. Quality Engineering 16 (4):613–623. doi:10.1081/QEN-120038022.
Boone, J. M., and S. Chakraborti. 2012. Two simple Shewhart-type multivariate nonparametric control charts. Applied Stochastic Models in Business and Industry 28:130–140. doi:10.1002/asmb.900.
Chakraborti, S., and S. Eryilmaz. 2007. A nonparametric Shewhart-type signed-rank control chart based on runs. Communications in Statistics-Simulation and Computation 36:335–356. doi:10.1080/03610910601158427.
Chakraborti, S., and M. A. Graham. 2007. Nonparametric control charts. In Encyclopedia of statistics in quality and reliability, vol. 1, 415–429. New York: John Wiley.
Chakraborti, S., and M. A. Van de Wiel. 2008. A nonparametric control chart based on the Mann-Whitney statistic. Institute of Mathematical Statistics 1:156–172.
Chakraborti, S., P. Van Der Laan, and S. Bakir. 2001. Nonparametric control charts: An overview and some results. Journal of Quality Technology 33 (3):304–315.
Das, N. 2009. A multivariate nonparametric control chart based on sign test. Quality Technology and Quantitative Management 6 (2):155–169. doi:10.1080/16843703.2009.11673191.
Ghute, V. B., and D. T. Shirke. 2012a. A nonparametric signed-rank control chart for bivariate process location. Quality Technology and Quantitative Management 9 (4):317–328. doi:10.1080/16843703.2012.11673296.
Ghute, V. B., and D. T. Shirke. 2012b. Bivariate nonparametric synthetic control chart based on sign test. Journal of Industrial and System Engineering 6 (2):108–121.
Gibbons, J. D., and S. Chakraborti. 2003. Nonparametric statistical inference. 4th ed. New York: Marcel Dekker.
Graham, M. A., S. W. Human, and S. Chakraborti. 2010. A phase I nonparametric Shewhart-type control chart based on the median. Journal of Applied Statistics 37 (11):1795–1813. doi:10.1080/02664760903164913.
Human, S. W. 2009. Ph.D. dissertation on Univariate parametric and nonparametric statistical quality control techniques with estimated parameters. http://repository.up.ac.za/bitstream/handle/2263/28772/Complete.pdf?sequence=6.
Human, S. W., S. Chakraborti, and C. F. Smith. 2010. Nonparametric Shewhart-type sign control chart based on runs. Communications in Statistics-Theory and Methods 39:2046–2062. doi:10.1080/03610920902969018.
Human, S. W., and M. A. Graham. 2007. Average run lengths and operating characteristic curves. In Encyclopedia of statistics in quality and reliability, vol. 1, 159–168. New York: John Wiley.
Khilare, S. K., and D. T. Shirke. 2010. A nonparametric synthetic control chart using sign statistic. Communications in Statistics-Theory and Methods 39:3282–3293. doi:10.1080/03610920903249576.
Kritzinger, P., S. W. Human, and S. Chakraborti. 2014. Improved Shewhart-type runs-rules nonparametric sign charts. Communications in Statistics-Theory and Methods 43:4723–4748. doi:10.1080/03610926.2012.729637.
Montgomery, D. C. 2009. Statistical quality control-a modern introduction. 6th ed. New York: John Wiley and Sons.
Pawar, V. Y., and D. T. Shirke. 2010. A nonparametric Shewhart-type synthetic control chart. Communications in Statistics-Simulation and Computation 39:1493–1505. doi:10.1080/03610918.2010.503014.
Varon, M. J. R. 2010. Ph.D. dissertation on Nonparametric test based on runs for a single sample location problem. URL: http://kops.ub.uni-konstanz.de/volltexte/2010/11634.