
DETECTION OF CHANGE POINT IN STATISTICAL PROCESS CONTROL

JAROŠOVÁ Eva, (CZ)

Abstract. The paper deals with the Bayesian approach to the statistical control. The shift of the process mean is detected via a high value of the posterior probability. The average run length and the risk of false alarm are computed numerically by simulation for various levels of the shift and for different sample sizes. Both known and unknown process standard deviation are considered. The results of simulation show that the method performs better than the Shewhart control chart and confirm its usability in short run processes with the exception of individual values from the process with an unknown standard deviation.

Key words. Bayesian approach, average run length, risk of false alarm, Shiryayev-Roberts statistic.

Mathematics Subject Classification: 62P10, 62-07

Aplimat – Journal of Applied Mathematics, volume 4 (2011), number 3

1 Introduction

Statistical control of industrial processes is one of the most frequently used tools in quality control. A process is monitored through samples of relatively small size drawn at regular intervals. Sample characteristics are plotted against their order and compared with the limits of the control chart. When a point falls beyond the control limits, a signal is given that parameters of the process may have changed and that the process may be out of control.

Shewhart control charts for averages are very often applied. The control limits are positioned at $3\sigma/\sqrt{n}$ from the central line, which corresponds e.g. to the target process mean; the standard deviation $\sigma$ represents the variation of the process when it is under control and $n$ denotes the size of the subgroups. When $\sigma$ is not known, 20 or 25 subgroups should be taken before the control limits are constructed. This cannot be accomplished in short run processes, which are typical of modern business strategies. Various approaches have been suggested to solve this problem. They include the self-starting CUSUM chart [3], Q-charts [4] and others. Some methods are based on the Bayesian approach [2], and two of them were examined in [1].

To assess the performance of different methods, some characteristics are evaluated. The average run length (ARL) is the average number of subgroups taken until a point indicates an out-of-control condition. ARL is determined for several levels of the shift $\delta$ in the process mean, including $\delta = 0$. For $\delta = 0$ a fairly long ARL is desirable, while for a non-zero shift ARL should be as short as possible. The risk of false alarm (RFA) is the probability that a signal occurs without the process mean being shifted. RFA must be fairly small to avoid overcontrol. These characteristics must usually be determined numerically by simulation.

The aim of this paper is to examine one of the Bayesian methods in more detail. Some findings from [1] are used and further simulations are performed to explore the effect of the sample size and of an unknown process $\sigma$.

2 Detection of the change point

Suppose that a process is monitored at regular intervals and that means are determined in samples of size $n$. The sample means are assumed to have a normal distribution with mean $\mu_0$ and variance $\sigma^2/n$ when the process is under control. Suppose the process mean changed from $\mu_0$ to $\mu_1$ at some time $t_0$ and remains at this level since then. Time $t_0$ of the change in the process mean is called the change point.

Kenett and Zacks [2] present the following approach. The probability that the change point occurred before or at sampling time $t$ is determined repeatedly and its large value indicates the existence of a change point. A random discrete parameter $\tau$ is defined, where

$\tau = 0$ when the change point occurred before the first sampling time,

$\tau = i$ $(1 \le i < t)$ when the change point occurred between the $i$-th and $(i+1)$-st sampling time,

$\tau = t$ when the change point occurred after time $t$.

The modified geometric prior distribution of this parameter at time $t$ is used, defined by

$$\pi_t(\tau) = \begin{cases} \pi, & \tau = 0, \\ (1-\pi)\,p\,(1-p)^{i-1}, & \tau = i,\ 1 \le i < t, \\ (1-\pi)(1-p)^{t-1}, & \tau = t. \end{cases} \tag{1}$$

Here $\pi$ denotes the probability that the shift in the process occurred before the first sampling time and $p$ is the probability of success on each trial (i.e. the probability that the shift occurs within the time interval between two successive samplings). Contrary to the ordinary geometric distribution, the set of values of $\tau$ is finite. We will assume that no shift occurred before the process started to be monitored. Then $\pi = 0$ and formulas (1) become simpler:

$$\pi_t(\tau) = \begin{cases} 0, & \tau = 0, \\ p\,(1-p)^{i-1}, & \tau = i,\ 1 \le i < t, \\ (1-p)^{t-1}, & \tau = t. \end{cases} \tag{2}$$
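As a quick numerical check of (2), the prior can be tabulated and its total mass verified. The following is only a minimal sketch; the function name `change_point_prior` and the values `t = 10`, `p = 0.05` are illustrative assumptions, not part of the original method description.

```python
def change_point_prior(t: int, p: float) -> list[float]:
    """Prior (2): probabilities of tau = 0, 1, ..., t at sampling time t."""
    if t < 1 or not 0.0 < p < 1.0:
        raise ValueError("need t >= 1 and 0 < p < 1")
    prior = [0.0]                                             # tau = 0 has mass pi = 0
    prior += [p * (1.0 - p) ** (i - 1) for i in range(1, t)]  # tau = i, 1 <= i < t
    prior.append((1.0 - p) ** (t - 1))                        # tau = t: no change so far
    return prior


prior = change_point_prior(t=10, p=0.05)
print(sum(prior))    # total probability mass
print(prior[-1])     # prior probability that the change has not occurred by t = 10
```

The geometric terms sum to $1 - (1-p)^{t-1}$, so together with the last entry the mass is one, which is why the set of values of $\tau$ can be finite.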

The posterior probability function of $\tau$ at sampling time $t$ given sample means $X_1, X_2, \dots, X_t$ is

$$\pi_t(\tau \mid X_1, X_2, \dots, X_t) = \frac{\pi_t(\tau)\, L_t(\tau; X_1, X_2, \dots, X_t)}{\sum_{i=0}^{t} \pi_t(i)\, L_t(i; X_1, X_2, \dots, X_t)}, \tag{3}$$



where the likelihood $L_t(\tau; X_1, \dots, X_t)$ is given by

$$L_t(\tau; X_1, \dots, X_t) = \begin{cases} \prod_{j=1}^{t} f(X_j; \mu_1), & \tau = 0, \\ \prod_{j=1}^{\tau} f(X_j; \mu_0) \prod_{j=\tau+1}^{t} f(X_j; \mu_1), & 1 \le \tau < t, \\ \prod_{j=1}^{t} f(X_j; \mu_0), & \tau = t. \end{cases} \tag{4}$$

Functions $f(X_j; \mu_0)$ and $f(X_j; \mu_1)$ are densities of normal distributions $N(\mu_0, \sigma^2/n)$ and $N(\mu_1, \sigma^2/n)$, respectively. At the sampling time $t$ we are interested in the posterior probability $P(\tau < t \mid X_1, \dots, X_t)$ that the change point has occurred. Using equations (2), (3) and (4), we have

$$P(\tau < t \mid X_1, \dots, X_t) = \frac{\sum_{i=1}^{t-1} p\,(1-p)^{i-1} \prod_{j=1}^{i} f(X_j; \mu_0) \prod_{j=i+1}^{t} f(X_j; \mu_1)}{\sum_{i=1}^{t-1} p\,(1-p)^{i-1} \prod_{j=1}^{i} f(X_j; \mu_0) \prod_{j=i+1}^{t} f(X_j; \mu_1) + (1-p)^{t-1} \prod_{j=1}^{t} f(X_j; \mu_0)}. \tag{5}$$

It can be rewritten as

$$P(\tau < t \mid X_1, \dots, X_t) = \frac{\dfrac{p}{(1-p)^{t-1}} \sum_{i=1}^{t-1} (1-p)^{i-1} \prod_{j=i+1}^{t} R_j}{\dfrac{p}{(1-p)^{t-1}} \sum_{i=1}^{t-1} (1-p)^{i-1} \prod_{j=i+1}^{t} R_j + 1}, \tag{6}$$

where

$$R_j = \frac{f(X_j; \mu_1)}{f(X_j; \mu_0)} = \exp\left\{ -\frac{n\delta^2}{2\sigma^2} + \frac{n\delta}{\sigma^2}\,(X_j - \mu_0) \right\}. \tag{7}$$
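The exponent in (7) follows from expanding the two squared deviations in the normal densities with variance $\sigma^2/n$. A short worked derivation, writing $\delta = \mu_1 - \mu_0$ for the shift so that $X_j - \mu_1 = (X_j - \mu_0) - \delta$:

```latex
\begin{align*}
R_j &= \frac{f(X_j;\mu_1)}{f(X_j;\mu_0)}
     = \exp\left\{-\frac{n}{2\sigma^2}\left[(X_j-\mu_1)^2-(X_j-\mu_0)^2\right]\right\} \\
    &= \exp\left\{-\frac{n}{2\sigma^2}\left[\delta^2-2\delta\,(X_j-\mu_0)\right]\right\}
     = \exp\left\{-\frac{n\delta^2}{2\sigma^2}+\frac{n\delta}{\sigma^2}\,(X_j-\mu_0)\right\}.
\end{align*}
```

The normalising constants of the two densities are identical and cancel, so only the difference of the quadratic terms remains.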

Kenett and Zacks [2] use an approximate expression

$$P(\tau < t \mid X_1, \dots, X_t) \approx \frac{\sum_{i=1}^{t-1} \prod_{j=i+1}^{t} R_j}{\sum_{i=1}^{t-1} \prod_{j=i+1}^{t} R_j + 1}, \tag{8}$$

where $\sum_{i=1}^{t-1} \prod_{j=i+1}^{t} R_j = W_t$ is the Shiryayev-Roberts statistic.

In paper [1] the original expression (6) was considered. Putting

$$\frac{p}{(1-p)^{t-1}} \sum_{i=1}^{t-1} (1-p)^{i-1} \prod_{j=i+1}^{t} R_j = p Z_t, \tag{9}$$

$Z_t$ can be determined recursively,

$$Z_t = \frac{R_t\,(Z_{t-1} + 1)}{1-p}, \tag{10}$$

and the probability $P(\tau < t \mid X_1, \dots, X_t)$ is given by

$$P(\tau < t \mid X_1, \dots, X_t) = \frac{p Z_t}{p Z_t + 1}. \tag{11}$$
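The recursion (10) with (11) can be checked numerically against the term-by-term expression (6). The sketch below is illustrative only: the function names and the sample means are made up, and the starting value $Z_1 = 0$ is an assumption that follows from the sum in (9) being empty at $t = 1$.

```python
import math


def likelihood_ratio(x_bar, mu0, delta, sigma, n):
    """R_j from (7) for a single sample mean x_bar."""
    return math.exp(-n * delta**2 / (2 * sigma**2)
                    + n * delta * (x_bar - mu0) / sigma**2)


def posterior_direct(x_bars, mu0, delta, sigma, n, p):
    """P(tau < t | X_1..X_t) at t = len(x_bars), evaluated term by term from (6)."""
    t = len(x_bars)
    r = [likelihood_ratio(x, mu0, delta, sigma, n) for x in x_bars]
    # r[j - 1] holds R_j, so r[i:] covers the product range j = i + 1, ..., t
    total = sum((1 - p) ** (i - 1) * math.prod(r[i:]) for i in range(1, t))
    a = p * total / (1 - p) ** (t - 1)
    return a / (a + 1)


def posterior_recursive(x_bars, mu0, delta, sigma, n, p):
    """Posterior probabilities for t = 1, ..., len(x_bars) via (10) and (11)."""
    z = 0.0                      # assumed start: Z_1 = 0 (empty sum in (9))
    posteriors = []
    for t, x in enumerate(x_bars, start=1):
        if t > 1:
            z = likelihood_ratio(x, mu0, delta, sigma, n) * (z + 1) / (1 - p)
        posteriors.append(p * z / (p * z + 1))
    return posteriors


sample_means = [10.2, 9.1, 11.5, 13.0, 12.4, 14.1]      # made-up sample means
args = dict(mu0=10.0, delta=3.0, sigma=3.0, n=4, p=0.05)
recursive = posterior_recursive(sample_means, **args)
direct = [posterior_direct(sample_means[:t], **args)
          for t in range(1, len(sample_means) + 1)]
for t, (a, b) in enumerate(zip(recursive, direct), start=1):
    print(t, round(a, 6), round(b, 6))
```

The recursive form needs only the previous $Z_{t-1}$ and the newest sample mean, whereas the direct form recomputes all products at every sampling time.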

If $P(\tau < t \mid X_1, \dots, X_t)$ is larger than some stopping threshold $\pi^*$, a signal is given that a change point has occurred, that is, that the process mean has shifted.

When $\sigma$ of the process must be estimated, a recursive formula for sample size $n \ge 2$,

$$w_t^2 = \frac{1}{t} \sum_{l=1}^{t} s_l^2 = \frac{(t-1)\,w_{t-1}^2 + s_t^2}{t}, \tag{12}$$

can be used, where $s_l^2$ is the sample variance at the $l$-th sampling time.
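The running estimate in (12) is simply the average of the subgroup variances, updated one subgroup at a time. A minimal sketch, assuming the usual sample variance with divisor $n - 1$ (the text does not state the divisor); the function name and the data are hypothetical:

```python
import statistics


def update_pooled_variance(w2_prev: float, s2_t: float, t: int) -> float:
    """One step of (12): w_t^2 = ((t - 1) * w_{t-1}^2 + s_t^2) / t."""
    return ((t - 1) * w2_prev + s2_t) / t


# Made-up subgroups of size n = 3, used only to exercise the update
subgroups = [[9.1, 12.3, 10.4], [8.7, 11.9, 13.2], [10.5, 6.8, 9.9]]

w2 = 0.0                                   # the value is irrelevant at t = 1
for t, subgroup in enumerate(subgroups, start=1):
    s2 = statistics.variance(subgroup)     # sample variance, divisor n - 1 (assumed)
    w2 = update_pooled_variance(w2, s2, t)

print(w2)
```

Because the weight of the previous estimate is $(t-1)/t$, the starting value drops out at $t = 1$, and no history of earlier subgroups has to be stored.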

3 Simulation study

The prior distribution (2) with $p = 0.05$ was used, based on the simulation study in [1]. To compute $R_j$ according to (7), the deviation from $\mu_0$ which is to be identified, i.e. $\delta = \mu_1 - \mu_0$, has to be set. The size of the shift $\delta$ was set in turn to $\sigma$, $1.5\sigma$ and $2\sigma$, where $\sigma$ is the standard deviation of the process. Based on [1], the stopping threshold $\pi^* = 0.99865$ was chosen. This value, imitating the one-sided risk of false alarm in the Shewhart control chart, seemed to guarantee a sufficiently low risk of a false signal.

The aim of the simulation study was to evaluate ARL for different sample sizes $n$ and for both known and unknown $\sigma$. The Monte Carlo method was used to simulate drawing subgroups from a process within SPC. Three situations were considered:

a) a process under control with the mean equal to the target value; in this case 1000 samples from $N(10, 9)$ were generated in one cycle,

b) a process with a shift of the mean equal to $\delta$ that occurred between sampling times $t = 5$ and $t = 6$; the first 5 samples came from $N(10, 9)$, the remaining 95 samples from $N(10 + \delta, 9)$,

c) a process with a shift of the mean equal to $\delta$ that occurred between sampling times $t = 10$ and $t = 11$; the first 10 samples came from $N(10, 9)$, the remaining 90 samples from $N(10 + \delta, 9)$.

The sample size varied from 2 to 5, and in the case of known $\sigma$ individual values were also considered. For all conditions, 1000 cycles were performed and the number of samples until $P(\tau < t \mid X_1, \dots, X_t) > \pi^*$ was recorded. Results are given in Tables 1 to 4.
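One cycle of situation b) with known $\sigma$ could be simulated along the following lines. This is a sketch under stated assumptions rather than the code used in the study: sample means are drawn directly from $N(\mu, \sigma^2/n)$ instead of averaging $n$ individual values, the recursion starts from $Z_1 = 0$, the names `first_signal_time`, `design_delta` and `true_shift` are hypothetical, and the way stopping times are summarised at the end is only one plausible choice.

```python
import math
import random


def first_signal_time(rng, n, design_delta, true_shift, change_after, n_samples,
                      mu0=10.0, sigma=3.0, p=0.05, threshold=0.99865):
    """Return the first sampling time t with P(tau < t | data) > threshold, else None."""
    z = 0.0                                            # assumed start: Z_1 = 0
    for t in range(1, n_samples + 1):
        mean = mu0 + (true_shift if t > change_after else 0.0)
        x_bar = rng.gauss(mean, sigma / math.sqrt(n))  # sample mean of n values
        if t == 1:
            continue                                   # posterior is 0 at t = 1
        r_t = math.exp(-n * design_delta**2 / (2 * sigma**2)
                       + n * design_delta * (x_bar - mu0) / sigma**2)   # (7)
        z = r_t * (z + 1.0) / (1.0 - p)                                 # (10)
        if p * z / (p * z + 1.0) > threshold:                           # (11)
            return t
    return None


rng = random.Random(1)                                 # fixed seed for repeatability
times = [first_signal_time(rng, n=4, design_delta=3.0, true_shift=3.0,
                           change_after=5, n_samples=100)
         for _ in range(1000)]
signalled = [t for t in times if t is not None]
delays = [t - 5 for t in signalled if t > 5]           # samples taken after the change
print(len(signalled), sum(delays) / len(delays))
```

Setting `true_shift=0.0` and `n_samples=1000` gives the in-control situation a), in which every signal is a false alarm.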


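Table 1. Empirical ARL$_0$ based on 1000 subgroups.

| n | known $\sigma$, $\delta = 3$ | known $\sigma$, $\delta = 4.5$ | known $\sigma$, $\delta = 6$ | unknown $\sigma$, $\delta = 3$ | unknown $\sigma$, $\delta = 4.5$ | unknown $\sigma$, $\delta = 6$ |
|---|---|---|---|---|---|---|
| 1 | 956 | 976 | 982 | - | - | - |
| 2 | 982 | 988 | 991 | 923 | 932 | 952 |
| 3 | 988 | 991 | 996 | 963 | 977 | 984 |
| 4 | 987 | 992 | 994 | 979 | 986 | 989 |
| 5 | 988 | 994 | 997 | 978 | 990 | 996 |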

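Table 2. Empirical RFA based on 1000 subgroups.

| n | known $\sigma$, $\delta = 3$ | known $\sigma$, $\delta = 4.5$ | known $\sigma$, $\delta = 6$ | unknown $\sigma$, $\delta = 3$ | unknown $\sigma$, $\delta = 4.5$ | unknown $\sigma$, $\delta = 6$ |
|---|---|---|---|---|---|---|
| 1 | 83 | 42 | 33 | - | - | - |
| 2 | 40 | 21 | 14 | 100 | 79 | 54 |
| 3 | 26 | 16 | 6 | 49 | 34 | 20 |
| 4 | 27 | 14 | 9 | 36 | 19 | 14 |
| 5 | 22 | 9 | 6 | 31 | 13 | 8 |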

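Table 3. Empirical ARL$_\delta$, change point between 5th and 6th sampling time

| n | known $\sigma$, $\delta = 3$ | known $\sigma$, $\delta = 4.5$ | known $\sigma$, $\delta = 6$ | unknown $\sigma$, $\delta = 3$ | unknown $\sigma$, $\delta = 4.5$ | unknown $\sigma$, $\delta = 6$ |
|---|---|---|---|---|---|---|
| 1 | 14.052 | 6.964 | 4.067 | - | - | - |
| 2 | 7.661 | 3.601 | 1.936 | 7.305 | 3.430 | 1.926 |
| 3 | 5.275 | 2.416 | 1.219 | 5.163 | 2.287 | 1.202 |
| 4 | 3.953 | 1.736 | 0.792 | 4.002 | 1.686 | 0.795 |
| 5 | 3.258 | 1.260 | 0.492 | 3.146 | 1.285 | 0.522 |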

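Table 4. Empirical ARL$_\delta$, change point between 10th and 11th sampling time

| n | known $\sigma$, $\delta = 3$ | known $\sigma$, $\delta = 4.5$ | known $\sigma$, $\delta = 6$ | unknown $\sigma$, $\delta = 3$ | unknown $\sigma$, $\delta = 4.5$ | unknown $\sigma$, $\delta = 6$ |
|---|---|---|---|---|---|---|
| 1 | 13.687 | 6.741 | 3.974 | - | - | - |
| 2 | 7.527 | 3.578 | 2.000 | 7.439 | 3.522 | 2.007 |
| 3 | 5.295 | 2.351 | 1.193 | 5.258 | 2.395 | 1.213 |
| 4 | 4.052 | 1.740 | 0.787 | 4.075 | 1.701 | 0.823 |
| 5 | 3.200 | 1.305 | 0.478 | 3.202 | 1.308 | 0.521 |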

Table 5. ARL in Shewhart control chart, standards given

| n | $\delta = \sigma$ | $\delta = 1.5\sigma$ | $\delta = 2\sigma$ |
|---|---|---|---|
| 1 | 44 | 15 | 6 |
| 2 | 18 | 5 | 2 |
| 3 | 10 | 3 | 1 |
| 4 | 6 | 2 | 1 |
| 5 | 4 | 2 | 1 |
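For comparison, the ARL of a Shewhart chart for averages with known $\sigma$ can be computed in closed form as the reciprocal of the per-sample signal probability. The sketch below assumes two-sided $3\sigma/\sqrt{n}$ limits and a sustained shift expressed in multiples of $\sigma$; the function names are illustrative. Rounded to whole numbers, the values appear to agree with Table 5.

```python
import math


def normal_cdf(x: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def shewhart_arl(shift_in_sigma: float, n: int) -> float:
    """ARL of an X-bar chart with 3-sigma limits when the mean shifts by shift_in_sigma * sigma."""
    k = shift_in_sigma * math.sqrt(n)          # shift in units of sigma / sqrt(n)
    p_signal = 1.0 - (normal_cdf(3.0 - k) - normal_cdf(-3.0 - k))
    return 1.0 / p_signal


for n in range(1, 6):
    print(n, [round(shewhart_arl(s, n), 2) for s in (1.0, 1.5, 2.0)])
```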

4 Conclusion

The simulation study confirmed good properties of the method. All values of ARL are smaller than those of the classical Shewhart control chart (Table 5). It is important that estimating $\sigma$ has practically no effect on ARL. Based on the two simulated alternatives, with the change point located between the 5th and the 6th sampling times or between the 10th and the 11th sampling times, it seems that the Bayesian method performs well even for quite short sequences of samples.

As for individual observations, ARL of the Bayesian method is much better than ARL of the Shewhart control chart when $\sigma$ of the process is known. A problem arises, though, when $\sigma$ is to be estimated. The estimation based on moving ranges used in the control charts for individuals is not applicable in the recursive formula, because the change of the process mean at the change point is expected to induce a large value of the corresponding moving range and thus to bias the estimate of $\sigma$. Excluding this "unsuitable" moving range seems to be quite intricate.

References

[1] JAROŠOVÁ, E.: Bayesian approach to the short run process control. Demanovská Dolina, 25.-29.8.2010. AMSE 2010 Applications of Mathematics and Statistics in Economy, to be published.

[2] KENETT, R.S., ZACKS, S.: Modern Industrial Statistics. Brooks/Cole Publishing Company, Pacific Grove, 1998.

[3] MONTGOMERY, D.C.: Statistical Quality Control: A Modern Introduction. John Wiley & Sons, Hoboken, 2009.

[4] QUESENBERRY, Ch.: On Properties of Q Charts for Variables. Journal of Quality Technology 27, pp. 204-213, 1995.

Current address

doc. Ing. Eva Jarošová, CSc.
Skoda Auto University
Tr. Vaclava Klementa 864
293 60 Mlada Boleslav
Czech Republic
Phone Number 732469892
e-mail: [email protected]
