Quality Monitoring of Multimode Processes via Signal Data
Grasso M.1,2, Colosimo B.M.1, Tsung F.3
1 Politecnico di Milano, Dipartimento di Meccanica, Via La Masa 1, 20156 Milan (Italy)
2 Laboratorio MUSP, Macchine Utensili e Sistemi di Produzione, Via Tirotti 9, 29122, Piacenza (Italy)
3 Hong Kong University of Science and Technology, Department of Industrial Engineering and Logistics Management, Clear Water Bay, Kowloon (Hong Kong)
[email protected], [email protected], [email protected]
Keywords: Multimode Process, Process Monitoring, SPC, sensor signal
Abstract
Continuous advances in sensor technology and real-time computational capabilities make it possible to develop industrial quality monitoring tools based on sensor signals acquired during the process itself. This yields notable benefits with respect to traditional quality control performed on the output of the process (i.e., the manufactured part). However, many discrete manufacturing processes violate the most common distributional assumptions used in Statistical Process Control (SPC). A particularly challenging violation is the existence of multiple in-control states (a.k.a. operating modes), which produces a stream of data from different distributions that follow one another over time. Processes that exhibit such behaviour are referred to as “multimode processes”. The paper discusses the use of SPC approaches for sensor signal monitoring in the presence of multiple operating modes, where nonparametric and data-adaptive learning methods are combined. The study focuses on the K-chart and the kernel density estimation (KDE) methodologies. Real industrial examples are discussed to highlight the need for nonparametric methods in industry and to demonstrate the performance of the K-chart in the presence of multimode processes.
1. INTRODUCTION
There is an increasing tendency in industry towards data-rich environments characterized by “intelligent” and autonomous machine tools, where several sources of information (i.e., sensors installed in production systems) are available for many purposes (e.g., monitoring, diagnostics, predictive maintenance). In this framework, many technological advances pave the way for a systematic and extended use of sensor data for industrial quality control, via signal-based Statistical Process Control (SPC) methodologies. In discrete manufacturing operations, signal-based SPC allows one to design and implement control charts for in-process monitoring, i.e., to assess how the process is performing while it is running. Despite the many advantages provided by in-process monitoring tools [1] [2], the conventional SPC assumptions about the underlying data distribution may not be appropriate for the design of signal-based control charts. The monitored variables usually represent heterogeneous quantities that originate from one or multiple sensors, and each of them requires dedicated pre-processing and raw signal elaboration steps (e.g., time-domain, frequency-domain or more sophisticated kinds of analysis). Because of this, the assumption of multi-normality is frequently violated in practice and data transformation to normality may be a very difficult task [3]. Furthermore, in many industrial applications, the process naturally switches from one operating mode to the next, producing streams of data from different distributions that follow one another over time. This kind of process is referred to as a “multimode process” in the literature [4] [5]. Different operating modes may be caused by a change of the cutting parameters, a tool change, a modification of machine settings, a variation of environmental factors, etc.

Various authors have discussed the need for distribution-free multivariate SPC methods [6], but there is a lack of such methods in the frame of multimode processes. This study focuses on nonparametric methods that can be used to monitor multimode processes, where two main strategies can be envisaged. One is based on designing a single control chart that is suitable to monitor the process regardless of its current state (namely, a “global modelling” approach [7]). The other is based on designing multiple control charts, each one dedicated to a specific process state (namely, a “multi-modelling” approach [8] [9]). A particular type of nonparametric control chart, known as the K-chart [10], is reviewed and its application in both control charting frameworks is discussed. The K-chart relies on a recently proposed paradigm that allows one to use machine learning methodologies in a process monitoring application, i.e., one-class-classification [11]. The performance of the K-chart is compared against the traditional Hotelling’s $T^2$ control chart with empirical limits, and against a one-class-classification method based on the kernel density estimation (KDE) of the multivariate probability density function of in-control data, hereafter referred to as the KDE-based control chart [12]. The study shows that the K-chart outperforms the $T^2$ control chart. Moreover, its performance is comparable to that of the KDE-based method, but it is more efficient, because it does not require the estimation of the complete density function.
This study follows up on a previous study [13], and its main novel contributions are: i) a comparison between the K-chart methodology and the KDE-based chart, ii) a comparison of global modelling and multi-modelling methodologies for control chart implementation, and iii) the validation of the K-chart method on a novel set of real industrial data from a transverse roll grinding operation. Section 2 presents some industrial examples to motivate the need for nonparametric methods; Section 3 discusses nonparametric techniques for multimode process monitoring; Section 4 briefly reviews the K-chart methodology; Section 5 discusses the use of the presented methods in transverse roll grinding; Section 6 concludes the paper.
2. INDUSTRIAL EXAMPLES
Signal-based SPC usually follows an information synthesis step aimed at extracting a reduced set of variables from raw signals. These variables characterize the ongoing process and allow possible shifts from an in-control state to be detected. They may be either synthetic indexes resulting from a signal processing step (e.g., time-domain or frequency-domain indexes), or the estimated coefficients of a model fitted to the original signals (e.g., by using a Fourier, spline or wavelet basis [14]). Different synthetic indexes capture heterogeneous quantities and may be defined on different domains (e.g., $\mathbb{R}^+$, $\mathbb{R}$, $[a, b] \subset \mathbb{R}$, etc.), possibly with non-normal marginal/joint distributions and/or non-linear correlation between each pair of variables. A real example is shown in Fig. 1, which refers to an end-milling operation on a titanium alloy where the spindle current signal represents a source of information for tool condition monitoring. The signal is acquired from the embedded current sensor via the SinuCom NC (by Siemens©) interface with a sampling frequency of $f_s = 250$ Hz. Two synthetic indexes, $x_{1,j}$ and $x_{2,j}$, are computed on sequential and partially overlapping time windows (denoted by $j = 1, 2, \dots$) of duration $T = 1$ s: they are two moments of the signal time series, i.e., the skewness and the variance of the spindle current, respectively. Fig. 1 (left panel) shows the time plots of the two indexes and Fig. 1 (right panel) shows their scatterplot, which displays an evident departure from bivariate normality, due to a non-linear association between the two indexes. If a practitioner applies the traditional $T^2$ control chart to these indexes, the performance of the chart will be distorted by the violation of the distributional assumption, with a detrimental effect in terms of false alarms and detection power.
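The feature extraction step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the 50% window overlap, the function names and the synthetic signal are hypothetical (the text only says the windows are "partially overlapping"), while the sampling frequency and window duration follow the values given for the end-milling example.

```python
import math
import random

def sliding_windows(signal, window, step):
    """Yield consecutive windows of `window` samples, advancing by `step`."""
    for start in range(0, len(signal) - window + 1, step):
        yield signal[start:start + window]

def variance(w):
    """Second central moment of the samples in a window."""
    m = sum(w) / len(w)
    return sum((v - m) ** 2 for v in w) / len(w)

def skewness(w):
    """Standardized third central moment of the samples in a window."""
    m = sum(w) / len(w)
    m2 = sum((v - m) ** 2 for v in w) / len(w)
    m3 = sum((v - m) ** 3 for v in w) / len(w)
    return m3 / m2 ** 1.5 if m2 > 0 else 0.0

def window_features(signal, fs=250, duration=1.0, overlap=0.5):
    """Return one (skewness, variance) pair per window.

    fs and duration follow the text; overlap=0.5 is an assumed value.
    """
    window = int(fs * duration)
    step = max(1, int(window * (1.0 - overlap)))
    return [(skewness(w), variance(w))
            for w in sliding_windows(signal, window, step)]

# Synthetic stand-in for 4 s of spindle current (not real data).
random.seed(0)
signal = [math.sin(2 * math.pi * 5 * t / 250) + random.gauss(0, 0.1)
          for t in range(1000)]
features = window_features(signal)  # list of (x1_j, x2_j) pairs
```

Each element of `features` plays the role of one bivariate observation $(x_{1,j}, x_{2,j})$ to be plotted or charted.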
Figure 1 – Time plots of two time-domain indexes ($x_1$ = skewness and $x_2$ = variance) of a spindle current signal in end-milling (left panel) and corresponding scatterplot (right panel)
Figure 2 – Time plots of rms indexes ($x_1$ and $x_2$) from two accelerometer sensors in roll grinding (left panel) and corresponding scatterplot (right panel)

In addition to non-normality, discrete manufacturing processes may exhibit a multimode nature that yields clustered data clouds within the space spanned by the monitored variables. In this case, the probability density function is multimodal, which is a challenging violation of common distributional assumptions. A real example is shown in Fig. 2, which refers to a transverse roll grinding operation, where two accelerometer sensors are used to monitor the stability of the process, one mounted on the wheel head and one on the headstock. Transverse roll grinding involves consecutive cycles composed of different process runs, each one executed with different cutting parameters to achieve the desired geometrical and surface finish quality. In the example illustrated in Fig. 2, three consecutive runs were performed by using the cutting parameters shown in Table 1. Fig. 2 (left panel) shows the time plots of two synthetic indexes, $x_{1,j}$ and $x_{2,j}$, $j = 1, 2, \dots$, which are the rms of the accelerometer signals acquired respectively from the wheel head and the headstock sensor, along the axis orthogonal to both the feed and infeed directions. The signals were acquired with a sampling frequency of $f_s = 2000$ Hz and the rms indexes were computed on consecutive and partially overlapping time windows of duration $T = 1$ s.

Operating mode       Mode 1   Mode 2   Mode 3
Wheel speed [rpm]    680      830      1000
Infeed [mm]          0.01     0.01     0.08
Roll speed [rpm]     30       30       30

Table 1 – Cutting parameters in different roll grinding operating modes

The transition from one operating mode (i.e., one set of cutting parameters) to the following one causes a shift in the bivariate time series that should not be signalled as an alarm, since it corresponds to a natural transition between two consecutive in-control states. Fig. 2 (right panel) shows that such a multimode pattern generates three data clusters in the bivariate space spanned by $x_{1,j}$ and $x_{2,j}$. It is evident that neither the use of a fixed threshold (i.e., the mainstream approach in industrial practice) nor the traditional multivariate SPC methods are suitable to deal with such a process. The frequent violations of distributional assumptions on the one hand, and the need to cope with industrial processes that switch from one in-control state to the next on the other hand, motivate the development of nonparametric methods, i.e., monitoring techniques whose performance does not depend on the actual data distribution and/or the multimode properties of the process.
3. NONPARAMETRIC METHODS FOR MULTIMODE PROCESSES
Multimode process monitoring involves two consecutive phases, analogously to traditional control charting: a phase aimed at designing the control chart based on in-control data samples, and the end-use phase, aimed at monitoring new sensor data acquired during the process. In this study, the terms “Phase I” and “training phase” are used interchangeably to identify the former phase, whereas the term “Phase II” identifies the latter.

Two major strategies have been proposed to deal with multimode processes [15] [5]. The first approach is based on estimating a single control region that globally adapts to the clustered spreading of in-control data (this approach is referred to as “global modelling” in the literature [7]). The second approach consists of designing multiple control charts, one for each in-control state, and monitoring the process by switching on only the chart that corresponds to the current operating mode (this approach is referred to as “multi-modelling” [8]). The latter method requires the identification of the current state and the availability of a distinct set of training (Phase I) data for the design of each dedicated control chart. Thus, the multi-modelling approach is applicable if at least one of the following two conditions is met: i) external variables (a.k.a. covariates) are available both to label the in-control data within the training dataset and to identify the current process state in order to switch on the corresponding control chart [16]; ii) in the absence of covariates, the training dataset can be clustered into separate modes (unsupervised classification step), and the current control chart can be switched on after a supervised classification step aimed at determining which data cluster corresponds to the current operating mode [8].

In this study, both control charting strategies are considered but, differently from the mainstream literature, the focus is on nonparametric techniques. In both the global and multi-modelling frameworks, distribution-free control charts without memory are discussed, which are designed to overcome the limitations of the traditional Hotelling’s $T^2$ methodology.

3.1. Process monitoring by using a single control region (global modelling)

The most common approach to deal with unknown distributions consists of using the Hotelling’s $T^2$ control chart with an empirical limit [17]. The control limit is a percentile of the empirical distribution of the $T^2$ values belonging to the training dataset. Although this method is quite widespread and is more effective than using parametric limits in many practical applications, it suffers from the normality assumption implied by the computation of the $T^2$ statistic itself. Even with an empirical control limit, the control region in the multivariate space is a multi-dimensional ellipsoid, regardless of the actual spreading of in-control data. When the process departs considerably from multi-normality, an elliptical control region is not appropriate, especially when data are clustered in nature.
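The limitation of the elliptical region can be illustrated with a minimal sketch of a $T^2$ chart with empirical limit. Everything below is an assumption made for illustration: the code is restricted to the bivariate case, the two-cluster data are synthetic, and the simple order-statistic percentile rule in `empirical_limit` is one of several possible conventions.

```python
import math
import random

def fit_t2(train):
    """Estimate the mean and the inverse covariance from bivariate Phase I data."""
    n = len(train)
    m1 = sum(x for x, _ in train) / n
    m2 = sum(y for _, y in train) / n
    s11 = sum((x - m1) ** 2 for x, _ in train) / (n - 1)
    s22 = sum((y - m2) ** 2 for _, y in train) / (n - 1)
    s12 = sum((x - m1) * (y - m2) for x, y in train) / (n - 1)
    det = s11 * s22 - s12 ** 2
    inv = (s22 / det, -s12 / det, s11 / det)  # entries (1,1), (1,2), (2,2)
    return (m1, m2), inv

def t2(z, mean, inv):
    """Hotelling's T^2 statistic of a single bivariate observation."""
    d1, d2 = z[0] - mean[0], z[1] - mean[1]
    i11, i12, i22 = inv
    return d1 * d1 * i11 + 2.0 * d1 * d2 * i12 + d2 * d2 * i22

def empirical_limit(values, alpha):
    """(1 - alpha) empirical percentile, taken as an order statistic."""
    s = sorted(values)
    return s[min(len(s) - 1, math.ceil((1 - alpha) * len(s)) - 1)]

# Synthetic two-mode in-control data (not real data).
random.seed(1)
train = [(random.gauss(c, 0.5), random.gauss(c, 0.5))
         for c in (0.0, 10.0) for _ in range(200)]
mean, inv = fit_t2(train)
limit = empirical_limit([t2(x, mean, inv) for x in train], alpha=0.01)

between = t2((5.0, 5.0), mean, inv)    # midway between the two clusters
off_axis = t2((10.0, 0.0), mean, inv)  # away from the line joining them
```

In this toy setting the point midway between the two clusters falls inside the ellipse even though no training data lie near it, which is the behaviour the paragraph above warns about.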
Figure 3 – Example of a KDE of a bivariate 3-modal distribution (left panel) and corresponding control boundary for a Type I error $\alpha = 0.01$ (right panel)

An alternative solution consists of fitting a KDE of the multivariate probability density function of training data: in this case, the control boundary in the multivariate space coincides with the isoline of the fitted function that corresponds to the target Type I error [12]. This approach is exemplified in Fig. 3 for a bivariate dataset.

As pointed out by [18], the KDE-based approach is a proto-version of the so-called “one-class-classification” methodology [11], whose aim is to extend machine learning methods to application scenarios where the training data belong to one class (i.e., the in-control class) and new data may or may not belong to that class (either in-control or out-of-control data). A percentile of the estimated density plays the role of a control boundary to decide whether a new observation originates from the in-control pattern or has to be signalled as an out-of-control datum, with a given Type I error, $\alpha$. Nevertheless, the main limitation of this approach is the need to fit a complete density distribution, which is a computationally expensive task that becomes less feasible as the number of data dimensions increases [11]. Furthermore, the control boundary, denoted by $h$, in the KDE-based method is such that [12]:

$$\int_{\{\boldsymbol{x}:\, p(\boldsymbol{x}) > h\}} p(\boldsymbol{x})\, \mathrm{d}\boldsymbol{x} = 1 - \alpha \qquad (1)$$
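As a rough illustration of condition (1), a simple plug-in approximation takes $h$ as the empirical $\alpha$-quantile of the fitted density evaluated at the training points, so that a fraction of about $1 - \alpha$ of them satisfies $p(\boldsymbol{x}) > h$. This is an assumed shortcut for the sake of the sketch, not the bootstrap or Markov Chain Monte Carlo procedure of [12]; the fixed bandwidth `bw`, the product Gaussian kernel and the synthetic three-cluster data are likewise hypothetical.

```python
import math
import random

def kde(z, train, bw):
    """Bivariate Gaussian kernel density estimate at point z."""
    norm = 1.0 / (len(train) * 2.0 * math.pi * bw * bw)
    return norm * sum(
        math.exp(-((z[0] - x) ** 2 + (z[1] - y) ** 2) / (2.0 * bw * bw))
        for x, y in train)

def density_threshold(train, bw, alpha):
    """Plug-in estimate of h: alpha-quantile of the density at training points."""
    dens = sorted(kde(p, train, bw) for p in train)
    return dens[int(math.floor(alpha * len(dens)))]

# Synthetic 3-modal in-control data (not real data).
random.seed(2)
centres = [(0.0, 0.0), (4.0, 0.0), (2.0, 4.0)]
train = [(random.gauss(cx, 0.3), random.gauss(cy, 0.3))
         for cx, cy in centres for _ in range(100)]
bw = 0.3  # assumed bandwidth; a data-driven selector would normally be used
h = density_threshold(train, bw, alpha=0.01)

def is_out_of_control(z):
    """Signal when the estimated density at z falls below the boundary h."""
    return kde(z, train, bw) < h
```

Note that every evaluation of `kde` sums over the whole training set, which gives a feel for why the full density estimate becomes costly as the dataset and the dimensionality grow.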
However, the integral in (1) is not analytically tractable and therefore the threshold cannot be obtained directly. Bootstrap-based and Markov Chain Monte Carlo methods may be used to approximate the control boundary [12], but it is difficult to achieve a good approximation in the tails of the distribution.

More effective and efficient one-class-classification techniques have been proposed to overcome these limitations. One method that has proved suitable for SPC purposes is known as the K-chart [10]. Contrary to the KDE-based approach, it does not require the estimation of the complete density distribution and its control boundary is easily computed analytically. The control region is estimated by using a one-class-classification variant of the Support Vector Machine formalism, called “Support Vector Data Description” (SVDD) [11]. Only a subset of the training data actually influences the shape of the control region, and hence a more efficient control chart scheme can be designed. The K-chart methodology is reviewed in Section 4.

3.2. Process monitoring using multiple control charts (multi-modelling)

When external variables (a.k.a. covariates) are available to distinguish one in-control state from another, it is possible to design separate control charts (i.e., different control regions), one for each in-control state. The in-process acquisition of covariates allows a trigger signal to be activated that automatically switches on only the control chart corresponding to the current operating mode, so that one control chart is used at a time. As an example, in the transverse roll grinding operation mentioned in Section 2, different process states correspond to different sets of cutting parameters. In that case, one may decide to design one separate chart for each set of parameters, and then to activate the control chart that refers to the set of cutting parameters currently applied in the process.
One evident benefit of this approach is that only the information corresponding to the current process state is used to design the control chart. It is worth noting that, also in this case, nonparametric methods may be required to cope with non-normal distributions within each in-control state.
However, the availability of covariates for the correct identification of each operating mode implies that any transition from one in-control state to the next is caused by controllable factors (e.g., a change of cutting parameters, a tool change, a modification of the machine setpoint, etc.). If the transitions between in-control states are driven by uncontrolled factors (e.g., environmental factors, parameters that cannot be measured, etc.), it may be difficult, or even impossible, to identify the current state. Some authors [8] proposed multi-modelling methods that are applicable in the absence of covariates: the underlying idea is to couple a classification step with the monitoring step, such that the training dataset can be separated into different modes via unsupervised classification (i.e., clustering). Nevertheless, in the absence of covariates and when data clustering is a difficult task due to a strong overlap between different modes, the global modelling approach represents the only viable solution.

Nonparametric alternatives to the $T^2$ control chart, e.g., the KDE-based approach and the K-chart, can be applied either to monitor the process in its current state (multi-modelling) or to monitor the process by means of a single control region that globally adapts to the clustered data structure (global modelling). The next section reviews the K-chart methodology, whereas a comparison of different nonparametric methods for multimode process monitoring is presented in Section 5.
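The covariate-driven switching logic of the multi-modelling approach can be sketched as a dictionary of charts indexed by the covariate value. This is a toy illustration under stated assumptions: the wheel speeds used as mode labels come from Table 1, but the data are synthetic, and the per-mode chart (Euclidean distance from the mode centroid with an empirical limit) is only a placeholder for whichever chart is actually fitted within each mode.

```python
import math
import random

def fit_distance_chart(train, alpha):
    """Placeholder per-mode chart: distance from the mode centroid,
    with the (1 - alpha) empirical percentile as control limit."""
    p = len(train[0])
    centre = [sum(x[i] for x in train) / len(train) for i in range(p)]

    def stat(z):
        return math.dist(z, centre)

    s = sorted(stat(x) for x in train)
    limit = s[min(len(s) - 1, math.ceil((1 - alpha) * len(s)) - 1)]
    return stat, limit

def fit_multi_model(labelled_train, alpha=0.01, fit=fit_distance_chart):
    """labelled_train maps a covariate value (mode label) to its Phase I data."""
    return {mode: fit(data, alpha) for mode, data in labelled_train.items()}

def monitor(charts, mode, z):
    """Switch on the chart of the current mode; True means an alarm."""
    stat, limit = charts[mode]
    return stat(z) > limit

# Synthetic Phase I data labelled by wheel speed [rpm] (not real data).
random.seed(3)
mode_centres = {680: (1.0, 1.0), 830: (3.0, 2.0), 1000: (5.0, 4.0)}
labelled_train = {
    mode: [(random.gauss(cx, 0.1), random.gauss(cy, 0.1)) for _ in range(100)]
    for mode, (cx, cy) in mode_centres.items()
}
charts = fit_multi_model(labelled_train)
```

The same observation can be in control under one mode and out of control under another, which is why the covariate has to be read together with each new data point.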
4. THE K-CHART APPROACH
Given a multivariate training dataset $\{\boldsymbol{x}_j \in \mathbb{R}^p,\ j = 1, \dots, M\}$, where $\boldsymbol{x}_j = [x_{1,j}, x_{2,j}, \dots, x_{p,j}]^T$, the SVDD method consists of finding a minimal volume control region, characterized by a centre $\boldsymbol{o} \in \mathbb{R}^p$ and a radius $R$, that can envelop a given percentage of the original data. The K-chart [10] is a multivariate control chart whose control statistic is the kernel distance of any observation $\boldsymbol{z} \in \mathbb{R}^p$ from the centre $\boldsymbol{o} \in \mathbb{R}^p$ of that region. The control limit is estimated to guarantee a target Type I error with the available dataset. A kernel distance, hereafter denoted by $kd(\boldsymbol{z})$, replaces the traditional Euclidean and statistical distance notions to adapt the control region boundary to the actual spread of the data. The estimation of the minimal volume control region, centred in $\boldsymbol{o} \in \mathbb{R}^p$ and with radius $R$, requires the solution of the following data-driven optimization problem:

$$\min \Big( R^2 + C \sum_{j=1}^{M} \xi_j \Big) \quad \text{s.t.} \quad (\boldsymbol{x}_j - \boldsymbol{o})^T (\boldsymbol{x}_j - \boldsymbol{o}) \le R^2 + \xi_j \ \text{ and } \ \xi_j \ge 0, \quad j = 1, \dots, M \qquad (2)$$
where $\xi_j$, $j = 1, \dots, M$, are slack variables, and $C$ is a penalty coefficient used to weight the trade-off between the volume of the region and the percentage of enclosed data ($C > 0$). By introducing the Lagrangian function:

$$L(R, \boldsymbol{o}, \xi_j; \alpha_j, \gamma_j) = R^2 + C \sum_{j=1}^{M} \xi_j - \sum_{j=1}^{M} \alpha_j \big( R^2 + \xi_j - (\boldsymbol{x}_j - \boldsymbol{o})^T (\boldsymbol{x}_j - \boldsymbol{o}) \big) - \sum_{j=1}^{M} \gamma_j \xi_j \qquad (3)$$
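and by setting the partial derivatives of (3) w.r.t. $R$, $\boldsymbol{o}$, and $\xi_j$, $j = 1, \dots, M$, to zero, problem (2) can be simplified as follows [10]:

$$\max \Big( \sum_{j=1}^{M} \alpha_j \boldsymbol{x}_j^T \boldsymbol{x}_j - \sum_{j,k=1}^{M} \alpha_j \alpha_k \boldsymbol{x}_j^T \boldsymbol{x}_k \Big) \quad \text{s.t.} \quad \sum_{j=1}^{M} \alpha_j = 1 \ \text{ and } \ 0 \le \alpha_j \le C, \quad j = 1, \dots, M \qquad (4)$$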
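For readers who want the intermediate step between (3) and (4), the stationarity conditions can be written out explicitly. What follows is the standard SVDD derivation [11] restated in the notation used here, offered as a reading aid rather than as a quotation from [10]:

$$\frac{\partial L}{\partial R} = 2R \Big( 1 - \sum_{j=1}^{M} \alpha_j \Big) = 0 \ \Rightarrow\ \sum_{j=1}^{M} \alpha_j = 1$$

$$\frac{\partial L}{\partial \boldsymbol{o}} = -2 \sum_{j=1}^{M} \alpha_j (\boldsymbol{x}_j - \boldsymbol{o}) = \boldsymbol{0} \ \Rightarrow\ \boldsymbol{o} = \sum_{j=1}^{M} \alpha_j \boldsymbol{x}_j$$

$$\frac{\partial L}{\partial \xi_j} = C - \alpha_j - \gamma_j = 0 \ \Rightarrow\ 0 \le \alpha_j \le C \quad (\text{since } \alpha_j \ge 0,\ \gamma_j \ge 0)$$

Substituting these conditions back into the Lagrangian cancels the terms in $R^2$ and $\xi_j$ and leaves

$$L = \sum_{j=1}^{M} \alpha_j (\boldsymbol{x}_j - \boldsymbol{o})^T (\boldsymbol{x}_j - \boldsymbol{o}) = \sum_{j=1}^{M} \alpha_j \boldsymbol{x}_j^T \boldsymbol{x}_j - \sum_{j,k=1}^{M} \alpha_j \alpha_k \boldsymbol{x}_j^T \boldsymbol{x}_k ,$$

which is the objective maximized over the multipliers $\alpha_j$. The centre is therefore a weighted combination of the training points, and only the points with $\alpha_j > 0$ contribute to it.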
A particularly interesting feature is that only the data points whose Lagrange multipliers $\alpha_j$ are larger than zero, called “support vectors”, influence the shape of the region [10]. This allows one not only to avoid the time-consuming estimation of the complete density function, but also to determine the shape of the control boundary by using a small subset of the original data. By introducing the kernel trick, the inner product $\boldsymbol{a}^T \boldsymbol{b}$ is replaced by a kernel function $K(\boldsymbol{a} \times \boldsymbol{b})$. The K-chart is aimed at monitoring the stability over time of the kernel distance $kd(\boldsymbol{z})$ of any new observation $\boldsymbol{z} \in \mathbb{R}^p$ from the centre $\boldsymbol{o}$:

$$kd(\boldsymbol{z}) = K(\boldsymbol{z} \times \boldsymbol{z}) - 2 \sum_{j=1}^{M} \alpha_j K(\boldsymbol{x}_j \times \boldsymbol{z}) + \sum_{j,k=1}^{M} \alpha_j \alpha_k K(\boldsymbol{x}_j \times \boldsymbol{x}_k) \qquad (5)$$
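If the Gaussian radial basis (GRB) function is used, with kernel width parameter $S \in \mathbb{R}^+$, the kernel is defined for $\boldsymbol{a}, \boldsymbol{b} \in \mathbb{R}^p$ as follows:

$$K(\boldsymbol{a} \times \boldsymbol{b}) = \exp \left\{ -\frac{\| \boldsymbol{a} - \boldsymbol{b} \|^2}{S^2} \right\} \qquad (6)$$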
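A small worked consequence of (5) and (6), added here as a reading aid and not taken from [10], is that the kernel distance simplifies under the GRB kernel. Since $K(\boldsymbol{z} \times \boldsymbol{z}) = \exp\{0\} = 1$ for every $\boldsymbol{z}$:

$$kd(\boldsymbol{z}) = 1 - 2 \sum_{j=1}^{M} \alpha_j \exp \left\{ -\frac{\| \boldsymbol{x}_j - \boldsymbol{z} \|^2}{S^2} \right\} + \sum_{j,k=1}^{M} \alpha_j \alpha_k \exp \left\{ -\frac{\| \boldsymbol{x}_j - \boldsymbol{x}_k \|^2}{S^2} \right\}$$

The last term does not depend on $\boldsymbol{z}$ and can be computed once at the end of Phase I. Signalling when $kd(\boldsymbol{z})$ exceeds a limit is therefore equivalent to signalling when the weighted kernel sum $\sum_j \alpha_j K(\boldsymbol{x}_j \times \boldsymbol{z})$ drops below a corresponding threshold, and that sum only involves the support vectors.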
Different authors showed that this kernel function outperforms competing kernels as far as the SVDD methodology is concerned [18]. Ning and Tsung [10] showed that there are different possible approaches to the design of the K-chart, because there are three major parameters to set: the kernel width parameter $S$, the penalty coefficient $C$, and the control limit, denoted by $h$. By comparing different design solutions, Ning and Tsung [10] showed that the best performance might be achieved by reducing the number of parameters to two (i.e., $S$ and $h$).
Figure 4 – Example of cylindrical rolls before the grinding process (left) and details of surface finish in in-control (IC) and out-of-control (OOC) conditions (right)

In fact, by assuming $C > 1$, the constraint $0 \le \alpha_j \le C$ reduces to $\alpha_j \ge 0$ (the upper bound can never be active, because the coefficients sum to one), and problem (4) can be solved by introducing the kernel function $K(\boldsymbol{x}_{\cdot} \times \boldsymbol{x}_{\cdot})$. In this case, no penalty is applied and hence the kernel-based boundary is estimated by enclosing all of the training data. The false alarm rate is controlled by setting a proper value for the control limit $h$. Thus, only the $S$ and $h$ parameters remain to be determined. Appendix A reviews the procedure used to automatically select those two parameters and to design the K-chart.
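The design steps reviewed in this section can be put together in a compact sketch. Several choices below are assumptions made only for illustration and should not be read as the procedure of [10] or of Appendix A: the dual (4) with $\alpha_j \ge 0$ is solved approximately by Frank-Wolfe iterations on the simplex, the kernel width `S` is fixed by hand, the control limit `h` is taken as an empirical percentile of the training kernel distances, and the data are synthetic.

```python
import math
import random

def grb_kernel(a, b, S):
    """Gaussian radial basis kernel, Eq. (6)."""
    return math.exp(-sum((u - v) ** 2 for u, v in zip(a, b)) / S ** 2)

def fit_svdd(train, S, iters=2000):
    """Approximately solve dual (4) with alpha_j >= 0 and sum(alpha_j) = 1.

    Frank-Wolfe is an assumed solver choice for this sketch.
    Returns the coefficients and the constant term sum_jk a_j a_k K_jk.
    """
    M = len(train)
    K = [[grb_kernel(a, b, S) for b in train] for a in train]
    alpha = [1.0 / M] * M
    Ka = [sum(K[j][k] * alpha[k] for k in range(M)) for j in range(M)]
    for t in range(iters):
        # Gradient of the dual objective w.r.t. alpha_j: K_jj - 2 (K alpha)_j
        grad = [K[j][j] - 2.0 * Ka[j] for j in range(M)]
        i = max(range(M), key=grad.__getitem__)  # best simplex vertex
        step = 2.0 / (t + 2.0)
        # Move towards vertex i; alpha stays non-negative and sums to one.
        alpha = [(1.0 - step) * a for a in alpha]
        alpha[i] += step
        Ka = [(1.0 - step) * Ka[j] + step * K[j][i] for j in range(M)]
    const = sum(alpha[j] * Ka[j] for j in range(M))
    return alpha, const

def kernel_distance(z, train, alpha, const, S):
    """Kernel distance kd(z) of Eq. (5)."""
    cross = sum(a * grb_kernel(x, z, S)
                for a, x in zip(alpha, train) if a > 0)
    return grb_kernel(z, z, S) - 2.0 * cross + const

def empirical_limit(values, alpha):
    """(1 - alpha) empirical percentile, taken as an order statistic."""
    s = sorted(values)
    return s[min(len(s) - 1, math.ceil((1 - alpha) * len(s)) - 1)]

# Synthetic 3-modal in-control data (not real data).
random.seed(4)
centres = [(0.0, 0.0), (4.0, 0.0), (2.0, 4.0)]
train = [(random.gauss(cx, 0.3), random.gauss(cy, 0.3))
         for cx, cy in centres for _ in range(40)]
S = 1.0  # assumed kernel width
alpha, const = fit_svdd(train, S)

def kd(z):
    return kernel_distance(z, train, alpha, const, S)

h = empirical_limit([kd(x) for x in train], alpha=0.01)  # assumed limit rule
alarm = kd((10.0, 10.0)) > h  # a point far from every mode
```

In Phase II only `kd(z)` needs to be evaluated for each new observation and compared with `h`, which is what makes the chart one-sided.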
5. A REAL-CASE STUDY
The transverse roll grinding case study is used to discuss the applicability of the K-chart approach in a real industrial scenario. The product of the process consists of large cylindrical rolls for metal sheet milling operations (Fig. 4). The process may be affected by vibration phenomena (i.e., “chatter”), which may have a detrimental effect on the surface finish of the cylinders (Fig. 4). This kind of process involves one (or more) grinding cycles, each one consisting of multiple runs performed with different cutting parameters. Real data were collected by equipping a grinding machine tool with sensors (Fig. 5) and performing different machining cycles both in chatter-free and chattered conditions. The chatter (out-of-control) state needs to be quickly detected and suppressed (e.g., by using adaptive control or process optimization strategies [19]), in order to avoid undesired undulations of both the workpiece and the wheel, and to prevent the execution of extra grinding cycles to remove the chatter marks.
Figure 5 – Grinding machine tool used for real data collection (left panel) and schematic illustration of the accelerometer sensor mounting on the wheel head (right panel)

The grinding process was performed on a special alloyed steel roll with an initial diameter of 500 mm and an axial length of 1700 mm. The grinding wheel was made of aluminium oxide, with a nominal diameter of 700 mm and a width of 75 mm. Two triaxial accelerometer sensors (denoted by A and B) were mounted on the wheel head and on the headstock, respectively (as shown in Fig. 5), and the signal along the $X$-axis was used to detect out-of-control departures from a nominal and stable cutting condition. The signals were sampled at 2 kHz and segmented into sliding time windows of duration $T = 1$ s. The signals were processed online to compute the rms indexes, $x_{1,j} = rms_{A,j}$ and $x_{2,j} = rms_{B,j}$, $j = 1, 2, \dots$, i.e., the root mean square of the vibration signal within the $j$-th time window. The result is a bivariate quality characteristic $\{\boldsymbol{x}_j \in \mathbb{R}^2,\ j = 1, 2, \dots\}$, where $\boldsymbol{x}_j = [x_{1,j}, x_{2,j}]^T = [rms_{A,j}, rms_{B,j}]^T$.
Figure 6 – Scatterplot of the $x_{1,j}$ and $x_{2,j}$ indexes in both chatter-free and chattered conditions
Figure 7 – Control regions generated by the three different methods: $T^2$ chart with empirical limit (a), KDE-based method (b) and K-chart (c) – global modelling approach

Three grinding runs were performed by using the sets of cutting parameters shown in Table 1. An additional run was performed using the same parameters as Mode 1, during which a chatter onset was observed, and chatter marks were identified on the roll at the end of the run by visual inspection. The bivariate distribution of the monitored indexes $\boldsymbol{x}_j = [rms_{A,j}, rms_{B,j}]^T$ in both chatter-free and chattered conditions is shown in Fig. 6 (notice that the chatter-free distribution is the same as the one shown in Fig. 2). For additional details about the experimental set-up the reader is referred to [2] and [19]. Two different control charting approaches were tested, i.e., the global modelling approach and the multi-modelling one. In both cases, the K-chart method was compared with the Hotelling’s $T^2$ chart with empirical limit and the KDE-based method.
Figure 8 – Separate control regions, one for each mode, generated by the three different methods: $T^2$ chart with empirical limit (a), KDE-based method (b) and K-chart (c) – multi-modelling approach

With regard to the global modelling approach, by setting a Type I error $\alpha = 0.01$, the three methods yield the control regions shown in Fig. 7. The Phase I dataset consists of 990 data samples, divided into the three different in-control states, with a sample size that is proportional to the duration of the grinding run in each mode. The Gaussian kernel width parameter in the KDE-based method was selected by using the procedure advocated by [20], which represents the benchmark methodology and is implemented in most analytical software. The kernel width parameter, $S$, of the K-chart was selected by using the automatic procedure reviewed in Appendix A. As discussed in Section 3.1, the $T^2$ chart generates an elliptical control region that is not suitable to characterize the clustered variability of in-control data. On the contrary, both the KDE-based method and the K-chart generate a control region that adapts to the actual spread of training data.

With regard to the multi-modelling approach, three distinct control regions were estimated, one for each in-control state (Fig. 8). In this case, the KDE yields a slight over-fitting, possibly caused by the reduced number of data used for empirical density estimation. The control region produced by the K-chart, instead, exhibits an intermediate shape, in terms of smoothness, between the one generated by the KDE and the elliptical region produced by the $T^2$ chart. In the multi-modelling case, only the control region associated with the cutting parameters of Mode 1 was applied for chatter detection, because the chatter onset was observed only in that operating mode.

Monitoring approach   Method                       N° of samples before alarm
Global modelling      $T^2$ with empirical limit   11
Global modelling      KDE-based method             3
Global modelling      K-chart                      3
Multi-modelling       $T^2$ with empirical limit   4
Multi-modelling       KDE-based method             2
Multi-modelling       K-chart                      2

Table 2 – Performance of the compared methods (chatter detection)
Table 2 summarizes the performance of the different methods in terms of the number of data samples acquired before signalling the chatter onset. It is worth noting that the “number of samples before alarm” metric is used to make a relative comparison between the methods, such that a smaller value corresponds to a faster detection of the chatter onset. However, such a metric cannot be used to determine the exact point in time when chatter actually started, which is unknown.
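The metric reported in Table 2 can be computed with a few lines of code. The sketch below assumes one particular convention, namely that the count includes the sample that triggers the alarm; the text does not state which convention was used, and the function name and example values are hypothetical.

```python
def samples_before_alarm(statistics, limit):
    """Number of samples acquired up to and including the first one whose
    control statistic exceeds the limit; None if no alarm is raised.
    Counting the signalling sample itself is an assumed convention."""
    for count, value in enumerate(statistics, start=1):
        if value > limit:
            return count
    return None

# Hypothetical control statistics observed during a run with chatter onset.
run = [0.2, 0.5, 1.4, 0.3]
first_alarm = samples_before_alarm(run, limit=1.0)
```

Because all methods are applied to the same run, the counts can be compared with each other even though the true onset time is unknown.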
Figure 9 – K-chart based on the global modelling paradigm (a) and K-chart based on the multi-modelling paradigm (b)

Table 2 shows that the multi-modelling approach and the global modelling approach yield substantially comparable performance¹. In addition, in both cases (global and multi-modelling), the K-chart and the KDE-based method provide the same reactivity to chatter, which is higher than the one provided by the $T^2$ control chart with empirical limit. Nevertheless, the K-chart control limit is easy to compute, leading to the one-sided chart shown in Fig. 9, whose control statistic is the kernel distance $kd(\boldsymbol{z})$. On the contrary, the control limit in the KDE-based method is not analytically tractable, and hence the control chart design is more troublesome. The K-chart is also more efficient than the KDE-based method: in the roll grinding case study, the K-chart control limit estimation requires about 1.35 s, whereas the KDE-based method requires about 2.30 s for control boundary estimation². This makes the K-chart the most effective nonparametric alternative to the $T^2$ control chart. Fig. 9 a) shows the K-chart in the global modelling approach, where the unique control limit corresponds to the global control region in Fig. 7 c), whereas Fig. 9 b) shows the K-chart in the multi-modelling approach, where different control limits correspond to the different control regions in Fig. 8 c).

¹ The interested reader may refer to [13] for an extended study on the K-chart performance.
² CPU time refers to an Intel® Core™ i7-3740QM CPU @ 2.70 GHz and does not include the kernel width estimation in either method.
6. CONCLUSION
Real industrial processes frequently violate traditional assumptions about the underlying data distribution, and they often switch from one in-control state to another. Although several previous studies have focused on nonparametric methods for multivariate SPC, their extension to multimode processes still represents an unexplored research field. This study investigated the use of the one-class-classification paradigm, where the KDE-based method and the K-chart represent two suitable approaches for multimode process monitoring. The K-chart allows a multivariate control region to be estimated that adapts to the actual spread of the training data. It outperforms the widely used $T^2$ control chart with empirical limit and its performance is comparable to that achieved by estimating the complete density distribution (KDE-based method). However, contrary to the KDE-based method, the K-chart is more efficient from a computational viewpoint, as it does not require estimating the complete density, but only a control boundary that is influenced by a subset of training data. In addition, the K-chart, being representable as a one-sided chart, is expected to be easier to implement and more effective to use than the KDE-based method, whose control limit computation is not analytically tractable. In the frame of multimode process monitoring, the K-chart can be used either by estimating a single control region that globally adapts to the (unknown) training data distribution in multiple in-control modes, or by estimating distinct control regions, one for each process state. The second approach may be suitable if covariates are available to identify the current state or if the data patterns in different in-control states are easily separable. If at least one of these conditions is met, the monitoring approach known as multi-modelling may be preferred. Otherwise, the only viable solution is the so-called global modelling approach.
Future studies may be aimed at evaluating the performances of nonparametric method for multimode processes in the presence of high-dimensional data, when several variables are required to characterize the process.
7.
ACKNOWLEDGEMENTS
The study was partially funded by the National Technology Cluster project “CTN01_00163_216758 – High Performance Manufacturing”, funded by the Italian Ministry of Education, University and Research.
8.
REFERENCES
[1] Jin J., Shi J. (1999). Feature-Preserving Data Compression of Stamping Tonnage Information Using Wavelets, Technometrics, 41:4, 327–339
[2] Maggioni M., Marzorati E., Grasso M., Colosimo B.M., Parenti P. (2014). In-Process Quality Characterization of Grinding Processes: a Sensor-Fusion Based Approach, ASME 12th Biennial Conference on Engineering Systems Design, ESDA 2014, June 25-27, 2014, Copenhagen, Denmark
[3] Qiu P. (2008). Distribution-Free Multivariate Process Control Based On Log-Linear Modeling, IIE Transactions, 40, 664–677
[4] Zhao C., Yao Y., Gao F., Wang F. (2010). Statistical analysis and online monitoring for multimode processes with between-mode transitions, Chemical Engineering Science, 65(22), 5961–5975
[5] Ge Z., Song Z., Gao F. (2013). Review of recent research on data-based process monitoring, Industrial & Engineering Chemistry Research, 52(10), 3543–3562
[6] Qiu P. (2014). Introduction to Statistical Process Control, Boca Raton, FL: Chapman & Hall/CRC
[7] Hwang D.H., Han C. (1999). Real-time monitoring for a process with multiple operating modes, Control Engineering Practice, 7(7), 891–902
[8] Zhao S.J., Zhang J., Xu Y.M. (2004). Monitoring of processes with multiple operating modes through multiple principal component analysis models, Industrial & Engineering Chemistry Research, 43(22), 7025–7035
[9] Wang F., Tan S., Peng J., Chang Y. (2012). Process monitoring based on mode identification for multimode process with transitions, Chemometrics and Intelligent Laboratory Systems, 110(1), 144–155
[10] Ning X., Tsung F. (2013). Improved Design of Kernel Distance-Based Charts Using Support Vector Methods, IIE Transactions, 45:4, 464–476
[11] Tax D.M.J. (2001). One-Class Classification; Concept-Learning In The Absence Of Counter-Examples, Ph.D. thesis, Delft University of Technology
[12] Chen T., Morris J., Martin E. (2006). Probability density estimation via an infinite Gaussian mixture model: application to statistical process monitoring, Journal of the Royal Statistical Society: Series C (Applied Statistics), 55(5), 699–715
[13] Grasso M., Colosimo B.M., Semeraro Q., Pacella M. (2015). A Comparison Study of Distribution-Free Multivariate SPC Methods for Multimode Data, Quality and Reliability Engineering International, 31(1), 75–96
[14] Noorossana R., Saghaei A., Amiri A. (2012). Statistical Analysis of Profile Monitoring, John Wiley & Sons
[15] Qin S.J. (2012). Survey on data-driven industrial process monitoring and diagnosis, Annual Reviews in Control, 36(2), 220–234
[16] Ge Z., Yang C., Song Z., Wang H. (2008). Robust online monitoring for multimode processes based on nonlinear external analysis, Industrial & Engineering Chemistry Research, 47(14), 4775–4783
[17] Phaladiganon P., Kim S.B., Chen V.C., Baek J.G., Park S.K. (2011). Bootstrap-based T² multivariate control charts, Communications in Statistics - Simulation and Computation, 40(5), 645–662
[18] Tax D.M.J., Duin R.P.W. (2004). Support Vector Data Description, Machine Learning, 54, 45–66
[19] Parenti P., Leonesio M., Cassinari A., Bianchi G., Monno M. (2013). Model-Based Identification of Chatter Marks during Cylindrical Grinding, International Conference on Advanced Manufacturing Engineering and Technologies, NEWTECH 2013, October 27-30, 2013, Stockholm, Sweden
[20] Bowman A.W., Azzalini A. (1997). Applied Smoothing Techniques for Data Analysis, Oxford University Press, New York
Appendix A – Selection of kernel parameter and K-chart design
Tax and Duin [18] proposed a method that was further improved by Ning and Tsung [10]. The method is derived from multi-class classification problems, in which the misclassification error can be used as a criterion to select 𝑆. In a one-class-classification problem, a similar approach can be applied by generating artificial outliers. Tax and Duin [18] proposed drawing those outliers from a block-shaped or a hyper-spherical uniform distribution that encloses the training data in ℝ^𝑝. Given 𝑓𝑜+(𝑆), the proportion of artificial outliers that are classified as in-boundary data for a given choice of 𝑆, and #𝑆𝑉(𝑆), the number of support vectors, 𝑆 can be selected by minimizing:
\[ \gamma(S) = (1 - \nu)\,\frac{\#SV(S)}{M} + \nu\, f_{o+}(S), \tag{A1} \]
because #𝑆𝑉(𝑆)⁄𝑀 is a counterpart of the Type I error and 𝑓𝑜+(𝑆) is a counterpart of the Type II error, where 0 < 𝜈 < 1 is a weight. The procedure for the selection of the kernel width parameter is applied as follows:
1. Given a training set of 𝑀 observations, generate a number 𝑀𝑜 of artificial outliers;
2. Set 𝑆 equal to an initial value 𝑆0, and solve problem (4) for the 𝑀 + 𝑀𝑜 available data;
3. Compute 𝑓𝑜+(𝑆0) and #𝑆𝑉(𝑆0);
4. Set 𝑆 equal to a new value 𝑆0 + 𝑠, where 𝑠 is a step value, and repeat steps 3 and 4 until 𝑆 equals a pre-fixed upper limit 𝑆𝑈;
5. Find the value of 𝑆 (denoted 𝑆∗) such that #𝑆𝑉(𝑆∗)/𝑀 is nearest to the targeted Type I error;
6. Calculate the weight 𝜈 as follows:
\[ \nu(S^{*}) = \left( 1 + \frac{f_{o+}(S^{*})}{\#SV(S^{*})/M} \right)^{-1}; \]
7. Calculate the 𝛾(𝑆) value in Equation (A1), where 𝜈 = 𝜈(𝑆∗), for 𝑆 values in the range [𝑆0, 𝑆𝑈];
8. Finally, select the value of 𝑆 that minimizes 𝛾(𝑆).
Once the optimal value of the kernel width parameter is determined, the control region can be estimated. The control limit ℎ can be estimated as the 100(1 − 𝛼)% empirical percentile of the kernel distance 𝑘𝑑(𝒛𝑗), 𝑗 = 1,2, … , 𝑀 [10], where 𝛼 is the targeted Type I error.
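As an illustration, steps 5 to 8 of the procedure and the control limit computation can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation used in the study: it assumes that steps 1 to 4 have already produced, for each candidate kernel width, the fraction of support vectors #𝑆𝑉(𝑆)/𝑀 and the proportion 𝑓𝑜+(𝑆) of accepted artificial outliers. The function names `select_kernel_width` and `control_limit` are hypothetical, and the percentile convention (smallest order statistic with at least 100(1 − 𝛼)% of the training distances at or below it) is an assumption, since the text does not specify one.

```python
import math


def select_kernel_width(s_grid, sv_frac, fo_plus, alpha):
    """Steps 5-8: choose the kernel width S that minimises gamma(S) in (A1).

    s_grid  : candidate kernel widths S0, S0 + s, ..., SU
    sv_frac : #SV(S)/M for each candidate (Type I error counterpart)
    fo_plus : proportion of artificial outliers accepted for each candidate
              (Type II error counterpart)
    alpha   : targeted Type I error
    Assumes sv_frac is strictly positive at the selected S*.
    """
    idx = range(len(s_grid))
    # Step 5: S* is the candidate whose #SV(S)/M is nearest to alpha.
    i_star = min(idx, key=lambda i: abs(sv_frac[i] - alpha))
    # Step 6: nu(S*) = (1 + fo+(S*) / (#SV(S*)/M))^(-1).
    nu = 1.0 / (1.0 + fo_plus[i_star] / sv_frac[i_star])
    # Step 7: gamma(S) = (1 - nu) * #SV(S)/M + nu * fo+(S) over the grid.
    gamma = [(1.0 - nu) * a + nu * f for a, f in zip(sv_frac, fo_plus)]
    # Step 8: the selected S minimises gamma(S).
    i_opt = min(idx, key=lambda i: gamma[i])
    return s_grid[i_opt], nu, gamma


def control_limit(kd_train, alpha):
    """Empirical 100(1 - alpha)% percentile of the training kernel distances."""
    ranked = sorted(kd_train)
    m = len(ranked)
    # Rounding guards against floating-point noise before taking the ceiling.
    k = math.ceil(round((1.0 - alpha) * m, 9))
    k = min(max(k, 1), m)
    return ranked[k - 1]


if __name__ == "__main__":
    # Hypothetical values, for illustration only (not results from the study).
    s_grid = [1.0, 2.0, 3.0, 4.0]
    sv_frac = [0.40, 0.10, 0.05, 0.02]
    fo_plus = [0.00, 0.05, 0.20, 0.60]
    s_opt, nu, gamma = select_kernel_width(s_grid, sv_frac, fo_plus, alpha=0.05)
    print("selected S:", s_opt, "weight nu:", nu)
```

With the weight defined as in step 6, the two terms of 𝛾(𝑆) are equal at 𝑆∗, so neither error counterpart dominates the criterion at the width that matches the targeted Type I error.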