IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 22, NO. 12, DECEMBER 2011
Quality Relevant Data-Driven Modeling and Monitoring of Multivariate Dynamic Processes: The Dynamic T-PLS Approach
Gang Li, Baosheng Liu, S. Joe Qin, Fellow, IEEE, and Donghua Zhou, Senior Member, IEEE
Abstract: In the field of data-based monitoring, the nonlinear iterative partial least squares procedure has been a useful tool for process data modeling, and it is also the foundation of projection to latent structures (PLS) models. To describe dynamic processes properly, a dynamic PLS algorithm is proposed in this paper for dynamic process modeling, which captures the dynamic correlation between the measurement block and the quality data block. For the purpose of process monitoring, a dynamic total PLS (T-PLS) model is presented to decompose the measurement block into four subspaces. The new model is the dynamic extension of the T-PLS model, which is efficient for detecting quality-related abnormal situations. Several examples are given to show the effectiveness of dynamic T-PLS models and the corresponding fault detection methods.
Index Terms: Data-based monitoring, dynamic total projection to latent structures, multivariate dynamic processes, quality-related monitoring.
I. INTRODUCTION
AS MULTIVARIATE process measurements are highly cross-correlated and auto-correlated in many industrial processes, it has been a challenging problem to capture the relation between many measurements and the required quality indices. For a long time, projection to latent structures (PLS) has been used as an efficient approach to analyze multivariate data and model multivariate processes [1], [2]. On the other hand, PLS has also been used as a basic technique in multivariate statistical process monitoring in the past two decades [3]–[5]. The basic idea of PLS modeling is to extract a few latent variables from highly correlated measurements according to the covariance between measurement and quality data. By projecting the measured data onto the reduced subspace, the quality-related part and quality-unrelated part
Manuscript received January 30, 2011; revised June 17, 2011; accepted August 14, 2011. Date of publication November 14, 2011; date of current version December 13, 2011. This work was supported in part by the National 973 Project under Grant 2010CB731800 and Grant 2009CB32602 and the Natural Science Foundation of China under Grant 61020106003, Grant 61021063, Grant 61028010, and Grant 61074085. G. Li and D. Zhou are with the Department of Automation, TNList, Tsinghua University, Beijing 100084, China (e-mail:
[email protected]. edu.cn;
[email protected]). B. Liu is with the Equipment Academy of the Air Force, Beijing 100085, China (e-mail:
[email protected]). S. J. Qin is with the Department of Chemical and Electrical Engineering, University of Southern California, Los Angeles, CA 90089 USA (e-mail:
[email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNN.2011.2165853
can be separated and monitored, respectively. For the purpose of process monitoring, PLS models can be replaced by total PLS (T-PLS) models, which decompose the measurement space further [6]. With T-PLS models, quality-related fault diagnosis can be performed effectively for continuous multivariate processes [7], [8].
Both PLS and T-PLS models consider only static relations between measurements and quality data. However, when the true relationship between measurement and quality data is dynamic, traditional PLS models are not sufficient to model the process, and there are a number of possible ways to handle the problem. A widely used approach is to include a relatively large number of lagged values of the input and output variables in the measurement block. If the augmented input matrix includes only the lagged values of the input variables, the algebraic PLS algorithm results in a PLS finite impulse response (FIR) model [9], [10]. If the augmented input matrix includes lagged values of both inputs and outputs, then a parsimonious multivariate autoregressive moving-average model can be built [11]. As these ideas are easy to realize, similar research can be found in the recent literature. Baffi et al. extended the dynamic PLS (D-PLS) to the nonlinear case with an augmented input matrix and applied it to predictive control [12], [13]. Chen and Liu proposed a dynamic version of PLS for online batch process monitoring based on an augmented matrix including lagged input variables [14]. Lee et al. put forward an approach for fault diagnosis based on system decomposition and D-PLS including lagged values of both inputs and outputs [15], [16]. Recently, a local dynamic partial least squares approach was proposed by Fletcher et al. for the modeling of batch processes, which still made use of the augmented input matrix [17]. This approach was also applied to a dearomatization process [18] and an inferential control system of distillation compositions [19].
In fact, these ideas are also widely used in other multivariate projection methods, such as principal component analysis (PCA) and independent component analysis [20]. The aforementioned approaches have the advantages of specifying the initial model with generality and reducing the model with the PLS algorithm. However, when the number of lagged variables grows, the model order and computational load increase significantly. The column dimension of the augmented input matrix can be 5–50 times that of the original data for FIR models, which makes the resulting model cumbersome [21]. Besides, these approaches provide only a good
1045–9227/$26.00 © 2011 IEEE
LI et al.: MODELING AND MONITORING OF MULTIVARIATE DYNAMIC PROCESSES
input–output mapping of the process rather than offering a reasonable statistical model for fault detection.
Kaspar and Ray used a D-PLS procedure directly for control system design in [21]. Their method does not involve the use of lagged values of the inputs, but utilizes filtering of the input data. By this preprocessing, the major dynamic component in the process was removed, and hence the static PLS algorithm could be applied to the filtered input and output data. The dynamic filters are designed by using some prior knowledge or by minimizing the output modeling residuals. Instead of dynamically filtering the input data, Lakshminarayanan et al. proposed another dynamic version of the PLS model, which is based on the direct modification of the PLS inner relation [22]. They relate the input and output scores with a dynamic model instead of a static model. Although it was reported that the identified models can be adequate with sufficient low-frequency input signals, the PLS outer model is still static and inconsistent with the dynamic inner model.
Although these dynamic versions of PLS algorithms can describe dynamic processes, there is no modification of the PLS outer model in these methods. In this paper, we propose a new D-PLS objective for searching input and output scores and develop a D-PLS algorithm based on this objective. The new D-PLS algorithm leads to a new outer model, which is consistent with the inner model between the input and output scores. In addition, to monitor quality-related abnormal situations with dynamic process data, we derive the dynamic T-PLS model.
The remainder of this paper is organized as follows. In Section II, the traditional PLS algorithm and its objective are reviewed. Following that, a D-PLS objective is proposed and a D-PLS algorithm is developed. In Section III, a dynamic T-PLS model is constructed for process monitoring, and the properties of this model are analyzed.
The fault detection indices are derived on the basis of the new model in Section IV. In order to show the effectiveness of the new models, several case studies and numerical simulations are given in detail in Section V. The conclusions are summarized in the last section.
II. D-PLS ALGORITHM
A. Nonlinear Iterative Partial Least Squares (NIPALS)
The basic concepts and the PLS algorithm have been reported in the chemometrics literature [1], [2]. The standard PLS algorithm extracts the latent variables from the input data to interpret the output data, and builds a linear algebraic relation between the input and output scores. Consider m process variables measured at n different times, which form the input matrix $X \in R^{n \times m}$. Measurements on p quality and productivity variables at n different times are usually much less frequent, and form the output matrix $Y \in R^{n \times p}$. By using the NIPALS algorithm, the scaled and centered X and Y can be modeled as
$$
\begin{cases}
X = \sum_{i=1}^{A} t_i p_i^T + E = T P^T + E \\
Y = \sum_{i=1}^{A} t_i q_i^T + F = T Q^T + F
\end{cases} \tag{1}
$$
where $T = [t_1, \ldots, t_A]$, $P = [p_1, \ldots, p_A]$, and $Q = [q_1, \ldots, q_A]$. $t_i$ $(i = 1, \ldots, A)$ are the score vectors, $p_i$ $(i = 1, \ldots, A)$ are the loading vectors for X, and $q_i$ $(i = 1, \ldots, A)$ are the loading vectors for Y. The number of PLS scores A is usually determined by cross validation. The modeling residuals for X and Y are represented by E and F, respectively. The NIPALS algorithm is summarized in the Appendix. The objective of PLS embedded in each iteration is to find the solution to the following problem:
$$
\max \; w^T X^T Y c \quad \text{s.t.} \quad \|w\| = 1, \; \|c\| = 1. \tag{2}
$$
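As an illustration of objective (2), the following minimal sketch uses the standard fact that a bilinear form $w^T M c$ with unit-norm w and c is maximized by the leading left and right singular vectors of M. It is not the NIPALS iteration itself, and the function name `pls_first_weights` and the synthetic data are assumptions made only for this example.

```python
import numpy as np

def pls_first_weights(X, Y):
    """Maximize w^T X^T Y c subject to ||w|| = ||c|| = 1 via the SVD of X^T Y."""
    U, s, Vt = np.linalg.svd(X.T @ Y, full_matrices=False)
    w = U[:, 0]      # leading left singular vector (input weight)
    c = Vt[0, :]     # leading right singular vector (output weight)
    return w, c, s[0]

# Synthetic, centered and scaled data (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
Y = X @ rng.standard_normal((5, 2)) + 0.1 * rng.standard_normal((200, 2))
X = (X - X.mean(0)) / X.std(0)
Y = (Y - Y.mean(0)) / Y.std(0)

w, c, objective = pls_first_weights(X, Y)
t = X @ w            # first input score vector
```

The attained objective value equals the largest singular value of $X^T Y$, which gives a quick numerical check of the solution.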
Obviously, this objective only describes the static variations within the input data that are most related to the output data. However, it cannot extract dynamic variation or relationships between input and output data.
B. D-PLS Algorithm
In order to extend the objective to the dynamic case, the following objective for each iteration of outer modeling is proposed:
$$
\begin{aligned}
\max_{w, c, \beta_{(i)}} \;& \left( w^T X_{(0)}^T \beta_{(0)} + \cdots + w^T X_{(q-1)}^T \beta_{(q-1)} \right) Y c \\
\text{s.t.} \;& \|w\| = \|c\| = 1 \\
& \beta_{(0)}^2 + \beta_{(1)}^2 + \cdots + \beta_{(q-1)}^2 = 1
\end{aligned} \tag{3}
$$
where $X_{(i)}$ represents the input data with i time lags, and $\beta_{(i)}$ is a weight coefficient for $X_{(i)} w$. This objective maximizes the dynamic linear relation of X and Y by searching for a direction vector w and a coefficient vector $\beta = [\beta_{(0)}, \ldots, \beta_{(q-1)}]^T$. While the weight vector w is extracted in the original variable space, the covariance between the linear combination of Y data and that of X data with lagged values is maximized. The dynamic extension does not increase the dimension of w, and describes the dynamic correlation of X and Y in the outer modeling.
Remark 1: There is a hidden assumption behind the objective, i.e., the dynamics of the process of interest are extracted in a reduced low-dimensional subspace. The objective captures the most Y-related variation in X with only several dynamic components, while the direct use of the PLS procedure on lagged variables only leads to a complex projection that is difficult to interpret and compute.
In order to simplify the expression, denote $X_g = [X_{(0)}, X_{(1)}, \ldots, X_{(q-1)}]$. Then, the new objective can be rephrased as
$$
\begin{aligned}
\max_{w, c, \beta} \;& (\beta \otimes w)^T X_g^T Y c = \left( \beta^T \otimes w^T \right) X_g^T Y c \\
\text{s.t.} \;& \|w\| = \|c\| = \|\beta\| = 1
\end{aligned} \tag{4}
$$
where $\beta^T \otimes w^T = [\beta_{(0)} w^T, \beta_{(1)} w^T, \ldots, \beta_{(q-1)} w^T]$ is the Kronecker product. To maximize the D-PLS objective, we use Lagrange multipliers
$$
\max J = \left( \beta^T \otimes w^T \right) X_g^T Y c + \frac{1}{2} \lambda_w \left( 1 - w^T w \right) + \frac{1}{2} \lambda_c \left( 1 - c^T c \right) + \frac{1}{2} \lambda_\beta \left( 1 - \beta^T \beta \right). \tag{5}
$$
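Taking derivatives with respect to w, c, and β leads to
$$
\frac{\partial J}{\partial \beta} = \left( I_q \otimes w^T \right) X_g^T Y c - \lambda_\beta \beta = 0 \tag{6}
$$
$$
\frac{\partial J}{\partial w} = \left( \beta^T \otimes I_m \right) X_g^T Y c - \lambda_w w = 0 \tag{7}
$$
$$
\frac{\partial J}{\partial c} = \left[ \left( \beta^T \otimes w^T \right) X_g^T Y \right]^T - \lambda_c c = 0. \tag{8}
$$
Substituting (8) into (6) and (7), we can obtain
$$
\left( I_q \otimes w^T \right) X_g^T Y Y^T X_g (\beta \otimes w) = \lambda_\beta \lambda_c \beta \tag{9}
$$
$$
\left( \beta^T \otimes I_m \right) X_g^T Y Y^T X_g (\beta \otimes w) = \lambda_w \lambda_c w. \tag{10}
$$
Note that
$$
\beta \otimes w = (I_q \otimes w) \beta = (\beta \otimes I_m) w \tag{11}
$$
and define
$$
\begin{aligned}
S &\equiv X_g^T Y Y^T X_g \in R^{mq \times mq} \\
S_w &\equiv (I_q \otimes w)^T S (I_q \otimes w) \in R^{q \times q} \\
S_\beta &\equiv (\beta \otimes I_m)^T S (\beta \otimes I_m) \in R^{m \times m}.
\end{aligned} \tag{12}
$$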
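Identity (11) and the dimensions stated in (12) can be checked numerically. The short sketch below uses random matrices of arbitrary, assumed sizes; it only verifies the algebra and is not part of the modeling procedure.

```python
import numpy as np

rng = np.random.default_rng(1)
m, q, n, p = 4, 3, 50, 2                      # illustrative sizes only
w = rng.standard_normal((m, 1)); w /= np.linalg.norm(w)
beta = rng.standard_normal((q, 1)); beta /= np.linalg.norm(beta)
Xg = rng.standard_normal((n, m * q))
Y = rng.standard_normal((n, p))

# Identity (11): beta (x) w = (I_q (x) w) beta = (beta (x) I_m) w
Kw = np.kron(np.eye(q), w)                    # shape (m*q, q)
Kb = np.kron(beta, np.eye(m))                 # shape (m*q, m)
bw = np.kron(beta, w)                         # shape (m*q, 1)

# Definitions (12)
S = Xg.T @ Y @ Y.T @ Xg                       # (m*q, m*q)
Sw = Kw.T @ S @ Kw                            # (q, q)
Sb = Kb.T @ S @ Kb                            # (m, m)

# The same quadratic form written three ways
j_full = (bw.T @ S @ bw).item()
j_beta = (beta.T @ Sw @ beta).item()
j_w = (w.T @ Sb @ w).item()
```

The three scalars agree, which is what allows the coupled stationarity conditions to be rewritten as two small eigenvalue problems.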
Substituting (11) and (12) into (9) and (10), we can obtain
$$
S_w \beta = \lambda_\beta \lambda_c \beta, \qquad S_\beta w = \lambda_w \lambda_c w. \tag{13}
$$
This result indicates that the globally optimal solution (w, β) consists of eigenvectors of the matrices $S_\beta$ and $S_w$, respectively. Further, the optimal objective can be calculated from the solution as
$$
\begin{aligned}
J_{\max} &= \frac{1}{\lambda_c} (\beta \otimes w)^T S (\beta \otimes w) \\
&= \frac{1}{\lambda_c} \beta^T (I_q \otimes w)^T S (I_q \otimes w) \beta \\
&= \frac{1}{\lambda_c} \beta^T S_w \beta = \lambda_\beta \beta^T \beta = \lambda_\beta.
\end{aligned} \tag{14}
$$
Similarly, we can also obtain $J_{\max} = \lambda_w$. On the other hand
$$
\begin{aligned}
Y^T X_g \left( I_q \otimes w w^T \right) X_g^T Y c &= \lambda_\beta \lambda_c c \\
Y^T X_g \left( \beta \beta^T \otimes I_m \right) X_g^T Y c &= \lambda_w \lambda_c c.
\end{aligned} \tag{15}
$$
This result means that the optimal solution c is an eigenvector of the above matrices. Further, it can be verified that $J_{\max} = \lambda_c$. Consequently, the optimal solution must satisfy the following relation:
$$
J_{\max} = \lambda_w = \lambda_\beta = \lambda_c. \tag{16}
$$
However, the solution for w and β cannot be calculated directly using (13), because $S_w$ and $S_\beta$ depend on w and β, respectively. A possible method is to initialize w first, and then use (13) to search for w and β iteratively.
Remark 2: The existence of a global optimal solution is guaranteed. Note that (4) is the maximization problem of a multivariate continuous function within a closed region. Using multivariate calculus, there must be an extreme point for (4).
After finding the optimal solution of w, β, and c, the D-PLS outer and inner modeling can be performed as follows:
$$
t = X w, \qquad t_g = X_g (\beta \otimes w), \qquad q = \frac{Y^T t_g}{t_g^T t_g}, \qquad p = \frac{X^T t}{t^T t}. \tag{17}
$$
The residuals of X and Y should be calculated for the next iteration of modeling
$$
E = X - t p^T, \qquad E_g = X_g - X_g \left( I_q \otimes w p^T \right), \qquad F = Y - t_g q^T. \tag{18}
$$

Algorithm 1 D-PLS
Center the raw data of X and Y to zero mean and scale them to unit variance, and set i = 1.
1) Set $w_i = [1, 0, \ldots, 0]^T$, calculate $S_w$, solve the following eigenvalue decomposition, and let $\beta_i$ be the eigenvector corresponding to the largest eigenvalue
$$
S_w \beta_i = \lambda_\beta \lambda_c \beta_i. \tag{19}
$$
2) Calculate $S_\beta$, solve the following problem, and let $w_i$ be the eigenvector corresponding to the largest eigenvalue
$$
S_\beta w_i = \lambda_w \lambda_c w_i. \tag{20}
$$
3) Iterate steps 1 and 2 until the eigenvalues and $w_i$, $\beta_i$ converge.
4) $t_i = X_i w_i$, $t_{gi} = \sum_{j=0}^{q-1} \beta_{i(j)} t_{i(j)}$, $q_i = Y_i^T t_{gi} / t_{gi}^T t_{gi}$, $p_i = X_i^T t_i / t_i^T t_i$.
5) $X_{i+1} = X_i - t_i p_i^T$, $X_{g,i+1} = [X_{i+1(0)}, X_{i+1(1)}, \ldots, X_{i+1(q-1)}]$, $Y_{i+1} = Y_i - t_{gi} q_i^T$, $i = i + 1$, return to step 1 until $i > A$.
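The alternating eigenvalue iteration of Algorithm 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`dpls`, `lagged_blocks`, `top_eigvec`), the row alignment of Y with the lagged blocks (the first q − 1 rows are dropped), the sign normalization of eigenvectors, and the synthetic test data are all choices made here for the example.

```python
import numpy as np

def top_eigvec(M):
    """Unit eigenvector of a symmetric matrix for its largest eigenvalue."""
    _, vecs = np.linalg.eigh(M)               # eigenvalues in ascending order
    v = vecs[:, -1]
    # Fix the arbitrary sign so that successive iterates are comparable.
    return v if v[np.argmax(np.abs(v))] > 0 else -v

def lagged_blocks(X, q):
    """[X_(0), ..., X_(q-1)]; row r of X_(j) holds x(q-1+r-j)."""
    n = X.shape[0]
    return [X[q - 1 - j:n - j] for j in range(q)]

def dpls(X, Y, A, q, n_iter=100, tol=1e-10):
    """Sketch of Algorithm 1; X and Y are assumed centered and scaled."""
    X = X.copy()
    Y = Y[q - 1:].copy()                      # align Y with the lagged rows
    m = X.shape[1]
    W, B, P, Q = [], [], [], []
    for _ in range(A):
        Xg = np.hstack(lagged_blocks(X, q))   # lags rebuilt from deflated X
        S = Xg.T @ Y @ Y.T @ Xg
        w = np.zeros(m); w[0] = 1.0           # step 1: w = [1, 0, ..., 0]^T
        beta = np.zeros(q)
        for _ in range(n_iter):
            Kw = np.kron(np.eye(q), w[:, None])
            beta_new = top_eigvec(Kw.T @ S @ Kw)          # (19)
            Kb = np.kron(beta_new[:, None], np.eye(m))
            w_new = top_eigvec(Kb.T @ S @ Kb)             # (20)
            change = np.linalg.norm(w_new - w) + np.linalg.norm(beta_new - beta)
            w, beta = w_new, beta_new
            if change < tol:
                break
        t = X @ w                             # score on all n rows
        tg = Xg @ np.kron(beta, w)            # dynamic score
        qv = Y.T @ tg / (tg @ tg)
        p = X.T @ t / (t @ t)
        X = X - np.outer(t, p)                # deflate X
        Y = Y - np.outer(tg, qv)              # deflate Y
        W.append(w); B.append(beta); P.append(p); Q.append(qv)
    return np.array(W).T, np.array(B).T, np.array(P).T, np.array(Q).T, X, Y

# Illustrative data: y(k) depends on one direction of x at lags 0 and 1.
rng = np.random.default_rng(0)
n, m = 2000, 4
Xraw = rng.standard_normal((n, m))
w0 = np.array([0.6, 0.8, 0.0, 0.0])
t0 = Xraw @ w0
y = np.zeros(n)
y[1:] = t0[1:] + 0.8 * t0[:-1]
Xc = Xraw - Xraw.mean(0)
Yc = (y - y.mean())[:, None]
W, B, P, Q, E, F = dpls(Xc, Yc, A=1, q=2)
```

In this synthetic setting the extracted direction `W[:, 0]` should be close to `w0` up to sign, and `B[:, 0]` should be roughly proportional to the lag weights (1, 0.8); on real data neither is known in advance.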
Let $X = E$, $X_g = E_g$, and $Y = F$, and repeat the above procedure until A PLS factors have been extracted. Note from (18) that the block columns of $E_g$ are simply the lagged values of E. Therefore, it is only necessary to deflate X and then form $E_g$ from the lagged deflated X. To sum up, the dynamic NIPALS procedure is listed in Algorithm 1. In that procedure, $t_{i(j)}$ means the value of $t_i$ with j time lags, and $t_i = t_{i(0)}$. The same notation applies to $X_{i+1(j)}$.
The number of dynamic principal factors A and the number of time lags q are two important structure parameters. PLS usually applies cross validation to determine the number of PLS factors A [1]. Here, a cross validation with two parameters is used to determine A and q jointly. Firstly, the parameters are restricted to a finite region, $A \le A_{\max}$, $q \le q_{\max}$. For all possible pairs of
[A, q], perform the following cross-validation test. Firstly, divide the test data (X, Y) into $n_{cv}$ subsets. Then, choose one subset $(X_i, Y_i)$ $(i = 1, \ldots, n_{cv})$ each time, and train a dynamic model using the D-PLS algorithm with the rest of the training data. Finally, calculate the prediction error for $Y_i$ with $X_i$ and the trained model. Repeat this for $i = 1, \ldots, n_{cv}$, and compute the prediction error sum of squares (PRESS). The parameter pair (A, q) that corresponds to the smallest PRESS is chosen as the model parameter pair.
III. DYNAMIC T-PLS MODELS
With the proposed D-PLS algorithm, input X and output Y can be modeled dynamically as follows:
$$
\begin{cases}
X = T P^T + E \\
Y = G_1(z^{-1}) t_1 q_1^T + \cdots + G_A(z^{-1}) t_A q_A^T + F
\end{cases} \tag{21}
$$
where $G_i(z^{-1}) = \beta_i^T z$ describes the dynamic relation between quality data Y and dynamic score $t_i$, $z = [1, z^{-1}, \ldots, z^{-q+1}]^T$, and $z^{-1}$ is the unit delay operator.
A. Dynamic Modeling of Scores
It is convenient to monitor the dynamic combination of scores
$$
t_{gi}(k) = G_i(z^{-1}) t_i(k) = \sum_{j=0}^{q-1} \beta_{i,j} t_i(k - j). \tag{22}
$$
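Equation (22) is a finite impulse response filter applied to each score sequence. A minimal sketch, with the helper name `dynamic_score` and the toy numbers chosen only for illustration, is:

```python
import numpy as np

def dynamic_score(t, beta):
    """t_g(k) = sum_j beta[j] * t(k - j), returned for k = q-1, ..., n-1."""
    # 'valid' keeps only the samples for which all q lags are available.
    return np.convolve(t, beta, mode="valid")

t = np.array([1.0, 2.0, 3.0, 4.0])      # toy score sequence t(0..3)
beta = np.array([0.5, 0.25])            # toy lag weights, q = 2
tg = dynamic_score(t, beta)             # -> [1.25, 2.0, 2.75]
```

For example, the first output is 0.5·t(1) + 0.25·t(0) = 1.25.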
However, as $t_{gi}(k)$ is auto-correlated, it is not proper to perform statistical monitoring on it directly. As industrial processes often operate around a fixed point, $t_{gi}(k)$ can be seen as a quasi-stationary process, which can be described with a stationary time series model. Denote $t_g(k) = [t_{g1}(k), \ldots, t_{gA}(k)]^T$, which can be modeled with a vector auto-regressive (VAR) model
$$
t_g(k) = \alpha_1 t_g(k - 1) + \cdots + \alpha_{p_g} t_g(k - p_g) + v(k) \tag{23}
$$
where $\alpha_i$ are the parameters of VAR model (23), and $p_g$ is the model order of VAR model (23). Denote $t(k) = [t_1(k), \ldots, t_A(k)]^T$, and it is easy to find that (23) is also the VAR model for t(k). As $t_g(k)$ is not obtained directly, it is simpler to build the following VAR model and monitor the residual v(k):
$$
t(k) = A_1 t(k - 1) + \cdots + A_p t(k - p) + v(k) \tag{24}
$$
where $A_i$ are the parameters of VAR model (24), and p is the model order of VAR model (24). If v(k) can be seen as a zero-mean white noise series, the parameters can be estimated with the multivariate least squares algorithm [23]
$$
\hat{\Theta} = \left( \sum_{i=p+1}^{p+n} \varphi(i) \varphi(i)^T \right)^{-1} \left( \sum_{i=p+1}^{p+n} \varphi(i) t(i)^T \right) \tag{25}
$$
where $\Theta = [A_1, \ldots, A_p]^T$ and $\varphi(k) = [t(k-1)^T, \ldots, t(k-p)^T]^T$. Model order p is determined by the Akaike information criterion (AIC) [24].
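The least squares estimate (25) can be sketched with a stacked regressor matrix whose rows are $\varphi(k)^T$. In the sketch below, the function name `fit_var`, the use of `numpy.linalg.lstsq` in place of the explicit normal equations, and the simulated VAR(1) data are assumptions for illustration; order selection by AIC is not included.

```python
import numpy as np

def fit_var(T, p):
    """Least squares fit of t(k) = A_1 t(k-1) + ... + A_p t(k-p) + v(k).

    T has one score vector t(k)^T per row. Returns ([A_1, ..., A_p], residuals).
    """
    n, a = T.shape
    # Row k of Phi is phi(k)^T = [t(k-1)^T, ..., t(k-p)^T], for k = p..n-1.
    Phi = np.hstack([T[p - i:n - i] for i in range(1, p + 1)])
    target = T[p:]
    Theta, *_ = np.linalg.lstsq(Phi, target, rcond=None)   # Theta = [A_1..A_p]^T
    A_list = [Theta[i * a:(i + 1) * a].T for i in range(p)]
    V = target - Phi @ Theta                                # residuals v(k)
    return A_list, V

# Simulated VAR(1) scores (illustrative coefficients only).
rng = np.random.default_rng(2)
A_true = np.array([[0.5, 0.1],
                   [0.0, 0.3]])
n = 5000
T = np.zeros((n, 2))
for k in range(1, n):
    T[k] = A_true @ T[k - 1] + rng.standard_normal(2)

A_hat, V = fit_var(T, p=1)
```

The residual matrix `V` plays the role of v(k) and is the quantity on which the D-statistic is later computed.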
Algorithm 2 Dynamic latent variable (DLV) modeling algorithm
1) Construct $E_0 = [e(1), \ldots, e(n)]^T$, $E_s = [e(s + 1), \ldots, e(s + n)]^T$. Perform SVD on $(1/k) E_{s+1} E_0 = U V^T$. Let $P_d = U_{1:A_d}$, where s is the time shift in the DLV model, and $A_d$ corresponds to the number of nonzero singular values.
2) $t_d(k) = P_d^T e(k)$. Use a VAR model to describe $t_d(k)$ as
$$
t_d(k) = D_1 t_d(k - 1) + \cdots + D_r t_d(k - r) + v_d(k). \tag{26}
$$
Estimate $D_i$ $(i = 1, \ldots, r)$ with (25); r is determined by AIC.
3) Perform PCA on $E_d = E(I - P_d P_d^T)$, and obtain $E_d = T_s P_s^T + E_s$ with $A_s$ principal components, where $A_s$ is determined by PCA-based methods.
B. DLV Modeling for Residual E
Furthermore, there still exist dynamic and static variations with large variability in the residual E of (21). For the output-irrelevant part E, a simplified version of DLV is used to monitor this part [25]. The ultimate structure provided by Algorithm 2 is
$$
\begin{cases}
X = T P^T + T_d P_d^T + T_s P_s^T + E_s \\
Y = T G(z^{-1}) Q^T + F.
\end{cases} \tag{27}
$$
For example, the model can be expressed as
$$
\begin{cases}
x(k) = P t(k) + P_d t_d(k) + P_s t_s(k) + e_s(k) \\
y(k) = Q [B_0 t(k) + \cdots + B_{q-1} t(k - q + 1)] + f(k) \\
t(k) = A_1 t(k - 1) + \cdots + A_p t(k - p) + v(k) \\
t_d(k) = D_1 t_d(k - 1) + \cdots + D_r t_d(k - r) + v_d(k)
\end{cases} \tag{28}
$$
where $B_j = \mathrm{diag}\{\beta_{1(j)}, \ldots, \beta_{A(j)}\}$ $(j = 0, \ldots, q - 1)$. We summarize the whole procedure of dynamic T-PLS modeling as follows.
1) Use D-PLS (Algorithm 1) to model the dynamic relation of X and Y as in (21), where A and q are determined by a 2-D cross validation.
2) To describe the dynamic scores t(k) with a VAR model, use (25) to estimate the parameters, where the model order p is determined by AIC.
3) Perform DLV modeling (Algorithm 2) to further decompose the residual part E.
C. Dynamic T-PLS Structure
Although dynamic T-PLS is a dynamic model, it performs a static decomposition of the x space. The dynamic relationship between different variables and the auto-correlation of each variable are described in different subspaces.
Lemma 1: The dynamic T-PLS model induces an oblique decomposition on the input space
$$
\begin{aligned}
x &= \hat{x}_y + \hat{x}_d + \hat{x}_s + \tilde{x}_s \\
\hat{x}_y &= P R^T x \equiv C_1 x \in S_y \\
\hat{x}_d &= P_d P_d^T (I - P R^T) x \equiv C_2 x \in S_d \\
\hat{x}_s &= P_s P_s^T (I - P_d P_d^T)(I - P R^T) x \equiv C_3 x \in S_s \\
\tilde{x}_s &= (I - P_s P_s^T)(I - P_d P_d^T)(I - P R^T) x \equiv C_4 x \in S_r
\end{aligned} \tag{29}
$$
where
$$
C_1 + C_2 + C_3 + C_4 = I_m. \tag{30}
$$
The proof of this lemma is omitted owing to page limitations; it is similar to that in [6]. It is worth noting that the different subspaces have clear meanings. Subspace $S_y$ reflects the quality-related dynamic variation (including static variation) in the process, subspace $S_d$ denotes quality-unrelated dynamic variation in the process, while subspace $S_s$ represents quality-unrelated static variation in the process. In particular, subspace $S_r$ contains the residual part of this model with very little variability, which is not excited in the normal situation.
IV. DYNAMIC T-PLS-BASED FAULT DETECTION
The dynamic T-PLS-based monitoring is similar to the T-PLS-based monitoring in [6]. As $v(k)$, $v_d(k)$, and $t_s(k)$ are nearly time-independent and contain a large variability, D-statistics are suitable for monitoring them. Meanwhile, $\tilde{x}_s$ represents the modeling residual, and thus the Q-statistic is suitable for monitoring it. These scores and residuals can be calculated as follows:
$$
\begin{aligned}
v(k) &= t(k) - \sum_{i=1}^{p} A_i t(k - i) \in R^{A} \\
v_d(k) &= t_d(k) - \sum_{i=1}^{r} D_i t_d(k - i) \in R^{A_d} \\
t_s(k) &= P_s^T (I - P_d P_d^T)(I - P R^T) x(k) \in R^{A_s} \\
\tilde{x}_s(k) &= (I - P_s P_s^T)(I - P_d P_d^T)(I - P R^T) x(k) \in R^{m}
\end{aligned} \tag{31}
$$
where
$$
\begin{aligned}
t(k) &= R^T x(k) \in R^{A} \\
t_d(k) &= P_d^T (I - P R^T) x(k) \in R^{A_d}.
\end{aligned} \tag{32}
$$
Fault detection indices and the respective control limits are listed in Table I. Consistent with the interpretation of the different subspaces in the dynamic T-PLS model, $T_y^2$ is used to detect quality-related abnormal situations, which may be caused by a break in Y-related dynamic and static variations. $T_d^2$ and $T_s^2$ are used to detect quality-unrelated abnormal situations occurring in dynamic and static variations, respectively. $Q_s$ is used to detect abnormal situations that happen in the residual, which may or may not affect quality data.

TABLE I
MONITORING STATISTICS AND CONTROL LIMITS
| Statistic | Calculation | Control limit |
|---|---|---|
| $T_y^2(k)$ | $v(k)^T \Lambda_y^{-1} v(k)$ | $\frac{A(n^2 - 1)}{n(n - A)} F_{A, n-A, \alpha}$ |
| $T_d^2(k)$ | $v_d(k)^T \Lambda_d^{-1} v_d(k)$ | $\frac{A_d(n^2 - 1)}{n(n - A_d)} F_{A_d, n-A_d, \alpha}$ |
| $T_s^2(k)$ | $t_s(k)^T \Lambda_s^{-1} t_s(k)$ | $\frac{A_s(n^2 - 1)}{n(n - A_s)} F_{A_s, n-A_s, \alpha}$ |
| $Q_s(k)$ | $\|\tilde{x}_s(k)\|^2$ | $g \chi^2_{h, \alpha}$ |
For the D-statistics, $\Lambda_y = \mathrm{cov}(v)$ is the covariance matrix of v(k), which is estimated from the training samples. For the Q-statistic, $g = S/2\mu$ and $h = 2\mu^2/S$, where μ is the sample mean of $Q_s$, and S is the sample variance of $Q_s$.

V. SIMULATIONS AND CASE STUDIES
A. Illustrative Examples of Dynamic Modeling
In order to show the effectiveness of cross validation for determining the D-PLS model, different simulation cases are illustrated.
Case 1: The inputs and outputs are generated by a static model as follows:
$$
\begin{cases}
x_k = P t_k + e_k \\
y_k = C x_k + v_k
\end{cases} \tag{33}
$$
where
$$
P = \begin{pmatrix}
0.5586 & 0.2042 & 0.6370 \\
0.2007 & 0.0492 & 0.4429 \\
0.0874 & 0.6062 & 0.0664 \\
0.9332 & 0.5463 & 0.3743 \\
0.2594 & 0.0958 & 0.2491
\end{pmatrix}, \qquad
C = \begin{bmatrix}
0.9249 & 0.4350 \\
0.6295 & 0.9811 \\
0.8783 & 0.0960 \\
0.6417 & 0.5275 \\
0.7984 & 0.5456
\end{bmatrix}^T,
$$
$$
t_k = \begin{cases}
t_k^0 + [50, 0, 0]^T, & 250 > k \\
t_k^0 + [0, 25, 0]^T, & 500 > k \ge 250 \\
t_k^0 + [0, 0, 5 \sin(0.1k)]^T, & 750 > k \ge 500 \\
t_k^0, & 1000 \ge k \ge 750
\end{cases}
$$
$e_k \sim N(0, 0.1^2 I_5)$, $v_k \sim N(0, 0.1^2 I_2)$, $t_k^0 \sim N(0, 2^2 I_3)$. One thousand samples are separated into 10 blocks, and a two-parameter cross-validation procedure is used for determining the D-PLS model. The range of the number of principal components A is selected as 1–5 and the range of the number of lags q as 1–5. The optimal parameters determined by cross validation are A = 3, q = 1. This result means that the optimal model contains only current variables and reduces to the static case, which is consistent with the process description.
Case 2: The inputs are generated by a dynamic model, and the outputs are generated by a static model as follows:
$$
\begin{cases}
t_k = \alpha_1 t_{k-1} - \alpha_2 t_{k-2} + t_k^* \\
x_k = P t_k + e_k \\
y_k = C x_k + v_k
\end{cases} \tag{34}
$$
where
$$
\alpha_1 = \begin{bmatrix}
0.4389 & 0.1210 & -0.0862 \\
-0.2966 & -0.0550 & 0.2274 \\
0.4538 & -0.6573 & 0.4239
\end{bmatrix}, \qquad
\alpha_2 = \begin{bmatrix}
-0.2998 & -0.1905 & -0.2669 \\
-0.0204 & -0.1585 & -0.2950 \\
0.1461 & -0.0755 & 0.3749
\end{bmatrix},
$$
$$
t_k^* = \begin{cases}
t_k^0 + [10, 10, 10]^T, & 250 > k \\
t_k^0 - [5, 5, 5]^T, & 500 > k \ge 250 \\
t_k^0 + [1, 1, 1]^T \sin(0.1k), & 750 > k \ge 500 \\
t_k^0, & 1000 \ge k \ge 750
\end{cases}
$$
$t_k^0 \sim N(0, I_3)$, and the other settings are the same as in the above system. Using a similar procedure, the optimal parameters are selected as A = 2, q = 1. The result shows that the relation between x and y is a static one, but the auto-correlation of x is a dynamic one, which is captured by the DLV modeling.
Case 3: The inputs are generated by a static model, but the outputs are generated by a dynamic model as follows:
$$
\begin{cases}
x_k = P t_k^* + e_k \\
y_k = C x_k + C_2 x_{k-1} + \beta_1 y_{k-1} + \beta_2 y_{k-2} + v_k
\end{cases} \tag{35}
$$
where
$$
C_2 = \begin{bmatrix}
1.7198 & -0.3715 \\
0.5835 & 1.5011 \\
1.4236 & 1.3226 \\
0.4963 & -1.4145 \\
-2.5717 & 1.0696
\end{bmatrix}^T, \qquad
\beta_1 = \begin{bmatrix}
0.2485 & 0.1552 \\
-0.4856 & -0.3011
\end{bmatrix}, \qquad
\beta_2 = \begin{bmatrix}
-0.3042 & -0.3009 \\
0.3265 & 0.4914
\end{bmatrix}.
$$
The other settings are the same as in the above system. Using a similar procedure, the optimal parameters are selected as A = 3, q = 2. The result shows that the relation between x and y is a dynamic one, and the number of time lags is 2, which is consistent with the system.
Case 4: The inputs and outputs are generated by a dynamic model as follows:
$$
\begin{cases}
t_k = \alpha_1 t_{k-1} - \alpha_2 t_{k-2} + t_k^* \\
x_k = P t_k + e_k \\
y_k = C x_k + C_2 x_{k-1} + \beta_1 y_{k-1} + \beta_2 y_{k-2} + v_k
\end{cases} \tag{36}
$$
where all the parameters and conditions are the same as in the above cases; in particular, $t_k^*$ is the same as in Case 2. A similar procedure is performed for model structure determination. The selected parameters are A = 3, q = 3, as shown in Fig. 1, which indicates a strong dynamic property of X and Y. Fig. 2 shows the prediction of y with the trained D-PLS model, which reflects that the identified D-PLS model gives a good explanation of the outputs by measurements and historical outputs.

Fig. 1. Cross-validation result for Case 4 (A = 3, q = 3). [Plot of PRESS versus principal number and lagged number.]
Fig. 2. Output (solid line) and predicted output with dynamic model (dashed line). [Panels: y1, y2 versus sample index.]
Fig. 3. Fault 1: Output data (solid line) and prediction of output (dashed line). [Panels: y1, y2 versus sample index.]
Fig. 4. Fault 1: Fault detection by different indices. [Panels: $T_y^2$, $T_s^2$, $Q_s$ versus sample index.]

B. Simulations on Quality-Related Monitoring
In practice, the process often operates around a designed point. Thus, the measurement and quality data under normal conditions are approximately stationary. In this situation, a dynamic T-PLS model can be derived for quality-related process monitoring. This subsection makes use of the process model (36) in Case 4, where $t_k^* = t_k^0$, $t_k^0 \sim N(0, 4 I_3)$, and all other parameters are the same as in the above subsection. One thousand stationary samples are collected under this normal situation. The proper structure parameters of the D-PLS are A = 3, q = 4. Then, a vector
autoregressive model is built for the D-PLS scores $t_k$. The order of the model is selected as p = 2 with AIC. Then, DLV modeling is performed on the quality-unrelated residual E. According to auto-correlation analysis, there is no dynamic variation in the residual E, i.e., $A_d = 0$. PCA is then performed on E to separate the static variation with $A_s = 1$ from the real residual $E_s$.
When a fault occurs in the DLVs as $t_k^* = t_k^* + [0, 0, 10]^T$, $k > 200$, the output y is obviously affected (see Fig. 3). The fault is detected by $T_y^2$, which indicates a quality-related fault. The other indices are not affected by this fault, as shown in Fig. 4.
When a fault occurs in the quality-unrelated part of the process with $k > 200$
$$
x_k = x_k + [0.0054, 0.3145, -0.0432, 0.7516, -0.4440]^T
$$
the output y is not affected (see Fig. 5). However, this kind of fault can be detected by $T_s^2$, which indicates a quality-unrelated fault. The other indices are not affected by this kind of fault, as shown in Fig. 6.

Fig. 5. Fault 2: Output data (solid line) and prediction of output (dashed line). [Panels: y1, y2 versus sample index.]
Fig. 6. Fault 2: Fault detection by different indices. [Panels: $T_y^2$, $T_s^2$, $Q_s$ versus sample index.]
Fig. 7. Fault 3: Output data (solid line) and prediction of output (dashed line). [Panels: y1, y2 versus sample index.]
Fig. 8. Fault 3: Fault detection by different indices. [Panels: $T_y^2$, $T_s^2$, $Q_s$ versus sample index.]
When a fault occurs in the residual part of the process with $k > 200$
$$
x_k = x_k + [-5.5131, 0.3612, 0.0664, -0.7349, 2.3187]^T
$$
the prediction error of the output grows significantly (see Fig. 7). This fault can also affect quality under closed-loop control of the output. Thus, it is also necessary to monitor the residual part by $Q_s$. This fault is detectable only with $Q_s$, and undetectable with the other indices, as shown in Fig. 8.
C. Case Study on the Tennessee Eastman Process (TEP)
In this subsection, the effectiveness of the proposed methods on multidimensional faults is investigated by application to the TEP. TEP was created to provide a benchmark industrial process for evaluating process control and monitoring approaches, including PCA, partial least squares, and Fisher discriminant analysis [16], [26], [27]. TEP contains two blocks of variables: 12 manipulated variables and 41 measured variables. Process variables are measured with an interval of 3 min. In this paper, 22 process measurements and 11 manipulated variables, i.e., XMEAS (1–22) and XMV (1–11), are chosen as X. The component of E in stream 11, i.e., XMEAS 38, is chosen as y. In order to reflect the dynamic property, the one-step lagged value of y is included in the measurement block. About 480 samples are first centered to zero mean and scaled to unit variance. Then, these normal samples are used to perform the cross validation to determine the parameters. The final model structure parameters are A = 5, q = 1, and p = 3. Following that, a DLV model with $A_d = 8$, $p_2 = 2$, and $A_s = 11$ is determined by the DLV modeling algorithm. The D-PLS modeling result for y is shown in Fig. 9, which gives better prediction accuracy than the PLS model (A = 6, Fig. 10). The prediction error and its autocorrelation coefficient with D-PLS and PLS are shown in Figs. 11 and 12, respectively. It is observed that the D-PLS algorithm can interpret most of the information in the Y data.

TABLE II
CUMULATIVE % SUM OF SQUARES EXPLAINED BY THE T-PLS AND DYNAMIC T-PLS MODELS
| T-PLS model | $X_y$ | $X_o$ | $X_r$ | $E_r$ |
|---|---|---|---|---|
| X | 6.84% | 36.03% | 56.89% | 0.24% |
| y | 31.41% | | | |

| Dynamic T-PLS model | $X_y$ | $X_d$ | $X_s$ | $E_r$ |
|---|---|---|---|---|
| X | 38.59% | 30.40% | 30.25% | 0.76% |
| y | 99.93% | | | |

Fig. 9. D-PLS prediction for TEP. Output data (solid line) and prediction of output (dashed line). [Output versus sample index.]
Fig. 10. PLS prediction for TEP. Output data (solid line) and prediction of output (dashed line). [Output versus sample index.]
Fig. 11. Prediction error and autocorrelation coefficient for D-PLS. [Panels: prediction error versus sample index; autocorrelation coefficient versus lags.]
Fig. 12. Prediction error and autocorrelation coefficient for PLS. [Panels: prediction error versus sample index; autocorrelation coefficient versus lags.]
Dynamic T-PLS separates the X-space into four parts. Table II shows how much of the variability in the measurement block and in the quality block each part of X can interpret. In the results, the dynamic T-PLS model uses 38.59% of the X variation to interpret 99.93% of the variation in y, while the T-PLS model uses 6.84% of the X variation to interpret 31.41% of the variation in y. The result shows that dynamic T-PLS gives a large improvement over the T-PLS model in this case. Moreover, dynamic T-PLS keeps the ability to monitor faults in different subspaces. Fig. 13 shows the detection results of fault 1 of TEP with different indices, each of which carries a different meaning for process monitoring. There are 15 known faults in TEP. However, only some of them can be regarded as quality-related faults according to the criterion proposed by Zhou et al. [6], namely faults 1, 2, 5, 6, 8, 10, 12, and 13. As mentioned previously, only $T_y^2$ and $Q_s$ in Table I are used to detect quality-related faults. Table III lists the detection rates for these quality-related faults based on the dynamic T-PLS and T-PLS models. The results show that dynamic T-PLS modeling improves the detection rate for most of these quality-related faults compared to T-PLS modeling. When a fault is detected by one of these statistics, a diagnosis tool similar to that in [8] can be used for isolating and identifying the fault, which will be our future work.

Fig. 13. Fault detection in different subspaces with different indices. [Panels: $T_y^2$, $T_d^2$, $T_s^2$, and $Q_s$ versus sample index.]

TABLE III
FAULT DETECTION RATE FOR TEP USING T-PLS AND DYNAMIC T-PLS (%)
(Process faults with known cause)

| Fault ID | Fault description | Type | T-PLS | Dynamic T-PLS |
|---|---|---|---|---|
| Fault 1 | A/C feed ratio, B composition constant (Stream 4) | Step | 99.50 | 99.63 |
| Fault 2 | B composition, A/C ratio constant (Stream 4) | Step | 95.25 | 96.88 |
| Fault 5 | Condenser cooling water inlet temperature | Step | 99.75 | 99.88 |
| Fault 6 | A feed loss (Stream 1) | Step | 99.75 | 99.88 |
| Fault 8 | A, B, C feed composition (Stream 4) | Random variation | 93.25 | 97.50 |
| Fault 10 | C feed temperature | Random variation | 88.25 | 87.38 |
| Fault 12 | Condenser cooling water inlet temperature | Random variation | 95.0 | 98.88 |
| Fault 13 | Reaction kinetics | Slow drift | 96.1 | 95.00 |

VI. CONCLUSION

In this paper, a new D-PLS algorithm was proposed to extract the dynamic quality-related variation in input data. By using 2-D cross validation, the time lag and the number of dynamic factors were determined. To monitor abnormal situations in dynamic processes, a VAR model with/without external source inputs could be used for stationary/non-stationary dynamic processes. Then, DLV modeling was adopted to further decompose the residual part. Consequently, a dynamic T-PLS model was constructed to separate the whole measurement block into four subspaces. The monitoring of dynamic variations was performed on the basis of the different subspaces. Several illustrative examples were used to show the good dynamic modeling ability of the proposed D-PLS model. The results show that the proposed 2-D cross validation can deal with different cases effectively. A numerical example was given to illustrate process monitoring results in the different subspaces. The TEP case study showed that dynamic T-PLS models achieve better modeling accuracy than static T-PLS models. For quality-related modeling and monitoring of dynamic processes, dynamic T-PLS is seen to be better than static T-PLS.

APPENDIX
NIPALS ALGORITHM [5]
Set $i = 1$ and $\mathbf{X}_1 = \mathbf{X}$. The number of PLS components $A$ is determined by cross-validation.
1) Set $\mathbf{u}_i$ to any column of $\mathbf{Y}$.
2) $\mathbf{w}_i = \mathbf{X}_i^T \mathbf{u}_i / \|\mathbf{X}_i^T \mathbf{u}_i\|$.
3) $\mathbf{t}_i = \mathbf{X}_i \mathbf{w}_i$.
4) $\mathbf{q}_i = \mathbf{Y}^T \mathbf{t}_i / (\mathbf{t}_i^T \mathbf{t}_i)$.
5) $\mathbf{u}_i = \mathbf{Y} \mathbf{q}_i$. If $\mathbf{t}_i$ converges, go to step 6; otherwise return to step 2.
6) $\mathbf{p}_i = \mathbf{X}_i^T \mathbf{t}_i / (\mathbf{t}_i^T \mathbf{t}_i)$.
7) $\mathbf{X}_{i+1} = \mathbf{X}_i - \mathbf{t}_i \mathbf{p}_i^T$. Set $i = i + 1$ and return to step 1. Terminate if $i > A$.
Let $\mathbf{T} = [\mathbf{t}_1, \ldots, \mathbf{t}_A]$, $\mathbf{P} = [\mathbf{p}_1, \ldots, \mathbf{p}_A]$, and $\mathbf{Q} = [\mathbf{q}_1, \ldots, \mathbf{q}_A]$. Then (1) is obtained.

REFERENCES
[1] P. Geladi and B. R. Kowalski, “Partial least-squares regression: A tutorial,” Anal. Chim. Acta, vol. 185, no. 1, pp. 1–17, 1986.
[2] A. Höskuldsson, “PLS regression methods,” J. Chemometr., vol. 2, no. 3, pp. 211–228, Jun. 1988.
[3] B. M. Wise, N. L. Ricker, D. F. Veltkamp, and B. R. Kowalski, “A theoretical basis for the use of principal component models for monitoring multivariate processes,” Process Control Qual., vol. 1, no. 1, pp. 41–51, 1990.
[4] J. V. Kresta, J. F. MacGregor, and T. E. Marlin, “Multivariate statistical monitoring of process operating performance,” Canad. J. Chem. Eng., vol. 69, no. 1, pp. 35–47, 1991.
[5] J. F. MacGregor, C. Jaeckle, C. Kiparissides, and M. Koutoudi, “Process monitoring and diagnosis by multiblock PLS methods,” AIChE J., vol. 40, no. 5, pp. 826–838, 1994.
[6] D. Zhou, G. Li, and S. J. Qin, “Total projection to latent structures for process monitoring,” AIChE J., vol. 56, no. 1, pp. 168–178, 2010.
[7] G. Li, S. J. Qin, and D. Zhou, “Output relevant fault reconstruction and fault subspace extraction in total projection to latent structures models,” Ind. Eng. Chem. Res., vol. 49, no. 19, pp. 9175–9183, 2010.
[8] G. Li, C. Alcala, S. J. Qin, and D. H. Zhou, “Generalized reconstruction based contributions for output-relevant fault diagnosis with application to the Tennessee Eastman process,” IEEE Trans. Control Syst. Technol., vol. 19, no. 5, pp. 1114–1127, Sep. 2011.
[9] N. Ricker, “The use of biased least-squares estimators for parameters in discrete-time pulse-response models,” Ind. Eng. Chem. Res., vol. 27, no. 2, pp. 343–350, 1988.
[10] S. J. Qin and T. J. McAvoy, “Nonlinear FIR modeling via a neural net PLS approach,” Comput. Chem. Eng., vol. 20, no. 2, pp. 147–159, Feb. 1996.
[11] S. J. Qin and T. J. McAvoy, “A data-based process modeling approach and its applications,” in Proc. 3rd IFAC DYCORD Symp., College Park, MD, Apr. 1992, pp. 321–326.
[12] G. Baffi, E. Martin, and A. Morris, “Non-linear dynamic projection to latent structures modelling,” Chemometr. Intell. Lab. Syst., vol. 52, no. 1, pp. 5–22, 2000.
[13] G. Baffi, J. Morris, and E. Martin, “Non-linear model based predictive control through dynamic non-linear partial least squares,” Chem. Eng. Res. Des., vol. 80, no. 1, pp. 75–86, 2002.
[14] J. Chen and K. Liu, “On-line batch process monitoring using dynamic PCA and dynamic PLS models,” Chem. Eng. Sci., vol. 57, no. 1, pp. 63–75, 2002.
[15] G. Lee, S. O. Song, and E. S. Yoon, “Multiple-fault diagnosis based on system decomposition and dynamic PLS,” Ind. Eng. Chem. Res., vol. 42, no. 24, pp. 6145–6154, 2003.
[16] G. Lee, C. H. Han, and E. S. Yoon, “Multiple-fault diagnosis of the Tennessee Eastman process based on system decomposition and dynamic PLS,” Ind. Eng. Chem. Res., vol. 43, no. 25, pp. 8037–8048, 2004.
[17] N. Fletcher, A. Morris, G. Montague, and E. Martin, “Local dynamic partial least squares approaches for the modelling of batch processes,” Canad. J. Chem. Eng., vol. 86, no. 5, pp. 960–970, 2008.
[18] T. Komulainen, M. Sourander, and S. Jämsä-Jounela, “An online application of dynamic PLS to a dearomatization process,” Comput. Chem. Eng., vol. 28, no. 12, pp. 2611–2619, 2004.
[19] M. Kano, K. Miyazaki, S. Hasebe, and I. Hashimoto, “Inferential control system of distillation compositions using dynamic partial least squares regression,” J. Process Control, vol. 10, nos. 2–3, pp. 157–166, 2000.
[20] R. Shi and J. MacGregor, “Modeling of dynamic systems using latent variable and subspace methods,” J. Chemometr., vol. 14, nos. 5–6, pp. 423–439, 2000.
[21] M. Kaspar and W. H. Ray, “Dynamic PLS modelling for process control,” Chem. Eng. Sci., vol. 48, no. 20, pp. 3447–3461, 1993.
[22] S. Lakshminarayanan, S. Shah, and K. Nandakumar, “Modeling and control of multivariable processes: Dynamic PLS approach,” AIChE J., vol. 43, no. 9, pp. 2307–2322, 1997.
[23] L. Ljung and E. Ljung, System Identification: Theory for the User. Upper Saddle River, NJ: Prentice-Hall, 1999.
[24] S. De Waele and P. Broersen, “Order selection for vector autoregressive models,” IEEE Trans. Signal Process., vol. 51, no. 2, pp. 427–433, Feb. 2003.
[25] G. Li, B. S. Liu, S. J. Qin, and D. H. Zhou, “Dynamic latent variable modeling for statistical process monitoring,” in Proc. IFAC World Congr., Milano, Italy, Aug.–Sep. 2011, pp. 12886–12891.
[26] L. H. Chiang, E. Russell, and R. D. Braatz, Fault Detection and Diagnosis in Industrial Systems. New York: Springer-Verlag, 2001.
[27] L. H. Chiang, E. L. Russell, and R. D. Braatz, “Fault diagnosis in chemical processes using Fisher discriminant analysis, discriminant partial least squares, and principal component analysis,” Chemometr. Intell. Lab. Syst., vol. 50, no. 2, pp. 243–252, 2000.
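As a supplement to the NIPALS steps listed in the Appendix, the following sketch follows those seven steps with NumPy. It is an illustration under stated assumptions rather than the authors' implementation: the function name `nipals_pls`, the convergence tolerance, and the iteration cap are hypothetical choices, and the first column of Y is used as the starting vector in step 1.

```python
import numpy as np


def nipals_pls(X, Y, n_components, tol=1e-10, max_iter=500):
    """Return scores T, X-loadings P, and Y-loadings Q for A components.

    X is deflated after every component; Y is left unchanged, as in the
    steps of the Appendix. tol and max_iter are illustrative defaults.
    """
    X_i = np.array(X, dtype=float)
    Y = np.array(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]

    T, P, Q = [], [], []
    for _ in range(n_components):
        u = Y[:, 0].copy()                 # step 1: start from a column of Y
        t_old = np.zeros(X_i.shape[0])
        for _ in range(max_iter):
            w = X_i.T @ u
            w /= np.linalg.norm(w)         # step 2: unit-norm weight vector
            t = X_i @ w                    # step 3: score vector
            q = Y.T @ t / (t @ t)          # step 4: Y-loading
            u = Y @ q                      # step 5: update u
            if np.linalg.norm(t - t_old) <= tol * np.linalg.norm(t):
                break                      # t has converged
            t_old = t
        p = X_i.T @ t / (t @ t)            # step 6: X-loading
        X_i = X_i - np.outer(t, p)         # step 7: deflate X
        T.append(t)
        P.append(p)
        Q.append(q)

    return np.column_stack(T), np.column_stack(P), np.column_stack(Q)


# Usage on placeholder data (not TEP measurements).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 6))
Y_demo = rng.normal(size=(50, 2))
T_demo, P_demo, Q_demo = nipals_pls(X_demo, Y_demo, n_components=3)
```

Because X is deflated by each score vector in step 7, the columns of T are mutually orthogonal, which gives a simple check on an implementation.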
Gang Li received the B.E. degree from the Department of Precision Instruments and Mechanology, Tsinghua University, Beijing, China, and the M.Sc. and Ph.D. degrees from the Department of Automation, Tsinghua University, in 2004, 2007, and 2011, respectively. He is currently an Assistant Engineer with the Army. He has published over ten peer-reviewed papers in international journals and one monograph in the area of process monitoring. His current research interests include statistical process monitoring, dynamic process modeling, data-driven fault diagnosis, and prognosis.
S. Joe Qin (F’01) received the B.S. and M.S. degrees in automatic control from Tsinghua University, Beijing, China, in 1984 and 1987, respectively, and the Ph.D. degree in chemical engineering from the University of Maryland, College Park, in 1992. He is the Fluor Professor of process engineering and the Vice Dean at the Viterbi School of Engineering, University of Southern California, Los Angeles. He is the Co-Director of the Texas–Wisconsin–California Control Consortium, Los Angeles, where he has been a Principal Investigator for 16 years. His current research interests include statistical process monitoring and fault diagnosis, model predictive controls, system identification, run-to-run controls, semiconductor process controls, and control performance monitoring. Prof. Qin is a recipient of the National Science Foundation CAREER Award, the Northrop Grumman Best Teaching Award at the Viterbi School of Engineering in 2011, the DuPont Young Professor Award, the Halliburton/Brown & Root Young Faculty Excellence Award, a NSF-China Outstanding Young Investigator Award, the Chang Jiang Professor Award by the Ministry of Education, China, from 2007 to 2010, and the IFAC Best Paper Prize for the Model Predictive Control Survey Paper published in Control Engineering Practice. He is currently an Associate Editor of the Journal of Process Control, the IEEE CONTROL SYSTEMS MAGAZINE, and the IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, and an Editorial Board Member of the Journal of Chemometrics. He served as an Editor for Control Engineering Practice and an Associate Editor for the IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY.
Donghua Zhou (SM’02) received the B.E., M.S., and Ph.D. degrees from the Department of Electrical Engineering, Shanghai Jiaotong University, Shanghai, China, in 1985, 1988, and 1990, respectively. He was an Alexander von Humboldt Research Fellow with the University of Duisburg, Duisburg, Germany, from 1995 to 1996, and a Visiting Scholar with Yale University, New Haven, CT, from July 2001 to January 2002. He is currently a Professor and the Head of the Department of Automation, Tsinghua University, Beijing, China. He has published over 90 peer-reviewed papers in international journals and four monographs. His current research interests include process identification, fault diagnosis and fault-tolerant controls, reliability prediction, and predictive maintenance. Prof. Zhou is the IFAC Technical Committee Member on Fault Diagnosis and Safety of Technical Processes, an Associate Editor of the Journal of Process Control, the Deputy General Secretary of the Chinese Association of Automation (CAA), and a Council Member of CAA. He was the NOC Chair of the 6th IFAC Symposium on SAFEPROCESS in 2006.
Baosheng Liu received the Bachelor's and Ph.D. degrees from the Department of Automation, Tsinghua University, Beijing, China, in 1986 and 2007, respectively. He has been a Communication Member at the Navigation and Command Automation Institute, Beijing, since July 1986. He has published over 15 research papers. His current research interests include aerial navigation and air traffic control.