
A FAULT TOLERANT APPROACH FOR NETWORKED CONTROL SYSTEMS USING PREDICTIVE MODELS

L. Brito Palma, F. Vieira Coito, R. Neves da Silva, P. Gil

Universidade Nova de Lisboa – FCT – DEE, 2829-516 Monte de Caparica, Portugal
{LBP, FJVC, RNS, PSG}@fct.unl.pt

Abstract: In this paper, an on-line fault tolerant approach for networked control systems is proposed. The fault tolerant approach incorporates a fault detection and diagnosis methodology based on on-line identification of an ARX model and on neural nonlinear principal components analysis, as well as on neural nonlinear discriminant analysis. When a fault is correctly diagnosed, the supervision system reconfigures the closed-loop system using a predictive model acting as a virtual sensor. The fault tolerant approach has been tested successfully on an analogue simulator (HW123 hardware setup) under closed-loop control. © Controlo 2008.

Keywords: fault tolerant systems, fault detection and diagnosis, networked control systems, virtual sensors.

1. INTRODUCTION

In the last few decades, the complexity of technical processes has increased, mainly due to the requirements of productivity and product quality. Early detection and diagnosis of faults can help to avoid material damage, breakdown, shutdown, and even human fatalities. The operation of technical processes requires advanced supervision and fault tolerant systems to increase safety, reliability and economy (Isermann, 2006).

A major trend in modern industrial and commercial systems is to integrate computing, communication, and control into different levels of machine operations and information processes (Huo, et al., 2004). The traditional solution for exchanging control signals and information is point-to-point communication using wires to connect controllers with each sensor and actuator. This type of point-to-point wiring is expensive, and the whole system can be difficult to maintain and diagnose due to the large number of cables. The development of network technologies has enabled distributed networking applications in factory, home and automotive areas, and the emergence of network standards such as PROFIBUS, WorldFIP and CAN, to name just a few.

In manufacturing plants, vehicles and other plants, serial communication networks are employed to exchange control signals and information between spatially distributed system components such as supervisory computers, controllers and intelligent I/O devices (smart sensors and actuators). Each system component connected directly to the network is denoted as a node. Feedback control systems in which the control loops are closed through a real-time network (wireline or wireless) are called Networked Control Systems (Huo, et al., 2004; Zhang, et al., 2001). The defining feature of Networked Control Systems (NCS) is that information (reference input, control input, plant output, etc.) is exchanged over a network among the control system components (controller, sensors, actuators, etc.). Fig. 1 depicts a typical architecture of an NCS.

Fig. 1. A typical NCS architecture (blocks: physical plant, actuators #1 to #m, sensors #1 to #n, control network, controller, clock).

Compared with conventional point-to-point interconnected control systems, the main advantages of an NCS are modular and flexible system design, distributed processing and interoperability, fast implementation, and ease of system diagnosis and maintenance (Huo, et al., 2004). The insertion of the communication network in the feedback control loop makes the analysis and design of an NCS complex (Zhang, et al., 2001). Conventional control theories with many ideal assumptions, such as synchronized control and non-delayed sensing and actuation, must be re-evaluated before they can be applied to NCSs. The induced delays can degrade the closed-loop performance of the control system and even destabilize it. Research on NCSs differs from that on traditional time-delay systems, where the time delays are assumed to be constant and bounded. Because network-induced time delays are variable, an NCS may be a time-varying system, which makes analysis and design more challenging.

In most practical problems of fault detection and diagnosis, a combined (hybrid) approach is required to guarantee a reasonable performance. Nowadays, almost all complex systems incorporate basic fault detection modules. Fault tolerant systems must guarantee that faults do not cause drastic failures. Advanced methods of supervision, fault detection and diagnosis, and fault tolerant control are needed (Isermann, 2006; Brito Palma, 2007).

The paper is organized as follows. Section 2 describes the problem under investigation. Section 3 presents the proposed fault detection and diagnosis approach, and section 4 explains the fault tolerant approach. Experimental results are presented next, followed by the conclusions.

2. PROBLEM DESCRIPTION

The fault tolerant problem under investigation, in a networked control system, is described here. In the implementation of control loops over networked control systems, two main problems can be identified: time delays and data consistency. This work deals with the problem of variable time delays.

Fig. 3. Architecture of the closed-loop control system (blocks: digital computer, D/A, PLANT, A/D, LAN, Fs; signals r(k), u(k), y(k)).

Typically, there are three kinds of delays in networked control systems (Huo, et al., 2004). Fig. 2 depicts them: the communication delay between the sensor and the controller, τsc; the computational delay in the controller, τc; and the communication delay between the controller and the actuator, τca. The architecture of the closed-loop networked control system under investigation is depicted in Fig. 3. In this work, only the sensor-controller delay τsc is considered, but the proposed fault tolerant approach can be extended to deal with the other typical delays in NCSs.

The supervision and control algorithms are implemented on the digital computer (PC), and the interface between the PC and the plant is a data acquisition board (National Instruments USB-6009), represented by the blocks A/D and D/A. Here the plant incorporates the process, the actuator and the sensor. The block LAN corresponds to a local area network with induced variable delays, and Fs is a digital low-pass filter incorporated to reduce the system bandwidth. It is more natural to analyze an NCS in the discrete time domain, since physical signals (from sensors or to actuators) are sampled and then transmitted over the network.

The plant used for the experiments is an analogue simulator (HW123 hardware setup) built in our control laboratory. The HW123 setup implements analogue filters (first-order, second-order and third-order). In this work, a third-order plant (filter) with the transfer function (1) was used.

$$G(s) = \frac{5.61}{s^{3} + 3.65\,s^{2} + 8.25\,s + 5.24} \qquad (1)$$
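As a quick numerical check of equation (1), the following sketch computes the poles and the static gain of the plant. It assumes only that NumPy is available; the variable names are illustrative and do not come from the original implementation.

```python
import numpy as np

# Coefficients of G(s) = 5.61 / (s^3 + 3.65 s^2 + 8.25 s + 5.24), equation (1)
NUM = [5.61]
DEN = [1.0, 3.65, 8.25, 5.24]

poles = np.roots(DEN)                     # roots of the denominator polynomial
dc_gain = NUM[-1] / DEN[-1]               # G(0), the static gain of the plant
is_stable = bool(np.all(poles.real < 0))  # all poles in the open left half-plane

print("poles:", poles)
print("static gain:", dc_gain)
print("stable:", is_stable)
```

The static gain is simply 5.61/5.24, i.e. slightly above one, and all three poles lie in the left half-plane, so the open-loop plant is stable.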

Fig. 2. Networked control system with induced delays (blocks: plant comprising actuator, process and sensor; network; controller; delays: controller-actuator delay τca, sensor-controller delay τsc, controller delay τc).

The faults considered are associated with different types of sensor-controller delays, τsc. The nominal operation (fault-free situation) is termed as fault F0, while the other faults are termed as F1, etc. The delay associated with each fault is presented in Table 1. The signal ξ(t) represents a typical variable time delay in local area networks, which depends on the LAN traffic; in our experiments, this signal has a mean value around 0.09 s. Ts is the sampling period. For the fault F3, p = 5 × Ts, q = 3 × Ts and rand(.) is the Matlab function that generates uniformly distributed random numbers.

Table 1. Faults and LAN delays.

Fault    LAN delay [s]
F0       0
F1       ξ(t) + 2 × Ts
F2       ξ(t) + 4 × Ts
F3       ξ(t) + (p + q × rand(.))
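The delays of Table 1 can be written as a small function. This is a minimal sketch, not the authors' Matlab code: the function name and arguments are hypothetical, and Python's `random.random()` stands in for Matlab's rand(.).

```python
import random

def lan_delay(fault, xi, ts, rng=random):
    """Sensor-controller delay [s] for a fault label of Table 1.

    fault : one of "F0", "F1", "F2", "F3"
    xi    : current value of the variable LAN delay xi(t) [s]
    ts    : sampling period Ts [s]
    rng   : object with a random() method returning a uniform sample
            in [0, 1); stands in for Matlab's rand(.)
    """
    if fault == "F0":
        return 0.0                          # nominal operation
    if fault == "F1":
        return xi + 2 * ts
    if fault == "F2":
        return xi + 4 * ts
    if fault == "F3":
        p, q = 5 * ts, 3 * ts               # values given in the text for F3
        return xi + (p + q * rng.random())
    raise ValueError(f"unknown fault label: {fault}")

# Example with the mean LAN delay (0.09 s) and the sampling period (0.22 s)
# reported in the paper.
for label in ("F0", "F1", "F2", "F3"):
    print(label, lan_delay(label, xi=0.09, ts=0.22))
```

With these values, F1 and F2 give fixed offsets of two and four sampling periods on top of ξ(t), while F3 varies between five and eight sampling periods on top of ξ(t).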


For the experiments, a local area network (LAN) was used. The communication was done using the Matlab TCP/IP routines. In Table 1, the signal ξ(t) (variable time delay) was obtained by computing the time spent writing data to and reading data from a TCP/IP port, assuming a localhost configuration. The controller runs on the computer and the plant is connected via a USB data acquisition board.


3. FAULT DETECTION AND DIAGNOSIS APPROACH

In recent years, most of the research effort has gone into analytical approaches based on quantitative models, such as parity equations (Gertler, 1998), observers (Patton, et al., 2000a), and parameter estimation (Isermann, 2006). Most of the FDD methodologies have been tested mainly on linear systems.

Fig. 4. Fault tolerant control architecture (blocks: human interface, supervision, FDD, controller, LAN, PLANT; signals r(k), u(k), y(k)).


The proposed fault tolerant control architecture is depicted in Fig. 4. A SISO system is considered, without loss of generality. The thick lines represent signal flow, and the thin lines represent adaptation (tuning, scheduling or reconfiguration). The supervision system plays a crucial role in FTC applications (Patton, 1997; Cardoso, 2006). The supervisor must take decisions about adaptation when faults occur, in order to maintain the desired system performance and preserve the stability of the overall system. In the most critical situations, the final decisions are taken by humans. In some non-severe faulty cases, the supervisor only needs to re-tune the controller. When a severe structural fault occurs, the supervisor usually needs to change the control strategy by using other sensors or actuators, re-tuning the controllers and changing set-points. The proposed fault tolerant approach is based on the following ideas: detect and diagnose faults in closed-loop networked control systems, and reconfigure the system using a virtual sensor (based on a predictive neural model).

Fig. 5. Fault detection and diagnosis architecture (fault detection blocks: SW-PCR, neural NLPCA, neural NLDA, alarm generator with threshold hd, low-pass filter and threshold hf; fault diagnosis blocks: knowledge-based system for fault classification, low-pass filter; signals {u(k), y(k), r(k)}, θ(k), ta(k), q(k), p(k), am(k), fd(k), fi0(k), fi(k)).

Analytical approaches require an accurate model, so any modelling error affects the FDD performance. One way to circumvent this problem is to use grey-box or black-box models (adaptive ARX models, neural models, etc.). Approaches based on black-box models using computational intelligence techniques (fuzzy logic and neural networks) are gaining importance, mainly due to their ability to deal with nonlinear and time-varying systems and to their robustness to noise (Patton, et al., 2000b; Brito Palma, 2007). The approaches proposed here for FDD are based mainly on adaptive ARX models and neural networks. Fig. 5 depicts the overall architecture used for FDD (Brito Palma, 2007). The main idea is to detect and diagnose faults using symptoms based on deviations of the parameters of ARX models.

3.1 Fault Detection Approach

This fault detection approach assumes that the fault symptoms are related to deviations in the parameters of ARX models. Each time delay is considered a fault, since it causes deviations in the parameters of the ARX model. Using the input-output data vectors u(k), r(k) and y(k), the parameters of an adaptive ARX model can be estimated on-line. Here, an adaptive ARX model relating the process output y(k) to the reference signal r(k) is used. For parameter estimation, a Sliding Window Principal Components Regression (SW-PCR) algorithm based on Principal Components Analysis (PCA) was implemented (Brito Palma, 2007). Sliding window parameter estimation algorithms are appropriate for fault detection because, with a sliding data window of length τ, the transient following a parameter jump is known to last exactly τ - 1 samples (Gertler, 1998). Therefore, any isolation decision has to be delayed by τ - 1 samples following the detection of a change.
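The idea of sliding-window estimation can be sketched as follows. This is not the SW-PCR implementation of (Brito Palma, 2007), whose details are not given here; it is a generic principal components regression computed from the SVD of the regression matrix, without mean centering, applied window by window. The function names and the model convention y(k) + a1 y(k-1) + ... = b1 r(k-nk) + ... are assumptions made for illustration.

```python
import numpy as np

def arx_regressors(y, r, na=2, nb=1, nk=2):
    """Regression matrix and target for the ARX model
    y(k) = -a1*y(k-1) - ... - a_na*y(k-na) + b1*r(k-nk) + ... (assumed convention)."""
    y = np.asarray(y, dtype=float)
    r = np.asarray(r, dtype=float)
    start = max(na, nk + nb - 1)          # first k with all past samples available
    rows = []
    for k in range(start, len(y)):
        past_y = [-y[k - i] for i in range(1, na + 1)]
        past_r = [r[k - nk - j] for j in range(nb)]
        rows.append(past_y + past_r)
    return np.array(rows), y[start:]

def pcr_fit(phi, target, n_components):
    """Principal components regression: least squares restricted to the
    leading right singular directions of the regression matrix."""
    u, s, vt = np.linalg.svd(phi, full_matrices=False)
    k = n_components
    gamma = (u[:, :k].T @ target) / s[:k]  # regression on the retained components
    return vt[:k].T @ gamma                # back to the ARX parameter space

def sliding_window_pcr(y, r, window, n_components, na=2, nb=1, nk=2):
    """Parameter estimate theta(k) = [a1..a_na, b1..b_nb] for each window position."""
    phi, target = arx_regressors(y, r, na, nb, nk)
    thetas = [pcr_fit(phi[end - window:end], target[end - window:end], n_components)
              for end in range(window, len(target) + 1)]
    return np.array(thetas)
```

When all components are retained, the estimate coincides with ordinary least squares over the window; keeping fewer components trades bias for robustness to ill-conditioned data. A parameter jump fully leaves the window after `window - 1` samples, which matches the transient length quoted above.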

Fig. 6. Schematic diagram of an auto-associative neural network used in NLPCA (layers IL, ML, BL, DL, OL; activation functions σ and φ; function H).

Instead of using the parameters θ(k) of the ARX model directly for the generation of fault symptoms, a neural nonlinear (NL) PCA method is applied to θ(k), and the symptoms are based on deviations of the scores ta(k) and of the squared prediction error (SPE, q(k)). In processes where redundancy or correlation between variables exists, it is advantageous to reduce the number of variables while retaining most of the original information. Dimensionality reduction techniques, such as PCA, can greatly simplify and improve process monitoring procedures, since they project the data into a lower dimensional space that accurately characterizes the state of the process (Chiang, et al., 2001).

The principal curves method and Kramer's neural NLPCA method are the two main approaches for extending linear PCA to nonlinear systems (Harkat, 2003). Kramer's neural approach was used in this work and is briefly described next (Kramer, 1991). The schematic diagram of an auto-associative neural network used in NLPCA is depicted in Fig. 6. The auto-associative neural network is trained off-line with the input data equal to the output data. In this work, the data are the nominal parameters θ(k) of the ARX model. The neural network is used on-line to compute the scores and the SPE signal. The on-line scores ta(k) are obtained at the output of the bottleneck layer (BL). The SPE signal q(k) is given by equation (2), where e(k), given by equation (3), is the prediction error of the neural NLPCA model and x(k) is a row data vector.

$$q(k) = e(k)\, e(k)^{T} \qquad (2)$$

$$e(k) = x(k) - \hat{x}(k) \qquad (3)$$
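The on-line use of the auto-associative network, i.e. reading the scores at the bottleneck layer and computing the SPE of equations (2) and (3), can be sketched as below. The layer sizes, the tanh activation in the hidden layers, the linear bottleneck and output layers, and the random weights are all assumptions made for illustration; in the paper the network is trained off-line on the nominal ARX parameters.

```python
import numpy as np

def random_weights(n_in, n_hidden, n_bottleneck, seed=0):
    """Hypothetical untrained weights, used only to make the sketch runnable."""
    rng = np.random.default_rng(seed)
    sizes = [("mapping", n_in, n_hidden), ("bottleneck", n_hidden, n_bottleneck),
             ("demapping", n_bottleneck, n_hidden), ("output", n_hidden, n_in)]
    return {name: (rng.normal(scale=0.5, size=(a, b)), np.zeros(b))
            for name, a, b in sizes}

def nlpca_forward(x, weights):
    """Forward pass of a five-layer auto-associative network.

    x : row data vector x(k), shape (n,)
    Returns the bottleneck scores t_a(k), the reconstruction x_hat(k)
    and the SPE q(k).
    """
    w1, b1 = weights["mapping"]
    w2, b2 = weights["bottleneck"]
    w3, b3 = weights["demapping"]
    w4, b4 = weights["output"]
    h1 = np.tanh(x @ w1 + b1)      # first hidden layer (assumed tanh)
    t_a = h1 @ w2 + b2             # bottleneck layer output: scores t_a(k)
    h2 = np.tanh(t_a @ w3 + b3)    # second hidden layer (assumed tanh)
    x_hat = h2 @ w4 + b4           # output layer: reconstruction of x(k)
    e = x - x_hat                  # prediction error, equation (3)
    q = float(e @ e)               # SPE q(k) = e(k) e(k)^T, equation (2)
    return t_a, x_hat, q

# Example: three ARX parameters compressed to two scores.
weights = random_weights(n_in=3, n_hidden=5, n_bottleneck=2)
scores, x_hat, spe = nlpca_forward(np.array([-1.2, 0.5, 0.3]), weights)
print("scores:", scores, "SPE:", spe)
```

For a row vector e(k), the product e(k) e(k)^T is the sum of squared reconstruction errors, so q(k) is a non-negative scalar that grows when the data leave the region learned during training.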

A fault alarm signal am(k) is generated if the deviation from the nominal behaviour exceeds a certain threshold, i.e., if p1 < hd, where p1 is the element of the NLDA pattern vector associated with nominal operation; a typical value for hd is 0.9. To obtain a fault detection signal fd(k), the fault alarm signal am(k) is low-pass filtered. Finally, the low-pass filtered signal is compared with a threshold hf (a typical value is around 0.5).

$$f_d(k) = 1 \;\Leftarrow\; a_m(k) > h_f \qquad (4)$$

The low-pass filter transfer function used in this work is given by equation (5), where λ (λ ≥ 0) is a design parameter (the pole location in the z-plane).

$$H_{lp}(z) = \frac{V_f(z)}{V_i(z)} = \frac{1-\lambda}{1-\lambda\, z^{-1}} \qquad (5)$$
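In the time domain, equation (5) corresponds to the recursion v_f(k) = λ v_f(k-1) + (1-λ) v_i(k). The sketch below applies it to a binary alarm signal and then thresholds the result. The function names are illustrative, and λ = 0.7 is borrowed from the value reported for the filters Fs; the value used for the alarm filter is not stated in the paper.

```python
def low_pass(signal, lam, v0=0.0):
    """First-order filter of equation (5):
    v_f(k) = lam * v_f(k-1) + (1 - lam) * v_i(k)."""
    out, prev = [], v0
    for v in signal:
        prev = lam * prev + (1.0 - lam) * v
        out.append(prev)
    return out

def detect(alarm, lam=0.7, hf=0.5):
    """Fault detection signal: 1 where the filtered alarm exceeds hf."""
    return [1 if v > hf else 0 for v in low_pass(alarm, lam)]

# A single spurious alarm is rejected, a persistent alarm is detected.
print(detect([0, 1, 0, 0, 0]))
print(detect([0, 0, 1, 1, 1]))
```

The filter has unit static gain, so a persistent alarm drives the filtered signal towards one, while an isolated alarm sample only raises it to 1 - λ. This is what makes the detection signal robust to short false alarms, at the cost of a detection delay of a few samples.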

3.2 Fault Diagnosis Approach

The task of fault diagnosis is executed after the task of fault detection. Fault isolation is based on a knowledge-based system (KBS) that uses "if-then" rules. The isolation is performed by analysing the fault class (pattern) generated by the neural nonlinear discriminant analysis model. Fault number j is isolated if round(p_{j+1}) = 1 and round(p_i) = 0 for all i ≠ j+1, where round(.) rounds to the nearest integer. The fault isolation signal is also low-pass filtered.

For a nonlinear system (or a time-variant system) the ARX parameters are not constant, so it is necessary to define nominal regions and faulty regions in the fault symptoms space (2D scores and SPE signal, as depicted in Fig. 10) used for fault detection and diagnosis. The discrimination between a time delay fault and a system variation is possible if the neural network used for NLPCA can separate the fault symptoms in the decision (fault symptoms) space.


A pattern classification method based on neural nonlinear discriminant analysis (neural NLDA) is applied to the fault symptoms in order to obtain the associated faults. The fault symptoms are obtained a priori and saved into a database. The architecture of the neural network for NLDA is presented in Fig. 7 (Asoh & Otsu, 1990; Brito Palma, 2007). For the neural network that implements the NLDA, the input data are the scores and the SPE signal, x(k) = [ta(k) q(k)], and the output data is a pattern vector p(k) associated with each fault. Here, for the case of 4 faults, p = [p1 p2 p3 p4], i.e., the i-th element of the vector p is denoted p(i) = pi. For the case of nominal operation, corresponding to fault F0, p1 ≈ 1 and pi ≈ 0 for i ≠ 1.

Fig. 7. Architecture of the neural network for NLDA (layers IL, HL1, HL2, OL; activation functions σ and φ; function G; input x, output p).
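The isolation rule of section 3.2 can be written as a short function over the pattern vector p. This is a sketch with hypothetical names; it uses 0-based indexing, so the returned index j corresponds to element p_{j+1} and to fault Fj. Python's built-in `round` rounds ties to the nearest even integer, so a helper is used to round ties away from zero, as Matlab's round(.) does.

```python
import math

def round_half_away(v):
    """Round to the nearest integer, with ties away from zero."""
    return int(math.copysign(math.floor(abs(v) + 0.5), v))

def isolate_fault(p):
    """Return the fault number j if element p[j] rounds to 1 and all the
    other elements round to 0; return None when no single fault is isolated."""
    rounded = [round_half_away(v) for v in p]
    if all(r in (0, 1) for r in rounded) and sum(rounded) == 1:
        return rounded.index(1)
    return None

print(isolate_fault([0.97, 0.02, 0.05, -0.01]))  # nominal operation, F0
print(isolate_fault([0.10, 0.20, 0.90, 0.00]))   # fault F2
print(isolate_fault([0.60, 0.70, 0.00, 0.00]))   # ambiguous pattern
```

Returning None for ambiguous patterns leaves the decision to later samples, which fits the low-pass filtering applied to the isolation signal.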

4. FAULT TOLERANT CONTROL APPROACH

Fault tolerance approaches and the proposed fault tolerant control approach are presented in this section.

4.1 Fault Tolerance

Fault tolerance in automatic control systems can be achieved either by passive or by active strategies (Blanke, et al., 2003; Cardoso, 2006; Patton, 1997). Passive fault tolerance can be obtained using a robust controller that can tolerate changes of the plant dynamics (Cardoso, 2006; Patton, 1997). Active fault tolerance can be obtained using adaptive controllers (Patton, 1997).

4.2 Fault Tolerant Approach for NCS

Faults on sensors can be accommodated using virtual sensors (Cardoso, 2006; Oosterom & Babuska, 2000). After a fault occurs, the virtual sensor is used to estimate the sensor output. The proposed fault tolerant control (FTC) approach uses a virtual sensor based on a neural recurrent output predictor, termed here NROP (Brito Palma, 2007). When the supervisor decides to reconfigure the system, after a fault diagnosis, the virtual sensor is activated and the real sensor is deactivated. The general architecture used for FTC follows the architecture described in Fig. 4. The schematic diagram used for system reconfiguration is depicted in Fig. 8, where Fs is a low-pass filter. The architecture of the virtual sensor (implemented as an NROP) is depicted in Fig. 9. The NROP is implemented using a multi-layer perceptron (MLP) feed-forward (FF) neural network with external recurrency.

Fig. 8. Reconfiguration using a neural predictive model (NROP) acting as a virtual sensor (blocks: LAN, PLANT, LAN, Fs, virtual sensor; signals u(k), y(k), ŷ(k)).

The NROP is inspired by the neural parallel model and by the classical observer structures (Kalman filter and Luenberger observer) with prediction and correction mechanisms. The residual signal is given by:

$$r_e(k-1) = y(k-1) - \hat{y}_{nrop}(k-1) \qquad (6)$$

Typically, the NROP works with both the prediction and the correction mechanisms. In this application, the NROP acts as a pure neural parallel model, since the sensor output signal is no longer available in the control loop; the signal y(k) is therefore replaced by the estimated value ŷ(k) = ŷnrop(k) in the structure of the NROP. Hence, the gain Kn (a design parameter) has no effect, because the residual signal given by equation (6) is always zero, i.e., the correction mechanism is inactive.

Only one neural predictive model (NROP) was used in the experiment, since the plant is a SISO system. For a MIMO system, a bank of neural predictive models can be used. The proposed FDD/FTC approach assumes that the parameters of the ARX model are updated based on data collected from the process and/or from the virtual sensor. When no data can be acquired from the process, for example in the case of a LAN breakdown, the plant (process) becomes unobservable and uncontrollable.
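The prediction and correction mechanism, and the reason why Kn has no effect in parallel mode, can be illustrated with the sketch below. It is not the NROP of (Brito Palma, 2007): the trained MLP is replaced by an arbitrary callable, and adding the term Kn times the residual of equation (6) to the one-step prediction is an assumed form of the correction. All names are hypothetical.

```python
def run_predictor(model, u, n, kn, y_meas=None):
    """Iterate a recurrent output predictor with an optional correction term.

    model  : callable (past_y_hat, past_u) -> one-step prediction; stands in
             for the trained MLP feed-forward network
    u      : input sequence u(k)
    n      : number of past samples fed back to the model
    kn     : correction gain Kn
    y_meas : measured outputs y(k), or None when the sensor signal is not
             available (pure parallel model)
    """
    y_hat = [0.0] * n                           # initial conditions
    for k in range(n, len(u)):
        past_y_hat = y_hat[k - n:k][::-1]       # y_hat(k-1), ..., y_hat(k-n)
        past_u = list(u[k - n:k])[::-1]         # u(k-1), ..., u(k-n)
        # Without a measurement, y(k-1) is replaced by its own estimate.
        y_prev = y_hat[k - 1] if y_meas is None else y_meas[k - 1]
        residual = y_prev - y_hat[k - 1]        # equation (6)
        y_hat.append(model(past_y_hat, past_u) + kn * residual)
    return y_hat

# Stand-in first-order linear model (an assumption, not the plant of the paper).
toy_model = lambda past_y, past_u: 0.8 * past_y[0] + 0.2 * past_u[0]
u = [1.0] * 30

parallel_a = run_predictor(toy_model, u, n=1, kn=0.0)
parallel_b = run_predictor(toy_model, u, n=1, kn=0.5)
print(parallel_a == parallel_b)   # the gain Kn makes no difference here
```

In parallel mode the residual is identically zero, so both runs produce the same sequence regardless of Kn; when measurements are supplied, the correction term pulls the prediction towards the measured output.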

5. EXPERIMENTAL RESULTS

The operating conditions and some experimental results are presented in this section. In order to estimate on-line the parameters of the ARX(2,1,2) model relating the output signal y(k) to the reference signal r(k), a dither signal (Gaussian white noise) with a variance of 0.5×10^-3 was added to r(k). The SW-PCR algorithm was used for parameter estimation with a sliding window of length 20 s. Fig. 10 shows the fault symptoms for each fault (see Table 1). The top plot shows the 2D scores and the other plot shows the SPE signal.

Fig. 9. Architecture of the neural recurrent output predictor (NROP) (blocks: MLP-FF neural model, delay elements z^-1 to z^-n, correction gain Kn; signals u(k), y(k), ŷnrop(k)).
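The control loop used in the experiments of this section is an incremental digital PI controller with Kp = 1.1, Ti = 1.5 s and a sampling period of 0.22 s. A common incremental form is Δu(k) = Kp [e(k) - e(k-1) + (Ts/Ti) e(k)], sketched below. The exact discretization used by the authors is not stated, and the saturation of u(k) to [0, 1] is an assumption motivated by the normalized signals; the class name is hypothetical.

```python
class IncrementalPI:
    """Incremental (velocity form) digital PI controller."""

    def __init__(self, kp=1.1, ti=1.5, ts=0.22, u_min=0.0, u_max=1.0):
        self.kp, self.ti, self.ts = kp, ti, ts
        self.u_min, self.u_max = u_min, u_max   # assumed actuator limits
        self.e_prev = 0.0                       # e(k-1)
        self.u = 0.0                            # u(k-1)

    def update(self, r, y):
        """Compute u(k) from the reference r(k) and the output y(k)."""
        e = r - y
        du = self.kp * ((e - self.e_prev) + (self.ts / self.ti) * e)
        self.u = min(max(self.u + du, self.u_min), self.u_max)
        self.e_prev = e
        return self.u

pi = IncrementalPI()
print(pi.update(r=0.5, y=0.0))   # first control move for a set-point of 0.5
```

Because only the increment is computed, clamping u(k) directly limits the integral action, which is one reason the incremental form is often preferred in practice.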

Fig. 10. Fault symptoms (2D scores and SPE signal).

A sampling period of 0.22 s was used for the case of nominal operation. A fixed set-point was established around 0.5. An incremental version of a digital PI controller was implemented; the gains used are Kp = 1.1 and Ti = 1.5 s. In the digital low-pass filters, Fs, the pole was located at λ = 0.7. Tests were performed with a local area network (LAN). Most of the signals have been normalized to the range [0; 1]. All the routines were implemented in discrete time using the Matlab programming language.

In Fig. 11, experimental results are presented for the faulty situation F1. The fault occurs at 180 s. From top to bottom, the plots show the output signal, the control input, the network delay and the ARX model parameters, followed by the fault detection signal and the fault isolation signal. The last plot shows the output signal and the predicted signal; it can be observed that the reconfiguration occurs at time instant 199 s. Good performance was also obtained for the other faults (F2 and F3).

Fig. 11. Experimental results for fault F1.

6. CONCLUSIONS

Results obtained with the proposed fault tolerant approach show good performance for the class of faults under investigation. Good robustness with respect to noise and to small variations in network delays was achieved. Only faults associated with sensor-controller delays were considered, but the proposed FDD/FTC methodology can also deal with controller-actuator delays, delays in the controller and other faults on the plant (or on the controller) whose symptoms are associated with deviations in the ARX model parameters.

Topics for future work are the application of a fuzzy supervisor and of an intelligent adaptive networked controller that can identify the network performance and operate among multiple modes of control action. Tests on nonlinear plants will be carried out in the near future, as well as tests on WAN-type networks.

REFERENCES

Asoh, H., N. Otsu (1990). An Approximation of Nonlinear Discriminant Analysis by Multilayer Neural Networks. Proc. of the IEEE Int. Joint Conf. on Neural Networks. San Diego - USA.
Blanke, M., M. Kinnaert, J. Lunze, M. Staroswiecki (2003). Diagnosis and Fault-Tolerant Control. Springer.
Brito Palma, L. (2007). Fault Detection, Diagnosis and Fault Tolerance Approaches in Dynamic Systems based on Black-Box Models. PhD thesis. Universidade Nova de Lisboa, Portugal.
Cardoso, A. (2006). Supervisão e Controlo de Sistemas Dinâmicos com Tolerância a Falhas - Contribuição para uma Abordagem Estruturada e Robusta. PhD thesis. Universidade de Coimbra, Portugal.
Chiang, L., E. Russell, R. Braatz (2001). Fault Detection and Diagnosis in Industrial Systems. Springer-Verlag.
Gertler, J. (1998). Fault Detection and Diagnosis in Engineering Systems. Marcel Dekker Inc.
Harkat, M. (2003). Détection et Localisation de Défauts par Analyse en Composantes Principales. PhD thesis. Institut National Polytechnique de Lorraine - France.
Huo, Z., H. Fang, C. Ma (2004). Networked Control System: State of the Art. Proc. of the IEEE 5th World Congress on Intelligent Control and Automation. Hangzhou - China.
Isermann, R. (2006). Fault-Diagnosis Systems - An Introduction from Fault Detection to Fault Tolerance. Springer.
Kramer, M. (1991). Nonlinear Principal Component Analysis using Auto-Associative Neural Networks. AIChE Journal, vol. 37, no. 2, pp. 233-243.
Oosterom, M., R. Babuska (2000). Virtual Sensor for Fault Detection and Isolation in Flight Control Systems - Fuzzy Modeling Approach. Proc. of the 39th IEEE Conference on Decision and Control. Sydney - Australia.
Patton, R. (1997). Fault Tolerant Control Systems: the 1997 Situation. Proc. of the Safeprocess Symposium. Hull - UK.
Patton, R., P. Frank, R. Clark (2000a). Issues of Fault Diagnosis for Dynamic Systems. Springer-Verlag.
Patton, R., F. Uppal, C. Toribio (2000b). Soft Computing Approaches to Fault Diagnosis for Dynamic Systems: A Survey. Proc. of the IFAC Symposium Safeprocess. Budapest - Hungary.
Zhang, W., M. Branicky, S. Phillips (2001). Stability of Networked Control Systems. IEEE Control Systems Magazine.