European Journal of Control (2001)7:248±286 # 2001 EUCA
Soft Computing Approaches to Fault Diagnosis for Dynamic Systems J.M.F. Calado1, J. Korbicz2,*, K. Patan2,*, R.J. Patton3,y and J.M.G. Sa da Costa4,z 1
IDMEC ± Instituto Superior de Engenharia de Lisboa, Polytechnic Institute of Lisbon, Rua Cons. EmõÂ dio Navarro, 1949-014 Lisboa, Portugal; 2Institute of Control and Computation Engineering, Technical University of Zielona GoÂra, ul. PodgoÂrna 50, 65-246 Zielona GoÂra, Poland; 3Control and Intelligent Systems Engineering, Faculty of Engineering and Mathematics, The University of Hull, Cottingham Road, Hull Hu6 7RX, United Kingdom; 4IDMEC ± Instituto Superior TeÂcnico, Technical University of Lisbon, Av. Rovisco Pais, 1049-001 Lisboa, Portugal
Recent approaches to fault detection and isolation for dynamic systems using methods of integrating quantitative and qualitative model information, based upon soft computing (SC) methods are surveyed and studied in some detail. SC methods are considered an important extension to the quantitative model-based approach for residual generation in fault detection and isolation (FDI). When quantitative models are not readily available, a correctly trained neural network (NN) can be used as a non-linear dynamic model of the system. The paper describes some powerful NN methods, taking into account the dynamic as well as non-linear system behaviour. Sometimes, further insight is required as to the explicit behaviour of the model-involved and it is then that fuzzy and even neurofuzzy methods come to their own in data-driven FDI applications. The paper also discusses the use of evolutionary programming tools for observer and NN design. The paper provides many powerful examples of the use of SC methods for achieving good detection and isolation of faults in the presence of uncertain plant behaviour, together with their practical value for fault diagnosis of real process systems.
* Tel.: 48 68 32 82 422; Fax: 48 68 325-55-97; Email:
[email protected],
[email protected]. y Fax: 0044 1482 46 5117; Email:
[email protected]. z Fax: 351 21 849 80 97; Email:
[email protected]. Correspondence and offprint requests to: J.M.F. Calado, IDMEC ± Instituto Superior de Engenharia de Lisboa, Polytechnic Institute of Lisbon, Rua Cons. EmõÂ dio Navarro, 1949-014 Lisboa, Portugal. Fax: 351 21 831 70 57; Email:
[email protected].
Keywords: Classification; Computational intelligence in control; Evolutionary algorithms; Fault diagnosis; FDI; Fuzzy systems; Neural networks; Soft computing methods
1. General Introduction There is an increasing demand for man-made dynamical systems to become safer and more reliable. These requirements extend beyond normally accepted safety-critical systems of nuclear reactors, chemical plants or aircraft, to new systems such as autonomous vehicles or fast rail systems. The early detection of faults can help avoid system shut-down, breakdown and even catastrophes involving human fatalities and material damage. A system that includes the capacity of detecting, isolating, identifying or classifying faults is called a fault diagnosis system. During the last two decades many investigations have been made using analytical approaches, based on quantitative models. The idea is to generate signals that reflect inconsistencies between nominal and faulty system operation. Such signals, termed residuals, are usually generated using analytical approaches, such as observers [1,2], parameter estimation [3] or parity equations [4] based on analytical (or functional)
Received 18 May 2001; Accepted 22 May 2001. Recommended by Martins de Carvalho, J.M.G. Sa da Costa and AntoÂnio Dourado.
249
Soft Computing Approaches to Fault Diagnosis
redundancy. Considerable attention has been given to both research and application studies of real processes, using analytical redundancy as this is a powerful alternative to the use of repeated hardware (hardware or software redundancy). Early detection and isolation of small, incipient (rather difficult to detect) faults can be achieved with model-based processing of all measured variables, using either qualitative or quantitative modelling. Neural networks and fuzzy logic techniques are now being investigated as powerful modelling and decision making tools, along with the more traditional use of non-linear and robust observers, parity space methods and hypothesis-testing theory. Requirements for precise and accurate analytical model imply that any resulting modelling error will affect the performance of the resulting fault detection and isolation (FDI) scheme. This is particularly true for dynamically non-linear and uncertain systems, which represent the majority of real processes. To circumvent this precision problem (at least in part) more abstract models, based on qualitative physics [5±8] may be used. Alternatively fuzzy-logic rules may be developed to either assist or replace the use of a model for diagnosis [9]. The key advantage of fuzzy logic is that it enables the system behaviour to be described by ``if-then'' relations. Some research has been based upon the neural network (NN) that can be trained to reproduce a specified system behaviour from data sets alone. NNs can, indeed, provide an excellent framework for dealing with non-linear systems [10]. The main feature of NNs are their ability to model any non-linear function, given suitable weighting factors and an appropriate architecture. However, whilst such a configuration can be trained well on numerical data, heuristic knowledge from experts cannot easily be incorporated. It is also argued that, due to their ``grey box'' characteristics, conventional neural networks do not give an insight into the behaviour of the system that is sufficiently comprehensible by the operator. Another drawback of substituting the operator's ``intelligence'' by an automated analytical approach is that the operator's expertise, built up over several years, is simply not used. This is mainly due to the inability of analytical methods to represent symbolic information. In the authors' opinion a robust FDI system should combine both numerical (quantitative) and symbolic (qualitative) information. Some investigators tackled this problem by combining parameter estimation or observers with fuzzy logic [11,12]. The main idea has been to generate residuals using either parameter estimation or observers, and allocate the
decision-making to a fuzzy-logic inference engine. In so doing, it has been possible to include symbolic knowledge with the quantitative information and, thereby, minimise the false alarm rate. Indeed, the key benefit of fuzzy-logic is that it lets the operator describe the system behaviour or the fault±symptom relationship with simple if-then rules. Here we use the term soft computing (SC) for all methods employing computational intelligence algorithms, e.g. fuzzy logic, neural networks, neuro-fuzzy schemes, evolutionary programming, etc. The paper has the following organization. Section 2 gives an overview of the most used SC methods used in FDI systems. Section 3 addresses the use of NNs in FDI systems given two demonstration examples. Section 4 presents a hierarchical fuzzy-neuronal approach for FDI system, given a validation example. In Section 5 the use of evolutionary approaches to FDI system design is addressed. Finally, general conclusions are drawn.
2. Soft Computing in FDI: Some Perspectives 2.1. Introduction This section gives an overview of many of the SC methods that are considered a powerful extension to quantitative/analytical approaches for FDI in dynamic systems. This section forms a background for the later sections of the paper. Starting from a brief consideration of the principles of quantitative model-based fault diagnosis, the ideas behind the use of NNs, fuzzy logic and neuro-fuzzy structures are summarised to give the reader a feel of their potential advantages and disadvantages. In keeping with the ideas behind using quantitative observers for FDI, the concept of the ``fuzzy observer'' is described. A form of fuzzy observer for non-linear systems is outlined which uses a fuzzy rule-base to select the most suitable dynamic model for a particular operating point (the so-called fuzzy inference multiple-model approach). It may be important to structure a quantitative model in a way that qualitative knowledge about the process could be included as well as extracted. The key to achieving this is to structure a NN in a manner that can model highly non-linear systems efficiently using a fuzzy-logic format ± the so-called neuro-fuzzy network. The main aim is to ensure that the NN can be trained rapidly and also provide a linguistic description about the causes of faults. It is argued that the Bspline network can be a suitable architecture for this problem due to an interesting equivalence relation
250
J.M.F. Calado et al.
with the function of fuzzy rule sets. The main difficulty with this approach is the rapidly increasing complexity of the rule-base with system order and complexity. Several more practically suitable neuro-fuzzy approaches are given more attention in the paper. Evolutionary programming algorithms such as GAs are often considered within the field of optimisation. However, for the purpose of FDI design these methods are considered in this study as a valuable approach to the SC field. 2.2. Model-based FDI Principles The aim of a quantitative model-based fault diagnosis is to generate information about the location and timing of a fault, using the measurements available in that system, as well as the precise mathematical relationships that relate them. Figure 1 illustrates the conceptual structure of a model-based fault diagnosis system, which comprises the following main stages. Residual Signal: r
s Hu u
s Hy y
s:
1
Objectives: choose Hu and Hy so that r
s 0 when no fault occurs r
s 6 0 when a fault occurs 1. Residual Generation: This is an algorithm that processes the measurable inputs and outputs of the system to generate the residual signal. 2. Decision-making: The residuals are then examined for the likelihood of faults, and a decision rule is then applied to determine if any fault has occurred. A decision process may be based on a simple threshold test, on the instantaneous values or moving averages of the residuals, or it may consist of methods of statistical decision theory, e.g. likelihood ratio testing or sequential probability testing. The successful detection of a fault is followed by the fault isolation procedure whose aim is to locate the fault. u (s)
System G(s) Hy(s) Hu(s)
y (s)
r (s)
Decision Making
Residual Generator
Fig. 1. Model-based fault diagnosis.
Fault Information
Observers: The underlying idea is to estimate the system outputs from the available inputs and outputs of that system [13]. The residual will then be a weighted difference between the estimated and the actual outputs. The ¯exibility in selecting the observer gain has been fully exploited in the observer, yielding a rich variety of fault detection schemes. Parity relations: They are based either on a technique of direct redundancy, making use of static algebraic relations between sensor and actuator signals or alternatively, upon temporal redundancy, when dynamic relations between inputs and outputs are used. Parameter estimation: This approach makes use of the fact that component faults of a dynamic system can be thought of as re¯ected in the physical parameters as for example friction, mass velocity resistance. Faults are detected by estimating parameters of non-parametric models. The main assumption made when using the above methods is that a precise mathematical model of the plant is required. This makes quantitative modelbased approaches very difficult to use in real systems, since any un-modelled dynamics can affect the performance of the FDI scheme. A way to overcome this problem, is to design robust algorithms where the effects of disturbances on the residual are minimised and the sensitivity to faults maximised. Many approaches had been developed including unknown input observers and eigenstructure assignment observers [1,2], as well as frequency domain techniques for robust FDI filters such as H1 [14] and the minimisation of multi-objective functions [15]. 2.3. The NN Approach to FDI Here we present the essential ideas behind the use of NNs for FDI. Later, in Section 3 we give a more detailed overview of the most important NNs architectures used in FDI. To overcome some of the difficulties of using mathematical models, and make FDI algorithms more applicable to real systems, the NN can be used to generate both residuals and isolate faults [2], as shown in Fig. 2. A NN is a processing system that consists of a number of highly interconnected units called neurons that are interconnected via a large number of ``weighted links''. Each neuron can be considered as a mathematical function that maps the input and output space with several inputs. The inputs are connected to either the inputs of the system or the outputs of the
251
Soft Computing Approaches to Fault Diagnosis Reference Input
Monitored System
Controller Faults Input
Actuators
ANN
Qualitative Information
Plant Dynamics
Residuals
Sensors
ANN
Output
Fault Information
Symbolic (Qualitative) Information
quantitative data and they are readily applicable to multivariable systems. NNs can also be applied for process condition monitoring, where the focus is on small irreversible changes in the process which develop into bigger faults. Yin [18] demonstrated the application of MLP and Kohonen self-organising feature map (KFM) to predictive maintenance or condition-based maintenance of electrical drives, particularly induction motors. The first method utilises supervised learning and in the second the learning is unsupervised.
Fault Diagnosis System
Fig. 2. NNs scheme for FDI.
other neurons in the system. The output of one neuron effects the outputs of other neurons and all neurons connected together can perform complex processes. Artificial neural networks provide an excellent mathematical tool for dealing with non-linear problems [16]. They have an important property, according to which any continuous non-linear relationship can be approximated with arbitrary accuracy using a neural network with suitable architecture and weight parameters. Another attractive property is the selflearning ability. A neural network can extract the system features from historical training data using the learning algorithm, requiring a little or no a priori knowledge about the process. Hence, they can be trained to represent relationships between past values of residual data (generated by another NN) and those identified with some known fault conditions. The configuration used by Chen and Patton [2] involved a multi-layer feed-forward network configuration. Although this configuration can be well trained on numerical data once that the outputs are known, symbolic knowledge from experts cannot easily be incorporated. On the other hand, the quantitative mathematical model used in traditional FDI methods can be very sensitive to modelling errors, parameter variation, noise and disturbance. No system mathematical model is needed to implement a NN. Online training makes it possible to change the FDI system easily when changes are made in the physical process, the control system or parameters. A suitably trained NN can generalise when presented with inputs not appearing in the training data. NNs have the ability to make intelligent decisions in cases of noisy or corrupted data. They also have a highly parallel structure, which is expected to achieve a higher degree of fault-tolerance then conventional schemes [17]. NNs can operate simultaneously on qualitative and
2.3.1. Application Studies NNs have been successfully applied to many applications including fault diagnosis of non-linear dynamic systems [19,20]. Multi-layer perceptron (MLP) networks are applied to detect leakages in electro-hydraulic cylinder drive in a fluid power system [21]. They showed that maintenance information can be obtained from the monitored data using the NN instead of a human operator. Crowther et al. [22], showed in an application of a NN to fault diagnosis of hydraulic actuators, that experimental faults can be diagnosed using NNs trained only on simulation data. NNs are applied to detect the internal leakage in the control valves and motor faults in process plants [23]. Kuhlmann et al. [24] presented the principle of the Device-Specific NN (DS-NN) approach to fault diagnosis. The basic principle of the DS-NN approach is that NNs are trained for dealing with certain basic groups or electrical devices (e.g. lines, transformers, busbars etc). Weerasinghe et al. [25] investigated the application of a single neural-network for the diagnosis of non-catastrophic faults in an industrial nuclear processing plant operating at different points. Data-conditioning methods are investigated to facilitate fault classification, and to reduce network complexity. Maki and Loparo [26] presented detection and diagnosis of faults in industrial processes that require observing multiple data simultaneously. The main feature of this approach is that the fault detection occurs during transient periods of process operation. Vemuri and Polycarpou [27] investigated the problem of fault diagnosis in rigid-link robotic manipulators with modelling uncertainties. A learning architecture with sigmoidal NNs was used to monitor the robotic system for any off-nominal behaviour due to faults. Butler et al. [28] discussed the use of a new NN supervised clustering method to perform fault diagnosis for power distribution networks. The NN proposed performs fault type classification, faulted feeder and faulted phase identification, and fault impedance
252
estimation for grounded and ungrounded distribution networks. Li et al. [29] described a method to diagnose the most frequent faults of a screw compressor and assess magnitude of these faults by tracking changes in the compressor dynamics. Yu et al. [30] investigated semi-independent neural model, based on an RBF network, to generate enhanced residuals for diagnosing sensor faults in a reactor. Narendra et al. [31] also used the RBF network architecture for fault diagnosis in a HVDC system. A new pre-classifier was proposed which consists of an adaptive filter (to track the proportional values of the fundamental and average components of the sensed system variables), and a signal conditioner which uses an expert knowledge base (KB) to aid the pre-classification of the signal. Other networks used in recent applications include Dynamic Back-propagation NNs [32] and Cerebellar Model Articulation Controller (CMAC) Network [33]. Each of these architectures offers different characteristics to suit distinct applications. Recent research focuses on NNs that can optimise their structure during training. Ren and Chen [34] proposed a new type of NN in which the dynamical error feedback is used to modify the inputs of the network. NNs have also been applied to the problem of joint faults in robots, using pattern recognition. The jointbacklash of robots is diagnosed by monitoring its vibration response during normal operation [35]. James and Yu [36] used a NN for the condition monitoring and fault diagnosis of a high-pressure air compressor valve. The NN-based FDI scheme can also show when further increases in fault levels might be likely, thus giving the operator time to take necessary action [37]. Dynamical NNs [38] are applied to on-line fault detection of power systems, aircraft system examples, chemical plants and nuclear reactors which are highly non-linear and complex. A detailed discussion of dynamical NNs is given in Section 3. 2.3.2. Strategies for Fault Diagnosis It is clear that NNs can be applied to the FDI problems using different approaches. Pattern recognition and residual generation-decision making are the most common ones. The latter is generally more suitable for dynamic systems and comprises residual generation and decision-making stages in the manner outlined in Section 1.2. In the first stage the residual vector r is determined in order to characterise each fault. Ideally, the NN models identify all classes of system behaviour. The second stage, decision making or classification processes the residual vector r to determine the location and occurrence time of the faults. Chen and
J.M.F. Calado et al.
Patton [2] showed that a single NN can be used for both stages simultaneously, with increased training time and complexity. Fault isolation requires that the training data are available for all expected faults in terms of residual values or system measurements. The NN can be used for classification in conjunction with other residual generating methods e.g. nonlinear observers. NNs have been successfully applied in state and parameter based FDI schemes [39,40]. Han and Frank [39] proposed a parameter estimation based FDI using NNs in which physical parameters are estimated by applying the NN universal approximation property applied to the measured I/O data. The deviations from normal values are then used for fault diagnosis. It is assumed that the faults in the monitored process can be described as changes in parameter vector and the nominal parameter values are known in advance or can be estimated on-line (e.g. via recursive least-squares method). A linear model of the system can be described as: _ A
tX
t B
tU
t
t, X
t Y
t C
tX
t D
tU
t
t or
1 d
n Y , d
n 1 Y , . . . , dY , B dtn dtn 1 dt C Y
t H@
m A
t "
t, d Y , d
m 1 Y , . . . , U m dt dtm 1
2
0
3
where X 2 Rn, Y 2 Rm, 2 Rp are the state, output, model parameter vectors and 2 Rn and 2 Rm are the process and measurement noise signals. A(), B(), C(), D() and H[ ] are known function matrices. Equation (1) is suitable for FDI with state estimation whilst (2) is suitable for parameter estimation based FDI. Faults in the process can be described as the change in the parameter vector ^ 0 , where ^ and 0 are the estimated and nominal parameter vectors. NNs have been successfully applied to the on-line FDI of industrial processes. Fuente and Vega [41] used a NN for FDI of a biotechnological process. Real data from experiments on the plant were used together with on-line estimation. The back-propagation algorithm was used to analyse the frequency content of some fault-indicating signals derived from the identification step. This gave rise to correct detection and isolation of each fault. Gomm [42] described an adaptive NN that continually monitors and improves its performance on-line as new fault information becomes available. New nodes are automatically added to the network to accommodate novel
253
Soft Computing Approaches to Fault Diagnosis
process faults after detection, and on-line adaptation is achieved using recursive linear algorithms to train selected network parameters. 2.3.3. NN FDI Design Issues As outlined above, NNs provide essentially a ``grey box'' signal processing structure, which do not show rules governing their operation and there is no visibility as to their real behaviour (from an input±output point of view). This does not enable the user to understand the system and predict its behaviour in uncertain situations. On the other hand, B-spline networks can be used to extract and include some heuristic knowledge about the system. The training time required for a specific application and the complexity of the training algorithm present further limitations. The earlier back-propagation algorithm used to train MLPs; requires an excessive training time and is generally an off-line method for training. RBF NNs are capable of on-line adaptive training if required [43] but use large numbers of neurons if the I/O space is large. To accelerate convergence, state variables with additional terms can be used in training [21]. NNs which use neurons as membership functions, e.g. RBF and B-spline NNs, do not generalise well when presented with data outside the training I/O space. MLP and Fuzzy Logic based systems on the other hand tend to generalise in a better way. On-line training should be used to update such networks [43]. If some unknown fault conditions appear, the neural classifier is no longer valid because it is not trained to classify this type of fault. Adaptive training algorithms should be used with systems requiring online training. It is not usually possible to acquire all the faulty data for NN training. Thus unsupervised training, which uses the Kohonen NN and the counter-propagation (CPN) NN [44], is necessary in order to classify the faults not known a priori. A combined artificial NN and expert system tool (ANNEPS) is developed [45] for transformer fault diagnosis using dissolved gas-in-oil analysis (DGA). ANNEPS takes advantage of the inherent positive features of each method and offers a further refinement of present techniques. 2.3.4. Hybrid NNs NN-based FDI methods usually require pre-processing or signal conditioning algorithms to reduce the effect of noise and disturbance and to enhance the fault features. Many other techniques have been
combined with NNs, including fuzzy logic, genetic algorithms and adaptive modelling etc. Aminian et al. [46] developed an analog-circuit fault diagnostic system based on back-propagation NNs using wavelet decomposition, principal component analysis, and data normalisation as pre-processors. The proposed system has the capability to detect and identify faulty components in an analog electronic circuit by analysing its impulse response. Pantelelis et al. [47] developed simple finite element (FE) models of a turbocharger (rotor, foundation and hydrodynamic bearings) combined with NNs and identification methods and vibration data obtained from real machines towards the automatic fault diagnosis. Liu [48] used extended Kalman filter (EKF) and NN classifier for FDI. Network inputs are the process I/O data, such as pressure and temperature, parameters estimated by EKF, and state values calculated by dynamic equations, whilst network outputs are process fault scenarios. Zhao et al. [49] presented a multidimensional wavelet (MW) with its rigorously proven approximation theorems. Taking the new wavelet function as the activation function in its hidden units, a new type of wavelet called multi-dimensional nonorthogonal non-product wavelet-sigmoid basis function NN (WSBFN) model is proposed for dynamic fault diagnosis. Based on the heuristic learning rules presented by Zhao et al. [49], a new set of heuristic learning rules is presented for determining the topology of WSBFNs. Zhao et al. [50] proposed the wavelet-sigmoid basis NN (WSBN) with expert system (ES) for dynamic fault diagnosis (DFD). Ye and Zhao [51] proposed a hybrid intelligent system, which integrates NNs with a procedural decisionmaking algorithm to implement hypothesis-test cycles of system fault diagnosis. 2.3.5. Detection of Multiple Faults NNs have been found to give more information with regard to multiple-fault conditions than some other methods (steady-state position error, time series analysis. Some NNs are applied to diagnose multiple faults in the processes but generally it is much more difficult to diagnose such faults is a process because the training data needed become very large. Maidon et al. [52] developed technique for diagnosing multiple faults in analog circuits from their impulse response function using a fault dictionary. A technique is described [53] for diagnosing multiple faults in analogue circuits from their impulse response function using multi-layer perceptrons, in terms of a specific example. A Dirac impulse input to the circuit was
254
simulated, and time domain features of the output response were classified by a system of two multi-layer perceptrons to produce accurate numerical fault values. 2.4. FDI via Fuzzy Logic Since Zadeh (1965) introduced the theory of the fuzzy sets ± manipulating data that were not precise, but rather ``fuzzy'' and since the work of Mamdani [54], industrial application studies using fuzzy logic controllers have reached a major position in systems engineering. Fuzzy systems are useful in any situation in which the measurements taken are imprecise or their interpretation depends strongly on the context or on human opinion. 2.4.1. Application Studies Application areas include: the process industry, electromechanical systems, traffic and avionics control and biomedical systems etc. Evsukoff et al. [55] proposed a decision support system dedicated to fault detection and isolation from a human±machine co-operation point of view. Yang and Liao [56] proposed an adaptive fuzzy system for incipient fault recognition through an evolution enhanced design approach. Complying with the practical gas records and associated fault causes as much as possible, a fuzzy reasoning algorithm is presented to establish a preliminary fuzzy diagnosis system. In this system, an evolutionary optimisation algorithm is further relied on to fine-tune the membership functions of the if±then inference rules. Lu et al. [57] described a fuzzy diagnostic model that contains a fast fuzzy rule generation algorithm and a priority rule based inference engine. Insfran et al. [58] proposed an approach for fault diagnosis, using fuzzy sets. The system allows not only the fault location, but also identification of the fault type. Currents and voltages are analysed using the fault phase-impedance and fuzzy sets. Dexter and Benouarets [59] used a set of fuzzy reference models that describe faulty and fault-free operation, and a classifier based on fuzzy matching for fault diagnosis. The reference models are obtained off-line from simulation data. A fuzzy model that describes the actual plant behaviour is identified on-line from normal operating data and compared with each of the reference models. Mechefske [60] applied fuzzy logic techniques to classify frequency spectra representing various rolling element bearing faults. The frequency spectra representing a number
J.M.F. Calado et al.
of different fault conditions are processed using a variety of fuzzy set shapes. Although fuzzy systems theory is often applied to industrial process, the applications often do not work well. Sometimes fuzzy logic designs are completed without mathematical rigour. The main tasks of finding appropriate membership functions and fuzzy rules are often determined simply by ``trial and error''. The rules can be obtained by means of optimisation methods. LMI optimisation has been used in order to design an optimal Takagi±Sugeno (T±S) observer based on a relaxed stability condition [61]. Another main approach to obtain the number, position and type of rules is to apply adaptive and learning algorithms to fuzzy systems or to apply NNs to learn the parameters of the fuzzy system. 2.4.2. Fuzzy Decision-Making The advantage of the fuzzy approach is that it supports, in a natural way, the direct integration of the human operator into the fault detection and supervision process. By avoiding an incorrect decision that can cause false alarms the aim of the FDI decision making (for fault diagnosis) is to decide whether and where the fault in the system has occurred [62]. Fuzzy decision-making objectives are very similar to expert systems and supervisory control. Expert Systems are used to simulate the problem-solving and decisionmaking processes of a human expert within a relatively narrow domain. This is done using special computer packages along with knowledge, information and databases [63]. 2.4.2.1. Formulation of decision-making. A decision can be formulated by a set of variables (sets, relations and functions) termed a quintuple (S, st, C, m, dc) [64,65]. By using available information S is the possible actions where a selection of this set is performed. st are the set of uncontrolled variables by the decision maker cannot of the environment but they must be included in the decision making process. C is the set of consequences, which must be including into a multicriteria decision-making scheme. Uncertainties resulting from the identification procedure and inherent uncertainties of the system are included, in part, in the consequences. m is the relation used to obtain the decision-making solutions by mapping the space S st into the set consequences as S st ! C. The decision-maker has, a priori, aims and objectives in a preference ordering. They are taken into account in dc as a decision function dc : C !R. A number of fuzzy decision-making methods for control have been applied for more than 2 decades.
255
Soft Computing Approaches to Fault Diagnosis
For example, the formulation by Bellman and Zadeh [66]. For this approach, there is no distinction between aims and constraints; both are included in the membership functions. 2.4.3. Fuzzy Clustering and Fuzzy Identification To identify complex non-linear systems it is common to obtain partitions of the available data, each partition or subset is approximated by a simple model. The data can be quantitative or qualitative information or a mixture of both. Clustering algorithms are not only used for classification and pattern recognition to construct fuzzy models but also for the simplification and optimisation in modelling. Isoc [67,68] used quasi-linear fuzzy models based on the Sugeno approach (from experimental measurement data according to the Box±Jenkins data sets). These were compared with the real system data sets and then with models obtained using other identification techniques. Various identification techniques to develop fuzzy models were used: for example Mendel±Wang fuzzy reference sets [69] were used. The results obtained were of good quality because a more natural inter-dependence between the data set and extracted fuzzy sets was defined. The fuzzy approach is becoming a powerful alternative to using artificial expert systems and may gain more practical importance in the future. The nonlinear system can be identified using a fuzzy multiplemodel description of the real system in parallel and a series model or any combination (series±parallel) [70] and consequently a number of models are identified. Chen et al. [2] used a different approach. The identification of locally linear models using the T±S fuzzy modelling strategy is solved using a convex optimisation technique involving linear matrix inequalities (LMI) in order to find the optimum set of fuzzy models (Ai, Bi, Ci, Di, for i 1,r). This approach has been successfully applied to a real-time induction motor drive test rig. The issue that remains a challenge is to obtain not only a number of multiple linear models but also the minimum number of models, which describe the nonlinear system. This optimisation is difficult because the identification method using fuzzy logic depends on a large number of variables. There are various procedures to try to extract learning rules in combination with other techniques (e.g. using NNs) [71]. 2.4.4. Fuzzy Techniques in FDI In recent years the application of fuzzy logic to model-based fault diagnosis approaches has gained
increasing attention in both fundamental research and application. Symptoms can be generated using observers based on the estimation of the output from the system. The first methods used fuzzy set theory to express cause±effect relations in expert systems. The key idea of model-based methods is the generation of signals, termed residuals. These are usually generated using mathematical methods (based on state observers, parameter estimation or parity equations). The models correspond to the monitored system [2]. Residuals are signals representing inconsistencies between the model and the actual system being monitored, but the deviation between the model and the plant is influenced not only by the presence of the fault but also modelling uncertainty. One solution is for the observer and controller parameters to be tuned via estimation from the real system for fault isolation and threshold adaptation [72]. The introduction of fuzzy logic can improve the decision-making, and in turn will provide reliable and sufficient FDI, suitable for real industrial applications. However, difficulties arise in the training of the algorithm in the inference mechanism where knowledge is hidden in large amounts of data and knowledge is embedded in trained NNs [15]. A fuzzy feed-forward NN (FNN) is applied to extract rules from an existing data base (Fig. 3). Frank et al. [11,73±77] use fuzzy logic for residual evaluation. This can be an important way of taking into account modelling uncertainty at decision-making rather than during residual generator design. By applying a fuzzy rule-based approach the fault decision process can be made robust to the uncertainties so that false and missed alarm rates can be minimised. Considering supervisory control [11,78] with tasks such as system management, process monitoring, identification, fault detection, diagnosis and adaptive capability reduces at lower level the models for developing simpler structures for observers and controllers using T±S fuzzy models. 2.5. Qualitative and Quantitative FDI Methods Integrated The concept of integrating symbolic and quantitative knowledge through a neuro-fuzzy system is well Fuzzified Input
Input Fuzzifier
Fig. 3. Basic concept of the FNN.
Output
256
J.M.F. Calado et al.
known in the literature. The aim is to combine the NN learning ability with the explicit knowledge representation of fuzzy-logic. The application engineer can therefore extract, from the data, a high level language description of the system. Heuristic plant knowledge can also be included. Some other tools like evolutionary algorithms, genetic algorithms or probabilistic reasoning can also be combined with the above, to enhance the parameter tuning or to deal with the uncertainty in order to establish the desired intelligent system. 2.5.1. Combining NNs with Fuzzy Logic A simple neuro-fuzzy network (NFN) uses a fuzzifier to combine the NN and fuzzy logic. The fuzzifier receives input data patterns and converts them to fuzzy categories which are used as inputs to the NN. Several methods of combining NNs with fuzzy logic have been reported, the advantages and disadvantages of which depend on the specific application. Farag et al. [79] proposed a five-layer fuzzy NN (FNN) in which the parameter identification of the fuzzy model comprises three phases (Fig. 4). In the first phase, initial parameters are found using the Kohonen self- organising feature map algorithm. The second phase consists of finding the linguistic rules. In the third phase, a genetic algorithm known as the multi-resolutional dynamic genetic algorithm (MRDGA) is used to tune the membership functions. Li and Elbestawi [80], proposed a multiple principal component (MPC) fuzzy NN (FNN) for custering (unsupervised classification) which employs fuzzification of the inter-connections of a conventional NN (Fig. 5). This method is used for automated tool condition monitoring in machining and is based on Li and Elbestawi's fuzzy NNs in which fuzzy membership functions are used for decision making and the interconnection in the network.
Altug et al. [81] presented two neural fuzzy (NN/ FZ) inference systems, namely, Fuzzy Adaptive Learning Control/Decision Network (FALCON) and Adaptive Network Based Fuzzy Inference System (ANFIS), with applications to induction motor fault detection/diagnosis problems. Aggarwal et al. [82] addressed the problems of fault diagnosis in complex multi-circuit transmission systems, in particular those arising due to mutual coupling between the two parallel circuits under different fault conditions. The problems are compound by the fact that this mutual coupling is highly variable in nature. In this respect, the soft computing provides the ability to classify the abnormal phase/phases by identifying different patterns of the associated voltages and currents. A Fuzzy ARTmap (Adaptive Resonance Theory) NN is employed and is found to be well-suited for solving the complex fault classification problem under various system and fault conditions. Jota et al. [83] proposed a combinatorial intelligent system based on neuro-fuzzy, neuro-expert and fuzzy-expert algorithms which can be successfully applied in the detection of a number of faults in a range of equipments. Pfeufer and Ayoubi [84] applied a neuro-fuzzy-structure to the classification of faults, based on symptoms generated by identifying a mathematical model. Ozyurt and Kandel [85] presented a hybrid diagnostic methodology for fault diagnosis based on a hierarchical multi-layer perceptronelliptical NN structure and a fuzzy expert system. The introduced hybrid system is noise-tolerant, easy to train and maintain and also reliable under changing process conditions. Some recent research focuses on NNs, for example B-spline NNs (Fig. 6), which can be used to extract the qualitative knowledge of the system. The B-spline network is used to classify faults in the process [86]. The faults are assumed to be a priori known, and their corresponding data available to the designer.
Output Layer 5: Output neurons Layer 4: Membership functions describing output fuzzy linguistic variables Layer 3: Fuzzy rule base. Each neuron acts as single fuzzy control rule Layer 2: Membership functions describing input fuzzy linguistic variables
Outputs Output Layer Fuzzy Logic Connections Fuzzy Connections at Neurons Fuzzy Logic Connections Input Layer
Layer 1: Input neurons Inputs
Input
Fig. 4. Fuzzy NN [79].
Hidden Layer
Fig. 5. Fuzzy NN [80].
257
Soft Computing Approaches to Fault Diagnosis
Fault#1
– u(t)
Σ
. . . . . . .
– y(t)
B1 B2 B3
Σ
Bp
Fault#2
Fault#M Σ Healthy Σ
Fuzzy If-Then rule extraction Fig. 6. Fuzzy rule-extraction using B-spline NN.
The network will then have as many outputs as classes of behaviour. Hence, for a system with 2 classes of faults, the network output will be a vector of dimension 3; this includes the models associated with the 2 faults as well as that corresponding to the healthy one. This method offers the extraction of fuzzy rules in addition to FDI. As outlined above, there are many applications for NNF-based FDI. However successful, implementation of FNN FDI methods depends heavily on prior knowledge of the system and the training data. In Section 4 we describe an interesting approach to FDI based on an architecture of FNNs arranged in a hierarchical scheme [87]. 2.6. Non-linear FDI via Fuzzy Observers
overall outputs for the RBF network should be the same method for the T±S model. (2) In order to restrict the network to the class of T±S structure some conditions must be satis®ed: The output of each fuzzy if±then rule is a constant and the membership functions chosen have to be the Gaussian with the same width. For the T±S models, the stability conditions and pole assignment in LMI regions are derived in [89]. For a non-linear dynamic system described by the T±S fuzzy model, a fuzzy observer can be designed to estimate the system state vector. For the fuzzy observer design, it is assumed that the fuzzy system model is locally observable, i.e., all (Ax, Cx) (x 1, . . . , r) pairs are observable. Using the idea of parallel distributed compensation (PDC) ± the use of parallel dissimilar feedback paths, with each one corresponding to a different model [90,91], for a nonlinear dynamic system represented by T±S fuzzy model, a linear time-invariant observer can be associated with each rule of the fuzzy model. Continuous System: if !
t is Mx then x^_
t Ax x^
t Bx u
t Lx
y
t
y^
t,
4
y^
t Cx x^
t: The overall observer dynamics will then be a weighted sum of individual linear observers. x^_ y^
r X x1 r X
x
wAx x Bx u Lx
y
y^,
5
x
wCx x^:
x1
2.6.1. Takagi±Sugeno (T±S) Fuzzy Models It is possible to establish the equivalence of a generalised class of Gaussian RBF NNs and the T±S model of fuzzy inference. A standard Gaussian RBF and a restricted form of the T±S model of fuzzy inference are functionally equivalent [88]. The standard RBF NN is functionally equivalent to the T±S fuzzy inference model of fuzzy under the following conditions: (1) There are some conditions required to make RBF and fuzzy inference system structurally equivalent, like the number of RBF units must be equal to number of if±then rules, the T-norm has to be the operator used to compute each rule's ®ring is multiplication and the method to derive their
x
w
t is the grade of membership of the premise variable, w(t), or the tensor product of grade of memberships, if w(t) is a vector. The membership grade function x
w
t satisfies the following constraints r X
x
w
t 1,
6
x1
0 x
w
t 1
8x 1, 2, . . . , r:
The schematic diagram of such an observer is shown in Fig. 7, where it can be seen that a fuzzy inference engine is used to ``select'' the appropriate output, from those generated by the r parallel observers. The transition between one model to the other depends on the operating regime defined by !.
258
J.M.F. Calado et al.
MONITORED SYSTEM
Input
Refference Input
Process
Output
CONTROLLER
Fuzzy Qualitative Simulation
Faults
Inputs ACTUATOR
PLANT DYNAMICS
Outputs
SENSORS
Parameter Estimation
Residuals Decision Making
Fig. 8. Fuzzy qualitative observer.
the set of spurious behaviours by means of temporary filters, although these behaviours continue to exist for complex systems. Like Q2, FuSim uses Taylor± Lagrange formula for the temporary calculations, producing identical problems.
Adaptive Law
Residual Observer
Qualitatives States
Fault Information
Fault Diagnosis Unit
Fig. 7. Multiple-model adaptive FDI scheme.
2.6.2. Qualitative Fault Diagnosis Fault diagnosis of dynamic systems can also be based upon declarative knowledge of the system, which is available in qualitative rather than quantitative form [6,92,93]. The qualitative approach is based on the concept of a qualitative model described by means of fuzzy rules that, unlike the quantitative counterpart, only require declarative (heuristic) information. A fuzzy qualitative observer [93] can be designed (see Fig. 8) making use of the fuzzy qualitative simulation in order to produce a residual generator. The qualitative model of the process can be seen as an observer. Since the model used to obtain the observations of the process is qualitative, the states (behaviours) will be qualitative. The qualitative states are obtained via simulation or via observation. In order to reduce the ambiguity resulting from qualitative simulation, qualitative observers are used to generate the predictions of the possible qualitative states. In FDI, the qualitative observer can be used as a substitute when not all the quantitative information of the process is available. A Fuzzy Qualitative Simulator developed by Heriot-Watt University, Edinburgh, called FuSim (fuzzy interval-based simulator) was proposed by Shen et al. [6]. This simulator presents a methodology to integrate knowledge of the common sense and the qualitative simulation of physical systems by means of the use of fuzzy sets. The use of an amount of fuzzy space facilitates allows a detailed description of the relations between two or more variables. This method produces a reduction of
2.6.3. Evolutionary Algorithms of Genetic Type NNs can be trained to replicate dynamic system behaviour during normal and abnormal operation. The NN behaves as an implicit model of the process (implicit ± because a mathematical model of the process is not actually required). In order to assure a good accuracy of these models, the NN structures must be optimised. The research has shown that evolutionary techniques cope efficiently with this optimisation problem. They can be used in order to implement semi-automatic procedures, dedicated to the selection of convenient NN topologies and parameters. Many papers have focused on the development of evolutionary-based algorithms for two types of NN structure. The first is applied to a feed-forward network structure and the second is applied to the dynamic multi-layer perceptron. ``Near-optimal'' NN topologies can be obtained by minimising their complexity order and the corresponding output-squarederror computed for the whole training data set. The proposed procedures are used for an appropriate construction of neural observer schemes, in order to perform a robust diagnosis (detection and fault isolation) of the process faults. The user must set some parameters of a genetic algorithm (GA), but this seems to be easier than manually selecting the NN topology or of using destructive or constructive algorithms. An advanced study of network optimisation using evolutionary programming and GA approaches has recently been reported [94,95]. Later, in Section 5 we give a more detailed explanation of this methodology. Genetic algorithms have also been used successfully to optimise the design of model-based observers for residual generation [96]. This study used a multiobjective approach with objectives corresponding to
259
Soft Computing Approaches to Fault Diagnosis
various sensitivity and robustness design issues to achieve good residual response to faults and minimise the effects of disturbance and noise acting at different frequencies. This approach can be contrasted with the use of GAs for NN optimisation. Zhou et al. [97] developed a method of fault diagnosis, based on genetic algorithms (GAs) to resolve this problem. This NPP fault diagnosis method combines GAs and classical probability with an expert knowledge base. Wen and Han [98] presented a method to fault section estimation using the genetic algorithms in power systems by using the time sequence information of tripped circuit breakers.
3. NN Structures in FDI 3.1. Introduction As discussed in Section 2.3, one of the most important classes of the FDI methods, especially dedicated to non-linear processes, is the use of NN [99±101]. There are many neural structures which can be effectively used to residual generation as well as to residual evaluation. The residual generation can be performed using neural networks with dynamical characteristics like a multi-layer perceptron with tapped delay lines, recurrent networks or a GMDH network (Group Method and Data Handling). The residuals generated by a bank of neural models are then evaluated by means of pattern classification. To carry out this task several neural structures can be used including static multi-layer perceptron, self-organizing map, radial basis networks and finally multiple network structure. In this section of the paper we provided a survey of the most important neural network architectures and possibility of their use in fault detection and diagnosis of dynamic non-linear systems. The final part of the section includes two applications of artificial neural networks in fault diagnosis: fault detection and isolation in two tank system and fault detection in the chosen parts of sugar evaporator. 3.2. Neural Structures for Modelling Technological plants are often complex dynamic systems described by non-linear high-order differential equations. For their quantitative modelling for residual generation, simplifications are inevitable. This usually concerns both the reduction of dynamics order and linearisation. Another problem arises from unknown or time variant process parameters. Due to all
these difficulties, conventional analytical models [2] often turn out to be not accurate enough for an effective residual generation. In this case, knowledgebased models are the only alternative. Replacing unknown parameters by qualitative knowledge-based approach enhances the robustness of the model versus unknown or time-dependent parameters. For residual generation, the neural network replaces the analytical model [102] that describes the process under normal operation. First the network has to be trained for this task. Learning data can be collected directly from the process if possible or from a simulation model that is as realistic as possible. The latter possibility is of special interest for data acquisition in different faulty situations, in order to test the residual generator, as those data are not generally available at the real process. After finishing the training, the neural network is ready for on-line residual generation. First, a bank of neural models should be designed. Each neural model represents one class of system behaviour. One model represents the system at its normal conditions and each successive one ± faulty situation, respectively. After that the residuals can be determined by comparing the system output y(k) and the outputs of models y0(k), y1(k), . . . , yn(k). In this way, the residual vector r [r0, r1, . . . , rn], which characterizes a suitable class of system behaviour can be obtained. Finally, the residual vector r should be transformed by a classifier to determine the location and the time of fault occurrence. 3.2.1. Multi-layer Perceptron NNs are constructed with a certain number of single processing units, which are called neurons. The standard neuron model is described by the following equation [16]: ! P X w p up u0 ,
7 yF p1
where up, p 1, 2, . . . , P, denotes neuron inputs, u0 ± threshold, wp ± synaptic weight coefficients, F( ) ± non-linear activation function. Sigmoid and hyperbolic tangent functions are very popular and most frequently used. The multi-layer perceptron is the network in which the neurons are grouped into layers (Fig. 2). Such a network has an input layer, one or more hidden layers and an output layer. The main task of the input units (black squares) is preliminary input data processing u [u1, u2, . . . , uP]T and passing them onto the elements of the hidden layer. Data processing can
260
J.M.F. Calado et al.
comprise e.g. scaling, filtering or signal normalisation. Fundamental neural data processing is carried out in hidden and output layers. It is necessary to note that links between neurons are designed in such a way that each element of the previous layer is connected with each element of the next layer. These connections are assigned with suitable weight coefficients, which are determined, for each separate case, depending on the task the network should solve. The output layer generates the network response vector y. Non-linear neural computing performed by the network shown in Fig. 9 can be expressed by: y F3 fW3 F2 W2 F1
W1 u1 g,
8
where F1, F2 and F3 denote the non-linear operators, which define neural signal transformation through 1st, 2nd and output layers; W1, W2 and W3 ± the matrices of weight coefficients which determine intensity of connections between neurons in the neighbouring layers; u, y ± the input and output vectors, respectively. One of the fundamental advantages of NNs is that they have an ability of learning and adaptation. From the technical point of view, training of a neural network is anything else, but determination of the weight coefficient values between neighbouring processing units. The fundamental training algorithm for feedforward multi-layer networks is the Back-Propagation algorithm (BP) which gives the prescription how to change the arbitrary weight value assigned to connection between processing units in the neighbouring layers of the network. This algorithm is of iterative type and it is based on minimisation of a sum-squared error utilising optimisation gradient descent method. Modification of the weights is performed according to the formula [16]: w
k 1 w
k
rE
w
k,
1
3.2.2. Recurrent Networks The feed-forward networks can only represent static transformations, and therefore they need past inputs and outputs of the modelled process. This can be performed introducing suitable delays. Unfortunately, this kind of network has some disadvantages. First of all, it is required to know the exact order of the process. If the order of the process is known, all necessary inputs and output should be fed to the network. In this way the input space of the network becomes large. In many practical cases, there is no possibility to learn the order of the system, and the number of suitable delays has to be selected experimentally. Another disadvantage is the limited past history horizon, thereby preventing modelling of arbitrary long time dependencies between inputs and desired outputs. Moreover, the trained network has strictly static, not dynamic, characteristics. More natural, dynamic behaviour is assured by recurrent networks. The recurrent neural networks [103] are characterized by considerably better properties assessing from the point of view of their application in
9
1
2
where w(k) denotes the weight vector at the discrete time k, ± the learning rate and rE
w
k ± the gradient of the performance index E with respect to the weight vector w. The neural networks described above are of static type, and using them it is possible to approximate any continuous non-linear, although static function. However, neural network modelling of control systems requires taking into account dynamics of the considered processes or systems. It can be achieved by an introduction of the tapped delay lines into the system model (Fig. 10).
MODEL
2
P
N
Fig. 9. 3-layer perceptron with P inputs and N outputs.
Fig. 10. Modelling using feed-forward networks; TDL ± Tapped Delay Line.
261
Soft Computing Approaches to Fault Diagnosis
control theory. As a result of feedbacks introduced to network structures, it is possible to accumulate the information and use it later. From possible feedback point of view, recurrent networks can be divided as follows [104]: local recurrent networks ± there are feedbacks only inside neuron models. These networks have a structure similar to static ones, but consist of dynamic neuron models, global recurrent networks ± there are feedbacks allowed between neurons different layers or between neurons of the same layer. Elman network: The Elman network is an example of partially recurrent neural network. Realization of such networks is considerably less expensive than in the case of multi-layer perceptron. This network consists of four layers of units: the input layer with r units, the context layer with s units, the hidden layer with s units and the output layer with n units. The input and output units interact with the outside environment, whereas the hidden and context units do not (Fig. 11). The hidden units can have linear or nonlinear activation functions, and the output neurons are the linear ones. The context units are used only to memorise the previous activations of the hidden neurons. A very important assumption is that in the Elman structure, the number of context units is equal to the number of hidden units. All the feed-forward connections are adjustable; the recurrent connections denoted by a thick arrow in Fig. 11 are fixed. The vector of external inputs to the network at the time k is represented by u
k u1
k, . . . , ur
kT and the vector of network outputs by y
k y1
k, . . . , yn
kT . The total input to the ith hidden unit is denoted by zi(k), the output of the ith hidden unit is
denoted by xi(k), and the output of jth context unit is xcj
k. The dynamic behaviour of the Elman structure is defined by the following equations [105]: zi
k
r X j1
wuij uj
k
s X j1
wxij xcj
k,
xi
k f
zi
k, xcj
k xj
k 1, s X wyli xi
k, yl
k
10
i1
where wuij denotes the connection weight between the jth input and the ith hidden unit, wxij ± the weight between the jth context and the ith hidden units, wyij ± the weight between the jth hidden and the ith output units, and f( ) ± a non-linear activation function. Theoretically, this kind of network is able to model the sth order dynamic system, if it can be trained to do so. Generally, the Elman structure will be significantly smaller than a feed-forward network, when s is large. At the specific time k, the previous activation of the hidden units (at time k 1), and the current inputs (at time k) are used as inputs to the network. In this case, the Elman network behaviour is analogous to a feedforward network. Therefore the standard BP algorithm can be applied to train the network parameters. Network of dynamic neurons: The NN presented here is composed of dynamic neuron models with in®nite impulse response ®lters (Fig. 12) [38,104]. In such models one can distinguish three main parts: a weight adder, a ®lter block and an activation block. A behaviour of the dynamic neuron model under consideration is described by eq. (11). x
k
P X
wp up
k,
p1 n X
y~
k
i1
ai y~
k
i
n X
bi x
k
i,
11
i0
y
k F
~ y1
k F
gs y~
k,
1 1
2
where wp , p 1, . . . , P denotes the input weights, up
k, p 1, . . . , P are the neuron inputs, P is the number of inputs, y~
k denotes the filter output, ai ,
1
Fig. 11. Elman network with r inputs and n outputs.
1
1
2
2
Fig. 12. General scheme of dynamic neuron model.
262
J.M.F. Calado et al.
i 1, . . . , n and bi , i 0, . . . , n are feedback and feedforward filter parameters, respectively, F( ) is a nonlinear activation function that produces the neuron output y(k), and gs is the slope parameter of the activation function. The input weights w, filter parameters a and b, and slope of the activation function gs are adaptable parameters. These parameters are adjusted during training of the neuron. Due to dynamic characteristics of the neurons, a neural network of the feed-forward structure can be designed (similarly like in Fig. 9). Such a network is often called locally recurrent globally feed-forward. Taking into account the fact that this network has not any recurrent link between neurons, to adapt the network parameters, a training algorithm based on the backpropagation idea can be elaborated. The calculated output is propagated back to inputs through hidden layers containing dynamic filters. As a result the Extended Dynamic Back Propagation has been defined [12]. 3.2.3. GMDH Networks This kind of NN is constructed by connections of a number of elementary neurons of the GMDH type [106,107]. Such neurons are described by the transmission function f given by: y f
u1 ,u2 , . . . , un ,
12
where u1 , u2 , . . . , un are the inputs and y is the output. The function f in (12) have to be linear with respect to the parameters, i.e. a second degree polynomial: f
u1 , u2 a0 a1 u1 a2 u2 a3 u21 a4 u22 a5 u 1 u 2
13
Taking into account this assumption, the optimal constant values ai can be derived using Least-Mean Squares (LMS). The GMDH network can be considered as feed-forward networks (Fig. 13) with a growing structure during the training process. Network synthesis depends on iterative repetition of a
specific sequence of operations leading to the evolution of the resulting structure similar to the original Iwachnienko method [106]. The procedure ends up when an optimal (or assumed) network complexity is achieved. For processing accuracy, the selection of the partial models (neurons) is carried out before connecting the created layer to the network. One of the possible solutions here is optimal selection method, which is based on rejecting the neurons for which the processing error is greater than an arbitrarily given threshold ". The neuron layer where optimality criterion is fulfilled, plays the role of the output layer. 3.3. Neural Structures for Classification The residual evaluation is a logical decision-making process that transforms quantitative knowledge into qualitative (``Yes±No'') statements. It can also be seen as a classification problem. The task is to match each pattern of the symptom vector with one of the preassigned classes of faults and the fault-free case. This process may highly benefit from the use of intelligent decision making. A variety of well-established approaches and techniques (thresholds, adaptive thresholds, statistical and classification methods) can be used for residual evaluation [1]. Among these approaches the fuzzy and neural classification methods are very attractive and more and more frequently used in the FDI systems [108]. 3.3.1. Multi-Layer Perceptron In order to apply multi-layer perceptron for residual evaluation the network has to be fed with residuals, which can be generated by another neural network. First the network should be trained for this task. After finishing the training, the neural classifier can be use for on-line residual evaluation to decide whether the fault has occurred and indicate its possible cause [99,101,102]. 3.3.2. Kohonen Network
1 1 1 2
2
Fig. 13. Structure of the GMDH network.
Kohonen network is the self-organising map. Such networks can learn to detect regularities and correlations in their input and adapt their future responses to that input accordingly. The network parameters are adpated by learning procedure based on only input patterns (unsupervised learning). Two-dimensional self-organising map is shown in Fig. 14. Inputs and neurons in competitive layer are connected entirely. Weight parameters are adapted using winner takes all
263
Soft Computing Approaches to Fault Diagnosis
rule as follows [109]: wc
kk minfku
k
ku
k
i
wi
kkg
14
where u(k) is the input vector, wc
k the winner's weight vector and wi
k is the weight vector of the ith processing unit. However, instead of adapting only the winning neuron, all neurons within a certain neighbourhood of the winning neuron are adapted according to the formula: wi
k 1
1 wi
k u
k for i 2
wi
k otherwise,
15
where is the neighbourhood of the winning neuron and is the learning rate. In the Kohonen learning rule, the learning rate is a monotone decreasing time function. The frequently used functions are:
t 1=t or
t at a for 0 < a 1. The concept of ``neighbourhood'' is extremely important during network processing. Figure 15 shows the neighbourhoods of diameters 1 and 2. The neighbourhood may be defined
1
using different metrics, e.g. Euclidean, Manhattan and others. After designing the network, a very important task is assigning clustering results generated by network with desired results for a given problem. 3.3.3. Radial Basis Function Networks In recent years Radial Basis Function (RBF) networks enjoy more and more popularity as an alternative solution to slowly convergent multi-layer perceptron. Similar to the multi-layer perceptron, the radial basis network [16] has an ability to model any non-linear function. However, this kind of neural network needs many nodes to achieve the required approximating properties. This phenomenon is similar to the choice of number of the hidden layers and neurons in the multi-layer perceptron. The RBF network architecture is shown in Fig. 16. This network has three layers: the input layer, only one hidden, non-linear layer and the linear output layer. It is necessary to notice that the weights connecting the input and hidden layers have values equal to one. This means that input data are passed on to the hidden layer without any weight operation. The output i of the nth neuron of the hidden layer is a non-linear function of the Euclidean distance between the input vector u [u1, . . . , uP]T, and the vector of the centres ci ci1 , . . . , ciP T , and can be described by the following expression: i
ku
2
Fig. 14. Two-dimensional self-organizing map.
ci k, i , i 1, . . . , n,
16
where i denotes the spread of the ith basis function, kk ± the Euclidean norm. The jth network output yj is a weighted sum of the hidden neurons outputs: yj
n X
ji i ,
17
i1
11 1 12
1
2
2
Fig. 15. Winner's neighbourhood.
Fig. 16. Structure of P inputs, 2 outputs RBF network.
264
J.M.F. Calado et al.
where ji denotes the connecting weight between the ith hidden neuron and the jth output. Many different () functions were suggested. The most frequently used are the Gaussian functions:
z, exp
z2 , 2
18
or inverted quadratic functions as well:
z
z2 2
1 2:
19
The fundamental operation in the RBF NN is the selection of the function number, function centres and their position. On one hand, too small number of centres can result in weak approximating properties. On the other hand, the number of exact centres increases exponentially with increasing input space size of the network. Hence, it is unsuitable to use the RBF NN in problems where the input space has large dimension. To train such a NN, hybrid techniques are used. First, the centres and the widths of the basis functions are established heuristically. After that the weight adjustment is performed. The centres of the radial basis functions can be chosen in many ways e.g. as values of the random distribution over the input space or by clustering algorithms, which give statistically the best choice of the centre numbers and their positions as well. When the centre values are established, the objective of the learning algorithm is to determine the optimal weight parameters ji , which minimises difference between the desired and the real network response. The output of the network is linear in weights and that is why for estimation of the weight matrix, traditional regressive methods can be used. The example of such a technique is the recursive leastsquares method. 3.3.4. Multiple Network Structure A common single neural network of a finite size does not often assure a required mapping or its generalisation ability is not sufficient. So, if we could build the networks, which consider only the characteristic parts of the complete mapping, it would fulfil its task much better. It seems to be more proper to use abovementioned the RBF networks as classifiers. The underlying idea of the presented multiple network scheme in fault detection and isolation systems (Fig. 17) is to develop n independently trained neural networks for n working points and to classify a given input pattern by using a supervisor [110]. The supervisor assigns a certain degree of reliability to the
Fig. 17. Bank of neural classifiers in the FDI.
output of each neural network and indicates which one can be taken into account during decision making. The output value of the supervisor can be obtained from the formula: F
Pn i1 NNi Ui
u P , n i1 Ui
u
20
where NNi is the output of the ith neural network and Ui
u is the membership function associated with the ith neural network (for example, the input value can be fuzzyfied to a triangular fuzzy set with the centre at a proper working point, and the spread value equal to the distance between the neighbouring working points). 3.4. Applications 3.4.1. Two Tank Laboratory System System description. A two tank system is a technical realisation of a non-linear Multi-Input Multi-Output (MIMO) system (Fig. 18). The system consists of two cylindrical tanks with the delay in the form of a spiral pipeline. The nominal outflow Qn is located at the tank 2. The pump driven
265
Soft Computing Approaches to Fault Diagnosis
by DC motor supplies the tank 1, where Q1 is the inflow of the liquid through the pump to the tank 1. Both tanks are equipped with sensors for measuring the liquid level (h1h2). The valves V1, V2, V3, V4, and VE are electronically controlled. The aim of the two tank system control is to keep the water level in tank 2 constant. In such a system, different types of faults like clogs or leakages can be introduced. In our research, the following three faults were considered: valve V2 closed and blocked, valve V2 opened and blocked, leak in tank 1.
Spiral Pipeline 1
Pump 1
1
2
2
4
3
Fig. 18. Scheme of the two tank system.
Residual generation. In the proposed FDI system [38,86], four classes of system behaviour ± normal operation conditions f0 and three faults f1 f3 are modelled by a bank of neural observers designed with the neural networks composed of dynamic neurons. The MODEL 0 identifies the system under normal operation conditions and MODEL 1, MODEL 2 as well as MODEL 3, model the faults f1, f2 and f3, respectively. Based on the neural observers, the residual signals can be generated (see Fig. 19). Each model is sensitive to only one class of the system behaviour and generates residual signal near zero. The residual rf0 (assigned to the normal conditions) significantly differs from zero when one of the faults occurs (Fig. 19a). At the beginning the diagnosed system worked at its normal operation conditions and after 1000 iterations different faults occurred. The residual rfi assigned to the fault fi generates a deviation from zero under the normal operation of the system, and is almost equal to zero when the fault fi occurs (Fig. 19b±d). Residual evaluation. The residuals should be transformed into the fault vector f. This operation can be seen as a classification problem [102]. For fault diagnosis, this means that each pattern of the symptom vector r br0 , rf1 , rf2 , rf3 c is assigned to one of the classes of system behaviour ff0 , f1 , f2 , f3 g. Each bit of the binary vector f is used to code a single fault. The example of fault coding is shown in Table 1. The representations of the fault vector f given in Table 1 match all known kinds of system conditions. In the case of normal operation the classifier should generate
0
1
3
2 Fig. 19. Residuals for different cases.
266
J.M.F. Calado et al.
f [0 0 0]. In turn in the case of fault f2 the classifier should generate f [0 1 0]. Any other vector representation different from these presented in Table 1 (e.g. f [1 1 1]) shows that the system operates under unrecognised conditions. To perform residual evaluation, the well-known static multi-layer feed-forward network can be used. In fact, the neural classifier should map a relation of the form : Rn ! Rm :
r f. The applied NN belongs to the class N34, 5, 4, 3 . The training process was carried out off-line for 20 000 steps using the standard back-propagation algorithm with momentum and adaptive learning rate. The momentum parameter was equal to 0.95 and the initial learning parameter ± 0.01. Fifty training patterns per one class of system behaviour were chosen for the training process. As a result the set of 200 training patterns was obtained. The activation function in the hidden layer was of hyperbolic tangent type, and in the output layer, of the linear type. Taking into account the fact that one of the designing assumptions was a binary response of the classifier (Table 1), its outputs should be close to the nearest integer, while the classifier output differs from the desired less than the assumed accuracy called a confidence level. Each classifier response, which is inside the confidence interval, is rounded to the nearest integer. Table 2 illustrates the relation between the value of the confidence level and percentage of non-classified and badly classified system behaviours. These results are obtained for detection of the fault f1 . The applied classifier has a simple structure and can
be trained with a non-complicated training algorithm. However, it has one serious disadvantage ± the necessity of rounding the classifier output with a fixed confidence level. It leads to the undesirable effect of non-classified system behaviour. In fact, with increasing the confidence level, the number of nonclassified behaviours is decreasing, but the number of badly classified ones is increasing as well. 3.4.2. Sugar Factory Example The following example shows how the NN architectures can be used in fault detection systems for a real industrial plant [111]. The evaporisation station presented here is a part of the Lublin Sugar Factory (Poland). It comprises 5 vapouriser sections. The first three sections are of Roberts type with a bottom heater-chamber, while the last two are of Wiegends type with top heater-chamber. Due to technological improvement and a very careful inspection of the installation before starting a 3 months-long sugar campaign, faults in sensors, actuators and technology are rather exceptional. Therefore, to design the fault detection system, only data for normal operation conditions can be utilised. Based on observations of the process variables and the knowledge about the process, the juice temperature after evaporation section sub-model can be designed: T51 08 f
T51 06, TC51 05, F51 01, F51 02
21
Table 1. Coding of the faults Faults
Fault vector f [ f1 f2 f3]
Normal conditions Valve V2 closed and blocked Valve V2 opened and blocked Leak in tank 1
f1 0 1 0 0
f2 0 0 1 0
f3 0 0 0 1
Table 2. Confidence level versus non-classified and badly classified system behaviours Confidence level
Non-classified behaviour (%)
Badly classified behaviour (%)
0.1 0.05 0.02 0.01 0.005 0.001
7.8 10.5 13.9 16.8 21.9 97.7
15.4 14.1 12.1 10.3 8.1 2.2
where T51_08 is the juice temperature after first section of evaporation station, T51_06 is the input steam temperature, TC51_05 is the thin juice temperature after heater, F51_01 is the thin juice flow to the input of evaporation station and F51_02 is the steam flow to the input of evaporation station. Suitable process variables are measured in chosen points of the evaporation station by relevant sensors. After that the obtained data are transferred to the monitoring system and stored there. The sampling time is 10 s. During one work shift (8 h) 2880 samples per one monitored process variable are gathered. Individual process variables are characterised by their amplitudes and ranges. In many cases, these signal parameters differ significantly for different process variables. These large differences can cause the neural model be trained inaccurately. It is thus necessary for the raw data to be pre-processed. The model inputs under consideration are normalised to zero mean and unit standard deviation. In turn, the output data should be
267
Soft Computing Approaches to Fault Diagnosis
transformed taking into consideration the response range of the output neurons. For the hyperbolic tangent neurons, this range is [ 1,1]. To perform such kind of transformation, the linear scaling can be used. Moreover, to avoid saturation of the activation function, the raw output data will be transformed to the interval [ 0.9, 0.9]. First of all, neural models to identify suitable process variables under normal conditions are designed. After that residuals may be determined by comparison of the measured values and model outputs. Occurrence of faults is signalled by a deviation of the residual value from zero. The training process has been carried out on-line using the extended dynamic back propagation algorithm of the network of dynamic neurons architecture and data for one work shift (8 h). To check the sensitivity and effectiveness of the proposed fault detection system, data with artificial faults in measuring circuits are employed. The faults are simulated by increasing or decreasing values of particular signals by 5, 10 and 20% at specified time intervals. These artificial faults were chosen very carefully, taking into account process safety and minimisation of economic losses. Below, the experimental results for detection of instrumental faults are described. The dynamic network model of this process belongs to the class N24, 3, 1 (4 inputs, 3 hidden neurons and 1 output neuron). Each neuron in the network model has the first order IIR filter and the hyperbolic tangent activation function. In Fig. 20(a), the modelling results for the evaporisation stage are shown. The output of the process is marked by the black line and the output of its neural model by the grey one. As it can be seen in Fig. 20a, predicting capabilities of the neural model are quite good. Figure 20(b) presents the residual for simulated faults in different sensors. The following sensor faults were introduced successively at chosen time intervals. As it can be seen in Fig. 20(b), the faults are very easy to be detected, e.g. using a simple threshold technique. The fault detection system is the most sensitive to the fault of the output sensor T51_08 (2100±2400 time steps). Even faults of a sensor smaller than 5% can be detected immediately. Somewhat worse are diagnosis results for the faults of the sensors T51_06 (1500±1800 time steps), and TC51_05 (2450±2750 time steps). However, in both cases 5% faults are explicitly and surely detected by the proposed fault detection system. The worst results for faults of the sensors F51_01 and F51_02 are obtained (0±600 time steps). Faults in both these sensors are revealed by small spikes in the residuals. Only large faults are signalled. That means that the fault detection system is not very sensitive to occurrence of faults in these two sensors.
Threshold
Threshold
Fig. 20. Modelling results for evaporator model under normal operation conditions (a) and residual signal (b).
4. Neuro-Fuzzy Approach in FDI 4.1. Introduction Recently, the use of NNs for fault detection and isolation purposes has received increasing attention in both research and application [28,46,77,81,108,112]. In some of these applications, NNs are used to examine the possible fault or faults in the process under concern and give a fault classification signal to declare whether or not the process is faulty [113]. This procedure may be suitable for diagnosing faults for some processes under steady state conditions, but this is not the case for diagnosing faults in processes with significant transient behaviour. Mainly, because the steady-state diagnosis system usually gives an incorrect diagnosis information when the process carries out a transient behaviour due to normal regulation operation. These false alarms result from the nonconsideration of the dynamic behaviour of the process in the design stage of the NN, resulting in a blind network that cannot differentiate between input changes and transient or faulty behaviour. To overcome this problem, a hybrid fault detection and diagnosis system, combining a deep/shallow knowledge-based system with a Hierarchical Structure
268
of Fuzzy Neural Networks (HSFNN), is proposed. The system is designed to cope with transient behaviour of the process and with single and multiple incipient and abrupt fault detection and isolation abilities from only single abrupt fault symptoms. Fault detection is performed through the knowledgebased system where fault detection rules have been obtained from deep and shallow knowledge of the process under consideration. A systematic methodology for generating these heuristic rules, based on the process structure and component function, was developed. Deep knowledge is obtained from structural decomposition of the overall process into subsystems according to plant topology [114]. Heuristic rules based on shallow knowledge are developed from the operational knowledge of the process. The HSFNN isolates the fault or faults, since it encodes both the operational dynamic behaviour of the process and the faulty behaviour of the system. Fault symptoms concerning multiple simultaneous faults are harder to learn than those associated with single faults. Furthermore, the larger the set of faults, the larger the set of fault symptoms will be, and hence, less certain the training outcome. In order to overcome this problem, the current approach has a hierarchical structure of 3 levels where several FNNs are used. Thus, a larger number of patterns are divided into many smaller subsets so the classification can be carried out more efficiently. One of its advantages is that multiple faults are diagnosed from only single abrupt fault symptoms. In contrast with the conventional feed-forward multi-layer NNs, all the FNNs adopted have an additional layer, which converts the variation in each on-line process measurement into fuzzy sets. This section of the paper describes the overall architecture of the proposed FNN FDI system. It provides a description of the knowledge based fault detection system and the presentation of the hierarchical structure of FNNs-based fault isolation approach. Finally, in this section a validation example, using an industrial chemical process, where single and double simultaneous abrupt, as well as, incipient faults are considered, is used to demonstrate the performance of the proposed fault detection and isolation system. 4.2. System Architecture A major limitation of the knowledge-based approach arises from the fact that domain experts do not always think in terms of rules. Moreover, domain experts may not be able to explain their line of reasoning, or
J.M.F. Calado et al.
they may explain it incorrectly. For well-behaved processes with well-defined rules, knowledge based systems can be developed to provide good performance. NNs can be preferable to knowledge based systems when rules are not known either because the topic is too complex or no domain expert is available. Furthermore, as described below, the use of a special type of NN, a so-called FNN could be more suitable for fault diagnosis purposes. Such an NN combines the capability of fuzzy reasoning in handling uncertain information with the capability of NN in learning from examples [115]. From the above considerations, there is motivation to implement a hybrid on-line fault detection and isolation system, fully supported by the theoretical achievements in both, knowledge-based systems [116] and the NN [115]. This hybrid on-line fault detection and diagnosis system consists of a knowledge-based approach for fault detection combined with a HSFNN approach for fault isolation. Once a fault has been detected the fault isolation system is triggered to locate the hypothetical fault or faults in the process under concern. Pre-processed on-line measurement data are fed forward through the HSFNN and the corresponding output values analysed for a diagnosis decision. In contrast to most publications concerning this subject, the proposed architecture of the fault detection and isolation system has the main advantage of being able to cope with fault isolation in the presence of transient behaviours. In such publications, the NN is only used as a fault classifier. After the analysis of the process output data corresponding to a possible fault or abnormal features in the process outputs, the NN gives a fault classification signal to declare whether or not the system is faulty. It may be valid to use only process outputs to diagnose faults for some static processes, however this is not the case for diagnosing faults in dynamic processes because the change in process inputs can also affect certain features of process outputs. A diagnosis method that only utilises output information could give incorrect information about faults in the process when the process input has been changed. This is especially true for non-linear systems. To achieve on-line fault isolation in the presence of transient behaviour, the process dynamics, including process input, output and states have to be considered. Our approach relies in this consideration to handle the distinction between transients due to normal set point regulation and faults. The overall computational system implemented for the design stage can be seen as a two-layer configuration. In the lower layer the process under consideration is simulated through its dynamic model
269
Soft Computing Approaches to Fault Diagnosis
including the controllers. It is also through the lower layer that a fault, or faults in the process under concern, can be simulated. The upper layer has supervisory functions with the main aim of detecting and isolating faults introduced in the lower layer. Figure 21 depicts the overall computational architecture of the proposed fault detection and isolation system. In the field of applications the lower layer of this architecture is replaced by the process itself with its controllers being conveniently connected to the FDI system, through a data pre-processing block to reduce noise and to allow data estimation. The information handled by the fault detection and isolation system is basically the changes which occur in the on-line measurement variables. For simplicity, all the changes observed in the measurement variables used by the fault detection and isolation system have been normalised into the range [ 1, 1]. From Fig. 21, it can be seen that the ``Data Pre-Processing'' module performs a pre-processing and normalisation task. This module interfaces the lower layer with the upper layer. On-line measurement variables are sampled and their values are pre-processed, to reduce noise level, and retained. With two consecutive samples retained, the changes in the measurement variables that occur between sampling times are evaluated and then converted into the normalised range. In contrast, with previous implemented on-line fault detection and isolation systems [117], in the current approach the fault detection task is not based on threshold values, but it relies on heuristic relative behaviour of the measurement variables, implemented through a knowledge-based system. The fault isolation is achieved through a HSFNN previously trained with pre-processed and normalised data from the
INTERFACE Man/Machine
FAULT ISOLATION Hierarchical Structure of Fuzzy Neural Networks
FAULT DETECTION Shallow/Deep Knowledge
Upper Layer Lower Layer
DATA NORMALISATION Data Pre-Processing PROCESS Dynamic Behaviour / Control
Fig. 21. Architecture of a real-time FDI system.
process. In this manner, the performance and reliability of the overall fault detection and isolation system, is not affected by such data pre-processing and normalisation. At each sampling time the pre-processed and normalised data is passed to the ``Fault Detection'' module. This module is basically a knowledge-based system consisting of deep and shallow knowledge of the process under concern. From the on-line information transferred to the knowledge-based system, a forward chaining inference engine tries to match the heuristic rules pre-conditions against the current state. If at least one rule is fired, a fault detection flag is enabled and the fault diagnosis module is triggered. The fault isolation task is performed through a HSFNN with the architecture to be described in Section 4.4. If the fault isolation module has been triggered, the normalised changes observed in the measurement variables are fed forward through the HSFNN. Thus, in the current approach, it is considered that any FNN output with a value greater than a pre-specified value suggests that the corresponding hypothetical fault has occurred in the process under consideration. The pre-specified value is problem dependent and it is a design parameter to be found. After a fault or faults have been diagnosed the computational system keeps the operator informed about that fact through the ``Interface Man/Machine'' module. 4.3. Knowledge-based Fault Detection A number of knowledge-based systems have been reported, which perform fault detection and isolation by the method of heuristic classification [118±120]. In this method, diagnostic knowledge is represented mainly in terms of heuristic rules, which perform a mapping between data abstraction (usually called symptoms) and solution abstraction (typically called disorders). Such knowledge representation is sometimes called ``shallow'' because it does not contain much information about the causal mechanisms underlying the relationship between symptoms and faults. The rules typically reflect empirical knowledge derived from operator's experience, rather than a theory of how the device under diagnosis actually works. The latter knowledge, used in conception of knowledge-based systems, is sometimes called ``deep'' knowledge because it involves understanding the structure of the device and the way its components function (Chapter 15 in [1]) [121]. Thus, knowledge-based fault detection and diagnosis systems can use shallow knowledge and/or deep
270
knowledge according to the nature of the knowledge employed to build up the knowledge base. The knowledge is represented by heuristic rules and, quite often, fuzzy reasoning is used since the knowledge is frequently uncertain. Knowledge acquisition is the key task associated with the shallow knowledge based systems. Expertise covering a wide range of problems must be encoded into the knowledge base. The task of knowledge acquisition is very time consuming since the process operators may know little about knowledge engineering, and therefore the interchange of information between a knowledge engineer and a process operator may not be carried out efficiently. Moreover, in an industrial process, many faults needing to be detected or isolated may never have been experienced and, for new or recently developed plants, there may be little applicable past experimental knowledge. Furthermore, the heuristic rules used in conception of shallow knowledge based approaches lack process generality and they tend to fail under novel circumstances. Due to these drawbacks, the recently developed knowledge-based systems for fault detection use the deep-knowledge-based approach or use a combine approach where deep knowledge plays a dominant role [120,122,123]. The advantages of the deep knowledge based approaches are that they can provide reliable behaviour for infrequent occurrences. However, as Clancey [118] has pointed out, the deep models required for building knowledge-based systems are hard to construct, even for relatively simple processes. Therefore, to narrow the diagnosis focus in the process under consideration, as well as to facilitate the analysis of the behaviour of the process variables, several researchers have pointed out some methodologies. Moor and Kramer [119] report a FDI system based on a deep-knowledge approach, which tries to explore the causal path from the observed abnormalities to their causes and hence, locate any associated fault. Finch and Kramer [124] propose another methodology. In their approach, an industrial process is decomposed into several subsystems according to their functions and then identifying the subsystem where the fault occurs, performs the fault isolation. A similar approach is pointed out by Steels [125], but in this approach the function of the system being diagnosed is hierarchically decomposed. Zhang and Roberts [126] have proposed a fault isolation system based on structural decomposition of the process under concern and component functions. However, most of these reported fault detection and isolation systems only deal with a single fault assumption. Moreover, in most of these approaches, in order to avoid false diagnosis under transient process behaviour, the fault isolation system trigger is based on
J.M.F. Calado et al.
threshold values or simply switches off when a set point change is performed. These threshold values will affect the performance of the fault isolation system and should be set proper according to previous operational experience of the process under consideration. Here, a systematic methodology for generating fault detection heuristic rules is proposed. The goal is to develop a knowledge-based fault detection system with the abilities to cope with multiple fault situations and increase the system reliability under transient behaviour situations. Fault detection heuristic rules are generated from knowledge of system structures and component functions. Deep and shallow knowledge will be combined and such a system will be used to trigger a fault isolation system based on a hierarchical structure of FNNs, which is presented in the next section. To deal with normal process transient behaviour, the process reference signals are included in the heuristic rules antecedents, and the system is designed taking into consideration these operating situations. In this manner, if a setting point change occurs in the reference signal the proposed fault detection and isolation system is not affected and, therefore, not recognising such situation as a fault. During the design phase, to facilitate the behaviour analyses of the process variables, the process under concern is structurally decomposed into several subsystems. This structural decomposition corresponds to the plant topology [114]. A graph similar to a Signed Directed Graph (SDG) [127] can be used as a modelling tool to help the designer to understand the important subsystems of the process and the causal relationships between them. The process can be briefly represented by this graph that contains nodes and directed arcs. Each node represents a subsystem and the directed arcs represent interactions between subsystems. For instance, if a hypothetical system is divided into three subsystems, S1, S2, and S3, where each one interacts with each other, such a system can be represented by a Directed Graph which is depicted in Fig. 22. Furthermore, the proposed methodology for describing the system structure is also based on a set of three matrices. The rows and columns of these
S1
S2
S3
Fig. 22. A Directed graph.
271
Soft Computing Approaches to Fault Diagnosis
matrices represent subsystems or measurement variables. Thus, if there is a possibility of a hypothetical interaction between a specific row and a specific column, the corresponding matrix element takes the value 1. Otherwise, that matrix element will take the value 0. Hence, a Connection Matrix, C, can be used to represent the interaction between the subsystems. Assuming that the process is decomposed into n subsystems, then the Connection Matrix for such a system is a n n matrix. Each element of the Connection Matrix, cij, is defined according to Eq. (22): ( 1, if subsystem Si can directly affect
22 cij subsystem Sj 0, otherwise The state of a system is described by its measurements and a subsystem is abnormal if one of its measurements is abnormal. Such a situation can be represented by Eq. (23): 9k,
k 2 1, mi ,
AB
mik ) AB
Si
23
which states, that if there exists in subsystem Si a measurement mik, which is abnormal, then subsystem Si is abnormal. In Eq. (23) the following notation is used: AB ± is a predicate meaning abnormal; mi ± is the total number of measurements in Si; mik ± is the kth measurement in Si. The Connection Matrix only provides a rough description of the relationships among subsystems. However, the Measurement Causal Matrix, CMij, can give a refined description. If there are n measurements in Si, and m measurements in Sj, then the Measurement Causal Matrix between Si and Sj, CMij, is a n m matrix. Thus, each element of CMij, cmkl ij , is determined through the following equation, 8 1, if the kth measured variable > < in Si can directly affect the lth kl
24 cmij measured in Sj . > : 0, otherwise. Causal relationships also exist between measured variables within a subsystem. The Self-Causal Matrix, CSi, is responsible for representing these relationships. If there are n measurements in subsystem Si, then the Self-Causal Matrix for subsystem Si is a n n matrix. Each element of CSi, cmkl i , is determined according to the Eq. (25): 8 1, if the kth measured variable > < in Si can directly affect the lth kl cmi
25 measured in Si . > : 0, otherwise.
The Directed Graph together with the abovedefined matrices gives a description of the process under concern, from where fault detection rules can be generated. The reasoning for generating fault detection heuristic rules, corresponding to the single fault scenarios, is based on the predicate stated by Eq. (23). Therefore, let us consider that the jth measurement in the ith subsystem presents an abnormal behaviour, which according to Eq. (23) is represented by AB(mij). Then a search is conducted to causally look for any measured variable in subsystem Si that could be responsible for the observed abnormality in mij. If such a variable exists, then it is retained and a fault detection heuristic rule must be generated. The Self-Causal Matrix of subsystem Si guides this search. Similar searches are also performed to find further causes in Si for the retained variable. If there is another variable in Si, which can directly affect the retained variable behaviour, then this one is also retained and another diagnostic rule is generated. If there are no more variables in Si that could be responsible for the observed abnormality, then the causal search at subsystem Si is terminated. Therefore, a search is conducted to find all the subsystems that are connected with the subsystem Si that can directly affect any measurement variable in Si. The goal is to identify all the subsystems whose measurement variables can directly affect the measurement variables in the subsystem where such an abnormal behaviour is observed. At this stage the Connection Matrix guides the search procedure. Thus, such subsystems form the following set, f8j Sj ,
cij 1,
j 6 ig
26
Next, a search is conducted through all the subsystems that compose the above set in order to find all the measured variables in other subsystems, which could directly affect the retained variables. At this stage the Measurement Causal Matrix plays the main role, and hence, if such variables exist, then other heuristic rules are generated. Once the search procedure is terminated, certain process shallow knowledge, in the form of some specific heuristic rules, can be included. In this manner, the reliability of the fault detection system can be increased, by including operator experience about the operation of the process. Furthermore, when multiple faults occur at the same time it is assumed that fault effects on the measurement variables are addictive. Thus, fault detection heuristic rules for multiple fault scenarios, are also generated at the design stage. The knowledge base of the fault detection system will be built up with the fault detection heuristic rules
272
J.M.F. Calado et al.
achieved following the procedure described above. To fire the rules, an inference engine with forward chaining abilities is used. The fault detection heuristic rules will be chained in a forward manner and then, when the data acquired match the antecedent parts of a heuristic rule, a fault detection enable flag will be established. This flag will be used to trigger the fault isolation system, to locate the fault or faults in the process. The main advantage of the present approach is that the fault detection task is not only based on the absolute values of the process variables (input, output and state), but also on heuristic evaluation of their variation. Hence, normal transient behaviours of the process can be considered, as well as, incipient faults whose development occurs gradually, instead of suddenly as for abrupt fault situations. 4.4. Hierarchical Structure of Fuzzy Neural Networks In the current approach, a hierarchical structure of several FNNs have been developed for fault isolation purposes. The adoption of a hierarchical structure of FNNs approach for fault isolation aims at development of an architecture that can localise abrupt and incipient single and multiple faults correctly, or at least with a minimum misclassification rate, from only
single abrupt fault symptoms, and be easily trained. Process measurements or fault symptoms act as antecedents from which we can infer a classification of the pattern input, i.e. fault isolation. In any classification task, we have a measurement space from which we receive physical input data. As previously mentioned, in the present approach process, measurements are pre-processed and normalised and changes in the process variables estimated, establishing the measurement space. The hierarchical structure has three levels configuration where several FNNs are used, as shown in Fig. 23. The lower level consists of one FNN where all variations (e) of the measured variables are used as inputs. At the medium level, a number of FNNs (structurally identical or different) that is equal to the number of single fault scenarios considered, are used. Each FNN at the medium level is also fed with all the measurement variables and each one is associated with an output of the FNN at the lower level, corresponding to a particular single fault. The upper level consists of an OR operation of the outputs of the FNNs of the medium level. The elements of the set used in the OR operation are determined by the outputs of the FNN in the lower level. Thus, if the ith and jth outputs of the FNN at the lower level is taking values close to 1, then the outputs
Δe
FNN1
Δe
F1 Fn
ΔemΔe F1 Δe
FNN2
Δe
F2 Fn
Normal case
Measurement Variables
Δe
FNN0
Δe
F1
Δem-
ΔemF2
Δe
Fn Fn
Δe
OR Operation
Δe
FNNn
Δe
F1
Δem-
Fn
Δe
Lower Level
Medium Level
Fig. 23. Hierarchical structure of FNNs.
Upper Level
Faults
Soft Computing Approaches to Fault Diagnosis
of the ith and jth FNNs at the medium level form the elements used in the OR operation. However, if only one output of the FNN at the lower level is taking a value close to 1, then the corresponding FNN in the medium level is used to confirm that this fault is a single fault, or to diagnose multiple faults. Clearly, the multiple faults must include the one corresponding to the output of the FNN at the lower level. For example, when faults {F1, F2, F3} occur, and the active outputs from the lower level are corresponding to the following set of faults, {F1, F3}, then the outputs of FNN1 and FNN3 in the medium level are examined. It has been observed that active outputs of FNN1 are {F1, F2} and the active outputs of FNN3 are {F2, F3}. The OR operation on {F1, F2} and {F2, F3} gives the result {F1, F2, F3} for the output of the HSFNN. In contrast to the conventional multilayer feed-forward NN, the adopted FNNs have an additional fuzzy input layer that maps the increment of each measurement into fuzzy sets. Therefore, the fuzzification layer converts each input into the quantity space, qf {decrease, steady, increase}, by association with three types of neurons. The design membership functions can be adjusted by appropriately selecting the fuzzification layer weights of the FNN [87]. The processing elements of the fuzzification layer related to the fuzzy sets decrease and increase, use the complement sigmoid function and the sigmoid function, respectively, as their activation functions. On the other hand, the other processing elements of the fuzzification layer related to the fuzzy set steady, use the Gaussian function. The hidden and output layers processing elements use the sigmoid function as their activation functions. The topology of each FNN has been achieved through a trial-and-error procedure until the expected fault isolation performance has been achieved. Two degrees of freedom have been considered in this design. The first is related to the number of processing elements in the fuzzification layer. The other is related to the number of neurons in the hidden layer of the NN. The number of processing elements in the fuzzification layer depends on the number of measurement variables used for fault isolation proposes and on the number of fuzzy sets used to describe the process variables behaviour. However, once the number of measurement variables has been fixed, the number of fuzzy sets used depends essentially on the following aspects: The number of considered faulty scenarios. It has been observed that the number of faults that can be diagnosed increases with the number of fuzzy sets
273
used to discretise the fuzzy quantity space. Thus by increasing the number of fuzzy sets, the number of possible patterns to recognize can be increased. The ratio of expected data compression. By using fuzzy sets to describe the process variables' behavior, due to the fact that similar training patterns are transformed into the same fault symptoms, training data will be compressed and training effort can be lightened. The noise level in the measurement variables. The fuzzy approach makes the system less sensitive to the measurement noise. A trade-off between all the above-mentioned aspects, which are very problem dependent are sometimes difficult to evaluate, determines the number of processing elements in the fuzzification that allows us to achieve the desired performance. The second degree of freedom in the design concerns the number of processing elements in the hidden layer. In order to achieve the desired performance, the complexity of the relationships between faults and fault symptoms determines the number of processing elements in the hidden layer. Once again a trial-anderror procedure has been followed, due to the inherent ambiguity associated with most above-mentioned aspects. If the complexity of the relationships between fault and symptoms increases, the number of processing elements in the hidden layer has to be increased to achieve the performance desired. Furthermore, it has also been observed that the number of processing elements in the hidden layer could determine how early an incipient fault scenario could be isolated, especially when the fault development speed is quite low. As previously mentioned, all changes in the measurement variables are normalised into the range [ 1, 1] before being fed in a feed-forward manner through the FNNs. Thus, the membership functions associated with the fuzzy sets quoted above, are equal for all measurement variables considered. According to the training patterns achieved and the behaviour of the measurement variables under incipient fault scenarios, the membership functions have been adjusted following a trial and error procedure, in order to achieve the desired fault isolation performance. Furthermore, it has been tried to reduce the membership functions overlapping to the minimum possible, in order to increase the data compression and, thus, reducing training time and effort. Hence, the membership functions achieved are depicted in Fig. 24. Both the lower level and the medium level networks are made up of three layers: a fuzzification layer, a hidden layer and an output layer. The FNNs are
274
J.M.F. Calado et al. 1
μ (Δ e)
0.8
1
V1Y1 Pipe 1 L x0
Condenser Cooling water
2
0.6
3 LX 3
0.4 F, X Q
5
LX 5
0 –0.6 –0.4 –0.2
0
0.2
0.4
0.6
0.8
1
8
LC
A simulated Continuous Binary Distillation Column (CBDC), which is described by Ingham et al. [129], has been used as a test bed of the on-line fault detection and isolation system proposed. This process is shown in Fig. 25, where a column containing a total of eight plates plus a reboiler is assumed, with feed entering on plate 5, numbered from the column top. Surge drum
Pipe 4
D, X0
7
Fig. 24. Membership functions associated with fuzzy sets, decrease, steady and increase.
4.5. Case Study
LC
Pipe 3
L1Y6
Δe
trained using the resilient back-propagation-learning algorithm [128]. Single abrupt fault symptoms plus the normal operation conditions are used as training data of the FNN in the lower level. On the other hand, the FNNi in the medium level is trained using the fault symptoms corresponding to one single abrupt fault (the fault associated with that FNNi), together with the symptoms of all possible double simultaneous abrupt faults (double faults considered, have been achieved through an AND operation between the single fault associated with FNNi and all the other single faults). The symptoms corresponding to the double simultaneous abrupt faults are obtained by adding the data for the corresponding single abrupt faults considered. The current approach has been successfully applied in several simulation studies, which are presented in the next section. Single and double simultaneous abrupt faults have been considered, as well as incipient faults. During the studies mentioned above, it has been observed that the NN's generalisation ability has the major importance in the isolation of incipient faults, since training patterns used only include abrupt fault symptoms. As a matter of fact, abrupt fault scenarios can be very well defined, whilst incipient fault symptoms depend on several uncertain factors.
VY5 V1Y6
6
–1 –0.8
Reflux MD, X 0
VY4 LX4
4
0.2
pipe 2
VY3
V1Y9
MB
Reboiler Steam
w,x9 Pipe 5
Pipe 6
Fig. 25. Continuous binary distillation column.
and reboiler levels are controlled by feedback control loops. Fault detection heuristic rules have been generated following the systematic methodology proposed in Section 4.3. To assure a good reliability of the fault detection system, in some heuristic rules pre-condition is used the linguistic statement ``continuously''. Therefore, for implementation purposes, the last two changes in the measurement variables should be retained. According to the system architecture described in Section 4.2, the ``Data Normalisation'' module performs this task, besides the pre-processing and data normalisation. Following the methodology proposed in Section 4.3 for generating the fault detection heuristic rules, the CBDC process represented in Fig. 25 has been decomposed into three subsystems. The first subsystem, S1, consists of the following components: external feed elements and associated sensors. The second subsystem, S2, includes the following components: distillation column, pipe 1, pipe 5, re-boiler level control valve, pipe 6, re-boiler heat exchanger and associated sensors. The remaining components form the third subsystem, S3, which are all the components associated with the surge drum part of the process. Hence, the Directed Graph corresponding to this decomposition is shown in Fig. 26. According to this decomposition and following the methodology proposed for describing the system structure, the corresponding set of three matrices has been achieved in an identical way as pointed out by Calado and Roberts [130] for a similar industrial
275
Soft Computing Approaches to Fault Diagnosis
S1
Moreover, according to Eq. (25) the Measurement Causal Matrix from subsystem S1 to subsystem S2, is given by Eq. (31):
S2
S3 Fig. 26. Continuous binary distillation column Directed Graph.
process. Thus, according to Eq. (22) the Connection Matrix for the CBDC process is given by the following equation: 0 S1 S1 1 C S2 @ 0 0 S3
S2 1 1 1
S3 1 0 1 A 1
CS1 F x
x!
1 0
0 1
27
28
Following an analogous procedure, the self-Causal Matrix for the 2nd subsystem, S2, is given by Eq. (29): 0 Mb 1 MB B 1 W B CS2 V1 @ 0 V 0
W 0 1 1 0
V1 0 1 1 1
V1 0 0C C 1A 1
29
Where the following notation is used: MB ± stands for hold-up in the re-boiler; W ± stands for bottoms flow rate; V1 ± stands for vapour boil-up rate; V ± stands for vapour flow rate. For the third subsystem of the CBDC plant, S3, the following measurement variables are considered: L ± liquid flow rate; MD ± hold-up in surge drum; D ± distillate rate; and therefore, the self-Causal Matrix for the third subsystem is given by the following equation: 0 MD 1 MD CS3 L @ 0 D 1
L 0 1 1
D1 1 1A 1
30
W
V1
0 0
0 0
0 0
V! 1 0
31
The Measurement Causal Matrix from subsystem S2 to subsystem S3 is:
CM23
To perform the process behaviour analysis 9 measurement variables have been considered. For the first subsystem, S1, two measurements F and x are considered which are the column feed-rate and the feed composition respectively. Hence, according to Eq. (25) the self-Causal Matrix for the first subsystem is given by the following equation: F
CM12 F x
MB
0 MD 0 MB B 0 W B V1 @ 0 V 0
L 0 0 0 1
D1 0 0C C 0A 1
32
The Measurement Causal Matrix from subsystem S3 to subsystem S2, is given by Eq. (33):
CM32
0 MB 1 MD L @ 0 D 0
W 0 1 1
V1 1 0 0
V1 0 0A 0
33
Considering the abnormal behaviour of controlled variables, the reasoning behind the generation of the heuristic is conducted to causally search any measured variable or variables, which could be responsible for the observed abnormality. These variables could be in the same subsystem where the abnormality is observed or in any other subsystem. Hence, let us consider the situation where the hold-up in the surge drum is presenting an abnormal behaviour. In such situation two possibilities must be considered, one is the hold-up higher than its set point and another is the hold-up lower than its set point. Considering the second possibility, from Eq. (30), it can be seen that the hold-up in surge drum can be affected by itself and by the distillate rate, D. Furthermore, from Eqs (27) and (32) it can be seen that there are no other variables in other subsystems, which could be responsible for the observed abnormality. Therefore, the following two fault detection heuristic rules are generated: IF (MD is lower than its set point AND D is continuously increasing) THEN enable fault detection flag IF (MD is lower than its set point AND MD is continuously decreasing) THEN enable fault detection flag The formulation of fault detection heuristic rules for the situation where the hold-up in the surge drum is higher than its set point is similar to the above
276
procedure, as well as for the other controlled variable. For the double simultaneous fault scenarios, it is considered that the fault effects are addictive and, therefore, the corresponding fault detection heuristic rules are generated in a straightforward manner following the systematic methodology describe above. The fault isolation system is based on a HSFNN with the characteristics previously presented. Hence, six measurement variables have been used as input data to the HSFNN. These variables are the following: MD, hold-up in surge drum; D, distillate flow rate; MB, hold-up in reboiler; W, bottoms flow rate; F, column feed rate; V, vapour flow rate. Thus, the diagnosis task is performed by presenting the changes in the measurement variables to the HSFNN, which are propagated in a feed-forward manner through the FNNs. To locate a fault or faults in the process an analysis of the HSFNN outputs values is carried out. However, as the linguistic statement ``continuously'' is used in the knowledge-based system responsible for triggering the fault diagnosis module, at least a hypothetical fault could only be diagnosed in the second sampling time after the fault has occurred. Thus, since D, W, F and V used as inputs of the HSFNN are also directly affected by some faults, in the second sampling time after such abrupt faults have occurred, no change will be observed in these four variables. Hence, following this straightforward procedure, the initial set of six variables used to achieve the fault symptoms will be reduced to two variables. However, during simulation studies carried out with the CBDC plant, it was found that two variables are not enough to diagnose all the single and double simultaneous faults considered below. Therefore, in order to overcome this problem, the HSFNN input data are constituted by changes in the measurement variables MD, and MB observed when the fault diagnosis module is triggered, together with the changes in the measurement variables D, W, F and V observed at a previous sampling. In the fault isolation system implemented all the FNNs are equal, with a fuzzification layer consisting of 18 processing elements arranged in 6 groups corresponding to the 6 information sources, with each group containing 3 processing elements as described in the previous section. The number of processing elements in the hidden layer is determined by the complexities of the relationships between the faults and the fault symptoms. For this case study, it was found that 10 hidden processing elements give good performance. Moreover, since 4 single faults have been considered the output layer is constituted by 4 neurons, each one corresponding to a fault. In order to test the robustness of the current approach, all possible double fault
J.M.F. Calado et al.
scenarios corresponding to an AND operation in the single fault space was also considered. In the simulation of incipient faults in the process under concern, it has been assumed that the component degradation follows a linear law. Therefore, incipient faults are simulated through the following via Eq. (34) Mf Mn
1 t
34
where the following notation is used: Mf ± is the value of a process variable when there is a fault; Mn ± is the value of a process variable when there is no fault; ± is a constant which determines the speed of the fault development (s 1); t ± is time (s). In order to test the performance and reliability of the fault detection and isolation system we are trying to handle the most severe incipient fault behaviour. Note that the common situation is a component degradation following an exponential law. Thus, the process variables behaviour affected by that fault will be more close to the behaviour under an abrupt fault scenario as can be seen from Fig. 27. Therefore, since the FNN training patterns used only include abrupt fault symptoms, the diagnosis of an incipient fault where the component degradation follows an exponential law can be achieved with less effort. Training data were obtained from process simulations, with the aim of covering 9 different processoperating points (10±90% of the controlled variables). Thus, 45 learning patterns are involved in training the FNN in the lower level and 27 learning patterns are involved in training each FNN in the medium level. The four single faults considered are: F1, valve on pipe 4 blocked fully open; F2, valve on pipe 4 blocked fully closed; F3, valve on pipe 5 blocked fully open; F4, valve on pipe 5 blocked fully closed.
Dc
Dmax
DN
t Abrupt fault Incipient fault with the component degradation following an exponential law Incipient fault with the component degradation following a linear law Fig. 27. Behaviour of process variables under different fault scenarios.
277
Soft Computing Approaches to Fault Diagnosis
Several studies have been performed considering single abrupt and incipient fault scenarios, as well as, double simultaneous abrupt and incipient faulty situations. For instance, Table 3 shows the results achieved considering the simulation of all single abrupt faults mentioned above, with the set point of MD taking a value of 10% and the set point of MB taking a value of 50%. Under the same normal operational conditions, all the double simultaneous abrupt faults have been simulated and the results achieved are shown in Table 4. From Table 3, it can be seen that a very accurate diagnosis has been achieved for the single abrupt fault situations. Table 4 shows that although the FNN in the lower layer has been trained only with single abrupt fault symptoms, some double simultaneous abrupt faults have been diagnosed by FNN0. Of course, that is due to the generalisation ability of NNs. Furthermore, it can be seen that all double simultaneous abrupt faults have been successfully diagnosed.
Several studies have also been performed under the performance of the HSFNN and depends on the normal operational conditions considered. The results shown in Table 5 were achieved under a situation where the set point of MD and MB are taking the value 10%. In Table 5 it can be seen that the maximum or minimum (depends on the type of fault) value that the parameter of Eq. (34) can take, allowing that the HSFNN is still able to diagnose the correct fault. It is also shown that the diagnosis time achieved using a PC Pentium III, 450 MHz, fitted with 128 MB RAM, used to perform the current studies. It was assumed that a fault occurred when a FNN output presents a value greater than 0.5. Similar studies have also been performed for double simultaneous incipient faults. The results achieved are shown in Table 6.
Table 3. HSFNN results under single abrupt faults.
Double Lower abrupt level faults FNN0
Single Lower abrupt level faults FNN0 F1v F2v F3v F4v
Medium level FNN1 FNN2 FNN3 FNN4
F1v 0,996 F2v 0,998 F3v 0,994 F4v 0,996
F1v ± ± ±
± F2v ± ±
± ± F3v ±
± ± ± F4v
Upper level OR F1v F2v F3v F4v
Table 4. HSFNN results under double simultaneous abrupt faults. Medium level FNN1
F1v F3v F1v 0,916 F3v 0,879 F1vF4v F4v 0,992 F2vF3v F2v 0,993 F3v 0,674 F2vF4v F4v 0,985
FNN2
FNN3
Upper level FNN4 OR
F1vF3v
±
F1vF3v ±
F1vF3v
± ±
± F2vF3v
± F1vF4v F1vF4v F2vF3v ± F2vF3v
±
±
±
F2vF4v F2vF4v
Table 5. HSFNN results under single incipient faults. Single incipient faults
Lower level FNN0
F1v F2v F3v F4v
F1v 0,5004 F2v 0,5035 F3v 0,5012 F4v 0,5005
Medium level FNN1
FNN2
FNN3
FNN4
F1v ± ± ±
± F2v ± ±
± ± F3v ±
± ± ± F4v
Upper level OR
(s
F1v F2v F3v F4v
1
)
91,0 9,8 37,0 4,5
Diagnosis time (s)
0,93 8,56 2,07 17,56
Table 6. HSFNN results under double simultaneous incipient faults. Double incipient Lower level faults FNN0 F1vF3v F1vF4v F2vF3v F2vF4v
Medium level FNN1 FNN2 FNN3 FNN4
F1v 0,8585 F1vF3v ± F1vF3v ± F3v 0,9225 F1v 0 ± ± ± F1vF4v F4v 0,5246 F2v 0,5834 ± F2vF3v ± F3v 0,0399 F2v 0,7551 ± F2v ± F2vF4v F4v 0,5468
Upper level (s OR F1vF3v F1vF4v F2vF3v F2vF4v
1
)
F1v 7000 13000 F1v 500 F4v 630 F2v 43 F3v 46 F2v 9,8 F4v 4,5
Diagnosis time (s)
F1v 0,03 F3v 0,01 F4v 0,2 F2v 1,83 F2v 7,86 F4v 17,56
278
J.M.F. Calado et al.
Since the fault symptoms corresponding to the incipient faulty scenarios were not used for training the FNNs when incipient fault situations are considered, a degradation in the FNNs outputs is observed. However, the on-line fault detection and isolation system proposed in this paper is still able to detect and isolate the correct fault or faults that have occurred in the process. All the studies conducted considering the measurements variables without noise, have also been performed by adding white noise to the measurement variables. The noise is characterised by null mean, variance one and a gain K. After some tests, it has been observed that the maximum value that the parameter k could take without the HSFNN gives false diagnosis, was 0.001. Table 7 shows the results achieved under incipient single fault scenarios, with the measurements variables effected by the white noise and with the set point of the controlled variables taking the value of 90%. It can be observed that the results achieved without noise and with noise are very similar, and, therefore, one can conclude that from the noise point of view the fuzzification layer works like a filter. Then, successful results have been achieved during the simulation studies performed to test the robustness of the proposed on-line fault detection and isolation approach. It has been observed that the system implemented has the ability to cope with normal transient behaviours of the process variables due to changes in the set points, avoiding false fault detection and isolation under such situations. The fault detection task performed through a knowledge-based approach has the advantage that is not dependent on threshold values, but it relies on heuristic relative behaviour of the measurement variables. Thus, performance and reliability of the fault detection and isolation system is increased. The fault isolation task based on hierarchical structure of FNNs combines the advantages of both, fuzzy reasoning and neural networks. The system is able to isolate multiple simultaneous faults from only single abrupt fault symptoms. During the current studies, it has been observed that the NN's
generalisation ability has a great importance in the diagnosis of incipient faults since the training patterns only include symptoms of abrupt faults. The successful results achieved with the online fault detec-tion and isolation system suggest that the combined approach of knowledge-based system with a HSFNN could be powerful methodology for practical implementations.
5. Evolutionary Approaches to FDI Systems Design 5.1. Introduction There are many techniques of building the so-called non-analytical models, i.e. the models that merely approximate the system input±output behaviour. However, they all, in one way or another and finally reduce to some kind of the global optimisation problem, e.g. searching for the optimal structure of a NN [38], parameter allocation of the fuzzy system [131] as well as designing non-linear observers [132,133]. They are usually non-linear, multi-modal as well as multiobjective and hence the so-called ``local'' optimisation methods fail to solve them. In recent years, the direct search techniques, which are problem independent, have been widely used in various optimisation tasks. Unlike calculus-based methods (gradient descent etc.), direct search algorithms do not require the use of derivatives. Moreover, gradient-descent like methods work well when the objective surface is relatively smooth, with few local minima. However, real-world data are often multi-modal and contaminated by a noise, which can further distort the objective surface. In the family of direct search methods, Evolutionary Algorithms (EAs) seem to be especially attractive. EAs are a broad class of stochastic optimisation algorithms inspired by some biological processes which allow populations of organisms to adapt to their surrounding environment [134]. Indeed, EAs approaches combine the Darwinian survival of the fittest strategy to eliminate unfit individuals and use
Table 7. HSFNN results under single incipient faults, with the measurements variables affected by white noise. Single incipient faults
Lower level FNN0
F1v F2v F3v F4v
F1v 0,5390 F1v 0,5114 F3v 0,5086 F4v 0,5177
Medium level FNN1
FNN2
FNN3
FNN4
F1v ± ± ±
± F2v ± ±
± ± F3v ±
± ± ± F4v
Upper level OR F1v F2v F3v F4v
(s
1
)
2,0 10,0 2,5 12,0
Diagnosis time (s)
16,89 2,47 23,3 3,29
279
Soft Computing Approaches to Fault Diagnosis
random information exchange and mutation, with exploitation of the knowledge contained in previous populations, resulting in a search mechanism with surprising reliability and speed. 5.2. Optimisation Tasks in an FDI System Design The model-based FDI procedure for a dynamic system consists of two separated stages: residual generation and residual evaluation (Fig. 28). In other words, the automatic fault diagnosis can be viewed as a sequential process involving the symptom extraction and the actual diagnostic task. As usual, the residual generation process is based on the comparison between the measured y(k) and estimated yM(k) system outputs. As a result, the difference (so called residual) r(k), is expected to be near zero under normal operation but in the case of faults a deviation from zero should appear. In turn, the residual evaluation module is dedicated to the analysis of the residual signal in order to determine whether a fault has occurred and isolate the fault of a particular system unit. In recent years, researches have successively applied evolutionary techniques to the FDI system design. They are used to solve a multiobjective problem of robust residual generation based on a full-order observer [2], to design the gain matrix of the non-linear observer [132] as well as to construct the neural model or the pattern recognition network [135].
disturbances d(t), sensor (t) and (t) input noises (Fig. 29). As usual, the problem of making the residual insensitive to disturbances and noises is of great importance. To tackle this nontrivial problem and make the residual, as far as possible, sensitive to faults, a few performance indices, which are defined as functions of the gain and weighting matrices, are defined. The maximization of the first index makes the residual sensitive to the fault in a pre-specified frequency range. The remaining indices describe the influence of the sensor and input noise signals, disturbances as well as the effect of the uncertainty in the initial condition of the system. It is clear that all these indices have to be minimised. The solution of the simultaneously performed optimisation tasks is obtained using the method of inequalities. Unfortunately, such a multi-objective optimisation problem cannot be solved by the conventional methods and hence the genetic algorithm has been successfully applied [2]. 5.3.2. Design of Non-linear Robust Observers Using Genetic Programming This section describes the application of EAs approaches to designing non-linear observers. In particular, a relatively new genetic programming technique (GP) [136] is employed to solve the problem of appropriate selection of the observer's gain matrix. Consider a non-linear discrete time system:
5.3. Fault Symptoms Extraction
xk1 f
xk , uk ,
5.3.1. Robust Residual Generation via Multi-objective Optimisation and Genetic Algorithms
yk h
xk ,
Let us consider a residual generator based on a linear full-order observer described in (pp 167±191, [2]). In this case, the residual is affected by faults f(t),
1
3
Genetic classifier
4
Model Residual generation via multi-objective optimization
where uk is the input, yk is the output, xk is the state, and f(), h() are non-linear functions. The problem is ^k , given a set of to determine the state estimate x input±output measurements. To tackle this problem,
2
Non-linear robust observer design via GP
Classifier
5
Residual generation
Neural model
Knowledge base
Rule base design via EA
Neural pattern Residual evaluation recognizer
NN training via EA NN structure allocation via EA
Genetically tuned fuzzy systems
Fig. 28. EAs approaches to designing the model-based FDI.
35
e (t) = F(f (t), d(t),ζ (t), η (t)) maximisation of the fault effect
J1 (K,Q) J2 (K,Q)
Minimisation of the sensor noise effect
J3 (K,Q) J4 (K,Q)
minimisation of the disturbance and the initial condition effects
minimisation of the input noise effect
Fig. 29. Optimisation indices.
280
J.M.F. Calado et al.
the following observer's structure is proposed: ^k x ^k K
"k , uk 1 "k , x xk , "k yk h
^
36
^k is the a where "k denotes a priori output error, x priori state estimate, and K() is the gain matrix, which comprises certain functions, i.e. each entry of the gain matrix is a function of the a priori output error and the system input. Thus, in the case when the model of the system is known, the problem reduces to selecting these functions. Due to the fact that each function can be represented as a structured tree it is possible to employ the GP to identify them in such a way to minimise the influence of the model and initial condition uncertainty on the residual [132]. This makes the proposed observer robust to the factors, which may lead to unreliable fault detection. It should be also pointed out that, in the case when the model of the system is unknown the problem can be reduced to the simultaneous identification of the system's model and the gain matrix [133]. In both the approaches, the sum of normalised differences between the true and estimated output (in the fault-free mode) consists the fitness function. Experimental results confirm that the proposed approach provides high quality of modelling as well as robustness to various factors, which may degrade the reliability of fault detection. 5.3.3. EA Approach to Neural Model Construction There are two optimisation tasks corresponding to a NN construction: searching for the best possible structure and the training process, i.e. the optimal allocation of the network parameters. Both of them are non-trivial problems. The network architecture as well as an appropriate selection of a set of the control parameters employed by the training algorithm influence the identification quality significantly. The crucial trade-off one has to make is between the learning ability of the NN and fluctuations due to the finite sample size. If the architecture of the NN is too small then the network might not be able to represent well enough the functional relationship between the input and target output. On the other hand, if the architecture is too complex then, e.g. in the case of the network of dynamic neurons [102,135], the designed network has an excessively large order of dynamic as well as unsatisfactory generalization abilities, i.e. the network obtained depends also on the actual training set [135]. As the searching space of the NN architectures is discrete and infinite, there are no deterministic methods to find an optimal solution. The NN's architecture selection problem can be viewed as a
pseudo-Boolean optimisation one and the genetic algorithm-based approach can be naturally adapted, i.e. each bit represents the presence or absence of a given network parameter. In this case it is possible to use various performance indices. The two most known rely on maximisation of the NN's generalisation performance and/or minimisation of the parameters' number for the network which realises the modelled relationship with an acceptable accuracy [135]. Most of the NN implementations use training algorithms, which are based on a gradient descent like technique. Such algorithms belong to the class of ``local'' optimisers. Unfortunately, the training process of the dynamic network, and in general of NNs, seems to be an optimisation problem which is intrinsically related to a very rich topology of the sum square error function. One way out of this problem is to use a multi-start version of gradient descent like algorithms. However, such a kind of algorithms ends up successfully very seldom. Hence, one of the most effective approaches is to employ EAs to the NN training task [38,102]. 5.4. Decision Making ± Symptoms Evaluation The main objective of residual evaluation is to decide whether and where a fault has occurred with possible avoidance of wrong decisions causing false alarms. In this case, many techniques can be applied [131], which can be further improved by using the global optimisation methods. Pre-processing ± genetic clustering: Efficiency of fault detection systems in the case of multi-dimensional symptom vectors may be improved by pre-processing which leads to the partitioning of the symptom domain into sub-domains (clusters). Among many well-known pre-processing methods, EAs characterise high clustering performance. We focus here on the multi-dimensional real data that form a set of the socalled training pairs T f pq
xq , yq Rn1 jq 1, . . . , pg. The goal is to perform an evolutionary cluster analysis of data in T to get at the end a partitioning of T . The number of clusters is not known in advance. To evaluate each off-spring cluster in the population, different local fitness functions may be used. They could be the maximal distance of the training pair of the cluster from the cluster centroid, or a mean variation of all training pairs in the cluster. Based on the local fitness function of the cluster one can build a global fitness function [137]. Genetically tuned fuzzy systems: A fuzzy inference system is often used as universal approximator for a
281
Soft Computing Approaches to Fault Diagnosis
problem of multi-dimensional data or as controller for some industrial applications. A fuzzy modelling approach consists of two kinds of problems, configuring fuzzy rules and optimisation of the shapes of membership functions, which are considered to be combinatorial and numerical optimisation problems, respectively. The EA can be applied to both these problems. In many research studies [138] however, the EA is applied only to optimise the configuration of the fuzzy rules, whilst another optimisation algorithm, e.g. the steepest descent method, is applied to optimise the shapes of the membership functions. Rule base design via evolutionary algorithms: If the computer performs residual evaluation then the decision making process boils down finally to comparing a decision function with a threshold. If we use a fixed threshold, this means that the decision logic indicates a fault if the decision function surpasses the threshold. This is a typical binary decision logic. The disadvantage of this method is the high probability of false alarms if the residual is corrupted by noise or modelling errors. One way out of this problem is to use artificial intelligence methods. The application of computational intelligence techniques leads to the concept of the fault diagnosis expert system where analytical and heuristic information as well as knowledge processing are combined [131]. The expert system for fault diagnosis consists of a knowledge base, which usually contains a rule base. The construction of the rule base is the main challenge for knowledge engineers, which have to implement, usually out of order, incomplete and heuristic knowledge of the human expert. In this case the fuzzy techniques seem to be an effective tool to build the knowledge base. Unfortunately, there are many fault diagnosis problems for which the human expert knowledge is insufficient and the automatic optimal selection of the rule base is needed. Because of the exponential complexity of the problem of the optimal rule base searching there are no possibility to use a total review method. In this case, techniques of GAs and genetic programming [136,137] may become very effective tools, assuming that decision rules are a set of complexes [138]. Each complex is a conjunction of selectors and each selector is a disjunction of the discrete attribute values. In this case, the population of individuals is built of vectors of selectors. The GA comprises the rule base from the sets of attributes and their values. In order to use the GP to create the rule base, two sets have to be defined. The first one, the ``terminal set'', contains all possible premises and conclusions, the second one contains logic operators. Each rule is represented by a structured tree, and the
GP is used to find the best sets of rules. Contrary to the GA-based approach, where only simple rules (triples) are considered, the GP-based approach makes it possible to use arbitrarily complex rules [136].
6. Conclusions In many engineering applications it can be difficult to determine a suitable mathematical model to enable faults to be discriminated from modelling uncertainty, disturbances etc. When quantitative mathematical models are imprecise or difficult to derive, it may be natural to resort to the use of qualitative modelling methods. An alternative approach, integrating together qualitative and quantitative modelling methods, makes use of soft computing or computational intelligence methods and this paper provides an extensive tutorial treatment of recent research in this important field. The research demonstrates how these methods offer a powerful alternative to the use of quantitative model-based methods for FDI, for cases when explicit mathematical models are either unavailable or imprecise. This paper started with the assumption that a neural network can be used to replace the state estimation/prediction role of an observer of Kalman filter in FDI, for example in generating residual signals for both fault detection and fault isolation. The role and purpose of various neural network structures has been developed with a focus on realistic fault diagnosis application examples. The main aim throughout the study has been to develop methods that minimise the false-alarm and missed-alarm rates in fault decisionmaking. The paper includes a treatment of neural network training methods including the use of evolutionary programming in an optimisation role. The neural network is, on the other hand an implicit or ``grey box'' model which does not give simple insight into the sort of system behaviour, which is important for diagnosis. The paper has demonstrated that by combining together a fuzzy rule-based strategy with a neural network some powerful diagnostic results can be obtained. The advantage of using fuzzy logic is that it supports, in a natural way, the direct integration of the human operator into the fault detection and supervision process using rules that are easy to understand. Fuzzy logic methods are rapidly becoming a realistic alternative to the use of artificial expert systems. The so-called fuzzy neural network takes the advantages of neural networks in adaptation of knowledge learning, distributed parallel processing of data, associative memory and distributed storage of
282
diagnosis rules, to overcome the difficulties of expert systems in knowledge acquisition bottleneck and knowledge inference matching conflict. Various studies in the paper have shown that the fuzzy neural network also takes the advantage of fuzzy logic in knowledge fuzzy reasoning to overcome (at least in part) the grey-box limitation of the neural network. The paper contains a comprehensive survey of the field of soft computing for fault diagnosis in dynamic systems.
Acknowledgements The research described in this paper was supported by the EU FP V Research Training Network, DAMADICS (Development and Application of Methods for Actuator Diagnosis in Industrial Control Systems). The authors are indebted to the managers and staff of the Lublin sugar factory, Cukrownia Lublin S.A., Poland, for their provision of important evaporisation stage data and access to an industrial actuator in the sugar plant.
References 1. Patton RJ, Frank PM, Clark RN. Issues in fault diagnosis for dynamic systems. Springer Verlag, London 2000 2. Chen J, Patton RJ. Robust model based fault diagnosis for dynamic systems. Kluwer Academic Publishers ISBN 0-7923-8411-3 1999 3. Isermann R. Fault diagnosis of machines via parameter estimation and knowledge processing ± a tutorial paper. Automatica 1994; 29(4): 815±835 4. Gertler J. Fault detection and diagnosis in engineering systems. Marcel Dekker 1998 5. De Kleer J, Williams BC. Diagnosis multiple faults. Artificial Intelligence 1981; 32: 97±130 6. Shen Q, Leitch R. Fuzzy qualitative simulation. IEEE Trans Sys Man & Cybernetics 1993; SMC23(4): 1038±1061 7. Kuipers BJ. Qualitative reasoning. MIT Press, Cambridge, Massachusetts 1994 8. Lunze J, SchroÈder J. Application of qualitative observation and prediction to a neutralisation process. Proceedings of the 14th IFAC World Congress, Beijing. In: Chen H-F, Cheng D-Z, Zhang J-F (eds). Pergamon Press 2000, vol I, pp 49±54 9. Dexter AL. Fuzzy model-based fault diagnosis. IEE Proc Control Theory Appl 1995; 142(6): 545±550 10. Naidu S, Zafirou E, McAvoy TJ. Use of neuralnetworks for fault detection in a control system. IEEE Control Sys Magazine 1990; 10: 49±55 11. Frank PM, Kuipel N. Fuzzy Supervision and application to lean production. Int J Systems Sci 1993; 24(10): 1935±1944 12. Isermann R. Process fault detection and diagnosis methods. In: Ruokonen T (ed.). Proc. IFAC symposium
J.M.F. Calado et al.
13. 14.
15.
16. 17. 18. 19.
20.
21. 22.
23. 24.
25.
26.
27. 28.
29.
SAFEPROCESS'94, Helsinki, Finland, 2, 597-612, Pergamon Press Patton RJ. Robustness in model-based fault-diagnosis: The 1997 Situation, A Rev. Control, 21, 103±123, Pergamon Press Edelmayer A, Bokor J. Scaled H1 filtering for sensitivity optimisation of detection filters. In: Edelmayer, BaÂnyaÂsz (eds). Proceedings of the IFAC symposium SAFEPROCESS 2000. Elsevier Press, Budapest 14±16 June 2000, pp 324±330, Chen J, Patton RJ, Liu GP. Optimal residual design for fault diagnosis using multi-objective optimisation and genetic algorithms. Int J Sys Sci 1997; 27(6): 567±576 Fine TL. Feed-forward neural network methodology. Springer Verlag, New York 1999 Hunt K, Sbarbaro, Zbikowski, Gawthrop P. Neural netwoks for control systems: a survey. Automatica 1992; 28(6): 1083±1112 Yin CM. Application of artificial NNs to condition monitoring. PhD thesis, University of Aberdeen, 1993 Wang H, Brown M, Harris CJ. Fault detection for a class of unknown non-linear systems via associative memory networks. Proc I MechE J of Syst and Control Eng 1994; 208: 101±107 Dong D, McAvoy TJ. Non-linear principal component analysis ± based on principal curves and neural networks. Computers and Chemical Engineering 1996; 20: 65±78 Watton J, Pham DT. An artificial NN based approach to fault diagnosis and classification of fluid power systems. J Sys & Con Eng 1997; 211(4): 307±317 Crowther WJ, Edge KA, Burrows CR, Atkinson RM, Woollons DJ. Fault diagnosis of a hydraulic actuator circuit using NNs: an output vector space classification approach. J Sys and Control Engineering 1998; 212(1): 57±68 Sharif MA, Grosvenor RI. Process plant condition monitoring and fault diagnosis. J Proc Mech Eng 1998; 212(1): 13±30 Kuhlmann D, Handschin E. Hoffmann W. ANN-based fault diagnosis system ready for practical application. EngineeringIntelligentSystemsforElectricalEngineering and Communications 1999; 7(1): 29±39 Weerasinghe M, Gomm JB, Williams D. Neural networks for fault diagnosis of a nuclear fuel processing plant at different operating points. Control Engineering Practice 1998; 6(2): 281±289 Maki Y, Loparo KA. 1998; A. neural-network approach to fault detection and diagnosis in industrial processes. IEEE Trans on Control Sys Tech 1997; 5(6): 529±541 Vemuri AT, Polycarpou MM. Neural-network-based robust fault diagnosis in robotic systems. IEEE Trans on Neural Networks 1997; 8(6): 1410±1420 Butler KL, Momoh JA, Dias LG, Sobajic DJ. An approach to power distribution fault diagnosis using a neural net based supervised clustering methodology. Engineering Intelligent Systems for Electrical Engineering & Communications 1997; 5(1): 51±57 Li CJ, Fan YM. Recurrent neural networks for fault diagnosis and severity assessment of a screw compressor. J Dynamic Sys Measurement & Control ± Trans of the ASME 1999; 121(4): 724±729
283
Soft Computing Approaches to Fault Diagnosis
30. Yu DL, Gomm JB, Williams D. Sensor fault diagnosis in a chemical process via RBF neural networks. Control Engineering Practice 1999; 7(1): 49±55 31. Narendra KG, Sood VK, Khorasani K, Patel R. IEEE Trans Power Sys 1998; 13(1): 77±183 32. Narendra KS. Neural networks for control: theory and practice. `Proc of IEEE October 1996; 84(10): 1385±1406 33. Brown M, Harris CJ. The modelling abilities of the binary CMAC. IEEE Int Conf NNs, 1994b; 1335±1339 34. Ren X, Chen J. A modified NN for dynamical system identification & Control. Proceedings of the 14th World Congress of IFAC. ISBN 0 08 043248 4 1999. 35. Pan M-C, Van Brussel H, Sas P. Intelligent joint fault diagnosis of industrial robots. Mech Sys & Signal Proc 1998; 12(4): 571±588 36. James Li C, Yu X. High pressure air compressor valve fault diagnosis using feedforward NNs. Mechanical Systems and Signal Processing 1995; 9(5): 527±536 37. Boucherma M. Turbo-generator fault-detection and diagnosis based on artificial NNs. Ph.D Thesis, University of Sheffield, UK (no. 45±11736), 1995 38. Korbicz J, Obuchowicz A, Patan K. Network of dynamic neurons in fault detection systems. IEEE Int Conf Systems, Man & Cybernetics, San Diego, USA. October 11±14, 1998 (published on CD-ROM No.98CH36218). 39. Han Z, Frank PM. Physical parameter based FDI with NNs. In: Patton, Chen (eds). Proc. IFAC Symposium SAFEPROCESS'97, Pergamon Press, Hull, UK, pp 283±288. 40. Ficola A, Cava M La, Magnino F. An approach to fault diagnosis for non-linear dynamic systems using neural networks, In: Patton & Chen (eds), Proc. IFAC Symposium SAFEPROCESS'97, Pergamon Press, Hull, UK, pp 355±360. 41. Fuente MJ, Vega P. Neural networks applied to fault detection of a biotechnological process. Eng Applics of Artificial Intelligence 1999; 12(5): 569±584 42. Gomm JB. Adaptive NN approach to on-line learning for process fault diagnosis. Trans Inst Meas Contrl 1998; 20(3): 144±152 43. Wilson DJ. Neural networks for Multivariate SPC. PhD Thesis, Queens University Belfast, Faculty of Engineering, 1998 44. Dalmi I, Kocvacs L, Lorant I, Terstyansky G. Application of supervised and unsupervised learning methods to fault diagnosis. Proceedings of 14th World Congress of IFAC, ISBN 0 08 043248 4, 1999 45. Wang ZY, Liu YL, Griffin PJ. A combined ANN and expert system tool for transformer fault diagnosis. IEEE Transactions on Power Delivery 1998; 13(4): 1224±1229 46. Aminian M, Aminian F. Neural-network based analogcircuit fault diagnosis using the wavelet transform as pre-processor. IEEE Trans Circuits & Sys (ii)-analog and digital signal processing 2000; 47(2): 151±156 47. Pantelelis NG, Kanarachos AE, Gotzias N. Neural networks and simple models for the fault diagnosis of naval turbochargers. Mathematics and computers in Simulation 2000; 51(3±4): 387±397 48. Liu W. An extended Kalman filter and NN cascade fault diagnosis strategy for the glutamic acid fermentation
49.
50.
51. 52.
53.
54. 55. 56. 57. 58. 59.
60. 61.
62.
63. 64. 65. 66.
process. Artificial Intelligence in Engineering 1999; 13(2): 131±140 Zhao JS, Chen BZ, Shen JZ. Multidimensional nonorthogonal wavelet-sigmoid basis function NN for dynamic process fault diagnosis. Computers & Chemical Engineering 1998; 23(1): 83±92 Zhao JS, Chen BZ, Shen JZ. A hybrid ANN-ES system for dynamic fault diagnosis of hydrocracking process. Computers & Chemical Engineering 21(SS): S929-S933 Ye N, Zhao B. 1997; A hybrid intelligent system for fault diagnosis of advanced manufacturing system. Int J Prod Res 1996; 34(2): 555±576 Maidon Y, Jervis BW, Fouillat P, Lesage S. Using artificial neural networks or Lagrange interpolation to characterize the faults in an analog circuit: an experimental study, IEEE Trans on Instr & Meas 1999; 48(5): 932±938 Ogg S, Lesage S, Jervis BW, Maidon Y, Zimmer T. Multiple fault diagnosis in analogue circuits using time domain response features and multi-layer perceptrons. IEE Proc Circuits Devices & Systems 1998; 145(4): 213±218 Mamdani E. Applications of fuzzy algorithms for control of simple dynamic plant. Proc. IEE 1994; 121: 1585±1588 Evsukoff A, Gentil S, Montmain J. Fuzzy reasoning in co-operation supervision systems. Control Engineering Practice 2000; 8: 389±407 Yang HT, Liao CC. Adaptive fuzzy diagnosis system for dissolved gas analysis of power transformers. IEEE Trans Power Del 1999; 14(4): 1342±1350 Lu Y, Chen TQ, Hamilton B. A fuzzy diagnostic model and its application in automotive engineering diagnosis. Applied Intelligence 1998; 9(3): 231±243. Insfran AHF, da Silva APA, Lambert Torres G. Fault diagnosis using fuzzy sets. Engineering Intelligent Sys for Elec Eng. & Comms 1999; 7(4): 177±182 Dexter AL, Benouarets M. Model-based fault diagnosis using fuzzy matching. IEEE Trans Sys Man & Cybernet (a)-systems and humans 1997; 27(5): 673± 682 Mechefske CK. Objective machinery fault diagnosis using fuzzy logic. Mechanical Systems and Signal Processing 1998; 12(6): 855±862 Patton RJ, Chen J, Lopez-Toribio CJ. Fuzzy observers for non-linear dynamic systems fault diagnosis. Proc. 37th IEEE conference on decision and control 1998; 84±89 Kuipel N, Frank PM. A fuzzy FDI decision-making system for the support of the human operator. In: Patton, Chen (eds). Proc. IFAC Symposium SAFEPROCESS'97, Hull, UK, 721±726, Pergamon Press 1998 Ford N. Expert systems and artificial intelligence: an information manager's guide, London: Library Association, 1991 Verbruggen HB, Babuka R. Fuzzy logic control advances in applics. Word Scientific Press 1999 Kaymak U. Fuzzy decision-making with control applications. PhD Thesis, Delft University of Technology, PO Box 5031, 2600 GA, Delft, The Netherlands, 1999 Bellman RE, Zadeh Lotfi A. Decision±making in a fuzzy environment. Management Science 1990 17(4): 141±164
284 67. Isoc D. On a new approach to build the membership functions for fuzzy models. Proc. CONTI '98, Timisoara, Romania 1998a 68. Isoc D. Identification and clustering ± some approaches and evaluations. SIMSIS '98, Galati, Romania. 1998b 69. Wang LX, Mendel J. Generating fuzzy rules by learning from examples. IEEE Trans Syst, Man, Cybern 1992; 22: 1414±1427 70. Balle P, Nells OC, FuÈssel D. Fault detection for nonlinear process based on local linear fuzzy models in parallel and series±parallel model, In: Patton & Chen (eds), Proc. IFAC Symposium SAFEPROCESS'97, Pergamon Press, Hull, UK, pp 1137±1142 71. FuÈssel D, Balle P, Isermann R. Closed-loop fault diagnosis-based on a non-linear process model and automatic fuzzy rule generation. In: Patton & Chen (eds), Proc. IFAC Symp. Fault Detection, Supervision & Safety for Technical Processes, SAFEPROCESS'97, Sept 1997, Pergamon Press, University of Hull, UK. 72. Schneider H, Frank PM. Fuzzy logic based threshold adaptation for fault detection in robots. Proceedings of the third IEEE conference. on control applications, Glasgow, Scotland. 1994; pp 1127±1132 73. Frank PM. Advances in observer based fault diagnosis. Proceedings of international. conference on fault diagnosis: TOOLDIAG '93, Toulouse. 1993; pp 817± 836 74. Frank PM. Application of fuzzy logic process supervision and fault diagnosis. Pre-prints of the IFAC symp. on fault detection, supervision & safety for technical processes, SAFEPROCESS '94, Espo, Finland 1994, pp 531±538 75. Frank PM. Analytical and qualitative model-based fault diagnosis ± a survey and some new results. European J of Contr 1996; 2(1): 6±28 76. Schneider H, Frank PM. Observer-based supervision and fault detection in robots using non-linear and fuzzy-logic residual evaluation. IEEE Trans Contr Sys Techno 1996; 4(3): 274±282 77. Frank PM, Koppen-Seliger B. Fuzzy logic and neural network applications to fault diagnosis. Int J of Approximate Reasoning 1997; 16(1): 67±88 78. Linkens DA, Abbod MF. Supervisory intelligent control using fuzzy logic hierarchy. Trans Inst M C 1993; 15(3): 112±132 79. Farag WA, Quintana VH, Lambert-Torres G. A genetic-based neuro-fuzzy approach for modelling and control of dynamical systems. IEEE Trans Neural Networks 1998; 9(5): 80. Li S, Elbestawi MA. Fuzzy clustering for automated tool condition monitoring in machining, mechanical systems and signal processing, 1996; 10(5): 533±550 Academic Press 81. Altug S, Chow MY, Trussell HJ. Fuzzy inference systems implemented on neural architectures for motor fault detection and diagnosis 1999; IEEE Trans Ind Electron 1999; 46(6): 1069±1079 82. Aggarwal RK, Xuan QY, Johns AT, Li FR, Bennett A. A novel approach to fault diagnosis in multi-circuit transmission lines using fuzzy ARTmap NNs. IEEE Trans on Neural Networks 1993; 10(5): 1214±1221 83. Jota PRS, Islam SM, Wu T, Ledwich G. A class of hybrid intelligent system for fault diagnosis in electrical power systems. Neurocomputing 1998; 23(1±3): 207±224
J.M.F. Calado et al.
84. Pfeufer T, Ayoubi M. Application of a hybrid neurofuzzy system to the fault diagnosis of an automotive electromechanical actuator, Fuzzy Sets & Sys 1997; 89(3): 351±360. 85. Ozyurt B, Kandel A. A hybrid hierarchical NN-fuzzy expert system approach to chemical process fault diagnosis. Fuzzy Sets & Sys 1996; 83(1): 11±25 86. Patan K, Obuchowicz A, Korbicz J. Cascade network of dynamic neurons in fault detection systems. In: Proceedings of the European control conference, ECC '99, Karlsruhe, Germany. Published on CDROM, 1999 87. Calado JMF. On-line fault diagnosis of industrial processes based on artificial intelligence techniques. PhD Thesis, Control Engineering Research Centre, City University, London, UK 1996 88. Jang JSR, Sun CT. Functional equivalence between radial basis function networks and fuzzy systems. IEEE Trans Neural Networks 1993; 4: 156±158 89. Lopez CJ, Patton RJ, Daley S. Takagi±Sugeno fuzzy fault tolerant control of an induction motor. Special Issue of NeuroComputing & Applications J on N-F Systems 1999 90. Wang H, Tanaka K, Griffin MF. Parallel distributed compensation of a non-linear systems by Takagi and Sugeno fuzzy models. Proc. FUZZ-IEEE/IFES '95, 531±538 91. Tanaka K, Takayuki I, Wang H. Design of fuzzy control systems. Based on Relaxed LMI Stability Conditions. Proc. 35th CDC, Kobe. 1996, pp 598±603 92. Leitch R. Engineering diagnosis: match problems to solutions, Proceedings of the international conference on fault diagnosis: TOOLDIAG '93, Toulouse. 1993, pp 837±844 93. Zhuang Z, Frank PM. Qualitative observer and its application to fault detection and isolation systems, Proc. IMechE, 1997 94. Obuchowicz, Patan K. An algorithm of evolutionary search with soft selection for training multi-layer feedforward NNs. Proceedings of 3rd conference. NN & their applications, Kule, Poland. October. 14±18, 1997, pp 123±128 95. Obuchowicz, Korbicz J. Evolutionary search with soft selection and forced direction of mutation. Proceedings of 7th International symposium intelligent information system, Malbork, Poland. June 15±19 1998, pp 300-309 96. Patton RJ, Chen J, Liu GP. Robust fault detection of dynamic systems via genetic algorithms. Proceedings of IMechE Part I-J. of Syst. & Contr. Eng 1997; 211(5): 357±364 97. Zhou YP, Zhou BQ, Wu DX. Application of genetic algorithms to fault diagnosis in nuclear power plants. Reliability Eng & Sys Safety 2000; 67(2): 153±160 98. Wen FS, Han ZX. A refined genetic algorithm for fault section estimation in power systems using the time sequence information of circuit breakers. Electric Machines & Power Systems. 1996; 24(8): 801±815 99. Koivo HN. Artificial neural networks in fault diagnosis and control. Control Eng Practice 1994; 2: 89±101 100. Korbicz J. Neural networks and their application in fault detection and diagnosis. In: Proc. IFAC symposium. Fault Detection, Supervision and Safety for
Soft Computing Approaches to Fault Diagnosis
101. 102.
103. 104.
105. 106. 107.
108. 109. 110.
111.
112.
113.
114.
115. 116. 117.
Technical Process SAFEPROCESS '97, Hull, UK, vol. 1, pp 377±382 Sorsa T, Koivo HN. Application of artificial neural networks in process fault diagnosis. Automatica 1993; 29: 843±849 Korbicz J, Patan K, Obuchowicz A. Dynamic neural networks for process modelling in fault detection and isolation systems. Intl J Applied Mathematics and Computer Science 1999; 9: 519±546 Narendra KS, Parthasarathy K. Identification and control of dynamic systems using neural networks. IEEE Trans on Neural Networks 1990; 1: 4±27 Patan K. Artificial dynamic neural networks and their application in modelling of industrial processes. PhD Thesis, Warsaw University of Technology Press, Warsaw, 2000 Elman JL. Finding structure in time. Cognitive Science 1990; 14: 14±211 Farlow SJ. (ed.). Self-organizing methods in modelling: GMDH-type algorithms, Mersel Dekker Inc, New York 1984 Korbicz J, Kus J. Dynamic GMDH neural networks and their application in fault detection systems. In: Proceedings. European control conference, ECC '99, Karlsruhe, Germany. Published on CD-ROM, 1999 Patton RJ, Korbicz J (eds). Advances in computational intelligence, Special Issue in: Intl J Applied Mathematics and Computer Science 1999; 9: Kohonen T. Self-organization and associative memory. Springer-Verlag, Berlin 1984 Marciniak A, Korbicz J. Structure optimisation of neural network ensemble. In: Proceedings of 5th international conference on neural networks and soft computing, Zakopane, Poland. 2000, pp 100±105 Patan K, Korbicz J. Application of dynamic neural networks in an industrial plant In: Proceedings. IFAC symposium on fault detection, supervision and safety for technical process SAFEPROCESS '2000, Budapest, Hungary, vol. 1, pp 186±191 Calado JMF, Roberts PD. Fuzzy neural network applied to fault detection and diagnosis of abrupt and incipient faults. Proceedings of the InstMC/IEE/IEEE symposium on advances in fault diagnosis in process control III, York, UK. April 1996, pp 1±7 Chen J. Robust residual generation for model-based fault diagnosis of dynamic Systems. PhD Thesis, Department of Electronics, University of York, UK 1995 Calado JMF, Sa da Costa JMG. An expert system coupled with a hierarchical fuzzy neural network approach for multiple fault diagnosis. Intel J Applied Mathematics and Computer Sciences 1999; 9(3): 667±687 Haykin S. Neural networks: a comprehensive foundation. Macmillan College Publishing Company, Inc 1994 Turban E. Expert systems and applied artificial intelligence. Macmillan Publishing Company, Inc 1992 Calado JMF, Roberts PD. An intelligent on-line supervisory fault detection and diagnosis system. Pre-prints of the /th IFAC/IFORS/IMACS symposium on large scale systems: theory and applications, London, Elsevier Science, Vol. 2, pp 917±922
285 118. Clancey WJ. Heuristic classification. Artificial Intelligence 1985; 27: 289±350 119. Moor RL, Kramer MA. Expert systems in on-line process control. Chemical Control III, omer 1986, pp 839±867 120. Chitarro L, Guida G, Tasso C, Toppano E. Functional and teleogical knowledge in the multi-modeling approach for reasoning about physical systems: A case study in Diagnosis. IEEE Trans On Syst, Man and Cybernetics 1993; 23(6): 1718±1751 121. Jackson. P. Introduction to expert systems 2nd edn, Addison-Wesley 1990 122. Venkatasubramanian V, Rich SH. An object±oriented two-tier architecture for integrating compiled and deep-level knowledge for process diagnosis. Computers and Chemical Engineering 1988; 12(9/10): 903± 921 123. Leitch RR, Freitag H, Shen Q, Struss P, Tornielli G. ARTIST: A methodological approach to specifying model-based diagnosis system. In: (Guida G. Stefanini A. eds). Industrial applications of knowledge based diagnosis 124. Finch FE, Kramer MA. Narrowing diagnosis focus using functional decomposition. AIChE Journal 1998; 34(1): 25±36 125. Steels L. Diagnosis with a function-fault model, applied artificial intelligence 1989; 3(2±3): 213±237 126. Zhang J, Roberts PD. Process fault diagnosis with diagnostic rules based on structural decomposition. J Process Control 1991; 1: 259 269 127. Oyeleye OO, Kramer MA. Qualitative simulation of chemical process systems: steady-state analyses. AIChE Journal 1988; 34(9): 1441±1453 128. Riedmiller M, Braun H. A direct adaptive method for faster back-propagation learning: The RPROP algorithm. Proceedings IEEE international conference on neural networks, San Francisco, CA 1993, 586±591 129. Ingham I, Dunn IJ, Heinzle S, Prenosil JE. Chemical engineering dynamics: Modelling with PC simulations. VCH 1994 130. Calado JMF, Roberts PD. Generating fault detection heuristic rules through deep and shallow knowledge of the Process. Proceedings of UKACC/IEE Int Conf on Control '96, Exeter, UK, 2±5 September 1996b Vol. 1, No. 427, pp 299±304 131. Frank PM, Koppen-Selinger B. New developments using AI in fault diagnosis. Eng Applic Artif Intell 1997; 10: 3±14 132. Witczak M, Obuchowicz A, Korbicz J. Design of nonlinear state observers using genetic programming, In: Proceedings 3rd National conference on evolutionary algorithms and global optimisation, Potok Zloty Poland 1999, pp 345±352 133. Witczak M, Korbicz J. Genetic programming based observers for non-linear systems. In: Proc 4th IFAC Symp. Fault Detection Supervision and Safety for Technical Processes, SAFEPROCESS '2000, Budapest, Hungary, vol. 2, pp 967±972 134. Michalewicz Z. Genetic Algorithms Data Structures Evolution Programs. Heidenberg: Springer Verlag, Berlin 1996 135. Obuchowicz A. Architecture optimisation of a network of dynamic neurons using the A*-algorithm. In: Proceedings 7th European congress on intelligent
286 techniques and soft computing EUFIT '99, Aachen, Germany. CDROM, 1999 136. Koza JR. Genetic programming. Cambridge MIT: the MIT Press 1992 137. Kolesnik R, Kolesnik L, Kosinski W. Genetic
J.M.F. Calado et al.
operators for clustering analysis. In: Proceedings of 8th Int. Symp. on intelligent information, Ustron/ Wisla, Poland 1999; pp 203±208