Applied Soft Computing 22 (2014) 238–248
Applied Soft Computing journal homepage: www.elsevier.com/locate/asoc
Simultaneous-fault detection based on qualitative symptom descriptions for automotive engine diagnosis
Chi Man Vong a,∗, Pak Kin Wong b, Ka In Wong b
a Department of Computer and Information Science, University of Macau, Taipa, Macao
b Department of Electromechanical Engineering, University of Macau, Taipa, Macao
Article info
Article history: Received 17 July 2012; Received in revised form 12 February 2014; Accepted 15 May 2014; Available online 27 May 2014
Keywords: Simultaneous-fault diagnosis; Fuzzy logic; Probabilistic classification; Decision threshold optimization; Automotive engine diagnosis
Abstract
Practical automotive engine fault diagnosis in an automotive service center is usually performed by analyzing the qualitative symptom descriptions provided by the vehicle owner. However, it is a non-trivial and time-consuming procedure because (i) the qualitative symptom descriptions are usually collected through a questionnaire containing binary, numerical, and vague data that are difficult to digest; and (ii) the engine malfunction may be caused not by a single fault only, but by several single-faults simultaneously (i.e., simultaneous-faults). Therefore, an automotive mechanic usually requires several days or even weeks to diagnose and fix the engine. To improve this procedure for the mechanic, a new framework of simultaneous-fault diagnosis is proposed in this paper by integrating fuzzification, pairwise probabilistic multi-label classification, and decision-by-threshold. This framework is called fuzzy and probabilistic simultaneous-fault diagnosis (FPSD). Compared to traditional frameworks, FPSD requires far fewer training cases of costly simultaneous-faults, while it can probabilistically diagnose both unseen single-faults and simultaneous-faults based on qualitative symptom descriptions. To evaluate the performance of FPSD, a comparative study was conducted over common classification techniques. Experimental results show that the proposed framework can effectively alleviate the aforementioned issues. © 2014 Elsevier B.V. All rights reserved.
1. Introduction
Automotive engine diagnosis is essential for in-use vehicle inspection and maintenance [1], but it is very difficult to perform because a modern automotive engine is a complex integration of thermofluid, electromechanical and computer control systems [2]. Malfunctions may occur in different engine components and are not easy to detect. Although onboard diagnostic tools [3] are available for prompt engine fault diagnosis in automotive service centers or workshops, these tools are only applicable to the engine parts with transducers. For those mechanical engine parts without transducers, the automotive mechanics usually require the vehicle owner to provide qualitative symptom descriptions of the engine faults (e.g., ‘Sluggish acceleration?’ with description ‘accelerate very slow’; ‘Backfire in exhaust pipe?’ with description ‘sometimes’) and then try to diagnose the engine faults according to service manuals and their own experience. Hence, the scope of the study is related to the diagnosis of mechanical
∗ Corresponding author. Tel.: +853 83974357; fax: +853 28838314. E-mail address:
[email protected] (C.M. Vong). http://dx.doi.org/10.1016/j.asoc.2014.05.014 1568-4946/© 2014 Elsevier B.V. All rights reserved.
engine parts without transducers, whose faults can only be identified based on expert knowledge. Those engine faults that can be detected by onboard diagnostic systems and scan tools are not within the scope of the current study. Throughout this paper, engine faults refer to the faults caused by mechanical engine parts without transducers. Accordingly, an effective diagnostic system of mechanical faults for in-use vehicle engines based on the provided qualitative symptom descriptions is necessary in automotive service centers or workshops because ineffective diagnosis will incur longer maintenance time and unnecessary waste of manpower. The cost of time and manpower will transfer to both the automotive mechanics and the consumers. In addition, an engine malfunction is often caused by more than one individual fault (i.e., simultaneous-faults). This is a common phenomenon because vehicle owners often ignore the importance of maintenance even when there is an engine fault, as long as the engine is not significantly affected. As a result, the problem becomes a simultaneous-fault diagnosis (SmFD) problem, and the development of such a diagnostic system is the main objective of this study. In the current literature, automotive engine diagnostic methods are mainly classified into three categories [4,5]: model-based, data-driven and knowledge-based methods. Model-based diagnostics
C.M. Vong et al. / Applied Soft Computing 22 (2014) 238–248
Nomenclature
DT            Decision-by-threshold
DTO           Decision threshold optimization
FPSD          Fuzzy and probabilistic simultaneous-fault diagnosis
FNN           Fuzzy neural network
FZ            Fuzzification
HFNN          Hierarchical fuzzy neural network
MLC           Multi-label classification
OVA           One-vs.-all strategy
PCS           Pairwise coupling strategy
PNN           Probabilistic neural network
PPMLC         Pairwise probabilistic MLC
PSVM          Probabilistic support vector machines
SFD           Single-fault diagnosis
SmFD          Simultaneous-fault diagnosis
SLC           Single-label classification
TRAIN_1       Training dataset of single-faults only
TRAIN_S       Training dataset of simultaneous-faults only
TEST_1        Test dataset of single-faults only
TEST_S        Test dataset of simultaneous-faults only
VALID_1       Validation dataset of single-faults only
VALID_S       Validation dataset of simultaneous-faults only
$B_j$         Binary classifier for the jth fault
$C_j$         Probabilistic classifier for the jth fault
$C_{jk}$      Probabilistic classifier for the jth fault against the kth fault
$d$           Number of faults
$D$           Sample dataset
$D_1$         Dataset of single-fault patterns
$D_s$         Dataset of simultaneous-fault patterns
$F$           F-measure
$P(\cdot)$    Probability
$S$           Fuzzified sample dataset
$S_1$         Fuzzified dataset of single-fault patterns
$S_s$         Fuzzified dataset of simultaneous-fault patterns
$s$           Fuzzified pattern of unknown input x
$s_i$         Fuzzified pattern of the ith training data $x_i$
$t_{ij}$      jth true fault of $x_i$
$t_i$         True fault vector of $x_i$
$x$           Unknown input pattern
$x_i$         ith raw training pattern
$y_{ij}$      jth predicted fault of $x_i$
$y_i$         Predicted fault vector of $x_i$
$\varepsilon$       Decision threshold
$\varepsilon^*$     Optimal decision threshold
$\theta$      Heaviside step function
$\rho$        Probability vector of faults for a pattern x
$\rho_j$      Probability of the jth fault for a pattern x
$\rho_{jk}$   Probability of the jth fault against the kth fault for a pattern x
              Defuzzification threshold
are very accurate, but identifying the appropriate values of the model parameters is very time-consuming and labor-demanding [6]. Thus, the model-based method is too expensive to apply comprehensively in practice [5]. Moreover, due to the different natures of faults and the modeling uncertainty, no single model-based approach can diagnose all the faults [7]. In other words, many models for various types of engines are required. Data-driven engine diagnosis methods, on the other hand, rely on signal-based diagnosis or engine oil analysis. Signal-based diagnosis has recently been the most popular method [8–15] because it is very suitable for laboratories
and the development of automotive scan tools, computerized engine analyzers, engine condition monitoring and on-board diagnostic systems. Its main drawback is that many signal patterns are engine-type dependent. That is, the construction of a general diagnostic system requires many training patterns from various engine models. In addition, the complicated process of signal acquisition and the expensive equipment are also barriers to this kind of approach. Regarding the oil analysis approach [16], the elemental analysis of the engine oil is not easy to carry out in automotive workshops. The sample data for constructing the diagnostic system are also dependent on the oil manufacturer and engine type. Nevertheless, the most critical drawback of these approaches is that they cannot deal with the qualitative symptom descriptions, because the descriptions are usually uncertain and vague and cannot be easily represented in numerical form. Therefore, both signal-based diagnostics and engine oil analysis are impractical for automotive service centers or workshops. Knowledge-based methods, by contrast, are capable of handling the qualitative symptom descriptions. An early development in this category is the decision tree approach by Gelgele and Wang [17], which makes a diagnosis through a sequence of questions and answers. However, for most vehicle diagnostic problems, the diagnostic knowledge is too limited and incomplete to build a decision tree due to the complexity of modern vehicles [18]. In the recent decade, fuzzy logic has provided a generalized solution to qualitative fault diagnosis [18–22]. It analyzes the diagnostic problems via linguistic descriptions based on general and/or specific qualitative expert knowledge, which is very suitable for automotive service centers or workshops. Applications of fuzzy logic to engine fault diagnosis can be found in the studies of Lu et al. [18] and Çelik et al. [19].
Nevertheless, in order to construct a reliable fuzzy logic system, multiple sources of qualitative knowledge need to be digested and transformed into a set of complicated fuzzy rules, which are very difficult to define accurately [23]. Azarian et al. [5,24], therefore, proposed a car enterprise information system that integrates the technician feedback, heuristic knowledge and model-based knowledge from the database and knowledge of the car manufacturers. The work is valuable, but its main achievement is only to reduce the time of diagnosis sessions in the workshop. Furthermore, the enterprise information system is not suitable for non-factory automotive service centers or workshops because access to the database and knowledge base of the car manufacturers is not available. In a nutshell, none of the existing methods for engine fault diagnosis fully matches the requirements of service centers or workshops. To address the aforementioned limitations, a fuzzy network-based classifier (e.g., fuzzy neural network (FNN)) [25–28] is employed in this study for engine fault detection based on qualitative symptom descriptions, because the complicated digestion of qualitative knowledge into fuzzy rules can easily be handled with the use of a neural network. However, most current network-based classifiers do not take SmFD into account. To perform SmFD, an early study [29] proposed a hierarchical fuzzy neural network (HFNN), which can handle multiple incipient faults based on monitoring signals. HFNN is constructed from several stages, and each HFNN stage is constructed with a set of FNNs. However, HFNN suffers from the drawback that it requires a large amount of training data of simultaneous-faults. In addition, its architecture becomes too complex to extend to even medium scale because a large number of FNNs must be constructed in every HFNN stage. Eslamloueyan et al.
[30,31] pointed out that the hierarchical neural network can be applied to at most triple-fault patterns. Therefore, the framework of HFNN is too difficult to put into current practice. Another way to process SmFD is to use the single-label classification (SLC) framework, where one classifier is built for each fault, regardless of whether it is a single- or simultaneous-fault. It is obvious that for
SLC, a large number of classifiers and their corresponding training data of simultaneous-faults are required. Moreover, the training data of simultaneous-faults are usually much more costly and difficult to acquire than those of single-faults in many practical applications. In a recent study [32], the problem of SmFD was solved using multi-label classification (MLC), where multiple classes can be assigned to a pattern. In that study [32], the MLC framework was also compared with the SLC framework, and MLC was shown to be superior to SLC in terms of formulation, expandability, and size of training data, although SLC can generally provide a higher accuracy for some simultaneous-fault cases. Since MLC can overcome the aforesaid problems raised by SLC and can also be extended to medium or even large scale, it is considered in this study. One deficiency of traditional MLC is that it employs the simple one-vs.-all (OVA) strategy, which does not consider the issue of pairwise correlation (i.e., the correlation between every pair of faults or labels). Hastie and Tibshirani [33] showed that, by considering pairwise correlation, the classification accuracy can be significantly improved. Moreover, a qualitative symptom provided by the vehicle owner does not always indicate a particular engine fault with certainty. For example, for a symptom of “difficult-to-start”, it is highly likely that a fault of “idle-air valve malfunction” has occurred, but sometimes it may also be caused by a fault of “defective fuel pump system”; the probability of occurrence of “idle-air valve malfunction” is not 100% but perhaps just 70%. Thus, it is better to present the occurrence of an engine fault in terms of a probability instead of a binary value (“yes or no” answer). A popular method called pairwise coupling strategy (PCS) [34,35] works on probability estimates and can alleviate the pairwise correlation issue.
Consequently, PCS is employed in this study and combined with MLC using a probabilistic network-based classifier to form a pairwise probabilistic MLC (PPMLC). Probabilistic diagnosis has the following advantages: (i) it is closer to physical reality [36]; (ii) the probabilities of faults can serve as an important quantitative measure for user decision in addition to the binary diagnostic results; (iii) when a predicted fault is incorrect, the probabilities of the faults can provide the user with some hints to trace the next possible fault. Although the probabilities of engine faults can be obtained, these probabilities only indicate the chance of occurrence of the engine faults. To determine the occurrence of the engine faults, a decision threshold ε must be applied to those probabilities (e.g., Fault A occurs if P(Fault A) > ε). The decision threshold can be determined either subjectively by human experts or objectively through an optimization over sample data. This step is called decision threshold optimization (DTO), as described later. To summarize the aforesaid situations and requirements, a new SmFD framework for automotive engine fault diagnosis in automotive workshops, called fuzzy and probabilistic simultaneous-fault diagnosis (FPSD), is proposed in this study. The proposed FPSD consists of three components: fuzzification (FZ), PPMLC, and decision-by-threshold (DT). The workflow of the proposed FPSD is shown in Fig. 1 and the details are given in the following sections. This novel framework can handle qualitative symptom descriptions using fuzzy logic and perform SmFD using probabilistic classifiers at the same time, whereas the existing diagnosis frameworks consider only one of these independently: either fuzzy inference for qualitative symptom descriptions, or simultaneous-fault engine diagnosis. Besides, the proposed FPSD can provide probabilities of engine simultaneous-faults in addition to the final diagnostic decision.
By referring to the probabilities of faults, the automotive mechanics can troubleshoot the engine faults more efficiently. The rest of this paper is organized as follows. In Section 2, existing frameworks for SmFD are reviewed and details of the proposed framework FPSD are presented. To demonstrate the effectiveness of the proposed framework, a case study was conducted in Section 3, in which the data sampling procedure and the application of
the proposed FPSD are described. For evaluation purposes, a comparative study of FPSD against existing frameworks is provided in Section 4, along with the corresponding analysis and discussion of results. The conclusions and future work are finally given in Section 5.
2. Proposed fuzzy and probabilistic simultaneous-fault diagnosis (FPSD)
To provide a better understanding of the proposed FPSD, details of the existing frameworks for SmFD are briefly reviewed, followed by the details of FPSD.
2.1. Review of existing SmFD frameworks
There are three existing frameworks for SmFD: HFNN, SLC, and MLC. Since HFNN is not suitable for the current practice of SmFD, only SLC and MLC were compared with the proposed FPSD in this study. For this reason, the details of SLC and MLC are briefly reviewed in the following sub-sections.
2.1.1. SLC
SLC is actually a single-fault diagnosis (SFD) that artificially considers every combination of single-faults as an individual label. For example, given two single-faults, Fault A and Fault B, there are in total three combinations of faults, where Fault A is referred to as Label 1, Fault B as Label 2, and Faults A plus B are artificially considered as Label 3. When the number of single-faults d is large (e.g., d = 12), the number of combinations of single-faults increases exponentially, so that a large number of artificial labels must be made in SLC. Given a sample $D = \{(x_i, t_i)\}$ of N cases for a d-class problem, for i = 1 to N, where d is the number of single-faults, $x_i \in \mathbb{R}^n$ is an n-dimensional input pattern and $t_i \in \{1, \ldots, 2^d\}$ is the corresponding label of $x_i$, in which the 1st label indicates a normal condition (i.e., no fault occurs). Then, a function $f_{SLC}$ is constructed for SLC such that the predicted label, or decision, y for an unknown input x is:
$$y = f_{SLC}(x) \in \{1, \ldots, 2^d\} \qquad (1)$$
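To make the label space of Eq. (1) concrete, the following minimal Python sketch enumerates the SLC labels for d single-faults. The function name `slc_labels` and the ordering of the labels after the normal condition (by number of faults) are assumptions made for illustration only; the text fixes only that the 1st label is the normal condition.

```python
from itertools import combinations


def slc_labels(d):
    """Enumerate SLC labels for d single-faults.

    Label 1 is the normal condition (empty fault set). The ordering of the
    remaining labels (by subset size) is an illustrative assumption.
    """
    faults = range(1, d + 1)
    subsets = [frozenset(c) for r in range(d + 1) for c in combinations(faults, r)]
    return {label: subset for label, subset in enumerate(subsets, start=1)}


labels = slc_labels(5)
n_total = len(labels)  # all fault combinations, including "normal"
n_single = sum(1 for s in labels.values() if len(s) == 1)
n_simultaneous = sum(1 for s in labels.values() if len(s) >= 2)
print(n_total, n_single, n_simultaneous)  # 32 5 26
```

For d = 12, the same enumeration gives 4096 labels, which illustrates why one classifier per label quickly becomes impractical.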
Hence, it can be seen that there are $2^d$ possible labels for a d-class problem, of which d + 1 labels are the single-fault labels plus the normal condition, and up to $2^d - (d + 1)$ are artificial simultaneous-fault labels. For example, for a problem with 5 single-faults, there are $2^5 = 32$ labels, of which at most 32 − 6 = 26 labels are artificially defined for possible simultaneous-faults. Equivalently, there are up to $2^d$ classifiers in $f_{SLC}$, in which every classifier corresponds to one of the $2^d$ labels. Obviously, SLC is prohibitive for even a medium size of d in terms of training time and the number of required simultaneous-fault training patterns.
2.1.2. MLC
MLC can be considered as a generalization of SLC. In SLC, each instance is associated with only one single label, but in MLC, multiple labels are assigned to each instance. Suppose the sample is reconstructed as $D = \{(x_i, t_i)\} = D_1 \cup D_s$ for i = 1 to N, where $D_1$ is the set of single-fault patterns, $D_s$ is the set of simultaneous-fault patterns, $x_i \in \mathbb{R}^n$ is the input pattern, and $t_i = [t_{i1}, \ldots, t_{id}] \in \{0, 1\}^d$ is the vector of fault labels of $x_i$ in which $t_{ij} \in \{0, 1\}$ for j = 1 to d; then a function $f_{MLC}$ is constructed for MLC. Within $f_{MLC}$, the one-vs.-all (OVA) splitting strategy [32] was used so that d binary classifiers $B_j$ were constructed for j = 1 to d. Every classifier $B_j$ simply classifies whether an unknown pattern x belongs to the jth fault or not. Hence, there is a significant reduction in the number of classifiers from $2^d$ to only d. In addition, the training phase of $B_j$ involves the set of single-fault patterns $D_1$ only while the set of simultaneous-fault patterns $D_s$ is not
Fig. 1. Workflow of the proposed FPSD. (Blocks: FZ, modelling by PPMLC, DTO, probabilistic classification, DT, and F-measure evaluation, applied to the training, validation, and test datasets; the probability vector $\rho$ and decision vector y are the system outputs for user decision making.)
required. The decision vector y for an unknown pattern x is obtained from $f_{MLC}$ as follows:
$$y = f_{MLC}(x) = [\theta(B_1(x)), \ldots, \theta(B_d(x))] \in \{0, 1\}^d \qquad (2)$$
where $B_j(x) \in \mathbb{R}$ is the raw output value of the jth SVM classifier, and $\theta$ is a Heaviside step function, i.e., for $z \in \mathbb{R}$, $\theta(z) = 1$ if $z \geq 0$ and $\theta(z) = 0$ otherwise. Note that y = [0, . . ., 0] indicates x is normal (i.e., without any fault).
2.2. Details of proposed FPSD framework
The FPSD framework is proposed by modifying the MLC framework. In traditional MLC, the pairwise correlation between the labels is not considered. To deal with this issue, the popular PCS [33,34] is applied to improve the classification accuracy. Furthermore, with probabilistic outputs, the probabilities of faults can be provided as additional information for user decision. Typically, a probabilistic neural network (PNN) is selected for probabilistic classification. However, PNN has the drawbacks of long execution time and insufficient accuracy [37] for medium to large datasets. In the past decade, the support vector machine (SVM) has become a promising technique because of its ease of use, short training and execution time, and high accuracy. In addition, a probabilistic version of SVM (PSVM) [38] was developed, which inherits the advantages of SVM while providing probabilistic outputs. Therefore, both PNN and PSVM were employed to construct the probabilistic classifiers for comparison in this study and to demonstrate the generality of the proposed framework. Similar to MLC, consider a sample $D = \{(x_i, t_i)\} = D_1 \cup D_s$ for i = 1 to N, where $x_i = [x_{i1}, \ldots, x_{in}]$ is a set of linguistic variables representing qualitative symptom descriptions, $t_i = [t_{i1}, \ldots, t_{id}]$ is a binary vector of d labels of the corresponding single-faults of $x_i$, and $t_{ij} \in \{0, 1\}$ for j = 1 to d. A function $f_{FPSD}$ is then constructed using $D_1$ to map an unknown input x to a decision vector y:
$$y = f_{FPSD}(x) \in \{0, 1\}^d \qquad (3)$$
As shown in Fig. 1, the proposed FPSD involves three steps: FZ, PPMLC, and DT. FZ transforms the uncertain qualitative symptom descriptions into numerical values; PPMLC overcomes the deficiency of traditional MLC with the use of probabilistic classifiers; and DT determines the occurrence of the engine faults with the use of a decision threshold. Therefore, $f_{FPSD}$ can be defined more precisely by these three steps:
Step 1: $s = FZ(x) \in \mathbb{R}^n \qquad (4)$
Step 2: $\rho = PPMLC(s) \in [0, 1]^d \qquad (5)$
Step 3: $y = DT(\rho) \in \{0, 1\}^d \qquad (6)$
where s is the fuzzified input pattern of x after applying appropriate membership functions, $\rho$ is a d-dimensional probability vector
representing the probabilities of d faults for x, and y is the binary decision vector for x. The working details of the three steps are explained in the following sub-sections.
2.2.1. FZ
Since the qualitative symptom descriptions from the vehicle owner are mostly in uncertain form such as “Very difficult”, “Difficult”, “Fair”, “Easy” and “Very easy”, fuzzification is necessary to transform the descriptions into numerical values through a set of fuzzy membership functions. For m = 1 to n, $x_m$ is the mth qualitative symptom in the input pattern $x = [x_1, \ldots, x_n]$. Assuming the universe of $x_m$ is $A = \{v_1, \ldots, v_u\}$, where u is the number of possible descriptions for $x_m$, the fuzzy set for $x_m$ can be expressed as:
$$A = \frac{\mu_A(v_1)}{v_1} + \frac{\mu_A(v_2)}{v_2} + \cdots + \frac{\mu_A(v_u)}{v_u} \qquad (7)$$
where $\mu_A(\cdot)$ represents the user-defined membership function. Following this procedure, a different fuzzy set is defined for every qualitative symptom $x_m$.
2.2.2. PPMLC and proposed PCS for SmFD
After applying FZ to $D = \{(x_i, t_i)\} = D_1 \cup D_s$, a set of truth vectors $S = \{(s_i, t_i)\} = S_1 \cup S_s$ is produced, where $s_i = [s_{i1}, \ldots, s_{in}]$ and $s_{im} \in [0, 1]$ indicates the truth value of the mth qualitative symptom for m = 1 to n. Similarly, $S_1$ and $S_s$ represent the sets of truth vectors of single-faults and simultaneous-faults, respectively. The training dataset $S_1$, containing solely single-faults, is selected to train a set of probabilistic classifiers in PPMLC. Then, PPMLC takes an unknown truth vector s as input and produces a probability vector $\rho = [\rho_1, \ldots, \rho_d]$, where d is the number of single-fault labels. Here, $\rho_j = P(j|s) \in [0, 1]$ denotes the probability that s belongs to the jth label for j = 1 to d. Note that every $\rho_j$ is an independent probability and hence $\sum_{j=1}^{d} \rho_j \geq 1$. This is an important property of PPMLC. As mentioned, PCS was employed to take the pairwise correlation between classes into account. Traditionally, given a d-class classification problem, the OVA strategy constructs d individual probabilistic classifiers $C_j$ for j = 1 to d. Every $C_j$ is trained with all training data in $S_1$ using any probabilistic modeling method (e.g., PNN or PSVM), in which the training data with the jth label are considered as positive while the training data with the other labels are considered as negative. Then, $C_j$ takes an unseen truth vector s and produces the probability that s belongs to the jth label, i.e., $C_j(s) = \rho_j = P(j|s)$. However, in PCS, every $C_j$ is constituted by d − 1 pairwise probabilistic classifiers $C_{jk}$, k = 1 to d and $k \neq j$, as shown in Fig. 2. In general, each $C_{jk}$ can be any probabilistic classifier (e.g., PNN or PSVM) which estimates the pairwise probability that an unknown truth vector s belongs to the jth label against the kth label, i.e., $C_{jk}(s) = P(j|s, j \text{ or } k)$. Since $C_{jk}(s)$ and $C_{kj}(s)$ are complementary (i.e., $C_{jk}(s) = 1 - C_{kj}(s)$), there are in total d(d − 1)/2 pairwise classifiers in PPMLC.
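The way PPMLC combines the pairwise outputs into one probability per fault (the count-weighted average of Eq. (8)) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `pcs_fault_probabilities`, the matrix layout, and all numbers in the example are hypothetical, and the pairwise probabilities are taken as given rather than produced by a trained PNN or PSVM.

```python
import numpy as np


def pcs_fault_probabilities(pairwise, counts):
    """Combine pairwise probabilities into per-fault probabilities.

    pairwise[j, k] holds rho_jk = P(j | s, j or k) and counts[j, k] holds
    n_jk, the number of training cases with label j or k. Diagonal entries
    are ignored. Returns rho_j = sum_k n_jk * rho_jk / sum_k n_jk, k != j.
    """
    pairwise = np.asarray(pairwise, dtype=float)
    counts = np.asarray(counts, dtype=float)
    off_diag = ~np.eye(pairwise.shape[0], dtype=bool)
    weighted = np.where(off_diag, counts * pairwise, 0.0)
    weights = np.where(off_diag, counts, 0.0)
    return weighted.sum(axis=1) / weights.sum(axis=1)


# Hypothetical example with d = 3 faults, i.e. d(d - 1)/2 = 3 pairwise classifiers.
# Only rho_12, rho_13 and rho_23 are free; the rest follow from rho_kj = 1 - rho_jk.
pairwise = np.array([[0.0, 0.8, 0.6],
                     [0.2, 0.0, 0.3],
                     [0.4, 0.7, 0.0]])
counts = np.array([[0, 40, 60],
                   [40, 0, 50],
                   [60, 50, 0]])
rho = pcs_fault_probabilities(pairwise, counts)
print(rho)
```

Because each row is normalized separately, the resulting probabilities are not forced to sum to one, which is what allows more than one fault to receive a high probability.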
is at least 0.5. However, since the value of ε is application-specific, it is preferable to determine the optimal ε* via a DTO step using a validation dataset. In this study, a well-known direct search method, namely particle swarm optimization (PSO) [39], was employed to optimize ε by using an independent validation dataset.
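A minimal sketch of decision-by-threshold and of a threshold search on validation data is given below. Two points are assumptions rather than the paper's method: a plain grid search is swapped in for PSO to keep the example short, and the F-measure is taken to be the micro-averaged form over all labels, which may differ from the definition used in the evaluation. The function names and the validation numbers are hypothetical.

```python
import numpy as np


def decide(rho, eps):
    """Decision-by-threshold: y_j = 1 if rho_j >= eps, otherwise 0."""
    return (np.asarray(rho, dtype=float) >= eps).astype(int)


def micro_f_measure(y_pred, y_true):
    """Micro-averaged F-measure over all labels (assumed definition)."""
    y_pred, y_true = np.asarray(y_pred), np.asarray(y_true)
    denom = y_pred.sum() + y_true.sum()
    return 1.0 if denom == 0 else 2.0 * (y_pred * y_true).sum() / denom


def grid_search_threshold(rho_valid, t_valid, grid=np.linspace(0.05, 0.95, 19)):
    """Return the grid threshold with the highest validation F-measure.

    Grid search stands in for PSO here; ties go to the smallest threshold.
    """
    scores = [micro_f_measure(decide(rho_valid, eps), t_valid) for eps in grid]
    best = int(np.argmax(scores))
    return float(grid[best]), float(scores[best])


# The worked example from the text: eps = 0.5 gives faults 1, 3 and 5.
y = decide([0.72, 0.42, 0.82, 0.28, 0.86], 0.5)
print(y)  # [1 0 1 0 1]

# Hypothetical validation set: two patterns, five fault labels.
rho_valid = [[0.72, 0.42, 0.82, 0.28, 0.86],
             [0.57, 0.10, 0.20, 0.61, 0.05]]
t_valid = [[1, 0, 1, 0, 1],
           [0, 0, 0, 1, 0]]
best_eps, best_f = grid_search_threshold(rho_valid, t_valid)
print(best_eps, best_f)
```

On this toy data the default of 0.5 would flag a spurious fault in the second pattern, while a slightly higher threshold separates the labels cleanly, which is the motivation for optimizing ε on validation data.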
Fig. 2. PCS of probabilistic classification.
There are several methods [34] for PCS, which are, however, only suitable for SFD because of the constraint $\sum_{j=1}^{d} \rho_j = 1$. Since the nature of SmFD is that $\sum_{j=1}^{d} \rho_j \geq 1$, a simple PCS is proposed in this study for SmFD as follows:
1. Every $C_{jk}$ is trained only by the training data with the jth and kth labels.
2. Let $\rho_{jk} = C_{jk}(s) = P(j|s, j \text{ or } k)$ be the pairwise probability of the jth label against the kth label for an unknown truth vector s, where $C_{jk}(s)$ is estimated using PNN or PSVM. Then, $\rho_j$ is calculated as:
$$\rho_j = C_j(s) = \frac{\sum_{k=1, k \neq j}^{d} n_{jk} C_{jk}(s)}{\sum_{k=1, k \neq j}^{d} n_{jk}} = \frac{\sum_{k=1, k \neq j}^{d} n_{jk} \rho_{jk}}{\sum_{k=1, k \neq j}^{d} n_{jk}} \qquad (8)$$
where $n_{jk}$ is the number of training data with the jth and kth labels. Hence, the probabilities $\rho_j$ can be more accurately estimated from $\rho_{jk} = C_{jk}(s)$ because the pairwise correlation between the labels is taken into account. The procedure of the PCS is summarized in Fig. 2.
2.2.3. DT and DTO
After probabilistic classification, a probability vector $\rho = [\rho_1, \ldots, \rho_d]$ indicating the probabilistic occurrence of the faults is produced. At this stage, the diagnosis system can provide the probability vector $\rho$ to the user as a quantitative measure for reference and further use. Nevertheless, a decision vector y indicating the multi-label decision of fault diagnosis is desired. By introducing a decision threshold ε, y can be produced from $\rho$:
$$y = DT(\rho) = [y_1, y_2, \ldots, y_d] = [\theta_\varepsilon(\rho_1), \theta_\varepsilon(\rho_2), \ldots, \theta_\varepsilon(\rho_d)] \qquad (9)$$
with
$$\theta_\varepsilon(\rho_j) = \begin{cases} 1 & \text{if } \rho_j \geq \varepsilon \\ 0 & \text{otherwise} \end{cases}, \quad \text{for } j = 1 \text{ to } d \qquad (10)$$
where ε ∈ (0, 1) is a user-specified decision threshold and $y_j$ indicates whether s belongs to the jth label or not. For example, if ε = 0.5 and $\rho$ = PPMLC(s) = [0.72, 0.42, 0.82, 0.28, 0.86], then y = DT($\rho$) = [1, 0, 1, 0, 1]. Therefore, s is diagnosed as simultaneous-faults (1, 3, 5). Note that y = [0, 0, 0, 0, 0] indicates no fault has been found and s is diagnosed as normal. The value of the decision threshold ε greatly affects the classification accuracy. For a situation without any prior information, the best estimate of ε may be simply set to 0.5, i.e., the occurrence of a fault is considered if its probability
2.2.4. Capability for SFD and SmFD
It can be noticed that, for the proposed FPSD, if the engine malfunction is caused by a single-fault, say, the jth fault, s contains only the symptoms of the jth fault. Then, in $\rho$, only the corresponding probability $\rho_j \geq \varepsilon^*$, resulting in only $y_j = 1$ in the decision vector y while all other $y_k = 0$, $k \neq j$. In other words, $\sum_{j} y_j = 1$ and hence a single-fault is detected. For the case that the engine malfunction is caused by two simultaneous-faults (e.g., the jth and kth faults), s is constituted by the symptoms of the jth and kth faults. These symptoms may be overlapping or inter-distorted. In the current diagnostic system, probabilities are employed to give the similarity of s against the jth and kth faults by $C_j$ and $C_k$, respectively. If their symptoms are not highly overlapping or inter-distorted, there is a high chance that both the corresponding probabilities $\rho_j, \rho_k \geq \varepsilon^*$. Under this circumstance, $y_j = 1$ and $y_k = 1$, making $\sum_{j} y_j > 1$, so that a simultaneous-fault can be detected. The mechanism is similar for three or more simultaneous-faults. By combining these cases, the proposed FPSD can diagnose both single-faults and simultaneous-faults using classifiers trained with single-faults only.
3. Case study for application of FPSD
To demonstrate the effectiveness of the proposed FPSD, a case study was conducted. The procedure for the collection of real-world sample data is first described. The FZ process of the collected data and the training of the PPMLC are then presented. The method for evaluating the proposed FPSD is also discussed in this section.
3.1. Sample data collection
The first step in the case study was the collection of qualitative symptoms and their corresponding engine faults for constructing the classifier. The qualitative symptom descriptions could be collected from vehicle owners by the mechanics through a questionnaire.
However, there is no general format for such a questionnaire because every automotive service center has its own design according to the availability of equipment and the experience of its mechanics. Moreover, engine faults differ slightly by region due to factors such as choice of fuels, climate, road design, and driving pattern. For instance, the only gasoline octane rating available in Macau is 98, and the coldest month in Macau is January, with an average monthly temperature of 14.5 °C, so fuel quality and cold-weather start issues are not important concerns to the local service centers in Macau. Therefore, as an illustrative example, the sample data were collected from local automotive service centers in Macau, and a total of five experienced mechanics were consulted. In other words, all data are real industrial data. As mentioned in Section 1, those engine faults that can be detected by onboard diagnostic systems or scan tools are not within the scope of the current study. Hence, only those engine faults that cannot easily be detected by onboard diagnostic tools were utilized. In total, 540 cases including both single-faults and simultaneous-faults were provided by the mechanics. In fact, each mechanic came up with different faults even for the same set of qualitative engine symptoms, so network-based classification appears to be a good approach to summarize the expert knowledge. Tables 1 and 2 provide the common symptom descriptions and engine faults that
Table 1
Automotive engine symptoms for this case study.
Linguistic variable    Corresponding engine symptom in the questionnaire
x1                     Difficult-to-start
x2                     Stall on occasion
x3                     Backfire during acceleration
x4                     Unstable idle speed or misfire
x5                     Sluggish acceleration
x6                     Knocking during acceleration
x7                     Backfire in exhaust pipe
x8                     Check engine light is on
x9                     Abnormal coolant temperature
x10                    Vehicle has traveled miles since last engine oil change
x11                    Poor fuel mileage
x12                    Oil indicator is on
Table 2 Automotive engine faults for this case study.

| Label | Corresponding engine fault |
| --- | --- |
| y1 | Idle-air valve malfunction |
| y2 | Defective ignition coil |
| y3 | Incorrect ignition timing |
| y4 | Defective spark plug |
| y5 | Defective throttle valve unit |
| y6 | Leakage in intake manifold |
| y7 | Defective air cleaner |
| y8 | Defective injector |
| y9 | Defective fuel pump system |
| y10 | Defective cooling system |
| y11 | Defective lubrication system |
| y12 | Engine oil quality/level problem |
x1: ‘Normal start’ → s1 = 0
x2: ‘Severely unstable engine speed’ → s2 = 0.7
...
x12: ‘Oil indicator is on’ → s12 = 1.
Then, the corresponding truth vector becomes s = [s1, s2, ..., s12] = [0, 0.7, ..., 1].
3.3. Training of PPMLC
are available in the sample dataset, whereas Table 3 summarizes the relationships between the symptoms and the possible engine faults. Each row of Table 3 reveals that one symptom can be associated with one or more possible faults, indicating the difficulty of engine fault diagnosis and the necessity for SmFD.
3.2. FZ of symptoms
In the current case study, there are twelve symptoms in total (n = 12), which can be expressed as a vector of linguistic variables x = [x1, x2, ..., x12]. However, these linguistic variables are difficult to represent directly by numerical values. Table 1 reveals that the symptom descriptions fall into three types: multiple-choice for x1–x9, numeric for x10, and binary for x11 and x12. Therefore, the first step in FPSD is to transform every symptom into a corresponding truth value by using fuzzy logic. According to the domain knowledge of several automotive mechanics and the reference handbook [25], the membership functions for the inputs are defined below:
Table 3 Relationships between sample symptoms and possible engine faults. (Check-mark matrix of the fault labels y1–y12 against the symptoms x1–x12; the positions of the individual check marks are not recoverable here.)
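\[
\begin{aligned}
x_1 &: \text{Difficult-to-start} = \frac{1}{\text{unable to start}} + \frac{0.7}{\text{able to crank but cannot start}} + \frac{0.3}{\text{immediately stall after starting}} + \frac{0}{\text{normal start}} \\
x_2 &: \text{Stall on occasion} = \frac{1}{\text{stall}} + \frac{0.7}{\text{severely unstable engine speed}} + \frac{0.3}{\text{unstable engine speed}} + \frac{0}{\text{stable engine speed}} \\
x_3 &: \text{Backfire during acceleration} = \frac{1}{\text{always backfire}} + \frac{0.5}{\text{sometimes backfire}} + \frac{0}{\text{normal acceleration}} \\
x_4 &: \text{Unstable idle speed or misfire} = \frac{1}{\text{misfire frequently}} + \frac{0.7}{\text{engine jerk}} + \frac{0.3}{\text{unstable engine idle speed}} + \frac{0}{\text{stable engine idle speed}} \\
x_5 &: \text{Sluggish acceleration} = \frac{1}{\text{misfiring during acceleration}} + \frac{0.7}{\text{unable to accelerate}} + \frac{0.3}{\text{accelerate very slow}} + \frac{0}{\text{normal acceleration}} \\
x_6 &: \text{Knocking during acceleration} = \frac{1}{\text{serious}} + \frac{0.5}{\text{slight}} + \frac{0}{\text{no}} \\
x_7 &: \text{Backfire in exhaust pipe} = \frac{1}{\text{always backfire}} + \frac{0.5}{\text{sometimes backfire}} + \frac{0}{\text{no backfire}} \\
x_8 &: \text{Check engine light is on} = \frac{1}{\text{on}} + \frac{0.5}{\text{light goes on and off}} + \frac{0}{\text{off}} \\
x_9 &: \text{Abnormal coolant temperature} = \frac{1}{\text{above normal}} + \frac{0.5}{\text{below normal}} + \frac{0}{\text{normal}} \\
x_{10} &: \text{Vehicle has travelled km since last engine oil change} = \frac{1}{\geq 40{,}000} + \frac{0.7}{40{,}000\text{–}20{,}000} + \frac{0.5}{20{,}000\text{–}10{,}000} + \frac{0.3}{10{,}000\text{–}5{,}000} + \frac{0}{\leq 5{,}000} \\
x_{11} &: \text{Poor fuel mileage} = \frac{1}{\text{yes}} + \frac{0}{\text{no}} \\
x_{12} &: \text{Oil indicator is on} = \frac{1}{\text{yes}} + \frac{0}{\text{no}}
\end{aligned}
\]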
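The membership functions above amount to a lookup from each questionnaire answer to a truth value. A minimal Python sketch of this step follows; the dictionary layout, the function names `fuzzify` and `fuzzify_distance`, the answer strings, and the handling of the shared range end points for x10 are illustrative assumptions rather than the authors' implementation, and only a few of the twelve symptoms are spelled out to keep the sketch short.

```python
# Illustrative sketch only: names and data layout are assumptions, not the paper's code.

# Discrete membership functions for three of the twelve symptoms (values from Section 3.2).
MEMBERSHIP = {
    "x1": {  # Difficult-to-start
        "unable to start": 1.0,
        "able to crank but cannot start": 0.7,
        "immediately stall after starting": 0.3,
        "normal start": 0.0,
    },
    "x2": {  # Stall on occasion
        "stall": 1.0,
        "severely unstable engine speed": 0.7,
        "unstable engine speed": 0.3,
        "stable engine speed": 0.0,
    },
    "x12": {"yes": 1.0, "no": 0.0},  # Oil indicator is on
}


def fuzzify_distance(km):
    """Truth value for x10 (km since last engine oil change).

    The listed ranges share their end points; assigning each shared end point
    to the higher band is an assumption made here.
    """
    if km >= 40_000:
        return 1.0
    if km >= 20_000:
        return 0.7
    if km >= 10_000:
        return 0.5
    if km > 5_000:
        return 0.3
    return 0.0


def fuzzify(answers):
    """Map {symptom: answer} to {symptom: truth value}."""
    truth = {}
    for symptom, answer in answers.items():
        if symptom == "x10":
            truth[symptom] = fuzzify_distance(answer)
        else:
            truth[symptom] = MEMBERSHIP[symptom][answer.strip().lower()]
    return truth


example = {
    "x1": "Normal start",
    "x2": "Severely unstable engine speed",
    "x10": 12_000,
    "x12": "yes",
}
s = fuzzify(example)
print(s)
```

A full version would simply extend `MEMBERSHIP` with the remaining symptoms and return the values in the fixed order x1 to x12 to form the truth vector s.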
With these membership functions, the qualitative symptom descriptions x can be fuzzified to the truth vector s. For example, if the qualitative symptom descriptions are:
There are twelve possible engine faults (d = 12) in the present case study. According to the PCS procedure described in Section 2, the maximum number of pairwise probabilistic classifiers Cjk is 12(12 − 1)/2 = 66. To train a Cjk, say with j = 1 and k = 3, the set of truth vectors {(si, ti) | ti1 = 1 or ti3 = 1} corresponding to either Fault 1 or Fault 3 is selected, because C13 is used to distinguish Fault 1 from Fault 3. A more general workflow for training a pairwise classifier Cjk is depicted in Fig. 3. After all Cjk are trained, the probability ρj of occurrence of the jth fault for an unknown symptom truth vector s = FZ(x) is calculated by using Eq. (8). For example, in Eq. (8), ρ2 = C2(s) is computed from the pairwise outputs ρ2k = C2k(s) for k = 1–12, k ≠ 2, where the pairwise classifiers C2k are employed. Note that C22 is not needed in the calculation and in fact is never trained, because j ≠ k. After every ρj is calculated for j = 1–12, the probability vector ρ = [ρ1, ..., ρ12] is obtained. Finally, by adopting the subsequent DT procedure described in Section 2, a final decision vector y = [y1, ..., y12] = DT(ρ) is obtained. The optimal decision threshold ε* in DT is determined via the DTO step in Section 2. The procedure of the engine fault diagnosis is summarized in Fig. 4.
3.4. Evaluation of FPSD
In order to verify the effectiveness of the proposed FPSD, models were constructed under different combinations of frameworks (SLC, MLC, FPSD) and classifiers (NN, PNN, SVM, PSVM) for a comprehensive comparison. PNN and PSVM were used to construct a set of probabilistic classifiers for comparison. The traditional classifiers, NN and SVM, were also employed to illustrate the importance of probabilistic classifiers and PCS. In addition, MLC and FPSD differ in two major components, PPMLC and DTO, and the effectiveness of these components is also evaluated.
The details of the model settings and the evaluation performance index are discussed in the following sub-sections.
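To make the training-set selection of Section 3.3 concrete, the sketch below enumerates the pairwise classifiers Cjk and picks, for one pair, the truth vectors whose label vector t marks fault j or fault k. It is a minimal Python illustration: the function names and the toy data are assumptions, and the actual classifier training (NN, PNN, SVM, or PSVM) is omitted.

```python
from itertools import combinations


def pairwise_indices(d):
    """All unordered fault pairs (j, k) with j < k, using 1-based fault numbers."""
    return list(combinations(range(1, d + 1), 2))


def select_pair_subset(cases, j, k):
    """Return the cases (s, t) with t[j-1] == 1 or t[k-1] == 1.

    `cases` is a list of (truth_vector, label_vector) tuples; this mirrors the
    set {(s_i, t_i) | t_ij = 1 or t_ik = 1} used to train C_jk.
    """
    return [(s, t) for s, t in cases if t[j - 1] == 1 or t[k - 1] == 1]


d = 12
pairs = pairwise_indices(d)
print(len(pairs))  # 66, i.e. d(d - 1)/2 pairwise classifiers

# Toy single-fault cases for a 3-fault problem (hypothetical data).
toy_cases = [
    ([0.0, 0.7, 1.0], [1, 0, 0]),  # Fault 1
    ([0.3, 0.0, 0.0], [0, 1, 0]),  # Fault 2
    ([1.0, 0.5, 0.0], [0, 0, 1]),  # Fault 3
    ([0.7, 0.7, 0.0], [1, 0, 0]),  # Fault 1
]
subset_13 = select_pair_subset(toy_cases, 1, 3)
print(len(subset_13))  # 3 cases would be used to train C_13
```

Because pairs are generated with j < k, no classifier of the form Cjj is ever created, which matches the remark that C22 is not trained.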
3.4.1. Performance index
Traditional evaluation of classification accuracy only considers exact matching of the decision vector y against the true vector t.
Fig. 3. Training procedure of pairwise classifier.
Table 4 Sample dataset division. Entries give the name and size of each fuzzified sample dataset.

| Purpose | Single-faults (420) | Simultaneous-faults (120) |
| --- | --- | --- |
| Training | TRAIN 1 (231) | TRAIN S (66) |
| Validation | VALID 1 (84) | VALID S (24) |
| Test | TEST 1 (105) | TEST S (30) |

The training dataset was used to train the classifiers. The validation dataset was used for determining the optimal decision threshold ε* in DTO. The test dataset was used to assess the performance of the trained models.
This kind of evaluation is, however, not suitable for the MLC framework, where partial matching is preferred. Therefore, a well-known and common evaluation measure called the F-measure [40] is employed. The F-measure is mostly used for performance evaluation of information retrieval systems, where a document may carry a single tag or multiple tags simultaneously, which matches the nature of the current study. With the F-measure, both single-fault and simultaneous-fault test cases can be evaluated appropriately. To define the F-measure F, the two concepts of precision (E) and recall (R) are used [40], so that
\[
F = \frac{2ER}{E + R} \tag{11}
\]
where E and R were originally designed for single-fault classification only, but can be extended to handle MLC [41]. For a given test set S = {(si, ti)} of Nt single-fault and simultaneous-fault cases, with i = 1 to Nt and j = 1 to d,
\[
E = \frac{\sum_{j=1}^{d}\sum_{i=1}^{N_t} y_{ij}\, t_{ij}}{\sum_{j=1}^{d}\sum_{i=1}^{N_t} t_{ij}}
\quad \text{and} \quad
R = \frac{\sum_{j=1}^{d}\sum_{i=1}^{N_t} y_{ij}\, t_{ij}}{\sum_{j=1}^{d}\sum_{i=1}^{N_t} y_{ij}} \tag{12}
\]
where yi = FPSD(si) = [yi1, yi2, ..., yid] and ti = [ti1, ti2, ..., tid] are, respectively, the predicted decision vector obtained from the diagnostic system and the true decision vector from the test dataset S for a test case si, and yij, tij ∈ {0, 1} for all i, j. Substituting Eq. (12) into Eq. (11), the following F-measure F is obtained:
\[
F = \frac{2\sum_{j=1}^{d}\sum_{i=1}^{N_t} y_{ij}\, t_{ij}}{\sum_{j=1}^{d}\sum_{i=1}^{N_t} y_{ij} + \sum_{j=1}^{d}\sum_{i=1}^{N_t} t_{ij}} \in [0, 1] \tag{13}
\]
Generally, the higher the F-measure, the better the accuracy.
3.4.2. Data division
To assess the performance of the models, the 540 collected cases (sample dataset D) were first fuzzified using the membership functions in Section 3.2. After the corresponding set of truth vectors S = FZ(D) was obtained, S was divided into six groups for different purposes, namely TRAIN 1, TRAIN S, VALID 1, VALID S, TEST 1, and TEST S, as shown in Table 4. The suffix “1” denotes single-faults, while “S” denotes simultaneous-faults. For example, TRAIN 1 is the training data of single-faults and TEST S is the test data of simultaneous-faults.
3.4.3. Setup of different models
In each set of models, there are three different frameworks (SLC, MLC, and FPSD) and four different classifiers (NN, SVM, PNN, PSVM). Therefore, there are 12 combinations (or models) of frameworks and classifiers in total. As a reminder, in the SLC framework, artificial labels are created for simultaneous-faults so that only a single label is returned. As for the classifiers, SVM and PSVM can produce only a single output, a truth value μ or a probability ρ, respectively. The architecture of NN and PNN can be modified to produce a vector of d outputs for a d-fault problem. The 12 models are listed below, where s is a truth vector fuzzified from an unknown qualitative vector x, the scalar y is the predicted fault label of x (SLC models), the vector y is the vector of predicted fault labels of x (MLC and FPSD models), and j = 1 to d:
1. NN SLC(s) = [μ1, ..., μd]; y = argmax_j μj
2. PNN SLC(s) = [ρ1, ..., ρd]; y = argmax_j ρj
3. SVM SLC(s) = [SVM1(s), ..., SVMd(s)]; y = argmax_j (SVMj(s))
4. PSVM SLC(s) = [PSVM1(s), ..., PSVMd(s)]; y = argmax_j (PSVMj(s))
5. NN MLC(s) = [μ1, ..., μd]; y = [λ(μ1), ..., λ(μd)]
6. PNN MLC(s) = [ρ1, ..., ρd]; y = [ε(ρ1), ..., ε(ρd)]
7. SVM MLC(s) = [SVM1(s), ..., SVMd(s)]; y = [λ(SVM1(s)), ..., λ(SVMd(s))]
8. PSVM MLC(s) = [PSVM1(s), ..., PSVMd(s)]; y = [ε(PSVM1(s)), ..., ε(PSVMd(s))]
9. NN FPSD(s) = [μ1, ..., μd]; y = [λ*(μ1), ..., λ*(μd)]
10. PNN FPSD(s) = [ρ1, ..., ρd]; y = [ε*(ρ1), ..., ε*(ρd)]
11. SVM FPSD(s) = [SVM1(s), ..., SVMd(s)]; y = [λ*(SVM1(s)), ..., λ*(SVMd(s))]
12. PSVM FPSD(s) = [PSVM1(s), ..., PSVMd(s)]; y = [ε*(PSVM1(s)), ..., ε*(PSVMd(s))]
where μj and ρj are the fuzzy truth value and the probability of the jth fault, respectively. The thresholds λ, λ*, ε, and ε* are similar in form to Eq. (10). The values of λ and ε are user-defined, while λ* and ε* are optimized. For the SLC models (1–4), both datasets TRAIN 1 and TRAIN S were employed to train the classifiers because SLC is only suitable for SFD. Since no decision threshold is required in the SLC framework, the validation datasets were not used. In addition, there is no coupling strategy in SLC. For the MLC models (5–8), only TRAIN 1 was employed to train the classifiers according to the procedures discussed in Section 2.1. Based on the reference handbook [25], the threshold λ for defuzzification is set to 0.8, which is consistent with the fuzzy membership value for “Fault must exist”, and the probabilistic decision threshold ε is set to 0.5 according to usual practice. The coupling strategy used in MLC is one-vs.-all (OVA). For the FPSD models (9–12), only TRAIN 1 was employed as the training dataset. The proposed PCS was implemented as the coupling strategy. In addition, the best threshold for defuzzification λ* and the best probabilistic decision threshold ε* are optimized
Fig. 4. Diagnostic procedure of an unseen case.
respectively using PSO. The objective function of the PSO optimization is simply the F-measure over the validation datasets VALID 1 and VALID S. For each threshold optimization, PSO was run 10 times with different arbitrary initial populations, and the threshold producing the highest F-measure was returned as the optimal decision threshold. This procedure applies to both the defuzzification threshold and the probabilistic decision threshold ε*. In addition, the spread of all the PNNs was set to the default value of 1.0. A Gaussian kernel was selected for PSVM, with kernel width g = 1.0 and C = 1.0 according to usual practice. For the PSO parameters, the number of generations was set to 1000, the population size was 50, the inertia weight was 0.9, and both the cognitive and social parameters were set to 2 by referring to [2]. All the proposed methods and classifiers were implemented in MATLAB R2008a, and all the tests were conducted on a PC with an Intel Core i5 @ 3.2 GHz and 4 GB RAM.
4. Results and discussion
In Section 3, the application of the proposed FPSD and the corresponding evaluation methods were presented. This section provides the evaluation results to show the effectiveness of the proposed FPSD, together with an analysis of the results.
4.1. Effectiveness of the proposed FPSD
The effectiveness of the proposed framework is examined by comparing the F-measures of the 12 models in three respects, namely all faults (containing both single- and simultaneous-faults), purely single-faults, and purely simultaneous-faults. The results of the evaluation over all the test cases in TEST 1 and TEST S (all faults) are shown in Fig. 5. Fig. 5 shows that all classifiers under the SLC framework score the highest F-measures. This is reasonable because both single-fault and simultaneous-fault cases (TRAIN 1 and TRAIN S) were used for training in SLC, while the other two frameworks employed only single-fault cases (i.e., TRAIN 1).
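Since the comparison here rests entirely on the F-measure of Eq. (13), a short Python sketch of that computation may be useful. The function name `f_measure` and the toy decision vectors are assumptions for illustration; the formula applied is F = 2 Σ y_ij t_ij / (Σ y_ij + Σ t_ij), evaluated separately on all cases, on single-fault cases, and on simultaneous-fault cases, mirroring the three respects listed above.

```python
def f_measure(predicted, true):
    """Micro-averaged F-measure over binary decision vectors.

    F = 2 * sum(y_ij * t_ij) / (sum(y_ij) + sum(t_ij)); returns 0.0 when
    both sums are zero (a convention assumed here, not stated in the paper).
    """
    overlap = sum(y * t for ys, ts in zip(predicted, true) for y, t in zip(ys, ts))
    total = sum(map(sum, predicted)) + sum(map(sum, true))
    return 2.0 * overlap / total if total else 0.0


# Hypothetical test cases with d = 4 faults.
true = [
    [1, 0, 0, 0],  # single-fault
    [0, 1, 0, 0],  # single-fault
    [1, 0, 1, 0],  # simultaneous-fault
    [0, 1, 1, 1],  # simultaneous-fault
]
predicted = [
    [1, 0, 0, 0],  # exact match
    [0, 1, 1, 0],  # one spurious fault
    [1, 0, 0, 0],  # one missed fault (partial match)
    [0, 1, 1, 1],  # exact match
]

# Split the test cases by the number of true faults.
single = [i for i, t in enumerate(true) if sum(t) == 1]
simult = [i for i, t in enumerate(true) if sum(t) > 1]

f_all = f_measure(predicted, true)
f_single = f_measure([predicted[i] for i in single], [true[i] for i in single])
f_simult = f_measure([predicted[i] for i in simult], [true[i] for i in simult])
print(f_all, f_single, f_simult)
```

Unlike exact-match accuracy, the partially correct third case still contributes to the score, which is the reason given in Section 3.4.1 for preferring the F-measure.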
Among all frameworks, MLC performs worst for all classifiers because it employs the OVA strategy and its two decision thresholds were set according to usual practice. On the other hand, FPSD obtains higher
Fig. 5. Accuracies of both single and simultaneous-fault detection of various classification frameworks.
Fig. 6. Accuracies of single-fault detection of various classification frameworks.
F-measures because of the PCS, which considers the pairwise correlation among different faults. In addition, the PSO-optimized decision thresholds also result in higher F-measures. According to the results in Fig. 5, FPSD is generally about 3–5% worse than SLC in terms of F-measure under different classifiers, but it is 2–7% better than MLC. Note that FPSD employs the single-fault dataset only (i.e., TRAIN 1). In practice, single faults can combine in a large number of ways, and realistic combinations cannot simply be generated at random, so simultaneous-fault data are difficult to obtain and usually only single-fault data are available. Therefore, it is a significant achievement of the proposed framework that the costly simultaneous-fault training data can be eliminated. The results of the evaluation over the single-fault test cases (TEST 1) and the simultaneous-fault test cases (TEST S) are shown in Figs. 6 and 7, respectively. For the single-fault test results in Fig. 6, all FPSD classifiers perform very well (F-measure > 0.9) over TEST 1. In contrast, the SLC classifiers obtain the worst performance because they were trained with TRAIN 1 and TRAIN S but evaluated over TEST 1 only (not TEST S). In other words, the additional simultaneous-fault cases in TRAIN S introduce classification errors (or noise) into the SLC classifiers. The errors become obvious when the SLC classifiers are evaluated over TEST 1, in which some single-fault test cases are misclassified as simultaneous-faults. It can be expected that, when more simultaneous-fault labels are available, more classification errors over single-fault test cases will be incurred under SLC. For the simultaneous-fault diagnosis results in Fig. 7, SLC classifiers are the best because they were trained with the
Fig. 7. Accuracies of simultaneous-fault detection of various classification frameworks.
Fig. 8. Accuracies of various classifiers under OVA and PCS.
simultaneous-fault cases (TRAIN S), whereas MLC and FPSD used only the single-fault training dataset TRAIN 1. Nevertheless, FPSD classifiers are still generally better than MLC ones and not much worse than SLC classifiers. Again, this result verifies the effectiveness of FPSD: with single-fault training cases only, simultaneous-fault cases can be diagnosed satisfactorily.
4.2. Effectiveness of pairwise probabilistic classification
An important contribution of FPSD is the integration of PCS with probabilistic classification into MLC. To illustrate the effectiveness of PCS, a set of experiments was conducted using OVA and PCS in FPSD. In Fig. 8, ‘All OVA’ means that both single- and simultaneous-fault test cases are used under the OVA strategy, ‘Single OVA’ means that the test cases are purely single-faults under OVA, and ‘Simult OVA’ means that only simultaneous-fault test cases are used under OVA. ‘All PCS’, ‘Single PCS’, and ‘Simult PCS’ are defined in the same way under PCS. In Fig. 8, the F-measures under FPSD using OVA are represented by the 1st, 3rd and 5th bars, while the remaining bars represent the F-measures under FPSD using PCS. It can be seen that PCS generally outperforms the OVA strategy because a set of indecisive regions is formed under OVA [42], as illustrated in Fig. 9, while the PCS procedure minimizes this set of regions so that a more accurate diagnosis is achieved. Another important issue is the execution time of OVA and PCS. Intuitively, OVA seems to require much less training time because it trains only d classifiers Cj for a d-class problem, while PCS trains d(d − 1)/2 pairwise classifiers Cjk, as shown in Fig. 2. In fact, each Cjk takes only the training data with the jth and kth labels, while each Cj for OVA takes the training data of all labels.
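The size difference can be illustrated with rough numbers. The Python sketch below compares the number of classifiers and the per-classifier training-set size for OVA and PCS with d = 12 and the 231 cases of TRAIN 1. The assumptions that the cases are spread evenly over the faults and that training cost grows as the square of the training-set size are illustrative only and are not taken from the paper; the function name `ova_vs_pcs` is likewise hypothetical.

```python
def ova_vs_pcs(d, n_cases, cost_exponent=2):
    """Compare OVA and PCS under an assumed cost of (training-set size) ** cost_exponent.

    Assumes the single-fault cases are spread evenly over the d faults.
    """
    per_fault = n_cases / d
    # OVA: d classifiers, each trained on every case.
    ova = {"classifiers": d, "cases_each": n_cases}
    # PCS: d(d - 1)/2 classifiers, each trained on the cases of two faults only.
    pcs = {"classifiers": d * (d - 1) // 2, "cases_each": 2 * per_fault}
    for scheme in (ova, pcs):
        scheme["total_cost"] = scheme["classifiers"] * scheme["cases_each"] ** cost_exponent
    return ova, pcs


ova, pcs = ova_vs_pcs(d=12, n_cases=231)
ratio = pcs["total_cost"] / ova["total_cost"]
print(ova)
print(pcs)
print(ratio)
```

Under these assumptions PCS trains more classifiers but on much smaller subsets, so its total cost comes out lower; with a different cost model or an uneven class distribution the ratio would change.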
Note that the training time of a classifier (e.g., NN, PNN, SVM, and PSVM) is usually at least polynomial in the size of the training dataset. Therefore, the training time of a classifier Cjk in PCS is much shorter than that of Cj in OVA. In the application example, the training time of every kind of classifier (except NN) using OVA or PCS under FPSD is less than 1 s. Consequently, PCS takes almost the same amount
Fig. 9. Indecisive regions (shaded area) using one-vs.-all (left) and using pairwise coupling (right) [42].
Fig. 10. Accuracies of various classifiers with and without threshold optimization by PSO.
of training time as OVA, while PCS outperforms OVA in terms of F-measure. The superior performance of PCS over OVA has also been reported in [33].
4.3. Effectiveness of DTO
To show the effectiveness of the decision threshold optimization, another set of experiments was conducted under FPSD with (w) or without (w/o) decision threshold optimization for all the classifiers. For the cases without optimization, the defuzzification threshold and the probabilistic decision threshold ε are simply set to 0.8 and 0.5, respectively. The results are shown in Fig. 10, where the F-measures under FPSD without PSO are represented by the 1st, 3rd and 5th bars, while the remaining bars represent the F-measures under FPSD with PSO. With the aid of the decision threshold optimization, an improvement of 4–7% in F-measure can be achieved. Different applications, or even different sample datasets in the same application, may require a different decision threshold, so the threshold optimization procedure must be included in FPSD. In the case without threshold optimization, a threshold value is provided via a rule of thumb, which is generally good but not optimal. In fact, many machine learning algorithms such as NN or SVM implicitly use a fixed decision threshold (e.g., 0 in NN, ±1 in SVM, 0.5 in PNN and PSVM) for classification of the raw (probabilistic) decision value. The threshold is either predetermined inside the learning algorithm or explicitly defined by the user, which may degrade the generalization to unseen test data. In FPSD, the threshold is separately optimized over an independent validation set. This may be the reason why the F-measure with the PSO-optimized threshold outperforms the one with a fixed threshold, as shown in Fig. 10.
4.4. Discussion of results
Overall, all the diagnostic methods using PSVM are superior to those using PNN, NN and SVM. Thus, PSVM is the most suitable classifier for this application domain.
To better understand the pros and cons of FPSD against SLC and MLC, a summary is provided in Table 5. It can be seen that the performance of SLC is actually the best, but a set of 2^d − (d + 1) additional simultaneous-fault classifiers has to be trained, which may further degrade the performance for unseen single-fault cases. In contrast, FPSD achieves an accuracy similar to that of SLC for all single-fault and simultaneous-fault test cases while requiring far fewer classifiers. Moreover, an unexpected phenomenon of improved accuracy for unseen single-fault cases was observed under FPSD. Verifying whether FPSD performs better than SLC in general may be considered as one of the future studies. Another important issue is the expandability of the classification system, i.e., can the classification system be easily adapted to handle new faults (or labels)? For SLC, it is hard to handle new labels
after the diagnostic system is set up, since many new combinations of labels must be produced to include the new faults. Moreover, the whole classification system must be retrained. On the other hand, MLC and FPSD can be adapted to handle new labels easily, since no new combination of labels is produced and only a few individual classifiers Cj need to be added.

Table 5 Comparisons of FPSD against SLC and MLC.

| Framework | Training dataset | Strategy | Number of classifiers | Training time | Decision threshold optimization | Expandability | Accuracy for test cases: All | Accuracy for test cases: Single | Accuracy for test cases: Simultaneous |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SLC | TRAIN 1 & TRAIN S | No | 2^d | Very long | No | Low | Best | Good | Best |
| MLC | TRAIN 1 | OVA | d | Short | No | High | Good | Very good | Good |
| FPSD | TRAIN 1 | PCS | d(d − 1)/2 | Short | Yes | High | Very good | Best | Very good |

5. Conclusions
In this paper, a new FPSD framework is proposed for automotive engine simultaneous-fault diagnosis based on qualitative symptom descriptions from the vehicle owner. The proposed FPSD is composed of three modules: FZ, PPMLC, and DT. The PPMLC module is an integration of PCS, MLC, and a probabilistic classifier. This new integration needs only single-fault data for training. Nevertheless, the resultant PPMLC module not only diagnoses single-faults accurately, but also detects simultaneous-faults satisfactorily. With FPSD, the non-trivial and challenging task of engine simultaneous-fault diagnosis based on qualitative symptom descriptions can be effectively resolved. Although the traditional SLC can obtain the most accurate diagnosis of simultaneous-faults, it is practically prohibitive in terms of the amount of simultaneous-fault data and the number of classifiers. Experimental results show that, even though simultaneous-fault data were not used for training, FPSD is only 3–5% worse than traditional SLC in terms of F-measure. Moreover, compared to traditional MLC, FPSD improves the fault diagnostic performance by up to 7% in terms of F-measure. These results verify the significance of the proposed FPSD framework. In addition, it is found that FPSD can diagnose single-faults more accurately than SLC under a simultaneous-diagnostic environment. Practically, FPSD is suitable for automotive service centers or workshops operating under constraints on expensive equipment. This kind of simultaneous-fault diagnostic problem for automotive service centers or workshops has not yet been addressed in the open literature. Besides, apart from automotive engine diagnosis, it is strongly believed that FPSD is applicable to other application areas such as automotive chassis simultaneous-fault diagnosis.

Acknowledgements
The research is co-supported by FDCT Macau SAR, grant number FDCT/075/2013/A, and the University of Macau Research Grant, grant numbers MYRG075(Y2-L2)-FST12-VCM, MYRG075(Y1-L2)-FST13-VCM, and MYRG2014-00178-FST.

References
[1] L. Dinca, T. Aldemir, G. Rizzoni, A model-based probabilistic approach for fault detection and identification with application to the diagnosis of automotive engines, IEEE Trans. Automat. Control 44 (1999) 2200–2205.
[2] P.K. Wong, L.M. Tam, K. Li, C.M. Vong, Engine idle-speed system modelling and control optimization using artificial intelligence, Proceedings of the Institution of Mechanical Engineers, Part D, J. Automob. Eng. 224 (2010) 55–72.
[3] H. Schweppe, A. Zimmermann, D. Grill, Flexible on-board stream processing for automotive sensor data, IEEE Trans. Ind. Inform. 6 (2010) 81–92.
[4] J. Mohammadpour, M. Franchek, K. Grigoriadis, A survey on diagnostic methods for automotive engines, Int. J. Eng. Res. 13 (2012) 41–64.
[5] A. Azarian, A. Siadat, A global modular framework for automotive diagnosis, Adv. Eng. Inform. 26 (2012) 131–144.
[6] Q.R. Butt, A.I. Bhatti, M.R. Mufti, M.A. Rizvi, I. Awan, Modeling and online parameter estimation of intake manifold in gasoline engines using sliding mode observer, Simul. Model. Pract. Theory 32 (2013) 138–154.
[7] J. Luo, K.R. Pattipati, L. Qiao, S. Chigusa, An integrated diagnostic development process for automotive engine control systems, IEEE Trans. Syst. Man Cybern. Part C: Appl. Rev. 37 (2007) 1163–1173.
[8] F. Cruz-Peragon, F.J. Jimenez-Espadafor, J.M. Palomar, M.P. Dorado, Combustion fault diagnosis in internal combustion engines using angular speed measurements and artificial neural networks, Energy Fuels 22 (2008) 2972–2980.
[9] X. Wang, U. Kruger, G.W. Irwin, G. McCullough, N. McDowell, Nonlinear PCA with the local approach for diesel engine fault detection and diagnosis, IEEE Trans. Control Syst. Technol. 16 (2008) 122–129.
[10] J.D. Wu, C.H. Liu, An expert system for fault diagnosis in internal combustion engines using wavelet packet transform and neural network, Expert Syst. Appl. 36 (2009) 4278–4286.
[11] K. Choi, S. Singh, A. Kodali, et al., Novel classifier fusion approaches for fault diagnosis in automotive systems, IEEE Trans. Instrum. Meas. 58 (2009) 602–611.
[12] M.H. Wang, K.H. Chao, W.T. Sung, G.J. Huang, Using ENN-1 for fault recognition of automotive engine, Expert Syst. Appl. 37 (2010) 2943–2947.
[13] C.M. Vong, P.K. Wong, Engine ignition signal diagnosis with wavelet packet transform and multi-class least squares support vector machines, Expert Syst. Appl. 38 (2011) 8563–8570.
[14] C.M. Vong, P.K. Wong, W.F. Ip, Case-based expert system using wavelet packet transform and kernel-based feature manipulation for engine ignition system diagnosis, Eng. Appl. Artif. Intell. 24 (2011) 1281–1294.
[15] C.M. Vong, P.K. Wong, W.F. Ip, A new framework of simultaneous-fault diagnosis using pairwise probabilistic multi-label classification for time-dependent patterns, IEEE Trans. Ind. Electron. 60 (2013) 3372–3385.
[16] I. Morgan, H.H. Liu, B. Tormos, A. Sala, Detection diagnosis of incipient faults in heavy-duty diesel engines, IEEE Trans. Ind. Electron. 57 (2010) 3522–3532.
[17] H.L. Gelgele, K. Wang, An expert system for engine fault diagnosis: development and application, J. Intell. Manuf. 9 (1998) 539–545.
[18] Y. Lu, T.Q. Chen, B. Hamilton, A fuzzy system for automotive fault diagnosis fast rule generation and self-tuning, IEEE Trans. Vehicular Technol. 49 (2000) 651–660.
[19] M.B. Çelik, R. Bayir, Fault detection in internal combustion engines using fuzzy logic, Proceedings of the Institution of Mechanical Engineers, Part D, J. Automob. Eng. 221 (2007) 579–587.
[20] M. D’Angelo, R. Palhares, R. Takahashi, et al., In-vehicle network level fault diagnostics using fuzzy inference systems, Appl. Soft Comput. 11 (2011) 179–192.
[21] B. Li, P. Zhang, D. Liu, S. Mi, P. Liu, Novel classification method for sensitive problems and uneven datasets based on neural networks and fuzzy logic, Appl. Soft Comput. 11 (2011) 5299–5305.
[22] N.R. Sakthivel, V. Sugumaran, B. Nair, Automatic rule learning using roughset for fuzzy classifier in fault categorization of mono-block centrifugal pump, Appl. Soft Comput. 12 (2012) 196–203.
[23] A.P. Rotshtein, H.B. Rakytyanska, Fuzzy evidence in identification, in: Forecasting and Diagnosis, Springer-Verlag Berlin, Berlin, 2012.
[24] A. Azarian, A. Siadat, P. Martin, A new strategy for automotive offboard based on a meta-heuristic engine, Eng. Appl. Artif. Intell. 24 (2011) 733–747.
[25] G.Y. Li, Application on Intelligent Control and MATLAB to Electronically Controlled Engines, Publishing House of Electronics Industry, Beijing, China, 2007.
[26] L. Pan, Y. Tong, N. Ning, A. Chen, Application of fuzzy neural network in fault diagnosis of gasoline engine, in: The Ninth International Conference on Electronic Measurement & Instruments, IEEE, Beijing, 2009, pp. 602–605.
[27] G. Liang, Q. Wang, J. Wang, J. Song, Application for diesel engine in fault diagnose based on fuzzy neural network and information fusion, in: IEEE 3rd International Conference on Communication Software and Networks, Xi’an, China, 2011, pp. 102–105.
[28] H. Li, X. Ma, Y. He, Diesel fault diagnosis technology based on the theory of fuzzy neural network information fusion, in: The Sixth International Conference of Information Fusion, Cairns, Queensland, Australia, 2003, pp. 1394–1410.
[29] J.M.F. Calado, J.M.G.S.d. Costa, A hierarchical fuzzy neural network approach for multiple fault diagnosis, in: UKACC International Conference on CONTROL’ 98, 1998, pp. 1498–1503.
[30] R. Eslamloueyan, M. Shahrokhi, R. Bozorgmehri, Multiple simultaneous fault diagnosis via hierarchical and single artificial neural networks, Sci. Iran. 10 (2003) 300–310.
[31] R. Eslamloueyan, Designing a hierarchical neural network based on fuzzy clustering for fault diagnosis of the Tennessee–Eastman process, Appl. Soft Comput. 11 (2011) 1407–1415.
[32] I. Yélamos, M. Graells, L. Puigjaner, G. Escudero, Simultaneous fault diagnosis in chemical plants using a multilabel approach, Am. Inst. Chem. Eng. J. 53 (2007) 2871–2884.
[33] T. Hastie, R. Tibshirani, Classification by pairwise coupling, Ann. Stat. 26 (1998) 451–471.
[34] T.F. Wu, C.J. Lin, R.C. Weng, Probability estimates for multi-class classification by pairwise coupling, J. Mach. Learn. Res. 5 (2004) 975–1005.
[35] E. Mencía, S.H. Park, J. Fürnkranz, Efficient voting prediction for pairwise multilabel classification, Neurocomputing 73 (2010) 1164–1176.
[36] K. Mathioudakis, C. Romessis, Probabilistic neural network for validation of onboard jet engine data, Proceedings of the Institution of Mechanical Engineers, Part G, J. Aerosp. Eng. 218 (2004) 59–72.
[37] E. Oliveira, P.M. Ciarelli, A. Souza, C. Badue, Using a probabilistic neural network for a large multi-label problem, in: The 10th Brazilian Symposium on Neural Networks, Salvador, Brazil, 2008, pp. 195–200.
[38] J. Platt, Probabilistic outputs for support vector machines and comparisons to regularized likelihood methods, in: A.J. Smola, P. Bartlett, B. Schölkopf, D. Schuurmans (Eds.), Advances in Large Margin Classifiers, MIT Press, London, 1999, pp. 61–74.
[39] J. Kennedy, R. Eberhart, Particle swarm optimization, in: IEEE International Conference on Neural Networks, 1995, pp. 1942–1948.
[40] C.J.V. Rijsbergen, Information Retrieval, Butterworths, London, 1979.
[41] J.M. Tague, Information retrieval experiment, in: K.S. Jones (Ed.), The Pragmatics of Information Retrieval Experimentation, Butterworths, London, 1998, pp. 59–102.
[42] S. Abe, Support Vector Machines for Pattern Classification, 2nd ed., Springer Verlag, London, 2010.