Special Issue Article
Improved probabilistic neural networks with self-adaptive strategies for transformer fault diagnosis problem
Advances in Mechanical Engineering 2016, Vol. 8(1) 1–13
© The Author(s) 2016
DOI: 10.1177/1687814015624832
aime.sagepub.com

Creative Commons CC-BY: This article is distributed under the terms of the Creative Commons Attribution 3.0 License (http://www.creativecommons.org/licenses/by/3.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
Jiao-Hong Yi1, Jian Wang1 and Gai-Ge Wang2,3,4

1 School of Environmental Science and Spatial Informatics, China University of Mining and Technology, Xuzhou, China
2 School of Computer Science and Technology, Jiangsu Normal University, Xuzhou, China
3 Institute of Algorithm and Big Data Analysis, Northeast Normal University, Changchun, China
4 School of Computer Science and Information Technology, Northeast Normal University, Changchun, China

Corresponding author: Jian Wang, School of Environmental Science and Spatial Informatics, China University of Mining and Technology, Xuzhou, Jiangsu, 221116, China. Email: [email protected]
Abstract

Probabilistic neural network has successfully solved all kinds of engineering problems in various fields since it was proposed. In probabilistic neural network, the parameter Spread has a great influence on performance, and the network will generate poor prediction results if Spread is improperly selected; selecting the optimal Spread manually is difficult. In this article, a variant of probabilistic neural network with a self-adaptive strategy, called self-adaptive probabilistic neural network, is proposed. In self-adaptive probabilistic neural network, Spread is self-adaptively adjusted and selected, and the best selected Spread is then used to guide the training and testing of the self-adaptive probabilistic neural network. In addition, two simplified strategies are incorporated into the proposed self-adaptive probabilistic neural network with the aim of further improving its performance, yielding two simplified versions (simplified self-adaptive probabilistic neural networks 1 and 2). The variants of self-adaptive probabilistic neural networks are further applied to solve the transformer fault diagnosis problem. By comparing them with the basic probabilistic neural network, the traditional back propagation, extreme learning machine, general regression neural network, and self-adaptive extreme learning machine, the results experimentally show that self-adaptive probabilistic neural networks deliver more accurate predictions and better generalization performance when addressing the transformer fault diagnosis problem.

Keywords

Classification, self-adaptation, probabilistic neural network, back propagation, general regression neural network, extreme learning machine, fault diagnosis
Date received: 5 August 2015; accepted: 27 November 2015
Academic Editor: Siamak Talatahari
Introduction

Fault diagnosis (FD)1,2 originated with the diagnosis of mechanical equipment. As the technical level and complexity of modern equipment increase, the effects of equipment failure on production also increase significantly. Therefore, in order to ensure that equipment is reliable, operates effectively, and realizes its full effectiveness, it is necessary to develop FD technology. In FD, the state information of facilities in operation or in relatively static conditions is fully investigated through the technology of modern testing,
monitoring and computer analysis, and other methods. Then, the technical status of the equipment is analyzed, and the nature and cause of the faults are determined, with the aim of forecasting the trend of faults and preparing the necessary countermeasures. FD technology can be used to diagnose fault symptoms and causes. It helps eliminate faults and safety hazards as soon as possible and avoid unnecessary loss, so it has high economic and social benefits.

Probabilistic neural network (PNN), first proposed by Specht3 in 1990, is a kind of parallel algorithm. It is developed based on Bayes classification rules and the Parzen window probability density function (PDF) estimation method. PNN is a widely used artificial neural network (NN) with a simple structure. In practical applications, especially classification problems, its advantage is that the linear learning algorithm in PNN achieves the results of nonlinear learning algorithms while preserving the high accuracy of nonlinear algorithms. In PNN, the network weights model the sample distribution, so the network does not need training. Therefore, it can meet real-time processing requirements.

Although PNN has the various advantages mentioned above, it has an inherent shortcoming. One of its most important parameters is Spread, which has a great influence on PNN's performance. Spread, also called the smoothing parameter, is the standard deviation (SD) of the underlying Gaussian distribution, and it determines the receptive width of the Gaussian window for the PDF of the training set.4 If Spread is improperly selected, it cannot guide PNN's training; hence, PNN will generate poor prediction results.4

In this article, a self-adaptive strategy is incorporated into PNN, and a self-adaptive probabilistic neural network (SaPNN) is proposed. In SaPNN, the best Spread is self-adaptively selected at all times; thus, SaPNN can always attain the best prediction accuracy. In addition, in order to reduce the computational requirements of SaPNN, two simplified strategies are combined with the proposed SaPNN, and two simplified variants (SSaPNN1 and SSaPNN2) are proposed. These two simplified SaPNNs reach the same prediction accuracy as SaPNN with lower computational requirements. In the SaPNN variants, Spread is selected self-adaptively; therefore, there are no parameters to adjust in the training process. The transformer FD problem is addressed by the SaPNN variants. By comparing them with the basic PNN and the traditional back propagation (BP),5,6 extreme learning machine (ELM),7–10 general regression neural network (GRNN),11,12 and self-adaptive extreme learning machine (SaELM),13 the results experimentally show that SaPNNs have a more accurate prediction rate and better generalization performance when addressing the transformer FD problem.
The remainder of this article is organized as follows: section ''Preliminaries'' reviews the related preliminaries, including the transformer FD problem and PNN. Sections ''SaPNN'' and ''SaPNN model for transformer FD problem'' present the framework of the SaPNN variants and the detailed classification using the three SaPNNs. In section ''Simulation results,'' a series of comparison experiments on the FD problem is conducted. Section ''Conclusion'' provides our concluding remarks and points out future work.
Preliminaries

Transformer FD

Abnormal conditions or information will be generated when a transformer gets out of order to a certain degree. Fault analysis collects the abnormal phenomena or information of the transformer, which are then analyzed. According to the analyses of these phenomena or information, the type of fault, its severity, and the fault location are determined. Therefore, the purpose of transformer FD is first to correctly determine whether the equipment is currently in a normal or abnormal state. If the transformer is in an abnormal state, the nature, type, and cause of the fault are judged. The fault may be an insulation fault, an overheating fault, or a mechanical failure. If it is an insulation fault, it may be insulation aging, moisture, or a discharge fault; if it is a discharge fault, the type of discharge should be further determined. Transformer FD also predicts the possible development of a fault according to the fault information or its processed results. That is to say, the fault severity and development trend should also be diagnosed. Subsequently, measures are put forward to control, prevent, and eliminate the fault. Reasonable methods of equipment maintenance and corresponding anti-accident measures are also put forward, and improvement suggestions are proposed with respect to equipment design, manufacture, and assembly, which can provide a scientific basis and suggestions for the modernization of equipment.

Analysis of dissolved gas in transformer oil is an important method for the diagnosis of transformer internal faults. In China, the improved three-ratio method14 is widely used in a large number of applications at present. However, the three-ratio method as the discrimination criterion for transformer FD has two problems: a coding defect and a critical-value criterion defect.14

Various methods have been proposed to solve FD problems. Wang et al.15 put forward a novel nearest prototype classifier to diagnose faults in a power plant, where an improved particle swarm optimization (PSO) was used to optimize the positions of the prototypes. Sun et al.16 put forward an enhanced-accuracy
FD method for the Smart Grid via rough sets together with genetic algorithm (GA) and Tabu search (TS). After the simulated fault and system data were generated, Zhao et al.17 formulated the FD problem as an optimization problem, which was then solved by an improved differential evolution (DE). Tang et al.18 established a multi-fault classification model via the support vector machine (SVM) trained by chaotic PSO that was applied to the FD of rotating machines. Fault diagnosis and isolation (FDI) on industrial systems was formulated as an optimization problem, which was then addressed by PSO and ant colony optimization (ACO) algorithms.19 By integrating the theory of fuzzy sets, pairwise probabilistic multi-label classification, and decision-by-threshold, a new framework for simultaneous FD, called fuzzy and probabilistic simultaneous fault diagnosis (FPSD), was proposed by Vong et al.20 Xia et al.21 put forward a multi-objective unsupervised feature selection algorithm (MOUFSA) that was verified on nine UCI datasets and five fault recognition datasets. Fathabadi22 proposed a soft computing method via discrete wavelet transform and a hardware method via two-stage finite impulse response filtering to detect short-circuit faults in power transmission lines. Qin et al.23 proposed a method to recognize power cable fault types via an annealed chaotic competitive learning network. Kang et al.24 put forward a highly reliable FD scheme for incipient low-speed rolling element bearing failures, in which a binary bat algorithm (BBA)25 is used to filter discriminative fault features.

Another representative paradigm for solving the FD problem is NNs. An NN is an effective and efficient problem-solving algorithm that has been successfully used in several practical problems. In order to improve the performance of NNs, some state-of-the-art intelligent algorithms have been incorporated into them, such as GA,26 the shuffled frog leaping algorithm (SFLA),27 and PSO.28 Artificial NNs have the advantages of distributed parallel processing, self-adaptation, self-learning, associative memory, and nonlinear mapping, which has opened up a new way to solve this problem. In our current work, three improved versions of PNN in combination with the three-ratio method are used to deal with the transformer FD problem, as detailed in the following sections.
PNN

PNN is a kind of feedforward network developed from the radial basis function network, and its theoretical basis is the minimum Bayesian risk criterion (i.e. Bayesian decision theory). PNN, as a kind of radial basis function NN, is suitable for pattern classification. When the value of Spread is close to 0, it constitutes a nearest neighbor classifier. When the value of Spread is
large, it constitutes a nearest neighbor classifier over several training samples.

The model of PNN is composed of four layers: input layer, pattern layer, summation layer, and output layer; its basic structure is shown in Figure 1.

Figure 1. Structure of PNN.

The input layer receives the values of the training sample and transfers the feature vector to the network. The number of neurons in the input layer is equal to the dimension of the sample vector. The matching relation between the feature vector and each pattern in the training set is calculated in the pattern layer, and the number of neurons in the pattern layer is equal to the total number of training samples over all categories. Thus, the output of each pattern neuron in this layer can be given as
$$f(X, W_i) = \exp\left[-\frac{(X - W_i)^T (X - W_i)}{2\delta^2}\right] \qquad (1)$$
where $W_i$ is the weight vector between the input layer and the pattern layer, and $\delta$ is the smoothing factor, which plays a vital role in the classification problem.

The third layer is the summation layer, which accumulates the probability of each class according to equation (1), yielding the PDF of each fault mode. Each class has only one summation unit, which is connected only to the pattern-layer units of its own class and not to the other units in the pattern layer. Therefore, the summation layer simply adds the outputs of the pattern neurons of its own class, independently of the outputs of pattern-layer units belonging to other classes. The output of the summation layer is proportional to the kernel-based probability density, and normalizing it yields the probability estimate of each class at the output layer.

The output (decision-making) layer is composed of a simple threshold discriminator, whose role is to choose, among the probability density estimates of the various fault modes, the neuron with the maximum probability density as the output of the whole system. Each neuron in the output layer is a competitive neuron corresponding to one data type (i.e. fault mode). The number of neurons in the output layer is equal to the number of classes of the training sample data. The output layer receives all PDFs from the summation layer; the output of the neuron with the largest PDF is 1, that is, the corresponding class is the recognized pattern class, while the outputs of the other neurons are 0.

The FD method based on PNN is a widely accepted decision-making method in probability statistics. It can be described as follows. Suppose there are two known fault modes $\theta_A$ and $\theta_B$ with regard to a fault sample $X = (x_1, x_2, \ldots, x_n)$ to be judged:

If $h_A l_A f_A(X) > h_B l_B f_B(X)$ holds, then $X \in \theta_A$.
If $h_A l_A f_A(X) < h_B l_B f_B(X)$ holds, then $X \in \theta_B$.

Here $h_A$ and $h_B$ are, respectively, the prior probabilities of the fault modes $\theta_A$ and $\theta_B$ ($h_A = N_A/N$, $h_B = N_B/N$); $N_A$ and $N_B$ are, respectively, the numbers of training samples for fault modes $\theta_A$ and $\theta_B$, and $N$ is the total number of training samples. $l_A$ is the cost factor when $X$ is classified into $\theta_A$ while it belongs to $\theta_B$ in reality; similarly, $l_B$ is the cost factor when $X$ is classified into $\theta_B$ while it belongs to $\theta_A$ in reality. $f_A(X)$ and $f_B(X)$ are the PDFs of the fault modes $\theta_A$ and $\theta_B$, respectively; they usually cannot be obtained exactly, and only their statistical estimates are available from the existing fault samples.

In 1962, Parzen29 put forward a method for estimating a PDF from a known random sample. In this method, if the sample size is large enough, the estimate continuously and smoothly approaches the original PDF. The PDF estimate obtained by the Parzen method is as follows

$$f_A(X) = \frac{1}{(2\pi)^{P/2} \delta^P m} \sum_{i=1}^{m} \exp\left[-\frac{(X - W_{ai})^T (X - W_{ai})}{2\delta^2}\right] \qquad (2)$$

where $m$ is the number of training samples of the fault mode $\theta_A$, $P$ is the dimension of the sample vector, and $\delta$ is the smoothing parameter, which determines the width of the bell-shaped curve centered at each sample point.
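To make equations (1) and (2) and the decision rule concrete, the following Python sketch (ours, not the authors' MATLAB implementation; the toy data, function names, and the assumption of equal cost factors are ours) classifies a sample by evaluating the Parzen-window PDF of each fault class and selecting the class with the largest prior-weighted density:

```python
import numpy as np

def class_pdf(x, W, delta):
    """Parzen estimate of one class PDF at x, as in equation (2).

    W     : (m, p) array of training samples of one fault class
    delta : smoothing parameter (Spread)
    """
    m, p = W.shape
    d2 = np.sum((x - W) ** 2, axis=1)           # (X - W_i)^T (X - W_i)
    kernel = np.exp(-d2 / (2.0 * delta ** 2))   # pattern-layer outputs, eq. (1)
    return kernel.sum() / ((2.0 * np.pi) ** (p / 2) * delta ** p * m)

def pnn_predict(x, classes, delta):
    """Summation + output layer: argmax of prior-weighted class PDFs
    (equal cost factors l_A = l_B assumed)."""
    n_total = sum(len(W) for W in classes.values())
    scores = {label: (len(W) / n_total) * class_pdf(x, W, delta)
              for label, W in classes.items()}
    return max(scores, key=scores.get)

# Toy example with two fault modes (illustrative data only).
classes = {"A": np.array([[0.1, 0.2], [0.2, 0.1]]),
           "B": np.array([[0.9, 0.8], [0.8, 0.9]])}
print(pnn_predict(np.array([0.15, 0.15]), classes, delta=0.5))  # -> "A"
```

With δ (Spread) = 0.5 the toy sample is assigned to class ''A''; shrinking δ toward 0 recovers the nearest-neighbor behavior noted above.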
SaPNN

As discussed before, the selection of Spread for the basic PNN is a difficult problem. In the present work, a self-adaptive strategy and two simplification strategies are proposed in order to select the best Spread. The frameworks of the three improved PNNs are described in the following.
SaPNN

First, a self-adaptive strategy is incorporated into the basic PNN, and the self-adaptive PNN (SaPNN) is obtained. In SaPNN, the best Spread is used to form the network and to guide the prediction at all times. The framework of SaPNN is shown in Algorithm 1.

Algorithm 1: SaPNN algorithm
Begin
  Step 1: Initialization. Set the minimum and maximum of Spread, Spreadmin and Spreadmax; set the maximal generation MaxIter and the current generation counter G = 1; initialize the interval of Spread, SpreadInterval.
  Step 2: While SpreadInterval is not less than 0.1 do
    Calculate all the possible Spread values (say Spread1, ..., SpreadM) according to Spreadmin, Spreadmax, and SpreadInterval.
    for j = 1:M (all the possible Spread values between Spreadmin and Spreadmax) do
      While G < MaxIter do
        Construct the basic PNN with Spreadj.
        Train and predict as shown in section ''PNN.''
        Evaluate the prediction results.
        G = G + 1.
      end while
    end for j
    Select the best Spread, Spreadbest, with minimum error.
    Update Spreadmin, Spreadmax, and SpreadInterval according to equations (4)-(6).
  Step 3: end while
  Step 4: Output the best Spread and prediction accuracy.
End.

In Algorithm 1, Spreadmin and Spreadmax indicate the minimum and maximum of Spread, and SpreadInterval is the interval of Spread. That is to say, Spread1 = Spreadmin, Spread2 = Spreadmin + SpreadInterval,
Spread3 = Spreadmin + 2 × SpreadInterval, ..., Spreadj = Spreadmin + (j − 1) × SpreadInterval, ..., SpreadM = Spreadmax (j = 1, 2, ..., M). Here, M is the number of Spread values, given as

$$M = \mathrm{round}\left(\frac{Spread_{max} - Spread_{min}}{SpreadInterval}\right) + 1 \qquad (3)$$

where round(x) rounds each element of x to the nearest integer. For each Spread (Spreadj, j = 1, 2, ..., M), SaPNN is formed and performs the training and testing process. In order to reduce the influence of randomness, MaxIter independent runs are implemented for each Spreadj. After that, the best Spread, Spreadbest, with the minimum error is selected. Spreadmin, Spreadmax, and SpreadInterval are then adjusted according to Spreadbest, as shown in equations (4)-(6)

$$Spread_{min} = \max(Spread_{best} - SpreadInterval,\; Spread_{min}) \qquad (4)$$

$$Spread_{max} = \min(Spread_{best} + SpreadInterval,\; Spread_{max}) \qquad (5)$$

$$SpreadInterval = 0.1 \times \mathrm{round}\left(\frac{SpreadInterval \times 10}{2}\right) \qquad (6)$$

In general, the minimal Spread is 0.1, so SpreadInterval is updated as in equation (6). When SpreadInterval is less than 0.1, SaPNN stops and outputs the best Spread Spreadbest and the final best prediction accuracy for the case at hand.
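A minimal Python sketch of the self-adaptive search in Algorithm 1 follows; it is an illustration under our own naming, with evaluate left as a stub that should build a PNN with the given Spread and return its prediction error (averaged over MaxIter independent runs in the paper):

```python
def sapnn_search(evaluate, s_min=0.1, s_max=4.9, interval=0.8):
    """Self-adaptive Spread selection as in Algorithm 1.

    evaluate(spread) -> prediction error of a PNN built with that Spread.
    """
    while True:
        # All candidate Spreads on the current grid, equation (3).
        m = int(round((s_max - s_min) / interval)) + 1
        candidates = [round(s_min + j * interval, 1) for j in range(m)]
        # Pick the Spread with minimum error on this grid.
        best = min(candidates, key=evaluate)
        if interval <= 0.1:          # finest grid reached: stop
            return best
        # Shrink the search window around the best Spread, eqs. (4)-(6).
        s_min = max(best - interval, s_min)
        s_max = min(best + interval, s_max)
        interval = 0.1 * round(interval * 10 / 2)
```

On the paper's setting (Spreadmin = 0.1, Spreadmax = 4.9, initial SpreadInterval = 0.8), this visits grids of 7, 5, 5, and 5 candidates, that is, 22 PNN evaluations in total, matching the count reported in section ''Simulation results.''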
SSaPNN1

As mentioned above, in SaPNN, Spreadmin, Spreadmax, and SpreadInterval are updated many times. In order to reduce the number of updates and accelerate convergence, a simplified version of SaPNN, called SSaPNN1, is proposed. In SSaPNN1, Spreadmin, Spreadmax, and SpreadInterval are updated only once. At iteration 2, SpreadInterval is set to 0.1 directly after Spreadmin and Spreadmax are updated according to equations (4) and (5). After that, SSaPNN1 stops and outputs the best Spread Spreadbest and the final best prediction accuracy for the case at hand. Its framework is given in Algorithm 2.

Algorithm 2: SSaPNN1 algorithm
Begin
  Step 1: Initialization. Set the minimum and maximum of Spread, Spreadmin and Spreadmax; set the maximal generation MaxIter, the current iteration counter t = 1, and the current generation counter G = 1; initialize the interval of Spread, SpreadInterval.
  Step 2: While t ≤ 2 do
    Calculate all the possible Spread values (say Spread1, ..., SpreadM) according to Spreadmin, Spreadmax, and SpreadInterval.
    for j = 1:M (all the possible Spread values between Spreadmin and Spreadmax) do
      While G < MaxIter do
        Construct the basic PNN with Spreadj.
        Train and predict as shown in section ''PNN.''
        Evaluate the prediction results.
        G = G + 1.
      end while
    end for j
    Select the best Spread, Spreadbest, with minimum error.
    Update Spreadmin and Spreadmax according to equations (4) and (5).
    SpreadInterval = 0.1.
    t = t + 1.
  Step 3: end while
  Step 4: Output the best Spread and prediction accuracy.
End.
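Under the same assumptions (evaluate as a stub, names ours), SSaPNN1 reduces to exactly two passes, a coarse one at the initial interval and a fine one at step 0.1:

```python
def ssapnn1_search(evaluate, s_min=0.1, s_max=4.9, interval=0.8):
    """SSaPNN1: one coarse pass, then one fine pass at step 0.1 (Algorithm 2)."""
    for step in (interval, 0.1):                 # exactly two iterations
        m = int(round((s_max - s_min) / step)) + 1
        candidates = [round(s_min + j * step, 1) for j in range(m)]
        best = min(candidates, key=evaluate)
        s_min = max(best - step, s_min)          # equation (4)
        s_max = min(best + step, s_max)          # equation (5)
    return best
```

On the paper's setting this costs 7 + 17 = 24 evaluations, as reported in section ''Simulation results.''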
SSaPNN2

As mentioned in section ''SaPNN,'' the number of PNN evaluations at each iteration is equal to the number of Spread values M. However, some of the Spread values at iteration 2 were already evaluated at iteration 1, so there is no need to evaluate them again; instead, their prediction accuracies from iteration 1 can be reused directly at iteration 2. The computational requirements of SaPNN can thus be further reduced, and another simplified version of SaPNN, called SSaPNN2, is proposed. Its framework is given in Algorithm 3.

Algorithm 3: SSaPNN2 algorithm
Begin
  Step 1: Initialization. Set the minimum and maximum of Spread, Spreadmin and Spreadmax; set the maximal generation MaxIter, the current iteration counter t = 1, and the current generation counter G = 1; initialize the interval of Spread, SpreadInterval.
  Step 2: While SpreadInterval is not less than 0.1 do
    Calculate all the possible Spread values according to Spreadmin, Spreadmax, and SpreadInterval.
    Remove the Spread values already evaluated at earlier iterations (for t ≥ 2); the remaining values are Spread1, ..., SpreadM.
    for j = 1:M (all the possible Spread values) do
      While G < MaxIter do
        Construct the basic PNN with Spreadj.
        Train and predict as shown in section ''PNN.''
        Evaluate the prediction results.
        G = G + 1.
      end while
    end for j
    Select the best Spread, Spreadbest, with minimum error.
    Update Spreadmin, Spreadmax, and SpreadInterval according to equations (4)-(6).
    t = t + 1.
  Step 3: end while
  Step 4: Output the best Spread and prediction accuracy.
End.

At iteration 1, SSaPNN2 has the same implementation process as SaPNN, while at later iterations the Spread values already evaluated are removed. This operation significantly reduces the computational requirements without losing the character of SaPNN. The process is repeated until SpreadInterval is less than 0.1. Finally, SSaPNN2 outputs the best Spread Spreadbest and the best prediction accuracy for the given problem.
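The dedup step in Algorithm 3 amounts to caching: a sketch of the same search with every evaluated Spread remembered across iterations (again evaluate is a stub and names are ours):

```python
def ssapnn2_search(evaluate, s_min=0.1, s_max=4.9, interval=0.8):
    """SSaPNN2: as Algorithm 1, but never re-evaluates a Spread (Algorithm 3)."""
    cache = {}  # Spread -> error, reused across iterations
    while True:
        m = int(round((s_max - s_min) / interval)) + 1
        for j in range(m):
            s = round(s_min + j * interval, 1)
            if s not in cache:            # skip Spreads evaluated before
                cache[s] = evaluate(s)
        # Global best over everything evaluated so far.
        best = min(cache, key=cache.get)
        if interval <= 0.1:
            return best
        s_min = max(best - interval, s_min)   # equation (4)
        s_max = min(best + interval, s_max)   # equation (5)
        interval = 0.1 * round(interval * 10 / 2)  # equation (6)
```

On the paper's setting this evaluates 7 + 2 + 2 + 2 = 13 Spreads, matching the counts reported in section ''Simulation results.''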
SaPNN model for transformer FD problem

In any network model, the selected input feature vector must correctly reflect the characteristics of the problem. If the fault features do not contain enough identifying information, or if information reflecting the fault characteristics cannot be extracted, the diagnosis results will be greatly affected. Dissolved gas analysis in oil reflects latent transformer faults well, and the improved three-ratio method has the highest accuracy among a variety of diagnostic methods. Herein, the three gas-content ratios of the improved three-ratio method are taken as the input feature vector of the NN, and the output is the type of transformer fault.

PNN has the advantages of a simple structure and simple training, and it has strong nonlinear classification ability. Herein, the fault sample space is mapped to the fault pattern space, forming a diagnosis system with stronger fault tolerance and a self-adaptive structure, so as to improve the accuracy of FD. In our current work, the dissolved gas in the oil is first analyzed, and the FD model based on SaPNNs, built on the improved three-ratio method, is then established. Based on the analyses above, the FD
model using SaPNNs is designed, and its flowchart is shown in Figure 2.

Figure 2. Flowchart of SaPNNs for the transformer fault diagnosis problem (preprocessing: collecting the data set, normalization, selection of the train set and test set; then implementing SaPNNs and analyzing the results).
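A compact Python sketch of the Figure 2 pipeline follows, under stated assumptions: the data below are random placeholders for the 33 × 4 dissolved-gas matrix, and min-max scaling is one common normalization choice (the paper does not name its scheme):

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for the 33 x 4 data set: three improved three-ratio features
# plus an integer fault-type label (real dissolved-gas data goes here).
data = np.column_stack([rng.random((33, 3)), rng.integers(0, 4, 33)])

# Normalization of the feature columns (min-max scaling assumed).
X, y = data[:, :3], data[:, 3].astype(int)
X = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Selection of train set and test set: first 23 samples train, last 10 test.
X_train, y_train = X[:23], y[:23]
X_test, y_test = X[23:], y[23:]

# Implementing SaPNNs: a SaPNN (see the earlier sketches) would now be run
# on this split, and the test-set prediction accuracy analyzed.
```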
Table 1. Accuracy of PNN with Spreadmin = 0.1 and Spreadmax = 4.9 for the transformer fault diagnosis problem.

Spread  Training set  Test set    Spread  Training set  Test set
0.1     23            7           2.6     21            6
0.2     23            7           2.7     21            6
0.3     23            8           2.8     21            6
0.4     23            8           2.9     21            6
0.5     23            8           3.0     21            6
0.6     23            8           3.1     21            4
0.7     23            8           3.2     21            4
0.8     23            8           3.3     20            4
0.9     22            9           3.4     19            4
1.0     22            9           3.5     18            4
1.1     22            9           3.6     17            4
1.2     22            8           3.7     17            4
1.3     22            8           3.8     16            4
1.4     22            8           3.9     16            4
1.5     21            8           4.0     16            4
1.6     21            8           4.1     15            4
1.7     21            8           4.2     15            4
1.8     21            8           4.3     15            4
1.9     21            8           4.4     15            4
2.0     21            7           4.5     15            4
2.1     21            7           4.6     15            4
2.2     21            6           4.7     15            4
2.3     21            6           4.8     15            4
2.4     21            6           4.9     15            4
2.5     21            6
Simulation results

In this section, the performance of the SaPNN variants in addressing the transformer FD problem is fully investigated from various perspectives. First, the influence of Spread on PNN is studied, followed by comparisons between PNN and the SaPNN variants. The SaPNN variants are then compared with PNN and four other NNs: BP, ELM, GRNN, and SaELM.

To allow a fair comparison, all the experiments were conducted on a PC with a Pentium IV processor running at 2.0 GHz, 512 MB of RAM, and a 160 GB hard drive. Our implementation was compiled using MATLAB R2012a (7.14) running under Windows XP SP3. Here, the Spread used in this article is between 0.1 and 4.9, that is, Spreadmin = 0.1 and Spreadmax = 4.9. In order to remove the randomness of the NNs, 1000 independent runs are performed in the following experiments.

The data studied in this article form a 33 × 4 matrix. The first three columns are the data of the improved three-ratio method, and the fourth column is the classification output, that is, the type of transformer fault. The first 23 samples and the last 10 samples are used as the training samples and test samples of the SaPNNs, respectively.
Influence of Spread on PNN

In this section, the influence of Spread on PNN is studied for the transformer FD problem. In order to fully investigate this influence, the interval is set to 0.1, that is, SpreadInterval = 0.1 and Spread = 0.1, 0.2, 0.3, ..., 4.8, 4.9. The results are recorded in Table 1 and Figure 3. The best value obtained by each method is bold.

Figure 3. Prediction of test samples (PNN) with Spreadmin = 0.1 and Spreadmax = 4.9 for the transformer fault diagnosis problem.

From Table 1, for the training set, PNN has the best prediction accuracy when Spread is in [0.1, 0.8]. As Spread increases, the prediction accuracy decreases, reaching the minimal accuracy of 65.22% (15/23) at Spread = 4.9, which can also be seen in Figure 3. For the test set, PNN has the best prediction accuracy when Spread is in [0.9, 1.1], with a maximal accuracy of 90.00% (9/10). As Spread increases, the prediction accuracy first increases from 0.1 to 0.9 and then decreases from 1.1 to 4.9; PNN has the worst test prediction accuracy when Spread is in [3.1, 4.9]. This trend is clearly visible in Figure 3. The prediction accuracies of PNN for the training set and test set follow a similar trend, although they do not reach their optima at the same time.

From the above analyses, Spread has a great influence on the performance of PNN, so the selection of Spread
is of vital importance in PNN. It has been experimentally shown that PNN performs best when Spread is in [0.9, 1.1]; Spread is set to 0.9 in the other experiments. In addition, in order to find the best Spread, 49 implementations of PNN were performed in this experiment. In the next section, three self-adaptive strategies are studied in order to decrease this computational requirement.
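As a cross-check against equation (3), the cost of this exhaustive scan is

$$M = \mathrm{round}\left(\frac{4.9 - 0.1}{0.1}\right) + 1 = 49$$

PNN evaluations, which is exactly the figure the self-adaptive variants below set out to reduce.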
Table 2. Prediction of test samples (SaPNN) with Spreadmin = 0.1 and Spreadmax = 4.9 for the transformer fault diagnosis problem.

Iteration  Interval  Range       Evaluations  Spread  Training set  Test set
1          0.8       [0.1, 4.9]  7            0.1     23            7
                                              0.9     22            9
                                              1.7     21            8
                                              2.5     21            8
                                              3.3     20            6
                                              4.1     15            4
                                              4.9     15            4
2          0.4       [0.1, 1.7]  5            0.1     23            7
                                              0.5     23            8
                                              0.9     22            9
                                              1.3     22            8
                                              1.7     21            8
3          0.2       [0.5, 1.3]  5            0.5     23            8
                                              0.7     23            8
                                              0.9     22            9
                                              1.1     22            9
                                              1.3     22            8
4          0.1       [0.7, 1.1]  5            0.7     23            8
                                              0.8     23            8
                                              0.9     22            9
                                              1.0     22            9
                                              1.1     22            9
Influence of variants of SaPNNs

In this section, the three improved PNN variants (SaPNN, SSaPNN1, and SSaPNN2) are fully investigated for the transformer FD problem.

SaPNN. Here, SaPNN is studied with initial SpreadInterval = 0.8. The results are recorded in Table 2. The best value obtained by each method is bold.

First, we have Spreadmin = 0.1, Spreadmax = 4.9, and SpreadInterval = 0.8, that is, Spread can be equal to 0.1, 0.9, 1.7, 2.5, 3.3, 4.1, and 4.9. Accordingly, the number of evaluations for PNN is 7 (nEval1 = 7). SaPNN reaches the maximal prediction accuracy of 100.00% (23/23) at Spread = 0.1 for the training set, while for the test set, SaPNN reaches the maximal prediction accuracy of 90.00% (9/10) at Spread = 0.9. Therefore, the best Spread Spreadbest is selected and set to 0.9 at this iteration. This case is also shown in Figure 4.

Figure 4. Prediction of test samples (SaPNN) with Spreadmin = 0.1, Spreadmax = 4.9, and SpreadInterval = 0.8 for the transformer fault diagnosis problem.

Second, Spreadmin, Spreadmax, and SpreadInterval are updated according to equations (4)-(6), as shown in equations (7)-(9)
$$Spread_{min} = \max(Spread_{best} - SpreadInterval,\; Spread_{min}) = \max(0.9 - 0.8,\; 0.1) = 0.1 \qquad (7)$$

$$Spread_{max} = \min(Spread_{best} + SpreadInterval,\; Spread_{max}) = \min(0.9 + 0.8,\; 4.9) = 1.7 \qquad (8)$$

$$SpreadInterval = 0.1 \times \mathrm{round}\left(\frac{SpreadInterval \times 10}{2}\right) = 0.1 \times \mathrm{round}\left(\frac{0.8 \times 10}{2}\right) = 0.4 \qquad (9)$$

Herein, we have Spreadmin = 0.1, Spreadmax = 1.7, and SpreadInterval = 0.4, that is, Spread can be equal to 0.1, 0.5, 0.9, 1.3, and 1.7. Accordingly, the number of
evaluations for PNN is 5 (nEval2 = 5). SaPNN has the best prediction accuracy of 100.00% (23/23) at Spread = 0.1 and 0.5 for the training set, while for the test set, SaPNN has the maximal prediction accuracy of 90.00% (9/10) at Spread = 0.9. Therefore, the best Spread Spreadbest is set to 0.9 at this iteration. This case is also shown in Figure 5.

Figure 5. Prediction of test samples (SaPNN) with Spreadmin = 0.1, Spreadmax = 1.7, and SpreadInterval = 0.4 for the transformer fault diagnosis problem.

Third, similar to equations (7)-(9), Spreadmin, Spreadmax, and SpreadInterval are updated. Herein, we have Spreadmin = 0.5, Spreadmax = 1.3, and SpreadInterval = 0.2, that is, Spread can be equal to 0.5, 0.7, 0.9, 1.1, and 1.3. Accordingly, the number of evaluations for PNN is 5 (nEval3 = 5). For this case, SaPNN has the best prediction accuracy of 100.00% (23/23) at Spread = 0.5 and 0.7 for the training set, while for the test set, SaPNN has the maximal prediction accuracy of 90.00% (9/10) at Spread = 0.9 and 1.1. Therefore, the best Spread Spreadbest is set to 0.9 at this iteration (of course, 1.1 could also be selected). This case is also shown in Figure 6.

Figure 6. Prediction of test samples (SaPNN) with Spreadmin = 0.5, Spreadmax = 1.3, and SpreadInterval = 0.2 for the transformer fault diagnosis problem.

Finally, we similarly have Spreadmin = 0.7, Spreadmax = 1.1, and SpreadInterval = 0.1, that is, Spread can be equal to 0.7, 0.8, 0.9, 1.0, and 1.1. Accordingly, the number of evaluations for PNN is 5 (nEval4 = 5). For this case, SaPNN has the best prediction accuracy of 100.00% (23/23) at Spread = 0.7 and 0.8 for the training set, while for the test set, SaPNN has the maximal prediction accuracy of 90.00% (9/10) at Spread = 0.9, 1.0, and 1.1. Therefore, the best Spread Spreadbest is set to 0.9 at this iteration (of course, 1.0 and 1.1 could also be selected). This case is also shown in Figure 7.

Figure 7. Prediction of test samples (SaPNN) with Spreadmin = 0.7, Spreadmax = 1.1, and SpreadInterval = 0.1 for the transformer fault diagnosis problem.

Because SpreadInterval reaches its minimum (0.1), SaPNN stops, and the best Spread is Spreadbest = 0.9 with the best prediction accuracy of 90.00% (9/10).
From the above analyses, the total number of implementations is 22 (7 + 5 + 5 + 5), that is, only 44.90% (22/49) of the full PNN grid search. That is to say, the computational requirement is roughly halved without losing the advantages of PNN. The prediction accuracy of SaPNN with the best Spread Spreadbest = 0.9 for the training set and test set is shown in Figures 8 and 9.

Figure 8. Prediction of training samples (SaPNN) with Spread = 0.9 for the transformer fault diagnosis problem.

Figure 9. Prediction of test samples (SaPNN) with Spread = 0.9 for the transformer fault diagnosis problem.

SSaPNN1. In this section, SaPNN is further simplified, and the resulting SSaPNN1 is studied with initial SpreadInterval = 0.8. The results are recorded in Table 3. The best value obtained by each method is bold.

First, we have Spreadmin = 0.1, Spreadmax = 4.9, and SpreadInterval = 0.8, that is, Spread can be equal to
0.1, 0.9, 1.7, 2.5, 3.3, 4.1, and 4.9. This case is the same as in section ''SaPNN.'' Accordingly, the number of evaluations is 7 (nEval1 = 7). SSaPNN1 reaches the maximal prediction accuracy of 100.00% (23/23) at Spread = 0.1 for the training set, while for the test set, it reaches the maximal prediction accuracy of 90.00% (9/10) at Spread = 0.9. Therefore, the best Spread Spreadbest is selected and set to 0.9 at this iteration.

Subsequently, Spreadmin and Spreadmax are updated according to equations (4) and (5), as shown in equations (7) and (8). After that, SpreadInterval is set to 0.1 directly. Herein, we have Spreadmin = 0.1, Spreadmax = 1.7, and SpreadInterval = 0.1, that is, Spread can be equal to 0.1, 0.2, 0.3, ..., 1.5, 1.6, and 1.7. Accordingly, the
SSaPNN2. In this section, another simplified version, SSaPNN2, is used to solve transformer FD problem. Similarly, initial SpreadInterval is set to 0.8. The results are recorded in Table 4. The best value obtained by each method is bold. First, we have Spreadmin = 0.1, Spreadmax = 4.9, and SpreadInterval = 0.8, that is, Spread can be equal to 0.1, 0.9, 1.7, 2.5, 3.3, 4.1, and 4.9. For this case, it is the same with sections ‘‘SaPNN’’ and ‘‘SSaPNN1.’’ Accordingly, the number of evaluations is 7 (nEval1 = 7). SSaPNN2 reaches the maximal prediction accuracy of 100.00% (23/23) when Spread = 0.1 for training set, while for test set, SaPNN reaches the maximal prediction accuracy of 90.00% (9/10) when Spread = 0.9. Therefore, the best Spread Spreadbest is selected and set to 0.9 at this iteration. Second, Spreadmin, Spreadmax, and SpreadInterval are updated according to equations (4)–(6), as shown in equations (7)–(9). Herein, we have Spreadmin = 0.1, Spreadmax = 1.7, and SpreadInterval = 0.4, that is, Spread can be equal to 0.1, 0.5, 0.9, 1.3, and 1.7. Looking carefully at Spread, the prediction accuracy for training set and test set when Spread = 0.1, 0.9, and 1.7 has been found at previous iteration. Therefore, Spread can only be equal to 0.5 and 1.3. Accordingly, the number of evaluations for PNN is 2 (nEval2 = 2). SSaPNN2 has the best prediction accuracy of 100.00% (23/23) when Spread = 0.5 for training set, while for test set, SSaPNN2 has the best maximal prediction accuracy of 80.00% (8/10) when Spread = 0.5 and 1.3. Therefore, the best Spread Spreadbest at this iteration is 0.5, and the global best Spread Spreadbest is set to 0.9 up to now. Third, similar to equations (7)–(9), Spreadmin, Spreadmax, and SpreadInterval are updated. Herein, we Spreadmax = 1.3, and have Spreadmin = 0.5, SpreadInterval = 0.2, that is Spread can be equal to 0.5, 0.7, 0.9, 1.1, and 1.3. Similarly, the prediction accuracy
Yi et al.
11
Table 3. Prediction of test samples (SSaPNN1) with Spreadmin = 0.1 and Spreadmax = 4.9 for the transformer fault diagnosis problem. Iteration
Interval
Range
Evaluations
1
0.8
[0.1, 4.9]
7
2
0.1
[0.1, 1.7]
17
Spread
Training set
Test set
0.1 0.9 1.7 2.5 3.3 4.1 4.9 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7
23 22 21 21 20 15 15 23 23 23 23 23 23 23 23 22 22 22 22 22 22 21 21 21
7 9 8 6 4 4 4 7 7 8 8 8 8 8 8 9 9 9 8 8 8 8 8 8
Table 4. Prediction of test samples (SSaPNN2) with Spreadmin = 0.1 and Spreadmax = 4.9 for the transformer fault diagnosis problem. Iteration
Interval
Range
Evaluations
Best Spread
Local best
Global best
Spread
Training set
Test set
1
0.8
[0.1, 4.9]
7
0.9
9
9
2
0.4
[0.1, 1.7]
2
0.9
8
9
3
0.2
[0.5, 1.3]
2
0.9
9
9
4
0.1
[0.7, 1.1]
2
0.9
9
9
0.1 0.9 1.7 2.5 3.3 4.1 4.9 0.5 1.3 0.7 1.1 0.8 1.0
23 22 21 21 20 15 15 23 22 23 22 23 22
7 9 8 8 6 4 4 8 8 8 9 8 9
for the training set and test set at Spread = 0.5, 0.9, and 1.3 were found before. Therefore, Spread only needs to be evaluated at 0.7 and 1.1. Accordingly, the number of evaluations for PNN is 2 (nEval3 = 2). For this case, SSaPNN2 has the best prediction accuracy of 100.00% (23/23) at Spread = 0.7 for the training set, while for the test set, it has the maximal prediction accuracy of 90.00% (9/10) at Spread = 1.1. Therefore, the best Spread at this iteration is 1.1, and the global best Spread Spreadbest remains 0.9 up to now.

Finally, we similarly have Spreadmin = 0.7, Spreadmax = 1.1, and SpreadInterval = 0.1, that is, Spread can be equal to 0.7, 0.8, 0.9, 1.0, and 1.1. After removing the repeated values, Spread only needs to be evaluated at 0.8 and 1.0. Accordingly, the number of evaluations for PNN is 2 (nEval4 = 2). For this case, SSaPNN2 has the best prediction accuracy of 100.00% (23/23) at Spread = 0.8 for the training set, while for the test set, it has the maximal prediction accuracy of 90.00% (9/10) at Spread = 1.0. Therefore, the best Spread at this iteration is 1.0, and the global best Spread Spreadbest remains 0.9 up to now. Because SpreadInterval reaches its minimum (0.1), SSaPNN2 stops, and the best Spread is Spreadbest = 0.9 with the best prediction accuracy of 90.00% (9/10).

From the above analyses, the total number of implementations is 13 (7 + 2 + 2 + 2), that is, only 26.53% (13/49) of the full PNN grid search. The computational requirements of SSaPNN2 (26.53%) are much lower than those of SaPNN (44.90%), although SSaPNN2 updates Spreadmin, Spreadmax, and SpreadInterval the same number of times as SaPNN.
Comparisons of SaPNNs with BP, ELM, GRNN, and SaELM

In order to further demonstrate the superiority of the SaPNNs in addressing the transformer FD problem, the three versions of SaPNN are further compared with BP, ELM, GRNN, and SaELM. For the BP NN, epochs = 200, learning rate = 0.1, and objective = 0.00004. For GRNN, a cyclic training method is used to select the best Spread value, making GRNN achieve its best prediction. For ELM, the number of neurons in the hidden layer is 20, and the activation function is the S function. The parameters used in SaELM are set as follows: width factor Q = 2, scale factor L = 4, Nmin = 5, Nmax = 120, NInterval = 5, and the maximum generation MaxIter = 2000. For PNN, Spread is set to 1.5. The results are recorded in Table 5. The best value obtained by each method is bold.

Table 5. Accuracy of BP, ELM, GRNN, PNN, SaELM, SaPNN, SSaPNN1, and SSaPNN2 for the transformer fault diagnosis problem.

          Training set                        Test set
          Mean    Best    Worst   SD          Mean    Best    Worst   SD
BP        16.62   23.00   2.00    4.24        6.24    10.00   0       2.08
ELM       23.00   23.00   23.00   0           3.91    9.00    1.00    1.21
GRNN      8.92    12.00   5.00    1.48        6.09    7.00    4.00    0.48
PNN       21.00   21.00   21.00   0           8.00    8.00    8.00    0
SaELM     23.00   23.00   23.00   0           6.20    6.62    5.66    0.13
SaPNN     23.00   23.00   23.00   0           9.00    9.00    9.00    0
SSaPNN1   23.00   23.00   23.00   0           9.00    9.00    9.00    0
SSaPNN2   23.00   23.00   23.00   0           9.00    9.00    9.00    0

SD: standard deviation; BP: back propagation; ELM: extreme learning machine; GRNN: general regression neural network; PNN: probabilistic neural network; SaELM: self-adaptive extreme learning machine; SaPNN: self-adaptive probabilistic neural network; SSaPNN1: simplified self-adaptive probabilistic neural network 1; SSaPNN2: simplified self-adaptive probabilistic neural network 2.

From Table 5, for the training set, the comprehensive performance of ELM, SaELM, and the SaPNN variants is identical and better than that of BP, GRNN, and PNN. Except for BP and GRNN, all the other NNs have an SD of 0. For the test set, the average prediction accuracy of the SaPNN variants (9/10 = 90.00%) is better than that of BP, ELM, GRNN, PNN, and SaELM (6.24/10 = 62.40%, 3.91/10 = 39.10%, 6.09/10 = 60.90%, 8/10 = 80.00%, and 6.20/10 = 62.00%, respectively). Although BP attains the best single-run prediction accuracy (10/10 = 100.00%), its worst is 0/10 = 0.00%. These results show that the SaPNN variants are well suited to the transformer FD problem.
Conclusion

In order to remove the influence of Spread in PNN, a self-adaptive strategy is incorporated into the basic PNN, and SaPNN is proposed. In SaPNN, the best Spread is self-adaptively selected; therefore, SaPNN can always reach the best prediction accuracy. Moreover, in order to further reduce the computational requirements, two simplified strategies are added to the proposed SaPNN, and two simplified versions (SSaPNN1 and SSaPNN2) are proposed. The simulation results indicate that SSaPNN1 and SSaPNN2 perform more effectively and efficiently than SaPNN while preserving its character. In the SaPNN variants, Spread is selected automatically; therefore, there are no parameters to adjust in the training process. The transformer FD problem is addressed by the SaPNN variants, and their performance is fully investigated from various perspectives. In addition, by comparing them with the basic PNN, BP, ELM, GRNN, and SaELM, the results experimentally show that SaPNNs have a more accurate prediction rate and better generalization performance when addressing the FD problem.

Although the SaPNN variants have shown their advantages over the basic PNN, BP, ELM, GRNN, and SaELM in addressing the transformer FD problem, the following points will be pursued in our future research. First, the results will be analyzed by other methods, such as the t-test. Second, the proposed SaPNNs will be applied to other classification and prediction problems. Finally, other new metaheuristic algorithms, such as monarch butterfly optimization (MBO),30 will be combined with SaPNNs, which is expected to further improve their performance.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was partially supported by the Fundamental Research Funds for the Central Universities under grant number 2015XKMS051, a project funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions, and the Cooperative Innovation Center of Jiangsu Province.
References

1. Zhang X, Hu N, Hu L, et al. A bearing fault diagnosis method based on the low-dimensional compressed vibration signal. Adv Mech Eng. Epub ahead of print 6 July 2015. DOI: 10.1177/1687814015593442.
2. Li X, Qian J and Wang G-G. Fault prognostic based on hybrid method of state judgment and regression. Adv Mech Eng. Epub ahead of print 14 November 2013. DOI: 10.1155/2013/149562.
3. Specht DF. Probabilistic neural networks. Neural Networks 1990; 3: 109–118.
4. Oluleye B, Leisa A, Jinsong L, et al. On the application of genetic probabilistic neural network and cellular neural networks in precision agriculture. Asian J Comput Inf Syst 2014; 2: 90–101.
5. Van Ooyen A and Nienhuis B. Improving the convergence of the back-propagation algorithm. Neural Networks 1992; 5: 465–471.
6. Muslih IM, Mansour MA and Ramadan SZ. Using artificial neural network for predicting impurity concentration in solid diffusion process under insufficient input parameters. Adv Mech Eng. Epub ahead of print 4 October 2011. DOI: 10.1155/2011/408524.
7. Huang G-B, Zhu Q-Y and Siew C-K. Extreme learning machine: theory and applications. Neurocomputing 2006; 70: 489–501.
8. Huang G-B, Wang DH and Lan Y. Extreme learning machines: a survey. Int J Mach Learn Cybern 2011; 2: 107–122.
9. Li G and Niu P. An enhanced extreme learning machine based on ridge regression for regression. Neural Comput Appl 2011; 22: 803–810.
10. Zhai J-h, Xu H-y and Wang X-z. Dynamic ensemble extreme learning machine based on sample entropy. Soft Comput 2012; 16: 1493–1502.
11. Specht DF. A general regression neural network. IEEE T Neural Networ 1991; 2: 568–576.
12. Yip H-l, Fan H and Chiang Y-h. Predicting the maintenance cost of construction equipment: comparison between general regression neural network and Box–Jenkins time series models. Automat Constr 2014; 38: 30–38.
13. Wang G-G, Lu M, Dong Y-Q, et al. Self-adaptive extreme learning machine. Neural Comput Appl. Epub ahead of print 21 March 2015. DOI: 10.1007/s00521-015-1874-3.
14. Sun Y-j, Zhang S, Miao C-x, et al. Improved BP neural network for transformer fault diagnosis. J China Univ Min Tech 2007; 17: 138–142.
15. Wang X, Ma L and Wang T. An optimized nearest prototype classifier for power plant fault diagnosis using hybrid particle swarm optimization algorithm. Int J Elec Power 2014; 58: 257–265.
16. Sun Q, Wang C, Wang Z, et al. A fault diagnosis method of Smart Grid based on rough sets combined with genetic algorithm and Tabu search. Neural Comput Appl 2012; 23: 2023–2029.
17. Zhao J, Xu Y, Luo F, et al. Power system fault diagnosis based on history driven differential evolution and stochastic time domain simulation. Inform Sciences 2014; 275: 13–29.
18. Tang X, Zhuang L, Cai J, et al. Multi-fault classification based on support vector machine trained by chaos particle swarm optimization. Knowl-Based Syst 2010; 23: 486–490.
19. Camps Echevarría L, Llanes Santiago O, Hernández Fajardo JA, et al. A variant of the particle swarm optimization for the improvement of fault diagnosis in industrial systems via faults estimation. Eng Appl Artif Intel 2014; 28: 36–51.
20. Vong CM, Wong PK and Wong KI. Simultaneous-fault detection based on qualitative symptom descriptions for automotive engine diagnosis. Appl Soft Comput 2014; 22: 238–248.
21. Xia H, Zhuang J and Yu D. Multi-objective unsupervised feature selection algorithm utilizing redundancy measure and negative epsilon-dominance for fault diagnosis. Neurocomputing 2014; 146: 113–124.
22. Fathabadi H. Two novel proposed discrete wavelet transform and filter based approaches for short-circuit faults detection in power transmission lines. Appl Soft Comput 2015; 36: 375–382.
23. Qin X, Wang M, Lin J-S, et al. Power cable fault recognition based on an annealed chaotic competitive learning network. Algorithms 2014; 7: 492–509.
24. Kang M, Kim J and Kim J-M. Reliable fault diagnosis for incipient low-speed bearings using fault feature analysis based on a binary bat algorithm. Inform Sciences 2015; 294: 423–438.
25. Mirjalili S, Mirjalili SM and Yang X-S. Binary bat algorithm. Neural Comput Appl 2013; 25: 663–681.
26. Gao XZ and Ovaska SJ. Genetic algorithm training of Elman neural network in motor fault detection. Neural Comput Appl 2002; 11: 37–44.
27. Zhao Z, Xu Q and Jia M. Improved shuffled frog leaping algorithm-based BP neural network and its application in bearing early fault diagnosis. Neural Comput Appl. Epub ahead of print 5 March 2015. DOI: 10.1007/s00521-015-1850-y.
28. Jin C and Jin S-W. Prediction approach of software fault-proneness based on hybrid artificial neural network and quantum particle swarm optimization. Appl Soft Comput 2015; 35: 717–725.
29. Parzen E. On estimation of a probability density function and mode. Ann Math Stat 1962; 33: 1065–1076.
30. Wang G-G, Deb S and Cui Z. Monarch butterfly optimization. Neural Comput Appl. Epub ahead of print 19 May 2015. DOI: 10.1007/s00521-015-1923-y.