energies Article
Prediction Method for Power Transformer Running State Based on LSTM_DBN Network

Jun Lin 1,*, Lei Su 2, Yingjie Yan 1, Gehao Sheng 1, Da Xie 1 and Xiuchen Jiang 1

1 Department of Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; [email protected] (Y.Y.); [email protected] (G.S.); [email protected] (D.X.); [email protected] (X.J.)
2 Electric Power Research Institute of State Grid Shanghai Municipal Electric Power Company, Shanghai 200120, China; [email protected]
* Correspondence: [email protected]; Tel.: +86-188-1827-2579

Received: 3 July 2018; Accepted: 12 July 2018; Published: 19 July 2018
Abstract: It is of great significance to accurately determine the running state of power transformers and to detect potential transformer faults in a timely manner. This paper presents a prediction method for the transformer running state based on an LSTM_DBN network. Firstly, based on the trend of gas concentration in transformer oil, a long short-term memory (LSTM) model is established to predict the future characteristic gas concentrations. Then, the accuracy and influencing factors of the LSTM model are analyzed with examples. A deep belief network (DBN) model is established to classify the transformer running state using the information in the transformer fault case library; its classification accuracy is higher than that of the support vector machine (SVM) and the back-propagation neural network (BPNN). Finally, combined with actual transformer data collected from the State Grid Corporation of China, the LSTM_DBN model is used to predict the transformer state. The results show that the method has high prediction accuracy and can analyze potential faults.

Keywords: dissolved gas analysis; long short-term memory; deep belief network; running state prediction
1. Introduction

As one of the important pieces of equipment in the power system, power transformers directly influence the stability and safety of the entire power grid. If a transformer fails in operation, it will cause power outages and also damage the transformer itself and the power system, which may result in greater losses [1]. So, it is necessary to carry out real-time condition monitoring and diagnosis of the transformer to predict its future running states. If the potential failure of the transformer is discovered in time and the potential failure types are analyzed, early warning signals can be sent to maintainers and corresponding measures taken in a timely manner, reducing the possibility of an accident. At present, there is much research on transformer fault diagnosis, but there are relatively few studies on the prediction of future running states of transformers and fault prediction.

During the operation of the transformer, its internal insulating oil and solid insulating material will gradually decompose due to aging or to the external electric field and humidity, and the resulting gases will dissolve in the insulating oil. The content of the various gas components in the oil and the proportional relationships between the different components are closely related to the running state of the transformer. Before the occurrence of electrical or thermal faults, the concentrations of the various gases change gradually and regularly with time. Therefore, dissolved gas analysis (DGA) is an important method to find transformer defects and latent faults. It is highly feasible and accurate to predict transformer running states and make future fault classifications based on the trend of each historical gas concentration and the ratios between gas concentrations [2–4]. Current methods include oil gas ratio analysis [5–7], SVM [8,9] and artificial neural network (ANN) [10,11].

Energies 2018, 11, 1880; doi:10.3390/en11071880
www.mdpi.com/journal/energies

Li et al. [12] propose an incipient fault diagnosis method based on the combined use of a multi-classification algorithm, the self-adaptive evolutionary extreme learning machine (SaE-ELM), and a simple arctangent transform (AT). AT is used to alter the structure of the experimental data so as to enhance the fitting and generalization capabilities of SaE-ELM. Sherif S. M. Ghoneim [13] utilizes thermodynamic theory to evaluate fault severity based on dissolved gas analysis, and also proposes a fuzzy logic approach to enhance the fault diagnosis ability. Zhao et al. [14] propose a transformer fault combination prediction model based on SVM. The prediction results of multiple single prediction methods, such as the exponential model and the grey model, are taken as the input of the SVM for a second prediction to form a variable-weight combination forecast. Compared with single prediction, the accuracy of fault prediction is improved. Zhou et al. [15] use cloud theory to predict the expected value of gas changes in oil in the short term to obtain a series of prediction results with a stable tendency.

The current methods still have the following two problems. (1) Current research is mostly aimed at fault diagnosis at the present moment and lacks analysis of future running states and fault warning. (2) In the state assessment and fault classification of transformers, gas concentration ratio coding is mainly used as the input of the model, but there are problems such as incomplete coding and too-absolute boundaries [16].

In recent years, with the continuous development of deep learning technologies, some deep learning models have been applied to the analysis of time series data. The deep learning model is a kind of deep neural network with multiple non-linear mapping levels.
It can abstract the input signal layer by layer and extract features to discover potential laws at a deeper level. Among the many deep learning models, the recurrent neural network (RNN) can fully consider the correlation of a time series and can predict future data based on historical data, so it is well suited to predicting and analyzing time series data. The LSTM, as an improved model of the RNN, makes up for the gradient disappearance, gradient explosion, and lack of long-term memory in the training process of the RNN model, and can make full use of historical data. At present, LSTM has seen extensive research and application in fields such as speech recognition [17], video classification [18], and flow prediction [19,20]. In this paper, the LSTM model's strength in processing time series is exploited, and the future gas concentration is predicted based on the historical trend of the gas concentration in the transformer oil. The DBN is formed by stacking multiple restricted Boltzmann machines (RBM); the data is pre-trained using the contrastive divergence (CD) algorithm, and error back-propagation is used to adjust the parameters of the whole network. The DBN network can effectively overcome the shortcomings of traditional neural networks, which are sensitive to initial parameters and slow when handling high-dimensional data. Currently, DBN networks have been widely used in fault diagnosis [21], pattern recognition [22], and image processing [23]. In this paper, the ratios of the future gas concentrations obtained from the LSTM prediction model are used as the DBN network input to classify the future operating state of the transformer. This paper presents a prediction method for the transformer running state based on the LSTM_DBN model.
Firstly, the ability of the LSTM model to deal with time series is used to analyze the changing trend of the dissolved gas concentration data in transformer oil, to obtain the future gas concentrations and calculate the gas concentration ratios. Then, using the powerful feature learning ability of the DBN network, the gas concentration ratios are taken as the input of the model and the transformer running state type as the output, and a deep network with multiple hidden layers is constructed. The entire LSTM_DBN model makes full use of the historical data of the transformer oil chromatogram and realizes the analysis of the future state of the transformer and early fault warning. Through the analysis of specific examples, we can see that the model proposed in this paper has good prediction accuracy and can analyze potential faults.
2. Prediction of Dissolved Gases Concentration in Transformer Oil Based on LSTM Model

2.1. Prediction of Dissolved Gases Concentration

Transformer oil chromatographic analysis technology has become one of the important methods for monitoring the early latent faults of oil-immersed power transformers and analyzing fault nature and locations after failure. Condition-based maintenance of oil-immersed transformers is fully based on oil chromatographic data. The transformer oil chromatographic analysis test can quickly and effectively find potential faults and defects without interruption of power. It has high recognition of overheating faults, discharge faults, and dielectric breakdown failures.

Most transformers use oil-paper composite insulation. When the transformer is under normal operation, the insulating oil and solid insulating material will gradually deteriorate and a small amount of gas will be decomposed, mainly including H2, CH4, C2H2, C2H4, C2H6, CO, and CO2. When an internal fault of the transformer occurs, the generation of these gases will be accelerated. As the failure develops, the decomposed gas forms bubbles, which flow and diffuse in the oil. The composition and content of the gas are closely related to the type of fault and the severity of the fault. Therefore, during the operation of the transformer, chromatographic analysis of the oil is performed at regular intervals, so as to detect potential internal equipment failures as early as possible, which can avoid equipment failure or greater losses. However, due to the complex operation of the transformer oil chromatography test and the long sampling interval, it is of great significance to predict the future development trend based on the historical trend of the gas concentration in the transformer oil.
2.2. Principles of Prediction
The LSTM network is an improved model based on the RNN. While retaining the recursive nature of RNNs, it solves the problems of gradient disappearance and gradient explosion in the RNN training process [24–27].
Figure 1. (a) Basic recurrent neural network (RNN) network; (b) RNN expansion diagram.
A basic RNN network is shown in Figure 1a. It consists of an input layer, a hidden layer, and an output layer. The RNN network timing diagram is shown in Figure 1b. x = [x(1), x(2), x(3), ..., x(n−1), x(n)] is the input vector and y = [y(1), y(2), ..., y(n)] is the output vector. h is the state of the hidden layer. Wxh is the weight matrix of the input layer to the hidden layer. Why is the weight matrix of the hidden layer to the output layer, and Whh is the weight matrix applied to the hidden layer state passed on as input at the next moment. The hidden layer state h(t−1) is used as part of the input at time t. So when the input at t is x(t), the value of the hidden layer is h(t) and the output value is y(t):

h(t) = f(Wxh·x(t) + Whh·h(t−1))  (1)

y(t) = g(Why·h(t))  (2)
where f is the hidden layer activation function and g is the output layer activation function. Substituting (1) into (2), we can get:

y(t) = g(Why·h(t)) = g(Why·f(Wxh·x(t) + Whh·f(Wxh·x(t−1) + Whh·f(Wxh·x(t−2) + Whh·f(Wxh·x(t−3) + ...)))))  (3)

From (3), it can be seen that the output value y(t) of the RNN network is affected not only by the input x(t) at the current moment, but also by the previous input values x(t−1), x(t−2), x(t−3), ....

The RNN network has a memory function and can effectively deal with non-linear time series. However, when the RNN processes a time sequence with a long delay, the problems of gradient disappearance and gradient explosion will occur during the back-propagation through time (BPTT) training process. As an improved model, LSTM adds a gating unit which allows the model to store and transmit information for a longer period of time through the selective passage of information.

The gating unit of LSTM is shown in Figure 2. It consists of an input gate, a forget gate and an output gate. The workflow of the LSTM gate unit is as follows:

(1) Input the sequence value x(t) at time t and the hidden layer state h(t−1) at time t − 1. The discarded information is determined by the activation function. The output at this time is:

f(t) = σ(Wf·h(t−1) + Wf·x(t) + bf)  (4)

where f(t) is the result of the forget gate, Wf is the weight matrix of the forget gate, and bf is the offset of the forget gate. σ is the activation function, usually a tanh or sigmoid function.

(2) Enter the gate unit state c(t−1) at time t − 1 and determine the information to update. Update the gate unit state c(t) at time t:

i(t) = σ(Wi·h(t−1) + Wi·x(t) + bi)  (5)

c̃(t) = tanh(Wc·h(t−1) + Wc·x(t) + bc)  (6)

c(t) = i(t) ∘ c̃(t) + f(t) ∘ c(t−1)  (7)

where i(t) is the input gate state result, c̃(t) is the cell state input at t, Wi is the input gate weight matrix, Wc is the input cell state weight matrix, bi is the input gate bias, and bc is the input cell state bias. ∘ means multiplication by elements.

(3) The output of the LSTM is determined by the output gate and the unit state:

o(t) = σ(Wo·h(t−1) + Wo·x(t) + bo)  (8)

h(t) = o(t) ∘ tanh(c(t))  (9)

where o(t) is the output gate state result, Wo is the output gate weight matrix, and bo is the output gate offset.
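As an illustration, the gate unit of Equations (4)–(9) can be sketched in a few lines of NumPy. This is not the paper's implementation: it assumes, as is common in practice, that each gate applies its weight matrix to the concatenation of h(t−1) and x(t), and the dimensions are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM gate-unit step following Equations (4)-(9).
    W maps gate name -> weight matrix acting on [h_prev; x_t]; b holds biases."""
    z = np.concatenate([h_prev, x_t])       # stack h(t-1) and x(t)
    f_t = sigmoid(W["f"] @ z + b["f"])      # Eq. (4): forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])      # Eq. (5): input gate
    c_tilde = np.tanh(W["c"] @ z + b["c"])  # Eq. (6): candidate cell state
    c_t = i_t * c_tilde + f_t * c_prev      # Eq. (7): cell state update
    o_t = sigmoid(W["o"] @ z + b["o"])      # Eq. (8): output gate
    h_t = o_t * np.tanh(c_t)                # Eq. (9): new hidden state
    return h_t, c_t

rng = np.random.default_rng(1)
n_in, n_hid = 3, 4                          # illustrative sizes
W = {k: rng.normal(size=(n_hid, n_hid + n_in)) * 0.1 for k in "fico"}
b = {k: np.zeros(n_hid) for k in "fico"}
h, c = np.zeros(n_hid), np.zeros(n_hid)
h, c = lstm_step(rng.normal(size=n_in), h, c, W, b)
print(h.shape, c.shape)  # (4,) (4,)
```

In a trained predictor, this step would be applied over the historical gas concentration sequence, with the final hidden state mapped to the next concentration value.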
Figure 2. Structure of long short-term memory (LSTM) gate unit.
3. Analysis of Transformer Running State Based on Deep Belief Network

3.1. Transformer Running Status Analysis

For the running state classification of the transformer, it is first divided into healthy state (H) and potential failure (P). According to the IEC 60599 standard, the types of potential transformer faults can be classified into partial discharge (PD), low-energy discharge (LD), high-energy discharge (HD), thermal fault of low temperature (LT), thermal fault of medium temperature (MT), and thermal fault of high temperature (HT) [21]. Thus, the predicted running state of the transformer is divided into 7 (6 + 1) types.

Due to the normal aging of the transformer, the decomposed gas in the transformer oil is in an unstable state and will accumulate over time and change dynamically. Even though different transformers may all be in healthy operation, because of their different operating times, the concentration of dissolved gases in the oil varies greatly among different transformers. Therefore, it is necessary to use the ratios between the gas concentrations instead of the simple gas concentrations as the reference vector for the prediction of the final running state.

The currently used ratios include the IEC ratios, Rogers ratios, Dornenburg ratios and Duval ratios. This paper combines these four methods with other codeless ratio methods. The gas concentration ratios used are shown in Table 1.

Table 1. Gas concentration ratios.

IEC ratios: CH4/H2, C2H2/C2H4, C2H4/C2H6
Rogers ratios: CH4/H2, C2H2/C2H4, C2H4/C2H6, C2H6/CH4
Dornenburg ratios: CH4/H2, C2H2/C2H4, C2H2/CH4, C2H6/C2H2
Duval ratios: CH4/C, C2H2/C, C2H4/C, where C = CH4 + C2H2 + C2H4
Gas concentration ratios: CH4/H2, C2H2/C2H4, C2H4/C2H6, C2H6/CH4, C2H2/CH4, C2H6/C2H2, CH4/C1, C2H2/C1, C2H4/C1, H2/C2, CH4/C2, C2H2/C2, C2H4/C2, C2H6/C2, where C1 = CH4 + C2H2 + C2H4 and C2 = H2 + CH4 + C2H2 + C2H4 + C2H6

3.2. Deep Belief Network

RBM, as a component of the DBN, includes a visible layer v and a hidden layer h. The structure of RBM is shown in Figure 3. The visible layer consists of visible units vi and is used to input the training data. The hidden layer is composed of hidden units hj and is used for feature detection. ω represents the weights between the two layers. For the visible and hidden layers of RBM, the interlayer neurons are fully connected and the inner layer neurons are not connected [28–31].
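As a side note, the ratio vector of Table 1 can be assembled from the five characteristic gas concentrations with a small helper. This is a hypothetical sketch, not the paper's code; a small epsilon guard against zero denominators is an added assumption.

```python
def ratio_features(gas):
    """Build the gas concentration ratio vector of Table 1 from a dict of
    concentrations for H2, CH4, C2H2, C2H4, C2H6."""
    h2, ch4 = gas["H2"], gas["CH4"]
    c2h2, c2h4, c2h6 = gas["C2H2"], gas["C2H4"], gas["C2H6"]
    c = ch4 + c2h2 + c2h4                # Duval total C (same sum as C1)
    c1 = ch4 + c2h2 + c2h4               # C1 as defined in Table 1
    c2 = h2 + ch4 + c2h2 + c2h4 + c2h6   # C2 as defined in Table 1
    eps = 1e-6                           # guard against division by zero

    def r(a, b):
        return a / (b + eps)

    return [
        r(ch4, h2), r(c2h2, c2h4), r(c2h4, c2h6),   # IEC ratios
        r(c2h6, ch4), r(c2h2, ch4), r(c2h6, c2h2),  # Rogers/Dornenburg extras
        r(ch4, c), r(c2h2, c), r(c2h4, c),          # Duval ratios
        r(ch4, c1), r(c2h2, c1), r(c2h4, c1),       # ratios over C1
        r(h2, c2), r(ch4, c2), r(c2h2, c2), r(c2h4, c2), r(c2h6, c2),
    ]

x = ratio_features({"H2": 50.0, "CH4": 100.0, "C2H2": 5.0,
                    "C2H4": 60.0, "C2H6": 40.0})
print(len(x))  # 17
```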
h2
...
h3
hj
h
ω
v1
v2
v3
...
v4
vi
v
Figure 3. Structure of LSTM gate unit.
For aa specific specific set set of of (v, (v, h), h), the the energy energy function function of of the the RBM RBM is is defined defined as: as: For nv
nh
nv
nh
v nh v ω h E(v, h | θ) =nv− ai vi −nh bj hj −n i ij j E(v, h|θ ) = − ∑ ai=i1vi − ∑j=1b j h j − ∑ i= 1 ∑ j= 1 vi ωij h j
i =1
j =1
(10) (10)
i =1 j =1
where θ = (ωij , ai , bj ) is the parameter of RBM. ωij is the connection weight between the visible layer node vi and the hidden layer node hj. ai and bj are the offsets of vi and hj respectively. According to this energy function, the joint probability density of (v, h) is:
p(v, h|θ) = e^(−E(v, h|θ))/Z(θ)  (11)

Z(θ) = ∑v ∑h e^(−E(v, h|θ))  (12)

The probabilities that the jth hidden unit in the hidden layer and the ith visible unit in the visible layer are activated are:

p(hj = 1|v, θ) = σ(bj + ∑(i=1..nv) vi·ωij)  (13)

p(vi = 1|h, θ) = σ(ai + ∑(j=1..nh) hj·ωij)  (14)

where σ(·) is the activation function. Usually we can choose the sigmoid function, the tanh function or the ReLU function. The expressions are:

sigmoid(x) = 1/(1 + e^(−x))  (15)

tanh(x) = (e^x − e^(−x))/(e^x + e^(−x))  (16)

ReLU(x) = max(0, x)  (17)

Since the ReLU function can improve the convergence speed of the model and has the non-saturation characteristic, this paper uses the ReLU function as the activation function.

When a set of training samples S is given, with ns the number of training samples, maximizing the log-likelihood function achieves the purpose of training the RBM:

ln Lθ,S = ∑(i=1..ns) ln P(vi)  (18)

The DBN network is essentially a deep neural network composed of multiple RBM networks and a classified output layer. Its structure is shown in Figure 4.
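The objective (18) is usually optimized with the contrastive divergence (CD) algorithm described below for the DBN pre-training phase. The following is a minimal CD-1 sketch using the sigmoid activation probabilities of Equations (13) and (14); the sizes and learning rate are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_update(v0, W, a, b, lr=0.1, rng=np.random.default_rng(0)):
    """One CD-1 update of the RBM parameters theta = (W, a, b)."""
    ph0 = sigmoid(b + v0 @ W)                  # Eq. (13): p(h_j = 1 | v0)
    h0 = (rng.random(ph0.shape) < ph0) * 1.0   # sample hidden units
    pv1 = sigmoid(a + h0 @ W.T)                # Eq. (14): p(v_i = 1 | h0)
    ph1 = sigmoid(b + pv1 @ W)                 # hidden probabilities after 1 step
    W += lr * (np.outer(v0, ph0) - np.outer(pv1, ph1))  # approximate gradient
    a += lr * (v0 - pv1)
    b += lr * (ph0 - ph1)
    return W, a, b

nv, nh = 6, 3                                  # illustrative layer sizes
rng = np.random.default_rng(2)
W = rng.normal(size=(nv, nh)) * 0.01
a, b = np.zeros(nv), np.zeros(nh)
v0 = np.array([1.0, 0.0, 1.0, 1.0, 0.0, 0.0])  # one training vector
W, a, b = cd1_update(v0, W, a, b)
print(W.shape)  # (6, 3)
```

Repeating this update over the training set, layer by layer, is the essence of the DBN pre-training stage.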
Figure 4. Structure of deep belief network (DBN).
The DBN training process includes two stages: pre-training and fine-tuning. In the pre-training phase, a contrast divergence (CD) algorithm is used to train each layer of RBM layer by layer. The output of the first layer of RBM hidden layer is used as the input of the upper layer of RBM. In the fine-tuning phase, the gradient descent method is used to propagate the error between the actual output and the labeled numerical label from top to bottom and back to the bottom to achieve optimization the11,entire DBNREVIEW model parameters. Energiesof 2018, x FOR PEER 7 of 14 fine-tuning phase, the gradient descent method is used to propagate the error between the actual 4. Transformer State Prediction Process output and the labeled numerical label from top to bottom and back to the bottom to achieve
With the continuous of power equipment on-line monitoring technology, the monitoring optimization of thedevelopment entire DBN model parameters. data are also increasing rapidly. Utilizing the existing historical state information, such as the type and development law of theState characteristic in the insulating oil, and analyzing the change of the running 4. Transformer Prediction gas Process state is of great significance to the development state assessment and prediction. With the continuous of power equipment on-line monitoring technology, the Themonitoring flowchartdata of the transformer running state prediction method based the LSTM_DBN are also increasing rapidly. Utilizing the existing historical stateon information, such model isasshown in Figure 5. The specific steps are as follows: the type and development law of the characteristic gas in the insulating oil, and analyzing the (1) (2)
(3)
(4)
(5)
change of the running state is of great significance to the state assessment and prediction.
CollectThe the flowchart transformer oiltransformer chromatographic select method the characteristic parameters of the running data state and prediction based on the LSTM_DBNH2 , CHmodel , C H , C H and C H as input for the model; 4 2 is2shown 2 4in Figure 2 5.6 The specific steps are as follows: Train the LSTM model. According to the transformer oil chromatography historical data, each (1) Collect the transformer oil chromatographic data and select the characteristic parameters H2, characteristic gas concentration is taken as the input, and the corresponding gas concentration is CH4, C2H2, C2H4 and C2H6 as input for the model; used the output to train LSTM modeltotothe obtain future oil gaschromatography concentration values; (2) asTrain the LSTM model. According transformer historical data, each Train the DBN model. According toisthe samples of theand transformer fault case the gas characteristic gas concentration taken as the input, the corresponding gaslibrary, concentration is used as the output to train LSTM model to obtain future gas concentration values; concentration ratios are taken as the input of the DBN network, and 7 kinds of transformer (3) Train theare DBN model. According totrain the samples of the transformer fault case library, the gas running states used as the output to DBN model; concentration ratios are taken as the input of the network, andthe 7 kinds of transformer Use the trained LSTM_DBN network to test the test setDBN samples. Input five characteristic gas running states are used as the output to train DBN model; concentration values to the LSTM model and predict future gas changes. Then calculate the gas (4) Use the trained LSTM_DBN network to test the test set samples. Input the five characteristic gas concentration ratio and use the ratio results as input to the DBN network to obtain the future concentration values to the LSTM model and predict future gas changes. 
Then calculate the gas running states of theratio transformer; concentration and use the ratio results as input to the DBN network to obtain the future If thererunning is fault states information in the prediction result, an early warning signal needs to be issued in of the transformer; time the fault type can be predicted. (5)and If there is fault information in the prediction result, an early warning signal needs to be issued in time and the fault type can be predicted. transformer oil chromatography historical data H2, CH4, C2H2, C2H4, C2H6
Figure 5. Flowchart of transformer running state prediction.
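The five steps above can be sketched end to end as follows. This is a minimal illustrative sketch, not the paper's implementation: `StubLSTM` and `StubDBN` are hypothetical placeholders for the trained networks, and the use of relative gas proportions as the ratio feature is an assumption, since the exact ratio set is not restated here.

```python
# A minimal end-to-end sketch of steps (1)-(5); StubLSTM and StubDBN are
# hypothetical placeholders for the trained networks, and using relative
# gas proportions as the "ratio" feature is an illustrative assumption.

GASES = ["H2", "CH4", "C2H2", "C2H4", "C2H6"]

def gas_ratios(conc):
    """One plausible ratio feature: each gas as a share of total gas."""
    total = sum(conc[g] for g in GASES)
    return {g: conc[g] / total for g in GASES}

class StubLSTM:
    """Placeholder for the trained LSTM: extrapolates the last linear
    trend of each gas series one step ahead."""
    def predict_next(self, history):
        return {g: 2 * history[-1][g] - history[-2][g] for g in GASES}

class StubDBN:
    """Placeholder for the trained DBN: a trivial rule that flags a
    medium-temperature thermal fault (MT) when CH4 dominates."""
    def classify(self, ratios):
        return "MT" if ratios["CH4"] > 0.5 else "normal"

def predict_running_state(history):
    future = StubLSTM().predict_next(history)  # step (2)/(4): forecast gases
    ratios = gas_ratios(future)                # step (4): ratio features
    return StubDBN().classify(ratios)          # step (3)/(4): state label

# Two successive oil chromatography samples (illustrative values, in uL/L).
history = [
    {"H2": 20.0, "CH4": 55.0, "C2H2": 0.5, "C2H4": 12.0, "C2H6": 8.0},
    {"H2": 21.0, "CH4": 60.0, "C2H2": 0.5, "C2H4": 13.0, "C2H6": 8.5},
]
print(predict_running_state(history))  # MT -> step (5): issue an early warning
```

In a real deployment the two stub classes would be replaced by the trained LSTM forecaster and DBN classifier; the surrounding control flow is the part that mirrors Figure 5.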
5. Case Analysis

5.1. Gas Concentration Prediction

This paper takes the oil chromatographic monitoring data collected by a 220 kV transformer oil chromatography online monitoring device as an example. The sampling interval is 1 day. For the methane gas concentration sequence, 800 monitoring data points are selected as training samples and 100 monitoring data points are used as test samples. The prediction results are shown in Figure 6.

In order to evaluate the accuracy and validity of the prediction model proposed in this paper, the following evaluation criteria are used for analysis:

avg_err = (1/N) Σ_{i=1}^{N} |x_i − x̂_i| / x_i × 100%    (19)

max_err = max_i |x_i − x̂_i| / x_i    (20)

where N is the number of samples in the test set, x_i is the real value and x̂_i is the predicted value.
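The evaluation criteria (19) and (20) can be computed directly; a minimal sketch, with both errors expressed in percent:

```python
# Evaluation criteria (19) and (20): average and maximum relative
# percentage error between true values x_i and predicted values x_hat_i.

def avg_err(true_vals, pred_vals):
    """Equation (19): mean of |x_i - x_hat_i| / x_i, in percent."""
    n = len(true_vals)
    return 100.0 * sum(abs(x - xp) / x for x, xp in zip(true_vals, pred_vals)) / n

def max_err(true_vals, pred_vals):
    """Equation (20): worst-case relative error, in percent."""
    return 100.0 * max(abs(x - xp) / x for x, xp in zip(true_vals, pred_vals))

# Illustrative series: errors of 1% and 2% give avg 1.5% and max 2.0%.
print(avg_err([100.0, 200.0], [99.0, 204.0]))  # 1.5
print(max_err([100.0, 200.0], [99.0, 204.0]))  # 2.0
```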
Figure 6. Methane gas concentration prediction results.
Figure 7. Relative percent error.
As shown in Figure 6, the prediction model proposed in this paper has good fitting ability and tracks the changing trend of the methane gas concentration closely. The relative percentage error between the true and predicted values is shown in Figure 7, where the average relative percentage error is 0.26% and the maximum relative percentage error is 1.21%.

The LSTM model is used to predict the other gas concentrations. The predicted results are shown in Table 2, which shows that the average error of the LSTM method is lower than that of the general regression neural network (GRNN), DBN, and SVM. Therefore, it can be seen that using LSTM to predict transformer gas concentrations has high stability and reliability.

Table 2. Gas Concentration Prediction Results.

                      Average Error (%)
Type of Gas   LSTM    GRNN    DBN     SVM
H2            1.89    5.01    2.48    6.77
CH4           0.26    3.93    1.78    4.01
C2H2          2.45    4.67    1.93    6.32
C2H4          1.45    2.98    2.05    5.94
C2H6          2.1     4.24    1.64    8.46
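For reference, a single forward step of an LSTM cell, following the gate equations summarized in the Nomenclature (forget gate f(t), input gate i(t), candidate cell state c̃(t), output gate o(t)), can be sketched with scalar weights. The weight values here are illustrative only, not the trained parameters.

```python
# One forward step of a single-unit LSTM cell, following the gate
# variables listed in the Nomenclature. Scalar weights are illustrative.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """Advance the cell one time step; p holds weight/bias parameters."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])          # forget gate f(t)
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])          # input gate i(t)
    c_tilde = math.tanh(p["wc"] * x + p["uc"] * h_prev + p["bc"])  # candidate state
    c = f * c_prev + i * c_tilde                                   # new cell state
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])          # output gate o(t)
    h = o * math.tanh(c)                                           # new hidden state
    return h, c

# Illustrative parameters: every weight and bias set to 0.5.
p = {k: 0.5 for k in ("wf", "uf", "bf", "wi", "ui", "bi",
                      "wc", "uc", "bc", "wo", "uo", "bo")}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.0, p=p)
print(h, c)  # h stays within (-1, 1); c accumulates across steps
```

The gating is what lets the network keep or discard long-range information, which is why the LSTM avoids the gradient-vanishing problems noted in the Conclusions.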
5.2. Running State Classification

The transformer oil chromatographic gas concentration ratios are used as the input to the DBN network and the seven transformer running states are the output. The case database used in this paper contains a total of 3870 datasets, including 838 normal cases and 3032 failure cases (521 LT cases, 376 MT cases, 587 HT cases, 519 PD cases, 489 LD cases and 540 HD cases). 90% of the sample data are randomly selected from the database to train the DBN network, leaving 10% of the sample data as the test samples to test the accuracy of the classification.

The classification results of DBN, SVM, and BPNN on the test set are shown in Figure 8. This paper evaluates the classification results of transformer running states by drawing confusion matrices. Light green squares on the diagonal indicate the number of samples whose predicted category matches the actual category, and the blue squares indicate the number of falsely identified samples. The last row of gray squares is the precision (the number of correctly predicted samples/number of predicted samples). The last column of orange squares is the recall (the number of correctly predicted samples/actual number of samples). The last purple square is the accuracy (all correctly predicted samples/all samples).
Figure 8. Comparison of classification results (a) DBN results; (b) SVM results; (c) BPNN results.
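The precision, recall, and accuracy defined above can be computed mechanically from any confusion matrix; a minimal sketch, applied to the DBN confusion matrix of Figure 8a (rows are actual categories, columns are predicted categories):

```python
# Precision, recall, and overall accuracy from a confusion matrix, as
# defined for Figure 8: precision per predicted column, recall per
# actual row, accuracy over all samples.

def matrix_metrics(cm):
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(n))
    recall = [cm[k][k] / sum(cm[k]) for k in range(n)]                      # per row
    precision = [cm[k][k] / sum(cm[r][k] for r in range(n)) for k in range(n)]  # per column
    return precision, recall, correct / total

# DBN confusion matrix of Figure 8a; state order: H, LT, MT, HT, PD, LD, HD.
cm = [
    [80,  1,  0,  1,  0,  1,  0],
    [ 2, 45,  0,  2,  0,  0,  3],
    [ 0,  1, 34,  0,  1,  1,  0],
    [ 0,  2,  3, 51,  0,  0,  2],
    [ 0,  2,  0,  0, 48,  0,  1],
    [ 1,  0,  2,  2,  1, 42,  0],
    [ 1,  0,  0,  0,  3,  1, 49],
]
precision, recall, accuracy = matrix_metrics(cm)
print(round(100 * accuracy, 1))  # 91.1, the DBN accuracy reported in Figure 8a
```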
From Figure 8, it can be seen that compared with the SVM model and the BPNN model, the DBN model has the highest classification accuracy, exceeding them by 9.6% and 16.2%, respectively. The precision and recall of the DBN model are both high, exceeding 85%. The comparison shows that the DBN model performs well for the classification of transformer running states. Since a single experiment may be affected by chance, this paper repeats 10 sets of tests on the DBN model, the SVM model, and the BPNN model to obtain their average accuracy. The average accuracy of the three models is 89.4%, 80.1%, and 71.9%, respectively. Therefore, it can be seen that the DBN model has strong classification stability while maintaining high accuracy.

5.3. Running State Prediction

The oil chromatogram data from January to October 2015 of a main transformer in a substation are selected for analysis. The sampling interval for data points is 12 h. The original data are shown in Figure 9.
Figure 9. Original oil chromatogram data.
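The baseline comparison below first applies the IEC three-ratio method. Its coding can be sketched as follows; the threshold table used here is the commonly tabulated three-ratio coding and is stated as an assumption (the paper does not restate the code table), and the gas concentrations are illustrative values chosen to reproduce the code 021 mentioned in the text.

```python
# Sketch of IEC three-ratio coding for DGA. Thresholds follow the
# commonly tabulated three-ratio rules (an assumption here); the input
# concentrations are illustrative, not measured values from the paper.

def ratio_code(c2h2, c2h4, ch4, h2, c2h6):
    """Return the three-digit ratio code for C2H2/C2H4, CH4/H2, C2H4/C2H6."""
    r1, r2, r3 = c2h2 / c2h4, ch4 / h2, c2h4 / c2h6
    code1 = 0 if r1 < 0.1 else (1 if r1 < 3 else 2)   # C2H2/C2H4
    code2 = 1 if r2 < 0.1 else (0 if r2 < 1 else 2)   # CH4/H2
    code3 = 0 if r3 < 1 else (1 if r3 < 3 else 2)     # C2H4/C2H6
    return f"{code1}{code2}{code3}"

# Illustrative concentrations (uL/L) that yield the code 021, which the
# code table associates with a medium-temperature thermal fault.
print(ratio_code(c2h2=0.5, c2h4=40.0, ch4=120.0, h2=60.0, c2h6=25.0))  # 021
```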
First, using the IEC three-ratio method integrated in the original system for analysis, there is no abnormal warning before September. In September, the measured ratio code is 021, which is consistent with a thermal fault of medium temperature; an abnormal warning should be issued at this time. Secondly, using the integrated threshold method in the original system, H2 content in excess of 150 µL/L is detected in October and an early warning signal is required.

Using the LSTM_DBN model proposed in this paper, the transformer running state is predicted and evaluated. Starting from the 5th month, the LSTM model is used to predict the transformer gas concentration values for the next month; then the gas concentration ratios are calculated and input into the DBN network to obtain the transformer's future running state. The transformer's running state from May to October is shown in Table 3.

Table 3. Transformer running state prediction results.
Month       H    LT   MT   HT   PD   LD   HD   Fault Case Rate
May         57   1    3    0    1    0    0    8.1%
June        53   0    5    1    1    0    0    11.7%
July        49   0    10   2    0    0    1    20.9%
August      30   2    28   1    0    1    0    51.6%
September   21   4    34   1    0    0    0    65%
October     16   3    37   4    0    2    0    74.2%
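The fault case rate column in Table 3 is simply the share of analyzed cases whose predicted state falls outside the normal state H; a minimal sketch:

```python
# Fault case rate as used in Table 3: the fraction of analyzed cases
# whose predicted state is any fault type (all states except normal "H").

def fault_case_rate(counts):
    total = sum(counts.values())
    faults = total - counts["H"]
    return 100.0 * faults / total

# State counts for May from Table 3.
may = {"H": 57, "LT": 1, "MT": 3, "HT": 0, "PD": 1, "LD": 0, "HD": 0}
print(round(fault_case_rate(may), 1))  # 8.1, matching Table 3
```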
As can be seen from Table 3, the percentage of fault cases obtained through analysis using the LSTM_DBN model is gradually increasing: the percentage in August already exceeds 50%, and the highest percentage of fault cases, 74.2%, occurs in October. It can be seen that there is a potential operational failure. Table 3 also shows that, among all the fault type analysis results, the number of MT fault cases is the largest, so the potential fault type is a thermal fault of medium temperature and early warning signals need to be sent. The oil chromatography monitoring device can be disturbed by the external environment, causing errors in data acquisition; when the fault cases account for more than 50%, the situation should immediately attract the attention of the staff. For this case, an equipment early warning should be issued in August: "Closely monitor the development trend of the chromatographic data and check the transformer status in a timely manner."

The actual detection records of the operation and maintenance personnel show that the oil temperature rose abnormally from June and the value of the core grounding current increased gradually. The value of H2 in the oil chromatographic device exceeded 150 µL/L from October to December. During the outage maintenance in 2016, traces of burning were found at the end of the winding and the B-phase winding was distorted. The prediction results of the transformer running state through the LSTM_DBN model are thus consistent with the actual situation. This example shows that the transformer running state prediction method based on the LSTM_DBN model can detect the abnormal
upward trend of oil chromatographic data in time and provide early warning of the abnormal state of the transformer.

6. Conclusions

(1) The LSTM model has an excellent ability to process time series and solves problems such as gradient disappearance, gradient explosion, and lack of long-term memory in the training process, so it can fully utilize historical data. The DBN model can extract the characteristic information hidden in fault case data layer by layer and has high classification ability.
(2) The transformer running state prediction method based on the LSTM_DBN model presented in this paper has high accuracy and can send warning information about potential transformer faults in time. Compared with the standard threshold method and the state prediction methods in the research literature, the method in this paper can make full use of historical and current state data.
(3) We will focus on improving the LSTM model and the DBN model, as well as on parameter optimization, to further increase the transformer state prediction accuracy in the next step. Due to the small number of substations with complete online monitoring equipment and rich state data, the method proposed in this paper needs further verification.

Author Contributions: Jun Lin designed the algorithm, tested the example and wrote the manuscript. Lei Su, Gehao Sheng, Yingjie Yan, Da Xie and Xiuchen Jiang helped design the algorithm and debug the code.

Acknowledgments: This work was supported in part by the National Natural Science Foundation of China (51477100) and the State Grid Science and Technology Program of China.

Conflicts of Interest: The authors declare no conflict of interest.
Nomenclature

Variables
x       the input vector
y       the output vector
h       the state of the hidden layer
Wxh     the weight matrix from the input layer to the hidden layer of the RNN network
Why     the weight matrix from the hidden layer to the output layer of the RNN network
Whh     the weight matrix of the hidden layer state used as the input at the next moment of the RNN network
f(t)    the result of the forget gate
Wf      the weight matrix of the forget gate
bf      the offset of the forget gate
i(t)    the input gate state result
c̃(t)    the cell state input at time t
Wi      the input gate weight matrix
Wc      the input cell state weight matrix
bi      the input gate bias
bc      the input cell state bias
o(t)    the output gate state result
Wo      the output gate weight matrix
bo      the output gate offset
v       a visible layer
w       the weights between visible and hidden layers
θ       the parameters of the RBM
ωij     the connection weight between visible layer node vi and hidden layer node hj
ai      the offset of vi
bj      the offset of hj

Symbols
σ       the activation function
◦       element-wise multiplication
References

1. Taha, I.B.M.; Mansour, D.A.; Ghoneim, S.S.M. Conditional probability-based interpretation of dissolved gas analysis for transformer incipient faults. IET Gener. Transm. Distrib. 2017, 11, 943–951.
2. Singh, S.; Bandyopadhyay, M.N. Dissolved gas analysis technique for incipient fault diagnosis in power transformers: A bibliographic survey. IEEE Electr. Insul. Mag. 2010, 26, 41–46.
3. Cruz, V.G.M.; Costa, A.L.H.; Paredes, M.L.L. Development and evaluation of a new DGA diagnostic method based on thermodynamics fundamentals. IEEE Trans. Dielectr. Electr. Insul. 2015, 22, 888–894.
4. Yan, Y.; Sheng, G.; Liu, Y.; Du, X.; Wang, H.; Jiang, X.C. Anomalous State Detection of Power Transformer Based on Algorithm Sliding Windows and Clustering. High Volt. Eng. 2016, 42, 4020–4025.
5. Gouda, O.S.; El-Hoshy, S.H.; El-Tamaly, H.H. Proposed heptagon graph for DGA interpretation of oil transformers. IET Gener. Transm. Distrib. 2018, 12, 490–498.
6. Malik, H.; Mishra, S. Application of gene expression programming (GEP) in power transformers fault diagnosis using DGA. IEEE Trans. Ind. Appl. 2016, 52, 4556–4565.
7. Khan, S.A.; Equbal, M.D.; Islam, T. A comprehensive comparative study of DGA based transformer fault diagnosis using fuzzy logic and ANFIS models. IEEE Trans. Dielectr. Electr. Insul. 2015, 22, 590–596.
8. Li, J.; Zhang, Q.; Wang, K. Optimal dissolved gas ratios selected by genetic algorithm for power transformer fault diagnosis based on support vector machine. IEEE Trans. Dielectr. Electr. Insul. 2016, 23, 1198–1206.
9. Zheng, R.; Zhao, J.; Zhao, T.; Li, M. Power Transformer Fault Diagnosis Based on Genetic Support Vector Machine and Gray Artificial Immune Algorithm. Proc. CSEE 2011, 31, 56–64.
10. Tripathy, M.; Maheshwari, R.P.; Verma, H.K. Power transformer differential protection based on optimal probabilistic neural network. IEEE Trans. Power Del. 2010, 25, 102–112.
11. Ghoneim, S.S.M.; Taha, I.B.M.; Elkalashy, N.I. Integrated ANN-based proactive fault diagnostic scheme for power transformers using dissolved gas analysis. IEEE Trans. Dielectr. Electr. Insul. 2016, 23, 586–595.
12. Li, S.; Wu, G.; Gao, B.; Hao, C.; Xin, D.; Yin, X. Interpretation of DGA for transformer fault diagnosis with complementary SaE-ELM and arctangent transform. IEEE Trans. Dielectr. Electr. Insul. 2016, 23, 586–595.
13. Ghoneim, S.S.M. Intelligent Prediction of Transformer Faults and Severities Based on Dissolved Gas Analysis Integrated with Thermodynamics Theory. IET Sci. Meas. Technol. 2018, 12, 388–394.
14. Zhao, W.; Zhu, Y.; Zhang, X. Combinational Forecast for Transformer Faults Based on Support Vector Machine. Proc. CSEE 2008, 28, 14–19.
15. Zhou, Q.; Sun, C.; Liao, R.J. Multiple Fault Diagnosis and Short-term Forecast of Transformer Based on Cloud Theory. High Volt. Eng. 2014, 40, 1453–1460.
16. Liu, Z.; Song, B.; Li, E. Study of "code absence" in the IEC three-ratio method of dissolved gas analysis. IEEE Electr. Insul. Mag. 2015, 31, 6–12.
17. Song, E.; Soong, F.K.; Kang, H.G. Effective Spectral and Excitation Modeling Techniques for LSTM-RNN-Based Speech Synthesis Systems. IEEE Trans. Speech Audio Process. 2017, 25, 2152–2161.
18. Gao, L.; Guo, Z.; Zhang, H. Video captioning with attention-based LSTM and semantic consistency. IEEE Trans. Multimed. 2017, 19, 2045–2055.
19. Zhao, J.; Qu, H.; Zhao, J. Towards traffic matrix prediction with LSTM recurrent neural networks. Electron. Lett. 2018, 54, 566–568.
20. Lin, J.; Sheng, G.; Yan, Y.; Dai, J.; Jiang, X. Prediction of Dissolved Gases Concentration in Transformer Oil Based on KPCA_IFOA_GRNN Model. Energies 2018, 11, 225.
21. Dai, J.J.; Song, H.; Sheng, G.H. Dissolved gas analysis of insulating oil for power transformer fault diagnosis with deep belief network. IEEE Trans. Dielectr. Electr. Insul. 2017, 24, 2828–2835.
22. Ma, M.; Sun, C.; Chen, X. Discriminative Deep Belief Networks with Ant Colony Optimization for Health Status Assessment of Machine. IEEE Trans. Instrum. Meas. 2017, 66, 3115–3125.
23. Beevi, K.S.; Nair, M.S.; Bindu, G.R. A Multi-Classifier System for Automatic Mitosis Detection in Breast Histopathology Images Using Deep Belief Networks. IEEE J. Transl. Eng. Health Med. 2017, 5, 1–11.
24. Karim, F.; Majumdar, S.; Darabi, H. LSTM fully convolutional networks for time series classification. IEEE Access 2017, 6, 1662–1669.
25. Zhang, Q.; Wang, H.; Dong, J. Prediction of Sea Surface Temperature Using Long Short-Term Memory. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1745–1749.
26. Zhang, S.; Wang, Y.; Liu, M. Data-Based Line Trip Fault Prediction in Power Systems Using LSTM Networks and SVM. IEEE Access 2017, 6, 7675–7686.
27. Song, H.; Dai, J.; Luo, L.; Sheng, G.; Jiang, X. Power Transformer Operating State Prediction Method Based on an LSTM Network. Energies 2018, 11, 914.
28. Zhong, P.; Gong, Z.; Li, S. Learning to diversify deep belief networks for hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2017, 55, 3516–3530.
29. Lu, N.; Li, T.; Ren, X. A deep learning scheme for motor imagery classification based on restricted Boltzmann machines. IEEE Trans. Neural Syst. Rehabil. Eng. 2017, 25, 566–576.
30. Chen, Y.; Zhao, X.; Jia, X. Spectral–spatial classification of hyperspectral data based on deep belief network. IEEE J. Sel. Top. Appl. Earth Observ. 2015, 8, 2381–2392.
31. Taji, B.; Chan, A.D.C.; Shirmohammadi, S. False Alarm Reduction in Atrial Fibrillation Detection Using Deep Belief Networks. IEEE Trans. Instrum. Meas. 2018, 67, 1124–1131.

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).