Laplacian Eigenmaps Feature Conversion and

1 downloads 0 Views 5MB Size Report
6 days ago - Swarm Optimization-Based Deep Neural Network for. Machine ... Feature fusion reduction is the process to produce new and more sensitive features .... Waveform indicator p5 p3 p5. Peak max|x(n)| p11. Pulse indicator p5. 1.
applied sciences Article

Laplacian Eigenmaps Feature Conversion and Particle Swarm Optimization-Based Deep Neural Network for Machine Condition Monitoring Nanqi Yuan 1,2, * , Wenli Yang 1 , Byeong Kang 1 , Shuxiang Xu 1 and Xiaolin Wang 2 1

2

*

Discipline of ICT, School of Technology, Environments and Design, University of Tasmania, Hobart TAS7005, Australia; [email protected] (W.Y.); [email protected] (B.K.); [email protected] (S.X.) School of Engineering, Australian Maritime College, University of Tasmania, Hobart TAS7005, Australia; [email protected] Correspondence: [email protected]

Received: 11 November 2018; Accepted: 11 December 2018; Published: 13 December 2018

 

Abstract: This work reports a novel method by fusing Laplacian Eigenmaps feature conversion and deep neural network (DNN) for machine condition assessment. Laplacian Eigenmaps is adopted to transform data features from original high dimension space to projected lower dimensional space, the DNN is optimized by the particle swarm optimization algorithm, and the machine run-to-failure experiment were investigated for validation studies. Through a series of comparative experiments with the original features, two other effective space transformation techniques, Principal Component Analysis (PCA) and Isometric map (Isomap), and two other artificial intelligence methods, hidden Markov model (HMM) as well as back-propagation neural network (BPNN), the present method in this paper proved to be more effective for machine operation condition assessment. Keywords: Laplacian Eigenmaps; feature conversion; deep neural network; particle swarm optimization; condition assessment

1. Introduction With advanced manufacturing industry (AMI) being attached an increasing importance by most countries in today’s world, effective machine health assessment theory is undergoing an unprecedented revolution. Evaluating and monitoring the performances of some pivotal components, such as gears or bearings, can detect the degradation or faults and correct them before machine breakdown occurs [1]. According to References [2–5], signal processing methods involving time–frequency entropy, wavelet transform, etc., are most popular among the existing works for health assessment. As modern mechanical structure becomes increasingly complex, the vibration signal that characterizes the running state of machine needs to be analyzed more accurately. However, the incipient fault features of target machines is usually so weak that it is always submerged in the strong noise environment and difficult to extract. Therefore, more effective methods that can extract representative features from volatile machine running conditions and provide precise evaluation results are urgently needed. A number of artificial intelligence techniques have already been applied for distinguishing machinery health conditions, for example, Cui et al. [6] proposed a novel approach of analog circuit fault diagnosis by using a support vector machine (SVM) classifier. A kind of hidden Markov model (HMM)-driven robust probabilistic principal component analyzer was created by J. Zhu et al. [7] for dynamic process fault classification. In addition, Yan and Guo [8] adopted back-propagation neural network (BPNN) to assess on-line bearing performance degradation. More recently, diverse novel machine learning algorithms such as deep neural networks (DNNs) [9], deep belief networks Appl. Sci. 2018, 8, 2611; doi:10.3390/app8122611

www.mdpi.com/journal/applsci

Appl. Sci. 2018, 8, 2611

2 of 15

(DBNs) [10], and convolutional neural networks (CNNs) [11] have been effectively applied in the related field. Additionally, a new hierarchical network that stacks DBNs layer by layer was proposed by Meng et al. [12] to assess mechanical systems. However, there is currently no proven mature method can be effectively employed in the establishment of deep neural network models. In practice, researchers have to continually change parameters and experiment multiple times to build a model structure, of which the instability and uncertainty have made the development and application of machine learning models limited, and even questioned. This paper proposes to adopt the particle swarm optimization (PSO) algorithm [13] to optimize the model parameters of DBN for machine condition assessment, which can reduce the complexity of DNN modeling. Nowadays, in addition to time and frequency analysis and other traditional methods, some novel analysis techniques such as short time Fourier transform (STFT) [14], Wigner Ville Distribution (WVD) [15], and wavelet packet transform (WPT) [16] have also been effectively applied to extract vibration signal features. Feature fusion reduction is the process to produce new and more sensitive features through a series of transformations or combinations of the original input feature sets. Compared with feature selection, physical meaning of the new features that acquired after fusion reduction is obviously different from that of original ones and is difficult for understanding. However, since all of the features are involved in the transformation, feature fusion reduction is capable of retaining most of the useful signal classification information by projecting the feature set from high dimensional feature space to low dimensional feature space directly. Since traditional linear data space conversion methods including principal component analysis (PCA) [17], fisher discriminant analysis (FDA) [18], and other algorithms may ignore the convex and concave characteristics of nonlinear feature data under some conditions, the nonlinear feature space conversion technologies based on manifold learning have become a hotspot of current research, which mainly include the Isometric map algorithm (Isomap) [19], Laplacian Eigenmaps algorithm (LE) [20], local linear embedding algorithm (LLE) [21], and so on. Laplacian Eigenmaps constructs the relationship between the data from the local point of view. If two data instances i and j are similar to each other, then I and j would be as close as possible in the subspace transformed by LE, it is therefore, LE is competent to reflect the intrinsic manifold structure of data. After comparing the existing shortcomings of the proposed techniques, this paper reports a novel method that integrates LE into DNN to assess machine running conditions, which mainly includes four steps: (1) Original features ex-traction; (2) LE feature space conversion; (3) normal DNN optimization and training; and (4) obtaining the evaluation results with the well trained DNN. The comprehensive contrast experiments with two other important feature space conversion algorithms PCA and Isomap, and two other intelligent evaluation models HMM and BPNN highlighted the advantages of the method proposed in this paper. Other sections of the article are arranged as follows: Section 2 introduced executive steps of the proposed method in detail, in which all the related algorithms and theories are illustrated. In Section 3, the experiments of extracting original features of the vibration signal, feature space transformation and the comparison experiments with some other popular feature space transformation methods and intelligent assessment algorithms were carried out, and the assessment results are also introduced in this section. Then, in Section 4, the conclusions were drawn. 2. The Proposed Method 2.1. Original Features Extraction Data preprocessing, such as moving out the abnormal data form the original signal dataset, is an essential procedure before feature extraction, and more precise outcomes require more meticulous preprocessing. Since the original signal doped with a lot of messy interference information is unsuitable for analyzing directly, it is a good choice to extract the features of the signal data for further analysis.

Appl. Sci. 2018, 8, 2611

3 of 15

To retain the original useful information as much as possible, both classical and contemporary methods for feature extraction were employed in this paper. 2.1.1. Time and Frequency Analysis Time and frequency domain feature analysis is one of the dominant ways for state evaluation and fault diagnosis of mechanical equipment. Among which, time domain signal possessing the characteristics of containing large information, intuitive and easy to understand is the original basis for health evaluation and diagnosis of machines. The frequency domain characteristic parameters describe signal through the change of frequency band in signal spectrum and the dispersion of spectrum energy. Accompanied by the occurrences and developments of rotating machinery’s faults, the frequency components of vibration signal would change as well, hence the running status of the equipment can be evaluated according to the composition and size of these frequency components. In this paper, 11 characteristic parameters in time domain and 13 characteristic parameters in time domain are adopted, which are displayed in Tables 1 and 2, respectively. In our research, these features are the most effective and the most widely used signal features. Table 1. Features in Time-domain. Features

Names

Equations

Features

N

∑ n =1 x ( n ) N

p1

Mean

p2

variance

p3

Square root amplitude

p4

Valid value

p5

Peak

p6

Skewness index

r 

N

∑n=1 ( x (n)− p1 ) N −1 N

∑ n =1



2

2

| x (n)|

N

q

N

∑n=1 ( x (n)) N

2

max| x (n)| 1 N

Names

p7

Kurtosis index

p8

Peak factor

p9

Margin indicator

p10

Waveform indicator

p11

Pulse indicator

Equations N

1 N

∑n=1 ( x (n)− p1 ) p2 3

1 N

∑n=1 ( x (n)− p1 ) p2 4

N

p5 p4

1 N

p5 p3 p5 N ∑n=1 | x (n)|

p4 N ∑n=1 | x (n)|

Note: x (n) is the sequence of time domain signal, n = 1, 2, . . . , N, N is the total number of samples.

Table 2. Features in Frequency-domain. Features

Names

Equations

p12

Mean frequency

∑ a =1 s ( a ) A

p13

Standard deviation frequency

p14

Spectral skewness

∑ a=1 (s( a)− p12 ) 3 √ A( p13 )

p15

Spectral kurtosis

∑ a=1 (s( a)− p12 ) Ap13 2

p16

First-order center of gravity

A ∑ a =1 f a s ( a ) A ∑ a =1 s ( a )

p17

Second-order center of gravity

p18

Second order moment of spectrum

A

q

A

∑ a=1 (s( a)− p12 ) A −1

Features

Names

Equations r

p19

None

p20

None

p21

None

p17 p16

p22

None

3 A ∑ a=1 ( f a − p16 ) s( a) Ap17 3

p23

None

4 A ∑ a=1 ( f a − p16 ) s( a) Ap17 4

p24

None

A ∑ a =1 f a 4 s ( a ) A ∑ a =1 f a 2 s ( a )



A

A

q

4

2 A ∑ a=1 ( f a − p16 ) s( a) A

r

A ∑ a =1 f a 2 s ( a ) s( a) ∑ aA=1 f a 4 s( a)

A ∑ a =1

A

1/2

∑ a=1 ( f a − p16 ) Ap17 1/2

s( a)

A ∑ a =1 f a 2 s ( a ) A ∑ a =1 s ( a )

Note: s( a) is signal spectrum, a is spectral line. f a is the frequency of a-th spectral line, A is the total number of spectral lines. a = 1, 2, . . . , A.

Appl. Sci. 2018, 8, 2611

4 of 15

2.1.2. WPT When wavelet packet transform (WPT) is decomposing the low frequency part of the signal, it can also decompose the high frequency part more meticulously at the same time, and there is neither redundancy nor omission in the decomposition. Therefore, WPT can provide better time-frequency analysis than wavelet transform for the mechanical vibration signal containing both medium and high frequency information. The steps to extract wavelet packet energy features mainly include [22]: (1)

Extract signals in each sub-band

Recorded the wavelet function W(k) and scaling function ϕ(k) as µ1 = W(k) and µ0 = ϕ(k), respectively, then  √  µ2n (t) = 2 ∑ h(t)µn (2t − k)  √k∈Z (1)  µ ( t ) = 2 ∑ g(t)µn (2t − k) 2n + 1  k∈ Z

where gk = (−1)k h1−k is a biorthogonal filter, n = 2l or n = 2l + 1, l = 0, 1, 2, · · · The recursively defined function µn is called the wavelet packet determined by orthonormal scaling function µ0 = ϕ(k). (2)

Calculate the energy of each sub-band

Set the signal energy corresponding to the reconstructed signal cjk of the jth frequency band of the kth layer after the wavelet packet decomposition as Ejk , then Ejk =

2 c jk (t) dt =

Z

N



x jm

(2)

m =1

In which m is the discrete point of the reconstructed signal cjk of the jth frequency band of the kth layer, while xjm stands for the amplitude of the discrete points of the reconstructed signal cjk . (3)

Constructing wavelet packet feature vector

The feature vector of the wavelet packet can be obtained through normalizing the characteristic parameters calculated by the following formula: e=

n

o Ej0 , Ej1 , · · · , Ejl /E, l = 2 j − 1

(3)

l

where E = ∑ Ejk is the total energy of the signal that equals to the sum of the energy of each sub-band. k=1

After selection, this paper extracted 14 WPT original features for further research. 2.1.3. LE Feature Space Conversion The sample data of high dimensional spaces is actually in a low dimensional manifold, of which the structure contains the geometric characteristics and the intrinsic dimensionality information of the original data [23]. The sample data in high dimensional space (D dimension) can actually be projected into a low dimensional manifold (L dimension, L ≤ D), which can accurately reflect the geometric characteristics of the original data. As a nonlinear space dimensionality transformation technique, LE builds a graph from neighborhood information of the data set, and each data point serves as a node on the graph and connectivity between nodes is governed by the proximity of neighboring points, which can be generally represented as: LE

MD ⇒ ML , (L ≤ D) where MD and ML stand for the original features in D-dimensional space and projected features in L-dimensional space, respectively. The steps can be summarized as follows [24]:

serves as a node on the graph and connectivity between nodes is governed by the proximity of neighboring points, which can be generally represented as: M

M , (L

D)

Appl. Sci. 2018, 8, 2611

where M

5 of 15 and M stand for the original features in D-dimensional space and projected features in

L-dimensional space, respectively. The steps can be summarized as follows [24]: (A) Constructing the Graphs (A) Constructing the Graphs Given k points x1 , . . . , xk in MD , construct a weighted graph with k nodes, one for each point k points x ,…, x neighboring in M , construct graph withpurpose, k nodes, one each point and aGiven set of edges connecting pointsatoweighted each other. For this put an for edge between and a iset edges neighboring to each other.algorithm For this purpose, put nodes andofj that areconnecting close. In this work, thepoints n-nearest neighbors is adopted to an findedge the between nodes i and j that are close. In this work, the n-nearest neighbors algorithm is adopted nodes that are close to each other. In this method, nodes of i and j are connected by an edge if i to is find then-nearest nodes that are closeoftonode eachj.other. In this method, nodes of i and j are connected by an edge among neighbors if i is among n-nearest neighbors of node j. (B) Choosing weights (B) Choosing weights The heat kernel algorithm described in previous section was introduced to calculate the weights heatinkernel algorithm graph. described in previous was introduced to calculate the weights of theThe edges the constructed If nodes i and j section are connected, put of the edges in the constructed graph. If nodes i and j are connected, put Wi,j = e− W, = e

k xi − xj k 2 4t

(4) (4)

(C) Eigenmaps (C) Eigenmaps As for a constructed graph G, to obtain the connected components, we should compute the As for a constructed graph G, to obtain the connected components, we should compute the eigen-values and eigen-vectors for the generalized eigen-vector problem as: eigen-values and eigen-vectors for the generalized eigen-vector problem as: Ay Ay = = δBy δBy

(5)(5)

where B is which thethe entries areare columns sums of W, B = is the thediagonal diagonalweight weightmatrix, matrix,ofof which entries columns sums of W, Bii∑=W∑,j and Wji , and =W B− the Laplacian matrix. A=A B− is W theisLaplacian matrix. The The main main processes processes can can be be presented presented as as Figure Figure 1. 1.

Figure 1. Main processes of LE space conversion.

As introduced in Section 2.1, an n × 38 feature array composed of 38 original features extracted from the vibration signal is acquired in high dimension feature space. Additionally, before the high dimension feature array is projected to lower dimensional space by LE, maximum likelihood estimation (MLE) was adopted to calculate the intrinsic dimension of the array, then an n × m (m < 38) lower dimensional feature array was obtained. 2.2. DNN Training and Optimization 2.2.1. Construction of Deep Neural Network Hinton et al. proposed a feasible scheme to construct deep structure neural network. The key points of this method is to use some Restricted Boltzmann machines (RBM) to execute the pre-training without supervision, and tack up these RBMs layer by layer to construct a DBN. RBM is a probabilistic model that can be represented by a kind of undirected graph models. The undirected graph model has two layers, of which one is a visible layer used to describe the characteristics of the input data, while another is a hidden layer, and each layer is composed of a plurality of probability units. All the visible layer elements are connected with the random binary hidden layer elements by undirected weights, however, there is no connection between the elements in the same visible or hidden layer. DBN is built through stacking a number of RBMs from bottom to top layer by layer, of which the rules are available in the literature of Reference [25]. Since the input features of this paper are

undirected graph model has two layers, of which one is a visible layer used to describe the characteristics of the input data, while another is a hidden layer, and each layer is composed of a plurality of probability units. All the visible layer elements are connected with the random binary hidden layer elements by undirected weights, however, there is no connection between the elements in theSci.same Appl. 2018, visible 8, 2611 or hidden layer. 6 of 15 DBN is built through stacking a number of RBMs from bottom to top layer by layer, of which the rules are available in the literature of Reference [25]. Since the input features of this paper are continuous variables, variables, the the first first two two layers layers are are built RBM models, models, while while other other continuous built as as Gaussian-Bernoulli Gaussian-Bernoulli RBM hidden layers are built as Bernoulli-Bernoulli RBM models. The output values of the lower layer are hidden layers are built as Bernoulli-Bernoulli RBM models. The output values of the lower layer are used as higher oneone between two two binary RBM RBM layers,layers, through repeating which the network used as inputs inputsofofthethe higher between binary through repeating which the structure with desired hidden layer number can be obtained at last. network structure with desired hidden layer number can be obtained at last. In this this paper, paper, aa linear linearoutput outputlayer layerisisadded addedatatthe thetop topofofthe the DBN form DNN that is used In DBN toto form DNN that is used to to study mapping relationship betweenthe thevibration vibrationsignal signalfeatures featuresand and the the equipment equipment state state study thethe mapping relationship between information, the the architecture architecture of information, of DNN DNN is is shown shown in in Figure Figure 2. 2.

Output

h1



hn

Hidden layer m

h3 …

hk

Hidden layer m-1

hj

Hidden layer 1

vi

Visible layer

h2

h1

h2

h1

h2



DBN

v1

v2

h3 … … Input

Structure of DNN. Figure 2. Structure

2.2.2. DNN Optimization Based on PSO 2.3.2. DNN Optimization Based on PSO As for the DNN models, the quantities of hidden nodes and hidden layers are the most significant As for the DNN models, the quantities of hidden nodes and hidden layers are the most parameters, which decide the ability of DNNs to capture useful information from massive input data. significant parameters, which decide the ability of DNNs to capture useful information from The architectures of a DNN model can be defined as follows: massive input data. The architectures of a DNN model can be defined as follows:   DNN param2;j ;param3 param3 DNN param1; param1; param2 param21 ,, ·⋯· · param2 (6)(6) where param1 represents the number of input nodes, param2i denotes the number of hidden nodes of ith hidden layer, while param3 stands for the number of output nodes. A lot of research reveals that too few hidden nodes usually make the network not competent enough for modelling the data, while too many hidden nodes may trigger some problems such as over-fitting and even lead to unreliable results at last [12]. However, until now, there is no mature theory that has been reported for computing the exact quantity of hidden nodes or layers, which remains the construction of DNN that is still an intractable task. In this paper, we propose to use the particle swarm optimization (PSO) algorithm to optimize the model parameters of DNNs. The particle swarm optimization algorithm can be regarded as a process for the global optimization of the population, which can be effectively applied to many optimization problems. The PSO algorithm adopted in this study can realize the establishment of the optimal DNN model through iterating the model parameters. The target parameters of the optimization include the number of nodes in the hidden layer, the order of training in the DNN model, and the number of trainings per level. Usually, the number of nodes in the input layer is less than half of the number of hidden layers, and the number of nodes in the current hidden layer is generally not less than 2 times that of the next layer. For example, if the DNN model has a hidden layer of 2, with L input nodes and 1 output node. L denotes the feature dimension after transformation, and is generally not greater than 6, then the values of the number of nodes in the

Appl. Sci. 2018, 8, 2611

7 of 15

two hidden layers can be set as 12 to17 and 6 to 8, respectively. The verification shows that when the order of the model training exceeds 8, the error generated by the model increases exponentially, so the order of the model training is set as 1 to 7. Additionally, for the number of trainings of each order, it is required to be divisible by the number of elements included in the input node, taking into account the performance of the computer and the time PSO algorithm need, the times of training are tentatively set as {500, 1000, 1250, 2000, 2500, 4000, 5000, 6250, 10,000, and 12,500}. After optimization, the parameters param21 , param2j in Equation (6) will be determined, and thus the optimal DNN model can be obtained. 2.3. Condition Assessment The DNN model optimized by PSO can express numerous function sets in a more compact and concise way, which make it very suitable for DNN to obtain the essential characteristics of massive data. To analyze the whole life running condition of the machine, the entire dataset ML that composed of feature data of the vibration signals collected under normal condition as well as abnormal condition is used as the testing data and input into DNN model, that is  Testing data :

  L M =  

p1 p2 .. . pM

  1 p    .1 = . .   p1M

 · · · pL1 ..  .. . .  · · · pLM

where NL ∈ ML . Additionally, the assessment is then the result of when the machine was healthy, when the incipient slight faults occurred, and when the serious faults occurred would be accurately detected. We consider this task as a regression task. For a regression task, each training instance may have a value, such as 0.8,0.9,1.0,1.1,1.2 and so on. In mechanical part health monitoring, we could set “1.0” for healthy training dataset. When the validation dataset output “1.0” or values close to “1.0”, we could consider this signal to be healthy. When to validation dataset output “0.5” or “1.5”, we may consider this signal to be unhealthy. The output will fluctuate when faults occurred. It’s easier to detect and monitor the mechanical part health using this method. Figure 3 exhibits the main procedures of the proposed method of integrating LE into deep neural network for evaluating machine health state in general.

Appl. Sci. 2018, 8, 2611

8 of 15

Appl. Sci. 2018, 8, x FOR PEER REVIEW

8 of 14

Figure 3. 3. Flow Flow chart chart of of the the proposed proposed method. method. Figure

3. 3. Experiments and Analysis Analysis 3.1. 3.1. Test Test Rig Rig and and Data Data Bearings, Bearings, the the most most important important components components of of mechanical mechanical transmission transmission system, system, are are also also the the most most vulnerable vulnerable parts parts due due to to the the complicated complicated internal internal constitution, constitution, and and most most machine machine failures failures are are caused caused by damage of ofthe thecritical criticalequipment equipmentsuch such bearings. bearing run-to-failure experiment by the the damage asas bearings. TheThe bearing run-to-failure experiment was was implemented on the test devices shown in Figure 4, in which there are four bearings in the implemented on the test devices shown in Figure 4, in which there are four bearings in the transmission system, the rotating speed of the main shaft with constant load was kept invariable transmission system, the rotating speed of the main shaft with constant load was kept invariable by by the the given given alternating alternating current current motor. motor. The The parameters parameters of of the the bearings bearings and and operation operation conditions conditions are are listed listed in in Table Table 3. 3. To To obtain obtain the the accurate accurate vibration vibration data, data, the the bearing bearing housing housing was was fixed fixed with with two two High High Sensitivity Quartz ICP accelerometers (PCB 353B33), of which one is fixed in the horizontal direction Sensitivity Quartz ICP accelerometers (PCB 353B33), of which one is fixed in the horizontal direction and and the the other other is is fixed fixed in in the the vertical vertical direction, direction, and and the the NI NI DAQ DAQ Card Card 6062E 6062E was was also also applied applied in in the the

data acquisition system. When the experiment was done, 984 individual ASCII format data files

were got, of which each file consists of 20,480 data points with the recording interval of 10 min. The outer ring of selected bearing was found to be faulty after the test.

Appl. Sci. 2018, 8, 2611

9 of 15

Appl. Sci. 2018, 8, x FOR PEER REVIEW

9 of 14

data acquisition system. When the experiment was done, 984 individual ASCII format data files were got, of which each file consists of 20,480 data points with the recording interval of 10 min. The outer were got, of which each file consists of 20,480 data points with the recording interval of 10 min. The ring of selected bearing was found to be faulty after the test. outer ring of selected bearing was found to be faulty after the test.

Figure 4. Test rig. Table 3. Bearing parameters and experimental conditions. Sampling Rate Type

Number

Ball Diameter (mm)

Contact Angle (deg)

Rotation Speed (RPM)

Load (kN·m) (kHz)

ZA-2115

4

10

0

3.2. Feature Space Conversion

1500

26.50

20

Figure 4. 4. Test rig. Figure Test rig.

The test data collected from the above experiment were adopted for further analysis. After data Table 3. and conditions. pre-processing, time and frequency domain analysis and WPT were utilized to extract the Table 3. Bearing Bearing parameters parameters and experimental experimental conditions. thirty-eight features in original feature space, here a 9810 × 38 array composed of original feature Ball Contact Rotation Load Sampling Sampling Rate Number wasType obtained. The eight representative original features that have beenLoad widely for(kHz) further Type Number Ball Diameter (mm) Angle(deg) (deg) Rotation Speed (RPM) (kN·m) Diameter (mm)Contact Angle Speed (RPM) (kN ·m) used Rate (kHz) research are displayed in Figure 5, in which all the waveforms of the eight features have an obvious ZA-2115 4 10 0 1500 26.50 20 mutation of them (Mean and ZA-2115at time 4 point 7000, 10however, only four 0 1500 frequency (MF), 26.50 WPT1, WPT5, 20 WPT6) show slight abnormality at time point 5300, while the other four features (Skewness, Kurtosis, 3.2. Feature Space Conversion Crest, and Standard deviation frequency (SDF)) seem unable to detect the early slight abnormality. 3.2. Feature Space Conversion The collected from the aboveofexperiment wereoriginal adoptedfeatures for further After Next,test thedata instinct dimension quantity the thirty-eight wasanalysis. computed withdata the The test data collected from the above experiment were adopted for further analysis. After data pre-processing, time and frequency domain analysis utilized to extract the thirty-eight help of the MLE algorithm, the answer was got as and six. WPT Then were the local non-linear space conversion pre-processing, and to frequency analysis anddimension WPTof were utilized towas extract the features inLE original feature space, herethe adomain 9810 × 38 arrayhigh composed original feature obtained. technique wastime applied project features from space to lower dimensional thirty-eight features in original feature space, here a 9810 × 38 array composed of original feature The eight representative features that have been the widely used for original further research displayed space according to theoriginal steps in Section 2.2, hence, 9810 × 38 feature are dataset was was obtained. The eight representative original features that have been widely used for further in Figure 5, ininto which all the the eight of features havefeatures. an obvious mutation point transformed a 9810 × waveforms 6 one that of composed mapping From Figure at6,time it can be research arethat displayed Figure 5, in which all (features the waveforms theWPT5, eight features haveshow anabnormal obvious 7000, however, only ofprojected them (Mean frequency (MF), and becoming WPT6) slight discovered four four ofinthe features 1, 2,WPT1, 3,ofand 5) started mutation at time point 7000, however, four of features them (Mean frequency (MF), WPT1, WPT5, and abnormality at time point 5300, while theonly other (Skewness, Kurtosis, Crest, and Standard around time point 5300, while there occurred anfour obvious mutation around time point 7000 for all these WPT6) show slight abnormality time point 5300,the while other four features (Skewness, Kurtosis, deviation frequency (SDF)) seematunable to detect earlythe slight abnormality. six features. Crest, and Standard deviation frequency (SDF)) seem unable to detect the early slight abnormality. Next, the instinct dimension quantity of the thirty-eight original features was computed with the Kurtosis Skewness help of the MLE algorithm, the answer was got as six. Then the local non-linear space conversion technique LE was applied to project the features from high dimension space to lower dimensional space according toCrestthe steps in Section 2.2, hence, the 9810 × 38 original feature dataset was MF transformed into a 9810 × 6 one that composed of mapping features. From Figure 6, it can be discovered that four of the projected features (features 1, 2, 3, and 5) started becoming abnormal SDF around time point 5300, while there occurred an obvious WPT1 mutation around time point 7000 for all these six features. 20

2

Point: 7000

10

0

0

1000

2000

3000

4000

5000

6000

Point: 7000

0

7000

8000

9000

10000

10

-2

0

1000

2000

3000

5000

6000

7000

Point: 5300

Point: 7000

5 0

4000

8000

9000

10000

8000

Point: 7000

7000

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

6000

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

4

x 10

2

0.4

Point: 5300

Point: 7000

1 0

Point: 7000

0.3

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

0.04

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

0.2

WPT5

20 0.02

Point: 5300

Skewness

10 0 0 0 0

Point: 7000

5000

WPT6 Kurtosis

2 0.1

Point: 7000

1000

2000

3000

4000

6000

7000

8000

9000

10000

1000

2000

3000

4000Time 5000 / min6000

7000

8000

9000

10000

10

0 0 -2

Point: 7000

Point: 5300 Point: 7000

0

1000

2000

3000

4000

6000

7000

8000

9000

10000

0

1000

2000

3000

4000Time 5000 / min6000

5000

7000

8000

9000

10000

8000

Crest

0

1000

Point: 5300

MF Figure 5. Eight of the features7000in original feature space. Point: 7000

5 0

0.2

2000

3000

4000

5000

6000

7000

8000

9000

10000

6000

0

1000

2000

3000

4000

5000

6000

Point: 7000

7000

8000

9000

10000

Next, the instinct dimension quantity of the thirty-eight original features was computed with WPT1 the help of the MLESDFalgorithm, the answer was got as six. Then the local non-linear space conversion technique LE was applied to project the features from high dimension space to lower dimensional space according to the steps in Section 2.1.3, hence, the 9810 × 38 original feature dataset was transformed 4

2

x 10

0.4

Point: 5300

Point: 7000

1 0

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

0.04

0.2

0

1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

8000

9000

10000

0.2

WPT5

Point: 5300

WPT6

Point: 7000

0.02 0

Point: 7000

0.3

Point: 7000

Point: 5300

0.1

0

1000

2000

3000

4000

5000

6000

Time / min

7000

8000

9000

10000

0

0

1000

2000

3000

4000

5000

6000

Time / min

7000

Appl. Sci. 2018, 8, 2611

10 of 15

into a 9810 × 6 one that composed of mapping features. From Figure 6, it can be discovered that four 10 of 14 of the projected features (features 1, 2, 3, and 5) started becoming abnormal around time point 5300, while there occurred an obvious around time point 7000 for space. all these six features. Figure 5.mutation Eight of the features in original feature Appl. Sci. 2018, 8, x FOR PEER REVIEW

Figure 6. The The six six features in projected feature space.

It can also be easily discovered that the mapping features in projected feature space performed much better than the features in in original original feature feature space space for for abnormality abnormality detection. detection. The abnormal phenomena expressed by the features described above may indicate that the bearing applied in the test might have slight faults around time point 5300, while serious degradation might occur around Additionally,since sinceallallthe the curves both original projected features before 7000. Additionally, curves of of both original andand projected features before time time pointpoint 5300 5300 performed smoothly stably, it can inferred that themachine machinewas wasrunning running under under normal performed smoothly and and stably, it can be be inferred that the condition during the period before this time point. 3.3. DNN DNN Condition Condition Assessment Assessment 3.3. 3.3.1. DNN Construction and Training 3.3.1. DNN Construction and Training In this study, to correspond with the dimension of the input projected feature array, the number In this study, to correspond with the dimension of the input projected feature array, the number of input nodes was also set as six. While in view of ultimate purpose is the result of the equipment of input nodes was also set as six. While in view of ultimate purpose is the result of the equipment condition evaluation, a single output node is preferred. The quantities of hidden nodes and layers condition evaluation, a single output node is preferred. The quantities of hidden nodes and layers of of the DNN can be optimized with PSO algorithm proposed in Section 2.2. The dataset applied for the DNN can be optimized with PSO algorithm proposed in Section 2.3. The dataset applied for training and fine-tuning DNN were segmented into two parts: The first 80% were used for training training and fine-tuning DNN were segmented into two parts: The first 80% were used for training and fine-tuning, while the rest for validation. The algorithm parameters of DNN model, such as and fine-tuning, while the rest for validation. The algorithm parameters of DNN model, such as numepochs, batchsize, momentum, and so on, were adjusted instantly and repeatedly to achieve better numepochs, batchsize, momentum, and so on, were adjusted instantly and repeatedly to achieve results in the experiments. According to Equation (6), after a series of comparative experiments the better results in the experiments. According to Equation (6), after a series of comparative experiments model possessing smooth, clear, and reasonably trended curve is constructed as the model possessing smooth, clear, and reasonably trended curve is constructed as DNNDNN 50, 50, 20, 20, 10;10; 1] 1 6; 100, 1 [6; 100,

and the critical DNN parameters numepochs, batchsize, and momentum were set to 3, 50, and 0, and the critical DNN parameters numepochs, batchsize, and momentum were set to 3, 50, and 0, respectively. According to the analysis in Section 3.2, the feature data before 5300th min are all respectively. According to the analysis in Section 3.2, the feature data before 5300th min are all collected collected in normal condition. Therefore, as described in Section 2.4, the former 2500 × 6 subpart of in normal condition. Therefore, as described in Section 2.3, the former 2500 × 6 subpart of the 9810 × 6 the 9810 × 6 mapping features array obtained by LE will be used as training data to train DNN, and mapping features array obtained by LE will be used as training data to train DNN, and the weights the weights and biases are fine-tuned through the CD and BP algorithms. and biases are fine-tuned through the CD and BP algorithms. 3.3.2. 3.3.2. Assessment Assessment and and Results Results After optimized and well-trained DNN assessment model, entire feature feature dataset dataset After getting getting the the optimized and well-trained DNN assessment model, the the entire composed of feature data of the vibration signals collected under normal condition as well as composed of feature data of the vibration signals collected under normal condition as well as abnormal abnormal condition are used to evaluate the lifelong running condition of the machine. condition are used to evaluate the lifelong running condition of the machine. Firstly, the 9810 × 6 array of six mapping features in the projected space is input into DNN to conduct the assessment experiment, of which the result is shown Figure 7. Then, in order to make a comparison and to demonstrate the advantages of the proposed method, the 9810 × 38 array of

curves show a basically linear trend, but the curve of the mapping features is more stable, while that of the original features fluctuates obviously, which indicates that the former is much more insensitive to noise than the latter. (2) Both of these two kinds of features can detect the serious degradation such as crackle, fatigue spalling, etc., occurred at 7000th min, however, the original features could not detect the Sci. early slight Appl. 2018, 8, 2611faults of wear, pitting, or overheat began in the vicinity of 5300th min, while 11 ofthe 15 mapping features performed well on this issue. (3) Additionally, at the end, the second curve changes the direction of the trend and performs very disorderedly, while the first curve shows a good Firstly, the 9810 × 6 array of six mapping features in the projected space is input into DNN to unilateral trend and rises monotonically and sharply after the 9400th min, which indicates that1 the conduct the assessment experiment, of which the result is shown Figure 7. Then, in order to make bearing got started to deteriorate so violently that it could no longer work. a comparison the advantages method, 9810 × 38 array Contrast and withto thedemonstrate actual experimental situation,ofitthe canproposed be discovered that the the assessment resultof of thirty-eight original features without feature space conversion were applied to conduct the same the proposed method that transforms the features in original higher dimensional space to projected experiment, and Figure 8 plots theinresult, of which DNN model constructed as lower dimensional space by LE this study wasthe consistent withisthe real operation status of the bearings, while the assessment results of the original features without feature space conversion are in DNN2 [38; 100, 50, 20, 10; 1] great difference with the actual situation. 10 10 Assessment Assessment result result of of the the proposed proposed method method

88

Assessment value value Assessment

66

Sharp Sharp deterioration deterioration begining begining point:940 point:940

44 22

Mutational Mutational point:700 point:700

00 -2 -2

Abnarmality Abnarmality beginning beginning point:530 point:530

-4 -4 -6 -6 -8 -80 0

100 100

200 200

300 300

400 500 600 400 500 600 Time Time // (10min) (10min)

700 700

800 800

900 900

1000 1000

Figure7.7.Assessment Assessmentresult resultof ofthe the66mapping mappingfeatures. features. Figure 1.1 1.1 1.05 1.05 11

Assessment value value Assessment

0.95 0.95 0.9 0.9 0.85 0.85 Mutational Mutational point: point: 700 700

0.8 0.8 0.75 0.75 0.7 0.7 0.65 0.65 00

Assessment Assessment result result of of the the 38 38 original original features features by by DBN DBN

100 100

200 200

300 300

400 500 600 400 500 600 Time Time // (10 (10 min) min)

700 700

800 800

900 900

1000 1000

Figure8.8.Assessment Assessmentresult resultof ofthe the38 38original originalfeatures. features. Figure

By analyzing and comparing the results of these two kinds of features, the following phenomena 3.4. Comparison Experiments and Analysis can be discovered: (1) In a long beginning period during which the bearing runs normally, both the curves show a basically linearConversion trend, but the curve of the mapping features is more stable, while that of 3.4.1. Comparisons of Space Methods the original features fluctuates obviously, which indicates that the former is much more insensitive to noise than the latter. (2) Both of these two kinds of features can detect the serious degradation such as crackle, fatigue spalling, etc., occurred at 7000th min, however, the original features could not detect the early slight faults of wear, pitting, or overheat began in the vicinity of 5300th min, while the mapping features performed well on this issue. (3) Additionally, at the end, the second curve changes the direction of the trend and performs very disorderedly, while the first curve shows a good unilateral trend and rises monotonically and sharply after the 9400th min, which indicates that the bearing got started to deteriorate so violently that it could no longer work. Contrast with the actual experimental situation, it can be discovered that the assessment result of the proposed method that transforms the features in original higher dimensional space to projected

Appl. Sci. 2018, 8, 2611

12 of 15

lower dimensional space by LE in this study was consistent with the real operation status of the bearings, while the assessment results of the original features without feature space conversion are in great difference with the actual situation. 3.4. Comparison Experiments and Analysis Appl. Sci. 2018, 8, x FOR PEER REVIEW

12 of 14

3.4.1. Comparisons of Space Conversion Methods LE adopted in this study is a nonlinear local feature space conversion method, in order to make LE adopted in this study is a nonlinear local feature space conversion method, in order to make comparative analysis and highlight its effectiveness for the work, several contrast experiments with comparative analysis and highlight its effectiveness for the work, several contrast experiments with linear space conversion method PCA and global nonlinear space conversion method Isomap that are linear space conversion method PCA and global nonlinear space conversion method Isomap that are proverbially applied for feature transformation were carried out. Considering fairness and proverbially applied for feature transformation were carried out. Considering fairness and rationality, rationality, the most suitable DNN structures for PCA and Isomap are, respectively, constructed as the most suitable DNN structures for PCA and Isomap are, respectively, constructed as follows follows DNN3 [DNN 6; 100,6;50, 25,50, 5; 25, 1] 5; 1 100,

and and DNN4 [6; 100, 50, 25, 10; 1] DNN 6; 100, 50, 25, 10; 1 The evaluation results of DNNs with PCA and Isomap-based space conversion techniques The evaluation results of DNNs with PCA and Isomap-based space conversion techniques applying the same procedures in Section 3.3 are shown in Figure 9, it can be discovered that the applying the same procedures in Section 3.3 are shown in Figure 9, it can be discovered that the waveforms of PCA-based and LE-based results have the same abnormal performance: Both of them waveforms of PCA-based and LE-based results have the same abnormal performance: Both of them began to appear abnormal at 5300th min and mutated around 7000th min, but the former has greater began to appear abnormal at 5300th min and mutated around 7000th min, but the former has greater volatility before the start of the anomaly and the end is chaotic. While the result of Isomap-based volatility before the start of the anomaly and the end is chaotic. While the result of Isomap-based technique performs worse in the beginning normal period, and its mutation at 7000th min is not so technique performs worse in the beginning normal period, and its mutation at 7000th min is not so obvious, but of which, the unidirectional drastic descent (at about 9400th min) demonstrated the obvious, but of which, the unidirectional drastic descent (at about 9400th min) demonstrated the validity of this method in the detection of serious failure of bearings. It is easy to find that LE performs validity of this method in the detection of serious failure of bearings. It is easy to find that LE best in general in the comparisons. performs best in general in the comparisons. (b)

1 Mutational point:700

0.9

Assessment value

0.8 0.7 Abnarmality beginning point:530

0.6 0.5 0.4

Assessment result of Isomap-based dimensionality reduction

0.3 0.2

0

100

200

300

400 500 600 Time / (10min)

700

800

900

1000

Figure 9. Assessment results of the compared space conversion methods. (a)Assessment result of Figure 9. Assessment results of the compared space conversion methods. (a)Assessment result of PCA

PCA conversion; (b)Assessment of Isomap conversion conversion; (b)Assessment result result of Isomap conversion In the following study, two other artificial intelligence models BPNN and HMM that have In the following study, other artificial intelligence models BPNN HMM that havealso excellent excellent performance in two pattern recognition [26], data processing andand other fields were applied performance in pattern recognition [26], data processing and other fields were also applied to carry of to carry out the similar comparative experiments. By the way, the specific algorithmic theories out the similar comparative experiments. By the way, the specific algorithmic theories of HMM and HMM and BPNN can be studied in literatures [7] and [8], respectively. BPNN In canthe be comparison studied in literatures [7] and respectively. experiments, the[8], feature space conversion method remained unchanged as In the comparison experiments, the feature space conversion method remained unchanged as LE, but the evaluation models were changed to BPNN and HMM, respectively. The evaluation LE, but the evaluation models were changed to BPNN and HMM, respectively. The evaluation results are shown in Figure 10, from which it can be discovered that BPNN can accurately detect the results are shown in Figure it can betodiscovered BPNN accurately detect anomaly at 7000th min, 10, butfrom it is which not sensitive the early that slight faultcan at about 5300th minthe and anomaly at 7000th min, but it is not sensitive to the early slight fault at about 5300th min and performs performs intricately at the end where the waveform went to the contrary direction. While the intricately at of theHMM-based end where the waveform went to theincontrary direction. While theindicates waveform waveform method is pretty smooth the early stage, and it also that HMM can identify the early deterioration of bearings around 5300th min more obviously than BPNN. However, the inefficiency of HMM in detection of the mutation at about 7000th min suggests that this approach is not competent enough for the assessment task either. Hence, we can say that DNN outperforms the assessment models.

Appl. Sci. 2018, 8, 2611

13 of 15

of HMM-based method is pretty smooth in the early stage, and it also indicates that HMM can identify the early deterioration of bearings around 5300th min more obviously than BPNN. However, the inefficiency of HMM in detection of the mutation at about 7000th min suggests that this approach is not competent enough for the assessment task either. Hence, we can say that DNN outperforms the Appl. Sci. 2018,models. 8, x FOR PEER REVIEW 13 of 14 assessment (a)

6

(b) Assessment result of the BPNN

83.1

83.05

4

83

Assessment value

Assessment value

82.95 2

Mutational point:700

0

-2

Abnarmality beginning point:530 82.9 82.85

Mutational point:780

82.8 82.75

Abnarmality beginning point:530

82.7

-4

Assessment result of the HMM

82.65 -6

0

100

200

300

400 500 600 Time / (10min)

700

800

900

1000

82.6

0

100

200

300

400 500 600 Time / (10min)

700

800

900

1000

Figure (a) Assessment Assessment results results of Figure 10. 10. Assessment Assessment results results of of the the compared compared assessment assessment models. models. (a) of BPNN BPNN model; (b) Assessment results of HMM model. model; (b) Assessment results of HMM model.

3.4.2. Comparisons of Assessment Models 3.4.2. Comparisons of Assessment Models 4. Conclusions 4. Conclusions In complexity of modern mechanical systems as well the harsh unstable running In view viewofofthe the complexity of modern mechanical systems asaswell as theand harsh and unstable condition, effective methods evaluating monitoring running conditions machines are running condition, effective for methods for and evaluating and the monitoring the running conditions urgently needed. This work reports a novel effective method combining Laplacian Eigenmaps feature machines are urgently needed. This work reports a novel effective method combining Laplacian conversion particle swarm optimization-based neural network deep for evaluating the health Eigenmaps and feature conversion and particle swarmdeep optimization-based neural network for state of the target machine (rolling-element bearings). Firstly, three popular approaches including time evaluating the health state of the target machine (rolling-element bearings). Firstly, three popular and frequency domaintime analysis as well as domain WPT were applied extract thirty-eight features of the approaches including and frequency analysis as to well as WPT were applied to extract vibration signals collected machines in collected the original high dimension space. Then, thedimension nonlinear thirty-eight features of the from vibration signals from machines in the original high local algorithm LE was introduced to transform the original features to the projected lower dimensional space. Then, the nonlinear local algorithm LE was introduced to transform the original features to space and obtain the dimensional six more typical parameters. feature the projected lower space and obtainNext, the the six transformed more typicalsix-dimensional parameters. Next, the dataset was entered into the PSO algorithm optimized DNNinto network to assess the whole life running transformed six-dimensional feature dataset was entered the PSO algorithm optimized DNN conditions the the target bearing in the conditions test. Finally, a series comprehensive persuasive network to of assess whole life running of the targetof bearing in the test. and Finally, a series comparison experiments proved that the proposed method proved of integrating into DNN is more of comprehensive and persuasive comparison experiments that theLE proposed method of effective for machine running state assessment. In the future work, the proposed method in this paper integrating LE into DNN is more effective for machine running state assessment. In the future work, may also be applied and some fields.classification, and some the proposed methodininprognosis, this paperclassification, may also be applied in other prognosis,

other fields.

Author Contributions: B.K, S.X and X.W conceived and designed the experiments; N.Y performed the experiments; N.Y and W.Y analyzed the data; N.Y contributed analysis tools; All the authors wrote the paper and Author the Contributions: B.K, S.X and X.W conceived and designed the experiments; N.Y performed the revised paper. experiments; N.Y and W.Y analyzed the data; N.Y contributed analysis tools; All the authors wrote the paper Funding: This research received no external funding. and revised the paper. Acknowledgments: Thanks Juncheng Lu for data collection and thanks Chengjiang Li and Bin Su for giving Funding: This research this received no external funding. suggestions to improve manuscript. Conflicts of Interest:Thanks The authors declare of interest. Acknowledgments: Juncheng Luno forconflicts data collection and thanks Chengjiang Li and Bin Su for giving suggestions to improve this manuscript.

References Conflicts of Interest: The authors declare no conflicts of interest. 1.

Yuan, N.; Yang, W.; Kang, B.; Xu, S.; Li, C. Signal fusion-based deep fast random forest method for machine

References health assessment. J. Manuf. Syst. 2018, 48, 1–8. [CrossRef] 1. 2. 3.

Yuan, N.; Yang, W.; Kang, B.; Xu, S.; Li, C. Signal fusion-based deep fast random forest method for machine health assessment. J. Manuf. Syst. 2018, 48, 1–8. Loutridis, S. Instantaneous energy density as a feature for gear fault detection. Mech. Syst. Signal Process. 2006, 20, 1239–1253. Öztürk, H.; Sabuncu, M.; Yesilyurt, I. Early detection of pitting damage in gears using mean frequency of

Appl. Sci. 2018, 8, 2611

2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12.

13.

14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24.

14 of 15

Loutridis, S. Instantaneous energy density as a feature for gear fault detection. Mech. Syst. Signal Process. 2006, 20, 1239–1253. [CrossRef] Öztürk, H.; Sabuncu, M.; Yesilyurt, I. Early detection of pitting damage in gears using mean frequency of scalogram. J. Vib. Control 2008, 14, 469–484. [CrossRef] Loutridis, S. Self-similarity in vibration time series: Application to gear fault diagnostics. J. Vib. Acoust. 2008, 130, 031004. [CrossRef] Yu, D.; Yang, Y.; Cheng, J. Application of time–frequency entropy method based on Hilbert–Huang transform to gear fault diagnosis. Measurement 2007, 40, 823–830. [CrossRef] Cui, J.; Wang, Y. A novel approach of analog circuit fault diagnosis using support vector machines classifier. Measurement 2011, 44, 281–289. [CrossRef] Zhu, J.; Ge, Z.; Song, Z. HMM-driven robust probabilistic principal component analyzer for dynamic process fault classification. IEEE Trans. Ind. Electron. 2015, 62, 3814–3821. [CrossRef] Yan, J.; Guo, C.; Wang, X. A dynamic multi-scale Markov model based methodology for remaining life prediction. Mech. Syst. Signal Process. 2011, 25, 1364–1376. [CrossRef] LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324. [CrossRef] Yang, B.-S.; Oh, M.-S.; Tan, A.C.C. Fault diagnosis of induction motor based on decision trees and adaptive neuro-fuzzy inference. Expert Syst. Appl. 2009, 36, 1840–1849. Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507. [CrossRef] [PubMed] Gan, M.; Wang, C. Construction of hierarchical diagnosis network based on deep learning and its application in the fault pattern recognition of rolling element bearings. Mech. Syst. Signal Process. 2016, 72, 92–104. [CrossRef] Lin, Q.; Liu, S.; Zhu, Q.; Tang, C.; Song, R.; Chen, J.; Coello, C.A.C.; Wong, K.-C.; Zhang, J. Particle swarm optimization with a balanceable fitness estimation for many-objective optimization problems. IEEE Trans. Evol. Comput. 2018, 22, 32–46. [CrossRef] Klein, R.; Ingman, D.; Braun, S. Non-stationary signals: Phase-energy approach—Theory and simulations. Mech. Syst. Signal Process. 2001, 15, 1061–1089. [CrossRef] Baydar, N.; Ball, A. A comparative study of acoustic and vibration signals in detection of gear failures using Wigner–Ville distribution. Mech. Syst. Signal Process. 2001, 15, 1091–1107. [CrossRef] He, Q. Vibration signal classification by wavelet packet energy flow manifold learning. J. Sound Vib. 2013, 332, 1881–1894. [CrossRef] Gharavian, M.; Ganj, F.A.; Ohadi, A.; Bafroui, H.H. Comparison of FDA-based and PCA-based features in fault diagnosis of automobile gearboxes. Neurocomputing 2013, 121, 150–159. [CrossRef] Zhu, Z.-B.; Song, Z.-H. A novel fault diagnosis system using pattern classification on kernel FDA subspace. Expert Syst. Appl. 2011, 38, 6895–6905. [CrossRef] Tenenbaum, J.B.; De Silva, V.; Langford, J.C. A global geometric framework for nonlinear dimensionality reduction. Science 2000, 290, 2319–2323. [CrossRef] Belkin, M.; Niyogi, P. Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 2003, 15, 1373–1396. [CrossRef] Roweis, S.T.; Saul, L.K. Nonlinear dimensionality reduction by locally linear embedding. Science 2000, 290, 2323–2326. [CrossRef] [PubMed] Hemmati, F.; Orfali, W.; Gadala, M.S. Roller bearing acoustic signature extraction by wavelet packet transform, applications in fault detection and size estimation. Appl. Acoust. 2016, 104, 101–118. [CrossRef] Hauberg, S. Principal curves on Riemannian manifolds. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 1915–1921. [CrossRef] [PubMed] Jafari, A.; Almasganj, F. Using Laplacian eigenmaps latent variable model and manifold learning to improve speech recognition accuracy. Speech Commun. 2010, 52, 725–735. [CrossRef]

Appl. Sci. 2018, 8, 2611

25. 26.

15 of 15

Hinton, G.E.; Osindero, S.; Teh, Y.-W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554. [CrossRef] [PubMed] Yuan, N.; Yang, W.; Kang, B.; Xu, S.; Wang, X.; Li, C. Manifold learning-based fuzzy k-principal curve similarity evaluation for wind turbine condition monitoring. In Energy Science & Engineering; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2018. © 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Suggest Documents