UNIVERSITY OF LJUBLJANA Faculty of mechanical engineering
Fault detection and localization of mechanical drives based on data fusion techniques Doctoral dissertation
Dissertation submitted to the University of Ljubljana for the degree of Doctor of Sciences
Gabrijel Peršin
Ljubljana, December 2013
Supervisor: prof. dr. Jožef Vižintin
Co-supervisor: prof. dr. Đani Juričić
University of Ljubljana, Faculty of Mechanical Engineering
Aškerčeva cesta 6, 1000 Ljubljana
Telephone (01) 4771 200, fax (01) 25 18 567, dekanat@fs.uni-lj.si
Current number: DR III/29 Date: 8 November 2011
On the basis of the decision taken at the 20th Meeting of the Commission for Doctoral Studies of the University of Ljubljana on 21 September 2011, and by authority of the 30th Meeting of the Senate of the University of Ljubljana on 20 January 2009, I am delivering the following
DECISION
At its 20th Meeting on 21 September 2011, the Commission for Doctoral Studies of the University of Ljubljana, for the candidate Gabrijel Peršin, univ. dipl. inž.:
1. approved the dissertation topic entitled: Fault detection and localization of mechanical drives based on data fusion techniques
2. appointed as mentor: Prof. Dr. Jožef Vižintin
3. appointed as co-mentor: Prof. Dr. Đani Juričić
4. approved the writing of the dissertation in English
In accordance with Article 169 of the Statute of the University of Ljubljana, the candidate must submit the completed dissertation within four years from the date of approval of the dissertation topic.
Legal instruction: An objection to this decision may be lodged with the Senate of the University of Ljubljana within 15 days of receipt of the decision.
Duhovnik, Dekan / Dean
“If you want to find the secrets of the universe, think in terms of energy, frequency and vibration.” ‒ Nikola Tesla
Acknowledgements

I would like to thank the Slovenian Research Agency (ARRS), the University of Ljubljana and the Faculty of Mechanical Engineering for supporting me during my postgraduate studies.

I would like to express my deepest gratitude to professor Jože Vižintin, who embraced me as part of his team at the Centre of tribology and technical diagnostics (CTD). Professor, you have taught me many important aspects of life, often exceeding the everyday scope of work.

I would like to thank professor Đani Juričić for helping me to establish the strong pillars of knowledge on which my work has been based ever since. You have been available to me any time I needed your help, advice or support.

I would like to thank professor Mitjan Kalin for accepting me as a part of the team within the Centre of tribology and nano technology (TINT). Mitjan, I always felt equally accepted and considered.

I would like to thank professor Leonid Gelman for inviting me to join his research group at Cranfield University, where I was given the opportunity to expand my knowledge to broader application areas. Without your contributions, my work would not have been completed.

My deepest gratitude goes to Mrs. Joži Sterle for the enormous amount of help I have received during my postgraduate studies. Joži, without you I would not be where I am today.

I would like to thank all my colleagues from CTD/TINT for being the best colleagues I have ever had. In particular, I would like to thank José Salgueiro for all his help, many outstanding ideas, and hard work.

I would like to thank my family for understanding and supporting me during my bright and dark times. Mum and dad, you have always stood beside me and supported me from all sides. I am truly grateful for your help and encouragement. My sister Zala with her family Zlatko, Svetja and Ala, without you I would not have the strength to fulfil all my duties and wishes. You have opened my mind more than you can imagine.
My love Iva, you came into my life unexpectedly and embraced me with pure love. You have spun my life high into clouds where we still live. Last but not least, I would like to thank all my friends who mean the world to me. Without you, the world would be empty and I would not have the strength to walk the path of life.
Dr III/29
UDC 539.92:531.4:534.12(043.3)
Gabrijel Peršin
FAULT DETECTION AND LOCALIZATION OF MECHANICAL DRIVES BASED ON DATA FUSION TECHNIQUES
Keywords: • Fault diagnosis • Multisensor data fusion • Oil analysis • Vibration analysis • Trend change detection • Qualitative trend analysis • Spectral kurtosis
Abstract: The thesis focuses on reliable fault diagnosis of mechanical drives, while considering the integration of several condition monitoring approaches. Partial decisions regarding machine condition, obtained independently from vibration and oil analysis techniques, are used within a fusion process based on the incidence table. The incidence table captures in-depth relations between faults and signatures from oil properties or vibration features, and is used for estimation of the final fault probabilities. Analysis of oil parameters is based on trend change detection, followed by qualitative analysis, which reveals the nature of the ongoing change. Recognition of fault signatures in oil parameters is followed by estimation of the fault probability. Vibration analysis is based on spectral kurtosis and filtering, used to extract the fault-related non-stationary component from background noise. The component is used for extraction of diagnostic features, which are used for fault-related impact detection based on k-means clustering and k-nearest neighbours classification. Experimental validation, which included an oil contamination experiment, gear pitting, and bearing inner and outer race damage, showed that the proposed approach offers reliable estimation of fault probabilities by fusion of the partial probabilities obtained by oil and vibration analysis.
Dr III/29
UDK 539.92:531.4:534.12(043.3)
Gabrijel Peršin
ZAZNAVANJE IN LOKALIZACIJA POŠKODB V MEHANSKIH POGONIH S TEHNIKAMI ZLIVANJA INFORMACIJ (Fault detection and localization of mechanical drives based on data fusion techniques)
Keywords: • Fault diagnosis • Multisensor data fusion • Oil analysis • Vibration analysis • Trend change detection • Qualitative trend analysis • Spectral kurtosis
Abstract: The thesis focuses on fault detection in mechanical drives, taking into account several possible approaches to machine condition monitoring. Independent analyses of vibration and oil yield partial conclusions about the machine condition, which can be combined in a procedure based on an incidence table. This table contains an extensive inventory of the relations between faults and the fault-characteristic readings from the oil and vibration condition sensors, which are used for the final estimation of the probability of fault occurrence. The analysis of oil parameters is based on the detection of trend changes and their qualitative analysis, which gives insight into the nature of the detected changes. Once a change in the oil parameters that indicates a possible fault is recognized, the probability of that fault is estimated. Vibration analysis is based on spectral kurtosis and filtering, which separate the non-stationary, fault-characteristic components from the background noise. The filtered vibration signal is then used to derive diagnostic features, which serve for fault detection based on k-means clustering and k-nearest neighbours classification.
Contents
1 Introduction
1.1 Maintenance strategies
1.2 Online condition monitoring (CM)
1.3 Integration of multiple condition monitoring techniques
1.3.1 Lubricant analysis
1.3.2 Vibration analysis
1.3.3 Integrated fault diagnosis based on data fusion
1.4 Objectives and scientific contributions of the dissertation
1.5 Dissertation structure
2 Critical gearbox components and fault signatures
2.1 Single stage spur gears and typical defects
2.1.1 Gear vibration
2.2 Rolling element bearings
2.2.1 Bearing vibration
2.3 Typical lubricant-related faults
2.3.1 Lubricant contamination
2.3.2 Lubricant aging
2.3.3 Excessive component wear
3 Trend change detection in lubricant parameters
3.1 Change detection algorithm (CDA)
3.1.1 Initiation by definition of reference and current vectors
3.1.2 Data normalization
3.1.3 Modeling trends by means of linear regression
3.1.4 CUSUM calculation
3.1.5 Detection of trend change
3.1.6 Change evaluation
3.2 Experimental validation of the proposed qualitative trend analysis methodology
3.2.1 Experimental setup
3.2.2 Water contamination
3.2.3 Chemical contamination
3.2.4 Gear pitting
4 Impact detection based on spectral kurtosis
4.1 Theoretical background of spectral kurtosis (SK)
4.1.1 Definition and properties of SK
4.1.2 Kurtogram technique for optimal SK parameters
4.1.3 SK and the optimal denoising (Wiener) filtering
4.2 Spectral kurtosis (SK) of vibration signal
4.2.1 Angular resampling
4.2.2 Segmentation of the vibration signal
4.2.3 Estimation of the spectral kurtosis (SK)
4.2.4 Optimal denoising (Wiener) filtering
4.2.5 Hilbert transform and the SK-residual envelope
4.2.6 Sensor fusion and averaging SK-residual squared envelope
4.2.7 Alignment of impacts using cross-correlation
4.2.8 Impact detection by means of k-means and k-NN clustering
4.3 Experimental validation of the proposed impact detection method
4.3.1 Experimental results for bearing faults
4.3.1.1 Bearing inner race defect
4.3.1.2 Bearing outer race defect
4.3.2 Experimental results with gear pitting
5 Integrated fault detection and isolation (FDI) based on fusion of vibration and oil readings
5.1 Oil-based fault probability estimation
5.1.1 Typical fault signatures and qualitative trend analysis
5.1.2 Similarity estimation using the majority voting approach
5.2 Vibration-based fault probability estimation by grouping approach
5.3 Final fault diagnosis based on fusion of vibration and oil analyses
5.3.1 The proposed incidence table
6 Experimental validation of the proposed fault detection and isolation methodology
6.1 Lubricant water contamination experiment
6.2 Lubricant chemical contamination experiment
6.3 Gear pitting experiment under non-stationary load
6.3.1 Oil analysis
6.3.2 Vibration analysis
6.3.3 Fault probability estimation by fusion of vibration and oil partial decisions
6.4 Bearing inner and outer race defects
7 Conclusions
7.1 Future Work
Bibliography
List of Symbols and Acronyms
$\bar{a}_n(k)$: Averaged amplitude modulation (envelope) signal of the nth segment
$\alpha$: Level of significance
$\hat{a}_n(k)$: Aligned amplitude modulation (envelope) signal of the nth segment
$\hat{W}(f)$: Approximation of the Wiener filter
$\hat{x}_{Cn}(i)$: The approximated current values
$\hat{x}_{Rn}(i)$: The predicted reference values
$\sigma$: Binary scoring function
$\tilde{S}$: A fault-signature model
$E(i)$: The cumulative sum of errors vector
$S_0$: The state vector containing unique values
$w_C(i)$: The current window
$w_R(i)$: The reference window
$x(i)$: Oil parameter vector
$x_{Cn}(i)$: The normalized current vector
$x_C(i)$: The current vector
$x_{Rn}(i)$: The normalized reference vector
$x_R(i)$: The reference vector
$y(k)$: Vibration signal
$y_n(k)$: nth vibration segment
$\mu$: The mean value
$\phi$: Angle of load in case of bearings
$\rho(f)$: The ratio of the power spectral densities
$\tilde{x}(k)$: Fault-signature signal
$\zeta$: Similarity measure
$A(k)$: Testing dataset for impact detection
$a$: Linear regression y-intercept
$A_0$: The stable threshold
$a_n(k)$: Amplitude modulation (envelope) signal of the nth segment
$A_t(k)$: Training dataset for impact detection
$a_{Cn}$: Current vector linear regression y-intercept
$a_{Rn}$: Reference vector linear regression y-intercept
$b$: Linear regression slope
$b_{abn}$: Slope of the abnormal vector
$b_{Cn}$: Current vector linear regression slope
$b_{Rn}$: Reference vector linear regression slope
$C$: A fault combination within the incidence table
$D_1$: The mean training nearest neighbour distance
$D_2$: The mean testing nearest neighbour distance
$d_{ftf}$: Diameter of a bearing cage
$d_{re}$: Diameter of a rolling element
$D_R$: The criterion function
$e(i)$: The error vector
$F_s$: Sampling frequency
$F_{GMF}$: The gearmesh frequency
$F_{sh}$: Shaft rotation frequency for estimation of bearing characteristic frequencies
$G$: Number of groups
$h(t)$: Window for estimation of short-time Fourier transform
$I_{2g}$: The grouped decision regarding fault presence
$i$: The time index
$I_1(k)$: Binary impact detection signal
$K(f)$: Spectral kurtosis as a function of frequency
$k$: The time index
$K_n(f)$: Spectral kurtosis of the nth vibration segment
$K_x(f)$: Spectral kurtosis of the non-stationary random vibration component $x(t)$
$K_y(f)$: Spectral kurtosis of the random vibration component $y(t)$
$m(k)$: Binary mask function used for alignment of vibration segments
$MC$: The majority coefficient
$n(t)$: Stationary Gaussian noise
$n$: Vibration segment index
$N_C$: Length of the current vector
$N_g$: Group length
$N_p$: Vibration segment length
$N_R$: Length of the reference vector
$N_t$: Number of gear teeth
$N_w$: Length of window for estimation of short-time Fourier transform
$N_{abn}$: A time instant of crossing of the abnormal threshold
$N_{avg}$: The number of segments for averaging
$N_{cl}$: Number of clusters
$N_{det}$: A time instant of crossing of the detection threshold
$N_{init}$: Number of sampling intervals for CDA initialization
$N_{re}$: The number of bearing rolling elements
$NN$: Number of nearest neighbours
$P$: The final fault probability
$P_c$: The weighted probability of fault-combination
$P_{x(k)}$: Probability of fault-signature
$P_y$: The probability of impacts from vibration analysis
$R_n(f)$: The Fourier transform of the nth filtered vibration segment
$r_n(k)$: The nth residual
$S(i)$: The qualitative states
$s_\alpha$: Statistical significance threshold
$S_n(f)$: Power spectral density of Gaussian noise $n(t)$
$S_x(f)$: Power spectral density of non-stationary random vibration component $x(t)$
$T_s$: Sampling time
$T_{u,v}$: Transition from state u to state v
$Th_0$: The stable threshold
$Th_C$: The changing threshold
$Th_1$: The abnormal threshold
$Th_2$: The detection threshold
$Th_{abn}$: The abnormal threshold
$Th_{det}$: The detection threshold
$Th_{MC}$: Majority coefficient threshold
$Th_{NS}$: The novelty score threshold
$u_{1-\alpha}$: The percentile of the normal distribution
$Var$: The variance
$W(f)$: Transfer function of the Wiener filter
$w(t)$: Time function of the Wiener filter
$W$: Significance weights
$X(f)$: Frequency spectrum of signal $x(t)$
$x(t)$: Non-stationary component of vibration signal
$Y(f)$: Frequency spectrum of signal $y(t)$
$y(t)$: A random component of vibration signal
$Y(t, f)$: Representation of signal $y(t)$ in time-frequency domain
$Y_n(f)$: The Fourier transform of the nth vibration segment
ACC: Accuracy
AM: Amplitude Modulation
ANN: Artificial Neural Networks
BPFI: Ball-Pass Frequency Inner race
BPFO: Ball-Pass Frequency Outer race
BSF: Ball Spin Frequency
CBM: Condition Based Maintenance
CDA: Change Detection Algorithm
CM: Condition Monitoring
CR: Contact Ratio
CUSUM: Cumulative Sum
EU: European Union
f: Frequency
FN: False Negative
FP: False Positive
FR: Failure Rate
FM: Frequency Modulation
FTF: Fundamental Train Frequency
H{}: The Hilbert transform
ISU: Integrated Sensor Unit
JDL: Joint Directors of Laboratories
KNN: K-Nearest Neighbours
MTBF: Mean Time Between Failures
PHM: Prognostics and Health Management
RMS: Root Mean Square
RSF: Roller Spin Frequency
SK: Spectral Kurtosis
SNR: Signal to Noise Ratio
STFT: Short Time Fourier Transform
TE: Transmission Error
TN: True Negative
TNR: True Negative Rate
TP: True Positive
TPR: True Positive Rate
VSD: Variable Speed Drive
List of Figures
1.1 An example of a mechanical drive.
1.2 Failure rate during the operation life
1.3 Maintenance programmes
1.4 The concept of online integrated condition monitoring process.
1.5 The detailed condition monitoring process including data fusion.
2.1 Basic geometry of a pair of spur gears. [29]
2.2 The emergence of some of the typical gear defects at different operating conditions [55]
2.3 Typical vibration signatures of local gear faults
2.4 The rolling element bearing [58].
2.5 Typical vibration and envelope signals produced by localized bearing faults [29].
2.6 Wear debris quantity and size during operation
3.1 Transient detection and identification based on qualitative trend analysis
3.2 Initial reference and current data windows.
3.3 The normalized reference and present data windows.
3.4 Linear regression and prediction of the reference and current values.
3.5 Error and CUSUM.
3.6 Error and CUSUM calculation after sliding current window.
3.7 Detection of trend change using CUSUM value E.
3.8 Qualitative trend analysis (QTA) using the decision quadrant
3.9 CDA state transition diagram.
3.10 Experimental rig.
3.11 The detection of trends in time evolution of the relative water content
3.12 The detection of trends in the time evolution of the relative dielectric constant
3.13 The time varying torque and temperature profiles
3.14 Trend classification of small ferrous particles of diameter 100 µm during the pitting experiment
4.1 Schematic representation of the fault detection procedure based on vibration analysis.
4.2 SK estimation procedure [32].
4.3 The kurtogram technique to define the optimal SK parameters
4.4 The principle of the Wiener filter [31].
4.5 Angular resampling process.
4.6 Segmentation of vibration signal to short segments of a single rotation. A segment can be related to bearing and gear characteristic frequencies.
4.7 Estimation of spectral kurtosis.
4.8 SK based Wiener filter applied to segments.
4.9 SK-residual envelope extraction using Hilbert transform.
4.10 SK-residual fusion to remove amplitude modulation caused by bearing inner race defect.
4.11 Impact alignment using cross correlation.
4.12 Illustration of impact detection procedure.
4.13 SKF bearing testing rig
4.14 Inner race defect.
4.15 Periods of vibration signal after segmentation, where a segment duration is proportional to ball pass frequency inner race BPFI. Segment duration in angular domain corresponds to approx. 583 deg
4.16 Results of spectral kurtosis obtained from each segment.
4.17 Filtered segments (SK-residuals)
4.18 SK-residual power - squared envelope
4.19 Period averaging
4.20 Estimated random slippage and error
4.21 SK-residual segments after alignment using cross-correlation
4.22 Results from impact detection using k-means for clustering (training phase), and k-NN for classification (testing phase)
4.23 Outer race defect.
4.24 Raw vibration segments
4.25 Estimation of spectral kurtosis for each segment
4.26 Filtered segments (SK-residuals)
4.27 SK-residual power - squared envelope
4.28 Estimated random slippage and error
4.29 SK-residual segments after alignment using cross-correlation
4.30 Impact detection
4.31 A photo of spur gears with a pitted tooth.
4.32 Raw vibration segments
4.33 Estimation of spectral kurtosis for each vibration segment
4.34 Filtered segments (SK-residuals)
4.35 SK-residual power - squared envelope
4.36 Impact detection
5.1 Fault detection and isolation scheme.
5.2 Possible fault-signatures with representative qualitative trend models
5.3 Grouping of 1st level decisions to eliminate false and missed detections.
5.4 Evaluation process of the incidence table.
5.5 The incidence table.
6.1 Results from water contamination experiment: (a) Trend classification on relative water content, (b) Probabilities of fault indicative patterns, (c) fault probabilities as defined by the FMT
6.2 Results from chemical contamination experiment: (a) Trend classification on relative dielectric constant, (b) Fault probabilities
6.3 Results from pitting experiment: (a) Measurements and trend classification on small ferrous particles count (D100um), (b) fault-indicative pattern probabilities
6.4
6.5 Impact detection
6.6 The time varying torque profile and temperature evolution with selected vibration segments for training and testing steps of impact detection
6.7 Results from pitting experiment: (a) Probabilities of fault-indicative oil patterns and vibration impacts, (b) The final fault probabilities
6.8 Impact detection by grouping
6.9 Impact detection by grouping
List of Tables
3.1 Nominal parameters of the experimental rig
3.2 Tuning parameters of the CDA algorithm.
1 Introduction

Mechanical drives are among the most ubiquitous items of equipment in industry. A mechanical drive usually consists of a power source (driving) unit, a power transmission system, and a functional (driven) unit. Indirect coupling between the driving and driven units is achieved through a transmission system, usually a gearbox. This transmission system transmits mechanical power from the input shaft to the output shaft, adapting the rotary speed and torque to comply with the demands of the driven unit (Figure 1.1).
Figure 1.1: An example of a mechanical drive.

Gears and bearings are considered the most critical mechanical components and are among the most frequent causes of machine breakdowns. They are prone to damage because the load is transferred through the relatively small tribological contacts of the meshing tooth pair and of the bearing rolling elements. Hence, large stresses and friction forces are imposed on the material. The breakdown of an element in a mechanical drive can cause downtime of the entire production line, incur both direct and indirect maintenance costs, and even endanger human lives.

Reliability of a mechanical drive, usually quantified by the Failure Rate (FR) and the Mean Time Between Failures (MTBF), is the probability that the system performs its intended function for a specified period of time. One important feature of a reliable system is that it does not silently continue to operate in a damaged condition, but instead automatically diagnoses the defective components and notifies maintenance personnel.
The failure rate over the operational life of a production system follows the bathtub curve (Figure 1.2).

Figure 1.2: Failure rate during the operation life

In the run-in phase of a mechanical system, the failure rate is relatively high, mainly due to incorrect assembly of the components or incorrect installation of the system. After the run-in phase the failure rate decreases and then remains more or less constant, with breakdowns caused by random component failures. This phase of normal operation is the longest. Eventually the failure rate may start increasing again, mainly due to component wear-out, which indicates the end of a component’s life cycle, reduced reliability and an increased possibility of a breakdown. To ensure high reliability of a mechanical system and maintain its operating capabilities, appropriate maintenance actions must be taken.
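For the flat middle section of the bathtub curve, where the failure rate is roughly constant, the link between failure rate, MTBF and reliability can be illustrated with the standard exponential reliability model. The sketch below is a minimal illustration under that constant-rate assumption; the function names and the numeric failure rate are hypothetical and are not taken from the dissertation.

```python
import math


def mtbf(failure_rate: float) -> float:
    """Mean time between failures for a constant failure rate (1 / lambda)."""
    return 1.0 / failure_rate


def reliability(t_hours: float, failure_rate: float) -> float:
    """Probability of failure-free operation up to t_hours: R(t) = exp(-lambda * t)."""
    return math.exp(-failure_rate * t_hours)


# Hypothetical value: one failure per 10 000 operating hours on average.
assumed_failure_rate = 1e-4  # failures per hour (illustrative assumption)

print("MTBF [h]:", mtbf(assumed_failure_rate))
print("R(1000 h):", reliability(1000.0, assumed_failure_rate))
```

Under this model the reliability at t = MTBF is exp(-1), i.e. only about 37% of such systems would be expected to reach the MTBF without a failure. The run-in and wear-out phases of the bathtub curve are not covered by this constant-rate sketch.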
1.1 Maintenance strategies
According to the maintenance terminology standard [1], maintenance is the process of keeping a system in an operating state by preventing and eliminating defective conditions. According to the ARTEMIS report, the direct cost of maintenance in the European Union (EU) is estimated at 4%-8% of the total sales turnover [2]. Moreover, 30%-50% of this expenditure is wasted through ineffective maintenance programmes. The currently prevailing reactive (react-to-failure) and periodic (preventive) maintenance strategies are outdated and need to be replaced with more cost-effective condition-based maintenance (CBM) using advanced diagnostic, prognostic and health management (PHM) solutions (Figure 1.3).
Figure 1.3: Maintenance programmes

Reactive maintenance leaves the system to operate until a defect occurs. Such a maintenance programme incurs minimal expenses until the first breakdown occurs. When this happens, direct and indirect maintenance costs rise substantially. The breakdown of one component may suddenly terminate the operation, which might in turn damage various other components that are directly or indirectly connected to the failed one. Consequently, maintenance actions take longer, which results in increased maintenance expenses, loss of income due to unscheduled downtime for the duration of the maintenance, etc.

Periodic maintenance attempts to minimize the probability of breakdowns. Maintenance actions are taken at prescribed time intervals suggested by the manufacturer and the maintenance staff. The most critical components are periodically replaced in order to prevent deterioration and consequent breakdown. Maintenance expenses are reduced mainly due to undisturbed operation and the possibility of planning maintenance interventions. However, expenses may increase because components are frequently replaced to prevent breakdowns even though they are still in good operating condition.

Predictive condition-based maintenance (CBM) is an extension of preventive maintenance, which attempts to assure uninterrupted operation up to the moment when a component starts to deteriorate and the probability of a breakdown increases. Maintenance interventions are applied at intervals determined by the online condition monitoring (CM) process. Except for catastrophic failures, which are sudden and cause total loss of functionality, 99% of mechanical failures go through a distinct incipient phase. This means there are some noticeable indicators which provide advance warning about the onset of a failure. The role of CM is to detect this onset in a timely manner, localize the root cause and, possibly, trend its progression over time. The remaining time until final breakdown can be long enough to allow for efficient maintenance service. This is the key idea of the emerging discipline of prognostics and health management (PHM), which forms the basis for predictive maintenance [3].
1.2 Online condition monitoring (CM)
Condition monitoring (CM) is a label generally used to describe the process of fault detection and isolation with the purpose of overall assessment of the system’s operational condition. Nowadays, continuous and fully automated online CM systems are being applied; they rely on measuring vital quantities such as lubricant quality, vibration, acoustic emission, sound emission, temperature, and others. The concept of an integrated condition monitoring system for mechanical drives is given in Figure 1.4. Oil quality is an important indicator of the drive condition. During machine operation, oil continuously circulates between the gearbox and a system of sensors. Measurements of temperature, moisture, dielectric properties, and wear debris are conducted periodically by the appropriate sensors. In addition, vibration sensors placed in suitable locations provide vibration records. The data processing module includes acquisition of measurements and extraction of fault-indicative feature values from the acquired measurements. Features are variables, usually based on statistical properties of the measurements, which depend (ideally only) on the condition of a particular component. When a system deviates from fault-free operation due to damage progression, the features should indicate the change in condition. Feature extraction is followed by feature evaluation, fault detection and isolation. The last step usually includes the system’s health assessment and communication with the operators and maintenance personnel in charge of scheduling maintenance actions.
Figure 1.4: The concept of online integrated condition monitoring process.
1.3 Integration of multiple condition monitoring techniques
The process of data manipulation for the purpose of the system’s health assessment was briefly discussed above (Figure 1.4). It is described in detail within this section, with an emphasis on the integration of data obtained from several monitoring techniques (vibration analysis, oil analysis, etc.). The five distinct steps of the online CM process are presented in detail in Figure 1.5.

Figure 1.5: The detailed condition monitoring process including data fusion.

The first step is data acquisition from each of the mounted sensors. It is followed by data fusion of measurements, most commonly performed to discard faulty sensors and/or false measurements, identify potential outliers, reduce noise, etc. Measurements from the first step are passed to the feature extraction module, where fault-indicative variables are calculated. Features are usually specific to a property and may be indicative of one or more faults with different sensitivity levels. With features extracted, feature fusion is used to merge redundant information from different sensors. Hence, fusion of equivalent features may be used to improve diagnostic information. For example, vibration features and particle count from lubricant analysis may be fused to increase the sensitivity of gear pitting detection [4]. The step of data association performs feature evaluation. Within this step, feature values are mapped into fault-related classes. Fault detection is a step which commonly includes statistical or neural network classification methods to estimate the probability that a fault is present. The more numerous the features indicating one fault, the more reliable and accurate the calculation of the fault probability. The final step of fault diagnosis refers to the association of fault-related probabilities. Partial decisions are fused in order to estimate the overall condition of the system and to provide the final decision about the fault presence.
1.3.1 Lubricant analysis
According to some reports, 45% of off-line lubricant analyses of mechanical drives show imminent lubricant degradation and consequent inappropriate component lubrication [5].
For online processing of lubricant measurements provided by oil and wear debris analysis, statistical tools are most commonly used for feature extraction. In the fault-free condition, measurements of physical or chemical properties of the lubricant are usually stable, and any deviation from the stable state can be related to faulty conditions. Statistical approaches are therefore able to identify such transitions by comparing the mean value of a particular feature with pre-established warning and critical thresholds, or by analyzing a feature’s linear regression parameters and assigning changes to the corresponding fault modes. Several approaches to online detection of changes in lubricant-based measurements can be found in the literature. Tambouratzis and Antonopoulos-Domis suggested a non-linear statistical data modeling method for on-line trend identification based on an artificial neural network (ANN) [6]. Furthermore, the cumulative sum (CUSUM) approach is a sequential analysis technique used to detect changes in a given time series, under the assumption that the series is a zero-mean Gaussian process [7]. Charbonnier et al. [8] used the CUSUM technique as the basis for a trend change detection algorithm. They applied a cumulative sum of errors, defined as the difference between the measured value of the feature and the value predicted by the fault-free model. The CUSUM value proved to be a reliable indicator of a change in the trend of a data series. Vaswani presented a more elaborate CUSUM method by applying two different likelihood functions, the expected (negative) log likelihood (ELL) and the observation likelihood (OL), which are suitable for slow and fast changes, respectively [9, 10]. A thorough treatment of the CUSUM technique is available in Basseville and Nikiforov [7].
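The threshold comparison mentioned above can be illustrated with a minimal sketch. It assumes that a short window of readings is available and that the warning and critical thresholds are given; the function name `threshold_state`, the readings and the threshold values are hypothetical and serve illustration only.

```python
import numpy as np

def threshold_state(samples, warning, critical):
    """Compare the mean of a measurement window with two thresholds."""
    mean_value = float(np.mean(samples))
    if mean_value >= critical:
        return "critical"
    if mean_value >= warning:
        return "warning"
    return "normal"

# Hypothetical moisture readings and thresholds (illustrative values only).
moisture_window = [0.21, 0.22, 0.20, 0.23]
state = threshold_state(moisture_window, warning=0.30, critical=0.50)
print(state)
```

Such a check reacts only to the level of a feature; the CUSUM-based approaches discussed above are aimed at changes in its trend.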
1.3.2 Vibration analysis
Methods for vibration analysis found in the literature can be divided into three groups: time domain, frequency domain, and time-frequency domain. Time-domain techniques usually rely on simple signal indicators such as root-mean-square (RMS) value, peak-to-peak value, crest factor, kurtosis, etc. Methods performing frequency-domain analysis are capable of extracting sufficiently informative features which provide a clear physical interpretation [11–13]. Despite broad acceptance, frequency-domain analysis methods suffer from a major limitation, i.e. they are not applicable to non-stationary signals. The third category, time-frequency analysis, overcomes the problems related to non-stationary signals by investigating the signal simultaneously in the time and frequency domains.

Detection of local faults in gears using vibration analysis has been the subject of intensive research and a number of methods have been proposed. These methods include amplitude and phase demodulation [14], cepstrum analysis [15], residual analysis [16, 17], adaptive filtering [17–19], time-frequency analysis [20–24] and time-scale analysis [16, 25, 26]. Pitting, like other local gear faults, is known to produce impacts which in turn excite the structural resonance(s) in the system. In [27], a resonance demodulation technique was proposed, based on envelope analysis of the residual signal after band-pass filtering within the resonant band. The identification of the resonance bandwidth is done by spectral analysis of the residual signal, after periodic components are removed. However, no solution was given for detecting these resonances and selecting the band-pass filter. The idea was extended in [17, 28], where spectral kurtosis (SK) was successfully used to design a detection filter that adaptively extracts the fault-related signal from the noisy background.

For the diagnosis of bearing defects, one of the most commonly used approaches is demodulation resonance analysis (envelope analysis), where the vibration signal is band-pass filtered around the resonance frequency [12, 29–31]. The spectrum of the envelope is used to detect characteristic defect frequencies in the low-frequency range. Although useful, this approach has some limitations related to the selection of the band around the resonance frequency, and it gives poor results in the presence of high noise [29]. To overcome the issue of correct frequency band selection, an adaptive STFT-based spectral kurtosis (SK) method has been proposed and applied [13, 31–38].
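The time-domain indicators named at the start of this section (RMS, peak-to-peak value, crest factor, kurtosis) can be computed as in the following minimal sketch. The function name `time_domain_indicators` and the test signal are hypothetical; kurtosis is taken here as the non-excess fourth standardized moment, which is one common convention and an assumption of this sketch.

```python
import numpy as np

def time_domain_indicators(x):
    """Return simple time-domain indicators of a vibration record."""
    x = np.asarray(x, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    peak_to_peak = np.max(x) - np.min(x)
    crest_factor = np.max(np.abs(x)) / rms
    centered = x - np.mean(x)
    # Fourth central moment divided by the squared second central moment.
    kurtosis = np.mean(centered ** 4) / np.mean(centered ** 2) ** 2
    return {
        "rms": rms,
        "peak_to_peak": peak_to_peak,
        "crest_factor": crest_factor,
        "kurtosis": kurtosis,
    }

# Hypothetical test signal: ten full periods of a unit-amplitude sine.
t = np.arange(1000) / 1000.0
indicators = time_domain_indicators(np.sin(2 * np.pi * 10 * t))
print(indicators)
```

For a pure sine the kurtosis equals 1.5; impulsive content, such as impacts from a local defect, raises both the kurtosis and the crest factor.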
1.3.3 Integrated fault diagnosis based on data fusion
According to some results, a single CM technique used alone can diagnose up to 30-40% of all mechanical defects [5, 39, 40]. In the past two decades, considerable attention has been given to the integration of various CM techniques with the purpose of more accurate and robust fault diagnosis and system health assessment. Strong evidence shows that fusion allows for a more reliable and accurate estimation of the state of health, which in turn can serve as a basis for more effective maintenance programmes [41–44]. The foundations of the multisensor data fusion model were laid by the JDL (Joint Directors of Laboratories), which introduced a 5-level fusion model that strives to generalize the terminology and methodology [45]. The initially introduced 5-level model was later reduced to 3 levels of fusion, related to the unprocessed data, features and decisions. Fusion of unprocessed data comes into play when one has to combine data from similar physical sensors that measure the same physical phenomenon. These techniques of information fusion are realized by various means such as Kalman filters [46] and Dempster-Shafer methods [47].
The fusion of feature values is applicable in cases where each sensor provides a different type of information to be represented in the feature vector. Features from each sensor are then fused into a generalized feature vector that is used in further analyses. This approach is the most flexible one, since it allows fusion of information from different physical backgrounds [48–50]. Decision fusion comprises methods that associate decisions provided by individual sensors as partial assessments of the system’s condition. Based on them, the decision support system aggregates the information into one overall assessment of the system state. The commonly used approaches for decision fusion include Bayesian reasoning, Dempster-Shafer reasoning, and neural networks [51–54].
1.4 Objectives and scientific contributions of the dissertation
In order to generalize the usefulness of the proposed diagnostic algorithms, they will be extended to multisensor systems. The result of such an integration is an online diagnostic system for gearboxes, capable of associating the results of lubricant and vibration analyses into an overall assessment of the system’s health. To achieve this goal, several objectives were established. First, a fully automated online algorithm for detection of changes in lubricant properties, such as temperature, moisture, wear particles, etc., was developed. Second, a vibration-based online algorithm for detection of impacts produced as a result of gear or bearing defects was developed as well. Third, an algorithm for the integration of results from vibration and lubricant analyses, fault diagnosis and overall assessment of the system’s health was set up.

The main contributions of the dissertation are:

• A novel algorithm for fault detection in lubricants based on classification of trend patterns. The algorithm performs the detection of transients in lubricant parameters using the CUSUM technique. After a change is detected, the current trend is assigned a qualitative value such as stable, increasing, decreasing, no change, etc.

• A novel vibration processing algorithm for detection of irregularities in impact produced by gear and bearing defects.¹ The algorithm uses spectral kurtosis and optimal denoising (Wiener) filtering techniques for the extraction of fault-related non-stationary components from short vibration segments. The filtered signal, called the SK-residual, is used for the detection of impacts employing k-means clustering and k-nearest neighbors classification methods. In addition, the algorithm uses the majority voting rule for the final estimation of the probability of tentative gear and bearing defects.

• A new algorithm for fault diagnosis based on integration of vibration and lubricant analyses. The algorithm is based on decision fusion and weighted averaging of the probabilities provided separately by lubricant and vibration analyses into an overall system health assessment. The algorithm uses the incidence table to deduce the relationship between faults and lubricant and vibration features. The result of the algorithm is a set of ranked probability estimates of particular fault modes.

¹ In the presence of a fault, the damaged surface affects the tribological contact (impact). Instead of "detection of irregularities in impact", we use the shorter form "impact detection".
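The weighted averaging named in the last contribution can be illustrated with a minimal sketch. It is not the algorithm of the dissertation: the incidence table is not modeled, the function name `fuse_fault_probabilities`, the weights and the probability values are hypothetical, and a fault mode missing from one analysis is assumed to have probability zero there.

```python
def fuse_fault_probabilities(p_lubricant, p_vibration,
                             w_lubricant=0.5, w_vibration=0.5):
    """Weighted average of per-fault probabilities, ranked from high to low."""
    total_weight = w_lubricant + w_vibration
    fused = {}
    for fault in set(p_lubricant) | set(p_vibration):
        # Assumption: a fault not reported by an analysis counts as 0.0.
        p_l = p_lubricant.get(fault, 0.0)
        p_v = p_vibration.get(fault, 0.0)
        fused[fault] = (w_lubricant * p_l + w_vibration * p_v) / total_weight
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

# Hypothetical partial assessments from the two analyses.
p_lub = {"gear pitting": 0.6, "water contamination": 0.1}
p_vib = {"gear pitting": 0.8, "bearing defect": 0.3}
ranked = fuse_fault_probabilities(p_lub, p_vib)
print(ranked)
```

The ranked list corresponds to the kind of output described above, i.e. probability estimates of particular fault modes ordered by magnitude.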
1.5 Dissertation structure
In chapter 2, a detailed overview of the most typical gearbox defects is provided together with the corresponding vibration and lubricant signatures. In chapter 3, a methodology for the detection of gearbox defects based on lubricant and wear debris analysis is provided. The Change Detection Algorithm (CDA) is introduced and described in detail. It is aimed at fault detection by making use of qualitative trend analysis. The chapter concludes with application of the algorithm to several case studies. In chapter 4, a methodology for the diagnosis of gearbox defects based on vibration analysis is described. Spectral kurtosis and optimal denoising (Wiener) filtering are used to extract the fault signature from the vibration signal. K-means clustering and k-nearest neighbour classification methods are used for detection of the fault-related impacts. The approach is validated by experimental results including localized bearing defects, such as outer and inner race defects, and gear pitting. In chapter 5, the methodology for integrated fault diagnosis and global assessment of the system’s health is presented. The methodology is based on the incidence table and the majority rule approach, used for the final fault probability estimation. In chapter 6, the results from various experiments are presented, including a gear pitting experiment, water contamination, and alien fluid contamination experiments. Conclusions are summarized in chapter 7.
2 Critical gearbox components and fault signatures

2.1 Single stage spur gears and typical defects
Gears are power transmission components that transmit forces from one shaft to another, usually by changing angular speed or torque. The most common profile is involute, as the input/output speed ratio is insensitive to small variation in the center distance. The pressure angle is the angle between the direction of the normal force between the mating teeth and the common tangent to the pitch circles of each gear (Figure 2.1). For spur gears, the line of action is the tangent to the base circles of the two gears which define the base of the involute curve of the tooth profiles.
Figure 2.1: Basic geometry of a pair of spur gears [29].

Typical gear defects, and the way they are conditioned by the operational regimes in the torque versus rotational speed characteristic, are indicated in Figure 2.2 [55]. Area 2 refers to minimal wear at normal operating speed, provided sufficient lubrication is ensured. Particularly at lower speeds and high torque (area 1), component wear might increase, mainly due to reduced lubrication in the tribological contact. With increased torque, gear scoring (area 3) or pitting (area 4) are the most likely defects to occur. Gear pitting occurs when the local pressure becomes very high and the material is subjected to fatigue. Parts of the tooth material are scraped off, resulting in small pits. At higher torque (area 5) tooth breakage is likely to occur.

Figure 2.2: The emergence of some of the typical gear defects at different operating conditions [55].
2.1.1 Gear vibration
The vibration spectrum of healthy gears depends much on the Contact Ratio (CR) and Transmission Error (TE). CR represents the average number of teeth in contact throughout a meshing cycle. If the CR is an integer, a constant number of teeth are in contact throughout the entire meshing cycle, and the stiffness of a tooth pair in contact also remains constant. For non-integer CR, tooth deflections tend to double when passing from double tooth pair to single tooth pair contact. Varying stiffness has a period equal to the number of teeth which gives strong excitation at the toothmesh frequency, which is rotational speed multiplied by the number of gear teeth. In principle, if the CR were an integer, there would be no excitation of gearmesh frequency. TE is introduced to express teeth deformation under load, even when tooth profiles are perfect. In addition, there is some geometric deviation from the ideal profiles due to wear of material from each tooth, which is most intensive at the tip of the tooth and
practically zero at the pitch circle [29]. The amplitude of the spectral component at the toothmesh frequency varies directly with the load fluctuations and can be considered as an amplitude modulation (AM) signal. Amplitude modulation gives rise to sidebands around harmonics of the gearmesh frequency. In addition to the amplitude modulation, frequency modulation (FM) excites the same sidebands around the same carrier frequencies [15]. FM is a consequence of two phenomena, one being directly dependent on changes in the rotating speed, and the other being related to deviations from the ideal tooth profiles, which tend to differ from tooth to tooth. A damaged tooth contact surface, e.g. due to pitting or scoring, has a strong effect on the system’s vibration response, as it introduces additional variation in the stiffness of the meshing tooth pair [17, 29]. Each time a defective tooth pair comes into mesh, a weak impulsive excitation is imposed on the structure at the moment when the damaged tooth surface comes into tribological contact. This phenomenon contributes twofold to the gear vibration response, i.e. in terms of impulse excitation of the structural resonances and additional amplitude modulation (Figure 2.3).
Figure 2.3: Typical vibration signatures of local gear faults

Amplitude modulation caused by a fault manifests as a flat increase of sidebands around the gearmesh frequency. The distance between two neighboring sidebands equals the rotational frequency of the damaged gear. In cases of multiple damages on the teeth, the AM becomes sine shaped, and gives stronger excitation to sidebands around
harmonics of gearmesh frequency. Uniform tooth wear is indicated by an increase in the second harmonic of the toothmesh frequency, since the effect at the first harmonic is dominated by tooth deflection. As wear progresses, the profile deterioration causes increase of harmonics and sidebands of the gearmesh frequency. In some cases, especially at higher loads and poor lubrication, the lack of sliding at pitch circles can lead to a local breakdown of the lubricant film and consequent pitchline pitting [56, 57]. Because of its impulsiveness, it results in an increase of all high harmonics of the toothmesh frequency.
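The relation between the toothmesh frequency and its sidebands can be checked numerically with a minimal sketch. It assumes a purely sinusoidal amplitude modulation at the rotational frequency of the gear; the number of teeth, the rotational frequency, the modulation depth and the sampling rate are hypothetical values chosen so that all components fall on exact frequency bins.

```python
import numpy as np

teeth = 20                # hypothetical number of gear teeth
f_rot = 10.0              # hypothetical rotational frequency of the gear in Hz
f_mesh = teeth * f_rot    # toothmesh frequency: rotational speed times teeth

fs = 2000.0                       # sampling rate in Hz (assumption)
t = np.arange(2000) / fs          # one second of data, 1 Hz resolution

# Carrier at the toothmesh frequency, modulated once per gear revolution.
signal = (1.0 + 0.5 * np.cos(2 * np.pi * f_rot * t)) * np.cos(2 * np.pi * f_mesh * t)

# Single-sided amplitude spectrum.
spectrum = 2.0 * np.abs(np.fft.rfft(signal)) / len(signal)
freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)

# The three strongest components: carrier and one sideband on each side.
strongest = np.sort(freqs[np.argsort(spectrum)[-3:]])
print(strongest)
```

The strongest components lie at the toothmesh frequency and at one rotational frequency above and below it, which is the sideband spacing described above.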
2.2 Rolling element bearings
Rolling element bearings are the most widely used elements in machines and also the most frequent reason for machine breakdown. A standard rolling element (ball) bearing is shown in Figure 2.4. It is composed of the outer race, the inner race, the rolling elements and the cage. The load is usually applied to the inner race and transmitted to the outer race through the rolling elements.
Figure 2.4: The rolling element bearing [58].
2.2.1 Bearing vibration
Even geometrically perfect bearings generate vibration due to varying compliance or due to the time varying contact forces in the bearing. In condition monitoring it is of utmost importance to understand the nature of vibration caused by bearing faults. These can be localized or distributed. Distributed faults mainly appear as surface roughness, waviness, misaligned races, and are usually caused by varying size of rolling elements. Localized faults, on the other hand, include cracks, pits and spalls, and are mainly caused by fatigue of the rolling surface.
When the rolling element hits a locally damaged area on the outer or inner race, a weak force pulse acts on the structure, thus exciting high-frequency resonances (Figure 2.5).
Figure 2.5: Typical vibration and envelope signals produced by localized bearing faults [29].

The same happens when a fault on the rolling element comes into contact with either of the two races. For a constant rotational speed, these pulses are generated at a rate referred to as the fault frequency. By detecting this frequency in the spectrum, one can determine the location of the defect [12]. Vibration signals produced by bearing faults are usually amplitude modulated, which is a consequence of the varying load borne by the rolling elements and the varying transfer function between the source of the pulse and the sensor. Figure 2.5 presents the most typical amplitude modulation patterns, which depend on the damage location. In the case of an outer race defect, where the damaged area is at a standstill, a relatively low amount of amplitude modulation is present. In contrast, in the case of an inner race defect, where the damaged surface rotates at the shaft frequency, strong amplitude modulation is a consequence of the varying transfer function between the source (damaged area) and the sink (sensor). Similar phenomena are observed in the case of a rolling element defect. There are four main frequencies associated with bearing faults:
\[ BPFO = \frac{N_{re} F_{sh}}{2} \left( 1 - \frac{d_{re}}{d_{ftf}} \cos\varphi \right) \tag{2.1} \]

\[ BPFI = \frac{N_{re} F_{sh}}{2} \left( 1 + \frac{d_{re}}{d_{ftf}} \cos\varphi \right) \tag{2.2} \]

\[ FTF = \frac{F_{sh}}{2} \left( 1 - \frac{d_{re}}{d_{ftf}} \cos\varphi \right) \tag{2.3} \]

\[ BSF\,(RSF) = \frac{d_{ftf}}{2 d_{re}} F_{sh} \left[ 1 - \left( \frac{d_{re}}{d_{ftf}} \cos\varphi \right)^2 \right] \tag{2.4} \]

where $BPFO$ is the ball-pass frequency at the outer race, $BPFI$ is the ball-pass frequency at the inner race, $FTF$ is the fundamental train frequency and $BSF\,(RSF)$ is the ball (roller) spin frequency. The symbol $F_{sh}$ stands for the shaft rotation frequency, $N_{re}$ is the number of rolling elements, $d_{re}$ is the diameter of a bearing rolling element, $d_{ftf}$ is the diameter of the cage, and $\varphi$ is the angle of the load.
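The four fault frequencies can be evaluated as in the following minimal sketch. The function name `bearing_fault_frequencies` and the bearing dimensions are hypothetical. As an assumption of this sketch, the ball spin frequency is scaled by the shaft frequency so that all four results are expressed in Hz, and the angle is given in radians.

```python
import math

def bearing_fault_frequencies(n_re, f_sh, d_re, d_ftf, phi):
    """Return BPFO, BPFI, FTF and BSF in the units of f_sh (phi in radians)."""
    ratio = d_re / d_ftf * math.cos(phi)
    return {
        "BPFO": n_re * f_sh / 2.0 * (1.0 - ratio),
        "BPFI": n_re * f_sh / 2.0 * (1.0 + ratio),
        "FTF": f_sh / 2.0 * (1.0 - ratio),
        # Assumption: BSF is scaled by the shaft frequency.
        "BSF": d_ftf / (2.0 * d_re) * f_sh * (1.0 - ratio ** 2),
    }

# Hypothetical bearing: 8 rolling elements of 8 mm on a 40 mm cage diameter,
# shaft at 25 Hz, zero angle.
freqs = bearing_fault_frequencies(n_re=8, f_sh=25.0, d_re=8.0, d_ftf=40.0, phi=0.0)
print(freqs)
```

A quick consistency check is that BPFO and BPFI always sum to the number of rolling elements times the shaft frequency.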
2.3 Typical lubricant-related faults
Lubrication greatly contributes to the reduction of component friction and wear. Improper lubrication due to lubricant degradation can lead to a reduced film thickness, causing direct contact between the contact surfaces and consequently excessive wear of the surfaces in the tribological contact. The analysis of physical and chemical properties of the lubricant, such as additives, viscosity, temperature, water saturation and contamination, as well as wear debris, can be used to detect problems with lubricant degradation and excessive wear of the contact surfaces, and to determine the quality of the lubricant and its suitability for further use.
2.3.1 Lubricant contamination
The notion of contamination stands for the presence of unacceptable substances in the lubricant, such as water, acids, etc. Inevitably, contamination causes deterioration of the lubricant and decreased quality of lubrication, which greatly reduces the reliability and life span of a system. In combination with oxygen, contamination can promote the formation of deterioration products such as resins, varnishes, and acids inside the operating lubricant [59]. Water ingress into the lubricant reservoir results in increased precipitation of additives, leading to a loss of corrosion protection [60]. Studies indicate that concentrations of water particles as low
as 100 ppm can reduce the life of mechanical components by 50 %, and concentrations above 400 ppm by 75 % [61].
2.3.2 Lubricant aging
Lubricant aging is characterized by an increase in the viscosity. Lubricant oxidation rate tends to be proportional to the increase in viscosity towards the end of the lubricant lifespan. In the same way, the initial protective additives are removed by oxidation products, hence increasing acidity. All these end-of-life phenomena are followed by darkening of oil color [62].
2.3.3 Excessive component wear
Insufficient or deteriorated lubrication leads to the thinning of the oil film. As a consequence, the contact area between metal contact surfaces in sliding motion gives rise to fatigue, abrasion, and excessive wear. Besides obvious performance decay, the efficiency of power transmission is greatly affected by the energy lost due to local heat generated by friction. Wear particle count and shape represent an efficient tool for component wear detection, since wear debris can be directly associated with wear mechanisms [5, 40, 63]. Additionally, detailed wear debris analysis can be used for determining the type of defect, and the remaining component life-time, by counting the cumulative mass of worn material [39, 64, 65]. The material removed from the contact surface during fault progression produces a typical signature as shown in Figure 2.6.
Figure 2.6: Wear debris quantity and size during operation
During the initial run-in phase of gears, large amounts of very small, flake-shaped particles are worn off the surface due to a surface smoothing mechanism. The sizes of the particles generated during the run-in phase are in the range 5 µm to 15 µm. During the normal operation phase, gear wear decreases due to the smoothed contact surface, sufficient thickness of the lubrication film, etc. The amount of generated wear debris is therefore greatly decreased upon termination of the run-in phase. Eventually, wear and material degradation cause defect propagation into the material, resulting in a very strong increase in the amount and size of generated wear debris, in the range 20 µm to 500 µm and larger.
3 Trend change detection in lubricant parameters

Revealing trend changes in lubricant parameters is of great importance in many applications where zero tolerance to faults has to be assured, e.g. in monitoring helicopter gearboxes. Below, a novel algorithm for automated detection of trend changes in lubricant parameters is presented. The core change detection algorithm (CDA) is based on the cumulative sum of errors (CUSUM), which is a sequential analysis technique used to detect changes in a given time series. The algorithm assumes the time series to be a zero-mean Gaussian process [7]. In each time step a cumulative sum of past values is calculated and checked for a tentative rise above the threshold. If the threshold is exceeded, this is a clear sign that the statistical properties of the Gaussian process have changed in the meantime. This idea is the basis for the trend extraction algorithm presented in [8]. The method employs linear predictive models and uses the prediction error as input for the CUSUM algorithm. The decision process for qualitative evaluation is based on the comparison of segments of data and their trend slopes. CDA was developed as an extension of the approach presented in [8], with the capability not only to detect a trend change, but also to classify the signal into several qualitative classes, such as increasing, decreasing, stabilizing, etc. (Figure 3.1).
Figure 3.1: Transient detection and identification based on qualitative trend analysis
CDA compares two data windows, the reference window and the current one, which define the reference and current vectors $x_R(i) \in \mathbb{R}^{N_R}$ and $x_C(i) \in \mathbb{R}^{N_C}$, respectively, where $i$ stands for the time index. The reference vector contains samples of the time series before the change. It is kept as the reference as long as no change in the time series is detected. After normalization of both data vectors, linear regression is calculated and used for the prediction of future values. The difference between the predicted values of the time series and the actual values is the prediction error, which is used for calculation of the cumulative sum of errors (CUSUM). CUSUM reflects the degree of difference between the data defined by the two windows. When the CUSUM value exceeds a detection threshold, the reference slope is no longer considered representative of the current slope and a trend change is detected. Detection is followed by classification of the current trend pattern with respect to the reference data. Its qualitative values can be increasing, decreasing, no change, stabilizing, etc.
3.1 Change detection algorithm (CDA)
For a sampled oil parameter vector $x(i)$, where $i$ is the time index, data acquisition is performed with a sampling time $T_s$. After $N_{init}$ sampling intervals, when $i = N_{init}$ and the amount of data is sufficient for the CDA initiation process, the property vector is formed as $x(i) = [x(1), x(2), \ldots, x(N_{init})]$.
3.1.1 Initiation by definition of reference and current vectors
At $i = N_{init}$ the data are split into two parts, the reference vector and the current vector (Figure 3.2). The reference vector, defined by a fixed rectangular window $w_R(i) \in \mathbb{R}^{N_R}$,

\[ w_R(i) = \begin{cases} 1 & \text{for } N_{init} - N_C - N_R < i \le N_{init} - N_C \\ 0 & \text{otherwise} \end{cases} \tag{3.1} \]

is used to define the section of the property vector considered as the reference. The current vector is defined by a sliding rectangular window $w_C(i) \in \mathbb{R}^{N_C}$,

\[ w_C(i) = \begin{cases} 1 & \text{for } N_{init} - N_C < i \le N_{init} \\ 0 & \text{otherwise} \end{cases} \tag{3.2} \]

Figure 3.2: Initial reference and current data windows.

An element of the reference vector $x_R(i) \in \mathbb{R}^{N_R}$ is defined as

\[ x_R(i) = w_R(i) \cdot x(i) = \begin{cases} x(i) & \text{for } N_{init} - N_C - N_R < i \le N_{init} - N_C \\ 0 & \text{otherwise} \end{cases} \tag{3.3} \]

and an element of the current vector $x_C(i) \in \mathbb{R}^{N_C}$ as

\[ x_C(i) = w_C(i) \cdot x(i) = \begin{cases} x(i) & \text{for } N_{init} - N_C < i \le N_{init} \\ 0 & \text{otherwise} \end{cases} \tag{3.4} \]

3.1.2 Data normalization
Immediately after selection of the two vectors to be subjected to analysis, data from both windows are first normalized. At time $i = N_{init}$ the reference vector $x_R(i)$ is centered to zero mean $\mu$ and normalized by the variance ($Var$), where

\[ Var(x_R(i)) = \frac{1}{N_R} \sum_{i=1}^{N_R} (x_R(i) - \mu)^2 \tag{3.5} \]

and

\[ \mu(x_R(i)) = \frac{\sum_{i=1}^{N_R} x_R(i)}{N_R} \tag{3.6} \]

The normalized reference vector $x_{Rn}(i) \in \mathbb{R}^{N_R}$, shown in Figure 3.3, is equal to

\[ x_{Rn}(i) = \frac{x_R(i) - \mu}{Var} \tag{3.7} \]

Figure 3.3: The normalized reference and present data windows.

The same normalization procedure is applied to the current vector $x_C(i)$, resulting in the normalized current vector $x_{Cn}(i) \in \mathbb{R}^{N_C}$

\[ x_{Cn}(i) = \frac{x_C(i) - \mu}{Var} \tag{3.8} \]

3.1.3 Modeling trends by means of linear regression
After the normalization procedure, linear regression parameters are calculated for both vectors $x_{Rn}(i)$ and $x_{Cn}(i)$. The data from each window are described with a linear model

\[ \hat{x}(i) = a + b \cdot i \tag{3.9} \]

where $i$ indicates time, and $a$ and $b$ are the linear regression parameters. Optimal estimates of the parameters $a$ and $b$ for the normalized reference vector $x_{Rn}(i)$ are obtained by minimizing the criterion function

\[ D_R = \sum_{i=1}^{N_R} \left[ x_{Rn}(i) - (a_{Rn} + b_{Rn} \cdot i) \right]^2 \tag{3.10} \]

The solution reads

\[ \begin{pmatrix} a_{Rn} \\ b_{Rn} \end{pmatrix} = \Phi^{-1} \cdot \varphi \tag{3.11} \]

where

\[ \varphi = \sum_{i=1}^{N_R} \begin{bmatrix} x_{Rn}(i) \\ x_{Rn}(i) \cdot i \end{bmatrix}, \qquad \Phi = \sum_{i=1}^{N_R} \begin{bmatrix} 1 & i \\ i & i^2 \end{bmatrix} \tag{3.12} \]

After $a_{Rn}$ and $b_{Rn}$ are obtained for the reference vector, the same linearization process is performed to calculate the regression parameters $a_{Cn}$ and $b_{Cn}$ for the current vector $x_{Cn}(i)$ as

\[ \begin{pmatrix} a_{Cn} \\ b_{Cn} \end{pmatrix} = \Phi^{-1} \cdot \varphi \tag{3.13} \]

where

\[ \varphi = \sum_{i=1}^{N_C} \begin{bmatrix} x_{Cn}(i) \\ x_{Cn}(i) \cdot i \end{bmatrix}, \qquad \Phi = \sum_{i=1}^{N_C} \begin{bmatrix} 1 & i \\ i & i^2 \end{bmatrix} \tag{3.14} \]

In order to check whether the reference trend describes the data from the current window, first the predicted reference values are calculated, i.e. $\hat{x}_{Rn}(i) = a_{Rn} + b_{Rn} \cdot i$. This is followed by linear approximation of the current normalized vector as $\hat{x}_{Cn}(i) = a_{Cn} + b_{Cn} \cdot i$ (Figure 3.4). The predicted reference vector $\hat{x}_{Rn}(i) \in \mathbb{R}^{N_P}$ and the predicted current vector $\hat{x}_{Cn}(i) \in \mathbb{R}^{N_P}$ reflect the trend slope of the reference and current vectors, and are therefore used to estimate the error and CUSUM.
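The initiation, normalization and regression steps of sections 3.1.1 to 3.1.3 can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in the dissertation: the function names `split_windows`, `normalize` and `fit_line` are hypothetical, each window is normalized with its own mean and variance, the regression index runs from 1 to the window length, and a window with zero variance is not handled.

```python
import numpy as np

def split_windows(x, n_r, n_c):
    """Reference and current vectors taken from the end of x, cf. (3.1)-(3.4)."""
    x = np.asarray(x, dtype=float)
    n_init = len(x)
    x_r = x[n_init - n_c - n_r : n_init - n_c]   # N_init-N_C-N_R < i <= N_init-N_C
    x_c = x[n_init - n_c :]                      # N_init-N_C < i <= N_init
    return x_r, x_c

def normalize(v):
    """Center to zero mean and divide by the variance, cf. (3.5)-(3.8)."""
    mu = np.mean(v)
    var = np.mean((v - mu) ** 2)
    return (v - mu) / var        # assumption: var > 0

def fit_line(v):
    """Least-squares a, b of v(i) ~ a + b*i for i = 1..len(v), cf. (3.11)-(3.12)."""
    i = np.arange(1, len(v) + 1, dtype=float)
    phi_matrix = np.array([[len(v), i.sum()], [i.sum(), (i ** 2).sum()]])
    phi_vector = np.array([v.sum(), (v * i).sum()])
    a, b = np.linalg.solve(phi_matrix, phi_vector)
    return a, b

# Hypothetical property vector: a slowly increasing oil parameter.
x = np.arange(1, 31, dtype=float)
x_r, x_c = split_windows(x, n_r=20, n_c=10)
a_rn, b_rn = fit_line(normalize(x_r))
a_cn, b_cn = fit_line(normalize(x_c))
print(a_rn, b_rn, a_cn, b_cn)
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse of the matrix explicitly, while giving the same parameters as (3.11).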
3.1.4 CUSUM calculation
The error e(i) is calculated as a difference between the reference predicted values x ˆRn (i) and present linear approximated values x ˆCn (i) (Figure 3.5).
23
Figure 3.4: Linear regression and prediction of the reference and current values.
e(i) = xˆCn (i) − xˆRn (i)
(3.15)
The CUSUM value $E(i)$ is calculated by summing the errors
$$E(i) = e(i) + E(i-1) = \sum_{i} e(i) \tag{3.16}$$
Within the present window, the error $e(i)$ represents the instantaneous difference between the reference and present linear regression values, and the CUSUM $E(i)$ the sum of errors since the definition of the reference vector. When the CUSUM value remains below the detection threshold, as explained in the next section, the analysis procedure continues by sliding the current window $w_C(i)$ with a step of $N_S$ samples (overlap of $\frac{N_C - N_S}{N_C}$ %), while the reference remains unchanged (Figure 3.6). After each move of the sliding window the above procedure (section 3.1.1 to section 3.1.4) is repeated to update the linear regression parameters to the most recent data available, calculate the error and update the CUSUM values. With such an approach, the original property vector $x(i)$ is approximated using linear segments as shown in Figure 3.6. After each step, only the linear regression corresponding to the non-overlapping segment, $N_{init} + (j-1) \cdot N_S < i \leq N_{init} + j \cdot N_S$, where $j$ is the number of steps, is used for error and CUSUM calculation.
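Eqs. (3.15) and (3.16) can be illustrated with a short Python sketch. The function name `cusum`, the argument `E_prev` (the CUSUM carried over from the previous step) and the numerical values are assumptions made for the example only.

```python
import numpy as np

def cusum(a_ref, b_ref, a_cur, b_cur, idx, E_prev=0.0):
    """Error e(i) between the two fitted lines and its running sum E(i)."""
    idx = np.asarray(idx, dtype=float)
    e = (a_cur + b_cur * idx) - (a_ref + b_ref * idx)  # eq. (3.15)
    E = E_prev + np.cumsum(e)                          # eq. (3.16)
    return e, E

# Hypothetical case: flat reference, current window rising with slope 0.5
e, E = cusum(a_ref=0.0, b_ref=0.0, a_cur=0.0, b_cur=0.5, idx=[1, 2, 3, 4])
```

Passing only the indexes of the non-overlapping segment as `idx`, and the last CUSUM value as `E_prev`, would correspond to the sliding-window update described above.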
Figure 3.5: Error and CUSUM.
3.1.5 Detection of trend change
The CUSUM $E(i)$ reflects the deviation of the current property evolution from the initially established reference. Low CUSUM values indicate that the current evolution of the property and the reference do not differ significantly, while high CUSUM values indicate significant deviation from the reference. In other words, CUSUM is used to estimate how well the current evolution of the property complies with the reference. The comparison is based on two thresholds $Th_{abn}$ and $Th_{det}$, where $Th_{abn} < Th_{det}$. The abnormal threshold $Th_{abn}$ is used to detect a possible start of a trend change, and the detection threshold $Th_{det}$ serves to confirm the trend change, thereby minimizing false alarms.
As long as CUSUM remains below the abnormal threshold, $E(i) < Th_{abn}$, the current and reference trends are considered compliant, and the reference vector is still representative of the current one. In this case the detection is False and the process proceeds by repeating the above steps (sections 3.1.1-3.1.4).
Figure 3.6: Error and CUSUM calculation after sliding current window.
When CUSUM $E(i)$ exceeds the abnormal threshold $Th_{abn}$ but remains below the detection threshold $Th_{det}$, i.e. $Th_{abn} < E(i) < Th_{det}$, it is an early indication of the deviation of the current trend from the reference one. The difference is not yet sufficient for the change to be detected, but the values corresponding to $i > N_{abn}$, where $N_{abn}$ represents the instant when CUSUM exceeded the abnormal threshold, $E(N_{abn} - 1) < Th_{abn} \leq E(N_{abn})$, are considered abnormal and are used to define the new reference window after positive detection (Figure 3.7). In this case the detection is still False and the process proceeds by repeating the above steps (sections 3.1.1-3.1.4).
When CUSUM $E(i)$ exceeds the detection threshold, $E(i) > Th_{det}$, the current vector is considered non-compliant with the reference one, and the detection is True. The section of the property vector $x(i)$, for time indexes $i$ corresponding to the abnormal values $Th_{abn} < E(i) < Th_{det}$, is defined as the abnormal section
$$x_{abn}(i) = x(i) \quad \text{for } N_{abn} < i < N_{det} \tag{3.17}$$
Figure 3.7: Detection of trend change using CUSUM value E.
where $N_{abn}$ and $N_{det}$ are defined as instants of CUSUM exceeding the abnormal and detection thresholds, respectively. Trend change detection is followed by qualitative trend analysis in order to determine the nature of change (increasing, decreasing, stabilizing, stable, etc.). For classification, the abnormal vector is used, and compared to the reference one. At the same time, CUSUM $E(i)$ is reset to 0, and the reference window $w_R(i)$ for further processing is redefined to correspond to the abnormal data as
$$w_R(i) = \begin{cases} 1 & \text{for } N_{abn} < i < N_{det} \\ 0 & \text{otherwise} \end{cases} \tag{3.18}$$
The abnormal data are chosen as the newly defined reference because, once a deviation in the property vector has occurred, they represent the most recent property evolution. This makes them appropriate to serve as the reference after the final classification is concluded.
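The two-threshold logic can be sketched as follows. The function name `detect` and the sample values are hypothetical; comparing the absolute value of the CUSUM with the thresholds is an assumption made here so that deviations in both directions are covered, and the returned indexes are zero-based.

```python
def detect(E, th_abn, th_det):
    """Return (n_abn, n_det): first indexes at which |E| crosses each threshold, else None."""
    n_abn = n_det = None
    for i, value in enumerate(E):
        # Abnormal instant: first sample with |E| >= th_abn
        if n_abn is None and abs(value) >= th_abn:
            n_abn = i
        # Detection instant: first sample with |E| > th_det
        if abs(value) > th_det:
            n_det = i
            break
    return n_abn, n_det

# Hypothetical CUSUM sequence with thresholds 2 and 6
n_abn, n_det = detect([0.5, 1.5, 3.0, 5.0, 8.0], th_abn=2.0, th_det=6.0)
```

The samples between `n_abn` and `n_det` would then form the abnormal section of eq. (3.17) and, after the CUSUM reset, the new reference window of eq. (3.18).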
3.1.6 Change evaluation
When CUSUM $E(i)$ exceeds the detection threshold, $E(i) > Th_{det}$, the process of evaluation of the detected change occurs. The evaluation is based on the comparison of the slope of the reference and abnormal vectors, $b_{Rn}$ and $b_{abn}$ respectively (Figure 3.8).
Figure 3.8: Qualitative trend analysis (QTA) using the decision quadrant
There are four possible qualitative states $S(i)$ that the trend change can be classified into
$$S(i) = \begin{cases} 0 & \text{for stable} \\ \pm 1 & \text{for stabilizing (increasing/decreasing)} \\ \pm 2 & \text{for non-changing (increasing/decreasing)} \\ \pm 3 & \text{for changing (increasing/decreasing)} \end{cases} \tag{3.19}$$
where the state regions are defined using the reference slope $b_{Rn}$ and classification thresholds
• $Th_0$ - the stable threshold defines an absolute slope value that is considered close to 0 or stable ($S(i) = 0$)
• $Th_C$ - the relative changing threshold used to define the non-changing area around the reference slope as $Th_{C+,-} = b_{Rn}(1 \pm Th_C)$
Transitions between states are presented in Figure 3.9.
Figure 3.9: CDA state transition diagram.
The stable state $S(i) = 0$ is used to characterize a signal with a slope close to zero (0) with the conclusion that no change is present. From the stable state, two transitions are possible:
• transition $T_{00}$ to stable state when $b_{abn} < Th_0$,
• transition $T_{03}$ to changing state when $b_{abn} \geq Th_0$.
The changing increasing/decreasing state $S(i) = \pm 3$ indicates cases where the abnormal slope is significantly larger/smaller than the reference slope. When the state is changing, the decision quadrant is divided into three classes, using the changing threshold $Th_C$ (figure 3.8). From the changing state several transitions are possible; however, no transition from changing to stable state is defined, because changing should be followed by stabilizing prior to the stable state:
• transition $T_{33}$ to changing state when $b_{abn} \geq Th_{C+}$,
• transition $T_{32}$ to non-changing state when $Th_{C+} > b_{abn} \geq Th_{C-}$,
• transition $T_{31}$ to stabilizing state when $b_{abn} < Th_{C-}$.
The non-changing increasing/decreasing state $S(i) = \pm 2$ is used for cases when the change was detected by the CUSUM exceeding the detection threshold, but the slope of the abnormal data is not significantly larger or smaller than the reference one. When the state is non-changing, the decision quadrant is divided into the same three classes as in the previous case (figure 3.8). The transitions from the non-changing state are:
• transition $T_{23}$ to changing state when $b_{abn} \geq Th_{C+}$,
• transition $T_{22}$ to non-changing state when $Th_{C+} > b_{abn} \geq Th_{C-}$,
• transition $T_{21}$ to stabilizing state when $b_{abn} < Th_{C-}$.
The stabilizing increasing/decreasing state indicates cases when the signal's slope is stabilizing, meaning that the abnormal slope is significantly smaller than the reference one in the absolute sense. When the state is stabilizing, the transitions are:
• transition $T_{11}$ to stabilizing state when $b_{abn} < Th_{C-}$,
• transition $T_{13}$ to changing state when $b_{abn} \geq Th_{C+}$,
• transition $T_{12}$ to non-changing state when $Th_{C+} > b_{abn} \geq Th_{C-}$,
• transition $T_{10}$ to stable state when $b_{abn} < Th_0$.
The state vector $S(i)$ is used to describe qualitative states of the analyzed property over time. $S(i)$ is appended with the new qualitative state value every time a change is detected at $i = N_{det}$, and therefore contains relevant information regarding property evolution over time.
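The transition rules listed above can be condensed into one small function. The sketch below is an interpretation, not the implementation used in the thesis: the name `next_state` is hypothetical, slopes are compared by absolute value, the sign of the new state is taken from the sign of the abnormal slope, and from the stabilizing state the stable condition is tested first.

```python
def next_state(prev, b_abn, b_ref, th0, thc):
    """Qualitative state after a detected change: 0, +/-1, +/-2 or +/-3."""
    mag = abs(b_abn)
    sign = 1 if b_abn >= 0 else -1
    lower = abs(b_ref) * (1.0 - thc)  # Th_C-
    upper = abs(b_ref) * (1.0 + thc)  # Th_C+
    if prev == 0:
        # From stable: only T00 and T03 are defined
        return 0 if mag < th0 else 3 * sign
    if abs(prev) == 1 and mag < th0:
        # T10: stable is reachable from the stabilizing state only
        return 0
    if mag >= upper:
        return 3 * sign  # changing
    if mag >= lower:
        return 2 * sign  # non-changing
    return 1 * sign      # stabilizing
```

For example, with `th0=0.5` and `thc=0.25`, a changing state followed by a near-zero abnormal slope yields stabilizing rather than stable, which reflects the rule that stabilizing must precede the stable state.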
3.2 Experimental validation of the proposed qualitative trend analysis methodology
3.2.1 Experimental setup
Figure 3.10 shows the experimental setup. The setup consists of a synchronous electric motor (1) and a brake-generator (3) that imposes resistive torque. A single-stage gearbox (2), which provides a speed reduction with a transmission ratio of 1.5, connects the input and output shafts. The shafts are coupled to the motor and generator by two elastic couplings and one fixed coupling. Inside the gearbox, a pair of steel spur gears (DIN 42CrMo4) with 16 and 24 teeth was installed. The spur gears were additionally nitrided prior to the pitting test (section 4.2).
Figure 3.10: Experimental rig.
Inside the gearbox, 2 liters of gear oil Olmaredol VG68 (reference viscosity at 40 °C is 68 mm²/s) were added for each test. Table 3.1 shows the components and main characteristics of the experimental setup.
Component                    Characteristic    Value
Synchronous electric motor   Nominal power     12.7 kW
                             Nominal speed     1470 rpm
Brake generator              Nominal power     20.2 kW
                             Maximum torque    110 Nm
Table 3.1: Nominal parameters of the experimental rig
The integrated sensor unit (ISU) was connected to the gearbox as shown in figure 3.10. Oil properties were acquired synchronously and were analyzed using the CDA algorithm. The objective was to evaluate the CDA and investigate its outputs for various occurring faults. The input parameters of the CDA were obtained empirically prior to the experiments and are shown below (table 3.2).
Oil property                  Present window size  CUSUM abnormal thres. Th1  CUSUM detection thres. Th2  Stable thres. Th0  Change thres. ThC
Temperature                   3 h                  20 °C                      200 °C                      5 °C/h             25%
Relative water content        3 h                  50%                        500%                        0.5 %/h            25%
Relative dielectric constant  4.5 h                20%                        200%                        1.2 %/h            25%
Ferrous particles small       6 h                  200 p.                     2000 p.                     100 p./h           25%
Ferrous particles large       6 h                  200 p.                     2000 p.                     20 p./h            25%
Non-ferrous particles small   6 h                  200 p.                     2000 p.                     10 p./h            25%
Non-ferrous particles large   6 h                  200 p.                     2000 p.                     2 p./h             25%
Table 3.2: Tuning parameters of the CDA algorithm.
3.2.2 Water contamination
The experiment with water contamination was conducted at a constant motor speed of 1000 rpm, with the generator's torque set to 33% of the generator maximum torque. Figure 3.11 presents the relative water content measurements over time and the corresponding results of the CDA analysis. In figure 3.11a, the measurements of the relative water content are shown throughout the experiment duration of 60 hours. Initially, the relative water content shows stable behaviour, even though a slight increase is visible. After t = 19.5h of operation, 1 ml (500 ppm) of water was inserted through an inlet socket to the gearbox. Water contamination immediately caused an abrupt change in the relative water content, which can be seen as a step increase from approx. 26% to 49%. Soon after the contamination the relative water content started to decrease towards stabilization.
Figure 3.11: The detection of trends in time evolution of the relative water content
In figure 3.11b, the CUSUM value is shown as a result of the comparison of the reference and present windows. The initial reference and present windows were set to the first 3 hours of the normalized measurements, which caused the CUSUM value to remain zero (0). The difference between the following present windows and the initial reference window caused CUSUM to decrease within the first 12.5 hours of operation. However, the CUSUM based trend detection was followed by a classification that resulted in the stable trend (figure 3.11c), since the slope of the measured data remained below the stable threshold of 0.5 %/h (table 3.2). After classification, the CUSUM was reset to zero (0) and the reference window was set according to the abnormal data in the interval t = [3h, 12.5h], defined between the abnormal and detection CUSUM thresholds, $Th_1$ and $Th_2$, respectively. It is important to note that the minimal duration of the reference window was set to 3 hours, which is the duration chosen whenever the time between the abnormal and detection thresholds is below this value.
The CUSUM abruptly increased immediately after the contamination (t = 19.5h) and reached the threshold with approx. 0.5 hour delay (figure 3.11b). At the same time, at t = 20h, the qualitative analysis resulted in the changing (increasing) state, because the slope of the reference was significantly lower than that of the present window (figure 3.11c). At that moment, the present window covered the normalized measurements that included the step change produced by the contamination, therefore causing a high slope and consequently a larger error and CUSUM. The CUSUM was reset to zero (0), and the reference window was set in accordance with the CUSUM surpassing the abnormal and detection thresholds, $Th_1$ and $Th_2$, respectively, which defined an interval of t = [19.8h, 20h]. Since the duration of the abnormal interval was below the minimal 3 hours, the reference was set to t = [17h, 20h]. The newly defined reference included the spike produced by the contamination, which consequently caused the reference slope to increase.
In accordance with the measurements, after the step increase at t = 19.5h the sliding present window produced a negative slope, which caused the CUSUM to decrease. The CUSUM reached the negative detection threshold $Th_2$ at approx. t = 25h. The qualitative classification resulted in the changing (decreasing) state, due to the large difference between the reference and present slopes. The CUSUM was reset to zero (0) and the reference was redefined. The signal in figure 3.11a continued to decrease and stabilize. The slopes of both windows remained negative, but the reference one was larger in the absolute sense. This caused the CUSUM value to increase and reach the positive detection threshold at approx. t = 32h. The result of the qualitative classification was the stabilizing (decreasing) state, which lasted until approx. t = 46h. At that time, the stable state was detected because the slope of the measurements remained below the stable threshold of 0.5 %/h (table 3.2).
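The rule for redefining the reference window, including the minimal duration, can be illustrated with a short sketch. The function name `reference_interval` is hypothetical, and clipping the start at zero is an assumption added so that the window cannot begin before the measurement does.

```python
def reference_interval(t_abn, t_det, min_duration):
    """New reference window [start, t_det], extended backwards when the abnormal span is too short."""
    # Keep the abnormal interval if long enough, otherwise go back min_duration from t_det
    start = min(t_abn, t_det - min_duration)
    return max(start, 0.0), t_det

# Water contamination case from the text: abnormal interval [19.8 h, 20 h], minimum 3 h
water_ref = reference_interval(19.8, 20.0, 3.0)
# First detection in the same experiment: [3 h, 12.5 h] is already long enough
first_ref = reference_interval(3.0, 12.5, 3.0)
```

With the values quoted above this gives the windows [17 h, 20 h] and [3 h, 12.5 h], matching the intervals reported for figure 3.11.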
3.2.3 Chemical contamination
The experiment was performed to observe the response of the algorithm to contamination of the operating gearbox oil. Motor speed and generator's torque were set the same as in the previous experiment. Before the start of the experiment the dielectricity sensor was referenced to the gearbox oil Olmaredol VG68. The time signal of the relative dielectric constant and the trend classification are presented in figure 3.12. In figure 3.12a, the measurements of the relative dielectric constant are shown throughout the experiment duration of 18.5 hours. Initially, the relative dielectric constant shows stable behaviour with a mean value equal to 2%. After t = 1h of operation, 50 ml (25000 ppm) of hydraulic oil was inserted through the inlet socket of the gearbox. The relative dielectric constant started to decrease soon after contamination, reaching a value of -15% approx. 7h after contamination and stabilizing at -20% 12 hours after contamination.
Figure 3.12: The detection of trends in the time evolution of the relative dielectric constant
In figure 3.12b the CUSUM value is shown as a result of the comparison of the reference and present windows. Both windows were set to overlap in the interval t = [0, 4.5h], which is evident as a zero (0) CUSUM value within the interval (figure 3.12b). After contamination at t = 1h the relative dielectric constant started to decrease, causing the CUSUM to decrease and surpass the negative detection threshold at t = 8.75h. Due to the decreasing trend of the dielectricity signal, with the slope of the present window significantly smaller than the reference slope, the classification of the change, indicated by CUSUM surpassing the detection threshold, resulted in a changing decreasing qualitative state (figure 3.12c). After classification, the CUSUM was reset to zero (0) and a new reference window was set according to the abnormal data interval, defined by CUSUM crossing the abnormal and detection thresholds, $Th_1$ and $Th_2$, respectively. The chosen interval was t = [6.15h, 8.75h], which gives a reference duration of 2.6 hours. It is important to note that the minimal duration of the reference window was set to the initial 4.5 hours, which has to be chosen if the duration of the abnormal data is lower than the minimal allowed duration, as happened in this particular case. Therefore, the reference was defined as t = [4.25h, 8.75h].
After the first trend change was detected and interpreted, the relative dielectricity signal continued to decrease and stabilize (Figure 3.12a). The newly established reference (t = [4.25h, 8.75h]) had a negative slope, while the slope within the sliding present window tended towards zero (0) for t > 8.75h. The slopes of both windows remained negative with the reference slope larger in the absolute sense, which caused the CUSUM value to increase and reach the positive detection threshold $Th_2$ at approx. t = 13.75h. The result of the qualitative classification was the stabilizing (decreasing) state, which lasted until approx. t = 15.85h, when the stable state was detected because the slope of the measurements remained below the stable threshold of 1.2 %/h (table 3.2).
3.2.4 Gear pitting
The gear pitting experiment was conducted under time-varying load conditions to test the influence of load variation on the progression of surface pitting. The main objective of the pitting experiment was to induce macro pitting on the flanks of the spur gears naturally, during operation under variable load. The motor speed was held constant at 1296 rpm, corresponding to 92% of the motor's nominal rotating speed. The torque was varied in steps of 33%, 66% and 100% of the motor's maximum torque every 7 hours, as shown together with the temperature measurements in figure 3.13. Initially, the load was set to 33% of the generator maximum torque (table 3.1), which caused the temperature to increase from the room temperature of approx. 30 degrees (figure 3.13). The temperature reached a stable value of approx. 37 degrees after 7 hours. The torque was then increased to 66%, which was followed by an increase in temperature that stabilized at 42 degrees. At 14 hours, the torque was set to 100%, which caused the temperature to increase up to 50 degrees. Load and temperature variations continued throughout the entire 120 hours of the experiment.
Figure 3.13: The time varying torque and temperature profiles
Figures 3.14 and 3.15 show the analysis results of the CDA for the particles with diameter smaller and larger than 100 µm, respectively. In both figures, a) shows the time domain signal of the cumulative sum of the generated wear particles. In b), the CUSUM value for each signal is shown, which determines the points of trend change detection. In c), the CDA qualitative states are shown for each signal, reflecting the signal's trend evolution over time. Figure 3.14a shows the cumulative sum of generated particles smaller than 100 µm in diameter, which reflects the total number of small particles generated during the experimental run. [...]
Figure 3.14: Trend classification of small ferrous particles of diameter <100 µm [...]
[...] t > 30h the CUSUM started to decrease due to a clear stabilization of particle generation which occurred around t = 40h. The CUSUM reached the negative detection threshold $Th_2$ at approx. t = 50h with a continuing stable slope, as indicated by the final classification (Figure 3.14c). The CUSUM was again reset, and the interval t = [29h, 50h] was chosen for the reference to assure a sufficient duration. Soon after, the CUSUM abruptly increased above the positive detection threshold at approx. t = 64h. The increase was due to the increasing rate of generated particles, caused by the maximal load and a possible fault progression, which was successfully indicated by the classification method as a changing increasing state. For 64h < t < 90h the CUSUM repeatedly increased upon being reset and several changing increasing states were found, which shows that the rate of change of the generated particles was increasing within the interval. Only after t = 90h did the rate of generated particles start to stabilize, which caused CUSUM to decrease and the qualitative classification to recognize the stabilizing state at t = 96h. Stabilization was followed by an increase in the rate of generated particles and consequently in the CUSUM value, which caused the classification to result in the non-changing increasing state at t = 105h. The state reflects an increase in the rate of generated particles, but with a slope not significantly different from the established reference slope. If the present slope were significantly larger than the reference, then an increasing change would be found instead, and if significantly smaller, an additional stabilization would be output.
Figure 3.15a shows the cumulative sum of generated particles larger than 100 µm in diameter, which reflects the total number of large particles generated during the experimental run. The signal, which at the end reached a value of approx. 1450 particles, shows slope variations as a direct consequence of the load variations over time, as shown in Figure 3.13. The rate of generated large particles depends highly on the load: as anticipated, the rate is lower for lower load and higher for higher load. During the initial run-in phase, a vast amount of wear particles was generated as a result of smoothing of the tribo-contact surface; the generation rate decreased and stabilized at around t = 40h, when the run-in phase and the surface smoothing finished. Around t = 60h an increase in the slope is visible, which continued, despite load-related variations, throughout the rest of the experiment.
Figure 3.15: Trend classification of large ferrous particles of diameter >100 µm during the pitting experiment
In Figure 3.15b the CUSUM value is shown for the large wear particles. The initial reference and present windows were overlapping in the interval t = [0h, 21h] (table 3.2), which caused CUSUM to remain equal to zero (0) within the initial interval. Soon after, the CUSUM value increased and reached the positive detection threshold $Th_2$ at t = 38h due to the significant difference between the present and reference slopes. At t = 38h the qualitative classification resulted in the stable state, since the slope of the abnormal data used for classification remained below the stable threshold (table 3.2). The CUSUM was reset to zero (0) and a new reference was established within the interval t = [17h, 38h], which extended the too short abnormal data interval t = [21h, 38h] to reach the minimal 21 hour duration (as set initially). For t > 38h the CUSUM started to decrease due to a clear stabilization of large particle generation which started at around t = 40h. The CUSUM reached the negative detection threshold $Th_2$ at approx. t = 50h with a continuing stable slope, as indicated by the final classification (Figure 3.15c). The CUSUM was again reset and the interval t = [29h, 50h] was chosen for the reference. Soon after, the CUSUM abruptly increased above the positive detection threshold at approx. t = 64h, where the increase was caused by an increasing rate of generated particles due to the maximal load and a possible fault progression. At t = 64h, the classification indicated a stable state, since the abnormal slope was below the stable threshold (table 3.2). At t = 72h the CUSUM increased above the detection threshold $Th_2$, and the classification resulted in the changing increasing state, as expected due to a clear increase in the rate of generated particles. After the CUSUM was reset, the qualitative classification returned several further changing increasing states, which shows that the rate of change of the generated particles was increasing until approx. t = 95h, when the stabilization occurred.
The CDA analysis of the wear debris during the gear pitting experiment successfully identified the qualitative state of the analysed signal. The comparison of slopes from the two windows, the fixed reference and the sliding present one, resulted in the estimation of the CUSUM value, which repeatedly and correctly indicated slope changes in the signal. After each detection, indicated by the CUSUM exceeding the detection threshold $Th_2$, a qualitative trend classification was performed in order to abstract the present signal evolution into pre-defined qualitative trend regions, such as stable, stabilizing, increasing, decreasing, etc. The abstracted signal, as a result of the QTA analysis, can be successfully used for anomaly detection, as will be shown in the following chapters together with the results of vibration analysis.
4 Impact detection based on spectral kurtosis
Defective gears and bearings contribute to vibrations by additionally exciting resonant frequencies of the structure. The method proposed below is based on spectral kurtosis (SK) and the optimal denoising (Wiener) filter to distinguish fault induced impacts from background vibration noise (Figure 4.1).
Figure 4.1: Schematic representation of the fault detection procedure based on vibration analysis.
Figure 4.1 shows the schematic representation of the proposed method. The input is the random component of the vibration signal transformed into the angular domain using an angular resampling procedure. The signal may also be high-pass filtered to remove the low-frequency shaft harmonics and other periodic components. The first step consists of segmentation of the vibration signal into rotation segments, where the rotation is either the cage rotation (for an outer race defect) or the relative rotation between shaft and cage (for an inner race defect). For gears the segmentation process is performed for each shaft rotation. Each vibration segment (rotation segment) is independently subjected to the estimation of the SK and to filtering. The filtered signal, called the SK-residual, is used for an estimation of the squared envelope, reflecting the power of impulses, and for alignment of the impulses based on cross-correlation. The process concludes with impact detection using k-means clustering and k-nearest neighbors (kNN) classification.
4.1 Theoretical background of spectral kurtosis (SK)
4.1.1 Definition and properties of SK
Spectral kurtosis (SK) was interpreted in [32] as an adaptive technique used to determine the most suitable frequency band for extraction of the non-stationary component of the signal. It extends the statistical concept of kurtosis to a function of frequency and indicates how the impulsivity in the signal is distributed in the frequency domain. The spectral kurtosis $K(f)$ of the signal $y(t)$ is defined as the fourth-order spectral moment [32]:
$$K(f) = \frac{\langle |Y(t,f)|^4 \rangle}{\langle |Y(t,f)|^2 \rangle^2} - 2 \tag{4.1}$$
where $\langle \cdot \rangle$ is the time average operator and $Y(t,f)$ represents the complex envelope of the signal $y(t)$. $Y(t,f)$ may be estimated by the short time Fourier transform (STFT) by moving a relatively short window along the signal:
$$Y(t,f) = \sum_{n=t}^{t+N_w-1} h(n-t)\, y(n)\, e^{-j 2 \pi f n} \tag{4.2}$$
where $h(t)$ is the window of length $N_w$. The subtraction of 2 in eq. (4.1) sets $K(f) = 0$ in the case that $Y(t,f)$ is complex Gaussian noise. The idea is illustrated in Figure 4.2. The STFT-based SK depends strongly on the window length used for the calculation of the STFT by eq. (4.2). $N_w$ should be smaller than the distance between two impulses and larger than the length of one impulse response [13]. An inadequately short window may produce SK with poor spectral resolution and a reduced level of detail. In addition, the concept of the kurtogram is useful for finding the optimal window length that maximizes the SK.
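A compact STFT-based estimate of eq. (4.1) might look as follows in Python. This is a sketch under assumptions rather than the estimator used in the thesis: the function name `spectral_kurtosis`, the 50 % window overlap and the white-noise test signal are choices made for the example.

```python
import numpy as np

def spectral_kurtosis(y, n_w, hop=None):
    """STFT-based SK: <|Y|^4> / <|Y|^2>^2 - 2, one value per frequency bin."""
    y = np.asarray(y, dtype=float)
    hop = hop or n_w // 2                      # assumed 50 % overlap
    window = np.hamming(n_w)
    starts = range(0, y.size - n_w + 1, hop)
    frames = np.array([y[s:s + n_w] * window for s in starts])
    Y = np.fft.rfft(frames, axis=1)            # Y(t, f): one row per window position
    p2 = np.mean(np.abs(Y) ** 2, axis=0)       # time average of |Y|^2
    p4 = np.mean(np.abs(Y) ** 4, axis=0)       # time average of |Y|^4
    return p4 / p2 ** 2 - 2.0

# Stationary Gaussian noise: SK should stay close to zero
rng = np.random.default_rng(0)
K = spectral_kurtosis(rng.standard_normal(200_000), n_w=64)
```

For this noise-only input the values stay near zero away from the DC and Nyquist bins, where the transform is real-valued and the complex Gaussian assumption behind the subtraction of 2 does not hold.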
Figure 4.2: SK estimation procedure [32].
4.1.2 Kurtogram technique for optimal SK parameters
The kurtogram technique is a useful tool for defining the optimal STFT window length $N_w$ that maximizes the SK [33]. The idea is to present the SK as a function of two independent variables, $K(f, N_w)$, where $f$ is the frequency and $N_w$ the STFT window length. Hence, a two-dimensional representation called the kurtogram is obtained as a series of SKs estimated using a range of window lengths $N_w$. Figure 4.3 shows the kurtogram of a defective bearing. It indicates the optimal resolution, where the resolution is estimated from the window length $N_w$ as $res = \frac{2F_s}{N_w}$, that maximized SK at $res = 2.5$ kHz when $N_w = 32$ samples at $F_s = 40$ kHz.
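The search over window lengths can be sketched as a loop that records the maximum SK for each $N_w$. The code below is illustrative only: the helper names `sk` and `kurtogram`, the 50 % overlap and the synthetic signal (weak noise with a short 10 kHz burst every 400 samples) are assumptions, and the resolution follows the relation $res = 2F_s/N_w$ quoted above.

```python
import numpy as np

def sk(y, n_w):
    """STFT-based SK with a Hamming window and 50 % overlap (assumed settings)."""
    frames = np.lib.stride_tricks.sliding_window_view(y, n_w)[:: n_w // 2]
    Y = np.fft.rfft(frames * np.hamming(n_w), axis=1)
    p2 = np.mean(np.abs(Y) ** 2, axis=0)
    p4 = np.mean(np.abs(Y) ** 4, axis=0)
    return p4 / p2 ** 2 - 2.0

def kurtogram(y, fs, window_lengths):
    """Rows of (N_w, resolution 2*fs/N_w, maximum SK over the interior frequency bins)."""
    return [(n_w, 2.0 * fs / n_w, float(sk(y, n_w)[1:-1].max()))
            for n_w in window_lengths]

# Synthetic signal: weak Gaussian noise plus a decaying 10 kHz burst every 400 samples
fs = 40_000
rng = np.random.default_rng(1)
y = 0.05 * rng.standard_normal(40_000)
n = np.arange(24)
burst = np.exp(-n / 6.0) * np.sin(2 * np.pi * 10_000 * n / fs)
for start in range(100, y.size - burst.size, 400):
    y[start:start + burst.size] += burst

rows = kurtogram(y, fs, [16, 32, 64, 128])
best = max(rows, key=lambda r: r[2])  # window length giving the largest SK
```

The row for $N_w = 32$ carries the resolution $2 \cdot 40\,\text{kHz} / 32 = 2.5$ kHz, the same arithmetic as in the example of Figure 4.3.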
Figure 4.3: The kurtogram technique to define the optimal SK parameters
4.1.3 SK and the optimal denoising (Wiener) filtering
SK K(f ) can be used to design detection filters due to the fact that it takes large values at the frequencies where impulsivity is dominant and low values where there is Gaussian noise only [13]. For illustration see Figure 4.4.
Figure 4.4: The principle of the Wiener filter [31].
It was shown in [31] that the SK of the sum $y(t) = x(t) + n(t)$, where $x(t)$ is the non-stationary component and $n(t)$ is the stationary Gaussian noise, can be related to the SK of the non-stationary part $x(t)$ by
$$K_y(f) = \frac{K_x(f)}{[1 + \rho(f)]^2} \tag{4.3}$$
where $K_y(f)$ is the spectral kurtosis of the sum $y(t)$, $K_x(f)$ is the spectral kurtosis of the non-stationary component $x(t)$, and $\rho(f) = \frac{S_n(f)}{S_x(f)}$ the ratio of the power spectral densities of $n(t)$ and $x(t)$, reflecting the noise-to-signal ratio with respect to the frequency. The signal to noise ratio will be high (i.e. $\rho(f) \approx 0$) within the resonance bandwidth and low outside. In other words, $K_y(f) \approx K_x(f)$ within the same bandwidth and $K_y(f) \approx 0$ otherwise [31]. The Wiener filter is a linear denoising filter $w(t)$ that can be used to recover the non-stationary component $x(t)$ from the background noise $n(t)$ (Figure 4.4) [31]. The filter transfer function $W(f)$ can be represented as
$$W(f) = \frac{X(f)}{Y(f)} = \frac{1}{1 + \rho(f)} \approx k \cdot \sqrt{K_y(f)} \tag{4.4}$$
where $X(f)$ is the Fourier transform of the non-stationary component $x(t)$, $Y(f)$ is the Fourier transform of the sum $y(t)$, and $\rho(f) = \frac{S_n(f)}{S_x(f)}$ the ratio of the power spectral densities of $n(t)$ and $x(t)$. The filter is proportional to the square root of the SK, which offers a possibility for SK to identify the optimal filter $W(f)$ for extraction of transients from the background noise. Prior to the filtering, the SK should be compared with the significance threshold $s_\alpha$ that indicates the values significantly greater than zero
$$W(f) = \begin{cases} \sqrt{K_y(f)} & \text{for } K_y(f) > s_\alpha \\ 0 & \text{otherwise} \end{cases} \tag{4.5}$$
In [31,32] a statistical significance threshold $s_\alpha$ was calculated on account of the properties of the Gaussian noise. Assuming that the signal is only Gaussian stationary noise, the STFT-based SK has a normal distribution with zero mean and variance $\frac{4}{N}$, where $N$ is the number of averages [31]. The statistical threshold $s_\alpha$ with $\alpha$ level of significance is therefore given as
$$s_\alpha = u_{1-\alpha} \frac{2}{\sqrt{N}} \tag{4.6}$$
where $u_{1-\alpha}$ is the percentile of the normal distribution at $1 - \alpha$, which means that all values below this level will have probability $1 - \alpha$ of not being transients.
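Eqs. (4.5) and (4.6) translate directly into a thresholded filter gain. The following Python sketch uses the standard-library `statistics.NormalDist` for the percentile $u_{1-\alpha}$; the function name `wiener_from_sk` and the sample SK values are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def wiener_from_sk(K, n_avg, alpha=0.01):
    """Filter gain W(f) = sqrt(K(f)) where K(f) exceeds s_alpha, zero elsewhere."""
    K = np.asarray(K, dtype=float)
    # s_alpha = u_{1-alpha} * 2 / sqrt(N), eq. (4.6)
    s_alpha = NormalDist().inv_cdf(1.0 - alpha) * 2.0 / np.sqrt(n_avg)
    # Clip before the square root so that negative SK estimates do not produce NaN
    W = np.where(K > s_alpha, np.sqrt(np.clip(K, 0.0, None)), 0.0)
    return W, s_alpha

# Hypothetical SK values for three frequency bins, estimated from N = 400 averages
W, s_alpha = wiener_from_sk([-0.1, 0.05, 4.0], n_avg=400, alpha=0.01)
```

With $N = 400$ and $\alpha = 0.01$ the threshold is about 0.23, so only the third bin passes and receives the gain $\sqrt{4} = 2$.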
4.2 Spectral kurtosis (SK) of vibration signal
4.2.1 Angular resampling
Vibration signals are usually acquired in constant time intervals. When the system is operating under fluctuating speed, this can result in inability to accurately track shaft instantaneous speed. Consequently spectral smearing of shaft harmonic frequencies occurs. Sampling of vibration signal at constant angular intervals by using optical encoders can alleviate this problem. This is done by interpolation of the space between successive tachometer pulses to obtain the resampled sampling intervals. Vibration signal is then resampled to the newly established sampling intervals.
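The resampling step can be sketched with two linear interpolations. The details below are assumptions for illustration: the function name `angular_resample` is hypothetical, one tachometer pulse per shaft revolution is assumed, and the shaft angle is interpolated linearly between pulses, which corresponds to constant speed within each revolution.

```python
import numpy as np

def angular_resample(t, y, tacho_times, samples_per_rev):
    """Resample y(t) at constant shaft-angle increments."""
    tacho_times = np.asarray(tacho_times, dtype=float)
    rev_at_pulse = np.arange(tacho_times.size, dtype=float)  # angle in revolutions at each pulse
    n_out = (tacho_times.size - 1) * samples_per_rev
    rev_grid = np.arange(n_out) / samples_per_rev            # uniform angle grid
    t_grid = np.interp(rev_grid, rev_at_pulse, tacho_times)  # time instant of each angle sample
    return rev_grid, np.interp(t_grid, t, y)                 # signal value at those instants

# Hypothetical run with varying speed: revolutions last 1 s, 0.5 s and 1 s
tacho = np.array([0.0, 1.0, 1.5, 2.5])
t = np.linspace(0.0, 3.0, 3001)
theta = np.interp(t, tacho, np.arange(4.0))  # shaft angle in revolutions
y = np.sin(2 * np.pi * 3 * theta)            # third shaft order
rev, y_ang = angular_resample(t, y, tacho, samples_per_rev=16)
```

In the angular domain the third-order component becomes a pure sinusoid with three cycles per revolution, regardless of the speed variation in the time domain.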
4.2.2 Segmentation of the vibration signal
The first step in the signal processing procedure is segmentation of the vibration signal $y(k) \in \mathbb{R}^N$ into partial signals (segments) $y_1(k) \in \mathbb{R}^{N_1}$, $y_2(k) \in \mathbb{R}^{N_2}$, ..., $y_n(k) \in \mathbb{R}^{N_n}$, ... where $N = N_1 + N_2 + \dots + N_n$.
Figure 4.5: Angular resampling process.
Generally, the length of a segment $N_p$ may be arbitrary, but we propose selecting the segment length so as to enable alignment of fault induced impacts and their detection. This can be achieved by choosing $N_p$ according to the fault characteristic frequencies, which can be described as:
$$N_p = q \cdot \frac{F_s}{\text{fault related frequency}} \tag{4.7}$$
where $F_s$ is the sampling frequency and $q$ the proportionality parameter. This means that $N_p$ should be related to the ball-pass frequency $BPFO$ (or cage frequency $FTF$) for bearing faults, and to the gearmesh frequency $F_{GMF}$ (or shaft frequency $F_{sh}$) for gear faults (Figure 4.6). For a bearing outer race defect, following eq. (4.7) and using $BPFO$ as the fault frequency, the segment length is equal to $N_p = q \cdot \frac{F_s}{BPFO}$. Choosing $q$ equal to the number of rolling elements $N_{re}$, the length corresponds to
$$N_p = N_{re} \cdot \frac{F_s}{BPFO} = \frac{F_s}{FTF} = \frac{2F_s}{F_{sh}\left(1 - \frac{d_{re}}{d_{ftf}}\right)} \tag{4.8}$$
when all $N_{re}$ rolling elements have passed the defective area on the outer race once (Figure 4.6-left). As can be seen from eq. (4.8), for an outer race defect the length of a segment corresponds to a single rotation of the cage ($\frac{1}{FTF}$).
Figure 4.6: Segmentation of vibration signal to short segments of a single rotation. A segment can be related to bearing and gear characteristic frequencies.
Similarly, for a bearing inner race defect, using BPFI as the fault frequency and choosing $q$ equal to the number of rolling elements $N_{re}$, the segment length is equal to
$$N_p = N_{re} \cdot \frac{F_s}{\mathrm{BPFI}} = \frac{F_s}{F_{sh} - \mathrm{FTF}} = \frac{2 F_s}{F_{sh}\left(1 + \frac{d_{re}}{d_{ftf}}\right)} \qquad (4.9)$$
As can be seen from eq. (4.9), for an inner race defect the length of a segment corresponds to one relative rotation between the shaft and the cage ($1/(F_{sh} - \mathrm{FTF})$), during which all $N_{re}$ rolling elements pass the defective area once (Figure 4.6, middle). For gear defects, choosing the gearmesh frequency $F_{GMF}$ as the fault characteristic frequency and following eq. (4.7), the segment length is
$$N_p = N_t \cdot \frac{F_s}{F_{GMF}} = \frac{F_s}{F_{sh}} \qquad (4.10)$$
which is one rotation of the shaft, when the number of teeth $q = N_t$ is chosen as the proportionality factor.
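The segment-length rules of eqs. (4.7), (4.9) and (4.10) can be checked numerically with a short sketch. The bearing values below are the ones reported later in section 4.3.1. The tooth count `n_t = 20` is a purely hypothetical value for illustration, and BPFI is taken as $N_{re}(F_{sh} - \mathrm{FTF})$, as implied by eq. (4.9).

```python
def segment_length(fs, fault_freq, q):
    """Eq. (4.7): samples per segment, N_p = q * F_s / fault-related frequency."""
    return q * fs / fault_freq

fs, f_sh, n_re = 40_000.0, 58.0, 8       # sampling rate [Hz], shaft speed [Hz], balls
ftf = 0.382 * f_sh                       # cage frequency [Hz]
bpfi = n_re * (f_sh - ftf)               # inner race frequency implied by eq. (4.9)

# Inner race defect: q = N_re gives one relative shaft/cage rotation per segment.
np_inner = segment_length(fs, bpfi, q=n_re)

# Gear defect: q = N_t gives one shaft rotation per segment (n_t is hypothetical).
n_t = 20
np_gear = segment_length(fs, n_t * f_sh, q=n_t)
```

Rounded to whole samples, `np_inner` reproduces the 1116 samples of eq. (4.26), and `np_gear` equals $F_s/F_{sh}$ regardless of the assumed tooth count.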
4.2.3
Estimation of the spectral kurtosis (SK)
Each segment $y_n(k)$, where $n$ is the segment index, is subjected to the estimation of SK using equations (4.2) and (4.1), which results in $K_n(f)$. The Hamming window $h(k)$ is used for the STFT, where the main parameter is the window length $N_w$. According to [13], the window should be shorter than the time between two impacts and longer than the duration of an impulse response. The kurtogram technique (section 4.1.2) offers optimal selection of $N_w$ while respecting these limits.
Figure 4.7: Estimation of spectral kurtosis.
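A compact way to see how the SK separates stationary noise from impulsive content is the sketch below. It assumes the usual STFT-based estimator, the fourth-order moment of the STFT magnitude divided by the squared second-order moment, minus 2, which is consistent with a zero-mean SK under Gaussian noise but may differ in detail from eqs. (4.1) and (4.2). The function names and the synthetic test signal are illustrative assumptions.

```python
import cmath
import math
import random

def stft(x, nw, hop):
    """Hamming-windowed short-time DFT; one one-sided spectrum per frame."""
    win = [0.54 - 0.46 * math.cos(2 * math.pi * i / (nw - 1)) for i in range(nw)]
    frames = []
    for start in range(0, len(x) - nw + 1, hop):
        seg = [x[start + i] * win[i] for i in range(nw)]
        frames.append([sum(seg[i] * cmath.exp(-2j * math.pi * f * i / nw)
                           for i in range(nw)) for f in range(nw // 2 + 1)])
    return frames

def spectral_kurtosis(x, nw, hop):
    """SK per frequency bin: <|X|^4> / <|X|^2>^2 - 2 (assumed estimator)."""
    frames = stft(x, nw, hop)
    sk = []
    for f in range(nw // 2 + 1):
        p2 = [abs(fr[f]) ** 2 for fr in frames]
        m2 = sum(p2) / len(p2)
        m4 = sum(p * p for p in p2) / len(p2)
        sk.append(m4 / (m2 * m2) - 2.0)
    return sk

# Synthetic demo: Gaussian noise, and the same noise with periodic impacts added.
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(4000)]
impulsive = list(noise)
for k in range(8, 4000, 160):
    impulsive[k] += 50.0
sk_noise = spectral_kurtosis(noise, nw=16, hop=16)
sk_impulsive = spectral_kurtosis(impulsive, nw=16, hop=16)
```

For the noise-only signal the interior bins stay close to zero, while the impulsive signal produces clearly positive SK values. The first and last bins are real-valued and therefore behave differently from the interior bins.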
4.2.4
Optimal denoising (Wiener) filtering
$K_n(f)$ is used to define the parameters of the Wiener filter $\hat{W}(f)$ using eq. (4.5). Prior to the filtering, the SK should be compared to the significance threshold $s$, which indicates the values that are significantly greater than zero and can be assigned to transients. The idea proposed in [31, 32] is to use the constant statistical significance threshold $s_\alpha$ to filter out all the Gaussian noise components with $1 - \alpha$ probability of not being produced by transients. This way, the filter will produce near-zero SK-residuals in the fault-free case, when mostly stationary noise is present, and non-zero SK-residuals in the faulty case.
Figure 4.8: SK based Wiener filter applied to segments.
Filtering is performed in the frequency domain as follows
$$R_n(f) = \hat{W}(f) \cdot Y_n(f) \qquad (4.11)$$
where $Y_n(f)$ is the Fourier transform of $y_n(k)$, and $R_n(f)$ is the Fourier transform of the SK-residual signal $r_n(k)$, which contains the non-stationary component of $y_n(k)$.
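The frequency-domain filtering of eq. (4.11) can be sketched as below. Two points are assumptions of this sketch rather than statements of eq. (4.5): the gain is taken as the square root of the SK where the SK exceeds the threshold and zero elsewhere (one common choice in the SK literature), and the gain is assumed to be already available on the frequency grid of the segment. A plain DFT is used to keep the example dependency-free.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n))
            for f in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[f] * cmath.exp(2j * math.pi * f * k / n) for f in range(n)) / n
            for k in range(n)]

def sk_gain(sk, threshold):
    """Hypothetical gain: sqrt(SK) where SK exceeds the threshold, else 0."""
    return [math.sqrt(v) if v > threshold else 0.0 for v in sk]

def apply_filter(y, gain_half):
    """Eq. (4.11): R(f) = W(f) * Y(f); gain_half covers bins 0..n//2."""
    n = len(y)
    spectrum = dft(y)
    # Mirror the one-sided gain so that the residual stays real-valued.
    gain = [gain_half[min(f, n - f)] for f in range(n)]
    return [v.real for v in idft([w * c for w, c in zip(gain, spectrum)])]

# Demo: keep only the component in bin 5 of a two-tone signal.
n = 32
y = [math.cos(2 * math.pi * 2 * k / n) + math.cos(2 * math.pi * 5 * k / n)
     for k in range(n)]
gain_half = [0.0] * (n // 2 + 1)
gain_half[5] = 1.0
residual = apply_filter(y, gain_half)
```

In the demo the residual contains only the tone in the retained band, which mirrors how the SK-based filter passes the bands excited by impacts and suppresses the rest.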
4.2.5
Hilbert transform and the SK-residual envelope
If we consider the SK-residual $r_n(k)$ as an amplitude modulated signal
$$r_n(k) = a_n(k)\cos(2\pi f_0 k + \varphi) \qquad (4.12)$$
the amplitude modulation signal $a_n(k)$ (the envelope) reflects the degree of non-stationarity of the fault-induced transients (Figure 4.9).
Figure 4.9: SK-residual envelope extraction using Hilbert transform.
$a_n(k)$ can be extracted from the SK-residual $r_n(k)$ using the Hilbert transform [66]
$$a_n(k)\,e^{j\Phi(k)} = r_n(k) + j \cdot \mathcal{H}\{r_n(k)\} \qquad (4.13)$$
where $\Phi(k) = 2\pi f_0 k + \varphi$ is the instantaneous phase, $f_0$ is the carrier frequency and $\varphi$ an initial phase. The Hilbert transform $\mathcal{H}\{\cdot\}$ is defined as
$$\mathcal{H}\{r_n\}(k) = \int_{-\infty}^{\infty} \frac{r_n(k - \tau)}{\pi\tau}\,d\tau \qquad (4.14)$$
After extraction of the envelope signal $a_n(k)$ for the $n$th segment, the squared envelope of the SK-residual $a_n^2(k)$ is used for physical interpretation of the power of fault-induced impacts, and is subsequently used for fault detection and diagnosis purposes.
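For sampled data the envelope is usually obtained from the discrete analytic signal rather than from the integral form. The sketch below builds the analytic signal by removing the negative-frequency DFT bins and squares its magnitude. The function names and the synthetic amplitude-modulated test signal are illustrative assumptions.

```python
import cmath
import math

def analytic_signal(x):
    """Discrete analytic signal: double positive-frequency bins, zero negative ones."""
    n = len(x)
    spectrum = [sum(x[k] * cmath.exp(-2j * math.pi * f * k / n) for k in range(n))
                for f in range(n)]
    for f in range(n):
        if 0 < f < (n + 1) // 2:
            spectrum[f] *= 2.0       # positive frequencies are doubled
        elif f > n // 2:
            spectrum[f] = 0.0        # negative frequencies are removed
    return [sum(spectrum[f] * cmath.exp(2j * math.pi * f * k / n) for f in range(n)) / n
            for k in range(n)]

def squared_envelope(x):
    """Squared magnitude of the analytic signal, i.e. a^2(k)."""
    return [abs(z) ** 2 for z in analytic_signal(x)]

# Demo: a carrier in bin 8 modulated by a slow envelope a(k).
n = 64
a = [1.0 + 0.5 * math.cos(2 * math.pi * k / n) for k in range(n)]
r = [a[k] * math.cos(2 * math.pi * 8 * k / n) for k in range(n)]
env2 = squared_envelope(r)
```

Because the modulation in the demo is much slower than the carrier, the recovered squared envelope matches $a^2(k)$ sample by sample.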
4.2.6
Sensor fusion and averaging SK-residual squared envelope
In the case of an inner race defect in a bearing, strong amplitude modulation of the vibration signal can occur [13]. Amplitude modulation (AM) is caused by the variable transfer function between the rotating damaged spot (the source of the impulse excitation) and the sensor. Additionally, shaft unbalance or misalignment contributes to variations in the forces transferred to the bearing, which cause the location of the clearance between the rolling elements and the races to vary. When the faulty area is closer to the sensor and the shaft force acts in the sensor direction, impacts are more pronounced and have higher amplitude. Conversely, when the damaged spot is further away from the sensor, the impacts are usually attenuated. Amplitude modulation may cause problems in the process of impact detection, especially when impacts are severely attenuated. Classification of low-amplitude impacts may result in a large number of missed detections, which consequently influences the final fault probability estimation.
Figure 4.10: SK-residual fusion to remove amplitude modulation caused by bearing inner race defect.
AM can to some extent be mitigated by averaging successive segments, where segments from a single sensor or from multiple sensors can be used for averaging. When fusing features from two sensors positioned at different angles, each sensor provides a similar vibration pattern, but with a shifted modulating signal $a_{i,S1}(k) = a_{i,S2}(k - \varphi)$, where $\varphi$ is the angle between the two sensors. Averaging of the SK-residual squared envelope is performed using eq. (4.15)
$$\bar{a}_n^2(k) = \frac{1}{N_{avg}} \sum_{i=1}^{N_{avg}} a_i^2(k) \qquad (4.15)$$
where $N_{avg}$ is the number of segments used for averaging and $i$ runs over the segments in the averaging window.
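The averaging of eq. (4.15) can be sketched as a moving average over successive squared-envelope segments. Whether the window trails, leads or is centred on the current segment is not fixed by the equation, so the forward-looking window below is an assumption, as is the function name.

```python
def average_envelopes(sq_env, n_avg):
    """Moving average of squared-envelope segments over n_avg successive segments."""
    averaged = []
    for n in range(len(sq_env) - n_avg + 1):
        window = sq_env[n:n + n_avg]
        # Average sample by sample (same angular position k) across the window.
        averaged.append([sum(column) / n_avg for column in zip(*window)])
    return averaged

# Demo: two impact positions that are attenuated in alternating segments.
segments = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
smoothed = average_envelopes(segments, n_avg=2)
```

In the demo each individual segment shows only one of the two impacts, whereas every averaged segment shows both at half amplitude. This illustrates the trade-off described above: the modulation is reduced at the cost of impact amplitude.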
4.2.7
Alignment of impacts using cross-correlation
The process of alignment follows the idea of tracking impacts produced by the same rolling element, by keeping each impact at a constant angular position. Such an approach enables grouping of impacts from the same rolling element for the purpose of reliable detection and diagnosis. Ideally, when the length of a segment is exact and no random slippage is present, this step could be avoided. However, when the segment length $N_p$ is estimated solely from bearing and gear characteristic frequencies, an error can occur due to deviations of the actual frequencies from the theoretically defined ones. In addition, the random slip phenomenon, known to occur in a bearing during machine operation, introduces an additional random error from segment to segment (Figure 4.11, top left).
Figure 4.11: Impact alignment using cross correlation.
As shown in Figure 4.11, the method estimates the error and random slippage using the cumulative cross-correlation between successive segments. It is proposed that the averaged squared envelope $\bar{a}_n^2(k)$ is used for estimation of the cross-correlation. Alternatively, the squared envelope $a_n^2(k)$ may be used if amplitude modulation is weak. The cross-correlation between successive segments $(\bar{a}_{n-1}^2 \star \bar{a}_n^2)(\tau)$, where $\tau$ is the offset between segments, is high when the impacts within both segments coincide and small when they are apart. Therefore, the offset $\tau_n$ that maximizes the cross-correlation is equal to the error between the two segments. The total error and slippage for the $n$th segment, $E(n)$, is equal to the cumulative sum of errors, which can be summarized by
$$E(n) = \sum_{i=1}^{n} \tau_i \qquad (4.16)$$
where $\tau_i$ is the offset maximizing the cross-correlation between segments $i-1$ and $i$. The zeroth segment is set equal to the binary mask $m(k)$, i.e. $\bar{a}_0^2(k) = m(k)$. The purpose of the mask is to define the angular positions for each of the $N_{re}$ rolling elements. Therefore, the mask $m(k)$ is defined as
$$m(k) = \begin{cases} 1 & \text{for } k = \frac{N_p}{N_{re}}\left(n - \frac{1}{2}\right),\ n \in \mathbb{N} \\ 0 & \text{otherwise} \end{cases} \qquad (4.17)$$
The mask is used as the final reference for the alignment process, so that impacts appear at the angular position of the ball that produced them. The cross-correlation ($\star$) is estimated using
$$(\bar{a}_{n-1}^2 \star \bar{a}_n^2)(\tau) = \sum_{k=-\frac{N_p}{2N_{re}}}^{\frac{N_p}{2N_{re}}} \bar{a}_{n-1}^2(k)\,\bar{a}_n^2(k + \tau) \qquad (4.18)$$
which is also defined for $n \ge 1$, where the mask $m(k)$ is taken as the zeroth reference. The interval used for cross-correlation estimation should not exceed the angle between two impacts, $N_p / N_{re}$, in order to ensure consistent alignment with the correct impact produced by the same ball. The compensation of the error and slippage is based on translation of the segments according to $E(n)$, which can be defined as $\hat{\bar{a}}_n^2(k) = \bar{a}_n^2(k + E(n))$. Using translation alone introduces a problem with the domain on which the aligned segments $\hat{\bar{a}}_n^2(k)$ exist. If the domain before alignment was $k \in [0, N_p - 1]$ for all segments, the domain after alignment is $k_n \in [-E(n), N_p - E(n) - 1]$ and differs from segment to segment. In order to equalize the domain of all segments after alignment to $k \in [0, N_p - 1]$, the compensation proceeds as follows. Depending on the sign of $E(n)$, intervals are established that define the parts inside and outside of the desired interval $k \in [0, N_p - 1]$. For a negative error $E(n) < 0$, two intervals are defined as $0 \le k < -E(n)$ and $-E(n) \le k < N_p$, and the alignment process is described as
$$\hat{\bar{a}}_n^2(k) = \begin{cases} \bar{a}_n^2(k + E(n) + N_p) & \text{for } 0 \le k < -E(n) \\ \bar{a}_n^2(k + E(n)) & \text{for } -E(n) \le k < N_p \end{cases} \qquad (4.19)$$
Similarly, for a positive error $E(n) > 0$ the alignment considers two intervals $0 \le k \le N_p - E(n) - 1$ and $N_p - E(n) - 1 < k < N_p$ as
$$\hat{\bar{a}}_n^2(k) = \begin{cases} \bar{a}_n^2(k + E(n)) & \text{for } 0 \le k \le N_p - E(n) - 1 \\ \bar{a}_n^2(k + E(n) - N_p) & \text{for } N_p - E(n) - 1 < k < N_p \end{cases} \qquad (4.20)$$
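The mask, the offset search and the wrap-around translation of eqs. (4.16) to (4.20) can be sketched together. This is a simplified illustration: the correlation is summed circularly over the whole segment and only the lag is restricted, whereas eq. (4.18) restricts the summation interval. All function names are hypothetical.

```python
def make_mask(n_p, n_re):
    """Eq. (4.17): ones at k = N_p / N_re * (n - 1/2), n = 1..N_re."""
    mask = [0] * n_p
    for n in range(1, n_re + 1):
        mask[round(n_p / n_re * (n - 0.5))] = 1
    return mask

def best_offset(ref, seg, max_lag):
    """Lag tau in [-max_lag, max_lag] maximizing sum_k ref[k] * seg[(k + tau) mod N_p]."""
    n = len(ref)
    def xcorr(tau):
        return sum(ref[k] * seg[(k + tau) % n] for k in range(n))
    return max(range(-max_lag, max_lag + 1), key=xcorr)

def cumulative_errors(segments, mask, max_lag):
    """Eq. (4.16): running sum of the offsets between successive segments."""
    errors, total, previous = [], 0, mask      # the mask acts as the zeroth segment
    for seg in segments:
        total += best_offset(previous, seg, max_lag)
        errors.append(total)
        previous = seg
    return errors

def align(seg, error):
    """Eqs. (4.19)-(4.20) as one circular shift: aligned[k] = seg[(k + E) mod N_p]."""
    n = len(seg)
    return [seg[(k + error) % n] for k in range(n)]

# Demo: two segments whose impacts drift by one sample per segment.
mask = make_mask(8, 2)
seg1 = [mask[(k - 1) % 8] for k in range(8)]
seg2 = [mask[(k - 2) % 8] for k in range(8)]
errors = cumulative_errors([seg1, seg2], mask, max_lag=1)
aligned = [align(s, e) for s, e in zip([seg1, seg2], errors)]
```

The modulo operation covers both sign cases of $E(n)$ at once, so the aligned segments always live on $k \in [0, N_p - 1]$. In the demo the drift accumulates to one and two samples, and both segments coincide with the mask after alignment.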
4.2.8
Impact detection by means of k-means and k-NN clustering
The impact detection procedure includes two main phases. The first, training phase performs clustering of the fault-free features, thus creating $N_{cl}$ clusters. The second, testing phase classifies the features into fault-free or faulty states using the k-nearest neighbours (kNN) method (Figure 4.12).
Figure 4.12: Illustration of impact detection procedure.
The features are extracted as the values of the SK-residual power $\hat{\bar{a}}_n^2(k)$ that correspond to the maximum value produced by impacts, as
$$A(k) = \begin{cases} \hat{\bar{a}}_n^2(k) & \text{for } k \text{ where } m(k) = \text{True} \\ 0 & \text{otherwise} \end{cases} \qquad (4.21)$$
where $m(k)$ is a pre-established binary mask (eq. (4.17)), which indicates the angles of impulses by the value True. The approach creates $M$ one-dimensional clusters of the fault-free training data $A_t(k)$, using the k-means method [67]. After the clusters are created, the calculation of the novelty scores $NS$ for a testing data sample $A(k)$ is summarized in the following steps [68]:
1. First, the nearest neighbour (NN) distances are calculated for each training data sample in each of the $M$ clusters.
2. Then, the $k$ maximal NN distances are used for each cluster, and their mean value $D_1$ is calculated.
3. For each testing data sample, Euclidean distances to the training data samples in each cluster are calculated. The averaged kNN distance $D_2$ between each cluster and the testing data sample is then calculated as the mean value of the $k$ minimal distances.
4. The novelty scores $NS$ of the testing data sample are obtained for all clusters from the ratios between the distances $D_2$ and $D_1$: $NS = \sqrt{D_2 / D_1}$.
Classification proceeds by comparing the novelty scores (NS) to the threshold $Th_{NS}$, which defines a relative threshold between the fault-free class and the fault class. If $NS < Th_{NS}$, the decision is impact not detected, and if $NS \ge Th_{NS}$ the impact is detected. The impact detection procedure thus provides a decision per testing sample of a segment, which results in a vector of decisions $I_1(k)$, defined as
$$I_1(k) = \begin{cases} 0 & \text{for } NS(k) < Th_{NS} \\ 1 & \text{for } NS(k) \ge Th_{NS} \end{cases} \qquad (4.22)$$
4.3
Experimental validation of the proposed impact detection method
Validation of the proposed approach included processing of vibration data acquired from fault-free and faulty bearings and gears during independent experimental runs. Bearing vibration data were acquired from three rolling element bearings: one with an artificially created inner race defect, one with an artificially created outer race defect, and one with no defect. The bearing experiment was conducted under stationary conditions using constant speed and load. Gear vibration data, on the other hand, were acquired during a fatigue experiment under non-stationary conditions using constant speed and time-varying load. The gear experiment ran uninterruptedly for several days. The initially new pair of gears slowly deteriorated while the tooth surface was subjected to a relatively large and time-varying torque, a regime used to maximize fatigue progression. From the gear vibration data, two realizations will be selected, one from fault-free and one from faulty conditions, under equal load. The results show the performance of the proposed impact detection algorithm, which includes spectral kurtosis based extraction of the diagnostic feature and classification of features into fault-free and faulty regions.
Sensitivity, specificity and accuracy are the main parameters that reflect the performance of the proposed impact detection approach. Sensitivity, also known as the true positive rate (TPR), reflects the proportion of correctly identified positive detections in the faulty case. Specificity, also called the true negative rate (TNR), measures the proportion of correctly identified negative detections in the fault-free case. Accuracy reflects the overall rate of correctly identified fault-free and faulty samples, and represents a measure of the total probability of correct diagnosis. The performance parameters are based on four measures:
• TP (true positive) - the number of correctly identified detections in the faulty case,
• TN (true negative) - the number of correctly identified non-detections in the fault-free case,
• FP (false positive) - the number of falsely identified detections in the fault-free case,
• FN (false negative) - the number of falsely identified non-detections in the faulty case.
Based on these measures, the sensitivity (TPR) is the ratio between the number of correct detections in the faulty case (TP) and the total number of samples in the faulty case (TP+FN)
$$TPR = \frac{TP}{TP + FN} \qquad (4.23)$$
specificity (TNR) is the ratio between the number of correct non-detections in the fault-free case (TN) and the total number of samples in the fault-free case (TN+FP)
$$TNR = \frac{TN}{TN + FP} \qquad (4.24)$$
and accuracy (ACC) is the ratio between the number of correct decisions in the faulty and fault-free cases (TP+TN) and the total number of samples from both cases
$$ACC = \frac{TP + TN}{TP + TN + FP + FN} \qquad (4.25)$$
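Eqs. (4.23) to (4.25) translate directly into code. The sketch below uses the counts reported for the inner race experiment in section 4.3.1.1 as input. The function name `detection_metrics` is an illustrative assumption.

```python
def detection_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy, eqs. (4.23)-(4.25)."""
    tpr = tp / (tp + fn)
    tnr = tn / (tn + fp)
    acc = (tp + tn) / (tp + tn + fp + fn)
    return tpr, tnr, acc

# Counts reported for the inner race defect and the fault-free reference.
tpr, tnr, acc = detection_metrics(tp=2554, tn=2753, fp=7, fn=206)
```

Rounded to one decimal place, these counts reproduce the reported sensitivity of 92.5% and specificity of 99.7%.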
4.3.1
Experimental results for bearing faults
The experimental rig (Figure 4.13) consists of a variable speed drive (VSD) with adjustable eccentricity of the coupling and adjustable angular as well as parallel misalignment. Two of the three bearings were fixed (left and middle) and one was used as the testing bearing (right). The load on the bearing was applied through misalignment, adjusted by inserting shims under the bearing housing. Without any shims the load was at its maximum of 192 N, and it decreased by 64 N for every 0.2 mm shim.
Figure 4.13: SKF bearing testing rig.
Vibration signals were recorded by two sensors positioned at an angle of 45 degrees on the testing bearing housing (Figure 4.13). Additionally, a tachometer was used as an independent measurement of the shaft rotation. The testing bearings, damaged and undamaged, had the ball-pass frequency outer race $\mathrm{BPFO} = 3.05 F_{sh}$, ball-pass frequency inner race $\mathrm{BPFI} = 4.95 F_{sh}$, cage frequency $\mathrm{FTF} = 0.382 F_{sh}$, and number of rolling elements (balls) $N_{re} = 8$. The sampling frequency was $F_s = 40$ kHz, so that the signals contain the high-frequency harmonics of the structure excited by fault-induced impacts. Signals of duration $t = 10$ s were recorded independently for the outer race defect, the inner race defect and the fault-free case. Bearings were tested at shaft speed $F_{sh} = 58$ Hz and half of the maximum load, $T_1 = 96$ N.
4.3.1.1
Bearing inner race defect
The inner race damage was created artificially, with a damaged area of 0.8 mm circumferential length and 0.2 mm depth (Figure 4.14). This can be considered early-stage damage, since it represents 1.17% of the inner race circumference of 68 mm.
Figure 4.14: Inner race defect.
The results below show the application of the proposed method to vibration signals acquired under fault-free and defective inner race conditions. Prior to the analysis, the acquired vibration signals were resampled into the angular domain using the tachometer measurements of shaft rotation.
Segmentation of the vibration signal to segments of a single rotation
After angular resampling, eq. (4.9) was used to calculate the number of samples of a segment, where BPFI was taken as the fault characteristic frequency and $N_{re}$ as the proportionality parameter:
$$N_p = \frac{F_s}{F_{sh} - \mathrm{FTF}} = \frac{40000\,\mathrm{Hz}}{58\,\mathrm{Hz} - 22.156\,\mathrm{Hz}} = \frac{40000\,\mathrm{Hz}}{35.844\,\mathrm{Hz}} \approx 1116 \text{ samples} \qquad (4.26)$$
The estimated inner race defect segment contains exactly $N_{re}$ impacts in the defective case and corresponds to $360^\circ \cdot F_{sh} / (F_{sh} - \mathrm{FTF}) \approx 582^\circ$ of shaft rotation in the angular domain.
Figure 4.15 shows vibration segments in the fault-free (4.15a) and defective (4.15b) states. The x-axis corresponds to the shaft angle, the y-axis to the segment number, and the color (z-axis) to the acceleration magnitude. In the fault-free case (Figure 4.15a), when mostly noise is present, the magnitude of the vibration segments does not exceed 1g, where $g = 9.81\ \mathrm{m/s^2}$ is the gravitational acceleration. In the defective case (Figure 4.15b), peaks of amplitude above 7g indicate a consistent presence of impulsivity and vibrational response of the system. In the defective case the background noise was equal to 1g and the signal-to-noise ratio was $SNR_{dB} = 17$ dB. The impulsive excitation is evoked by a ball passing over the damaged inner race surface, thus causing $N_{re} = 8$ excitations per segment.
Figure 4.15: Periods of vibration signal after segmentation, where a segment duration is proportional to ball pass frequency inner race BPFI. Segment duration in angular domain corresponds to approx. 583 deg. (a) Fault-free, (b) Inner race defect.
Stacking the segments in the defective case forms curved patterns of high amplitude, which limits the tracking of impacts produced by the same rolling element. Instead of vertical lines, each located within the corresponding angular range, curves appear due to the error in the estimated segment length and the presence of random slippage in the operating bearing (section 4.2.7). It is evident that the amplitude differs from impact to impact, which indicates the presence of amplitude modulation of the vibration response. Since the defective component is the inner race, the modulation is mainly a consequence of the rotating damaged surface, and shows up as regions of low vibration amplitude where impacts are attenuated (Figure 4.15b).
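The quoted signal-to-noise ratio can be related to the quoted amplitudes with a one-line calculation. The sketch below assumes the amplitude-ratio definition $SNR_{dB} = 20\log_{10}(A_{peak}/A_{noise})$. This definition is an assumption of the sketch, chosen because it reproduces the reported figure from the 7g peaks and the 1g noise floor.

```python
import math

def snr_db(peak_amplitude, noise_amplitude):
    """Amplitude-ratio SNR in decibels (assumed definition)."""
    return 20.0 * math.log10(peak_amplitude / noise_amplitude)

# Raw inner race segments: peaks of about 7g over a noise floor of about 1g.
snr_raw = snr_db(7.0, 1.0)
```

Under this assumption the raw inner race segments give about 17 dB.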
Estimation of spectral kurtosis (SK)
The segments were subjected to estimation of the spectral kurtosis (SK) using eqs. (4.2) and (4.1). The optimal Hamming window for estimation of the STFT was determined by the kurtogram technique as $N_w = 29$ samples, equivalent to $14^\circ$ of shaft angular displacement. A window overlap of 80% was used for the STFT, which resulted in the SK estimates for the fault-free and faulty cases shown in Figure 4.16. In the fault-free case (Figure 4.16a) the amplitude of the SK remains approx. 1 within the entire frequency range $[0, F_s/2] = [0, 20\ \mathrm{kHz}]$. Low SK values indicate the presence of mainly Gaussian noise in the vibration signal, with weak non-stationary components. On the other hand, in the faulty case (Figure 4.16b) strong peaks of the SK are visible, exceeding an amplitude of 15. It can be seen from Figure 4.16b that several resonant frequency bands were excited due to the presence of the defect. The SK was consistently excited within the frequency band [3 kHz, 4 kHz] and less consistently within [8 kHz, 13 kHz] and above.
Figure 4.16: Results of spectral kurtosis obtained from each segment. (a) Fault-free, (b) Inner race defect.
It appears that several frequency bands were excited due to relatively strong impact excitation. It is also evident from Figure 4.16b that not all segments yield the same SK. Some segments show clearly expressed frequency bands with values SK > 15, while SK of low amplitude was estimated periodically, approx. every 13 segments. During those segments it appears that, despite the presence of the fault, neither impact excitation nor structural response was present. The reason is the attenuation of impacts due to amplitude modulation.
SK based optimal denoising (Wiener) filtering and impact power estimation
From the SK, the Wiener filter was estimated using eq. (4.5), which considers values above the statistical threshold, set to $s_\alpha = 0.23$ at $\alpha = 1\%$ and 75% window overlap. Raw vibration segments were filtered with the Wiener filter (eq. (4.11)) in the frequency bands indicated by values of SK above the threshold. The filtered segments, called the SK-residuals, are shown in Figure 4.17. The SK-residuals are time domain signals from which the stationary random component (Gaussian noise) has been removed, thus preserving the fault-induced non-stationary component of the vibration signal. Filtering of the fault-free vibration data produced SK-residuals of low amplitude, which confirms the absence of non-stationary behaviour and the presence of Gaussian noise (Figure 4.17a). The reason for the low amplitude of the SK-residual is that the SK amplitude remains low in the presence of mainly Gaussian noise, often below the significance threshold. In such a case, the frequency bands for filtering defined by the SK are relatively narrow or even empty, which may deteriorate the performance of the Wiener filter. In any case, the amplitude of the SK-residuals in the fault-free case remains close to zero, as anticipated.
Figure 4.17: Filtered segments (SK-residuals). (a) Fault-free, (b) Inner race defect.
In the faulty case in Figure 4.17b, the impulse responses are clearly visible with amplitude > 25g, approx. 3-4 times higher than in the segments before filtering (Figure 4.15b). The amplitude modulation remains present after filtering, since only two impacts per segment remain strong. The signal-to-noise ratio for the faulty case is increased to approx. $SNR_{dB} > 28$ dB, where the background vibration noise after filtering is equal to approx. 1g. To characterize the SK-residuals and reflect the power of the fault-induced impacts, the squared envelope was estimated via the Hilbert transform using eqs. (4.12) to (4.14). Figure 4.18 shows the SK-residual squared envelope for the fault-free and faulty cases.
Figure 4.18: SK-residual power - squared envelope. (a) Fault-free, (b) Inner race defect.
In the fault-free case the amplitude of the squared envelope remains below $1g^2$, which indicates the absence of impulsive behaviour within the system. Consequently, low values of the SK-residuals indicate the absence of a defect and therefore confirm the fault-free conditions (Figure 4.18a). The faulty case shows impacts of amplitude exceeding $600g^2$ (Figure 4.18b), which indicates a strong presence of a defect. Not only does the squared envelope reflect the physical power of impacts, it also represents each high-frequency impulse response as a single positive impact with amplitude equal to its power.
Averaging and angular alignment of impacts using cross-correlation
A moving window of 10 segments was used for averaging the SK-residual squared envelope, to overcome the problem of impact attenuation due to amplitude modulation. The stronger the modulation, the poorer the behaviour of the cross-correlation based alignment compared to the case of equally strong impacts. The averaged segments contained impacts from all $N_{re}$ balls, which indicates successful application of this simple averaging technique to eliminate the amplitude modulation. Instead of averaging segments from the same sensor, multi-sensor vibration data could be averaged. However, the reduction of amplitude modulation by averaging also decreased the amplitude of impacts and the SNR compared to the non-averaged squared envelope segments (Figure 4.19).
Figure 4.19: Period averaging. (a) Fault-free, (b) Inner race defect.
The curved pattern formed by the SK-residual segments limits the tracking of impacts produced by the same ball. The curves appear mainly for two reasons: the error in the segment length estimated from theoretical characteristic frequencies, and the presence of random slippage. Both were estimated using the cross-correlation method (section 4.2.7), applied to the averaged segments with removed amplitude modulation. The cross-correlation was estimated using eq. (4.18), and the error and slippage using eq. (4.16); they are shown in Figure 4.20 for the fault-free and faulty cases. In the fault-free case, when no impacts were present, the random slippage estimation produced unstable results (Figure 4.20a). In that case the cross-correlation has relatively low values, usually below 10, and its maximum is poorly expressed, therefore the alignment is random. Using a significance threshold for the cross-correlation, which would retain only correlation values above the noise level, is a possible solution to this problem. Estimation of slippage for the defective case produced stable results (Figure 4.20b).
Figure 4.20: Estimated random slippage and error. (a) Fault-free, (b) Inner race defect.
The process of angular alignment, described in section 4.2.7, included compensation for the estimated error. Even though averaged segments were used for estimation of the cross-correlation, the alignment was performed for each non-averaged segment, by assigning the same value of error to several successive segments. The aligned non-averaged SK-residuals are shown in Figure 4.21.
Figure 4.21: SK-residual segments after alignment using cross-correlation. (a) Fault-free, (b) Inner race defect.
In the fault-free case, when the error and slip were estimated randomly, the subsequent alignment also translated the fault-free segments randomly (Figure 4.21a). However, in the faulty case the alignment procedure proved successful: after alignment the segments are not only aligned to the mask $m(k)$, but impacts produced by the same ball also appear at the same angle (straight lines) (Figure 4.21b). Therefore, the alignment procedure enabled tracking of impacts from the same ball, which is a part of the proposed diagnostic procedure.
Impact detection by k-means clustering and k-nearest neighbours classification
Following the alignment process, impact detection relied on k-means clustering and k-nearest neighbours (k-NN) classification of the aligned segments. For detection purposes, features were extracted from the aligned segments as the maximum amplitude of impacts, which provided $N_{re}$ diagnostic features per segment. Impact detection included two phases: the training phase and the testing phase. The training phase included establishment of the fault-free clusters using the k-means method, with the number of 1-D clusters equal to $k_C = 10$. For each cluster a novelty score was calculated using Euclidean distances between the training samples belonging to the cluster. The training phase concluded by establishing the boundary between the fault-free and faulty clusters, based on the relative impact detection threshold, which was equal to 1. Vibration data for training purposes included 345 segments and consequently 2760 fault-free training samples for k-means clustering. For testing purposes two datasets were established, a fault-free and a faulty one, each containing approx. 345 segments, shown in Figure 4.21. Impact detection was based on the k-NN method with the number of nearest neighbours set to $k_{NN} = 10$. Each testing sample was subjected to calculation of the novelty score, which was compared to the novelty scores obtained during the training phase for each fault-free cluster, thus obtaining a relative novelty score for each fault-free cluster. When all relative novelty scores exceed the detection threshold of 1, the testing sample is considered anomalous and the detection is positive. The results of k-NN classification are presented in Figure 4.22.
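The novelty-score test of section 4.2.8 can be sketched for one-dimensional features as follows. The sketch assumes that the fault-free clusters have already been produced by a k-means step (not shown), takes the score as $\sqrt{D_2/D_1}$, and requires distinct training values so that $D_1 > 0$. All function names and the toy clusters are illustrative assumptions.

```python
import math

def nn_reference_distance(cluster, k):
    """D1: mean of the k largest nearest-neighbour distances inside a cluster."""
    nn = [min(abs(a - b) for j, b in enumerate(cluster) if j != i)
          for i, a in enumerate(cluster)]
    return sum(sorted(nn)[-k:]) / k

def knn_distance(sample, cluster, k):
    """D2: mean of the k smallest distances from a test sample to the cluster."""
    return sum(sorted(abs(sample - b) for b in cluster)[:k]) / k

def is_impact(sample, clusters, k, threshold=1.0):
    """Positive detection when the novelty score reaches the threshold for every cluster."""
    scores = [math.sqrt(knn_distance(sample, c, k) / nn_reference_distance(c, k))
              for c in clusters]
    return all(s >= threshold for s in scores)

# Toy fault-free clusters (assumed to come from a preceding k-means step).
clusters = [[0.0, 0.1, 0.2, 0.3], [1.0, 1.1, 1.2, 1.3]]
inside = is_impact(0.15, clusters, k=2)    # sample inside a fault-free cluster
outlier = is_impact(5.0, clusters, k=2)    # sample far from all clusters
```

A sample lying inside any fault-free cluster has a score below the threshold for that cluster and is therefore not flagged, whereas a sample far from all clusters is flagged as an impact.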
Figure 4.22: Results from impact detection using k-means for clustering (training phase), and k-NN for classification (testing phase). (a) Fault-free, (b) Inner race defect.
Figure 4.22 shows the impact detection charts for the fault-free (Figure 4.22a) and defective cases (Figure 4.22b), where red indicates a positive and green a negative detection. For the fault-free case, the majority of detections were negative, with a few isolated false detections. The specificity (true negative rate) was equal to $TNR = \frac{TN}{TN + FP} = 99.7\%$, where the true negative count was $TN = 2753$ and the false positive count $FP = 7$. In the defective case, the majority of detections proved to be positive, which clearly indicates the presence of impacts. However, a significant number of missed detections is evident, apparently as a consequence of amplitude modulation. The sensitivity (true positive rate) was equal to $TPR = \frac{TP}{TP + FN} = 92.5\%$, where $TP = 2554$ and $FN = 206$.
4.3.1.2
Bearing outer race defect
The outer race damage was created artificially, with a damaged area of 0.8 mm circumferential length and 0.2 mm depth (Figure 4.23). This early-stage damage represented 1.44% of the outer race circumference.
Figure 4.23: Outer race defect.
The results below show the application of the proposed method to vibration signals acquired under fault-free and defective outer race bearing conditions. Prior to the analysis, the acquired vibration signals were resampled into the angular domain using the tachometer measurements of shaft rotation.
Segmentation of the vibration signal to segments of a single rotation
After angular resampling, eq. (4.8) was used to calculate the number of samples of a segment, where BPFO was taken as the fault characteristic frequency and $N_{re}$ as the proportionality parameter:
$$N_p = \frac{F_s}{\mathrm{FTF}} = \frac{40000\,\mathrm{Hz}}{22.156\,\mathrm{Hz}} \approx 1804 \text{ samples} \qquad (4.27)$$
Figure 4.24 shows vibration segments in the fault-free (4.24a) and defective (4.24b) states. The x-axis corresponds to the shaft angle, the y-axis to the segment number, and the color (z-axis) to the acceleration magnitude.
Figure 4.24: Raw vibration segments. (a) Fault-free, (b) Outer race defect.
In the fault-free case (Figure 4.24a), when mostly noise is present, the magnitude of the vibration segments does not exceed 1g. In the defective case (Figure 4.24b), a weak pattern of amplitude above 3.2g indicates a consistent presence of impulsivity and vibrational response of the system. In the defective case the background noise was equal to 1g and the signal-to-noise ratio was $SNR_{dB} = 10$ dB. It is evident that the amplitude differs from impact to impact, but significantly less than in the case of the inner race defect, where amplitude modulation caused severe attenuation of impacts. Since the defective component is the outer race, the modulation is mainly a consequence of the varying geometry of the rotating rolling elements, since the damaged surface is stationary.
Estimation of spectral kurtosis (SK)
The estimated SK for the fault-free and faulty cases is shown in Figure 4.25. In the fault-free case (Figure 4.25a) the amplitude of the SK remains equal to 1 within the entire frequency range $f \in [0, F_s/2] = [0, 20\ \mathrm{kHz}]$. In the faulty case (Figure 4.25b) strong peaks of the SK are visible, exceeding an amplitude of 6. It can be seen from Figure 4.25b that, consistently for all segments, a resonant band at 4 kHz was excited due to the presence of the defect.
Figure 4.25: Estimation of spectral kurtosis for each segment. (a) Fault-free, (b) Outer race defect.
SK based optimal denoising (Wiener) filtering and power estimation
From the SK, the Wiener filter was estimated using eq. (4.5), which considers values above the statistical threshold, set to $s_\alpha = 0.23$ at $\alpha = 1\%$ and 75% window overlap. Raw vibration segments were filtered with the Wiener filter (eq. (4.11)) in the frequency bands indicated by values of SK above the threshold. The filtered segments, called the SK-residuals, are shown in Figure 4.26.
Figure 4.26: Filtered segments (SK-residuals). (a) Fault-free, (b) Outer race defect.
The amplitude of the SK-residuals in the fault-free case remains close to zero, as anticipated. In the faulty case in Figure 4.26b, the impulse responses are clearly visible with amplitude > 5g, which means that the amplitude of impacts remained close to that before filtering (Figure 4.24b). The signal-to-noise ratio for the faulty case is increased to approx. $SNR_{dB} > 14$ dB, where the background vibration noise after filtering is equal to approx. 1g. Figure 4.27 shows the SK-residual squared envelope for the fault-free and faulty cases. In the fault-free case the amplitude of the squared envelope remains below $1g^2$, which indicates an absence of impulsive behaviour within the system (Figure 4.27a). The faulty case shows impacts of amplitude exceeding $300g^2$ (Figure 4.27b), which indicates a strong presence of a defect.
Figure 4.27: SK-residual power - squared envelope. (a) Fault-free, (b) Outer race defect.
Angular alignment of segments using cross-correlation
The error in segment length and the random slippage were estimated using the cross-correlation method. Due to the low amplitude modulation of the vibration signal in the case of outer race damage, no averaging was performed on the squared envelope of the SK-residuals, which was used directly for estimation of the cross-correlation. The cross-correlation was estimated using eq. (4.18), and the error and slippage were estimated using eq. (4.16); both are shown in Figure 4.28 for the fault-free and faulty cases.
(a) Fault-free
(b) Inner race defect
Figure 4.28: Estimated random slippage and error
When no impacts are present, the cross-correlation has relatively low values, usually below 10, and its maximum is poorly expressed; the alignment is therefore random. Estimation of slippage for the defective case produced stable estimates of the error and slippage (Figure 4.28b). The alignment process was performed for each non-averaged segment and resulted in the aligned SK-residuals shown in Figure 4.29. In the fault-free case, where error and slip were estimated randomly, the subsequent alignment also translated the fault-free segments randomly (Figure 4.29a). In the faulty case, however, the alignment procedure proved successful, as impacts produced by the same ball appear at the same angle (vertical lines) (Figure 4.29). The alignment procedure therefore enabled tracking of impacts from the same ball, which is a part of the proposed diagnostic procedure.
(a) Fault-free
(b) Inner race defect
Figure 4.29: SK-residual segments after alignment using cross-correlation
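The alignment idea described above can be illustrated with a generic sketch: the lag at which the cross-correlation with a reference segment peaks is taken as the slippage, and the segment is shifted back by that lag. This is not a reproduction of eqs. (4.16) and (4.18); it assumes circular shifts of equal-length squared-envelope segments, ignores the error in segment length, and all function names are hypothetical.

```python
def circular_xcorr(reference, segment):
    """c[m] = sum_n reference[(n + m) % N] * segment[n] for every circular lag m."""
    n_samples = len(reference)
    if len(segment) != n_samples:
        raise ValueError("segments must have equal length")
    return [
        sum(reference[(n + m) % n_samples] * segment[n] for n in range(n_samples))
        for m in range(n_samples)
    ]

def estimate_shift(reference, segment):
    """Lag (in samples) at which the cross-correlation is maximal."""
    xcorr = circular_xcorr(reference, segment)
    return max(range(len(xcorr)), key=xcorr.__getitem__)

def align(reference, segment):
    """Circularly shift `segment` so that it lines up with `reference`."""
    shift = estimate_shift(reference, segment)
    n_samples = len(segment)
    return [segment[(i - shift) % n_samples] for i in range(n_samples)]

# Toy squared envelopes: the same impact appears 3 samples earlier in `slipped`.
reference = [0, 0, 0, 0, 0, 9, 0, 0]
slipped = [0, 0, 9, 0, 0, 0, 0, 0]
print(estimate_shift(reference, slipped), align(reference, slipped))
```

When the segment contains no impact, the cross-correlation has no distinct maximum and the estimated lag is essentially arbitrary, which matches the random alignment observed for the fault-free segments.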
Impact detection by k-means clustering and k-nearest neighbours classification
Vibration data for training purposes included approx. 214 segments and consequently 1712 fault-free training samples for k-means clustering. For testing purposes two datasets were established, a fault-free and a faulty one, each containing approx. 1712 segments, shown in Figure 4.29. Impact detection was based on the k-NN method with the number of nearest neighbours set to $k_{NN} = 10$. For each testing sample a novelty score was calculated and compared to the novelty scores obtained during the training phase for each fault-free cluster, yielding a relative novelty score for each fault-free cluster. When all relative novelty scores exceed the detection threshold of 1, the testing sample is considered an impact. The results of the k-NN classification are presented in Figure 4.30, which shows the impact detection charts for the fault-free (Figure 4.30a) and defective (Figure 4.30b) cases, where red indicates a positive and green a negative detection. For the fault-free case, the majority of detections were negative, with a few isolated false detections. The specificity (true negative rate) was $TNR = \frac{TN}{TN + FP} = 99.2\%$, where the number of true negatives was $TN = 1699$ and the number of false positives $FP = 13$. In the defective case, the majority of detections proved to be positive, which clearly indicates the presence of impacts. However, a significant number of missed detections is evident, apparently a consequence of weaker impacts compared to the inner race damage. The sensitivity (true positive rate) was $TPR = \frac{TP}{TP + FN} = 80.8\%$, where $TP = 1383$ and $FN = 329$.
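The two rates can be recomputed directly from the counts given above. The short Python sketch below only restates the standard definitions of specificity and sensitivity; the function names are illustrative.

```python
def specificity(true_negatives, false_positives):
    """True negative rate: TN / (TN + FP)."""
    return true_negatives / (true_negatives + false_positives)

def sensitivity(true_positives, false_negatives):
    """True positive rate: TP / (TP + FN)."""
    return true_positives / (true_positives + false_negatives)

# Counts reported for the outer race experiment.
print(f"TNR = {100 * specificity(1699, 13):.1f}%")
print(f"TPR = {100 * sensitivity(1383, 329):.1f}%")
```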
(a) Fault-free
(b) Inner race defect
Figure 4.30: Impact detection
4.3.2 Experimental results with gear pitting
The main objective of the pitting experiment, conducted under time varying load conditions, was to induce the occurrence of the macro pitting in the flank of spur gears naturally during operation under variable load. The experiment was performed on the experimental rig, which was also used for oil analysis (section 3.2.1). The details of the pitting experiment were discussed in section 3.2.4, including results from the analysis of generated wear particles during the experiment. Three vibration samples were considered for detection of pitting. Two fault-free samples were selected at 100% load at approx. 20 hours and 34 hours for training and fault-free testing purpose, respectively. The sample representing faulty conditions was selected at 100% load at 104 hours.
Segmentation of the vibration signal to segments
Figure 4.32 shows vibration segments in the fault-free (4.32a) and defective (4.32b) states. The x-axis corresponds to the shaft angle, the y-axis to the segment number, and the color (z-axis) to the acceleration magnitude. In the fault-free case (Figure 4.32a) the vibration segments indicate a strong presence of the gearmesh component, visible as 16 peaks with amplitude above 10g (orange-red and blue vertical lines). During the meshing cycle, when load is transferred between teeth, the loaded surface area of the tooth pair varies in time. Abrupt jumps in the amplitude of the vibration response are a consequence of the non-integer gear contact ratio CR = 1.5 and of the fact that the load is carried by 2 tooth pairs for one half of the meshing cycle and by 1 tooth pair for the other half. In the defective case (Figure 4.32b) the vibration is additionally influenced by the apparent damage, which manifested as abrupt jumps of amplitude above 30g. Several such peaks (dark red and dark blue vertical lines) indicate damage distributed over several teeth. The background noise was equal to approx. 5g and the signal-to-noise ratio was $SNR_{dB} = 15.5$dB. The impulsive excitation is evoked by the surface rolling and sliding phenomena during a meshing cycle, when the damaged surface comes into contact.
Figure 4.31: A photo of spur gears with a pitted tooth.
(a) Fault-free
(b) Gear pitting
Figure 4.32: Raw vibration segments
Estimation of spectral kurtosis (SK)
The estimates of the SK for the fault-free and faulty cases are shown in Figure 4.33.
(a) Fault-free
(b) Gear pitting
Figure 4.33: Estimation of spectral kurtosis for each vibration segment
In the fault-free case (Figure 4.33a) the amplitude of the SK remains below 2 within the entire frequency range $[0, F_s/2] = [0, 20\,\mathrm{kHz}]$. Low SK values indicate mainly Gaussian noise in the vibration signal, with weak non-stationary components. In the faulty case (Figure 4.33b), on the other hand, strong peaks of the SK are visible, exceeding an amplitude of 5. Figure 4.33b shows that the resonant band continuously excited by the defect was [10kHz, 14kHz]. It is also evident from Figure 4.33b that the SK estimate differs between segments: some segments show clearly expressed frequency bands with SK > 5, while SK of lower amplitude was estimated between the 12th and 14th segments. During those segments, despite the presence of the fault, the impact excitation was weak, which could be related to the varying size of the oil film and the amount of wear debris in the contact.
SK based optimal denoising (Wiener) filtering and power estimation
The filtered segments, called the SK-residuals, are shown in Figure 4.34. Filtering of the fault-free vibration data revealed SK-residuals of low amplitude, which confirms the absence of non-stationary behaviour and the presence of Gaussian noise (Figure 4.34a). Filtering in the faulty case (Figure 4.34b) revealed a strong single impact per segment with amplitude above 10g, which points to damage on a single tooth, while the other impacts are weaker and therefore not clearly expressed. The signal-to-noise ratio for the faulty case increased to approx. $SNR_{dB} > 23.5$dB, with background vibration noise after filtering of approx. 1g. Figure 4.35 shows the squared envelope of the SK-residuals for the fault-free and faulty cases.
(a) Fault-free
(b) Gear pitting
Figure 4.34: Filtered segments (SK-residuals)
(a) Fault-free
(b) Gear pitting
Figure 4.35: SK-residual power - squared envelope
In the fault-free case the amplitude of the squared envelope remains low, which indicates an absence of impulsive behaviour within the system (Figure 4.35a). The squared envelope in the faulty case shows one strong impact and several weaker ones, which confirms the presence of damage distributed over several teeth (Figure 4.35b).
Impact detection by k-means clustering and k-nearest neighbours classification
Vibration data for training purposes included approx. 21 segments and consequently 336 fault-free training samples for k-means clustering. For testing purposes two datasets were established, a fault-free and a faulty one, each containing approx. 336 segments, shown in Figure 4.35. Impact detection was based on the k-NN method with the number of nearest neighbours set to $k_{NN} = 10$. The results of the k-NN classification are presented in Figure 4.36, which shows the impact detection charts for the fault-free (Figure 4.36a) and defective (Figure 4.36b) cases, where red indicates a positive and green a negative detection.
(a) Fault-free
(b) Gear pitting
Figure 4.36: Impact detection
For the fault-free case, the majority of detections were negative, with occasional false alarms. In the defective case, two teeth are indicated as defective, as the majority of their detections proved to be positive.
5 Integrated fault detection and isolation (FDI) based on fusion of vibration and oil readings
To isolate the correct defect from a number of possible ones, the final step of fault detection and isolation relies on data fusion at the decision level (figure 5.1).
Figure 5.1: Fault detection and isolation scheme.
It was shown in section 3 that the CDA based analysis of oil properties detects changes in the signal’s trend and associates each change with a qualitative state, such as stable, stabilizing, increasing, decreasing, etc. (figure 5.1-top). Qualitative trend analysis (QTA) is applied to instantaneously evaluate the nature of the change, based on a comparison of two slopes, the pre-established reference slope and the present slope. Within this chapter, typical fault related patterns occurring in oil parameters are identified by comparing the time evolution of the qualitative states of a property to several fault-pattern models. The result of the comparison is an estimated probability for each of the typical fault patterns.
Vibration analysis, presented in section 4, extracts the diagnostic feature of SK-residual power using the squared envelope and performs impact detection based on the k-means and k-NN methods. The result is a Boolean impact detection chart that characterizes the amount of impulse responses present in the analysed vibration signal. The impact detection chart is used for impact grouping, for elimination of false and missed detections, and for estimation of the impact probability for each of the possible faults sensed by vibration measurements.
The final fault detection and identification is performed using oil parameters and vibration impact probabilities, to obtain the final probability for each of the faults defined by the incidence table (figure 5.1-right). Probabilities from oil parameters and impacts are associated during evaluation of the incidence table, which contains relations between possible faults and the corresponding fault signature patterns. In the incidence table, rows represent faults and columns represent oil and vibration based probabilities of fault signatures. Each fault may contain one or several possible fault-mode combinations, where each combination can indicate the presence of the same fault, depending on the fault progression mechanism. Each combination also contains a number of weights, which are non-zero only for indicative properties. The weights reflect the degree to which a particular property is indicative of a fault. Partial probabilities obtained by oil and vibration analyses are combined using the weighted average method, which yields the final probabilities for all faults included in the incidence table.
5.1 Oil-based fault probability estimation
Fault signatures observed in oil properties tend to evolve slowly in time, producing trend patterns that cannot be interpreted solely by the CDA based analysis. In order to characterize fault related signatures, their evolution is estimated in terms of the qualitative states produced by the CDA. This way, a series of fault-indicative qualitative models can be established and assigned to the current evolution of a particular oil property, in order to detect and isolate the fault. The models are continuously compared to the current oil property evolution in the qualitative domain, in order to estimate the probability of fault-signature presence. The qualitative state vector $S(k)$, obtained upon CDA analysis of the oil property signal $x(k)$, is used for recognition of fault signatures $\tilde{x}_j(k)$, where $j$ is the signature index. Fault-signatures are represented by qualitative trend models, with state values equivalent to the ones obtained by CDA analysis (section 3.1.6). The result of the recognition process is a probability $Px_j(k)$, which represents the probability that a particular fault signature is currently present in an oil property.
5.1.1 Typical fault signatures and qualitative trend analysis
We propose four distinct fault-signatures j ∈ [1, . . . , 4], which reflect stable, increasing, decreasing, and a step increase with logarithmic decrease patterns (Figure 5.2).
Figure 5.2: Possible fault-signatures with representative qualitative trend models
As can be seen (Figure 5.2-right), each fault-signature is described using one or several qualitative models, which represent the time sequence of states describing the pattern. For the increasing signature, three patterns (and models) are assumed, for decreasing as well, etc. Each combination is represented by a model $\tilde{S}_{j,u}$, where $u$ is the model index. Models are represented as sequences of states, where only transitions between states are considered and repetitions avoided:
\[ \tilde{S}_{j,u} = S(k) \quad \text{for } S(k) \neq S(k-1) \tag{5.1} \]
The stable pattern is used to identify the fault-free state when no faults are present. The stable pattern model contains a zero value ($\tilde{S}_{0,1} = [0]$), since a signal in stable condition produces a sequence of qualitative states equal to zeros, $S(k) = [0, 0, 0, \dots]$.
The increasing pattern is defined by three typical shapes. The exponential increase is often an indication of damage progression, related to rapid deterioration of mechanical components, severe oil contamination, etc. It is modeled by an increasing qualitative state, $\tilde{S}_{1,1} = [0, 3]$. The linear pattern indicates a continuous increase with a constant slope, which is often a result of water condensation, oil aging, etc. It is described by the model $\tilde{S}_{1,2} = [0, 3, 2]$ and contains an increasing and a non-changing state. The logarithmic increase often represents stabilization of fault progression, which is particularly important when estimating fault severity. Such patterns are also common in temperature measurements after a load change occurs, which may be relevant during fault diagnosis. The logarithmic pattern is described by the model $\tilde{S}_{1,3} = [0, 3, 2, 1]$, which is represented by the qualitative states increasing, no-change, and stabilizing.
The decreasing pattern is defined by the same shapes as the increasing pattern, used with a negative sign.
The composite pattern contains a step increase with logarithmic decrease, and is a sequence observed mainly during the water contamination experiment. The composite pattern can be described using two models, $\tilde{S}_{3,1} = [0, 3, -3, -2, -1]$ or $\tilde{S}_{3,2} = [0, 3, -3, -1]$. The first model ($\tilde{S}_{3,1}$) indicates that the initial step change is followed by a negative change, which occurs when the signal starts to stabilize. The second model ($\tilde{S}_{3,2}$) shows that, before stabilization, the CDA may also indicate a no-changing state, depending on the shape of the pattern and the selection of CDA parameters.
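The reduction of a state sequence to its transitions, as expressed by eq. (5.1), can be sketched in a few lines of Python. The state codes (0 stable, 3 increasing, 2 no-change, 1 stabilizing) follow the models above; the function name `unique_state_vector` and the sample sequence are illustrative assumptions.

```python
def unique_state_vector(states):
    """Drop consecutive repetitions, keeping only state transitions (eq. 5.1)."""
    unique = []
    for state in states:
        if not unique or state != unique[-1]:
            unique.append(state)
    return unique

# A hypothetical CDA output for a logarithmic increase.
print(unique_state_vector([0, 0, 0, 3, 3, 2, 2, 2, 1, 1]))
```

For this input the result is the sequence [0, 3, 2, 1], which coincides with the logarithmic increase model.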
5.1.2 Similarity estimation using the majority voting approach
With fault-signatures and their qualitative models defined, recognition of the signatures in the analyzed oil property is based on comparison of two qualitative state sequences. The first sequence is taken from a signature model, and the second from the qualitative state vector $S(k)$ resulting from the current CDA analysis of oil property $x(k)$. If the current state vector $S(k)$ contains repeated state values, they are first removed, and the unique state vector $S'$ is defined following eq. (5.1). The unique state vector $S'$ is compared to the fault-signature models $\tilde{S}_{j,u}$, which results in similarity measures between the observed property and the proposed fault-signatures, indirectly through the models. After the similarity measure is obtained for all models of a particular fault-signature, the final probability of the fault-signature is obtained as the maximum among its representative models.
As both vectors, $S'$ and $\tilde{S}_{j,u}$, may be of different lengths, they are first subjected to interpolation to assure equal lengths $N_l$. After interpolation, the binary scoring function $\sigma_{j,u}$ is obtained as
\[ \sigma_{j,u}(i) = \begin{cases} 0 & \text{for } S'(i) \neq \tilde{S}_{j,u}(i) \\ 1 & \text{for } S'(i) = \tilde{S}_{j,u}(i) \end{cases} \tag{5.2} \]
where $i = [1, \dots, N_l]$ denotes an index. The similarity degree $\zeta_{j,u}$ between the observed oil property and the model is calculated from the binary scoring function using the majority rule approach as
\[ \zeta_{j,u} = \frac{\sum_{i=1}^{N_l} \sigma_{j,u}(i)}{N_l} \tag{5.3} \]
The similarity degree $\zeta_{j,u}$ reflects the degree of similarity between the observed oil property and all models from all fault-signatures. In order to obtain the final probability $Px_j$, which represents the probability of presence of the $j$-th fault-signature, the maximum value among all models representative of the $j$-th fault-signature is considered:
\[ Px_j = \max_u \zeta_{j,u} \tag{5.4} \]
The final probability of the fault signature is used as a partial contribution regarding fault presence, if the presence of the $j$-th signature in the oil property indicates the fault.
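The similarity estimation of eqs. (5.2) to (5.4) can be sketched as follows. The interpolation scheme is not specified in this section, so the sketch assumes a simple nearest-neighbour stretch of both sequences to the longer of the two lengths; this choice, like all function names, is an assumption and may differ from the procedure actually used.

```python
def resample_nearest(sequence, length):
    """Stretch a state sequence to `length` by repeating states (assumed scheme)."""
    n = len(sequence)
    return [sequence[(i * n) // length] for i in range(length)]

def similarity(observed, model):
    """Similarity degree of eq. (5.3): share of positions where states agree."""
    if not observed or not model:
        raise ValueError("sequences must be non-empty")
    n_l = max(len(observed), len(model))
    a = resample_nearest(observed, n_l)
    b = resample_nearest(model, n_l)
    score = [1 if x == y else 0 for x, y in zip(a, b)]  # eq. (5.2)
    return sum(score) / n_l

def signature_probability(observed, models):
    """Eq. (5.4): maximum similarity over the models of one fault-signature."""
    return max(similarity(observed, m) for m in models)

# Composite pattern models as defined in section 5.1.1.
composite_models = [[0, 3, -3, -2, -1], [0, 3, -3, -1]]
print(signature_probability([0, 3, -3, -1], composite_models))
```

Taking the maximum over the models means that an observed sequence only has to match one of the alternative shapes of a fault-signature to yield a high probability.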
5.2 Vibration-based fault probability estimation by grouping approach
After obtaining the impact detection results $I_1(k)$ from vibration based impact detection, importance is given to grouping of decisions. The proposed approach follows the idea of tracking impacts produced by the same rolling element, by grouping impacts appearing at the same angular position in successive periods (Figure 5.3). From a group, the majority coefficient $MC$ is calculated as
\[ MC = \frac{\sum_{i=1}^{N_g} I_1(i)}{N_g} \tag{5.5} \]
where the sum represents the number of positive detections within the group, and $N_g$ is the group length.
Figure 5.3: Grouping of 1st level decisions to eliminate false and missed detections.
The majority coefficient $MC$ is compared to the threshold $Th_{MC}$ as a part of the majority voting procedure. Hence, the 2nd level decision is obtained as
\[ I_2 = \begin{cases} 0 & \text{for } MC < Th_{MC} \\ 1 & \text{for } MC \geq Th_{MC} \end{cases} \tag{5.6} \]
As a result, a decision is made for each group, giving the vector of grouped decisions $I_{2g}$, where $g$ is the group index. From the grouped decisions, the final impact probability is estimated using
\[ Py = \frac{\sum_{g=1}^{G} I_{2g}}{G} \tag{5.7} \]
where $G$ is the number of groups.
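The grouping procedure of eqs. (5.5) to (5.7) can be sketched in Python as follows. The sketch assumes that the input is the sequence of first-level detections observed at one angular position over successive periods, that groups are non-overlapping, and that an incomplete trailing group is discarded; the threshold of 0.5 and the sample data are hypothetical.

```python
def group_decisions(first_level, group_length, threshold):
    """Second-level decisions I2 for consecutive groups of first-level detections."""
    decisions = []
    for start in range(0, len(first_level) - group_length + 1, group_length):
        group = first_level[start:start + group_length]
        majority_coefficient = sum(group) / group_length                   # eq. (5.5)
        decisions.append(1 if majority_coefficient >= threshold else 0)    # eq. (5.6)
    return decisions

def impact_probability(second_level):
    """Eq. (5.7): share of groups with a positive second-level decision."""
    return sum(second_level) / len(second_level)

# Hypothetical first-level detections with one missed and one false detection.
detections = [1, 1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1]
grouped = group_decisions(detections, group_length=4, threshold=0.5)
print(grouped, impact_probability(grouped))
```

In this toy example the missed detection in the first group and the false detection in the second group are both overruled by the majority within their groups.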
5.3 Final fault diagnosis based on fusion of vibration and oil analyses
The incidence table is a knowledge database that provides relations between the expected faults and the fault indicative oil properties and vibration signatures. Evaluation of the incidence table represents the process of fusion between partial decisions obtained from analysing oil and vibration signals. The probabilities of fault-signatures present in oil and vibration signals are integrated within the final step of fault probability estimation (Figure 5.4).
Figure 5.4: Evaluation process of the incidence table.
As can be seen from the figure, the incidence table considers several faults, where each fault contains one or several combinations, denoted as $C_{v,z}$, where $v$ is the fault index and $z$ the combination index. Several combinations are needed because the proposed faults may occur in different forms; for example, contaminations with various liquids affect oil properties differently, but still account for the same fault. Each combination is characterized by a distinct set of oil properties and vibration features, which are indicative of the corresponding fault. Fault-signatures (Figure 5.2) are assigned to the selected oil properties to describe the nature of the property evolution during fault progression. As a result, the probabilities of the selected property-signature pairs $Px_{i,j}$, where $i$ is the oil property index and $j$ the fault-signature index, are used for the final fault probability estimation.
Each combination also includes fault-related vibration signatures, which are represented within the table as probabilities of impacts related to different vibration signatures $Py_l$, where $l$ is the vibration signature index. To characterize the significance of a particular signature for fault indication, importance weights $W$ are assigned to the selected oil properties and vibration features. The sum of all weights belonging to a particular combination is equal to 1. Partial probabilities from oil and vibration signature analysis are weighted by the corresponding weights and averaged into the final fault probability. The probability of a fault $P$ is obtained in the following way. First, for each combination $C$ in the incidence table, the probability $P_c$ is calculated as
\[ P_c = \sum_i W_i \, Px_{i,k} + \sum_j W_j \, Py_j \tag{5.8} \]
where $i$ is the oil property index, $j$ the vibration signature index (denoted $l$ above), $k$ the oil-related fault-signature index, and $c$ the combination index. The final fault probability $P$ is defined as the maximum probability over all combinations of a particular fault:
\[ P = \max_c P_c \tag{5.9} \]
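Evaluation of eqs. (5.8) and (5.9) for a single fault can be sketched as a weighted sum per combination followed by a maximum. The weights below are those listed in the incidence table (Figure 5.5) for the gear wear / pitting fault; the partial probabilities, the dictionary layout and the function names are hypothetical and serve only as an illustration.

```python
def combination_probability(weights, probabilities):
    """Eq. (5.8): weighted sum of the partial probabilities of one combination."""
    return sum(w * probabilities[name] for name, w in weights.items())

def fault_probability(combinations, probabilities):
    """Eq. (5.9): maximum over all combinations of one fault."""
    return max(combination_probability(w, probabilities) for w in combinations)

# Gear wear / pitting combinations with the weights from the incidence table.
gear_fault = [
    {"Px4,1": 0.50, "Py1": 0.50},
    {"Px5,1": 0.50, "Py1": 0.50},
    {"Px4,1": 0.33, "Px5,1": 0.33, "Py1": 0.34},
]
# Hypothetical partial probabilities, not measured values.
partial = {"Px4,1": 1.0, "Px5,1": 0.0, "Py1": 0.8}
print(fault_probability(gear_fault, partial))
```

With these illustrative inputs the first combination dominates, since it weights only the two properties that actually indicate the fault.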
5.3.1 The proposed incidence table
The proposed incidence table, shown in figure 5.5, comprises 6 faults, which are related to oil contamination with fluids and dust, aging, excessive component surface wear and pitting [69–72]. The first part of the incidence table is related to oil condition and lubrication quality, and the second part to the condition of mechanical components, such as gears and bearings (figure 5.5).

Transient probability (Pxi,t: 0-100%), impact probability (Pyl: 0-100%) and weights (Wxijc & Wyljc: 0–100%). Columns x1 to x7: oil property analysis; columns y1 to y3: vibration analysis. Cells marked 0 carry zero weight (no influence).

| Faults | Combination | Temperature x1 | Water content x2 | Dielectricity x3 | Fe Particles (small) x4 | Fe Particles (large) x5 | Non-Fe Particles (small) x6 | Non-Fe Particles (large) x7 | Impact probability by gears y1 | Impact probability from outer race y2 | Impact probability from inner race y3 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Water Ingression / Condensation F1 | C11 | Px1,0 (10%) | Px2,3 (90%) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | C12 | Px1,0 (10%) | Px2,1 (90%) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Chemical Contamination / Oil Aging F2 | C21 | Px1,0 (10%) | 0 | Px3,1 (90%) | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | C22 | Px1,0 (10%) | 0 | Px3,2 (90%) | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Metallic dust contamination F3 | C31 | 0 | 0 | 0 | 0 | 0 | Px6,1 (100%) | 0 | 0 | 0 | 0 |
| | C32 | 0 | 0 | 0 | 0 | 0 | 0 | Px7,1 (100%) | 0 | 0 | 0 |
| | C33 | 0 | 0 | 0 | Px4,1 (40%) | 0 | 0 | 0 | -Py1 (20%) | -Py1 (20%) | -Py3 (20%) |
| | C34 | 0 | 0 | 0 | 0 | Px5,1 (40%) | 0 | 0 | -Py1 (20%) | -Py1 (20%) | -Py3 (20%) |
| Excessive Gear Wear / Pitting F4 | C41 | 0 | 0 | 0 | Px4,1 (50%) | 0 | 0 | 0 | Py1 (50%) | 0 | 0 |
| | C42 | 0 | 0 | 0 | 0 | Px5,1 (50%) | 0 | 0 | Py1 (50%) | 0 | 0 |
| | C43 | 0 | 0 | 0 | Px4,1 (33%) | Px5,1 (33%) | 0 | 0 | Py1 (34%) | 0 | 0 |
| Outer Race Wear / Pitting F5 | C51 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Py2 (100%) | 0 |
| Inner Race Wear / Pitting F6 | C61 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Py3 (100%) |

Figure 5.5: The incidence table.

Water contamination has two possible combinations. The first combination C1,1 predicts a composite pattern, a step change with logarithmic decrease, in water content (W = 100%), and the second combination C1,2 predicts an increasing pattern (linear, exponential or logarithmic) in water content at stable temperature. Since an indication of water contamination should be mainly related to water measurements, the water content is assigned the higher weight W x = 90% and the stable temperature W x = 10%.
Chemical contamination or oil aging is represented by two combinations related to increasing and decreasing patterns in the dielectric constant at stable temperature. The weights are assigned similarly as in the case of water contamination, where the importance of dielectricity is considered much higher (W = 90%) than that of temperature (W = 10%).
Contamination of gearbox oil with metallic dust is characterized by four combinations, related to ferrous and non-ferrous particles and vibration. The first and second combinations C3,1 and C3,2 predict an increasing pattern in small or large non-ferrous particles, respectively, both with weights of 100% (W = 100%). The third and fourth combinations C3,3 and C3,4 are defined by an increasing pattern in small or large ferrous particles. This fault signature is similar to the one of the excessive wear fault, so an additional feature is required to distinguish between the two faults. In the case of dust contamination, vibration features are expected to remain zero, indicating an absence of impacts. A negative probability is used in this case, which reduces the final fault probability of dust contamination when impacts are present, even if it would otherwise increase due to particles. The weight assigned to particles is 40%, and 20% is assigned to each of the three vibration signatures.
Faults related to the condition of mechanical gearbox components consider mainly gear and bearing defects. Excessive gear wear or gear pitting is characterized by an increasing pattern in small and/or large ferrous particles and by the presence of gear related impacts in the vibration signal. Several combinations are considered because different defect severities would cause different outputs of the CDA based analysis. The weights are distributed equally among ferrous particles and the gear vibration signature. For the bearing outer race defect, combination C5,1 shows that solely the vibration signal is considered for damage identification, therefore having a weight of W = 100%. For the bearing inner race defect, the same kind of combination is assigned (C6,1), with a different vibration signature related to the inner race defect.
6 Experimental validation of the proposed fault detection and isolation methodology
To evaluate the efficiency of the proposed methodology, experiments were performed on two laboratory test rigs. The first rig, presented in section 3.2.1, was composed of a gearbox, a pair of spur gears, two shafts and the supporting bearings. The integrated sensor unit (ISU) was connected to the gearbox, which enabled oil to circulate continuously through the measuring block, used to perform measurements of various oil properties. The first rig allowed a water contamination experiment, where relatively small amounts of water were injected into the operating oil, and a chemical contamination experiment, where hydraulic oil was used as the contaminant. The experimental results from water and chemical contamination were partially presented in sections 3.2.2 and 3.2.3, including analysis of oil parameters by means of trend change detection and qualitative evaluation by the CDA algorithm. Signal abstraction into qualitative states is used to evaluate the probabilities of fault-indicative patterns (Figure 5.2) and the final fault probabilities defined by the fault modes table FMT (Table 5.5).
The same rig was employed for a long-run gear fatigue experiment, which resulted in gear pitting damage. The results from wear debris analysis, partially presented in section 3.2.4, are extended within this chapter by fusion with vibration based results and by the final gear fault estimation through FMT evaluation.
The second rig, presented in section 4.3.1, was composed of several bearings and a shaft, which was used to apply load to the testing bearing by introducing shaft misalignment. The rig allowed frequent and easy replacement of the testing bearings without introducing any other structural changes. A fault-free bearing was tested first, followed by bearings with inner and outer race defects. Vibration based results from the faulty and fault-free bearing conditions were partially presented in sections 4.3.1.1 and 4.3.1.2. Moreover, results from the analysis of gear vibration in undamaged and damaged (pitting) conditions were presented in section 4.3.2. Impact detection charts are used within this chapter for evaluation of the FMT and for final fault probability estimation.
6.1 Lubricant water contamination experiment
The water contamination experiment was conducted in order to observe the influence of contamination of the operating gear oil by water. The motor speed was constant at 1000 rpm and the generator’s torque was set to 28 Nm. The results from the analysis of the relative water content were partially presented in section 3.2.2, which concludes with abstraction of the signal into qualitative states such as increasing, decreasing, stable, etc. The state evolution over time is used for identification of fault indicative patterns and for final fault probability estimation. Figure 6.1 shows the final results from estimation of the water contamination probability as instructed by the FMT. In (a) the relative water content measurements and the corresponding qualitative trend classification by the CDA are presented. According to the FMT evaluation procedure, the probabilities of fault indicative patterns were calculated (b), followed by estimation of the final fault probabilities (c).
Figure 6.1: Results from the water contamination experiment: (a) trend classification of the relative water content, (b) probabilities of fault-indicative patterns, (c) fault probabilities as defined by the FMT

The initially stable qualitative trend (Figure 6.1a) raised the probability of the stable pattern to 100% (b), which consequently set all fault probabilities to 0% (c). When the trend changed to increasing at approx. 20h, the new state was appended to the unique present state vector, which became Sui = [3]. The probability of the increasing pattern was estimated as 100%, since the pattern model M11 contains an increasing state, which is sufficient for it to prevail over the other patterns (Figure 5.2). In addition, the probability of the composite pattern increased to 33%, because the composite pattern model M32 contains an increasing state among its other states. The larger of the increasing and composite pattern probabilities was used to estimate a 100% probability of water contamination (c), as suggested by combinations C11 and C12 in the FMT.

At approx. 25h, when the qualitative trend changed to the decreasing state (a), the unique state vector became Sui = [3, −3]. The probability of the increasing pattern immediately dropped to 50%, while the decreasing and composite pattern probabilities rose to 50%. The rise in the decreasing and composite patterns was anticipated, since their pattern models contain one or both states from Sui. The final probability of water contamination decreased from 100%, in accordance with the decrease in pattern probabilities.

Soon after 32h a stabilizing trend was indicated (a) and the unique state vector was expanded to Sui = [3, −3, −1], which caused the composite pattern to prevail over the others as its probability reached 100% (b). At the same time, the probabilities of the increasing and decreasing patterns decreased, since the additional stabilizing state is not contained in the increasing and decreasing pattern models (Figure 5.2). The final fault probability increased to 100%, as it was estimated as the larger of the increasing and composite probabilities (c).

When the observed quantity stabilized, which the CDA trend indicated as stable at approx. 45h, the unique state vector became Sui = [0]. All pattern probabilities dropped to 0%, except for the stable pattern, which returned to its initial 100%. Hence, all fault probabilities became 0%, indicating fault-free operation.

The results indicate that the proposed approach correctly pinpointed the water contamination fault with an insignificant delay after the contamination event. Furthermore, the water contamination probability was non-zero throughout the fault signature of approx. 25h, while the other fault probabilities remained zero.
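The rule that turns pattern probabilities into the water contamination probability can be illustrated with a minimal sketch. It assumes only what the text states, namely that the fault probability is the larger of the increasing and composite pattern probabilities (combinations C11 and C12). The dictionary layout and the function name are hypothetical, and only the time points whose values are quoted explicitly in the text are used.

```python
# Pattern probabilities quoted in the text for Figure 6.1b (as fractions of 1).
# The data layout is illustrative; it is not taken from the thesis.
pattern_probabilities = {
    "20h": {"increasing": 1.00, "composite": 0.33},
    "25h": {"increasing": 0.50, "composite": 0.50},
    "45h": {"increasing": 0.00, "composite": 0.00},
}


def water_contamination_probability(patterns):
    """Return the larger of the increasing and composite pattern probabilities."""
    return max(patterns["increasing"], patterns["composite"])


fault_probability = {
    time: water_contamination_probability(patterns)
    for time, patterns in pattern_probabilities.items()
}
print(fault_probability)
```

Taking the maximum means that a single strongly matched pattern is enough to raise the fault probability, which is consistent with the behaviour described at 20h.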
6.2 Lubricant chemical contamination experiment
The chemical contamination experiment examines the effect of contaminating the operating oil with a different lubricant. The motor speed and the generator's torque were set to the same values as in the water contamination experiment: 1000 rpm and 28 Nm, respectively. The results from the analysis of the relative dielectric constant were partially presented in section 3.2.3, including the qualitative trend analysis of the quantity. The state evolution over time is used for identification of fault-indicative patterns and estimation of the final fault probability. Figure 6.2 shows the final results of the chemical contamination probability estimation, as prescribed by the FMT. In (a) the relative dielectric constant measurements and the corresponding CDA qualitative trend classification are presented. Following the FMT evaluation procedure, the probabilities of fault-indicative patterns were calculated (b), followed by estimation of the final fault probabilities (c).
Figure 6.2: Results from chemical contamination experiment: (a) Trend classification on relative dielectric constant, (b) Fault probabilities

The stable qualitative trend (a) raised the probability of the stable pattern to 100% (b), which consequently set all fault probabilities to 0% (c). At the transition to the decreasing state at approx. 9h, the unique present state vector became Sui = [−3], which maximized the probability of the decreasing pattern to 100% (b). The probability of the composite pattern increased to 33%, since the composite pattern model M32 contains a decreasing state (Figure 5.2). The probability of the decreasing pattern was used to estimate a 100% probability of chemical contamination (c), as suggested by combination C22 in the FMT (Table 5.5).

Soon after 12.5h a stabilizing trend was indicated (a) and the unique state vector became Sui = [−3, −1]. The composite pattern probability increased and the decreasing pattern probability decreased. The decreasing pattern model M23 contained both of the states in Sui (Figure 5.2) and therefore prevailed over the other models with an estimated probability of 67% (b). The final fault probability decreased in accordance with the decreasing pattern probability, which is the main factor in estimating the chemical contamination probability.

When the CDA trend became stable at approx. 16.5h (a), the unique state vector became Sui = [0], as the fault pattern evolution had ended. All pattern probabilities dropped to 0%, except for the stable pattern, which returned to its initial 100%.

The results indicate that the proposed approach correctly identified the chemical contamination fault with a delay of 7 hours after the contamination event, which occurred at approx. 1h. Furthermore, the chemical contamination probability was non-zero throughout the fault signature of approx. 7h, while the other fault probabilities remained zero.
6.3 Gear pitting experiment under non-stationary load
The gear pitting experiment was conducted under time-varying load conditions to investigate the influence of load variation on the onset of pitting. The motor speed was held constant at 1296 rpm, corresponding to 92% of the motor's nominal speed. The objective of the pitting test was to induce macro-pitting on the flanks of the spur gears. Results from the qualitative trend analysis of wear debris, presented in section 3.2.4, are summarized in this section and used for estimating the probabilities of the pitting-indicative patterns defined by the FMT (Table 5.5). Furthermore, results from vibration analysis and impact detection, partially presented in section 4.3.2, are used for estimating the final probability of impacts. The final step fuses the partial pitting-indicative probabilities from vibration and oil analysis into the final fault probability.
6.3.1 Oil analysis
Figures 6.3 and 6.4 summarize the results of the CDA qualitative trend analysis of wear debris with diameter smaller and larger than 100 µm, respectively (a), and the probabilities of fault-indicative patterns based on the CDA qualitative trend (b). The initially stable qualitative trend of small particles (a) raised the probability of the stable pattern to 100% (b), which consequently set all fault probabilities to 0% (c).
Figure 6.3: Results from pitting experiment: (a) Measurements and trend classification on small ferrous particles count (D < 100 µm), (b) fault-indicative pattern probabilities
6.3.2 Vibration analysis
The results of vibration processing for impact detection in the gear pitting experiment were presented in section 4.3.2. In the impact detection charts, presented for fault-free and faulty conditions (Figure 4.36), each impact was classified independently as detected (1, red) or not detected (0, green). Evidently, several missed or false detections are present, which may influence the estimated probability of impacts. To reduce missed and false alarms, the impact probability was estimated individually for each tooth. The probability of impacts is then obtained as the maximum among all teeth, since the objective is to evaluate the degree of damage of a single tooth. The final probability of the pitting fault is obtained by fusing the oil- and vibration-based results, as defined by the FMT.

Figure 6.5: Impact detection: (a) fault-free, (b) gear pitting

The impact detection results were subjected to the grouping procedure (section 5.2), and the grouped decision was obtained by the majority rule. Grouping was performed with a sliding non-overlapping window of 3 periods and a majority rule threshold of 0.5. Figure 6.5 shows the results of grouping and majority-rule decision making for the fault-free (a) and faulty (b) cases. The grouping approach reduced the number of false and missed detections in both cases. After grouping, the probability of impacts was estimated for each tooth by eq. (5.7), with the objective of identifying the degree of damage and the angular location of the defective tooth. In the fault-free case (a), the probability of impacts was estimated as 14%, which is a consequence of false alarms not eliminated by the grouping process, as a relatively low number of periods was available. In the faulty case (b), the results show that not one but at least 2 teeth were damaged, with the maximum impact probability equal to 71%.
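The grouping and majority-rule step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the window of 3 periods and the threshold of 0.5 come from the text, while the strict inequality, the handling of incomplete trailing windows, the detection chart and the function names are hypothetical. Eq. (5.7) is not reproduced in this section, so the per-tooth probability is assumed here to be the fraction of positive grouped decisions, which may differ from the actual definition.

```python
def group_by_majority(detections, periods_per_group=3, threshold=0.5):
    """Collapse per-period detections (0/1) of one tooth into grouped decisions.

    The sequence is cut into consecutive non-overlapping windows. A window is
    declared positive when the fraction of detected impacts exceeds the
    threshold. Trailing periods that do not fill a whole window are dropped.
    """
    n_groups = len(detections) // periods_per_group
    grouped = []
    for g in range(n_groups):
        window = detections[g * periods_per_group:(g + 1) * periods_per_group]
        grouped.append(1 if sum(window) / periods_per_group > threshold else 0)
    return grouped


def impact_probability(grouped):
    """Assumed stand-in for eq. (5.7): fraction of positive grouped decisions."""
    return sum(grouped) / len(grouped) if grouped else 0.0


# Hypothetical detection chart: one row per tooth, one column per shaft period.
chart = {
    "tooth_1": [0, 0, 1, 0, 0, 0, 0, 1, 0],  # isolated false alarms
    "tooth_2": [1, 1, 0, 1, 1, 1, 0, 1, 1],  # damaged tooth with missed detections
    "tooth_3": [0, 0, 0, 0, 0, 0, 0, 0, 0],
}

per_tooth = {
    tooth: impact_probability(group_by_majority(detections))
    for tooth, detections in chart.items()
}
overall = max(per_tooth.values())  # maximum among all teeth, as in the text
print(per_tooth, overall)
```

In this toy chart the isolated false alarms on the first tooth and the missed detections on the second tooth are both removed by the majority vote, which is the effect the grouping procedure is intended to have.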
6.3.3 Fault probability estimation by fusion of vibration and oil partial decisions
In order to estimate the final fault probability based on fusion of oil and vibration analyses, additional vibration data from several intervals during the experiment were analysed. The vibration analysis followed the steps described in section 4.2, which included spectral kurtosis (SK) based filtering of vibration periods and impact detection by k-NN classification. The impact probability was estimated by the grouping procedure, as shown in the previous section. Figure 6.6 shows the vibration segments selected for analysis, and the training and testing steps required by the k-means clustering and k-NN classification phases. The segments were selected from intervals of 100% load in order to maximize the energy of the impulses and their influence on the vibration response. From the first interval of 100% load (14h < T < 21h), a vibration segment was selected at 20h as the fault-free representative used for the training step of the anomaly detection procedure. Each of the following intervals of 100% load provided two vibration segments for testing, impact detection and probability estimation. Figure 6.7 summarizes the results obtained from oil analysis and pattern probability estimation, and presents them together with the vibration-based probability of impacts (a). In (b), the final fault probabilities are presented, obtained according to the FMT (Table 5.5).

Figure 6.6: The time-varying torque profile and temperature evolution with selected vibration segments for training and testing steps of impact detection

Figure 6.7: Results from pitting experiment: (a) Probabilities of fault-indicative oil patterns and vibration impacts, (b) The final fault probabilities

Initially, the partial probabilities from oil and vibration analyses (a) and the fault probabilities (b) were equal to 0%, which confirms fault-free conditions at the start of the experiment. The vibration-based probability was the first to increase, reaching 100% at approx. 62h (a), which was followed by an immediate increase in the gear damage probability to 50% (b). This was followed by a further increase to 100% soon after 65h, as the large particle probability increased to 100% (a).
The vibration-based probability suddenly dropped to 0% at approx. 70h (a), which suggests an absence of impacts during a relatively short period of 5 hours. This drop caused the final fault probability to decay by 50% (b). At the same time, the probability of metallic particle contamination increased to 50%, for two reasons: the particle-based probabilities, which are also indicative of metallic particle contamination (Table 5.5), were equal to 100%, and the vibration-based probability was equal to 0%. Such a combination directly reflects particle contamination, since a large amount of particles without impulsivity in the vibration response cannot simply be attributed to gear damage. As the probability based on small particles increased to 100% only 2h later, at 72h (a), the gear fault probability followed with an increase to 67%, and the particle contamination probability dropped to 0% (b). Once the fault was confirmed, the vibration-based probability and the fault probability remained equal to 100% from 75h until approx. 96h, when the particle-based probabilities settled at approx. 67% (a) and the fault probability decreased to approx. 82% (b). At 105h, when the vibration-based probability decreased to 30%, the gear fault probability followed with a decay to approx. 55%. As anticipated, the particle contamination probability increased to 20% (b), mainly due to the low probability of impacts (a).

The results show that a final fused decision regarding the condition of the gears was successfully obtained during the long-run pitting experiment. Vibration results were considered mainly at maximum load in order to maximize the energy of the fault-induced vibration signature. The unstable behaviour of the vibration-based probability (Figure 6.7), caused by the apparent absence or attenuation of impacts and the consequent low probability of impacts, led to unstable behaviour of the final fault probabilities, with several inaccurate detections of metallic particle contamination. However, the results based on oil wear debris analysis.
6.4 Bearing inner and outer race defects
The experiment was conducted on the bearing experimental rig (section 4.3.1), which allows bearings to be replaced without influencing the rest of the system. Three bearings were therefore tested under fault-free, inner race defective and outer race defective conditions. The results of the bearing vibration analysis were partially presented in sections 4.3.1.1 and 4.3.1.2, including spectral kurtosis based filtering and impact detection. The impact detection charts for the fault-free and the defective inner and outer race conditions are presented in Figures 4.22 and 4.30, respectively. Impact detection relied on k-nearest neighbours classification, which yielded detections of single impacts. To minimize missed and false alarms, the impact probability was estimated. As all impacts produced by the rolling elements are due to the same defect, the probability of impacts is estimated with the grouping procedure (section 5.2), and the final fault probabilities are considered equal to the probabilities of impacts, as defined by the FMT.

The impact detection results were subjected to the grouping procedure, and the grouped probability was estimated by the majority rule. For grouping, the number of periods was equal to 3 and the majority rule threshold to 0.5. Figures 6.8 and 6.9 show the results of grouping and majority-rule decision making for the inner and outer race defects, respectively.
Figure 6.8: Impact detection by grouping: (a) fault-free, (b) inner race defect

The grouping approach for the inner race defect (Figure 6.8), performed for fault-free (a) and faulty (b) conditions, reduced the number of false and missed detections compared to the impact detection charts (Figure 4.22). For the fault-free case, all grouped detections were negative, as the specificity (true negative rate) was equal to $TNR = \frac{TN}{TN + FP} = 100\%$, where the true negative count $TN = 34 \cdot 8 = 272$ and the false positive count $FP = 0$. In the defective case, the majority of detections proved to be positive, which confirms the presence of impacts. However, a small number of missed detections is evident, as the sensitivity (true positive rate) was equal to $TPR = \frac{TP}{TP + FN} = 92.5\%$, where $TP = 2554$ and $FN = 206$.

Figure 6.9: Impact detection by grouping: (a) fault-free, (b) outer race defect

In the case of the outer race defect (Figure 6.9), the grouping approach, performed for fault-free (a) and faulty (b) conditions, likewise reduced the number of false and missed detections compared to the impact detection charts (Figure 4.22). For the fault-free case, all grouped detections were negative, as the specificity (true negative rate) was equal to $TNR = \frac{TN}{TN + FP} = 100\%$, where the true negative count $TN = 21 \cdot 8 = 168$ and the false positive count $FP = 0$. In the defective case, the majority of detections proved to be positive, with a reduced number of missed detections. The sensitivity (true positive rate) was equal to $TPR = \frac{TP}{TP + FN} = 86.9\%$, where $TP = 146$ and $FN = 22$.
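The specificity and sensitivity values quoted above follow directly from the reported counts, as the short check below shows. The counts are taken from the text; the helper name `rate` and the dictionary layout are illustrative only.

```python
def rate(correct, incorrect):
    """Fraction of correct decisions: correct / (correct + incorrect)."""
    return correct / (correct + incorrect)


# Counts quoted in the text for the grouped detections.
inner = {"TN": 34 * 8, "FP": 0, "TP": 2554, "FN": 206}
outer = {"TN": 21 * 8, "FP": 0, "TP": 146, "FN": 22}

for name, counts in (("inner race", inner), ("outer race", outer)):
    tnr = rate(counts["TN"], counts["FP"])  # specificity, TN / (TN + FP)
    tpr = rate(counts["TP"], counts["FN"])  # sensitivity, TP / (TP + FN)
    print(f"{name}: TNR = {tnr:.1%}, TPR = {tpr:.1%}")
```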
It is evident that the grouping approach reduced the number of false and missed detections in the fault-free and faulty cases shown in Figures 6.8 and 6.9. After grouping, the probability of impacts in both cases was estimated by eq. (5.7). In the fault-free case (a), the probability of impacts was estimated as 14%, which is a consequence of false alarms not eliminated by the grouping process, as a relatively low number of periods was available. In the faulty case (b), the results show that not one but at least 2 teeth were damaged, with the maximum impact probability equal to 71%.
7 Conclusions

The thesis addresses the problem of integrating multiple condition monitoring (CM) approaches for reliable fault diagnosis of mechanical drives. Industrial gearbox components, which are regarded as the most critical, are considered in particular. The currently prevailing CM approaches rely on vibration and oil analysis techniques. The main novelties of the work are:

• Algorithms for analysis of oil properties, which include detection of trend change, association of the trend with qualitative states, and recognition of fault-related signatures.
• A vibration analysis approach for extraction of fault features and detection of fault-related impulses. The same approach is suitable for detection of gear and bearing faults.
• An approach for fusion of two partial decisions, oil and vibration based, into reliable fault detection and probability estimation. The incidence table, which represents the core of the decision fusion approach, offers flexibility by allowing integration of additional faults and of diagnostic features from vibration, oil or any other monitoring technique.

Trend change detection in oil properties is achieved by comparing the signal with predictions of the fault-free reference. The difference is treated as an error, which is accumulated into a cumulative sum (CUSUM). The results have shown that CUSUM reliably detects the occurrence of a trend change, i.e. the moment when the signal deviates from the reference. The nature of the change is then determined by qualitative trend analysis, which was shown to classify the signal with respect to the reference as stable, changing, etc. Analysis of time sequences of qualitative states successfully identified several fault-related signatures, hence providing a partial decision regarding the fault. The flexible approach to fault-signature recognition allows extension with additional signatures that may be expected in the monitored quantities.
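The CUSUM-based detection of a trend change summarized above can be sketched in a few lines. The sketch assumes a standard two-sided tabular CUSUM with a drift term and a fixed threshold, which may differ from the variant used in the thesis. The function name, the parameter values and the synthetic signal are all hypothetical.

```python
def cusum_change_point(signal, reference, drift=0.0, threshold=1.0):
    """Return the index of the first detected deviation from the reference, or None.

    The error is the difference between the signal and the fault-free
    reference prediction. Two one-sided sums accumulate positive and negative
    errors, are clipped at zero, and raise an alarm on crossing the threshold.
    """
    g_pos = 0.0
    g_neg = 0.0
    for k, (y, y_ref) in enumerate(zip(signal, reference)):
        error = y - y_ref
        g_pos = max(0.0, g_pos + error - drift)
        g_neg = max(0.0, g_neg - error - drift)
        if g_pos > threshold or g_neg > threshold:
            return k
    return None


# Hypothetical data: a flat reference and a signal that starts ramping at sample 20.
reference = [10.0] * 40
signal = [10.0] * 20 + [10.0 + 0.2 * (k + 1) for k in range(20)]

alarm = cusum_change_point(signal, reference, drift=0.05, threshold=1.0)
print(alarm)
```

The drift term keeps small fluctuations around the reference from accumulating, so the alarm index trails the true change by a few samples. The threshold and drift are the kind of parameters that have to be chosen in advance.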
Overall, the oil analysis approach offers a reliable tool for detecting trend changes and recognizing fault signatures.

The proposed vibration analysis approach uses spectral kurtosis (SK) to identify the frequency bands within the vibration signal that contain information about fault-induced impulse excitation. The vibration signal is first sliced into short segments, which are used for SK estimation and denoising filtering. SK is used to define the Wiener filter, which is able to extract the fault-related non-stationary component from background noise. The filter is adapted for each segment by the corresponding SK, because the fault-related frequency band varies during machine operation. Results from the vibration analysis of defective gears and bearings have shown that not all segments exhibit identical frequency bands, especially in the case of a moving fault such as a bearing inner race fault. The diagnostic features extracted from the filtered residual have proved to be a reliable source of information regarding the presence of fault-related impulsive excitation. Classification of the features, based on k-means and k-nearest neighbours, results in a decision of impact detected or impact not detected. The same approach is fully suitable for analysis of vibration signals acquired from gears or bearings, as their faults share a similar signature of impulsive nature.

Integration of the vibration and oil partial decisions regarding fault presence relies on the incidence table, which contains in-depth relations between faults and indicative features. The table combines fault probabilities obtained independently from several fault-indicative features into the final fault probability using a weighted averaging technique. The table, which includes 6 faults and 10 features (7 from oil and 3 from vibration analysis), can be flexibly extended with additional faults and features extracted from oil, vibration or any other condition monitoring approach. The final fault probability has increased reliability, as it is obtained from several independent sources of information describing the same event.

Experimental validation of the proposed approaches proved them to be useful for diagnosis of several faults of different nature.
The results from the water and chemical contamination experiments, both of which were successfully diagnosed, could be transferred to industrial practice, where such faults are not rare. The ability of the approach to offer a single vibration analysis tool for diagnosis of gear and bearing faults can also be beneficial to industry, where gearbox components are the most frequently used. The flexibility of the incidence table to accommodate additional faults and CM techniques may be exploited in other application fields concerned with fault diagnosis.
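The weighted averaging used by the incidence table to combine partial probabilities can be illustrated with a generic sketch. The feature names, weights and probabilities below are hypothetical and are not taken from the FMT (Table 5.5), so the example is not expected to reproduce the curves in Figure 6.7. It only shows the general form of a weighted average over the features that are indicative of one fault.

```python
def fuse(feature_probabilities, weights):
    """Weighted average of partial probabilities for a single fault.

    Only features listed in `weights` contribute. The weights are normalized
    by their sum so that the result stays between 0 and 1.
    """
    total = sum(weights.values())
    if total == 0:
        raise ValueError("at least one feature must be indicative of the fault")
    weighted = sum(w * feature_probabilities[name] for name, w in weights.items())
    return weighted / total


# Hypothetical incidence-table row for one fault (illustrative weights only).
gear_damage_weights = {
    "vibration_impacts": 2.0,
    "large_particles": 1.0,
    "small_particles": 1.0,
}

# Hypothetical partial probabilities delivered by the individual analyses.
partial = {
    "vibration_impacts": 1.0,
    "large_particles": 0.5,
    "small_particles": 0.0,
}

p_gear = fuse(partial, gear_damage_weights)
print(p_gear)
```

Adding a new fault or a new monitoring technique amounts to adding a row or a key to such a table, which reflects the flexibility emphasized above.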
7.1 Future Work
Several improvements could be considered in the future. The proposed approaches require a number of parameters to be set in advance, which have so far been estimated empirically and by expert knowledge. Trend change detection and qualitative analysis use parameters such as the length of the analysis window, CUSUM thresholds for change detection, stable and changing state thresholds, etc. Several methods for optimal parameter tuning exist in the literature and may offer a suitable solution to the parameter selection problem. Parameter tuning may also be adaptive, adjusting the initially established parameters online during machine operation.

For diagnosis of faults related to oil condition, additional sensors for online oil analysis may be considered. Primarily, viscosity measurements should be included in the incidence table to improve the reliability of detecting chemical contamination or oil aging. Additionally, acoustic emission measurements should be included in order to improve diagnosis of bearing faults, as acoustic emission analysis is one of the established techniques in the fault diagnosis field.

Despite the relatively large amount of experimental results used for validation of the proposed approaches, additional experimental work should be considered. An experiment with a naturally evolving bearing fault during machine operation would be of great importance for observing the performance of the algorithms online during fault progression. In addition, extension of the incidence table to faults related to bearing rolling elements should be considered.
99
100
Bibliography [1] Maintenance. Maintenance terminology (standard BS EN 13306:2010). Adopted European Standard, November 2010.
British-
[2] A.Muller, M.C.Suhner, and B.Iung. Maintenance anlternative integration to prognosis process engineering. Journal of Quality in Maintenance Engineering, 13:198–211, 2007. [3] C. Mechefske. Machine Condition Monitoring and Fault Diagnostics, pages 25–1– 25–35. CRC Press, 2012/11/19 2005. [4] T.H. Loutas, D. Roulias, E. Pauly, and V. Kostopoulos. The combined use of vibration, acoustic emission and oil debris on-line monitoring towards a more effective condition monitoring of rotating machinery. Mechanical Systems and Signal Processing, In Press, Corrected Proof:–, 2010. [5] T. M. Hunt. Condition monitoring of mechanical and hydraulic plant : a concise introduction and guide. Chapman & Hall, London; New York, 1996. [6] T. Tambouratzis and M. Antonopoulos-Domis. On-line signal trend identification. Annals of Nuclear Energy, 31(14):1541 – 1553, 2004. [7] Nikiforov I. V. Basseville M. Detection of abrupt changes: theory and application. Englewood Cliffs: Prentice Hall, 1993. [8] S. Charbonnier, C. Garcia-Beltan, C. Cadet, and S. Gentil. Trends extraction and analysis for complex system monitoring and decision support. Engineering Applications of Artificial Intelligence, 18(1):21 – 36, 2005. [9] N. Vaswani. The modified cusum algorithm for slow and drastic change detection in general hmms with unknown change parameters. In Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’05), volume 4, pages 701–704, 2005. [10] N. Vaswani. Additive change detection in nonlinear systems with unknown change parameters. In IEEE transactions on Signal Processing, volume 55, pages 859– 872, 2007.
101
[11] P.D. McFadden and J.D. Smith. Vibration monitoring of rolling element bearings by the high-frequency resonance technique — a review. Tribology International, 17(1):3 – 10, 1984. [12] N Tandon and A Choudhury. A review of vibration and acoustic measurement methods for the detection of defects in rolling element bearings. Tribology International, 32(8):469 – 480, 1999. [13] R. Randall and J. Antoni. Rolling element bearing diagnostics—a tutorial. Mechanical Systems and Signal Processing, 25(2):485–520, 2011. [14] PD McFadden. Detecting fatigue cracks in gears by amplitude and phase demodulation of the meshing vibration. ASME, Transactions, Journal of Vibration, Acoustics, Stress, and Reliability in Design, 108:165–170, 1986. [15] R.B.Randall. A new method of modeling gear faults. ASME Journal of Mechanical Design, 104:259–267, 1982. [16] W.Q. Wang, F. Ismail, and M. Farid Golnaraghi. Assessment of gear damage monitoring techniques using vibration measurements. Mechanical Systems and Signal Processing, 15(5):905–922, 2001. [17] F. Combet and L. Gelman. Optimal filtering of gear signals for early damage detection based on the spectral kurtosis. Mechanical Systems and Signal Processing, 23:652–668, 2009. [18] D.Brie, M.Tomczak, H.Oehlmann, and A.Richard. Gear crack detection by adaptive amplitude and phase modulation. Mechanical Systems and Signal Processing, 11(1):149–167, 1997. [19] SK Lee and PR White. The enhancement of impulsive noise and vibration signals for fault detection in rotating and reciprocating machinery. Journal of Sound and Vibration, 217(3):485–505, 1998. [20] WJ Wang and PD McFadden. Early detection of gear failure by vibration analysis i. calculation of the time-frequency distribution. Mechanical Systems and Signal Processing, 7(3):193–203, 1993. [21] B.D. Forrester. Advanced vibration analysis techniques for fault detection and diagnosis in geared transmission systems. PhD thesis, Swinburne University of Technology, 1996.
102
[22] FK Choy, V. Polyshchuk, JJ Zakrajsek, RF Handschuh, and DP Townsend. Analysis of the effects of surface pitting and wear on the vibration of a gear transmission system. Tribology International, 29(1):77–83, 1996. [23] SJ Loutridis. Instantaneous energy density as a feature for gear fault detection. Mechanical systems and signal processing, 20(5):1239–1253, 2006. [24] E.B. Halim, MAA Shoukat Choudhury, S.L. Shah, and M.J. Zuo. Time domain averaging across all scales: A novel method for detection of gearbox faults. Mechanical Systems and Signal Processing, 22(2):261–278, 2008. [25] G. Dalpiaz, A. Rivola, and R. Rubini. Effectiveness and sensitivity of vibration processing techniques for local fault detection in gears. Mechanical Systems and Signal Processing, 14(3):387–412, 2000. [26] J. Lin and MJ Zuo. Gearbox fault diagnosis using adaptive wavelet filter. Mechanical systems and signal processing, 17(6):1259–1269, 2003. [27] W. Wang. Early detection of gear tooth cracking using the resonance demodulation technique. Mechanical Systems and Signal Processing, 15(5):887–903, 2001. [28] T. Barszcz and R.B. Randall. Application of spectral kurtosis for detection of a tooth crack in the planetary gear of a wind turbine. Mechanical Systems and Signal Processing, 23(4):1352–1365, 2009. [29] R. B. Randall. Vibration-based Condition Monitoring: Industrial, Aerospace and Automotive Applications. John Wiley and Sons, 2011. [30] D. Ho and RB Randall. Optimisation of bearing diagnostic techniques using simulated and actual bearing fault signals. Mechanical systems and signal processing, 14(5):763–788, 2000. [31] J. Antoni. The spectral kurtosis: a useful tool for characterising non-stationary signals. Mechanical Systems and Signal Processing, 20(2):282 – 307, 2006. [32] J. Antoni and R.B. Randall. The spectral kurtosis: application to the vibratory surveillance and diagnostics of rotating machines. Mechanical Systems and Signal Processing, 20(2):308 – 331, 2006. [33] J. 
Antoni. Fast computation of the kurtogram for the detection of transient faults. Mechanical Systems and Signal Processing, 21(1):108 – 124, 2007.
103
[34] J. Antoni. Blind separation of vibration components: Principles and demonstrations. Mechanical Systems and Signal Processing, 19(6):1166 – 1180, 2005. Special Issue: Blind Source Separation Special Issue: Blind Source Separation. [35] N. Sawalhi, R.B. Randall, and H. Endo. The enhancement of fault detection and diagnosis in rolling element bearings using minimum entropy deconvolution combined with spectral kurtosis. Mechanical Systems and Signal Processing, 21:2616–2633, 2007. [36] W. Guo, P.W. Tse, and A. Djordjevich. Faulty bearing signal recovery from large noise using a hybrid method based on spectral kurtosis and ensemble empirical mode decomposition. Measurement, 2012. [37] B. Eftekharnejad, MR Carrasco, B. Charnley, and D. Mba. The application of spectral kurtosis on acoustic emission and vibrations from a defective bearing. Mechanical Systems and Signal Processing, 25(1):266–284, 2011. [38] Y. Wang and M. Liang. An adaptive sk technique and its application for fault detection of rolling element bearings. Mechanical Systems and Signal Processing, 25(5):1750–1764, 2011. [39] D.P. Anderson. Wear particle atlas. revised. pages 92–163, 1982. [40] G. E. Newell. Oil analysis cost-effective machine condition monitoring technique. Industrial Lubrication and Tribology, 51(3):119 – 124, 1999. [41] J.S. Stecki J. Mathew. Comparison of vibration and direct reading ferrographic techniques in application to high-speed gears operating under steady and varying load conditions. Journal of Society of Tribologists and Lubrication Engineers, 43:646– 653, 1987. [42] B. Johnson H. Maxwell. Vibration and lube oil analysis in an integrated predictive maintenance program. Proceedings of the 21st Annual Meeting of the Vibration Institute, pages 117–124, 1997. [43] M. Williamson D.D. Troyer. Effective integration of vibration analysis and oil analysis. pages 411–420, University College of Swansea, Swansea, UK, 21–25 March 1999. [44] J.D. Kozlowski C.S. Byington, T.A. Merdes. 
Fusion techniques for vibration and oil debris/quality in gearbox failure testing. In Proceedings of the International
104
Conference on Condition Monitoring, pages 113–128, Swansea, UK, 21–25 March 1999. University College of Swansea. [45] D. L. Hall and J. Llinas. Handbook of Multisensor Data Fusion. CRC Press, June 2001. [46] X. Fan and M. J. Zuo. Fault diagnosis of machines based on d-s evidence theory. part 1: D-s evidence theory and its improvement. Pattern Recognition Letters, 27(5):366 – 376, 2006. [47] C. R. Parikh, M. J. Pont, and N. B. Jones. Application of dempster-shafer theory in condition monitoring applications: a case study. Pattern Recognition Letters, 22(6-7):777 – 785, 2001. [48] IW Mayes. Use of neutral networks for on-line vibration monitoring. ARCHIVE: Proceedings of the Institution of Mechanical Engineers, Part A: Journal of Power and Energy 1990-1996 (vols 204-210), 208(41):267–274, 1994. [49] C.S. Leem, DA Dornfeld, and SE Dreyfus. A customized neural network for sensor fusion in on-line monitoring of cutting tool wear. Journal of engineering for industry, 117(2):152–159, 1995. [50] Q. Liu and H.P. Wang. A case study on multisensor data fusion for imbalance diagnosis of rotating machinery. AI EDAM, 15:203–210, 2001. [51] P.K. Varshney. Multisensor data fusion. Electronics Communication Engineering Journal, 9(6):245 –253, December 1997. [52] B.S. Yang and K. J. Kim. Application of dempster-shafer theory in fault diagnosis of induction motors using vibration and current signals. Mechanical Systems and Signal Processing, 20(2):403 – 420, 2006. [53] H.Y. Guo. Structural damage detection using information fusion technique. Mechanical Systems and Signal Processing, 20(5):1173 – 1188, 2006. [54] A. Rakar and D. Juricic. Diagnostic reasoning under conflicting data: the application of the transferable belief model. Journal of Process Control, 12(1):55–67, 2002. [55] J. Vižintin. 
Gonila in pogonski sklopi: gonila s stalnim prestavnim razmerjem, gonila z ročno in avtomatsko nastavljivim prestavnim razmerjem, gonila z avtomatsko nastavljivim prestavnim razmerjem - CVT-gonila, kombinirani pogoni, zadnji pogonski sistem - diferencial, poškodbe zobnikov, maziva. Slovensko društvo za tribologijo, 2012.
105
[56] J. J. Zakrajsek, D.P. Townsend, and H.J. Decker. An analysis of gear fault detection methods as applied to pitting fatigue failure data. National Aeronautics and Space Administration ; US Army Aviation Systems Command ; National Technical information Service, distributor, Washington, DC, 1993. [57] G. Fajdiga, S. Glodež, and J. Kramar. Pitting formation due to surface and subsurface initiated fatigue crack growth in contacting mechanical elements. Wear, 262:1217–1224, 2007. [58] N. Sawalhi and R. Randall. Simulating gear and bearing interactions in the presence of faults part i. the combined gear bearing dynamic model and the simulation of localised bearing faults. Mechanical Systems and Signal Processing, 22:1924–1951, 2008. [59] E. R. Booser. Handbook of Lubrication, Theory and Practice of Tribology – Vol. I, Application and Maintenance, volume 1. CRC Press, Inc., 1983. [60] V. Chandrasekaran D.W. Hoeppner and A.H.H. Taylor. Review of pitting corrosion fatigue models. 20th ICAF Symposium, 1999. [61] E. R. Booser. Handbook of Lubrication, Theory and Practice of Tribology – Vol. III, Application and Maintenance, volume 3. CRC Press, Inc., 1994. [62] KEW Engineering. Oil aging and degradation: Why do i need to change the oil?, September 2012. [63] L.A. Toms. Machinery Oil Analysis: Methods, Automation & Benefits : a Guide for Maintenance Managers & Supervisors. L. A. Toms, 1995. [64] H.S. Ahn, E.S. Yoon, D.G. Sohn, O.K. Kwon, K.S. Shin, and C.H. Nam. Practical contaminant analysis of lubricating oil in a steam turbine-generator. Tribology International, 29(2):161 – 168, 1996. [65] T.B. Kirk. Numerical characterisation of wear debris for machine condition monitoring. Perth, Australia, December 1994. [66] Hilbert D. Grundzuge einer allgemeinen Theorie der linearen Integralgleichungen. Chelsea Pub. Co., New York, 1953. [67] A. K Jain, M. N. Murty, and P.J. Flynn. Data clustering: a review. ACM computing surveys (CSUR), 31(3):264–323, 1999.
[68] L. Gelman and I. Petrunin. Novel anomaly detection technique based on the nearest neighbour and sequential methods. Insight - Non-Destructive Testing and Condition Monitoring, 54(8):433–435, 2012.
[69] J. Vizintin, M. Kambic, I. Lipuscek, and V. Hudnik. Application of wear particle analysis to condition monitoring of rotating machinery in iron and steel works. Lubrication Engineering, 51(5):389–393, 1995.
[70] B. Krzan and J. Vizintin. On-line wear and lubricant condition monitoring. In Proceedings of the 5th International Conference on Condition Monitoring and Machinery Failure Prevention Technologies, Edinburgh, Scotland, UK, 2008.
[71] G. Persin, J. Salgueiro, J. Vizintin, and E. Juricic. A system for automated online oil analysis. Insight - Non-Destructive Testing and Condition Monitoring, 54(8):428–432, 2012.
[72] C. S. Byington and D. C. Schalcosky. Advances in real time oil analysis. Machinery Lubrication, 2000.
Biography
Gabrijel Peršin was born in 1983 in Ljubljana. After graduating from high school in 2002, he enrolled in the Faculty of Electrical Engineering in Ljubljana. He completed his undergraduate studies in 2009 with a thesis entitled Application of equalisation method to PID tuning, under the supervision of Prof. Dr. Damir Vrančič and Prof. Dr. Gregor Klančar. In the same year he was employed as a Young Researcher at the Faculty of Mechanical Engineering in Ljubljana, Center for Tribology and Technical Diagnostics, where he began his postgraduate studies in technical diagnostics, vibration analysis and oil analysis under the supervision of Prof. Dr. Jože Vižintin and Prof. Dr. Đani Juričić. He was involved in projects related to the diagnosis of mechanical drives, primarily solving problems associated with damage to bearings and gears. In 2012 he began working as a researcher at Cranfield University (UK), where he continues to research vibration analysis techniques for the diagnosis of wind turbines. The central topic of his research is structural damage of wind turbine blades, studied with methods of higher-order spectra.
Biography (Slovenian version)
Gabrijel Peršin was born in 1983 in Ljubljana. After finishing grammar school in 2002 he enrolled at the Faculty of Electrical Engineering in Ljubljana and completed his studies in 2009 with a diploma thesis entitled Application of the equalisation method to PID controller tuning, supervised by Prof. Dr. Damir Vrančič and co-supervised by Prof. Dr. Gregor Klančar. In the same year he took up a post as a young researcher at the Faculty of Mechanical Engineering in Ljubljana. At the Centre for Tribology and Technical Diagnostics, as part of his doctoral studies under the supervision of Prof. Dr. Jože Vižintin, he continues his research in technical diagnostics and in vibration and oil analysis. He has worked on projects related to the diagnostics of mechanical drives, dealing mainly with problems of bearing and gear damage. In 2012 he took up a post as a researcher at Cranfield University (UK), where he continues to research vibration analysis techniques for diagnosing damage to wind turbines. The central research topic is structural damage of blades, studied with methods of higher-order spectral analysis.
Declaration of independence in conducting research
This doctoral thesis presents the results of my own scientific research, carried out in collaboration with my supervisor Prof. Dr. Jože Vižintin and co-supervisor Prof. Dr. Đani Juričić.
Gabrijel Peršin
Author's declaration of independent research (Slovenian version)
The doctoral thesis presents the results of my own work, based on collaboration with my supervisor Prof. Dr. Jože Vižintin and co-supervisor Prof. Dr. Đani Juričić.
Gabrijel Peršin
INTRODUCTION
Mechanical drives are the most critical mechanical component and one of the most frequent causes of machine failure, because gears and bearings are exposed to damage from forces transmitted through the relatively small tribological contacts of tooth pairs and of the moving elements of bearings. The material at these contacts is therefore subjected to high loads and friction. Bearings support the shafts and allow free rotation while forces are transmitted between components. The failure of an element in a mechanical drive can halt an entire production line, cause direct and indirect maintenance costs, or even endanger human lives.
The maintenance terminology standard [6] defines maintenance as the process of keeping a system in an operational state by preventing and removing defective states. The ARTEMIS report estimates that direct maintenance costs in the European Union amount to between four and eight percent of sales revenue [7], and that as much as 30 to 50 percent of these costs arise from ineffective maintenance programmes. The currently prevailing strategies of reactive and preventive maintenance are outdated and should be replaced by more cost-effective solutions based on condition monitoring and on advanced diagnostics, prognostics and machine health management (the predictive maintenance strategy).
Predictive maintenance based on condition monitoring (condition-based maintenance) is an advanced form of preventive maintenance whose aim is to ensure uninterrupted operation up to the moment when a component begins to deteriorate and the probability of failure increases. Under this strategy, maintenance interventions are carried out at more or less regular intervals determined by the condition monitoring procedure. With the exception of catastrophic failures, which are sudden and mean a complete loss of functionality, 99 percent of mechanical failures pass through a characteristic initial phase. This means that noticeable indicators warning of the coming failure appear before the failure occurs.
The task of condition monitoring is to detect a developing failure in time, to uncover its causes and, if possible, to predict its future development. This strategy leaves enough time before the failure finally occurs to carry out maintenance effectively. This concept is a key element of the emerging discipline of prognostics and health management (PHM), which forms the basis of predictive maintenance [8].
Condition monitoring is a general term for the procedure of detecting and isolating faults with the aim of a comprehensive assessment of the operational state of a system. The "manual" (offline) approach to measurement is increasingly being replaced by a continuous and fully automated condition monitoring process based on advanced real-time measurements of lubricant condition, vibration, and acoustic and sound emissions. While the machine is running, oil circulates continuously between the gearbox and a system of sensors that, at regular intervals, measure temperature, moisture and dielectric constant and check for the presence of wear particles. In addition, vibration sensors are installed at key locations and likewise take measurements in real time.
The first step in data processing is the acquisition of key features from each individual sensor and the fusion of measurement data, whose main purpose is to eliminate faulty sensors or erroneous measurements. Because the system contains several identical sensors measuring the same variable, the statistical method of mean and variance can be used to detect potential outliers, reduce noise and so on. The raw or fused measurements from the first step are then sent to the feature derivation module, which computes variables characteristic of the fault. Once the features have been derived, their fusion combines redundant data from equivalent sensors or two types of data from different sensors. When several vibration sensors are used, fusion serves to increase the quality of the diagnostic information. It is also possible to fuse vibration features with data on the number of particles in the lubricant, which increases the sensitivity of detecting gear pitting [9]. In this data association step, the features are evaluated and the feature values are sorted into fault-specific classes. Fault detection is the step that applies statistical classification methods or classification with neural networks to estimate the probability of a fault. The more features indicate the same fault, the more accurate and reliable the probability estimate. The last step in fault diagnosis is the integration of the partial probabilities, which are fused to obtain an assessment of the overall system state and a final judgement on the presence of a fault.
To increase the general applicability of the proposed algorithms, we propose their extension to multi-sensor systems and the integration of various additional techniques. The result of these integrations is a real-time diagnostic system for mechanical drives that can combine results into an overall assessment of system state on the basis of lubricant and vibration analysis. The main goals of the dissertation are:
• A new fault detection algorithm based on recognising fault-specific patterns in lubricant parameters. The algorithm detects changes in lubricant parameters by means of the cumulative sum (CUSUM). When a change is detected, the trend is labelled with a qualitative value, e.g. stable, increasing, decreasing, unchanged, etc.
• A new vibration processing algorithm for detecting the impacts produced by faults on gears and bearings. This algorithm uses spectral kurtosis together with optimal denoising (Wiener) and filtering techniques to extract non-stationary, fault-related components from short vibration periods. The prefiltered signal allows faults to be detected by k-means clustering and k-nearest neighbours. The algorithm also applies the majority voting rule for the final estimate of the probability that typical gear and bearing faults have occurred.
• A newly developed algorithm for fault identification and system state assessment based on the integration of vibration and lubricant analysis. The algorithm is based on decision fusion and on weighted averaging of the probabilities obtained from lubricant and vibration analyses into an assessment of the overall system state. The algorithm uses an incidence table to establish the relations between faults and the lubricant and vibration features. The result of the algorithm is a computed probability level for each individual fault.
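The first fusion step described in the introduction, rejecting faulty readings among several identical sensors by means of mean and variance, might look roughly like the sketch below. It is only an illustration under stated assumptions: the leave-one-out comparison, the function name `fuse_redundant`, and the parameters `k` and `noise_floor` are hypothetical and not taken from the thesis.

```python
from statistics import mean, pstdev

def fuse_redundant(readings, k=3.0, noise_floor=0.05):
    """Fuse readings of one variable taken by several identical sensors.

    A reading is rejected when it lies more than k standard deviations from
    the mean of the OTHER sensors (leave-one-out is an assumption made here
    so that one bad sensor does not inflate its own reference statistics).
    noise_floor is an assumed sensor resolution that keeps the test from
    firing on negligible differences. Returns (fused_value, rejected_indices).
    """
    if len(readings) < 3:
        # Too few sensors to judge outliers; fall back to the plain mean.
        return mean(readings), []
    rejected = []
    for i, x in enumerate(readings):
        others = readings[:i] + readings[i + 1:]
        sigma = max(pstdev(others), noise_floor)
        if abs(x - mean(others)) > k * sigma:
            rejected.append(i)
    kept = [x for i, x in enumerate(readings) if i not in rejected]
    if not kept:
        # Degenerate case: everything was flagged, so trust nothing in particular.
        return mean(readings), []
    return mean(kept), rejected

# Four hypothetical oil-temperature sensors, the last one faulty.
temps = [40.1, 39.9, 40.0, 55.0]
fused, bad = fuse_redundant(temps)
```

Here the fourth reading is rejected and the fused value is the mean of the remaining three. With only a handful of sensors a plain z-score on the full set can never exceed a small bound, which is why the sketch compares each reading against the others instead.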
SUMMARY
Oil analysis
Detecting changes in the trends of lubricant parameters is very important where zero tolerance to faults must be ensured, for example when monitoring the condition of a helicopter gearbox. An algorithm for the automated detection of changes in the trends of lubricant parameters is presented below. The core change detection algorithm (CDA) is based on the cumulative sum of errors (CUSUM), a sequential analysis method for detecting changes in the time series under consideration. The algorithm assumes that the time series is the outcome of a zero-mean Gaussian process [12]. At each time step, the cumulative sum of past values is computed and checked against a threshold. If the threshold is exceeded, this is a clear sign that the statistical properties of the Gaussian process have changed in the intervening period. This concept forms the basis of the trend extraction algorithm presented in [13]. That method uses linear prediction models and the standard deviation to form the input values of the CUSUM algorithm. The decision procedure for the qualitative assessment is based on comparing data segments and their trend curves. The change detection algorithm was developed as an extension of the approach presented in [13]; it is capable not only of detecting changes in trends but also of classifying signals into several qualitative classes, such as increasing, decreasing, stabilising, etc. (Figure 1).
Figure 1: Schematic of the oil analysis approach
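As a rough illustration of CUSUM-based trend change detection, the sketch below fits a straight line to a reference window, extrapolates it over the current window, and accumulates the standardised prediction errors in a two-sided CUSUM. It is a minimal sketch, not the thesis implementation: the names `fit_line` and `detect_trend_change`, the `drift` and `threshold` tuning values, and the choice to standardise by the residual spread of the reference window are all assumptions.

```python
from statistics import mean, pstdev

def fit_line(y):
    """Least-squares line through (0, y[0]), (1, y[1]), ...; returns (slope, intercept)."""
    n = len(y)
    x_bar, y_bar = (n - 1) / 2, mean(y)
    sxx = sum((i - x_bar) ** 2 for i in range(n))
    sxy = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

def detect_trend_change(reference, current, drift=0.5, threshold=5.0):
    """Return (index_in_current, label) of the first detected change, or (None, "unchanged").

    drift and threshold are assumed tuning values, expressed in units of the
    residual standard deviation of the reference window.
    """
    slope, intercept = fit_line(reference)
    resid = [v - (intercept + slope * i) for i, v in enumerate(reference)]
    sigma = max(pstdev(resid), 1e-9)      # guard against a perfectly linear reference
    g_up = g_down = 0.0
    n_ref = len(reference)
    for j, v in enumerate(current):
        predicted = intercept + slope * (n_ref + j)   # extrapolated reference trend
        z = (v - predicted) / sigma                   # standardised prediction error
        g_up = max(0.0, g_up + z - drift)             # accumulates upward departures
        g_down = max(0.0, g_down - z - drift)         # accumulates downward departures
        if g_up > threshold:
            return j, "increasing"
        if g_down > threshold:
            return j, "decreasing"
    return None, "unchanged"

# Hypothetical oil parameter: flat reference with a small wobble, then a ramp.
reference = [10.0 + 0.1 * (-1) ** i for i in range(20)]
rising = [10.1, 9.9, 10.1, 9.9, 10.1, 10.3, 10.5, 10.7, 10.9]
change_at, label = detect_trend_change(reference, rising)
```

On this toy series the wobble alone never accumulates past the threshold, while the ramp is flagged two samples after it starts. The labels here cover only three of the qualitative classes named in the text; classes such as "stabilising" would need a comparison of slopes between windows, which this sketch does not attempt.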
The change detection algorithm compares two data windows, a reference window and a current window, which define the reference vector and the current vector. The reference vector contains samples of the time series from before the change and is kept as the reference while the time series is being checked for a change. After both data vectors have been normalised, a linear regression is computed and used to predict future values. The difference between the predicted and the actual values of the time series is the deviation used to compute the cumulative sum (CUSUM). CUSUM expresses the degree of difference between the data defined by the two windows. When the CUSUM value exceeds the detection threshold, the reference curve is considered to no longer represent the current curve, which means that a change in trend has been detected. The detection of a change is followed by classification of the current trend pattern relative to the reference data. Its qualitative values can be increasing, decreasing, unchanged, stabilising, etc.
Vibration analysis
Damaged gears and bearings increase vibration by additionally exciting the resonant frequencies of the structure. The proposed method is based on spectral kurtosis and an optimal denoising (Wiener) filter, which allows fault-induced impacts to be separated from background vibration noise (Figure 2).
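To make the role of spectral kurtosis concrete, the sketch below estimates it from non-overlapping short-time spectra with the common estimator SK(f) = ⟨|X(f)|⁴⟩ / ⟨|X(f)|²⟩² − 2 (valid for bins other than DC and Nyquist) and derives a Wiener-like gain from it. This is an illustration under assumptions rather than the method of the thesis: the naive DFT, the function names, and the gain taken proportional to the square root of SK above a threshold are all choices made here.

```python
import cmath
import math

def dft(frame):
    """Naive DFT of a real frame, bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
            for k in range(n // 2 + 1)]

def spectral_kurtosis(signal, frame_len):
    """Estimate SK per frequency bin from non-overlapping frames."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    spectra = [dft(f) for f in frames]
    sk = []
    for k in range(frame_len // 2 + 1):
        p2 = [abs(s[k]) ** 2 for s in spectra]        # power in bin k, frame by frame
        m2 = sum(p2) / len(p2)
        m4 = sum(p * p for p in p2) / len(p2)
        # Bins with no energy carry no information; report 0 for them.
        sk.append(m4 / (m2 * m2) - 2.0 if m2 > 1e-12 else 0.0)
    return sk

def denoising_gain(sk, threshold=0.0):
    """Assumed Wiener-like gain: sqrt(SK / max SK) above the threshold, else 0."""
    peak = max(sk)
    if peak <= threshold:
        return [0.0] * len(sk)
    return [math.sqrt(v / peak) if v > threshold else 0.0 for v in sk]

# Toy signal: a steady tone in bin 2 plus a burst in bin 5 present in one frame of eight.
N, M = 16, 8
signal = []
for m in range(M):
    for t in range(N):
        x = math.sin(2 * math.pi * 2 * t / N)
        if m == 3:
            x += math.sin(2 * math.pi * 5 * t / N)
        signal.append(x)
sk = spectral_kurtosis(signal, N)      # sk[2] is close to -1, sk[5] close to 6
gain = denoising_gain(sk)
```

The steady tone yields a negative SK and is suppressed, while the transient component yields a large positive SK and is passed, which is the behaviour that lets impulsive, fault-related content be separated from stationary background.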
Figure 2: Schematic of the vibration analysis procedure
Figure 2 shows a schematic of the proposed method. The input signal is the random component of the vibration signal, converted to the angular domain by angular resampling. The signal can also be prefiltered to remove the low shaft frequencies and other periodic components. The first step is the segmentation of the vibration signal into rotational segments, where the rotation is either the rotation of the outer ring (for detecting outer-ring faults) or the relative rotation between the shaft and the housing (for detecting inner-ring faults). For gears, the segmentation is performed for each individual shaft rotation. Each vibration segment is evaluated independently by means of spectral kurtosis and filtering. The filtered signal, called the SK residual, is used to estimate the envelope, which reflects the strength of the impulses. The procedure concludes with impact detection using k-means clustering and k-nearest-neighbour classification.
Fault diagnosis by integrating vibration and oil analysis
In the final steps, real faults must be separated from apparent ones, which is achieved by basing fault detection and isolation on data fusion at the decision level (Figure 3).
Figure 3: Schematic of fault diagnosis by integrating vibration and oil analysis
The change detection algorithm for oil parameters detects a change in the trend of a signal and associates it with qualitative states such as stable, stabilising, increasing, decreasing, etc. Qualitative trend analysis, in which two curves, the reference and the current one, are compared, makes it possible to determine the nature of the change immediately. This chapter defines the fault-specific patterns in oil parameters, which are identified by comparing the evolution of the qualitative states of a quantity with models of fault development. The result of the comparison is an estimated probability of occurrence for each of the possible faults.
The vibration analysis derives the diagnostic feature from the squared envelope and performs fault detection with the k-means and k-nearest-neighbour methods. The result is a binary table that specifies the number of impulse responses in the analysed vibration signal that are caused by the fault. This table is used for grouping and for eliminating false alarms and missed detections.
The final decision on the fault and its identification is made on the basis of the oil parameters and the probabilities of vibration impacts, so that a final probability is obtained for each fault defined in the incidence table. The probabilities from oil parameters and from impacts are combined during the evaluation of the incidence table, which contains the relations between the possible faults and their characteristic patterns. In the incidence table, the rows represent faults and the columns represent the oil-based and vibration-based probabilities of the patterns observed when faults occur. Each fault can comprise one or more possible combinations of fault types, where each combination can represent the presence of the same fault, depending on the mechanisms by which the fault develops. Each combination also contains weights, assigned values from zero to 100%. The function of a weight is to express the degree to which a given value begins to indicate the presence of the fault. The partial probabilities obtained from oil and vibration analyses are combined by the weighted average method, which yields an estimate of the final probabilities for all faults included in the incidence table.
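The weighted-average evaluation of an incidence table described above can be sketched as follows. The table contents, the feature names and the rule for choosing between several combinations of the same fault (the maximum is taken here) are assumptions made purely for illustration; the thesis table covers more faults and features and may combine them differently.

```python
def fault_probabilities(incidence, feature_probs):
    """Evaluate an incidence table by weighted averaging.

    incidence maps each fault to a list of combinations; every combination
    maps feature names to weights in [0, 1]. feature_probs holds the partial
    probabilities from oil and vibration analysis. When a fault has several
    combinations, the best-supported one is used (an assumption).
    """
    result = {}
    for fault, combinations in incidence.items():
        scores = []
        for weights in combinations:
            total = sum(weights.values())
            weighted = sum(w * feature_probs.get(name, 0.0)
                           for name, w in weights.items())
            scores.append(weighted / total)
        result[fault] = max(scores)
    return result

# Hypothetical two-fault table; names and weights are illustrative only.
incidence = {
    "water ingress": [{"moisture_rising": 1.0, "dielectric_rising": 0.5}],
    "gear pitting": [
        {"particles_rising": 1.0, "gear_impacts": 1.0},
        {"gear_impacts": 1.0},
    ],
}
feature_probs = {
    "moisture_rising": 0.9,     # from oil analysis
    "dielectric_rising": 0.6,   # from oil analysis
    "particles_rising": 0.2,    # from oil analysis
    "gear_impacts": 0.7,        # from vibration analysis
}
final = fault_probabilities(incidence, feature_probs)
```

With these numbers the water ingress score is (1.0·0.9 + 0.5·0.6) / 1.5 = 0.8, and gear pitting takes the vibration-only combination at 0.7 rather than the mixed one at 0.45. Adding a fault or a feature only requires a new row or a new key, which mirrors the flexibility the text attributes to the incidence table.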
CONCLUSIONS
The thesis addresses the problem of integrating several condition monitoring approaches with the aim of reliable fault diagnosis of mechanical drives. Particular attention is paid to the components of industrial mechanical drives that are considered the most critical. The currently prevailing approaches to condition monitoring rely on vibration and oil analysis techniques, so this thesis proposes integrating vibration and oil features into a final fault probability. The main innovations presented are:
• The development of algorithms for the analysis of oil parameters, comprising the detection of trend changes, the association of trends with qualitative states and the recognition of fault-specific patterns.
• The development of a vibration analysis method for extracting fault features and detecting fault-specific impulses. The method is suitable for detecting faults on both gears and bearings.
• A proposed approach to fusing two partial decisions, one based on oil and one on vibration, into reliable fault detection and an estimate of fault probability. The incidence table, which is the central element of the decision fusion method, offers great flexibility, since it allows the integration of additional faults and of diagnostic features obtained from vibration, from oil parameters or with any other measurement technique.
Trend changes in oil parameters are detected by comparing the signal with predictions of the fault-free reference signal. The difference between the two is the error from which the cumulative sum (CUSUM) is formed. The results showed that CUSUM reliably detects trend changes, i.e. the moment at which the signal departs from the reference. The nature of the change is confirmed by qualitative trend analysis, which was shown to classify the signal relative to the reference into classes such as stable, varying, etc. The analysis of time sequences of qualitative states successfully uncovered several fault-specific patterns, which yielded a partial decision about the fault. The flexibility of the approach to detecting fault-specific patterns allows it to be extended with additional patterns that can be expected in the measured values. The approach therefore offers a reliable tool for detecting trend changes and recognising fault patterns.
The proposed vibration analysis technique uses spectral kurtosis (SK) to identify the frequency bands within the vibration signal that contain information about impulse excitation caused by a fault. The vibration signal is first divided into short segments, which are used for estimating the spectral kurtosis and for denoising. The spectral kurtosis is used to define a Wiener filter with which non-stationary, fault-related components can be separated from the background noise. The filter is adapted for each segment separately so that it reflects the corresponding spectral kurtosis. This is necessary because the fault-related frequency band fluctuates while the machine is operating. The results of analyses of damaged gears and bearings showed that the segments do not exhibit identical frequency bands, especially where the fault moves (i.e. for inner-ring faults). The diagnostic features extracted from the filtered signal proved to be a reliable source of information about the presence of fault-excited impulses. Feature classification based on k-means and k-nearest neighbours leads to one of two decisions: impact detected or impact not detected. The same approach is fully suitable for analysing vibration signals from both gears and bearings, since their faults produce mutually similar patterns of an impulsive nature.
The integration of the partial decisions from vibration and oil about the presence of a fault is based on an incidence table that contains in-depth relations between faults and the associated features.
The table combines several fault probabilities, obtained independently from several fault-related features, into a final fault probability by means of the weighted average method. The table covers six faults and ten features (seven from oil analysis and three from vibration analysis) and allows the flexible addition of any fault and any feature extracted from the monitoring of oil, vibration or any other condition. The final fault probability is more reliable because it is obtained from several independent sources of information describing the same event.
Experimental validation of the proposed approaches showed that they are useful for diagnosing different kinds of faults. The results of the experiments with water ingress and with chemical contamination, both of which were diagnosed successfully, showed that there is potential for implementation in industrial practice, where such faults are not uncommon. The fact that the approach offers a single tool for the vibration analysis of both gears and bearings could make it applicable in the industries in which gearbox components are most commonly used. The flexibility of the incidence table, which allows new faults and condition monitoring methods to be added, could prove useful in fields concerned with fault detection.