QUT Digital Repository: http://eprints.qut.edu.au

Zhang, Sheng and Ma, Lin and Sun, Yong and Mathew, Joseph (2007) Asset health reliability estimation based on condition data. In Proceedings World Congress on Engineering Asset Management, pp. 2195-2204, Harrogate, England.

© Copyright 2007 (please consult author)

Asset health reliability estimation based on condition data

Sheng Zhang, Lin Ma, Yong Sun, Joseph Mathew

CRC for Integrated Engineering Asset Management, School of Engineering Systems, Queensland University of Technology, Brisbane, QLD 4001, Australia

Abstract

This paper presents a method using recursive Bayesian analysis to estimate asset health reliability. The method enables reliability analysis and prediction using degradation condition monitoring data from an individual asset, rather than using failure data from a population of assets. A scheme was devised to assist the implementation of the method in both normal and abnormal zones during the whole asset operation cycle. An experiment on bearing life testing was employed to validate the method.

Keywords: recursive Bayesian analysis, asset health reliability, bearing life testing

Introduction

Reliability models have been reasonably well developed for asset health evaluation in the engineering asset maintenance and management field. Conventional reliability analysis is largely conducted through the analysis of population data, e.g. the data from life testing experiments on a batch of similar assets. With condition monitoring techniques being adopted in different industrial sectors, a large amount of observation data is typically collected from individual critical assets during their operation. Such condition monitoring data is used for troubleshooting (e.g. fault diagnosis) and short-term asset condition prediction (e.g. prognosis) (Jardine A. K. S., Lin D. et al. 2005). However, using condition data to estimate asset health reliability has not been well explored. The idea rests on the assumption that condition monitoring data reflects the underlying degradation process of an asset, and that the variation of condition data manifests the change in the asset's reliability. Under this assumption, asset health reliability can be estimated from condition data.

In this work, a recursive Bayesian analysis method that makes use of condition data for asset health reliability estimation is investigated. The method enables reliability analysis and prediction based on condition monitoring data from an individual asset, rather than through traditional methods that use data from a family of assets.

Utilising condition monitoring data for reliability estimation has some similarities with fault diagnosis. The severity of a fault is a key issue for diagnosis, as it gives insight into how long the asset can run before it fails. Identifying the severity of faults is analogous to estimating asset reliability. In practice, fault severity has commonly been expressed as condition levels defined on indicators.
For example, the widely used ISO standard on vibration monitoring uses the Root Mean Square (RMS) of vibration signals to differentiate machine conditions (ISO 10816 1998). Fault severity, if scaled to the range between 0 and 1, expresses a similar idea to the reliability level in reliability analysis. As fault severity has a close connection with asset failure probability and hence asset reliability, increasing effort has been devoted in recent years to converting condition monitoring information into a reliability representation.

Yan et al. (Yan J., Koc M. et al. 2004) modelled the relationship between failure probability and observed condition monitoring data under different machine conditions using logistic regression algorithms. The output of the logistic regression model is a probability value describing the asset condition, which is regarded as the asset health reliability. The reliability values were further modelled using autoregressive moving average algorithms to attain failure prediction, and expert rules were applied to produce reliability estimates from estimated condition data when insufficient historical condition monitoring data was available.

It is believed that the variation of condition data reflects the degradation process of an asset and hence can be modelled to represent asset health reliability. Knezevic proposed the use of relevant condition data for asset reliability estimation at an instant of operation time (Knezevic 1987). The method was then developed for asset health reliability estimation under multiple failure modes (Saranga H. and Knezevic 2000; Lu, Lu et al. 2001). When extending this method to asset health prediction, one of the issues is to estimate the probability density functions (PDF) of future observation indicators. Feed-forward networks and self-organising maps have been proposed as prediction algorithms to represent the dispersion of predicted condition indicators (Chinnam R.B. 1999). Expert rules and adaptive neuro-fuzzy inference systems have also been reported for modelling asset reliability using observed data (Chinnam R.B. and Baruah P. 2004), where the application of the neuro-fuzzy system is similar to that of Yan (Yan J., Koc M. et al. 2004).

Reliability estimation based on condition data produces a time series of reliability evaluations with respect to asset operation time. The time series of evaluations can then be projected into the future for prognosis or prediction.
Though the idea is analogous to the condition based approach which uses regression or artificial intelligence algorithms to model condition data and achieves prognosis using extrapolation, it provides an alternative approach to evaluate asset reliability using condition data from individual assets. This work investigates the formulation of asset health reliability estimation using a recursive Bayesian estimation technique, and employs an experiment on bearing life testing as a case study.

Recursive Bayesian reliability estimation

We use $T$ to denote the time to failure, $R(t)$ to denote the probability that a failure occurs after time $t$ (that is, no failure occurs before $t$), i.e. $R(t) = P(T > t)$, and $R(t_{i+1} \mid t_i)$ to denote the conditional reliability, which is interpreted as the probability that an asset will not fail until $t = t_{i+1}$ given that the asset is operational at $t_i$. According to (Lewis E. E. 1996), the unconditional reliability at time $t_{i+1}$ is given by

$$R(t_{i+1}) = R(t_{i+1} \mid t_i)\, R(t_i) \tag{1}$$

Hence the unconditional reliability can be determined using Bayesian estimation to take account of the previous conditional reliabilities in a recursive manner:

$$R(t_{i+1}) = R(t_0) \prod_{n=0}^{N} R(t_{i-n+1} \mid t_{i-n}) \tag{2}$$

Generally $R(t_0) = 1$, which is the case when the asset condition is deemed perfect. The conditional reliability $R(t_{i+1} \mid t_i)$ can be derived from the condition monitoring observation variable vector $y = [y_1, \ldots, y_m]$. Such a mixture of degradation variables preferably reflects the loading and response variations of an asset during its operation. Given that the $m$ indicators are "lower-is-better" type signals, and that the critical limit is $y^{cl} = [y_1^{cl}, \ldots, y_m^{cl}]$, the following parameter is defined as a failure index:

$$A_{t_{i+1}} = \left. \int_{y_1^{cl}}^{\infty} \cdots \int_{y_m^{cl}}^{\infty} g(\hat{y})\, dy_1 \cdots dy_m \right|_{t_{i+1}} \tag{3}$$
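As an illustration of the failure index, consider a single indicator (m = 1) whose density at a given time is assumed, purely for this sketch, to be Gaussian. The integral above then reduces to the upper tail probability beyond the critical limit. The Gaussian form, the function name and the numbers are assumptions and are not taken from the paper.

```python
import math


def failure_index_gaussian(mean, std, critical_limit):
    """Upper-tail probability P(y > critical_limit) for y ~ N(mean, std^2).

    This is the one-dimensional case of the failure index, under the
    illustrative assumption that the indicator density is Gaussian.
    """
    z = (critical_limit - mean) / (std * math.sqrt(2.0))
    return 0.5 * math.erfc(z)


# Hypothetical indicator statistics at two inspection times.
early = failure_index_gaussian(mean=3.0, std=0.2, critical_limit=3.7)
late = failure_index_gaussian(mean=3.5, std=0.2, critical_limit=3.7)
print(early, late)  # the index grows as the density moves towards the limit
```

For several indicators the joint density would be needed; the one-dimensional case is shown only because it can be evaluated with the standard library.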

where $g(\hat{y})$ is the joint probability density function of the $m$ degradation variables. The failure index indicates the failure probability at a particular time point. The critical limit is referred to as the reliability threshold in this work. The conditional reliability is then defined as

$$R(t_{i+1} \mid t_i) = \begin{cases} \dfrac{1 - A_{t_{i+1}}}{1 - A_{t_i}}, & A_{t_{i+1}} > A_{t_i} \\[2ex] 1, & A_{t_{i+1}} \le A_{t_i} \end{cases} \tag{4}$$
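Eq. (4) translates directly into a small function. The sketch below is a minimal reading of the equation; the function name and the range check on the inputs are additions made for illustration.

```python
def conditional_reliability(a_prev, a_next):
    """Conditional reliability R(t_{i+1} | t_i) following Eq. (4).

    a_prev is the failure index at t_i, a_next the failure index at t_{i+1}.
    """
    if not (0.0 <= a_prev < 1.0 and 0.0 <= a_next <= 1.0):
        raise ValueError("failure indices must be probabilities, with a_prev < 1")
    if a_next > a_prev:
        # Condition has deteriorated: reliability is reduced.
        return (1.0 - a_next) / (1.0 - a_prev)
    # No deterioration detected: reliability is left unchanged.
    return 1.0


worse = conditional_reliability(0.10, 0.28)   # index increased
same = conditional_reliability(0.30, 0.20)    # index decreased
print(worse, same)
```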

Eq. (4) states that the conditional reliability is only calculated when the failure index increases. In other words, the reliability is only updated when the condition has deteriorated compared with the previous detection. The failure index can therefore serve as a criterion for deciding whether an observation is used to update the reliability. In some applications, hazard is a preferable indicator to reliability. According to (Lewis E. E. 1996), the cumulative hazard at $t_{i+1}$ is calculated from

$$H(t_{i+1}) = -\ln(R_{i+1}) \tag{5}$$
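A short worked step shows how the hazard accumulates under the recursion. Writing $R_{i+1} = R(t_{i+1})$ and applying Eq. (1) repeatedly with $R(t_0) = 1$, the logarithm turns the product of conditional reliabilities into a sum. The numerical values are hypothetical and serve only to illustrate the arithmetic.

```latex
\begin{align*}
H(t_{i+1}) &= -\ln R(t_{i+1})
            = -\ln\!\Big( R(t_0) \prod_{k=0}^{i} R(t_{k+1} \mid t_k) \Big) \\
           &= -\sum_{k=0}^{i} \ln R(t_{k+1} \mid t_k)
            \qquad \text{since } R(t_0) = 1 . \\[1ex]
\text{Example: } & R(t_1 \mid t_0) = 0.98,\; R(t_2 \mid t_1) = 0.95 \\
R(t_2) &= 0.98 \times 0.95 = 0.931 \\
H(t_2) &= -\ln(0.931) \approx 0.0202 + 0.0513 \approx 0.0715
\end{align*}
```

Each measurement that leaves the conditional reliability at 1 contributes nothing to the sum, so the hazard only grows when deterioration is detected.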

A scheme for implementing asset health reliability estimation

Asset health reliability estimation should be conducted over the whole operation period of an asset. According to the PF curve illustrated in Figure 1, any asset operates in two distinct zones: the normal zone from installation (I) to a potential failure (P), and the abnormal zone from a potential failure (P) to a functional failure (F). In the normal zone, an asset runs free of faults. In the abnormal zone, the asset operates with one or more faults and experiences a deterioration process. To conduct asset health reliability estimation, it is important to differentiate the two zones, and especially to determine the time when the abnormal zone starts. This is generally accomplished using condition monitoring techniques that examine the observed data (Vachtsevanos G., Lewis F. L. et al. 2006). Figure 2 further illustrates how measured data indicates the underlying deterioration. The measurements behave stably in the normal zone (I-P: Installation to Potential failure) and show an increasing trend in the abnormal zone (P-F: Potential failure to Functional failure).

Figure 1 PF curve (axes: functional capability versus operation age; marked points: I, P with label "Potential failure", F with label "Functional failure")

Eq. (4) assumes the use of a failure index to decide whether an observation is used to update the reliability estimate. In practice, thresholds on condition indicators are commonly used to determine whether the condition is normal or not (Zhang S., Hodkiewicz M. et al. 2006). A scheme is therefore required to incorporate condition thresholds into the implementation of recursive Bayesian reliability estimation. Given that observation data are measured at closely spaced intervals, a procedure comprising two steps is taken to apply the data to asset health reliability estimation.

Figure 2 Illustrative measurements with regards to operation (axes: measurement versus operation age; marked points: I, P, F)

1) Determine whether a measurement is counted as a candidate for reliability estimation. This is achieved by examining the selected indicators from the measured data, and can be considered a binary classification problem. For example, if the condition is assessed to be in the abnormal zone, the measurement is counted in the reliability estimation process. Otherwise, the measurement is ignored.

2) Apply the rule in Eq. (4) to the selected data and work out the conditional reliability. The unconditional reliability is then computed using Eq. (2).

This procedure suggests that a filtering mechanism is required to pick out a degradation process by examining the observation data. Two fundamental issues must then be resolved to implement the procedure:

1) The determination of the threshold used to judge the condition, and hence whether the corresponding measurement is counted in the reliability estimation.

2) The determination of the threshold $y^{cl} = [y_1^{cl}, \ldots, y_m^{cl}]$ used for the reliability calculation.

The first issue is concerned with diagnosis. The underlying condition needs to be classified as normal or abnormal. Numerous methods are available for diagnosis, from simple ones such as linear discriminant analysis to complex ones such as neural networks (Bishop C. M. 1995; Zhang S., Mathew J. et al. 2003). Statistical process control (SPC) has also been applied as an alternative for abnormality detection (Goode K. B., Moore J. et al. 2000; Wang W. 2002). In addition, multivariate analysis should be investigated to take advantage of the information from multiple indicators. When selecting detection algorithms, preference is generally given to those capable of reaching a decision promptly from relatively few measurements.
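The two-step procedure can be sketched as follows for a single indicator. Several choices here are assumptions made only for illustration: the indicator density at each measurement is taken as Gaussian with a given mean and standard deviation, the alarm test is a simple comparison of the mean against a threshold, the failure index before the first accepted measurement is taken as 0, and a measurement that does not increase the failure index leaves the reference index unchanged. All names and numbers are hypothetical.

```python
import math


def failure_index(mean, std, critical_limit):
    """Tail probability above the critical limit (Gaussian assumption)."""
    return 0.5 * math.erfc((critical_limit - mean) / (std * math.sqrt(2.0)))


def estimate_reliability(samples, alarm_threshold, critical_limit):
    """Return the unconditional reliability after each measurement.

    samples: list of (mean, std) pairs describing the indicator at each
    measurement time, in time order.
    """
    reliability = 1.0      # R(t_0) = 1 for an asset deemed perfect
    reference = 0.0        # failure index of the last accepted measurement
    curve = []
    for mean, std in samples:
        # Step 1: only measurements judged abnormal are candidates.
        if mean > alarm_threshold:
            a = failure_index(mean, std, critical_limit)
            # Step 2: Eq. (4), update only when the index increases.
            if a > reference:
                reliability *= (1.0 - a) / (1.0 - reference)
                reference = a
        curve.append(reliability)
    return curve


samples = [(3.0, 0.1), (3.05, 0.1), (3.3, 0.2), (3.2, 0.2), (3.5, 0.2)]
curve = estimate_reliability(samples, alarm_threshold=3.1, critical_limit=3.7)
print(curve)
```

In this sketch the first two measurements are ignored because they fall below the alarm threshold, and the fourth is ruled out because its failure index is lower than that of the third.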
As to the second issue, the area of the PDF above the threshold is considered to reflect the conditional probability that a failure occurs since the last measurement, given that the asset still functioned at the last measurement time. This threshold lies between the thresholds of potential failure and functional failure. Empirical knowledge may be needed to determine the threshold (Ben-Daya M., Duffuaa S. O. et al. 2000). Other related topics in implementing this scheme include indicator extraction and PDF estimation of the indicators. PDF estimation becomes very problematic when data is sparse and demands effective methods (MacKay D. J. C. 2003).

Case study

Bearing failure is one of the foremost causes of breakdown in rotating machinery. A large part of the work reported in the literature has focused on crack propagation failures resulting from artificial or seeded damage (Dyer 1973; Dyer and Stewart 1978; Stronach, Cudworth et al. 1984; Kim 1984; Prabhu 1996), where component characteristic frequencies can be identified as indicators for early failure warning. In this work, accelerated life testing of bearings without any seeded defects was adopted for testing the algorithm on asset health reliability estimation. The observations from the experiment showed more complicated frequency components. As a result, component characteristic frequencies were not suitable as indicators for early failure warning.

The test rig for bearing life testing is shown in Figure 3 and Figure 4. A hydraulic ram provided vertical forces on the bearing housing. A hand-actuated hydraulic load unit was also available. A slave bearing block was fixed to the load table with two M10x20 bolts. Inside the slave bearing block, two high precision bearings (7018CTDBLP6) were used to support the shaft and maintain alignment. The input shaft on the slave bearing block was driven by a TECO electric motor (5.5 kW, 4 poles) via a 630 mm pulley to reduce the motor speed from 1450 RPM to 211 RPM. The shaft was designed to be capable of carrying heavy loads. The bearing housing had two holes 90 degrees apart for mounting sensors. The test bearing was a single-row deep groove ball bearing (SKF 6805). It was loaded radially at between 50 % and 60 % of its dynamic load capacity, with a calculated cycle life of 1.47 million revolutions (Harris 1966). An accelerometer (IMI 621B51) mounted on a bolt was screwed onto the bearing outer race.
At 90 degrees from the accelerometer, a type J thermocouple (iron and copper-nickel, with a range of 0 °C to 750 °C) passed through the other hole in the bearing housing and contacted the bearing outer race. The vibration signals were amplified using a PCB482a20 amplifier (gain of 10) and filtered using an analogue filter (KROHN3202, cut-off frequency of 3 kHz). Both the vibration and thermocouple signals were routed through a connector block (NI BNC2120) to an A/D converter (NI DAC6012). A computer running a data acquisition tool coded in LabVIEW was used for signal acquisition. The bearing was tested until a failure was reached. Data was acquired at intervals of 5 minutes, with a sampling frequency of 12 kHz. For each sample, a total of $2^{17}$ data points were recorded over 10.55 seconds.

Figure 3 Life testing rig


Figure 4 Bearing housing and sensors

The bearing was operated for 51 hours over four days under heavy loading, with a break each evening. Vibration features, i.e. kurtosis, RMS and Peak-to-Peak (P2P), together with the temperature of the bearing outer race, were extracted and plotted. The ranges of the features extracted from the first and last samples of each day (for the 4th day, before the last 40 minutes) are presented in Table 1, together with the maximum values of the features for the data in the last 40 minutes. Note that for each sample of 10.55 seconds, different sub-samples with a data length of 4096 were used to work out the range of the features. Data from the first three days suggested that the bearing condition was normal. These data provide a profile of baseline indicators.

Table 1 Statistical features summary

Day                          Temperature (°C)   RMS          P2P          Kurtosis
1                            28 - 32            1.1 - 1.3    10 - 12      3 - 3.2
2                            28 - 33            1.1 - 1.3    10 - 12.5    3 - 3.3
3                            28 - 34            1.1 - 1.35   10 - 13      3 - 3.4
4 (before the last 40 min)   28 - 34            1.1 - 1.25   10 - 13      3 - 3.4
4 (last 40 min)              37                 1.21         14.5         3.6
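The three vibration features and their ranges over sub-samples can be computed along the following lines. This is a sketch under stated assumptions: kurtosis is taken as the fourth central moment divided by the squared variance (so that a Gaussian signal gives about 3), the sub-samples are non-overlapping blocks of 4096 points, and the function names and the test signal are hypothetical.

```python
import math


def features(x):
    """RMS, peak-to-peak and kurtosis of one block of vibration data."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n   # variance
    m4 = sum((v - mean) ** 4 for v in x) / n   # fourth central moment
    return {
        "rms": math.sqrt(sum(v * v for v in x) / n),
        "p2p": max(x) - min(x),
        "kurtosis": m4 / (m2 * m2),
    }


def feature_ranges(signal, block=4096):
    """(min, max) of each feature over non-overlapping blocks."""
    rows = [
        features(signal[i:i + block])
        for i in range(0, len(signal) - block + 1, block)
    ]
    return {k: (min(r[k] for r in rows), max(r[k] for r in rows)) for k in rows[0]}


# Hypothetical test signal: a pure sine wave, 64 points per cycle.
signal = [math.sin(2.0 * math.pi * k / 64) for k in range(8192)]
ranges = feature_ranges(signal)
print(ranges)
```

A pure sine wave has a kurtosis of 1.5 under this definition, which makes it a convenient check that the moments are computed as intended.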

Figure 5 displays the last 27 samples (with a sampling interval of 5 minutes) of the temperature and the three derived vibration indicators close to the bearing failure on day 4. Note that the vibration indicators were averaged over the whole 10.55 second sample. The temperature kept rising throughout the experiment, presenting a clear trend in relation to bearing operation. From sample 20 onwards, the temperature began to increase markedly, which matches the indication from kurtosis. The kurtosis is around 3.0 before this sample and larger than 3.0 afterwards. This moment was considered to be the onset of bearing deterioration. Interestingly, RMS and P2P appeared to be the least conclusive for detecting this potential failure occurrence. They presented a decreasing trend when the failure started before showing an increasing trend (Karimi M. 2006). In the normal zone, kurtosis also displayed stable behaviour compared with RMS and P2P. Such stability helps to reduce false alarms in reliability estimation. Kurtosis was selected as the indicator for reliability estimation since it gave a favourable indication throughout the progression of damage.

Reliability estimation was applied to the last 27 samples. The kurtosis of the first 20 samples was around 3.0, which is a common indication that the vibration was normal, and hence the first 20 samples were treated as coming from the normal condition (Davies A. 1998). A kurtosis level of 3.1 was taken as the threshold for triggering abnormality alarms, and a threshold of 3.7 was taken for triggering reliability estimation.


Figure 6 shows the trend of kurtosis with PDF versus operation time. For this specific case, no false diagnosis occurred in the normal zone. In addition, no missed diagnosis occurred in the abnormal zone. However, two measurements (24 and 26) show non-degradation behaviour compared with their previous ones and have been ruled out of the reliability update process.

Figure 5 Indicators versus operation time (panels: temperature, RMS, P2P, kurtosis)

Figure 6 Kurtosis with regards to operation time (axes: indicator versus sample)

Figure 7 shows the reliability estimation using the last 27 samples, where the aging effect is incorporated using an exponential distribution of time to failure with a failure rate of 0.001. The hazard is also presented. As can be seen, the reliability of the bearing starts to decline quickly from the time of bearing failure initiation (the 20th sample).

Figure 7 Reliability estimation considering aging effect (panels: reliability, with conditional and unconditional curves, and hazard; horizontal axis: sample)

Conclusion and future work

A recursive Bayesian analysis was developed in this work to estimate asset health reliability using condition monitoring data. It enables reliability evaluation using observations from individual assets. The method can be performed in an 'on-line' mode, and functions in both normal and abnormal zones. Remaining useful life can also be evaluated based on the derived reliability curve, given an acceptable reliability (hazard).

Although the proposed method was successfully illustrated through bearing life tests, it needs to be improved in various respects. Proposed future work includes: developing methods to minimise false diagnosis in the normal zone and improve the diagnosis response time; estimating health reliability from mixed condition indicators using data fusion; modelling remaining useful life estimation using both degradation data and reliability data; developing novel methods for PDF estimation for multivariate and sparse data; taking maintenance actions into consideration in the reliability estimation process; and conducting case studies on field data.
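The remark on remaining useful life can be illustrated with a minimal sketch: given a reliability curve that has already been projected forward in time, the first time at which it falls below the acceptable reliability marks the end of useful life. The paper does not prescribe how the projection is made, so the curve, the acceptable level and the function name below are all hypothetical.

```python
def first_crossing(times, reliability, acceptable):
    """Return the first time at which reliability drops below `acceptable`.

    Returns None when the curve never falls below the acceptable level.
    """
    for t, r in zip(times, reliability):
        if r < acceptable:
            return t
    return None


# Hypothetical projected reliability curve (time in minutes from now).
times = [0, 5, 10, 15]
reliability = [1.0, 0.95, 0.85, 0.70]
end_of_life = first_crossing(times, reliability, acceptable=0.9)
print(end_of_life)  # remaining useful life, measured from time 0
```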

Acknowledgement

The authors are grateful to Dr. Mahdi Karimi at Queensland University of Technology for providing the data for testing.

References

Ben-Daya M., Duffuaa S. O. and Raouf A. (2000). Maintenance, modelling and optimization, Kluwer Academic Publisher.

Bishop C. M. (1995). Neural Networks for Pattern Recognition, Oxford University Press.

Chinnam R.B. (1999). "On-line reliability estimation of individual components using degradation signals." IEEE Transactions on Reliability 48(4): 403-412.

Chinnam R.B. and Baruah P. (2004). "A neuro-fuzzy approach for estimating mean residual life in condition-based maintenance systems." International Journal of Materials and Product Technology 20: 166-179.

Davies A. (1998). Handbook of condition monitoring techniques and methodology, London: Chapman & Hall.

Dyer, D. (1973). "Bearing condition monitoring." Interim Report 1, Department of Mechanical Engineering, University of Southampton, Southampton (UK).

Dyer, D. and R. M. Stewart (1978). "Detection of rolling element bearing damage by statistical vibration analysis." Trans ASME, J Mech Design 100(2): 229-235.

Goode K. B., Moore J. and Roylance B.J. (2000). "Plant machinery working life prediction method utilizing reliability and condition-monitoring data." Proceedings of the Institution of Mechanical Engineers Part E: Journal of Process Mechanical Engineering 214.

Harris, T. A. (1966). Rolling bearing analysis. New York, John Wiley and Sons.

ISO 10816 (1998). "Evaluation of machine vibration by measurements on non-rotating parts - Part 3: Industrial machines with nominal power above 15 kW and nominal speeds between 120 r/min and 15 000 r/min when measured in situ."

Jardine A. K. S., Lin D. and Banjevic D. (2005). "A review on machinery diagnostics and prognostics implementing condition-based maintenance." Mechanical Systems and Signal Processing.

Karimi M. (2006). Machine Fault Diagnosis Using Blind Deconvolution. Brisbane, Queensland University of Technology.

Kim, P. Y. (June 1984). "A review of rolling element bearing health monitoring (II): preliminary test results on current technologies." Proceedings of Machinery Vibration Monitoring and Analysis Meeting, Vibration Institute, New Orleans, LA, 26-28: 127-137.

Knezevic, J. (1987). "Condition parameter based approach to calculation of reliability characteristics." Reliability Engineering 19(1): 29-39.

Lewis E. E. (1996). Introduction to Reliability Engineering, Wiley.

Lu, S., H. Lu and W. J. Kolarik (2001). "Multivariate performance reliability prediction in real-time." Reliability Engineering & System Safety 72(1): 39-45.

MacKay D. J. C. (2003). Information theory, inference, and learning algorithms, Cambridge University Press.

Prabhu, R. (1996). "Rolling bearing diagnostics." Proceedings of the Indo-US Symposium on Emerging Trends in Vibration and Noise Engineering, New Delhi: 311-320.

Saranga H. and J. Knezevic (2000). "Reliability analysis using multiple relevant condition parameters." Journal of Quality in Maintenance Engineering 6(3): 165-176.

Stronach, A. F., C. J. Cudworth and A. B. Johnston (1984). "Condition monitoring of rolling element bearings." Proceedings of the International Condition Monitoring Conference, Swansea, UK, 10-13: 162-177.

Vachtsevanos G., Lewis F. L., Roemer M., Hess A. and Wu B. (2006). Intelligent Fault Diagnosis and Prognosis for Engineering Systems, Wiley.

Wang W. (2002). "A model to predict the residual life of rolling element bearings given monitored condition information to date." IMA Journal of Management Mathematics 13: 3-16.

Yan J., Koc M. and Lee J. (2004). "A prognostic algorithm for machine performance assessment and its application." Production Planning and Control 15: 796-801.

Zhang S., Hodkiewicz M., Ma L. and Mathew J. (2006). Machinery condition prognosis using multivariate analysis. The 1st World Congress of Engineering Asset Management, Gold Coast, Australia.

Zhang S., Mathew J., Ma L. and Sun Y. (2003). Integrated machine fault diagnosis using wavelet packet transform and probabilistic neural networks. The 10th Asia Pacific Vibration Conference, Gold Coast, Australia.