A Stochastic Filtering Based Data Driven Approach for Residual Life Prediction and Condition Based Maintenance Decision Making Support

Wenbin Wang* and Matthew Carr
Centre for OR and Applied Statistics, University of Salford, Salford ([email protected])
*Also at PHM centre of City University of Hong Kong

Abstract
As an efficient means of detecting potential plant failure, condition monitoring is growing in popularity in industry, with millions spent on condition monitoring hardware and software. The use of condition monitoring techniques will generally increase plant availability and reduce downtime costs, but in some cases it will also lead to over-maintenance of the plant in question. There is clearly a need for appropriate decision support in plant maintenance planning that utilises the available condition monitoring information, but compared to the extensive literature on diagnosis, relatively little research has been done on the prognosis side of condition based maintenance. In plant prognosis, a key but often uncertain quantity to be modelled is the residual life, predicted from the condition information available to date. This paper focuses on such residual life prediction for monitored items in condition based maintenance and reviews the recent developments in modelling residual life prediction using stochastic filtering. We first demonstrate the role of residual life prediction in condition based maintenance decision making, which highlights the need for such a prediction. We then discuss in detail the basic filtering model we used for residual life prediction and the extensions we made. Finally, we briefly present the results of comparative studies between the filtering based model and other models using empirical data. The results show that, among the models compared, the filtering based approach performs best in terms of prediction accuracy and cost effectiveness.

I INTRODUCTION
Condition Based Maintenance (CBM) is one of the maintenance policies that has gained much attention over the last decade. The technological advances in IT, sensors and signal processing techniques have provided a means to achieve high availability and to reduce scheduled and unscheduled production shutdowns.
Today there exists a large and growing variety of Condition Monitoring (CM) techniques for machine condition monitoring and fault diagnosis. However, irrespective of the particular condition monitoring technique used, the working principle of condition monitoring is the same: condition data becomes available and needs to be interpreted, and appropriate actions are taken accordingly. There is, however, a basic but not always clearly answered question in condition monitoring. That is, what is the purpose of condition monitoring? Have we lost sight of the ultimate need? Condition monitoring is not an end in itself; it involves an expenditure

978-1-4244-4758-9/10/$26.00 © 2010 IEEE

entered into by managers in the belief that it will save them money. How is this saving achieved? It may be obtained by using monitored condition information to optimise the maintenance process in order to achieve the minimum breakdown of the plant with maximum availability for production, and to ensure that maintenance is only carried out when necessary. But in reality, all too often we see effort and money spent on monitoring equipment for faults which rarely occur, and we also see planned maintenance being carried out on equipment that is perfectly healthy even though the monitored information indicates that something is "wrong" [1].
This paper focuses upon the decision aspect of condition based maintenance and reviews the recent developments in modelling residual life prediction and its use in condition based maintenance decision support. The review is biased towards the use of stochastic filtering for the prediction of the residual life, but other popular techniques are also briefly discussed and compared. The concept of the residual life relates to a question frequently asked in industry, i.e., how long can the monitored component survive given the condition monitoring information available to date? It will be shown later that this predicted residual life, conditional on the observed condition monitoring information, is a key element in modelling condition based maintenance decision making.
The type of condition monitoring information that we consider is of an 'indirect' nature. Direct condition monitoring processes are those where the measurement or observation relates exactly to the condition or state of the component, e.g. component wear or degradation, and the objective is to predict the evolution of the degradation. In the case of indirect CM processes, the observed information is assumed to be stochastically indicative of the underlying condition.
Potential indirect CM techniques include vibration based monitoring, oil analysis, infrared thermography, acoustic emission analysis and motor current analysis. The overall vibration level (or total energy of the vibration signal) combined with a detailed spectral analysis of the signal is commonly used in the fault diagnosis of rotating machinery. Vibration signals often correspond to a two stage process over the life of a component with the first stage being 'normal' operation and the second being 'defective' but operational, [2]. Most machine types can potentially benefit from vibration monitoring except those that operate at very low speeds. Techniques for sampling and analysing engine, transmission and hydraulic oils are used to monitor wear and contamination and can be used to identify impending failures before they occur. There is also the potential for reducing the frequency of oil changes. Available oil-based monitoring techniques include

MU3054

2010 Prognostics & SystemHealth Management Conference (PHM2010 Macau)

spectrometric oil analysis, electron microscopy, x-ray analysis and ferrous debris quantification. See [3] for a detailed review of CM techniques.
The literature on diagnostic techniques that utilise CM information to determine the current state of a component or system is vast. The literature on prognostic modelling techniques that use the indicatory condition information for extrapolation purposes, in order to predict future states such as component/system failures, is far less comprehensive. In many situations, condition monitoring can be ineffective if it is not combined with prognostic modelling. CBM techniques are used to minimise the occurrence of failures and increase the availability of the component for operational purposes in a cost effective manner. CBM is superior to breakdown maintenance or 'as frequent as possible' policies, as action is typically only undertaken when deemed necessary. Available prognostic techniques include stochastic filtering [2], [4]-[7], proportional hazards modelling [8]-[10], accelerated life models [11], [12] and hidden Markov models [13]. These mainly apply to indirect monitoring; for the case of direct monitoring, different techniques have been used, but they are not reviewed in this paper. This paper is not intended to serve as a complete review of prognostic techniques in indirect monitoring, but a comparison will be made between the proportional hazards model and the stochastic filtering based model.
The paper is organised as follows. Section II highlights the CBM decision model and the role of residual life prediction in such a model. Section III reviews the latest developments in residual life prediction using stochastic filtering. In section IV, comparisons are made between the filtering based model and other prediction models. Finally, section V concludes the paper.

II CBM DECISION MODELLING
The prognostic models proposed in this paper involve the development of a conditional probability density function (pdf) for the time until failure of an individual component using observed CM information. A key advantage of establishing a conditional pdf over evaluating a single point estimate of the failure time is the availability of the conditional cumulative distribution function (cdf). At each CM point throughout the life of a component, an optimal replacement time can be scheduled using renewal-reward theory and the long run 'expected cost per unit time' [14]. The conditional pdf and the resulting cdf are used in the development of the replacement decision model, which incorporates the probability of failure before a particular instant conditional on the CM history to date. Illustrations of the potential replacement decisions are given in figure 1, where $C_i(T_R)$ represents the cost function associated with candidate values for the replacement time, $T_R$. The solid curve represents a situation where an optimal replacement is scheduled at time $T_R^*$ before the subsequent CM point. In the case of the dashed curve, the decision produced by the model is to leave the component in operation until the next CM point and then re-assess the situation.

[Figure 1: cost $C_i(T_R)$ plotted against time between the current monitoring point $t_i$ and the next monitoring point $t_{i+1}$; the curves are labelled "Optimal replacement time" and "No replacement".]

Figure 1. Illustrating the replacement decision at the ith monitoring point

In order to describe the situation in figure 1 in the form of a mathematical model, we seek to find the optimal $T_R^*$ which minimizes the long term objective function [4]:

\[ C_i(T_R) = E(\text{cost per unit time}) = \frac{E(\text{cost per life cycle})}{E(\text{length of life cycle})}. \tag{1} \]

The expected cost and the expected length of a renewal cycle both depend on the probabilistic description of the residual life and, of course, on the associated costs. Given that the required information is known, equation (1) is a function of $T_R$, which may be minimized over the interval $[t_i, t_{i+1})$. If such a minimum cannot be found, the conclusion is to leave the component until $t_{i+1}$, when a new CM observation will be available and the decision model will be re-run to seek the optimal replacement time.

A simple alternative rule for decision making would be to compare the product of the probability of failure and the consequence of such a failure for each key component; those components with a higher ranking should then be scheduled at the next preventive maintenance window, or as soon as possible, depending on the maintenance capacity.

As discussed at the start of this section, when constructing reliable decision models, a point estimate of the failure time is not as useful as a complete description of the conditional pdf. If a suitable model has been defined and there is no substantial change in the behaviour of the monitored process, we would expect the variance of the distribution about the actual failure time to decrease over time as more CM information is obtained. The level of convergence will depend on the suitability of the proposed prognostic model for the particular case. In the next section, we discuss the individual prognostic models we developed following the initial work reported in [2] and [4].
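The replacement decision of this section can be sketched numerically. The code below is only an illustration under stated assumptions: the cost structure (a failure cost `c_fail` and a preventive replacement cost `c_prev`), the Weibull stand-in for the conditional residual life pdf and all numerical values are hypothetical and are not taken from the paper. It evaluates the ratio in equation (1) on a grid of candidate replacement times in the interval between two monitoring points and reports either an interior minimum or `None`, the latter corresponding to the "leave until the next CM point" decision.

```python
import math

def weibull_pdf(x, scale, shape):
    """Two-parameter Weibull density, used here only as a stand-in residual life pdf."""
    if x <= 0:
        return 0.0
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

def expected_cost_rate(t_i, horizon, pdf, c_fail, c_prev, steps=500):
    """Expected cost per unit time if replacement is planned `horizon` after t_i.

    Assumed cost structure (not from the source): a failure before the planned
    replacement costs c_fail, a preventive replacement costs c_prev.
    """
    dx = horizon / steps
    prob_fail = 0.0      # P(residual life < horizon)
    partial_mean = 0.0   # integral of x * pdf(x) over [0, horizon]
    for k in range(steps):
        x = (k + 0.5) * dx          # midpoint rule
        p = pdf(x) * dx
        prob_fail += p
        partial_mean += x * p
    expected_cost = c_fail * prob_fail + c_prev * (1.0 - prob_fail)
    expected_length = t_i + partial_mean + horizon * (1.0 - prob_fail)
    return expected_cost / expected_length

def best_replacement(t_i, t_next, pdf, c_fail, c_prev, grid=200):
    """Search the interval after t_i up to t_next for a minimum of the cost rate.

    Returns the replacement time, or None when the cost rate is lowest at the
    end of the interval (no replacement before the next monitoring point).
    """
    width = t_next - t_i
    candidates = [width * (k + 1) / grid for k in range(grid)]
    costs = [expected_cost_rate(t_i, h, pdf, c_fail, c_prev) for h in candidates]
    k_min = min(range(grid), key=costs.__getitem__)
    if k_min == grid - 1:
        return None
    return t_i + candidates[k_min]

# Hypothetical usage: monitoring points at 50 and 80 time units.
residual_pdf = lambda x: weibull_pdf(x, scale=20.0, shape=3.0)
print(best_replacement(50.0, 80.0, residual_pdf, c_fail=10.0, c_prev=1.0))
```

A grid search is used instead of a derivative-based optimiser so that the "no minimum in the interval" case is easy to detect; in practice the stand-in pdf would be replaced by the conditional pdf produced by the prognostic model.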


III STOCHASTIC FILTERING BASED RESIDUAL LIFE PREDICTION MODELS

Basic Non-Linear Stochastic Filtering
Non-linear stochastic filtering is a prognostic technique that has been tailored for CBM applications; see [2], [4], [6] and [15]. The model has been employed in a number of previous case studies, including applications involving rolling element bearings, aircraft engines and naval diesel engines. A principal advantage of the non-linear filtering approach over rival prognostic techniques (such as proportional hazards modelling) is the assumed relationship between the predicted residual life and the observed CM information, together with an ability to handle the history of monitored information when making predictions about the time until failure of a component. The model avoids the problems associated with autocorrelation and over-fitting that typically occur with multiple sources of CM information.
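As a rough numerical illustration of how a filter of this kind carries the monitoring history forward, the sketch below implements a generic grid-based Bayesian update for the residual life. It is not the model of [2]: the Weibull observation density, the `obs_scale` function and every parameter value are hypothetical choices made only for this example. The one ingredient taken from this section is that the state is the residual life, which for a surviving item shrinks by the time elapsed between monitoring points.

```python
import math

def weibull_pdf(y, scale, shape):
    """Two-parameter Weibull density (assumed form of the observation noise)."""
    if y <= 0:
        return 0.0
    z = y / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))

def obs_scale(x, a=1.0, b=10.0, c=0.05):
    """Hypothetical scale function: readings tend to grow as residual life x shrinks."""
    return a + b * math.exp(-c * x)

def normalise(density, dx):
    """Rescale a gridded density so that it integrates to one."""
    total = sum(density) * dx
    return [d / total for d in density]

def filter_step(prior, dx, elapsed, y, shape=3.0):
    """One update: age the prior by `elapsed`, then apply Bayes' rule with reading y.

    prior[k] is the density of the residual life at grid point k * dx.
    """
    shift = int(round(elapsed / dx))
    # The residual life of a survivor is the previous residual life minus `elapsed`.
    aged = normalise(prior[shift:] + [0.0] * shift, dx)
    posterior = [p * weibull_pdf(y, obs_scale(k * dx), shape)
                 for k, p in enumerate(aged)]
    return normalise(posterior, dx)

def mean_residual_life(density, dx):
    """Mean of a gridded residual life density."""
    return sum(k * dx * d for k, d in enumerate(density)) * dx

# Hypothetical usage: a prior over residual life, then two successive readings.
dx = 0.5
density = normalise([weibull_pdf(k * dx, 100.0, 1.87) for k in range(800)], dx)
for reading in (1.2, 4.0):
    density = filter_step(density, dx, elapsed=10.0, y=reading)
    print(mean_residual_life(density, dx))
```

Calling `filter_step` once per reading, with the previous posterior as the new prior, means that each prediction reflects every reading obtained so far without the readings themselves having to be stored.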

Stochastic filtering is a technique well known to control engineers. However, the stochastic filtering model developed in [2] uses a technique which differs from the conventional one used in automation theory. Two new contributions are presented in [2]. Firstly, the residual life is used to represent the actual state of the monitored item, which had not been done before. This overcame a number of problems encountered in other prognostic models, such as Markovian based models, where a set of plant states must first be defined and then the first passage time to the absorbing state, which is actually the residual life, must be found. It also overcame the problem of the memory-less property employed in Markovian models. This simple definition also provides a means of handling the system equation between the item's states, in that the relationship between the states at $t_i$ and $t_{i-1}$ becomes

\[ x_i = \begin{cases} x_{i-1} - (t_i - t_{i-1}) & \text{if } x_{i-1} > t_i - t_{i-1} \\ \text{not defined} & \text{otherwise} \end{cases} \tag{2} \]

where $x_i$ is the residual life at time $t_i$. Of course, equation (2) assumes that no other intervening events happened to the item during $[t_{i-1}, t_i)$. Otherwise, equation (2) can be modified to take interventions into account as

\[ x_i = \begin{cases} f\big(x_{i-1} - (t_i - t_{i-1})\big) & \text{if } x_{i-1} > t_i - t_{i-1} \\ \text{not defined} & \text{otherwise.} \end{cases} \tag{3} \]

Both equations (2) and (3) are deterministic, so we may be able to find an analytic solution for the prediction of $x_i$. Since $x_i$ is not observable at time $t_i$, we need to establish an observation equation describing the relationship between $y_i$, the observed CM information at $t_i$, and $x_i$ in order to estimate $x_i$. The observation equation proposed in [2] did not follow the conventional approach used in control theory and simply defined the following:

\[ y_i \mid x_i \sim \mathrm{pdf}\big(a(x_i),\, p\big). \tag{4} \]

The pdf has two parameters, namely $a(x_i)$ and $p$, and the relationship between $y_i$ and $x_i$ is established using the concept of a floating scale parameter, $a(x_i)$; that is, $y_i$ is a function of $x_i$ with the random noise characterized by $\mathrm{pdf}(a(x_i), p)$. By manipulating $a(x_i)$, a variety of forms of relationship between $y_i$ and $x_i$ can be achieved. With equations (2) and (4) available, and after a repeated application of Bayes' theorem, an analytic form of the probability density function $p(x_i \mid Y_i)$, where $Y_i = \{y_1, \ldots, y_i\}$, can be obtained. Equation (4) is simple in its definition but it delivers a very useful result. An additional contribution is that it assumes a causal relationship between $y_i$ and $x_i$, in which $x_i$ is the cause and $y_i$ is the result. This is again important in CM practice, since we would expect that a shorter residual life may produce a higher CM reading, as in the case of vibration monitoring. This contrasts the model in [2] with the proportional hazards based models [9], where the relationship between $y_i$ and $x_i$ is modelled the other way around.

Beta wear model
If one is interested in another prognostic quantity such as wear, as with most rotating equipment, then predicting the current and future wear is of importance. Notice that in this case no analytical solution exists, so one has to resort to numerical solutions. Wang [6] developed a Beta distribution based wear model using a similar principle to that in [4], but both the system and observation equations were modelled using the two distributions $p(w_i \mid w_{i-1})$ and $p(y_i \mid w_i)$ as

\[ p(w_i \mid w_{i-1}) = \frac{(w_i - w_{i-1})^{p-1}(1 - w_i)^{q-1}}{B(p,q)\,(1 - w_{i-1})^{p+q-1}} \tag{5} \]

where $w_i$ is the wear at $t_i$, and $w_i \mid w_{i-1}$ follows a Beta distribution with parameters $p$ and $q$. $y_i \mid w_i$ follows a distribution with a pdf, $p(y_i \mid w_i)$, from a family of exponential distributions with scale and shape parameters. The conditional mechanism was again modelled by assuming that the scale parameter, $a$, in $p(y_i \mid w_i)$ is a function of $w_i$ and $t_i$, namely $a = a(w_i, t_i)$. The resulting distribution is then

\[ p(w_i \mid Y_i) = \frac{p(y_i \mid w_i)\int p(w_i \mid w_{i-1})\,p(w_{i-1} \mid Y_{i-1})\,dw_{i-1}}{\int p(y_i \mid w_i)\int p(w_i \mid w_{i-1})\,p(w_{i-1} \mid Y_{i-1})\,dw_{i-1}\,dw_i}. \tag{6} \]

It is noted, however, that an analytical solution is not available, and Wang [6] proposed to use particle filtering to solve the problem. The model developed was fitted to aircraft engine metal concentration data.

A two stage filtering model
From equation (4), since $x_i$ always decreases as time goes on, we must have $y_i$ stochastically increasing or decreasing with $x_i$. However, this is not always the case in CBM practice, since $y_i$ may stay flat for some time before starting to increase; see, for example, the case of vibration monitoring, where Figure 2 shows a typical two stage process.

[Figure 2: time axis running from "New" to "Failed", divided into Stage 1 (Normal) and Stage 2 (Defective).]

Figure 2. Illustrating the two stages of component operation

Now the problem is to formulate a combined model to predict the start of the second stage and then, from there, to predict the residual life. Wang [7] developed a model for this purpose. Two modelling techniques were used: the first is the hidden Markov model utilising a modelling concept called the delay time [16], and the second is the one introduced in [4]. The plant is assumed to have three possible states, namely normal, defective and failed. Wang used two random variables to represent the durations of the normal and defective stages. The defective stage is referred to as the failure delay time in [17]. Once the defective stage has been confirmed, a stochastic filtering based model is initiated to predict the residual delay time. It is important to note that, if the failure process does not exhibit a typical two stage pattern, then we do not need to divide it into two stages, and the model presented in the basic non-linear filtering section can be applied from the very beginning of the plant's life. However, we do require the observed CM information to be stochastically increasing or decreasing, that is, a trend must be present. This is obvious since, if no trend is apparent, there is no need for CM.

A model using both external and internal CM variables
There are cases in CBM where external variables such as loading and speed can influence the residual life substantially, and these variables cannot be treated in the same way as $y_i$. For example, in oil analysis, metal concentrations in the sampled lubricant are the result of wear, but other contaminants such as water and silicon are not the result of wear; they may nevertheless impact upon the wear process and therefore the residual life. Wang and Hussin [18] presented a model to combine these two types of variables: internal variables, which are influenced by the residual life, and external variables, which change the residual life. The treatment of internal variables is the same as in the model of [2], but the impact of the external variables on $x_i$ was modelled using a model developed in [12], where a proportional residual concept was used. This concept is similar to the accelerated life model, the difference being that in [12] the residual life was used instead of the whole life. Put simply, the external variables proportionally change the residual life when they are available. This change is then fed through the recursive filtering so that the whole history of the changes in the external variables is incorporated. The case example did show an improvement in prediction when both types of variables were used instead of only the internal variables.

Failure Mode Filtering
Many monitoring scenarios provide evidence that the operational components involved may be subject to a number of distinct failure modes, rather than a single dominant failure mode as modelled previously [2]. Figure 3 provides a typical example where the failure times of the components and the final CM readings before failure are observed to be clustered in the historical data.

[Figure 3: plot with axis "Failure time".]

Figure 3. Illustrating the clustering of failure times under the influence of two different failure modes

The modelling procedure proposed to handle this scenario is based on the assumption that an individual monitored component will fail according to one of a number of predefined failure modes. A double filtering process is used. Firstly, individual non-linear stochastic filters are constructed to facilitate the failure time prediction under the influence of each potential failure mode. The output from each filter is then weighted according to the probability that the particular failure mode is the true underlying (unknown) failure mode. The probabilities associated with each failure mode are recursively derived using a Bayesian model and the CM information obtained to date. The failure mode analysis filter is described in detail in [19].

Extended Kalman Filtering
Many established prognostic modelling techniques for CBM can be computationally expensive, and the input from multiple simultaneous sources of CM cannot be incorporated without resorting to data reduction algorithms or other approximate techniques (where much of the multivariate information is lost). The use of different locations for vibration sensors or of multiple oil contamination readings is now very common, and the established techniques are not satisfactory on a theoretical or performance based level. To overcome these significant shortcomings, we have developed an approximate methodology using extended Kalman filtering and the history of CM to recursively establish the conditional failure time distribution. The filter can be used to develop the distribution in a computationally efficient manner for a large number of components and CM variables simultaneously. The model is documented in Carr & Wang [20].

IV CASE EXAMPLE AND COMPARISON
An assessment of a CBM methodology for a particular case should always investigate the applicability of the CM data. If the data is unreliable or is not representative of the underlying condition of the components in question, a prognostic technique may not be useful. In these situations it may be preferable to utilise a fixed interval replacement policy or a survival analysis model. In this example, we compare the performance of some of the modelling developments introduced in this paper with some alternative established prognostic techniques in a complete information example. Here "complete" means that both CM and failure information is available.
The alternative techniques include optimal fixed age replacement policies, a method based on survival analysis techniques and other condition-based prognostic techniques. Comparing the condition-based prognostic models with the fixed age policies or survival analysis models can provide evidence to support the use of, and the investment associated with, installing monitoring capabilities, on-board data processing, prognostic modelling and automated decision making. Figure 4 illustrates the overall vibration (total energy) monitoring information over the lives of six operational rolling element bearings that were operated until failure. The data is illustrated sequentially, where 'o' represents a condition monitoring reading and '•' represents a reading just before failure.



t, where $f_0(z)$ is an initial failure time pdf which is parameterised using the historical failure time information. The most common choice for $f_0$ is a Weibull distribution and, for the case data explored here, we obtain the shape and scale parameters $\alpha = 0.01$ and $\beta = 1.87$.
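To give a feel for these values, the following sketch evaluates a Weibull initial pdf with the quoted parameters. The surviving text does not state which parameterisation is used, so the form $f_0(z) = \alpha\beta(\alpha z)^{\beta-1}\exp(-(\alpha z)^{\beta})$ below is an assumption, as are the function names; under that assumption $\alpha$ acts as an inverse scale and $\beta$ as the shape.

```python
import math

ALPHA = 0.01   # value quoted in the text
BETA = 1.87    # value quoted in the text

def f0(z, alpha=ALPHA, beta=BETA):
    """Weibull pdf in the assumed form alpha*beta*(alpha*z)**(beta-1)*exp(-(alpha*z)**beta)."""
    if z <= 0:
        return 0.0
    az = alpha * z
    return alpha * beta * az ** (beta - 1) * math.exp(-(az ** beta))

def survival(z, alpha=ALPHA, beta=BETA):
    """P(failure time > z) under the same assumed form."""
    return math.exp(-((alpha * max(z, 0.0)) ** beta))

def mean_life(alpha=ALPHA, beta=BETA):
    """Closed-form Weibull mean, Gamma(1 + 1/beta) / alpha."""
    return math.gamma(1.0 + 1.0 / beta) / alpha

def conditional_pdf(z, age, alpha=ALPHA, beta=BETA):
    """pdf of the failure time z given survival to `age` (zero for z <= age)."""
    if z <= age:
        return 0.0
    return f0(z, alpha, beta) / survival(age, alpha, beta)

print(mean_life())
print(conditional_pdf(60.0, age=40.0))
```

The conditional pdf given survival to a known age is the kind of prior that a filter could start from before any CM reading is taken into account.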

REFERENCES
1. Aghjagan, H.N., Lubeoil analysis expert system, Canadian Maintenance Engineering Conference, Toronto, 1989.
2. Wang, W., A model to predict the residual life of rolling element bearings given monitored condition information to date, IMA Journal of Management Mathematics, 13, (2002), 3-16.
3. Collacott, R.A., Mechanical Fault Diagnosis and Condition Monitoring, Chapman & Hall, (1997).
4. Wang, W. and Christer, A.H., Towards a general condition based maintenance model for a stochastic ...
5. Lin, D. and Makis, V., Recursive filters for a partially observable system subject to random failure, Adv. Appl. Probab., 35, (2003), 207-227.
6. Wang, W., A prognosis model for wear prediction based on oil-based monitoring, Journal of the Operational Research Society, 58, (2006), 887-893.
7. Wang, W., A two-stage prognosis model in condition based maintenance, European Journal of Operational Research, 182, (2007), 1177-1187.
8. Kumar, D. and Westberg, U., Maintenance scheduling under age replacement policy using proportional hazards model and TTT-plotting, European Journal of Operational Research, 99, (1997), 507-515.
9. Makis, V. and Jardine, A.K.S., Optimal replacement in the proportional hazards model, INFOR, 30, (1991), 172-83.
10. Jardine, A.K.S., Makis, V., Banjevic, D., Braticevic, D. and Ennis, M., A decision optimisation model for condition-based maintenance, J. Qua. Main. Eng., 4, (1998), 2, 115-121.
11. Kalbfleisch, J.D. and Prentice, R.L., The Statistical Analysis of Failure Time Data, Wiley, New York, (1980).
12. Wang, W. and Zhang, W., A model to predict the residual life of aircraft engines based on oil analysis data, Naval Research Logistics, 52, (2005), 276-284.
13. Bunks, C. and McCarthy, D., Condition-based maintenance of machines using hidden Markov models, Mechanical Systems and Signal Processing, 14, (2004), 597-612.
14. Ross, S.M., Stochastic Processes, Wiley, New York, (1996).
15. Carr, M.J. and Wang, W., A case comparison of a proportional hazards model and a stochastic filter for condition-based maintenance applications using oil-based condition monitoring information, Proceedings of the Institution of Mechanical Engineers, Part O, Journal of Risk and Reliability, 222, (2008), 47-55.
16. Wang, W., Delay Time Modelling, in Complex System Maintenance Handbook, (Eds) DNP Murthy and AKS Kobbacy, Springer, 345-370, (2008).
17. Christer, A.H. and Waller, W.M., Reducing production downtime using delay time analysis, J. Opl. Res. Soc., 35, (1984), 499-512.
18. Wang, W. and Hussin, B., Plant residual time modelling based on observed variables in oil samples, J. Opl. Res. Soc., 60, (2009), 789-796.
19. Carr, M.J. and Wang, W., An approximate algorithm for CBM applications, (under review, 2009).
20. Carr, M.J. and Wang, W., Failure mode analysis and residual life prediction for condition based maintenance applications, to appear in IEEE Transactions on Reliability, (2009).
21. Wang, W. and Zhang, W., Early defect identification: application of Statistical Process Control methods, Journal of Quality & Maintenance Engineering, 14, (2008), 225-236.
22. Love, C.E. and Guo, R., Using proportional hazard modelling in plant maintenance, Quality and Reliability Engineering International, 7, (1991), 7-17.
23. Lawless, J.F., Statistical Models and Methods for Lifetime Data, John Wiley, (2003).
24. Degroot, M., Probability and Statistics, 2, Addison-Wesley, (1980).
25. Cox, D.R. and Oakes, D., Analysis of Survival Data, London: Chapman and Hall, (1984).
26. Banjevic, D. and Jardine, A.K.S., Calculation of reliability function and remaining useful life for a Markov failure time process, MIMAR2004 Conference Proceedings, Salford, England, (2004).
27. Vlok, P.J., Coetzee, J.L., Banjevic, D., Jardine, A.K.S. and Makis, V., Optimal component replacement decisions using vibration monitoring and the proportional-hazards model, Journal of the Operational Research Society, 53, (2002), 193-202.
