IIE Transactions (2009) 41, 372–387. Copyright © “IIE”. ISSN: 0740-817X print / 1545-8830 online. DOI: 10.1080/07408170802369409
Residual-life estimation for components with non-symmetric priors

SANTANU CHAKRABORTY¹, NAGI GEBRAEEL³, MARK LAWLEY²,∗ and HONG WAN¹

¹Department of Industrial Engineering and ²Weldon School of Biomedical Engineering, Purdue University, West Lafayette, IN 47907, USA
E-mail: [email protected]
³H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA

Received February 2007 and accepted March 2008
Condition monitoring uses sensory signals to assess the health of engineering systems. A degradation model is a mathematical characterization of the evolution of a condition signal. Our recent research focuses on using degradation models to compute residual-life distributions for degrading components. Residual-life distributions are important for providing probabilistic estimates of failure time for use in maintenance planning and spare parts inventory management. To obtain residual-life distributions, our earlier work assumed the degradation model’s stochastic parameters to be normally distributed. This paper investigates the performance of these residual-life distributions when the underlying normality assumptions are not satisfied. The paper also develops methods for estimating residual life when the stochastic parameters of the degradation model follow more general distributions.

Keywords: Residual life, Bayesian updating, non-symmetric priors
1. Introduction

Condition monitoring uses sensory information from a functioning device to evaluate its health. The evolution of this sensory information often exhibits characteristic patterns that reflect the underlying physical transitions that occur during degradation. These patterns are known as degradation signals (Nelson, 1990). Degradation models mathematically characterize these degradation signals and are useful for estimating the residual life of a partially degraded component. Degradation models typically have stochastic parameters that follow some distribution across the population of components.

In Gebraeel et al. (2005), we develop a Bayesian updating technique that uses degradation signal information to update the distributions of stochastic parameters for linear and exponential degradation models. We also derive expressions for the residual-life distribution assuming normality of the stochastic parameters. The motivation of the present work is to study the practical applicability of these expressions when the normality assumption is not met. We specifically focus on the linear model developed by Gebraeel et al. (2005) and consider the case where the stochastic parameter follows gamma distributions with various degrees of skewness. The gamma distribution provides more flexibility in capturing the characteristics of real-world sensory data. Based on this assumption, we propose a simulation-based algorithm to estimate residual-life distributions. Finally, we examine the performance of the proposed method for various degrees of skewness in the prior and obtain a threshold beyond which it performs better than that presented in Gebraeel et al. (2005).

∗Corresponding author
2. Literature review The purpose of real-time condition monitoring is to evaluate the health of a device in order to reduce unnecessary maintenance and unexpected downtime. Significant research exists in areas such as power transformers (Feser et al., 1995), cutting tools (Dimla, 1999), high-voltage induction motors (Thorsen and Dalva, 1999), railway equipment (Fararooy and Allan, 1995) and machine tools (Martin, 1994). Most of this work is diagnostic in nature. Some researchers use degradation information to estimate life characteristics. Lu and Meeker (1993) derive life distributions for populations of devices using degradation information from randomly selected sets of devices. Wang (2000) uses random coefficient models to characterize degradation and provides a concise listing of the most
common assumptions in degradation modeling research. These include (i) the condition of the device deteriorates with operating time and the level of deterioration can be observed at any time; (ii) the mean and variance of device deterioration can be increasing in time; (iii) device failure occurs when the degradation signal reaches a well-defined threshold; (iv) the device being monitored comes from a population of devices, each of which exhibits the same degradation form; and (v) the distribution of the stochastic parameters across the population of devices is known. Both Lu and Meeker (1993) and Wang (2000) assume independent and identically distributed (iid) $N(0, \sigma^2)$ error in the degradation signal across the population of devices.

Yang and Yang (1998) develop a random-coefficient-based approach to obtain better estimates of life parameters using life information of failed components and degradation information from partially degraded ones. They experimentally verify that this approach provides better estimators than traditional life testing. Yang and Jeang (1994) use a random coefficients model to study the effect of cutting tool flank wear on surface roughness in metal cutting. Tseng et al. (1995) apply random coefficient models for luminosity with experimental design to improve the reliability of fluorescent lamps. They use a degradation model to determine the combination of manufacturing settings that provides the slowest rate of luminous degradation. Goode et al. (1998) use an exponential degradation model to predict the condition of a hot strip mill. They conclude that the degradation model provides better life predictions than a reliability model. Doksum and Hoyland (1991) develop inverse Gaussian life models and maximum likelihood estimators for devices subjected to accelerated stress testing. They model the accumulated decay as a Wiener process with drift and diffusion dependent on the stress level.
Whitmore (1995) models degradation as a Wiener process and explains how to account for measurement errors. Whitmore and Schenkelberg (1997) use Wiener processes to model degradation data collected from accelerated testing and develop methods for estimating the parameters of time and stress transformations. Lu et al. (2001) suggest methods for forecasting system performance reliability for systems with multiple failure modes. The authors use time series forecasting to develop a joint density function for performance measures such as system reliability. They also develop a model for estimating the conditional performance reliability in real-time for an individual component in operation. In Gebraeel et al. (2005), we propose a Bayesian technique to update the stochastic parameters for linear and exponential degradation models.

Figure 1 shows an example real-time vibration-based degradation signal for a thrust bearing from Gebraeel (2006). Note that the vibration amplitude increases with operating time as the bearing degrades. When the amplitude reaches a pre-specified threshold (based on industry standards), the bearing is assumed to fail.

Fig. 1. Example real-time degradation signal from Gebraeel (2006).

Figures 2 and 3 illustrate our updating scheme. Figure 2(a) and Fig. 2(b) show an exponential degradation model of the form $S_i = \phi + \theta \exp[\beta t_i + \varepsilon_i]$ for a specific device and the residual-life distributions at time $t_i$ and $t_{i+n}$, respectively. Here both the parameters θ and β are stochastic. In both Fig. 2(a) and Fig. 2(b), we obtain a posterior distribution of the stochastic parameter using the prior distribution and condition monitoring data. Then we use this posterior to compute the residual-life distribution. The small circle in Fig. 2(a) indicates the information obtained up to time $t_i$. For this case we have one specific residual-life distribution. In Fig. 2(b) the small circle shows the information obtained up to time $t_{i+n}$; in this case, the residual-life distribution has a different mean and variance from the previous one. Variation in the residual-life distribution is due to the variation of the error term and the variation of the posterior distribution. Note that here we use the exponential model just as an example to illustrate the updating technique. Our focus in this paper is a linear degradation model. Figure 3 is a flow diagram of the updating algorithm. In this figure, we use a generic notation θ for the set of parameters of a distribution.

Now consider a degradation model of the form $S_i = \phi + \theta t_i + \varepsilon_i$. In Gebraeel et al. (2005), we assume that the stochastic parameter, θ, follows $N(\mu_0, \sigma_0^2)$ and the error terms $\varepsilon_i$ are iid $N(0, \sigma^2)$. We show that the posterior distribution of θ is normal with parameters $\mu_p$ and $\sigma_p^2$ given in Equations (1) and (2):

$$\mu_p = \frac{\sigma_0^2 \sum_{i=1}^{k} S_i' t_i}{\left(\sigma_0^2 + \sigma^2 / \sum_{i=1}^{k} t_i^2\right) \sum_{i=1}^{k} t_i^2} + \mu_0 \, \frac{\sigma^2 / \sum_{i=1}^{k} t_i^2}{\sigma_0^2 + \sigma^2 / \sum_{i=1}^{k} t_i^2}, \tag{1}$$
$$\sigma_p^2 = \frac{\sigma_0^2 \, \sigma^2 / \sum_{i=1}^{k} t_i^2}{\sigma_0^2 + \sigma^2 / \sum_{i=1}^{k} t_i^2}. \tag{2}$$

Fig. 2. Degradation model and residual life distribution.
Using this posterior we derive an expression for the residual-life distribution given by Equation (3):

$$P(L_r \le t \mid S_1, S_2, \ldots, S_k) = \frac{\Phi(g(t))}{1 - \Phi(g(0))}, \tag{3}$$

where

$$g(t) = \frac{\mu_p (t + t_k) + \phi - T}{\sqrt{(t + t_k)^2 \sigma_p^2 + \sigma^2}},$$

also $S_i' = S_i - \phi$ and $\Phi(\cdot)$ is the cumulative distribution function of a standard normal distribution.
Fig. 3. Bayesian scheme for updating prior.
Here T denotes the failure threshold of the signal. We assume that the degradation signal is observed from time $t_1$ to $t_k$. Corresponding values of the signal are denoted by $S_1, S_2, \ldots, S_k$. The residual-life distribution given in Equation (3) has the form of a Bernstein distribution truncated at zero. Gebraeel et al. (2005) also derive expressions for residual life when the error is Brownian, whereas Elwany and Gebraeel (2006) consider the case where two stochastic parameters are jointly distributed and follow a bivariate normal distribution. The present work is based on the results in Gebraeel et al. (2005); but here we consider a linear model with iid normal error, where the stochastic parameter follows a gamma distribution across the population of components. As we will show, such a model leads to better residual-life estimation when the prior distribution of the stochastic parameter is skewed.
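As an illustration only, the normal-prior update of Equations (1) and (2) and the standardized quantity g(t) can be sketched in a few lines of Python. The function names and the example values are hypothetical and are not taken from Gebraeel et al. (2005); the sketch assumes the signal is shifted by the intercept before updating, and it evaluates only the untruncated probability Φ(g(t)).

```python
import math

def normal_posterior(signal, times, mu0, var0, var_err, phi=0.0):
    """Posterior mean and variance of theta, following Equations (1) and (2)."""
    sum_t2 = sum(t * t for t in times)
    # Shifted signal S_i' = S_i - phi enters the update.
    sum_st = sum((s - phi) * t for s, t in zip(signal, times))
    ratio = var_err / sum_t2                      # sigma^2 / sum(t_i^2)
    mu_p = (var0 * sum_st / sum_t2 + mu0 * ratio) / (var0 + ratio)
    var_p = var0 * ratio / (var0 + ratio)
    return mu_p, var_p

def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def g(t, t_k, mu_p, var_p, var_err, threshold, phi=0.0):
    """Standardized distance between the predicted signal and the threshold."""
    mean = mu_p * (t + t_k) + phi - threshold
    return mean / math.sqrt((t + t_k) ** 2 * var_p + var_err)

# Hypothetical noise-free signal with slope 0.5 observed at t = 1, ..., 10.
times = [float(i) for i in range(1, 11)]
signal = [0.5 * t for t in times]
mu_p, var_p = normal_posterior(signal, times, mu0=0.5, var0=0.04, var_err=0.01)
prob_crossed = std_normal_cdf(g(10.0, 10.0, mu_p, var_p, 0.01, threshold=10.0))
```

In this toy case the prior mean and the data agree, so the posterior mean stays at the prior mean while the posterior variance shrinks well below the prior variance.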
3. Example performance of “normal prior method”

In this section, we illustrate how the truncated Bernstein distribution, given by Equation (3), approximates the residual-life distribution when the prior distribution of the stochastic parameter is skewed. This example will motivate us to develop residual-life estimation for skewed priors, which we model with the gamma distribution. In this paper, we shall refer to the method proposed by Gebraeel et al. (2005) as the “Normal Prior Method.”

Let us consider a family of parts with a degradation model of the form $S_i = \theta t_i + \varepsilon_i$. Here θ is the stochastic coefficient of the signal and $\varepsilon_i$, $i = 1, 2, \ldots$, are iid error terms. Let T denote the failure threshold of the signal. Note that this model does not include the intercept term φ because it does not contribute to the posterior distribution. If an intercept is observed in the signal data ($S_i^r$), we can use the transformation $S_i = S_i^r - \phi$ and apply our algorithm.

The Bayesian updating technique in Gebraeel et al. (2005) requires computing the posterior distribution of θ, given by $f(\theta \mid S_1, S_2, \ldots, S_k)$, using initial observed signal values $S_1, S_2, \ldots, S_k$. Hence, without loss of generality, we simulate the signal $S_i$, $i = 1, 2, \ldots$, up to half the failure threshold, T/2. We refer to the set $\{(S_1, S_2, \ldots, S_n) \mid S_{n+1} > T/2 \ge S_n\}$ as the “Partial Signal.” Let $t_{T/2}$ denote the time corresponding to $S_n$ in the “Partial Signal.” We use this “Partial Signal” to compute the posterior distribution as given in Gebraeel et al. (2005). Then we draw random samples from the posterior distribution $f(\theta \mid S_1, S_2, \ldots, S_k)$ and run the signal from time $t_{T/2}$ to the failure threshold T. Let $t_T$ denote the time at which the signal crosses the failure threshold. The residual life is given by $t_r = t_T - t_{T/2}$. We continue simulating from the posterior in this fashion to obtain an empirical residual-life distribution. Next, we compare this empirical life distribution with the truncated Bernstein of Equation (3).

Our hypothesis is that if the prior distribution has small skewness, then the empirical residual-life distribution obtained from simulation will match well with the truncated Bernstein distribution. However, as we increase the skewness of the prior, the residual-life approximation obtained from the truncated Bernstein will deteriorate. We check this using chi-square goodness of fit tests.

3.1. Linear model with normal prior

To validate the simulation approach, we let θ have a normal distribution and generate an empirical residual-life distribution as described above. Then we compare the empirical distribution with the truncated Bernstein. If our simulation method is working correctly, these should be very similar. Figure 4 provides an example QQ-plot for the residual life obtained from the truncated Bernstein and the empirical distribution obtained from simulation. It is clear from Fig. 4 that the empirical distribution matches well with the
Fig. 4. Example QQ-plot for simulated residual life and residual life from “Normal Prior Method”; θ ∼ N(0.5, 0.04), the $\varepsilon_i$ are iid N(0, 0.01).
Table 1. Chi-square P-value for 20 validation runs with “Normal Prior Method”

Test   µ      σ² (prior)   σ² (error)   P-value
1      0.50   0.04         0.01         0.633
2      0.75   0.05         0.01         0.612
3      1.00   0.05         0.05         0.584
4      1.25   0.10         0.05         0.611
5      1.50   0.20         0.09         0.595
6      2.00   0.20         0.10         0.553
7      2.25   0.25         0.10         0.622
8      2.50   0.35         0.12         0.635
9      2.75   0.45         0.12         0.571
10     3.00   0.45         0.13         0.565
11     3.25   0.50         0.14         0.615
12     3.50   0.80         0.15         0.601
13     3.75   1.00         0.15         0.610
14     4.00   0.05         0.15         0.613
15     4.25   0.07         0.15         0.545
16     4.50   0.09         0.15         0.550
17     4.75   0.08         0.15         0.626
18     5.00   0.90         0.16         0.540
19     5.25   0.5          0.16         0.630
20     5.50   0.2          0.16         0.601
truncated Bernstein, as expected. A chi-square goodness of fit test (19 degrees of freedom) gives a P-value of 0.633, which indicates a reasonable fit (the P-value is the probability, if the two distributions agree, of observing a discrepancy at least as large as the one obtained). We performed 20 experiments with different distributions of θ and $\varepsilon_i$ with similar results (see Table 1).

3.2. Linear model with gamma prior

Now we analyze how well the truncated Bernstein matches the empirical distribution when the normality assumption of the prior is violated. Specifically, we consider the case where θ follows a highly skewed gamma distribution with known parameters. Recall that to apply the “Normal Prior Method” the stochastic component of the degradation model needs to
be normally distributed. Hence, we first estimate the best fit normal approximation to this specified gamma prior by equating the mean and the variance of the gamma with the normal. We then use this best fit normal distribution to compute the truncated Bernstein of Equation (3). Next, we compare this with the empirical distribution obtained by simulation. Figure 5 provides an example QQ-plot of the Bernstein residual-life distribution evaluated using the “Normal Prior Method” and the empirical residual-life distribution obtained from simulation. It is clear from Fig. 5 that the empirical residual life is significantly different from the one given by the “Normal Prior Method.” A chi-square goodness of fit test gives a P-value equal to 0.011. Thus, the “Normal Prior Method” is inadequate for estimating residual life when the prior distribution of the stochastic parameter of the degradation model is highly skewed. More comprehensive results on this will be provided later in the paper. This result motivates us to develop a new methodology for estimating residual life when the prior is skewed. Specifically, we will consider using the gamma distribution as a prior since it captures various amounts of skewness.
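The moment matching used to obtain a best fit normal can be sketched as follows. This is a minimal illustration that assumes the gamma distribution is parameterized by a shape α and a scale β, so that its mean is αβ and its variance is αβ²; the helper names and the numbers are hypothetical.

```python
def gamma_to_normal(alpha, beta):
    """Mean and variance of the normal matched to gamma(shape alpha, scale beta)."""
    return alpha * beta, alpha * beta ** 2

def normal_to_gamma(mean, var):
    """Shape and scale of the gamma matched to a normal with the given moments."""
    return mean ** 2 / var, var / mean

# Hypothetical example: a gamma prior with shape 2 and scale 3.
mean, var = gamma_to_normal(2.0, 3.0)
alpha, beta = normal_to_gamma(mean, var)
```

Matching only the first two moments discards the third, which is why the approximation degrades as the prior becomes more skewed.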
Fig. 5. Example QQ-plot of residual life from simulation and Bernstein distribution; θ ∼ Gamma(0.3, 1.12), the $\varepsilon_i$ are iid N(0, 0.16).

4. Residual life distributions for gamma priors

The previous example demonstrates that the “Normal Prior Method” may not be the appropriate choice if the stochastic parameter of the degradation model is not normally distributed. In this section, we investigate using the gamma distribution as a prior for the stochastic parameter θ. We consider the gamma distribution primarily because it is flexible and can be used to characterize different distribution shapes with varying degrees of skewness and because it has some analytical tractability. Our approach is based on a Bayesian technique and involves two steps.

Step 1. Determining the posterior distribution $f(\theta \mid S_1, S_2, \ldots, S_k)$ given that we have observed a partial degradation signal $(S_1, S_2, \ldots, S_k)$.

Step 2. Using the posterior distribution of the stochastic parameter $f(\theta \mid S_1, S_2, \ldots, S_k)$ to compute and update the residual-life distribution of a partially degraded component.

4.1. Posterior and residual life distribution for a gamma prior

In this section we derive expressions for the posterior distribution and the residual-life distribution for a linear degradation model $S_i = \theta t_i + \varepsilon_i$ when the stochastic parameter θ has a gamma prior.

Proposition 1. Given a linear degradation model $S_i = \theta t_i + \varepsilon_i$ with iid $N(0, \sigma^2)$ error terms $\varepsilon_i$, $i = 1, 2, \ldots, k$, if θ ∼ gamma(α, β), then the posterior distribution of θ is given by

$$f(\theta \mid S_1, S_2, \ldots, S_k) = \frac{\theta^{\alpha-1}}{c} \exp\left[-\frac{1}{2\sigma_1^2}(\theta - \mu_1)^2\right], \quad \theta \in \mathbb{R}^+,$$

where

$$c = \int_{\theta=0}^{\infty} \theta^{\alpha-1} \exp\left[-\frac{1}{2\sigma_1^2}(\theta - \mu_1)^2\right] d\theta,$$

and

$$\mu_1 = \frac{b}{2a}, \quad \sigma_1^2 = \frac{1}{2a}, \quad a = \frac{1}{2\sigma^2}\sum_{i=1}^{k} t_i^2, \quad b = \frac{1}{\sigma^2}\sum_{i=1}^{k} S_i t_i - \frac{1}{\beta}.$$
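Proof. We have:

$$P(S_1, S_2, \ldots, S_k \mid \theta) = P(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_k \mid \theta) = \prod_{i=1}^{k} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{\varepsilon_i^2}{2\sigma^2}\right) = \frac{1}{(2\pi\sigma^2)^{k/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{k}(S_i - \theta t_i)^2\right].$$

By definition

$$P(\theta \mid S_1, S_2, \ldots, S_k) = \frac{P(S_1, S_2, \ldots, S_k \mid \theta)\,\pi(\theta)}{\int_{\Theta} P(S_1, S_2, \ldots, S_k \mid \theta)\,\pi(\theta)\,d\theta}, \quad \theta \in \Theta = \mathbb{R}^+ \cup \{0\}.$$

Now

$$P(S_1, S_2, \ldots, S_k \mid \theta)\,\pi(\theta) = \frac{1}{(2\pi\sigma^2)^{k/2}} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{k}(S_i - \theta t_i)^2\right] \times \frac{\theta^{\alpha-1}}{\beta^{\alpha}\,\Gamma(\alpha)} \exp\left(-\frac{\theta}{\beta}\right)$$

$$\propto \theta^{\alpha-1} \exp\left[-\frac{1}{2\sigma^2}\sum_{i=1}^{k}(S_i - \theta t_i)^2 - \frac{\theta}{\beta}\right]$$

$$\propto C_2\, \theta^{\alpha-1} \exp\left[-(a\theta^2 - b\theta)\right]$$

$$\propto C_3\, \theta^{\alpha-1} \exp\left[-\frac{1}{2\sigma_1^2}(\theta - \mu_1)^2\right],$$

where

$$a = \frac{1}{2\sigma^2}\sum_{i=1}^{k} t_i^2, \quad b = \frac{1}{\sigma^2}\sum_{i=1}^{k} S_i t_i - \frac{1}{\beta},$$

and

$$\mu_1 = \frac{b}{2a}, \quad \sigma_1^2 = \frac{1}{2a}.$$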
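To make Proposition 1 concrete, the following Python sketch computes µ1 and σ1² from a partial signal and approximates the normalizing constant c with a simple midpoint rule. It is an illustration under stated assumptions rather than the procedure used in the paper: the function names, the integration grid and the upper limit of ten posterior standard deviations above µ1 are all hypothetical choices, and for α < 1 the integrand is unbounded at zero, so a finer or adaptive rule may be needed when the posterior has appreciable mass near the origin.

```python
import math

def gamma_prior_posterior_params(signal, times, beta, var_err):
    """Return (mu_1, sigma_1^2) of Proposition 1 for a gamma prior with scale beta."""
    a = sum(t * t for t in times) / (2.0 * var_err)
    b = sum(s * t for s, t in zip(signal, times)) / var_err - 1.0 / beta
    return b / (2.0 * a), 1.0 / (2.0 * a)

def unnormalized_posterior(theta, alpha, mu1, var1):
    """theta^(alpha-1) * exp(-(theta - mu_1)^2 / (2 sigma_1^2)) on theta > 0."""
    if theta <= 0.0:
        return 0.0
    return theta ** (alpha - 1.0) * math.exp(-(theta - mu1) ** 2 / (2.0 * var1))

def normalizing_constant(alpha, mu1, var1, n=20000):
    """Midpoint-rule approximation of c over (0, mu_1 + 10 sigma_1)."""
    upper = max(mu1, 0.0) + 10.0 * math.sqrt(var1)
    h = upper / n
    return h * sum(
        unnormalized_posterior((i + 0.5) * h, alpha, mu1, var1) for i in range(n)
    )

# Hypothetical noise-free signal with slope 0.5 observed at t = 1, ..., 10.
times = [float(i) for i in range(1, 11)]
signal = [0.5 * t for t in times]
mu1, var1 = gamma_prior_posterior_params(signal, times, beta=1.0, var_err=0.01)
c = normalizing_constant(alpha=1.0, mu1=0.5, var1=0.01)
```

With α = 1 the factor θ^(α−1) disappears and c reduces to a truncated Gaussian integral, which gives a convenient check on the quadrature.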
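Proposition 2. Given a linear degradation model $S_i = \theta t_i + \varepsilon_i$ with iid $N(0, \sigma^2)$ error terms $\varepsilon_i$, $i = 1, 2, \ldots, k$, if θ ∼ gamma(α, β) and the posterior distribution of θ is given by

$$f(\theta \mid S_1, S_2, \ldots, S_k) = \frac{\theta^{\alpha-1}}{c} \exp\left[-\frac{1}{2\sigma_1^2}(\theta - \mu_1)^2\right], \quad \theta \in \mathbb{R}^+,$$

the distribution of the residual life $L_r$ of the signal is given by

$$P(L_r \le t \mid S_1, S_2, \ldots, S_k) = 1 - \int_{y=0}^{+\infty} \frac{y^{\alpha-1}}{c\,(t + t_k)^{\alpha}} \exp\left[-\frac{1}{2\sigma_1^2}\left(\frac{y}{t + t_k} - \mu_1\right)^2\right] \Phi\left(\frac{T - y}{\sigma}\right) dy,$$

where $Y = \theta (t + t_k)$.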
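Proof. Let us suppose we observe a “Partial Signal” $(S_1, S_2, \ldots, S_k)$, where $S_k$ denotes the last observation taken at time $t_k$. Suppose that the signal reaches the threshold T at time $t + t_k$. Hence, $S_{t+t_k} = T$ and for any $t' \le t$, we have $S_{t'+t_k} \le T$. Thus

$$P(L_r > t \mid S_1, S_2, \ldots, S_k) = 1 - P(L_r \le t \mid S_1, S_2, \ldots, S_k) = P(S_{t+t_k} \le T \mid S_1, S_2, \ldots, S_k) = P(\theta (t + t_k) + \varepsilon(t + t_k) \le T \mid S_1, S_2, \ldots, S_k).$$

Let us put $Y = \theta (t + t_k) \in \mathbb{R}^+$ and let $f_1(y)$ be the probability distribution of Y. Then

$$f_1(y) = \frac{y^{\alpha-1}}{c\,(t + t_k)^{\alpha-1}} \exp\left[-\frac{1}{2\sigma_1^2}\left(\frac{y}{t + t_k} - \mu_1\right)^2\right] \frac{d\theta}{dy} = \frac{y^{\alpha-1}}{c\,(t + t_k)^{\alpha}} \exp\left[-\frac{1}{2\sigma_1^2}\left(\frac{y}{t + t_k} - \mu_1\right)^2\right].$$

Now let $U = Y + \varepsilon$. Let $g(\cdot, \cdot)$ be the joint distribution of Y and ε. Let $\phi(\cdot)$ be the probability distribution function of ε; then $g(y, \varepsilon) = f_1(y)\,\phi(\varepsilon)$, since Y and ε are independent. Hence

$$P(U \le u) = P(Y + \varepsilon \le u) = \int_{y=0}^{+\infty} \int_{\varepsilon=-\infty}^{u-y} g(y, \varepsilon)\, d\varepsilon\, dy = \int_{y=0}^{+\infty} \int_{\varepsilon=-\infty}^{u-y} f_1(y)\,\phi(\varepsilon)\, d\varepsilon\, dy$$

$$= \int_{y=0}^{+\infty} \int_{\varepsilon=-\infty}^{u-y} \frac{y^{\alpha-1}}{c\,(t + t_k)^{\alpha}} \exp\left[-\frac{1}{2\sigma_1^2}\left(\frac{y}{t + t_k} - \mu_1\right)^2\right] \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{\varepsilon^2}{2\sigma^2}\right) d\varepsilon\, dy$$

$$= \int_{y=0}^{+\infty} \frac{y^{\alpha-1}}{c\,(t + t_k)^{\alpha}} \exp\left[-\frac{1}{2\sigma_1^2}\left(\frac{y}{t + t_k} - \mu_1\right)^2\right] \left\{\int_{\varepsilon=-\infty}^{u-y} \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{\varepsilon^2}{2\sigma^2}\right) d\varepsilon\right\} dy$$

$$= \int_{y=0}^{+\infty} \frac{y^{\alpha-1}}{c\,(t + t_k)^{\alpha}} \exp\left[-\frac{1}{2\sigma_1^2}\left(\frac{y}{t + t_k} - \mu_1\right)^2\right] \Phi\left(\frac{u - y}{\sigma}\right) dy.$$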
Hence

$$P(L_r > t \mid S_1, S_2, \ldots, S_k) = P(\theta (t + t_k) + \varepsilon(t + t_k) \le T \mid S_1, S_2, \ldots, S_k) = P(U \le T) = \int_{y=0}^{+\infty} \frac{y^{\alpha-1}}{c\,(t + t_k)^{\alpha}} \exp\left[-\frac{1}{2\sigma_1^2}\left(\frac{y}{t + t_k} - \mu_1\right)^2\right] \Phi\left(\frac{T - y}{\sigma}\right) dy.$$

Therefore

$$P(L_r \le t \mid S_1, S_2, \ldots, S_k) = 1 - P(L_r > t \mid S_1, S_2, \ldots, S_k) = 1 - \int_{y=0}^{+\infty} \frac{y^{\alpha-1}}{c\,(t + t_k)^{\alpha}} \exp\left[-\frac{1}{2\sigma_1^2}\left(\frac{y}{t + t_k} - \mu_1\right)^2\right] \Phi\left(\frac{T - y}{\sigma}\right) dy.$$

Note that in Proposition 1, c is the normalizing constant for the posterior distribution of θ. To our knowledge, there is no closed form, hence we use numerical integration. The integrand is a simple function and any standard computing package with a built-in procedure for numerical integration can solve it easily. Note that for a given prior distribution we need to calculate c once each time the posterior is updated.

Proposition 2 expresses the residual-life distribution of the signal as an integral involving the cumulative distribution function of the standard normal distribution. Hence, to calculate the probability of the residual life at any point, t, we must use numerical integration. In this case, however, the integrand is a rather complicated function, and we need to use fine intervals to evaluate it numerically. This results in slower evaluation times, and we have not yet been successful in doing this quickly. Furthermore, to get the distribution of residual life, the function must be evaluated at many points over its entire support. So we have to do the numerical integration for a fairly large number of points. Hence, in this work we develop an alternative technique for getting the residual-life distribution once the posterior is known, and we leave the numerical integration problem for future research.

4.2. Empirical residual life distribution for a gamma prior

Given the posterior distribution of the stochastic parameter for a linear degradation model, we develop a simulation-based approach to compute an empirical residual-life distribution. We refer to this new technique as the “Gamma Prior Method.” Our objective is to generate the residual-life distribution of a component using the prior distribution of its stochastic parameter and data from its “Partial Signal.” Hence, given a “Partial Signal,” we first compute the posterior distribution $f(\theta \mid S_1, S_2, \ldots, S_k)$ using the “Partial Signal.” Then we make random draws from this posterior and, for each draw, run the signal from time $t_{T/2}$ to failure. Thus, each draw from the posterior generates a residual-life value, $t_r = t_T - t_{T/2}$ (cf. Section 3). The collection of these residual-life values provides an empirical distribution for the given “Partial Signal.” Algorithm 1 illustrates the technique.
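For reference, the integral of Proposition 2 can be evaluated at a single point t with a basic quadrature. Substituting θ = y/(t + t_k) turns the integral into the posterior expectation of Φ((T − θ(t + t_k))/σ), so the sketch below integrates over θ and normalizes by c computed on the same grid. This is a hypothetical illustration, not the authors' implementation: the midpoint grid, its size and the upper limit are assumptions, and the cost of repeating it over many values of t is the difficulty described above.

```python
import math

def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def residual_life_cdf(t, t_k, alpha, mu1, var1, sigma, threshold, n=4000):
    """Approximate P(L_r <= t | S_1, ..., S_k) of Proposition 2 by a midpoint rule."""
    horizon = t + t_k
    upper = max(mu1, 0.0) + 10.0 * math.sqrt(var1)   # assumed integration limit
    h = upper / n
    numerator = 0.0
    c = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        # Unnormalized posterior weight from Proposition 1.
        w = theta ** (alpha - 1.0) * math.exp(-(theta - mu1) ** 2 / (2.0 * var1))
        c += w
        numerator += w * std_normal_cdf((threshold - theta * horizon) / sigma)
    return 1.0 - numerator / c

# Hypothetical posterior: alpha = 1, mu_1 = 0.5, sigma_1^2 = 1e-4, observed up to t_k = 10.
cdf_values = [
    residual_life_cdf(t, 10.0, 1.0, 0.5, 1e-4, sigma=0.1, threshold=10.0)
    for t in (5.0, 10.0, 15.0)
]
```

In this example the predicted signal reaches the threshold on average at t = 10, so the three values are close to 0, 0.5 and 1.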
Algorithm 1. Methods to construct empirical residual-life distribution

Input: Prior ∼ π(θ | µ), Error ∼ N(0, σ²), Threshold T
Output: Set of empirical residual lives

foreach Signal i do
    Generate θ*_i from π(θ | µ)
    foreach Value j do
        Generate Signal S*(i, j) = φ + θ*_i t_j + ε_j;
        if S*(i, j) ≥ T/2 then
            hit_time_i = j; break;
    end
    Set S_obs = {S*(i, 1), S*(i, 2), ..., S*(i, hit_time_i)};
    Calculate parameters of f(θ | S*(i, 1), S*(i, 2), ..., S*(i, hit_time_i)) using S_obs;
    foreach m do
        Generate θ^p_im from posterior distribution f(θ | S*(i, 1), S*(i, 2), ..., S*(i, hit_time_i));
        foreach Value k do
            Generate Signal S^p(i, m, k) = φ + θ^p_im t_k + ε_k;
            if S^p(i, m, k) ≥ T then
                post_hit_time_im = k; break;
        end
        Set R_life(i, m) = post_hit_time_im − hit_time_i;
    end
end
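The inner loops of Algorithm 1 can be sketched in Python as follows. The sketch is illustrative only: the function names are hypothetical, posterior draws are passed in as a plain list (how they are generated is left open here), and the residual life is reported in time units rather than as a difference of indices.

```python
import random

def first_passage_index(theta, times, sigma, threshold, rng, phi=0.0):
    """Index of the first time the simulated signal reaches the threshold, or None."""
    for idx, t in enumerate(times):
        if phi + theta * t + rng.gauss(0.0, sigma) >= threshold:
            return idx
    return None

def empirical_residual_lives(posterior_draws, times, hit_index, sigma, threshold, rng):
    """One residual-life value per posterior draw, measured from times[hit_index]."""
    future = times[hit_index:]
    lives = []
    for theta in posterior_draws:
        idx = first_passage_index(theta, future, sigma, threshold, rng)
        if idx is not None:          # skip draws that never reach the threshold
            lives.append(future[idx] - times[hit_index])
    return lives

# Hypothetical noise-free example: slope 0.5, failure threshold T = 10.
rng = random.Random(0)
times = [float(i) for i in range(1, 101)]
hit = first_passage_index(0.5, times, 0.0, 5.0, rng)       # crossing of T/2
lives = empirical_residual_lives([0.5, 1.0], times, hit, 0.0, 10.0, rng)
```

With the error variance set to zero the example is deterministic, which makes it easy to check the bookkeeping before adding noise.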
Note that there is no standard method to generate random variates from the posterior distribution derived above (cf. Proposition 1). Hence, we use the Metropolis–Hastings algorithm (Metropolis et al., 1953; Hastings, 1970; Tierney, 1994; Chib and Greenberg, 1995) to sample from it. We provide a brief review of the Metropolis–Hastings algorithm below; Algorithm 2 summarizes it.
4.2.1. Review of Metropolis–Hastings algorithm

Suppose we want to draw random samples from a distribution π(x). One approach is to generate a sequence $\{X_n\}$ from a Markov chain with steady-state distribution π(·). The Metropolis–Hastings algorithm can be used to generate such a Markov chain for any given distribution π(·) provided we can calculate the density at the point x. Hence, we need to know π(x) only up to the normalizing constant. Let us select an arbitrary distribution $q(y; x) = q(y \mid x)$ as our proposal distribution. Here $q(y; x)$ represents the single-step transition probability from state x to state y. The initial algorithm proposed by Metropolis requires the proposal distribution to be symmetric, i.e., $q(y; x) = q(x; y)$. Later Hastings relaxed this condition.

We start the iteration with an initial value $x_0$ from the support of the target distribution π(x). At the nth state, we draw a new proposal state y with probability $q(y; x_n)$. Next we calculate the expression $\rho(x_n, y) = \rho_1 \rho_2$, where $\rho_1 = \pi(y)/\pi(x_n)$ and $\rho_2 = q(x_n; y)/q(y; x_n)$. Note that $\rho_1$ is the ratio of the target density at the proposed state y to that at the current state $x_n$, and $\rho_2$ is the ratio of the probability of jumping from state y to state $x_n$ to the probability of jumping from state $x_n$ to state y. If the proposal distribution is symmetric, we have $\rho_2 = 1$. Let

$$\alpha(x_n, y) = \begin{cases} \min\{\rho(x_n, y), 1\} & \text{if } \pi(x_n)\,q(y; x_n) > 0, \\ 1 & \text{otherwise.} \end{cases}$$

Let u be a random draw from uniform (0, 1). Now to select the next state $x_{n+1}$, we use the following rule:

$$x_{n+1} = \begin{cases} y & \text{if } u \le \alpha(x_n, y), \\ x_n & \text{otherwise.} \end{cases}$$

The distribution of the chain $\{X_n\}_{n \ge 1}$ converges to π as n → ∞. Thus, it is customary to leave out a certain number of values, say $n_0$, at the start of the sequence $\{X_n\}_{n \ge 1}$ and use the subsequent draws as the random samples from π. The samples $\{X_n\}_{n=1}^{n_0}$ are called burn-in.

Algorithm 2. Metropolis–Hastings algorithm

Input: Proposal distribution q(y | x), initial value x_0 from the support of target distribution π
Output: Markov chain with stationary distribution π

foreach Value n do
    Generate y from proposal distribution q(y; x_n);
    Set ρ(x_n, y) = q(x_n; y) π(y) / (q(y; x_n) π(x_n));
    if q(y; x_n) π(x_n) > 0 then
        Set α(x_n, y) = min{ρ(x_n, y), 1};
    else
        Set α(x_n, y) = 1;
    end
    Draw u from U(0, 1);
    if u ≤ α(x_n, y) then
        Set x_{n+1} = y;
    else
        Set x_{n+1} = x_n;
    end
end
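A minimal Python sketch of a Metropolis–Hastings sampler for the posterior of Proposition 1 is given below. It assumes a symmetric Gaussian random-walk proposal, so that ρ2 = 1 and only the ratio of target densities is needed; the proposal step size, the chain length and the burn-in are hypothetical tuning choices, and the paper does not state which proposal distribution was used.

```python
import math
import random

def log_target(theta, alpha, mu1, var1):
    """Log of the unnormalized posterior density of Proposition 1."""
    if theta <= 0.0:
        return -math.inf                     # outside the support
    return (alpha - 1.0) * math.log(theta) - (theta - mu1) ** 2 / (2.0 * var1)

def sample_posterior(alpha, mu1, var1, n_samples, burn_in, step, x0, rng):
    """Random-walk Metropolis draws; x0 must lie in the support (x0 > 0)."""
    x, log_px = x0, log_target(x0, alpha, mu1, var1)
    draws = []
    for n in range(burn_in + n_samples):
        y = x + rng.gauss(0.0, step)         # symmetric proposal, so rho_2 = 1
        log_py = log_target(y, alpha, mu1, var1)
        u = 1.0 - rng.random()               # uniform on (0, 1], keeps log(u) finite
        if log_py > -math.inf and math.log(u) <= log_py - log_px:
            x, log_px = y, log_py            # accept the proposed state
        if n >= burn_in:
            draws.append(x)                  # keep only post-burn-in states
    return draws

# Hypothetical posterior: alpha = 1, mu_1 = 0.5, sigma_1^2 = 0.01.
rng = random.Random(12345)
draws = sample_posterior(1.0, 0.5, 0.01, n_samples=20000, burn_in=1000,
                         step=0.2, x0=0.5, rng=rng)
sample_mean = sum(draws) / len(draws)
```

Working with log densities avoids underflow when σ1² is small, and the normalizing constant c never has to be computed because it cancels in the ratio.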
5. Comparing “gamma prior” and “normal prior”

In this section, we present a comparative study of the approximations provided by the “Normal Prior Method” and the “Gamma Prior Method” for different types of priors. We test both methods using two scenarios. First, we consider the case when the stochastic parameter θ of the linear degradation model follows a symmetric prior. Second, we test the case where θ has a skewed prior distribution. Finally, we evaluate the behavior of these two methods at different degrees of skewness of the distribution of θ.

5.1. Study for symmetric prior distributions of θ

In this section, let θ be normally distributed with known parameters. We first estimate the best fit gamma approximation to the specified normal prior by equating the mean and variance of the gamma with the normal. Figure 6 provides an example normal distribution with its gamma approximation. Then, we use this best fit gamma distribution to determine the residual-life distribution using Algorithm 1 of the “Gamma Prior Method.” We also compute the truncated Bernstein distribution using the “Normal Prior Method,” as described in Gebraeel et al. (2005). Next, we run the “Partial Signal” numerous times (approximately 150 times) from $t_k$ to the threshold T, with the same θ we started with, and obtain the residual-life distribution; i.e., we do not update θ by the posterior distribution. Our claim here is that, if we know θ exactly, then the distribution of the residual life will be influenced only by the error term. Otherwise, the distribution of the posterior
Fig. 6. Distribution of N(5.5, 5.25) and approximation gamma(5.55, 0.972).
Fig. 7. Histogram of residual lives from “Gamma Prior Method”, “Normal Prior Method” and “Partial Signal” for normal prior.
plays a major role. However, if the variance of the posterior can be made very small, θ can for practical purposes be treated as known. Computing the posterior from a large number of observations reduces its variance. Recall that we are estimating the posterior at a point where the signal has reached half the failure threshold. We divided this time horizon so that there are on average 200 observations in the interval. Table 2 compares the variances of the posterior and the error terms. Here $\sigma^2$ and $\sigma_P^2$ denote the variance of the error and the posterior, respectively. Note that the ratio $\sigma^2/\sigma_P^2$ is always very high. Thus, we can conclude that in these cases the effect of the posterior on the residual life is very small in comparison to the error.

Table 2. Comparison of variance of error and variance of the posterior

Skewness   σ²     σ_P²        σ²/σ_P²
5.16       0.16   1 × 10⁻³    145.45
3.65       0.18   8 × 10⁻⁴    202.23
2.39       0.20   9 × 10⁻⁴    210.52
2.00       0.15   8 × 10⁻⁴    180.72
1.63       0.17   8 × 10⁻⁴    200.00
1.26       0.19   1 × 10⁻³    190.00
1.15       0.15   8 × 10⁻⁴    168.54
Now we compare the residual-life distributions obtained from the “Gamma Prior Method” and the “Normal Prior Method” with this one. Figure 7 provides an example comparison of histograms of the residual-life distributions obtained from the “Normal Prior Method,” the “Gamma Prior Method” and the “Partial Signal” for the case shown in Fig. 6. It is clear from the figure that the “Gamma Prior Method” gives a very good estimate of the residual life of the “Partial Signal” when the distribution of θ is close to symmetric. In this case, the results of the “Gamma Prior Method” and the “Normal Prior Method” are very close to each other. For this example chi-square goodness of fit tests give P-values equal to 0.633 and 0.612 for the “Normal Prior Method” and the “Gamma Prior Method,” respectively. This is a clear indication that the proposed “Gamma Prior Method” works well when the normality assumptions of the original models in Gebraeel et al. (2005) are not violated. We performed ten such experiments with different symmetrical distributions of θ with similar results (see Table 3).

5.2. Study for skewed prior distributions of θ

Next, we assume that θ follows a highly skewed gamma distribution with known parameters. As described in the last subsection, we first equate the mean and variance of a gamma and a normal to obtain the best normal
Table 3. Chi-square P-value for ten runs of “Normal Prior Method” and “Gamma Prior Method” with normal prior

                     P-value from chi-square test
Test   µ     σ²     “Normal Prior Method”   “Gamma Prior Method”
1      3.5   3.25   0.534                   0.548
2      4.0   4.00   0.550                   0.532
3      4.5   4.25   0.601                   0.622
4      5.0   5.00   0.598                   0.529
5      5.5   5.25   0.633                   0.612
6      6.0   5.50   0.596                   0.620
7      6.5   5.00   0.582                   0.601
8      7.0   4.75   0.612                   0.597
9      7.5   4.50   0.542                   0.563
10     8.0   3.75   0.601                   0.573
approximation of the gamma distribution. Figure 8 provides an example gamma distribution with its normal approximation. Then we use this best fit normal distribution to compute the truncated Bernstein distribution using the “Normal Prior Method.” We also determine the residual-life distribution using Algorithm 1 of the “Gamma Prior Method.” Then, as before, we obtain the residual-life distribution from the “Partial Signal” by running the signal numerous times (approximately 150 times) from $t_k$ to the threshold T, without updating θ by the posterior. Next, we compare the residual-life distributions obtained from the “Gamma Prior Method” and the “Normal Prior Method” with this one. Figure 9 provides an example comparison of histograms of the residual-life distributions obtained from the “Normal Prior Method,” the “Gamma Prior Method” and the “Partial Signal” for the case shown in Fig. 8. From Fig. 9, it is clear that the “Gamma Prior Method” gives a very good estimate of the residual life of the “Partial Signal” when θ has a highly skewed distribution. A chi-square goodness of fit test gives a P-value equal to 0.601. However, in this case the “Normal Prior Method” fails to take into account the variation due to the skewness of the distribution of θ. A chi-square goodness of fit test gives a P-value equal to 0.011. Figure 9 clearly shows the difference in the probability distribution function of the residual life obtained from these two methods. A natural question that now follows is how the “Gamma Prior Method” and the “Normal Prior Method” perform as the skewness of the distribution of θ changes. In the following subsection we address this question.
5.3. Study for prior distributions of θ with different skewness
In this section we vary the skewness of the distribution of the stochastic parameter θ of the linear degradation model and compare the performance of the “Gamma Prior Method” and the “Normal Prior Method” in predicting the residual-life distribution. For this purpose we let θ have a gamma distribution with parameters α and β. The skewness (γ) of a distribution F is defined as the ratio of the third central moment to the cube of the standard deviation of the distribution. Thus, $\gamma = \mu_3 / \mu_2^{3/2}$, where $\mu_p = E_F\left[(X - E(X))^p\right]$ is the pth central moment. For the gamma distribution we therefore have $\gamma = 2/\sqrt{\alpha}$. We now vary the shape parameter α of the gamma distribution to obtain priors with different skewness levels.
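As a quick check of the skewness formula, the following sketch evaluates $\gamma = 2/\sqrt{\alpha}$ for the shape values used in this section and confirms that it agrees with the moment-based definition. The function names are hypothetical, and β is treated as a scale parameter (an assumption; the result is the same if β is a rate, because the skewness does not depend on β). The central moments used are the standard ones for a gamma distribution with shape α and scale β: $\mu_2 = \alpha\beta^2$ and $\mu_3 = 2\alpha\beta^3$.

```python
import math


def gamma_skewness(alpha):
    """Closed-form skewness of a gamma distribution with shape alpha."""
    return 2.0 / math.sqrt(alpha)


def skewness_from_moments(alpha, beta):
    """Skewness mu_3 / mu_2^(3/2) from the gamma central moments.

    Assumes beta is a scale parameter: mu_2 = alpha * beta^2 and
    mu_3 = 2 * alpha * beta^3.
    """
    mu2 = alpha * beta ** 2
    mu3 = 2.0 * alpha * beta ** 3
    return mu3 / mu2 ** 1.5


# Shape values considered in this section.
shape_values = [0.15, 0.30, 0.70, 1.00, 1.50, 2.50, 3.00]

for alpha in shape_values:
    print(f"alpha = {alpha:4.2f}  skewness = {gamma_skewness(alpha):.2f}")
```

Because β cancels in the ratio, changing the scale of the prior leaves the skewness unchanged; only the shape parameter α controls it.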
Fig. 8. Distribution of skewed gamma(0.3, 1.12) (skewness = 3.65) and approximation N(1.12, 1.25).
Fig. 9. Comparison of residual-life distributions from “Gamma Prior Method” and “Normal Prior Method” for α = 0.3, γ = 3.65.
Fig. 10. Comparison of residual-life distributions from “Gamma Prior Method” and “Normal Prior Method” for α = 0.15, γ = 5.16.
Fig. 11. Comparison of residual-life distributions from “Gamma Prior Method” and “Normal Prior Method” for α = 0.7, γ = 2.39.
Fig. 12. Comparison of residual-life distributions from “Gamma Prior Method” and “Normal Prior Method” for α = 1, γ = 2.
Fig. 13. Comparison of residual-life distributions from “Gamma Prior Method” and “Normal Prior Method” for α = 1.5, γ = 1.63.
Fig. 14. Comparison of residual-life distributions from “Gamma Prior Method” and “Normal Prior Method” for α = 2.5, γ = 1.26.
Fig. 15. Comparison of residual-life distributions from “Gamma Prior Method” and “Normal Prior Method” for α = 3, γ = 1.15.
Then we repeat the analysis performed in Section 5.2 using the “Gamma Prior Method,” the “Normal Prior Method” and the “Partial Signal.” Figures 9 to 15 are example histograms of the residual-life distributions obtained from these three sources. Table 4 reports the chi-square goodness-of-fit P-values obtained when the residual-life distribution from each method is compared with that from the “Partial Signal” for these cases. Note that for values of α ≥ 1.5, i.e., for γ ≤ 1.63, both the “Normal Prior Method” and the “Gamma Prior Method” perform well. Hence, when the prior distribution has a small skewness, we can use either of these two methods. However, as the skewness increases, the advantage of the “Gamma Prior Method” becomes more prominent: its P-value stays near 0.6 in all cases, whereas that of the “Normal Prior Method” falls from 0.596 at γ = 1.15 to 0.003 at γ = 5.16. In such cases, the “Normal Prior Method” tends to underestimate the actual residual-life distribution (when the prior is skewed to the right).

Table 4. Comparison of the “Normal Prior Method” and the “Gamma Prior Method” for different skewness values of the prior

                    P-value from chi-square test
α      Skewness   “Normal Prior Method”   “Gamma Prior Method”
0.15   5.16       0.003                   0.614
0.30   3.65       0.011                   0.601
0.70   2.39       0.030                   0.612
1.00   2.00       0.040                   0.602
1.50   1.63       0.068                   0.611
2.50   1.26       0.332                   0.601
3.00   1.15       0.596                   0.610
6. Conclusions
An important objective in maintenance and reliability research is to compute residual-life distributions for a functioning component using real-time condition information. Gebraeel et al. (2005) presented a Bayesian updating approach that incorporates real-time condition information into residual-life models. This method assumes that the stochastic component of a degradation model follows a normal prior. In this paper we have shown that the method described in Gebraeel et al. (2005) performs poorly when the prior is highly skewed. We have also developed a method for estimating the residual-life distribution when the prior distribution of the stochastic component is gamma. Our major contributions in this work are: (i) a new simulation-based approach for the case where the prior distribution of the stochastic component of a linear degradation model is either symmetric or skewed; and (ii) an initial approximation of the skewness threshold of the prior distribution below which both the “Gamma Prior Method” and the “Normal Prior Method” can be used satisfactorily (a skewness of about 1.63 in our experiments). The major advantage of using a gamma distribution is that its two parameters provide enough flexibility to approximate a large class of distributions defined on the positive half of the real line. This also helps us avoid the tedious and complicated numerical integrations needed to evaluate the posterior that may arise from a non-standard prior distribution of the stochastic component.
References
Chib, S. and Greenberg, E. (1995) Understanding the Metropolis–Hastings algorithm. The American Statistician, 49, 327–335.
Dimla, D.E. (1999) Artificial neural networks approach to tool condition monitoring in a metal turning operation, in Proceedings of the Seventh IEEE International Conference on Emerging Technologies and Factory Automation (ETFA ’99), UPC Barcelona, Spain, pp. 313–320.
Doksum, K. and Hoyland, A. (1991) Models for variable-stress accelerated life testing experiments based on Wiener processes and the inverse Gaussian distribution. Technometrics, 34, 74–82.
Elwany, A. and Gebraeel, N. (2006) A bivariate stochastic degradation model for computing and updating residual life distributions of partially degraded components. IEEE Transactions on Reliability, submitted.
Fararooy, S. and Allan, J. (1995) On-line condition monitoring of railway equipment using neural networks, in IEE Colloquium on Advanced Condition Monitoring Systems for Railways, IEE, London, UK, pp. 211–9.
Feser, K., Feuchter, B., Lauersdorf, M. and Leibfried, T. (1995) General trends in condition monitoring of electrical insulation of power transformers, in Stockholm Power Tech International Symposium on Electric Power Engineering, IEEE, Piscataway, NJ, pp. 104–109.
Gebraeel, N. (2006) Sensory-based degradation models for components with exponential degradation patterns. IEEE Transactions on Automation Science and Engineering, 3(4), 382–393.
Gebraeel, N., Lawley, M.A., Li, R. and Ryan, J.K. (2005) Residual-life distributions from component degradation signals: A Bayesian approach. IIE Transactions, 37, 543–557.
Goode, K., Roylance, B. and Moore, J. (1998) Development of a predictive model for monitoring condition of a hot strip mill. Ironmaking and Steelmaking, 25, 42–46.
Hastings, W.K. (1970) Monte-Carlo sampling methods using Markov chains and their applications. Biometrika, 57, 97–109.
Lu, C. and Meeker, W. (1993) Using degradation measures to estimate a time-to-failure distribution. Technometrics, 35, 161–174.
Lu, H., Kolarik, W. and Lu, S. (2001) Real-time performance reliability prediction. IEEE Transactions on Reliability, 50, 353–357.
Martin, K. (1994) A review by discussion of condition monitoring and fault diagnosis in machine tools. International Journal of Machine Tools & Manufacture, 34, 527–551.
Metropolis, N., Rosenbluth, M.N., Teller, A.H. and Teller, E. (1953) Equations of state calculations by fast computing machines. Journal of Chemical Physics, 21, 1087–1092.
Nelson, W. (1990) Accelerated Testing Statistical Models, Test Plans, and Data Analysis, Wiley, New York, NY.
Shao, Y. and Nezu, K. (2000) Prognosis of remaining bearing life using neural networks. Proceedings of the Institute of Mechanical Engineers, 217–230.
Thorsen, O. and Dalva, M. (1999) Failure identification and analysis for high-voltage induction motors in the petrochemical industry. IEEE Transactions on Industry Applications, 35, 810–818.
Tierney, L. (1994) Markov chains for exploring posterior distributions (with discussion). Annals of Statistics, 22, 1701–1762.
Tseng, S., Hamada, M. and Chiao, C. (1995) Using degradation data to improve fluorescent lamp reliability. Journal of Quality Technology, 27, 363–369.
Wang, W. (2000) A model to determine the optimal critical level and the monitoring intervals in condition-based maintenance. International Journal of Production Research, 38, 1425–1436.
Whitmore, G. (1995) Estimating degradation by a Wiener diffusion process subject to measurement error. Lifetime Data Analysis, 1, 307–319.
Whitmore, G. and Schenkelberg, F. (1997) Modeling accelerated degradation data using Wiener diffusion with a time scale transformation. Lifetime Data Analysis, 3, 27–45.
Yang, K. and Jeang, A. (1994) Statistical surface roughness checking procedure based on a cutting tool wear model. Journal of Manufacturing Systems, 13, 1–8.
Yang, K. and Yang, G. (1998) Degradation reliability assessment using severe critical values. International Journal of Reliability, Quality and Safety Engineering, 5, 85–95.
Biographies
Santanu Chakraborty is a Ph.D. student in the Department of Industrial Engineering at Purdue University, West Lafayette, IN. He received his master’s degree from the University of Connecticut, Storrs, CT in 2005. His research interests include decision under uncertainty, stochastic programming and healthcare engineering.

Nagi Gebraeel is an Assistant Professor in the H. Milton Stewart School of Industrial and Systems Engineering at Georgia Tech. He received his Ph.D. from Purdue University in 2003. His research interests are in prognostics and prognostics-based logistics, degradation modeling and reliability engineering. His research is mostly targeted towards improving the accuracy of predicting unexpected failures of engineering systems by leveraging sensor-based data streams. He is also interested in studying the impact of these developments on maintenance operations and spare parts logistics. He is a member of IIE and INFORMS.

Mark Lawley is an Associate Professor in the Weldon School of Biomedical Engineering at Purdue University. Before joining Biomedical Engineering in 2007, he served 9 years as an Assistant and then Associate Professor of Industrial Engineering, also at Purdue, and 2 years as an Assistant Professor of Industrial Engineering at the University of Alabama. He has also held engineering positions with Westinghouse Electric Corporation, Emerson Electric Company and the Bevill Center for Advanced Manufacturing Technology. As a researcher in academia, he has authored over 80 technical papers including book chapters, conference papers, and refereed journal articles, and has won three best paper awards for his work in systems optimization and control. He received a Ph.D. in Mechanical Engineering from the University of Illinois at Urbana Champaign in 1995 and is a registered Professional Engineer in the State of Alabama.

Hong Wan is an Assistant Professor in the School of Industrial Engineering at Purdue University. Her research interests include design and analysis of simulation experiments; simulation optimization; simulation of manufacturing, healthcare and financial systems; quality control; and applied statistics. She has taught a variety of courses and is a member of INFORMS and ASA.