Received: 10 October 2016
Revised: 22 March 2018
Accepted: 3 April 2018
DOI: 10.1002/qre.2312
RESEARCH ARTICLE
A novel Bayesian approach to reliability modeling: The benefits of uncertainty evaluation in the model selection procedure
D.D. Santana, C.L.S. Figueirôa Filho, I. Sartori, Márcio A.F. Martins
Programa de Pós Graduação em Engenharia Industrial (PEI), Universidade Federal da Bahia, Salvador, Brazil
Correspondence: Márcio A. F. Martins, Universidade Federal da Bahia, Programa de Pós Graduação em Engenharia Industrial (PEI), Salvador, Brazil. Email: [email protected]
Funding information: Brazilian Sports Ministry and the National Council for Scientific and Technological Development
Abstract
This paper proposes a different likelihood formulation within the Bayesian paradigm for parameter estimation of reliability models. Moreover, the assessment of the uncertainties associated with parameters, the goodness of fit, and the model prediction of reliability are included in a systematic framework for better aiding the model selection procedure. Two case studies are appraised to highlight the contributions of the proposed method and demonstrate the differences between the proposed Bayesian formulation and an existing Bayesian formulation.
KEYWORDS
Bayesian analysis, parametric uncertainty, reliability model
1 INTRODUCTION
The parameters of reliability models are estimated from field data or are based on expert knowledge in order to adequately describe the behavior of failures of a system or its components. Louit et al1 state that the collection of data in the field usually focuses on maintenance management rather than reliability, which makes the information content poor and misleading. In addition, in some situations, the field data from the system or product under study are not available2 or are scarce. In other words, the exact values of the parameters of the model are hard to obtain because of uncertainties in the data. Therefore, a single-value estimate of these parameters is not enough to represent the knowledge acquired about them.3
A problem in this area is adjusting models to fit field data and defining which model best represents the behavior observed,1,4 regarding reliability, hazard rate, or density function. It is common to apply the best estimate of goodness of fit indices in order to reach a decision on this issue. However, as will be discussed in this paper, predictions, parameters, and goodness of fit indices have associated uncertainties and coverage intervals; consequently, if such properties are considered in the decision procedure, the model selected to describe a certain data set can change. Another aspect that can affect the selection of models is the likelihood applied in the estimation procedure, since the uncertainties associated with parameters, prediction, and goodness of fit are influenced by the posterior formulation applied.
In this context, the aim of this paper is to propose a new Bayesian formulation for the parameter estimation problem in reliability modeling, based on the residuals of estimation. This formulation is integrated into a systematic method to perform model selection, in which the uncertainties of the parameters, the prediction, and the goodness of fit are key factors to compare different models and reach a decision. Most studies do not consider this information. This paper follows the guidance from the Guide to the Expression of Uncertainty in Measurement (GUM) and its supplements,5-7 which are the international references for the uncertainty evaluation task.
This article is organized as follows. Section 2 presents a state-of-the-art review. Section 3 describes the aforementioned Bayesian formulation for the estimation problem, including overall guidance for using it in the model selection stage. Section 4 details 2 case studies so as to highlight the effectiveness of the proposed Bayesian formulation and to bring out the influence of uncertainty in the model selection decision. Finally, Section 5 offers some concluding remarks.
SANTANA ET AL. Qual Reliab Engng Int. 2018;1–15. wileyonlinelibrary.com/journal/qre Copyright © 2018 John Wiley & Sons, Ltd.
2 THE STATE OF THE ART
A comprehensive review of reliability model selection is presented by Louit et al1 and Settanni et al,4 ranging from the identification of the type of system being studied, whether repairable or nonrepairable, to the evaluation of goodness of fit. The authors highlight the difference between modeling nonrepairable and repairable systems: the former is based on statistical distributions, whereas the latter uses stochastic point processes. With the purpose of more thoroughly evaluating the uncertainties related to issues concerning the Bayesian paradigm, the scope of this paper will cover a model selection procedure that associates the parameter characteristics for nonrepairable reliability systems without censoring, due to the simplicity of their modeling.
Classical (frequentist) or Bayesian paradigms are commonly applied to evaluate the best parameter estimates and their associated uncertainties for a certain model, one of the first steps for model selection. Each paradigm has its own set of techniques to perform parameter estimation. For instance, when applied to the field of reliability, the techniques of classical statistical inference commonly adopted are maximum likelihood estimation (MLE),8,9 least squares,10 weighted least squares,11,12 or graphical analysis.13-15 Each classical technique has its own set of hypotheses that needs to be satisfied to sustain the statistical coherence of the results. Conversely, in the Bayesian paradigm, the properties of the estimators are based on the posterior distribution,16 which reduces the number of assumptions required to sustain the statistical analysis. Touwn et al17 state that this paradigm has long been popular, especially when the data are scarce and other parameter estimation techniques, which could use the classical paradigm for instance, can be unstable. 
This statement is reinforced by Louit et al,1 Gupta et al,18 and Peng and Yan.19 Furthermore, when this paradigm is applied in the field of reliability, Guo et al20 affirm that it is less dependent on observed failure data due to the additional information introduced by a prior factor based on accumulated knowledge about the system. This characteristic allows continuous updating of the state of knowledge of the parameters,21,22 which allows us to straightforwardly obtain their empirical distributions, coverage intervals and regions. They carry information about
the model quality and can be used to identify if a model adequately represents a certain data set. However, to the best of the authors' knowledge, this is not explored in the reliability literature to support model selection. Focusing on the framework for model selection, Gupta et al18 state that the classical statistics inference seeks to merely compare the data with the best fitted model, neglecting the uncertainties related to the parameter estimates, which leads to an inaccurate evaluation of the quality of different models. Regarding the Bayesian paradigm, they recommend using function plots based on the empirical distribution and predictive data, such as the hazard rate and the cumulative distribution, as an indication of model quality. However, these empirical distributions are hard to estimate from data,8 which makes the comparison of nonparametric estimates with the model predictions difficult. Table 1 summarizes some numerical goodness of fit indices commonly used in the field of reliability engineering. These indices can be used to assess the quality of just 1 model that describes a certain data set (model quality), or to compare different models so as to select the one which better describes this set (model selection). To the best of the authors' knowledge, common practice is to select as the best model one which is associated with the best-point estimate of the index to be evaluated, which means that even when close values for such indices are obtained for different models, the selection is carried out by choosing the model with the best index. 
This can be exemplified by the works of Murthy et al,10 Zhang et al,14 Zhang and Dwight,15 Peng and Yan,19 Guo et al,20 Al-Garni and Jamal,28 Castet and Saleh,24 Dubos et al,23 and Woodcock et al.29 Nonetheless, other authors evaluate the predictions of models considering their associated uncertainty, as follows: (1) Upadhyay et al26 and Gupta et al18 assess the Bayesian predictive P value to evaluate the compatibility of models with the observed data, although Upadhyay et al26 do not advocate using it to perform model selection; (2) Iesmantas and Alzbutas30 evaluate the 95% lower and upper bounds of 2 models in order to compare their prediction capabilities; and (3) Méndez-González et al31 assess the confidence intervals of parameters, applying them to analyze if there are significant differences between the physical meaning of the evaluated models. In summary, model selection generally focuses on the prediction capability of the model, while some characteristics of the parameters are not used as an indication of the model quality, despite being directly available in the Bayesian paradigm, namely, their covariance matrix, coefficients of correlation, and coverage regions. It is expected that, besides a small uncertainty, the parameters should present a small correlation as well, which means that each parameter explains different parts of the experimental
TABLE 1 Goodness of fit indices applied in reliability analysis and examples of their usage, in both classical and Bayesian paradigms
Columns: Index (a, b); Classical (MLE, LS, WLS); Bayesian; Examples of usage
KolmogorovSmirnov4,19 Coefficient of
x
Model selection
determination (R2 )14,15,20,23
x
x
Akaike criterion (AIC)9,15,19
x
x
Model selection Weibull family Model selection
F statistic14
x
Techniques comparison
Objective function10 Mean square
x
Weibull family Model selection Techniques comparison
Techniques comparison x
x
Weibull family
error of residuals14 Maximum residual error24,25
Model quality Weibull family Model selection Model quality;
x
x
residual error23-25
x
x
Residual mean12,24 Residual variance12,24 Residual percentiles24
x
x
x
x
x
x
x
x
Average
Predictive P value18,26
Weibull family Model selection Techniques comparison Model selection Techniques comparison Model selection Model selection x
Model quality
Abbreviations: LS, least squares; MLE, maximum likelihood estimation; WLS, weighted LS.
a AIC is the Akaike criterion proposed by Akaike.27
b In the evaluation of residual characteristics, graphical techniques can be carried out, such as box plots.
data. Within this context, one proposal of this paper is to use such characteristics as model quality indices in order to support the model selection procedure. As detaching data from its uncertainty is impossible, it is essential to propagate the parameter uncertainties through the proposed model to its predictions, such as the reliability, hazard rate, density function, mean time to failure, and mean time between failures. As a result, not only should the predictions be calculated for the best fitted parameters but also the coverage intervals of each predicted value should be evaluated in order to describe the model uncertainty. In this context, goodness of fit measures, in turn, will also have associated uncertainties that will have a major impact on the model selection. Lawless8 argues that models vary in complexity and in the strength of their assumptions, and therefore, the validation stage should vary accordingly. The author highlights that graphical analysis, such as plotting the model predictions against nonparametric estimators, has its limitations, and formal hypothesis testing is an important part of model selection.
Moreover, Gupta et al18 comment that numerical measures, eg, the ones used as goodness of fit indices, tend to narrowly focus on a particular aspect of the relationship between model and data, and as a consequence there should be a set of numerical and graphical measures to evaluate the model fit. Therefore, this paper will also look into the benefits of evaluating the uncertainty of such indices, when they are applied to perform the selection of models. All the issues previously discussed are deeply influenced by the likelihood definition. Generally, the parameters are not estimated with different likelihood formulations, but according to Evans32 different ways of performing the estimation should be evaluated. In the Bayesian paradigm, the likelihood applied is based on the density function of the observed data, evaluated by a certain model. Therefore, this motivates the investigation of a different likelihood for such a task that, to the best of the authors' knowledge, is absent in the reliability literature: the one based on residuals of estimation, which will be the Bayesian approach proposed in this paper.
3 PROPOSED BAYESIAN METHOD
3.1 A systematic sketch
The proposed method for model selection ranges from parameter estimation to goodness of fit evaluation, considering their associated uncertainties, in order to support the decision about the model that best describes a certain data set. Figure 1 summarizes this method, which consists of the following steps:
1. Observed data and prior knowledge: Gather the failure times and expert knowledge about the failure being assessed.
2. Posterior formulation and integration: Estimate the parameters of the model using different likelihood formulations. One is a likelihood commonly applied in the literature and the other is the one proposed in this paper, based on residuals of estimation. This issue is discussed in Section 3.2.
3. Evaluate the uncertainty of parameters, prediction, and goodness of fit, besides their best estimates. This additional step is discussed in Section 3.3.
4. Compare the indices and select the model that best describes the failure data. This is exemplified in Section 4.
FIGURE 1 Summary of the model selection method highlighting the further stages proposed in this paper. Solid lines represent commonly used steps, while dashed lines (----) represent the additional ones proposed in this paper
The uncertainty evaluation steps included in this proposed method aim to thoroughly explore each model's capability to represent the data and provide different characteristics by which the models can be compared, with different likelihood formulations. This helps avoid some undesirable situations, in particular: (1) overfitting, when a model has an excessive number of parameters; (2) models with highly correlated parameters; (3) selection of a model whose predictions are highly uncertain over another one which can be more precise. Here, we seek to argue that such steps are a decisive mechanism to guide the choice of a reliability model.
In what follows, we highlight topics on how an uncertainty evaluation can guide the decision-making process, namely,
1. parameter uncertainty: It is desired that the parameter uncertainty be small, which indicates the precision of such parameters in explaining a certain event.
2. parameter coverage regions: These regions help to identify correlation among the parameters. If there is a high dependence among them, this could indicate that the model structure is not completely adequate to describe the data, even though the model presents good prediction capabilities.
3. coverage intervals of the model's predictions: Model predictions with smaller coverage intervals are preferable to those with wider ones, assuming that both have acceptable mean prediction values. Besides, if their coverage intervals overlap each other, it can be inferred that they contain the same amount of information, to a certain degree, so that any of the models can be chosen to describe the studied phenomena.
4. goodness of fit coverage intervals: If the coverage intervals overlap, it is not possible to distinguish the differences in the performance of the models, thus demonstrating that any analyzed model could be chosen to be the predictor. This implies that the model to be chosen is necessarily the one with the simpler model structure, following the principle of parsimony.
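The parsimony rule in item 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `intervals_overlap` and `select_model` are hypothetical, a larger index value is taken to mean a better fit (as with R2), and the numbers are the residual-approach R2 values reported later in Table 5 together with the parameter counts of the two models of Section 4.1.

```python
def intervals_overlap(a, b):
    """True if the closed intervals a = (low, high) and b = (low, high) intersect."""
    return a[0] <= b[1] and b[0] <= a[1]


def select_model(candidates):
    """Pick one model from dicts with keys name, n_params, estimate, interval.

    Assumption: a larger index value means a better fit (e.g. R2).
    """
    best = max(candidates, key=lambda c: c["estimate"])
    # Models whose coverage interval overlaps that of the best point estimate
    # are treated as indistinguishable from it.
    tied = [c for c in candidates if intervals_overlap(c["interval"], best["interval"])]
    # Parsimony: among indistinguishable models, keep the one with fewest parameters.
    return min(tied, key=lambda c: c["n_params"])


# R2 estimates and 95% coverage intervals of the residual approach (Table 5).
candidates = [
    {"name": "extended Weibull", "n_params": 3, "estimate": 0.987, "interval": (0.986, 0.988)},
    {"name": "lognormal", "n_params": 2, "estimate": 0.986, "interval": (0.985, 0.987)},
]
chosen = select_model(candidates)
print(chosen["name"])
```

Ranking by the point estimate alone would favour the extended Weibull model here; once the overlapping intervals are taken into account, the sketch falls back on the simpler structure.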
3.2 Posterior distribution proposal
Bayesian statistics can be regarded as a way of providing a numerical description of the state of knowledge of a random quantity from any kind of rational relevant information available, namely, observable and/or unobservable (a priori information).16 From a parametric analysis of reliability models, the Bayesian formulation aims to provide the posterior distribution of the parameters to be estimated, thus describing all the available information about them, which is contained in their prior information and failure time data, or
$$p(\boldsymbol{\theta} \mid \mathbf{t}) \propto L(\mathbf{t} \mid \boldsymbol{\theta}) \cdot p(\boldsymbol{\theta}), \tag{1}$$
in which t = [t1, …, ti, …, tn]^T is the vector of failure times, 𝜽 = [𝜃1, …, 𝜃q]^T is the parameter vector corresponding to the failure probability distributions; p(𝜽|t) stands for the posterior joint distribution of the parameters given the available data, L(t|𝜽) represents the likelihood function that relates the data and the parameters to be estimated, and p(𝜽) describes the prior state of knowledge about the parameters. The likelihood formulation is somewhat arbitrary since it depends on establishing objectives for the parametric analysis. One formulation commonly applied to adjust reliability models, when the failure times are uncorrelated and noncensored, denoted here as the Bayesian conventional approach, is the one given by18,26,33:
$$L = \prod_{i=1}^{n} f(t_i; \boldsymbol{\theta}), \tag{2}$$
where f(ti; 𝜽) is the probability density function of the model. A drawback of this formulation is the fact that its results are fairly conservative, ie, the uncertainty associated with the estimated parameters (coverage regions) can be very large. Therefore, the coverage intervals of the model predictions can overlap, which makes it difficult to identify the different prediction capabilities of each model being assessed. In an attempt to circumvent the limitations of the Bayesian conventional approach, the present paper seeks to use an alternative expression for the likelihood function, which is based on the residuals of the estimation, e(ti, 𝜽). Such a likelihood is assessed by Gamerman and Migon16 for any class of models, under the hypotheses of normality, noncorrelation, and zero mean:
$$L = \prod_{i=1}^{n} \frac{1}{\sqrt{2 \cdot \pi \cdot \phi}} \cdot \exp\left[-\frac{e(t_i, \boldsymbol{\theta})^2}{2 \cdot \phi}\right], \tag{3}$$
where 𝜙 is the residual variance, ie, a measure of the quality of the model. In this formulation it is an additional parameter, which is properly incorporated into the vector of model parameters 𝜽. In fact, the proposed approach is formulated in such a way as to apply Equation 3 to the field of reliability. Therefore, the residuals, e(ti, 𝜽), will be based on the deviations between a nonparametric reliability estimator, Rnp(ti), and the predictions of the reliability model to be adjusted, Rm(ti, 𝜽). The result is that this approach uses a nonparametric estimator as the observed data, instead of using the collected failure time data. Consequently, the residuals are expressed as
$$e(t_i, \boldsymbol{\theta}) = R^{np}(t_i) - R^{m}(t_i, \boldsymbol{\theta}), \tag{4}$$
where Rnp (ti ) is a nonparametric estimator for the reliability, such as the Kaplan-Meier,34 the improved product limit method,12 the median rank approximation,35 or the Nelson-Aalen36,37 ; Rm (ti , 𝜽) is the reliability evaluated by the model at ti . The assumptions behind the formulation of Equation 3 are the desired properties in the stage of model validation (residual analysis). As a result, when Equation 4 expresses the residuals, it is expected that the deviations between a nonparametric reliability estimator and the predictions of the model are a random variable described by a Gaussian distribution with zero mean. Even when the likelihood is expressed by Equation 2, the aforementioned so-called conventional approach, these residual properties are sought so as to evaluate the quality of the parameter estimate and, as a consequence, of the model itself. Therefore, 1 advantage of the proposed Bayesian approach is to explicitly incorporate such desired properties in its likelihood formulation. The residuals based on the hazard rate, h(t), or density function, f(t), could also be applied to Equation 4, but, according to Lawless,8 nonparametric estimates for them are inherently difficult to obtain from experimental data. This is because they represent rates of change in probability, thus the estimation procedure would be overburdened. Such a limitation can be reduced by expert knowledge about the characteristics of the failure of the process/equipment, or even if a great number of registered failures are available. As a result, given a good set of failure time data, at least theoretically, different residual formulations can be assessed in order to thoroughly explore the capabilities of the model to describe different aspects of the data: the reliability, the hazard rate, or a density function.
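The two likelihoods of Equations 2 and 3 can be contrasted with a short Python sketch working on the log scale. Everything here is an assumption made for illustration: the function names are hypothetical, the nonparametric estimator is taken to be Bernard's median rank approximation, F_i = (i − 0.3)/(n + 0.4), and the exponential model with made-up failure times is only a toy stand-in for the reliability models discussed in the paper.

```python
import math


def median_rank_reliability(n):
    """Nonparametric reliability at the ordered failure times, i = 1..n.

    Assumption: Bernard's median rank approximation F_i = (i - 0.3) / (n + 0.4).
    """
    return [1.0 - (i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def log_lik_conventional(times, theta, log_pdf):
    """Logarithm of Equation 2: sum of log f(t_i; theta)."""
    return sum(log_pdf(t, theta) for t in times)


def log_lik_residual(times, theta, phi, reliability):
    """Logarithm of Equation 3, with the residuals defined by Equation 4."""
    ordered = sorted(times)
    r_np = median_rank_reliability(len(ordered))
    total = 0.0
    for t, r in zip(ordered, r_np):
        e = r - reliability(t, theta)  # Equation 4
        total += -0.5 * math.log(2.0 * math.pi * phi) - e * e / (2.0 * phi)
    return total


# Toy exponential model (hypothetical, for illustration only): theta = [rate].
def exp_log_pdf(t, theta):
    return math.log(theta[0]) - theta[0] * t


def exp_reliability(t, theta):
    return math.exp(-theta[0] * t)


toy_times = [1.0, 2.0, 3.0]  # made-up failure times
print(log_lik_conventional(toy_times, [0.5], exp_log_pdf))
print(log_lik_residual(toy_times, [0.5], 1e-2, exp_reliability))
```

Note that the residual form never evaluates the density of the failure times; it only compares two reliability curves, which is why 𝜙 has to be estimated alongside the model parameters.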
Another important factor to be considered in Bayesian inference, either the conventional approach or the proposed approach, is the prior distribution. According to Robert and Kamary,38 this distribution is a tool used to summarize the a priori information available about the phenomenon being studied. If there is a lack of prior information about the phenomenon, the use of noninformative priors is needed.38 Finally, since the likelihood and prior distributions are established, the joint posterior distribution of the parameters, p(𝜽|t), Equation 1, needs to be integrated in order to obtain each marginal distribution, 𝑝 (𝜃i |t). This can be interpreted as a major advantage of the Bayesian paradigm because the covariance matrix and the coverage region can be obtained without further assumptions. Markov chain Monte Carlo simulations (MCMC) have widely been used for this purpose, and a good review of the MCMC procedures applied to the field of reliability can be found in Guo et al,20 Beck and Au,21 Gupta et al,18 and Upadhyay et al.26
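As a rough illustration of how the joint posterior of Equation 1 can be sampled, the following Python sketch implements a random-walk Metropolis sampler. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `metropolis`, the symmetric uniform proposal, the step widths, and the standard normal target used in the example are all choices made here for illustration.

```python
import math
import random


def metropolis(log_post, theta0, step, n_iter, burn_in_frac=0.1, seed=0):
    """Random-walk Metropolis sampler with a symmetric uniform proposal.

    log_post must be finite at theta0; it may return -inf outside the support.
    Returns the chain (list of parameter vectors) after discarding the burn-in.
    """
    rng = random.Random(seed)
    theta = list(theta0)
    lp = log_post(theta)
    chain = []
    for _ in range(n_iter):
        proposal = [x + rng.uniform(-s, s) for x, s in zip(theta, step)]
        lp_new = log_post(proposal)
        # Accept with probability min(1, exp(lp_new - lp)); 1 - random() lies
        # in (0, 1], so the logarithm is always defined.
        if lp_new - lp > math.log(1.0 - rng.random()):
            theta, lp = proposal, lp_new
        chain.append(list(theta))
    burn = int(burn_in_frac * n_iter)
    return chain[burn:]


# Example target: a standard normal log density (up to a constant).
chain = metropolis(lambda th: -0.5 * th[0] ** 2, [0.0], [2.0], 20000, seed=1)
mean = sum(row[0] for row in chain) / len(chain)
var = sum((row[0] - mean) ** 2 for row in chain) / (len(chain) - 1)
print(len(chain), mean, var)
```

In practice `log_post` would be the sum of a log likelihood (conventional or residual) and a log prior, and the step widths would need tuning for each parameter.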
3.3 Uncertainty evaluation
Having the posterior probability distribution as a complete state of knowledge about the parameters of the reliability model defined in the likelihood function, it is then possible to know any statistical moment of this distribution. In particular, the first- and second-order moments of the posterior are the most useful ones in reliability analysis, describing the means of the parameters (expectation vector) and their associated uncertainty (covariance matrix), respectively. After integrating the posterior distribution, Equation 1, m samples are drawn from the joint distribution with q parameters, and they can properly be arranged as rows of an m × q matrix:
$$\boldsymbol{\theta} = \begin{bmatrix} \theta_{11} & \cdots & \theta_{q1} \\ \vdots & \ddots & \vdots \\ \theta_{1m} & \cdots & \theta_{qm} \end{bmatrix}_{m \times q}, \tag{5}$$
where each column represents a sample drawn from the marginal distribution of a specific parameter. A direct consequence of this arrangement is that the mean vector and the covariance matrix can be obtained:
$$\hat{\theta}_j = \frac{\sum_{l=1}^{m} \theta_{jl}}{m}, \quad j = 1, \ldots, q, \tag{6}$$
$$U_{\boldsymbol{\theta}\boldsymbol{\theta}} = \begin{bmatrix} \dfrac{\sum_{l=1}^{m} (\theta_{1l} - \hat{\theta}_1)^2}{m-1} & \cdots & \dfrac{\sum_{l=1}^{m} (\theta_{ql} - \hat{\theta}_q) \cdot (\theta_{1l} - \hat{\theta}_1)}{m-1} \\ \vdots & \ddots & \vdots \\ \dfrac{\sum_{l=1}^{m} (\theta_{1l} - \hat{\theta}_1) \cdot (\theta_{ql} - \hat{\theta}_q)}{m-1} & \cdots & \dfrac{\sum_{l=1}^{m} (\theta_{ql} - \hat{\theta}_q)^2}{m-1} \end{bmatrix}. \tag{7}$$
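Equations 6 and 7 translate directly into code once the posterior samples are stored row by row as in Equation 5. The sketch below uses plain Python lists; the function names and the toy samples are hypothetical, and the `correlation` helper is an addition made here to show how the correlation coefficients mentioned in the text follow from the covariance matrix.

```python
import math


def mean_vector(samples):
    """Equation 6: column means of the m x q sample matrix."""
    m = len(samples)
    q = len(samples[0])
    return [sum(row[j] for row in samples) / m for j in range(q)]


def covariance_matrix(samples):
    """Equation 7: sample covariance matrix with the m - 1 divisor."""
    m = len(samples)
    q = len(samples[0])
    mean = mean_vector(samples)
    return [
        [
            sum((row[a] - mean[a]) * (row[b] - mean[b]) for row in samples) / (m - 1)
            for b in range(q)
        ]
        for a in range(q)
    ]


def correlation(cov, a, b):
    """Correlation coefficient between parameters a and b from a covariance matrix."""
    return cov[a][b] / math.sqrt(cov[a][a] * cov[b][b])


toy_samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]  # made-up, perfectly correlated draws
cov = covariance_matrix(toy_samples)
print(mean_vector(toy_samples), cov, correlation(cov, 0, 1))
```

Applied to the conventional-approach covariance matrix of the extended Weibull model reported in Table 2 (variances 1.1 · 10^−2 and 2.3 · 10^−2, covariance −1.2 · 10^−2), `correlation` gives roughly −0.75 between 𝛼 and 𝛽.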
It is worth emphasizing that the square root of the main diagonal of the covariance matrix represents the standard uncertainty associated with the parameters, whereas the other elements are the covariances between the pairs of parameters, as established by the GUM framework.5 If the parameter covariance matrix, Equation 7, shows that there are highly correlated parameters, one could suspect that the model does not adequately represent the data set. Another important property that can be drawn from the parameter samples is the set of their coverage intervals, which can be obtained as follows:
$$\int_{\theta_i^{\min}}^{\theta_i^{\max}} p(\theta_i \mid \mathbf{t}) \cdot d\theta_i = p', \tag{8}$$
where 𝜃imin and 𝜃imax are the limits of the interval and p′ is the coverage probability. The GUM Supplement 1 suggests a procedure to evaluate this interval for symmetric and asymmetric density functions.6 On the other hand, these intervals do not contain any information about the covariances between the parameters. This information is crucial when the quality of the model is evaluated, because a significantly large covariance can reveal an overlap in the events that each parameter explains. In view of the fact that in parametric reliability analysis, the evaluation of coverage regions has not yet been addressed, the present paper systematically covers this topic. The evaluation of the coverage regions for each pair of parameters contained in the reliability model can be formulated as
$$\int_{\theta_i^{\min}}^{\theta_i^{\max}} \int_{\theta_j^{\min}}^{\theta_j^{\max}} p(\theta_i, \theta_j \mid \mathbf{t}) \cdot d\theta_i \cdot d\theta_j = p'. \tag{9}$$
Possolo39 proposes a procedure to evaluate the coverage regions with minimum area, which is recommended by GUM7 Supplement 2. Some further discussion about this issue can be found in GUM7 Supplement 2 and Draper and Guttman.40 Such coverage regions are helpful to address correlations between parameters, ie, the parametric correlation can easily be confirmed by the shape of the region. For instance, if there is a certain kind of dependence among the parameters, one could suspect an overlap between the impact of each parameter in the final model predictions. Regarding the predictions and goodness of fit, their associated uncertainties should also be assessed. Since the models involved in the reliability analysis are nonlinear and the posterior joint distribution of the parameters is known, the method based on the law of propagation of joint probability density function would be strongly recommended, as proposed in GUM6,7 Supplements 1 and 2, which deal with a numerical evaluation of the uncertainty by means of a Monte Carlo method. With regard to
the model predictions, it is possible to draw samples from the posterior joint distribution of the parameters, p(𝜽|t), and run Monte Carlo simulations so as to empirically build, for a given vector of times t, not only the reliability R(t, 𝜽), similar to the work conducted by Yin et al,3 but also the cumulative distribution, F(t, 𝜽), the hazard function, h(t, 𝜽), and the density function, f(t, 𝜽). This vector of times can be the same as that of the experimental data or could contain different values. Therefore, for each time, ti, the samples that belong to the empirical distribution of this model's realization at ti will be evaluated. They are represented by each column of the generic matrix, G:
$$G = \begin{bmatrix} g(t_1, \tilde{\boldsymbol{\theta}}_1) & \cdots & g(t_k, \tilde{\boldsymbol{\theta}}_1) \\ \vdots & \ddots & \vdots \\ g(t_1, \tilde{\boldsymbol{\theta}}_m) & \cdots & g(t_k, \tilde{\boldsymbol{\theta}}_m) \end{bmatrix}_{m \times k}, \tag{10}$$
where $\tilde{\boldsymbol{\theta}}_l$ is equivalent to $[\theta_{1l}, \cdots, \theta_{ql}]^T$, ie, the l-th sample of the parameter vector. The index k can be the same as n, the size of the experimental time to failure data, or the size of a chosen vector of times to failure. The mean and covariance matrix of the evaluated variable can be obtained using a procedure similar to the one expressed by Equations 6 and 7. Not only can the model predictions be evaluated by Monte Carlo simulations but also goodness of fit indices, such as R2 and the Akaike criterion. As a result, their empirical distributions will be obtained. For the adjusted R2, the distribution is obtained by repeated use of the expression41:
$$R_l^2 = 1 - \frac{\sum_{i=1}^{n} \left(R^{m}(t_i, \tilde{\boldsymbol{\theta}}_l) - R^{np}(t_i)\right)^2 / (n - q)}{\sum_{i=1}^{n} \left(R^{np}(t_i) - \bar{R}^{np}\right)^2 / (n - 1)}, \tag{11}$$
whereas for the Akaike criterion (AIC),27 the expression is given by
$$AIC_l = -2 \cdot \ln\left(L(\tilde{\boldsymbol{\theta}}_l)\right) + 2 \cdot k, \tag{12}$$
where, in Equation 12, k denotes the number of estimated parameters in the usual AIC notation (q in the notation of Equation 5), not the number of time points used in Equation 10.
The coverage intervals for both the model predictions and the goodness of fit are estimated following the procedure of the minimum interval proposed by GUM6 Supplement 1. Regarding the former, this interval will indicate how precise the model prediction is at each time ti: the wider this interval, the less useful the model becomes for representing the data set. In a similar manner, as regards the goodness of fit, this will indicate how precise the index is at evaluating the quality of the model.
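The Monte Carlo propagation described in this section can be sketched as follows. The code is a hedged illustration rather than the procedure of the GUM supplements verbatim: the function names are hypothetical, `shortest_coverage_interval` simply scans the sorted samples for the narrowest window holding a fraction p of them, and the toy exponential model and parameter draws are made up.

```python
import math


def propagate(model, times, samples):
    """Matrix G of Equation 10: one row per posterior sample, one column per time."""
    return [[model(t, theta) for t in times] for theta in samples]


def shortest_coverage_interval(values, p=0.95):
    """Narrowest interval between two sorted samples that covers a fraction p."""
    xs = sorted(values)
    m = len(xs)
    span = int(p * m + 0.5)  # index distance between the two endpoints
    if span >= m:
        return xs[0], xs[-1]
    start = min(range(m - span), key=lambda r: xs[r + span] - xs[r])
    return xs[start], xs[start + span]


def adjusted_r2(r_model, r_np, q):
    """Equation 11 for a single posterior sample; q is the number of parameters."""
    n = len(r_np)
    mean_np = sum(r_np) / n
    ss_res = sum((a - b) ** 2 for a, b in zip(r_model, r_np))
    ss_tot = sum((b - mean_np) ** 2 for b in r_np)
    return 1.0 - (ss_res / (n - q)) / (ss_tot / (n - 1))


# Toy usage: exponential reliability with made-up draws of the rate parameter.
toy_model = lambda t, theta: math.exp(-theta[0] * t)
toy_draws = [[0.4], [0.5], [0.6]]
G = propagate(toy_model, [1.0, 2.0, 3.0], toy_draws)
r_np = [0.6, 0.4, 0.2]  # made-up nonparametric reliability estimates
r2_draws = [adjusted_r2(row, r_np, q=1) for row in G]
print(shortest_coverage_interval([row[0] for row in G], p=0.95))
print(r2_draws)
```

Applying `shortest_coverage_interval` to each column of `G` gives the prediction interval at each time, and applying it to `r2_draws` gives the coverage interval of the index itself.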
4 CASE STUDIES
To exemplify the use of the proposed method and highlight different aspects that guide model selection, 2 case studies will be assessed here, namely: (1) The first case presents the main differences between the likelihood formulations,
conventional and residual, followed by the consequences of the use of parameter coverage regions as a model selection index. (2) The second case study focuses on the use of the coverage intervals of goodness of fit to sustain the decision of the choice of reliability model.
4.1 Case study 1
Data from the biaxial fatigue life of metal parts will be used to appraise the main differences between the conventional and residual Bayesian approaches, besides illustrating the use of coverage regions of the parameters as a tool to guide model selection. The data were taken from Peng and Yan19:
t/cycles = [125, 127, 135, 137, 185, 187, 190, 190, 195, 200, 212, 242, 245, 255, 283, 316, 327, 355, 373, 386, 456, 482, 552, 580, 700, 736, 745, 750, 804, 852, 884, 977, 1040, 1066, 1093, 1114, 1125, 1300, 1536, 1583, 2208, 2266, 2834, 3280, 4707, 5046]^T.
Two models will be evaluated in this study: (1) the extended Weibull model with three parameters proposed by Peng and Yan19:
$$R^{m}(t_i, \boldsymbol{\theta}) = \exp\left[-\alpha \cdot t_i^{\beta} \cdot \exp\left(-\frac{\lambda}{t_i}\right)\right], \tag{13}$$
$$\boldsymbol{\theta} = [\alpha, \beta, \lambda]^T, \tag{14}$$
which, according to the authors, fits the aforementioned data set well when compared with other models based on the Weibull distribution; and (2) the lognormal model, which has 1 parameter less:
$$R^{m}(t_i, \boldsymbol{\theta}) = 1 - 0.5 \cdot \left[1 + \operatorname{erf}\left(\frac{\log(t_i) - \mu}{\sqrt{2} \cdot \sigma}\right)\right], \tag{15}$$
$$\boldsymbol{\theta} = [\mu, \sigma]^T. \tag{16}$$
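For readers who want to evaluate the two candidate models numerically, Equations 13 and 15 can be written as the following Python functions. The function names are hypothetical, `log` is assumed to be the natural logarithm, and the parameter values in the loop are the conventional-approach mean estimates reported in Tables 2 and 3, used here only to show the calls.

```python
import math


def extended_weibull_reliability(t, alpha, beta, lam):
    """Equation 13: R(t) = exp(-alpha * t**beta * exp(-lam / t))."""
    return math.exp(-alpha * t ** beta * math.exp(-lam / t))


def lognormal_reliability(t, mu, sigma):
    """Equation 15: R(t) = 1 - 0.5 * (1 + erf((log(t) - mu) / (sqrt(2) * sigma)))."""
    z = (math.log(t) - mu) / (math.sqrt(2.0) * sigma)
    return 1.0 - 0.5 * (1.0 + math.erf(z))


# Mean estimates of the conventional approach, as reported in Tables 2 and 3.
for t in (125.0, 1000.0, 5046.0):
    r_ew = extended_weibull_reliability(t, 7.4e-2, 5.5e-1, 3.4e2)
    r_ln = lognormal_reliability(t, 6.3, 1.0)
    print(t, r_ew, r_ln)
```

Either function can be passed as the reliability model when building residuals against a nonparametric estimator, as in Equation 4.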
To prioritize the information contained in the time to failure data, this paper will use noninformative prior distributions for each of the models to be evaluated. For the conventional approach, they are given by19,42:
a. Extended Weibull:
$$p(\boldsymbol{\theta}) = \frac{1}{\alpha} \cdot \lambda \cdot \beta. \tag{17}$$
b. Lognormal:
$$p(\boldsymbol{\theta}) = \frac{1}{\sigma} \cdot \mu. \tag{18}$$
On the other hand, for the residual approach, the model parameters only influence the location parameter of the likelihood, Equation 3, unlike the nuisance parameter, 𝜙, which affects the scale parameter. The noninformative prior for the models will be defined by16,42:
$$p(\boldsymbol{\theta}) = \frac{1}{\phi}. \tag{19}$$
TABLE 2 Case 1—parameter estimates and covariance matrix for the extended Weibull model for the conventional and residual approaches Estimation approach
𝜽̂
U𝜃𝜃
Conventional
⎡ 7.4 · 10−2 ⎤ ⎢ 5.5 · 10−1 ⎥ ⎥ ⎢ ⎣ 3.4 · 102 ⎦
⎡ 1.1 · 10−2 −1.2 · 10−2 6.3 · 100 ⎤ ⎢ −1.2 · 10−2 2.3 · 10−2 −1.0 · 101 ⎥ ⎢ ⎥ ⎣ 6.3 · 100 −1.0 · 101 9.8 · 103 ⎦
Residuala
⎡ 2.9 · 10−2 ⎤ ⎢ −1 ⎥ ⎢ 6.0 · 10 ⎥ ⎢ 2.1 · 102 ⎥ ⎢ −3 ⎥ ⎣ 1.0 · 10 ⎦
⎡ 2.2 · 10−4 ⎢ −4 ⎢ −9.1 · 10 ⎢ 4.3 · 10−1 ⎢ −7 ⎣ 4.8 · 10
a The
Conventional Residuala
𝜽̂ [
4.3 · 10−1 −1.9 · 100 1.0 · 103 7.1 · 10−4
4.8 · 10−7 ⎤ ⎥ −1.5 · 10−6 ⎥ 7.1 · 10−4 ⎥ ⎥ 4.9 · 10−8 ⎦
additional parameter is the nuisance parameter, 𝜙, present in the residual likelihood, Equation 3.
TABLE 3 Case 1—parameter estimates and covariance matrix for the lognormal model for the conventional and residual approaches Estimation approach
−9.1 · 10−4 4.2 · 10−3 −1.9 · 100 −1.5 · 10−6
] 6.3 1.0 ⎡ ⎤ 6.3 ⎢ ⎥ 1.1 ⎢ ⎥ −3 ⎣ 1.1 · 10 ⎦
TABLE 4 Case 1—coverage intervals of the parameters for the conventional and residual approaches Parameter
U𝜃𝜃 [ 2.3 · 10−2 1.1 · 10−4 ⎡ 3.3 · 10−4 ⎢ 9.1 · 10−6 ⎢ ⎣ 9.1 · 10−9
] 1.1 · 10−4 1.2 · 10−2 9.1 · 10−6 9.1 · 10−9 ⎤ 8.4 · 10−4 1.8 · 10−7 ⎥ ⎥ 1.8 · 10−7 6.7 · 10−8 ⎦
a The additional parameter is the nuisance parameter, 𝜙, present in the residual
likelihood, Equation 3.
Parameter    Approach        Coverage interval
Extended Weibull model
𝛼            Conventional    [2.8 · 10⁻⁴, 6.0 · 10⁻¹]
𝛼            Residual        [7.8 · 10⁻³, 5.9 · 10⁻²]
𝛽            Conventional    [2.6 · 10⁻¹, 8.3 · 10⁻¹]
𝛽            Residual        [4.6 · 10⁻¹, 7.1 · 10⁻¹]
𝜆            Conventional    [1.5 · 10², 5.4 · 10²]
𝜆            Residual        [1.6 · 10², 2.7 · 10²]
Lognormal model
𝜇            Conventional    [6.0, 6.6]
𝜇            Residual        [6.2, 6.3]
𝜎            Conventional    [8.1 · 10⁻¹, 1.2]
𝜎            Residual        [1.0, 1.1]

Equation 1 was integrated by Markov chain Monte Carlo (MCMC) with the Metropolis algorithm,43 using a uniform proposal distribution, 2 000 000 iterations, and a 10% burn-in. The means of the estimated parameters and their covariance matrices are presented in Tables 2 and 3 for the extended Weibull and lognormal models, respectively. These results show that, although the parameter estimates have the same order of magnitude, the variances obtained by the residual approach are up to 100 times smaller than those obtained by the conventional approach. Furthermore, the coverage intervals of the parameters evaluated with the former approach are contained in the intervals evaluated with the latter, as presented in Table 4. It can be concluded that the parameters estimated from the residual likelihood are statistically equivalent to those obtained by the conventional approach, but the residual likelihood yields estimates with smaller uncertainties. An explanation for this is that the residual likelihood is based on the difference between a nonparametric reliability estimator and the model predictions, which tends to bring the predicted reliabilities closer to this estimator. This is confirmed by the R² coefficients presented in Table 5, which are systematically higher for the residual approach than for the conventional one, in both evaluated models. Figure 2 presents the mean value of the reliability and, as expected, the curves for both approaches are fairly close,
TABLE 5 Case 1—R² coefficient estimates and their 95% probability coverage intervals
Estimation approach    Model               R² estimate    Coverage interval
Conventional           Extended Weibull    0.944          [0.862, 0.988]
Conventional           Lognormal           0.949          [0.867, 0.987]
Residual               Extended Weibull    0.987          [0.986, 0.988]
Residual               Lognormal           0.986          [0.985, 0.987]
where the major difference occurs for failure times shorter than 1000 minutes, for which the residual approach deviates slightly to the left in order to get closer to the nonparametric estimator, R_np. However, this information is not sufficient to select the model with the best prediction capability, since it does not account for the uncertainties in the predictions. Furthermore, the nuisance parameter, 𝜙, which is present in the residual approach (Tables 2 and 3), has the following 95% coverage intervals, evaluated by the minimum area method39: [6.2 · 10⁻⁴, 1.4 · 10⁻³] for the extended Weibull model and [7.1 · 10⁻⁴, 1.7 · 10⁻³] for the lognormal model. These intervals clearly overlap, which shows that the residuals of the two models have the same degree of dispersion, thus expressing the same prediction capability to
FIGURE 2 Case 1—reliability predictions for the lognormal and extended Weibull models, presenting their means for both conventional and residual approaches (panels A and B)
FIGURE 3 Case 1—reliability predictions for the lognormal and extended Weibull models, presenting their means and 95% probability coverage interval for both conventional and residual approaches (panels A and B)
a certain degree. It is worth mentioning that this parameter is a direct consequence of the likelihood formulation, Equation 3. In addition, both models present goodness of fit indices with overlapping intervals, reinforcing the conclusion that they have similar prediction capabilities. Another way to compare the models is to assess the uncertainty of their predictions, presented as 95% probability coverage intervals in Figure 3. It is noteworthy that the intervals of the conventional approach cover all the nonparametric reliability estimates, while in the residual approach not all estimates are covered, because the intervals obtained are narrower. This difference in the uncertainty evaluation is mainly due to the parameters' variances: a more precise parameter estimate yields, in general, a narrower prediction interval. The conventional approach can therefore be interpreted as more conservative than the residual one. Moreover, the nonparametric estimates that fall outside the prediction interval can be used to investigate outliers in the field data. In other words, these outliers carry information about the data that the analyst should investigate in order to identify new aspects that can influence the model. This could represent, for example, a failure mode that is not well defined or failures that belong to another mode. Regarding the validity of the normality hypothesis for the estimation residuals in the residual approach, a Kolmogorov-Smirnov test does not reject this hypothesis, with the following P values: 0.5 (extended Weibull) and 0.8 (lognormal). The residuals were evaluated at the mean values of the model's prediction. This result supports the statistical analyses derived from this approach. Consequently, if one model had to be chosen to best describe these data, both the extended Weibull and lognormal models would appear to fit them equally well, since the reliability predictions (see Figure 2) are almost the same and the 95% probability coverage intervals for the R² coefficient overlap in each approach. As a result, on the
FIGURE 4 Case 1—95% probability coverage regions for the extended Weibull and lognormal models, obtained through the residual approaches (panels A to D)
basis of this feature, there is no clear difference between the extended Weibull and lognormal models. It is noteworthy that, if only the best estimate of R² is used as the goodness of fit index, cf Table 5, two different conclusions are drawn: (1) the lognormal model is chosen as the best adjusted model when the conventional likelihood is used, and (2) the extended Weibull model is chosen as the best adjusted model when the residual likelihood is used. However, the R² values are too close to support this evaluation without considering the associated uncertainties. For instance, when the 95% probability coverage intervals for this index are considered, cf Table 5, their overlap is clear. Therefore, the R² index is not able to distinguish these models under either likelihood formulation. Further information about the models is provided by the 95% probability coverage regions of the parameters, represented in Figure 4 for both models. Since the residual approach yields smaller variances, the coverage regions were evaluated only for it. On the one hand, it is clear that the parameters of the extended Weibull model are highly correlated, and the normal hypothesis does not hold for them,
considering that the regions obtained from the empirical joint distribution differ from an ellipse. On the other hand, the coverage region of the lognormal model parameters indicates a low covariance, since the ellipse is not inclined relative to the axes. This can be confirmed by the parameters' covariance matrix, presented in Table 3, where the covariance is 100 times smaller than the parameter variances. Therefore, if both the adherence to the experimental data and the characteristics of the coverage regions of the parameters are considered, the lognormal model best describes this system: both evaluated models have the same prediction capability, but the lognormal model has better characteristics with respect to parameter uncertainty. It is worth mentioning that Peng and Yan19 chose the extended Weibull model over other Weibull-based model structures, using the best estimates of goodness of fit as the comparison index. However, as shown here by a more thorough analysis, considering the uncertainty of the parameters and predictions helps to explore the different model characteristics and, as a result, better supports the model selection decision.
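The overlap argument used throughout this case study can be made explicit with a small check. The sketch below is illustrative only: the helper name `intervals_overlap` is hypothetical and not part of the original work, and the numbers are the 95% coverage intervals reported in Table 5 (R²) and in the text (nuisance parameter 𝜙).

```python
from typing import Tuple

Interval = Tuple[float, float]


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """Return True when two closed intervals [lower, upper] share at least one point."""
    return max(a[0], b[0]) <= min(a[1], b[1])


# 95% coverage intervals of R^2, copied from Table 5 (Case 1).
r2_intervals = {
    "conventional": {"extended Weibull": (0.862, 0.988), "lognormal": (0.867, 0.987)},
    "residual": {"extended Weibull": (0.986, 0.988), "lognormal": (0.985, 0.987)},
}

# 95% coverage intervals of the nuisance parameter phi, as quoted in the text.
phi_intervals = {"extended Weibull": (6.2e-4, 1.4e-3), "lognormal": (7.1e-4, 1.7e-3)}

r2_overlap = {
    approach: intervals_overlap(by_model["extended Weibull"], by_model["lognormal"])
    for approach, by_model in r2_intervals.items()
}
phi_overlap = intervals_overlap(phi_intervals["extended Weibull"], phi_intervals["lognormal"])

for approach, overlap in r2_overlap.items():
    print(f"R^2 intervals overlap ({approach}): {overlap}")
print(f"phi intervals overlap: {phi_overlap}")
```

When every comparison reports an overlap, the index alone cannot separate the models, which is why the coverage regions of the parameters are brought in as an additional criterion.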
4.2 Case study 2
This case study focuses on the goodness of fit indices and their coverage intervals. Failure data from a sensor in an automotive production line are used:

t/min = [531, 567, 602, 646, 665, 665, 751, 794, 798, 799, 822, 832, 848, 857, 952, 1028, 1076, 1085, 1094, 1139, 1203, 1262, 1309, 1336, 1340, 1451, 1451, 1472, 1477, 1489, 1710, 1753, 1950, 1967, 1991, 2061, 2095, 2130, 2131, 2291, 2315, 2528, 2543, 2656, 2701, 2731, 2779, 2961]ᵀ.

Three models with the same number of parameters were used in this study:

1. Weibull:

$$R_m(t_i, \boldsymbol{\theta}) = \exp\left[-\left(\frac{t_i}{\eta}\right)^{\beta}\right], \tag{20}$$

$$\boldsymbol{\theta} = [\beta, \eta]^T; \tag{21}$$

2. lognormal, expressed by Equation 15; and

3. normal:

$$R_m(t_i, \boldsymbol{\theta}) = 1 - \frac{1}{\sigma \sqrt{2 \pi}} \int_{-\infty}^{t_i} \exp\left(-\frac{(t - \mu)^2}{2 \sigma^2}\right) dt, \tag{22}$$

$$\boldsymbol{\theta} = [\mu, \sigma]. \tag{23}$$

The prior distributions are assumed noninformative, as in the first case study. For the conventional approach, they are given by16,20,42:

a. Weibull model:

$$p(\boldsymbol{\theta}) = \frac{1}{\beta} \cdot \frac{1}{\eta}. \tag{24}$$

b. Lognormal, the same as in Equation 18, and

c. Normal:

$$p(\boldsymbol{\theta}) = \frac{1}{\sigma}. \tag{25}$$

The residual approach uses the same prior presented in the previous case study, Equation 19. Equation 1 was again integrated by the Metropolis algorithm, using a uniform proposal distribution, 2 000 000 iterations, and a 10% burn-in. The parameter estimates and covariance matrices are presented in Tables 6, 7, and 8. As in the first case study, the parameter covariance matrices of the 2 approaches differ: the one evaluated by the residual approach exhibits systematically lower variances. Furthermore, Table 9 shows that all the coverage intervals of the parameters evaluated with the residual likelihood are included in the ones evaluated by the conventional approach. Although the parameters estimated by both approaches are statistically equivalent, the uncertainty of the parameters obtained by the residual approach is smaller, as would be expected.

TABLE 6 Case 2—parameter estimates and covariance matrix for the Weibull model, for the conventional and residual approaches
Estimation approach    𝜽̂                             U𝜃𝜃
Conventional           [2.3, 1698.9]ᵀ                [[6.7 · 10⁻², 9.0 · 10⁰], [9.0 · 10⁰, 1.3 · 10⁴]]
Residual^a             [2.0, 1647.1, 1.4 · 10⁻³]ᵀ     [[3.0 · 10⁻³, −2.7 · 10⁻¹, −5.5 · 10⁻⁸], [−2.7 · 10⁻¹, 3.1 · 10², 4.8 · 10⁻⁵], [−5.5 · 10⁻⁸, 4.8 · 10⁻⁵, 9.6 · 10⁻⁸]]
a The additional parameter is the nuisance parameter, 𝜙, present in the residual likelihood, Equation 3.

TABLE 7 Case 2—parameter estimates and covariance matrix for the lognormal model, for the conventional and residual approaches
Estimation approach    𝜽̂                             U𝜃𝜃
Conventional           [7.2, 0.5]ᵀ                   [[5.4 · 10⁻³, 2.1 · 10⁻⁵], [2.1 · 10⁻⁵, 2.8 · 10⁻³]]
Residual^a             [7.2, 0.6, 1.1 · 10⁻³]ᵀ        [[8.2 · 10⁻⁵, −4.7 · 10⁻⁶, −2.4 · 10⁻⁸], [−4.7 · 10⁻⁶, 1.9 · 10⁻⁴, 4.2 · 10⁻⁸], [−2.4 · 10⁻⁸, 4.2 · 10⁻⁸, 5.7 · 10⁻⁸]]
a The additional parameter is the nuisance parameter, 𝜙, present in the residual likelihood, Equation 3.

TABLE 8 Case 2—parameter estimates and covariance matrix for the normal model, for the conventional and residual approaches
Estimation approach    𝜽̂                             U𝜃𝜃
Conventional           [1500.0, 721.4]ᵀ              [[1.1 · 10⁴, 1.2 · 10²], [1.2 · 10², 5.9 · 10³]]
Residual^a             [1404.8, 785.7, 2.4 · 10⁻³]ᵀ   [[3.6 · 10², 1.5 · 10², 9.3 · 10⁻⁵], [1.5 · 10², 9.6 · 10², 2.6 · 10⁻⁴], [9.3 · 10⁻⁵, 2.6 · 10⁻⁴, 2.9 · 10⁻⁷]]
a The additional parameter is the nuisance parameter, 𝜙, present in the residual likelihood, Equation 3.

TABLE 9 Case 2—coverage intervals of the parameters for the conventional and residual approaches
Parameter    Approach        Coverage interval
Weibull model
𝛽            Conventional    [1.8, 2.8]
𝛽            Residual        [1.8, 2.0]
𝜂            Conventional    [1.5 · 10³, 1.9 · 10³]
𝜂            Residual        [1.6 · 10³, 1.7 · 10³]
Lognormal model
𝜇            Conventional    [7.0, 7.3]
𝜇            Residual        [7.16, 7.19]
𝜎            Conventional    [4.0 · 10⁻¹, 6.1 · 10⁻¹]
𝜎            Residual        [5.4 · 10⁻¹, 6.0 · 10⁻¹]
Normal model
𝜇            Conventional    [1.2 · 10³, 1.7 · 10³]
𝜇            Residual        [1.37 · 10³, 1.44 · 10³]
𝜎            Conventional    [5.8 · 10², 8.7 · 10²]
𝜎            Residual        [7.2 · 10², 8.5 · 10²]

Figure 5 depicts the reliability prediction, from which it can be concluded that the residual approach drags the reliability prediction of all models closer to the nonparametric estimate, R_np. However, the deviations from the conventional approach outcomes are slight. The same conclusion can be drawn from the R² coefficient, cf Table 10, whose estimate is systematically higher for the residual approach for all models. This analysis is consistent with the one performed in the first case study.
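The sampling step used in both case studies can be illustrated with a minimal random-walk Metropolis sketch for the conventional Weibull model. Several choices here are assumptions rather than details taken from the paper: the uniform distribution is interpreted as a symmetric uniform random-walk proposal, the conventional likelihood is taken as the product of Weibull densities for complete (uncensored) data, the prior is Equation 24, the chain is shortened to 20 000 iterations so that the example runs quickly, and the starting point and step half-widths are arbitrary. The function names are hypothetical.

```python
import math
import random

# Failure times in minutes, copied from the data vector of Case study 2.
FAILURE_TIMES = [
    531, 567, 602, 646, 665, 665, 751, 794, 798, 799, 822, 832,
    848, 857, 952, 1028, 1076, 1085, 1094, 1139, 1203, 1262, 1309, 1336,
    1340, 1451, 1451, 1472, 1477, 1489, 1710, 1753, 1950, 1967, 1991, 2061,
    2095, 2130, 2131, 2291, 2315, 2528, 2543, 2656, 2701, 2731, 2779, 2961,
]


def log_posterior(beta, eta):
    """Unnormalized log posterior: Weibull log-likelihood plus log of the prior 1/(beta*eta)."""
    if beta <= 0.0 or eta <= 0.0:
        return -math.inf  # outside the parameter support
    log_likelihood = sum(
        math.log(beta / eta) + (beta - 1.0) * math.log(t / eta) - (t / eta) ** beta
        for t in FAILURE_TIMES
    )
    return log_likelihood - math.log(beta) - math.log(eta)


def metropolis(n_iter, start, half_widths, burn_in_fraction=0.10, seed=0):
    """Random-walk Metropolis with a symmetric uniform proposal; returns post-burn-in samples."""
    rng = random.Random(seed)
    current = tuple(start)
    current_lp = log_posterior(*current)
    chain = []
    for _ in range(n_iter):
        candidate = tuple(c + rng.uniform(-w, w) for c, w in zip(current, half_widths))
        candidate_lp = log_posterior(*candidate)
        # Accept with probability min(1, posterior ratio); the proposal is symmetric.
        if rng.random() < math.exp(min(0.0, candidate_lp - current_lp)):
            current, current_lp = candidate, candidate_lp
        chain.append(current)
    # Discard the first burn_in_fraction of the chain (10% by default).
    return chain[int(burn_in_fraction * n_iter):]


samples = metropolis(n_iter=20_000, start=(2.0, 1500.0), half_widths=(0.3, 150.0))
beta_mean = sum(s[0] for s in samples) / len(samples)
eta_mean = sum(s[1] for s in samples) / len(samples)
print(f"posterior means: beta = {beta_mean:.2f}, eta = {eta_mean:.1f}")
```

The retained samples can then be summarized by their mean and covariance matrix, in the manner of Table 6; a chain this short is only meant to show the mechanics and should not be expected to reproduce the reported values exactly.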
As in the previous case, the Kolmogorov-Smirnov test did not reject the normality hypothesis assumed for the residuals in the proposed likelihood, with the following P values: 0.9 (Weibull), 1.0 (lognormal), and 0.35 (normal). Additionally, regarding model comparison, the 95% coverage interval of the residual variance, 𝜙, is used as a comparison index: (1) the lognormal model gives [6.8 · 10⁻⁴, 1.6 · 10⁻³]; (2) the normal model gives [1.5 · 10⁻³, 3.5 · 10⁻³]; and (3) the Weibull model gives [8.9 · 10⁻⁴, 2.0 · 10⁻³]. All these intervals overlap to some degree; thus, the residuals of all models have similar dispersion around the mean value. Additionally, Table 10 indicates that all the coverage intervals for the R² values evaluated in the conventional approach overlap. However, this does not occur in the
FIGURE 5 Case 2—reliability predictions for the Weibull, lognormal, and normal models, presenting their means for both the conventional and residual approach (panels A, B, and C)

TABLE 10 Case 2—goodness of fit estimates and 95% probability coverage intervals
Estimation approach    Model        R² estimate    Coverage interval    AIC estimate    Coverage interval
Conventional           Weibull      0.94           [0.85, 0.98]         766             [764, 770]
Conventional           Lognormal    0.95           [0.89, 0.99]         764             [762, 767]
Conventional           Normal       0.92           [0.82, 0.97]         771             [769, 776]
Residual               Weibull      0.983          [0.981, 0.984]       −175            [−178, −171]
Residual               Lognormal    0.987          [0.985, 0.987]       −188            [−191, −183]
Residual               Normal       0.970          [0.968, 0.972]       −149            [−152, −144]
FIGURE 6 Case 2—reliability predictions for the Weibull, lognormal, and normal models, presenting their means and 95% probability coverage interval for both the conventional and residual approach (panels A, B, and C)
residual approach, in which the lognormal model has the highest R² and its coverage interval does not overlap with those of the other models. Furthermore, it has the lowest AIC in both approaches, without overlap in the residual one. Therefore, the lognormal model demonstrated the best prediction capability when analyzed by the residual approach; in the conventional one, however, it could not be distinguished from the Weibull model. Figure 6 presents the differences in the predictive capabilities of these models and their 95% coverage intervals.
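A coverage interval for a goodness of fit index can be obtained by evaluating the index once per posterior sample and summarizing the resulting values. The sketch below shows this idea for R² under several loudly stated assumptions: Bernard's median-rank approximation stands in for the nonparametric estimator R_np (the estimator actually used is defined earlier in the paper), independent Gaussian draws around the residual-approach estimates of Table 6 stand in for real MCMC samples (so parameter correlation is ignored), and the shortest interval containing 95% of the values is used as a simple stand-in for the minimum area method. All function names are hypothetical.

```python
import math
import random

# Failure times in minutes, copied from the data vector of Case study 2.
FAILURE_TIMES = sorted([
    531, 567, 602, 646, 665, 665, 751, 794, 798, 799, 822, 832,
    848, 857, 952, 1028, 1076, 1085, 1094, 1139, 1203, 1262, 1309, 1336,
    1340, 1451, 1451, 1472, 1477, 1489, 1710, 1753, 1950, 1967, 1991, 2061,
    2095, 2130, 2131, 2291, 2315, 2528, 2543, 2656, 2701, 2731, 2779, 2961,
])


def median_rank_reliability(n):
    """Assumed estimator: Bernard's median-rank reliability at the n ordered failure times."""
    return [1.0 - (i - 0.3) / (n + 0.4) for i in range(1, n + 1)]


def weibull_reliability(t, beta, eta):
    """Weibull reliability of Equation 20."""
    return math.exp(-((t / eta) ** beta))


def r_squared(observed, predicted):
    """Coefficient of determination of the predictions against the observed values."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def shortest_interval(values, coverage=0.95):
    """Shortest interval that contains at least `coverage` of the sampled values."""
    ordered = sorted(values)
    n = len(ordered)
    k = math.ceil(coverage * n)  # number of points the window must contain
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


rng = random.Random(1)
r_np = median_rank_reliability(len(FAILURE_TIMES))
r2_samples = []
for _ in range(5000):
    # Stand-in posterior draws: means and variances from Table 6 (residual approach).
    beta = rng.gauss(2.0, math.sqrt(3.0e-3))
    eta = rng.gauss(1647.1, math.sqrt(3.1e2))
    predicted = [weibull_reliability(t, beta, eta) for t in FAILURE_TIMES]
    r2_samples.append(r_squared(r_np, predicted))

lower, upper = shortest_interval(r2_samples)
print(f"R^2 95% coverage interval: [{lower:.3f}, {upper:.3f}]")
```

The same pattern applies to the AIC or to the predicted reliability at a given time: compute the quantity for every retained sample, then report its coverage interval alongside the point estimate.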
5 CONCLUSION
This paper has proposed a novel Bayesian formulation to estimate the parameters of reliability models, based on the residuals between a nonparametric reliability estimator and the predictions of the reliability model to be adjusted. This formulation was compared with the one commonly used in reliability modeling and the results demonstrate that the proposed approach yields the predictions closest to the failure data. In addition, the proposed formulation provides smaller coverage intervals for the parameters, which affects the uncertainties associated with model predictions favorably.
Furthermore, a key strength of the present work is its examination of the benefits of the uncertainty evaluation in the model selection procedure within the Bayesian formulations. It was demonstrated how the coverage regions of the parameters can provide information about the correlations of the adjusted parameters and how this can help to avoid parameter overfitting through a better selection of models. This analysis also enables the systematic evaluation of the quality of the reliability models, since the uncertainties associated with the model predictions and the goodness of fit indices indicate different capabilities of the assessed models in describing a certain failure behavior. Two case studies were assessed to illustrate the differences between the 2 Bayesian formulations, as well as the advantages of the proposed method for model selection. In both case studies, the proposed approach systematically provided better goodness of fit estimates than the conventional approach for all the evaluated models, and it enabled additional analyses of the data, such as the detection of outliers. The uncertainty evaluation of the performance indices associated with the proposed approach was a decisive factor
in choosing the better model, whereas for the conventional approach this task was not conclusive. Furthermore, it can be concluded that even when models present similar prediction capability, e.g., R² and AIC with overlapping intervals, the parameter correlations (coverage regions) are a key element in the decision-making process about the evaluated models. This reinforces the point that the different statistical aspects of the models to be adjusted, and their associated uncertainties, should be assessed in order to accomplish a more thorough reliability analysis. The present paper has attempted to cover this topic in a fairly systematic way.
ACKNOWLEDGEMENTS
The authors thank the Brazilian Sports Ministry and the National Council for Scientific and Technological Development for financial support.
REFERENCES
1. Louit DM, Pascual R, Jardine KS. A practical procedure for the selection of time-to-failure models based on the assessment of trends in maintenance data. Reliab Eng Syst Saf. 2009;94(10):1618-1628.
2. Bendell T. An overview of collection, analysis, and application of reliability data in the process industries. IEEE Trans Reliab. 1988;37(2):132-137.
3. Yin L, Smith MJ, Trivedi KS. Uncertainty analysis in reliability modeling. In: Annual Reliability and Maintainability Symposium. 2001 Proceedings. International Symposium on Product Quality and Integrity. Philadelphia, PA, USA: IEEE; 2001:229-234.
4. Settanni E, Newnes LB, Thenent NE, Bumblauskas D, Parry G, Goh YM. A case study in estimating avionics availability from field reliability data. Qual Reliab Eng Int. 2016;32(4):1553-1580.
5. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP, and OIML. Evaluation of measurement data—Guide to the expression of uncertainty in measurement. Joint Committee for Guides in Metrology - JCGM 100:2008; 2008.
6. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP, and OIML. Evaluation of measurement data—supplement 1 to the “Guide to the expression of uncertainty in measurement”—propagation of distributions using a Monte Carlo method. Joint Committee for Guides in Metrology - JCGM 101:2008; 2008.
7. BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP and OIML. 2011 Supplement 2 to the ‘Guide to the Expression of Uncertainty in Measurement’—Extension to any number of output quantities, JCGM 102:2011 (BIPM).
8. Lawless JF. Statistical Models and Methods for Lifetime Data, 2nd ed. Hoboken, NJ: Wiley; 2003.
9. Elmahdy EE, Aboutahoun AW. A new approach for parameter estimation of finite Weibull mixture distributions for reliability modeling. Appl Math Modell. 2013;37(4):1800-1810.
10. Prabhakar Murthy DN, Bulmer M, Eccleston JA. Weibull model selection for reliability modelling. Reliab Eng Syst Saf. 2004;86(3):257-267.
11. Jukić D, Benšić M, Scitovski R. On the existence of the nonlinear weighted least squares estimate for a three-parameter Weibull distribution. Comput Stat Data Anal. 2008;52(9):4502-4511.
12. Li H, Zuo H, Liu R, Liu J, Jing C. Evaluation of aileron actuator reliability with censored data. Chin J Aeronaut. 2015;28(4):1087-1103.
13. Jiang R, Murthy DNP. Reliability modeling involving two Weibull distributions. Reliab Eng Syst Saf. 1995;47(3):187-198.
14. Zhang LF, Xie M, Tang LC. A study of two estimation approaches for parameters of Weibull distribution based on WPP. Reliab Eng Syst Saf. 2007;92(3):360-368.
15. Zhang T, Dwight R. Choosing an optimal model for failure data analysis by graphical approach. Reliab Eng Syst Saf. 2013;115:111-123.
16. Migon HS, Gamerman D. Statistical Inference: An Integrated Approach. New York: Hodder Arnold; 1999.
17. Touw AE. Bayesian estimation of mixed Weibull distributions. Reliab Eng Syst Saf. 2009;94(2):463-473.
18. Gupta A, Mukherjee B, Upadhyay SK. Weibull extension model: a Bayes study using Markov chain Monte Carlo simulation. Reliab Eng Syst Saf. 2008;93(10):1434-1443.
19. Peng X, Yan Z. Estimation and application for a new extended Weibull distribution. Reliab Eng Syst Saf. 2014;121:34-42.
20. Guo J, Monas L, Gill E. Statistical analysis and modelling of small satellite reliability. Acta Astronaut. 2014;98:97-110.
21. Beck JL, Au S-K. Bayesian updating of structural models and reliability using Markov chain Monte Carlo simulation. J Eng Mech. 2002;128(4):380-391.
22. Peng W, Huang H-Z, Li Y, Zuo MJ, Xie M. Life cycle reliability assessment of new products—a Bayesian model updating approach. Reliab Eng Syst Saf. 2013;112:109-119.
23. Dubos GF, Castet J-F, Saleh JH. Statistical reliability analysis of satellites by mass category: does spacecraft size matter? Acta Astronaut. 2010;67(5-6):584-595.
24. Castet J-F, Saleh JH. Single versus mixture Weibull distributions for nonparametric satellite reliability. Reliab Eng Syst Saf. 2010;95(3):295-300.
25. Castet J-F, Saleh JH. Satellite and satellite subsystems reliability: statistical data analysis and modeling. Reliab Eng Syst Saf. 2009;94(11):1718-1728.
26. Upadhyay SK, Vasishta N, Smith AFM. Bayes inference in life testing and reliability via Markov chain Monte Carlo simulation. Sankhyā: The Indian J Stat. 2000;62(2):203-222.
27. Akaike H. A new look at the statistical model identification. IEEE Trans Autom Control. 1974;19(6):716-723.
28. Al-Garni AZ, Jamal A. Artificial neural network application of modeling failure rate for Boeing 737 tires. Qual Reliab Eng Int. 2011;27(2):209-219.
29. Khoshgoftaar TM, Woodcock TG. Software reliability model selection: a case study. Proceedings. 1991 International Symposium on Software Reliability Engineering, Vol. 8. Austin, TX: IEEE Comput. Soc. Press; 1991:183-191.
30. Iesmantas T, Alzbutas R. Bayesian reliability of gas network under varying incident registration criteria. Qual Reliab Eng Int. 2016;32(5):1903-1912.
31. Méndez-González LC, Rodríguez-Picón LA, Valles-Rosales DJ, Romero-López R, Quezada-Carreón AE. Reliability analysis for electronic devices using beta-Weibull distribution. Qual Reliab Eng Int. 2017;33(8):2521-2530.
32. Evans RA. Stupid statistics. IEEE Trans Reliab. 1999;48(2):105-105.
33. Guo H, Watson S, Tavner P, Xiang J. Reliability analysis for wind turbines with incomplete failure data collected from after the date of initial installation. Reliab Eng Syst Saf. 2009;94(6):1057-1063.
34. Kaplan EL, Meier P. Nonparametric estimation from incomplete observations. J Am Stat Assoc. 1958;53(282):457-481.
35. Fothergill JC. Estimating the cumulative probability of failure data points to be plotted on Weibull and other probability paper. IEEE Trans Electr Insul. 1990;25(3):489-492.
36. Nelson W. Theory and applications of hazard plotting for censored failure data. Technometrics. 1972;14(4):945-966.
37. Aalen O. Nonparametric inference for a family of counting processes. Ann Stat. 1978;6(4):701-726.
38. Robert CP, Kamary K. Reflecting about selecting noninformative priors. J Appl Comput Math. 2014;03(05):15.
39. Possolo A. Copulas for uncertainty analysis. Metrologia. 2010;47(3):262-271.
40. Draper NR, Guttman I. Confidence intervals versus regions. J R Stat Soc. 1995;44(3):399-403.
41. Montgomery DC, Runger GC. Applied Statistics and Probability for Engineers. 3rd ed. United States of America: John Wiley & Sons, Inc.; 2003.
42. Yang R, Berger JO. A catalog of noninformative priors. West Lafayette: Purdue University; 1998.
43. Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E. Equation of state calculations by fast computing machines. J Chem Phys. 1953;21(6):1087-1092.

Daniel D. Santana currently holds a position as assistant professor within the Department of Chemical Engineering of the Federal University of Bahia. He received the B.S. degree in Chemical Engineering (2011) and the M.Sc. degree in Industrial Engineering (2014), both from the Federal University of Bahia. His research interests include statistical inference and model predictive control.

Celso L. S. Figueirôa Filho currently works as a consultant in human reliability and industrial maintenance management and holds a position as lecturer of the Catholic University of Salvador. He received the B.S. degree in Mechanical Engineering (1991) from the Federal University of Minas Gerais and the M.Sc. degree in Production Engineering (1999) from the Federal University of Bahia. His research interests include human reliability and maintenance.

Isabel Sartori currently holds a position as Adjunct Professor within the School of Management at the Federal University of Bahia. She received the B.S. degree in Chemical Engineering (2007) and a Ph.D. in Industrial Engineering (2012), both from the Federal University of Bahia. Her research interests include reliability modelling and risk assessment.

Márcio A.F. Martins currently holds a position as Adjunct Professor within the Department of Chemical Engineering of the Federal University of Bahia. He received the B.S. degree in Chemical Engineering (2008) and the M.Sc. degree in Industrial Engineering (2010), both from the Federal University of Bahia, and a Ph.D. in Chemical Engineering (2014) from the University of São Paulo. His research interests include statistical inference, robust model predictive control and real-time optimization.
How to cite this article: Santana DD, Figueirôa Filho CLS, Sartori I, Martins MAF. A novel Bayesian approach to reliability modeling: The benefits of uncertainty evaluation in the model selection procedure. Qual Reliab Engng Int. 2018;1–15. https://doi.org/10.1002/qre.2312