© ICIC International 2015 ISSN 1881-803X

ICIC Express Letters Volume 9, Number 3(tentative), March 2015

pp. 1–SS16-04

INTEGRATING INVERSE PROBLEM TO WEIBULL REGRESSION FOR ROBUST DESIGN MODELING AND OPTIMIZATION

Le Thi Thuy Quyen, Tuan-Ho Le, Soon-Gun Seo and Sangmun Shin∗

Department of Industrial and Management Systems Engineering
Dong-A University
37 Nakdong-Daero 550beon-gil, Saha-gu, Busan 604-714, Republic of Korea
{ lttquyen88; letuanhoqnu }@gmail.com; ∗Corresponding author: [email protected]

Received February 2014; accepted April 2014

Abstract. Robust design (RD) is widely recognized as one of the most useful approaches for improving the quality of products and processes. Response surface methodology (RSM), a key tool in the RD procedure, is often used to estimate fitted response functions for the process mean and variance under the assumption that experimental errors are normally distributed. When this assumption is violated, as happens in many real-world industrial situations, alternative estimation methods (e.g., maximum likelihood estimation and weighted least squares) can be considered. The primary objective of this paper is to propose a Weibull regression model as an alternative RD modeling and estimation approach for the failure time of a system that follows a Weibull distribution. In addition, a new inverse problem-based estimation method is proposed to improve estimation efficiency for failure-time data. Finally, a simulation study is conducted for verification purposes.

Keywords: Robust design, Reliability, Inverse problem, Weibull regression model, Failure-time model, Complete data

1. Introduction. Robust design (RD), first introduced by Taguchi [17], is used to determine the optimal settings of controllable factors by minimizing performance variability and product bias, i.e., the deviation from the target value of a product. To achieve this goal, Taguchi used orthogonal arrays and signal-to-noise ratios to solve RD problems. However, the techniques of Taguchi drew considerable criticism from professional statisticians, such as Leon et al. [8], Box [2], and Box et al. [1]. Based on response surface methodology (RSM), Vining and Myers [20] proposed the dual response (DR) model approach as both an RD estimation method and an RD optimization model. In terms of optimization, the DR model minimizes the variance while keeping the mean at the target. This model was extended by Del Castillo and Montgomery [5], Lin and Tu [9], Shin and Cho [15,16], and Nha et al. [14]. In terms of estimation, the process mean and variance are modelled as two separate functions of the input factors, in which the unknown parameters can be estimated by the least squares method (LSM). This estimation method requires that the data follow a normal distribution and that the random errors be normally distributed with zero mean and constant variance. Unfortunately, these assumptions may not hold in many practical industrial problems. In such situations, the weighted least squares method, maximum likelihood, or a Bayesian approach, proposed by Luner [10], Cho and Shin [4], and Chen and Ye [3], respectively, can be used to estimate the coefficients in the DR model approach. In failure-time models, observations of the life of a system, device, or component often follow the Weibull, gamma, or exponential distribution [7,12]. Such time-to-event data are often represented in log form in regression models for reliability engineering and life studies.
In practice, most failure-time data in such situations follow the Weibull distribution. Therefore, the Weibull regression model (WRM) is used to model failure-time data as a function of covariates. The influence of explanatory variables on failure behavior can be observed directly in this model [13]. In the WRM, the maximum likelihood estimation (MLE) method is usually used to estimate the unknown model parameters. Maximizing the likelihood function requires iterative algorithms whose results depend largely on the choice of starting values, and the solution found is sometimes not a global optimum [21]. The inverse problem approach based on the Bayesian principle can be used to overcome these disadvantages of the MLE method. Some advantages of the inverse problem (IP) approach are: (1) prior information about the parameters, as well as uncertainties due to the model and the observed data, is taken into account; (2) unknown model parameters and unobserved parameters are treated as random variables described by probability density functions; and (3) the IP approach provides information about the estimated parameters in the form of a distribution.

The primary objective of this paper is to propose a Weibull regression model as an alternative RD modeling and estimation approach for the failure time of a system that follows a Weibull distribution. In addition, a new inverse problem (IP)-based estimation method is proposed to improve estimation efficiency for failure-time data. The proposed IP approach can give better estimation results because it considers the prior information on distributions and the uncertainties in the relationship between input factors and output responses. Finally, a simulation study is conducted for verification purposes, in which the proposed estimation method is compared with MLE based on RSM.

2. Weibull Regression Model. Let $\mathbf{t} = (t_1, t_2, \ldots, t_n)^T$ denote the failure times of a component or a system.
Assuming that $t_i$ follows a Weibull distribution, the probability density function of $t_i$ can be represented as

$$f(t_i) = \frac{\beta}{\eta_i(\mathbf{x})} \left( \frac{t_i}{\eta_i(\mathbf{x})} \right)^{\beta - 1} \exp\left[ -\left( \frac{t_i}{\eta_i(\mathbf{x})} \right)^{\beta} \right], \quad \text{for } i = 1, 2, \ldots, n \tag{1}$$

where $\eta_i(\mathbf{x})$ denotes the scale parameter, which is related to the explanatory variables (i.e., design points $\mathbf{x}_i = (x_{i1}, x_{i2}, \ldots, x_{ip})$), and $\beta$ denotes an unknown shape parameter that is assumed to be constant for all values of the explanatory variables. One way to check whether the parameters of the Weibull distribution depend on the explanatory variables is the graphical diagnostic method, a simple and useful way to recognize the relationship between parameters and explanatory variables [7]. By taking the logarithm of the failure times in Equation (1), the WRM can be obtained in log-linear form as

$$\mathbf{y} = \log \mathbf{t} = \mathbf{X}\mathbf{m} + \sigma \mathbf{z} \tag{2}$$

where $\mathbf{y} = (y_1, y_2, \ldots, y_n)^T$ is a vector that follows an extreme value distribution with location parameter $\alpha(\mathbf{x}) = \log \eta(\mathbf{x}) = \mathbf{X}\mathbf{m}$ and scale parameter $\sigma = \beta^{-1}$, $\mathbf{X}$ is the matrix of design points, $\mathbf{m}$ is the vector of unknown model parameters, and $\mathbf{z}$ is an error vector that follows a standard extreme value distribution. It is assumed that the total effect of the noise factors on $\mathbf{y}$ in the experiments can be defined as $\boldsymbol{\varepsilon} = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)^T$, where $\boldsymbol{\varepsilon}$ follows a normal distribution with mean zero and covariance matrix $\mathbf{D}$. Applying a quadratic form to Equation (2) gives

$$y_i = m_0 + \sum_{k=1}^{p} m_k x_{ik} + \sum\sum_{k \le l = 1}^{p} m_{kl} x_{ik} x_{il} + \sigma z_i + \varepsilon_i, \quad \text{for } i = 1, 2, \ldots, n \tag{3}$$
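The log transform behind Equation (2) can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original study: it draws Weibull failure times with Python's standard `random.weibullvariate(alpha, beta)` (scale first, then shape), standardizes the log times as $z = (\log t - \log \eta)/\sigma$ with $\sigma = 1/\beta$, and compares the sample variance of $z$ with $\pi^2/6$, the variance of a standard extreme value variable. The function names, parameter values, seeds, and sample size are all hypothetical choices. The helper `quadratic_row` shows one way to lay out a row of $\mathbf{X}$ for the quadratic form in Equation (3).

```python
import math
import random


def standardized_log_times(eta, beta, n, seed):
    """Draw n Weibull(scale=eta, shape=beta) times; return z = (log t - log eta) / sigma."""
    rng = random.Random(seed)
    sigma = 1.0 / beta  # scale of the log-lifetime, sigma = 1 / beta
    return [
        (math.log(rng.weibullvariate(eta, beta)) - math.log(eta)) / sigma
        for _ in range(n)
    ]


def sample_variance(values):
    """Unbiased sample variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def quadratic_row(x):
    """Row of X for Equation (3): intercept, linear terms, then x_k * x_l for k <= l."""
    p = len(x)
    row = [1.0] + list(x)
    row += [x[k] * x[l] for k in range(p) for l in range(k, p)]
    return row


# Hypothetical parameter choices; the standardized variance should not depend on them.
z_a = standardized_log_times(eta=1500.0, beta=2.0, n=200_000, seed=1)
z_b = standardized_log_times(eta=50.0, beta=0.7, n=200_000, seed=2)

print("sample variances:", sample_variance(z_a), sample_variance(z_b))
print("pi^2 / 6        :", math.pi ** 2 / 6)
print("quadratic row   :", quadratic_row([0.5, -1.0]))
```

Both sample variances should land close to $\pi^2/6 \approx 1.645$ whatever scale and shape are used, which is why $\sigma$ alone carries the spread of $\mathbf{y}$ in Equation (2).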

3. The Proposed IP-Based Weibull Regression Model. The IP method is a mathematical technique for determining the values of model parameters by using inferences drawn from observations [19]. One advantage of the IP approach is that prior information on the model parameters, as well as the uncertainties of the model and the observed data, can be incorporated into the estimated model by (i) treating model parameters as random variables and (ii) using probability density functions to express the information on the observed data and the model parameters. Based on the IP principles in Tarantola [18], the posterior information on the observed data $\mathbf{y}$ and model parameters $\mathbf{m}$ can be obtained as follows:

$$\rho(\mathbf{y}, \mathbf{m}) = c\, \frac{\delta(\mathbf{y}, \mathbf{m})\, \theta(\mathbf{y}, \mathbf{m})}{\varphi(\mathbf{y}, \mathbf{m})} \tag{4}$$

where $c$, $\delta(\mathbf{y}, \mathbf{m})$, $\theta(\mathbf{y}, \mathbf{m})$, and $\varphi(\mathbf{y}, \mathbf{m})$ represent a normalization constant, the prior probability density function of both the observed data and the model parameters, a probability density function representing the relationship between the observed data and the model parameters, and the homogeneous probability density of $\mathbf{y}$ and $\mathbf{m}$, respectively. In this paper, the observed data are collected independently of the prior information on the model parameters. In addition, no prior information on the model parameters is required. Moreover, both the model parameter space and the data space can be represented as linear spaces. Combining these properties with Equation (4), the posterior information on the model parameters can be represented as follows:

$$\rho_M(\mathbf{m}) = c \int_D \delta_D(\mathbf{y})\, \theta(\mathbf{y} \mid \mathbf{m})\, d\mathbf{y} \tag{5}$$

where the normalization constant $c$ is defined as $c = \left[ \int_D \delta_D(\mathbf{y})\, \theta(\mathbf{y} \mid \mathbf{m})\, d\mathbf{y} \right]^{-1}$. Assume that the observational uncertainties in the observed data have probability density function $f(\boldsymbol{\varepsilon})$ and that the modelization uncertainties have probability density function $g(\mathbf{z})$. Based on the IP principles in Tarantola [18], the two probability density functions $\delta_D(\mathbf{y})$ and $\theta(\mathbf{y} \mid \mathbf{m})$ for a given $\mathbf{m}$ can be written as

$$\delta_D(\mathbf{y}) = \delta_D(\mathbf{y}_{obs} \mid \mathbf{y}) = f(\boldsymbol{\varepsilon}) = f(\mathbf{y}_{obs} - \mathbf{y}) \tag{6}$$

$$\theta(\mathbf{y} \mid \mathbf{m}) = g(\mathbf{z}) = g\left( \frac{\mathbf{y} - \mathbf{X}\mathbf{m}}{\sigma} \right) \tag{7}$$

In this paper, the observational uncertainties represented by $\boldsymbol{\varepsilon}$ follow a normal distribution with zero mean and covariance matrix $\mathbf{D}$. Even when the experimental error does not follow a normal distribution, the proposed IP approach can estimate the response functions by using the probability density function of the error. The noise factors $\mathbf{z}$ due to the uncertainties of the model follow a standard extreme value distribution. Thus, the posterior information on the model parameters becomes

$$\rho_M(\mathbf{m}) = \frac{1}{\sigma^n (2\pi)^{n/2} |\mathbf{D}|^{1/2}} \int_D \exp\left[ -\frac{1}{2} (\mathbf{y} - \mathbf{y}_{obs})^T \mathbf{D}^{-1} (\mathbf{y} - \mathbf{y}_{obs}) \right] \cdot \exp\left\{ -[\mathbf{1}]^T \left( \frac{\mathbf{y} - \mathbf{X}\mathbf{m}}{\sigma} \right) - [\mathbf{1}]^T \exp\left[ -\left( \frac{\mathbf{y} - \mathbf{X}\mathbf{m}}{\sigma} \right) \right] \right\} d\mathbf{y} \tag{8}$$

where $[\mathbf{1}]^T$ denotes the transpose of the $n \times 1$ vector of ones.

4. RD Modeling and Optimization.
The RD procedure consists of three sequential stages: design of experiments, model parameter estimation, and optimization to obtain the optimal factor settings.

4.1. Estimation. Based on the dual response model principle, two output response functions (i.e., the process mean µ(y) and variance ϑ(y)) are considered in this study. The coefficients of the process mean, $E(\mathbf{m})$, and of the variance, $C(\mathbf{m})$, can be obtained by taking the expectation and the covariance matrix of the model parameters $\mathbf{m}$, respectively. From the posterior probability density function in Equation (8), the expectation and covariance of the model parameters can be obtained as follows:

$$E(\mathbf{m}) = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T (\mathbf{y}_{obs} - \sigma\gamma) \tag{9}$$

$$C(\mathbf{m}) = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \left( \frac{\sigma^2 \pi^2}{6} \mathbf{I}_n + \mathbf{D} \right) \mathbf{X} (\mathbf{X}^T \mathbf{X})^{-1} \tag{10}$$

Therefore, the estimated response functions of the process mean and variance can be formulated as

$$\hat{\mu}(\mathbf{x}) = \tilde{\mathbf{x}}^T E(\mathbf{m}) = \tilde{\mathbf{x}}^T (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T (\mathbf{y}_{obs} - \sigma\gamma) \tag{11}$$

$$\hat{\vartheta}^2(\mathbf{x}) = \tilde{\mathbf{x}}^T C(\mathbf{m})\, \tilde{\mathbf{x}} = \tilde{\mathbf{x}}^T (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \left( \frac{\sigma^2 \pi^2}{6} \mathbf{I}_n + \mathbf{D} \right) \mathbf{X} (\mathbf{X}^T \mathbf{X})^{-1} \tilde{\mathbf{x}} \tag{12}$$

where the vector $\tilde{\mathbf{x}}$ represents a polynomial form of the input factors. The parameter $\gamma$ is inversely proportional to the shape parameter $\beta$ of the Weibull distribution, which is estimated based on quantiles, i.e., L-moments [6]. In addition, the optimal solutions are closely associated with the mean and variance functions. Both functions depend on the scale parameter $\sigma$, which is inversely proportional to the shape parameter $\beta$. Therefore, different optimal solutions are obtained as the shape parameter $\beta$ changes.

4.2. Optimization. The main objective of the RD optimization stage is to find the settings of the controllable variables that minimize the process bias and the process variance simultaneously. Using the estimated mean and variance functions from Equations (11) and (12), a conventional RD optimization model (i.e., the mean squared error (MSE) model) proposed by Lin and Tu [9] can be written as follows:

$$\text{Minimize } \mathrm{MSE} = (\hat{\mu}(\mathbf{x}) - T)^2 + \hat{\vartheta}^2(\mathbf{x}) \quad \text{subject to } \mathbf{x} \in \Omega \tag{13}$$

where $T$ denotes the desired target value.

5. Simulation Study. To conduct the simulation study, a true function for the failure time is selected. The true relationship between the explanatory variables and the failure time is

$$t = \exp(-4.73 x_2^2 + 5.3 x_1 + 4.956 x_1 x_2 - 3.589 x_2 + 5.93) + 336.75 x_2^2 + 113 x_1^2 .$$

A full factorial design is used to investigate the effects of the input factors on the associated output responses, as well as the interactions between them. Each input factor ($x_1$ and $x_2$) is set at five levels, so the total number of experimental runs is $5^2 = 25$. Replicated observations of $y$ are generated by adding random noise to the true value of $y$ at each design point: $y_i = y_{true,i} + N(0, \sigma_i^2)$.
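The data-generation scheme just described can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names, the number of replicates, the random seed, and the placeholder noise variance of 0.5 are assumptions. The interaction coefficient 4.956 is used because it reproduces the tabulated true failure times at the corner runs (for example $t = 534.95$ at $x_1 = x_2 = -1$); a value of 4.596 would not.

```python
import itertools
import math
import random

LEVELS = [-1.0, -0.5, 0.0, 0.5, 1.0]  # five levels per factor


def t_true(x1, x2):
    """True failure time as a function of the two input factors."""
    return (
        math.exp(-4.73 * x2**2 + 5.3 * x1 + 4.956 * x1 * x2 - 3.589 * x2 + 5.93)
        + 336.75 * x2**2
        + 113.0 * x1**2
    )


def simulate_replicates(noise_variance, n_rep, seed=0):
    """Return (x1, x2, [y replicates]) for every run of the 5 x 5 full factorial.

    y is the log failure time plus N(0, noise_variance) noise.
    n_rep and seed are hypothetical; the paper does not state them.
    """
    rng = random.Random(seed)
    sd = math.sqrt(noise_variance)
    data = []
    for x1, x2 in itertools.product(LEVELS, repeat=2):
        y = math.log(t_true(x1, x2))
        data.append((x1, x2, [y + rng.gauss(0.0, sd) for _ in range(n_rep)]))
    return data


design = list(itertools.product(LEVELS, repeat=2))  # x1 varies slowest
runs = simulate_replicates(noise_variance=0.5, n_rep=3)

print(len(design), "runs")
print("t_true at centre:", round(t_true(0.0, 0.0), 2))
print("first run:", runs[0])
```

Iterating `itertools.product` with $x_1$ as the outer factor gives the run order $(-1,-1), (-1,-0.5), \ldots, (1,1)$, which makes it easy to compare the computed values with a run-by-run table.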
In this simulation study, four cases are considered: (1) same and small variance, (2) same and large variance, (3) different and small variance, and (4) different and large variance. The true noise variances in cases (3) and (4) are $\sigma_i^2 = e^{0.35 x_1 - 0.3 x_2 - 0.5}$ and $\sigma_i^2 = e^{0.35 x_1 - 0.3 x_2 + 0.5}$, respectively. The target failure time is assumed to be 1500 hours, so the corresponding target value of $y$ is $\log 1500 = 7.3132$. The experimental data for the four cases are given in Table 1. For verification purposes, a comparative study is conducted in which the conventional RSM-based MLE approach and the proposed WRM-based IP approach are used to estimate the functional relationships between the input factors and their associated output responses. The optimal solutions from the proposed WRM-based IP and RSM-based MLE approaches, with the corresponding mean, variance, bias, and MSE, are reported in Table 2. As shown in Table 2, the proposed IP approach yields smaller MSE values than the MLE approach in all four cases. The surface and contour plots of the mean and variance functions of both the WRM based on the IP approach and the RSM based on MLE for case (3), different and small variance, are illustrated in Figures 1 and 2, respectively.
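To make the estimation and optimization steps concrete, the following sketch chains Equations (9) to (13) on one simulated data set. It is an illustration under several assumptions and is not expected to reproduce the values in Table 2: $\sigma = 0.5$ is a hypothetical scale value, $\gamma$ is set to the Euler–Mascheroni constant (the mean of the standard extreme value density that appears in Equation (8)) rather than estimated by L-moments, $\mathbf{D}$ is taken as $0.5\,\mathbf{I}$ (the same-and-small case), a single noisy observation per run is used, and a plain grid search over $\Omega = [-1, 1]^2$ stands in for whatever optimizer the authors used. All function names are hypothetical.

```python
import numpy as np

LEVELS = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])
TARGET = np.log(1500.0)  # target on the log scale, about 7.3132


def y_true(x1, x2):
    """Log of the true failure time (4.956 is the value consistent with Table 1)."""
    t = (np.exp(-4.73 * x2**2 + 5.3 * x1 + 4.956 * x1 * x2 - 3.589 * x2 + 5.93)
         + 336.75 * x2**2 + 113.0 * x1**2)
    return np.log(t)


def quad_terms(x1, x2):
    """Polynomial vector x~ = (1, x1, x2, x1^2, x1*x2, x2^2)."""
    return np.array([1.0, x1, x2, x1**2, x1 * x2, x2**2])


def ip_estimates(X, y_obs, sigma, gamma, D):
    """Posterior mean and covariance of m, following Equations (9) and (10)."""
    n = X.shape[0]
    A = np.linalg.solve(X.T @ X, X.T)  # (X^T X)^{-1} X^T
    mean_m = A @ (y_obs - sigma * gamma)
    cov_m = A @ ((sigma**2 * np.pi**2 / 6.0) * np.eye(n) + D) @ A.T
    return mean_m, cov_m


def mse(x1, x2, mean_m, cov_m, target=TARGET):
    """MSE criterion of Equation (13) built from Equations (11) and (12)."""
    xt = quad_terms(x1, x2)
    mu_hat = xt @ mean_m
    var_hat = xt @ cov_m @ xt
    return (mu_hat - target) ** 2 + var_hat


# Hypothetical settings, labeled as assumptions in the text above.
SIGMA, GAMMA, NOISE_VAR = 0.5, 0.5772156649, 0.5

rng = np.random.default_rng(0)
design = [(a, b) for a in LEVELS for b in LEVELS]
X = np.array([quad_terms(a, b) for a, b in design])
y_obs = np.array([y_true(a, b) for a, b in design]) + rng.normal(
    0.0, np.sqrt(NOISE_VAR), size=len(design))
D = NOISE_VAR * np.eye(len(design))

mean_m, cov_m = ip_estimates(X, y_obs, SIGMA, GAMMA, D)

# Grid search over the design region as a simple stand-in optimizer.
grid = np.linspace(-1.0, 1.0, 81)
best_mse, best_x1, best_x2 = min(
    (mse(a, b, mean_m, cov_m), a, b) for a in grid for b in grid)
print("best x:", best_x1, best_x2, "MSE:", best_mse)
```

A grid search is used only because it is transparent and needs no starting value; with a smooth quadratic mean model, a gradient-based solver restricted to $\Omega$ would be a natural replacement.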


Table 1. Experimental data

                                         “True” noise
Run     x1     x2      t_true   y_true   Same      Same      Different  Different
                                         & Small   & Large   & Small    & Large
  1   -1     -1        534.95    6.28    0.5       1.5       0.58       1.57
  2   -1     -0.5      238.45    5.47    0.5       1.5       0.50       1.35
  3   -1      0        114.88    4.74    0.5       1.5       0.43       1.16
  4   -1      0.5      197.20    5.28    0.5       1.5       0.37       1.00
  5   -1      1        449.75    6.11    0.5       1.5       0.32       0.86
  6   -0.5   -1        466.19    6.14    0.5       1.5       0.69       1.87
  7   -0.5   -0.5      281.62    5.64    0.5       1.5       0.59       1.61
  8   -0.5    0         54.83    4.00    0.5       1.5       0.51       1.38
  9   -0.5    0.5      112.83    4.73    0.5       1.5       0.44       1.19
 10   -0.5    1        365.00    5.90    0.5       1.5       0.38       1.03
 11    0     -1        456.93    6.12    0.5       1.5       0.82       2.23
 12    0     -0.5      777.86    6.66    0.5       1.5       0.70       1.92
 13    0      0        376.15    5.93    0.5       1.5       0.61       1.65
 14    0      0.5      103.35    4.64    0.5       1.5       0.52       1.42
 15    0      1        336.84    5.82    0.5       1.5       0.45       1.22
 16    0.5   -1        507.74    6.23    0.5       1.5       0.98       2.65
 17    0.5   -0.5     2956.53    7.99    0.5       1.5       0.84       2.28
 18    0.5    0       5352.36    8.59    0.5       1.5       0.72       1.96
 19    0.5    0.5     1048.80    6.96    0.5       1.5       0.62       1.69
 20    0.5    1        380.47    5.94    0.5       1.5       0.54       1.45
 21    1     -1        619.27    6.43    0.5       1.5       1.16       3.16
 22    1     -0.5    11858.13    9.38    0.5       1.5       1.00       2.72
 23    1      0      75470.60   11.23    0.5       1.5       0.86       2.34
 24    1      0.5    45949.61   10.74    0.5       1.5       0.74       2.01
 25    1      1       3059.47    8.03    0.5       1.5       0.64       1.73

Table 2. Comparative study results

Case  Criteria of variances          Model   x1        x2        Mean     Bias          Variance   MSE
                                                                          |µ̂(x) − T|    ϑ̂²(x)
1     Same and small variances       MLE      0.8441    0.2806   7.2686   0.045         0.43       0.432
                                     IP       0.5251    0.4268   7.3129   0.000         0.0141     0.0141
2     Same and large variances       MLE     -0.7048    0.9314   7.2768   0.036         1.0892     1.0906
                                     IP       0.5166    0.417    7.313    0.000         0.0095     0.0095
3     Different and small variances  MLE     -0.4168    0.6963   7.2869   0.026         0.8633     0.864
                                     IP       0.495     0.7296   7.3182   0.005         0.0189     0.019
4     Different and large variances  MLE     -0.6981    0.4356   7.2104   0.103         0.9235     0.934
                                     IP       0.2725   -0.1509   7.2245   0.089         0.1039     0.1118
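The MSE column of Table 2 follows directly from Equation (13): the squared bias plus the variance. As a quick consistency check, the sketch below recomputes the bias and MSE from the reported mean and variance with the target $T = 7.3132$. The data are copied from Table 2; the variable names and the rounding tolerance are illustrative choices.

```python
TARGET = 7.3132  # log(1500), the target value of y

# (case, model, mean, variance, reported bias, reported MSE), copied from Table 2
TABLE_2 = [
    (1, "MLE", 7.2686, 0.4300, 0.045, 0.4320),
    (1, "IP",  7.3129, 0.0141, 0.000, 0.0141),
    (2, "MLE", 7.2768, 1.0892, 0.036, 1.0906),
    (2, "IP",  7.3130, 0.0095, 0.000, 0.0095),
    (3, "MLE", 7.2869, 0.8633, 0.026, 0.8640),
    (3, "IP",  7.3182, 0.0189, 0.005, 0.0190),
    (4, "MLE", 7.2104, 0.9235, 0.103, 0.9340),
    (4, "IP",  7.2245, 0.1039, 0.089, 0.1118),
]

recomputed = []
for case, model, mean, variance, bias_rep, mse_rep in TABLE_2:
    bias = abs(mean - TARGET)   # |mu_hat(x) - T|
    mse = bias**2 + variance    # Equation (13)
    recomputed.append((case, model, bias, mse, bias_rep, mse_rep))
    print(f"case {case} {model:>3}: bias={bias:.4f} MSE={mse:.4f} (reported {mse_rep})")
```

In every case the variance term dominates the MSE, so the gap between the two approaches comes almost entirely from the estimated variance rather than from the bias.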

Figure 1. Contour and surface plots of mean functions. (a) IP approach, (b) MLE approach.

Figure 2. Contour and surface plots of variance functions. (a) IP approach, (b) MLE approach.

6. Conclusion. In this paper, an RD modeling approach based on the WRM is developed to model failure-time data that follow a Weibull distribution. In addition, the proposed IP approach demonstrated significant flexibility in estimating model parameters. To the best of our knowledge, this is a new modeling and estimation approach in the context of RD modeling and optimization. The proposed IP approach may also offer an alternative where the many assumptions of classical parameter estimation methods are a disadvantage. Furthermore, because the model parameters are treated as random variables, the prior information on the model parameters, as well as the uncertainties of the model and the observations, is taken into account in estimating the regression parameters. In the simulation study, the WRM based on the proposed IP approach yielded smaller MSE values than the RSM based on the MLE method in all four cases. For further research, the proposed IP approach can be implemented for failure-time models where the data follow other non-normal distributions, such as the exponential and log-normal distributions. The IP approach was proposed here as an alternative estimation method for the Weibull regression model in the case of complete data; it could also be extended to different types of censored data.


Acknowledgement. This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (20120683).

REFERENCES

[1] G. Box, S. Bisgaard and C. Fung, An explanation and critique of Taguchi’s contribution to quality engineering, Quality and Reliability Engineering International, vol.4, no.2, pp.123-131, 1988.
[2] G. E. P. Box, Signal-to-noise ratios, performance criteria, and transformations, Technometrics, vol.30, no.1, pp.1-17, 1988.
[3] Y. Chen and K. Ye, A Bayesian hierarchical approach to dual response surface modeling, Technical Report, Department of Statistics, Virginia Tech, Blacksburg, Virginia, 2005.
[4] B. R. Cho and S. M. Shin, Quality improvement and robust design methods to a pharmaceutical research and development, Mathematical Problems in Engineering, 2012.
[5] E. Del Castillo and D. C. Montgomery, A nonlinear programming solution to the dual response problem, Journal of Quality Technology, vol.25, no.3, pp.199-204, 1993.
[6] J. R. M. Hosking, L-moments: Analysis and estimation of distributions using linear combinations of order statistics, Journal of the Royal Statistical Society, Series B, vol.52, pp.105-124, 1990.
[7] J. F. Lawless, Statistical Models and Methods for Lifetime Data, John Wiley and Sons, New York, 2002.
[8] R. V. Leon, A. C. Shoemaker and R. N. Kackar, Performance measures independent of adjustment: An explanation and extension of Taguchi signal-to-noise ratio, Technometrics, vol.29, no.3, pp.253-285, 1987.
[9] D. K. J. Lin and W. Tu, Dual response surface optimization, Journal of Quality Technology, vol.27, no.1, pp.34-39, 1995.
[10] J. J. Luner, Achieving continuous improvement with the dual response approach: A demonstration of the Roman catapult, Quality Engineering, vol.6, no.4, pp.691-705, 1994.
[11] R. H. Myers and D. C. Montgomery, Response Surface Methodology, John Wiley and Sons, New York, 2002.
[12] W. Nelson, Accelerated Testing – Statistical Models, Test Plans and Data Analyses, Wiley, New York, 1990.
[13] M. Newby, Perspective on Weibull proportional-hazards models, IEEE Transactions on Reliability, vol.43, no.2, pp.217-223, 1994.
[14] V. T. Nha, S. M. Shin and S. H. Jeong, Lexicographical dynamic goal programming approach to a robust design optimization within the pharmaceutical environment, European Journal of Operational Research, vol.229, no.2, pp.505-517, 2013.
[15] S. M. Shin and B. R. Cho, Bias-specified robust design optimization and analytical solutions, Computers & Industrial Engineering, vol.48, pp.129-148, 2005.
[16] S. M. Shin and B. R. Cho, Robust design models for customer specified bounds on process parameters, Journal of Systems Science and Systems Engineering, vol.15, no.1, pp.2-18, 2006.
[17] G. Taguchi, Introduction to Quality Engineering: Designing Quality into Products and Processes, Tokyo, Asian Productivity Association, 1986.
[18] A. Tarantola, Inverse Problem Theory: Methods for Data Fitting and Model Parameter Estimation, Elsevier, Amsterdam, New York, 1987.
[19] A. Tarantola, Inverse problem theory and methods for model parameter estimation, Society for Industrial and Applied Mathematics, Philadelphia, PA, USA, 2005.
[20] G. G. Vining and R. H. Myers, Combining Taguchi and response surface philosophies: A dual response approach, Journal of Quality Technology, vol.22, no.1, pp.38-45, 1990.
[21] W. Wang and D. B. Kececioglu, Fitting the Weibull log-linear model to accelerated life-test data, IEEE Transactions on Reliability, vol.49, no.2, pp.224-229, 2000.
