This article was downloaded by: [Central Michigan University] On: 01 April 2013, At: 10:43 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK
Communications in Statistics - Theory and Methods Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lsta20
Weibull-Pareto Distribution and Its Applications
Ayman Alzaatreh (a), Felix Famoye (b) & Carl Lee (b)
(a) Department of Mathematics, Austin Peay State University, Clarksville, Tennessee, USA
(b) Department of Mathematics, Central Michigan University, Mount Pleasant, Michigan, USA
Version of record first published: 01 Apr 2013.
To cite this article: Ayman Alzaatreh , Felix Famoye & Carl Lee (2013): Weibull-Pareto Distribution and Its Applications, Communications in Statistics - Theory and Methods, 42:9, 1673-1691 To link to this article: http://dx.doi.org/10.1080/03610926.2011.599002
Communications in Statistics—Theory and Methods, 42: 1673–1691, 2013 Copyright © Taylor & Francis Group, LLC ISSN: 0361-0926 print/1532-415X online DOI: 10.1080/03610926.2011.599002
Weibull-Pareto Distribution and Its Applications
AYMAN ALZAATREH (1), FELIX FAMOYE (2), AND CARL LEE (2)
(1) Department of Mathematics, Austin Peay State University, Clarksville, Tennessee, USA
(2) Department of Mathematics, Central Michigan University, Mount Pleasant, Michigan, USA

In this article, a new distribution, the Weibull-Pareto distribution, is defined and studied. Various properties of the Weibull-Pareto distribution are obtained. The distribution is found to be unimodal, and its shape can be skewed to the right or to the left. Results for moments, limiting behavior, and Shannon's entropy are provided. The method of modified maximum likelihood estimation is proposed for estimating the model parameters. Several real data sets are used to illustrate the applications of the Weibull-Pareto distribution.

Keywords: Modified maximum likelihood estimation; Simulation study; T-X family; Unimodality.
Mathematics Subject Classification: 62E15; 62F10; 62P10; 68U20.
1. Introduction
The Weibull distribution is widely used to model many types of data, in particular in survival and reliability analyses. To add more flexibility to the Weibull distribution, researchers have developed many generalizations of it. These generalizations include the generalized Weibull distribution by Mudholkar and Kollia (1994), the exponentiated-Weibull distribution by Mudholkar et al. (1995), and the beta-Weibull distribution by Famoye et al. (2005).

Let $F(x)$ be the cumulative distribution function (cdf) of any random variable $X$ and $r(t)$ be the probability density function (pdf) of a random variable $T$ defined on $[0, \infty)$. The cdf of the generalized family of distributions defined by Alzaatreh et al. (in press) is given by
$$G(x) = \int_0^{-\log(1 - F(x))} r(t)\,dt. \tag{1.1}$$
Received December 14, 2010; Accepted June 14, 2011 Address correspondence to Felix Famoye, Department of Mathematics, Central Michigan University, Mount Pleasant, MI 48859, USA; E-mail:
[email protected]
The family of distributions defined by (1.1) is called the “Transformed-Transformer” family (or T-X family) in Alzaatreh et al. (in press). If a random variable $T$ follows the Weibull distribution with parameters $c$ and $\gamma$, $r(t) = (c/\gamma)(t/\gamma)^{c-1} e^{-(t/\gamma)^c}$, $t \ge 0$, the definition in (1.1) leads to the Weibull-X family with the pdf
$$g(x) = \frac{c}{\gamma}\,\frac{f(x)}{1 - F(x)} \left\{\frac{-\log(1 - F(x))}{\gamma}\right\}^{c-1} \exp\left\{-\left(\frac{-\log(1 - F(x))}{\gamma}\right)^{c}\right\}. \tag{1.2}$$
In this article, we study a member of the Weibull-X family where X is the Pareto random variable. In Sec. 2, the Weibull-Pareto distribution (WPD) is defined. In Sec. 3, we study the properties of the WPD, including the limiting behavior and unimodality. Moments, the moment-generating function, and the existence of moments are studied in Sec. 4. Section 5 deals with estimation and simulation for the WPD; in that section, modified maximum likelihood estimation is also proposed for estimating the model parameters. Applications of the WPD to real data sets are provided in Sec. 6.
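2. The Weibull-Pareto Distribution
If $X$ is a Pareto random variable with the density function $f(x) = k\theta^k / x^{k+1}$, $x > \theta$, then (1.2) reduces to
$$g(x) = \frac{kc}{\gamma x}\left\{\frac{k}{\gamma}\log\left(\frac{x}{\theta}\right)\right\}^{c-1} \exp\left\{-\left(\frac{k}{\gamma}\log\left(\frac{x}{\theta}\right)\right)^{c}\right\}, \quad x > \theta, \tag{2.1}$$
and replacing $k/\gamma$ by $\beta$, the distribution in (2.1) can be written as
$$g(x) = \frac{\beta c}{x}\left\{\beta\log\left(\frac{x}{\theta}\right)\right\}^{c-1} \exp\left\{-\left(\beta\log\left(\frac{x}{\theta}\right)\right)^{c}\right\}, \quad x > \theta;\ c, \beta, \theta > 0. \tag{2.2}$$
A random variable $X$ with the pdf $g(x)$ in (2.2) is said to follow the Weibull-Pareto distribution and will be denoted by WPD$(c, \beta, \theta)$. When $\theta = 1$, the WPD reduces to the log-Weibull distribution defined by Sayama and Sekine (2004). When $c = 1$, the WPD reduces to the Pareto distribution with parameters $\beta$ and $\theta$. From (2.2), we obtain the cdf of the WPD as
$$G(x) = 1 - \exp\left\{-\left(\beta\log(x/\theta)\right)^{c}\right\}. \tag{2.3}$$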
In the next section, some general properties of WPD will be addressed including special cases, the limiting behavior, and the unimodality.
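The density (2.2) and the cdf (2.3) translate directly into code. The following is a minimal sketch rather than a reference implementation: the function names (`wpd_pdf`, `wpd_cdf`, `pareto_pdf`, `trapezoid`) are illustrative choices, and the trapezoidal integration is included only as a sanity check that the two formulas agree with each other and with the Pareto special case at c = 1.

```python
import math


def wpd_pdf(x, c, beta, theta):
    """Density (2.2); returns 0 outside the support x > theta."""
    if x <= theta:
        return 0.0
    u = beta * math.log(x / theta)
    return (beta * c / x) * u ** (c - 1.0) * math.exp(-(u ** c))


def wpd_cdf(x, c, beta, theta):
    """Distribution function (2.3)."""
    if x <= theta:
        return 0.0
    return 1.0 - math.exp(-((beta * math.log(x / theta)) ** c))


def pareto_pdf(x, beta, theta):
    """Pareto density with shape beta and scale theta (the c = 1 case)."""
    if x <= theta:
        return 0.0
    return beta * theta ** beta / x ** (beta + 1.0)


def trapezoid(f, a, b, n=20000):
    """Composite trapezoidal rule, used here only as a numerical check."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h
```

Integrating `wpd_pdf` from θ up to a point x should reproduce `wpd_cdf` at x, and setting c = 1 should reproduce the Pareto density.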
3. Properties of the Weibull-Pareto Distribution
The following lemma gives the relation between the WPD and the Weibull, exponential, and Type 1 extreme value distributions.

Lemma 3.1 (Transformation).
(a) If a random variable $Y$ follows the Weibull distribution with parameters $c$ and $1/\beta$, then the random variable $X = \theta e^{Y}$ follows WPD$(c, \beta, \theta)$.
(b) If a random variable $Y$ follows the standard exponential distribution, then the random variable $X = \theta \exp(Y^{1/c}/\beta)$ follows WPD$(c, \beta, \theta)$.
(c) If a random variable $Y$ follows the Type 1 extreme value distribution with scale parameter $1/c$, then the random variable $X = \theta \exp(e^{-Y}/\beta)$ follows WPD$(c, \beta, \theta)$.

Proof. The results follow by using the transformation technique.
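The three transformations in Lemma 3.1 can be checked by simulation. The sketch below draws samples by each route and compares the sample medians with the theoretical median θ exp{(log 2)^{1/c}/β} obtained from the cdf (2.3). It assumes that the Type 1 extreme value variable in part (c) has cdf exp(−e^{−cy}) (location 0, scale 1/c); the helper names are hypothetical.

```python
import math
import random


def sample_via_weibull(rng, c, beta, theta):
    # Part (a): Y ~ Weibull(shape c, scale 1/beta), X = theta * exp(Y).
    # random.weibullvariate takes (scale, shape) in that order.
    return theta * math.exp(rng.weibullvariate(1.0 / beta, c))


def sample_via_exponential(rng, c, beta, theta):
    # Part (b): Y ~ standard exponential, X = theta * exp(Y**(1/c) / beta).
    y = rng.expovariate(1.0)
    return theta * math.exp(y ** (1.0 / c) / beta)


def sample_via_extreme_value(rng, c, beta, theta):
    # Part (c): Y drawn by inversion from the assumed cdf exp(-exp(-c*y)).
    u = rng.random()
    while u == 0.0:  # avoid log(0)
        u = rng.random()
    y = -math.log(-math.log(u)) / c
    return theta * math.exp(math.exp(-y) / beta)


def sample_median(values):
    s = sorted(values)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else 0.5 * (s[mid - 1] + s[mid])


def wpd_median(c, beta, theta):
    # Solves G(x) = 1/2 using the cdf (2.3).
    return theta * math.exp(math.log(2.0) ** (1.0 / c) / beta)


def empirical_medians(seed, n, c, beta, theta):
    """Sample medians from the three constructions, in the order (a), (b), (c)."""
    rng = random.Random(seed)
    samplers = (sample_via_weibull, sample_via_exponential, sample_via_extreme_value)
    return [sample_median([f(rng, c, beta, theta) for _ in range(n)]) for f in samplers]
```

With a few thousand draws, the three sample medians should all lie close to `wpd_median(c, beta, theta)`.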
The hazard function associated with the WPD is
$$h_g(x) = \frac{g(x)}{1 - G(x)} = \frac{\beta c}{x}\left\{\beta\log\left(\frac{x}{\theta}\right)\right\}^{c-1}, \quad x > \theta. \tag{3.1}$$
The limiting behaviors of the Weibull-Pareto pdf and the hazard function are given in the following theorem.

Theorem 3.1. The limit of the Weibull-Pareto density function and the Weibull-Pareto hazard function as $x \to \infty$ is 0, and the limit as $x \to \theta^+$ is given by
$$\lim_{x\to\theta^+} g(x) = \lim_{x\to\theta^+} h_g(x) = \begin{cases} 0, & c > 1, \\ \beta/\theta, & c = 1, \\ \infty, & c < 1. \end{cases} \tag{3.2}$$

Proof. We first show that $\lim_{x\to\infty} g(x) = \lim_{x\to\infty} h_g(x) = 0$. Since $g(x) = h_g(x)\{1 - G(x)\}$, we only need to show that $\lim_{x\to\infty} h_g(x) = 0$. If $c \le 1$, then by definition (3.1) we have $\lim_{x\to\infty} h_g(x) = 0$. If $c > 1$, there is an integer $m$ such that $1 < c < m$. By applying L'Hôpital's rule $m - 1$ times, we have
$$\lim_{x\to\infty} h_g(x) = \lim_{x\to\infty} \frac{\beta c\{\beta\log(x/\theta)\}^{c-1}}{x} = \lim_{x\to\infty} \beta^{m} c(c-1)(c-2)\cdots(c-m+1)\,\frac{\{\beta\log(x/\theta)\}^{c-m}}{x}.$$
Since $c < m$, $\lim_{x\to\infty} h_g(x) = 0$. The result in (3.2) follows directly from the definition (3.1) and $g(x) = h_g(x)\{1 - G(x)\}$.

The following theorem shows that the Weibull-Pareto distribution is unimodal.

Theorem 3.2. The WPD has a unique mode at $x = x_0$. When $c \le 1$, the mode is $x_0 = \theta$, and when $c > 1$ the mode $x_0$ is the solution of the equation $k(x) = 0$, where
$$k(x) = -\log(x/\theta) - c\{\beta\log(x/\theta)\}^{c} + c - 1. \tag{3.3}$$
Proof. The derivative with respect to $x$ of Eq. (2.2) is given by
$$g'(x) = c\beta^{2} x^{-2}\{\beta\log(x/\theta)\}^{c-2} \exp\left\{-\left(\beta\log(x/\theta)\right)^{c}\right\} k(x). \tag{3.4}$$
From (3.4) the critical points of $g(x)$ are $x = \theta$ and $x = x_0$, where $k(x_0) = 0$. Now, for $c \le 1$, it is easy to see from (3.4) that $g'(x) < 0$, so $g(x)$ is strictly decreasing. Also, from Theorem 3.1 we have $\lim_{x\to\theta^+} g(x) = \beta/\theta$ when $c = 1$ and $\lim_{x\to\theta^+} g(x) = \infty$ when $c < 1$. Thus, $g(x)$ has a unique mode at $x = \theta$. For $c > 1$ and using Theorem 3.1, $\lim_{x\to\theta^+} g(x) = 0$ implies that $x = \theta$ cannot be a modal point. So the modes of $g(x)$ are the solutions to the equation $k(x) = 0$. Finally, we need to show that the equation $k(x) = 0$ has one solution. The derivative of $k(x)$ with respect to $x$ is given by
$$k'(x) = -\left\{x^{-1} + c^{2}\beta x^{-1}\{\beta\log(x/\theta)\}^{c-1}\right\},$$
which implies that $k(x)$ is strictly decreasing when $c > 1$ and hence the equation $k(x) = 0$ has at most one solution. Using the facts from Theorem 3.1 that $\lim_{x\to\theta^+} g(x) = 0$ and $\lim_{x\to\infty} g(x) = 0$, we conclude that $g(x)$ must have a unique mode.

In Figs. 1–3, various graphs of $g(x)$ and $h(x)$ are provided. These plots indicate that the WPD has a very long right tail, and when the parameter $\beta$ increases the peak of the distribution increases.

Quantile functions for many distributions do not have a closed form. For the Weibull-Pareto distribution, the following lemma gives a closed-form quantile function.

Lemma 3.2. Let $Q(\lambda)$, $0 < \lambda < 1$, denote the quantile function for the WPD. Then $Q(\lambda)$ is given by
$$Q(\lambda) = \theta \exp\left\{\left(-\log(1 - \lambda)\right)^{1/c}/\beta\right\}. \tag{3.5}$$
Figure 1. The Weibull-Pareto pdf for various values of $c$ when $\beta = 1$ and $\theta = 1$.
Figure 2. The Weibull-Pareto pdf when = 3 and various values of c and .
Figure 3. The Weibull-Pareto hazard function for various values of c and when = 1 and = 1.
Proof. By using $G(Q(\lambda)) = \lambda$ and (2.3), we obtain (3.5), the quantile function of the WPD.

Setting $\lambda = 0.25$, $0.50$, and $0.75$ in (3.5), the quartiles of the WPD can be obtained.

The entropy of a random variable $X$ is a measure of variation of uncertainty (Rényi, 1961). Shannon's entropy for a random variable $X$ with pdf $g(x)$ is defined as $E\{-\log(g(X))\}$. Shannon (1948) showed important applications of this entropy in communication theory. The entropy has also been used in many fields such as physics, engineering, and economics.

Lemma 3.3. The Shannon's entropy for a random variable $X$ that follows the WPD is
$$-\log(\beta c/\theta) + \Gamma(1 + c^{-1})/\beta + \xi(1 - c^{-1}) + 1, \tag{3.6}$$
where $\xi = -\int_0^\infty e^{-u}\log u\,du \approx 0.57722$ is the Euler gamma constant.

Proof. For the WPD, the Shannon's entropy is given by $E\{-\log(g(X))\} = -\int_\theta^\infty g(x)\log(g(x))\,dx$. The result in (3.6) is derived by using the substitution $u = \{\beta\log(x/\theta)\}^{c}$ in $g(x)$ of (2.2).
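The closed forms in this section lend themselves to a quick numerical check. The sketch below evaluates the quantile function (3.5), finds the mode for c > 1 by bisection on k in (3.3) (written in terms of v = log(x/θ)), and compares the entropy (3.6) with a midpoint-rule approximation of E{−log g(X)}. The function names and the bisection bracket are illustrative assumptions, not part of the article.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, written xi in (3.6)


def wpd_quantile(lam, c, beta, theta):
    """Quantile function (3.5) for 0 < lam < 1."""
    return theta * math.exp((-math.log(1.0 - lam)) ** (1.0 / c) / beta)


def wpd_mode(c, beta, theta):
    """Mode from Theorem 3.2: theta if c <= 1, else the root of k in (3.3)."""
    if c <= 1.0:
        return theta

    def k(v):  # v = log(x / theta)
        return -v - c * (beta * v) ** c + c - 1.0

    lo, hi = 0.0, c - 1.0  # k(lo) = c - 1 > 0 and k(hi) < 0; k is decreasing
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if k(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return theta * math.exp(0.5 * (lo + hi))


def wpd_entropy(c, beta, theta):
    """Closed-form Shannon entropy (3.6)."""
    return (-math.log(beta * c / theta) + math.gamma(1.0 + 1.0 / c) / beta
            + EULER_GAMMA * (1.0 - 1.0 / c) + 1.0)


def wpd_entropy_numeric(c, beta, theta, n=100000):
    """Midpoint-rule estimate of E{-log g(X)} after substituting x = Q(lam)."""
    total = 0.0
    for j in range(n):
        lam = (j + 0.5) / n
        u = (-math.log(1.0 - lam)) ** (1.0 / c)  # u = beta * log(x / theta)
        log_x = math.log(theta) + u / beta
        log_g = math.log(beta * c) - log_x + (c - 1.0) * math.log(u) - u ** c
        total -= log_g
    return total / n
```

For c = 4, β = 4, θ = 1, the median `wpd_quantile(0.5, 4, 4, 1)` and the mode `wpd_mode(4, 4, 1)` both come out close to 1.2562.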
4. Moments
The moment generating function for the WPD is given by
$$M_X(t) = E(e^{tX}) = \int_\theta^\infty e^{tx} g(x)\,dx = \sum_{i=0}^{\infty} \frac{t^{i}}{i!}\int_\theta^\infty x^{i} g(x)\,dx = \sum_{i=0}^{\infty} \frac{t^{i}}{i!} E(X^{i}). \tag{4.1}$$
If $X$ follows the WPD, then Lemma 3.1 implies that the random variable $Y = \log(X/\theta)$ follows the Weibull distribution with parameters $c$ and $1/\beta$, and hence $M_Y(t) = \sum_{i=0}^{\infty} \frac{t^{i}\,\Gamma(1 + i/c)}{i!\,\beta^{i}}$. Since $X = \theta e^{Y}$, we have $E(X) = \theta E(e^{Y}) = \theta M_Y(1) = \theta \sum_{i=0}^{\infty} \frac{\Gamma(1 + i/c)}{i!\,\beta^{i}}$. In general, the $s$th non-central moment is given by
$$E(X^{s}) = \theta^{s} E(e^{sY}) = \theta^{s} M_Y(s) = \theta^{s} \sum_{i=0}^{\infty} \frac{s^{i}\,\Gamma(1 + i/c)}{i!\,\beta^{i}}. \tag{4.2}$$
The result in (4.2) can be used in (4.1) to rewrite the moment generating function as
$$M_X(t) = \sum_{i=0}^{\infty} \frac{\theta^{i} t^{i}}{i!} \sum_{j=0}^{\infty} \frac{i^{j}\,\Gamma(1 + j/c)}{j!\,\beta^{j}}. \tag{4.3}$$
The $s$th non-central moment in (4.2) may not exist for all values of $\beta$ and $c$. Theorem 4.1 gives the conditions for the existence of the non-central moments.

Theorem 4.1.
(i) If $c > 1$, then the non-central moments of the Weibull-Pareto distribution exist.
(ii) If $c < 1$, then the non-central moments of the Weibull-Pareto distribution do not exist.
(iii) If $c = 1$, then the non-central moments of the Weibull-Pareto distribution exist iff $\beta > s$.
Proof. We need the following inequalities introduced by Keckic and Vasic (1971) for the proof:
$$\frac{\Gamma(b)}{\Gamma(a)} < \frac{b^{\,b-1/2}}{a^{\,a-1/2}}\, e^{-(b-a)}, \quad 0 < a < b, \tag{4.4}$$
$$\frac{\Gamma(b)}{\Gamma(a)} > \frac{b^{\,b-1}}{a^{\,a-1}}\, e^{-(b-a)}, \quad 0 < a < b. \tag{4.5}$$
(i) When $c > 1$: From (4.2), $E(X^{s}) = \theta^{s}\sum_{i=0}^{\infty} a_i$, where $a_i = \frac{s^{i}\,\Gamma(1 + i/c)}{i!\,\beta^{i}}$. By using the series ratio test, we have
$$\lim_{i\to\infty} \frac{a_{i+1}}{a_i} = \frac{s}{\beta} \lim_{i\to\infty} \frac{\Gamma((i+1)/c)}{i\,\Gamma(i/c)}. \tag{4.6}$$
On using the inequality in (4.4) and the fact that $c > 1$, the result in (4.6) becomes
$$\lim_{i\to\infty} \frac{a_{i+1}}{a_i} < \frac{s e^{-1/c}}{\beta} \lim_{i\to\infty} \frac{((i+1)/c)^{1/c}}{i}\,(1 + 1/i)^{i/c}(1 + 1/i)^{-1/2} = \frac{s}{\beta} \lim_{i\to\infty} \frac{((i+1)/c)^{1/c}}{i} = 0.$$
Hence, $E(X^{s})$ exists.
(ii) When $c < 1$: Using the inequality in (4.5) and the fact that $c < 1$, the result in (4.6) becomes
$$\lim_{i\to\infty} \frac{a_{i+1}}{a_i} > \frac{s}{\beta} \lim_{i\to\infty} \frac{((i+1)/c)^{1/c}}{i} = \infty.$$
Hence, $E(X^{s})$ does not exist.
(iii) When $c = 1$: If $c = 1$, then
$$E(X^{s}) = \theta^{s} \sum_{i=0}^{\infty} \frac{s^{i}}{\beta^{i}},$$
which exists if and only if $\beta > s$.
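The next theorem shows that when $\beta > s$ and $c \ge 1$, the non-central moments of WPD$(c, \beta, \theta)$ are bounded above by the non-central moments of the Pareto distribution with parameters $\beta$ and $\theta$.

Theorem 4.2. If $c \ge 1$ and $\beta > s$, then $E(X^{s}) \le \theta^{s}\beta/(\beta - s)$.

Proof. Since $\Gamma(x)$ is an increasing function for $x \ge 2$, for any integer $i$ and $c \ge 1$ we have $\Gamma(1 + i/c) \le \Gamma(1 + i)$. Thus, for large $i$ we have $\Gamma(1 + i/c)/\Gamma(1 + i) \le 1$. Hence,
$$E(X^{s}) = \theta^{s} \sum_{i=0}^{\infty} \frac{s^{i}}{\beta^{i}}\,\frac{\Gamma(1 + i/c)}{\Gamma(1 + i)} \le \theta^{s} \sum_{i=0}^{\infty} \frac{s^{i}}{\beta^{i}} = \frac{\theta^{s}\beta}{\beta - s} \quad \text{whenever } \beta > s.$$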
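The series (4.2) and the bound of Theorem 4.2 can be evaluated numerically. The sketch below sums (4.2) on the log scale with `math.lgamma` to avoid overflow in the gamma function and the factorial; the function names, the stopping tolerance, and the term cap are illustrative assumptions. It refuses parameter values for which Theorem 4.1 says the moment does not exist.

```python
import math


def wpd_moment(s, c, beta, theta, tol=1e-16, max_terms=100000):
    """s-th non-central moment from the series (4.2), for s > 0."""
    if c < 1.0 or (c == 1.0 and s >= beta):
        raise ValueError("by Theorem 4.1 the moment does not exist")
    log_ratio = math.log(s / beta)
    total = 0.0
    for i in range(max_terms):
        # term = s^i * Gamma(1 + i/c) / (i! * beta^i), computed via logs
        term = math.exp(i * log_ratio + math.lgamma(1.0 + i / c) - math.lgamma(i + 1.0))
        total += term
        if term < tol * total:
            break
    return theta ** s * total


def wpd_variance(c, beta, theta):
    mean = wpd_moment(1, c, beta, theta)
    return wpd_moment(2, c, beta, theta) - mean ** 2


def pareto_moment_bound(s, beta, theta):
    """Upper bound theta^s * beta / (beta - s) from Theorem 4.2 (needs beta > s)."""
    if beta <= s:
        raise ValueError("the bound requires beta > s")
    return theta ** s * beta / (beta - s)
```

For c = 4, β = 4, θ = 1 the series gives a mean of about 1.2569 and a variance of about 0.0064, and the second moment stays below `pareto_moment_bound(2, 4, 1)`.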
Table 1
Mode, median, mean, variance, skewness, and kurtosis for some values of c and β with θ = 1 (*: undefined)

 c    β       Mode     Median   Mean     Variance  Skewness  Kurtosis
 1    1.001   1        1.9986   1001     *         *         *
 1    4       1        1.1892   1.3333   0.2222    7.0711    *
 1    7       1        1.1041   1.1667   0.0389    3.3806    22.8570
 1    10      1        1.0718   1.1111   0.0154    2.8111    17.8290
 4    1.001   2.3511   2.4881   2.5535   0.4211    0.5903    3.3692
 4    4       1.2562   1.2562   1.2569   0.0064    0.0789    2.7626
 4    7       1.1405   1.1392   1.1390   0.0017    0.0076    2.7455
 4    10      1.0967   1.0955   1.0952   0.0008    −0.0208   2.7432
 7    1.001   2.5945   2.5806   2.5768   0.1540    −0.0235   2.8034
 7    4       1.2752   1.2678   1.2644   0.0024    −0.3488   3.0252
 7    7       1.1494   1.1452   1.1433   0.0007    −0.3974   3.0886
 7    10      1.1025   1.0995   1.0982   0.0003    −0.4170   3.1165
 10   1.001   2.6571   2.6197   2.6033   0.0830    −0.2907   2.9742
 10   4       1.2798   1.2725   1.2690   0.0013    −0.5465   3.3721
 10   7       1.1516   1.1476   1.1457   0.0003    −0.5851   3.4525
 10   10      1.1039   1.1012   1.0999   0.0002    −0.6008   3.4865
From Theorem 3.2, the mode of the WPD is at $\theta$ when $c \le 1$. Table 1 provides the mode, median, mean, variance, skewness, and kurtosis of the WPD for various values of $\beta$ and $c$ when $\theta = 1$. For fixed $c > 1$ and $\theta$, the mode, median, and mean of the WPD are decreasing functions of $\beta$. Also, for fixed $\beta > 1$ and $\theta$, the mode and the median of the WPD are increasing functions of $c$. When $c > 1$, the variance of the WPD is a decreasing function of $c$ and $\beta$. The WPD tends to be skewed more to the left as $c$ increases.
5. Parameter Estimation
In this section, we address two problems that arise when using the ML method to estimate the WPD parameters. The first problem occurs when $c < 1$. The WPD likelihood function tends to infinity as $\theta$ approaches the sample minimum $x_{(1)}$ and hence, when $c < 1$ and $\theta$ is estimated by $x_{(1)}$, no MLE for $c$ and $\beta$ exists. This problem was studied by Smith (1985), who proposed an alternative approach for estimating the parameters. The second problem occurs when $c \gg 1$. In this situation, the WPD has a long left tail, which makes $x_{(1)}$ a poor estimate for $\theta$ and produces an unusually large bias in the alternative MLE for $c$ and $\beta$. A modification of the regular MLE is proposed to deal with this large bias problem. A simulation study is conducted to evaluate the performance of the alternative and the modified MLE methods.

5.1. Alternative Maximum Likelihood Estimation
Smith (1985) considered probability densities of the form
$$f(x; \theta, \phi) = (x - \theta)^{c-1} q(x - \theta; \phi) \quad \text{for } x > \theta, \tag{5.1}$$
where the parameter $\theta$ and the parameter vector $\phi$ are unknown. The WPD in (2.2) is of the form (5.1) with $\phi = (c, \beta)$. The classical regularity conditions for the maximum likelihood estimates are not satisfied for the WPD because the support of the density depends on $\theta$. Smith (1985) showed that the classical MLE results hold for (5.1) when $c > 2$ and studied in detail the case when $c \le 2$. For any value of $c$, Smith (1985) proposed the following alternative to maximum likelihood. If a random sample $x_1, x_2, \ldots, x_n$ is obtained, estimate the parameter $\theta$ by the sample minimum $x_{(1)}$ and then use the MLE method to estimate $\beta$ and $c$ by excluding the sample minimum. We apply Smith's alternative MLE (AMLE) approach to estimate the parameters of the WPD and conduct a simulation study to evaluate the performance of the AMLE for the WPD. The alternative log-likelihood function for the WPD is given by
$$L^{*}(c, \beta) = \sum_{x_i \ne x_{(1)}} \log g(x_i; x_{(1)}, c, \beta) = \sum_{x_i \ne x_{(1)}} \left[ c\log\beta + \log c - \log x_i + (c - 1)\log\log(x_i/x_{(1)}) - \{\beta\log(x_i/x_{(1)})\}^{c} \right]. \tag{5.2}$$
The derivatives of (5.2) with respect to $\beta$ and $c$ are given by
$$\frac{\partial L^{*}}{\partial\beta} = \sum_{x_i \ne x_{(1)}} \left[ c/\beta - c\beta^{c-1}\{\log(x_i/x_{(1)})\}^{c} \right], \tag{5.3}$$
$$\frac{\partial L^{*}}{\partial c} = \sum_{x_i \ne x_{(1)}} \left[ \frac{1}{c} + \log\beta + \log\log(x_i/x_{(1)}) - \{\beta\log(x_i/x_{(1)})\}^{c}\log\beta - \{\beta\log(x_i/x_{(1)})\}^{c}\log\log(x_i/x_{(1)}) \right]. \tag{5.4}$$
Setting (5.3) and (5.4) to zero and simplifying, we obtain
$$\beta = \left\{ (n - n') \Big/ \sum_{x_i \ne x_{(1)}} \{\log(x_i/x_{(1)})\}^{c} \right\}^{1/c}, \tag{5.5}$$
$$c^{-1} + \frac{1}{n - n'} \sum_{x_i \ne x_{(1)}} \log\log(x_i/x_{(1)}) - \frac{\sum_{x_i \ne x_{(1)}} \{\log(x_i/x_{(1)})\}^{c}\log\log(x_i/x_{(1)})}{\sum_{x_i \ne x_{(1)}} \{\log(x_i/x_{(1)})\}^{c}} = 0, \tag{5.6}$$
where $n'$ is the frequency of $x_{(1)}$. The AMLE $\hat{c}$ of $c$ is the solution of Eq. (5.6). The AMLE $\hat{\beta}$ of $\beta$ can be found by substituting the estimate $\hat{c}$ in Eq. (5.5).

5.2. Simulation Study to Evaluate the Performance of the AMLE
Smith (1985) showed that the AMLE is a consistent estimator. However, it is not clear how the estimator performs in terms of bias for small samples. We conduct a simulation study to evaluate the AMLE of the WPD in terms of both bias and variance for various parameter combinations and different sample sizes. We consider the values 0.5, 1, and 3 for the parameters $\beta$ and $\theta$, and 0.5, 1, 4, and 7 for the parameter $c$. Two different sample sizes, $n = 100$ and $500$, are considered. The simulation is done for a total of 36 parameter combinations. For each parameter combination, we generate a random sample $y_1, y_2, \ldots, y_n$ from the Weibull distribution with parameters $c$ and $1/\beta$. By using the transformation $x_i = \theta\exp(y_i)$ (see Lemma 3.1), we obtain a random sample $x_1, x_2, \ldots, x_n$ that follows WPD$(c, \beta, \theta)$. The initial values for the parameters $\beta$ and $c$ (Johnson et al., 1994, pp. 642–643) are $c_0 = \pi/(\sqrt{6}\, s_{\log y_i})$ and $\beta_0 = \exp(-\bar{x}_{\log y_i} - \xi/c_0)$, where $s_{\log y_i}$ and $\bar{x}_{\log y_i}$ are the sample standard deviation and the sample mean of $\log y_i$ and $\xi$ is the Euler gamma constant. This process is repeated 200 times. The bias (estimate − actual) and the standard deviation are presented in Tables 2 and 3.

The results of the simulation show that the AMLE method does not provide good estimates when $c > 1$. When $c > 1$, the estimates of $c$ and $\beta$ are far from the actual values, because the estimates of $c$ and $\beta$ are very sensitive to the estimate of $\theta$. If $\hat{\theta} = x_{(1)}$ is greater than $\theta$ by a small quantity, it gives a large bias for estimating $c$ using the AMLE. This bias comes from the fact that $\hat{c}$ is the solution of Eq. (5.6). If $\hat{\theta}$ is greater than $\theta$ by a small quantity, then the term $\log\log(x_i/\hat{\theta})$ becomes a large negative number when $x_i$ is close to $\hat{\theta}$, and hence the value of $\hat{c}$ becomes smaller than the actual value. For example, if $\theta = 1$, $\hat{\theta} = 1.3$, and $x_i = 1.3001$, then $\log\log(x_i/\hat{\theta}) = -9.4727$, while the actual value is $\log\log(x_i/\theta) = -1.3377$. From Tables 2 and 3, it appears that $\hat{\theta} = x_{(1)}$ overestimates $\theta$ and that the overestimation becomes worse as $c$ increases. A closer look at the shape of the WPD (see Figs. 1, 2, and Table 1) indicates that the WPD gradually shifts from right-skewed to left-skewed as $c$ increases. The sample minimum $x_{(1)}$ tends to be larger than the lower bound $\theta$ of the distribution when the distribution shape is more left-skewed and the sample size is small. This explains why $x_{(1)}$ in small samples overestimates $\theta$ more seriously as $c$ increases in the WPD.
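The AMLE procedure of Sec. 5.1 and the data-generation step of Sec. 5.2 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: Eq. (5.6) is solved by bisection (its left-hand side is decreasing in c) instead of an iterative method started from the initial values quoted above, the power sums are rescaled by their largest term to avoid overflow, and the function names are hypothetical.

```python
import math
import random


def simulate_wpd(rng, n, c, beta, theta):
    """Draw n values via Lemma 3.1(a): x = theta * exp(y), y ~ Weibull(c, 1/beta)."""
    return [theta * math.exp(rng.weibullvariate(1.0 / beta, c)) for _ in range(n)]


def amle(sample):
    """AMLE sketch: theta_hat = sample minimum, c_hat from (5.6), beta_hat from (5.5)."""
    theta_hat = min(sample)
    # log(x_i / x_(1)) for the observations that differ from the minimum
    logs = [math.log(x / theta_hat) for x in sample if x > theta_hat]
    m = len(logs)  # n - n'
    mean_loglog = sum(math.log(v) for v in logs) / m
    log_max = math.log(max(logs))

    def score(c):
        # Left-hand side of (5.6); weights are v^c divided by the largest v^c.
        w = [math.exp(c * (math.log(v) - log_max)) for v in logs]
        weighted = sum(wi * math.log(v) for wi, v in zip(w, logs)) / sum(w)
        return 1.0 / c + mean_loglog - weighted

    lo, hi = 1e-6, 1.0
    while score(hi) > 0.0:  # expand the bracket until the sign changes
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    c_hat = 0.5 * (lo + hi)

    # (5.5) on the log scale: beta^c = (n - n') / sum(v^c)
    log_sum = c_hat * log_max + math.log(
        sum(math.exp(c_hat * (math.log(v) - log_max)) for v in logs))
    beta_hat = math.exp((math.log(m) - log_sum) / c_hat)
    return c_hat, beta_hat, theta_hat
```

Calling `amle(simulate_wpd(random.Random(2013), 2000, 1.0, 1.0, 1.0))` should return estimates close to (1, 1, 1), with the estimate of θ slightly above 1 because it is the sample minimum. Repeating the call with c = 4 or c = 7 illustrates the downward bias in the estimate of c discussed above.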
In the next sub-section, a modified maximum likelihood estimation (MMLE) is proposed, and the results show better estimates (in terms of the biases) for the parameters $c$, $\beta$, and $\theta$.

5.3. Modified Maximum Likelihood Estimation
As observed and explained in Sec. 5.2, the large bias of the AMLE is a problem, especially when the shape of the WPD is more left-skewed (when the parameter $c$ is large) and the sample size is small. In view of this problem, there is a need to seek a different approach to estimate $\theta$. To address this problem, we consider a modified MLE proposed by Smith (1985, Sec. 3) for density functions of the form (5.1). Smith showed that the parameter estimates exist and are consistent when $c > 1$. We consider the log-likelihood function
$$L_n(c, \beta, \theta) = \sum_{i=1}^{n} \log g(x_i; c, \beta, \theta), \tag{5.7}$$
which is defined only for $\theta < x_{(1)}$. The estimators $\tilde{c}$, $\tilde{\beta}$, and $\tilde{\theta}$ satisfy the following equations:
$$\frac{\partial L_n(\tilde{c}, \tilde{\beta}, \tilde{\theta})}{\partial c} = 0, \quad \frac{\partial L_n(\tilde{c}, \tilde{\beta}, \tilde{\theta})}{\partial\beta} = 0, \quad \text{and} \quad \frac{\partial L_n(\tilde{c}, \tilde{\beta}, \tilde{\theta})}{\partial\theta} = 0. \tag{5.8}$$
In order to use (5.8), we need to show that the derivative with respect to $\theta$ exists whenever $\theta < x_{(1)}$ for the WPD. The derivative of $L_n(c, \beta, \theta)$ with respect to $\theta$ for
Table 2
Bias and standard deviation of the parameter estimates using the AMLE method, n = 100

 Actual values     Bias                          Standard deviation
 c    β    θ       ĉ        β̂        θ̂        ĉ        β̂        θ̂
 0.5  0.5  0.5     0.0153   0.0194   0.0002   0.0396   0.1098   0.0006
 0.5  0.5  1       0.0164   0.0136   0.0003   0.0396   0.1129   0.0009
 0.5  0.5  3       0.0186   −0.0064  0.0012   0.0401   0.1032   0.0027
 0.5  1    0.5     0.0215   0.0123   0.0001   0.0418   0.2241   0.0002
 0.5  1    1       0.0203   0.0091   0.0002   0.0421   0.2002   0.0004
 0.5  1    3       0.0199   −0.0108  0.0007   0.0397   0.2032   0.0014
 0.5  3    0.5     0.0266   −0.0452  0.0000   0.0413   0.6353   0.0001
 0.5  3    1       0.0294   −0.0086  0.0001   0.0409   0.6388   0.0001
 0.5  3    3       0.0274   −0.0246  0.0003   0.0410   0.6741   0.0006
 1    0.5  0.5     0.0170   0.0032   0.0098   0.0871   0.0501   0.0096
 1    0.5  1       0.0276   0.0160   0.0226   0.0835   0.0544   0.0219
 1    0.5  3       0.0105   0.0154   0.0649   0.0815   0.0605   0.0757
 1    1    0.5     0.0101   0.0213   0.0052   0.0805   0.0985   0.0054
 1    1    1       0.0161   0.0227   0.0096   0.0783   0.1075   0.0101
 1    1    3       0.0196   0.0188   0.0324   0.0792   0.1092   0.0322
 1    3    0.5     0.0180   0.0898   0.0019   0.0798   0.3789   0.0020
 1    3    1       0.0220   0.0589   0.0035   0.0734   0.3382   0.0033
 1    3    3       0.0147   0.0399   0.0088   0.0739   0.3530   0.0086
 4    0.5  0.5     −1.2733  0.2339   0.4105   0.4378   0.1020   0.1547
 4    0.5  1       −1.3236  0.2371   0.8291   0.3673   0.0902   0.2752
 4    0.5  3       −1.2307  0.2346   2.4486   0.4472   0.0946   0.8827
 4    1    0.5     −1.2482  0.4718   0.1722   0.4435   0.1840   0.0532
 4    1    1       −1.1904  0.4429   0.3274   0.4121   0.1698   0.1020
 4    1    3       −1.2342  0.4624   1.0173   0.4361   0.2007   0.3418
 4    3    0.5     −1.1801  1.3751   0.0508   0.4370   0.5605   0.0145
 4    3    1       −1.1713  1.3655   0.1017   0.4481   0.5534   0.0305
 4    3    3       −1.1542  1.3481   0.3008   0.4369   0.5488   0.0894
 7    0.5  0.5     −3.5316  0.5030   0.8145   0.6846   0.1708   0.2150
 7    0.5  1       −3.6250  0.5242   1.6773   0.6764   0.1726   0.4195
 7    0.5  3       −3.5337  0.5107   4.9343   0.6644   0.1711   1.2583
 7    1    0.5     −3.5971  1.0701   0.3188   0.6914   0.3616   0.0676
 7    1    1       −3.5810  1.0515   0.6342   0.6608   0.3383   0.1331
 7    1    3       −3.5575  1.0535   1.9036   0.6525   0.3372   0.3831
 7    3    0.5     −3.4027  3.0641   0.0874   0.7356   1.0182   0.0160
 7    3    1       −3.4350  3.1598   0.1775   0.7834   1.0427   0.0329
 7    3    3       −3.4386  3.0456   0.5220   0.7045   1.0407   0.0964
Table 3
Bias and standard deviation of the parameter estimates using the AMLE method, n = 500

 Actual values     Bias                          Standard deviation
 c    β    θ       ĉ        β̂        θ̂        ĉ        β̂        θ̂
 0.5  0.5  0.5     0.0026   0.0037   0.0000   0.0171   0.0511   0.0000
 0.5  0.5  1       0.0076   −0.0083  0.0000   0.0172   0.0451   0.0000
 0.5  0.5  3       0.0058   0.0054   0.0000   0.0170   0.0474   0.0001
 0.5  1    0.5     0.0030   −0.0007  0.0000   0.0153   0.1000   0.0000
 0.5  1    1       0.0077   0.0110   0.0000   0.0179   0.0968   0.0000
 0.5  1    3       0.0052   −0.0020  0.0000   0.0170   0.0922   0.0001
 0.5  3    0.5     0.0056   0.0011   0.0000   0.0185   0.2797   0.0000
 0.5  3    1       0.0065   −0.0080  0.0000   0.0185   0.2892   0.0000
 0.5  3    3       0.0053   0.0239   0.0000   0.0154   0.3073   0.0000
 1    0.5  0.5     0.0032   0.0034   0.0019   0.0323   0.0212   0.0020
 1    0.5  1       0.0017   −0.0008  0.0042   0.0324   0.0222   0.0046
 1    0.5  3       −0.0013  0.0036   0.0145   0.0362   0.0240   0.0137
 1    1    0.5     0.0031   0.0029   0.0010   0.0339   0.0481   0.0009
 1    1    1       0.0041   0.0018   0.0020   0.0346   0.0501   0.0019
 1    1    3       0.0030   0.0062   0.0062   0.0334   0.0512   0.0068
 1    3    0.5     0.0034   0.0002   0.0003   0.0351   0.1361   0.0003
 1    3    1       0.0024   0.0047   0.0006   0.0345   0.1378   0.0005
 1    3    3       0.0046   −0.0002  0.0022   0.0328   0.1401   0.0021
 4    0.5  0.5     −0.8491  0.1241   0.2295   0.2706   0.0437   0.0770
 4    0.5  1       −0.8710  0.1288   0.4704   0.2604   0.0463   0.1627
 4    0.5  3       −0.8605  0.1259   1.3939   0.2742   0.0459   0.4703
 4    1    0.5     −0.9010  0.2687   0.1109   0.2601   0.0890   0.0316
 4    1    1       −0.8693  0.2618   0.2162   0.2703   0.0859   0.0620
 4    1    3       −0.8813  0.2628   0.6488   0.2965   0.0976   0.2077
 4    3    0.5     −0.8547  0.7623   0.0326   0.2763   0.2669   0.0094
 4    3    1       −0.8514  0.7623   0.0661   0.2746   0.2760   0.0190
 4    3    3       −0.8892  0.7939   0.2027   0.2576   0.2548   0.0539
 7    0.5  0.5     −2.9279  0.3349   0.5892   0.5038   0.0921   0.1390
 7    0.5  1       −2.8761  0.3243   1.1449   0.5421   0.0939   0.2798
 7    0.5  3       −2.9323  0.3365   3.5438   0.5535   0.0962   0.8767
 7    1    0.5     −2.9542  0.6894   0.2417   0.5098   0.1834   0.0471
 7    1    1       −2.9473  0.6812   0.4786   0.5106   0.1827   0.0925
 7    1    3       −2.9500  0.6870   1.4438   0.5396   0.1851   0.2882
 7    3    0.5     −2.9133  2.0201   0.0690   0.5207   0.5345   0.0119
 7    3    1       −2.9147  2.0196   0.1382   0.4955   0.5185   0.0228
 7    3    3       −2.8888  2.0153   0.4135   0.5313   0.5474   0.0729
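the WPD is
$$\frac{\partial L_n(c, \beta, \theta)}{\partial\theta} = \frac{c\beta^{c}}{\theta} \sum_{i=1}^{n} \{\log(x_i/\theta)\}^{c-1} + \frac{1 - c}{\theta} \sum_{i=1}^{n} \frac{1}{\log(x_i/\theta)}. \tag{5.9}$$
It is clear that the right-hand side of Eq. (5.9) is continuous when $0 < \theta < x_{(1)}$ and hence $\partial L_n(c, \beta, \theta)/\partial\theta$ exists. Setting Eq. (5.9) to zero, we get
$$\sum_{i=1}^{n} \{\log(x_i/\theta)\}^{c-1} + \frac{1 - c}{c\beta^{c}} \sum_{i=1}^{n} \frac{1}{\log(x_i/\theta)} = 0, \tag{5.10}$$
and setting $\partial L_n(c, \beta, \theta)/\partial\beta = 0$ and $\partial L_n(c, \beta, \theta)/\partial c = 0$, we get
$$\beta = \left\{ n \Big/ \sum_{i=1}^{n} \{\log(x_i/\theta)\}^{c} \right\}^{1/c}, \tag{5.11}$$
$$c^{-1} + \frac{1}{n} \sum_{i=1}^{n} \log\log(x_i/\theta) - \frac{\sum_{i=1}^{n} \{\log(x_i/\theta)\}^{c}\log\log(x_i/\theta)}{\sum_{i=1}^{n} \{\log(x_i/\theta)\}^{c}} = 0. \tag{5.12}$$
The estimators $\tilde{c}$, $\tilde{\beta}$, and $\tilde{\theta}$ from (5.10)–(5.12) will be called MMLE.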
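The score equations (5.9)–(5.12) can be checked against the log-likelihood (5.7) by finite differences. The sketch below is only such a consistency check and not an MMLE solver; the function names and the central-difference helper are illustrative assumptions. It relies on the identities that follow from the derivation: the partial derivative with respect to θ equals (5.9), the partial derivative with respect to β vanishes at the β of (5.11), and at that β the partial derivative with respect to c equals n times the left-hand side of (5.12).

```python
import math


def loglik(sample, c, beta, theta):
    """Log-likelihood (5.7); defined only for theta < min(sample)."""
    total = 0.0
    for x in sample:
        u = beta * math.log(x / theta)
        total += math.log(beta * c) - math.log(x) + (c - 1.0) * math.log(u) - u ** c
    return total


def score_theta(sample, c, beta, theta):
    """Analytic derivative (5.9) of the log-likelihood with respect to theta."""
    logs = [math.log(x / theta) for x in sample]
    s1 = sum(v ** (c - 1.0) for v in logs)
    s2 = sum(1.0 / v for v in logs)
    return (c * beta ** c / theta) * s1 + ((1.0 - c) / theta) * s2


def beta_given(sample, c, theta):
    """Closed-form beta from (5.11) for fixed c and theta."""
    return (len(sample) / sum(math.log(x / theta) ** c for x in sample)) ** (1.0 / c)


def score_c(sample, c, theta):
    """Left-hand side of (5.12) for fixed theta."""
    logs = [math.log(x / theta) for x in sample]
    num = sum(v ** c * math.log(v) for v in logs)
    den = sum(v ** c for v in logs)
    return 1.0 / c + sum(math.log(v) for v in logs) / len(logs) - num / den


def central_diff(f, x, h=1e-6):
    """Two-sided finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)
```

In practice the three equations would be solved jointly over θ < x_(1), for example by profiling out β with (5.11); that step is left out here because the choice of solver is not specified in the text.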
Table 4
Bias and standard deviation of the parameter estimates using the MMLE method, n = 100

 Actual values     Bias                          Standard deviation
 c    β    θ       c̃        β̃        θ̃        c̃        β̃        θ̃
 4    0.5  0.5     −0.1791  0.0505   0.0877   0.9675   0.1153   0.2028
 4    0.5  1       −0.1059  0.0492   0.1692   0.9713   0.1117   0.4075
 4    0.5  3       −0.1432  0.0498   0.5138   0.9646   0.1088   1.1955
 4    1    0.5     0.1311   0.0568   0.0098   1.3075   0.2551   0.1146
 4    1    1       −0.0680  0.0904   0.0538   1.0991   0.2444   0.2117
 4    1    3       −0.1264  0.0948   0.1811   0.9929   0.2179   0.5863
 4    3    0.5     −0.1407  0.2693   0.0078   0.9301   0.6689   0.0320
 4    3    1       −0.1644  0.2781   0.0161   0.9441   0.6790   0.0632
 4    3    3       −0.0039  0.2065   0.0263   1.0276   0.6606   0.1981
 7    0.5  0.5     −1.2633  0.1524   0.2709   1.5586   0.1793   0.3044
 7    0.5  1       −1.4256  0.1660   0.5902   1.4926   0.1735   0.5967
 7    0.5  3       −1.2287  0.1595   1.6752   1.7056   0.2009   1.9508
 7    1    0.5     −0.4250  0.2123   0.0616   2.7144   0.4160   0.1686
 7    1    1       −1.0953  0.3312   0.2193   2.1728   0.4198   0.2942
 7    1    3       −1.1783  0.3351   0.6818   2.3534   0.4085   0.9044
 7    3    0.5     −1.8528  1.2261   0.0436   1.2541   0.9598   0.0303
 7    3    1       −1.4892  1.0231   0.0721   1.4055   1.0194   0.0690
 7    3    3       −0.4569  0.6521   0.0849   2.5257   1.2771   0.3385
Table 5
Bias and standard deviation of the parameter estimates using the MMLE method, n = 500

 Actual values     Bias                          Standard deviation
 c    β    θ       c̃        β̃        θ̃        c̃        β̃        θ̃
 4    0.5  0.5     0.0053   0.0064   0.0119   0.3979   0.0424   0.0797
 4    0.5  1       −0.0016  0.0063   0.0241   0.4676   0.0508   0.1947
 4    0.5  3       −0.0143  0.0050   0.0555   0.4024   0.0434   0.4870
 4    1    0.5     −0.0285  0.0169   0.0052   0.4173   0.0913   0.0428
 4    1    1       −0.0156  0.0141   0.0079   0.4039   0.0933   0.0863
 4    1    3       −0.0612  0.0241   0.0543   0.4187   0.0952   0.2633
 4    3    0.5     −0.0551  0.0783   0.0029   0.4320   0.2732   0.0146
 4    3    1       −0.1084  0.1017   0.0079   0.3691   0.2578   0.0263
 4    3    3       −0.0242  0.0437   0.0046   0.4628   0.3023   0.0942
 7    0.5  0.5     −0.2499  0.0315   0.0597   1.0957   0.0836   0.1594
 7    0.5  1       −0.2099  0.0265   0.1020   1.0401   0.0742   0.2871
 7    0.5  3       −0.4188  0.0431   0.5019   0.9509   0.0767   0.8855
 7    1    0.5     −0.2271  0.0607   0.0214   1.2592   0.1780   0.0822
 7    1    1       −0.0999  0.0543   0.0352   1.5578   0.1845   0.1815
 7    1    3       −0.5093  0.0952   0.2354   0.9675   0.1563   0.4127
 7    3    0.5     −0.7587  0.3871   0.0172   0.7900   0.4244   0.0188
 7    3    1       −0.7230  0.3926   0.0338   0.8624   0.4798   0.0402
 7    3    3       −0.2309  0.1880   0.0383   1.2144   0.5059   0.1564
Applying the same simulation study as in Sec. 5.2, we obtain the results that are summarized in Tables 4 and 5. Since the MMLE is proposed for $c > 1$, and the MMLE exist and are consistent when $c > 1$, the simulation is done only for values of $c > 1$. The results in Tables 4 and 5 show an improvement in estimating the parameter $\theta$ in terms of bias, which in turn provides improved parameter estimates for $c$ and $\beta$ in terms of bias. However, the standard deviations of the MMLE are higher than those of the AMLE. To compare the performance of the AMLE with the MMLE when $c > 1$, we compute the mean square error (MSE) of the simulation results for both methods. In order to conserve space, these MSE values are not reported. The results indicate that the MMLE consistently has smaller MSE than the AMLE when $c > 1$. In practice, one should first obtain a graphical display of the data to be fitted. If the data have a reversed J-shape, we recommend the AMLE method for estimation. Otherwise, we suggest using the MMLE method, since biases are reduced dramatically when compared to the AMLE. Further research is needed to develop better estimation methods and to obtain the asymptotic distribution of the parameter estimators.
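The MSE comparison mentioned above can be illustrated from the reported biases and standard deviations, using the usual decomposition MSE = bias² + variance. The sketch below takes the n = 100 entries for c = 4, β = 0.5, θ = 0.5 from Tables 2 and 4; it assumes the tabulated standard deviation is the simulation standard deviation of the estimator and that the decimal points in those table entries are read as 1.2733, 0.2339, and so on. The names `mse`, `amle_mse`, and `mmle_mse` are hypothetical.

```python
def mse(bias, sd):
    """Mean square error from bias and standard deviation."""
    return bias ** 2 + sd ** 2


# (bias, standard deviation) for c = 4, beta = 0.5, theta = 0.5, n = 100
amle_mse = {
    "c": mse(-1.2733, 0.4378),
    "beta": mse(0.2339, 0.1020),
    "theta": mse(0.4105, 0.1547),
}
mmle_mse = {
    "c": mse(-0.1791, 0.9675),
    "beta": mse(0.0505, 0.1153),
    "theta": mse(0.0877, 0.2028),
}
```

For this parameter combination the MMLE has the smaller MSE for all three parameters, even though its standard deviations for c and θ are larger than those of the AMLE.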
6. Application
In this section, the WPD is applied to model three data sets from Park et al. (1964) and Park (1954). The three data sets represent the grouped frequency distributions of adult numbers for Tribolium confusum and Tribolium castaneum cultured at 24°C and Tribolium confusum strain. Tables 6–8 summarize the results of fitting these data to the generalized Weibull distribution (Mudholkar et al., 1996), exponentiated-Weibull distribution (Mudholkar et al., 1995), Lagrange-gamma distribution (Famoye and Govindarajulu, 1998), and Weibull-Pareto distribution. Famoye and Govindarajulu (1998) used the method of moments to estimate the Lagrange-gamma parameters. Method of maximum likelihood estimation was used

Table 6
Observed and expected frequencies for Tribolium castaneum cultured at 24°C

                      Expected
 x-value   Observed   Generalized  Lagrange-  Exponentiated-  Weibull-
                      Weibull      gamma      Weibull         Pareto
 20–30     2          6.86         3.31       4.45            3.30
 30–40     15         14.19        10.06      12.06           12.09
 40–50     26         24.04        22.36      23.68           25.16
 50–60     30         35.80        38.06      37.86           39.87
 60–70     67         48.37        54.09      52.34           53.81
 70–80     67         60.32        67.54      64.86           65.18
 80–90     65         70.13        76.50      73.74           72.91
 90–100    80         76.49        80.32      78.14           76.59
 100–110   72         78.64        79.35      78.10           76.38
 110–120   70         76.52        74.58      74.30           72.86
 120–130   77         70.78        67.24      67.73           66.85
 130–140   59         62.51        58.53      59.52           59.24
 140–150   47         52.95        49.44      50.63           50.88
 150–160   39         43.25        40.68      41.87           42.45
 160–170   29         34.24        32.71      33.76           34.49
 170–180   25         26.39        25.78      26.61           27.34
 180–190   24         19.90        19.96      20.55           21.18
 190–200   19         14.75        15.21      15.58           16.05
 200–210   19         10.78        11.42      11.61           11.92
 210–220   7          7.80         8.46       8.51            8.69
 220–230   6          5.60         6.20       6.15            6.21
 230–240   4          4.01         4.49       4.39            4.37
 240–250   3          2.86         3.22       3.08            3.02
 250–260   4          2.04         2.35       2.14            2.05
 260–270   1          7.78         5.14       5.33            4.09
 Total     857        857          857        857             857

Parameter estimates
χ², df, p-value
ˆ = 03093 rˆ = 7 ˆ = 16777 c˜ = 69525 ˆ = 12083 ˆ = 00262 ˆ = 870155 ˜ = 91065 ˜ = −02963 ˜ = 024878 ˆ = 27099 ˜ = 03802 27.69 21.02 19.57 17.23 20 20 20 20 0.1169 0.3957 0.4853 0.6380
to estimate the generalized Weibull and the exponentiated-Weibull parameters. The modified maximum likelihood estimation method (Sec. 5.3) is used to estimate the Weibull-Pareto parameters. The results in Table 6 indicate that the Weibull-Pareto distribution provides the best fit among the four distributions, while the exponentiated-Weibull and the Lagrange-gamma also provide adequate fit to the data. In examining the distribution of this data, we notice that the data has a very long right tail.
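The expected frequencies in the Weibull-Pareto column of Table 6 can be reproduced from the cdf (2.3) as n{G(b) − G(a)} for a class interval (a, b]. The sketch below is a check under an explicit assumption: the three Weibull-Pareto estimates printed under Table 6 are read as c̃ = 6.9525, θ̃ = 9.1065, and β̃ = 0.3802, where both the decimal points and the pairing of values with symbols are inferred from the fitted frequencies rather than stated in the text. The names `wpd_cdf` and `expected_count` are illustrative.

```python
import math


def wpd_cdf(x, c, beta, theta):
    """Distribution function (2.3)."""
    if x <= theta:
        return 0.0
    return 1.0 - math.exp(-((beta * math.log(x / theta)) ** c))


def expected_count(lo, hi, n, c, beta, theta):
    """Expected frequency of the class interval (lo, hi] under the fitted WPD."""
    return n * (wpd_cdf(hi, c, beta, theta) - wpd_cdf(lo, c, beta, theta))


# Assumed reading of the Table 6 Weibull-Pareto estimates (see lead-in).
C_TILDE, BETA_TILDE, THETA_TILDE = 6.9525, 0.3802, 9.1065
N_TABLE6 = 857
```

With these values, `expected_count(20, 30, N_TABLE6, C_TILDE, BETA_TILDE, THETA_TILDE)` is about 3.30 and the 90–100 class gives about 76.59, matching the tabulated entries.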
Table 7
Observed and expected frequencies for Tribolium confusum cultured at 24°C

                      Expected
 x-value   Observed   Generalized  Lagrange-  Exponentiated-  Weibull-
                      Weibull      gamma      Weibull         Pareto
 20–30     0          0.47         0.02       0.05            0.00
 30–40     0          2.01         0.41       0.57            0.00
 40–50     3          5.96         2.91       3.33            2.37
 50–60     9          14.15        11.38      11.81           12.82
 60–70     39         28.48        29.57      29.30           32.97
 70–80     53         49.96        57.00      55.45           59.04
 80–90     77         77.04        87.64      84.72           85.11
 90–100    105        104.29       112.87     108.87          105.46
 100–110   135        123.36       126.05     121.41          116.18
 110–120   114        127.42       125.27     120.42          116.08
 120–130   113        115.77       112.93     108.36          106.54
 130–140   92         94.02        93.75      89.90           90.64
 140–150   59         69.79        72.52      69.67           71.97
 150–160   54         48.48        52.77      50.98           53.61
 160–170   38         32.22        36.41      35.52           37.63
 170–180   22         20.86        23.97      23.71           24.99
 180–190   17         13.32        15.13      15.25           15.75
 190–200   6          8.48         9.21       9.48            9.44
 200–210   10         5.42         5.42       5.72            5.40
 210–220   3          3.48         3.10       3.35            2.96
 220–230   2          2.26         1.72       1.91            1.55
 230–240   0          1.49         0.93       1.06            0.78
 240–250   1          0.99         0.50       0.57            0.38
 250–260   0          0.66         0.26       0.30            0.18
 260–270   0          1.61         0.26       0.30            0.14
 Total     952        952          952        952             952

Parameter estimates
χ², df, p-value
ˆ = 01838 rˆ = 14 ˆ = 18776 ˆ = 11886 ˆ = 006502 ˆ = 774141 ˜ = −05831 ˜ = 006582 ˆ = 53097 23.02 17.22 14.26 14 14 14 0.0599 0.2448 0.4303
c˜ = 50587 ˜ = 332082 ˜ = 07473 15.26 14 0.3605
This example suggests that the Weibull-Pareto distribution performs very well in capturing a long right tail.
The results in Table 7 indicate that the exponentiated-Weibull distribution fits the data set best (p-value 0.4303), with the Weibull-Pareto a close second (0.3605), followed by the Lagrange-gamma (0.2448). The generalized Weibull (0.0599) does not provide a good fit compared to the other distributions. The distribution of these data has a very long right tail and a noticeable left tail as well. This example suggests that the Weibull-Pareto distribution does very well in fitting data with unusually long left and right tails.
The results in Table 8 indicate that the Weibull-Pareto distribution fits best among the four distributions (p-value 0.5562), while the exponentiated-Weibull (0.3842) and the Lagrange-gamma (0.2797) also provide an adequate fit to the data. The distribution of these data is approximately symmetric. This example suggests that the Weibull-Pareto distribution is capable of fitting approximately symmetric data very well. Among the four distributions, the Weibull-Pareto distribution provides the best fit to both the right and left tails.
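The comparisons above rest on the p-values attached to each χ² statistic. As a minimal sketch, assuming each p-value is the upper-tail probability of a chi-square distribution with the stated degrees of freedom, the Table 7 values can be checked with the standard library alone. The helper name chi2_sf is ours, not the authors'.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)             # Q(1, y) = exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))  # Q(1/2, y) = erfc(sqrt(y))
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))
        a += 1.0
    return q


# Chi-square statistics and degrees of freedom as reported in Table 7.
table7 = {
    "Generalized Weibull": (23.02, 14),
    "Lagrange-gamma": (17.22, 14),
    "Exponentiated-Weibull": (14.26, 14),
    "Weibull-Pareto": (15.26, 14),
}
p_values = {name: chi2_sf(stat, df) for name, (stat, df) in table7.items()}
for name, p in p_values.items():
    print(f"{name}: p = {p:.4f}")
```

Because the statistics are reported to two decimals, the recomputed p-values may differ from the tabulated ones in the last digit.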
Table 8 Observed and expected frequencies for Tribolium confusum strain
                      Expected
x-value    Observed   Generalized   Lagrange-   Exponentiated-   Weibull-
                      Weibull       gamma       Weibull          Pareto
35–40      5          4.15          2.29        3.06             2.83
40–45      5          8.21          6.33        7.53             7.98
45–50      14         14.74         15.30       15.33            16.38
50–55      33         24.02         28.25       26.30            27.24
55–60      40         35.36         41.83       38.48            38.54
60–65      49         46.41         51.47       48.53            47.57
65–70      44         53.48         54.12       53.16            51.84
70–75      52         53.33         49.73       50.98            50.15
75–80      44         45.63         40.65       43.10            43.14
80–85      28         33.58         30.00       32.35            33.00
85–90      29         21.53         20.24       21.71            22.42
90–95      13         12.30         12.61       13.08            13.51
95–100     9          6.42          7.31        7.11             7.21
100–105    1          3.15          3.98        3.49             3.41
105–110    1          1.48          2.05        1.55             1.42
110–115    1          4.19          1.84        2.24             1.36
Total      368        368           368         368              368
χ²                    14.80         10.94       9.60             7.78
df                    9             9           9                9
p-value               0.0966        0.2797      0.3842           0.5562
Parameter estimates (Generalized Weibull, Lagrange-gamma, Exponentiated-Weibull): ˆ = 01498 rˆ = 16 ˆ = 31559 ˆ = 721687 ˆ = 016260 ˆ = 589881 ˜ = −03299 ˜ = 005351 ˆ = 32394
Parameter estimates (Weibull-Pareto): c̃ = 6.1612, θ̃ = 23.2825, β̃ = 0.8657
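The Weibull-Pareto column of Table 8 can be approximately recomputed from the fitted parameters. The sketch below assumes the WPD cdf has the form F(x) = 1 − exp{−[β log(x/θ)]^c} for x > θ, and reads the Table 8 estimates as c̃ = 6.1612, θ̃ = 23.2825, β̃ = 0.8657; both the cdf form and the pairing of symbols with values are our assumptions here. The function name wpd_cdf is hypothetical.

```python
import math


def wpd_cdf(x, c, beta, theta):
    """Assumed Weibull-Pareto cdf: 1 - exp(-(beta * log(x / theta))**c) for x > theta."""
    if x <= theta:
        return 0.0
    return 1.0 - math.exp(-((beta * math.log(x / theta)) ** c))


# Assumed reading of the Table 8 Weibull-Pareto estimates.
c, theta, beta = 6.1612, 23.2825, 0.8657
n = 368                           # total observed count in Table 8
edges = list(range(35, 120, 5))   # class limits 35, 40, ..., 115

# Expected count in each class: n * [F(upper) - F(lower)].
expected = [
    n * (wpd_cdf(hi, c, beta, theta) - wpd_cdf(lo, c, beta, theta))
    for lo, hi in zip(edges, edges[1:])
]
for lo, hi, e in zip(edges, edges[1:], expected):
    print(f"{lo}-{hi}: {e:.2f}")
```

Interior classes should agree with the table to about two decimals. The end classes may differ if the tabulated values absorb the probability mass outside the listed range.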
In summary, the results from these three examples indicate that the Weibull-Pareto distribution provides adequate fits to all three types of distributions considered (a very long right tail, long left and right tails, and approximate symmetry). The WPD densities displayed in Figs. 1 and 2 likewise show that the distribution is very flexible and can fit a wide range of data sets very well.
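The adequacy of each fit is judged by a Pearson chi-square statistic on grouped frequencies. As a rough sketch for the Weibull-Pareto column of Table 8, assume adjacent classes are pooled from left to right until each pooled class has an expected count of at least 5, and that three estimated parameters are subtracted from the degrees of freedom. The pooling rule and the helper name pool_cells are assumptions of ours; the paper's own rule is not stated in this section.

```python
def pool_cells(observed, expected, min_expected=5.0):
    """Merge adjacent classes left to right until each has expected >= min_expected."""
    pooled = []
    obs_acc = exp_acc = 0.0
    for o, e in zip(observed, expected):
        obs_acc += o
        exp_acc += e
        if exp_acc >= min_expected:
            pooled.append((obs_acc, exp_acc))
            obs_acc = exp_acc = 0.0
    # Any leftover tail that never reached the threshold joins the last pooled class.
    if obs_acc > 0.0 or exp_acc > 0.0:
        if pooled:
            last_o, last_e = pooled.pop()
            pooled.append((last_o + obs_acc, last_e + exp_acc))
        else:
            pooled.append((obs_acc, exp_acc))
    return pooled


# Observed and Weibull-Pareto expected frequencies from Table 8.
observed = [5, 5, 14, 33, 40, 49, 44, 52, 44, 28, 29, 13, 9, 1, 1, 1]
expected_wpd = [2.83, 7.98, 16.38, 27.24, 38.54, 47.57, 51.84, 50.15,
                43.14, 33.00, 22.42, 13.51, 7.21, 3.41, 1.42, 1.36]

cells = pool_cells(observed, expected_wpd)
chi_square = sum((o - e) ** 2 / e for o, e in cells)
df = len(cells) - 1 - 3   # classes - 1 - number of estimated parameters (c, beta, theta)
print(f"classes = {len(cells)}, chi-square = {chi_square:.2f}, df = {df}")
```

Under this pooling rule the class count and degrees of freedom match those reported for Table 8, and the statistic lands close to the tabulated value; small differences are expected because the expected frequencies are rounded to two decimals.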
7. Conclusion
A special case of the Weibull-X family, the Weibull-Pareto distribution (WPD), is defined and studied. Its properties are investigated, including the moments, moment generating function, hazard function, and unimodality. The ordinary maximum likelihood estimation method cannot be used to estimate the WPD parameters when c < 1. Two methods of estimation are proposed, AMLE and MMLE. AMLE appears to give poor parameter estimates for the WPD when c is large. The results of a simulation study in Tables 2–5 show that the MMLE produces better estimates than the AMLE. Three real data sets are fitted to the WPD and compared with other known distributions. The results show that the WPD gives a good fit to each data set and provides the best fit to the right and left tails. The Weibull-Pareto distribution can be a good model for data with a long right tail as well as a long left tail.
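One way to see the difficulty with ordinary maximum likelihood when c < 1 is numerical. The sketch below assumes the WPD density g(x) = (βc/x)[β log(x/θ)]^(c−1) exp{−[β log(x/θ)]^c} for x > θ; the sample values and the function name wpd_loglik are hypothetical and serve only as an illustration.

```python
import math


def wpd_loglik(data, c, beta, theta):
    """Log-likelihood under the assumed WPD density; requires theta < min(data)."""
    total = 0.0
    for x in data:
        z = beta * math.log(x / theta)
        total += math.log(beta * c / x) + (c - 1.0) * math.log(z) - z ** c
    return total


sample = [1.2, 1.5, 2.0, 3.1, 4.7]   # hypothetical observations
c, beta = 0.5, 1.0                    # shape parameter below 1
gaps = [1e-2, 1e-4, 1e-8, 1e-12]      # distance of theta below the sample minimum

# Move theta ever closer to the smallest observation and record the log-likelihood.
values = [wpd_loglik(sample, c, beta, min(sample) - g) for g in gaps]
for g, v in zip(gaps, values):
    print(f"min(x) - theta = {g:.0e}: log-likelihood = {v:.3f}")
```

In this illustration the log-likelihood keeps growing as θ approaches the smallest observation from below, because the term (c − 1) log[β log(x/θ)] diverges for that observation when c < 1. This is consistent with the nonregular setting treated by Smith (1985).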
Acknowledgment
The authors are grateful for the comments and suggestions by the referee and the Editor. Their comments and suggestions greatly improved the article.
References
Alzaatreh, A., Lee, C., Famoye, F. A new method for generating families of continuous distributions. Metron: International Journal of Statistics (in press).
Famoye, F., Govindarajulu, Z. (1998). On the Lagrange gamma distribution. Computat. Statist. Data Anal. 27:421–431.
Famoye, F., Lee, C., Olumolade, O. (2005). The beta-Weibull distribution. J. Statist. Theor. Appl. 4(2):121–136.
Johnson, N. L., Kotz, S., Balakrishnan, N. (1994). Continuous Univariate Distributions. 2nd ed. Vol. 1. New York: John Wiley & Sons.
Keckic, J., Vasic, P. M. (1971). Some inequalities for the gamma function. Publications de l’Institut Mathématique 11:107–114.
Mudholkar, G. S., Kollia, G. D. (1994). Generalized Weibull family: A structural analysis. Commun. Statist. Theor. Meth. 23(4):1149–1171.
Mudholkar, G. S., Srivastava, D. K., Freimer, M. (1995). The exponentiated Weibull family: A reanalysis of the bus-motor-failure data. Technometrics 37(4):436–445.
Mudholkar, G. S., Srivastava, D. K., Kollia, G. D. (1996). A generalization of the Weibull distribution with application to the analysis of survival data. J. Amer. Statist. Assoc. 91(436):1575–1583.
Park, T. (1954). Experimental studies of interspecies competition II. Temperature, humidity, and competition in two species of Tribolium. Physiolog. Zool. 27:177–238.
Park, T., Leslie, P. H., Mertz, D. B. (1964). Genetic strains and competition in population of Tribolium. Physiolog. Zool. 37:97–162.
Rényi, A. (1961). On measures of entropy and information. Proc. Fourth Berkeley Symp. Mathemat. Statist. Probab. I. University of California Press, Berkeley, pp. 547–561.
Sayama, S., Sekine, M. (2004). Suppression of sea-ice clutter observed by a millimeter wave radar using a new log-Weibull/CFAR system. Int. J. Infrared Millimeter Waves 25:1481–1494.
Shannon, C. E. (1948). A mathematical theory of communication. Bell Syst. Tech. J. 27:379–423.
Smith, R. L. (1985). Maximum likelihood estimation in a class of nonregular cases. Biometrika 72(1):67–90.