A new family of continuous distributions with one extra shape parameter: Properties and Applications Halim Zeghdoudi LaPS laboratory, Badji-Mokhtar University, Box 12, Annaba, 23000,ALGERIA
[email protected] August 3, 2016 Abstract This paper proposes a new family of continuous distributions with one extra shape parameter called the generalized Zeghdoudi family of distributions. We investigate the shapes of the density and hazard rate function. We derive explicit expressions for some of its mathematical quantities. Various statistical properties like stochastic ordering, moment method, maximum likelihood estimation, entropies and limiting distribution of extreme order statistics are established. We prove the ‡exibility of the new family by means of applications to several real data sets. Keywords:Lindley distribution,exponential distribution, Gamma distribution, stochastic ordering, maximum-likelihood estimation. AMS Subject Classi…cation : 60E05;62H20
1
Introduction
Statistical models are very useful in describing and predicting real-world phenomena. Numerous extended distributions have been extensively used over the last decades for modelling data in several areas. Recent developments focus on de…ning new families that extend well-known distributions and at the 1
same time provide great ‡exibility inmodelling data in practice. Thus, several classes to generate new distributions by adding one or more parameters have been proposed in the statistical literature. Basic Theory Suppose that X is random variable taking values in ]0; 1), and that the distribution of X depends on an unspeci…ed parameter taking values in ]0; 1). So the distribution of X might be continuous(discrete). In either case, let f denote the probability density function of X on ]0; 1), corresponding to 2 ]0; 1). The distribution of X is a one-parameter polynomia exponential family and the probability density function can be written as f (x) = h ( ) p (x) exp (
x) ;
;x > 0
P where h ( ) is real-valued functions on ]0; 1) , and where p (x) = nk=0 ak xk , is polynomial functions on ]0; 1). Moreover, k is a positif integer and ak is positif real number. We can check imediately: 1) It is non-negative for x > 0; Rb 2) P [a < x < b] = a f (x) dx; R1 1 : 3) 0 f (x) dx = 1 for h ( ) = Pn k! k=0 ak k+1 Now, the probability density function of X is: Pn k x) k=0 ak x exp ( f (x) = ; Pn k! a k=0 k k+1
;x > 0
Examples and Special Cases Many of the special distributions studied in work are general exponential families, at least with respect to some of their parameters. On the other hand, most commonly, a parametric family fails to be a general exponential family because the support set depends on the parameter.
2
General one parameter distribution and some properties
In this section, we give the general one parameter distribution and study its properties. 2
The …rst and second derivatives of f (x) d [(a1 f (x) = dx
a0 ) + ::: + (nan an 1 ) xn Pn k! k=0 ak k+1
1
+ an xn ] exp (
x)
=0
gives x1 ; x2;:::; xn solutions.
We can …nd easily the cumulative distribution function(c.d.f) of the general one parameter distribution: Pn
ak (k + 1; x )
k=0
F (x) = 1
k+1
Pn
k=0
2.1
ak
(2)
k+1
Survival and hazard rate function
Let
Pn
ak (k + 1; x )
k=0
S (x) = 1 and
; x; > 0
k!
F (x) =
k+1
Pn
k=0 ak
; x; > 0
k! k+1
Pn ak xk exp ( x) f (x) = k=0 h (x) = Pn ak (k + 1; x ) 1 F (x) k=0
k+1
be the survival and hazard rate function, respectively. Proposition1. Leth (x) be the hazard rate function of X: Then h (x) is increasing. Proof. According to Glaser (1980) and from the density function (2) Pn 0 k 1 f (x) k=1 kak x P = + (x) = n k f (x) k=0 ak x According to polynomial properties of sum and product, we …nd Pn Pk k k=1 k i=1 ai ak i x 0 0; (x) = P 2 ( nk=0 ak xk ) imply that h (x) is increasing.
3
3
Moments and related measures
The kth moment about the origin of the GZD is : Pn ak k=0 k+i+1 (k + i)! i ; i = 1; 2; ::: E(X ) = Pn k! k=0 ak k+1
Remark 2 The kth moment about the origin of the Lindley distribution
is
i! ( + i + 1) i ( + 1) Corollary 1. Let X GZD( ); the mean and variance for X are : Pn ak k=0 k+2 (k + 1)! E(X) = : Pn k! k=0 ak k+1 E(X i ) =
Theorem 1. Let X GZD( ), me =median(X) and = E(X). Then me < Proof. According to the increasingness of F (x) for all x and , F (me) =
1 2
and F( ) = 1
h( )
n X ak
k + 1; h ( )
k=0
Pn
k=0 k+1
ak k+2
(k + 1)!
Note that 12 < F ( ) < 1. It easy to check that F (me) < F ( ):To this end, we have me < : . The coe¢ cients of variation , skewness and kurtosis of the GZD have been obtained as = skewness =
p
V ar(X) E(X) E(X 3 ) 3
(V ar(X)) 2 E(X 4 ) kurtosis = : (V ar(X))2 4
4
Estimation of parameter
Let Xi GZ( ); i = 1; n be n random variables. The ln-likelihood function, ln l(xi ; ) is: ! n n X X k ln l(xi ; ) = n ln h ( ) + n ln ak x xi : i=1
k=0
The derivatives of ln l(xi ; ) with respect to
is :
n X
nh_ ( ) d ln l(xi ; ) = d h( )
xi :
i=1
From the Zeghdoudi distribution (2), the method of moments (MoM) and the maximum likelihood (ML) estimators of the parameter are the same and it can be obtained by solving the following non-linear equation h_ ( ) h( )
4.1
dh ( ) x = 0; where h_ ( ) = d
(4)
Stochastic orders
De…nition 1. Consider two random variables X and Y . Then X is said smaller than Y in the : a) Stochastic order ( X s Y ), if FX (t) FY (t), 8t: b) Convex order ( X cx Y ), if for all convex functions and provided expectation exist, E[ (X)] E[ (Y )]: c) Hazard rate order ( X hr Y ), if hX (t) hY (t), 8t: (t) is decreasing in t: d) Likelihood ratio order ( X lr Y ), if ffXY (t) Remark 3 Likelihood ratio order)Hazard rate order)Stochastic order. If E[X] = E[Y ]; then Convex order,Stochastic order. Theorem 4. Let Xi GZ( i ); i = 1; 2 be two random variables. If , then X X ; X 1 2 1 lr 2 1 hr X2 ; X1 s X2 and X1 cx X2 : Proof. We have Pn
k=0
fX1 (t) = Pn fX2 (t)
k=0
ak ak 5
k! k+1 2
k! k+1 1
e
(
1
2 )t
:
For simplify we use ln
fX1 (t) fX2 (t)
. Now, we can …nd
d ln dt
fX1 (t) fX2 (t)
= f
(
1
2)
(t)
X1 d 0. This means that X1 To this end, if 1 2 , we have dt ln fX (t) 2 X2 : Also, according to Remark 3 the theorem is proved.
5
lr
Mean Deviations
These are two mean about the median, deR 1 deviation: about the meanRand 1 …ned as M D1 = 0 jx j f (x) dx and M D2 = 0 jx mej f (x) dx respectively, where = E(X) and me =Median(X): The measures M D1 and M D2 can be computed using the following simpli…ed formulas Z xf (x) dx M D1 = 2 F ( ) 2 0
M D2 =
2
Z
me
xf (x) dx
0
6
Extreme order statistics
If X1 ; :::; Xn is a sequence of mutually independent random variables with common continuous distribution function F (x).If X1 ; :::; Xn is a random sample from p (2) and if X is the sample mean then by the usual central limit n X E (X) p ! N (0; 1). Many researchers would be intertheorem n!1 V ar (X) ested in the asymptotics of the extreme values Mn = max(X1 ; :::; Xn ) and mn = min(X1 ; :::; Xn ). For the c.d.f. in (3) and by using L Hopitals rule, it follows that
lim
t!1
1
F (t + x) f (t + x) = lim = lim t!1 t!1 1 F (t) f (t)
and 6
Pn
ak (x + t)k e( (x+t)) k=0P n k ( t) k=0 ak t e
= exp(
x)
F (tx) f (tx) lim = x lim = xlim t!0 F (t) t!0 f (t) t!0
Pn a (xt)k e( k=0 Pn k k ( k=0 ak t e
(xt)) t)
=x
Thus, it follows from Theorem 1.6.2 in Leadbetter et al.(1987) that there must be norming constants an > 0; bn ; cn > 0 and dn such that P rfan (Mn
bn )
and P rfcn (mn
xg ! exp( exp( xg ! 1
dn )
x))
exp ( x)
as n ! 1. The form of the norming constants can also be determined. For instance, using Corollary 1.6.3 in Leadbetter et al.(1987), one can see that an = 1 and bn = F 1 (1 1=n) with F ( ) given by (3).
7
Estimation of the Stress-Strength Parameter
The stress-strength parameter (R) plays an important role in the reliability analysis as it measures the system performance. Moreover, R provides the probability of a system failure, the system fails whenever the applied stress is greater than its strength i.e. R = P (X > Y ). Here X GZ( 1 ) denotes the strength of a system subject to stress Y , and Y GZ( 2 ), X and Y are independent of each other. In our case, the stress-strength parameter R is given by R = P (X > Y ) =
Z
1
SX (y) fY (y) dy
0
=
Pn
k=0
8
ak
k! k+1 1
1 Pn
k=0
ak
k! k+1 2
Z
0
1
n n X ak (k + 1; y 1 ) X k=0
k+1 1
ak y k exp (
k=0
Lorenz curve
The Lorenz curve is often used to characterize income and wealth distributions. The Lorenz curve for a positive random variable X is de…ned as the 7
2 y) dy
graph of the ratio L(F (x)) =
E(XjX x)F (x) E(X)
against F (x) with the properties L(p) p; L(0) = 0 and L(1) = 1. If X represents annual income, L(p) is the proportion of total income that accrues to individuals having the 100 p% lowest incomes. If all individuals earn the same income then L(p) = p for all p. The area between the line L(p) = p and the Lorenz curve may be regarded as a measure of inequality of income, or more generally, of the variability of X. For the exponential distribution, it is well known that the Lorenz curve is given by L(p) = pfp + (1
p)log(1
p)g:
For the GZ distribution in (3),
E(XjX
9
x)F (x) =
Pn
ak
k=0
Pn
k+2
k=0
0
(k + 1)! B @ k!
ak
k+1
Pn
ak
k=0
Pn
k+2
k=0
1P
(k + 1)! C A k!
ak
k+1
n k=0
ak (k + 1; x ) k+1
Pn
k=0
ak
Entropies
It is well known that entropy and information can be considered as measures of uncertainty of probability distribution. However, there are many relationships established on the basis of the properties of entropy. An entropy of a random variable X is a measure of variation of the uncertainty. Rényi entropy is de…ned by Z 1 log f (x) dx J( )= 1 where
> 0 and
= 1. For the GZ distribution in (1), note that
8
k! k+1
Z
f (x) dx =
Pn
k=0
=
Z
1 ak
k! k+1
k=0
ak
k!
R 1 where, xk exp ( x) dx = ( )k tion the ak and : Now, the Rényi entropy as given by 2 J( )=
if
10 10.1
1
1
bk ( )
k=0
k+1
Pn
Pn
k=0
! 1 ,we …nd Shannon entropy.
Z
!
exp (
x) dx
xk exp (
x) dx
(k + 1; x
(k )! ( )k
6 log 6 4
ak x k
k=0
n X
1
Pn
n X
k=0 bk
ak
) and bk ( ) in func-
3
( )7 7 5 k!
k+1
Simulation and Goodness of Fit Simulation study
We can see that the equation F (x) = u, where u is an observation from the uniform distribution on (0; 1), can not be solved explicitly in x (can not use lambert W function in the case k 2), the inversion method for generating random data from the GZ distribution fails. However, we can use the fact that the GZ distribution is a mixture of gamma (k; ) and gamma (k + 1; ) distributions: fN ZD (x; ) = p ( ) gamma(k; ) + (1
p ( )) gamma(k + 1; ); 0 < p ( ) < 1
In this subsection, we investigate the behavior of the ML estimators for a …nite sample size (n). A simulation study consisting of following steps is being carried out N = 10000 times for selected values of ( ; n), where = 0:01; 0:1; 0:5; 1; 3; 10 and n = 10; 30; 50. 9
-
Generate Ui Generate Yi Generate Zi If Ui p ( ),
U nif orm(0; 1), i = 1; :::n. Gamma(k; ), i = 1; :::n: Gamma(k + 1; ), i = 1; :::n: then set Xi = Yi , otherwise, set Xi = Yi , i = 1; :::n; average bais ( ) =
and the average square error. M SE ( ) =
10.2
N 1P ^i N i=1
N 1P ^i N i=1 2
; i = 1; :::N:
Applications
Example 1 Tables 3 and 4 represents the Data of survival times (in months) of ( 94,91) guinea individus infected with Ebola virus, which we compare Zeghdoudi distribution and Lindley distribution. Table 3 Survival time m = 3:17; s = 2:095 Obs freq LD ^ = 0:522 ZD ^ = 0:852 [0; 2[ 32 38.217 33.252 [2; 4[ 35 28.16 32.366 [4; 6[ 17 15.089 17.799 [6; 8[ 7 7.33 7. 133 [8; 10] 3 3.152 2.418 Total 94 94 94 2 2.9244 0:40020 Table 4 Survival time m = 3 Obs freq LD ^ = 0:54858 ZD ^ = 0:896 [0; 2[ 35 39.100 34.56 [2; 4[ 32 27.390 31.497 [4; 6[ 16 13.92 16.206 [6; 8[ 6 6.247 5 6.069 7 [8; 10] 2 2.618 9 1.921 8 Total 91 91 91 2 1.616 5 0.01995
10
Example 2 Now, we compare one parameter (Aradhana, Akash, Shanker) distributions see Shanker (2015a, 2015b, 2016) with Zeghdoudi distribution (see tables 5 and 6). Table 5 Data Distribution log-likelihood Kolmogrov-Smirnov data 1 ZD 1.365 -52.4 0.292 n = 20 Aradhana 1.123 -56.4 0.302 m = 1:9 Akash 1.157 -59.5 0.320 s = 0:704 Shanker 0.804 -59.7 0.315 data 2 ZD 0.095 -239.7 0.251 n = 25 Aradhana 0.094 -242.2 0.274 m = 30:811 Akash 0.097 -240.7 0.266 s = 7:253 Shanker 0.063 -252.3 0.326 Table 6 Data Distribution log-likelihood Kolmogrov-Smirnov data 3 ZD — 0.107 -58.67 0.081 n = 15 Gamma 1.442 0.052 -64.197 0.102 m = 27:546 Weibull 1.306 0.034 -64.026 0.450 s = 20:059 Lognormal 1.061 2.931 -65.626 0.163 data 4 ZD — 0.0167 -142.32 0.108 n = 25 Gamma 1.794 0.010 -152.371 0.135 m = 178:32 Weibull 1.414 0.005 -152.440 0.697 s = 131:097 Lognormal 0.891 4.880 -154.092 0.155
11
Conclusion
In this work propose a one parameter family GZD. Several properties have been discussed : moments, cumulants, characteristic function, failure rate function, stochastic ordering, the maximum likelihood method and the method of moments. The LD does not provide enough ‡exibility for analysing and modelling di¤erent types of lifetime data and survival analysis. But GZD is ‡exible , simple and easy to handle. Two real data sets are analyzed using the new distribution and it is compared with …ve immediate sub-models mentioned above in addition to another distributions (Lindley, Exponential, Gamma, Weibull, Lognormal distributions). The results of the comparisons con…rm the goodness of …t of GZ distribution. We hope our new distribu-
11
tion family might attract wider sets of applications in lifetime data reliability analysis and actuarial sciences. For future studies, we can explain the derivation of posterior distributions for the GZ distribution under Linex loss functions and squared error using non-informative and informative priors(the extension of Je¤reys and Inverted Gamma priors) respectively.
References [1] A. Asgharzadeh , Hassan S. Bakouch, L. Esmaeili , Pareto Poisson– Lindley distribution and its application, Journal of Applied Statistics, pp. 1–18 (2013). [2] Fuller, E.J., Frieman, S., Quinn, J., Quinn, G., and Carter, W. Fracture mechanics approach to the design of glass aircraft windows: A case study, SPIE Proc 2286, 419-430 (1994). [3] M.E. Ghitany, D.K. Al-Mutairi, and S. Nadarajah, Zero-truncated Poisson–Lindley distribution and its application, Math. Comput. Simulation 79, pp. 279–287 (2008a). [4] M. E. Ghitany, B. Atieh, and S. Nadarajah, Lindley distribution and its applications. Mathematics and Computers in Simulation, 78, 493-506 (2008b). [5] Gross, A.J. and Clark, V.A. Survival Distributions: Reliability Applications in the Biometrical Sciences, John Wiley, New York (1975). [6] J. F. Lawless, Statistical models and methods for lifetime data. Wiley, New York, (2003). [7] M.R. Leadbetter, G. Lindgren, H. Rootzén (1987). Extremes and Related Properties of Random Sequences and Processes, Springer Verlag, New York. [8] D. V. Lindley, Fiducial distributions and Bayes’theorem. Journal of the Royal Society, series B, 20, 102-107 (1958). [9] M. Sankaran, The discrete Poisson-Lindley distribution. Biometrics, 26, 145-149 (1970).
12
[10] R.Shanker, S.Sharma, R.Shanker. A two-parameter Lindley distribution for modeling waiting and survival times data. Applied Mathematics, vol. 4, pp. 363-368 (2013). [11] Shanker, R. Akash distribution and Its Applications, International Journal of Probability and Statistics, 4(3), 65-75 (2015 a). [12] Shanker, R. Shanker distribution and Its Applications, International Journal of Statistics and Applications, 5(6), 338 –348 (2015 b). [13] Shanker, R. Aradhana distribution and Its Applications, International Journal of Statistics and Applications, 6(1): 23-34 (2016). [14] H. Zakerzadah, A. Dolati. Generalized Lindley distribution. J. Math. Ext. 3(2):13-25 (2010). [15] H. Zeghdoudi, S. Nedjar. Gamma Lindley distribution and its application. Journal of Applied Probability and Statistics Vol. 11, N 1 129-138 (2016a). [16] H. Zeghdoudi, S. Nedjar. On gamma Lindley distribution : Properties and Simulations. Journal of Computational and Applied Mathematics, Vol 298, pp 167-174 (2016b). [17] H. Zeghdoudi, S. Nedjar. A pseudo Lindley distribution and its application. Afr. Stat. Vol. 11, N 1, 923-932. (2016c) [18] H. Zeghdoudi, H. Messadia. A Zeghdoudi Distribution and its Applications. submited (2016d).
13