A new class of generalized logistic distribution

Indranil Ghosh and Ayman Alzaatreh
Austin Peay State University, Clarksville, Tennessee 37044

Abstract. The logistic distribution and the S-shaped pattern of its cumulative distribution and quantile functions have been used extensively in many different spheres affecting human life. By far the best-known application of the logistic distribution is logistic regression, which is used for modeling categorical response variables. The exponentiated-exponential logistic distribution, a generalization of the logistic distribution, is obtained by mixing two distributions using the technique of Alzaatreh et al. (2013); it is hereafter called the EEL distribution. This distribution subsumes various types of logistic distribution. The structural analysis of the distribution in this paper includes the limiting behavior, quantiles, moments, mode, skewness, kurtosis, order statistics, the large-sample distributions of the sample maximum and the sample minimum, and the distribution of the sample median. For illustrative purposes, a real-life data set is considered as an application of the EEL distribution.

Keywords and Phrases: Logistic distribution, exponentiated-exponential logistic distribution, T-X families of distributions, Shannon's entropy, reliability parameter, distribution of the sample extrema, sample median.

1 Introduction

The logistic distribution function, pervasive in demographic studies (Verhulst, 1845), is nowadays widely used for continuous and discrete data as, for example, in bioassay and discriminant analysis (Cox, 1970). Numerous applications of the logistic curve have been found over the years. Authors such as Pearl et al. (1920, 1924, 1940) and Schultz (1930) applied the logistic model as a growth model in human populations as well as in some biological organisms. Schultz (1930) and Oliver (1964) used the logistic function to model data related to agricultural population. A few more interesting uses of the logistic function are in the analysis of survival data (Plackett, 1959). The shape of the logistic distribution is quite similar to that of the normal density function; however, in the tails the logistic density is heavier than the normal density. For economists concerned with the upper tails of the income distribution, the logistic, like the Pareto and Champernowne distributions, is probably more useful than the log-normal, which generally gives a poor fit in the tails (Aitchison and Brown, 1957). Champernowne (1937) introduced the name income power for the logarithm of income. He discarded a simple Pareto model in order to fit incomes throughout the entire range rather than concentrating on the upper tail. The family of densities for income power proposed by Champernowne may be written in the form f(x) ∝ [cosh(Bx − A) + C]^{−D}. Written in terms of the upper tail of the distribution of income (not of income power) it takes the form F̄(x) = (1 + (x/x₀)^α)^{−1}. Fisk (1961a, b) focused on the family which he called the sech² distribution, although it might well have been named log-logistic. Fisk also noted that if we find log income to be approximately logistic, then perhaps we might try to fit other functions of income by the logistic distribution. Fisk's sech² distribution is endowed with yet another name, Pareto (III). Adoption of

the sech² model is, as observed above, a tacit assumption that log income (or income power) has a logistic distribution. It is noteworthy that some work has been done in recent years on generalizations of the logistic distribution. Gupta and Kundu (2010) discussed various properties of two generalizations of the logistic distribution, namely the skew logistic distribution and a second one which they termed the proportional reversed hazard family with the logistic distribution as the baseline. The second one is alternatively known as the type I generalized logistic distribution. The skew logistic distribution was first proposed by Wahed and Ali (2001). Nadarajah (2009) extended this skew logistic distribution by introducing a scale parameter and studied its distributional properties. Chakraborty et al. (2012) proposed a new skew logistic distribution by considering a new skew function, where the skew function is not a c.d.f. In this paper we pursue a richer family arising from a mixture of the exponentiated-exponential and logistic distributions, with applications to real-life scenarios.

Let F(x) be the cumulative distribution function (c.d.f.) of any random variable X and r(t) be the probability density function (p.d.f.) of a random variable T defined on [a, b], for −∞ ≤ a < b ≤ ∞. The c.d.f. of the generalized family of distributions defined by Alzaatreh et al. (2013) is given by

    G(x) = ∫_a^{W(F(x))} r(t) dt,    (1)

where W(F(x)) satisfies the following conditions:

(i) W(F(x)) ∈ [a, b];
(ii) W(F(x)) is differentiable and monotonically non-decreasing;
(iii) W(F(x)) → a as x → −∞ and W(F(x)) → b as x → ∞.

The family of distributions defined in (1) is called the "Transformed-Transformer" family (or T-X family) in Alzaatreh et al. (2013). When W(F(x)) = −log(1 − F(x)), the random variable T ∈ [0, ∞) and X is any continuous random variable, the probability density function of the T-X family is

    g(x) = [f(x)/(1 − F(x))] r(−log(1 − F(x))).    (2)

If a random variable T follows the exponentiated-exponential distribution (Gupta and Kundu, 2001), r(t) = αλ(1 − e^{−λt})^{α−1} e^{−λt}, t ≥ 0, then from definition (2) we get

    g(x) = αλ f(x) {1 − (1 − F(x))^λ}^{α−1} (1 − F(x))^{λ−1},    (3)

which leads to the exponentiated-exponential-X family with the p.d.f. in (3). If X follows a logistic distribution with parameter θ and c.d.f. F(x) = 1 − (1 + e^{x/θ})^{−1}, x ∈ ℝ, then (3) reduces to

    g(x) = [αλ e^{x/θ} / (θ(1 + e^{x/θ})^{λ+1})] {1 − (1 + e^{x/θ})^{−λ}}^{α−1},  x ∈ ℝ.    (4)

Note that when α = λ = 1, the p.d.f. in (4) reduces to the logistic distribution. When λ = 1, the p.d.f. in (4) reduces to the type I logistic distribution, and when α = 1, the p.d.f. in (4) reduces to the type II logistic distribution. From (4), the c.d.f. of the exponentiated-exponential logistic distribution can be written as

    G(x) = (1 − (1 + e^{x/θ})^{−λ})^α,  x ∈ ℝ.    (5)

A random variable X with the p.d.f. g(x) in (4) is said to follow the exponentiated-exponential logistic distribution with parameters α, λ, and θ, and will be denoted by EEL(α, λ, θ). The remainder of this paper is organized in the following way. In Section 2 we study various properties of the EEL(α, λ, θ), including the limiting behavior, transformations and the mode. In Section 3 the moment generating function, the moments and the mean deviations from the mean and the median are studied. Section 4 deals with the maximum likelihood estimation of the EEL(α, λ, θ). In Section 5 we study the reliability parameter in the context of two independent EEL(α, λ, θ) random variables with different choices of the parameters α and λ but a fixed choice of θ. Section 6 deals with various types of order statistics, the limiting distributions of the sample minimum and the sample maximum, and the distribution of the sample median for a random sample of size n drawn from the EEL(α, λ, θ). A real-life data set is used to illustrate the application of the EEL(α, λ, θ) in Section 7.
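For readers who wish to experiment with the family, the density (4) and c.d.f. (5) translate directly into code. The following sketch is our own minimal implementation (all function and variable names are ours, not the paper's); it also checks numerically that the density integrates to one.

```python
import math

def eel_pdf(x, alpha, lam, theta):
    """EEL density, eq. (4): αλ e^{x/θ} [1 - (1+e^{x/θ})^{-λ}]^{α-1} / (θ(1+e^{x/θ})^{λ+1})."""
    s = 1.0 + math.exp(x / theta)
    return (alpha * lam * math.exp(x / theta)
            * (1.0 - s ** (-lam)) ** (alpha - 1.0) / (theta * s ** (lam + 1.0)))

def eel_cdf(x, alpha, lam, theta):
    """EEL distribution function, eq. (5)."""
    s = 1.0 + math.exp(x / theta)
    return (1.0 - s ** (-lam)) ** alpha

# Sanity check with arbitrary parameter values: the density should integrate to 1.
a, l, t = 2.0, 1.5, 1.0
lo, hi, n = -40.0, 40.0, 40001
h = (hi - lo) / (n - 1)
total = sum(eel_pdf(lo + i * h, a, l, t) for i in range(n)) * h
print(round(total, 6))  # → 1.0
```

At α = λ = 1 the c.d.f. collapses to the standard logistic, so for instance eel_cdf(0, 1, 1, 1) equals 1/2.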

2 Properties of the exponentiated-exponential logistic distribution

We provide below some characterizations of the EEL(α, λ, θ) distribution which establish the relations between the EEL(α, λ, θ), exponential, power and exponentiated-exponential distributions.

Lemma 1 (Transformation):
(i) If a random variable X follows an exponentiated-exponential distribution with parameters α and λ, then Y = θ log(e^X − 1) follows the EEL(α, λ, θ).
(ii) If a random variable X follows a power distribution with parameter α, then Y = θ log((1 − X)^{−1/λ} − 1) follows the EEL(α, λ, θ).
(iii) If a random variable X follows an exponential distribution with mean 1/α, then Y = θ log((1 − e^{−X})^{−1/λ} − 1) follows the EEL(α, λ, θ).

Proof. The results follow immediately by using the transformation technique. ⇤

The hazard function associated with the exponentiated-exponential logistic distribution is

    h_g(x) = g(x)/(1 − G(x)) = αλ e^{x/θ} (1 − (1 + e^{x/θ})^{−λ})^{α−1} / [θ(1 + e^{x/θ})^{λ+1} {1 − (1 − (1 + e^{x/θ})^{−λ})^α}],  x ∈ ℝ.    (6)
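Equation (6) is easy to evaluate numerically. The sketch below (our own code, with arbitrary parameter values) computes h_g and illustrates its tail behavior: deep in the right tail the hazard flattens out near λ/θ, while in the left tail it vanishes.

```python
import math

def eel_hazard(x, alpha, lam, theta):
    """Hazard function of the EEL distribution, eq. (6): g(x) / (1 - G(x))."""
    s = 1.0 + math.exp(x / theta)
    g = (alpha * lam * math.exp(x / theta)
         * (1.0 - s ** (-lam)) ** (alpha - 1.0) / (theta * s ** (lam + 1.0)))
    G = (1.0 - s ** (-lam)) ** alpha
    return g / (1.0 - G)

alpha, lam, theta = 2.0, 3.0, 1.5
# In the right tail the hazard approaches λ/θ = 2.0 (see Theorem 1 below);
# x is kept moderate so that 1 - G(x) does not underflow to 0.
print(eel_hazard(14.0, alpha, lam, theta))
print(eel_hazard(-30.0, alpha, lam, theta))
```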

The limiting behaviors of the EEL(α, λ, θ) p.d.f. and hazard function are given in the following theorem.

Theorem 1. The limit of the exponentiated-exponential logistic density function as x → ±∞ is 0. Also, the limit of the hazard function as x → −∞ is 0 and the limit as x → ∞ is λ/θ.

Proof. Trivial and hence omitted.

Theorem 2. The mode of the EEL(α, λ, θ) is the solution of the equation k(x) = 0, where

    k(x) = (1 − λ e^{x/θ})(1 + e^{x/θ})^λ + αλ e^{x/θ} − 1.

Proof. The derivative of g(x) in (4) can be written as

    g′(x) = (αλ/θ²) e^{x/θ} (1 + e^{x/θ})^{−2λ−2} {1 − (1 + e^{x/θ})^{−λ}}^{α−2} k(x).    (7)

The critical values of g(x) are the solutions of k(x) = 0. Hence the proof. ⇤

Corollary. When λ = 1, the solution of equation (7) is x = θ log α. When α = 1, the solution of equation (7) is x = −θ log λ.

Lemma 1. Let Q(p), 0 < p < 1, denote the quantile function of the EEL(α, λ, θ). Then Q(p) is given by

    Q(p) = θ log{(1 − p^{1/α})^{−1/λ} − 1}.    (8)

Proof. The result follows immediately by setting G(Q(p)) = p in (5) and solving for Q(p). ⇤
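Because (8) is in closed form, inverse-transform sampling from the EEL(α, λ, θ) is immediate. The sketch below (our own code, with arbitrary parameter values) round-trips G(Q(p)) = p and draws a small sample.

```python
import math
import random

def eel_quantile(p, alpha, lam, theta):
    """Quantile function, eq. (8): Q(p) = θ log{(1 - p^{1/α})^{-1/λ} - 1}."""
    return theta * math.log((1.0 - p ** (1.0 / alpha)) ** (-1.0 / lam) - 1.0)

def eel_cdf(x, alpha, lam, theta):
    """EEL distribution function, eq. (5)."""
    return (1.0 - (1.0 + math.exp(x / theta)) ** (-lam)) ** alpha

alpha, lam, theta = 2.5, 0.8, 1.2

# Round-trip check: G(Q(p)) should reproduce p.
errs = [abs(eel_cdf(eel_quantile(p, alpha, lam, theta), alpha, lam, theta) - p)
        for p in (0.05, 0.25, 0.5, 0.75, 0.95)]
print(max(errs) < 1e-10)  # → True

# Inverse-transform sampling: if U ~ Uniform(0,1) then Q(U) ~ EEL(α, λ, θ).
rng = random.Random(42)
sample = [eel_quantile(rng.random(), alpha, lam, theta) for _ in range(5)]
print(len(sample))  # → 5
```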

In Figures 1 and 2, various graphs of g(x) and h_g(x) are provided for different parameter values.

Figure 1: Graphs of the EEL p.d.f. when θ = 1 and various choices of α and λ.

The Shannon entropy (Shannon, 1948) plays an important role in information theory, where it is used as a measure of uncertainty. The Shannon entropy of a random variable X with p.d.f. g(x) is defined as E[−log(g(X))]. According to Alzaatreh et al. (2013), the Shannon entropy for the beta-exponential-X family is given by

    η_X = −E[log(f(F^{−1}(1 − e^{−T})))] + log(λ^{−1} B(α, β)) + (α + β − 1)ψ(α + β) − (α − 1)ψ(α) − βψ(β) − [ψ(α + β) − ψ(β)]/λ,    (9)


Figure 2: Graphs of the EEL hazard function when θ = 1 and various choices of α and λ.

where F(·) and f(·) are the c.d.f. and p.d.f. of the transformer random variable respectively, ψ(·) is the digamma function, and T follows the beta-exponential(α, β, λ) distribution.

Lemma 2. The Shannon entropy for the exponentiated-exponential-X family can be written as

    η_X = −E[log(f(F^{−1}(1 − e^{−T})))] − log(αλ) + (ψ(α) − ψ(1))(1 − 1/λ) + 1 − 1/(αλ).    (10)

Proof: Since the beta-exponential distribution (Nadarajah and Kotz, 2005) reduces to the exponentiated-exponential distribution when β = 1, the result in (10) follows immediately by substituting β = 1 in (9) and using the fact that ψ(α + 1) = ψ(α) + 1/α. ⇤

The next theorem gives an expression for the Shannon entropy of the EEL(α, λ, θ) distribution.

Theorem 3. The Shannon entropy of a random variable X which follows the EEL(α, λ, θ) is given by

    η_X = α Σ_{k=1}^{∞} k^{−1} { B(α, 1 + k/λ) + 2 Σ_{i=0}^{k} C(k, i) (−1)^{i+k+1} B(α, 1 + (i + k)/λ) } − log(αλ) + ψ(α) − ψ(1) + 1,    (11)

where C(k, i) denotes the binomial coefficient.

Proof. In our case we have F(x) = 1 − (1 + e^{x/θ})^{−1}, which implies

    E[log(f(F^{−1}(1 − e^{−T})))] = E[log(1 − e^{−T})] − E(T) − 2 E[log(1 + (1 − e^{−T}) e^{−T})],    (12)

where T ∼ exponentiated-exponential(α, λ). First, consider E[log(1 − e^{−T})]. Using the Taylor series expansion of log(1 − e^{−T}), one can get

    E[log(1 − e^{−T})] = −α Σ_{k=1}^{∞} k^{−1} B(α, 1 + k/λ),    (13)

and

    E[log(1 + (1 − e^{−T}) e^{−T})] = α Σ_{k=1}^{∞} Σ_{i=0}^{k} C(k, i) (−1)^{i+k+1} k^{−1} B(α, 1 + (i + k)/λ).    (14)

From Gupta and Kundu (2001), we have

    E(T) = {ψ(α + 1) − ψ(1)}/λ.    (15)

Using (13), (14) and (15), equation (12) reduces to

    E[log(f(F^{−1}(1 − e^{−T})))] = −α Σ_{k=1}^{∞} k^{−1} { B(α, 1 + k/λ) + 2 Σ_{i=0}^{k} C(k, i) (−1)^{i+k+1} B(α, 1 + (i + k)/λ) } − (ψ(α) − ψ(1))/λ − 1/(αλ).    (16)

The result in (11) follows by substituting (16) in (10). ⇤
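The computations in (13) and (14) both reduce to the moment E(e^{−kT}) = α B(α, 1 + k/λ) for T ∼ exponentiated-exponential(α, λ), obtained from the substitution u = 1 − e^{−λt}. The sketch below (our own check, not from the paper) verifies this building block against direct numerical integration.

```python
import math

def beta_fn(a, b):
    """Beta function via log-gamma to avoid overflow."""
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def ee_pdf(t, alpha, lam):
    """Exponentiated-exponential density r(t) = αλ(1 - e^{-λt})^{α-1} e^{-λt}."""
    return alpha * lam * (1.0 - math.exp(-lam * t)) ** (alpha - 1.0) * math.exp(-lam * t)

def e_exp_neg_kt(k, alpha, lam, n=100000, hi=60.0):
    """E(e^{-kT}) by a simple trapezoidal rule on [0, hi]."""
    h = hi / n
    return sum(math.exp(-k * (i * h)) * ee_pdf(i * h, alpha, lam) for i in range(1, n)) * h

alpha, lam = 2.0, 1.5
for k in (1, 2, 3):
    numeric = e_exp_neg_kt(k, alpha, lam)
    closed = alpha * beta_fn(alpha, 1.0 + k / lam)
    print(abs(numeric - closed) < 1e-4)  # → True for each k
```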

3 Moments and mean deviations

The moment generating function of the EEL(α, λ, θ) is given by

    M_X(t) = E(e^{tX}) = (αλ/θ) ∫_{−∞}^{∞} e^{x(t+1/θ)} (1 + e^{x/θ})^{−λ−1} {1 − (1 + e^{x/θ})^{−λ}}^{α−1} dx.    (17)

On using the generalized binomial expansion, (17) can be written as

    M_X(t) = (αλ/θ) Σ_{k=0}^{∞} (−1)^k [(α − 1)^{(k)} / k!] ∫_{−∞}^{∞} e^{x(t+1/θ)} (1 + e^{x/θ})^{−λk−λ−1} dx,    (18)

where (α − 1)^{(k)} = (α − 1)(α − 2) · · · (α − k). By using the substitution u = (1 + e^{x/θ})^{−1}, (18) reduces to

    M_X(t) = αλ Σ_{k=0}^{∞} (−1)^k [(α − 1)^{(k)} / k!] B(λk + λ − tθ, 1 + tθ),    (19)

provided that t < λ/θ. If α − 1 is a positive integer, then the summation in (19) stops at α − 1.

The first two moments of the EEL(α, λ, θ) can be obtained by differentiating (19) successively and then setting t = 0, which gives

    µ = αθ Σ_{k=0}^{∞} (−1)^k [(α − 1)^{(k)} / (k + 1)!] (ψ(1) − ψ(λk + λ)),    (20)

    E(X²) = αθ² Σ_{k=0}^{∞} (−1)^k [(α − 1)^{(k)} / (k + 1)!] { (ψ(1) − ψ(λk + λ))² + ψ′(1) + ψ′(λk + λ) }.    (21)
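When α − 1 is a positive integer the series (20) terminates, which makes it easy to sanity-check. The sketch below is our own code: the integer-digamma helper assumes λk + λ is a positive integer (which holds for the test case λ = 1), and the terminating series is compared against a direct numerical evaluation of the mean.

```python
import math

def eel_pdf(x, alpha, lam, theta):
    """EEL density, eq. (4)."""
    s = 1.0 + math.exp(x / theta)
    return (alpha * lam * math.exp(x / theta)
            * (1.0 - s ** (-lam)) ** (alpha - 1.0) / (theta * s ** (lam + 1.0)))

def digamma_int(n):
    """ψ(n) for positive integer n: ψ(1) = -γ, ψ(n) = -γ + Σ_{m=1}^{n-1} 1/m."""
    euler_gamma = 0.5772156649015329
    return -euler_gamma + sum(1.0 / m for m in range(1, n))

# Eq. (20) with α = 3, λ = 1, θ = 1: (α-1)^{(k)} vanishes for k > α - 1 = 2.
alpha, lam, theta = 3, 1, 1
falling = [1.0, 2.0, 2.0]   # (α-1)^{(k)} = (α-1)(α-2)···(α-k) for k = 0, 1, 2
mu_series = alpha * theta * sum(
    (-1) ** k * falling[k] / math.factorial(k + 1)
    * (digamma_int(1) - digamma_int(lam * k + lam))
    for k in range(3))

# Direct numerical mean for comparison.
lo, hi, n = -40.0, 40.0, 40001
h = (hi - lo) / (n - 1)
mu_numeric = sum((lo + i * h) * eel_pdf(lo + i * h, alpha, lam, theta) for i in range(n)) * h

print(round(mu_series, 6), round(mu_numeric, 6))  # both ≈ 1.5
```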

The skewness and kurtosis of a distribution can be measured by γ₁ = µ₃/σ³ and γ₂ = µ₄/σ⁴, respectively. However, the expressions for the third and fourth moments of the EEL(α, λ, θ) are difficult to obtain. Since the quantile function of the EEL(α, λ, θ) is in closed form, we can alternatively define measures of skewness and kurtosis based on the quantile function. The Galton skewness S defined by Galton (1883) and the Moors kurtosis K defined by Moors (1988) are given by

    S = [Q(6/8) − 2Q(4/8) + Q(2/8)] / [Q(6/8) − Q(2/8)],    (22)

    K = [Q(7/8) − Q(5/8) + Q(3/8) − Q(1/8)] / [Q(6/8) − Q(2/8)].    (23)

When the distribution is symmetric, S = 0, and when the distribution is right (or left) skewed, S > 0 (or S < 0). As K increases, the tails of the distribution become heavier. To investigate the effect of the two shape parameters α and λ on the EEL(α, λ, θ) distribution, equations (22) and (23) are used to obtain the Galton skewness and Moors kurtosis, where the quantile function is defined in (8). Figure 3 displays the Galton skewness and Moors kurtosis for the EEL(α, λ, θ) when θ = 1. From Figure 3, the EEL(α, λ, θ) distribution can be left skewed, right skewed or symmetric. For a fixed value of α, the Galton skewness is a decreasing function of λ. Also, for a fixed value of λ, as α gets large the values of S and K remain unchanged. For smaller values of α and λ, a small increment in the value of α results in a rapid change in the values of S and K.
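Both measures require only the closed-form quantile (8), so they are cheap to compute. The sketch below (our own implementation) evaluates S and K; in the symmetric case α = λ = 1 the Galton skewness vanishes.

```python
import math

def eel_quantile(p, alpha, lam, theta):
    """Quantile function, eq. (8)."""
    return theta * math.log((1.0 - p ** (1.0 / alpha)) ** (-1.0 / lam) - 1.0)

def galton_skewness(alpha, lam, theta=1.0):
    """Eq. (22): S = [Q(6/8) - 2Q(4/8) + Q(2/8)] / [Q(6/8) - Q(2/8)]."""
    q = lambda p: eel_quantile(p, alpha, lam, theta)
    return (q(0.75) - 2.0 * q(0.5) + q(0.25)) / (q(0.75) - q(0.25))

def moors_kurtosis(alpha, lam, theta=1.0):
    """Eq. (23): K = [Q(7/8) - Q(5/8) + Q(3/8) - Q(1/8)] / [Q(6/8) - Q(2/8)]."""
    q = lambda p: eel_quantile(p, alpha, lam, theta)
    return (q(7 / 8) - q(5 / 8) + q(3 / 8) - q(1 / 8)) / (q(6 / 8) - q(2 / 8))

print(round(galton_skewness(1.0, 1.0), 10))   # symmetric logistic case: S = 0
print(moors_kurtosis(1.0, 1.0) > 1.0)         # → True
```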


Figure 3: Graphs of the skewness and kurtosis of the EEL p.d.f. when θ = 1.

The deviation from the mean and the deviation from the median are used to measure the dispersion and the spread in a population from the center. If we denote the median by M, then the mean deviation from the mean, D(µ), and the mean deviation from the median, D(M), can be written as

    D(µ) = ∫_{−∞}^{∞} |x − µ| g(x) dx = 2µG(µ) − 2 ∫_{−∞}^{µ} x g(x) dx,    (24)

    D(M) = ∫_{−∞}^{∞} |x − M| g(x) dx = µ − 2 ∫_{−∞}^{M} x g(x) dx.    (25)

Now, consider

    I_m = ∫_{−∞}^{m} x g(x) dx = (αλ/θ) Σ_{k=0}^{∞} (−1)^k [(α − 1)^{(k)} / k!] ∫_{−∞}^{m} x e^{x/θ} (1 + e^{x/θ})^{−λk−λ−1} dx.

Let a_k(m) = ∫_{−∞}^{m} x e^{x/θ} (1 + e^{x/θ})^{−λk−λ−1} dx. Using the substitution u = (1 + e^{x/θ})^{−1}, we get

    a_k(m) = θ² ∫_{(1+e^{m/θ})^{−1}}^{1} u^{λk+λ−1} log(1 − u) du − θ² ∫_{(1+e^{m/θ})^{−1}}^{1} u^{λk+λ−1} log u du.    (26)

On using the Taylor series expansion for log(1 − u), (26) can be written as

    a_k(m) = θ² [1 − (1 + e^{m/θ})^{−λk−λ}] / (λk + λ)² − θ² log(1 + e^{m/θ}) (1 + e^{m/θ})^{−λk−λ} / (λk + λ) − θ² Σ_{i=1}^{∞} [1 − (1 + e^{m/θ})^{−λk−λ−i}] / [i(i + λk + λ)].

Hence,

    I_m = (αλ/θ) Σ_{k=0}^{∞} (−1)^k [(α − 1)^{(k)} / k!] a_k(m).

By using equations (24) and (25), the mean deviation from the mean and the mean deviation from the median are, respectively, given by

    D(µ) = 2µ(1 − (1 + e^{µ/θ})^{−λ})^α − 2I_µ   and   D(M) = µ − 2I_M.
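The representation (24) can be checked numerically without the series expansion for I_m: the sketch below (our own code, arbitrary parameter values) evaluates both E|X − µ| and 2µG(µ) − 2I_µ by quadrature and confirms they agree.

```python
import math

def eel_pdf(x, alpha, lam, theta):
    """EEL density, eq. (4)."""
    s = 1.0 + math.exp(x / theta)
    return (alpha * lam * math.exp(x / theta)
            * (1.0 - s ** (-lam)) ** (alpha - 1.0) / (theta * s ** (lam + 1.0)))

def eel_cdf(x, alpha, lam, theta):
    """EEL distribution function, eq. (5)."""
    return (1.0 - (1.0 + math.exp(x / theta)) ** (-lam)) ** alpha

def trapz(f, lo, hi, n=20000):
    h = (hi - lo) / n
    return (0.5 * (f(lo) + f(hi)) + sum(f(lo + i * h) for i in range(1, n))) * h

alpha, lam, theta = 2.0, 1.5, 1.0
mu = trapz(lambda x: x * eel_pdf(x, alpha, lam, theta), -40.0, 40.0)

# Left side of (24): direct mean absolute deviation about the mean.
lhs = trapz(lambda x: abs(x - mu) * eel_pdf(x, alpha, lam, theta), -40.0, 40.0)
# Right side of (24): 2µG(µ) - 2 I_µ, with I_µ computed by quadrature.
i_mu = trapz(lambda x: x * eel_pdf(x, alpha, lam, theta), -40.0, mu)
rhs = 2.0 * mu * eel_cdf(mu, alpha, lam, theta) - 2.0 * i_mu

print(abs(lhs - rhs) < 1e-4)  # → True
```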

4 Maximum likelihood estimation

In this section we address the parameter estimation of the EEL(α, λ, θ) under the classical setup. Let X₁, X₂, . . . , Xₙ be a random sample of size n drawn from the density in (4). The log-likelihood function is given by

    ℓ = log L(α, λ, θ) = n log α + n log λ − n log θ + nX̄/θ − (λ + 1) Σ_{i=1}^{n} log(1 + e^{X_i/θ}) + (α − 1) Σ_{i=1}^{n} log{1 − (1 + e^{X_i/θ})^{−λ}}.    (27)

The derivatives of (27) with respect to α, λ, and θ are given by

    ∂ℓ/∂α = n/α + Σ_{i=1}^{n} log{1 − (1 + e^{X_i/θ})^{−λ}},    (28)

    ∂ℓ/∂λ = n/λ − Σ_{i=1}^{n} log(1 + e^{X_i/θ}) + (α − 1) Σ_{i=1}^{n} log(1 + e^{X_i/θ}) / [(1 + e^{X_i/θ})^{λ} − 1],    (29)

    ∂ℓ/∂θ = −n/θ − nX̄/θ² + [(λ + 1)/θ²] Σ_{i=1}^{n} X_i e^{X_i/θ} / (1 + e^{X_i/θ}) − [(α − 1)λ/θ²] Σ_{i=1}^{n} X_i e^{X_i/θ} / [(1 + e^{X_i/θ})((1 + e^{X_i/θ})^{λ} − 1)].    (30)
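Before solving the score equations, it is prudent to validate the analytic derivatives (28)-(30) against finite differences of (27). The sketch below does this at an arbitrary parameter point on arbitrary placeholder data (both our own choices, not from the paper's application).

```python
import math

def loglik(params, xs):
    """Log-likelihood (27) for the EEL(α, λ, θ)."""
    a, l, t = params
    n = len(xs)
    out = n * math.log(a) + n * math.log(l) - n * math.log(t) + sum(xs) / t
    out -= (l + 1.0) * sum(math.log(1.0 + math.exp(x / t)) for x in xs)
    out += (a - 1.0) * sum(math.log(1.0 - (1.0 + math.exp(x / t)) ** (-l)) for x in xs)
    return out

def score(params, xs):
    """Analytic derivatives (28)-(30)."""
    a, l, t = params
    n = len(xs)
    s = [1.0 + math.exp(x / t) for x in xs]
    da = n / a + sum(math.log(1.0 - si ** (-l)) for si in s)
    dl = (n / l - sum(math.log(si) for si in s)
          + (a - 1.0) * sum(math.log(si) / (si ** l - 1.0) for si in s))
    dt = (-n / t - sum(xs) / t ** 2
          + (l + 1.0) / t ** 2 * sum(x * math.exp(x / t) / si for x, si in zip(xs, s))
          - (a - 1.0) * l / t ** 2 * sum(
              x * math.exp(x / t) / (si * (si ** l - 1.0)) for x, si in zip(xs, s)))
    return (da, dl, dt)

xs = [-1.2, 0.4, 0.9, 2.3, -0.5, 1.7]   # arbitrary placeholder data
p = (1.8, 1.1, 0.9)
eps = 1e-6
for i, analytic in enumerate(score(p, xs)):
    up = list(p); up[i] += eps
    dn = list(p); dn[i] -= eps
    numeric = (loglik(up, xs) - loglik(dn, xs)) / (2.0 * eps)
    print(abs(numeric - analytic) < 1e-4)  # → True for each parameter
```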

The MLEs α̂, λ̂, and θ̂ are obtained by setting (28), (29) and (30) to zero and solving them simultaneously.

Lemma 3. The Fisher information matrix for the EEL(α, λ, θ) distribution when θ is known is given by

    I = n [ 1/α²                      −{ψ(α) − ψ(1)}/λ²
            −{ψ(α) − ψ(1)}/λ²         (1/λ²)(1 + ((α − 1)/4) A) ],

where A = ψ′(1) − ψ′(α − 1) + (ψ(α + 1) − ψ(1))², provided α > 1.

Proof: See appendix.

5 Reliability parameter

The reliability parameter R is defined as R = P(X > Y), where X and Y are independent random variables. Many applications of the reliability parameter have appeared in the literature, such as in the classical stress-strength model and the breakdown of a system having two components. More applications of the reliability parameter can be found in Hall (1984) and Weerahandi and Johnson (1992). If X and Y are two continuous and independent random variables with c.d.f.s F₁(x) and F₂(y) and p.d.f.s f₁(x) and f₂(y) respectively, then the reliability parameter R can be written as

    R = ∫_{−∞}^{∞} F₂(t) f₁(t) dt.

Theorem 4. Suppose that X ∼ EEL(α₁, λ₁, θ) and Y ∼ EEL(α₂, λ₂, θ). Then

    P(X > Y) = α₁ Σ_{j=0}^{∞} (−1)^j [α₂^{(j)} / j!] B(α₁, 1 + jλ₂/λ₁),

where α₂^{(j)} = α₂(α₂ − 1) · · · (α₂ − j + 1).

Proof: From (4) and (5), and on using the substitution u = 1 − (1 + e^{t/θ})^{−λ₁}, we have

    P(X > Y) = α₁ ∫₀¹ u^{α₁−1} {1 − (1 − u)^{λ₂/λ₁}}^{α₂} du
             = α₁ ∫₀¹ u^{α₁−1} Σ_{j=0}^{∞} (−1)^j [α₂^{(j)} / j!] (1 − u)^{jλ₂/λ₁} du
             = α₁ Σ_{j=0}^{∞} (−1)^j [α₂^{(j)} / j!] B(α₁, 1 + jλ₂/λ₁). ⇤
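When α₂ is a positive integer the series in Theorem 4 terminates, and the integral R = α₁ ∫₀¹ u^{α₁−1} {1 − (1 − u)^{λ₂/λ₁}}^{α₂} du can be evaluated by quadrature for comparison. The sketch below (our own check) does both; a convenient sanity case is α₁ = α₂ = 1, where the series reduces to λ₂/(λ₁ + λ₂).

```python
import math

def beta_fn(a, b):
    return math.exp(math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b))

def reliability_series(a1, l1, a2, l2, terms=60):
    """Theorem 4: P(X > Y) = α₁ Σ_j (-1)^j [α₂^{(j)}/j!] B(α₁, 1 + jλ₂/λ₁)."""
    total, falling = 0.0, 1.0   # falling = α₂(α₂-1)···(α₂-j+1), equal to 1 for j = 0
    for j in range(terms):
        total += (-1.0) ** j * falling / math.factorial(j) * beta_fn(a1, 1.0 + j * l2 / l1)
        falling *= (a2 - j)
        if falling == 0.0:      # the series terminates for integer α₂
            break
    return a1 * total

def reliability_quad(a1, l1, a2, l2, n=100000):
    """Direct Riemann-sum evaluation of α₁ ∫₀¹ u^{α₁-1} {1 - (1-u)^{λ₂/λ₁}}^{α₂} du."""
    h = 1.0 / n
    total = sum((i * h) ** (a1 - 1.0) * (1.0 - (1.0 - i * h) ** (l2 / l1)) ** a2
                for i in range(1, n)) * h
    return a1 * total

print(abs(reliability_series(2.0, 1.0, 3.0, 2.0) - reliability_quad(2.0, 1.0, 3.0, 2.0)) < 1e-3)
print(abs(reliability_series(1.0, 1.0, 1.0, 2.0) - 2.0 / 3.0) < 1e-9)  # λ₂/(λ₁+λ₂) = 2/3
```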
