Revista Colombiana de Estadística January 2017, Volume 40, Issue 1, pp. 165 to 203 DOI: http://dx.doi.org/10.15446/rce.v40n1.56153

Statistical Properties and Different Methods of Estimation of Transmuted Rayleigh Distribution

Propiedades estadísticas y diferentes métodos de estimación de la distribución de Rayleigh transmutada

Sanku Dey1,a, Enayetur Raheem2,b, Saikat Mukherjee3,c

1 Department of Statistics, St. Anthony's College, Shillong, India
2 Applied Statistics and Research Methods, University of Northern Colorado, Greeley, USA
3 NIT Meghalaya, Shillong, India

Abstract

This article addresses various properties of the Transmuted Rayleigh (TR) distribution and different methods of estimating its unknown parameters from a frequentist point of view. Although our main focus is on estimation, various mathematical and statistical properties of the TR distribution (such as quantiles, moments, moment generating function, conditional moments, hazard rate, mean residual lifetime, mean past lifetime, mean deviation about the mean and the median, stochastic ordering, various entropies, stress-strength parameter, and order statistics) are derived. We briefly describe different methods of estimation, namely maximum likelihood, method of moments, percentile-based estimation, least squares, maximum product of spacings, Cramér-von Mises, Anderson-Darling, and right-tail Anderson-Darling, and compare them using extensive simulation studies. Finally, the potential of the model is studied using two real data sets. Bias, standard error of the estimates, and bootstrap percentile confidence intervals are obtained by bootstrap resampling.

Key words: Distributional Moments, Order Statistics, Parameter Estimation, Rate Risk Function, Rayleigh Distribution, Transmuted Rayleigh Distribution.

a PhD. b PhD. c PhD.


1. Introduction

The Rayleigh distribution was introduced by Rayleigh (1880) in connection with a problem in the field of acoustics. It has been used extensively in fields such as communication engineering, the life-testing of electro-vacuum devices, reliability theory, and survival analysis. An important characteristic of this distribution is that its failure rate is a linear function of time. The reliability function of the Rayleigh distribution decreases at a much higher rate than that of the exponential distribution. The Rayleigh distribution is related to a number of distributions, such as the generalized extreme value, Weibull, and chi-square distributions, and hence its applicability in real-life situations is significant. Estimation, prediction, and inferential issues for the one-parameter Rayleigh distribution have been studied extensively by several authors. Interested readers may refer to Johnson, Kotz & Balakrishnan (1995) for an excellent insight into the Rayleigh distribution, and to Dey & Das (2007) and Dey (2009) for some references.

The construction of the transmuted distribution is rather simple and was first proposed by Shaw & Buckley (2007). Since then, transmuted distributions have been widely studied in statistics, and many authors have developed transmuted-type distributions based on well-known distributions. See, for example, the transmuted extreme value distribution with applications by Alkasasbeh & Raqab (2009a); the transmuted Lindley distribution by Merovci (2013b); the transmuted generalized Rayleigh distribution by Merovci (2013a); the transmuted exponentiated exponential distribution by Merovci (2013c); the transmuted Fréchet distribution by Mahmoud & Mandouh (2013); the transmuted generalized inverse Weibull distribution by Merovci, Elbatal & Ahmed (2014); the transmuted linear exponential distribution by Tian, Tian & Zhu (2014); and the transmuted Pareto distribution by Merovci & Puka (2014). Hence, efficient estimation of the parameters of such distributions is extremely important.

Maximum likelihood estimation (MLE) and the method of moments estimation (MME) are the traditional methods of estimation. Although MLE is advantageous in terms of its efficiency and has good theoretical properties, there is evidence that it does not perform well, especially in small samples. The method of moments is easily applicable and often gives explicit forms for the estimators of unknown parameters. There are, however, cases where the method of moments does not give explicit estimators (e.g., for the parameters of the Weibull and Gompertz distributions). Therefore, other methods have been proposed in the literature as alternatives to the traditional methods of estimation. Among them, the L-moments estimator (LME), least squares estimator (LSE), generalized spacing estimator (GSE), and percentile estimator (PCE) are often suggested. Generally, these methods do not have good theoretical properties, but in some cases they can provide better estimates of the unknown parameters than the MLE and the MME.

This paper considers ten different frequentist estimators for the transmuted Rayleigh distribution and evaluates their performance for different sample sizes and different parameter values. Simulations are used to compare the performance, as it is not possible to compare all estimators theoretically (see Gupta & Kundu 2001, Gupta & Kundu 2007). The uniqueness of this study comes from the fact that we provide a comprehensive description of the mathematical and statistical properties of this distribution, with the hope that they will attract wider applications in lifetime analysis.
Also, to the best of our knowledge, no attempt has so far been made to compare all these estimators for the two-parameter TR distribution together with its mathematical and statistical properties. Comparisons of estimation methods for other distributions have been performed in the literature: Kundu & Raqab (2005) for generalized Rayleigh distributions, Alkasasbeh & Raqab (2009b) for generalized logistic distributions, Mazucheli, Louzada & Ghitany (2013) for the weighted Lindley distribution, Teimouri, Hoseini & Nadarajah (2013) for the Weibull distribution, Akram & Hayat (2014) for the Weibull distribution, and Dey, Dey & Kundu (2014) for the two-parameter Rayleigh distribution. There are also some works in the literature on the transmuted Rayleigh distribution and its variants, such as the slashed Rayleigh distribution (Iriarte, Gómez, Varela & Bolfarine 2015) and its exponential variant (Salinas, Iriarte & Bolfarine 2015).

The transmuted Rayleigh (TR) distribution was introduced by Merovci (2013c), who studied only the moments and the maximum likelihood estimation of the unknown parameters. A random variable X is said to have a transmuted Rayleigh distribution if its cumulative distribution function (cdf) is given by

\[ G(x, \alpha, \lambda) = 1 - e^{-\alpha x^2}\left(1 - \lambda + \lambda e^{-\alpha x^2}\right); \quad x > 0,\ \alpha > 0,\ |\lambda| \le 1, \tag{1} \]

with corresponding pdf

\[ g(x, \alpha, \lambda) = 2\alpha x e^{-\alpha x^2}\left(1 - \lambda + 2\lambda e^{-\alpha x^2}\right); \quad x > 0,\ \alpha > 0,\ |\lambda| \le 1, \tag{2} \]

and the survival function is given by

\[ S(x, \alpha, \lambda) = e^{-\alpha x^2}\left(1 - \lambda + \lambda e^{-\alpha x^2}\right); \quad x > 0,\ \alpha > 0,\ |\lambda| \le 1. \tag{3} \]
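As a quick illustration of (1)-(3), the following minimal Python sketch evaluates the three functions and checks numerically that the pdf is the derivative of the cdf. The function names `tr_cdf`, `tr_pdf`, and `tr_sf` and the parameter values are illustrative choices, not part of the paper.

```python
import math

def tr_cdf(x, alpha, lam):
    """CDF (1): G(x) = 1 - u (1 - lam + lam u), with u = exp(-alpha x^2)."""
    u = math.exp(-alpha * x * x)
    return 1.0 - u * (1.0 - lam + lam * u)

def tr_pdf(x, alpha, lam):
    """PDF (2): g(x) = 2 alpha x u (1 - lam + 2 lam u)."""
    u = math.exp(-alpha * x * x)
    return 2.0 * alpha * x * u * (1.0 - lam + 2.0 * lam * u)

def tr_sf(x, alpha, lam):
    """Survival function (3): S(x) = u (1 - lam + lam u)."""
    u = math.exp(-alpha * x * x)
    return u * (1.0 - lam + lam * u)

# Illustrative values only: a central difference of the cdf should match the pdf.
alpha, lam, x, h = 0.5, 0.5, 1.2, 1e-6
slope = (tr_cdf(x + h, alpha, lam) - tr_cdf(x - h, alpha, lam)) / (2.0 * h)
print(abs(slope - tr_pdf(x, alpha, lam)) < 1e-6)
```

Setting `lam = 0` in these functions gives the classical Rayleigh case.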

The distribution studied in this paper is a re-parameterization of the version proposed by Merovci (2013c), with $\alpha = (2\sigma^2)^{-1}$. Note that the classical Rayleigh distribution is the special case λ = 0. Figure 1 illustrates some of the possible shapes of the pdf of a transmuted Rayleigh distribution for selected values of the parameters λ and α. This distribution has a unimodal pdf and increasing hazard rates. The latter may at first seem unrealistic in real-life situations. However, in several situations only increasing hazard rates are used or observed: Maeda & Nishikawa (2006) state that "Ruling parties in presidential systems face an increasing hazard rate in their survival"; Woosley & Cossman (2007) observe that drugs during clinical development have increasing hazard rates; Saidane, Babai, Aguir & Korbaa (2010) suppose that the demand interval in spare parts inventory systems has increasing hazard rates; Tsarouhas & Arvanitoyannis (2007) show that machines involved in bread production display increasing hazard rates; Koutras (2011) finds that software degradation times have increasing hazard rates; and Lai (2013) investigates the optimum number of minimal repairs for systems under increasing hazard rates.

Figure 1: Probability density function of the TR distribution for different values of parameters α and λ. (Two panels: one with α = 0.2 and varying λ, the other with λ = 0.5 and α = 0.5, 1.0, 2.0, 3.0.)

The principal motivation of the paper is twofold. The first is empirical: we show that the transmuted Rayleigh distribution outperforms at least two two-parameter distributions with respect to a real data set. The second is to show how different frequentist estimators of this distribution perform for different sample sizes and different parameter values, and to develop a guideline for choosing the best estimation method for the transmuted Rayleigh distribution, which we think would be of interest to applied statisticians.

The paper is organized as follows. Various mathematical, statistical, and reliability properties of the TR distribution (such as shapes, quantiles, moments, moment generating function, hazard rate function, mean residual lifetime, and conditional moments) are derived in Sections 1 and 2. In Section 3, ten frequentist methods of estimation are discussed. Inference with simulation and two real data applications for the TR distribution are discussed in Sections 4 and 5. In the last section, we draw some conclusions.

1.1. Shape

The study of shapes is useful to determine whether a data set can be modeled by the TR distribution. Here, we discuss the shape of the pdf g.

Theorem 1. The probability density function g is unimodal.

Proof. Notice that g is a smooth function. To prove that it is unimodal, we show that there is one and only one $x^* \in \mathbb{R}^+$ such that $g'(x^*) = 0$, with $g' > 0$ for $x < x^*$ and $g' < 0$ for $x > x^*$. Letting $u = \alpha x^2$, we have

\[ g'(x) = 2\alpha e^{-2u}\left[e^{u}(\lambda - 1)(2u - 1) + (2 - 8u)\lambda\right]. \]

Since $0 < u < \infty$, $g'$ is positive, zero, or negative according to whether $e^{u}(\lambda - 1)(2u - 1) + (2 - 8u)\lambda$ is positive, zero, or negative, respectively. Now $g'(x) = 0$ implies that $e^{u}(\lambda - 1)(2u - 1) + (2 - 8u)\lambda = 0$.

Case I (λ = 1). This implies $u^* = 1/4$, and hence $x^* = \frac{1}{2\sqrt{\alpha}}$. Now, for $x > x^*$, i.e. $u > 1/4$, we have $e^{u}(\lambda - 1)(2u - 1) + (2 - 8u)\lambda = 2 - 8u < 0$ and, hence, $g' < 0$. Similarly, for $x < x^*$ we obtain $g' > 0$.

Case II (λ ≠ 1). In this case, we have

\[ Q(u) := \frac{(2u - 1)e^{u}}{8u - 2} = \frac{\lambda}{\lambda - 1}. \]

Note that Q is a strictly increasing, continuous function of u on $(0, \frac14)$ and on $(\frac14, \infty)$. Since $\frac{\lambda}{\lambda - 1} \le \frac12$, and Q has range $(\frac12, \infty)$ on $(0, \frac14)$ and range $(-\infty, \infty)$ on $(\frac14, \infty)$, there is a unique $u^* \in (\frac14, \infty)$ such that $Q(u^*) = \frac{\lambda}{\lambda - 1}$. Hence there is a unique $x^* > \frac{1}{2\sqrt{\alpha}}$ such that $g'(x^*) = 0$.

Now, for $u > u^* > 1/4$, for $1/4 < u < u^*$, and for $0 < u < 1/4 < u^*$, one can easily show that $g' < 0$, $g' > 0$, and $g' > 0$, respectively. Hence g is unimodal.
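The proof suggests a simple way to locate the mode numerically: with u = αx², the bracketed factor of g'(x) is non-negative at u = 1/4 and changes sign exactly once on (1/4, ∞). The sketch below is only an illustration under that reading of the proof; the name `tr_mode`, the upper bracket of 50, and the tolerance are assumptions, not part of the paper.

```python
import math

def tr_mode(alpha, lam, tol=1e-12):
    """Mode of the TR pdf, found by bisection on u = alpha * x**2.

    phi(u) is the bracketed factor of g'(x) in the proof of Theorem 1;
    it is >= 0 at u = 1/4 and negative for large u.
    """
    def phi(u):
        return math.exp(u) * (lam - 1.0) * (2.0 * u - 1.0) + (2.0 - 8.0 * u) * lam

    lo, hi = 0.25, 50.0  # assumed bracket: the root lies in [1/4, 50]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phi(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(0.5 * (lo + hi) / alpha)

# Case I of the proof: for lam = 1 the mode is 1 / (2 sqrt(alpha)).
print(tr_mode(2.0, 1.0), 1.0 / (2.0 * math.sqrt(2.0)))
```

For λ = 0 the same routine gives the Rayleigh mode $1/\sqrt{2\alpha}$.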

2. Statistical and Mathematical Properties

In this section, we provide some important statistical and mathematical measures for the TR distribution, such as quantiles, moment generating function, moments, hazard rate and mean residual life functions, mean past lifetime, conditional moments, stochastic ordering, mean deviation about the mean and the median, Shannon and Rényi entropies, order statistics, and the stress-strength parameter. The quantile function $x = Q(p) = G^{-1}(p)$, for $0 < p < 1$, of the TR distribution is obtained from (1); it follows that the quantile function is

\[ x = \left[-\frac{1}{\alpha}\ln\left(\frac{(\lambda - 1) + \sqrt{(\lambda - 1)^2 + 4\lambda(1 - p)}}{2\lambda}\right)\right]^{1/2}. \tag{4} \]

In particular, the median of the TR distribution can be written as

\[ Md(X) = M_d = \left[-\frac{1}{\alpha}\ln\left(\frac{(\lambda - 1) + \sqrt{\lambda^2 + 1}}{2\lambda}\right)\right]^{1/2}. \tag{5} \]
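Because (4) is available in closed form, TR variates can be generated by inverse-transform sampling. The following sketch illustrates this; `tr_quantile`, `tr_cdf`, and `tr_sample` are hypothetical helper names, and the separate branch for λ = 0 (where (4) reduces to the Rayleigh quantile) is an implementation choice rather than something stated in the paper.

```python
import math
import random

def tr_cdf(x, alpha, lam):
    u = math.exp(-alpha * x * x)
    return 1.0 - u * (1.0 - lam + lam * u)

def tr_quantile(p, alpha, lam):
    """Quantile function (4); for lam == 0 it reduces to the Rayleigh case."""
    if abs(lam) < 1e-12:
        u = 1.0 - p
    else:
        disc = (lam - 1.0) ** 2 + 4.0 * lam * (1.0 - p)
        u = ((lam - 1.0) + math.sqrt(disc)) / (2.0 * lam)
    # max(...) guards against tiny negative values caused by rounding at p = 0.
    return math.sqrt(max(0.0, -math.log(u) / alpha))

def tr_sample(n, alpha, lam, rng):
    """Inverse-transform sampling: apply the quantile function to uniforms."""
    return [tr_quantile(rng.random(), alpha, lam) for _ in range(n)]

rng = random.Random(2017)          # arbitrary seed for reproducibility
xs = tr_sample(5, 1.0, 0.5, rng)   # illustrative parameter values
print(xs)
print(tr_cdf(tr_quantile(0.5, 1.0, 0.5), 1.0, 0.5))  # should be close to 0.5
```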

We hardly need to emphasize the necessity and importance of moments in any statistical analysis, especially in applied work. Some of the most important features and characteristics of a distribution can be studied through moments (e.g., central tendency, dispersion, skewness, and kurtosis). If the random variable X is distributed TR(α, λ), then its nth moment around zero can be expressed as

\[ E(X^n) = 2\alpha\int_0^\infty x^{n+1}\exp\left(-\alpha x^2\right)\left(1 - \lambda + 2\lambda\exp\left(-\alpha x^2\right)\right)dx. \]

On simplification, we get

\[ E(X^n) = \alpha^{-n/2}\,\Gamma\left(1 + \frac{n}{2}\right)\left(1 - \lambda + \lambda 2^{-n/2}\right). \tag{6} \]

The variance, skewness, and kurtosis measures can now be calculated using the following relations:

\[ Var(X) = E(X^2) - E^2(X), \]

\[ Skewness(X) = \frac{E(X^3) - 3E(X)E(X^2) + 2E^3(X)}{Var^{3/2}(X)}, \]

\[ Kurtosis(X) = \frac{E(X^4) - 4E(X)E(X^3) + 6E(X^2)E^2(X) - 3E^4(X)}{Var^{2}(X)}. \]

Figures 2 and 3 illustrate their variations.
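The relations above translate directly into code. The sketch below evaluates (6) and the derived variance, skewness, and kurtosis; `tr_moment` and `tr_summary` are illustrative names, and the example parameters are arbitrary.

```python
import math

def tr_moment(n, alpha, lam):
    """n-th raw moment from (6)."""
    return alpha ** (-n / 2.0) * math.gamma(1.0 + n / 2.0) * (1.0 - lam + lam * 2.0 ** (-n / 2.0))

def tr_summary(alpha, lam):
    """Mean, variance, skewness and kurtosis built from the raw moments."""
    m1, m2, m3, m4 = (tr_moment(k, alpha, lam) for k in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    skew = (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4.0 * m1 * m3 + 6.0 * m2 * m1 ** 2 - 3.0 * m1 ** 4) / var ** 2
    return m1, var, skew, kurt

# lam = 0 with alpha = 0.5 is the Rayleigh distribution with sigma = 1.
print(tr_summary(0.5, 0.0))
print(tr_summary(0.5, 0.5))
```

With λ = 0 and α = 0.5 the mean and variance reduce to the familiar Rayleigh values $\sqrt{\pi/2}$ and $(4 - \pi)/2$, which is a convenient sanity check.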

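2.1. Moment Generating Function

Many of the interesting characteristics and features of a distribution can be obtained via its moment generating function and moments. Let X denote a random variable with the probability density function (2). By the definition of the moment generating function of X and using (2), we have

\[ M_X(t) = E(e^{tX}) = \int_0^\infty e^{tx} g(x)\,dx \]

\[ = \frac{4\sqrt{\alpha} - 2\sqrt{\pi}(\lambda - 1)\,t\,e^{\frac{t^2}{4\alpha}}\left(\mathrm{Erf}\left(\frac{t}{2\sqrt{\alpha}}\right) + 1\right) + \sqrt{2\pi}\,\lambda\,t\,e^{\frac{t^2}{8\alpha}}\left(\mathrm{Erf}\left(\frac{t}{2\sqrt{2}\sqrt{\alpha}}\right) + 1\right)}{4\sqrt{\alpha}}. \tag{7} \]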

Figure 2: (a) Expectation and (b) Variance of TR distribution for λ = −1, −0.9, −0.8, . . . , 1 and α = 0.2, 0.4, 0.6, . . . , 3.

Figure 3: (a) Skewness and (b) Kurtosis of TR distribution for λ = −1, −0.9, −0.8, . . . , 1 and α = 0.2, 0.4, 0.6, . . . , 3.

2.2. Stochastic Ordering

Stochastic ordering is a tool used to study structural properties of complex stochastic systems. For example, it is useful to control congestion in information transfer over the Internet, to define treatment-related trends in clustered binary data, and to order expected welfare income under different mechanisms of allocating rewards. There are different types of stochastic orderings, which are useful in ordering random variables in terms of different properties. Here we consider four different stochastic orders, namely the usual, the hazard rate, the mean residual life, and the likelihood ratio order, for two independent TR random variables under a restricted parameter space. If X and Y are independent random variables with CDFs $F_X$ and $F_Y$, respectively, then X is said to be smaller than Y in the

• stochastic order ($X \le_{st} Y$) if $F_X(x) \ge F_Y(x)$ for all x
• hazard rate order ($X \le_{hr} Y$) if $h_X(x) \ge h_Y(x)$ for all x
• mean residual life order ($X \le_{mrl} Y$) if $m_X(x) \ge m_Y(x)$ for all x
• likelihood ratio order ($X \le_{lr} Y$) if $f_X(x)/f_Y(x)$ decreases in x.

The following results, due to Shaked & Shanthikumar (1994), are well known for establishing stochastic ordering of distributions:

\[ X \le_{lr} Y \implies X \le_{hr} Y \implies X \le_{mrl} Y, \qquad X \le_{hr} Y \implies X \le_{st} Y. \tag{8} \]

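The TR distribution is ordered with respect to the strongest, "likelihood ratio", ordering, as shown in the following theorem. It shows the flexibility of the two-parameter TR distribution.

Theorem 2. Let $X \sim TR(\alpha_1, \lambda_1)$ and $Y \sim TR(\alpha_2, \lambda_2)$. If $\alpha_1 = \alpha_2 = \alpha$ and $\lambda_1 \ge \lambda_2$, then $X \le_{lr} Y$ and, hence, $X \le_{hr} Y$, $X \le_{mrl} Y$ and $X \le_{st} Y$.

Proof. The likelihood ratio is

\[ \frac{f_X(x)}{f_Y(x)} = \frac{\alpha_1 x e^{-\alpha_1 x^2}\left(1 - \lambda_1 + 2\lambda_1 e^{-\alpha_1 x^2}\right)}{\alpha_2 x e^{-\alpha_2 x^2}\left(1 - \lambda_2 + 2\lambda_2 e^{-\alpha_2 x^2}\right)}. \]

Thus,

\[ \frac{d}{dx}\log\frac{f_X(x)}{f_Y(x)} = -2(\alpha_1 - \alpha_2)x - \frac{4\alpha_1 x\lambda_1 e^{-\alpha_1 x^2}}{1 - \lambda_1 + 2\lambda_1 e^{-\alpha_1 x^2}} + \frac{4\alpha_2 x\lambda_2 e^{-\alpha_2 x^2}}{1 - \lambda_2 + 2\lambda_2 e^{-\alpha_2 x^2}}. \]

Now, if $\alpha_1 = \alpha_2 = \alpha$ and $\lambda_1 \ge \lambda_2$, then $\frac{d}{dx}\log\frac{f_X(x)}{f_Y(x)} \le 0$, because $\lambda e^{-\alpha x^2}/(1 - \lambda + 2\lambda e^{-\alpha x^2})$ is increasing in λ. This implies that $X \le_{lr} Y$ and, hence, $X \le_{hr} Y$, $X \le_{mrl} Y$ and $X \le_{st} Y$.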

2.3. Hazard Function

The hazard rate function of the transmuted Rayleigh distribution is given by

\[ h(x) = \frac{g(x)}{1 - G(x)} = \frac{2\alpha x\left(1 - \lambda + 2\lambda e^{-\alpha x^2}\right)}{1 - \lambda + \lambda e^{-\alpha x^2}}. \tag{9} \]

Figure 4 illustrates some of the possible shapes of h. Although Figure 4 suggests that the hazard rate function of the TR distribution is always increasing, for some parameter values the hazard function decreases over part of its range. For example, if α = 0.2 and λ = 0.92, h decreases on the interval (4, 4.4).

Figure 4: Hazard function of TR distribution for different values of parameters α and λ. (Panels (a) and (b).)
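The local decrease mentioned above is easy to confirm numerically from (9). The sketch below is only a spot check at the two endpoints of the interval quoted in the text; `tr_hazard` is an illustrative name.

```python
import math

def tr_hazard(x, alpha, lam):
    """Hazard rate (9): h(x) = 2 alpha x (1 - lam + 2 lam u) / (1 - lam + lam u)."""
    u = math.exp(-alpha * x * x)
    return 2.0 * alpha * x * (1.0 - lam + 2.0 * lam * u) / (1.0 - lam + lam * u)

# Parameter values quoted in the text: alpha = 0.2, lam = 0.92.
h_left = tr_hazard(4.0, 0.2, 0.92)
h_right = tr_hazard(4.4, 0.2, 0.92)
print(h_left, h_right, h_right < h_left)
```

For λ = 0 the function reduces to the linear Rayleigh hazard 2αx.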

2.4. Mean Residual Life Function

The mean residual life (MRL) is the expected remaining life, X − x, given that the item has survived to time x. Figure 5 illustrates some of the possible shapes of µ.

\[ \mu(x) = E(X - x \mid X > x) = \frac{\int_x^\infty y\,g(y)\,dy}{1 - G(x)} - x \]

\[ = \frac{e^{2\alpha x^2}\sqrt{\pi}\left[2(1 - \lambda)\,\mathrm{Erfc}(x\sqrt{\alpha}) + \sqrt{2}\,\lambda\,\mathrm{Erfc}(x\sqrt{2\alpha})\right]}{4\sqrt{\alpha}\left(\lambda + (1 - \lambda)e^{\alpha x^2}\right)}. \]

One can easily show that $\lim_{x\to\infty}\mu(x) = 0$.

Figure 5: MRL function of TR distribution for different values of parameters α and λ. (Panels (a) and (b).)
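The closed form of µ(x) can be cross-checked against the equivalent representation $\mu(x) = \int_x^\infty S(y)\,dy / S(x)$. The sketch below does this with a composite Simpson rule; the helper names, the truncation of the integral at x + 16, and the step count are assumptions made for illustration.

```python
import math

def tr_sf(x, alpha, lam):
    u = math.exp(-alpha * x * x)
    return u * (1.0 - lam + lam * u)

def tr_mrl(x, alpha, lam):
    """Closed-form mean residual life as displayed in Section 2.4."""
    ra = math.sqrt(alpha)
    num = math.exp(2.0 * alpha * x * x) * math.sqrt(math.pi) * (
        2.0 * (1.0 - lam) * math.erfc(x * ra)
        + math.sqrt(2.0) * lam * math.erfc(x * math.sqrt(2.0 * alpha))
    )
    den = 4.0 * ra * (lam + (1.0 - lam) * math.exp(alpha * x * x))
    return num / den

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

alpha, lam, x = 0.5, 0.5, 1.0   # illustrative values
numeric = simpson(lambda y: tr_sf(y, alpha, lam), x, x + 16.0) / tr_sf(x, alpha, lam)
print(tr_mrl(x, alpha, lam), numeric)
```

At x = 0 the function returns E(X), which gives a second quick consistency check against (6).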

2.5. Mean Past Lifetime (MPL)

In real-life situations, where systems are often not monitored continuously, one might be interested in inferring more about the history of the system, e.g. when the individual components have failed. Now assume that a component with lifetime X has failed at or some time before x, x ≥ 0. Consider the conditional random variable x − X | X ≤ x. This conditional random variable represents the time elapsed since the failure of the component, given that its lifetime is less than or equal to x. Hence, the mean past lifetime (MPL) of the component can be defined as

\[ k(x) = E[x - X \mid X \le x] = \frac{\int_0^x G(t)\,dt}{G(x)} = x - \frac{\int_0^x t\,g(t)\,dt}{G(x)} \]

\[ = \frac{x - \dfrac{\sqrt{\pi}\left((\sqrt{2} - 2)\lambda + 2\right)}{4\sqrt{\alpha}} + \dfrac{\sqrt{\pi}\left(\sqrt{2}\,\lambda\,\mathrm{Erfc}(\sqrt{2}\sqrt{\alpha}\,x) - 2(\lambda - 1)\,\mathrm{Erfc}(\sqrt{\alpha}\,x)\right)}{4\sqrt{\alpha}}}{\left(1 - e^{-\alpha x^2}\right)\left(1 + \lambda e^{-\alpha x^2}\right)}. \]

Since $0 \le k(x) \le x$, we have $k(x) \to 0$ as $x \to 0$, and one can easily show that $k(x) \to \infty$ as $x \to \infty$. Figure 6 illustrates some of the possible shapes of k.

Figure 6: MPL function of TR distribution for different values of parameters α and λ. (Panels (a) and (b).)

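2.6. Conditional Moments

For lifetime models, it is also of interest to know $E(X^n \mid X > x)$. It can be easily seen that

\[ E(X^n \mid X > x) = \frac{1}{S(x)}\int_x^\infty t^n g(t)\,dt. \]

In particular,

\[ E(X \mid X > x) = \frac{e^{\alpha x^2}\left[\dfrac{\sqrt{\pi}\left(\sqrt{2}\,\lambda\,\mathrm{Erfc}(\sqrt{2}\sqrt{\alpha}\,x) - 2(\lambda - 1)\,\mathrm{Erfc}(\sqrt{\alpha}\,x)\right)}{4\sqrt{\alpha}} + x e^{-2\alpha x^2}\left(\lambda - (\lambda - 1)e^{\alpha x^2}\right)\right]}{\lambda\left(e^{-\alpha x^2} - 1\right) + 1}, \]

\[ E(X^2 \mid X > x) = \frac{2(\lambda - 1)e^{\alpha x^2}\left(\alpha x^2 + 1\right) - \lambda\left(2\alpha x^2 + 1\right)}{2\alpha\left((\lambda - 1)e^{\alpha x^2} - \lambda\right)}, \]

\[ E(X^3 \mid X > x) = \frac{12\sqrt{\pi}(\lambda - 1)e^{2\alpha x^2}\,\mathrm{Erfc}(\sqrt{\alpha}\,x) - 3\sqrt{2\pi}\,\lambda\,e^{2\alpha x^2}\,\mathrm{Erfc}(\sqrt{2}\sqrt{\alpha}\,x)}{16\alpha^{3/2}\left((\lambda - 1)e^{\alpha x^2} - \lambda\right)} + \frac{4\sqrt{\alpha}\,x\left(2(\lambda - 1)e^{\alpha x^2}\left(2\alpha x^2 + 3\right) - \lambda\left(4\alpha x^2 + 3\right)\right)}{16\alpha^{3/2}\left((\lambda - 1)e^{\alpha x^2} - \lambda\right)}, \]

and

\[ E(X^4 \mid X > x) = \frac{2(\lambda - 1)e^{\alpha x^2}\left(\alpha^2 x^4 + 2\alpha x^2 + 2\right) - \lambda\left(2\alpha^2 x^4 + 2\alpha x^2 + 1\right)}{2\alpha^2\left((\lambda - 1)e^{\alpha x^2} - \lambda\right)}. \]

The mean residual lifetime function is $E(X \mid X > x) - x$.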
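As a small illustration, the even-order conditional moments above involve only elementary functions and can be coded directly. The sketch below uses the hypothetical names `tr_cond_moment2` and `tr_cond_moment4` and checks that at x = 0 they reduce to the unconditional moments in (6).

```python
import math

def tr_cond_moment2(x, alpha, lam):
    """E(X^2 | X > x) from the closed form in Section 2.6."""
    a = alpha * x * x
    e = math.exp(a)
    num = 2.0 * (lam - 1.0) * e * (a + 1.0) - lam * (2.0 * a + 1.0)
    den = 2.0 * alpha * ((lam - 1.0) * e - lam)
    return num / den

def tr_cond_moment4(x, alpha, lam):
    """E(X^4 | X > x) from the closed form in Section 2.6."""
    a = alpha * x * x
    e = math.exp(a)
    num = 2.0 * (lam - 1.0) * e * (a * a + 2.0 * a + 2.0) - lam * (2.0 * a * a + 2.0 * a + 1.0)
    den = 2.0 * alpha ** 2 * ((lam - 1.0) * e - lam)
    return num / den

alpha, lam = 1.5, 0.4   # illustrative values
# At x = 0 the conditioning is vacuous, so (6) should be recovered.
print(tr_cond_moment2(0.0, alpha, lam), (1.0 - lam / 2.0) / alpha)
print(tr_cond_moment4(0.0, alpha, lam), 2.0 * (1.0 - 0.75 * lam) / alpha ** 2)
```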

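2.7. Mean Deviation

The mean deviations about the mean and the median can be used as measures of spread in a population. Let µ = E(X) and M be the mean and the median of the transmuted Rayleigh distribution, respectively. The mean deviations about the mean and about the median can be calculated as

\[ \delta_1(X) = \int_0^\infty |x - \mu|\,g(x)\,dx \quad\text{and}\quad \delta_2(X) = \int_0^\infty |x - M|\,g(x)\,dx, \tag{10} \]

respectively. For any m > 0,

\[ E(|X - m|) = \int_0^\infty |x - m|\,g(x)\,dx = \int_0^m (m - x)g(x)\,dx + \int_m^\infty (x - m)g(x)\,dx \]

\[ = 2m\int_0^m g(x)\,dx - m - E(X) + 2\int_m^\infty x\,g(x)\,dx = 2mG(m) + 2\int_m^\infty x\,g(x)\,dx - E(X) - m. \]

Therefore, since we have $\mu = \frac{\sqrt{\pi}\left((\sqrt{2} - 2)\lambda + 2\right)}{4\sqrt{\alpha}}$ and $M = \sqrt{\frac{1}{\alpha}\log\left(\frac{2\lambda}{\sqrt{\lambda^2 + 1} + \lambda - 1}\right)}$,

\[ \delta_1(X) = 2\mu G(\mu) - 2\mu + 2\int_\mu^\infty x\,g(x)\,dx \]

\[ = \frac{\sqrt{\pi}}{2\sqrt{\alpha}}\left[\sqrt{2}\,\lambda\,\mathrm{Erfc}\left(\frac12\sqrt{\pi}\left(-\sqrt{2}\lambda + \lambda + \sqrt{2}\right)\right) + 2(1 - \lambda)\,\mathrm{Erfc}\left(\frac14\sqrt{\pi}\left((\sqrt{2} - 2)\lambda + 2\right)\right)\right], \tag{11} \]

\[ \delta_2(X) = 2\int_M^\infty x\,g(x)\,dx - \mu \]

\[ = \frac{1}{4\alpha}\left[4\sqrt{\alpha}\sqrt{\log\left(\frac{2\lambda}{\sqrt{\lambda^2 + 1} + \lambda - 1}\right)} + 2(1 - \lambda)\sqrt{\alpha\pi} + \lambda\sqrt{2\alpha\pi} - 2\lambda\sqrt{2\alpha\pi}\,\mathrm{Erf}\left(\sqrt{2}\sqrt{\log\left(\frac{2\lambda}{\sqrt{\lambda^2 + 1} + \lambda - 1}\right)}\right) - 4\sqrt{\alpha\pi}(1 - \lambda)\,\mathrm{Erf}\left(\sqrt{\log\left(\frac{2\lambda}{\sqrt{\lambda^2 + 1} + \lambda - 1}\right)}\right)\right]. \tag{12} \]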

2.8. Entropies

Entropy provides an excellent tool to quantify the amount of information (or uncertainty) contained in a random observation regarding its parent distribution (population). A large value of entropy implies greater uncertainty in the data. The concept of entropy is important in different areas such as physics, probability and statistics, communication theory, and economics. Several measures of entropy have been studied and compared in the literature. If X has the probability density function g(·), then the Shannon entropy (Shannon 1948) is defined by

\[ H(x) = -E(\ln g(x)) = -\int_0^\infty g(x)\ln g(x)\,dx = -\int_0^\infty g(x)\ln\left[2\alpha x e^{-\alpha x^2}\left(1 - \lambda + 2\lambda e^{-\alpha x^2}\right)\right]dx \]

\[ = -\frac14\left\{-2\left(\log(\alpha) + \lambda\log(2) + \gamma\right) + 4\log(2\alpha) + 2(\lambda - 2) + \frac{-2\lambda + (\lambda - 1)^2\left(-\log(1 - \lambda)\right) + (\lambda + 1)^2\log(\lambda + 1)}{\lambda}\right\}, \]

where γ denotes Euler's constant. The Rényi entropy (Renyi 1961) can be expressed as

\[ H_\beta(x) = \frac{1}{1 - \beta}\ln\left(\int_0^\infty g^\beta(x)\,dx\right), \quad \beta > 0,\ \beta \ne 1. \tag{13} \]

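Then $H_\beta(x)$ can be simplified as follows, where $Ga(a, x) = \int_x^\infty t^{a-1}e^{-t}\,dt$. For $-1 \le \lambda < \frac13$,

\[ H_\beta(x) = -\ln 2 - \ln\sqrt{\alpha} + \frac{1}{1 - \beta}\ln\Gamma\left(\frac{\beta + 1}{2}\right) + \frac{1}{1 - \beta}\ln\sum_{k=0}^{\infty}\binom{\beta}{k}(2\lambda)^k(1 - \lambda)^{\beta - k}(\beta + k)^{-\frac{\beta + 1}{2}}; \]

for $\frac13 < \lambda < 1$,

\[ H_\beta(x) = -\ln 2 - \ln\sqrt{\alpha} + \frac{1}{1 - \beta}\ln\sum_{k=0}^{\infty}\binom{\beta}{k}\Bigg\{2^k\lambda^k(1 - \lambda)^{\beta - k}(\beta + k)^{-\frac{\beta + 1}{2}}\,Ga\left(\frac{\beta + 1}{2}, (k + \beta)\ln\frac{2\lambda}{1 - \lambda}\right) \]

\[ \qquad + 2^{\beta - k}\lambda^{\beta - k}(1 - \lambda)^{k}(2\beta - k)^{-\frac{\beta + 1}{2}}\left[\Gamma\left(\frac{\beta + 1}{2}\right) - Ga\left(\frac{\beta + 1}{2}, (2\beta - k)\ln\frac{2\lambda}{1 - \lambda}\right)\right]\Bigg\}; \]

and for λ = 1,

\[ H_\beta(x) = -\frac32\ln 2 - \ln\sqrt{\alpha} - \frac{\beta + 1}{2(1 - \beta)}\ln\beta + \frac{1}{1 - \beta}\ln\Gamma\left(\frac{\beta + 1}{2}\right). \]

It can be noted that, when β → 1, the Rényi entropy converges to the Shannon entropy. For further details, see Song (2001).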

2.9. Order Statistics

Moments of order statistics play an important role in quality control testing and reliability, where they are used to predict the failure of future items based on the times of a few early failures. The kth order statistic of a sample is its kth smallest value. For a sample of size n, the nth order statistic (or largest order statistic) is the maximum, that is, $X_{(n)} = \max\{X_1, \ldots, X_n\}$. The sample range is the difference between the maximum and the minimum. It is clearly a function of the order statistics: $\mathrm{range}\{X_1, \ldots, X_n\} = X_{(n)} - X_{(1)}$. We know that if $X_{(1)} \le \cdots \le X_{(n)}$ denote the order statistics of a random sample $X_1, \ldots, X_n$ from a continuous population with cdf $G_X(x)$ and pdf $g_X(x)$, then the pdf of $X_{(j)}$ is given by

\[ g_{X_{(j)}}(x) = \frac{n!}{(j - 1)!(n - j)!}\,g_X(x)\left(G_X(x)\right)^{j - 1}\left(1 - G_X(x)\right)^{n - j}, \]

for j = 1, . . . , n. The pdf of the jth order statistic for a transmuted Rayleigh distribution is given by

\[ g_{X_{(j)}}(x) = \frac{n!}{(j - 1)!(n - j)!}\,2\alpha x e^{-\alpha x^2}\left(1 - \lambda + 2\lambda e^{-\alpha x^2}\right)\left[1 - e^{-\alpha x^2}\left(1 - \lambda + \lambda e^{-\alpha x^2}\right)\right]^{j - 1}\left[e^{-\alpha x^2}\left(1 - \lambda + \lambda e^{-\alpha x^2}\right)\right]^{n - j}. \]

Therefore, the pdf of the largest order statistic $X_{(n)}$ is

\[ g_{X_{(n)}}(x) = 2n\alpha x e^{-\alpha x^2}\left(1 - \lambda + 2\lambda e^{-\alpha x^2}\right)\left[1 - e^{-\alpha x^2}\left(1 - \lambda + \lambda e^{-\alpha x^2}\right)\right]^{n - 1}, \]

and the pdf of the smallest order statistic $X_{(1)}$ is

\[ g_{X_{(1)}}(x) = 2n\alpha x e^{-\alpha x^2}\left(1 - \lambda + 2\lambda e^{-\alpha x^2}\right)\left[e^{-\alpha x^2}\left(1 - \lambda + \lambda e^{-\alpha x^2}\right)\right]^{n - 1}. \]

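2.10. Distribution of Minimum, Maximum and Median

Let $X_1, X_2, \ldots, X_n$ be independent and identically distributed random variables from the TR distribution. The probability density functions of the first, last, and median order statistics (the latter for n = 2m + 1) are given by

\[ g_{1:n}(x) = n\left[1 - G(x, \Phi)\right]^{n - 1}g(x, \Phi) = 2n\alpha x e^{-\alpha x^2}\left(1 - \lambda + 2\lambda e^{-\alpha x^2}\right)\left[e^{-\alpha x^2}\left(1 - \lambda + \lambda e^{-\alpha x^2}\right)\right]^{n - 1}, \tag{14} \]

\[ g_{n:n}(x) = n\left[G(x_{(n)}, \Phi)\right]^{n - 1}g(x_{(n)}, \Phi) = 2n\alpha x\left[1 - e^{-\alpha x^2}\left(1 - \lambda + \lambda e^{-\alpha x^2}\right)\right]^{n - 1}e^{-\alpha x^2}\left(1 - \lambda + 2\lambda e^{-\alpha x^2}\right), \tag{15} \]

and

\[ g_{m+1:n}(\tilde{x}) = \frac{(2m + 1)!}{m!\,m!}\left(G(\tilde{x})\right)^m\left(1 - G(\tilde{x})\right)^m g(\tilde{x}) \]

\[ = \frac{(2m + 1)!}{m!\,m!}\left[1 - e^{-\alpha x^2}\left(1 - \lambda + \lambda e^{-\alpha x^2}\right)\right]^m\left[e^{-\alpha x^2}\left(1 - \lambda + \lambda e^{-\alpha x^2}\right)\right]^m 2\alpha x e^{-\alpha x^2}\left(1 - \lambda + 2\lambda e^{-\alpha x^2}\right), \tag{16} \]

where Φ = (α, λ).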

2.11. Stress Strength Parameter

In reliability theory, the stress-strength model describes the life of a component or item which has a random strength $X_1$ and is subjected to a random stress $X_2$. The component fails instantaneously when the stress applied to it surpasses the strength, and the component functions satisfactorily whenever $X_1 > X_2$. So, $R = \Pr(X_1 > X_2)$ is a measure of component reliability. It is applied in many areas, especially in engineering fields such as structures, deterioration of rocket motors, fatigue failure of aircraft structures, and the aging of concrete pressure vessels. Extensive work on estimation of the reliability of stress-strength models has been undertaken for the well-known standard distributions. However, there are still some distributions (including generalizations of the well-known distributions) for which the form of R has not been investigated. Here, we derive the reliability R when $X_1$ and $X_2$ are independent TR random variables with parameters $(\alpha_1, \lambda_1)$ and $(\alpha_2, \lambda_2)$, respectively. Then

\[ R = \int_0^\infty g_1(x)G_2(x)\,dx = \int_0^\infty 2\alpha_1 x e^{-\alpha_1 x^2}\left(1 - \lambda_1 + 2\lambda_1 e^{-\alpha_1 x^2}\right)\left[1 - e^{-\alpha_2 x^2}\left(1 - \lambda_2 + \lambda_2 e^{-\alpha_2 x^2}\right)\right]dx \tag{17} \]

\[ = \frac{\alpha_2\left[\alpha_1^2(2 - \lambda_1)(1 + \lambda_2) + \alpha_1\alpha_2\left(5 + \lambda_2 - \lambda_1(2 - \lambda_2)\right) + 2\alpha_2^2\right]}{(\alpha_1 + \alpha_2)(2\alpha_1 + \alpha_2)(\alpha_1 + 2\alpha_2)}. \]

If $\alpha_1 = \alpha_2 = \alpha$, then

\[ R = \frac{3 - \lambda_1 + \lambda_2}{6}, \tag{18} \]

and if $\lambda_1 = \lambda_2 = \lambda$, then

\[ R = \frac{\alpha_2\left[\alpha_1^2\left(-\lambda^2 + \lambda + 2\right) + \alpha_1\alpha_2\left(\lambda^2 - \lambda + 5\right) + 2\alpha_2^2\right]}{(\alpha_1 + \alpha_2)(2\alpha_1 + \alpha_2)(\alpha_1 + 2\alpha_2)}. \tag{19} \]
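The closed form of R is straightforward to evaluate. The sketch below implements the general expression following (17) and compares it with the special case (18); the function name `tr_stress_strength` and the parameter values are illustrative only.

```python
def tr_stress_strength(a1, l1, a2, l2):
    """R = Pr(X1 > X2) for independent X1 ~ TR(a1, l1) and X2 ~ TR(a2, l2)."""
    num = a2 * (
        a1 * a1 * (2.0 - l1) * (1.0 + l2)
        + a1 * a2 * (5.0 + l2 - l1 * (2.0 - l2))
        + 2.0 * a2 * a2
    )
    den = (a1 + a2) * (2.0 * a1 + a2) * (a1 + 2.0 * a2)
    return num / den

# Equal scale parameters: the general formula should agree with (18).
r_general = tr_stress_strength(0.8, 0.3, 0.8, -0.6)
r_special = (3.0 - 0.3 + (-0.6)) / 6.0
print(r_general, r_special)

# Two Rayleigh variables (both lambdas zero) give a2 / (a1 + a2).
print(tr_stress_strength(1.0, 0.0, 2.0, 0.0))
```

When both variables have the same parameters, the formula returns 1/2, as expected by symmetry.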

3. Methods of Estimation

In this section, we describe ten methods to estimate the parameters α and λ of the TR distribution. We assume throughout that $x = (x_1, \ldots, x_n)$ is a random sample of size n from the TR distribution and that both parameters α and λ are unknown.

3.1. Method of Maximum Likelihood Estimation

The ML method is the most frequently used method of parameter estimation (Casella & Berger 1990). Its success stems from its many desirable properties, including consistency, asymptotic efficiency, and invariance, as well as its intuitive appeal. Let $x_1, \ldots, x_n$ be a random sample of size n from (2). The likelihood function of the density (2) is given by

\[ L(\alpha, \lambda; x) = \prod_{i=1}^{n} g(x_i, \alpha, \lambda) = 2^n\alpha^n\exp\left(-\alpha\sum_{i=1}^{n}x_i^2\right)\prod_{i=1}^{n}x_i\prod_{i=1}^{n}\left(1 - \lambda + 2\lambda e^{-\alpha x_i^2}\right). \tag{20} \]

The log-likelihood function without constant terms is given by

\[ \ell(\alpha, \lambda; x) = \log L(\alpha, \lambda; x) = \sum_{i=1}^{n}\log(x_i) + n\log\alpha - \alpha\sum_{i=1}^{n}x_i^2 + \sum_{i=1}^{n}\log\left(1 - \lambda + 2\lambda e^{-\alpha x_i^2}\right). \]

For ease of notation, we denote the first partial derivatives of any function f(x, y) by $f_x$ and $f_y$, and its second partial derivatives by $f_{xx}$, $f_{yy}$, $f_{xy}$, and $f_{yx}$. Now setting $\ell_\alpha = 0$ and $\ell_\lambda = 0$, we have

\[ \ell_\alpha = \frac{n}{\alpha} - \sum_{i=1}^{n}x_i^2 - 2\lambda\sum_{i=1}^{n}\frac{x_i^2 e^{-\alpha x_i^2}}{1 - \lambda + 2\lambda e^{-\alpha x_i^2}} = 0, \tag{21} \]

and

\[ \ell_\lambda = \sum_{i=1}^{n}\frac{2e^{-\alpha x_i^2} - 1}{1 - \lambda + 2\lambda e^{-\alpha x_i^2}} = 0. \tag{22} \]

The MLE $\hat\theta = (\hat\alpha, \hat\lambda)$ of θ = (α, λ) is obtained by solving this nonlinear system of equations. It is usually more convenient to use nonlinear optimization algorithms, such as the quasi-Newton algorithm, to numerically maximize the sample likelihood function given in (20).
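Before handing the log-likelihood to an optimizer, it is worth confirming that the score equations (21) and (22) match it. The following sketch codes both and compares the analytic score with central finite differences; the function names and the small data vector are hypothetical and serve only as an illustration.

```python
import math

def tr_loglik(alpha, lam, xs):
    """Log-likelihood without the constant n*log(2), as in Section 3.1."""
    total = 0.0
    for x in xs:
        u = math.exp(-alpha * x * x)
        total += math.log(x) + math.log(alpha) - alpha * x * x + math.log(1.0 - lam + 2.0 * lam * u)
    return total

def tr_score(alpha, lam, xs):
    """Partial derivatives of the log-likelihood, equations (21) and (22)."""
    d_alpha = len(xs) / alpha
    d_lam = 0.0
    for x in xs:
        u = math.exp(-alpha * x * x)
        w = 1.0 - lam + 2.0 * lam * u
        d_alpha -= x * x + 2.0 * lam * x * x * u / w
        d_lam += (2.0 * u - 1.0) / w
    return d_alpha, d_lam

xs = [0.3, 0.8, 1.1, 1.7, 2.2]    # hypothetical observations
alpha, lam, h = 0.6, 0.4, 1e-6    # arbitrary evaluation point
fd_alpha = (tr_loglik(alpha + h, lam, xs) - tr_loglik(alpha - h, lam, xs)) / (2.0 * h)
fd_lam = (tr_loglik(alpha, lam + h, xs) - tr_loglik(alpha, lam - h, xs)) / (2.0 * h)
print(tr_score(alpha, lam, xs), (fd_alpha, fd_lam))
```

Any general-purpose optimizer that respects the constraints α > 0 and |λ| ≤ 1 could then be applied to `tr_loglik`; the choice of optimizer is left open here.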

3.2. Method of Moment Estimation

The MMEs of the two-parameter TR distribution can be obtained by equating the first two theoretical moments of (2) with the sample moments $\frac{1}{n}\sum_{i=1}^{n}x_i$ and $\frac{1}{n}\sum_{i=1}^{n}x_i^2$, respectively:

\[ \frac{1}{n}\sum_{i=1}^{n}x_i = \alpha^{-1/2}\,\Gamma\left(1 + \frac12\right)\left(1 - \lambda + \lambda 2^{-1/2}\right), \tag{23} \]

and

\[ \frac{1}{n}\sum_{i=1}^{n}x_i^2 = \alpha^{-1}\left(1 - \lambda + \lambda 2^{-1}\right). \tag{24} \]

3.3. Method of Least-Square Estimation

The least square estimators and weighted least square estimators were proposed by Swain, Venkatraman & Wilson (1988) to estimate the parameters of Beta distributions. In this paper, we apply the same technique to the TR distribution. Suppose $F(X_{(j)})$ denotes the distribution function of the ordered random variables $X_{(1)} < X_{(2)} < \cdots < X_{(n)}$, where $\{X_1, X_2, \ldots, X_n\}$ is a random sample of size n from a distribution function F(·). The least square estimators of the unknown parameters α and λ can be obtained by minimizing

\[ \sum_{j=1}^{n}\left[F(X_{(j)}) - \frac{j}{n + 1}\right]^2 \]

with respect to α and λ. Therefore, in this case, the least square estimators of α and λ, say $\hat\alpha_{LSE}$ and $\hat\lambda_{LSE}$ respectively, can be obtained by minimizing

\[ \sum_{j=1}^{n}\left[1 - e^{-\alpha x_{(j)}^2}\left(1 - \lambda + \lambda e^{-\alpha x_{(j)}^2}\right) - \frac{j}{n + 1}\right]^2 \]

with respect to α and λ. The weighted least square estimators of the unknown parameters can be obtained by minimizing

\[ \sum_{j=1}^{n}w_j\left[F(X_{(j)}) - \frac{j}{n + 1}\right]^2 \]

with respect to α and λ. The weights $w_j$ are equal to $\frac{1}{V(X_{(j)})} = \frac{(n + 1)^2(n + 2)}{j(n - j + 1)}$. Therefore, in this case, the weighted least square estimators of α and λ, say $\hat\alpha_{WLSE}$ and $\hat\lambda_{WLSE}$ respectively, can be obtained by minimizing

\[ \sum_{j=1}^{n}\frac{(n + 1)^2(n + 2)}{j(n - j + 1)}\left[1 - e^{-\alpha x_{(j)}^2}\left(1 - \lambda + \lambda e^{-\alpha x_{(j)}^2}\right) - \frac{j}{n + 1}\right]^2 \]

with respect to α and λ.

3.4. Method of Percentile Estimation

If the data come from a distribution function which has a closed form, then we can estimate the unknown parameters by fitting a straight line to the theoretical points obtained from the distribution function and the sample percentile points. This method was originally suggested by Kao (1958, 1959) and it has been used for the Weibull distribution and for the generalized exponential distribution. In this paper, we apply the same technique to the TR distribution. Since

$$G(x; \alpha, \lambda) = 1 - e^{-\alpha x^{2}}\left(1-\lambda+\lambda e^{-\alpha x^{2}}\right),$$

solving $G(x; \alpha, \lambda) = p$ for x gives

$$x = \left[-\frac{1}{\alpha}\ln\left(\frac{(\lambda-1)+\sqrt{(\lambda-1)^{2}+4\lambda(1-p)}}{2\lambda}\right)\right]^{1/2}.$$

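Let $X_{(j)}$ be the jth order statistic, i.e., $X_{(1)} < X_{(2)} < \cdots < X_{(n)}$. If $p_j$ denotes an estimate of $G(x_{(j)}; \alpha, \lambda)$, then the estimates of α and λ can be obtained by minimizing

$$\sum_{j=1}^{n}\left\{x_{(j)} - \left[-\frac{1}{\alpha}\ln\left(\frac{(\lambda-1)+\sqrt{(\lambda-1)^{2}+4\lambda(1-p_j)}}{2\lambda}\right)\right]^{1/2}\right\}^{2}$$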

with respect to α and λ. The estimates of α and λ can be obtained by solving the following nonlinear equations

$$\sum_{j=1}^{n}\frac{\log q_j + x_{(j)}\sqrt{-\alpha\log q_j}}{\alpha^{2}} = 0,$$

$$\sum_{j=1}^{n}\frac{\left[\sqrt{-\dfrac{\log q_j}{\alpha}} - x_{(j)}\right]\left[r_j + \lambda(2p_j-1) - 1\right]}{\alpha\lambda\, r_j\,(\lambda + r_j - 1)\sqrt{-\dfrac{\log q_j}{\alpha}}} = 0,$$

respectively, where

$$r_j = \sqrt{\lambda^{2}+2\lambda-4\lambda p_j+1}, \qquad q_j = \frac{\lambda - 1 + r_j}{2\lambda}.$$

We call the corresponding estimators the percentile estimators or PCEs. Several estimators of $p_j$ can be used in this case; see, for example, Mann, Schafer & Singpurwalla (1974). In this paper, we consider $p_j = \frac{j}{n+1}$.
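As a rough illustration of the percentile criterion, the sketch below codes the quantile function and the sum of squared differences between the ordered data and the fitted quantiles at $p_j = j/(n+1)$. The function names are hypothetical, and the λ = 0 branch (plain Rayleigh) is an added convenience to avoid a division by zero; it is not part of the derivation above.

```python
import math


def tr_cdf(x, alpha, lam):
    """TR distribution function: 1 - e^{-a x^2} (1 - lam + lam e^{-a x^2})."""
    e = math.exp(-alpha * x * x)
    return 1.0 - e * (1.0 - lam + lam * e)


def tr_quantile(p, alpha, lam):
    """Quantile of the TR distribution for 0 < p < 1."""
    if lam == 0.0:
        q = 1.0 - p  # plain Rayleigh case, handled separately
    else:
        disc = math.sqrt((lam - 1.0) ** 2 + 4.0 * lam * (1.0 - p))
        q = ((lam - 1.0) + disc) / (2.0 * lam)
    return math.sqrt(-math.log(q) / alpha)


def pce_objective(params, data):
    """Sum of squared gaps between order statistics and fitted quantiles."""
    alpha, lam = params
    xs = sorted(data)
    n = len(xs)
    return sum((x - tr_quantile(j / (n + 1), alpha, lam)) ** 2
               for j, x in enumerate(xs, start=1))
```

A quick sanity check is that `tr_cdf(tr_quantile(p, a, l), a, l)` returns `p` up to rounding, for positive and negative λ alike.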

3.5. Method of L-Moments Estimation

In this section, we provide the L-moments estimators, which can be obtained as linear combinations of order statistics. The L-moments estimators were originally proposed by Hosking (1990), and it is observed that they are more robust than the usual moment estimators. The L-moment estimators are obtained in the same way as the ordinary moment estimators, i.e., by equating the sample L-moments with the population L-moments. L-moment estimation thus provides an alternative method of estimation that is analogous to conventional moments. It has the advantage that it exists whenever the mean of the distribution exists, even though some higher moments may not exist, and it is relatively robust to the effects of outliers (Hosking 1994). Hosking (1990) states that the L-moment estimators are reasonably efficient when compared to the maximum likelihood estimators for distributions such as the normal distribution, the Gumbel distribution, and the GEV distribution.

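In this case, the L-moments estimators can be obtained by equating the first two sample L-moments with the corresponding population L-moments. The first two sample L-moments are

$$l_1 = \frac{1}{n}\sum_{i=1}^{n} x_{(i)}, \qquad l_2 = \frac{2}{n(n-1)}\sum_{i=1}^{n}(i-1)\,x_{(i)} - l_1,$$

and the first two population L-moments are

$$\lambda_1 = E(X_{1:1}) = E(X) = \alpha^{-1/2}\,\Gamma\!\left(1+\tfrac{1}{2}\right)\left[1-\lambda+\lambda 2^{-1/2}\right], \qquad (25)$$

$$\lambda_2 = \frac{1}{2}\left[E(X_{2:2}) - E(X_{1:2})\right], \qquad (26)$$

where

$$E(X_{2:2}) = 4\alpha\int_{0}^{\infty} x^{2} e^{-\alpha x^{2}}\left(1-\lambda+2\lambda e^{-\alpha x^{2}}\right)\left[1 - e^{-\alpha x^{2}}\left(1-\lambda+\lambda e^{-\alpha x^{2}}\right)\right]dx$$

$$= -\frac{\sqrt{\pi}\left[\left(3+3\sqrt{2}-4\sqrt{3}\right)\lambda^{2} + 4\left(3-3\sqrt{2}+\sqrt{3}\right)\lambda + 3\left(\sqrt{2}-4\right)\right]}{12\sqrt{\alpha}},$$

and

$$E(X_{1:2}) = 4\alpha\int_{0}^{\infty} x^{2} e^{-2\alpha x^{2}}\left(1-\lambda+2\lambda e^{-\alpha x^{2}}\right)\left(1-\lambda+\lambda e^{-\alpha x^{2}}\right)dx$$

$$= \frac{\sqrt{\pi}\left[\lambda\left(\left(3+3\sqrt{2}-4\sqrt{3}\right)\lambda - 6\sqrt{2}+4\sqrt{3}\right) + 3\sqrt{2}\right]}{12\sqrt{\alpha}}.$$

The L-moments estimators $\hat{\alpha}_{LME}$ and $\hat{\lambda}_{LME}$ of the parameters α and λ can be obtained by numerically solving the following equations:

$$\lambda_1\left(\hat{\alpha}_{LME}, \hat{\lambda}_{LME}\right) = l_1, \qquad \lambda_2\left(\hat{\alpha}_{LME}, \hat{\lambda}_{LME}\right) = l_2. \qquad (27)$$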
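The quantities entering equations (25) to (27) can be computed directly. The following sketch uses hypothetical function names, evaluates the sample L-moments and the closed-form population L-moments, and leaves the numerical solution of (27) to a root finder of the reader's choice.

```python
import math


def sample_l_moments(data):
    """First two sample L-moments l1 and l2."""
    xs = sorted(data)
    n = len(xs)
    l1 = sum(xs) / n
    l2 = 2.0 / (n * (n - 1)) * sum((i - 1) * x
                                   for i, x in enumerate(xs, start=1)) - l1
    return l1, l2


def population_l_moments(alpha, lam):
    """First two population L-moments of TR(alpha, lam), equations (25)-(26)."""
    s2, s3 = math.sqrt(2.0), math.sqrt(3.0)
    c = math.sqrt(math.pi) / (12.0 * math.sqrt(alpha))
    a = 3.0 + 3.0 * s2 - 4.0 * s3
    e_max = -c * (a * lam ** 2 + 4.0 * (3.0 - 3.0 * s2 + s3) * lam
                  + 3.0 * (s2 - 4.0))                      # E(X_{2:2})
    e_min = c * (lam * (a * lam - 6.0 * s2 + 4.0 * s3) + 3.0 * s2)  # E(X_{1:2})
    lam1 = math.gamma(1.5) / math.sqrt(alpha) * (1.0 - lam + lam / s2)
    lam2 = 0.5 * (e_max - e_min)
    return lam1, lam2
```

For the sample {1, 2, 3}, `sample_l_moments` gives $l_1 = 2$ and $l_2 = 2/3$, which is half the mean absolute difference between pairs.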

3.6. Method of Maximum Product of Spacings

Cheng & Amin (1979, 1983) introduced the maximum product of spacings (MPS) method as an alternative to MLE for the estimation of parameters of continuous univariate distributions. Ranneby (1984) independently developed the same method as an approximation to the Kullback-Leibler measure of information. Using the same notation as in subsection 3.4, the uniform spacings of a random sample from the TR distribution are defined as

$$D_i(\alpha, \lambda) = F(x_{i:n} \mid \alpha, \lambda) - F(x_{i-1:n} \mid \alpha, \lambda), \qquad i = 1, 2, \ldots, n+1,$$

where $F(x_{0:n} \mid \alpha, \lambda) = 0$ and $F(x_{n+1:n} \mid \alpha, \lambda) = 1$. Clearly $\sum_{i=1}^{n+1} D_i(\alpha, \lambda) = 1$.


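Following Cheng & Amin (1983), the maximum product of spacings estimators $\hat{\alpha}_{MPS}$ and $\hat{\lambda}_{MPS}$ of the parameters α and λ are obtained by maximizing, with respect to α and λ, the geometric mean of the spacings:

$$G(\alpha, \lambda) = \left[\prod_{i=1}^{n+1} D_i(\alpha, \lambda)\right]^{\frac{1}{n+1}}, \qquad (28)$$

or, equivalently, by maximizing the function

$$H(\alpha, \lambda) = \frac{1}{n+1}\sum_{i=1}^{n+1}\log D_i(\alpha, \lambda). \qquad (29)$$

The estimators $\hat{\alpha}_{MPS}$ and $\hat{\lambda}_{MPS}$ of the parameters α and λ can be obtained by solving the nonlinear equations

$$\frac{\partial}{\partial\alpha}H(\alpha, \lambda) = \frac{1}{n+1}\sum_{i=1}^{n+1}\frac{1}{D_i(\alpha, \lambda)}\left[\Delta_1(x_{i:n} \mid \alpha, \lambda) - \Delta_1(x_{i-1:n} \mid \alpha, \lambda)\right] = 0, \qquad (30)$$

$$\frac{\partial}{\partial\lambda}H(\alpha, \lambda) = \frac{1}{n+1}\sum_{i=1}^{n+1}\frac{1}{D_i(\alpha, \lambda)}\left[\Delta_2(x_{i:n} \mid \alpha, \lambda) - \Delta_2(x_{i-1:n} \mid \alpha, \lambda)\right] = 0, \qquad (31)$$

where

$$\Delta_1(x_{i:n} \mid \alpha, \lambda) = x_{i:n}^{2}e^{-\alpha x_{i:n}^{2}} - \lambda x_{i:n}^{2}e^{-\alpha x_{i:n}^{2}} + 2\lambda x_{i:n}^{2}e^{-2\alpha x_{i:n}^{2}} \qquad (32)$$

and

$$\Delta_2(x_{i:n} \mid \alpha, \lambda) = e^{-\alpha x_{i:n}^{2}} - e^{-2\alpha x_{i:n}^{2}}. \qquad (33)$$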
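A minimal sketch of the criterion H in equation (29) follows. The function names are hypothetical, and returning negative infinity when a spacing is not positive (for example, with tied observations) is an implementation choice made here so that a maximizer simply rejects such points.

```python
import math


def tr_cdf(x, alpha, lam):
    """TR distribution function: 1 - e^{-a x^2} (1 - lam + lam e^{-a x^2})."""
    e = math.exp(-alpha * x * x)
    return 1.0 - e * (1.0 - lam + lam * e)


def mps_objective(params, data):
    """Mean log-spacing H(alpha, lam); larger is better."""
    alpha, lam = params
    xs = sorted(data)
    n = len(xs)
    # F(x_{0:n}) = 0 and F(x_{n+1:n}) = 1 close the sequence of CDF values
    cdf = [0.0] + [tr_cdf(x, alpha, lam) for x in xs] + [1.0]
    spacings = [cdf[i] - cdf[i - 1] for i in range(1, n + 2)]
    if min(spacings) <= 0.0:
        return float("-inf")  # ties or invalid parameters give a zero spacing
    return sum(math.log(d) for d in spacings) / (n + 1)
```

The zero-spacing branch shows concretely why tied observations have to be dealt with before the MPS method can be applied.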

Cheng & Amin (1983) showed that maximizing H as a method of parameter estimation is as efficient as MLE estimation and that the MPS estimators are consistent under more general conditions than the MLE estimators.

3.6.1. Method of Cramér-von-Mises

Our choice of Cramér-von-Mises type minimum distance estimators is motivated by Macdonald (1971), who provided empirical evidence that the bias of this estimator is smaller than that of the other minimum distance estimators. The Cramér-von-Mises estimates $\hat{\alpha}_{CME}$ and $\hat{\lambda}_{CME}$ of the parameters α and λ are obtained by minimizing, with respect to α and λ, the function:

$$C(\alpha, \lambda) = \frac{1}{12n} + \sum_{i=1}^{n}\left[F(x_{i:n} \mid \alpha, \lambda) - \frac{2i-1}{2n}\right]^{2}. \qquad (34)$$

These estimators can also be obtained by solving the non-linear equations:

$$\sum_{i=1}^{n}\left[F(x_{i:n} \mid \alpha, \lambda) - \frac{2i-1}{2n}\right]\Delta_1(x_{i:n} \mid \alpha, \lambda) = 0,$$

$$\sum_{i=1}^{n}\left[F(x_{i:n} \mid \alpha, \lambda) - \frac{2i-1}{2n}\right]\Delta_2(x_{i:n} \mid \alpha, \lambda) = 0,$$

where $\Delta_1(\cdot \mid \alpha, \lambda)$ and $\Delta_2(\cdot \mid \alpha, \lambda)$ are given by (32) and (33), respectively.
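The Cramér-von-Mises criterion and its two estimating equations can be sketched as follows. The names are hypothetical; `delta1` and `delta2` are taken to be the partial derivatives of F with respect to α and λ, which is how (32) and (33) are used in the estimating equations.

```python
import math


def tr_cdf(x, alpha, lam):
    """TR distribution function: 1 - e^{-a x^2} (1 - lam + lam e^{-a x^2})."""
    e = math.exp(-alpha * x * x)
    return 1.0 - e * (1.0 - lam + lam * e)


def delta1(x, alpha, lam):
    """Partial derivative of F with respect to alpha, equation (32)."""
    e = math.exp(-alpha * x * x)
    return (1.0 - lam) * x * x * e + 2.0 * lam * x * x * e * e


def delta2(x, alpha, lam):
    """Partial derivative of F with respect to lam, equation (33)."""
    e = math.exp(-alpha * x * x)
    return e - e * e


def cvm_objective(params, data):
    """Cramer-von-Mises criterion C(alpha, lam), equation (34)."""
    alpha, lam = params
    xs = sorted(data)
    n = len(xs)
    return 1.0 / (12 * n) + sum(
        (tr_cdf(x, alpha, lam) - (2 * i - 1) / (2 * n)) ** 2
        for i, x in enumerate(xs, start=1))


def cvm_score(params, data):
    """Left-hand sides of the two estimating equations (half the gradient of C)."""
    alpha, lam = params
    xs = sorted(data)
    n = len(xs)
    s1 = s2 = 0.0
    for i, x in enumerate(xs, start=1):
        r = tr_cdf(x, alpha, lam) - (2 * i - 1) / (2 * n)
        s1 += r * delta1(x, alpha, lam)
        s2 += r * delta2(x, alpha, lam)
    return s1, s2
```

Comparing `cvm_score` with a finite-difference gradient of `cvm_objective` is a cheap way to check an implementation before handing it to a solver.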


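3.6.2. Methods of Anderson-Darling and Right-tail Anderson-Darling

The Anderson-Darling test was developed by Anderson & Darling (1952) as an alternative to other statistical tests for detecting departures of sample distributions from normality. Specifically, the AD test converges very quickly towards the asymptote (Anderson & Darling 1954, Pettitt 1976, Stephens 2013). Throughout this subsection, $\bar{F}(\cdot \mid \alpha, \lambda) = 1 - F(\cdot \mid \alpha, \lambda)$ denotes the survival function.

The Anderson-Darling estimators $\hat{\alpha}_{ADE}$ and $\hat{\lambda}_{ADE}$ of the parameters α and λ are obtained by minimizing, with respect to α and λ, the function:

$$A(\alpha, \lambda) = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\log F(x_{i:n} \mid \alpha, \lambda) + \log \bar{F}(x_{n+1-i:n} \mid \alpha, \lambda)\right]. \qquad (35)$$

These estimators can also be obtained by solving the non-linear equations:

$$\sum_{i=1}^{n}(2i-1)\left[\frac{\Delta_1(x_{i:n} \mid \alpha, \lambda)}{F(x_{i:n} \mid \alpha, \lambda)} - \frac{\Delta_1(x_{n+1-i:n} \mid \alpha, \lambda)}{\bar{F}(x_{n+1-i:n} \mid \alpha, \lambda)}\right] = 0,$$

$$\sum_{i=1}^{n}(2i-1)\left[\frac{\Delta_2(x_{i:n} \mid \alpha, \lambda)}{F(x_{i:n} \mid \alpha, \lambda)} - \frac{\Delta_2(x_{n+1-i:n} \mid \alpha, \lambda)}{\bar{F}(x_{n+1-i:n} \mid \alpha, \lambda)}\right] = 0,$$

where $\Delta_1(\cdot \mid \alpha, \lambda)$ and $\Delta_2(\cdot \mid \alpha, \lambda)$ are given by (32) and (33), respectively.

The Right-tail Anderson-Darling estimates $\hat{\alpha}_{RTADE}$ and $\hat{\lambda}_{RTADE}$ of the parameters α and λ are obtained by minimizing, with respect to α and λ, the function:

$$R(\alpha, \lambda) = \frac{n}{2} - 2\sum_{i=1}^{n}F(x_{i:n} \mid \alpha, \lambda) - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\log \bar{F}(x_{n+1-i:n} \mid \alpha, \lambda). \qquad (36)$$

These estimators can also be obtained by solving the non-linear equations:

$$-2\sum_{i=1}^{n}\Delta_1(x_{i:n} \mid \alpha, \lambda) + \frac{1}{n}\sum_{i=1}^{n}(2i-1)\frac{\Delta_1(x_{n+1-i:n} \mid \alpha, \lambda)}{\bar{F}(x_{n+1-i:n} \mid \alpha, \lambda)} = 0,$$

$$-2\sum_{i=1}^{n}\Delta_2(x_{i:n} \mid \alpha, \lambda) + \frac{1}{n}\sum_{i=1}^{n}(2i-1)\frac{\Delta_2(x_{n+1-i:n} \mid \alpha, \lambda)}{\bar{F}(x_{n+1-i:n} \mid \alpha, \lambda)} = 0,$$

where $\Delta_1(\cdot \mid \alpha, \lambda)$ and $\Delta_2(\cdot \mid \alpha, \lambda)$ are given by (32) and (33), respectively.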
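The two Anderson-Darling type criteria can be sketched as below. The function names are hypothetical, and the second logarithm in each criterion is taken of the survival function 1 − F, as in the standard Anderson-Darling statistic; observations with F equal to 0 or 1 are assumed not to occur.

```python
import math


def tr_cdf(x, alpha, lam):
    """TR distribution function: 1 - e^{-a x^2} (1 - lam + lam e^{-a x^2})."""
    e = math.exp(-alpha * x * x)
    return 1.0 - e * (1.0 - lam + lam * e)


def ad_objective(params, data):
    """Anderson-Darling criterion A(alpha, lam), equation (35)."""
    alpha, lam = params
    xs = sorted(data)
    n = len(xs)
    F = [tr_cdf(x, alpha, lam) for x in xs]
    # F[i - 1] is F(x_{i:n}); F[n - i] is F(x_{n+1-i:n})
    s = sum((2 * i - 1) * (math.log(F[i - 1]) + math.log(1.0 - F[n - i]))
            for i in range(1, n + 1))
    return -n - s / n


def rtad_objective(params, data):
    """Right-tail Anderson-Darling criterion R(alpha, lam), equation (36)."""
    alpha, lam = params
    xs = sorted(data)
    n = len(xs)
    F = [tr_cdf(x, alpha, lam) for x in xs]
    s = sum((2 * i - 1) * math.log(1.0 - F[n - i]) for i in range(1, n + 1))
    return n / 2.0 - 2.0 * sum(F) - s / n
```

With a single observation at the fitted median, the two criteria reduce to $-1 + 2\ln 2$ and $-\tfrac{1}{2} + \ln 2$, which gives an easy hand check.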

4. Simulation

We conduct Monte Carlo simulation studies to compare the performance of the estimators discussed in the previous sections. We evaluate the performance of the estimators based on bias, root mean squared error, the average absolute difference between the theoretical and empirical estimates of the distribution function, and the maximum absolute difference between the theoretical and empirical distribution functions. Methods are compared for sample sizes n = 20, 50, 100, and 200. We first generate 10,000 independent samples of size n from the transmuted Rayleigh (TR) distribution with parameters α = 0.5, 1, 3, 5, 10 and λ = 0.7. We observed that 10,000 repetitions were sufficient to provide stable results. For every sample, we first estimated the parameters using the method of maximum likelihood; for all other methods, the maximum likelihood estimates were used as the initial values. Also, the same randomly generated samples were used to compare all the estimation methods. The results of the simulation studies are reported in Tables 1-5.

The pseudo-random numbers were generated from the Transmuted Rayleigh distribution using the following function:

$$x = \left[-\frac{1}{\alpha}\ln\left(\frac{(\lambda-1)+\sqrt{(\lambda-1)^{2}+4\lambda(1-u)}}{2\lambda}\right)\right]^{1/2}, \qquad (37)$$

where $u \sim \mathrm{UNIF}(0, 1)$.

For each estimate, the four statistics listed above are obtained using the following formulae, where R is the number of simulated samples and $x_{ij}$ is the jth observation of the ith sample:

$$\mathrm{Bias}(\hat{\alpha}) = \frac{1}{R}\sum_{i=1}^{R}(\hat{\alpha}_i - \alpha), \qquad \mathrm{Bias}(\hat{\lambda}) = \frac{1}{R}\sum_{i=1}^{R}(\hat{\lambda}_i - \lambda), \qquad (38)$$

$$\mathrm{RMSE}(\hat{\alpha}) = \sqrt{\frac{1}{R}\sum_{i=1}^{R}(\hat{\alpha}_i - \alpha)^{2}}, \qquad \mathrm{RMSE}(\hat{\lambda}) = \sqrt{\frac{1}{R}\sum_{i=1}^{R}(\hat{\lambda}_i - \lambda)^{2}}, \qquad (39)$$

$$D_{abs} = \frac{1}{nR}\sum_{i=1}^{R}\sum_{j=1}^{n}\left|F(x_{ij} \mid \alpha, \lambda) - F(x_{ij} \mid \hat{\alpha}, \hat{\lambda})\right|, \qquad (40)$$

$$D_{max} = \frac{1}{R}\sum_{i=1}^{R}\max_{j}\left|F(x_{ij} \mid \alpha, \lambda) - F(x_{ij} \mid \hat{\alpha}, \hat{\lambda})\right|. \qquad (41)$$
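The sampling rule (37) and the four summary statistics (38) to (41) can be sketched as follows. All names are hypothetical, the sampler assumes λ ≠ 0, and the estimation step itself is left out: `fits` is assumed to hold one pair of estimates per simulated sample, however obtained.

```python
import math
import random


def tr_cdf(x, alpha, lam):
    """TR distribution function: 1 - e^{-a x^2} (1 - lam + lam e^{-a x^2})."""
    e = math.exp(-alpha * x * x)
    return 1.0 - e * (1.0 - lam + lam * e)


def tr_sample(n, alpha, lam, rng):
    """Inverse-transform draws following equation (37); assumes lam != 0."""
    out = []
    for _ in range(n):
        u = rng.random()
        disc = math.sqrt((lam - 1.0) ** 2 + 4.0 * lam * (1.0 - u))
        q = ((lam - 1.0) + disc) / (2.0 * lam)
        # max(...) guards against a tiny negative value caused by rounding
        out.append(math.sqrt(max(0.0, -math.log(q) / alpha)))
    return out


def bias_rmse(estimates, truth):
    """Bias and RMSE over R replicate estimates, equations (38)-(39)."""
    r = len(estimates)
    bias = sum(e - truth for e in estimates) / r
    rmse = math.sqrt(sum((e - truth) ** 2 for e in estimates) / r)
    return bias, rmse


def d_abs_d_max(samples, fits, truth):
    """D_abs and D_max, equations (40)-(41), for equal-sized samples."""
    tot_abs = tot_max = 0.0
    for xs, (a_hat, l_hat) in zip(samples, fits):
        diffs = [abs(tr_cdf(x, *truth) - tr_cdf(x, a_hat, l_hat)) for x in xs]
        tot_abs += sum(diffs) / len(diffs)
        tot_max += max(diffs)
    return tot_abs / len(samples), tot_max / len(samples)
```

Passing a seeded `random.Random` instance as `rng` makes a simulation run reproducible and lets every estimation method see the same samples.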

Simulated bias, RMSE, $D_{abs}$, and $D_{max}$ for the estimates are presented in Tables 1-5. The entry labeled $\sum$ Ranks shows the partial sum of the ranks. A superscript indicates the rank of each estimator among all the estimators for that metric. For example, Table 1 shows the bias of MLE($\hat{\alpha}$) as $0.295^{10}$ for n = 20. This indicates that the bias of $\hat{\alpha}$ obtained using the method of maximum likelihood ranked 10th among all the estimators. Table 6 shows the partial and overall ranks of the estimators; it is used to assess the overall performance of the estimation techniques.


The following observations can be drawn from Tables 1-5.

1. All the estimators show the property of consistency, i.e., the RMSE decreases as the sample size increases.
2. The bias of $\hat{\alpha}$ decreases with increasing n for all the methods of estimation.
3. The bias of $\hat{\lambda}$ decreases with increasing n for all the methods of estimation.
4. The bias of $\hat{\alpha}$ generally increases with increasing α for any given n and for all methods of estimation.
5. In terms of RMSE, all the methods of estimation produce smaller RMSE for $\hat{\alpha}$ compared to that of $\hat{\lambda}$.
6. $D_{abs}$ is smaller than $D_{max}$ for all the estimation techniques. Again, these statistics get smaller as the sample size increases.
7. In terms of performance of the methods of estimation, we found that the maximum product of spacings (MPS) estimator is the best method, as it produces the least biased estimates with the least RMSE for most of the configurations considered in our studies. The next best method is the percentile estimator (PCE), followed by the right-tail Anderson-Darling estimator. The weighted least squares (WLS) method ranked 4th while MLE ranked 5th. The method of Cramér-von-Mises ranked 10th among the ten methods of estimation.
8. While MPS uniformly performed the best for all values of n and α, the PCE and WLS methods performed the best for α ≤ 5. Their performance degraded for α = 10.

The overall positions of the estimators are presented in Table 6, from which we can confirm the superiority of MPS and PCE.

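Table 1: Simulation results for α = 0.5 and λ = 0.7.

| n | Method | Bias(α̂) | RMSE(α̂) | Bias(λ̂) | RMSE(λ̂) | D_abs | D_max | Σ Ranks |
|---|---|---|---|---|---|---|---|---|
| 20 | MLE | 0.295^{10} | 0.412^{9} | −0.618^{9} | 0.757^{6} | 0.050^{1} | 0.081^{4} | 39^{6} |
| 20 | MME | 0.283^{9} | 0.416^{10} | −0.628^{10} | 0.849^{8} | 0.054^{9} | 0.088^{10} | 56^{10} |
| 20 | LME | 0.269^{8} | 0.401^{8} | −0.577^{8} | 0.764^{7} | 0.052^{5} | 0.083^{6} | 42^{7} |
| 20 | LSE | 0.202^{3} | 0.305^{3} | −0.429^{4} | 0.611^{3} | 0.053^{7} | 0.082^{5} | 25^{4} |
| 20 | WLS | 0.207^{5} | 0.336^{4} | −0.443^{5} | 0.618^{4} | 0.051^{3} | 0.080^{3} | 24^{3} |
| 20 | PCE | 0.155^{2} | 0.291^{2} | −0.416^{3} | 0.572^{2} | 0.051^{4} | 0.079^{2} | 15^{2} |
| 20 | MPS | 0.153^{1} | 0.290^{1} | −0.399^{2} | 0.568^{1} | 0.051^{2} | 0.078^{1} | 8^{1} |
| 20 | CVM | 0.260^{7} | 0.363^{6} | −0.533^{7} | 0.722^{5} | 0.054^{10} | 0.086^{8} | 43^{8} |
| 20 | AD | 0.205^{4} | 0.350^{5} | −0.364^{1} | 1.366^{10} | 0.053^{6} | 0.084^{7} | 33^{5} |
| 20 | RAD | 0.232^{6} | 0.377^{7} | −0.507^{6} | 0.987^{9} | 0.053^{8} | 0.087^{9} | 45^{9} |
| 50 | MLE | 0.208^{8} | 0.293^{9} | −0.476^{7} | 0.626^{8} | 0.033^{3} | 0.052^{3} | 38^{6} |
| 50 | MME | 0.208^{7} | 0.293^{8} | −0.493^{9} | 0.646^{9} | 0.033^{7} | 0.054^{7} | 47^{9} |
| 50 | LME | 0.210^{9} | 0.287^{7} | −0.486^{8} | 0.619^{7} | 0.033^{5} | 0.053^{5} | 41^{7} |
| 50 | LSE | 0.206^{6} | 0.282^{6} | −0.452^{6} | 0.600^{5} | 0.035^{9} | 0.055^{9} | 41^{7} |
| 50 | WLS | 0.182^{3} | 0.267^{3} | −0.417^{4} | 0.569^{3} | 0.033^{4} | 0.052^{4} | 21^{3} |
| 50 | PCE | 0.120^{2} | 0.216^{1} | −0.329^{2} | 0.479^{1} | 0.032^{2} | 0.050^{2} | 10^{2} |
| 50 | MPS | 0.115^{1} | 0.217^{2} | −0.305^{1} | 0.480^{2} | 0.032^{1} | 0.050^{1} | 8^{1} |
| 50 | CVM | 0.245^{10} | 0.322^{10} | −0.525^{10} | 0.672^{10} | 0.035^{10} | 0.057^{10} | 60^{10} |
| 50 | AD | 0.184^{4} | 0.278^{4} | −0.398^{3} | 0.593^{4} | 0.034^{8} | 0.054^{8} | 31^{4} |
| 50 | RAD | 0.184^{5} | 0.279^{5} | −0.423^{5} | 0.605^{6} | 0.033^{6} | 0.053^{6} | 33^{5} |
| 100 | MLE | 0.157^{3} | 0.229^{4} | −0.374^{5} | 0.522^{5} | 0.023^{3} | 0.037^{3} | 23^{3} |
| 100 | MME | 0.172^{7} | 0.234^{6} | −0.423^{7} | 0.538^{7} | 0.024^{5} | 0.037^{5} | 37^{6} |
| 100 | LME | 0.181^{8} | 0.236^{7} | −0.436^{8} | 0.539^{8} | 0.023^{4} | 0.037^{4} | 39^{8} |
| 100 | LSE | 0.199^{9} | 0.260^{9} | −0.447^{9} | 0.568^{9} | 0.025^{9} | 0.040^{9} | 54^{9} |
| 100 | WLS | 0.165^{6} | 0.230^{5} | −0.387^{6} | 0.517^{3} | 0.024^{7} | 0.037^{6} | 33^{5} |
| 100 | PCE | 0.095^{2} | 0.174^{1} | −0.265^{2} | 0.407^{1} | 0.023^{2} | 0.036^{2} | 10^{2} |
| 100 | MPS | 0.089^{1} | 0.176^{2} | −0.238^{1} | 0.410^{2} | 0.023^{1} | 0.035^{1} | 8^{1} |
| 100 | CVM | 0.222^{10} | 0.284^{10} | −0.492^{10} | 0.614^{10} | 0.025^{10} | 0.041^{10} | 60^{10} |
| 100 | AD | 0.165^{5} | 0.237^{8} | −0.373^{3} | 0.525^{6} | 0.024^{8} | 0.038^{8} | 38^{7} |
| 100 | RAD | 0.158^{4} | 0.229^{3} | −0.374^{4} | 0.521^{4} | 0.024^{6} | 0.037^{7} | 28^{4} |
| 200 | MLE | 0.119^{3} | 0.181^{3} | −0.291^{3} | 0.429^{3} | 0.016^{3} | 0.025^{3} | 18^{3} |
| 200 | MME | 0.146^{5} | 0.191^{4} | −0.367^{7} | 0.455^{5} | 0.016^{4} | 0.026^{4} | 29^{5} |
| 200 | LME | 0.161^{8} | 0.200^{7} | −0.398^{8} | 0.478^{8} | 0.017^{7} | 0.026^{7} | 45^{8} |
| 200 | LSE | 0.188^{9} | 0.235^{9} | −0.431^{9} | 0.529^{9} | 0.017^{9} | 0.028^{9} | 54^{9} |
| 200 | WLS | 0.148^{6} | 0.198^{6} | −0.355^{6} | 0.461^{6} | 0.016^{6} | 0.026^{6} | 36^{6} |
| 200 | PCE | 0.076^{2} | 0.142^{1} | −0.209^{2} | 0.341^{1} | 0.016^{2} | 0.025^{1} | 9^{1} |
| 200 | MPS | 0.069^{1} | 0.144^{2} | −0.182^{1} | 0.344^{2} | 0.016^{1} | 0.025^{2} | 9^{1} |
| 200 | CVM | 0.203^{10} | 0.250^{10} | −0.462^{10} | 0.558^{10} | 0.018^{10} | 0.029^{10} | 60^{10} |
| 200 | AD | 0.149^{7} | 0.204^{8} | −0.350^{5} | 0.472^{7} | 0.017^{8} | 0.027^{8} | 43^{7} |
| 200 | RAD | 0.138^{4} | 0.192^{5} | −0.333^{4} | 0.452^{4} | 0.016^{5} | 0.026^{5} | 27^{4} |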

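Table 2: Simulation results for α = 1 and λ = 0.7.

| n | Method | Bias(α̂) | RMSE(α̂) | Bias(λ̂) | RMSE(λ̂) | D_abs | D_max | Σ Ranks |
|---|---|---|---|---|---|---|---|---|
| 20 | MLE | 0.597^{10} | 0.812^{8} | −0.627^{9} | 0.764^{6} | 0.050^{1} | 0.081^{4} | 38^{7} |
| 20 | MME | 0.581^{9} | 0.842^{10} | −0.636^{10} | 0.853^{8} | 0.053^{9} | 0.087^{10} | 56^{10} |
| 20 | LME | 0.547^{8} | 0.813^{9} | −0.581^{8} | 0.767^{7} | 0.052^{5} | 0.083^{8} | 45^{9} |
| 20 | LSE | 0.360^{5} | 0.536^{3} | −0.390^{5} | 0.591^{3} | 0.053^{8} | 0.082^{5} | 29^{4} |
| 20 | WLS | 0.425^{6} | 0.687^{7} | −0.451^{6} | 0.624^{4} | 0.051^{3} | 0.080^{3} | 29^{4} |
| 20 | PCE | 0.278^{4} | 0.526^{2} | −0.384^{4} | 0.551^{2} | 0.051^{4} | 0.078^{2} | 18^{2} |
| 20 | MPS | 0.221^{2} | 0.445^{1} | −0.313^{3} | 0.507^{1} | 0.051^{2} | 0.077^{1} | 10^{1} |
| 20 | CVM | 0.435^{7} | 0.602^{6} | −0.460^{7} | 0.684^{5} | 0.054^{10} | 0.084^{9} | 44^{8} |
| 20 | AD | 0.167^{1} | 0.548^{4} | 0.042^{1} | 2.383^{10} | 0.052^{6} | 0.082^{6} | 28^{3} |
| 20 | RAD | 0.233^{3} | 0.597^{5} | −0.120^{2} | 1.676^{9} | 0.052^{7} | 0.082^{7} | 33^{6} |
| 50 | MLE | 0.428^{10} | 0.596^{10} | −0.490^{9} | 0.636^{8} | 0.032^{4} | 0.052^{5} | 46^{8} |
| 50 | MME | 0.425^{9} | 0.595^{9} | −0.501^{10} | 0.650^{10} | 0.033^{8} | 0.053^{8} | 54^{10} |
| 50 | LME | 0.421^{8} | 0.577^{8} | −0.489^{8} | 0.621^{7} | 0.033^{7} | 0.052^{7} | 45^{7} |
| 50 | LSE | 0.362^{5} | 0.496^{5} | −0.407^{5} | 0.574^{6} | 0.034^{9} | 0.054^{9} | 39^{6} |
| 50 | WLS | 0.369^{6} | 0.540^{6} | −0.422^{6} | 0.574^{5} | 0.033^{6} | 0.052^{6} | 35^{5} |
| 50 | PCE | 0.237^{3} | 0.431^{2} | −0.328^{4} | 0.480^{3} | 0.032^{3} | 0.050^{2} | 17^{3} |
| 50 | MPS | 0.187^{2} | 0.379^{1} | −0.257^{3} | 0.443^{1} | 0.032^{1} | 0.049^{1} | 9^{1} |
| 50 | CVM | 0.416^{7} | 0.552^{7} | −0.458^{7} | 0.637^{9} | 0.035^{10} | 0.056^{10} | 50^{9} |
| 50 | AD | 0.174^{1} | 0.441^{3} | −0.151^{1} | 0.450^{2} | 0.033^{5} | 0.051^{4} | 16^{2} |
| 50 | RAD | 0.240^{3} | 0.486^{4} | −0.255^{2} | 0.516^{4} | 0.032^{2} | 0.051^{3} | 19^{4} |
| 100 | MLE | 0.327^{5} | 0.468^{6} | −0.388^{5} | 0.531^{6} | 0.023^{4} | 0.037^{5} | 31^{5} |
| 100 | MME | 0.349^{7} | 0.473^{7} | −0.426^{7} | 0.540^{7} | 0.023^{7} | 0.037^{7} | 42^{7} |
| 100 | LME | 0.364^{8} | 0.475^{8} | −0.438^{9} | 0.540^{8} | 0.023^{6} | 0.037^{6} | 45^{8} |
| 100 | LSE | 0.378^{9} | 0.496^{9} | −0.428^{8} | 0.558^{9} | 0.025^{9} | 0.039^{9} | 53^{9} |
| 100 | WLS | 0.332^{6} | 0.462^{5} | −0.389^{6} | 0.518^{5} | 0.023^{8} | 0.037^{8} | 38^{6} |
| 100 | PCE | 0.193^{3} | 0.351^{2} | −0.265^{4} | 0.409^{3} | 0.023^{3} | 0.035^{2} | 17^{2} |
| 100 | MPS | 0.158^{1} | 0.330^{1} | −0.213^{2} | 0.391^{1} | 0.023^{1} | 0.035^{1} | 7^{1} |
| 100 | CVM | 0.423^{10} | 0.543^{10} | −0.470^{10} | 0.603^{10} | 0.025^{10} | 0.041^{10} | 60^{10} |
| 100 | AD | 0.173^{2} | 0.385^{3} | −0.173^{1} | 0.402^{2} | 0.023^{5} | 0.036^{4} | 17^{2} |
| 100 | RAD | 0.230^{4} | 0.420^{4} | −0.258^{3} | 0.454^{4} | 0.023^{2} | 0.036^{3} | 20^{4} |
| 200 | MLE | 0.249^{5} | 0.369^{5} | −0.303^{5} | 0.436^{5} | 0.016^{3} | 0.026^{3} | 26^{5} |
| 200 | MME | 0.295^{6} | 0.384^{6} | −0.369^{7} | 0.457^{6} | 0.016^{5} | 0.026^{5} | 35^{6} |
| 200 | LME | 0.325^{8} | 0.404^{8} | −0.400^{8} | 0.480^{8} | 0.017^{8} | 0.026^{8} | 48^{8} |
| 200 | LSE | 0.375^{9} | 0.469^{9} | −0.430^{9} | 0.530^{9} | 0.018^{9} | 0.029^{9} | 54^{9} |
| 200 | WLS | 0.299^{7} | 0.398^{7} | −0.357^{6} | 0.463^{7} | 0.017^{7} | 0.026^{7} | 41^{7} |
| 200 | PCE | 0.153^{2} | 0.284^{2} | −0.209^{3} | 0.341^{2} | 0.016^{2} | 0.025^{2} | 13^{2} |
| 200 | MPS | 0.126^{1} | 0.275^{1} | −0.166^{1} | 0.333^{1} | 0.016^{1} | 0.025^{1} | 6^{1} |
| 200 | CVM | 0.405^{10} | 0.500^{10} | −0.459^{10} | 0.559^{10} | 0.018^{10} | 0.029^{10} | 60^{10} |
| 200 | AD | 0.161^{3} | 0.333^{3} | −0.173^{2} | 0.364^{3} | 0.016^{6} | 0.026^{6} | 23^{3} |
| 200 | RAD | 0.214^{4} | 0.359^{4} | −0.250^{4} | 0.406^{4} | 0.016^{4} | 0.026^{4} | 24^{4} |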

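Table 3: Simulation results for α = 3 and λ = 0.7.

| n | Method | Bias(α̂) | RMSE(α̂) | Bias(λ̂) | RMSE(λ̂) | D_abs | D_max | Σ Ranks |
|---|---|---|---|---|---|---|---|---|
| 20 | MLE | 0.954^{7} | 1.448^{4} | −0.356^{8} | 0.610^{5} | 0.051^{4} | 0.079^{4} | 32^{5} |
| 20 | MME | 1.742^{10} | 2.522^{10} | −0.630^{10} | 0.852^{8} | 0.053^{10} | 0.087^{10} | 58^{10} |
| 20 | LME | 1.611^{9} | 2.409^{9} | −0.573^{9} | 0.765^{7} | 0.052^{5} | 0.083^{7} | 46^{9} |
| 20 | LSE | 0.873^{4} | 1.417^{3} | −0.316^{6} | 0.543^{4} | 0.053^{7} | 0.080^{5} | 30^{5} |
| 20 | WLS | 0.817^{3} | 1.584^{7} | −0.291^{4} | 0.510^{3} | 0.051^{2} | 0.078^{3} | 22^{3} |
| 20 | PCE | 0.463^{2} | 1.055^{1} | −0.257^{2} | 0.464^{2} | 0.051^{3} | 0.077^{2} | 12^{2} |
| 20 | MPS | 0.449^{1} | 1.063^{2} | −0.236^{1} | 0.453^{1} | 0.050^{1} | 0.076^{1} | 7^{1} |
| 20 | CVM | 0.974^{8} | 1.514^{5} | −0.342^{7} | 0.616^{6} | 0.053^{9} | 0.082^{6} | 41^{7} |
| 20 | AD | 0.933^{6} | 1.618^{8} | −0.292^{5} | 1.478^{9} | 0.052^{6} | 0.085^{8} | 42^{8} |
| 20 | RAD | 0.817^{4} | 1.557^{6} | −0.275^{3} | 1.535^{10} | 0.053^{8} | 0.086^{9} | 40^{6} |
| 50 | MLE | 0.918^{8} | 1.391^{8} | −0.358^{8} | 0.552^{5} | 0.033^{3} | 0.051^{4} | 36^{5} |
| 50 | MME | 1.266^{10} | 1.770^{10} | −0.493^{10} | 0.648^{10} | 0.033^{7} | 0.053^{9} | 56^{10} |
| 50 | LME | 1.245^{9} | 1.714^{9} | −0.484^{9} | 0.620^{9} | 0.033^{5} | 0.052^{5} | 46^{9} |
| 50 | LSE | 0.787^{4} | 1.205^{3} | −0.298^{4} | 0.508^{4} | 0.034^{9} | 0.053^{6} | 30^{5} |
| 50 | WLS | 0.764^{3} | 1.260^{4} | −0.297^{3} | 0.491^{3} | 0.033^{4} | 0.051^{3} | 20^{3} |
| 50 | PCE | 0.416^{2} | 0.899^{2} | −0.221^{2} | 0.407^{2} | 0.032^{1} | 0.049^{2} | 12^{2} |
| 50 | MPS | 0.368^{1} | 0.893^{1} | −0.185^{1} | 0.391^{1} | 0.032^{1} | 0.048^{1} | 6^{1} |
| 50 | CVM | 0.853^{6} | 1.300^{6} | −0.315^{5} | 0.556^{6} | 0.034^{10} | 0.054^{10} | 43^{7} |
| 50 | AD | 0.876^{7} | 1.345^{7} | −0.334^{7} | 0.566^{8} | 0.033^{8} | 0.053^{8} | 45^{8} |
| 50 | RAD | 0.787^{5} | 1.282^{5} | −0.319^{6} | 0.558^{7} | 0.033^{6} | 0.053^{7} | 36^{5} |
| 100 | MLE | 0.796^{7} | 1.241^{8} | −0.316^{7} | 0.486^{5} | 0.023^{2} | 0.036^{3} | 32^{5} |
| 100 | MME | 1.043^{9} | 1.409^{9} | −0.421^{9} | 0.538^{9} | 0.023^{6} | 0.037^{7} | 49^{9} |
| 100 | LME | 1.084^{10} | 1.417^{10} | −0.435^{10} | 0.539^{10} | 0.023^{5} | 0.037^{5} | 50^{10} |
| 100 | LSE | 0.719^{3} | 1.103^{3} | −0.275^{3} | 0.469^{4} | 0.024^{9} | 0.038^{9} | 31^{4} |
| 100 | WLS | 0.729^{4} | 1.146^{4} | −0.288^{5} | 0.456^{3} | 0.023^{4} | 0.036^{4} | 24^{3} |
| 100 | PCE | 0.395^{2} | 0.827^{2} | −0.197^{2} | 0.360^{2} | 0.023^{3} | 0.035^{2} | 13^{2} |
| 100 | MPS | 0.327^{1} | 0.806^{1} | −0.156^{1} | 0.347^{1} | 0.023^{1} | 0.035^{1} | 6^{1} |
| 100 | CVM | 0.760^{5} | 1.176^{6} | −0.285^{4} | 0.500^{8} | 0.024^{10} | 0.039^{10} | 43^{7} |
| 100 | AD | 0.829^{8} | 1.209^{7} | −0.322^{8} | 0.500^{7} | 0.024^{8} | 0.038^{8} | 46^{8} |
| 100 | RAD | 0.770^{6} | 1.165^{5} | −0.311^{6} | 0.487^{6} | 0.023^{7} | 0.037^{6} | 36^{6} |
| 200 | MLE | 0.631^{3} | 1.023^{4} | −0.255^{5} | 0.408^{3} | 0.016^{2} | 0.025^{3} | 20^{3} |
| 200 | MME | 0.884^{9} | 1.147^{9} | −0.366^{9} | 0.456^{9} | 0.016^{5} | 0.026^{4} | 45^{8} |
| 200 | LME | 0.974^{10} | 1.210^{10} | −0.400^{10} | 0.480^{10} | 0.017^{7} | 0.026^{7} | 54^{10} |
| 200 | LSE | 0.649^{4} | 1.009^{3} | −0.250^{3} | 0.429^{5} | 0.017^{9} | 0.027^{9} | 33^{5} |
| 200 | WLS | 0.680^{6} | 1.024^{5} | −0.272^{6} | 0.414^{4} | 0.016^{4} | 0.026^{5} | 30^{4} |
| 200 | PCE | 0.372^{2} | 0.759^{2} | −0.175^{2} | 0.319^{2} | 0.016^{3} | 0.025^{2} | 13^{2} |
| 200 | MPS | 0.273^{1} | 0.709^{1} | −0.125^{1} | 0.302^{1} | 0.016^{1} | 0.025^{1} | 6^{1} |
| 200 | CVM | 0.669^{5} | 1.054^{6} | −0.253^{4} | 0.448^{7} | 0.017^{10} | 0.028^{10} | 42^{7} |
| 200 | AD | 0.785^{8} | 1.100^{8} | −0.310^{8} | 0.452^{8} | 0.017^{8} | 0.027^{8} | 48^{9} |
| 200 | RAD | 0.738^{6} | 1.059^{7} | −0.299^{7} | 0.435^{6} | 0.016^{6} | 0.026^{6} | 39^{6} |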

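Table 4: Simulation results for α = 5 and λ = 0.7.

| n | Method | Bias(α̂) | RMSE(α̂) | Bias(λ̂) | RMSE(λ̂) | D_abs | D_max | Σ Ranks |
|---|---|---|---|---|---|---|---|---|
| 20 | MLE | 1.569^{6} | 2.375^{4} | −0.359^{7} | 0.617^{5} | 0.052^{3} | 0.080^{4} | 29^{4} |
| 20 | MME | 2.929^{10} | 4.231^{10} | −0.634^{10} | 0.852^{8} | 0.054^{10} | 0.088^{10} | 58^{10} |
| 20 | LME | 2.649^{9} | 3.960^{9} | −0.570^{9} | 0.762^{7} | 0.052^{5} | 0.083^{7} | 46^{9} |
| 20 | LSE | 1.546^{5} | 2.438^{5} | −0.339^{6} | 0.558^{4} | 0.053^{7} | 0.081^{5} | 32^{5} |
| 20 | WLS | 1.299^{3} | 2.349^{3} | −0.291^{3} | 0.509^{3} | 0.051^{2} | 0.078^{3} | 17^{3} |
| 20 | PCE | 0.822^{2} | 1.810^{1} | −0.274^{2} | 0.475^{2} | 0.052^{4} | 0.078^{2} | 13^{2} |
| 20 | MPS | 0.801^{1} | 1.815^{2} | −0.254^{1} | 0.466^{1} | 0.051^{1} | 0.077^{1} | 7^{1} |
| 20 | CVM | 1.767^{8} | 2.635^{6} | −0.378^{8} | 0.637^{6} | 0.053^{9} | 0.083^{6} | 43^{7} |
| 20 | AD | 1.675^{7} | 2.825^{8} | −0.329^{5} | 1.394^{10} | 0.053^{6} | 0.085^{8} | 44^{8} |
| 20 | RAD | 1.442^{4} | 2.674^{7} | −0.309^{4} | 1.263^{9} | 0.053^{8} | 0.087^{9} | 41^{6} |
| 50 | MLE | 1.481^{6} | 2.223^{6} | −0.352^{6} | 0.554^{5} | 0.033^{4} | 0.051^{4} | 31^{4} |
| 50 | MME | 2.142^{10} | 2.968^{10} | −0.496^{10} | 0.648^{10} | 0.033^{7} | 0.054^{9} | 56^{10} |
| 50 | LME | 2.077^{9} | 2.852^{9} | −0.483^{9} | 0.618^{9} | 0.033^{5} | 0.053^{5} | 46^{7} |
| 50 | LSE | 1.473^{5} | 2.146^{4} | −0.335^{4} | 0.530^{4} | 0.034^{9} | 0.053^{7} | 33^{6} |
| 50 | WLS | 1.320^{3} | 2.124^{3} | −0.308^{3} | 0.497^{3} | 0.033^{3} | 0.051^{3} | 18^{3} |
| 50 | PCE | 0.775^{2} | 1.568^{2} | −0.240^{2} | 0.419^{2} | 0.033^{2} | 0.050^{2} | 12^{2} |
| 50 | MPS | 0.696^{1} | 1.562^{1} | −0.205^{1} | 0.407^{1} | 0.032^{1} | 0.049^{1} | 6^{1} |
| 50 | CVM | 1.620^{8} | 2.331^{7} | −0.359^{7} | 0.581^{8} | 0.035^{10} | 0.055^{10} | 50^{9} |
| 50 | AD | 1.587^{7} | 2.362^{8} | −0.361^{8} | 0.579^{7} | 0.034^{8} | 0.054^{8} | 46^{7} |
| 50 | RAD | 1.415^{4} | 2.218^{5} | −0.341^{5} | 0.567^{6} | 0.033^{6} | 0.053^{6} | 32^{5} |
| 100 | MLE | 1.288^{4} | 1.982^{4} | −0.311^{4} | 0.482^{4} | 0.023^{3} | 0.036^{3} | 22^{4} |
| 100 | MME | 1.752^{9} | 2.349^{9} | −0.423^{9} | 0.538^{10} | 0.024^{6} | 0.037^{7} | 50^{9} |
| 100 | LME | 1.799^{10} | 2.352^{10} | −0.434^{10} | 0.538^{9} | 0.023^{5} | 0.037^{5} | 49^{8} |
| 100 | LSE | 1.377^{5} | 1.983^{5} | −0.318^{5} | 0.495^{5} | 0.024^{9} | 0.039^{9} | 38^{6} |
| 100 | WLS | 1.260^{3} | 1.929^{3} | −0.301^{3} | 0.464^{3} | 0.023^{4} | 0.037^{4} | 20^{3} |
| 100 | PCE | 0.711^{2} | 1.418^{2} | −0.211^{2} | 0.369^{2} | 0.023^{2} | 0.036^{2} | 12^{2} |
| 100 | MPS | 0.601^{1} | 1.387^{1} | −0.171^{1} | 0.358^{1} | 0.023^{1} | 0.035^{1} | 6^{1} |
| 100 | CVM | 1.487^{7} | 2.130^{8} | −0.337^{7} | 0.530^{8} | 0.025^{10} | 0.039^{10} | 50^{9} |
| 100 | AD | 1.495^{8} | 2.117^{7} | −0.349^{8} | 0.514^{7} | 0.024^{8} | 0.038^{8} | 46^{7} |
| 100 | RAD | 1.381^{6} | 2.028^{6} | −0.335^{6} | 0.499^{6} | 0.024^{7} | 0.037^{6} | 37^{5} |
| 200 | MLE | 1.079^{3} | 1.717^{3} | −0.264^{3} | 0.413^{3} | 0.016^{3} | 0.025^{3} | 18^{3} |
| 200 | MME | 1.477^{9} | 1.910^{7} | −0.366^{9} | 0.455^{6} | 0.016^{4} | 0.026^{4} | 39^{6} |
| 200 | LME | 1.612^{10} | 2.004^{10} | −0.398^{10} | 0.478^{9} | 0.017^{7} | 0.026^{7} | 53^{10} |
| 200 | LSE | 1.325^{6} | 1.862^{6} | −0.308^{5} | 0.463^{7} | 0.017^{9} | 0.028^{9} | 42^{7} |
| 200 | WLS | 1.186^{4} | 1.736^{4} | −0.286^{4} | 0.422^{4} | 0.016^{5} | 0.026^{5} | 26^{4} |
| 200 | PCE | 0.645^{2} | 1.289^{2} | −0.182^{2} | 0.324^{2} | 0.016^{2} | 0.025^{2} | 12^{2} |
| 200 | MPS | 0.525^{1} | 1.250^{1} | −0.143^{1} | 0.316^{1} | 0.016^{1} | 0.025^{1} | 6^{1} |
| 200 | CVM | 1.382^{7} | 1.953^{9} | −0.317^{6} | 0.483^{10} | 0.017^{10} | 0.028^{10} | 52^{9} |
| 200 | AD | 1.448^{8} | 1.942^{8} | −0.343^{8} | 0.469^{8} | 0.017^{8} | 0.027^{8} | 48^{8} |
| 200 | RAD | 1.309^{5} | 1.829^{5} | −0.318^{7} | 0.444^{5} | 0.016^{6} | 0.026^{6} | 34^{5} |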

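Table 5: Simulation results for α = 10 and λ = 0.7.

| n | Method | Bias(α̂) | RMSE(α̂) | Bias(λ̂) | RMSE(λ̂) | D_abs | D_max | Σ Ranks |
|---|---|---|---|---|---|---|---|---|
| 20 | MLE | 5.044^{10} | 6.450^{10} | −0.579^{10} | 0.711^{8} | 0.049^{2} | 0.078^{3} | 43^{8} |
| 20 | MME | −0.995^{7} | 3.473^{2} | 0.209^{5} | 0.352^{4} | 0.051^{3} | 0.078^{5} | 26^{4} |
| 20 | LME | −0.541^{4} | 3.721^{5} | 0.146^{4} | 0.394^{5} | 0.052^{10} | 0.080^{8} | 36^{5} |
| 20 | LSE | 0.662^{5} | 3.714^{4} | −0.008^{1} | 0.252^{2} | 0.052^{9} | 0.077^{2} | 23^{3} |
| 20 | WLS | 3.883^{9} | 5.571^{9} | −0.438^{9} | 0.587^{7} | 0.051^{5} | 0.079^{7} | 46^{10} |
| 20 | PCE | 2.813^{8} | 4.890^{8} | −0.402^{8} | 0.542^{6} | 0.052^{6} | 0.078^{6} | 42^{7} |
| 20 | MPS | −0.281^{1} | 2.902^{1} | 0.018^{2} | 0.219^{1} | 0.049^{1} | 0.073^{1} | 7^{1} |
| 20 | CVM | 0.305^{2} | 3.676^{3} | 0.057^{3} | 0.267^{3} | 0.052^{7} | 0.078^{4} | 22^{2} |
| 20 | AD | −0.332^{3} | 4.288^{7} | 0.340^{6} | 2.513^{10} | 0.052^{8} | 0.083^{10} | 44^{9} |
| 20 | RAD | −0.891^{6} | 3.916^{6} | 0.351^{7} | 1.743^{9} | 0.051^{4} | 0.082^{9} | 41^{6} |
| 50 | MLE | 4.060^{10} | 5.464^{10} | −0.476^{10} | 0.615^{10} | 0.032^{3} | 0.051^{7} | 50^{9} |
| 50 | MME | −0.779^{7} | 2.503^{2} | 0.140^{7} | 0.280^{4} | 0.032^{4} | 0.050^{4} | 28^{5} |
| 50 | LME | −0.457^{5} | 2.715^{3} | 0.097^{6} | 0.315^{5} | 0.033^{9} | 0.051^{8} | 36^{6} |
| 50 | LSE | 0.457^{6} | 2.825^{4} | −0.010^{1} | 0.254^{2} | 0.033^{7} | 0.050^{3} | 23^{2} |
| 50 | WLS | 3.692^{9} | 5.048^{9} | −0.428^{9} | 0.558^{9} | 0.033^{6} | 0.052^{9} | 51^{10} |
| 50 | PCE | 2.377^{8} | 4.117^{8} | −0.331^{8} | 0.472^{8} | 0.032^{5} | 0.050^{2} | 39^{7} |
| 50 | MPS | −0.334^{4} | 2.113^{1} | 0.031^{2} | 0.198^{1} | 0.031^{1} | 0.047^{1} | 10^{1} |
| 50 | CVM | 0.197^{2} | 2.841^{5} | 0.034^{3} | 0.265^{3} | 0.033^{8} | 0.051^{6} | 27^{3} |
| 50 | AD | 0.119^{1} | 3.318^{7} | 0.050^{4} | 0.350^{7} | 0.034^{10} | 0.053^{10} | 39^{7} |
| 50 | RAD | −0.258^{3} | 2.861^{6} | 0.081^{5} | 0.324^{6} | 0.032^{2} | 0.050^{5} | 27^{3} |
| 100 | MLE | 3.209^{9} | 4.549^{9} | −0.385^{9} | 0.524^{10} | 0.023^{3} | 0.037^{4} | 44^{8} |
| 100 | MME | −0.624^{5} | 2.020^{2} | 0.102^{5} | 0.235^{2} | 0.023^{2} | 0.036^{3} | 19^{2} |
| 100 | LME | −0.379^{4} | 2.199^{3} | 0.070^{4} | 0.264^{3} | 0.023^{5} | 0.037^{5} | 24^{3} |
| 100 | LSE | 1.772^{6} | 3.157^{6} | −0.201^{6} | 0.418^{7} | 0.024^{7} | 0.038^{7} | 39^{7} |
| 100 | WLS | 3.400^{10} | 4.559^{10} | −0.402^{10} | 0.518^{9} | 0.024^{6} | 0.037^{6} | 51^{10} |
| 100 | PCE | 1.939^{8} | 3.456^{8} | −0.269^{8} | 0.407^{6} | 0.023^{4} | 0.036^{2} | 36^{6} |
| 100 | MPS | −0.340^{3} | 1.771^{1} | 0.035^{2} | 0.186^{1} | 0.022^{1} | 0.034^{1} | 9^{1} |
| 100 | CVM | 1.875^{7} | 3.377^{7} | −0.207^{7} | 0.450^{8} | 0.024^{8} | 0.038^{8} | 45^{9} |
| 100 | AD | 0.145^{2} | 3.054^{5} | 0.009^{1} | 0.310^{5} | 0.028^{10} | 0.044^{10} | 33^{5} |
| 100 | RAD | −0.125^{1} | 2.583^{4} | 0.037^{3} | 0.276^{4} | 0.026^{9} | 0.040^{9} | 30^{4} |
| 200 | MLE | 2.449^{7} | 3.658^{7} | −0.299^{7} | 0.433^{7} | 0.016^{3} | 0.025^{3} | 34^{7} |
| 200 | MME | −0.457^{5} | 1.649^{2} | 0.072^{5} | 0.198^{2} | 0.016^{4} | 0.026^{4} | 22^{2} |
| 200 | LME | −0.271^{3} | 1.813^{3} | 0.048^{4} | 0.221^{3} | 0.017^{6} | 0.026^{5} | 24^{3} |
| 200 | LSE | 3.962^{9} | 4.787^{9} | −0.455^{9} | 0.538^{9} | 0.017^{7} | 0.028^{7} | 50^{9} |
| 200 | WLS | 3.063^{8} | 4.008^{8} | −0.367^{8} | 0.466^{8} | 0.016^{5} | 0.026^{6} | 43^{8} |
| 200 | PCE | 1.533^{6} | 2.851^{6} | −0.211^{6} | 0.341^{6} | 0.016^{2} | 0.025^{2} | 28^{4} |
| 200 | MPS | −0.296^{4} | 1.517^{1} | 0.033^{3} | 0.176^{1} | 0.016^{1} | 0.024^{1} | 11^{1} |
| 200 | CVM | 4.256^{10} | 5.068^{10} | −0.484^{10} | 0.565^{10} | 0.018^{8} | 0.029^{8} | 56^{10} |
| 200 | AD | −0.080^{1} | 2.825^{5} | 0.011^{1} | 0.288^{5} | 0.027^{10} | 0.041^{10} | 32^{6} |
| 200 | RAD | −0.245^{2} | 2.578^{4} | 0.028^{2} | 0.265^{4} | 0.025^{9} | 0.038^{9} | 30^{5} |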

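Table 6: Partial and overall ranks of all the methods of estimation for λ = 0.7 and various α.

| α | n | MLE | MME | LME | LSE | WLS | PCE | MPS | CVM | AD | RAD |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.5 | 20 | 6 | 10 | 7 | 4 | 3 | 2 | 1 | 8 | 5 | 9 |
| 0.5 | 50 | 6 | 9 | 7 | 7 | 3 | 2 | 1 | 10 | 4 | 5 |
| 0.5 | 100 | 3 | 6 | 8 | 9 | 5 | 2 | 1 | 10 | 7 | 4 |
| 0.5 | 200 | 3 | 5 | 8 | 9 | 6 | 1 | 1 | 10 | 7 | 4 |
| 1 | 20 | 7 | 10 | 9 | 4 | 4 | 2 | 1 | 8 | 3 | 6 |
| 1 | 50 | 8 | 10 | 7 | 6 | 5 | 3 | 1 | 9 | 2 | 4 |
| 1 | 100 | 5 | 7 | 8 | 9 | 6 | 2 | 1 | 10 | 2 | 4 |
| 1 | 200 | 5 | 6 | 8 | 9 | 7 | 2 | 1 | 10 | 3 | 4 |
| 3 | 20 | 5 | 10 | 9 | 4 | 3 | 2 | 1 | 7 | 8 | 6 |
| 3 | 50 | 5 | 10 | 9 | 4 | 3 | 2 | 1 | 7 | 8 | 5 |
| 3 | 100 | 5 | 9 | 10 | 4 | 3 | 2 | 1 | 7 | 8 | 6 |
| 3 | 200 | 3 | 8 | 10 | 5 | 4 | 2 | 1 | 7 | 9 | 6 |
| 5 | 20 | 4 | 10 | 9 | 5 | 3 | 2 | 1 | 7 | 8 | 6 |
| 5 | 50 | 4 | 10 | 7 | 6 | 3 | 2 | 1 | 9 | 7 | 5 |
| 5 | 100 | 4 | 9 | 8 | 6 | 3 | 2 | 1 | 9 | 7 | 5 |
| 5 | 200 | 3 | 6 | 10 | 7 | 4 | 2 | 1 | 9 | 8 | 5 |
| 10 | 20 | 8 | 4 | 5 | 3 | 10 | 7 | 1 | 2 | 9 | 6 |
| 10 | 50 | 9 | 5 | 6 | 2 | 10 | 7 | 1 | 3 | 7 | 3 |
| 10 | 100 | 8 | 2 | 3 | 7 | 10 | 6 | 1 | 9 | 5 | 4 |
| 10 | 200 | 7 | 2 | 3 | 9 | 8 | 4 | 1 | 10 | 6 | 5 |
| Σ Ranks | | 108 | 148 | 151 | 119 | 103 | 56 | 20 | 161 | 123 | 102 |
| Overall Rank | | 5 | 8 | 9 | 6 | 4 | 2 | 1 | 10 | 7 | 3 |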
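The overall ranking in Table 6 is a simple aggregation: the partial ranks of each method are summed over the twenty (α, n) configurations and the totals are ranked. The short sketch below reproduces that arithmetic from the tabulated partial ranks; the variable names are illustrative only.

```python
# Partial ranks from Table 6, listed for alpha = 0.5, 1, 3, 5, 10
# and, within each alpha, for n = 20, 50, 100, 200.
partial_ranks = {
    "MLE": [6, 6, 3, 3, 7, 8, 5, 5, 5, 5, 5, 3, 4, 4, 4, 3, 8, 9, 8, 7],
    "MME": [10, 9, 6, 5, 10, 10, 7, 6, 10, 10, 9, 8, 10, 10, 9, 6, 4, 5, 2, 2],
    "LME": [7, 7, 8, 8, 9, 7, 8, 8, 9, 9, 10, 10, 9, 7, 8, 10, 5, 6, 3, 3],
    "LSE": [4, 7, 9, 9, 4, 6, 9, 9, 4, 4, 4, 5, 5, 6, 6, 7, 3, 2, 7, 9],
    "WLS": [3, 3, 5, 6, 4, 5, 6, 7, 3, 3, 3, 4, 3, 3, 3, 4, 10, 10, 10, 8],
    "PCE": [2, 2, 2, 1, 2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 7, 7, 6, 4],
    "MPS": [1] * 20,
    "CVM": [8, 10, 10, 10, 8, 9, 10, 10, 7, 7, 7, 7, 7, 9, 9, 9, 2, 3, 9, 10],
    "AD": [5, 4, 7, 7, 3, 2, 2, 3, 8, 8, 8, 9, 8, 7, 7, 8, 9, 7, 5, 6],
    "RAD": [9, 5, 4, 4, 6, 4, 4, 4, 6, 5, 6, 6, 6, 5, 5, 5, 6, 3, 4, 5],
}

# Sum of partial ranks per method, then rank the totals (smaller is better).
totals = {method: sum(ranks) for method, ranks in partial_ranks.items()}
ordered = sorted(totals, key=totals.get)
overall = {method: position for position, method in enumerate(ordered, start=1)}
```

The totals and positions obtained this way agree with the last two rows of Table 6.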

5. Real Data Analysis

In this section, we present two applications of the proposed TR distribution to real data for illustrative purposes. These applications show the flexibility of the TR distribution in modeling positive data. For the purpose of comparison, we also fit the one-parameter Rayleigh, Weibull, and Gamma distributions to the same data. We use the fitdistrplus R package to fit the distributions. The fits of these four distributions are presented in the subsequent sections. We present a comparative density plot, plots of the distribution functions, and Q-Q and P-P plots for all four distributions. For example, Figure 7 shows the comparative density, distribution, Q-Q, and P-P plots for the guinea pig data discussed in Section 5.1.

Classical goodness of fit of a given distribution can be assessed via the density plot and the CDF plot. The Q-Q and P-P plots may provide additional information in some cases. In particular, the Q-Q plot may provide information about the lack-of-fit at the tails of the distribution, whereas the P-P plot emphasizes the lack-of-fit at the center of the distribution (Delignette-Muller & Dutang 2014).

To evaluate the performance of the candidate distributions and find the one that best fits the data, we calculate the log-likelihood of the fitted models based on numerically obtained maximum likelihood estimates via the fitdist() function, available in the fitdistrplus R package. We then obtain the values of AIC and BIC. Additionally, we test for goodness of fit of the candidate models using the Kolmogorov-Smirnov (K-S) test, the Anderson-Darling (AD) test, and the Cramér-von-Mises (CVM) test. Note that the K-S test requires unique observations to obtain exact p-values. Therefore, we broke the ties by adding a tiny random noise, uniform between 0.001 and 0.01, to each duplicate observation. Adding this small noise did not affect the statistical properties of the data.

The two data sets considered in this study include tied observations. As such, we removed ties for both data sets using the process outlined above. We present a descriptive summary of the original data as well as of the data after ties were broken. This shows only a minor difference between the original and modified data sets. It should also be mentioned that removal of duplicate observations is necessary for the MPS method of estimation.
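The tie-breaking step can be sketched as below. This is one plausible reading of the procedure, not the authors' code: the first occurrence of a value is kept unchanged and every later occurrence receives uniform noise between 0.001 and 0.01, redrawn if it happens to collide with a value already used. The function name and arguments are hypothetical.

```python
import random


def break_ties(data, rng, low=0.001, high=0.01):
    """Add U(low, high) noise to every repeated occurrence of a value."""
    seen = set()
    out = []
    for x in data:
        y = x
        while y in seen:  # only duplicates enter this loop
            y = x + rng.uniform(low, high)
        seen.add(y)
        out.append(y)
    return out
```

Because the noise is always positive, the perturbed values are slightly larger than the originals, so it is worth comparing summary statistics before and after the change.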

5.1. Example 1: Guinea Pig Data

Our first data set contains survival times (in days) of guinea pigs injected with different doses of tubercle bacilli and is given in Table 7. The data have been analyzed by Kundu & Howlader (2010) and by Singh, Singh & Sharma (2013). We present the data after breaking the ties.

Table 7: Survival times (in days) of guinea pigs injected with different doses of tubercle bacilli.

0.012 0.015 0.022 0.024 0.032 0.033 0.034 0.038
0.043 0.044 0.048 0.052 0.053 0.054 0.055 0.056
0.057 0.058 0.059 0.060 0.061 0.062 0.063 0.065
0.067 0.068 0.070 0.072 0.073 0.075 0.076 0.081
0.083 0.084 0.085 0.087 0.091 0.095 0.096 0.098
0.099 0.109 0.110 0.121 0.127 0.129 0.131 0.143
0.146 0.175 0.211 0.233 0.258 0.263 0.297 0.341
0.376 0.030 0.036 0.043 0.061 0.060 0.063 0.063
0.063 0.072 0.081 0.153 0.181 0.260 0.347 0.074

5.1.1. Finding a suitable distribution for guinea pig data

We compare the fit of the Rayleigh, Weibull, Gamma, and Transmuted Rayleigh distributions for the guinea pig data set. The data contain tied observations and, therefore, we modified the data as discussed above to remove the ties. The summary of the original data set, as well as of the modified one, is presented in Table 8. We do not observe any noticeable difference in the calculated statistics for the modified data set compared to the original one.

Table 8: Summary of Guinea Pig data: Original vs Modified.

| Data | Min | Q1 | Median | Mean | Q3 | Max |
|---|---|---|---|---|---|---|
| Original | 0.01200 | 0.05475 | 0.07000 | 0.09982 | 0.11280 | 0.37600 |
| Ties broken | 0.01200 | 0.05575 | 0.07177 | 0.10080 | 0.11280 | 0.37600 |

Next, we present the fit statistics for the four distributions for the guinea pig data. The log-likelihood statistic, the AIC, the BIC, and the test statistics along with the p-values for the K-S, AD, and CVM goodness-of-fit tests are presented in Table 9. Results show that the transmuted Rayleigh distribution best fits the guinea pig data, having the largest log-likelihood, smallest AIC, and smallest BIC among all four candidate distributions.

A graphical check of the model fit is shown in Figure 7. From the histogram and theoretical density plots, the transmuted Rayleigh distribution shows a reasonably good fit, especially in the right tail area. Fitting the right tail area is particularly important for many life testing and reliability problems. From the Q-Q plot, we find that the TR model is a reasonably good fit at the lower tail of the distribution as well. However, Weibull and Gamma seem to fit the data in the middle of the distribution. We based our choice of the best fitting model on the largest log-likelihood value and the smallest AIC and BIC values. Based on these criteria, TR fits the guinea pig data best.

Table 9: (Guinea pig data) Fitting distributions to find the best fit model.

| Distribution | LogLik | AIC | BIC | K-S(p) | AD(p) | CVM(p) |
|---|---|---|---|---|---|---|
| Rayleigh | 91.48 | −178.96 | −174.41 | 0.25 (.000) | 6.15 (.000) | 1.28 (.001) |
| Weibull | 99.83 | −195.66 | −191.10 | 0.15 (.088) | 2.36 (.059) | 0.41 (.067) |
| Gamma | 102.83 | −201.66 | −197.11 | 0.99 (.000) | 781.60 (.000) | 23.99 (.000) |
| TRayleigh | 111.37 | −218.73 | −214.18 | 0.48 (.000) | 46.29 (.000) | 6.52 (.000) |
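The information criteria in Table 9 follow the usual definitions, AIC = 2k − 2 logLik and BIC = k ln n − 2 logLik, where k is the number of fitted parameters and n the sample size. The sketch below recomputes them from the tabulated log-likelihoods. Taking k = 2 and n = 72 for every row is an assumption made here because it reproduces the tabulated values up to rounding; the paper does not state the parameter counts explicitly.

```python
import math


def aic(loglik, k):
    """Akaike information criterion: 2k - 2 logLik."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian information criterion: k ln(n) - 2 logLik."""
    return k * math.log(n) - 2 * loglik


# Log-likelihoods as printed in Table 9 (guinea pig data, n = 72)
loglik = {"Rayleigh": 91.48, "Weibull": 99.83, "Gamma": 102.83, "TRayleigh": 111.37}
criteria = {name: (aic(ll, 2), bic(ll, 2, 72)) for name, ll in loglik.items()}
```

Small discrepancies in the second decimal are expected, since the printed log-likelihoods are themselves rounded.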

5.1.2. Bootstrap Estimates (Guinea Pig Data): Bias and Standard Errors

Since the transmuted Rayleigh distribution best fits the data, we carry out ordinary nonparametric bootstrap resampling to obtain the bias, standard error, and 95% bootstrap percentile confidence intervals for the parameters α and λ of the transmuted Rayleigh distribution. In ordinary nonparametric resampling, repeated samples are drawn from the original data with replacement. In addition to the original data set, we draw 999 bootstrap samples to obtain the bias and standard error of the estimators. Results are shown in Table 10. We note that the moments-based methods have estimates for α that are around 53 while the estimates for λ are around 0.58. For $\hat{\alpha}$, WLS produces the least biased estimate while MPS has a higher bias. For $\hat{\lambda}$, all the methods produce a small bias and standard error.
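The bootstrap quantities reported for each method can be sketched generically as follows. The function name and the `estimator` argument are hypothetical: `estimator` stands for any function that maps a sample to a scalar estimate (in this setting, one of the fitting methods returning $\hat{\alpha}$ or $\hat{\lambda}$), and the percentile limits are read off the sorted replicates, which is one common convention.

```python
import random
import statistics


def bootstrap_summary(data, estimator, rng, n_boot=999, level=0.95):
    """Bias, standard error and percentile limits from n_boot resamples."""
    est = estimator(data)
    n = len(data)
    # Resample with replacement and re-estimate on each bootstrap sample
    reps = sorted(estimator([rng.choice(data) for _ in range(n)])
                  for _ in range(n_boot))
    bias = sum(reps) / n_boot - est
    se = statistics.stdev(reps)
    # With n_boot = 999 and level = 0.95 these are the 25th and 975th values
    lo = int(round((1.0 - level) / 2.0 * (n_boot + 1))) - 1
    hi = int(round((1.0 + level) / 2.0 * (n_boot + 1))) - 1
    return {"est": est, "bias": bias, "se": se, "lcl": reps[lo], "ucl": reps[hi]}
```

For example, `bootstrap_summary(data, statistics.mean, random.Random(0))` summarizes the sample mean; replacing `statistics.mean` with a TR fitting routine would give the kind of entries shown in Table 10.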

[Figure 7 panels: "Histogram and theoretical densities" (Density vs. data), "Empirical and theoretical CDFs" (CDF vs. data), "Q-Q plot" (Empirical quantiles vs. Theoretical quantiles), and "P-P plot" (Empirical probabilities vs. Theoretical probabilities); each panel overlays the TRayleigh, Rayleigh, Weibull, and Gamma fits.]

Figure 7: (Guinea pig data) Comparison of fit for the four distributions of the guinea pig survival time data.

Table 10: (Guinea pig data) Estimated bias and standard error of the estimates α̂ and λ̂, with the lower (LCL) and upper (UCL) limits of their 95% bootstrap percentile confidence intervals, based on 999 bootstrap samples.

          α̂                                             λ̂
Method    Est.      Bias       SE       LCL     UCL       Est.    Bias     SE      LCL     UCL
MLE       46.642    1.703      8.889    33.72   69.84     0.647   −0.002   0.053   0.544   0.751
LSE       77.284    −2.983     10.602   55.22   96.51     0.587   0.068    0.144   0.508   1.000
WLS       75.952    −0.030     12.727   52.65   101.29    0.570   0.020    0.059   0.532   0.757
PCE       43.992    1.398      9.374    30.62   66.27     0.582   0.000    0.020   0.544   0.622
MPS       44.364    −37.897    6.020    3.76    25.76     0.654   0.285    0.151   0.387   1.000
CVM       76.546    −3.044     10.610   54.99   96.46     0.600   0.074    0.153   0.509   1.000
RAD       55.178    0.867      12.452   33.80   81.31     0.657   −0.001   0.047   0.565   0.751
LME       51.856    1.964      10.834   36.23   76.86     0.552   −0.006   0.034   0.535   0.552
MME       52.467    1.185      10.417   35.60   77.82     0.578   0.008    0.020   0.578   0.595
MMM       52.566    0.959      10.429   36.74   77.25     0.593   −0.006   0.006   0.577   0.594


5.2. Example 2: Fibre Strength Data

The second data set consists of 100 measurements of the breaking stress of carbon fibres (in GPa). The data in Table 11 are taken from Cordeiro, Ortega & Popović (2014) and were also used in Nadarajah, Cordeiro & Ortega (2013). We present the data after the ties were broken.

Table 11: Breaking strengths of carbon fibres (in GPa).

0.39   0.81   0.85   0.98   1.08   1.12   1.17   1.18   1.22   1.25
1.36   1.41   1.47   1.57   1.59   1.61   1.69   1.71   1.73   1.80
1.84   1.87   1.89   1.92   2.00   2.03   2.05   2.12   2.17   2.35
2.38   2.41   2.43   2.48   2.50   2.53   2.55   2.56   2.59   2.67
2.73   2.74   2.76   2.77   2.79   2.81   2.82   2.83   2.85   2.87
2.88   2.93   2.95   2.96   2.97   3.09   3.11   3.15   3.19   3.22
3.27   3.28   3.31   3.33   3.39   3.51   3.56   3.60   3.65   3.68
3.70   3.75   4.20   4.38   4.42   4.70   4.90   4.91   5.08   5.56
1.58   1.60   1.61   1.70   1.85   2.04   2.17   2.17   2.48   2.55
2.82   2.98   3.11   3.16   3.19   3.23   3.31   3.39   3.68   3.69

5.2.1. Finding a suitable distribution for fibre data

Similar to the guinea pig data, we compare the fits of the Rayleigh, Weibull, Gamma, and transmuted Rayleigh distributions for the fibres data set. There were 20 tied observations in the data set. We prepared the data by breaking the ties following the procedure discussed earlier. The summary of the original and the modified data (after breaking ties) is shown in Table 12. We notice that the modified data have nearly the same descriptive characteristics as the original data set.

Table 12: Summary of Fibres data: Original vs Modified.

Data          Min     Q1      Median   Mean    Q3      Max
Original      0.390   1.840   2.700    2.621   3.220   5.560
Ties broken   0.390   1.847   2.700    2.622   3.221   5.560

For the fibres data, we present fit statistics of the Rayleigh, Weibull, Gamma, and transmuted Rayleigh distributions in Table 13. These show that the transmuted Rayleigh distribution best fits the data and has the largest log-likelihood (−105.39), the smallest AIC (214.77), and the smallest BIC (219.99) among all four candidate distributions. A histogram with theoretical density curves superimposed over the data is shown in panel (a) of Figure 8. The cdf, Q-Q, and P-P plots also show evidence that the TR is a good candidate model.
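As a rough check on how the information criteria relate to the log-likelihoods, the sketch below applies the standard definitions AIC = 2k − 2 log L and BIC = k ln(n) − 2 log L to the log-likelihood values reported in Table 13. Two things are assumptions rather than statements from the text: the parameter count k = 2 for every model and n = 100 observations. With these values the tabulated AIC and BIC are reproduced to within rounding.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion for a model with k estimated parameters."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * log_lik


# Log-likelihoods copied from Table 13 (fibres strength data).
log_liks = {
    "Rayleigh": -144.50,
    "Weibull": -141.53,
    "Gamma": -143.24,
    "TRayleigh": -105.39,
}
K = 2    # assumed parameter count, inferred from the tabulated values
N = 100  # number of fibre strength measurements

for name, log_lik in log_liks.items():
    print(f"{name:10s} AIC = {aic(log_lik, K):7.2f}  BIC = {bic(log_lik, K, N):7.2f}")

# The preferred model is the one with the smallest criterion value.
best = min(log_liks, key=lambda name: aic(log_liks[name], K))
print("Smallest AIC:", best)
```

Because k and n are the same for all four candidates here, ranking by AIC or BIC is equivalent to ranking by log-likelihood.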


Table 13: (Fibres strength data) Fitting distributions to find the best fit model.

Distribution   LogLik     AIC      BIC      K-S(p)        AD(p)            CVM(p)
Rayleigh       −144.50    293.00   298.21   0.11 (.157)   1.66 (.143)      0.314 (.123)
Weibull        −141.53    287.05   292.26   0.06 (.868)   0.41 (.836)      0.06 (.797)
Gamma          −143.24    290.49   295.70   0.96 (.000)   490.87 (.000)    32.74 (.000)
TRayleigh      −105.39    214.77   219.99   0.17 (.014)   4.82 (.003)      0.854 (.005)

Figure 8: Comparison of fit of several distributions for the fibres strength data. Panels: histogram and theoretical densities; empirical and theoretical CDFs; Q-Q plot (empirical vs. theoretical quantiles); P-P plot (empirical vs. theoretical probabilities). Each panel overlays the TRayleigh, Rayleigh, Weibull, and Gamma fits.

5.2.2. Bootstrap Estimates (fibres data): Bias and Standard Errors

Since the transmuted Rayleigh distribution best fits the fibres data, we obtain bootstrap biases and standard errors of the estimates of the parameters of the TR distribution using different methods of estimation. Results are shown in Table 14. We find that the MLE and all the moment-based methods are nearly unbiased for the parameter α. The worst performing method for this data set is the MPS, which produces a relatively large bias (0.325) for the estimate of α. The bootstrap percentile confidence intervals for λ show that, for several methods, the lower limit sits exactly at the lower bound of the parameter, which is −1. Also, note that the moment-based methods produce confidence intervals for λ whose lower limit falls below this bound, outside the parameter space.


Table 14: (Fibres data) Estimated bias and standard error of the estimates α̂ and λ̂, with the lower (LCL) and upper (UCL) limits of their 95% bootstrap percentile confidence intervals, based on 999 bootstrap samples.

          α̂                                         λ̂
Method    Est.    Bias     SE      LCL     UCL       Est.     Bias     SE      LCL      UCL
MLE       0.184   −0.000   0.015   0.153   0.216     −0.919   0.009    0.110   −1.000   −0.626
LSE       0.178   −0.001   0.014   0.151   0.205     −0.895   0.032    0.153   −1.000   −0.476
WLS       0.189   −0.002   0.015   0.154   0.216     −1.000   0.075    0.125   −1.000   −0.588
PCE       0.180   0.001    0.156   0.151   0.213     −0.884   0.004    0.130   −1.000   −0.546
MPS       0.178   0.325    3.535   0.130   0.238     −0.851   0.101    0.204   −0.852   −0.246
CVM       0.181   −0.002   0.014   0.152   0.208     −0.923   0.035    0.129   −1.000   −0.572
RAD       0.188   −0.005   0.017   0.151   0.221     −1.000   0.089    0.135   −1.000   −0.525
LME       0.186   0.001    0.017   0.157   0.221     −0.945   −0.011   0.159   −1.261   −0.641
MME       0.188   0.001    0.016   0.157   0.222     −0.961   −0.009   0.161   −1.263   −0.653
MMM       0.187   0.001    0.018   0.156   0.227     −0.950   −0.002   0.161   −1.234   −0.623

6. Conclusion

In this article, we have provided explicit expressions for the quantiles, moments, moment generating function, conditional moments, hazard rate, mean residual lifetime, mean past lifetime, mean deviation about mean and median, the stochastic ordering, various entropies, stress-strength parameter, and order statistics of the TR distribution. The model parameters are estimated by ten methods of estimation, namely, maximum likelihood, moments, L-moments, percentile, least squares, weighted least squares, maximum product of spacings, Cramér-von Mises, Anderson-Darling, and right-tailed Anderson-Darling. We have performed an extensive simulation study to compare these methods. We have compared the estimators with respect to bias, root mean-squared error, the average absolute difference between the theoretical and empirical estimates of the distribution function, and the maximum absolute difference between the theoretical and empirical distribution functions. We have also compared the estimators using two real data applications. The simulation results show that the maximum product of spacings (MPS) estimator is the best performing estimator in terms of bias and RMSE. The next best performing estimator is the percentile estimator (PCE), followed by the right-tailed Anderson-Darling estimator. The real data applications show that the MPS estimator gives the shortest confidence interval for α in the guinea pig data set, while the LSE does so in the fibre strength data set. Hence, we can argue that the percentile, least squares, right-tailed Anderson-Darling, and maximum product of spacings estimators are among the best performing estimators for the TR distribution.
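The statement about interval lengths can be checked directly from the tabulated limits. The short sketch below copies the LCL and UCL values for α̂ from Tables 10 and 14 and picks the method with the smallest UCL − LCL; the helper name `shortest_interval` is illustrative only.

```python
# (LCL, UCL) for alpha-hat, copied from Table 10 (guinea pig data).
guinea_pig_alpha = {
    "MLE": (33.72, 69.84), "LSE": (55.22, 96.51), "WLS": (52.65, 101.29),
    "PCE": (30.62, 66.27), "MPS": (3.76, 25.76), "CVM": (54.99, 96.46),
    "RAD": (33.80, 81.31), "LME": (36.23, 76.86), "MME": (35.60, 77.82),
    "MMM": (36.74, 77.25),
}

# (LCL, UCL) for alpha-hat, copied from Table 14 (fibres data).
fibres_alpha = {
    "MLE": (0.153, 0.216), "LSE": (0.151, 0.205), "WLS": (0.154, 0.216),
    "PCE": (0.151, 0.213), "MPS": (0.130, 0.238), "CVM": (0.152, 0.208),
    "RAD": (0.151, 0.221), "LME": (0.157, 0.221), "MME": (0.157, 0.222),
    "MMM": (0.156, 0.227),
}


def shortest_interval(intervals):
    """Return the method whose percentile interval has the smallest length."""
    return min(intervals, key=lambda method: intervals[method][1] - intervals[method][0])


print("Guinea pig data:", shortest_interval(guinea_pig_alpha))
print("Fibres data:", shortest_interval(fibres_alpha))
```

Interval length alone says nothing about coverage or bias, so this check supports only the narrow claim about which intervals are shortest.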


Acknowledgments

The authors would like to thank the editor and two anonymous referees for their suggestions, which have greatly improved the presentation of the paper. The second author also thanks Hana Kamran for translating the abstract into Spanish.

Received: March 2016 — Accepted: November 2016



References

Akram, M. & Hayat, A. (2014), ‘Comparison of estimators of the Weibull Distribution’, Journal of Statistical Theory and Practice 8(2), 238–259.
Alkasasbeh, M. R. & Raqab, M. Z. (2009a), ‘Estimation of the generalized logistic distribution parameters: Comparative study’, Statistical Methodology 6(3), 262–279.
Alkasasbeh, M. R. & Raqab, M. Z. (2009b), ‘Estimation of the generalized logistic distribution parameters: Comparative study’, Statistical Methodology 6(3), 262–279.
Anderson, T. W. & Darling, D. A. (1952), ‘Asymptotic theory of certain “goodness-of-fit” criteria based on stochastic processes’, Annals of Mathematical Statistics 23, 193–212.
Anderson, T. W. & Darling, D. A. (1954), ‘A test of goodness-of-fit’, Journal of the American Statistical Association 49, 765–769.
Casella, G. & Berger, R. L. (1990), Statistical Inference, Brooks/Cole Publishing Company, California.
Cheng, R. C. H. & Amin, N. A. K. (1979), Maximum product of spacings estimation with applications to the lognormal distribution, Technical report, Department of Mathematics, University of Wales.
Cheng, R. C. H. & Amin, N. A. K. (1983), ‘Estimating Parameters in Continuous Univariate Distributions with a Shifted Origin’, Journal of the Royal Statistical Society: Series B 45(3), 394–403.
Cordeiro, G. M., Ortega, E. M. M. & Popović, B. V. (2014), ‘The gamma-linear failure rate distribution: theory and applications’, Journal of Statistical Computation and Simulation 84(11), 2408–2426.
Delignette-Muller, M. & Dutang, C. (2014), ‘fitdistrplus: An R Package for Fitting Distributions’, Journal of Statistical Software 64(4), 1–34.
Dey, S. (2009), ‘Comparison of Bayes estimators of the parameter and reliability function for Rayleigh distribution under different loss functions’, Malaysian Journal of Mathematical Sciences 3, 247–264.
Dey, S. & Das, M. K. (2007), ‘A note on prediction interval for a Rayleigh distribution: Bayesian approach’, American Journal of Mathematical and Management Science 12, 43–48.
Dey, S., Dey, T. & Kundu, D. (2014), ‘Two-parameter Rayleigh distribution: Different methods of estimation’, American Journal of Mathematical and Management Sciences 33, 55–74.
Gupta, R. D. & Kundu, D. (2001), ‘Generalized exponential distribution: different method of estimations’, Journal of Statistical Computation and Simulation 69(4), 315–337.
Gupta, R. D. & Kundu, D. (2007), ‘Generalized exponential distribution: Existing results and some recent developments’, Journal of Statistical Planning and Inference 137(11), 3537–3547.
Hosking, J. R. M. (1990), ‘L-Moments: Analysis and Estimation of Distributions Using Linear Combinations of Order Statistics’, Journal of the Royal Statistical Society. Series B (Methodological) 52(1), 105–124.
Hosking, J. R. M. (1994), ‘The four-parameter kappa distribution’, Journal of Research and Development - IBM 38(3), 251–258.
Iriarte, Y. A., Gómez, H. W., Varela, H. & Bolfarine, H. (2015), ‘Slashed Rayleigh distribution’, Revista Colombiana de Estadística 38(1), 31–44.
Johnson, N. L., Kotz, S. & Balakrishnan, N. (1995), Continuous univariate distributions, Wiley, New York.
Kao, J. (1958), ‘Computer methods for estimating Weibull parameters in reliability studies’, Trans. IRE Reliability Quality Control 13, 15–22.
Kao, J. (1959), ‘A graphical estimation of mixed Weibull parameters in life testing electron tube’, Technometrics 1, 389–407.
Koutras, V. (2011), Two-level software rejuvenation model with increasing failure rate degradation, in ‘Dependable Computer Systems’, Springer-Verlag, New York, pp. 101–115.
Kundu, D. & Howlader, H. (2010), ‘Bayesian inference and prediction of the inverse Weibull distribution for Type-II censored data’, Computational Statistics and Data Analysis 54(6), 1547–1558.
Kundu, D. & Raqab, M. Z. (2005), ‘Generalized Rayleigh distribution: Different methods of estimations’, Computational Statistics and Data Analysis 49(1), 187–200.
Lai, M.-T. (2013), ‘Optimum number of minimal repairs for a system under increasing failure rate shock model with cumulative repair-cost limit’, International Journal of Reliability and Safety 7, 95–107.
Macdonald, P. D. M. (1971), ‘Comment on “An estimation procedure for mixtures of distributions” by Choi and Bulgren’, Journal of the Royal Statistical Society B 33, 326–329.
Maeda, K. & Nishikawa, M. (2006), ‘Duration of party control in parliamentary and presidential governments: A study of 65 democracies, 1950 to 1998’, Comparative Political Studies 39, 352–374.
Mahmoud, M. & Mandouh, R. (2013), ‘The Transmuted Marshall-Olkin Fréchet Distribution: Properties and Applications’, Journal of Applied Sciences Research 9(10), 5553–5561.
Mann, N. R., Schafer, R. E. & Singpurwalla, N. D. (1974), Methods for Statistical Analysis of Reliability and Life Data, Wiley, New York.
Mazucheli, J., Louzada, F. & Ghitany, M. (2013), ‘Comparison of estimation methods for the parameters of the weighted Lindley distribution’, Applied Mathematics and Computation 220, 463–471.
Merovci, F. (2013a), ‘Transmuted generalized Rayleigh distribution’, Journal of Statistics Applications and Probability 2(3), 1–12.
Merovci, F. (2013b), ‘Transmuted Lindley distribution’, International Journal of Open Problems in Computer Science & Mathematics 6(2), 63–72.
Merovci, F. (2013c), ‘Transmuted Rayleigh distribution’, Austrian Journal of Statistics 41(1), 21–31.
Merovci, F., Elbatal, I. & Ahmed, A. (2014), ‘The Transmuted Generalized Inverse Weibull Distribution’, Austrian Journal of Statistics 43(2), 119–131.
Merovci, F. & Puka, L. (2014), ‘Transmuted Pareto distribution’, ProbStat Forum 7, 1–11.
Nadarajah, S., Cordeiro, G. M. & Ortega, E. M. (2013), ‘The exponentiated Weibull distribution: a survey’, Statistical Papers 54(3), 839–877.
Pettitt, A. (1976), ‘A two-sample Anderson-Darling rank statistic’, Biometrika 63(1), 161.
Ranneby, B. (1984), ‘The Maximum Spacing Method. An Estimation Method Related to the Maximum Likelihood Method’, Scandinavian Journal of Statistics 11(2), 93–112.
Rayleigh, L. (1880), ‘On the resultant of a large number of vibrations of the same pitch and of arbitrary phase’, The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 10(60), 73–78.
Renyi, A. (1961), On measures of entropy and information, in ‘Fourth Berkeley symposium on mathematical statistics and probability’, Vol. 1, pp. 547–561.
Saidane, S., Babai, M., Aguir, M. & Korbaa, O. (2010), Spare parts inventory systems under an increasing failure rate demand interval distribution, in ‘41st International Conference on Computers and Industrial Engineering’, pp. 768–773.
Salinas, H. S., Iriarte, Y. A. & Bolfarine, H. (2015), ‘Slashed exponentiated Rayleigh distribution’, Revista Colombiana de Estadística 38(2), 453–466.
Shaked, M. & Shanthikumar, J. G. (1994), Stochastic orders and their applications, Academic Press, Boston, MA.
Shannon, C. E. (1948), ‘Prediction and Entropy of Printed English’, Bell System Technical Journal 30(1), 50–64.
Shaw, W. T. & Buckley, I. R. (2007), The Alchemy of Probability Distributions: Beyond Gram Charlier & Cornish Fisher Expansions, and Skew-Normal or Kurtotic-Normal Distributions, Technical report, Financial Mathematics Group, King’s College, London, U.K.
Singh, S. K., Singh, U. & Sharma, V. K. (2013), ‘Bayesian analysis for Type-II hybrid censored sample from inverse Weibull distribution’, International Journal of System Assurance Engineering and Management 4(3), 241–248.
Song, K. S. (2001), ‘Rényi information, loglikelihood and an intrinsic distribution measure’, Journal of Statistical Planning and Inference 93(1-2), 51–69.
Stephens, M. A. (2013), ‘EDF Statistics for Goodness of Fit and Some Comparisons’, Journal of the American Statistical Association 69(347), 730–737.
Swain, J. J., Venkatraman, S. & Wilson, J. R. (1988), ‘Least-squares estimation of distribution functions in Johnson’s translation system’, Journal of Statistical Computation and Simulation 29(4), 271–297.
Teimouri, M., Hoseini, S. M. & Nadarajah, S. (2013), ‘Comparison of estimation methods for the Weibull distribution’, Statistics 47(1), 93–109.
Tian, Y., Tian, M. & Zhu, Q. (2014), ‘Transmuted Linear Exponential Distribution: A New Generalization of the Linear Exponential Distribution’, Communications in Statistics-Simulation and Computation 43(10), 2661–2677.
Tsarouhas, P. & Arvanitoyannis, I. (2007), ‘Reliability and maintainability analysis of bread production line’, Critical Reviews in Food Science and Nutrition 50, 327–343.
Woosley, R. & Cossman, J. (2007), ‘Drug development and the FDA’s critical path initiative’, Public policy 81, 129–133.
