Revista Colombiana de Estadística January 2017, Volume 40, Issue 1, pp. 165 to 203 DOI: http://dx.doi.org/10.15446/rce.v40n1.56153
Statistical Properties and Different Methods of Estimation of Transmuted Rayleigh Distribution Propiedades estadśticas y diferentes métodos de estimación de la distribución de Rayleigh transmutada Sanku Dey1,a , Enayetur Raheem2,b , Saikat Mukherjee3,c 1 Department 2 Applied
or Statistics, St. Anthony’s College, Shillong, India
Statistics and Research Methods, University of Northern Colorado, Greeley, USA 3 NIT
Meghalaya, Shillong, India
Abstract This article addresses various properties and different methods of estimation of the unknown Transmuted Rayleigh (TR) distribution parameters from a frequentist point of view. Although our main focus is on estimation, various mathematical and statistical properties of the TR distribution (such as quantiles, moments, moment generating function, conditional moments, hazard rate, mean residual lifetime, mean past lifetime, mean deviation about mean and median, the stochastic ordering, various entropies, stress-strength parameter, and order statistics) are derived. We briefly describe different methods of estimation such as maximum likelihood, method of moments, percentile based estimation, least squares, method of maximum product of spacings, method of Cramér-von-Mises, methods of AndersonDarling and right-tail Anderson-Darling, and compare them using extensive simulations studies. Finally, the potentiality of the model is studied using two real data sets. Bias, standard error of the estimates, and bootstrap percentile confidence intervals are obtained by bootstrap resampling. Key words: Distributional Moments, Order Statistics, Parameter Estimation; Rate Risk Function, Rayleigh Distribution, Transmuted Rayleigh Distribution.
a PhD.
E-mail:
[email protected] E-mail:
[email protected] c PhD. E-mail:
[email protected]
b PhD.
165
166
Sanku Dey, Enayetur Raheem & Saikat Mukherjee Resumen Este artículo se aborda las varias propiedades y diferentes métodos para la estimación de los desconocidos parámetros de Transmuted Rayleigh (TR) distribución desde el punto de vista de un frequentist. Aunque la tema principal de este artículo es estimación, varias propiedades matemáticas y estadísticas de TR distribución (como cuantiles, momentos, una función que genera momentos, momentos condicionales, la tasa de peligro, la media vida residual, media vida pasada, la desviación media por media y mediana, organización stochastic, entropías varias, parámetros de tensión-fuerza y estadisticas de orden) están derivadas. Describimos brevemente los diferentes métodos de estimación, como máxima probabilidad, método de momentos, estimacién basada por percentil, mínimos cuadrados, método de máximos productos de espacios, el método de Cramér-von-Mises, los métodos de Anderson-Darling y right-tail Anderson-Darling, y compararlos con extensos estudios de simulaciones. Por último, la potencialidad del modelo está estudiando con dos conjuntos de datos reales. El margen de error, el promedio de error de las estimaciones y el percentage bootstrap de los confianza intervalos estan derivido por bootstrap remuestro. Palabras clave: momentos distributivos, estadísticas de orden, la estimación de parámetros, riesgo de tipo de función, rayleigh transmutada.
1. Introduction Rayleigh distribution was introduced by Rayleigh (1880) and relates to a problem in the field of acoustics. This distribution has been extensively used in various fields such as communication engineering, the life-testing of electro vacuum devices, reliability theory, and survival analysis. An important characteristic of this distribution is that its failure rate is a linear function of time. The reliability function of the Rayleigh distribution decreases at a much higher rate than the reliability function of exponential distribution. This distribution relates to a number of distributions such as generalized extreme value, Weibull and Chi-square distributions and, hence, its applicability in real life situations is significant. Estimations, predictions, and inferential issues for one parameter Rayleigh distributions have been extensively studied by several authors. Interested readers may refer to Johnson, Kotz & Balakrishnan (1995) for on excellent insight into the Rayleigh distribution, and also see Dey & Das (2007), Dey (2009) for some references. The construction of the transmuted distribution is rather simple and was first proposed by Shaw & Buckley (2007). Since then, transmuted distributions have been widely studied in statistics, and many authors have developed various transmuted-type distributions based on some well known distributions. See, for example, the transmuted extreme value distribution with applications by Alkasasbeh & Raqab (2009a); the transmuted Lindley distribution by Merovci (2013b); the transmuted generalized Rayleigh distribution by Merovci (2013a); the transmuted exponentiated exponential distribution by Merovci (2013c); the transmuted Fréchet distribution by Mahmoud & Mandouh (2013); the transmuted generalized inverse Weibull distribution by Merovci, Elbatal & Ahmed (2014); the transmuted Revista Colombiana de Estadística 40 (2017) 165–203
Estimation of Transmuted Rayleigh distribution
167
Linear exponential distribution by Tian, Tian & Zhu (2014); and the transmuted Pareto distribution by Merovci & Puka (2014), etc. Hence, efficient estimation of these parameters is extremely important. We know that the maximum likelihood estimation (MLE) and the method of moments estimation (MME) are traditional methods of estimation. Although MLE is advantageous in terms of its efficiency and has good theoretical properties, there is evidence that it does not perform well, especially in the case of small samples. The method of moments is easily applicable and often gives explicit forms for estimators of unknown parameters. There are, however, cases where the method of moments does not give explicit estimators (e.g., for the parameters of the Weibull and Gompertz distributions). Therefore, other methods have been proposed in the literature as alternatives to the traditional methods of estimation. Among them, the L-moments estimator (LME), least squares estimator (LSE), generalized spacing estimator (GSE), and percentile estimator (PCE) are often suggested. Generally, these methods do not have good theoretical properties, but in some cases they can provide better estimates of the unknown parameters than the MLE and the MME. This paper considers ten different frequentist estimators for the transmuted Rayleigh distribution and evaluates their performance for different sample sizes and different parameter values. Simulations are used to compare the performance as it is not possible to compare all estimators theoretically (see Gupta & Kundu 2001, Gupta & Kundu 2007). The uniqueness of this study comes from the fact that we provide a comprehensive description of the mathematical and statistical properties of this distribution with the hope that they will attract wider applications in lifetime analysis. Also, to the best of our knowledge thus far, no attempt has been made to compare all these estimators to the two-parameter TR distribution along with mathematical and statistical properties. Comparisons of estimation methods for other distributions have been performed in the literature: Kundu & Raqab (2005) for generalized Rayleigh distributions, Alkasasbeh & Raqab (2009b) for generalized logistic distributions, Mazucheli, Louzada & Ghitany (2013) for weighted Lindley distribution, Teimouri, Hoseini & Nadarajah (2013) for the Weibull distribution, Akram & Hayat (2014) for Weibull distribution, and Dey, Dey & Kundu (2014) for twoparameter Rayleigh distribution. There are some works found in the literature on transmuted Rayleigh distribution and its variants such as the slashed Rayleigh distribution Iriarte, Gómez, Varela & Bolfarine (2015) and its exponential variant Salinas, Iriarte & Bolfarine (2015). The transmuted Rayleigh (TR) distribution was introduced by Merovci (2013c). He only studied moments and the maximum likelihood estimation of the unknown parameters. A random variable X is said to have a transmuted Rayleigh distribution if its cumulative distribution function(cdf) is given by 2 2 G(x, α, λ) = 1 − e−αx 1 − λ + λe−αx ; x > 0, α > 0, |λ| ≤ 1 (1)
Revista Colombiana de Estadística 40 (2017) 165–203
168
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
with corresponding pdf 2
g(x, α, λ) = 2αxe−αx
2
1 − λ + 2λe−αx
x > 0, α > 0, |λ| ≤ 1,
;
and the survival function is given by 2 2 S(x, α, λ) = e−αx 1 − λ + λe−αx ;
x > 0 , α > 0, |λ| ≤ 1.
(2)
(3)
The distribution studied in this paper is a re-parameterization of the the version proposed by Merovci (2013c) with α = (2σ 2 )−1 . Note that the classical Rayleigh distribution is a special case for λ = 0. Figure 1 illustrates some of the possible shapes of the pdf of a transmuted Rayleigh distribution for selected values of the parameters λ and α. This distribution has a unimodal pdf and increasing hazard rates. The latter may seem unrealistic at first in real life situation. However, in several situations, only increasing hazard rates are used or observed: Maeda & Nishikawa (2006) state that “Ruling parties in presidential systems face an increasing hazard rate in their survival”; Woosley & Cossman (2007) observe that drugs during clinical development have increasing hazard rates; Saidane, Babai, Aguir & Korbaa (2010) suppose that the demand interval in spare parts inventory systems has increasing hazard rates; Tsarouhas & Arvanitoyannis (2007) show that machines involved in bread production display increasing hazard rates; Koutras (2011) finds that software degradation times have increasing hazard rates; and Lai (2013) investigates the optimum number of minimal repairs for systems under increasing hazard rates; etc. 0.8
2.0
Α0.2, Λ1.0 0.6
Α0.5, Λ0.5 1.5
Α0.2, Λ0.5
Α1.0, Λ0.5
Α0.2, Λ 0.5
Α2.0, Λ0.5
Α0.2, Λ 1.0
0.4
0.2
0.0
Α3.0, Λ0.5
1.0
0.5
0
1
2
3
4
5
0.0
0
1
2
3
4
5
Figure 1: Probability density function of the TR distribution for different values of parameters α and λ.
The principal motivation of the paper is two fold: one is empirical and shows that the studied transmuted Rayleigh distribution outperforms at least two twoparameter distributions with respect to a real data set; the other is to show how different frequentist estimators of this distribution perform for different sample sizes and different parameter values, and to develop a guideline to choose the best estimation method for the transmuted Rayleigh distribution, which we think would be of interest to applied statisticians. The paper is organized as follows: Various mathematical, statistical and reliability properties of the TR distribution (like shapes, quantiles, moments, moment Revista Colombiana de Estadística 40 (2017) 165–203
169
Estimation of Transmuted Rayleigh distribution
generating function, hazard rate function, mean residual lifetime, and conditional moments, etc.) are derived in Sections 1 and 2. In Section 3, ten most frequent methods of estimations are discussed. Inference with simulation and two real data applications for the TR distribution are discussed in Section 4 and 5. In the last section, we draw some conclusions about the information contained in this article.
1.1. Shape The study of shapes is useful to determine if a data set can be modeled by the TR distribution. Here, we discuss the shapes of the pdf g. Theorem 1. The probability density function, g, is unimodal. Proof . Notice that g is a smooth function. To prove it is unimodal, we shall first show that there is one and only one x∗ ∈ R+ , such that g 0 (x∗ ) = 0, and g 0 > 0 for 2 x < x∗ & g 0 < 0 for x > x∗ . Letting u = e−αx , we have g 0 (x) = 2αe−2u [eu (λ − 1)(2u − 1) + (2 − 8u)λ] . Since 0 < u < ∞, g 0 >, =, < 0 depends on whether eu (λ − 1)(2u − 1) + (2 − 8u)λ >, =, < 0, respectively. Now g 0 (x) = 0 implies that eu (λ − 1)(2u − 1) + (2 − 8u)λ = 0. Case I: (λ = 1.) This implies u∗ = 1/4, and hence x∗ =
1 √ . 2 α
Now, for x > x∗ , i.e. u∗ > 1/4, we have eu (λ−1)(2u−1)+(2−8u)λ = 2−8u < 0 and, hence, g 0 < 0. Similarly, for x < x∗ we obtain g 0 > 0. u
λ Case II: (λ 6= 1.) In this case, we have Q(u) := (2u−1)e = λ−1 . Note that Q is a 8u−2 1 λ 1 ≤ 12 strictly increasing, continuous function of u on (0, 4 ) and ( 4 , ∞). Since λ−1 1 1 1 and Q has range ( 2 , ∞) on (0, 4 ) and range (−∞, ∞) on ( 2 , ∞), there is a unique λ u∗ ∈ ( 14 , ∞) such that Q(u∗ ) = λ−1 . Hence there is a unique x∗ > 2√1 α such that g 0 (x∗ ) = 0.
Now, for u > u∗ > 1/4, 1/4 < u < u∗ and 0 < u < 1/4 < u∗ , one can easily show that g 0 < 0, > 0 and > 0, respectively. Hence g is unimodal.
2. Statistical and Mathematical Properties In this section, we provide some important statistical and mathematical measures for the TR distribution such as quantiles, moment generating functions, moments, hazard rate and mean residual life functions, mean past lifetime, conditional moments, Stochastic ordering, mean deviation about mean and median, Shannon and Rényi entropy, order statistics, and stress strength parameters. The quantile function x = Q(p) = G−1 (p), for 0 < p < 1, of the TR distribution is obtained from (1); it follows that the quantile function is Revista Colombiana de Estadística 40 (2017) 165–203
170
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
"
(λ − 1) +
1 x = − ln α
!# 12 p (λ − 1)2 + 4λ(1 − p) . 2λ
(4)
In particular, the median of the TR distribution can be written as !# 12 √ (λ − 1) + λ2 + 1 . 2λ
"
1 M d(X) = Md = − ln α
(5)
We hardly need to emphasize the necessity and importance of the moments in any statistical analysis, especially in applied work. Some of the most important features and characteristics of a distribution can be studied through moments (e.g., tendency, dispersion, skewness, and kurtosis). If the random variable X is distributed T R(α, λ), then its nth moment around zero can be expressed as Z
n
E(X ) = 2α
∞
xn+1 exp −αx2
1 − λ + 2λ exp −αx2
dx.
0
On simplification, we get E(X n ) = α−n/2 Γ(1 +
n ) 1 − λ + λ2−n/2 2
(6)
The variance, skewness, and kurtosis measures can now be calculated using the following relations V ar(X) = E(X 2 ) − E 2 (X), Skewness(X) =
Kurtosis(X) =
E(X 3 ) − 3E(X)E(X 2 ) + 2E 3 (X) , V ar3/2 (X) E(X 4 ) − 4E(X)E(X 3 ) + 6E(X 2 )E 2 (X) − 3E 4 (X) . V ar2 (X)
Figures 2 and 3 illustrate their variations.
2.1. Moment Generating Function Many of the interesting characteristics and features of a distribution can be obtained via its moment generating function and moments. Let X denote a random variable with the probability density function (2). By definition of the moment generating function of X and using (2), we have MX (t) = E(etx ) =
Z
∞
etx g(x)dx
0
√ t2 t2 √ √ 4 α − 2 π(λ − 1)te 4α Erf 2√t α + 1 + 2πλte 8α Erf √ t√ +1 2 2 α = √ 4 α
(7)
Revista Colombiana de Estadística 40 (2017) 165–203
171
Estimation of Transmuted Rayleigh distribution
1.0 3 0.5 2 1.0
Α
0.5 0.0 Λ
1 0.5 1.0
(a)
(b)
Figure 2: (a) Expectation and (b) Variance of TR −1, −0.9, −0.8, · · · , 1 and α = 0.2, 0.4, 0.6, . . . , 3.
distribution
for
λ
=
4.0
0.8 0.7 0.6 0.5
3
3 3.5
2
1.0
Α
0.5 0.0 Λ
1 0.5
2 1.0
Α
0.5 0.0 Λ
1 0.5
1.0
(a)
1.0
(b)
Figure 3: (a) Skewness and (b) Kurtosis of TR distribution −1, −0.9, −0.8, . . . , 1 and α = 0.2, 0.4, 0.6, . . . , 3.
for
λ
=
2.2. Stochastic Ordering Stochastic ordering is a tool used to study structural properties of complex stochastic systems. For example, it is useful to control congestion in informationtransfer over the Internet, to define treatment-related trends in clustered binary data, and order expected welfare income under different mechanisms of allocating rewards. There are different types of stochastic orderings, which are useful in ordering random variables in terms of different properties. Here we consider four different stochastic orders, namely, the usual, the hazard rate, the mean residual life, and the likelihood ratio order for two independent TR random variables under a restricted parameter space. If Xand Y are independent random variables with CDFs FX and FY , respectively, then X is said to be smaller than Y in the • stochastic order (X ≤st Y ) if FX (x) ≥ FY (x) for all x Revista Colombiana de Estadística 40 (2017) 165–203
172
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
• hazard rate order (X ≤hr Y ) if hX (x) ≥ hY (x) for all x • mean residual life order (X ≤mrl Y ) if mX (x) ≥ mY (x) for all x • likelihood ratio order (X ≤lr Y ) if
fX (x) fY (x)
decreases in x.
The following results due to Shaked & Shanthikumar (1994) are well known for establishing stochastic ordering of distributions. X ≤lr Y =⇒ X
≤hr Y =⇒ X ≤mrl Y ⇓ X ≤st Y
(8)
The TR is ordered with respect to the strongest “likelihood ratio” ordering as shown in the following theorem. It shows the flexibility of the two parameter TR distribution. Theorem 2. Let X∼ T R (α1 , λ1 ) and Y∼ T R (α2 , λ2 ). If α1 = α2 = α and λ1 ≥ λ2 , then X ≤lr Y and, hence, X ≤hr Y, X ≤mrl Y and X ≤st Y. Proof . The likelihood ratio is 2
2
α1 xe−α1 x (1 − λ1 + 2λ1 e−α1 x ) fX (x) = fY (x) α2 xe−α2 x2 (1 − λ2 + 2λ2 e−α2 x2 ) 2
Thus,
fX (x) 4α1 xλ1 e−α1 x d log = −2(α1 − α2 )x − dx fY (x) (1 − λ1 + 2λ1 e−α1 x2 ) 2
4α2 xλ2 e−α2 x + . (1 − λ2 + 2λ2 e−α2 x2 ) (x) d Now, if α1 = α2 = α and λ1 ≥ λ2 , then dx log ffX ≤ 0, which implies that Y (x) X ≤lr Y and, hence, X ≤hr Y, X ≤mrl Y and X ≤st Y.
2.3. Hazard Function The hazard rate function of the transmuted Rayleigh distribution is given by 2
h(x) =
f (x) 2αx(1 − λ + 2λe−αx ) = . 1 − F (x) (1 − λ + λe−αx2 )
(9)
Figure 4 illustrates some of the possible shapes of h. Even though, in Figure 4, it apparently looks like that the hazard rate function of the TR distribution is always increasing, it can be noted that for some parameter values, the hazard function decreases. For example if α = 0.2 and λ = 0.92, h decreases in the interval (4, 4.4). Revista Colombiana de Estadística 40 (2017) 165–203
173
Estimation of Transmuted Rayleigh distribution 5
30
Α0.2, Λ1.0
Α0.5, Λ0.5 25
Α0.2, Λ0.5
4
Α1.0, Λ0.5
Α0.2, Λ 0.5
Α2.0, Λ0.5
20
Α0.2, Λ 1.0
3
Α3.0, Λ0.5 15
2 10
1
0
5
0
0
1
2
3
4
5
6
0
1
2
(a)
3
4
5
(b)
Figure 4: Hazard Function of TR distribution for different values of parameters α and λ.
2.4. Mean Residual Life Function The mean residual life MRL is the expected remaining life, X − x, given that the item has survived to time x. Figure 5 illustrates some of the possible shapes of µ. µ(x) = E(X − x | X > x) Z ∞ y g(y)dy = x −x 1 − G(x) √ √ √ 2√ e2αx π 2(1 − λ) Erfc(x α) + 2λ Erfc(x 2α) = . √ 4 α λ + (1 − λ)eαx2 One can easily show that limx→∞ µ(x) = 0. 3.0
1.5 2.5
Α0.5, Λ0.5
Α0.2, Λ1.0
Α1.0, Λ0.5
Α0.2, Λ0.5
2.0
1.0
Α0.2, Λ 0.5
Α2.0, Λ 0.5 Α3.0, Λ 0.5
Α0.2, Λ 1.0
1.5
1.0
0.5
0.5
0.0
0.0 0
5
10
15
20
0
(a)
5
10
15
20
(b)
Figure 5: MRL function of TR distribution for different values of parameters α and λ.
2.5. Mean Past Lifetime (MPL) In a real life situation, where systems are often not monitored continuously, one might be interested in inferring more about the history of the system e.g. Revista Colombiana de Estadística 40 (2017) 165–203
174
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
when the individual components have failed. Now assume that a component with lifetime X has failed at or some time before x, x ≥ 0. Consider the conditional random variable x − X | X ≤ x. This conditional random variable shows, in fact, the time elapsed from the failure of the component, given that its lifetime is less than or equal to x. Hence, the mean past lifetime (MPL) of the component can be defined as Rx tf (t)dt F (t)dt =x− 0 k(x) = E[x − X | X ≤ x] = F (x) F (x) √ √ √ √ √ 2 π ( 2λ Erfc( 2 αx)−2(λ−1) Erfc( αx)) −αx2 √ + (λ − 1)x −e + λxe−2αx + x − 4 α = . 1 − e−αx2 1 + λe−αx2 Rx 0
One can easily show that k(x) → ∞ as x → 0, ∞. Figure 6 illustrates some of the possible shapes of k.
30
30
25
25
Α0.2, Λ1.0 Α0.2, Λ0.5
20
Α0.5, Λ0.5 Α1.0, Λ0.5
20
Α0.2, Λ 0.5
Α2.0, Λ 0.5
Α0.2, Λ 1.0
15
10
10
5
5
0
0
5
Α3.0, Λ 0.5
15
10
15
20
0
0
5
10
(a)
15
20
(b)
Figure 6: MPL function of TR distribution for different values of parameters α and λ.
2.6. Conditional Moments For lifetime models, it is also of interest to know what E(X n | X > x) is. It can be easily seen that 1 E(X | X > x) = S(x) n
Z
∞
xn f (x)dx
x
In particular eαx E(X | X > x) =
2
√
√ √ √ √ π ( 2λErfc( 2 αx)−2(λ−1)Erfc( αx)) √ 4 α
+ xe−2αx
2
λ − (λ − 1)eαx
2
λ e−αx2 − 1 + 1 Revista Colombiana de Estadística 40 (2017) 165–203
175
Estimation of Transmuted Rayleigh distribution
2 2(λ − 1)eαx αx2 + 1 − λ 2αx2 + 1 E(X | X > x) = 2α (λ − 1)eαx2 − λ 2
√ √ √ √ √ 2 2 12 π(λ − 1)e2αx Erfc ( αx) − 3 2πλe2αx Erfc 2 αx E(X | X > x) = 16α3/2 (λ − 1)eαx2 − λ 3
√ 2 4 αx 2(λ − 1)eαx 2αx2 + 3 − λ 4αx2 + 3 + 16α3/2 (λ − 1)eαx2 − λ and 2
4
E(X | X > x) =
2(λ − 1)eαx
α2 x4 + 2αx2 + 2 − λ 2α2 x4 + 2αx2 + 1 2α2 (λ − 1)eαx2 − λ
The mean residual lifetime function is E(X | X > x) − x.
2.7. Mean Deviation The mean deviations of the mean and the median can be used as measures of spread in a population. Let µ = E(X) and M be the mean and the median of the transmuted Rayleigh distribution, respectively. The mean deviations about the mean and of the median can be calculated as Z
∞
Z |x − µ|g(x) dx
δ1 (X) = 0
∞
|x − M |g(x) dx,
δ2 (X) =
(10)
0
respectively.
Z
∞
E (|X − m|) =
|x − m|g(x) dx Z0 m
Z
∞
(m − x)g(x)dx + (x − m)g(x)dx 0 m Z m Z ∞ = 2m g(x)dx − m − E(X) + 2 xg(x) dx 0 m Z ∞ = 2mF (m) + 2 xg(x) dx − E(X) − m
=
m
Revista Colombiana de Estadística 40 (2017) 165–203
176
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
Therefore, since we have µ =
√ √ π (( 2−2)λ+2) √ 4 α
r and M =
1 α
log
√
Z ∞ δ1 (X) = 2µF (µ) − 2µ + 2 xg(x) dx µ √ 1 √ √ 1√ √ π 2λ Erfc π − 2λ + λ + 2 = √ 2 2 α √ √ 1 +2(1 − λ) Erfc π 2−2 λ+2 , 4 Z
2λ λ2 +1+λ−1
,
(11)
∞
xg(x)dx − µ " s √ √ 2λ 1 + 2(1 − λ) απλ + λ 2απ = 4 α log √ 2 4α λ +1+λ−1 s ! √ 2λ −2λ 2απErf 2 log √ λ2 + 1 + λ − 1 s !# √ 2λ log √ . −4 απ(1 − λ)Erf λ2 + 1 + λ − 1
δ2 (X) = 2
M
(12)
2.8. Entropies An entropy provides an excellent tool to quantify the amount of information (or uncertainty) contained in a random observation regarding its parent distribution (population). A large value of entropy implies greater uncertainty in the data. The concept of entropy is important in different areas such as physics, probability and statistics, communication theory, and economics, etc. Several measures of entropy have been studied and compared in the literature. If X has the probability, distribution function f (·) then the Shannon entropy Shannon (1948) is defined by H(x) = −E (ln g(x)) Z ∞ =− g(x) ln g(x)dx Z0 ∞ h i 2 2 =− g(x) ln 2αxe−αx 1 − λ + 2λe−αx dx 0 1 =− − 2(log(α) + λ log(2) + γ) + 4 log(2α) + 2(λ − 2) 4 −2λ + (λ − 1)2 (− log(1 − λ)) + (λ + 1)2 log(λ + 1) + . λ Rényi entropy (Renyi 1961) can be expressed as Z ∞ 1 β Hβ (x) = ln g (x)dx , 1−β 0
(13)
β > 0, β 6= 1
Revista Colombiana de Estadística 40 (2017) 165–203
177
Estimation of Transmuted Rayleigh distribution
Then Hβ (x) can as simplified as follows: for −1 ≤ λ < 31 , √ − ln 2 − ln α +
1 ln Γ 1−β
β+1 2
+
k ∞ X β+1 1 β 2λ (β + k)− 2 , ln 1−β 1−λ k k=0
P∞ β k k √ β+1 β+1 1 < λ < 1, − ln 2−ln α+ 1−β ln 2 λ (1−λ)β−k α− 2 (β +k)− 2 k=0 k β+1 2λ β−k β−k k − β+1 2 + 2 λ (1 − λ) (2β − k) − Ga 2 , (k + β) ln 1−λ Γ β+1 2 √ 2λ Ga β+1 , and for λ = 1, − 32 ln 2 − ln α − β+1 2 , (2β − k) ln 1−λ 2 ln β R ∞ + ln Γ β+1 , where Ga(a, x) = x ta−1 e−t dt. It can be noted that, when β → 1, 2 the Rényi entropy converges to the Shannon entropy. For further details, see Song (2001). for
1 3
2.9. Order Statistics Moments of order statistics play an important role in quality control testing and reliability to predict the failure of future items based on the times of few early failures. Thus, the kth order statistic of a sample is its kth smallest value. For a sample of size n, the nth order statistic (or largest order statistic) is the maximum, that is, X(n) = max{X1 , . . . , Xn }. The sample range is the difference between the maximum and minimum. It is clearly a function of the order statistics: range{X1 , . . . , Xn } = X(n) − X(1) . We know that if X(1) ≤ · · · ≤ X(n) denotes the order statistic of a random sample X1 , . . . , Xn from a continuous population with cdf GX (x) and pdf gX (x), then the pdf of X(j) is given by gX(j) (x) =
n! j−1 n−j gX (x) (GX (x)) (1 − GX (x)) , (j − 1)!(n − j)!
for j = 1, . . . , n. The pdf of the jth order statistic for a transmuted Rayleigh distribution is given by 2 2 n! 2αxe−αx 1 − λ + 2λe−αx (j − 1)!(n − j)! h ij−1 2 2 × 1 − e−αx 1 − λ + λe−αx h in−j 2 2 × e−αx 1 − λ + λe−αx .
gX(j) (x) =
Revista Colombiana de Estadística 40 (2017) 165–203
178
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
Therefore, the pdf of the largest order statistic X(n) is 2 2 gX(n) (x) =2nαxe−αx 1 − λ + 2λe−αx h in−1 2 2 × 1 − e−αx 1 − λ + λe−αx , and the pdf of the smallest order statistic X(1) is 2 2 gX(1) (x) = 2nαxe−αx 1 − λ + 2λe−αx .
2.10. Distribution of Minimum, Maximum and Median Let X1 , X2 , . . . , Xn be an independently identically distributed order random variables from the TR distribution having first, last, and median order probability density functions, which are given by the following n−1
g1:n (x) = n [1 − G(x, Φ)] g(x, Φ) 2 2 = 2nαxe−αx 1 − λ + 2λe−αx h in−1 2 2 × e−αx 1 − λ + λe−αx ,
(14)
n−1 gn:n (x) = n G(x(n) , Φ) g(x(n) , Φ) h in−1 2 2 = 2nαx 1 − e−αx 1 − λ + λe−αx 2 2 × e−αx 1 − λ + 2λe−αx ,
(15)
and (2m + 1)! (G(x))m (1 − G(x))m g(x) m! m! im 2 2 (2m + 1)! h = 1 − e−αx 1 − λ + λe−αx h m! m! i
gm+1:n (e x) =
× e
−αx2
−αx2
m
(16)
1 − λ + λe 2 2 × 2αxe−αx 1 − λ + 2λe−αx ,
where Φ = (α, λ).
2.11. Stress Strength Parameter In reliability theory, the stress-strength model describes the life of a component or item which has a random strength X1 that is subjected to a random stress X2 . The component fails instantaneously when the stress applied to it surpasses Revista Colombiana de Estadística 40 (2017) 165–203
179
Estimation of Transmuted Rayleigh distribution
the strength, and the component will function satisfactorily/acceptably whenever X1 > X2 . So, R = P r (X1 > X2 ) is a measure of component reliability. Its applicability is found in many spheres especially in engineering fields such as structures, deterioration of rocket motors, fatigue failure of aircraft structures, and the aging of concrete pressure vessels, etc. Extensive work on estimation of reliability of stress-strength models has been undertaken for the well known standard distributions. However, there are still some distributions (including generalizations of the well-known distributions) for which the form of R has not been investigated. Here, we derive the reliability R when X1 and X2 are independent random variables distributed with parameters (α1 , λ1 ) and (α2 , λ2 ) , then Z∞ R=
g1 (x)G2 (x) dx 0
Z∞ =
2 2 2 2 2α1 xe−α1 x (1 − λ1 + 2λ1 e−α1 x )[1 − e−α2 x (1 − λ2 + λ2 e−α2 x )] dx (17)
0
α2 α12 (2 − λ1 )(1 + λ2 ) + α1 α2 (5 + λ2 − λ1 (2 − λ2 )) + 2α22 = (α1 + α2 )(2α1 + α2 )(α1 + 2α2 )
If α1 = α2 = α, then R=
3 − λ1 + λ2 6
(18)
and if λ1 = λ2 = λ, then α2 α12 −λ2 + λ + 2 + α1 α2 λ2 − λ + 5 + 2α22 R= . (α1 + α2 )(2α1 + α2 )(α1 + 2α2 )
(19)
3. Methods of Estimation In this section, we describe ten methods to estimate the parameters, α and λ, the TR distribution. We assume throughout that x = (x1 , · · · xn ) is a random sample of size n from the TR distribution; both parameters α and λ are unknown.
3.1. Method of Maximum Likelihood Estimation The ML method is the most frequently used method of parameter estimation (Casella & Berger 1990). Its success stems from its many desirable properties including consistency, asymptotic efficiency, invariance property as well as its intuitive appeal. Let x1 , . . . , xn be a random sample of size n from (2), the likelihood function of the density (2) is given by Revista Colombiana de Estadística 40 (2017) 165–203
180
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
L(α, λ; x) =
n Y
g(xi , α, λ)
i=1 n n
= 2 α exp −α
n X i=1
! x2i
n Y
n Y 2 xi 1 − λ + 2λe−αxi .
i=1
(20)
i=1
The log-likelihood function without constant terms is given by `(α, λ; x) = log L(α, λ; x) n n n X X X 2 = log(xi ) + n log α − α x2i + log 1 − λ + 2λe−αxi . i=1
i=1
i=1
For ease of notation, we will denote the first partial derivatives of any function f (x, y) by fx , and fy , and its second partial derivatives by fxx , fyy , fxy , and fyx . Now setting `α = 0 and `λ = 0 , we have 2 n n X x2i e−αxi n X 2 x − 2λ `α = − = 0, 2 α i=1 i (1 − λ + 2λe−αxi ) i=1
(21)
and `λ =
n X i=1
2
2e−αxi − 1 2 = 0. 1 − λ + 2λe−αxi
(22)
ˆ of θ = (α, λ) is obtained by solving this nonlinear system of The MLE θˆ = (ˆ α, λ) equations. It is usually more convenient to use nonlinear optimization algorithms such as the quasi-Newton algorithm to numerically maximize the sample likelihood function given in (20).
3.2. Method of Moment Estimation The MMEs of the two-parameter TR distribution can be obtainedP by equating n thePfirst two theoretical moments of (2) with the sample moments n1 i=1 xi and n 1 2 i=1 xi respectively, n n 1X 1 xi = α−1/2 Γ 1 + 1 − λ + λ2−1/2 , n i=1 2
(23)
and n
1X 2 xi = α−1 1 − λ + λ2−1 . n i=1
(24)
Revista Colombiana de Estadística 40 (2017) 165–203
181
Estimation of Transmuted Rayleigh distribution
3.3. Method of Least-Square Estimation The least square estimators and weighted least square estimators were proposed by Swain, Venkatraman & Wilson (1988) to estimate the parameters of Beta distributions. In this paper, we apply the same technique for the TR distribution. The least square estimators of the unknown parameters α and λ of TR distribution can be obtained by minimizing n X F (X(j) ) − j=1
j n+1
2
with respect to unknown parameters α and λ. Suppose F (X(j) ) denotes the distribution function of the ordered random variables X(1) < X(2) < · · · < X(n) , where {X1 , X2 , · · · , Xn } is a random sample of size n from a distribution function F (·). Therefore, in this case, the least square ˆ LSE respectively, can be obtained by miniestimators of α and λ, say α ˆ LSE and λ mizing 2 n X j −αx2(i) −αx2(i) 1−e 1 − λ + λe − n+1 j=1 with respect to α and λ. The weighted least square estimators of the unknown parameters can be obtained by minimizing 2 n X j wj F (X(j) ) − n+1 j=1 2
(n+2) with respect to α and λ. The weights wj are equal to V (X1(j) ) = (n+1) j(n−j+1) . Therefore, in this case, the weighted least square estimators of α and λ, say α ˆ W LSE ˆ W LSE respectively, can be obtained by minimizing and λ
2 n X (n + 1)2 (n + 2) j −αx2(i) −αx2(i) 1−e 1 − λ + λe − n−j+1 n+1 j=1 with respect to α and λ.
3.4. Method of Percentile Estimation If the data comes from a distribution function which has a closed form, then we can estimate the unknown parameters by fitting a straight line to the theoretical points obtained from the distribution function and the sample percentile points. This method was originally suggested by Kao (1958, 1959) and it has been used for Weibull distribution and for generalized exponential distribution. In this paper, we apply the same technique for the TR distribution. Since, 2 2 G(x, α, λ) = 1 − e−αx 1 − λ + λe−αx ; Revista Colombiana de Estadística 40 (2017) 165–203
182
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
therefore, "
1 x = − ln α
(λ − 1) +
!# 12 p (λ − 1)2 + 4λ(1 − p) . 2λ
Let X(j) be the jth order statistic, i.e, X(1) < X(2) < · · · < X(n) . If pj denotes an estimate of G(x(j) ; α, λ), then the estimate of α and λ can be obtained by minimizing n X
"
x(j) − − 1 ln α j=1
(λ − 1) +
p
(λ − 1)2 + 4λ(1 − pj ) 2λ
!# 12 2
with respect to α and λ. The estimates of α and λ can be obtained by solving the following nonlinear equations n X j=1
λ2 +2λ−4λpj +1 2λ
+
√
s αx(j)
log
√ 2λ λ−1+ λ2 +2λ−4λpj +1
=0
α2
q v u λ+ λ2 +2λ−4λpj +1−1 u t log p 2λ λ2 + 2λ − 4λpj + 1 + λ(2pj − 1) − 1 − − x(j) α
n X j=1
log
√
λ−1+
√
v u u p p 2 2 αλ λ + λ(2 − 4pj ) + 1 λ + λ + 2λ − 4λpj + 1 − 1 tlog
! = 0, 2λ q λ+ λ2 +2λ−4λpj +1−1
respectively. We call the corresponding estimators the percentile estimators or PCE’s. Several estimators of pj can be used in this case, see, for example, Mann, j Schafer & Singpurwalla (1974). In this paper, we consider pj = n+1 .
3.5. Method of L-Moments Estimation In this section, we provide the L-moments estimators, which can be obtained as the linear combinations of order statistics. The L-moments estimators were originally proposed by Hosking (1990), and it is observed that the L-moments estimators are more robust than the usual moment estimators. The L-moment estimators are also obtained in the same way as the ordinary moment estimators, i.e. by equating the sample L-moments with the population L-moments. L-moment estimation provides an alternative method of estimation that is analogous to conventional moments. It has the advantage that it exists whenever the mean of the distribution exists, even though some higher moments may not exist, and it is relatively robust to the effects of outliers (Hosking 1994). Hosking (1990) states that the L-moment estimators are reasonably efficient when compared to the maximum likelihood estimators for distributions such as the normal distribution, the Gumbel distribution, and the GEV distribution. Revista Colombiana de Estadística 40 (2017) 165–203
183
Estimation of Transmuted Rayleigh distribution
In this case, the L-moments estimators can be obtained by equating the first two sample L-moments with the corresponding population L-moments. The first two sample L-moments are n
l1 =
1X x(i) , n i=1
n
l2 =
X 2 (i − 1)x(i) − l1 , n(n − 1) i=1
and the first two population L-moments are λ1 = E(X1:1 ) = E(X) 1 1 − λ + λ2−1/2 = α−1/2 Γ 1 + 2 1 λ2 = [E(X2:2 ) − E(X2:1 )] , 2
(25)
(26)
where Z E(X2:2 ) = 4α 0 √ π =−
∞
h i 2 2 2 2 x2 e−αx 1 − λ + 2λe−αx 1 − e−αx 1 − λ + λe−αx dx √ √ 2 √ √ √ 3+3 2−4 3 λ +4 3−3 2+ 3 λ+3 2−4 √ , 12 α
and Z ∞ 2 2 2 E(X1:2 ) = 4α x2 e−αx 1 − λ + 2λe−αx 1 − λ + λe−αx dx 0 √ √ √ √ √ √ π λ 3+3 2−4 3 λ−6 2+4 3 +3 2 √ . = 12 α bLM E of the parameters α and λ can be The L-moments estimators α bLM E and λ obtained by numerically solving the following equations: bLM E = l1 , λ2 α bLM E = l2 . λ1 α bLM E , λ bLM E , λ (27)
3.6. Method of Maximum Product of Spacings Cheng & Amin (1979, 1983) introduced the maximum product of spacings (MPS) method as an alternative to MLE for the estimation of parameters of continuous univariate distributions. Ranneby (1984) independently developed the same method as an approximation for the Kullback-Leibler measure of information. Using the same notations in subsection 3.4, the uniform spacings of a random sample from the TR distribution are defined as: Di (α, λ) = F (xi:n | α, λ) − F (xi−1:n | α, λ) , where F (x0:n | α, λ) = 0 and F (xn+1:n | α, λ) = 1. Clearly
i = 1, 2, . . . , n, Pn+1 i=1
Di (α, λ) = 1.
Revista Colombiana de Estadística 40 (2017) 165–203
184
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
Following (Cheng & Amin 1983), the maximum product of spacings estimators bM P S , of the parameters α, and λ, are obtained by maximizing, with α bM P S and λ respect to α and λ, the geometric mean of the spacings: 1 "n+1 # n+1 Y , (28) G (α, λ) = Di (α, λ) i=1
or, equivalently, by maximizing the function H (α, λ) =
n+1 1 X log Di (α, λ). n + 1 i=1
(29)
bM P S of the parameters α and λ can be obtained The estimators α bM P S and λ by solving the nonlinear equations n+1 1 X 1 ∂H (α, λ) = [∆1 (xi:n | α, λ) − ∆1 (xi−1:n | α, λ)] = 0, (30) ∂α n + 1 i=1 Di (α, λ) n+1 ∂ 1 X 1 H (α, λ) = [∆2 (xi:n | α, λ) − ∆2 (xi−1:n | α, λ)] = 0, (31) ∂λ n + 1 i=1 Di (α, λ)
where
2
2
2
∆1 (xi:n | α, λ) = x2i:n e−αxi:n − λx2i:n e−αxi:n + 2λx2i:n e−αxi:n . and
2
2
∆2 (xi:n | α, λ) = e−αxi:n − e−2αxi:n .
(32) (33)
Cheng & Amin (1983) showed that maximizing H as a method of parameter estimation is as efficient as MLE estimation and the MPS estimators are consistent under more general conditions than the MLE estimators. 3.6.1. Method of Cramér-Von-Mises To motivate our choice of Cramér-von-Mises type minimum distance estimators, (Macdonald 1971) provided empirical evidence that the bias of the estimator is smaller than the other minimum distance estimators. Thus, The Cramér-vonbCM E of the parameters α and λ are obtained by Mises estimates that α bCM E and λ minimizing, with respect to α and λ, the function: 2 n X 2i − 1 1 C(α, λ) = + F (xi:n | α, λ) − . (34) 12n i=1 2n These estimators can also be obtained by solving the non-linear equations: n X 2i − 1 F (xi:n | α, λ) − ∆1 (xi:n | α, λ) = 0, 2n i=1 n X 2i − 1 F (xi:n | α, λ) − ∆2 (xi:n | α, λ) = 0, 2n i=1 where ∆1 (. | α, λ) and ∆2 (. | α, λ) are given by (32) and (33), respectively. Revista Colombiana de Estadística 40 (2017) 165–203
Estimation of Transmuted Rayleigh distribution
185
3.6.2. Methods of Anderson-Darling and Right-tail Anderson-Darling The Anderson-Darling test was developed in 1952 by Anderson & Darling (1952) as an alternative to other statistical tests for detecting sample distributions departure from normality. Specifically, the AD test converges very quickly towards the asymptote (Anderson & Darling 1954, Pettitt 1976, Stephens 2013). bADE of the parameters α and λ The Anderson-Darling estimators α bADE and λ are obtained by minimizing, with respect to α and λ, the function: n
A(α, λ) = −n −
1X (2i − 1) log F (xi:n | α, λ) + log F (xn+1−i:n | α, λ) . (35) n i=1
These estimators can also be obtained by solving the non-linear equations: " # n X ∆1 (xi:n | α, λ) ∆1 xn+1−i:n | α, λ (2i − 1) = 0, − F (xi:n | α, λ) F (xn+1−i:n | α, λ) i=1 " # n X ∆2 (xi:n | α, λ) ∆2 xn+1−i:n | α, λ (2i − 1) − = 0, F (xi:n | α, λ) F (xn+1−i:n | α, λ) i=1 where ∆1 (· | α, λ) and ∆2 (· | α, λ) are given by (32) and (33), respectively. bRT ADE of the paThe Right-tail Anderson-Darling estimates α bRT ADE and λ rameters α and λ are obtained by minimizing, with respect to α and λ, the function: R(α, λ) =
n n X n 1X F (xi:n | α, λ) − (2i − 1) log F (xn+1−i:n | α, λ) . (36) −2 2 n i=1 i=1
These estimators can also be obtained by solving the non-linear equations: n n X ∆1 xn+1−i:n | α, λ ∆1 (xi:n | α, λ) 1X −2 + (2i − 1) = 0, F (xi:n | α, λ) n i=1 F (xn+1−i:n | α, λ) i=1 n n X ∆2 xn+1−i:n | α, λ ∆2 (xi:n | α, λ) 1X + (2i − 1) −2 = 0, F (xi:n | λ, σ) n i=1 F (xn+1−i:n | α, λ) i=1 where ∆1 (· | α, λ) and ∆2 (· | α, λ) are given by (32) and (33), respectively.
4. Simulation We conduct Monte Carlo simulation studies to compare the performance of the estimators discussed in the previous sections. We evaluate the performance of the estimators based on bias, root mean squared error, the average absolute difference between the theoretical and empirical estimate of the distribution functions, and the maximum absolute difference between the theoretical and empirical distribution functions. Methods are compared for sample sizes n = 20, 50, 100, and Revista Colombiana de Estadística 40 (2017) 165–203
186
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
200. 10,000 independent samples of size n are first generated from the transmuted Rayleigh (TR) distribution with parameters α = 0.5, 1, 3, 5, 10 and λ = 0.7. We observed that 10,000 repetitions were sufficiently large provide have stable results. For all the methods considered in this study, we first estimated the parameters using the method of maximum likelihood. For all other methods, the maximum likelihood estimates were used as the initial values. Also, the same randomly generated samples were used to compare all the estimation methods. The results of the simulation studies are reported in Tables 1-5. The pseudo-random numbers were generated from the Transmuted Rayleigh distribution using the following function: "
1 x = − ln α
(λ − 1) +
p
(λ − 1)2 + 4λ(1 − u) 2λ
!# 21 ,
(37)
where u ∼ U N IF (0, 1) For each estimate we calculate the bias, root mean-squared error, and the average absolute difference between the theoretical and empirical estimate of the distribution functions, and the maximum absolute difference between the theoretical and empirical distribution functions. The statistics are obtained using the following formulae. R
Bias(ˆ α) =
1 X (ˆ αi − α), R i=1
v u R u1 X (ˆ αi − α)2 , RMSE(ˆ α) = t R i=1 R
R
ˆ = Bias(λ)
1 X ˆ (λi − λ) R i=1
v u R u X ˆ =t1 ˆ i − λ)2 RMSE(λ) (λ R i=1
(39)
n
1 X X ˆ Dabs (α ˆ) = F (xij |α, λ) − F (xij α ˆ , λ)| (nR) i=1 j=1 Dmax (ˆ α) =
(38)
R 1 X ˆ α, λ)| max F (xij α, λ) − F (xij |ˆ R i=1 j
(40)
(41)
Simulated bias, RMSE, and P Dabs , Dmax for the estimates are presented in Tables 1-5. The row indicating Ranks shows the partial sum of the ranks. A superscript indicates the rank of each of the estimators among all the estimators for that metric. For example, Table 1 shows the bias of MLE(ˆ α) as 0.29510 for n = 20. This indicates, that the bias of α ˆ , obtained using the method of maximum likelihood ranks was 10th among all other estimators. Table 6 shows the partial and overall rank of the estimators; it is used to find the overall performance of estimation techniques.
Revista Colombiana de Estadística 40 (2017) 165–203
Estimation of Transmuted Rayleigh distribution
187
The following observations can be drawn from the Tables 1-5. 1. All the estimators show the property of consistency, i.e. the RMSE decreases as the sample size increases. 2. The bias of α ˆ decreases with an increasing n for all the method of estimations. ˆ decreases with an increasing n for all the method of estimations. 3. The bias of λ 4. The bias of α ˆ generally increases with an increasing alpha for any given alpha and n and for all methods of estimation. 5. In terms of RMSE, all the methods of estimation produce smaller RMSE for ˆ α ˆ compared to that of λ. 6. Dabs is smaller than Dmax for all the estimation techniques. Again, these statistics get smaller with a a sample size increase. 7. In terms of performance of the methods of estimation, we found that maximum product spacing (MPS)estimators is the best method as it produces the least estimate biases with the least RMSE for most of the configurations considered in our studies. The next best method is the percentile estimators (PCE), followed by right tailed Anderson-Darling estimators. The weighted lest squared estimation (WLS) method ranked 4th while MLE ranked 5th. Method of Cramér-von-Mises ranked 10th among the ten method of estimation. 8. While MPS uniformly performed the best for all values of n and α, the PCE and WLS method performed the best for α ≤ 5. Their performance was degraded for α = 10. The overall positions of the estimators are presented in Table 6, from which we can confirm the superiority of MPS and PCE.
Revista Colombiana de Estadística 40 (2017) 165–203
200
100
50
n 20
0.2087 0.2938 −0.4939 0.6469 0.0337 0.0547 479 0.1727 0.2346 −0.4237 0.5387 0.0245 0.0375 376 0.1465 0.1914 −0.3677 0.4555 0.0164 0.0264 295
0.2088 0.2939 −0.4767 0.6268 0.0333 0.0523 386 0.1573 0.2294 −0.3745 0.5225 0.0233 0.0373 233 0.1193 0.1813 −0.2913 0.4293 0.0163 0.0253 183
Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ)
Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ)
Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ) Dabs Dmax P Ranks
Dabs Dmax P Ranks
Dabs Dmax P Ranks
Dabs Dmax P Ranks
0.1618 0.2007 −0.3988 0.4788 0.0177 0.0267 458
0.1818 0.2367 −0.4368 0.5398 0.0234 0.0374 398
0.2109 0.2877 −0.4868 0.6197 0.0335 0.0535 417
LME 0.2698 0.4018 −0.5778 0.7647 0.0525 0.0836 427
0.1889 0.2359 −0.4319 0.5299 0.0179 0.0289 549
0.1999 0.2609 −0.4479 0.5689 0.0259 0.0409 549
0.2066 0.2826 −0.4526 0.6005 0.0359 0.0559 417
LSE 0.2023 0.3053 −0.4294 0.6113 0.0537 0.0825 254
0.1486 0.1986 −0.3556 0.4616 0.0166 0.0266 366
0.1656 0.2305 −0.3876 0.5173 0.0247 0.0376 335
0.1823 0.2673 −0.4174 0.5693 0.0334 0.0524 213
WLS 0.2075 0.3364 −0.4435 0.6184 0.0513 0.0803 243
0.0762 0.1421 −0.2092 0.3411 0.0162 0.0251 91
0.0952 0.1741 −0.2652 0.4071 0.0232 0.0362 102
0.1202 0.2161 −0.3292 0.4791 0.0322 0.0502 102
PCE 0.1552 0.2912 −0.4163 0.5722 0.0514 0.0792 152
0.0691 0.1442 −0.1821 0.3442 0.0161 0.0252 91
0.0891 0.1762 −0.2381 0.4102 0.0231 0.0351 81
0.1151 0.2172 −0.3051 0.4802 0.0321 0.0501 81
MPS 0.1531 0.2901 −0.3992 0.5681 0.0512 0.0781 81
Table 1: Simulation results for α = 0.5 and λ = 0.7. MME 0.2839 0.41610 −0.62810 0.8498 0.0549 0.08810 5610
MLE 0.29510 0.4129 −0.6189 0.7576 0.0501 0.0814 396
Est. Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ)
0.20310 0.25010 −0.46210 0.55810 0.01810 0.02910 6010
0.22210 0.28410 −0.49210 0.61410 0.02510 0.04110 6010
0.24510 0.32210 −0.52510 0.67210 0.03510 0.05710 6010
CVM 0.2607 0.3636 −0.5337 0.7225 0.05410 0.0868 438
0.1497 0.2048 −0.3505 0.4727 0.0178 0.0278 437
0.1655 0.2378 −0.3733 0.5256 0.0248 0.0388 387
0.1844 0.2784 −0.3983 0.5934 0.0348 0.0548 314
AD 0.2054 0.3505 −0.3641 1.36610 0.0536 0.0847 335
0.1384 0.1925 −0.3334 0.4524 0.0165 0.0265 274
0.1584 0.2293 −0.3744 0.5214 0.0246 0.0377 284
0.1845 0.2795 −0.4235 0.6056 0.0336 0.0536 335
RAD 0.2326 0.3777 −0.5076 0.9879 0.0538 0.0879 459
188 Sanku Dey, Enayetur Raheem & Saikat Mukherjee
Revista Colombiana de Estadística 40 (2017) 165–203
200
100
50
n 20
0.4259 0.5959 −0.50110 0.65010 0.0338 0.0538 5410 0.3497 0.4737 −0.4267 0.5407 0.0237 0.0377 427 0.2956 0.3846 −0.3697 0.4576 0.0165 0.0265 356
0.42810 0.59610 −0.4909 0.6368 0.0324 0.0525 468 0.3275 0.4686 −0.3885 0.5316 0.0234 0.0375 315 0.2495 0.3695 −0.3035 0.4365 0.0163 0.0263 265
Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ)
Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ)
Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ) Dabs Dmax P Ranks
Dabs Dmax P Ranks
Dabs Dmax P Ranks
Dabs Dmax P Ranks
0.3258 0.4048 −0.4008 0.4808 0.0178 0.0268 488
0.3648 0.4758 −0.4389 0.5408 0.0236 0.0376 458
0.4218 0.5778 −0.4898 0.6217 0.0337 0.0527 457
LME 0.5478 0.8139 −0.5818 0.7677 0.0525 0.0838 459
0.3759 0.4699 −0.4309 0.5309 0.0189 0.0299 549
0.3789 0.4969 −0.4288 0.5589 0.0259 0.0399 539
0.3625 0.4965 −0.4075 0.5746 0.0349 0.0549 396
LSE 0.3605 0.5363 −0.3905 0.5913 0.0538 0.0825 294
0.2997 0.3987 −0.3576 0.4637 0.0177 0.0267 417
0.3326 0.4625 −0.3896 0.5185 0.0238 0.0378 386
0.3696 0.5406 −0.4226 0.5745 0.0336 0.0526 355
WLS 0.4256 0.6877 −0.4516 0.6244 0.0513 0.0803 294
0.1532 0.2842 −0.2093 0.3412 0.0162 0.0252 132
0.1933 0.3512 −0.2654 0.4093 0.0233 0.0352 172
0.2373 0.4312 −0.3284 0.4803 0.0323 0.0502 173
PCE 0.2784 0.5262 −0.3844 0.5512 0.0514 0.0782 182
0.1261 0.2751 −0.1661 0.3331 0.0161 0.0251 61
0.1581 0.3301 −0.2132 0.3911 0.0231 0.0351 71
0.1872 0.3791 −0.2573 0.4431 0.0321 0.0491 91
MPS 0.2212 0.4451 −0.3133 0.5071 0.0512 0.0771 101
Table 2: Simulation results for α = 1 and λ = 0.7. MME 0.5819 0.84210 −0.63610 0.8538 0.0539 0.08710 5610
MLE 0.59710 0.8128 −0.6279 0.7646 0.0501 0.0814 387
Est. Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ)
0.40510 0.50010 −0.45910 0.55910 0.01810 0.02910 6010
0.42310 0.54310 −0.47010 0.60310 0.02510 0.04110 6010
0.4167 0.5527 −0.4587 0.6379 0.03510 0.05610 509
CVM 0.4357 0.6026 −0.4607 0.6845 0.05410 0.0849 448
0.1613 0.3333 −0.1732 0.3643 0.0166 0.0266 233
0.1732 0.3853 −0.1731 0.4022 0.0235 0.0364 172
0.1741 0.4413 −0.1511 0.4502 0.0335 0.0514 162
AD 0.1671 0.5484 0.0421 2.38310 0.0526 0.0826 283
0.2144 0.3594 −0.2504 0.4064 0.0164 0.0264 244
0.2304 0.4204 −0.2583 0.4544 0.0232 0.0363 204
0.2403 0.4864 −0.2552 0.5164 0.0322 0.0513 194
RAD 0.2333 0.5975 −0.1202 1.6769 0.0527 0.0827 336
Estimation of Transmuted Rayleigh distribution
189
Revista Colombiana de Estadística 40 (2017) 165–203
200
100
50
n 20
1.26610 1.77010 −0.49310 0.64810 0.0337 0.0539 5610 1.0439 1.4099 −0.4219 0.5389 0.0236 0.0377 499 0.8849 1.1479 −0.3669 0.4569 0.0165 0.0264 458
0.9188 1.3918 −0.3588 0.5525 0.0333 0.0514 365 0.7967 1.2418 −0.3167 0.4865 0.0232 0.0363 325 0.6313 1.0234 −0.2555 0.4083 0.0162 0.0253 203
Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ)
Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ)
Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ) Dabs Dmax P Ranks
Dabs Dmax P Ranks
Dabs Dmax P Ranks
Dabs Dmax P Ranks
0.97410 1.21010 −0.40010 0.48010 0.0177 0.0267 5410
1.08410 1.41710 −0.43510 0.53910 0.0235 0.0375 5010
1.2459 1.7149 −0.4849 0.6209 0.0335 0.0525 469
LME 1.6119 2.4099 −0.5739 0.7657 0.0525 0.0837 469
0.6494 1.0093 −0.2503 0.4295 0.0179 0.0279 335
0.7193 1.1033 −0.2753 0.4694 0.0249 0.0389 314
0.7874 1.2053 −0.2984 0.5084 0.0349 0.0536 305
LSE 0.8734 1.4173 −0.3166 0.5434 0.0537 0.0805 305
0.6806 1.0245 −0.2726 0.4144 0.0164 0.0265 304
0.7294 1.1464 −0.2885 0.4563 0.0234 0.0364 243
0.7643 1.2604 −0.2973 0.4913 0.0334 0.0513 203
WLS 0.8173 1.5847 −0.2914 0.5103 0.0512 0.0783 223
0.3722 0.7592 −0.1752 0.3192 0.0163 0.0252 132
0.3952 0.8272 −0.1972 0.3602 0.0233 0.0352 132
0.4162 0.8992 −0.2212 0.4072 0.0321 0.0492 122
PCE 0.4632 1.0551 −0.2572 0.4642 0.0513 0.0772 122
0.2731 0.7091 −0.1251 0.3021 0.0161 0.0251 61
0.3271 0.8061 −0.1561 0.3471 0.0231 0.0351 61
0.3681 0.8931 −0.1851 0.3911 0.0321 0.0481 61
MPS 0.4491 1.0632 −0.2361 0.4531 0.0501 0.0761 71
Table 3: Simulation results for α = 3 and λ = 0.7. MME 1.74210 2.52210 −0.63010 0.8528 0.05310 0.08710 5810
MLE 0.9547 1.4484 −0.3568 0.6105 0.0514 0.0794 325
Est. Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ)
0.6695 1.0546 −0.2534 0.4487 0.01710 0.02810 427
0.7605 1.1766 −0.2854 0.5008 0.02410 0.03910 437
0.8536 1.3006 −0.3155 0.5566 0.03410 0.05410 437
CVM 0.9748 1.5145 −0.3427 0.6166 0.0539 0.0826 417
0.7858 1.1008 −0.3108 0.4528 0.0178 0.0278 489
0.8298 1.2097 −0.3228 0.5007 0.0248 0.0388 468
0.8767 1.3457 −0.3347 0.5668 0.0338 0.0538 458
AD 0.9336 1.6188 −0.2925 1.4789 0.0526 0.0858 428
0.7386 1.0597 −0.2997 0.4356 0.0166 0.0266 396
0.7706 1.1655 −0.3116 0.4876 0.0237 0.0376 366
0.7875 1.2825 −0.3196 0.5587 0.0336 0.0537 365
RAD 0.8174 1.5576 −0.2753 1.53510 0.0538 0.0869 406
190 Sanku Dey, Enayetur Raheem & Saikat Mukherjee
Revista Colombiana de Estadística 40 (2017) 165–203
MME 2.92910 4.23110 −0.63410 0.8528 0.05410 0.08810 5810 2.14210 2.96810 −0.49610 0.64810 0.0337 0.0549 5610 1.7529 2.3499 −0.4239 0.53810 0.0246 0.0377 509 1.4779 1.9107 −0.3669 0.4556 0.0164 0.0264 396
MLE 1.5696 2.3754 −0.3597 0.6175 0.0523 0.0804 294 1.4816 2.2236 −0.3526 0.5545 0.0334 0.0514 314 1.2884 1.9824 −0.3114 0.4824 0.0233 0.0363 224 1.0793 1.7173 −0.2643 0.4133 0.0163 0.0253 183
Est. Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ) Dabs Dmax P Ranks Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ) Dabs Dmax P Ranks Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ) Dabs Dmax P Ranks Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ) Dabs Dmax P Ranks
n 20
50
100
200
1.61210 2.00410 −0.39810 0.4789 0.0177 0.0267 5310
1.79910 2.35210 −0.43410 0.5389 0.0235 0.0375 498
2.0779 2.8529 −0.4839 0.6189 0.0335 0.0535 467
LME 2.6499 3.9609 −0.5709 0.7627 0.0525 0.0837 469
1.3256 1.8626 −0.3085 0.4637 0.0179 0.0289 427
1.3775 1.9835 −0.3185 0.4955 0.0249 0.0399 386
1.4735 2.1464 −0.3354 0.5304 0.0349 0.0537 336
LSE 1.5465 2.4385 −0.3396 0.5584 0.0537 0.0815 325
1.1864 1.7364 −0.2864 0.4224 0.0165 0.0265 264
1.2603 1.9293 −0.3013 0.4643 0.0234 0.0374 203
1.3203 2.1243 −0.3083 0.4973 0.0333 0.0513 183
WLS 1.2993 2.3493 −0.2913 0.5093 0.0512 0.0783 173
0.6452 1.2892 −0.1822 0.3242 0.0162 0.0252 122
0.7112 1.4182 −0.2112 0.3692 0.0232 0.0362 122
0.7752 1.5682 −0.2402 0.4192 0.0332 0.0502 122
PCE 0.8222 1.8101 −0.2742 0.4752 0.0524 0.0782 132
0.5251 1.2501 −0.1431 0.3161 0.0161 0.0251 61
0.6011 1.3871 −0.1711 0.3581 0.0231 0.0351 61
0.6961 1.5621 −0.2051 0.4071 0.0321 0.0491 61
MPS 0.8011 1.8152 −0.2541 0.4661 0.0511 0.0771 71
Table 4: Simulation results for α = 5 and λ = 0.7.
1.3827 1.9539 −0.3176 0.48310 0.01710 0.02810 529
1.4877 2.1308 −0.3377 0.5308 0.02510 0.03910 509
1.6208 2.3317 −0.3597 0.5818 0.03510 0.05510 509
CVM 1.7678 2.6356 −0.3788 0.6376 0.0539 0.0836 437
1.4488 1.9428 −0.3438 0.4698 0.0178 0.0278 488
1.4958 2.1177 −0.3498 0.5147 0.0248 0.0388 467
1.5877 2.3628 −0.3618 0.5797 0.0348 0.0548 467
AD 1.6757 2.8258 −0.3295 1.39410 0.0536 0.0858 448
1.3095 1.8295 −0.3187 0.4445 0.0166 0.0266 345
1.3816 2.0286 −0.3356 0.4996 0.0247 0.0376 375
1.4154 2.2185 −0.3415 0.5676 0.0336 0.0536 325
RAD 1.4424 2.6747 −0.3094 1.2639 0.0538 0.0879 416
Estimation of Transmuted Rayleigh distribution
191
Revista Colombiana de Estadística 40 (2017) 165–203
200
100
50
−0.7797 2.5032 0.1407 0.2804 0.0324 0.0504 285 −0.6245 2.0202 0.1025 0.2352 0.0232 0.0363 192 −0.4575 1.6492 0.0725 0.1982 0.0164 0.0264 222
4.06010 5.46410 −0.47610 0.61510 0.0323 0.0517 509 3.2099 4.5499 −0.3859 0.52410 0.0233 0.0374 448 2.4497 3.6587 −0.2997 0.4337 0.0163 0.0253 347
Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ)
Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ)
Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ)
Dabs Dmax P Ranks
Dabs Dmax P Ranks
Dabs Dmax P Ranks
Dabs Dmax P Ranks
−0.9957 3.4732 0.2095 0.3524 0.0513 0.0785 264
5.04410 6.45010 −0.57910 0.7118 0.0492 0.0783 438
Bias(α) ˆ RMSE(α) ˆ ˆ Bias(λ) ˆ RMSE(λ)
20
MME
MLE
Est.
n
−0.2713 1.8133 0.0484 0.2213 0.0176 0.0265 243
−0.3794 2.1993 0.0704 0.2643 0.0235 0.0375 243
−0.4575 2.7153 0.0976 0.3155 0.0339 0.0518 366
−0.5414 3.7215 0.1464 0.3945 0.05210 0.0808 365
LME
3.9629 4.7879 −0.4559 0.5389 0.0177 0.0287 509
1.7726 3.1576 −0.2016 0.4187 0.0247 0.0387 397
0.4576 2.8254 −0.0101 0.2542 0.0337 0.0503 232
0.6625 3.7144 −0.0081 0.2522 0.0529 0.0772 233
LSE
3.0638 4.0088 −0.3678 0.4668 0.0165 0.0266 438
3.40010 4.55910 −0.40210 0.5189 0.0246 0.0376 5110
3.6929 5.0489 −0.4289 0.5589 0.0336 0.0529 5110
3.8839 5.5719 −0.4389 0.5877 0.0515 0.0797 4610
WLS
1.5336 2.8516 −0.2116 0.3416 0.0162 0.0252 284
1.9398 3.4568 −0.2698 0.4076 0.0234 0.0362 366
2.3778 4.1178 −0.3318 0.4728 0.0325 0.0502 397
2.8138 4.8908 −0.4028 0.5426 0.0526 0.0786 427
PCE
−0.2964 1.5171 0.0333 0.1761 0.0161 0.0241 111
−0.3403 1.7711 0.0352 0.1861 0.0221 0.0341 91
−0.3344 2.1131 0.0312 0.1981 0.0311 0.0471 101
−0.2811 2.9021 0.0182 0.2191 0.0491 0.0731 71
MPS
Table 5: Simulation results for α = 10 and λ = 0.7.
4.25610 5.06810 −0.48410 0.56510 0.0188 0.0298 5610
1.8757 3.3777 −0.2077 0.4508 0.0248 0.0388 459
0.1972 2.8415 0.0343 0.2653 0.0338 0.0516 273
0.3052 3.6763 0.0573 0.2673 0.0527 0.0784 222
CVM
−0.0801 2.8255 0.0111 0.2885 0.02710 0.04110 326
0.1452 3.0545 0.0091 0.3105 0.02810 0.04410 335
0.1191 3.3187 0.0504 0.3507 0.03410 0.05310 397
−0.3323 4.2887 0.3406 2.51310 0.0528 0.08310 449
AD
−0.2452 2.5784 0.0282 0.2654 0.0259 0.0389 305
−0.1251 2.5834 0.0373 0.2764 0.0269 0.0409 304
−0.2583 2.8616 0.0815 0.3246 0.0322 0.0505 273
−0.8916 3.9166 0.3517 1.7439 0.0514 0.0829 416
RAD
192 Sanku Dey, Enayetur Raheem & Saikat Mukherjee
Revista Colombiana de Estadística 40 (2017) 165–203
193
Estimation of Transmuted Rayleigh distribution
Table 6: Partial and overall ranks of all the methods of estimation for λ = 0.7 and various α. α 0.5
n 20 50 100 200
MLE 6 6 3 3
MME 10 9 6 5
LME 7 7 8 8
LSE 4 7 9 9
WLS 3 3 5 6
PCE 2 2 2 1
MPS 1 1 1 1
CVM 8 10 10 10
AD 5 4 7 7
RAD 9 5 4 4
1
20 50 100 200
7 8 5 5
10 10 7 6
9 7 8 8
4 6 9 9
4 5 6 7
2 3 2 2
1 1 1 1
8 9 10 10
3 2 2 3
6 4 4 4
3
20 50 100 200
5 5 5 3
10 10 9 8
9 9 10 10
4 4 4 5
3 3 3 4
2 2 2 2
1 1 1 1
7 7 7 7
8 8 8 9
6 5 6 6
5
20 50 100 200
4 4 4 3
10 10 9 6
9 7 8 10
5 6 6 7
3 3 3 4
2 2 2 2
1 1 1 1
7 9 9 9
8 7 7 8
6 5 5 5
10
20 50 100 200
8 9 8 7
4 5 2 2
5 6 3 3
3 2 7 9
10 10 10 8
7 7 6 4
1 1 1 1
2 3 9 10
9 7 5 6
6 3 4 5
P Ranks Overall Rank
108 5
148 8
151 9
119 6
103 4
56 2
20 1
161 10
123 7
102 3
5. Real Data Analysis In the following section, for illustrative purposes we present two applications for the proposed TR distribution to real data for illustrative purposes. These applications will show the flexibility of the TR distribution in modeling positive data. For the purpose of comparison, we also fit one Rayleigh parameter, Weibull, and Gamma distributions for the same data. We use the fitdistrplus R package to fit the distributions. The fits of these four distributions are presented in the subsequent sections. We present a comparative density plot, plots of the distribution functions, and Q-Q and P-P plots for all four distributions. For example, Figure 7 shows the comparative density, distribution, q-q, and p-p plots for the guinea pig data which we discussed in Section 5.1. Classical goodness of fit of a given distribution can be assessed via the density plot and the CDF plot. The Q-Q an P-P plots may provide additional information in some cases. In particular, the Q-Q plot may provide information about the lack-of-fit at the tails of the distribution, whereas the P-P plot emphasizes the Revista Colombiana de Estadística 40 (2017) 165–203
194
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
lack-of-fit at the center of the Muller and Dutang distribution (Delignette-Muller & Dutang 2014). To evaluate the performance of the candidate distributions that best fit the data, we calculate the log-likelihood of the fitted models based on numerically obtained maximum likelihood estimations via the fitdist() function, available in the fitdistrplus R package. We then obtain the values of AIC and BIC. Additionally, we test for goodness of fit for the candidate models using the KolmogorovSmirnov (K-S) test, Anderson-Darling’s (AD) test, and the Cramér-von-Mises (CVM) test. Note that, the K-S test requires unique items for us to obtain exact p-values. Therefore, we made sure to break the ties by adding a tiny random noise, which is uniform between 0.001 and 0.01, to each duplicate observation. Adding this small noise did not affect the statistical properties of the data. The two data sets considered in this study include tied observations. As such, we removed ties for both data sets using the process outlined above. We present a descriptive summary of original data as well as for the data after ties were broken. This shows a minor difference between the original and modified data sets. It should also be mentioned that removal of duplicate observations is necessary for the MPS method of estimation.
5.1. Example 1: Guinea Pig Data Our first data set contains survival times (in days) of guinea pigs injected with different doses of tubercle bacilli and is given in Table 7. The data has been analyzed by Kundu & Howlader (2010) and by Singh, Singh & Sharma (2013). We present the data after breaking the ties. Table 7: Survival times (in days) of guinea pigs injected with different doses of tubercle bacilli. 0.012
0.015
0.022
0.024
0.032
0.033
0.034
0.038
0.043
0.044
0.048
0.052
0.053
0.054
0.055
0.056
0.057
0.058
0.059
0.060
0.061
0.062
0.063
0.065
0.067
0.068
0.070
0.072
0.073
0.075
0.076
0.081
0.083
0.084
0.085
0.087
0.091
0.095
0.096
0.098
0.099
0.109
0.110
0.121
0.127
0.129
0.131
0.143
0.146
0.175
0.211
0.233
0.258
0.263
0.297
0.341
0.376
0.030
0.036
0.043
0.061
0.060
0.063
0.063
0.063
0.072
0.081
0.153
0.181
0.260
0.347
0.074
5.1.1. Finding a suitable distribution for guinea pig data We compare the fit of Rayleigh, Weibull, Gamma, and Transmuted Rayleigh distribution for the guinea pig data set. The data contains tied observations, and, therefore, we modified the data as discussed above to remove the ties. The summary of original data set, as well as the modified one, is presented in Table 8. We do not observe any noticeable difference in the calculated statistics for the modified data set compared to the original one. Revista Colombiana de Estadística 40 (2017) 165–203
195
Estimation of Transmuted Rayleigh distribution Table 8: Summary of Guinea Pig data: Original vs Modified. Data
Min
Q1
Median
Mean
Q3
Max
Original
0.01200
0.05475
0.07000
0.09982
0.11280
0.37600
Ties broken
0.01200
0.05575
0.07177
0.10080
0.11280
0.37600
Next, we present the fit statistics for four distributions for the guinea pig data. The log-likelihood statistic, the AIC, the BIC, and the tests statistics along with the p-values for K-S, AD, and CVM goodness of fit tests are presented in Table 9. Results show that the transmuted Rayleigh distribution best fits the guinea pig data having the largest log-likelihood, smallest AIC, and smallest BIC among all four candidate distributions. Graphical check for the model fitting is shown in Figure 7. From the histogram and theoretical density plots, transmuted Rayleigh distribution shows reasonably good fit especially in the right tail area. Fitting right tail area is particularly important for many life testing reliability problems. From the Q-Q plot, we find that the TR model is reasonably a good fit at the lower tail of the distribution as well. However, Weibull and gamma seems to be fitting the data in the middle of the distribution. We made our decision of finding the best fitting model based on the largest log-likelihood value and smallest AIC and BIC values. Based on these criteria, TR fits the guinea pig data best. Table 9: (Guinea pig data) Fitting distributions to find the best fit model. Distribution
LogLik
AIC
BIC
K-S(p)
AD(p)
CVM(p)
Rayleigh
91.48
−178.96
−174.41
0.25 (.000)
6.15 (.000)
1.28 (.001)
Weibull
99.83
−195.66
−191.10
0.15 (.088)
2.36 (.059)
0.41 (.067)
Gamma
102.83
−201.66
−197.11
0.99 (.000)
781.60 (.000)
23.99 (.000)
TRayleigh
111.37
−218.73
−214.18
0.48 (.000)
46.29 (.000)
6.52 (.000)
5.1.2. Bootstrap Estimates (Guinea Pig Data): Bias and Standard Errors Since the transmuted Rayleigh distribution best fits the data, we carry out ordinary nonparametric bootstrap resampling to obtain bias, standard error, and 95% bootstrap percentile confidence intervals for the parameters α and λ of the transmuted Rayleigh distribution. In ordinary nonparametric resampling, repeated samples are drawn from the original data with replacements. In addition, to the original data set, we draw 999 bootstrap samples to obtain the estimators bias and standard error of the estimators. Results are shown in Table 10. We note that moments based methods have estimates for α that are around 53 while the estimates for λ are around 0.58. For α ˆ , WLS produces the least biased estimate ˆ all the methods produce a small bias and while MPS a has higher bias. For λ, standard error.
Revista Colombiana de Estadística 40 (2017) 165–203
196
Sanku Dey, Enayetur Raheem & Saikat Mukherjee Histogram and theoretical densities 1.0 0.8 CDF
0.4
0.6
10 8 6
TRayleigh Rayleigh Weibull Gamma
0
0.0
2
0.2
4
Density
Empirical and theoretical CDFs
TRayleigh Rayleigh Weibull Gamma
0.0
0.1
0.2
0.3
0.0
0.1
Q−Q plot
0.2
0.3
P−P plot
data
0.0
0.1
0.2
0.8 0.6 0.4
Empirical probabilities
0.2
TRayleigh Rayleigh Weibull Gamma
0.0
0.2 0.1
TRayleigh Rayleigh Weibull Gamma
0.0
Empirical quantiles
0.3
1.0
data
0.0
0.3
0.2
0.4
0.6
0.8
1.0
Theoretical probabilities
Theoretical quantiles
Figure 7: (Guinea pig data) Comparison of fit for the four distributions of the guinea pig survival time data.
Table 10: (Guinea pig data) Estimated bias and standard error of the estimates α ˆ and ˆ and their 95% percentiles of the lower and upper confidence limits based λ on 999 bootstrap samples. ˆ λ
α ˆ
Method Est.
Bias
SE
LCL
UCL
Est.
Bias
SE
LCL
UCL
MLE
46.642
1.703
8.889
33.72
69.84
0.647
−0.002
0.053
0.544
0.751
LSE
77.284
−2.983
10.602
55.22
96.51
0.587
0.068
0.144
0.508
1.000
WLS
75.952
−0.030
12.727
52.65
101.29
0.570
0.020
0.059
0.532
0.757
PCE
43.992
1.398
9.374
30.62
66.27
0.582
0.000
0.020
0.544
0.622
MPS
44.364
−37.897
6.020
3.76
25.76
0.654
0.285
0.151
0.387
1.000
CVM
76.546
−3.044
10.610
54.99
96.46
0.600
0.074
0.153
0.509
1.000
RAD
55.178
0.867
12.452
33.80
81.31
0.657
−0.001
0.047
0.565
0.751
LME
51.856
1.964
10.834
36.23
76.86
0.552
−0.006
0.034
0.535
0.552
MME
52.467
1.185
10.417
35.60
77.82
0.578
0.008
0.020
0.578
0.595
MMM
52.566
0.959
10.429
36.74
77.25
0.593
−0.006
0.006
0.577
0.594
Revista Colombiana de Estadística 40 (2017) 165–203
197
Estimation of Transmuted Rayleigh distribution
5.2. Example 2: Fibre Strength Data The second data set consists of 100 measurements on breaking stress of carbon fibres (in Gba). The data in Table 11 is taken from Cordeiro, Ortega & Popović (2014). The data was also used in Nadarajah, Cordeiro & Ortega (2013). We present the data after the ties broker. Table 11: Breaking strengths of carbon fibres (in Gba). 0.39
0.81
0.85
0.98
1.08
1.12
1.17
1.18
1.22
1.25
1.36
1.41
1.47
1.57
1.59
1.61
1.69
1.71
1.73
1.80
1.84
1.87
1.89
1.92
2.00
2.03
2.05
2.12
2.17
2.35
2.38
2.41
2.43
2.48
2.50
2.53
2.55
2.56
2.59
2.67
2.73
2.74
2.76
2.77
2.79
2.81
2.82
2.83
2.85
2.87
2.88
2.93
2.95
2.96
2.97
3.09
3.11
3.15
3.19
3.22
3.27
3.28
3.31
3.33
3.39
3.51
3.56
3.60
3.65
3.68
3.70
3.75
4.20
4.38
4.42
4.70
4.90
4.91
5.08
5.56
1.58
1.60
1.61
1.70
1.85
2.04
2.17
2.17
2.48
2.55
2.82
2.98
3.11
3.16
3.19
3.23
3.31
3.39
3.68
3.69
5.2.1. Finding a suitable distribution for fibre data Similar to the guinea pig data, we compare the fits of Rayleigh, Weibull, Gamma, and transmuted Rayleigh distribution for the fibres data set. There were 20 tied observations in the data set. We prepared the data by breaking the ties following the procedure discussed earlier. The summary of the original and the modified data (after breaking ties) is shown in Table 12. We notice that the modified data has nearly the same descriptive characteristics as those in the original data set. Table 12: Summary of Fibres data: Original vs Modified. Data
Min
Q1
Median
Mean
Original
0.390
Ties broken
0.390
Q3
Max
1.840
2.700
2.621
3.220
5.560
1.847
2.700
2.622
3.221
5.560
For the fibres data, we present fit statistics of Rayleigh, Weibull, Gamma, and transmuted Rayleigh distribution in Table 13. these show that the transmuted Rayleigh distribution best fits the data ans has the largest log−likelihood (−105.39), the smallest AIC (214.77), and the smallest BIC (219.99) among all four candidate distributions. A histogram with theoretical density curves superimposed over the data is shown in panel (a) of Figure 8. The cdf, Q-Q, and P-P plots also show evidence that the TR is a good candidate model. Revista Colombiana de Estadística 40 (2017) 165–203
198
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
Table 13: (Fibres strength data) Fitting distributions to find the best fit model. Distribution
LogLik
AIC
BIC
K-S(p)
AD(p)
CVM(p)
Rayleigh
−144.50
293.00
298.21
0.11 (.157)
1.66 (.143)
0.314 (.123)
Weibull
−141.53
287.05
292.26
0.06 (.868)
.0.41 (.836)
0.06 (.797)
Gamma
−143.24
290.49
295.70
0.96 (.000)
490.87 (.000)
32.74 (.000)
TRayleigh
−105.39
214.77
219.99
0.17 (.014)
4.82 (.003)
0.854 (.005)
0.8 0.4
0.2
CDF
0.6
0.3
0.4
TRayleigh Rayleigh Weibull Gamma
TRayleigh Rayleigh Weibull Gamma
0.0
0.0
0.2
0.1
Density
Empirical and theoretical CDFs 1.0
0.5
Histogram and theoretical densities
1
2
3
4
5
1
2
Q−Q plot data
3
4
5
1
2
3
4
Theoretical quantiles
5
6
0.8 0.6 0.4
Empirical probabilities
0.2
TRayleigh Rayleigh Weibull Gamma
0.0
4 3 2
TRayleigh Rayleigh Weibull Gamma
1
Empirical quantiles
5
1.0
P−Pdata plot
0.0
0.2
0.4
0.6
0.8
1.0
Theoretical probabilities
Figure 8: Comparison of fit of several distributions for the fibres strength data.
5.2.2. Bootstrap Estimates (fibres data): Bias and Standard Errors Since transmuted Rayleigh distribution best fits the fibres data, were obtain bootstrap biases and standard errors of the estimates of the parameters of the TR distribution using different methods of estimation. Results are shown in Table 14. We find that the MLE and all the moment-based methods are nearly unbiased for the parameter α. The worst performing method for this data set is the MPS, which produces a relatively large bias (0.325) for the estimate. The bootstrap percentile confidence intervals for λ shows that the lower limit is actually at the lower bound for the parameter. Also, note that the moment-based methods produce confidence intervals for λ, the lower limit of which falls outside the bound of the parameter, which is −1. Revista Colombiana de Estadística 40 (2017) 165–203
199
Estimation of Transmuted Rayleigh distribution
ˆ Table 14: (Fibres Data) Estimated bias and standard error of the estimates α ˆ and λ and their 95% percentiles of the lower and upper confidence limits based on 999 bootstrap samples. ˆ λ
α ˆ
Method Est.
Bias
SE
LCL
UCL
Est.
Bias
LCL
UCL
MLE
0.184
−0.000
0.015
0.153
0.216
−0.919
0.009
0.110 −1.000
−0.626
LSE
0.178 −0.001
0.014
0.151
0.205
−0.895
0.032
0.153 −1.000
−0.476
WLS
0.189 −0.002
0.015
0.154
0.216
−1.000
0.075
0.125 −1.000
−0.588
PCE
0.180
0.156
0.151
0.213
−0.884
0.004
0.130 −1.000
−0.546
0.001
SE
MPS
0.178
0.325
3.535
0.130
0.238
−0.851
0.101
0.204 −0.852
−0.246
CVM
0.181
−0.002
0.014
0.152
0.208
−0.923
0.035
0.129
−1.000
−0.572
RAD
0.188 −0.005
0.017
0.151
0.221
−1.000
0.089
0.135
−1.000
−0.525
LME
0.186
0.001
0.017
0.157
0.221
−0.945
−0.011
0.159
−1.261
−0.641
MME
0.188
0.001
0.016
0.157
0.222
−0.961
−0.009
0.161
−1.263
−0.653
MMM
0.187
0.001
0.018
0.156
0.227
−0.950
−0.002
0.161
−1.234
−0.623
6. Conclusion In this article, we have provide explicit expressions for the quantiles, moments, moment generating function, conditional moments, hazard rates, mean residual lifetime, mean past lifetime, mean deviation about mean and median, the stochastic ordering, various entropies, stress-strength parameter, and order statistics. The model parameters are estimated by ten methods of estimation, namely, maximum likelihood, moments, L-moments, percentile, least squares, weighted least squares, maximum product of spacing, Cramer-von Mises, Anderson-Darling, and right tailed Anderson-Darling. We have performed an extensive simulation study to compare these methods. We have compared estimators with respect to bias, root mean-squared error, the average absolute difference between the theoretical and empirical estimate of the distribution functions, and the maximum absolute difference between the theoretical and empirical distribution functions. We have also compared estimators wing two real data applications. The simulation results show that maximum product spacing (MPS) estimators is the best performing estimator in terms of biases and RMSE. The next best performing estimator is the percentile estimator (PCE), followed by the right tailed Anderson-Darling estimators. The real data applications show that the maximum product of spacing estimator gives the shortest confidence intervals for the alpha for Guinea Pig data set and the LSE for Fibre strength data set. Hence, we can argue that the percentile estimators, least squares estimators, right tailed Anderson-Darling estimators and the maximum product spacing estimators are among the best performing estimators for TR distribution. Revista Colombiana de Estadística 40 (2017) 165–203
200
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
Acknowledgments The authors would like to thank the editor and two anonymous referees for their suggestions which have greatly improved the presentation of the paper. The second author also thanks Hana Kamran for translating the abstract in Spanish.
Received: March 2016 — Accepted: November 2016
References Akram, M. & Hayat, A. (2014), ‘Comparison of estimators of the Weibull Distribution’, Journal of Statistical Theory and Practice 8(2), 238–259. Alkasasbeh, M. R. & Raqab, M. Z. (2009a), ‘Estimation of the generalized logistic distribution parameters: Comparative study’, Statistical Methodology 6(3), 262–279. Alkasasbeh, M. R. & Raqab, M. Z. (2009b), ‘Estimation of the generalized logistic distribution parameters: Comparative study’, Statistical Methodology 6(3), 262–279. Anderson, T. W. & Darling, D. A. (1952), ‘Asymptotic theory of certain “goodnessof-fit” criteria based on stochastic processes’, Annals of Mathematical Statistics 23, 193–212. Anderson, T. W. & Darling, D. A. (1954), ‘A test of goodness-of-fit’, Journal of the American Statistical Association 49, 765–769. Casella, G. & Berger, R. L. (1990), Statistical Inference, Brooks/Cole Publishing Company, California. Cheng, R. C. H. & Amin, N. A. K. (1979), Maximum product of spacings estimation with applications to the lognormal distribution, Technical report, Department of Mathematics, University of Wales. Cheng, R. C. H. & Amin, N. A. K. (1983), ‘Estimating Parameters in Continuous Univariate Distributions with a Shifted Origin’, Journal of the Royal Statistical Society: Series B 45(3), 394–403. Cordeiro, G. M., Ortega, E. M. M. & Popović, B. V. (2014), ‘The gammalinear failure rate distribution: theory and applications’, Journal of Statistical Computation and Simulation 84(11), 2408–2426. Delignette-Muller, M. & Dutang, C. (2014), ‘fitdistrplus: An R Package for Fitting Distributions’, Journal of Statistical Software 64(4), 1–34. Dey, S. (2009), ‘Comparison of Bayes estimators of the parameter and reliability function for Rayleigh distribution under differentloss functions’, Malaysian Journal of Mathematical Sciences 3(247-264). Revista Colombiana de Estadística 40 (2017) 165–203
Estimation of Transmuted Rayleigh distribution
201
Dey, S. & Das, M. K. (2007), ‘A note on prediction interval for a Rayleigh distribution: Bayesian approach’, American Journal of Mathematical and Management Science 12, 43–48. Dey, S., Dey, T. & Kundu, D. (2014), ‘Two-parameter Rayleigh distribution: Different methods of estimation’, American Journal of Math- ematical and Management Sciences 33, 55–74. Gupta, R. D. & Kundu, D. (2001), ‘Generalized exponential distribution: different method of estimations’, Journal of Statistical Computation and Simulation 69(4), 315–337. Gupta, R. D. & Kundu, D. (2007), ‘Generalized exponential distribution: Existing results and some recent developments’, Journal of Statistical Planning and Inference 137(11), 3537–3547. Hosking, J. R. M. (1990), ‘L-Moments: Analysis and Estimation of Distributions Using Linear Combinations of Order Statistics’, Journal of the Royal Statistical Society. Series B (Methodological) 52(1), 105–124. Hosking, J. R. M. (1994), ‘The four-parameter kappa distribution’, Journal of Research and Development - IBM 38(3), 251–258. Iriarte, Y. A., Gómez, H. W., Varela, H. & Bolfarine, H. (2015), ‘Slashed rayleigh distribution’, Revista Colombiana de Estadística 38(1), 31–44. Johnson, N. L., Kotz, S. & Balakrishnan, N. (1995), Continuous univariate distributions, Wiley, New York. Kao, J. (1958), ‘Computer methods for estimating Weibull parameters in reliability studies’, Trans. IRE Reliability Quality Control 13, 15–22. Kao, J. (1959), ‘A graphical estimation of mixed Weibull parameters in life testing electron tube’, Technometrics 1, 389–407. Koutras, V. (2011), Two-level software rejuvenation model with in- creasing failure rate degradation, in ‘Dependable Computer Systems’, Springer-Verlag, New York, pp. 101–115. Kundu, D. & Howlader, H. (2010), ‘Bayesian inference and prediction of the inverse Weibull distribution for Type-II censored data’, Computational Statistics and Data Analysis 54(6), 1547–1558. Kundu, D. & Raqab, M. Z. (2005), ‘Generalized Rayleigh distribution: Different methods of estimations’, Computational Statistics and Data Analysis 49(1), 187–200. Lai, M.-T. (2013), ‘Optimum number of minimal repairs for a system under increasing failure rate shock model with cumulative repair-cost limit’, International Journal of Reliability and Safety 7, 95–107. Revista Colombiana de Estadística 40 (2017) 165–203
202
Sanku Dey, Enayetur Raheem & Saikat Mukherjee
Macdonald, P. D. M. (1971), ‘Comment on “An estimation procedure for mixtures of distributions” by Choi and Bulgren’, Journal of the Royal Statistical Society B 33, 326–329. Maeda, K. & Nishikawa, M. (2006), ‘Duration of party control in parliamentary and presidential governments: A study of 65 democracies, 1950 to 1998’, Comparative Political Studies 39, 352–374. Mahmoud, M. & Mandouh, R. (2013), ‘The Transmuted Marshall-Olkin FrÃľchet Distribution: Properties and Applications’, Journal of Applied Sciences Research 9(10), 5553–5561. Mann, N. R., Schafer, R. E. & Singpurwalla, N. D. (1974), Methods for Statistical Analysis of Reliability and Life Data, Wiley, New York. Mazucheli, J., Louzada, F. & Ghitany, M. (2013), ‘Comparison of estimation methods for the parameters of the weighted lindley distribution’, Applied Mathematics and Computation 220, 463–471. Merovci, F. (2013a), ‘Transmuted generalized rayleigh distribution’, Journal of Statistics Applications and Probability 2(3), 1–12. Merovci, F. (2013b), ‘Transmuted lindley distribution’, International Journal of Open Problems in Computer Science & Mathematics 6(2), 63–72. Merovci, F. (2013c), ‘Transmuted Rayleigh distribution’, Austrian Journal of Statistics 41(1), 21–31. Merovci, F., Elbatal, I. & Ahmed, A. (2014), ‘The Transmuted Generalized Inverse Weibull Distribution’, Austrian Journal of Statistics 43(2), 119–131. Merovci, F. & Puka, L. (2014), ‘Transmuted Pareto distribution’, ProbStat Forum 7, 1–11. Nadarajah, S., Cordeiro, G. M. & Ortega, E. M. (2013), ‘The exponentiated weibull distribution: a survey’, Statistical Papers 54(3), 839–877. Pettitt, A. (1976), ‘A two-sample Anderson-Darling rank statistic’, Biometrika 63(1), 161. Ranneby, B. (1984), ‘The Maximum Spacing Method. An Estimation Method Related to the Maximum Likelihood Method’, Scandinavian Journal of Statistics 11(2), pp. 93–112. Rayleigh, L. (1880), ‘On the resultant of a large number of vibrations of the same pitch and of arbitrary phase’, The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 10(60), 73–78. Renyi, A. (1961), On measures of entropy and information, in ‘Fourth Berkeley symposium on mathematical statistics and probability’, Vol. 1, pp. 547–561. Revista Colombiana de Estadística 40 (2017) 165–203
Estimation of Transmuted Rayleigh distribution
203
Saidane, S., Babai, M., Aguir, M. & Korbaa, O. (2010), Spare parts inventory systems under an increasing failure rate demand interval distribution, in ‘41st International Conference on Computers and Industrial Engineering’, pp. 768– 773. Salinas, H. S., Iriarte, Y. A. & Bolfarine, H. (2015), ‘Slashed exponentiated rayleigh distribution’, Revista Colombiana de Estadística 38(2), 453–466. Shaked, M. & Shanthikumar, J. G. (1994), Stochastic orders and their applications, Academic Press, Boston, MA. Shannon, C. E. (1948), ‘Prediction and Entropy of Printed English’, Bell System Technical Journal 30(1), 50–64. Shaw, W. T. & Buckley, I. R. (2007), The Alchemy of Probability Distributions: Beyond Gram Charlier & Cornish Fisher Expansions, and Skew-Normal or Kurtotic-Normal Distributions, Technical report, Financial Mathematics Group, King’s College, London, U.K. Singh, S. K., Singh, U. & Sharma, V. K. (2013), ‘Bayesian analysis for TypeII hybrid censored sample from inverse Weibull distribution’, International Journal of System Assurance Engineering and Management 4(3), 241–248. Song, K. S. (2001), ‘Rényi information, loglikelihood and an intrinsic distribution measure’, Journal of Statistical Planning and Inference 93(1-2), 51–69. Stephens, M. A. (2013), ‘EDF Statistics for Goodness of Fit and Some Comparisons’, Journal of the American Statistical Association 69(347), 730–737. Swain, J. J., Venkatraman, S. & Wilson, J. R. (1988), ‘Least-squares estimation of distribution functions in johnson’s translation system’, Journal of Statistical Computation and Simulation 29(4), 271–297. Teimouri, M., Hoseini, S. M. & Nadarajah, S. (2013), ‘Comparison of estimation methods for the Weibull distribution’, Statistics 47(1), 93–109. Tian, Y., Tian, M. & Zhu, Q. (2014), ‘Transmuted Linear Exponential Distribution: A New Generalization of the Linear Exponential Distribution’, Communications in Statistics-Simulation and Computation 43(10), 2661–2677. Tsarouhas, P. & Arvanitoyannis, I. (2007), ‘Reliability and maintainability analysis of bread production line’, Critical Reviews in Food Science and Nutrition 50, 327–343. Woosley, R. & Cossman, J. (2007), ‘Drug development and the FDAs critical path initiative’, Public policy 81, 129–133.
Revista Colombiana de Estadística 40 (2017) 165–203