PARAMETER ESTIMATION IN A RIGHT-TRUNCATED GAMMA DISTRIBUTION

Muhammad El-Taha, University of Southern Maine
Department of Mathematics and Statistics, 96 Falmouth Street, Portland, ME 04104-9300 ([email protected])

Key Words: Truncated distributions, minimum variance unbiased estimates, maximum likelihood, Gamma.

Abstract: The purpose of this article is to give an explicit formula, based on the derivation of the minimum variance unbiased point estimate, for estimating the scale parameter of a right-truncated gamma distribution when the shape parameter, \alpha, is a known integer. Numerical comparisons on simulated data show that our formula outperforms the MLE method when both \alpha and the sample size are small.
1 Introduction

We consider a Gamma random variable X whose distribution is right-truncated at a known point X = b, b > 0, so that the probability density function of X may be written as

    f_X(x) = \frac{C x^{\alpha-1} e^{-x/\theta}}{\theta^{\alpha}\,\Gamma(\alpha)},  0 < x \le b,  0 < \theta < \infty;  f_X(x) = 0 otherwise,   (1)

where C is obtained such that \int_0^b f_X(x)\,dx = 1. El-Taha [6] obtains the MVUE of the scale parameter in a shifted (left-truncated) gamma distribution (no such explicit result is available for the right-truncated case). El-Taha and Evans [7] consider the special case of a right-truncated exponential distribution. They give an approximate formula for estimating \theta and provide numerical results that confirm the superiority of their method over the MLE (maximum likelihood estimate) method. This article extends the results of [7] to the Gamma case, and complements those of [6]. The problem of estimating the scale parameter \theta, when \alpha is a known integer, will be examined on the basis of a sample of size n, and an AMVUE (approximate minimum variance unbiased estimate) will be obtained. It is well known that the MLE method (see, for example, Hegde and Dahiya [9]) does not give accurate point
estimates in the case of small sample sizes. Our result may be viewed as complementary to the MLE method and its variants. Truncated distributions of the Gamma family arise in a variety of practical applications when limitations on time or space restrict the possible values of the gamma variable to a well-defined finite interval. Applications in which the shape parameter, \alpha, is an integer appear frequently in manufacturing, where the processing time of a job consists of a known number of exponential stages; see Law and Kelton [10], for example. The MVUE of the reliability function P\{X \ge t\}, for 0 < t < b, is obtained by Sathe and Varde [12] and Nath [11] in the cases of right-truncated exponential and Gamma distributions, respectively. See also Cohen [4], Cheng and Amin [2], and Bai, Jakeman and McAleer [1]. This paper is organized as follows. In Section 2 we examine the right-truncated case and give an explicit expression, (4), for an AMVUE of \theta that is computationally efficient when the sample size is small. In the special case of the truncated exponential distribution, approximation (4) reduces to a simple expression that is computationally efficient for all sample sizes, and thus provides an alternative to the MLE method. Section 3 briefly discusses the MLE method and its relation to the method of moments; in some special cases the two methods are shown to be equivalent. Our review emphasizes computational considerations. Section 4 provides a discussion of tabulated numerical results from simulated data.
2 Main Result

In this section we examine the right-truncated Gamma distribution and give an expression approximating the point estimate of the scale parameter \theta that can be easily evaluated in the case of small samples. To this end, let X_1, X_2, \ldots, X_n be independent and identically distributed random variables with density (1); then S = \sum_{j=1}^{n} X_j is a complete sufficient statistic in the untruncated case. Therefore it continues to be so in the truncated case; see, for example, Tukey [14]. Moreover, since the family of distributions is complete, the MVUE of \theta is a function of S; see Smith [13]. Now, the characteristic function of S is given by

    \phi_S(t) = C^n (1 - it\theta)^{-n\alpha} \Big[ 1 - e^{-b(1-it\theta)/\theta} \sum_{k=0}^{\alpha-1} \frac{(b/\theta)^k (1-it\theta)^k}{k!} \Big]^n,   (2)

which when inverted gives the probability density function of S as

    f_S(s) = \frac{C^n e^{-s/\theta}}{\Gamma(n\alpha)\,\theta^{n\alpha}} \sum_{k=0}^{n_0} (-1)^k \binom{n}{k} (s-kb)^{n\alpha-1}\, \phi\{n\alpha-1,\, b/(s-kb)\},
        n_0 = 0, 1, \ldots, n-1;  n_0 b \le s \le (n_0+1) b,   (3)

and f_S(s) = 0 otherwise, where

    \phi\{n\alpha-1,\, b/(s-kb)\} = \sum_{k_0,\ldots,k_{\alpha-1}} \frac{k!\,\bar{k}!}{\prod_{j=0}^{\alpha-1} k_j!\, \prod_{j=0}^{\alpha-1} (j!)^{k_j}}\, \binom{n\alpha-1}{\bar{k}} \Big(\frac{b}{s-kb}\Big)^{\bar{k}},

the summation being taken over all possible values of k_0, k_1, \ldots, k_{\alpha-1} such that k = \sum_{j=0}^{\alpha-1} k_j, with \bar{k} = \sum_{j=0}^{\alpha-1} j k_j. Equations (2) and (3) are given by Nath [11], who used moment generating functions and Laplace transforms in the derivations. Now we state the main result for the right-truncated case. Let [x] represent the largest integer less than or equal to x.
The Approximation. The AMVUE \hat{\theta}(s) of \theta is given by

    \hat{\theta}(s) = N/D,   (4)

where

    N = \sum_{k=0}^{[s/b]} \frac{(-1)^k (s-kb)^{n\alpha}}{(n-k)!} \sum_{k_0,\ldots,k_{\alpha-1}} \frac{\prod_{j=0}^{\alpha-1} h_j(k_j)}{(n\alpha-\bar{k})!},

    D = \sum_{k=0}^{[s/b]} \frac{(-1)^k (s-kb)^{n\alpha-1}}{(n-k)!} \sum_{k_0,\ldots,k_{\alpha-1}} \frac{\prod_{j=0}^{\alpha-1} h_j(k_j)}{(n\alpha-\bar{k}-1)!},

with h_j(k_j) = \big\{ k_j!\, [\, j!\, ((s-kb)/b)^j\, ]^{k_j} \big\}^{-1}, k = \sum_{j=0}^{\alpha-1} k_j, and \bar{k} = \sum_{j=0}^{\alpha-1} j k_j.
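As a concreteness check on formula (4), the inner multinomial sums can be evaluated by brute force for integer \alpha. The following is only an illustrative sketch, not the authors' simulation program: the function names (`compositions`, `amvue_theta`) and the enumeration strategy are ours, and it assumes s is not an exact multiple of b.

```python
from math import factorial, floor

def compositions(total, parts):
    # all tuples (k_0, ..., k_{parts-1}) of nonnegative integers summing to total
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def amvue_theta(s, n, b, alpha):
    # AMVUE of the scale parameter via equation (4); alpha a known integer >= 1
    N = D = 0.0
    for k in range(floor(s / b) + 1):
        inner_N = inner_D = 0.0
        for ks in compositions(k, alpha):       # k = k_0 + ... + k_{alpha-1}
            kbar = sum(j * kj for j, kj in enumerate(ks))
            h = 1.0                             # product of the h_j(k_j) terms
            for j, kj in enumerate(ks):
                h /= factorial(kj) * (factorial(j) * ((s - k * b) / b) ** j) ** kj
            inner_N += h / factorial(n * alpha - kbar)
            inner_D += h / factorial(n * alpha - kbar - 1)
        sign = (-1) ** k
        N += sign * (s - k * b) ** (n * alpha) / factorial(n - k) * inner_N
        D += sign * (s - k * b) ** (n * alpha - 1) / factorial(n - k) * inner_D
    return N / D
```

For b > s only the k = 0 term survives and the expression collapses to s/(n\alpha), in agreement with Remark 2.4 below.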
Justification. The MVUE of \theta is a function only of the complete sufficient statistic S. Let this function be \hat{\theta}(s); then it must satisfy

    \int_0^\infty \hat{\theta}(s)\, f_S(s)\, ds = \theta.   (5)

Substituting f_S(s) in (5), we obtain

    \int_0^\infty e^{-s/\theta}\, \hat{\theta}(s)\, f_1(s)\, ds = C^{-n}\, \Gamma(n\alpha)\, \theta^{n\alpha+1},   (6)

with

    f_1(s) = \sum_{k=0}^{n_0} (-1)^k \binom{n}{k} (s-kb)^{n\alpha-1}\, \phi\{n\alpha-1,\, b/(s-kb)\},

where n_0 = 0, 1, \ldots, n-1 and n_0 b \le s \le (n_0+1) b. Expanding 1/C^n by making use of the binomial and then the multinomial expansion formulas, condition (6) may be written as

    \int_0^\infty e^{-s/\theta}\, \hat{\theta}(s)\, f_1(s)\, ds = f_2(\theta),

where

    f_2(\theta) = \Gamma(n\alpha) \sum_{k=0}^{n} (-1)^k \binom{n}{k} \sum_{k_0,k_1,\ldots,k_{\alpha-1}} \frac{k!\, b^{\bar{k}}}{\prod_{j=0}^{\alpha-1} k_j!\, \prod_{j=0}^{\alpha-1}(j!)^{k_j}}\, e^{-kb/\theta}\, \theta^{n\alpha+1-\bar{k}}.

Now define the functions g_k(s), k = 0, 1, \ldots, n, by

    g_k(s) = \frac{(s-kb)^{n\alpha-\bar{k}}}{\Gamma(n\alpha+1-\bar{k})},  s \ge kb;  g_k(s) = 0 otherwise.   (7)

Let the function G(s) be defined as

    G(s) = \Gamma(n\alpha) \sum_{k=0}^{n} (-1)^k \binom{n}{k} \sum_{k_0,k_1,\ldots,k_{\alpha-1}} \frac{k!\, b^{\bar{k}}}{\prod_{j=0}^{\alpha-1} k_j!\, \prod_{j=0}^{\alpha-1}(j!)^{k_j}}\, g_k(s).

Now, let L_G[1/\theta] denote the Laplace transform of G with argument 1/\theta. One can readily verify that

    L_{G(s)}[1/\theta] = f_2(\theta),

and therefore

    L_{G(s)}[1/\theta] = L_{\hat{\theta}(s) f_1(s)}[1/\theta].

By the uniqueness of the inversion of Laplace transforms (Churchill [3], p. 183), one obtains \hat{\theta}(s) = G(s)/f_1(s).
It is evident that g_k(s) = 0 for k > [s/b] (see equation (7)); equation (4) then follows after further simple manipulations. The approximation step takes place in equation (6) and the subsequent simplifications: from that point on we ignore the fact that s \le nb, and perform the integration from 0 to \infty. Obviously, the accuracy of the point estimate improves as the truncation point increases.

Remark 2.1. The point estimator \hat{\theta}(S) given by (4) has the desirable properties of a MVUE asymptotically in nb. Since the truncation point, b, is typically greater than 1, our approximation converges faster than the corresponding MLE method, which is asymptotically efficient in n, the sample size.

Remark 2.2. In the approximation (4) the numerator differs from the denominator only in that (s-kb)^{n\alpha-1} and (n\alpha-\bar{k}-1)! are replaced by (s-kb)^{n\alpha} and (n\alpha-\bar{k})!, respectively. Consequently the numerator in (4) can be computed in the process of computing the denominator at no extra effort.

Remark 2.3. The major computational effort is in calculating the second summation in the denominator of (4). The number of terms within the second summation is \binom{k+\alpha-1}{k} for each k, k = 0, 1, \ldots, [s/b]. This number grows rapidly as \alpha and n, the sample size, increase; thus expression (4) is useful for "small" sample sizes and reasonable values of \alpha (see Section 4).

Remark 2.4. Approximation (4) reduces to \hat{\theta}(s) = s/(n\alpha) in the untruncated case, as expected. To obtain this formula from (4), simply choose b > s (which implies k = 0) and simplify. Approximation (4) also reduces to a simpler formula for the AMVUE of \theta in the case of a right-truncated exponential distribution (see El-Taha and Evans [7] for an independent derivation); namely, let \alpha = 1 in (4) to obtain:

Exponential Case. The AMVUE of \theta in a right-truncated exponential distribution is given by

    \hat{\theta}(s) = \frac{\sum_{k=0}^{[s/b]} (-1)^k \binom{n}{k} (s-kb)^n}{n \sum_{k=0}^{[s/b]} (-1)^k \binom{n}{k} (s-kb)^{n-1}}.
This estimate can be carried out by means of a simple program or even a programmable calculator.
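That simple program might look as follows; this is a sketch of the exponential-case formula above, with a function name of our choosing.

```python
from math import comb, floor

def amvue_theta_exp(s, n, b):
    # AMVUE of theta for a right-truncated exponential sample with sum s
    K = floor(s / b)
    num = sum((-1) ** k * comb(n, k) * (s - k * b) ** n for k in range(K + 1))
    den = n * sum((-1) ** k * comb(n, k) * (s - k * b) ** (n - 1) for k in range(K + 1))
    return num / den
```

When b exceeds s the sums reduce to the single k = 0 term and the estimate is s/n, the usual untruncated value.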
3 Maximum Likelihood Method

In this section a computational approach based on the MLE method and the method of moments is briefly reviewed, and computational formulas for the right-truncated gamma are presented. The basic result of this section is not new; our presentation, however, provides additional insight into the MLE method. For related literature the reader may refer to Hegde and Dahiya [9], Cohen [4], and references therein.

Let K(\theta) = \int_0^b y^{\alpha-1} e^{-y/\theta}\, dy. With this notation the probability density function (1) may be written as

    f_X(x) = \frac{x^{\alpha-1} e^{-x/\theta}}{K(\theta)},  0 < x \le b,  0 < \theta < \infty;  f_X(x) = 0 otherwise.   (8)

We start with two simple results that are of interest in their own right.
Lemma 3.1 Let X be a random variable with a truncated gamma distribution as given in (8).
(i) For r \ge 1, the r-th moment of X, \mu_r, is given recursively by

    \mu_r = (\alpha+r-1)\,\theta\, \mu_{r-1} - \frac{\theta\, b^{\alpha+r-1} e^{-b/\theta}}{K(\theta)},  \mu_0 = 1.

(ii) For r \ge 1, the r-th moment of X is given explicitly as

    \mu_r = \theta^r \prod_{j=1}^{r} (\alpha+j-1) - \frac{\theta\, b^{\alpha} e^{-b/\theta}}{K(\theta)} \sum_{i=1}^{r} \theta^{r-i}\, b^{i-1} \prod_{j=i+1}^{r} (\alpha+j-1),
where \prod_{j=r+1}^{r}(\cdot) = 1.

Proof: The proof of (i) follows by applying the definition of the r-th moment, E(X^r), and using integration by parts. Part (ii) follows by induction on r, exploiting (i).

Lemma 3.2 Let \mu_r(\theta) = K_{\alpha+r}(\theta)/K_{\alpha}(\theta), r = 1, 2, \ldots, be the r-th moment of (8), where K_{\alpha}(\theta) denotes K(\theta) with shape parameter \alpha. Then (i) \mu_r(\theta) \to 0 as \theta \to 0; (ii) \mu_r(\theta) \to \frac{\alpha}{\alpha+r}\, b^r as \theta \to \infty; and (iii) 0 < \mu_r(\theta) < \frac{\alpha}{\alpha+r}\, b^r for all 0 < \theta < \infty.

Proof: By definition

    \mu_r(\theta) = \frac{\int_0^b x^{\alpha+r-1} e^{-x/\theta}\, dx}{\int_0^b x^{\alpha-1} e^{-x/\theta}\, dx}.   (9)

To prove (i), let u = x/\theta in (9) and take the limit as \theta \to 0. Part (ii) follows by taking limits in (9) as \theta \to \infty. Part (iii) follows from the fact that \mu_r(\theta) is a monotonic function of \theta (see Gross [8]).

Let X_1, \ldots, X_n be a sample of size n. Then the likelihood function L(\theta) is given by

    L(\theta) = \frac{\big[\prod_{i=1}^{n} x_i\big]^{\alpha-1} e^{-n\bar{x}/\theta}}{[K(\theta)]^n}.   (10)

Setting \partial \ln L(\theta)/\partial\theta = 0, the MLE of \theta is obtained as the unique solution, if it exists, of

    \bar{x} = \frac{\int_0^b x^{\alpha} e^{-x/\theta}\, dx}{\int_0^b x^{\alpha-1} e^{-x/\theta}\, dx}.   (11)
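Lemma 3.1(i) is easy to check numerically: the recursion should reproduce the moment ratio in (9) computed by direct quadrature. A minimal sketch follows; the Simpson-rule helper and the function names are ours, not the paper's.

```python
from math import exp

def simpson(f, a, b, m=2000):
    # composite Simpson rule with m (even) subintervals
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def truncated_gamma_moments(alpha, theta, b, rmax):
    # moments mu_1, ..., mu_rmax of (8) via the recursion of Lemma 3.1(i)
    K = simpson(lambda y: y ** (alpha - 1) * exp(-y / theta), 0.0, b)
    mu = [1.0]
    for r in range(1, rmax + 1):
        mu.append((alpha + r - 1) * theta * mu[r - 1]
                  - theta * b ** (alpha + r - 1) * exp(-b / theta) / K)
    return mu
```

The recursive values agree with direct quadrature of (9) to within the quadrature error, and they respect the bound of Lemma 3.2(iii).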
Note that the right-hand side of (11) is the first moment \mu_1. The MLE of \theta is given by the following theorem.

Theorem 3.3 Let \hat{\theta} be the MLE of \theta. Then (i) if 0 < \bar{x} < \frac{\alpha}{\alpha+1}\, b, then \hat{\theta} is the unique solution of (11); and (ii) if \bar{x} \ge \frac{\alpha}{\alpha+1}\, b, then \hat{\theta} does not exist (in the sense that \hat{\theta} = \infty).

Proof: Observe that \partial \ln L(\theta)/\partial\theta = 0 implies that \mu_1(\theta) = \bar{x}. By Lemma 3.2, \mu_1(\theta) = \bar{x} has a solution when 0 < \bar{x} < \frac{\alpha}{\alpha+1}\, b. To verify that (11) has a unique solution in this case, note that \lim_{\theta\to 0} [\mu_1(\theta) - \bar{x}] < 0, \lim_{\theta\to\infty} [\mu_1(\theta) - \bar{x}] > 0, and \mu_1(\theta) - \bar{x} is a monotonic function. This completes the proof of (i). Now, observe that

    \lim_{\theta\to\infty} L(\theta) = \frac{\alpha^n \big(\prod_{i=1}^{n} x_i\big)^{\alpha-1}}{b^{n\alpha}}.

So, to prove (ii) it suffices to show that \partial L(\theta)/\partial\theta \ge 0 when \bar{x} \ge \frac{\alpha}{\alpha+1}\, b; thus L(\theta) assumes its maximum as \theta \to \infty. This completes the proof of the theorem.

Appealing to Lemma 3.1, equation (11) may be rewritten as

    \bar{x} = \alpha\theta - \frac{\theta\, b^{\alpha} e^{-b/\theta}}{K(\theta)}.   (12)
So if 0 < \bar{x} < \frac{\alpha}{\alpha+1}\, b, the MLE of \theta may be obtained by iteratively solving

    \theta_{r+1} = \frac{1}{\alpha}\Big[\bar{x} + \frac{\theta_r\, b^{\alpha} e^{-b/\theta_r}}{K(\theta_r)}\Big],  r = 0, 1, \ldots,   (13)

where \theta_0 = \bar{x}/\alpha, and \theta_r is the point estimate of \theta at the r-th iteration. At each iteration K(\theta_r) may be evaluated using numerical integration (see Conte and de Boor [5], for example). These iterations can be carried out by means of a simple program. The sequence \theta_r may be terminated after k steps once |\theta_k - \theta_{k-1}| < \epsilon, a predetermined level of accuracy, chosen as 0.00001 in this article. Regarding convergence properties, it can be shown by induction that the sequence of estimates \{\theta_r\} increases with every step of the iteration. Moreover, one can establish that the sequence \{\theta_r\} converges to a unique fixed point by noting that the derivative g'(\theta) < 1, where

    g(\theta) = \frac{1}{\alpha}\Big[\bar{x} + \frac{\theta\, b^{\alpha} e^{-b/\theta}}{K(\theta)}\Big],

and then using Theorem 3.1 of Conte and de Boor [5]. Theorem 3.3 and numerical evidence indicate that the unique fixed point is indeed the sought-after \hat{\theta}. Since our interest is in \alpha being an integer, the recursive MLE equation (13) reduces, for integer \alpha, to the more numerically efficient form
    \theta_{r+1} = \frac{\bar{x}}{\alpha} + \frac{b^{\alpha}}{\alpha!\, \theta_r^{\alpha-1}} \Big[ e^{b/\theta_r} - \sum_{k=0}^{\alpha-1} \frac{(b/\theta_r)^k}{k!} \Big]^{-1},  r = 0, 1, \ldots,   (14)

where \theta_0 = \bar{x}/\alpha. This form will be used in Section 4.
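Iteration (14) is straightforward to implement. The following is a hedged sketch; the function name, tolerance handling, and iteration cap are our choices, not the paper's.

```python
from math import exp, factorial

def mle_theta(xbar, alpha, b, tol=1e-5, max_iter=10000):
    # fixed-point iteration (14); assumes xbar < alpha*b/(alpha+1) (Theorem 3.3)
    theta = xbar / alpha
    for _ in range(max_iter):
        tail = exp(b / theta) - sum((b / theta) ** k / factorial(k)
                                    for k in range(alpha))
        new = xbar / alpha + b ** alpha / (factorial(alpha)
                                           * theta ** (alpha - 1) * tail)
        if abs(new - theta) < tol:
            return new
        theta = new
    return theta
```

Each pass evaluates the bracketed tail once; for integer \alpha no numerical integration of K(\theta) is needed, which is the point of (14).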
Now, we briefly discuss the method of moments and its relation to the MLE method. Let r = 1 in Lemma 3.1 to obtain equation (12), from which the estimate of \theta by the method of moments is obtained as the solution, if it exists, of equation (13). We have seen, from Theorem 3.3, that equation (12) (equivalently (11)) has a unique solution if and only if 0 < \bar{x} < \frac{\alpha}{\alpha+1}\, b. The solution can be obtained recursively using (13). So, when 0 < \bar{x} < \frac{\alpha}{\alpha+1}\, b, the moment estimate coincides with the MLE. If \bar{x} \ge \frac{\alpha}{\alpha+1}\, b, the iterative procedure described in (13) can still be used as a moment estimate. In such a sample, however, the rate of convergence is bound to be slow, since a large value of \bar{x} is likely to occur when the true parameter \theta is itself large (Theorem 3.3). To see that this is the case, recall from Lemma 3.1 and Lemma 3.2 that E(X) = \mu_1(\theta) is an increasing function of \theta, that it approaches zero as \theta approaches zero, and that it approaches \frac{\alpha}{\alpha+1}\, b as \theta increases indefinitely. Moreover, Lemma 3.1 implies that the standard error of \bar{X} is bounded above by b\sqrt{\alpha/((\alpha+2)n)} for all \theta values. So even for large values of \theta, the chances that \bar{x} exceeds \frac{\alpha}{\alpha+1}\, b are small and decrease for large samples. We remark in passing that since \alpha\theta is the expected value of the non-truncated variable, it is not likely in practice that the truncation point would occur below \alpha\theta; hence large values of \theta are mainly of theoretical interest, and samples in which \bar{x} exceeds \frac{\alpha}{\alpha+1}\, b are rare.
4 Numerical Comparisons

In this section we provide numerical evidence that the AMVUE formula given in (4) outperforms the MLE method for small \alpha values. Random samples of sizes 8, 16 and 24 are generated from right-truncated gamma distributions (of the form (1)) with various values of \alpha, \theta, and truncation points b. The values of \alpha range from 1 to 6, and \theta is chosen so that the mean, \alpha\theta, of the untruncated distribution remains fixed at 10. The truncation point b takes the values 10, 15, and 25. The point estimate \hat{\theta} is then computed for each case using the AMVUE (4) and the MLE (equation (14)) methods, respectively. Repeated sampling is used to compute the relative bias (rbias) and relative mean square error (rmse); see Bai et al. [1], p. 402. The computations are tabulated in Tables 1 through 3. All entries corresponding to both methods are computed using 1000 simulation runs. For variance reduction purposes, identical samples were used in both methods. The simulation programs are written in C and carried out on a SUN SPARCstation 1. Examining the tabulated data, the following points can be made:

1. The AMVUE method outperforms the MLE method in nearly all reported cases. However, for "large" values of \alpha, the AMVUE becomes impractical, as the number of computations grows rapidly in \alpha and n and the propagation error makes the computations less accurate. This is not a serious drawback, as "large" \alpha values are not common in most practical applications. Moreover, when \alpha is large the MLE method tends to improve.

2. The MLE method is useful when b, the truncation point, is "large" relative to \theta, \alpha is large, and a large sample size is employed. This is not unexpected, since as \alpha increases, keeping the mean, \alpha\theta, of the untruncated distribution fixed, the variability (coefficient of variation) of the gamma distribution decreases at a rate proportional to 1/\sqrt{\alpha}.

3. Both methods tend to improve as b, the truncation point, becomes larger relative to \theta, as expected. However, the AMVUE method tends to outperform the MLE method in those cases as well.

So our recommendation would be to use the AMVUE method for small \alpha and small sample sizes, and to use the MLE method or one of its variants otherwise.
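The sampling step of the experiment above can be reproduced with a simple rejection scheme that exploits the exponential-stages interpretation of an integer-shape gamma variable. This Python sketch stands in for the authors' C program; the function name and interface are our choices.

```python
import random

def truncated_gamma_sample(n, alpha, theta, b, rng):
    # draw n variates from the right-truncated gamma (1) by rejection:
    # a Gamma(alpha, theta) variate is the sum of alpha Exp(theta) stages;
    # accept it only when it does not exceed the truncation point b
    out = []
    while len(out) < n:
        x = sum(rng.expovariate(1.0 / theta) for _ in range(alpha))
        if x <= b:
            out.append(x)
    return out
```

The acceptance probability is P(X \le b) for the untruncated gamma, so rejection is cheap in the regimes of Tables 1 through 3.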
References

[1] J. Bai, A.J. Jakeman, and M. McAleer. A new approach to maximum likelihood estimation of the three-parameter Gamma and Weibull distributions. Austral. J. Statist., 33:397-410, 1991.
[2] R.C.H. Cheng and N.A.K. Amin. Estimating parameters in continuous univariate distributions with a shifted origin. J. Roy. Statist. Soc. B, 45:394-403, 1983.
[3] R.V. Churchill. Operational Mathematics. McGraw-Hill, New York, 1958.
[4] A.C. Cohen. Truncated and Censored Samples. Marcel Dekker, New York, 1991.
[5] D. Conte and C. de Boor. Elementary Numerical Analysis. McGraw-Hill, New York, 1980.
[6] M. El-Taha. Minimum variance unbiased estimation in shifted Gamma distribution. Communications in Statistics: Simulation and Computation, 22:831-843, 1993.
[7] M. El-Taha and W. Evans. A new estimation procedure for a right-truncated exponential distribution. Proceedings of the 23rd Pittsburgh Conference on Modeling and Simulation, 23:427-434, 1992.
[8] A. Gross. Monotonicity properties of the moments of truncated Gamma and Weibull density functions. Technometrics, 13:332-335, 1971.
[9] L.M. Hegde and R.C. Dahiya. Estimation of the parameters of a truncated Gamma distribution. Communications in Statistics, Theory and Methods, pages 561-577, 1989.
[10] A. Law and W. Kelton. Simulation Modeling and Analysis. McGraw-Hill, New York, 1991.
[11] G.B. Nath. Unbiased estimates of reliability from the truncated Gamma distribution. Scandinavian Actuarial Journal, pages 181-186, 1975.
[12] Y.S. Sathe and S.D. Varde. Minimum variance unbiased estimates of reliability for the truncated exponential distribution. Technometrics, 11:609-612, 1969.
[13] W. Smith. A note on truncation and sufficient statistics. Annals of Mathematical Statistics, 28:247-252, 1957.
[14] J.W. Tukey. Sufficiency, truncation and selection. Annals of Mathematical Statistics, 20:309-311, 1949.
Table 1. Comparison of AMVUE and MLE (b = 10.0, \alpha\theta = 10).

                          n = 8                 n = 16                n = 24
 \alpha \theta         amvu       mle        amvu       mle        amvu       mle
  1  10.00 rbias  -0.450119  0.528881  -0.303270  0.576238  -0.193016  0.710738
           rmse    0.245372  6.173533   0.166774  5.481652   0.150088  6.247508
  2   5.00 rbias  -0.196844  0.748936  -0.071225  0.512874  -0.010186  0.394479
           rmse    0.130583  7.515855   0.130455  4.361994   0.145945  2.522101
  3   3.33 rbias  -0.298893  0.482409  -0.271567  0.275303  -0.269312  0.122914
           rmse    0.107461  4.549133   0.085768  1.474373   0.080187  0.289077
  4   2.50 rbias  -0.068370  0.360579   0.005298  0.205807   0.004504  0.108344
           rmse    0.099682  2.531499   0.100232  0.744445   0.069737  0.344849
  5   2.00 rbias  -0.031724  0.345076   0.009564  0.142042   0.001387  0.067062
           rmse    0.099750  1.996279   0.078178  0.327811   0.044176  0.083770
  6   1.66 rbias  -0.031266  0.247706  -0.001072  0.105042   0.003695  0.064761
           rmse    0.085485  1.381634   0.064225  0.392063   0.042667  0.214202

Table 2. Comparison of AMVUE and MLE (b = 15.0, \alpha\theta = 10).

                          n = 8                 n = 16                n = 24
 \alpha \theta         amvu       mle        amvu       mle        amvu       mle
  1  10.00 rbias  -0.270162  0.783725  -0.117069  0.570487  -0.037195  0.515137
           rmse    0.168336 10.359440   0.145926  5.534598   0.160661  4.871815
  2   5.00 rbias  -0.054171  0.528705  -0.005784  0.194013   0.017512  0.119112
           rmse    0.140224  7.412675   0.110744  1.414476   0.085043  0.225913
  3   3.33 rbias  -0.123600  0.194035  -0.114823  0.070375  -0.115793  0.039073
           rmse    0.051121  0.693668   0.031754  0.111087   0.025682  0.053814
  4   2.50 rbias   0.001416  0.122752  -0.002555  0.038414   0.000938  0.027837
           rmse    0.067856  0.667335   0.033731  0.048005   0.027793  0.036360
  5   2.00 rbias  -0.005486  0.063996  -0.001689  0.028338  -0.000196  0.018168
           rmse    0.053469  0.108321   0.030404  0.043148   0.016516  0.019607
  6   1.66 rbias  -0.002663  0.046892   0.006277  0.029008  -0.005097  0.008526
           rmse    0.037091  0.065973   0.022327  0.028462   0.013043  0.014777

Table 3. Comparison of AMVUE and MLE (b = 25.0, \alpha\theta = 10).

                          n = 8                 n = 16                n = 24
 \alpha \theta         amvu       mle        amvu       mle        amvu       mle
  1  10.00 rbias  -0.074136  0.517770  -0.016731  0.205190   0.005413  0.161599
           rmse    0.170844  8.036053   0.155809  1.099495   0.137292  1.300106
  2   5.00 rbias  -0.006667  0.087678   0.007734  0.043607   0.007166  0.030200
           rmse    0.110394  0.280958   0.047761  0.063495   0.035323  0.042992
  3   3.33 rbias  -0.004972  0.046853  -0.021480  0.009581  -0.019252  0.007450
           rmse    0.039560  0.070677   0.021065  0.029417   0.015483  0.020516
  4   2.50 rbias  -0.005853  0.010325   0.003348  0.010754   0.002656  0.007451
           rmse    0.039232  0.049135   0.018594  0.020230   0.013049  0.013746
  5   2.00 rbias   0.003854  0.012751   0.002114  0.006155   0.006110  0.008801
           rmse    0.027310  0.030662   0.013482  0.014186   0.009208  0.009545
  6   1.66 rbias  -0.002347  0.002556   0.001227  0.003552   0.000822  0.002324
           rmse    0.021298  0.022838   0.010951  0.011332   0.007328  0.007492