METRON - International Journal of Statistics 2008, vol. LXVI, n. 3, pp. 329-339

SAMINDRA SENGUPTA

Unbiased variance estimation in estimating P (X > Y) for one and two parameter exponential populations

Summary - In this paper we consider the problem of unbiased estimation of the variance of the uniformly minimum variance unbiased estimator (UMVUE) of P(X > Y) for two independent random variables X and Y, each following a one or two parameter exponential distribution with unknown parameters. In each case we give a necessary and sufficient condition for the existence of an unbiased variance estimator based on random samples from the two populations, and obtain the UMVUE of the variance in situations where unbiased estimators exist.

Key Words - Estimation of P(X > Y); Exponential population; Uniformly minimum variance unbiased estimator; Variance estimation.

1. Introduction

A two parameter exponential distribution with real parameters µ and λ (> 0), to be denoted hereafter as the exp(µ, λ) distribution, is defined by the probability density function (p.d.f.)

    f(x | µ, λ) = (1/λ) e^{−(x−µ)/λ},   x > µ.    (1)

The special case µ = 0, to be denoted as the exp(λ) distribution, is called a one parameter exponential distribution. Both one and two parameter exponential distributions are widely used as models in statistical procedures. Let X and Y be two independent random variables following, respectively, exp(µ1, λ1) and exp(µ2, λ2) distributions with some or all parameters unknown.

Received July 2008 and revised December 2008.


Consider the problem of unbiased estimation of

    θ = P(X > Y) = (1/(1+σ)) e^{δ/λ1},          if δ ≤ 0
                 = 1 − (σ/(1+σ)) e^{−δ/λ2},     if δ > 0    (2)

where δ = µ1 − µ2 and σ = λ2/λ1. For µ1 = µ2 = 0, θ = P(X > Y) = (1 + σ)^{−1}. The problem is of interest and has applications in different fields like reliability, biometry etc. (see Gross and Clark, 1975; Kotz et al., 2003).

For two independent random samples of sizes n and m from the two populations, i.e. independent observations X1, X2, ..., Xn and Y1, Y2, ..., Ym on X and Y, respectively, the uniformly minimum variance unbiased estimator (UMVUE) of θ was obtained in Tong (1974, 1975) for one parameter exponential distributions and in Beg (1980) [see also Kotz et al. (2003), Chapter 3, Section 3.2.2] for two parameter exponential distributions. Also, for n, m ≥ 2, the UMVUE of the variance of the UMVUE of θ for one parameter exponential distributions is given in Kotz et al. (2003, Chapter 2, Section 2.2.3) in the form of a multiple integral. However, to the best of our knowledge, the problem of unbiased estimation of the variance of the UMVUE of θ for two parameter exponential distributions has not so far been studied in the literature except for some special cases, e.g. for known λ1 and λ2 (see Ivshin (1996), Pal et al. (2005)).

In this paper we address this problem of unbiased estimation of the variance of the UMVUE of θ or, equivalently, the problem of unbiased estimation of θ², based on independent random samples from two parameter exponential populations with all parameters unknown. In particular, we give a necessary and sufficient condition on the values of n, m for the existence of an unbiased estimator of θ² and obtain its UMVUE in situations where unbiased estimators exist. We also present analogous results for one parameter exponential populations; the derived expression of the UMVUE of θ² appears to be computationally more convenient than that given in Kotz et al. (2003).

2. Unbiased estimation of θ² for one parameter exponential distributions

We first consider one parameter exponential distributions and obtain below a necessary and sufficient condition for the existence of an unbiased estimator of θ².
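As a quick numerical sanity check, the closed form (2) for θ can be compared with direct simulation. The sketch below is illustrative only (the function names `theta_formula` and `theta_mc` are ours, not the paper's); it uses Python's `random.expovariate`, whose argument is the rate 1/λ.

```python
import math
import random

def theta_formula(delta, lam1, lam2):
    """Closed form (2): theta = P(X > Y), delta = mu1 - mu2, sigma = lam2/lam1."""
    sigma = lam2 / lam1
    if delta <= 0:
        return math.exp(delta / lam1) / (1 + sigma)
    return 1 - sigma * math.exp(-delta / lam2) / (1 + sigma)

def theta_mc(mu1, lam1, mu2, lam2, reps=200_000, seed=1):
    """Monte Carlo estimate of P(X > Y), X ~ exp(mu1, lam1), Y ~ exp(mu2, lam2)."""
    rng = random.Random(seed)
    hits = sum(mu1 + rng.expovariate(1 / lam1) > mu2 + rng.expovariate(1 / lam2)
               for _ in range(reps))
    return hits / reps
```

For example, `theta_mc(1.0, 2.0, 0.5, 1.0)` should agree with `theta_formula(0.5, 2.0, 1.0)` up to Monte Carlo error, and at δ = 0 both branches of (2) reduce to 1/(1 + σ).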


Theorem 2.1. Let µ1 = µ2 = 0 and, for some real constant d,

    θd = P(X > Y − d) = (1/(1+σ)) e^{d/λ1},          if d ≤ 0
                      = 1 − (σ/(1+σ)) e^{−d/λ2},     if d > 0.    (3)

Then there exists an unbiased estimator of θd² based on random samples from the two populations if and only if max(n, m) ≥ 2.

Proof. Assume without loss of generality d ≤ 0. Let max(n, m) ≥ 2 and define

    T = I(X1 > Y1 + Y2 − 2d),                                  if m ≥ 2
      = {1 − 2(Y1 − d)/(X1 + X2)} I(X1 + X2 > Y1 − 2d),        if n ≥ 2    (4)

where I(A) is the indicator function of the set A. Simple calculations yield, for m ≥ 2,

    E(T) = E[e^{−(Y1 + Y2 − 2d)/λ1}]
         = (e^{2d/λ1}/λ2²) ∫_0^∞ y e^{−y(1/λ1 + 1/λ2)} dy = θd²,

while for n ≥ 2,

    E(T) = P[X1 + X2 > Y1 − 2d] − E[{2(Y1 − d)/λ1} e^{−(Y1 − 2d)/λ1}]
         = E[{1 + (Y1 − 2d)/λ1 − 2(Y1 − d)/λ1} e^{−(Y1 − 2d)/λ1}]
         = E[(1 − Y1/λ1) e^{−(Y1 − 2d)/λ1}]
         = (e^{2d/λ1}/λ2) ∫_0^∞ (1 − y/λ1) e^{−y(1/λ1 + 1/λ2)} dy = θd².

Thus T is an unbiased estimator of θd², which proves the 'if' part of the theorem.
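Both branches of the estimator T in (4) can likewise be checked by simulation for a given d ≤ 0. This is a sketch under our own naming (`theta_d`, `mean_T`); it is not part of the paper.

```python
import math
import random

def theta_d(d, lam1, lam2):
    """theta_d = P(X > Y - d) of (3), with mu1 = mu2 = 0."""
    sigma = lam2 / lam1
    if d <= 0:
        return math.exp(d / lam1) / (1 + sigma)
    return 1 - sigma * math.exp(-d / lam2) / (1 + sigma)

def mean_T(d, lam1, lam2, branch, reps=200_000, seed=7):
    """Average the estimator T of (4) over simulated samples (assumes d <= 0)."""
    rng = random.Random(seed)
    draw = lambda lam: rng.expovariate(1 / lam)
    total = 0.0
    for _ in range(reps):
        if branch == "m>=2":
            # T = I(X1 > Y1 + Y2 - 2d)
            total += draw(lam1) > draw(lam2) + draw(lam2) - 2 * d
        else:
            # T = {1 - 2(Y1 - d)/(X1 + X2)} I(X1 + X2 > Y1 - 2d)
            s, y = draw(lam1) + draw(lam1), draw(lam2)
            if s > y - 2 * d:
                total += 1 - 2 * (y - d) / s
    return total / reps
```

Both branches should agree with θd² up to Monte Carlo error.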


Let now n = m = 1 and let T = T(X1, Y1) be an unbiased estimator of θd² based on (X1, Y1). Since the conditional distribution of Y1 given S = Y1 + Y2 (being uniform on (0, S)) does not depend on λ2, and the distribution of (X1, S) is complete, it follows that T′ = E(T | X1, S) is the unique unbiased estimator of θd² based on (X1, S). Also, it follows from (4) that a (unique) unbiased estimator of θd² based on (X1, S) is I(X1 > S − 2d). Thus we must have

    T′ = (1/S) ∫_0^S T(X1, y1) dy1 = I(X1 > S − 2d).    (5)

Differentiating both sides of (5) with respect to S we get T = I(X1 > Y1 − 2d). However, for T = I(X1 > Y1 − 2d), E(T) = θ_{2d}, which is a contradiction. Hence there cannot exist any unbiased estimator of θd² based on (X1, Y1), which proves the 'only if' part of the theorem.

For d = 0, we immediately obtain the following corollary.

Corollary 2.2. For exp(λ1) and exp(λ2) populations, there exists an unbiased estimator of θ² based on random samples from the two populations if and only if max(n, m) ≥ 2.

For exp(λ1) and exp(λ2) populations, we now obtain the UMVUE of θ² for max(n, m) ≥ 2 and without loss of generality assume m ≥ 2. Since for the problem a complete sufficient statistic is (SX, SY) with SX = Σ_{i=1}^n Xi and SY = Σ_{j=1}^m Yj, it follows that an unbiased estimator of θ² based on (SX, SY) is the UMVUE of θ², and it can be obtained through Rao-Blackwellization of the estimator T defined in (4), viz.

    θ̂² = E[I(X1 > Y1 + Y2) | SX, SY].    (6)

Since the conditional p.d.f. of X1 given SX is

    ((n − 1)/SX) (1 − x1/SX)^{n−2},   0 < x1 < SX,

we have

    P[X1 > Y1 + Y2 | Y1 + Y2, SX, SY]
        = 0,                                                      if Y1 + Y2 > SX
        = (n − 1) ∫_0^{1 − (Y1+Y2)/SX} u^{n−2} du
        = {1 − (Y1 + Y2)/SX}^{n−1},                               if Y1 + Y2 ≤ SX.

Also, the conditional p.d.f. of Z = Y1 + Y2 given SY is

    (m − 1)(m − 2) (z/SY²) (1 − z/SY)^{m−3},   0 < z < SY.


Hence, by (6), the UMVUE of θ² is given by

    θ̂² = P[X1 > Y1 + Y2 | SX, SY] = h0(n, m, SX, SY)    (7)

where, for b > c,

    h0(n, m, b, c) = (m − 1)(m − 2) ∫_0^1 (1 − cu/b)^{n−1} u (1 − u)^{m−3} du
                   = (m − 1) h01(n, m − 1, b, c) − (m − 2) h01(n, m, b, c)    (8)

and, for b ≤ c,

    h0(n, m, b, c) = (m − 1)(m − 2) (b/c) ∫_0^1 (1 − u)^{n−1} (1 − bu/c)^{m−3} (bu/c) du
                   = (m − 1) h02(n, m − 1, b, c) − (m − 2) h02(n, m, b, c)    (9)

with

    h01(n, m, b, c) = (m − 1) ∫_0^1 (1 − cu/b)^{n−1} (1 − u)^{m−2} du
                    = Σ_{i=0}^{n−1} (−1)^i [Γ(n)Γ(m) / {Γ(n − i)Γ(m + i)}] (c/b)^i    (10)

and

    h02(n, m, b, c) = (m − 1) (b/c) ∫_0^1 (1 − bu/c)^{m−2} (1 − u)^{n−1} du
                    = Σ_{i=0}^{m−2} (−1)^i [Γ(n)Γ(m) / {Γ(n + i + 1)Γ(m − i − 1)}] (b/c)^{i+1}.    (11)
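The closed forms (7)-(11) translate directly into code. The following is an illustrative sketch (function names `h01`, `h02`, `h0` mirror the paper's notation; `math.gamma` supplies the Γ factors), assuming integer arguments with m ≥ 2:

```python
import math

def h01(n, m, b, c):
    """Closed-form sum (10)."""
    return sum((-1) ** i * math.gamma(n) * math.gamma(m)
               / (math.gamma(n - i) * math.gamma(m + i)) * (c / b) ** i
               for i in range(n))

def h02(n, m, b, c):
    """Closed-form sum (11)."""
    return sum((-1) ** i * math.gamma(n) * math.gamma(m)
               / (math.gamma(n + i + 1) * math.gamma(m - i - 1)) * (b / c) ** (i + 1)
               for i in range(m - 1))

def h0(n, m, b, c):
    """UMVUE (7): theta-hat-squared = h0(n, m, S_X, S_Y), via (8) and (9)."""
    h = h01 if b > c else h02
    return (m - 1) * h(n, m - 1, b, c) - (m - 2) * h(n, m, b, c)
```

Averaging `h0(n, m, SX, SY)` over simulated exponential samples should recover θ², and each branch can be checked against numerical integration of the defining integral in (8) or (9).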

3. Unbiased estimation of θ² for two parameter exponential distributions

For two parameter exponential populations with known λ1 and λ2, Ivshin (1996) and Pal et al. (2005) derived the UMVUE of θ² based on the complete sufficient statistic (X(1), Y(1)), where X(i) and Y(j) are, respectively, the i-th and the j-th order statistics based on the two random samples, 1 ≤ i ≤ n, 1 ≤ j ≤ m. The result is quoted in Theorem 3.1 below.


Theorem 3.1. For known λ1 and λ2 and for n, m ≥ 2, the UMVUE of θ² based on random samples from exp(µ1, λ1) and exp(µ2, λ2) populations is

    T* = T*(D) = [(n − 2)(m + 2σ) / {nm(1 + σ)²}] e^{2D/λ1},            if D ≤ 0
               = 1 − 2[(m − 1)(1 + nσ) / {nm(1 + σ)}] e^{−D/λ2}
                   + [(m − 2)σ(2 + nσ) / {nm(1 + σ)²}] e^{−2D/λ2},      if D > 0    (12)

where D = X(1) − Y(1).

Further, we prove in the lemma below that, for known λ1 and λ2, an unbiased estimator of θ² based on (X(1), Y(1)) is necessarily of the form (12) even for min(n, m) = 1.

Lemma 3.2. For exp(µ1, λ1) and exp(µ2, λ2) populations with known λ1, λ2 and for min(n, m) = 1, an unbiased estimator of θ² based on (X(1), Y(1)) is necessarily of the form (12).

Proof. Let λ1, λ2 be known and assume without loss of generality m = 1. Let T = T(X(1), Y1) be an unbiased estimator of θ², i.e. T(x, y) satisfies

    [n/(λ1λ2)] ∫_{µ2}^∞ ∫_{µ1}^∞ T(x, y) e^{−nx/λ1 − y/λ2} dx dy = e^{−nµ1/λ1 − µ2/λ2} θ²
        for all −∞ < µ1, µ2 < ∞.    (13)

Differentiating both sides of (13) successively with respect to µ1 and µ2 we get

    T(µ1, µ2) = [(n − 2)(1 + 2σ) / {n(1 + σ)²}] e^{2δ/λ1},     if δ ≤ 0
              = 1 − [σ(2 + nσ) / {n(1 + σ)²}] e^{−2δ/λ2},      if δ > 0,

which implies that T is of the form (12). This completes the proof of the lemma.

We now consider the problem of unbiased estimation of θ² for exp(µ1, λ1) and exp(µ2, λ2) populations with all parameters unknown, and first give in the following theorem a necessary and sufficient condition for the existence of an unbiased estimator of θ².

Theorem 3.3. For two parameter exponential populations with all parameters unknown, there exists an unbiased estimator of θ² if and only if n, m ≥ 2.


Proof. Assume without loss of generality m = min(n, m) and let m = 1. Suppose there exists an unbiased estimator of θ², say T. Then T is also an unbiased estimator of θ² for known λ1, λ2, and hence E(T | X(1), Y1) is an unbiased estimator of θ² for known λ1, λ2, since for known λ1, λ2, (X(1), Y1) is a sufficient statistic. Thus by Lemma 3.2 we must have E(T | X(1), Y1) = T*, where T* is defined in (12). But this is impossible, as T, and hence E(T | X(1), Y1), cannot depend on λ2. Hence there cannot exist any unbiased estimator of θ² for min(n, m) = 1. Also, for n, m ≥ 2, an unbiased estimator of θ² is I(X1 > Y1) I(X2 > Y2), which completes the proof of the theorem.

For exp(µ1, λ1) and exp(µ2, λ2) populations with all parameters unknown, we finally obtain in what follows the UMVUE of θ² for n, m ≥ 2. Since the sets of order statistics {X(1), X(2), ..., X(n)} and {Y(1), Y(2), ..., Y(m)} based on the two random samples constitute a sufficient statistic (see Lehmann and Casella, 1998, p. 36), without loss of generality we can restrict to estimators based on the sets of order statistics or, equivalently, on {X(1), U2, ..., Un} and {Y(1), V2, ..., Vm}, where Ui = (n − i + 1)(X(i) − X(i−1)) and Vj = (m − j + 1)(Y(j) − Y(j−1)), 2 ≤ i ≤ n, 2 ≤ j ≤ m. Note that Ui, 2 ≤ i ≤ n, and Vj, 2 ≤ j ≤ m, are independently distributed following exp(λ1) and exp(λ2) distributions, respectively, independently of X(1) and Y(1) (see Johnson et al., 1994, Chapter 19, Section 7.3).

Let n, m ≥ 2 and let T be an unbiased estimator of θ² based on {X(1), U2, ..., Un} and {Y(1), V2, ..., Vm}. We then have

    θ² = E(T) = E[E(T | X(1), Y(1))] = E(T*)   for all −∞ < µ1, µ2 < ∞, 0 < λ1, λ2 < ∞,    (14)

where T* is defined by (12). As the distribution of (X(1), Y(1)) is complete, (14) implies that, for given x(1), y(1),

    E[T(x(1), y(1), U2, ..., Un, V2, ..., Vm)] = T*(d)   for all 0 < λ1, λ2 < ∞    (15)

where d = x(1) − y(1). Note that T*(d) can be expressed as

    T*(d) = [2(n − 2)/(nm)] θ_{2d} + [(n − 2)(m − 2)/(nm)] θd²,                    for d ≤ 0
          = 1 − [2(m − 1)/(nm)] {1 − G(d)} − [2(n − 1)(m − 1)/(nm)] {1 − θd}
              + [2(m − 2)/(nm)] (1 − θ_{2d}) + [(n − 2)(m − 2)/(nm)] (1 − θd)²,    for d > 0    (16)

where θd is defined in (3) and G(y) is the cumulative distribution function of the exp(λ2) distribution.
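Formula (12) can be verified numerically: X(1) and Y(1) are themselves shifted exponentials with scales λ1/n and λ2/m, so E[T*(D)] is easy to simulate. A sketch under our own naming (`T_star`), assuming λ1, λ2 known:

```python
import math
import random

def T_star(D, n, m, lam1, lam2):
    """UMVUE (12) of theta^2 for known lam1, lam2, with D = X_(1) - Y_(1)."""
    s = lam2 / lam1
    if D <= 0:
        return (n - 2) * (m + 2 * s) / (n * m * (1 + s) ** 2) * math.exp(2 * D / lam1)
    return (1
            - 2 * (m - 1) * (1 + n * s) / (n * m * (1 + s)) * math.exp(-D / lam2)
            + (m - 2) * s * (2 + n * s) / (n * m * (1 + s) ** 2) * math.exp(-2 * D / lam2))
```

With n = m = 2, λ1 = λ2 = 1 and µ1 − µ2 = δ > 0, averaging `T_star` over draws of D = X(1) − Y(1) should approach θ² = (1 − e^{−δ}/2)².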


Now, for n, m ≥ 3 and for given d ≤ 0, it follows, by arguments similar to those used to derive (7), that the UMVUE of θd² based on the complete sufficient statistics SU = Σ_{i=2}^n Ui = Σ_{i=1}^n (X(i) − X(1)) and SV = Σ_{j=2}^m Vj = Σ_{j=1}^m (Y(j) − Y(1)) is given by (see Lemma A.1 in the Appendix)

    E[I(U2 > V2 + V3 − 2d) | SU, SV] = P[U2 > V2 + V3 − 2d | SU, SV]
                                     = H0(n − 1, m − 1, −2d, SU, SV)    (17)

where

    H0(n, m, a, b, c) = 0,                                                     if b ≤ a
        = (m − 1) H01(n, m − 1, a, b, c) − (m − 2) H01(n, m, a, b, c),         if a < b < a + c
        = (m − 1) H02(n, m − 1, a, b, c) − (m − 2) H02(n, m, a, b, c),         if b ≥ a + c    (18)

with

    H01(n, m, a, b, c) = [(m − 1)/(b^{n−1} c^{m−1})] ∫_0^{b−a} (b − a − u)^{n−1} (c − u)^{m−2} du
        = (1 − a/b)^{n−1}
          − [(n − 1)/(b^{n−1} c^{m−1})] Σ_{j=0}^{m−1} [C(m−1, j)/(n − 1 + j)] (b − a)^{n−1+j} (a + c − b)^{m−1−j}    (19)

and

    H02(n, m, a, b, c) = [(m − 1)/(b^{n−1} c^{m−1})] ∫_0^c (b − a − u)^{n−1} (c − u)^{m−2} du
        = (1 − a/b)^{n−1} − [(n − 1) c/b^{n−1}] Σ_{j=0}^{n−2} [C(n−2, j)/(m + j)] (b − a − c)^{n−2−j} c^j.    (20)

Similarly, for n, m ≥ 3 and for given d > 0, the UMVUE of (1 − θd)² is H0(m − 1, n − 1, 2d, SV, SU). By the same arguments it follows that, for n, m ≥ 2 and for given d, the UMVUE of θd is

    E[I(U2 > V2 − d) | SU, SV] = H0*(n − 1, m − 1, −d, SU, SV),        if d ≤ 0
                               = 1 − H0*(m − 1, n − 1, d, SV, SU),     if d > 0    (21)

where

    H0*(n, m, a, b, c) = 0,                     if b ≤ a
                       = H01(n, m, a, b, c),    if a < b < a + c
                       = H02(n, m, a, b, c),    if b ≥ a + c    (22)


with H01(n, m, a, b, c) and H02(n, m, a, b, c) defined, respectively, in (19) and (20). Further, for m ≥ 2 and for given d > 0, the UMVUE of 1 − G(d) is

    E[I(V2 > d) | SV] = H0**(m − 1, d, SV)    (23)

where

    H0**(m, a, b) = 0,                   if b ≤ a
                  = (1 − a/b)^{m−1},     if b > a.    (24)

Thus, for n, m ≥ 2, it follows by (15) and (16) that the UMVUE of θ² for two parameter exponential populations with all parameters unknown is given by

    θ̂² = [2(n − 2)/(nm)] H0*(n − 1, m − 1, −2D, SU, SV)
          + [(n − 2)(m − 2)/(nm)] H0(n − 1, m − 1, −2D, SU, SV),        if D ≤ 0
        = 1 − [2(m − 1)/(nm)] H0**(m − 1, D, SV)
            − [2(n − 1)(m − 1)/(nm)] H0*(m − 1, n − 1, D, SV, SU)
            + [2(m − 2)/(nm)] H0*(m − 1, n − 1, 2D, SV, SU)
            + [(n − 2)(m − 2)/(nm)] H0(m − 1, n − 1, 2D, SV, SU),       if D > 0    (25)

since for the problem (X(1), Y(1), SU, SV) is a complete sufficient statistic.

Appendix

Lemma A.1. Let a, b, c ≥ 0. For a < b < a + c,

    [(m − 1)/(b^{n−1} c^{m−1})] ∫_0^{b−a} (b − a − u)^{n−1} (c − u)^{m−2} du
        = (1 − a/b)^{n−1}
          − [(n − 1)/(b^{n−1} c^{m−1})] Σ_{j=0}^{m−1} [C(m−1, j)/(n − 1 + j)] (b − a)^{n−1+j} (a + c − b)^{m−1−j}    (A.1)

and, for b ≥ a + c,

    [(m − 1)/(b^{n−1} c^{m−1})] ∫_0^c (b − a − u)^{n−1} (c − u)^{m−2} du
        = (1 − a/b)^{n−1} − [(n − 1) c/b^{n−1}] Σ_{j=0}^{n−2} [C(n−2, j)/(m + j)] (b − a − c)^{n−2−j} c^j.    (A.2)


Proof. The LHS of (A.1) is

    [(m − 1)/(b^{n−1} c^{m−1})] Σ_{j=0}^{m−2} [C(m−2, j)/(n + j)] (b − a)^{n+j} (a + c − b)^{m−2−j}
    = [(n − 1)/(b^{n−1} c^{m−1})] Σ_{j=0}^{m−2} C(m−1, j+1) [1/(n − 1) − 1/(n + j)] (b − a)^{n+j} (a + c − b)^{m−2−j}
    = [(n − 1)/(b^{n−1} c^{m−1})] Σ_{j=0}^{m−1} C(m−1, j) [1/(n − 1) − 1/(n + j − 1)] (b − a)^{n+j−1} (a + c − b)^{m−j−1}
    = RHS of (A.1).

Similarly, the LHS of (A.2) is

    [(m − 1)/b^{n−1}] Σ_{j=0}^{n−1} [C(n−1, j)/(m + j − 1)] (b − a − c)^{n−1−j} c^j
    = (1/b^{n−1}) Σ_{j=0}^{n−1} C(n−1, j) [1 − j/(m + j − 1)] (b − a − c)^{n−1−j} c^j
    = (1 − a/b)^{n−1} − [(n − 1) c/b^{n−1}] Σ_{j=1}^{n−1} [C(n−2, j−1)/(m + j − 1)] (b − a − c)^{n−1−j} c^{j−1}
    = RHS of (A.2).

This completes the proof of the lemma.
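The two identities of Lemma A.1 can also be spot-checked numerically by comparing each right-hand side with a midpoint-rule evaluation of the integral on the left. The helper names below are ours, not the paper's:

```python
import math

def lemma_lhs(n, m, a, b, c, upper, steps=100_000):
    """Midpoint rule for (m-1)/(b^(n-1) c^(m-1)) * int_0^upper (b-a-u)^(n-1)(c-u)^(m-2) du."""
    h = upper / steps
    total = sum((b - a - (k + 0.5) * h) ** (n - 1) * (c - (k + 0.5) * h) ** (m - 2)
                for k in range(steps))
    return (m - 1) * h * total / (b ** (n - 1) * c ** (m - 1))

def lemma_rhs_A1(n, m, a, b, c):
    """Right-hand side of (A.1); valid for a < b < a + c."""
    s = sum(math.comb(m - 1, j) / (n - 1 + j)
            * (b - a) ** (n - 1 + j) * (a + c - b) ** (m - 1 - j) for j in range(m))
    return (1 - a / b) ** (n - 1) - (n - 1) / (b ** (n - 1) * c ** (m - 1)) * s

def lemma_rhs_A2(n, m, a, b, c):
    """Right-hand side of (A.2); valid for b >= a + c."""
    s = sum(math.comb(n - 2, j) / (m + j)
            * (b - a - c) ** (n - 2 - j) * c ** j for j in range(n - 1))
    return (1 - a / b) ** (n - 1) - (n - 1) * c / b ** (n - 1) * s
```

For (A.1) the integration limit is b − a, and for (A.2) it is c, matching the two cases of the lemma.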

Acknowledgments The author is thankful to the referee for his useful comments and suggestions on an earlier draft of the paper.

REFERENCES

Beg, M. A. (1980) On the estimation of P(Y < X) for the two-parameter exponential distribution, Metrika, 27, 29-34.

Gross, A. J. and Clark, V. A. (1975) Survival Distributions: Reliability Applications in the Biomedical Sciences, John Wiley, New York.

Ivshin, V. V. (1996) Unbiased estimation of P(X < Y) and their variances in the case of uniform and two-parameter exponential distributions, J. Math. Sci., 81, 2790-2793.

Johnson, N. L., Kotz, S., and Balakrishnan, N. (1994) Continuous Univariate Distributions, Vol. 1, John Wiley, New York.

Kotz, S., Lumelskii, Y., and Pensky, M. (2003) The Stress-Strength Model and its Generalizations: Theory and Applications, World Scientific, New Jersey.

Lehmann, E. L. and Casella, G. (1998) Theory of Point Estimation, 2nd ed., Springer-Verlag, New York.

Pal, M., Ali, M. M., and Woo, J. (2005) Estimation and testing of P(Y > X) in two-parameter exponential distributions, Statistics, 39, 415-428.

Tong, H. (1974) A note on the estimation of Pr(Y < X) in the exponential case, Technometrics, 16, 625.

Tong, H. (1975) Errata, Technometrics, 17, 395.

S. SENGUPTA Department of Statistics University of Calcutta 35, Ballygunge Circular Road Kolkata 700019, India [email protected]