
Computational Statistics & Data Analysis 26 (1998) 461-484

Multiple comparison procedures with the average for exponential location parameters

Shu-Fei Wu a,*, Hubert J. Chen b

a Department of Statistics, Tamkang University, Taiwan, ROC
b Department of Statistics, University of Georgia, Athens, GA 30602, USA

Received 1 November 1996; accepted 1 June 1997

Abstract

This article investigates multiple comparison procedures with the average of exponential location parameters when sample sizes are equal. A subset selection approach and a simultaneous confidence interval approach with minimum expected length are considered for exponential distributions with a common known or unknown scale parameter. These procedures have broad applications in selecting a subset that includes all better-than-the-average treatments in experimental design, and in identifying all better-than-the-average, worse-than-the-average and not-much-different-from-the-average products in agriculture, business, manufacturing and other industries. Numerical approximations based on the Bonferroni inequality are proposed. Statistical tables for implementing these procedures in the equal-sample-size case are provided for use in practice. A simulation indicates that the Bonferroni approximation performs better than the confidence interval with equal-tail probabilities. Computer programs for calculating the percentage points and for the simulation are available from the authors. © 1998 Elsevier Science B.V. All rights reserved.

Keywords: Subset selection; Simultaneous confidence interval; Bonferroni inequality; Monte Carlo technique

1. Introduction

Mukhopadhyay and Hamdy (1984) proposed a two-stage procedure, based on the Bonferroni inequality, for selecting the best exponential population when the scale parameters are unknown and unequal. Lam and Ng (1990) and Ng et al. (1993) improved on their work using the results of Lam (1987, 1988). A general review of this area was given by Gupta and Panchapakesan (1979, 1985) and Hochberg and Tamhane (1987). In this article, we propose multiple comparison procedures with the average, by the subset selection and confidence interval approaches, for exponential location parameters with a common known or unknown scale parameter.

* Corresponding author. 0167-9473/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0167-9473(97)00044-3

The exponential distribution has many applications in the analysis of reliability and life test experiments. See, e.g., pp. 220-221 of Johnson and Kotz (1970), Bain (1978), Lawless and Singhal (1980) and Zelen (1966). In many experimental situations, the average performance is used as a benchmark against which the treatments are compared. Comparing the location parameters of exponential distributions with their average within the group is equivalent to comparing the means of the distributions with the average of the means: each mean equals its location parameter plus the common scale parameter, and the common scale parameter cancels when differences are taken.

The goal of the subset selection procedure is to select a subset containing all good populations, that is, those which are better (no worse, in practical interpretation) than the average. By better than the average, we mean that the location parameter of the ith population is greater than or equal to the average of all location parameters. This procedure allows experimenters to select, from among several populations, those which are better than the average when the average is used as a control. One can also use it to screen a large number of products for longer lifetimes, to screen drugs for longer hours of pain relief than the average in clinical trials, or to screen car models which are more reliable than the average for consumers' reference.
After this procedure is carried out, one can choose one or a few populations from the selected subset, depending on availability, price range, side effects and other considerations. As for the confidence interval approach, the goal is to classify treatments into three groups (better than the average, worse than the average, and not much different from the average), where the average is used as a control.
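The equivalence between comparing location parameters and comparing means, noted above, can be written out in one line. The symbols $\mu_i$ and $\bar\mu$ for the population means and their average are introduced here for illustration only; $\theta_i$ denotes the location parameter of the $i$th of $k$ populations, $\bar\theta$ the average of the location parameters, and $\sigma$ the common scale parameter.

```latex
\mu_i = \theta_i + \sigma, \qquad
\bar\mu = \frac{1}{k}\sum_{j=1}^{k}\mu_j = \bar\theta + \sigma
\quad\Longrightarrow\quad
\mu_i - \bar\mu = \theta_i - \bar\theta, \qquad i = 1,\ldots,k.
```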

2. Common known scale parameters

Let $\pi_1,\ldots,\pi_k$ be $k$ given populations, where observations from population $\pi_i$ follow an exponential distribution denoted by $E(\theta_i,\sigma)$ $(i=1,\ldots,k)$, where $\theta_1,\ldots,\theta_k$ are unknown location parameters (also known as threshold values) and $\sigma$ is a known scale parameter. In applications to the analysis of reliability and life test experiments, the threshold value $\theta_i$ represents the minimum life of the $i$th lifetime distribution. Let $X_{ij}$ $(j=1,\ldots,n)$ be an independent random sample of size $n$ from population $\pi_i$. Define the first order statistic of the $n$ observations from population $\pi_i$ as $X_i = \min_{1\le j\le n} X_{ij}$

[...]

$$X_i \ge \bar X - \frac{h\sigma}{2n}, \qquad i=1,\ldots,k, \tag{1}$$

where $\bar X = \sum_{i=1}^{k} X_i/k$, $\bar\theta = \sum_{i=1}^{k}\theta_i/k$ and $X_i - \bar X$ is an unbiased estimator of $\theta_i - \bar\theta$. A correct selection (CS) is said to be made if all better-than-the-average populations are included in the selected subset $S$. The value of $h > 0$ is determined by the probability requirement $\inf P(CS \mid R_1) = P^*$. Suppose that $X'_1,\ldots,X'_{k_1}$ are the $k_1$ minimum order statistics corresponding to those populations with location parameters $\theta'_1,\ldots,\theta'_{k_1}$ greater than or equal to $\bar\theta$. Then the probability of correct selection under selection procedure $R_1$ is given by

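$$
\begin{aligned}
P(CS \mid R_1) &= P\left(X'_i \ge \bar X - \frac{h\sigma}{2n},\ i=1,\ldots,k_1\right) \\
&= P\left(X'_i - \theta'_i \ge \bar X - \bar\theta - \frac{h\sigma}{2n} + \bar\theta - \theta'_i,\ i=1,\ldots,k_1\right) \\
&\ge P\left(Y'_i \ge \bar Y - \frac{h\sigma}{2n},\ i=1,\ldots,k_1\right),
\end{aligned}
$$

since $\bar\theta - \theta'_i \le 0$

[...]

$$= P\left(W_i^* - \frac{1}{k}\sum_{j=1}^{k} W_j^* \ge -h,\ i=1,\ldots,k\right), \tag{5}$$

where $Y_i^* = X_i - \theta_i$, $W_i^* = 2nY_i^*/\sigma$ has a chi-squared distribution with 2 df and $\sum_{j\ne i} W_j^*$ has a chi-squared distribution with $\nu = 2(k-1)$ df. Therefore, we have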

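$$P(CS \mid R_1) \ge P\left(W_i^* \ge \frac{1}{k-1}\sum_{j\ne i} W_j^* - \frac{kh}{k-1},\ i=1,\ldots,k\right) \ge 1 - \sum_{i=1}^{k} P(A_i), \tag{6}$$

where $P(A_i) = P\left(W_i^* \le \frac{1}{k-1}\sum_{j\ne i} W_j^* - \frac{kh}{k-1}\right)$. The last inequality holds by use of the l.h.s. of Lemma 1. Furthermore, we have

$$P(A_i) = \int_0^\infty F_2\left(\frac{1}{k-1}\sum_{j\ne i} W_j^* - \frac{kh}{k-1}\ \Bigg|\ \sum_{j\ne i} W_j^* = t\right) f_\nu(t)\,dt, \tag{7}$$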

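where $f_\nu(\cdot)$ is the probability density function (p.d.f.) of a chi-squared distribution with $\nu$ df and $F_2(\cdot)$ is the cumulative distribution function (c.d.f.) of a chi-squared distribution with 2 df. The last equation can be expressed as

$$
\begin{aligned}
P(A_i) &= \int_{kh}^{\infty}\left(1 - \exp\left(-\frac{t-kh}{2(k-1)}\right)\right)\frac{t^{k-2}e^{-t/2}}{\Gamma(k-1)\,2^{k-1}}\,dt \\
&= 1 - F_\nu(kh) - e^{kh/\nu}\int_{kh}^{\infty}\frac{t^{k-2}e^{-kt/(2(k-1))}}{\Gamma(k-1)\,2^{k-1}}\,dt,
\end{aligned}
\tag{8}
$$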

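where $F_\nu(\cdot)$ is the c.d.f. of a chi-squared distribution with $\nu$ df. After the change of variable $w = kt/(k-1)$ and integration, we have

$$P(A_i) = 1 - F_\nu(kh) - e^{kh/\nu}\left(\frac{k-1}{k}\right)^{k-1}\bigl(1 - F_\nu(k')\bigr), \tag{9}$$

where $k' = k^2h/(k-1)$ is the lower limit of integration after the change of variable.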
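As a sanity check on (9), the closed form can be compared with a direct simulation of the event $A_i$. The sketch below is illustrative only: the function names are hypothetical, it assumes $\nu = 2(k-1)$ (the df of a sum of $k-1$ independent chi-squared variables with 2 df), and it uses the finite-sum expression for the chi-squared upper tail that holds when the df is even. It is not the authors' software.

```python
import math
import random


def chi2_sf_even(x, df):
    """Upper tail 1 - F_df(x) of a chi-squared variable with even df.

    For even df = 2m the tail equals exp(-x/2) * sum_{j<m} (x/2)^j / j!.
    """
    if x <= 0.0:
        return 1.0
    term = total = 1.0
    for j in range(1, df // 2):
        term *= (x / 2.0) / j
        total += term
    return math.exp(-x / 2.0) * total


def prob_a_closed_form(k, h):
    """P(A_i) as in (9), assuming nu = 2(k-1) and k' = k^2 h / (k-1)."""
    nu = 2 * (k - 1)
    k_prime = k * k * h / (k - 1)
    ratio = ((k - 1) / k) ** (k - 1)
    return chi2_sf_even(k * h, nu) - math.exp(k * h / nu) * ratio * chi2_sf_even(k_prime, nu)


def prob_a_monte_carlo(k, h, reps=200_000, seed=1):
    """Simulate P(W_1 <= sum_{j != 1} W_j / (k-1) - k h / (k-1))."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # A chi-squared(2) variable is exponential with mean 2 (rate 1/2).
        w = [rng.expovariate(0.5) for _ in range(k)]
        if w[0] <= (sum(w[1:]) - k * h) / (k - 1):
            hits += 1
    return hits / reps
```

At $h = 0$ the closed form reduces to $1 - ((k-1)/k)^{k-1}$, which gives a quick analytic check; for $h > 0$ the two functions should agree up to simulation error.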

From (6) and (9), we have

$$P(CS \mid R_1) \ge 1 - \sum_{i=1}^{k} P(A_i) = 1 - k + kF_\nu(kh) + \frac{(k-1)^{k-1}}{k^{k-2}}\,e^{kh/\nu}\bigl(1 - F_\nu(k')\bigr). \tag{10}$$
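The right-hand side of (10) involves only chi-squared distribution functions with even df, so it can be evaluated with elementary functions and inverted numerically for a target level $P^*$. The following is a minimal sketch under stated assumptions: $\nu = 2(k-1)$, $k' = k^2h/(k-1)$, hypothetical function names, and plain bisection used as a simple stand-in for the Newton-Raphson iteration. It is not the authors' program, and it makes no claim to reproduce the tabulated values.

```python
import math


def chi2_sf_even(x, df):
    """Upper tail 1 - F_df(x) of a chi-squared variable with even df."""
    if x <= 0.0:
        return 1.0
    term = total = 1.0
    for j in range(1, df // 2):
        term *= (x / 2.0) / j
        total += term
    return math.exp(-x / 2.0) * total


def bonferroni_lower_bound(k, h):
    """Right-hand side of (10), assuming nu = 2(k-1) and k' = k^2 h / (k-1)."""
    nu = 2 * (k - 1)
    k_prime = k * k * h / (k - 1)
    # (k-1)^(k-1) / k^(k-2), written so that large k does not overflow.
    coeff = k * ((k - 1) / k) ** (k - 1)
    return (1.0 - k * chi2_sf_even(k * h, nu)
            + coeff * math.exp(k * h / nu) * chi2_sf_even(k_prime, nu))


def solve_h(k, p_star, tol=1e-10):
    """Find h > 0 with bonferroni_lower_bound(k, h) = p_star by bisection."""
    if k < 2 or not (1.0 / k < p_star < 1.0):
        raise ValueError("need k >= 2 and 1/k < p_star < 1")
    lo, hi = 0.0, 1.0
    # The bound increases with h, so expand hi until it brackets p_star.
    while bonferroni_lower_bound(k, hi) < p_star:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bonferroni_lower_bound(k, mid) < p_star:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `solve_h(10, 0.95)` returns the constant for ten populations at level 0.95 under these assumptions. Bisection is chosen here because the bound is monotone in $h$ and no derivative is needed.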

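For a given constant $h$, applying selection rule (1) to data takes only a few lines. The sketch below uses a hypothetical function name and made-up observations; it assumes equal sample sizes and a known scale parameter, and it returns 0-based indices of the selected populations.

```python
def select_subset(samples, sigma, h):
    """Apply rule (1): keep population i when X_i >= X_bar - h*sigma/(2n).

    samples: list of k equal-length lists of observations (one per population)
    sigma:   known common scale parameter
    h:       selection constant (supplied by the user)
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("rule (1) assumes equal sample sizes")
    x = [min(s) for s in samples]        # first order statistic per population
    x_bar = sum(x) / len(x)              # average of the k minima
    cutoff = x_bar - h * sigma / (2 * n)
    return [i for i, xi in enumerate(x) if xi >= cutoff]


# Made-up data for three populations with n = 3 observations each.
demo = [[5.1, 6.0, 5.4], [2.0, 2.5, 3.1], [4.0, 4.2, 4.9]]
selected = select_subset(demo, sigma=1.0, h=1.0)
```

Here the minima are 5.1, 2.0 and 4.0, their average is 3.7, and the cutoff is 3.7 - 1/6, so the first and third populations are retained.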

To satisfy the probability requirement, the value of $h$ is obtained by setting the r.h.s. of (10) equal to $P^*$, where $1/k < P^* < 1$. This completes the proof of Theorem 2.

The resulting nonlinear equation can be solved by the quadratically convergent Newton-Raphson method for given values of $k$ and $P^*$. The values of $h$ are given in the last row of Tables 1-3 for $P^* = 0.90, 0.95, 0.975$ and $k = 3(1)10(2)20(10)100$. Note that when $k \gg k_1$, the lower bound may be very conservative, and the simulation confirms this. Note also that the subset selection rule (1) is equivalent to the set of upper one-sided confidence intervals $\left(-\infty,\ X_i - \bar X + h\sigma/(2n)\right)$ for $\theta_i - \bar\theta$, $i = 1,\ldots,k$.

2.2. Simultaneous confidence interval

Theorem 3. A set of $(1-\alpha)100\%$ simultaneous confidence intervals (SCI) for all the differences between the threshold values and the average, $\theta_i - \bar\theta$, $i = 1,\ldots,k$, with minimum expected length is given by $\left(X_i - \bar X + c_1\sigma/(2n),\ X_i - \bar X + c_2\sigma/(2n)\right)$, $i = 1,\ldots,k$, where $c_1 < 0$ and $c_2 > 0$ can be approximated by solving the following nonlinear system of equations:

$$g_1(c_1,c_2) = 1 - k - k\left[\left(\frac{k-1}{k}\right)^{k-1}\left\{e^{kc_1/\nu} - e^{kc_2/\nu}\bigl(1 - F_\nu(k''')\bigr)\right\} - F_\nu(kc_2)\right] - (1-\alpha) = 0, \tag{11}$$

$$g_2(c_1,c_2) = f_{22}(c_1,c_2) - f_{11}(c_1,c_2) = 0, \tag{12}$$

where

$$f_{11}(c_1,c_2) = \frac{k^2}{\nu}\,e^{kc_1/\nu}\left(\frac{k-1}{k}\right)^{k-1},$$

$$f_{22}(c_1,c_2) = k^2 f_\nu(kc_2) + \frac{k^2}{\nu}\,e^{kc_2/\nu}\left(\frac{k-1}{k}\right)^{k-1}\left\{1 - F_\nu(k''') - 2kf_\nu(k''')\right\},$$

where $k''' = k^2c_2/(k-1)$.

Proof.

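$$P(\mathrm{SCI}) = P\left(-\frac{c_2\sigma}{2n} \le Y_i - \frac{1}{k}\sum_{j=1}^{k} Y_j \le -\frac{c_1\sigma}{2n},\ i=1,\ldots,k\right), \qquad Y_i = X_i - \theta_i,$$

[...]

$$\cdots\ P\left(W_i^* \ge \frac{1}{k-1}\sum_{j\ne i} W_j^* - \frac{kh^*S_p}{(k-1)\sigma},\ i=1,\ldots,k\right),$$

where $S = \nu'S_p/\sigma$ has a chi-squared distribution with $\nu'$ df. We have

$$P(CS \mid R_2) \ge P\left(W_i^* \ge \frac{1}{k-1}\sum_{j\ne i} W_j^* - \frac{kh^*S}{(k-1)\nu'},\ i=1,\ldots,k\right) \ge 1 - \sum_{i=1}^{k} P(A_i), \tag{17}$$

where

$$P(A_i) = P\left(W_i^* \le \frac{1}{k-1}\sum_{j\ne i} W_j^* - \cdots\right)$$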
