Journal of Statistical Computation and Simulation Confidence interval ...


Journal of Statistical Computation and Simulation Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/gscs20

Confidence interval estimation of the location and scale parameters of the logistic distribution using pivotal method

H. A. Muttlak (a), W. A. Abu-Dayyeh (b), E. Al-Sawi (a) & M. Al-Momani (c)

(a) Department of Mathematics & Statistics, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia
(b) Department of Mathematics & Statistics, Sultan Qaboos University, Muscat, Oman
(c) Department of Mathematics & Statistics, Windsor University, Ontario, Canada

Published online: 15 Apr 2010.

To cite this article: H. A. Muttlak , W. A. Abu-Dayyeh , E. Al-Sawi & M. Al-Momani (2011) Confidence interval estimation of the location and scale parameters of the logistic distribution using pivotal method, Journal of Statistical Computation and Simulation, 81:4, 391-409, DOI: 10.1080/00949650903379572 To link to this article: http://dx.doi.org/10.1080/00949650903379572


Journal of Statistical Computation and Simulation Vol. 81, No. 4, April 2011, 391–409

Confidence interval estimation of the location and scale parameters of the logistic distribution using pivotal method

H.A. Muttlak (a)*, W.A. Abu-Dayyeh (b), E. Al-Sawi (a) and M. Al-Momani (c)

(a) Department of Mathematics & Statistics, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia; (b) Department of Mathematics & Statistics, Sultan Qaboos University, Muscat, Oman; (c) Department of Mathematics & Statistics, Windsor University, Ontario, Canada

(Received 27 May 2009; final version received 30 September 2009)

Different confidence intervals (CI) will be constructed for the location and scale parameters of the logistic distribution by assuming that one of them is known. The maximum likelihood estimator and several different pivots will be used to construct different CI for the logistic parameters, using simple random sampling (SRS) and ranked set sampling (RSS). They will be compared via their expected lengths and the standard errors of their lengths using computer simulation. The CIs based on the RSS are found to be more efficient, i.e. having shorter expected lengths and smaller standard errors than their competitors based on the SRS.

Keywords: efficiency; expected length; maximum likelihood estimator; pivotal method; ranked set sampling; simple random sampling

1. Introduction

The logistic distribution has been one of the most important statistical distributions because of its simplicity and its historical importance as a growth and lifetime model. It has been used in the analysis of quantal response data, probit analysis and dose-response studies. For more applications of the logistic distribution see [1]. Several authors have considered inference for the parameters of the logistic distribution. Lam et al. [2] considered the estimation of the location and scale parameters of a logistic distribution using ranked set sampling (RSS). Abu-Dayyeh et al. [3] proposed several tests for the location parameter of the logistic distribution using simple random sampling (SRS) and RSS. Abu-Dayyeh et al. [4] studied the exact Bahadur slope for combining independent tests for normal and logistic distributions using RSS data. Abu-Dayyeh et al. [5] discussed several estimators for the parameters of the logistic distribution using SRS and RSS. Wu and Lu [6] constructed different prediction intervals for an ordered observation from the logistic distribution based on censored samples.

*Corresponding author. Email: [email protected]

ISSN 0094-9655 print/ISSN 1563-5163 online © 2011 Taylor & Francis DOI: 10.1080/00949650903379572 http://www.informaworld.com



McIntyre [7] was the first to suggest using RSS. He applied the technique in assessing the yields of pasture plots without actually carrying out the time-consuming process of mowing and weighing the hay for a large number of plots. Takahasi and Wakimoto [8] and Dell and Clutter [9] studied theoretical aspects of this technique on the assumption of perfect judgement ranking and imperfect judgement ranking, respectively. Several aspects of the logistic distribution have been considered by [10–15].

The RSS method can be summarized as follows. Select m random sets of size m units and rank the units within each set with respect to a variable of interest by visual inspection or by any cheap method. Then select for actual measurement the smallest unit from the first set. From the second set, select for actual measurement the second smallest unit. The procedure is continued until the largest unit from the mth set is selected for measurement. In this way, we obtain a total of m measured units, one from each set. The cycle may be repeated r times until mr units have been measured. These mr units form the RSS data.

Let $X_{(i:m)j}$, $i = 1, \ldots, m$ and $j = 1, \ldots, r$, denote the ith order statistic from the ith set of size m units in the jth cycle. These are the RSS data, with sample size n = mr. The sample mean for the RSS data may be defined by

$$\bar{X}_{rss} = \frac{1}{mr} \sum_{j=1}^{r} \sum_{i=1}^{m} X_{(i:m)j}. \tag{1}$$

The variance of $\bar{X}_{rss}$ is given by

$$\sigma^2_{\bar{X}_{rss}} = \operatorname{var}(\bar{X}_{rss}) = \frac{1}{m^2 r} \sum_{i=1}^{m} \sigma^2_{(i:m)}, \tag{2}$$

where $\sigma^2_{(i:m)} = E[X_{(i:m)} - E(X_{(i:m)})]^2$; see Takahasi and Wakimoto [8] for more details.

To compare the different methods used in this study to construct confidence intervals (CIs) for θ and λ, we define the efficiency of using RSS data instead of SRS data in terms of the average length of the intervals and the standard error of that length. The first efficiency depends on the expected length of the CI and is defined as

$$\operatorname{eff}(L_{rss}, L_{srs}) = \frac{E(L_{srs})}{E(L_{rss})}, \tag{3}$$

where $L_{srs}$ and $L_{rss}$ represent the length of the CI using SRS and RSS data, respectively. The second efficiency depends on the standard error of the length of the intervals and is defined as

$$\operatorname{eff}(s_{rss}, s_{srs}) = \frac{s(L_{srs})}{s(L_{rss})}. \tag{4}$$
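The RSS procedure and the sample mean of Equation (1) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: it assumes perfect ranking, draws logistic values by inverting the cdf, and the names `logistic_draw`, `ranked_set_sample` and `simple_random_sample` are hypothetical helpers introduced here.

```python
import math
import random
from statistics import mean

def logistic_draw(rng, theta=0.0, lam=1.0):
    """Draw one logistic(theta, lam) value by inverting the cdf."""
    u = rng.random()
    while u == 0.0:                      # avoid log(0)
        u = rng.random()
    return theta + lam * math.log(u / (1.0 - u))

def ranked_set_sample(rng, m, r, theta=0.0, lam=1.0):
    """Return the m*r measured units X_(i:m)j, assuming perfect ranking."""
    sample = []
    for _ in range(r):                   # cycles j = 1..r
        for i in range(m):               # sets i = 1..m within a cycle
            ranked = sorted(logistic_draw(rng, theta, lam) for _ in range(m))
            sample.append(ranked[i])     # measure only the i-th smallest unit
    return sample

def simple_random_sample(rng, n, theta=0.0, lam=1.0):
    return [logistic_draw(rng, theta, lam) for _ in range(n)]

rng = random.Random(2011)                # arbitrary seed, for reproducibility
m, r = 3, 3
rss = ranked_set_sample(rng, m, r)
srs = simple_random_sample(rng, m * r)
print(len(rss), mean(rss), mean(srs))    # mean(rss) is Equation (1)
```

Each measured RSS unit costs m ranked units, so the sketch draws m*m*r values to obtain n = mr measurements, which mirrors the description above.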

The overall aim of this study is to develop several CIs for the parameters of the logistic distribution using SRS and RSS. Specifically, we construct CIs for the location parameter θ when the scale parameter λ is known and when the coefficient of variation of the random variable X is known; in the latter case we consider θ = λ. We also construct CIs for λ when θ is known, which can be assumed without loss of generality (wlog) to be 0. In each case the CIs are built from the maximum likelihood estimator (MLE) and from five different pivots, using SRS and RSS data. Finally, we compare these CIs by computer simulation. We use 95% CIs in all the cases considered in this study.

2. CI for θ when λ is known

2.1. CI estimate for θ using the MLE

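Case 1 The MLE for θ when λ is known using SRS: Assuming that λ = 1, the pdf of the logistic distribution is

$$f(x_i, \theta \mid \lambda = 1) = \frac{e^{-(x_i-\theta)}}{(1 + e^{-(x_i-\theta)})^2}.$$

The likelihood function is given by

$$L(\theta, x) = \prod_{i=1}^{n} f(x_i, \theta) = \prod_{i=1}^{n} \frac{e^{-(x_i-\theta)}}{(1 + e^{-(x_i-\theta)})^2} = \frac{e^{-\sum_{i=1}^{n}(x_i-\theta)}}{\prod_{i=1}^{n}(1 + e^{-(x_i-\theta)})^2}.$$

The log-likelihood function is

$$l(\theta, x) = \ln[L(\theta, x)] = \ln\left[\frac{e^{-\sum_{i=1}^{n}(x_i-\theta)}}{\prod_{i=1}^{n}(1 + e^{-(x_i-\theta)})^2}\right] = -\sum_{i=1}^{n}(x_i-\theta) - 2\sum_{i=1}^{n}\ln(1 + e^{-(x_i-\theta)}).$$

To find the MLE for θ, take the derivative of l(θ, x) with respect to θ, set it equal to zero and solve for θ:

$$\frac{\partial l(\theta, x)}{\partial \theta} = n - 2\sum_{i=1}^{n}\frac{e^{-(x_i-\theta)}}{1 + e^{-(x_i-\theta)}} = 0,$$

which gives

$$\sum_{i=1}^{n}\frac{e^{-(x_i-\theta)}}{1 + e^{-(x_i-\theta)}} = \frac{n}{2}. \tag{5}$$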

The MLE of θ, denoted by $\hat{\theta}_{mle1}$, cannot be obtained in closed form, so we use a numerical method to find it. The CI for θ using $\hat{\theta}_{mle1}$ based on SRS data can be found using the following pivot:

$$T_{mle1}(X) = \hat{\theta}_{mle1} - \theta,$$

whose distribution is free of θ. The following algorithm is used to find a and b such that $P(a \le T_{mle1}(X) \le b) = 1 - \alpha$, and a point estimate for θ:

(1) Simulate a SRS of size n = mr from the logistic L(θ = 0, λ = 1).
(2) Substitute these values in Equation (5) and solve for θ to get one value, call it $t_i$.
(3) Repeat steps 1 and 2 100,000 times to get $t_1, t_2, \ldots, t_{100,000}$; the point estimate for θ is $\hat{\theta}_{mle1}$, which is the average of $t_1, t_2, \ldots, t_{100,000}$.
(4) Obtain the (α/2)th percentile and the (1 − α/2)th percentile of $t_1, t_2, \ldots, t_{100,000}$ and let $a_{mle1}$ = (α/2)th percentile and $b_{mle1}$ = (1 − α/2)th percentile.
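Steps (1) to (4) can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' program: Equation (5) is solved by bisection (the paper does not name its numerical method), only 2,000 replications are used instead of 100,000 to keep the run short, the percentile rule is a simple nearest-rank rule, and all function names are hypothetical.

```python
import math
import random

def logistic_draw(rng, theta=0.0, lam=1.0):
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return theta + lam * math.log(u / (1.0 - u))

def score_eq5(theta, xs):
    """Left side of Equation (5) minus n/2, using
    e^{-(x-theta)} / (1 + e^{-(x-theta)}) = 1 / (1 + e^{x-theta}).
    The value increases with theta."""
    return sum(1.0 / (1.0 + math.exp(x - theta)) for x in xs) - len(xs) / 2.0

def mle_location(xs):
    """Solve Equation (5) by bisection (assumed method)."""
    lo, hi = min(xs), max(xs)   # score <= 0 at min(xs) and >= 0 at max(xs)
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if score_eq5(mid, xs) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def percentile(sorted_vals, p):
    """Nearest-rank empirical percentile (an assumed convention)."""
    k = min(len(sorted_vals) - 1, max(0, math.ceil(p * len(sorted_vals)) - 1))
    return sorted_vals[k]

def calibrate(rng, n, reps, alpha):
    """Steps (1)-(4): percentiles of the simulated values t_i under L(0, 1)."""
    ts = sorted(mle_location([logistic_draw(rng) for _ in range(n)])
                for _ in range(reps))
    return percentile(ts, alpha / 2), percentile(ts, 1 - alpha / 2)

rng = random.Random(5)
n, alpha = 9, 0.05
a, b = calibrate(rng, n, reps=2000, alpha=alpha)

# Hypothetical observed sample with true theta = 2, for illustration only.
xs = [logistic_draw(rng, theta=2.0) for _ in range(n)]
theta_hat = mle_location(xs)
print((theta_hat - b, theta_hat - a))    # interval of constant length b - a
```

Bisection is chosen here because the left side of Equation (5) is monotone in θ, so a bracketing root finder cannot miss the solution.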


To find a (1 − α)100% CI for θ, consider

$$\begin{aligned}
1 - \alpha &= P_\theta(a_{mle1} \le T_{mle1}(X) \le b_{mle1}) \\
&= P_\theta(a_{mle1} \le \hat{\theta}_{mle1} - \theta \le b_{mle1}) \\
&= P_\theta(-b_{mle1} \le \theta - \hat{\theta}_{mle1} \le -a_{mle1}) \\
&= P_\theta(\hat{\theta}_{mle1} - b_{mle1} \le \theta \le \hat{\theta}_{mle1} - a_{mle1}).
\end{aligned}$$

So a (1 − α)100% CI for θ is given by

$$(\hat{\theta}_{mle1} - b_{mle1},\ \hat{\theta}_{mle1} - a_{mle1}).$$

The length of the interval is therefore $b_{mle1} - a_{mle1}$. This length is constant, and hence its standard error is zero. To find the average length, we repeat steps 1–4 150,000 times; in each repetition we find the length of the interval $b_{mle1} - a_{mle1}$, and we then average these lengths to get an estimate for the length of the interval using SRS.

Case 2 The MLE for θ when λ is known using RSS: The probability density function of the ith order statistic in the jth cycle is

$$f_{(i:m)j}(x_{(i:m)j}) = \frac{m!}{(m-i)!\,(i-1)!}\, f(x_{(i:m)j})\,[F(x_{(i:m)j})]^{i-1}\,[1 - F(x_{(i:m)j})]^{m-i}.$$

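The likelihood function for RSS is

$$\begin{aligned}
L(\theta, x_{(i:m)j}) &= \prod_{j=1}^{r}\prod_{i=1}^{m} f_{(i:m)j}(x_{(i:m)j}) \\
&= \prod_{j=1}^{r}\prod_{i=1}^{m} \frac{m!}{(m-i)!\,(i-1)!}\, f(x_{(i:m)j})\,[F(x_{(i:m)j})]^{i-1}\,[1 - F(x_{(i:m)j})]^{m-i} \\
&= \prod_{j=1}^{r}\prod_{i=1}^{m} \frac{m!}{(m-i)!\,(i-1)!}\, \frac{e^{-(x_{(i:m)j}-\theta)}}{(1 + e^{-(x_{(i:m)j}-\theta)})^2} \left[\frac{1}{1 + e^{-(x_{(i:m)j}-\theta)}}\right]^{i-1} \left[\frac{1}{1 + e^{(x_{(i:m)j}-\theta)}}\right]^{m-i}.
\end{aligned}$$

The log-likelihood function is given by

$$\begin{aligned}
l(\theta, x_{(i:m)j}) = \ln[L(\theta, x_{(i:m)j})] &= \sum_{j=1}^{r}\sum_{i=1}^{m} \ln k + \sum_{j=1}^{r}\sum_{i=1}^{m} \left[-x_{(i:m)j} + \theta - 2\ln(1 + e^{-(x_{(i:m)j}-\theta)})\right] \\
&\quad - \sum_{j=1}^{r}\sum_{i=1}^{m} (i-1)\ln(1 + e^{-(x_{(i:m)j}-\theta)}) - \sum_{j=1}^{r}\sum_{i=1}^{m} (m-i)\ln(1 + e^{(x_{(i:m)j}-\theta)}),
\end{aligned}$$

where $k = m!/[(m-i)!\,(i-1)!]$.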


To find the MLE for θ, take the derivative of l(θ, x) with respect to θ, set it equal to zero and solve for θ:

$$\frac{\partial l(\theta, x_{(i:m)j})}{\partial \theta} = n - \sum_{j=1}^{r}\sum_{i=1}^{m} \frac{(i+1)\,e^{-(x_{(i:m)j}-\theta)} - (m-i)}{1 + e^{-(x_{(i:m)j}-\theta)}} = 0,$$

which simplifies to

$$\sum_{j=1}^{r}\sum_{i=1}^{m} \frac{(i+1)\,e^{-(x_{(i:m)j}-\theta)} - (m-i)}{1 + e^{-(x_{(i:m)j}-\theta)}} = n. \tag{6}$$
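Equation (6) can be solved numerically in the same spirit as Equation (5). The sketch below is an assumption-laden illustration: it stores each RSS measurement together with its rank i, uses bisection (a method chosen here, not stated in the paper), and relies on the fact that each summand of Equation (6) increases with θ. The helper names are hypothetical.

```python
import math
import random

def logistic_draw(rng, theta=0.0, lam=1.0):
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return theta + lam * math.log(u / (1.0 - u))

def ranked_set_sample(rng, m, r, theta=0.0):
    """Return (rank i, value) pairs, i = 1..m, for r cycles (perfect ranking)."""
    data = []
    for _ in range(r):
        for i in range(1, m + 1):
            ranked = sorted(logistic_draw(rng, theta) for _ in range(m))
            data.append((i, ranked[i - 1]))
    return data

def score_eq6(theta, data, m):
    """Left side of Equation (6) minus n; increasing in theta."""
    total = 0.0
    for i, x in data:
        w = math.exp(-(x - theta))
        total += ((i + 1) * w - (m - i)) / (1.0 + w)
    return total - len(data)

def mle_location_rss(data, m):
    """Solve Equation (6) by bisection after bracketing the root."""
    xs = [x for _, x in data]
    lo, hi = min(xs) - 1.0, max(xs) + 1.0
    while score_eq6(lo, data, m) > 0.0:   # widen the bracket if needed
        lo -= 1.0
    while score_eq6(hi, data, m) < 0.0:
        hi += 1.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if score_eq6(mid, data, m) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(6)
m, r = 3, 3
data = ranked_set_sample(rng, m, r, theta=0.0)
theta_hat_rss = mle_location_rss(data, m)
print(theta_hat_rss)
```

With m = 1 every rank equals 1 and Equation (6) reduces to Equation (5), which gives a quick consistency check on the implementation.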


The MLE of θ using RSS, denoted by $\hat{\theta}_{mlerss}$, cannot be obtained in closed form and is found numerically using the algorithm that we used in Section 2.2.1, with RSS data and Equation (6) in place of Equation (5).

The CI for θ using $\hat{\theta}_{mlerss}$ based on RSS data can be found using the following pivot:

$$T_{mlerss1}(X) = \hat{\theta}_{mlerss} - \theta.$$

The following algorithm is used to find a point estimate for θ and a and b such that $P(a \le T_{mlerss1}(X) \le b) = 1 - \alpha$:

(1) Simulate a RSS of size n = mr from the logistic L(θ = 0, λ = 1).
(2) Substitute these values in Equation (6) and solve for θ to get one value, call it $r_i$.
(3) Repeat steps 1 and 2 100,000 times to get $r_1, r_2, \ldots, r_{100,000}$; the point estimate for θ is $\hat{\theta}_{mlerss}$, which is the average of $r_1, r_2, \ldots, r_{100,000}$.
(4) Obtain the (α/2)th percentile and the (1 − α/2)th percentile of $r_1, r_2, \ldots, r_{100,000}$ and let $a_{mlerss1}$ = (α/2)th percentile and $b_{mlerss1}$ = (1 − α/2)th percentile.

To find a (1 − α)100% CI for θ using RSS, consider

$$\begin{aligned}
1 - \alpha &= P_\theta(a_{mlerss1} \le T_{mlerss1}(X) \le b_{mlerss1}) \\
&= P_\theta(a_{mlerss1} \le \hat{\theta}_{mlerss} - \theta \le b_{mlerss1}) \\
&= P_\theta(-b_{mlerss1} \le \theta - \hat{\theta}_{mlerss} \le -a_{mlerss1}) \\
&= P_\theta(\hat{\theta}_{mlerss} - b_{mlerss1} \le \theta \le \hat{\theta}_{mlerss} - a_{mlerss1}).
\end{aligned}$$

So a (1 − α)100% CI for θ is given by

$$(\hat{\theta}_{mlerss} - b_{mlerss1},\ \hat{\theta}_{mlerss} - a_{mlerss1}).$$

The length of the CI is therefore $b_{mlerss1} - a_{mlerss1}$, which is a constant, so the standard error is zero. To find the average length, we repeat steps 1–4 150,000 times; in each repetition we find the length of the interval $b_{mlerss1} - a_{mlerss1}$, and we then average these lengths to get an estimate for the length of the interval using RSS.

The computer simulations were run for r = 3, 4, 6 and for different values of m, namely m = 3, 4, 5. The efficiencies based on the length of the interval are reported in Table 1. Using RSS improves the efficiency: the gain in expected length ranges from 41% to 74% depending on the set size, but increasing the number of replications does not increase the efficiencies.

Note In this study, the computer simulations were run for r = 3 and m = 3, 4 and 5, for all the cases that we considered.


Table 1. The efficiencies for the average length and standard error using different pivots to construct CI estimate for the location parameter θ when λ is known, using RSS and SRS data for different values of the set size m.

           m = 3                             m = 4                             m = 5
Pivot      eff(Lrss, Lsrs)  eff(srss, ssrs)  eff(Lrss, Lsrs)  eff(srss, ssrs)  eff(Lrss, Lsrs)  eff(srss, ssrs)
T1m(X)     1.4117           -                1.5854           -                1.7398           -
T11(X)     1.3597           -                1.4856           -                1.6106           -
T12(X)     1.2716           -                1.4098           -                1.4465           -
T13(X)     1.0002           1.1461           1.0048           1.1888           1.0658           1.3418
T14(X)     1.3497           -                1.4852           -                1.5884           -
T15(X)     1.4093           1.4354           1.5974           1.7686           1.7509           2.0638


2.2. CI estimate for θ using five pivots

Five different pivots will be used to construct CIs for θ when λ is known. These pivots are

$$T_{11}(x) = \bar{x} - \theta,$$

where $\bar{x}$ is the sample mean,

$$T_{12}(x) = \tilde{x} - \theta,$$

where $\tilde{x}$ is the sample median,

$$T_{13}(x) = \sum_{i=1}^{n}(X_i - \theta)^2,$$

$$T_{14}(x) = -\sum_{i=1}^{n}\ln\left[\frac{1 - F(x_i - \theta)}{F(x_i - \theta)}\right]$$

and

$$T_{15}(x) = \sum_{i=1}^{n}(1 - F(x_i - \theta)).$$

2.2.1. CI estimate for θ using the pivots T11(x) and T12(x)

We need to find a and b such that $P(a \le T_{11}(x) \le b) = P(a \le \bar{x} - \theta \le b) = 1 - \alpha$. Since we do not know the distribution of $T_{11}(x)$, we use the following algorithm to find a and b for the SRS data:

(1) Select a SRS of size n = mr observations from the logistic distribution L(1, 1).
(2) Let $t_i = \bar{x} - 1$; we are using θ = 1 in this simulation.
(3) Repeat steps 1 and 2 100,000 times to get $t_1, t_2, \ldots, t_{100,000}$.
(4) Obtain the (α/2)th percentile and the (1 − α/2)th percentile of $t_1, t_2, \ldots, t_{100,000}$ and let $a_{srs}$ = (α/2)th percentile and $b_{srs}$ = (1 − α/2)th percentile.
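The calibration of a and b for the pivot $T_{11}(x)$ can be sketched as below, for both SRS and RSS, together with the resulting interval $(\bar{x} - b, \bar{x} - a)$ and the length efficiency of Equation (3). The sketch is illustrative only: it assumes perfect ranking, uses 5,000 replications rather than 100,000, a nearest-rank percentile rule, and hypothetical helper names.

```python
import math
import random
from statistics import mean

def logistic_draw(rng, theta=1.0, lam=1.0):
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return theta + lam * math.log(u / (1.0 - u))

def srs_mean(rng, n, theta):
    return mean(logistic_draw(rng, theta) for _ in range(n))

def rss_mean(rng, m, r, theta):
    """Sample mean of Equation (1) from one ranked set sample."""
    total = 0.0
    for _ in range(r):
        for i in range(m):
            ranked = sorted(logistic_draw(rng, theta) for _ in range(m))
            total += ranked[i]
    return total / (m * r)

def percentile(sorted_vals, p):
    k = min(len(sorted_vals) - 1, max(0, math.ceil(p * len(sorted_vals)) - 1))
    return sorted_vals[k]

def calibrate(draw_mean, reps, alpha, theta=1.0):
    """Percentiles of t = (sample mean - theta) over repeated samples."""
    ts = sorted(draw_mean() - theta for _ in range(reps))
    return percentile(ts, alpha / 2), percentile(ts, 1 - alpha / 2)

rng = random.Random(81)
m, r, alpha, reps = 3, 3, 0.05, 5000
a_srs, b_srs = calibrate(lambda: srs_mean(rng, m * r, 1.0), reps, alpha)
a_rss, b_rss = calibrate(lambda: rss_mean(rng, m, r, 1.0), reps, alpha)

xbar = srs_mean(rng, m * r, 1.0)             # one new SRS from L(1, 1)
ci_srs = (xbar - b_srs, xbar - a_srs)        # lower and upper bounds
eff_length = (b_srs - a_srs) / (b_rss - a_rss)
print(ci_srs, eff_length)
```

Replacing `mean` by the sample median in `srs_mean` and `rss_mean` would give the corresponding sketch for the pivot $T_{12}(x)$.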

To find a (1 − α)100% CI for θ using SRS data, the following algorithm is used:

(1) Select a SRS of size n = mr observations from the logistic distribution L(1, 1).
(2) Find the lower and upper confidence bounds $d_{L\,srs} = \bar{x} - b_{srs}$ and $d_{U\,srs} = \bar{x} - a_{srs}$.
(3) Find the length of the interval for the above pivot using SRS data, $L_{11srs} = d_{U\,srs} - d_{L\,srs} = b_{srs} - a_{srs}$.
(4) Repeat steps 1–3 150,000 times and find the mean and the standard error of the length of the intervals. The standard error is zero, since the length of the interval does not depend on the sample mean.

As for the RSS data, we select a RSS from L(1, 1), replace $t_i$ by $r_i = \bar{x}_{rss} - 1$ and follow the same steps as in the above algorithm to find the mean length of the intervals. The efficiencies based on the length of the intervals for the pivot $T_{11}(x)$ are given in Table 1. Using RSS improves the efficiency: the gain in interval length ranges from 36% to 61% depending on the set size.

As for the pivot $T_{12}(x) = \tilde{x} - \theta$, which is very close to the previous pivot, we follow the same steps as above, replacing the mean by the median. The efficiencies based on the length of the intervals for the pivot $T_{12}(x)$ using RSS and SRS are given in Table 1. Using RSS improves the efficiency: the gain in interval length ranges from 27% to 45% depending on the set size.

2.2.2. CI estimate for θ using the pivot T13(x)

Since we do not know the distribution of $T_{13}(x)$, we use the algorithm of the previous section to find a and b with the following modification: select a SRS of size n = mr observations from the logistic L(0, 1) and use $t_i = \sum_{i=1}^{n}(X_i - \theta)^2$. To find a (1 − α)100% CI for θ using SRS, select a SRS of size n = mr observations from L(0, 1) and solve the following two equations for θ:

$$d_{L\,srs}:\ \sum_{i=1}^{n}(X_i - \theta)^2 = a_{srs}$$

and

$$d_{U\,srs}:\ \sum_{i=1}^{n}(X_i - \theta)^2 = b_{srs}.$$

The remaining steps are as for the previous pivots. As for the RSS data, we use the above algorithm to find the mean and standard error of the length of the intervals. We only need to replace $x_i$ by $X_{(i:m)j}$, $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, r$, and $t_i$ by

$$r_i = \sum_{j=1}^{r}\sum_{i=1}^{m}(x_{(i:m)j} - \theta)^2.$$

The efficiencies of the average length of the intervals and the standard errors are given in Table 1. Using RSS improves the efficiency for the standard errors by between 15% and 34% depending on the set size, but improves the efficiency for the length of the interval by at most 7%.

2.2.3. CI estimate for θ using the pivot T14(x)


To construct a CI for θ using SRS data for this pivot, we need to find a and b such that $P(a \le T_{14}(x) \le b) = 1 - \alpha$. Note that $F(x_i - \theta) \sim U(0, 1)$ and $1 - F(x_i - \theta) \sim U(0, 1)$. To find the values of a and b in the above equation, we use the following algorithm for the SRS data:

(1) Select a SRS of size n = mr observations from U(0, 1).
(2) Let $t_i = -\sum_{i=1}^{n}\ln((1 - u_i)/u_i)$.
(3) Repeat steps 1 and 2 100,000 times to get $t_1, t_2, \ldots, t_{100,000}$. Then obtain the (α/2)th and the (1 − α/2)th percentiles as before, and let $a_{srs}$ = (α/2)th percentile and $b_{srs}$ = (1 − α/2)th percentile.

To find a (1 − α)100% CI for θ, select a SRS of size n = mr from L(1, 1) and solve the following two equations for θ:

$$a_{srs} = -\sum_{i=1}^{n}\ln\left[\frac{1 - F(x_i - \theta)}{F(x_i - \theta)}\right] \quad \text{and call it } d_{L\,srs}$$

and

$$b_{srs} = -\sum_{i=1}^{n}\ln\left[\frac{1 - F(x_i - \theta)}{F(x_i - \theta)}\right] \quad \text{and call it } d_{U\,srs},$$

and follow the above steps to find the mean and the standard error of the length of the intervals. The simulation results show that the standard error is zero. As for the RSS data, we use the above algorithm to find a and b. We only need to replace $t_i$ by $r_i = -\sum_{j=1}^{r}\sum_{i=1}^{m}\ln((1 - u_{(i:m)j})/u_{(i:m)j})$ and $F(x_i - \theta)$ by $F(x_{(i:m)j} - \theta)$. The efficiencies based on the length of the intervals are given in Table 1. Using RSS improves the efficiency: the gain in interval length ranges from 35% to 59% depending on the set size.

2.2.4. CI estimate for θ using pivot T15(x)

To construct a CI for θ using SRS data and the pivot $T_{15}(x)$, we need to find a and b such that $P(a \le T_{15}(x) \le b) = 1 - \alpha$. Note that $F(x_i - \theta) \sim U(0, 1)$ and $1 - F(x_i - \theta) \sim U(0, 1)$. To find the values of a and b, we use the algorithm of Section 2.2.3 with $t_i = \sum_{i=1}^{n} u_i$; everything else is the same. To find a (1 − α)100% CI for θ using SRS data, select a SRS of size n = mr observations from the logistic distribution L(1, 1) and solve the following two equations for θ:

$$a_{srs} = \sum_{i=1}^{n}(1 - F(x_i - \theta)) \quad \text{and call it } d_{L\,srs}$$

and

$$b_{srs} = \sum_{i=1}^{n}(1 - F(x_i - \theta)) \quad \text{and call it } d_{U\,srs}.$$

The other steps are as mentioned before. As for the RSS data, we use the above algorithm, replacing $t_i$ by $r_i = \sum_{j=1}^{r}\sum_{i=1}^{m} u_{(i:m)j}$ and $F(x_i - \theta)$ by $F(x_{(i:m)j} - \theta)$. The efficiencies based on the length and the standard errors are given in Table 1. Using RSS improves the efficiency for the length of the interval by between 41% and 75% depending on the set size, and the efficiency for the standard errors ranges from 1.4354 to 2.0638.


3. CI for θ if the coefficient of variation is known

3.1. CI estimate for θ using the MLE

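We will assume that θ = λ. The likelihood function for the SRS is

$$L(\theta, x) = \prod_{i=1}^{n} f(x_i, \theta) = \prod_{i=1}^{n} \frac{1}{\theta}\,\frac{e^{-(x_i/\theta - 1)}}{(1 + e^{-(x_i/\theta - 1)})^2} = \frac{e^{-\sum_{i=1}^{n}(x_i/\theta - 1)}}{\theta^n \prod_{i=1}^{n}(1 + e^{-(x_i/\theta - 1)})^2}.$$

The log-likelihood function is

$$l(\theta, x) = \ln L(\theta, x) = -n\ln(\theta) - \sum_{i=1}^{n}\left(\frac{x_i}{\theta} - 1\right) - 2\sum_{i=1}^{n}\ln(1 + e^{-(x_i/\theta - 1)}).$$

Taking the derivative with respect to θ and setting it equal to zero, we get

$$\frac{\partial l(\theta, x)}{\partial \theta} = -\frac{n}{\theta} + \frac{1}{\theta^2}\sum_{i=1}^{n} x_i - \frac{2}{\theta^2}\sum_{i=1}^{n}\frac{x_i\, e^{-(x_i/\theta - 1)}}{1 + e^{-(x_i/\theta - 1)}} = 0.$$

After some simplification, we get

$$\theta - \bar{x} + \frac{2}{n}\sum_{i=1}^{n}\frac{x_i}{1 + e^{(x_i/\theta - 1)}} = 0. \tag{7}$$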
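A numerical solution of Equation (7) can be sketched as follows. The bisection method, the bracketing interval and the function names are assumptions made for this illustration, since the paper does not specify its numerical method. The bracket uses two observations that follow from Equation (7): its left side tends to minus the mean of $|x_i|$ as θ approaches 0 from above, and it is at least θ minus three times the mean of $|x_i|$ for any θ > 0.

```python
import math
import random

def inv_one_plus_exp(z):
    """Numerically stable 1 / (1 + e^z)."""
    if z > 0.0:
        ez = math.exp(-z)
        return ez / (1.0 + ez)
    return 1.0 / (1.0 + math.exp(z))

def eq7(theta, xs):
    """Left side of Equation (7) for theta > 0."""
    n = len(xs)
    xbar = sum(xs) / n
    s = sum(x * inv_one_plus_exp(x / theta - 1.0) for x in xs)
    return theta - xbar + (2.0 / n) * s

def mle_theta_equal_lambda(xs):
    """Root of Equation (7) by bisection (assumed method)."""
    mean_abs = sum(abs(x) for x in xs) / len(xs)
    lo = 1e-9 * mean_abs           # eq7 is negative just above zero
    hi = 3.0 * mean_abs + 1.0      # eq7(hi) >= hi - 3 * mean|x| > 0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if eq7(mid, xs) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(7)

def draw(theta):
    """One value from the logistic with location = scale = theta."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return theta + theta * math.log(u / (1.0 - u))

xs = [draw(1.0) for _ in range(9)]        # n = mr = 9 observations from L(1, 1)
theta_hat = mle_theta_equal_lambda(xs)
print(theta_hat)
```

Because the model with θ = λ is a scale family, multiplying every observation by a constant multiplies the root of Equation (7) by the same constant, which is what makes $(\hat{\theta} - \theta)/\theta$ usable as a pivot.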

The MLE of θ cannot be obtained in closed form; it is calculated numerically from Equation (7) to get $\hat{\theta}_{mle2}$, the MLE of θ. Let $T_{mle2}(x) = (\hat{\theta}_{mle2} - \theta)/\theta$, which is a pivot, so its distribution is free of θ.

To obtain the values of a and b and $\hat{\theta}_{mle2}$, we follow the same algorithm that we used in Section 2.1. We only need to simulate a SRS of size n = mr from L(1, 1) and use Equation (7) instead of Equation (5); everything else is the same. We solve the pivot $T_{mle2}(x)$ for θ to find a (1 − α)100% CI. It is easy to show that

$$\left(\frac{\hat{\theta}_{mle2}}{b_{mle2} + 1},\ \frac{\hat{\theta}_{mle2}}{a_{mle2} + 1}\right)$$

is a (1 − α)100% CI for θ. The length of the CI is given by

$$L_{mle2} = \frac{\hat{\theta}_{mle2}}{a_{mle2} + 1} - \frac{\hat{\theta}_{mle2}}{b_{mle2} + 1}.$$

The standard deviation of the length is given by

$$s(L_{mle2}) = \left[\frac{b_{mle2} - a_{mle2}}{(a_{mle2} + 1)(b_{mle2} + 1)}\right] s(\hat{\theta}_{mle2}),$$

where $s(\hat{\theta}_{mle2})$ can be calculated using $\hat{\theta}_{mle2\,i}$, $i = 1, 2, \ldots, 100{,}000$.
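The inversion of the pivot can be written out in a few lines. The display below is a short check that assumes θ > 0 and $a_{mle2} + 1 > 0$ (an added assumption, needed so that the inequalities keep their direction):

$$\begin{aligned}
a_{mle2} \le \frac{\hat{\theta}_{mle2} - \theta}{\theta} \le b_{mle2}
&\iff a_{mle2} + 1 \le \frac{\hat{\theta}_{mle2}}{\theta} \le b_{mle2} + 1 \\
&\iff \frac{\hat{\theta}_{mle2}}{b_{mle2} + 1} \le \theta \le \frac{\hat{\theta}_{mle2}}{a_{mle2} + 1}.
\end{aligned}$$

The length of this interval is

$$\frac{\hat{\theta}_{mle2}}{a_{mle2} + 1} - \frac{\hat{\theta}_{mle2}}{b_{mle2} + 1} = \hat{\theta}_{mle2}\,\frac{b_{mle2} - a_{mle2}}{(a_{mle2} + 1)(b_{mle2} + 1)},$$

and since $a_{mle2}$ and $b_{mle2}$ are constants, the standard deviation of the length is the same constant factor multiplied by $s(\hat{\theta}_{mle2})$.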


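As for the CI estimator for θ when θ = λ using RSS data, the likelihood function is

$$L(\theta, x_{rss}) = \prod f(x_{rss\,i}, \theta) = \prod_{j=1}^{r}\prod_{i=1}^{m} \frac{1}{\theta}\,\frac{e^{-(x_{(i:m)j}/\theta - 1)}}{(1 + e^{-(x_{(i:m)j}/\theta - 1)})^2} = \frac{e^{-\sum_{j=1}^{r}\sum_{i=1}^{m}(x_{(i:m)j}/\theta - 1)}}{\theta^n \prod_{j=1}^{r}\prod_{i=1}^{m}(1 + e^{-(x_{(i:m)j}/\theta - 1)})^2}.$$

The log-likelihood function is

$$l(\theta, x_{rss}) = -n\ln(\theta) - \sum_{j=1}^{r}\sum_{i=1}^{m}\left(\frac{x_{(i:m)j}}{\theta} - 1\right) - 2\sum_{j=1}^{r}\sum_{i=1}^{m}\ln(1 + e^{-(x_{(i:m)j}/\theta - 1)}).$$

Taking the derivative with respect to θ and setting it equal to zero, we get

$$\frac{\partial l(\theta, x_{rss})}{\partial \theta} = -\frac{n}{\theta} + \frac{1}{\theta^2}\sum_{j=1}^{r}\sum_{i=1}^{m} x_{(i:m)j} - \frac{2}{\theta^2}\sum_{j=1}^{r}\sum_{i=1}^{m}\frac{x_{(i:m)j}\, e^{-(x_{(i:m)j}/\theta - 1)}}{1 + e^{-(x_{(i:m)j}/\theta - 1)}} = 0.$$

After some simplification, we get

$$\theta - \bar{x}_{rss} + \frac{2}{n}\sum_{j=1}^{r}\sum_{i=1}^{m}\frac{x_{(i:m)j}}{1 + e^{(x_{(i:m)j}/\theta - 1)}} = 0. \tag{8}$$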

The MLE of θ cannot be obtained in closed form; it is calculated numerically from Equation (8) to get $\hat{\theta}_{mlerss2}$, the MLE of θ. Let

$$T_{mlerss2}(x) = \frac{\hat{\theta}_{mlerss2} - \theta}{\theta},$$

which is a pivot, so its distribution is free of θ. We use the above algorithm to obtain a and b using RSS data, and then solve the pivot for θ to find the (1 − α)100% CI

$$\left(\frac{\hat{\theta}_{mlerss2}}{b_{mlerss2} + 1},\ \frac{\hat{\theta}_{mlerss2}}{a_{mlerss2} + 1}\right).$$

The length of the CI is given by

$$L_{mlerss2} = \frac{\hat{\theta}_{mlerss2}}{a_{mlerss2} + 1} - \frac{\hat{\theta}_{mlerss2}}{b_{mlerss2} + 1}.$$

The standard deviation of the length is given by

$$s(L_{mlerss2}) = \left[\frac{b_{mlerss2} - a_{mlerss2}}{(a_{mlerss2} + 1)(b_{mlerss2} + 1)}\right] s(\hat{\theta}_{mlerss2}),$$

where $s(\hat{\theta}_{mlerss2})$ can be calculated using $\hat{\theta}_{mlerss2\,i}$, $i = 1, 2, \ldots, 100{,}000$. The efficiencies for the length and the standard error of the intervals using the pivots $T_{mle2}(X)$ and $T_{mlerss2}(X)$ are given in Table 2. Using RSS gives efficiency gains of up to 42% for the length and up to 55% for the standard error of the intervals, depending on the set size.


Table 2. The efficiencies for the average length and standard error using different pivots to construct CI estimate for the location parameter θ when θ = λ, using RSS and SRS data for different values of the set size m.

           m = 3                             m = 4                             m = 5
Pivot      eff(Lrss, Lsrs)  eff(srss, ssrs)  eff(Lrss, Lsrs)  eff(srss, ssrs)  eff(Lrss, Lsrs)  eff(srss, ssrs)
T2m(X)     1.3740           1.4218           1.4050           1.4929           1.4189           1.5519
T21(X)     1.1945           1.3717           1.2683           1.5427           1.3402           1.7270
T22(X)     1.1205           1.2258           1.1874           1.3656           1.2076           1.4221
T23(X)     1.0107           1.0499           1.0887           1.1730           1.0936           1.1823
T24(X)     1.2414           1.5275           1.3411           1.8047           1.4522           2.1447
T25(X)     1.1177           2.1454           1.2733           2.4019           1.4590           2.4363


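3.2. CI estimate for θ using five pivots

Five different pivots will be used to construct CIs for θ when θ = λ. These pivots are

$$T_{21}(x) = \frac{\sum_{i=1}^{n}|x_i| - \theta}{\theta},$$

$$T_{22}(x) = \frac{\tilde{x} - \theta}{\theta},$$

where $\tilde{x}$ is the median of the absolute values of the data set,

$$T_{23}(x) = \sum_{i=1}^{n}\left(\frac{x_i - \theta}{\theta}\right)^2,$$

$$T_{24}(x) = -\sum_{i=1}^{n}\ln\left[\frac{1 - F((x_i - \theta)/\theta)}{F((x_i - \theta)/\theta)}\right]$$

and

$$T_{25}(x) = \sum_{i=1}^{n}\left[1 - F\left(\frac{x_i - \theta}{\theta}\right)\right].$$

3.2.1. CI estimate for θ using the pivot T21(x)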

We can use the algorithm of Section 2.2.1 to find a and b such that $P(a \le T_{21}(x) \le b) = 1 - \alpha$. We only need to replace $t_i$ by $t_i = (\sum_{i=1}^{n}|x_i| - \theta)/\theta$. The (1 − α)100% CI for θ can be shown to be

$$\left(\frac{\sum_{i=1}^{n}|x_i|}{b_{srs} + 1},\ \frac{\sum_{i=1}^{n}|x_i|}{a_{srs} + 1}\right).$$

Once again we can use the algorithm of Section 2.2.1 to find a (1 − α)100% CI using SRS data. We select SRS data from the logistic L(1, 1) and compute the lower and the upper confidence bounds using

$$d_{L\,srs} = \frac{\sum_{i=1}^{n}|x_i|}{b_{srs} + 1} \quad \text{and} \quad d_{U\,srs} = \frac{\sum_{i=1}^{n}|x_i|}{a_{srs} + 1},$$

and everything else is the same.

As for the RSS, we use the above algorithm to find a and b, replacing $t_i$ by $r_i = (\sum_{j=1}^{r}\sum_{i=1}^{m}|x_{(i:m)j}| - \theta)/\theta$. To find a (1 − α)100% CI for θ using RSS data, we proceed as for SRS, replacing $x_i$ by $x_{(i:m)j}$. The efficiencies based on the length and the standard errors are given in Table 2. RSS gives efficiency gains of up to 34% for the length and up to 73% for the standard error of the interval, depending on the set size.

3.2.2. CI estimate for θ using the pivot T22(x)


The pivot $T_{22}(x)$ is suggested for constructing a CI for θ using the median of the data. The algorithm in Section 3.2.1 can be used to find a and b for SRS such that $P(a \le T_{22}(x) \le b) = 1 - \alpha$. We only need to use $t_i = (\tilde{x} - \theta)/\theta$; the other steps are similar to those of Section 3.2.1. Following the same procedure as in Section 3.2.1, the (1 − α)100% CI for θ is

$$\left(\frac{\tilde{x}}{b_{srs} + 1},\ \frac{\tilde{x}}{a_{srs} + 1}\right).$$

Again we can use the algorithm in Section 3.2.1 to find the mean and the standard error of the length of the intervals. As for the RSS, we use the above algorithm to find a and b, replacing $t_i$ by $r_i = (\tilde{x} - \theta)/\theta$, where $\tilde{x}$ is the median of the absolute values of the RSS data. To find a (1 − α)100% CI for θ using RSS data, we follow the same method as for SRS, but with RSS data. The efficiencies based on the average length and the standard errors are reported in Table 2. Using RSS gives efficiency gains of up to 21% for the average length of the interval and up to 42% for the standard errors, depending on the set size.

3.2.3. CI estimate for θ using the pivot T23(x)

To find a and b such that $P(a \le T_{23}(x) \le b) = 1 - \alpha$, we use the same algorithm as in Section 3.2.1, replacing $t_i$ by $t_i = \sum_{i=1}^{n}((x_i - \theta)/\theta)^2$. Following the procedure used in Section 3.2.1, we can find the (1 − α)100% CI for θ to be

$$\left(\sum_{i=1}^{n}\left(\frac{x_i - \theta}{\theta}\right)^2 - a_{srs},\ \sum_{i=1}^{n}\left(\frac{x_i - \theta}{\theta}\right)^2 - b_{srs}\right).$$

We can find the mean and the standard error of the length of the intervals using the previous algorithm. As for the RSS, we use the above algorithm to find a and b, replacing $t_i$ by $r_i = \sum_{j=1}^{r}\sum_{i=1}^{m}((x_{(i:m)j} - \theta)/\theta)^2$. To find a (1 − α)100% CI for θ using RSS data, we use the same results as for SRS, but with RSS data. The efficiencies based on the average lengths and standard errors are given in Table 2. Using RSS data improves the efficiency for the length of the interval by at most 10%, depending on the set size, and improves the efficiency for the standard errors by up to 18%.


3.2.4. CI estimate for θ using the pivot T24(x)

Since the cdf of the logistic is $F(x) = [1 + \exp(-((x - \theta)/\theta))]^{-1}$, this pivot can be simplified to

$$T_{24}(x) = \sum_{i=1}^{n}\frac{x_i - \theta}{\theta}.$$
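The simplification follows directly from the form of the cdf. As a short check, write $z_i = (x_i - \theta)/\theta$ (a shorthand introduced here) and $F(z) = [1 + e^{-z}]^{-1}$. Then

$$1 - F(z_i) = \frac{e^{-z_i}}{1 + e^{-z_i}}, \qquad \frac{1 - F(z_i)}{F(z_i)} = e^{-z_i}, \qquad -\ln\left[\frac{1 - F(z_i)}{F(z_i)}\right] = z_i,$$

so that summing over i gives

$$T_{24}(x) = \sum_{i=1}^{n} z_i = \sum_{i=1}^{n}\frac{x_i - \theta}{\theta} = \frac{1}{\theta}\sum_{i=1}^{n} x_i - n.$$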


To find the values of a and b such that $P(a \le T_{24}(x) \le b) = 1 - \alpha$, we can use the algorithm given in Section 3.2.1 with $t_i = \sum_{i=1}^{n}(x_i - \theta)/\theta$ and $\sum_{i=1}^{n} x_i/\theta > n$. The (1 − α)100% CI for θ using SRS can be shown to be

$$\left(\frac{\sum_{i=1}^{n} x_i}{b_{srs} + n},\ \frac{\sum_{i=1}^{n} x_i}{a_{srs} + n}\right).$$

To find a (1 − α)100% CI for θ using SRS data, we used the algorithm given in Section 3.2.1. As for the RSS data, we can use the above algorithm to find a and b, replacing $t_i$ by $r_i = \sum_{j=1}^{r}\sum_{i=1}^{m}(x_{(i:m)j} - \theta)/\theta$. It can be shown that a (1 − α)100% CI for θ using RSS data is

$$\left(\frac{\sum_{j=1}^{r}\sum_{i=1}^{m} x_{(i:m)j}}{b_{rss} + n},\ \frac{\sum_{j=1}^{r}\sum_{i=1}^{m} x_{(i:m)j}}{a_{rss} + n}\right).$$

Once again, we find the mean and the standard error of the length of the intervals using RSS data by following the same method as for SRS. Using RSS gives an efficiency for the length of the interval of up to 1.45 and an efficiency for the standard errors of up to 2.14, depending on the set size (Table 2).

3.2.5. CI estimate for θ using pivot T25(x)

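We need to find a and b such that $P(a \le T_{25}(x) \le b) = 1 - \alpha$. Note that

$$F\left(\frac{x_i - \theta}{\theta}\right) \sim U(0, 1) \quad \text{and} \quad 1 - F\left(\frac{x_i - \theta}{\theta}\right) \sim U(0, 1).$$

To find the values of a and b using the above equations, we used the same algorithm as in Section 2.2.4 for the SRS data. To find a (1 − α)100% CI for θ using SRS data, we can use the algorithm of Section 2.2.4, solving the following two equations for θ:

$$a_{srs} = \sum_{i=1}^{n}\left[1 - F\left(\frac{x_i - \theta}{\theta}\right)\right] \quad \text{and call it } d_{L\,srs}$$

and

$$b_{srs} = \sum_{i=1}^{n}\left[1 - F\left(\frac{x_i - \theta}{\theta}\right)\right] \quad \text{and call it } d_{U\,srs}.$$

Everything else is the same.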


As for the RSS data, we proceed in the same way as above, using the RSS data instead of the SRS data. The efficiencies based on the average length and the standard errors are given in Table 2. Using RSS gives an efficiency for the length of the interval of up to 1.46 and an efficiency for the standard errors of up to 2.44, depending on the set size.

4. CI for λ when θ is known

4.1. CI estimate for λ when θ = 0 using the MLE

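To find the MLE for λ when θ is known using SRS, we assume wlog that θ = 0, so the logistic pdf reduces to

$$f(x_i, \lambda) = \frac{e^{-x_i/\lambda}}{\lambda(1 + e^{-x_i/\lambda})^2}, \quad -\infty < x_i < +\infty \ \text{and} \ \lambda > 0.$$

The likelihood function is given by

$$L(\lambda, x) = \prod_{i=1}^{n} f(x_i, \lambda) = \prod_{i=1}^{n}\frac{e^{-x_i/\lambda}}{\lambda(1 + e^{-x_i/\lambda})^2} = \frac{e^{-\sum_{i=1}^{n} x_i/\lambda}}{\lambda^n \prod_{i=1}^{n}(1 + e^{-x_i/\lambda})^2}.$$

The log-likelihood function is

$$l(\lambda, x) = \ln[L(\lambda, x)] = -n\ln\lambda - \frac{\sum_{i=1}^{n} x_i}{\lambda} - 2\sum_{i=1}^{n}\ln(1 + e^{-x_i/\lambda}).$$

To find the MLE for λ, take the derivative of l(λ, x) with respect to λ, set it equal to zero and solve for λ:

$$\frac{\partial l(\lambda, x)}{\partial \lambda} = -\frac{n}{\lambda} + \frac{\sum_{i=1}^{n} x_i}{\lambda^2} - \frac{2}{\lambda^2}\sum_{i=1}^{n}\frac{x_i\, e^{-x_i/\lambda}}{1 + e^{-x_i/\lambda}} = 0,$$

$$\bar{x} = \lambda + \frac{2}{n}\sum_{i=1}^{n}\frac{x_i\, e^{-x_i/\lambda}}{1 + e^{-x_i/\lambda}}. \tag{9}$$

The MLE of λ, denoted by $\hat{\lambda}_{mle}$, cannot be obtained in closed form, so we use a numerical method to find it. The CI for λ using $\hat{\lambda}_{mle}$ based on SRS data can be found using the following pivot:

$$T_{mle3}(X) = \frac{\hat{\lambda}_{mle}}{\lambda}.$$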

So the distribution of $T_{\mathrm{mle3}}(X)$ is free of λ. The following algorithm will be used to find a and b such that $P(a \le T_{\mathrm{mle3}}(X) \le b) = 1 - \alpha$, and a point estimate for λ:

(1) Simulate a SRS of size n = mr from logistic L(0, 1).
(2) Substitute these values in Equation (9) and solve for λ to get one value, call it $t_i$.
(3) Repeat steps 1 and 2, 100,000 times to get $t_1, t_2, \ldots, t_{100,000}$; the point estimate for λ is $\hat{\lambda}_{\mathrm{mle}}$, which is the average of $t_1, t_2, \ldots, t_{100,000}$.
(4) Obtain the (α/2)th percentile and the (1 − α/2)th percentile of $t_1, t_2, \ldots, t_{100,000}$ and let $a_{\mathrm{mle3}}$ = (α/2)th percentile and $b_{\mathrm{mle3}}$ = (1 − α/2)th percentile.

The (1 − α)100% CI for λ using the pivot $T_{\mathrm{mle3}}(X)$ can be shown as

$$\left( \frac{\hat{\lambda}_{\mathrm{mle}}}{b_{\mathrm{mle3}}},\ \frac{\hat{\lambda}_{\mathrm{mle}}}{a_{\mathrm{mle3}}} \right).$$
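The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' program: the bisection solver, the helper names, the sample size n = 12, the seeds and the reduced number of replicates (2,000 instead of 100,000) are all choices made here to keep the example fast. The solver uses the identity that Equation (9) is algebraically the same as λ = (1/n) Σ x_i tanh(x_i/(2λ)), which avoids overflow in the exponential.

```python
import math
import random


def logistic_sample(n, rng, scale=1.0):
    """Draw n values from logistic(0, scale) by inverting the cdf."""
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:  # guard against log(0)
            out.append(scale * math.log(u / (1.0 - u)))
    return out


def scale_mle(xs, tol=1e-10):
    """Solve Equation (9) for lambda by bisection (solver choice is an assumption)."""
    n = len(xs)

    def h(lam):
        # Equation (9) rearranged: lam - mean(x * tanh(x / (2 lam))) = 0
        return lam - sum(x * math.tanh(x / (2.0 * lam)) for x in xs) / n

    lo = 1e-12
    hi = sum(abs(x) for x in xs) / n  # h(hi) >= 0 because x * tanh(.) <= |x|
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if h(mid) < 0.0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def percentile(sorted_vals, p):
    """Empirical p-th quantile (0 < p < 1) of an already sorted list."""
    k = int(math.ceil(p * len(sorted_vals))) - 1
    return sorted_vals[min(len(sorted_vals) - 1, max(0, k))]


def pivot_quantiles(n, alpha=0.05, reps=2000, seed=1):
    """Steps (1)-(4): percentiles of the MLE computed under logistic L(0, 1)."""
    rng = random.Random(seed)
    ts = sorted(scale_mle(logistic_sample(n, rng)) for _ in range(reps))
    return percentile(ts, alpha / 2), percentile(ts, 1 - alpha / 2)


def mle_interval(xs, a, b):
    """(1 - alpha)100% CI for lambda: (mle / b, mle / a)."""
    est = scale_mle(xs)
    return est / b, est / a


a, b = pivot_quantiles(n=12)
data = logistic_sample(12, random.Random(7), scale=2.0)  # illustrative data
lower, upper = mle_interval(data, a, b)
print(a, b, lower, upper)
```

Because the function solved is strictly increasing in λ, the bisection has a single root to find; any other one-dimensional root finder would serve equally well.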

To find an estimate for the length and its standard error using SRS, we use the same algorithm used in Section 2.1. The MLE for λ when θ is known using RSS is the solution of

$$\bar{x}_{\mathrm{rss}} = \lambda + \frac{2}{n} \sum_{j=1}^{r} \sum_{i=1}^{m} \frac{x_{(i:m)j}\, e^{-x_{(i:m)j}/\lambda}}{1 + e^{-x_{(i:m)j}/\lambda}}. \tag{10}$$
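Equation (10) has the same form as Equation (9), with the RSS observations in place of the SRS ones. The sketch below illustrates this under assumptions made here: perfect ranking, the standard ranked set scheme (in each of r cycles, the i-th smallest value is kept from the i-th set of m draws), bisection as the numerical method, and hypothetical values m = 3, r = 4. The function names are illustrative.

```python
import math
import random


def draw(rng, scale=1.0):
    """One logistic(0, scale) value by inverting the cdf."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return scale * math.log(u / (1.0 - u))


def ranked_set_sample(m, r, rng, scale=1.0):
    """Return the m * r values x_(i:m)j, assuming perfect ranking."""
    sample = []
    for _ in range(r):        # cycles j = 1, ..., r
        for i in range(m):    # ranks i = 1, ..., m
            group = sorted(draw(rng, scale) for _ in range(m))
            sample.append(group[i])
    return sample


def eq10_residual(lam, xs):
    """Right-hand side minus left-hand side of Equation (10)."""
    n = len(xs)
    # x e^{-x/lam} / (1 + e^{-x/lam}) is written as x (1 - tanh(x / (2 lam))) / 2
    rhs = lam + (2.0 / n) * sum(
        x * (1.0 - math.tanh(x / (2.0 * lam))) / 2.0 for x in xs
    )
    return rhs - sum(xs) / n


def solve_eq10(xs, iterations=200):
    """Bisection on the residual, which is increasing in lambda."""
    lo = 1e-12
    hi = sum(abs(x) for x in xs) / len(xs)  # the root cannot exceed mean |x|
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if eq10_residual(mid, xs) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


rng = random.Random(11)
rss_data = ranked_set_sample(m=3, r=4, rng=rng, scale=2.0)
lam_rss = solve_eq10(rss_data)
print(len(rss_data), lam_rss)
```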

The MLE of λ, denoted by $\hat{\lambda}_{\mathrm{mlerss}}$, cannot be obtained in closed form, so we used a numerical method to find it. The CI for λ using $\hat{\lambda}_{\mathrm{mlerss}}$ based on RSS data can be shown as $(\hat{\lambda}_{\mathrm{mlerss}}/b_{\mathrm{mlerss3}},\ \hat{\lambda}_{\mathrm{mlerss}}/a_{\mathrm{mlerss3}})$. We followed the algorithm used in Section 2.1 to find the average and the standard error of these lengths. The efficiencies using the MLE to estimate λ using RSS and SRS are given in Table 3. Using RSS will improve the efficiency, i.e. reduce the length of the interval, by up to 19% depending on the set size, and improve the efficiency of the standard error by up to 36%.

4.2. CI estimate for λ using five pivots

The following pivots will be used to construct CI for λ when θ = 0:

$$T_{31}(x) = \frac{\sum_{i=1}^{n} |x_i|}{\lambda}, \qquad T_{32}(x) = \frac{\tilde{X}}{\lambda},$$

Table 3. The efficiencies for the average length and standard error using different pivots to construct CI estimate for the scale parameter λ when θ is known, using RSS and SRS data for different values of the set size m.

                       m = 3                              m = 4                              m = 5
Pivot      eff(Lrss, Lsrs)  eff(srss, ssrs)   eff(Lrss, Lsrs)  eff(srss, ssrs)   eff(Lrss, Lsrs)  eff(srss, ssrs)
T3m(X)     1.0540           1.0983            1.1057           1.2041            1.1782           1.3584
T31(X)     1.0603           1.1062            1.1334           1.2614            1.1937           1.3877
T32(X)     1.0304           1.0659            1.0919           1.1837            1.1230           1.2556
T33(X)     1.0493           1.0886            1.0877           1.1645            1.1429           1.2631
T34(X)     1.0573           1.1093            1.1224           1.2379            1.1915           1.3861
T35(X)     1.0221           1.1019            1.0595           1.1549            1.0808           1.1662


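where $\tilde{X} = \mathrm{median}\,|X_i|$, i.e. we take the absolute values of the sample data and then find the median,

$$T_{33}(x) = \sum_{i=1}^{n} \left( \frac{x_i}{\lambda} \right)^{2}, \qquad T_{34}(x) = -\sum_{i=1}^{n} \ln \left[ \frac{1 - F(x_i/\lambda)}{F(x_i/\lambda)} \right],$$

and

$$T_{35}(x) = \sum_{i=1}^{n} \left[ 1 - F\!\left( \frac{x_i}{\lambda} \right) \right].$$

4.2.1. CI estimate for λ using the pivots T31(x) and T32(x)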

The following algorithm will be used to find a and b using SRS data for T31(x):

(1) Select a SRS of size n = mr observations from the logistic distribution L(0, 1).
(2) Let $t_i = \sum_{i=1}^{n} |x_i|/\lambda$.
(3) Repeat steps 1 and 2, 100,000 times to get $t_1, t_2, \ldots, t_{100,000}$.
(4) Obtain the (α/2)th percentile and the (1 − α/2)th percentile of $t_1, t_2, \ldots, t_{100,000}$ and let $a_{\mathrm{srs}}$ = (α/2)th percentile and $b_{\mathrm{srs}}$ = (1 − α/2)th percentile.

A (1 − α)100% CI for λ can be shown as

$$\left( \frac{\sum |x_i|}{b_{\mathrm{srs}}},\ \frac{\sum |x_i|}{a_{\mathrm{srs}}} \right).$$

To find a (1 − α)100% CI for λ using SRS data, the following algorithm will be used:

(1) Select a SRS of size n = mr observations from the logistic distribution L(0, 1).
(2) Compute $d_{L\,\mathrm{srs}} = \sum |x_i| / b_{\mathrm{srs}}$ and $d_{U\,\mathrm{srs}} = \sum |x_i| / a_{\mathrm{srs}}$.
(3) Find the length of the interval for T31(x) using SRS data as follows: $L_{31\,\mathrm{srs}} = [d_{U\,\mathrm{srs}} - d_{L\,\mathrm{srs}}]$.
(4) Repeat steps 1–3, 150,000 times and find the mean and the standard error of the length of the intervals.

As for the RSS data, we will use the above algorithm to find a and b. We only need to replace $t_i$ by $r_i = \sum_{j=1}^{r} \sum_{i=1}^{m} |x_{(i:m)j}|/\lambda$. In constructing the CI, we need to compute

$$\left( \sum_{j=1}^{r} \sum_{i=1}^{m} |x_{(i:m)j}| / b_{\mathrm{rss}},\ \sum_{j=1}^{r} \sum_{i=1}^{m} |x_{(i:m)j}| / a_{\mathrm{rss}} \right).$$

The efficiencies of estimating the average lengths and the standard errors of the intervals are given in Table 3. As we can see, RSS will reduce the average length of the interval and standard errors up to 18% and 38%, respectively, depending on the set size.
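The two algorithms of this subsection, applied to both sampling schemes, can be sketched as follows. The code is an illustration only and rests on assumptions made here: perfect ranking in the RSS scheme, hypothetical values m = 3 and r = 4, 4,000 replicates instead of 100,000 and 150,000, the spread of the lengths reported as their sample standard deviation, and efficiency taken as the SRS value divided by the RSS value (so that values above 1 favour RSS).

```python
import math
import random
import statistics


def draw(rng, scale=1.0):
    """One logistic(0, scale) value by inverting the cdf."""
    u = rng.random()
    while u == 0.0:  # guard against log(0)
        u = rng.random()
    return scale * math.log(u / (1.0 - u))


def srs(m, r, rng, scale=1.0):
    """Simple random sample of size n = m * r."""
    return [draw(rng, scale) for _ in range(m * r)]


def rss(m, r, rng, scale=1.0):
    """Ranked set sample: i-th smallest of the i-th set of m draws, over r cycles."""
    return [sorted(draw(rng, scale) for _ in range(m))[i]
            for _ in range(r) for i in range(m)]


def t31(xs):
    """Sum of |x|; equals the pivot T31 when lambda = 1."""
    return sum(abs(x) for x in xs)


def pivot_quantiles(sampler, m, r, rng, alpha=0.05, reps=4000):
    """Empirical percentiles a and b of T31 under L(0, 1)."""
    ts = sorted(t31(sampler(m, r, rng)) for _ in range(reps))
    return ts[int(alpha / 2 * reps)], ts[int((1 - alpha / 2) * reps) - 1]


def length_summary(sampler, m, r, rng, a, b, reps=4000):
    """Mean and sample standard deviation of the interval length d_U - d_L."""
    lengths = []
    for _ in range(reps):
        s = t31(sampler(m, r, rng))
        lengths.append(s / a - s / b)
    return statistics.mean(lengths), statistics.stdev(lengths)


rng = random.Random(12345)
m, r = 3, 4  # hypothetical set size and number of cycles
summary = {}
for name, sampler in (("srs", srs), ("rss", rss)):
    a, b = pivot_quantiles(sampler, m, r, rng)
    summary[name] = (a, b) + length_summary(sampler, m, r, rng, a, b)

# Assumed definition: efficiency = SRS value / RSS value.
eff_length = summary["srs"][2] / summary["rss"][2]
eff_spread = summary["srs"][3] / summary["rss"][3]
print(eff_length, eff_spread)
```

With so few replicates the two ratios are noisy, so they should not be expected to reproduce the entries of Table 3.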



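The pivot $T_{32}(x) = \tilde{X}/\lambda$ is very close to the previous pivot. We used the same algorithm to find a and b. We only need to replace $t_i$ by $t_i = \tilde{x}/\lambda$. The (1 − α)100% CI for λ can be shown as

$$\left( \frac{\tilde{x}}{b_{\mathrm{srs}}},\ \frac{\tilde{x}}{a_{\mathrm{srs}}} \right).$$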


To find a (1 − α)100% CI for λ using SRS data, we used the above algorithm. As for the RSS data, we will use the algorithm used for the SRS. We only need to replace $t_i$ by $r_i = \tilde{x}/\lambda$, where $\tilde{x} = \mathrm{median}\,|X_{(i:m)j}|$, i.e. the median of the absolute values of RSS data. The (1 − α)100% CI for λ using RSS data can be shown as $[\tilde{x}/b_{\mathrm{rss}},\ \tilde{x}/a_{\mathrm{rss}}]$.

The efficiencies of estimating the average lengths and the standard errors of the intervals are given in Table 3. As we can see, RSS will reduce the length of the interval and standard errors up to 12% and 26%, respectively, depending on the set size.

4.2.2. CI estimate for λ using the pivot T33(x)

The algorithm of Section 4.2.1 can be used to find a and b using SRS. We only need to replace $t_i$ by $t_i = \sum_{i=1}^{n} (x_i/\lambda)^{2}$. Because $T_{33}(x) = \sum_{i=1}^{n} x_i^{2}/\lambda^{2}$, inverting $a_{\mathrm{srs}} \le T_{33}(x) \le b_{\mathrm{srs}}$ for λ gives the (1 − α)100% CI

$$\left( \sqrt{\frac{\sum_{i=1}^{n} x_i^{2}}{b_{\mathrm{srs}}}},\ \sqrt{\frac{\sum_{i=1}^{n} x_i^{2}}{a_{\mathrm{srs}}}} \right).$$

We followed the above algorithm to find a (1 − α)100% CI for λ using SRS data. As for the RSS data, we will use the above algorithm to find a and b and then a (1 − α)100% CI for λ. We only need to replace $x_i$ by $x_{(i:m)j}$.

The efficiencies of estimating the average lengths and the standard errors of the intervals are given in Table 3. As we can see, RSS will reduce the length of the interval and standard errors up to 14% and 26%, respectively, depending on the set size.

4.2.3. CI estimate for λ using the pivot T34(x)

Since the cdf is $F(x) = [1 + \exp(-x/\lambda)]^{-1}$, this pivot can be simplified as

$$T_{34}(x) = \sum_{i=1}^{n} \frac{x_i}{\lambda}.$$
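The simplification can be checked term by term. The short derivation below assumes that F in the definition of T34 is the standard logistic cdf evaluated at $z_i = x_i/\lambda$; the symbol $z_i$ is introduced here only for readability.

```latex
% Assumption: F is the standard logistic cdf, evaluated at z_i = x_i / \lambda.
\begin{align*}
F(z_i) &= \frac{1}{1 + e^{-z_i}}, \qquad
1 - F(z_i) = \frac{e^{-z_i}}{1 + e^{-z_i}}, \\
\frac{1 - F(z_i)}{F(z_i)} &= e^{-z_i}, \\
-\ln \frac{1 - F(z_i)}{F(z_i)} &= z_i = \frac{x_i}{\lambda}, \\
T_{34}(x) &= -\sum_{i=1}^{n} \ln \frac{1 - F(z_i)}{F(z_i)}
           = \sum_{i=1}^{n} \frac{x_i}{\lambda}.
\end{align*}
```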

The algorithm of Section 4.2.1 can be used to find a and b using SRS. We only need to replace $t_i$ by $t_i = \sum_{i=1}^{n} |x_i|/\lambda$ and everything else is the same. The (1 − α)100% CI for λ can be shown as

$$\left( \frac{\sum_{i=1}^{n} |x_i|}{b_{\mathrm{srs}}},\ \frac{\sum_{i=1}^{n} |x_i|}{a_{\mathrm{srs}}} \right).$$

To find a (1 − α)100% CI for λ, we used the algorithm that we used in Section 4.2.1. As for the RSS data, we used the above algorithm, replacing $t_i$ by $r_i = \sum_{j=1}^{r} \sum_{i=1}^{m} |x_{(i:m)j}|/\lambda$. The (1 − α)100% CI for λ using RSS can be shown as

$$\left( \frac{\sum_{j=1}^{r} \sum_{i=1}^{m} |x_{(i:m)j}|}{b_{\mathrm{rss}}},\ \frac{\sum_{j=1}^{r} \sum_{i=1}^{m} |x_{(i:m)j}|}{a_{\mathrm{rss}}} \right).$$

Once again, we used the algorithm that we used for SRS data. We only need to use RSS data instead of SRS.
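The pivots of Sections 4.2.1 to 4.2.3 all follow one recipe: simulate the statistic under L(0, 1), take its percentiles a and b, then invert the pivot for λ. The sketch below illustrates the inversion step for T32 and T33 with SRS data. It is an illustration under assumptions made here (n = 12, 4,000 replicates, illustrative names). Note that T32 is linear in 1/λ, whereas T33 depends on 1/λ², so its inversion takes a square root.

```python
import math
import random
import statistics


def draw_srs(n, rng, scale=1.0):
    """Simple random sample of size n from logistic(0, scale)."""
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:  # guard against log(0)
            out.append(scale * math.log(u / (1.0 - u)))
    return out


def median_abs(xs):
    """Median of the absolute values (numerator of T32)."""
    return statistics.median([abs(x) for x in xs])


def sum_squares(xs):
    """Sum of squares (numerator of T33)."""
    return sum(x * x for x in xs)


# name -> (statistic from the data, map from (statistic, percentile) to lambda)
PIVOTS = {
    "T32": (median_abs, lambda s, q: s / q),              # T32 = s / lambda
    "T33": (sum_squares, lambda s, q: math.sqrt(s / q)),  # T33 = s / lambda**2
}


def pivot_quantiles(name, n, rng, alpha=0.05, reps=4000):
    """Empirical percentiles a and b of the pivot under L(0, 1)."""
    stat = PIVOTS[name][0]
    ts = sorted(stat(draw_srs(n, rng)) for _ in range(reps))
    return ts[int(alpha / 2 * reps)], ts[int((1 - alpha / 2) * reps) - 1]


def interval(name, xs, a, b):
    """Invert a <= pivot <= b for lambda; the upper percentile gives the lower limit."""
    stat, to_lambda = PIVOTS[name]
    s = stat(xs)
    return to_lambda(s, b), to_lambda(s, a)


rng = random.Random(31)
data = draw_srs(12, rng, scale=2.0)  # illustrative data
intervals = {}
for name in PIVOTS:
    a, b = pivot_quantiles(name, 12, rng)
    intervals[name] = (a, b, interval(name, data, a, b))
print(intervals)
```

A useful sanity check on any such inversion is scale equivariance: multiplying every observation by a constant should multiply both interval limits by the same constant.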


The efficiencies of estimating the average lengths and the standard errors of the intervals are given in Table 3. As we can see, RSS will reduce the length of the interval and standard errors up to 19% and 39%, respectively, depending on the set size.

4.2.4. CI estimate for λ using pivot T35(x)

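We need to find a and b such that $P(a \le T_{35}(x) \le b) = 1 - \alpha$. Note that

$$F\!\left( \frac{x_i}{\lambda} \right) \sim U(0, 1) \qquad \text{and} \qquad 1 - F\!\left( \frac{x_i}{\lambda} \right) \sim U(0, 1).$$

To find the values of a and b using the above equations, we used the algorithm used in Section 2.2.4 for the SRS data. To find a (1 − α)100% CI for λ using SRS data, we select a SRS of size n = mr observations from the logistic distribution L(0, 1). Solve the following two equations for λ:

$$a_{\mathrm{srs}} = \sum_{i=1}^{n} \left[ 1 - F\!\left( \frac{x_i}{\lambda} \right) \right] \quad \text{and call the solution } d_{L\,\mathrm{srs}},$$

and

$$b_{\mathrm{srs}} = \sum_{i=1}^{n} \left[ 1 - F\!\left( \frac{x_i}{\lambda} \right) \right] \quad \text{and call the solution } d_{U\,\mathrm{srs}}.$$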

The rest of the algorithm is similar to what we have done before. As for the RSS data, we will use the above algorithm to find a and b. We only need to replace $t_i$ by $r_i = \sum_{j=1}^{r} \sum_{i=1}^{m} u_{(i:m)j}$. To find a (1 − α)100% CI for λ, we used the above algorithm for RSS data. The efficiencies of estimating the average lengths and the standard errors of the intervals are given in Table 3. As we can see, RSS will reduce the length of the interval and standard errors up to 8% and 17%, respectively, depending on the set size.
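Since each term of T35 is U(0, 1) under SRS, the percentiles a and b do not depend on the logistic distribution at all. The sketch below illustrates this; it assumes that the algorithm of Section 2.2.4 (not reproduced in this section) amounts to taking Monte Carlo percentiles, and it uses hypothetical values m = 3, r = 4 with 20,000 replicates. Under RSS the terms $u_{(i:m)j}$ are taken to be uniform order statistics, assuming perfect ranking.

```python
import random


def t35_srs(n, rng):
    """Sum of n independent U(0, 1) values, i.e. T35 under SRS."""
    return sum(rng.random() for _ in range(n))


def t35_rss(m, r, rng):
    """Sum of the uniform order statistics u_(i:m)j over r cycles."""
    total = 0.0
    for _ in range(r):
        for i in range(m):
            # i-th smallest of a fresh set of m uniforms (perfect ranking assumed)
            total += sorted(rng.random() for _ in range(m))[i]
    return total


def percentiles(draws, alpha=0.05):
    """Empirical (alpha/2)th and (1 - alpha/2)th percentiles."""
    ts = sorted(draws)
    k = len(ts)
    return ts[int(alpha / 2 * k)], ts[int((1 - alpha / 2) * k) - 1]


rng = random.Random(2024)
m, r = 3, 4  # hypothetical set size and number of cycles
reps = 20000
a_srs, b_srs = percentiles([t35_srs(m * r, rng) for _ in range(reps)])
a_rss, b_rss = percentiles([t35_rss(m, r, rng) for _ in range(reps)])
print(a_srs, b_srs, a_rss, b_rss)
```

The RSS percentiles lie closer together than the SRS ones because the sum of ranked uniforms has a smaller variance than the sum of independent uniforms.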

5. Conclusions and recommendations

In this study, we established that using RSS to obtain point estimates of the parameters of interest, and then constructing CIs for these parameters, increases the efficiencies of the expected lengths and the standard errors of these intervals. However, the gain in efficiency varies from one case to another and from one pivot to another. The following are some specific comments. Note that the three cases that we are referring to are case 1: estimating θ when λ = 1; case 2: estimating θ when λ = θ; case 3: estimating λ when θ = 0.

(1) Using the MLE will increase the efficiencies of using RSS compared with SRS for the three cases of interest. But the amount of gain in the efficiency will depend on the set size; for example, if we are using the expected length of the interval for case 1, the gain in the efficiency is between 41% and 74% depending on the set size.
(2) Increasing the number of replications (cycles) will not have much effect on the efficiencies for most of the cases considered in this study.
(3) Increasing the set size will increase the efficiencies for the cases considered in this study.
(4) There is no one pivot that dominates all other pivots for all the cases under consideration.


Our final recommendations are as follows:

(1) If we wish to estimate θ when λ is known, we recommend using either the MLE, T11(x), or T15(x). But pivot T15(x) dominated all other pivots used in this study.
(2) To estimate θ when λ = θ, we recommend using either the MLE, T21(x), T24(x) or T25(x). The largest increase in the efficiency is obtained if we use T25(x) for the standard error, but there is no clear-cut choice if we use the expected length.
(3) To estimate λ when θ is known, we recommend using the MLE, T31(x), T32(x) or T34(x). But T31(x) dominated all other pivots even though the gain in the efficiencies is not large compared with other pivots, especially the pivot T34(x).

Acknowledgements

This work was supported by King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia under the fast track project # FT/2006-21.

