Hindawi Publishing Corporation, The Scientific World Journal, Volume 2013, Article ID 431868, 7 pages, http://dx.doi.org/10.1155/2013/431868
Research Article

Some Improved Ratio, Product, and Regression Estimators of Finite Population Mean When Using Minimum and Maximum Values

Manzoor Khan and Javid Shabbir
Department of Statistics, Quaid-i-Azam University, Islamabad 45320, Pakistan
Correspondence should be addressed to Javid Shabbir; [email protected]

Received 5 August 2013; Accepted 16 September 2013
Academic Editors: T. Robak and Y. Xia

Copyright © 2013 M. Khan and J. Shabbir. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Efficient estimation of the finite population mean is carried out by using auxiliary information meaningfully. In this paper we suggest some modified ratio, product, and regression type estimators that use the minimum and maximum values of the variables. Expressions for the biases and mean squared errors of the suggested estimators are derived up to the first order of approximation. The performances of the suggested estimators, relative to their usual counterparts, are studied, and improved performance is established. The gain in efficiency from making use of the maximum and minimum values is verified numerically.
1. Introduction

Supplementary information in the form of an auxiliary variable is routinely used for the estimation of the finite population mean of a study variable. The ratio and product estimators, due to Cochran [1] and Murthy [2], respectively, are classical examples in which information on an auxiliary variable is incorporated for improved estimation of the population mean of the study variable. When the correlation between the study variable (y) and the auxiliary variable (x) is positive, the ratio method of estimation is effective; when the correlation is negative, the product method of estimation is used. There have been many improvements and advancements in the construction of ratio, product, and regression estimators using auxiliary information. For recent details, see Haq et al. [3], Haq and Shabbir [4], Yadav and Kadilar [5], Kadilar and Cingi [6], and Koyuncu and Kadilar [7] and the references cited therein. The ratio method of estimation is at its best when the relationship between y and x is linear and the regression line passes through the origin; as the line departs from the origin, the efficiency of this method decreases. In practice, the condition that the regression line passes through the origin is rarely satisfied, and the regression estimator is then used for estimation of the population mean.
Let U = (U₁, U₂, ..., U_N) be a population of size N, and let (yᵢ, xᵢ) be the values of the study and auxiliary variables, respectively, on the i-th unit of the finite population. Let us assume that a simple random sample of size n is drawn without replacement from U for estimating the population mean \(\bar{Y} = \sum_{i=1}^{N} y_i / N\). It is further assumed that the population mean \(\bar{X} = \sum_{i=1}^{N} x_i / N\) of the auxiliary variable x is known. The minimum (x_min) and maximum (x_max) values of the auxiliary variable are also assumed to be known. The variance of the mean per unit estimator \(\bar{y} = \sum_{i=1}^{n} y_i / n\) is given by
\[
V(\bar{y}) = \lambda S_y^2, \tag{1}
\]
where \(\lambda = \frac{1}{n} - \frac{1}{N}\) and \(S_y^2 = \frac{1}{N-1} \sum_{i=1}^{N} \left(y_i - \bar{Y}\right)^2\).
Sometimes there exist unusually large (say y_max) and unusually small (say y_min) units in the population. The mean per unit estimator is very sensitive to these unusual observations, and as a result the population mean will be either underestimated (in case the sample contains y_min) or overestimated (in case the sample contains y_max). To overcome this situation, Sarndal [8] suggested the following unbiased estimator:
\[
\bar{y}_s =
\begin{cases}
\bar{y} + c & \text{if the sample contains } y_{\min} \text{ but not } y_{\max},\\
\bar{y} - c & \text{if the sample contains } y_{\max} \text{ but not } y_{\min},\\
\bar{y} & \text{for all other samples,}
\end{cases} \tag{2}
\]
where c is a constant. The variance of ȳ_s is given by
\[
V(\bar{y}_s) = \lambda S_y^2 - \frac{2\lambda n c}{N-1}\left(y_{\max} - y_{\min} - nc\right). \tag{3}
\]
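Sarndal's rule and the exactness of (3) are easy to check by enumerating every simple random sample of a small artificial population. A minimal sketch (the population values below are illustrative, not from the paper):

```python
from itertools import combinations
from statistics import mean

# Illustrative population with distinct unit values.
y = [7.9, 9.0, 10.2, 11.1, 12.4, 14.8]
N, n = len(y), 3
Y_bar = mean(y)
lam = 1 / n - 1 / N
S2y = sum((v - Y_bar) ** 2 for v in y) / (N - 1)
y_min, y_max = min(y), max(y)
c = (y_max - y_min) / (2 * n)  # the optimal choice c_opt discussed below

def y_s(sample):
    """Sarndal's adjusted sample mean, equation (2)."""
    has_min, has_max = y_min in sample, y_max in sample
    if has_min and not has_max:
        return mean(sample) + c
    if has_max and not has_min:
        return mean(sample) - c
    return mean(sample)

# Enumerate the whole sample space of simple random samples.
ests = [y_s(s) for s in combinations(y, n)]
emp_mean = mean(ests)
emp_var = mean((e - Y_bar) ** 2 for e in ests)  # exact design variance
formula = lam * S2y - 2 * lam * n * c / (N - 1) * (y_max - y_min - n * c)
print(emp_mean, Y_bar, emp_var, formula)
```

Both the unbiasedness of ȳ_s and formula (3) hold exactly (not merely to first order), so the enumerated variance agrees with the formula up to floating-point error.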
Further, V(ȳ_s) < V(ȳ) if 0 < c < (y_max − y_min)/n. For c_opt = (y_max − y_min)/2n, the variance of ȳ_s is given by
\[
V(\bar{y}_s)_{\mathrm{opt}} = V(\bar{y}) - \frac{\lambda \left(y_{\max} - y_{\min}\right)^2}{2(N-1)}, \tag{4}
\]
which is always smaller than V(ȳ).

The usual ratio and product estimators of the population mean Ȳ are given by
\[
\bar{y}_R = \bar{y}\left(\frac{\bar{X}}{\bar{x}}\right), \qquad
\bar{y}_P = \bar{y}\left(\frac{\bar{x}}{\bar{X}}\right), \tag{5}
\]
where \(\bar{y} = \sum_{i=1}^{n} y_i/n\) and \(\bar{x} = \sum_{i=1}^{n} x_i/n\) are the sample means of the variables y and x, respectively. The expressions for the biases (B(·)) and mean square errors (M(·)) of the conventional ratio and product estimators are given by
\[
B\left(\bar{y}_R\right) \cong \bar{Y}\lambda\left(C_x^2 - \rho_{yx} C_y C_x\right), \tag{6}
\]
\[
B\left(\bar{y}_P\right) \cong \bar{Y}\lambda \rho_{yx} C_y C_x, \tag{7}
\]
\[
M\left(\bar{y}_R\right) \cong \bar{Y}^2 \lambda\left(C_y^2 + C_x^2 - 2\rho_{yx} C_y C_x\right), \tag{8}
\]
\[
M\left(\bar{y}_P\right) \cong \bar{Y}^2 \lambda\left(C_y^2 + C_x^2 + 2\rho_{yx} C_y C_x\right), \tag{9}
\]
where C_y = S_y/Ȳ and C_x = S_x/X̄ are the coefficients of variation of y and x, respectively, ρ_yx = S_yx/(S_y S_x) is the correlation coefficient between y and x, and \(S_x^2 = \sum_{i=1}^{N}\left(x_i - \bar{X}\right)^2/(N-1)\) and \(S_{yx} = \sum_{i=1}^{N}\left(y_i - \bar{Y}\right)\left(x_i - \bar{X}\right)/(N-1)\) are the population variance and population covariance, respectively. The usual regression estimator is given by
\[
\bar{y}_{lr} = \bar{y} + b\left(\bar{X} - \bar{x}\right), \tag{10}
\]
where b is the sample regression coefficient. The variance of the estimator ȳ_lr is given by
\[
V\left(\bar{y}_{lr}\right) = \lambda S_y^2\left(1 - \rho_{yx}^2\right). \tag{11}
\]

2. Proposed Estimators

Motivated by Sarndal [8], we extend this idea to estimators which make use of auxiliary information for increased precision. It is well known that ratio and product estimators are used when y and x are positively and negatively correlated, respectively. We suggest an estimator for each case separately as follows.

Case 1 (positive correlation between y and x). When y and x are positively correlated, then with selection of a larger value of x a larger value of y is expected to be selected, and when a smaller value of x is selected, selection of a smaller value of y is expected. So we define the following estimators:
\[
\hat{\bar{Y}}_{RC} = \frac{\bar{y}_{c_{11}}}{\bar{x}_{c_{21}}}\,\bar{X} =
\begin{cases}
\dfrac{\bar{y} + c_1}{\bar{x} + c_2}\,\bar{X} & \text{if the sample contains } y_{\min} \text{ and } x_{\min},\\[1.2ex]
\dfrac{\bar{y} - c_1}{\bar{x} - c_2}\,\bar{X} & \text{if the sample contains } y_{\max} \text{ and } x_{\max},\\[1.2ex]
\dfrac{\bar{y}}{\bar{x}}\,\bar{X} & \text{for all other samples,}
\end{cases} \tag{12}
\]
and similarly
\[
\bar{y}_{lrC_1} = \bar{y}_{c_{11}} + b\left(\bar{X} - \bar{x}_{c_{21}}\right), \tag{13}
\]
where (ȳ_c11 = ȳ + c₁, x̄_c21 = x̄ + c₂) if the sample contains y_min and x_min; (ȳ_c11 = ȳ − c₁, x̄_c21 = x̄ − c₂) if the sample contains y_max and x_max; and (ȳ_c11 = ȳ, x̄_c21 = x̄) for all other combinations of samples.

Case 2 (negative correlation between y and x). When y and x are negatively correlated, then with selection of a larger value of x a smaller value of y is expected to be selected, and when a smaller value of x is selected, a larger value of y is expected to be selected. Keeping these points in view, the following estimators are suggested:
\[
\hat{\bar{Y}}_{PC} = \frac{\bar{y}_{c_{12}}\,\bar{x}_{c_{22}}}{\bar{X}} =
\begin{cases}
\dfrac{\left(\bar{y} + c_1\right)\left(\bar{x} - c_2\right)}{\bar{X}} & \text{if the sample contains } y_{\min} \text{ and } x_{\max},\\[1.2ex]
\dfrac{\left(\bar{y} - c_1\right)\left(\bar{x} + c_2\right)}{\bar{X}} & \text{if the sample contains } y_{\max} \text{ and } x_{\min},\\[1.2ex]
\dfrac{\bar{y}\,\bar{x}}{\bar{X}} & \text{for all other samples,}
\end{cases} \tag{14}
\]
and similarly
\[
\bar{y}_{lrC_2} = \bar{y}_{c_{12}} + b\left(\bar{X} - \bar{x}_{c_{22}}\right), \tag{15}
\]
where (ȳ_c12 = ȳ + c₁, x̄_c22 = x̄ − c₂) if the sample contains y_min and x_max; (ȳ_c12 = ȳ − c₁, x̄_c22 = x̄ + c₂) if the sample contains y_max and x_min; and (ȳ_c12 = ȳ, x̄_c22 = x̄) for all other combinations of samples.

To find the bias and mean square error of these suggested estimators, we first prove two theorems which will be used in subsequent derivations.
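Before turning to the theorems, note that the piecewise rules in (12) translate directly into code. A minimal sketch (variable names are ours; following the sample-space partition used later in Theorem 1, a sample containing both extreme units is left unadjusted, and the numerical figures are Population-1-style values used only for illustration):

```python
from statistics import mean

def ratio_rc(ys, xs, Xbar, extremes, c1, c2):
    """Proposed ratio-type estimator, equation (12).

    `extremes` = (y_min, y_max, x_min, x_max). The adjustment mirrors
    Sarndal's rule: +c's if the sample holds the (y_min, x_min) unit only,
    -c's if it holds the (y_max, x_max) unit only, none otherwise."""
    y_min, y_max, x_min, x_max = extremes
    yb, xb = mean(ys), mean(xs)
    has_min = y_min in ys and x_min in xs
    has_max = y_max in ys and x_max in xs
    if has_min and not has_max:
        return (yb + c1) / (xb + c2) * Xbar
    if has_max and not has_min:
        return (yb - c1) / (xb - c2) * Xbar
    return yb / xb * Xbar

# A sample containing no extreme unit: the estimator reduces to the
# usual ratio estimator ybar * (Xbar / xbar).
ys, xs = [10.2, 11.1, 12.4], [9.8, 10.9, 12.0]
est = ratio_rc(ys, xs, Xbar=10.41111, extremes=(7.9, 14.8, 6.5, 14.5),
               c1=6.9 / 24, c2=8.0 / 24)
print(est, mean(ys) / mean(xs) * 10.41111)
```

The product-type estimator (14) is coded the same way, with the sign of the x̄-adjustment reversed.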
Theorem 1. If a sample of size n units is drawn from a population of size N units, then the covariance between ȳ_c11 and x̄_c21, when y and x are positively correlated, is given by
\[
\operatorname{Cov}\left(\bar{y}_{c_{11}}, \bar{x}_{c_{21}}\right) = \lambda S_{yx} - \frac{n\lambda}{N-1}\left[c_1\left(x_{\max} - x_{\min}\right) + c_2\left(y_{\max} - y_{\min}\right) - 2nc_1c_2\right]. \tag{16}
\]

Proof. Let us assume that n units have been drawn without replacement from a population of size N, and let S_n denote the sample space. We partition the whole sample space into three mutually exclusive and collectively exhaustive sets S₁, S₂, and S₃, such that S_n = S₁ ∪ S₂ ∪ S₃. Here S₁ is the set of all possible samples which contain y_min and x_min, S₂ consists of all samples which contain y_max and x_max, and S₃ = S_n − S₁ − S₂. The numbers of sample points in S₁, S₂, and S₃ are \(\binom{N-2}{n-1}\), \(\binom{N-2}{n-1}\), and \(\binom{N}{n} - 2\binom{N-2}{n-1}\), respectively. By the definition of covariance, we have
\[
\begin{aligned}
\operatorname{Cov}\left(\bar{y}_{c_{11}}, \bar{x}_{c_{21}}\right)
&= \binom{N}{n}^{-1}\Bigg[\sum_{s\in S_1}\left(\bar{y}+c_1-\bar{Y}\right)\left(\bar{x}+c_2-\bar{X}\right)
+ \sum_{s\in S_2}\left(\bar{y}-c_1-\bar{Y}\right)\left(\bar{x}-c_2-\bar{X}\right)
+ \sum_{s\in S_3}\left(\bar{y}-\bar{Y}\right)\left(\bar{x}-\bar{X}\right)\Bigg]\\
&= \binom{N}{n}^{-1}\Bigg[\sum_{s\in S_n}\left(\bar{y}-\bar{Y}\right)\left(\bar{x}-\bar{X}\right)
- c_1\Bigg\{\sum_{s\in S_2}\left(\bar{x}-\bar{X}\right) - \sum_{s\in S_1}\left(\bar{x}-\bar{X}\right)\Bigg\}
- c_2\Bigg\{\sum_{s\in S_2}\left(\bar{y}-\bar{Y}\right) - \sum_{s\in S_1}\left(\bar{y}-\bar{Y}\right)\Bigg\}
+ c_1c_2\Bigg\{\sum_{s\in S_1} + \sum_{s\in S_2}\Bigg\}\Bigg]\\
&= \binom{N}{n}^{-1}\left[\sum_{s\in S_n}\left(\bar{y}-\bar{Y}\right)\left(\bar{x}-\bar{X}\right)\right]
- \binom{N}{n}^{-1}\binom{N-2}{n-1}\left[c_1\left(\frac{x_{\max}-x_{\min}}{n}\right)
+ c_2\left(\frac{y_{\max}-y_{\min}}{n}\right) - 2c_1c_2\right]\\
&= \operatorname{Cov}\left(\bar{y},\bar{x}\right) - \frac{n\lambda}{N-1}\left[c_1\left(x_{\max}-x_{\min}\right) + c_2\left(y_{\max}-y_{\min}\right) - 2nc_1c_2\right]\\
&= \lambda S_{yx} - \frac{n\lambda}{N-1}\left[c_1\left(x_{\max}-x_{\min}\right) + c_2\left(y_{\max}-y_{\min}\right) - 2nc_1c_2\right],
\end{aligned} \tag{17}
\]
where we have used \(\sum_{s\in S_2}\left(\bar{x}-\bar{X}\right) - \sum_{s\in S_1}\left(\bar{x}-\bar{X}\right) = \binom{N-2}{n-1}\left(x_{\max}-x_{\min}\right)/n\) (and the analogous identity in y) together with \(\binom{N-2}{n-1}\big/\binom{N}{n} = \frac{n\left(N-n\right)}{N\left(N-1\right)}\).

Theorem 2. If a sample of size n units is drawn from a population of size N units, then the covariance between ȳ_c12 and x̄_c22, when y and x are negatively correlated, is given by
\[
\operatorname{Cov}\left(\bar{y}_{c_{12}}, \bar{x}_{c_{22}}\right) = \lambda S_{yx} + \frac{n\lambda}{N-1}\left[c_1\left(x_{\max} - x_{\min}\right) + c_2\left(y_{\max} - y_{\min}\right) - 2nc_1c_2\right]. \tag{18}
\]
Theorem 2 can be proved along the same lines as Theorem 1.

We define the following relative error terms. Let e₀ = (ȳ_c1 − Ȳ)/Ȳ and e₁ = (x̄_c2 − X̄)/X̄, such that
\[
E\left(e_0\right) = E\left(e_1\right) = 0, \tag{19}
\]
\[
E\left(e_0^2\right) = \frac{\lambda}{\bar{Y}^2}\left[S_y^2 - \frac{2nc_1}{N-1}\left(y_{\max} - y_{\min} - nc_1\right)\right], \tag{20}
\]
\[
E\left(e_1^2\right) = \frac{\lambda}{\bar{X}^2}\left[S_x^2 - \frac{2nc_2}{N-1}\left(x_{\max} - x_{\min} - nc_2\right)\right], \tag{21}
\]
\[
E\left(e_0 e_1\right) = \frac{\lambda}{\bar{Y}\bar{X}}\left[S_{yx} - \frac{n}{N-1}\left\{c_2\left(y_{\max} - y_{\min}\right) + c_1\left(x_{\max} - x_{\min}\right) - 2nc_1c_2\right\}\right]. \tag{22}
\]
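Because the partition argument behind Theorem 1 is exact rather than first order, (16) can be verified by enumerating every sample of a small artificial population. A sketch under our own illustrative values: the (y_min, x_min) pair sits on one unit, the (y_max, x_max) pair on another, and a sample containing both extreme units is left unadjusted, as the sample-point counts in the proof assume:

```python
from itertools import combinations
from statistics import mean

# Small positively correlated population: unit 0 holds (y_min, x_min),
# the last unit holds (y_max, x_max).
pop = [(7.9, 6.5), (9.0, 8.1), (10.2, 9.8), (11.1, 10.9), (12.4, 12.0), (14.8, 14.5)]
N, n = len(pop), 3
Ybar = mean(y for y, _ in pop)
Xbar = mean(x for _, x in pop)
lam = 1 / n - 1 / N
ymin, ymax, xmin, xmax = 7.9, 14.8, 6.5, 14.5
c1, c2 = (ymax - ymin) / (2 * n), (xmax - xmin) / (2 * n)

pairs = []
for s in combinations(pop, n):
    yb = mean(y for y, _ in s)
    xb = mean(x for _, x in s)
    has_min, has_max = pop[0] in s, pop[-1] in s
    if has_min and not has_max:      # sample in S1
        yb, xb = yb + c1, xb + c2
    elif has_max and not has_min:    # sample in S2
        yb, xb = yb - c1, xb - c2
    pairs.append((yb, xb))

# Empirical covariance over the uniform sample space.
emp_cov = mean((yb - Ybar) * (xb - Xbar) for yb, xb in pairs)

Syx = sum((y - Ybar) * (x - Xbar) for y, x in pop) / (N - 1)
theorem = lam * Syx - n * lam / (N - 1) * (
    c1 * (xmax - xmin) + c2 * (ymax - ymin) - 2 * n * c1 * c2)
print(emp_cov, theorem)
```

The two numbers agree up to floating-point error, confirming (16) on this toy population.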
Expressing Ŷ_RC in terms of e's, we have
\[
\hat{\bar{Y}}_{RC} = \bar{Y}\left(1 + e_0\right)\left(1 + e_1\right)^{-1}. \tag{23}
\]
Expanding and rearranging the right-hand side of (23), to the first degree of approximation, we have
\[
\left(\hat{\bar{Y}}_{RC} - \bar{Y}\right) \cong \bar{Y}\left(e_0 - e_1 - e_0 e_1 + e_1^2\right). \tag{24}
\]
Using (24), the bias of Ŷ_RC is given by
\[
B\left(\hat{\bar{Y}}_{RC}\right) \cong \frac{\lambda}{\bar{X}}\left[R\left(S_x^2 - \frac{2nc_2}{N-1}\left(x_{\max} - x_{\min} - nc_2\right)\right) - \left\{S_{yx} - \frac{n}{N-1}\left(c_2\left(y_{\max} - y_{\min}\right) + c_1\left(x_{\max} - x_{\min}\right) - 2nc_1c_2\right)\right\}\right], \tag{25}
\]
where R = Ȳ/X̄.

Using (24), the mean square error of Ŷ_RC, to the first degree of approximation, is given by
\[
M\left(\hat{\bar{Y}}_{RC}\right) \cong \lambda\left[S_y^2 - \frac{2nc_1}{N-1}\left(y_{\max} - y_{\min} - nc_1\right) + R^2\left\{S_x^2 - \frac{2nc_2}{N-1}\left(x_{\max} - x_{\min} - nc_2\right)\right\} - 2R\left\{S_{yx} - \frac{n}{N-1}\left(c_2\left(y_{\max} - y_{\min}\right) + c_1\left(x_{\max} - x_{\min}\right) - 2nc_1c_2\right)\right\}\right] \tag{26}
\]
or
\[
M\left(\hat{\bar{Y}}_{RC}\right) \cong M\left(\bar{y}_R\right) - \frac{2\lambda n}{N-1}\left[\left(c_1 - Rc_2\right)\left\{\left(y_{\max} - y_{\min}\right) - R\left(x_{\max} - x_{\min}\right) - n\left(c_1 - Rc_2\right)\right\}\right]. \tag{27}
\]
To find the optimum values of c₁ and c₂, we differentiate (27) with respect to c₁ and c₂:
\[
\frac{\partial M\left(\hat{\bar{Y}}_{RC}\right)}{\partial c_1} = 0 \;\Rightarrow\; \left(y_{\max} - y_{\min}\right) - R\left(x_{\max} - x_{\min}\right) - 2n\left(c_1 - Rc_2\right) = 0,
\qquad
\frac{\partial M\left(\hat{\bar{Y}}_{RC}\right)}{\partial c_2} = 0 \;\Rightarrow\; \left(y_{\max} - y_{\min}\right) - R\left(x_{\max} - x_{\min}\right) - 2n\left(c_1 - Rc_2\right) = 0. \tag{28}
\]
Here we have one equation in two unknowns, so a unique solution is not possible. We therefore let c₂ = (x_max − x_min)/2n, and then c₁ = (y_max − y_min)/2n. For these optimum values of c₁ and c₂, the optimum mean square error of Ŷ_RC is given by
\[
M\left(\hat{\bar{Y}}_{RC}\right)_{\mathrm{opt}} \cong M\left(\bar{y}_R\right) - \frac{\lambda}{2(N-1)}\left[\left(y_{\max} - y_{\min}\right) - R\left(x_{\max} - x_{\min}\right)\right]^2. \tag{29}
\]
Similarly, the bias, mean square error, and optimum mean square error of Ŷ_PC are, respectively, given by
\[
B\left(\hat{\bar{Y}}_{PC}\right) \cong \frac{\lambda}{\bar{X}}\left[S_{yx} - \frac{n}{N-1}\left\{c_2\left(y_{\max} - y_{\min}\right) + c_1\left(x_{\max} - x_{\min}\right) - 2nc_1c_2\right\}\right], \tag{30}
\]
\[
M\left(\hat{\bar{Y}}_{PC}\right) \cong M\left(\bar{y}_P\right) - \frac{2\lambda n}{N-1}\left[\left(c_1 + Rc_2\right)\left\{\left(y_{\max} - y_{\min}\right) + R\left(x_{\max} - x_{\min}\right) - n\left(c_1 + Rc_2\right)\right\}\right], \tag{31}
\]
and, for the optimum values of c₁ and c₂,
\[
M\left(\hat{\bar{Y}}_{PC}\right)_{\mathrm{opt}} \cong M\left(\bar{y}_P\right) - \frac{\lambda}{2(N-1)}\left[\left(y_{\max} - y_{\min}\right) - R\left(x_{\max} - x_{\min}\right)\right]^2. \tag{32}
\]
The variance of the regression estimator ȳ_lrC1 in the case of positive correlation is given by
\[
V\left(\bar{y}_{lrC_1}\right) = V\left(\bar{y}_{lr}\right) - \frac{2\lambda n}{N-1}\left[\left(c_1 - \beta c_2\right)\left\{\left(y_{\max} - y_{\min}\right) - \beta\left(x_{\max} - x_{\min}\right) - 2n\left(c_1 - \beta c_2\right)\right\}\right], \tag{33}
\]
where β = ρ_yx(S_y/S_x) is the population regression coefficient of y on x. For c₂ = (x_max − x_min)/2n and c₁ = (y_max − y_min)/2n, the optimum variance of ȳ_lrC1 is given by
\[
V\left(\bar{y}_{lrC_1}\right)_{\mathrm{opt}} = V\left(\bar{y}_{lr}\right) - \frac{\lambda}{2(N-1)}\left[\left(y_{\max} - y_{\min}\right) - \beta\left(x_{\max} - x_{\min}\right)\right]^2. \tag{34}
\]
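The optimum variance (34) is exact once the sample regression coefficient b is fixed at the population value β = S_yx/S_x² (replacing b by β is the first-order step); the three moment formulas it combines hold exactly. A sketch that checks (34) by enumeration on the same kind of illustrative toy population used above:

```python
from itertools import combinations
from statistics import mean

pop = [(7.9, 6.5), (9.0, 8.1), (10.2, 9.8), (11.1, 10.9), (12.4, 12.0), (14.8, 14.5)]
N, n = len(pop), 3
Ybar = mean(y for y, _ in pop)
Xbar = mean(x for _, x in pop)
lam = 1 / n - 1 / N
S2y = sum((y - Ybar) ** 2 for y, _ in pop) / (N - 1)
S2x = sum((x - Xbar) ** 2 for _, x in pop) / (N - 1)
Syx = sum((y - Ybar) * (x - Xbar) for y, x in pop) / (N - 1)
beta = Syx / S2x
rho2 = Syx ** 2 / (S2y * S2x)
ymin, ymax, xmin, xmax = 7.9, 14.8, 6.5, 14.5
c1, c2 = (ymax - ymin) / (2 * n), (xmax - xmin) / (2 * n)  # optimum constants

ests = []
for s in combinations(pop, n):
    yb = mean(y for y, _ in s)
    xb = mean(x for _, x in s)
    has_min, has_max = pop[0] in s, pop[-1] in s
    if has_min and not has_max:
        yb, xb = yb + c1, xb + c2
    elif has_max and not has_min:
        yb, xb = yb - c1, xb - c2
    ests.append(yb + beta * (Xbar - xb))  # regression estimator, b fixed at beta

emp_var = mean((e - Ybar) ** 2 for e in ests)
V_lr = lam * S2y * (1 - rho2)
V_opt = V_lr - lam / (2 * (N - 1)) * ((ymax - ymin) - beta * (xmax - xmin)) ** 2
print(emp_var, V_opt)
```

The enumerated variance matches (34) up to floating-point error.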
For negative correlation, the variance of the regression estimator ȳ_lrC2 is given by
\[
V\left(\bar{y}_{lrC_2}\right) = V\left(\bar{y}_{lr}\right) - \frac{2\lambda n}{N-1}\left[\left(c_1 + \beta c_2\right)\left\{\left(y_{\max} - y_{\min}\right) + \beta\left(x_{\max} - x_{\min}\right) - 2n\left(c_1 + \beta c_2\right)\right\}\right]. \tag{35}
\]
For c₂ = (x_max − x_min)/2n and c₁ = (y_max − y_min)/2n, the optimum variance of ȳ_lrC2 is given by
\[
V\left(\bar{y}_{lrC_2}\right)_{\mathrm{opt}} = V\left(\bar{y}_{lr}\right) - \frac{\lambda}{2(N-1)}\left[\left(y_{\max} - y_{\min}\right) + \beta\left(x_{\max} - x_{\min}\right)\right]^2. \tag{36}
\]
So in general we can write V(ȳ_lrC)_opt as
\[
V\left(\bar{y}_{lrC}\right)_{\mathrm{opt}} = V\left(\bar{y}_{lr}\right) - \frac{\lambda}{2(N-1)}\left[\left(y_{\max} - y_{\min}\right) - \beta\left(x_{\max} - x_{\min}\right)\right]^2. \tag{37}
\]

3. Comparison

The conditions under which the suggested estimators Ŷ_RC, Ŷ_PC, and ȳ_lrC perform better than the usual mean per unit estimator and their usual counterparts are given below.

(a) Comparison of the Proposed Ratio Type Estimator. The proposed estimator Ŷ_RC will perform better than

(i) the mean per unit estimator (by (1) and (27)) if V(ȳ) − M(Ŷ_RC) > 0 or if
\[
\rho_{yx} > \frac{R S_x}{2 S_y} - \frac{n}{R S_y S_x (N-1)}\left[\left(c_1 - Rc_2\right)\left\{\left(y_{\max} - y_{\min}\right) - R\left(x_{\max} - x_{\min}\right) - 2n\left(c_1 - Rc_2\right)\right\}\right]; \tag{38}
\]

(ii) the usual ratio estimator (by (8) and (27)) if M(ȳ_R) − M(Ŷ_RC) > 0 or if
\[
\min\left[Rc_2,\; Rc_2 - \left\{\frac{R\left(x_{\max} - x_{\min}\right) - \left(y_{\max} - y_{\min}\right)}{n}\right\}\right] < c_1 < \max\left[Rc_2,\; Rc_2 - \left\{\frac{R\left(x_{\max} - x_{\min}\right) - \left(y_{\max} - y_{\min}\right)}{n}\right\}\right]. \tag{39}
\]

(b) Comparison of the Proposed Product Type Estimator. The proposed product type estimator will perform better than

(iii) the mean per unit estimator if V(ȳ) − M(Ŷ_PC) > 0 (by (1) and (31)) or if
\[
\rho_{yx} < -\frac{R S_x}{2 S_y} + \frac{n}{R S_y S_x (N-1)}\left[\left(c_1 + Rc_2\right)\left\{\left(y_{\max} - y_{\min}\right) + R\left(x_{\max} - x_{\min}\right) - 2n\left(c_1 + Rc_2\right)\right\}\right]; \tag{40}
\]

(iv) the usual product estimator if M(ȳ_P) − M(Ŷ_PC) > 0 (by (9) and (31)) or if
\[
\min\left[-Rc_2,\; -Rc_2 + \left(\frac{R\left(x_{\max} - x_{\min}\right) + \left(y_{\max} - y_{\min}\right)}{n}\right)\right] < c_1 < \max\left[-Rc_2,\; -Rc_2 + \left(\frac{R\left(x_{\max} - x_{\min}\right) + \left(y_{\max} - y_{\min}\right)}{n}\right)\right]. \tag{41}
\]

(c) Comparison of the Proposed Regression Type Estimator. The proposed regression type estimator (positive correlation) will perform better than

(v) the mean per unit estimator if V(ȳ) − V(ȳ_lrC1) > 0 (by (1) and (33)) or if
\[
\rho_{yx}^2 > -\frac{2n}{S_y^2 (N-1)}\left[\left(c_1 - \beta c_2\right)\left\{\left(y_{\max} - y_{\min}\right) - \beta\left(x_{\max} - x_{\min}\right) - 2n\left(c_1 - \beta c_2\right)\right\}\right]; \tag{42}
\]

(vi) the usual regression estimator if V(ȳ_lr) − V(ȳ_lrC1) > 0 (by (11) and (33)) or if
\[
\min\left[\beta c_2,\; \beta c_2 - \left\{\frac{\beta\left(x_{\max} - x_{\min}\right) - \left(y_{\max} - y_{\min}\right)}{n}\right\}\right] < c_1 < \max\left[\beta c_2,\; \beta c_2 - \left\{\frac{\beta\left(x_{\max} - x_{\min}\right) - \left(y_{\max} - y_{\min}\right)}{n}\right\}\right]. \tag{43}
\]

The proposed regression type estimator (negative correlation) will perform better than

(vii) the mean per unit estimator if V(ȳ) − V(ȳ_lrC2) > 0 (by (1) and (35)) or if
\[
\rho_{yx}^2 > -\frac{2n}{S_y^2 (N-1)}\left[\left(c_1 + \beta c_2\right)\left\{\left(y_{\max} - y_{\min}\right) + \beta\left(x_{\max} - x_{\min}\right) - 2n\left(c_1 - \beta c_2\right)\right\}\right]; \tag{44}
\]
Table 1: Numerical values of conditions (38)–(45).

Condition | Population 1           | Population 2           | Population 3             | Population 4
(38)      | 0.99 > 0.592           | −0.667 > 0.2529∗       | 0.980 > 0.478            | 0.981 > 0.525
(39)      | 0.214 < 0.287 < 0.360  | 0.442 < 1.125 < 1.807  | 22.23 < 26.16 < 30.094   | 22.92 < 24.5 < 26.07
(40)      | 0.99 < −0.592∗         | −0.667 < −0.253        | 0.980 < −0.4783∗         | 0.981 < −0.525∗
(41)      | −0.360 < 0.287 < 0.93  | −0.442 < 1.125 < 2.692 | −22.238 < 26.16 < 74.57  | −26.07 < 24.5 < 75.07
(42)      | 0.980 > 0.000          | 0.459 > 0.000          | 0.9605 > 0.000           | 0.963 > 0.000
(43)      | 0.273 < 0.287 < 0.301  | −0.773 < 1.125 < −0.592∗ | 22.78 < 26.16 < 29.549 | 24.36 < 24.5 < 24.63
(44)      | 0.980 > −1.913         | 0.459 > 0.272          | 0.9605 > −0.862          | 0.963 > −1.88
(45)      | −0.30 < 0.287 < −0.15∗ | 0.592 < 1.125 < 4.026  | −30.63 < 26.16 < −22.78∗ | −24.36 < 24.5 < −21.2∗

Note: ∗ indicates that the condition is not satisfied.
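The entries of Table 1 can be reproduced from the population summaries given in Section 4. As a sketch (variable names are ours), the bounds of conditions (39) and (43) for Population 1 come out close to the tabulated 0.214 < 0.287 < 0.360 and 0.273 < 0.287 < 0.301:

```python
# Population 1 summary statistics (Section 4).
N, n = 27, 12
Ybar, Xbar = 11.25185, 10.41111
ymax, ymin, xmax, xmin = 14.8, 7.9, 14.5, 6.5
S2y, S2x, Syx = 4.103, 4.931, 4.454

R = Ybar / Xbar
beta = Syx / S2x
c1 = (ymax - ymin) / (2 * n)   # optimum constants
c2 = (xmax - xmin) / (2 * n)

# Condition (39): bounds Rc2 and Rc2 - (R(xmax - xmin) - (ymax - ymin))/n.
b1 = R * c2
b2 = R * c2 - (R * (xmax - xmin) - (ymax - ymin)) / n
low39, high39 = min(b1, b2), max(b1, b2)

# Condition (43): same shape with beta in place of R.
b3 = beta * c2
b4 = beta * c2 - (beta * (xmax - xmin) - (ymax - ymin)) / n
low43, high43 = min(b3, b4), max(b3, b4)

print(low39, c1, high39)   # Table 1, row (39), Population 1
print(low43, c1, high43)   # Table 1, row (43), Population 1
```

Both intervals contain c₁, so conditions (39) and (43) are satisfied for Population 1, as Table 1 reports.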
Table 2: PRE of different estimators with respect to ȳ.

Estimator | Population 1 | Population 2 | Population 3 | Population 4
ȳ_R       | 1742.030     | 51.514       | 2501.29      | 2442.965
ȳ_P       | 21.053       | 175.210      | 26.383       | 23.998
ȳ_lr      | 5115.818     | 184.699      | 2536.131     | 2763.749
Ŷ_RC      | 2319.293     | 54.325       | 2940.086     | 2502.692
Ŷ_PC      | 21.117       | 212.639      | 26.424       | 24.004
ȳ_lrC     | 5249.673     | 208.254      | 2856.83      | 2764.336
(viii) the usual regression estimator if V(ȳ_lr) − V(ȳ_lrC2) > 0 (by (11) and (35)) or if
\[
\min\left[-\beta c_2,\; -\beta c_2 + \left\{\frac{\beta\left(x_{\max} - x_{\min}\right) - \left(y_{\max} - y_{\min}\right)}{n}\right\}\right] < c_1 < \max\left[-\beta c_2,\; -\beta c_2 + \left\{\frac{\beta\left(x_{\max} - x_{\min}\right) - \left(y_{\max} - y_{\min}\right)}{n}\right\}\right]. \tag{45}
\]

(d) Comparison of the Suggested Estimators for Optimum Values of c₁ and c₂ with the Usual Estimators. For the optimum values of c₁ and c₂, the proposed estimators will always perform better than the usual mean per unit estimator and their usual counterparts (the ratio, product, and regression estimators).

4. Empirical Study

We consider the following datasets for numerical comparison.

Population 1 (Singh and Mangat [9, page 193]). Let y be the milk yield in kg after the new food and let x be the yield in kg before the new food. N = 27, n = 12, X̄ = 10.41111, Ȳ = 11.25185, y_max = 14.8, y_min = 7.9, x_max = 14.5, x_min = 6.5, S_y² = 4.103, S_x² = 4.931, S_xy = 4.454, and ρ_yx = 0.990.

Population 2 (Singh and Mangat [9, page 195]). Let y be the weekly time (hours) spent in nonacademic activities and let x be the overall grade point average (4.0 basis). N = 36, n = 12, X̄ = 2.798333, Ȳ = 14.77778, y_max = 33, y_min = 6, x_max = 3.82, x_min = 1.81, S_y² = 38.178, S_x² = 0.3504, S_xy = −2.477, and ρ_yx = −0.6772.

Population 3 (Murthy [10, page 399]). Let y be the area under wheat crop in 1964 and let x be the area under wheat crop in 1963. N = 34, n = 12, X̄ = 208.882, Ȳ = 199.441, y_max = 634, y_min = 6, x_max = 564, x_min = 5, S_y² = 22564.56, S_x² = 22652.05, S_xy = 22158.05, and ρ_yx = 0.980.

Population 4 (Cochran [11, page 152]). Let y be the population size in 1930 (in 1000s) and let x be the population size in 1920 (in 1000s). N = 49, n = 12, X̄ = 103.1429, Ȳ = 127.7959, y_max = 634, y_min = 46, x_max = 507, x_min = 2, S_y² = 15158.83, S_x² = 10900.42, S_xy = 12619.78, and ρ_yx = 0.98.

The conditional values and results are given in Tables 1 and 2, respectively. For the percentage relative efficiency (PRE), we use the expression
\[
\mathrm{PRE}\left(\bar{y}_i, \bar{y}\right) = \frac{V\left(\bar{y}\right)}{V\left(\bar{y}_i\right) \text{ or } M\left(\bar{y}_i\right)} \times 100 \tag{46}
\]
for i = R, P, RC, PC, lr, lrC.
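The PRE values in Table 2 follow from (1), (8), (11), (29), and (37) applied to the quoted summary statistics; small deviations from the tabulated values are due to rounding in those summaries. A sketch for Population 1:

```python
# Population 1 summary statistics (Section 4).
N, n = 27, 12
Ybar, Xbar = 11.25185, 10.41111
ymax, ymin, xmax, xmin = 14.8, 7.9, 14.5, 6.5
S2y, S2x, Syx = 4.103, 4.931, 4.454

lam = 1 / n - 1 / N
R = Ybar / Xbar
beta = Syx / S2x
rho2 = Syx ** 2 / (S2y * S2x)
Dy, Dx = ymax - ymin, xmax - xmin

V_y = lam * S2y                                              # equation (1)
M_R = lam * (S2y + R ** 2 * S2x - 2 * R * Syx)               # equation (8)
V_lr = lam * S2y * (1 - rho2)                                # equation (11)
M_RC = M_R - lam / (2 * (N - 1)) * (Dy - R * Dx) ** 2        # equation (29)
V_lrC = V_lr - lam / (2 * (N - 1)) * (Dy - beta * Dx) ** 2   # equation (37)

def pre(m):
    """Percentage relative efficiency with respect to the sample mean."""
    return V_y / m * 100

print(pre(M_R), pre(M_RC), pre(V_lr), pre(V_lrC))
```

The printed values land within about half a percent of the Table 2 entries 1742.030, 2319.293, 5115.818, and 5249.673 for Population 1.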
5. Conclusion

From Table 2, it is observed that the ratio estimator Ŷ_RC performs better than ȳ_R in Populations 1, 3, and 4 because of the positive correlation. The product estimator Ŷ_PC is better than ȳ_P only in Population 2 because of the negative correlation. The regression estimator ȳ_lrC outperforms all other considered estimators and is preferable.
Acknowledgments The authors are thankful to the learned referees for their valuable suggestions and helpful comments in revising the manuscript.
References

[1] W. G. Cochran, "The estimation of the yields of cereal experiments by sampling for the ratio of grain to total produce," The Journal of Agricultural Science, vol. 30, pp. 262–275, 1940.
[2] M. N. Murthy, "Product method of estimation," Sankhya A, vol. 16, pp. 69–74, 1964.
[3] A. Haq, J. Shabbir, and S. Gupta, "Improved exponential ratio type estimators in stratified sampling," Pakistan Journal of Statistics, vol. 29, no. 1, pp. 13–31, 2013.
[4] A. Haq and J. Shabbir, "Improved family of ratio estimators in simple and stratified random sampling," Communications in Statistics: Theory and Methods, vol. 42, no. 5, pp. 782–799, 2013.
[5] S. K. Yadav and C. Kadilar, "Improved class of ratio and product estimators," Applied Mathematics and Computation, vol. 219, no. 22, pp. 10726–10731, 2013.
[6] C. Kadilar and H. Cingi, "Improvement in estimating the population mean in simple random sampling," Applied Mathematics Letters, vol. 19, no. 1, pp. 75–79, 2006.
[7] N. Koyuncu and C. Kadilar, "Ratio and product estimators in stratified random sampling," Journal of Statistical Planning and Inference, vol. 139, no. 8, pp. 2552–2558, 2009.
[8] C. E. Sarndal, "Sample survey theory vs. general statistical theory: estimation of the population mean," International Statistical Review, vol. 40, pp. 1–12, 1972.
[9] R. Singh and N. S. Mangat, Elements of Survey Sampling, Kluwer Academic Publishers, London, UK, 1996.
[10] M. N. Murthy, Sampling Theory and Methods, Statistical Publishing Society, Calcutta, India, 1967.
[11] W. G. Cochran, Sampling Techniques, John Wiley & Sons, New York, NY, USA, 3rd edition, 1977.