Khan SpringerPlus (2016)5:86 DOI 10.1186/s40064-016-1717-4
Open Access
RESEARCH
A ratio chain‑type exponential estimator for finite population mean using double sampling Mursala Khan1,2* *Correspondence:
[email protected] 1 Department of Mathematics, COMSATS Institute of Information Technology, Abbottabad 22060, Pakistan Full list of author information is available at the end of the article
Abstract In this article, we have proposed a ratio chain-type exponential estimator for finite population mean of the study variable under double sampling scheme using auxiliary variables. The large sample properties of the suggested strategy are derived up to first order, of approximation, and its competence conditions are carried out under which the suggested estimator is performed better than the other existing estimators discussed in the literature. An empirical study shows that the suggested strategy is more efficient than the other relevant competing estimators under two phase sampling scheme. Keywords: Double sampling, Study variable, Bias, Auxiliary variable, Mean squared-error, Estimator, Efficiency
Introduction and literature review To increase the precision of estimators for population mean of the study variable under double sampling design, a lot of works have been done in the field of sample survey and when the study variable is strongly connected with the auxiliary variables the precision of the estimators can be more and more. Using the knowledge of the auxiliary variables several authors have proposed different estimation technique for finite population mean of the study variable, Sukhatme (1962), have developed a general ratio-type estimator. Chand (1975), have suggested two chain ratio-type estimators to estimate the population mean using two auxiliary variables (Kiregyera 1980, 1984; Srivnstava et al. 1990; Bahl and Tuteja 1991; Srivastava 1970; Cochran 1977; Singh et al. 2006, 2007, 2011; Dash and Mishra 2011; Singh and Choudhury 2012; Khare et al. 2013; Khare and Rehman 2013) etc. Let us consider a finite population of size N of different units U = {U1 , U2 , U3 , . . . , UN } . Let y and x be the study and the auxiliary variable with corresponding values yi and xi respectively for i-th unit i = {1, 2, 3, . . . , N } is defined on a finite population U. N N ¯ Let Y¯ = 1 N i=1 yi and X = 1 N i=1 xi be the corresponding population means of the study as well as auxiliary variable respectively. Also let N N ¯ 2 be the corresponding ¯ 2 and Sx2 = 1 N Sy2 = 1 N i=1 xi − X i=1 yi − Y © 2016 Khan. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http:// creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
Khan SpringerPlus (2016)5:86
Page 2 of 9
population variances of the study as well as auxiliary variable respectively and letCy and Cx be the coefficient of variation of the study as well as auxiliary variable respectively, and ρyx be the correlation coefficient between x and y. Let y and x be the study and the auxiliary variable with corresponding values yi and n xi respectively for i-th unit i = {1, 2, 3, . . . , n} in the sample. Let y¯ = 1 n i=1 yi and n x¯ = 1 n x be the corresponding unbiased sample means of the study as well as i=1 i auxiliary variable respectively. 2 n n ¯ )2 be the ¯ and sx2 = 1 n − 1 Also let sy2 = 1 n − 1 i=1 yi − y i=1 (xi − x corresponding unbiased sample variances of the study as well as auxiliary variable respectively. Let Syx, Syz and Sxz be the co-variances between their respective subscripts respecs tively. Similarly byx = syx2 is the corresponding sample regression coefficient y on x based x S on a sample of size n. Also Cy = Y¯y , Cx = SX¯x and Cz = SZ¯z are the coefficients of variation of the study and auxiliary variables respectively. The usual unbiased estimator to estimate the population mean of the study variable is n
y¯ 0 =
1 yi n
(1)
i=1
The variance of the estimator y¯ up to first order of approximation is, given by
V y¯ 0 = f1 Y¯ 2 Cy2
(2)
The usual ratio and regression estimators in two phase sampling and their mean square error are, given as follows
y¯ 1 =
y¯ ′ x¯ x¯
(3)
y¯ 2 = y¯ + byx x¯ ′ − x¯
(4)
MSE y¯ 1 = Y¯ 2 f1 Cy2 + f3 Cx2 − 2ρyx Cy Cx
(5)
2 2 Var y¯ 2 = Sy2 f1 1 − ρyx + f2 ρyx
(6)
The mean squared error and variance are given below
where f1 = n1 − N1 , f2 = n1′ − N1 and f3 = n1 − n1′ . Chand (1975), proposed the following chain ratio-type estimator in double sampling by incorporating the knowledge of two auxiliary variables, the suggested estimator is, given by y¯ 3 =
y¯ x¯ ′ ¯ Z x¯ z¯ ′
The mean square error of the suggested estimator is, given as
(7)
Khan SpringerPlus (2016)5:86
Page 3 of 9
� � 2 2 f C + f − 2ρ C C C 1 3 yx y x y x � � � � MSE y¯ 3 = Y¯ 2 2 + f2 Cz − 2ρyz Cy Cz
(8)
Kiregyera (1984), suggested the following chain-type exponential estimators in two phase sampling, the suggested estimators are given as
y¯ 4 =
y¯ ′ x¯ + bxz Z¯ − z¯ ′ x¯
y¯ 5 = y¯ + byx x¯ ′ − x¯ − bxz Z¯ − z¯ ′
(9)
(10)
The mean square errors of the suggested estimators, up to first order of approximation are, given as follows
MSE y¯ 4 = Y¯ 2
f1 Cy2 + f3 Cx Cx − 2ρyx Cy +f2 ρxz Cx ρxz Cx − 2ρyz Cy
f ρ ρ ρ ρ − 2ρ 2 yx xz yx xz yz MSE y¯ 5 = Y¯ 2 Cy2 2 + f1 − f3 ρyx
(11)
(12)
Searls (1964), proposed an estimation procedure for population mean using known knowledge of the coefficient of variation of the auxiliary variable
y¯ ∗ = a¯y
(13)
var(¯y∗ ) = (1 − B)f1 Y¯ 2 Cy2
(14)
−1 where a = 1 + f1 Y¯ 2 Cy2 and B = f1 Y¯ 2 Cy2 Khare and Rehman (2013), have proposed improved chain type estimators for population mean using auxiliary information, the suggested estimators are given by y¯ 6 =
y¯ ∗ ′ x¯ + b Z¯ − z¯ ′ x¯
y¯ 7 = y¯ ∗ + b1 x¯ ′ − x¯ − b2 Z¯ − z¯ ′
(15)
(16)
where b, b1 and b2 are constants. The mean square errors of the suggested estimators, are, given by
+ b2 R2 f2 Cz2 − 2bR(1 − B)f2 ρyz Cy Cz � � � � MSE y¯ 6 = Y¯ 2 + f3 Cx Cx − 2(1 − B)ρyx Cy
+ (1 − B)f1 Cy2
and
(17)
Khan SpringerPlus (2016)5:86
Page 4 of 9
� � ¯ x b1 XC ¯ x − 2(1 − B)Y¯ ρyx Cy b1 f3 XC � � � � ¯ 2 Cz b1 b2 ZC ¯ z − 2Y¯ (1 − B)ρyz Cy MSE y¯ 7 = + b1 b2 Zf
¯2
+Y
(18)
(1 − B)f1 Cy2
where the optimum values of b, b1 and b2 are bopt = M b2opt = b1opt .
(1−B)Cyz , RCz2
b1opt =
−k2 ± k22 −4k1 k3 2k1
and
Y¯ (1−B)Cyz ¯ 1 Cyx , R = Z¯ X¯ and M= , k1 = f3 b12 X¯ 2 Cx2, k2 = f3 (B − 1)Y¯ Xb ¯ z2 ZC ¯ 2 Cz M ZC ¯ z − Y¯ (1 − B)ρyz Cy . k2 = M Zf Singh et al. (2013), recommended a class of exponential chain ratio-product type estimator for estimating population mean using two auxiliary variables, as Also
y¯ 7 = y¯ α exp
¯
x¯ ′ zZ¯ ′ − x¯ ¯
x¯ ′ zZ¯ ′ + x¯
+ β exp
¯
x¯ − x¯ ′ zZ¯ ′ ¯
x¯ + x¯ ′ zZ¯ ′
(19)
where α and β are suitably chosen constants, such that α + β = 1 The minimum mean square error of the suggested estimator is given as follows
2 ρyx f3 Cx + ρyz f2 Cz 2 2 MSE y¯ 8 = Y¯ Cy f1 − f3 Cx2 + f2 Cz2
where the optimum value of α is αopt =
1 2
+
(20)
(ρyx f3 Cxy +ρyz f2 Czy ) . (f3 Cx2 +f2 Cz2 )
The proposed estimator Let us consider a finite population U = {U1 , U2 , U3 , . . . , UN } of size N units. To estimate the population mean Y¯ of the variable of interest y taking values yi, in the existence of two auxiliary variables say x and z taking values xi and zi respectively for the ith unit Ui. We assume that there is high correlation between y and x as compared to the correlation between y and z (i.e. ρyx > ρyz > 0). When the population X¯ of the auxiliary variable x is unknown, but information on the other cheaply auxiliary variable z closely related to x but compared to x remotely to y, is available for all the units in a population. In such a situation we use two phase sampling. In the two phase sampling scheme a large initial sample of size n′ (n′ < N ) is drawn from the population U by using simple random sample without replacement sampling (SRSWOR) scheme and measure x and z to estimate X¯ . In the second phase, we draw a sample (subsample) of size n from first phase sample of size n′, i.e. (n < n′) by using SRSWOR or directly from the population U and observed the study variable y. Under the given probability sampling, we have proposed a chain-ratio-type exponential estimator for finite population mean of the study variable y, given by ′ Z¯ − z¯ ′ x¯ − x¯ k1 ′ − x¯ y¯ m = y¯ ′ + k2 x¯ exp x¯ + x¯ Z¯ + z¯ ′
(21)
where k1 and k2 are the unknown constants, whose value is to be determined for optimality conditions.
Khan SpringerPlus (2016)5:86
Page 5 of 9
To obtain the properties of the proposed estimator we define the following relative error terms and their expectations. ′ ¯ ¯ ¯ ¯ ¯ Let e0 = y¯ −Y¯ Y , e1 = x¯ −X¯ X , e1′ = x¯ X−¯ X , e2 = z¯ −Z¯ Z and e2′ = z¯ −Z¯ Z . ′ Such that E(ei ) = E ei = 0. For i = 0, 1, 2.
E e02 = f1 Cy2 , E e12 = f1 Cx2 , E e22 = f1 Cz2 , E e1′2 = E e1 e1′ = f2 Cx2 , E e0 e2′ = f2 Cyz ,
E(e0 e1 ) = f1 Cyx , E e0 e1′ = f2 Cyx , E(e0 e2 ) = f1 Cyz ,
E e1 e2′ = E e1′ e2 = E e1′ e2′ = f2 Cxz ,
E e2′2 = E e2 e2′ = f2 Cz2 , E(e1 e2 ) = f1 Cxz .
Equation (21) can be further simplified as given by
′ K1 ′2 ′ e1 − e1 3e e y¯ m = Y¯ (1 + e0 ) + k2 X¯ 1 + e1′ 1 − 2 + 2 − (1 + e1 ) 2 + e1′ + e1 2 8
(22)
Further simplifying, up to order one � � � �� � k1 e1′ − e1 k1 e1′ + e1 e1′ − e1 � � − 1 + e0 + e2′ e1′ e2′ 3e2′2 2 4 ′ + k2 X¯ e1 − y¯ m = Y¯ − + � ′ � � ′ �2 2 2 8 + k1 (k1 − 1) e1 − e1 + k1 e0 e1 − e1 4 2
(23)
Expanding the right hand side of the Eq. (23), up to first order of approximation and subtracting Y¯ from both sides we get � � � �� � k1 e1′ − e1 k1 e1′ + e1 e1′ − e1 � � e + − 0 e2′ e1′ e2′ 3e2′2 2 4 ′ ¯ X e + k − y¯ m − Y¯ = Y¯ − + � ′ � � ′ �2 2 1 2 2 8 + k1 (k1 − 1) e1 − e1 + k1 e0 e1 − e1 4 2
(24)
On squaring and taking expectation on both sides of (24), we get mean square error up to first order of approximation, given as
�
MSE y¯ m
�
� � ¯ 2 f2 2Cyx − Cyz Y¯ 2 f1 Cy2 − Y¯ 2 k1 f3 Cyx + Y¯ Xk � � 1 = Y¯ 2 Cy2 1 + k22 X¯ 2 f2 Cz2 + 4Cx2 − 4Cxz + k12 Y¯ 2 f3 Cx2 4 4
(25)
Khan SpringerPlus (2016)5:86
Page 6 of 9
Differentiating Eq. (25) w.r.t to k1 and k2 we get the optimum values of k1 and 2ρ C k2 respectively as given by, where the optimum values are k1opt = Cxyx y and −2Y¯ (2C −C ) k2opt = X¯ 4C 2 +Cyx2 −4Cyz respectively. ( x z xz ) On substituting the optimum value of k1 and k2 in Eq. (25), we get the minimum mean square error of the proposed estimators, given as follows
¯2
MSE y¯ m min = Y
Cy2
2 f1 − f3 ρxy
2 f2 2ρxy Cx − ρyz Cz − 4Cx (Cx − ρxz Cz ) + Cz2
(26)
Efficiency comparison In this section, we have obtained some conditions by comparing the mean square errors of the estimators under which the proposed estimator performs better than the other existing estimators. The proposed estimator y¯ m is more efficient if the given conditions are satisfied. (i) By (26) and (2), MSE y¯ m min ≤ MSE y¯ 0 if, 2 f2 2ρyx Cx − ρyz Cz 2 ≥ 0. f3 ρxy + 2 4Cx + Cz2 − 4Cxz (ii) By (26) and (5), MSE y¯ m min ≤ MSE y¯ 1 if, � � 2 2 2 Cx + ρxy Cy − 2ρyx Cy Cx � �2 Cy2 f2 2ρxy Cx − ρyz Cz ≥ 0. � � + 4Cx (Cx − ρxz Cz ) + Cz2 (iii) By (26) and (6), MSE y¯ m min ≤ Var y¯ 2 if, 2 f2 2ρyx Cx − ρyz Cz ≥ 0. 4Cx2 + Cz2 − 4Cxz
(iv) By (26) and (8), MSE y¯ m min ≤ MSE y¯ 3 if,
� � 2 2 f3 Cy ρxy + Cx2 − 2Cyx � � � �2 � � ≥ 0. 2Cxy − Cyz � + Cz2 − 2Cyz + f2 � 2 4Cx + Cz2 − 4Cxz (v) By (26) and (11), MSE y¯ m min ≤ MSE y¯ 4 if, � � 2 f3 Cy2 ρxy + Cx2 − 2Cyx � �2 2Cxy − Cyz ≥ 0. � + +f � 2 4Cx + Cz2 − 4Cxz 2 � � Cx ρxz Cx ρxz − 2ρyz Cy
Khan SpringerPlus (2016)5:86
Page 7 of 9
(vi) By (26) and (12), MSE y¯ m min ≤ MSE y¯ 5 if, 2 2ρxy Cx − ρyz Cz ≥ 0. ρyx ρxz ρyx ρxz − 2ρyz + 2 4 Cx − Cxz + Cz2 (vii) By (26) and (17), MSE y¯ m min ≤ MSE y¯ 6 if, � � 2 f3 Cy2 ρxy + Cx2 − 2(1 − B)Cyx − Bf1 Cy2 � �2 � 2Cxy − Cyz ≥ 0. � +f 4Cx2 + Cz2 − 4Cxz 2 2 2 2 + b R Cz − 2bR(1 − B)Cyz (viii) By (26) and (18), MSE y¯ m min ≤ MSE y¯ 7 if, � � � �2 2 f3 Sy2 ρxy + b12 Sx2 − 2(1 − B)b1 Syx − f1 Sy2 � �2 2 ¯ Y 2Cxy − Cyz � � ≥ 0. + f2 4Cx2 + Cz2 − 4Cxz � � + b b b b ZS ¯ 2 − 2(1 − B)S 1 2
1 2
z
yz
By (26) and (20), MSE y¯ m min ≤ MSE y¯ 8 if, � �2 f2 2ρxy Cx − ρyz Cz 2 � − f3 ρxy + � 4Cx (Cx − ρxz Cz ) + Cz2 � ≥ 0. �2 ρyx f3 Cx + ρyz f2 Cz � � f3 Cx2 + f2 Cz2
(ix)
Numerical study To verify the theoretical conditions under the efficiency comparison numerically, we have taken a real data set from the literature. The description and the necessary data statistics of the data is, given below. For percent relative efficiency we use the following formula. Var y¯ 0 ∗ 100, PRE = MSE y¯ j
for j = 1, 2, 3, 4, 5, 6, 7, 8 and m. Population
The data from the population of 100 records of resale of homes from February 15 to April 30, 1993 from the files maintained by the Albuquerque Board of realtors on selling price ($) as a study variable y square feet of living space as an auxiliary variable x and annual taxes ($) as an additional variable z have been taken. The numerical values of the parameters of the population are given as.
Y¯ = 1093.41,
X¯ = 1697.44,
Z¯ = 801.58,
Khan SpringerPlus (2016)5:86
Page 8 of 9
Table 1 Mean square errors (MSE’s) of the estimators Estimators
Mean squared errors (MSE’s) n′ = 60, n = 30
n′ = 65, n = 35
n′ = 70, n = 40
y¯0 y¯1
3583.66
2852.30
2303.78
1691.43
1355.15
1087.35
y¯2 y¯3
1690.47
1354.39
1086.73
1490.62
1192.96
958.26
y¯4 y¯5
1282.01
1024.46
824.15
1278.03
1021.27
821.59
y¯6 y¯7
1275.16
1018.37
819.03
1274.17
1017.59
818.40
y¯8 y¯m
1462.31
1181.19
950.65
1136.84
921.83
748.56
Table 2 Percent relative efficiency (PRE’s) of the proposed estimator with the existing estimators Estimators
Percent relative efficiency (PRE’s) n′ = 60, n = 30
n′ = 65, n = 35
n′ = 70, n = 40
y¯0 y¯1
100.00
100.00
100.00
211.87
210.48
211.87
y¯2 y¯3
211.99
210.60
211.99
240.41
239.09
240.41
y¯4 y¯5
279.54
278.42
279.54
280.41
279.29
280.41
y¯6 y¯7
281.04
280.09
281.28
281.26
280.30
281.50
y¯8 y¯m
244.79
303.05
376.54
314.87
388.31
478.19
Cy2 = 0.1285,
Cx2 = 0.0994,
Sy = 391.90,
Sx = 535.01,
ρyx = 0.84,
ρyz = 0.64,
Cz2 = 0.1560, Sz = 316.62,
ρxz = 0.86.
Conclusion On the basis of mean square errors and the percent relative efficiencies of the estimators as shown in Tables 1 and 2, it has been observed that the performance of the proposed estimator is better than the other relevant existing estimators discussed in the literature of survey sampling, which reveals the usefulness of suggested method in practice and would work very well in practical surveys. Author details 1 Department of Mathematics, COMSATS Institute of Information Technology, Abbottabad 22060, Pakistan. 2 Carl Duisberg Center, Jägerstraße 64, 10117 Mitte‑Berlin, Germany.
Khan SpringerPlus (2016)5:86
Acknowledgements This work was written by me, and we will pay the publication charges of this Journal. Competing interests The author declares that there is no competing interests. Received: 10 August 2015 Accepted: 13 January 2016
References Bahl S, Tuteja RK (1991) Ratio and product type exponential estimator. Inf Optim Sci 12:159–163 Chand L (1975) Some ratio type estimator based on two or more auxiliary variables. Unpublished Ph.D. dissertation, Lowa State University, Ames, Lowa Cochran WG (1977) Sampling techniques. Wiley, New-York Dash PR, Mishra G (2011) An improved class of estimators in two-phase sampling using two auxiliary variables. Commun Stat Theory Methods 40:4347–4352 Khare BB, Rehman HU (2013) Improved chain type estimators for population mean using two auxiliary variables and double sampling scheme. Int J Appl Stat Probab 1(3):82–87 Khare BB, Srivastava U, Kumar K (2013) A generalized chain ratio in regression estimator for population mean using two auxiliary characters in sample survey. J Sci Res Banaras Hindu Univ Varanasi 57:147–153 Kiregyera B (1980) A chain ratio-type estimator in finite population mean in double sampling using two auxiliary variables. Metrika 27:217–223 Kiregyera B (1984) Regression-type estimator using two auxiliary variables and model of double sampling from finite populations. Metrika 31:215–223 Searls DT (1964) The utilization of a known coefficient of variation in the estimation procedure. J Am Stat Assoc 59:1225–1226 Singh BK, Choudhury S (2012) Exponential chain ratio and product-type estimators for finite population mean under double sampling scheme. Globel J Sci Front Res Math Decis Sci 12(6):2249–4626 Singh BK, Choudhury S, Kalita D (2013) A class of exponential chain ratio-product type estimator with two auxiliary variables under double sampling scheme. Electron J Appl Stat Anal 6(2):166–174 Singh R, Chuhan P, Swan N (2007) Families of estimators for estimating population mean using known correlation coefficient in two phase sampling. Stat Transit 8(1):89–96 Singh R, Chuhan P, Swan N, Smarandache F (2011) Improved exponential estimator for population variance using two auxiliary variables. Ital J Pure Appl Math 28:101–108 Singh HP, Singh S, Kim JM (2006) General families of chain ratio type estimators of the population mean with known coefficient of variation of the second auxiliary variable in two phase sampling. J Korean Stat Soc 35(4):377–395 Srivastava SK (1970) A two phase estimator in sampling surveys. Aust J Stat 12:23–27 Srivnstava SR, Khare BB, Srivastava SR (1990) A generalized chain ratio estimator for mean of finite population. J Indian Soc Agric Stat 42(1):108–117 Sukhatme BV (1962) Some ratio type estimators in two-phase sampling. J Am Stat Assoc 57:628–632
Page 9 of 9