This article was downloaded by: [Hacettepe University] On: 30 July 2015, At: 02:51 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: 5 Howick Place, London, SW1P 1WG
Communications in Statistics - Theory and Methods Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/lsta20
Separate Ratio Estimators for the Population Variance in Stratified Random Sampling
Gamze Özel (a), Hülya Çingi (a) & Merve Oğuz (a)
(a) Department of Statistics, Hacettepe University, Ankara, Turkey
Accepted author version posted online: 04 Apr 2014. Published online: 05 Nov 2014.
To cite this article: Gamze Özel, Hülya Çingi & Merve Oğuz (2014) Separate Ratio Estimators for the Population Variance in Stratified Random Sampling, Communications in Statistics - Theory and Methods, 43:22, 4766-4779, DOI: 10.1080/03610926.2012.729642
To link to this article: http://dx.doi.org/10.1080/03610926.2012.729642
Communications in Statistics—Theory and Methods, 43: 4766–4779, 2014 Copyright © Taylor & Francis Group, LLC ISSN: 0361-0926 print / 1532-415X online DOI: 10.1080/03610926.2012.729642
Separate Ratio Estimators for the Population Variance in Stratified Random Sampling
GAMZE ÖZEL, HÜLYA ÇINGI, AND MERVE OĞUZ
Department of Statistics, Hacettepe University, Ankara, Turkey

We propose separate ratio estimators for the population variance in stratified random sampling. We obtain their mean square error (MSE) equations and compare the efficiency of the proposed estimators with one another. From these comparisons, we find the conditions under which each proposed estimator is more efficient than the others. We show that the proposed classes of estimators are more efficient than the usual unbiased estimator, and that separate ratio estimators are more efficient than combined ratio estimators for the population variance. The theoretical results are supported by a numerical illustration with original data. A simulation study is also carried out to investigate the empirical performance of the estimators.

Keywords Separate ratio estimators; Variance estimator; Efficiency; Stratified random sampling.

Mathematics Subject Classification 62D05.
1. Introduction

In the sampling theory literature, many estimators for the population variance in simple random sampling have been proposed by Das and Tripathi (1978), Isaki (1983), Singh et al. (1988), Arcos et al. (2005), Kadilar and Cingi (2007), Unyazici (2008), and Unyazici and Cingi (2008). Recently, a new hybrid-type estimator for the population variance in simple random sampling was proposed by Gupta and Shabbir (2008). Ratio estimators take advantage of the correlation between the auxiliary variable x and the study variable y. When information is available on an auxiliary variable that is positively correlated with the study variable, the ratio estimator is suitable for estimating the population variance. Isaki (1983) suggested a ratio estimator for the population variance as
$$ s_{ratio}^2 = \frac{s_y^2}{s_x^2}\, S_x^2, \tag{1.1} $$
Received September 27, 2011; Accepted September 7, 2012. Address correspondence to Gamze Özel, Department of Statistics, Hacettepe University, Ankara 06800, Turkey; E-mail: [email protected]
where $s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i-\bar{y})^2$ and $s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2$ are the unbiased estimators of the population variances $S_y^2 = \frac{1}{N-1}\sum_{i=1}^{N}(y_i-\bar{Y})^2$ and $S_x^2 = \frac{1}{N-1}\sum_{i=1}^{N}(x_i-\bar{X})^2$, respectively. Prasad and Singh (1990) obtained the MSE of this estimator as follows:

$$ MSE(s_{ratio}^2) \cong \lambda S_y^4\left[\beta_y + \beta_x - 2\theta\right], \tag{1.2} $$

where $\lambda = 1/n$, $\beta_y = \mu_{40}/\mu_{20}^2$ is the kurtosis for the population of the study variable, $\beta_x = \mu_{04}/\mu_{02}^2$ is the kurtosis for the population of the auxiliary variable, $\mu_{rs} = \frac{1}{N}\sum_{j=1}^{N}(y_j-\bar{Y})^r(x_j-\bar{X})^s$, and $\theta = \mu_{22}/(\mu_{20}\mu_{02})$. Here, N is the number of units in the population, n is the sample size, and $\bar{X}$ and $\bar{Y}$ are the population means of the auxiliary variable x and the study variable y, respectively.

Stratified Random Sampling (STRS) is an alternative method to simple random sampling for the selection of a sample from the population. In the STRS, the aim is again to represent the population by a sample in the most accurate way. The population variance of the study variable in the STRS was obtained by Kadilar and Cingi (2006a) as follows:

$$ (N-1)S_{st,y}^2 = \sum_{h=1}^{\ell}\sum_{i=1}^{N_h}(y_{hi}-\bar{Y})^2 = \sum_{h=1}^{\ell}\sum_{i=1}^{N_h}\left[(y_{hi}-\bar{Y}_h)^2 + (\bar{Y}_h-\bar{Y})^2\right], \tag{1.3} $$
where $\ell$ is the number of strata, $N_h$ is the population size in stratum h, and $\bar{Y}_h$ is the population mean of the study variable in stratum h. Assuming that $N \cong N-1$ and $N_h \cong N_h-1$, we can write (1.3) as

$$ N S_{st,y}^2 \cong \sum_{h=1}^{\ell} N_h S_{yh}^2 + \sum_{h=1}^{\ell} N_h(\bar{Y}_h-\bar{Y})^2, $$

$$ S_{st,y}^2 \cong \sum_{h=1}^{\ell}\omega_h S_{yh}^2 + \sum_{h=1}^{\ell}\omega_h(\bar{Y}_h-\bar{Y})^2, \tag{1.4} $$
where $\omega_h = N_h/N$ is the stratum weight and $S_{yh}^2$ is the population variance of the study variable in stratum h.

Various ratio estimators can be improved in the STRS by using the auxiliary variable and parameters related to it, such as the coefficient of variation, the kurtosis, and the coefficient of correlation. Information about the auxiliary variable also increases the precision of the estimate of the population variance in the STRS. The mean square error (MSE) equations for the population variance have also attracted some attention in the literature. Kadilar and Cingi (2003, 2006b) obtained the MSE equation of the ratio estimator for the population mean and variance by means of the auxiliary variable. The combined ratio estimator for the population variance was obtained by Kadilar and Cingi (2006a) as follows:

$$ s_{rc1}^2 = \frac{s_{st,y}^2}{s_{st,x}^2}\, S_x^2, \tag{1.5} $$

where $s_{st,x}^2 = \sum_{h=1}^{\ell}\omega_h s_{xh}^2 + \sum_{h=1}^{\ell}\omega_h(\bar{x}_h-\bar{x}_{st})^2$ is the estimator of the population variance of the auxiliary variable in STRS. Here, $s_{xh}^2$ is the sample variance of the auxiliary variable in stratum h. The MSE of the estimator given in (1.5) is found as follows:
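$$
\begin{aligned}
MSE(s_{rc1}^2) \cong \frac{S_x^4}{\delta^2}\Bigg\{ & \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^4(\beta_{yh}-1) + 4\sum_{h=1}^{\ell}\omega_h^2(\bar{Y}_h-\bar{Y})\Big[\lambda_h\mu_{30h} - \sum_{h=1}^{\ell}\omega_h\lambda_h\mu_{30h}\Big] \\
& + 4\sum_{h=1}^{\ell}\omega_h^2(\bar{Y}_h-\bar{Y})^2\Big[\lambda_h S_{yh}^2 - 2\omega_h\lambda_h S_{yh}^2 + \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^2\Big] \\
& - 2\frac{\upsilon}{\delta}\sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^2 S_{xh}^2(\theta_h-1) \\
& - 4\frac{\upsilon}{\delta}\sum_{h=1}^{\ell}\omega_h^2(\bar{X}_h-\bar{X})\Big[\lambda_h\mu_{21h} - \sum_{h=1}^{\ell}\omega_h\lambda_h\mu_{21h} - \frac{\upsilon}{\delta}\lambda_h\mu_{03h} + \frac{\upsilon}{\delta}\sum_{h=1}^{\ell}\omega_h\lambda_h\mu_{03h}\Big] \\
& - 4\frac{\upsilon}{\delta}\sum_{h=1}^{\ell}\omega_h^2(\bar{Y}_h-\bar{Y})\Big[\lambda_h\mu_{12h} - \sum_{h=1}^{\ell}\omega_h\lambda_h\mu_{21h}\Big] + \frac{\upsilon^2}{\delta^2}\sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{xh}^4(\beta_{xh}-1) \\
& - 8\frac{\upsilon}{\delta}\sum_{h=1}^{\ell}\omega_h^2(\bar{X}_h-\bar{X})(\bar{Y}_h-\bar{Y})\Big[\lambda_h S_{yxh} - 2\omega_h\lambda_h S_{yxh} + \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yxh}\Big] \\
& + 4\frac{\upsilon^2}{\delta^2}\sum_{h=1}^{\ell}\omega_h^2(\bar{X}_h-\bar{X})^2\Big[\lambda_h S_{xh}^2 - 2\omega_h\lambda_h S_{xh}^2 + \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{xh}^2\Big]\Bigg\},
\end{aligned}
\tag{1.6}
$$

where $\theta_h = \dfrac{\mu_{22h}}{\mu_{20h}\mu_{02h}}$, $\upsilon = \sum_{h=1}^{\ell}\omega_h S_{yh}^2 + \sum_{h=1}^{\ell}\omega_h(\bar{Y}_h-\bar{Y})^2$, and $\delta = \sum_{h=1}^{\ell}\omega_h S_{xh}^2 + \sum_{h=1}^{\ell}\omega_h(\bar{X}_h-\bar{X})^2$.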
Here, $\beta_{xh}$ is the population kurtosis of the auxiliary variable in stratum h, $\beta_{yh}$ is the population kurtosis of the study variable in stratum h, and $S_{yxh}$ is the population covariance between the auxiliary variable and the study variable in stratum h. Furthermore, the following variance estimators in STRS are given by

$$ s_{rc2}^2 = \frac{s_{st,y}^2}{s_{st,x}^2 + C_x}\,(S_x^2 + C_x), \tag{1.7} $$

$$ s_{rc3}^2 = \frac{s_{st,y}^2}{s_{st,x}^2 + \beta_x}\,(S_x^2 + \beta_x), \tag{1.8} $$

$$ s_{rc4}^2 = \frac{s_{st,y}^2}{s_{st,x}^2\beta_x + C_x}\,(S_x^2\beta_x + C_x), \tag{1.9} $$

$$ s_{rc5}^2 = \frac{s_{st,y}^2}{s_{st,x}^2 C_x + \beta_x}\,(S_x^2 C_x + \beta_x), \tag{1.10} $$

whose MSE equations have the same form as (1.6). As seen from (1.6), the MSE equation of the combined ratio estimators has a complicated form, and new ratio estimators may give better results for the population variance in the STRS.
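The combined estimators (1.5) and (1.7)–(1.10) share one pattern: a stratified variance estimate of y is rescaled by a ratio built from the auxiliary variable. The following is a minimal sketch of that pattern, not the authors' code. The function names are hypothetical, the stratified mean is taken as the weighted mean of the stratum means, and the population-level coefficient of variation is assumed to be $C_x = S_x/\bar{X}$ by analogy with the stratum-level definition; the numbers are the summary statistics reported in Tables 1 and 3 of the numerical example.

```python
import math


def stratified_variance(weights, stratum_vars, stratum_means):
    """Sample analogue of (1.4): within-stratum part plus between-stratum part."""
    overall_mean = sum(w * m for w, m in zip(weights, stratum_means))
    within = sum(w * v for w, v in zip(weights, stratum_vars))
    between = sum(w * (m - overall_mean) ** 2 for w, m in zip(weights, stratum_means))
    return within + between


def combined_ratio_estimate(s2_st_y, s2_st_x, S2_x, a=1.0, b=0.0):
    """Common form of (1.5) and (1.7)-(1.10): s2_st_y / (a*s2_st_x + b) * (a*S2_x + b)."""
    return s2_st_y / (a * s2_st_x + b) * (a * S2_x + b)


# Summary statistics from Tables 1 and 3.
weights = [0.33, 0.33, 0.34]
x_means, y_means = [2.477, 3.857, 4.122], [1.676, 2.093, 3.130]
s2_x, s2_y = [0.023, 0.047, 0.043], [0.130, 0.147, 0.187]
S2_x, beta_x, X_bar = 0.518, 3.697, 3.455
C_x = math.sqrt(S2_x) / X_bar  # ASSUMPTION: population-level C_x = S_x / X_bar

s2_st_y = stratified_variance(weights, s2_y, y_means)
s2_st_x = stratified_variance(weights, s2_x, x_means)

estimates = {
    "rc1": combined_ratio_estimate(s2_st_y, s2_st_x, S2_x),                   # (1.5)
    "rc2": combined_ratio_estimate(s2_st_y, s2_st_x, S2_x, b=C_x),            # (1.7)
    "rc3": combined_ratio_estimate(s2_st_y, s2_st_x, S2_x, b=beta_x),         # (1.8)
    "rc4": combined_ratio_estimate(s2_st_y, s2_st_x, S2_x, a=beta_x, b=C_x),  # (1.9)
    "rc5": combined_ratio_estimate(s2_st_y, s2_st_x, S2_x, a=C_x, b=beta_x),  # (1.10)
}
for name, value in estimates.items():
    print(name, round(value, 4))
```

Writing the five estimators through a single pair of constants (a, b) makes it easy to see that they differ only in which known parameters of the auxiliary variable enter the ratio.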
Singh and Vishwakarma (2008a) also derived estimators for the population variance using auxiliary information in STRS. On the other hand, separate ratio estimators for the population variance in the STRS have not been studied. Furthermore, the absence of separate ratio estimators for the population variance hinders their use in STRS theory itself and in its applications in ecology, seismology, etc. (Cingi and Kadilar, 2009). Consequently, since the results are sparse, the aim of this study is to derive new separate ratio estimators for the population variance in the STRS. We also examine the behavior of the estimators of variance for the separate ratio estimators in the STRS.

Sampling methods are important for assessing natural resource abundance. Natural populations in ecology (forestry, fisheries, and wildlife) are extremely large; consequently, sampling methods have to be used to characterize those populations. As pointed out by Robinson et al. (1999), STRS offers an opportunity to integrate auxiliary information in the forest context. The population considered here consists of flowers of the Asteraceae, which grow in three different regions in Ankara; it is extremely large, and it is not possible to investigate the whole population. Hence, in this study, samples from the Asteraceae are selected using the STRS, and the ratio estimators for the population variance are obtained.

The rest of this article is organized as follows. The MSE equations of the new estimators for the population variance in the STRS are given in Sec. 2 using different population information on the auxiliary variable. The efficiencies based on the MSE equations are compared theoretically in Sec. 3. This theoretical comparison is then illustrated numerically, and a simulation study is presented, in Sec. 4. Some concluding remarks are given in Sec. 5.
2. Suggested Estimators

For estimating the population variance $S_y^2$ in the STRS, separate ratio estimators can be given in the following way. Consider a finite population $U = \{u_1, u_2, \ldots, u_N\}$ of size N. Let the population be stratified into $\ell$ strata, with the hth stratum containing $N_h$ units, $h = 1, 2, \ldots, \ell$, such that $\sum_{h=1}^{\ell} N_h = N$. A simple random sample of size $n_h$ is drawn without replacement from the hth stratum such that $\sum_{h=1}^{\ell} n_h = n$. Let $(y_{hi}, x_{hi})$ denote the observed values of the study and the auxiliary variable on the ith unit of the hth stratum, where $i = 1, 2, \ldots, N_h$ and $h = 1, 2, \ldots, \ell$. Moreover, let $S_{yh}^2$ be the population variance of Y in stratum h, and let $\omega_h = N_h/N$ be the stratum weight. Similar expressions for the auxiliary variable X can also be defined. Motivated by Isaki (1983), we propose the following separate ratio variance estimators in the STRS:

$$ s_{pr1}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{s_{xh}^2}\,S_{xh}^2, \tag{2.1} $$

$$ s_{pr2}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{s_{xh}^2 + C_{xh}}\,(S_{xh}^2 + C_{xh}), \tag{2.2} $$

$$ s_{pr3}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{s_{xh}^2 + \beta_{xh}}\,(S_{xh}^2 + \beta_{xh}), \tag{2.3} $$
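$$ s_{pr4}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{s_{xh}^2 + \rho_{xyh}}\,(S_{xh}^2 + \rho_{xyh}), \tag{2.4} $$

$$ s_{pr5}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{s_{xh}^2 C_{xh} + \beta_{xh}}\,(S_{xh}^2 C_{xh} + \beta_{xh}), \tag{2.5} $$

$$ s_{pr6}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{s_{xh}^2 C_{xh} + \rho_{xyh}}\,(S_{xh}^2 C_{xh} + \rho_{xyh}), \tag{2.6} $$

$$ s_{pr7}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{s_{xh}^2\beta_{xh} + C_{xh}}\,(S_{xh}^2\beta_{xh} + C_{xh}), \tag{2.7} $$

$$ s_{pr8}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{s_{xh}^2\beta_{xh} + \rho_{xyh}}\,(S_{xh}^2\beta_{xh} + \rho_{xyh}), \tag{2.8} $$

$$ s_{pr9}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{s_{xh}^2\rho_{xyh} + C_{xh}}\,(S_{xh}^2\rho_{xyh} + C_{xh}), \tag{2.9} $$

$$ s_{pr10}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{s_{xh}^2\rho_{xyh} + \beta_{xh}}\,(S_{xh}^2\rho_{xyh} + \beta_{xh}), \tag{2.10} $$

$$ s_{pr11}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{\alpha(s_{xh}^2 + C_{xh}) + (1-\alpha)(S_{xh}^2 + C_{xh})}\,(S_{xh}^2 + C_{xh}), \tag{2.11} $$

$$ s_{pr12}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{\alpha(s_{xh}^2 + \beta_{xh}) + (1-\alpha)(S_{xh}^2 + \beta_{xh})}\,(S_{xh}^2 + \beta_{xh}), \tag{2.12} $$

$$ s_{pr13}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{\alpha(s_{xh}^2 + \rho_{xyh}) + (1-\alpha)(S_{xh}^2 + \rho_{xyh})}\,(S_{xh}^2 + \rho_{xyh}), \tag{2.13} $$

$$ s_{pr14}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{\alpha(s_{xh}^2 C_{xh} + \beta_{xh}) + (1-\alpha)(S_{xh}^2 C_{xh} + \beta_{xh})}\,(S_{xh}^2 C_{xh} + \beta_{xh}), \tag{2.14} $$

$$ s_{pr15}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{\alpha(s_{xh}^2 C_{xh} + \rho_{xyh}) + (1-\alpha)(S_{xh}^2 C_{xh} + \rho_{xyh})}\,(S_{xh}^2 C_{xh} + \rho_{xyh}), \tag{2.15} $$

$$ s_{pr16}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{\alpha(s_{xh}^2\beta_{xh} + C_{xh}) + (1-\alpha)(S_{xh}^2\beta_{xh} + C_{xh})}\,(S_{xh}^2\beta_{xh} + C_{xh}), \tag{2.16} $$

$$ s_{pr17}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{\alpha(s_{xh}^2\beta_{xh} + \rho_{xyh}) + (1-\alpha)(S_{xh}^2\beta_{xh} + \rho_{xyh})}\,(S_{xh}^2\beta_{xh} + \rho_{xyh}), \tag{2.17} $$

$$ s_{pr18}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{\alpha(s_{xh}^2\rho_{xyh} + C_{xh}) + (1-\alpha)(S_{xh}^2\rho_{xyh} + C_{xh})}\,(S_{xh}^2\rho_{xyh} + C_{xh}), \tag{2.18} $$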
$$ s_{pr19}^2 = \sum_{h=1}^{\ell}\omega_h\frac{s_{yh}^2}{\alpha(s_{xh}^2\rho_{xyh} + \beta_{xh}) + (1-\alpha)(S_{xh}^2\rho_{xyh} + \beta_{xh})}\,(S_{xh}^2\rho_{xyh} + \beta_{xh}), \tag{2.19} $$

$$ s_{pr20}^2 = \sum_{h=1}^{\ell}\omega_h s_{yh}^2\left[\frac{S_{xh}^2 + \beta_{xh}}{s_{xh}^2 + \beta_{xh}}\right]^{\phi}, \tag{2.20} $$

$$ s_{pr21}^2 = \sum_{h=1}^{\ell}\omega_h s_{yh}^2\left[2 - \left(\frac{s_{xh}^2 + \beta_{xh}}{S_{xh}^2 + \beta_{xh}}\right)^{\varphi}\right], \tag{2.21} $$
where $C_{xh} = S_{xh}/\bar{X}_h$ is the coefficient of variation in stratum h, $\beta_{xh}$ is the population kurtosis of the auxiliary variable in stratum h, and $\rho_{xyh}$ is the coefficient of correlation between the auxiliary variable x and the study variable y in stratum h. Here, $\alpha$, $\phi$, and $\varphi$ are constants. The MSE of the proposed estimators can be found using the first degree approximation in the Taylor series method defined by

$$ MSE(s_{pr}^2) \cong \sum_{h=1}^{\ell} d_h\,\Sigma_h\,d_h', \tag{2.22} $$

where

$$ d_h = \left[\;\frac{\partial h(a,b)}{\partial a}\bigg|_{S_{yh}^2,\,S_{xh}^2}\quad \frac{\partial h(a,b)}{\partial b}\bigg|_{S_{yh}^2,\,S_{xh}^2}\;\right] \tag{2.23} $$

and

$$ \Sigma_h = \begin{bmatrix} V(s_{yh}^2) & cov(s_{yh}^2, s_{xh}^2) \\ cov(s_{yh}^2, s_{xh}^2) & V(s_{xh}^2) \end{bmatrix}. \tag{2.24} $$

Here, $h(a,b) = h(s_{yh}^2, s_{xh}^2)$. From (2.23), we obtain $d_{h1}$ for the first estimator in (2.1) as follows:

$$ d_{h1} = \left[\;\omega_h \quad -\frac{\omega_h S_{yh}^2}{S_{xh}^2}\;\right]. \tag{2.25} $$

Then, the MSE of the first estimator is obtained from (2.22), (2.24), and (2.25):

$$ MSE(s_{pr1}^2) \cong \sum_{h=1}^{\ell}\omega_h^2 V(s_{yh}^2) - 2\sum_{h=1}^{\ell}\frac{\omega_h^2 S_{yh}^2}{S_{xh}^2}\,cov(s_{yh}^2, s_{xh}^2) + \sum_{h=1}^{\ell}\frac{\omega_h^2 S_{yh}^4}{S_{xh}^4}\,V(s_{xh}^2), $$

where

$$ V(s_{yh}^2) = \lambda_h S_{yh}^4(\beta_{yh}-1), \quad V(s_{xh}^2) = \lambda_h S_{xh}^4(\beta_{xh}-1), \quad cov(s_{yh}^2, s_{xh}^2) = \lambda_h S_{yh}^2 S_{xh}^2(\theta_h-1) \tag{2.26} $$

(Kendall and Stuart, 1963). From (2.26), the MSE of $s_{pr1}^2$ is given by

$$ MSE(s_{pr1}^2) \cong \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^4(\beta_{yh} + \beta_{xh} - 2\theta_h). \tag{2.27} $$
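The step from the quadratic form (2.22) to the closed form (2.27) can be checked numerically. The sketch below is illustrative only and uses hypothetical function names: it evaluates the estimator (2.1), then computes its approximate MSE twice, once from the gradient (2.25) with the moments (2.26) and once from (2.27). The inputs are the stratum summaries reported in Tables 2 and 3 of the numerical example.

```python
# Stratum summaries copied from Tables 2 and 3 of the numerical example.
weights = [0.33, 0.33, 0.34]       # omega_h
lam = [0.05, 0.03, 0.04]           # lambda_h
S2_y = [0.132, 0.139, 0.209]       # population variances S_yh^2
S2_x = [0.033, 0.076, 0.049]       # population variances S_xh^2
beta_y = [5.839, 2.116, 2.299]
beta_x = [20.158, 18.313, 3.489]
theta = [9.678, 0.844, 1.085]
s2_y = [0.130, 0.147, 0.187]       # sample variances s_yh^2
s2_x = [0.023, 0.047, 0.043]       # sample variances s_xh^2


def separate_ratio_estimate(weights, s2_y, s2_x, S2_x):
    """Estimator (2.1): weighted sum of per-stratum ratio estimates."""
    return sum(w * sy / sx * Sx for w, sy, sx, Sx in zip(weights, s2_y, s2_x, S2_x))


def mse_quadratic_form(weights, lam, S2_y, S2_x, beta_y, beta_x, theta):
    """Evaluate (2.22) for estimator (2.1) using the gradient (2.25) and moments (2.26)."""
    total = 0.0
    for w, l, Sy, Sx, by, bx, th in zip(weights, lam, S2_y, S2_x, beta_y, beta_x, theta):
        d_a, d_b = w, -w * Sy / Sx            # gradient (2.25)
        var_y = l * Sy ** 2 * (by - 1.0)      # V(s_yh^2)
        var_x = l * Sx ** 2 * (bx - 1.0)      # V(s_xh^2)
        cov_yx = l * Sy * Sx * (th - 1.0)     # cov(s_yh^2, s_xh^2)
        total += d_a ** 2 * var_y + 2.0 * d_a * d_b * cov_yx + d_b ** 2 * var_x
    return total


def mse_closed_form(weights, lam, S2_y, beta_y, beta_x, theta):
    """Closed form (2.27)."""
    return sum(w ** 2 * l * Sy ** 2 * (by + bx - 2.0 * th)
               for w, l, Sy, by, bx, th in zip(weights, lam, S2_y, beta_y, beta_x, theta))


estimate = separate_ratio_estimate(weights, s2_y, s2_x, S2_x)
mse_taylor = mse_quadratic_form(weights, lam, S2_y, S2_x, beta_y, beta_x, theta)
mse_direct = mse_closed_form(weights, lam, S2_y, beta_y, beta_x, theta)
print(estimate, mse_taylor, mse_direct)
```

With these rounded inputs both routes give a value of about 0.0025, which is in line with the entry for $s_{pr1}^2$ in Table 4; small differences are to be expected because the tabulated $\lambda_h$ are rounded to two decimals.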
Now let us consider the proposed estimators from (2.2)–(2.10). To find a general MSE equation of these estimators, the first degree approximation in the Taylor series method in (2.22) is again used. Then, $d_{hi}$, $h = 1, 2, \ldots, \ell$, $i = 2, \ldots, 10$, is defined for the proposed estimators from (2.2)–(2.10) as

$$ d_{hi} = \left[\;\omega_h \quad -\omega_h R_{hi}\;\right], \quad i = 2, \ldots, 10. \tag{2.28} $$
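Then, the MSE equation of the proposed estimators from (2.2)–(2.10) is obtained by

$$ MSE(s_{pri}^2) \cong \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^4\left[\beta_{yh} - 1 - 2R_{hi}(\theta_h-1) + R_{hi}^2(\beta_{xh}-1)\right], \tag{2.29} $$

where $R_{hi}$, $h = 1, 2, \ldots, \ell$, $i = 2, \ldots, 10$, is replaced, respectively, with

$$ R_{h2} = \frac{S_{yh}^2}{S_{xh}^2 + C_{xh}}, \quad R_{h3} = \frac{S_{yh}^2}{S_{xh}^2 + \beta_{xh}}, \quad R_{h4} = \frac{S_{yh}^2}{S_{xh}^2 + \rho_{xyh}}, $$

$$ R_{h5} = \frac{C_{xh}S_{yh}^2}{S_{xh}^2 C_{xh} + \beta_{xh}}, \quad R_{h6} = \frac{C_{xh}S_{yh}^2}{S_{xh}^2 C_{xh} + \rho_{xyh}}, \quad R_{h7} = \frac{\beta_{xh}S_{yh}^2}{S_{xh}^2\beta_{xh} + C_{xh}}, $$

$$ R_{h8} = \frac{\beta_{xh}S_{yh}^2}{S_{xh}^2\beta_{xh} + \rho_{xyh}}, \quad R_{h9} = \frac{\rho_{xyh}S_{yh}^2}{S_{xh}^2\rho_{xyh} + C_{xh}}, \quad R_{h10} = \frac{\rho_{xyh}S_{yh}^2}{S_{xh}^2\rho_{xyh} + \beta_{xh}}. $$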
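Because (2.29) depends on the estimator only through $R_{hi}$, one function can cover all of (2.2)–(2.10). The sketch below is a hedged illustration with hypothetical names: it evaluates (2.29) exactly as printed for a supplied list of $R_{hi}$ values, and confirms that setting every $R_{hi} = 1$ recovers (2.27). The inputs are the stratum summaries from Tables 2 and 3; the $R_{h3}$ values are computed from the definition given above.

```python
# Stratum summaries copied from Tables 2 and 3 of the numerical example.
weights = [0.33, 0.33, 0.34]
lam = [0.05, 0.03, 0.04]
S2_y = [0.132, 0.139, 0.209]
S2_x = [0.033, 0.076, 0.049]
beta_y = [5.839, 2.116, 2.299]
beta_x = [20.158, 18.313, 3.489]
theta = [9.678, 0.844, 1.085]


def mse_general(weights, lam, S2_y, beta_y, beta_x, theta, R):
    """Equation (2.29) as printed, for one R_hi value per stratum."""
    return sum(
        w ** 2 * l * Sy ** 2 * (by - 1.0 - 2.0 * r * (th - 1.0) + r ** 2 * (bx - 1.0))
        for w, l, Sy, by, bx, th, r in zip(weights, lam, S2_y, beta_y, beta_x, theta, R)
    )


def mse_pr1(weights, lam, S2_y, beta_y, beta_x, theta):
    """Equation (2.27) for the first estimator."""
    return sum(w ** 2 * l * Sy ** 2 * (by + bx - 2.0 * th)
               for w, l, Sy, by, bx, th in zip(weights, lam, S2_y, beta_y, beta_x, theta))


# R_h3 from its definition: S_yh^2 / (S_xh^2 + beta_xh).
R3 = [Sy / (Sx + bx) for Sy, Sx, bx in zip(S2_y, S2_x, beta_x)]

mse_with_R3 = mse_general(weights, lam, S2_y, beta_y, beta_x, theta, R3)
mse_with_ones = mse_general(weights, lam, S2_y, beta_y, beta_x, theta, [1.0, 1.0, 1.0])
mse_first = mse_pr1(weights, lam, S2_y, beta_y, beta_x, theta)
print(mse_with_R3, mse_with_ones, mse_first)
```

The equality of the last two printed values is the algebraic fact that the bracket in (2.29) collapses to $\beta_{yh} + \beta_{xh} - 2\theta_h$ when $R_{hi} = 1$.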
Next, the MSE equation of the proposed estimators from (2.11)–(2.19) is derived using the Taylor series method in (2.22). For this aim, $d_{hj}$, $h = 1, 2, \ldots, \ell$, $j = 11, \ldots, 19$, is obtained as

$$ d_{hj} = \left[\;\omega_h \quad -\alpha\,\omega_h R_{hi}\;\right], \quad i = 2, \ldots, 10, \; j = 11, \ldots, 19. \tag{2.30} $$

Then, using (2.22), (2.24), and (2.30), the MSE of the proposed estimators from (2.11)–(2.19) is derived as

$$ MSE(s_{prj}^2) \cong \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^4\left[(\beta_{yh}-1) - 2\alpha R_{hi}(\theta_h-1) + \alpha^2 R_{hi}^2(\beta_{xh}-1)\right]. \tag{2.31} $$
We now find the value $\alpha^*$ that minimizes the MSE in (2.31). To do so, we take the derivative of (2.31) with respect to $\alpha$ and set it equal to zero. Then $\alpha^*$ is given by

$$ \alpha^* = \frac{\sum_{h=1}^{\ell}(\theta_h-1)}{\sum_{h=1}^{\ell}R_{hi}(\beta_{xh}-1)}, \quad h = 1, 2, \ldots, \ell, \; i = 2, \ldots, 10. \tag{2.32} $$
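When we substitute $\alpha^*$ from (2.32) for $\alpha$ in (2.31), we find the minimum MSE of the proposed estimators from (2.11)–(2.19) as

$$
\begin{aligned}
MSE_{min}(s_{prj}^2) &\cong \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^4\left[\beta_{yh} - 1 - \frac{(\theta_h-1)^2}{(\beta_{xh}-1)}\right] \\
&= \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^4(\beta_{yh}-1)\left[1 - \frac{(\theta_h-1)^2}{(\beta_{yh}-1)(\beta_{xh}-1)}\right] \\
&= \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^4(\beta_{yh}-1)\left(1 - \rho_{s_{yh}^2,s_{xh}^2}^2\right) \\
&= \sum_{h=1}^{\ell}\omega_h^2 V(s_{yh}^2)\left(1 - \rho_{s_{yh}^2,s_{xh}^2}^2\right),
\end{aligned}
\tag{2.33}
$$

where

$$ \rho_{s_{yh}^2,s_{xh}^2} = \frac{cov(s_{yh}^2, s_{xh}^2)}{\sqrt{V(s_{yh}^2)\,V(s_{xh}^2)}} = \frac{(\theta_h-1)}{\sqrt{(\beta_{yh}-1)(\beta_{xh}-1)}}. \tag{2.34} $$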
Now consider the proposed separate ratio estimators in (2.20) and (2.21). From (2.23), we obtain $d_{h20}$ and $d_{h21}$, respectively, for the proposed estimators in (2.20) and (2.21) as

$$ d_{h20} = \left[\;\omega_h \quad -\frac{\phi\,\omega_h S_{yh}^2}{S_{xh}^2 + \beta_{xh}}\;\right] \quad \text{and} \quad d_{h21} = \left[\;\omega_h \quad -\frac{\varphi\,\omega_h S_{yh}^2}{S_{xh}^2 + \beta_{xh}}\;\right]. \tag{2.35} $$

The MSE of the proposed estimators is obtained from (2.22), (2.24), and (2.35). It is found that the proposed estimators in (2.20) and (2.21) have the same MSE as in (2.31). However, the values of $\phi$ and $\varphi$ that minimize the MSE of the proposed estimators in (2.20) and (2.21) are different from (2.32); $\phi^*$ and $\varphi^*$ are given by

$$ \phi^* = \varphi^* = \frac{\sum_{h=1}^{\ell}(\theta_h-1)}{\sum_{h=1}^{\ell}R_{h3}(\beta_{xh}-1)}, \quad h = 1, 2, \ldots, \ell, \tag{2.36} $$

where $R_{h3} = \dfrac{S_{yh}^2}{S_{xh}^2 + \beta_{xh}}$.
3. Efficiency Comparisons

In this section, we first compare $s_{pr1}^2$, given in (2.1), with $s_{pri}^2$, $i = 2, \ldots, 10$, for the efficiency comparison as follows:

$$ MSE(s_{pri}^2) < MSE(s_{pr1}^2), \quad i = 2, \ldots, 10. \tag{2.37} $$

From (2.27) and (2.29), we have

$$ \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^4\left[\beta_{yh} - 1 - 2R_{hi}(\theta_h-1) + R_{hi}^2(\beta_{xh}-1)\right] < \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^4(\beta_{yh} + \beta_{xh} - 2\theta_h). $$

Then, the efficiency condition for (2.37) is found as

$$ R_{hi}^2\beta_{xh} - R_{hi}^2 - 2R_{hi}\theta_h + 2R_{hi} - 1 - \beta_{xh} + 2\theta_h < 0, \quad h = 1, 2, \ldots, \ell, \; i = 2, \ldots, 10. \tag{2.38} $$
When the condition in (2.38) is satisfied, we can infer that the proposed estimators from (2.2)–(2.10) are more efficient than the proposed ratio estimator in (2.1). Note that (2.37) is not satisfied when $R_{hi} = 1$, $h = 1, 2, \ldots, \ell$, $i = 2, \ldots, 10$; in that case there is no difference between the MSE of $s_{pr1}^2$ and the MSE of $s_{pri}^2$, $i = 2, \ldots, 10$. However, we know that $R_{hi} \neq 1$ since $C_{xh} \neq 0$, $\beta_{xh} \neq 0$, and $\rho_{xyh} \neq 0$.

Second, we compare the proposed estimator $s_{pr1}^2$, given in (2.1), with the proposed estimators $s_{prj}^2$, $j = 11, \ldots, 19$, from (2.11)–(2.19) as follows:

$$ MSE_{min}(s_{prj}^2) < MSE(s_{pr1}^2), \quad j = 11, \ldots, 19. \tag{2.39} $$

Comparing (2.27) and (2.33), we can write

$$ \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^4\left[(\beta_{yh}-1) - \frac{(\theta_h-1)^2}{(\beta_{xh}-1)}\right] < \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^4(\beta_{yh} + \beta_{xh} - 2\theta_h). $$

From this inequality, we have

$$ (\beta_{yh}-1)(\beta_{xh}-1) - (\theta_h-1)^2 < (\beta_{yh} + \beta_{xh} - 2\theta_h)(\beta_{xh}-1). $$

Then, it holds that

$$ (\beta_{yh}\beta_{xh} - \beta_{yh} - \beta_{xh} + 1) - (\theta_h^2 - 2\theta_h + 1) < \beta_{yh}\beta_{xh} + \beta_{xh}^2 - 2\beta_{xh}\theta_h - \beta_{yh} - \beta_{xh} + 2\theta_h. $$

From this inequality, the efficiency condition is found as

$$ (\beta_{xh} - \theta_h)^2 > 0, \quad h = 1, 2, \ldots, \ell. \tag{2.40} $$
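The algebra leading to (2.40) says that, stratum by stratum, the bracket in (2.27) exceeds the bracket in (2.33) by exactly $(\beta_{xh}-\theta_h)^2/(\beta_{xh}-1)$. A small numerical check of that identity is sketched below; the function names are hypothetical, the inputs are the Table 2 values, and the check assumes $\beta_{xh} > 1$ so that the division is valid.

```python
# Stratum-level moments copied from Table 2.
beta_y = [5.839, 2.116, 2.299]
beta_x = [20.158, 18.313, 3.489]
theta = [9.678, 0.844, 1.085]


def bracket_pr1(by, bx, th):
    """Bracket of (2.27): beta_yh + beta_xh - 2*theta_h."""
    return by + bx - 2.0 * th


def bracket_min(by, bx, th):
    """Bracket of (2.33): (beta_yh - 1) - (theta_h - 1)^2 / (beta_xh - 1)."""
    return (by - 1.0) - (th - 1.0) ** 2 / (bx - 1.0)


def predicted_gap(bx, th):
    """Gap implied by (2.40), rescaled: (beta_xh - theta_h)^2 / (beta_xh - 1)."""
    return (bx - th) ** 2 / (bx - 1.0)


gaps = [bracket_pr1(by, bx, th) - bracket_min(by, bx, th)
        for by, bx, th in zip(beta_y, beta_x, theta)]
predicted = [predicted_gap(bx, th) for bx, th in zip(beta_x, theta)]
for h, (g, p) in enumerate(zip(gaps, predicted), start=1):
    print(h, g, p)
```

Since each gap is a square divided by a positive number, it is non-negative, and it is zero only in the boundary case $\beta_{xh} = \theta_h$.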
When the condition in (2.40) is satisfied, the proposed estimators from (2.11)–(2.19) are more efficient than the proposed estimator in (2.1). When we examine the condition in (2.40) in detail, we see that it is satisfied whenever $\beta_{xh} \neq \theta_h$, so it holds apart from that boundary case.

Third, we compare the proposed estimators from (2.2)–(2.10) with the proposed estimators from (2.11)–(2.21) as follows:

$$ MSE_{min}(s_{prj}^2) < MSE(s_{pri}^2), \quad i = 2, \ldots, 10, \; j = 11, \ldots, 21. \tag{2.41} $$

From (2.29) and (2.33), we can write

$$ \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^4(\beta_{yh}-1)\left(1 - \rho_{s_{yh}^2,s_{xh}^2}^2\right) < \sum_{h=1}^{\ell}\omega_h^2\lambda_h S_{yh}^4\left[\beta_{yh} - 1 - 2R_{hi}(\theta_h-1) + R_{hi}^2(\beta_{xh}-1)\right]. $$
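Table 1 Data statistics of the population

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| $N$ | 450 | $\beta_x$ | 3.697 |
| $\bar{X}$ | 3.455 | $\beta_y$ | 5.707 |
| $\bar{Y}$ | 2.291 | $\theta$ | 2.530 |
| $S_x^2$ | 0.518 | $\rho_{xy}$ | 0.630 |
| $S_y^2$ | 0.486 | $B$ | 0.613 |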
From this inequality, the efficiency condition is obtained by

$$ \rho_{s_{yh}^2,s_{xh}^2}^2(\beta_{yh}-1) - 2R_{hi}(\theta_h-1) + R_{hi}^2(\beta_{xh}-1) > 0. \tag{2.42} $$

The proposed estimators from (2.11)–(2.21) are more efficient than the proposed estimators from (2.2)–(2.10) when this condition is satisfied. Furthermore, since $(\theta_h-1) = \rho_{s_{yh}^2,s_{xh}^2}\sqrt{(\beta_{yh}-1)(\beta_{xh}-1)}$, we can rewrite (2.42) as

$$ \left(\rho_{s_{yh}^2,s_{xh}^2}\sqrt{(\beta_{yh}-1)} - R_{hi}\sqrt{(\beta_{xh}-1)}\right)^2 > 0. \tag{2.43} $$

When we analyze the condition in (2.43), we see that it is satisfied if and only if $\rho_{s_{yh}^2,s_{xh}^2}\sqrt{(\beta_{yh}-1)} \neq R_{hi}\sqrt{(\beta_{xh}-1)}$.
4. Numerical Example

In this article, we consider the data set used by Savci (2007) for the numerical comparisons of the proposed estimators in the STRS. The population consists of flowers of the Asteraceae growing in three different regions in Ankara (Aquapark, the Garden of the Opera and Ballet House, and the flowers in mold). The study and auxiliary variables are the length of the pappus and the length of the flower, respectively. The summary statistics for the population are given in Table 1. As shown in Table 1, the correlation between the study variable and the auxiliary variable is positive ($\rho_{xy} = 0.630$), and it can be said that the pappus length of the flower is related to the flower's length. Therefore, the ratio estimators can be used for the estimation of the population variance in the STRS. Note that the population regression coefficient is found as B = 0.613. We then stratified the data by the region of the flower and selected samples from each stratum (region). The summary statistics for the population in each stratum are presented in Table 2.

Table 2 Data statistics of the population for each stratum

| $h$ | $W_h$ | $\bar{X}_h$ | $\bar{Y}_h$ | $S_{xh}^2$ | $S_{yh}^2$ | $\beta_{xh}$ | $\beta_{yh}$ | $\rho_{xyh}$ | $\theta_h$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.33 | 2.499 | 1.761 | 0.033 | 0.132 | 20.158 | 5.839 | 0.452 | 9.678 |
| 2 | 0.33 | 3.828 | 2.029 | 0.076 | 0.139 | 18.313 | 2.116 | 0.147 | 0.844 |
| 3 | 0.34 | 4.038 | 3.084 | 0.049 | 0.209 | 3.489 | 2.299 | 0.333 | 1.085 |
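Table 3 Data statistics of the sample for each stratum

| $h$ | $n_h$ | $\bar{x}_h$ | $\bar{y}_h$ | $s_{xh}^2$ | $s_{yh}^2$ | $\lambda_h$ |
|---|---|---|---|---|---|---|
| 1 | 21 | 2.477 | 1.676 | 0.023 | 0.130 | 0.05 |
| 2 | 33 | 3.857 | 2.093 | 0.047 | 0.147 | 0.03 |
| 3 | 27 | 4.122 | 3.130 | 0.043 | 0.187 | 0.04 |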
The Neyman allocation method is used for determining the sample sizes of each stratum (Cingi, 2004). Here, we take the sample size as n = 81. We recall that the sample size has no effect on the efficiency comparisons of the estimators, as shown in Sec. 3. The summary statistics of the samples are given in Table 3.

The MSE values of the proposed variance estimators ($s_{pr1}^2, \ldots, s_{pr21}^2$) are computed using STRS from (2.27), (2.29), and (2.33), respectively. The MSE values of the combined ratio estimators given in (1.5) and (1.7)–(1.10) are obtained from (1.6). Finally, these estimators are compared with each other with respect to their MSE values, which are presented in Table 4. Table 4 shows that:

(i) the proposed separate ratio estimators are better than the combined ratio estimators;
(ii) the proposed separate ratio estimators $s_{prj}^2$, $j = 11, \ldots, 21$, from (2.11)–(2.21) have the same value of MSE, and the MSE values of these estimators are smaller than those of the proposed estimators in (2.2)–(2.10);
(iii) the largest gain in efficiency is observed by using $S_{xh}^2$, $\beta_{xh}$, and $\rho_{xyh}$ in $s_{pr9}^2$ if an inter-group comparison of (2.2)–(2.10) is done;
(iv) the class of estimators based on the population kurtosis of the auxiliary variable $\beta_{xh}$ is better than the other estimators when an inter-group comparison of (2.2)–(2.10) is done; and
(v) the MSE values of the proposed estimators from (2.11)–(2.21) are smaller than the MSE of the first estimator $s_{pr1}^2$ in (2.1).
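As a reminder of how Neyman allocation works, each stratum receives a share of the total sample proportional to the product of its weight and its standard deviation. The sketch below is an illustration with a hypothetical function name, not the authors' procedure. Using the auxiliary-variable standard deviations $S_{xh}$ is an assumption made here; the text does not say which variable drove the allocation, although with the rounded Table 2 values this choice appears to round to the stratum sample sizes listed in Table 3.

```python
import math


def neyman_allocation(n, weights, stratum_sds):
    """Return the (unrounded) Neyman sample sizes n_h = n * W_h S_h / sum(W_h S_h)."""
    products = [w * s for w, s in zip(weights, stratum_sds)]
    total = sum(products)
    return [n * p / total for p in products]


weights = [0.33, 0.33, 0.34]       # W_h from Table 2
S2_x = [0.033, 0.076, 0.049]       # S_xh^2 from Table 2
# ASSUMPTION: allocate on the auxiliary variable's standard deviations.
raw_sizes = neyman_allocation(81, weights, [math.sqrt(v) for v in S2_x])
sizes = [round(r) for r in raw_sizes]
print(raw_sizes, sizes)
```

Rounding each share independently happens to preserve the total of 81 here; in general a rounding rule that fixes the total may be needed.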
Table 4 The MSE values of the proposed variance estimators in the STRS for the real data set

| Estimator | MSE | Estimator | MSE | Estimator | MSE |
|---|---|---|---|---|---|
| $s_{rc1}^2$ | 0.016423 | $s_{pr5}^2$ | 0.000794 | $s_{pr14}^2$ | 0.000419 |
| $s_{rc2}^2$ | 0.016214 | $s_{pr6}^2$ | 0.000788 | $s_{pr15}^2$ | 0.000419 |
| $s_{rc3}^2$ | 0.015620 | $s_{pr7}^2$ | 0.002043 | $s_{pr16}^2$ | 0.000419 |
| $s_{rc4}^2$ | 0.015947 | $s_{pr8}^2$ | 0.001401 | $s_{pr17}^2$ | 0.000419 |
| $s_{rc5}^2$ | 0.015903 | $s_{pr9}^2$ | 0.000807 | $s_{pr18}^2$ | 0.000419 |
| $s_{pr1}^2$ | 0.002534 | $s_{pr10}^2$ | 0.000793 | $s_{pr19}^2$ | 0.000419 |
| $s_{pr2}^2$ | 0.000847 | $s_{pr11}^2$ | 0.000419 | $s_{pr20}^2$ | 0.000419 |
| $s_{pr3}^2$ | 0.000791 | $s_{pr12}^2$ | 0.000419 | $s_{pr21}^2$ | 0.000419 |
| $s_{pr4}^2$ | 0.000827 | $s_{pr13}^2$ | 0.000419 | | |
Table 5 Simulation results for the MSE values of the proposed variance estimators in the STRS for various sample sizes

| Estimator | n = 25 | n = 50 | n = 75 | n = 100 |
|---|---|---|---|---|
| $s_{rc1}^2$ | 0.11873 | 0.11892 | 0.11795 | 0.11684 |
| $s_{rc2}^2$ | 0.11892 | 0.11913 | 0.11874 | 0.11755 |
| $s_{rc3}^2$ | 0.11408 | 0.11354 | 0.11356 | 0.11347 |
| $s_{rc4}^2$ | 0.10986 | 0.10943 | 0.10936 | 0.10908 |
| $s_{rc5}^2$ | 0.10454 | 0.10438 | 0.10427 | 0.10415 |
| $s_{pr1}^2$ | 0.07394 | 0.07191 | 0.07183 | 0.07043 |
| $s_{pr2}^2$ | 0.00956 | 0.00947 | 0.08545 | 0.08138 |
| $s_{pr3}^2$ | 0.00897 | 0.00893 | 0.08130 | 0.07954 |
| $s_{pr4}^2$ | 0.00944 | 0.00936 | 0.08653 | 0.08159 |
| $s_{pr5}^2$ | 0.00896 | 0.00872 | 0.00853 | 0.00821 |
| $s_{pr6}^2$ | 0.00867 | 0.00854 | 0.00876 | 0.00843 |
| $s_{pr7}^2$ | 0.01008 | 0.00995 | 0.00884 | 0.00859 |
| $s_{pr8}^2$ | 0.00632 | 0.00562 | 0.00591 | 0.00476 |
| $s_{pr9}^2$ | 0.00625 | 0.00559 | 0.00547 | 0.00563 |
| $s_{pr10}^2$ | 0.00867 | 0.00817 | 0.00794 | 0.00745 |
| $s_{pr11}^2$ | 0.00425 | 0.00419 | 0.00405 | 0.00402 |
| $s_{pr12}^2$ | 0.00417 | 0.00408 | 0.00397 | 0.00390 |
| $s_{pr13}^2$ | 0.00489 | 0.00475 | 0.00415 | 0.00418 |
| $s_{pr14}^2$ | 0.00408 | 0.00394 | 0.00380 | 0.00382 |
| $s_{pr15}^2$ | 0.00393 | 0.00375 | 0.00364 | 0.00368 |
| $s_{pr16}^2$ | 0.00406 | 0.00401 | 0.00397 | 0.00393 |
| $s_{pr17}^2$ | 0.00315 | 0.00306 | 0.00263 | 0.00298 |
| $s_{pr18}^2$ | 0.00489 | 0.00483 | 0.00441 | 0.00450 |
| $s_{pr19}^2$ | 0.00578 | 0.00457 | 0.00462 | 0.00442 |
| $s_{pr20}^2$ | 0.00502 | 0.00484 | 0.00470 | 0.00467 |
| $s_{pr21}^2$ | 0.00593 | 0.00492 | 0.00453 | 0.00452 |
Since the proposed separate ratio estimators $s_{prj}^2$, $j = 11, \ldots, 21$, from (2.11)–(2.21) have the same value of the MSE, a simulation study is conducted to investigate the empirical performance of these estimators. For the simulation, an artificial population of N = 200 units is considered. The auxiliary variable is a random sample from a normal population with mean 4.7 and standard deviation 0.65. The samples are generated by simple random sampling without replacement using sample sizes n = 25, 50, 75, 100. For each sample size, 1000 samples are selected. As the population is held constant during these 1000 replicates, we are able to evaluate the performance of the estimators. This performance is evaluated using the MSE values of the proposed estimators. The MSE values of each estimator are computed by

$$ MSE(s^2) = \frac{1}{1000}\sum_{i=1}^{1000}\left(s_i^2 - S^2\right)^2, $$

where $S^2$ is the population variance of the study variable. In Table 5, the MSE values of the proposed estimators are compared to those of the other estimators for each n. From Table 5, we can conclude the following.

(i) All proposed separate ratio estimators are more efficient than the combined ratio estimators for all sample sizes.
(ii) The proposed separate ratio estimators $s_{prj}^2$, $j = 11, \ldots, 21$, do much better than all other estimators.
(iii) $s_{pr17}^2$ has the lowest empirical MSE compared with the other estimators for all values of n.
(iv) The proposed separate ratio estimators $s_{prj}^2$, $j = 11, \ldots, 21$, are approximately the same for large sample size n.
(v) $s_{pr21}^2$ gives the worst results when an inter-group comparison of (2.11)–(2.21) is done.
(vi) The auxiliary variable information pairs ($\beta_{xh}$, $\rho_{xyh}$) and ($C_{xh}$, $\rho_{xyh}$) yield better estimates if an inter-group comparison of (2.11)–(2.21) is done.

From the above discussion, we can conclude that the class of separate ratio estimators is to be preferred to combined ratio estimators in STRS. These simulation results also support the theoretical findings in Sec. 3.
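The Monte Carlo recipe described above (fixed population, repeated sampling without replacement, average squared error against the population variance) can be sketched in a few lines. This is a rough illustration only and does not reproduce Table 5. The text specifies N = 200, the distribution of the auxiliary variable, the sample sizes, and the 1000 replicates; everything else here is an assumption made for the sketch: the linear model for the study variable, the three strata formed by unit index, proportional allocation, the seed, and all function names. Only the first estimator (2.1) is simulated.

```python
import random


def sample_variance(values):
    """Unbiased variance with divisor len(values) - 1."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


def make_population(rng, size=200, n_strata=3):
    """Build a fixed artificial population split into hypothetical strata."""
    strata = [[] for _ in range(n_strata)]
    for unit in range(size):
        x = rng.gauss(4.7, 0.65)               # auxiliary variable, as in the text
        y = 0.6 * x + rng.gauss(0.0, 0.5)      # ASSUMPTION: model for the study variable
        strata[unit % n_strata].append((x, y))  # ASSUMPTION: strata by unit index
    return strata


def separate_ratio_estimate(rng, strata, n):
    """Draw one stratified sample and evaluate estimator (2.1)."""
    size = sum(len(units) for units in strata)
    estimate = 0.0
    for units in strata:
        weight = len(units) / size
        n_h = max(2, round(n * weight))         # ASSUMPTION: proportional allocation
        sample = rng.sample(units, n_h)         # SRS without replacement in the stratum
        s2_x = sample_variance([x for x, _ in sample])
        s2_y = sample_variance([y for _, y in sample])
        S2_x = sample_variance([x for x, _ in units])
        estimate += weight * s2_y / s2_x * S2_x
    return estimate


def empirical_mse(rng, strata, n, reps=1000):
    """Average squared error against the population variance of y over reps samples."""
    target = sample_variance([y for units in strata for _, y in units])
    errors = [(separate_ratio_estimate(rng, strata, n) - target) ** 2 for _ in range(reps)]
    return sum(errors) / reps


rng = random.Random(2014)
population = make_population(rng)
results = {n: empirical_mse(rng, population, n) for n in (25, 50, 75, 100)}
for n, mse in results.items():
    print(n, round(mse, 5))
```

Holding the population fixed and reseeding only the sampling step mirrors the design described in the text, in which the same population is used for all 1000 replicates.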
5. Conclusion

In this article, we developed new separate ratio estimators for the population variance in STRS and obtained their MSE equations. Different classes of ratio estimators are proposed using the auxiliary variable information, and their MSE values are compared by means of the MSE equations. It is found that the proposed estimators from (2.2)–(2.10) are more efficient than the proposed estimator in (2.1) when the condition in (2.38) is satisfied. Similarly, the proposed estimators from (2.11)–(2.21) are more efficient than the proposed estimator in (2.1) when the condition in (2.40) is satisfied. Furthermore, the proposed estimators from (2.11)–(2.21) are more efficient than the proposed estimators from (2.2)–(2.10) when the condition in (2.43) is satisfied. These theoretical results are also supported by the numerical example and the simulation study. Since the theoretical efficiency of the estimators from (2.11)–(2.21) is the same, we compared these proposed estimators to the other estimators in the simulation study. From the theoretical discussion in Sec. 3 and the results of the numerical example and the simulation, we infer that the proposed separate ratio estimators are better than the combined ratio estimators. In forthcoming studies, we hope to develop new estimators for the population variance in other sampling methods.
Acknowledgments

The authors are thankful to the anonymous referees for their constructive comments and suggestions for the improvement of this article.
References

Arcos, A., Rueda, M., Martine, M. D., Gonzales, S., Roma, Y. (2005). Incorporating the auxiliary information available in variance estimation. Appl. Math. Computat. 160:387–399.
Cingi, H. (1994). Sampling Theory. Ankara, Turkey: Hacettepe University Press.
Cingi, H., Kadilar, C. (2009). Advances in Sampling Theory-Ratio Method of Estimation. Sharjah, UAE: Bentham Science Publishers (E-Book).
Das, A. K., Tripathi, T. P. (1978). Use of auxiliary information in estimating the finite population variance. Sankhya 40:139–148.
Gupta, S., Shabbir, J. (2008). On improvement in estimating the population mean in simple random sampling. J. Appl. Statist. 35:559–566.
Isaki, C. T. (1983). Variance estimation using auxiliary information. J. Amer. Statist. Assoc. 78:117–123.
Kadilar, C., Cingi, H. (2003). Ratio estimators in stratified random sampling. Biometr. J. 45(2):218–225.
Kadilar, C., Cingi, H. (2006a). Ratio estimators for the population variance in simple and stratified random sampling. Appl. Math. Computat. 173(2):1047–1059.
Kadilar, C., Cingi, H. (2006b). Improvement in variance estimation using auxiliary information. Hacettepe J. Math. Statist. 35(1):111–115.
Kadilar, C., Cingi, H. (2007). Improvement in variance estimation in simple random sampling. Commun. Statist.: Theor. Meth. 36(11):2075–2081.
Kendall, M., Stuart, A. (1963). The Advanced Theory of Statistics: Distribution Theory. London: Griffin.
Prasad, B., Singh, H. P. (1990). Some improved ratio-type estimators of finite population variance in sample surveys. Commun. Statist.: Theor. Meth. 19:1127–1139.
Robinson, A. P., Hamlin, D. C., Fairweather, S. E. (1999). Improving forest inventories: three ways to incorporate auxiliary information. J. Forestry 97(12):38–42.
Savci, A. E. (2007). The bio-ecology of Centaurea Tchihatcheffii Fischer and Meyer (Asteraceae) in the vicinity of Ankara-Golbasi. Unpublished M.Sc. Thesis, Faculty of Science, Hacettepe University, Ankara, Turkey.
Singh, H. P., Upadhyaya, L. N., Namjoshi, U. D. (1988). Estimation of finite population variance. Curr. Sci. 57:1331–1334.
Singh, H. P., Vishwakarma, G. K. (2008a). A family of estimators of population mean using auxiliary information in stratified sampling. Commun. Statist.: Theor. Meth. 37(7):1038–1050.
Singh, H. P., Vishwakarma, G. K. (2008b). Some families of estimators of variance of stratified random sample mean using auxiliary information. J. Statist. Theor. Pract. 2(1):21–43.
Unyazici, Y. (2008). Variance estimation methods in some sampling designs. Unpublished Ph.D. Thesis, Faculty of Science, Hacettepe University, Ankara, Turkey.
Unyazici, Y., Cingi, H. (2008). New generalized estimators for the population variance using auxiliary information. Hacettepe J. Math. Statist. 37(2):177–184.