Communications in Statistics—Theory and Methods, 34: 597–602, 2005 Copyright © Taylor & Francis, Inc. ISSN: 0361-0926 print/1532-415X online DOI: 10.1081/STA-200052156
Sampling Theory
A New Ratio Estimator in Stratified Random Sampling

CEM KADILAR AND HULYA CINGI
Hacettepe University, Department of Statistics, Beytepe, Ankara, Turkey

In this article, we suggest a new ratio estimator in stratified random sampling based on the Prasad (1989) estimator. We obtain the mean square error (MSE) of this estimator theoretically and compare it with the MSE of the traditional combined ratio estimate. By this comparison, we demonstrate that the proposed estimator is more efficient than the combined ratio estimate in all conditions. This theoretical result is also supported by a numerical example.

Keywords  Mean square errors; Ratio-type estimators; Stratified random sampling.

Mathematics Subject Classification  Primary 62D05.
1. Introduction

The combined ratio estimate is

$$\bar{y}_{RC} = \frac{\bar{y}_{st}}{\bar{x}_{st}}\,\bar{X} = \hat{R}_c\,\bar{X}, \tag{1}$$

where $\bar{X}$ is the population mean of the auxiliary variate and

$$\bar{y}_{st} = \sum_{h=1}^{k} \omega_h \bar{y}_h, \qquad \bar{x}_{st} = \sum_{h=1}^{k} \omega_h \bar{x}_h,$$

where $k$ is the number of strata, $\omega_h = N_h/N$ is the stratum weight, $N$ is the number of units in the population, $N_h$ is the number of units in stratum $h$, $\bar{y}_h$ is the sample mean of the variate of interest in stratum $h$, and $\bar{x}_h$ is the sample mean of the auxiliary variate in stratum $h$. The variance of the combined ratio estimate is

$$V(\bar{y}_{RC}) = \sum_{h=1}^{k} \omega_h^2 \gamma_h \left(S_{yh}^2 - 2R\,S_{yxh} + R^2 S_{xh}^2\right), \tag{2}$$
Received February 7, 2003; Accepted July 21, 2004 Address correspondence to Cem Kadilar, Hacettepe University, Department of Statistics, Beytepe, Ankara, Turkey; E-mail:
[email protected]
where $\gamma_h = \dfrac{1 - n_h/N_h}{n_h}$, $R = \bar{Y}/\bar{X}$ is the population ratio, $n_h$ is the number of sampled units in stratum $h$, $S_{yh}^2$ is the population variance of the variate of interest in stratum $h$, $S_{xh}^2$ is the population variance of the auxiliary variate in stratum $h$, and $S_{yxh}$ is the population covariance between the auxiliary variate and the variate of interest in stratum $h$ (Cochran, 1977).

In stratified random sampling, Kadilar and Cingi (2003) developed the following ratio estimators:

$$\bar{y}_{stSD} = \bar{y}_{st}\,\frac{\sum_{h=1}^{k} \omega_h \left(\bar{X}_h + C_{xh}\right)}{\sum_{h=1}^{k} \omega_h \left(\bar{x}_h + C_{xh}\right)}, \tag{3}$$

based on the Sisodia and Dwivedi (1981) estimator;

$$\bar{y}_{stSK} = \bar{y}_{st}\,\frac{\sum_{h=1}^{k} \omega_h \left(\bar{X}_h + \beta_{2h}(x)\right)}{\sum_{h=1}^{k} \omega_h \left(\bar{x}_h + \beta_{2h}(x)\right)}, \tag{4}$$

based on the Singh and Kakran (1993) estimator;

$$\bar{y}_{stUS1} = \bar{y}_{st}\,\frac{\sum_{h=1}^{k} \omega_h \left(\bar{X}_h \beta_{2h}(x) + C_{xh}\right)}{\sum_{h=1}^{k} \omega_h \left(\bar{x}_h \beta_{2h}(x) + C_{xh}\right)}, \tag{5}$$

based on the first estimator of Upadhyaya and Singh (1999); and

$$\bar{y}_{stUS2} = \bar{y}_{st}\,\frac{\sum_{h=1}^{k} \omega_h \left(\bar{X}_h C_{xh} + \beta_{2h}(x)\right)}{\sum_{h=1}^{k} \omega_h \left(\bar{x}_h C_{xh} + \beta_{2h}(x)\right)}, \tag{6}$$

based on the second estimator of Upadhyaya and Singh (1999). Here, $C_x$ is the population coefficient of variation and $\beta_2(x)$ is the population coefficient of kurtosis of the auxiliary variate $x$, with the subscript $h$ indicating the value in stratum $h$.

Kadilar and Cingi (2003) demonstrate that all of these estimators, presented in Eqs. (3)–(6), have a larger MSE than the traditional combined ratio estimate under some conditions. Therefore, in the next section we propose a new ratio estimator in stratified random sampling, and in Sec. 3 we prove that the proposed estimator is more efficient than the combined ratio estimate in all conditions. In Sec. 4, this theoretical result is supported by a numerical example.
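As a minimal sketch of how the combined ratio estimate of Eq. (1) is assembled from stratum-level sample means, the following Python fragment may help. The function names and the three-stratum numbers are hypothetical and chosen only for illustration; they are not taken from the data used later in the article.

```python
from typing import Sequence


def stratified_mean(weights: Sequence[float], means: Sequence[float]) -> float:
    """Weighted stratum mean: sum over h of w_h * mean_h."""
    return sum(w * m for w, m in zip(weights, means))


def combined_ratio_estimate(
    N_h: Sequence[int],
    ybar_h: Sequence[float],
    xbar_h: Sequence[float],
    X_bar: float,
) -> float:
    """Combined ratio estimate of Eq. (1): (ybar_st / xbar_st) * X_bar."""
    N = sum(N_h)
    weights = [n / N for n in N_h]          # stratum weights N_h / N
    y_st = stratified_mean(weights, ybar_h)
    x_st = stratified_mean(weights, xbar_h)
    return y_st / x_st * X_bar


# Hypothetical illustration (made-up numbers, not the apple data).
N_h = [50, 30, 20]
ybar_h = [10.0, 20.0, 30.0]
xbar_h = [100.0, 210.0, 290.0]
X_bar = 170.0

estimate = combined_ratio_estimate(N_h, ybar_h, xbar_h, X_bar)
print(estimate)
```

The ratio is formed once, from the two stratified means, which is what distinguishes the combined estimate from a separate ratio estimate computed stratum by stratum.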
2. Ratio Estimator and Its Mean Square Error

When the first degree of approximation is used to obtain the mean square error (MSE) of a ratio estimate, the MSE is known to equal the variance, so the MSE of the combined ratio estimate can be written as

$$\operatorname{MSE}(\bar{y}_{RC}) = \sum_{h=1}^{k} \omega_h^2 \gamma_h \left(S_{yh}^2 - 2R\,S_{yxh} + R^2 S_{xh}^2\right). \tag{7}$$
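Eq. (7) can be evaluated directly from per-stratum summaries. The sketch below is an illustration under stated assumptions, not the authors' computation: it plugs in the rounded stratum values printed in Table 1 and assumes the usual identity that the stratum covariance equals the stratum correlation times the two standard deviations, since the covariances themselves are not listed. All function and variable names are made up here. Because the inputs are rounded, the result should only be expected to land near the reported value of 215710.432, not to reproduce it.

```python
# Per-stratum values as printed (rounded) in Table 1.
omega_sq = [0.015, 0.015, 0.012, 0.04, 0.057, 0.041]   # squared stratum weights
gamma = [0.102, 0.049, 0.016, 0.009, 0.138, 0.006]     # sampling factors
S_x = [49189, 57461, 160757, 285603, 45403, 18794]     # auxiliary std. dev.
S_y = [6425, 11552, 29907, 28643, 2390, 946]           # study-variate std. dev.
rho = [0.82, 0.86, 0.90, 0.99, 0.71, 0.89]             # stratum correlations
R = 0.07793                                            # population ratio


def mse_combined_ratio(omega_sq, gamma, S_y, S_x, rho, R):
    """First-order MSE of the combined ratio estimate, Eq. (7)."""
    total = 0.0
    for w2, g, sy, sx, r in zip(omega_sq, gamma, S_y, S_x, rho):
        s_yx = r * sy * sx  # assumption: covariance = correlation * S_y * S_x
        total += w2 * g * (sy ** 2 - 2.0 * R * s_yx + R ** 2 * sx ** 2)
    return total


mse_rc = mse_combined_ratio(omega_sq, gamma, S_y, S_x, rho, R)
print(mse_rc)
```

The term for stratum 4 is sensitive to the rounding of its correlation (0.99), which is one reason an exact match with the published figure should not be expected.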
To obtain the bias of the combined ratio estimate, we write

$$E(\bar{y}_{RC}) - \bar{Y} = \bar{X}\,E\!\left[\frac{1}{\bar{x}_{st}}\left(\bar{y}_{st} - R\,\bar{x}_{st}\right)\right], \tag{8}$$
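where $E$ denotes the expected value. We can rewrite $1/\bar{x}_{st}$ as

$$\frac{1}{\bar{x}_{st}} = \frac{1}{\bar{X} + (\bar{x}_{st} - \bar{X})} = \left[\bar{X} + (\bar{x}_{st} - \bar{X})\right]^{-1} = \frac{1}{\bar{X}}\left(1 + \frac{\bar{x}_{st} - \bar{X}}{\bar{X}}\right)^{-1}$$

and expand this expression in a Taylor series. Using the first degree of approximation (omitting the terms after the second term, i.e., the square, cubic, etc., terms), we obtain

$$\frac{1}{\bar{x}_{st}} \cong \frac{1}{\bar{X}}\left(1 - \frac{\bar{x}_{st} - \bar{X}}{\bar{X}}\right).$$

From Eq. (8),

$$\begin{aligned}
E(\bar{y}_{RC}) - \bar{Y} &= E\!\left[\left(1 - \frac{\bar{x}_{st} - \bar{X}}{\bar{X}}\right)\left(\bar{y}_{st} - R\,\bar{x}_{st}\right)\right] \\
&= E(\bar{y}_{st} - R\,\bar{x}_{st}) - \frac{1}{\bar{X}}\,E\!\left[\bar{y}_{st}\left(\bar{x}_{st} - \bar{X}\right)\right] + \frac{R}{\bar{X}}\,E\!\left[\bar{x}_{st}\left(\bar{x}_{st} - \bar{X}\right)\right].
\end{aligned}$$

As $E(\bar{y}_{st} - R\,\bar{x}_{st}) = 0$, we can write

$$\begin{aligned}
E(\bar{y}_{RC}) - \bar{Y} &= \frac{1}{\bar{X}}\left\{R\,E\!\left(\bar{x}_{st} - \bar{X}\right)^2 - E\!\left[\left(\bar{y}_{st} - \bar{Y}\right)\left(\bar{x}_{st} - \bar{X}\right)\right]\right\} \\
&= \frac{1}{\bar{X}}\left\{R\sum_{h=1}^{k} \omega_h^2 \gamma_h S_{xh}^2 - \sum_{h=1}^{k} \omega_h^2 \operatorname{cov}(\bar{y}_h, \bar{x}_h)\right\}.
\end{aligned}$$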
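From this equation, the bias of the combined ratio estimate is

$$B(\bar{y}_{RC}) = \frac{1}{\bar{X}}\sum_{h=1}^{k} \omega_h^2 \gamma_h \left(R\,S_{xh}^2 - S_{yxh}\right) \tag{9}$$

(Cingi, 1994).

2.1. The Suggested Ratio-Type Estimator

In simple random sampling, Prasad (1989) proposed a ratio estimator as

$$\bar{y}_p = \kappa\,\bar{y}_R = \kappa\,\frac{\bar{y}}{\bar{x}}\,\bar{X}, \tag{10}$$

where the coefficient $\kappa$ =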
1+ Cy Cx . Cy2 +1
In stratified random sampling, we suggest the estimator

$$\bar{y}_{stp} = \kappa^{*}\,\bar{y}_{RC}. \tag{11}$$
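Therefore, the MSE of this estimator is

$$\begin{aligned}
\operatorname{MSE}(\bar{y}_{stp}) &= E\!\left(\bar{y}_{stp} - \bar{Y}\right)^2 = E\!\left(\kappa^{*}\bar{y}_{RC} - \bar{Y}\right)^2 \\
&= E\!\left(\kappa^{*2}\bar{y}_{RC}^2 - 2\kappa^{*}\bar{y}_{RC}\bar{Y} + \bar{Y}^2\right) \\
&= \kappa^{*2}E\!\left(\bar{y}_{RC}^2\right) - 2\kappa^{*}\bar{Y}\,E(\bar{y}_{RC}) + \bar{Y}^2 \\
&= \kappa^{*2}E\!\left(\bar{y}_{RC}^2\right) - \kappa^{*2}\left[E(\bar{y}_{RC})\right]^2 + \kappa^{*2}\bar{Y}^2 - 2\kappa^{*}\bar{Y}^2 + \bar{Y}^2 \\
&= \kappa^{*2}\left\{E\!\left(\bar{y}_{RC}^2\right) - \left[E(\bar{y}_{RC})\right]^2\right\} + \bar{Y}^2\left(\kappa^{*} - 1\right)^2 \\
&= \kappa^{*2}\operatorname{Var}(\bar{y}_{RC}) + \bar{Y}^2\left(\kappa^{*} - 1\right)^2,
\end{aligned}$$

where the fourth line takes $E(\bar{y}_{RC}) \cong \bar{Y}$ to the first degree of approximation. From this equation, we obtain the MSE of the suggested estimate as follows:

$$\operatorname{MSE}(\bar{y}_{stp}) = \kappa^{*2}\sum_{h=1}^{k} \omega_h^2 \gamma_h \left(S_{yh}^2 - 2R\,S_{yxh} + R^2 S_{xh}^2\right) + \bar{Y}^2\left(\kappa^{*} - 1\right)^2. \tag{12}$$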
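The bias of this estimator is obtained as

$$\begin{aligned}
E(\bar{y}_{stp}) - \bar{Y} &= E\!\left(\kappa^{*}\bar{y}_{RC}\right) - \bar{Y} \\
&= E\!\left(\frac{\kappa^{*}\bar{y}_{st}}{\bar{x}_{st}}\,\bar{X}\right) - \bar{Y} \\
&= \bar{X}\,E\!\left[\frac{\kappa^{*}\bar{y}_{st} - R\,\bar{x}_{st}}{\bar{x}_{st}}\right] \\
&= E\!\left[\left(\kappa^{*}\bar{y}_{st} - R\,\bar{x}_{st}\right)\left(1 - \frac{\bar{x}_{st} - \bar{X}}{\bar{X}}\right)\right] \\
&= \kappa^{*}E(\bar{y}_{st}) - R\,E(\bar{x}_{st}) - \frac{\kappa^{*}}{\bar{X}}\,E\!\left[\bar{y}_{st}\left(\bar{x}_{st} - \bar{X}\right)\right] + \frac{R}{\bar{X}}\,E\!\left[\bar{x}_{st}\left(\bar{x}_{st} - \bar{X}\right)\right] \\
&= \kappa^{*}\bar{Y} - R\,\bar{X} - \frac{\kappa^{*}}{\bar{X}}\,E\!\left[\left(\bar{y}_{st} - \bar{Y}\right)\left(\bar{x}_{st} - \bar{X}\right)\right] + \frac{R}{\bar{X}}\,E\!\left(\bar{x}_{st} - \bar{X}\right)^2 \\
&= \left(\kappa^{*} - 1\right)\bar{Y} + \frac{1}{\bar{X}}\sum_{h=1}^{k} \omega_h^2 \gamma_h \left(R\,S_{xh}^2 - \kappa^{*} S_{yxh}\right).
\end{aligned}$$

To find the value of $\kappa^{*}$ that minimizes the MSE, we take the derivative of the MSE with respect to $\kappa^{*}$ and set it equal to zero:

$$\frac{\partial \operatorname{MSE}(\bar{y}_{stp})}{\partial \kappa^{*}} = 2\kappa^{*}\sum_{h=1}^{k} \omega_h^2 \gamma_h \left(S_{yh}^2 - 2R\,S_{yxh} + R^2 S_{xh}^2\right) + 2\left(\kappa^{*} - 1\right)\bar{Y}^2 = 0.$$

From this equation, we obtain

$$\kappa^{*} = \frac{\bar{Y}^2}{\bar{Y}^2 + \sum_{h=1}^{k} \omega_h^2 \gamma_h \left(S_{yh}^2 - 2R\,S_{yxh} + R^2 S_{xh}^2\right)},$$

where $0 < \kappa^{*} < 1$.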
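The minimization step can be checked numerically. The following sketch, with function names invented for illustration, evaluates Eq. (12) on a grid of coefficient values and compares the grid minimizer with the closed-form optimum. The name `psi` is shorthand introduced here for the stratum sum in Eq. (7), and the two inputs are the population mean of the study variate and that sum as reported in the numerical example of Sec. 4.

```python
def mse_proposed(kappa: float, psi: float, Y_bar: float) -> float:
    """Eq. (12): kappa^2 * psi + Y_bar^2 * (kappa - 1)^2."""
    return kappa ** 2 * psi + Y_bar ** 2 * (kappa - 1.0) ** 2


def optimal_kappa(psi: float, Y_bar: float) -> float:
    """Closed-form minimizer: Y_bar^2 / (Y_bar^2 + psi)."""
    return Y_bar ** 2 / (Y_bar ** 2 + psi)


Y_bar = 2930.0       # population mean of the study variate (Table 1)
psi = 215710.432     # value of the sum in Eq. (7) (Table 1)

kappa_star = optimal_kappa(psi, Y_bar)

# Brute-force check on a grid of step 0.001 over [0.900, 1.000].
grid = [i / 1000 for i in range(900, 1001)]
best_on_grid = min(grid, key=lambda k: mse_proposed(k, psi, Y_bar))

print(round(kappa_star, 3), best_on_grid)
```

Since Eq. (12) is a convex quadratic in the coefficient, the grid minimizer can differ from the closed-form value by at most one grid step.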
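3. Efficiency Comparison

Comparing the MSE of the combined ratio estimator with the MSE of the proposed estimator gives the following condition. Let

$$\psi = \sum_{h=1}^{k} \omega_h^2 \gamma_h \left(S_{yh}^2 - 2R\,S_{yxh} + R^2 S_{xh}^2\right).$$

Then

$$\begin{aligned}
\operatorname{MSE}(\bar{y}_{stp}) &< \operatorname{MSE}(\bar{y}_{RC}), \\
\left(\kappa^{*2} - 1\right)\psi + \bar{Y}^2\left(\kappa^{*} - 1\right)^2 &< 0, \\
\left(\kappa^{*} - 1\right)\left[\left(\kappa^{*} + 1\right)\psi + \left(\kappa^{*} - 1\right)\bar{Y}^2\right] &< 0.
\end{aligned}$$

From this condition, as $\kappa^{*} - 1 < 0$, it is clear that if

$$\kappa^{*} > \frac{\bar{Y}^2 - \psi}{\bar{Y}^2 + \psi}, \tag{13}$$

the suggested estimator is more efficient than the combined ratio estimator. Examining condition (13) in detail, we see that it is always satisfied: substituting $\kappa^{*} = \bar{Y}^2/(\bar{Y}^2 + \psi)$ reduces it to $\bar{Y}^2 > \bar{Y}^2 - \psi$, which holds whenever $\psi > 0$. Therefore, we can say that the suggested estimator is more efficient than the combined ratio estimator in all conditions.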
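Condition (13) can also be explored numerically. The sketch below is illustrative only, with made-up function names: it computes the optimal coefficient and the bound on the right-hand side of (13) for several positive values of the Eq. (7) sum (called `psi` in the code), then repeats the comparison for the values reported in the numerical example of Sec. 4.

```python
def kappa_opt(psi: float, Y_bar: float) -> float:
    """Optimal coefficient: Y_bar^2 / (Y_bar^2 + psi)."""
    return Y_bar ** 2 / (Y_bar ** 2 + psi)


def efficiency_bound(psi: float, Y_bar: float) -> float:
    """Right-hand side of condition (13)."""
    return (Y_bar ** 2 - psi) / (Y_bar ** 2 + psi)


def mse_proposed(kappa: float, psi: float, Y_bar: float) -> float:
    """Eq. (12) written in terms of psi."""
    return kappa ** 2 * psi + Y_bar ** 2 * (kappa - 1.0) ** 2


Y_bar = 2930.0

# The condition holds for every positive psi tried here, small or large.
sweep = [1.0, 1e3, 1e5, 1e7, 1e9]
holds_everywhere = all(
    kappa_opt(p, Y_bar) > efficiency_bound(p, Y_bar) for p in sweep
)

# Values reported in the numerical example (Table 1).
psi = 215710.432
k_star = kappa_opt(psi, Y_bar)
bound = efficiency_bound(psi, Y_bar)
mse_rc = psi                              # Eq. (7)
mse_p = mse_proposed(k_star, psi, Y_bar)  # Eq. (12)

print(holds_everywhere, round(k_star, 3), round(bound, 3), mse_p < mse_rc)
```

At the optimum, Eq. (12) simplifies to the optimal coefficient times the Eq. (7) sum, so the gain over the combined ratio estimate is small when that sum is small relative to the squared population mean.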
4. Numerical Example

We have used the data of Kadilar and Cingi (2003) in this section. We have applied the proposed and combined ratio estimators to data on apple production (the variate of interest) and the number of apple trees (the auxiliary variate) in 854 villages of Turkey in 1999 (Source: Institute of Statistics, Republic of Turkey). First, we stratified the data by the regions of Turkey, and from each stratum (region) we randomly selected the samples (villages). By using the Neyman allocation (Cochran, 1977),

$$n_h = n\,\frac{N_h S_h}{\sum_{h=1}^{k} N_h S_h}, \tag{14}$$
Table 1
Data statistics

Population: $N = 854$, $n = 140$, $\bar{X} = 37600$, $\bar{Y} = 2930$, $S_x = 144794$, $S_y = 17106$, $R = 0.07793$, $\kappa^{*} = 0.975$, $\psi = 215710.432$

| Stratum $h$ | $N_h$ | $n_h$ | $\bar{X}_h$ | $\bar{Y}_h$ | $S_{xh}$ | $S_{yh}$ | $\rho_h$ | $\gamma_h$ | $\omega_h^2$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 106 | 9 | 24375 | 1536 | 49189 | 6425 | 0.82 | 0.102 | 0.015 |
| 2 | 106 | 17 | 27421 | 2212 | 57461 | 11552 | 0.86 | 0.049 | 0.015 |
| 3 | 94 | 38 | 72409 | 9384 | 160757 | 29907 | 0.90 | 0.016 | 0.012 |
| 4 | 171 | 67 | 74365 | 5588 | 285603 | 28643 | 0.99 | 0.009 | 0.04 |
| 5 | 204 | 7 | 26441 | 967 | 45403 | 2390 | 0.71 | 0.138 | 0.057 |
| 6 | 173 | 2 | 9844 | 404 | 18794 | 946 | 0.89 | 0.006 | 0.041 |
Table 2
MSE values of ratio estimators

| Estimator | MSE value |
|---|---|
| Proposed | 210423.632 |
| Combined ratio | 215710.432 |
we have computed the sample size in stratum $h$. Here we take the total sample size as $n = 140$ (Cingi, 1994). From the resulting values of $n_h$, we decided to merge two regions, so we take six strata for these data (1: Marmara, 2: Aegean, 3: Mediterranean, 4: Central Anatolia, 5: Black Sea, 6: East and Southeast Anatolia). Using this stratified random sample, the MSEs of the combined and proposed ratio estimators have been computed by Eqs. (7) and (12), respectively. Finally, the estimators have been compared with respect to their MSE values.

Table 1 gives the statistics for the population, the strata, and the sample sizes. Note that the correlation between the variates is 92%. Table 2 gives the MSE values. From these values, the MSE of the proposed ratio estimator is smaller than that of the combined ratio estimator. This is an expected result, since $\kappa^{*} = 0.975 > (\bar{Y}^2 - \psi)/(\bar{Y}^2 + \psi) = 0.951$, as mentioned in Sec. 3.
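The Neyman allocation of Eq. (14) is straightforward to sketch in code. The fragment below is an illustration, not the authors' procedure: it applies Eq. (14) to the six final strata of Table 1 and assumes that the stratum standard deviation in Eq. (14) is that of the variate of interest, which the text does not state explicitly. The function name is hypothetical. Under that assumption, rounding the allocations to the nearest integer gives the stratum sample sizes listed in Table 1.

```python
from typing import List, Sequence


def neyman_allocation(
    n: int, N_h: Sequence[int], S_h: Sequence[float]
) -> List[float]:
    """Eq. (14): n_h = n * N_h * S_h / sum over strata of N_h * S_h."""
    total = sum(N * S for N, S in zip(N_h, S_h))
    return [n * N * S / total for N, S in zip(N_h, S_h)]


# Stratum sizes and standard deviations of the study variate from Table 1.
N_h = [106, 106, 94, 171, 204, 173]
S_yh = [6425, 11552, 29907, 28643, 2390, 946]   # assumption: S_h = S_yh

raw = neyman_allocation(140, N_h, S_yh)
n_h = [round(v) for v in raw]                    # nearest-integer sample sizes

print([round(v, 2) for v in raw])
print(n_h, sum(n_h))
```

Simple rounding happens to preserve the total of 140 here; in general the rounded allocations need not sum to the target sample size and may have to be adjusted.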
5. Conclusion

We have derived a new ratio-type estimator in stratified random sampling from the estimator of Prasad (1989) and obtained its MSE equation. Using this equation, we compared the MSE of the proposed estimator with that of the combined ratio estimate in theory and found that the proposed estimator has a smaller MSE than the combined ratio estimate in all conditions. This theoretical result is also confirmed by a numerical example, whereas Kadilar and Cingi (2003) found, for the same data used in this article, that the combined ratio estimator was more efficient than the other estimators, namely those based on Sisodia and Dwivedi, Singh and Kakran, and the first and second estimators of Upadhyaya and Singh. In forthcoming studies, we hope to develop new estimators for other sampling methods.
References

Cingi, H. (1994). Sampling Theory. Ankara, Turkey: Hacettepe University Press.
Cochran, W. G. (1977). Sampling Techniques. New York: John Wiley and Sons.
Kadilar, C., Cingi, H. (2003). Ratio estimators in stratified random sampling. Biometrical J. 45(2):218–225.
Prasad, B. (1989). Some improved ratio type estimators of population mean and ratio in finite population sample surveys. Commun. Statist. Theor. Meth. 18(1):379–392.
Singh, H. P., Kakran, M. S. (1993). A modified ratio estimator using known coefficient of kurtosis of an auxiliary character. (unpublished).
Sisodia, B. V. S., Dwivedi, V. K. (1981). A modified ratio estimator using coefficient of variation of auxiliary variable. J. Indian Soc. Agricul. Statist. 33:13–18.
Upadhyaya, L. N., Singh, H. P. (1999). Use of transformed auxiliary variable in estimating the finite population mean. Biometrical J. 41(5):627–636.