A New Estimator for Population Mean Using Two Auxiliary Variables in Stratified Random Sampling
Rajesh Singh and Sachin Malik
Department of Statistics, Banaras Hindu University, Varanasi-221005, India
Abstract
In this paper, we suggest an estimator that uses two auxiliary variables in stratified random sampling. The proposed estimator improves on the mean per unit estimator as well as on the other estimators considered. Expressions for the bias and MSE of the estimator are derived up to the first degree of approximation. These theoretical findings are supported by a numerical example with original data.
Key words: Study variable, auxiliary variable, stratified random sampling, bias and mean squared error.
1. Introduction
The problem of estimating the population mean in the presence of an auxiliary variable has been widely discussed in the finite population sampling literature. Among the many available approaches, the ratio, product and regression methods of estimation are good examples in this context. Diana [2] suggested a class of estimators of the population mean using one auxiliary variable in stratified random sampling and examined the MSE of the estimators up to the kth order of approximation. Kadilar and Cingi [3], Singh et al. [7], Singh and Vishwakarma [8], and Koyuncu and Kadilar [4] proposed estimators in stratified random sampling. Singh [9] and Perri [6] suggested some ratio-cum-product estimators in simple random sampling. Bahl and Tuteja [1] and Singh et al. [11] suggested some exponential ratio-type estimators. In this paper, we suggest some exponential-type estimators using auxiliary information in stratified random sampling.
Consider a finite population of size $N$ that is divided into $L$ strata such that $\sum_{h=1}^{L} N_h = N$, where $N_h$ is the size of the $h$th stratum ($h = 1, 2, \ldots, L$). A simple random sample of size $n_h$ is drawn without replacement from the $h$th stratum such that $\sum_{h=1}^{L} n_h = n$, where $n_h$ is the stratum sample size. Let $(y_{hi}, x_{hi}, z_{hi})$ denote the observed values of $y$, $x$ and $z$ on the $i$th unit of the $h$th stratum, where $i = 1, 2, \ldots, N_h$. To obtain the bias and MSE, we write
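$$\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h = \bar{Y}(1 + e_0), \quad \bar{x}_{st} = \sum_{h=1}^{L} W_h \bar{x}_h = \bar{X}(1 + e_1), \quad \bar{z}_{st} = \sum_{h=1}^{L} W_h \bar{z}_h = \bar{Z}(1 + e_2)$$

such that

$$E(e_0) = E(e_1) = E(e_2) = 0,$$

$$V_{rst} = \sum_{h=1}^{L} W_h^{\,r+s+t}\, \frac{E\left[(\bar{y}_h - \bar{Y}_h)^r (\bar{x}_h - \bar{X}_h)^s (\bar{z}_h - \bar{Z}_h)^t\right]}{\bar{Y}^r \bar{X}^s \bar{Z}^t},$$

where

$$\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h, \quad \bar{y}_h = \frac{1}{n_h}\sum_{i=1}^{n_h} y_{hi}, \quad \bar{Y}_h = \frac{1}{N_h}\sum_{i=1}^{N_h} Y_{hi},$$

$$\bar{Y} = \bar{Y}_{st} = \sum_{h=1}^{L} W_h \bar{Y}_h, \quad W_h = \frac{N_h}{N},$$

and

$$V(\bar{y}_{st}) = \bar{Y}^2 V_{200}. \tag{1.1}$$

Similar expressions for $X$ and $Z$ can also be defined.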
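As a small illustration of the stratified mean used above, the following sketch computes $\bar{y}_{st} = \sum_h W_h \bar{y}_h$ with $W_h = N_h/N$ from per-stratum samples. The function name `stratified_mean` and the toy numbers are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def stratified_mean(stratum_sizes, samples):
    """Return sum_h W_h * ybar_h, where W_h = N_h / N.

    stratum_sizes: population sizes N_h, one per stratum.
    samples: one list of sampled y-values per stratum.
    """
    if len(stratum_sizes) != len(samples):
        raise ValueError("need exactly one sample per stratum")
    total = sum(stratum_sizes)
    return sum(
        (size / total) * fmean(sample)
        for size, sample in zip(stratum_sizes, samples)
    )


# Hypothetical toy data: two strata with N_1 = 60 and N_2 = 40.
y_st = stratified_mean([60, 40], [[1.0, 2.0, 3.0], [10.0, 20.0]])
```

The same function can be applied to the $x$ and $z$ samples to obtain $\bar{x}_{st}$ and $\bar{z}_{st}$.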
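$$E(e_0^2) = \frac{\sum_{h=1}^{L} W_h^2 f_h S_{yh}^2}{\bar{Y}^2} = V_{200}, \quad E(e_1^2) = \frac{\sum_{h=1}^{L} W_h^2 f_h S_{xh}^2}{\bar{X}^2} = V_{020}, \quad E(e_2^2) = \frac{\sum_{h=1}^{L} W_h^2 f_h S_{zh}^2}{\bar{Z}^2} = V_{002},$$

$$E(e_0 e_1) = \frac{\sum_{h=1}^{L} W_h^2 f_h S_{yxh}}{\bar{Y}\bar{X}} = V_{110}, \quad E(e_0 e_2) = \frac{\sum_{h=1}^{L} W_h^2 f_h S_{yzh}}{\bar{Y}\bar{Z}} = V_{101}, \quad E(e_1 e_2) = \frac{\sum_{h=1}^{L} W_h^2 f_h S_{xzh}}{\bar{X}\bar{Z}} = V_{011},$$

where

$$S_{yh}^2 = \sum_{i=1}^{N_h} \frac{(y_{hi} - \bar{Y}_h)^2}{N_h - 1}, \quad S_{xh}^2 = \sum_{i=1}^{N_h} \frac{(x_{hi} - \bar{X}_h)^2}{N_h - 1}, \quad S_{zh}^2 = \sum_{i=1}^{N_h} \frac{(z_{hi} - \bar{Z}_h)^2}{N_h - 1},$$

$$S_{yxh} = \sum_{i=1}^{N_h} \frac{(y_{hi} - \bar{Y}_h)(x_{hi} - \bar{X}_h)}{N_h - 1}, \quad S_{yzh} = \sum_{i=1}^{N_h} \frac{(y_{hi} - \bar{Y}_h)(z_{hi} - \bar{Z}_h)}{N_h - 1}, \quad S_{xzh} = \sum_{i=1}^{N_h} \frac{(x_{hi} - \bar{X}_h)(z_{hi} - \bar{Z}_h)}{N_h - 1},$$

and

$$f_h = \frac{1}{n_h} - \frac{1}{N_h}.$$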
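The second-order moments $V_{200}, V_{020}, V_{002}, V_{110}, V_{101}, V_{011}$ can be computed directly from stratum-level summaries. The sketch below assumes that the stratum variances and covariances are already known; the `Stratum` container, the function name and the single-stratum toy values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stratum:
    N: int          # stratum population size N_h
    n: int          # stratum sample size n_h
    y_mean: float   # stratum mean of y
    x_mean: float   # stratum mean of x
    z_mean: float   # stratum mean of z
    var_y: float    # S_yh^2
    var_x: float    # S_xh^2
    var_z: float    # S_zh^2
    cov_yx: float   # S_yxh
    cov_yz: float   # S_yzh
    cov_xz: float   # S_xzh


def second_order_moments(strata):
    """Return the six second-order moments V_rst (r + s + t = 2) as a dict."""
    total = sum(s.N for s in strata)

    def weighted(attr):
        # sum_h W_h^2 * f_h * quantity_h, with f_h = 1/n_h - 1/N_h
        return sum(
            (s.N / total) ** 2 * (1 / s.n - 1 / s.N) * getattr(s, attr)
            for s in strata
        )

    # Overall means are the weighted stratum means.
    Y = sum(s.N / total * s.y_mean for s in strata)
    X = sum(s.N / total * s.x_mean for s in strata)
    Z = sum(s.N / total * s.z_mean for s in strata)
    return {
        "V200": weighted("var_y") / Y**2,
        "V020": weighted("var_x") / X**2,
        "V002": weighted("var_z") / Z**2,
        "V110": weighted("cov_yx") / (Y * X),
        "V101": weighted("cov_yz") / (Y * Z),
        "V011": weighted("cov_xz") / (X * Z),
    }


# Hypothetical single-stratum example (not the data of the empirical study).
toy = [Stratum(N=100, n=10, y_mean=10, x_mean=20, z_mean=5,
               var_y=4, var_x=16, var_z=1,
               cov_yx=6, cov_yz=1.5, cov_xz=3)]
V = second_order_moments(toy)
```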
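2. Estimators in literature
To estimate the population mean of the study variable $y$, assuming knowledge of the population proportion $P$, Naik and Gupta [5] and Singh et al. [11], respectively, proposed the following estimators:

$$t_1 = \bar{y}_{st}\,\frac{\bar{X}}{\bar{x}_{st}} \tag{2.1}$$

$$t_2 = \bar{y}_{st} \exp\left(\frac{\bar{X} - \bar{x}_{st}}{\bar{X} + \bar{x}_{st}}\right) \tag{2.2}$$

The MSE expressions of these estimators are given as

$$\mathrm{MSE}(t_1) = \bar{Y}^2\left[V_{200} + V_{020} - 2V_{110}\right] \tag{2.3}$$

$$\mathrm{MSE}(t_2) = \bar{Y}^2\left[V_{200} + \frac{V_{020}}{4} - V_{110}\right] \tag{2.4}$$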
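When information on two auxiliary variables is available, Singh [10] proposed some ratio-cum-product estimators in simple random sampling to estimate the population mean of the study variable $y$. Motivated by Singh [10] and Singh et al. [7], Singh and Kumar proposed the following estimators in stratified sampling:

$$t_3 = \bar{y}_{st} \exp\left(\frac{\bar{X} - \bar{x}_{st}}{\bar{X} + \bar{x}_{st}}\right) \exp\left(\frac{\bar{Z} - \bar{z}_{st}}{\bar{Z} + \bar{z}_{st}}\right) \tag{2.5}$$

$$t_4 = \bar{y}_{st} \exp\left(\frac{\bar{x}_{st} - \bar{X}}{\bar{x}_{st} + \bar{X}}\right) \exp\left(\frac{\bar{z}_{st} - \bar{Z}}{\bar{z}_{st} + \bar{Z}}\right) \tag{2.6}$$

$$t_5 = \bar{y}_{st} \exp\left(\frac{\bar{X} - \bar{x}_{st}}{\bar{X} + \bar{x}_{st}}\right) \exp\left(\frac{\bar{z}_{st} - \bar{Z}}{\bar{z}_{st} + \bar{Z}}\right) \tag{2.7}$$

$$t_6 = \bar{y}_{st} \exp\left(\frac{\bar{x}_{st} - \bar{X}}{\bar{x}_{st} + \bar{X}}\right) \exp\left(\frac{\bar{Z} - \bar{z}_{st}}{\bar{Z} + \bar{z}_{st}}\right) \tag{2.8}$$

The MSE equations of these estimators can be written as

$$\mathrm{MSE}(t_3) = \bar{Y}^2\left[V_{200} + \frac{V_{020}}{4} + \frac{V_{002}}{4} - V_{110} - V_{101} + \frac{V_{011}}{2}\right] \tag{2.9}$$

$$\mathrm{MSE}(t_4) = \bar{Y}^2\left[V_{200} + \frac{V_{020}}{4} + \frac{V_{002}}{4} + V_{110} + V_{101} + \frac{V_{011}}{2}\right] \tag{2.10}$$

$$\mathrm{MSE}(t_5) = \bar{Y}^2\left[V_{200} + \frac{V_{020}}{4} + \frac{V_{002}}{4} - V_{110} + V_{101} - \frac{V_{011}}{2}\right] \tag{2.11}$$

$$\mathrm{MSE}(t_6) = \bar{Y}^2\left[V_{200} + \frac{V_{020}}{4} + \frac{V_{002}}{4} + V_{110} - V_{101} - \frac{V_{011}}{2}\right] \tag{2.12}$$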
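The first-order MSEs of $t_1$ to $t_6$ are all linear combinations of the same six moments, so they can be evaluated together. The sketch below assumes the signs obtained from the first-order expansion of each estimator (a ratio-type factor contributes $-e/2$, a product-type factor $+e/2$); the function name and the toy moment values are hypothetical.

```python
def first_order_mse(Y_bar, V):
    """First-order MSEs of t1..t6 from the second-order moments in V."""
    v200, v020, v002 = V["V200"], V["V020"], V["V002"]
    v110, v101, v011 = V["V110"], V["V101"], V["V011"]
    # Terms shared by the four two-variable exponential estimators.
    base = v200 + v020 / 4 + v002 / 4
    scale = Y_bar**2
    return {
        "t1": scale * (v200 + v020 - 2 * v110),
        "t2": scale * (v200 + v020 / 4 - v110),
        "t3": scale * (base - v110 - v101 + v011 / 2),
        "t4": scale * (base + v110 + v101 + v011 / 2),
        "t5": scale * (base - v110 + v101 - v011 / 2),
        "t6": scale * (base + v110 - v101 - v011 / 2),
    }


# Hypothetical moment values, chosen only to exercise the formulas.
toy_V = {"V200": 0.0036, "V020": 0.0036, "V002": 0.0036,
         "V110": 0.0027, "V101": 0.0027, "V011": 0.0027}
mse = first_order_mse(10.0, toy_V)
```

With positively correlated auxiliary variables, as in this toy input, the ratio-type combination $t_3$ has the smallest MSE of the six and the product-type combination $t_4$ the largest.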
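When there are two auxiliary variables, the regression estimator of $\bar{Y}$ will be

$$t_7 = \bar{y}_{st} + b_{1h}(\bar{X} - \bar{x}_{st}) + b_{2h}(\bar{Z} - \bar{z}_{st}) \tag{2.13}$$

where $b_{1h} = \dfrac{s_{yx}}{s_x^2}$ and $b_{2h} = \dfrac{s_{yz}}{s_z^2}$. Here $s_x^2$ and $s_z^2$ are the sample variances of $x$ and $z$ respectively, and $s_{yx}$ and $s_{yz}$ are the sample covariances between $y$ and $x$ and between $y$ and $z$ respectively. The MSE expression of this estimator is:

$$\mathrm{MSE}(t_7) = \sum_{h=1}^{L} W_h^2 f_h S_{yh}^2 \left(1 - \rho_{yxh}^2 - \rho_{yzh}^2 + 2\rho_{yxh}\rho_{yzh}\rho_{xzh}\right) \tag{2.14}$$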
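Expression (2.14) is a sum over strata and is straightforward to evaluate once the stratum correlations are known. The following sketch assumes each stratum is described by the tuple $(W_h, f_h, S_{yh}^2, \rho_{yxh}, \rho_{yzh}, \rho_{xzh})$; the function name and the toy input are hypothetical.

```python
def mse_regression(strata):
    """Evaluate the regression-estimator MSE in (2.14).

    strata: iterable of (W_h, f_h, S_yh^2, rho_yx, rho_yz, rho_xz).
    """
    return sum(
        W**2 * f * var_y
        * (1 - r_yx**2 - r_yz**2 + 2 * r_yx * r_yz * r_xz)
        for W, f, var_y, r_yx, r_yz, r_xz in strata
    )


# Hypothetical single-stratum input.
mse_t7 = mse_regression([(1.0, 0.09, 4.0, 0.5, 0.5, 0.5)])
```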
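3. The proposed estimator
We suggest using the exponential ratio-type estimator given in equation (2.5) in place of $\bar{y}_{st}$ in the regression estimator given in equation (2.13). In this way, we obtain the following estimator:

$$t_p = \bar{y}_{st} \left[\exp\left(\frac{\bar{X} - \bar{x}_{st}}{\bar{X} + \bar{x}_{st}}\right)\right]^{m_1} \left[\exp\left(\frac{\bar{Z} - \bar{z}_{st}}{\bar{Z} + \bar{z}_{st}}\right)\right]^{m_2} + b_{1h}(\bar{X} - \bar{x}_{st}) + b_{2h}(\bar{Z} - \bar{z}_{st}) \tag{3.1}$$

where $m_1$ and $m_2$ are real constants.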
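Expressing equation (3.1) in terms of the $e$'s, we have

$$t_p = \bar{Y}(1 + e_0)\left[\exp\left(\frac{-e_1}{2 + e_1}\right)\right]^{m_1}\left[\exp\left(\frac{-e_2}{2 + e_2}\right)\right]^{m_2} - b_{1h}\bar{X}e_1 - b_{2h}\bar{Z}e_2$$

$$= \bar{Y}\left[1 + e_0 - \frac{m_1 e_1}{2} + \frac{m_1^2 e_1^2}{4} - \frac{m_2 e_2}{2} + \frac{m_1 m_2 e_1 e_2}{4} + \frac{m_2^2 e_2^2}{4} - \frac{m_2 e_0 e_2}{2} - \frac{m_1 e_0 e_1}{2}\right] - b_{1h} e_1 \bar{X} - b_{2h} e_2 \bar{Z} \tag{3.2}$$

Subtracting $\bar{Y}$ from both sides of (3.2), squaring, and neglecting terms of power greater than two, we have

$$(t_p - \bar{Y})^2 = \left[\bar{Y}\left(e_0 - \frac{m_1 e_1}{2} - \frac{m_2 e_2}{2}\right) - b_{1h} e_1 \bar{X} - b_{2h} e_2 \bar{Z}\right]^2 \tag{3.3}$$

Taking expectations of both sides of (3.3), we have the mean squared error of $t_p$ up to the first degree of approximation as

$$\mathrm{MSE}(t_p) = \bar{Y}^2\left[V_{200} + P_1\right] + P_2 + \bar{Y} P_3 \tag{3.4}$$

where

$$P_1 = \frac{m_1^2 V_{020}}{4} + \frac{m_2^2 V_{002}}{4} + \frac{m_1 m_2 V_{011}}{2} - m_1 V_{110} - m_2 V_{101},$$

$$P_2 = B_{1h}^2 \bar{X}^2 V_{020} + B_{2h}^2 \bar{Z}^2 V_{002} + 2 B_{1h} B_{2h} \bar{X}\bar{Z} V_{011},$$

$$P_3 = -2B_{1h}\bar{X}V_{110} - 2B_{2h}\bar{Z}V_{101} + m_1 B_{1h}\bar{X}V_{020} + m_1 B_{2h}\bar{Z}V_{011} + m_2 B_{1h}\bar{X}V_{011} + m_2 B_{2h}\bar{Z}V_{002}, \tag{3.5}$$

and

$$B_{1h} = \frac{\sum_{h=1}^{L} W_h^2 f_h \rho_{yxh} S_{yh} S_{xh}}{\sum_{h=1}^{L} W_h^2 f_h S_{xh}^2}, \quad B_{2h} = \frac{\sum_{h=1}^{L} W_h^2 f_h \rho_{yzh} S_{yh} S_{zh}}{\sum_{h=1}^{L} W_h^2 f_h S_{zh}^2}.$$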
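For a given sample, the estimator in (3.1) is a closed-form function of the three stratified means, the known population means of the auxiliary variables, and the constants $m_1$, $m_2$, $b_{1h}$, $b_{2h}$. The sketch below evaluates it; the function name, argument names and toy inputs are hypothetical.

```python
import math


def proposed_estimate(y_st, x_st, z_st, X_bar, Z_bar, m1, m2, b1, b2):
    """Evaluate t_p of (3.1) for given sample and population quantities."""
    # [exp(u)]^m equals exp(m * u), so the powers move into the exponents.
    ratio_x = math.exp(m1 * (X_bar - x_st) / (X_bar + x_st))
    ratio_z = math.exp(m2 * (Z_bar - z_st) / (Z_bar + z_st))
    regression_part = b1 * (X_bar - x_st) + b2 * (Z_bar - z_st)
    return y_st * ratio_x * ratio_z + regression_part


# Hypothetical inputs. With m1 = m2 = 0 the exponential factors equal 1,
# so t_p reduces to the regression estimator of (2.13).
t_reg = proposed_estimate(10.0, 18.0, 4.0, 20.0, 5.0, 0, 0, 0.5, 0.3)
```

Setting $b_{1h} = b_{2h} = 0$ and $m_1 = m_2 = 1$ in the same function recovers $t_3$ of (2.5).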
The optimum values of m1 and m 2 will be
2 4 B1h V011V002 B 2h V011 B1h V020 V002 B 2h V011 V002 2 Y V020 V002 V011 2 4 B1h V011 V020 B 2h V011 B1h V011V020 B 2h V002 V020 m2 2 Y V020 V002 V011
m1
(3.6)
Substituting the optimum values of $m_1$ and $m_2$ from (3.6) into (3.4), we obtain the minimum MSE of the proposed estimator $t_p$.
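The minimisation can also be checked numerically. The sketch below is based only on the linearised error in (3.3): it writes that error as $c_0 e_0 + c_1 e_1 + c_2 e_2$, evaluates its second moment, and solves the two normal equations for $m_1$ and $m_2$. The closed form used in `optimal_m` is derived from this linearisation and is an assumption of the sketch, not a transcription of (3.6); all names and toy values are hypothetical.

```python
def mse_tp(Y_bar, X_bar, Z_bar, V, m1, m2, B1, B2):
    """Second moment of the linearised error of t_p implied by (3.3)."""
    c0 = Y_bar
    c1 = -(Y_bar * m1 / 2 + B1 * X_bar)   # coefficient of e1
    c2 = -(Y_bar * m2 / 2 + B2 * Z_bar)   # coefficient of e2
    return (c0**2 * V["V200"] + c1**2 * V["V020"] + c2**2 * V["V002"]
            + 2 * c0 * c1 * V["V110"]
            + 2 * c0 * c2 * V["V101"]
            + 2 * c1 * c2 * V["V011"])


def optimal_m(Y_bar, X_bar, Z_bar, V, B1, B2):
    """Values of (m1, m2) that minimise mse_tp for fixed B1 and B2."""
    det = V["V020"] * V["V002"] - V["V011"] ** 2
    if det == 0:
        raise ValueError("e1 and e2 are perfectly correlated")
    # Least-squares coefficients of e0 on (e1, e2).
    a1 = (V["V110"] * V["V002"] - V["V101"] * V["V011"]) / det
    a2 = (V["V101"] * V["V020"] - V["V110"] * V["V011"]) / det
    return (2 * (a1 - B1 * X_bar / Y_bar),
            2 * (a2 - B2 * Z_bar / Y_bar))


# Hypothetical moment values, chosen only to exercise the formulas.
toy_V = {"V200": 0.0036, "V020": 0.0036, "V002": 0.0036,
         "V110": 0.0027, "V101": 0.0027, "V011": 0.0027}
m1_opt, m2_opt = optimal_m(10.0, 20.0, 5.0, toy_V, 0.0, 0.0)
min_mse = mse_tp(10.0, 20.0, 5.0, toy_V, m1_opt, m2_opt, 0.0, 0.0)
```

Under this linearisation the minimum does not depend on the values of $B_{1h}$ and $B_{2h}$, because any change in them can be absorbed by $m_1$ and $m_2$.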
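4. Efficiency comparison
In this section, we derive the conditions under which the proposed estimator $t_p$ is more efficient than $\bar{y}_{st}$, $t_1$, $t_2$, $t_3$, $t_4$, $t_5$, $t_6$ and $t_7$.
The variance of $\bar{y}_{st}$ is given by

$$V(\bar{y}_{st}) = \bar{Y}^2 V_{200} \tag{4.1}$$

To compare the efficiency of the proposed estimator with the existing estimators, from (4.1) and (2.3), (2.4), (2.9), (2.10), (2.11), (2.12) and (2.14), we have

$$V(\bar{y}_{st}) - \mathrm{MSE}(t_p) = -\left[\bar{Y}^2 P_1 + P_2 + \bar{Y}P_3\right] \ge 0 \tag{4.2}$$

$$\mathrm{MSE}(t_1) - \mathrm{MSE}(t_p) = \bar{Y}^2\left[V_{020} - 2V_{110}\right] - \left[\bar{Y}^2 P_1 + P_2 + \bar{Y}P_3\right] \ge 0 \tag{4.3}$$

$$\mathrm{MSE}(t_2) - \mathrm{MSE}(t_p) = \bar{Y}^2\left[\frac{V_{020}}{4} - V_{110}\right] - \left[\bar{Y}^2 P_1 + P_2 + \bar{Y}P_3\right] \ge 0 \tag{4.4}$$

$$\mathrm{MSE}(t_3) - \mathrm{MSE}(t_p) = \bar{Y}^2\left[\frac{V_{020}}{4} + \frac{V_{002}}{4} - V_{110} - V_{101} + \frac{V_{011}}{2}\right] - \left[\bar{Y}^2 P_1 + P_2 + \bar{Y}P_3\right] \ge 0 \tag{4.5}$$

$$\mathrm{MSE}(t_4) - \mathrm{MSE}(t_p) = \bar{Y}^2\left[\frac{V_{020}}{4} + \frac{V_{002}}{4} + V_{110} + V_{101} + \frac{V_{011}}{2}\right] - \left[\bar{Y}^2 P_1 + P_2 + \bar{Y}P_3\right] \ge 0 \tag{4.6}$$

$$\mathrm{MSE}(t_5) - \mathrm{MSE}(t_p) = \bar{Y}^2\left[\frac{V_{020}}{4} + \frac{V_{002}}{4} - V_{110} + V_{101} - \frac{V_{011}}{2}\right] - \left[\bar{Y}^2 P_1 + P_2 + \bar{Y}P_3\right] \ge 0 \tag{4.7}$$

$$\mathrm{MSE}(t_6) - \mathrm{MSE}(t_p) = \bar{Y}^2\left[\frac{V_{020}}{4} + \frac{V_{002}}{4} + V_{110} - V_{101} - \frac{V_{011}}{2}\right] - \left[\bar{Y}^2 P_1 + P_2 + \bar{Y}P_3\right] \ge 0 \tag{4.8}$$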
When conditions (4.2)-(4.8) are satisfied, the proposed estimator is more efficient than the estimators considered in the literature.
5. Empirical study
In this section, we use the data set in Koyuncu and Kadilar [4]. The population statistics are given in Table 5.1. In this data set, the study variable (Y) is the number of teachers, the first auxiliary variable (X) is the number of students, and the second auxiliary variable (Z) is the number of classes in both primary and secondary schools.
Table 5.1: Data statistics
| Statistic | Stratum 1 | Stratum 2 | Stratum 3 | Stratum 4 | Stratum 5 | Stratum 6 |
|---|---|---|---|---|---|---|
| $N_h$ | 127 | 117 | 103 | 170 | 205 | 201 |
| $n_h$ | 31 | 21 | 29 | 38 | 22 | 39 |
| $S_{yh}$ | 883.835 | 644 | 1033.467 | 810.585 | 403.654 | 711.723 |
| $\bar{Y}_h$ | 703.74 | 413 | 573.17 | 424.66 | 267.03 | 393.84 |
| $S_{xh}$ | 30486.751 | 15180.760 | 27549.697 | 18218.931 | 8997.776 | 23094.141 |
| $\bar{X}_h$ | 20804.59 | 9211.79 | 14309.30 | 9478.85 | 5569.95 | 12997.59 |
| $S_{yxh}$ | 25237153.52 | 9747942.85 | 28294397.04 | 1452885.53 | 3393591.75 | 15864573.97 |
| $\rho_{yxh}$ | 0.936 | 0.996 | 0.994 | 0.983 | 0.989 | 0.965 |
| $\beta_2(x_h)$ | 4.593 | 18.543 | 15.446 | 10.162 | 21.947 | 23.114 |
| $\beta_2(y_h)$ | 2.158 | 16.392 | 14.979 | 12.167 | 21.008 | 20.254 |
| $S_{zh}$ | 555.5816 | 365.4576 | 612.9509 | 458.0282 | 260.8511 | 397.0481 |
| $\bar{Z}_h$ | 498.28 | 318.33 | 431.36 | 498.28 | 227.20 | 313.71 |
| $S_{yzh}$ | 480688.2 | 230092.8 | 623019.3 | 36493.4 | 101539 | 277696.1 |
| $S_{xzh}$ | 15914648 | 5379190 | 164900674.56 | 8041254 | 214457 | 8857729 |
| $\rho_{yzh}$ | 0.978 | 0.976 | 0.983 | 0.982 | 0.964 | 0.982 |
| $\beta_2(z_h)$ | 2.314 | 11.190 | 10.786 | 8.624 | 9.720 | 14.406 |
We have computed the percent relative efficiency (PRE) of the different estimators of $\bar{Y}$ with respect to $\bar{y}_{st}$ and compiled the results in Table 5.2.
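Table 5.2: Percent Relative Efficiencies (PRE) of the estimators

| S. No. | Estimator | PRE |
|---|---|---|
| 1 | $\bar{y}_{st}$ | 100 |
| 2 | $t_1$ | 1029.46 |
| 3 | $t_2$ | 370.17 |
| 4 | $t_3$ | 2045.43 |
| 5 | $t_4$ | 27.94 |
| 6 | $t_5$ | 126.41 |
| 7 | $t_6$ | 77.21 |
| 8 | $t_7$ | 2360.54 |
| 9 | $t_p$ | 4656.35 |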
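The PRE column can be produced from any set of MSE values. The sketch below assumes the usual definition $\mathrm{PRE}(t) = 100 \times V(\bar{y}_{st}) / \mathrm{MSE}(t)$, which gives $\bar{y}_{st}$ a PRE of 100; the function name and the toy MSE values are hypothetical and are not the values behind Table 5.2.

```python
def percent_relative_efficiency(var_baseline, mse_by_estimator):
    """Return 100 * V(y_st) / MSE(t) for each estimator t."""
    return {
        name: 100.0 * var_baseline / mse
        for name, mse in mse_by_estimator.items()
    }


# Hypothetical values: a baseline variance and two estimator MSEs.
pre = percent_relative_efficiency(0.36, {"y_st": 0.36, "t1": 0.18, "t3": 0.135})
```

A PRE above 100 means the estimator has a smaller first-order MSE than $\bar{y}_{st}$; a value below 100, as for $t_4$ and $t_6$ in Table 5.2, means it is less efficient.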
6. Conclusion
In this paper, we proposed a new estimator of the unknown population mean of the study variable using information on two auxiliary variables. Expressions for the bias and MSE of the estimator are derived up to the first degree of approximation. The proposed estimator is compared with the usual mean estimator and the other estimators considered. A numerical study is carried out to support the theoretical results. As Table 5.2 shows, the proposed estimator performs better than the usual sample mean and the other estimators considered.
References
[1] Bahl, S. and Tuteja, R.K. (1991): Ratio and product type exponential estimator. Inform. and Optim. Sci., XII, I, 159-163.
[2] Diana, G. (1993): A class of estimators of the population mean in stratified random sampling. Statistica, 53(1), 59-66.
[3] Kadilar, C. and Cingi, H. (2003): Ratio estimators in stratified random sampling. Biometrical Journal, 45(2), 218-225.
[4] Koyuncu, N. and Kadilar, C. (2009): Family of estimators of population mean using two auxiliary variables in stratified random sampling. Comm. in Stat. - Theory and Meth., 38(14), 2398-2417.
[5] Naik, V.D. and Gupta, P.C. (1996): A note on estimation of mean with known population proportion of an auxiliary character. Jour. Ind. Soc. Agri. Stat., 48(2), 151-158.
[6] Perri, P.F. (2007): Improved ratio-cum-product type estimators. Statist. Trans., 8, 51-69.
[7] Singh, R., Kumar, M., Singh, R.D. and Chaudhary, M.K. (2008): Exponential ratio type estimators in stratified random sampling. Presented at the International Symposium on Optimisation and Statistics (I.S.O.S.) 2008, held at A.M.U., Aligarh, India, 29-31 Dec 2008.
[8] Singh, H.P. and Vishwakarma, G.K. (2008): A family of estimators of population mean using auxiliary information in stratified sampling. Communication in Statistics Theory and Methods, 37(7), 1038-1050.
[9] Singh, M.P. (1965): On the estimation of ratio and product of population parameter. Sankhya, 27 B, 321-328.
[10] Singh, M.P. (1967): Ratio-cum-product method of estimation. Metrika, 12, 34-72.
[11] Singh, R., Chauhan, P., Sawan, N. and Smarandache, F. (2007): Auxiliary information and a priori values in construction of improved estimators. Renaissance High Press.