Sampling Algorithm of Order Statistics for Conditional ...

An Improved Estimator of Finite Population Mean Using Auxiliary Attribute(s) in Stratified Random Sampling Under Non-Response Rajesh Singh Department of Statistics BHU, Varanasi (U.P.), India [email protected]

Sachin Malik Department of Statistics BHU, Varanasi (U.P.), India [email protected]

Abstract In the present study, we propose a new estimator for population mean using Singh et al. (2007) and Malik and Singh (2013) estimators in the case of stratified random sampling when the information is available in form of attributes under non-response. Expressions for the mean squared error (MSE) of the proposed estimators are derived up to the first degree of approximation. The theoretical conditions have also been verified by a numerical example. It has been shown that the proposed estimators are more efficient than the traditional estimators.

Keywords: Stratified random sampling, Combined exponential ratio type estimator, bias, Mean Square error, Auxiliary attribute, Efficiency. 1. Introduction The problem of estimating the population mean in the presence of an auxiliary variable has been widely discussed in finite population sampling literature. Diana (1993) suggested a class of estimators of the population mean using one auxiliary variable in the stratified random sampling and examined the MSE of the estimators up to the kth order of approximation. Kadilar and Cingi (2003), Singh and Vishwakarma (2005), Singh et al. (2009), Koyuncu and Kadilar (2008, 2009) proposed estimators in stratified random sampling. Singh (1965) and Perri (2007) suggested some ratio cum product estimators in simple random sampling. Bahl and Tuteja (1991) and Singh et al. (2007) suggested some exponential ratio type estimators. There are some situations when in place of one auxiliary attribute, we have information on two qualitative variables. For illustration, to estimate the hourly wages we can use the information on marital status and region of residence (see Gujrati and Sangeeta (2007). Here we assume that both auxiliary attributes have significant point bi-serial correlation with the study variable and there is significant phi-correlation (see Yule (1912)) between the two auxiliary attributes. In surveys covering human populations, information in most cases is not obtained from all the units in the survey even after some call-backs. An estimate obtained from such incomplete data may be misleading especially when the respondents differ from the nonrespondents because the estimate can be biased. Hansen and Hurwitz (1946) envisaged a simple technique of sub-sampling the non-respondents in order to adjust for the nonresponse in a mail survey. In estimating population parameters like the mean, total or

Pak.j.stat.oper.res. Vol.X No.2 2014 pp147-155

Rajesh Singh, Sachin Malik

ratio, sample survey experts sometimes use auxiliary information to improve precision of the estimates. L

Consider a finite population of size N and is divided into L strata such that

N h 1

h

N

where is the size of h th stratum (h=1,2,...,L). We select a sample of size n h from each stratum by simple random sample without replacement sampling such that L

n h 1

h

 n , where n h is the stratum sample size. Let ( y hi ,  jhi ) (j=1,2) denote the values

of the study variable (y) and the auxiliary attributes  j , (j=1,2) respectively in the h th stratum. Suppose that n h1 units will respond and n h 2 will not respond such that n n h1  n h 2  n h . We select a sub-sample of size rh  h2 k h  1 from n h 2 units and it is kh assumed that all the selected units will respond. Suppose ψ jhi

Let

th  1, if i unit in the stratum h possesses the attribute ψ j    0, otherwise

L

L

h 1

h 1

Pj   Wh Pjh and p jst   Wh p jh (j=1, 2) are the population and sample

proportions of units of the auxiliary attributes where Pjh  Nh

nh

i 1

i 1

A jh Nh

, p jh 

a jh nh

and

A jh    jhi and a jh    jhi .

 y Nh

Let S2yh 

i 1

hi  Yh



 ψ Nh

2

and S2ψjh 

i 1

 Pjh 

2

jhi

be the population variances of the N h 1 N h 1 study variable and the auxiliary attributes in the hth stratum. Where, Nh

Yh   i 1





y hi  Y h  jhi  Pjh  y hi , S y jh   Nh Nh 1

and  y jh 

S y

jh

S jh Syh

be the population bi-covariance and point bi-serial correlation between the study variable h th and the auxiliary attributes respectively in the stratum. Also Nh S12h   P   P  S12h   1hi 1h 2 hi 2 h and 12h  be the population bi-covariance Nh 1 S1h S2 h i 1 and phi-correlation coefficient between the auxiliary attributes in the h th stratum respectively.

148


An Improved Estimator of Finite Population Mean Using Auxiliary Attribute(s) in Stratified Random …….. L

To estimate Y   Wh Y h , we assume that Pjh (j=1, 2) are known. h 1

Now adapting the Hansen-Hurwitz (1946) methodology, an unbiased estimator of L * * N population mean Y is given by y st   Wh y h , where Wh  h is known as stratum N h 1 *

weight and y h 

n h1 y1h  n h 2 y 2 h . Here, y1h and y 2 h are the sample means of nh

n h1 responding units n h 2 sub-sampled units respectively. *

The variance of the estimator y st is given as

 

L

L

var y st   Wh2 f h S2yh   Wh2 *

Where, Wh 2  fh 

h 1

Nh2 Nh

h 1

k h  1 W

2 h2 yh2

nh

S

is the known non-response rate in the h th

stratum and

1 1 n h Nh .

2. Estimators in literature In order to have an estimate of the study variable y, assuming the knowledge of the population proportion P, Naik and Gupta (1996) and Singh et al. (2007) respectively proposed following estimators *  P  t1  yst  1   p1st 

(2.1)

* P p  t 2  yst exp  1 1st   P1  p1st 

(2.2)

where, L

L

h 1

h 1

P1   Wh P1h and p1st   Wh p1h .

The Bias and MSE expressions of the estimator’s t i (i=1, 2) up to the first order of approximation are, respectively, given by

Bt1  



1 L 2 Wh f h R1S2 1h   y1h SyhS1h  P1 h 1




(2.3)

149


Bt 2  

1 L 3  Wh2f h  S2 1h   y1h SyhS1h   2P1 h 1 4 



L



L

2 2 2 2 2 MSE t 1    Wh f h S yh  R 1 S1h  2R 1 y1h S yhS1h   Wh h 1

h 1

k h  1 W nh

h2

(2.4)

S2yh2

(2.5)

 2 R 12 2  L k  1 W S2 S1h  R 1 y1h SyhS1h    Wh2 h MSE t 2    W f Syh  h 2 yh2 4 nh h 1   h 1 L

2 h h

where, R 1 

(2.6)

Y P1

Following Bahl and Tuteja (1991), Shagir and Shabbir (2012) proposed a ratio type exponential estimator as *  P1  p1st   P 2 p 2st   exp   t 3  yst exp   P1  a  1p1st   P2  b  1p 2st 

(2.7)

The Bias and MSE expressions of the estimators t 3 up to the first order of approximation is L  2a  1 S2ψ1h 2b  1 S2ψ 2h ρ yψ1h S yhSψ1h ρ yψ2h S yhSψ 2h ρ ψ1ψ 2h Sψ1h Sψ 2h Bt 3    Wh2 f h       2a 2 P12 abP1P2 2b 2 P22 a YP1 bYP2 h 1 

   

(2.8)

and L  2 R2 2 R2 2 R R 2 MSE t 3    Wh f h S yh  21 Sψ1h  22 Sψ21h  2 1 ρ yψ1h S yhSψ1h  2 1 ρ yψ2h S yhSψ2h a b a b h 1  k  1 W S2 RR  L  2 1 2  1 2h S1h S 2 h    Wh2 h h 2 yh2 (2.9) ab nh  h 1

where, R 2 

Y P2

3. The proposed estimator Following Malik and Singh (2013), we propose an estimator as 1

P p   P  p 2st   t 4  y st exp  1 1st  exp  2  P1  p1st   P2  p 2st  *

2

 b1 P1  p1st   b 2 P2  p 2st 

(3.1)

where, 1 and  2 are real constants. To obtain the bias and MSE of t 4 to the first degree of approximation, we define *

y Y p P p  P2 e 0  st , e1  1st 1 , e 2  2st P1 P2 Y 150


An Improved Estimator of Finite Population Mean Using Auxiliary Attribute(s) in Stratified Random ……..

Such that, E(ei )  0 ; i  0, 1, 2. L

 

2 And E e 0 

W f S 2 h h

h 1

Y

L

2 yh

2

W f S

 

, E e12 

h 1

P

Ee 0 e 2  

,

YP1

 

, E e 22 

W f S h 1

2 h h

2  2h

,

P22 L

L

 Wh2 f h S2y1h h 1

L

2 1h

2 1

L

Ee 0 e1  

2 h h

 Wh2 f h S2y2h

, and E e1e2  

h 1

YP2

W h 1

2 h

f h S21 2 h

P1 P2

Expressing equation (3.1) in terms of e’s, we have 2    e  1   e 2   1   exp    b1e1P1  b 2 e 2 P2 t 4  Y 1  e 0  exp    2  e1   2  e 2    

 e e   e  2e 2  e  ee  2e2  e e  Y 1  e 0  1 1  1 1  2 2  1 2 1 2  2 2  2 0 2  1 0 1   b1e1P1  b 2 e 2 P2 2 4 2 4 4 2 2  

(3.2)

Retaining the term’s up to single power of e’s in (3.2) we have

   e  e  t 4  Y  Y e 0  1 1  2 2   b1e1P1  b 2 e 2 P2  2 2    

(3.3)

Squaring both sides of (3.3) and then taking expectations, we get the MSE of the estimator t 4 up to the first order of approximation, as 2 2 2 2 2 2   2 1 R 1 S1h  2 R 2S 2h 1 2 R 1R 2  1 2h S1h S 2 h MSE( t 4 )   W f S yh     1R 1 y1h S yhS1h 4 4 2 h 1   L

2 h h



  2 R 2 y2 h SyhS2 h  B12S2 1h  B22S2 2 h  2B1B212h S1h S2 h



 2 B1R1S2 1h  B2 R 2S2 2 h 1B1R 1S2 1h

 L

W + h 1



2 2 h

k h  1 W

h2

nh

 2 B 2 R 2S2 2 h 2



1B 2 R 2S2 2 h

S2yh2

(3.4)

L

Where, B1 

 Wh2 f h  y1h S yhS1h h 1

L

W f S h 1

2 h h

2

 2 B1R 1S2 2 h      2   

L

and B 2 

2 1


W f  h 1

2 h h

y 2 h

S yhS 2 h

L

W f S h 1

2 h h

2 2

151


Minimising equation (3.4) with respect to 1 and  2 we get the optimum values as

1 

2R 2 A 3 A 4  4 R 1 A 2 A 5 2R 1A1A 5  4R 2 A 2 A 3 and  2  2 2 R 1R 22 A1A 4  4A 22 R 1 R 2 A1 A 4  4 A 2









where, L

A1   Wh2 f h S2yh h 1 L

A 2   Wh2 f h  1 2h S1h S 2 h h 1 L



A 3   Wh2 f h R 1 y1h S yhS1h  B1R 1S2 1h  B 2 R 2S2 2 h



h 1 L

A 4   Wh2 f h S2 2 h and h 1 L



A 5   Wh2 f h R 2  1 2h S1h S 2 h  B1R 1S2 2 h  B 2 R 2S2 2 h h 1

              



4. Empirical study Source: [Murthy (1967)] We randomly select a sample of size n h from each stratum by using the Neyman allocation and consider the first 10%, 20% and 30% values in each stratum as nonresponse for Wh 2 =0.1, Wh 2 =0.2 and Wh 2 =0.3 respectively. The population consists of village wise complete enumeration and data are obtained in 1951 and 1961 censuses for a Tehsil. The area of village is used to stratify the population into three strata. Let y be the cultivated area in the village in hectares in 1951 and  jhi (j=1,2) are given below Let 1, if i th village in the stratum1has area greater than 550 hectares ψ11i   0, otherwise

1, if i th village in the stratum 2 has area greater than 1300 hectares ψ12i   0, otherwise 1, if i th village in the stratum 3has area greater than 2500 hectares ψ13i   0, otherwise

152



Let

1, if i th village in the stratum1has number of cultivators greater than 550 ψ 21i   0, otherwise 1, if i th village in the stratum 2has number of cultivators greater than 700 ψ 22i   0, otherwise and

1, if i th village in the stratum 3has number of cultivators greater than 1500 ψ 22i   0, otherwise Data are presented in Table 4.1 and results are given in Table 4.2. Table 4.1: Data Statistics Stratum no.

1

2

3

Nh

43

45

40

nh

10

12

18

Yh

397.1425

760.1795

1234.6180

P1h

0.5814

0.4444

0.4000

P2 h

0.39535

0.5333

0.4000

S2yh

39975.0569

61455.990

172425.900

S2 1h

0.2492

0.2525

0.2462

S2 2 h

0.2448

0.2545

0.2462

 y1h

0.6922

0.3750

0.5057

 y 2 h

0.3956

0.2847

0.3261

1 2 h

0.2040

0.3884

0.5833

S2yh2 , For Wh2 =0.1

16871.6298

5534.3610

174694.200

For Wh2 =0.2

10951.1712

43003.2000

123887.767

For Wh2 =0.3

15878.3587

49346.8000

152450.527

We compute the percent relative efficiency (PRE) of different estimators for different values of Wh 2 and k h and use the given expression

  *

var y st PRE  x100 , (i=1, 2, 3, 4) MSEt i 


153

Rajesh Singh, Sachin Malik *

Table 4.2: Percentage relative efficiency of different estimators with respect to y st Wh 2

kh

PREt1 

0.1

2.0 2.5 3.0 3.5 2.0 2.5 3.0 3.5 2.0 2.5 3.0 3.5

13.95 14.45 14.94 15.42 15.01 15.10 16.97 17.92 16.65 18.39 20.05 21.65

0.2

0.3

PREt 2 

PREt 3 

PREt 4 

54.93 55.93 56.89 57.80 57.02 58.87 60.57 62.13 60.02 62.86 65.33 67.49

129.83 128.31 126.94 125.69 126.75 124.32 122.30 120.58 122.93 119.82 117.45 115.59

449.35 435.30 422.10 409.00 422.10 397.95 376.43 357.10 397.95 366.51 339.68 316.50

From Table 4.2, we observe that the proposed estimator t 4 is more efficient as compared *

to the usual estimator y st , Shagir and Shabbir (2012) and the estimators t1 and t2. The efficiency of the estimators decreases with increase in the level of non-response in each stratum and k h . 5. Conclusion In this paper, we have proposed a combined exponential ratio type estimator t 4 using information on the auxiliary attribute(s) under non-response. Expressions for bias and MSE’s of the proposed estimators are derived up to first degree of approximation. The proposed estimator is compared with usual mean estimator, Shagir and Shabbir (2012) estimator and estimators t1 and t2. A numerical study is carried out to support the theoretical results for different values of non-responses. The proposed estimator t 4 performed better than the usual sample mean estimator and estimators t1, t2 and t3. The efficiency of the estimators decreases with increase rate of non-response Wh 2 and k h . References 1.

Bahl, S. and Tuteja, R.K. (1991). Ratio and product type exponential estimator. Journal of information and optimization sciences, 12(1), 159-163.

2.

Diana, G. (1993). A class of estimators of the population mean in stratified random sampling. Statistica, 53(1), 59–66.

3.

Gujarati, D. N. and Sangeetha (2007). Basic Econometrics. Tata McGraw – Hill.

4.

Hansen, M.H. and Hurwitz, W.N. (1946). The problem of non-response in sample surveys. J. Am. Stat. Assoc., 41,517–529.

154



5.

Kadilar, C. and Cingi, H. (2003). Ratio Estimators in Stratified Random Sampling. Biometrical Journal, 45 (2), 218-225.

6.

Kadilar, C. and Cingi, H. (2005). A new estimator using two auxiliary variables. Applied math and computation, 162, 901–908.

7.

Koyuncu, N. and Kadilar, C. (2008). Ratio and product estimators in stratified random sampling. J. Statist. Plann. Inference, 139(8), 2552-2558.

8.

Koyuncu, N. and Kadilar, C. (2009). Family of estimators of population mean using two auxiliary variables in stratified random sampling. Comm. In Stat. – Theory and Meth., 38(14), 2398-2417.

9.

Malik S. and Singh R. (2013). An improved estimator using two auxiliary attributes. Applied mathematics and computation. 219, 10983-10986.

10.

Murthy, M.N. (1967). Sampling Theory and Methods. Statistical Publishing Society, Calcutta.

11.

Naik, V.D., Gupta, P.C. (1996). A note on estimation of mean with known population proportion of an auxiliary character. Journal of the Indian Society of Agricultural Statistics, 48(2), 151-158.

12.

Perri, P. F. (2007). Improved ratio-cum-product type estimators. Statist. Trans. 8:51–69.

13.

Shagir A. and Shabbir J. (2012). Estimation of Finite Population Mean in Stratified Random Sampling Using Auxiliary Attribute(s) under Non-Response. Pak. j. of stat. oper. Res., 8(1), 73-82.

14.

Singh, H. P.; Vishwakarma, G. K. (2005). Combined Ratio-Product Estimator of Finite Population Mean in Stratified Sampling. Metodologia de Encuestas, 8, 35-44.

15.

Singh, M.P. (1965). On the estimation of ratio and product of the population parameters. Sankhya, B, 27, 321-328.

16.

Singh, R., Kumar, M., Chaudhary, M. K. and Kadilar, C. (2009). Improved Exponential Estimator in Stratified Random Sampling. Pakistan Journal of Statistics and Operation Research, 5(2), 67-82.

17.

Yule, G. U. (1912). On the methods of measuring association between two attributes. Jour. of the Royal Soc., 75, 579-642.


155