Journal of Reliability and Statistical Studies (ISSN: 0974-8024) Vol. 2, Issue 1(2009): 28 - 40

UTILIZATION OF NON-RESPONSE AUXILIARY POPULATION MEAN IN IMPUTATION FOR MISSING OBSERVATIONS

D. Shukla 1, Narendra Singh Thakur 2 and Dharmendra Singh Thakur 3
Deptt. of Maths and Stats., Dr. Harisingh Gour University, Sagar (M.P.), India.
E-mail: 1. [email protected], 2. [email protected]

Abstract
This paper presents an imputation-based factor-type class of estimation strategies for estimating the population mean in the presence of missing values of an auxiliary variable. The non-sampled part of the population is used as an imputation technique in the form of a proposed class of estimators. The bias and mean squared error of this class are obtained, and some special cases are discussed. A specific range of the parameter is found where the proposed class is optimal. The efficiency of the proposed estimator is compared with the analogous non-imputed estimator, and the proposed class is found useful under a missing-observations setup.

Keywords: Imputation, Non-response, Post-stratification, Simple Random Sampling Without Replacement (SRSWOR), Respondents (R).

1. Introduction
To estimate the population mean using an auxiliary variable, many estimators are available in the literature, such as the ratio, product, regression and dual-to-ratio estimators. If some values of the auxiliary variable are missing, none of these estimators can be used. In sampling theory, the problem of estimating a population mean has been considered by many authors, for example Singh (1986), Singh and Singh (1991), Singh et al. (1994) and Singh and Singh (2001). Sometimes, in survey situations, a small part of the sample remains non-responded (or incomplete) due to many practical reasons, and techniques and estimation procedures need to be developed for this purpose. Imputation is a well-defined methodology by virtue of which this kind of problem can be partially solved. Ahmed et al. (2006), Rao and Sitter (1995), Rubin (1976) and Singh and Horn (2000) have given applications of various imputation procedures. Hinde and Chambers (1990) studied non-response imputation with multiple sources of non-response. Non-response in sample surveys has been looked into extensively by Hansen and Hurwitz (1946), Lessler and Kalsbeek (1992), Khot (1994), Grover and Couper (1998) and others. When the population is divided into two groups, namely "response" and "non-response", the procedure is known as post-stratification. The estimation problem in sample surveys, in the setup of post-stratification under a non-response situation, has been studied by Shukla and Dubey (2001, 2004, 2006). Some other useful contributions to this area are due to Smith (1991), Agrawal and Panda (1993), Shukla and Trivedi (1999, 2001, 2006), Wywial (2001) and Shukla et al. (2002, 2006). When a sample has full response on the study variable but some auxiliary values are missing, it is hard to utilize the usual existing estimators; traditionally, it is essential to first estimate those missing observations by some specific estimation technique.
One can think of utilizing the non-sampled part of the population in order to obtain estimates of the missing observations in the sample. These estimates could be imputed into the actual estimation procedure used for estimating the population mean. This paper takes this aspect into account for the non-responding values of the sample, assuming a post-stratified setup and utilizing the auxiliary source of data.

1.1 Symbols and Setup
Let $U = (U_1, U_2, \ldots, U_N)$ be a finite population of $N$ units with $Y$ as a study variable and $X$ an auxiliary variable. The population contains two types of individuals: $N_1$ "respondents (R)" and $N_2$ "non-respondents (NR)", with $N = N_1 + N_2$ and population proportions $W_1 = N_1/N$ and $W_2 = N_2/N$. Further, let $\bar{Y}$ and $\bar{X}$ be the population means of $Y$ and $X$ respectively. The following notations are used in this paper:

R-group : respondents group (those who respond during the survey).
$\bar{Y}_1$, $\bar{X}_1$ : population means of Y and X in the R-group.
$S_{1Y}^2$, $S_{1X}^2$ : population mean squares of Y and X in the R-group.
$C_{1Y}$, $C_{1X}$ : coefficients of variation of Y and X in the R-group.
$\rho$ : correlation coefficient between X and Y in the population.
$n_1$ : post-stratified sample size coming from the R-group.
$\bar{y}_1$, $\bar{x}_1$ : sample means of Y and X based on the $n_1$ observations of the R-group.
$\rho_1$ : correlation coefficient between the study variable Y and the auxiliary variable X for the R-group.

NR-group : non-respondents group (those who do not respond during the survey).
$\bar{Y}_2$, $\bar{X}_2$ : population means of Y and X in the NR-group.
$S_{2Y}^2$, $S_{2X}^2$ : population mean squares of Y and X in the NR-group.
$C_{2Y}$, $C_{2X}$ : coefficients of variation of Y and X in the NR-group.
$n$ : sample size drawn from the population of size N by SRSWOR.
$n_2$ : post-stratified sample size from the NR-group.
$\bar{y}_2$, $\bar{x}_2$ : sample means of Y and X based on the $n_2$ observations of the NR-group.
$\rho_2$ : correlation coefficient between the study variable Y and the auxiliary variable X for the NR-group.

Further, consider a few more symbolic representations:
$$D_1 = E\left[\frac{1}{n_1}\right] = \frac{1}{nW_1} + \frac{(N-n)(1-W_1)}{(N-1)\,n^2 W_1^2}\ ;\qquad D_2 = E\left[\frac{1}{n_2}\right] = \frac{1}{nW_2} + \frac{(N-n)(1-W_2)}{(N-1)\,n^2 W_2^2} \qquad (1.1)$$
$$\bar{Y} = \frac{N_1\bar{Y}_1 + N_2\bar{Y}_2}{N}\ ;\qquad \bar{X} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2}{N} \qquad (1.2)$$
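The approximation (1.1) is easy to evaluate numerically. The following is a minimal Python sketch (function and variable names are mine, not the paper's) computing $D_1$ and $D_2$ for the Population I sizes used later in Section 8:

```python
# Sketch of the expectations in (1.1); names are illustrative assumptions.

def D(N: int, n: int, W: float) -> float:
    """Approximate E(1/n_g) for a post-stratum with population share W:
    1/(nW) + (N - n)(1 - W) / ((N - 1) n^2 W^2)."""
    return 1.0 / (n * W) + (N - n) * (1.0 - W) / ((N - 1) * n ** 2 * W ** 2)

# Population I of Section 8: N = 180, n = 40, W1 = 100/180, W2 = 80/180.
N, n = 180, 40
D1 = D(N, n, 100 / 180)
D2 = D(N, n, 80 / 180)
```

Note that both values exceed $1/n$, reflecting the extra variability of the random post-stratum sizes.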

2. Assumptions
The following assumptions are made before formulating an imputation-based estimation procedure:


1. The values of N and n are known. Also, N1 and N2 are known from past data, past experience or a guess of the investigator (N1 + N2 = N).
2. Other population parameters are assumed known, either exactly or in ratio form, except $\bar{Y}$, $\bar{Y}_1$ and $\bar{Y}_2$.
3. The population means $\bar{X}$ and $\bar{X}_2$ are known.
4. The sample of size n is drawn by SRSWOR and post-stratified into two groups of sizes $n_1$ and $n_2$ ($n_1 + n_2 = n$) according to the R and NR groups respectively.
5. The information about the Y variable in the sample is completely available.
6. The sample means $\bar{y}_1$ and $\bar{y}_2$ of both groups are known, so that $\bar{y} = (n_1\bar{y}_1 + n_2\bar{y}_2)/n$ is the sample mean based on the n units.
7. The sample mean $\bar{x}_1$ of the auxiliary variable for the R-group is known, but the information about $\bar{x}_2$ of the NR-group is missing. Therefore, the value of $\bar{x} = (n_1\bar{x}_1 + n_2\bar{x}_2)/n$ cannot be obtained due to the absence of $\bar{x}_2$.

3. Proposed Class of Estimation Strategy
To estimate the population mean $\bar{Y}$, the usual ratio, product and regression estimators are not applicable when the observations related to $\bar{x}_2$ are missing. Singh and Shukla (1987) have proposed a factor-type estimator for estimating the population mean $\bar{Y}$. Shukla et al. (1991), Singh and Shukla (1993) and Shukla (2002) have also discussed properties of factor-type estimators applicable for estimating the population mean under SRSWOR and two-phase sampling. But all these cannot be used due to the unknown information $\bar{x}_2$. In order to solve this, an imputed value $\bar{x}_2^*$ is adopted as:
$$\bar{x}_2^* = \frac{N\bar{X} - n\bar{X}_2}{N - n} \qquad (3.1)$$
The logic of this imputation is to utilize the non-sampled part of the population of X to obtain an estimate of the missing $\bar{x}_2$ and to generate $\bar{x}^*$ as described below:
$$\bar{x}^* = \frac{N_1\bar{x}_1 + N_2\bar{x}_2^*}{N_1 + N_2} \qquad (3.2)$$
The proposed class of imputed factor-type estimation strategies for estimating $\bar{Y}$ is:
$$(\bar{y}_{FT})_k = \left[\frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N}\right]\left[\frac{(A + C)\bar{X} + fB\,\bar{x}^*}{(A + fB)\bar{X} + C\,\bar{x}^*}\right] \qquad (3.3)$$
where $0 < k < \infty$ is a constant, $A = (k-1)(k-2)$, $B = (k-1)(k-4)$, $C = (k-2)(k-3)(k-4)$ and $f = n/N$.
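As a concrete illustration of (3.1)-(3.3), the following Python sketch (all names and input figures are my own illustrative assumptions, not the paper's data) builds the imputed auxiliary mean and the factor-type estimate; at $k = 4$ the auxiliary term cancels ($A = 6$, $B = C = 0$), so the class reduces to the post-stratified sample mean:

```python
# Hedged sketch of the imputed factor-type class (3.3); names and
# numeric inputs are illustrative assumptions.

def ft_estimator(k, y1b, y2b, x1b, N1, N2, n, Xbar, X2bar):
    """Imputed factor-type estimate of the population mean of Y."""
    N = N1 + N2
    f = n / N
    A = (k - 1) * (k - 2)
    B = (k - 1) * (k - 4)
    C = (k - 2) * (k - 3) * (k - 4)
    x2_star = (N * Xbar - n * X2bar) / (N - n)   # imputation (3.1)
    x_star = (N1 * x1b + N2 * x2_star) / N       # combined auxiliary mean (3.2)
    y_ps = (N1 * y1b + N2 * y2b) / N             # post-stratified mean of Y
    return y_ps * ((A + C) * Xbar + f * B * x_star) / ((A + f * B) * Xbar + C * x_star)

# Illustrative values: N1 = 100, N2 = 80, n = 40.
est = {k: ft_estimator(k, 66.3, 59.9, 30.7, 100, 80, 40, 29.2, 26.9)
       for k in (1, 2, 3, 4)}
```

For $k = 1$ and $k = 2$ the same function yields ratio-type ($\bar{X}/\bar{x}^*$) and product-type ($\bar{x}^*/\bar{X}$) forms respectively, anticipating the special cases of Section 6.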

4. Large Sample Approximation
Consider the following for large n:
$$\bar{y}_1 = \bar{Y}_1(1 + e_1);\quad \bar{y}_2 = \bar{Y}_2(1 + e_2);\quad \bar{x}_1 = \bar{X}_1(1 + e_3);\quad \bar{x}_2 = \bar{X}_2(1 + e_4) \qquad (4.1)$$
where $e_1, e_2, e_3, e_4$ are very small numbers with $|e_i| < 1$ $(i = 1, 2, 3, 4)$.


Using the basic concept of SRSWOR and the concept of post-stratification of the sample n into $n_1$ and $n_2$ [see Cochran (2005), Sukhatme et al. (1984)], we get
$$E(e_1) = E[E(e_1 \mid n_1)] = 0;\quad E(e_2) = E[E(e_2 \mid n_2)] = 0;\quad E(e_3) = E[E(e_3 \mid n_1)] = 0;\quad E(e_4) = E[E(e_4 \mid n_2)] = 0 \qquad (4.2)$$
Assuming the independence of the R-group and NR-group representation in the sample, the following expressions could be obtained:
$$E(e_1^2) = E[E(e_1^2 \mid n_1)] = E\left[\left(\frac{1}{n_1} - \frac{1}{N}\right)C_{1Y}^2\right] = \left(D_1 - \frac{1}{N}\right)C_{1Y}^2 \qquad (4.3)$$
$$E(e_2^2) = E[E(e_2^2 \mid n_2)] = \left(D_2 - \frac{1}{N}\right)C_{2Y}^2 \qquad (4.4)$$
$$E(e_3^2) = E[E(e_3^2 \mid n_1)] = \left(D_1 - \frac{1}{N}\right)C_{1X}^2 \qquad (4.5)$$
$$E(e_4^2) = \left(D_2 - \frac{1}{N}\right)C_{2X}^2 \qquad (4.6)$$
and
$$E(e_1 e_3) = E[E(e_1 e_3 \mid n_1)] = E\left[\left(\frac{1}{n_1} - \frac{1}{N}\right)\rho_1 C_{1Y} C_{1X}\right] = \left(D_1 - \frac{1}{N}\right)\rho_1 C_{1Y} C_{1X} \qquad (4.7)$$
$$E(e_1 e_2) = E[E(e_1 e_2 \mid n_1, n_2)] = 0 \qquad (4.8)$$
$$E(e_1 e_4) = 0 \qquad (4.9)$$
$$E(e_2 e_3) = E[E(e_2 e_3 \mid n_1, n_2)] = 0 \qquad (4.10)$$
$$E(e_2 e_4) = \left(D_2 - \frac{1}{N}\right)\rho_2 C_{2Y} C_{2X} \qquad (4.11)$$
$$E(e_3 e_4) = 0 \qquad (4.12)$$
The expressions (4.8), (4.9), (4.10) and (4.12) are true under the assumption of independent representation of the R-group and NR-group units in the sample. This is introduced to simplify the mathematical expressions.

Theorem 4.1: The estimator $(\bar{y}_{FT})_k$ could be expressed under large sample approximations in the following form:
$$(\bar{y}_{FT})_k = \delta\,\bar{Y}\,[1 + s_1 W_1 e_1 + s_2 W_2 e_2]\,[1 + (\alpha - \beta)e_3 - (\alpha - \beta)\beta e_3^2 + (\alpha - \beta)\beta^2 e_3^3 - (\alpha - \beta)\beta^3 e_3^4 + \cdots] \qquad (4.13)$$
Proof: Rewrite $\bar{x}^*$ as in (3.2):
$$\bar{x}^* = \frac{N_1\bar{x}_1 + N_2\bar{x}_2^*}{N_1 + N_2},\qquad \text{where } \bar{x}_2^* = \frac{N\bar{X} - n\bar{X}_2}{N - n}$$
$$\bar{x}^* = \frac{1}{N}\left[N_1\bar{x}_1 + N_2\left(\frac{N\bar{X} - n\bar{X}_2}{N - n}\right)\right] = \bar{X}\,[W_1 r_1 + p(1 - f r_2) + W_1 r_1 e_3] = \bar{X}\,[v + W_1 r_1 e_3] \qquad (4.14)$$


where
$$p = \frac{N_2}{N - n};\quad r_1 = \frac{\bar{X}_1}{\bar{X}};\quad r_2 = \frac{\bar{X}_2}{\bar{X}};\quad W_1 = \frac{N_1}{N};\quad f = \frac{n}{N};\quad v = W_1 r_1 + p(1 - f r_2).$$
Now, under the large sample approximations (4.1) and using (4.12), the estimator $(\bar{y}_{FT})_k$ becomes
$$(\bar{y}_{FT})_k = \left[\frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N}\right]\left[\frac{(A + C)\bar{X} + fB\,\bar{x}^*}{(A + fB)\bar{X} + C\,\bar{x}^*}\right]$$
$$= \bar{Y}\,[1 + s_1 W_1 e_1 + s_2 W_2 e_2]\left[\frac{(A + fBv + C) + fBW_1 r_1 e_3}{(A + fB + Cv) + CW_1 r_1 e_3}\right]$$
$$= \bar{Y}\,[1 + s_1 W_1 e_1 + s_2 W_2 e_2]\left[\frac{\psi_1 + \psi_2 e_3}{\psi_3 + \psi_4 e_3}\right]$$
$$= \delta\,\bar{Y}\,[1 + s_1 W_1 e_1 + s_2 W_2 e_2]\,(1 + \alpha e_3)(1 + \beta e_3)^{-1} \qquad (4.15)$$
where
$$\psi_1 = A + fBv + C;\quad \psi_2 = fBW_1 r_1;\quad \psi_3 = A + fB + Cv;\quad \psi_4 = CW_1 r_1;$$
$$s_1 = \frac{\bar{Y}_1}{\bar{Y}};\quad s_2 = \frac{\bar{Y}_2}{\bar{Y}};\quad \alpha = \frac{\psi_2}{\psi_1};\quad \beta = \frac{\psi_4}{\psi_3};\quad \delta = \frac{\psi_1}{\psi_3}.$$
We can further express (4.15) as:
$$(\bar{y}_{FT})_k = \delta\,\bar{Y}\,[1 + s_1 W_1 e_1 + s_2 W_2 e_2]\,[1 + (\alpha - \beta)e_3 - (\alpha - \beta)\beta e_3^2 + (\alpha - \beta)\beta^2 e_3^3 - \cdots] \qquad (4.16)$$

5. Bias and Mean Squared Error
Using E(.) for expectation, B(.) for bias and M(.) for mean squared error, we have, to the first order of approximation, for $i, j = 1, 2, 3, \ldots$
$$E[e_1^i e_2^j] = E[e_1^i e_3^j] = E[e_2^i e_3^j] = 0 \quad \text{when } i + j > 2 \qquad (5.1)$$

Theorem 5.1: To the first order of approximation, the bias of the estimator $(\bar{y}_{FT})_k$ for $\bar{Y}$ is
$$B[(\bar{y}_{FT})_k] = \bar{Y}\left[(\delta - 1) - \delta\,C_{1X}(\alpha - \beta)\left(D_1 - \frac{1}{N}\right)\{\beta C_{1X} - s_1 W_1 \rho_1 C_{1Y}\}\right] \qquad (5.2)$$
Proof: Since $B[(\bar{y}_{FT})_k] = E[(\bar{y}_{FT})_k - \bar{Y}]$, taking expectations in (4.16), we have
$$E[(\bar{y}_{FT})_k] = \delta\,\bar{Y}\,E\{[1 + s_1 W_1 e_1 + s_2 W_2 e_2][1 + (\alpha - \beta)e_3 - (\alpha - \beta)\beta e_3^2 + (\alpha - \beta)\beta^2 e_3^3 - \cdots]\}$$
$$= \delta\,\bar{Y}\left[1 - (\alpha - \beta)\beta\left(D_1 - \frac{1}{N}\right)C_{1X}^2 + (\alpha - \beta)s_1 W_1\left(D_1 - \frac{1}{N}\right)\rho_1 C_{1Y} C_{1X}\right]$$
$$= \delta\,\bar{Y}\left[1 - (\alpha - \beta)\left(D_1 - \frac{1}{N}\right)C_{1X}\{\beta C_{1X} - s_1 W_1 \rho_1 C_{1Y}\}\right]$$
Therefore,
$$B[(\bar{y}_{FT})_k] = \bar{Y}\left[(\delta - 1) - \delta(\alpha - \beta)\left(D_1 - \frac{1}{N}\right)C_{1X}\{\beta C_{1X} - s_1 W_1 \rho_1 C_{1Y}\}\right]$$

Theorem 5.2: The mean squared error of $(\bar{y}_{FT})_k$ is
$$M[(\bar{y}_{FT})_k] = \bar{Y}^2\left[(\delta - 1)^2 + \left(D_1 - \frac{1}{N}\right)\{K_1 s_1^2 C_{1Y}^2 + K_2 C_{1X}^2 + 2K_3 s_1 \rho_1 C_{1Y} C_{1X}\} + \left(D_2 - \frac{1}{N}\right)\delta^2 s_2^2 W_2^2 C_{2Y}^2\right] \qquad (5.3)$$
where
$$K_1 = \delta^2 W_1^2;\quad K_2 = \delta(\alpha - \beta)\{\delta(\alpha - \beta) - 2(\delta - 1)\beta\};\quad K_3 = W_1\,\delta(2\delta - 1)(\alpha - \beta).$$
Proof:
$$M[(\bar{y}_{FT})_k] = E[(\bar{y}_{FT})_k - \bar{Y}]^2 = E\left[\delta\,\bar{Y}\{1 + s_1 W_1 e_1 + s_2 W_2 e_2\}\{1 + (\alpha - \beta)e_3 - (\alpha - \beta)\beta e_3^2 + (\alpha - \beta)\beta^2 e_3^3 - \cdots\} - \bar{Y}\right]^2$$


Using the large sample approximations of (5.1), we have
$$M[(\bar{y}_{FT})_k] = \bar{Y}^2 E\left[(\delta - 1) + \delta\{(\alpha - \beta)e_3 - (\alpha - \beta)\beta e_3^2 + (s_1 W_1 e_1 + s_2 W_2 e_2) + (\alpha - \beta)(s_1 W_1 e_1 + s_2 W_2 e_2)e_3\}\right]^2$$
$$= \bar{Y}^2\left[(\delta - 1)^2 + \delta^2\{(\alpha - \beta)^2 E(e_3^2) + s_1^2 W_1^2 E(e_1^2) + s_2^2 W_2^2 E(e_2^2) + 2(\alpha - \beta)s_1 W_1 E(e_1 e_3)\}\right.$$
$$\left. \quad + \ 2\delta(\delta - 1)\{-(\alpha - \beta)\beta E(e_3^2) + (\alpha - \beta)s_1 W_1 E(e_1 e_3)\}\right] \quad [\text{using (4.2), (4.7) and (4.8)}]$$
$$= \bar{Y}^2\left[(\delta - 1)^2 + \left(D_1 - \frac{1}{N}\right)\left\{\delta^2 s_1^2 W_1^2 C_{1Y}^2 + \delta(\alpha - \beta)\{\delta(\alpha - \beta) - 2(\delta - 1)\beta\}C_{1X}^2 + 2\delta(2\delta - 1)(\alpha - \beta)s_1 W_1 \rho_1 C_{1Y} C_{1X}\right\} + \left(D_2 - \frac{1}{N}\right)\delta^2 s_2^2 W_2^2 C_{2Y}^2\right]$$
$$= \bar{Y}^2\left[(\delta - 1)^2 + \left(D_1 - \frac{1}{N}\right)\{K_1 s_1^2 C_{1Y}^2 + K_2 C_{1X}^2 + 2K_3 s_1 \rho_1 C_{1Y} C_{1X}\} + \left(D_2 - \frac{1}{N}\right)\delta^2 s_2^2 W_2^2 C_{2Y}^2\right]$$

6. Some Special Cases
The terms A, B and C are functions of k. In particular, there are some special cases:

Case I: $k = 1 \Rightarrow A = 0;\ B = 0;\ C = -6;\ \psi_1 = -6;\ \psi_2 = 0;\ \psi_3 = -6v;\ \psi_4 = -6r_1 W_1;$
$$\alpha = 0;\quad \beta = \frac{r_1 W_1}{v};\quad \delta = \frac{1}{v};\quad K_1 = \frac{W_1^2}{v^2};\quad K_2 = \frac{r_1^2 W_1^2 (3 - 2v)}{v^4};\quad K_3 = \frac{r_1 W_1^2 (v - 2)}{v^3}.$$
The estimator $(\bar{y}_{FT})_k$ along with its bias and m.s.e. under Case I is:
$$(\bar{y}_{FT})_{k=1} = \left[\frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N}\right]\left(\frac{\bar{X}}{\bar{x}^*}\right) \qquad (6.1)$$
$$B[(\bar{y}_{FT})_{k=1}] = \bar{Y}v^{-3}\left[(1 - v)v^2 + r_1 W_1^2\left(D_1 - \frac{1}{N}\right)C_{1X}\{r_1 C_{1X} - v\,s_1 \rho_1 C_{1Y}\}\right] \qquad (6.2)$$
$$M[(\bar{y}_{FT})_{k=1}] = \bar{Y}^2 v^{-4}\left[(1 - v)^2 v^2 + W_1^2\left(D_1 - \frac{1}{N}\right)\{v^2 s_1^2 C_{1Y}^2 + (3 - 2v)r_1^2 C_{1X}^2 + 2(v - 2)v\,r_1 s_1 \rho_1 C_{1Y} C_{1X}\} + \left(D_2 - \frac{1}{N}\right)v^2 W_2^2 s_2^2 C_{2Y}^2\right] \qquad (6.3)$$

Case II: $k = 2 \Rightarrow A = 0;\ B = -2;\ C = 0;\ \psi_1 = -2fv;\ \psi_2 = -2fW_1 r_1;\ \psi_3 = -2f;\ \psi_4 = 0;$
$$\alpha = \frac{r_1 W_1}{v};\quad \beta = 0;\quad \delta = v;\quad K_1 = W_1^2 v^2;\quad K_2 = r_1^2 W_1^2;\quad K_3 = r_1 W_1^2 (2v - 1).$$
$$(\bar{y}_{FT})_{k=2} = \left[\frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N}\right]\left(\frac{\bar{x}^*}{\bar{X}}\right) \qquad (6.4)$$
$$B[(\bar{y}_{FT})_{k=2}] = \bar{Y}\left[(v - 1) + \left(D_1 - \frac{1}{N}\right)W_1^2 r_1 s_1 \rho_1 C_{1Y} C_{1X}\right] \qquad (6.5)$$
$$M[(\bar{y}_{FT})_{k=2}] = \bar{Y}^2\left[(v - 1)^2 + W_1^2\left(D_1 - \frac{1}{N}\right)\{v^2 s_1^2 C_{1Y}^2 + r_1^2 C_{1X}^2 + 2(2v - 1)s_1 r_1 \rho_1 C_{1Y} C_{1X}\} + \left(D_2 - \frac{1}{N}\right)v^2 W_2^2 s_2^2 C_{2Y}^2\right] \qquad (6.6)$$

Case III: $k = 3 \Rightarrow A = 2;\ B = -2;\ C = 0;\ \psi_1 = 2(1 - fv);\ \psi_2 = -2fW_1 r_1;\ \psi_3 = 2(1 - f);\ \psi_4 = 0;$
$$\alpha = \frac{-fW_1 r_1}{1 - fv};\quad \beta = 0;\quad \delta = \frac{1 - fv}{1 - f};\quad K_1 = \frac{(1 - fv)^2 W_1^2}{(1 - f)^2};\quad K_2 = \frac{f^2 W_1^2 r_1^2}{(1 - f)^2};\quad K_3 = \frac{-fW_1^2 r_1\{1 + f(1 - 2v)\}}{(1 - f)^2}.$$
$$(\bar{y}_{FT})_{k=3} = \left[\frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N}\right]\left[\frac{\bar{X} - f\bar{x}^*}{(1 - f)\bar{X}}\right] \qquad (6.7)$$
$$B[(\bar{y}_{FT})_{k=3}] = \bar{Y}f(1 - f)^{-1}\left[(1 - v) - \left(D_1 - \frac{1}{N}\right)W_1^2 r_1 s_1 \rho_1 C_{1Y} C_{1X}\right] \qquad (6.8)$$
$$M[(\bar{y}_{FT})_{k=3}] = \bar{Y}^2 (1 - f)^{-2}\left[f^2(1 - v)^2 + \left(D_1 - \frac{1}{N}\right)W_1^2\{(1 - fv)^2 s_1^2 C_{1Y}^2 + f^2 r_1^2 C_{1X}^2 - 2\{1 + f(1 - 2v)\}f r_1 s_1 \rho_1 C_{1Y} C_{1X}\} + \left(D_2 - \frac{1}{N}\right)(1 - fv)^2 W_2^2 s_2^2 C_{2Y}^2\right] \qquad (6.9)$$

Case IV: $k = 4 \Rightarrow A = 6;\ B = 0;\ C = 0;\ \psi_1 = 6;\ \psi_2 = 0;\ \psi_3 = 6;\ \psi_4 = 0;\ \alpha = 0;\ \beta = 0;\ \delta = 1;\ K_1 = W_1^2;\ K_2 = 0;\ K_3 = 0;$
$$(\bar{y}_{FT})_{k=4} = \frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N} \qquad (6.10)$$
$$B[(\bar{y}_{FT})_{k=4}] = 0 \qquad (6.11)$$
$$V[(\bar{y}_{FT})_{k=4}] = \left(D_1 - \frac{1}{N}\right)W_1^2\,\bar{Y}_1^2 C_{1Y}^2 + \left(D_2 - \frac{1}{N}\right)W_2^2\,\bar{Y}_2^2 C_{2Y}^2 \qquad (6.12)$$
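The algebra of these four special cases can be checked mechanically. The sketch below (my own code; $f$ and $v$ are arbitrary illustrative values) verifies the constants A, B, C and the resulting $\delta$ for $k = 1, \ldots, 4$:

```python
# Verification sketch for the special cases of Section 6;
# f and v are arbitrary illustrative values.

def abc(k):
    """A, B, C as functions of k."""
    return ((k - 1) * (k - 2), (k - 1) * (k - 4), (k - 2) * (k - 3) * (k - 4))

def delta(k, f, v):
    """delta = psi1 / psi3 = (A + fBv + C) / (A + fB + Cv)."""
    A, B, C = abc(k)
    return (A + f * B * v + C) / (A + f * B + C * v)

f, v = 0.22, 0.9
deltas = {k: delta(k, f, v) for k in (1, 2, 3, 4)}
```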

7. Estimator Without Imputation
Throughout the discussion, the value of $\bar{x}_2$ is assumed unknown and is imputed by the term $\bar{x}_2^*$ to provide the generation of $\bar{x}^*$ [see (3.1) and (3.2)]. Suppose $\bar{x}_2$ is known; then there is no need of imputation, and the proposed quantities (3.2) and (3.3) reduce to:
$$\bar{x}^{(\#)} = \frac{N_1\bar{x}_1 + N_2\bar{x}_2}{N} \qquad (7.1)$$
$$[(\bar{y}_{FT})_w]_k = \left[\frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N}\right]\left[\frac{(A + C)\bar{X} + fB\,\bar{x}^{(\#)}}{(A + fB)\bar{X} + C\,\bar{x}^{(\#)}}\right] \qquad (7.2)$$
where k is a constant ($0 < k < \infty$) and $A = (k-1)(k-2)$, $B = (k-1)(k-4)$, $C = (k-2)(k-3)(k-4)$, $f = n/N$.

Theorem 7.1: The estimator $[(\bar{y}_{FT})_w]_k$ is biased for $\bar{Y}$, with the amount of bias
$$B[(\bar{y}_{FT})_w]_k = \bar{Y}(\psi_1' - \psi_2')\left[\left(D_1 - \frac{1}{N}\right)W_1^2 r_1 C_{1X}\{s_1\rho_1 C_{1Y} - \psi_2' r_1 C_{1X}\} + \left(D_2 - \frac{1}{N}\right)W_2^2 r_2 C_{2X}\{s_2\rho_2 C_{2Y} - \psi_2' r_2 C_{2X}\}\right] \qquad (7.3)$$
where $\psi_1' = fB/(A + fB + C)$ and $\psi_2' = C/(A + fB + C)$.
Proof: The estimator $[(\bar{y}_{FT})_w]_k$ may be approximated as:
$$[(\bar{y}_{FT})_w]_k = \left[\frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N}\right]\left[\frac{(A + C)\bar{X} + fB\,\bar{x}^{(\#)}}{(A + fB)\bar{X} + C\,\bar{x}^{(\#)}}\right]$$
$$= \left[\bar{Y} + W_1\bar{Y}_1 e_1 + W_2\bar{Y}_2 e_2\right]\left[1 + \psi_1'(W_1 r_1 e_3 + W_2 r_2 e_4)\right]\left[1 + \psi_2'(W_1 r_1 e_3 + W_2 r_2 e_4)\right]^{-1}$$
Expanding the above using the binomial expansion, and ignoring terms $e_i^k e_j^l$ for $(k + l) > 2$, $(k, l = 0, 1, 2, \ldots)$, $(i, j = 1, 2, 3, 4)$, the estimator results in
$$[(\bar{y}_{FT})_w]_k = \bar{Y} + \bar{Y}(\Delta_1 - \Delta_2) + W_1\bar{Y}_1 e_1(1 + \Delta_1 - \Delta_2) + W_2\bar{Y}_2 e_2(1 + \Delta_1 - \Delta_2) \qquad (7.4)$$
where $\Delta_1 = (\psi_1' - \psi_2')(W_1 r_1 e_3 + W_2 r_2 e_4)$, $\Delta_2 = \psi_2'(\psi_1' - \psi_2')(W_1 r_1 e_3 + W_2 r_2 e_4)^2$, and $W_1 r_1 + W_2 r_2 = 1$ holds. Further, up to the first order of approximation, one may derive the following:
(i) $E[\Delta_1] = 0$;
(ii) $E[\Delta_1^2] = (\psi_1' - \psi_2')^2\left[W_1^2 r_1^2\left(D_1 - \frac{1}{N}\right)C_{1X}^2 + W_2^2 r_2^2\left(D_2 - \frac{1}{N}\right)C_{2X}^2\right]$;
(iii) $E[\Delta_2] = \psi_2'(\psi_1' - \psi_2')\left[W_1^2 r_1^2\left(D_1 - \frac{1}{N}\right)C_{1X}^2 + W_2^2 r_2^2\left(D_2 - \frac{1}{N}\right)C_{2X}^2\right]$;
(iv) $E[e_1\Delta_1] = (\psi_1' - \psi_2')W_1 r_1\left(D_1 - \frac{1}{N}\right)\rho_1 C_{1X} C_{1Y}$;
(v) $E[e_2\Delta_1] = (\psi_1' - \psi_2')W_2 r_2\left(D_2 - \frac{1}{N}\right)\rho_2 C_{2X} C_{2Y}$;
(vi) $E[e_1\Delta_2] = 0$ [under $o(n^{-1})$]; (vii) $E[e_2\Delta_2] = 0$ [under $o(n^{-1})$].
The bias of the estimator without imputation is
$$B[(\bar{y}_{FT})_w]_k = E\{[(\bar{y}_{FT})_w]_k - \bar{Y}\} = E\left[\bar{Y}(\Delta_1 - \Delta_2) + W_1\bar{Y}_1 e_1(1 + \Delta_1 - \Delta_2) + W_2\bar{Y}_2 e_2(1 + \Delta_1 - \Delta_2)\right]$$
$$= \bar{Y}(\psi_1' - \psi_2')\left[\left(D_1 - \frac{1}{N}\right)W_1^2 r_1 C_{1X}\{s_1\rho_1 C_{1Y} - \psi_2' r_1 C_{1X}\} + \left(D_2 - \frac{1}{N}\right)W_2^2 r_2 C_{2X}\{s_2\rho_2 C_{2Y} - \psi_2' r_2 C_{2X}\}\right]$$

Theorem 7.2: The mean squared error of the estimator $[(\bar{y}_{FT})_w]_k$ is:
$$M[(\bar{y}_{FT})_w]_k = \bar{Y}^2\left[\left(D_1 - \frac{1}{N}\right)W_1^2\{s_1^2 C_{1Y}^2 + (\psi_1' - \psi_2')^2 r_1^2 C_{1X}^2 + 2s_1(\psi_1' - \psi_2')r_1\rho_1 C_{1Y} C_{1X}\}\right.$$
$$\left. \quad + \left(D_2 - \frac{1}{N}\right)W_2^2\{s_2^2 C_{2Y}^2 + (\psi_1' - \psi_2')^2 r_2^2 C_{2X}^2 + 2s_2(\psi_1' - \psi_2')r_2\rho_2 C_{2Y} C_{2X}\}\right] \qquad (7.5)$$
Proof:
$$M[(\bar{y}_{FT})_w]_k = E\{[(\bar{y}_{FT})_w]_k - \bar{Y}\}^2 = E\left[\bar{Y}(\Delta_1 - \Delta_2) + W_1\bar{Y}_1 e_1(1 + \Delta_1 - \Delta_2) + W_2\bar{Y}_2 e_2(1 + \Delta_1 - \Delta_2)\right]^2$$
$$= \bar{Y}^2 E(\Delta_1^2) + W_1^2\bar{Y}_1^2 E(e_1^2) + W_2^2\bar{Y}_2^2 E(e_2^2) + 2W_1\bar{Y}\bar{Y}_1 E(e_1\Delta_1) + 2W_2\bar{Y}\bar{Y}_2 E(e_2\Delta_1) + 2W_1 W_2\bar{Y}_1\bar{Y}_2 E(e_1 e_2)$$
which, on substituting the expectations derived above, gives (7.5).

Remark: At k = 1, k = 2, k = 3 and k = 4, the biases and mean squared errors of the non-imputed estimators are given below:

Case I: $k = 1 \Rightarrow A = 0;\ B = 0;\ C = -6;\ \psi_1' = 0;\ \psi_2' = 1;$
$$[(\bar{y}_{FT})_w]_{k=1} = \left[\frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N}\right]\left(\frac{\bar{X}}{\bar{x}^{(\#)}}\right) \qquad (7.6)$$
$$B[(\bar{y}_{FT})_w]_{k=1} = -\bar{Y}\left[\left(D_1 - \frac{1}{N}\right)W_1^2 r_1 C_{1X}\{s_1\rho_1 C_{1Y} - r_1 C_{1X}\} + \left(D_2 - \frac{1}{N}\right)W_2^2 r_2 C_{2X}\{s_2\rho_2 C_{2Y} - r_2 C_{2X}\}\right] \qquad (7.7)$$
$$M[(\bar{y}_{FT})_w]_{k=1} = \bar{Y}^2\left[\left(D_1 - \frac{1}{N}\right)W_1^2\{s_1^2 C_{1Y}^2 + r_1^2 C_{1X}^2 - 2s_1 r_1\rho_1 C_{1Y} C_{1X}\} + \left(D_2 - \frac{1}{N}\right)W_2^2\{s_2^2 C_{2Y}^2 + r_2^2 C_{2X}^2 - 2s_2 r_2\rho_2 C_{2Y} C_{2X}\}\right] \qquad (7.8)$$

Case II: $k = 2 \Rightarrow A = 0;\ B = -2;\ C = 0;\ \psi_1' = 1;\ \psi_2' = 0;$
$$[(\bar{y}_{FT})_w]_{k=2} = \left[\frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N}\right]\left(\frac{\bar{x}^{(\#)}}{\bar{X}}\right) \qquad (7.9)$$
$$B[(\bar{y}_{FT})_w]_{k=2} = \bar{Y}\left[\left(D_1 - \frac{1}{N}\right)W_1^2 s_1 r_1\rho_1 C_{1X} C_{1Y} + \left(D_2 - \frac{1}{N}\right)W_2^2 s_2 r_2\rho_2 C_{2X} C_{2Y}\right] \qquad (7.10)$$
$$M[(\bar{y}_{FT})_w]_{k=2} = \bar{Y}^2\left[\left(D_1 - \frac{1}{N}\right)W_1^2\{s_1^2 C_{1Y}^2 + r_1^2 C_{1X}^2 + 2s_1 r_1\rho_1 C_{1Y} C_{1X}\} + \left(D_2 - \frac{1}{N}\right)W_2^2\{s_2^2 C_{2Y}^2 + r_2^2 C_{2X}^2 + 2s_2 r_2\rho_2 C_{2Y} C_{2X}\}\right] \qquad (7.11)$$

Case III: $k = 3 \Rightarrow A = 2;\ B = -2;\ C = 0;\ \psi_1' = -f(1 - f)^{-1};\ \psi_2' = 0;$
$$[(\bar{y}_{FT})_w]_{k=3} = \left[\frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N}\right]\left[\frac{\bar{X} - f\bar{x}^{(\#)}}{(1 - f)\bar{X}}\right] \qquad (7.12)$$
$$B[(\bar{y}_{FT})_w]_{k=3} = -\bar{Y}f(1 - f)^{-1}\left[\left(D_1 - \frac{1}{N}\right)W_1^2 r_1 s_1\rho_1 C_{1Y} C_{1X} + \left(D_2 - \frac{1}{N}\right)W_2^2 r_2 s_2\rho_2 C_{2Y} C_{2X}\right] \qquad (7.13)$$
$$M[(\bar{y}_{FT})_w]_{k=3} = \bar{Y}^2\left[\left(D_1 - \frac{1}{N}\right)W_1^2\{s_1^2 C_{1Y}^2 + (1 - f)^{-2} f^2 r_1^2 C_{1X}^2 - 2(1 - f)^{-1} f s_1 r_1\rho_1 C_{1Y} C_{1X}\} + \left(D_2 - \frac{1}{N}\right)W_2^2\{s_2^2 C_{2Y}^2 + (1 - f)^{-2} f^2 r_2^2 C_{2X}^2 - 2(1 - f)^{-1} f s_2 r_2\rho_2 C_{2Y} C_{2X}\}\right] \qquad (7.14)$$

Case IV: $k = 4 \Rightarrow A = 6;\ B = 0;\ C = 0;\ \psi_1' = 0;\ \psi_2' = 0;$
$$[(\bar{y}_{FT})_w]_{k=4} = \frac{N_1\bar{y}_1 + N_2\bar{y}_2}{N} \qquad (7.15)$$
$$B[(\bar{y}_{FT})_w]_{k=4} = 0 \qquad (7.16)$$
$$V[(\bar{y}_{FT})_w]_{k=4} = \bar{Y}^2\left[\left(D_1 - \frac{1}{N}\right)W_1^2 s_1^2 C_{1Y}^2 + \left(D_2 - \frac{1}{N}\right)W_2^2 s_2^2 C_{2Y}^2\right] \qquad (7.17)$$
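For completeness, here is a parallel sketch of the non-imputed estimator (7.2) (again with my own illustrative numbers, not the paper's data); the assertions confirm that k = 1, 2 and 4 reproduce the classical ratio, product and post-stratified mean forms of (7.6), (7.9) and (7.15):

```python
# Sketch of the non-imputed factor-type estimator (7.2);
# inputs are illustrative assumptions.

def ft_w(k, y1b, y2b, x1b, x2b, N1, N2, n, Xbar):
    N = N1 + N2
    f = n / N
    A = (k - 1) * (k - 2)
    B = (k - 1) * (k - 4)
    C = (k - 2) * (k - 3) * (k - 4)
    x_hash = (N1 * x1b + N2 * x2b) / N   # (7.1): x2 known, no imputation needed
    y_ps = (N1 * y1b + N2 * y2b) / N
    return y_ps * ((A + C) * Xbar + f * B * x_hash) / ((A + f * B) * Xbar + C * x_hash)

vals = {k: ft_w(k, 66.3, 59.9, 30.7, 26.9, 100, 80, 40, 29.2) for k in (1, 2, 3, 4)}
```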

8. Numerical Illustration

Consider two artificial populations I and II (given in Appendices A and B), each of which is divided into R and NR groups of sizes N1 and N2 respectively. Let samples of sizes 40 and 30 respectively be drawn from populations I and II by SRSWOR and post-stratified into the R and NR groups. Then we have:

Population I:  n1 = 28; n2 = 12; n = 40; f = 0.22
Population II: n1 = 20; n2 = 10; n = 30; f = 0.20

The values of the population parameters for the two populations are given in Table 8.1, and the values of bias and MSE are shown in Table 8.2.


                Entire Population      R-group               NR-group
                I         II           I         II          I         II
Size            180       150          100       90          80        60
Mean Y          63.77     159.03       66.33     173.60      59.92     140.81
Mean X          29.20     113.22       30.72     128.45      26.92     94.19
M.S. Y          299.87    2205.18      349.33    2532.36     206.35    1219.90
M.S. X          110.43    1972.61      112.67    2300.86     100.08    924.17
C.V. Y          0.272     0.295        0.282     0.290       0.240     0.248
C.V. X          0.360     0.392        0.345     0.373       0.372     0.323
Cor. Coeff.     0.809     0.897        0.805     0.857       0.808     0.956

Table 8.1: Parameters of Populations I & II given in Appendices A & B.

Type of        Description of          Population I            Population II
estimator      the estimator           Bias       MSE          Bias       MSE
(y_FT)_k       (y_FT)_{k=1}            -2.0       17.1255      -2.3386    8.025
               (y_FT)_{k=2}            1.6628     228.6822     2.6018     49.8306
               (y_FT)_{k=3}            1.4183     28.0158      -0.6512    7.0054
               (y_FT)_{k=4}            0          43.64        0          9.2662
(y_FT)_w       [(y_FT)_w]_{k=1}        0.1433     12.9589      0.1095     6.0552
               [(y_FT)_w]_{k=2}        0.3141     216.3024     0.1599     46.838
               [(y_FT)_w]_{k=3}        -0.5962    24.327       -0.031     5.2423
               [(y_FT)_w]_{k=4}        0          43.64        0          9.2662

Table 8.2: Bias and M.S.E. comparisons of (y_FT)_k and (y_FT)_w

The m.s.e. of the proposed imputed estimator is higher than that of the non-imputed estimator, but the two are very close. Obviously, the non-imputed estimator will be better than the imputed estimator due to the complete availability of information. The proposed one is very near to the non-imputed estimator, showing the utility of the new estimation technique in a missing-observation environment. Define a term LI, the "percentage loss due to imputation", as
$$(LI)_k = \left[\frac{MSE(\bar{y}_{FT})_k}{MSE[(\bar{y}_{FT})_w]_k}\right] \times 100 \qquad (8.1)$$
Table 8.3 shows the variation of LI over k.

                      (LI)_k
k             Population I     Population II
k = 1         132.1524203      132.5307
k = 2         105.7233762      106.3893
k = 3         115.1633987      133.6322
k = 4         100              100

Table 8.3: Variation of LI over k
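The LI values in Table 8.3 can be reproduced directly from the MSE columns of Table 8.2; a short Python check for Population I:

```python
# Recomputing (8.1) from the MSE values reported in Table 8.2 (Population I).

mse_imputed = {1: 17.1255, 2: 228.6822, 3: 28.0158, 4: 43.64}
mse_without = {1: 12.9589, 2: 216.3024, 3: 24.327, 4: 43.64}

LI = {k: 100 * mse_imputed[k] / mse_without[k] for k in (1, 2, 3, 4)}
```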


The percentage loss in MSE due to imputation is small and acceptable over a suitable choice of k; at the same time, the proposed estimator tackles and solves the problem of missing observations.

9. Conclusions
As per Tables 8.2 and 8.3, the imputed class performs close to the non-imputed class of estimators over a suitable choice of k. The overall comparison shows an almost equivalent performance of the imputed factor-type estimator and the same estimator without imputation. The imputed factor-type class of estimators thus reveals good potential for utilizing the information $\bar{X}_2$ in place of the missing $\bar{x}_2$. The class presents efficient members when k = 1 and k = 3, and the LI comparison shows that, with a little loss, one can handle the non-responded observations effectively. Actually, the best choice of k is supposed to be near k = 1 or near k = 3. It is worthwhile to say that the proposed class of estimators is effective for mean estimation even when some observations of the auxiliary variable X are missing (or non-responded).

Acknowledgement Authors are thankful to the referee for his valuable comments and useful suggestions which improved the quality of the original manuscript.

References
1. Agrawal, M.C. and Panda, K.B. (1993). An efficient estimator in post-stratification, Metron, 51, 3-4, 179-187.
2. Ahmed, M.S., Al-titi, Omar, Al-Rawi, Ziad and Abu-dayyeh, Walid (2006). Estimation of a population mean using different imputation methods, Statistics in Transition, 7(6), 1247-1264.
3. Cochran, W.G. (2005). Sampling Techniques, Third Edition, John Wiley & Sons, New Delhi.
4. Grover and Couper (1998). Non-response in household surveys, John Wiley and Sons, New York.
5. Hansen, M.H. and Hurwitz, W.N. (1946). The problem of non-response in sample surveys, Jour. Amer. Stat. Asso., 41, 517-529.
6. Hinde, R.L. and Chambers, R.L. (1990). Non-response imputation with multiple sources of non-response, Jour. Off. Statistics, 7, 169-179.
7. Khot, P.S. (1994). A note on handling non-response in sample surveys, Jour. Amer. Stat. Assoc., 89, 693-696.
8. Lessler and Kalsbeek (1992). Non-response error in surveys, John Wiley and Sons, New York.
9. Rao, J.N.K. and Sitter, R.R. (1995). Variance estimation under two-phase sampling with application to imputation for missing data, Biometrika, 82, 453-460.
10. Rubin, D.B. (1976). Inference and missing data, Biometrika, 63, 581-593.
11. Shukla, D. (2002). F-T estimator under two-phase sampling, Metron, 59, 1-2, 253-263.
12. Shukla, D. and Dubey, J. (2001). Estimation in mail surveys under PSNR sampling scheme, Ind. Soc. Ag. Stat., 54, 3, 288-302.
13. Shukla, D. and Dubey, J. (2004). On estimation under post-stratified two-phase non-response sampling scheme in mail surveys, International Jour. of Management and Systems, 20, 3, 269-278.
14. Shukla, D. and Dubey, J. (2006). On earmarked strata in post-stratification, Statistics in Transition, 7, 5, 1067-1085.
15. Shukla, D. and Trivedi, M. (1999). A new estimator for post-stratified sampling scheme, Proceedings of NSBA-TA (Bayesian Analysis), Eds. Rajesh Singh, 206-218.
16. Shukla, D. and Trivedi, M. (2001). Mean estimation in deeply stratified population under post-stratification, Ind. Soc. Ag. Stat., 54(2), 221-235.
17. Shukla, D. and Trivedi, M. (2006). Examining stability of variability ratio with application in post-stratification, International Jour. of Management and Systems, 22, 1, 59-70.
18. Shukla, D., Bankey, A. and Trivedi, M. (2002). Estimation in post-stratification using prior information and grouping strategy, Ind. Soc. Ag. Stat., 55, 2, 158-173.
19. Shukla, D., Singh, V.K. and Singh, G.N. (1991). On the use of transformation in factor type estimator, Metron, 49 (1-4), 359-361.
20. Shukla, D., Trivedi, M. and Singh, G.N. (2006). Post-stratification of two-way deeply stratified population, Statistics in Transition, 7, 6, 1295-1310.
21. Singh, G.N. and Singh, V.K. (2001). On the use of auxiliary information in successive sampling, Jour. Ind. Soc. Ag. Stat., 54(1), 1-12.
22. Singh, H.P. (1986). A generalized class of estimators of ratio, product and mean using supplementary information on an auxiliary character in PPSWR scheme, Guj. Stat. Rev., 13, 2, 1-30.
23. Singh, S. and Horn, S. (2000). Compromised imputation in survey sampling, Metrika, 51, 266-276.
24. Singh, V.K. and Shukla, D. (1987). One parameter family of factor type ratio estimator, Metron, 45 (1-2), 273-283.
25. Singh, V.K. and Shukla, D. (1993). An efficient one-parameter family of factor type estimator in sample survey, Metron, 51, 1-2, 139-159.
26. Singh, V.K. and Singh, G.N. (1991). Chain type estimator with two auxiliary variables under double sampling scheme, Metron, 49, 279-289.
27. Singh, V.K., Singh, G.N. and Shukla, D. (1994). A class of chain ratio estimators with two auxiliary variables under double sampling scheme, Sankhya, Ser. B., 46(2), 209-221.
28. Smith, T.M.F. (1991). Post-stratification, The Statistician, 40, 323-330.
29. Sukhatme, P.V., Sukhatme, B.V., Sukhatme, S. and Ashok, C. (1984). Sampling Theory of Surveys with Applications, Iowa State University Press, I.S.A.S. Publication, New Delhi.
30. Wywial, J. (2001). Stratification of population after sample selection, Statistics in Transition, 5, 2, 327-348.


Appendix A: Population I (N= 180) R-group: (N1=100) Y: X: Y: X: Y: X: Y: X: Y: X: Y: X: Y: X:

110 80 200 150 175 145 110 75 180 135 205 110 210 170

75 40 135 85 160 110 110 80 150 110 180 105 145 85

85 55 120 80 165 135 120 90 185 135 150 110 135 95

165 130 165 100 175 145 230 165 165 115 205 175 250 190

125 85 150 25 185 155 220 160 285 125 220 180 265 215

110 50 160 130 205 175 280 205 150 205 240 215 275 200

Y: X: Y: X: Y: X: Y: X: Y: X: Y: X:

85 55 110 70 115 75 105 65 155 115 160 120

75 40 90 60 140 105 125 75 190 130 155 115

115 65 155 85 180 120 110 70 150 110 170 120

165 115 130 95 170 115 120 80 180 120 195 135

140 90 120 80 175 125 130 85 200 125 200 150

110 55 95 55 190 135 145 105 160 145

85 35 165 135 140 80 275 215 235 100 260 225 205 165

80 40 145 105 105 75 220 190 125 195 185 110 195 155

150 110 215 185 125 65 145 105 165 85 150 90 180 150

165 115 150 110 230 170 155 115 135 115 155 95 115 175

135 95 145 95 230 170 170 135 130 75 115 85

120 60 150 75 255 190 195 145 245 190 115 75

140 70 150 70 275 205 170 135 255 205 220 175

135 85 195 165 145 105 185 110 280 210 215 185

145 115 190 160 125 85 195 145 150 105 230 190

NR-group: (N2=80) 115 60 100 60 160 110 160 110 155 120

135 65 125 75 155 115 170 115 170 105

120 70 140 90 175 135 180 130 195 100

125 75 155 105 195 145 145 95 200 95

120 80 160 125 90 45 130 65 150 90

150 120 145 95 90 55 195 135 165 105

145 105 90 45 80 50 200 130 155 125

90 45 90 55 90 60 160 115 180 130

105 65 95 65 80 50 110 55 200 145

Appendix B : Population II (N=150) R-group (N1=90) Y: X: Y: X: Y: X: Y: X: Y: X: Y: X:

90 30 100 50 90 40 45 15 65 25 75 25

75 35 40 10 95 50 40 15 60 40 70 20

70 30 45 25 80 35 40 20 75 35 40 35

85 40 55 25 85 45 35 10 75 30 55 30

95 45 35 10 55 35 55 30 85 40 75 45

55 25 45 15 60 25 75 25 95 35 45 10

Y: X: Y: X: Y: X: Y: X:

40 10 65 35 65 30 75 30

90 30 80 45 70 40 65 35

95 30 55 30 55 30 70 40

70 30 65 30 35 10 65 25

60 25 75 40 35 15 70 45

65 30 55 15 50 25 45 10

65 40 35 10 75 30 80 30 90 40 55 30

80 50 55 25 85 40 80 40 90 35 60 25

65 35 85 35 80 25 85 35 45 15 85 40

50 30 95 55 65 35 55 20 40 25 55 15

45 15 65 35 35 10 45 25 45 15 60 25

55 20 75 40 40 15 70 30 55 30 70 30

60 25 70 30 95 45 80 40 60 30 75 35

60 30 80 45 100 45 90 45 65 25 65 30

95 40 65 40 55 25 55 30 60 20 80 45

60 20 45 15 60 30 55 15

65 30 40 10 30 10 60 25

60 30 75 40 35 20 70 30

55 35 75 45 45 15 75 35

55 25 45 10 55 30 65 30

45 20 70 30 65 30 80 45

NR-group (N2=60) 85 40 50 15 55 30 55 30

55 25 55 20 35 15 60 25

45 15 60 30 55 20 85 40
