a median based regression type estimator of the finite ...

Int. J. Agricult. Stat. Sci. Vol. 13, No. 1, pp. 265-271, 2017

ISSN : 0973-1903

ORIGINAL ARTICLE

A MEDIAN BASED REGRESSION TYPE ESTIMATOR OF THE FINITE POPULATION MEAN

S. K. Yadav1, Lakhan Singh2*, S. S. Mishra1, Prem Prakash Mishra3 and Surendra Kumar4

1 Department of Mathematics and Statistics, Dr. RML Avadh University, Faizabad - 224 001, India.
2 Department of Statistics, H. N. B. G. University, Garhwal, Srinagar - 264 174, India.
3 Department of Mathematics, National Institute of Technology, Chumukedima, Nagaland - 797 103, India.
4 Department of Mathematics, Govt. Degree College, Pihani, Hardoi - 241 406, India.
E-mail: [email protected]

Abstract : In the present study, a regression type estimator of the finite population mean of the study variable is proposed using the population median of the study variable. The expressions for the bias and mean squared error of the proposed estimator are derived up to the first order of approximation. The minimum value of the mean squared error is also obtained for the proposed estimator. A theoretical comparison is made with the mean per unit estimator, the usual ratio estimator, the usual regression estimator, and the estimators of Bahl and Tuteja (1991), Kadilar (2016) and Subramani (2016). The conditions under which the proposed estimator performs better than the other estimators are given. The theoretical findings are examined through a numerical example, in which the proposed estimator performs better than the other existing estimators.

Key words : Bias, Ratio estimator, Mean squared error, Simple random sampling, Efficiency.

*Author for correspondence. Received January 10, 2017; Revised March 28, 2017; Accepted April 17, 2017.

1. Introduction

Sampling is a good alternative to complete enumeration whenever the population is very large and observing every unit is costly and time consuming. The most appropriate estimator of a population parameter is the corresponding statistic, so the most suitable estimator of the population mean is the sample mean. We would like an estimator to have desirable properties such as unbiasedness and minimum variance. Although the sample mean is an unbiased estimator of the population mean, its sampling distribution is not closely concentrated around the true population mean, so it has a reasonably large variance. Our aim is to search for an estimator, possibly biased, whose sampling distribution is very close to the true value of the parameter, that is, one with minimum mean squared error.

This problem is usually solved by the use of an auxiliary variable that is highly positively or negatively correlated with the study variable. This auxiliary information is collected at additional cost to the survey. It would be better and more economical to have information on some parameter of the study variable itself: if such information improves the estimation, it does so without increasing the cost of the survey. The present manuscript utilizes additional information on the population median of the study variable, which is often readily available. For example, in surveys involving the estimation of average income, average marks, etc., it is very reasonable to assume that the population mean is unknown whereas the population median is known.

Let the finite population under consideration consist of N distinct and identifiable units and let $(x_i, y_i)$, i = 1, 2, ..., n, be a bivariate sample of size n taken from (X, Y) using a simple random sampling without replacement (SRSWOR) scheme. Let $\bar{X}$ and $\bar{Y}$ respectively be the population means of the auxiliary and the study variables, and let $\bar{x}$ and $\bar{y}$ be the corresponding sample means. It is well established that in the simple random sampling scheme the sample means $\bar{x}$ and $\bar{y}$ are unbiased estimators of the population means $\bar{X}$ and $\bar{Y}$, respectively.

The population mean is one of the most important measures of central tendency in almost all fields, including Medical sciences, Biological sciences, Agriculture, Industry, Social sciences, Humanities, etc. Thus, the estimation of the population mean is of great significance in these fields. The following four examples, given by Subramani (2016), illustrate the estimation of the population mean using information on the population median of the study variable.

Example 1.1 : In an Indian University, 5000 students entered for the University examination. The results are given below. The problem is to estimate the average marks scored by the students (population mean). Here, it is reasonable to assume that the median of the marks is known since we have the following information.

Table 1 : Results of the University Examination.

| Passed with | Percentage of marks | Number of students | Cumulative total |
|---|---|---|---|
| Distinction | 75-100 | 850 | 850 |
| First Class | 60-75 | 3100 | 3950 |
| Second Class | 50-60 | 600 | 4550 |
| Failed | 0-50 | 450 | 5000 |
| Total | | 5000 | 5000 |

The median value will be between 60 and 75. Approximately one can assume the population median value as 67.5.
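The approximation used in Example 1.1 can be sketched in a few lines of Python. This is only an illustration of the rule stated in the text (locate the class containing the middle observation from the cumulative totals and take its midpoint); the function name `median_class_midpoint` and the tuple layout are assumptions made for this sketch, not part of the original paper.

```python
def median_class_midpoint(classes):
    """Approximate a grouped-data median by the midpoint of the median class.

    classes: list of (lower, upper, frequency) tuples, listed in the same
    order as the rows of the frequency table (hypothetical layout).
    """
    total = sum(freq for _, _, freq in classes)
    half = total / 2  # position of the middle observation
    running = 0
    for lower, upper, freq in classes:
        running += freq  # cumulative total, as in the last table column
        if running >= half:
            return (lower + upper) / 2
    raise ValueError("empty frequency table")


# Rows of Table 1: (lower %, upper %, number of students)
table1 = [(75, 100, 850), (60, 75, 3100), (50, 60, 600), (0, 50, 450)]
approx_median = median_class_midpoint(table1)
print(approx_median)
```

With the Table 1 counts, the cumulative total first reaches half of 5000 in the 60-75 class, so the sketch returns the midpoint of that class, the same value assumed in the example.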

Example 1.2 : In an Indian University 800 faculty members are working in different categories and the basic salary drawn by different categories of the faculty members are given in Table 2. The problem is to estimate the average salary drawn by the faculty members (population mean) per month. Here, it is reasonable to assume that the median of the salary is known based on the information given in Table 2. Example 1.3 : In the estimation of body mass index (BMI) of the 350 patients of a Hospital, it is reasonable to assume that the population median of the BMI is known based on the information given in Table 3. Example 1.4 : In the problem of estimating the blood pressure of the 202 patients of a hospital, it is reasonable to assume that the median of the blood pressure is known based on the information available in Table 4.

2. Review of Estimation of Population Mean

The most suitable estimator of the population mean $\bar{Y}$ is the corresponding sample mean $\bar{y}$ of the study variable Y, given by

$$t_0 = \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \tag{1}$$

The sample mean is an unbiased estimator of the population mean and its variance up to the first order of approximation is

$$V(t_0) = \frac{1-f}{n} S_y^2 = \frac{1-f}{n}\bar{Y}^2 C_y^2 \tag{2}$$
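As a quick check of Equation (2), the following sketch enumerates every SRSWOR sample from a small population and compares the variance of the sample means with $(1-f)S_y^2/n$. The six population values are hypothetical and chosen only for illustration; the function name `var_sample_mean` is likewise an assumption of this sketch.

```python
from itertools import combinations
from statistics import mean, pvariance, variance


def var_sample_mean(population, n):
    """Variance of the sample mean under SRSWOR: (1 - f) / n * S_y^2."""
    N = len(population)
    f = n / N  # sampling fraction
    return (1 - f) / n * variance(population)  # variance() divides by N - 1


# Hypothetical population of N = 6 values (not from the paper)
population = [12.0, 15.0, 9.0, 20.0, 14.0, 18.0]
n = 3

# Enumerate all possible samples, each equally likely under SRSWOR
sample_means = [mean(s) for s in combinations(population, n)]
empirical = pvariance(sample_means)  # variance over the full sampling distribution
formula = var_sample_mean(population, n)
print(empirical, formula)
```

Because all samples are enumerated, the two quantities should agree up to floating-point error, and the average of the sample means should equal the population mean.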

Table 2 : Salary of University faculty members.

| Category | Basic salary in Indian Rupees (IRs) per month* | Number of faculty members | Cumulative total |
|---|---|---|---|
| Senior Professor | 56000+10000** | 20 | 20 |
| Professor - Grade I | 43000+10000 | 40 | 60 |
| Professor - Grade II | 37400+10000 | 60 | 120 |
| Associate Professor - Grade I | 37400+10000 | 80 | 200 |
| Associate Professor - Grade II | 37400+9000 | 100 | 300 |
| Assistant Professor - Grade I | 15100+8000 | 110 | 410 |
| Assistant Professor - Grade II | 15100+7000 | 140 | 550 |
| Assistant Professor - Grade III | 15100+6000 | 250 | 800 |
| Total | | 800 | 800 |

*Actual salary depends on their experience in their designation and other allowances.
**The basic salary is the sum of the basic (the first value) and the academic grade pay (the second value), which will differentiate people with same designation but different grades.

The population median value will be assumed as IRs. 15100+8000 = IRs. 23100.


Table 3 : Body mass index of 350 patients of a hospital.

| Category | BMI range (kg/m2) | Number of patients | Cumulative total |
|---|---|---|---|
| Very severely underweight | less than 15 | 15 | 15 |
| Severely underweight | from 15.0 to 16.0 | 35 | 50 |
| Underweight | from 16.0 to 18.5 | 67 | 117 |
| Normal (healthy weight) | from 18.5 to 25 | 92 | 209 |
| Overweight | from 25 to 30 | 47 | 256 |
| Obese Class I (Moderately obese) | from 30 to 35 | 52 | 308 |
| Obese Class II (Severely obese) | from 35 to 40 | 27 | 335 |
| Obese Class III (Very severely obese) | over 40 | 15 | 350 |
| Total | | 350 | 350 |

The median value will be between 18.5 and 25. Approximately one can assume the population median of the BMI value as 21.75.

Table 4 : Blood pressure of 202 patients of a hospital.

| Category | Systolic, mmHg | Number of patients | Cumulative no. of patients |
|---|---|---|---|
| Hypotension | < 90 | 10 | 10 |
| Desired | 90–119 | 112 | 122 |
| Pre-hypertension | 120–139 | 40 | 162 |
| Stage 1 Hypertension | 140–159 | 20 | 182 |
| Stage 2 Hypertension | 160–179 | 13 | 195 |
| Hypertensive Emergency | ≥ 180 | 7 | 202 |
| Total | | 202 | 202 |

The median value will be between 90 and 119. Approximately one can assume the population median value as 104.5.

In Equation (2),

$$C_y = \frac{S_y}{\bar{Y}}, \quad S_y^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(Y_i-\bar{Y}\right)^2, \quad f = \frac{n}{N}.$$

Cochran (1940) utilized an auxiliary variable positively correlated with the study variable and proposed the traditional ratio estimator as

$$t_1 = \bar{y}\,\frac{\bar{X}}{\bar{x}} \tag{3}$$

The above estimator is a biased estimator of the population mean. Its bias and mean squared error, up to the first order of approximation, are respectively given by

$$B(t_1) = \frac{1-f}{n}\,\bar{Y}\left[C_x^2 - C_{yx}\right]$$

$$MSE(t_1) = \frac{1-f}{n}\,\bar{Y}^2\left[C_y^2 + C_x^2 - 2C_{yx}\right] \tag{4}$$

where,

$$C_x = \frac{S_x}{\bar{X}}, \quad S_x^2 = \frac{1}{N-1}\sum_{i=1}^{N}\left(X_i-\bar{X}\right)^2, \quad \rho_{yx} = \frac{Cov(x,y)}{S_x S_y},$$

$$Cov(x,y) = \frac{1}{N-1}\sum_{i=1}^{N}\left(Y_i-\bar{Y}\right)\left(X_i-\bar{X}\right), \quad C_{yx} = \rho_{yx} C_y C_x.$$

The traditional linear regression estimator of the population mean is given by

$$t_2 = \bar{y} + \beta_{yx}\left(\bar{X} - \bar{x}\right) \tag{5}$$

where $\beta_{yx}$ is the regression coefficient of the line Y on X. It is an unbiased estimator of the population mean and its variance up to the first order of approximation is given by

$$V(t_2) = \frac{1-f}{n}\,\bar{Y}^2 C_y^2\left(1 - \rho_{yx}^2\right) \tag{6}$$
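To make estimators (3) and (5) concrete, here is a minimal sketch that evaluates both from a single sample. The sample values and the known auxiliary mean `X_bar` are hypothetical, and the regression coefficient is replaced by its usual sample least-squares estimate, which is an assumption of this sketch rather than something specified in the text.

```python
from statistics import mean


def ratio_estimate(y, x, X_bar):
    """Ratio estimator t1 = y_bar * X_bar / x_bar, as in Equation (3)."""
    return mean(y) * X_bar / mean(x)


def regression_estimate(y, x, X_bar):
    """Regression estimator t2 = y_bar + b * (X_bar - x_bar), as in Equation (5).

    b is the sample least-squares slope of y on x (an assumed plug-in
    for the regression coefficient beta_yx).
    """
    y_bar, x_bar = mean(y), mean(x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx
    return y_bar + b * (X_bar - x_bar)


# Hypothetical sample of size 3 and hypothetical known auxiliary mean
y = [10.0, 14.0, 18.0]
x = [5.0, 7.0, 9.0]
X_bar = 8.0

t1 = ratio_estimate(y, x, X_bar)
t2 = regression_estimate(y, x, X_bar)
print(t1, t2)
```

In this toy sample y is exactly proportional to x, so both estimators adjust the sample mean by the same amount; with real data they generally differ.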

Bahl and Tuteja (1991) proposed the following exponential ratio type estimator of the population mean using additional information on the population mean of the auxiliary variable:

$$t_3 = \bar{y}\,\exp\left[\frac{\bar{X}-\bar{x}}{\bar{X}+\bar{x}}\right] \tag{7}$$

The bias and the mean squared error of the above estimator, up to the first order of approximation, are given by

$$B(t_3) = \frac{1-f}{8n}\,\bar{Y}\left[3C_x^2 - 4C_{yx}\right]$$

$$MSE(t_3) = \frac{1-f}{n}\,\bar{Y}^2\left[C_y^2 + \frac{C_x^2}{4} - C_{yx}\right] \tag{8}$$

Kadilar (2016) proposed the following modified exponential type estimator of the population mean using the auxiliary variable:

$$t_4 = \bar{y}\left(\frac{\bar{x}}{\bar{X}}\right)^{\alpha}\exp\left[\frac{\bar{X}-\bar{x}}{\bar{X}+\bar{x}}\right] \tag{9}$$

The bias and the mean squared error of the above estimator, up to the first order of approximation, respectively are

$$B(t_4) = \frac{1-f}{n}\,\bar{Y}\left[\left(\frac{\alpha(\alpha-2)}{2} + \frac{3}{8}\right)C_x^2 + \left(\alpha - \frac{1}{2}\right)C_{yx}\right]$$

$$MSE(t_4) = \frac{1-f}{n}\,\bar{Y}^2\left[C_y^2 + \left(\alpha^2 - \alpha + \frac{1}{4}\right)C_x^2 + (2\alpha - 1)\,C_{yx}\right] \tag{10}$$

The optimum value of the characterizing scalar $\alpha$ is

$$\alpha_{opt} = \frac{1}{2} - \rho_{yx}\,C_y / C_x$$

The minimum value of the mean squared error of the estimator $t_4$ is

$$MSE_{min}(t_4) = \frac{1-f}{n}\,\bar{Y}^2 C_y^2\left(1 - \rho_{yx}^2\right) \tag{11}$$

which is equal to the variance of the usual regression estimator.

Subramani (2016) utilized the additional information on the population median of the study variable and proposed the following ratio estimator of the population mean of the study variable:

$$t_5 = \bar{y}\left(\frac{M}{m}\right) \tag{12}$$

The bias and the mean squared error of the above estimator, up to the first order of approximation, respectively are

$$B(t_5) = \frac{1-f}{n}\,\bar{Y}\left[C_m^2 - C_{ym} - \frac{Bias(m)}{M}\right]$$

$$MSE(t_5) = \frac{1-f}{n}\,\bar{Y}^2\left[C_y^2 + R_5^2 C_m^2 - 2R_5 C_{ym}\right] \tag{13}$$

where,

$$R_5 = \frac{\bar{Y}}{M}, \quad C_m = \frac{S_m}{M}, \quad S_m^2 = \frac{1}{{}^{N}C_n}\sum_{i=1}^{{}^{N}C_n}\left(m_i - M\right)^2,$$

$$S_{ym} = \frac{1}{{}^{N}C_n}\sum_{i=1}^{{}^{N}C_n}\left(\bar{y}_i - \bar{Y}\right)\left(m_i - M\right), \quad C_{ym} = \frac{S_{ym}}{\bar{Y}M}.$$

For a detailed study of the modified ratio type estimators of the population mean of the study variable, the reader is referred to Subramani (2013), Subramani and Kumarapandiyan (2012a,b,c, 2013a,b), Tailor and Sharma (2009), Yadav and Pandey (2011), Yadav and Adewara (2013), Yadav et al. (2014, 2015), Yadav et al. (2016a, 2016b, 2016c, 2016d) and Abid et al. (2016).

3. Proposed Estimator

Using the information on the population median of the study variable, we propose the following regression type estimator of the population mean of the study variable:

$$t_p = \bar{y} + \beta_{ym}\left(M - m\right) \tag{14}$$

where $\beta_{ym}$ is the regression coefficient of the line y on m and is to be estimated such that the variance of $t_p$ is minimum.

Here, $t_p$ is an approximately unbiased estimator of the population mean $\bar{Y}$ for known $\beta_{ym}$. Now,

$$V(t_p) = V(\bar{y}) + \beta_{ym}^2 V(m) - 2\beta_{ym}\,Cov(\bar{y}, m) \tag{15}$$
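The behaviour of Equation (15) as a function of the coefficient can be explored numerically. The sketch below enumerates all SRSWOR samples from a small hypothetical population, computes the variance of the sample mean, the variance of the sample median and their covariance over that sampling distribution, and then evaluates (15). Since (15) is a quadratic in the coefficient, elementary calculus places its minimum at Cov/V(m). The population values and function names are assumptions of this sketch, and the moments are centred at the averages of the sample means and sample medians, which is also an assumption.

```python
from itertools import combinations
from statistics import mean, median


def enumerate_moments(population, n):
    """Return V(y_bar), V(m), Cov(y_bar, m) over all SRSWOR samples of size n."""
    samples = list(combinations(population, n))
    y_bars = [mean(s) for s in samples]
    meds = [median(s) for s in samples]
    k = len(samples)
    y_centre, m_centre = mean(y_bars), mean(meds)
    v_y = sum((a - y_centre) ** 2 for a in y_bars) / k
    v_m = sum((b - m_centre) ** 2 for b in meds) / k
    cov = sum((a - y_centre) * (b - m_centre) for a, b in zip(y_bars, meds)) / k
    return v_y, v_m, cov


def var_tp(beta, v_y, v_m, cov):
    """Right-hand side of Equation (15) for a given coefficient beta."""
    return v_y + beta ** 2 * v_m - 2 * beta * cov


# Hypothetical population (not from the paper)
population = [12.0, 15.0, 9.0, 20.0, 14.0, 18.0]
v_y, v_m, cov = enumerate_moments(population, n=3)

beta_opt = cov / v_m  # minimiser of the quadratic in beta
rho_sq = cov ** 2 / (v_y * v_m)  # squared correlation of y_bar and m
print(var_tp(beta_opt, v_y, v_m, cov), v_y * (1 - rho_sq))
```

The two printed numbers should coincide up to rounding, and moving the coefficient away from `beta_opt` in either direction should not decrease the variance.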


Table 5 : Parameter values and constants computed from three populations.

| Parameters | Popln-1 (n = 3) | Popln-2 (n = 3) | Popln-3 (n = 3) | Popln-1 (n = 5) | Popln-2 (n = 5) | Popln-3 (n = 5) |
|---|---|---|---|---|---|---|
| $N$ | 34 | 34 | 20 | 34 | 34 | 20 |
| $n$ | 3 | 3 | 3 | 5 | 5 | 5 |
| ${}^{N}C_n$ | 5984 | 5984 | 1140 | 278256 | 278256 | 15504 |
| $\bar{Y}$ | 856.4118 | 856.4118 | 41.5 | 856.4118 | 856.4118 | 41.5 |
| $\bar{M}$ | 747.7223 | 747.7223 | 40.2351 | 736.9811 | 736.9811 | 40.0552 |
| $M$ | 767.5 | 767.5 | 40.5 | 767.5 | 767.5 | 40.5 |
| $\bar{X}$ | 208.8824 | 199.4412 | 441.95 | 208.8824 | 199.4412 | 441.95 |
| $R_1$ | 4.0999 | 4.2941 | 0.0939 | 4.0999 | 4.2941 | 0.0939 |
| $R_5$ | 1.1158 | 1.1158 | 1.0247 | 1.1158 | 1.1158 | 1.0247 |
| $C_y^2$ | 0.222726 | 0.222726 | 0.01575 | 0.125014 | 0.125014 | 0.008338 |
| $C_x^2$ | 0.157785 | 0.172408 | 0.014818 | 0.088563 | 0.096771 | 0.007845 |
| $C_m^2$ | 0.172341 | 0.172341 | 0.015931 | 0.100833 | 0.100833 | 0.006606 |
| $C_{ym}$ | 0.137284 | 0.137284 | 0.012549 | 0.07314 | 0.07314 | 0.005394 |
| $C_{yx}$ | 0.084194 | 0.087264 | 0.009964 | 0.047257 | 0.048981 | 0.005275 |
| $\rho_{yx}$ | 0.4491 | 0.4453 | 0.6522 | 0.4491 | 0.4453 | 0.6522 |

Table 6 : Bias of the existing and proposed estimators.

| Estimator | Popln-1 (n = 3) | Popln-2 (n = 3) | Popln-3 (n = 3) | Popln-1 (n = 5) | Popln-2 (n = 5) | Popln-3 (n = 5) |
|---|---|---|---|---|---|---|
| $t_1$ | 63.0241 | 72.9186 | 0.2015 | 35.3748 | 40.9285 | 0.1067 |
| $t_3$ | 4.4436 | 5.4714 | 0.0068 | 1.39995 | 1.7238 | 0.0019 |
| $t_5$ | 52.0924 | 52.0924 | 0.4118 | 57.7705 | 57.7705 | 0.5061 |
| $t_p$ | 31.2036 | 31.2036 | 0.3616 | 43.3777 | 43.3777 | 0.6018 |

$\beta_{ym}$ is estimated by minimizing Equation (15). Setting

$$\frac{\partial V(t_p)}{\partial \beta_{ym}} = 0$$

gives the estimate of $\beta_{ym}$ as

$$\beta_{ym} = \frac{Cov(\bar{y}, m)}{V(m)} \tag{16}$$

Substituting the value of $\beta_{ym}$ in (15), we get the minimum value of $V(t_p)$ as

$$Min.V(t_p) = V(\bar{y})\left(1 - \rho_{ym}^2\right) \tag{17}$$

where,

$$\rho_{ym} = \frac{Cov(\bar{y}, m)}{\sqrt{V(\bar{y})\,V(m)}}$$

4. Efficiency Comparison

A theoretical comparison of the proposed estimator with the other existing estimators mentioned above is made in this section, and the conditions under which it performs better than the other estimators are given.

From Equations (17) and (2), we have

$$V(t_0) - MSE_{min}(t_p) > 0, \text{ if } \rho_{ym}^2 > 0.$$

If the above condition is satisfied, the proposed estimator is better than the usual mean per unit estimator of the population mean.

From Equations (17) and (4), we have $MSE(t_1) - MSE_{min}(t_p) > 0$, if

$$R_1^2 C_x^2 - 2R_1 C_{yx} + \rho_{ym}^2 > 0, \text{ or } R_1^2 C_x^2 + \rho_{ym}^2 > 2R_1 C_{yx}.$$

If the above condition is fulfilled, the proposed estimator performs better than the usual ratio estimator of Cochran (1940).

From Equations (17) and (6), we have


Table 7 : Variance / Mean squared error of the existing and proposed estimators.

| Estimator | Popln-1 (n = 3) | Popln-2 (n = 3) | Popln-3 (n = 3) | Popln-1 (n = 5) | Popln-2 (n = 5) | Popln-3 (n = 5) |
|---|---|---|---|---|---|---|
| $t_0$ | 163356.40 | 163356.40 | 27.12 | 91690.37 | 91690.37 | 14.36 |
| $t_1$ | 155579.70 | 161801.63 | 18.32 | 87325.38 | 90817.69 | 9.70 |
| $t_2$ | 39633.17 | 39801.98 | 4.41 | 12486.31 | 12539.49 | 1.23 |
| $t_3$ | 39672.88 | 39803.44 | 4.63 | 12498.81 | 12539.95 | 1.29 |
| $t_4$ | 39633.17 | 39801.98 | 4.41 | 12486.31 | 12539.49 | 1.23 |
| $t_5$ | 88379.06 | 88379.06 | 11.33 | 58356.92 | 58356.92 | 7.15 |
| $t_p$ | 25270.68 | 25270.68 | 2.86 | 9003.44 | 9003.44 | 1.02 |
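As an illustration of how the first two rows of Table 7 relate to Table 5, the sketch below plugs the Population 1, n = 3 constants into the bracketed parts of Equations (2) and (4). It rests on an assumption that is not stated in the text: because the tabulated squared coefficients change with n, they are treated here as already containing the sampling factor (1 - f)/n, so no further factor is applied. The variable and function names are hypothetical.

```python
# Population 1, n = 3 constants as listed in Table 5
Y_BAR = 856.4118
C_Y2 = 0.222726
C_X2 = 0.157785
C_YX = 0.084194


def var_t0(y_bar, c_y2):
    """Variance of the sample mean, Equation (2), with the sampling
    factor assumed to be absorbed in the tabulated c_y2."""
    return y_bar ** 2 * c_y2


def mse_t1(y_bar, c_y2, c_x2, c_yx):
    """MSE of the ratio estimator, Equation (4), under the same assumption."""
    return y_bar ** 2 * (c_y2 + c_x2 - 2 * c_yx)


v0 = var_t0(Y_BAR, C_Y2)
m1 = mse_t1(Y_BAR, C_Y2, C_X2, C_YX)
print(round(v0, 2), round(m1, 2))
```

Under that assumption the two results land within rounding distance of the Table 7 entries for $t_0$ and $t_1$ in the first column; the remaining rows are not reproduced here.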

Table 8 : PRE of the proposed estimator tp with respect to existing estimators.

| Estimator | Popln-1 (n = 3) | Popln-2 (n = 3) | Popln-3 (n = 3) | Popln-1 (n = 5) | Popln-2 (n = 5) | Popln-3 (n = 5) |
|---|---|---|---|---|---|---|
| $t_0$ | 646.4266 | 646.4266 | 948.2517 | 1018.3930 | 1018.3930 | 1407.8430 |
| $t_1$ | 615.6530 | 640.2741 | 640.5594 | 969.9113 | 1008.7000 | 950.9804 |
| $t_2$ | 156.8346 | 157.5026 | 154.1958 | 138.6838 | 139.2744 | 120.5882 |
| $t_3$ | 156.9917 | 157.5084 | 161.8881 | 138.8226 | 139.2795 | 126.4706 |
| $t_4$ | 156.8346 | 157.5026 | 154.1958 | 138.6838 | 139.2744 | 120.5882 |
| $t_5$ | 349.7296 | 349.7296 | 396.1538 | 648.1625 | 648.1625 | 700.9804 |
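The percentage relative efficiencies in Table 8 follow from Table 7 as 100 times the ratio of an existing estimator's MSE to that of the proposed estimator. A minimal sketch for the Population 1, n = 3 column is given below; the function name `pre` and the dictionary layout are assumptions of this sketch.

```python
def pre(mse_existing, mse_proposed):
    """Percentage relative efficiency of the proposed estimator
    with respect to an existing one."""
    return 100.0 * mse_existing / mse_proposed


# Table 7, Population 1, n = 3
mse_existing = {
    "t0": 163356.40,
    "t1": 155579.70,
    "t2": 39633.17,
    "t3": 39672.88,
    "t4": 39633.17,
    "t5": 88379.06,
}
MSE_TP = 25270.68  # proposed estimator, same column of Table 7

pre_values = {name: pre(value, MSE_TP) for name, value in mse_existing.items()}
for name, value in pre_values.items():
    print(name, round(value, 4))
```

A value above 100 indicates that the proposed estimator has the smaller mean squared error for that comparison.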

$MSE(t_2) - MSE_{min}(t_p) > 0$, if

$$\rho_{ym}^2 - \rho_{yx}^2 > 0$$

The proposed estimator is better than the usual regression estimator if the above condition is satisfied.

From Equations (17) and (8), we have $MSE(t_3) - MSE_{min}(t_p) > 0$, if

$$\frac{C_x^2}{4} - C_{yx} + \rho_{ym}^2 > 0, \text{ or } \frac{C_x^2}{4} + \rho_{ym}^2 > C_{yx}$$

If the above condition is satisfied, the proposed estimator performs better than the Bahl and Tuteja (1991) estimator of the population mean.

From Equations (17) and (11), we have $MSE(t_4) - MSE_{min}(t_p) > 0$, if

$$\rho_{ym}^2 - \rho_{yx}^2 > 0$$

The proposed estimator is better than the Kadilar (2016) estimator if the above condition is satisfied.

From Equations (17) and (13), we have $MSE(t_5) - MSE_{min}(t_p) > 0$, if

$$R_5^2 C_m^2 - 2R_5 C_{ym} + \rho_{ym}^2 > 0, \text{ or } R_5^2 C_m^2 + \rho_{ym}^2 > 2R_5 C_{ym}$$

If the above condition is satisfied, the proposed estimator is better than the Subramani (2016) estimator of the population mean using information on the median of the study variable.

5. Numerical Study

To judge the performances of the proposed and the existing estimators, we have considered the populations given in Subramani (2016). Tables 5, 6, 7 and 8 respectively present the parameter values along with constants, the biases of the existing and proposed estimators, the variances and mean squared errors of the existing and proposed estimators, and the percentage relative efficiencies (PRE) of the proposed estimator over the other existing estimators.

6. Results and Discussion

The present paper deals with the estimation of the population mean using information on the population median of the study variable. A regression type estimator of the population mean of the study variable using the population median of the study variable is proposed. The expressions for the bias and mean squared error of the proposed estimator have been obtained up to the first order of approximation. The minimum value of the mean squared error of the proposed estimator is also obtained. The proposed estimator is compared with the existing estimators, and the conditions under which it performs better than the other existing estimators have been given. A numerical study is also carried out to judge the performances of the various estimators. From Table 7, it is seen that the proposed estimator has the minimum mean squared error among the estimators of the population mean considered. The proposed estimator is better than the estimators of Cochran (1940), Bahl and Tuteja (1991), Kadilar (2016) and Subramani (2016), and the usual regression estimator. The proposed estimator may be used for the improved estimation of the population mean under the simple random sampling scheme.

Acknowledgements

The authors are very thankful to the anonymous referees for critically examining the manuscript and giving suggestions which improved the earlier draft.

References

Abid, M., N. Abbas, R. A. K. Sherwani and H. Z. Nazir (2016). Improved Ratio Estimators for the population mean using non-conventional measure of dispersion. Pakistan Journal of Statistics and Operations Research, XII(2), 353-367.

Bahl, S. and R. K. Tuteja (1991). Ratio and product type exponential estimator. Information and Optimization Sciences, XII(I), 159-163.

Cochran, W. G. (1940). The Estimation of the Yields of the Cereal Experiments by Sampling for the Ratio of Grain to Total Produce. The Journal of Agric. Science, 30, 262-275.

Kadilar, G. O. (2016). A new exponential type estimator for the population mean in simple random sampling. Journal of Modern Applied Statistical Methods, 15(2), 207-214.

Subramani, J. (2013). Generalized modified ratio estimator of finite population mean. Journal of Modern Applied Statistical Methods, 12(2), 121-155.

Subramani, J. (2016). A new median based ratio estimator for estimation of the finite population mean. Statistics in Transition New Series, 17, 4, 1-14.

Subramani, J. and G. Kumarapandiyan (2012a). Estimation of population mean using coefficient of variation and median of an auxiliary variable. International Journal of Probability and Statistics, 1(4), 111-118.

Subramani, J. and G. Kumarapandiyan (2012b). Modified ratio estimators using known median and coefficient of kurtosis. American Journal of Mathematics and Statistics, 2(4), 95-100.

Subramani, J. and G. Kumarapandiyan (2012c). Estimation of population mean using known median and coefficient of skewness. American Journal of Mathematics and Statistics, 2(5), 101-107.

Subramani, J. and G. Kumarapandiyan (2013a). Estimation of population mean using deciles of an auxiliary variable. Statistics in Transition-New Series, 14(1), 75-88.

Subramani, J. and G. Kumarapandiyan (2013b). A new modified ratio estimator of population mean when median of the auxiliary variable is known. Pakistan Journal of Statistics and Operation Research, 9(2), 137-145.

Tailor, R. and B. Sharma (2009). A modified ratio-cum-product estimator of finite population mean using known coefficient of variation and coefficient of kurtosis. Statistics in Transition-New Series, 10(1), 15-24.

Yadav, S. K. and H. Pandey (2011). Improved Exponential Estimators of Population Mean Using Qualitative Auxiliary Information under Two Phase Sampling. Investigations in Mathematical Sciences, 1, 85-94.

Yadav, S. K. and A. A. Adewara (2013). On Improved Estimation of Population Mean using Qualitative Auxiliary Information. Mathematical Theory and Modeling, 3, 11, 42-50.

Yadav, S. K., S. S. Mishra and A. K. Shukla (2014). Improved Ratio Estimators for Population Mean Based on Median Using Linear Combination of Population Mean and Median of an Auxiliary Variable. American Journal of Operational Research, 4, 2, 21-27.

Yadav, S. K., S. S. Mishra and A. K. Shukla (2015). Estimation Approach to Ratio of Two Inventory Population Means in Stratified Random Sampling. American Journal of Operational Research, 5, 4, 96-101.

Yadav, S. K., S. S. Mishra, A. K. Shukla, S. Kumar and R. S. Singh (2016a). Use of Non-Conventional Measures of Dispersion for Improved Estimation of Population Mean. American Journal of Operational Research, 6, 3, 69-75.

Yadav, S. K., S. A. T. Gupta, S. S. Mishra and A. K. Shukla (2016b). Modified Ratio and Product Estimators for Estimating Population Mean in Two-Phase Sampling. American Journal of Operational Research, 6, 3, 61-68.

Yadav, S. K., J. Subramani, S. S. Mishra and A. K. Shukla (2016c). Improved Ratio-Cum-Product Estimators of Population Mean Using Known Population Parameters of Auxiliary Variables. American Journal of Operational Research, 6, 2, 48-54.

Yadav, S. K., S. Misra, S. S. Mishra and N. Chutiman (2016d). Improved ratio estimators of population mean in Adaptive Cluster Sampling. Journal of Statistics Applications and Probability Letter, 3, 1, 1-6.
