Statistical Papers 49, 37-58 (2008)
Statistical Papers © Springer-Verlag 2008
A modified estimator of population mean using power transformation
Housila P. Singh 1, Rajesh Tailor 1, Sarjinder Singh 2, Jong-Min Kim 3
1 School of Studies in Statistics, Vikram University, Ujjain - 456010, M. P., India
2 Department of Statistics, St. Cloud State University, St. Cloud, MN 56301 USA; (e-mail: [email protected])
3 Statistics, Division of Science and Mathematics, University of Minnesota - Morris, Morris, MN 56267, USA
Received: March 1, 2005; revised version: December 6, 2005
Summary In this paper we suggest two modified estimators of the population mean using power transformation. It is shown that, at their optimum conditions, the modified estimators are more efficient than the sample mean estimator, the usual ratio estimator, Sisodia and Dwivedi's (1981) estimator and Upadhyaya and Singh's (1999) estimators. Empirical illustrations are also given to examine the merits of the proposed estimators. Following Kadilar and Cingi (2003), the work has been extended to stratified random sampling, and the same data set has been studied to examine the performance in stratified random sampling.
Keywords: Study variate, auxiliary variate, mean squared error, efficiency.
1. Introduction
Consider a finite population $U = (U_1, U_2, \ldots, U_N)$ of size $N$. Let $y$ and $x$ denote the study variate and the auxiliary variate taking values $y_i$ and $x_i$ respectively on the $i$th unit $U_i$ $(i = 1, 2, \ldots, N)$. We assume that $(y_i, x_i) > 0$, since survey variates are generally non-negative. Let

$$\bar{Y} = N^{-1}\sum_{i=1}^{N} y_i, \qquad S_y^2 = (N-1)^{-1}\sum_{i=1}^{N}(y_i - \bar{Y})^2 \qquad \text{and} \qquad C_y = S_y/\bar{Y}$$

be the population mean, population variance and population coefficient of variation of the study variable $y$ respectively. Further assume that

$$\bar{X} = N^{-1}\sum_{i=1}^{N} x_i, \qquad S_x^2 = (N-1)^{-1}\sum_{i=1}^{N}(x_i - \bar{X})^2, \qquad C_x = S_x/\bar{X} \qquad \text{and} \qquad \beta_2(x) = \frac{N\sum_{i=1}^{N}(x_i - \bar{X})^4}{\left\{\sum_{i=1}^{N}(x_i - \bar{X})^2\right\}^2}$$

are the known population mean, population variance, population coefficient of variation and population coefficient of kurtosis of the auxiliary variable $x$ respectively. Assume that a simple random sample of size $n$ is drawn without replacement from population $U$. Let $\bar{y}$ and $\bar{x}$ be the sample means of $y$ and $x$ respectively. For estimating the population mean $\bar{Y}$ the classical ratio estimator is defined as:

$$\bar{y}_{R(0)} = \bar{y}\left(\frac{\bar{X}}{\bar{x}}\right) \tag{1.1}$$

where the population mean $\bar{X}$ of the auxiliary variable is assumed to be known. If $\bar{X}$ and $C_x$ are known, Sisodia and Dwivedi (1981) suggested a transformed ratio type estimator for the population mean $\bar{Y}$ as:

$$\bar{y}_{R(1)} = \bar{y}\left(\frac{\bar{X} + C_x}{\bar{x} + C_x}\right) \tag{1.2}$$
In many practical situations the value of the auxiliary variate $x$ may be available for each unit in the population; see, for instance, Das and Tripathi (1980, 1981), Singh (2004), Stearns and Singh (2005) and Kadilar and Cingi (2005). Thus, utilizing the information on $\bar{X}$, $C_x$ and $\beta_2(x)$ of the auxiliary variable, Upadhyaya and Singh (1999) suggested the following ratio-type estimators for the population mean $\bar{Y}$:
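$$\bar{y}_{R(2)} = \bar{y}\left(\frac{\bar{X}\beta_2(x) + C_x}{\bar{x}\beta_2(x) + C_x}\right) \tag{1.3}$$

and

$$\bar{y}_{R(3)} = \bar{y}\left(\frac{\bar{X}C_x + \beta_2(x)}{\bar{x}C_x + \beta_2(x)}\right) \tag{1.4}$$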
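As an illustration only, the estimators (1.1) to (1.4) can be sketched in Python as follows. The synthetic population, the sample size and the names `ratio_type_estimators`, `x_pop` and `y_pop` are assumptions made for this example and do not come from the paper; the population quantities follow the definitions given above, with divisor $N - 1$ for the variance.

```python
import numpy as np

def ratio_type_estimators(y_sample, x_sample, x_population):
    """Return the sample mean and the ratio-type estimators (1.1)-(1.4)."""
    x_population = np.asarray(x_population, dtype=float)
    N = x_population.size
    X_bar = x_population.mean()
    C_x = x_population.std(ddof=1) / X_bar          # S_x uses divisor N - 1
    dev = x_population - X_bar
    beta2 = N * np.sum(dev**4) / np.sum(dev**2) ** 2  # coefficient of kurtosis

    y_bar = float(np.mean(y_sample))
    x_bar = float(np.mean(x_sample))
    return {
        "mean": y_bar,                                                  # sample mean
        "R0": y_bar * X_bar / x_bar,                                    # (1.1)
        "R1": y_bar * (X_bar + C_x) / (x_bar + C_x),                    # (1.2)
        "R2": y_bar * (X_bar * beta2 + C_x) / (x_bar * beta2 + C_x),    # (1.3)
        "R3": y_bar * (X_bar * C_x + beta2) / (x_bar * C_x + beta2),    # (1.4)
    }

# Synthetic population (assumed for illustration, not the paper's data).
rng = np.random.default_rng(0)
x_pop = rng.gamma(shape=4.0, scale=5.0, size=500)
y_pop = 3.0 + 2.0 * x_pop + rng.normal(0.0, 4.0, size=500)

idx = rng.choice(500, size=40, replace=False)   # simple random sample without replacement
est = ratio_type_estimators(y_pop[idx], x_pop[idx], x_pop)
for name, value in est.items():
    print(f"{name}: {value:.3f}")
```

When the sample mean of $x$ happens to equal $\bar{X}$, all four ratio-type estimators reduce to the sample mean $\bar{y}$, which is a quick sanity check on any implementation.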
It is a well known result that the regression estimator is more efficient than the ratio (product) estimator except in the case where the regression line of the variable $y$ on the variable $x$ passes through the neighborhood of the origin, in which case the efficiencies of these estimators are almost equal. However, in many practical situations the regression line does not pass through the origin. Considering this fact, Srivastava (1967) suggested a modified ratio-type estimator using power transformation that is more efficient than the usual ratio estimator in some situations; related work can be found in Singh (2003). In the present investigation we suggest a modification of the Upadhyaya and Singh (1999) estimators using the concept of power transformation earlier used by Srivastava (1967). It is shown that the proposed estimators are more efficient than the sample mean estimator $\bar{y}$, the usual ratio estimator $\bar{y}_{R(0)}$, the Sisodia and Dwivedi (1981) estimator $\bar{y}_{R(1)}$, and the Upadhyaya and Singh (1999) estimators $\bar{y}_{R(2)}$ and $\bar{y}_{R(3)}$. Numerical illustrations are given to judge the merits of the suggested estimators over others.

2. The modified estimators
By applying power transformation on Upadhyaya and Singh (1999) estimators, the modified estimators are given by:

$$\bar{y}_{R(\alpha)} = \bar{y}\left\{\frac{\bar{X}\beta_2(x) + C_x}{\bar{x}\beta_2(x) + C_x}\right\}^{\alpha} \tag{2.1}$$

and

$$\bar{y}_{R(\delta)} = \bar{y}\left\{\frac{\bar{X}C_x + \beta_2(x)}{\bar{x}C_x + \beta_2(x)}\right\}^{\delta} \tag{2.2}$$

where $\alpha$ and $\delta$ are suitably chosen scalars such that the mean squared errors of $\bar{y}_{R(\alpha)}$ and $\bar{y}_{R(\delta)}$ are minimum. It may be noted that the new estimators are generalizations of earlier estimators, namely (2.1) generalizes (1.3), and (2.2) generalizes (1.4). To the first degree of approximation, the biases and mean squared errors of $\bar{y}_{R(\alpha)}$ and $\bar{y}_{R(\delta)}$ are, respectively, given by:
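$$B(\bar{y}_{R(\alpha)}) = (\bar{Y}/2)\,\lambda\alpha\theta C_x^2\,[(\alpha + 1)\theta - 2K] \tag{2.3}$$

$$B(\bar{y}_{R(\delta)}) = (\bar{Y}/2)\,\lambda\delta\theta' C_x^2\,[(\delta + 1)\theta' - 2K] \tag{2.4}$$

$$\mathrm{MSE}(\bar{y}_{R(\alpha)}) = \lambda\bar{Y}^2\,[C_y^2 + \alpha\theta C_x^2(\alpha\theta - 2K)] \tag{2.5}$$

and

$$\mathrm{MSE}(\bar{y}_{R(\delta)}) = \lambda\bar{Y}^2\,[C_y^2 + \delta\theta' C_x^2(\delta\theta' - 2K)] \tag{2.6}$$

where $\theta = \{\bar{X}\beta_2(x)\}/\{\bar{X}\beta_2(x) + C_x\}$, $\theta' = \{\bar{X}C_x\}/\{\bar{X}C_x + \beta_2(x)\}$, $\lambda = (N - n)/(nN)$, $K = \rho C_y/C_x$, and $\rho$ is the correlation coefficient between $y$ and $x$. The mean squared errors at (2.5) and (2.6) are, respectively, minimized for: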
$$\alpha = (K/\theta) = \alpha_{opt} \ \text{(say)} \tag{2.7}$$

and

$$\delta = (K/\theta') = \delta_{opt} \ \text{(say)} \tag{2.8}$$
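The step from (2.5) to the optimum exponent can be written out as a short derivation. This is a sketch added for clarity and uses only symbols already defined in the text, together with the identity $K C_x = \rho C_y$; the case of $\delta$ follows by replacing $\theta$ with $\theta'$.

```latex
\begin{align*}
\frac{\partial}{\partial\alpha}\,\mathrm{MSE}(\bar{y}_{R(\alpha)})
  &= \lambda\bar{Y}^2 C_x^2\,(2\alpha\theta^2 - 2K\theta) = 0
  \;\Longrightarrow\; \alpha_{opt} = K/\theta, \\
\frac{\partial^2}{\partial\alpha^2}\,\mathrm{MSE}(\bar{y}_{R(\alpha)})
  &= 2\lambda\bar{Y}^2\theta^2 C_x^2 > 0
  \quad\text{(so the stationary point is a minimum)}, \\
\mathrm{MSE}(\bar{y}_{R(\alpha)})\Big|_{\alpha\theta = K}
  &= \lambda\bar{Y}^2\,(C_y^2 - K^2 C_x^2)
   = \lambda\bar{Y}^2 C_y^2\,(1 - \rho^2).
\end{align*}
```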
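Thus the common minimum mean squared error of $\bar{y}_{R(\alpha)}$ and $\bar{y}_{R(\delta)}$ is given by:

$$\min.\mathrm{MSE}(\bar{y}_{R(\alpha)}) = \min.\mathrm{MSE}(\bar{y}_{R(\delta)}) = \lambda\bar{Y}^2 C_y^2\,(1 - \rho^2) \tag{2.9}$$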
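A small numerical check of (2.5) to (2.9) may help fix ideas. The population summaries below are hypothetical values chosen for illustration, and `first_order_mse` is a helper name introduced here, not one used in the paper. The sketch evaluates the first-order MSE at the optimum exponents and compares it with the closed form (2.9).

```python
def first_order_mse(exponent, theta, lam, Y_bar, C_y, C_x, K):
    """First-order MSE of the power-transformed estimator, as in (2.5)/(2.6)."""
    t = exponent * theta
    return lam * Y_bar**2 * (C_y**2 + t * C_x**2 * (t - 2.0 * K))

# Hypothetical population summaries (assumed, not taken from the paper).
Y_bar, X_bar = 50.0, 20.0
C_y, C_x, rho, beta2 = 0.40, 0.50, 0.80, 3.5
N, n = 500, 40

lam = (N - n) / (n * N)
K = rho * C_y / C_x
theta = X_bar * beta2 / (X_bar * beta2 + C_x)
theta_prime = X_bar * C_x / (X_bar * C_x + beta2)

alpha_opt = K / theta            # (2.7)
delta_opt = K / theta_prime      # (2.8)

mse_alpha = first_order_mse(alpha_opt, theta, lam, Y_bar, C_y, C_x, K)
mse_delta = first_order_mse(delta_opt, theta_prime, lam, Y_bar, C_y, C_x, K)
mse_min = lam * Y_bar**2 * C_y**2 * (1.0 - rho**2)   # (2.9)

# Exponent 1 gives the Upadhyaya and Singh (1999) estimators; exponent 0 gives the sample mean.
mse_us1 = first_order_mse(1.0, theta, lam, Y_bar, C_y, C_x, K)
mse_mean = first_order_mse(0.0, theta, lam, Y_bar, C_y, C_x, K)

print(alpha_opt, delta_opt)
print(mse_alpha, mse_delta, mse_min)
print(mse_us1, mse_mean)
```

Both optimum exponents lead to the same minimum, as (2.9) states, while the fixed choices of exponent 0 and 1 can do no better than that minimum.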
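Substitution of (2.7) and (2.8) respectively in (2.3) and (2.4) yields the resulting biases of $\bar{y}_{R(\alpha)}$ and $\bar{y}_{R(\delta)}$ as:

$$B(\bar{y}_{R(\alpha_{opt})}) = \lambda(\bar{Y}/2)\,K(\theta - K)\,C_x^2 \tag{2.10}$$

and

$$B(\bar{y}_{R(\delta_{opt})}) = \lambda(\bar{Y}/2)\,K(\theta' - K)\,C_x^2 \tag{2.11}$$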
3. Efficiencies of modified estimators

It can easily be proved that the proposed modified estimator $\bar{y}_{R(\alpha)}$ has lower mean squared error than the:

(i) sample mean estimator $\bar{y}$ if: either $0 < \alpha < 2K/\theta$ or $2K/\theta < \alpha < 0$
and

$$\mathrm{MSE}(\bar{y}_{stUS2}) - \min.\mathrm{MSE}(\bar{y}_{R(\delta_{st})}) = \frac{(A^* - R_{US2}B^*)^2}{B^*} > 0 \tag{6.24}$$
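We note from (6.23) and (6.24) that the estimator $\bar{y}_{R(\delta_{st})}$ is better than the usual unbiased estimator $\bar{y}_{st}$ and the estimator $\bar{y}_{stUS2}$. In practice, if the values of $\alpha_{st}^{(o)}$ and $\delta_{st}^{(o)}$ are not known, it is advisable to use their consistent estimators as:

$$\hat{\alpha}_{st}^{(o)} = \frac{\hat{A}}{\hat{B}\,\hat{R}_{US1}} \tag{6.25}$$

and

$$\hat{\delta}_{st}^{(o)} = \frac{\hat{A}^*}{\hat{B}^*\,\hat{R}_{US2}} \tag{6.26}$$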
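where:

$$\hat{A} = \sum_{h=1}^{k}\omega_h^2\gamma_h\beta_{2h}(x)\,s_{yxh}, \qquad \hat{A}^* = \sum_{h=1}^{k}\omega_h^2\gamma_h C_{xh}\,s_{yxh}, \qquad \hat{B} = \sum_{h=1}^{k}\omega_h^2\gamma_h\beta_{2h}^2(x)\,s_{xh}^2, \qquad \hat{B}^* = \sum_{h=1}^{k}\omega_h^2\gamma_h C_{xh}^2\,s_{xh}^2,$$

$$\hat{R}_{US1} = \frac{\sum_{h=1}^{k}\omega_h\bar{y}_h}{\sum_{h=1}^{k}\omega_h\{\bar{X}_h\beta_{2h}(x) + C_{xh}\}}, \qquad \hat{R}_{US2} = \frac{\sum_{h=1}^{k}\omega_h\bar{y}_h}{\sum_{h=1}^{k}\omega_h\{\bar{X}_hC_{xh} + \beta_{2h}(x)\}},$$

$$\bar{y}_h = n_h^{-1}\sum_{j=1}^{n_h}y_{hj}, \qquad \bar{x}_h = n_h^{-1}\sum_{j=1}^{n_h}x_{hj}, \qquad s_{xh}^2 = (n_h - 1)^{-1}\sum_{j=1}^{n_h}(x_{hj} - \bar{x}_h)^2, \qquad s_{yxh} = (n_h - 1)^{-1}\sum_{j=1}^{n_h}(y_{hj} - \bar{y}_h)(x_{hj} - \bar{x}_h).$$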
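Thus we get the resulting estimators for the population mean $\bar{Y}$ as:

$$\bar{y}_{R(\hat{\alpha}_{st})} = \bar{y}_{st}\left(\frac{\bar{X}_{US1}}{\bar{x}_{US1}}\right)^{\hat{\alpha}_{st}^{(o)}} \tag{6.27}$$

and

$$\bar{y}_{R(\hat{\delta}_{st})} = \bar{y}_{st}\left(\frac{\bar{X}_{US2}}{\bar{x}_{US2}}\right)^{\hat{\delta}_{st}^{(o)}} \tag{6.28}$$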
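To the first degree of approximation it can be shown that:

$$\mathrm{MSE}(\bar{y}_{R(\hat{\alpha}_{st})}) = \min.\mathrm{MSE}(\bar{y}_{R(\alpha_{st})}) \tag{6.29}$$

and

$$\mathrm{MSE}(\bar{y}_{R(\hat{\delta}_{st})}) = \min.\mathrm{MSE}(\bar{y}_{R(\delta_{st})}) \tag{6.30}$$

where $\min.\mathrm{MSE}(\bar{y}_{R(\alpha_{st})})$ and $\min.\mathrm{MSE}(\bar{y}_{R(\delta_{st})})$ are respectively given by (6.19) and (6.20).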
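A plug-in estimate of the optimum exponent can be sketched as follows. The sketch assumes that the estimate has the form $\hat{A}/(\hat{B}\hat{R}_{US1})$, with $\hat{A}$ and $\hat{B}$ built from the sample covariance and variance in each stratum weighted by $\omega_h^2\gamma_h$ and the kurtosis $\beta_{2h}(x)$. It also assumes $\omega_h = N_h/N$ and $\gamma_h = (1 - n_h/N_h)/n_h$, which are not defined in this part of the text. The function name `estimated_alpha`, the dictionary keys and all numbers are hypothetical.

```python
import numpy as np

def estimated_alpha(strata):
    """Plug-in estimate A_hat / (B_hat * R_hat_US1) of the optimum exponent.

    Each stratum is a dict with assumed keys:
      y, x   : sampled values of the study and auxiliary variates
      w      : stratum weight omega_h (assumed N_h / N)
      gamma  : gamma_h (assumed (1 - n_h / N_h) / n_h)
      X_bar  : known stratum mean of x
      C_x    : known stratum coefficient of variation of x
      beta2  : known stratum coefficient of kurtosis of x
    """
    A_hat = B_hat = num = den = 0.0
    for s in strata:
        y = np.asarray(s["y"], dtype=float)
        x = np.asarray(s["x"], dtype=float)
        s_yx = np.cov(y, x, ddof=1)[0, 1]   # sample covariance, divisor n_h - 1
        s_x2 = np.var(x, ddof=1)            # sample variance, divisor n_h - 1
        wg = s["w"] ** 2 * s["gamma"]
        A_hat += wg * s["beta2"] * s_yx
        B_hat += wg * s["beta2"] ** 2 * s_x2
        num += s["w"] * y.mean()
        den += s["w"] * (s["X_bar"] * s["beta2"] + s["C_x"])
    R_hat = num / den
    return A_hat / (B_hat * R_hat)

# Made-up two-stratum sample; none of these numbers come from the paper.
strata = [
    {"y": [12.0, 15.0, 11.0, 18.0], "x": [5.0, 7.0, 4.0, 8.0],
     "w": 0.6, "gamma": 0.2, "X_bar": 6.2, "C_x": 0.35, "beta2": 2.8},
    {"y": [30.0, 26.0, 35.0, 28.0], "x": [14.0, 11.0, 16.0, 13.0],
     "w": 0.4, "gamma": 0.2, "X_bar": 13.0, "C_x": 0.25, "beta2": 3.1},
]
alpha_hat = estimated_alpha(strata)
print(alpha_hat)
```

The estimate for $\delta_{st}^{(o)}$ would follow the same pattern with $C_{xh}$ and $\beta_{2h}(x)$ exchanging roles, under the same assumptions.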
7. Empirical study using stratified random sampling

In order to see the performance of the suggested estimators over other estimators, we have chosen the same data set as considered by Kadilar and Cingi (2003), which is related to biometrical science. The percent relative efficiencies (PREs) of different estimators with respect to $\bar{y}_{st}$ have been computed and presented in Tables 7.1, 7.2, 7.3 and 7.4.
Table 7.1. PRE of $\bar{y}_{R(\alpha_{st})}$ with respect to $\bar{y}_{st}$ for different values of $\alpha_{st}$.

$\alpha_{st}$                        PRE
0.00                                 100.00
0.25                                 144.24
0.50                                 206.62
0.75                                 245.03
$\alpha_{st}^{(o)}$ = 0.82579        248.09
1.00                                 232.75
1.25                                 178.38
1.50                                 124.85
1.60                                 107.79
1.6515                               100.00
Table 7.2. PRE of $\bar{y}_{R(\delta_{st})}$ with respect to $\bar{y}_{st}$ for different values of $\delta_{st}$.

$\delta_{st}$                        PRE
0.00                                 100.00
0.25                                 141.57
0.50                                 202.91
0.75                                 278.09
1.00                                 326.37
$\delta_{st}^{(o)}$ = 1.0411         360.09
1.25                                 300.00
1.50                                 227.20
1.75                                 159.41
2.00                                 117.78
2.0822                               100.00
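Tables 7.1 and 7.2 show that the estimators $\bar{y}_{R(\alpha_{st})}$ and $\bar{y}_{R(\delta_{st})}$ are better than the conventional unbiased estimator $\bar{y}_{st}$ even when the scalars $\alpha_{st}$ and $\delta_{st}$ depart considerably from their corresponding optimum values $\alpha_{st}^{(o)}$ and $\delta_{st}^{(o)}$. Thus there is enough scope for choosing the scalars $\alpha_{st}$ and $\delta_{st}$ in $\bar{y}_{R(\alpha_{st})}$ and $\bar{y}_{R(\delta_{st})}$ to obtain better estimators. It is further observed from Table 7.3 that the estimators $\bar{y}_{R(\alpha_{st})}$ and $\bar{y}_{R(\delta_{st})}$ (or the estimators $\bar{y}_{R(\hat{\alpha}_{st})}$ and $\bar{y}_{R(\hat{\delta}_{st})}$ based on estimated optimum values) give the largest gain in efficiency at their optimum conditions (i.e. the optimum estimators $\bar{y}^{(o)}_{R(\alpha_{st})}$ and $\bar{y}^{(o)}_{R(\delta_{st})}$).

Table 7.3. PREs of $\bar{y}_{st}$, $\bar{y}_{RC}$, $\bar{y}_{stSD}$, $\bar{y}_{stSK}$, $\bar{y}_{stUS1}$, $\bar{y}_{stUS2}$, $\bar{y}^{(o)}_{R(\alpha_{st})}$ (or $\bar{y}_{R(\hat{\alpha}_{st})}$) and $\bar{y}^{(o)}_{R(\delta_{st})}$ (or $\bar{y}_{R(\hat{\delta}_{st})}$) with respect to $\bar{y}_{st}$.

Estimator                            PRE
$\bar{y}_{st}$                       100.00
$\bar{y}_{RC}$                       312.21
$\bar{y}_{stSD}$                     312.00
$\bar{y}_{stSK}$                     312.02
$\bar{y}_{stUS1}$                    232.75
$\bar{y}_{stUS2}$                    326.37
$\bar{y}^{(o)}_{R(\alpha_{st})}$     248.09
$\bar{y}^{(o)}_{R(\delta_{st})}$     360.09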
We have further computed the ranges of $\alpha_{st}$ and $\delta_{st}$ for $\bar{y}_{R(\alpha_{st})}$ and $\bar{y}_{R(\delta_{st})}$ to be more efficient than different estimators of the population mean $\bar{Y}$, and compiled them in Table 7.4.

Table 7.4. Ranges of $\alpha_{st}$ and $\delta_{st}$ for $\bar{y}_{R(\alpha_{st})}$ and $\bar{y}_{R(\delta_{st})}$ to be more efficient than various estimators of the population mean.

Estimator             Range of $\alpha_{st}$      Range of $\delta_{st}$
$\bar{y}_{st}$        (0.0000, 1.6516)            (0.0000, 2.0822)
$\bar{y}_{RC}$        (-54.0127, 55.6643)         (-0.6522, 2.7344)
$\bar{y}_{stSD}$      (-54.0056, 55.6571)         (-0.6518, 2.7340)
$\bar{y}_{stSK}$      (-53.9484, 55.6000)         (-0.6490, 2.7312)
$\bar{y}_{stUS1}$     (0.6516, 1.0000)            (0.6004, 1.4818)
$\bar{y}_{stUS2}$     (0.0000, 1.6515)            (1.0000, 1.0822)
Table 7.4 gives the common range of $\alpha_{st}$ as (0.6516, 1.0000) for $\bar{y}_{R(\alpha_{st})}$ to be better than the estimators $\bar{y}_{st}$, $\bar{y}_{RC}$, $\bar{y}_{stSD}$, $\bar{y}_{stSK}$ and $\bar{y}_{stUSi}$, $i = 1, 2$, while the common range of $\delta_{st}$ is (1.0000, 1.0822) for $\bar{y}_{R(\delta_{st})}$ to be more efficient than the rest of the estimators.
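Several of the ranges in Table 7.4 can be reproduced by a simple mirror argument, sketched below. The sketch assumes that the first-order MSE in stratified sampling is quadratic in the exponent, as (2.5) is in simple random sampling, so that an exponent beats a competitor with fixed exponent $c$ exactly when it lies between $c$ and its mirror image $2\,\text{opt} - c$. The function name `better_than_range` is hypothetical; the optimum values are those reported in Tables 7.1 and 7.2.

```python
def better_than_range(opt, competitor_exponent):
    """Open interval of exponents with smaller MSE than a fixed competitor exponent,
    assuming the MSE is quadratic in the exponent with its minimum at `opt`."""
    mirror = 2.0 * opt - competitor_exponent
    return (min(competitor_exponent, mirror), max(competitor_exponent, mirror))

alpha_opt, delta_opt = 0.82579, 1.0411   # optimum values from Tables 7.1 and 7.2

# Exponent 0 corresponds to y_st; exponent 1 corresponds to y_stUS1 (alpha) or y_stUS2 (delta).
alpha_vs_st = better_than_range(alpha_opt, 0.0)    # about (0.0, 1.6516)
alpha_vs_us1 = better_than_range(alpha_opt, 1.0)   # about (0.6516, 1.0)
delta_vs_st = better_than_range(delta_opt, 0.0)    # about (0.0, 2.0822)
delta_vs_us2 = better_than_range(delta_opt, 1.0)   # about (1.0, 1.0822)

for label, rng in [("alpha vs y_st", alpha_vs_st), ("alpha vs y_stUS1", alpha_vs_us1),
                   ("delta vs y_st", delta_vs_st), ("delta vs y_stUS2", delta_vs_us2)]:
    print(f"{label}: ({rng[0]:.4f}, {rng[1]:.4f})")
```

The remaining entries of Table 7.4 compare against estimators that are not members of the same one-parameter family, so they cannot be obtained from this argument alone.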
Conclusions

We conclude that the modified estimators $\bar{y}_{R(\alpha)}$ and $\bar{y}_{R(\delta)}$, and their extensions to stratified sampling, are worth using not only at their optimum conditions but also over a fairly wide range of scalars around the optimum conditions. Thus this study answers a valuable question recently raised by Kadilar and Cingi (2003) about the doubtfulness of the validity of the theory of ratio type estimators in stratified random sampling and simple random sampling.
Acknowledgements

The authors are thankful to the Editor, Professor Götz Trenkler, and a learned referee for their valuable comments, which helped bring the original manuscript to its present form.
References

Cochran, W.G. (1977). Sampling Techniques. John Wiley and Sons, Inc., London.
Das, A.K. (1988). Contribution to the theory of sampling strategies based on auxiliary information. Ph.D. thesis submitted to Bidhan Chandra Krishi Vishwavidyalaya, Mohanpur, Nadia, West Bengal, India.
Das, A.K. and Tripathi, T.P. (1980). Sampling strategies for population mean when the coefficient of variation of an auxiliary character is known. Sankhya, 42, C, 76-86.
Das, A.K. and Tripathi, T.P. (1981). A class of sampling strategies for population mean using information on mean and variance of an auxiliary character. Proc. of the Indian Statistical Institute Golden Jubilee International Conference on Statistics: Applications and New Directions, Calcutta, 16-19 December 1981, 174-181.
Kadilar, C. and Cingi, H. (2003). Ratio estimators in stratified random sampling. Biom. J., 45, 2, 218-225.
Kadilar, C. and Cingi, H. (2005). A new ratio estimator in stratified random sampling. Comm. Statist. - Theory Meth., 34, 597-602.
Singh, S. (2003). Advanced sampling theory with applications: How Michael "Selected" Amy. pp. 1-1220 (Vol. 1 and Vol. 2). Kluwer Academic Publishers, The Netherlands.
Singh, S. (2004). Golden and Silver Jubilee Year-2003 of the linear regression estimators. Proc. of the American Statistical Association, Survey Method Section [CD-ROM], Toronto, Canada: American Statistical Association, pp. 4382-4389.
Sisodia, B.V.S. and Dwivedi, V.K. (1981). A modified ratio estimator using coefficient of variation of auxiliary variable. Jour. Indian Soc. Agric. Statist., 33, 13-18.
Srivastava, S.K. (1967). An estimator using auxiliary information in sample surveys. Calcutta Statist. Assoc. Bull., 16, 121-132.
Stearns, M. and Singh, S. (2005). A new model assisted chi-square distance function for calibration of design weights. Presented at the Joint Statistical Meeting, Minneapolis, ASA Section on Survey Research Methods [CD], pp. 3600-3607.
Upadhyaya, L.N. and Singh, H.P. (1999). Use of transformed auxiliary variable in estimating the finite population mean. Biom. J., 41, 5, 627-636.