On Variance Estimators for Normal Population - InterStat

On Efficient Iterative Estimation Algorithm Using Sample Counterpart of the Searles’ Normal Mean Estimator with Exceptionally Large But Unknown Coefficient of Variation

1Winston A. Richards, 2Robin Antoine, *2Ashok Sahai, and 3M. Raghunadh Acharya

1Department of Mathematics and Statistics, The Pennsylvania State University, Harrisburg.

2Department of Mathematics & Computer Science, Faculty of Science & Agriculture, St. Augustine Campus, The University of The West Indies, Trinidad & Tobago, West Indies.

3Department of Statistics and Computer Science, Aurora’s Post Graduate College, Osmania University, Hyderabad (Andhra Pradesh), India.

Abstract

This paper addresses the problem of finding an optimal estimator of the normal population mean when the coefficient of variation is unknown but is expected, on the evidence of pilot surveys of the population at hand, to be exceptionally high. The paper proposes an “Efficient Iterative Estimation Algorithm Using Sample Counterpart of the Searles’ Normal Mean Estimator”. The estimators produced by this strategy have no closed-form expressions for their mean squared errors, so their relative efficiencies with respect to the usual unbiased sample mean estimator cannot be determined analytically. Instead, we examine the relative efficiencies of our estimators with respect to the usual unbiased estimator $\bar{x}$ by means of an illustrative simulated empirical numerical study. MATLAB 7.7.0.471 (R2008b) is used to program this study.

Key words and phrases: MVUE, MMSE, Complete Sufficient Statistic, Numerical Study. AMS Classification: 62F10.

[*[email protected] : Corresponding Author’s Email Address]

1 Introduction

This paper addresses the problem of finding an optimal estimator of the normal population mean when the coefficient of variation (c.v.) is unknown but is expected, on the evidence of pilot surveys of the population at hand, to be exceptionally high. This can happen when the population mean θ is expected to be positive but very close to zero while the population standard deviation σ is relatively large. Such cases are exceptional, but they are of practical relevance in statistical modeling applications in areas such as astronomy, stock markets, biodiversity, and the environmental sciences.

Searls (1964), Khan (1968), Gleser & Healy (1976), and Arnholt & Hebert (1995) considered the problem of estimating the normal population mean when its c.v. is known. The Searls (1964) estimator of the normal population mean is well known to be

$$\mathrm{SE} = \frac{\bar{x}}{1 + V^{2}}, \tag{1.1}$$

where V is the coefficient of variation of the sample mean $\bar{x}$, that is,

$$V = \mathrm{c.v.}(\bar{x}) = \frac{\sigma}{\sqrt{n}\,\theta}. \tag{1.2}$$
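As a rough numerical illustration of (1.2), the sketch below evaluates V and the resulting multiplier 1/(1 + V²) that (1.1) applies to the sample mean. The function name `cv_of_mean` is purely illustrative, and the values θ = 0.1 and σ = 1 with n = 3 and n = 101 are borrowed from the simulation settings adopted later in the paper.

```python
import math


def cv_of_mean(sigma, theta, n):
    """Coefficient of variation of the sample mean, V = sigma / (sqrt(n) * theta), as in (1.2)."""
    return sigma / (math.sqrt(n) * theta)


# Illustrative values only (theta = 0.1, sigma = 1.0), taken from the paper's simulation settings.
for n in (3, 101):
    V = cv_of_mean(sigma=1.0, theta=0.1, n=n)
    shrink = 1.0 / (1.0 + V ** 2)  # multiplier applied to the sample mean in (1.1)
    print(f"n={n:3d}  V={V:.4f}  1/(1+V^2)={shrink:.4f}")
```

Even at n = 101 the multiplier is close to one half for these values, so the estimator in (1.1) shrinks the sample mean substantially when θ is small relative to σ.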

The Searls (1964) estimator is known to be “MMSE”, and hence optimal. But this estimator can be used only when, for such a normal population $N(\theta, a\theta^{2})$, the value $a = V^{2}$ is known.

In that case, the sufficient statistic $(\bar{X}, S^{2})$ is not complete. Consequently, when estimating the mean, the variance, or essentially any function of the unknown parameter θ, we are groping in the dark when attempting to find its Universally Minimum Mean Squared Error Estimate (UMMSE) or its Universally Minimum Variance Unbiased Estimate (UMVUE).

In this paper we are nevertheless motivated by the result of Searls (1964) leading to the estimator SE in (1.1), and we consider the estimation of the normal population mean when its c.v. is unknown. In practice V is unknown far more often than it is known; indeed, even the assumption that V is known approximately is seldom justified. This paper therefore proposes an estimator that uses the sample counterpart of $V^{2}$, namely

$$v^{2} = \frac{S^{2}}{n\,\bar{x}^{2}}, \tag{1.3}$$

where $\bar{x}$ and $S^{2}$ are, respectively, the sample mean and the sample variance. Hence the initially proposed estimator of the normal population mean θ is

$$\mathrm{SE}(0) = \frac{\bar{x}}{1 + v^{2}}, \tag{1.4}$$

where $v^{2}$ is the estimated squared c.v. of $\bar{x}$, as in (1.3).
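The plug-in estimator in (1.3) and (1.4) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB code: the function name `searls_plugin_estimate` and the data are hypothetical, the sample variance is assumed to use the divisor n - 1, and the handling of a sample mean of exactly zero is an added assumption.

```python
import statistics


def searls_plugin_estimate(sample):
    """Return (xbar, v2, se0): sample mean, estimated squared c.v. of the mean (1.3), and SE(0) (1.4)."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations to compute a sample variance")
    xbar = statistics.fmean(sample)
    s2 = statistics.variance(sample)  # assumption: unbiased sample variance (divisor n - 1)
    if xbar == 0.0:
        # assumption: v2 is unbounded here, so the shrunken estimate is taken as 0
        return xbar, float("inf"), 0.0
    v2 = s2 / (n * xbar ** 2)
    return xbar, v2, xbar / (1.0 + v2)


# Hypothetical data, for illustration only.
xbar, v2, se0 = searls_plugin_estimate([0.5, -1.0, 2.0, 0.9])
print(xbar, v2, se0)
```

Because $v^{2}$ is never negative, SE(0) always has the same sign as $\bar{x}$ and is never larger in absolute value.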

2 The Proposed Estimators

We start by noting that the estimator SE(0), written out as a Taylor expansion, involves infinitely many terms in $\bar{x}$ and $S^{2}$. Hence the bias and mean squared error of SE(0) have no closed-form expressions, and its relative efficiency with respect to the usual unbiased sample mean estimator cannot be determined analytically. The only open recourse is to study the problem numerically, using a large number of simulated samples of various illustrative sizes n from the parent normal population, with illustrative values of the population mean and standard deviation. Such a simulated numerical empirical study, based on 51,000 replications, is reported in the next section.

Before turning to it, we make a simple but significant observation. Since the estimator SE(0) is intuitively expected to perform better than the usual unbiased estimator $\bar{x}$, as the simulation study confirms, we could obtain a better estimator by using SE(0) in place of $\bar{x}$ in the initially proposed estimator (1.4), which is iteration 0 of our proposed iterative algorithm. This leads us to propose

$$\mathrm{SE}(1) = \frac{\mathrm{SE}(0)}{1 + v^{2}}. \tag{2.1}$$

This is the first iteration of our proposed “Efficient Iterative Estimation Algorithm Using Sample Counterpart of the Searles’ Normal Mean Estimator with Unknown Coefficient of Variation”. By exactly the same reasoning, since SE(1) is intuitively expected to perform better than SE(0), as the simulation study later confirms, we propose

$$\mathrm{SE}(2) = \frac{\mathrm{SE}(1)}{1 + v^{2}}. \tag{2.2}$$

In general, the iterative algorithm for obtaining a more efficient estimator of the normal population mean is

$$\mathrm{SE}(I+1) = \frac{\mathrm{SE}(I)}{1 + v^{2}}, \qquad I = 0\,(1)\,M, \tag{2.3}$$

where I is the iteration number and M is any chosen positive integer.
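The recursion (2.3) is simple enough to sketch directly. In the Python fragment below, the function name `iterated_estimates` and the input values are hypothetical; the code also checks the recursion against the form obtained by unrolling it, SE(I) = $\bar{x}$ / (1 + v²)^(I+1).

```python
def iterated_estimates(xbar, v2, max_iter):
    """Return [SE(0), SE(1), ..., SE(max_iter)] using the recursion in (2.3)."""
    estimates = [xbar / (1.0 + v2)]                   # SE(0), as in (1.4)
    for _ in range(max_iter):
        estimates.append(estimates[-1] / (1.0 + v2))  # SE(I+1) = SE(I) / (1 + v2)
    return estimates


# Hypothetical inputs, for illustration only.
xbar, v2 = 0.6, 1.25
se = iterated_estimates(xbar, v2, max_iter=3)
# Unrolling the recursion gives SE(I) = xbar / (1 + v2) ** (I + 1).
closed = [xbar / (1.0 + v2) ** (i + 1) for i in range(4)]
print(se)
print(closed)
```

Since $v^{2} \ge 0$, every iteration moves the estimate toward zero, which is the direction of the true mean when θ is small relative to the sampling error of $\bar{x}$.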

To quantify the improvement brought by these iteratively generated estimators, we carry out an empirical study in the following section. We define the relative efficiency Reff(•) of the estimator at hand, relative to the sample mean $T = \bar{x}$, as

$$\mathrm{Reff}(\bullet) = 100 \cdot \frac{V(T)}{\mathrm{MSE}(\bullet)}\ \%, \qquad \mathrm{MSE}(\bullet) = E[\bullet - \theta]^{2}, \tag{2.4}$$

so that a value above 100% indicates a smaller mean squared error than that of $\bar{x}$.

Here MSE(•) = V(•) + B²(•) is the mean squared error of the estimator at hand, V(•) is its variance, and B(•) is its bias. For illustration, we consider the relative efficiencies of the estimators in order of the iteration number I = 0 (1) 7.

3 The Simulation Empirical Numerical Study

Since no closed-form analytical expressions are available for the Reff(•)’s, the gain achieved by pursuing the iteratively more efficient estimation of the normal population mean can only be assessed through an illustrative simulation study, as attempted in this section. The Reff(•)’s have been calculated for nine illustrative values of the population standard deviation, namely σ = 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, and 5.0. Without loss of generality, and for simplicity of illustration, we take the normal population mean to be θ = 0.1 (i.e., positive but very close to zero). Thus the value of V [i.e., the c.v.($\bar{x}$)] is at least 0.5 even when n = 101. We have also carried out the study for nine example values of the sample size, namely n = 3, 5, 7, 11, 21, 41, 61, 81, and 101.

The actual MSEs are calculated from 51,000 replications (pseudo-random normal samples of size n) for the estimators SE(I), I = 0 (1) 7, and for the usual unbiased estimator, the sample mean $\bar{x}$. The values of the Reff(•)’s are then calculated as per (2.4). They are reported to three decimal places in nine tables in the APPENDIX, one for each of the nine illustrative values of the sample size n.

MATLAB 7.7.0 (R2008b) was used to program the calculations in this illustrative simulation study.
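The structure of such a simulation can be sketched as follows. This is a hedged Python illustration rather than the authors' MATLAB program: the function name `relative_efficiencies`, the seed, and the reduced replication count are assumptions, the sample variance is assumed to use the divisor n - 1, and the efficiency is computed as 100 · MSE($\bar{x}$) / MSE(SE(I)), so that values above 100 indicate an improvement over the sample mean, which is the reading consistent with the appendix tables. Because $\bar{x}$ is unbiased, its simulated MSE stands in for V(T).

```python
import random
import statistics


def relative_efficiencies(theta, sigma, n, max_iter, reps, seed=0):
    """Monte Carlo estimate of Reff for SE(0..max_iter), in percent, relative to the sample mean."""
    rng = random.Random(seed)
    sq_err_mean = 0.0                      # accumulated squared error of the sample mean
    sq_err_se = [0.0] * (max_iter + 1)     # accumulated squared errors of SE(0), ..., SE(max_iter)
    for _ in range(reps):
        sample = [rng.gauss(theta, sigma) for _ in range(n)]
        xbar = statistics.fmean(sample)
        s2 = statistics.variance(sample)   # assumption: divisor n - 1
        v2 = s2 / (n * xbar ** 2)          # sample counterpart of V^2, as in (1.3)
        sq_err_mean += (xbar - theta) ** 2
        est = xbar
        for i in range(max_iter + 1):
            est /= 1.0 + v2                # SE(i), by the recursion (2.3)
            sq_err_se[i] += (est - theta) ** 2
    # The replication count cancels in the ratio of mean squared errors.
    return [100.0 * sq_err_mean / e for e in sq_err_se]


# Far fewer replications than the paper's 51,000, to keep the sketch quick.
reff = relative_efficiencies(theta=0.1, sigma=1.0, n=3, max_iter=7, reps=5000)
print([round(r, 1) for r in reff])
```

All estimators are evaluated on the same simulated samples, which reduces the Monte Carlo noise in comparisons between successive iterations. With a smaller replication count the figures will only roughly track those reported in the appendix.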

4 Conclusions

As expected, the relative efficiencies of the proposed iteratively more efficient estimators of the normal population mean improve progressively as the iteration number I in the estimator SE(I) increases. Although we have limited the illustration to I = 7, the results clearly indicate that progressively more efficient estimators could be generated by iterating further, for as long as desired. This achievement was the motivating aim of the paper.

It should be remarked that the stage up to which the proposed algorithm keeps generating more efficient estimators depends on the size of the coefficient of variation V [i.e., the c.v.($\bar{x}$)], which is available through its sample counterpart v. This value of v may be used, together with $\bar{x}$ (as an estimate of θ) and $S^{2}$ (as an estimate of σ²), to seek guidance from the empirical simulation study outlined in this paper on the number of iterations up to which the estimator is gainfully improvable.


References

Arnholt, A. T. & Hebert, J. E. (1995), “Estimating the Mean with known Coefficient of Variation”, The American Statistician, 49, 367-369.

Gleser, L. J. & Healy, J. D. (1976), “Estimating the Mean of a Normal Distribution with Known Coefficient of variation”, Journal of the American Statistical Association, 71, 977-981.

Khan, R. A. (1968), “A Note on Estimating the Mean of a Normal Distribution with Known Coefficient of Variation”, Journal of the American Statistical Association, 63, 1039-1041.

Searls, Donald T. (1964), “The Utilization of a known Coefficient of Variation in the Estimation Procedure”, Journal of the American Statistical Association, 59, 1225-1226.


APPENDIX.

Table A.1. Relative efficiencies (%) of SE(I), the estimated Searls estimator at iteration I = 0 (1) 7, for sample size n = 3.

SE(I)    σ=1.000   σ=1.500   σ=2.000   σ=2.500   σ=3.000   σ=3.500   σ=4.000   σ=4.500   σ=5.000
SE(0)    174.814   176.043   176.649   178.684   176.892   178.058   177.978   177.259   178.146
SE(1)    240.671   244.337   246.216   251.177   246.931   249.547   249.782   247.904   249.766
SE(2)    302.142   309.264   312.985   321.285   314.462   318.651   319.427   316.141   318.972
SE(3)    360.532   371.962   378.032   390.014   380.616   386.543   387.946   383.062   386.898
SE(4)    416.431   432.914   441.775   457.797   445.826   453.708   455.752   449.113   454.041
SE(5)    470.175   492.366   504.421   524.853   510.292   520.380   523.045   514.529   520.666
SE(6)    521.981   550.473   566.097   591.301   574.120   586.683   589.937   579.447   586.935
SE(7)    572.005   607.339   626.885   657.208   637.379   652.689   656.499   643.960   652.952

Table A.2. Relative efficiencies (%) of SE(I), the estimated Searls estimator at iteration I = 0 (1) 7, for sample size n = 5.

SE(I)    σ=1.000   σ=1.500   σ=2.000   σ=2.500   σ=3.000   σ=3.500   σ=4.000   σ=4.500   σ=5.000
SE(0)    185.661   188.693   191.462   192.034   191.305   193.054   191.513   191.440   192.398
SE(1)    272.820   282.922   290.547   292.612   290.787   295.610   291.362   292.126   294.582
SE(2)    362.504   384.548   399.612   403.820   401.177   409.940   402.253   404.893   409.163
SE(3)    453.465   492.899   518.466   525.490   522.803   536.044   524.582   530.174   536.454
SE(4)    544.212   606.847   646.333   657.095   655.454   673.513   658.286   667.948   676.404
SE(5)    633.395   725.175   782.217   798.005   798.710   821.811   803.138   818.038   828.859
SE(6)    719.909   846.678   925.019   947.536   952.047   980.366   958.831   980.183   993.622
SE(7)    802.923   970.224  1073.600  1104.965  1114.882  1148.614  1125.016  1154.069  1170.473

