PROCEEDINGS OF THE STATISTICS RESEARCH GROUP (SRG) NATIONAL CONFERENCE, MATHEMATICS DEPARTMENT, UNIVERSITY OF BENIN, NIGERIA (2014), PP. 70-76.

A New Imputation Method based on Expectation Maximization of the Dataset

Ogbeide, E. M. and Osemwenkhae, J. E.

Abstract We present in this research work a modified Expectation Maximization (EM) approach to multivariate data sets with missing observations. Its error estimates, assessed by comparison with some other imputation approaches, show some improvement. The approach seeks values for the missing data in an attempt to fit a better density to the available data set for statistical inference. It belongs to the iterative methods for resolving missing observations in survey non-response, and is particularly relevant when the missing data can never be recovered due to certain conditions.
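For orientation, the iterative idea behind EM-based imputation can be sketched as follows. This is a minimal illustration of the standard textbook EM for a bivariate normal model, not the modified method proposed in the paper, whose details are not given in the abstract. The function name `em_impute_bivariate`, its parameters, and the assumption that one variable (`x`) is fully observed while the other (`y`) has gaps are all hypothetical choices made for the example.

```python
from typing import Dict, List, Optional, Tuple


def em_impute_bivariate(
    x: List[float],
    y: List[Optional[float]],
    n_iter: int = 500,
    tol: float = 1e-10,
) -> Tuple[List[float], Dict[str, float]]:
    """Standard EM for a bivariate normal: x complete, y missing where None.

    Illustrative only; assumes x is not constant and y has at least one
    observed value.
    """
    n = len(x)
    observed = [v for v in y if v is not None]

    # x is fully observed, so its ML mean and variance never change.
    mu_x = sum(x) / n
    s_xx = sum((v - mu_x) ** 2 for v in x) / n

    # Starting values for the y-parameters come from the observed cases.
    mu_y = sum(observed) / len(observed)
    s_yy = sum((v - mu_y) ** 2 for v in observed) / len(observed)
    s_xy = 0.0

    for _ in range(n_iter):
        # E-step: expected y and y^2 per row, given x and current parameters.
        beta = s_xy / s_xx
        resid_var = s_yy - beta * s_xy
        ey, ey2 = [], []
        for xi, yi in zip(x, y):
            if yi is None:
                m = mu_y + beta * (xi - mu_x)
                ey.append(m)
                ey2.append(m * m + resid_var)
            else:
                ey.append(yi)
                ey2.append(yi * yi)

        # M-step: update parameters from the expected sufficient statistics.
        new_mu_y = sum(ey) / n
        new_s_yy = sum(ey2) / n - new_mu_y ** 2
        new_s_xy = sum(xi * m for xi, m in zip(x, ey)) / n - mu_x * new_mu_y

        change = (
            abs(new_mu_y - mu_y) + abs(new_s_yy - s_yy) + abs(new_s_xy - s_xy)
        )
        mu_y, s_yy, s_xy = new_mu_y, new_s_yy, new_s_xy
        if change < tol:
            break

    # Fill each gap with its conditional mean under the fitted model.
    beta = s_xy / s_xx
    imputed = [
        yi if yi is not None else mu_y + beta * (xi - mu_x)
        for xi, yi in zip(x, y)
    ]
    params = {"mu_x": mu_x, "mu_y": mu_y, "s_xx": s_xx, "s_yy": s_yy, "s_xy": s_xy}
    return imputed, params


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    y = [2.1, 3.9, 6.2, 7.8, None, None]
    filled, estimates = em_impute_bivariate(x, y)
    print(filled)
    print(estimates)
```

In this simple missingness pattern the imputed values settle on the regression line fitted to the complete cases. A general multivariate version would replace the scalar updates with the corresponding mean-vector and covariance-matrix updates.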

