Estimation of Finite Population Total in Presence of Missing Values in Two-Phase Sampling

Kemei Anderson Kimutai; Christopher Ouma Onyango; Mike Wafula

doi:10.11648/j.sjams.20210905.12

Peer-Reviewed

Published in Science Journal of Applied Mathematics and Statistics (Volume 9, Issue 5)

Received: 16 September 2021; Accepted: 9 November 2021; Published: 17 November 2021

Abstract

Missing data is a persistent problem in many surveys. Partial deletion and single imputation, among other methods, have been proposed to address it. However, they are associated with problems such as discarding usable data and inaccurately reproducing known population parameters and standard errors. Ratio, regression and stochastic imputation assume that a variable with complete cases is available as a predictor of the missing values in the other variable(s), and that the relationship between the dependent and independent variable(s) is linear. This might not always be the case. To overcome these problems associated with stochastic and regression estimation, this research employed two-phase sampling and nonparametric model-based estimation. The estimator of the population total in two-phase sampling was modified. The variance of the estimator developed by Hidiroglou, Haziza and Rao was used to compare the proposed nonparametric model-based imputation with mean, regression and stochastic imputation in reproducing a known population total and its standard error. The data were simulated and analyzed using R statistical software. The empirical study revealed that the nonparametric model-based imputation method is better at reproducing both the known population total and the standard error.
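
The contrast the abstract draws between mean imputation and nonparametric imputation can be illustrated with a small sketch. It is not the authors' estimator: it assumes a Nadaraya-Watson kernel smoother with a Gaussian kernel as the nonparametric model, a simple expansion estimator (population size times sample mean) in place of the modified two-phase estimator, and a made-up sample in which y = x^2 and nonresponse is concentrated at large x. The names kernel_impute, mean_impute, expansion_total, POPULATION_SIZE and the bandwidth 0.2 are all hypothetical choices made for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_predict(x0, obs_x, obs_y, bandwidth):
    """Nadaraya-Watson estimate of E[y | x = x0] from respondent pairs."""
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in obs_x]
    total_weight = sum(weights)
    if total_weight == 0.0:
        # All weights underflowed: fall back to the respondent mean.
        return sum(obs_y) / len(obs_y)
    return sum(w * y for w, y in zip(weights, obs_y)) / total_weight


def kernel_impute(xs, ys, bandwidth):
    """Replace each missing y (None) with its kernel-smoothed prediction."""
    obs_x = [x for x, y in zip(xs, ys) if y is not None]
    obs_y = [y for y in ys if y is not None]
    return [
        y if y is not None else nw_predict(x, obs_x, obs_y, bandwidth)
        for x, y in zip(xs, ys)
    ]


def mean_impute(ys):
    """Replace each missing y (None) with the respondent mean."""
    observed = [y for y in ys if y is not None]
    respondent_mean = sum(observed) / len(observed)
    return [y if y is not None else respondent_mean for y in ys]


def expansion_total(ys, population_size):
    """Simple expansion estimator: population size times the sample mean."""
    return population_size * sum(ys) / len(ys)


# Hypothetical sample: auxiliary x is fully observed, y = x^2 is nonlinear in x.
POPULATION_SIZE = 400  # assumed value, for illustration only
xs = [i / 10 for i in range(1, 41)]
full_ys = [x * x for x in xs]

# Assumed nonresponse pattern: every second unit with x > 2.0 has y missing.
ys = [
    None if (i >= 21 and i % 2 == 0) else y
    for i, y in enumerate(full_ys, start=1)
]

# Benchmark: the total the same estimator gives with no missing values.
full_total = expansion_total(full_ys, POPULATION_SIZE)
mean_total = expansion_total(mean_impute(ys), POPULATION_SIZE)
kernel_total = expansion_total(kernel_impute(xs, ys, bandwidth=0.2), POPULATION_SIZE)

print("complete-sample total:  ", round(full_total, 1))
print("mean-imputed total:     ", round(mean_total, 1))
print("kernel-imputed total:   ", round(kernel_total, 1))
```

In this constructed example the kernel-imputed total lands much closer to the complete-sample total than the mean-imputed one, because the missing values sit where y is largest and the respondent mean cannot follow the curve. The sketch says nothing about standard errors, which the study assesses separately.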

Published in	Science Journal of Applied Mathematics and Statistics (Volume 9, Issue 5)
DOI	10.11648/j.sjams.20210905.12
Page(s)	126-132
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2021. Published by Science Publishing Group

Keywords

Finite Population Total, Missing Values, Two-phase Sampling

References

[1]	Cali C., Rachel M. K., Richard F. and Christopher V. H. (2019). Dealing with Missing Data: A Comparative Exploration of Approaches Using the Integrated City Sustainability Database. Urban Affairs Review, Vol. 55 (2), 591–615.
[2]	Bii N. K., Onyango C. O. and Odhiambo J. (2020). Estimation of a Finite Population Mean under Random Nonresponse Using Kernel Weights. Journal of Probability and Statistics, vol. 2020, 1-9.
[3]	Yiran D. and Chao-Ying J. P. (2013). Principled missing data methods for researchers. Springer Plus 2 (1), 222-240.
[4]	Adnan F. A., Jamaludin K. R., Muhamad W. Z. and Miskon S. (2021). Review of Current Publications Trend on Missing Data Imputation Over Three Decades: Direction and Future Research. https://doi.org/10.21203/rs.3.rs-996596/v1
[5]	Howell, D. (2012). Treatment of Missing Data-Part 1. www.uvm.edu/dhowell/StatPages/More_Stuff/.../Missing.html
[6]	Bii N. K., Onyango C. O. and Odhiambo J. (2020). Estimating a Finite Population Mean Using Transformed Data in Presence of Random Nonresponse. International Journal of Mathematics and Mathematical Sciences 2020(4), 1-7.
[7]	Dorfman, R. (1992). Nonparametric Regression for Estimating Totals in Finite Populations. Proceedings of the Section on Survey Research Methods, American Statistical Association, 622–625.
[8]	Enders C. K. (2010). Applied Missing Data Analysis. New York: Guilford Press.
[9]	Brady T. W and Roderick J. A. (2013). Non-response adjustment of survey estimates based on auxiliary variables subject to error. Journal of Royal Statistical Society, Vol. 62 (2), 213–231.
[10]	Särndal, C. E. and Lundstrom, S. (2005). Estimation in Surveys with Nonresponse. New York: John Wiley & Sons.
[11]	Yulei, H. (2010). “Missing Data Analysis using Multiple Imputation: Getting to the Heart of the Matter” American Heart Association, 3, 98-105.
[12]	Saunder, J. A., Morrow, N. H., Spitznagel, E., Dori, P., Enola, K. P. and Pescarino, R. (2006). “Imputing Missing Data: A Comparison of Methods for Social Work Researchers” Social Work Research, 30, 19-32.
[13]	Little, R. J., & Rubin, D. B. (1987). Statistical analysis with missing data. New York: Wiley.
[14]	Chao-Ying, J. P., Harwell, M., Show-Mann, L. and Lee, H. E. (2006). “Advances in Missing Data Methods and Implications for Educational Research.” In S. Sawilowsky (Ed.), Real data analysis. Greenwich, CT: Information Age Publishing Inc.
[15]	Amanda, N. B. and Enders, C. K. (2010). “An introduction to modern missing data analyses.” Journal of School Psychology, 48, 5–37.
[16]	Lehtonen, R. and Pahkinen, E. (2004). Practical Methods for Design and Analysis of Complex Surveys (2nd Edition). New York: John Wiley & Sons Ltd.
[17]	Overton, W. S. (1985). A Sampling Plan for Streams in the National Stream Survey. Technical Report 114, Department of Statistics, Oregon State University, Corvallis, Oregon, 97331.
[18]	Särndal, C. E., Swensson, B., Wretman, J. (1992). Model Assisted Survey Sampling. New York: Springer.
[19]	Nadaraya, E. A. (1964). “On Estimating Regression” Theory of Probability and Its Applications, 9, 141-142.
[20]	Watson, G. S. (1964). “Smooth Regression Analysis” Sankhya, Series A, 26, 359-372.
[21]	Hidiroglou, M. A., Haziza, D. and Rao, J. N. K. (2009). “Comparison of Variance Estimator in Two-phase Sampling: An Empirical Investigation” Pak. J. of Statistics, 27, 477-492.
[22]	Cochran, W. G. (1977). Sampling Techniques (3rd Edition). New York: John Wiley and Sons.
[23]	Dennis, D. W., Mendenhall, R. and Schaeffer, R. L. (2008). Mathematical Statistics with Application (7th Edition). Duxbury: Thomson Brooks/Cole.
[24]	Silverman, B. W. (1986). Density Estimation for Statistics and Data Analysis. New York: Chapman & Hall.

Cite This Article

APA Style

Kemei Anderson Kimutai, Christopher Ouma Onyango, Mike Wafula. (2021). Estimation of Finite Population Total in Presence of Missing Values in Two-Phase Sampling. Science Journal of Applied Mathematics and Statistics, 9(5), 126-132. https://doi.org/10.11648/j.sjams.20210905.12

ACS Style

Kemei Anderson Kimutai; Christopher Ouma Onyango; Mike Wafula. Estimation of Finite Population Total in Presence of Missing Values in Two-Phase Sampling. Sci. J. Appl. Math. Stat. 2021, 9(5), 126-132. doi: 10.11648/j.sjams.20210905.12

AMA Style

Kemei Anderson Kimutai, Christopher Ouma Onyango, Mike Wafula. Estimation of Finite Population Total in Presence of Missing Values in Two-Phase Sampling. Sci J Appl Math Stat. 2021;9(5):126-132. doi: 10.11648/j.sjams.20210905.12

@article{10.11648/j.sjams.20210905.12,
  author = {Kemei Anderson Kimutai and Christopher Ouma Onyango and Mike Wafula},
  title = {Estimation of Finite Population Total in Presence of Missing Values in Two-Phase Sampling},
  journal = {Science Journal of Applied Mathematics and Statistics},
  volume = {9},
  number = {5},
  pages = {126-132},
  doi = {10.11648/j.sjams.20210905.12},
  url = {https://doi.org/10.11648/j.sjams.20210905.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.sjams.20210905.12},
  abstract = {Missing data is a real problem in many surveys. To overcome the problems caused by missing data, partial deletion and single imputation methods among others have been proposed. However, problems such as discarding usable data, inaccuracy in reproducing known population parameters and standard errors are associated with them. In ratio, regression and stochastic imputation, it is assumed that there is a variable with complete cases that can be used as a predictor in estimating missing values in the other variable(s) and the relationship between the dependent and independent variable(s) is linear. This might not always be the case. To overcome these problems accompanied to stochastic and regression estimation, two-phase sampling and nonparametric model-based estimation were employed in this research. Estimator of population total in two-phase sampling was modified. The variance of estimator developed by Hidiroglou, Haziza and Rao was used to compare the performance of the proposed non-parametric model-based imputation in reproducing well known population total and standard errors compared to mean, regression and stochastic methods of imputation. The data was simulated and analyzed using R-statistical Software. The empirical study revealed that non-parametric model-base imputation method is better in reproducing both known population total and standard error.},
  year = {2021}
}

TY  - JOUR
T1  - Estimation of Finite Population Total in Presence of Missing Values in Two-Phase Sampling
AU  - Kemei Anderson Kimutai
AU  - Christopher Ouma Onyango
AU  - Mike Wafula
Y1  - 2021/11/17
PY  - 2021
N1  - https://doi.org/10.11648/j.sjams.20210905.12
DO  - 10.11648/j.sjams.20210905.12
T2  - Science Journal of Applied Mathematics and Statistics
JF  - Science Journal of Applied Mathematics and Statistics
JO  - Science Journal of Applied Mathematics and Statistics
SP  - 126
EP  - 132
PB  - Science Publishing Group
SN  - 2376-9513
UR  - https://doi.org/10.11648/j.sjams.20210905.12
AB  - Missing data is a real problem in many surveys. To overcome the problems caused by missing data, partial deletion and single imputation methods among others have been proposed. However, problems such as discarding usable data, inaccuracy in reproducing known population parameters and standard errors are associated with them. In ratio, regression and stochastic imputation, it is assumed that there is a variable with complete cases that can be used as a predictor in estimating missing values in the other variable(s) and the relationship between the dependent and independent variable(s) is linear. This might not always be the case. To overcome these problems accompanied to stochastic and regression estimation, two-phase sampling and nonparametric model-based estimation were employed in this research. Estimator of population total in two-phase sampling was modified. The variance of estimator developed by Hidiroglou, Haziza and Rao was used to compare the performance of the proposed non-parametric model-based imputation in reproducing well known population total and standard errors compared to mean, regression and stochastic methods of imputation. The data was simulated and analyzed using R-statistical Software. The empirical study revealed that non-parametric model-base imputation method is better in reproducing both known population total and standard error.
VL  - 9
IS  - 5
ER  -

Author Information

Kemei Anderson Kimutai
Department of Mathematics, Kiriri Women’s University of Science and Technology, Nairobi, Kenya

Christopher Ouma Onyango
Department of Mathematics, Statistics & Actuarial Science, Kenyatta University, Nairobi, Kenya

Mike Wafula
Department of Mathematics, Statistics & Actuarial Science, Kenyatta University, Nairobi, Kenya
