The Effects of Model-Based Missing Data Methods on the Guessing Parameter in the Case of Ignorable Missing Data

This study investigates the effects of model-based missing data handling methods on the guessing parameter when the missing data are ignorable. Data conforming to the unidimensional three-parameter logistic (3PL) model of Item Response Theory were generated for sample sizes of 500, 1000, and 3000. Missing values were then created at rates of 2.00%, 5.00%, and 10.00% under the missing completely at random (MCAR) and missing at random (MAR) mechanisms, and were completed using the expectation-maximization (EM) algorithm and multiple imputation. Under MCAR, both methods performed well, depending on the missing data rate. At every sample size, both methods performed best when the missing data rate was 2.00%, and the estimates moved away from the reference value as the rate increased. With a sample size of 3000, however, the estimates stayed closer to the reference value even when the missing data rate was high. Under MAR, both methods performed well on the guessing parameter when the missing data rate was low, but this performance deteriorated considerably as the rate increased. Neither the EM algorithm nor multiple imputation performed effectively on the guessing parameter under the MAR mechanism.
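The data-generation side of the design described above can be sketched in a few lines. This is a minimal illustration, not the study's actual procedure: the generating distributions for the item parameters, the function names, and the MAR rule (missingness on the other items depends on the always-observed first item) are all assumptions made for the example. Under the 3PL model the probability of a correct response is c + (1 - c) / (1 + exp(-a(θ - b))), where c is the guessing parameter. The imputation step (EM or multiple imputation) is not shown.

```python
import math
import random


def simulate_3pl(n_persons, n_items, rng):
    """Simulate 0/1 responses under a unidimensional 3PL model."""
    # Hypothetical generating distributions; the study's values are not stated here.
    a = [rng.uniform(0.8, 2.0) for _ in range(n_items)]    # discrimination
    b = [rng.gauss(0.0, 1.0) for _ in range(n_items)]      # difficulty
    c = [rng.uniform(0.05, 0.25) for _ in range(n_items)]  # guessing
    data = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)  # latent ability
        row = []
        for j in range(n_items):
            p = c[j] + (1.0 - c[j]) / (1.0 + math.exp(-a[j] * (theta - b[j])))
            row.append(1 if rng.random() < p else 0)
        data.append(row)
    return data, {"a": a, "b": b, "c": c}


def make_mcar(data, rate, rng):
    """MCAR: every cell is deleted with the same probability."""
    return [[None if rng.random() < rate else x for x in row] for row in data]


def make_mar(data, rate, rng):
    """MAR (assumed rule): item 0 stays observed; respondents who answered it
    incorrectly are three times as likely to have missing values elsewhere.
    Requires at least two items."""
    n_items = len(data[0])
    share_wrong = sum(1 for row in data if row[0] == 0) / len(data)
    # Scale so the expected overall missing rate (all cells) equals `rate`.
    cell_rate = rate * n_items / (n_items - 1)
    p_right = cell_rate / (1.0 + 2.0 * share_wrong)
    p_wrong = 3.0 * p_right
    out = []
    for row in data:
        p = p_wrong if row[0] == 0 else p_right
        out.append([row[0]] + [None if rng.random() < p else x for x in row[1:]])
    return out


def missing_rate(data):
    """Fraction of cells that are missing (None)."""
    cells = [x for row in data for x in row]
    return sum(1 for x in cells if x is None) / len(cells)


if __name__ == "__main__":
    rng = random.Random(2018)
    for n in (500, 1000, 3000):
        complete, _ = simulate_3pl(n, 20, rng)
        for rate in (0.02, 0.05, 0.10):
            print(n, rate,
                  round(missing_rate(make_mcar(complete, rate, rng)), 3),
                  round(missing_rate(make_mar(complete, rate, rng)), 3))
```

The key contrast is that `make_mcar` ignores the data entirely, while `make_mar` ties the deletion probability to a value that remains observed, which is what keeps the mechanism ignorable.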

___

Duygu KOÇAK, Pegem Eğitim ve Öğretim Dergisi, 8(1), 2018, 155-172
Pegem Eğitim ve Öğretim Dergisi
  • ISSN: 2146-0655
  • Founded: 2011
  • Publisher: Pegem Akademi Yayıncılık Eğitim Danışmanlık Hizmetleri Tic. Ltd. Şti.