COMPARISON OF THE ONE-, TWO-, THREE- AND FOUR-PARAMETER LOGISTIC ITEM RESPONSE THEORY MODELS

In this study, the fit of the 1PLM, 2PLM, 3PLM and 4PLM to the research data was examined, the accuracy of the item and ability parameters estimated separately under these models was compared, and the amounts of information provided by the items and by the subtest as a whole under these models were computed and compared. The data of the Turkish subtest of the 2012 SBS, obtained from the Ministry of National Education (MEB), were used for these comparisons. A randomly selected sample of 1,500 examinees from these data formed the study group. Before the analyses addressing the research questions, the IRT assumptions of unidimensionality and local independence were tested. For the unidimensionality assumption, an exploratory factor analysis based on the tetrachoric correlation matrix was conducted, and the test was found to be unidimensional. Yen's Q3 statistic was used to examine the local independence assumption, which was satisfied for all item pairs under all models. In terms of model-data fit, the 4PLM was found to be the best-fitting model. Item parameters were estimated in R Studio using marginal maximum likelihood (MML) estimation. The standard errors of the item parameter estimates were quite small, indicating that the item parameters were estimated accurately. Ability was estimated for each examinee under all models using maximum likelihood (ML) estimation. The standard errors of the ability estimates were compared using analysis of variance (ANOVA). The comparison showed that estimation under the 4PLM had smaller standard errors than the other three models, so the ability parameter was estimated more accurately under this model. The 4PLM also provided more information for this data set than the other models and, consistent with the purpose of the test, provided the most information at the middle ability levels. All models provided the most information in the ability range from θ = -1 to θ = 0.
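The assumption checks summarized above can be sketched in a few lines of R. The abstract states only that the analyses were run in R Studio; the psych and mirt packages, and the 0/1 response matrix named responses below, are illustrative assumptions rather than the authors' documented code.

  library(psych)   # tetrachoric correlations and exploratory factor analysis
  library(mirt)    # IRT calibration and Q3 residuals

  # 'responses': hypothetical 0/1 item-score matrix (1500 examinees x items)

  # Unidimensionality: EFA on the tetrachoric correlation matrix
  tet <- tetrachoric(responses)
  efa <- fa(r = tet$rho, nfactors = 1, n.obs = nrow(responses), fm = "minres")
  print(efa$loadings)   # a dominant single factor suggests essential unidimensionality

  # Local independence: Yen's Q3 residual correlations under a fitted model
  mod_2pl <- mirt(responses, model = 1, itemtype = "2PL")   # MML (EM) calibration
  q3 <- residuals(mod_2pl, type = "Q3")
  print(q3)             # pairwise |Q3| values below .20 are read as local independence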

COMPARISON OF 1PL, 2PL, 3PL AND 4PL ITEM RESPONSE THEORY MODELS

In this study, model-data fit was examined for the 1PLM, 2PLM, 3PLM and 4PLM, the accuracy of item and ability parameter estimates was compared, and item and test information functions were computed and compared. To compare the models on these issues, data from the Turkish subtest of the 2012 SBS (high school entrance examination) were obtained from the Ministry of National Education. The study group consisted of 1,500 examinees. Before addressing the research questions, the assumptions of IRT were checked. First, the data were checked for unidimensionality with an exploratory factor analysis based on the tetrachoric correlation matrix, and they appeared essentially unidimensional. Then, all pairs of items were checked for local independence using Yen's Q3. None of the pairwise residual correlations under any of the four models exceeded .20 in absolute value, indicating that local dependence was not a problem. The item and ability parameters exhibited the property of invariance. In terms of model-data fit, the 4PLM was found to be the best-fitting model. The items were calibrated under all four models using marginal maximum likelihood (MML) estimation in R Studio. The standard errors for most of the item parameters were reasonably small, indicating that the item parameters were estimated with good accuracy. Abilities (θ) were estimated for each examinee under all four models using maximum likelihood (ML) estimation. The standard errors of the ability estimates were compared with analysis of variance (ANOVA) to assess estimation accuracy. Estimation under the 4PLM had smaller standard errors than under the other three models, so abilities were estimated most precisely with the 4PLM. In addition, the 4PLM provided the most information for this test. All models provided more information in the interval θ = -1 to θ = 0 than at other ability levels.
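A minimal sketch of the calibration, model-comparison, scoring and information steps follows, again assuming the mirt package and the hypothetical response matrix responses; the abstract names only R Studio, MML and ML estimation, so the package, the object names and the use of mirt's Rasch parameterization for the 1PLM are assumptions made for illustration.

  library(mirt)

  # Fit the four competing models by MML (mirt's default EM algorithm);
  # the 1PLM is represented here by the Rasch parameterization (slopes fixed at 1)
  itemtypes <- c("1PLM" = "Rasch", "2PLM" = "2PL", "3PLM" = "3PL", "4PLM" = "4PL")
  models <- lapply(itemtypes, function(it)
    mirt(responses, model = 1, itemtype = it, SE = TRUE))

  # Item parameters and their standard errors for the 4PLM
  coef(models[["4PLM"]], IRTpars = TRUE, simplify = TRUE)  # discrimination, difficulty, asymptotes
  coef(models[["4PLM"]], printSE = TRUE)                   # estimates with standard errors

  # Relative model-data fit: AIC, BIC and a likelihood-ratio test for the nested pair
  anova(models[["3PLM"]], models[["4PLM"]])

  # ML ability estimates and their per-examinee standard errors
  # (ML estimates are undefined for perfect or zero raw scores)
  theta_4pl <- fscores(models[["4PLM"]], method = "ML", full.scores.SE = TRUE)

  # Test information across the ability scale
  grid <- matrix(seq(-4, 4, by = 0.1))
  info <- testinfo(models[["4PLM"]], Theta = grid)
  plot(grid, info, type = "l", xlab = expression(theta), ylab = "Test information")

Under these assumptions, anova() reports information criteria and the likelihood-ratio statistic for the nested models, and the standard-error column returned by fscores() is the per-examinee quantity that the abstract compares across models with ANOVA.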

___

  • Baker, F. B. (2001). The Basics of Item Response Theory. United States of America: ERIC Clearinghouse on Assessment and Evaluation.
  • Baykul, Y. (2010). Eğitimde ve Psikolojide Ölçme: Klasik Test Teorisi ve Uygulaması. Ankara: Pegem.
  • Brown, T. (2015). Confirmatory Factor Analysis for Applied Research. New York and London: Guilford.
  • Chen, W.-H., & Thissen, D. (1997). Local Dependence Indexes for Item Pairs Using Item Response Theory. Journal of Educational and Behavioral Statistics, 22(3), 265–289. https://doi.org/10.3102/10769986022003265
  • Crocker, L., & Algina, J. (1986). Introduction to Classical and Modern Test Theory. California: Thomson Learning.
  • De Ayala, R. J. (2009). The Theory and Practice of Item Response Theory. Methodology in the Social Sciences. New York: Guilford.
  • DeMars, C. (2010). Item Response Theory. New York: Oxford University Press.
  • Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. New Jersey: Lawrence Erlbaum.
  • Fleck, M. P., Poirier-Littre, M. F., Guelfi, J. D., Bourdel, M. C., & Loo, H. (1995). Factorial structure of the 17-item Hamilton Depression Rating Scale. Acta Psychiatrica Scandinavica, 92(3), 168–172. https://doi.org/10.1111/j.1600-0447.1995.tb09562.x
  • Hambleton, R. K., & Jones, R. W. (1993). Comparison of Classical Test Theory and Item Response Theory and Their Applications to Test Development. Educational Measurement: Issues and Practice, 12(3), 38–47. https://doi.org/10.1111/j.1745-3992.1993.tb00543.x
  • Hambleton, R. K., & Swaminathan, H. (1985). Item Response Theory: Principles and Applications. Boston: Kluwer Nijhoff.
  • Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. London: Sage.
  • Harvey, R. J., & Hammer, A. L. (1999). Item Response Theory. Counseling Psychologist, 27(3), 353–374. https://doi.org/10.1177/0011000099273004
  • Houben, R. M. A., Leeuw, M., Vlaeyen, J. W. S., Goubert, L., & Picavet, H. S. J. (2005). Fear of movement/injury in the general population: Factor structure and psychometric properties of an adapted version of the Tampa Scale for Kinesiophobia. Journal of Behavioral Medicine, 28(5), 415–424. https://doi.org/10.1007/s10865-005-9011-x
  • Kim, D., De Ayala, R. J., Ferdous, A. A., & Nering, M. L. (2007). Assessing Relative Performance of Local Item Dependence (LID) Indexes. Paper presented at the annual meeting of the National Council on Measurement in Education, Chicago, IL.
  • Liao, W., Ho, R., & Yen, Y. (2012). The Four-Parameter Logistic Item Response Theory Model as a Robust Method of Estimating Ability Despite Aberrant Responses. Social Behavior and Personality, 40(10), 1679–1694. https://doi.org/10.2224/sbp.2012.40.10.1679
  • Loken, E., & Rulison, K. L. (2010). Estimation of a four-parameter item response theory model. The British Journal of Mathematical and Statistical Psychology, 63(3), 509–525. https://doi.org/10.1348/000711009X474502
  • Lord, F. M. (1974). Estimation of latent ability and item parameters when there are omitted responses. Psychometrika, 39(2), 247–264. Retrieved from http://www.jstor.org/stable/20461894
  • Magis, D. (2013). A Note on the Item Information Function of the Four-Parameter Logistic Model. Applied Psychological Measurement, 37(4), 304–315. https://doi.org/10.1177/0146621613475471
  • Maydeu-Olivares, A., Cai, L., & Hernández, A. (2011). Comparing the Fit of Item Response Theory and Factor Analysis Models. Structural Equation Modeling: A Multidisciplinary Journal, 18(3), 333–356. https://doi.org/10.1080/10705511.2011.581993
  • Reise, S. P., & Waller, N. G. (2003). How many IRT parameters does it take to model psychopathology items? Psychological Methods, 8(2), 164–184. https://doi.org/10.1037/1082-989X.8.2.164
  • Romera, I., Delgado-Cohen, H., Perez, T., Caballero, L., & Gilaberte, I. (2008). Factor analysis of the Zung self-rating depression scale in a large sample of patients with major depressive disorder in primary care. BMC Psychiatry, 8, 4. https://doi.org/10.1186/1471-244X-8-4
  • Rulison, K. L., & Loken, E. (2009). I’ve Fallen and I Can’t Get Up: Can High-Ability Students Recover From Early Mistakes in CAT? Applied Psychological Measurement, 33(2), 83–101. https://doi.org/10.1177/0146621608324023
  • Rupp, A. A. (2003). Item response modeling with BILOG-MG and MULTILOG for Windows. International Journal of Testing, 3(4), 365–384. https://doi.org/10.1207/S15327574IJT0304_5
  • Spearman, C. (1904). The proof and measurement of association between two things. The American Journal of Psychology, 15, 72–101.
  • Spearman, C. (1907). Demonstration of Formulæ for True Measurement of Correlation. The American Journal of Psychology, 18(2), 161–169. https://doi.org/10.2307/1412408
  • Turgut, M. F., & Baykul, Y. (1992). Ölçekleme Teknikleri. Ankara: ÖSYM.
  • Waller, N. G., & Reise, S. P. (2010). Measuring Psychopathology with Nonstandard Item Response Theory Models: Fitting the Four-Parameter Model to the Minnesota Multiphasic Personality Inventory. In S. E. Embretson (Ed.), Measuring psychological constructs: Advances in model-based approaches (pp. 147–173). Washington, DC: American Psychological Association. https://doi.org/10.1037/12074-007
  • Yen, W. M. (1984). Effects of Local Item Dependence on the Fit and Equating Performance of the Three-Parameter Logistic Model. Applied Psychological Measurement, 8(2), 125–145. https://doi.org/10.1177/014662168400800201
  • Yen, Y.-C., Ho, R.-G., Liao, W.-W., Chen, L.-J., & Kuo, C.-C. (2012). An Empirical Evaluation of the Slip Correction in the Four Parameter Logistic Models With Computerized Adaptive Testing. Applied Psychological Measurement, 36(2), 75–87. https://doi.org/10.1177/0146621611432862
  • Yen, Y., Ho, R., Liao, W., & Chen, L. (2012). Reducing the Impact of Inappropriate Items on Reviewable Computerized Adaptive Testing. Educational Technology & Society, 15, 231–243.
  • Zenisky, A. L., Hambleton, R. K., & Sireci, S. G. (2002). Effects of Local Item Dependence on the Validity of IRT Item, Test, and Ability Statistics. Journal of Educational Measurement, 39(4), 1–16. Retrieved from https://eric.ed.gov/?id=ED462426