This study investigates the recovery of person parameters when polytomous-attribute cognitive diagnosis models and multidimensional item response theory (MIRT) models are retrofitted to one another's data. To this end, data were generated for two polytomous-attribute cognitive diagnosis models, the polytomous generalized deterministic inputs, noisy "and" gate (pG-DINA) model and the fully-additive model (fA-M), and for one MIRT model, the compensatory 2PL model. The data were generated with 25 replications under a total of 54 conditions formed by varying the item discrimination index, the proportion of items by structure in the test, the test length, and the correlations between abilities. The findings were obtained by comparing the person parameters estimated by all three models with the person parameters used in data generation. According to the findings, each data type was estimated most accurately by the model it belonged to. In the retrofitting analyses, the fA-M estimated the data of the other two models, and the MIRT model estimated the fA-M data, with an accuracy close to that of the generating model. The pG-DINA showed poor performance in estimating the data of the other two models, as did the MIRT model in estimating the pG-DINA data. Among the conditions used in the study, test length and item discrimination, respectively, were the strongest factors affecting person parameter accuracy. The item structure ratio condition was found to be considerably more influential in the analyses and retrofittings of the MIRT data. The presence of correlations between abilities had a more pronounced, though still limited, effect on person parameter accuracy in the analyses of the MIRT data.

Retrofitting of Polytomous Cognitive Diagnosis and Multidimensional Item Response Theory Models

In this study, person parameter recovery is investigated by retrofitting polytomous-attribute cognitive diagnosis models and multidimensional item response theory (MIRT) models to one another's data. The data are generated using two cognitive diagnosis models, the polytomous generalized deterministic inputs, noisy "and" gate (pG-DINA) model and the fully-additive model (fA-M), and one MIRT model, the compensatory two-parameter logistic model. Twenty-five replications are used for each of the 54 conditions resulting from varying the item discrimination index, the ratio of simple to complex items, the test length, and the correlations between skills. The findings are obtained by comparing the person parameter estimates of all three models with the true parameters used in data generation. According to the findings, the most accurate estimates are obtained when the fitted model corresponds to the generating model. Comparable accuracy is achieved when the fA-M is retrofitted to data from the other two models or when the MIRT model is retrofitted to fA-M data. However, the results are poor when the pG-DINA is retrofitted to data from the other two models or when the MIRT model is retrofitted to pG-DINA data. Among the manipulated conditions, test length and item discrimination have the greatest influence on person parameter estimation accuracy. Variation in the simple to complex item ratio is notably more influential when the MIRT data are analyzed or retrofitted. Although the effect of the correlation between skills on person parameter estimation accuracy is limited, it is more pronounced for the MIRT data.
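For reference, the compensatory two-parameter logistic MIRT model used here has the standard form $P(X_{ij}=1 \mid \boldsymbol{\theta}_i) = \{1 + \exp[-(\mathbf{a}_j^{\top}\boldsymbol{\theta}_i + d_j)]\}^{-1}$, where $\mathbf{a}_j$ is item $j$'s vector of discriminations and $d_j$ its intercept (Reckase, 2009), so a deficit on one dimension can be compensated by a surplus on another.

The following is a minimal sketch of the simulate-and-retrofit workflow in R, using the GDINA (Ma & de la Torre, 2016) and mirt (Chalmers, 2012) packages cited below. It is illustrative only, not the study's actual design: it assumes dichotomous rather than polytomous attributes, omits the fA-M, and uses arbitrary values for the sample size, Q-matrix, and guessing/slip parameters.

```r
library(GDINA)  # simGDINA(), GDINA(), personparm(), extract()
library(mirt)   # mirt.model(), mirt(), fscores()

set.seed(1)
N <- 1000                  # examinees (illustrative value)
Q <- matrix(c(1, 0,        # items 1, 4: simple, measure attribute 1 only
              0, 1,        # items 2, 5: simple, measure attribute 2 only
              1, 1,        # items 3, 6: complex, measure both attributes
              1, 0,
              0, 1,
              1, 1), ncol = 2, byrow = TRUE)
J <- nrow(Q)

## 1) Generate G-DINA data; smaller guessing/slip values correspond to
##    higher item discrimination.
gs  <- matrix(0.1, nrow = J, ncol = 2)
sim <- simGDINA(N, Q, gs.parm = gs, model = "GDINA")
dat      <- extract(sim, "dat")        # simulated responses
true_att <- extract(sim, "attribute")  # generating attribute profiles

## 2) Fit the generating CDM, then retrofit the compensatory 2PL.
fit_cdm <- GDINA(dat, Q, model = "GDINA")
est_att <- personparm(fit_cdm, what = "EAP")   # estimated profiles

spec <- mirt.model("F1 = 1, 3, 4, 6
                    F2 = 2, 3, 5, 6
                    COV = F1*F2")              # factors follow the Q-matrix
fit_mirt  <- mirt(dat, spec, itemtype = "2PL")
est_theta <- fscores(fit_mirt)                 # EAP trait estimates

## 3) Person parameter recovery: attribute classification agreement for
##    the CDM; correlation of retrofitted thetas with true attributes.
mean(as.matrix(est_att) == true_att)   # attribute-wise accuracy
diag(cor(est_theta, true_att))         # theta vs. attribute, per dimension
```

Recovery indices in the study itself would be aggregated over the 25 replications per condition; this sketch shows a single replication of one condition.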

___

  • Ackerman, T. A., Gierl, M. J., & Walker, C. M. (2003). Using multidimensional item response theory to evaluate educational and psychological tests. Educational Measurement: Issues and Practice, 22(3), 37-51.
  • Ardıç, E. Ö. (2020). Bilişsel tanı ve çok boyutlu madde tepki modellerinin sınıflama doğruluğu ve parametrelerinin karşılaştırılması [Comparison of classification accuracy and parameters of cognitive diagnostic and multidimensional item response models]. Unpublished PhD dissertation, Hacettepe University, Ankara.
  • Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1-29.
  • Chalmers, R. P., & Flora, D. B. (2014). Maximum-likelihood estimation of noncompensatory IRT models with the MH-RM algorithm. Applied Psychological Measurement, 38(5), 339-358.
  • Chen, H., & Chen, J. (2016). Retrofitting non-cognitive-diagnostic reading assessment under the generalized DINA model framework. Language Assessment Quarterly, 13(3), 218-230.
  • Chen, J., & de la Torre, J. (2013). A general cognitive diagnosis model for expert-defined polytomous attributes. Applied Psychological Measurement, 37(6), 419-437.
  • Chen, J., & de la Torre, J. (2014). A procedure for diagnostically modeling extant large-scale assessment data: The case of the programme for international student assessment in reading. Psychology, 5(18), 1967-1978.
  • de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76(2), 179-199.
  • de la Torre, J., & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69(3), 333-353.
  • de la Torre, J., & Douglas, J. A. (2008). Model evaluation and multiple strategies in cognitive diagnosis: An analysis of fraction subtraction data. Psychometrika, 73(4), 595-624.
  • de la Torre, J., & Karelitz, T. M. (2009). Impact of diagnosticity on the adequacy of models for cognitive diagnosis under a linear attribute structure: A simulation study. Journal of Educational Measurement, 46(4), 450-469.
  • de la Torre, J., & Lee, Y. S. (2013). Evaluating the Wald test for item‐level comparison of saturated and reduced models in cognitive diagnosis. Journal of Educational Measurement, 50(4), 355-373.
  • de la Torre, J., & Minchen, N. (2014). Cognitively diagnostic assessments and the cognitive diagnosis model framework. Psicología Educativa, 20(2), 89-97.
  • DiBello, L. V., Roussos, L. A., & Stout, W. (2007). Review of cognitively diagnostic assessment and a summary of psychometric models. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics: Vol. 26. Psychometrics. Amsterdam: North-Holland.
  • Hartz, S. M. (2002). A Bayesian framework for the unified model for assessing cognitive abilities: Blending theory with practicality. Unpublished PhD dissertation, University of Illinois at Urbana-Champaign.
  • Huebner, A., & Wang, C. (2011). A note on comparing examinee classification methods for cognitive diagnosis models. Educational and Psychological Measurement, 71(2), 407-419.
  • Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with non-parametric item response theory. Applied Psychological Measurement, 25(3), 258-272.
  • Lee, Y. S., Park, Y. S., & Taylan, D. (2011). A cognitive diagnostic modeling of attribute mastery in Massachusetts, Minnesota, and the US national sample using the TIMSS 2007. International Journal of Testing, 11(2), 144-177.
  • Liu, R., Huggins-Manley, A. C., & Bulut, O. (2018). Retrofitting diagnostic classification models to responses from IRT-based assessment forms. Educational and Psychological Measurement, 78(3), 357-383.
  • Ma, W., & de la Torre, J. (2016). GDINA: The generalized DINA model framework. R package version 0.9.2.
  • McKinley, R. L., & Reckase, M. D. (1982). The use of the general Rasch model with multidimensional item response data (Research Report ONR 82-1). Iowa City, IA: American College Testing.
  • Organisation for Economic Co-operation and Development. (2002). Frascati Kılavuzu [Frascati Manual]. Paris: OECD.
  • Reckase, M. D. (2007). Multidimensional item response theory. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics: Vol. 26. Psychometrics. Amsterdam: North-Holland.
  • Reckase, M. D. (2009). Multidimensional item response theory. New York, NY: Springer.
  • Rupp, A. A., Templin, J., & Henson, R. A. (2010). Diagnostic measurement: Theory, methods, and applications. Guilford Press.
  • Şen, S., & Arıcan, M. (2015). A diagnostic comparison of Turkish and Korean students’ mathematics performances on the TIMSS 2011 assessment. Journal of Measurement and Evaluation in Education and Psychology, 6(2), 238-253.
  • Templin, J., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(3), 287-305.
  • von Davier, M., & Lee, Y. S. (2019). Introduction: From latent classes to cognitive diagnostic models. In M. von Davier & Y. S. Lee (Eds.), Handbook of diagnostic classification models (pp. 1-17). Cham: Springer.
  • Wang, Y. C. (2009). Factor analytic models and cognitive diagnostic models: How comparable are they? A comparison of R-RUM and compensatory MIRT model with respect to cognitive feedback. Unpublished PhD dissertation, The University of North Carolina at Greensboro.
  • Yakar, L., de la Torre, J., & Ma, W. (2017). An empirical comparison of two cognitive diagnosis models for polytomous attributes. Paper presented at the annual meeting of the National Council on Measurement in Education (NCME), San Antonio, TX.