Investigation of the Effects of the Number of Categories on Psychometric Properties According to the Mokken Homogeneity Model

The aim of this study was to examine, using a nonparametric item response theory (NIRT) model, how the number of response categories affects the psychometric properties of tests composed of polytomously scored items. To this end, item responses were simulated for two sample sizes (100 and 500), three sample distribution shapes (normal, positively skewed, and negatively skewed), two test lengths (10 and 30 items), and three numbers of categories (three, five, and seven). The effect of the number of categories on psychometric properties was analyzed with the Mokken Homogeneity Model (MHM), one of the NIRT models. The study was designed as basic research. R Studio 3.4.0 was used to generate and analyze the data, and the MHM analyses were carried out with the mokken package. Scaling under the MHM showed no specific pattern in item fit to the MHM as the number of categories changed. In general, the number of categories had no effect on the reliability estimates in either the short or the long tests. Under the test conditions examined, the tests showed weak fit to the MHM.

Keywords:

polytomous items, number of categories, nonparametric item response theory, Mokken homogeneity model
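The simulation design described in the abstract can be sketched as follows. This is only an illustration and not the author's code: the study itself used R and the mokken package, whereas this sketch uses plain Python. The way the skewed ability distributions are drawn, the thresholded latent-response model used to produce item scores, the threshold spacing, and all function and constant names are assumptions made for the example. The scalability coefficient H is computed as the ratio of summed observed inter-item covariances to the summed maximum covariances attainable given the item marginals.

```python
import itertools
import random

# Design factors named in the abstract.
SAMPLE_SIZES = (100, 500)
ABILITY_SHAPES = ("normal", "right_skewed", "left_skewed")
TEST_LENGTHS = (10, 30)
CATEGORY_COUNTS = (3, 5, 7)


def design_conditions():
    """Return every (sample size, shape, test length, categories) cell."""
    return list(itertools.product(
        SAMPLE_SIZES, ABILITY_SHAPES, TEST_LENGTHS, CATEGORY_COUNTS))


def draw_theta(rng, shape):
    """Draw one latent trait value (assumed construction of the skew)."""
    if shape == "normal":
        return rng.gauss(0.0, 1.0)
    skewed = rng.expovariate(1.0) - 1.0  # mean 0, long right tail
    if shape == "right_skewed":
        return skewed
    if shape == "left_skewed":
        return -skewed
    raise ValueError("unknown shape: " + shape)


def simulate_scores(rng, n_persons, n_items, n_categories, shape):
    """Simulate a persons-by-items matrix of scores 0..n_categories-1.

    Assumption: each score is a noisy copy of theta cut at equally
    spaced thresholds, so higher theta makes higher scores more likely.
    """
    cuts = [-1.5 + 3.0 * (k + 1) / n_categories
            for k in range(n_categories - 1)]
    rows = []
    for _ in range(n_persons):
        theta = draw_theta(rng, shape)
        row = []
        for _ in range(n_items):
            latent = theta + rng.gauss(0.0, 1.0)
            row.append(sum(1 for c in cuts if latent > c))
        rows.append(row)
    return rows


def covariance(x, y):
    """Population covariance of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def scale_h(scores):
    """Scale-level scalability coefficient H for a score matrix."""
    items = list(zip(*scores))  # one tuple of scores per item
    observed = 0.0
    maximum = 0.0
    for a, b in itertools.combinations(items, 2):
        observed += covariance(a, b)
        # Sorting both items gives the largest covariance that their
        # marginal distributions allow.
        maximum += covariance(sorted(a), sorted(b))
    if maximum == 0.0:
        raise ValueError("H is undefined when items have no variance")
    return observed / maximum


if __name__ == "__main__":
    rng = random.Random(2024)
    print(len(design_conditions()))  # 36 cells: 2 * 3 * 2 * 3
    data = simulate_scores(rng, 100, 10, 5, "normal")
    print(round(scale_h(data), 3))
```

Looping `scale_h` over the cells returned by `design_conditions()` would give one H value per condition. Under Mokken's commonly cited rule of thumb, values from 0.3 up to 0.4 indicate a weak scale, 0.4 up to 0.5 a medium scale, and 0.5 or above a strong scale. The values this sketch produces depend entirely on the assumed generating model and say nothing about the study's actual results.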



Journal of Measurement and Evaluation in Education and Psychology (Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi)

ISSN: 1309-6575
Publication frequency: 4 issues per year
Founded: 2010
Publisher: Selahattin GELBAL


Investigation of the Effects of the Number of Categories on Psychometric Properties According to the Mokken Homogeneity Model

Asiye ŞENGÜL AVŞAR
