Comparison of Classical Test Theory and Item Response Theory Methods in Vertical Scaling

Vertical scaling is a type of test linking used to determine how much growth students show across adjacent grade or age levels in areas such as mathematics or reading. The purpose of this study is to establish, by means of vertical scaling, the pattern of growth that emerges as grade level increases. The data come from 6th-, 7th- and 8th-grade students who took the İlköğretim Öğrencilerinin Başarılarının Belirlenmesi Sınavı (ÖBBS), a student achievement examination administered across Turkey in 2005. Vertical scaling was carried out on the basis of both Classical Test Theory (CTT) and Item Response Theory (IRT). Under CTT, the Thurstone (1938) scaling method was used; under IRT, ability estimates were scored with the Expected A Posteriori (EAP) method. Means, standard deviations and effect sizes were used as the criteria for evaluating the vertical scaling results. With Thurstone scaling, standard deviations on the Mathematics and Turkish tests increased with grade level. Taken together, the findings suggest that in both the CTT and IRT applications standard deviations increased irrespective of whether the means rose or fell.
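The abstract names means, standard deviations and effect sizes as evaluation criteria but does not give the effect size formula. As a minimal sketch, assuming the definition common in the vertical scaling literature (the difference between adjacent-grade means divided by the square root of the average of the two grade variances), the three criteria could be computed as follows. The function name and the scores are hypothetical illustrations, not ÖBBS data or results.

```python
from math import sqrt
from statistics import mean, stdev, variance


def adjacent_grade_effect_size(lower_scores, upper_scores):
    """Effect size between two adjacent grades on a common scale.

    Assumed definition (not stated in the abstract):
    (mean_upper - mean_lower) / sqrt((var_upper + var_lower) / 2)
    """
    pooled_sd = sqrt((variance(upper_scores) + variance(lower_scores)) / 2)
    return (mean(upper_scores) - mean(lower_scores)) / pooled_sd


# Hypothetical scale scores for two adjacent grades (made up for illustration).
grade6 = [48.0, 50.0, 52.0]
grade7 = [52.0, 55.0, 58.0]

es_6_to_7 = adjacent_grade_effect_size(grade6, grade7)

print("grade 6 mean/SD:", mean(grade6), stdev(grade6))
print("grade 7 mean/SD:", mean(grade7), stdev(grade7))
print("effect size 6 -> 7:", round(es_6_to_7, 3))
```

In this made-up example the standard deviation is larger in the higher grade, which is the kind of pattern the abstract reports for both the CTT and IRT scalings.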

___

Becker, D. F., & Forsyth, R. A. (1992). An empirical investigation of Thurstone and IRT methods of scaling achievement tests. Journal of Educational Measurement, 29, 341–354.
Gulliksen, H. (1950). Theory of mental tests. New York: Wiley.
Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston: Kluwer Nijhoff Publishing.
Jöreskog, K., Sörbom, D., Du Toit, S. H. C., & Du Toit, M. (1999). LISREL 8: New statistical features. Chicago, Illinois: Scientific Software International, Inc.
Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices (2nd ed.). New York: Springer Verlag.
No Child Left Behind Act of 2001. (2002). Pub. L. No. 107-110, 115 Stat. 1425.
Thurstone, L. L. (1925). A method of scaling psychological and educational tests. Journal of Educational Psychology, 16(7), 433–451.
Thurstone, L. L. (1938). Primary mental abilities. Psychometric Monographs, No. 1.
Tong, T. (2005). Comparison of methodologies and results in vertical scaling for educational achievements tests. Unpublished Ph.D. thesis, University of Iowa, Iowa.
Williams, V. S. L., Pommerich, M., & Thissen, D. (1998). A comparison of developmental scales based on Thurstone methods and item response theory. Journal of Educational Measurement, 35, 93–107.
Yen, W. M. (1986). The choice of scale for educational measurement: An IRT perspective. Journal of Educational Measurement, 23, 299–325.
Zimowski, M. F., Muraki, E., Mislevy, R. J., & Bock, R. D. (1996). BILOG-MG: Multiple-group IRT analysis and test maintenance for binary items [Computer software]. Chicago: Scientific Software International.