TÜRKİYE'DE UYGULANAN GENİŞ ÖLÇEKLİ TESTLERİN ÇOK BOYUTLULUĞUNUN ANALİZİ

Test yapısını ampirik olarak değerlendiren yöntemler öncelikle testin boyutluluğu olmak üzere bazı yapısal özellikler hakkında bilgi sağlamalıdır. Test boyutluluğu testin yapı geçerliliği, test puanlarının hesaplanması ve raporlanması üzerinde doğrudan bir etkiye sahiptir. Bu çalışmanın amacı Öğrenci Başarılarının Belirlenmesi Sınavının (ÖBBS) dört alt testinin boyutlarını parametrik olmayan çok boyutluluk yöntemlerinden DIMTEST T istatistiği kullanılarak belirlemektir. Araştırma betimsel türde temel bir çalışmadır. Araştırmanın verileri 2002, 2005 ve 2008 yılı ÖBBS'nin Türkçe, matematik, fen ve teknoloji, sosyal bilgiler ve İngilizce testlerine verilen yanıtlardan elde edilmiştir. Bu çalışma kapsamında verinin boyutlu olup olmadığının test edilmesi amaçlandığı için DIMTEST T istatistiği kullanılmıştır. Hipotez testi sonucu 2002, 2005 ve 2008 de uygulanan ÖBBS'nin tüm alt testlerinin çok boyutlu olduğu belirlenmiştir. Bilişsel testlerde yer alan soruları çözmek için gerekli olan beceriler düşünüldüğünde, tek boyutluluğun sağlanamadığı görülmektedir.

THE ANALYSIS OF LARGE-SCALE TESTS ADMINISTERED IN TURKEY IN TERMS OF THEIR MULTIDIMENSIONALITY

Methods that empirically evaluate test structure should provide information about structural characteristics of a test, first and foremost its dimensionality. Test dimensionality has a direct impact on a test's construct validity and on the computation and reporting of test scores. The purpose of this study is to determine the dimensionality of four subtests of the Evaluation of Student Achievement Test (ÖBBS) using the DIMTEST T statistic, a nonparametric method for assessing multidimensionality. The study is descriptive basic research. The data were obtained from the responses to the Turkish, Mathematics, Science and Technology, Social Sciences, and English tests of ÖBBS 2002, 2005, and 2008. Because the aim was to test whether the data are multidimensional, the DIMTEST T statistic was used. The hypothesis tests showed that all the subtests of ÖBBS 2002, 2005, and 2008 were multidimensional. Given the skills required to answer the items in cognitive tests, unidimensionality does not appear to hold.
