Comparing Different Test Equating Methods Based on Equity Property

The purpose of this study was to compare three equipercentile equating methods and two item response theory (IRT) equating methods with respect to the equity property of test equating. Social science test scores from Booklets A and C of the 2009 Assessment of Student Achievement (Öğrenci Başarılarının Belirlenmesi Sınavı, ÖBBS) were equated under the equivalent groups design. The study group consisted of 15,173 ninth-grade students who took Booklet A and 14,365 who took Booklet C. The equity property was evaluated using two criteria: first-order equity (FOE; birinci-sıra eşitlik, BSE), which requires that the conditional means of scores on the alternate forms be equal after equating, and second-order equity (SOE; ikinci-sıra eşitlik, İSE), which requires that the conditional standard errors of measurement be equal. The results showed that IRT true score equating best preserved FOE and that IRT observed score equating best preserved SOE.

Farklı Test Eşitleme Yöntemlerinin Eşitlik Özelliği Ölçütüne Göre Karşılaştırılması

The purpose of this study was to compare the performance of three equipercentile and two item response theory (IRT) equating methods with respect to the equity property. Social science test scores from Booklet A and Booklet C of the 2009 Assessment of Student Achievement were equated under the equivalent groups design. The study group consisted of 15,173 ninth-grade students who took Booklet A and 14,365 who took Booklet C. The equity property was evaluated using first-order equity (FOE), which requires equal conditional means on the alternate forms after equating, and second-order equity (SOE), which requires equal conditional standard errors of measurement. The results showed that IRT true score equating best preserved FOE and that IRT observed score equating best preserved SOE.
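
The two criteria can be stated compactly. The notation below is an assumption that follows common usage in the equating literature, not notation taken from this abstract: X is the score on the form being equated, Y is the score on the reference form, eq_Y(X) is the equated score expressed on the scale of Y, and τ is the examinee's true score (or proficiency).

```latex
\begin{align}
  % First-order equity (FOE): equal conditional means after equating
  E\bigl[\, eq_Y(X) \mid \tau \,\bigr] &= E\bigl[\, Y \mid \tau \,\bigr]
    \quad \text{for all } \tau, \\
  % Second-order equity (SOE): equal conditional standard errors of measurement
  \sigma\bigl[\, eq_Y(X) \mid \tau \,\bigr] &= \sigma\bigl[\, Y \mid \tau \,\bigr]
    \quad \text{for all } \tau .
\end{align}
```

Read this way, a method preserves FOE or SOE better when the two sides of the corresponding equation stay closer together across the range of τ.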
