A Comparison of Parametric Methods Based on Item Response Theory in Determining Differential Item and Test Functioning

This study aims to compare parametric methods based on Item Response Theory for detecting differential item and test functioning. To this end, it analyzes differential item and test functioning using the comparison method for item parameters, the comparison method based on the likelihood ratio test, and the differential functioning of items and tests (DFIT) framework, which operates at both the item and test levels, and then compares the three methods on the results obtained. The study was conducted on data from the Progress in International Reading Literacy Study 2001 (PIRLS 2001) tests. The analyses indicated that the differential item and test functioning results obtained with these methods differed from one another.
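
Because the abstract only names these procedures, a brief illustration may be useful. The Python sketch below is not the implementation used in the study; it is a minimal sketch, under the simplifying assumption of dichotomous 2PL items whose focal-group parameter estimates have already been placed on the reference-group metric, of the likelihood-ratio decision rule and of Raju-style item-level (NCDIF) and test-level (DTF) indices. The function names and all numeric values are hypothetical.

```python
import numpy as np
from scipy.stats import chi2

def p_2pl(theta, a, b):
    """2PL item characteristic curves for a vector of abilities (D = 1.7 scaling)."""
    return 1.0 / (1.0 + np.exp(-1.7 * a * (theta[:, None] - b)))

def lr_dif_test(loglik_constrained, loglik_free, df):
    """Likelihood-ratio DIF test: G2 = -2(lnL_constrained - lnL_free), referred to
    a chi-square distribution with df equal to the number of freed item parameters."""
    g2 = -2.0 * (loglik_constrained - loglik_free)
    return g2, chi2.sf(g2, df)

def dfit_indices(theta_focal, a_ref, b_ref, a_foc, b_foc):
    """DFIT-style indices: NCDIF_i = E_F[(P_iF - P_iR)^2] per item and
    DTF = E_F[(T_F - T_R)^2] for the whole test, with the expectation taken
    over the focal group's ability distribution (approximated here by a sample)."""
    d = p_2pl(theta_focal, a_foc, b_foc) - p_2pl(theta_focal, a_ref, b_ref)
    ncdif = (d ** 2).mean(axis=0)          # one value per item
    dtf = (d.sum(axis=1) ** 2).mean()      # single test-level value
    return ncdif, dtf

# Hypothetical example: three items, 5,000 focal-group abilities
rng = np.random.default_rng(0)
theta_f = rng.normal(size=5000)
a_ref, b_ref = np.array([1.0, 0.8, 1.2]), np.array([-0.5, 0.0, 0.7])
a_foc, b_foc = np.array([1.0, 0.8, 1.2]), np.array([-0.5, 0.3, 0.7])  # item 2 harder for the focal group

ncdif, dtf = dfit_indices(theta_f, a_ref, b_ref, a_foc, b_foc)
g2, p = lr_dif_test(loglik_constrained=-10234.7, loglik_free=-10228.1, df=2)
print("NCDIF per item:", ncdif.round(4), " DTF:", round(dtf, 4))
print(f"Likelihood-ratio test: G2 = {g2:.2f}, p = {p:.4f}")
```

In practice, the calibration, metric linking, and DIF/DTF computations are carried out with dedicated IRT software such as PARSCALE, MULTILOG, EQUATE99, and DFITP6, all of which appear in the reference list below.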

___

  • Adams, R. J., & Rowe, K. J. (1988). Item bias. In J. P. Keeves (Ed.), Educational research, methodology, and measurement: An international handbook. Oxford: Pergamon Press.
  • Bertrand, R., & Boiteau, N. (2003). Comparing the Stability of IRT-Based and Non IRT-Based DIF Methods in Different Cultural Context Using TIMSS Data. ERIC Report-Research (143). EDRS. ED 476 924. 20p.
  • Crane, P. K., Gibbons, L. E., Narasimhalu, K., Lai, J. S., & Cella, D. (2007). Rapid detection of differential item functioning in assessments of health-related quality of life: The Functional Assessment of Cancer Therapy. Quality of Life Research, 16, 101–114.
  • Crocker, L., & Algina, J. (1986). Introduction to Classical and Modern Test Theory. Orlando: Harcourt Brace Jovanovich.
  • Devine, P. J., & Raju, N. S. (1982). Extent of Overlap Among Four Item Bias Methods. Educational and Psychological Measurement, 42, 1049-1066.
  • Gonzalez, E. J., & Kennedy, A. M. (2003). PIRLS 2001 User Guide for the International Database. International Study Center, Boston College (IEA).
  • Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of Item Response Theory. London: Sage Publications.
  • Holland, P. W., & Wainer, H. (1993). Differential Item Functioning. Hillsdale, NJ: Lawrence Erlbaum.
  • IEA (International Association for the Evaluation of Educational Achievement). (n.d.). TIMSS 1999 Publications. Retrieved November 15, 2001, from http://isc.bc.edu/timss1999i/database.html
  • Kim, S. H., & Cohen, A. S. (1995). A Comparison of Lord’s Chi-Square, Raju’s Area Measures, and the Likelihood Ratio Test on Detection of Differential Item Functioning. Applied Measurement in Education, 8(4), 291-312.
  • Kim, S. H., Cohen, A. S., DiStefano, C. A., & Kim, S. (1998). An Investigation of the Likelihood Ratio Test for Detection of Differential Item Functioning Under the Partial Credit Model. ERIC Reports-Evaluative. EDRS. ED 442 837. 23 p.
  • Korkmaz, M. (2005). Madde Cevap Kuramı’na Dayalı Olarak Çok Kategorili Maddelerde Madde ve Test Yanlılığının (İşlevsel Farklılığın) İncelenmesi [An investigation of item and test bias (differential functioning) in polytomous items based on Item Response Theory]. Unpublished PhD thesis, Ege University, Department of Psychology.
  • Lord, F. M. (1980). Applications of Item Response Theory to Practical Testing Problems. Hillsdale, NJ: Lawrence Erlbaum.
  • Masters, G. N. (1982). A Rasch Model for Partial Credit Scoring. Psychometrika, 47, 149-174.
  • McCarty, F. A., Oshima, T. C., & Raju, N. S. (2002). Identifying possible sources of differential functioning using differential bundle functioning with polytomous scored data. Paper presented at the annual meeting of the American Educational Research Association, New Orleans.
  • McCarty, F. A., Oshima, T. C., & Raju, N. S. (2007). Identifying Possible Sources of Differential Functioning Using Differential Bundle Functioning With Polytomously Scored Data. Applied Measurement in Education, 20(2), 205–225.
  • Meade, A. W., & Lautenschlager, G. J. (2004). A Comparison of Item Response Theory and Confirmatory Factor Analytic Methodologies for Establishing Measurement Equivalence/Invariance. Organizational Research Methods, 7(4), 361-388.
  • Mellenbergh, G. J. (1983). Conditional Item Bias Methods. In S. H. Irvine & J. W. Berry (Eds.), Human Assessment and Cultural Factors (pp. 293-302). New York: Plenum Press.
  • Mullis, I. V. S., Martin, M. O., Gonzalez, E. J., & Kennedy, A. M. (2003). PIRLS 2001 International Report. International Study Center, Boston College (IEA).
  • Muraki, E. (1992). A Generalized Partial Credit Model: Applications of an EM Algorithm. Applied Psychological Measurement, 16, 159-176.
  • Muraki, E., & Bock, R. D. (1996). PARSCALE (V4.1). Parameter Scaling of Rating Data. Chicago, IL: Scientific Software, Inc.
  • Öğretmen, T. (2006). Uluslararası Okuma Becerilerinde Gelişim Projesi (PIRLS) 2001 Testi’nin psikometrik özelliklerinin incelenmesi: Türkiye-Amerika Birleşik Devletleri örneği [An investigation of the psychometric properties of the Progress in International Reading Literacy Study (PIRLS) 2001 test: The case of Turkey and the United States]. Unpublished PhD thesis, Hacettepe University, Department of Educational Sciences.
  • Raju, N. S., van der Linden, W. J., & Fleer, P. F. (1995). IRT-based internal measures of differential functioning of items and tests. Applied Psychological Measurement, 19(4), 353-368.
  • Raju, N. S. (2004). DFITP6. A FORTRAN program for calculating DIF/DTF [Computer Software]. Chicago: Illinois Institute of Technology.
  • Reise, S. P., Smith, L., & Furr, R. M. (2001). Invariance on the NEO PI-R Neuroticism Scale. Multivariate Behavioral Research, 36(1), 83-110.
  • Lim, R. G., & Drasgow, F. (1990). Evaluation of Two Methods for Estimating Item Response Theory Parameters When Assessing Differential Item Functioning. Journal of Applied Psychology, 75(2), 164-174.
  • Rudner, L., Getson, P. R., & Knight, D. L. (1980). Biased Item Detection Techniques. Journal of Educational Statistics, 5, 213-233.
  • Stark, S. (1999). EQUATE99: Computer program for equating two metrics in item response theory. University of Illinois IRT Laboratory.
  • Smith, L. L. (2002). On the Usefulness of Item Bias Analysis to Personality Psychology. Personality and Social Psychology Bulletin, 28(6), 754-763.
  • Schrum, C. L., & Salekin, R. T. (2006). Psychopathy in Adolescent Female Offenders: An Item Response Theory Analysis of the Psychopathy Checklist: Youth Version. Behavioral Sciences and the Law, 24, 39–63.
  • Teresi, J. A., Kleinman, M., & Ocepek-Welikson, K. (2000). Modern Psychometric Methods for Detection of Differential Item Functioning: Application to Cognitive Assessment Measures. Statistics in Medicine, 19, 1651-1683.
  • Thissen, D. (1992). MULTILOG: Multiple Category Item Analysis and Test Scoring Using Item Response Theory (V7.03). Chicago: Scientific Software International, Inc.
  • Thissen, D., Steinberg, L., & Wainer, H. (1993). Detection of Differential Item Functioning Using the Parameters of Item Response Models. In P. W. Holland & H. Wainer (Eds.), Differential Item Functioning (pp. 67-113). Hillsdale, NJ: Erlbaum.
  • du Toit, M. (Ed.). (2003). IRT from SSI: BILOG-MG, MULTILOG, PARSCALE, TESTFACT. Scientific Software International, Inc.