Tuncay ÖĞRETMEN, Nuri DOĞAN

Madde Yanlılığını Belirleme Teknikleri Arasında Bir Karşılaştırma

Problem Durumu: Testlerdeki maddelerin yanlılığını belirlemek için klasik test kuramı ve madde tepki kuramı dahilinde kullanılabilecek çok sayıda teknik vardır. Alanyazında bu tekniklerin karşılaştırılmasıyla ilgili yapılan çalışmaların bazılarında birbiriyle çelişen sonuçlar elde edilirken, bazılarında ise sonuçlar birbirine benzemektedir. Bu çelişkilerin kaynaklarının belirlenebilmesi için yeni çalışmaların yapılması gerekmektedir. Türkiye’de seçme ve yerleştirme amacıyla kullanılan testler üzerinde ise bu tür çalışmalar yok denecek kadar azdır. Bu çalışma söz konusu ihtiyaca az da olsa cevap verebilmek amacıyla gerçekleştirilmiştir.

Araştırmanın Amacı: Araştırmanın amacı madde yanlılıklarını belirlemede kullanılan tekniklerden elde edilen madde yanlılık değerleri arasındaki ilişkileri belirlemektir. Bu amacı gerçekleştirebilmek için “madde güçlük dönüşümü, düzeltilmiş madde güçlük dönüşümü, Mantel-Haenszel, işaretli ve işaretsiz alan indeksleri teknikleriyle elde edilen madde yanlılık değerleri arasında nasıl bir ilişki vardır?” sorusuna yanıt aranmıştır.

Araştırmanın Yöntemi: Teknikleri karşılaştırmak için 2003 yılı Ortaöğretim Kurumları Öğrenci Seçme ve Yerleştirme Sınavının Fen Bilgisi alt testine cevap veren 550000 kişi arasından rasgele seçilmiş, evreni temsil edebildiği belirlenen 3344 kişilik örneklem kullanılmıştır. Yanlılık analizleri cinsiyet bağlamında yürütülmüştür.

Bulgular ve Sonuçlar: Sonuç olarak, beş teknikte de yanlılık gösteren 4 madde bulunmuştur. Bu maddelerin dördü de erkekler lehine işlemiştir. Beş farklı teknikle elde edilen madde yanlılık değerlerinin büyüklük sıraları arasında manidar ilişkilere rastlanmıştır. Söz konusu ilişkiler, madde güçlük dönüşümü ve düzeltilmiş madde güçlük dönüşümü değerleri; madde güçlük dönüşümü ve düzeltilmiş madde güçlük dönüşümü değerleri ile Mantel-Haenszel değerleri; işaretli ve işaretsiz alan indeksleri arasında 0,01 düzeyinde anlamlıdır. İşaretli alan indeksleri ile Mantel-Haenszel değerleri arasındaki ilişki ise 0,05 düzeyinde anlamlı bulunmuştur.

Öneriler: Ulaşılan sonuçlara dayanarak maddelerin yanlılığına karar verirken birden fazla tekniğin kullanılması önerilebilir.

A Comparison Among Item Bias Detection Techniques

Problem Statement: Many techniques are available within classical test theory and item response theory for detecting biased test items. In the literature, some studies comparing these techniques report contradictory results, whereas others report similar ones. New studies are needed to identify the sources of these contradictions. In Turkey, such studies on tests used for selection and placement are almost nonexistent. This study was conducted to help meet that need, at least in part.

Focus of Study: The purpose of the present study was to investigate the relationships among the item bias values obtained from five item bias detection techniques. Accordingly, the study addressed the following question: What are the relationships among the item bias values obtained from the (1) transformed item difficulty, (2) revised transformed item difficulty, (3) Mantel-Haenszel, (4) signed area index, and (5) unsigned area index techniques?

Methods: The techniques were compared using a sample of 3,344 examinees, randomly selected from the 550,000 students who took the Science subtest of the 2003 Student Selection and Placement Examination for Secondary Education Institutions and found to be representative of that population. Item bias analyses were carried out with respect to gender only.

Findings/Results: Significant relationships were found among the rank orders of the item bias values obtained from the five techniques. The relationships between the ∆z and ∆ź values, between these two sets of values and the ∆MH values, and between the signed and unsigned area index values were significant at the 0.01 level. The relationship between the signed area index values and the ∆MH values was significant at the 0.05 level. Depending on the technique, the number of items identified as biased ranged from 10 to 15. Only four items were identified as biased by all five techniques, and all four favored male examinees.

Recommendations: Based on these results, it is recommended that decisions about item bias in real data rest on several techniques rather than on a single one, and that items identified by all of the techniques used be accepted as biased.
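For readers unfamiliar with the Mantel-Haenszel index named in the abstract, the following is a minimal sketch of how a ∆MH value is conventionally computed for a single item from 2×2 tables matched on total score. It is not taken from the study: the function name, the example counts, and the assignment of reference and focal groups are hypothetical, and the -2.35 scaling is the commonly used ETS delta convention.

```python
import math


def mantel_haenszel_delta(strata):
    """Return (alpha_MH, delta_MH) for one item.

    strata: iterable of (a, b, c, d) counts, one tuple per matched
    total-score level, where
      a = reference group, item correct
      b = reference group, item incorrect
      c = focal group, item correct
      d = focal group, item incorrect
    """
    num = 0.0  # sum over levels of a * d / n
    den = 0.0  # sum over levels of b * c / n
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # skip score levels with no examinees
        num += a * d / n
        den += b * c / n
    if num == 0 or den == 0:
        raise ValueError("common odds ratio is undefined for these counts")
    alpha = num / den                # Mantel-Haenszel common odds ratio
    delta = -2.35 * math.log(alpha)  # rescaled to the delta metric
    return alpha, delta


# Hypothetical counts for three score levels (not data from the study).
example_strata = [
    (30, 20, 20, 30),
    (40, 10, 30, 20),
    (45, 5, 40, 10),
]
alpha_mh, delta_mh = mantel_haenszel_delta(example_strata)
print(f"alpha_MH = {alpha_mh:.3f}, delta_MH = {delta_mh:.3f}")
```

Under this convention a ∆MH of zero indicates no difference between matched groups, and a negative value indicates that the item favors the reference group.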

