An Examination of the Classification Accuracy of a Test Using the Angoff, Nedelsky, and Cut-Off Value Determination Methods

This study examines the classification accuracy of the test scores of primary school teaching students, with cut-off scores set by the Angoff and Nedelsky standard-setting methods and by two cut-off value determination methods, ROC analysis and interval estimation. The study compares different standard-setting methods and is a correlational study in the survey model. The data collection instrument was a 30-item test of Turkish and Mathematics questions selected at random from the KPSS examinations administered between 2006 and 2012. For both the Turkish and the Mathematics test, the highest cut-off score was obtained with the Angoff method, and there was a significant difference among the percentages of students deemed successful. The ROC analysis carried out to determine cut-off scores for the Turkish and Mathematics tests showed that the Mathematics test classified successful and unsuccessful students correctly. The cut-off scores determined by interval estimation for the Mathematics and Turkish tests were consistent with those determined by ROC analysis. In conclusion, different cut-off scores were obtained for the Turkish and Mathematics subtests with the Angoff and Nedelsky methods and with the ROC and interval estimation methods.
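For readers unfamiliar with the methods named in the abstract, the following is a minimal Python sketch of how cut-off scores are commonly computed with them. It is not the authors' procedure. The function names and input layouts are hypothetical, and the ROC step assumes an external pass/fail criterion and selects the threshold by Youden's index; the abstract specifies neither.

```python
from statistics import mean

def angoff_cut_score(ratings):
    """ratings[j][i]: judge j's estimated probability that a minimally
    competent examinee answers item i correctly (hypothetical layout)."""
    n_items = len(ratings[0])
    # Average the judges' estimates per item, then sum over items.
    item_means = [mean(judge[i] for judge in ratings) for i in range(n_items)]
    return sum(item_means)

def nedelsky_cut_score(remaining_options):
    """remaining_options[j][i]: number of options (including the key) that
    judge j thinks a minimally competent examinee cannot eliminate."""
    # Each item contributes the chance of guessing among remaining options.
    per_judge = [sum(1 / k for k in judge) for judge in remaining_options]
    return mean(per_judge)

def roc_cut_score(scores, passed):
    """Threshold maximizing Youden's J = sensitivity + specificity - 1.
    `passed` is an assumed external pass/fail criterion."""
    pos = [s for s, p in zip(scores, passed) if p]
    neg = [s for s, p in zip(scores, passed) if not p]
    best_j, best_cut = -1.0, None
    for cut in sorted(set(scores)):
        sensitivity = sum(s >= cut for s in pos) / len(pos)
        specificity = sum(s < cut for s in neg) / len(neg)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut
```

Because the Angoff and Nedelsky scores come from judges' ratings of the items while the ROC threshold comes from examinees' observed scores, the methods can be expected to yield different cut-off scores for the same test.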
