Comparison of Open-ended and Multiple-choice Questions at Different Cognitive Levels in Foreign Language Assessment

Teaching techniques, materials, and classroom activities in foreign language classes have changed and developed considerably in recent years; however, despite the widespread use of Bloom's taxonomy, the testing methods used to measure students' language skills have changed very little. For a number of reasons (reliability, cost effectiveness, time saving), schools' testing and assessment units rely on the results of multiple-choice tests, which measure surface-level knowledge rather than skills such as critical thinking and synthesis. Open-ended questions, on the other hand, are better suited to measuring analysis-synthesis ability and higher-order cognitive skills, yet they bring drawbacks such as lower reliability and greater demands on time and effort. This study aimed to compare the measurement of foreign language achievement using open-ended and multiple-choice questions, with the participation of 116 students studying at a language school in Eskişehir. Multiple-choice versions of two open-ended exams prepared for grammar and reading courses were developed and used to compare the results. The findings showed a significant difference between the open-ended and multiple-choice tests in terms of item difficulty and item discrimination. Both the grammar and the reading multiple-choice tests were found to be easier than the open-ended tests.

Analysis of Multiple-choice versus Open-ended Questions in Language Tests According to Different Cognitive Domain Levels

Classroom practices, materials, and teaching methods in language classes have changed considerably in recent decades and continue to evolve; however, the techniques commonly used to test students' foreign language skills have changed little despite the growing awareness of Bloom's taxonomy. Testing units at schools rely mostly on multiple-choice questions (MCQs) because such questions are reliable, cost-effective, and time-saving; yet they measure only surface knowledge of a particular skill, while skills such as critical thinking and synthesis cannot be evaluated with MCQs. Open-ended question (OEQ) tests, on the other hand, call for analysis, synthesis, and higher-order cognitive processing, but they bring drawbacks such as less reliable results and more time and effort in scoring. This study aims to compare language learners' performances on MCQ and OEQ tests administered to 116 students studying at a language preparatory school in Eskisehir. Four separate tests containing grammar and reading questions in both formats were prepared and administered to the same students in turn. The results showed a significant difference between OEQ and MCQ tests in terms of item difficulty and item discrimination. In both grammar and reading assessment, the MCQ tests were found to be easier than the OEQ tests.
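The comparison rests on two classical item statistics: item difficulty (the proportion of examinees answering an item correctly) and the upper-lower item discrimination index. The sketch below shows how such indices are typically computed; the response matrix, item count, and the 27% upper/lower grouping are illustrative assumptions, not the authors' actual data or procedure.

```python
# Illustrative sketch (not the authors' code): classical item analysis on
# dichotomously scored (0/1) responses, assuming difficulty p = proportion
# correct and discrimination D = p(upper group) - p(lower group).
import numpy as np

def item_difficulty(scores: np.ndarray) -> np.ndarray:
    """Proportion of examinees answering each item correctly (rows = students, cols = items)."""
    return scores.mean(axis=0)

def item_discrimination(scores: np.ndarray, group_ratio: float = 0.27) -> np.ndarray:
    """Upper-lower discrimination index using the top and bottom group_ratio by total score."""
    totals = scores.sum(axis=1)
    order = np.argsort(totals)
    n_group = max(1, int(round(group_ratio * scores.shape[0])))
    lower = scores[order[:n_group]]
    upper = scores[order[-n_group:]]
    return upper.mean(axis=0) - lower.mean(axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical 0/1 response matrix for 116 students and 20 items.
    responses = (rng.random((116, 20)) < 0.6).astype(int)
    print("difficulty:", np.round(item_difficulty(responses), 2))
    print("discrimination:", np.round(item_discrimination(responses), 2))
```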
