Computer-based and paper-based testing: Does the test administration mode influence the reliability and validity of achievement tests?

Author 1 (corresponding author): Öz, Hüseyin. Hacettepe University, Foreign Languages Education, Turkey. Email: hoz@hacettepe.edu.tr

Author 2: Özturan, Tuba. Erzincan University, School of Foreign Languages, Turkey. Email: tbozturangmail.com

___

Akdemir, O., & Oğuz, A. (2008). Computer-based testing: An alternative for the assessment of Turkish undergraduate students. Computers & Education, 51, 1198-1204. doi:10.1016/j.compedu.2007.11.007

Alderson, J. C. (2000). Technology in testing: The present and the future. System, 28, 593-603. https://doi.org/10.1016/S0346-251X(00)00040-3

Al-Amri, S. (2008). Computer-based testing vs. paper-based testing: A comprehensive approach to examining the comparability of testing modes. Essex Graduate Student Papers in Language and Linguistics, 10, 22–44.

American Psychological Association (1986). Guidelines for computer-based tests and interpretations. Washington, DC: Author.

Bennett, R. E. (2003). Online assessment and the comparability of score meaning. Princeton, NJ: Educational Testing Service.

Blerkom, M. L. V. (2009). Measurement and statistics for teachers. New York, NY: Routledge.

Boo, J. (1997). Computerized versus paper-and-pencil assessment of educational development: Score comparability and examinee preferences. Unpublished PhD dissertation, University of Iowa.

Brown, H. D. (2004). Language assessment: Principles and classroom practices. White Plains, NY: Pearson Education.

Brown, H. D., & Abeywickrama, P. (2010). Language assessment: Principles and classroom practices. White Plains, NY: Pearson Education.

Brusilovsky, P., & Miller, P. (1999). Web-based testing for distance education. Webnet 99 World conference on the WWW, Hawaii, USA, 24-30 October 1999.

Bugbee, A. C. (1996). The equivalence of paper-and-pencil and computer-based testing. Journal of Research on Computing in Education, 28(3), 282-299.

Chapelle, C. (1998). Construct definition and validity inquiry in SLA research. In L. F. Bachman & A. D. Cohen (Eds.), Interfaces between second language acquisition and language testing research, 32-70. New York, NY: Cambridge University Press.

Chapelle, C. (1999). Validity in language assessment. Annual Review of Applied Linguistics, 19, 254-272. https://doi.org/10.1017/S0267190599190135

Chapelle, C. (2001). Computer applications in second language acquisition: Foundations for teaching, testing, and research. Cambridge, England: Cambridge University Press.

Chapelle, C., & Douglas, D. (2006). Assessing language through computer technology. Cambridge, England: Cambridge University Press.

Chin, C. H. L. (1990). The effect of computer-based tests on the achievement, anxiety and attitudes of grade 10 science students. (Unpublished master’s thesis). The University of British Columbia, Vancouver.

Choi, I. C., Kim, K. S., & Boo, J. (2003). Comparability of a paper-based language test and a computer-based language test. Language Testing, 20(3), 295-320. doi: 10.1191/0265532203lt258oa

Choi, S. W., & Tinkler, T. (2002). Evaluating comparability of paper and computer based assessment in a K-12 setting. Paper presented at the Annual Meeting of the National Council on Measurement in Education, New Orleans, LA.

Chua, Y. P. (2012). Effects of computer-based testing on test performance and testing motivation. Computers in Human Behavior, 28(5), 1580-1586. doi: 10.1016/j.chb.2012.03.020

Cisar, S. M., Radosav, D., Markoski, B., Pinter, R., & Cisar, P. (2010). New possibilities for assessment through the use of computer based testing. 8th International Symposium on Intelligent Systems and Informatics, Serbia, 10-11 September 2010.

Cohen, A. D. (2001). Second language assessment. In M. Celce-Murcia (Ed.), Teaching English as a second or foreign language (3rd ed., pp. 515-534). Boston, MA: Heinle & Heinle.

Creed, A., Dennis, I., & Newstead, S. (1987). Proof-reading on VDUs. Behaviour and Information Technology, 6(1), 3-13. https://doi.org/10.1080/01449298708901814

Delen, E. (2015). Enhancing a computer-based testing environment with optimum item response time. Eurasia Journal of Mathematics, Science and Technology Education, 11(6), 1457-1472. https://doi.org/10.12973/eurasia.2015.1404a

Dermo, J. (2009). E-assessment and the student learning experience: A survey of student perceptions of e-assessment. British Journal of Educational Technology, 40(2), 203-214. https://doi.org/10.1111/j.1467-8535.2008.00915.x

Dillon, A. (1994). Designing usable electronic text: Ergonomic aspects of human information usage. London: Taylor & Francis.

Dunkel, P. (Ed.) (1991). Computer-assisted language learning and testing: Research issues and practice. New York, NY: Newbury House.

Flaugher, R. (2000). Item banks. In H. Wainer, N. J. Dorans, D. Eignor, R. Flaugher, B. F. Green, R. J. Mislevy, L. Steinberg, & D. Thissen (Eds.), Computerized adaptive testing: A primer, 37-59. Mahwah, NJ: Lawrence Erlbaum Associates Inc.

Folk, V. G., & Smith, R. L. (2002). Models for delivery of CBTs. In C. N. Mills, M. T. Potenza, J. J. Fremer, & W. C. Ward (Eds.), Computer-based testing: Building the foundation for future assessments, 41-66. Mahwah, NJ: Lawrence Erlbaum Associates Inc.

Fulcher, G., & Davidson, F. (2007). Language testing and assessment: An advanced resource book. New York, NY: Routledge.

Guzman, E., & Conejo, R. (2005). Self-assessment in a feasible, adaptive web-based testing system. IEEE Transactions on Education, 48(4), 688-695. doi: 10.1109/TE.2005.854571

Hakim, B. M. (2017). Comparative study on validity of paper-based test and computer-based test in the context of educational and psychological assessment among Arab students. International Journal of English Linguistics, 8(2), 85-91. http://doi.org/10.5539/ijel.v8n2p85

Hensley, K. K. (2015). Examining the effects of paper-based and computer-based modes of assessment of mathematics curriculum-based measurement. Unpublished PhD thesis, University of Iowa, Iowa.

Higgins, J., Russell, M., & Hoffmann, T. (2005). Examining the effect of computer-based passage presentation on reading test performance. Journal of Technology, Learning and Assessment, 3(4), 3-35.

Hosseini, M., Abidin, M. J. Z., & Baghdarnia, M. (2014). Comparability of test results of computer based tests (CBT) and paper and pencil tests (PPT) among English language learners in Iran. Social and Behavioral Sciences, 98, 659-667. doi: 10.1016/j.sbspro.2014.03.465

Hughes, A. (2003). Testing for language teachers. (2nd ed.). Cambridge, England: Cambridge University Press.

Jeong, H. (2014). A comparative study of scores on computer-based tests and paper-based tests. Behaviour and Information Technology, 33(4), 410-422. https://doi.org/10.1080/0144929X.2012.710647

Kearsley, G. (1996). The World Wide Web: Global access to education. Educational Technology Review, 5, 26-30.

Kim, D. H., & Huynh, H. (2007). Comparability of computer and paper-and-pencil versions of algebra and biology assessments. Journal of Technology, Learning, and Assessment, 6(4), 4-30. Retrieved from http://ejournals.bc.edu/ojs/index.php/jtla/article/download/1634/1478.

Laborda, J. G. (2010). Contextual clues in semi-direct interviews for computer assisted language testing. Procedia Social and Behavioral Sciences, 2, 3591-3595. doi:10.4304/jltr.5.5.971-975

Larson-Hall, J. (2010). A guide to doing statistics in second language research using SPSS. Abingdon, Oxon: Routledge.

Lilley, M., Barker, R., & Britton, C. (2004). The development and evaluation of a software prototype for computer-adaptive testing. Computers and Education, 43, 109-123.

Linden, W. J. (2002). On complexity in CBT. In C. N. Mills, M. T. Potenza, J. J. Fremer, & W. C. Ward (Eds.), Computer-based testing: Building the foundation for future assessments, 89-102. Mahwah, NJ: Lawrence Erlbaum Associates Inc.

Linden, W. J., & Glas, G. A. W. (2002). Computer-adaptive testing: Theory and practice. New York, NY: Kluwer Academic Publishers.

Logan, T. (2015). The influence of test mode and visuospatial ability on mathematics assessment performance. Mathematics Education Research Journal, 27, 423-441. doi: 10.1007/s13394-015-0143-1

Mackey, A., & Gass, S. M. (2005). Second language research: Methodology and design. Mahwah, NJ: Lawrence Erlbaum Associates.

Madsen, H. S. (1991). Computer-adaptive testing of listening and reading comprehension. In P. Dunkel (Ed.), Computer-assisted language learning and testing, 237-257. New York, NY: Newbury House.

McGough, J., Mortensen, J., Johnson, J., & Fadali, S. (2001). A web based testing system with dynamic question generation. 31st ASEE/IEEE frontiers in education conference, Reno, 10-13 October 2001.

Muter, P., Latremouille, S. A., Treurniet, W. C., & Beam, P. (1982). Extended reading of continuous text on television screens. Human Factors, 24, 502-508. https://doi.org/10.1177/001872088202400501

Noyes, J. M., & Garland, K. J. (2008). Computer- vs. paper-based tasks: Are they equivalent? Ergonomics, 51(9), 1352-1375. doi: 10.1080/00140130802170387

Paek, P. (2005). Recent trends in comparability studies (Pearson Educational Measurement Research Report 05-05). Retrieved from http://www.pearsonassessments.com/NR/rdonlyres/5FC04F5A-E79D-45FE-8484-07AACAE2DA75/0/TrendsCompStudies_rr0505.pdf.

Parshall, C. G., & Kromrey, J. D. (1993). Computer-based versus paper-and-pencil testing: An analysis of examinee characteristics associated with mode effect. Annual meeting of the American educational research association, Atlanta, GA, April 1993.

Parshall, C. G., Spray, J. A., Kalohn, J. C., & Davey, T. (2002). Practical considerations in computer based testing. New York, NY: Springer-Verlag.

Ravid, R. (2011). Practical statistics for educators (4th ed.). Plymouth, UK: Rowman & Littlefield.

Retnawati, H. (2015). The comparison of accuracy scores on the paper and pencil testing versus computer-based testing. TOJET, 14(4), 135-142.

Roever, C. (2001). Web-based language testing. Language Learning and Technology, 5(5), 84-94.

Russell, M., Goldberg, A., & O’Connor, K. (2003). Computer-based testing and validity: A look back into the future. Assessment in Education: Principles, Policy & Practice, 10(3), 279-293. https://doi.org/10.1080/0969594032000148145

Scheerens, J., Glas, C., & Thomas, S. M. (2005). Educational evaluation, assessment, and monitoring: A systemic approach. Lisse: Swets & Zeitlinger B.V.

Semerci, Ç., & Bektaş, C. (2005). İnternet temelli ölçmelerin geçerliliğini sağlamada yeni yaklaşımlar [New approaches to ensuring the validity of Internet-based measurements]. TOJET, 4(1), 130-134.

Siozos, P., Palaigeorgiou, G., Triantafyllakos, G., & Despotakis, T. (2009). Computer-based testing using “digital ink”: Participatory design of a tablet PC based assessment application for secondary education. Computers & Education, 52, 811-819.

Stevenson, J., & Gross, S. (1991). Use of a computerized adaptive testing model for ESOL/bilingual entry/exit decision making. In P. Dunkel (Ed.), Computer-assisted language learning and testing, 223-235. New York, NY: Newbury House.

Stobart, G. (2012). Validity in formative assessment. In J. Gardner (Ed.), Assessment and learning, 233-242. London: Sage Publications, Inc.

Texas Education Agency. (2008). A review of literature on the comparability of scores obtained from examinees on computer-based and paper-based tests. Retrieved from www.tea.state.tx.us/WorkArea/linkit.aspx?LinkIdentifier=id&ItemID=2147494120&libID=2147494117.

Tsai, T. H., & Shin, C. D. (2012). A score comparability study for the NBDHE: Paper-pencil versus computer versions. Evaluation & the Health Professions, 36(2), 228-239. https://doi.org/10.1177/0163278712445203

Tung, P. (1986). Computerized adaptive testing: Implications for language test developers. In C. W. Stansfield (Ed.), Technology and language testing (pp. 9-11). Washington, DC: TESOL.

Wainer, H., & Eignor, D. (2000). Caveats, pitfalls and unexpected consequences of implementing large-scale computerized testing. In H. Wainer, N. J. Dorans, D. Eignor, R. Flaugher, B. F. Green, R. J. Mislevy, L. Steinberg, & D. Thissen (Eds.), Computerized adaptive testing: A primer, 271-298. Mahwah, NJ: Lawrence Erlbaum Associates Inc.

Wang, H. (2010). Comparability of computerized adaptive and paper-pencil tests. [Online: http://images.pearsonassessments.com/images/tmrs/tmrs_rg/Bulletin_13.pdf, retrieved in August, 2013].

Wang, H., & Shin, C. D. (2009). Computer-based & paper-pencil test comparability studies. Test, Measurement and Research Service Bulletin, 9, 1-6. Retrieved from http://www.pearsonassessments.com/NR/rdonlyres/93727FC9-96D3-4EA5-B807-5153EF17C431/0/Bulletin_9.pdf

Wang, H., & Shin, C. D. (2010). Comparability of computerized adaptive and paper-pencil tests. Test, Measurement and Research Service Bulletin, 13, 1-7. Retrieved from http://www.pearsonassessments.com/NR/rdonlyres/057A4A04-9DCB-4B68-9CB0-3F32DDF396F6/0/Bulletin_13.pdf.

Wang, S., Jiao, H., Young, M. J., Brooks, T., & Olson, J. (2007). A meta-analysis of testing mode effects in grade k-12 mathematics tests. Educational and Psychological Measurement, 67(2), 219-238. https://doi.org/10.1177/0013164406288166

Wang, T., & Kolen, M. J. (2001). Evaluating comparability in computerized adaptive testing: Issues, criteria and an example. Journal of Educational Measurement, 38(1), 19-49. http://dx.doi.org/10.1111/j.1745-3984.2001.tb01115.x

Ward, W. C. (2002). Test models. In C. N. Mills, M. T. Potenza, J. J. Fremer, & W. C. Ward (Eds.), Computer-based testing: Building the foundation for future assessments, 37-40. Mahwah, NJ: Lawrence Erlbaum Associates Inc.

Whiston, S. C. (2009). Principles and applications of assessment in counseling (3rd ed.). CA: Brooks/Cole.

Yagcı, M., Ekiz, H., & Gelbal, S. (2011). Çevrimiçi sınav ortamlarının öğrencilerin akademik başarılarına etkisi [The effect of online exam environments on students' academic achievement]. 5th international computer and instructional technologies symposium, Elazığ, Turkey, 22-24 September 2011.

Yaman, S. O., & Cagıltay, N. E. (2010). Paper-based versus computer-based testing in engineering education. IEEE Educon Education Engineering: The Future of Global Learning Engineering Education, 1631-1637. doi: 10.1109/EDUCON.2010.5492397

Yunxiang, L., Ruixue, G., Lili, R., Wangjie, Quinshui, Q., & Hefei (2010). Advantages and disadvantages of computer-based testing: A case study of service learning. [Online: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=5691870, retrieved in July, 2013]. doi: 10.1109/ICISE.2010.5691870