The Effects of Log Data on Students’ Performance

This study aimed to assess the relationships between response times (RTs), the number of actions taken to solve a given item, and student performance. In addition, the interactions of students’ information and communications technology (ICT) competency and reading literacy with the log data (time and number of actions) were examined to gain additional insight into the relations between student performance and log data. The sample consisted of 2,348 students who participated in the Programme for International Student Assessment (PISA), a triennial international large-scale assessment. For the current study, 18 items from one cluster of the 91st booklet were chosen. To achieve the aim of the study, an explanatory item response modeling (EIRM) framework based on generalized linear mixed modeling (GLMM) was used. The results of this study showed that students who spent more time on items and those who took more actions on items were more likely to answer the items correctly. However, this effect did not vary across items or students. Moreover, only the interaction between reading literacy and the number of actions was found to have a positive effect on students’ overall performance.
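
The modeling approach described above can be illustrated with a minimal sketch: an EIRM specified as a Rasch-type GLMM with crossed random intercepts for persons and items, extended with the log-data covariates and their interactions with ICT competency and reading literacy, fitted with the glmer() function from the lme4 package in R. The data frame and variable names below (pisa_long, resp, person, item, log_time, n_actions, ict, reading) are hypothetical placeholders, and the exact predictor and interaction structure used in the study may differ.

```r
# Sketch (not the study's exact specification): an explanatory item response
# model as a GLMM -- crossed random intercepts for persons and items, with
# log-data covariates and their interactions with ICT and reading literacy.
# All data and variable names below are illustrative placeholders.
library(lme4)

# pisa_long: long-format data with one row per person-item response
#   resp      - item response scored 0/1
#   person    - student identifier (factor)
#   item      - item identifier (factor)
#   log_time  - (log-transformed) response time on the item
#   n_actions - number of actions taken on the item
#   ict, reading - student-level ICT competency and reading literacy scores

eirm_fit <- glmer(
  resp ~ log_time + n_actions + ict + reading +
         ict:log_time + ict:n_actions +
         reading:log_time + reading:n_actions +
         (1 | person) + (1 | item),
  data   = pisa_long,
  family = binomial(link = "logit")
)

summary(eirm_fit)  # fixed effects give log-odds effects of time and actions
```

In this parameterization, the random intercepts play the roles of person ability and item easiness, while the fixed effects capture how response time, number of actions, and their interactions with the student covariates shift the log-odds of a correct response.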
