Investigating the Effect of Item Position on Person and Item Parameters: PISA 2015 Turkey Sample

The position of an item within a test booklet affects the probability of a correct answer. In the literature, this is called the item position effect, and it introduces unwanted variance into item and person parameters. The aim of this study is to investigate the item position effect within the framework of explanatory item response theory. The analyses were carried out on the PISA 2015 Turkey sample, and the item position effect was examined in the reading and mathematics domains. In addition, the effect of item position was investigated across item formats (open response and multiple choice). According to the results, the item position effect decreased the probability of answering an item correctly, and this effect was stronger in reading than in mathematics. Furthermore, in the mathematics domain, open-response items were affected more by item position than multiple-choice items, whereas in the reading domain the two formats were affected similarly. These results show that item position has undesirable effects that should be taken into account.
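The mechanism described above can be sketched in a small simulation. The snippet below assumes a Rasch-type response model with a hypothetical linear position decrement on the logit scale, in the spirit of explanatory IRT models with a position covariate; all parameter values (e.g. the decrement `delta`) and the fixed item-to-position assignment are illustrative, not the study's actual design, which used PISA's rotated booklets.

```python
import numpy as np

# Minimal sketch: Rasch-type model plus an assumed linear position effect.
rng = np.random.default_rng(42)

n_persons, n_items = 2000, 30
theta = rng.normal(0.0, 1.0, size=n_persons)   # person abilities
b = np.zeros(n_items)                          # equal item difficulties, so only position drives the trend
delta = -0.05                                  # hypothetical position effect: logit drop per position step
positions = np.arange(1, n_items + 1)          # item i always administered at position i (no booklet rotation)

# Explanatory IRT-style response model: logit P(correct) = theta_p - b_i + delta * pos_i
logits = theta[:, None] - b[None, :] + delta * positions[None, :]
p_correct = 1.0 / (1.0 + np.exp(-logits))
responses = (rng.random((n_persons, n_items)) < p_correct).astype(int)

# Items later in the booklet are answered correctly less often.
early = responses[:, :10].mean()   # first third of the test
late = responses[:, -10:].mean()   # last third of the test
print(f"P(correct) early: {early:.3f}, late: {late:.3f}")
```

With a negative `delta`, the proportion correct drops toward the end of the booklet even though all items are equally difficult, which is exactly the confound that a position covariate in an explanatory item response model is meant to capture.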

___

  • Albano, A. D. (2013). Multilevel modeling of item position effects. Journal of Educational Measurement, 50(4), 408-426. https://doi.org/10.1111/jedm.12026
  • Albano, A. D., McConnell, S. R., Lease, E. M., & Cai, L. (2020). Contextual interference effects in early assessment: Evaluating the psychometric benefits of item interleaving. Frontiers in Education, 5. https://doi.org/10.3389/feduc.2020.00133
  • Asseburg, R., & Frey, A. (2013). Too hard, too easy, or just right? The relationship between effort or boredom and ability-difficulty fit. Psychological Test and Assessment Modeling, 55(1), 92-104. https://psycnet.apa.org/record/2013-18917-006
  • Bates, D., Maechler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1-48. https://doi.org/10.18637/jss.v067.i01
  • Brennan, R. L. (1992). The context of context effects. Applied Measurement in Education, 5, 225-264. https://doi.org/10.1207/s15324818ame0503_4
  • Breslow, N. E., & Clayton, D. G. (1993). Approximate inference in generalized linear mixed models. Journal of the American Statistical Association, 88(421), 9-25. https://doi.org/10.2307/2290687
  • Bulut, O. (2021). eirm: Explanatory item response modeling for dichotomous and polytomous item responses (R package version 0.3.0) [Computer software]. https://doi.org/10.5281/zenodo.4556285
  • Bulut, O., Guo, Q., & Gierl, M. J. (2017). A structural equation modeling approach for examining position effects in large-scale assessments. Large-scale Assessments in Education, 5(8), 1-20. https://doi.org/10.1186/s40536-017-0042-x
  • Christiansen, A., & Janssen, R. (2020). Item position effects in listening but not in reading in the European Survey of Language Competences. Educational Assessment, Evaluation and Accountability, 33(3), 49-69. https://doi.org/10.1007/s11092-020-09335-7
  • Cook, L. L., & Petersen, N. S. (1987). Problems related to the use of conventional and item response theory equating methods in less than optimal circumstances. Applied Psychological Measurement, 11(1), 225-244. https://doi.org/10.1177/014662168701100302
  • De Boeck, P., & Wilson, M. (2004). Explanatory item response models: A generalized linear and nonlinear approach. Springer.
  • De Boeck, P., Bakker, M., Zwitser, R., Nivard, M., Hofman, A., Tuerlinckx, F., & Partchev, I. (2011). The estimation of item response models with the lmer function from the lme4 package in R. Journal of Statistical Software, 39(12), 1-28. https://doi.org/10.18637/jss.v039.i12
  • Debeer, D., & Janssen, R. (2013). Modeling item-position effects within an IRT framework. Journal of Educational Measurement, 50(2), 164-185. https://doi.org/10.1111/jedm.12009
  • Desjardins, C. D., & Bulut, O. (2018). Handbook of educational measurement and psychometrics using R. CRC Press.
  • Fahrmeir, L., & Tutz, G. (2001). Multivariate statistical modeling based on generalized linear models (2nd ed.). Springer.
  • Frey, A., & Bernhardt, R. (2012). On the importance of using balanced booklet designs in PISA. Psychological Test and Assessment Modeling, 54(4), 397-417. https://www.psychologie-aktuell.com/fileadmin/download/ptam/4-2012_20121224/05_Frey.pdf
  • Frey, A., Hartig, J., & Rupp, A. (2009). An NCME instructional module on booklet designs in large-scale assessments of student achievement: Theory and practice. Educational Measurement: Issues and Practice, 28(3), 39-53. https://doi.org/10.1111/j.1745-3992.2009.00154.x
  • Goff, M., & Ackerman, P. L. (1992). Personality-intelligence relations: Assessment of typical intellectual engagement. Journal of Educational Psychology, 84(4), 537-552. https://doi.org/10.1037/0022-0663.84.4.537
  • Gonzalez, E., & Rutkowski, L. (2010). Principles of multiple matrix booklet designs and parameter recovery in large scale assessments. IERI Monograph Series: Issues and Methodologies in Large-Scale Assessments, 3, 125-156. https://www.ierinstitute.org/fileadmin/Documents/IERI_Monograph/IERI_Monograph_Volume_03_Chapter_6.pdf
  • Guertin, W. H. (1954). The effect of instructions and item order on the arithmetic subtest of the Wechsler-Bellevue. Journal of Genetic Psychology, 85(1), 79-83. https://doi.org/10.1080/00221325.1954.10532863
  • Hambleton, R. K., & Traub, R. E. (1974). The effects of item order on test performance and stress. Journal of Experimental Education, 43(1), 40-46. http://www.jstor.org/stable/20150989
  • Hahne, J. (2008). Analyzing position effects within reasoning items using the LLTM for structurally incomplete data. Psychology Science Quarterly, 50(3), 379-390. https://www.psychologie-aktuell.com/fileadmin/download/PschologyScience/3-2008/05_Hahne.pdf
  • Hartig, J., & Buchholz, J. (2012). A multilevel item response model for item position effects and individual persistence. Psychological Test and Assessment Modeling, 54(4), 418-431. https://www.proquest.com/scholarly-journals/multilevel-item-response-model-positioneffects/docview/1355923397
  • Hecht, M., Weirich, S., Siegle, T., & Frey, A. (2015). Effects of design properties on parameter estimation in large-scale assessments. Educational and Psychological Measurement, 75(6), 1021-1044. https://doi.org/10.1177/0013164415573311
  • Hohensinn, C., Kubinger, K., Reif, M., Schleicher, E., & Khorramdel, L. (2011). Analysing item position effects due to test booklet design within large-scale assessment. Educational Research and Evaluation, 17(6), 497-509. https://doi.org/10.1080/13803611.2011.632668
  • Janssen, R., Schepers, J., & Peres, D. (2004). Models with item and item group predictors. In P. De Boeck & M. Wilson (Eds.), Explanatory item response models (pp. 189-212). Springer. https://doi.org/10.1007/978-1-4757-3990-9_6
  • Kingston, N. M., & Dorans, N. J. (1982). The effect of the position of an item within a test on item responding behavior: An analysis based on item response theory (GRE Board Professional Report GREB No. 79-12bP). Educational Testing Service. https://doi.org/10.1002/j.2333-8504.1982.tb01308.x
  • Kingston, N. M., & Dorans, N. J. (1984). Item location effects and their implications for IRT equating and adaptive testing. Applied Psychological Measurement, 8(2), 147-154. https://doi.org/10.1177/014662168400800202
  • Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices (2nd ed.). Springer.
  • Kolen, M. J., & Harris, D. (1990). Comparison of item pre-equating and random groups equating using IRT and equipercentile methods. Journal of Educational Measurement, 27(1), 27-29. https://doi.org/10.1111/j.1745-3984.1990.tb00732.x
  • Le, L. T. (2007, July). Effects of item positions on their difficulty and discrimination: A study in PISA Science data across test language and countries. Paper presented at the 72nd Annual Meeting of the Psychometric Society, Tokyo. https://research.acer.edu.au/pisa/2/
  • Leary, L. F., & Dorans, N. J. (1985). Implications for altering the context in which test items appear: A historical perspective on an immediate concern. Review of Educational Research, 55(3), 387-413.
  • Lord, F. M. (1980). Applications of item response theory to practical testing problems. Erlbaum.
  • Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Addison-Wesley.
  • MacNicol, K. (1956). Effects of varying order of item difficulty in an unspeeded verbal test (Unpublished manuscript). Educational Testing Service.
  • McCoach, D. B., & Black, A. C. (2008). Evaluation of model fit and adequacy. In A. A. O’Connell & D. B. McCoach (Eds.), Multilevel modeling of educational data (pp. 245-272). Information Age Publishing.
  • McCullagh, P., & Nelder, J. A. (1989). Generalized linear models (2nd ed.). Chapman & Hall.
  • McCulloch, C. E., & Searle, S. R. (2001). Generalized, linear, and mixed models. Wiley.
  • Meyers, J. L., Miller, G. E., & Way, W. D. (2009). Item position and item difficulty change in an IRT-based common item equating design. Applied Measurement in Education, 22(1), 38-60. https://doi.org/10.1080/08957340802558342
  • Mollenkopf, W. G. (1950). An experimental study of the effects on item-analysis data of changing item placement and test time limit. Psychometrika, 15(3), 291-315. https://doi.org/10.1007/BF02289044
  • Nagy, G., Nagengast, B., Frey, A., Becker, M., & Rose, N. (2018). A multilevel study of position effects in PISA achievement tests: student- and school-level predictors in the German tracked school system. Assessment in Education: Principles, Policy & Practice, 26(4), 422-443. https://doi.org/10.1080/0969594X.2018.1449100
  • Okumura, T. (2014). Empirical differences in omission tendency and reading ability in PISA: An application of tree-based item response models. Educational and Psychological Measurement, 74(4), 611-626. https://doi.org/10.1177/0013164413516976
  • Organisation for Economic Co-operation and Development. (2009). PISA 2006 technical report. Organisation for Economic Co-operation and Development. https://www.oecd.org/pisa/data/42025182.pdf
  • Organisation for Economic Co-operation and Development. (2012). PISA 2009 technical report. Organisation for Economic Co-operation and Development. http://dx.doi.org/10.1787/9789264167872-en
  • Organisation for Economic Co-operation and Development. (2014). PISA 2012 technical report. Organisation for Economic Co-operation and Development. https://www.oecd.org/pisa/pisaproducts/PISA-2012-technical-report-final.pdf
  • Organisation for Economic Co-operation and Development. (2017). PISA 2015 technical report. Organisation for Economic Co-operation and Development. https://www.oecd.org/pisa/data/2015-technical-report/
  • R Core Team. (2016). R: A language and environment for statistical computing. R Foundation for Statistical Computing.
  • Raven, J. C., Raven, J., & Court, J. H. (1997). Raven’s progressive matrices and vocabulary scales. J. C. Raven Ltd.
  • Rose, N., von Davier, M., & Xu, X. (2010). Modeling nonignorable missing data with item response theory (IRT) (Report No. RR-10-11). Educational Testing Service.
  • Rose, N., Nagy, G., Nagengast, B., Frey, A., & Becker, M. (2019). Modeling multiple item context effects with generalized linear mixed models. Frontiers in Psychology, 10. https://doi.org/10.3389/fpsyg.2019.00248
  • Sax, G., & Carr, A. (1962). An investigation of response sets on altered parallel forms. Educational and Psychological Measurement, 22(2), 371-376. https://doi.org/10.1177/001316446202200210
  • Schweizer, K., Schreiner, M., & Gold, A. (2009). The confirmatory investigation of APM items with loadings as a function of the position and easiness of items: A two-dimensional model of APM. Psychology Science Quarterly, 51(1), 47-64. https://psycnet.apa.org/record/2009-06359-003
  • Smouse, A. D., & Munz, D. C. (1968). The effects of anxiety and item difficulty sequence on achievement testing scores. Journal of Psychology, 68(2), 181-184. https://doi.org/10.1080/00223980.1968.10543421
  • Trendtel, M., & Robitzsch, A. (2018). Modeling item position effects with a Bayesian item response model applied to PISA 2009-2015 data. Psychological Test and Assessment Modeling, 60(2), 241-263. https://www.psychologie-aktuell.com/fileadmin/download/ptam/2-2018_20180627/06_PTAM-2-2018_Trendtel_v2.pdf
  • Tuerlinckx, F., & De Boeck, P. (2004). Models for residual dependencies. In P. De Boeck & M. Wilson (Eds.), Explanatory item response models (pp. 289-316). Springer.
  • Wainer, H., & Kiely, G. L. (1987). Item clusters and computerized adaptive testing: A case for testlets. Journal of Educational Measurement, 24(3), 185-201. http://www.jstor.org/stable/1434630
  • Weirich, S., Hecht, M., & Böhme, K. (2014). Modeling item position effects using generalized linear mixed models. Applied Psychological Measurement, 38(7), 535-548. https://doi.org/10.1177/0146621614534955
  • Weirich, S., Hecht, M., Penk, C., Roppelt, A., & Böhme, K. (2016). Item position effects are moderated by changes in test-taking effort. Applied Psychological Measurement, 41(2), 115-129. https://doi.org/10.1177/0146621616676791
  • Whitely, E., & Dawis, R. (1976). The influence of test context on item difficulty. Educational and Psychological Measurement, 36(2), 329-337. https://doi.org/10.1177/001316447603600211
  • Wise, L. L., Chia, W. J., & Park, R. (1989, March). Item position effects for tests of word knowledge and arithmetic reasoning. Paper presented at the annual meeting of the American Educational Research Association, San Francisco, CA.
  • Wise, S. L., & DeMars, C. E. (2005). Low examinee effort in low-stakes assessment: Problems and potential solutions. Educational Assessment, 10(1), 1-17. https://doi.org/10.1207/s15326977ea1001_1
  • Wise, S. L., & Kong, X. (2005). Response time effort: A new measure of examinee motivation in computer-based tests. Applied Measurement in Education, 18(2), 163-183. https://doi.org/10.1207/s15324818ame1802_2
  • Wu, Q., Debeer, D., Buchholz, J., Hartig, J., & Janssen, R. (2019). Predictors of individual performance changes related to item positions in PISA assessments. Large-scale Assessments in Education, 7(5), 1-20. https://doi.org/10.1186/s40536-019-0073-6
  • Yen, W. M. (1980). The extent, causes and importance of context effects on item parameters for two latent trait models. Journal of Educational Measurement, 17(4), 297-311. http://www.jstor.org/stable/1434871
  • Zwick, R. (1991). Effects of item order and context on estimation of NAEP reading proficiency. Educational Measurement: Issues and Practice, 10(3), 10-16. https://doi.org/10.1111/j.1745-3992.1991.tb00198.x