Investigating Consequences of Using Item Pre-knowledge in Computerized Multistage Testing

The goal of this study is to determine the effects of test cheating in computerized multistage testing (c-MST) when test-takers use item pre-knowledge, and to urge practitioners to take additional precautions to increase test security. To investigate the statistical consequences of item pre-knowledge use in the c-MST, three cheating scenarios were created in addition to a baseline condition with no pre-knowledge use. The findings were compared under 30-item and 60-item test length conditions with a 1-3-3 c-MST panel design. A total of thirty cheaters were generated from a normal distribution, and expected a posteriori (EAP) estimation was used to estimate ability. The findings were evaluated using mean bias, root mean square error (RMSE), the correlation between true and estimated thetas, conditional absolute bias, and conditional RMSE. Item pre-knowledge severely distorted the estimated thetas, and the distortion grew as the number of compromised items increased. It was concluded that item sharing and/or test cheating seriously damage test scores, test use, and score interpretations.
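As a rough illustration of the mechanism described above, the sketch below (Python, not the study's own code) simulates honest examinees and examinees with item pre-knowledge on a single linear 2PL form rather than the 1-3-3 c-MST design; the item parameters, the number of compromised items, and the quadrature settings are illustrative assumptions. It computes EAP ability estimates and the mean bias and RMSE criteria for both groups.

```python
# Minimal sketch: effect of item pre-knowledge on EAP estimates (assumed 2PL, linear form).
import numpy as np

rng = np.random.default_rng(42)

n_items = 30                                    # assumed test length (study uses 30 and 60 items)
a = rng.uniform(0.8, 2.0, n_items)              # item discriminations (assumed range)
b = rng.normal(0.0, 1.0, n_items)               # item difficulties
compromised = rng.choice(n_items, size=10, replace=False)   # assumed 10 leaked items

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def simulate_responses(theta, has_preknowledge):
    """Generate item responses; a cheater answers all leaked items correctly."""
    p = p_correct(theta, a, b)
    resp = (rng.random(n_items) < p).astype(int)
    if has_preknowledge:
        resp[compromised] = 1
    return resp

def eap_estimate(resp, n_quad=61):
    """EAP estimate under a standard-normal prior using simple quadrature."""
    nodes = np.linspace(-4.0, 4.0, n_quad)
    prior = np.exp(-0.5 * nodes ** 2)
    like = np.ones_like(nodes)
    for q, node in enumerate(nodes):
        p = p_correct(node, a, b)
        like[q] = np.prod(np.where(resp == 1, p, 1.0 - p))
    post = like * prior
    return np.sum(nodes * post) / np.sum(post)

true_theta = rng.normal(0.0, 1.0, 1000)
est_honest = np.array([eap_estimate(simulate_responses(t, False)) for t in true_theta])
est_cheat  = np.array([eap_estimate(simulate_responses(t, True))  for t in true_theta])

for label, est in [("honest", est_honest), ("pre-knowledge", est_cheat)]:
    bias = np.mean(est - true_theta)                       # mean bias
    rmse = np.sqrt(np.mean((est - true_theta) ** 2))       # root mean square error
    print(f"{label:>14}: mean bias = {bias:+.3f}, RMSE = {rmse:.3f}")
```

Under these assumptions, the pre-knowledge group shows positively biased estimates and larger RMSE, and increasing the number of compromised items magnifies both, consistent with the pattern reported in the abstract.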
