Applicability and Efficiency of a Polytomous IRT-Based Computerized Adaptive Test for Measuring Psychological Traits

Research on computerized adaptive testing (CAT) currently focuses mainly on dichotomous items and cognitive traits (achievement, aptitude, etc.). However, polytomous IRT-based CAT for measuring psychological traits is a promising research area that has attracted much attention. The main purpose of this study is to test the practicality of a polytomous IRT-based CAT and its equivalence with the paper-pencil version. Data were collected from 1449 high school students (45% female) via the paper-pencil version and were used to estimate IRT parameters and to run CAT simulation studies. For the equivalence study, the research group consisted of 81 students (47% female) who completed both the paper-pencil and live CAT applications. The paper-pencil version of the vocational interest inventory consists of 17 factors and 164 items. When the EAP estimation method and setting SE
