Effects of Item Pool Characteristics on Ability Estimate and Item Pool Utilization: A Simulation Study
Developing an item pool for computerized adaptive testing (CAT) is a long and demanding process that is costly in both time and money. Questions such as 'What should an optimal item pool look like?' and 'What is the minimum number of items an item pool should contain?' therefore arise frequently. Studies on the features an optimal item pool should have vary in their conclusions, and no consensus has been reached on item pool size in particular. This study examined the effects of item pools differing in number of items and item distribution on ability estimation and item pool utilization. Thirty-six different item pools were generated using the SimulCAT software. Single-session CAT environments were simulated with 1,000 simulees, and two different termination rules were used. Overall, the results indicated that as the item pool grew, up to a certain size, measurement precision increased and the number of unused items decreased. With respect to the b parameter, the effect of the b parameter distribution on the results decreased as the item pool grew.
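The design summarized above (generate a pool, run a CAT for each simulee, then record measurement precision and unused items) can be sketched in a few lines of Python. This is only an illustrative sketch, not the SimulCAT procedure used in the study. The Rasch model, the standard normal distributions for abilities and b parameters, EAP estimation, closest-difficulty item selection, and the two stopping options (a fixed maximum length or a standard-error target) are all assumptions made here for illustration. The abstract does not say which model, selection method, or termination rules were used. All function and parameter names are hypothetical.

```python
import math
import random

# Quadrature grid for EAP estimation (assumed range: -4 to 4 in steps of 0.1).
GRID = [-4.0 + 0.1 * k for k in range(81)]


def p_correct(theta, b):
    """Rasch (1PL) probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def eap(responses):
    """Return the EAP estimate and posterior SD under a standard normal prior.

    `responses` is a list of (b, answered_correctly) pairs.
    """
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized N(0, 1) prior
        for b, correct in responses:
            p = p_correct(t, b)
            w *= p if correct else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(true_theta, pool, rng, max_items=20, se_target=None):
    """Administer one adaptive test; return (estimate, SE, set of item indices used)."""
    used, responses = set(), []
    est, se = 0.0, 1.0
    while len(used) < min(max_items, len(pool)):
        # Under the Rasch model, the most informative remaining item is the
        # one whose difficulty is closest to the current ability estimate.
        nxt = min((i for i in range(len(pool)) if i not in used),
                  key=lambda i: abs(pool[i] - est))
        used.add(nxt)
        correct = rng.random() < p_correct(true_theta, pool[nxt])
        responses.append((pool[nxt], correct))
        est, se = eap(responses)
        if se_target is not None and se <= se_target:
            break  # variable-length rule: stop once precision is sufficient
    return est, se, used


def simulate(pool_size, n_simulees=200, seed=1, **rule):
    """Return (RMSE of ability estimates, number of never-administered items)."""
    rng = random.Random(seed)
    pool = [rng.gauss(0.0, 1.0) for _ in range(pool_size)]  # assumed b ~ N(0, 1)
    exposed, sq_err = set(), 0.0
    for _ in range(n_simulees):
        theta = rng.gauss(0.0, 1.0)
        est, _, used = run_cat(theta, pool, rng, **rule)
        exposed |= used
        sq_err += (est - theta) ** 2
    return math.sqrt(sq_err / n_simulees), pool_size - len(exposed)
```

Calling, for example, `simulate(100, max_items=20)` and `simulate(300, se_target=0.35)` for a range of pool sizes gives the kind of comparison the study reports: how estimation error and the count of unused items change with pool size under each stopping rule. The values produced by this sketch are not expected to match the study's results.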