Classroom assessment that tailors instruction and directs learning: A validation study

We report the validity of a test instrument that assesses the arithmetic ability of primary students by (a) describing the theoretical model of arithmetic ability assessment using Wilson’s (2004) four building blocks of constructing measures and (b) providing empirical evidence for the validation study. The instrument consists of 21 multiple-choice questions that hierarchically evaluate intended learning outcomes (ILOs) on arithmetic ability, based on Bloom’s cognitive taxonomy, and was administered to 138 Primary 3 students. The theoretical model describes students’ arithmetic ability at three distinct levels: solid, developing, and basic. At each level, the model describes the characteristics of the tasks that the students can answer correctly. The analysis shows that the difficulty of the items followed the expected order in the theoretical construct map: the difficulty of each designed item aligned with its intended cognitive level, the item difficulty distribution aligned with the structure of the person construct map, and word problems required higher cognitive abilities than calculation problems did. The findings, however, indicated that more difficult items could be added to better differentiate students of different ability levels, and that one item should be revised to enhance the reliability and validity of the instrument. We conclude that the conceptualization of such formative assessments provides meaningful information for teachers to support learning and tailor instruction.
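
The claim that persons and items can be compared on a common scale rests on the measurement model used to calibrate the construct map. As a minimal sketch, assuming the dichotomous Rasch model that the cited measurement literature points to (Rasch, 1960; Wright & Masters, 1982; Linacre, 2012):

$$
P(X_{pi} = 1 \mid \theta_p, \delta_i) = \frac{\exp(\theta_p - \delta_i)}{1 + \exp(\theta_p - \delta_i)},
$$

where $\theta_p$ is the ability of person $p$ and $\delta_i$ is the difficulty of item $i$, both expressed in logits. Under this model, a student succeeds on an item with probability above .5 exactly when $\theta_p > \delta_i$, so checking that item difficulties followed the expected order in the construct map amounts to checking that the estimated $\delta_i$ of basic, developing, and solid items are ordered accordingly on the same logit scale as the person estimates.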

___

  • Adillah, G., Ridwan, A., & Rahayu, W. (2022). Content validation through expert judgement of an instrument on the self-assessment of mathematics education student competency. International Journal of Multicultural and Multireligious Understanding, 9(3), 780-790. https://doi.org/10.18415/ijmmu.v9i3.3738
  • Alderson, J.C. (1990). Testing reading comprehension skills. Reading in a Foreign Language, 6(2), 425-438.
  • Alonzo, A.C., & Steedle, J.T. (2009). Developing and assessing a force and motion learning progression. Science Education, 93(3), 389-421.
  • Baird, J.-A., Andrich, D., Hopfenbeck, T.N., & Stobart, G. (2017). Assessment and learning: Fields apart? Assessment in Education: Principles, Policy & Practice, 24(3), 317-350.
  • Baroody, A.J., & Dowker, A. (Eds.). (2003). The development of arithmetic concepts and skills: Constructing adaptive expertise. Lawrence Erlbaum Associates Publishers.
  • Beck, K. (2020). Ensuring content validity of psychological and educational tests – the role of experts. Frontline Learning Research, 8(6), 1-37. https://doi.org/10.14786/flr.v8i6.517
  • Bell, B., & Cowie, B. (2001). The characteristics of formative assessment in science education. Science Education, 85(5), 536-553.
  • Björklund, C., Marton, F., & Kullberg, A. (2021). What is to be learnt? Critical aspects of elementary skills. Educational Studies in Mathematics, 107, 261-284. https://doi.org/10.1007/s10649-021-10045-0
  • Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education, 5(1), 7-74.
  • Black, P., & Wiliam, D. (2003). ‘In praise of educational research’: Formative assessment. British Educational Research Journal, 29(5), 623-637.
  • Black, P., & Wiliam, D. (2010). Inside the black box: Raising standards through classroom assessment. Phi Delta Kappan, 92(1), 81-90. https://doi.org/10.1177/003172171009200119
  • Black, P., Wilson, M., & Yao, S. (2011). Road maps for learning: A guide to the navigation of learning progressions. Measurement: Interdisciplinary Research & Perspectives, 9, 1-52.
  • Bloom, B.S. (1956). Taxonomy of educational objectives. The classification of educational goals. Handbook 1: Cognitive domain. David McKay.
  • Bond, T., & Fox, C.M. (2015). Applying the Rasch model: Fundamental measurement in the human sciences (3rd ed.). Routledge.
  • Brookhart, S.M., Moss, C.M., & Long, B.A. (2010). Teacher inquiry into formative assessment practices in remedial reading classrooms. Assessment in Education: Principles, Policy & Practice, 17(1), 41-58.
  • Cardinet, J. (1989). Evaluer sans juger. Revue Française de Pédagogie, 88, 41-52.
  • Cataloglu, E., & Robinett, R.W. (2002). Testing the development of student conceptual and visualization understanding in quantum mechanics through the undergraduate career. American Journal of Physics, 70(3), 238-251. https://doi.org/10.1119/1.1405509
  • Chiang, T., & Chen, Y. (2019). Semantically-aligned equation generation for solving and reasoning math word problems. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT). https://doi.org/10.18653/v1/n19-1272
  • Clark, I. (2012). Formative assessment: Assessment is for self-regulated learning. Educational Psychology Review, 24(2), 205-249.
  • Cummins, J. (1991). Interdependence of first- and second-language proficiency in bilingual children. In E. Bialystok (Ed.), Language processing in bilingual children (pp. 70-89). Cambridge University Press. https://doi.org/10.1017/cbo9780511620652.006
  • Dixson, D.D., & Worrell, F.C. (2016). Formative and summative assessment in the classroom. Theory into Practice, 55(2), 153-159. https://doi.org/10.1080/00405841.2016.1148989
  • Dowker, A. (2005). Early identification and intervention for students with mathematics difficulties. Journal of Learning Disabilities, 38(4), 324-332. https://doi.org/10.1177/00222194050380040801
  • Duckor, B., & Holmberg, C. (2017). Mastering formative assessment moves: 7 high-leverage practices to advance student learning. ASCD Press.
  • Embretson, S.E., & Reise, S.P. (2000). Item response theory for psychologists. Lawrence Erlbaum.
  • Engvall, M., Samuelsson, J., & Östergren, R. (2020). The effect on students’ arithmetic skills of teaching two differently structured calculation methods. Problems of Education in the 21st Century, 78(2), 167-195. https://doi.org/10.33225/pec/20.78.167
  • Fisher, W.P., Jr. (1997). Is content validity valid? Rasch Measurement Transactions, 11, 548.
  • Fisher, W.P., Jr. (2013). Imagining education tailored to assessment as, for, and of learning: Theory, standards, and quality improvement. Assessment and Learning, 2, 6-22.
  • Geary, D.C. (1993). Mathematical disabilities: Cognitive, neuropsychological, and genetic components. Psychological Bulletin, 114(2), 345–362. https://doi.org/10.1037/0033-2909.114.2.345
  • Goldman, S.R., & Hasselbring, T.S. (1997). Achieving meaningful mathematics literacy for students with learning disabilities. Journal of Learning Disabilities, 30(2), 198–208.
  • Gorin, J.S., & Mislevy, R.J. (2013). Inherent measurement challenges in the Next Generation Science Standards for both formative and summative assessment. Paper presented at the Invitational Research Symposium on Science Assessment, K-12 Center at Educational Testing Service (ETS).
  • Gurel, D.K., Eryilmaz, A., & McDermott, L.C. (2015). A review and comparison of diagnostic instruments to identify students’ misconceptions in science. EURASIA Journal of Mathematics, Science and Technology Education, 11(5). https://doi.org/10.12973/eurasia.2015.1369a
  • Guskey, T.R. (2003). How classroom assessments improve learning. Educational Leadership, 60(5), 6-11.
  • Hambleton, R.K., & Jones, R.W. (1993). Comparison of classical test theory and item response theory and their applications to test development. Educational Measurement: Issues and Practice, 12(3), 38–47. https://doi.org/10.1111/j.1745-3992.1993.tb00543.x
  • Hambleton, R.K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Kluwer-Nijhoff.
  • Hambleton, R.K., Swaminathan, H., & Rogers, H.J. (1991). Fundamentals of item response theory. Sage.
  • Hattie, J. (2008). Visible learning: A synthesis of over 800 meta-analyses relating to achievement. Routledge.
  • Hidayati, K., Budiyono, & Sugiman. (2019). Using alignment index and polytomous item response theory on statistics essay test. Eurasian Journal of Educational Research, 79, 115-132.
  • Hiebert, J., & Lefevre, P. (1986). Conceptual and procedural knowledge in mathematics: An introductory analysis. In J. Hiebert (Ed.), Conceptual and procedural knowledge: The case of mathematics (pp. 1–27). Lawrence Erlbaum Associates, Inc.
  • Hong Kong Education Bureau (2000). Mathematics education key learning area.
  • Hong Kong Education Bureau (2018). Explanatory Notes to Primary Mathematics Curriculum (Key Stage 1).
  • Kilpatrick, J., Swafford, J., & Findell, B. (2001). Adding it up: Helping children learn mathematics. National Academy Press.
  • Lee, N.P., & Fisher, W.P., Jr. (2005). Evaluation of the diabetes self-care scale. Journal of Applied Measurement, 6(4), 366-381.
  • Linacre, J.M. (1998). Structure in Rasch residuals: Why principal components analysis (PCA)? Rasch Measurement Transactions, 12(2), 636.
  • Linacre, J.M. (2000). Computer-adaptive testing: A methodology whose time has come. MESA Memorandum No. 69.
  • Linacre, J.M. (2009). Local independence and residual covariance: A study of Olympic figure skating ratings. Journal of Applied Measurement, 10(2), 157-169.
  • Linacre, J.M. (2012). A user's guide to Winsteps Ministep Rasch-model computer programs: Program manual 3.74.0. https://www.winsteps.com/a/Winsteps-Manual.pdf
  • Liu, O.L., Lee, H.S., Hofstetter, C., & Linn, M.C. (2008). Assessing knowledge integration in science: Construct, measures, and evidence. Educational Assessment, 13(1), 33-55.
  • Luque-Vara, T., Linares-Manrique, M., Fernández-Gómez, E., Martín-Salvador, A., Sánchez-Ojeda, M.A., & Enrique-Mirón, C. (2020). Content validation of an instrument for the assessment of school teachers’ levels of knowledge of diabetes through expert judgment. International Journal of Environmental Research and Public Health, 17(22), 8605. https://doi.org/10.3390/ijerph17228605
  • Messick, S. (1981). Evidence and ethics in the evaluation of tests. Educational Researcher, 10(9), 9-20.
  • Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13-103). Macmillan Publishing.
  • Millians, M. (2011). Computational skills. In S. Goldstein & J. A. Naglieri (Eds.), Encyclopedia of child behavior and development. Springer. https://doi.org/10.1007/978-0-387-79061-9_645
  • Mislevy, R.J., Steinberg, L.S., & Almond, R.G. (2003). Focus article: On the structure of educational assessments. Measurement: Interdisciplinary Research & Perspective, 1(1), 3-62. https://doi.org/10.1207/s15366359mea0101_02
  • Nasir, N.A.M., Singh, P., Narayanan, G., Mohd Habali, A.H., & Rasid, N.S. (2022). Development of mathematical thinking test: Content validity process. ESTEEM Journal of Social Sciences and Humanities, 6(2), 18-29.
  • National Research Council (NRC). (2006). Systems for state science assessment. Committee on Test Design for K-12 Science Achievement. M.R. Wilson & M.W. Bertenthal (Eds.). Board on Testing and Assessment, Center for Education, Division of Behavioral and Social Sciences and Education. The National Academies Press.
  • Parviainen, P. (2019). The development of early mathematical skills - A theoretical framework for a holistic model. Journal of Early Childhood Education Research, 8(1), 162-191.
  • Popham, W.J. (2009). Our failure to use formative assessment: Immoral omission. Leadership, 1(2), 1-6.
  • Popham, W.J. (2010). Wanted: A formative assessment starter kit. Assessment Matters, 2, 182.
  • Prakitipong, N., & Nakamura, S. (2006). Analysis of mathematics performance of grade five students in Thailand using Newman procedure. Journal of International Cooperation in Education, 9(1), 111-122.
  • Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests (Reprint, with foreword and afterword by B.D. Wright, University of Chicago Press, 1980). Danmarks Paedagogiske Institut.
  • Rasch, G. (1993). Probabilistic models for some intelligence and attainment tests. Mesa Press.
  • Riley, M.S., Greeno, J.G., & Heller, J.I. (1983). Development of children's problem-solving ability in arithmetic. In H. Ginsburg (Ed.), The development of mathematical thinking (pp. 153-196). Academic Press.
  • Shepard, L.A. (2006). Classroom assessment. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 623-646). Rowman & Littlefield.
  • Sievert, H., van den Ham, A.-K., & Heinze, A. (2021). Are first graders' arithmetic skills related to the quality of mathematics textbooks? A study on students' use of arithmetic principles. Learning and Instruction, 73. https://doi.org/10.1016/j.learninstruc.2020.101401
  • Simon, H.A. (1978). Information-processing theory of human problem solving. In W.K. Estes (Ed.), Handbook of learning and cognitive processes (Volume 5): Human information processing (pp. 271-295). Psychology Press.
  • Stiggins, R.J. (1994). Student-centered classroom assessment. Merrill.
  • Vlassis, J., Baye, A., Auquière, A., de Chambrier, A.-F., Dierendonck, C., Giauque, N., Kerger, S., Luxembourger, C., Poncelet, D., Tinnes-Vigne, M., Tazouti, Y., & Fagnant, A. (2022). Developing arithmetic skills in kindergarten through a game-based approach: A major issue for learners and a challenge for teachers. International Journal of Early Years Education. https://doi.org/10.1080/09669760.2022.2138740
  • Wertheimer, M. (1959). Productive thinking (Enlarged ed.). Harper and Brothers.
  • Wiggins, G. (1998). Educative assessment: Designing assessments to inform and improve student performance. Jossey-Bass.
  • Wilson, M. (2004). Constructing measures: An item response modeling approach. Routledge.
  • Wole, G.A., Fufa, S., & Seyoum, Y. (2021). Evaluating the content validity of grade 10 mathematics model examinations in Oromia National Regional State, Ethiopia. Education Research International, 2021, Article 5837931. https://doi.org/10.1155/2021/5837931
  • Wright, B.D., & Masters, G.N. (1982). Rating scale analysis. Mesa Press.
  • Wright, B.D., & Stone, M.H. (1999). Measurement essentials. Wide Range.