Using Alignment Index and Polytomous Item Response Theory on Statistics Essay Test

Purpose: Essay tests in mathematics, in both restricted-response and extended-response form, generally consist of polytomously scored items. However, the essay tests used by teachers in Indonesia are not yet supported by sufficient evidence of quality. Many studies have focused on developing essay tests, but few have applied measurement theory appropriate for polytomous data, and the evidence of content validity has rarely been supported by alignment with the curriculum. This study used an alignment index to establish content validity and the polytomous IRT generalized partial credit model (GPCM) to determine item characteristics, in order to produce an essay test that accurately measures student achievement in statistics. Method: The study proceeded in three stages: (1) preparation of a preliminary test, (2) trials, and (3) interpretation. The trial involved 688 junior high school students in Yogyakarta, Indonesia. Results: The content validity of the test was good, supported by Aiken's V indices of 0.88–1.00 and a Porter alignment index of 0.93. The test items had good construct validity. Test reliability was categorized as good, with a construct reliability coefficient of 0.88 and an alpha coefficient of 0.78. Judging from their characteristics, all test items were categorized as good. Implications for Research and Practice: The alignment index contributes to verifying the content validity of essay tests, and the polytomous IRT GPCM may serve as a reference for choosing an appropriate measurement theory to determine the item characteristics of essay tests.
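The three quantities named in the abstract can be sketched in a few lines of Python. This is a minimal illustration using the standard textbook formulas, not the authors' actual analysis: the function names (`aiken_v`, `porter_alignment`, `gpcm_probs`) are hypothetical, the scaling constant in the GPCM is assumed to be 1, and any inputs passed to them here are made up rather than taken from the study.

```python
import math


def aiken_v(ratings, lo, hi):
    """Aiken's V for one item: sum(r - lo) / (n * (hi - lo)).

    ratings: expert ratings on an integer scale from lo to hi.
    """
    n = len(ratings)
    return sum(r - lo for r in ratings) / (n * (hi - lo))


def porter_alignment(test_counts, curriculum_counts):
    """Porter alignment index: 1 - sum(|x_i - y_i|) / 2.

    Both inputs are flattened topic-by-cognitive-level matrices of counts;
    each is converted to proportions that sum to 1 before comparison.
    """
    total_x = sum(test_counts)
    total_y = sum(curriculum_counts)
    diff = sum(abs(x / total_x - y / total_y)
               for x, y in zip(test_counts, curriculum_counts))
    return 1 - diff / 2


def gpcm_probs(theta, a, steps):
    """Category probabilities under the generalized partial credit model.

    theta: ability; a: item discrimination; steps: step parameters b_1..b_m.
    Returns probabilities for scores 0..m (scaling constant assumed to be 1).
    """
    cumulative = [0.0]  # the sum for score 0 is defined as 0
    for b in steps:
        cumulative.append(cumulative[-1] + a * (theta - b))
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]
```

For example, `aiken_v([4, 4, 3, 4], lo=1, hi=4)` evaluates four hypothetical expert ratings on a 1 to 4 scale, and `porter_alignment` returns 1 when the test and curriculum distribute their emphasis identically across cells and 0 when they share no cells at all.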

___

  • Aiken, L. R. (1985). Three coefficients for analyzing the reliability and validity of ratings. Educational and Psychological Measurement, 45, 131-142.
  • Ananda, S. (2003). Rethinking issues of alignment under No Child Left Behind. San Francisco: WestEd.
  • Anderson, D., Irvin, S., Alonzo, J., & Tindal, G. A. (2015). Gauging item alignment through online systems while controlling for rater effects. Educational Measurement: Issues and Practice, 34, 22-33.
  • Anderson, L. W. & Krathwohl, D. R. (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom’s taxonomy of educational objectives. New York: Addison Wesley Longman.
  • Bhola, D. S., Impara, J. C., & Buckendahl, C. W. (2003). Aligning tests with states’ content standards: Methods and issues. Educational Measurement: Issues and Practice, 22(3), 21–29.
  • Biggs, J. (2003). Teaching for quality learning at university. Glasgow: The Society for Research into Higher Education & Open University Press.
  • Buhaerah. (2010). Pengembangan perangkat pembelajaran berdasarkan masalah pada materi statistika di kelas IX SMP. Gamatika, No. 1, November.
  • Crocker, L. & Algina, J. (1986). Introduction to classical and modern test theory. Belmont: Wadsworth Group.
  • Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334.
  • Ebel, R. L. & Frisbie, D. A. (1991). Essentials of educational measurement. USA: Prentice-Hall Inc.
  • Effendi, K. N. S. & Farlina, E. (2017). Kemampuan berpikir kreatif siswa SMP kelas VII dalam penyelesaian masalah statistika. Jurnal Analisa. 3(2). 130-138.
  • Garson, G. D. (2009). Overview structural equation modeling, http://faculty.chass.ncsu.edu/garson/PA765/structur.htm.
  • Guler, N. (2014). Analysis of open-ended statistics questions with many facet Rasch model. Eurasian Journal of Educational Research, 55, 73-90. http://dx.doi.org/10.14689/ejer.2014.55.5.
  • Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2009). Multivariate data analysis (7th ed.). Prentice Hall [electronic version].
  • Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park, CA: Sage Publications Inc.
  • Hanjarwati, R. & Wiyarno, Y. (2015). Pengembangan bahan ajar matematika (materi statistik) dengan menggunakan model active learning sistem 5 M untuk siswa kelas VII. Jurnal Teknologi Pembelajaran Devosi. 5(2).
  • Hasmi, A., Hussain, T., & Shoaib, A. (2018). Alignment between Mathematics Curriculum and Textbook of Grade VIII in Punjab. Bulletin of Education and Research, April 2018, 40(1), 57-76.
  • Igbaria, M., Zinatelli, N., Cragg, P., & Cavaye, A. L. M. (1997). Personal computing acceptable factors in small firms: A structural equation model. MIS Quarterly, September, 279-299.
  • Kayapinar, U. (2014). Measuring essay assessment: Intra-rater and inter-rater reliability. Eurasian Journal of Educational Research, 57, 113-136, http://dx.doi.org/10.14689/ejer.2014.57.2.
  • MacCallum, R. C., Widaman, K. F., Zhang, S., & Hong, S. (1999). Sample size in factor analysis. Psychological Methods, 4, 84-99.
  • Muraki, E. (1993). Information functions of the generalized partial credit model. Applied Psychological Measurement, 17(4), 351-393.
  • Muraki, E. & Bock, D. (2002). PARSCALE 4.1 [Computer program]. Chicago: Scientific Software International, Inc.
  • Naga, D. S. (1992). Pengantar teori skor pada pengukuran pendidikan. Jakarta: Gunadarma.
  • Nasstrom, G. & Henriksson, W. (2008). Alignment of standards and assessment: A theoretical and empirical study of methods for alignment. Electronic Journal of Research in Educational Psychology, 6(3), 667-690.
  • Nitko, A. J. & Brookhart, S. M. (2011). Educational assessment of students (6th ed.). Boston, MA: Pearson Education Inc.
  • Nunnally, J. C. (1981). Psychometric theory (2nd ed). New Delhi: McGraw-Hill Publishing Company Limited.
  • Oriondo, L. L. & Antonio, D. E. M. (1998). Evaluating educational outcomes. Manila: Rex Printing Company.
  • Reeve, B. B., & Fayers, P. (2005). Applying item response theory modeling for evaluating questionnaire item and scale properties. In Fayers, P. & Hays, E.D. (Eds), Assessing quality of life in clinical trials: Methods of practice (2nd ed). New York: Oxford University Press.
  • Sireci, S. & Bond, M. F. (2014). Validity evidence based on test content. Psicothema, 26(1), 100-107.
  • Thorpe, G. L. & Favia, A. (2012). Data analysis using item response theory methodology: An introduction to selected programs and applications. Psychology Faculty Scholarship. Paper 20. http://digitalcommons.library.umaine.edu/psy_facpub/20.
  • Tindal, G. (2005). Alignment of alternate assessments using the Webb system. Washington, DC: Council of Chief State School Officers.
  • Van der Linden, W. J. & Hambleton, R. K. (1997). Handbook of modern item response theory. New York: Springer-Verlag.
  • Walstad, W. B. (2006). Testing for depth of understanding in economics using essay questions. Journal of Economic Education. Washington: Winter.
  • Webb, N. L. (1997). Criteria for alignment of expectations and assessments in mathematics and science education (Research monograph No. 6). Washington, DC: Council of Chief State School Officers.
  • Wells, C. S., Hambleton, R. K. & Purwono, U. (2008, June). Item response theory. Polytomous response IRT models and applications. Handout delivered at the Training of Educational Assessment and Psychology (Psychometry), Yogyakarta State University.
  • Wiggins, G. & McTighe, J. (2001). Understanding by Design (2nd Ed.). Alexandria, VA: Association for Supervision and Curriculum Development.
  • Wijanto, S. H. (2008). Structural equation modeling dengan Lisrel 8.8. Yogyakarta: Graha Ilmu.
  • Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah: Lawrence Erlbaum Associates, Inc.