The Importance of Sample Weights and Plausible Values in Large-Scale Assessments

International large-scale assessments such as PISA (the Programme for International Student Assessment), PIAAC (the Programme for the International Assessment of Adult Competencies), and TIMSS (Trends in International Mathematics and Science Study) play a key role in shaping educational policies beyond their primary objectives of measuring, evaluating, and monitoring educational processes. It is therefore critical to analyze the data gathered from these assessments with scientifically sound statistical methods, as the results can influence millions of stakeholders through major policy changes. Analyzing these data, which comprise hundreds of distinct variables, requires expertise and specialized methods. This study illustrates the issues to be considered when analyzing PISA, PIAAC, and TIMSS data by presenting relevant syntax and exemplifying the incorrect results that may otherwise be obtained. In Turkey, courses that focus on large-scale data analysis are very limited, and workshops reach only small audiences. The aim of this study is to raise awareness of sample weights and plausible values. Comparative findings showed that analyses conducted without sample weights and plausible values are highly likely to yield incorrect results. The study exemplifies t-tests and multiple regression analyses conducted with the IDB Analyzer and multilevel regression analysis conducted with Mplus.
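The core procedure the abstract refers to — estimate the statistic once per plausible value using the sampling weights, then combine the estimates via Rubin's rules — can be sketched as follows. This is a minimal illustration only: all students, weights, and plausible values below are invented, and a real analysis would additionally use the replicate weights handled automatically by tools such as the IDB Analyzer or Mplus.

```python
import statistics

# Hypothetical mini-dataset: five students, each with a sampling weight
# and five plausible values (PVs), mimicking the structure of PISA/TIMSS
# achievement files. All numbers are invented for illustration.
weights = [1.0, 2.5, 0.8, 3.2, 1.5]
pv_rows = [  # one row per student: PV1..PV5
    [480, 492, 475, 488, 471],
    [515, 508, 521, 512, 519],
    [430, 441, 425, 437, 433],
    [555, 548, 560, 551, 557],
    [500, 495, 503, 498, 501],
]

def weighted_mean(values, wts):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(v * w for v, w in zip(values, wts)) / sum(wts)

# Step 1: compute the statistic separately for each plausible value.
pv_means = [weighted_mean([row[i] for row in pv_rows], weights)
            for i in range(5)]

# Step 2: combine with Rubin's rules -- the point estimate is the mean
# of the PV-specific estimates, and their variance is the imputation
# (between-PV) component of the total error variance.
point_estimate = statistics.mean(pv_means)
between_pv_variance = statistics.variance(pv_means)

# A common shortcut -- the unweighted mean of a single PV -- ignores
# both the sampling design and the measurement uncertainty, which is
# exactly the kind of incorrect result the study warns against.
naive_estimate = statistics.mean(row[0] for row in pv_rows)
```

In this toy example the weighted, PV-combined estimate differs visibly from the unweighted single-PV shortcut, because the higher-weight students happen to score higher; in real data the discrepancy can be large enough to change substantive conclusions.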

___

  • Addey, C., & Sellar, S. (2018). Why do countries participate in PISA? Understanding the role of international large-scale assessments in global education policy. In A. Verger, M. Novelli, & H. Kosar-Altinyelken (Eds.), Global education policy and international development: New agendas, issues and policies (pp. 97-118). New York, NY: Bloomsbury Publishing.
  • Addey, C., Sellar, S., Steiner-Khamsi, G., Lingard, B., & Verger A. (2017). Forum discussion: The rise of international large-scale assessments and rationales for participation. Compare, 47(3), 434-452. doi:10.1080/03057925.2017.1301399
  • Aydın, A., Selvitopu, A., & Kaya, M. (2018). Eğitime yapılan yatırımlar ve PISA 2015 sonuçları karşılaştırmalı bir inceleme. İlköğretim Online, 17(3), 1283-1301.
  • Bialecki, I., Jakubowski, M., & Wiśniewski, J. (2017). Education policy in Poland: The impact of PISA (and other international studies). European Journal of Education, 52(2), 167-174.
  • Carvalho, L. M., & Costa, E. (2015). Seeing education with one's own eyes and through PISA lenses: Considerations of the reception of PISA in European countries. Discourse: Studies in the Cultural Politics of Education, 36(5), 638-646. doi:10.1080/01596306.2013.871449
  • Ertl, H. (2006). Educational standards and the changing discourse on education: The reception and consequences of the PISA study in Germany. Oxford Review of Education, 32(5), 619-634.
  • Figazzolo, L. (2009). Testing, ranking, reforming: Impact of PISA 2006 on the education policy debate. Brussels: Education International.
  • Froese-Germain, B. (2010). The OECD, PISA and the impacts on educational policy. Canadian Teachers’ Federation. Retrieved from http://eric.ed.gov/?id=ED532562
  • Gonzalez, E. J. (2012). Rescaling sampling weights and selecting mini-samples from large-scale assessment databases. IERI Monograph Series: Issues and Methodologies in Large-Scale Assessments, 5, 117–134.
  • Gür, B. S., Celik, Z., & Özoğlu, M. (2012). Policy options for Turkey: A critique of the interpretation and utilization of PISA results in Turkey. Journal of Education Policy, 27(1), 1-21.
  • Hamilton, L. (2003). Assessment as a policy tool. Review of Research in Education, 27(1), 25-68.
  • International Association for the Evaluation of Educational Achievement (IEA). (2019). IDB Analyzer (version 4.0). Hamburg, Germany: IEA Hamburg.
  • Landahl, J. (2018). De-scandalisation and international assessments: The reception of IEA surveys in Sweden during the 1970s. Globalisation, Societies and Education, 16(5), 566-576. doi:10.1080/14767724.2018.1531235
  • LaRoche, S., & Foy, P. (2016). Sample design in TIMSS Advanced 2015. In M. O. Martin, I. V. S. Mullis, & M. Hooper (Eds.), Methods and procedures in TIMSS Advanced 2015 (pp. 3.1–3.27). Retrieved from http://timssandpirls.bc.edu/publications/timss/2015-a-methods/chapter-3.html
  • LaRoche, S., & Foy, P. (2016). Sample implementation in TIMSS 2015. In M. O. Martin, I. V. S. Mullis, & M. Hooper (Eds.), Methods and Procedures in TIMSS 2015 (pp. 5.1-5.175). Retrieved from http://timss.bc.edu/publications/timss/2015-methods/chapter-5.html
  • Laukaityte, I., & Wiberg, M. (2017). Using plausible values in secondary analysis in large-scale assessments. Communications in Statistics-Theory and Methods, 46(22), 11341-11357.
  • Martens, K., & Niemann, D. (2010). Governance by comparison: How ratings & rankings impact national policy-making in education (TranState Working Paper No. 139). Bremen: University of Bremen Collaborative Research Centre.
  • Milli Eğitim Bakanlığı (MEB). (2014). TIMSS 2011 ulusal matematik ve fen raporu: 8. sınıflar. Ankara: T.C. Milli Eğitim Bakanlığı Yenilik ve Eğitim Teknolojileri Genel Müdürlüğü.
  • Milli Eğitim Bakanlığı (MEB). (2016a). PISA 2015 ulusal raporu. Ankara: T.C. Milli Eğitim Bakanlığı Ölçme, Değerlendirme ve Sınav Hizmetleri Genel Müdürlüğü.
  • Milli Eğitim Bakanlığı (MEB). (2016b). TIMSS 2015 ulusal matematik ve fen bilimleri ön raporu 4. ve 8. sınıflar. Ankara: MEB: Ölçme, Değerlendirme ve Sınav Hizmetleri Genel Müdürlüğü.
  • Milli Eğitim Bakanlığı (MEB). (2018). 2018 Liselere Geçiş Sistemi (LGS): Merkezi sınavla yerleşen öğrencilerin performansı. Eğitim Analiz ve Değerlendirme Raporları Serisi (No. 3). Ankara: T.C. Milli Eğitim Bakanlığı.
  • Michel, A. (2017). The contribution of PISA to the convergence of education policies in Europe. European Journal of Education, 52(2), 206-216.
  • Monseur, C., & Adams, R. (2009). Plausible values: How to deal with their limitations. Journal of Applied Measurement, 10(3), 1-15.
  • Mullis, I. V. S., & Martin, M. O. (Eds.). (2017). TIMSS 2019 assessment frameworks. Chestnut Hill, MA: TIMSS & PIRLS International Study Center, Lynch School of Education, Boston College and International Association for the Evaluation of Educational Achievement (IEA). Retrieved from http://timssandpirls.bc.edu/timss2019/frameworks/
  • Muthén, L. K., & Muthén, B. O. (2015). Mplus user’s guide (7th ed.). Los Angeles, CA: Muthén & Muthén.
  • Novoa, A., & Yariv-Mashal, T. (2003). Comparative research in education: A mode of governance or a historical journey? Comparative Education, 39(4), 423–438.
  • The Organisation for Economic Co-operation and Development (OECD). (2016). Skills matter: Further results from the Survey of Adult Skills. OECD Skills Studies. Paris: OECD Publishing. doi:10.1787/9789264258051-en
  • The Organisation for Economic Co-operation and Development (OECD). (2017). PISA 2015 technical report. Paris: OECD Publishing. Retrieved from https://www.oecd.org/pisa/sitedocument/PISA-2015-technical-report-final.pdf
  • The Organisation for Economic Co-operation and Development (OECD). (2019). PISA 2018 assessment and analytical framework. Paris: OECD Publishing. doi:10.1787/b25efab8-en
  • Pizmony-Levy, O. (2018). Compare globally, interpret locally: International assessments and news media in Israel. Globalisation, Societies and Education, 16(5), 577-595. doi:10.1080/14767724.2018.1531236
  • Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York, NY: John Wiley.
  • Rust, K. (2013). Sampling, weighting, and variance estimation in international large-scale assessments. In L. Rutkowski, M. von Davier, & D. Rutkowski (Eds.), Handbook of international large-scale assessment: Background, technical issues, and methods of data analysis (1st ed., pp. 117–154). New York, NY: Chapman and Hall/CRC Press.
  • Rutkowski, L., Gonzalez, E., Joncas, M., & von Davier, M. (2010). International large-scale assessment data: Issues in secondary analysis and reporting. Educational Researcher, 39(2), 142-151.
  • Steiner-Khamsi, G. & Waldow, F. (2018). PISA for scandalisation, PISA for projection: the use of international large-scale assessments in education policy making – an introduction. Globalisation, Societies and Education, 16(5), 557-565. doi:10.1080/14767724.2018.1531234
  • TEDMEM. (2016). OECD yetişkin becerileri araştırması: Türkiye ile ilgili sonuçlar. Ankara: Türk Eğitim Derneği Yayınları.
  • Tiana Ferrer, A. (2017). PISA in Spain: Expectations, impact and debate. European Journal of Education, 52, 184-191.
  • von Davier, M., Gonzalez, E., & Mislevy, R. (2009). What are plausible values and why are they useful? IERI Monograph Series, 2(1), 9-36.
  • Waldow, F. (2009). What PISA did and did not do: Germany after the ‘PISA-shock’. European Educational Research Journal, 8(3), 476-483.
  • Wiseman, A. (2013). Policy responses to PISA in comparative perspective. In H. D. Meyer & A. Benavot (Eds.), PISA, power, and policy: The emergence of global educational governance (pp. 303–322). Oxford: Symposium Books.
  • Wu, M. (2005). The role of plausible values in large-scale surveys. Studies in Educational Evaluation, 31(2-3), 114-128.