İlhan KOYUNCU, Selahattin GELBAL, Osman TAT

The Influence of Using Plausible Values and Survey Weights on Multiple Regression and Hierarchical Linear Model Parameters

In large-scale assessments such as the Programme for International Student Assessment (PISA) and the Trends in International Mathematics and Science Study (TIMSS), plausible values are often used as estimates of students’ ability. These studies draw participants through stratified sampling, so the resulting data have a hierarchical structure. In this context, plausible values are random draws from the posterior ability distribution. Using only one plausible value, or the mean of the plausible values, as an independent variable in linear models has been reported to introduce estimation errors. Moreover, sampling weights are sometimes omitted when large-scale assessment data are analyzed. This study investigates how three analytic choices affect the parameters of multiple linear regression and hierarchical linear models: 1) using only one plausible value, 2) using all plausible values, and 3) incorporating sampling weights or not. The data come from the school and student questionnaires in the PISA 2015 Turkey database. Results revealed that the use of sampling weights and the number of plausible values have significant effects on regression coefficients, standard errors, and explained variance in both models. The findings are discussed in detail, and conclusions are drawn for practice and further research.
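As a rough illustration of what "using all plausible values" with sampling weights can look like in practice, the sketch below fits a weighted regression once per plausible value and pools the results with Rubin's combining rules. It is not the procedure used in the study: the data are synthetic, the plausible values serve as the outcome, the variable names (`x`, `w`, `pvs`) and the choice of ten plausible values are assumptions made for the example, and the within-imputation variance is a simple sandwich estimate rather than the replicate-weight variance that operational PISA analyses rely on.

```python
import numpy as np


def weighted_ols(X, y, w):
    """Weighted least squares with a sandwich (robust) variance estimate.

    Simplification: clustering and replicate weights are ignored here.
    """
    xtwx_inv = np.linalg.inv(X.T @ (X * w[:, None]))
    beta = xtwx_inv @ (X.T @ (w * y))
    resid = y - X @ beta
    score = X * (w * resid)[:, None]
    cov = xtwx_inv @ (score.T @ score) @ xtwx_inv
    return beta, np.diag(cov)


def combine_plausible_values(estimates, variances):
    """Pool per-plausible-value results with Rubin's rules."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = estimates.shape[0]
    pooled = estimates.mean(axis=0)
    within = variances.mean(axis=0)          # average sampling variance
    between = estimates.var(axis=0, ddof=1)  # variance across plausible values
    total = within + (1.0 + 1.0 / m) * between
    return pooled, np.sqrt(total)


# Synthetic data (all values hypothetical).
rng = np.random.default_rng(0)
n, n_pv = 2000, 10
x = rng.normal(size=n)                       # a student-level predictor
w = rng.uniform(1.0, 5.0, size=n)            # stand-in for final student weights
theta = 500 + 30 * x + rng.normal(0, 80, n)  # latent ability
pvs = theta[:, None] + rng.normal(0, 30, size=(n, n_pv))  # plausible values

X = np.column_stack([np.ones(n), x])

# Fit the same weighted model once per plausible value, then pool.
estimates, variances = [], []
for j in range(n_pv):
    beta_j, var_j = weighted_ols(X, pvs[:, j], w)
    estimates.append(beta_j)
    variances.append(var_j)

pooled, pooled_se = combine_plausible_values(estimates, variances)

# Analysis based on the first plausible value only, for comparison.
single_beta, single_var = estimates[0], variances[0]
single_se = np.sqrt(single_var)

print("pooled coefficients:", pooled, "SE:", pooled_se)
print("single-PV coefficients:", single_beta, "SE:", single_se)
```

The pooled standard error adds the between-plausible-value variance to the average sampling variance, which is the measurement-uncertainty component that an analysis based on a single plausible value cannot capture.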
