A PRACTICAL GUIDE TO DESIGNING COST-EFFICIENT RANDOMIZED EXPERIMENTS IN EDUCATION RESEARCH: FROM PILOT STUDIES TO INTERVENTIONS AT SCALE

This study aims to illustrate how to design cost-efficient randomized experiments, from pilot studies to interventions at scale. There are two possible scenarios for the optimal design of randomized experiments: first, we may want to maximize the power rate while keeping the total cost at or under a fixed amount; second, we may want to minimize the total cost while keeping the power rate at or above a nominal rate (often 0.80). Under these two scenarios, the optimal design strategy ensures that we choose either the design with the highest power rate among all possible cost-equivalent designs (same cost but different power rates) or the design with the minimum cost among all possible power-equivalent designs (same power rate but different costs). Further cost-efficiency can be achieved by collecting more information on the subjects or groups of subjects, or by blocking subjects into homogeneous subsets. We used the Excel sheet provided by Bulus (2021) and the cosa R package (Bulus & Dong, 2021a, 2021b) to determine cost-efficient designs; scholars can justify their sample sizes in this fashion when they face resource constraints.
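A minimal sketch of these two scenarios, using the cosa.crd2r2() function from the cosa R package (Bulus & Dong, 2021b) for a two-level cluster-randomized design, is given below. All design parameters (effect size, intraclass correlation, covariate R-squared values, per-unit costs, and the budget) are illustrative assumptions rather than values from this study, and the argument names follow our reading of the package documentation; see ?cosa.crd2r2 for the authoritative interface.

    # install.packages("cosa")   # Bulus & Dong (2021b)
    library(cosa)

    # Scenario 2: minimize the total cost while keeping power at or
    # above the nominal rate of .80. Assumed design parameters: effect
    # size .25, level-2 intraclass correlation .20, one level-2
    # covariate explaining 50% of the variance at each level, 20
    # students per school, and differential per-unit costs in the
    # treatment and control conditions.
    min_cost <- cosa.crd2r2(constrain = "power", power = .80,
                            es = .25, rho2 = .20,
                            g2 = 1, r21 = .50, r22 = .50,
                            n1 = 20,
                            cn1 = c(10, 5),    # per-student cost (T, C)
                            cn2 = c(100, 50))  # per-school cost (T, C)

    # Scenario 1: maximize power while keeping the total cost at or
    # under a fixed (hypothetical) budget of 20,000 monetary units.
    max_power <- cosa.crd2r2(constrain = "cost", cost = 20000,
                             max.power = TRUE,
                             es = .25, rho2 = .20,
                             g2 = 1, r21 = .50, r22 = .50,
                             n1 = 20,
                             cn1 = c(10, 5), cn2 = c(100, 50))

Printing either object reports the optimized number of schools, the proportion assigned to treatment, and the resulting total cost and power rate; the package provides analogous functions for the other designs it supports (e.g., cosa.bird2r1() for blocked two-level designs).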

___

  • Akpınar, E. (2014). The use of interactive computer animations based on POE as a presentation tool in primary science teaching. Journal of Science Education and Technology, 23(4), 527-537. https://doi.org/10.1007/s10956-013-9482-4
  • Bloom, H. S. (2005). Randomizing groups to evaluate place-based programs. In H. S. Bloom (Ed.), Learning more from social experiments: Evolving analytic approaches (pp. 115–172). Sage.
  • Bloom, H. S., Bos, J. M., & Lee, S. W. (1999). Using cluster random assignment to measure program impacts: Statistical implications for the evaluation of education programs. Evaluation Review, 23(4), 445–469. https://doi.org/10.1177/0193841X9902300405
  • Borenstein, M., Hedges, L. V., & Rothstein, H. (2012). CRT Power [Software]. Biostat.
  • Boruch, R. F. (2005). Better evaluation for evidence-based policy: Place randomized trials in education, criminology, welfare, and health. The Annals of the American Academy of Political and Social Science, 599. https://doi.org/10.1177/0002716205275610
  • Boruch, R. F., DeMoya, D., & Snyder, B. (2002). The importance of randomized field trials in education and related areas. In F. Mosteller & R. F. Boruch (Eds.), Evidence matters: Randomized trials in education research (pp. 50–79). Brookings Institution Press.
  • Boruch, R. F., & Foley, E. (2000). The honestly experimental society. In L. Bickman (Ed.), Validity and social experiments: Donald Campbell’s legacy (pp. 193–239). Sage.
  • Bulus, M. (2021). Sample size determination and optimal design of randomized/non-equivalent pretest-posttest control-group designs. Adiyaman University Journal of Educational Sciences, 11(1), 48-69. https://doi.org/10.17984/adyuebd.941434
  • Bulus, M. (2022). Minimum detectable effect size computations for cluster-level regression discontinuity: Specifications beyond the linear functional form. Journal of Research on Educational Effectiveness, 15(1), 151-177. https://doi.org/10.1080/19345747.2021.1947425
  • Bulus, M., & Dong, N. (2021a). Bound constrained optimization of sample sizes subject to monetary restrictions in planning of multilevel randomized trials and regression discontinuity studies. The Journal of Experimental Education, 89(2), 379–401. https://doi.org/10.1080/00220973.2019.1636197
  • Bulus, M., & Dong, N. (2021b). cosa: Bound constrained optimal sample size allocation. R package version 2.1.0. https://CRAN.R-project.org/package=cosa
  • Bulus, M., & Dong, N. (2022). Consequences of ignoring a level of nesting in blocked three-level regression discontinuity designs: Power and Type I error rates [Manuscript in preparation].
  • Bulus, M., Dong, N., Kelcey, B., & Spybrook, J. (2021). PowerUpR: Power analysis tools for multilevel randomized experiments. R package version 1.1.0. https://CRAN.R-project.org/package=PowerUpR
  • Bulus, M., & Koyuncu, I. (2021). Statistical power and precision of experimental studies originated in the Republic of Turkey from 2010 to 2020: Current practices and some recommendations. Participatory Educational Research, 8(4), 24-43. https://doi.org/10.17275/per.21.77.8.4
  • Bulus, M., & Sahin, S. G. (2019). Estimation and standardization of variance parameters for planning cluster-randomized trials: A short guide for researchers. Journal of Measurement and Evaluation in Education and Psychology, 10(2), 179-201. https://doi.org/10.21031/epod.530642
  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates.
  • Cook, T. D. (2002). Randomized experiments in educational policy research: A critical examination of the reasons the educational evaluation community has offered for not doing them. Educational Evaluation and Policy Analysis, 24, 175–199. https://doi.org/10.3102/01623737024003175
  • Cook, T. D. (2005). Emergent principles for the design, implementation, and analysis of cluster-based experiments in social science. The Annals of the American Academy of Political and Social Science, 599. https://doi.org/10.1177/0002716205275738
  • Dong, N., & Maynard, R. (2013). PowerUp!: A tool for calculating minimum detectable effect sizes and minimum required sample sizes for experimental and quasi-experimental design studies. Journal of Research on Educational Effectiveness, 6(1), 24-67. https://doi.org/10.1080/19345747.2012.673143
  • Dong, N., Curenton, S. M., Bulus, M., & Ibekwe-Okafor, N. (2022). Investigating the differential effects of early child care and education in reducing gender and racial academic achievement gaps from kindergarten to 8th grade. Journal of Education. Advance online publication. https://doi.org/10.1177/00220574221104979
  • Dong, N., Spybrook, J., Kelcey, B., & Bulus, M. (2021). Power analyses for moderator effects with (non)random slopes in cluster randomized trials. Methodology, 17(2), 92-110. https://doi.org/10.5964/meth.4003
  • Hedges, L. V., & Borenstein, M. (2014). Conditional optimal design in three- and four-level experiments. Journal of Educational and Behavioral Statistics, 39(4), 257-281. https://doi.org/10.3102/1076998614534897
  • Heyard, R., & Hottenrott, H. (2021). The value of research funding for knowledge creation and dissemination: A study of SNSF Research Grants. Humanities and Social Sciences Communications, 8(1), 1-16. https://doi.org/10.1057/s41599-021-00891-x
  • Konstantopoulos, S. (2009). Incorporating cost in power analysis for three-level cluster-randomized designs. Evaluation Review, 33(4), 335-357. https://doi.org/10.1177/0193841X09337991
  • Konstantopoulos, S. (2011). Optimal sampling of units in three-level cluster randomized designs: An ANCOVA framework. Educational and Psychological Measurement, 71(5), 798-813. https://doi.org/10.1177/0013164410397186
  • Konstantopoulos, S. (2013). Optimal design in three-level block randomized designs with two levels of nesting: An ANOVA framework with random effects. Educational and Psychological Measurement, 73(5), 784-802. https://doi.org/10.1177/0013164413485752
  • Koyuncu, I., Bulus, M., & Firat, T. (2022). The moderator role of gender and socioeconomic status in the relationship between metacognitive skills and reading scores. Participatory Educational Research, 9(3), 82-97. https://doi.org/10.17275/per.22.55.9.3
  • Lakens, D. (2022). Sample size justification. Collabra: Psychology, 8(1), 33267. https://doi.org/10.1525/collabra.33267
  • Liu, X. (2003). Statistical power and optimum sample allocation ratio for treatment and control having unequal costs per unit of randomization. Journal of Educational and Behavioral Statistics, 28(3), 231-248. https://doi.org/10.3102/10769986028003231
  • Mosteller, F., & Boruch, R. F. (2002). Evidence matters: Randomized trials in education research. Brookings Institution Press.
  • Ozcan, B., & Bulus, M. (2022). Protective factors associated with academic resilience of adolescents in individualist and collectivist cultures: Evidence from PISA 2018 large scale assessment. Current Psychology, 41, 1740-1756. https://doi.org/10.1007/s12144-022-02944-z
  • R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org
  • Raudenbush, S. W. (1997). Statistical analysis and optimal design for cluster randomized trials. Psychological Methods, 2(2), 173-185. https://doi.org/10.1037/1082-989X.2.2.173
  • Raudenbush, S. W., & Liu, X. (2000). Statistical power and optimal design for multisite trials. Psychological Methods, 5(2), 199-213. https://doi.org/10.1037/1082-989X.5.2.199
  • Raudenbush, S. W., Spybrook, J., Congdon, R., Liu, X. F., Martinez, A., & Bloom, H. (2011). Optimal design software for multi-level and longitudinal research (Version 3.01) [Software].
  • van Breukelen, G. J. P., & Candel, M. J. J. M. (2018). Efficient design of cluster randomized trials with treatment-dependent costs and treatment-dependent unknown variances. Statistics in Medicine, 37(21), 3027-3046. https://doi.org/10.1002/sim.7824
  • Wu, S., Wong, W. K., & Crespi, C. M. (2017). Maximin optimal designs for cluster randomized trials. Biometrics, 73(3), 916-926. https://doi.org/10.1111/biom.12659
  • Zhu, P., Jacob, R., Bloom, H., & Xu, Z. (2011). Designing and analyzing studies that randomize schools to estimate intervention effects on student academic outcomes without classroom-level information. Educational Evaluation and Policy Analysis, 34(1), 45-68. https://doi.org/10.3102/0162373711423786
  • Zopluoglu, C. (2012). A cross-national comparison of intra-class correlation coefficient in educational achievement outcomes. Journal of Measurement and Evaluation in Education and Psychology, 3(1), 242-278.