Ensemble Based Box-Cox Transformation via Meta Analysis

The normal distribution plays a vital role in most statistical methods. The Box-Cox power transformation is the most commonly applied remedy when data are not normally distributed. In this study, a novel algorithm is proposed that combines the Box-Cox transformation parameter estimates of six well-performing techniques through a random-effects model in meta analysis. These techniques estimate the parameter using goodness-of-fit tests for normality: the Anderson–Darling, Lilliefors, Cramér–von Mises, Shapiro–Wilk, Jarque–Bera and Shapiro–Francia tests. To estimate the Box-Cox parameter, we consider all 63 possible non-empty combinations of the estimates produced by these six methods. A Monte Carlo simulation study is conducted to identify which combination performs best. The simulation study indicates that the combination of the Shapiro–Wilk, Jarque–Bera and Anderson–Darling tests performs well in most of the simulation scenarios, which are constructed under different transformation parameters and sample sizes. This combination is therefore proposed as the ensemble based Box-Cox transformation via meta analysis. The proposed approach is applied to white blood cell count data of leukaemia patients, which are not normally distributed. The proposed methodology is also available for public use through the “boxcoxmeta” function in the AID R package.
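The pooling idea described above can be illustrated with a minimal Python sketch. It is not the AID implementation: the abstract specifies only a random-effects model, so the DerSimonian-Laird estimator of between-method variance is an assumption here, as are all function names, the test abbreviations, and the placeholder estimates and standard errors (which are not results from the study). The sketch enumerates the 63 non-empty combinations of the six tests, pools per-test estimates of the transformation parameter, and applies the Box-Cox transformation with the pooled value.

```python
import math
from itertools import combinations

# Abbreviations for the six normality tests named in the abstract (labels are illustrative).
TESTS = ["AD", "LF", "CVM", "SW", "JB", "SF"]


def all_combinations(items):
    """Return every non-empty subset of items, ordered by subset size."""
    return [c for r in range(1, len(items) + 1) for c in combinations(items, r)]


def box_cox(y, lam):
    """Box-Cox power transformation of a single positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive data")
    return math.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def pool_random_effects(estimates, std_errors):
    """Pool estimates with a DerSimonian-Laird random-effects model (assumed estimator).

    Returns (pooled estimate, standard error of the pooled estimate, tau^2).
    """
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]  # inverse-variance (fixed-effect) weights
    sum_w = sum(w)
    fixed = sum(wi * e for wi, e in zip(w, estimates)) / sum_w
    # Cochran's Q measures heterogeneity among the per-test estimates.
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0
    w_star = [1.0 / (se ** 2 + tau2) for se in std_errors]  # random-effects weights
    pooled = sum(wi * e for wi, e in zip(w_star, estimates)) / sum(w_star)
    return pooled, math.sqrt(1.0 / sum(w_star)), tau2


# Placeholder inputs for the proposed SW + JB + AD combination: made-up numbers
# for illustration only, NOT estimates reported by the study.
lambda_hat = {"SW": 0.2, "JB": 0.4, "AD": 0.6}
lambda_se = {"SW": 0.1, "JB": 0.1, "AD": 0.1}

combo = ("SW", "JB", "AD")
pooled_lambda, pooled_se, tau2 = pool_random_effects(
    [lambda_hat[t] for t in combo], [lambda_se[t] for t in combo]
)
transformed = [box_cox(y, pooled_lambda) for y in (1.0, 2.5, 10.0)]
```

How the per-test standard errors are obtained is left open in this sketch; in practice they would have to come from the estimation procedure itself, for example by resampling.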
