Which Test Is More Reliable for Testing the Statistical Significance of Canonical Correlation Coefficients?
In this study, the Wilks’ Λ (W), Hotelling-Lawley Trace (H), and Pillai’s Trace (P) tests, which are used to test the statistical significance of canonical correlation coefficients, were compared in terms of actual type I error rate. Based on 10,000 simulation experiments, when samples were drawn from multivariate distributions that were normal or that deviated slightly or moderately from normality, the W test was conservative and protected the actual type I error rate in all cases. However, under extreme departures from normality, the actual type I error rates of the W test exceeded the upper limit of Bradley’s criterion (4.50-5.50%) in almost all cases. The H and P tests, in contrast, generally produced actual type I error rates that fell outside Bradley’s limits.
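As an illustration of the quantities compared above, the following minimal Python sketch computes the three statistics from a set of sample canonical correlations and checks an empirical type I error rate against the 4.50-5.50% Bradley limits quoted in the abstract. The function names and the rejection counts in the demo are hypothetical and are not taken from the study. The sketch also leaves out the step that converts each statistic into a p-value, since the abstract does not describe which approximation was used.

```python
import math


def multivariate_statistics(canonical_corrs):
    """Return (Wilks' Lambda, Pillai's trace, Hotelling-Lawley trace).

    `canonical_corrs` is a sequence of sample canonical correlations r_i,
    each with absolute value strictly below 1.
    """
    squared = [r * r for r in canonical_corrs]
    if any(r2 >= 1.0 for r2 in squared):
        raise ValueError("canonical correlations must satisfy |r| < 1")
    wilks = math.prod(1.0 - r2 for r2 in squared)               # prod(1 - r_i^2)
    pillai = sum(squared)                                        # sum(r_i^2)
    hotelling_lawley = sum(r2 / (1.0 - r2) for r2 in squared)   # sum(r_i^2 / (1 - r_i^2))
    return wilks, pillai, hotelling_lawley


def actual_type1_error_rate(n_rejections, n_simulations):
    """Proportion of simulated null data sets in which the test rejected."""
    return n_rejections / n_simulations


def within_bradley_limits(rate, lower=0.045, upper=0.055):
    """True if `rate` lies inside the 4.50-5.50% limits quoted in the abstract."""
    return lower <= rate <= upper


if __name__ == "__main__":
    # Hypothetical canonical correlations, for illustration only.
    w, p, h = multivariate_statistics([0.5, 0.2])
    print(f"Wilks = {w:.4f}, Pillai = {p:.4f}, Hotelling-Lawley = {h:.4f}")

    # Hypothetical rejection counts out of 10,000 simulated null data sets.
    for label, rejections in [("test A", 512), ("test B", 631)]:
        rate = actual_type1_error_rate(rejections, 10_000)
        print(label, f"{rate:.4f}", "inside limits:", within_bradley_limits(rate))
```

In a full simulation, each of the 10,000 replications would draw a sample under the null hypothesis of no association, convert each statistic to a p-value, and count the replications in which the p-value falls below the nominal level.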
___
- Anderson, T. W. (1984). An Introduction to Multivariate Statistical Analysis (2nd edition). John Wiley and Sons.
- Anderson, T. W. (1999). Asymptotic theory for canonical correlation analysis. Journal of Multivariate Analysis, 70(1), 1-29.
- Andrew, G., Arora, R., Bilmes, J. & Livescu, K. (2013). Deep canonical correlation analysis. In International Conference on Machine Learning (pp. 1247-1255).
- Baggaley, A. R. (1981). Multivariate analysis: An introduction for consumers of behavioral research. Evaluation Review, 5, 123-131.
- Bradley, J. V. (1978). Robustness? British Journal of Mathematical and Statistical Psychology, 31(2), 144-152.
- Carroll, J. D. (1968). Generalization of canonical correlation analysis to three or more sets of variables. Proceedings of the 76th Annual Convention of the American Psychological Association, 3, 227-228.
- Ferreira, M. A. & Purcell, S. M. (2009). A multivariate test of association. Bioinformatics, 25(1), 132-133.
- Fleishman, A. I. (1978). A method for simulating non-normal distributions. Psychometrika, 43, 521-532.
- Gauch, H. G. & Wentworth, T. R. (1976). Canonical correlation analysis as an ordination technique. Vegetatio, 33(1), 17-22.
- Hotelling, H. (1936). Relations between two sets of variates. Biometrika, 28(3/4), 321-377.
- Hotelling, H. (1951). A generalized T-test and measure of multivariate dispersion. In Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability (pp. 23-41). Berkeley, CA, USA.
- Kerlinger, F. N. & Pedhazur, E. J. (1973). Multiple regression in behavioral research. New York, NY: Holt, Rinehart & Winston.
- Knapp, T. R. (1978). Canonical correlation analysis: A general parametric significance-testing system. Psychological Bulletin, 85(2), 410.
- Lawley, D. N. (1938). A generalization of Fisher’s z test. Biometrika, 30(1-2), 180-187.
- Meloun, M. & Militky, J. (2011). Statistical data analysis: A practical guide. Woodhead Publishing, Limited.
- Pillai, K. C. S. (1955). Some new test criteria in multivariate analysis. The Annals of Mathematical Statistics, 26(1), 117-121.
- R Core Team. (2019). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. URL http://www.R-project.org/
- Rao, C. R. (1973). Linear Statistical Inference and Its Applications. 2nd ed. New York: John Wiley & Sons.
- Sharma, S. (1996). Applied Multivariate Techniques: Canonical Correlation, 391-418. John Wiley and Sons Inc., USA.
- Stewart, D. & Love, W. (1968). A general canonical correlation index. Psychological Bulletin, 70(3, Pt. 1), 160.
- Takane, Y., Yanai, H. & Hwang, H. (2006). An improved method for generalized constrained canonical correlation analysis. Computational Statistics & Data Analysis, 50(1), 221-241.
- Tang, C. S. & Ferreira, M. A. (2012). A gene-based test of association using canonical correlation analysis. Bioinformatics, 28(6), 845-850.
- Thompson, B. (1984). Canonical correlation analysis: Uses and interpretation. Newbury Park, CA: Sage.
- Vale, D. C. & Maurelli, V. A. (1983). Simulating multivariate nonnormal distributions. Psychometrika, 48, 465-471.
- Van De Velden, M. & Bijmolt, T. H. (2006). Generalized canonical correlation analysis of matrices with missing rows: a simulation study. Psychometrika, 71(2), 323-331.
- Waller, N. G. (2016). Fungible correlation matrices: A method for generating nonsingular, singular, and improper correlation matrices for Monte Carlo research. Multivariate Behavioral Research, 51(4), 554-568.
- Wilks, S. S. (1932). Certain generalizations in the analysis of variance. Biometrika, 24(3-4), 471-494.
- Yanai, H. & Takane, Y. (1992). Canonical correlation analysis with linear constraints. Linear Algebra and Its Applications, 176, 75-89.