Dengesiz Verilerle Çalışılması Durumunda Gruplardaki Gözlem Sayıları Kaç Olmalıdır?

Deneme planlamasında en önemli aşamalardan birisi, gerekli olan örnek hacminin belirlenmesidir. Örnek hacminin gereğinden fazla olması kaynakların israfına neden olmaktadır. Gereğinden az olması durumunda ise parametre tahminlerinde oldukça büyük sapmalar meydana gelmekte ve karşılaştırılacak muamele grup ortalamaları arasında gerçekte var olan farklılıklar ortaya konulamamaktadır. Karşılaştırılacak gruplardaki gözlem sayılarının eşit olması istenen bir durumdur. Ancak, uygulamada her zaman eşit hacimli örneklerle çalışmak mümkün olamamaktadır. Bu çalışmada, dengesiz denemelerin söz konusu olması durumunda hangi örnek hacmi kombinasyonlarının % 80’lik güç değerini sağlayabildiklerinin belirlenmesi amacıyla bir simülasyon çalışması yapılmıştır. Yapılan 50,000 simülasyon denemesi sonucunda, pek çok örnek hacmi kombinasyonu ile çalışılması durumunda % 80’lik güç değerine ulaşıldığı görülmüştür. Ancak, örnek hacimlerindeki dengesizliğin artması, araştırıcıyı daha fazla gözlem ile çalışmaya zorlamaktadır. Mesela varyanslar homojen iken n= 16, 16, 16 örnek hacmi kombinasyonu toplam 48 gözlem ile varılan güç değerine, dengesiz denemelerin söz konusu olması durumunda ancak n= 12, 30, 30 ve n= 12, 24, 36 toplam 72 gözlem örnek hacmi kombinasyonu ile çalışılması durumunda ulaşılmaktadır. Varyansların heterojenlik derecesinin artmasına paralel olarak örnek hacimlerindeki dengesizliklerin testin gücüne olan etkilerinin daha da belirginleştiği görülmüştür.

Anahtar Kelimeler:

Uygun örnek hacmi, testin gücü, etki büyüklüğü, dengesiz veriler

How Many Samples are Enough When Data are Unbalanced?

A crucial step in designing a study is determining the required sample size, that is, the number of participants or observations. Taking too many samples wastes time and resources in both collecting and analyzing the data. Taking too few, on the other hand, can make the whole study meaningless or lead to errors in interpretation. Equal group sizes are preferable, but this is not always achievable in practice. The aim of this study was to determine which sample size combinations reach a test power of about 80% when the data are unbalanced. For this purpose, a simulation study with 50,000 simulation runs was conducted. The results showed that many different sample size combinations make it possible to reach about 80% test power. However, as the group sizes became more unequal, more observations were needed to reach that power. For instance, when the variances were homogeneous, the test power obtained with 16 observations in each group (n = 16, 16, 16; 48 observations in total) could be reached with unequal sample sizes only by using 72 observations in total (n = 12, 30, 30 or n = 12, 24, 36). As the variances became more heterogeneous, the effect of unbalanced data on test power became more pronounced.

Keywords:

Optimum sample size, test power, effect size, unbalanced data
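
The kind of simulation summarized in the abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not state the test statistic, the population distributions, or the effect size used, so the one-way ANOVA F test, the normal populations, the group means (0, 0, 1), the unit standard deviations, the 5% significance level, and the function name `simulated_power` are all assumptions chosen for illustration. The number of runs is also kept far below 50,000 so that the sketch runs quickly.

```python
import numpy as np
from scipy import stats


def simulated_power(group_sizes, means, sds, n_sim=5000, alpha=0.05, seed=0):
    """Estimate the rejection rate of the one-way ANOVA F test by simulation.

    All settings here are illustrative assumptions, not values from the study.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sim):
        # Draw one normal sample per group with its own size, mean, and SD.
        samples = [
            rng.normal(mean, sd, size)
            for size, mean, sd in zip(group_sizes, means, sds)
        ]
        _, p_value = stats.f_oneway(*samples)
        rejections += int(p_value < alpha)
    # Proportion of runs in which the null hypothesis was rejected.
    return rejections / n_sim


# Hypothetical scenario: three groups, the third mean shifted by one SD.
means = (0.0, 0.0, 1.0)
sds = (1.0, 1.0, 1.0)  # homogeneous variances

for sizes in [(16, 16, 16), (12, 30, 30), (12, 24, 36)]:
    power = simulated_power(sizes, means, sds)
    print(f"n = {sizes}, total = {sum(sizes)}, estimated power = {power:.3f}")
```

With unequal group sizes the estimate also depends on which group receives the shifted mean, so a fuller study would vary that assignment as well as the ratio of the standard deviations in `sds` to examine variance heterogeneity.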
