A Comparative Research on Data Analysis with Factorial ANOVA, Logistic Regression and CHAID Classification Tree Methods

The extraction of information hidden in large and complex data is called data mining. Numerous methods have been developed for statistical data analysis in the context of data mining. When these methods are grouped into conventional (classical) and current methods, factorial ANOVA (FANOVA) and Logistic Regression (LR) count as conventional methods, while the decision trees known as Classification Tree (CT) and Regression Tree (RT) count as current methods. The method to be used in statistical data analysis depends directly on the researcher's hypothesis (i.e., purpose) and on the variable types, so the choice of data analysis method is important. In this regard, studies that examine methods comparatively provide useful guidance. In this study, a dataset on which inferences could be made by the ANOVA, LR, and CT methods was analyzed. With this dataset, the relationship between birth type (single or twin; the dependent variable) and yield year and maternal age (the independent variables) in an Awassi sheep flock was examined. The findings of each method were interpreted in the way specific to that method. The methods were compared in terms of model fit, the similarities and differences in the information they presented, and how well they explained the relationship between the dependent and independent variables. It was concluded that each method offered different inferences depending on purpose and perspective. Researchers are therefore advised to choose the data analysis method appropriate to their goals while taking the data structure into account.
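As an illustration of the kind of test behind a CHAID classification tree, which selects its splits using chi-square tests of independence between the dependent variable and each candidate predictor, the following minimal Python sketch computes the Pearson chi-square statistic for a contingency table. The function name `chi_square_statistic` and the counts are hypothetical and are not the study's data or software (the study lists SPSS); the sketch covers only the statistic, not the category merging or the adjusted p-values that a full CHAID procedure applies.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts, NOT the study's data:
# rows = three maternal age groups, columns = (single births, twin births)
hypothetical_counts = [
    [40, 10],
    [30, 20],
    [20, 30],
]

statistic, dof = chi_square_statistic(hypothetical_counts)
print(f"chi-square = {statistic:.3f}, df = {dof}")
```

A larger statistic for a given number of degrees of freedom indicates a stronger association between the predictor and birth type, which is why a predictor of this kind would be a candidate for the first split of the tree.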
