Performance Analysis of Machine Learning Algorithms and Feature Selection Methods on Hepatitis Disease

In this study, several machine learning classification techniques are applied to the Hepatitis data set obtained from the UCI Machine Learning Repository. The Naïve Bayes Classifier, Logistic Regression and the J48 Decision Tree are used as classification algorithms and are compared under different filter-based feature selection methods. For filter-based feature selection, Cfs Subset Eval, Info Gain Attribute Eval and Principal Components are used, and their performance is evaluated in terms of precision, recall, F-Measure and ROC Area. Among the classification algorithms used, the Naïve Bayes Classifier achieves higher classification accuracy on the Hepatitis data set than the others, both with and without filter-based feature selection. Moreover, we conclude that Principal Components is the best filter-based feature selection method, because it yields the highest classification accuracy for hepatitis patients.
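To make two of the ingredients named in the abstract concrete, the following is a minimal sketch in plain Python, not the Weka-based workflow used in the study. It shows how an information-gain filter scores a discrete attribute against the class label, and how precision, recall and F-Measure follow from predicted and true labels. The function names and the toy records are illustrative assumptions; the values are invented for the example and are not taken from the UCI Hepatitis data.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Class entropy minus the expected class entropy after
    splitting the records on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def precision_recall_f(y_true, y_pred, positive):
    """Precision, recall and F-Measure for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure


# Toy records (hypothetical values, for illustration only).
labels = ["die", "die", "live", "live", "live", "live"]
features = {
    "ascites": ["yes", "yes", "no", "no", "no", "yes"],
    "sex": ["m", "f", "m", "f", "m", "f"],
}

# Filter-style ranking: score each attribute independently of any classifier.
ranking = sorted(features,
                 key=lambda name: information_gain(features[name], labels),
                 reverse=True)
print("attributes ranked by information gain:", ranking)

# Metrics for a hypothetical set of predictions.
print(precision_recall_f(["die", "die", "live", "live"],
                         ["die", "live", "live", "die"],
                         positive="die"))
```

A filter method of this kind scores attributes before any classifier is trained, which is why the same ranking can be reused across Naïve Bayes, Logistic Regression and a decision tree.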

___

International Journal of Multidisciplinary Studies and Innovative Technologies
  • ISSN: 2602-4888
  • Publication frequency: 2 issues per year
  • First published: 2017
  • Publisher: SET Teknoloji