A COMPARISON OF DIFFERENT NAIVE BAYES TECHNIQUES FOR SOFTWARE DEFECT CLASSIFICATION

   This study presents a comparative analysis of software-metric-based defect classification using the Absolute Correlation Weighted Naive Bayes method, the Naive Bayes method, and several smoothing techniques (Jelinek-Mercer, Dirichlet, Two-Stage) applied to the Naive Bayes classifier. The performance of the models was examined on 3 data sets described by the Chidamber & Kemerer and LOC metric sets. For the data sets and metric groups used, the results showed that some of the smoothing techniques applied to Naive Bayes (Dirichlet, Two-Stage) improved classification performance compared with the other classification methods. Classification accuracies above 90% were obtained with the Dirichlet and Two-Stage techniques for the DIT-NOC-CBO, DIT-NOC-LCOM, DIT-CBO-LCOM, and NOC-CBO-LCOM metric groups.
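   For orientation, the three smoothing techniques named above can be sketched in their usual count-based form from the language-modeling literature. This is a minimal illustration, not the study's implementation: the function names, the parameter values for lam and mu, and the assumption that each metric is discretized into countable feature values are all hypothetical.

```python
# Hypothetical sketch: smoothed estimates of P(feature value | class).
# count       - occurrences of the feature value within the class
# class_total - total feature occurrences in the class (must be > 0 for JM)
# p_bg        - background probability of the value over the whole data set
# lam, mu     - smoothing parameters; the defaults are illustrative only


def jelinek_mercer(count, class_total, p_bg, lam=0.5):
    """Linear interpolation between the class estimate and the background."""
    return (1 - lam) * (count / class_total) + lam * p_bg


def dirichlet(count, class_total, p_bg, mu=10.0):
    """Adds mu pseudo-counts distributed according to the background."""
    return (count + mu * p_bg) / (class_total + mu)


def two_stage(count, class_total, p_bg, lam=0.5, mu=10.0):
    """Dirichlet smoothing followed by Jelinek-Mercer interpolation."""
    return (1 - lam) * dirichlet(count, class_total, p_bg, mu) + lam * p_bg
```

   In all three cases a feature value never observed in a class still receives a nonzero probability, so a single unseen value does not zero out the Naive Bayes product.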
