Cosine Similarity-based Cross-project Defect Prediction

Cross-project defect prediction has attracted researchers' attention, particularly with respect to metric heterogeneity, and new methods are needed in this field. Performing defect prediction across different projects gives developers valuable information. In this work, CSCDP, an algorithm that matches metrics on the basis of cosine similarity, is presented for cross-project defect prediction. The method was tested on 36 different project data sets with three classifiers. According to the results, the neural network achieved a higher mean prediction performance than the other classifiers. In addition, selecting training data sets on the basis of sparsity analysis had a favorable effect on testing performance. Finally, cross-project prediction with CSCDP reduced the classification error by up to 0.65 in terms of F-score for the Random Forest algorithm.
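
The abstract does not describe how CSCDP performs its matching, so the following is only a rough illustration of cosine-similarity-based metric matching in general, not the authors' implementation. It assumes that each metric column is first summarized as a fixed-length quantile profile, so that projects with different numbers of modules can be compared, and that each target metric is paired with the most similar source metric. The function names (`quantile_profile`, `cosine_similarity`, `match_metrics`), the quantile step, and the sample values are all hypothetical.

```python
import math


def quantile_profile(values, n_points=5):
    """Summarize a metric column as n_points evenly spaced quantiles.

    Assumption for this sketch: the profile only serves to make columns
    of different lengths comparable; it is not taken from the paper.
    """
    xs = sorted(float(v) for v in values)
    if not xs or n_points < 2:
        raise ValueError("need at least one value and two profile points")
    profile = []
    for i in range(n_points):
        pos = (len(xs) - 1) * i / (n_points - 1)
        lo, hi = math.floor(pos), math.ceil(pos)
        # Linear interpolation between the two neighbouring order statistics.
        profile.append(xs[lo] + (xs[hi] - xs[lo]) * (pos - lo))
    return profile


def cosine_similarity(u, v):
    """Return u.v / (|u| * |v|), or 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def match_metrics(source, target, n_points=5):
    """Pair each target metric with its most similar source metric.

    source, target: dicts mapping a metric name to a list of values.
    Returns a dict: target metric -> (source metric, similarity).
    """
    src = {name: quantile_profile(vals, n_points) for name, vals in source.items()}
    matches = {}
    for t_name, t_vals in target.items():
        t_prof = quantile_profile(t_vals, n_points)
        scores = {s_name: cosine_similarity(s_prof, t_prof)
                  for s_name, s_prof in src.items()}
        best = max(scores, key=scores.get)
        matches[t_name] = (best, scores[best])
    return matches


if __name__ == "__main__":
    # Hypothetical metric columns from two projects with different metric names.
    source = {"size": [1, 2, 3, 4, 100], "depth": [5, 5, 6, 6, 7]}
    target = {"loc": [2, 3, 5, 8, 150, 4], "inheritance": [3, 3, 3, 4, 4, 4]}
    for t_name, (s_name, score) in match_metrics(source, target).items():
        print(f"{t_name} -> {s_name} (cosine similarity {score:.3f})")
```

Because cosine similarity ignores vector magnitude, this sketch matches metrics by the shape of their distributions rather than by their scale. A real pipeline would probably also need a similarity threshold and a rule for resolving several target metrics that map to the same source metric.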

Keywords:

defect prediction, cosine similarity, cross-project
