A New Instance Selection Method for Enlarging Margins Between Classes
(Turkish title: Sınıflar Arası Kenar Payını Genişletmek İçin Yeni Bir Örnek Seçim Algoritması)

Discarding superfluous instances from a data set shortens the learning process and, by eliminating noisy data, also improves learning performance. Instance selection methods are commonly used for both purposes. In this paper, we propose a new supervised instance selection algorithm called Border Instances Reduction using Classes Handily (BIRCH). BIRCH considers the k nearest neighbors of each instance and selects only those instances whose neighbors all come from the same class, that is, instances that have no neighbors from different classes. We compared BIRCH with one traditional and four state-of-the-art instance selection algorithms on fifteen data sets from various domains. The empirical results show that BIRCH achieves a good trade-off between accuracy rate and reduction rate when the number of neighbors is tuned. Furthermore, the proposed method guarantees a high classification accuracy. The source code of the proposed algorithm is available at https://github.com/fatihaydin1/BIRCH.
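The selection rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). It assumes Euclidean distance, a brute-force neighbor search, and that "the same class" means the class of the instance itself. The function name `select_instances` and the toy data are hypothetical.

```python
import numpy as np


def select_instances(X, y, k=3):
    """Return indices of instances whose k nearest neighbors all share
    the instance's own class label (assumed reading of the abstract)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)

    # Pairwise squared Euclidean distances (brute force, O(n^2) memory).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # an instance is not its own neighbor

    # Indices of the k nearest neighbors of every instance.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Keep an instance only if every neighbor carries the same label.
    keep = np.all(y[neighbors] == y[:, None], axis=1)
    return np.flatnonzero(keep)


# Toy example: two classes on a line, touching at the border between 3 and 4.
X = [[0], [1], [2], [3], [4], [5], [6], [7]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

kept = select_instances(X, y, k=2)
reduction_rate = 1 - len(kept) / len(y)
print(kept, reduction_rate)  # kept indices 0, 1, 2, 5, 6, 7; reduction 0.25
```

In this toy data the two border instances (at 3 and 4) each have a neighbor from the other class and are discarded, which widens the gap between the remaining classes. Raising `k` makes the rule stricter, so it removes more instances and shifts the balance toward a higher reduction rate.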

___

Zeki Sistemler Teori ve Uygulamaları Dergisi (Journal of Intelligent Systems: Theory and Applications)
  • First published: 2018
  • Publisher: Özer UYGUN