INVESTIGATION OF EFFECTIVE KERNEL FUNCTIONS FOR SUPPORT VECTOR CLUSTERING

Clustering is an effective tool that uncovers unknown patterns in data and separates the data into distinct classes. However, in conventional clustering algorithms such as k-means, k-NN, and fuzzy c-means, the choice of the number of clusters, which varies with the data, is uncertain. Moreover, the data sets to which clustering algorithms are applied usually have nonlinear boundaries between clusters. Determining these nonlinear boundaries in the input space is a complex problem. To address these issues, kernel-based clustering methods that automatically determine the number and boundaries of clusters have been developed in recent years. In particular, the Support Vector Clustering (SVC) algorithm has attracted great interest in data analysis owing to features such as automatically determining the number of clusters and revealing nonlinear boundaries according to the Gaussian kernel parameter. The clusters and inter-cluster boundaries identified by SVC can vary depending on the choice of kernel function and its parameters. The choice of kernel function therefore plays an important role. In this study, for the first time, two different kernel functions (Cauchy and Laplacian) are applied within the SVC framework and their performance is evaluated. The results show that the Laplacian kernel function performs better than the Gaussian and Cauchy kernel functions.

EXPLORING EFFICIENT KERNEL FUNCTIONS FOR SUPPORT VECTOR CLUSTERING

Clustering is an effective tool that divides data into different classes, revealing internal and previously unknown patterns. However, in conventional clustering algorithms such as k-means, k-NN, and fuzzy c-means, the appropriate number of clusters is uncertain and varies from one data set to another. Furthermore, the data sets to which clustering algorithms are applied generally have nonlinear boundaries between clusters, and determining these boundaries in the input space is a complex problem. To overcome these problems, kernel-based clustering methods that automatically determine the number and boundaries of clusters have been developed in recent years. In particular, the Support Vector Clustering (SVC) algorithm has received great attention in data analysis because it determines the number of clusters automatically and recognizes nonlinear boundaries based on the Gaussian kernel parameter. The clusters and cluster boundaries produced by SVC may vary depending on the choice of kernel function and its parameters, so this choice plays a significant role. In this study, for the first time, two different kernel functions (Cauchy and Laplacian) are implemented within the SVC framework and their performance is evaluated. The Laplacian kernel function was observed to perform better than the Gaussian and Cauchy kernel functions.
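
To make the comparison concrete, the three kernel functions named above can be sketched as follows. This is only an illustrative sketch: the abstract does not give the exact parameterizations used in the study, so the common textbook forms are assumed here, with a single width parameter `q` (a hypothetical name) and the Euclidean norm in the Laplacian kernel. The helper `kernel_matrix` is likewise an assumption, added only to show how a kernel choice feeds into a kernel-based method such as SVC.

```python
import math


def sq_euclidean(x, y):
    """Squared Euclidean distance between two equal-length points."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def gaussian_kernel(x, y, q=1.0):
    # Assumed form: exp(-q * ||x - y||^2)
    return math.exp(-q * sq_euclidean(x, y))


def laplacian_kernel(x, y, q=1.0):
    # Assumed form: exp(-q * ||x - y||), using the Euclidean norm.
    # Some libraries use the L1 norm instead.
    return math.exp(-q * math.sqrt(sq_euclidean(x, y)))


def cauchy_kernel(x, y, q=1.0):
    # Assumed form: 1 / (1 + q * ||x - y||^2)
    return 1.0 / (1.0 + q * sq_euclidean(x, y))


def kernel_matrix(points, kernel, q=1.0):
    """Pairwise kernel values; a kernel method consumes this matrix."""
    return [[kernel(a, b, q) for b in points] for a in points]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
    kernels = [
        ("Gaussian", gaussian_kernel),
        ("Laplacian", laplacian_kernel),
        ("Cauchy", cauchy_kernel),
    ]
    for name, k in kernels:
        print(name)
        for row in kernel_matrix(pts, k, q=0.5):
            print("  ", [round(v, 4) for v in row])
```

All three kernels equal 1 when the two points coincide and decay with distance, but at different rates: under these assumed forms the Cauchy kernel has the heaviest tail and the Gaussian the lightest, which is one reason the resulting cluster boundaries can differ for the same data.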
