Hierarchical Clustering Ensemble with Different Linkage Methods

Clustering ensembles have become a preferred technique in recent years because of the high clustering performance they provide. This study proposes a new approach called Link-based Hierarchical Clustering Ensemble (LHCE). In the proposed approach, each ensemble member performs hierarchical clustering with a different linkage method, and the members then reach a joint decision by majority voting. The linkage methods used are single linkage, complete linkage, average linkage, centroid linkage, Ward's method, neighbor joining, and adjusted complete linkage. Hierarchical clustering ensembles of different sizes were also investigated and compared with one another. In the experiments, the hierarchical clustering ensembles were applied to 8 different datasets and produced better results than a single clustering algorithm.
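
The idea described in the abstract (several hierarchical clusterers that differ only in their linkage method, combined by majority voting) can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only three of the seven linkage methods (single, complete, average), uses a naive agglomerative procedure, and assumes that cluster labels are aligned to the first member by brute-force permutation search before voting, a step the abstract does not specify. The names `agglomerative`, `align`, and `ensemble` are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations


def agglomerative(points, k, linkage):
    """Naive agglomerative clustering; returns one cluster label per point."""
    clusters = [[i] for i in range(len(points))]

    def dist(a, b):
        # All pairwise distances between the members of clusters a and b.
        d = [math.dist(points[i], points[j]) for i in a for j in b]
        if linkage == "single":
            return min(d)
        if linkage == "complete":
            return max(d)
        return sum(d) / len(d)  # average linkage

    while len(clusters) > k:
        # Find the closest pair of clusters (a < b) and merge b into a.
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        a, b = min(pairs, key=lambda p: dist(clusters[p[0]], clusters[p[1]]))
        merged = clusters.pop(b)
        clusters[a].extend(merged)

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def align(labels, reference, k):
    """Relabel `labels` to agree with `reference` as much as possible.

    Assumption: brute force over all k! label permutations, which is only
    practical for small k.
    """
    best = max(permutations(range(k)),
               key=lambda perm: sum(perm[l] == r
                                    for l, r in zip(labels, reference)))
    return [best[l] for l in labels]


def ensemble(points, k, linkages=("single", "complete", "average")):
    """Cluster with each linkage method, align labels, then majority-vote."""
    members = [agglomerative(points, k, m) for m in linkages]
    reference = members[0]
    aligned = [reference] + [align(m, reference, k) for m in members[1:]]
    # For each point, keep the label most members agree on.
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*aligned)]


# Two well-separated groups of three points each (toy data, not from the study).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels = ensemble(points, k=2)
print(labels)
```

On this toy data all three members agree, so the vote simply confirms the two groups. The alignment step matters when members number their clusters differently, since raw label values from independent clusterings are not comparable.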
