Analysis of Fuzzy and Possibilistic C-Means Clustering Algorithms on Protein Localization with Ecoli Data

Clustering is the process of partitioning a data set into disjoint clusters so that data within the same cluster are similar, while data in different clusters are dissimilar. Fuzzy clustering algorithms are built on the c-means family, the most powerful member of which is the fuzzy c-means (FCM) algorithm. However, FCM is sensitive to outliers. In this study, we examined, on a real data set, three algorithms developed to overcome this weakness of FCM: the possibilistic c-means (PCM), fuzzy possibilistic c-means (FPCM), and possibilistic fuzzy c-means (PFCM) algorithms. To compare these algorithms, their iteration counts and completion times were measured.

Keywords:

Fuzzy c-means, Possibilistic c-means, Fuzzy possibilistic c-means, Possibilistic fuzzy c-means
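
To make the comparison criteria in the abstract concrete, the following is a minimal sketch of the baseline FCM loop that also reports the iteration count and the elapsed time. It is not the authors' implementation: it uses the standard FCM update equations, and the function name `fuzzy_c_means`, the parameters `tol`, `max_iter`, and `seed`, and the toy points are all assumptions made for illustration (the Ecoli data are not used here). PCM, FPCM, and PFCM are not covered by this sketch.

```python
import math
import random
import time


def fuzzy_c_means(points, c, m=2.0, tol=1e-5, max_iter=300, seed=0):
    """Hypothetical helper: return (centers, memberships, iterations, seconds)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships; each point's row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([value / total for value in row])

    start = time.perf_counter()
    centers = []
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Center update: mean of the points weighted by u_ik ** m.
        centers = []
        for i in range(c):
            weights = [u[k][i] ** m for k in range(n)]
            total = sum(weights)
            centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])
        # Membership update from the distances to the new centers.
        change = 0.0
        for k, p in enumerate(points):
            # Clamp distances so a point sitting on a center does not divide by zero.
            dists = [max(math.dist(p, v), 1e-12) for v in centers]
            for i in range(c):
                new = 1.0 / sum((dists[i] / dj) ** (2.0 / (m - 1.0)) for dj in dists)
                change = max(change, abs(new - u[k][i]))
                u[k][i] = new
        # Stop once no membership moved by more than tol.
        if change < tol:
            break
    return centers, u, iterations, time.perf_counter() - start


# Toy data (assumed for illustration): two well-separated groups of three points.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, u, iterations, seconds = fuzzy_c_means(data, c=2)
labels = [row.index(max(row)) for row in u]
print("iterations:", iterations, "seconds:", seconds, "labels:", labels)
```

Because every membership row is forced to sum to 1, an outlier still receives substantial membership in some cluster and pulls the centers toward itself, which is the sensitivity the abstract refers to. Timing only the update loop, as done here, is one possible way to measure completion time; the study may have measured it differently.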
