GUSTAFSON-KESSEL AND FUZZY C-MEANS ALGORITHMS BY COLON CANCER DATA IN FUZZY CLUSTERING

Microarray technology makes it possible to measure the expression levels of large numbers of genes simultaneously and in a short time. Clustering techniques are frequently used to analyze microarray data. In this study, the fuzzy c-means and Gustafson-Kessel algorithms, which were developed as alternative statistical methods for cases where classical cluster analysis is insufficient, are applied to a colon cancer dataset. Because the number of clusters was not known in advance, the optimal number of clusters was determined first, using validity indices and the elbow criterion for both algorithms. In the experiments, the elbow was located at c = 3 for both algorithms. Finally, a graphical comparison shows that the fuzzy c-means algorithm produces better clusters for the colon cancer dataset.
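To make the elbow criterion concrete, the following is a minimal sketch of fuzzy c-means with Euclidean distance, run for several values of c on synthetic data. It is not the implementation used in the study: the function names, the fuzzifier m = 2, the initialization from random data points, and the three-blob toy dataset are all assumptions chosen for illustration, and the colon cancer data are not used. A Gustafson-Kessel variant would replace the Euclidean distance with a cluster-specific distance derived from a fuzzy covariance matrix, which is omitted here.

```python
import numpy as np

def memberships(X, centers, m):
    """Return the membership matrix U (n x c) and squared distances d2 (n x c)."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, 1e-12)              # guard against division by zero
    inv = d2 ** (-1.0 / (m - 1.0))          # u_ik is proportional to d_ik^(-2/(m-1))
    return inv / inv.sum(axis=1, keepdims=True), d2

def fuzzy_c_means(X, c, m=2.0, max_iter=300, tol=1e-6, seed=0):
    """Illustrative fuzzy c-means; returns (centers, U, objective)."""
    rng = np.random.default_rng(seed)
    # Assumption: start from c distinct data points chosen at random.
    centers = X[rng.choice(X.shape[0], size=c, replace=False)].astype(float)
    for _ in range(max_iter):
        U, d2 = memberships(X, centers, m)
        W = U ** m
        new_centers = (W.T @ X) / W.sum(axis=0)[:, None]   # weighted means
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    U, d2 = memberships(X, centers, m)
    objective = float(((U ** m) * d2).sum())  # FCM objective J_m
    return centers, U, objective

# Synthetic stand-in data: three well-separated 2-D blobs (not the colon cancer set).
rng = np.random.default_rng(42)
blob_means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
X = np.vstack([mu + 0.5 * rng.standard_normal((50, 2)) for mu in blob_means])

# Objective value for each candidate number of clusters; the "elbow" is the
# value of c after which further increases reduce the objective only slightly.
objectives = {c: fuzzy_c_means(X, c)[2] for c in range(2, 7)}
for c, J in objectives.items():
    print(f"c={c}  J={J:.2f}")
```

Plotting the printed objective values against c gives the elbow curve. In practice the curve is usually read together with cluster validity indices, as in the study, because the objective alone tends to keep decreasing as c grows.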
