Rough fuzzy cuckoo search for triclustering microarray gene expression data
Analyzing time-series microarray gene expression data is computationally challenging because the data are three-dimensional. Triclustering techniques are applied to three-dimensional data to mine genes that are similarly expressed under a subset of conditions and time points. In this work, a novel rough fuzzy cuckoo search algorithm is proposed for triclustering genes across samples and time points simultaneously. By combining the upper and lower approximations of rough set theory with the objective function of fuzzy k-means, rough fuzzy k-means is incorporated into cuckoo search to handle the uncertainty in the data. The proposed method was applied to three real-life time-series gene expression datasets. It was evaluated using four validation indices, and correlation analysis was performed to indicate cluster quality. The proposed method was also compared with existing triclustering algorithms and outperformed them.
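The abstract does not spell out how the rough and fuzzy parts are combined, so the following is only a minimal sketch of one common rough fuzzy k-means formulation, not the authors' implementation. It assumes the standard fuzzy k-means membership formula with fuzzifier `m`, and a hypothetical threshold `delta`: a point goes into a cluster's lower approximation when its highest membership clearly exceeds the second highest, and otherwise into the upper approximations (boundary region) of its two best clusters. The function names, `delta`, and the default values are all illustrative assumptions.

```python
import math


def fuzzy_memberships(point, centroids, m=2.0):
    """Standard fuzzy k-means membership of one point in each cluster."""
    dists = [math.dist(point, c) for c in centroids]
    zeros = [d == 0.0 for d in dists]
    if any(zeros):
        # The point coincides with one or more centroids: share full
        # membership among those and give zero to every other cluster.
        share = 1.0 / sum(zeros)
        return [share if z else 0.0 for z in zeros]
    power = 2.0 / (m - 1.0)
    inverse = [d ** -power for d in dists]
    total = sum(inverse)
    return [v / total for v in inverse]


def rough_assign(memberships, delta=0.2):
    """Return (lower, upper) sets of cluster indices for one point.

    Assumption: `delta` is an illustrative threshold separating a
    confident assignment from a boundary case. Needs >= 2 clusters.
    """
    order = sorted(range(len(memberships)),
                   key=lambda i: memberships[i], reverse=True)
    best, second = order[0], order[1]
    if memberships[best] - memberships[second] > delta:
        # Confident: the point is in the lower (and hence upper)
        # approximation of its best cluster.
        return {best}, {best}
    # Ambiguous: the point lies only in the upper approximations
    # of its two closest clusters.
    return set(), {best, second}


centroids = [(0.0, 0.0), (10.0, 0.0)]
for point in [(1.0, 0.0), (5.0, 0.0)]:
    u = fuzzy_memberships(point, centroids)
    print(point, rough_assign(u))
```

In a cuckoo search setting, each candidate solution (nest) would encode a set of centroids, and memberships like these would feed the fitness function; that coupling is left out of the sketch because the abstract gives no details of it.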