A hybrid convolutional neural network approach for feature selection and disease classification
Many researchers have analyzed high-dimensional gene expression data for disease classification using conventional and machine learning-based approaches, but several issues still make this task nontrivial. Because of the growing complexity of unstructured data, researchers have turned to deep learning, a recent branch of machine learning. In this work, a kernel-based Fisher score (KFS) approach is used to select the notable genes, and a convolutional neural network optimized by an improvised chaotic Jaya (CJaya) algorithm (CJaya-CNN) is applied to classify high-dimensional gene expression (microarray) data. The model is tested on two binary-class and two multi-class standard microarray datasets. The presented hybrid deep learning model (KFS-based CJaya-CNN) is compared with other machine learning classification models: CJaya hybridized multi-layer perceptron (CJaya-MLP), CJaya hybridized extreme learning machine (CJaya-ELM), and CJaya hybridized kernel extreme learning machine (CJaya-KELM). The model is evaluated by classification accuracy, number of significant genes selected, sensitivity, and specificity, together with receiver operating characteristic (ROC) curves. The experimental outcomes are also compared with recent existing feature selection and classification models for high-dimensional microarray data. The presented model achieved classification accuracies of 98.2%, 99.96%, 99.78%, and 99.87% on the colon cancer, leukemia, lymphoma-3, and small round blue cell tumor (SRBCT) datasets, respectively. The experimental outcomes indicate that the KFS-based CJaya-CNN model outperforms the compared models. Hence, the presented method can be used as a dependable framework for disease classification.
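To make the gene-selection step concrete, the following is a minimal sketch of ranking genes by the standard (non-kernel) Fisher score, that is, the ratio of between-class to within-class variance per gene. It is not the kernel-based variant used in this work, which would additionally involve a kernel mapping not specified here. The function names `fisher_scores` and `top_genes`, the small stabilizing constant, and the synthetic toy data are assumptions made for illustration only.

```python
import numpy as np

def fisher_scores(X, y):
    """Plain Fisher score per feature (gene).

    X: array of shape (n_samples, n_genes); y: class labels of length n_samples.
    Score = between-class scatter / within-class scatter, computed per gene.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]          # samples belonging to this class
        n_c = Xc.shape[0]
        between += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        within += n_c * Xc.var(axis=0)
    # Small constant (an assumption) avoids division by zero for constant genes.
    return between / (within + 1e-12)

def top_genes(X, y, k):
    """Indices of the k genes with the highest Fisher score, best first."""
    scores = fisher_scores(X, y)
    return np.argsort(scores)[::-1][:k]

# Synthetic toy data (not from any dataset in the paper):
# 40 samples, 6 genes, two classes; gene 2 is made strongly discriminative.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))
y = np.repeat([0, 1], 20)
X[y == 1, 2] += 5.0

selected = top_genes(X, y, k=2)
print("Top-ranked gene indices:", selected)
```

In a pipeline of the kind described above, the columns of `X` indexed by `selected` would then be passed to the classifier; how many genes to keep is a tuning choice that this sketch leaves open.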