Effect of intuitionistic fuzzy normalization in microarray gene selection

Gene expression analysis is essential for extracting useful information from microarray experiments. Gene expression data typically contain a large number of genes but only a small number of samples. Complicated relations among genes make the analysis harder, and removing irrelevant genes improves the quality of the results. This paper presents two fuzzy preprocessing techniques for normalizing datasets, one based on fuzzy sets (FS) and one on intuitionistic fuzzy sets (IFS). Four statistical methods were used for feature selection. On three publicly available gene expression datasets, the fuzzy normalization techniques were compared with two standard normalization techniques (min-max and Z-score) and with raw gene expression data. Support vector machine, k-nearest neighbor, and random forest classifiers were used to assess the classification accuracy obtained with the selected features. The experimental results show that genes selected from FS- and IFS-normalized datasets give high classification accuracy, and that IFS normalization outperforms FS normalization.
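
As a rough illustration of the normalization step, the sketch below applies min-max and Z-score scaling to the expression values of a single gene, then derives an intuitionistic fuzzy triple (membership, non-membership, hesitation) for each value. The abstract does not give the paper's FS and IFS formulas, so the fuzzy part is an assumption: the min-max scaled value is taken as the membership degree, and the non-membership degree comes from a Sugeno-type complement with a hypothetical parameter `lam`. The function names and the sample values are likewise illustrative only.

```python
from statistics import mean, pstdev


def min_max(values):
    """Rescale one gene's expression values to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant gene: there is no spread to rescale
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center one gene's values on 0 and scale by the population std. dev."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # constant gene: every value equals the mean
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def ifs_normalize(values, lam=1.0):
    """Return (membership, non-membership, hesitation) triples.

    ASSUMPTION: membership is the min-max scaled value and non-membership
    is a Sugeno-type complement (1 - m) / (1 + lam * m) with lam > 0.
    This is one common IFS construction, not necessarily the paper's.
    """
    triples = []
    for m in min_max(values):
        nu = (1.0 - m) / (1.0 + lam * m)
        pi = 1.0 - m - nu  # hesitation degree; non-negative when lam > 0
        triples.append((m, nu, pi))
    return triples


gene = [2.0, 4.0, 6.0, 10.0]  # made-up expression values for one gene
print(min_max(gene))
print(z_score(gene))
print(ifs_normalize(gene))
```

Under this assumed construction, the plain FS normalization corresponds to keeping only the membership degree, while the IFS version carries the extra hesitation term for each value.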

___

Turkish Journal of Electrical Engineering and Computer Sciences
  • ISSN: 1300-0632
  • Publication frequency: 6
  • Publisher: TÜBİTAK