A Novel Cluster of Quarter Feature Selection Based on Symmetrical Uncertainty

Due to the diversity of its sources, a large amount of data is being produced, and this data exhibits various problems including mislabeled instances, missing values, imbalanced class labels, noise, and high dimensionality. In this article, we propose a novel framework that addresses the high-dimensionality issue through feature reduction, in order to improve the classification performance of various lazy learners, rule-based induction, Bayes, and tree-based models. Specifically, we propose a robust Quarter Feature Selection (QFS) framework based on the Symmetrical Uncertainty attribute evaluator. The proposed technique is evaluated on six real-world datasets. The framework divides the whole data space into four sets (quarters) of features without duplication, where each quarter contains at most 25% of the features of the whole data space. Experimental results show that one quarter, and sometimes more than one, achieves better accuracy than feature selection methods already available in the literature. To compare against the quarters of features produced by the proposed technique, we use filter-based feature selection methods such as GRAE, IG, Chi-Square (CHI2), and Relief.
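The core idea above can be sketched as follows: score every feature by its Symmetrical Uncertainty (SU) with the class label, rank the features, and split the ranked list into four disjoint quarters. This is a minimal illustration only, assuming the quarters are formed from a simple SU ranking; function names and the tie-breaking order are our own assumptions, not the authors' exact procedure.

```python
# Illustrative sketch of Quarter Feature Selection (QFS): rank features by
# Symmetrical Uncertainty with the class label, then split the ranking into
# four disjoint quarters, each holding at most 25% of the features.
import math
from collections import Counter

def entropy(values):
    """Shannon entropy H of a discrete sequence, in bits."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetrical_uncertainty(feature, label):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)), normalized to [0, 1]."""
    h_x, h_y = entropy(feature), entropy(label)
    # Conditional entropy H(X | Y), averaged over the label's classes.
    n = len(label)
    h_x_given_y = 0.0
    for y_val, cnt in Counter(label).items():
        subset = [x for x, y in zip(feature, label) if y == y_val]
        h_x_given_y += (cnt / n) * entropy(subset)
    ig = h_x - h_x_given_y          # information gain IG(X; Y)
    denom = h_x + h_y
    return 2.0 * ig / denom if denom else 0.0

def quarter_feature_selection(X, y):
    """Rank feature indices by SU (descending) and split into four quarters."""
    n_features = len(X[0])
    ranked = sorted(range(n_features),
                    key=lambda j: symmetrical_uncertainty([row[j] for row in X], y),
                    reverse=True)
    q = math.ceil(n_features / 4)   # each quarter holds <= 25% of the features
    return [ranked[i:i + q] for i in range(0, len(ranked), q)]
```

Each quarter is then a candidate feature subset that can be handed to a classifier (lazy learner, rule inducer, Bayes, or tree model) and evaluated for accuracy, as the framework describes.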
