Hybrid of genetic algorithm and great deluge algorithm for rough set attribute reduction

The attribute reduction problem is the process of reducing unimportant attributes from a decision system to decrease the difficulty of data mining or knowledge discovery tasks. Many algorithms have been used to optimize this problem in rough set theory. The genetic algorithm (GA) is one of the algorithms that has already been applied to optimize this problem. This paper proposes 2 kinds of memetic algorithms, which are a hybridization of the GA, with 2 versions (linear and nonlinear) of the great deluge (GD) algorithm. The purpose of this hybridization is to investigate the ability of this local search algorithm to improve the performance of the GA. In both of the methods, the local search (the GD algorithm) is employed to each generation of the GA. The only difference of these methods is the rate of increase in the `level' in the GD algorithm. The level is increased by a fixed value in the linear GD algorithm, while the nonlinear GD algorithm uses the quality of the current solution to calculate the increase rate of the level in each iteration. The 13 datasets taken from the University of California - Irvine machine learning repository are used to test the methods and compare the results with the on-hand results in the literature, especially with the original GA. The classification accuracies of each dataset using the obtained reducts are examined and compared with other approaches using ROSETTA software. The promising results show the potential of the algorithm to solve the attribute reduction problem.

Hybrid of genetic algorithm and great deluge algorithm for rough set attribute reduction

The attribute reduction problem is the process of reducing unimportant attributes from a decision system to decrease the difficulty of data mining or knowledge discovery tasks. Many algorithms have been used to optimize this problem in rough set theory. The genetic algorithm (GA) is one of the algorithms that has already been applied to optimize this problem. This paper proposes 2 kinds of memetic algorithms, which are a hybridization of the GA, with 2 versions (linear and nonlinear) of the great deluge (GD) algorithm. The purpose of this hybridization is to investigate the ability of this local search algorithm to improve the performance of the GA. In both of the methods, the local search (the GD algorithm) is employed to each generation of the GA. The only difference of these methods is the rate of increase in the `level' in the GD algorithm. The level is increased by a fixed value in the linear GD algorithm, while the nonlinear GD algorithm uses the quality of the current solution to calculate the increase rate of the level in each iteration. The 13 datasets taken from the University of California - Irvine machine learning repository are used to test the methods and compare the results with the on-hand results in the literature, especially with the original GA. The classification accuracies of each dataset using the obtained reducts are examined and compared with other approaches using ROSETTA software. The promising results show the potential of the algorithm to solve the attribute reduction problem.

___

  • J. Han, M. Kamber, Data Mining: Concepts and Techniques, Oxford, Morgan Kaufmann Publishers, 2006.
  • Z. Pawlak, J. Grzymala, R. Slowinski, “Rough sets”, Communications of the Association for Computing Machinery, Vol. 8, pp. 89–95, 1995.
  • Z. Pawlak, “Rough sets”, International Journal of Information and Computer Science. Vol. 11, pp. 341–356, 1982. Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning about Data, Boston, Kluwer, 1991.
  • R. Jensen, Q. Shen, “Semantics-preserving dimensionality reduction: rough and fuzzy-rough based approaches”, IEEE Transactions on Knowledge and Data Engineering, Vol. 16, pp. 1457–1471, 2004.
  • A. Hedar, J. Wang, M. Fukushima, “Tabu search for attribute reduction in rough set theory”, Soft Computing, Vol. 12, pp. 909–918, 2006.
  • R. Jensen, Q. Shen, “Finding rough set reducts with ant colony optimization”, Proceedings of the UK Workshop Computational Intelligence, pp. 15–22, 2003.
  • L. Ke, Z. Feng, Z. Ren, “An efficient ant colony optimization approach to attribute reduction in rough set theory”, Pattern Recognition Letters, Vol. 29, pp. 1351–1357, 2008.
  • J. Wang, A. Hedar, G. Zheng, S. Wang, “Scatter search for rough set attribute reduction”, International Joint Conference on Computational Sciences and Optimization, Vol. 1, pp. 531–535, 2009.
  • Z. Jun, L. Jian-yong, W. Zhen, “Rough set attribute reduction algorithm based on immune genetic algorithm”, 2nd International IEEE Conference on Computer Science and Information Technology, pp. 421–424, 2009.
  • Z. Xu, D. Gu, B. Yang, “Attribute reduction algorithm based on genetic algorithm”, International Conference on Intelligent Computation Technology and Automation, pp. 169–172, 2009.
  • B. Liu, F. Liu, X. Cheng, “An adaptive genetic algorithm based on rough set attribute reduction”, International Conference on Biomedical Engineering and Informatics, pp. 2880–2883, 2010.
  • J. Holland, Adaptation in Natural and Artificial Systems, Ann Arbor, University of Michigan Press, 1975.
  • G. Dueck, “New optimization heuristics: the great deluge algorithm and the record-to-record travel”, Journal of Computational Physics, Vol. 104, pp. 86–92, 1993.
  • S. Abdullah, E.K. Burke, “A multi-start large neighbourhood search approach with local search methods for examination timetabling”, International Conference on Automated Planning and Scheduling, pp. 334–337, 2006.
  • D. Landa-Silva, J.H. Obit, “Great deluge with nonlinear decay rate for solving course timetabling problems”, IEEE Conference on Intelligent Systems, pp. 8.11–8.18, 2008.
  • D. Landa-Silva, J.H. Obit, “Evolutionary nonlinear great deluge for university course timetabling”, International Conference on Hybrid Artificial Intelligence Systems, 2009.
  • S. Abdullah, N.S. Jaddi, “Great deluge algorithm for rough set attribute reduction”, Database Theory and Application, Bio-Science and Bio-Technology, Communications in Computer and Information Science, Vol. 118, pp. 189–197, 2010.
  • M. Mafarja, S. Abdullah, “Modified great deluge for attribute reduction in rough set theory”, Proceedings of the International Conference on Fuzzy Systems and Knowledge Discovery, pp. 1464–1469, 2011.
  • C.L. Blake, C.J. Merz, UCI Repository of Machine Learning Databases, University of California at Irvine, http://www.ics.uci.edu/ ∼mlearn/, 1998.
  • J. Komorowski, A. Øhrn, A. Skowron, The ROSETTA Rough Set Software System, in Handbook of Data Mining and Knowledge Discovery (Eds. W. Kl¨ osgen and J. Zytkow), Oxford University Press, pp. 554–559, 2002.