Ant Colony Optimization for Feature Selection Involving Effective Local Search

Md. Monirul Kabir; Md. Shahjahan; Kazuyuki Murase

doi:10.20965/jaciii.2011.p0671

JACIII Vol.15 No.6 pp. 671-680

(2011)

Paper:

Md. Monirul Kabir^*, Md. Shahjahan^**, and Kazuyuki Murase^*,***

^*Department of System Design Engineering, Graduate School of Engineering, University of Fukui, 3-9-1 Bunkyo, Fukui 910-8507, Japan

^**Department of Electrical and Electronic Engineering, Khulna University of Engineering and Technology, Building no-13E, KUET Campus, Khulna 9203, Bangladesh

^***Research and Education Program for Life Science, University of Fukui, Japan

Received: December 20, 2010
Accepted: April 27, 2011
Published: August 20, 2011

Keywords: feature selection, local search, ant colony optimization algorithm, neural network

Abstract

This paper proposes an effective algorithm for feature selection (ACOFS) that uses a global Ant Colony Optimization algorithm (ACO) search strategy. To make ACO effective in feature selection, our proposed algorithm uses an effective local search in selecting significant features. The novelty of ACOFS lies in its effective balance between ant exploration and exploitation using new pheromone update and heuristic information computation rules to generate a subset of a smaller number of significant features. We evaluate algorithm performance using seven real-world benchmark classification datasets. Results show that ACOFS generates smaller subsets of significant features with improved classification accuracy.
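The abstract does not state the ACOFS pheromone update or heuristic rules, so the following is only a minimal sketch of a generic ACO search over feature subsets, not the authors' method. Every name in it (`aco_select`, `toy_score`, the parameters `alpha`, `beta`, `rho`) is hypothetical, and the scoring function stands in for whatever classifier-based evaluation a real wrapper would use. It assumes finite scores and strictly positive heuristic values.

```python
import random


def aco_select(n_features, subset_size, score_fn, heuristic=None,
               n_ants=10, n_iters=20, alpha=1.0, beta=1.0, rho=0.2, seed=0):
    """Generic ACO search over feature subsets (illustrative; not ACOFS)."""
    rng = random.Random(seed)
    tau = [1.0] * n_features  # pheromone level per feature
    # Heuristic desirability per feature; uniform when none is supplied.
    eta = list(heuristic) if heuristic is not None else [1.0] * n_features
    best_subset, best_score = None, float("-inf")

    for _ in range(n_iters):
        iter_best, iter_score = None, float("-inf")
        for _ in range(n_ants):
            # Each ant builds a subset by weighted sampling without replacement.
            remaining = list(range(n_features))
            subset = []
            for _ in range(subset_size):
                weights = [tau[f] ** alpha * eta[f] ** beta for f in remaining]
                pick = rng.choices(remaining, weights=weights, k=1)[0]
                subset.append(pick)
                remaining.remove(pick)
            s = score_fn(subset)
            if iter_best is None or s > iter_score:
                iter_best, iter_score = subset, s

        # Evaporation lowers all trails (encourages exploration).
        tau = [(1.0 - rho) * t for t in tau]
        # Deposit reinforces the iteration-best subset (exploitation).
        for f in iter_best:
            tau[f] += max(iter_score, 0.0)

        if iter_score > best_score:
            best_subset, best_score = sorted(iter_best), iter_score

    return best_subset, best_score


def toy_score(subset):
    """Hypothetical stand-in for classifier accuracy: features 0 and 1 are useful."""
    informative = {0, 1}
    return len(informative & set(subset)) - 0.1 * len(subset)


if __name__ == "__main__":
    print(aco_select(6, 2, toy_score))
```

In this sketch the evaporation rate `rho` and the exponents `alpha` and `beta` are the knobs that trade exploration against exploitation; the paper's own rules for that balance would replace the simple deposit step shown here.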

Cite this article as:

M. Kabir, M. Shahjahan, and K. Murase, “Ant Colony Optimization for Feature Selection Involving Effective Local Search,” J. Adv. Comput. Intell. Intell. Inform., Vol.15 No.6, pp. 671-680, 2011.


This article is published under a Creative Commons Attribution-NoDerivatives 4.0 International License.

