Journal of Nanjing University (Natural Sciences) ›› 2010, Vol. 46 ›› Issue (5): 487–493.


 A method for feature selection based on rough sets and ant colony optimization algorithm*

 Wang Lu, Qiu Tao Rong**, He Niu, Liu Ping

  • Online: 2015-04-02 Published: 2015-04-02
  • About author: (Department of Computer, Nanchang University, Nanchang, 330031, China)
  • Supported by:
     National Natural Science Foundation of China (50863003, 61070139), Science and Technology Project of the Jiangxi Provincial Department of Education (No. GJJ08042)

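Abstract (Chinese version):  Feature selection plays an important role in many fields. This paper combines the rough set method with the ant colony optimization algorithm and proposes a feature selection algorithm based on rough sets and ant colony optimization. The algorithm uses attribute dependency and attribute significance as heuristic factors in the transition rule, and builds its pheromone update strategy from the rough set quality of classification and the length of the feature subset. Tests on data sets show that the proposed method is feasible.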
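The two rough set quantities named above as heuristic factors, attribute dependency and attribute significance, can be illustrated with a minimal Python sketch. It assumes the standard textbook definitions (dependency degree as the share of objects in the positive region, significance as the gain in dependency when one attribute is added); the function names and the toy decision table are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def dependency(rows, cond_attrs, decision):
    """Dependency degree: fraction of rows whose equivalence class
    (under cond_attrs) carries a single decision value."""
    if not rows:
        return 0.0
    positive = 0
    for block in partition(rows, cond_attrs):
        if len({rows[i][decision] for i in block}) == 1:
            positive += len(block)
    return positive / len(rows)


def significance(rows, selected, candidate, decision):
    """Gain in dependency degree when candidate is added to selected (a list)."""
    with_candidate = dependency(rows, selected + [candidate], decision)
    return with_candidate - dependency(rows, selected, decision)


# Toy decision table (made-up data): conditional attributes a, b, c; decision d.
table = [
    {"a": 0, "b": 0, "c": 1, "d": "no"},
    {"a": 0, "b": 1, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 0, "d": "yes"},
    {"a": 1, "b": 1, "c": 0, "d": "yes"},
]

gamma_a = dependency(table, ["a"], "d")
gain_b = significance(table, ["a"], "b", "d")
```

In this toy table, attribute `a` alone classifies only the two rows with `a = 1` consistently, and adding `b` makes every row consistent, so `b` has a high significance relative to `{a}`. An ant could use such values to rank the features it has not visited yet.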

Abstract:  Feature selection has become a focus of research in data mining, machine learning, pattern recognition and related fields. It describes the original feature set with a more stable subset of features at an appropriate precision. Research on feature selection concentrates on two aspects: the search strategy over feature subsets and the performance evaluation of a feature subset. Finding more effective and faster algorithms that obtain better feature subsets at lower time complexity therefore remains a central topic. After analyzing the advantages and shortcomings of existing algorithms, this paper proposes a new feature selection method that combines the rough set method with the ant colony optimization algorithm. To improve the algorithm's performance, the core attributes are used as the starting point of feature selection. In the transition rule and the pheromone update strategy, the algorithm uses rough set dependency and attribute significance to guide the ants' search. In addition, the quality of classification based on the rough set method and the length of the feature subset are used to measure how good a feature subset is. The proposed method is tested on data sets with a certain number of records and attributes and compared with a feature selection method based on rough sets and one based on ant colony optimization. The results show that the proposed method is feasible and has clear advantages in feature subset length and accuracy when the data set has core attributes. Finally, a worked example and tests on real data sets show that the proposed method is effective.
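The abstract does not give the transition rule or the pheromone update in closed form, so the following Python sketch only shows the general shape such rules usually take in ant colony optimization. Everything specific here is an assumption: pheromone is stored per feature, the heuristic value stands in for a rough set measure such as attribute significance, the deposit is taken as classification quality divided by subset length, and the names `transition_probabilities`, `choose_next`, `update_pheromone`, `alpha`, `beta` and `rho` are hypothetical.

```python
import random


def transition_probabilities(candidates, pheromone, heuristic, alpha=1.0, beta=1.0):
    """Probability of moving to each candidate feature, proportional to
    pheromone**alpha * heuristic**beta (assumed standard ACO form)."""
    weights = {f: (pheromone[f] ** alpha) * (heuristic[f] ** beta) for f in candidates}
    total = sum(weights.values())
    if total == 0:
        # No information at all: fall back to a uniform choice.
        return {f: 1.0 / len(candidates) for f in candidates}
    return {f: w / total for f, w in weights.items()}


def choose_next(candidates, pheromone, heuristic, rng=random):
    """Sample the next feature according to the transition probabilities."""
    probs = transition_probabilities(candidates, pheromone, heuristic)
    features = list(probs)
    return rng.choices(features, weights=[probs[f] for f in features], k=1)[0]


def update_pheromone(pheromone, subset, quality, rho=0.1):
    """Evaporate every trail, then reward the features of one subset.
    Assumed deposit: quality / len(subset), so that shorter subsets with
    higher classification quality receive more pheromone."""
    deposit = quality / len(subset)
    for f in pheromone:
        pheromone[f] *= 1.0 - rho
        if f in subset:
            pheromone[f] += deposit
    return pheromone
```

An ant would start from the core attributes, call `choose_next` on the remaining features until its subset reaches the desired classification quality, and the colony would then call `update_pheromone` with the quality and length of the subsets it found.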
