Journal of Nanjing University (Natural Science) ›› 2017, Vol. 53 ›› Issue (4): 802–.


 Fuzzy entropy feature selection algorithm based on improved neighborhood granule

 Yao Sheng1,2*, Xu Feng1,2, Zhao Peng1,2, Liu Zhengyi1,2, Chen Ju1,2

  • Received: 2017-06-06 Online: 2017-08-03 Published: 2017-08-03
  • Affiliations: 1. College of Computer Science and Technology, Anhui University, Hefei, 230601, China;
    2. Key Laboratory of Intelligent Computing & Signal Processing, Ministry of Education, Anhui University, Hefei, 230601, China
  • Corresponding author (*): E-mail: fisheryao@126.com
  • Funding:
     National Natural Science Foundation of China (61602004, 61300057), Natural Science Foundation of Anhui Province (1508085MF127), Key Project of Natural Science Research of Anhui Higher Education Institutions (KJ2016A041), Open Bidding Project of the Collaborative Innovation Center for Information Assurance Technology, Anhui University (ADXXBZ2014-6), Doctoral Research Start-up Fund of Anhui University (J10113190072), Project of the Key Laboratory of Intelligent Computing & Signal Processing, Ministry of Education, Anhui University

Abstract:  Feature selection is an important data preprocessing technique in machine learning and data mining. Its purpose is to select a feature subset without reducing the classification accuracy of the data, thereby reducing the dimensionality of the original data set and improving the performance of learning algorithms. In the neighborhood rough set model, the similarity relation between objects is described by neighborhood granules. However, neighborhood granules constructed by traditional methods do not take the data distribution into account, which introduces errors into the granules. This paper first characterizes the data distribution by the variance and then proposes an improved neighborhood granule that adapts to the data distribution. The improved neighborhood granule is then combined with neighborhood fuzzy entropy to obtain a measure of feature significance, and a corresponding fuzzy entropy feature selection algorithm is given. Experimental results show that the proposed algorithm has advantages in three respects: it selects smaller feature subsets, the selected subsets retain higher classification accuracy, and the algorithm takes less time.
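The abstract does not give the formulas for the improved neighborhood granule, so the following is only a minimal sketch of the general idea of a neighborhood radius that adapts to the spread of each feature. It assumes, purely for illustration, that the radius of a feature is its population standard deviation divided by a hypothetical parameter `lam`, and that two objects are neighbors when they fall within that radius on every selected feature. The function name and the parameter are not taken from the paper.

```python
from statistics import pstdev


def neighborhood_granules(X, features, lam=2.0):
    """Return, for each object i, the set of objects in its neighborhood granule.

    Assumptions (illustrative, not the paper's definition):
      * the radius of feature a is pstdev(values of a) / lam;
      * object j belongs to the granule of object i if |x_i[a] - x_j[a]|
        is at most the radius of a for every feature a in `features`.
    """
    # Per-feature radius derived from the spread of that feature.
    radius = {a: pstdev([row[a] for row in X]) / lam for a in features}
    n = len(X)
    return [
        {
            j
            for j in range(n)
            if all(abs(X[i][a] - X[j][a]) <= radius[a] for a in features)
        }
        for i in range(n)
    ]


# Two features on very different scales; a single fixed radius would suit
# only one of them, while the per-feature radius follows each spread.
X = [[0.0, 0.0], [0.1, 100.0], [1.0, 0.0], [1.1, 100.0]]
granules_f0 = neighborhood_granules(X, [0])
granules_both = neighborhood_granules(X, [0, 1])
```

On this toy data, feature 0 alone groups the objects into two pairs, and adding feature 1 shrinks every granule to the object itself. A feature significance measure such as the neighborhood fuzzy entropy described in the abstract would be computed on top of granules like these.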
