Journal of Nanjing University (Natural Science) ›› 2018, Vol. 54 ›› Issue (4): 733–.


A method for feature selection based on neighborhood relation and fuzzy decision

Wen Xin1, Li Deyu1,2*, Wang Suge1,2

  • Online: 2018-04-30
  • Received: 2018-05-08
  • About author: 1. School of Computer & Information Technology, Shanxi University, Taiyuan, 030006, China; 2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan, 030006, China
  • *Corresponding author, E-mail: lidy@sxu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61672331, 61632011, 61573231, 61432011, 61603229), Natural Science Foundation of Shanxi Province (201601D021076)

Abstract: High-dimensional feature spaces make learning relatively time-consuming and can degrade classification performance. The neighborhood rough set model can be used for feature selection, but it cannot describe the fuzziness of real-world samples, which may cause useful information to be lost. This paper therefore builds a new single-label feature selection model. Two different membership computation methods are used to obtain the fuzzy membership of each sample in every equivalence class, and the smallest membership value in each equivalence class is taken as that class's membership threshold. The neighborhood rough upper and lower approximations are then redefined through the relation between the membership values of neighboring samples and the threshold of the equivalence class under consideration, and features are selected by measuring the dependency of the decision attribute on the feature subset. Experiments on seven public UCI datasets show that, compared with several existing feature selection methods, the proposed method further improves classification accuracy and markedly reduces the number of selected features.
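The pipeline outlined in the abstract (memberships, per-class thresholds, threshold-based lower approximation, dependency-driven selection) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the two membership formulas or the exact approximation definitions, so everything concrete here is an assumption. In particular, the membership function (normalised inverse distance to each class centre), the Euclidean neighbourhood radius `delta`, the positive-region rule, the greedy stopping criterion, and all function names are hypothetical choices made only for illustration.

```python
import math

def euclid(a, b, features):
    """Euclidean distance between samples a and b restricted to `features`."""
    return math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))

def class_memberships(X, y):
    """ASSUMED membership: inverse distance to each class centre, normalised per sample."""
    classes = sorted(set(y))
    all_features = range(len(X[0]))
    centres = {}
    for c in classes:
        members = [x for x, label in zip(X, y) if label == c]
        centres[c] = [sum(col) / len(members) for col in zip(*members)]
    mu = []
    for x in X:
        inv = {c: 1.0 / (1.0 + euclid(x, centres[c], all_features)) for c in classes}
        total = sum(inv.values())
        mu.append({c: v / total for c, v in inv.items()})
    return classes, mu

def dependency(X, y, features, delta, classes, mu):
    """Fraction of samples in the ASSUMED positive region under `features`."""
    n = len(X)
    # Threshold per class: smallest membership among the class's own samples.
    thr = {c: min(mu[i][c] for i in range(n) if y[i] == c) for c in classes}
    positive = 0
    for i in range(n):
        c = y[i]
        neighbours = [j for j in range(n) if euclid(X[i], X[j], features) <= delta]
        # Assumed lower approximation: every neighbour reaches the class threshold.
        if all(mu[j][c] >= thr[c] for j in neighbours):
            positive += 1
    return positive / n

def select_features(X, y, delta=0.2):
    """Greedy forward selection: add the feature that raises dependency the most."""
    classes, mu = class_memberships(X, y)
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        scores = [dependency(X, y, selected + [f], delta, classes, mu) for f in remaining]
        top = max(range(len(remaining)), key=scores.__getitem__)
        if scores[top] <= best:
            break  # no remaining feature improves the dependency
        best = scores[top]
        selected.append(remaining.pop(top))
    return selected, best

# Toy data (made up): feature 0 separates the two classes, feature 1 does not.
X = [[0.0, 0.0], [0.1, 1.0], [0.9, 0.0], [1.0, 1.0]]
y = [0, 0, 1, 1]
print(select_features(X, y))
```

On this toy data the sketch keeps only feature 0, because neighbourhoods built on feature 1 mix samples whose membership falls below the other class's threshold. Memberships are computed once in the full feature space while neighbourhoods depend on the candidate subset; this split is a design assumption, not something the abstract states.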
