
[1] Li Chan,Yang Wenyuan*,Zhao Hong. Unsupervised feature selection method via dependence maximization and sparse representation[J]. Journal of Nanjing University(Natural Sciences),2017,53(4):775. [doi:10.13232/j.cnki.jnju.2017.04.021]

Unsupervised feature selection method via dependence maximization and sparse representation

Journal of Nanjing University (Natural Sciences) [ISSN:0469-5097/CN:32-1169/N]

Volume:
53
Issue:
2017, No. 4
Pages:
775
Publication date:
2017-08-03

Article Info

Title:
 Unsupervised feature selection method via dependence maximization and sparse representation
Author(s):
 Li Chan, Yang Wenyuan*, Zhao Hong
 Laboratory of Granular Computing, Minnan Normal University, Zhangzhou, 363000, China
Keywords:
 dependence maximization; sparse representation; unsupervised feature selection; projection
CLC number:
TP181
DOI:
10.13232/j.cnki.jnju.2017.04.021
Document code:
A
Abstract:
 Unsupervised feature selection is an important and challenging task in high-dimensional data analysis. Traditional unsupervised feature selection algorithms select features by preserving the manifold structure or the correlations among features, but they do not directly consider how strongly the selected features depend on the original data. In contrast, we consider the dependence between the original data and the low-dimensional data obtained by projection, and propose the measurement principle that features with good performance are those on which the original data depends. First, we maximize this dependence so that the projected data retains as much of the original feature information as possible, and compute the projection matrix accordingly, thus reducing the dimensionality of the original data. We then combine sparse representation to perform feature selection, yielding a new unsupervised feature selection algorithm, termed the unsupervised feature selection method via dependence maximization and sparse representation (DMSR). Finally, experiments are carried out on four public data sets, comparing DMSR with three existing unsupervised feature selection algorithms. The results on two evaluation indexes, clustering accuracy and mutual information, show that the proposed DMSR algorithm is effective.

References:

 [1] Belkin M,Niyogi P.Laplacian eigenmaps for dimensionality reduction and data representation.Neural Computation,2003,15(6):1373-1396.
[2] Cao D Y,Wang Q,Zhang X G.Ensemble classification method based on sparse reconstruction residuals and random forest.Journal of Nanjing University(Natural Sciences),2016,52(6):1127-1132.
[3] Wang S P,Pedrycz W,Zhu Q X,et al.Subspace learning for unsupervised feature selection via matrix factorization.Pattern Recognition,2015,48(1):10-19.
[4] Chapelle O,Scholkopf B,Zien A.Semi-Supervised learning.IEEE Transactions on Neural Networks,2009,20(3):542.
[5] Lee Rodgers J,Nicewander W A.Thirteen ways to look at the correlation coefficient.The American Statistician,1988,42(1):59-66.
[6] Mitchell D,Bridge R.A test of Chargaff’s second rule.Biochemical and Biophysical Research Communications,2006,340(1):90-94.
[7] Cover T M,Thomas J A.Elements of information theory.The 2nd Edition.New York:Wiley,2006,792.
[8] Gretton A,Bousquet O,Smola A,et al.Measuring statistical dependence with Hilbert-Schmidt norms.In:Jain S,Simon H U,Tomita E.Algorithmic Learning Theory.Springer Berlin Heidelberg,2005:63-77.
[9] Gretton A,Fukumizu K,Teo C H,et al.A kernel statistical test of independence.In:Proceedings of the 20th International Conference on Neural Information Processing Systems.Vancouver,Canada:Curran Associates Inc.,2007:585-592.
[10] Zhang Y,Zhou Z H.Multilabel dimensionality reduction via dependence maximization.ACM Transactions on Knowledge Discovery from Data(TKDD),2010,4(3):14.
[11] Li Z C,Yang Y,Liu J,et al.Unsupervised feature selection using nonnegative spectral analysis.In:Proceedings of the 26th AAAI Conference on Artificial Intelligence.Toronto,Canada:AAAI Press,2012:1026-1032.
[12] Dy J G.Unsupervised feature selection.In:Liu H,Motoda H.Computational Methods of Feature Selection.Boca Raton,FL,USA:Chapman & Hall,CRC,2008:19-39.
[13] Xie J Y,Qu Y N,Wang M Z.Unsupervised feature selection algorithms based on density peaks.Journal of Nanjing University(Natural Sciences),2016,52(4):735-745.
[14] Cai D,Zhang C Y,He X F.Unsupervised feature selection for multi-cluster data.In:Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.Washington D C,USA:ACM,2010:333-342.
[15] He X F,Cai D,Niyogi P.Laplacian score for feature selection.In:Proceedings of the 18th International Conference on Neural Information Processing Systems.Vancouver,Canada:MIT Press,2005:507-514.
[16] Zhao Z,Liu H.Spectral feature selection for supervised and unsupervised learning.In:Proceedings of the 24th International Conference on Machine Learning.Corvalis,OR,USA:ACM,2007:1151-1157.
[17] Zhu P F,Zuo W M,Zhang L,et al.Unsupervised feature selection by regularized self-representation.Pattern Recognition,2015,48(2):438-446.
[18] Song L,Smola A,Gretton A,et al.A dependence maximization view of clustering.In:Proceedings of the 24th International Conference on Machine Learning.Corvalis,OR,USA:ACM,2007:815-822.
[19] Nie F P,Huang H,Cai X,et al.Efficient and robust feature selection via joint l2,1-norms minimization.In:Proceedings of the 23rd International Conference on Neural Information Processing Systems.Vancouver,Canada:Curran Associates Inc.,2010:1813-1821.
[20] Hou C P,Nie F P,Li X L,et al.Joint embedding learning and sparse regression:A framework for unsupervised feature selection.IEEE Transactions on Cybernetics,2014,44(6):793-804.
[21] Feature Selection Datasets.http://featureselection.asu.edu/old/datasets.php.
[22] Publications & Codes.http://www.escience.cn/people/fpnie/papers.html.


Memo:
 Foundation item: National Natural Science Foundation of China (61379049, 61379089)
Received: 2017-06-09
*Corresponding author, E-mail: yangwy@xmu.edu.cn
Last Update: 2017-08-03