Journal of Nanjing University (Natural Sciences) ›› 2017, Vol. 53 ›› Issue (3): 525.
Li Xinyu1, Xu Guiyun1, Ren Shijin2*, Yang Maoyun1,2
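Abstract: Subspace clustering has been widely applied in many domains that involve clustering high-dimensional data and has attracted broad attention from machine learning researchers. Subspace clustering is a cluster analysis technique that uses feature selection: by selecting subsets of important features it obtains a low-dimensional representation of a high-dimensional space, achieves better performance in practical applications, and has become a popular approach to clustering high-dimensional data. Compared with hard clustering, soft clustering can give more meaningful partitions of complex data. This paper extends k-means clustering and proposes a new subspace clustering method, the reliability-based regularized weighted soft k-means clustering algorithm (RRWSKM). The method computes the contribution of each feature to each cluster and thereby finds the subsets of important features associated with different clusters. In addition, by tuning the model parameters, the method can accurately identify data patterns and achieves good clustering performance. The dimension-weight entropy and the partition entropy are introduced into the objective function as regularization terms, which avoids overfitting and lets more features take part in identifying clusters. To improve robustness, a reliability measure is used to obtain the initial feature weights, which improves the algorithm's reliability and performance. Because the resulting optimization problem is non-convex, an iterative optimization method is used to solve it. Experiments on several real-world data sets show that, compared with other subspace clustering algorithms, the proposed algorithm effectively discovers low-dimensional representations of high-dimensional data, achieves good clustering performance, and is well suited to clustering high-dimensional data.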
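The abstract does not give the RRWSKM objective or update rules, so the following is only a minimal sketch of the general idea it describes: a soft k-means in which each cluster has its own feature weights, with entropy terms on the feature weights and on the memberships. It is not the authors' algorithm. The function name, the parameters `gamma` and `eta`, the generic objective stated in the docstring, and the uniform initial weights (used here in place of the paper's reliability measure) are all assumptions made for illustration.

```python
import numpy as np


def _softmax(a, axis):
    # Numerically stable softmax along the given axis.
    a = a - a.max(axis=axis, keepdims=True)
    e = np.exp(a)
    return e / e.sum(axis=axis, keepdims=True)


def entropy_weighted_soft_kmeans(X, n_clusters, gamma=1.0, eta=1.0,
                                 n_iter=50, seed=0):
    """Hypothetical sketch, not the RRWSKM algorithm from the paper.

    Assumed objective (generic entropy-regularized form):
        sum_i sum_k u_ik * sum_j w_kj * (x_ij - v_kj)^2
        + gamma * sum_k sum_j w_kj * log(w_kj)
        + eta   * sum_i sum_k u_ik * log(u_ik)
    with each row of U and each row of W summing to 1.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Centres start at randomly chosen samples.
    V = X[rng.choice(n, size=n_clusters, replace=False)].copy()
    # Assumption: uniform initial feature weights (the paper instead
    # derives initial weights from a reliability measure).
    W = np.full((n_clusters, d), 1.0 / d)

    for _ in range(n_iter):
        # Squared deviations per sample, cluster and feature: (n, k, d).
        sq = (X[:, None, :] - V[None, :, :]) ** 2
        # Feature-weighted distance of each sample to each centre: (n, k).
        dist = (sq * W[None, :, :]).sum(axis=2)
        # Soft memberships: closed-form minimizer under partition entropy.
        U = _softmax(-dist / eta, axis=1)
        # Centres: membership-weighted means of the samples.
        V = (U.T @ X) / np.maximum(U.sum(axis=0), 1e-12)[:, None]
        # Per-cluster, per-feature dispersion: (k, d).
        sq = (X[:, None, :] - V[None, :, :]) ** 2
        D = (U[:, :, None] * sq).sum(axis=0)
        # Feature weights: closed-form minimizer under weight entropy.
        W = _softmax(-D / gamma, axis=1)
    return U, V, W
```

In this sketch, row `k` of `W` plays the role of the contribution of each feature to cluster `k`: features with small within-cluster dispersion receive larger weights. Larger `gamma` spreads weight over more features and larger `eta` makes memberships softer, so both need to be scaled to the data.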