|本期目录/Table of Contents|

[1]帅 惠,袁晓彤*,刘青山.基于L0约束的稀疏子空间聚类[J].南京大学学报(自然科学),2018,54(1):23.[doi:10.13232/j.cnki.jnju.2018.01.003]
 Shuai Hui,Yuan Xiaotong*,Liu Qingshan.Sparse subspace clustering based on L0 constraint[J].Journal of Nanjing University(Natural Sciences),2018,54(1):23.[doi:10.13232/j.cnki.jnju.2018.01.003]
点击复制

基于L0约束的稀疏子空间聚类()
     

《南京大学学报(自然科学)》[ISSN:0469-5097/CN:32-1169/N]

卷:
54
期数:
2018年第1期
页码:
23
栏目:
出版日期:
2018-02-01

文章信息/Info

Title:
Sparse subspace clustering based on L0 constraint
作者:
帅 惠袁晓彤*刘青山
江苏省大数据分析技术重点实验室,南京信息工程大学,南京,210044
Author(s):
Shuai HuiYuan Xiaotong*Liu Qingshan
Jiangsu Key Laboratory of Big Data Analysis Technology,Nanjing University of Information Science and Technology,Nanjing,210044,China
关键词:
 高维数据处理稀疏子空间聚类交替方向乘子法谱聚类L0约束正交匹配追踪
Keywords:
 high-dimension dataSparse Space Clustering(SSC)Alternating Direction Method of Multipliers(ADMM)spectral clusteringL0 constraintOrthogonal Matching Pursuit(OMP)
分类号:
TP311
DOI:
10.13232/j.cnki.jnju.2018.01.003
文献标志码:
A
摘要:
 大数据时代背景下,随着所获数据数量和维度的不断增加,高维数据的处理成为聚类分析的重点和难点.基于同一类别高维数据通常分布在高维环绕空间的低维子空间这一事实,子空间聚类成为高维数据聚类分析领域的重要方法.稀疏子空间聚类(Sparse Space Clustering,SSC)通过交替方向乘子法(Alternating Direction Method of Multipliers,ADMM)对数据矩阵的稀疏自表达系数进行求解,发现分布于低维子空间并集中的数据的稀疏表示并进行聚类.但是ADMM参数多、收敛速度慢,其效率难以满足对大规模数据库进行聚类分析的要求.针对这一问题提出了基于L0约束的稀疏子空间聚类方法,该方法使用正交匹配追踪(Orthogonal Matching Pursuit,OMP)算法求解L0约束的自表达稀疏重建问题,构建数据集中各数据之间的相关性矩阵,最终对相关性矩阵应用谱聚类方法得到聚类结果.根据OMP算法每次迭代之间的耦合关系对其进行优化,进一步降低了计算复杂度,提高了算法效率.在生成数据和Extended Yale B database人脸数据库的实验结果表明,该算法与SSC相比,在显著减少计算时间的基础上,取得了与SSC相当的聚类准确率.
Abstract:
With the rapid increasement of the amount and dimensionality of data,high-dimension data processing has become the key and difficult point of cluster analysis in the age of big data.Subspace clustering is an important method in the field of high-dimension data clustering on account of the fact that data in a class or category lie in a low-dimension subspace of the ambient space.Sparse Space Clustering(SSC)proposed by Elhamifar discovers the sparse representations of data distributed in a union of low-dimension subspaces.SSC solves the sparse self-expression coefficient of data matrix constrained by L1 norm via Alternating Direction Method of Multipliers(ADMM)and establishes the Laplacian matrix of the data.Then,the data are classified into specific categories via special clustering algorithm.However,ADMM has too many parameters to optimalize and slow convergence speed.These disadvantages make SSC far from dealing with large scale datasets efficiently.In consideration of these problems,we propose a sparse subspace clustering algorithm based on L0 constraint in this paper.The proposed method solves the sparse self-expression reconstruction problem constrained by L0 norm through Orthogonal Matching Pursuit(OMP).OMP finds the sparse represent of each data point as a linear combination of other data points in a direct and efficient way.The sparse self-expression coefficient acquired by OMP is transformed into similarity matrix.Ultimately similarity matrix is applied by spectral clustering to obtain the clustering result.In order to further decrease the computation complexity of OMP,we also optimize OMP according to the relativity in continuous iterations and improve the efficiency of our algorithm.Experiments on synthetic data and Extended Yale B database demonstrate that the proposed L0 constrained sparse subspace clustering is significantly more efficient while the accuracy is comparable to SSC.

参考文献/References:

 [1] Kriegel H P,Krger P,Zimek A.Subspace clustering.Wiley Interdisciplinary Reviews:Data Mining and Knowledge Discovery,2012,2(4):351-364.
[2] Elhamifar E,Vidal R.Sparse subspace clustering:Algorithm,theory,and applications.IEEE Transactions on Pattern Analysis and Machine Intelligence,2013,35(11):2765-2781.
[3] Bellman R E.Dynamic programming.Princeton:Princeton University Press,1957.
[4] 周志华.机器学习.北京:清华大学出版社,2016:252-253.(Zhou Z H.Machine learning.Beijing:Tsinghua University Press,2016:252-253.)
[5] Vidal R,Subspace clustering.IEEE Signal Processing Magazine,2011,28(2):52-68.
[6] Tropp J A,Gilbert A C.Signal recovery from random measurements via orthogonal matching pursuit.IEEE Transactions on information theory,2007,53(12):4655-4666.
[7] Hong W,Wright J,Huang K,et al.Multiscale hybrid linear models for lossy image representation.IEEE Transactions on Image Processing,2006,15(12):3655-3671.
[8] Yang A Y,Wright J,Ma Y,et al.Unsupervised segmentation of natural images via lossy data compression.Computer Vision and Image Understanding,2008,110(2):212-225.Vidal R.
[9] Costeira J P,Kanade T.A multibody factorization method for independently moving objects.International Journal of Computer Vision,1998,29(3):159-179.
[10] Kanatani K.Motion segmentation by subspace separation and model selection ∥ 8th IEEE International Conference on Computer Vision.Vancouver,Canada:IEEE,2001,2:586-591.
[11] Vidal R,Ma Y,Sastry S.Generalized principal component analysis(GPCA).IEEE Transactions on Pattern Analysis and Machine Intelligence,2005,27(12):1945-1959.
[12] Tseng P.Nearest q-flat to m points.Journal of Optimization Theory and Applications,2000,105(1):249-252.
[13] Ho J,Yang M H,Lim J,et al.Clustering appearances of objects under varying illumination conditions ∥ 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.Madison,WI,USA:IEEE,2003,1:I-11-I-18.
[14] Ng A Y,Jordan M I,Weiss Y.On spectral clustering:Analysis and an algorithm ∥ Proceedings of the 14th International Conference on Neural Information Processing Systems:Natural and Synthetic.Vancouver,Canada:ACM,2001:849-856.
[15] von Luxburg U.A tutorial on spectral clustering.Statistics and Computing,2007,17(4):395-416.
[16] Liu G C,Lin Z C,Yu Y.Robust subspace segmentation by low-rank representation ∥ Proceedings of the 27th International Conference on International Conference on Machine Learning.Haifa,Israel:ACM,2010:663-670.
[17] Liu G C,Lin Z C,Yan S C,et al.Robust recovery of subspace structures by low-rank representation.IEEE Transactions on Pattern Analysis and Machine Intelligence,2013,35(1):171-184.
[18] Liu G C,Yan S C.Latent low-rank representation for subspace segmentation and feature extraction ∥ 2011 IEEE International Conference on Computer Vision(ICCV).Barcelona,Spain:IEEE,2011:1615-1622.
[19] Liu G C,Xu H,Tang J H,et al.A deterministic analysis for LRR.IEEE Transactions on Pattern Analysis and Machine Intelligence,2016,38(3):417-430.
[20] Yuan X T,Li P.Sparse additive subspace clustering ∥ Fleet D,Pajdla T,Schiele B,et al.European Conference on Computer Vision(ECCV 2014).Zurich,Swiss Confederation:Springer,2014:644-659.
[21] Shalev-Shwartz S,Srebro N,Zhang T.Trading accuracy for sparsity in optimization problems with sparsity constraints.SIAM Journal on Optimization,2010,20(6):2807-2832.
[22] Basri R,Jacobs D W.Lambertian reflectance and linear subspaces.IEEE Transactions on Pattern Analysis and Machine Intelligence,2003,25(2):218-233.

相似文献/References:

备注/Memo

备注/Memo:
 基金项目:国家自然科学基金(61532009,61402232,61522308)
收稿日期:2017-12-11
*通讯联系人,E-mail:xtyuan1980@gmail.com
更新日期/Last Update: 2018-01-31