Journal of Nanjing University (Natural Science) ›› 2024, Vol. 60 ›› Issue (1): 53–64. doi: 10.13232/j.cnki.jnju.2024.01.006


Deep incomplete multi-view clustering based on multi-order neighborhood constraint

Mei Wang1, Weidong Wang1, Yong Liu2, Yuanze Yu1

  1. School of Computer and Information Technology, Northeast Petroleum University, Daqing, 163318, China
  2. Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, 100049, China
  • Received: 2023-08-10 Online: 2024-01-30 Published: 2024-01-29
  • Contact: Yong Liu E-mail: liuyonggsai@ruc.edu.cn
  • Supported by:
    National Natural Science Foundation of China (51774090); Heilongjiang Postdoctoral Scientific Research Startup Fund (LBH-Q20080)

Abstract:

Multi-view clustering is an important unsupervised learning method. In real applications, however, complete multi-view data are difficult to obtain, which gives rise to the incomplete multi-view clustering problem. Most existing incomplete multi-view clustering methods consider only the attribute information of the views and ignore the influence of structural information on clustering, so the extracted features cannot fully represent the latent structure of the original data. To address these problems, this paper proposes a deep incomplete multi-view clustering method based on multi-order neighborhood constraints. First, a deep autoencoder with a self-attention mechanism extracts deep latent features that carry cross-view information interaction, and weighted fusion is used to obtain the common semantic information of the views. Then, the missing data in incomplete views are completed using the common representation of the multiple views. Finally, a multi-order neighborhood constraint mechanism is proposed: it considers the deep structural information of incomplete multi-view data and uses the complementarity of the views to construct an approximately complete neighborhood graph, guiding the encoder to learn more compact and discriminative high-level semantic features. Experimental results on public datasets demonstrate the effectiveness of the proposed method.

Key words: incomplete multi-view clustering, self-attention, structure information, multi-order neighborhood
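
The weighted fusion and completion steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `fuse_and_impute`, the fixed per-view weights, and the choice to fill a missing view's latent feature directly with the common representation are all assumptions made for illustration, and the self-attention autoencoder that would produce the latent features is omitted.

```python
import numpy as np

def fuse_and_impute(latents, mask, weights):
    """Hypothetical sketch of weighted fusion followed by completion.

    latents: list of V arrays of shape (n, d), one latent matrix per view
    mask:    (n, V) array, 1 where a sample is observed in a view, else 0
    weights: (V,) array of non-negative view weights (assumed fixed here)
    """
    Z = np.stack(latents, axis=1)                 # (n, V, d)
    w = mask * weights[None, :]                   # zero weight for missing views
    denom = w.sum(axis=1, keepdims=True)          # (n, 1)
    if np.any(denom == 0):
        raise ValueError("every sample needs at least one observed view")
    # Common representation: weighted average over the observed views only.
    common = (w[:, :, None] * Z).sum(axis=1) / denom
    # Completion: a missing view's latent feature is replaced by the common one.
    filled = np.where(mask[:, :, None] > 0, Z, common[:, None, :])
    return common, filled
```

Averaging only over observed views keeps placeholder values in missing slots from leaking into the common representation, which is why the mask is applied to the weights before normalizing.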

CLC number:

  • TP301

Fig. 1  Neighborhood relationship graph

Fig. 2  Multi-head self-attention mechanism

Fig. 3  Framework of the DMNC model

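Table 1  Details of the datasets used in the experiments

| Dataset | Samples | Views | Classes |
|---|---|---|---|
| MNIST-USPS | 5000 | 2 | 10 |
| CCV | 6773 | 3 | 20 |
| Multi-Fashion | 10000 | 2 | 10 |
| Caltech7 | 1474 | 6 | 7 |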
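
The experiments on these datasets are reported at missing rates of 10%, 30% and 50%. This page does not state how the incomplete data were generated, so the following is only a sketch under one common assumption: the missing rate is the fraction of samples that lose at least one view, and every sample keeps at least one view. The function name `make_missing_mask` and its parameters are hypothetical.

```python
import numpy as np

def make_missing_mask(n_samples, n_views, missing_rate, seed=0):
    """Return an (n_samples, n_views) 0/1 mask; 0 marks a missing view.

    Assumption (not taken from the paper): `missing_rate` is the share of
    samples that are incomplete, and each incomplete sample loses between
    1 and n_views - 1 views, so no sample disappears entirely.
    """
    if n_views < 2:
        raise ValueError("need at least two views")
    rng = np.random.default_rng(seed)
    mask = np.ones((n_samples, n_views), dtype=int)
    n_incomplete = int(round(missing_rate * n_samples))
    incomplete = rng.choice(n_samples, size=n_incomplete, replace=False)
    for i in incomplete:
        n_drop = rng.integers(1, n_views)          # upper bound is exclusive
        dropped = rng.choice(n_views, size=n_drop, replace=False)
        mask[i, dropped] = 0
    return mask
```

For example, `make_missing_mask(5000, 2, 0.3)` would mark 1500 of the 5000 MNIST-USPS samples as observed in only one of the two views.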

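Table 2  Clustering results of each method on the MNIST-USPS dataset under different missing rates

| Method | ACC (10%) | ACC (30%) | ACC (50%) | NMI (10%) | NMI (30%) | NMI (50%) | Purity (10%) | Purity (30%) | Purity (50%) |
|---|---|---|---|---|---|---|---|---|---|
| BSV | 50.03% | 43.63% | 36.67% | 45.69% | 39.78% | 31.90% | 52.74% | 47.76% | 39.01% |
| Concat | 54.43% | 47.19% | 37.74% | 48.33% | 42.66% | 38.10% | 56.00% | 53.51% | 45.47% |
| PVC | 64.73% | 63.69% | 52.73% | 58.70% | 55.77% | 46.47% | 67.99% | 67.36% | 55.51% |
| UEAF | 71.97% | 66.26% | 61.94% | 66.81% | 58.14% | 57.84% | 72.74% | 67.26% | 66.67% |
| MKKM-IK | 72.61% | 64.44% | 49.74% | 61.64% | 52.34% | 37.59% | 73.58% | 64.64% | 50.06% |
| EE-R-IMVC | 75.71% | 58.54% | 45.31% | 64.37% | 49.47% | 34.15% | 75.84% | 61.31% | 45.83% |
| DCP | 96.23% | 96.30% | 94.42% | 92.76% | 92.31% | 91.13% | 96.74% | 96.10% | 95.39% |
| DMNC | 97.53% | 96.70% | 96.19% | 95.44% | 94.20% | 91.94% | 97.82% | 97.43% | 96.77% |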
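
The three reported metrics, clustering accuracy (ACC), normalized mutual information (NMI) and purity, can be computed as in the standard-library sketch below. It is not the evaluation code used in the paper. Two choices are assumptions: NMI is normalized by the arithmetic mean of the two entropies (other normalizations exist), and ACC finds the best cluster-to-class mapping by brute force over permutations, which is only practical for a handful of clusters. For datasets with 10 or 20 classes a Hungarian solver such as `scipy.optimize.linear_sum_assignment` would be the usual choice.

```python
import math
from collections import Counter
from itertools import permutations

def purity(y_true, y_pred):
    """Share of samples that belong to the majority class of their cluster."""
    total = 0
    for cluster in set(y_pred):
        members = [t for t, p in zip(y_true, y_pred) if p == cluster]
        total += Counter(members).most_common(1)[0][1]
    return total / len(y_true)

def nmi(y_true, y_pred):
    """Mutual information divided by the mean of the two label entropies."""
    n = len(y_true)
    joint = Counter(zip(y_true, y_pred))
    pt, pp = Counter(y_true), Counter(y_pred)
    mi = sum(c / n * math.log(c * n / (pt[t] * pp[p]))
             for (t, p), c in joint.items())
    h_true = -sum(c / n * math.log(c / n) for c in pt.values())
    h_pred = -sum(c / n * math.log(c / n) for c in pp.values())
    denom = (h_true + h_pred) / 2
    return mi / denom if denom > 0 else 1.0

def clustering_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one mapping of clusters to classes.

    Brute force over permutations: for illustration on small problems only.
    """
    classes, clusters = sorted(set(y_true)), sorted(set(y_pred))
    if len(clusters) > len(classes):
        raise ValueError("sketch assumes no more clusters than classes")
    joint = Counter(zip(y_pred, y_true))
    best = max(sum(joint[(c, k)] for c, k in zip(clusters, perm))
               for perm in permutations(classes, len(clusters)))
    return best / len(y_true)
```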

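Table 3  Clustering results of each method on the CCV dataset under different missing rates

| Method | ACC (10%) | ACC (30%) | ACC (50%) | NMI (10%) | NMI (30%) | NMI (50%) | Purity (10%) | Purity (30%) | Purity (50%) |
|---|---|---|---|---|---|---|---|---|---|
| BSV | 19.37% | 17.41% | 15.76% | 17.22% | 15.20% | 13.15% | 21.25% | 20.28% | 18.93% |
| Concat | 21.11% | 18.02% | 15.89% | 23.40% | 19.79% | 15.77% | 22.64% | 20.52% | 17.55% |
| PVC | 16.48% | 15.27% | 15.03% | 13.68% | 10.28% | 10.67% | 20.71% | 19.00% | 17.75% |
| UEAF | 26.38% | 24.82% | 21.53% | 23.64% | 23.09% | 21.53% | 29.47% | 28.08% | 27.93% |
| MKKM-IK | 20.71% | 18.52% | 15.63% | 14.13% | 12.60% | 10.30% | 22.81% | 21.07% | 18.52% |
| EE-R-IMVC | 25.92% | 23.33% | 17.90% | 21.43% | 17.55% | 21.95% | 28.73% | 25.82% | 20.77% |
| DCP | 22.64% | 20.48% | 18.39% | 22.60% | 19.42% | 17.88% | 27.87% | 25.60% | 20.11% |
| DMNC | 29.31% | 28.24% | 26.11% | 28.72% | 27.66% | 24.91% | 30.05% | 28.63% | 26.26% |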

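Table 4  Clustering results of each method on the Multi-Fashion dataset under different missing rates

| Method | ACC (10%) | ACC (30%) | ACC (50%) | NMI (10%) | NMI (30%) | NMI (50%) | Purity (10%) | Purity (30%) | Purity (50%) |
|---|---|---|---|---|---|---|---|---|---|
| BSV | 50.63% | 43.51% | 36.32% | 48.99% | 40.48% | 32.56% | 54.21% | 46.85% | 37.62% |
| Concat | 51.77% | 47.13% | 40.22% | 52.25% | 48.37% | 41.32% | 57.06% | 54.33% | 49.39% |
| PVC | 45.68% | 41.75% | 42.03% | 44.33% | 39.51% | 39.27% | 47.54% | 52.90% | 48.82% |
| UEAF | 57.67% | 50.88% | 47.96% | 57.13% | 48.52% | 44.03% | 61.72% | 55.31% | 50.16% |
| MKKM-IK | 70.01% | 59.92% | 46.38% | 61.26% | 50.53% | 39.31% | 70.31% | 59.69% | 47.32% |
| EE-R-IMVC | 71.97% | 63.12% | 51.64% | 65.81% | 57.60% | 43.77% | 72.98% | 63.55% | 51.47% |
| DCP | 78.77% | 74.06% | 71.38% | 82.94% | 77.69% | 74.54% | 81.37% | 74.52% | 71.99% |
| DMNC | 85.36% | 82.59% | 78.63% | 86.59% | 86.90% | 79.54% | 83.66% | 81.59% | 77.63% |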

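Table 5  Clustering results of each method on the Caltech7 dataset under different missing rates

| Method | ACC (10%) | ACC (30%) | ACC (50%) | NMI (10%) | NMI (30%) | NMI (50%) | Purity (10%) | Purity (30%) | Purity (50%) |
|---|---|---|---|---|---|---|---|---|---|
| BSV | 43.82% | 39.61% | 38.63% | 40.02% | 31.31% | 26.93% | 51.62% | 47.55% | 44.32% |
| Concat | 42.63% | 40.18% | 38.88% | 43.93% | 37.71% | 30.60% | 52.99% | 50.41% | 45.19% |
| PVC | 40.32% | 38.93% | 35.41% | 44.74% | 43.21% | 38.06% | 45.54% | 43.49% | 40.34% |
| UEAF | 47.83% | 44.73% | 37.15% | 40.99% | 32.62% | 24.31% | 81.93% | 79.22% | 76.05% |
| MKKM-IK | 36.54% | 34.89% | 36.02% | 24.51% | 23.73% | 22.89% | 72.31% | 74.49% | 72.16% |
| EE-R-IMVC | 40.36% | 38.03% | 36.46% | 30.37% | 28.55% | 23.43% | 76.88% | 75.13% | 73.34% |
| DCP | 47.89% | 44.37% | 35.92% | 50.89% | 47.91% | 42.74% | 84.46% | 82.80% | 77.45% |
| DMNC | 48.13% | 45.01% | 36.39% | 49.43% | 48.26% | 43.77% | 84.19% | 84.17% | 82.79% |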

Fig. 4  Parameter sensitivity analysis on the MNIST-USPS dataset

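Table 6  Performance of the DMNC algorithm on the MNIST-USPS dataset with neighborhood relations of different orders

| Metric | Zero-order | First-order | Second-order | Third-order | Fourth-order |
|---|---|---|---|---|---|
| ACC | 79.12% | 86.69% | 91.09% | 96.42% | 95.32% |
| NMI | 70.53% | 75.11% | 85.97% | 92.13% | 90.81% |
| Purity | 80.28% | 88.07% | 90.71% | 96.39% | 93.47% |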
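
To make the notion of neighborhood order concrete, the sketch below builds a first-order k-nearest-neighbor graph in each view from the samples observed there, merges the per-view graphs so that views complement one another, and then expands the result to higher orders by following edges. It is a hypothetical illustration, not the DMNC constraint itself: the function names, the Euclidean kNN graph, the union across views, and the reading of order 0 as "no neighbors" are all assumptions.

```python
import numpy as np

def knn_adjacency(x, available, k):
    """Symmetric kNN graph over the samples observed in one view."""
    n = x.shape[0]
    adj = np.zeros((n, n), dtype=bool)
    idx = np.flatnonzero(available)
    if len(idx) < 2:
        return adj
    sub = x[idx]
    dist = np.linalg.norm(sub[:, None, :] - sub[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)                 # a sample is not its own neighbor
    kk = min(k, len(idx) - 1)
    nearest = np.argsort(dist, axis=1)[:, :kk]     # indices into `idx`
    adj[np.repeat(idx, kk), idx[nearest].ravel()] = True
    return adj | adj.T

def multi_order_neighbors(views, mask, k, order):
    """Boolean (n, n) matrix: True where j is within `order` hops of i."""
    n = mask.shape[0]
    first = np.zeros((n, n), dtype=bool)
    for v, x in enumerate(views):                  # union of per-view graphs
        first |= knn_adjacency(x, mask[:, v] > 0, k)
    reach = np.eye(n, dtype=bool)
    step = first.astype(int)
    for _ in range(order):                         # one more hop per iteration
        reach = reach | ((reach.astype(int) @ step) > 0)
    np.fill_diagonal(reach, False)
    return reach
```

Under this reading, raising the order links samples that share neighbors without being direct neighbors themselves, at the cost of eventually connecting samples from different clusters.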

Fig. 5  Convergence curve of DMNC on the MNIST-USPS dataset

Fig. 6  Clustering visualization on the MNIST-USPS dataset
