Journal of Nanjing University (Natural Science) ›› 2021, Vol. 57 ›› Issue (2): 177–188. doi: 10.13232/j.cnki.jnju.2021.02.002



Joint learning of k-means and spectral clustering based on multiplication update rule

Di Chen, Jinglei Liu

  1. School of Computer and Control Engineering, Yantai University, Yantai 264005, China
  • Received: 2020-10-09  Online: 2021-03-23  Published: 2021-03-23
  • Corresponding author: Jinglei Liu, E-mail: jinglei_liu@sina.com
  • Supported by: the National Natural Science Foundation of China (61572419)


Abstract:

k-means and spectral clustering are two of the most widely used clustering techniques. k-means is a matrix-factorization-based method that minimizes the reconstruction error in the data space, while spectral clustering is a graph-based method that preserves pairwise similarities between points in the data space and the feature space. To exploit the advantages of both techniques, a joint learning method of k-means and spectral clustering based on the multiplication update rule is proposed. The method combines k-means and spectral clustering into a unified clustering model that optimizes the objectives of k-means and spectral clustering simultaneously in a single optimization. In addition, to solve the resulting optimization problem, an iterative algorithm that alternately updates the cluster center matrix C and the cluster indicator matrix Y is designed based on the multiplication update rule. The correctness and convergence of the proposed algorithm are proved theoretically. Finally, experiments on typical datasets show that the proposed joint learning algorithm improves both clustering accuracy (ACC) and normalized mutual information (NMI).

Key words: k-means clustering, spectral clustering, joint learning method, multiplication update rule, correctness and convergence
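The abstract credits the optimization to the multiplication (i.e. multiplicative) update rule but gives no formulas, and the paper's exact updates for the cluster center C and the cluster indicator Y are not reproduced on this page. As a point of reference only, the sketch below shows the classical Lee-Seung multiplicative updates for a nonnegative factorization X ≈ UV, the scheme that such joint algorithms build on; every name and detail here is an illustrative assumption, not the authors' implementation.

```python
import numpy as np

def multiplicative_nmf(X, k, n_iter=200, eps=1e-10, seed=0):
    """Lee-Seung multiplicative updates minimizing ||X - U V||_F^2 with U, V >= 0.
    X is assumed entrywise nonnegative; k is the number of factors (clusters).
    Illustrative sketch only, not the paper's KMSC update rules."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, k)) + eps
    V = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # V <- V * (U^T X) / (U^T U V): a nonnegative ratio keeps V nonnegative
        V *= (U.T @ X) / (U.T @ U @ V + eps)
        # U <- U * (X V^T) / (U V V^T)
        U *= (X @ V.T) / (U @ V @ V.T + eps)
    return U, V
```

Each step multiplies the current factor by a nonnegative ratio, so nonnegativity is preserved automatically and the Frobenius objective does not increase; this kind of monotonicity argument is what convergence proofs for multiplicative updates typically rest on.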

CLC number:

  • TP391

Table 1

Symbols used in this study

Symbol            | Description
X ∈ R^(m×n)       | Data matrix
C                 | Cluster center matrix
G, Y              | Cluster indicator matrices
L                 | Laplacian matrix
A, B              | Two classes of data points
Z                 | Weighted adjacency matrix
W                 | Similarity matrix
D                 | Degree matrix
M                 | Cluster center matrix in the original data space
U, V              | Nonnegative matrices
Tr(·)             | Trace of a matrix
X_ij              | The (i, j)-th element of X
c                 | Number of classes in the dataset
n                 | Number of data samples
m                 | Data dimensionality
σ                 | Scale parameter
λ                 | Regularization parameter
‖X‖_F² = Tr(XᵀX)  | Squared Frobenius norm of X
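Table 1 lists the graph quantities W, D and L together with the scale parameter σ, but this page does not spell out how they are built from X. A common construction, assumed here, is a Gaussian (heat-kernel) similarity with scale σ and the unnormalized Laplacian L = D − W; the following sketch is illustrative only and is not the paper's code.

```python
import numpy as np

def build_graph(X, sigma=1.0):
    """Illustrative construction of W, D and L from a data matrix X
    (m x n, samples as columns, as in Table 1), assuming a Gaussian similarity."""
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)   # pairwise squared distances
    d2 = np.maximum(d2, 0.0)                           # guard against round-off
    W = np.exp(-d2 / (2.0 * sigma ** 2))               # similarity matrix W
    np.fill_diagonal(W, 0.0)                           # no self-similarity
    D = np.diag(W.sum(axis=1))                         # degree matrix D
    L = D - W                                          # unnormalized graph Laplacian L
    return W, D, L
```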

Figure 1

Convergence curves of k-means and KMSC on COIL20

Figure 2

Convergence curves of k-means and KMSC on TDT2

Table 2

Key statistics of the COIL20 dataset

Dataset | Samples | Dimensionality | Classes
COIL20  | 1440    | 1024           | 20

Figure 3

Sample images from the COIL20 database

Table 3

Clustering results on the COIL20 dataset: accuracy (ACC) and normalized mutual information (NMI)

  P |              ACC (%)              |              NMI (%)
    |k-means     SC    PCA    NMF   KMSC|k-means     SC    PCA    NMF   KMSC
  4 |  69.27  83.92  84.69  75.59  90.31|  62.57  76.13  76.38  65.35  89.06
  6 |  63.29  76.23  77.70  73.92  89.63|  65.00  74.77  75.54  71.49  89.65
  8 |  63.72  73.00  72.44  71.86  87.81|  68.61  76.39  73.76  71.99  89.92
 10 |  59.72  72.85  70.83  69.42  84.20|  66.84  75.90  73.60  72.62  88.18
 12 |  59.36  73.13  67.03  69.59  81.84|  68.76  77.82  72.75  73.98  88.18
 14 |  59.07  71.34  66.59  66.41  79.67|  69.27  78.04  73.78  72.80  88.46
 16 |  56.39  69.05  64.99  65.76  80.51|  70.34  79.36  74.82  73.43  88.54
 18 |  55.69  71.91  65.20  64.51  78.13|  70.34  79.36  74.82  73.43  88.54
 20 |  55.27  68.40  62.58  63.00  77.91|  70.55  77.96  73.56  72.83  88.99
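Tables 3, 5 and 7 report ACC and NMI in percent. The authors' exact evaluation code is not given on this page; the standard definitions, assumed below, match predicted clusters to ground-truth classes with the Hungarian algorithm for ACC and use scikit-learn's normalized mutual information for NMI.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_accuracy(y_true, y_pred):
    """ACC: fraction of samples correctly labeled under the best one-to-one
    mapping between predicted clusters and ground-truth classes."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    classes, clusters = np.unique(y_true), np.unique(y_pred)
    # Contingency counts: rows = predicted clusters, columns = true classes
    count = np.array([[np.sum((y_pred == cl) & (y_true == c)) for c in classes]
                      for cl in clusters])
    row, col = linear_sum_assignment(-count)   # Hungarian matching, maximizes matches
    return count[row, col].sum() / y_true.size

# Both metrics are shown in percent in the tables:
# acc = 100 * clustering_accuracy(y_true, y_pred)
# nmi = 100 * normalized_mutual_info_score(y_true, y_pred)
```

Because of the matching step, ACC is invariant to how cluster indices happen to be numbered, so predicted clusters can be scored directly against the ground-truth classes.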

Table 4

Key statistics of the TDT2 dataset

Dataset | Samples | Dimensionality | Classes
TDT2    | 9394    | 36771          | 30

Table 5

Clustering results on the TDT2 dataset: accuracy (ACC) and normalized mutual information (NMI)

  P |              ACC (%)              |              NMI (%)
    |k-means     SC    SVD    NMF   KMSC|k-means     SC    SVD    NMF   KMSC
  5 |  65.35  83.64  82.70  95.50  99.65|  78.10  74.90  76.80  92.70  98.91
 10 |  68.50  88.20  68.20  83.60  91.40|  73.10  69.20  69.20  82.40  89.44
 15 |  64.90  82.10  65.30  79.90  93.40|  74.00  71.80  71.80  82.00  88.00
 20 |  63.90  79.00  63.40  76.30  91.20|  75.70  71.50  71.50  80.60  85.90
 25 |  61.50  74.30  60.80  75.00  88.60|  74.60  70.90  70.90  79.00  83.90
 30 |  61.20  71.20  65.90  71.90  88.60|  74.70  74.70  74.70  65.13  86.21

Table 6

Statistics of the two UCI datasets

Dataset | Samples | Classes | Attributes | Class sizes
Iris    | 150     | 3       | 4          | 50, 50, 50
Vehicle | 846     | 4       | 18         | 199, 217, 218, 212

Table 7

Clustering results on the UCI datasets: accuracy (ACC) and normalized mutual information (NMI)

Dataset |               ACC (%)                |               NMI (%)
        | k-means     SC    NMF  k-means++  KMSC | k-means     SC    NMF  k-means++  KMSC
Iris    |   89.33  89.33  69.33      66.67  90.67|   75.14  74.50  79.15      52.23  79.60
Vehicle |   45.27  41.61  39.13      40.19  45.74|   18.15  16.32  12.30      14.19  20.46

Figure 4

Performance of KMSC versus the regularization parameter λ on the COIL20 (a) and TDT2 (b) datasets
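Figure 4 shows how KMSC behaves as the regularization parameter λ varies. A sensitivity curve of this kind can be produced with a simple grid sweep; the sketch below assumes a hypothetical kmsc(X, c, lam) routine standing in for the paper's algorithm (which is not provided on this page) and scores each λ by NMI.

```python
from sklearn.metrics import normalized_mutual_info_score

def sweep_lambda(X, y_true, c, kmsc, lambdas=(1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0)):
    """Grid sweep over the regularization parameter lambda.
    `kmsc` is a hypothetical callable returning predicted labels for (X, c, lam);
    the candidate lambda values are illustrative, not the paper's grid."""
    scores = {}
    for lam in lambdas:
        y_pred = kmsc(X, c, lam)                              # run the joint algorithm
        scores[lam] = normalized_mutual_info_score(y_true, y_pred)
    return scores
```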
