南京大学学报(自然科学版) ›› 2023, Vol. 59 ›› Issue (6): 947–960. DOI: 10.13232/j.cnki.jnju.2023.06.005


利用粗图训练图神经网络实现网络对齐

钱峰1,2, 张蕾1, 赵姝2, 陈洁2

  1.铜陵学院数学与计算机学院,铜陵,244061
    2.安徽大学计算机科学与技术学院,合肥,230601
  • 收稿日期:2023-08-10 出版日期:2023-11-30 发布日期:2023-12-06
  • 通讯作者: 赵姝 E-mail:zhaoshuzs2002@hotmail.com
  • 基金资助:
    国家自然科学基金(61876001);安徽省高校科研计划(2022AH051749);安徽省高校优秀人才支持计划(GXYQ2020054);安徽省高校优秀青年骨干人才国内外访学研修项目(GXGNFX2021148)

Training of graph neural networks on coarsening graphs for network alignment

Feng Qian1,2, Lei Zhang1, Shu Zhao2, Jie Chen2

  1.School of Mathematics and Computer Science, Tongling University, Tongling, 244061, China
    2.School of Computer Science and Technology, Anhui University, Hefei, 230601, China
  • Received:2023-08-10 Online:2023-11-30 Published:2023-12-06
  • Contact: Shu Zhao E-mail:zhaoshuzs2002@hotmail.com

摘要:

网络对齐是一项极具挑战性的任务,旨在识别不同网络中的等效节点,由于网络的复杂性和监督数据的缺乏,传统方法的计算复杂度高,精度低.近年来,图神经网络(Graph Neural Networks,GNN)在网络对齐算法中得到了越来越多的应用.已有研究表明,与传统方法相比,使用GNN进行网络对齐可以降低计算复杂度并提高对齐精度,然而,基于GNN的方法的性能受到训练数据质量和网络规模的限制.为此,提出一种快速鲁棒的无监督网络对齐方法FAROS,采用在粗图上训练的GNN模型进行网络对齐.使用粗图进行GNN训练的优点:(1)显著减少训练数据,最大限度地减少GNN反向传播过程中必须更新的权重参数,减少训练时间;(2)缓解数据噪声,能提取网络最重要的结构特征,便于GNN获得更鲁棒的嵌入向量.在训练过程中,FAROS通过引入基于伪锚节点对的自监督学习来提高对齐精度.在真实数据集上的实验结果验证了FAROS算法的有效性,其在保持较好精度的同时,比同类方法快几个数量级.

关键词: 网络对齐, 图神经网络, 网络嵌入, 粗图, 锚节点对

Abstract:

Network alignment is a challenging task that aims to identify equivalent nodes across different networks. Conventional methods suffer from high computational complexity and low accuracy due to the complexity of networks and the lack of supervision. In recent years, Graph Neural Networks (GNNs) have been increasingly applied to network alignment, as they can reduce computational complexity and improve accuracy compared with traditional methods. However, the performance of GNN-based methods is limited by the quality of the training data and by network size. To address these limitations, we propose a fast and robust unsupervised network alignment method called FAROS, which employs a GNN model trained on coarse graphs. Using coarse graphs for GNN training significantly reduces the amount of training data, minimizes the weight parameters that must be updated during back-propagation, and shortens training time. Coarse graphs also mitigate data noise and retain the most important structural features of the network, which helps the GNN obtain more robust embedding vectors. During training, FAROS further improves alignment accuracy by introducing self-supervised learning based on pseudo-anchor node pairs. Experimental results on real datasets demonstrate the effectiveness of FAROS, which is several orders of magnitude faster than comparable methods while maintaining good accuracy.
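The abstract describes the pipeline only at a high level. As a rough illustration of the embedding-based alignment idea it refers to (see also Figure 1 below), the following minimal sketch propagates random features through a GCN-style normalized adjacency to obtain structural embeddings and then matches nodes across two graphs by cosine similarity. This is not the authors' FAROS implementation; the toy graphs and the helper names `gcn_embed` and `align_by_similarity` are hypothetical.

```python
# Illustrative sketch only: GCN-style propagation (no trained weights) producing
# structural embeddings, followed by cosine-similarity matching between two graphs.
# This is NOT the paper's FAROS implementation; all names here are hypothetical.
import numpy as np
import networkx as nx

def gcn_embed(G, dim=16, layers=2, seed=0):
    """Propagate random features with the symmetric-normalized adjacency D^-1/2 (A+I) D^-1/2."""
    rng = np.random.default_rng(seed)
    A = nx.to_numpy_array(G) + np.eye(G.number_of_nodes())   # A + I (self-loops)
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_hat = D_inv_sqrt @ A @ D_inv_sqrt                      # normalized adjacency
    H = rng.standard_normal((G.number_of_nodes(), dim))      # stand-in for node attributes
    for _ in range(layers):
        H = np.tanh(A_hat @ H)                               # one propagation step
    return H

def align_by_similarity(H1, H2):
    """For every node of graph 1, return the most similar node of graph 2 (cosine)."""
    H1 = H1 / np.linalg.norm(H1, axis=1, keepdims=True)
    H2 = H2 / np.linalg.norm(H2, axis=1, keepdims=True)
    sim = H1 @ H2.T                                          # cosine similarity matrix
    return sim.argmax(axis=1), sim

if __name__ == "__main__":
    G1 = nx.karate_club_graph()
    G2 = G1.copy()                                           # identical copy as a toy target
    match, sim = align_by_similarity(gcn_embed(G1), gcn_embed(G2))
    print("first ten predicted matches:", match[:10])
```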

Key words: network alignment, graph neural network, network embedding, coarse graph, anchor node pairs

CLC number: TP391

Figure 1  Embedding-based network alignment process

Figure 2  Framework of the FAROS algorithm

Table 1  Statistics of the datasets used in the experiments

Dataset | #Nodes | #Edges | Attribute dim. | #Anchor links | Avg. degree | Max. degree | Min. degree | Clustering coefficient
douban online/offline | 3906/1118 | 8164/1511 | 538 | 1118 | 4.18/2.70 | 97/26 | 1/1 | 0.0404/0.0866
allmv/imdb | 6011/5713 | 124709/119073 | 14 | 5175 | 41.49/41.68 | 159/157 | 1/1 | 0.3813/0.3833
flickr/myspace | 6714/10733 | 7333/11081 | 3 | 267 | 2.18/2.06 | 639/164 | 1/1 | 0.0014/0.0
flickr/lastfm | 12974/15436 | 16149/16319 | 3 | 451 | 2.49/2.11 | 868/976 | 1/1 | 0.0087/0.0129
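As a quick sanity check on Table 1, the average degree column follows directly from the node and edge counts as 2|E|/|V|; the short snippet below recomputes it for the douban and allmv/imdb graphs using the values reported above.

```python
# Average degree = 2|E| / |V|, checked against the counts reported in Table 1.
datasets = {
    "douban online": (3906, 8164),    # (|V|, |E|)
    "douban offline": (1118, 1511),
    "allmv": (6011, 124709),
    "imdb": (5713, 119073),
}
for name, (n, m) in datasets.items():
    print(f"{name}: average degree = {2 * m / n:.2f}")
# Prints 4.18, 2.70, 41.49 and 41.68, matching the table.
```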

Table 2  Comparison of MAP and running time of FAROS and the baseline methods on the four datasets

Dataset | Metric | REGAL | GAlign | NAWAL | WAlign | Grad-Align | FAROS
douban | MAP | 0.1005 | 0.5518 | 0.0051 | 0.3004 | 0.5843 | 0.6504
douban | Time (s) | 13.00 | 66.82 | 623.29 | 172.11 | 239.53 | 18.99
allmv/imdb | MAP | 0.1887 | 0.7537 | 0.0012 | 0.6205 | 0.8094 | 0.7830
allmv/imdb | Time (s) | 62.10 | 511.42 | 1731.96 | 539.04 | 5674.63 | 175.73
flickr/myspace | MAP | 0.0132 | 0.0111 | 0.0006 | 0.0168 | 0.0056 | 0.0317
flickr/myspace | Time (s) | 49.13 | 587.69 | 2068.85 | 933.91 | 14327.69 | 113.49
flickr/lastfm | MAP | 0.0073 | 0.0102 | 0.0000 | 0.0073 | 0.0040 | 0.0372
flickr/lastfm | Time (s) | 87.95 | 1570.02 | 4389.22 | 2685.89 | 54327.69 | 302.82
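For reference, the MAP and Precision@K values in Tables 2 to 6 are the standard ranking metrics for alignment with one true counterpart per node; the hedged sketch below shows one common way to compute them from a node-similarity matrix (the `similarity` matrix and `ground_truth` mapping are hypothetical inputs, not the paper's evaluation code).

```python
# Hedged sketch: Precision@K and MAP for one-to-one alignment ground truth.
# With exactly one correct counterpart per source node, average precision for that
# node reduces to 1 / rank of the counterpart, and Precision@K is the fraction of
# source nodes whose counterpart appears among the top-K most similar candidates.
import numpy as np

def evaluate(similarity, ground_truth, ks=(1, 5, 10, 15, 20, 25, 30)):
    """similarity: (n1, n2) score matrix; ground_truth: dict {source node -> target node}."""
    ranks = []
    for src, tgt in ground_truth.items():
        order = np.argsort(-similarity[src])                 # candidates, best first
        ranks.append(int(np.where(order == tgt)[0][0]) + 1)  # 1-based rank of the true match
    ranks = np.asarray(ranks)
    p_at_k = {k: float(np.mean(ranks <= k)) for k in ks}
    map_score = float(np.mean(1.0 / ranks))
    return map_score, p_at_k

# toy usage with random scores and an identity ground truth
rng = np.random.default_rng(0)
sim = rng.random((100, 100))
gt = {i: i for i in range(100)}
print(evaluate(sim, gt))
```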

Table 3  Precision@K of FAROS and the baseline methods on the douban online/offline dataset

Precision@K | 1 | 5 | 10 | 15 | 20 | 25 | 30
REGAL | 0.0447 | 0.1360 | 0.2030 | 0.2576 | 0.3068 | 0.3435 | 0.3739
GAlign | 0.4419 | 0.6780 | 0.7800 | 0.8327 | 0.8685 | 0.8909 | 0.9114
NAWAL | 0.0000 | 0.0027 | 0.0036 | 0.0072 | 0.0089 | 0.0107 | 0.0179
WAlign | 0.2039 | 0.3882 | 0.4946 | 0.5653 | 0.6136 | 0.6547 | 0.6816
Grad-Align | 0.4794 | 0.7084 | 0.7737 | 0.8113 | 0.8318 | 0.8515 | 0.8649
FAROS | 0.5358 | 0.7549 | 0.8435 | 0.8873 | 0.9106 | 0.9204 | 0.9275

Table 4  Precision@K of FAROS and the baseline methods on the allmv/imdb dataset

Precision@K | 1 | 5 | 10 | 15 | 20 | 25 | 30
REGAL | 0.0951 | 0.2685 | 0.3868 | 0.4672 | 0.5280 | 0.5682 | 0.6049
GAlign | 0.6982 | 0.8184 | 0.8549 | 0.8725 | 0.8858 | 0.8968 | 0.9061
NAWAL | 0.0002 | 0.0002 | 0.0008 | 0.0014 | 0.0019 | 0.0027 | 0.0035
WAlign | 0.5384 | 0.7112 | 0.7720 | 0.8053 | 0.8261 | 0.8420 | 0.8551
Grad-Align | 0.7081 | 0.9370 | 0.9600 | 0.9654 | 0.9708 | 0.9728 | 0.9745
FAROS | 0.7233 | 0.8516 | 0.8878 | 0.9065 | 0.9192 | 0.9279 | 0.9353

Table 5  Precision@K of FAROS and the baseline methods on the flickr/myspace dataset

Precision@K | 1 | 5 | 10 | 15 | 20 | 25 | 30
REGAL | 0.0037 | 0.0112 | 0.0112 | 0.0225 | 0.0375 | 0.0562 | 0.0562
GAlign | 0.0000 | 0.0075 | 0.0187 | 0.0375 | 0.0449 | 0.0487 | 0.0674
NAWAL | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
WAlign | 0.0037 | 0.0075 | 0.0375 | 0.0412 | 0.0524 | 0.0787 | 0.0861
Grad-Align | 0.0075 | 0.0112 | 0.0150 | 0.0225 | 0.0412 | 0.0524 | 0.0562
FAROS | 0.0112 | 0.0262 | 0.0487 | 0.0562 | 0.0824 | 0.0974 | 0.1049

Table 6  Precision@K of FAROS and the baseline methods on the flickr/lastfm dataset

Precision@K | 1 | 5 | 10 | 15 | 20 | 25 | 30
REGAL | 0.0000 | 0.0022 | 0.0067 | 0.0177 | 0.0244 | 0.0443 | 0.0554
GAlign | 0.0044 | 0.0044 | 0.0155 | 0.0200 | 0.0244 | 0.0310 | 0.0377
NAWAL | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
WAlign | 0.0000 | 0.0044 | 0.0133 | 0.0200 | 0.0266 | 0.0310 | 0.0377
Grad-Align | 0.0044 | 0.0288 | 0.0510 | 0.0710 | 0.0953 | 0.1153 | 0.1397
FAROS | 0.0133 | 0.0355 | 0.0732 | 0.1109 | 0.1463 | 0.1840 | 0.1973

Table 7  Precision@K of FAROS under different compression ratios on the douban online/offline dataset

Parameter r | #Nodes | Running time (s) | P@1 | P@5 | P@10 | P@15 | P@20 | P@25 | P@30
0 | 3906/1118 | 104.43 | 0.5349 | 0.7639 | 0.8462 | 0.8775 | 0.9106 | 0.9347 | 0.9428
0.2 | 3125/895 | 69.91 | 0.5331 | 0.7621 | 0.8390 | 0.8855 | 0.9088 | 0.9311 | 0.9436
0.4 | 2344/671 | 44.84 | 0.5268 | 0.7522 | 0.8408 | 0.8810 | 0.9204 | 0.9320 | 0.9445
0.6 | 1563/449 | 25.03 | 0.5385 | 0.7665 | 0.8497 | 0.8828 | 0.9132 | 0.9284 | 0.9410
0.8 | 782/224 | 13.95 | 0.5224 | 0.7585 | 0.8408 | 0.8748 | 0.9070 | 0.9195 | 0.9302
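Tables 7 to 10 vary the compression ratio r. Reading the node-count column against Table 1 suggests that the coarse graph keeps roughly a (1-r) fraction of the original nodes; the small check below (an assumption about the rounding, not a statement of the paper's coarsening schedule) reproduces the douban row sizes to within a node.

```python
# Assumed reading of the r column in Tables 7-10: the coarse graph keeps roughly a
# (1 - r) fraction of the original nodes (coarsening cannot always hit the target exactly).
import math

for r in (0.2, 0.4, 0.6, 0.8):
    online = math.ceil((1 - r) * 3906)    # douban online has 3906 nodes originally
    offline = math.ceil((1 - r) * 1118)   # douban offline has 1118 nodes originally
    print(f"r = {r}: about {online}/{offline} nodes")
# Table 7 reports 3125/895, 2344/671, 1563/449 and 782/224, i.e. within a node of these targets.
```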

Table 8  Precision@K of FAROS under different compression ratios on the allmv/imdb dataset

Parameter r | #Nodes | Running time (s) | P@1 | P@5 | P@10 | P@15 | P@20 | P@25 | P@30
0 | 6011/5713 | 1212.40 | 0.7073 | 0.8495 | 0.8878 | 0.9075 | 0.9214 | 0.9306 | 0.9374
0.2 | 4809/4571 | 794.71 | 0.7253 | 0.8516 | 0.8868 | 0.9082 | 0.9214 | 0.9301 | 0.9361
0.4 | 3607/3428 | 494.16 | 0.7243 | 0.8518 | 0.8906 | 0.9102 | 0.9208 | 0.9293 | 0.9368
0.6 | 2405/2286 | 233.32 | 0.7243 | 0.8570 | 0.8920 | 0.9109 | 0.9225 | 0.9306 | 0.9364
0.8 | 1203/1143 | 130.00 | 0.7162 | 0.8503 | 0.8860 | 0.9049 | 0.9177 | 0.9264 | 0.9355

Table 9  Precision@K of FAROS under different compression ratios on the flickr/myspace dataset

Parameter r | #Nodes | Running time (s) | P@1 | P@5 | P@10 | P@15 | P@20 | P@25 | P@30
0 | 6714/10733 | 1223.03 | 0.0112 | 0.0262 | 0.0487 | 0.0674 | 0.0787 | 0.0974 | 0.1049
0.2 | 5372/8587 | 775.85 | 0.0075 | 0.0225 | 0.0337 | 0.0524 | 0.0637 | 0.0787 | 0.1011
0.4 | 4029/6440 | 426.75 | 0.0075 | 0.0187 | 0.0262 | 0.0375 | 0.0412 | 0.0487 | 0.0824
0.6 | 2686/4294 | 191.73 | 0.0150 | 0.0225 | 0.0412 | 0.0599 | 0.0674 | 0.0861 | 0.1011
0.8 | 1343/2147 | 55.76 | 0.0075 | 0.0075 | 0.0075 | 0.0225 | 0.0562 | 0.0637 | 0.0787

Table 10  Precision@K of FAROS under different compression ratios on the flickr/lastfm dataset

Parameter r | #Nodes | Running time (s) | P@1 | P@5 | P@10 | P@15 | P@20 | P@25 | P@30
0 | 12974/15436 | 3687.23 | 0.0111 | 0.0310 | 0.0710 | 0.1086 | 0.1397 | 0.1729 | 0.1929
0.2 | 10380/12349 | 2259.77 | 0.0044 | 0.0399 | 0.0665 | 0.1042 | 0.1552 | 0.1929 | 0.2151
0.4 | 7785/9262 | 1194.05 | 0.0067 | 0.0310 | 0.0732 | 0.1197 | 0.1530 | 0.1863 | 0.2062
0.6 | 5190/6175 | 535.10 | 0.0067 | 0.0377 | 0.0754 | 0.1153 | 0.1353 | 0.1619 | 0.1951
0.8 | 2595/3088 | 140.94 | 0.0089 | 0.0310 | 0.0687 | 0.1109 | 0.1353 | 0.1685 | 0.1907

Figure 3  P@5 of FAROS on the four datasets when different proportions of edges are removed

Figure 4  P@5 of FAROS on the four datasets when different proportions of edges are added

Figure 5  P@5 of FAROS with different GNN models

Figure 6  P@5 of FAROS with different coarsening methods
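Figure 6 compares FAROS under different graph coarsening methods (the reference list below includes spectral and multilevel matching schemes, e.g., refs 25, 41 and 42). As a generic illustration of what a single coarsening step does, and not of any specific method compared in the figure, the sketch below greedily matches each node with an unmatched neighbour and contracts the pair, roughly halving the node count.

```python
# Minimal edge-matching coarsening sketch (generic, not a specific method from Figure 6):
# greedily match each unmatched node with an unmatched neighbour and contract the pair.
import networkx as nx

def coarsen_once(G):
    matched, mapping, nxt = set(), {}, 0
    for u in G.nodes():
        if u in matched:
            continue
        partner = next((v for v in G.neighbors(u) if v not in matched and v != u), None)
        mapping[u] = nxt
        if partner is not None:
            mapping[partner] = nxt
            matched.add(partner)
        matched.add(u)
        nxt += 1
    # build the coarse graph: supernodes are the match groups, edges are collapsed
    C = nx.Graph()
    C.add_nodes_from(range(nxt))
    for a, b in G.edges():
        if mapping[a] != mapping[b]:
            C.add_edge(mapping[a], mapping[b])
    return C, mapping

G = nx.karate_club_graph()
C, mapping = coarsen_once(G)
print(G.number_of_nodes(), "->", C.number_of_nodes())   # roughly halves the node count
```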

1 Trung H T, Toan N T, Van Tong V, et al. A comparative study on network alignment techniques. Expert Systems with Applications, 2020, 140: 112883.
2 Ren J Q, Jiang L, Peng H, et al. Cross-network social user embedding with hybrid differential privacy guarantees∥Proceedings of the 31st ACM International Conference on Information & Knowledge Management. Atlanta, GA, USA: ACM, 2022: 1685-1695.
3 Gu S, Milenković T. Data-driven biological network alignment that uses topological, sequence, and functional information. BMC Bioinformatics, 2021, 22(1): 34.
4 Chakrabarti S, Singh H, Lohiya S, et al. Joint completion and alignment of multilingual knowledge graphs∥Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics, 2022: 11922-11938.
5 Zhang S, Tong H H. Attributed network alignment: Problem definitions and fast solutions. IEEE Transactions on Knowledge and Data Engineering, 2019, 31(9): 1680-1692.
6 Man T, Shen H W, Liu S H, et al. Predict anchor links across social networks via an embedding approach∥Proceedings of the 25th International Joint Conference on Artificial Intelligence. New York, NY, USA: AAAI Press, 2016: 1823-1829.
7 Heimann M, Shen H M, Safavi T, et al. REGAL: Representation learning-based graph alignment∥Proceedings of the 27th ACM International Conference on Information and Knowledge Management. Torino, Italy: ACM, 2018: 117-126.
8 Nguyen T T, Pham M T, Nguyen T T, et al. Structural representation learning for network alignment with self-supervised anchor links. Expert Systems with Applications, 2021, 165: 113857.
9 Trung H T, Van Vinh T, Tam N T, et al. Adaptive network alignment with unsupervised and multi-order convolutional networks∥Proceedings of the IEEE 36th International Conference on Data Engineering. Dallas, TX, USA: IEEE, 2020: 85-96.
10 Qin K K, Salim F D, Ren Y L, et al. G-CREWE: Graph compression with embedding for network alignment∥Proceedings of the 29th ACM International Conference on Information & Knowledge Management. Online: ACM, 2020: 1255-1264.
11 Gao J, Huang X, Li J D. Unsupervised graph alignment with Wasserstein distance discriminator∥Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. Singapore: ACM, 2021: 426-435.
12 Park J D, Tran C, Shin W Y, et al. Grad-Align: Gradual network alignment via graph neural networks (student abstract)∥Proceedings of the 36th AAAI Conference on Artificial Intelligence. Online: AAAI, 2022: 13027-13028.
13 Lu M L, Dai Y L, Zhang Z Q. Social network alignment: A bi-layer graph attention neural networks based method. Applied Intelligence, 2022, 52(14): 16310-16333.
14 Huynh T T, Chi T D, Nguyen T T, et al. Network alignment with holistic embeddings. IEEE Transactions on Knowledge and Data Engineering, 2023, 35(2): 1881-1894.
15 Zhou J Y, Liu L, Wei W Q, et al. Network representation learning: From preprocessing, feature extraction to node embedding. ACM Computing Surveys, 2022, 55(2): 38.
16 Xie Y C, Xu Z, Zhang J T, et al. Self-supervised learning of graph neural networks: A unified review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(2): 2412-2429.
17 Perozzi B, Al-Rfou R, Skiena S. DeepWalk: Online learning of social representations∥Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM, 2014: 701-710.
18 Tang J, Qu M, Wang M Z, et al. LINE: Large-scale information network embedding∥Proceedings of the 24th International Conference on World Wide Web. Florence, Italy: ACM, 2015: 1067-1077.
19 Küçükpetek S, Polat F, Oğuztüzün H. Multilevel graph partitioning: An evolutionary approach. Journal of the Operational Research Society, 2005, 56(5): 549-562.
20 Akyildiz T A, Aljundi A A, Kaya K. Understanding coarsening for embedding large-scale graphs∥Proceedings of 2020 IEEE International Conference on Big Data. Atlanta, GA, USA: IEEE, 2020: 2937-2946.
21 Yang D, Ge Y R, Nguyen T, et al. Structural equivalence in subgraph matching. IEEE Transactions on Network Science and Engineering, 2023, 10(4): 1846-1862.
22 Yang M S, Hussain I. Unsupervised multi-view K-means clustering algorithm. IEEE Access, 2023, 11: 13574-13593.
23 Niknam G, Molaei S, Zare H, et al. Graph representation learning based on deep generative Gaussian mixture models. Neurocomputing, 2023, 523: 157-169.
24 Sattar N S, Arifuzzaman S. Scalable distributed Louvain algorithm for community detection in large graphs. The Journal of Supercomputing, 2022, 78(7): 10275-10309.
25 Loukas A. Graph reduction with spectral and cut guarantees. Journal of Machine Learning Research, 2019, 20(116): 1-42.
26 Drineas P, Mahoney M W. On the Nyström method for approximating a Gram matrix for improved kernel-based learning. The Journal of Machine Learning Research, 2005, 6: 2153-2175.
27 Abudalfa S, Mikki M. A dynamic linkage clustering using KD-tree. The International Arab Journal of Information Technology, 2013, 10(3): 283-289.
28 Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks. Communications of the ACM, 2020, 63(11): 139-144.
29 Schönemann P H. A generalized solution of the orthogonal Procrustes problem. Psychometrika, 1966, 31(1): 1-10.
30 Zhu J, Koutra D, Heimann M. CAPER: Coarsen, align, project, refine: A general multilevel framework for network alignment∥Proceedings of the 31st ACM International Conference on Information & Knowledge Management. Atlanta, GA, USA: ACM, 2022: 4747-4751.
31 Chen X Y, Heimann M, Vahedian F, et al. CONE-Align: Consistent network alignment with proximity-preserving node embedding∥Proceedings of the 29th ACM International Conference on Information & Knowledge Management. Online: ACM, 2020: 1985-1988.
32 Kipf T N, Welling M. Semi-supervised classification with graph convolutional networks. 2016, arXiv: 1609.02907.
33 Xu K, Hu W H, Leskovec J, et al. How powerful are graph neural networks? 2018, arXiv: 1810.00826.
34 Shi H T, Huang C Z, Zhang X C, et al. Wasserstein distance based multi-scale adversarial domain adaptation method for remaining useful life prediction. Applied Intelligence, 2023, 53(3): 3622-3637.
35 Wu M J, Vogt M, Maggiora G M, et al. Design of chemical space networks on the basis of Tversky similarity. Journal of Computer-Aided Molecular Design, 2016, 30(1): 1-12.
36 Béthune L, Kaloga Y, Borgnat P, et al. Hierarchical and unsupervised graph representation learning with Loukas's coarsening. Algorithms, 2020, 13(9): 206.
37 Morris C, Ritzert M, Fey M, et al. Weisfeiler and Leman go neural: Higher-order graph neural networks∥Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Honolulu, HI, USA: AAAI Press, 2019: 4602-4609.
38 Wu Z H, Pan S R, Chen F W, et al. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(1): 4-24.
39 Zhang S, Tong H H. FINAL: Fast attributed network alignment∥Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, CA, USA: ACM, 2016: 1345-1354.
40 Grover A, Leskovec J. node2vec: Scalable feature learning for networks∥Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, CA, USA: ACM, 2016: 855-864.
41 Ron D, Safro I, Brandt A. Relaxation-based coarsening and multiscale graph organization. Multiscale Modeling & Simulation, 2011, 9(1): 407-423.
42 Dhillon I S, Guan Y Q, Kulis B. Weighted graph cuts without eigenvectors: A multilevel approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(11): 1944-1957.
43 Shuman D I, Faraji M J, Vandergheynst P. A multiscale pyramid transform for graph signals. IEEE Transactions on Signal Processing, 2016, 64(8): 2119-2134.