南京大学学报(自然科学版) ›› 2023, Vol. 59 ›› Issue (6): 947–960. DOI: 10.13232/j.cnki.jnju.2023.06.005


利用粗图训练图神经网络实现网络对齐

钱峰1,2, 张蕾1, 赵姝2, 陈洁2

  1.铜陵学院数学与计算机学院,铜陵,244061
    2.安徽大学计算机科学与技术学院,合肥,230601
  • 收稿日期:2023-08-10 出版日期:2023-11-30 发布日期:2023-12-06
  • 通讯作者: 赵姝 E-mail:zhaoshuzs2002@hotmail.com
  • 基金资助:
    国家自然科学基金(61876001);安徽省高校科研计划(2022AH051749);安徽省高校优秀人才支持计划(GXYQ2020054);安徽省高校优秀青年骨干人才国内外访学研修项目(GXGNFX2021148)

Training of graph neural networks on coarsening graphs for network alignment

Feng Qian1,2, Lei Zhang1, Shu Zhao2, Jie Chen2

  1.School of Mathematics and Computer Science, Tongling University, Tongling, 244061, China
    2.School of Computer Science and Technology, Anhui University, Hefei, 230601, China
  • Received:2023-08-10 Online:2023-11-30 Published:2023-12-06
  • Contact: Shu Zhao E-mail:zhaoshuzs2002@hotmail.com

摘要:

网络对齐是一项极具挑战性的任务,旨在识别不同网络中的等效节点,由于网络的复杂性和监督数据的缺乏,传统方法的计算复杂度高,精度低.近年来,图神经网络(Graph Neural Networks,GNN)在网络对齐算法中得到了越来越多的应用.已有研究表明,与传统方法相比,使用GNN进行网络对齐可以降低计算复杂度并提高对齐精度,然而,基于GNN的方法的性能受到训练数据质量和网络规模的限制.为此,提出一种快速鲁棒的无监督网络对齐方法FAROS,采用在粗图上训练的GNN模型进行网络对齐.使用粗图进行GNN训练的优点:(1)显著减少训练数据,最大限度地减少GNN反向传播过程中必须更新的权重参数,减少训练时间;(2)缓解数据噪声,能提取网络最重要的结构特征,便于GNN获得更鲁棒的嵌入向量.在训练过程中,FAROS通过引入基于伪锚节点对的自监督学习来提高对齐精度.在真实数据集上的实验结果验证了FAROS算法的有效性,其在保持较好精度的同时,比同类方法快几个数量级.

关键词: 网络对齐, 图神经网络, 网络嵌入, 粗图, 锚节点对

Abstract:

Network alignment is a challenging task that aims to identify equivalent nodes across different networks. Conventional methods suffer from high computational complexity and low accuracy due to the complexity of networks and the lack of supervision. In recent years, Graph Neural Networks (GNNs) have been increasingly applied to network alignment, as they can reduce computational complexity and improve accuracy compared with traditional methods. However, the performance of GNN-based methods is limited by the quality of the training data and by network size. To address these limitations, we propose a fast and robust unsupervised network alignment method called FAROS, which employs a GNN model trained on coarse graphs. Using coarse graphs for GNN training significantly reduces the amount of training data, minimizes the weight parameters that must be updated during back-propagation, and shortens training time. Coarse graphs also mitigate data noise and retain the most important structural features of the network, which helps the GNN obtain more robust embedding vectors. During training, FAROS further improves alignment accuracy by introducing self-supervised learning based on pseudo-anchor node pairs. Experimental results on real datasets demonstrate the effectiveness of FAROS, which is several orders of magnitude faster than comparable methods while maintaining good accuracy.
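The abstract describes the pipeline only at a high level. As a rough illustration of the embedding-based alignment idea it refers to (see also Figure 1 below), the following minimal sketch propagates random features through a GCN-style normalized adjacency to obtain structural embeddings and then matches nodes across two graphs by cosine similarity. This is not the authors' FAROS implementation; the toy graphs and the helper names `gcn_embed` and `align_by_similarity` are hypothetical.

```python
# Illustrative sketch only: GCN-style propagation (no trained weights) producing
# structural embeddings, followed by cosine-similarity matching between two graphs.
# This is NOT the paper's FAROS implementation; all names here are hypothetical.
import numpy as np
import networkx as nx

def gcn_embed(G, dim=16, layers=2, seed=0):
    """Propagate random features with the symmetric-normalized adjacency D^-1/2 (A+I) D^-1/2."""
    rng = np.random.default_rng(seed)
    A = nx.to_numpy_array(G) + np.eye(G.number_of_nodes())   # A + I (self-loops)
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_hat = D_inv_sqrt @ A @ D_inv_sqrt                      # normalized adjacency
    H = rng.standard_normal((G.number_of_nodes(), dim))      # stand-in for node attributes
    for _ in range(layers):
        H = np.tanh(A_hat @ H)                               # one propagation step
    return H

def align_by_similarity(H1, H2):
    """For every node of graph 1, return the most similar node of graph 2 (cosine)."""
    H1 = H1 / np.linalg.norm(H1, axis=1, keepdims=True)
    H2 = H2 / np.linalg.norm(H2, axis=1, keepdims=True)
    sim = H1 @ H2.T                                          # cosine similarity matrix
    return sim.argmax(axis=1), sim

if __name__ == "__main__":
    G1 = nx.karate_club_graph()
    G2 = G1.copy()                                           # identical copy as a toy target
    match, sim = align_by_similarity(gcn_embed(G1), gcn_embed(G2))
    print("first ten predicted matches:", match[:10])
```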

Key words: network alignment, graph neural network, network embedding, coarse graph, anchor node pairs

CLC number: TP391

Figure 1  Embedding-based network alignment process

Figure 2  Framework of the FAROS algorithm

Table 1  Statistics of the datasets used in the experiments

Dataset | #Nodes | #Edges | Attribute dim. | #Anchor links | Avg. degree | Max. degree | Min. degree | Clustering coefficient
douban online/offline | 3906/1118 | 8164/1511 | 538 | 1118 | 4.18/2.70 | 97/26 | 1/1 | 0.0404/0.0866
allmv/imdb | 6011/5713 | 124709/119073 | 14 | 5175 | 41.49/41.68 | 159/157 | 1/1 | 0.3813/0.3833
flickr/myspace | 6714/10733 | 7333/11081 | 3 | 267 | 2.18/2.06 | 639/164 | 1/1 | 0.0014/0.0
flickr/lastfm | 12974/15436 | 16149/16319 | 3 | 451 | 2.49/2.11 | 868/976 | 1/1 | 0.0087/0.0129
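As a quick sanity check on Table 1, the average degree column follows directly from the node and edge counts as 2|E|/|V|; the short snippet below recomputes it for the douban and allmv/imdb graphs using the values reported above.

```python
# Average degree = 2|E| / |V|, checked against the counts reported in Table 1.
datasets = {
    "douban online": (3906, 8164),    # (|V|, |E|)
    "douban offline": (1118, 1511),
    "allmv": (6011, 124709),
    "imdb": (5713, 119073),
}
for name, (n, m) in datasets.items():
    print(f"{name}: average degree = {2 * m / n:.2f}")
# Prints 4.18, 2.70, 41.49 and 41.68, matching the table.
```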

Table 2  Comparison of MAP and running time of FAROS and the baseline methods on the four datasets

Dataset | Metric | REGAL | GAlign | NAWAL | WAlign | Grad-Align | FAROS
douban | MAP | 0.1005 | 0.5518 | 0.0051 | 0.3004 | 0.5843 | 0.6504
douban | Time (s) | 13.00 | 66.82 | 623.29 | 172.11 | 239.53 | 18.99
allmv/imdb | MAP | 0.1887 | 0.7537 | 0.0012 | 0.6205 | 0.8094 | 0.7830
allmv/imdb | Time (s) | 62.10 | 511.42 | 1731.96 | 539.04 | 5674.63 | 175.73
flickr/myspace | MAP | 0.0132 | 0.0111 | 0.0006 | 0.0168 | 0.0056 | 0.0317
flickr/myspace | Time (s) | 49.13 | 587.69 | 2068.85 | 933.91 | 14327.69 | 113.49
flickr/lastfm | MAP | 0.0073 | 0.0102 | 0.0000 | 0.0073 | 0.0040 | 0.0372
flickr/lastfm | Time (s) | 87.95 | 1570.02 | 4389.22 | 2685.89 | 54327.69 | 302.82
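For reference, the MAP and Precision@K values in Tables 2 to 6 are the standard ranking metrics for alignment with one true counterpart per node; the hedged sketch below shows one common way to compute them from a node-similarity matrix (the `similarity` matrix and `ground_truth` mapping are hypothetical inputs, not the paper's evaluation code).

```python
# Hedged sketch: Precision@K and MAP for one-to-one alignment ground truth.
# With exactly one correct counterpart per source node, average precision for that
# node reduces to 1 / rank of the counterpart, and Precision@K is the fraction of
# source nodes whose counterpart appears among the top-K most similar candidates.
import numpy as np

def evaluate(similarity, ground_truth, ks=(1, 5, 10, 15, 20, 25, 30)):
    """similarity: (n1, n2) score matrix; ground_truth: dict {source node -> target node}."""
    ranks = []
    for src, tgt in ground_truth.items():
        order = np.argsort(-similarity[src])                 # candidates, best first
        ranks.append(int(np.where(order == tgt)[0][0]) + 1)  # 1-based rank of the true match
    ranks = np.asarray(ranks)
    p_at_k = {k: float(np.mean(ranks <= k)) for k in ks}
    map_score = float(np.mean(1.0 / ranks))
    return map_score, p_at_k

# toy usage with random scores and an identity ground truth
rng = np.random.default_rng(0)
sim = rng.random((100, 100))
gt = {i: i for i in range(100)}
print(evaluate(sim, gt))
```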

Table 3  Precision@K of FAROS and the baseline methods on the douban online/offline dataset

Precision@K | 1 | 5 | 10 | 15 | 20 | 25 | 30
REGAL | 0.0447 | 0.1360 | 0.2030 | 0.2576 | 0.3068 | 0.3435 | 0.3739
GAlign | 0.4419 | 0.6780 | 0.7800 | 0.8327 | 0.8685 | 0.8909 | 0.9114
NAWAL | 0.0000 | 0.0027 | 0.0036 | 0.0072 | 0.0089 | 0.0107 | 0.0179
WAlign | 0.2039 | 0.3882 | 0.4946 | 0.5653 | 0.6136 | 0.6547 | 0.6816
Grad-Align | 0.4794 | 0.7084 | 0.7737 | 0.8113 | 0.8318 | 0.8515 | 0.8649
FAROS | 0.5358 | 0.7549 | 0.8435 | 0.8873 | 0.9106 | 0.9204 | 0.9275

Table 4  Precision@K of FAROS and the baseline methods on the allmv/imdb dataset

Precision@K | 1 | 5 | 10 | 15 | 20 | 25 | 30
REGAL | 0.0951 | 0.2685 | 0.3868 | 0.4672 | 0.5280 | 0.5682 | 0.6049
GAlign | 0.6982 | 0.8184 | 0.8549 | 0.8725 | 0.8858 | 0.8968 | 0.9061
NAWAL | 0.0002 | 0.0002 | 0.0008 | 0.0014 | 0.0019 | 0.0027 | 0.0035
WAlign | 0.5384 | 0.7112 | 0.7720 | 0.8053 | 0.8261 | 0.8420 | 0.8551
Grad-Align | 0.7081 | 0.9370 | 0.9600 | 0.9654 | 0.9708 | 0.9728 | 0.9745
FAROS | 0.7233 | 0.8516 | 0.8878 | 0.9065 | 0.9192 | 0.9279 | 0.9353

Table 5  Precision@K of FAROS and the baseline methods on the flickr/myspace dataset

Precision@K | 1 | 5 | 10 | 15 | 20 | 25 | 30
REGAL | 0.0037 | 0.0112 | 0.0112 | 0.0225 | 0.0375 | 0.0562 | 0.0562
GAlign | 0.0000 | 0.0075 | 0.0187 | 0.0375 | 0.0449 | 0.0487 | 0.0674
NAWAL | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
WAlign | 0.0037 | 0.0075 | 0.0375 | 0.0412 | 0.0524 | 0.0787 | 0.0861
Grad-Align | 0.0075 | 0.0112 | 0.0150 | 0.0225 | 0.0412 | 0.0524 | 0.0562
FAROS | 0.0112 | 0.0262 | 0.0487 | 0.0562 | 0.0824 | 0.0974 | 0.1049

Table 6  Precision@K of FAROS and the baseline methods on the flickr/lastfm dataset

Precision@K | 1 | 5 | 10 | 15 | 20 | 25 | 30
REGAL | 0.0000 | 0.0022 | 0.0067 | 0.0177 | 0.0244 | 0.0443 | 0.0554
GAlign | 0.0044 | 0.0044 | 0.0155 | 0.0200 | 0.0244 | 0.0310 | 0.0377
NAWAL | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000
WAlign | 0.0000 | 0.0044 | 0.0133 | 0.0200 | 0.0266 | 0.0310 | 0.0377
Grad-Align | 0.0044 | 0.0288 | 0.0510 | 0.0710 | 0.0953 | 0.1153 | 0.1397
FAROS | 0.0133 | 0.0355 | 0.0732 | 0.1109 | 0.1463 | 0.1840 | 0.1973

Table 7  Precision@K of FAROS under different compression ratios on the douban online/offline dataset

Parameter r | #Nodes | Running time (s) | P@1 | P@5 | P@10 | P@15 | P@20 | P@25 | P@30
0 | 3906/1118 | 104.43 | 0.5349 | 0.7639 | 0.8462 | 0.8775 | 0.9106 | 0.9347 | 0.9428
0.2 | 3125/895 | 69.91 | 0.5331 | 0.7621 | 0.8390 | 0.8855 | 0.9088 | 0.9311 | 0.9436
0.4 | 2344/671 | 44.84 | 0.5268 | 0.7522 | 0.8408 | 0.8810 | 0.9204 | 0.9320 | 0.9445
0.6 | 1563/449 | 25.03 | 0.5385 | 0.7665 | 0.8497 | 0.8828 | 0.9132 | 0.9284 | 0.9410
0.8 | 782/224 | 13.95 | 0.5224 | 0.7585 | 0.8408 | 0.8748 | 0.9070 | 0.9195 | 0.9302
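Tables 7 to 10 vary the compression ratio r. Reading the node-count column against Table 1 suggests that the coarse graph keeps roughly a (1-r) fraction of the original nodes; the small check below (an assumption about the rounding, not a statement of the paper's coarsening schedule) reproduces the douban row sizes to within a node.

```python
# Assumed reading of the r column in Tables 7-10: the coarse graph keeps roughly a
# (1 - r) fraction of the original nodes (coarsening cannot always hit the target exactly).
import math

for r in (0.2, 0.4, 0.6, 0.8):
    online = math.ceil((1 - r) * 3906)    # douban online has 3906 nodes originally
    offline = math.ceil((1 - r) * 1118)   # douban offline has 1118 nodes originally
    print(f"r = {r}: about {online}/{offline} nodes")
# Table 7 reports 3125/895, 2344/671, 1563/449 and 782/224, i.e. within a node of these targets.
```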

Table 8  Precision@K of FAROS under different compression ratios on the allmv/imdb dataset

Parameter r | #Nodes | Running time (s) | P@1 | P@5 | P@10 | P@15 | P@20 | P@25 | P@30
0 | 6011/5713 | 1212.40 | 0.7073 | 0.8495 | 0.8878 | 0.9075 | 0.9214 | 0.9306 | 0.9374
0.2 | 4809/4571 | 794.71 | 0.7253 | 0.8516 | 0.8868 | 0.9082 | 0.9214 | 0.9301 | 0.9361
0.4 | 3607/3428 | 494.16 | 0.7243 | 0.8518 | 0.8906 | 0.9102 | 0.9208 | 0.9293 | 0.9368
0.6 | 2405/2286 | 233.32 | 0.7243 | 0.8570 | 0.8920 | 0.9109 | 0.9225 | 0.9306 | 0.9364
0.8 | 1203/1143 | 130.00 | 0.7162 | 0.8503 | 0.8860 | 0.9049 | 0.9177 | 0.9264 | 0.9355

Table 9  Precision@K of FAROS under different compression ratios on the flickr/myspace dataset

Parameter r | #Nodes | Running time (s) | P@1 | P@5 | P@10 | P@15 | P@20 | P@25 | P@30
0 | 6714/10733 | 1223.03 | 0.0112 | 0.0262 | 0.0487 | 0.0674 | 0.0787 | 0.0974 | 0.1049
0.2 | 5372/8587 | 775.85 | 0.0075 | 0.0225 | 0.0337 | 0.0524 | 0.0637 | 0.0787 | 0.1011
0.4 | 4029/6440 | 426.75 | 0.0075 | 0.0187 | 0.0262 | 0.0375 | 0.0412 | 0.0487 | 0.0824
0.6 | 2686/4294 | 191.73 | 0.0150 | 0.0225 | 0.0412 | 0.0599 | 0.0674 | 0.0861 | 0.1011
0.8 | 1343/2147 | 55.76 | 0.0075 | 0.0075 | 0.0075 | 0.0225 | 0.0562 | 0.0637 | 0.0787

Table 10  Precision@K of FAROS under different compression ratios on the flickr/lastfm dataset

Parameter r | #Nodes | Running time (s) | P@1 | P@5 | P@10 | P@15 | P@20 | P@25 | P@30
0 | 12974/15436 | 3687.23 | 0.0111 | 0.0310 | 0.0710 | 0.1086 | 0.1397 | 0.1729 | 0.1929
0.2 | 10380/12349 | 2259.77 | 0.0044 | 0.0399 | 0.0665 | 0.1042 | 0.1552 | 0.1929 | 0.2151
0.4 | 7785/9262 | 1194.05 | 0.0067 | 0.0310 | 0.0732 | 0.1197 | 0.1530 | 0.1863 | 0.2062
0.6 | 5190/6175 | 535.10 | 0.0067 | 0.0377 | 0.0754 | 0.1153 | 0.1353 | 0.1619 | 0.1951
0.8 | 2595/3088 | 140.94 | 0.0089 | 0.0310 | 0.0687 | 0.1109 | 0.1353 | 0.1685 | 0.1907

Figure 3  P@5 of FAROS on the four datasets when different proportions of edges are removed

Figure 4  P@5 of FAROS on the four datasets when different proportions of edges are added

Figure 5  P@5 of FAROS with different GNN models

Figure 6  P@5 of FAROS with different coarsening methods
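Figure 6 compares FAROS under different graph coarsening methods (the reference list below includes spectral and multilevel matching schemes, e.g., refs 25, 41 and 42). As a generic illustration of what a single coarsening step does, and not of any specific method compared in the figure, the sketch below greedily matches each node with an unmatched neighbour and contracts the pair, roughly halving the node count.

```python
# Minimal edge-matching coarsening sketch (generic, not a specific method from Figure 6):
# greedily match each unmatched node with an unmatched neighbour and contract the pair.
import networkx as nx

def coarsen_once(G):
    matched, mapping, nxt = set(), {}, 0
    for u in G.nodes():
        if u in matched:
            continue
        partner = next((v for v in G.neighbors(u) if v not in matched and v != u), None)
        mapping[u] = nxt
        if partner is not None:
            mapping[partner] = nxt
            matched.add(partner)
        matched.add(u)
        nxt += 1
    # build the coarse graph: supernodes are the match groups, edges are collapsed
    C = nx.Graph()
    C.add_nodes_from(range(nxt))
    for a, b in G.edges():
        if mapping[a] != mapping[b]:
            C.add_edge(mapping[a], mapping[b])
    return C, mapping

G = nx.karate_club_graph()
C, mapping = coarsen_once(G)
print(G.number_of_nodes(), "->", C.number_of_nodes())   # roughly halves the node count
```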

1 Trung H T, Toan N T, Van Tong V, et al. A comparative study on network alignment techniques. Expert Systems with Applications, 2020, 140: 112883.
2 Ren J Q, Jiang L, Peng H, et al. Cross-network social user embedding with hybrid differential privacy guarantees∥Proceedings of the 31st ACM International Conference on Information & Knowledge Management. Atlanta, GA, USA: ACM, 2022: 1685-1695.
3 Gu S, Milenković T. Data-driven biological network alignment that uses topological, sequence, and functional information. BMC Bioinformatics, 2021, 22(1): 34.
4 Chakrabarti S, Singh H, Lohiya S, et al. Joint completion and alignment of multilingual knowledge graphs∥Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Abu Dhabi, United Arab Emirates: Association for Computational Linguistics, 2022: 11922-11938.
5 Zhang S, Tong H H. Attributed network alignment: Problem definitions and fast solutions. IEEE Transactions on Knowledge and Data Engineering, 2019, 31(9): 1680-1692.
6 Man T, Shen H W, Liu S H, et al. Predict anchor links across social networks via an embedding approach∥Proceedings of the 25th International Joint Conference on Artificial Intelligence. New York, NY, USA: AAAI Press, 2016: 1823-1829.
7 Heimann M, Shen H M, Safavi T, et al. REGAL: Representation learning-based graph alignment∥Proceedings of the 27th ACM International Conference on Information and Knowledge Management. Torino, Italy: ACM, 2018: 117-126.
8 Nguyen T T, Pham M T, Nguyen T T, et al. Structural representation learning for network alignment with self-supervised anchor links. Expert Systems with Applications, 2021, 165: 113857.
9 Trung H T, Van Vinh T, Tam N T, et al. Adaptive network alignment with unsupervised and multi-order convolutional networks∥Proceedings of the IEEE 36th International Conference on Data Engineering. Dallas, TX, USA: IEEE, 2020: 85-96.
10 Qin K K, Salim F D, Ren Y L, et al. G-CREWE: Graph compression with embedding for network alignment∥Proceedings of the 29th ACM International Conference on Information & Knowledge Management. Online: ACM, 2020: 1255-1264.
11 Gao J, Huang X, Li J D. Unsupervised graph alignment with Wasserstein distance discriminator∥Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. Singapore: ACM, 2021: 426-435.
12 Park J D, Tran C, Shin W Y, et al. Grad-Align: Gradual network alignment via graph neural networks (student abstract)∥Proceedings of the 36th AAAI Conference on Artificial Intelligence. Online: AAAI, 2022: 13027-13028.
13 Lu M L, Dai Y L, Zhang Z Q. Social network alignment: A bi-layer graph attention neural networks based method. Applied Intelligence, 2022, 52(14): 16310-16333.
14 Huynh T T, Chi T D, Nguyen T T, et al. Network alignment with holistic embeddings. IEEE Transactions on Knowledge and Data Engineering, 2023, 35(2): 1881-1894.
15 Zhou J Y, Liu L, Wei W Q, et al. Network representation learning: From preprocessing, feature extraction to node embedding. ACM Computing Surveys, 2022, 55(2): 38.
16 Xie Y C, Xu Z, Zhang J T, et al. Self-supervised learning of graph neural networks: A unified review. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(2): 2412-2429.
17 Perozzi B, Al-Rfou R, Skiena S. DeepWalk: Online learning of social representations∥Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM, 2014: 701-710.
18 Tang J, Qu M, Wang M Z, et al. LINE: Large-scale information network embedding∥Proceedings of the 24th International Conference on World Wide Web. Florence, Italy: ACM, 2015: 1067-1077.
19 Küçükpetek S, Polat F, Oğuztüzün H. Multilevel graph partitioning: An evolutionary approach. Journal of the Operational Research Society, 2005, 56(5): 549-562.
20 Akyildiz T A, Aljundi A A, Kaya K. Understanding coarsening for embedding large-scale graphs∥Proceedings of 2020 IEEE International Conference on Big Data. Atlanta, GA, USA: IEEE, 2020: 2937-2946.
21 Yang D, Ge Y R, Nguyen T, et al. Structural equivalence in subgraph matching. IEEE Transactions on Network Science and Engineering, 2023, 10(4): 1846-1862.
22 Yang M S, Hussain I. Unsupervised multi-view K-means clustering algorithm. IEEE Access, 2023, 11: 13574-13593.
23 Niknam G, Molaei S, Zare H, et al. Graph representation learning based on deep generative Gaussian mixture models. Neurocomputing, 2023, 523: 157-169.
24 Sattar N S, Arifuzzaman S. Scalable distributed Louvain algorithm for community detection in large graphs. The Journal of Supercomputing, 2022, 78(7): 10275-10309.
25 Loukas A. Graph reduction with spectral and cut guarantees. Journal of Machine Learning Research, 2019, 20(116): 1-42.
26 Drineas P, Mahoney M W. On the Nyström method for approximating a Gram matrix for improved kernel-based learning. The Journal of Machine Learning Research, 2005, 6: 2153-2175.
27 Abudalfa S, Mikki M. A dynamic linkage clustering using KD-tree. The International Arab Journal of Information Technology, 2013, 10(3): 283-289.
28 Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks. Communications of the ACM, 2020, 63(11): 139-144.
29 Schönemann P H. A generalized solution of the orthogonal Procrustes problem. Psychometrika, 1966, 31(1): 1-10.
30 Zhu J, Koutra D, Heimann M. CAPER: Coarsen, align, project, refine: A general multilevel framework for network alignment∥Proceedings of the 31st ACM International Conference on Information & Knowledge Management. Atlanta, GA, USA: ACM, 2022: 4747-4751.
31 Chen X Y, Heimann M, Vahedian F, et al. CONE-Align: Consistent network alignment with proximity-preserving node embedding∥Proceedings of the 29th ACM International Conference on Information & Knowledge Management. Online: ACM, 2020: 1985-1988.
32 Kipf T N, Welling M. Semi-supervised classification with graph convolutional networks. 2016, arXiv: 1609.02907.
33 Xu K, Hu W H, Leskovec J, et al. How powerful are graph neural networks? 2018, arXiv: 1810.00826.
34 Shi H T, Huang C Z, Zhang X C, et al. Wasserstein distance based multi-scale adversarial domain adaptation method for remaining useful life prediction. Applied Intelligence, 2023, 53(3): 3622-3637.
35 Wu M J, Vogt M, Maggiora G M, et al. Design of chemical space networks on the basis of Tversky similarity. Journal of Computer-Aided Molecular Design, 2016, 30(1): 1-12.
36 Béthune L, Kaloga Y, Borgnat P, et al. Hierarchical and unsupervised graph representation learning with Loukas's coarsening. Algorithms, 2020, 13(9): 206.
37 Morris C, Ritzert M, Fey M, et al. Weisfeiler and Leman go neural: Higher-order graph neural networks∥Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Honolulu, HI, USA: AAAI Press, 2019: 4602-4609.
38 Wu Z H, Pan S R, Chen F W, et al. A comprehensive survey on graph neural networks. IEEE Transactions on Neural Networks and Learning Systems, 2021, 32(1): 4-24.
39 Zhang S, Tong H H. FINAL: Fast attributed network alignment∥Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, CA, USA: ACM, 2016: 1345-1354.
40 Grover A, Leskovec J. node2vec: Scalable feature learning for networks∥Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco, CA, USA: ACM, 2016: 855-864.
41 Ron D, Safro I, Brandt A. Relaxation-based coarsening and multiscale graph organization. Multiscale Modeling & Simulation, 2011, 9(1): 407-423.
42 Dhillon I S, Guan Y Q, Kulis B. Weighted graph cuts without eigenvectors: A multilevel approach. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2007, 29(11): 1944-1957.
43 Shuman D I, Faraji M J, Vandergheynst P. A multiscale pyramid transform for graph signals. IEEE Transactions on Signal Processing, 2016, 64(8): 2119-2134.