Journal of Nanjing University (Natural Science), 2022, Vol. 58, Issue (3): 386–397. doi: 10.13232/j.cnki.jnju.2022.03.003


Social influence prediction with graph neural network

Yizhou Chen1, Xusheng Liu2, Lintan Sun2, Wenzhong Li1, Libing Fang3, Sanglu Lu1

  1. State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, China
  2. Customer Service Center of State Grid Corporation of China, Tianjin, 300309, China
  3. School of Management & Engineering, Nanjing University, Nanjing, 210023, China
  • Received: 2022-03-08 Online: 2022-05-30 Published: 2022-06-07
  • Contact: Wenzhong Li E-mail: lwz@nju.edu.cn
  • Funding:
    Science and Technology Project of State Grid Corporation of China (5700-202153172A-0-0-00)

Abstract:

Over the past decade, sharing information through social networks such as Weibo and Twitter has become an indispensable part of daily life. Effectively predicting the influence of information diffusion has therefore become an important topic in social network research, with applications ranging from identifying viral marketing and fake news to precise recommendation and online advertising. Deep learning methods for social influence prediction have made some progress in recent years, but they still face two difficulties: users typically have different behaviors and interests and interact through several channels at the same time, and the relationships between users are hard to detect and to describe formally. Traditional social influence prediction methods extract the features of users and their networks manually by designing complex rules. Their effectiveness depends heavily on the domain expertise behind those rules, so rules built for one field are hard to generalize to applications in another. Building on deep neural networks, we design an end-to-end neural network that learns users' hidden features to predict their influence in the social network. Features are first extracted from each user's local network through graph embedding, and the resulting feature vectors are then used as input to train a graph neural network that predicts the user's social representation. Unlike previous work, the method uses graph convolution and graph attention to combine the feature attributes of users with the features of their local networks, which greatly improves prediction accuracy. Extensive experiments on the Twitter, Weibo, and OAG datasets show that the method performs well on different types of networks.

Key words: graph embedding, graph convolution, graph attention, social network, deep learning

CLC number:

  • TP391
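The abstract names graph convolution as one of the two operations used to combine user attributes with local network structure. The following is a minimal sketch of a single graph-convolution step, assuming the widely used symmetric-normalised propagation rule ReLU(D^-1/2 (A + I) D^-1/2 X W). The function name `gcn_layer`, the list-of-lists matrix layout, and the fixed weights are illustrative assumptions and are not taken from the paper.

```python
import math


def gcn_layer(adj, feats, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    adj    : n x n 0/1 adjacency matrix (list of lists), assumed symmetric
    feats  : n x d_in node feature matrix X
    weight : d_in x d_out weight matrix W (learned in practice, fixed here)
    """
    n = len(adj)
    # Add self-loops so every node keeps part of its own signal.
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalisation of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    d_in, d_out = len(weight), len(weight[0])
    # Aggregate neighbour features: norm @ feats.
    agg = [[sum(norm[i][k] * feats[k][c] for k in range(n)) for c in range(d_in)]
           for i in range(n)]
    # Linear map followed by ReLU: max(0, agg @ weight).
    return [[max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d_in)))
             for o in range(d_out)]
            for i in range(n)]
```

On a two-node graph with one edge, both nodes receive the average of the two input features, which illustrates how a user's representation mixes with that of its neighbours.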

Figure 1

Euclidean data (top) and non-Euclidean data (bottom) [15]

Figure 2

Graph attention layer

Figure 3

Social influence prediction model
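The graph attention layer named in the caption of Figure 2 can be illustrated with a small sketch. It assumes the standard single-head formulation, in which the score for a neighbour j is LeakyReLU(a^T [W h_i || W h_j]) and the scores are normalised with a softmax over the neighbourhood. The names `gat_node_update`, `w`, and `a` are hypothetical, the final nonlinearity is omitted, and the paper's actual layer may differ in details such as multiple heads.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def matvec(w, h):
    # w is a d_out x d_in matrix, h a vector of length d_in.
    return [sum(wi * hi for wi, hi in zip(row, h)) for row in w]


def gat_node_update(i, neighbors, feats, w, a):
    """Single-head graph attention update for node i.

    neighbors : list of node indices attended to (typically includes i)
    feats     : list of input feature vectors h_j
    w         : shared projection matrix W (d_out x d_in)
    a         : attention vector of length 2 * d_out
    Returns (attention weights, updated feature vector).
    """
    z = {j: matvec(w, feats[j]) for j in set(neighbors) | {i}}
    # Unnormalised scores e_ij = LeakyReLU(a^T [z_i || z_j]).
    scores = [leaky_relu(sum(x * y for x, y in zip(a, z[i] + z[j])))
              for j in neighbors]
    # Softmax over the neighbourhood, shifted by the maximum for stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alpha = [e / total for e in exps]
    # Attention-weighted sum of the projected neighbour features.
    d_out = len(w)
    h_new = [sum(alpha[k] * z[j][c] for k, j in enumerate(neighbors))
             for c in range(d_out)]
    return alpha, h_new
```

With an all-zero attention vector the weights are uniform and the update reduces to a plain neighbourhood average; a non-zero `a` lets the layer weight some neighbours more than others.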

Table 1

Dataset overview

     Weibo        Twitter     OAG
V    1776950      456626      953675
E    308489739    12508413    4151463
N    779164       449160      499848

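Table 2

Manually extracted node features and local graph features

Category                     Description
Node features                Coreness
                             PageRank
                             Authority score
                             Eigenvector centrality
                             Clustering coefficient
                             Rarity
Graph representation [28]    64-dimensional graph representation vector extracted with the Inf2vec algorithm
Subgraph features            Number of active neighbors
                             Density of the subnetwork induced by the active neighbors
                             Connected components formed by the active neighbors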
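The three subgraph features listed in Table 2 can be computed directly from a user's neighbourhood. The sketch below is a hedged illustration rather than the authors' feature extractor: the function name `active_subgraph_features`, the input layout, and the choice to report the number of connected components are assumptions.

```python
def active_subgraph_features(neighbors, active, edges):
    """Features of the subgraph induced by a user's active neighbours.

    neighbors : iterable of neighbour ids of the ego user
    active    : set of ids that have already performed the action
    edges     : iterable of undirected edges (u, v) among the neighbours
    """
    nodes = set(neighbors) & set(active)
    adj = {u: set() for u in nodes}
    for u, v in edges:
        # Keep only edges whose endpoints are both active neighbours.
        if u in nodes and v in nodes and u != v:
            adj[u].add(v)
            adj[v].add(u)
    n = len(nodes)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    # Density of an undirected simple graph: 2m / (n (n - 1)).
    density = 2 * m / (n * (n - 1)) if n > 1 else 0.0
    # Count connected components with an iterative depth-first search.
    seen, components = set(), 0
    for start in nodes:
        if start in seen:
            continue
        components += 1
        stack = [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            for v in adj[u] - seen:
                seen.add(v)
                stack.append(v)
    return {"num_active": n, "density": density, "num_components": components}
```

For example, with neighbours 1 to 5, active users {1, 2, 3, 4}, and edges (1, 2), (2, 3), (4, 5), the induced subgraph has four nodes, two edges, a density of 1/3, and two components.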

Table 3

Performance of the four methods on the three datasets

Dataset    Model    AUC       Precision    Recall    F1
Weibo      LR       76.09%    41.33%       71.87%    52.55%
           PSCN     80.30%    46.71%       70.52%    56.23%
           GCN      75.84%    41.43%       70.29%    52.20%
           GAT      81.71%    47.52%       75.08%    58.26%
Twitter    LR       77.06%    44.85%       68.80%    54.35%
           PSCN     77.73%    46.35%       66.28%    54.58%
           GCN      75.59%    43.30%       65.73%    52.25%
           GAT      79.21%    47.40%       68.07%    55.92%
OAG        LR       64.54%    31.25%       68.96%    43.15%
           PSCN     68.15%    35.44%       63.63%    45.60%
           GCN      62.54%    29.27%       73.35%    42.02%
           GAT      70.78%    39.76%       59.96%    47.85%
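The four metrics reported in Table 3 follow standard definitions. As a reminder of how they are computed, here is a small sketch on toy labels. The helper names and the toy data are illustrative assumptions, and the page does not state the averaging or thresholding used for the reported numbers, so this code is not expected to reproduce them.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall, and F1 for binary labels (1 = influenced)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: four users, scores thresholded at 0.5.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
```

AUC depends only on the ranking of the scores, whereas precision, recall, and F1 depend on the chosen decision threshold.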

Table 4

Relative gain of GAT over the other three methods in terms of AUC

Model            Weibo     Twitter    OAG
Relative gain    1.2%      1.1%       3.3%
LR               76.09%    77.06%     64.54%
PSCN             80.30%    77.73%     68.15%
GCN              75.84%    75.59%     62.54%
GAT              81.71%    79.21%     70.78%

Table 5

Results of the GAT model with and without manually extracted features as input

Dataset    Manual features    AUC       Precision    Recall    F1
Weibo      Yes                81.71%    47.52%       75.08%    58.26%
           No                 80.46%    45.89%       74.01%    56.70%
Twitter    Yes                79.21%    47.40%       68.07%    55.92%
           No                 77.29%    46.23%       64.35%    53.83%
OAG        Yes                70.78%    39.76%       59.96%    47.85%
           No                 67.06%    33.76%       65.86%    44.77%

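Figure 4

Comparison of the experimental results of the four methods under the following metrics

Figure 5

Effect of the restart probability of the RWR algorithm on AUC and F1

Figure 6

Effect of the number of sampled nodes on the AUC and F1 of the RWR algorithm

Figure 7

Effect of the ratio of positive to negative examples in the dataset on the AUC and F1 of the RWR algorithm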
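Figures 5 and 6 vary two parameters of RWR, read here as random walk with restart: the restart probability and the number of sampled nodes. The sketch below shows one common way such sampling can be written, as an assumption about the general technique rather than the authors' code. The function name `rwr_sample`, the `max_steps` guard, and the dictionary adjacency format are all hypothetical.

```python
import random


def rwr_sample(adj, start, restart_prob, num_nodes, rng=None, max_steps=10000):
    """Sample a local neighbourhood of `start` by random walk with restart.

    adj          : dict mapping node -> list of neighbours
    start        : the ego user the walk keeps returning to
    restart_prob : probability of jumping back to `start` at each step
    num_nodes    : stop once this many distinct nodes have been collected
    max_steps    : hard cap on walk length so the loop always terminates
    """
    rng = rng or random.Random()
    sampled = [start]
    seen = {start}
    current = start
    for _ in range(max_steps):
        if len(sampled) >= num_nodes:
            break
        neighbours = adj.get(current, [])
        if not neighbours or rng.random() < restart_prob:
            # Restart at the ego user (also used to escape dead ends).
            current = start
            continue
        current = rng.choice(neighbours)
        if current not in seen:
            seen.add(current)
            sampled.append(current)
    return sampled
```

A higher restart probability keeps the sample closer to the ego user, while a larger `num_nodes` yields a bigger local subgraph to feed to the model.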

1 Zhang J, Liu B, Tang J,et al. Social influence locality for modeling retweeting behaviors∥Proceedings of the 23rd International Joint Conference on Artificial Intelligence. Beijing,China:AAAI Press,2013:2761-2767.
2 Easley D, Kleinberg J. Networks,crowds,and markets. New York,USA:Cambridge University Press,2010:727.
3 Hamilton W L, Ying R, Leskovec J. Inductive representation learning on large graphs∥Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach,CA,USA:Curran Associates Inc.,2017:1025-1035.
4 Hootsuite. Digital in 2019. We Are Social,2019. 2019-05-10.
5 Grover A, Leskovec J. node2vec:Scalable feature learning for networks∥Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. San Francisco,CA,USA:ACM,2016:855-864.
6 Guille A, Hacid H. A predictive model for the temporal dynamics of information diffusion in online social networks∥Proceedings of the 21st International Conference on World Wide Web. Lyon,France:ACM,2012:1145-1152.
7 Ugander J, Backstrom L, Marlow C,et al. Structural diversity in social contagion. Proceedings of the National Academy of Sciences of the United States of America,2012,109(16):5962-5966.
8 Perozzi B, Al-Rfou R, Skiena S. DeepWalk:Online learning of social representations∥Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York,NY,USA:ACM,2014:701-710.
9 Bakshy E, Rosenn I, Marlow C,et al. The role of social networks in information diffusion∥Proceedings of the 21st International Conference on World Wide Web. Lyon,France:ACM,2012:519-528.
10 Tang J, Qu M, Wang M Z,et al. LINE:Large-scale information network embedding∥Proceedings of the 24th International Conference on World Wide Web. Florence,Italy:ACM,2015:1067-1077.
11 Myers S A, Zhu C G, Leskovec J. Information diffusion and external influence in networks∥Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Beijing,China:ACM,2012:33-41.
12 Guo R C, Shaabani E, Bhatnagar A,et al. Toward order-of-magnitude cascade prediction∥Proceedings of 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. Paris,France:ACM,2015:1610-1613.
13 Kempe D, Kleinberg J, Tardos É. Maximizing the spread of influence through a social network∥Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Washington DC,USA:ACM,2003:137-146.
14 Wang L R, Ermon S, Hopcroft J E. Feature-enhanced probabilistic models for diffusion network inference∥Proceedings of the Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer Berlin Heidelberg,2012:499-514.
15 Li C, Ma J Q, Guo X X,et al. DeepCas:An end-to-end predictor of information cascades∥Proceedings of the 26th International Conference on World Wide Web. Perth,Australia:ACM,2017:577-586.
16 Defferrard M, Bresson X, Vandergheynst P. Convolutional neural networks on graphs with fast localized spectral filtering∥Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona,Spain:Curran Associates Inc.,2016:3844-3852.
17 Su H Y, Gionis A, Rousu J. Structured prediction of network response∥Proceedings of the 31st International Conference on Machine Learning. Beijing,China:JMLR,2014:442-450.
18 Aris A, Ravi K, Mohammad M. Influence and correlation in social networks∥Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Las Vegas,NV,USA:Association for Computing Machinery,2008:7-15.
19 Niepert M, Ahmed M, Kutzkov K. Learning convolutional neural networks for graphs∥Proceedings of the 33rd International Conference on International Conference on Machine Learning. New York,NY,USA:JMLR,2016:2014-2023.
20 Yanardag P, Vishwanathan S V N. Deep graph kernels∥Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Sydney,Australia:ACM,2015:1365-1374.
21 Luong M, Pham H, Manning C D. Effective approaches to attention-based neural machine translation∥Proceedings of 2015 Conference on Empirical Methods in Natural Language Processing. Lisbon,Portugal:The Association for Computational Linguistics,2015:1412-1421.
22 Graves A, Mohamed A R, Hinton G. Speech recognition with deep recurrent neural networks∥Proceedings of 2013 IEEE International Conference on Acoustics,Speech and Signal Processing. Vancouver,Canada:IEEE,2013:6645-6649.
23 Wu Y C, Yin F, Liu C L. Influence and correlation in social networks. WWW,2012,12(8):519-528.
24 Ma H. An experimental study on implicit social recommendation∥Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval. Dublin,Ireland:ACM,2013:73-82.
25 Dong Y X, Chawla N V, Swami A. metapath2vec:Scalable representation learning for heterogeneous networks∥Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Halifax,Canada:ACM,2017:135-144.
26 Mikolov T, Chen K, Corrado G,et al. Efficient estimation of word representations in vector space. 2013,arXiv:.
27 Anagnostopoulos A, Kumar R, Mahdian M. Influence and correlation in social networks∥Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Las Vegas,NV,USA:ACM,2008:7-15.
28 Yanardag P, Vishwanathan S V N. A structural smoothing framework for robust graph-comparison∥Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal,Canada:MIT Press,2015:2134-2142.
29 Jenders M, Kasneci G, Naumann F. Analyzing and predicting viral tweets∥Proceedings of the 22nd International Conference on World Wide Web. Rio de Janeiro,Brazil:ACM,2013:657-664.
30 Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks∥Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe,NV,USA:Curran Associates Inc.,2012:1097-1105.
31 Borgwardt K M, Kriegel P H. Shortest-path kernels on graphs∥Proceedings of the 5th IEEE International Conference on Data Mining. Houston,TX,USA:IEEE,2005:74-81.
32 Duchi J, Hazan E, Singer Y. Adaptive subgradient methods for online learning and stochastic optimization. JMLR,2011,12(61):2121-2159.
33 Chen S, Moore J L, Turnbull D,et al. Playlist prediction via metric embedding∥Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Beijing,China:ACM,2012:714-722.