一种基于用户结构和属性的无监督用户对齐方法

doi:10.13232/j.cnki.jnju.2020.01.001

南京大学学报(自然科学版) ›› 2020, Vol. 56 ›› Issue (1): 1–8.doi: 10.13232/j.cnki.jnju.2020.01.001

• • 下一篇

一种基于用户结构和属性的无监督用户对齐方法

俞冬明,李苑,李智星,王国胤()

计算智能重庆市重点实验室，重庆邮电大学，重庆，400065

收稿日期:2019-07-17 出版日期:2020-01-30 发布日期:2020-01-10
通讯作者: 王国胤 E-mail:wanggy.cq@hotmail.com
基金资助:
国家重点研发计划(2017YFB0802300)

An unsupervised user alignment method based on user structure and attribute

Dongming Yu,Yuan Li,Zhixing Li,Guoyin Wang()

Chongqing Key Laboratory of Computational Intelligence，Chongqing University of Posts and Telecommunications，Chongqing，400065，China

Received:2019-07-17 Online:2020-01-30 Published:2020-01-10
Contact: Guoyin Wang E-mail:wanggy.cq@hotmail.com

摘要/Abstract

摘要：

随着互联网应用的蓬勃发展，一个人在不同的社交网络平台上都拥有账户是很常见的.如何在多个社交网络上找到同一个人的账户，对许多应用是很重要的问题，也被称为用户对齐问题.在用户对齐问题上，目前有两个主要的挑战：首先，收集手工对齐的用户对作为训练数据的代价非常大,但传统的有监督方法往往需要大量的标注数据才能获得较好的效果；其次，不同网络中的用户的结构和属性往往不太相同，进一步增加了用户对齐的难度.提出一种无监督用户对齐方法SPUAL（Soft Principle for User Alignment），设计了一种新颖的基于用户的属性与结构的软对齐一致性原则，通过无监督方法计算用户对是否服从此原则来推断用户对是否对齐.在几个公共数据集上的实验表明，该方法的性能比目前最先进的无监督方法都有明显提高.

关键词: 用户对齐, 社交网络, 无监督, 对齐原则

Abstract:

With the fast development of Internet applications，it's common for someone to have accounts on different social network platforms. How to find out which account on multiple social networks are of the same person is an important issue for many applications today，which is also known as the user alignment problem. There are two major challenges when it comes to user alignment. First，it's extremely expensive to collect manually aligned user pairs as training data，but traditional supervised methods often need a large amount of labeled data to achieve better results. Second，users on different networks often have different structures and attributes，which further increase the difficulty of user alignment. We propose an unsupervised user alignment method SPUAL (Soft Principle for User Alignment)，design a novel soft alignment principle based on user attributes and structure，and then infer whether the user alignment is correct or not by calculating whether the users meet to the principles by unsupervised method. Experiments on several common datasets show that the performance of our method is much better than the most advanced unsupervised methods.

Key words: user alignment, social network, unsupervised, alignment principle

中图分类号:

TP391

俞冬明, 李苑, 李智星, 王国胤. 一种基于用户结构和属性的无监督用户对齐方法[J]. 南京大学学报(自然科学版), 2020, 56(1): 1–8.

Dongming Yu, Yuan Li, Zhixing Li, Guoyin Wang. An unsupervised user alignment method based on user structure and attribute[J]. Journal of Nanjing University(Natural Sciences), 2020, 56(1): 1–8.

图/表 5

表1

符号和意义"

符号	意义
$G = A, N$	一个社交网络
$A$ $N$	网络的邻接矩阵网络用户的属性矩阵
$n 1$ , $n 2$	网络 $G s$ 和网络 $G t$ 的用户数
$K$	用户节点的属性个数
$a, b$	网络 $G s$ 的用户索引
$x, y$	网络 $G t$ 的用户索引
$v, w$	向量化对齐的节点对索引
$I, l$	分别是单位矩阵和值全为1的向量
$H$ , $S$	对齐前的偏好和对齐矩阵
$α, β, m$	对齐原则的权重以及正则化参数
$a = v e c (A)$ $D = d i a g (d)$ $?$	将矩阵 $A$ 以列的顺序向量化对角化向量 $d$ 克罗内克积

表1

图1

表2

四个数据集的信息"

Dataset

Flickr?

Lastfm

Flickr?Myspace

Douban

ACM?DBLP

Number

12974~15436

6714~10733

3906~1118

9872~9916

Attributes

Sex,

Pagerank

Sex,

Pagerank

Location

Conference

H

Username Similarity

Degree Similarity

表2

表3

表4

SPAUL在两个数据集中使用不同参数进行对齐的准确率对比"

Parameter( $α - β$ )	Flickr?Lastfm	Flickr?Myspace
0.7-0.3	0.677	0.663
0.8-0.2	0.675	0.655
0.9-0.1	0.670	0.659

表4

参考文献 23

1	Zhang Y T，Tang J，Yang Z L，et al. Cosnet：connecting heterogeneous social networks with local and global consistency∥Proceedings of the 21^th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.Sydney，Australia：ACM，2015：1485-1494.
2	Chen Z，Yu X，Song B，et al. Community?based network alignment for large attributed network∥Proceedings of the 2017 ACM on Conference on Information and Knowledge Management.Singapore：ACM，2017：587-596.
3	Manners H N，Elmsallati A，Guzzi P H，et al. Performing local network alignment by ensembling global aligners∥2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM). Kansas City，MO，USA：IEEE，2017：1316-1323.
4	Smalter A，Huan J，Lushington G. Gpm：a graph pattern matching kernel with diffusion for chemical compound classification∥2008 8^th IEEE International Conference on BioInformatics and BioEngineering.Athens，Greece：IEEE，2008：1-6.
5	Bayati M，Gerritsen M，Gleich D F，et al. Algorithms for large，sparse network alignment problems∥2009 9^th IEEE International Conference on Data Mining.Miami，FL，USA：IEEE，2009：705-710.
6	Koutra D，Tong H，Lubensky D. Big?Align：Fast bipartite graph alignment∥2013 IEEE 13^th International Conference on Data Mining (ICDM). Dallas，TX，USA：IEEE，2013.
7	Zhang S，Tong H H. Final：Fast attributed network alignment∥Proceedings of the 22^nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.San Francisco，CA，USA：ACM，2016：1345-1354.
8	Zhang S，Tong H，Tang J，et al. iNEAT：Incomplete network alignment∥2017 IEEE International Conference on Data Mining.New Orleans，LA，USA：IEEE，2017：1189-1194.
9	Goga O，Perito D，Lei H，et al. Large?scale correlation of accounts across social networks. Technical Report. Berkeley：University of California at Berkeley，2013：TR?13?002.
10	Liu S Y，Wang S H，Zhu F D. HYDRA：Large?scale social identity linkage via heterogeneous behavior modeling∥Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data.Snowbird，UT，USA：ACM，2014.
11	Zhou X P，Liang X，Du X Y，et al. Structure based user identification across social networks. IEEE Transactions on Knowledge and Data Engineering，2017，30(6)：1178-1191.
12	Liu L，Cheung W K，Li X，et al. Aligning users across social networks using network embedding∥Proceedings of the 25^th International Joint Conference on Artificial Intelligence.New York，NY，USA：AAAI Press，2016.
13	Zhang J，Chen B，Wang X，et al. Mego2vec：embedding matched ego networks for user alignment across social networks∥Proceedings of the 27^th ACM International Conference on Information and Knowledge Management.Torino，Italy：ACM，2018：327-336.
14	Page L，Brin S，Motwani R，et al. The PageRank citation ranking：bringing order to the web. Technical Report. Stanford InfoLab，1999.
15	Liao C S，Lu K，Baym M，et al. IsoRankN：Spectral methods for global alignment of multiple protein networks. Bioinformatics，2009，25(12)：i253-i258.
16	Andersen R，Chung F R K，Lang K J. Local graph partitioning using pageRank vectors∥2006 47^th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06). Berkeley，CA，USA：IEEE，2006.
17	Bayati M，Shah D，Sharma M. Maximum weight matching via max?product belief propagation∥Proceedings of International Symposium on Information Theory (ISIT 2005). Adelaide，Australia：IEEE，2005.
18	Zhang J W，Yu P S. Multiple anonymized social networks alignment∥2015 IEEE International Conference on Data Mining.Atlantic City，NJ，USA：IEEE，2015：599-608.
19	Zhong E H，Fan W，Wang J W，et al. ComSoc：adaptive transfer of user behaviors over composite social network∥ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.Beijing，China：ACM，2012.
20	Heimann M，Shen H M，Safavi T，et al. Regal：representation learning?based graph alignment∥Proceedings of the 27^th ACM International Conference on Information and Knowledge Management.Torino，Italy：ACM，2018：117-126.
21	Klau G W. A new graph?based method for pairwise global network alignment. BMC Bioinformatics，2009，10(S1)：S59.
22	Kollias G，Mohammadi S，Grama A. Network similarity decomposition (NSD)：a fast and scalable approach to network alignment. IEEE Transactions on Knowledge and Data Engineering，2012，24(12)：2232-2243.
23	Zheng Z D，Zheng L，Yang Y. Pedestrian alignment network for large?scale person re?identification. IEEE Transactions on Circuits and Systems for Video Technology，2018，doi：10.1109/TCSVT.2018. doi: 10.1109/TCSVT.2018
	2873599. doi: 10.1109/TCSVT.2018

Metrics

Viewed

Full text

Abstract

Cited

Shared

Discussed

Methods	Flickr?Lastfm	Flickr?Myspace	Douban	ACM?DBLP
Regal	0.01	0.01	0	0.003
IsoRank	0.4	0.36	0.07	0.21
NetAlign	0.43	0.45	0.01	0.03
UniAlign	0.1	0.03	0.01	0.01
Klau's Algorithm	0.38	0.4	0.07	0.12
FINAL	0.665	0.640	0.239	0.210
SPUAL	0.677	0.663	0.249	0.241

[1]	陈俊芬,赵佳成,韩洁,翟俊海. 基于深度特征表示的Softmax聚类算法[J]. 南京大学学报(自然科学版), 2020, 56(4): 533-540.
[2]	吕国俊,曹建军,郑奇斌,常宸,翁年凤. 基于结构保持对抗网络的跨模态实体分辨[J]. 南京大学学报(自然科学版), 2020, 56(2): 197-205.
[3]	王伯伟, 聂秀山, 马林元, 尹义龙. 基于语义相似度的无监督图像哈希方法[J]. 南京大学学报(自然科学版), 2019, 55(1): 41-48.
[4]	靳义林1，2*，胡　峰1，2. 基于三支决策的中文文本分类算法研究[J]. 南京大学学报(自然科学版), 2018, 54(4): 794-.
[5]	董利梅，赵　红*，杨文元. 基于稀疏聚类的无监督特征选择[J]. 南京大学学报(自然科学版), 2018, 54(1): 107-.
[6]	李　凡1，2，赵　姝1，2*，陈　洁1，2，张燕平1，2. 基于加权中介中心性的结构洞占据者方法获取[J]. 南京大学学报(自然科学版), 2017, 53(4): 756-.
[7]	李　婵，杨文元*，赵　红. 联合依赖最大化与稀疏表示的无监督特征选择方法[J]. 南京大学学报(自然科学版), 2017, 53(4): 775-.
[8]	宾　晟*，孙更新. 基于多子网复合复杂网络模型的多关系社交网络重要节点发现算法[J]. 南京大学学报(自然科学版), 2017, 53(2): 378-.
[9]	张燕平^1,2张顺^1,2钱付兰^1,2严远亭^1,2. 一种局部和全局用户影响力相结合的社交推荐算法[J]. 南京大学学报(自然科学版), 2015, 51(4): 858-865.

一种基于用户结构和属性的无监督用户对齐方法

An unsupervised user alignment method based on user structure and attribute

RichHTML

PDF (PC)

摘要/Abstract

引用本文

使用本文

图/表 5

参考文献 23

相关文章 9

Metrics

本文评价

推荐阅读 10