Journal of Nanjing University (Natural Science), 2022, Vol. 58, Issue 1: 143–152. doi: 10.13232/j.cnki.jnju.2022.01.014

Matrix factorization algorithm combined with user tag similarity

Cong Wu (武聪), Wenming Ma (马文明), Bing Wang (王冰), Jianhao Zhu (朱建豪)

  1. College of Computer and Control Engineering, Yantai University, Yantai, 264005, China
  • Received: 2021-10-08 Online: 2022-01-30 Published: 2022-02-22
  • Contact: Wenming Ma E-mail: mwmytu@126.com
  • Supported by:
    National Natural Science Foundation of China (61602399)

Abstract:

With the advent of the Internet era, recommender systems have become a powerful aid for filtering online resources. Traditional recommender systems compute user similarity from rating information and recommend resources accordingly, but they still suffer from problems such as cold start and data sparsity, which greatly degrade recommendation quality. Traditional matrix factorization mainly uses the rating matrix to calculate the similarity between users and resources, searches for the neighbors of users and resources, and predicts users' ratings of resources from the neighbor set. However, because the number of resources on the network is huge, users can rate only a small fraction of them, so very little rating data is available and the data are severely sparse.

In recent years, tags have brought new opportunities to recommender systems. Tags describe users' interests and preferences specifically and accurately, so a recommender system can use tag attributes to understand those preferences better and make personalized recommendations, greatly improving recommendation accuracy and user satisfaction. Social tags are valuable for recommending and sharing resources and provide a strong basis for personalized recommendation. Finding the connection between users and resources through social tags is expected to improve recommendation efficiency, increase user satisfaction, and open new opportunities for resource sharing and recommendation. Tags also allow items or resources to be classified: because a tag captures the characteristics of a resource, it provides a reliable basis for classification. Searching by tag fits users' ideas more closely and makes the search more accurate. Recommending resources, or other users, whose tags are highly similar to the tags a user habitually applies makes it possible to mine the user's potential interests and deliver personalized recommendations. Many websites already use social tags to varying degrees, enhancing user satisfaction and loyalty and generating substantial revenue. As tags have spread across the Internet, more and more users have become accustomed to tagging their favorite resources. Such tags both represent users' preferences and describe the attributes of the product. Making full use of this tag information can effectively improve recommendation accuracy and alleviate data sparsity.

In this paper, the relationship between tag attributes and ratings is used to calculate user tag similarity, and user and resource information is used to calculate user similarity. Both are integrated into the matrix factorization model to strengthen the basis for recommendation and improve its accuracy. Experimental results show that the RMSE of the proposed algorithm, UTagJMF, decreases by about 2% on the ml-latest-small dataset and by about 2.2% on the Hetrec2011-movielens-2k dataset. The proposed model therefore effectively alleviates the adverse effects of data sparsity and predicts significantly better than the other algorithms.
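The abstract describes the approach only in outline, so the following Python sketch is an illustration under stated assumptions rather than the UTagJMF objective itself. It assumes that (a) user similarity is the Jaccard index over the sets of items each user rated, (b) user tag similarity is the cosine similarity of hypothetical rating-weighted user-tag profiles, and (c) the combined similarity enters the factorization as a graph-Laplacian penalty that pulls the latent vectors of similar users together. The function names, the 50/50 blend, the weight `beta`, and the toy data are all invented for the example; `lam_p` and `lam_q` are named by analogy with the hyperparameters λp and λq listed among the figures, and whether they play the same role in the paper is an assumption.

```python
import numpy as np


def jaccard_similarity(rated):
    """Jaccard index between users from a boolean user-item matrix."""
    r = rated.astype(float)
    inter = r @ r.T                                   # |I_u ∩ I_v|
    sizes = r.sum(axis=1)
    union = sizes[:, None] + sizes[None, :] - inter   # |I_u ∪ I_v|
    sim = np.divide(inter, union, out=np.zeros_like(inter), where=union > 0)
    np.fill_diagonal(sim, 0.0)                        # ignore self-similarity
    return sim


def cosine_similarity(profiles):
    """Cosine similarity between the rows of a float user-tag profile matrix."""
    norms = np.linalg.norm(profiles, axis=1, keepdims=True)
    unit = np.divide(profiles, norms, out=np.zeros_like(profiles), where=norms > 0)
    sim = unit @ unit.T
    np.fill_diagonal(sim, 0.0)
    return sim


def rmse(R, mask, P, Q):
    """Root-mean-square error over the observed ratings only."""
    err = (P @ Q.T - R)[mask]
    return float(np.sqrt(np.mean(err ** 2)))


def fit(R, mask, S, k=2, lam_p=0.05, lam_q=0.05, beta=0.1,
        lr=0.01, epochs=500, seed=0):
    """Full-batch gradient descent on an assumed objective:
    0.5*||mask*(PQ^T - R)||^2 + 0.5*lam_p*||P||^2 + 0.5*lam_q*||Q||^2
    + 0.5*beta*tr(P^T L P), where L is the Laplacian of the symmetric S."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((R.shape[0], k))
    Q = 0.1 * rng.standard_normal((R.shape[1], k))
    L = np.diag(S.sum(axis=1)) - S
    for _ in range(epochs):
        E = np.where(mask, P @ Q.T - R, 0.0)          # errors on observed cells
        grad_P = E @ Q + lam_p * P + beta * (L @ P)
        grad_Q = E.T @ P + lam_q * Q
        P -= lr * grad_P
        Q -= lr * grad_Q
    return P, Q


# Toy data (invented): 6 users x 5 items, 0 means "not rated".
R = np.array([
    [5, 4, 0, 1, 0],
    [4, 5, 0, 0, 1],
    [0, 4, 5, 1, 0],
    [1, 0, 1, 5, 4],
    [0, 1, 0, 4, 5],
    [1, 0, 2, 0, 4],
], dtype=float)
mask = R > 0

# Hypothetical profiles: mean rating a user gave to items carrying each tag.
tag_profiles = np.array([
    [4.5, 0.0, 1.0],
    [4.0, 0.5, 0.0],
    [4.5, 1.0, 0.0],
    [0.0, 4.5, 1.0],
    [0.5, 4.0, 0.0],
    [0.0, 3.0, 2.0],
])

S = 0.5 * jaccard_similarity(mask) + 0.5 * cosine_similarity(tag_profiles)
P, Q = fit(R, mask, S)
print(f"training RMSE: {rmse(R, mask, P, Q):.3f}")
```

Setting `beta=0` recovers plain regularized matrix factorization, which gives a simple baseline for judging whether the similarity term helps on a given dataset.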

Key words: recommender system, tag, interests and preferences, user tag similarity matrix, Jaccard similarity matrix

CLC number: TP301

Figure 1 Matrix factorization model

Figure 2 User tag similarity matrix factorization model

Table 1 Attributes of the ml-latest-small dataset

Dataset attributes    ml-latest-small
users                 610
items                 9724
ratings               100836
tags                  1365
tag records           3683
sparsity              98.3%

Table 2 Attributes of the Hetrec2011-movielens-2k dataset

Dataset attributes    Hetrec2011-movielens-2k
users                 2113
items                 10197
ratings               855598
tags                  13222
tag records           47957
sparsity              96%
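
The sparsity values in the two dataset tables are consistent with the usual definition, 1 − ratings / (users × items). The page does not state the formula, so the short Python check below assumes that definition; the function name is chosen for the example.

```python
def sparsity(n_users: int, n_items: int, n_ratings: int) -> float:
    """Fraction of user-item pairs that have no rating."""
    return 1.0 - n_ratings / (n_users * n_items)


# (users, items, ratings) as listed in the two dataset tables.
datasets = {
    "ml-latest-small": (610, 9724, 100836),
    "Hetrec2011-movielens-2k": (2113, 10197, 855598),
}

for name, (users, items, ratings) in datasets.items():
    print(f"{name}: {sparsity(users, items, ratings):.1%}")
```

Under this definition the two datasets come out at 98.3% and 96.0%, matching the tabulated values.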

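Figure 3 Comparison of experimental results of the tag-similarity algorithm models on the ml-latest-small dataset

Figure 4 Comparison of experimental results of the tag-similarity algorithm models on the Hetrec2011-movielens-2k dataset

Figure 5 Comparison of experimental results of the tag-similarity algorithm models on the Hetrec2011-movielens-2k-sparsity dataset

Figure 6 Tuning of the hyperparameter λp on the ml-latest-small dataset

Figure 7 Tuning of the hyperparameter λp on the Hetrec2011-movielens-2k dataset

Figure 8 Tuning of the hyperparameter λq on the ml-latest-small dataset

Figure 9 Tuning of the hyperparameter λq on the Hetrec2011-movielens-2k dataset

Figure 10 Effect of data sparsity on experimental performance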
