Journal of Nanjing University (Natural Science) ›› 2021, Vol. 57 ›› Issue (5): 818–827. doi: 10.13232/j.cnki.jnju.2021.05.012


Probabilistic matrix factorization algorithm based on specific user constraints

Yumeng Hao, Wenming Ma, Bing Wang

  1. School of Computer and Control Engineering, Yantai University, Yantai, 264005, China
  • Received: 2021-06-02 Online: 2021-09-29 Published: 2021-09-29
  • Corresponding author: Wenming Ma E-mail: mwmytu@126.com
  • Funding:
    National Natural Science Foundation of China (61602399); Yantai University Graduate Science and Technology Innovation Fund (YDZD2119)

Abstract:

In recent years, recommender systems have become increasingly valuable in practice, and a good recommendation algorithm gives users a good experience. However, as the volume of information keeps growing, information overload has become more and more prominent, and users have fallen into the habit of not rating items. How to deliver good recommendations to these specific user groups and improve recommendation quality has become a pressing question. To address the rating sparsity of these specific user groups, a constrained Bayesian probabilistic matrix factorization algorithm is proposed. For specific users with sparse ratings, the algorithm introduces a latent similarity constraint matrix that influences the user feature vectors. It combines Maximum A Posteriori (MAP) estimation with Markov Chain Monte Carlo (MCMC) inference for Probabilistic Matrix Factorization (PMF), automatically adjusting the regularization parameters of the model. Finally, evaluation and comparison experiments are conducted on the MovieLens datasets. The experimental results show that the proposed algorithm substantially improves prediction performance and performs better for specific users with sparse ratings.

Key words: recommender system, rating sparsity, constraint matrix, probabilistic matrix factorization, collaborative filtering
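
The abstract does not spell out how the constraint matrix enters the user feature vector. As a rough illustration only, the sketch below assumes the constrained PMF formulation of Salakhutdinov and Mnih, in which a user's vector is an offset plus the mean of the constraint vectors of the items that user has rated, and the prediction passes the inner product through the logistic function. The function names, the argument layout, and the 1 to 5 rating scale are assumptions made for this example and may differ from the similarity constraint used in the paper.

```python
import math

def logistic(x):
    """Logistic function g(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def constrained_user_vector(y_i, rated_items, W):
    """Assumed form: U_i = Y_i + mean of W_k over the items k rated by user i.

    y_i: the user's own offset vector (length D).
    rated_items: indices of the items the user has rated (may be empty).
    W: list of per-item constraint vectors, each of length D.
    """
    if not rated_items:
        return list(y_i)  # no ratings: only the offset is available
    d = len(y_i)
    mean_w = [sum(W[k][f] for k in rated_items) / len(rated_items) for f in range(d)]
    return [y_i[f] + mean_w[f] for f in range(d)]

def predict_rating(u_i, v_j, r_min=1.0, r_max=5.0):
    """Map g(U_i . V_j), which lies in (0, 1), back onto the rating scale."""
    score = logistic(sum(a * b for a, b in zip(u_i, v_j)))
    return r_min + (r_max - r_min) * score

# Toy example with D = 2 and two rated items (all values are made up).
W = [[0.2, 0.0], [0.4, 0.2]]
u = constrained_user_vector([0.0, 0.0], [0, 1], W)
rating = predict_rating(u, [0.0, 0.0])
```

Under this assumed form, a user with only a handful of ratings still receives a vector shaped by the items rated, which is the intuition behind constraining sparse users.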

CLC number:

  • TP301

Figure 1. CPMF model

Figure 2. Bayesian PMF model

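Table 1. Symbols of the model parameters

Variable          Description
R_ij              Rating given by user i to movie j
Θ_U, Θ_V, Θ_W     Hyperparameters of the Gaussian-Wishart distributions of the user, movie, and constraint matrices
T                 Number of sampling iterations
D                 Dimension of the latent feature vectors
N                 Gaussian distribution function
W                 Wishart distribution function
N, M              Number of users and number of movies, respectively
I_ij              Indicator variable
W_0, W_1          Identity matrices
μ_U, μ_V, μ_W     Mean parameters of the Gaussian distributions of the user, movie, and constraint matrices
Λ_U, Λ_V, Λ_W     Variance matrices of the Gaussian distributions of the user, movie, and constraint matrices
g(x)              Logistic function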

Figure 3. CBPMF model

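Table 2. Statistics of the datasets

Dataset information    MovieLens-100k    MovieLens-1M
Number of users U      943               6039
Number of movies V     1682              8662
Rating range           1~5               0.5~5.0
Number of ratings      100 k             1 M
Rating density         6.3%              4.3%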
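
Rating density is the share of user-movie pairs that carry an observed rating. The short sketch below shows the arithmetic for the MovieLens-100k column; the function name is chosen for this example only.

```python
def rating_density(num_ratings, num_users, num_items):
    """Fraction of the user-item matrix that holds an observed rating."""
    return num_ratings / (num_users * num_items)

# MovieLens-100k: 100,000 ratings from 943 users on 1682 movies.
density_100k = rating_density(100_000, 943, 1682)
print(f"{density_100k:.1%}")  # prints 6.3%
```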

Figure 4. Comparison of RMSE and MAE across the baseline models

Table 3. RMSE and MAE of the BPMF and CBPMF algorithms at different dimensions on the MovieLens-1M dataset

Dataset         Metric    Method    10D       30D       50D       70D       100D
MovieLens-1M    MAE       BPMF      0.9113    0.9029    0.9186    0.9374    0.9786
                          CBPMF     0.8219    0.8087    0.8174    0.8329    0.8610
                RMSE      BPMF      0.9536    0.9495    0.9579    0.9678    0.9889
                          CBPMF     0.8561    0.8533    0.8587    0.8613    0.8743
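
For reference, the two error metrics in this comparison can be written out as below, together with a helper that expresses the gap between two scores as a relative reduction. This is a minimal sketch using the standard definitions of RMSE and MAE; the helper names are assumptions for this example, and the only figures taken from the source are the RMSE values reported at D = 30 (BPMF 0.9495, CBPMF 0.8533).

```python
import math

def rmse(actual, predicted):
    """Root mean squared error over paired ratings."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error over paired ratings."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def relative_reduction(baseline, improved):
    """Share of the baseline error removed by the improved model."""
    return (baseline - improved) / baseline

# RMSE at D = 30 on MovieLens-1M: BPMF 0.9495 versus CBPMF 0.8533.
rmse_gain_30d = relative_reduction(0.9495, 0.8533)
print(f"{rmse_gain_30d:.1%}")  # prints 10.1%
```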

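Figure 5. RMSE comparison of BPMF and CBPMF on the MovieLens-1M dataset

Figure 6. MAE comparison of BPMF and CBPMF on the MovieLens-1M dataset

Figure 7. RMSE comparison of several algorithms on the MovieLens-100k dataset

Figure 8. MAE comparison of several algorithms on the MovieLens-100k dataset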

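Table 4. Comparison results of the case study

User    Number of ratings    Prediction error (RMSE) without constraint matrix    Prediction error (RMSE) with constraint matrix
A       50                   94.4%                                                90.7%
B       14                   93.6%                                                86.6%
C       5                    98.2%                                                82.5%
D       9                    96.4%                                                86.2%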