Journal of Nanjing University (Natural Science) ›› 2019, Vol. 55 ›› Issue (4): 644–650. doi: 10.13232/j.cnki.jnju.2019.04.014

Recommender system model based on dynamic-weighted Bagging matrix factorization

Yifan He, Haitao Zou, Hualong Yu

  1. School of Computer, Jiangsu University of Science and Technology, Zhenjiang, 212003, China
  • Received: 2019-05-12 Online: 2019-07-30 Published: 2019-07-23
  • Contact: Hualong Yu E-mail: yuhualong@just.edu.cn
  • Funding:
    Natural Science Foundation of Jiangsu Province (BK20130471); China Postdoctoral Science Foundation Special Funding Program (2015T80481); China Postdoctoral Science Foundation (2013M540404); Jiangsu Province Postdoctoral Foundation (1401037B); Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX19_1698)

Abstract:

To improve the prediction accuracy of recommender models, traditional methods usually bring additional information into model construction. Such methods raise accuracy, but they also greatly increase the time cost of the algorithm and place requirements on the data set. To address these problems, we propose a Bagging-based matrix factorization model that dynamically assigns a weight to each base learner according to the number of ratings of the user and the item, and obtains the predicted rating by weighted summation. Experimental results on three real data sets of different scales show that the dynamic-weighted Bagging matrix factorization model has the same time consumption as the traditional matrix factorization model and outperforms it on every evaluation measure.

Key words: recommender system, matrix factorization, Bagging, dynamic weighting
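The abstract gives no formula for the dynamic weights, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes plain SGD matrix factorization as the base learner, bootstrap resampling of the rating triples (standard Bagging), and a hypothetical weight for each base learner equal to the number of ratings of the target user and item in that learner's bootstrap sample, plus one. The names `train_mf`, `fit_bagging_mf`, `predict` and all hyperparameters are illustrative assumptions.

```python
import numpy as np

def train_mf(ratings, n_users, n_items, rank=8, lr=0.01, reg=0.05, epochs=30, seed=0):
    """Plain SGD matrix factorization on rows of (user, item, rating)."""
    rng = np.random.default_rng(seed)
    P = rng.normal(0, 0.1, (n_users, rank))   # user factors
    Q = rng.normal(0, 0.1, (n_items, rank))   # item factors
    mu = ratings[:, 2].mean()                 # global mean rating
    for _ in range(epochs):
        for idx in rng.permutation(len(ratings)):
            u, i, r = int(ratings[idx, 0]), int(ratings[idx, 1]), ratings[idx, 2]
            err = r - (mu + P[u] @ Q[i])
            pu = P[u].copy()
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
    return mu, P, Q

def fit_bagging_mf(ratings, n_users, n_items, n_learners=5, seed=0):
    """Train one MF model per bootstrap sample and keep per-sample rating counts."""
    rng = np.random.default_rng(seed)
    learners = []
    for k in range(n_learners):
        # Bootstrap: draw len(ratings) rows with replacement.
        sample = ratings[rng.integers(0, len(ratings), len(ratings))]
        user_cnt = np.bincount(sample[:, 0].astype(int), minlength=n_users)
        item_cnt = np.bincount(sample[:, 1].astype(int), minlength=n_items)
        model = train_mf(sample, n_users, n_items, seed=seed + k)
        learners.append((model, user_cnt, item_cnt))
    return learners

def predict(learners, u, i):
    """Weighted sum of base-learner predictions for the pair (u, i)."""
    preds = np.array([mu + P[u] @ Q[i] for (mu, P, Q), _, _ in learners])
    # ASSUMPTION (not from the paper): a learner that saw more ratings of
    # user u and item i gets a larger weight; +1 avoids an all-zero weight vector.
    w = np.array([uc[u] + ic[i] + 1.0 for _, uc, ic in learners])
    w /= w.sum()
    return float(w @ preds)
```

Because the weights are non-negative and sum to one, the ensemble prediction always lies between the smallest and largest base-learner prediction for that pair. The base learners are independent of each other, so they could be trained in parallel.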

CLC Number:

  • TP391.3

Figure 1

Flowchart of the algorithm

Table 1

Description of the experimental data sets

Data set       Users    Items    Ratings
MovieLens1m    6040     3952     1000209
MovieLens10m   71567    65133    10000054
MovieLens20m   138493   131262   20000263

Table 2

Performance of the two algorithms on the MovieLens1m data set

Algorithm   MAE      RMSE     NDCG@10
MF          0.6918   0.8678   0.8225
DWBMF       0.6858   0.8603   0.8296
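For readers unfamiliar with the three measures in the table header, here is a short sketch of how MAE, RMSE and NDCG@10 are commonly computed. The page does not state which NDCG variant the paper uses, so the exponential gain `2^rel - 1` with a `log2` discount is an assumption, as are the function names.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: average of |r - r_hat|; lower is better."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

def rmse(y_true, y_pred):
    """Root mean squared error: sqrt of the average squared error; lower is better."""
    return float(np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2)))

def ndcg_at_k(y_true, y_pred, k=10):
    """NDCG@k for one user's items; higher is better.

    ASSUMPTION: gain = 2^rating - 1 and discount = 1 / log2(rank + 1).
    """
    y_true = np.asarray(y_true, dtype=float)
    order = np.argsort(-np.asarray(y_pred, dtype=float))[:k]  # top-k by predicted score
    ideal = np.sort(y_true)[::-1][:k]                          # best possible ordering
    discounts = 1.0 / np.log2(np.arange(2, len(order) + 2))
    dcg = np.sum((2 ** y_true[order] - 1) * discounts)
    idcg = np.sum((2 ** ideal - 1) * discounts)
    return float(dcg / idcg) if idcg > 0 else 0.0
```

MAE and RMSE are computed over all test ratings, while NDCG@k is typically computed per user and then averaged over users.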

Table 3

Performance of the two algorithms on the MovieLens10m data set

Algorithm   MAE      RMSE     NDCG@10
MF          0.6384   0.8187   0.8250
DWBMF       0.6295   0.8064   0.8342

Table 4

Performance of the two algorithms on the MovieLens20m data set

Algorithm   MAE      RMSE     NDCG@10
MF          0.6321   0.8131   0.8233
DWBMF       0.6209   0.8007   0.8323
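To put the differences in Tables 2 to 4 on a common scale, the sketch below computes the relative improvement of DWBMF over MF for each measure. The numbers are copied from the tables; the helper name `relative_gain` and the choice of MF as the denominator are assumptions made for illustration.

```python
# (MAE, RMSE, NDCG@10) as reported in Tables 2, 3 and 4.
results = {
    "MovieLens1m":  {"MF": (0.6918, 0.8678, 0.8225), "DWBMF": (0.6858, 0.8603, 0.8296)},
    "MovieLens10m": {"MF": (0.6384, 0.8187, 0.8250), "DWBMF": (0.6295, 0.8064, 0.8342)},
    "MovieLens20m": {"MF": (0.6321, 0.8131, 0.8233), "DWBMF": (0.6209, 0.8007, 0.8323)},
}
METRICS = ("MAE", "RMSE", "NDCG@10")
LOWER_IS_BETTER = {"MAE": True, "RMSE": True, "NDCG@10": False}

def relative_gain(results):
    """Percentage improvement of DWBMF over MF, positive when DWBMF is better."""
    gains = {}
    for name, row in results.items():
        gains[name] = {}
        for m, mf, dw in zip(METRICS, row["MF"], row["DWBMF"]):
            diff = (mf - dw) if LOWER_IS_BETTER[m] else (dw - mf)
            gains[name][m] = 100.0 * diff / mf
    return gains

gains = relative_gain(results)
for name, g in gains.items():
    print(name, {m: round(v, 2) for m, v in g.items()})
```

Every gain is positive, ranging from roughly 0.9 to 1.8 percent, and the MAE and RMSE gains grow with the size of the data set.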

Figure 2

Comparison of the two algorithms on MovieLens20m
