Journal of Nanjing University (Natural Science) ›› 2021, Vol. 57 ›› Issue (4): 591–598. doi: 10.13232/j.cnki.jnju.2021.04.007


Knowledge tracing based on multi-scale attention fusion

Jianshe Duan1, Chaoran Cui1, Guangle Song2, Lele Ma3, Yuling Ma4, Yilong Yin3

  1. School of Computer Science and Technology, Shandong University of Finance and Economics, Ji'nan, 250014, China
  2. Shandong Association for Artificial Intelligence, Ji'nan, 250101, China
  3. School of Software, Shandong University, Ji'nan, 250101, China
  4. School of Computer Science and Technology, Shandong Jianzhu University, Ji'nan, 250101, China
  • Received: 2021-02-25 Online: 2021-07-30 Published: 2021-07-30
  • Contact: Chaoran Cui, Guangle Song E-mail: crcui@sdufe.edu.cn; glsong921@163.com
  • Supported by:
    National Natural Science Foundation of China (62077033)

Abstract:

The spread of the Internet has driven the rapid development of online education, which has both eased the imbalance of educational resources and supplied researchers with large amounts of educational data. Educational Data Mining (EDM) is an emerging discipline that analyzes such data to understand students' learning behaviors and provide personalized learning suggestions. Knowledge tracing is an important task in EDM: it uses a student's historical answer sequence to predict that student's performance on the next question. Existing knowledge tracing models do not distinguish long-term from short-term interaction information in the historical sequence, and thus ignore the different effects that sequence information at different time scales has on future predictions. To address this problem, we propose a knowledge tracing model based on multi-scale attention fusion, which uses temporal convolutional networks to capture information at different time scales from the historical interaction sequence and fuses the multi-scale information with an attention mechanism. For different students and answer sequences, the model adaptively determines the importance of the information at each time scale. Experimental results show that the proposed model outperforms existing knowledge tracing models.

Key words: knowledge tracing, temporal convolutional networks, multi-scale fusion, attention mechanism, deep learning
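
The abstract describes two steps: extracting features at several time scales with temporal convolutions, and fusing them with attention. A minimal pure-Python sketch of that idea follows. It is an illustration under stated assumptions, not the authors' implementation: the kernel values, the dilation rates (1, 2, 4), the use of one scalar feature per scale, and the softmax scoring through `score_weights` are all hypothetical choices, since this page does not give the model's actual layers or hyperparameters.

```python
import math
from typing import List, Sequence, Tuple


def causal_dilated_conv(x: Sequence[float], kernel: Sequence[float],
                        dilation: int) -> List[float]:
    """Causal 1-D convolution: y[t] = sum_k kernel[k] * x[t - k * dilation].

    Positions before the start of the sequence are treated as zero, so
    y[t] never depends on inputs later than t.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def multi_scale_fusion(x: Sequence[float], kernel: Sequence[float],
                       dilations: Sequence[int],
                       score_weights: Sequence[float]
                       ) -> Tuple[List[float], List[List[float]]]:
    """Fuse per-scale features with attention weights at every time step.

    Each dilation yields one feature per step (a larger dilation looks
    further back). The hypothetical `score_weights` stand in for learned
    attention parameters: the score of scale s at step t is
    score_weights[s] * feature[s][t].
    """
    per_scale = [causal_dilated_conv(x, kernel, d) for d in dilations]
    fused, attention = [], []
    for t in range(len(x)):
        feats = [scale[t] for scale in per_scale]
        weights = softmax([w * f for w, f in zip(score_weights, feats)])
        attention.append(weights)
        fused.append(sum(a * f for a, f in zip(weights, feats)))
    return fused, attention


# Toy answer sequence: 1.0 = correct, 0.0 = incorrect (made-up data).
responses = [1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0]
fused, attn = multi_scale_fusion(responses, kernel=[0.5, 0.5],
                                 dilations=[1, 2, 4],
                                 score_weights=[1.0, 1.0, 1.0])
```

Because the attention weights are recomputed at every step from that step's features, the relative importance of short-range and long-range context can differ between sequences and between positions, which is the adaptive behavior the abstract refers to. A trained model would learn the kernels and the scoring function rather than fix them as done here.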

CLC Number: TP391

Figure 1

MAFKT model architecture and convolution operations

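Table 1

Three public datasets

| Dataset | ASSISTments 2009 | ASSISTments 2017 | Synthetic-5 |
| --- | --- | --- | --- |
| Number of students | 4151 | 686 | 4000 |
| Exercise tags | 110 | 102 | 5 |
| Exercise records | 325637 | 942816 | 200000 |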

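Table 2

AUC of the models on the three datasets

| Dataset | DKT | DKT+ | DKVMN | SAKT | MAFKT |
| --- | --- | --- | --- | --- | --- |
| ASSISTments 2009 | 82.8 | 82.3 | 80.2 | 80.2 | 84.5 |
| ASSISTments 2017 | 68.3 | 69.9 | 69.0 | 65.6 | 70.6 |
| Synthetic-5 | 81.7 | 82.5 | 82.1 | 82.3 | 83.3 |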

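Table 3

Training time of the models on the three datasets (h)

| Model | ASSISTments 2009 | ASSISTments 2017 | Synthetic-5 |
| --- | --- | --- | --- |
| DKT | 5.19 | 11.83 | 2.92 |
| MAFKT | 1.42 | 2.78 | 0.91 |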

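Table 4

Network structures of the four models

| Model | Temporal convolutional network | Single-scale features | Multi-scale feature fusion | Attention mechanism |
| --- | --- | --- | --- | --- |
| TKT1 | | | | |
| TKT2 | | | | |
| MFKT | | | | |
| MAFKT | | | | |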

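Table 5

Results of ablation experiment (1)

| Model | ASSISTments 2009 | ASSISTments 2017 | Synthetic-5 |
| --- | --- | --- | --- |
| TKT1 | 83.2 | 69.1 | 82.3 |
| TKT2 | 83.4 | 69.5 | 82.2 |
| MFKT | 84.0 | 70.2 | 83.1 |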

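Table 6

Results of ablation experiment (2)

| Model | ASSISTments 2009 | ASSISTments 2017 | Synthetic-5 |
| --- | --- | --- | --- |
| TKT2 | 83.4 | 69.5 | 82.2 |
| MFKT | 84.0 | 70.2 | 83.1 |
| MAFKT | 84.5 | 70.6 | 83.3 |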