Journal of Nanjing University (Natural Science) ›› 2021, Vol. 57, Issue (5): 750-756. doi: 10.13232/j.cnki.jnju.2021.05.004


  • Funding: Education and Scientific Research Project for Young and Middle-aged Teachers of Fujian Province (JSZW20001); Equipment Pre-research Field Fund (Rapid Support Project, Phase I) (61403120117)

Automated journal text classification based on capsule neural network

Bin Ni1, Xiaolei Lu2(), Yiqi Tong3, Tao Ma1, Zhixian Zeng1   

  1. Xiamen Data Intelligence Academy, Institute of Computing Technology, Chinese Academy of Sciences, Xiamen 361000, China
    2. College of Foreign Languages and Cultures, Xiamen University, Xiamen 361005, China
    3. School of Informatics, Xiamen University, Xiamen 361005, China
  • Received:2021-05-31 Online:2021-09-29 Published:2021-09-29
  • Contact: Xiaolei Lu E-mail:luxiaolei@xmu.edu.cn


Abstract:

By employing the capsule neural network and BERT (Bidirectional Encoder Representation from Transformers) word embeddings, we propose a model for the automatic classification of journal texts. Using three labeled datasets of different sizes from Web of Science, we conduct a series of deep learning experiments on paper abstracts. Vectorized capsule neurons and the dynamic routing mechanism capture the local-global relationships in the text, yielding a more accurate text classification model. The experimental results show that the capsule neural network outperforms the compared baseline methods in terms of accuracy, precision, recall and F1. In addition, we set the number of dynamic routing iterations to three to balance the model's loss against its training speed. This article demonstrates the practicality and effectiveness of capsule neural networks for automatic journal classification.

Key words: automatic journal classification, text classification, deep learning, capsule neural network
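The routing-by-agreement mechanism summarized in the abstract (dynamic routing between capsules, reference 18) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the capsule counts, dimensions and variable names are made up for the example.

```python
import numpy as np

def squash(s, eps=1e-8):
    # Squash non-linearity: keeps a vector's direction but maps its
    # norm into [0, 1), so the norm can act as an "existence" probability.
    sq_norm = np.sum(s ** 2, axis=-1, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def dynamic_routing(u_hat, num_iters=3):
    # u_hat: prediction vectors from lower capsules for each upper capsule,
    # shape (num_in, num_out, dim_out). Returns the upper-capsule outputs
    # and the final coupling coefficients.
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))                           # routing logits
    for _ in range(num_iters):
        c = np.exp(b) / np.exp(b).sum(axis=1, keepdims=True)  # softmax over upper capsules
        s = (c[..., None] * u_hat).sum(axis=0)                # weighted sum, (num_out, dim_out)
        v = squash(s)                                         # upper-capsule outputs
        b = b + np.einsum('iod,od->io', u_hat, v)             # raise logits where predictions agree
    return v, c

rng = np.random.default_rng(0)
u_hat = rng.normal(size=(10, 3, 8))   # 10 lower capsules, 3 classes, 8-dimensional capsules
v, c = dynamic_routing(u_hat, num_iters=3)
```

Each lower capsule distributes its vote over the upper capsules via the coupling coefficients `c`, which sum to one per lower capsule; more iterations sharpen this distribution at the cost of extra computation, which is the trade-off behind the paper's choice of three iterations.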

CLC number: TP391

Figure 1

Text classification architecture based on the capsule neural network

Table 1

Information about the Web of Science datasets

Dataset | Training set | Test set | Class labels
WOS-46985 | 42286 | 4699 | 7
WOS-11967 | 8018 | 3949 | 7
WOS-5736 | 4588 | 1148 | 3

Table 2

Text length analysis of the Web of Science dataset

Domain | Texts | Max length | Min length | Mean length | Std. of length
A | 5687 | 721 | 13 | 212 | 67.4
B | 4237 | 715 | 22 | 196 | 70.1
C | 6514 | 690 | 27 | 181 | 68.7
D | 5483 | 575 | 21 | 161 | 63.1
E | 14625 | 998 | 28 | 223 | 71.2
F | 3297 | 753 | 23 | 187 | 74.2
G | 7142 | 1262 | 17 | 197 | 67.3
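The per-domain figures in Table 2 are ordinary corpus descriptives. As a hedged sketch (the whitespace tokenization, helper name and sample texts are assumptions, not the paper's preprocessing), they can be computed like this:

```python
import statistics

def length_stats(abstracts):
    # Token count per abstract, then the descriptives reported in Table 2:
    # number of texts, max / min / mean length, and standard deviation.
    lengths = [len(a.split()) for a in abstracts]
    return {
        "texts": len(lengths),
        "max": max(lengths),
        "min": min(lengths),
        "mean": statistics.mean(lengths),
        "std": statistics.pstdev(lengths),
    }

sample = [
    "capsule networks route information by agreement",  # 6 tokens
    "journal abstracts vary widely in length",          # 6 tokens
    "short text",                                       # 2 tokens
]
stats = length_stats(sample)
```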

Figure 2

Box plots of the text length distribution for each category in the dataset

Table 3

Training results of the CapsNet model and the baseline models

Model | WOS-11967 (Acc / P / R / F1) | WOS-46985 (Acc / P / R / F1) | WOS-5736 (Acc / P / R / F1)
NB[23] | 50.1% / 51.6% / 56.4% / 52.6% | 47.5% / 46.2% / 51.2% / 48.6% | 58.6% / 60.1% / 61.9% / 63.4%
SVM[24] | 70.1% / 71.5% / 74.8% / 75.9% | 68.4% / 67.5% / 71.3% / 69.5% | 78.7% / 79.6% / 82.8% / 84.6%
CNN | 83.6% / 77.6% / 87.2% / 80.8% | 77.5% / 74.6% / 84.4% / 77.9% | 95.0% / 94.5% / 96.4% / 95.1%
Bi-LSTM | 80.6% / 81.2% / 87.9% / 82.8% | 77.1% / 74.5% / 82.7% / 78.2% | 94.0% / 95.6% / 95.8% / 94.6%
GRU | 85.6% / 79.8% / 88.6% / 83.2% | 78.9% / 79.3% / 90.4% / 83.0% | 96.5% / 97.5% / 96.5% / 96.8%
BERT | 90.0% / 89.5% / 90.9% / 90.1% | 93.1% / 92.9% / 93.1% / 93.0% | 96.9% / 97.3% / 96.3% / 96.7%
Ours (w2v) | 86.7% / 83.2% / 89.7% / 85.2% | 81.3% / 83.3% / 90.9% / 83.3% | 97.4% / 95.7% / 96.5% / 96.1%
Ours (BERT) | 91.2% / 89.9% / 91.4% / 90.8% | 94.2% / 93.5% / 93.6% / 93.5% | 97.4% / 97.9% / 96.8% / 97.0%
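Table 3 reports accuracy together with precision, recall and F1. The paper does not state its averaging scheme; the following sketch assumes macro-averaging over classes, with illustrative labels:

```python
from collections import Counter

def macro_scores(y_true, y_pred):
    # Per-class true positives / false positives / false negatives,
    # then macro-averaged precision, recall and F1 plus overall accuracy.
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    precs, recs, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precs.append(prec); recs.append(rec); f1s.append(f1)
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    n = len(labels)
    return acc, sum(precs) / n, sum(recs) / n, sum(f1s) / n

acc, prec, rec, f1 = macro_scores([0, 0, 1, 1], [0, 1, 1, 1])
```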

Figure 3

Smoothed training accuracy (top) and loss (bottom) curves on the WOS-46985 dataset

Figure 4

Training loss curves for different numbers of routing iterations

1 Johnson R, Watkinson A, Mabe M. The STM report: An overview of scientific and scholarly journal publishing. 5th ed. The Hague, Netherlands: International Association of Scientific, Technical and Medical Publishers, 2018.
2 Li G J, Yang L. Intelligence analysis and intelligence technology in view of big data. Library and Information, 2012(6): 1-8. (in Chinese)
3 Deng S H, Fu Y Y Z, Wang H. Multi-label classification of Chinese books with LSTM model. Data Analysis and Knowledge Discovery, 2017(7): 52-60. (in Chinese)
4 Deng Y W, Wang L J. The design of automatic book classification expert system. Library and Information Service, 1997(5): 43-45. (in Chinese)
5 McCallum A, Nigam K. A comparison of event models for naive Bayes text classification∥AAAI-98 Workshop on Learning for Text Categorization. Madison, WI, USA: AAAI, 1998: 41-48.
6 Joachims T. Transductive inference for text classification using support vector machines∥Proceedings of the 16th International Conference on Machine Learning. Bled,Slovenia:Morgan Kaufmann Publishers Inc.,1999:200-209.
7 Forman G. An extensive empirical study of feature selection metrics for text classification.The Journal of Machine Learning Research,2003(3):1289-1305.
8 Chen B X, Song P Y. Empirical research on TF-IDF assisted indexing algorithm based on users' natural annotation. Library and Information Service, 2018, 62(1): 132-139. (in Chinese)
9 Mikolov T, Sutskever I, Chen K, et al. Distributed representations of words and phrases and their compositionality∥Proceedings of the 26th International Conference on Neural Information Processing Systems. Sydney, Australia: Curran Associates Inc., 2013: 3111-3119.
10 Kim Y. Convolutional neural networks for sentence classification. 2014, arXiv preprint.
11 Zhang X, Zhao J B, LeCun Y. Character-level convolutional networks for text classification∥Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Canada: MIT Press, 2015: 649-657.
12 Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation, 1997, 9(8): 1735-1780.
13 Dai A M, Le Q V. Semi-supervised sequence learning∥Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Canada: MIT Press, 2015: 3079-3087.
14 Cho K, Van Merrienboer B, Bahdanau D, et al. On the properties of neural machine translation: Encoder-decoder approaches. 2014, arXiv preprint.
15 Kowsari K,Brown D E,Heidarysafa M,et al. HDLTex:Hierarchical deep learning for text classification∥2017 16th IEEE International Conference on Machine Learning and Applications. Cancun,Mexico:IEEE,2017:364-371.
16 Vaswani A,Shazeer N,Parmar N,et al. Attention is all you need∥Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach,CA,USA:Curran Associates Inc.,2017:6000-6010.
17 Hinton G E, Krizhevsky A, Wang S D. Transforming auto-encoders∥International Conference on Artificial Neural Networks. Berlin, Heidelberg: Springer, 2011: 44-51.
18 Sabour S,Frosst N,Hinton G E. Dynamic routing between capsules∥Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach,CA,USA:Curran Associates Inc.,2017:3859-3869.
19 Hinton G,Sabour S,Frosst N. Matrix capsules with EM routing∥6th International Conference on Learning Representations. Vancouver,Canada:ICLR Committee,2018:1-15.
20 Wang Y Q,Sun A X,Han J L,et al. Sentiment analysis by capsules∥Proceedings of 2018 World Wide Web Conference. Lyon,France:International World Wide Web Conferences Steering Committee,2018:1165-1174.
21 Kim J, Jang S, Choi S, et al. Text classification using capsules. 2018, arXiv preprint.
22 Zhao W, Ye J B, Yang M, et al. Investigating capsule networks with dynamic routing for text classification. 2018, arXiv preprint.
23 Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding. 2019, arXiv preprint.
24 Chen K W, Zhang Z P, Long J, et al. Turning from TF-IDF to TF-IGM for term weighting in text classification. Expert Systems with Applications, 2016(66): 245-260.