Journal of Nanjing University (Natural Science) ›› 2019, Vol. 55 ›› Issue (1): 125–132. doi: 10.13232/j.cnki.jnju.2019.01.013


Machine reading comprehension based on bi-directional attention flow combined with self-attention

Gu Jianwei1,2, Zeng Cheng3, Zou Encen1, Chen Yang1,2, Shen Yi1,2, Lu You1, Xi Xuefeng1,2*

  1. School of Electronics and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China; 2. Suzhou Key Laboratory of Virtual Reality and Intelligent Interaction and Application Technology, Suzhou 215009, China; 3. Kunshan Public Security Bureau Command Center, Suzhou 215300, China
  • Accepted: 2018-12-10 Online: 2019-02-01 Published: 2019-01-26
  • Contact: Xi Xuefeng, E-mail: xfxi@mail.usts.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61673290, 61728205, 61750110534), Postgraduate Research and Practice Innovation Program of Jiangsu Province (SJCX17_0681), and Industry Prospective Projects of the Suzhou Science and Technology Development Plan (SYG201707, SYG201817)

Research on machine reading comprehension task based on BiDAF with self-attention

Gu Jianwei1,2,Zeng Cheng3,Zou Encen1,Chen Yang1,2,Shen Yi1,2,Lu You1,Xi Xuefeng1,2*   

  1. School of Electronics and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China; 2. Suzhou Key Laboratory of Virtual Reality and Intelligent Interaction, Suzhou 215009, China; 3. Kunshan Public Security Bureau Command Center, Suzhou 215300, China
  • Accepted:2018-12-10 Online:2019-02-01 Published:2019-01-26
  • Contact: Xi Xuefeng, E-mail: xfxi@mail.usts.edu.cn

Abstract: Machine Reading Comprehension (MRC) has long been a research hotspot and core problem in Natural Language Processing (NLP). Recently, Baidu open-sourced DuReader, a large-scale Chinese reading comprehension dataset designed to address real-world RC (Reading Comprehension) problems. The dataset contains 1000k documents, 200k questions, and 420k answers, making it the largest Chinese machine reading comprehension dataset to date; the reading comprehension task released on it is both more practically meaningful and more difficult than previous ones. For this task, we analyze and study a neural network model that combines bi-directional attention flow with a self-attention mechanism. The model uses bi-directional attention flow to obtain a query-aware context representation at multiple levels of granularity, uses self-attention to capture word dependencies and syntactic information within the sentences of the passage and the question, and then aggregates semantic information with a bidirectional Long Short-Term Memory (LSTM) network. The experiments achieve a BLEU-4 (percentage of identical words) of 44.7% and a Rouge-L (percentage of overlapping units) of 49.1%, close to the average human performance, demonstrating the effectiveness of the model.
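As a minimal sketch of the self-attention step described in the abstract (illustrative only, not the authors' implementation; the function names and toy vectors are invented for this example): each token's representation is scored against every token in the same sentence, and a softmax-weighted sum of all token vectors becomes its new, dependency-aware representation.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(H):
    """Scaled dot-product self-attention over one sequence.

    H: list of token vectors (each a list of floats). Every token
    attends to all tokens in the same sequence, which is how the
    model captures intra-sentence word dependencies.
    """
    d = len(H[0])
    out = []
    for q in H:
        # similarity of this token to every token, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in H]
        w = softmax(scores)
        # attended representation: convex combination of all token vectors
        out.append([sum(wi * v[j] for wi, v in zip(w, H))
                    for j in range(d)])
    return out

# toy "sentence" of 3 tokens with 2-dimensional embeddings
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A = self_attention(H)
print(len(A), len(A[0]))  # 3 2
```

Because each output row is a convex combination of the input vectors, the attended representations stay within the range of the inputs; in the full model this layer sits between the attention-flow layer and the bi-LSTM aggregation.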

Key words: Chinese machine reading comprehension, DuReader dataset, BiDAF model, self-attention mechanism

Abstract: Machine Reading Comprehension (MRC) has long been a research hotspot and core problem in Natural Language Processing (NLP). Bringing machines close to human-level understanding will remain a continuing research goal on the way to the intelligent era. Recently, Baidu released DuReader, a large open-source Chinese reading comprehension dataset that aims to handle real-life RC (Reading Comprehension) issues. This large-scale QA (question answering) dataset is both more practical and more difficult than previous ones. Attention mechanisms have lately been extended to NLP with great success. Typically, these methods use attention to focus on a small portion of the context and summarize it with a fixed-size vector, couple attentions temporally, and often form a uni-directional attention. Given the excellent results of attention mechanisms in NLP, this paper studies and applies a Bi-Directional Attention Flow (BiDAF) network with self-attention to the MRC task. The model obtains a query-aware context representation at several levels of granularity. A self-attention mechanism captures word dependencies and syntactic information within the sentences of the passage and the question, which reduces the semantic loss of sentences during information aggregation. The semantic information is then aggregated by a bidirectional LSTM (Long Short-Term Memory) network to produce the information matrix from which the final answer is predicted. After training, the model achieves a BLEU-4 (percentage of identical words) of 44.7% and a Rouge-L (percentage of overlapping units) of 49.1%, while the average human levels are 55.1% and 54.4% respectively. A gap with human performance remains, but it is not large, indicating that the method is effective and scalable.
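The Rouge-L score reported above measures overlap between a candidate answer and a reference via their longest common subsequence (LCS). A minimal sketch of the standard formula (the β weighting here is illustrative; this is not the official DuReader evaluation script):

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of token lists a and b."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j],
                                                           dp[i][j-1])
    return dp[-1][-1]

def rouge_l(candidate, reference, beta=1.2):
    """Rouge-L F-score: harmonic mean of LCS-based precision and recall,
    with beta weighting recall over precision."""
    lcs = lcs_len(candidate, reference)
    if lcs == 0:
        return 0.0
    p = lcs / len(candidate)   # precision: LCS fraction of the candidate
    r = lcs / len(reference)   # recall: LCS fraction of the reference
    return (1 + beta**2) * p * r / (r + beta**2 * p)

cand = "the cat sat on the mat".split()
ref = "the cat lay on the mat".split()
print(round(rouge_l(cand, ref), 3))  # → 0.833
```

BLEU-4, the other reported metric, instead averages modified n-gram precisions up to n = 4 with a brevity penalty, so the two metrics reward word identity and sequence overlap in complementary ways.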

Key words: MRC, DuReader, BiDAF, self-attention

CLC number: TP183