Journal of Nanjing University (Natural Science) ›› 2019, Vol. 55 ›› Issue (1): 125–132. doi: 10.13232/j.cnki.jnju.2019.01.013


Machine reading comprehension based on bi-directional attention flow combined with self-attention

Gu Jianwei1,2, Zeng Cheng3, Zou Encen1, Chen Yang1,2, Shen Yi1,2, Lu You1, Xi Xuefeng1,2*

  1. School of Electronics and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China; 2. Suzhou Key Laboratory of Virtual Reality and Intelligent Interaction and Application Technology, Suzhou 215009, China; 3. Kunshan Public Security Bureau Command Center, Suzhou 215300, China
  • Accepted: 2018-12-10 Online: 2019-02-01 Published: 2019-01-26
  • Contact: Xi Xuefeng, E-mail: xfxi@mail.usts.edu.cn
  • Supported by:
    National Natural Science Foundation of China (61673290, 61728205, 61750110534), Postgraduate Research and Practice Innovation Program of Jiangsu Province (SJCX17_0681), and Industry Prospective Projects of the Suzhou Science and Technology Development Plan (SYG201707, SYG201817)

Research on machine reading comprehension task based on BiDAF with self-attention

Gu Jianwei1,2,Zeng Cheng3,Zou Encen1,Chen Yang1,2,Shen Yi1,2,Lu You1,Xi Xuefeng1,2*   

  1. School of Electronics and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China; 2. Suzhou Key Laboratory of Virtual Reality and Intelligent Interaction, Suzhou 215009, China; 3. Kunshan Public Security Bureau Command Center, Suzhou 215300, China
  • Accepted:2018-12-10 Online:2019-02-01 Published:2019-01-26
  • Contact: Xi Xuefeng, E-mail: xfxi@mail.usts.edu.cn

Abstract: Machine Reading Comprehension (MRC) has long been a research hotspot and core problem in Natural Language Processing (NLP). Recently, Baidu open-sourced DuReader, a large-scale Chinese reading comprehension dataset designed to address real-world RC (Reading Comprehension) problems. The dataset contains 1000k documents, 200k questions, and 420k answers, making it the largest Chinese machine reading comprehension dataset to date; the reading comprehension task released on it is both more practically meaningful and more difficult than previous ones. For this task, we analyze and study a neural network model that combines bi-directional attention flow with a self-attention mechanism. The model uses bi-directional attention flow to obtain a query-aware context representation at multiple levels of granularity, uses self-attention to capture word dependencies and syntactic information within the sentences of the passage and the question, and then aggregates semantic information with a bidirectional Long Short-Term Memory (LSTM) network. The experiments achieve a BLEU-4 (percentage of identical words) of 44.7% and a Rouge-L (percentage of overlapping units) of 49.1%, close to the average human performance, demonstrating the effectiveness of the model.
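As a minimal sketch of the self-attention step described in the abstract (illustrative only, not the authors' implementation; the function names and toy vectors are invented for this example): each token's representation is scored against every token in the same sentence, and a softmax-weighted sum of all token vectors becomes its new, dependency-aware representation.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def self_attention(H):
    """Scaled dot-product self-attention over one sequence.

    H: list of token vectors (each a list of floats). Every token
    attends to all tokens in the same sequence, which is how the
    model captures intra-sentence word dependencies.
    """
    d = len(H[0])
    out = []
    for q in H:
        # similarity of this token to every token, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in H]
        w = softmax(scores)
        # attended representation: convex combination of all token vectors
        out.append([sum(wi * v[j] for wi, v in zip(w, H))
                    for j in range(d)])
    return out

# toy "sentence" of 3 tokens with 2-dimensional embeddings
H = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
A = self_attention(H)
print(len(A), len(A[0]))  # 3 2
```

Because each output row is a convex combination of the input vectors, the attended representations stay within the range of the inputs; in the full model this layer sits between the attention-flow layer and the bi-LSTM aggregation.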

Key words: Chinese machine reading comprehension, DuReader dataset, BiDAF model, self-attention mechanism

Abstract: Machine Reading Comprehension (MRC) has long been a research hotspot and core problem in Natural Language Processing (NLP). Bringing machines close to human-level understanding will remain a continuing research goal on the way to the intelligent era. Recently, Baidu released DuReader, a large open-source Chinese reading comprehension dataset that aims to handle real-life RC (Reading Comprehension) issues. This large-scale QA (question answering) dataset is both more practical and more difficult than previous ones. Attention mechanisms have lately been extended to NLP with great success. Typically, these methods use attention to focus on a small portion of the context and summarize it with a fixed-size vector, couple attentions temporally, and often form a uni-directional attention. Given the excellent results of attention mechanisms in NLP, this paper studies and applies a Bi-Directional Attention Flow (BiDAF) network with self-attention to the MRC task. The model obtains a query-aware context representation at several levels of granularity. A self-attention mechanism captures word dependencies and syntactic information within the sentences of the passage and the question, which reduces the semantic loss of sentences during information aggregation. The semantic information is then aggregated by a bidirectional LSTM (Long Short-Term Memory) network to produce the information matrix from which the final answer is predicted. After training, the model achieves a BLEU-4 (percentage of identical words) of 44.7% and a Rouge-L (percentage of overlapping units) of 49.1%, while the average human levels are 55.1% and 54.4% respectively. A gap with human performance remains, but it is not large, indicating that the method is effective and scalable.
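The Rouge-L score reported above measures overlap between a candidate answer and a reference via their longest common subsequence (LCS). A minimal sketch of the standard formula (the β weighting here is illustrative; this is not the official DuReader evaluation script):

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of token lists a and b."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j],
                                                           dp[i][j-1])
    return dp[-1][-1]

def rouge_l(candidate, reference, beta=1.2):
    """Rouge-L F-score: harmonic mean of LCS-based precision and recall,
    with beta weighting recall over precision."""
    lcs = lcs_len(candidate, reference)
    if lcs == 0:
        return 0.0
    p = lcs / len(candidate)   # precision: LCS fraction of the candidate
    r = lcs / len(reference)   # recall: LCS fraction of the reference
    return (1 + beta**2) * p * r / (r + beta**2 * p)

cand = "the cat sat on the mat".split()
ref = "the cat lay on the mat".split()
print(round(rouge_l(cand, ref), 3))  # → 0.833
```

BLEU-4, the other reported metric, instead averages modified n-gram precisions up to n = 4 with a brevity penalty, so the two metrics reward word identity and sequence overlap in complementary ways.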

Key words: MRC, DuReader, BiDAF, self-attention

CLC number: TP183