Journal of Nanjing University (Natural Science), 2020, Vol. 56, Issue (2): 236–243. doi: 10.13232/j.cnki.jnju.2020.02.009


Microblog sentiment orientation analysis based on double attention model

Chunchun Luo, Xiaoyan Hao

  1. School of Information and Computer, Taiyuan University of Technology, Jinzhong, 030600, China
  • Received: 2020-01-13 Online: 2020-03-30 Published: 2020-04-02
  • Contact: Xiaoyan Hao E-mail: haoxiaoyan@tyut.edu.cn
  • Funding:
    Humanities and Social Sciences Research Planning Fund of the Ministry of Education (17YJA740031); Natural Science Foundation of Shanxi Province (201801D121137)

Abstract:

Existing work on microblog sentiment orientation analysis often treats microblog tags as noise and removes them during data preprocessing. However, a tag carries key information about the content of a post, so removing it hurts sentiment orientation analysis. To address this problem, this paper takes the textual characteristics of microblogs into account and proposes a sentiment analysis model based on double attention. Bi-LSTM (Bi-directional Long Short-Term Memory) networks are used to build semantic representations of the microblog text and the microblog tag separately, and a double attention mechanism semantically encodes the text layer and the tag layer at the same time to extract the key information. Finally, a sentiment classification model is trained on the resulting semantic representations. Experimental results show that the model performs well on microblog sentiment orientation analysis.

Key words: Bi-directional Long Short-Term Memory, double attention model, sentiment orientation analysis, Sina microblog, microblog tag

CLC number:

  • TP391.1
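
The abstract describes two encoders (one for the microblog text, one for the tag), each followed by attention, with the resulting representations fed to a classifier. The sketch below illustrates only that general shape and is not the authors' implementation. It assumes the Bi-LSTM hidden states are already available as plain lists of floats, uses a simple dot-product score against a context vector, and concatenates the two pooled vectors before a logistic output. All function names, the scoring function, and the concatenation step are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(states: Sequence[Vector], context: Vector) -> Tuple[Vector, List[float]]:
    """Pool hidden states into one vector using attention weights.

    Assumption: each state is scored by its dot product with `context`.
    """
    scores = [sum(h_i * c_i for h_i, c_i in zip(h, context)) for h in states]
    weights = softmax(scores)
    dim = len(states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, states)) for d in range(dim)]
    return pooled, weights


def double_attention_features(
    text_states: Sequence[Vector],
    tag_states: Sequence[Vector],
    text_context: Vector,
    tag_context: Vector,
) -> Vector:
    """Attend over text and tag states separately, then concatenate (assumed)."""
    text_vec, _ = attention_pool(text_states, text_context)
    tag_vec, _ = attention_pool(tag_states, tag_context)
    return text_vec + tag_vec  # list concatenation


def positive_probability(features: Vector, weights: Vector, bias: float) -> float:
    """Logistic output layer mapping the features to P(positive)."""
    logit = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy stand-ins for Bi-LSTM outputs: 3 text steps and 2 tag steps, 2 dims each.
text_states = [[0.1, 0.3], [0.9, -0.2], [0.4, 0.4]]
tag_states = [[0.5, 0.5], [-0.3, 0.8]]
features = double_attention_features(text_states, tag_states, [1.0, 0.0], [0.0, 1.0])
probability = positive_probability(features, [0.5, -0.5, 0.25, 0.25], 0.0)
```

In a trained model the context vectors and classifier weights would be learned parameters; here they are fixed toy values so the data flow is easy to follow.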

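Table 1

Example of a microblog tag

Microblog tag:   #Tourists ripped off in Hainan during the Spring Festival#
Microblog body:  Why would the local government departments do this? They should get their IQ tested!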

Figure 1

Structure of the double attention model

Figure 2

Structure of the LSTM model

Table 2

Number of microblog texts in the dataset

Dataset        Positive   Negative   Total
Training set   4561       3962       8523
Test set       1202       929        2131
Total          5763       4891       10654

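Table 3

Examples of microblog texts in the dataset

Positive sentiment:  #Officials on a research visit# This is great. Shandong has reached a new level again.
Negative sentiment:  #Philippine warship deliberately rams# Really infuriating. Such a small country dares to bully China. How arrogant. Wipe it out!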

Table 4

Experimental parameter settings

Parameter               Value
Word vector dimension   200
Batch size              32
Learning rate           0.001
Dropout                 0.2
LSTM output dimension   200
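
The settings in Table 4 could be captured in a small configuration object, as in the hedged sketch below. The class name, field names, and helper function are hypothetical. The table does not say which optimizer was used or whether the LSTM output dimension of 200 is per direction or the combined size of both directions, so the sketch records the values only.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hyperparameters copied from Table 4 (field names are hypothetical)."""

    embedding_dim: int = 200       # word vector dimension
    batch_size: int = 32           # examples per batch
    learning_rate: float = 0.001   # optimizer step size
    dropout: float = 0.2           # dropout rate
    lstm_output_dim: int = 200     # per-direction vs. combined is not stated


def batches_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of batches needed to cover the data once (last batch may be partial)."""
    return math.ceil(num_examples / batch_size)


config = ExperimentConfig()
# 8523 is the training-set size reported in Table 2.
steps = batches_per_epoch(8523, config.batch_size)
```

Freezing the dataclass keeps the reported values from being changed by accident during an experiment run.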

Table 5

Comparison of experimental results across models

Model                           Accuracy (%)
CNN                             82.56
LSTM                            83.09
LSTM+ATT                        83.68
Bi-LSTM                         83.83
SAM                             84.06
Bi-LSTM+ATT                     84.17
ACNN                            82.87
H-ATT                           84.40
DAM (the model in this paper)   85.39

Figure 3

Attention distribution
