Journal of Nanjing University (Natural Sciences) ›› 2017, Vol. 53 ›› Issue (3): 549.
王灿伟1,2*
Wang Canwei1,2*
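Abstract: Analyzing the public's sentiment toward a social event from massive microblog data is of significant research value, but microblog text is sparse and its volume is enormous, so traditional methods face many challenges on this task. This paper proposes a sentiment analysis method for massive microblog data based on topic clustering. First, frequent itemsets are mined from high-quality microblog data; a semantic-relatedness threshold is set, and the important frequent itemsets are selected and spectrally clustered to obtain topic keywords. The massive microblog data are then grouped by their semantic relatedness to the topic keywords. Finally, using a sentiment lexicon, each microblog in a group is searched for sentiment words and negation words within the modifier distance before and after the topic keywords, and these are combined with emoticons to compute the microblog's sentiment value. Experiments on a million-scale Chinese microblog corpus show that the method groups microblogs accurately by topic and classifies sentiment effectively within each topic.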
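The final scoring step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy lexicons, the window size, the rule that a negation word flips only the sentiment word immediately after it, and the `emoticon_weight` parameter are all assumptions made for illustration, and the function name `score_post` is hypothetical.

```python
# Hypothetical toy lexicons; the paper's actual dictionaries are not given here.
SENTIMENT = {"good": 1.0, "great": 1.0, "bad": -1.0, "terrible": -1.0}
NEGATIONS = {"not", "never", "no"}
EMOTICONS = {"[smile]": 1.0, "[angry]": -1.0}


def score_post(tokens, topic_keywords, window=3, emoticon_weight=0.5):
    """Score one tokenized post using sentiment words near its topic keywords.

    `window` stands in for the "modifier distance" in the abstract; its value
    and the additive combination with emoticons are assumptions.
    """
    text_score = 0.0
    counted = set()  # indices of sentiment words already counted
    for i, tok in enumerate(tokens):
        if tok not in topic_keywords:
            continue
        # Look at tokens within `window` positions before and after the keyword.
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j in counted or tokens[j] not in SENTIMENT:
                continue
            counted.add(j)
            polarity = SENTIMENT[tokens[j]]
            # Assumed rule: a negation word directly before the sentiment word flips it.
            if j > 0 and tokens[j - 1] in NEGATIONS:
                polarity = -polarity
            text_score += polarity
    # Emoticons contribute regardless of their position in the post.
    emoticon_score = sum(EMOTICONS.get(tok, 0.0) for tok in tokens)
    return text_score + emoticon_weight * emoticon_score
```

For example, `score_post(["the", "new", "policy", "is", "not", "good", "[angry]"], {"policy"})` yields a negative score, because "good" is negated and the emoticon is negative. A real system would first segment the Chinese text into tokens and would take the topic keywords from the clustering stage.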