Journal of Nanjing University (Natural Science) ›› 2019, Vol. 55 ›› Issue (5): 765-773. doi: 10.13232/j.cnki.jnju.2019.05.008
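Abstract:
Emotion perception is both universal and language-specific: emotions expressed in different languages have distinct emotional features, but they also share similar ones. Using the IEMOCAP English emotion database, the CASIA Chinese emotion database, and the EMO-DB German emotion database, and taking four emotions (neutral, angry, happy, and sad) as the objects of study, this work examines speech emotion recognition on monolingual corpora, on mixed-language corpora, and across corpora. Support Vector Machine (SVM), Convolutional Neural Networks (CNN), and Long Short-Term Memory (LSTM) classifiers are trained to recognize the emotions. The experimental results show that the recognition patterns for speech emotion are similar across the corpora, and that the languages share similar emotional characteristics. In addition, the English neutral emotion and the Chinese sad emotion show good model generalization, while the English sad emotion and the Chinese neutral emotion show good adaptability.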
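The three evaluation settings named in the abstract (monolingual, mixed-language, and cross-corpus) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the corpora are synthetic toy data rather than IEMOCAP, CASIA, or EMO-DB features, the corpus names and the `shift` parameter are hypothetical, and a simple nearest-centroid classifier stands in for the SVM, CNN, and LSTM models used in the paper.

```python
import random
from collections import defaultdict

EMOTIONS = ["neutral", "angry", "happy", "sad"]


def make_toy_corpus(seed, shift, n_per_class=30, dim=4):
    """Synthetic stand-in for one corpus: a shuffled list of (features, label)."""
    rng = random.Random(seed)
    data = []
    for k, emotion in enumerate(EMOTIONS):
        for _ in range(n_per_class):
            # Class k is raised on dimension k; `shift` mimics corpus mismatch.
            x = [rng.gauss((3.0 if d == k else 0.0) + shift, 1.0)
                 for d in range(dim)]
            data.append((x, emotion))
    rng.shuffle(data)
    return data


class NearestCentroid:
    """Placeholder classifier (not the paper's SVM/CNN/LSTM)."""

    def fit(self, samples):
        sums, counts = {}, defaultdict(int)
        for x, y in samples:
            prev = sums.get(y, [0.0] * len(x))
            sums[y] = [s + v for s, v in zip(prev, x)]
            counts[y] += 1
        self.centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}
        return self

    def predict(self, x):
        # Return the label whose centroid is closest in squared distance.
        return min(self.centroids,
                   key=lambda y: sum((a - b) ** 2
                                     for a, b in zip(x, self.centroids[y])))


def accuracy(model, samples):
    return sum(model.predict(x) == y for x, y in samples) / len(samples)


def split(samples, train_frac=0.8):
    cut = int(len(samples) * train_frac)
    return samples[:cut], samples[cut:]


def evaluate_settings(corpora):
    """Return accuracies for the mono, mixed, and cross-corpus settings."""
    splits = {name: split(data) for name, data in corpora.items()}
    results = {}
    # Monolingual: train and test on held-out data from the same corpus.
    for name, (train, test) in splits.items():
        results[("mono", name)] = accuracy(NearestCentroid().fit(train), test)
    # Mixed: train on the pooled training sets, test on each corpus's test set.
    pooled = [s for train, _ in splits.values() for s in train]
    mixed_model = NearestCentroid().fit(pooled)
    for name, (_, test) in splits.items():
        results[("mixed", name)] = accuracy(mixed_model, test)
    # Cross-corpus: train on one corpus, test on all of another corpus.
    for src, (train, _) in splits.items():
        model = NearestCentroid().fit(train)
        for tgt, data in corpora.items():
            if src != tgt:
                results[("cross", src, tgt)] = accuracy(model, data)
    return results


if __name__ == "__main__":
    corpora = {
        "english": make_toy_corpus(seed=0, shift=0.0),
        "chinese": make_toy_corpus(seed=1, shift=0.5),
        "german": make_toy_corpus(seed=2, shift=-0.5),
    }
    for key, acc in evaluate_settings(corpora).items():
        print(key, round(acc, 3))
```

Swapping `NearestCentroid` for any classifier that exposes the same `fit` and `predict` methods leaves the protocol unchanged, which makes it possible to compare several models under identical splits.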