南京大学学报(自然科学版) (Journal of Nanjing University (Natural Science)), 2019, Vol. 55, Issue 5: 758–764. doi: 10.13232/j.cnki.jnju.2019.05.007


A study on gender differences in speech emotion recognition based on corpus

Xinyi Cao, He Li, Wei Wang

  1. Machine Learning and Cognition Lab, Department of Educational Technology, School of Educational Science, Nanjing Normal University, Nanjing 210097, China
  • Received: 2019-08-14  Online: 2019-09-30  Published: 2019-11-01
  • Contact: Wei Wang, E-mail: wangwei5@njnu.edu.cn
  • Funding: National Philosophy and Social Science Foundation of China (BCA150054)



Abstract:

Gender is one of the important factors in speech emotion recognition. This study explores gender differences in speech emotion recognition using machine learning methods and emotional speech databases, and further examines these differences from the perspective of acoustic features. Experiments were conducted on two English emotional datasets and on their fusion. Three kinds of classifiers were used to recognize emotion in male and female speech, and an attention mechanism was used to select the features most important to each gender and to compare their differences. The results show that the emotion recognition rate for female speech is higher than that for male speech, and that the importance of spectral features such as Mel-frequency cepstral coefficients, shimmer, and spectral slope differs considerably between male and female speech.

Key words: machine learning, gender, emotion recognition, speech emotion, attention mechanism

CLC number: H107
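The attention-based feature selection described in the abstract can be sketched as follows: raw per-feature attention scores are normalized with a softmax, and features are ranked by their resulting weights. The feature names and scores below are illustrative assumptions, not values from the study.

```python
import numpy as np

# Hypothetical per-feature attention scores (e.g., produced by an attention
# layer over an acoustic feature vector); the values are illustrative only.
features = ["MFCC4", "Shimmer", "SpectralSlope", "Jitter"]
scores = np.array([2.0, 0.5, 1.0, -0.5])

# Softmax turns raw scores into attention weights that sum to 1.
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Rank features by attention weight, largest first.
ranking = [features[i] for i in np.argsort(-weights)]
print(ranking)  # ['MFCC4', 'SpectralSlope', 'Shimmer', 'Jitter']
```

Running such a ranking separately on male and female speech yields the per-gender importance orderings compared in Table 5.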

Table 1  Recall on the IEMOCAP dataset for different feature sets and classifier models

Gender | Feature set  | SVM    | CNN    | LSTM
Male   | eGeMAPS      | 0.601  | 0.565  | 0.6145
Male   | Emobase 2010 | 0.5445 | 0.6365 | 0.6475
Female | eGeMAPS      | 0.6575 | 0.5785 | 0.6455
Female | Emobase 2010 | 0.584  | 0.661  | 0.6775

Table 2  Recall on the eNTERFACE'05 dataset for different feature sets and classifier models

Gender | Feature set  | SVM    | CNN    | LSTM
Male   | eGeMAPS      | 0.6533 | 0.4867 | 0.7067
Male   | Emobase 2010 | 0.8667 | 0.6733 | 0.8
Female | eGeMAPS      | 0.7267 | 0.5467 | 0.7933
Female | Emobase 2010 | 0.88   | 0.6867 | 0.8267

Table 3  Significance (sig) values of T-tests on male vs. female emotion recognition rates for each classifier and feature set on the two datasets

Dataset      | Feature set  | SVM    | CNN   | LSTM
IEMOCAP      | eGeMAPS      | 0.000  | 0.000 | 0.000
IEMOCAP      | Emobase 2010 | 0.000  | 0.000 | 0.000
eNTERFACE'05 | eGeMAPS      | 0.000  | 0.000 | 0.000
eNTERFACE'05 | Emobase 2010 | 0.0139 | 0.000 | 0.000
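The sig values above come from independent-samples T-tests comparing male and female recognition rates. A minimal sketch of such a test, with made-up recognition-rate samples rather than the study's actual per-run results:

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
# Hypothetical per-run recognition rates for female and male speech;
# the real values would come from the classifiers in Tables 1 and 2.
female = rng.normal(0.65, 0.02, 30)
male = rng.normal(0.60, 0.02, 30)

# Independent-samples T-test; a small p-value (e.g. < 0.05) indicates
# the gender difference in recognition rate is statistically significant.
t_stat, p_value = ttest_ind(female, male)
print(round(p_value, 4))
```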

Table 4  Recall on the fusion dataset for different feature sets and classifier models

Gender | Feature set  | SVM    | CNN    | LSTM
Male   | eGeMAPS      | 0.5871 | 0.5426 | 0.5992
Male   | Emobase 2010 | 0.5308 | 0.6292 | 0.6654
Female | eGeMAPS      | 0.6388 | 0.5770 | 0.6483
Female | Emobase 2010 | 0.5846 | 0.6833 | 0.6829

Table 5  Top 15 features with the largest gender difference in importance for speech emotion recognition across the three datasets

Feature                                        | Importance rank (female/male)
MFCC4 (Mel-frequency cepstral coefficient 4)   | 9/24
Shimmer                                        | 20/6
Spectral slope 0–500 Hz                        | 13/26
Harmonic difference H1–A3                      | 29/19
F3 relative energy                             | 22/32
F1 bandwidth                                   | 32/22
F2 bandwidth                                   | 31/23
F1 frequency                                   | 16/8
Continuously voiced regions per second         | 10/18
Hammarberg index                               | 8/15
Mean length of voiced regions                  | 11/4
Mean length of unvoiced regions                | 21/14
Spectral flux                                  | 7/1
Jitter                                         | 23/17
Equivalent sound level (equivalentSoundLevel_dBp) | 4/10
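The ordering in Table 5 can be reproduced by ranking each feature's importance separately for female and male speech and sorting by the absolute rank difference. A small sketch using three of the ranks from the table:

```python
import numpy as np

# Importance ranks (1 = most important) of three features from Table 5,
# for female and male speech emotion recognition respectively.
features = ["MFCC4", "Shimmer", "SpectralFlux"]
female_rank = np.array([9, 20, 7])
male_rank = np.array([24, 6, 1])

# Features whose importance differs most between genders come first.
diff = np.abs(female_rank - male_rank)
order = np.argsort(-diff)
for i in order:
    print(features[i], f"{female_rank[i]}/{male_rank[i]}", diff[i])
```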