Emotion recognition from speech with StarGAN and Dense‐DCNN

Bibliographic Details
Main Authors: Lu‐Qiao Li, Kai Xie, Xiao‐Long Guo, Chang Wen, Jian‐Biao He
Format: Article
Language: English
Published: Wiley, 2022-02-01
Series: IET Signal Processing
Online Access: https://doi.org/10.1049/sil2.12078
author Lu‐Qiao Li
Kai Xie
Xiao‐Long Guo
Chang Wen
Jian‐Biao He
author_facet Lu‐Qiao Li
Kai Xie
Xiao‐Long Guo
Chang Wen
Jian‐Biao He
author_sort Lu‐Qiao Li
collection DOAJ
description Abstract Both traditional and recent speech emotion recognition methods face the same problem: the lack of standard emotional speech data sets. With only limited data, a network cannot learn emotion features comprehensively. Moreover, these methods require extremely long training times, which makes efficient classification difficult to guarantee. The proposed network, Dense‐DCNN combined with StarGAN, addresses this issue: StarGAN is used to generate numerous Log‐Mel spectra of the target emotions, and the Dense‐DCNN extracts high‐dimensional features from them to achieve high‐precision classification. The classification accuracy on all data sets exceeded 90%. In addition, DenseNet's dense skip connections speed up the classification process, thereby improving efficiency. Experimental verification shows that the model not only has good generalisation ability but also exhibits good robustness in multi‐scene and multi‐noise environments, showing potential for application in the medical and social education industries.
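The description sketches a two‐stage pipeline: speech is converted to Log‐Mel spectra, StarGAN augments the training data with spectra of the target emotions, and a DenseNet‐style CNN (Dense‐DCNN) with dense skip connections performs the classification. A minimal sketch of the feature‐extraction and classification side is given below. It is not the authors' implementation: it assumes librosa and PyTorch, and the file name, hyperparameters (n_mels=128, growth_rate=12, num_emotions=7) and network depth are illustrative assumptions.

# Minimal sketch (not the paper's code): a log-Mel spectrogram front end plus a
# DenseNet-style block whose dense (skip) connections concatenate the feature
# maps of all earlier layers, as the description attributes to the Dense-DCNN.
# librosa/PyTorch usage and all hyperparameters below are illustrative assumptions.
import librosa
import numpy as np
import torch
import torch.nn as nn

def log_mel_spectrogram(path, sr=16000, n_mels=128):
    """Load a speech file and return its log-Mel spectrogram (n_mels x frames)."""
    y, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)

class DenseBlock(nn.Module):
    """Every layer receives the concatenation of all previous feature maps."""
    def __init__(self, in_channels, growth_rate=12, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1),
            ))
            channels += growth_rate
        self.out_channels = channels

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)

class EmotionClassifier(nn.Module):
    """Tiny Dense-DCNN-style classifier over a one-channel log-Mel 'image'."""
    def __init__(self, num_emotions=7):
        super().__init__()
        self.stem = nn.Conv2d(1, 16, kernel_size=3, padding=1)
        self.block = DenseBlock(16)
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(self.block.out_channels, num_emotions),
        )

    def forward(self, x):  # x: (batch, 1, n_mels, frames)
        return self.head(self.block(self.stem(x)))

# Hypothetical usage:
# spec = log_mel_spectrogram("sample.wav")
# logits = EmotionClassifier()(torch.tensor(spec)[None, None].float())

The concatenation in DenseBlock.forward is what the description calls DenseNet's skip connections: each layer reuses the feature maps of every preceding layer, which shortens gradient paths and is one reason such networks can be trained and evaluated efficiently.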
format Article
id doaj-art-0b469f23fce841c6a480f775c9d45b8f
institution Kabale University
issn 1751-9675
1751-9683
language English
publishDate 2022-02-01
publisher Wiley
record_format Article
series IET Signal Processing
spelling doaj-art-0b469f23fce841c6a480f775c9d45b8f (record updated 2025-02-03T01:29:37Z)
Wiley, IET Signal Processing, ISSN 1751-9675 / 1751-9683, 2022-02-01, vol. 16, no. 1, pp. 62–79, doi:10.1049/sil2.12078
Emotion recognition from speech with StarGAN and Dense‐DCNN
Lu‐Qiao Li (School of Electronic and Information, Yangtze University, Jingzhou, China)
Kai Xie (School of Electronic and Information, Yangtze University, Jingzhou, China)
Xiao‐Long Guo (School of Electronic and Information, Yangtze University, Jingzhou, China)
Chang Wen (Western Institute of Yangtze University, Karamay, China)
Jian‐Biao He (School of Computer Science and Engineering, Central South University, Changsha, China)
Online access: https://doi.org/10.1049/sil2.12078
spellingShingle Lu‐Qiao Li
Kai Xie
Xiao‐Long Guo
Chang Wen
Jian‐Biao He
Emotion recognition from speech with StarGAN and Dense‐DCNN
IET Signal Processing
title Emotion recognition from speech with StarGAN and Dense‐DCNN
title_full Emotion recognition from speech with StarGAN and Dense‐DCNN
title_fullStr Emotion recognition from speech with StarGAN and Dense‐DCNN
title_full_unstemmed Emotion recognition from speech with StarGAN and Dense‐DCNN
title_short Emotion recognition from speech with StarGAN and Dense‐DCNN
title_sort emotion recognition from speech with stargan and dense dcnn
url https://doi.org/10.1049/sil2.12078
work_keys_str_mv AT luqiaoli emotionrecognitionfromspeechwithstargananddensedcnn
AT kaixie emotionrecognitionfromspeechwithstargananddensedcnn
AT xiaolongguo emotionrecognitionfromspeechwithstargananddensedcnn
AT changwen emotionrecognitionfromspeechwithstargananddensedcnn
AT jianbiaohe emotionrecognitionfromspeechwithstargananddensedcnn