Emotion recognition from speech with StarGAN and Dense‐DCNN
Abstract Both traditional and the latest speech emotion recognition methods face the same problem, that is, the lack of standard emotion speech data sets. This leads to the network being unable to learn emotion features comprehensively because of limited data. Moreover, in these methods, the time re...
Main Authors: | Lu‐Qiao Li, Kai Xie, Xiao‐Long Guo, Chang Wen, Jian‐Biao He |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2022-02-01 |
Series: | IET Signal Processing |
Online Access: | https://doi.org/10.1049/sil2.12078 |
_version_ | 1832559602652151808 |
---|---|
author | Lu‐Qiao Li; Kai Xie; Xiao‐Long Guo; Chang Wen; Jian‐Biao He |
collection | DOAJ |
description | Abstract Both traditional and recent speech emotion recognition methods face the same problem: the lack of standard emotional speech data sets. With limited data, a network cannot learn emotion features comprehensively. Moreover, these methods require extremely long training times, which makes efficient classification difficult. The proposed network, Dense‐DCNN, combined with StarGAN, can address these issues. StarGAN generates numerous Log‐Mel spectra with the relevant emotions, and the Dense‐DCNN extracts high‐dimensional features from them to achieve high‐precision classification. The classification accuracy exceeded 90% on all the data sets. In addition, DenseNet's layer jump connection can speed up the classification process, thereby improving efficiency. Experimental verification shows that the model not only generalises well but is also robust in multiscene and multinoise environments, showing potential for application in the medical and social education industries. |
format | Article |
id | doaj-art-0b469f23fce841c6a480f775c9d45b8f |
institution | Kabale University |
issn | 1751-9675 1751-9683 |
language | English |
publishDate | 2022-02-01 |
publisher | Wiley |
record_format | Article |
series | IET Signal Processing |
spelling | doaj-art-0b469f23fce841c6a480f775c9d45b8f; 2025-02-03T01:29:37Z; eng; Wiley; IET Signal Processing; 1751-9675; 1751-9683; 2022-02-01; 16; 1; 62; 79; 10.1049/sil2.12078; Emotion recognition from speech with StarGAN and Dense‐DCNN; Lu‐Qiao Li (School of Electronic and Information, Yangtze University, Jingzhou, China); Kai Xie (School of Electronic and Information, Yangtze University, Jingzhou, China); Xiao‐Long Guo (School of Electronic and Information, Yangtze University, Jingzhou, China); Chang Wen (Western Institute of Yangtze University, Karamay, China); Jian‐Biao He (School of Computer Science and Engineering, Central South University, Changsha, China); https://doi.org/10.1049/sil2.12078 |
title | Emotion recognition from speech with StarGAN and Dense‐DCNN |
url | https://doi.org/10.1049/sil2.12078 |