An improved deep learning approach for speech enhancement

Single-channel speech enhancement refers to the task of improving the quality and intelligibility of a speech signal in a noisy environment. Time-domain and time-frequency-domain methods are the two main categories of approaches for speech enhancement. In this paper, we propose an approach based on a cro...

Bibliographic Details
Main Authors: Malek Miled, Mohamed Anouar Ben Messaoud
Format: Article
Language: English
Published: Universidade do Porto 2023-11-01
Series: U.Porto Journal of Engineering
Subjects: Speech Enhancement; Empirical Mode Decomposition; Principal Component Analysis; Learning Model
Online Access: https://journalengineering.fe.up.pt/index.php/upjeng/article/view/1531
_version_ 1849702757797199872
author Malek Miled
Mohamed Anouar Ben Messaoud
author_facet Malek Miled
Mohamed Anouar Ben Messaoud
author_sort Malek Miled
collection DOAJ
description Single-channel speech enhancement refers to the task of improving the quality and intelligibility of a speech signal in a noisy environment. Time-domain and time-frequency-domain methods are the two main categories of approaches for speech enhancement. In this paper, we propose an approach based on a cross-domain framework. This framework exploits knowledge of the spectrogram and overcomes some of the limitations faced by time-frequency-domain methods. First, we apply the intrinsic mode functions of the empirical mode decomposition and an improved version of principal component analysis. Then, we design a cross-domain learning framework to determine the correlations along the frequency and time axes. At a low SNR of -5 dB, the effectiveness of the proposed approach is demonstrated by objective and subjective measures, with average scores of -0.49, 2.47, 2.44, and 0.68 for SegSNR, PESQ, Cov, and STOI, respectively. The results highlight the success of our approach in addressing low-SNR conditions.
format Article
id doaj-art-c506a633d4e743d7aced2ea7b49271c9
institution DOAJ
issn 2183-6493
language English
publishDate 2023-11-01
publisher Universidade do Porto
record_format Article
series U.Porto Journal of Engineering
spelling doaj-art-c506a633d4e743d7aced2ea7b49271c9; 2025-08-20T03:17:32Z; eng; Universidade do Porto; U.Porto Journal of Engineering; 2183-6493; 2023-11-01; 9; 5; 10.24840/2183-6493_009-005_001531; An improved deep learning approach for speech enhancement; Malek Miled (0), https://orcid.org/0009-0002-4456-3748; Mohamed Anouar Ben Messaoud (1), https://orcid.org/0000-0002-7190-2736; Universidade do El Manar, Instituta de Engenharia; National School of Engineers of Tunis; https://journalengineering.fe.up.pt/index.php/upjeng/article/view/1531; Speech Enhancement; Empirical Mode Decomposition; Principal Component Analysis; Learning Model
title An improved deep learning approach for speech enhancement
topic Speech Enhancement
Empirical Mode Decomposition
Principal Component Analysis
Learning Model
url https://journalengineering.fe.up.pt/index.php/upjeng/article/view/1531
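
The description above reports results at an SNR of -5 dB and uses segmental SNR (SegSNR) as one of its measures. As a rough illustration only, the sketch below shows one common way to mix noise into a clean signal at a target SNR and to compute a frame-averaged SNR. It is not the authors' evaluation code: the function names, the 256-sample frame length, and the [-10, 35] dB clamping range are assumptions chosen for the example, and the "clean speech" is a synthetic sine tone.

```python
import math
import random


def scale_noise_to_snr(clean, noise, snr_db):
    """Scale `noise` so that clean + noise has the requested global SNR in dB."""
    p_clean = sum(x * x for x in clean) / len(clean)
    p_noise = sum(n * n for n in noise) / len(noise)
    gain = math.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return [gain * n for n in noise]


def global_snr_db(clean, processed):
    """Global SNR in dB, treating (processed - clean) as the error signal."""
    sig = sum(x * x for x in clean)
    err = sum((x - y) ** 2 for x, y in zip(clean, processed))
    return 10 * math.log10(sig / err)


def segmental_snr(clean, processed, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi] (assumed limits)."""
    scores = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        p = processed[start:start + frame_len]
        sig = sum(x * x for x in c)
        err = sum((x - y) ** 2 for x, y in zip(c, p))
        if sig == 0:
            continue  # skip silent frames, where the ratio is undefined
        snr = hi if err == 0 else 10 * math.log10(sig / err)
        scores.append(min(max(snr, lo), hi))
    if not scores:
        raise ValueError("no non-silent frames to score")
    return sum(scores) / len(scores)


# Synthetic stand-in for speech: one second of a 440 Hz tone at 16 kHz.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
scaled = scale_noise_to_snr(clean, noise, snr_db=-5.0)
noisy = [c + n for c, n in zip(clean, scaled)]

print("global SNR (dB):", round(global_snr_db(clean, noisy), 3))
print("segmental SNR (dB):", round(segmental_snr(clean, noisy), 3))
```

Clamping the per-frame values keeps a few very quiet or very clean frames from dominating the average, which is why a SegSNR score can differ noticeably from the global SNR of the same signal.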