Hardware Efficient Speech Enhancement With Noise Aware Multi-Target Deep Learning

Bibliographic Details
Main Authors: Salinna Abdullah, Majid Zamani, Andreas Demosthenous
Format: Article
Language: English
Published: IEEE, 2024-01-01
Series: IEEE Open Journal of Circuits and Systems
Subjects: Deep neural network; digital circuits; field programmable gate array (FPGA); mapping; masking; multi-target learning
Online Access: https://ieeexplore.ieee.org/document/10500889/
author Salinna Abdullah
Majid Zamani
Andreas Demosthenous
collection DOAJ
description This paper describes a supervised speech enhancement (SE) method utilising a noise-aware four-layer deep neural network and training target switching. For optimal speech denoising, the SE system, trained with multiple-target joint learning, switches between mapping-based, masking-based, or complementary processing, depending on the level of noise contamination detected. Optimisation techniques, including ternary quantisation, structural pruning, efficient sparse matrix representation and cost-effective approximations for complex computations, were implemented to reduce area, memory, and power requirements. Up to 19.1× compression was obtained, and all weights could be stored in on-chip memory. When processing NOISEX-92 noises, the system achieved average short-time objective intelligibility (STOI) and perceptual evaluation of speech quality (PESQ) scores of 0.81 and 1.62, respectively, outperforming SE algorithms trained with only a single learning target. The proposed SE processor was implemented on a field programmable gate array (FPGA) for proof of concept. Mapping the design onto a 65-nm CMOS process led to a chip core area of 3.88 mm² and a power consumption of 1.91 mW when operating at a 10 MHz clock frequency.
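To make two of the techniques mentioned in the abstract concrete, the following is a minimal Python sketch of ternary weight quantisation and noise-aware target switching. The 0.7 × mean(|w|) pruning threshold, the SNR cut-offs, and all function and parameter names are illustrative assumptions, not the exact rules used in the paper.

```python
import numpy as np

def ternary_quantise(w, delta_scale=0.7):
    """Illustrative ternary quantisation: map each weight to {-alpha, 0, +alpha}.

    Weights whose magnitude falls below a threshold are pruned to zero; the
    surviving weights share one per-layer scaling factor alpha. The
    0.7 * mean(|w|) threshold is a common heuristic, not the paper's rule.
    """
    delta = delta_scale * np.mean(np.abs(w))      # pruning threshold
    mask = np.abs(w) > delta                      # surviving (non-zero) weights
    alpha = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alpha * np.sign(w) * mask              # values in {-alpha, 0, +alpha}

def select_target(estimated_snr_db, low_thr=0.0, high_thr=10.0):
    """Illustrative noise-aware switch between learning targets.

    The SNR thresholds are placeholders; the paper chooses mapping-based,
    masking-based or complementary processing from the detected noise level.
    """
    if estimated_snr_db >= high_thr:
        return "masking"        # light noise: ratio-mask output
    if estimated_snr_db <= low_thr:
        return "mapping"        # heavy noise: direct spectral mapping
    return "complementary"      # in between: combine both estimates
```

A weight matrix quantised this way becomes both low-precision and highly sparse, which is what makes the compact sparse-matrix storage of all weights in on-chip memory, as described in the abstract, feasible.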
format Article
id doaj-art-629751578c634b40b182822b3723c2a9
institution Kabale University
issn 2644-1225
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Open Journal of Circuits and Systems
spelling IEEE Open Journal of Circuits and Systems (ISSN 2644-1225), vol. 5, pp. 141-152, 2024-01-01. DOI: 10.1109/OJCAS.2024.3389100; IEEE article 10500889. Salinna Abdullah (https://orcid.org/0000-0003-0092-3190), Majid Zamani, and Andreas Demosthenous (https://orcid.org/0000-0003-0623-963X), Department of Electronic and Electrical Engineering, University College London, London, U.K. https://ieeexplore.ieee.org/document/10500889/
title Hardware Efficient Speech Enhancement With Noise Aware Multi-Target Deep Learning
topic Deep neural network
digital circuits
field programmable gate array (FPGA)
mapping
masking
multi-target learning
url https://ieeexplore.ieee.org/document/10500889/