Sines, transient, noise neural modeling of piano notes

This article introduces a novel method for emulating piano sounds. We propose to exploit the sine, transient, and noise decomposition to design a differentiable spectral modeling synthesizer replicating piano notes. Three sub-modules learn these components from piano recordings and generate the corr...

Bibliographic Details
Main Authors:	Riccardo Simionato, Stefano Fasciani
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2025-01-01
Series:	Frontiers in Signal Processing
Subjects:	sound source modeling; physics-informed modeling; acoustic modeling; deep learning; piano synthesis; differentiable digital signal processing
Online Access:	https://www.frontiersin.org/articles/10.3389/frsip.2024.1494864/full

_version_	1832582906342539264
author	Riccardo Simionato; Stefano Fasciani
author_facet	Riccardo Simionato; Stefano Fasciani
author_sort	Riccardo Simionato
collection	DOAJ
description	This article introduces a novel method for emulating piano sounds. We propose to exploit the sine, transient, and noise decomposition to design a differentiable spectral modeling synthesizer that replicates piano notes. Three sub-modules learn these components from piano recordings and generate the corresponding quasi-harmonic, transient, and noise signals. Splitting the emulation into three independently trainable models reduces the complexity of the modeling tasks. The quasi-harmonic content is produced by a differentiable sinusoidal model guided by physics-derived formulas, whose parameters are automatically estimated from audio recordings. The noise sub-module uses a learnable time-varying filter, and the transients are generated by a deep convolutional network. Starting from single notes, we emulate the coupling between different keys in trichords with a convolution-based network. Results show that the model matches the partial distribution of the target, while predicting the energy in the higher part of the spectrum is more challenging. The energy distribution in the spectra of the transient and noise components is accurate overall. While the model is more computationally and memory efficient, perceptual tests reveal limitations in accurately modeling the attack phase of notes. Despite this, it generally achieves perceptual accuracy in emulating single notes and trichords.
format	Article
id	doaj-art-710927f025b6412593c7f3d6bccaedc9
institution	Kabale University
issn	2673-8198
language	English
publishDate	2025-01-01
publisher	Frontiers Media S.A.
record_format	Article
series	Frontiers in Signal Processing
title	Sines, transient, noise neural modeling of piano notes
topic	sound source modeling; physics-informed modeling; acoustic modeling; deep learning; piano synthesis; differentiable digital signal processing
url	https://www.frontiersin.org/articles/10.3389/frsip.2024.1494864/full

Similar Items