Sines, transient, noise neural modeling of piano notes

This article introduces a novel method for emulating piano sounds. We propose to exploit the sine, transient, and noise decomposition to design a differentiable spectral modeling synthesizer replicating piano notes. Three sub-modules learn these components from piano recordings and generate the corr...

Bibliographic Details
Main Authors: Riccardo Simionato, Stefano Fasciani
Format: Article
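Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Signal Processing
Subjects: sound source modeling; physics-informed modeling; acoustic modeling; deep learning; piano synthesis; differentiable digital signal processing
Online Access: https://www.frontiersin.org/articles/10.3389/frsip.2024.1494864/full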
author Riccardo Simionato
Stefano Fasciani
collection DOAJ
description This article introduces a novel method for emulating piano sounds. We propose to exploit the sine, transient, and noise decomposition to design a differentiable spectral modeling synthesizer replicating piano notes. Three sub-modules learn these components from piano recordings and generate the corresponding quasi-harmonic, transient, and noise signals. Splitting the emulation into three independently trainable models reduces the modeling tasks’ complexity. The quasi-harmonic content is produced using a differentiable sinusoidal model guided by physics-derived formulas, whose parameters are automatically estimated from audio recordings. The noise sub-module uses a learnable time-varying filter, and the transients are generated using a deep convolutional network. From singular notes, we emulate the coupling between different keys in trichords with a convolutional-based network. Results show the model matches the partial distribution of the target while predicting the energy in the higher part of the spectrum presents more challenges. The energy distribution in the spectra of the transient and noise components is accurate overall. While the model is more computationally and memory efficient, perceptual tests reveal limitations in accurately modeling the attack phase of notes. Despite this, it generally achieves perceptual accuracy in emulating single notes and trichords.
format Article
id doaj-art-710927f025b6412593c7f3d6bccaedc9
institution Kabale University
issn 2673-8198
language English
publishDate 2025-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Signal Processing
spelling doaj-art-710927f025b6412593c7f3d6bccaedc9; 2025-01-29T06:46:05Z; eng; Frontiers Media S.A.; Frontiers in Signal Processing; 2673-8198; 2025-01-01; 4; 10.3389/frsip.2024.1494864; 1494864; Sines, transient, noise neural modeling of piano notes; Riccardo Simionato; Stefano Fasciani; https://www.frontiersin.org/articles/10.3389/frsip.2024.1494864/full; sound source modeling; physics-informed modeling; acoustic modeling; deep learning; piano synthesis; differentiable digital signal processing
topic sound source modeling
physics-informed modeling
acoustic modeling
deep learning
piano synthesis
differentiable digital signal processing
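
The description above mentions a quasi-harmonic component produced by a sinusoidal model guided by physics-derived formulas. The record does not say which formulas the authors use, so the following is only a minimal sketch under an assumption: it uses the textbook stiff-string inharmonicity law, f_k = k * f0 * sqrt(1 + B * k^2), to place the partials. The function names, the inharmonicity coefficient, the 1/k amplitude roll-off, and the decay rates are hypothetical illustration values, not parameters from the article, and the code is plain NumPy rather than a differentiable or trained model.

```python
import numpy as np


def partial_frequencies(f0, n_partials, inharmonicity):
    """Stiff-string partial frequencies: f_k = k * f0 * sqrt(1 + B * k^2)."""
    k = np.arange(1, n_partials + 1)
    return k * f0 * np.sqrt(1.0 + inharmonicity * k**2)


def synthesize_quasi_harmonic(f0, n_partials=8, inharmonicity=4e-4,
                              duration=1.0, sample_rate=16000, decay=3.0):
    """Sum exponentially decaying sinusoids at quasi-harmonic frequencies."""
    t = np.arange(int(duration * sample_rate)) / sample_rate
    freqs = partial_frequencies(f0, n_partials, inharmonicity)
    # Keep only partials below the Nyquist frequency to avoid aliasing.
    freqs = freqs[freqs < sample_rate / 2]
    k = np.arange(1, len(freqs) + 1)
    # Hypothetical amplitude law: 1/k roll-off across partials.
    amps = 1.0 / k
    # Hypothetical decay law: higher partials fade faster.
    envelopes = np.exp(-decay * k[:, None] * t[None, :])
    sines = np.sin(2.0 * np.pi * freqs[:, None] * t[None, :])
    signal = (amps[:, None] * envelopes * sines).sum(axis=0)
    # Normalize to a peak amplitude of 1.
    return signal / np.max(np.abs(signal))


if __name__ == "__main__":
    # A3 (220 Hz): partials sit slightly above exact integer multiples.
    print(partial_frequencies(220.0, 4, 4e-4))
    note = synthesize_quasi_harmonic(220.0)
    print(note.shape)
```

With the inharmonicity coefficient set to zero the partials fall on exact integer multiples of f0; any positive value stretches them upward, which is the quasi-harmonic behavior the description refers to. The transient and noise components described in the record are not covered by this sketch.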