User Intent-Based Music Generation Model Combining Actor-Critic Approach With MusicVAE

This research proposes a way for music generation models to more clearly reflect user intent. Previous studies have attempted to generate personalized music through various inputs such as text and images, but these expressions have ambiguous meanings and are difficult to reflect directly in music. T...

Bibliographic Details
Main Authors: Soyoung Jang, Jaeho Lee
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
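Subjects: Actor-critic methodology; color representation; conditional music generation; energy-valence model; latent vector; mood classification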
Online Access: https://ieeexplore.ieee.org/document/11122520/
author Soyoung Jang
Jaeho Lee
collection DOAJ
description This research proposes a way for music generation models to more clearly reflect user intent. Previous studies have attempted to generate personalized music through various inputs such as text and images, but these expressions have ambiguous meanings and are difficult to reflect directly in music. To solve this problem, this study proposes a method to define the user’s emotional state based on color and use it as a condition for music generation. The colors were classified into saturation, luminance, and hue, and the pitch and density information corresponding to each emotional condition was extracted through a Mood Classifier trained on them. This information was incorporated into a loss function that redefined the MusicVAE’s latent vector, which was then subjected to Actor-Critic-based condition injection. The model performance was evaluated using Spotify Energy-Valence analysis, PCA-based latent space visualization, and listening tests (200 subjects), and found to be superior to existing models in both conditioned performance and musical naturalness.
format Article
id doaj-art-e9dbd4358a5f43b185c0dabcb7a3dcd3
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-e9dbd4358a5f43b185c0dabcb7a3dcd3
2025-08-20T03:47:01Z
eng
IEEE
IEEE Access
2169-3536
2025-01-01
13
141281
141294
10.1109/ACCESS.2025.3597741
11122520
User Intent-Based Music Generation Model Combining Actor-Critic Approach With MusicVAE
Soyoung Jang
0
https://orcid.org/0009-0006-4279-6467
Jaeho Lee
1
https://orcid.org/0000-0003-0455-9939
Department of Information and Communication Technology Convergence Engineering, Duksung Women’s University, Seoul, South Korea
Department of Software, Duksung Women’s University, Seoul, South Korea
This research proposes a way for music generation models to more clearly reflect user intent. Previous studies have attempted to generate personalized music through various inputs such as text and images, but these expressions have ambiguous meanings and are difficult to reflect directly in music. To solve this problem, this study proposes a method to define the user’s emotional state based on color and use it as a condition for music generation. The colors were classified into saturation, luminance, and hue, and the pitch and density information corresponding to each emotional condition was extracted through a Mood Classifier trained on them. This information was incorporated into a loss function that redefined the MusicVAE’s latent vector, which was then subjected to Actor-Critic-based condition injection. The model performance was evaluated using Spotify Energy-Valence analysis, PCA-based latent space visualization, and listening tests (200 subjects), and found to be superior to existing models in both conditioned performance and musical naturalness.
https://ieeexplore.ieee.org/document/11122520/
Actor-critic methodology
color representation
conditional music generation
energy-valence model
latent vector
mood classification
title User Intent-Based Music Generation Model Combining Actor-Critic Approach With MusicVAE
topic Actor-critic methodology
color representation
conditional music generation
energy-valence model
latent vector
mood classification
url https://ieeexplore.ieee.org/document/11122520/