User Intent-Based Music Generation Model Combining Actor-Critic Approach With MusicVAE
This research proposes a way for music generation models to more clearly reflect user intent. Previous studies have attempted to generate personalized music through various inputs such as text and images, but these expressions have ambiguous meanings and are difficult to reflect directly in music. T...
| Main Authors: | Soyoung Jang, Jaeho Lee |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
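| Subjects: | Actor-critic methodology; color representation; conditional music generation; energy-valence model; latent vector; mood classification |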
| Online Access: | https://ieeexplore.ieee.org/document/11122520/ |
| _version_ | 1849330267697709056 |
|---|---|
| author | Soyoung Jang; Jaeho Lee |
| collection | DOAJ |
| description | This research proposes a way for music generation models to more clearly reflect user intent. Previous studies have attempted to generate personalized music through various inputs such as text and images, but these expressions have ambiguous meanings and are difficult to reflect directly in music. To solve this problem, this study proposes a method to define the user’s emotional state based on color and use it as a condition for music generation. The colors were classified into saturation, luminance, and hue, and the pitch and density information corresponding to each emotional condition was extracted through a Mood Classifier trained on them. This information was incorporated into a loss function that redefined the MusicVAE’s latent vector, which was then subjected to Actor-Critic-based condition injection. The model performance was evaluated using Spotify Energy-Valence analysis, PCA-based latent space visualization, and listening tests (200 subjects), and found to be superior to existing models in both conditioned performance and musical naturalness. |
| format | Article |
| id | doaj-art-e9dbd4358a5f43b185c0dabcb7a3dcd3 |
| institution | Kabale University |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-e9dbd4358a5f43b185c0dabcb7a3dcd3; 2025-08-20T03:47:01Z; eng; IEEE; IEEE Access; 2169-3536; 2025-01-01; 13; 141281; 141294; 10.1109/ACCESS.2025.3597741; 11122520; User Intent-Based Music Generation Model Combining Actor-Critic Approach With MusicVAE; Soyoung Jang (0), https://orcid.org/0009-0006-4279-6467; Jaeho Lee (1), https://orcid.org/0000-0003-0455-9939; (0) Department of Information and Communication Technology Convergence Engineering, Duksung Women’s University, Seoul, South Korea; (1) Department of Software, Duksung Women’s University, Seoul, South Korea; This research proposes a way for music generation models to more clearly reflect user intent. Previous studies have attempted to generate personalized music through various inputs such as text and images, but these expressions have ambiguous meanings and are difficult to reflect directly in music. To solve this problem, this study proposes a method to define the user’s emotional state based on color and use it as a condition for music generation. The colors were classified into saturation, luminance, and hue, and the pitch and density information corresponding to each emotional condition was extracted through a Mood Classifier trained on them. This information was incorporated into a loss function that redefined the MusicVAE’s latent vector, which was then subjected to Actor-Critic-based condition injection. The model performance was evaluated using Spotify Energy-Valence analysis, PCA-based latent space visualization, and listening tests (200 subjects), and found to be superior to existing models in both conditioned performance and musical naturalness.; https://ieeexplore.ieee.org/document/11122520/; Actor-critic methodology; color representation; conditional music generation; energy-valence model; latent vector; mood classification |
| title | User Intent-Based Music Generation Model Combining Actor-Critic Approach With MusicVAE |
| topic | Actor-critic methodology; color representation; conditional music generation; energy-valence model; latent vector; mood classification |
| url | https://ieeexplore.ieee.org/document/11122520/ |