Enhancing MusicGen with Prompt Tuning
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-07-01 |
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/15/15/8504 |
| Summary: | Generative AI has been gaining attention across various creative domains. In particular, MusicGen stands out as a representative approach capable of generating music from text or audio inputs. However, it is limited in producing high-quality outputs for specific genres and in fully reflecting user intentions. This paper proposes a prompt tuning technique that adjusts the output quality of MusicGen without modifying its original parameters and optimizes its ability to generate music tailored to specific genres and styles. Experiments compared the performance of the traditional MusicGen with the proposed method and evaluated the quality of the generated music using the Contrastive Language-Audio Pretraining (CLAP) and Kullback–Leibler Divergence (KLD) scoring approaches. The results demonstrated that the proposed method significantly improved output quality and musical coherence, particularly for specific genres and styles. Compared with the traditional model, the CLAP score increased by 0.1270 and the KLD score increased by 0.00403 on average. The effectiveness of prompt tuning in optimizing the performance of MusicGen validated the proposed method and highlighted its potential for advancing generative AI-based music generation tools. |
| ISSN: | 2076-3417 |
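
The summary names Kullback–Leibler Divergence (KLD) as one of its two scoring approaches but does not say how it is computed. The block below is a minimal sketch of the generic discrete KL divergence only, not the paper's evaluation pipeline. It assumes the inputs are two probability distributions over the same set of labels, which is how KLD is commonly applied to generated audio. The function name `kl_divergence`, the `eps` floor, and the example probabilities are hypothetical and chosen for illustration.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) in nats for two discrete distributions.

    p and q are sequences of probabilities over the same labels.
    eps is an illustrative floor that avoids division by zero in q.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        # Terms with p_i == 0 contribute nothing by convention.
        if pi > 0.0:
            total += pi * math.log(pi / max(qi, eps))
    return total


# Hypothetical label probabilities for a reference clip and a generated clip.
reference = [0.7, 0.2, 0.1]
generated = [0.5, 0.3, 0.2]
print(kl_divergence(reference, generated))
```

The divergence is zero when the two distributions are identical and grows as they differ. It is not symmetric, so the order of the arguments matters.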