Audio-Language Datasets of Scenes and Events: A Survey
Audio-language models (ALMs) generate linguistic descriptions of sound-producing events and scenes. Advances in dataset creation and computational power have led to significant progress in this domain. This paper surveys 69 datasets used to train ALMs, covering research up to September 2024 (<uri...
Main Authors: | Gijs Wijngaard, Elia Formisano, Michele Esposito, Michel Dumontier |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10854210/ |
Similar Items
- A Novel Audio Copy Move Forgery Detection Method With Classification of Graph-Based Representations
  by: Beste Ustubioglu, et al.
  Published: (2025-01-01)
- Deep convolutional neural networks for double compressed AMR audio detection
  by: Aykut Büker, et al.
  Published: (2021-06-01)
- Audiogmenter: a MATLAB toolbox for audio data augmentation
  by: Gianluca Maguolo, et al.
  Published: (2025-01-01)
- The Use of Audio-Visual Media in Islamic Religious Education to Improve the Learning Activity of Fifth-Grade Students at SD N 09 Palembang
  by: Ibrahim Ibrahim, et al.
  Published: (2024-01-01)
- Improving Elementary School Students' Discipline Through the Use of Audio-Visual Media
  by: Siti Diyah Rachmatika, et al.
  Published: (2024-09-01)