Echo: A crowd-sourced Romanian speech dataset.
Romanian is the seventh most popular European language, with around 30 million speakers worldwide. Despite its popularity, the available speech resources are limited. As a result, there are few models that transcribe Romanian well, most of them being multilingual models that also cover less pop...
Main Authors: | Remus-Dan Ungureanu, Mihai Dascalu |
---|---|
Format: | Article |
Language: | English |
Published: | ASLERD, 2024-11-01 |
Series: | Interaction Design and Architecture(s) |
Online Access: | https://ixdea.org/62_9/ |
_version_ | 1832585185641627648 |
---|---|
author | Remus-Dan Ungureanu Mihai Dascalu |
collection | DOAJ |
description |
Romanian is the seventh most popular European language, with around 30 million speakers worldwide. Despite its popularity, the available speech resources are limited. As a result, there are few models that transcribe Romanian well, most of them being multilingual models that also cover less popular languages. Echo is a crowd-sourcing platform that has collected more than 300 hours of speech from various contributors. In this study, we document how a large speech dataset enables researchers to train automatic speech recognition, speaker verification, and diarization models to automatically process students’ notes. We publicly release both the dataset and the Whisper-based baseline model as open-source. |
format | Article |
id | doaj-art-96eb50bfafa04ddd923dde278887c976 |
institution | Kabale University |
issn | 2283-2998 |
language | English |
publishDate | 2024-11-01 |
publisher | ASLERD |
record_format | Article |
series | Interaction Design and Architecture(s) |
spelling | doaj-art-96eb50bfafa04ddd923dde278887c976 2025-01-26T18:43:17Z eng ASLERD Interaction Design and Architecture(s) 2283-2998 2024-11-01 62 141 152 10.55612/s-5002-062-009 Echo: A crowd-sourced Romanian speech dataset. Remus-Dan Ungureanu Mihai Dascalu https://ixdea.org/62_9/ |
title | Echo: A crowd-sourced Romanian speech dataset. |
url | https://ixdea.org/62_9/ |