Echo: A crowd-sourced Romanian speech dataset.

Bibliographic Details
Main Authors:	Remus-Dan Ungureanu, Mihai Dascalu
Format:	Article
Language:	English
Published:	ASLERD 2024-11-01
Series:	Interaction Design and Architecture(s)
Online Access:	https://ixdea.org/62_9/

_version_	1832585185641627648
author	Remus-Dan Ungureanu Mihai Dascalu
author_facet	Remus-Dan Ungureanu Mihai Dascalu
author_sort	Remus-Dan Ungureanu
collection	DOAJ
description	Romanian is the seventh most popular European language, with around 30 million speakers worldwide. Despite its popularity, the available speech resources are limited. As a result, there are few models that transcribe Romanian well, most of them being multilingual models that also cover less popular languages. Echo is a crowd-sourcing platform that has collected more than 300 hours of speech from various contributors. In this study, we document how a large speech dataset enables researchers to train automatic speech recognition, speaker verification, and diarization models to automatically process students’ notes. We publicly release both the dataset and the Whisper-based baseline model as open source.
format	Article
id	doaj-art-96eb50bfafa04ddd923dde278887c976
institution	Kabale University
issn	2283-2998
language	English
publishDate	2024-11-01
publisher	ASLERD
record_format	Article
series	Interaction Design and Architecture(s)
spelling	doaj-art-96eb50bfafa04ddd923dde278887c976 | 2025-01-26T18:43:17Z | eng | ASLERD | Interaction Design and Architecture(s) | 2283-2998 | 2024-11-01 | 62 | 141 | 152 | 10.55612/s-5002-062-009 | Echo: A crowd-sourced Romanian speech dataset. | Remus-Dan Ungureanu | Mihai Dascalu | https://ixdea.org/62_9/
spellingShingle	Remus-Dan Ungureanu Mihai Dascalu Echo: A crowd-sourced Romanian speech dataset. Interaction Design and Architecture(s)
title	Echo: A crowd-sourced Romanian speech dataset.
url	https://ixdea.org/62_9/
work_keys_str_mv	AT remusdanungureanu echoacrowdsourcedromanianspeechdataset AT mihaidascalu echoacrowdsourcedromanianspeechdataset
