Building Arabic Speech Recognition System Using HuBERT Model and Studying the Sources of Errors [Arabic]

Bibliographic Details
Main Authors: Rima Sbih, Assef Jafar, Ali Kazem
Format: Article
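Language: Arabic
Published: Higher Commission for Scientific Research, 2025-01-01
Series: Syrian Journal for Science and Innovation
Subjects: speech recognition; deep learning; self-attention; supervised learning; self-supervised learning
Online Access: https://journal.hcsr.gov.sy/archives/1523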
collection DOAJ
description This paper presents a speaker-independent, large-vocabulary continuous speech recognition system for Arabic, built on deep neural network models trained with self-supervised learning. The system uses the HuBERT model and achieves a word error rate (WER) of 19.3%. Experiments on different data sets show that the HuBERT-based system generalizes well across spoken dialects. A statistical analysis of the Arabic-specific errors produced by the system highlighted the need for an error-correcting language model; after an Arabic language model was added, the WER fell to 10.7%. Overall, the study demonstrates the potential of self-supervised speech recognition for Arabic and the importance of language models for improving accuracy.
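For readers unfamiliar with the metric quoted in the description, WER is conventionally the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below is a generic illustration of that definition and is not the authors' evaluation code; the function names `word_error_rate` and `relative_reduction` are hypothetical, and whitespace tokenization is an assumption (real Arabic evaluation usually normalizes diacritics and orthography first).

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()   # assumption: whitespace tokenization
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so
    # far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(before: float, after: float) -> float:
    """Fraction by which an error rate dropped, relative to its old value."""
    return (before - after) / before
```

Applied to the figures reported in the description, `relative_reduction(19.3, 10.7)` is about 0.446, so adding the language model removed roughly 45% of the word errors in relative terms (8.6 percentage points in absolute terms).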
id doaj-art-ae432e68dd2b47f59c0139ba6aa1bec1
institution Kabale University
issn 2959-8591
spelling doaj-art-ae432e68dd2b47f59c0139ba6aa1bec1 | 2025-01-26T08:24:57Z | ara | Higher Commission for Scientific Research | Syrian Journal for Science and Innovation | 2959-8591 | 2025-01-01 | 31 | 10.5281/zenodo.14723614 | Building Arabic Speech Recognition System Using HuBERT Model and Studying the Sources of Errors [Arabic] | Rima Sbih, Assef Jafar, Ali Kazem (all: Higher Institute for Applied Sciences and Technology, Damascus, Syria) | https://journal.hcsr.gov.sy/archives/1523 | speech recognition; deep learning; self-attention; supervised learning; self-supervised learning
topic speech recognition
deep learning
self-attention
supervised learning
self-supervised learning
url https://journal.hcsr.gov.sy/archives/1523