WTASR: Wavelet Transformer for Automatic Speech Recognition of Indian Languages

Automatic speech recognition systems translate speech signals into their corresponding text representation. This translation is used in a variety of applications, such as voice-enabled commands, assistive devices, and bots. There is a significant lack of efficient technology fo...

Bibliographic Details
Main Authors: Tripti Choudhary, Vishal Goyal, Atul Bansal
Format: Article
Language: English
Published: Tsinghua University Press 2023-03-01
Series: Big Data Mining and Analytics
Subjects: transformer, wavelet, automatic speech recognition (ASR), Indian language
Online Access: https://www.sciopen.com/article/10.26599/BDMA.2022.9020017
author Tripti Choudhary
Vishal Goyal
Atul Bansal
collection DOAJ
description Automatic speech recognition systems translate speech signals into their corresponding text representation. This translation is used in a variety of applications, such as voice-enabled commands, assistive devices, and bots. There is a significant lack of efficient technology for Indian languages. This paper proposes a wavelet transformer for automatic speech recognition (WTASR) of Indian languages. Because speakers vary their speech, the signal contains both high- and low-frequency components at different times. Wavelets therefore enable the network to analyze the signal at multiple scales. The wavelet decomposition of the signal is fed into the network to generate the text. The transformer network comprises an encoder-decoder system for speech translation. The model is trained on an Indian-language dataset to translate speech into the corresponding text. The proposed method is compared with other state-of-the-art methods. The results show that the proposed WTASR achieves a low word error rate and can be used for effective speech recognition of Indian languages.
format Article
id doaj-art-2261e997fd084ae3ba32e009901e7ac4
institution Kabale University
issn 2096-0654
language English
publishDate 2023-03-01
publisher Tsinghua University Press
record_format Article
series Big Data Mining and Analytics
spelling doaj-art-2261e997fd084ae3ba32e009901e7ac4 | 2025-02-02T07:53:41Z | eng | Tsinghua University Press | Big Data Mining and Analytics | 2096-0654 | 2023-03-01 | 6 | 1 | 85 | 91 | 10.26599/BDMA.2022.9020017 | WTASR: Wavelet Transformer for Automatic Speech Recognition of Indian Languages | Tripti Choudhary (0) | Vishal Goyal (1) | Atul Bansal (2) | Department of Electronics and Communication, GLA University, Mathura 281406, India | Department of Electronics and Communication, GLA University, Mathura 281406, India | Chandigarh University, Mohali 140413, India | Automatic speech recognition systems translate speech signals into their corresponding text representation. This translation is used in a variety of applications, such as voice-enabled commands, assistive devices, and bots. There is a significant lack of efficient technology for Indian languages. This paper proposes a wavelet transformer for automatic speech recognition (WTASR) of Indian languages. Because speakers vary their speech, the signal contains both high- and low-frequency components at different times. Wavelets therefore enable the network to analyze the signal at multiple scales. The wavelet decomposition of the signal is fed into the network to generate the text. The transformer network comprises an encoder-decoder system for speech translation. The model is trained on an Indian-language dataset to translate speech into the corresponding text. The proposed method is compared with other state-of-the-art methods. The results show that the proposed WTASR achieves a low word error rate and can be used for effective speech recognition of Indian languages. | https://www.sciopen.com/article/10.26599/BDMA.2022.9020017 | transformer | wavelet | automatic speech recognition (asr) | indian language
title WTASR: Wavelet Transformer for Automatic Speech Recognition of Indian Languages
topic transformer
wavelet
automatic speech recognition (asr)
indian language
url https://www.sciopen.com/article/10.26599/BDMA.2022.9020017