A comprehensive voice dataset for Hindko digit recognition (Mendeley Data)
Hindko is a language spoken primarily in the northwestern areas of Pakistan, by approximately eight million people. According to its native speakers, it is the 7th largest language of Pakistan and the 2nd largest language of Khyber Pakhtunkhwa. The Hazara region is the cultural hub of Hin...
Main Authors: | Tanveer Ahmed, Maqbool Khan, Khalil Khan, Ikram Syed, Syed Sajid Ullah |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2025-02-01 |
Series: | Data in Brief |
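Subjects: | Natural language processing; Voice recognition; Signal processing; Machine learning; Artificial intelligence |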
Online Access: | http://www.sciencedirect.com/science/article/pii/S235234092401182X |
_version_ | 1832576479818416128 |
---|---|
author | Tanveer Ahmed; Maqbool Khan; Khalil Khan; Ikram Syed; Syed Sajid Ullah |
author_facet | Tanveer Ahmed; Maqbool Khan; Khalil Khan; Ikram Syed; Syed Sajid Ullah |
author_sort | Tanveer Ahmed |
collection | DOAJ |
description | Hindko is a language spoken primarily in the northwestern areas of Pakistan, by approximately eight million people. According to its native speakers, it is the 7th largest language of Pakistan and the 2nd largest language of Khyber Pakhtunkhwa. The Hazara region is the cultural hub of the Hindko language: about 80% of the population in districts such as Haripur, Abbottabad, and Mansehra speaks Hindko. Spoken Hindko content covers a wide range of subjects, including religion, education, poetry, politics, and theater. Despite this, Hindko lacks a voice recognition system that could enhance accessibility, preserve the language, and promote digital inclusion for its speakers. This paper presents a voice recognition dataset of 17,597 voice samples that is publicly accessible for academic and research purposes. The dataset covers 20 Hindko digits, from 1 to 20, and all voice samples were collected from the students, staff, and faculty of Pak-Austria Fachhochschule: Institute of Applied Sciences and Technology. |
format | Article |
id | doaj-art-6f95cd4e2bc544fbbfd6782d0cbd3c01 |
institution | Kabale University |
issn | 2352-3409 |
language | English |
publishDate | 2025-02-01 |
publisher | Elsevier |
record_format | Article |
series | Data in Brief |
spelling | doaj-art-6f95cd4e2bc544fbbfd6782d0cbd3c01; 2025-01-31T05:11:33Z; eng; Elsevier; Data in Brief; 2352-3409; 2025-02-01; volume 58, article 111220; A comprehensive voice dataset for Hindko digit recognition (Mendeley Data); Tanveer Ahmed (Pak-Austria Fachhochschule: Institute of Applied Sciences and Technology, Haripur, Pakistan); Maqbool Khan (Pak-Austria Fachhochschule: Institute of Applied Sciences and Technology, Haripur, Pakistan; Software Competence Center Hagenberg, Softwarepark 32a, 4232 Hagenberg, Austria); Khalil Khan (Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Kazakhstan); Ikram Syed (Department of Information & Communication Engineering, Hankuk University of Foreign Studies, Yongin 17035, South Korea; corresponding author); Syed Sajid Ullah (Department of Information & Communication Technology, University of Agder (UiA), Norway; corresponding author); Hindko is a language spoken primarily in the northwestern areas of Pakistan, by approximately eight million people. According to its native speakers, it is the 7th largest language of Pakistan and the 2nd largest language of Khyber Pakhtunkhwa. The Hazara region is the cultural hub of the Hindko language: about 80% of the population in districts such as Haripur, Abbottabad, and Mansehra speaks Hindko. Spoken Hindko content covers a wide range of subjects, including religion, education, poetry, politics, and theater. Despite this, Hindko lacks a voice recognition system that could enhance accessibility, preserve the language, and promote digital inclusion for its speakers. This paper presents a voice recognition dataset of 17,597 voice samples that is publicly accessible for academic and research purposes. The dataset covers 20 Hindko digits, from 1 to 20, and all voice samples were collected from the students, staff, and faculty of Pak-Austria Fachhochschule: Institute of Applied Sciences and Technology.; http://www.sciencedirect.com/science/article/pii/S235234092401182X; Natural language processing; Voice recognition; Signal processing; Machine learning; Artificial intelligence |
title | A comprehensive voice dataset for Hindko digit recognition (Mendeley Data) |
topic | Natural language processing; Voice recognition; Signal processing; Machine learning; Artificial intelligence |
url | http://www.sciencedirect.com/science/article/pii/S235234092401182X |