A comprehensive voice dataset for Hindko digit recognition (Mendeley Data)

Bibliographic Details
Main Authors: Tanveer Ahmed, Maqbool Khan, Khalil Khan, Ikram Syed, Syed Sajid Ullah
Format: Article
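Language: English
Published: Elsevier 2025-02-01
Series: Data in Brief
Subjects: Natural language processing; Voice recognition; Signal processing; Machine learning; Artificial intelligence
Online Access: http://www.sciencedirect.com/science/article/pii/S235234092401182X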
collection DOAJ
description Hindko is a language spoken primarily in the northwestern areas of Pakistan, with approximately eight million speakers. According to its native speakers, it is the 7th largest language of Pakistan and the 2nd largest language of Khyber Pakhtunkhwa. The Hazara region is the cultural hub of the Hindko language: about 80% of the population in districts such as Haripur, Abbottabad, and Mansehra speaks Hindko. Spoken Hindko content covers a wide range of subjects, including religion, education, poetry, politics, and theater. Despite this, Hindko lacks a voice recognition system that could enhance accessibility, help preserve the language, and promote digital inclusion for its speakers. This paper presents a voice recognition dataset of 17,597 voice samples that is publicly accessible for academic and research purposes. The dataset covers the 20 Hindko numbers from 1 to 20, and all voice samples were recorded from students, staff, and faculty of the Pak-Austria Fachhochschule: Institute of Applied Sciences and Technology.
id doaj-art-6f95cd4e2bc544fbbfd6782d0cbd3c01
institution Kabale University
issn 2352-3409
spelling doaj-art-6f95cd4e2bc544fbbfd6782d0cbd3c01; 2025-01-31T05:11:33Z; eng; Elsevier; Data in Brief; ISSN 2352-3409; 2025-02-01; volume 58, article 111220
A comprehensive voice dataset for Hindko digit recognition (Mendeley Data)
Tanveer Ahmed: Pak-Austria Fachhochschule: Institute of Applied Sciences and Technology, Haripur, Pakistan
Maqbool Khan: Pak-Austria Fachhochschule: Institute of Applied Sciences and Technology, Haripur, Pakistan; Software Competence Center Hagenberg, Softwarepark 32a, 4232 Hagenberg, Austria
Khalil Khan: Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Kazakhstan
Ikram Syed (corresponding author): Department of Information & Communication Engineering, Hankuk University of Foreign Studies, Yongin 17035, South Korea
Syed Sajid Ullah (corresponding author): Department of Information & Communication Technology, University of Agder (UiA), Norway
http://www.sciencedirect.com/science/article/pii/S235234092401182X
Natural language processing; Voice recognition; Signal processing; Machine learning; Artificial intelligence
topic Natural language processing
Voice recognition
Signal processing
Machine learning
Artificial intelligence
url http://www.sciencedirect.com/science/article/pii/S235234092401182X