A comprehensive voice dataset for Hindko digit recognition (Mendeley Data)

Hindko is a language spoken primarily in the northwestern areas of Pakistan, by approximately eight million people. According to its native speakers, it is the 7th largest language of Pakistan and the 2nd largest language of Khyber Pakhtunkhwa. The Hazara region is the cultural hub of Hin...

Bibliographic Details
Main Authors:	Tanveer Ahmed, Maqbool Khan, Khalil Khan, Ikram Syed, Syed Sajid Ullah
Format:	Article
Language:	English
Published:	Elsevier 2025-02-01
Series:	Data in Brief
Subjects:	Natural language processing, Voice recognition, Signal processing, Machine learning, Artificial intelligence
Online Access:	http://www.sciencedirect.com/science/article/pii/S235234092401182X

_version_	1832576479818416128
author	Tanveer Ahmed, Maqbool Khan, Khalil Khan, Ikram Syed, Syed Sajid Ullah
author_facet	Tanveer Ahmed, Maqbool Khan, Khalil Khan, Ikram Syed, Syed Sajid Ullah
author_sort	Tanveer Ahmed
collection	DOAJ
description	Hindko is a language spoken primarily in the northwestern areas of Pakistan, by approximately eight million people. According to its native speakers, it is the 7th largest language of Pakistan and the 2nd largest language of Khyber Pakhtunkhwa. The Hazara region is the cultural hub of the Hindko language: about 80% of the population in districts such as Haripur, Abbottabad, and Mansehra speak Hindko. Spoken Hindko content covers a wide range of subjects, including religion, education, poetry, politics, and theater. Despite this, Hindko lacks a voice recognition system that could enhance accessibility, preserve the language, and promote digital inclusion for its speakers. This paper presents a voice recognition dataset of 17,597 voice samples that is publicly accessible for academic and research purposes. The dataset covers 20 Hindko digits, from 1 to 20, and all voice samples were collected from the students, staff, and faculty of the Pak-Austria Fachhochschule Institute of Applied Sciences and Technology.
format	Article
id	doaj-art-6f95cd4e2bc544fbbfd6782d0cbd3c01
institution	Kabale University
issn	2352-3409
language	English
publishDate	2025-02-01
publisher	Elsevier
record_format	Article
series	Data in Brief
spelling	doaj-art-6f95cd4e2bc544fbbfd6782d0cbd3c01 | 2025-01-31T05:11:33Z | eng | Elsevier | Data in Brief | 2352-3409 | 2025-02-01 | 58 | 111220 | A comprehensive voice dataset for Hindko digit recognition (Mendeley Data) | Tanveer Ahmed (0): Pak-Austria Fachhochschule: Institute of Applied Sciences and Technology, Haripur, Pakistan | Maqbool Khan (1): Pak-Austria Fachhochschule: Institute of Applied Sciences and Technology, Haripur, Pakistan; Software Competence Center Hagenberg, Softwarepark 32a, 4232 Hagenberg, Austria | Khalil Khan (2): Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Kazakhstan | Ikram Syed (3): Department of Information & Communication Engineering, Hankuk University of Foreign Studies, Yongin 17035, South Korea; corresponding author | Syed Sajid Ullah (4): Department of Information & Communication Technology, University of Agder (UiA), Norway; corresponding author
title	A comprehensive voice dataset for Hindko digit recognition (Mendeley Data)
title_full	A comprehensive voice dataset for Hindko digit recognition (Mendeley Data)
title_fullStr	A comprehensive voice dataset for Hindko digit recognition (Mendeley Data)
title_full_unstemmed	A comprehensive voice dataset for Hindko digit recognition (Mendeley Data)
title_short	A comprehensive voice dataset for Hindko digit recognition (Mendeley Data)
title_sort	comprehensive voice dataset for hindko digit recognition (mendeley data)
topic	Natural language processing, Voice recognition, Signal processing, Machine learning, Artificial intelligence
url	http://www.sciencedirect.com/science/article/pii/S235234092401182X

Similar Items